OpenAI remixes David vs Goliath: a fresh spin on AI supervision

OpenAI, a leading research lab pursuing artificial general intelligence (AGI), has introduced the concept of “superalignment”: using less capable AI to supervise and control more advanced AI systems, which the organization refers to as superintelligent AI. Such superintelligent AI would, in theory, far surpass human capabilities, rendering traditional supervision techniques like reinforcement learning from human feedback (RLHF) ineffective. The team recently unveiled mixed results from its initial superalignment research, raising further questions about the viability of controlling advanced machine intelligence.

  • The superalignment team tested whether less capable GPT-2 models could supervise more powerful GPT-4 models, as a proxy for how humans might one day keep superintelligent systems from misusing their capabilities (a setup sketched in code after this list).
  • The mixed results indicate that the weaker model did improve the stronger model’s performance to a degree but did not unlock its full potential, casting doubt on whether less capable AI can reliably control advanced machine intelligence.
  • OpenAI has changed how it describes its relationship with Microsoft. Previously cited as a “minority owner,” Microsoft is now described as holding a “minority economic interest,” thought to be a 49% stake in OpenAI’s profits.
  • OpenAI’s governance structure has raised questions about how it can balance protecting humanity with making a profit. One suggestion is to split the organization in two, with separate boards for the nonprofit and commercial arms, allowing the nonprofit board to focus more fully on benefiting humanity as a whole.
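
The supervision setup being tested, often described as weak-to-strong generalization, can be illustrated with a small sketch. The snippet below uses scikit-learn classifiers as illustrative stand-ins for the weak (GPT-2-like) supervisor and the strong (GPT-4-like) student; the models, the synthetic dataset, and the “gap recovered” calculation are assumptions for demonstration, not OpenAI’s actual pipeline.

```python
# Hypothetical sketch of weak-to-strong supervision: a weak model produces
# labels, and a stronger model is trained only on those weak labels.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic task with ground-truth labels (stand-in for a real benchmark).
X, y = make_classification(n_samples=6000, n_features=20, n_informative=8, random_state=0)
X_sup, X_rest, y_sup, y_rest = train_test_split(X, y, train_size=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# 1. Train the weak supervisor (deliberately limited, like GPT-2 next to GPT-4).
weak = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_sup, y_sup)

# 2. The weak model labels new data; the strong student sees only these
#    possibly wrong labels, never the ground truth.
weak_labels = weak.predict(X_train)
strong_from_weak = GradientBoostingClassifier(random_state=0).fit(X_train, weak_labels)

# 3. Ceiling: the same strong model trained directly on ground truth.
strong_ceiling = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

weak_acc = weak.score(X_test, y_test)
w2s_acc = strong_from_weak.score(X_test, y_test)
ceiling_acc = strong_ceiling.score(X_test, y_test)

# Fraction of the weak-to-ceiling gap closed by weak supervision
# (1.0 would mean the strong model's full potential was unlocked).
gap_recovered = (w2s_acc - weak_acc) / (ceiling_acc - weak_acc)
print(f"weak={weak_acc:.3f} weak-to-strong={w2s_acc:.3f} ceiling={ceiling_acc:.3f} gap recovered={gap_recovered:.2f}")
```

In OpenAI’s reported experiments, the weakly supervised strong model recovered only part of this gap, which is why the results are described as mixed.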

However, conflicting opinions exist on the feasibility and timeline of AGI and superintelligence. Some experts believe that all the necessary pieces for AGI already exist and that superintelligence is just a matter of time, while skeptics argue that superintelligence may never be achievable given the limits of affordable compute and data quality. Regardless of these contrasting viewpoints, OpenAI and Microsoft have so far been the most successful players in the AI marketplace, despite governance challenges.

Ultimately, the question remains: if superintelligent AI can hide its true intentions because it is “so much smarter than humans,” how can we hope to control it? That is why creating structures that allow those concerned about AI safety to operate separately from profit-driven agendas is key.