OpenAI’s video tool could learn a lot from babies.


  • OpenAI has developed a new video generation tool called Sora that can create high-definition video clips from text descriptions.
  • Although impressive, Sora struggles to model physical reality and cause-and-effect relationships.

OpenAI’s new video generation tool, Sora, has been making waves in the AI community. Sora can turn text descriptions into captivating video clips that look like they could go viral on social media. However, a closer look reveals that Sora struggles to simulate the physics of a scene accurately. For example, in a clip generated from a prompt describing pirate ships battling inside a cup of coffee, one of the ships moves inexplicably, exposing the tool’s weak grasp of physical laws. Sora can also confuse cause and effect and mix up spatial details.

Despite these limitations, OpenAI sees Sora as a stepping stone towards artificial general intelligence (AGI), with the aim of developing models that can understand and simulate the real world. However, the road to AGI is complex: it requires not only a grasp of physical laws but also an understanding of human behavior. This raises the question of whether machines can achieve superintelligence without first understanding human intelligence.

John Naughton suggests that OpenAI could learn a lot from studying how babies learn, as research shows that babies are highly adept at intelligence-gathering and decision-making. Observing how babies develop and come to understand causality may offer insights into achieving true artificial general intelligence. Ultimately, the path to AGI may involve more than just extrapolating machine learning paradigms.