How General Intuition Is Using Video Games to Build Real-World AI

General Intuition is attempting a paradigm shift in robotics by using large video game datasets to train agentic models for the physical world. With a fresh $320 million funding round, the startup is betting that the "action data" found in gaming is the missing link for artificial intelligence.

The Power of Action Labels and Spatial-Temporal Reasoning

While many AI researchers attempt to train models on video footage alone, General Intuition CEO Pim de Witte argues that video by itself is insufficient. The company’s competitive edge lies in its access to proprietary data from Medal, a platform where users share video game clips.

Unlike standard video, these clips contain embedded "action labels"—precise records of which buttons a player pressed and exactly when. This allows the model to go beyond mere pattern recognition; it learns spatial-temporal reasoning. By understanding the direct link between a specific input (an action) and the resulting change in the environment (the reaction), the AI begins to grasp causality. This enables the model to distinguish the "self" from the "environment," a fundamental requirement for any autonomous agent.
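To make the idea of action labels concrete, here is a minimal sketch of how timestamped button events could be aligned with video frames and turned into (frame, action, next frame) training triples. It is an illustration only: the `ActionEvent` record, the button names, and both helper functions are hypothetical, and nothing here reflects Medal's actual data format or General Intuition's pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionEvent:
    """Hypothetical action label: one button change at a point in the clip."""
    t_ms: int      # milliseconds from the start of the clip
    button: str    # illustrative button name, e.g. "W" or "JUMP"
    pressed: bool  # True on button-down, False on button-up


def held_buttons_per_frame(events, frame_times_ms):
    """For each frame timestamp (ascending), return the set of buttons held then."""
    ordered = sorted(events, key=lambda e: e.t_ms)
    held = set()
    labels = []
    i = 0
    for t in frame_times_ms:
        # Apply every button change that happened up to and including this frame.
        while i < len(ordered) and ordered[i].t_ms <= t:
            event = ordered[i]
            if event.pressed:
                held.add(event.button)
            else:
                held.discard(event.button)
            i += 1
        labels.append(frozenset(held))
    return labels


def transitions(frames, actions):
    """Pair each frame with the action held during it and the frame that follows."""
    return [(frames[k], actions[k], frames[k + 1]) for k in range(len(frames) - 1)]


# Usage: four frames at roughly 30 fps, with placeholder strings standing in for images.
events = [
    ActionEvent(0, "W", True),
    ActionEvent(30, "JUMP", True),
    ActionEvent(60, "JUMP", False),
    ActionEvent(90, "W", False),
]
frame_times = [0, 33, 67, 100]
frames = ["f0", "f1", "f2", "f3"]

actions = held_buttons_per_frame(events, frame_times)
for frame, action, next_frame in transitions(frames, actions):
    print(frame, sorted(action), "->", next_frame)
```

Each triple ties an observed state to the input that was active and the state that followed, which is the kind of pairing video-only footage cannot provide. A real pipeline would also have to handle analog inputs, dropped frames, and clock drift between the input log and the video.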

From Fortnite to Quadruped Robots

The company’s technical ambition is to create a single model that generalizes across different domains: gameplay, simulation, and physical embodiment. In recent demonstrations, an AI agent trained on gameplay was able to navigate complex virtual environments, understanding that walls are solid objects and shadows change with the sun's movement.

Crucially, this "brain" is being ported directly to hardware. The company demonstrated a quadrupedal robot that used the same model powering its gaming agents. Notably, the team reported that it took only eight minutes of real-world robotics data, collected on actual streets, to fine-tune the model for the robot’s navigation. This suggests that the heavy lifting of learning physics and spatial awareness is being done in the "gym" of video games, making real-world deployment significantly more efficient.

A $2.3 Billion Bet on General Agents

The scale of this ambition is reflected in the company's valuation. General Intuition recently raised $320 million at a $2.3 billion valuation, bringing its total disclosed funding to $454 million. The round was led by Khosla Ventures, with significant participation from General Catalyst, Jeff Bezos, Eric Schmidt, and researchers from Google DeepMind and MIT.

The capital is earmarked for two primary goals: scaling compute capacity through a partnership with CoreWeave and making the company's API more widely available by the end of summer. For investors like Vinod Khosla, the goal isn't just better automation, but the emergence of "AI intuition": a human-like capability to navigate the world through understanding, rather than just following programmed instructions.

Key Takeaways

  • Action-Driven Training: General Intuition uses button-press "action labels" from gaming clips to teach AI causality, moving beyond the limitations of video-only training.
  • Scalable Simulation: By using video games as a "gym," the company can train complex spatial-temporal reasoning without the massive expense of gathering real-world robotic data.
  • High-Profile Backing: With a $2.3 billion valuation and backing from Khosla Ventures, General Catalyst, and figures like Jeff Bezos and Eric Schmidt, the company is positioning itself as a foundational player in the world model era.