General Intuition Raises $320M to Turn Video Game Data into Real-World AI

General Intuition is betting that the secrets to physical intelligence lie within the millions of hours of video game footage captured by players. By leveraging high-fidelity action data from gaming, the startup aims to build agentic models that can transition seamlessly from virtual environments like Fortnite to physical quadrupedal robots.

The Power of Action Labels Over Pure Video

Many competitors try to train AI agents on video alone, inferring movement from pixels. General Intuition has a proprietary advantage inherited from its predecessor, Medal: "action labels," the exact button presses and timestamps recorded alongside gameplay clips.

This distinction is critical for developing spatial-temporal reasoning. By knowing exactly how a human input results in a specific movement in a 3D space, the model learns causality: how an action affects the environment. CEO Pim de Witte argues that this allows the model to distinguish the "self" from the "environment," a fundamental requirement for any agent intended to operate in the physical world.
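
To make the idea of an action label concrete, here is a minimal Python sketch of pairing timestamped button events with video frames. The record layout, the button names, and the `held_buttons_per_frame` helper are all hypothetical, chosen for illustration; the article does not describe how Medal or General Intuition actually store this data.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class InputEvent:
    t_ms: int      # time of the input, in milliseconds from clip start
    button: str    # hypothetical button name, e.g. "W" or "JUMP"
    pressed: bool  # True on press, False on release


def held_buttons_per_frame(
    events: List[InputEvent], frame_times_ms: List[int]
) -> List[FrozenSet[str]]:
    """For each frame time (ascending), return the set of buttons held then."""
    ordered = sorted(events, key=lambda e: e.t_ms)
    held = set()
    labels = []
    i = 0
    for t in frame_times_ms:
        # Apply every input event that occurred at or before this frame.
        while i < len(ordered) and ordered[i].t_ms <= t:
            event = ordered[i]
            if event.pressed:
                held.add(event.button)
            else:
                held.discard(event.button)
            i += 1
        labels.append(frozenset(held))
    return labels


# Illustrative data only: a player holds W and taps JUMP partway through.
events = [
    InputEvent(0, "W", True),
    InputEvent(40, "JUMP", True),
    InputEvent(60, "JUMP", False),
    InputEvent(100, "W", False),
]
labels = held_buttons_per_frame(events, [0, 50, 66, 100])
```

Each frame now carries the inputs that were active when it was captured, which is the kind of (observation, action) pairing a video-only dataset would have to guess at.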

From Fortnite Simulation to Robot Embodiment

The company’s technical architecture rests on a "world model" that functions as an internal training gym. Instead of relying on a traditional game engine, the model generates environments frame by frame, so agents can learn physical regularities, such as the solidity of walls or the movement of shadows, through sheer repetition.
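
The "training gym" idea can be sketched as a rollout loop in which an agent acts inside a model's predicted next states instead of a game engine. The Python below is a toy under stated assumptions: `toy_world_model` is a hand-written stand-in for a learned next-frame predictor, the grid positions stand in for frames, and none of these names come from General Intuition.

```python
from typing import Callable, List, Tuple

State = Tuple[int, int]  # toy state: an (x, y) position standing in for a frame

WALLS = {(2, 0), (2, 1)}  # hypothetical solid cells
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, 1), "down": (0, -1)}


def toy_world_model(state: State, action: str) -> State:
    """Stand-in for a learned predictor: maps (state, action) to the next state."""
    dx, dy = MOVES[action]
    nxt = (state[0] + dx, state[1] + dy)
    # A trained model would have to learn wall solidity from data;
    # here the rule is hard-coded purely for illustration.
    return state if nxt in WALLS else nxt


def rollout(
    model: Callable[[State, str], State],
    policy: Callable[[State], str],
    start: State,
    steps: int,
) -> List[State]:
    """Let the policy act for `steps` steps entirely inside the model."""
    trajectory = [start]
    state = start
    for _ in range(steps):
        action = policy(state)
        state = model(state, action)
        trajectory.append(state)
    return trajectory


# An agent that always walks right gets stopped by the wall at x = 2.
trajectory = rollout(toy_world_model, lambda s: "right", (0, 0), 3)
```

The point of the design is that `rollout` never touches a game engine or a robot: every step is the model's own prediction, so experience can be generated as fast as the model can run.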

The practical application of this training is already visible in the company's hardware demos. The company has deployed the same "brain" used to navigate virtual landscapes into a large quadrupedal robot. Remarkably, the team reported that it took only eight minutes of real-world robotics data, collected on public streets, to fine-tune the model for the robot's physical navigation. If that figure holds up, it suggests that most of the learning happens in simulation, which would make real-world deployment significantly faster and cheaper.

A Massive $2.3 Billion Valuation

The scale of this ambition is reflected in the company's recent funding. General Intuition raised $320 million in a round led by Khosla Ventures at a valuation of $2.3 billion. The investor group includes prominent tech figures such as Jeff Bezos and Eric Schmidt, along with researchers from Google DeepMind and MIT.

The capital is earmarked for two primary objectives:

  • Scaling Compute: Through a partnership with CoreWeave, the company will focus on pre-training the next generation of its model.
  • API Accessibility: A portion of the funds will be used to launch a broader API, potentially allowing developers to tap into the company's agentic models by the end of the summer.

As the industry moves beyond the text-heavy era of Large Language Models (LLMs), General Intuition is positioning itself at the forefront of "world models"—AI that doesn't just talk about the world, but understands how to move through it.

Key Takeaways

  • Action-Driven Training: By using human gameplay "action labels" rather than just video, the model learns causality and spatial reasoning, which the company argues is more effective than video-only approaches.
  • Scalable Simulation: The startup uses video games as a "gym" to train agents, which it reports sharply reduces the amount of expensive real-world data needed to control physical robots.
  • Strategic Backing: With a $2.3B valuation and backing from heavyweights like Khosla Ventures and Jeff Bezos, the company is positioned to become a foundational layer for generalized AI agents.