Google DeepMind is not building a better gamer; it is building a better brain. The Scalable Instructable Multiworld Agent (SIMA) project, now in its second generation, is using the complex, open-ended universe of 3D video games, the very same worlds where players might be building a castle in Valheim or exploring a galaxy in No Man’s Sky, as the proving ground for nothing less than Artificial General Intelligence (AGI).
With SIMA 2, powered by the company’s flagship Gemini model, Google has moved its AI from being a passive instruction-follower to an active, reasoning collaborator, marking a pivotal step in the quest for truly generalist AI.
For years, AI breakthroughs in gaming were defined by specialization. DeepMind’s AlphaGo mastered the ancient game of Go, and AlphaStar conquered the real-time strategy of StarCraft II. But their brilliance was confined to a single, structured environment.
The reality of the world, whether physical or virtual, is chaos. That’s where SIMA comes in. SIMA 2 is designed to act like a human player: it observes the game screen, processes natural language commands, and uses a virtual keyboard and mouse to operate, all without access to the game’s hidden internal code.
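That interface, screen pixels and a language instruction in, keyboard and mouse actions out, can be made concrete with a minimal sketch. All names here are hypothetical illustrations, not DeepMind's actual API; the point is only that the agent never touches the game's internal state.

```python
from dataclasses import dataclass

# Illustrative sketch of a SIMA-style agent interface (hypothetical names,
# not DeepMind's API): the agent sees only screen pixels plus a natural-
# language instruction, and emits keyboard/mouse actions in response.

@dataclass
class Observation:
    pixels: list          # stand-in for a raw screen capture
    instruction: str      # natural-language command from the user

def policy(obs: Observation) -> list:
    """Stub policy mapping (pixels, instruction) to keyboard/mouse actions.
    A real agent would run a vision-language model here."""
    if "chop" in obs.instruction:
        return ["move_mouse(tree)", "click_left", "press(E)"]
    return ["noop"]

def run_step(obs: Observation) -> list:
    # The agent never reads game state directly -- only the observation.
    return policy(obs)

actions = run_step(Observation(pixels=[0] * 4, instruction="chop down the tree"))
print(actions)
```

The design constraint mirrors how a human plays: because nothing in the loop depends on the game's hidden code, the same interface works across any title that renders to a screen.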
The major breakthrough of SIMA 2 lies in its transfer learning capability. Having been trained across a diverse portfolio of commercial games, including survival, building, and exploration titles like Goat Simulator 3, Valheim, and Satisfactory, the agent learns abstract concepts. If it learns what “mining” means in one game, it can apply that understanding to “harvesting” in a completely different one.
“SIMA 2 is a step change and improvement in capabilities over SIMA 1. It’s a more general agent that can complete complex tasks in previously unseen environments.”
– Joe Marino, Senior Research Scientist at DeepMind
The engine behind SIMA 2’s dramatic improvement is the Gemini 2.5 Flash Lite model. The integration elevates the agent from a simple command executor to a conversational partner.
Perhaps the most groundbreaking advancement in SIMA 2 is its autonomous self-improvement cycle.
The original SIMA relied heavily on recorded human gameplay data. SIMA 2 uses this as a base, but then shifts into a self-directed learning mode: utilizing a separate Gemini model to generate new tasks and score its attempts, the agent learns from its own trial-and-error experience, steadily teaching itself new skills without additional human demonstrations.
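The cycle described above, propose a task, attempt it, score the attempt, and fold successes back into training, can be sketched in a few lines. This is a toy simulation under loudly stated assumptions: the function names, the success probabilities, and the binary reward are all illustrative, not DeepMind's implementation.

```python
import random

# Hedged toy sketch of a self-directed learning cycle: task generation,
# attempt, scoring, and feeding successes back into training. Every name
# and number here is an illustrative assumption.

def generate_task(rng):
    """Stand-in for a Gemini model proposing a new task in the game."""
    return rng.choice(["chop a tree", "mine ore", "build a shelter"])

def attempt_task(task, skill_level, rng):
    """Stand-in for the agent playing; success odds grow with skill."""
    return rng.random() < min(0.9, 0.2 + 0.1 * skill_level)

def reward_model(task, succeeded):
    """Stand-in for a second model scoring the attempt."""
    return 1.0 if succeeded else 0.0

def self_improve(steps=50, seed=0):
    rng = random.Random(seed)
    experience, skill = [], 0
    for _ in range(steps):
        task = generate_task(rng)
        ok = attempt_task(task, skill, rng)
        if reward_model(task, ok) > 0:
            experience.append(task)   # successful trajectories feed training
            skill += 1                # retraining makes later attempts easier
    return len(experience)

print(self_improve())
```

The key property the sketch captures is the flywheel: each success makes the next attempt more likely to succeed, so the agent bootstraps competence without any new human gameplay data.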
This ability to rapidly adapt and learn in a new game, or even in entirely AI-generated 3D worlds created by Google’s Genie model, demonstrates a level of generalization that far surpasses previous benchmarks. The success rate for complex tasks has doubled compared to SIMA 1’s initial performance.
For DeepMind, SIMA is not a gaming tool; it is a virtual robot. The skills it is mastering within simulated 3D environments (navigation, tool use, understanding complex instructions, and collaborative task execution) are the exact building blocks required for developing advanced, general-purpose physical robots.
Video games offer a safe, scalable, and boundless sandbox to stress-test these agents. Every time SIMA 2 successfully chops down a tree in Valheim or repairs a spaceship in No Man’s Sky, it is practicing a core capability needed by a future AI assistant in the real world.
While SIMA 2 remains a research-preview agent, currently unavailable to the public and limited to a few academic and developer partners, Google DeepMind’s ambition is clear. By harnessing Gemini’s reasoning power and training its agent across a multi-world intelligence landscape, they are not just aiming to win a game; they are laying the groundwork for the general-purpose, helpful AI agents that will eventually operate in all the complex, dynamic, and unpredictable environments of the real world.