Google DeepMind announces Genie 2, a model that can generate playable 3D worlds

Google DeepMind, the folks behind AlphaGo, has announced Genie 2, a model that can generate interactive 3D worlds from a single image prompt. It is designed to help train and test AI agents by letting them interact with a wide variety of dynamic environments using keyboard and mouse inputs. Here’s a breakdown of its key capabilities, according to DeepMind:

Action-Controllable: Genie 2 responds to actions, like keyboard and mouse inputs, allowing a person or AI to interact with the environment. For example, when you press the arrow keys, it understands that it has to move the character, and thus doesn’t mistakenly move objects like trees or clouds.

A GIF showing Genie 2's capabilities. Image: Google DeepMind

Long Horizon Memory: Genie 2 can remember parts of the world that are no longer in view and render them when they come back into the scene, making the simulation feel more continuous and realistic.
New Content On-the-Fly: It can create new, consistent content while maintaining the integrity of the world over time, so environments keep evolving in a believable way.
Emergent Capabilities: Genie 2 can model complex interactions, like physics, gravity, and lighting, and even animate characters and simulate behaviors of non-playable characters (NPCs). It can handle everything from water effects to character movement and smoke.
Counterfactual Simulation: The system can generate different paths from the same starting point. This feature allows researchers to test different outcomes, providing a way to simulate a variety of experiences for training purposes.

Real-World Image Prompting: Not just limited to computer-generated images, Genie 2 can also use real-world photos as prompts, simulating natural elements like grass blowing in the wind or water flowing.
Rapid Prototyping: Researchers can quickly create interactive experiences with Genie 2, allowing for fast testing and training in different environments. It can turn concept art or drawings into full, interactive worlds.
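
To make the action-controllable and counterfactual ideas above concrete, here is a minimal toy sketch in Python. It is not Genie 2's API or architecture; the `State`, `step`, and `rollout` names are hypothetical, and a simple grid world stands in for the generated environment. It only illustrates the pattern of branching several action sequences from one shared starting point.

```python
from dataclasses import dataclass

# Hypothetical toy world: a character on a 2D grid.
# This is an illustrative stand-in, NOT Genie 2's actual interface.


@dataclass(frozen=True)
class State:
    x: int
    y: int


# Keyboard-style actions mapped to character movement.
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def step(state: State, action: str) -> State:
    """Apply one action; only the character's position changes."""
    dx, dy = MOVES[action]
    return State(state.x + dx, state.y + dy)


def rollout(start: State, actions: list[str]) -> list[State]:
    """Return the trajectory produced by an action sequence from `start`."""
    trajectory = [start]
    for action in actions:
        trajectory.append(step(trajectory[-1], action))
    return trajectory


# Counterfactual branches: same starting point, different action sequences.
start = State(0, 0)
path_a = rollout(start, ["right", "right", "up"])
path_b = rollout(start, ["up", "up", "left"])

print(path_a[-1])  # State(x=2, y=1)
print(path_b[-1])  # State(x=-1, y=2)
```

Both trajectories begin at the same state and diverge only because the actions differ, which is the property that lets researchers compare outcomes from an identical starting frame.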

Generative AI like Genie 2 isn't free from controversy. Copyright and intellectual property issues are major sticking points in this space. Models like these are often trained on datasets pulled from the internet, and that sometimes includes copyrighted material.

Artists, game developers, and even tech companies have raised concerns about unauthorized use of their content in training these models. Lawsuits have already popped up in other areas of generative AI, targeting companies like OpenAI and Stability AI, with plaintiffs arguing that their works were used without permission. It’s not hard to imagine similar cases arising here, especially as these AI-generated worlds become increasingly indistinguishable from human-created designs.

Adding another layer of complexity is the broader criticism of data scraping practices. Many users are frustrated with companies like Meta and X, which use data from their platforms to train models, often without explicit consent.