Google DeepMind Unveils Gemini Robotics 1.5 AI Models to Power Next-Generation General-Purpose Robots

Introducing Gemini Robotics-ER 1.5 and Gemini Robotics 1.5: Advanced AI Models for Robotics

Published: September 26, 2025

By Ashish Kumar


In a major leap for artificial intelligence and robotics, Google DeepMind has announced the launch of two new models in its Gemini Robotics family — Gemini Robotics-ER 1.5 and Gemini Robotics 1.5. These AI systems are designed to enhance the way robots reason, see, and act in the real world, making them more capable than any previous embodied AI model developed by the Mountain View-based tech giant.

The distinction between the two lies in their roles: while Gemini Robotics 1.5 is built to execute tasks using natural language instructions, Gemini Robotics-ER 1.5 functions as the planner or orchestrator, responsible for creating logical, multi-step strategies. Together, they aim to transform general-purpose robots into more intelligent, adaptable, and efficient assistants.

Gemini Robotics AI: Acting as the “Brain” for Future Robots

According to DeepMind’s official blog, these new AI models are designed to act as a robot’s brain, enabling real-world functioning through advanced vision-language and vision-language-action (VLA) capabilities. Unlike traditional robotics systems that require predefined programming or rigid interfaces, these models allow robots to understand and respond to natural language commands, making human-robot interaction smoother and more intuitive.

However, achieving this milestone has not been without challenges. Large language models (LLMs), despite their success in natural language understanding, struggle with tasks that require precise movements, complex spatial awareness, or real-time decision-making. Previously, a single model had to handle both planning and execution, which often resulted in slower performance and errors in physical tasks.

Two-Model Approach: How Google Solves the Planning vs. Execution Challenge

To address these limitations, DeepMind introduced a two-model system. The first, Gemini Robotics-ER 1.5, is a sophisticated vision-language model (VLM) with advanced reasoning and tool-calling abilities. It can generate multi-step plans for a given task, and even integrate external tools like Google Search to retrieve relevant information before creating a solution. The company states that this model delivers state-of-the-art (SOTA) performance on multiple benchmarks for spatial and logical reasoning.
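To make that division of labor concrete, here is a minimal, purely illustrative Python sketch of the orchestrator role: produce an ordered multi-step plan, optionally consulting an external tool first. Every name in it (plan_task, web_search, PlanStep) is a hypothetical stand-in for illustration, not DeepMind's published interface.

```python
# Minimal sketch of the planner/orchestrator role described above.
# All names here (PlanStep, web_search, plan_task) are hypothetical
# illustrations, not DeepMind's published API.
from dataclasses import dataclass


@dataclass
class PlanStep:
    description: str       # natural-language description of the step
    requires_lookup: bool  # whether external info (e.g. search) was needed


def web_search(query: str) -> str:
    """Stand-in for a tool call such as Google Search (assumed capability)."""
    return f"search results for: {query}"


def plan_task(instruction: str) -> list[PlanStep]:
    """Toy planner: turns one instruction into an ordered multi-step plan.

    A real orchestrator model (Gemini Robotics-ER 1.5, per the article)
    would reason over images and text; this stub only shows the shape
    of the plan that gets handed to the executor.
    """
    context = web_search(instruction)  # consult a tool before planning
    _ = context                        # (unused in this toy example)
    return [
        PlanStep("locate the relevant objects in the scene", False),
        PlanStep("decide the target destination for each object", True),
        PlanStep("hand each pick-and-place step to the executor", False),
    ]


if __name__ == "__main__":
    for i, step in enumerate(plan_task("sort the items on the table"), 1):
        print(i, step.description)
```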

Once the plan is ready, execution is handled by Gemini Robotics 1.5, a vision-language-action (VLA) model. This model can process both visual inputs and natural language instructions, translating them into precise motor actions for robots. What makes it unique is its ability to explain its reasoning process in natural language before executing tasks, thereby improving transparency and trust in robotic decision-making.
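The executor side of that handoff might look like the following toy loop, which mirrors the explain-then-act behavior described above. Again, MotorCommand, read_camera, and explain_and_act are invented for illustration only; they are not part of any real robotics API.

```python
# Hedged sketch of the executor role: a VLA-style step that states its
# reasoning in natural language before acting. Everything here
# (MotorCommand, read_camera, explain_and_act) is hypothetical.
from dataclasses import dataclass


@dataclass
class MotorCommand:
    joint: str
    delta: float  # radians, illustrative only


def read_camera() -> bytes:
    """Stand-in for grabbing the current camera frame."""
    return b"<image bytes>"


def explain_and_act(step: str, frame: bytes) -> tuple[str, list[MotorCommand]]:
    """Toy executor: returns an explanation first, then motor commands.

    The article notes that Gemini Robotics 1.5 explains its reasoning in
    natural language before executing; this stub mirrors that contract.
    """
    explanation = f"I will '{step}' by moving the gripper toward the target."
    commands = [MotorCommand("shoulder", 0.12), MotorCommand("gripper", -0.30)]
    return explanation, commands


if __name__ == "__main__":
    why, actions = explain_and_act("pick up the bottle", read_camera())
    print(why)
    for cmd in actions:
        print(f"move {cmd.joint} by {cmd.delta:+.2f} rad")
```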

Real-World Applications of Gemini Robotics AI

Google DeepMind claims that this dual-model system significantly improves a robot’s ability to handle complex, multi-step tasks. For example, if a user instructs a robot to “separate objects into compost, recycling, and trash bins”, the AI can first search local recycling guidelines online, then visually analyze the objects in front of it, create a detailed action plan, and finally execute the sorting task efficiently.
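Stitched together, the sorting example reduces to a three-stage pipeline: look up the rules, analyze the scene, emit one action per object. The sketch below shows that shape with placeholder functions (fetch_guidelines, detect_objects, sort_scene) standing in for the model calls, not a real robotics API.

```python
# End-to-end sketch of the sorting example above, wiring a planner to an
# executor. All function names are placeholders for illustration.
def fetch_guidelines(city: str) -> dict[str, str]:
    """Stand-in for the online lookup step (e.g. local recycling rules)."""
    return {"bottle": "recycling", "banana peel": "compost", "wrapper": "trash"}


def detect_objects(frame: bytes) -> list[str]:
    """Stand-in for the visual-analysis step."""
    return ["bottle", "banana peel", "wrapper"]


def sort_scene(frame: bytes, city: str) -> list[str]:
    """Plan: look up rules, analyze the scene, emit one action per object."""
    rules = fetch_guidelines(city)    # 1. search guidelines online
    objects = detect_objects(frame)   # 2. visually analyze the scene
    return [f"place {obj} in the {rules.get(obj, 'trash')} bin"
            for obj in objects]       # 3. build the action plan


if __name__ == "__main__":
    for action in sort_scene(b"<frame>", "Mountain View"):
        print(action)
```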

This opens up possibilities across multiple sectors, from household assistance and healthcare to warehouse automation and industrial robotics. Robots equipped with Gemini Robotics AI could become indispensable in environments where precision, adaptability, and reasoning are essential.

Why Gemini Robotics Models Are a Breakthrough in AI and Robotics

The unveiling of Gemini Robotics-ER 1.5 and Gemini Robotics 1.5 signals a significant step toward general-purpose robotics, where machines are no longer limited to single, repetitive tasks but can flexibly adapt to diverse real-world challenges. By separating planning and execution, Google has not only solved a longstanding limitation in robotics but also paved the way for more human-like, intelligent robot assistants.

With this announcement, Google DeepMind reinforces its position at the forefront of AI-driven robotics innovation. As these models evolve and move closer to real-world deployment, they hold the promise of transforming industries and daily life alike, shaping a future where intelligent robots can seamlessly collaborate with humans.


About the Author
Ashish Kumar

Ashish Kumar is the creative mind behind The Fox Daily, where technology, innovation, and storytelling meet. A passionate developer and web strategist, Ashish began exploring the web when blogs were hand-coded, and CSS hacks were a rite of passage. Over the years, he has evolved into a full-stack thinker—crafting themes, optimizing WordPress experiences, and building platforms that blend utility with design. With a strong footing in both front-end flair and back-end logic, Ashish enjoys diving into complex problems—from custom plugin development to AI-enhanced content experiences. He is currently focused on building a modern digital media ecosystem through The Fox Daily, a platform dedicated to tech trends, digital culture, and web innovation. Ashish refuses to stick to the mainstream—often found experimenting with emerging technologies, building in-house tools, and spotlighting underrepresented tech niches. Whether it's creating a smarter search experience or integrating push notifications from scratch, Ashish builds not just for today, but for the evolving web of tomorrow.
