Google DeepMind has launched Gemini Robotics 1.5, a new vision-language-action (VLA) model designed to help robots perform complex, multi-step tasks with greater autonomy and transparency.
The release includes two complementary models: Gemini Robotics 1.5 and Gemini Robotics-ER 1.5. The former is DeepMind’s most advanced VLA system to date, capable of turning visual input and instructions into motor commands.
Unlike previous generations, it generates reasoning steps before acting, allowing robots to explain their decision-making processes and adapt more effectively to new environments.
Gemini Robotics-ER 1.5, meanwhile, is positioned as an embodied reasoning model. Acting as a high-level “orchestrator”, it can plan, make logical decisions, and natively call digital tools such as Google Search to gather information.
According to DeepMind, the model has achieved state-of-the-art results across 15 embodied reasoning benchmarks, including ERQA, Point-Bench and RoboSpatial-VQA.
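That native tool-calling maps onto the search-grounding mechanism the Gemini API already exposes. As a rough sketch (assuming the google-genai Python SDK; the preview model ID is a guess and may differ from what your account exposes), enabling Google Search for an embodied-reasoning request could look like this:

```python
# Sketch: grounding an embodied-reasoning request with Google Search via the
# google-genai SDK. The model ID below is an assumption for illustration.
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed preview model ID
    contents=(
        "Look up the local recycling rules for this city, then plan the "
        "steps a robot should take to sort these items into the right bins."
    ),
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```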
Together, the two models are intended to work in an agentic framework. Gemini Robotics-ER 1.5 generates high-level plans and instructions, while Gemini Robotics 1.5 executes them by translating language and vision into physical actions.
This cooperation is designed to improve robots’ ability to generalize across longer tasks and more diverse environments.
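To make the division of labor concrete, here is an illustrative-only sketch of that orchestrator/executor loop. The class and method names are hypothetical stand-ins, not a published DeepMind interface: the planner emits natural-language sub-tasks, and the VLA executor consumes them one at a time, reporting success or failure so the planner can replan.

```python
# Hypothetical sketch of the two-model agentic framework described above.
# Neither class reflects a real DeepMind API; they only show the data flow.
from dataclasses import dataclass


@dataclass
class Step:
    instruction: str  # natural-language sub-task, e.g. "open the top drawer"


class Orchestrator:
    """Stands in for Gemini Robotics-ER 1.5: plans and reasons, never moves."""

    def plan(self, goal: str) -> list[Step]:
        # A real system would query the embodied-reasoning model here,
        # optionally calling tools like Google Search first.
        return [Step(f"sub-task {i + 1} of: {goal}") for i in range(3)]


class Executor:
    """Stands in for Gemini Robotics 1.5: maps one instruction plus camera
    input to motor commands, generating reasoning steps before acting."""

    def act(self, step: Step) -> bool:
        print(f"executing: {step.instruction}")
        return True  # success/failure feeds back into replanning


def run(goal: str) -> None:
    orchestrator, executor = Orchestrator(), Executor()
    for step in orchestrator.plan(goal):
        if not executor.act(step):
            # On failure, a real orchestrator would observe the scene
            # and produce a revised plan rather than simply stopping.
            break


run("sort the laundry into light and dark piles")
```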
DeepMind highlighted several new capabilities enabled by Gemini Robotics 1.5. The system is able to “think before acting”, breaking down long tasks into smaller, more manageable segments.
It also demonstrates cross-embodiment learning, transferring skills learned on one robot to others with different form factors, such as from ALOHA 2 to Apptronik’s Apollo humanoid or a dual-arm Franka robot.
The company emphasized the importance of safety in developing embodied AI. It said Gemini Robotics 1.5 incorporates high-level semantic reasoning for safety, alignment with Gemini’s existing safety policies, and low-level collision-avoidance subsystems.
DeepMind has also updated its ASIMOV benchmark for evaluating semantic safety, noting that Gemini Robotics-ER 1.5 achieved state-of-the-art performance in internal testing.
DeepMind framed the launch as a milestone toward building general-purpose robots capable of reasoning, planning and tool use.
By moving beyond reactive systems, the company said, the Gemini Robotics line represents a step toward solving artificial general intelligence (AGI) in the physical world.
Gemini Robotics-ER 1.5 is now available to developers through the Gemini API in Google AI Studio. Gemini Robotics 1.5 is being offered initially to select partners.
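For developers trying the API, a first call might look like the sketch below (assuming the google-genai Python SDK; the preview model ID is a guess based on Google's naming convention, so check Google AI Studio for the current one):

```python
# Minimal sketch of calling Gemini Robotics-ER 1.5 through the Gemini API.
# Install the SDK with: pip install google-genai
from google import genai

client = genai.Client()  # expects GEMINI_API_KEY in the environment

# A real robotics pipeline would pass camera frames alongside the prompt;
# a plain text request keeps this example self-contained.
response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed model ID
    contents="Break 'clear the breakfast table' into ordered robot sub-tasks.",
)
print(response.text)
```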