Google DeepMind has launched Gemini Robotics 1.5, a new vision-language-action (VLA) model designed to help robots perform complex, multi-step tasks with greater autonomy and transparency.
The release includes two complementary models: Gemini Robotics 1.5 and Gemini Robotics-ER 1.5. The former is DeepMind’s most advanced VLA system to date, capable of turning visual input and instructions into motor commands.
Unlike previous generations, it generates reasoning steps before acting, allowing robots to explain their decision-making processes and adapt more effectively to new environments.
Gemini Robotics-ER 1.5, meanwhile, is positioned as an embodied reasoning model. Acting as a high-level “orchestrator”, it can plan, make logical decisions, and natively call digital tools such as Google Search to gather information.
According to DeepMind, the model has achieved state-of-the-art results across 15 embodied reasoning benchmarks, including ERQA, Point-Bench and RoboSpatial-VQA.
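That native tool-calling maps onto the search-grounding mechanism the Gemini API already exposes. As a rough sketch (assuming the google-genai Python SDK; the preview model ID is a guess and may differ from what your account exposes), enabling Google Search for an embodied-reasoning request could look like this:

```python
# Sketch: grounding an embodied-reasoning request with Google Search via the
# google-genai SDK. The model ID below is an assumption for illustration.
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed preview model ID
    contents=(
        "Look up the local recycling rules for this city, then plan the "
        "steps a robot should take to sort these items into the right bins."
    ),
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```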
Together, the two models are intended to work in an agentic framework. Gemini Robotics-ER 1.5 generates high-level plans and instructions, while Gemini Robotics 1.5 executes them by translating language and vision into physical actions.
This cooperation is designed to improve robots’ ability to generalize across longer tasks and more diverse environments.
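To make the division of labor concrete, here is an illustrative-only sketch of that orchestrator/executor loop. The class and method names are hypothetical stand-ins, not a published DeepMind interface: the planner emits natural-language sub-tasks, and the VLA executor consumes them one at a time, reporting success or failure so the planner can replan.

```python
# Hypothetical sketch of the two-model agentic framework described above.
# Neither class reflects a real DeepMind API; they only show the data flow.
from dataclasses import dataclass


@dataclass
class Step:
    instruction: str  # natural-language sub-task, e.g. "open the top drawer"


class Orchestrator:
    """Stands in for Gemini Robotics-ER 1.5: plans and reasons, never moves."""

    def plan(self, goal: str) -> list[Step]:
        # A real system would query the embodied-reasoning model here,
        # optionally calling tools like Google Search first.
        return [Step(f"sub-task {i + 1} of: {goal}") for i in range(3)]


class Executor:
    """Stands in for Gemini Robotics 1.5: maps one instruction plus camera
    input to motor commands, generating reasoning steps before acting."""

    def act(self, step: Step) -> bool:
        print(f"executing: {step.instruction}")
        return True  # success/failure feeds back into replanning


def run(goal: str) -> None:
    orchestrator, executor = Orchestrator(), Executor()
    for step in orchestrator.plan(goal):
        if not executor.act(step):
            # On failure, a real orchestrator would observe the scene
            # and produce a revised plan rather than simply stopping.
            break


run("sort the laundry into light and dark piles")
```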
DeepMind highlighted several new capabilities enabled by Gemini Robotics 1.5. The system is able to “think before acting”, breaking down long tasks into smaller, more manageable segments.
It also demonstrates cross-embodiment learning, transferring skills learned on one robot to others with different form factors, such as from ALOHA 2 to Apptronik’s Apollo humanoid or a dual-arm Franka robot.
The company emphasized the importance of safety in developing embodied AI. It said Gemini Robotics 1.5 incorporates high-level semantic reasoning for safety, alignment with Gemini’s existing safety policies, and low-level collision-avoidance subsystems.
DeepMind has also updated its ASIMOV benchmark for evaluating semantic safety, noting that Gemini Robotics-ER 1.5 achieved state-of-the-art performance in internal testing.
DeepMind framed the launch as a milestone toward building general-purpose robots capable of reasoning, planning and tool use.
By moving beyond reactive systems, the company said, the Gemini Robotics line represents a step toward solving artificial general intelligence (AGI) in the physical world.
Gemini Robotics-ER 1.5 is now available to developers through the Gemini API in Google AI Studio. Gemini Robotics 1.5 is being offered initially to select partners.
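For developers trying the API, a first call might look like the sketch below (assuming the google-genai Python SDK; the preview model ID is a guess based on Google's naming convention, so check Google AI Studio for the current one):

```python
# Minimal sketch of calling Gemini Robotics-ER 1.5 through the Gemini API.
# Install the SDK with: pip install google-genai
from google import genai

client = genai.Client()  # expects GEMINI_API_KEY in the environment

# A real robotics pipeline would pass camera frames alongside the prompt;
# a plain text request keeps this example self-contained.
response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed model ID
    contents="Break 'clear the breakfast table' into ordered robot sub-tasks.",
)
print(response.text)
```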