Google DeepMind is launching Gemini 2.5 Deep Think, a new multi-agent reasoning model, now available to AI Ultra subscribers ($250/month) in the Gemini app. First introduced at Google I/O 2025, Deep Think uses multiple AI agents to work on a question in parallel: they propose hypotheses, develop them, and then select the strongest one. This parallel thinking approach takes more time, but it allows for more deliberate, creative responses.
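The propose, process, and select loop described above can be illustrated with a toy sketch. Google has not published how Deep Think is implemented, so nothing here reflects the actual system: the `Hypothesis` class, `run_agent`, and `parallel_think` are hypothetical names, and the "problem" (approximating a square root) is only a stand-in for a real reasoning task.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Hypothesis:
    agent_id: int
    answer: float
    score: float  # higher is better


def run_agent(agent_id: int, target: float, steps: int = 3) -> Hypothesis:
    """Hypothetical agent: propose a guess for sqrt(target), refine it, score it."""
    # Propose: each agent starts from a different initial guess.
    guess = float(agent_id + 1)
    # Process: refine the guess with a few Newton iterations.
    for _ in range(steps):
        guess = 0.5 * (guess + target / guess)
    # Score: a smaller squared-error gives a higher (less negative) score.
    score = -abs(guess * guess - target)
    return Hypothesis(agent_id, guess, score)


def parallel_think(target: float, n_agents: int = 4) -> Hypothesis:
    """Run all agents concurrently, then keep the best-scoring hypothesis."""
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        hypotheses = list(pool.map(lambda i: run_agent(i, target), range(n_agents)))
    # Select: choose the winning hypothesis by score.
    return max(hypotheses, key=lambda h: h.score)


best = parallel_think(2.0)
print(best)
```

The sketch also shows the trade-off the article mentions: running several agents costs more compute and time than a single pass, in exchange for a pool of candidate answers to choose from.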
In early testing, Google used a version of this model to win a gold medal at the 2025 International Mathematical Olympiad (IMO). The publicly available version is a more basic one, geared toward day-to-day use, and it tests in the bronze range on IMO-type problems.
On hard reasoning benchmarks, Deep Think leads the reported results: it scored 34.8% on Humanity's Last Exam, compared with 25.4% for xAI's Grok 4 and 20.3% for OpenAI's o3, and 87.6% on LiveCodeBench 6, ahead of both competitors. The model has built-in tools that let it use Google Search and execute code, and it can produce much longer, more detailed responses.
Google has also applied new reinforcement learning techniques so that Deep Think makes fuller use of its reasoning paths. Safety protocols have been tightened, making it Google's most secure model to date. Access is limited to a set number of prompts per day, and broader enterprise and API access will be available in the coming weeks via Vertex AI. Google plans to use feedback from academic institutions and trusted testers to refine the model before making it generally available to the public.
In summary, Gemini 2.5 Deep Think represents a sizeable step forward for publicly available AI reasoning: it combines slow, deliberate reasoning with competitive benchmark performance, and it is now rolling out for research use cases and the most complex problem-solving tasks.