Google launched Gemini 3 Deep Think—a sophisticated AI service for scientific tasks that continues to “tune” responses to queries.


Google unveiled an updated version of Gemini 3 Deep Think

The company announced a major update to its Gemini 3 Deep Think platform, an AI capable of reasoning through and solving complex scientific and engineering problems.

What changed
Parameter | New | How it looks
Purpose | Move from pure theory to practical application | Solves tasks without strict constraints and with incomplete data
Access | Built into the Gemini app | Google AI Ultra subscribers can use it; engineers and companies can access it through the API (application required)
Development partners | Scientists and researchers | Collaborative work on complex problems

Performance metrics
Test | Result | Comment
Humanity’s Last Exam | 48.4 % | No third‑party tools
ARC‑AGI‑2 | 84.6 % | Benchmark for AI assistants
Codeforces (Elo) | 3455 | High rating among software solutions
IMO 2025 | Gold medal | Equivalent level of international olympiad participants
Chemistry/Physics | Same result | Demonstrated versatility across disciplines
CMT‑Benchmark (theoretical physics) | 50.5 % | Good grasp of complex concepts

AI agent “Aletheia”
In DeepMind’s lab, Google created the Aletheia agent based on Gemini 3 Deep Think. Key features:

1. Hypothesis testing – the agent identifies weaknesses in proposed solutions and iteratively corrects them.
2. Recognition of uncertainty – it can say it does not know an answer.
3. Interaction with external sources – uses Google’s search service and web navigation, but avoids fabricating links.
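
The first two features can be pictured as a generate-verify-revise loop. The sketch below is purely illustrative and is not Google's implementation: `solve_with_verification`, `Verdict`, `propose`, and `verify` are hypothetical names, and the toy proposer and verifier stand in for model calls. The third feature, use of external sources, is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    ok: bool            # True when the verifier found no flaw
    critique: str = ""  # description of the weakness, if any


def solve_with_verification(
    problem: str,
    propose: Callable[[str, List[str]], str],
    verify: Callable[[str, str], Verdict],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return a candidate that passed verification, or None for "I don't know"."""
    critiques: List[str] = []
    for _ in range(max_rounds):
        candidate = propose(problem, critiques)  # draft, given earlier critiques
        verdict = verify(problem, candidate)     # look for weaknesses
        if verdict.ok:
            return candidate
        critiques.append(verdict.critique)       # feed the flaw into the next draft
    return None  # admit uncertainty rather than return an unverified answer


# Toy stand-ins (assumptions for illustration only).
def toy_propose(problem: str, critiques: List[str]) -> str:
    return str(5 + len(critiques))  # guesses 5, 6, 7, ...


def toy_verify(problem: str, candidate: str) -> Verdict:
    x = int(candidate)
    if x * x == 49:
        return Verdict(ok=True)
    return Verdict(ok=False, critique=f"{x}^2 != 49")


print(solve_with_verification("x*x == 49", toy_propose, toy_verify))                # 7
print(solve_with_verification("x*x == 49", toy_propose, toy_verify, max_rounds=2))  # None
```

Returning `None` when the round budget runs out mirrors the "recognition of uncertainty" behavior: the loop declines to answer instead of passing along a draft that failed its own check.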

Achievement levels
Google broke Aletheia’s successes into five stages:

Stage | Description | Examples
0 – “minor novelty” | Fully autonomous mode, three Erdős problems solved (first level) | Three Erdős tasks
1 – “minimal novelty” | One additional result in autonomous mode | Fourth task
2 – “publishable readiness” | Results both autonomously and with human collaboration, plus auxiliary tools | Tasks
3–4 – “significant/landmark breakthrough” | Not yet achieved | –

How Aletheia handles Erdős problems
* Of the 700 unsolved problems to date, the agent solved 13.
* However, only 4 of them are truly new; the rest were already known to the scientific community.
* Out of 212 submitted solutions, 68.5 % contained fundamental errors and 31.5 % were technically correct, but most of the latter misinterpreted the problem, so only 6.5 % were substantively correct.
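
To put these figures in proportion, here is a short arithmetic sketch. It assumes, which the article does not confirm, that the percentages are shares of all 212 submitted solutions; the helper names `share` and `approx_count` are made up for this example.

```python
def share(part: int, total: int) -> float:
    """Percentage of `part` in `total`, rounded to one decimal place."""
    return round(100 * part / total, 1)


def approx_count(percent: float, total: int) -> float:
    """Approximate number of items that `percent` of `total` corresponds to."""
    return round(percent / 100 * total, 1)


# Figures quoted in the article.
problems_attempted = 700  # unsolved Erdős problems
problems_solved = 13      # solved by the agent
truly_new = 4             # solutions that were not already known
submitted = 212           # submitted solutions (assumed base for the percentages)

print(share(problems_solved, problems_attempted))  # 1.9  -> about 1.9 % of problems solved
print(share(truly_new, problems_solved))           # 30.8 -> about 30.8 % of those are new
print(approx_count(6.5, submitted))                # 13.8 -> roughly 14 substantively correct
print(approx_count(68.5, submitted))               # 145.2 -> roughly 145 fundamentally flawed
```

Under that assumption, fewer than one problem in fifty was solved, and fewer than one in three of those solutions was new.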

Developers note that AI tends to “re‑interpret the question to answer more simply,” and remains “extremely prone to errors compared with humans.” As a result, it cannot yet replace mathematicians.

In short: Gemini 3 Deep Think and its agent Aletheia demonstrate impressive results across scientific fields but still have significant limitations in accuracy and reliability. Google continues to work on improving AI’s deep reasoning and self‑verification capabilities.
