Deep reasoning is beyond the capabilities of today’s AI models. GPT-5 shows some progress, but overall performance is a far cry from what is required to solve problems at expert level. Statements about models reaching PhD level should be taken with a measure of skepticism.
Are frontier AI models really capable of “PhD-level” reasoning? To answer this question, we introduce FormulaOne, a new reasoning benchmark of expert-level Dynamic Programming problems. We have curated a benchmark consisting of three tiers of increasing complexity, which we call ‘shallow’, ‘deeper’, and ‘deepest’. The results are remarkable:
- On the ‘shallow’ tier, top models reach performance of 50%-70%, indicating that the models are familiar with the subject matter.
- On ‘deeper’, Grok 4, Gemini-Pro, o3-Pro, and Opus-4 all solve at most 1/100 problems. GPT-5 Pro is significantly better, but still solves only 4/100 problems.
- On ‘deepest’, all models collapse to a 0% success rate.
🧵

Aug 14, 2025 · 12:54 PM UTC

Replying to @AmnonShashua
Appreciate so much when performance is measured by truly talented experts who are not affiliated with the companies building and marketing their models. Thank you 🙏🏻
Replying to @AmnonShashua
Interesting study. I feel like this explains why autonomous driving can struggle with "edge cases", especially if they require deeper thinking. And I am guessing this is why @Mobileye argues for compound AI and redundancy instead of pure e2e single large model? @shai_s_shwartz
AI can mimic the semblance of reason, creativity, and even consciousness… to the point of fooling humans in a Turing test. But it is impossible for a non-metabolic system to have any consciousness. Consciousness is the ontological primary, not the other way around, as much as the world appears physicalist.
Replying to @AmnonShashua
To my mind, this is just another step in automation, a process that humanity has been dealing with since the 18th century 😀
Replying to @AmnonShashua
Thank you, Prof. Shashua. You remind me of my graduate Archaeology prof: "ground truth is in layers".