Continuing our IMO-gold journey, I’m delighted to share our #EMNLP2025 paper “Towards Robust Mathematical Reasoning”, which tells some of the key stories behind the success of our advanced Gemini #DeepThink at this year’s IMO. Finding the right north-star metrics was critical for our IMO effort, and we did it with #IMOBench, a suite of advanced reasoning benchmarks for foundation models. More importantly, we encourage the community to go beyond short answers, and we show that automatic grading of long-form answers is promising! See the thread for our project page, paper, and datasets 🙂
Very excited to share that an advanced version of Gemini Deep Think is the first to have achieved gold-medal level at the International Mathematical Olympiad 🏆, solving five out of six problems perfectly, as verified by the IMO organizers! It’s been a wild run to lead this effort, and I am grateful to everyone on the team for such an amazing achievement! Blog post in the thread and more to share soon!