I departed Google DeepMind after 8 years. So many fond memories—from early foundational papers in Google Brain (w/ @noamshazeer @ashvaswani @lukaszkaiser on Image Transformer, Tensor2Tensor, Mesh TensorFlow), to leading Gemini posttraining evals to catch up & launch in 100 days, then leading the team to leapfrog to LMArena #1 (and stay there for over a year!), and finally working on the incredible reasoning innovations for Gemini's IMO & ICPC gold medals (w/ @HengTze @quocleix).
Gemini has been a wild journey from one paradigm to another: first, revamping our LaMDA model (the first instruction-like chatbot!) from an actual chatbot into one giving long, contentful responses with RLHF; then, reasoning and deep thinking by training over long thinking chains, novel environments, and reward heads. When we first started, public sentiment was bad. Everyone thought Google was doomed to fail due to its search legacy and organizational politics. Now, Gemini is consistently #1 in user preference and spearheading new scientific accomplishments, and everyone thinks Google winning is obvious. 😂 (It also used to be the case that OpenAI would jump the AI news cycle by announcing something from its backlog of ideas ahead of every new Google release; safe to say that backlog is empty.)
I have since joined xAI. The recipe is well-known. Compute, data, and O(100) brilliant, hard-working people are all that's needed to obtain a frontier-level LLM. xAI *really* believes in this. For compute, even at Google I never experienced this number of chips per capita (& 100K+ GB200/300s are incoming with Colossus 2). For data, Grok 4 made the biggest bet on scaling RL & posttraining, and xAI is making new bets to scale data, deep thinking, and the training recipe. And the team is quick. No company has gotten to where xAI is today in AI capabilities in as little time. As
@elonmusk says, a company’s first- and second-order derivatives are the most important: xAI’s acceleration is the highest.
I'm excited to announce that in my first few weeks, we launched Grok 4 Fast. Grok 4 is an amazing reasoning model, still top on ARC-AGI and new benchmarks like FinSearchComp. But it's slow and was never really targeted at general-purpose user needs. Grok 4 Fast is the best mini-class model—on LMArena, it is #8 (Gemini 2.5 Flash is #18!), and on core reasoning evals like AIME, it is on par with Grok 4 while 15x cheaper. S/o to
@LiTianleli @jinyilll @ag_i_2211
@s_tworkowski @keirp1 @yuhu_ai_