Research at @OpenAI; Reinforcement Learning; PhD from UT Austin. Previously FAIR Paris @AIatMeta, @CMU_Robotics @NVIDIAAI @UberATG.

San Francisco, CA
Joined July 2018
Check out GPT-5. I started around two months ago and was fortunate to get to contribute to something so fun!
GPT-5 is here. Rolling out to everyone starting today. openai.com/gpt-5/
One of the many ideas we reinvented and revived from RL; this one is about policy distillation in LLM land.
Hot take: DAgger (Ross et al., 2011) should be the first paper you read to get into RL, instead of Sutton's book. Maybe also read scheduled sampling (Bengio et al., 2015). And before RL, study supervised learning thoroughly.
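The DAgger recipe the tweet points to fits in a few lines: roll out the current learner, query the expert for action labels on the states the learner actually visits, aggregate those labels into one dataset, and refit. A minimal toy sketch (the 1-D environment, expert, and linear learner below are hypothetical stand-ins for illustration, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_action(state):
    # Hypothetical expert: always push the state toward zero.
    return -np.sign(state)

def rollout(policy_w, steps=20):
    # Collect the states visited by the *learner's* own policy.
    s, states = 1.0, []
    for _ in range(steps):
        states.append(s)
        a = np.sign(policy_w * s) if policy_w != 0 else 0.0
        s = s + 0.1 * a + 0.01 * rng.standard_normal()
    return np.array(states)

# DAgger loop: aggregate expert labels on learner-visited states, refit.
dataset_s, dataset_a = [], []
w = 0.0  # learner policy: action = sign(w * s)
for it in range(5):
    states = rollout(w)
    dataset_s.extend(states)
    dataset_a.extend(expert_action(np.array(states)))
    # Refit w by least squares on the whole aggregated dataset.
    S, A = np.array(dataset_s), np.array(dataset_a)
    w = float(S @ A / (S @ S + 1e-8))

print(w < 0)  # prints True: the learner learns to push toward zero
```

The key contrast with plain behavior cloning is that labels are gathered on the learner's own state distribution, which is what fixes the compounding-error problem the tweet alludes to.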
I am in wait-and-watch mode on how good this is.
NEO: The Home Robot. Order Today.
Absolutely insane; these are some amazing people
Meta has gone full Squid Game! Many new PhD new grads were deactivated today (I am also impacted 🥲, happy to chat).
Harshit Sikchi (will be at NeurIPS 25) retweeted
Update: Mehtaab and I pushed further on this. Using thousands of GPT5 queries, we found solutions to 10 Erdős problems that were listed as open: 223, 339, 494, 515, 621, 822, 883 (part 2/2), 903, 1043, 1079. Additionally, for 11 other problems, GPT5 found significant partial progress that we added to the official website: 32, 167, 188, 750, 788, 811, 827, 829, 1017, 1011, 1041. For 827, Erdős's original paper actually contained an error, and the work of Martínez and Roldán-Pensado explains this and fixes the argument. The future of scientific research is going to be fun.
gpt5-pro is superhuman at literature search: it just solved Erdos Problem #339 (listed as open in the official database erdosproblems.com/forum/thre…) by realizing that it had actually been solved 20 years ago h/t @MarkSellke for pointing this out to me!
Harshit Sikchi (will be at NeurIPS 25) retweeted
🤖 Robots rarely see the true world's state—they operate on partial, noisy visual observations. How should we design algorithms under this partial observability? Should we decide (end-to-end RL) or distill (from a privileged expert)? We study this trade-off in locomotion. 🧵(1/n)
Even with cool ideas, researchers often overlook how important implementation details can be. Getting these details right can be key to scaling up deep RL.
(1/n) With over 1,300 citations, MBPO is often cited as proof that model-based RL beats model-free methods. In arxiv.org/pdf/2412.14312 we showed it often completely fails in DeepMind Control. In our new work, Fixing That Free Lunch (FTFL), we explain why and make it succeed.
SF really does summer in October.
Harshit Sikchi (will be at NeurIPS 25) retweeted
We're finally out of stealth: percepta.ai We're a research / engineering team working together in industries like health and logistics to ship ML tools that drastically improve productivity. If you're interested in ML and RL work that matters, take a look 😀
Harshit Sikchi (will be at NeurIPS 25) retweeted
Yet more evidence that a pretty major shift is happening, this time by Scott Aaronson scottaaronson.blog/?p=9183&f…
Harshit Sikchi (will be at NeurIPS 25) retweeted
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
Today we’re introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. openai.com/index/gdpval-v0
RLZero will be presented at @NeurIPSConf 2025. Learn more about the work in the thread below:
🤖 Introducing RL Zero 🤖: a new approach to transform language into behavior zero-shot for embodied agents, without labeled datasets! RL Zero enables prompt-to-policy generation, and we believe this unlocks new capabilities: scaling up language-conditioned RL, providing an interpretable link between RL agents and humans, and achieving true cross-embodiment transfer.
In the current world of potentially contaminated datasets, competitions are a good way to test generalizable capability, and we are making steady progress!
1/n I’m really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the 2025 ICPC World Finals, the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have placed it first among all human participants. 🥇🥇
Harshit Sikchi (will be at NeurIPS 25) retweeted
[1/4] 🚀 We’re excited to announce the v1 release of JaxAHT – a new library for Ad Hoc Teamwork (AHT) research, built with JAX for speed & scalability! Check it out 👉 larg.github.io/jax-aht #AI #MARL #ReinforcementLearning #JAX #AdHocTeamwork
Harshit Sikchi (will be at NeurIPS 25) retweeted
LLMs lose diversity after RL post-training, and this hurts test-time scaling & creativity. Why does this collapse happen, and how can we fix it? Our new work introduces: 🔍 RL as Sampling (analysis) 🗺️ Outcome-based Exploration (intervention) [1/n]
Harshit Sikchi (will be at NeurIPS 25) retweeted
#K2Think (🏔️💭) is now live. We're proud of this model, which punches well above its weight: developed primarily for mathematical reasoning, it has shown itself to be quite versatile. It is a fully deployed reasoning system at k2think.ai, so you can test it for yourself!
Introducing K2 Think - a breakthrough in advanced AI reasoning. Developed by MBZUAI’s Institute of Foundation Models and @G42ai, K2 Think delivers frontier reasoning performance at a fraction of the size of today’s largest systems. Smaller. Smarter. Open to the world. Available now: K2Think.Ai/K2Think #K2Think #AI #OpenSource #MBZUAI #G42 #Innovation
Harshit Sikchi (will be at NeurIPS 25) retweeted
our team at openai is hiring technical staff to build frontier evals for finance. If you're passionate about measuring real-world capabilities, have a love/hate relationship with Excel, or are an ex-banker/ex-investor with technical skills, please reach out! openai.com/careers/research-…
Harshit Sikchi (will be at NeurIPS 25) retweeted
Claim: gpt-5-pro can prove new, interesting mathematics. Proof: I took a convex optimization paper with a clean open problem in it and asked gpt-5-pro to work on it. It proved a better bound than the one in the paper, and I checked the proof; it is correct. Details below.
It has been a good conference, @RL_Conference; below: the @RLBRew_RLC social, the Edmonton flame, and a great talk. Conference detox needed now.