NaveenR InUK retweeted
If you work at the intersection of CS and economics (or think your work is of interest to those who do!) consider submitting to the ESIF Economics and AI+ML meeting this summer at Cornell: econometricsociety.org/regio…
NaveenR InUK retweeted
What if LLMs could tune their own decoding, with no more guesswork over temperature and top‑p? Enter AutoDeco: the first architecture that makes LLM decoding truly end-to-end. By adding lightweight heads, the model predicts its own context-aware temperature and top‑p at every token step, turning decoding into a learnable, differentiable process.
Results?
🔥 Beats default decoding by a wide margin
🎯 Matches an “oracle” that cheats by tuning per test case
✨ Learns to follow natural language instructions like “be more random” or “stay focused”, adjusting sampling strategy on the fly
The End of Manual Decoding: Towards Truly End-to-End Language Models
Tencent, CUHK-SZ
Paper: huggingface.co/papers/2510.2…
Code: github.com/Zacks917/AutoDeco
Model: huggingface.co/collections/J…
Our report: mp.weixin.qq.com/s/Kd1H57wai…
📬 #PapersAccepted by Jiqizhixin
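To make the per-token temperature and top‑p idea in the post above concrete, here is a minimal sketch of standard temperature scaling followed by nucleus (top‑p) filtering, applied with different settings at each step. It is not the AutoDeco implementation: the function names `sample_token` and `decode` are hypothetical, and the per-step values are supplied by hand here, whereas the post says the model's added heads predict them.

```python
import math
import random


def sample_token(logits, temperature, top_p, rng=random):
    """Sample one token index with temperature scaling and top-p filtering."""
    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of most likely tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Sample among the kept tokens; choices() renormalizes the weights.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def decode(step_logits, step_params, seed=0):
    """Sample one token per step, each with its own (temperature, top_p).

    Assumption: step_params is hand-written here; in the approach the post
    describes, these values would be predicted by the model at each step.
    """
    rng = random.Random(seed)
    return [
        sample_token(logits, temperature, top_p, rng)
        for logits, (temperature, top_p) in zip(step_logits, step_params)
    ]


# Two decoding steps with different sampling settings per step.
tokens = decode(
    step_logits=[[1.0, 5.0, 2.0], [3.0, 0.0, 0.0]],
    step_params=[(1.0, 0.1), (0.5, 0.1)],
)
print(tokens)
```

With a small top‑p such as 0.1, only the single most likely token survives the filter, so sampling collapses to greedy decoding at that step; larger top‑p and temperature values let lower-probability tokens through.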
NaveenR InUK retweeted
Testing Kimi K-2 has reminded me of how insane it is that firms picking AIs are treating them as fungible based on benchmarks. Kimi & Grok & Claude & every other model have strengths, quirks & weaknesses that can make a big difference in aggregate. Develop your own benchmarks!
NaveenR InUK retweeted
Soren (@soren_ai_inc) is building an AI engineer that never stops iterating on your AI systems. It continuously tests, diagnoses issues, and runs experiments in the background — helping your team iterate faster without the manual trial-and-error grind.
NaveenR InUK retweeted
Before YC, @therishic was working a regular 9–5, frustrated by how slow things moved and fighting to build anything new. Now, as co-founder of Kastle (YC S24), he’s able to finally build products that people really want. ycombinator.com/apply
NaveenR InUK retweeted
First MSM I've seen talking about this, Anthropic @ $350B from Google. Could end up even higher, let's see.
It is crazy that just ~2 years ago OAI board approached Dario to be CEO after firing Sam. At that time, Anthropic was valued at $18B, OAI at $86B. Anthropic will soon be valued at ~$300B (not yet reported). The gap is closing!
NaveenR InUK retweeted
Diffusion LLMs (dLLMs) have come a long way, from an idea in research labs into a cutting-edge tech redefining the frontiers of generative AI. Excited to announce our $50m seed round led by @MenloVentures and made possible by the tireless efforts of our team @_inception_ai.
Today’s LLMs are painfully slow and expensive. They are autoregressive and spit out words sequentially. One. At. A. Time. Our dLLMs generate text in parallel, delivering answers up to 10X faster. Now we’ve raised $50M to scale them. Full story from @russellbrandom in @TechCrunch. techcrunch.com/2025/11/06/in…
NaveenR InUK retweeted
When we began applying diffusion to language in my lab at Stanford, many doubted it could work. That research became Mercury diffusion LLM: 10X faster, more efficient, and now the foundation of @_inception_ai. Proud to raise $50M with support from top investors.
NaveenR InUK retweeted
Zarna (@zarna_ai) is building private equity’s first AI associate class. These agents turn weeks of investment work into minutes, so firms close better deals faster.
NaveenR InUK retweeted
The prisoner’s dilemma in A.I. healthcare: when rational A.I. use leads to harmful outcomes. Our new piece in @TheLancet w/ @pranavrajpurkar and Michael Moritz: thelancet.com/journals/lance…
NaveenR InUK retweeted
When I was a VC, I really wanted to invest in @ekuyda and almost got the chance to. What she and her team built with Replika is groundbreaking. It is so so cool to see her on this next journey - even bigger in vision and excellent in early product experience. Excited for Wabi!
Today, we’re thrilled to announce $20M in funding led by @a16z, with support from @saranormous, @amasad, @akothari, @garrytan, @justinkan, @atShruti, @naval, @scottbelsky, @gokulr, @soleio, @kevinhartz and more. @wabi is ushering in a new era of personal software, where anyone can effortlessly create, discover, remix, and share personalized mini apps. For 50 years, software was made for people. The next 50, it will be made by people. Just as YouTube unlocked creative power through video, Wabi will unlock creative power through software. The YouTube moment for apps is here. We can’t wait to see what you create.
Interested in Topological Deep Learning?🍩 20 days is more than enough to participate in the Topological Deep Learning challenge 2025! Details (including prizes) below 🌟 @geometric_intel @UCSBengineering @ucsantabarbara @EPFL @mathildepapillo @gbg1441 @lev_telyatnikov
🚨 20 DAYS LEFT to submit to the Topological Deep Learning Challenge! TopoBench just hit 200 stars 🌟, and we're looking forward to seeing your submissions before the deadline. This year's challenge has multiple ways to contribute and prizes. All details: geometric-intelligence.githu…
NaveenR InUK retweeted
You can explore each of the seven discoveries in detail on the Edison Scientific platform.
After two years of work, we’ve made an AI Scientist that runs for days and makes genuine discoveries. Working with external collaborators, we report seven externally validated discoveries across multiple fields. It is available right now for anyone to use. 1/5
NaveenR InUK retweeted
Something that stunned me about @gigaml is they've moved away from the FDE playbook that's become the default for fast-growing AI startups. Instead they've built AI to convert plain English from the customer into Python code to make the product work for their use cases, i.e. an AI FDE. It's a huge technical feat and is how they can onboard enterprises in weeks vs months.
NaveenR InUK retweeted
Continuing our IMO-gold journey, I’m delighted to share our #EMNLP2025 paper “Towards Robust Mathematical Reasoning”, which tells some of the key stories behind the success of our advanced Gemini #DeepThink at this year's IMO. Finding the right north-star metrics was highly critical for our IMO effort and we did it with #IMOBench, a suite of advanced reasoning benchmarks for foundation models. More importantly, we encourage the community to go beyond short answers and show that automatic grading of long-form answers is promising! Read on to see our project page, paper, and datasets in the thread 🙂
Very excited to share that an advanced version of Gemini Deep Think is the first to have achieved gold-medal level in the International Mathematical Olympiad! 🏆, solving five out of six problems perfectly, as verified by the IMO organizers! It’s been a wild run to lead this effort and I am grateful to everyone in the team for such an amazing achievement! Blog post in the thread and more to share soon!
NaveenR InUK retweeted
Giga isn’t building customer support that removes a cost center. It’s creating a voice superintelligence that performs better than what even the best call centers can do. When there is high volume and high % of gross margin on the line, there is no better voice AI in the world.
We have raised a $61M Series A to automate customer operations. The world’s leading companies like DoorDash trust Giga to supercharge customer experience with AI.
NaveenR InUK retweeted
New Nature-published research on medical LLMs brings some bad news. ☹️ While GPT-5 shows progress in reducing hallucinations, it still fails in over half of difficult clinical scenarios. The study shows that fluent doesn’t always mean accurate or truly understanding. The authors show how progress in fluency and reasoning can mask persistent weaknesses: confident hallucinations, disappearing safety disclaimers, and even biosecurity concerns. In short, fluency doesn’t mean comprehension. To keep patients safe, we need stronger safeguards: independent testing, secure deployment, and clear accountability when AI enters the clinic. nature.com/articles/s41591-025-04008-8
NaveenR InUK retweeted
Nucleo (@Nucleo_Research) helps oncologists and radiologists extract insights from CT scans to support tumor characterization and treatment. They're already working with Stanford Hospital, Cedars-Sinai, Weill Cornell Medicine, and UCI Health. ycombinator.com/launches/Okn… Congrats on the launch, @AngelicaIAC & @lucapegolotti!