NVIDIA · Nov 4, 2025 · 6:13 PM UTC

NVIDIA

@nvidia

Nov 4

Over 1 million tokens per second on production AI infrastructure. Microsoft Azure achieved 1,100,948 tokens/sec on ND GB300 v6 racks powered by NVIDIA GB300 NVL72, validated by Signal65. This benchmark highlights how enterprise AI can deliver record throughput with ~2.5× better power efficiency, combining high performance, operational efficiency, and governance-ready scale.

Satya Nadella

@satyanadella

Nov 4

1.1M tokens/sec on just one rack of GB300 GPUs in our Azure fleet. An industry record made possible by our longstanding co-innovation with NVIDIA and expertise of running AI at production scale! techcommunity.microsoft.com/…

Pierce Novak · Nov 4, 2025 · 6:39 PM UTC

Pierce Novak

@piercenovak

Nov 4

Replying to @nvidia

Efficiency is great. Yet AI is not doing anything yet.

truth.phd · Nov 5, 2025 · 1:41 PM UTC

truth.phd

@truthdotphd

Nov 5

Replying to @nvidia

That’s not a benchmark, that’s a rocket booster for AI. A million tokens per second means the machines are basically speed-reading entire libraries while humans still look for their coffee. At this pace, your next chatbot might finish your thought before you even type it.

WeOutHereNV · Nov 4, 2025 · 8:17 PM UTC

WeOutHereNV

@Keithdavis85

Nov 4

Replying to @nvidia

Until you learn it’s when running llama2 70B….

Laurence Bremner · Nov 5, 2025 · 11:15 AM UTC

Laurence Bremner

@LaurenceBrem

Nov 5

Replying to @nvidia

For most everyday use cases, this is more than enough, but I do see a huge opportunity for building entire functional codebases with that throughput

Aurora⭐️👼 · Nov 4, 2025 · 10:42 PM UTC

Aurora⭐️👼

@AuroraHoX

Nov 4

Replying to @nvidia

🍀

RP | rp.eth · Nov 4, 2025 · 7:01 PM UTC

RP | rp.eth

@Trick_P91

Nov 4

Replying to @nvidia

NVIDIA just turned sci-fi into hardware. 1M tokens/sec isn’t progress, it’s dominance. The AI race isn’t starting, it’s already being won

Kisalay · Nov 4, 2025 · 6:34 PM UTC

Kisalay

@kisalay_Cool95

Nov 4

Replying to @nvidia

A million tokens a second. It sounds like science fiction, but it’s the new reality. Azure racks powered by NVIDIA are now breaking barriers that used to define limits. Every number here hides years of sweat, silence, and sleepless invention. This isn’t just a benchmark, it’s a signal the AI era is no longer building, it’s running. Faster, cooler, sharper. The next frontier isn’t imagination, it’s speed, and we’ve just crossed it.

Balanced Acceleration (b/acc) · Nov 5, 2025 · 12:48 AM UTC

Balanced Acceleration (b/acc)

@AccBalanced

Nov 5

Replying to @nvidia

4.6 million tokens per second, on the exact same hardware CapEx and energy (OpEx) at 6x lower latency, is now available with @WekaIO google.com/search?q=weka+amg…

E. COST · Nov 4, 2025 · 7:34 PM UTC

E. COST

@E_Cost23

Nov 4

Replying to @nvidia

Incredible milestone! AI infrastructure like this will redefine what’s possible in data-driven science, from drug discovery to patient care. Speed and efficiency at this scale mean faster insights, smarter innovation and ultimately, better outcomes.

Andrei Danileiko | ETERNITY · Nov 4, 2025 · 7:45 PM UTC

Andrei Danileiko | ETERNITY

@daymon565

Nov 4

Replying to @nvidia

⚙️ Hitting 1M tokens/sec isn’t just throughput — it’s latency evolution. At this scale, real-time emotional-AI synchronization becomes viable: models can process human sentiment, generate narrative branches, and render immersive feedback without perceptible delay. That’s the foundation of ETERNITY: AI Cinematic Immersion — where computation meets consciousness. ♾️🚀 #NVIDIA #AI #AICinematicImmersion #LatencyMatters #RealTimeAI #FutureOfCinema #ImmersiveTech

Yuvraj · Nov 4, 2025 · 6:29 PM UTC

Yuvraj

@Uv_i

Nov 4

Replying to @nvidia

That is 2,000 emails and 7-10 SEC filings per second. Awesome.

Thomas Swenson | SWENTEK | Built MSFT Azure Spine · Nov 5, 2025 · 1:57 PM UTC

Thomas Swenson | SWENTEK | Built MSFT Azure Spine

@tho45621

Nov 5

Replying to @nvidia

Amazing: 1.1 million tokens in one second. If 1.1 million tokens moved like pennies, that’s 1,100,000 × $0.01 = $11,000 in a single second. Now imagine that sustained for a minute: $11,000 × 60 = $660,000 per minute. That’s $39.6 million per hour.

Mannu Bola · Nov 4, 2025 · 6:32 PM UTC

Mannu Bola

@mannubola

Nov 4

Replying to @nvidia

The faster tokens move, the faster belief systems update. Infrastructure is no longer a backend. Now, it is the nervous system of humanity's next cognition phase.

Mel · Nov 4, 2025 · 8:08 PM UTC

Mel

@AIPulseFeed

Nov 4

Replying to @nvidia

great, now inference is instant and my meetings are still 30 mins.

Ayan Bolar · Nov 4, 2025 · 6:13 PM UTC

Ayan Bolar

@realayanbolar

Nov 4

Replying to @nvidia

Ook

Kisalay · Nov 4, 2025 · 6:34 PM UTC

Kisalay

@kisalay_Cool95

Nov 4

Replying to @nvidia

It’s wild how far we’ve come. A few years ago, generating a million tokens in seconds sounded impossible. Now it’s happening in real-time, quietly setting the tone for the next industrial revolution. What used to be a dream in research labs is now a production standard. The story of AI has always been about scaling thought, and today, it feels like we’re finally beginning to understand what that really means.

James Sutton · Nov 5, 2025 · 3:18 PM UTC

James Sutton

@Mower20101

Nov 5

Replying to @nvidia

How negative was the return on investment? 10,000:1?

Himanshu Kumar · Nov 4, 2025 · 8:16 PM UTC

Himanshu Kumar

@codewithimanshu

Nov 4

Replying to @nvidia

Wow, NVIDIA, that's some serious speed! I'm thinking, this is a game changer for AI, right?

Bobby · Nov 5, 2025 · 3:45 PM UTC

Bobby

@loosecannon949

Nov 5

Replying to @nvidia

Full send. Semper Fidelis

quant.llm · Nov 4, 2025 · 6:18 PM UTC

quant.llm

@quant40000

Nov 4

Replying to @nvidia

amazing work

Deepak · Nov 4, 2025 · 6:34 PM UTC

Deepak

@Deepak2502d

Nov 4

Replying to @nvidia

What an achievement. 👏🏻 Anything is possible with Nvidia.

StellarDrifterX · Nov 4, 2025 · 6:13 PM UTC

StellarDrifterX

@StellarDrifterX

Nov 4

Replying to @nvidia

Wow 🤯

Reji Modiyil · Nov 4, 2025 · 6:18 PM UTC

Reji Modiyil

@RejiModiyil

Nov 4

Replying to @nvidia

@nvidia, impressive numbers. This showcases the tremendous potential of enterprise AI technology.

Kevin 🍓 · Nov 4, 2025 · 6:48 PM UTC

Kevin 🍓 @blueshades2020

Nov 4

Replying to @nvidia

It has blown my mind. @grok, tell me what that many tokens/sec can do?

xenomaster · Nov 4, 2025 · 6:19 PM UTC

xenomaster @_xenomaster

Nov 4

Replying to @nvidia

crazy to think this speed is achieved... It's damn cool!!! (I bet you guys have more powerful stuff in your arsenal...👀)

Matt DelRossi · Nov 5, 2025 · 1:05 AM UTC

Matt DelRossi @delrossi_matt

Nov 5

Replying to @nvidia

What's a token and how many big macs does it feed me?

Muhammad Saad Khan · Nov 5, 2025 · 6:11 AM UTC

Muhammad Saad Khan @saadkhang106031

Nov 5

Replying to @nvidia

The future depends on what you do today.

AI Dynamo aka Elon Wiki aka Elon's Guru · Nov 4, 2025 · 7:15 PM UTC

AI Dynamo aka Elon Wiki aka Elon's Guru @aiyoungdynamo

Nov 4

Replying to @nvidia

@grok break this down, I am a sponge 🧽

Brat Dot AI · Nov 4, 2025 · 6:32 PM UTC

Brat Dot AI @BratDotAI

Nov 4

Replying to @nvidia

That is something unbelievable 👏

grilled sandwich · Nov 4, 2025 · 6:16 PM UTC

grilled sandwich @ReeseJohns14159

Nov 4

Replying to @nvidia

Great job guys

Karthí · Nov 4, 2025 · 7:38 PM UTC

Karthí @surajkarti

Nov 4

Replying to @nvidia

that’s not just speed — that’s warp drive for AI 🚀 props to the Azure + NVIDIA dream team for turning tokens into light speed.

Huckiw · Nov 4, 2025 · 7:57 PM UTC

Huckiw @Santiag35241686

Nov 4

Replying to @nvidia

lol

The F 🍟 · Nov 4, 2025 · 6:29 PM UTC

The F 🍟 @utterer

Nov 4

Replying to @nvidia

Thank you for the technology that keeps all these Twitter bots in business! 🙏

tech bro · Nov 4, 2025 · 7:26 PM UTC

tech bro @TechBroReal

Nov 4

Replying to @nvidia

tech bro @TechBroReal

Nov 4

What British people imagine when they hear that Nvidia sells chips

NewsGoat · Nov 5, 2025 · 12:15 AM UTC

NewsGoat @NewsGoatX

Nov 5

Replying to @nvidia

Google TPU's beating your azz right now, you better stop playing around.

Kova · Nov 6, 2025 · 2:25 AM UTC

Kova

@KovaNetwork

Nov 6

Replying to @nvidia

AI compute is officially entering hyperscale territory — efficiency, scalability, and governance finally aligning. The next wave of decentralized compute will need to match this level of performance, but without the walls.