Over 1 million tokens per second on production AI infrastructure.
Microsoft Azure achieved 1,100,948 tokens/sec on ND GB300 v6 racks powered by NVIDIA GB300 NVL72, validated by Signal65.
This benchmark highlights how enterprise AI can deliver record throughput with ~2.5× better power efficiency, combining high performance, operational efficiency, and governance-ready scale.
1.1M tokens/sec on just one rack of GB300 GPUs in our Azure fleet.
An industry record made possible by our longstanding co-innovation with NVIDIA and our expertise in running AI at production scale!
techcommunity.microsoft.com/…
Nov 4, 2025 · 6:13 PM UTC