Cerebras beats the Nvidia H100, but can it beat Blackwell? Blackwell inference endpoints are finally out, and they're fast: GPT-OSS-120B at ~700 tokens/s, leapfrogging H100 and Groq. Cerebras clocked in at 3,000 TPS - still #1. Looking forward to Rubin!

Nov 6, 2025 · 11:01 PM UTC
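For scale, a quick back-of-the-envelope sketch of what those throughput numbers mean in wall-clock time. The TPS figures come from the post; the 1,000-token completion length is an assumption for illustration, and queueing and time-to-first-token are ignored.

```python
# What the headline TPS figures mean for a single response.
# TPS values come from the post above; the completion length is
# an assumed placeholder, not a benchmark parameter.

ENDPOINTS_TPS = {
    "Cerebras": 3000,   # tokens/s, from the post
    "Blackwell": 700,   # tokens/s, from the post
}

COMPLETION_TOKENS = 1000  # assumed response length

for name, tps in ENDPOINTS_TPS.items():
    seconds = COMPLETION_TOKENS / tps
    print(f"{name}: {COMPLETION_TOKENS} tokens / {tps} tok/s = {seconds:.2f} s")

# Cerebras: 1000 tokens / 3000 tok/s = 0.33 s
# Blackwell: 1000 tokens / 700 tok/s = 1.43 s
```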

Replying to @cerebras
Is this true on a watt-for-watt basis?
pricing reflects TCO - look at the pricing
Replying to @cerebras
Lies - you have a chip shaped like a dinner plate, so that's your ceiling. Go up against NVLink and get smacked - you don't have NVLink, and without it you can't even think about scaling your solution.
NVLink is measured in terabytes/s; Cerebras fabric is measured in petabytes/s.
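To make that unit gap concrete, a rough sketch: the bandwidth and payload numbers below are placeholder assumptions, not published specs for either product.

```python
# Rough feel for terabytes/s vs petabytes/s. All numbers here are
# illustrative assumptions, not vendor specs.

TB = 1e12  # bytes
PB = 1e15  # bytes

link_bw   = 1.8 * TB  # assumed NVLink-class link, bytes/s
fabric_bw = 1.0 * PB  # assumed wafer-scale fabric, bytes/s

payload = 240e9  # e.g. ~120B params at 2 bytes/param (assumption)

print(f"over a TB/s link:   {payload / link_bw * 1e3:.2f} ms")
print(f"over a PB/s fabric: {payload / fabric_bw * 1e3:.2f} ms")
# over a TB/s link:   133.33 ms
# over a PB/s fabric: 0.24 ms
```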
Replying to @cerebras
@grok why is Cerebras faster? What is wafer scale, and do Nvidia or AMD not produce it yet?
Replying to @cerebras
And the next wafers are already coming. Oh, 2026 is gonna be lit. Cerebras should be the multi-trillion-dollar company. K2 Think, GLM4.6 and Cognition SWE 1.5, all at record speeds and all full non-quant models. Incredible.
Replying to @cerebras
Yeah, but Nvidia can support more models. You guys are limited to what? Also, yours can't be used to train models. Apples to oranges.
Replying to @cerebras
this would be the perfect time to serve K2 Thinking, Cerebras
Replying to @cerebras
3,000 TPS + 37 seconds of queue latency per API call
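Taking this commenter's numbers at face value, a small sketch of why queue time dominates; the completion length is an assumed placeholder.

```python
# Generation TPS vs end-to-end throughput once queue time is included.
# The 37 s queue figure is the commenter's claim; the completion
# length is an assumed placeholder.

tps = 3000       # generation speed, tokens/s
queue_s = 37.0   # claimed queue latency per API call
tokens = 1000    # assumed completion length

gen_s = tokens / tps
total_s = queue_s + gen_s
effective_tps = tokens / total_s

print(f"generation: {gen_s:.2f} s, end to end: {total_s:.2f} s")
print(f"effective throughput: {effective_tps:.0f} tok/s")
# generation: 0.33 s, end to end: 37.33 s
# effective throughput: 27 tok/s
```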
Replying to @cerebras
Cerebras and Nvidia are racing like Formula 1 cars on silicon tracks. Blackwell’s fast, but Cerebras is still leading with 3000 TPS, basically lapping the field. The real question isn’t who’s faster; it’s who runs longer without burning a power station. Rubin might just bring the afterburner.
Replying to @cerebras
can’t wait to cut my wafer so i can have cerebras at home
Replying to @cerebras
Why aren't you offering cloud GPU access at an hourly rate? Maybe open-source an SDK for porting models to your hardware. We'd like to host our custom models on your silicon.
Replying to @cerebras
Minimax M2, Qwen3-Omni, maybe even Kimi K2 Thinking - this is the interesting stuff, with practical agentic use-cases and lots of thinking, which would really be nice to have at fast speeds. K2 in particular is slow as hell at the moment.
Replying to @cerebras
@grok whats rubin
Replying to @cerebras
Don’t they need to be 100x faster to make sense?
Replying to @cerebras
When Kimi K2 Thinking?? Really looking forward to it. I need speeeeed.
Replying to @cerebras
wen k2 thinking ser
Replying to @cerebras
Are we going to have Kimi K2 Thinking on Cerebras with interleaved thinking support soon?
Replying to @cerebras
Unmatched speed
Replying to @cerebras
those speed numbers are wild, progress is moving so fast these days
Replying to @cerebras
broo this is crazy
Replying to @cerebras
3000 tps is straight disgusting
Replying to @cerebras
@grok how fast is the time to first token in ms?
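One way to answer this yourself is to time the first streamed chunk. A minimal sketch, assuming an OpenAI-compatible streaming endpoint; the base URL, API key, and model id below are placeholders.

```python
# Measure time-to-first-token against any OpenAI-compatible streaming
# endpoint. Base URL, key, and model id below are placeholders.
import time
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_KEY")

start = time.perf_counter()
stream = client.chat.completions.create(
    model="gpt-oss-120b",  # placeholder model id
    messages=[{"role": "user", "content": "Say hi"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(f"time to first token: {(time.perf_counter() - start) * 1e3:.0f} ms")
        break
```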
Replying to @cerebras
What is the comparison here? One chip to one chip, or one server to one server rack? We can fit more GB200s in one server compared to the WSE-3.
Replying to @cerebras
@grok can you estimate pricing for Nvidia vs Cerebras? That is, to get the same 2,995 tokens per second, would it be cheaper to go with Cerebras or Nvidia? Also, what's the performance per watt?
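The formulas behind this question are simple even if the inputs are contested. In the sketch below, every price, throughput, and power number is a hypothetical placeholder, since real figures depend on the deployment; only the arithmetic is the point.

```python
# Cost per million tokens and energy efficiency, given rental price,
# throughput, and power draw. All inputs are hypothetical placeholders.

def dollars_per_mtok(price_per_hour: float, tps: float) -> float:
    return price_per_hour / (tps * 3600) * 1e6

def tokens_per_joule(tps: float, watts: float) -> float:
    return tps / watts

systems = {  # hypothetical systems with made-up numbers
    "System A": {"price_per_hour": 4.0, "tps": 3000, "watts": 23000},
    "System B": {"price_per_hour": 3.0, "tps": 700,  "watts": 1200},
}

for name, s in systems.items():
    print(f"{name}: ${dollars_per_mtok(s['price_per_hour'], s['tps']):.2f}/Mtok, "
          f"{tokens_per_joule(s['tps'], s['watts']):.3f} tok/J")
# System A: $0.37/Mtok, 0.130 tok/J
# System B: $1.19/Mtok, 0.583 tok/J
```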
Replying to @cerebras
@grok what performance metric is this chart showing? Is it relevant to training or inference performance on current SOTA LLM models? What are the biases in this data?
Replying to @cerebras
Time for an IPO?
Replying to @cerebras
3,000 TPS is a great benchmark, but the real question is how many software engineers you need to keep it fed. Hardware is easy; the moat is the compiler and the ecosystem tax.
Replying to @cerebras
Minimax M2 🙏
Replying to @cerebras
What were your sales last quarter?