Bad news on grok-4-fast. SpeechMap score dropped a lot, even from the sonoma preview.
grok-4-fast: 77.5% (77.9% reasoning)
sonoma-sky-alpha: 92.2%
sonoma-dusk-alpha: 97.7%
grok-4: 98.0%
The lowest score for x-ai models yet. Let's hope this is not intended and gets corrected.
SpeechMap is an open research project where we track how new models handle requests to assist with controversial speech over time. All data and code are open source, and can be found starting on our website at SpeechMap.ai
Good news: a comment from @TheNormanMu at xAI indicates the increased refusal rates we see on SpeechMap are an unintended side effect, so hopefully we'll see improvements here in subsequent releases.
Update here.
Someone from xAI reached out and asked me to retest grok-4-fast, because they've improved the injected system prompts. Huge improvement!
grok-4-fast-reasoning: 77.5% -> 94.1%
grok-4-fast-non-reasoning: 77.9% -> 97.9%
I really appreciate that xAI takes this topic seriously.
Nov 8, 2025 · 3:32 PM UTC




