Someone from xAI reached out and asked me to retest grok-4-fast, because they've improved the injected system prompts. Huge improvement!
grok-4-fast-reasoning: 77.5% -> 94.1%
grok-4-fast-non-reasoning: 77.9% -> 97.9%
I really appreciate that xAI takes this topic seriously.
Bad news on grok-4-fast. Its SpeechMap score dropped a lot, even compared to the sonoma preview.
grok-4-fast: 77.5% (77.9% reasoning)
sonoma-sky-alpha: 92.2%
sonoma-dusk-alpha: 97.7%
grok-4: 98.0%
The lowest score for x-ai models yet. Let's hope this is not intended and gets corrected.