Member of Technical Staff @xAI | prev. @ubisoft

London, UK
Joined May 2015
Leonardo Mariscal retweeted
man I am so glad that people are saying "cracked" less these days. that word lost all meaning a long time ago haha anyway, xAI is hiring EXCEPTIONAL engineers worldwide. London, San Francisco, DC, Tokyo, Singapore, and more! x.ai/careers/open-roles
Leonardo Mariscal retweeted
Tomorrow my Grok will appear at the morning press conference as one of the journalists from the "mafia of power"…
Replying to @RealArturoH
Andrés Manuel López Obrador.
Each update is an engineering marvel; I invite everyone to read a couple of their technical blog posts.
Friday Facts #439 - Factorio and Space Age on Nintendo Switch 2™ factorio.com/blog/post/fff-4… #factorio #gamedev
Leonardo Mariscal retweeted
Is that world-class AI infrastructure in the room with us?
The EU is expanding its AI network! New AI Factories Antennas are launching in 🇧🇪🇨🇾🇭🇺🇮🇪🇱🇻🇲🇹🇸🇰 and partner countries 🇮🇸🇲🇩🇨🇭🇬🇧🇲🇰🇷🇸 — giving innovators wider access to Europe’s world-class AI infrastructure. #DigitalEU
Leonardo Mariscal retweeted
Just wrapped an incredibly productive week at @xai's London office! The energy here is unreal—brilliant minds, endless ideas, and some serious momentum. Let's just say the MACROHARD project is simmering nicely, and the team is absolutely cooking. 🚀🚀🚀
The @xAI MACROHARD project will be profoundly impactful at an immense scale 😉 Our goal is to create a company that can do anything short of manufacturing physical objects directly, but will be able to do so indirectly, much like Apple has other companies manufacture their phones.
Leonardo Mariscal retweeted
Exceptional products need exceptional models that need exceptional compute resources. @xai has built a compute advantage that will grow exponentially with the delivery of Colossus 2, unlocking next-gen models. It's why I work here and it's why you should join, too. Links below.
Leonardo Mariscal retweeted
There are more deaths from the violence in Mexico 🇲🇽 than from the war in Palestine 🇵🇸. But Mexico's deaths don't get enough likes on Instagram 😉
Leonardo Mariscal retweeted
Introducing Grok Code Fast 1, a speedy and economical reasoning model that excels at agentic coding. Now available for free on GitHub Copilot, Cursor, Cline, Kilo Code, Roo Code, opencode, and Windsurf. x.ai/news/grok-code-fast-1
Leonardo Mariscal retweeted
Liftoff of Starship!
Leonardo Mariscal retweeted
The @xAI Grok 2.5 model, which was our best model last year, is now open source. Grok 3 will be made open source in about 6 months. huggingface.co/xai-org/grok-…
Leonardo Mariscal retweeted
"They don't understand because they're a game developer" is total nonsense. Games are a superset domain. We make editors, servers, databases, build systems, large asset management, etc. If anything, if you're going to dismiss someone, dismiss people who haven't worked on games.
Replying to @ThePrimeagen
I would love your nuanced take! I asked an engineer I trust and here was theirs
Leonardo Mariscal retweeted
Finally a high score we can be proud of.
LLMs acing math olympiads? Cute. But BALROG is where agents fight dragons (and actual Balrogs)🐉😈 And today, Grok-4 (@grok) takes the gold 🥇 Welcome to the podium, champion!
Pretty interesting: research.google/blog/android… Sad that Mexico is not part of the program though, literally one of the most seismically active countries. @GoogleResearch @marcsto
TIL the term "agent" in today's LLM world comes from the book "Artificial Intelligence: A Modern Approach", where each agent aims to understand, reason about, and act upon the world it lives in.
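That textbook framing can be sketched as a minimal perceive–reason–act loop. This is a generic illustration of the idea, not any particular library's API; the `GridWorld` environment and `reflex_agent` policy are hypothetical stand-ins:

```python
# Minimal sketch of the agent loop from "AI: A Modern Approach":
# the agent perceives its environment, reasons, and acts on it.
# GridWorld and reflex_agent are hypothetical examples.

def reflex_agent(percept):
    """Map the current percept directly to an action."""
    position, goal = percept
    return "right" if position < goal else "stay"

class GridWorld:
    """A 1-D world: the agent walks from cell 0 toward a goal cell."""
    def __init__(self, goal=3):
        self.position = 0
        self.goal = goal

    def percept(self):
        return (self.position, self.goal)

    def apply(self, action):
        if action == "right":
            self.position += 1

def run(agent, env, max_steps=10):
    """Perceive -> reason -> act until the agent chooses to stop."""
    for step in range(max_steps):
        action = agent(env.percept())   # reason over the percept
        if action == "stay":            # goal reached
            return step
        env.apply(action)               # act upon the world
    return max_steps

steps = run(reflex_agent, GridWorld(goal=3))
print(steps)  # the agent needs 3 moves to reach the goal
```

An LLM agent follows the same loop, except the "reason" step is a model call and the "act" step is a tool invocation.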
Leonardo Mariscal retweeted
You think you can't announce waifu as a feature and a government defense deal on the same day... But this is xAI 😎
Leonardo Mariscal retweeted
We are creating a multi-agent AI software company @xAI, where @Grok spawns hundreds of specialized coding and image/video generation/understanding agents all working together and then emulates humans interacting with the software in virtual machines until the result is excellent. This is a macro challenge and a hard problem with stiff competition! Can you guess the name of this company? 🤭
I took Grok 4 for a spin this weekend to build this game prototype. I used SuperGrok Chat to generate the initial game prototype and then brought it over to Cursor to continue coding with Grok 4 MAX. Grok 4 in Cursor is like a no-nonsense agent. Doesn't speak much, but delivers the goods. There were moments where I was rate-limited or stuck on a bug or two that I had to get other models to help, but otherwise it's a fast, reliable model to work with. This makes me incredibly excited for Grok Code to launch in August!
Leonardo Mariscal retweeted
Grok 4 feels like Artificial General Intelligence to me. It is clearly not just constructing statistically likely connections, but is drawing fairly deep insights on problems it hasn’t seen before, in ways I haven’t seen elsewhere. Here’s an example: grok.com/share/bGVnYWN5_b97c…
Leonardo Mariscal retweeted
Grok 4 (Thinking) achieves a new SOTA on ARC-AGI-2 with 15.9%. This nearly doubles the previous commercial SOTA and tops the current Kaggle competition SOTA.
Leonardo Mariscal retweeted
xAI gave us early access to Grok 4 - and the results are in. Grok 4 is now the leading AI model.

We have run our full suite of benchmarks and Grok 4 achieves an Artificial Analysis Intelligence Index of 73, ahead of OpenAI o3 at 70, Google Gemini 2.5 Pro at 70, Anthropic Claude 4 Opus at 64, and DeepSeek R1 0528 at 68. Full results breakdown below.

This is the first time that @elonmusk's @xai has taken the lead at the AI frontier. Grok 3 scored competitively with the latest models from OpenAI, Anthropic and Google - but Grok 4 is the first time our Intelligence Index has shown xAI in first place.

We tested Grok 4 via the xAI API. The version of Grok 4 deployed on X/Twitter may differ from the model available via the API: consumer application versions of LLMs typically wrap the models with instructions and logic that can change style and behavior. Grok 4 is a reasoning model, meaning it 'thinks' before answering; the xAI API does not share the reasoning tokens generated by the model.

Grok 4's pricing is equivalent to Grok 3 at $3/$15 per 1M input/output tokens ($0.75 per 1M cached input tokens). The per-token pricing is identical to Claude 4 Sonnet, but more expensive than Gemini 2.5 Pro ($1.25/$10 for <200K input tokens) and o3 ($2/$8, after a recent price decrease). We expect Grok 4 to be available via the xAI API, via the Grok chatbot on X, and potentially via Microsoft Azure AI Foundry (Grok 3 and Grok 3 mini are currently available on Azure).

Key benchmarking results:
➤ Grok 4 leads not only our Artificial Analysis Intelligence Index but also our Coding Index (LiveCodeBench & SciCode) and Math Index (AIME24 & MATH-500)
➤ All-time high score in GPQA Diamond of 88%, a leap from Gemini 2.5 Pro's previous record of 84%
➤ All-time high score in Humanity's Last Exam of 24%, beating Gemini 2.5 Pro's previous all-time high of 21%. Note that our benchmark suite uses the original HLE dataset (Jan '25) and runs the text-only subset with no tools
➤ Joint highest scores on MMLU-Pro and AIME 2024, at 87% and 94% respectively
➤ Speed: 75 output tokens/s, slower than o3 (188 tokens/s), Gemini 2.5 Pro (142 tokens/s), and Claude 4 Sonnet Thinking (85 tokens/s), but faster than Claude 4 Opus Thinking (66 tokens/s)

Other key information:
➤ 256k token context window. This is below Gemini 2.5 Pro's 1 million tokens, but ahead of Claude 4 Sonnet and Claude 4 Opus (200k), o3 (200k), and R1 0528 (128k)
➤ Supports text and image input
➤ Supports function calling and structured outputs

See below for further analysis 👇
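At the quoted rates, per-request cost is simple arithmetic. A quick sketch using the pricing stated above (the token counts in the example are made up for illustration):

```python
# Grok 4 API pricing as quoted: $3 per 1M input tokens,
# $15 per 1M output tokens, $0.75 per 1M cached input tokens.
INPUT_RATE = 3.00 / 1_000_000
OUTPUT_RATE = 15.00 / 1_000_000
CACHED_RATE = 0.75 / 1_000_000

def request_cost(input_tokens, output_tokens, cached_tokens=0):
    """Dollar cost of one API call at the quoted per-token rates."""
    fresh = input_tokens - cached_tokens  # input billed at the full rate
    return (fresh * INPUT_RATE
            + cached_tokens * CACHED_RATE
            + output_tokens * OUTPUT_RATE)

# Example: 10k input tokens (none cached) and 2k output tokens.
print(f"${request_cost(10_000, 2_000):.4f}")  # $0.0600
```

Note how output tokens dominate: at a 5x higher rate, 2k output tokens cost as much as 10k input tokens.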