ψ(▼へ▼メ)~ tensor compilers and tensor cores ~(Ψ▼ー▼)∈ no royal road. in open source we trust. please show me the code.

a computer in the cloud
Joined April 2024
Pinned Tweet
good article from a triton contributor and former gpu compiler lead at google, claiming, same as @__tinygrad__, that you can't tape out without having a framework (the ir), because ml deals with incomplete irs. i.e. tpus only succeed because of jax and xla.
j4orz retweeted
[ENG SUB] how it feels to use eager pytorch in 2025
j4orz retweeted
Hi @JeffDean, what’s the plan for releasing the code for this line of work? None of these papers so far seem to have released any code
An exciting new approach to continual learning, using nested optimization to enhance long-context processing.
j4orz retweeted
Replying to @RolandForTexas
You are a taker, not a maker. All you’ve done your whole life is take from the makers of the world. The zero-sum mindset you have is at the root of so much evil. Once you realize that civilization is not zero-sum and that it is about making far more than one consumes, then it becomes obvious that the path to prosperity for all is just let the makers make.

Regarding Tesla, the reality is that I have been given nothing. However, if I lead Tesla to become the most valuable company in the world by far and it stays that way for 5 years, shareholders voted to award me 12% of what is built. Anyone who wants to come along for the ride can buy Tesla stock. If Tesla “merely” becomes a $1.999 trillion company, I get nothing.

This is a great deal for shareholders, which is why they voted so overwhelmingly to approve this, for which I am immensely grateful. And they did so by a margin far greater than the one by which you won your political seat.
updates to j4orz.ai/mlsysapp/. working on the runtime and eager kernels now. picograd is taking longer than other "hobby" autograds i've seen, but our plan is to be the *definitive* resource on building your own pytorch. we agree with @karpathy that course building is a very technical process that requires the pedagogical progression to be just right throughout the entire book, making each step not too trivial and not too challenging. the goal is to be the llm201 course at karpathy's starfleet academy! we are early in our journey. if you are interested in helping out, please come join us in the @GPU_MODE discord under the #singularity-systems work group 🖤
j4orz retweeted
Technological innovation can be a form of participation in the divine act of creation. It carries an ethical and spiritual weight, for every design choice expresses a vision of humanity. The Church therefore calls all builders of #AI to cultivate moral discernment as a fundamental part of their work—to develop systems that reflect justice, solidarity, and a genuine reverence for life.
j4orz retweeted
Diffusion will obviously work on any bitstream. With text, since humans read from first word to last, there is just the question of whether the delay to first sentence for diffusion is worth it. That said, the vast majority of AI workload will be video understanding and generation, so good chance diffusion is the biggest winner overall. Also means that the ratio of compute to memory bandwidth will increase.
j4orz retweeted
Replying to @StefanoErmon
Tesla is using single step diffusion for world model generation x.com/i/grok/share/CCHc5AW6U…

80% of the work is finding the spec
j4orz retweeted
AND Kimi also moves to the Frontier Tier with the release of K2 thinking
A tier list of China's top 19 open model builders. Who did we miss?

At the frontier
* DeepSeek
* Qwen

Close competitors
* Moonshot AI (Kimi)
* Zhipu / Z AI

Noteworthy
* StepFun
* Tencent (Hunyuan)
* RedNote (Xiaohongshu)
* MiniMax
* OpenGVLab / InternLM
* Skywork

On the rise
* ByteDance Seed
* OpenBMB
* Xiaomi (MiMo)
* Baidu (ERNIE)

Honorable Mentions
* Multimodal Art Projection
* Alibaba International Digital Commerce Group
* Beijing Academy of Artificial Intelligence (BAAI)
* inclusionAI
* Pangu (Huawei)

I learned a lot from these. We have so much more we need to do to understand how their AI ecosystem works.
looking at the backward rules for an autodiff has the same beauty as looking at a lisp interpreter. with the chain rule you can describe the world, just as with eval you can simulate any turing-complete language.
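the parallel can be made concrete: a reverse-mode autodiff is a handful of backward rules plus a tape walk, the same way a lisp interpreter is a handful of eval cases plus recursion. a minimal sketch below (names like `Value` are illustrative, not from picograd or any particular framework):

```python
# A minimal reverse-mode autodiff: each op records a backward rule,
# and .backward() walks the tape in reverse applying the chain rule,
# much like a lisp eval dispatching on expression type.
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # backward rule set by the op

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward_rule():           # d(a+b)/da = 1, d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = backward_rule
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward_rule():           # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward_rule
        return out

    def backward(self):
        # Topologically order the tape, then apply each rule once.
        topo, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                topo.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

x = Value(3.0)
y = Value(4.0)
z = x * y + x          # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```

each operator is one backward rule; everything else is the chain rule applied along the tape.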
Datacenters in space is the solar roadways of the 2020s.
j4orz retweeted
Replying to @natolambert
Actually I think it was a pretty eventful Fall so far. E.g., Qwen3-Next, DeepSeek V3.2, GLM 4.6, MiniMax-M2, Kimi Linear
j4orz retweeted
defending today 🥲
j4orz retweeted
If you'd like to win your own Dell Pro Max with GB300, we're launching a new kernel competition with @NVIDIAAI @sestercegroup @Dell to optimize NVF4 kernels on B200. 2025 has seen a tremendous rise of pythonic kernel DSLs; we got on-prem hardware to make reliable ncu benchmarking available to all, and we hope the best kernel DSL and the best kernel DSL author win.