New paper from ByteDance Seed: Scaling Latent Reasoning via Looped LMs
This paper proposes Ouro, which reuses the same layers to think in latent space instead of dumping long chain-of-thought text
2-3x param efficiency + increased performance via iterative latent computation
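To make the looped idea concrete, here's a minimal sketch (my own toy stand-in, not the actual Ouro architecture) of a weight-tied block applied repeatedly, so depth comes from iteration rather than extra parameters:

```python
import torch
import torch.nn as nn

class LoopedLM(nn.Module):
    """Minimal looped-LM sketch: one shared transformer block applied
    n_loops times, so depth comes from iteration, not extra parameters."""

    def __init__(self, vocab_size=32000, d_model=512, n_heads=8, n_loops=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # A single shared block stands in for the reused layer stack.
        self.shared_block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.n_loops = n_loops
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids):
        h = self.embed(token_ids)
        # "Thinking in latent space": refine the hidden state by looping
        # the same weights instead of emitting chain-of-thought tokens.
        for _ in range(self.n_loops):
            h = self.shared_block(h)
        return self.lm_head(h)

model = LoopedLM()
logits = model(torch.randint(0, 32000, (1, 16)))  # -> (1, 16, 32000)
```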
Introducing Kimi-K2 for understanding research papers 🚀
Use Kimi-K2 to understand... the Kimi-K2 paper itself
Highlight any section to ask questions and "@" other papers for quick context, comparisons, and benchmark references
Hottest paper on AlphaXiv 📈
Language Models are Injective and Hence Invertible
Every prompt maps to a unique hidden state and can be exactly reconstructed with this paper’s algorithm SIPIT. This means the model’s internal activations are the full prompt in disguise!!
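Here's a toy illustration of the sequential-inversion idea, with a tiny GRU standing in for the transformer (this is not the paper's SIPIT code, just a demo of how injectivity lets you recover a prompt token by token from its hidden states):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy causal "LM": an embedding plus one GRU layer stands in for a transformer.
vocab_size, d_model = 50, 32
embed = nn.Embedding(vocab_size, d_model)
rnn = nn.GRU(d_model, d_model, batch_first=True)

def hidden_states(token_ids):
    """Per-position hidden states: (1, T) token ids -> (1, T, d_model)."""
    with torch.no_grad():
        out, _ = rnn(embed(token_ids))
    return out

# The "leaked" activations of a secret prompt.
secret = torch.randint(0, vocab_size, (1, 6))
target = hidden_states(secret)

# Sequential inversion: at each position, pick the vocabulary token whose
# forward pass reproduces the observed hidden state. The true token matches
# exactly (error 0), and injectivity makes collisions vanishingly rare.
recovered = []
for t in range(target.shape[1]):
    best_tok, best_err = None, float("inf")
    for tok in range(vocab_size):
        cand = torch.tensor([recovered + [tok]])
        err = (hidden_states(cand)[0, t] - target[0, t]).norm().item()
        if err < best_err:
            best_tok, best_err = tok, err
    recovered.append(best_tok)

assert recovered == secret[0].tolist()  # exact reconstruction of the prompt
```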
cool idea from Meta
What if we moved CoT + RL's token-space thinking into a "latent space"?
This research proposes "The Free Transformer", which lets LLMs make global decisions in a latent space (via a VAE encoder) that can then simplify autoregressive sampling
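A rough conditional-VAE sketch of the concept (my own toy stand-in, not the paper's architecture): an encoder compresses the whole sequence into one global latent z, and the autoregressive decoder conditions every step on it:

```python
import torch
import torch.nn as nn

class LatentDecisionLM(nn.Module):
    """Toy conditional-VAE LM: a global latent z is inferred from the whole
    sequence (training) or drawn from the prior (sampling), and the
    autoregressive decoder conditions every step on it."""

    def __init__(self, vocab_size=1000, d_model=64, d_latent=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.to_mu = nn.Linear(d_model, d_latent)
        self.to_logvar = nn.Linear(d_model, d_latent)
        self.decoder = nn.GRU(d_model + d_latent, d_model, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)
        self.d_latent = d_latent

    def forward(self, token_ids):
        x = self.embed(token_ids)
        # The encoder sees the full sequence and posits one global latent.
        _, h = self.encoder(x)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        # Broadcast z to every timestep so the decoder conditions on it.
        z_seq = z.unsqueeze(1).expand(-1, x.shape[1], -1)
        out, _ = self.decoder(torch.cat([x, z_seq], dim=-1))
        return self.lm_head(out), mu, logvar

    def prior_latent(self, batch=1):
        # At generation time, z comes from the prior, not the encoder.
        return torch.randn(batch, self.d_latent)
```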
One of our best talks yet. Thanks @a1zhang for the amazing presentation + Q&A on Recursive Language Models!
If you're interested in how we can get agents to handle near-infinite contexts, this one is a must.
Watch the recording here! piped.video/_TaIZLKhfLc
Someone stole your model & u can’t prove it?
This Stanford paper just showed that you can tell whether a model is a copy of your model, or finetuned from it, using just its generated text
So if someone yoinks DeepSeek-v3.2 and finetunes it, it’ll leave statistical traces!
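The general flavor, as a crude stand-in for the paper's actual statistical test (all names here are hypothetical): fingerprint each model by token statistics of text it generates, then compare distances:

```python
from collections import Counter
import math

def token_profile(samples):
    """Normalized unigram frequency profile of a model's generated text.
    (Toy stand-in for the paper's statistical test, which we don't reproduce.)"""
    counts = Counter(tok for text in samples for tok in text.split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def profile_distance(p, q, eps=1e-9):
    """Symmetric KL-style distance between two profiles; models finetuned
    from the same base should sit much closer than unrelated models."""
    d = 0.0
    for tok in set(p) | set(q):
        pi, qi = p.get(tok, eps), q.get(tok, eps)
        d += pi * math.log(pi / qi) + qi * math.log(qi / pi)
    return d

# Usage sketch: sample both models on the same prompts, then compare the
# suspect's distance to your model against distances to unrelated baselines.
# profile_distance(token_profile(my_samples), token_profile(suspect_samples))
```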
Introducing personalized arXiv feeds 🚀
Rate 5 papers, get a feed tailored to your research that learns what you care about
Find the papers you need in seconds, not hours
Introducing the Illustrated Transformer in 3D 🚀
Fly through LLaMA like never before. See every tensor and operation in motion.
Click any component to reveal the exact lines of code that run it.
A new way to learn and teach LLMs. Try it out in the link below 👇
Who Said Neural Networks Aren’t Linear??
In this paper, the authors collapse diffusion model sampling down to a single step by introducing the Linearizer, which sandwiches a linear map A between two invertible neural networks!
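A toy sketch of why that helps (simplifying assumptions: the same invertible net on both sides and an elementwise-invertible activation; not the paper's architecture): once the map is linear in latent space, n iterated steps collapse into one matrix power:

```python
import torch
import torch.nn as nn

class InvertibleBlock(nn.Module):
    """Tiny invertible net: a square linear map plus LeakyReLU, both invertible."""

    def __init__(self, dim):
        super().__init__()
        self.W = torch.eye(dim) + 0.01 * torch.randn(dim, dim)
        self.b = torch.zeros(dim)

    def forward(self, x):
        return nn.functional.leaky_relu(x @ self.W + self.b, 0.5)

    def inverse(self, y):
        x = torch.where(y >= 0, y, y / 0.5)  # invert LeakyReLU(0.5)
        return torch.linalg.solve(self.W.T, (x - self.b).T).T

dim = 8
g = InvertibleBlock(dim)                                 # invertible "bread"
A = 0.9 * torch.eye(dim) + 0.05 * torch.randn(dim, dim)  # linear "filling"

def f(x, n=1):
    """x -> g^{-1}(A^n g(x)): n sandwiched steps collapse to one matrix power."""
    return g.inverse(g(x) @ torch.linalg.matrix_power(A, n).T)

x = torch.randn(2, dim)
ten_steps = x
for _ in range(10):
    ten_steps = f(ten_steps)                 # ten sequential "sampling" steps
print(torch.allclose(f(x, n=10), ten_steps, atol=1e-4))  # True: one step suffices
```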
📢 Releasing our latest paper
For reasoning LLMs, we found a way to save up to 50% of tokens without impacting accuracy.
It turns out that LLMs know when they're right, and we can use that fact to stop generation early.
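A minimal sketch of the early-exit idea, assuming a Hugging Face-style `model(ids).logits` interface and a hypothetical `end_think_id` token (the paper's actual confidence signal may differ):

```python
import torch

def generate_with_confidence_exit(model, ids, end_think_id,
                                  max_new=2048, threshold=0.5):
    """Greedy decoding that watches the probability the model assigns to a
    hypothetical end-of-reasoning token and forces the exit once the model
    is confident it's done, saving the remaining reasoning tokens."""
    for _ in range(max_new):
        with torch.no_grad():
            logits = model(ids).logits[0, -1]   # next-token logits
        probs = torch.softmax(logits, dim=-1)
        if probs[end_think_id] >= threshold:
            next_tok = torch.tensor([[end_think_id]])  # stop early: it's sure
        else:
            next_tok = probs.argmax().view(1, 1)       # keep reasoning
        ids = torch.cat([ids, next_tok], dim=-1)
        if next_tok.item() == end_think_id:
            break
    return ids
```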
The three.js visualization parses llama-cpp's ggml debug output (for unsloth llama 3.1) to directly capture all the tensor calculations happening under the hood. Operations (MUL_MAT, ROPE, RESHAPE, ADD) are grouped into query, key, value, MLP, and residual-stream blocks. Hovering over a block shows the exact computation in the info card, and clicking it jumps to the corresponding lines in our custom syntax-highlighted editor. Thanks @tch1001 for the amazing work!
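For the curious, a rough Python sketch of the parse-and-group step (the debug-line format, regex, and group mapping here are assumptions for illustration, not the actual implementation):

```python
import re
from collections import defaultdict

# Assumed shape of a ggml debug line, e.g.:
#   "MUL_MAT: blk.0.attn_q.weight x l_in -> Qcur"
LINE = re.compile(r"(?P<op>\w+):\s+(?P<args>.+?)\s+->\s+(?P<out>\S+)")

# Map tensor-name substrings to the blocks the 3D view renders.
GROUPS = {
    "attn_q": "query", "attn_k": "key", "attn_v": "value",
    "ffn": "MLP", "l_out": "residual stream",
}

def group_ops(debug_text):
    """Bucket parsed tensor operations into visualization blocks."""
    blocks = defaultdict(list)
    for line in debug_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue
        group = next((g for key, g in GROUPS.items() if key in line), "other")
        blocks[group].append((m["op"], m["args"], m["out"]))
    return dict(blocks)
```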