must read. 🤗
After ~4 years building SOTA models & datasets, we're sharing everything we learned in ⚡The Smol Training Playbook. We cover the full LLM cycle: designing ablations, choosing an architecture, curating data, post-training, and building solid infrastructure. We'll help you navigate the messy training reality that LLM papers don't cover. Chapter highlights in the 🧵
Claude Opus 4 > generate an artifact to represent yourself
Akame from Akame ga Kill!, by Sonnet 3.7 with thinking mode. 2 turns.
Why? 🤔 Warmup ends at 2k steps, but still, why such a huge drop?
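For reference, a minimal sketch of the warmup-then-decay schedule implied here: linear warmup to 2k steps (from the tweet), then cosine decay. The peak LR and total step count are illustrative assumptions, not values from this run.

```python
import math

def lr_at(step: int, max_lr: float = 3e-4,
          warmup_steps: int = 2_000, total_steps: int = 100_000) -> float:
    """Linear warmup to max_lr over warmup_steps, then cosine decay to 0.

    The schedule switches regime exactly at the warmup boundary, which is
    one common reason training curves bend sharply around step 2k.
    """
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * max_lr * (1.0 + math.cos(math.pi * progress))

# LR just before and just after the 2k warmup boundary
print(lr_at(1_999), lr_at(2_000), lr_at(2_001))
```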
SkyReels. image2video
Thanks for the great content, Karpathy sensei.
New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT"

This is a general-audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental models for how to think about their "psychology" and how to get the best use out of them in practical applications.

We cover all the major stages:
1. Pretraining: data, tokenization, Transformer neural network I/O and internals, inference, GPT-2 training example, Llama 3.1 base inference examples
2. Supervised finetuning: conversations data, "LLM psychology": hallucinations, tool use, knowledge/working memory, knowledge of self, models need tokens to think, spelling, jagged intelligence
3. Reinforcement learning: practice makes perfect, DeepSeek-R1, AlphaGo, RLHF

I designed this video for the "general audience" track of my videos, which I believe are accessible to most people, even without a technical background. It should give you an intuitive understanding of the full training pipeline of LLMs like ChatGPT, with many examples along the way, and maybe some ways of thinking about current capabilities, where we are, and what's coming.

(Also, I have one "Intro to LLMs" video already from ~a year ago, but that is just a re-recording of a random talk, so I wanted to loop around and do a much more comprehensive version of this topic. They can still be combined, as the talk goes a lot deeper into other topics, e.g. LLM OS and LLM security.)

Hope it's fun & useful! piped.video/watch?v=7xTGNNLP…
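A minimal sketch of the kind of base-model inference the video walks through, using Hugging Face transformers with gpt2 as a small stand-in (the model choice, prompt, and sampling settings are illustrative assumptions):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# gpt2 as a small stand-in; the video demos GPT-2 and Llama 3.1 base models.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# A base model just continues text; it is not instruction-tuned.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```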
New Lumina-Image 2.0 looks good at instruction following; for only a 2B model, it's impressive. (Lumina-Image-2.0 is a 2-billion-parameter flow-based diffusion transformer capable of generating images from text descriptions.) Prompt in alt.
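For reference, a hedged sketch of running a text-to-image model like this through diffusers. The repo id below is an assumption based on the model name; DiffusionPipeline resolves the concrete pipeline class from the repo's config, so this avoids guessing a specific class name.

```python
import torch
from diffusers import DiffusionPipeline

# Repo id is an assumption based on the tweet, not a verified path.
pipe = DiffusionPipeline.from_pretrained(
    "Alpha-VLLM/Lumina-Image-2.0", torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# Standard text-to-image call shared by diffusers pipelines.
image = pipe(prompt="a watercolor fox reading a newspaper").images[0]
image.save("fox.png")
```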
More compute = time traveling. So that's how the world works.
flux_dev LoRA trained on Akame from Akame ga Kill!
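A minimal sketch of stacking a character LoRA on top of FLUX.1-dev with diffusers; the LoRA path and prompt below are placeholders, not the actual files from this post.

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
# Placeholder path for the trained character LoRA.
pipe.load_lora_weights("path/to/akame_lora.safetensors")
pipe.to("cuda")

image = pipe(
    prompt="akame, red eyes, long black hair, katana",
    num_inference_steps=28,
).images[0]
image.save("akame.png")
```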
Vladimir Albrekht retweeted
The whale strikes yet again! DeepSeek v3 Base 🔥
> 685B (MoE) params
> fp8
> Instruct beats Claude 3.5 Sonnet on the Aider benchmark
I hope they release the Instruct version too ;)
Looks like a really interesting place to visit, with a unique style. FLUX Style Shaping. Two source images in the comments.
Open-source model for "Computer Use".
- The model is only 2B, based on QwenVL 2B.
- A new possible variation of VLM usage.
I tested it on a few random pictures; the model grounded them correctly. The authors claim their training method is 1.4x faster than regular 2B VLM training.
Vladimir Albrekht retweeted
🔥We're thrilled to announce: ShowUI Local Run!🔥
🧑‍💻Now you can use our 2B vision-language-action model for local computer control!
💰30x cheaper than Claude!
🔗Model: github.com/showlab/ShowUI
🔗Computer Use OOTB: github.com/showlab/computer_…
#ComputerUse #Agent #Claude
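Since ShowUI is built on QwenVL 2B, a grounding query should follow the standard Qwen2-VL path in transformers. A hedged sketch, where the repo id, screenshot, and instruction are assumptions for illustration:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Repo id assumed from the tweet's links; not verified here.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "showlab/ShowUI-2B", torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained("showlab/ShowUI-2B")

image = Image.open("screenshot.png")  # any UI screenshot
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Click the 'Sign in' button."},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```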