Pinned Tweet
My old Twitter (@ventali_) and Telegram were hacked. The attacker is impersonating me to target tech and crypto founders & investors with phishing links — even using deepfakes of me on Zoom. >50 people were contacted. Some nearly clicked. I filed 10 reports to @X and haven't heard back. I called the SF police and filed an FBI IC3 report. Please don’t click any links from the attackers. I only use: Twitter → @_ventali Telegram → @ventalitan
I'm hosting AI & science reading group tonight at Mox. Come through!
Ventali Tan retweeted
I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter.

The more interesting part for me (esp as a computer vision person at heart who is temporarily masquerading as a natural language person) is whether pixels are better inputs to LLMs than text. Whether text tokens are wasteful and just terrible at the input.

Maybe it makes more sense that all inputs to LLMs should only ever be images. Even if you happen to have pure text input, maybe you'd prefer to render it and then feed that in:

- more information compression (see paper) => shorter context windows, more efficiency
- significantly more general information stream => not just text, but e.g. bold text, colored text, arbitrary images.
- input can now be processed with bidirectional attention easily and as default, not autoregressive attention - a lot more powerful.
- delete the tokenizer (at the input)!! I already ranted about how much I dislike the tokenizer. Tokenizers are ugly, a separate, non-end-to-end stage. They "import" all the ugliness of Unicode and byte encodings, inherit a lot of historical baggage, and add security/jailbreak risk (e.g. continuation bytes). They make two characters that look identical to the eye into two completely different tokens internally in the network. A smiling emoji looks like a weird token, not an... actual smiling face, pixels and all, and all the transfer learning that brings along. The tokenizer must go.

OCR is just one of many useful vision -> text tasks. And text -> text tasks can be made into vision -> text tasks. Not vice versa. So maybe the User message is images, but the decoder (the Assistant response) remains text. It's a lot less obvious how to output pixels realistically... or if you'd want to.

Now I have to also fight the urge to side quest an image-input-only version of nanochat...
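A minimal sketch of the "render text and feed pixels" idea above, assuming a ViT-style patch front-end; the font, canvas size, and 16x16 patch size are arbitrary choices for illustration, not anything from the DeepSeek-OCR paper.

```python
# Sketch: rasterize plain text to pixels, then cut it into ViT-style patches,
# i.e. "visual tokens" produced with no tokenizer in the loop.
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def render_text(text: str, width: int = 512, height: int = 64) -> np.ndarray:
    """Rasterize a string onto a white canvas and return a grayscale array in [0, 1]."""
    img = Image.new("L", (width, height), color=255)
    draw = ImageDraw.Draw(img)
    draw.text((4, 4), text, fill=0, font=ImageFont.load_default())
    return np.asarray(img, dtype=np.float32) / 255.0

def patchify(img: np.ndarray, patch: int = 16) -> np.ndarray:
    """Split an HxW image into flattened patch vectors (the input a ViT encoder expects)."""
    h, w = img.shape
    h, w = h - h % patch, w - w % patch
    img = img[:h, :w]
    patches = img.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return patches.reshape(-1, patch * patch)

tokens = patchify(render_text("Hello, world! Rendered text, no tokenizer."))
print(tokens.shape)  # (num_patches, 256): pixel patches in place of text tokens
```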
🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai, exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support.
🧠 Compresses visual contexts up to 20× while keeping 97% OCR accuracy at <10× compression.
📄 Outperforms GOT-OCR2.0 & MinerU2.0 on OmniDocBench using fewer vision tokens.
🤝 The vLLM team is working with DeepSeek to bring official DeepSeek-OCR support into the next vLLM release — making multimodal inference even faster and easier to scale.
🔗 github.com/deepseek-ai/DeepS…
#vLLM #DeepSeek #OCR #LLM #VisionAI #DeepLearning
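A hedged sketch of what offline inference for an OCR-style vision-language model can look like with vLLM's multimodal API; the model id, prompt template, and sampling settings below are assumptions for illustration, so check the DeepSeek-OCR repo and vLLM docs for the exact recipe.

```python
# Sketch only: vLLM offline inference with an image input.
from PIL import Image
from vllm import LLM, SamplingParams

# Assumed Hugging Face model id; the real deployment may differ.
llm = LLM(model="deepseek-ai/DeepSeek-OCR", trust_remote_code=True)
image = Image.open("page.png").convert("RGB")

outputs = llm.generate(
    {
        # Placeholder prompt format; the model's actual OCR prompt may differ.
        "prompt": "<image>\nExtract the text from this document.",
        "multi_modal_data": {"image": image},
    },
    SamplingParams(temperature=0.0, max_tokens=2048),
)
print(outputs[0].outputs[0].text)
```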
🤍
Esteemed physicist and Nobel laureate Chen Ning Yang passes away at the age of 103 in Beijing on 18 October.
Ventali Tan retweeted
Happy 103rd birthday to theoretical physicist Chen Ning Yang (楊振寧). Yang shared the 1957 #NobelPrize in Physics with Tsung Dao Lee for investigating the parity laws, which led to discoveries regarding the elementary particles. Learn more: nobelprize.org/prizes/physic…
Ventali Tan retweeted
Announcing Bread Technologies. We’re building machines that learn like humans. We raised a $5 million seed round led by Menlo Ventures and have been building in stealth for 10 months. Today, we rise 🍞
Ventali Tan retweeted
Evals and breakfast crew with @denise_teng25 @vibhuuuus @_ventali @nicksrose72 @tomas_hk (@notdiamond_ai) @thegavinbains. Awesome to debate and discover where evals are being/could be used (synthetic data, rubrics, prompt optimization and prompt learning). Tl;dr feedback is fuel
okay so we've been building valida 0.1 to 0.9 for 3 yrs. now it's finally 1.0!! forgive our (or my, or morgan's?) OCD!!!!! huge thanks to Ivan @imikushin and Hideaki Takahashi for contributing to valida. we will have more updates in the next release <3
We’re releasing Valida zkVM 1.0.0 — our first major milestone. This version brings parallel proving, debugger support, smarter codegen, and stronger soundness checks. Thread 🧵
🤍
next release soon 🤍
I'm hosting an AI & Science reading group at Mox SF on Wednesdays 7-9PM. Each week, we’ll dive into a different area where AI is reshaping scientific discovery and research.

Topics we’ll explore include:
• AI & Genomics
• AI & Mathematics
• AI & Physics
• AI & Drug Discovery
• AI & Material Science
• AI & Climate Science
• AI & Neuroscience
• AI & Cosmology
• (and more as the group evolves!)

Format:
• Short expert talk or group-led introduction (when we have a guest speaker)
• Paper / reading discussion (everyone welcome to suggest!)
• Open conversation + brainstorming collaborations

RSVP: partiful.com/e/VoChGapvZUmH5…
#AI #science #research
Ventali Tan retweeted
New fastest shortest-path algorithm in 41 years! Tsinghua researchers broke the 1984 “sorting barrier” for Dijkstra’s algorithm, achieving O(m log^(2/3) n) time. This means faster route planning, less traffic, cheaper deliveries, and more efficient networks - and a CS curriculum revamp =)
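For context on what the “sorting barrier” refers to, here is the classic heap-based Dijkstra baseline (not the new algorithm): the priority-queue ordering is the "sorting" in question. With Fibonacci heaps this runs in O(m + n log n); the new result gets O(m log^(2/3) n) without maintaining a fully sorted frontier.

```python
# Baseline for comparison: textbook Dijkstra with a binary heap.
import heapq

def dijkstra(graph: dict[int, list[tuple[int, float]]], source: int) -> dict[int, float]:
    """graph[u] = [(v, weight), ...]; returns shortest distances from source."""
    dist = {source: 0.0}
    pq = [(0.0, source)]  # min-heap keyed by tentative distance (the sorting bottleneck)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, skip
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

print(dijkstra({0: [(1, 2.0), (2, 5.0)], 1: [(2, 1.0)], 2: []}, 0))  # {0: 0.0, 1: 2.0, 2: 3.0}
```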
listening to venetian snares in waymo is surreal
sudo is opening up 10 spots for early testers. you will receive 10% extra credits on LLM routing. DM me if interested!
Ventali Tan retweeted
Releasing Valida Plus — our biggest update yet to the Valida zkVM. From client-side proofs to Ethereum block proving, this release pushes the boundaries of ZK performance and usability. Highlights below 🧵
Valida is now open-source. We started Valida as an open project — and we’ll keep building it in the open. It’s hard to push forward a fundamentally different architecture in a world that rewards short-term wins. But this is who I am, and what this team stands for. So be it. Valida has already made a significant impact across the industry. By open-sourcing it, I hope we can take it further — to serve the wider world that needs trust in computation. Would love to support any teams that want to build on top of Valida!
Big milestone for us — Valida is now open-source! One of the fastest and most user-friendly zkVMs, now ready for the world to build on.
。゚+.ღ(ゝ◡ ⚈ღ)
hiii lita's in berlin! ღゝ◡╹)ノ♡
On this topic, we also wrote an article on why we think a zk-friendly ISA could be a better target:
Re: @VitalikButerin’s proposal.
Why we think Custom ISA > RISC-V for Ethereum’s future.
lita.foundation/blog/optimiz…
Great overview!
On ISAs and Ethereum, just in time for the weekend. Next week's discussions will be fun🌶️ Thanks for the comments @alexberegszaszi @ethchris @georgwiese @gballet @drakefjustin !
Just dropped our thoughts on EVM 2.0 on Ethmagicians. We see 3 options: (1) execution optimized, (2) ZK optimized, (3) 2-step approach: HLL->blockchainVM->ZKVM. We suggest exploring all 3 (we took (3) with Cairo, but Ethereum should look at all three). ethereum-magicians.org/t/evm…