Akshay 🚀 · Nov 8, 2025 · 12:30 PM UTC

Abhishek Reddy retweeted

Akshay 🚀

@akshay_pachaar

14h

Fine-tune DeepSeek-OCR on your own language! (100% local)

DeepSeek-OCR is a 3B-parameter vision model that achieves 97% precision while using 10× fewer vision tokens than text-based LLMs. It handles tables, papers, and handwriting without killing your GPU or budget.

Why it matters: Most vision models treat documents as massive sequences of tokens, making long-context processing expensive and slow. DeepSeek-OCR uses context optical compression to convert 2D layouts into vision tokens, enabling efficient processing of complex documents.

The best part? You can easily fine-tune it for your specific use case on a single GPU. I used Unsloth to run this experiment on Persian text and saw an 88.26% improvement in character error rate.
↳ Base model: 149% character error rate (CER)
↳ Fine-tuned model: 60% CER (57% more accurate)
↳ Training time: 60 steps on a single GPU

Persian was just the test case. You can swap in your own dataset for any language, document type, or specific domain you're working with.

I've shared the complete guide in the next tweet - all the code, notebooks, and environment setup ready to run with a single click. Everything is 100% open-source!
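A CER above 100% can look like a typo, so here is a minimal sketch of how the metric is commonly computed: character-level edit distance divided by the length of the reference text. This is the standard definition, not necessarily the exact evaluation script used in the experiment above, and the function names are illustrative.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            current.append(min(
                previous[j] + 1,                          # delete ref_char
                current[j - 1] + 1,                       # insert hyp_char
                previous[j - 1] + (ref_char != hyp_char)  # substitute or match
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a fraction of the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because insertions count as errors, a model that emits far more characters than the reference contains can score above 1.0 (that is, above 100%): `cer("ab", "abxyz")` is 3 edits over 2 reference characters.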

156

1,024

Raul Junco · Nov 6, 2025 · 1:01 PM UTC

Abhishek Reddy retweeted

Raul Junco

@RaulJuncoV

Nov 6

System design is the art of making scale look boring. Here are 3 system-design project ideas you can actually build and reason about 👇

Project idea 1: “Instagram-style Feed Service”
Your goal: design a timeline that scales reads.
Key challenges to solve:
- fan-out on write vs fan-out on read
- caching the feed (Redis? CDN?)
- handling the “celebrity problem” (1M followers)
Deliverable: write a design doc that defends why you picked your fan-out strategy and how you avoid thundering herds.

Project idea 2: “URL Shortener at 5k RPS”
Your goal: tiny API that forces huge decisions.
Key challenges to solve:
- ID generation strategy (Snowflake IDs? base62?)
- consistent hashing across shards
- hot key protection
Deliverable: build a prototype, hammer it with a load generator, and tune your write path until you get predictable low-latency writes.

Project idea 3: “E-Commerce Checkout as a SAGA”
Your goal: durability + correctness over everything.
Key challenges to solve:
→ Payment, Inventory, Order microservices coordination
→ Orchestrator vs Choreography
→ idempotency and retries
Deliverable: show how you avoid double-charging customers through idempotent event handling + a durable orchestrator.

Just picking a “cool” tool won’t save you. Good system design comes from defending your trade-offs.
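As one concrete starting point for the ID-generation question in project idea 2, here is a minimal sketch of base62 encoding. It assumes numeric IDs already come from somewhere (a database sequence or a Snowflake-style generator); the alphabet order and the function names are illustrative choices, not part of the original post.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62


def encode_base62(number: int) -> str:
    """Turn a non-negative integer ID into a short URL-safe code."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Recover the integer ID from a code produced by encode_base62."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number
```

Seven base62 characters cover 62^7 (about 3.5 trillion) IDs. Sequential IDs produce guessable codes, which is one of the trade-offs the design doc would need to defend.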

532

ₕₐₘₚₜₒₙ · Nov 7, 2025 · 9:28 PM UTC

Abhishek Reddy retweeted

ₕₐₘₚₜₒₙ

@hamptonism

Nov 7

Study Von Neumann (Game Theory), Study McCulloch (Neural Networks), Study Jung (Consciousness).

355

2,962

Nikki Siapno · Nov 7, 2025 · 1:28 PM UTC

Abhishek Reddy retweeted

Nikki Siapno

@NikkiSiapno

Nov 7

How do we design effective and safe APIs?

APIs have increasingly become the backbone of modern software. To understand some of the key principles and best practices of API design, let's analyze a social media platform example:

🔹 𝗥𝗲𝘀𝗼𝘂𝗿𝗰𝗲 𝗻𝗮𝗺𝗶𝗻𝗴
↳ Clarity is key when creating APIs. Adopting simple resource names, like /users for accessing user profiles and /posts for retrieving user posts, streamlines the development process and reduces mental strain.

🔹 𝗨𝘀𝗲 𝗼𝗳 𝗽𝗹𝘂𝗿𝗮𝗹𝘀
↳ It's important to maintain a standard of consistency in API design. For consistency and readability, use plural resource names (such as GET /users/{userId}/friends rather than /friend) to avoid ambiguity in API requests.

🔹 𝗖𝗿𝗼𝘀𝘀-𝗿𝗲𝗳𝗲𝗿𝗲𝗻𝗰𝗶𝗻𝗴 𝗿𝗲𝘀𝗼𝘂𝗿𝗰𝗲𝘀
↳ Interlinking resources, like fetching the comments on a post with GET /posts/{postId}/comments, simplifies the retrieval of related data. It provides a more streamlined and well-organized user experience.

🔹 𝗜𝗱𝗲𝗺𝗽𝗼𝘁𝗲𝗻𝗰𝘆
↳ Maintaining API reliability is crucial. Idempotency ensures that operations like profile updates (PUT /users/{userId}/profile) produce the same result no matter how many times they're executed. Learn more about idempotency here: lucode.co/idempotency-in-api…

🔹 𝗦𝗲𝗰𝘂𝗿𝗶𝘁𝘆
↳ It goes without saying, security is a must-have. To secure the API endpoints, employ authentication methods like X-AUTH-TOKEN and X-SIGNATURE, and use authorization headers for verifying user permissions.

🔹 𝗩𝗲𝗿𝘀𝗶𝗼𝗻𝗶𝗻𝗴
↳ Communicating version updates is another important practice. Endpoints like GET /v2/users/{userId}/posts allow API versioning to maintain functionality regardless of updates. This approach ensures backward compatibility and a smooth transition for users and us.

🔹 𝗣𝗮𝗴𝗶𝗻𝗮𝘁𝗶𝗼𝗻
↳ Important for performance. Paginate large datasets, like feeds or comment lists, with GET /posts?page=5&pageSize=20 to enhance data delivery and UX.

Great APIs come from good practices: clear docs, strong monitoring, consistent error handling, and more. Adopting these practices helps us build secure, performant APIs that deliver great user experiences.

What else would you add?

--
👋 PS: If you like this post, then you'll love our newsletter. Join 25,000+ software engineers: lucode.co/luc-newsletter-lm1…
PPS: You get our Architecture Patterns Playbook for free when you join. It’s packed with visuals, tradeoffs, & real-world examples.
--
🔖 Save for later • ♻️ Repost to help others
🙋🏻‍♀️ Follow Nikki Siapno • Turn on notifications 🔔
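Two of the points above, idempotency and pagination, are easy to see in a few lines. The sketch below uses plain in-memory data instead of a web framework; the handler names are hypothetical, and treating `page` as 1-indexed is an assumption, since the post does not say which convention it uses.

```python
profiles = {}  # user_id -> profile dict, standing in for a database table


def put_profile(user_id, profile):
    """PUT /users/{userId}/profile: replace the stored profile wholesale.

    Repeating the same call leaves the same state, which is what makes a
    retry after a timeout safe.
    """
    profiles[user_id] = dict(profile)
    return profiles[user_id]


def paginate(items, page, page_size):
    """GET /posts?page=N&pageSize=M: return one slice of the collection.

    Assumes pages are numbered from 1.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return items[start:start + page_size]
```

With 100 posts, `paginate(posts, page=5, page_size=20)` returns the last 20, and page 6 returns an empty list rather than an error. Offset pagination like this gets slower on deep pages and can skip or repeat items when the collection changes between requests, which is why some APIs use cursors instead.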

135

872

Math Cafe · Nov 7, 2025 · 6:58 AM UTC

Abhishek Reddy retweeted

Math Cafe

@Riazi_Cafe_en

Nov 7

Maryland's "Mathematical Logic" notes by David W. Kueker
PDF 1: math.umd.edu/~dkueker/712.pd…
PDF 2: math.umd.edu/~dkueker/713.pd…

102

568

Ganesh Kumar · Nov 7, 2025 · 9:54 AM UTC

Abhishek Reddy retweeted

Ganesh Kumar

@Ganeshuor

Nov 7

Matrix Cheat Sheet

363

4,859

Alex Smith · Nov 7, 2025 · 4:00 PM UTC

Abhishek Reddy retweeted

Alex Smith

@ninja_maths

Nov 7

On the subject of Laplace Transforms, our Differential Equations course is well underway. I'm excited about this one. Here's a sneak peek.

Alex Smith

@ninja_maths

Nov 6

I love @3blue1brown. However, the biggest beneficiaries are those who already have solid fundamentals. Take the recent excellent video on Laplace Transforms. I wouldn't recommend starting there if you've no idea what a Laplace Transform is! Build the fundamental skills first.

758

Mathieu · Nov 7, 2025 · 6:27 PM UTC

Abhishek Reddy retweeted

Mathieu

@miniapeur

Nov 7

1,072

Akshay 🚀 · Nov 7, 2025 · 12:30 PM UTC

Abhishek Reddy retweeted

Akshay 🚀

@akshay_pachaar

Nov 7

Multi-head attention in LLMs, visually explained:

237

Python Programming · Nov 6, 2025 · 7:15 AM UTC

Abhishek Reddy retweeted

Python Programming

@PythonPr

Nov 6

Linear Regression
Image credit: Data Interview

124

676

Lorenzo Noci · Nov 7, 2025 · 3:50 PM UTC

Abhishek Reddy retweeted

Lorenzo Noci @lorenzo_noci

Nov 7

Replying to @micbucci @francoisfleuret

Having the init close to linear is important for a well-defined large-depth limit at init (which ensures stable gradients). For instance, this paper investigates the problem: arxiv.org/abs/2110.01765

Rapid training of deep neural networks without skip connections or...

Using an extended and formalized version of the Q/C map analysis of Poole et al. (2016), along with Neural Tangent Kernel theory, we identify the main pathologies present in deep networks that...

arxiv.org

François Fleuret · Nov 7, 2025 · 2:18 PM UTC

Abhishek Reddy retweeted

François Fleuret

@francoisfleuret

Nov 7

There is a paper from 2017 that introduced a trick that I love but have never seen used. Consider two linear layers f and g that you initialize with the same parameters, and then you use h(x) = f(relu(x)) + g(-relu(-x)). Then at initialization, h is linear! 1/2
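The claim follows from the identity relu(x) - relu(-x) = x. Here is a small dependency-free check, assuming each layer has the form Wx + b (the post does not say whether biases are used; with them, h comes out affine, Wx + 2b). All names are illustrative.

```python
import random


def relu(vector):
    return [max(0.0, value) for value in vector]


def negate(vector):
    return [-value for value in vector]


def linear(weight, bias, vector):
    """Apply a dense layer: weight @ vector + bias."""
    return [
        sum(w * v for w, v in zip(row, vector)) + b
        for row, b in zip(weight, bias)
    ]


def make_h(weight, bias):
    # f and g are separate layers that merely start from identical parameters,
    # so they are free to diverge once training begins.
    f_weight, f_bias = [row[:] for row in weight], bias[:]
    g_weight, g_bias = [row[:] for row in weight], bias[:]

    def h(x):
        positive_path = linear(f_weight, f_bias, relu(x))
        negative_path = linear(g_weight, g_bias, negate(relu(negate(x))))
        return [p + n for p, n in zip(positive_path, negative_path)]

    return h


rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(2)]
b = [rng.gauss(0, 1) for _ in range(2)]
h = make_h(W, b)

x = [1.5, -2.0, 0.25]
# Since relu(x) - relu(-x) == x, h(x) should equal W x + 2 b at initialization.
expected = [wx_plus_b + bias for wx_plus_b, bias in zip(linear(W, b, x), b)]
```

Comparing `h(x)` with `expected` shows agreement up to floating-point rounding. Once f and g receive different gradient updates, the two paths stop cancelling and h becomes nonlinear.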

511

Branko · Nov 7, 2025 · 7:55 AM UTC

Abhishek Reddy retweeted

Branko

@brankopetric00

Nov 7

Our API handled 500 requests per second with zero issues. Marketing ran a campaign. Traffic hit 5,000 rps.

What broke wasn't what we expected:
- Application servers? Fine.
- Database? Fine.
- Load balancer? Fine.
- Our rate limiter crashed because we stored rate limit counters in Redis with no memory limits.

Redis ran out of memory. Rate limiter failed open. Actual traffic hit the API unthrottled. Then everything crashed.

We optimized for scale but not for the failure modes of our safety mechanisms.
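One way to reason about this failure mode is to decide in code what the limiter does when its counter store is unavailable. The sketch below is not the author's system: `InMemoryCounterStore` is a hypothetical stand-in for Redis, and the policy shown (fall back to a strict per-process budget instead of allowing everything) is just one option.

```python
import time


class StoreUnavailable(Exception):
    """Raised by the counter store when it cannot serve a request."""


class InMemoryCounterStore:
    """Hypothetical stand-in for Redis; set healthy=False to simulate an outage."""

    def __init__(self):
        self.counts = {}
        self.healthy = True

    def incr(self, key):
        if not self.healthy:
            raise StoreUnavailable(key)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


class RateLimiter:
    """Fixed-window limiter that degrades to a strict local budget, not to 'allow all'."""

    def __init__(self, store, limit, fallback_limit, window_seconds=1, clock=time.time):
        self.store = store
        self.limit = limit
        self.fallback_limit = fallback_limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._local_window = None
        self._local_counts = {}

    def allow(self, client_id):
        window = int(self.clock() // self.window_seconds)
        try:
            return self.store.incr(f"{client_id}:{window}") <= self.limit
        except StoreUnavailable:
            return self._allow_locally(client_id, window)

    def _allow_locally(self, client_id, window):
        # Keep counters for the current window only, so the fallback
        # does not accumulate memory across windows.
        if window != self._local_window:
            self._local_window = window
            self._local_counts = {}
        count = self._local_counts.get(client_id, 0) + 1
        self._local_counts[client_id] = count
        return count <= self.fallback_limit


store = InMemoryCounterStore()
limiter = RateLimiter(store, limit=3, fallback_limit=1, clock=lambda: 100.0)
healthy = [limiter.allow("client-a") for _ in range(4)]   # store is up
store.healthy = False
degraded = [limiter.allow("client-a") for _ in range(2)]  # store is down
```

Here `healthy` is `[True, True, True, False]` and `degraded` is `[True, False]`: the outage tightens the limit instead of removing it. In a real deployment the counter keys would also need an expiry and the store a memory cap, which is the gap the post describes.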

897

Joachim Schork · Nov 6, 2025 · 8:17 PM UTC

Abhishek Reddy retweeted

Joachim Schork

@JoachimSchork

Nov 6

One of the most common mix-ups in statistics is between standard deviation (SD) and standard error (SE). They sound similar, but they describe two completely different things, and using the wrong one can lead to misleading conclusions. Here's how to tell them apart.

🔹 Standard Deviation (SD): SD measures how spread out individual values are in your sample. It tells you about the variability within the data set.
Example: How much do individual incomes vary in a sample of 1,000 people?

🔹 Standard Error (SE): SE measures how much an estimate (like a mean or proportion) would vary across repeated samples. It tells you how precise your estimate is.
Example: How much would the sample mean income change if you ran the survey again?

As your sample gets larger, SE gets smaller because you're more confident in your estimate. But SD often stays about the same since it reflects the natural spread in the data, not how many observations you have.

Use SD to describe the data, and SE to describe the reliability of the estimate.

For more on statistics, data science, R, and Python, subscribe to my email newsletter. Click this link for detailed information: eepurl.com/gH6myT

#datastructure #Python #RStats #Python3 #Data #programmer
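The distinction is easy to check numerically. This sketch uses the usual formula for the standard error of a mean, SE = SD / sqrt(n), on synthetic "income" data; the Gaussian distribution and its parameters are made up purely for illustration.

```python
import math
import random
import statistics


def sd_and_se(sample):
    """Return (sample standard deviation, standard error of the mean)."""
    sd = statistics.stdev(sample)
    se = sd / math.sqrt(len(sample))
    return sd, se


rng = random.Random(42)
# Synthetic incomes drawn from the same population at two sample sizes.
small_sample = [rng.gauss(50_000, 15_000) for _ in range(100)]
large_sample = [rng.gauss(50_000, 15_000) for _ in range(10_000)]

sd_small, se_small = sd_and_se(small_sample)
sd_large, se_large = sd_and_se(large_sample)
```

Both SD values land near the population spread of 15,000, while the SE shrinks by roughly a factor of ten when the sample grows a hundredfold, matching the sqrt(n) in the denominator.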

101

517

Adithya S K · Nov 7, 2025 · 3:59 PM UTC

Abhishek Reddy retweeted

Adithya S K

@adithya_s_k

Nov 7

I feel the Gemma 3 models are quite underrated.
- They are multimodal (each image → 256 tokens).
- Have a robust tokenizer (~260k unique tokens, one of the most efficient tokenizers I’ve seen across multiple languages).
- Support a 128k input context.
- Offer relatively fast inference, and training is faster compared to Qwen2.5/3-VL.
- Are very easy to fine-tune, and they generalize well to many downstream tasks using both SFT and RL (GRPO).
- Are widely supported across frameworks.

I’ve been working with them for a while. They’re not groundbreaking models, but they make an excellent starting point for fine-tuning, especially the 4B-parameter model, which can be fine-tuned on a T4 GPU as well.

If anyone is interested, I can share a sample fine-tuning script.

100

Ganesh Kumar · Nov 6, 2025 · 2:15 PM UTC

Abhishek Reddy retweeted

Ganesh Kumar

@Ganeshuor

Nov 6

Everything about Derivative and Integral.

166

1,060

Tom Dörr · Nov 7, 2025 · 1:59 AM UTC

Abhishek Reddy retweeted

Tom Dörr

@tom_doerr

Nov 7

Extracts clean data from documents using vision-language models

410

Charles · Nov 6, 2025 · 4:29 AM UTC

Abhishek Reddy retweeted

Charles @m84736062

Nov 6

We reproduce DeepSeek-OCR training from scratch. The code, model, and results can be found on our website #DeepSeek pkulium.github.io/DeepOCR_we…

Technical Report-v1: DeepOCR

Open-source reproduction of DeepSeek-OCR

pkulium.github.io

279

joao carreira · Nov 6, 2025 · 7:11 PM UTC

Abhishek Reddy retweeted

joao carreira @joaocarreira

Nov 6

I'm looking for a student researcher to work with me at Google DeepMind in London, preferably starting early next year -- topics will be around novel video model architectures / learning from a single video stream / representation learning.

751

Edward Milsom · Nov 7, 2025 · 8:25 AM UTC

Abhishek Reddy retweeted

Edward Milsom @edward_milsom

Nov 7

It is viva day, my dudes.

548