Fine-tune DeepSeek-OCR on your own language! (100% local)

DeepSeek-OCR is a 3B-parameter vision model that achieves 97% precision while using 10× fewer vision tokens than text-based LLMs. It handles tables, papers, and handwriting without killing your GPU or budget.

Why it matters: Most vision models treat documents as massive sequences of tokens, making long-context processing expensive and slow. DeepSeek-OCR uses context optical compression to convert 2D layouts into vision tokens, enabling efficient processing of complex documents.

The best part? You can easily fine-tune it for your specific use case on a single GPU.

I used Unsloth to run this experiment on Persian text and saw an 88.26% improvement in character error rate.

↳ Base model: 149% character error rate (CER)
↳ Fine-tuned model: 60% CER (57% more accurate)
↳ Training time: 60 steps on a single GPU

Persian was just the test case. You can swap in your own dataset for any language, document type, or specific domain you're working with.

I've shared the complete guide in the next tweet - all the code, notebooks, and environment setup ready to run with a single click. Everything is 100% open-source!

Nov 8, 2025 · 12:30 PM UTC
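
For readers wondering how a character error rate can exceed 100%: CER is usually defined as the character-level edit distance between the model output and the reference, divided by the reference length, so an output padded with extra or hallucinated characters can score above 1.0. The sketch below is a minimal pure-Python illustration of that standard definition. It is not the evaluation code from the post, and the function names `levenshtein`, `cer`, and `relative_reduction` are made up for this example. The last lines plug in the rounded 149% and 60% figures from the post; the post's own 57% and 88.26% presumably come from unrounded measurements or a different calculation, so treat the result as illustrative only.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return levenshtein(ref, hyp) / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped, relative to the starting value."""
    return (before - after) / before


if __name__ == "__main__":
    # One wrong character out of four.
    print(cer("abcd", "abxd"))  # 0.25
    # Six extra characters against a three-character reference: CER above 100%.
    print(cer("abc", "abcxyzxyz"))  # 2.0
    # Rounded figures quoted in the post (assumption: 149% and 60% CER).
    print(f"{relative_reduction(1.49, 0.60):.1%} relative reduction")
```

With a real dataset you would average `cer` over all reference/prediction pairs before comparing the base and fine-tuned models.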

Replying to @akshay_pachaar
I find two things really interesting about it:
- Everything is happening on a single GPU.
- 57% performance boost after just 60 steps.
Thanks for the code, impressive!
💯 This makes such powerful tech accessible to everyone.
Replying to @akshay_pachaar
curious how well it handles mixed scripts.
Interesting point. Technically it shouldn’t be an issue. I have created this streaming app that you can run to test this.
Replying to @akshay_pachaar
What about the parsing into a format for further analysis?
Replying to @akshay_pachaar
I love this @akshay_pachaar what other use cases for OCR are there for Indian languages?
Replying to @akshay_pachaar
Do you know if there is OCR for music notes?
Replying to @akshay_pachaar
Do u have any tutorial on distillation? Would love to see how a 70B model can be distilled to 8B or lower for specific use cases, and how it performs from a precision-recall perspective.
Replying to @akshay_pachaar
ok but... this sounds like persian but is not. i mean it's an old text and the style and words are a bit different but the text on the right is full of nonsensical words.
Replying to @akshay_pachaar
OCR is important, but it has been costly. The average cost per document was around 0.5 cents. I hope this new approach will bring it down.
Replying to @akshay_pachaar
that’s incredible
Replying to @akshay_pachaar
do it with the Voynich manuscript and get the Nobel Prize
Replying to @akshay_pachaar
DeepSeek-OCR sounds quite promising, Akshay! It's amazing how much progress is being made in AI, right?
Replying to @akshay_pachaar
Is there an online web app to try?
Replying to @akshay_pachaar
Use Tesseract OCR for free
Replying to @akshay_pachaar
Interesting, smaller vision tokens could be a big win.
Replying to @akshay_pachaar
@grok how much does this cost to setup
Replying to @akshay_pachaar
My app relies on AWS Textract for data extraction from PDF files. Is it worth investing in this?
Replying to @akshay_pachaar
can DS OCR recognize Free Taiwan?