Rayan · Aug 4, 2025 · 6:25 PM UTC

Rayan

Pinned Tweet

Rayan

@AskRayan

Aug 4

Replying to @Alibaba_Qwen

QWEN-IMAGE NOW SUPER FASSSSSSST (<12s) replicate.com/qwen/qwen-imag… @replicate

Minette Kaunismäki · Nov 6, 2025 · 11:00 AM UTC

Rayan retweeted

Minette Kaunismäki @MinetteKaum

Nov 6

Huge thank you to everyone who joined us yesterday to create a new merch design for @PrunaAI! The design will be launched soon, stay tuned 👀 If you’re at the @dotaiconf today, come say hi. And don’t miss @Bertrand_Charp's talk at 2:50 PM, see you there! 🚀

John · Oct 7, 2025 · 11:27 PM UTC

Rayan retweeted

John

@johnrachwan

Oct 7

HunyuanImage 3.0 on @replicate passes this test with flying colors. Try it here: replicate.com/tencent/hunyua…

fofr

@fofrAI

Sep 10

Seedream 4 passed this prompt test pretty well, only missing the tear in the gold backdrop.

Replicate · Oct 7, 2025 · 7:25 PM UTC

Rayan retweeted

Replicate

@replicate

Oct 7

You can now run Hunyuan-Image-3.0 on Replicate: replicate.com/tencent/hunyua… The #1 model in Text-to-Image LMArena. Create images in under 30 seconds. Another @PrunaAI collab to deliver the fastest speeds possible.

Rayan · Oct 6, 2025 · 9:03 PM UTC

Rayan

@AskRayan

Oct 6

i love this model

Replicate

@replicate

Oct 6

You can now run Ovi by @character_ai. Simultaneously generate both audio & video in under 40 seconds: replicate.com/character-ai/o… Another @PrunaAI collab to deliver the fastest speeds possible.

Tiezhen WANG · Sep 28, 2025 · 12:45 AM UTC

Rayan retweeted

Tiezhen WANG

@Xianbao_QIAN

Sep 28

Replying to @johnrachwan

Me too. Do some extreme optimization please! @PrunaAI @wavespeed_ai !

Alex Volkov (Thursd/AI) · Sep 26, 2025 · 5:00 PM UTC

Rayan retweeted

Alex Volkov (Thursd/AI)

@altryne

Sep 26

This replacement on @replicate was actually like the highest res, but it kept me as black and white (like source image) which stands out. The jacket is CRISP tho!

Misha Feinstein · Aug 25, 2025 · 5:44 PM UTC

Rayan retweeted

Misha Feinstein

@MishaFein

Aug 25

Learn how our team cut 𝐀𝐈 𝐢𝐧𝐟𝐞𝐫𝐞𝐧𝐜𝐞 𝐭𝐢𝐦𝐞 𝐛𝐲 𝟓𝟎% in just 2 days, without compromising quality. Some torch.compile + collaboration with our partners at @PrunaAI is the secret sauce. We've captured the full story in our latest blog ➡️ go.bria.ai/4fJX6Le

How We Cut AI Inference Time by 50% in Just 2 Days - Without Losing Quality

A real-world deep dive into optimizing GenAI inference latency with Torch Compile, LoRA hot-swapping, and Pruna AI.

blog.bria.ai

Luis Catacora · Aug 24, 2025 · 3:24 AM UTC

Rayan retweeted

Luis Catacora

@lucatac0

Aug 24

Animate Pokémon cards with Wan2.2

Pruna AI · Aug 22, 2025 · 11:00 AM UTC

Rayan retweeted

Pruna AI

@PrunaAI

Aug 22

🔥 Master model compression, quantization, and deployment optimization with our comprehensive learning path! Key Features: • Deep-dive lecture slides on LLM architectures & compression • 7+ hands-on coding exercises with real benchmarks • CPU vs GPU performance comparisons • Advanced quantization techniques Perfect for: ML engineers, researchers, and students ready to optimize AI models for production. Hardware: Works on modest GPUs (1080Ti+) or Google Colab 👉 Access materials now: github.com/PrunaAI/ai-effici… ⭐️ Don’t forget to star our repo!

GitHub - PrunaAI/ai-efficiency-courses: Courses on building, compressing, evaluating, and deploying...

Courses on building, compressing, evaluating, and deploying efficient AI models. - PrunaAI/ai-efficiency-courses

github.com

Rayan · Aug 21, 2025 · 12:07 PM UTC

Rayan

@AskRayan

Aug 21

Efficient ML papers + hot takes 👇 :)

Bertrand Charpentier

@Bertrand_Charp

Aug 21

𝗜’𝗺 𝘀𝘁𝗮𝗿𝘁𝗶𝗻𝗴 𝗮 𝗻𝗲𝘄 𝘀𝗲𝗿𝗶𝗲𝘀: 𝗣𝗮𝗽𝗲𝗿 𝗛𝗶𝗴𝗵𝗹𝗶𝗴𝗵𝘁𝘀! I’ll share research papers about (efficient) AI I’ve read including their code when available. 𝗣𝗮𝗽𝗲𝗿 𝗛𝗶𝗴𝗵𝗹𝗶𝗴𝗵𝘁 #𝟬𝟭: “Olica: Efficient Structured Pruning of Large Language Models without Retraining” | 𝗔𝘂𝘁𝗵𝗼𝗿𝘀: Jiujun He, Huazhen Lin | 𝗩𝗲𝗻𝘂𝗲: @icmlconf 2025 This paper explores how to efficiently prune large language models 𝘄𝗶𝘁𝗵𝗼𝘂𝘁 𝗳𝘂𝗹𝗹 𝗿𝗲𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴. Key contributions: • 𝗦𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲𝗱 𝗽𝗿𝘂𝗻𝗶𝗻𝗴 𝘄𝗶𝘁𝗵 𝗳𝗮𝘀𝘁 𝗣𝗖𝗔: They analyze matrix products in the multi-head attention (MHA) and remove neurons with the lowest importance scores. • 𝗙𝗮𝘀𝘁 𝗿𝗲𝗰𝗮𝗹𝗶𝗯𝗿𝗮𝘁𝗶𝗼𝗻 𝘄𝗶𝘁𝗵 𝗳𝗲𝘄 𝗱𝗮𝘁𝗮 𝘀𝗮𝗺𝗽𝗹𝗲𝘀: Residual errors are compensated via a low-rank decomposition, requiring only a small calibration dataset. • 𝗣𝗿𝗮𝗰𝘁𝗶𝗰𝗮𝗹 𝗲𝗳𝗳𝗶𝗰𝗶𝗲𝗻𝗰𝘆: An LLM can be pruned and recalibrated in ~7 minutes using just hundreds of samples, resulting in smaller and faster models. 𝗣𝗲𝗿𝘀𝗼𝗻𝗮𝗹 𝗛𝗼𝘁 𝘁𝗮𝗸𝗲: This does not feel like zero retraining because of the recalibration phase but it makes retraining very efficient!
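A rough illustration of the structured-pruning idea described in the tweet above: score each hidden neuron, drop the lowest-scoring ones, and shrink both adjacent weight matrices to match. This is a minimal sketch under stated assumptions, not the Olica method. It swaps in a simple L2-norm importance score for the paper's PCA-based analysis and leaves out the low-rank recalibration step entirely. The function name `prune_neurons` and the toy weights are hypothetical.

```python
import math


def prune_neurons(w_in, w_out, keep):
    """Structurally prune the hidden layer of y = w_out @ act(w_in @ x).

    w_in:  hidden x d_in matrix as a list of rows (one row per hidden neuron)
    w_out: d_out x hidden matrix as a list of rows (one column per hidden neuron)
    keep:  number of hidden neurons to retain

    Importance is the product of the L2 norms of a neuron's incoming row
    and outgoing column. This score is an assumption for illustration only.
    """
    hidden = len(w_in)
    scores = []
    for j in range(hidden):
        in_norm = math.sqrt(sum(v * v for v in w_in[j]))
        out_norm = math.sqrt(sum(row[j] ** 2 for row in w_out))
        scores.append(in_norm * out_norm)

    # Pick the `keep` highest-scoring neurons, then restore original order.
    ranked = sorted(range(hidden), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:keep])

    # Remove whole rows of w_in and the matching columns of w_out.
    pruned_in = [w_in[j] for j in kept]
    pruned_out = [[row[j] for j in kept] for row in w_out]
    return pruned_in, pruned_out, kept


# Toy weights (hypothetical): 3 hidden neurons, 2 inputs, 1 output.
toy_w_in = [[1.0, 0.0], [0.01, 0.02], [0.5, 0.5]]
toy_w_out = [[1.0, 0.1, 2.0]]
toy_in, toy_out, toy_kept = prune_neurons(toy_w_in, toy_w_out, keep=2)
```

Because whole neurons are removed, the pruned matrices are genuinely smaller, which is what makes structured pruning translate into faster inference without sparse kernels. A method like the one in the tweet would follow this with a calibration pass to compensate for the residual error.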

Pruna AI · Aug 20, 2025 · 3:00 PM UTC

Rayan retweeted

Pruna AI

@PrunaAI

Aug 20

🚀 Major Partnership Drop: Pruna AI × @Wiroai! 🔬 We've sped up Wiro's video generation models with our optimization technology! Our compression delivers blazing-fast inference speeds without sacrificing quality. Performance Gain: • Wan 2.2-TI2V-5B – Text-to-Video-Fast → 16 sec wiro.ai/tools/Wan-AI/Wan2.2-… • Wan 2.2-TI2V-5B – Image-to-Video-Fast → 12 sec wiro.ai/tools/Wan-AI/Wan2.2-… • Wan 2.2-T2V-A14B-Fast → 22 sec wiro.ai/tools/Wan-AI/Wan2.2-… • Wan 2.2-I2V-A14B-Fast → 10 sec wiro.ai/tools/Wan-AI/Wan2.2-… 🏎️ This collaboration showcases how intelligent model compression can transform creative workflows - users now get enterprise-grade video generation with consumer-friendly speed! 👉 See our optimization magic in action at wiro.ai/ ⭐️ Interested in optimizing your models? Let's talk!

Wan2.2 Image To Video - Wiro AI

A very fast and cheap PrunaAI optimized version of Wan 2.2 A14B image-to-video

wiro.ai

Rayan · Aug 19, 2025 · 12:04 PM UTC

Rayan

@AskRayan

Aug 19

New efficient ml meetup let's go !

Pruna AI

@PrunaAI

Aug 19

🚀 We offer tech croissants and pizzas in post-holiday Paris | Wednesday 17 September 🔥 Join AI developers, ML engineers, and researchers for an evening of cutting-edge optimization insights and practical implementation strategies! This meetup is built for practitioners scaling LLMs, fine-tuning diffusion models, and deploying real-time inference pipelines. 📅 What's Coming: 🎯 7:00 PM: Welcome & Networking 🧠 7:30-8:15 PM: Technical Sessions 🍕 8:15-9:00 PM: Deep Tech Conversations & Food • James Martin (CEO, Better Tech): "AI's Impacts: Solving Real Problems" • Jules Belveze (Software Engineer, Dust): "Early Exiting: Under-Hyped Compression Methods" 🎤 Speaking Opportunity: Submit your talk proposal - we want YOUR breakthrough insights! 📍 Location: Paris 9th (exact venue shared with approved attendees) ⚡️ Limited spots available: lu.ma/i24gy980

fofr · Aug 18, 2025 · 6:58 PM UTC

Rayan retweeted

fofr

@fofrAI

Aug 18

Qwen Image Edit. 3 seconds, $0.03. replicate.com/qwen/qwen-imag… @PrunaAI do the impossible. > Make the text 3D and floating on a city street

Replicate

@replicate

Aug 18

The much anticipated image editing model from Qwen is now on Replicate replicate.com/qwen/qwen-imag… Edit images in just 3 seconds, for $0.03 per image. We've worked with Pruna to deliver you the fastest way to use Qwen Image Edit.

Rayan · Aug 18, 2025 · 7:52 PM UTC

Rayan retweeted

Rayan

@AskRayan

Aug 18

Replying to @Alibaba_Qwen

who wants an über fast version of the model? Available with our inference partners :)

Replicate

@replicate

Aug 18

Pruna AI · Aug 18, 2025 · 7:37 PM UTC

Rayan retweeted

Pruna AI

@PrunaAI

Aug 18

We are really happy to collaborate with @replicate to bring you a 3s per generation Qwen-Image-Edit!

Replicate

@replicate

Aug 18

Pruna AI · Aug 18, 2025 · 11:00 AM UTC

Rayan retweeted

Pruna AI

@PrunaAI

Aug 18

🌊 We didn’t just ride the wave, we shredded it and made it faster 🤘 With Wan 2.2 Juiced, we’ve launched the fastest, most affordable version of Wan 2.2 Video on Replicate, optimized with Pruna’s compression research. ➤ 100x cheaper than Veo3 at $0.05 per video ➤ Around 2x faster than the original Wan 2.2 👉 Ready to experience lightning-fast video generation? Try our endpoints on @replicate: replicate.com/wan-video/wan-… - replicate.com/wan-video/wan-…

Alex Genovese · Aug 17, 2025 · 8:47 PM UTC

Rayan retweeted

Alex Genovese @alexgenovese

Aug 17

Just published a @Docker using @FastAPI and @PrunaAI that lets you download and compile @bfl_ml #Flux, @StablesLabs Stable Diffusion, and many more to come!

John · Aug 15, 2025 · 5:25 PM UTC

Rayan retweeted

John

@johnrachwan

Aug 15

🚀 Big update to qwen-image on @replicate! → replicate.com/qwen/qwen-imag… ⚡ 3x faster: 3s per gen 🎨 New: img2img pipelines 🪄 Fast LoRAs: 4s per gen Check out this cool realistic image (right) I generated with a realism LoRA (input in replies) compared to base (left)

Pruna AI · Aug 13, 2025 · 11:00 AM UTC

Rayan retweeted

Pruna AI

@PrunaAI

Aug 13

🔥 Wan 2.2-Image is the cool new kid on the block. Wan 2.2 Image generates one 2-megapixel image for just $0.02 • 𝟮.𝟰𝘅 𝗳𝗮𝘀𝘁𝗲𝗿 𝘁𝗵𝗮𝗻 𝗦𝗲𝗲𝗱𝗗𝗿𝗲𝗮𝗺 • 𝟭.𝟴𝘅 𝗳𝗮𝘀𝘁𝗲𝗿 𝘁𝗵𝗮𝗻 𝗙𝗹𝘂𝘅-𝟭.𝟭 𝗣𝗿𝗼 • 𝟭.𝟭𝘅 𝗳𝗮𝘀𝘁𝗲𝗿 𝘁𝗵𝗮𝗻 𝘁𝗵𝗲 𝗮𝗹𝗿𝗲𝗮𝗱𝘆 𝗳𝗮𝘀𝘁 𝗪𝗮𝗻 𝟮.𝟭 𝗜𝗺𝗮𝗴𝗲. Try it on @replicate: replicate.com/prunaai/wan-2.… In short, have fun, make the most of the model, and help us improve it when needed!

Pruna AI · Aug 13, 2025 · 3:01 PM UTC

Rayan retweeted

Pruna AI

@PrunaAI

Aug 13

🚀 v0.2.8 Released - Image Generation Metrics Powerhouse! 🔥 Major Feature Drop: • ARNIQA - Advanced image quality assessment • CLIPIQA - Perceptual quality metrics • Sharpness Detection - Crystal-clear evaluation tools Now you can benchmark your compressed models with professional-grade evaluation tools. ⚡ 𝗣𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲 𝗕𝗼𝗼𝘀𝘁: supercharged our CI pipeline with parallel test execution - faster merges, quicker releases! 🐛 𝗤𝘂𝗮𝗹𝗶𝘁𝘆 𝗜𝗺𝗽𝗿𝗼𝘃𝗲𝗺𝗲𝗻𝘁𝘀: • Fixed device state handling for memory metrics • Resolved transformers compatibility issues • Enhanced documentation efficiency • Streamlined deprecated code cleanup 👉 𝗥𝗲𝗮𝗱𝘆 𝘁𝗼 𝗲𝘃𝗮𝗹𝘂𝗮𝘁𝗲 𝘆𝗼𝘂𝗿 𝗺𝗼𝗱𝗲𝗹 𝗰𝗼𝗺𝗽𝗿𝗲𝘀𝘀𝗶𝗼𝗻 𝗿𝗲𝘀𝘂𝗹𝘁𝘀? github.com/PrunaAI/pruna/rel… ⭐ Star the repo if these metrics boost your AI workflow!
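To give a feel for what a sharpness metric measures, here is a minimal sketch of one common proxy: the variance of the image Laplacian. This is an assumption chosen for illustration. The release notes above do not say how Pruna computes sharpness, and this is not Pruna's API. The function name `laplacian_variance` and the toy images are hypothetical.

```python
def laplacian_variance(image):
    """Return the variance of the 4-neighbour Laplacian over interior pixels.

    image: 2D list of grayscale intensities, at least 3x3.
    Larger values mean more high-frequency detail, i.e. a sharper image.
    """
    height, width = len(image), len(image[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")

    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Discrete Laplacian: sum of the four neighbours minus 4x the centre.
            lap = (image[y - 1][x] + image[y + 1][x]
                   + image[y][x - 1] + image[y][x + 1]
                   - 4 * image[y][x])
            responses.append(lap)

    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


# Toy images (hypothetical): a flat grey patch and a hard-edged checkerboard.
flat = [[128] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]
```

A compressed model whose outputs score much lower than the base model's on a measure like this may be losing fine detail, which is the kind of regression a sharpness metric is meant to flag.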