technologist | distributed systems | reactive systems | edge computing | ml | ai | agentic computing | multi-cloud | k8s | embedded systems | iot

Washington D.C.
Joined June 2009
Blaize D'souza ❤️ retweeted
Reproducing the AWS Outage Race Condition with a Model Checker
wyounas.github.io/aws/concur…
We'll use a model checker to see how such a race could happen. Formal verification can't prevent every failure, but it helps us think more clearly about correctness and reason about subtle bugs.
Blaize D'souza ❤️ retweeted
Google just dropped a new 50-page doc on building agents that actually work in the real world. It's a fast introduction to the theory you need to know about agents.

It covers:
→ core agent architecture
→ LLMs (the brain behind agents)
→ tools (the hands of agents)
→ multi-agent orchestration
→ how to deploy agents
→ evaluation and metrics
→ self-evolving learning agents, and how agents evolve and learn, with AlphaEvolve as a worked example

You can download and read it from Kaggle: kaggle.com/whitepaper-introd…
Blaize D'souza ❤️ retweeted
RTOS Part 1 - What is a Real-Time Operating System (RTOS)? @ShawnHymel explains! ➡️ dky.bz/3z5iB4i
Blaize D'souza ❤️ retweeted
🌟 7 Steps to Make Your OSS Project AI-Ready 🤖👨‍💻
📢 Open-source maintainers, here's how to onboard AI into your project, per persona 👇

For Users
🔹 Add llms.txt – help LLMs find your docs
🔹 Add chat to your docs – instant Q&A via @kapa_ai @inkeep
🔹 Expose APIs via MCP – let AI agents use your project

For Contributors
🔹 Add AGENTS.md – teach AI tools how to build/test
🔹 Define AI use rules – update CONTRIBUTING.md (great example: @openinfradev 👏)

For Maintainers
🔹 AI code reviews – first line of defence: @coderabbitai
🔹 Automate triage & issue mgmt – try @dosu_ai

💡 If unsure where to start: add AGENTS.md + a clear AI policy first. Then add chat to your docs!

Full guide ↓ 🙏♻️ #OSS #OpenSource #AI open.substack.com/pub/genera…
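For the llms.txt step above, a minimal sketch of what such a file can look like, loosely following the llmstxt.org proposal (an H1 name, a blockquote summary, then sections of links) — the project name and URLs here are hypothetical placeholders:

```markdown
# AcmeCache

> AcmeCache is an open-source in-memory cache. This file points LLMs
> and AI agents at the docs most worth reading.

## Docs

- [Quickstart](https://acmecache.example/docs/quickstart.md): install and first queries
- [API reference](https://acmecache.example/docs/api.md): client commands and options

## Optional

- [Design notes](https://acmecache.example/docs/design.md): internals, for deep dives
```

The file lives at the site root (`/llms.txt`) so crawlers and agents can find it without guessing at your docs layout.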
Blaize D'souza ❤️ retweeted
Vibe Coding != AI-Assisted Engineering by @addyosmani addyo.substack.com/p/vibe-co…
Blaize D'souza ❤️ retweeted
Compression techniques I'd study if I wanted small but smart LLMs:
1. Quantization
2. Distillation
3. Low-Rank Adaptation
4. Weight Sharing
5. Sparse Matrices
6. Layer Dropping
7. Knowledge Transfer
8. Embedding Compression
9. Mixed Sparsity
10. Progressive Shrinking
11. Structured Pruning
12. AutoML Compression

Follow @asmah2107 to update your game on LLM optimisations.
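Technique 1 on the list is easy to sketch. Below is a minimal, illustrative take on symmetric per-tensor int8 quantization (not any particular library's implementation): every float weight is mapped onto the int8 range via a single scale factor, and reconstructed by multiplying back.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: map floats onto [-127, 127]."""
    scale = float(np.max(np.abs(w))) / 127.0   # one scale for the whole tensor
    q = np.round(w / scale).astype(np.int8)    # 4x smaller than float32
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

w = np.array([0.02, -1.3, 0.7, 1.3], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# per-weight rounding error is bounded by scale / 2
```

Storage drops from 32 bits to 8 bits per weight plus one float for the scale; real deployments usually quantize per-channel and calibrate activations too.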
Blaize D'souza ❤️ retweeted
Let's talk about Redis. But first, make sure you're following @techNmak

➸ Redis wasn't born in a lab – it was born from frustration. 😉 How?
➸ Back in 2009, Salvatore Sanfilippo needed a faster way to analyze website traffic for his startup. Existing databases just couldn't keep up.
➸ So he built his own solution: Redis, a super-fast data store that lives in your computer's memory. It was a game-changer, and it's still transforming how we handle data today.

📌 𝐋𝐞𝐭'𝐬 𝐮𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝 𝐢𝐧 𝐝𝐞𝐭𝐚𝐢𝐥.

Redis => REmote DIctionary Server
◾ initially designed to be a remote data structure server accessible over the network
◾ open-source, in-memory data store
◾ functions mainly as a key-value database
◾ but goes beyond that by offering a variety of data structures: strings, lists, sets, sorted sets, hashes, etc.
◾ primarily resides in RAM, providing incredibly fast read and write operations
◾ offers options for persistence to disk, ensuring data durability

𝐑𝐞𝐝𝐢𝐬 𝐢𝐬 𝐟𝐚𝐬𝐭. 𝐖𝐡𝐲?
Unlike traditional databases that primarily store data on disk, Redis keeps its entire dataset in memory (RAM). This eliminates the latency of disk seeks and reads, allowing very fast data access.

📌 𝐑𝐄𝐒𝐏 𝐏𝐫𝐨𝐭𝐨𝐜𝐨𝐥 => REdis Serialization Protocol
◾ Redis uses its own binary-safe protocol, RESP, for communication
- designed for simplicity and efficiency
- less overhead than text-based protocols like HTTP

📌 𝐄𝐯𝐞𝐧𝐭 𝐋𝐨𝐨𝐩 𝐚𝐧𝐝 𝐈/𝐎 𝐌𝐮𝐥𝐭𝐢𝐩𝐥𝐞𝐱𝐢𝐧𝐠
◾ The heart of Redis is a single-threaded event loop.
◾ It continuously monitors file descriptors (sockets) for client connections and incoming commands.
◾ This event-driven architecture, combined with I/O multiplexing (epoll, kqueue, or select), lets Redis handle thousands of concurrent clients efficiently without the overhead of multiple threads.
◾ Non-blocking I/O: the main thread doesn't wait for I/O operations to complete, so it remains responsive and can quickly process other commands.
◾ If the main thread encounters a time-consuming operation (like accessing the disk or network), it doesn't halt and wait – it delegates the task to the operating system and registers a callback to be executed once the operation completes.

📌 𝐌𝐞𝐦𝐨𝐫𝐲 𝐅𝐫𝐚𝐠𝐦𝐞𝐧𝐭𝐚𝐭𝐢𝐨𝐧 𝐚𝐧𝐝 𝐣𝐞𝐦𝐚𝐥𝐥𝐨𝐜
◾ Redis uses the jemalloc memory allocator to manage memory efficiently.
◾ jemalloc helps reduce memory fragmentation, which occurs when objects are allocated and freed repeatedly.

📌 𝐑𝐞𝐝𝐢𝐬 𝐨𝐟𝐟𝐞𝐫𝐬 𝐯𝐚𝐫𝐢𝐨𝐮𝐬 𝐝𝐞𝐩𝐥𝐨𝐲𝐦𝐞𝐧𝐭 𝐦𝐨𝐝𝐞𝐬.
◾ Standalone – a single Redis instance
◾ Cluster – a distributed implementation for scalability and high availability
◾ Sentinel – high availability for standalone or replicated Redis instances
◾ Replication – master-replica setup for data redundancy and read scalability
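The RESP framing mentioned above is simple enough to show in a few lines. This is an illustrative encoder for the client-to-server side only (a command is sent as an array of bulk strings); it is a sketch, not a full RESP implementation with reply parsing.

```python
def resp_encode(*args):
    """Encode a Redis command as a RESP array of bulk strings.
    Length prefixes make the framing binary-safe: no escaping needed."""
    out = [b"*%d\r\n" % len(args)]            # array header: element count
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # bulk string: $<len>
    return b"".join(out)

wire = resp_encode("GET", "key")
# b'*2\r\n$3\r\nGET\r\n$3\r\nkey\r\n'
```

Because every element is length-prefixed, the server can parse a command with a handful of reads and no scanning for delimiters inside payloads – part of why the protocol stays cheap at high connection counts.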
Blaize D'souza ❤️ retweeted
There is a difference between a Distributed System and a Distributed Execution.

Distributed System: a set of components that communicate by exchanging messages.

Distributed Execution: an execution that is spread over the components, each performing a part of the execution.
Blaize D'souza ❤️ retweeted
[new blog post] Taurus Database: How to be Fast, Available, and Frugal in the Cloud muratbuffalo.blogspot.com/20…
Blaize D'souza ❤️ retweeted
The best tool to reason about Distributed Executions is the one we already know: Recursive Abstraction a.k.a Functions If we collapse local function calls, we are left with the local parts of a distributed execution and can reason about its mapping onto a distributed system
Blaize D'souza ❤️ retweeted
Basic primitives and first principles.
On restart, return the memoized value, skip the function call.
That's @resonatehqio's Durable Executions: just functions and promises.
No workflows, no activities, no event logs.

Another foundational principle: Substitution.
Substitution refers to the idea that we can replace an expression with its value: f(x) = v.
Memoization turns Substitution into a recovery tool: by caching v, we can transparently substitute the function call with the cached value.
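The memoization-as-recovery idea can be sketched in a few lines. This is my own illustrative sketch, not Resonate's actual API; the file-backed memo table (`memo.json`) is a stand-in for a durable promise store.

```python
import json, os

MEMO_PATH = "memo.json"   # stand-in for a durable promise store

def durable(fn):
    """Run fn at most once per argument tuple; afterwards (including after
    a process restart) substitute the persisted value for the call."""
    def wrapper(*args):
        memo = json.load(open(MEMO_PATH)) if os.path.exists(MEMO_PATH) else {}
        key = f"{fn.__name__}:{args!r}"
        if key not in memo:                  # first execution: run and persist
            memo[key] = fn(*args)
            with open(MEMO_PATH, "w") as f:
                json.dump(memo, f)
        return memo[key]                     # substitution: f(x) = v
    return wrapper

calls = []

@durable
def charge(order_id):
    calls.append(order_id)                   # side effect happens at most once
    return f"receipt-{order_id}"
```

Calling `charge(42)` twice – or again in a fresh process after a crash – performs the charge once; every later call returns the memoized receipt instead of re-executing.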
Blaize D'souza ❤️ retweeted
This Stanford paper explains exactly how AI agents do human work. AI agents now complete human-like workflows across 83% of tasks, working 88% faster and at 90–96% lower cost. However, their output often lacks reliability as data fabrication and tool misuse still occur frequently. The study finds that agents perform well in structured, programmable work but struggle in tasks requiring human judgment and context. The key takeaway is to let AI handle routine operations while humans focus on creativity and oversight. Do you trust your AI agents to work for you?
Blaize D'souza ❤️ retweeted
A simple trick cuts your LLM costs by 50%! Just stop using JSON and use this instead:

TOON (Token-Oriented Object Notation) slashes your LLM token usage in half while keeping data perfectly readable.

Here's why it works: TOON's sweet spot is uniform arrays with consistent fields per row. It merges YAML's indentation and CSV's tabular structure, optimized for minimal tokens.

Look at the example below.

JSON:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON:
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

It's obvious how few tokens are needed to represent the same information!

To summarise, here are the key features:
💸 30–60% fewer tokens than JSON
🔄 Borrows the best from YAML & CSV
🤿 Built-in validation via explicit lengths & fields
🍱 Minimal syntax (no redundant braces, brackets, etc.)

IMPORTANT!! That said, for deeply nested or non-uniform data, JSON might be more efficient.

In the next tweet, I've shared benchmark results demonstrating how this technique reduces token counts and improves retrieval accuracy with popular LLM providers.

Where do you think this could be effective in your existing workflows? Find the relevant links in the next tweet!
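The tabular encoding above is easy to reproduce for the flat, uniform-array case. This is an illustrative encoder only – not the official TOON library – and it skips quoting/escaping of values that contain commas or newlines.

```python
import json

def toon_table(name, rows):
    """Encode a uniform list of dicts in a TOON-style tabular form:
    the header declares length and field names once, then one
    CSV-like line per row."""
    fields = list(rows[0])
    lines = [f"{name}[{len(rows)}]{{{','.join(fields)}}}:"]
    for row in rows:
        lines.append(",".join(str(row[f]) for f in fields))
    return "\n".join(lines)

users = [{"id": 1, "name": "Alice", "role": "admin"},
         {"id": 2, "name": "Bob", "role": "user"}]
print(toon_table("users", users))
# users[2]{id,name,role}:
# 1,Alice,admin
# 2,Bob,user
```

The savings come from stating the field names once in the header instead of repeating them (plus braces and quotes) in every record, which is exactly why the win shrinks for nested or non-uniform data.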
Blaize D'souza ❤️ retweeted
Cheers to NextGen Xiao YiKang from the Department of Electrical Engineering! His pioneering SiC MOSFET transient modeling and flexible commutated power semiconductor have won him the IET Postgraduate Research Award 🏆, making him Asia’s only awardee this year.
Blaize D'souza ❤️ retweeted
RAG retrieves documents. Agentic RAG decides what, why, and when. One searches. The other reasons strategically. Let me break this down in a way that'll save you months of trial and error.

》What Makes Agentic RAG Different?
Traditional RAG is like asking a librarian for one book. They hand it over, and you're done. Agentic RAG? That's a research team that:
✸ Routes queries across 7+ databases simultaneously
✸ Breaks complex questions into sub-tasks automatically
✸ Cross-references conflicting sources before answering
✸ Iterates until the answer is actually complete
The difference isn't subtle. It's transformational.

》The Numbers Don't Lie
✸ Radiology diagnostic accuracy jumped from 68% to 73% using Agentic RAG systems in recent clinical studies.
✸ That 5% isn't a minor improvement. It's lives saved every single day.
✸ Google reduced enterprise search time by 50% with Agentic RAG. What took employees hours now completes in seconds.
✸ IBM Watson cut document review time in half for healthcare and legal sectors using these systems.
These aren't pilots. They're production systems processing millions of queries.

》When You Actually Need Agentic RAG
Three clear signals:
→ Your queries require multi-step reasoning across sources
→ Data lives in multiple databases with different structures
→ Context changes dynamically and requires real-time adaptation
If you're in healthcare, finance, legal, or any regulated industry handling complex compliance, this isn't optional anymore.

》Single-Agent vs Multi-Agent Architecture
Look at the visual I've shared above. Single-agent RAG queries one database and stops. Multi-agent Agentic RAG orchestrates 4+ specialized agents checking 7+ sources simultaneously. That's not an incremental upgrade; that's architectural evolution.
✸ One agent handles routing.
✸ Another validates retrieval quality.
✸ A third cross-checks for hallucinations.
✸ A fourth synthesizes the final answer.
This is how production systems actually scale.

》The Frameworks That Matter
Since I teach LangGraph, CrewAI, PydanticAI, OpenAI Swarm, and MCP to developers worldwide, here's what works:
✸ LangGraph excels at state management across agent reasoning cycles
✸ CrewAI dominates multi-agent orchestration with role-based delegation
✸ PydanticAI ensures type-safe, validated outputs from agents
✸ OpenAI Swarm offers lightweight agent handoffs with minimal overhead
✸ MCP standardizes how agents connect to data sources
Each framework solves different patterns. Knowing which one to use separates systems that work from systems that scale.

Paper: arxiv.org/pdf/2501.09136a

⫸ Want to master AI agents in 30 days?
ꆛ Join my 𝗛𝗮𝗻𝗱𝘀-𝗼𝗻 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝟱-𝗶𝗻-𝟭 𝗧𝗿𝗮𝗶𝗻𝗶𝗻𝗴, trusted by 1,500+ builders worldwide!
➠ 9 Real-World Projects
➠ 5 frameworks: MCP · LangGraph · PydanticAI · CrewAI · Swarm
➠ 100% Hands-on
✔ Basic Python is all you need.
👉 𝗘𝗻𝗿𝗼𝗹𝗹 𝗡𝗢𝗪 (𝟱𝟲% 𝗢𝗙𝗙): maryammiradi.com/ai-agents-m…
Blaize D'souza ❤️ retweeted
Here's a common misconception about RAG!

Most people think RAG works like this: index a document → retrieve that same document. But indexing ≠ retrieval. What you index doesn't have to be what you feed the LLM. Once you understand this, you can build RAG systems that actually work.

Here are 4 indexing strategies that separate good RAG from great RAG:

1) Chunk Indexing
↳ This is the standard approach. Split documents into chunks, embed them, store them in a vector database, and retrieve the closest matches.
↳ Simple and effective, but large or noisy chunks will hurt your precision.

2) Sub-chunk Indexing
↳ Break your chunks into smaller sub-chunks for indexing, but retrieve the full chunk for context.
↳ This is powerful when a single section covers multiple concepts. You get better query matching without losing the surrounding context your LLM needs.

3) Query Indexing
↳ Instead of indexing raw text, generate hypothetical questions the chunk could answer, and index those questions instead.
↳ User queries naturally align better with questions than with raw document text. This closes the semantic gap between what users ask and what you've stored.
↳ Perfect for QA systems.

4) Summary Indexing
↳ Use an LLM to summarize each chunk. Index the summary, retrieve the full chunk.
↳ This shines with dense, structured data like CSVs and tables where raw text embeddings fall flat.

The bottom line: you don't need to retrieve exactly what you indexed. Match your indexing strategy to your data, and your RAG system will perform significantly better.

What indexing strategies have worked best for you?
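Strategy 2 can be sketched without any ML stack. In this illustrative toy, word-overlap scoring stands in for embedding similarity; the point is the indirection – score against small sub-chunks, but hand the LLM the full parent chunk.

```python
def build_index(chunks, sub_size=8):
    """Index small sub-chunks, remembering which parent chunk each came from.
    Each entry is (sub-chunk word set, parent chunk index)."""
    index = []
    for ci, chunk in enumerate(chunks):
        words = chunk.split()
        for i in range(0, len(words), sub_size):
            index.append((set(w.lower() for w in words[i:i + sub_size]), ci))
    return index

def retrieve(query, index, chunks):
    """Score the query against sub-chunks, return the full parent chunk."""
    q = set(query.lower().split())
    best = max(index, key=lambda entry: len(q & entry[0]))
    return chunks[best[1]]

chunks = [
    "Redis keeps the entire dataset in memory which makes reads fast",
    "Kafka is a distributed log used for streaming pipelines",
]
idx = build_index(chunks)
# retrieve("redis in memory fast", idx, chunks) returns the full first chunk
```

Swapping the word-set overlap for cosine similarity over embeddings gives the production version of the same pattern; the index-to-parent mapping is what carries over.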
Blaize D'souza ❤️ retweeted
A loose analogy for chip design is building a house. Compute is the solar panel: you want it as dense as possible per square foot. Yet solar panels have to be mounted on a solid roof, with correct wiring and orientation, to max out their peak throughput. Then you need a breaker box, climate control, sealant, etc., to make the house habitable.