Neel Nanda · Nov 9, 2025 · 4:21 PM UTC

Neel Nanda

Behrooz Azarkhalili retweeted

Neel Nanda

@NeelNanda5

The Story of Mech Interp

This is a talk I gave to my MATS scholars, with a stylised history of the field of mechanistic interpretability, as I see it (with a focus on the areas I've ...

youtube.com

ℏεsam · Nov 8, 2025 · 11:55 PM UTC

Behrooz Azarkhalili retweeted

ℏεsam

@Hesamation

18h

bro created an entire 16-hour free youtube playlist on how to build a DeepSeek model from scratch. it goes over the papers, explains the theory, and implements the code. Syllabus: → attention mechanism fully explained → multi-head latent attention → grouped query attention → everything about positional encodings → mixture of experts (MoE) just start today with a laptop and motivation. playlist: piped.video/playlist?list=PL…

431

4,738

alex fazio · Nov 8, 2025 · 2:19 PM UTC

Behrooz Azarkhalili retweeted

alex fazio

@alxfazio

Nov 8

i built a claude skill that lets claude code reverse-engineer itself it uses mitmproxy to inspect system prompts and tool definitions, debug slash commands and sub-agents, and more batteries included: guided setup, scripts, and AskUserQuestion tool for simplified usage

528

elvis · Nov 8, 2025 · 2:38 PM UTC

Behrooz Azarkhalili retweeted

elvis

@omarsar0

Nov 8

The most effective AI Agents are built on these core ideas. It's what powers Claude Code. It's referred to as the Claude Agent SDK Loop, which is an agent framework to build all kinds of AI agents. (bookmark it) The loop involves three steps: Gathering Context: Use subagents (parallelize them for task efficiency when possible), compact/maintain context, and leverage agentic/semantic search for retrieving relevant context for the AI agent. Hybrid search approaches work really well for domains like agentic coding. Taking Action: Leverage tools, prebuilt MCP servers, bash/scripts (Skills have made it a lot easier), and generate code to take action and retrieve important feedback/context for the AI agent. Turns out you can also enhance MCP and token usage through code execution and routing, similar to how LLM routing increases efficiency in AI Agents. Verifying Output: You can define rules to verify outputs, enable visual feedback (this becomes increasingly important in multimodal problems), and consider LLM-as-a-Judge to verify quality based on fuzzy rules. Some problems will require visual cues and other forms of input to perform well. Don't overcomplicate the workflow (eg, use computer-using agents when a simple Skill with clever scripts will do). This is a clean, flexible, and solid framework for how to build and work with AI agents in all kinds of domains.

176

1,375

Trending GitHub Repos · Nov 9, 2025 · 1:38 PM UTC

Behrooz Azarkhalili retweeted

Trending GitHub Repos @bot_for_devs

3. tinker-cookbook Post-training with Tinker #Python github.com/thinking-machines…

GitHub - thinking-machines-lab/tinker-cookbook: Post-training with Tinker

Post-training with Tinker. Contribute to thinking-machines-lab/tinker-cookbook development by creating an account on GitHub.

github.com

Visual Studio Code · Nov 7, 2025 · 2:30 AM UTC

Behrooz Azarkhalili retweeted

Visual Studio Code

@code

Nov 7

GitHub Copilot Orchestra Pattern... A multi-agent orchestration system for structured, test-driven software development with AI assistance. Conductor -> Plan 🔁 ( implement -> review -> commit )

395

Paul Klein IV · Nov 6, 2025 · 7:34 PM UTC

Behrooz Azarkhalili retweeted

Paul Klein IV

@pk_iv

Nov 6

Add this skill in Claude Code today: /plugin marketplace add browserbase/agent-browse /plugin install browser-automation@browser-tools github.com/browserbase/agent…

GitHub - browserbase/agent-browse: Claude Agent SDK with a web browsing tool

Claude Agent SDK with a web browsing tool. Contribute to browserbase/agent-browse development by creating an account on GitHub.

github.com

253

charmaine · Nov 7, 2025 · 8:55 PM UTC

Behrooz Azarkhalili retweeted

charmaine

@charmaine_klee

Nov 7

my little /statusline

115

alex fazio · Nov 7, 2025 · 3:16 PM UTC

Behrooz Azarkhalili retweeted

alex fazio

@alxfazio

Nov 7

mcps are changing turns out designing mcps to load every tool definition into the model prompt was a bad idea anthropic’s nov 4 blog post suggests a new pattern treat each mcp server like a normal code library, e.g. typescript modules or files, and let the agent write and run small programs that do two things: discover only what is needed, list a servers directory to see what exists, open just the specific tool files, import only those functions process data locally, call mcp tools from code, then filter, join, and aggregate inside a sandboxed runner so only the small final bits go back to the model doing this dramatically cuts tokens anthropic shows a typical case dropping from 150k tokens down to ~2k (98.7% savings) below a viz showing before/after

1,058

Brady Long · Nov 7, 2025 · 11:14 AM UTC

Behrooz Azarkhalili retweeted

Brady Long

@thisguyknowsai

Nov 7

THIS IS CRAZY.... I've been using Claude and its a monster when it comes to tasks automation and acting like a real assistant. Here are 10 prompts I use in Claude to automate my boring tasks: (Comment "Send" and I'll DM you an automation file too for free)

200

446

Thariq · Nov 7, 2025 · 11:16 PM UTC

Behrooz Azarkhalili retweeted

Thariq

@trq212

Nov 7

Claude Code Weekly Roundup This week we rolled out a new promo for free credits on Claude Code on the web along with beautiful diffs and support for skills. In the CLI we improved fuzzy search, did a slew of bug fixes & improved triggering for interactive questions.

403

nerdai · Nov 7, 2025 · 2:51 PM UTC

Behrooz Azarkhalili retweeted

nerdai

@_nerdai_

Nov 7

Replying to @_nerdai_ @ManningBooks @ManningMEAP @ollama

Sharing the MEAP link to this book as it's no longer Manning's Deal of the Day: hubs.la/Q03Q0h4p0

Build a Multi-Agent System (from Scratch) - Val Andrei Fajardo

Build AI agent systems that coordinate, delegate, and get real work done. Agents turn LLMs into autonomous tools capable of executing on tasks and plans. Multi-agent systems use protocols like MCP...

manning.com

Akshay 🚀 · Nov 7, 2025 · 3:08 PM UTC

Behrooz Azarkhalili retweeted

Akshay 🚀

@akshay_pachaar

Nov 7

A simple trick cuts your LLM costs by 50%! Just stop using JSON and use this instead: TOON (Token-Oriented Object Notation) slashes your LLM token usage in half while keeping data perfectly readable. Here's why it works: TOON's sweet spot: uniform arrays with consistent fields per row. It merges YAML's indentation and CSV's tabular structure, optimized for minimal tokens. Look at the example below. JSON: { "𝘂𝘀𝗲𝗿𝘀": [ { "𝗶𝗱": 𝟭, "𝗻𝗮𝗺𝗲": "𝗔𝗹𝗶𝗰𝗲", "𝗿𝗼𝗹𝗲": "𝗮𝗱𝗺𝗶𝗻" }, { "𝗶𝗱": 𝟮, "𝗻𝗮𝗺𝗲": "𝗕𝗼𝗯", "𝗿𝗼𝗹𝗲": "𝘂𝘀𝗲𝗿" } ] } Toon: 𝘂𝘀𝗲𝗿𝘀[𝟮]{𝗶𝗱,𝗻𝗮𝗺𝗲,𝗿𝗼𝗹𝗲}: 𝟭,𝗔𝗹𝗶𝗰𝗲,𝗮𝗱𝗺𝗶𝗻 𝟮,𝗕𝗼𝗯,𝘂𝘀𝗲𝗿 It's obvious how few tokens are being used to represent the same information! To summarise, here are the key features: 💸 30–60% fewer tokens than JSON 🔄 Borrows the best from YAML & CSV 🤿 Built-in validation with explicit lengths & fields 🍱 Minimal syntax (no redundant braces, brackets, etc.) IMPORTANT!! That said, for deeply nested or non-uniform data, JSON might be more efficient. In the next tweet, I've shared some benchmark results demonstrating the effectiveness of this technique in reducing the number of tokens and improving retrieval accuracy with popular LLM providers. Where do you think this could be effective in your existing workflows? Find the relevant links in the next tweet!

159

1,116

Docker · Nov 7, 2025 · 3:10 PM UTC

Behrooz Azarkhalili retweeted

Docker

@Docker

Nov 7

Multimodal LLMs aren’t magic -just a projector layer, a translator, turning images/audio into tokens. The latest AI Newsletter shows how to run them with Docker Model Runner (plus demos): bit.ly/4owXosC #Docker #ModelRunner #AI #LLM #Multimodal

How to Use Multimodal AI Models With Docker Model Runner

By Ignacio López Luna This article was initially published on Docker Blog in November 2025. One of the most exciting advances in modern AI is multimodal support, the ability for models to understand...

linkedin.com

Daniel San · Nov 7, 2025 · 4:01 PM UTC

Behrooz Azarkhalili retweeted

Daniel San

@dani_avila7

Nov 7

Claude Code just added a new environment variable yesterday 😨 CLAUDE_CODE_EXIT_AFTER_STOP_DELAY - auto-exits SDK mode after idle duration. Useful for CI/CD and automated workflows. Wrote up a complete reference covering all 40+ variables (auth, models, performance, cloud integration) Link: medium.com/@dan.avila7/claud…

Daniel San

@dani_avila7

Nov 7

Running Claude Code isn’t just about opening the terminal and typing the perfect prompt. (Bookmark this post for later) There are environment variables that can influence how Claude Code runs, connects to models, and even how its interface behaves. 🧵 Here’s a thread with all of them organized by category:

259

Nicolas Camara · Nov 6, 2025 · 5:45 PM UTC

Behrooz Azarkhalili retweeted

Nicolas Camara

@nickscamara_

Nov 6

You can now get any website branding via 1 API call Use it to power your onboarding flows, create apps based on your design language and even clone other websites to inspire your new creations Feed the result to an LLM and watch the magic happen

Firecrawl

@firecrawl_dev

Nov 6

Introducing the Branding format 🎨 You can now extract complete brand DNA from any website including color schemes, logos, frameworks and more in one API call Perfect for coding agents to clone or match existing site aesthetics. Try it out now on the playground or API today 👇

Utopic e/λ · Nov 7, 2025 · 5:11 PM UTC

Behrooz Azarkhalili retweeted

Utopic e/λ @UtopicDev

Nov 7

@DSPyOSS signatures + TOON 🔥🚀

Vicente Reig Rincón de Arellano @highwayvaquero

Nov 7

DSPy Signatures anchor your app in a world where everything changes—prompting techniques, model families, even serialization formats. vicentereig.github.io/dspy.r… Pair TOON with @boundaryML's BAML and BOOM! Drop tokens in half.

clem 🤗 · Nov 7, 2025 · 4:05 PM UTC

Behrooz Azarkhalili retweeted

clem 🤗

@ClementDelangue

Nov 7

Unsurprisingly, Kimi K2 Thinking is already number one trending on HF. The AI frontier is open-source!

151

1,630

Thariq · Nov 5, 2025 · 5:21 PM UTC

Behrooz Azarkhalili retweeted

Thariq

@trq212

Nov 5

You can now add prompt-based stop hooks to Claude Code. Prompt hooks are great for encouraging Claude to work for longer periods of time, doing clean up work like removing extra files, writing tests or keeping track of what work is being done.

506

TrackioApp · Nov 5, 2025 · 8:09 PM UTC

Behrooz Azarkhalili retweeted

TrackioApp @TrackioApp

Nov 5

Trackio 0.8.0 is out! You can now log tables as part your experiments with an intuitive syntax, let's GOOOO 🎯