Yes.
To use agents well, you need to be fluent both with abstraction (focusing on the interface, not the implementation) and with uncertainty (doing work when you cannot be 100% sure of your information).
But this is the opposite of what many folks enjoy about programming!
Yet these dichotomies are not absolute.
People who don’t trust the abstraction of an interface and think they want to see all the implementation don’t really want to see all of it. They don’t study the machine code from their compiler, the circuit layout of their processor, or the quantum mechanics underlying the silicon.
In fact, it’s all interface. But we find the interface we enjoy and declare that to be the natural boundary.
With uncertainty, it’s a bit different.
How could you ever be sure something was correct?
You could inspect it! But what does that really buy you, unless you are infallible? Merely confidence. I make too many errors in simple arithmetic to believe that my careful inspection of a piece of code gives me an absolute guarantee of its quality.
You could use a type system! This is good. Type systems are completely reliable, but only for catching type errors.
You could use unit tests! Yes, that helps too, but a finite set of unit tests doesn’t guarantee anything either.
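To make that concrete, here’s a toy Python sketch (the function and test are hypothetical): the type checker is satisfied and the unit test passes, yet the implementation is still wrong.

```python
# Hypothetical example: the types check and a handwritten unit test passes,
# but the implementation is still incorrect.

def average(xs: list[float]) -> float:
    # Bug: should divide by len(xs), not by 2.
    return sum(xs) / 2

def test_average() -> None:
    # Happens to pass, because the only case we wrote has exactly two elements.
    assert average([1.0, 3.0]) == 2.0
```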
My point is, the situation of “I have this software artifact and I can’t be sure it is correct” is not new. It is completely and utterly ordinary. Consequently, it requires relatively ordinary engineering skills to use agentic tools effectively. But it does require skills.
One possibility I’d like to explore is deeper use of property-based testing.
This is where you define the property you wish to test as a general declared invariant, rather than handwriting a finite set of test cases.
If the problem today is that we can generate an implementation cheaply, but verifying it by inspection is so expensive that it is now the bottleneck, then maybe the solution is to find a way to do /scalable verification/, where you can just add more compute to get a deeper level of verification.
So for instance, if you define your tests as properties and you want to verify a generated solution without manually inspecting the code, you could use property-based testing to generate as many test cases conforming to those properties as you like, and in that way reach an arbitrary level of confidence.
In other words, if you can’t formally prove the artifact is correct, you can still pour in compute to get within the desired epsilon of full coverage.
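As a rough sketch of what I mean (assuming Python with the Hypothesis library, and with `generated_sort` standing in for an agent-produced artifact we haven’t inspected), the knob you turn is simply the number of generated examples:

```python
from collections import Counter

from hypothesis import given, settings, strategies as st

def generated_sort(xs):
    # Stand-in for an implementation an agent generated for us.
    return sorted(xs)

@settings(max_examples=10_000)  # more compute -> more cases -> more confidence
@given(st.lists(st.integers()))
def test_sort_properties(xs):
    out = generated_sort(xs)
    # Property 1: the output is non-decreasing.
    assert all(a <= b for a, b in zip(out, out[1:]))
    # Property 2: the output is a permutation of the input.
    assert Counter(out) == Counter(xs)
```

Cranking up max_examples isn’t a proof, but it lets you dial confidence up or down to match the stakes.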
This is just one idea. There are many obvious ones waiting to be tried. Exciting times!
Another utterly, utterly obvious point is that the optimal verification regime depends on (1) the COST of verification and (2) the VALUE of the thing being verified.
Say you’re building a cryptography library, or the autopilot that lands the helicopter you’re flying in. Do you just vibe it out? Write it in Python and wait to see if you get a runtime error? No, of course not. You sweat every detail by hand, bring in experts to double-check, and bring in as much automated tooling as you can to triple-check the work: type systems, unit tests, fuzzers, detail-oriented coworkers to do code review, etc.
That’s because those artifacts are expensive to verify and expensive if they fail.
But if you need code to generate a matplotlib chart? Just generate it! Dare I say, vibecode it! It’s cheap to verify because you can visually inspect the chart. And it’s cheap to fail, because if the chart comes out wrong it literally costs you 30 seconds and you can just generate a better one. So just vibe it out.
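For scale, the kind of throwaway artifact I’m talking about is something like this hypothetical snippet, where the entire verification step is looking at the picture:

```python
# A throwaway chart: if it looks wrong, regenerate it and move on.
import matplotlib.pyplot as plt

xs = list(range(10))
plt.plot(xs, [x ** 2 for x in xs], marker="o")
plt.title("Quick sanity-check plot")
plt.show()
```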
Yes you still need to use your brain! But use it for what it’s good for, and where it is necessary. And I suggest the first good use of our brains should be a little honest meta self-reflection about what those cases are.
> The biggest difference is really just that the latter group is making an explicit choice to design their engineering workflows to actually make agents effective
I guess this is getting to be an obvious point now?
We’re in a window right now where there’s a huge advantage if you’re a startup or a team that takes an AI agent-centric approach to workflows.
Just in coding, we see an incredible spread in productivity gains between two seemingly only slightly different sets of practices. You’ll talk to some teams that say they’re getting a 20-30% lift from AI, and others that are getting 2-3X or more.
The biggest difference is really just that the latter group is making an explicit choice to design their engineering workflows to actually make agents effective, instead of just assuming it will happen organically. That means focusing on better prompting, spec writing, reviewing code, orchestrating agents, regularly testing different models, giving agents much larger tasks to execute, and so on. All of this is very different from what AI coding looked like just a year ago.
The same will happen in the rest of knowledge work as more and more tools emerge to support these practices. We’re going to see this play out in nearly every major vertical and line of business.
Eventually the gains will be too hard for anyone to ignore, so we’ll see more standardization, but for now it’s an advantage for the teams that adopt these approaches earlier.
