Posts

2026.APR.29

The West Forgot How to Build. Now It's Forgetting Code

Denis Stetskov:

Five to ten years from now, we’ll need senior engineers. People who understand systems end to end, who can debug distributed failures at 2 AM, who carry institutional knowledge that exists nowhere in the codebase. Those engineers don’t exist yet because we’re not creating them. The juniors who should be learning right now are either not being hired or developing what a DoD-funded workforce study calls “AI-mediated competence.” They can prompt an AI. They can’t tell you what the AI got wrong.

Ignore the click-baity title. This is a well-written and well-argued post on how the software industry might be hurtling towards a grim future: the same situation the West’s defence industry found itself in when an unexpected war broke out between Ukraine and Russia.

2026.APR.29

The Basics

Thorsten Ball:

Here's what I consider to be the basics. I call them that not because they're easy, but because they're fundamental. The foundation on which your advanced skills and expertise rest. Multipliers and nullifiers, makers and breakers of everything you do.

They don't usually show up in technical books and yet without them a lot of brilliant effort can go to waste. I constantly have to remind myself of them, sitting on my own shoulder and wagging a finger in my face.

What a great set of obvious but seldom articulated things that every developer would do well to go through at some regular interval (because these are easy to forget, especially in this age of agentic engineering).

This is an old post that I discovered through last Sunday's Joy & Curiosity, Thorsten Ball's weekly roundup of really wonderful links (a lot of them focused on agentic engineering, given that Thorsten is building Amp, which I've heard described as the Porsche of coding agents).

2026.APR.26

You and Your Research

I had come across this old lecture from Richard Hamming before but never watched it. Then multiple people recommended it on Xitter in just the last day, including Paul Graham. Thorsten Ball (of Amp Code) recommended it in his excellent Joy & Curiosity newsletter today, and I had to watch it.

What an excellent talk. It covers a wide array of topics, all building towards an exhortation on how to be great. I watched the entire thing at 1x speed. Yes, it's that good!

2026.APR.26

Tim Cook Personified Big Tech's Maturity

Andrew Sharp:

And that's ultimately Cook's legacy, to me. He made sensible choices under the circumstances, nurturing Apple profits and its stock price at every turn. If many of those choices were ultimately predictable and unfulfilling, well, that's the game for a company at Apple's stage of the corporate life cycle.

Where Apple under Jobs was selling performance and possibility, Apple today capitalizes on our collective dependence on the iPhone ecosystem and promises superior reliability to any peers. And that's still a pretty good deal! But it's a categorically different value proposition than that of the company that was changing the way an entire generation interacted with technology.

This is the best commentary I've read in light of the announcement of Cook's retirement. Most of the other coverage has been way too positive; this is much more balanced and closer to how I feel.

2026.APR.25

The Zechner-Lopopolo Continuum

Alex Volkov:

The Zechner-Lopopolo Continuum

This is a recap of the AI Engineer Europe conference that took place in London a couple of weeks ago. But the more interesting thing is the debate that the title and the above image point to.

Mario Zechner (creator of the Pi coding agent, my preferred coding agent) talked about

  • why & how he built Pi (this summarises why I'm in love with Pi)
  • the complexities that people wielding agents have brought to OSS, and how he is tackling these with innovative solutions like OSS Vacations/Weekends
  • (critically) why we should read critical code thoroughly and generally slow down so that we don't drown in AI slop code

Ryan Lopopolo (from OpenAI) talked about some vague things like code being a liability and how he is a "token billionaire", and about how he has mandated that his team not look at the code. Maybe he talked about more things; I just couldn't sit through the entire talk.

If it's not obvious, I'm firmly at the Zechner end of the continuum.

Maybe this will change in a couple of years or even in just a few months, but in April 2026, anyone who is too far out on the Lopopolo end is taking on a lot of technical debt that they may not really be able to pay off.

And no: no amount of tests or specs is going to prevent that technical debt from building up, because the debt is not about correctness. The things that lead to this debt from agents are the same things that lead to debt buildup from humans: poor design choices, code duplication, needlessly defensive code, and many other such sins that agents can add at a pace hitherto unimaginable for humans.

The only way to prevent or tame this is for humans to read the code. Or break the problem down into small enough chunks so that agents actually follow the "don't duplicate code" and other testaments from our AGENTS.mds. Or, in other words, "human in the loop."

"But that will slow us down," I can hear some people say. Yes, slow the fuck down¹.

Footnotes

  1. We'll still be way faster than we were a year ago, so don't despair.

2026.APR.25

Why Isn't Everything Different Yet?

Dave Griffith:

So: where are we? The technology exists and is impressive. The infrastructure buildout is underway and massive. Workflows are being redesigned in early-adopter organizations, often via guesswork. We've got one (1) product area (software development agents) where we're past "early adopter" and moving onto mass-market. Legal frameworks are being written badly by people who have never used the technology, which is traditional. Business models are being discovered by trial and error, also traditional. Fortunes are being made and lost, another time-honored tradition.

The critics who say nothing has changed are measuring at the wrong resolution. The critics who say change should have been instantaneous have a broken model of how change works. The honest answer is: this is going extremely fast, it will often feel slow until suddenly it doesn't, and the people who have built understanding now will not be scrambling in three years.

Amen. Good, entertaining read.

I'm going to refer people to this when they say either that things will not change dramatically or that the dramatic change has already happened (there is so much more to come).

2026.APR.25

Coding Models Are Doing Too Much

nrehiew:

If you have used any of these tools in the past year, you have probably experienced something like this: you ask the model to fix a simple bug (perhaps a single off-by-one error, or maybe a wrong operator). The model fixes the bug but half the function has been rewritten. An extra helper function has appeared. A perfectly reasonable variable name has been renamed. New input validation has been added. And the diff is enormous.

I refer to this as the Over-Editing problem where models have the tendency to rewrite code that didn't need rewriting.

Yes! A thousand times, yes.

GPT models are especially prone to this over-editing problem. Part of this comes from writing code that is way too defensive¹, but it's not just that: they are really eager to "fix" your code even when there is no need for it.

Thankfully, GPT models are also very good at following instructions. So I have had instructions to circumvent this problem in my global AGENTS.md for a while and it helps quite a bit.

This is what the linked post also found: over-editing drops across models when they are explicitly prompted to avoid it.

This is a good post. It's not an opinion piece; it takes a scientific approach by setting up experiments and providing evidence in the form of results.
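To make "the diff is enormous" concrete, here is a rough sketch of one way to put a number on over-editing: compare how many lines the model touched against how many a minimal fix would have touched. This is an illustrative metric only, not necessarily the one used in the linked post; the function names and the toy snippets are all made up for the example.

```python
import difflib


def changed_line_count(before: str, after: str) -> int:
    """Count lines removed from `before` plus lines added in `after`."""
    a = before.splitlines()
    b = after.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += (i2 - i1) + (j2 - j1)
    return changed


def over_edit_ratio(original: str, model_fix: str, minimal_fix: str) -> float:
    """How many times larger the model's diff is than a minimal reference diff."""
    minimal = changed_line_count(original, minimal_fix)
    if minimal == 0:
        raise ValueError("minimal_fix must differ from original")
    return changed_line_count(original, model_fix) / minimal


# Hypothetical off-by-one bug and two candidate fixes.
ORIGINAL = """def last_index(items):
    return len(items)
"""

MINIMAL_FIX = """def last_index(items):
    return len(items) - 1
"""

MODEL_FIX = """def last_index(items):
    if items is None:
        raise ValueError("items is required")
    count = len(items)
    return count - 1
"""

ratio = over_edit_ratio(ORIGINAL, MODEL_FIX, MINIMAL_FIX)
print(f"over-edit ratio: {ratio}")
```

A ratio of 1.0 means the model changed exactly as much as the minimal fix did; anything well above that is a hint that it rewrote code that didn't need rewriting.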

Footnotes

  1. I've seen a couple of comments saying that GPT-5.5 has gotten better in this regard and doesn't write such defensive code anymore. I have yet to verify this.

2026.APR.25

Multi-Agents: What's Actually Working

I've largely sat out the hype around multi-agent orchestration or agent swarms because it felt too gimmicky. Heck, I've only recently started using subagents in a limited way (mostly explicitly invoked when I feel like something is parallelizable).

This blog post is not trying to hype these up. It is a measured take on how Cognition has been able to use some limited forms of this in production for Devin (background/cloud agent) and what they had to do to make it work well.

Walden Yan (Cognition):

1) The Code-Review-Loop that's so stupid it shouldn't work

You would think that making a model review its own code would not result in any useful findings. But even on PRs written by Devin, Devin Review catches an average of 2 bugs per PR, of which roughly 58% are severe (logic errors, missing edge cases, security vulnerabilities). Often the system will loop through multiple code-review cycles, finding new bugs each time (which isn't always great since it can take a while). Today, we make Devin and Devin Review natively iterate against one another, so that most bugs are already resolved by the time a human opens the PR.

This is effectively my (manual) workflow in almost every coding agent I have used for several months now. Of course, Cognition has automated this as a workflow, which makes sense in a background agent like Devin.

I wouldn't want to automate it in my manual workflow though, as I tend not to accept all the review comments from the review agent. That's why I don't use extensions such as pi-review-loop, which exist to do just that.
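For illustration, the loop described above can be sketched in a few lines. This is a minimal sketch under loud assumptions: `review`, `fix`, and `triage` are hypothetical callables standing in for a review agent, a coding agent, and the human (or policy) that decides which findings to accept. It is not how Devin, Devin Review, or pi-review-loop are actually implemented.

```python
from typing import Callable, List

Reviewer = Callable[[str], List[str]]      # code -> review findings
Fixer = Callable[[str, List[str]], str]    # code + accepted findings -> new code
Triage = Callable[[List[str]], List[str]]  # findings -> findings worth fixing


def review_loop(
    code: str,
    review: Reviewer,
    fix: Fixer,
    triage: Triage,
    max_rounds: int = 3,
) -> str:
    """Alternate review and fix until nothing accepted remains or rounds run out."""
    for _ in range(max_rounds):
        accepted = triage(review(code))
        if not accepted:
            break
        code = fix(code, accepted)
    return code


# Toy stand-ins (hypothetical); real ones would call coding and review agents.
def toy_review(code: str) -> List[str]:
    findings = []
    if "if not items" not in code:
        findings.append("bug: no guard for an empty list")
    findings.append("nit: rename items to xs")
    return findings


def toy_triage(findings: List[str]) -> List[str]:
    # Accept real bugs and ignore nits, mirroring "not all comments get accepted".
    return [f for f in findings if f.startswith("bug:")]


def toy_fix(code: str, accepted: List[str]) -> str:
    return "if not items: return None\n" + code


result = review_loop("return items[0]", toy_review, toy_fix, toy_triage)
print(result)
```

The `triage` step is the part that fully automated loops skip. Keeping it as an explicit hook is what lets a human reject findings, and `max_rounds` bounds the "it can take a while" problem mentioned in the quote.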

2) Large, expensive models are back - introducing "Smart Friend"

The actual architecture we used to achieve this was by offering the smarter/expensive model as a "smart friend" tool that the primary/smaller model could make a call out to. Basically, let the primary/smaller model decide when a situation was tricky enough to be worth consulting the smarter/expensive model.

This is akin to Amp Code's /oracle¹, but invoked automatically (by exposing it as a tool). It seems obviously beneficial if the primary model is not smart enough to tackle the problem at hand.
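As a sketch of the "smart friend as a tool" idea: the primary model either answers or asks to escalate, and only then is the expensive model consulted. Everything here is hypothetical (the `Step` type, the function names, and the toy models); it shows the control flow only and is not Cognition's or Amp's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """What the primary model returns: a final answer or a request to escalate."""
    answer: Optional[str] = None
    ask_smart_friend: Optional[str] = None


Primary = Callable[[str, Optional[str]], Step]  # (task, advice) -> Step
SmartFriend = Callable[[str], str]              # question -> advice


def run_with_smart_friend(task: str, primary: Primary, smart_friend: SmartFriend) -> str:
    step = primary(task, None)
    if step.ask_smart_friend is not None:
        # The primary model decided the task is tricky enough to escalate.
        advice = smart_friend(step.ask_smart_friend)
        step = primary(task, advice)
    if step.answer is None:
        raise RuntimeError("primary model produced no answer")
    return step.answer


# Toy stand-ins (hypothetical) for the cheap and the expensive model.
escalations: List[str] = []


def toy_primary(task: str, advice: Optional[str]) -> Step:
    if advice is not None:
        return Step(answer=f"fix applied using advice: {advice}")
    if "deadlock" in task:
        return Step(ask_smart_friend=f"How should I approach: {task}?")
    return Step(answer=f"fix applied directly for: {task}")


def toy_smart_friend(question: str) -> str:
    escalations.append(question)
    return "acquire locks in a fixed order"


easy = run_with_smart_friend("rename a variable", toy_primary, toy_smart_friend)
hard = run_with_smart_friend("intermittent deadlock in worker pool", toy_primary, toy_smart_friend)
print(easy)
print(hard)
```

The expensive model is only paid for on the hard task, which is the whole point of leaving the escalation decision to the primary model.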

What about unstructured swarms? We think the unstructured-swarm approach, arbitrary networks of agents negotiating with each other, is mostly a distraction. The practical shape is map-reduce-and-manage: a manager splits work, children execute, the manager synthesizes and reports back. Making this type of system feel as coherent as a single agent working on a single task is at the center of some of our upcoming work in 2026.

There's a shared through-line with all of these experiments: multi-agent systems work best today when writes stay single-threaded and the additional agents contribute intelligence rather than actions. A clean-context reviewer catches bugs the coder can't see. A frontier-level smart friend catches subtleties a weaker primary misses. A manager coordinates scope across child agents without fragmenting decisions.

The open problems are all communication problems. How does a weaker model learn when to escalate? How does a child agent surface a discovery that should change its siblings' work? How do you transfer context between agents without drowning the receiver? You can get decently far with prompting, but we also expect the next generation of models, including the ones we train ourselves, to start closing these gaps.
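The map-reduce-and-manage shape from the quote can be sketched as follows: a manager splits the work, children run in parallel and return findings only, and the manager alone synthesizes the result (so writes stay single-threaded). The `split`, `child`, and `synthesize` callables and the toy task are hypothetical placeholders for agent calls, not Cognition's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def manage(
    task: str,
    split: Callable[[str], List[str]],
    child: Callable[[str], str],
    synthesize: Callable[[str, List[str]], str],
    max_workers: int = 4,
) -> str:
    """Manager splits the task, children execute in parallel, manager reports back."""
    subtasks = split(task)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the subtasks.
        findings = list(pool.map(child, subtasks))
    # Only the manager produces the final output; children never write.
    return synthesize(task, findings)


# Toy stand-ins (hypothetical): one child "reviews" each file.
def toy_split(task: str) -> List[str]:
    return [name.strip() for name in task.split(",")]


def toy_child(filename: str) -> str:
    return f"{filename}: no issues found"


def toy_synthesize(task: str, findings: List[str]) -> str:
    return "\n".join(findings)


report = manage("auth.py, billing.py, search.py", toy_split, toy_child, toy_synthesize)
print(report)
```

Note what the sketch leaves out: children cannot talk to their siblings, which is exactly the open communication problem the quote ends on.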

Footnotes

  1. Peter Steinberger has an /oracle prompt template to use in any agent for consulting GPT Pro models for such situations.

2026.APR.19

The Peril of Laziness Lost

Bryan Cantrill:

The problem is that LLMs inherently lack the virtue of laziness. Work costs nothing to an LLM. LLMs do not feel a need to optimize for their own (or anyone's) future time, and will happily dump more and more onto a layercake of garbage. Left unchecked, LLMs will make systems larger, not better—appealing to perverse vanity metrics, perhaps, but at the cost of everything that matters. As such, LLMs highlight how essential our human laziness is: our finite time forces us to develop crisp abstractions in part because we don't want to waste our (human!) time on the consequences of clunky ones. The best engineering is always borne of constraints, and the constraint of our time places limits on the cognitive load of the system that we're willing to accept. This is what drives us to make the system simpler, despite its essential complexity. As I expanded on in my talk The Complexity of Simplicity, this is a significant undertaking—and we cannot expect LLMs that do not operate under constraints of time or load to undertake it of their own volition.

So well put. Recommended reading.

2026.APR.19

Mechanical Sympathy

Vicki Boykis:

What makes good engineers good at product design is the same thing that makes them good at engineering. They feel for the boundaries of what the code and the product allows them to do and stop at those boundaries.

Another name for being able to understand and plan for affordances, either through good product intuition, or experience, or both, in the real world is mechanical sympathy.

I agree with the assertion that agentic coding tools don't have mechanical sympathy. At least not as of now; maybe future models will overcome this (but maybe not).