Posts

2026.MAR.14

No More Code Reviews

Philip Su:

And — you heard it here first — we’ll one day be scared, positively petrified, to use any mission-critical software known to have allowed human interference in its codebase.

Very provocative. Put this way, it does evoke the feeling that we could very well be heading into this future.

2026.MAR.09

The Deal Is So Good

Mo Bitar:

What we do is because the deal is so damn good, we change ourselves to make that deal acceptable.

And what I've figured out now is that I'm unwilling to change myself to make that deal acceptable.

I could feel the emotions as I watched the video. Well worth the time.

2026.MAR.08

End of Productivity Theater

Murat Demirbas:

I remember the early 2010s as the golden age of productivity hacking. Lifehacker, 37signals, and their ilk were everywhere, and it felt like everyone was working on jury-rigging color-coded Moleskine task-trackers and web apps into the perfect Getting Things Done system.

So recently I found myself wondering: what happened to all that excitement? Did I just outgrow the productivity movement, or did the movement itself lose steam?

I was very much in the audience for the productivity theatre. I still am to an extent, even if the stage has lost most of its oomph. A good, short read.

2026.FEB.28

The Third Era of AI Software Development

Michael Truell, CEO of Cursor:

When we started building Cursor a few years ago, most code was written one keystroke at a time. Tab autocomplete changed that and opened the first era of AI-assisted coding. Then agents arrived, and developers shifted to directing agents through synchronous prompt-and-response loops. That was the second era.

Now a third era is arriving. It is defined by agents that can tackle larger tasks independently, over longer timescales, with less human direction. As a result, Cursor is no longer primarily about writing code. It is about helping developers build the factory that creates their software.

Thirty-five percent of the PRs we merge internally at Cursor are now created by agents operating autonomously in cloud VMs.

Agent Orchestration and Agent Swarms are a couple of the names folks are giving this idea. Steve Yegge predicted several weeks ago (a very long time in the world of AI) that this would be the next frontier of agentic engineering.

I remain sceptical, though. I'm not saying I don't trust that 35% of the PRs at Cursor are being opened this way; it is very believable, given how good the frontier models are now. But not all PRs are equal, and I wager these 35% are relatively simple bugs and features. Or that there is a lot more work being done to iterate on the PRs once they are raised.

My argument isn't that this isn't useful work. On the contrary, background agents taking such issues off engineers' hands is invaluable, as engineers can then focus on higher-leverage work. But I don't believe the logical extension of this is that "the vast majority of development work" will be done this way in a year.

2026.FEB.27

Two Beliefs About Coding Agents

Drew Breunig:

I'm lucky enough to talk to a range of developers and teams, spanning a variety of company sizes and a broad array of skill sets. From these conversations, two beliefs have emerged and solidified about coding agents and their (current) impact on coding.

Drew makes two very astute observations, both of which I endorse. The first one in particular is under-appreciated:

Most talented developers do not appreciate the impact of the intuitive knowledge they bring to their coding agent.

Coding agents amplify the skills of the engineers who wield them; they are not magic beans that'll let an amateur cook up a compiler.

The second observation should be obvious to anyone who has built software products, but somehow the current mania is making people ignore it:

Most work people are sharing are incredible personal tools, but they are not capital-P products.

2026.FEB.18

Codex CLI vs Claude Code on Autonomy

nilenso:

I spent some time studying the system prompts of coding agent harnesses like Codex CLI and Claude Code. These prompts reveal the priorities, values, and scars of their products. They're only a few pages each and worth reading in full, especially if you use them every day. This approach to understanding such products is more grounded than the vibe-based takes you often see in feeds.

While there are many similarities and differences between them, one of the most commonly perceived differences between Claude Code and Codex CLI is autonomy, and in this post I'll share what I observed. We tend to perceive autonomous behaviour as long-running, independent, or requiring less supervision and guidance. Reading the system prompts, it becomes apparent that the products make very different, and very intentional choices.

Very interesting comparison. But I don't believe the difference in behaviour is primarily, or even mostly, driven by the system prompts. The difference is far more ingrained, most likely RL'd in during post-training.

Why do I say this? I've been using both models in the Pi coding agent with its default system prompt¹, which is both really small and the same for all models. And even in Pi, this difference in behaviour comes across clearly.²

Footnotes

  1. Pi allows us to replace the entire system prompt by placing a markdown file at ~/.pi/agent/SYSTEM.md

  2. I feel that both models behave better in Pi than in their respective canonical harnesses, but this is a very subjective opinion.
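The mechanism in footnote 1 amounts to dropping a markdown file at a known path. A minimal sketch, assuming only the `~/.pi/agent/SYSTEM.md` path from the footnote; the prompt text itself is purely illustrative:

```shell
# Replace Pi's entire system prompt (path per footnote 1; contents are a made-up example)
mkdir -p ~/.pi/agent
cat > ~/.pi/agent/SYSTEM.md <<'EOF'
You are a concise coding agent. Prefer minimal diffs and ask before destructive actions.
EOF
```

Delete the file to fall back to Pi's default prompt.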

2026.FEB.16

SaaS Isn't Dead. It's Worse Than That.

Michael Bloch:

I'm more bullish on AI than I've ever been. And that's exactly why I'm bearish on most software companies. Not because their customers will leave, but because their next thirty competitors just got a lot easier to build.

I've seen/heard a bunch of different people quip exactly this. This is one of the crispest articulations. Rings ominous to me.

2026.FEB.15

Cognitive Debt

From How Generative and Agentic AI Shift Concern from Technical Debt to Cognitive Debt (via Simon Willison):

Cognitive debt, a term gaining traction recently, instead communicates the notion that the debt compounded from going fast lives in the brains of the developers and affects their lived experiences and abilities to "go fast" or to make changes. Even if AI agents produce code that could be easy to understand, the humans involved may have simply lost the plot and may not understand what the program is supposed to do, how their intentions were implemented, or how to possibly change it.

I hadn't come across this term before. It is a useful one to add to our collective vocabulary. I suppose that in just a couple of years we'll all be talking about this phenomenon like we talk about technical debt now.

I haven't personally felt this way yet; maybe that means I'm not fully embracing and giving in to the agents. But I can feel the urge to go there.

I bet that one of the best ways to avoid getting into cognitive debt is to continue to be the bottleneck.

2026.FEB.14

The Final Bottleneck

Armin Ronacher:

I too am the bottleneck now. But you know what? Two years ago, I too was the bottleneck. I was the bottleneck all along. The machine did not really change that. And for as long as I carry responsibilities and am accountable, this will remain true. If we manage to push accountability upwards, it might change, but so far, how that would happen is not clear.

I too am the bottleneck. And I'm glad I am. When I stop being the bottleneck, I'm no longer involved at all. And if I'm not involved, it doesn't matter to me.

A very good and thought-provoking read.