Blogmark

the risk of AI side quests

via jbranchaud@gmail.com

https://bsky.app/profile/carnage4life.bsky.social/post/3mfytrcpux22q
LLM AI-assisted Coding

I've been in meetings where I've been asked to imagine a near-future in which members of the team who aren't currently producing code (e.g. project managers, C-suite, support team) are able to prompt an LLM black box with feature requests. Each of those requests will be processed in the background and eventually produce a preview environment for the prompter to look at and a PR to hand off to a software developer.

One of the assumptions baked into the idea of this workflow is that there are all these well-defined, high-priority issues just sitting in the project management software waiting to be worked. Maybe that is the case in some orgs, but in my experience, most of the teams and projects I've worked on don't have this. The backlog is a place where "nice-to-have" improvements and half-baked ideas collect dust and lose proximity to the state of the software system.

there’s a risk of AI side quests distracting from doubling down on the main one.

My hard-earned intuition for what works well with various LLMs and coding agents, my general expertise in software engineering, and my knowledge of the specific codebase(s) are what allow me to get strong results from LLM tooling.

I suspect in a lot of cases I'd be tossing out the initial "engineer-out-of-the-loop" attempt and re-prompting from scratch, all while trying to keep tabs on the main quest work I have in progress.

Andrej Karpathy wrote a post recently that gets at why I think "asking an LLM to build that feature from the backlog" is not as straightforward as it seems:

It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges.


I'd like to coin the term "AI-assisted Backlog Resurrection", but I don't think it's going to catch on.