Working Code

There’s a joke I like to tell my teams. Senior engineers nod in gruff acknowledgement, and junior engineers tend to laugh nervously and nod with (I hope) enlightenment. It goes like this:

If you’ve been working on a piece of code for a while, and it works the first time you run it, junior engineers will celebrate. Senior engineers will get scared, because they don’t know where it’s broken.

Can we all agree that engineers are machines that turn coffee into bugs? The first version of any code is going to be buggy, and senior engineers know that you need to work through the code repeatedly in order to debug – and even to understand – what you’ve written.

There are so many different kinds of bugs. Off-by-ones, logical mistakes, inefficient data structures or algorithms, mistaken requirements, misunderstood parameters, non-idempotent / non-reentrant code, memory leaks, race conditions, deadlock, livelock, and on and on and on.

Modern software engineering has developed a variety of techniques to try to find or prevent bugs, including linting, static analysis, automated tests, and code reviews – but no matter how awesome your unit tests, integration tests, and end-to-end test coverage, your code has bugs. And while your code reviewers might get lucky, it’s hard to find subtle problems in code that you haven’t been thinking about deeply, digging through, and living with for a while.
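Here is a small sketch of what that looks like in practice. The function names and the test values are made up for illustration; the point is only that a plausible first draft can pass every check its author thought to write and still be wrong at a boundary.

```python
def page_count_buggy(total_items: int, per_page: int) -> int:
    # Hypothetical first draft: looks plausible and passes the spot checks below.
    return total_items // per_page + 1


def page_count(total_items: int, per_page: int) -> int:
    # Ceiling division: exact multiples no longer gain an extra page.
    return (total_items + per_page - 1) // per_page


# Spot checks a first test suite might include; both versions pass all of them.
for total, per_page, expected in [(10, 3, 4), (1, 10, 1), (25, 10, 3)]:
    assert page_count_buggy(total, per_page) == expected
    assert page_count(total, per_page) == expected

# The boundary case nobody thought to test: an exact multiple of the page size.
assert page_count_buggy(20, 10) == 3  # wrong: 20 items fit on 2 pages
assert page_count(20, 10) == 2
```

The green test run on the first draft is exactly the kind of "it works the first time" result that should make you nervous.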

None of the above should be controversial, and it should all sound pretty familiar to anyone who’s worked in the industry for even a little while, but somehow it seems to fly out the window when people start talking about AI. You give the magic box a prompt, it generates code that works the first time out, and voilà – you’re done!

Of course you know that you have to go through and validate that it’s doing what you think it should be doing. You probably went through many iterations of your prompt to get it to build what you wanted, and you probably wrote prompts to generate a bunch of unit tests, but does it really do what you want? Did you define your prompts with absolute fidelity? How can you know? Because you’re a code reviewer now. And a code reviewer never understands the code as well as the person who wrote it.
