The 5 stages of debugging and what we can learn from them
Psychology teaches us that when dealing with grief we go through 5 stages: “denial”, “anger”, “bargaining”, “depression” and “acceptance”. And nothing causes more grief to software engineers than dealing with bugs. Especially other people’s bugs.
In general terms, a bug is an anomaly in a computer system that causes it to behave in unexpected ways. The term predates computers, but the most famous early case was a literal moth found stuck in a relay of Harvard's Mark II computer in 1947 (full story here).
Present-day bugs are software rather than hardware or, well… biological in nature. The fix usually requires a software engineer to dive in and patch the faulty code, although in some cases the solutions can be more exotic. Like that time NASA fixed its InSight Mars lander by instructing it to hit itself with a shovel.
It starts with a QA engineer or project manager talking to the engineer about the bug. During that conversation, the engineer goes through the 5 stages:
Denial. “It works on my machine!” This phrase is so widely used that they should teach it in universities. “I tested it myself,” the engineer continues. “Let’s have a look together!” is followed quickly by “What’s the URL again!?”, proving that the code was never actually tested, to the point that the engineer doesn’t even know where it was deployed.
Anger. Once the feature is tested and it’s clear it doesn’t work as expected, we move on to anger, the second stage. Emotions run wild and the hunt for possible culprits begins: “The specifications were not detailed well enough!” It can’t be the engineer’s fault if the requirements were wrong. Or maybe it’s a good time to blame DevOps: “The production environment is wrong! They didn’t follow the documentation.” The part that’s conveniently left out is that the only available documentation was written more than two years ago by someone who has since left the company.
Bargaining. “Come on, it’s not that bad. I’m sure that nobody actually uses that feature,” says the programmer. This statement comes bundled with a temporary onset of colorblindness with regard to the bright red of the health metrics. “I think we can just leave it as it is for now,” says the engineer, somehow unbothered by the perplexed look on the project manager’s face.
Depression. “At this rate, we’ll never get anything done,” the whining continues. “We’re always fixing shit like this!” After a not-so-brief explanation that “shit like this” is what powers the business and pays for the salaries, the free drinks, and the PlayStation in the lobby, the engineer moves to the final stage.
Acceptance. “Okay, just add a ticket and I will get on with it”. The bug is a “blocker” and needs to be addressed right away.
Of course, these conversations are fictional, but the situations they convey did happen to this “friend of mine” on more than one occasion. And in some cases, they were far less “friendly” than the ones depicted above.
Engineers don’t like to fix bugs, but almost every single one of us would jump at the opportunity to do a refactor or a full rewrite. That has to do with our training. We’re trained to write code, not to read code. Writing code is easy, writing good code is hard. Understanding code is really hard. Understanding bad code is particularly hard.
And when dealing with bugs, especially in large projects, sometimes the problem comes from elsewhere. Another part of the codebase. Probably written by another engineer or by someone who left the company. Pinpointing and isolating the problem requires hours of reading through hundreds, sometimes thousands of lines of code, some of which belong on The Daily WTF.
It’s a tedious process that nobody wants to go through. And yet, we all have to. We cannot simply rewrite the entire application from scratch every time there’s a glitch. And, even if we could, there’s no guarantee that this time it will be better. The original team or engineer that wrote the code didn’t plan for it to suck.
Bugs are a fact of (software engineering) life. Debugging as a skill is just as important as programming. So the question arises: how do we vet for it in interviews?
Over the course of my career, I participated in about 30 to 40 interviews as a candidate and conducted north of 200 as an interviewer. In some cases, I was able to decide the topics and format of the interviews. I always advocated for real-life situations: connect to some APIs, aggregate the data, display it in a certain format, create an interface that allows the users to interact with it.
Candidates were always allowed to start from scratch. They never had to understand an existing piece of code. Which raises the question: how much information does the interview yield? And how relevant is it? Is this the right candidate?
For senior roles, the expectation is that candidates can quickly understand the code they’re looking at. Not just for the purpose of bug fixing, but also to mentor the more junior members of the team. Or to properly do a code review. Experience tells me that engineers who have difficulty understanding code tend to rubber-stamp code reviews.
Is it time to start assessing how well candidates can read code, rather than just how well they write it?