One of these things...

I tend to work a lot with junior and mid-level engineers, and one of the things I notice frequently is a tendency toward inefficient systems debugging. Sure, if you are working on nothing but frontend or backend code, then understanding how to set a breakpoint and drop into the debugger is a key skill. However, that’s not what I’m really talking about. I’m talking about solving the real problem, because sometimes everything just looks right to you, and even to that second pair of eyes you brought over to help. Or better yet, how does setting that breakpoint help if what you’re doing is standing up infrastructure?

Where I see engineers going wrong is that they jump around too frequently or too randomly and “just try stuff”. Sometimes you’re going to get lucky, but we don’t need to rely on luck.

What I’ve found is that this is rooted in the habit of making assumptions about what a system is doing based on what you think it should be doing, rather than relying on the fact that a computer system is going to do exactly what you told it to do. Nine times out of ten, the reason you are having a problem is that you told it to do the wrong thing and you just can’t see where or how.

There is a popular methodology for debugging called Rubber duck debugging, which in all honesty is just a variant on the Socratic Method. Both of those techniques have their place in debugging software, but I’ve found that they work best when dealing with algorithms, talking through complex user interactions, etc.

What I am going to discuss in this post is a variation that I call “One of these things is not like the other”, which is named after the popular children’s game. I find this to be one of the most powerful techniques in a couple of scenarios developers frequently encounter these days, when we are not just writing application code but also managing infrastructure, build tools, and CI/CD pipelines.

You can get the gist of the game by watching this classic Sesame Street clip:

I’ll often find myself sending this clip over to engineers, or posting it in a Slack/Teams channel, when I can see they are starting to spiral into randomly debugging a problem. Or, after they get to know me, I can just ask them to play “one of these things” or simply start singing the song in a group meeting or at standup.

How it works

Let’s say you’ve done one of those things we all do and have found an example of (almost) exactly what you want to do on a site like Stack Overflow or on a vendor’s documentation site. You dutifully copy some code or port it into your project with a couple of key changes.

You keep going a ways but when you go to test things, you notice it just isn’t quite working the way you expected it would. To make matters worse, you can’t see the issue.

Well, instead of randomly changing code, one of the first things you should do is play “one of these things”.

Go back to the source you pulled the snippet from. Make sure that your expectations match what the example was really doing. What is different between the example and your scenario? Start eliminating differences, testing in between changes, until you’ve got it working exactly the same as the source example, even if that means it does not yet do exactly what you want.

The point of the game is to isolate differences and eliminate them until both you and the example are “on the same page”. Then you can proceed with making your adjustments and alterations.
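When the example and your version are both configuration of some kind, the comparison itself can be made mechanical. What follows is only a minimal sketch of that idea, not a prescribed tool: the `flatten` and `find_differences` helpers and the health check settings are all hypothetical, made up for illustration.

```python
_MISSING = "<missing>"


def flatten(settings, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} so they compare key by key."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def find_differences(example, mine):
    """Return {path: (example_value, my_value)} for every setting that differs."""
    flat_example, flat_mine = flatten(example), flatten(mine)
    differences = {}
    for path in sorted(flat_example.keys() | flat_mine.keys()):
        theirs = flat_example.get(path, _MISSING)
        ours = flat_mine.get(path, _MISSING)
        if theirs != ours:
            differences[path] = (theirs, ours)
    return differences


# Hypothetical settings: what the example used versus what ended up in my project.
example_config = {"health_check": {"path": "/health", "port": 80, "interval": 30}}
my_config = {"health_check": {"path": "/healthz", "port": 8080, "interval": 30}}

for path, (theirs, ours) in find_differences(example_config, my_config).items():
    print(f"{path}: example={theirs!r} mine={ours!r}")
```

Every line it prints is one thing that is not like the other. Work that list down to empty before you start customizing again.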

Make your changes slowly, deliberately, and test in between. Testing in between small, frequent changes is important because you can catch any drift along the way and know exactly where and when something broke.
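To make the "small change, then test" loop concrete, here is a rough sketch under some loud assumptions: the configuration is a plain dict, each change is a small function that mutates it, and `still_works` is a stand-in for whatever real test you have. All of these names are hypothetical.

```python
import copy


def apply_one_at_a_time(working, changes, still_works):
    """Apply each named change to a copy of the config, testing after every step.

    Returns (last_good_config, name_of_breaking_change), where the name is
    None if every change passed the test.
    """
    current = copy.deepcopy(working)
    for name, change in changes:
        candidate = copy.deepcopy(current)
        change(candidate)
        if not still_works(candidate):
            # Stop at the first failure so we know exactly which change broke it.
            return current, name
        current = candidate
    return current, None


def still_works(config):
    # Stand-in for a real test: pretend the service only answers on
    # paths that start with "/".
    return config["path"].startswith("/")


# Start from the known-good example, then layer on my changes one by one.
known_good = {"path": "/health", "port": 80, "interval": 30}
changes = [
    ("use port 8080", lambda c: c.update(port=8080)),
    ("rename health path", lambda c: c.update(path="healthz")),
    ("raise interval", lambda c: c.update(interval=60)),
]

last_good, culprit = apply_one_at_a_time(known_good, changes, still_works)
print(f"broke at: {culprit}")
print(f"last good config: {last_good}")
```

Because the loop stops at the first failing step, there is never more than one change to be suspicious of.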

I find this works very well for infrastructure. Let’s say you follow along with an AWS tutorial, but at the end your Application Load Balancer never seems to detect healthy instances. Start going front-to-back through your setup and compare it to the example along the way.

The Real Point

Whether you subscribe to the Socratic Method, Rubber Duck Debugging, or even “One of these things”, they all have one thing in common: you need to be deliberate about what you are doing. Methodically go through your system and compare it to something else that’s already working. Eliminate any assumptions you have by checking them against objective reality.