Sometimes when you do everything right, things still go wrong. I previously talked about how bad I am at spelling and grammar in “The Four Year Typo”, which reminded me of my first major production failure at Heroku.

Here’s the setup. We have a service that builds apps. You don’t need to know this, but it’s called “codon”. This is the service that runs the buildpacks, such as the one I currently maintain, the Heroku Ruby Buildpack. Believe it or not, when I started we had no production monitoring of build failures. If we so much as hiccup, Twitter tends to catch on fire and our support tickets come in like a tsunami, so there wasn’t a huge need. However, the faster we can find out about failures, the faster we can fix them and the fewer people are impacted. Also, when you’re deploying as many apps as we are, a one-in-a-million bug occurs a non-trivial number of times, so we really have to be on top of things. One day I made a change to a buildpack that exposed one of those one-in-a-million bugs in Bundler, but because it wasn’t a major system meltdown, we didn’t really hear anything about it. While that was a bug, it’s not the one I’m writing about.

The bug I introduced came later, as part of the remediation for that first bug. We decided we should set up error alerting and monitoring for language-specific build failures (before, they were only tracked system wide). We already had a script that aggregated failures; I just needed to modify it to log language failures as well. The reporting system worked like this: you created a hash and assigned values from a database to it:

guages = {}
guages["my.key"] = value_from_database

Later this got serialized and sent to a collection service. I refactored a bit of the logging and alerting code to DRY it up. I introduced a block that we could call with different metrics:

POST_GAUGES.call(gauges)
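You can imagine POST_GAUGES as something roughly like this. This is a minimal sketch; the endpoint, serialization, and lambda body are my assumptions for illustration, not the actual Heroku code:

require "json"
require "net/http"

# Hypothetical stand-in: serialize the metrics hash and POST it to a
# made-up collection endpoint.
POST_GAUGES = lambda do |metrics|
  uri = URI("https://metrics.example.com/gauges")
  Net::HTTP.post(uri, metrics.to_json, "Content-Type" => "application/json")
end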

Then later I added my metrics:

gauges = {}
gauges["my.new.key"] = value_from_database

If you’ve got a keen eye, you might notice what I didn’t. The original code misspelled the variable: guages instead of gauges. This code worked for ages because it was consistently misspelled all over the method. No, I didn’t write the original code.

Referencing a variable that doesn’t exist would normally have raised an error, but due to the way the code was structured, the same method was now creating hashes multiple times to send multiple batches of metrics. It used to set the same variable each time; after my change, the second call accidentally sent the same variable again, gauges, instead of guages. Devoid of any other context it kinda looked like this:

gauges = {}
gauges["my.new.key"] = value_from_database
POST_GAUGES.call(gauges) # sends my new metrics

# ...

guages = {}
guages["my.key"] = value_from_database
POST_GAUGES.call(gauges) # re-sends `gauges`; the `guages` hash above is never posted
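To see why Ruby never complained, here’s a tiny self-contained sketch; the lambda, keys, and values are made up for illustration:

post = ->(metrics) { puts metrics.inspect } # stand-in for POST_GAUGES

gauges = { "my.new.key" => 1 }
post.call(gauges)  # prints {"my.new.key"=>1}

guages = { "my.key" => 2 }
post.call(gauges)  # prints {"my.new.key"=>1} again; `gauges` is still in scope, so no error

begin
  post.call(gauagez) # a name that was never assigned does fail loudly
rescue NameError => e
  puts e.message     # undefined local variable or method `gauagez'
end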

When we deployed the code, my metrics reporting was working, but all of our build fleet started reporting zero builds. In the words of a former governor and Dancing with the Stars contestant, who somehow “runs” the Department of Energy:

“Oops”

We found the issue and fixed it up, but for a while it was a head-scratcher. This went through a code review and no one caught it. I was technically “right” when I spelled the variable correctly, but simultaneously “wrong” because it wasn’t consistent with the rest of the method.

This just goes to show that consistency can be a powerful tool. In the face of a major mistake, misspelling the variable, consistency saved the day in the original code and provided smooth operation until someone came along and broke that consistency. A tool that powerful can be used for good or for evil.

Could consistency have saved this bug from going into production too? Had I consistently broken each of the metric generation pieces into its own method, that would have prevented a gauges variable from being in scope where the original code was expecting guages. This would have raised an error and led to detection in development instead of production. While we often look at “best practices” as somewhat arcane rules, having a baseline of consistency can help us write code that is easier to read and more bug-free.
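A sketch of that refactor, using the same hypothetical names as the examples above (value_from_database stands in for the real lookup):

# Each batch of metrics gets built and posted in its own method,
# so each method has its own local variable scope.
def post_language_gauges
  gauges = {}
  gauges["my.new.key"] = value_from_database
  POST_GAUGES.call(gauges)
end

def post_build_gauges
  guages = {}
  guages["my.key"] = value_from_database
  # With its own scope, this line raises in development:
  # NameError: undefined local variable or method `gauges'
  POST_GAUGES.call(gauges)
end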

If you liked this bug journey, you might also enjoy my talk about the process of testing the Heroku Ruby Buildpack and the lessons learned from testing non-deterministic systems: watch Testing the Untestable.