❶ Author of How to Open Source (.dev). A book to take you from coder to contributor.

❷ Creator of CodeTriage, a free service helping developers contribute to open source.

❸ Core committer to ruby/ruby.

❹ Rust Cloud Native Buildpack author .

❺ Married to Ruby, literally.

Jumping Off The Ruby Memory Cliff

12 Apr 2017

The memory use of a healthy app is like the heartbeat of a patient - regular and predictable. You should see a slow steady climb that eventually plateaus, hopefully before you hit the RAM limit on your server:

Like a patient suffering from a heart ailment, the shape of our metrics can help us determine what dire problem our app is having. Here’s one example of a problem:

Memory spikes way up. The dark purple at the bottom indicates that we are now swapping significantly. The dotted line is our memory limit, which you can see we go well over. This sharp spike would indicate a problem, however, it’s curious how the memory simply drops off and goes back down after a bit. What’s going on?

If you’re running on Heroku, you’ll likely see a cluster of H12 - Request timeout errors at the same time. What is going on?

If you’re using a web server that has multiple processes (called forks or workers), one explanation is that some request or series of requests, came in that were so expensive they locked up the child process. Maybe you accidentally coded in an infinite loop, or perhaps you’re trying to load a million records from the database at once, whatever the reason, that request needs a LOT of resources. When this happens things grind to a halt. The process starts aggressively swapping to disk and can’t run any code.

Lucky for us, when this happens, web servers like Puma have a failsafe in place. Every so often, the child process sends a “heartbeat” notification to the parent to let it know that it’s still alive and doing well. If the parent process does not get this notification for a period of time, it will reap the process (by sending it a SIGTERM) telling it to shut down, and starting a new replacement process.

That’s what’s going on here. The child process was hung and using a lot of resources. Eventually, it hits the “worker timeout”, the default of which is 60 seconds and the process then gets killed. There are a few reasons why the problem persists longer than 1 minute - there may be multiple problem processes, the child process might not shut down right away, or the server may use so many resources that the parent process is having a hard time getting CPU resources to even check for a timeout.

Another memory signature you might see looks like this:

It’s kinda like a saw tooth, or a group of sharks. In this case the memory isn’t suddenly spiking as badly as before and something is making it go back down. It still has a drop off cliff at the end. This is likely due to an app using something like Puma Worker Killer with rolling restarts. Recent versions of Ruby will free memory back to the operating system, but very conservatively. It doesn’t want to give memory back to the OS just to turn around and have to ask for more a second later. So essentially that cliff indicates that something is dying, either intentionally or otherwise.

Now you know why your app is “jumping off the memory cliff”, what can you do? You need to find the endpoint that is taking up all your resources and fix it. This is extremely tricky because any performance tools such as Scout, won’t have the resources to report back analytics while the process it is running in is stuck. This means you’re more or less flying blind. My best advice would be to start with your H12 errors and use a logging add-on that you can search through. Find what endpoints are causing H12 errors then try to reproduce the problem locally using rack-mini-profiler or derailed benchmarks. I would look at the first H12 error in the series first.

If you can’t reproduce, add more logging around that endpoint - maybe it only happens with a specific user, or at a certain time of day, etc. The idea is to get your local app as close to the conditions causing the problem as possible.

The bad news is that you’re still jumping off a cliff. The good news is that at least you know why and the approximate location of the cliff. It might take awhile to narrow down the problem and make it go away, but after that, you can hit the ground running.

Keep Reading 🚀

Cinnamon Buns & Commit Bits: A RubyConf Story

RubyConf 2026 is coming up soon! The deadline to buy a ticket and enter the VIP raffle by buying a RubyConf ticket is July 1st. That’s only 2 days from this post’s publication. This personal essay explores why a language-specific conference in the age of AI is still important and why events are still important for bringing developers together.

Read More
RubyGems Fracture Incident Report

This is a copy of the canonical report I published on RubyCentral’s site. I recommend reading there for the most updated copy. This document attempts to give closure to the Ruby community about the events that led to the incident, September 10-18, 2025, which I’ve named “RubyGems Fracture.”

Read More
How to "Sustain Heroku"

This is a personal essay (I speak for me and my views, not for my employer) about what exactly a “Sustaining Engineering Model” is, in the context of the recent Heroku announcement, and the book “The Innovator’s Dilemma,” as seen by someone who has worked at the company for the past 14 years.

Read More
Non-Violent Comments: Calling out or Calling in?

Now that programmers are at war with the robots (Gen AI) for our jobs, we need to lean into the things that they cannot do. Today, I’m going to be talking about how to be a human and communicate with other humans in the most hostile of scenarios, “in conflict (drama).”

Read More
Disallow code usage with a custom `clippy.toml`

I recently discovered that adding a clippy.toml file to the root of a Rust project gives the ability to disallow a method or a type when running cargo clippy. This has been really useful. I want to share two quick ways that I’ve used it: Enhancing std::fs calls via fs_err and protecting CWD threadsafety in tests.

Read More

Jumping Off The Ruby Memory Cliff

Subscribe to my Newsletter 😻 🤠

Keep Reading 🚀

Cinnamon Buns & Commit Bits: A RubyConf Story

RubyGems Fracture Incident Report

How to "Sustain Heroku"

Non-Violent Comments: Calling out or Calling in?

Disallow code usage with a custom `clippy.toml`