Performance is important, and if we can’t measure something, we can’t make it fast. Recently, I’ve had my eye on the ActionDispatch::Static middleware in Rails. This middleware gets put at the front of your stack when you set config.serve_static_assets = true in your Rails app, and it has to check every incoming request to decide whether it should serve a file from disk or pass the request on to the rest of the stack. To make that decision it hits the disk, basically doing this on every request:

Dir["#{full_path}#{ext}"].detect { |m| File.file?(m) }

My gut said there had to be a better way, but how do we measure a single Rack middleware’s performance in isolation? I couldn’t find any really good posts on it, so I improvised using benchmark-ips and Rack::MockRequest to simulate traffic.

Bootstrap your Middleware

First you need to load the file where your middleware is defined:

require 'rack/file'
require 'action_controller'

# Use `load` with an absolute path (rather than `require`) so we pick up the
# local copy of the file we're editing; adjust the path to your own checkout.
load '/Users/schneems/Documents/projects/rails/actionpack/lib/action_dispatch/middleware/static.rb'

Now we need to load our test capabilities:

require 'rack/test'

Now we can instantiate a new object that we can call in isolation:

noop       = Proc.new {[200, {}, ["hello"]]}
middleware = ActionDispatch::Static.new(noop, "/my_rails_app/public")
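
The no-op proc is itself a minimal Rack app: it ignores the env it’s given and returns the standard [status, headers, body] triple, which is why the middleware can wrap it just like the rest of your application. A quick sanity check (the empty hash stands in for a Rack env):

status, headers, body = noop.call({})
status # => 200
body   # => ["hello"]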

Then we wrap the middleware up in a mock request:

request = Rack::MockRequest.new(middleware)

I wanted to compare the speed of the middleware with the speed of the proc that it hits, so I made a no-op mock request as well:

noop_request = Rack::MockRequest.new(noop)

Now, if we want to exercise a request against our single middleware, we can call:

request.get("/path_i_want_to_hit")

Run your Benchmarks

To do the comparison you’ll need the benchmark-ips gem installed:

$ gem install benchmark-ips

The gem works by running each block of code for a set amount of time and recording how many iterations per second it can achieve. The higher the number, the faster the code.

We set up our benchmark:

require 'benchmark/ips'

Benchmark.ips do |x|
  # Warm each block up for 5 seconds, then measure for 5 seconds
  x.config(time: 5, warmup: 5)
  x.report("With ActionDispatch::Static") { request.get("/")  }
  x.report("With noop")                   { noop_request.get("/") }
  # Print a comparison of the reports at the end of the run
  x.compare!
end

You should get an output similar to this:

Calculating -------------------------------------
With ActionDispatch::Static
                          1525 i/100ms
           With noop      2667 i/100ms
-------------------------------------------------
With ActionDispatch::Static
                        15891.2 (±11.6%) i/s -      79300 in   5.056266s
           With noop    28660.9 (±11.7%) i/s -     141351 in   5.009789s

Higher iterations are better, so the blank no-op ran (28660.9 - 15891.2)/15891.2 * 100 # => 80.36, roughly 80% more iterations per second than the default ActionDispatch::Static. This is expected, but it only gives us a baseline. We still need to test our new code.
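
Another way to read those numbers is to flip them into time per request. This is just back-of-the-envelope arithmetic on the i/s figures above:

1.0 / 15891.2 * 1_000_000 # => ~62.9 microseconds per request through ActionDispatch::Static
1.0 / 28660.9 * 1_000_000 # => ~34.9 microseconds per request through the no-op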

Comparing Benchmarks

I ran the tests a few times to ensure I wasn’t getting any flukes. Then I incorporated some optimizations into the middleware so it wouldn’t hit the disk on every request, and ran the benchmark again:

Calculating -------------------------------------
Modified ActionDispatch::Static
                          2330 i/100ms
           With noop      2422 i/100ms
-------------------------------------------------
Modified ActionDispatch::Static
                        24490.9 (±7.9%) i/s -     123490 in   5.081158s
           With noop    26870.1 (±8.7%) i/s -     135632 in   5.093423s

Here you can see that our no-op code ran (26870.1 - 24490.9)/24490.9 * 100 # => 9.71, roughly 10% faster than the modified ActionDispatch::Static. The closer these two numbers are, the better, since the no-op is the fastest possible case.

When we graph the results, you can see that my slight optimizations got us pretty close to the optimal state. The tick marks on each bar show the standard deviation (the ±) to make sure that the numbers are somewhat sane.

If we do the math, we can see that my new middleware is (24490.9 - 15891.2)/15891.2 * 100 # => 54.11, roughly 54% faster than the original ActionDispatch::Static when the request isn’t for a file on disk.

Keep it in Context

Make sure not to get tunnel vision when you’re benchmarking. In this case my optimizations made all non-asset requests faster, but since they required more logic, they actually make asset requests (i.e. request.get("assets/application-a71b3024f80aea3181c09774ca17e712.js")) slightly slower. Luckily, I did some benchmarking there as well and found the difference wasn’t really measurable. Either way, don’t just benchmark your happy path; make sure to benchmark all common conditions.
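
As a rough sketch, benchmarking that file-serving path only takes a second report that requests a file which actually exists in your public directory (the fingerprinted filename below is just the example from above; substitute a file from your own app):

Benchmark.ips do |x|
  x.config(time: 5, warmup: 5)
  # A request that misses the static files and falls through to the app
  x.report("miss: no file")    { request.get("/") }
  # A request that hits a real file under public/ -- use one that exists
  x.report("hit: static file") { request.get("/assets/application-a71b3024f80aea3181c09774ca17e712.js") }
  x.compare!
end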

Right now you might also be thinking “Holy cow 54% speed improvement in Rails, zOMG!”, but you have to remember that this is a middleware tested in isolation. The improvement is much smaller when compared against the whole Rails application stack, which I had to benchmark as well (and that’s a whole different blog post). The end result came to a ~2.6% overall speedup with the new middleware. Not bad. Here’s the PR to Rails: https://github.com/rails/rails/pull/16464.

Go forth and benchmark your Rack middleware! If you have any feedback or know of a different/better way to do this, find me on the internet @schneems.