The Oldest Bug In Ruby - Why Rack::Timeout Might Hose your Server

Update: There’s a great resource for dealing with timeouts in Ruby called The ultimate guide to Ruby Timeouts, via @codefolio. Also there’s some good dicussion on Reddit around the possibility of maybe using Thread.handle_interupt in gems, read the comments.

The “bug” comes up in a few contexts. The problem comes when an error is raised from within an ensure block from another source. If you don’t know how that’s possible keep reading, otherwise skip the next section.

WTF huh? How is that possible

We are mostly familiar with exceptions

raise "something bad happened"

If we absolutely need to clean up something in our code we can do it in an ensure block:

  file_1 = make_file("file1.csv")
  file_2 = make_file("file2.csv")

  # do work ...

  raise "something bad happened" if work.bad?

  return work
  clean_up file_1

  clean_up file_2

In this case, we could be writing to files that need to be deleted after every call to this method. It is guaranteed to be called when the method exits and any time an exception happens in the block.

If you don’t follow check out exceptional ruby by Avdi.

Unfortunately things can raise exception other than your own code. For example when you’re running a program and want to close it, Ruby will receive a signal by the operating system, to let it know to clean up. In the case of a SIGKILL it will raise a SignalException exception where-ever the code is in execution. This means it could happen here

# ...
  clean_up file_1
  # Exception could be raised between the two calls right here <================================
  clean_up file_2

If that happens Ruby will never execute clean_up file_2. Granted this is a contrived example and you can do things like use tmp files in a block syntax but that’s not the point. The point is that exeptions can come from outside of your code and it can happen inside of an ensure block. This means that we are never actually guaranteed to full execute an ensure block even if all of our code is “correct”.

For more information on Ruby’s signal behavior check out my post License to SIGKILL.

The other case is raising an exception from another thread, you can do this with Thread#raise. Here’s a contrived example:

require 'thread'

threads = []

threads << do
    # ...
    clean_up file_1

    clean_up file_2

sleep rand(0..2)

threads.each {|thread| thread.raise "no one expects another thread to raise an exception!" }

Okay, so you may never do this, that looks like an awful idea. But you do use it and just don’t know abut it. It turns out that’s almost exactly what’s happening with Timeout in Ruby. Why Ruby’s Timeout is dangerous

It spawns up a new thread, sleeps the amount you want and when it wakes up it raises a Timeout::Error. If you’ve ever used rack timeout, (and recently I learned about slowpoke which also adds postgresql timeouts) it uses Thread#raise to kill an entire web request running in another thread.

Most of the time this isn’t too awful. However when it goes bad it goes really bad. For example network connections such as Database connections might not be released properly. This could cause issues in your app or in your database. The exception may have happend in a place that puts the thread in an un-recoverable state. It’s not dead so the webserver you’re using thinks it can handle web requests, but maybe it’s stuck and can’t actually process those requests.

I work at Heroku and see this. When an app is getting millions of requests and some of them timeout, if something bad can go wrong it eventually will. Traditionally rubyists avoided this problem by throwing away a process and starting a new one. Killing a process is much safer than raising an exception in a thread. Unfortunately this is expensive to throw away an entire process every time there is a small timeout in your web request.

So right now people have to actively choose between not timing out requests which may cause a domino effect of web request backups, or between killing long running requests which may cause threads to be un-usable.

That’s the state of the Ruby timeout and thread raising. Basically it’s really scary but people do it anyway.

So now what?

The behavior can’t be removed, it’s useful to some. The behavior can’t be changed dramatically. However, what if we added new behavior to Ruby? I’m proposing a way to tell Ruby that we want to wait until an ensure block has finished before an exception is raised. Maybe something like

thread.safe_raise(exception: "no one expects another thread to raise an exception")

So if your code is in the middle of an ensure block

  clean_up file_1
  # You are here <========================
  clean_up file_2

Ruby waits until it is finished before raising the new error:

  clean_up file_1

  clean_up file_2

# Ensure is done, raise "no one expects another thread to raise an exception"  now <========================

The nice thing about this is that it guarantees our ensure blocks execute. The previous non-deterministic behavior is gone. What about the down sides? We could be deeply nested in ensure blocks, you would have to go all the way up the stack to see if you’re fully out of an ensure block. This seems complicated and maybe that process isn’t deterministic (thinking halting problem), but I don’t know.

The other problem is that you can do slow things in the ensure block and if you’re trying to raise a time critical exception such as shutting down a program you may actually want things to stop abruptly and not finish.

For that case, maybe add a timeout behavior to safe_raise

thread.safe_raise(timeout: 3, exception: "no one expects another thread to raise an exception")

So we wait 3 seconds for the timeout to propagate, otherwise we raise where-ever in the code we are. So now we’re back to square one in terms of non-deterministic error raising.

So if we have the same problem why do I think this would be better? Right now we have no choice but to raise an exception and pray for the best. If we had new APIs we would allow the developer more control over the behavior they desire.

Another option could be to allow a timeout handler to be registered, maybe if we know we’re in a bad state, we want to term the whole process.

thread.safe_raise(timeout: 3, timeout_handler: -> { Process.kill('SIGKILL', }, exception: "#...")

The point is that we need more control over this behavior.

Next Steps

So what do you think? Do you like it, hate it? Would you use an API like that? If I get some good responses I’ll kick the can around and submit a feature request to Ruby.

How the F does Sprockets Load an Asset?

How does an asset get compiled? It’s less of a pipeline and more of a recursive ball of, well assets. To understand the process we will, start off with an asset with no directives (no require at the top). We’ll then walk through all the steps Sprockets goes through until a usable asset is loaded into memory. For this example we will use a js.erb file to see how a “complex” file (i.e. multiple extensions) type gets compiled. All examples are with Sprockets 4 (i.e. master branch). Here’s the file:

$ cat assets/users.js.erb
var Users = {
  find: function(id) {
    var t = '<%= %>';

When we compile this asset we get:

var Users = {
  find: function(id) {
    var t = '2016-12-13 11:01:00 -0600';

This is with the simplest of sprockets setup:

@env =
@env.cache = {}

What happens first? We call


This calls the find_asset method in Sprockets::Base. The contents are deceptively simple

uri, _ = resolve(*args)
if uri

The resolve method comes from sprockets/resolve.rb and the load method comes from sprockets/load.rb. Resolve will find where the asset is on disk and give us a “uri”. We’ll skip over exactly how resolve works, its task is relatively straightforward, find an asset on disk that satisfies the requirement of resolving to a users.js file. We can go into it in detail some other time.

A “uri” in sprockets looks like this:


It has a schema with the type of thing it is (in this case a file). We can tell that it is an absolute path because after the schema file:// it starts with a slash. The full path to this file is /projects/sprockets/test/fixtures/asset/users.js.erb. Then in the query params we carry extra info, in this case we are storing the mime type, which is application/javascript. While the file itself is a .js.erb the expected result of loading (compiling) this file is to be a .js file.

Internally Sprockets mostly doesn’t care about file extensions, it really cares about mime types. It only uses file extensions to generate mime types. When you register any processors, you register via a mime type.

The body of the load method from sprockets/loader.rb is fairly complicated. It handles a few cases.

  • Asset has an :id param, which is a fully digested hash, meaning that the asset is fully resolved and we can attempt to load it from the cache. This has two outcomes
    • Asset is in cache, use it
    • Asset is not in cache, delete the :id parameter and try to load normally.
  • Asset does not have an :id param, we call fetch_asset_from_dependency_cache which returns a block. This method does a lot, it has method docs that are fairly comprehensive, go check them out for full details. Essentially it has two modes. Looking for an asset based on dependency history, or not.
    • Looking for asset based on history:
      • If all dependencies for an asset are in the cache, then we can generate an asset from the cache. Otherwise we move on.
    • Not found based on history:
      • We’ve proven at this point that the asset isn’t in cache or one or more of it’s dependencies aren’t in the cache. At this point we have to load the entire asset.

We’re going to assume a fresh cache for our example. That means that we hit the fetch_asset_from_dependency_cache method and fall back to the `“not found based on history” case so we have to load it.

Loading an unloaded asset (pipeline = nil/:default)

The bulk of work happens in the method load_from_unloaded. We’re going to start getting really technical and low level, so follow along in the code for better comprehension what I’m talking about. We first generate a “load” and a “logicial” path:

puts load_path.inspect
# => "/projects/sprockets/test/fixtures/asset"

puts logical_path.inspect
# => "users.js.erb"

There is an edge case that is handled next. In sprockets foo/index.js can be resolved to foo.js, it’s a convention in some NPM libraries. That doesn’t apply to this case. Next we generate an extname and a file_type

puts extname.inspect
# => ".js.erb"

puts file_type.inspect
# => "application/javascript+ruby"

The file_type is the mime type for our .js.erb extension. Note the +ruby which designates that this is an erb file. I think this is a Sprockets convention. This mime type will be very important.

In this case the only params we have are {:type=>"application/javascript"} so we skip over the pipeline case.

We do have a :type so we’ll run that part. The logical_path was trimmed down to remove the extension

puts logical_path.chomp(extname)
# => "users"

Now we pull an extension based off of our mime type and add it to the logical path

puts config[:mime_types][type][:extensions].first
# => ".js"

Putting these together our new logical path is:


We’ll use this later. This should match the original thing we looked for when we used @env.find_asset.

Next comes a sanity check. Either we’re working with a mime type which we’re requesting, or we’re working with a mime type that can be converted to the one we’re requesting. We check our transformers which is an internal concept to Sprockets, see guides/ for more info on building a transformer. They essentially allow you to convert one file into another. Sprockets mostly cares about mime types so we check the transformers to see if it’s possible to transfer the existing mime type into the desired mime type i.e. we want to convert application/javascript+ruby to application/javascript.

Next we grab the “processors” for our mime type. These can be transformers as mentioned earlier, or they can be processors such as DirectiveProcessor which is responsible for expanding directives such as //= require foo.js in the top of your file.

Into this processors_for method we also pass a “pipeline”. For now it is nil, which means that the :default pipeline is used.

A pipeline is registered like a transformer or a processor. They’re an internal concept. Here is what the default one looks like

register_pipeline :default do |env, type, file_type|
  # TODO: Hack for to inject source map transformer
  if (type == "application/js-sourcemap+json" && file_type != "application/js-sourcemap+json") ||
      (type == "application/css-sourcemap+json" && file_type != "application/css-sourcemap+json")
    env.default_processors_for(type, file_type)

Here if we’re requesting a sourcemap we only want to run the [SourceMapProcessor] otherwise we find the “default” processors that are valid to our type (in this case application/javascript) from our file_type (in this case application/javascript+ruby). Default processors are defined here:

def default_processors_for(type, file_type)
  bundled_processors = config[:bundle_processors][type]
  if bundled_processors.any?
    self_processors_for(type, file_type)

Either we return any “bundled” processors for the type or we return “self” processors. In our case there is a bundle processor registered Sprockets::Bundle. It was registered. In sprockets.rb:

require 'sprockets/bundle'
register_bundle_processor 'application/javascript', Bundle

Now we’re back to the loader.rb file. We have our processors array which is simply [Sprockets::Bundle]. We call build_processors_uri. This generates a string like:


This string gets added to the “dependencies”. This array is used for determining cache keys, so if a processor gets added or removed the cache key will change (I think).

Now we have to call each of our processors. First we resolve! the original filename, but with a different pipeline i.e. pipeline: :source. The resolve! method raises an error if the file cannot be found.

After we resolve the file we get a source_uri that looks like this:


Now here’s where things get complicated (I know right). We load the exact same file that is already being loaded with this new pipeline=source.

Recursive asset loading is recursive (pipeline=source)

At this point we get recursive, we call repeat everything in load_from_unloaded but with pipeline=source. The results should be the same but with a different pipeline. The :source pipeline looks like this:

register_pipeline :source do |env|

In this case the processors returned is an empty array [].

We skip over the processor section, and instead hit this:

dependencies << build_file_digest_uri(unloaded.filename)
metadata = {
  digest: file_digest(unloaded.filename),
  length: self.stat(unloaded.filename).size,
  dependencies: dependencies

The file is digested to create a “digest” and the length is added via stat. Also “dependencies” are recorded which look like this:

#<Set: {"environment-version", "environment-paths", "processors:type=application/javascript+ruby&file_type=application/javascript+ruby&pipeline=source", "file-digest:///projects/sprockets/test/fixtures/asset/users.js.erb"}>

After this we build an asset hash:

asset = {
  uri:          unloaded.uri,
  load_path:    load_path,
  filename:     unloaded.filename,
  name:         name,
  logical_path: logical_path,
  content_type: type,
  source:       source,
  metadata:     metadata,

Which is then used to generate a Sprockets::Asset and is returned by our load method.

Jumping back up the stack (pipeline=default)

Now that we have a “source” asset we can go back and finish running the processors for pipeline=default

We did all that work, just to get a digest path:

source_uri, _ = resolve!(unloaded.filename, pipeline: :source)
source_asset = load(source_uri)

source_path = source_asset.digest_path
# => "users.source.js-798a333a5596e1495e1cc4870f11c7729f168350ee5972637053f9691c8dc326.erb"

Which kinda seems insane, maybe we don’t have to need go all recursive to get this tiny piece of information, but whatevs. If there’s one thing I’ve learned from working on Sprockets, is that the code resists refactoring and most of the seemingly “clever” code is actually a very clean way of accomplishing tasks. That is to say, I’m not going to change this without a lot more research.

Now we execute the call_processors pass in our array of processors [Sprockets::Bundle] and our asset hash:

  environment:  self,
  cache:        self.cache,
  uri:          unloaded.uri,
  filename:     unloaded.filename,
  load_path:    load_path,
  source_path:  source_path,
  name:         name,
  content_type: type,
  metadata: {

If we had more than one processor this would call each of them in reverse order and merge the results before calling the next. In this case there’s only one processor. Guess it’s time to figure out what that one does.

Bundle processor (still on pipeline=default)

The bundle processor is defined in sprockets/bundle.rb. Open it up to follow along. We pull out dependencies from the hash. For now it is very simple:

#<Set: {"environment-version", "environment-paths", "processors:type=application/javascript&file_type=application/javascript+ruby"}>

The next thing we do is we resolve the file (yes, again) this time using pipeline=self

processed_uri, deps = env.resolve(input[:filename], accept: type, pipeline: :self)

puts processed_uri.inspect
# => "file:///projects/sprockets/test/fixtures/asset/users.js.erb?type=application/javascript&pipeline=self"

puts deps.inspect
# => #<Set: {"file-digest:///projects/sprockets/test/fixtures/asset/users.js.erb"}>

We merge this deps with the dependencies from earlier. The file-digest:// that was returned from the resolve method indicates that there is a dependency on the contents of the file on disk, if the contents change, the digest should change.

You ready for some more recursion? You better hold onto your butts.

The next thing that happens is we build a proc

find_required = proc { |uri| env.load(uri).metadata[:required] }

This proc takes in a uri and loads it, then returns a set of “required” files. Sprockets uses this proc to do a depth first search of our processed_uri (i.e. the pipeline=self uri). We can look at the dfs now:

def dfs(initial)
  nodes, seen =,
  stack = Array(initial).reverse

  while node = stack.pop
    if seen.include?(node)
      stack.concat(Array(yield node).reverse)


The purpose of this search is that we want to make sure to only evaluate each file once and only once. Otherwise if we had an a.js that required a b.js that required a c.js that required a.js if we didn’t keep track, then we would be stuck in an infinite loop. There is more involved in making sure infinite loops don’t happen, but that’s maybe for another post.

For the first iteration this creates an array with only our URI in it:

puts stack.inspect
# => ["file:///projects/sprockets/test/fixtures/asset/users.js.erb?type=application/javascript&pipeline=self"]

It then adds this uri to the “seen” set and puts it back on the stack. The next line is a little tricky

stack.concat(Array(yield node).reverse)

Here the node is:


So we call the block with that node, remembering our block is

find_required = proc { |uri| env.load(uri).metadata[:required] }

So our DFS method invokes this block and passes it our pipeline=self uri, which invokes our load method again.

Load recursion kicked off from within bundle (pipeline=self)

I feel like we can’t get out of this load method, here we are again. This is what our pipeline=self looks like:

register_pipeline :self do |env, type, file_type|
  env.self_processors_for(type, file_type)

This method `self_processors_for` is non-trivial:

def self_processors_for(type, file_type)
  processors = []

  processors.concat config[:postprocessors][type]
  if type != file_type && processor = config[:transformers][file_type][type]
    processors << processor
  processors.concat config[:preprocessors][file_type]

  if processors.any? || mime_type_charset_detecter(type)
    processors << FileReader


First we grab any postprocessors that are registered for application/javascript mime type. There are no postprocessors registered by default, so I don’t know why they exist, but you can register one using register_postprocessor.

Next up, we pull out a transformer for our file type. This returns us a Sprockets::ProcessorUtils::CompositeProcessor. This is a meta processor that contains possibly several transformers. It is generated via a call to register_transformer. In this case the full processor looks like this:

#<struct Sprockets::ProcessorUtils::CompositeProcessor
  # ...

It’s doing some things with source maps and you can see now we have our ERBProcessor in there as well a DirectiveProcessor.

Next up, we gather any preprocessors, of which there are none. Finally, if there are any processors in our list we add a FileReader if we detect that it is not binary. Sprockets assumes a text file if the mime type has a charset defined. This is pretty standard.

So now we have our meta CompositeProcessor as well as our FileReader processor.

Now we call each of the processors in reverse order. First up is the FileReader.

class FileReader
    env = input[:environment]
    data = env.read_file(input[:filename], input[:content_type])
    dependencies =[:metadata][:dependencies])
    dependencies += [env.build_file_digest_uri(input[:filename])]
    { data: data, dependencies: dependencies }

It takes in filename, reads that file from disk and adds to the :data key of the hash. It also adds a dependency of the file, in case there isn’t already one:


After the file is done being read from disk, next up is the CompositeProcessor. This delegates to its other processors in reverse order so these get called

#<Sprockets::DirectiveProcessor:0x007f85b1322448 @header_pattern=/\A(?:(?m:\s*)(?:(?:\/\/.*\n?)+|(?:\/\*(?m:.*?)\*\/)))+/>

First up is the ERBProcessor, it takes the input[:data] which is the contents of the file and runs it through an ERB processor. There’s a little magic in that file to detect if someone is using an ENV variable in their erb, in which case we auto add that as a dependency.

Next the DirectiveProcessor looks for any directives such as //= require foo.js of which there are none in this file. Finally we call DefaultSourceMap. This processor adds a 1-to-1 source map if one isn’t already generated. If you’re not familiar with source maps check out guides/ which has some of my notes.

Now all of our processors for pipeline=self have been run, the load call completes and now we go back to where we were in our Bundle processor for pipeline=default.

Return to Bundle for (pipeline=default)

You may remember that we were in the middle of a depth first search.

def dfs(initial)
  nodes, seen =,
  stack = Array(initial).reverse

  while node = stack.pop
    if seen.include?(node)
      stack.concat(Array(yield node).reverse)


ALL that last section happened during the yield node section of this code. The return was an array of dependencies, which are reversed and added onto the stack. In our case there are no “required” files for file:///projects/sprockets/test/fixtures/asset/users.js.erb?type=application/javascript&pipeline=self so that yield call returns an empty set`.

The only node on the stack has already been “seen” so it is added to our nodes set. This was the last thing on the stack so we return that array. Our required list looks like this:

#<Set: {"file:///projects/sprockets/test/fixtures/asset/users.js.erb?type=application/javascript&pipeline=self"}>

If we were using a required directive such as //= require foo.js then we would have more things in this set. Another concept that Sprockets has is a “stubbed” list. Gonna be totally honest, I have no idea why you would need it but it is there. From the method docs: “Allows dependency to be excluded from the asset bundle”. So there you go. To get this list we call into load AGAIN

stubbed  = Utils.dfs(env.load(processed_uri).metadata[:stubbed], &find_required)

Though there is one thing I never mentioned, not all calls to load are created equal:

Cached Environment

Something I’ve failed to mention is that all calls to an env are not created equal. There is a Sprockets::Environment and a Sprockets::CachedEnvironment. The cached environment wraps the Sprockets::Environment and caches certain calls such as load so in the above example env.load(processed_uri) is returning a cached value and not actually calling into load, that’s a relief.

It turns out that this whole time I was somewhat misleading you, we weren’t using the version of fine_asset from Sprockets::Base but rather we were using Sprockets::Environment

def find_asset(*args)

This call to cached creates a `CachedEnvironment object:

def cached

Now any duplicate calls to load (with the EXACT same url) will return a cached copy. The rest of the implementation of find_asset is from the Sprockets::Base.

The first time we hit the cache in this example was with


It is first put in the cache at:

/projects/sprockets/lib/sprockets/loader.rb:149:in `load_from_unloaded'

Note some of my line numbers might not match perfectly due to changes in source, also I’m adding in debug statements etc.

Or this line:

source_uri, _ = resolve!(unloaded.filename, pipeline: :source)
source_asset = load(source_uri) # <========== THIS LINE ===========

source_path = source_asset.digest_path

When we pull it from cache we do so in the bundle processor:

/projects/sprockets/lib/sprockets/bundle.rb:35:in `block in call'

Which corresponds to this code:

(required + stubbed).each do |uri|
  dependencies.merge(env.load(uri).metadata[:dependencies]) #< === Called from cache here

Which brings us back to the bundle processor we were looking at before:

Finish the bundle processor (pipeline=default)

We loop through our required set (which is #<Set: {"file:///projects/sprockets/test/fixtures/asset/users.js.erb?type=application/javascript&pipeline=self"}>) minus our stubbed set (which is empty).

For each of these we merge in dependencies. Our final dependencies set looks like this:

#<Set: {

We then look up “reducers” and get back a hash of keys and callable objects:

    #<Proc:0x007ffef7b74398 (lambda)>
    #<Proc:0x007ffef8027070 (lambda)>

A reducer can be registered like so:

register_bundle_metadata_reducer '*/*', :data, proc {"") }, :concat
register_bundle_metadata_reducer 'application/javascript', :data, proc {"") }, Utils.method(:concat_javascript_sources)
register_bundle_metadata_reducer '*/*', :links, :+
register_bundle_metadata_reducer '*/*', :sources, proc { [] }, :+
register_bundle_metadata_reducer '*/*', :map, proc { |input| { "version" => 3, "file" => PathUtils.split_subpath(input[:load_path], input[:filename]), "sections" => [] } }, SourceMapUtils.method(:concat_source_maps)

It acts on a key such as :data to transform or “reduce” individual keys.

If we had some “required” files do to the directive processor

assets = { |uri| env.load(uri) }

Then this last line is where they would be concatenated via our reducers:

process_bundle_reducers(input, assets, reducers).merge(dependencies: dependencies, included:

In this case our only “required” asset is from file:///projects/sprockets/test/fixtures/asset/users.js.erb?type=application/javascript&pipeline=self which is important because you’ll remember that the pipeline=self is when the FileReader and ERBProcessor were run.

Finally we can return from our original pipeline=nil/:default case since all of our pipelines have been executed. In our original call to load.

The rest of the code is just doing things like taking digests and building hashes, we’ve already covered it in a previous section.

Finally a Sprockets::Asset is generated and returned from our original @env.find_asset invocation.


2020 Hindsite

There’s a few confusing things going on here. It isn’t always clear that calls to an env are going to CachedEnvironment and its even less clear if we’re calling something that has already been cached or loading something new.

The pattern of loading files that Sprockets uses is a reactor. It stores state via pipeline=<whatever> and essentially loops with different pipeline variations until it gets its desired output. While this is very powerful, it’s also really hard to wrap your brain around. Most of the code, especially in the Bundle processor are indecipherable if you don’t know minute details about how things work inside of all of Sprockets. These two designs, the recursive-ish load reactor pattern and the CachedEnvironment are sometimes difficult to wrap your mind around. Especially this pattern of loading files creates a forking back trace, so if you’re trying to debug it’s not always immediately clear what’s going on. Debug statements are usually output several times per each method call.

The other thing that makes Sprockets hard to understand is the plugin ecosystem. Sprockets is less a library and more a framework that uses itself to build an asset processing framework. Things like transformers, preprocessors, compressors, bundle_processors, etc. make it confusing exactly where work gets done. Some of the processors are highly coupled, such as the Bundle processor and the DirectiveProcessor. Again it’s extremely powerful and makes the library very flexible but difficult to reason about.

Much of Sprockets resists refactoring. Many of the design decisions are very coupled to the implementation. I’ve spent hours trying to tease out CachedEnvironment into something else, but eventually gave up. One thing to consider if you’re prone to judging code like I am, this project is 70%+ written by one person. These design decisions are all very powerful and many times very beautiful in their simplicity. If you’re the only one that works on a project, sometimes it pays to pick a powerful abstraction over one easier to read and understand.

I’ve got some ideas on how we could tease some abstractions, but it’s a hard thing to do. We have to be backwards compatible, and bake in room for future features & growth. We also need to be performance conscious.

There are other features that I haven’t covered in this example such as how files get written to disk, and how manifest files are generated, but how an asset gets loaded is complicated enough for now. How is your life better now that you know how the “F” Sprockets loads an asset? I have no idea, but I’m sure there’s something good about it. If you enjoyed this technical deep dive check out my post where I live-blog a writing a non-trivial Rails feature. Thanks for reading!

If you liked this post (or even if you didn’t) you can subscribe to my mailing list to get updates when I post new content. I average a little less than a post a week, often fewer. The more subscribers I get, the more incentive I have to put out content consistently.

Who Called Git? An Unusual Debugging Story

I don’t usually talk about support ticket work that I do. Most tickets are so specific it’s hard to write generalized articles. When I get a type of ticket that is worth blogging about, I usually find a better place to write about it like devcenter docs, or I look for a way to push a fix to an upstream open source library. Today I got an unusual bug, and I fixed it in a fairly unusual (for me) way. Thought you might be interested.

The Issue

The customer was complaining that this output was coming out when they would run commands

$ heroku run rails console
fatal: Not a git repository (or any parent up to mount point /app)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

Originally they thought that the issue may have been coming from our heroku CLI since the issue wasn’t happening locally.

BTW, did you know you can open up a shell session to Heroku using heroku run bash. I’ve been telling people this for about 5 years and it’s still surprising to them, very useful for debugging.

Anyway, after a little investigation it turns out that they get that output any time you run any app commands on the server

$ heroku run bash
~$ rake assets:precompile
fatal: Not a git repository (or any parent up to mount point /app)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
# ...

The problem isn’t that traumatic, but it’s annoying. Also, it’s best practice to run with as few warnings as possible. On an unrelated support ticket: a customer of mine was getting a warning that told them exactly how to fix their problem, but they didn’t notice it because they were regularly getting 5 or 6 other warnings on deploy so when a new one was added they didn’t see it and didn’t take action.

Bug Hunt

The customer gave application access so I was able to reproduce the issue pretty easily. That error output is what you would get if you try to run a git command when there isn’t a git repo.

$ cd /tmp
$ git checkout foo
fatal: Not a git repository (or any of the parent directories): .git

If you didn’t know where the message came from, a quick google would point git as being the source of the error. But why was this only happening on Heroku?

One of the ways that we make scaling so fast is that we package your app in a zipped file and (roughly) all we have to do to get it to run is put it on a dyno and unpack it. One of the limiting factors is how fast we can transfer the zipped file (we call it a slug), so we recommend keeping the size of your repo low. To help with this we strip out the .git directory of your app since it only adds extra disk space and makes scaling, restarting, or migrating to other dyno types slower. The error from git is telling us that we’re trying to run a git command with no git repository. This makes sense since we strip the .git directory.

So we know why the issue is happening (no git repo on the dyno plus we’re running a git command). But we don’t know where the issue is happening, i.e. what is triggering that git code.

My first instinct was to grep their app for git calls:

~ $ grep -R git app/* config/* lib/*

I got a few hits but they were mostly di"git" strings, no smoking guns.

Expanding the search to all directories gave me way too many hits. 5210 references to the string git to be exact. It would take too long to go through all of them.

We can get a bit more specific if we assume we are shelling out searching for "\git”` however there’s still 195 entries, which is still quite a bit.

The Trick

While this specific problem isn’t that common, this general type of problem is. When one program calls another one, we usually want to know where in the first program that the second was being called. In a perfect world we would be able to raise an exception in the second program and get a backtrace from the first, however this doesn’t always work. In our case the call to git was erroring and returning a non-zero exit code, however the parent program was ignoring it.

The only option I had was to attempt to get more information. I figured maybe if I could find out what arguments are passed to git I could better find the problem.

To do this I wrote an executable to bin/git and since bin/ is first on the path, the OS will pick it over the orignial git command.

$ heroku run bash
~$ cat <<EOT > bin/git
#!/usr/bin/env ruby

raise ARGV.inspect

Don’t be alarmed, we aren’t modifying your production code, a heroku run instance spins up a new dyno that isn’t connected to your running app. You can do whatever you want to this instance’s disk and it won’t effect your production website.

You also have to make sure it’s executable

~$ chmod +x bin/git

Now we can see that our command is preferred:

~ $ which git

Now when we run any command against the app we get a new error output

/app/bin/git:4:in `<main>': ["rev-parse", "--short", "HEAD"] (RuntimeError)

This gives us more information. The output means that git is being called with git rev-parse --short HEAD. A quick grep for this gives us one and only one match:

~ $ grep -R "git rev-parse --short"
.../gems/raven-ruby/lib/raven/configuration.rb:      self.release = `git rev-parse --short HEAD`.strip rescue nil

This was comming from the raven gem which is a sentry client. They are trying to determine the SHA of the latest “release” to report to the sentry server, but they weren’t checking to see if a .git folder was present first.

So that’s where our error message was coming from. It turns out this was already fixed on a more recent version of raven where the check was added.

After the customer upgraded to a more recent version of raven and the error went away!


Writing a custom executable to output debugging info isn’t a common technique and likely you’ll never use it. I wanted to write this because it was an interesting case, maybe there is a better way to find this info, but this one worked just fine.

We don’t always go on such deep dives for customer issues, but if we get a ton of similar error reports or find an issue that is intriguing enough, sometimes we can dig in, and sometimes it’s fun. One debugging technique I do use ALL the time that would have potentially solved this issue is to update gems. Most people run with really old dependencies. Usually I recommend this when there is some related culprit. For example if rake assets:precompile is failing I suggest upgrading any gem with “sprockets” or “asset” in the name. Often bugs are fixed by the larger community but you won’t take advantage of those fixes unless you upgrade to a version with that fix.

If you don’t have a smoking gun of where the problem is coming from, you can run bundle outdated. This is an example of the command run against the repo:

$ bundle outdated
Fetching gem metadata from
Fetching version metadata from
Fetching dependency metadata from
Resolving dependencies....

Outdated gems included in the bundle:
  * psych (newest 2.1.0, installed 2.0.17, requested ~> 2.0.12) in group "default"
  * rack (newest 2.0.1, installed 1.6.4) in group "default"
  * rails (newest 5.0.0, installed, requested ~> 4.2.7) in group "default"
  * actionmailer (newest 5.0.0, installed
  * actionpack (newest 5.0.0, installed
  * actionview (newest 5.0.0, installed
  * activejob (newest 5.0.0, installed
  * activemodel (newest 5.0.0, installed
  * activerecord (newest 5.0.0, installed
  * activesupport (newest 5.0.0, installed
  * arel (newest 7.0.0, installed 6.0.3)
  * rails-dom-testing (newest 2.0.1, installed 1.0.7)
  * railties (newest 5.0.0, installed

Thanks to Nate for reminding me about this command at Rubyconf. Also, coincidentally, he happens to be the person who made the fix in the Sentry client.

If you liked this consider following @schneems on twitter or signing up to get new articles in your inbox (about 1 email a week when I’m on a roll).

Writing a Rails Feature - Blow by Blow

My favorite part of seeing someone live code is all the mistakes they make, but not because I’m a mean awful person who likes to see others fail. Watching others recover from mistakes helps me recover from my mistakes. It also makes me feel better when I see they mess up the same ways that I do. Too often, programmers beat themselves up when they can’t remember an API and have to Google it, or they lose an hour to a simple spelling mistake. Everyone does these things.

I have another confession, which is that I love taking notes while programming. It’s been a game changer for when I’m working on systems too large to fit inside of my head, which these days is pretty much all the time. (maybe my head is getting smaller).

I’ve mentioned this before at several conferences, and one of the tricks I use to make sure my notes are thorough is to pretend I’m preparing to write a blog post. I usually take it one step further and actually use my notes to generate a blog post. While this helps me do feature work, and hopefully is helpful in some of the writing I publish, you’re getting a polished and refined version of my notes.

I had an idea for a new feature in Rails that I wanted to build recently but it was such a large change that I couldn’t make myself work on it. I would start, get demoralized and quit constantly, so I decided to try something new. I took notes, yes, but I also noted all my mistakes as well. The result is a play-by-play of me implementing a non-trivial feature in Rails. You’ll see me make progress, make mistakes, reflect on my process as I go, and read the whole thing pretty much as it unfolded.

Problem Statement

Before we get started, here’s some context. I maintain Sprockets after the original maintainer left. One of my major goals with the project is to make debugging problems much easier. I love good error messages when you made an obvious mistake and the library catches it and gently corrects your behavior. Prior to my patch, Sprockets wasn’t allowed to raise errors like this because the Rails asset pipeline expected that when an asset is not found, the original string gets returned. For example, a valid asset lookup might return

#=> "assets/application-ai189cm58bvmadosifu913248.css"

While an invalid asset will return the original string

#=> "ap1ic4ti0n.czz"

This behavior may seem strange, however in the original days when the pipeline was introduced, everyone put their assets in the public/ folder, as to allow people to use “the asset pipeline” with these files, and the fallback was allowed.

I want to add errors to Sprockets, but before I can do that I need to know 100% for sure that the user expected an asset from the pipeline before I can raise an error, that means I need to make the API in Rails explicit. My original idea was to have asset_path be for asset pipeline and public_asset_path for things you didn’t expect Sprockets to find. It might look like an easy change on the surface, but it wasn’t.

I eventually implemented and shipped a feature pretty similar to what I wanted so that in the future, Sprockets will give you a better experience. Keep reading and follow me on the journey of implementing a fairly invasive Rails feature.

In the Beginning

I went back and added some narrative, so you wouldn’t be totally confused, the code examples and most of the commentary are actually part of my original notes.

How does Sprocket hook into Rails to form the asset pipeline? Your Rails app will use something like asset_path('smile.png') in a view. This is implemented by action_view/helpers/asset_url_helper.rb which has this magical line:

if source[0] != ?/
  source = compute_asset_path(source, options)

If you don’t have Sprockets-Rails in your app then the public path will be used:

def compute_asset_path(source, options = {})
  dir = ASSET_PUBLIC_DIRECTORIES[options[:type]] || ""
  File.join(dir, source)

But if you do have Sprockets-Rails that method gets over-written:

def compute_asset_path(path, options = {})
  debug = options[:debug]

  if asset_path = resolve_asset_path(path, debug)
    File.join(assets_prefix || "/", legacy_debug_path(asset_path, debug))

Sprockets-rails will be used to see if the asset exists as part of the pipeline if it doesn’t exist it falls back to the original compute_asset_path method from Action View.

Keep in mind as you read, that the majority of these statements and assertions are things I’m having to look up and discover via debugging. I didn’t roll out of bed with this info.

This is the secret sauce because every *_path call from javscript_path to image_path all call out to asset_path. Likewise, their counterparts the *_url methods also use the asset_path helper to compute their output. So everything hits asset_path and asset_path hits compute_asset_path inside of Sprockets-Rails.

There is some more magic for debugging purposes but that’s the general gist.

What we need to do is to introduce a public API that purposely bypasses Sprockets for non-asset-pipeline public files.

I like to list out goals in my notes to remind myself of my intentions. It helps me pick up where I left off.

Here are public *_path helpers:

  • compute_asset_path
  • resolve_asset_path
  • legacy_debug_path
  • asset_digest_path
  • asset_path
  • image_path
  • video_path
  • audio_path
  • font_path
  • javascript_path
  • stylesheet_path

Keeping track of work to be done, in this case, methods that need a public_* counterpart helps me retain focus.

Before we can change any APIs we need to implement deprecations and warnings. Of these methods, here are the ones we want warning on:

  • asset_path
  • image_path
  • video_path
  • audio_path
  • font_path
  • javascript_path
  • stylesheet_path

So now the question is what exactly does asset_path do?

  • It checks for nil, which is nice.
  • It calls to_s on any source passed in.
  • It returns an empty string "" if the source was blank
  • It returns the string if the source matches URI_REGEXP, which means that you’re giving a full URI i.e. if you pass in This is needed so that we can mix in controlled and remote assets in things like javascript_path which doesn’t require managed assets.
  • It then pulls out any query params such as ?utm_tracking=foo and preserves them as a “tail”

If you find yourself taking notes on what a method does consider going back and making a documentation PR to make all that behavior more obvious to future developers.

After that it gets into conditional logic. We pass the source into this method:

def compute_asset_extname(source, options = {})
  return if options[:extname] == false
  extname = options[:extname] || ASSET_EXTENSIONS[options[:type]]
  extname if extname && File.extname(source) != extname

I don’t know what this does despite method docs, I’m going to blame to see when it was added and see if I can find more context. It was added in 2012 as part of “Refactor AssetUrlHelper”. Based off of this earlier commit it looks like it allows you to do something like call javascript_path("application") and there is an API where .js can be auto added.

P.S. if you’re blame diving get the blame parent chrome extension. It allows you to press and hold “ALT” when viewing a blame view on GitHub. When you do that you get a little ^ next to the commit SHA that allows you to go to the parent of that commit SHA. Basically the same as going to that commit and seeing what the previous was before that one.

Then we check to see if the source passed in starts with a slash /. If so, we skip the asset pipeline and assume it’s a fully qualified path and this is surprising behavior to me:

if source[0] != ?/
  source = compute_asset_path(source, options)

Then there is code supporting “relative_url_root” we can see it here:

relative_url_root = defined?(config.relative_url_root) && config.relative_url_root
if relative_url_root
  source = File.join(relative_url_root, source) unless source.starts_with?("#{relative_url_root}/")

What does it do? A quick grep gives us the docs for relative_url_root:

  • config.action_controller.relative_url_root can be used to tell Rails that you are deploying to a subdirectory. The default is ENV['RAILS_RELATIVE_URL_ROOT'].

So basically this lets us tack on a few folders to the front of our assets if the Rails app isn’t in the same directory as the process was booted.

Finally, we add a host if one is declared:

if host = compute_asset_host(source, options)
  source = File.join(host, source)

You might want to do this if you’re using a CDN for example. Since both *_path and *_url helpers go through asset_path there are options in here for optionally telling the method that you, in fact, want a full on URL to come out of the helper.

The very last thing is that anything in the params i.e. “tail” such as ?utm_source=foo will be added to the end.

Now what

We’ve got a single entry point where Rails interfaces with Sprockets to make the asset pipeline. We need an entry point to public URLs that isn’t over-written by Sprockets-Rails. My first idea there is to alias compute_asset_path to make callable even when Sprockets-Rails writes over it.

We are going to use an alias to preserve the original method:

alias :public_compute_asset_path :compute_asset_path

If you’re not familiar this is creating a new method public_compute_asset_path that is essentially a copy of the original compute_asset_path method. This means that even if the compute_asset_path method is modified, or over-written, our public_compute_asset_path will still implement the original behavior.

Now since everything goes through asset_path we can add logic there to explicitly call this method. Maybe the option is :public_folder and we can add logic to asset_path

if options[:public_folder]
  source = public_compute_asset_path(source, options)
  source = compute_asset_path(source, options) if source[0] != ?/

Now that we have a way to explicitly call the public_compute_asset_path method, we need to expose a bunch of new helpers.

  • public_asset_path
  • public_image_path
  • public_video_path
  • public_audio_path
  • public_font_path
  • public_javascript_path
  • public_stylesheet_path

For example:

def public_asset_path(source, options = {})
  asset_path(source, options.merge(public_folder: true))

Since we’re using Rails 5.1 we can use named arguments (the minimum required Ruby version is 2.2) to make the method signatures a bit clearer.

def asset_path(source, public_folder: false, **options)

And then in our new method:

def public_asset_path(source, options = {})
  asset_path(source, public_folder: true, **options)

Now we can repeat for all our other assets.

Hold up. That named asset change looks heavy handed so let’s work with what we’ve got now and maybe come back and add it in later. We can change all the method signatures at the same time to make things easier afterward.

I reverted the method signature back to something like:

def public_stylesheet_path(source, options = {})
  path_to_stylesheet(source, {public_folder: true}.merge!(options))

Some subtle things here, we’re preferring the internal name path_to_stylesheet over stylesheet_path which someone might be using if they’ve defined a stylesheet route. Also for performance we have to allocate a hash to merge with options, if we used regular merge then we would allocate 2 hashes, by merging the first allocated hash in place we can save an allocation. This is only possible because the hash {public_folder: true} is not passed into the method which means that there are no other parts of the program that have a reference to it and would expect it to not be mutated.

I’m going to skip over the *_url methods for now, that’s just busy work. We implement them just like the *_path helpers.

The question we have now is how can we detect when Sprockets has fallen through and couldn’t find an asset? Two options come to mind. The first would be to check to see if the public_compute_asset_path matches compute_asset_path and deprecate. However, this poses some problems. It’s a performance concern since we’re now doing up to twice the work. There’s also the case where Sprockets isn’t being used and the path will always match since the method is being over-written by Sprockets-Rails. That’s not a great option.

The second option I can think of is for Sprockets-Rails to raise an error when it cannot find an asset. This is eventually what we want to happen, however, we can’t simply start raising errors on code that worked last week. We need to go through a deprecation cycle instead. We could catch the error and instead issue a deprecation. This causes some performance issues as well and exceptions in Ruby are really expensive (i.e. raising and catching exceptions is slow). There are other ways to communicate such as catch/throw but that would require Action View to know too much about Sprockets-Rails.

A third option could be to put the deprecation inside of sprockets-Rails. Something like:

def compute_asset_path(path, options = {})
  debug = options[:debug]

  if asset_path = resolve_asset_path(path, debug)
    File.join(assets_prefix || "/", legacy_debug_path(asset_path, debug))
    # Deprecate the code here, right here <====================================

However, this isn’t a great spot for a user to deprecate. The Sprockets-Rails gem doesn’t know what method you’re using. Ideally, we want the deprecation to be something like:

DEPRECATION: You are using `stylesheet_path` with an asset not managed by the asset pipeline instead use `public_stylesheet_path`

If we deprecate from the the inside of that compute_asset_path in Sprockets-Rails we don’t know what top level method exactly you’re trying to use.

There’s another option. We can port all the logic of public_*_path to Sprockets-Rails. We can also have our own custom asset_path provided by Sprockets-Rails.

Ughhh, this is hard. Right now I’m stuck. We are still stuck with the same problem - that the place where the logic is implemented isn’t where the deprecation needs to be. We can do fancy metaprogramming (ish) to grab the caller object, walk back up the stack to see what method was being called before, but that would fail if someone is writing their own library to call these methods and adding another layer, which I guess isn’t a bad thing.


Right now is a good place to stop. We have a big question, we have lots of options and there’s no clear winner. Shower, walk, do errands, go work on something else. Come back in a bit. Think about the problem when the mood strikes, but don’t force it. Let your unconscious mind work the problem. If you go for a month without ever thinking of this problem again maybe it’s not a problem worth solving, or maybe you don’t like programming. Maybe you had a really busy month. Who knows. Anyway, BBIAB.

This whole process happened over the course of weeks. What takes minutes to read might take hours to implement.

Okay, I’m back. What now? Well, I’m going to re-read what I wrote before I left. Haven’t had any brilliant epiphanies yet, let’s hope that changes. I’m going to look into being clever and using the caller inside of asset_path, however,, I’m going to do this within Sprockets-Rails.

When stuck with multiple options and no clear winner, pick one and start working on it. Even if it’s the wrong way to go, working on the problem gives you more context to pick the right solution.

I spent a bunch of time playing with code, no real direction - about 30 minutes to an hour. Mostly trying to re-remember throw/catch syntax and then debugging my mistakes. Forgot to write anything in my notes.

This is the solution I came up with, it’s relatively elegant but know that it wasn’t the first thing that came out of my fingers. In Sprockets-Rails,, we over-write asset_path then inside of the compute_asset_path which is called by asset_path we “throw” when we there is an asset miss.

Here is the code I came up with:

def compute_asset_path(path, options = {})
  debug = options[:debug]

  if asset_path = resolve_asset_path(path, debug)
    File.join(assets_prefix || "/", legacy_debug_path(asset_path, debug))
    result = super
    if respond_to?(:public_asset_path)
      throw(:asset_not_found, result) # <===== Note the throw

def asset_path(*args)
  catch_asset_not_found = catch(:asset_not_found) do
    return super(*args)
  # ... # <======== Here we can implement deprecation

Now if the asset is found, the behavior is exactly the same, however if code ever get’s past the catch block it means that the asset wasn’t found and that a public_asset_path exists which means we can safely assume that you’re running on a version of Rails that supports this behavior. Keep in mind that Sprockets-Rails has to run on multiple versions of Rails and Sprockets at a time.

The full asset_path method looks like this:

def asset_path(*args)
  catch_asset_not_found = catch(:asset_not_found) do
    return super(*args)

  result = catch_asset_not_found
  deprecate_invalid_asset_lookup(result, caller)

I wanted to move the deprecation code to its own method since it’s pretty gnarly. It looks like this:


  def deprecate_invalid_asset_lookup(name, call_stack)
    message =  "The asset #{ name.inspect } you are looking for is not present in the asset pipeline.\n"
    message << "The public fallback behavior is being deprecated and will be removed.\n"

    method_name = call_stack.first.split("in ".freeze).last.gsub(/`|'/, ''.freeze)

    if method_name.end_with?("_path".freeze) || method_name.end_with?("_url".freeze)
      message << "please use the `public_*` helper instead. For example `#{ "public_#{ method_name }" }`.\n"
      message << "please use the `public_*` helper instead for example `public_asset_path`.\n"
    ActiveSupport::Deprecation.warn(message, call_stack)

I’m being a bit “clever” (not normally a good thing) here. I’m using the caller to determine the last method that was called. If it ends in _path or _url then it’s safe to assume that you’re using something like stylesheet_path and that’s the method you care about, if it’s not then you’re probably using asset_path. We can refine this over time, maybe check to see if there is a valid public_* method that the object responds to.

Issuing this deprecation isn’t fast, however, neither is the asset pipeline lookup. If we can get the user to use public_stylesheet_path instead of stylesheet_path then they’ll avoid the deprecation and also get faster code, it’s a win-win.

The final message looks like this:

$ rails c
Loading development environment (Rails 5.1.0.alpha)
irb(main):001:0> helper.public_audio_path("blah")

DEPRECATION WARNING: The asset "/audios/blah" you are looking for is not present in the asset pipeline.
The public fallback behavior is being deprecated and will be removed.
please use the `public_*` helper instead. For example `public_audio_path`.
 (called from irb_binding at (irb):1)

There is one piece I want to deprecate inside of asset_path directly. We already had bypass code when your path starts with a slash /. This is not intuitive. I want to remove this so that you must be explicit about either using a public_ helper if you don’t want any warnings or errors.

source = compute_asset_path(source, options) if source[0] != ?/

I’m going to change this to:

if source[0] != ?/
  source = compute_asset_path(source, options)
  message =  "Skipping computing asset path since asset #{ source.inspect } starts with a slash `/`.\n"
  message << "This behavior is deprecated and will be removed. Instead explicitly declare that\n"
  message << "Use a `public_*` helper instead."

Here, I’m not doing that fancy method guessing via the caller, I’m hoping that valid use cases of starting an asset with a slash are minimal or likely accidental. Hopefully, the owner of the code will be able to figure out where they are calling the asset based on the full source name since they can grep their project to find it. Technically the public_compute_asset_path isn’t exactly the same as not adding the source, i.e. if there is a type specified when passing to public_compute_asset_path then we’ll auto add a directory.

The behavior actually makes a bit more sense now. I think that is so you can do something like:

stylesheet_path("application", "/path/directly/to/stylesheet")

Otherwise you’ll get /stylesheets prepended to the source you pass in. For this I think it’s fine since they’re going against convention of where to place those external stylesheets to make them break that up into two separate calls. However we will need a bypass mechanism. I’m thinking we allow a key called :raw which does not do the transformation. So you could do something like:

public_stylesheet_path("path/directly/to/stylesheet", raw: true)

With that in mind, we need to implement this behavior if we’re going to deprecate.

if source[0] != ?/
  source = compute_asset_path(source, options)
elsif !options[:raw]
  message =  "Skipping computing asset path since asset #{ source.inspect } starts with a slash `/`.\n"
  message << "This behavior is deprecated and will be removed. Instead explicitly declare that\n"
  message << "Use a `public_*` helper instead. Optionally pass in `raw: true` to get the exact same behavior."

Granted, somewhere else in Rails might be relying on this behavior so I’m not 100% confident we can deprecate and change it. But it’s worth looking into. We always need a way to avoid the deprecation notice, and when the previous use case is valid, we need to provide a 100% compatible replacement.

As a side note, I would probably like error checking when calling public_ methods so that we error out when you’re referencing a file that doesn’t exist on disk. We have to be careful about performance though. I would likely only want to do it in development and make it easy to turn off. Could perhaps use something like the evented file watcher and store all public contents in memory. Just some thoughts.

At the end of the day, our goal is a better development experience and the only way to get that is with requiring more explicit behavior from the user so we can raise helpful error messages.

The other piece of the puzzle is I’m not sure how to best add errors into Sprockets. We can’t simply start raising errors inside of Sprockets when an asset isn’t found. That would break lots of stuff. While Sprockets-Rails is linked to a version of Sprockets, there’s nothing to stop someone from using Sprockets-Rails 3.0.4 with Sprockets version 9000

s.add_dependency "sprockets", ">= 3.0.0"

We’ll need to move both of those pieces in lock step when we do add errors to Sprockets. We likely want to put a maximum version on the dependencies of Sprockets-Rails so that when we add errors (likely in Sprockets 5) then we can make sure a version of Sprockets-Rails exists that understands the errors. Otherwise, bad things will happen.

Up to this point, I’ve not put anything in GIT. This is really bad. I usually at least make a WIP commit at the end of a working session but I previously forgot. I tend to not commit a ton because I refactor as I go and would have a bunch of bad commit points with errors and 'bugfix' commits which I don’t love. To each their own. I’m making my “initial” commit.

We typically ask contributors to put things in only one commit for PRs as it makes blaming and reverting easier. I’ve got commit access so socially the rules are a little more lax. It makes sense to break out a large PR like this into several commits in case a direction I was intending to go didn’t work out and it makes it easier for me to roll back. We can always rebase later if we need to.

With the initial proof of concept done we need to flesh out the PRs. We need to add *_url methods like we did with paths. We also need to alias all the methods so they’re not accidentally over-written by someone who adds a pulic_audio route to their routes.rb. After that, I want to look into changing method signatures, writing tests and adding docs. With feature work, I typically don’t know what the outcome will look like until I’m done so I wait to write a regression test. With bug work, I sometimes try to write a test first. Sometimes finding the right place to write a test and writing a good test takes longer than the actual code. I always write a test, however very little of what I do is considered “TDD”. I wrote a piece on this a while back.

Protip: If you can’t find a good place to add a test, break something else in the same file you’re working in (comment out a random line or add a raise somewhere) and run your test suite. The test files that show the most failures and exceptions are a good place to look.

In the process of adding url helpers:

def public_font_url(source, options = {})
  url_to_font(source, {public_folder: true}.merge!(options))
alias_method :path_to_public_font, :public_font_path # aliased to avoid conflicts with an font_path named route

I found that my hack for deciphering the best deprecation message wasn’t valid since it assumed that the correct caller was only 1 back in the stack. In the case of _url helpers we call asset_url which then calls asset_path which means our correct method is 2 back on the stack. To account for this I split out the method extraction logic into it’s own method and updated the deprecation builder to also check to see if self responds to the public_ url:

  def extract_method_from_call_frame(frame)
    frame.split("in ".freeze).last.gsub(/`|'/, ''.freeze)

  def deprecate_invalid_asset_lookup(name, call_stack)
    message =  "The asset #{ name.inspect } you are looking for is not present in the asset pipeline.\n"
    message << "The public fallback behavior is being deprecated and will be removed.\n"

    path_method_name = extract_method_from_call_frame(call_stack[0])
    url_method_name  = extract_method_from_call_frame(call_stack[1])
    if url_method_name.end_with?("_url".freeze) && respond_to?("public_#{ url_method_name }")
      message << "please use the `public_*` helper instead. For example `#{ "public_#{ url_method_name }" }`.\n"
    elsif path_method_name.end_with?("_path".freeze) && respond_to?("public_#{ path_method_name }")
      message << "please use the `public_*` helper instead. For example `#{ "public_#{ path_method_name }" }`.\n"
      message << "please use the `public_*` helper instead for example `public_asset_path`.\n"
    ActiveSupport::Deprecation.warn(message, call_stack)

Which gives us the correct deprecations:

irb(main):003:0* helper.font_url("balksdjf")
DEPRECATION WARNING: The asset "/fonts/balksdjf" you are looking for is not present in the asset pipeline.
The public fallback behavior is being deprecated and will be removed.
please use the `public_*` helper instead. For example `public_font_url`.
 (called from irb_binding at (irb):3)
=> "/fonts/balksdjf"
irb(main):004:0> helper.font_path("balksdjf")
DEPRECATION WARNING: The asset "/fonts/balksdjf" you are looking for is not present in the asset pipeline.
The public fallback behavior is being deprecated and will be removed.
please use the `public_*` helper instead. For example `public_font_path`.

Turns out that it’s easier to copy-paste the same method + docs and edit all the things at once using CMD+d with my editor. I use Sublime Text.

I was asking questions about where to test things in Rails Contributors Basecamp chat, it looks like railties might have some integration tests for assets. I prefer integration tests whenever possible as they’re closer to how a user would use our code.

Also, the case of things like image_tag was brought up. Do we want to go through all the tags and make a public_image_tag etc? That’s a lot of helpers. Ideally, we want to be very explicit about when we expect the asset pipeline and when we don’t. There may need to be some middle ground about when we want fallback behavior in the asset_path but I still think this is bad. I’m going to think on it for a bit and keep working down the current implementation.

I found asset_debugging_test.rb in railties. See, the asset pipeline works by over-writing asset behavior. By default, without Sprockets-Rails any public_* method will be identical to its’ non public_* counterpart. So we need to test in an environment where the over-writes are in play. This test boots a test rails app that uses the asset pipeline and hits an endpoint. Going to loop through all our helpers and assert we’re getting a good match.

test "public paths" do
  contents = "doesnotexist"
  cases = {
    public_image_path: %r{/images/#{ contents }},
    public_asset_path: %r{/#{ contents }},

  cases.each do |(view_method, tag_match)|
    app_file "app/views/posts/index.html.erb", "<%= #{ view_method } '#{contents}' %>"

    app "development"

    class ::PostsController < ActionController::Base ; end

    get '/posts?debug_assets=true'

    body = last_response.body
    assert_match(tag_match, body, "Expected `#{view_method}` to produce a match to #{ tag_match }, but did not: #{ body }")

Good thing I added tests, I accidentally used options instead of options = {} in a few places.

Normally I like to test for positive and negative cases. I.e. make sure that the behavior is different between public_asset_path and asset_path. We can do this pretty easily by adding another test case:

test "public path methods do not use the asset pipeline" do
  cases = {
    asset_path:        /\/assets\/application-.*.\.js/,
    public_asset_path: /application.js/

  cases.each do |(view_method, tag_match)|
    app_file "app/views/posts/index.html.erb", "<%= #{ view_method } 'application.js' %>"

    app "development"

    class ::PostsController < ActionController::Base ; end

    get '/posts?debug_assets=true'

    body = last_response.body.strip
    assert_match(tag_match, body, "Expected `#{view_method}` to produce a match to #{ tag_match }, but did not: #{ body }")

Ideally,, we would also test the deprecation notice, but that’s coming from Sprockets-Rails so we’ll have to work on that separately.

That’s it for today I think. This is a pretty good stopping point. I want to think about the image_tag problem a bit, so I’ll go work on something else. If you’re curious it’s 2:07pm right now. I started at around 9am. BBIAB.

2 months later

Has it really been about 2 months since I last worked on this? Other stuff came up, this wasn’t a priority. It honestly still isn’t a priority, but I’m stuck on some other even harder project and want to get this shipped in Rails 5.1 if possible.

What now? Good thing I wrote all those notes, because I have no idea where I left off. I wish I came back to this earlier, 2 months is too much. I forgot to meditate on the problem at hand, but hopefully, I’ve got something stashed at the back of my brain when I start looking at the problem again.

I previously wrote “I want to think about the image_tag problem a bit” so I’m going to start there. No clue, what I meant, going to run tests.

Looking at asset_tag_helper.rb here’s all the methods:

  • javascript_include_tag
  • stylesheet_link_tag
  • auto_discovery_link_tag
  • favicon_link_tag
  • image_tag
  • video_tag
  • audio_tag

Previously based on my notes, I was worried that changing all the helpers would be too much work, this isn’t so bad. I’m going to take a shot at modifying these few with public_ alternatives.

When stuck at a crossroads, pick a path. You can always backtrack when you have more information.

All these methods use a tag method. We can put our deprecation notice there. So it turns out that while I did put the the code in Rails in GIT, I didn’t save any of my code from Sprockets-Rails. Ugh. Luckily I took very detailed notes that also included code examples.

Before I make those changes, I’m going to add public tag helpers.

The image_tag method uses path_to_image helper. We can modify it to pass a public_folder if present:

def image_tag(source, options={})
  options = options.symbolize_keys
  path_to_image(source, .merge!(options))

  src = options[:src] = path_to_image(source, { public_folder: options.delete(:public_folder) })

  unless src =~ /^(?:cid|data):/ || src.blank?
    options[:alt] = options.fetch(:alt){ image_alt(src) }

  options[:width], options[:height] = extract_dimensions(options.delete(:size)) if options[:size]
  tag("img", options)

Now we add a method:

def public_image_tag(source, options={})
  image_tag(source,  { public_folder: true }

Not too bad. Let’s test it. We can plug in another case to our existing test:

public_image_tag:        %r{<img src="/images/#{ contents }"},

This works. Moving on to the next item. favicon_link_tag we can use the same approach:

def favicon_link_tag(source='favicon.ico', options={})
  tag('link', {
    :rel  => 'shortcut icon',
    :type => 'image/x-icon',
    :href => path_to_image(source, { public_folder: options.delete(:public_folder) })

def public_favicon_link_tag(source='favicon.ico', options={})
  favicon_link_tag(source, { public_folder: true }.merge!(options))

******STOPPED HERE*********

Tests work. The auto_discovery_link_tag is meant to link to a URL and not an asset, so I don’t think we need to do anything there. It uses url_for under the hood.

The javascript_include_tag and public_stylesheet_link_tag already pass options so our job is much simpler. We only need to list public_folder as a valid option.

def javascript_include_tag(*sources)
  options = sources.extract_options!.stringify_keys
  path_options = options.extract!('protocol', 'extname', 'host', 'public_folder').symbolize_keys

That leaves us with video and audio tags. Both of these use the multiple_sources_tag helper.

def audio_tag(*sources)
  multiple_sources_tag('audio', sources)

def public_audio_tag(*sources)
  options = sources.extract_options!
  audio_tag(*sources, { public_folder: true }.merge(options))

Then we add an options hash to multiple_sources_tag and pass them into the send it is using:

options[:src] = send("path_to_#{type}", sources.first, options)

We do the same with video_tag now. Except there’s a slight hitch. Turns out that it does 2 things: not only does it convert the “video” you pass in it but it also converts the “poster” which is essentially the screenshot show before the video loads. So there can be 4 different cases:

  • Video is in public, image is from pipeline
  • Video is in public, image is in public
  • Video is in pipeline, image is in pipeline
  • Video is in pipeline, image is in public

Seems a tad repetitive to list out all the cases, but so was working on this section of code. These notes were to keep me on track and make sure I handled all the edge cases.

Instead of going overboard with public_video_tag and public_video_tag_with_public_poster_tag I’m going to assume if you’re using one static all of them are. This is a flexible assumption. As people use this they can chime in on what they want the API to be.

Okay, we’ve got all of our *_tag helpers working. Now I need to add docs.

After adding docs I realized that in a few places I added an options hash to a method that is using a splat, which may already contain an options hash. I have to extract and merge first.

Took about 15 minutes to fix syntax errors, rename a few variables, and fix other things. Oddly enough I had this in my code:

path_to_image(source, )

And it didn’t throw an error, apparently, that’s valid Ruby code, who knew!?

Anyway, I think I’m pretty much done(ish) with the Rails code. I need to modify Sprockets-Rails. I want to push a PR to both projects at the same time. Now I get to roll back through this document and re-implement everything.

It turns out I did still have all my changes for Sprockets-Rails, but I never committed them, whoops. At least I didn’t lose all that work. I made a branch in Sprockets–Rails schneems/deprecate-fallback-behavior

I write the branch name in my notes so I can find it easier next time.

The PR

I made two PRs:

There is a famous quote from Mike Tyson: “Everyone has a plan until they get punched in the mouth”. This is also true of pull requests and feature work. A PR isn’t really a finished feature, it’s more like a prototype, a concept of an implementation of a feature. While building a feature, we make many assumptions, and when we go through the review process we often see those assumptions come apart. This is a good thing. When we challenge our assumptions (in a healthy environment), we formulate stronger and better models of reality.

It was decided that we could do without all the public_ helpers by exposing the public_folder: true API directly to the user. Now I’m going to add a commit that removes all the helpers. It seems like I did a bunch of unnecessary work but really adding all those helpers helped me in the API design. The public_* helpers only work if they’re all aware of the public_folder: key, so it wasn’t totally useless.

Hopefully, this means that all I need to do is delete the helpers and change the tests.

Also, I’m changing multiple_sources_tag to multiple_sources_tag_builder since that’s what it’s doing. It’s a private method anyway. If I do that then I can detect where to show the deprecation backtrace easier in the sprockets-rails patch.

We later talked about using asset_path with an asset that starts with a slash. There is prior art there and your assets won’t render if you’re using it incorrectly anyway. I’m not a huge fan of NOT removing and deprecating that, but we could come back to it in another PR.

In the sprockets-rails patch we introduced a config option config.assets.unknown_asset_fallback that lets you decide if you want the asset pipeline to fall back or not. We also changed public_folder key to be skip_pipeline to make it more descriptive.

I didn’t take as many notes during the PR process, since much of it was reacting to comments. You can go back and read the comments. One thing to notice is that the PR messages don’t have the original interface I proposed. When I change the top level API I often go back and update the PR message to match, in this case I did that but it’s still not 100% perfect. Using this method I was able to make the deprecation notifications less “clever” by saying we can pass in the value and not having to mention the specific method name.

I was also able to get rid of the catch/throw logic and put the deprecation directly inside of compute_asset_path. I would say that I put about an equal amount of time into updating and re-working the PR as I did working on getting to a point where I could make a PR.

The other thing a major feature needs before it can ship is docs. We need method docs, and to update the Rails guides. It’s hard taking notes while writing docs, as docs are a kind of generalized note taking.


The PRs were accepted, and now we have a way to deprecate and raise when an asset isn’t found as part of the pipeline. If you’re not using any assets from the public directory you can go ahead and use the flag if you upgrade to the latest sprockets-rails:

config.assets.unknown_asset_fallback = false

If you are using assets in the public folder and need the skip_pipeline flag, you’ll need to wait for Rails 5.1 to come out.

Don’t give up

Some things that are interesting to me is that much of the original code examples I posted in these notes were not the ones that shipped with the PR. It reminds me of the famous Mark Twain quote “I didn’t have time to write a short letter, so I wrote a long one instead.”. As we gain clarity through the PR process and get feedback from our peers our code often gets better, clearer, and sometimes - shorter.

If you’ve never submitted a feature PR to a major open source project, don’t let this dissuade you. They’re not all this long or complicated. Think of this as the worst case scenario. Also remember that I did this work over weeks and months, so you can too. Breaking up a large PR into smaller manageable tasks and working on it when inspiration hits can help. Just don’t forget to take notes so you can pick up where you left off. The more you do these, the easier they get.

If you wanted to make a change this large, if it all possible if you can make it as a gem first it’s best so even if the core team doesn’t immediately want your patch, others (including you) can still use it. On the flip side, if you’re in the middle of a really gnarly patch and the back and forth with the maintainers seems to take forever, don’t give up. It can be normal, even for someone who does this kind of work on a daily or weekly basis.

If you liked this consider following @schneems on twitter or signing up to get new articles in your inbox (about 1 email a week when I’m on a roll).

Statistical Literacy and You: An Intro for Programmers

America. World. Let’s talk about statistics. Statistical literacy is a giant gaping hole in humanity’s collective toolkit, and it matters a lot more when your job relies heavily on numbers. Take programmers for example - as programmers, we love determinism and don’t do well with uncertainty. Yet, we work with inherently uncertain systems. How we build programs to deal with anomalies, how we generate and present data can both be greatly improved with a few basic statistical methods. If you were confused about the Cubs winning the world series, or how the polls in the US Presidential election could be “so wrong”, keep reading.

An Ode to Error

The first problem with trying to use numbers to predict something is error. Most people have a negative connotation with errors or don’t know how to use them. To them, errors indicate a bad thing that should be avoided. That’s the problem. One example from Nate Silver’s book (which I recommend as follow-up to this post) The Signal and the Noise comes when a town is told that rainfall will be 49 feet, which is not enough to overtake their flood barriers (built to withstand 51 feet of flooding). When the rain came, it went over the barrier and eventually came to 54 feet, 3 feet above the levee and 5 feet above what was predicted. 75% of homes were damaged or destroyed.

As a result of the low estimate, few people left the town, bought flood insurance or took any additional protections against the threat. They thought that the 49-foot estimate was a “maximum” or that it was extremely accurate. It turns out that the people doing the rainfall calculations knew their estimate was good within a range plus or minus 9 feet, which gave a 35% chance of flooding. Why didn’t they give that information to the people of the town? They were afraid the people wouldn’t understand and lose faith in the agency. Why didn’t the people of the town demand to know the error range? The people didn’t know enough to demand more so they took the number they were given and made their own assumptions. Statistical literacy is important.

Some of this may seem obvious, yet when we are given numbers we are rarely given an error range. We see a single number is given, for example, a 20% chance of rain. We rarely get the second crucial piece of data, that is the error. It may rain only 3 days out of 100 or it may rain 30. We as consumers of data can make better predictions with more data. Errors in our data isn’t a bad thing, it’s more data about the data. If you are planning a picnic the difference between a 20-30% chance of rain might not be much to work around, but what if you’re trying to coordinate a multi-million dollar outdoor movie shoot, that extra 10% could cost you a lot of cash. Errors are good and should be celebrated. When we get numbers we should be asking how confident are those numbers. What range could they be off by? If the person giving you data can’t answer that question then you don’t have data, you have a guess in numerical form.

If you’re familiar with stats, you might know of error as “standard deviation” or stdev. If you work with stats a lot, you need to understand how it’s calculated. This article is more of a primer, so I refer to it as error.

Error and the Election

Error is where most people misunderstood the polls that were presented at sites like 538. When you view the average, you also saw a bar representing the amount of error in the guess. The larger the bar, the larger the uncertainty. Too many people took one look at the average and immediately declared a winner. This was a bad call. If you follow Nate’s writing he very firmly said that there was an equal chance of a Trump electoral win with a popular vote loss as there was a blow-out by Clinton (with an 8 percentage point lead). This is because she was largely polling around 3-4% with an uncertainty of 3% so she could have been up 7% or down as we saw, to slightly under 0% (in the electoral college) which turned out to be the case.

In 2012, Nate Silver correctly called 50 out of 50 states in the presidential election. By his own words, this was pure luck. Not that he’s being humble as he was talking about being statistically lucky. One model, Florida, had a high error rate and was hovering around the 50% mark for both candidates. Meaning, that the data basically told us that there wasn’t enough data. He guessed and got lucky. The media largely misread this and misrepresented his luck as an indication that he was an infallible oracle, who gazed into his crystal ball built of numbers and simulations. These types of representations are harmful. We can still celebrate his achievement of precisely predicting all the other races and even celebrate his luck in that one state, without minimizing the role of error in the forecast.

One of the ways that people get better at estimating things is the act of estimating and then re-evaluating their methods based on performance. If the actual results end up being outside of your estimated error range, then it wasn’t a very good estimate at all. Rather than cry foul that a poll was “rigged” or that someone sold us snake oil, the only way we can get better as a community is to review the info with an open mind and open error rates. 538 does a good job of presenting their error rates in their model, they even talk about it extensively in blog posts and tweets. What we don’t have is a general public who is capable of internalizing and consuming statistical error in a meaningful way.

Programmer error

How can you as a programmer use error? One of the most prominent ways in my life is in performance calculations. The benchmark/ips gem gives you error reported by default:

require 'benchmark/ips'

def slow
  100.times { |i| i + 1 }

def fast
  1.times { |i| i + 1 }

Benchmark.ips do |x|"slow") { slow }"fast") { fast }!

# Warming up --------------------------------------
#                 slow    19.658k i/100ms
#                 fast   208.907k i/100ms
# Calculating -------------------------------------
#                 slow    196.468k (±10.3%) i/s -    982.900k in   5.061751s
#                 fast      5.329M (±12.1%) i/s -     26.322M in   5.023471s

# Comparison:
#                 fast:  5328510.2 i/s
#                 slow:   196468.1 i/s - 27.12x  slower

Here we see that the fast method is 27 times faster than slow. It can run 5.32 million iterations per second compared to “slow”-s measly 196 thousand. The error rate is right next to each metric (±10.3% for slow). This means that it can conceivably run as fast as 216K iterations or as slow as 175K iterations. This isn’t that important when there is a clear winner, but is important when things are close:

Benchmark.ips do |x|"slow    ") { slow }"slow too") { slow }!

# Warming up --------------------------------------
#             slow        20.061k i/100ms
#             slow too    20.338k i/100ms
# Calculating -------------------------------------
#             slow        212.392k (± 7.5%) i/s -      1.063M in   5.040024s
#             slow too    208.185k (± 8.5%) i/s -      1.037M in   5.028060s

# Comparison:
#             slow    :   212392.3 i/s
#             slow too:   208185.1 i/s - same-ish: difference falls within error

Here even though “slow” wins over “slow too” it is important that we look at the error. Here, there’s not a meaningful difference between the two and there shouldn’t be because we are using the same method in both.

Key point: when using numbers to compare things, error matters. Tell your friends, tell your family, tell your co-workers. From now on when someone gives you a number, (politely) ask for the error.

It’s about Ethics in Data Journalism

As programmers, we generate a lot of data. Often we present raw data such as the number of likes or a dashboard of new user signups. Very seldom do we consider the ethical implications of how we are producing and showing that data.

One of my favorite “average” stories I heard on 99 percent invisible was about the Vietnam war. Pilots were dying at a rapid rate, not because the enemy was shooting them down but because of freak accidents, which turned out to be not so freakishly uncommon. It turns out that the cockpit was built for an “average” pilot size, which might seem reasonable. However, they didn’t take into account that while there might be an average height or an average arm length, none of the pilots flying the planes were perfectly average. There is error in the average, of course, some pilots might be taller which prevented them from pulling all the way back on the flight stick. Some of them might have had shorter arms, preventing them from reaching a critical switch on the dash quickly. For the Air Force, this seemingly benign lack of error reporting was causing very real deaths.

The solution to the pilot problem was to introduce adjustable components. With adjustable seating, the number of life ending accidents come to an abrupt halt. In this case making a more accurate estimate wasn’t possible but instead, the system had to change.

Imagine how many lives could have been saved if the people presenting the measurements took the time to also report how un-average most of their measurements were. They could have made this even more obvious by making models, maybe showing the average hand size from smallest to largest.

We already saw earlier how a missing error rate gave a town a false sense of flood protection. If someone presenting that data produced a graph that showed how much higher or lower than the average could have been, lives and property could have been saved. Homes could have been fortified and the rates of flood insurance purchase surely would have gone up.

Most of the data we work with aren’t informing life or death decisions but that doesn’t mean we should hold it to a lesser standard. By normalizing the presence of error data, we consumers of that data can make better choices. I also think we’ll see a snowball effect, with the more error rates reported, the more people become familiar with how to use them and the more they will look for them. Until your consumers start asking for it, we’ll have to do the hard work and figure out the best places to add this extra error info.

Re-normalizing a Denormalized Culture of Statistics

The vast majority of humanity’s exposure to statistical data is via the National Weather Service. Percentages fill up mornings with coffee and help shape weekend plans and weddings. Have you ever wondered why when there is a 20% chance of rain, it is almost a sure thing that it won’t rain? While our government gives our weather data for free, most reporting agencies (such as your local weather broadcaster, The Weather Channel, or Dark Sky) will interpret it to “add value”. This largely means that they add in a “wet bias”.

In the case of weather forecasts, we often don’t care if the high temperature is off by one or two degrees, a little error is accepted and generally ignored. However, when it rains on a day with 20% chance of rain, people generally think that the weather was “wrong”. Statistically speaking it should rain 20 days for every 100 that are forecast with a 20% chance - this is 1 in 5. Yet if it rained 1 day out of a 5 day work week with a 20% chance, you would feel cheated.

Forecasters “enhance” the data then to make themselves seem more right more often. People generally don’t care as much if rain is predicted but none comes. So when a forecast is low, say 5%, they will report it as 20% so when it does rain for those 5 days out of 100, it doesn’t seem like the forecast was nearly as wrong.

Forecasters will also back off of aggressive estimates. Rarely do you see a 100% chance of rain (unless it’s actively raining at the time of the report). Instead, you’ll see a 90% chance, to account for the error in the model. If on the 1 day that it doesn’t rain out of 100 with a 100% chance of rain, then forecasters won’t seem stupid.

Wet biased poll-conflation

While this might seem useful, it also gives the average statistics consumer a false idea about what those numbers mean. At one point, Hillary was predicted to have an 80% chance of winning. If you’re only consuming adjusted weather reports, you might translate that to think she has a 99% chance of winning. When polls closed and 538 rolled out their final prediction before general voting, Hillary was around 66%. That means out of 100 elections, Trump would win 33% of these, which is not a trivial number. Many casual observers and pundits falsely interpreted these numbers.

Getting better

One way that 538 tends to be better than your average poll is that they are averaged polls. One way to account for random error is to add someone else’s random error, and if it’s truly random then it will cancel out. Even with this optimization, their models aren’t 100% perfect.

They run multiple simulations of a vote using a distribution based on the error from the average received by the polls. They do this many many times and see where the data converges. It’s not perfect but that’s not the point. Elections happen so infrequently that it is very hard to get enough data to brute force gains. We mentioned weather earlier due to increased data collection and computing power, so weather forecasts have gotten much better. For other estimations where data isn’t as plentiful and where we have less opportunity to check our assumptions, it is much harder to make progress.

As you go about your day-to-day life, try to take note of the numbers that are being presented to you. How accurate are they? Could you make better decisions with some error info? If you speak up and let those that show you information that error is important we can start to be collectively more conscious.

If you find these concepts interesting and want to dig deeper, I recommend “The Signal and the Noise” by Nate Silver.