Varnish Cache Invalidation with Fastly Surrogate Keys

We've been using Fastly on a few projects at HotelTonight and I recently moved this blog onto their service too. I've seen some tremendous performance gains and have been clocking some excellent page speed scores.

However, in the beginning I did feel some pain...

There are only two hard things in Computer Science: cache invalidation and naming things. -- Phil Karlton

Out of the box Varnish supports flushing of URLs via the HTTP PURGE method. This works pretty well for simple one-to-one mappings of resources to URLs. However, complexity can arise when a URL depends on multiple resources within an application.

Luckily Fastly has done most of the heavy lifting for us and added Surrogate Keys to their service offering.

TL;DR: we can return a Surrogate-Key HTTP header that lets us build a many-to-many relationship between keys and pages within our site. When we purge a key, every page associated with that key is purged too.

Creating Surrogate Keys

Using the codebase from this blog as an example, we'll walk through the steps
to set up surrogate keys in a Ruby on Rails application.

Let's first add two convenience methods to our Article class so instances of it
can generate their own keys.

# app/models/article.rb
class Article < ActiveRecord::Base  
  def resource_key
    "#{collection_key}/#{id}"
  end

  def collection_key
    self.class.table_name
  end
end  

Then we can use the collection_key and resource_key methods in the ArticlesController
to generate the appropriate keys.

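# app/controllers/articles_controller.rb
class ArticlesController < ApplicationController
  # set_cache_control_headers is not shown in this post; it is assumed to be defined elsewhere
  before_action :set_cache_control_headers, only: [:index, :show]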

  def index
    @articles = Article.published
    set_surrogate_header 'articles', @articles.map(&:resource_key)
  end

  def show
    @article = Article.find(params[:id])
    set_surrogate_header @article.resource_key
  end

  # ...

  private

  def set_surrogate_header(*keys)
    response.headers['Surrogate-Key'] = keys.join(' ')
  end
end  

The set_surrogate_header method adds a header to the Rails response and sets
its value to the keys passed in. Multiple keys, including a nested array such as the
one built in index, are joined into one space-delimited string.
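To see how the header value comes together outside of Rails, here is a minimal stand-alone sketch. The build_surrogate_header name is hypothetical and only mirrors set_surrogate_header; the explicit flatten is optional, since Array#join already flattens nested arrays.

```ruby
# Hypothetical stand-alone mirror of set_surrogate_header, for illustration only.
def build_surrogate_header(*keys)
  # flatten makes the nested-array case explicit; Array#join alone
  # would also flatten nested arrays recursively.
  keys.flatten.join(' ')
end

# Same shape of arguments as the index action: one string plus an array of keys.
index_value = build_surrogate_header('articles', ['articles/1', 'articles/2', 'articles/3'])

# Same shape of arguments as the show action: a single key.
show_value = build_surrogate_header('articles/2')

puts index_value
puts show_value
```

The two printed values should match the Surrogate-Key headers returned by the index and show actions.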

Let's verify we're getting the expected response:

curl -I http://localhost:5000
> HTTP/1.1 200 OK
> Surrogate-Key: articles articles/1 articles/2 articles/3

curl -I http://localhost:5000/articles/2
> HTTP/1.1 200 OK
> Surrogate-Key: articles/2

Purging Surrogate Keys

When we modify an Article through the CMS we'll want to purge the article "index page"
and the "article detail" (as well as any other pages that might include the Article
in question).

# app/controllers/articles_controller.rb
class ArticlesController < ApplicationController  
  def create
    @article = Article.new(article_params)

    if @article.save
      Fastly.purge(@article.collection_key)
    end

    respond_with @article
  end

  def update
    @article = Article.find(params[:id])

    if @article.update(article_params)
      Fastly.purge(@article.resource_key)
    end

    respond_with @article
  end

  private

  def article_params
    # ...
  end
end  

Whenever a new Article is created, the articles key is purged, which busts the
cache for the index page. Similarly, when an article is updated, articles/:id is purged,
which busts the cache for both the article page and the index page, because the index
response lists every article's resource key.
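The Fastly.purge(key) helper used above is not defined in this post. As a rough sketch of what such a helper might do, the code below builds a purge request for Fastly's purge API, which, as I understand it, accepts space-separated keys in a Surrogate-Key header along with a Fastly-Key token header. The SurrogatePurger module and the FASTLY_SERVICE_ID and FASTLY_API_KEY environment variable names are assumptions for illustration.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper; the module name and environment variable names
# are assumptions, not part of the original codebase.
module SurrogatePurger
  API_ROOT = 'https://api.fastly.com'

  # Builds (but does not send) a purge request for one or more surrogate keys.
  def self.build_request(keys, service_id:, api_token:)
    uri = URI("#{API_ROOT}/service/#{service_id}/purge")
    request = Net::HTTP::Post.new(uri)
    request['Fastly-Key'] = api_token                 # API token header
    request['Surrogate-Key'] = Array(keys).join(' ')  # space-separated keys
    request['Accept'] = 'application/json'
    request
  end

  # Sends the purge request over HTTPS and returns the Net::HTTP response.
  def self.purge(keys,
                 service_id: ENV.fetch('FASTLY_SERVICE_ID'),
                 api_token: ENV.fetch('FASTLY_API_KEY'))
    request = build_request(keys, service_id: service_id, api_token: api_token)
    Net::HTTP.start(request.uri.host, request.uri.port, use_ssl: true) do |http|
      http.request(request)
    end
  end
end

# Inspect the request without touching the network.
request = SurrogatePurger.build_request('articles/2', service_id: 'SERVICE_ID', api_token: 'TOKEN')
puts request.path
puts request['Surrogate-Key']
```

Keeping request construction separate from sending makes the helper easy to check without a network call, and sending the keys in a header avoids having to escape the slash in keys like articles/2.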

If we add new pages to the site, such as "Tags" or "Popular", we can reuse the
same Surrogate Keys from Articles to programmatically purge the cache for those
new pages too.
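To make the many-to-many relationship concrete, here is a toy in-memory model of a cache that tracks which URLs were stored under which keys. TinyEdgeCache is a hypothetical class for illustration only, not how Fastly or Varnish is implemented, and the /tags/ruby page is an invented example of a new page reusing article keys.

```ruby
require 'set'

# Toy model of an edge cache indexed by surrogate key (illustration only).
class TinyEdgeCache
  def initialize
    @pages = {}                                         # url => cached body
    @urls_by_key = Hash.new { |h, k| h[k] = Set.new }   # surrogate key => urls
  end

  # Caches a page and records every key from its Surrogate-Key header.
  def store(url, body, surrogate_key_header)
    @pages[url] = body
    surrogate_key_header.split(' ').each { |key| @urls_by_key[key] << url }
  end

  # Evicts every cached page tagged with the key; returns the evicted URLs.
  def purge(key)
    urls = @urls_by_key.delete(key) || Set.new
    urls.select { |url| @pages.delete(url) }.sort
  end

  def cached_urls
    @pages.keys.sort
  end
end

cache = TinyEdgeCache.new
cache.store('/', '<index>', 'articles articles/1 articles/2 articles/3')
cache.store('/articles/1', '<article 1>', 'articles/1')
cache.store('/articles/2', '<article 2>', 'articles/2')
cache.store('/tags/ruby', '<tag page>', 'articles/1 articles/2') # hypothetical Tags page

purged = cache.purge('articles/2')
remaining = cache.cached_urls
p purged
p remaining
```

Purging articles/2 evicts the index page, the article page, and the tag page in one call, while /articles/1 stays cached because it was never tagged with that key.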

Purge in the Background

As purge traffic increases, it's worth exploring putting the purge requests into a
non-blocking background job. The Sidekiq Unique Jobs gem
is helpful in reducing duplicated purge requests.
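The idea can be sketched with nothing but Ruby's standard library. The BackgroundPurger class below is a hypothetical stand-in for a real job system such as Sidekiq, not its API: it skips keys that are already waiting and runs the purge block on a worker thread. In this demo the worker is started only after the keys are enqueued so the result is deterministic.

```ruby
require 'set'

# Hypothetical stdlib-only stand-in for a deduplicating background job queue.
class BackgroundPurger
  def initialize(&purge)
    @purge = purge          # block that performs the actual purge
    @queue = Queue.new      # thread-safe FIFO of keys to purge
    @pending = Set.new      # keys waiting in the queue, used for dedup
    @lock = Mutex.new
  end

  # Returns true if the key was enqueued, false if it was already pending.
  def enqueue(key)
    @lock.synchronize do
      return false if @pending.include?(key)
      @pending << key
    end
    @queue << key
    true
  end

  # Starts the worker thread that drains the queue.
  def start
    @worker = Thread.new do
      while (key = @queue.pop) != :stop
        # Clear the pending mark first so later changes can be enqueued again.
        @lock.synchronize { @pending.delete(key) }
        @purge.call(key)
      end
    end
  end

  # Processes everything already queued, then stops the worker.
  def shutdown
    @queue << :stop
    @worker.join
  end
end

purged = []
purger = BackgroundPurger.new { |key| purged << key }
results = ['articles/2', 'articles/2', 'articles'].map { |key| purger.enqueue(key) }
purger.start
purger.shutdown
p results
p purged
```

The second enqueue of articles/2 is rejected because the same key is still pending, so only two purges run. A real application would call enqueue from the controller and let the job system handle retries and persistence.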