
Cache headers in an Express.js app

Effective implementation of a caching strategy can significantly improve the resilience of your Express app and save cost. With the right caching strategy combined with a content delivery network (CDN), your website can serve massive spikes in traffic without a significant increase in server load.

Caching strategy

We will leverage HTTP Cache-Control headers combined with caching and serving your Express app via a CDN. We will not be using any server-side caching: the assumption is that with the right configuration of the HTTP headers and the CDN, there is no need to burden your server with any additional requests related to server-side caching.

Cache headers intro

The HTTP header Cache-Control lets you define how your proxy server (e.g. Nginx), your CDN and client browsers cache content and serve it instead of forwarding requests to the app.

You can find the full specification of Cache-Control at MDN. The following are the key parameters that we will use in our example configuration:

  • public - indicates that the response may be cached by any cache. That means that every layer your response passes through is allowed to cache your content and serve it. This is the setting you want for most content: articles, blogs, 'static' pages, product pages, etc.

  • max-age - specifies, in seconds, how long the content may be cached. You want to set this as long as possible without compromising the ability to refresh content that changes. Anything from 5 minutes to a day (or even a few days) is probably a good choice for most 'publishers', depending on how often they publish and refresh content. For example, if you publish a lot of new content, your home page could be cached for only 5 minutes, but pages with articles or blog posts could be cached for a day.

Here is what the header looks like to enable a public 5 minute cache:

cache-control: public, max-age=300

Setting Cache-control header in Express

You can set HTTP headers in an Express app using the response API:

res.set('Cache-control', 'public, max-age=300')

It would be very cumbersome to apply the code above to every single route. We will create a caching middleware that automatically sets the right header for the entire Express app.

Setting cache control middleware in Express

The simplest caching middleware would set the cache header for the entire application as follows:

app.use(function (req, res, next) {
  res.set('Cache-control', 'public, max-age=300')
  next()
})

The problem with the above approach is that you would also be caching responses to all other methods (e.g. POST, PUT...), which is probably not something you want to do. So, it is far better to write a slightly more tailored caching middleware function, which you can extend further later:

const setCache = function (req, res, next) {
  // here you can define the period in seconds; this one is 5 minutes
  const period = 60 * 5

  // you only want to cache GET requests
  if (req.method === 'GET') {
    res.set('Cache-control', `public, max-age=${period}`)
  } else {
    // for all other requests set strict no-caching parameters
    res.set('Cache-control', 'no-store')
  }

  // remember to call next() to pass on the request
  next()
}

// now call the new middleware function in your app

app.use(setCache)

Your Express app will now send cache-control headers configured to your specifications. The next step is to set up the content delivery network.
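
For reference, here is a minimal sketch of how the pieces fit together in a single file. The route paths and port are placeholders, not part of the original example, and the caching logic is the same as the setCache middleware above, written inline:

const express = require('express')
const app = express()

// cache GET responses for 5 minutes, disable caching for everything else
app.use((req, res, next) => {
  res.set('Cache-control', req.method === 'GET' ? 'public, max-age=300' : 'no-store')
  next()
})

// GET responses may be cached by any cache for 5 minutes
app.get('/', (req, res) => res.send('home page'))

// POST responses are never cached
app.post('/contact', (req, res) => res.send('form received'))

app.listen(3000)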

Forget server-side cache, use a CDN!

There are some good articles on server-side caching, but with cache-control headers in place you do not need all that hassle! For most use cases, using a CDN to leverage cache-control is much leaner, cheaper, faster and more resilient than building your own server-side cache mechanism.

For example, if you are already running your Express app on AWS, you can consider using their CDN, CloudFront, which has points of presence in over 70 cities across more than 35 countries.

All other major cloud providers offer their own CDNs, and if you are looking for an independent option, you could consider CloudFlare. In many cases it is easier to integrate if you go with the stack from your main cloud provider, but beyond convenience, the basic principles behind the major CDNs are the same.

The key parameter to set in your CDN configuration is the caching behaviour. You want your CDN to honour the cache-control header values that your app sends rather than overwrite them with a generic setting. That way, you have precise control of the caching behaviour right from your app, down to the level of a single route, as sketched below.
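
As a rough sketch of that route-level control (the paths and durations are only examples), you can keep the app-wide setCache middleware and override the header in individual route handlers, because a later res.set() call replaces the value set earlier:

// assumes the Express app and setCache middleware from the earlier examples

// the home page changes often, so keep the short 5 minute default from setCache
app.get('/', (req, res) => res.send('home page'))

// blog posts rarely change, so override the default with a one day cache
app.get('/blog/:slug', (req, res) => {
  res.set('Cache-control', 'public, max-age=86400')
  res.send('blog post')
})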

CDN magic

When you serve your Express app through a CDN, the first request passes through to the app, and the content delivery network captures the response and puts it into its cache. Any future requests to the same path received within the expiry period will be served directly from the CDN without going to the server.

This is where the real magic happens when it comes to the ability to serve massive spikes in traffic. Imagine you hit the jackpot with a blog post that gets 3 million views in 12 hours. If you set max-age to one hour, your server will only need to serve the popular post 12 times in those 12 hours. The remaining 2,999,988 page views will be served from the CDN!

JAMSTACK competition

The JAMSTACK approach has been gaining a lot of popularity recently. I am a great fan of tools such as Gridsome for generating static pages. However, for a blog or a content catalogue with more frequent updates, I lean more towards using very small virtual servers with a very aggressive caching approach leveraging a CDN. The advantage of that approach is that the entire website does not need to be regenerated after every content update, and it is still very inexpensive overall thanks to the benefits of the CDN.

Full Cache-Control specification

The following are more details on the Cache-Control specification, its syntax and examples. You can use these to fine-tune your caching strategy, speed and server load.

This article is primarily about using Cache-Control in responses; however, some directives can also be used in requests. For clarity, below is a full breakdown of directives and whether they can be used in requests and/or responses.

Directive          Request   Response
max-age            yes       yes
max-stale          yes       -
min-fresh          yes       -
no-cache           yes       yes
no-store           yes       yes
no-transform       yes       yes
only-if-cached     yes       -
must-revalidate    -         yes
public             -         yes
private            -         yes
proxy-revalidate   -         yes
s-maxage           -         yes

Syntax

There are only a few simple rules to follow to make sure you write valid Cache-Control headers:

  • directives are not case sensitive
  • use a comma to separate directives
  • some directives have an optional argument
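
For example, the following header (the values are arbitrary) follows all three rules: lower-case directive names, commas between directives, and an optional argument on max-age:

cache-control: public, no-transform, max-age=3600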

Setting cacheability

public

Cache-Control: public

The response may be stored by any cache, including shared caches such as proxies and CDNs.

private

Cache-Control: private

The response may only be stored by a browser's private cache; shared caches (such as a CDN) must not store it.

no-cache

Cache-Control: no-cache

The response may be stored by any cache; however, the stored response must always be validated with the origin server before it is used.

no-store

Cache-Control: no-store

The response may not be stored by any cache. Use this directive to prevent caching responses. There is no need to set any other directives, for example, max-age=0 is already implied.
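
In an Express app, a natural place for no-store is any route that returns personalised or sensitive data. The /account path below is just an illustrative example, assuming an app set up as in the earlier sections:

// never cache personalised content, anywhere
app.get('/account', (req, res) => {
  res.set('Cache-control', 'no-store')
  res.send('account details')
})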

Expiration time

max-age

Cache-Control: max-age=<seconds>

The maximum amount of time a response is considered fresh. This directive is relative to the time of the request.

s-maxage

Cache-Control: s-maxage=<seconds>

Only applicable to shared caches, where it overrides max-age or the Expires header. Ignored by private caches.
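
For example, the following header (durations are arbitrary) lets browsers keep a response for 5 minutes while allowing a CDN or other shared cache to keep it for an hour:

cache-control: public, max-age=300, s-maxage=3600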

max-stale

Cache-Control: max-stale[=<seconds>]

Used by clients in a request header to indicate the client will accept a stale response. An optional value in seconds indicates the upper limit of staleness the client will accept.

min-fresh

Cache-Control: min-fresh=<seconds>

Used by clients in a request header to indicate the client wants a response that will still be fresh for at least the specified number of seconds.

stale-while-revalidate

Cache-Control: stale-while-revalidate=<seconds>

Accept a stale response, while asynchronously checking in the background for a fresh one. The seconds value indicates how long the client will accept a stale response.

stale-if-error

Cache-Control: stale-if-error=<seconds>

Accept a stale response if the check for a fresh one fails. The seconds value indicates how long the client will accept the stale response after the initial expiration.
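
Combining these with max-age gives a resilient profile. In the arbitrary example below, the response is fresh for 5 minutes, may be served stale for up to a minute while a fresh copy is fetched in the background, and may be served stale for up to a day if the origin is unreachable:

cache-control: max-age=300, stale-while-revalidate=60, stale-if-error=86400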

Revalidation and reloading

must-revalidate

Cache-Control: must-revalidate

Caches may not use the stale copy without successful validation on the origin server.

proxy-revalidate

Cache-Control: proxy-revalidate

Shared caches may not use the stale copy without successful validation on the origin server. Ignored by private caches.

immutable

Cache-Control: immutable

Clients will not send revalidation requests for unexpired resources to check for updates, even when the page is refreshed.
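
In Express, immutable is most useful for fingerprinted static assets (e.g. app.3f9a1c.js) whose URL changes whenever their content changes. As a sketch (the directory and duration below are just examples), express.static can set it together with a long max-age:

// serve fingerprinted assets with a long-lived, immutable cache
app.use('/assets', express.static('public/assets', {
  maxAge: '1y',    // becomes max-age in the Cache-Control header
  immutable: true  // adds the immutable directive
}))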

Other directives

no-transform

Cache-Control: no-transform

Prevents shared caches or proxies from modifying responses. For example, proxies are prevented from optimising images for slow connections.

only-if-cached

Cache-Control: only-if-cached

Used by clients in requests to prevent using the network to get the response. The cache will either return a stored resource or respond with a 504 status code.
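
On the client side, here is a rough sketch of sending this directive with fetch (the path and function name are made up); note that browsers only allow only-if-cached together with mode: 'same-origin':

// ask the browser's HTTP cache only; the request never goes to the network
async function loadFromCacheOnly (path) {
  return fetch(path, {
    cache: 'only-if-cached',
    mode: 'same-origin' // required when using only-if-cached
  })
}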
