December 29th, 2016
Using CloudFront for Your Entire WordPress Site
By Andrew Clark

CloudFront, a global content delivery network (CDN) provided by AWS, allows you to increase the performance of your website, reduce server load, and scale up rapidly to handle spikes in traffic by leveraging the power of Amazon’s network. It can be used to deliver all aspects of a website, including static assets, streaming video, and even dynamic content such as pages generated by WordPress.

When it comes to WordPress, most of the resources I’ve come across discuss using CloudFront only for static assets and by placing them into S3. While this approach works for many and allows you to significantly reduce traffic to the origin, it also requires making some changes to your website in order to serve these requests from a separate domain/subdomain. Depending on the complexity of your site, you may find yourself having to update WordPress templates and theme files, or installing additional plugins to look for and correct the URLs. It also may require a change in workflow to ensure that all resources are pushed to S3 as website changes are made.

In this post, I’d like to discuss an alternative approach: placing your entire website behind a CloudFront distribution. Doing so requires no change to your website code and it allows you to also cache your website’s dynamic content: the pages themselves. Additionally, it gives you the option of using Amazon’s Web Application Firewall (WAF), which can be associated with a CloudFront distribution, and in some cases it allows you to continue serving your website requests even when your origin is down. I’ll also be including a CloudFormation template in a future update that can be used to automate the deployment of the configuration I outline below.

But first, some CloudFront definitions:

  • Distribution: an endpoint you can send traffic to. You’ll point your domain to the distribution via DNS.
  • Origin: where your server resides. This is the hostname serving your WordPress website and can be in AWS or not. Note that you may need a new way of referring to this server as your main domain will be pointing to the CloudFront distribution and you don’t want to point it to itself.
  • Behavior: a URL pattern and its associated caching behavior.

In our example, we will create one distribution, one origin for our WordPress server, and multiple behaviors for the various WordPress URLs, each pointing to the same origin. This allows us to control how each request is cached with a greater degree of granularity. For example, you may choose to cache your homepage for 24 hours if you’re not making frequent updates. Images, CSS, JavaScript, fonts, and other resources that don’t tend to change very often might be cached for a week. WordPress admin pages should never be cached, as you’ll want an accurate reflection of your site’s settings and a secure session that’s specific to the logged-in user. These details are handled by the individual behaviors.

Headers, Cookies, & Query Strings

Requests made to your website first flow to Amazon’s “edge” locations, which receive the HTTP request. It is here that CloudFront checks whether the requested object already exists in the edge cache. If so, it is sent back to the user. If not, CloudFront forwards the request to your origin, gets a response, and passes it back to the user while keeping a copy in the cache if appropriate.
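To make that flow concrete, here is a minimal Python sketch of a single edge location. It is a toy model, not how CloudFront is actually implemented: the `EdgeCache` class, the `origin` function, and the fixed TTL are all names and simplifications I made up for illustration.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class EdgeCache:
    """Toy model of one edge location (hypothetical, for illustration only)."""

    def __init__(self, fetch_from_origin: Callable[[str], str], ttl_seconds: float) -> None:
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds
        # cache key -> (response body, expiry timestamp)
        self._store: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str, now: Optional[float] = None) -> Tuple[str, str]:
        """Return (body, "Hit" or "Miss") for the requested object."""
        now = time.time() if now is None else now
        cached = self._store.get(key)
        if cached is not None and cached[1] > now:
            # Object is in the edge cache and still fresh: the origin is not contacted.
            return cached[0], "Hit"
        # Not cached (or expired): forward to the origin and keep a copy if appropriate.
        body = self.fetch_from_origin(key)
        if self.ttl_seconds > 0:
            self._store[key] = (body, now + self.ttl_seconds)
        return body, "Miss"


# Usage: a stand-in origin that records every request it actually receives.
origin_calls: List[str] = []


def origin(path: str) -> str:
    origin_calls.append(path)
    return f"<html>{path}</html>"


edge = EdgeCache(origin, ttl_seconds=86400)
first = edge.get("/about/", now=0)        # nothing cached yet
second = edge.get("/about/", now=60)      # served from the edge
third = edge.get("/about/", now=86401)    # cached copy has expired
```

In this sketch the origin only sees the first and third requests; the second one never leaves the edge.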

In deciding whether or not the object is already in cache, CloudFront has to determine if users are asking for the same thing. For example, should the following requests return the same content?

  • /about/
  • /about/?ref=amazon

If the about page is already in cache, you may think it’s fine to have CloudFront return the cached copy without going back to the origin. But doing so prevents the origin from ever seeing the request and may prevent you from tracking something like an affiliate referral if that’s tracked on the backend (rather than through JavaScript). So there are two things to consider here: 1) how things are cached and 2) when requests are sent back to the origin. HTTP headers, cookies, and query strings are the three pieces that need to be accounted for.
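One way to think about this is as a "cache key" built from whichever parts of the request you tell CloudFront to pay attention to. The Python sketch below is only a mental model under my own assumptions: `cache_key` and its parameters are hypothetical names, and it ignores details such as wildcard cookie names.

```python
from typing import Iterable, Mapping, Tuple
from urllib.parse import urlsplit


def cache_key(
    url: str,
    headers: Mapping[str, str],
    cookies: Mapping[str, str],
    *,
    use_query_string: bool,
    header_whitelist: Iterable[str] = (),
    cookie_whitelist: Iterable[str] = (),
) -> Tuple:
    """Build a simplified cache key from the parts of the request we chose to forward."""
    parts = urlsplit(url)
    # The query string only distinguishes objects if we asked for it to be considered.
    query = parts.query if use_query_string else ""
    lowered = {name.lower(): value for name, value in headers.items()}
    kept_headers = tuple(
        (name.lower(), lowered.get(name.lower(), ""))
        for name in sorted(header_whitelist, key=str.lower)
    )
    kept_cookies = tuple(
        (name, cookies[name]) for name in sorted(cookie_whitelist) if name in cookies
    )
    return (parts.path, query, kept_headers, kept_cookies)


request_headers = {"Host": "example.com", "User-Agent": "some browser"}

# Query strings ignored: both URLs map to the same cached object.
plain = cache_key("/about/", request_headers, {}, use_query_string=False,
                  header_whitelist=["Host"])
referral = cache_key("/about/?ref=amazon", request_headers, {}, use_query_string=False,
                     header_whitelist=["Host"])

# Query strings considered: the referral URL becomes a separate object.
referral_distinct = cache_key("/about/?ref=amazon", request_headers, {},
                              use_query_string=True, header_whitelist=["Host"])
```

With query strings ignored, `plain` and `referral` are equal, so the origin would never see `?ref=amazon` once `/about/` is cached. With query strings considered, the two keys differ and the first referral request goes back to the origin.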

In the case of WordPress, we have the following files/folders to think about (as of 4.7):

  • /wp-content/*: Most of the static assets and theme files will likely be here
  • /wp-admin/* and /wp-login.php*: The admin pages
  • /wp-signup.php: Used for visitor signups if your site supports it
  • /wp-trackback.php: Blog post trackback functionality
  • /xmlrpc.php: The WordPress API
  • /wp-cron.php: WordPress scheduled task functionality
  • Everything else: Homepage, sub pages, blog posts, etc.
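As a rough illustration of how these paths map onto behaviors, the Python sketch below picks the first matching pattern for a request and falls back to the default. It uses Python's `fnmatchcase` as a stand-in for CloudFront's own pattern matching, which is an approximation on my part (for example, `fnmatch` also gives special meaning to square brackets). The `PATH_PATTERNS` list and `pick_behavior` function are hypothetical names.

```python
from fnmatch import fnmatchcase

# Patterns are checked in order; anything that matches none of them
# falls through to the default behavior, shown here as "*".
PATH_PATTERNS = [
    "/wp-admin/*",
    "/wp-login.php*",
    "/wp-signup.php",
    "/wp-trackback.php",
    "/xmlrpc.php",
    "/wp-cron.php",
    "/wp-content/*",
]


def pick_behavior(request_uri: str) -> str:
    """Return the first pattern matching the request path, or "*" for the default."""
    # Only the path is matched; the query string plays no part in choosing a behavior.
    path = request_uri.split("?", 1)[0].lstrip("/")
    for pattern in PATH_PATTERNS:
        # Leading slashes are stripped so patterns work with or without one.
        if fnmatchcase(path, pattern.lstrip("/")):
            return pattern
    return "*"
```

For example, `pick_behavior("/wp-content/themes/mytheme/style.css?v=2")` returns `"/wp-content/*"`, while `pick_behavior("/about/")` returns `"*"`.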

Let’s assume that we want to cache all of our pages for a day and our static assets for a week while not caching the admin and other WordPress-internal URLs mentioned above. Let’s hop over to the AWS Console and create a Web CloudFront distribution by clicking on “Get Started” under “Web”:

When you first create a distribution, the AWS console has you create one origin and the default behavior while you’re at it. You can later add additional origins and behaviors. In our case, we’ll enter the following:

  • Origin Domain Name: the hostname of the WordPress server. Remember, this should be different from the final domain name of your website, which will be pointed at the distribution. If it’s in AWS, you can use the load balancer or the hostname of the EC2 instance (IP addresses are not allowed). If it’s not, you may want to create a new subdomain such as web.example.com and point it to the IP address of the server.
  • Origin Protocol Policy: in our example, we will assume that our origin server is only using HTTP and that it listens on the default port of 80.

Next, we’ll enter the default behavior settings. In our case, the defaults will apply to all URLs not specifically mentioned above and will therefore refer mostly to pages and posts.

  • Viewer Protocol Policy: Since we are using HTTP, we want to make sure viewer (user) requests work and are not converted to HTTPS, an option CloudFront provides if you want to enforce HTTPS.
  • Allowed HTTP Methods: We’ll want to make sure that all HTTP methods are allowed so that forms can be filled out (POSTs).
  • Cached HTTP Methods: Optionally, check “OPTIONS” so that responses to OPTIONS requests are cached as well. These requests are used in some cases for AJAX and font requests that use CORS.
  • Forward Headers: I find that Host and Origin are good headers to forward to the origin, so we whitelist them. And, as mentioned above, they are also used by CloudFront to make the caching decision. Host ensures that if we have multiple websites running on the same server, they won’t get tangled together from a caching perspective. We want CloudFront to look at the domain name and the URL (example.com/request), not just the URL (/request) to determine what it has in cache. Origin helps with CORS so that requests sent from external websites are inspected to identify the request’s origin.

Next, the caching TTLs:

  • Object Caching: “Use Origin Cache Headers” is a bit misleading: selecting it still results in a default 24-hour caching period unless the origin provides its own caching headers. I always choose Customize to know exactly what I’m getting.
  • Default TTL: 86400 seconds is one day. Out of the box, our pages will be cached for 24 hours unless our origin (nginx, Apache, PHP, a WordPress caching plugin, etc.) specifies otherwise.
  • Min TTL and Max TTL: If our origin does specify caching, CloudFront will force it to fall within this range. In this case, we are allowing the website to dictate that there are certain pages or resources that should never be cached (0 seconds) and some that can be cached for up to a year.
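The interaction between these three values can be sketched in a few lines of Python. This is a simplified model based on the description above, not CloudFront's complete rule set: it only looks at the `Cache-Control` header (not `Expires`), the function names are hypothetical, and the one-year maximum of 31536000 seconds is my own assumption for "up to a year".

```python
from typing import Optional


def _requested_seconds(cache_control: Optional[str]) -> Optional[int]:
    """Return the lifetime the origin asked for, or None if it expressed no preference."""
    if not cache_control:
        return None
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.strip().lower()] = value.strip()
    # The origin is saying "do not cache this": treat it as a request for 0 seconds.
    if any(name in directives for name in ("no-cache", "no-store", "private")):
        return 0
    # s-maxage targets shared caches such as a CDN, so it is checked before max-age.
    for name in ("s-maxage", "max-age"):
        if directives.get(name, "").isdigit():
            return int(directives[name])
    return None


def effective_ttl(
    cache_control: Optional[str],
    min_ttl: int = 0,
    default_ttl: int = 86400,      # one day
    max_ttl: int = 31536000,       # assumed value for "up to a year"
) -> int:
    """Seconds the edge keeps an object, given the origin's Cache-Control header."""
    requested = _requested_seconds(cache_control)
    if requested is None:
        # Origin said nothing about caching: fall back to the Default TTL.
        return default_ttl
    # Origin did specify caching: force it into the [Min TTL, Max TTL] range.
    return max(min_ttl, min(requested, max_ttl))
```

Under these assumptions, a page sent with no caching headers is kept for a day, `Cache-Control: max-age=3600` is honored as one hour, and `Cache-Control: no-cache` results in no caching because the minimum is 0.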

WordPress uses various cookies for both admin functionality and for visitor commenting. Including these cookies ensures that default functionality continues to work in cases where these are present.

  • Forward Cookies: Whitelist
  • Whitelist Cookies: See below

  • Query String Forwarding and Caching: Forward all, cache based on all. This is a pretty flexible choice that ensures that search functions work and it also lets you more easily invalidate objects if necessary by simply appending something like ?v=2 to the end of an image URL that may have recently changed and is still stale in the cache.
  • Compress Objects Automatically: Yes. If your origin uses gzip compression, CloudFront will simply pass the compressed responses back to the client untouched. If not, it’ll compress responses where it makes sense, further increasing performance.

  • Alternate Domain Names (CNAMEs): Enter the domain names that will be pointed to the distribution. These usually include the zone apex and the www subdomain.

  • Supported HTTP Versions: HTTP/2, HTTP/1.1, HTTP/1.0. By including HTTP/2, we can take advantage of some additional performance capabilities, but note that most modern browsers require HTTPS in order to use HTTP/2, so this won’t make a difference in our case until we start using HTTPS. In the meantime, CloudFront will fall back to HTTP/1.1 if it has to, so it’s safe to leave this on. Note also that HTTP/2 is supported from the client to CloudFront, but not yet from CloudFront to the origin.

Click “Create Distribution” to finish. It’ll take about 15 minutes for the distribution to be “Deployed”. While we’re waiting, we can add the additional behaviors. We’ll head over to the Behaviors tab and start adding additional entries for each of the paths we described above. For example, the wp-admin folder:

Notice that the Path Pattern doesn’t have to start with a slash. Once again, we allow all HTTP methods since we’ll be submitting forms when interacting with the admin, and we turn off all caching features since we want a real-time view of our admin and a secure login. We forward all headers, cookies, and query strings.

At this point, we have a pretty good setup. Everything is cached for 24 hours except for requests related to the admin, which are live with no caching. To fine-tune things just a little more, we can add additional behaviors for the remaining WordPress URLs listed above. The login, sign up, trackback, xmlrpc, and cron pages can be set up identically to the admin behavior.

The wp-content/* behavior usually only needs GET, HEAD, and OPTIONS since static assets aren’t posted to. There are some themes and plugins, however, that use AJAX files within this folder and therefore require POST. To be safe, allow all HTTP methods. Forward the Host and Origin headers, forward query strings, and turn off cookie forwarding. You may also want to give it a different default TTL than the default behavior (one week, for example).
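For reference, here is the whole set of behaviors written out as plain Python data. This is just a summary of the settings described in this post, not a CloudFormation template or an AWS API payload. The `Behavior` class and its field names are made up for illustration, and the cookie whitelist is recorded only as a mode because the individual cookie names are not reproduced here.

```python
from dataclasses import dataclass
from typing import Tuple

ALL_METHODS = ("GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE")
DAY = 86400  # seconds


@dataclass(frozen=True)
class Behavior:
    """Hypothetical summary record for one CloudFront behavior."""
    path_pattern: str
    default_ttl: int                 # seconds; 0 means caching is turned off
    headers: Tuple[str, ...]         # ("*",) stands for "forward all headers"
    cookies: str                     # "all", "none", or "whitelist"
    query_strings: bool              # forward (and cache on) query strings
    allowed_methods: Tuple[str, ...] = ALL_METHODS


# Admin and WordPress-internal URLs: no caching, forward everything.
LIVE_PATHS = (
    "wp-admin/*",
    "wp-login.php*",
    "wp-signup.php",
    "wp-trackback.php",
    "xmlrpc.php",
    "wp-cron.php",
)

BEHAVIOR_SUMMARY = (
    *(Behavior(path, 0, ("*",), "all", True) for path in LIVE_PATHS),
    # Static assets: one week, no cookies.
    Behavior("wp-content/*", 7 * DAY, ("Host", "Origin"), "none", True),
    # Default behavior (pages and posts): one day, whitelisted cookies.
    Behavior("*", DAY, ("Host", "Origin"), "whitelist", True),
)

for behavior in BEHAVIOR_SUMMARY:
    print(f"{behavior.path_pattern:<18} ttl={behavior.default_ttl:<7} cookies={behavior.cookies}")
```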

We now have a CloudFront distribution that supports all aspects of our WordPress website, including live admin functionality, 24-hour page and post cache times, one-week static asset caching, and undisturbed internal WordPress functions such as cron and xmlrpc. As always, review your setup to make sure this configuration works for you and look for ways of optimizing further. As mentioned, you can also take advantage of WordPress caching plugins that add additional caching logic within the parameters we’ve set up. Take a look at the Popular Objects tab under CloudFront’s Reports & Analytics to see what your hit/miss ratios look like and to identify any issues.

In a future update, I’ll be including a CloudFormation template that allows you to automate the creation of this configuration. Enjoy!