
May, 2019
I've been on something of a home-improvement kick lately and have been spending my side-project time sprucing up this website. I recently added a cloudfront distribution in front of my s3 bucket; my original goal was just to be able to have private sections of the website protected by http basic authentication, but I've made a few other improvements and learned more about using cloudfront effectively.
Last week I ended my blog on a somewhat frustrated note because I didn't have time to completely repair my website after breaking it when I first added a cloudfront distribution. My problem was that I had changed my s3 bucket permissions to only allow access from cloudfront, but I hadn't correctly configured my cloudfront distribution to serve a default object in subdirectories. So, for example, requests to /blog/ would fail because cloudfront didn't serve anything up from that url; you would have to make a request to /blog/index.html.
A number of readers from Last Week in AWS were helpful enough to email me with pointers on fixing this problem (thanks!). I ended up following this guide to create a lambda and attach it to origin requests in cloudfront. It was very easy; I only had a few hangups.
The first was that you have to create the lambda in us-east-1 in order to deploy it to cloudfront as a Lambda@Edge function. The ui isn't entirely clear on this; it just doesn't show the cloudfront option on the left for triggers if you aren't in us-east-1.
The second hangup was around naming my lambda. I had originally named it website-lambda, thinking I would put all my website functionality in one lambda. But there are actually two different cloudfront events I'm going to attach lambdas to (more on that later), so I decided to rename this lambda to rewrite-default-objects. Perhaps I should have named it something to indicate the event type that it responds to, in case I add more behavior in the future. I could see something like alexkudlick.com-origin-requests. I'm still getting a feel for how to effectively use lambdas.
Anyways, the meat of the lambda is the following line:
var newuri = olduri.replace(/\/$/, '\/index.html')
That bit rewrites any request that ends in a slash to end with /index.html so that urls like https://alexkudlick.com/blog/ actually serve up the index.html file in the blog subdirectory from s3.
I only made one addition to that logic. I changed it to:
var newuri = olduri
  .replace(/\/$/, '\/index.html')
  .replace(/\/([^\./]+)$/, '/$1/index.html');
This rewrites any request whose last section does not contain a dot to end with /index.html. With this change, /blog will work just as well as /blog/. The reasoning behind checking for a . was that requests for actual files will have a . in the file name, but directories won't.
As I mentioned at the top, my goal for introducing cloudfront was to be able to have private sections of my website. I found this guide about protecting the website with basic authentication, and it seemed simple enough. That's the reason I changed my s3 bucket permissions: so that the only way to access the website content would be through cloudfront, and I could control access to it with a lambda. Once I had the hang of how to set up a cloudfront distribution, this was pretty easy to configure. I pretty much just used that guy's lambda with one simple addition:
const isPrivate = request.uri.indexOf('/private') !== -1
if (isPrivate && (typeof headers.authorization == 'undefined' || headers.authorization[0].value != authString)) {
This way, only urls that are under /private require authentication, so I can have private and public sections.
One thing I found useful during this process was the "Test" button in the lambda console. I had originally written my condition with an error:
const isPrivate = request.uri.indexOf('/private') !== -1
if (isPrivate && typeof headers.authorization == 'undefined' || headers.authorization[0].value != authString) {
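I was missing parentheses around the || condition. Because && binds more tightly than ||, the condition was evaluated as (isPrivate && header missing) || header mismatch, so for a public url with no authorization header the lambda attempted to access [0] of a header that wasn't present. The test button was very useful to figure out that problem.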
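The same failure can be reproduced outside the lambda console. The sketch below wraps the two versions of the condition in hypothetical functions (`needsAuthBuggy`, `needsAuthFixed`) and feeds them a public request with no authorization header; the `authString` value is a placeholder.

```javascript
// Placeholder value; any string works for the demonstration.
const authString = 'Basic dXNlcjpwYXNz';

// Buggy: && binds tighter than ||, so this reads (A && B) || C.
function needsAuthBuggy(uri, headers) {
  const isPrivate = uri.indexOf('/private') !== -1;
  return isPrivate && typeof headers.authorization == 'undefined' || headers.authorization[0].value != authString;
}

// Fixed: the parentheses make it A && (B || C).
function needsAuthFixed(uri, headers) {
  const isPrivate = uri.indexOf('/private') !== -1;
  return isPrivate && (typeof headers.authorization == 'undefined' || headers.authorization[0].value != authString);
}

// A public page requested without an Authorization header.
const publicRequest = { uri: '/blog/index.html', headers: {} };

try {
  needsAuthBuggy(publicRequest.uri, publicRequest.headers);
} catch (err) {
  // The left side is false, so the right side runs and dereferences undefined.
  console.log('buggy check threw:', err.name);
}
console.log('fixed check:', needsAuthFixed(publicRequest.uri, publicRequest.headers));
```

With the fixed version, `isPrivate` being false short-circuits the whole expression, so the header is never touched for public pages.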
One problem with introducing cloudfront has been the cache behavior. My deploy process uses an npm module I wrote to sync only the files that have changed. I don't go with the default next.js behavior of generating a new directory for javascript files on every build because then every deploy would have to sync every file to s3. I like my deploys to be fast - real fast. The ideal time to deploy is a few seconds at most. I like to do it live, as they say.
But with cloudfront in front of s3, if I make changes to a file like /blog/index.html and sync it up to s3, cloudfront is still going to serve the cached version of the file and clients won't see the new version. I decided that my strategy for handling this would be to invalidate the cloudfront cache for the changed files on deploy. AWS says creating invalidations is not as "cost-effective" as using cache-busting urls, but I don't really want to use cache-busting urls and, unless they're charging me a bunch for each invalidation, I can't see the cost difference being noticeable.
I changed my s3-syncing library to return a list of files that were uploaded and deleted so that I could pipe that into cloudfront to invalidate those paths. I put the cloudfront invalidation in a simple script in my website repo:
require('dotenv').config(); // load environment variables from .env
const AWS = require('aws-sdk');
// Normalize every path to start with a slash.
const canonicalPath = path => path.startsWith('/') ? path : '/' + path;
module.exports = config => {
  const distributionId = config.distributionId;
  const region = config.region;
  const paths = config.paths.map(canonicalPath);
  // Nothing changed, nothing to invalidate.
  if (paths.length) {
    new AWS.CloudFront({ region }).createInvalidation({
      DistributionId: distributionId,
      InvalidationBatch: {
        // Must be unique per invalidation request; a timestamp is enough here.
        CallerReference: Date.now().toString(),
        Paths: {
          Quantity: paths.length,
          Items: paths,
        }
      }
    }, (err, data) => {
      if (err) {
        console.error(err, err.stack); // an error occurred
      } else {
        data.InvalidationBatch.Paths.Items.forEach(path => {
          console.log("Triggered cache invalidation for " + path);
        });
      }
    });
  }
};
Now when I deploy I get helpful output on the uploaded and cache-invalidated files:
Uploaded: private/index.html
Uploaded: _next/static/build-2/pages/_error.js
Uploaded: _next/static/build-2/pages/_app.js
Uploaded: _next/static/build-2/pages/index.js
Uploaded: _next/static/build-2/pages/private.js
Uploaded: _next/static/build-2/pages/the-four-color-theorem.js
Uploaded: _next/static/build-2/pages/blog/ila-react-component-communication.js
Triggered cache invalidation for /_next/static/build-2/pages/private.js
Triggered cache invalidation for /_next/static/build-2/pages/blog/ila-react-component-communication.js
Triggered cache invalidation for /_next/static/build-2/pages/index.js
Triggered cache invalidation for /_next/static/build-2/pages/_app.js
Triggered cache invalidation for /private/
Triggered cache invalidation for /_next/static/build-2/pages/_error.js
Triggered cache invalidation for /_next/static/build-2/pages/the-four-color-theorem.js
Triggered cache invalidation for /private/index.html
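The output lists /private/ alongside /private/index.html even though only private/index.html was uploaded. That makes sense if the rewrite runs on origin requests, since cloudfront would then cache the page under the url the viewer asked for, and /private/ and /private/index.html would be separate cache entries. The script above doesn't show where the extra path comes from, so the following is only a sketch of one way to produce it; `withDirectoryPaths` is a hypothetical helper.

```javascript
const canonicalPath = path => path.startsWith('/') ? path : '/' + path;

// For every changed index.html, also invalidate the directory url that serves it.
// withDirectoryPaths is a hypothetical helper, not part of the script above.
function withDirectoryPaths(files) {
  const paths = new Set(); // a Set drops duplicates and keeps insertion order
  files.map(canonicalPath).forEach(path => {
    paths.add(path);
    if (path.endsWith('/index.html')) {
      // "/private/index.html" also adds "/private/"
      paths.add(path.slice(0, -'index.html'.length));
    }
  });
  return Array.from(paths);
}

console.log(withDirectoryPaths(['private/index.html', '_next/static/build-2/pages/private.js']));
```

If the slash-less form (/private) is also reachable, it would need the same treatment.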
And that's it! That's all I have for this week. Thanks for reading.