AWS

Building a static site with S3, Cloudfront and Jekyll


Brian

06 Feb 2022 • 19 min read

Back in 2018 I experimented with a static site, hosted on AWS S3. After some years, I have decided to preserve the blog basics here in one megapost, and shut down the original S3 site, freeing up the domain.

I believe that this sort of hosting has its place, but for a small blog, the maintenance overhead just can't compete with THIS sort of blog - Ghost on a light Docker infrastructure with a friendly CMS. So here it is, all seven posts in their original glory.

1: Starting out HTML5 for a serverless website

Mar 19, 2018

The first Word

So where to start with all this? I’ve finally found myself on a holiday of a few days between jobs, where I hope to seed a project I have been thinking about for a long time. Everyone knows WordPress, but after a few years as an AWS cloud architect, running a server just seems archaic in this day and age of serverless and S3.

I am embarking on an overly ambitious project which might never come to fruition but consists of:

Create a serverless blog with basic html to host this text.
Create a serverless blog platform accessible to the world at advanced consumer level

So without further ado, I will start today with part 1 to see if I can’t get a basic blog with https on an S3 bucket.

Starting Out Html-5 For a Serverless Website

A basic HTML page does not need a whole lot, but I will be starting with a basic template from the HTML5 boilerplate to allow a framework for future responsiveness and structure.

First up, I create a folder on my local workstation, which is a Windows 10 laptop. In that folder, I save this file as a simple text file with an .md suffix. I am using Atom as a text editor, not because it’s great for HTML editing but because I intend to use it later. I am typing this in Markdown, with the “markdown preview” package enabled so I can see the formatted version as I type.

The file name is not so important at this stage, but when complete I will export the MD file as an html file, where the name will become meaningful. I will call it starting-out-html5-for-a-serverless-website.html

I visit the HTML5 Boilerplate site and click the big download button presented to me. At the time of writing it’s version 6.0.1. Unpacking in my local project folder yields a tree with typical files to get up and running. Opening the index.html in a local browser should give “Hello world! This is HTML5 Boilerplate.”

A basic website repository

The engineer in me wants to just upload this and get on with things but the developer on my shoulder is whispering about checking in code. Sigh.

I already have a Bitbucket account, as I want to keep the source files private, so I will use that. It is quite powerful on the free level. You could use Bitbucket, GitHub or nothing at all.

I first move my small collection of files to a different place on my local drive, as I don’t want them in the current OneDrive folder. The version “sync” should be controlled by Git, not by OneDrive. This story is not about the repo but in summary, I

Create a blank repo in Bitbucket
Go to a local folder (not cloud synced) like “dev”
Git clone the repo to that folder
Copy in the files
Add a .gitignore with entry for *.zip files and .gitignore itself
Add, Commit and push
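
The steps above can be sketched as a small shell function. The repo URL, the "dev" folder under the home directory and the commit message are placeholders of mine, not values from the original setup, and nothing runs until setup_repo is called.

```shell
# Sketch only: REPO_URL, ~/dev and the commit message are hypothetical.
REPO_URL="${REPO_URL:-git@bitbucket.org:your-user/your-repo.git}"

# Write the two ignore entries described above into <dir>/.gitignore.
write_gitignore() {
  printf '%s\n' '*.zip' '.gitignore' > "$1/.gitignore"
}

# Clone the blank repo, copy the site files in, then commit and push.
# $1 = absolute path of the folder holding the unpacked boilerplate files
setup_repo() {
  src="$1"
  mkdir -p "$HOME/dev" && cd "$HOME/dev" || return 1
  git clone "$REPO_URL" site && cd site || return 1
  cp -R "$src"/. .
  write_gitignore .
  git add .
  git commit -m "Add HTML5 Boilerplate files"
  git push -u origin HEAD
}
```

Because .gitignore lists itself, the ignore file stays local and is never pushed, which matches the list above.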

Prepare First pages

We are going to copy the core HTML files and make our basic local version. Create a folder in the repo area called “distro” or similar. If you are not familiar with HTML just copy over all the contents of the boilerplate including subdirectories. If your tree is right it should look something like: your-dev-repo-dir/distro/index.html and all the associated boilerplate files and directories at that level.

Index.html tweaks

I edit the index.html (I prefer Atom for this).

Set the Title, between the provided tags. I am going with “WordCasa Blog”

Change the default text, I am going with “Hello world! This is the WordCasa blog.”

Also I would like to add the publication timestamp. I am not completely sure on this syntax and method so might come back to it. To keep it moving I am going with this format, in the “head” section.

<meta property="article:published_time" content="2018-03-19T01:57:10+12:00" />
<meta property="article:modified_time" content="2018-03-19T01:57:40+12:00" />
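
A timestamp in that shape (ISO 8601 with a colon in the offset) can be generated rather than typed. This is a minimal sketch; the helper names are my own, and it prints the current local time rather than the values shown above.

```shell
# Print the current local time as e.g. 2018-03-19T01:57:10+12:00.
# date's %z gives an offset like +1200; sed inserts the colon to make +12:00.
iso_timestamp() {
  date +"%Y-%m-%dT%H:%M:%S%z" | sed 's/\(..\)$/:\1/'
}

# Emit a ready-to-paste meta tag for the <head> section.
published_meta() {
  printf '<meta property="article:published_time" content="%s" />\n' "$(iso_timestamp)"
}
```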

As the index.html will be a home page, I will also add a link to our first page, this one!

Other file changes

Because I don’t want any crawling to take place yet, I also edit robots.txt and change to look like this.

# www.robotstxt.org/

User-agent: *
Disallow: /

None of the logos or finesse is in there yet but I believe this is good enough for deploying our hello world web site so let’s move on to the next page, on setting up our domain, bucket, cloudfront and deploying.

2: Building a basic S3 website with domain, cloudfront and https

Mar 19, 2018

There are many articles of this nature around so I will not go into too deep a dive in all places.

Domain

Unless you want your website to have a very long s3 bucket name or a generic cloudfront distribution name, you will need a domain. The requirements for provider are completely up to you, the main prerequisite being that your provider allows you to add a CNAME entry to divert web traffic to the S3 URL.

It’s not an endorsement but I have been using gandi.net as I saw them once referred to by Amazon, and so far they seem good all around.

If planning on using Cloudfront the domain should be ready in advance.

S3 Bucket

You will need:

An Amazon AWS account
A favourite region, in my case I will choose Oregon, being us-west-2
An IAM user with adequate permissions to write to a/your bucket
The AWS CLI installed
CLI Credentials for writing to that bucket

Bucket creation

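Here’s a tip that might save you a lot of trouble. When making an S3 bucket for website purposes, the bucket name must match the site's hostname if DNS is going to point straight at the S3 website endpoint. So if your URL is going to be https://example.com then the bucket should be called example.com. Behind Cloudfront the name is not strictly tied to the domain, but matching it keeps things simple.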

The AWS documentation on setting up an S3 web site is valuable here.

I am creating a blog.wordcasa.com bucket with default permissions for now. Two details are worth noting down: the ARN (permissions -> bucket policy), needed when granting permissions for CLI uploads and downloads, and the endpoint (properties -> static website hosting), needed to configure Cloudfront.
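
As a rough CLI equivalent of the console steps, something like the following could work. It is a sketch, assuming the blog.wordcasa.com bucket and us-west-2 region from above, index.html and 404.html as the index and error documents (both ship with HTML5 Boilerplate), and a public-read policy along the lines of the AWS static website documentation. The function names are my own. Newer accounts may also need the bucket's Block Public Access settings relaxed before the policy is accepted.

```shell
# Print a bucket policy that lets anyone read objects in bucket $1.
public_read_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicReadGetObject",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::$1/*"
  }]
}
EOF
}

# Create the bucket, enable website hosting and attach the policy.
# $1 = bucket name, $2 = region. Not run automatically; call it yourself.
create_site_bucket() {
  policy_file=$(mktemp)
  public_read_policy "$1" > "$policy_file"
  aws s3 mb "s3://$1" --region "$2" &&
  aws s3 website "s3://$1/" --index-document index.html --error-document 404.html &&
  aws s3api put-bucket-policy --bucket "$1" --policy "file://$policy_file"
  status=$?
  rm -f "$policy_file"
  return $status
}

# Example:
#   create_site_bucket blog.wordcasa.com us-west-2
```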

Uploading files to the bucket

You can use the AWS console GUI to upload files. I am however going to do it “the hard way” and use the CLI in order that I can do an aws s3 sync, uploading only the files that have changed. This needs IAM CLI credentials created with appropriate permissions.

Now I cd into the distro directory and:

aws --profile myprofile s3 sync . s3://blog.wordcasa.com

Then watch with satisfaction as my files are uploaded. In future we can add the --delete parameter to clean up removed files.
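
Since the delete option removes remote files, it can be worth previewing a sync first. A small sketch, assuming the same myprofile profile and bucket as above; the deploy wrapper and the AWS_BIN and SITE_PROFILE overrides are my own naming, while --dryrun is the CLI's own preview flag.

```shell
# Wrapper around `aws s3 sync`; extra arguments are passed straight through.
# AWS_BIN and SITE_PROFILE are hypothetical overrides with sensible defaults.
deploy() {
  dir="$1"; bucket="$2"; shift 2
  "${AWS_BIN:-aws}" --profile "${SITE_PROFILE:-myprofile}" s3 sync "$dir" "s3://$bucket" "$@"
}

# Preview what would be uploaded and deleted, without changing anything:
#   deploy . blog.wordcasa.com --delete --dryrun
# Then do it for real:
#   deploy . blog.wordcasa.com --delete
```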

If you have carried out the correct process as per the AWS documentation you should be able to see your web site running off an open URL with the S3 long name, such as “http://blog.wordcasa.com.s3-website-us-west-2.amazonaws.com”. There will be warnings that the bucket is publicly accessible, which is to be expected.

Cloudfront

Using Cloudfront is not mandatory and the AWS docs do give instructions on applying a custom domain directly to the S3 bucket. This is HTTP only so make yourself aware of what that means. It’s likely the cost is very low for low traffic sites but that research is up to you.

My opinion is that Cloudfront might not necessarily increase speed, and the main reason to use it is to facilitate HTTPS.

SSL/HTTPS Certificate issue

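To get hold of an AWS certificate for your unique domain name, in the same account as the bucket, apply on the console with either DNS or Email validation. It is in the Security, Certificate Manager section. Note that Cloudfront only accepts certificates issued in the us-east-1 (N. Virginia) region, so switch to that region before applying, whichever region the bucket is in. Once you have it validated, the certificate should appear in Cloudfront ready to be attached.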

You can and should combine multiple domains within the certificate, as a minimum to cover www prefixes. For example:

blog.wordcasa.com
*.blog.wordcasa.com
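
For reference, the same request can be made from the CLI. This is a sketch assuming DNS validation, the two names above, and the us-east-1 region that Cloudfront requires for its certificates. The function name and the AWS_BIN override are mine.

```shell
# Request a certificate covering a hostname and its wildcard.
# $1 = hostname, e.g. blog.wordcasa.com
# The region is fixed to us-east-1 because Cloudfront reads certificates from there.
request_site_cert() {
  "${AWS_BIN:-aws}" acm request-certificate \
    --region us-east-1 \
    --domain-name "$1" \
    --subject-alternative-names "*.$1" \
    --validation-method DNS
}

# Example:
#   request_site_cert blog.wordcasa.com
```

The command returns the certificate ARN; the DNS validation records still have to be added at the domain provider before the certificate is issued.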

Configuring a Cloudfront distribution

We can configure Cloudfront to send traffic to the S3 bucket, with the key addition of SSL certificates. Note this key procedure: https://docs.aws.amazon.com/AmazonS3/latest/dev/website-hosting-cloudfront-walkthrough.html

While the distribution is processing, you can take the provided DNS name and add it to the domain zone file config.

If the provided cloudfront DNS name is d3dvxuaexample.cloudfront.net, then create a CNAME record that points to it.
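
In zone file terms the record might look like the line printed below. A sketch only: the 10800 second TTL is an arbitrary value of mine, and your provider's editor may want the host relative to the zone (blog) rather than fully qualified.

```shell
# Print a zone-file CNAME line.
# $1 = host label, $2 = Cloudfront DNS name, $3 = TTL in seconds (optional)
# The default TTL of 10800 is an assumption, not a recommendation.
cname_record() {
  printf '%s %s IN CNAME %s.\n' "$1" "${3:-10800}" "$2"
}

cname_record blog d3dvxuaexample.cloudfront.net
# prints: blog 10800 IN CNAME d3dvxuaexample.cloudfront.net.

# Once DNS has propagated, this should print the Cloudfront name:
#   dig +short CNAME blog.wordcasa.com
```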

3: Managing the templates and text - the Jekyll alternative

Mar 20, 2018

So I have an up and running web site on S3, but of course using pretty raw HTML it looks awful. I’m interested in the infrastructure behind a web deployment platform, and less of the templating management, so a utility to pick up the heavy lifting for that would be ideal.

About Jekyll

Jekyll has been on my radar for a while, but as it does have a slight hump to get started (the package management) I have not yet tried it. Until today that is.

What if I could use something like Jekyll to manage the serverless page templating, but obfuscate the command line work behind a consumer friendly web interface? It could work.

System requirements

Let’s check out the Jekyll system requirements. The first line says:

“GNU/Linux, Unix, or macOS”

Hmm ok, I am on a Windows laptop. I could spin up a dev env on AWS, but hark, what is this?

https://jekyllrb.com/docs/windows/#installation

https://docs.microsoft.com/en-gb/windows/wsl/about

Nice. Linux on Windows without a VM. How Microsoft has changed!

Jekyll with Ruby on bash on Ubuntu on Windows

That’s a mouthful. Without going into too much detail, following the above links gets us back to the main Jekyll install process. The new site should be built from Linux under the /mnt/c tree, such as /mnt/c/Users/winuser/repos/jekyll-wordcasa-blog.

Eventually running a localhost server on http://127.0.0.1:4000/ yields the Jekyll welcome page. I then manage to add the Linux embedded folder as a project folder in Atom.

Day-to-day handling seems to consist of updating files and then running a Jekyll build, which regenerates the _site folder.
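
For orientation, that loop boils down to a handful of commands from the Jekyll quickstart. A sketch, assuming Ruby is already installed under WSL; the site name matches the folder above, and the run helper with its DRY_RUN switch is my own addition so the commands can be printed instead of executed.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# One-off: install Jekyll and Bundler, then scaffold a new site in ./<name>.
new_site() {
  run gem install jekyll bundler &&
  run jekyll new "$1"
}

# Regenerate _site from the source files (run inside the site folder).
build_site() { run bundle exec jekyll build; }

# Serve the site on http://127.0.0.1:4000/ and rebuild when files change.
serve_site() { run bundle exec jekyll serve; }

# Example:
#   new_site jekyll-wordcasa-blog && cd jekyll-wordcasa-blog && serve_site
```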

Replacing the quickstart files with my basic wordcasa blog

I need to patch in the initial raw html versions into the Jekyll file structure. At the top level we have index.md for the home page, a _posts directory and a (I’ll create) _drafts directory.

This isn’t a Jekyll walkthrough so I’ll jump forward.

Deploy the Jekyll version to S3

I am keen to get this up so will finish up this file, get it looking bearable in the local version and upload it with the CLI. I will manually upload the robots.txt afterwards to request no-crawling until I figure out how to include it.

4: Getting images embedded into Jekyll posts

Mar 21, 2018

To get my blog looking slightly less raw I’ll need to get images in there. I will start with the Jekyll image embed instructions on writing posts. I will try and set this up under an S3 folder. What gets interesting now is that instead of dumping everything in with image1.jpg, image2.jpg etc I will need some way to associate the images with the posts. Perhaps a post-id is needed.

URLs and permalinks

As I take a closer look at the categories I notice that it’s built the articles under a tree which is not what I wanted, so to start I replace the categories with tags in the front matter.

http://127.0.0.1:4000/development/platform/2018/03/19/starting-out-html5-for-a-serverless-website.html

Now the URLs are of the form:

http://127.0.0.1:4000/2018/03/19/starting-out-html5-for-a-serverless-website.html

Close, but that date in front is not great. So let me have a look at the docs to see if we can remove it. I find a page on permalinks. While the permalink can be set in every front matter, I can also set it globally in the _config.yml. I update with an entry like permalink: /:title:output_ext and... voilà:

http://127.0.0.1:4000/starting-out-html5-for-a-serverless-website.html
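
Adding the setting by hand is a one-line edit; scripted, it might look like this. A sketch, assuming the config file is _config.yml in the site root, ends with a newline and has no permalink entry yet; the function name is mine.

```shell
# Append the global permalink pattern to a Jekyll config file,
# unless a top-level permalink key is already present.
# $1 = config file (default _config.yml)
set_permalink() {
  config="${1:-_config.yml}"
  if grep -q '^permalink:' "$config"; then
    echo "permalink already set in $config" >&2
    return 1
  fi
  printf 'permalink: /:title:output_ext\n' >> "$config"
}
```

Jekyll reads _config.yml only at startup, so the dev server needs a restart before the new URLs show up.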

It’s not such a big deal as crawling is still “off”, but it’s important should I ever want to get this into a Google index.

Getting those images embedded

Without worrying too much about the post association for now, I will first try a basic embed. I need to know first if I should drop assets directly into the _site/assets tree or create a higher level path. So test 1: try and create _site/assets/img and see what happens.

I try, and on build as I suspected it promptly drops the img directory so I will have to dig deeper. I create an assets/img folder in the root instead - one level up - and drop in a screen shot .jpg.

Another build and it’s still there. So I will try and embed it following their example which should be similar to ![My helpful screenshot](https://blog.wordcasa.com/assets/screenshot.jpg)

So here goes:

I restart the dev web server with the --drafts option as this article is a work in progress and it works!

Tuning the image

The image is 121kB, generally a no-no to be over 100kB. So with the use of a randomly downloaded optimiser (my Creative Cloud sub ran out), I crunch it down to 66kB. It’s now noticeably rougher but this is mainly an academic exercise. It’s showing as “892px × 844px (scaled to 740px × 700px)”, if I drag the browser window smaller it scales down nicely so high fives to the Jekyll devs.
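
To keep an eye on that 100kB rule of thumb, a quick check along these lines could be run before each upload. A sketch, assuming images live under assets/img as above; the function name and the default threshold are my own choices.

```shell
# List image files larger than a size limit (in KiB, default 100).
# $1 = folder to scan (default assets/img), $2 = limit in KiB
oversized_images() {
  find "${1:-assets/img}" -type f \
    \( -name '*.jpg' -o -name '*.jpeg' -o -name '*.png' \) \
    -size +"${2:-100}"k
}

# Example:
#   oversized_images assets/img 100
```

No output means every image is within the limit.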

Eventually this could be an automated process where a cloud service can crunch down images to a smaller size.

Associating images with posts

Thinking about the usage, it would make sense to not associate images 1:1, so that images can be used in more than one post. The downside is that images will end up in a massive pile, usually viewed in alphabetical order.

Classic blog apps like WordPress make use of a media library view where images are usually viewed in date order. I also must consider that for larger deployments, there is a lookup overhead to retrieve S3 objects. Where fewer than 100 requests per second exist, Amazon claims there is no problem. While Cloudfront may take some of the load, to ensure maximum scalability, good practice should be followed as postulated in the AWS documentation.

5: Adding a favicon to my S3 Jekyll site

Mar 22, 2018

Another random direction, I’d like to get a favicon in place.

For many years my tool of choice for making banners and icons has been Photoshop, but I don’t currently have a license. As I lamented the lack of any decent equivalent in my Office 365 suite, I noticed Paint 3D, the replacement for classic paint. The interface is not great, but after some fiddling I did manage to get a favicon sized 32x32 picture, in both white and transparent background versions.

I copy in my 32x32 PNG to the top level as filename favicon.ico and restart the dev web server.

After a bit of hacking, I managed to get it loading by saving as favicon.png, then adding a link to the index.md like: <link rel="icon" href="/favicon.png" type="image/png" />

I need to get this into the header template, which is part of the theme and does not seem to be parameterised from what I can tell. This is a bit convoluted but according to the Jekyll docs is the default way to alter a theme file.

Creating a local head.html

First I had to find my default theme files. The commands suggested on that page did not work, so I resorted to good old Linux find command. In my case the Minima theme files were in /var/lib/gems/2.4.0/gems/minima-2.4.0/_includes

The contents of the head.html are as follows:

<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1"><!-- Begin Jekyll SEO tag v2.4.0 -->
<title>Adding a favicon to my S3 Jekyll site | WordCasa</title>
<meta name="generator" content="Jekyll v3.7.3" />
<meta property="og:title" content="Adding a favicon to my S3 Jekyll site" />
<meta property="og:locale" content="en_US" />
<meta name="description" content="Another random direction, I’d like to get a favicon in place. For many years my tool of choice for making banners and icons has been Photoshop, but I don’t currently have a license. As I lamented the lack of any decent equivalent in my Office 365 suite, I noticed Paint 3D, the replacement for classic paint." />
<meta property="og:description" content="Another random direction, I’d like to get a favicon in place. For many years my tool of choice for making banners and icons has been Photoshop, but I don’t currently have a license. As I lamented the lack of any decent equivalent in my Office 365 suite, I noticed Paint 3D, the replacement for classic paint." />
<link rel="canonical" href="https://blog.wordcasa.com/adding-favicon-to-jekyll-s3-site.html" />
<meta property="og:url" content="https://blog.wordcasa.com/adding-favicon-to-jekyll-s3-site.html" />
<meta property="og:site_name" content="WordCasa" />
<meta property="og:type" content="article" />
<meta property="article:published_time" content="2018-03-22T14:13:05+13:00" />
<script type="application/ld+json">
{"description":"Another random direction, I’d like to get a favicon in place. For many years my tool of choice for making banners and icons has been Photoshop, but I don’t currently have a license. As I lamented the lack of any decent equivalent in my Office 365 suite, I noticed Paint 3D, the replacement for classic paint.","@type":"BlogPosting","url":"https://blog.wordcasa.com/adding-favicon-to-jekyll-s3-site.html","headline":"Adding a favicon to my S3 Jekyll site","dateModified":"2018-03-22T14:13:05+13:00","datePublished":"2018-03-22T14:13:05+13:00","mainEntityOfPage":{"@type":"WebPage","@id":"https://blog.wordcasa.com/adding-favicon-to-jekyll-s3-site.html"},"@context":"http://schema.org"}</script>
<!-- End Jekyll SEO tag -->
<link rel="stylesheet" href="/assets/main.css"><link type="application/atom+xml" rel="alternate" href="https://blog.wordcasa.com/feed.xml" title="WordCasa" /></head>

It’s small enough that I can just make a new head.html - and a _includes directory - and paste it in.
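
The copy step can be sketched like this. The theme path is the one from my machine above and will differ elsewhere, and the function name is hypothetical.

```shell
# Locate the theme's _includes folder, for example:
#   find / -type d -path '*/minima-*/_includes' 2>/dev/null

# Copy one include from the theme into the site's own _includes folder,
# where Jekyll picks it up in preference to the theme's copy.
# $1 = theme _includes dir, $2 = site root, $3 = file name (default head.html)
override_include() {
  mkdir -p "$2/_includes" &&
  cp "$1/${3:-head.html}" "$2/_includes/"
}

# Example, using the path found above:
#   override_include /var/lib/gems/2.4.0/gems/minima-2.4.0/_includes . head.html
```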

The first thing I need to do is check that this change has not caused the site to implode! I rebuild and so far everything looks ok.

Adding the icon reference

I find my command again and update the second <link> in the <head> so it looks like:

<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1"><!-- Begin Jekyll SEO tag v2.4.0 -->
<title>Adding a favicon to my S3 Jekyll site | WordCasa</title>
<meta name="generator" content="Jekyll v3.7.3" />
<meta property="og:title" content="Adding a favicon to my S3 Jekyll site" />
<meta property="og:locale" content="en_US" />
<meta name="description" content="Another random direction, I’d like to get a favicon in place. For many years my tool of choice for making banners and icons has been Photoshop, but I don’t currently have a license. As I lamented the lack of any decent equivalent in my Office 365 suite, I noticed Paint 3D, the replacement for classic paint." />
<meta property="og:description" content="Another random direction, I’d like to get a favicon in place. For many years my tool of choice for making banners and icons has been Photoshop, but I don’t currently have a license. As I lamented the lack of any decent equivalent in my Office 365 suite, I noticed Paint 3D, the replacement for classic paint." />
<link rel="canonical" href="https://blog.wordcasa.com/adding-favicon-to-jekyll-s3-site.html" />
<meta property="og:url" content="https://blog.wordcasa.com/adding-favicon-to-jekyll-s3-site.html" />
<meta property="og:site_name" content="WordCasa" />
<meta property="og:type" content="article" />
<meta property="article:published_time" content="2018-03-22T14:13:05+13:00" />
<script type="application/ld+json">
{"description":"Another random direction, I’d like to get a favicon in place. For many years my tool of choice for making banners and icons has been Photoshop, but I don’t currently have a license. As I lamented the lack of any decent equivalent in my Office 365 suite, I noticed Paint 3D, the replacement for classic paint.","@type":"BlogPosting","url":"https://blog.wordcasa.com/adding-favicon-to-jekyll-s3-site.html","headline":"Adding a favicon to my S3 Jekyll site","dateModified":"2018-03-22T14:13:05+13:00","datePublished":"2018-03-22T14:13:05+13:00","mainEntityOfPage":{"@type":"WebPage","@id":"https://blog.wordcasa.com/adding-favicon-to-jekyll-s3-site.html"},"@context":"http://schema.org"}</script>
<!-- End Jekyll SEO tag -->
<link rel="stylesheet" href="/assets/main.css"><link type="application/atom+xml" rel="alternate" href="https://blog.wordcasa.com/feed.xml" title="WordCasa" /><link rel="icon" href="/favicon.png" type="image/png" />
</head>

It works - my highly sophisticated Favicon is now in place on the site.

It’s unfortunate that it’s convoluted in this way, and a mild concern that my local head.html could fall behind if the master gets upgraded.

This process has inspired me for the next post. Which actually won’t be a Post, it will be a Page. If I am going to have Jekyll as an underlying framework this sort of task is far beyond consumer level, so I will need to maintain a list of requirements to “consumerise” it.

Picking up the Jekyll page about pages, this proves remarkably easy.

In the jekyll root directory I create a new file called roadmap.md and for front matter:

---
layout: page
title: Roadmap
---
Hello world.

Build and - that’s all. The Roadmap appears in the top nav bar of my site.

6: Moving the WordCasa repo to AWS CodeCommit

Apr 2, 2018

Deciding to relocate code

So far I have been using Bitbucket to store the site source files. It’s great at free level, with a nice interface. But it occurred to me that as an AWS hosted solution I should be drinking more AWS Kool-Aid, so CodeCommit all the way. A quick check of the costing structure at the time of writing shows that for the first 5 users it’s free for:

Unlimited repositories
50 GB-month of storage
10,000 Git requests/month

I should be well within that! So onward.

Create new CodeCommit repo

I am sticking with Oregon region for now. I am hoping everything we need can be done with Lambda to keep costs down. At a later stage if I find a “server” is necessary (e.g. running Jekyll builds, image optimisation), it would likely be Fargate so I might have to revise to us-east-1, but we’ll see.

I briefly toy with the idea of using Cloudformation, but as I don’t imagine I will be managing the repo “infrastructure” in code I decide to go manual. With Oregon region selected I go to the console and create.

I’ll call it jekyll-wordcasa-blog. Clicking create makes the repo in a flash. I take the default options, and select the button to have an SNS topic created as well.

As a brief aside, this quick screen dump is 87kB which is somewhat awful, but I may come back and rebuild when I have an image optimisation strategy.

Selecting the Create Topic button gives a popup where a similar name can be given, in my case codecommit_jekyll_wordcasa_blog.

Saving creates the Topic and returns this screen.

Selecting Manage Subscriptions allows you to add a suitable email subscription address. For more, the CodeCommit docs are here.

Following the email link cycle will subscribe you to the repo SNS Topic.

Connecting to the new repository

If you are using the AWS console there are a number of steps to follow to get the repo in sync, including using IAM to issue Git credentials and then cloning to local.

Note that where images are “off” it won’t upload them to a repo so you will need to make sure you have an alternative backup strategy. I am in two minds about this now as there is a healthy free space allocation, and the images themselves do have versions when optimised. For now I will keep offline copies.

There are a number of options with IAM users, the use of screen to upload and / or git command line, and transition code from your existing repo, if you have one. Frankly it’s a bit of a hack-fest, which is one of the key objectives of mine: to get a user friendly front end onto it.
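
One way the move could look from the command line is sketched below. It assumes the us-west-2 region and repo name from above, an existing local clone whose origin still points at the old host, and Git credentials already issued in IAM. The function names are mine.

```shell
# Build the HTTPS clone URL for a CodeCommit repo.
# $1 = region, $2 = repo name
codecommit_url() {
  printf 'https://git-codecommit.%s.amazonaws.com/v1/repos/%s\n' "$1" "$2"
}

# Repoint origin at CodeCommit, then push all branches and tags.
# Run from inside the existing local clone; Git prompts for the IAM Git credentials.
move_to_codecommit() {
  git remote set-url origin "$(codecommit_url "$1" "$2")" &&
  git push origin --all &&
  git push origin --tags
}

# Example:
#   move_to_codecommit us-west-2 jekyll-wordcasa-blog
```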

At this point I will go away to load my repo, and digress back to the blog building project.

7: Getting the Jekyll site ready for crawling

Apr 2, 2018

New Jekyll site basics for public consumption

Some time back I had a go at creating a news web site, a directory web site, lots of sites. The game was a tough one, the world was all about ranking on Google, where a utopian world of millions of readers and thousands of bucks of ad revenue awaited.

Unfortunately a billion other people were trying to do exactly the same thing. End result: No result. Luckily, I could not really care less about this blog and its Google ranking, it’s more of a tech journal of the WordCasa hobby project. But still, it will need to be at least cleaned up a bit before getting indexed.

A homepage image

A blog homepage image. Who cares on a tech blog right? Well, I will dig one out anyway. One I took myself, to remind me what the outside world looks like, and that I should go visit it now and then.

I edit the “index.md” file in my repo root and - I think the picture will be fine underneath the tagline - so append the insert text.

I use “Paint 3D” (nothing but the best here) to resize and make it banner shaped, then crunch it through some freemium software to make it as small as poss. 36.9kB.. nice.

My index.md now contains:

---
# You don't need to edit this file, it's empty on purpose.
# Edit theme's home layout instead if you wanna make some changes
# See: https://jekyllrb.com/docs/themes/#overriding-theme-defaults
layout: home
---
Cloud. Blogging. Serverless. Brain dumps. The WordCasa blog.

![Image of path along a beach](https://blog.wordcasa.com/assets/img/beachside-path-banner-1080x202.jpg)

Email address

I am really not sure about putting an email address on the footer of a web site inviting spam from the whole bot-scraper-verse, so I will remove it.

It’s a matter of finding the email line in the _config.yml and commenting it out.
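
Done by hand it is a one-character edit, but for completeness here is how it might be scripted. A sketch, assuming email is a top-level key in _config.yml; the function name is my own, and it writes through a temporary file rather than relying on in-place sed options that differ between platforms.

```shell
# Comment out a top-level key in a YAML file by prefixing it with "# ".
# $1 = key name, $2 = file (default _config.yml)
comment_out_key() {
  cfg="${2:-_config.yml}"
  tmp_file=$(mktemp) || return 1
  sed "s/^$1:/# $1:/" "$cfg" > "$tmp_file" &&
  cat "$tmp_file" > "$cfg"
  status=$?
  rm -f "$tmp_file"
  return $status
}

# Example:
#   comment_out_key email _config.yml
```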

RSS Feed

I am a fan of RSS in general, but the paragraphs in the output are not quite configured how I like.

After some research I find that the excerpt can be delimited by an arbitrary string, so I am going to declare a <!--end-excerpt--> marker to insert in every article. It’s a bit manual but it has a double use, for a blogroll preview heading in future.

I add an entry to the config file like so:

excerpt_separator: "<!--end-excerpt-->"

Then manually find an appropriate place in each article. Though I will make a note to myself in the issues Trello to one day find a better way to automatically do this.

If I forget it’s not the end of the world, it just means there is a massive “excerpt”.
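
Until that better way exists, a quick check can at least catch the forgotten ones. A minimal sketch, assuming posts live in _posts and use the marker configured above; the function name is hypothetical.

```shell
# List posts that do not contain the excerpt marker.
# $1 = posts folder (default _posts)
posts_missing_excerpt() {
  grep -L -F -e '<!--end-excerpt-->' "${1:-_posts}"/*
}

# Example:
#   posts_missing_excerpt _posts
```

Any file name printed is a post that will end up with the massive "excerpt".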

Robots.txt

Thus far I have had a basic robots.txt containing a site wide “don’t crawl”. Now I cast the site to the bots.. and crawlers.. and spammers.. by reverting the Disallow back to Allow, like so.

# www.robotstxt.org/

User-agent: *
Allow: /

💬

Your comments are welcome. Please COMMENT and read those of others on the Bluesky Post for this article.
