Migrating my galleries to GitHub and DigitalOcean

I use thumbsup to generate my galleries. That’s worth a whole other post; the short version is that the metadata is written to the images themselves, and then that metadata generates a series of static HTML pages for albums, tags, and categories. The resulting site is easy to host: no database and a minimum of CSS and JavaScript. I’ve used CloudFront and S3 on Amazon for several years but, as I previously mentioned, I’m moving my stuff off Amazon.

GitHub Pages isn’t suitable on its own: I have about 13 GB of images. The generated site structure looks like this:

album/
media/
public/
index.html

The album directory contains the various generated HTML pages for the albums, categories, and tags. It’s all text and about 20 MB. That can go to GitHub. The public directory holds the CSS and JavaScript assets from thumbsup, and it’s another 1.1 MB. That leaves media.

I decided to put the images on DigitalOcean Spaces. I’ve admired DigitalOcean’s documentation for years, though I’ve never used them for anything. Their Spaces offering is very similar to S3, though they built their own solution using Ceph. I’ll say upfront that the pricing, as I’m using it now, isn’t competitive with Amazon S3. It’s $5/month upfront to “subscribe” to Spaces, which gets you 250 GB of storage. That’s $0.02 per GB, which is about what S3 costs, but S3 has no minimum commitment, so with only 13 GB stored I’m paying for capacity I don’t use. Price isn’t the reason I did this, and I may find other uses. Spaces does have CDN support, but it isn’t a full-fledged offering like AWS CloudFront or Cloudflare, so I still needed GitHub for the web hosting.
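
To put rough numbers on that comparison, here is a small sketch of the arithmetic. It uses only the figures quoted above ($5 flat for 250 GB, $0.02 per GB on S3, about 13 GB stored) and deliberately ignores request and bandwidth charges, so treat it as an approximation rather than a real bill.

```shell
# Figures from the post; request and egress charges are ignored (assumption).
storage_gb=13            # approximate size of media/
spaces_flat=5            # Spaces subscription, USD per month
spaces_included_gb=250   # storage included in that subscription
s3_per_gb=0.02           # per-GB storage price used in the post

report=$(awk -v gb="$storage_gb" -v flat="$spaces_flat" \
             -v incl="$spaces_included_gb" -v s3="$s3_per_gb" 'BEGIN {
  printf "Spaces, all 250 GB used: $%.2f per GB\n", flat / incl
  printf "Spaces, %d GB used: $%.2f per GB\n", gb, flat / gb
  printf "S3, %d GB stored: $%.2f per month\n", gb, gb * s3
}')
printf '%s\n' "$report"
```

With those inputs, the effective Spaces price comes out to about $0.38 per GB at my current usage, against roughly $0.26 a month for the same storage on S3.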

To sync the files to Spaces, I used S3cmd, which works with either Spaces or S3. It can be installed with Homebrew, and DigitalOcean has good docs for it. I’ve run into two minor gotchas:

  • Files aren’t publicly readable by default (good), and I need them to be for my use case. I didn’t worry about this before because S3 and CloudFront handled this with an Origin Access Identity. S3cmd supports setting an ACL when synchronizing (--acl-public).
  • Full synchronizations feel slower than when going to S3.
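
For reference, a sync with the public ACL might look something like the sketch below. The bucket name and region are inferred from the Spaces URL used later in this post, credentials are assumed to already be in ~/.s3cfg, and the sync_media wrapper and S3CMD variable are hypothetical names for illustration, not part of S3cmd.

```shell
# Bucket and region inferred from the Spaces URL; adjust to your own setup.
BUCKET="galleries-goodbyeplease"
ENDPOINT="nyc3.digitaloceanspaces.com"
S3CMD="${S3CMD:-s3cmd}"   # overridable, e.g. S3CMD=echo to preview the call

# Hypothetical helper: sync the local media/ directory to the Space and
# mark uploaded objects publicly readable. Extra options are passed through.
sync_media() {
  "$S3CMD" sync \
    --acl-public \
    --host="$ENDPOINT" \
    --host-bucket="%(bucket)s.$ENDPOINT" \
    "$@" \
    media/ "s3://$BUCKET/media/"
}

# Example (not run here): list what would be transferred without uploading.
#   sync_media --dry-run
```

Running with --dry-run first is a cheap way to confirm the source and destination paths line up before a slow full synchronization.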

The final issue to solve is that thumbsup doesn’t support an external CDN out of the box. This was tricky, especially as I like testing my galleries locally and didn’t want to hard-code the CDN there. That would lead to a lot of unnecessary remote requests and mean I couldn’t test new albums until I’d synchronized. Ultimately, I customized the GitHub Actions workflow to run a few sed commands against the artifact before it was deployed:

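- name: Rewrite image URLs
  run: |
    # "|" as the sed delimiter avoids escaping every slash in the URL
    sed -i "s|url('media|url('https://galleries-goodbyeplease.nyc3.digitaloceanspaces.com/media|g" index.html
    # The dots are escaped so that only a literal "../media" matches
    sed -i 's|\.\./media|https://galleries-goodbyeplease.nyc3.digitaloceanspaces.com/media|g' album/*.html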

This step runs after the Checkout step and before the Setup Pages step. It’s hacky, but I’d argue it doesn’t really count as using a regular expression to rewrite HTML, since it’s merely URL replacement. That’s my defense, and anyway it’s working.
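
Since the rewrite only happens in CI, it can be handy to check the substitutions against throwaway files first. The sketch below does that in a temporary directory. The two HTML snippets are made-up stand-ins, not real thumbsup output, and -i.bak is used instead of a bare -i so the same command works with both GNU sed and the BSD sed on macOS.

```shell
CDN="https://galleries-goodbyeplease.nyc3.digitaloceanspaces.com"
workdir=$(mktemp -d)
mkdir -p "$workdir/album"

# Hypothetical markup; the real generated pages may look different.
cat > "$workdir/index.html" <<'EOF'
<div style="background-image: url('media/thumbs/a.jpg')"></div>
EOF
cat > "$workdir/album/trip.html" <<'EOF'
<a href="../media/large/a.jpg"><img src="../media/thumbs/a.jpg"></a>
EOF

# Root page: relative url('media...') references become absolute CDN URLs.
sed -i.bak "s|url('media|url('$CDN/media|g" "$workdir/index.html"
# Album pages: every literal ../media on a line is rewritten (g flag).
sed -i.bak "s|\.\./media|$CDN/media|g" "$workdir"/album/*.html

# Show the rewritten files for inspection.
cat "$workdir/index.html" "$workdir/album/trip.html"
```

If the output still contains a relative media path, the pattern needs adjusting before it goes into the workflow.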