Backing up your photos using the Google Photos API

Sat 29 Dec 2018

I've been a happy user of Google Photos for the last year or so. Prior to that, I was persevering with Apple Photos (formerly iPhoto) but the macOS app had become very buggy and even basic tasks like sharing albums with other Apple users were very hit-and-miss (good luck trying to share with a non-Apple user). The Google Photos web and iOS apps are great, and most people have a Google account so sharing is really convenient.

I trust Google to do a good job of looking after my photos but it seems sensible to have an offline backup too. One way to do that would be to enable the Google Photos folder in Google Drive. You could then install Backup and Sync to sync the photos to your computer and back them up as normal (e.g. using Time Machine). A small snag is that I don't have enough disk space on my laptop to download all of my photos, and Backup and Sync is also a bit crashy.

Fortunately, Google announced a new Google Photos API back in September. The API supports everything I need to build a basic backup program, as well as uploading photos and creating albums, which might be useful for future projects. Unlike Backup and Sync, I can run this program on a Raspberry Pi and back up everything to an external HDD.

Google Photos API

The API is quite straightforward and nicely documented. There are example REST requests as well as Java and PHP (in 2018!) code samples but sadly nothing for Go or Python. However, there is a Go client and its usage is similar enough to the Gmail API that you can just adapt the Gmail getting started guide.

The Go client documentation is also missing some examples, but the API is really well thought out and, because Go is statically typed, all of the function arguments and return values are explicit and easy to follow.

Go

Go is a fun language for a small project like this one - I'm still definitely a beginner but it's easy enough to pick up as Go's syntax is quite simple and familiar. Having all the documentation in one place and formatted consistently is great too.

I do have a few small gripes. The compiler will refuse to compile your code if there's an unused import or variable - this makes sense for production code as it might indicate a typo or bug, but it can make prototyping and debugging quite painful. Using an editor with decent Go integration (I use Visual Studio Code) makes this bearable. Error handling in Go is very explicit which makes the code easy to follow although not necessarily that readable (if err != nil everywhere). I prefer the Result type in Rust along with the ? operator as it's more succinct and clearer that either a value or an error is returned (in Go you could return both or neither).

For parsing command line arguments, I am using docopt which is easy to use but seems somewhat unmaintained. Originally I wanted to use the built-in flag package but it is quite opinionated about showing -foo instead of --foo for command line options (it accepts both forms when parsing, but the generated help only uses the single-dash form) and doesn't really support positional arguments. For future projects, I will probably use Cobra instead.

For dependencies, I am using dep although Go Modules seem to be the future (experimental in Go 1.11). This project only has a handful of dependencies, so the choice of tool probably doesn't matter too much - nevertheless dep was a pleasure to use. It's quite similar to Ruby's Bundler - there's a human-editable Gopkg.toml where you specify acceptable version ranges and a generated Gopkg.lock with the exact versions of each of your project's dependencies.

Go has pretty good out-of-the-box support for cross-compilation which is useful as I'd like to build binaries for Linux (AMD64 and ARM) and macOS. Gox is a thin wrapper that makes this even easier - it supports a few extra features like building binaries for each platform in parallel.
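Go's built-in cross-compilation is controlled by the GOOS and GOARCH environment variables. Here is a rough sketch of doing by hand what a wrapper like gox automates. The target list, the dist/ output directory, the run and build_all helper names and the DRY_RUN switch are all assumptions for illustration, not what this project actually uses.

```shell
# Hypothetical target list: Linux (AMD64 and ARM) and macOS.
targets="linux/amd64 linux/arm darwin/amd64"

# Illustrative helper: print the command instead of running it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Build the package in the current directory once per target.
build_all() {
  for target in $targets; do
    os=${target%/*}    # part before the slash, e.g. linux
    arch=${target#*/}  # part after the slash, e.g. amd64
    # GOOS and GOARCH select the platform to compile for.
    run env GOOS="$os" GOARCH="$arch" \
      go build -o "dist/google-photos-backup_${os}_${arch}" .
  done
}
```

Running DRY_RUN=1 build_all prints the three go build commands without executing them. Calling build_all on its own runs them one after another, whereas gox would build them in parallel.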

CircleCI

I used CircleCI to automate the build and release process. Unsurprisingly the CircleCI workflow consists of two jobs: a build job to compile the binaries and a release job to create a GitHub release and upload the binaries.

The build job is run on every git push and cross-compiles binaries for Linux and macOS using gox. The binaries are stored as artefacts in CircleCI to make it easy to try out pre-release versions. The binaries are also added to the workflow's workspace for use later on by the release job.

The release job runs when a new tag is pushed. It uses ghr (another Go tool) to create a GitHub release (if it doesn't already exist) and upload the binaries.

Uh-oh, where are all of these releases coming from?

[Image: a useless machine]

I had a bit of a mishap with ghr, but first a Bash interlude (foreshadowing). In Bash it's good practice to include set -euo pipefail at the top of your script - "strict mode". The -e option will stop your script if a command exits with a non-zero exit status (zero is success and anything else is normally used to indicate some sort of failure). The -u option treats a reference to an unset variable as an error. The -o pipefail option does something similar to -e for pipes: a pipeline fails if any command in it fails, not just the last one. Here's an example of -e:

# 1. Without the -e option
bash -c 'false; echo reachable'
# Prints "reachable" and exits with a 0 exit status

# 2. With the -e option
bash -e -c 'false; echo unreachable'
# Prints nothing and exits with a 1 exit status

In the first example, the Bash script continued despite the false command "failing" (the false command always exits with a 1 exit status). The -e and -o pipefail options are really useful for catching unexpected errors and stopping your script before it does any damage.
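The -o pipefail option can be demonstrated in the same way. In this small sketch, false | true stands in for a pipeline whose first command fails, and the variable names are only there for illustration:

```shell
# Without -o pipefail: the pipeline's exit status is that of its last command (true)
without_pipefail=$(bash -c 'false | true; echo "status $?"')
echo "$without_pipefail"
# Prints "status 0"

# With -o pipefail: the failure of false is reported as the pipeline's exit status
with_pipefail=$(bash -o pipefail -c 'false | true; echo "status $?"')
echo "$with_pipefail"
# Prints "status 1"
```

Combined with -e, the second form would stop the script at the failing pipeline instead of carrying on.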

Back to the ghr issue: in CircleCI I had something similar to this:

#!/bin/bash

set -euo pipefail

ghr "$(git describe --abbrev=0 --tags)" releases/

git describe --abbrev=0 --tags prints the most recent Git tag reachable from the current commit. In CircleCI, however, the git command was failing because I hadn't cloned the repo. Unfortunately, set -e doesn't catch a failing command substitution when it is used as an argument: the exit status that counts is ghr's, not git's. The command being run was therefore effectively ghr "" releases/. Since the tag is empty, ghr helpfully creates its own tag. This triggers another CircleCI build for the newly created tag, the git command fails in that build too, and yet another Git tag is created. Oops! Fortunately, I noticed fairly quickly and was able to cancel one of the errant builds to break the loop.

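Cloning the Git repo before running ghr fixed the issue, but a safer (and more readable) approach is to assign the tag to a variable first. The exit status of a plain assignment is that of the command substitution, so set -e stops the script if git fails: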

#!/bin/bash

set -euo pipefail

tag=$(git describe --abbrev=0 --tags)
ghr "$tag" releases/
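The difference between the two scripts can be reproduced without ghr or CircleCI. In this sketch, false stands in for the failing git command and echo stands in for ghr; the variable names are just for illustration:

```shell
# Substitution used as an argument: echo still runs and succeeds, so the
# failure of false is silently discarded and the "tag" is empty.
arg_status=0
arg_output=$(bash -c 'set -euo pipefail; echo "tag is [$(false)]"') || arg_status=$?
echo "output: $arg_output, exit status: $arg_status"
# Prints "output: tag is [], exit status: 0"

# Substitution used in an assignment: the assignment takes the exit status
# of false, so set -e stops the script before echo is reached.
assign_status=0
assign_output=$(bash -c 'set -euo pipefail; tag=$(false); echo "tag is [$tag]"') || assign_status=$?
echo "output: $assign_output, exit status: $assign_status"
# Prints "output: , exit status: 1"
```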

"When Bash Scripts Bite" does a better job of going into the caveats of set -e. While I'm on the subject of Bash errors, shellcheck is a great tool for catching similar issues - just not this one!

google-photos-backup

The final program is a single binary named google-photos-backup. Running google-photos-backup /path/to/backup will download all your photos and their metadata to /path/to/backup/photos and all your album metadata to /path/to/backup/albums. The directory structure and JSON metadata files are designed to be easy to parse programmatically rather than being human-readable. It would be relatively easy to write the corresponding restore program or perhaps a photo migration program for another service.

google-photos-backup is available to download over on GitHub at rupert/google-photos-backup.