Backing Up with Git and the Cloud


My next post was intended to be about streamlining a workflow by using various LaTeX authoring tools and a bibliography manager, but last week my Mac died. It was admittedly an older computer (from 2011), but it had its graphics/motherboard replaced just a year and a half ago due to a recall, and I had high hopes that it would last at least until the end of the year. Unfortunately, that didn’t happen: while I was at a conference over the weekend it gave up the ghost, leaving me with just a USB drive that had my presentation, plus the last-minute changes to a Praat script that I had managed to upload to GitHub the night before.

Fortunately, I have been saving up towards a new computer, since I knew this would probably happen. It’s had a good run, but when I got back to Singapore, replacing the mother/logic board turned out not to be worth it. I’d be better off putting the money I would spend fixing it towards a new computer.

What I find remarkable, however, is that I literally lost nothing except some emails that are stored on the email server anyhow, and maybe some intermediate changes to the Praat script. This is partly because I saw that the computer was starting to fail and backed up all my files with Time Machine on an external hard drive, but partly also because my workflow lends itself to backing up.

As per my previous post regarding Git, it is important to have a workflow that encourages you to snapshot changes. This is what the Git “commit” command does; fail to integrate it into your Git workflow at your peril. Another way to ensure that your files are backed up is to use online cloud storage like Google Drive and Dropbox. If you’re an academic, you probably use both of these services already, but maybe your backup system is a bit ad hoc. In my case, the death of my computer has suggested to me that I need to use multiple cloud storage services.

So here’s what I am currently doing using a temporary user account on my wife’s computer:

  1. Copy my backed-up Git repository (of files, documents, etc.) from Time Machine to Google Drive (I had to change the settings so that Google Drive didn’t automatically try to create duplicate Google Docs).

  2. Create an upstream repository in Dropbox, and push all changes in the Google Drive repository to the Dropbox repo.

  3. Make changes in the Google Drive repository, committing/pushing as I go.
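As a rough sketch, the setup in steps 1 and 2 might look something like the following. The folder names and the remote name `dropbox` are assumptions made for illustration, and temporary directories stand in for the real Google Drive and Dropbox folders so that running it touches nothing important. A freshly created repository also stands in for the one restored from Time Machine.

```shell
#!/bin/sh
set -eu

# Hypothetical locations: in real use these would be the actual sync folders,
# not a temporary sandbox.
SANDBOX=$(mktemp -d)
WORK="$SANDBOX/GoogleDrive/documents"      # working repository
BACKUP="$SANDBOX/Dropbox/documents.git"    # bare repository that receives pushes

# Stand-in for step 1: create a small repository instead of restoring one.
mkdir -p "$WORK"
git -C "$WORK" init -q
echo "draft" > "$WORK/notes.txt"
git -C "$WORK" add notes.txt
git -C "$WORK" -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Initial snapshot"

# Step 2: create a bare repository in the Dropbox folder and register it
# as a remote of the working repository (the name "dropbox" is arbitrary).
mkdir -p "$(dirname "$BACKUP")"
git init -q --bare "$BACKUP"
git -C "$WORK" remote add dropbox "$BACKUP"

# Push the current branch and remember the remote as its upstream.
BRANCH=$(git -C "$WORK" rev-parse --abbrev-ref HEAD)
git -C "$WORK" push -q -u dropbox "$BRANCH"
```

A bare repository is used on the Dropbox side because it holds only Git's history and no checked-out files, which suits a location that is only ever pushed to.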

The main reason this works for me is that I generally don’t leave Dropbox and Google Drive running; I only start these services manually when I want to sync with the online server. From what I have read online, there may be issues with running Git commands or editing files in a repo while cloud storage is syncing, so I get around this potential issue by simply doing my edits offline and then syncing cloud storage manually.

So as part of my workflow:

  1. Edit files
  2. Commit changes
  3. Push to Dropbox
  4. Open Google Drive and Dropbox apps to sync changes with the cloud.
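Steps 2 and 3 of this routine could be wrapped in a small shell function, sketched below. The function name `snapshot` and the default remote name `dropbox` are hypothetical, and the sketch assumes the remote has already been added to the repository. Step 4 stays manual: the sync apps are opened only after the push has finished.

```shell
#!/bin/sh
set -eu

# snapshot REPO MESSAGE [REMOTE]
# Commit every change in REPO, then push the current branch to REMOTE.
# REMOTE defaults to the hypothetical remote name "dropbox".
snapshot() {
    repo=$1
    message=$2
    remote=${3:-dropbox}

    # Stage new, modified, and deleted files.
    git -C "$repo" add -A

    # 'git diff --cached --quiet' exits 0 when nothing is staged,
    # so a commit is made only when there is something to record.
    if git -C "$repo" diff --cached --quiet; then
        echo "Nothing to commit in $repo"
    else
        git -C "$repo" commit -q -m "$message"
    fi

    # Push the current branch to the branch of the same name on the remote.
    git -C "$repo" push -q "$remote" HEAD
}
```

It would be called as, for example, `snapshot "$HOME/Google Drive/documents" "Revise draft"`, where the path is only a placeholder for wherever the working repository lives.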

As I go, I am also discovering more files that need to be on a cloud server to avoid data loss. This is mitigated to some extent by regular Time Machine backups, but I’m beginning to wonder if I also need to invest in expanded cloud storage. The only other drawback of this process is that since I’m backing up to multiple cloud services, my files take up twice as much space on my hard drive as they otherwise would. I’m still thinking about how to deal with that issue…