The problem at hand
We use a large, external drive to store the household Photos library (Photos as in Apple Photos.app). The external drive is already backed up using one of the popular cloud backup services. I want to have a more convenient backup in case of catastrophic damage to the drive, which would reduce the time it takes to recover all our photos.

Assess the options and devise a plan
Given that the drive is already backed up to the cloud, we have some freedom. We could connect another drive and do some kind of backup; we could set up Time Machine to do something (maybe); we could move the library to the computer's internal drive and use the external drive as the backup; we could occasionally copy the photos to another drive or computer. I happen to have a Synology NAS ready for this type of thing, so I will use that. And instead of relying on my memory, or even an automated reminder, to go sit in front of my computer and do periodic backups, I will automate the backup.

Here's the plan: write a script that backs up the Photos library to the NAS, then schedule that script to run regularly. Easy.
Implementation
1. Enable SSH using keys on Synology
My Synology NAS is obviously underutilized, since I had never bothered to set up SSH keys before. It turns out to be easy, but only because others have already documented the process.

Follow the instructions posted here:
https://forum.synology.com/enu/viewtopic.php?t=126166
Note: at first I did all the steps except setting the home directory permissions. It turns out the post is correct that this step is necessary. You have to change the permissions on the home directory, or else SSH will still just prompt for the user password even though the keys are in place.
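For reference, the fix amounts to making sure nothing in the chain is writable by anyone but the owner, since sshd silently ignores authorized_keys otherwise. Something like the following, run on the NAS (the home directory path is an assumption based on the default Synology layout; adjust the volume and username for your setup):

chmod 755 /var/services/homes/my_name
chmod 700 /var/services/homes/my_name/.ssh
chmod 600 /var/services/homes/my_name/.ssh/authorized_keys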
Once this step is done, you can ssh into the NAS without entering a password.
2. Initial backup of the Photos library
Again, this road has been taken. For example:
https://kevingoedecke.me/2015/08/30/backup-mac-photos-library-with-rsync-over-ssh/
In that example they use rsync, and I don't see any reason that isn't a great way to go for my purposes. I'm taking most of that rsync command, but I'm removing the "--delete" flag just in case, since it removes files from the destination that have disappeared from the source (I have room on the NAS, so there's no need to reclaim space that way).
On the NAS, make a location for the backup:
cd /volume1/some_place
mkdir photos_library_backup
Just as a note for those who haven't looked at this before, it is important to realize that the Photos library is called something like
/Volumes/external_drive/Photos\ Library.photoslibrary
I don't know all the details, but I do know that it isn't really a "file"; it's more like a directory. You can even just cd into it and look around.
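For instance, from a terminal (the exact contents vary with the Photos version, so this is just illustrative):

cd "/Volumes/external_drive/Photos Library.photoslibrary"
ls

Finder hides this structure by treating the library as a single package, but rsync sees an ordinary directory tree, which is what makes this backup approach work.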
The command that I'm using will be something like the one below. The flags: -a is archive mode, -c compares files by checksum rather than by modification time, -h prints human-readable numbers, and -P shows progress and keeps partially transferred files so an interrupted run can resume. The filter rules limit the transfer to the library bundle: the first --include matches the bundle directory itself, the second matches everything inside it, and the final --exclude drops everything else at the top level of the drive.
rsync -Phca --stats --include="/Photos Library.photoslibrary/" --include="/Photos Library.photoslibrary/***" --exclude="*" -e "ssh" "/Volumes/external_drive/" my_name@external_drive.local:/volume1/some_place/photos_library_backup/
3. Set up automation with launchd
The post from Kevin Goedecke uses a shell (sh) script and crontab, but this is not the Apple way. We should use launchd.

I'm no expert at launchd, so I looked at a bunch of examples. Here are a few:
A pretty nice quick overview:
https://stackoverflow.com/questions/132955/how-do-i-set-a-task-to-run-every-so-often
See also:
https://killtheyak.com/schedule-jobs-launchd/
And:
http://www.launchd.info
The bottom line is that you have to choose whether you want the job to run only when you are logged in (a LaunchAgent) or to run as root regardless of who is logged in (a LaunchDaemon). I want it to run as root, since the Photos library might be updated by other users when I am not around. To do that, we just need to put a proper "plist" file into /Library/LaunchDaemons. A plist file is just an XML file, but we have to follow some conventions, which all the links above describe. Since this is a super simple job, my plist file is 20 lines.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>local.photo_backup</string>
    <key>ProgramArguments</key>
    <array>
        <string>/anaconda3/bin/python</string>
        <string>/HOME/Code/escape/photo_backup.py</string>
    </array>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>3</integer>
        <key>Minute</key>
        <integer>0</integer>
    </dict>
</dict>
</plist>
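Once the file is saved as /Library/LaunchDaemons/local.photo_backup.plist, launchd needs to pick it up. Something like this should do it (launchd refuses plists that are not owned by root or that are group/world writable):

sudo chown root:wheel /Library/LaunchDaemons/local.photo_backup.plist
sudo chmod 644 /Library/LaunchDaemons/local.photo_backup.plist
sudo launchctl load /Library/LaunchDaemons/local.photo_backup.plist

After that it will also be loaded automatically at every boot.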
4. Write a script to do the backup
The plist just schedules a job. That job is actually to run a Python script. There are so many things that you could do here, but I'm just interested in doing one thing without incident: run rsync to back up my whole Photos library.
There's not going to be anything fancy, but I think this does the job correctly. I will use subprocess.run() to invoke rsync, exactly as I did for my initial backup in step 2. That could have been put into a shell script, or even a Python script, using very few lines.
I was slightly worried about what would happen if the external drive were unmounted/removed when the script was run. This could definitely happen; I think all the external drives get unmounted when no one is logged in to the system. So I put in a check, just to see if the path to the external drive seems valid:
from pathlib import Path

# local_loc points at the external drive; it is defined near the top of the script
if local_loc.exists() and local_loc.is_dir():
    pass  # proceed with the backup
else:
    raise IOError("Something is wrong")
The script also defines the remote destination for the rsync command, which is just a string (remote_loc below). So the work is basically accomplished by the following. Note that, unlike on the shell command line in step 2, the arguments need no extra quoting here; subprocess passes each list element to rsync verbatim:
import subprocess

prc = subprocess.run(
    ["rsync", "-Phca", "--stats",
     "--include=/Photos Library.photoslibrary/",
     "--include=/Photos Library.photoslibrary/***",
     "--exclude=*",
     "-e", "ssh",
     local_loc, remote_loc],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)
I captured stdout and stderr that way in order to write a useful log file. I am just using the standard library logging module, setting up a log file for each daily run using a simple timestamp:
import logging
from datetime import datetime

# construct the log file name using today's date
logloc = Path("/HOME/logs/")
lognam = datetime.now().strftime('photo_backup_log_%Y%m%d_%H%M.log')
logfil = logloc/lognam
logging.basicConfig(level=logging.INFO, filename=logfil)
def log_subprocess_output(p):
    # decode the captured bytes and write each line to the log
    lines = p.stdout.decode("utf-8").split("\n")
    for line in lines:
        logging.info(line)
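For completeness, here is roughly how the pieces fit together in photo_backup.py. This is a sketch assembled from the snippets above; the paths, the hostname, and the local_loc/remote_loc names are the placeholders already used in this post:

import logging
import subprocess
from datetime import datetime
from pathlib import Path

# rsync source and destination; the trailing slash on the source matters,
# since the filter rules are anchored to it
local_loc = "/Volumes/external_drive/"
remote_loc = "my_name@external_drive.local:/volume1/some_place/photos_library_backup/"

# one log file per run, named with a timestamp
logloc = Path("/HOME/logs/")
lognam = datetime.now().strftime('photo_backup_log_%Y%m%d_%H%M.log')
logging.basicConfig(level=logging.INFO, filename=logloc/lognam)


def log_subprocess_output(p):
    # decode the captured bytes and write each line to the log
    for line in p.stdout.decode("utf-8").split("\n"):
        logging.info(line)


def main():
    # bail out if the external drive isn't mounted
    drive = Path(local_loc)
    if not (drive.exists() and drive.is_dir()):
        raise IOError("Something is wrong: %s is not available" % local_loc)

    logging.info("starting rsync")
    prc = subprocess.run(
        ["rsync", "-Phca", "--stats",
         "--include=/Photos Library.photoslibrary/",
         "--include=/Photos Library.photoslibrary/***",
         "--exclude=*",
         "-e", "ssh",
         local_loc, remote_loc],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    log_subprocess_output(prc)
    logging.info("rsync exited with code %d", prc.returncode)


if __name__ == "__main__":
    main()

One wrinkle worth double-checking: since launchd runs this as root, the SSH connection is made as root, so it is root's key pair (not your user's) that has to be authorized on the NAS.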
That's it. We now have a script that will try to run rsync to back up any changes we have made to the Photos library to a designated place on the NAS. If the external drive isn't available, it raises an exception and exits. It logs all the steps to a timestamped file. The script is run by a root process at 3am every day. Not bad for 50 lines of Python with no dependencies and a 20-line plist file.