One of the things that has concerned me for a number of years is the fact that I am storing more and more important stuff on my PCs: electronic documents, source code and data files for home projects, and a growing collection of videos and photos of my daughter (as well as those of my wife and me from before she entered our lives). I need a good backup solution now more than ever.
Whilst I see a pressing need to back up my important data, I want to avoid using burnable media as a backup solution (especially given how short a life burnable discs have) and I really want a solution that doesn't require constant manual intervention to keep it up-to-date. I want the system to be tolerant of me losing files in my local copy and not realising for a few days, or even a week, month or year.
As a die-hard Linux user, my solution has to work on Linux. This seems to rule out a lot of packaged solutions, and my distrust of black-box solutions rules out even more. I also don't want to use an online service: some backup services, such as BT Vault, have already closed, and I'm concerned that such a service could fold at short notice, leaving me with no access to my data. Nor do I want to pay a monthly subscription.
In the past I have configured a separate machine to back up the important machines where I have been working. This is a nice arrangement, but I don't really want a machine dedicated to backup at home, mainly because of power consumption; given how easy it is to get hold of very low-power ARM devices (such as the Raspberry Pi), this is something I may change in the future. For now, one less machine means one less machine to maintain, and as I already have a server which I use as a file server and MythTV backend, I've decided to add a dedicated backup hard drive to that machine.
Cron, which is still included by default in most of the Linux distributions I've used recently, provides a handy mechanism for launching a script at specific times; in my case, one minute past midnight each night. My script uses rsync to synchronise files from my chosen backup locations to the backup drive in my server. The cp command's ability to create hard links instead of straight copies (not to be confused with symbolic links), combined with rsync's behaviour of unlinking files it's updating (i.e. those that have changed), can be used to create a rolling backup, where the extra space used grows, to an approximation, only with added and modified files.
My backup drive is kept in a read-only state most of the time; the script makes the drive writeable when it runs and read-only again when it's finished. This is not ideal: I'd like a read-only view to be constantly accessible, with write access limited to the script. Still, mounting the backup disk in root's home folder makes it harder for it to be accidentally accessed and files deleted whilst it's writeable (though it also makes legitimate access harder). This is one advantage of doing this from a dedicated machine: a read-only Samba or NFS share can be set up for access to the backups, with the script itself retaining full read/write privileges.
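Toggling the drive between read-only and writeable is just a remount either side of the sync; the mount point below is a made-up example:

```shell
# Remount the backup drive writeable only while the script works;
# /root/backup is an illustrative mount point, not my real one.
mount -o remount,rw /root/backup

# ... create today's snapshot and rsync into it here ...

# Back to read-only, so stray processes (and stray fingers) can't
# delete anything between runs.
mount -o remount,ro /root/backup
```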
In my case, my script creates a rolling 7-day backup (i.e. a backup for each of the preceding 7 days), with 12 rolling monthly backups (one created for the previous month on the 1st of each month) and permanent yearly backups (created at the beginning of each new year). Each copy is in its own dated folder:
# ls
2013        2014-01-14  2014-01-16  2014-01-18  2014-01-20
2013-12     2014-01-15  2014-01-17  2014-01-19
#
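That naming scheme falls out of a little date arithmetic plus a prune. The commands below are a sketch of the retention logic under the same assumptions as before (GNU date, a hypothetical /mnt/backup path), not my script verbatim:

```shell
#!/bin/sh
# Sketch of the retention scheme: 7 rolling dailies, monthly and
# yearly snapshots preserved by name. Paths are illustrative.
DEST=/mnt/backup

# On the 1st, keep the last daily of the old month as YYYY-MM
# (pruning monthlies beyond 12 is left out of this sketch).
if [ "$(date +%d)" = "01" ]; then
    cp -al "$DEST/$(date -d yesterday +%Y-%m-%d)" "$DEST/$(date -d yesterday +%Y-%m)"
fi

# On New Year's Day, keep a permanent yearly snapshot as YYYY
if [ "$(date +%m-%d)" = "01-01" ]; then
    cp -al "$DEST/$(date -d yesterday +%Y-%m-%d)" "$DEST/$(date -d yesterday +%Y)"
fi

# Drop daily snapshots (YYYY-MM-DD folders) older than 7 days; the
# pattern matches only 10-character names, so YYYY and YYYY-MM survive.
find "$DEST" -maxdepth 1 -type d -name '????-??-??' -mtime +7 -exec rm -rf {} +
```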
My code in its current form has been running since mid-December; it has created a yearly backup, a monthly backup, and is rolling a 7-day backup. Each incremental backup (on my fairly static data) takes only tens of megabytes:
# du -sh *
40G   2013
42M   2013-12
42M   2014-01-14
42M   2014-01-15
46M   2014-01-16
42M   2014-01-17
42M   2014-01-18
42M   2014-01-19
42M   2014-01-20
#
This script is now available on GitHub. To update one of Linus Torvalds' fairly famous quotes: "Only wimps use tape backup: real men just upload their important stuff to a git repository, and let the rest of the world mirror it". Maybe not the approach I'll take with all my important stuff, but quite a good fit for this backup script. :-)