Hidden Linux: The perfect backup


rdiff-backup is a great little utility. At its simplest it backs up one directory to another. The command
rdiff-backup dir1 dir2
will mirror the first directory onto the second, preserving everything -- subdirectories, hard links, permissions, ownership info, modification times and all extended attributes. Install it on both client and server and the command
rdiff-backup dir1 user@system::/dir2
will do so over a network.
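As a rough sketch of those two forms side by side, the snippet below wraps them in a tiny helper that only prints each command while DRY_RUN is 1, so you can check what would run before letting it loose. The paths, the host name backuphost and the run helper are all made up for illustration.

```shell
# Hypothetical locations; substitute your own.
SRC="/home/user/documents"
LOCAL_DEST="/mnt/backup/documents"
REMOTE_DEST="user@backuphost::/srv/backup/documents"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mirror to another directory, ideally one on a different drive.
run rdiff-backup "$SRC" "$LOCAL_DEST"

# The same thing over the network; note the double colon after the host.
run rdiff-backup "$SRC" "$REMOTE_DEST"
```

Set DRY_RUN=0 once the printed commands look right.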
Because backups are mirrors of the source, you can use regular tools like find and locate without having to wade through zipped archives. Accidentally deleted a source file? Simply drag and drop it from backup. What's more, if the backups are on a different drive (which they should be!) and your source drive crashes, you can simply mount the backup disk in its place and carry on!
But wait, there's more!
Once it's done the initial backup, future backups are simply done on diffs -- which is to say file differences. Not only does that make backups blindingly fast (unless you regularly change lots of files!), but it also records incremental changes.
To appreciate what that means, imagine you have a cron job set up to perform an rdiff-backup every hour. At 3.00 in the afternoon you realise you've been on the wrong track for the last couple of hours and want to go back to the version of the document you were working on after lunch. The command
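rdiff-backup -r 2h /dir2/file /dir1/file
will do just that -- picking up the version it saved two (h)ours ago. Note the order: the backed-up copy comes first and the place to restore it to comes second, and rdiff-backup will refuse to overwrite an existing file unless you add --force. (Other useful interval characters are s, m, h, D, W, M, or Y indicating seconds, minutes, hours, days, weeks, months, or years respectively.)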
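The hourly scenario can be sketched roughly as follows: a crontab entry for the hourly run, a look at which increments exist, and a restore from the backup side (/dir2) to a scratch file so the current version is left alone. The scratch path and the run helper are assumptions for illustration, and commands are only printed while DRY_RUN is 1.

```shell
# /dir1 is the working copy and /dir2 the backup, as in the text.
SRC="/dir1"
BACKUP="/dir2"
RESTORED="/tmp/file.2h-ago"   # hypothetical scratch copy, so nothing gets overwritten

# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Crontab entry for an hourly backup. The five fields are minute, hour,
# day of month, month and day of week, so "0 * * * *" is the top of every hour.
CRON_LINE="0 * * * * rdiff-backup $SRC $BACKUP"
echo "$CRON_LINE"
# To install it:  ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -

# List the increments rdiff-backup has recorded (long form: --list-increments).
run rdiff-backup -l "$BACKUP"

# Pull back the version from two hours ago: backup path first, target second.
run rdiff-backup -r 2h "$BACKUP/file" "$RESTORED"
```

Restoring to a scratch file means you can compare the old and current versions before deciding which to keep.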
What it also means is that unless you use the --remove-older-than option at some point, you can effectively restore anything since you first started doing backups. Even files you deleted months or even years ago.
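If that history ever needs trimming, pruning might look something like the sketch below. The one-year cutoff is an arbitrary choice for illustration and the run helper is hypothetical; it only prints the commands while DRY_RUN is 1.

```shell
BACKUP="/dir2"   # the backup directory from the examples above

# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Drop increments older than one year (same interval letters as for -r).
run rdiff-backup --remove-older-than 1Y "$BACKUP"

# If more than one increment matches, rdiff-backup wants --force as well.
run rdiff-backup --remove-older-than 1Y --force "$BACKUP"
```

Bear in mind that anything deleted from the source before the cutoff is gone for good once its increments are removed.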
There's a whole lot more to rdiff-backup than that, including the ability to include and exclude files and file types, but you probably just want to get started and have a play. Because of its flexibility and features, its man page looks a little daunting, so try the rdiff-backup examples page instead.
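To give a flavour of the include and exclude options, here is a minimal sketch. The paths, the .iso pattern and the run helper are assumptions for illustration, and the commands are only printed while DRY_RUN is 1, so check the patterns against the man page before relying on them.

```shell
# Hypothetical locations; substitute your own.
SRC="/home/user"
DEST="/mnt/backup/home"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Skip one subdirectory.
run rdiff-backup --exclude "$SRC/tmp" "$SRC" "$DEST"

# Skip a file type; '**' matches any run of characters, slashes included.
run rdiff-backup --exclude '**.iso' "$SRC" "$DEST"

# Back up only one subdirectory: include it, then exclude everything else.
run rdiff-backup --include "$SRC/documents" --exclude '**' "$SRC" "$DEST"
```

The patterns are quoted so that the shell passes them to rdiff-backup untouched instead of expanding them itself.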



Comments
Looks good, I use cron to run rsync every second day. How demanding would an hourly incremental backup be on a network? With, say, 40+ users?
Posted by: Mark | June 18, 2009 9:36 AM
Thank you for this post. Just a small typo : the link for "cron" points to "diff" on wikipedia.
Posted by: antoine | June 18, 2009 8:18 AM
I was wondering how long it would take for someone to write up this indispensable tool. Good on you, Geoff. We use it for ALL our customers, and on all of our personal machines (all Linux).
Posted by: Dave Lane | June 17, 2009 9:55 AM
One -- ? :-) -- of the best posts, thank you for that!
Posted by: macias | June 17, 2009 3:48 AM