Incremental Backup System

In the never-ending quest for a Good Backup System, I present this bit of "crap code" -- Incremental Backup System, or IBS.

TL;DR version

IBS provides a very useful, very capable disk-to-disk backup system. After the first backup run, all following backups are incremental based on the previous backup, and yet through the magic of Unix hard links, each individual backup LOOKS like a "full" backup, but only takes up disk space incrementally.

The only dependencies are rsync (any modern version on the backup server; incredibly old versions are supported on the client) and a POSIX shell (ksh, sh or bash) on the backup server.
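To make the hard-link idea concrete, here is a minimal sketch built on rsync's --link-dest option. This is not the IBS script itself: the function name `ibs_snapshot`, the one-directory-per-snapshot layout, and the date-style snapshot names are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: one directory per backup run, hard-linked against the
# previous run. Names and layout are illustrative, not taken from IBS.

# usage: ibs_snapshot SOURCE_DIR BACKUP_ROOT SNAPSHOT_NAME
ibs_snapshot() {
    src=$1
    root=$2
    name=$3

    mkdir -p "$root" || return 1
    # --link-dest treats a relative path as relative to the destination,
    # so work with an absolute backup root.
    root=$(cd "$root" && pwd) || return 1

    # Most recent existing snapshot other than the one being written
    # (snapshot names are assumed to sort chronologically).
    prev=$(ls -1 "$root" | grep -Fxv -e "$name" | sort | tail -n 1)

    if [ -n "$prev" ]; then
        # Unchanged files become hard links into $prev; changed or new
        # files are copied, so only they consume additional disk space.
        rsync -a --delete --link-dest="$root/$prev" "$src/" "$root/$name/"
    else
        # First run: nothing to link against, so this is a true full copy.
        rsync -a "$src/" "$root/$name/"
    fi
}
```

Called once a day as, say, `ibs_snapshot /home /backup/myhost "$(date +%Y-%m-%d)"`, each dated directory would look like a complete copy of /home, while files that did not change share disk blocks with the previous day's copy.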

Restoring from one of these incremental backups is trivial -- absolutely everything you backed up is in one place; there's no need to restore the "full backup" and then the ten or so "incrementals". You can directly access all the backed up files of any historic backup; no reconstruction is needed. You can directly inspect and compare any versions of files between any backups.
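Because every backup is a plain directory tree, comparing versions needs nothing beyond standard tools. The sketch below lists the backups in which a given file first appears or differs from the previous backup. The function name `file_changes` and the layout (one sub-directory per backup, with names that sort chronologically) are assumptions for illustration, not something the IBS scripts necessarily provide.

```shell
#!/bin/sh
# Hypothetical layout: BACKUP_DIR/<snapshot-name>/<path as on the client>

# usage: file_changes BACKUP_DIR RELATIVE_PATH
# Prints the name of every snapshot where the file first appears or
# differs from the copy in the preceding snapshot that contains it.
file_changes() {
    dir=$1
    rel=$2
    prev=""

    # Snapshot names are assumed to contain no whitespace.
    for snap in $(ls -1 "$dir" | sort); do
        cur="$dir/$snap/$rel"
        [ -f "$cur" ] || continue
        # cmp -s is silent and returns non-zero when the contents differ.
        if [ -z "$prev" ] || ! cmp -s "$prev" "$cur"; then
            echo "$snap"
        fi
        prev=$cur
    done
}
```

Once two interesting backups are identified, an ordinary `diff -u` between the two copies of the file shows exactly what changed.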

The real magic in IBS backups is in the usefulness of the backup. A traditional backup tends to be useful for only one thing -- restoring data. IBS backups can be used to quickly find answers to common administrative questions, such as: "I see this user 'bob' -- when did he get added to this system? How many systems does 'bob' exist on? What DNS resolvers are my systems using?" If you can ask a question in a way that the existence or the contents of a file can answer it, you can query an entire corporation of servers in moments.
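As a rough illustration of that kind of query, the sketch below answers the 'bob' question by searching the backed-up copies of /etc/passwd. The function name `user_first_seen` and the directory layout (one directory per host, one dated sub-directory per backup) are assumptions for the example; adjust them to however your backups are actually organized.

```shell
#!/bin/sh
# Hypothetical layout (an assumption, not necessarily what IBS uses):
#   BACKUP_ROOT/<host>/<YYYY-MM-DD>/etc/passwd

# usage: user_first_seen USER BACKUP_ROOT
# Prints "<host> <snapshot>" for the earliest snapshot of each host whose
# /etc/passwd contains USER; hosts that never had the user print nothing.
user_first_seen() {
    user=$1
    root=$2

    for hostdir in "$root"/*/; do
        [ -d "$hostdir" ] || continue
        host=$(basename "$hostdir")
        # Snapshot names are assumed to sort chronologically.
        for snap in $(ls -1 "$hostdir" | sort); do
            if grep -q "^$user:" "$hostdir$snap/etc/passwd" 2>/dev/null; then
                echo "$host $snap"
                break
            fi
        done
    done
}
```

The number of lines printed answers "how many systems does 'bob' exist on?", and the second column answers "when did he get added?" to within one backup interval.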

I have implemented this system on everything from an Atom-powered netbook doing "self-backups" of its hard disk to an SD card, to installations of over 400 servers and over 20TB of data. I've implemented it on systems ranging from a Pentium II with a 250GB hard disk to small AIX systems with SANs for storage.

Dovetailing nicely into this is the File Alteration Reporting Tool (FART), which looks for files that have been changed that you didn't expect changes on. Using FART is not required for IBS to be very useful, but at least as implemented here, FART is dependent upon IBS.

Note that what I'm providing here is a set of sample scripts based on lessons learned over almost 20 years of using this process. This is not being presented as a "finished product". No apology is made for this -- every environment is different in some ways, and at some point it is easier to just implement a system that does exactly what you want than to crow-bar an off-the-shelf solution into your needs. If you are after a turn-key solution, you should probably look elsewhere.

I'm interested. Tell me more

What this system does well:

What this system DOESN'T do so well:

Still interested? Ready for Special High Intensity Training!

The Script

Sounds complicated, can you help us?

Maybe. Contact me and let us see what we can do together.

Who's the competition?

Well, I'm not anticipating making any money on the sale of this program, and can only dream of maybe making some money on consulting and setup, so "competition" isn't the right word. But, hey, the idea isn't mine, and while I think I've been doing it longer than most, other people have come out with rsync --link-dest projects as well. I'm just late to documenting it. If you know of any other rsync --link-dest projects that I should list here, let me know.
 
 

 

since February 20, 2022

Page Copyright 2022, 2023 Nick Holland, Holland Consulting.