Unix files have two parts we are concerned about -- the data itself, and the "link" -- what you might think of as the directory entry that points to that data. Just as a reference in a book's table of contents or index is not the content itself, a link/directory entry isn't the file. A key feature of Unix file systems is that one file can have multiple links pointing to it.
That is important to understand. A physical file can be accessed through multiple links, and thus through multiple file names or directory locations. Unix keeps track of how many links a file has, but it does not care which link was made first; there is no pecking order of links, all are equal. As long as at least one link to a file still exists, the file remains accessible. Once the last link has been deleted, the file's disk space is released. So yes, you can create a file, create three more links (four total), and remove any of the links in any order. It may look like there are four files, but the space is taken only once, and all the directory entries are "equal". A change made through any of these names is reflected in all of them. (This is sometimes obscured by applications that make a backup copy of a file -- a different file! -- make changes to that copy, then rename the copy over the original name; the original name now points to a new file, so it may not look like this is true.)
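A quick demonstration (the file names, inode number, owner, and dates here are invented for illustration):

$ echo "some data" > file1
$ ln file1 file2              # second hard link to the same data
$ ls -li file1 file2          # -i shows the inode number
523414 -rw-r--r--  2 nick  wheel  10 Aug  9 12:00 file1
523414 -rw-r--r--  2 nick  wheel  10 Aug  9 12:00 file2

Same inode, link count of 2 -- one file, two names. Remove either name and the data is still there under the other.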
A hard link can't cross file systems. Which...if you think about it, makes a lot of sense. The directory entry and the file it is pointing to must reside on the same file system.
The symbolic link (symlink) is a very special file that acts as a redirection indicator. It basically says, "You are looking at me, but you should actually look for your data at this other path". So a symlink points to another link, which points to the actual file. Or that other link could be another symlink. Yes, you can have an infinite loop of symlinks: A points to B, B points to A. All modern Unixes will detect this and report "too many levels of symbolic links" after a fixed number of indirections.
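To see that loop detection in action, in an empty directory:

$ ln -s a b      # b points to a
$ ln -s b a      # a points to b -- a loop
$ cat a
cat: a: Too many levels of symbolic links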
Yes, very much like the Windows shortcut, except that GENERALLY, everything happens under the covers of the operating system. Your application can find out whether something is a symlink rather than a "real" file, but generally, that is "Doing It Wrong". Your application should just be looking at files and directories; symlinks should be something the administrators work with. Symlinks can cross file systems -- often to great advantage.
For example, an application may want to put data in a particular directory. If the file system that directory exists in is out of space, the administrator might copy the existing directory to another file system (or a new file system), then replace the old directory with a symlink to the new one. The application should not notice this happened (unfortunately, some apps like to "help" with this, and usually screw it up when they do).
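A minimal sketch of that move (paths invented for the example; stop the application first):

$ cp -Rp /var/app/data /v/2/app-data    # copy to a file system with room
$ mv /var/app/data /var/app/data.orig   # keep the original until verified
$ ln -s /v/2/app-data /var/app/data     # the redirection
# ...verify everything works, then: rm -r /var/app/data.orig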
Now, if a file -- say, /etc/hosts -- hasn't changed since the previous backup, rather than copying the file over again, a new hard link is made to the same physical file as in the previous backup. So the files /bu/firewall1/2022-02-24/etc/hosts and /bu/firewall1/2022-02-25/etc/hosts take up only one "file" worth of storage. Creating an additional link to this file is very quick, and requires basically no network traffic once the file has been confirmed unchanged. For (typically) small files like /etc/hosts, this isn't a big win, but if you have large files that rarely change, this can be a huge benefit.
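Under the hood, that new hard link is nothing more exotic than what ln does. Conceptually (the backup tool does this for you, via rsync as described below):

$ ln /bu/firewall1/2022-02-24/etc/hosts \
     /bu/firewall1/2022-02-25/etc/hosts

Both days' trees now name the same physical file, and a single du run over both directories counts that file only once.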
Some types of files are better suited to this scheme than others. For example, mbox mail spools (all messages for one user in one file) will have to be copied in their entirety on every backup where there is a change, but maildir format mail spools (each message is its own file) will have only the NEW files copied over. Putting some numbers on it: if you have a 100MB mbox file that gets 100KB of new messages each day, you will end up with 100MB of new backup disk space usage each day because of this one file. But if you use maildir, most of that 100MB will just be hard linked; only the 100KB of new files will be copied over the network and use additional space in the new backup.
Note that rsync has an algorithm its developers are very proud of, where only the changed parts of a file are transferred over the wire. My experience with this has, unfortunately, been bad. I have found that in REAL WORLD cases, rsync will often spend more time and effort finding the changed and unchanged parts and sending over the deltas than it would have spent just sending over the whole file. Perhaps even worse, the time required for a backup could vary widely -- from minutes to hours -- which created problems when we thought we knew how long a backup would take, but lots of little changes were made in big files, and suddenly minutes turned to hours. For this reason, I have disabled the rsync delta algorithm and just use the "whole file" option. The results I have seen were not only better predictability, but also usually faster backups. Your results may vary. You are encouraged to experiment.
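The option in question is real: -W (--whole-file) tells rsync to skip the delta algorithm entirely. The paths and host name here are just an illustration:

$ rsync -a --whole-file /var/www/ bu-server:/bu/web/staging/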
The first time I saw this backup system was in a project called Dirvish. That project started by making hard links to all the data from your previous backup in your new backup directory, then rsync'ing from the system being backed up into the new backup directory. Great, but it was rendered a bit over-engineered when the rsync people added the --link-dest option, which basically moved most of the cool Dirvish code into rsync itself.
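With --link-dest, one day's run boils down to something like this (the host and directory names follow the /etc/hosts example above; the exact command line is a sketch, not my production script):

$ rsync -a --whole-file \
    --link-dest=/bu/firewall1/2022-02-24 \
    root@firewall1:/ /bu/firewall1/2022-02-25/

Files unchanged relative to the 2022-02-24 tree get hard-linked into the new tree; only new and changed files are actually transferred.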
There are a number of these kinds of projects out there, but I really think that, for the most part, this idea is simple enough that a good system administrator should be able to set it up with just a little guidance -- my goal here is to provide that guidance, based on running this kind of backup for going on 20 years now. You are welcome to use my code, but it is important that you understand it so you can adjust it to your needs. And my code may suck. I admit this.
My default is to have the backups appear to reside in /bu, but that is just a tiny partition or subdirectory with a bunch of symlinks that point into /v/1, /v/2, etc.
So, while the actual data is scattered over several volumes, the data can all be accessed from within the /bu/ directory.

$ ls -l /bu
total 0
lrwxr-xr-x  1 root  wheel  12 Aug  9 12:27 console -> /v/2/console
lrwxr-xr-x  1 root  wheel   8 Oct 20  2020 dbu -> /v/3/dbu
lrwxr-xr-x  1 root  wheel   9 Jul  4 14:20 dbu1 -> /v/1/dbu1
lrwxr-xr-x  1 root  wheel  11 Aug 29  2019 fluffy3 -> /v/1/fluffy
lrwxr-xr-x  1 root  wheel  17 Aug 29  2019 g2 -> /v/1/g2
lrwxr-xr-x  1 root  wheel   7 Aug 29  2019 gw -> /v/2/gw
lrwxr-xr-x  1 root  wheel  31 Apr  1  2020 gwold -> /v/1/gwold
lrwxr-xr-x  1 root  wheel   8 Aug 29  2019 hc1 -> /v/1/hc1
lrwxr-xr-x  1 root  wheel   9 Oct 30  2021 hc1archive -> /v/3/hc1archive
lrwxr-xr-x  1 root  wheel  22 Aug 17  2020 monitor -> /v/1/monitor
lrwxr-xr-x  1 root  wheel  10 Dec 19  2021 node1 -> /v/2/node1
lrwxr-xr-x  1 root  wheel  10 Dec 21  2021 node2 -> /v/1/node2
lrwxr-xr-x  1 root  wheel  10 Sep  4  2020 suzy2 -> /v/1/suzy2
lrwxr-xr-x  1 root  wheel  31 Aug  9 17:31 web.holland-consulting.net -> /v/2/web.holland-consulting.net
lrwxr-xr-x  1 root  wheel  11 Sep 14  2019 z-logs -> /var/z-logs
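Adding a new system to the scheme is just a directory on whichever volume has room, plus a symlink (names hypothetical):

$ mkdir /v/3/newhost
$ ln -s /v/3/newhost /bu/newhost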
Some people will argue, "It's just a backup". Except, if you lose your backups, you lose your historical archive. And if you lose your backup, your data is no longer protected. The only excuse for not having redundant storage on your IBS is if the alternative is "no backup at all". I've worked in places like that; I get it. (In fact, one of those places sent me a screenshot several years after I left of a system I had set up and they had forgotten about -- almost 3000 days of uptime on a machine I cobbled together out of spare parts, including a non-redundant hard disk and a single power supply.)
Page Copyright 2022, Nick Holland, Holland Consulting.