[sf-lug] Example of an rsync(1) based backup scheme/script

Michael Paoli Michael.Paoli at cal.berkeley.edu
Tue Feb 5 01:33:28 PST 2013


So, I've occasionally mentioned before[1], and also elsewhere, e.g.
BUUG[2], some bits about rsync(1), and caveats about using it for
backup purposes, and also a bit more generally about backups, e.g.
[1][3].  Anyway, latest case I had for doing an rsync based
script/program was rather recently.  I do, at least occasionally,
backup vicki[4], most notably bits there of interest to BALUG[5], and at
least sometimes also SF-LUG.  Anyway, I thought it was time I create
script/program(s) related to such to make it not only more convenient,
but at least typically quite a bit faster.  And rsync - at least with
suitable invocation - is well suited to the task.

In this particular scenario, as is highly typical when I want to use
rsync for backup purposes - and where I or others might possibly end up
quite dependent upon that backup data, I want to quite ensure that the
backups are good, accurate, and complete.  The rsync program, at least
by default, plays a bit loosey-goosey with that, leaning more towards
speed and efficiency over integrity.  It strikes a good balance for
many of its more typical usages, but not exactly the balance I prefer
where I want an at least somewhat higher degree of integrity on
backups.  Fortunately rsync has lots of options, so one can generally
tweak that suitably to one's liking, and I do find its options quite
sufficient for doing so, and to suit my purposes.  One downside/caveat
I've found with that, though - the options tend to change significantly
among various (major) version differences of rsync releases.  So,
unfortunately, it does sometimes need some careful review and
tweaking/updating (e.g. I've written fairly similar scripts before, and
have had to significantly change them due to updated/different versions
of rsync).

Anyway, I give example script[6], and example usage below.  I'm not going
to fully explain rsync (or shell, or ssh, or Perl, or ...) below, or
even anywhere close, but I'll point out at least some of the more
interesting points and bits, and some of the bits that might not be so
obvious or intuitive.

First of all, rsync can be very fast.  Most notably where the target
already looks rather to quite similar to the source - and that is, in
fact, what rsync very much excels at, and why I chose rsync as a key
component of the backup in this case (backups happening over the
Internet, and with at least one link not, or not guaranteed to be,
especially fast - e.g. not suitable for regularly streaming a full backup
of all the source system's data - but quite fine for grabbing the
incremental/differential data - e.g. via rsync).

Anyway, a bit further below, I first show some example runs of the
script/program (which I called rsync_host2dir[6]), along with timing.  I
don't show the "very first run" - that took on the order of several
hours, as the target was quite out-of-date (had lots of the "user data"
and applications data, but was quite out-of-date and/or relatively
incomplete on most of the rest of the operating system (OS) bits).
However, subsequent backups have generally been comparatively very fast
- e.g. under 6.5 minutes total to fully backup (rsync) 3 hosts (one
physical host, two virtual guest hosts).  The first timing run shown
was a bit longer (a bit under 38 minutes for one of the hosts), but in
that case the host had somewhat over 380 MiB of additional new data.
All three hosts, combined, have roughly around 8 GiB total of data of
interest (actual data on filesystems, excluding: easily reobtained OS ISO
images, some FHS[7] volatile contents such as on/under /tmp,
virtual/pseudo filesystems such as /proc and /sys, purely redundant
filesystems such as a bind mount, and one rather ancient, no longer very
important backup of a much older predecessor host).

Anyway, examples.  I show some basic command line usage, in this case,
bit of "one liner" (or so) invocation to backup 3 hosts.  I've
reformatted the input command line a bit as it would look if I did it a
bit differently to make it more readable - otherwise it's the
same/equivalent.  In each case, I prepared target, by first copying
(most) recent earlier backup to the target location(s) (a moderately
large, but fast local copy), and then using the rsync script to update
the targets.

The script takes two arguments, a source host, and a local target
directory.  In the examples below, that's seen on the lines:
> do time rsync_host2dir "$tmp" \
and the following line giving the target.
In the examples, the shell substitutes in for "$tmp" the host arguments
that I've given to the shell's for loop.  The leading "> " on each line
is the shell's PS2 prompt (it's essentially prompting that it needs
additional input to complete the command's syntax).
The do and time are part of how I invoked it under shell - do being part
of the for loop syntax, and time, a built-in to the shell that gives us
some timing information - most notably what it reports as "real" is
total elapsed "real" time according to the system's clock - useful for a
gross overall "how long did it take?".  The not-so-obvious "host" (DNS)
names under the .balug.org. domain are for the BALUG host of interest,
and for host vicki (the physical host which contains the BALUG and
SF-LUG virtual hosts).

Somewhat more detailed description of the rsync_host2dir script/program
follows these example invocations.  The actual code for rsync_host2dir
is also shown further down in the references and also available at [6].

$ (for tmp in \
> sf-lug.com. \
> balug-sf-lug-v2.balug.org. \
> balug-sf-lug-v2.console.balug.org.
> do time rsync_host2dir "$tmp" \
> /home/r/root/tmp/mnt/balug/2013-01-28_BALUG/"$tmp"/root
> done)

real    1m24.543s
user    0m7.368s
sys     0m3.156s

real    3m52.409s
user    0m21.633s
sys     0m7.648s

real    37m3.104s
user    0m27.754s
sys     0m15.449s
$

$ (for tmp in \
> sf-lug.com. \
> balug-sf-lug-v2.balug.org. \
> balug-sf-lug-v2.console.balug.org.
> do time rsync_host2dir "$tmp" \
> /home/r/root/tmp/mnt/balug/2013-01-29_BALUG/"$tmp"/root
> done)

real    1m1.408s
user    0m5.908s
sys     0m1.356s

real    3m27.008s
user    0m19.945s
sys     0m6.656s

real    1m50.894s
user    0m7.040s
sys     0m2.916s
$

So, some comments about the rsync_host2dir script/program.  First of
all, it includes some bits to, if not executed as superuser,
reinvoke itself as superuser via sudo.  Although in some other contexts
I might write script/program to do similar (or go from root to some
application ID via su if not invoked as the application ID), this is a
bit atypical compared to my more common rsync scripts.  But in this
case, it's (thus far) fired up manually on an ad hoc (but hopefully
fairly regular) basis.  Were it intended to be, e.g. driven by a cron
job, that would probably be a bit different.  Here also, in that sudo
use, it essentially leverages (presumed) user's ssh key access via
ssh-agent and passes that along.  Again, a bit atypical compared to
such scripts I've more commonly done - but particularly
handy/useful/convenient in this case - especially since those ssh keys
are passphrase protected and generally only used via ssh-agent, and
generally "only" by that invoking user.  Note also that it doesn't need
the ssh keys for very long.  It does two ssh connections to the host,
first to gather mount information, and second - quite shortly
thereafter - to do the rsync based backup.  The keys are only needed when
making the ssh connections, not after they're already established, so,
e.g., the
key(s) can be made available only a quite short time (e.g. a minute or
two or less) via ssh-agent, and still work quite fine, even if the
backup takes much longer.  Scheduled production uses would typically
have a somewhat different setup regarding key(s) and ID(s) and such.
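
For example (a sketch of that usage pattern, not part of the script
itself - the key file name and lifetime are merely illustrative), one
might load the relevant key into ssh-agent with a short lifetime just
before kicking off the backups:

$ ssh-add -t 120 ~/.ssh/id_rsa_backup
$ rsync_host2dir sf-lug.com. \
> /home/r/root/tmp/mnt/balug/2013-01-28_BALUG/sf-lug.com./root

The -t 120 limits the agent's use of that key to 120 seconds - plenty to
establish the two ssh connections - after which the backup itself can run
as long as it needs to.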

In the program's first ssh connection to source host, it runs the mount
command, and then parses the output of that.  It does so to determine
filesystem(s) to back up, and also the order in which we want them
backed up.  It looks at filesystem type, and mount point, only selecting
filesystems that are of a desired type and also excluding mount point
patterns we wish to skip.  They're then sorted in a priority order -
this ordering is based upon a probable restore order in "worst case
scenario" where we need to restore "everything".  In such cases, where
there are separate filesystems, we will generally need things restored
in this order of priority:
/boot
/
/usr
/var
/home
And then anything/everything else in sorted order (so filesystems
containing mountpoints of other filesystems are restored before those
other filesystems).  We're also not horribly picky about
the order of these latter filesystems, other than that caveat, so we use
a basic sort to cover that.  Anyway, in "full" recovery/restore
scenarios, one may often want to first recover those initial
filesystems, and may then opt to restart the recovered OS, quite
possibly in single user mode, and then restore the remaining filesystems
onto that running OS.  Also, in this particular case, since we're
writing the target to filesystem(s) - essentially (presumed) random
access media, rather than sequential - the order isn't as important, but
still may be
fairly useful (and that bit of code, or quite similar, is also used in
some other backup code I use - including code that also does backups to
sequential media or media that's handled more-or-less as sequential in
full recovery/restore scenarios).

The shell then shoves the list of desired filesystems into the named
parameter ("variable") backupmountpoints.  Perl is used to parse the
output of the mount command.  The only Perl
bit that might not be quite so obvious, for those not somewhat familiar
with Perl, is the bit about quoting and shell/Perl interaction.  The Perl
program is executed as part of shell program/script, so it's given as a
single argument to Perl's -e option.  To do that, the whole thing is put
within single quotes ('), to protect it from interpretation by the
shell.  That's all fine and dandy, except then how do we effectively get
a ' within the Perl program, since the shell is interpreting ' in that
context?  We've two options:
'\''
q/STRING/
The first of those, within the context of single quoted string within
shell, gets interpreted as a single quote, and thus passed to Perl that
way.  Or more precisely, it ends up as terminating the single quoted
string, having a literal single quote, and then starting (resuming)
single quoted string - which shell then parses as all part of same
argument, leaving the literal single quote in, and discarding the
surrounding single quotes, and passing that along as argument.  However,
the '\'' context gets ugly to read.  It can, however, be used, as
needed, recursively - but the parsing of such is best left to programs,
as that does end up quite ugly.  In this case, however, we go for the
second option.  In Perl, ' is just a more common shorthand for Perl's
more generalized q operator.  By using it explicitly, starting with q,
we can explicitly give our "single quote character" (or implied matched
pair) to be used by Perl on that particular invocation of quoting.
That makes it easier for the person familiar with Perl to read, than
seeing '\'' and having to decode the shell context first, before Perl.
It also can be a bit easier for the person looking at shell, as start
and end of the single quoted string is easier to find/see/parse/search,
without a bunch of use of '\'' within.  So, we use Perl's q in this
case.
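
As a tiny illustration of the two approaches (these one-liners are not
from the script - just examples), both of the following hand Perl a
program wrapped in shell single quotes, and both print the same thing:

$ perl -e 'print('\''hello'\'', "\n");'
hello
$ perl -e 'print(q:hello:, "\n");'
hello

The first has to terminate and resume the shell's single quoting just to
get Perl's own quote characters through; the second leaves the shell
quoting untouched and lets Perl's q operator (here with : as the
delimiter) do the quoting - which is the approach the script takes.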

In our rsync invocation, I use a bunch of non-default options, to
accommodate two particular objectives.  First of all, we want rather high
integrity backups, so that adds a smattering of non-default options,
e.g.: --archive --numeric-ids --sparse --checksum --ignore-times.
In this case it's a bit of a double-edged sword, but we definitely want
--numeric-ids, as we always want those interpreted consistently,
regardless of where that backup may move to or what /etc/passwd and
/etc/group or the like look like on the system having those backups.  We
also chose to do it that way, as that data will never be directly used
(e.g. run as operating system) on backup host - at least certainly not
without suitable adjustments or context (e.g. also along with use of the
backed up host's user/group context information).  The other objective
is more-or-less attempting certain optimizations for our particular
backup scenario and usage.  E.g. we use --relative, as we may have
multiple source mountpoints, and we want to preserve their hierarchical
relationship under the target directory.  We use --one-file-system, as
we've explicitly selected all the filesystem(s) we wish to backup, and
wish to not include any others.  We give --compress-level=9, as we wish
to optimize for bandwidth, rather than CPU, even if that might make for
slower over-all backups (we're more likely to have CPU to spare, and may
not have bandwidth to spare or may wish to conserve bandwidth as
feasible).  In other scenarios we might make a very different
CPU/bandwidth tradeoff decision (e.g. >= Gigabit uncongested "free" or
fixed cost bandwidth with desire to minimize backup time).
We use some --filter= options to exclude some stuff we don't want to
backup.  We'd excluded on a filesystem basis earlier; this bit is to
exclude any bits that may be within filesystems - e.g. we don't want to
backup the FHS volatile /tmp, nor do we want to backup easily reobtained
OS ISO images (and related data), so we exclude where we have only
those.
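
As a quick illustration of what --relative buys us (the hostname and
paths here are hypothetical, not from the actual backups):

$ rsync --archive --relative somehost:/var/local /backups/somehost/root
(target now has /backups/somehost/root/var/local/...)
$ rsync --archive somehost:/var/local /backups/somehost/root
(without --relative only the last path component is kept:
/backups/somehost/root/local/...)

With multiple source mountpoints handed to a single rsync invocation, as
this script does, --relative keeps each of them at its proper place in
the hierarchy under the target directory.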

Well, hopefully that covers at least the bits that may not be so
obvious, at least given other handy reference documentation (man pages,
etc.).  Script/program is shown further below and also available at [6].

And yes, I did talk at least some bit about rsync at and immediately
following the SF-LUG 2012-01-21 meeting, and have also discussed rsync
at other meetings, e.g. [2].

If you actually find a bug, please certainly let me know.  But I'm not
exactly looking for "feature requests" or the like - this is (almost) a
one-off program, not (quite) designed/intended to more generally solve
this particular type of backup scenario (but it's "general enough" I
could use it for multiple systems, and in fact use it for at least 3
hosts thus far).

references/excerpts:
1. http://linuxmafia.com/pipermail/sf-lug/2010q1/007678.html
2. http://www.buug.org/
3. http://linuxmafia.com/pipermail/sf-lug/2010q2/007732.html
4. http://linuxmafia.com/pipermail/sf-lug/2012q1/009159.html
    http://www.wiki.balug.org/wiki/doku.php?id=system:vicki_debian_lenny_to_squeeze
    http://www.wiki.balug.org/wiki/doku.php?do=index&idx=system
5. http://www.balug.org/
6. http://www.rawbw.com/~mp/unix/sh/examples/rsync_host2dir
7. http://www.pathname.com/fhs/

$ expand -t 4 < ~/bin/rsync_host2dir
#!/bin/sh
program=/home/m/michael/bin/rsync_host2dir

[ $# -eq 2 ] || {
     1>&2 echo "usage: $0 host directory"
     exit 1
}
[ -n "$1" ] || {
     1>&2 echo "host cannot be null: usage: $0 host directory"
     exit 1
}
[ -n "$2" ] || {
     1>&2 echo "directory cannot be null: usage: $0 host directory"
     exit 1
}
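# if not running as superuser, re-exec this program as root via sudo,
# passing along the invoking user's ssh-agent (SSH_AUTH_SOCK)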
[ x$(id -u) = x0 ] || {
     # make our directory absolute (if it is not already) before cd /
     case "$2" in
     /*) directory="$2";;
     *)  directory=$(pwd -P)/"$2" || exit;;
     esac
     set -- "$1" "$directory"; unset directory
     cd / &&
     {
         exec sudo su - root -c "LC_ALL=C SSH_AUTH_SOCK=$SSH_AUTH_SOCK $program $1 $2" ||
         exit
     }
}

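# capture the arguments, then clear the positional parameters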
host="$1"
directory="$2"
set --

[ -d "$directory" ] || {
     1>&2 echo "$0: directory $directory doesn't exist, aborting"
     exit 1
}

# ssh -atx "$host" 'hostname; id'; exit

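# determine which filesystems on "$host" to back up, and in what
# (restore priority) order; result is a space separated list of mount points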
backupmountpoints=$(
     ssh -ax "$host" 'cd / && umask 077 && exec mount' |
     #/home/m/michael/src/backup/bin/device__mount_point__type__options
     perl -e '
         $^W=1;
         use strict;

         #data fields to gather from output of mount(8)
         my $match_mount=q:^(.+) on (.+) type (.+) \((.*?)\)\n*$:;
         #                  device
         #                          mount point
         #                                    type
         #                                           options
         my @mount=();

         while (<>){
             if (/$match_mount/) {
                 my $device=$1;
                 my $mount_point=$2;
                 my $type=$3;
                 my $options=$4;
                 #skip filesystems we are not presently interested in
                 (
                     #must be one of these types ...
                     $type =~
                         /
                             ^   (?:
                                     ext[234] |
                                     reiserfs
                                 )
                             $
                         /ox
                         ||
                     #or one of these type and ...
                     $type =~
                         /
                             ^   (?:
                                     ntfs |
                                     vfat |
                                     fat
                                 )
                             $
                         /ox
                         &&
                     #mounted readonly
                     $options =~
                         /
                             (?:^|,)
                                 ro
                             (?:,|$)
                         /ox
                 )   &&
                     #and not one of these mount points
                     $mount_point !~
                     m!
                         ^
                             (?:
                                 /+mnt |
                                 /+media |
                                 /+var/+local/+pub/+iso |
                                 /+var/+local/+tower |
                                 /+home/+r/+root/+tmp/+mnt
                             )
                         (?:$|/)
                     !ox
                 or next;
                 #push device mount_point type options on our array,
                 #split out the options
                 push @mount,[$device,$mount_point,$type,[split(/,/,$options)]]
             }
             else {
                 #report unmatched lines on stderr only, so they cannot
                 #pollute the mount point list written to stdout
                 print STDERR ("$0: ",(m:^(.*?)\n*$:)," failed to match $match_mount\n");
             }
         }

         @mount=sort {
             #handle highest priorities (if present) first:
             #/boot, / (root), /usr, /var, /home
             for my $pri (
                 q:/boot:,
                 q:/:,
                 q:/usr:,
                 q:/var:,
                 q:/home:
             ) {
                 if(@$a[1] eq $pri && @$b[1] ne $pri) { return -1; }
                 if(@$b[1] eq $pri && @$a[1] ne $pri) { return 1; }
             }
             #everything else is higher and compares normally
             #print ("@$a[1] cmp @$b[1] ",@$a[1] cmp @$b[1],"\n");
             @$a[1] cmp @$b[1];
         }   @mount;

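         #join the sorted mount points into one space separated string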
         my $mountpointsout=q::;
         for my $line (@mount) {
             #print join(q: :,(@{$line}[0..2]),join(q:,:,@{${$line}[3]})),"\n";
             #print(@{$line}[1],"\n"); # just the mount points
             if($mountpointsout ne q::){
                 $mountpointsout .= q: :;
                 $mountpointsout .= @{$line}[1];
             }else{
                 $mountpointsout = @{$line}[1];
             };
         }
         print($mountpointsout);
     '
)

#echo "$backupmountpoints"

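# back up all of the selected filesystems from "$host" in a single rsync
# run, preserving their full paths under "$directory"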
rsync \
     --archive \
     --acls \
     --xattrs \
     --hard-links \
     --numeric-ids \
     --relative \
     --sparse \
     --rsh='ssh -aTx -o BatchMode=yes ' \
     --checksum \
     --partial \
     --one-file-system \
     --delete-excluded \
     --ignore-times \
     --compress-level=9 \
     --filter='-,/ /tmp/**' \
     --filter='-,/ /var/local/pub/mirrored/cdimage.debian.org/**' \
     --quiet \
     "$host":"$backupmountpoints" "$directory"
#   --verbose
#   --bwlimit=KBPS
#   --inplace
#   --compress




