[sf-lug] Linux backup software .. that meets unique requirements
Michael Paoli
Michael.Paoli at cal.berkeley.edu
Sun Mar 21 15:19:41 PDT 2010
Though you may not find a ready-made solution that fully suits your
specifications, most or all of the pieces needed to put one together
fairly easily do exist. As much of that has already been mentioned
elsewhere in the thread, I'll try to cover just the less redundant bits.
rsync - wonderful tool, but I recommend great caution when using it for
backup purposes. E.g. by default, if source and target file match in
their appropriate paths and size and mtime, but their data differs, the
target's data won't be updated to match the source. Do also keep in mind
that mtimes are "user" settable (no special permissions required by user
or program to arbitrarily set mtime of file they have write access to).
One can use non-default options to force data checksum comparisons, but
then the data of all source and target files is read, which is often
overkill for updating (see also below). There are lots of other things
rsync doesn't do, or may not do the way one would expect or require - so,
as I state, due caution is advised.
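To make the size-and-mtime pitfall concrete, here's a minimal sketch. The
helper names (quick_check_same, data_same, silently_stale) are made up for
illustration and are not part of rsync; they only mimic a size+mtime
comparison using GNU stat and cmp, and comparing mtime at whole-second
resolution is an assumption on my part.

```shell
#!/bin/sh
# Hypothetical helpers (NOT part of rsync): they mimic a size+mtime
# comparison so the pitfall can be seen without running rsync at all.

# Succeeds when both files have the same size and the same mtime
# (whole seconds, via GNU stat).
quick_check_same() {
    [ "$(stat -c '%s %Y' -- "$1")" = "$(stat -c '%s %Y' -- "$2")" ]
}

# Succeeds only when the data is byte-for-byte identical.
data_same() {
    cmp -s -- "$1" "$2"
}

# Succeeds for the dangerous case: size and mtime match, data differs,
# so a size+mtime comparison alone would leave the target stale.
silently_stale() {
    quick_check_same "$1" "$2" && ! data_same "$1" "$2"
}
```

With rsync itself, the non-default option referred to above is
-c / --checksum, e.g. "rsync -a --checksum src/ dst/" (src/ and dst/ being
placeholders), at the cost of reading all the data on both sides.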
ctime - great way to tell if a file (of any type) has changed, so long as
you can trust superuser (a.k.a. "root") to not do particularly stupid
stuff (like compromising filesystem integrity by not properly protecting
permissions on filesystem devices, or screwing up the system time).
Essentially, if the file changes in any way whatsoever - other than only
atime being updated due to read, the ctime will be updated. By backing
up files that have a ctime newer than the last (or earlier reference)
backup was started (or when the earlier backup of the file in question
was started), one can pretty well cover picking up and backing up all
the additional changes. But that alone isn't quite perfect - one also
needs to track and deal with what was removed. One also needs to deal
with other filesystem rearrangements, e.g.:
$ stat -c '%D %i %Z %n ' . ? ?/?
3a02 128939 1269208095 .
3a02 128940 1269208095 d
3a02 128998 1269208095 d/f
$ mv d D
$ stat -c '%D %i %Z %n ' . ? ?/?
3a02 128939 1269208117 .
3a02 128940 1269208095 D
3a02 128998 1269208095 D/f
$
Note with the above:
o the ctime for ?/f and ? didn't change
o ? and ?/f remained, respectively, on same device and with same inode
o ctime for . changed
So ... ctime changes on directories may be quite a bit trickier, but
by suitably also tracking device and inode numbers, the contents of
directories can be suitably reconciled (and one can often avoid backing
up a huge file yet again, just because it changed name or location
within the filesystem - even rsync isn't that smart).
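A rough sketch of that device-and-inode reconciliation follows. The function
names (snapshot, renames) are hypothetical, it relies on GNU find's -printf,
and it assumes paths contain no tabs or newlines. Do also keep in mind that
inode numbers get reused after files are removed, so a match is a strong
hint, not proof; hard links will also show up as extra matches.

```shell
#!/bin/sh
# snapshot DIR
# Print one "device:inode<TAB>path-relative-to-DIR" line for everything
# under DIR (GNU find: %D = device number, %i = inode, %P = relative path).
snapshot() {
    find "$1" -mindepth 1 -printf '%D:%i\t%P\n' | LC_ALL=C sort
}

# renames OLD_SNAPSHOT NEW_SNAPSHOT
# Print "old-path -> new-path" for every device:inode pair present in
# both snapshot files but recorded under a different path.
renames() {
    awk -F '\t' '
        NR == FNR { old[$1] = $2; next }        # first file: remember paths
        ($1 in old) && old[$1] != $2 {          # second file: same inode, new path
            print old[$1] " -> " $2
        }' "$1" "$2"
}
```

Taking a snapshot before and after the "mv d D" above and feeding both to
renames reports both d -> D and d/f -> D/f, so neither needs its data backed
up again.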
*Really* doing backups exceedingly well is a difficult problem to fully
solve - even more so when writing to some archive format. E.g. how does
one handle:
o if, after starting to back up a file, it:
  o grows - especially if one's already read and backed up the metadata
  o shrinks
  o shrinks to smaller than the amount of data one's already written out
    while backing up the file
o the metadata changed after one started backing up the file
o the file's data continually changes faster than one can read all the
  data from the file
o how does one handle transactional consistencies across:
  o distinct files within the same filesystem
  o filesystems
One approach to (most of) the above is to back up the filesystem only
when it's unmounted or mounted read-only, or to back up from a read-only
"snapshot" or equivalent image of the filesystem.
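Where none of those is practical, one can at least detect (not prevent) a
file changing underneath the backup. The following is only a sketch: the
copy_if_stable name is made up, it uses GNU stat, and it can miss changes
that land within the filesystem's timestamp granularity.

```shell
#!/bin/sh
# copy_if_stable SRC DST  (hypothetical helper)
# Copy SRC to DST, then compare SRC's size, mtime and ctime from before
# and after the copy. Returns 0 if SRC looked unchanged, 1 on stat/cp
# failure, and 2 (after removing DST) if SRC changed during the copy.
copy_if_stable() {
    before=$(stat -c '%s %y %z' -- "$1") || return 1
    cp -p -- "$1" "$2" || return 1
    after=$(stat -c '%s %y %z' -- "$1") || return 1
    if [ "$before" != "$after" ]; then
        rm -f -- "$2"       # don't keep a possibly inconsistent copy
        return 2
    fi
}
```

A caller could retry on return code 2, but for a file that changes
faster than it can be read, that will never converge, and it does nothing
for consistency across multiple files.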
Full backups can often be much easier - notably, they avoid the
complexity of determining how to do an incremental or differential
backup efficiently, properly, and fully.
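For comparison, just the selection step of the ctime-based approach
described earlier might look something like the sketch below. The
changed_since name and the stamp-file convention are my assumptions, not a
standard tool; find's -cnewer test compares each file's ctime against the
mtime of the reference file. Removals and renames still need separate
handling.

```shell
#!/bin/sh
# changed_since STAMP DIR  (hypothetical helper)
# List every non-directory under DIR whose ctime is newer than the mtime
# of the STAMP file. STAMP should have been touched when the previous
# backup *started*, so changes made during that backup get picked up.
changed_since() {
    find "$2" ! -type d -cnewer "$1" -print
}

# Usage sketch (all paths are placeholders):
#   touch /var/tmp/stamp.new                  # before this backup starts
#   changed_since /var/tmp/stamp /home > /var/tmp/filelist
#   ... back up the files named in /var/tmp/filelist ...
#   mv /var/tmp/stamp.new /var/tmp/stamp      # only after success
```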
Anyway, most or all of the other points in your requirements
specification, and fairly to quite feasible "solutions" for them, have
already been covered earlier in the original thread.
> Date: Thu, 11 Mar 2010 14:17:27 -0500
> From: David Rosenstrauch <darose at darose.net>
> Subject: Linux backup software .. that meets unique requirements
>
> Trying to find linux backup software that meets what apparently is a
> unique set of requirements. I've checked out all the major backup apps
> (backuppc, duplicity, bacula, drop.io, etc.) but none of them seem to
> meet all of my requirements. Anyone here know of one that does? Here's
> what I'm looking for:
>
> 1) Backs up efficiently. e.g., uses rsync algorithm or similar so as to
> not repeatedly backup/store files that haven't changed.
> 2) Transmission of backups is encrypted.
> 3) Storage of backups is encrypted.
> 4) Can back up to offsite/remote file system.
> 5) Can back up to any old plain vanilla sftp/rsync-accessible remote
> file system. (Reason behind this requirement is that I already have
> access to a large amount of (non-encrypted) space for backups, and don't
> want to pay extra for another service.)
> 6) Does not require that you periodically take the network i/o storage
> hit of making a new full backup. (This is not an option, as the
> data I want to back up is getting large, and I only want to do it once.)