[sf-lug] Linux backup software .. that meets unique requirements

David Hinkle hinkle at cipafilter.com
Thu Mar 11 12:03:41 PST 2010


Have you considered writing a script that produces a copy of the tree you're working on, but encrypts each file with a simple symmetric cipher, using the filename as the initialization vector for the encryption?  Then you can run rsync to back that tree up.  Rsync will have to copy any changed file in its entirety, but the data should be safe.  You could take it one step further and cache the md5sums of the files before you encrypt them (as duplicity does), so you can avoid re-encrypting files that haven't changed.  An approach like that might work well enough for a source code repository, but it would be bad for anything like a database.
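
A rough, untested sketch of what I mean (paths, key handling, and the remote destination are all placeholders, and deletions in the source tree aren't propagated):

    #!/bin/sh
    # Stage an encrypted copy of $SRC, deriving each file's AES-CBC IV
    # from its path so the ciphertext is deterministic, then rsync the
    # staged tree to the remote host.
    SRC=/data/repo
    STAGE=/var/backup/stage
    KEY=$(cat /root/backup.key)      # 64 hex chars = a 256-bit key
    CACHE=/var/backup/md5.cache
    touch "$CACHE"

    cd "$SRC" || exit 1
    find . -type f | while read -r f; do
        sum=$(md5sum "$f")
        # Skip files whose md5sum matches the cache from the last run.
        grep -qxF "$sum" "$CACHE" && continue
        iv=$(printf '%s' "$f" | md5sum | cut -c1-32)  # 16-byte IV from the name
        mkdir -p "$STAGE/$(dirname "$f")"
        openssl enc -aes-256-cbc -K "$KEY" -iv "$iv" -in "$f" -out "$STAGE/$f"
    done
    find . -type f -exec md5sum {} + > "$CACHE"
    rsync -a "$STAGE/" user@remote:backup/

The deterministic IV is what makes re-runs produce identical ciphertext for unchanged files; the trade-off is that it leaks which files changed between runs.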

David

-----Original Message-----
From: sf-lug-bounces at linuxmafia.com [mailto:sf-lug-bounces at linuxmafia.com] On Behalf Of David Rosenstrauch
Sent: Thursday, March 11, 2010 1:56 PM
To: sf-lug
Subject: Re: [sf-lug] Linux backup software .. that meets unique requirements

Well, (IIUC) the approach that duplicity takes solves that problem 
nicely.  It stores file fingerprint hashes on the remote site.  Then, at 
the next backup, it retrieves them and compares the actual files to see 
if they've changed.  This allows it to work in an rsync-like fashion, 
even if the files are encrypted remotely.  So something that works in a 
duplicity-like style would fit the bill.

(Of course duplicity itself *doesn't* fit it, unfortunately, since it 
has the irritating requirement of having to periodically go back and 
make - and upload - a fresh full backup.)

Secrecy at the remote end is definitely very important, by the way, as 
it's a shared server, which I don't own.

Thanks,

DR

On 03/11/2010 02:38 PM, David Hinkle wrote:
> The primary technical hurdle you face here is that good encryption by
> definition is going to change the entire output file for any minor
> change in the input file.  That means if you encrypt locally, then
> use rsync or a similar algorithm to transmit, you must transmit any
> changed file in its entirety.  If you have many small files, each
> independently encrypted, this might be fine.
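>
> You can see the effect with gpg (throwaway passphrase; the exact
> flags vary a bit between gpg versions):
>
>     $ echo secret data > f
>     $ gpg --batch --passphrase x -c -o f.1.gpg f
>     $ echo secret datb > f                    # change one byte
>     $ gpg --batch --passphrase x -c -o f.2.gpg f
>     $ cmp -l f.1.gpg f.2.gpg | wc -l          # nearly every byte differs
>
> (gpg also salts each run, so even an unchanged file re-encrypts to a
> different ciphertext, which is exactly what defeats rsync's delta
> transfer.)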
>
> If you encrypt at the remote end (even if you use an SSH transport as
> an intermediary), you get to make the best use of the rsync algorithm
> by only transmitting changes, but you lose perfect secrecy, because
> the remote end could theoretically intercept your unencrypted data
> after decrypting it from the tunnel but before encrypting it to disk.
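>
> For example (a sketch; this assumes the remote end is a box you can
> run commands on, which may not hold for plain sftp space):
>
>     rsync -a -e ssh /data/tree/ user@remote:staging/
>     ssh user@remote 'cd staging && find . -type f \
>         -exec gpg --batch --passphrase-file ~/.backup-pass -c {} \;'
>
> The plaintext sits on the remote disk between those two steps, which
> is the secrecy loss described above.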
>
> You have a couple of options; which one is best depends on how
> important perfect secrecy is to you, and on what your files look
> like.  Can you tell us more about the scenario?  I've got a great
> deal of experience with rsync-based backup solutions, so I can
> probably help you out.  You can implement either of the above with a
> good script.
>
> David
>
> -----Original Message-----
> From: sf-lug-bounces at linuxmafia.com
> [mailto:sf-lug-bounces at linuxmafia.com] On Behalf Of David Rosenstrauch
> Sent: Thursday, March 11, 2010 1:17 PM
> To: sf-lug
> Subject: [sf-lug] Linux backup software .. that meets unique
> requirements
>
> Trying to find Linux backup software that meets what apparently is a
> unique set of requirements.  I've checked out all the major backup
> apps (backuppc, duplicity, bacula, drop.io, etc.), but none of them
> seems to meet all of my requirements.  Anyone here know of one that
> does?  Here's what I'm looking for:
>
> 1) Backs up efficiently; e.g., uses the rsync algorithm or similar,
> so as not to repeatedly back up/store files that haven't changed.
>
> 2) Transmission of backups is encrypted.
>
> 3) Storage of backups is encrypted.
>
> 4) Can back up to offsite/remote file system.
>
> 5) Can back up to any old plain-vanilla sftp/rsync-accessible remote
> file system.  (The reason behind this requirement is that I already
> have access to a large amount of (non-encrypted) space for backups,
> and don't want to pay extra for another service.)
>
> 6) Does not require that you periodically take the network I/O and
> storage hit of making a new full backup.  (This is not an option, as
> the data I want to back up is getting large, and I only want to make
> a full copy of it once.)
>
> For some reason it's impossible to find something that fits all of
> the above.
>
> * backuppc and rsync don't satisfy #3.
>
> * duplicity doesn't satisfy #6.
>
> * drop.io and bacula don't satisfy #5.
>
> etc.
>
>
> Anyone have any suggestions here?  I'm actually on the verge of
> writing my own backup software to do this, but I'd *really* like to
> avoid that unless there's no other choice.
>
>
> Thanks,
>
> DR

_______________________________________________
sf-lug mailing list
sf-lug at linuxmafia.com
http://linuxmafia.com/mailman/listinfo/sf-lug
Information about SF-LUG is at http://www.sf-lug.org/



