[sf-lug] sf-lug.mbox ... wget ... rsync? :-)
Michael Paoli
Michael.Paoli at cal.berkeley.edu
Sun Feb 1 00:10:13 PST 2015
[list hopping this to sf-lug list, as seems much more fitting/relevant there]
So ...
http://linuxmafia.com/pipermail/sf-lug.mbox/sf-lug.mbox
and the potential hazards of
wget --continue
(of which I'm well aware) ...
Rick, I notice you've got an rsync server running on linuxmafia.com
and a fair bit of publicly accessible content. :-)
Is
http://linuxmafia.com/pipermail/sf-lug.mbox/sf-lug.mbox
also available via such rsync access (and if so, could you let me/us
know the rsync path to it)? Or, if it isn't, could you make it so
and let us know? Thanks for your consideration on this.
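
For anyone following along, here is a rough sketch of how one might look
for it once a path is known. Only the host name comes from this message;
the module name and the path inside it are placeholders I made up, so the
commands are printed rather than run.

```shell
#!/bin/sh
# Only the host is real; module and path are hypothetical placeholders.
host=linuxmafia.com
module=MODULE_NAME      # placeholder: the actual module name is not known
path=sf-lug.mbox        # placeholder: path inside that module, also a guess

# Asking an rsync daemon for "host::" with no module name lists the
# modules it offers.
list_cmd="rsync ${host}::"

# Once module and path are known, a fetch would look something like this.
fetch_cmd="rsync -av rsync://${host}/${module}/${path} ./sf-lug.mbox"

# Print instead of executing, since the module and path above are made up.
echo "$list_cmd"
echo "$fetch_cmd"
```

The first form (host followed by two colons and nothing else) is the usual
way to ask a daemon what it exports; everything after that depends on what
Rick actually has configured.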
references/excerpts:
> From: jim <jim at well.com>
> To: balug-talk at lists.balug.org
> Subject: Re: [BALUG-Talk] Good News, Sad News
> Date: Sun, 25 Jan 2015 14:52:35 -0800
>
> Thank you, Rick,
> The man page from which you quoted is the same as is on
> my system. I think it's particularly well-written info, although
>
> Below is a summary of my attempt to run the wget
> command that Rick sent me (my attempt worked).
>
> $ wget http://linuxmafia.com/pipermail/sf-lug.mbox/sf-lug.mbox
>
> I'm inferring from the man page info that I can, in the near
> future, run
> $ wget -c ...
> and that wget will open the remote file, move the file pointer
> to the byte position (in the remote file) of the last byte in the
> local file (on the design assumption that there has been no
> change to the contents of the remote file below that byte
> position), and then commence copying the remote file,
> appending the contents to the existing local file.
>
> On 01/25/2015 02:26 PM, Rick Moen wrote:
>> Quoting Jim Stockford (jim at well.com):
>>
>>> I'm willing to read man pages....
>> Could help. ;->
>>
>> $ man wget
>> [...]
>> `-c' `--continue'
>> Continue getting a partially-downloaded file. This is useful when
>> you want to finish up a download started by a previous instance of
>> Wget, or by another program. For instance:
>>
>> wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z
>>
>> If there is a file named `ls-lR.Z' in the current directory, Wget
>> will assume that it is the first portion of the remote file, and
>> will ask the server to continue the retrieval from an offset equal
>> to the length of the local file.
>>
>> Note that you don't need to specify this option if you just want
>> the current invocation of Wget to retry downloading a file should the
>> connection be lost midway through. This is the default behavior.
>> `-c' only affects resumption of downloads started _prior_ to this
>> invocation of Wget, and whose local files are still sitting
>> around.
>>
>> Without `-c', the previous example would just download the remote
>> file to `ls-lR.Z.1', leaving the truncated `ls-lR.Z' file alone.
>>
>> Beginning with Wget 1.7, if you use `-c' on a non-empty file, and
>> it turns out that the server does not support continued downloading,
>> Wget will refuse to start the download from scratch, which would
>> effectively ruin existing contents. If you really want the download
>> to start from scratch, remove the file.
>>
>> Also beginning with Wget 1.7, if you use `-c' on a file which is
>> of equal size as the one on the server, Wget will refuse to download
>> the file and print an explanatory message. The same happens when
>> the file is smaller on the server than locally (presumably because
>> it was changed on the server since your last download attempt)--because
>> "continuing" is not meaningful, no download occurs.
>>
>> On the other side of the coin, while using `-c', any file that's
>> bigger on the server than locally will be considered an incomplete
>> download and only `(length(remote) - length(local))' bytes will be
>> downloaded and tacked onto the end of the local file. This behavior
>> can be desirable in certain cases--for instance, you can use
>> `wget -c' to download just the new portion that's been appended to
>> a data collection or log file.
>>
>> However, if the file is bigger on the server because it's been
>> _changed_, as opposed to just _appended_ to, you'll end up with a
>> garbled file. Wget has no way of verifying that the local file is
>> really a valid prefix of the remote file. You need to be especially
>> careful of this when using `-c' in conjunction with `-r', since
>> every file will be considered as an "incomplete download" candidate.
>>
>> Another instance where you'll get a garbled file if you try to
>> use `-c' is if you have a lame HTTP proxy that inserts a "transfer
>> interrupted" string into the local file. In the future a "rollback"
>> option may be added to deal with this case.
>>
>> Note that `-c' only works with FTP servers and with HTTP servers
>> that support the `Range' header.
>>
>> Note the caution about files that have been _changed_ as opposed to
>> merely appended to. If an administrative user has, for some reason,
>> decided to purge some past mails from an mbox, your 'wget -c' fetch
>> of that file will append whatever follows the byte range you already
>> have, though the assumption that resuming from that point makes sense is
>> actually incorrect. To be safe against that (unlikely) possibility, you
>> would omit '-c'. As it happens, the mbox file isn't really very big.
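
To make the appended-versus-changed hazard above concrete, here is a small
local simulation of resume-by-offset, i.e. the behaviour the man page
describes for `wget -c'. It is only a sketch: it never touches the network
or wget itself, the file names and "messages" are made up, and the
is_prefix check is my own illustration (per the man page, Wget does no such
check). It also assumes a head that accepts -c, which GNU and BSD versions do.

```shell
#!/bin/sh
# Local simulation of resume-by-offset; all file names here are hypothetical.
workdir=$(mktemp -d)

# resume LOCAL REMOTE: append the bytes of REMOTE that lie past LOCAL's length.
resume() {
    have=$(( $(wc -c < "$1") ))
    tail -c +"$((have + 1))" "$2" >> "$1"
}

# is_prefix LOCAL REMOTE: succeed only if LOCAL matches the start of REMOTE.
is_prefix() {
    have=$(( $(wc -c < "$1") ))
    head -c "$have" "$2" | cmp -s - "$1"
}

# What was fetched earlier: two "messages", kept in two identical copies.
printf 'msg1\nmsg2\n' > "$workdir/local_a"
cp "$workdir/local_a" "$workdir/local_b"

# Case A: the remote file was only appended to.
printf 'msg1\nmsg2\nmsg3\n' > "$workdir/remote_appended"
# Case B: msg1 was purged from the remote file and two messages were added.
printf 'msg2\nmsg3\nmsg4\n' > "$workdir/remote_purged"

# Check first, because resuming modifies the local copy.
is_prefix "$workdir/local_a" "$workdir/remote_appended" && check_a=safe || check_a=unsafe
is_prefix "$workdir/local_b" "$workdir/remote_purged" && check_b=safe || check_b=unsafe

resume "$workdir/local_a" "$workdir/remote_appended"
resume "$workdir/local_b" "$workdir/remote_purged"

# Compare each resumed local copy against its remote file.
cmp -s "$workdir/local_a" "$workdir/remote_appended" && result_a=intact || result_a=garbled
cmp -s "$workdir/local_b" "$workdir/remote_purged" && result_b=intact || result_b=garbled

echo "appended only: prefix check $check_a, after resume $result_a"
echo "purged: prefix check $check_b, after resume $result_b"
```

In the purged case the local copy ends up holding msg1, msg2, msg4: it
still has the purged message and never receives msg3. Of course, the prefix
check here needs the whole remote file on hand, which defeats the purpose of
resuming; it is there only to show what "valid prefix" means.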