[sf-lug] High(er) availability for SF-LUG site(s) (& some BALUG stuff too).

Michael Paoli Michael.Paoli at cal.berkeley.edu
Thu Nov 26 11:30:47 PST 2015


This covers the web sites & DNS master (list is hosted separately by
Rick Moen), and this also covers a fair bit of BALUG stuff too (e.g.
wiki, archives, test/beta/staging sites, etc. - everything *except*
BALUG's [www.]balug.org and list stuff).

Anyway, more-or-less per earlier plan, I did get as far yesterday as
doing the first *live* migration of that host.  And also *without*
shared storage, which does also work perfectly fine - it just takes a
wee bit longer for the overall migration, but when the actual final
switch happens for the guest VM itself, it's exceedingly fast (on the
order of 10s of milliseconds or so? - I haven't actually timed that
final bit quite yet).  So, ... that VM host is no longer "stuck" just on
my personal laptop :-) ... which means it can very much remain up and
Internet accessible - even when my laptop isn't (or, e.g. travels away
from home).

$ wakeonlan 00:30:48:91:97:90
...
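For the "wake it, then wait until it's usable" step, a minimal sketch
(not the script actually used here).  The MAC address is the one above;
the HOST address, the ssh probe, and the retry counts are assumptions
for illustration.  As written it only defines the helper; set RUN=1 to
actually send the magic packet and wait.

```shell
#!/bin/sh
# Sketch only: wake the target host, then poll until ssh answers.
# MAC is from the post; HOST, the probe, and the timings are assumptions.
MAC=${MAC:-00:30:48:91:97:90}
HOST=${HOST:-192.168.55.2}

# wait_for CMD TRIES DELAY
# Run CMD up to TRIES times, sleeping DELAY seconds between failures.
# Returns 0 as soon as CMD succeeds, 1 if it never does.
wait_for() {
    cmd=$1 tries=$2 delay=$3
    while [ "$tries" -gt 0 ]; do
        if sh -c "$cmd" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}

if [ "${RUN:-0}" = 1 ]; then
    wakeonlan "$MAC"
    # Up to 60 probes, 5 seconds apart (about 5 minutes at most).
    wait_for "ssh -o BatchMode=yes -o ConnectTimeout=5 $HOST true" 60 5 ||
        { echo "$HOST did not come up" >&2; exit 1; }
fi
```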
Not 100% anticipated, but not a huge surprise, and easy enough to
address - some of the last kinks to be worked out were in allowing the
live migration to be successful.  CPU type and flags/capability:
error: unsupported configuration: guest and host CPU are not
compatible: Host CPU does not provide required features: popcnt,
sse4.2, sse4.1
Ah, so the laptop CPU is a wee bit more modern than that on "vicki" -
and by more-or-less default, the guest CPU was configured to take
advantage of many/most of the host CPU's capabilities.  Easy enough to
deal with that - bring the VM guest down, reconfigure the virtual CPU to
disable those 3 capabilities, bring the VM guest back up again, and
repeat the attempt - made it fine past that error.  Next glitch was a
bit more puzzling:
error: internal error: unable to execute QEMU command 'migrate': State
blocked by non-migratable device '0000:00:05.0/ich9_ahci'
Wee bit o' search and ... QEMU can't live migrate SATA (at least not yet
safely in the version I'm using; by default, migration of such is
disabled for safety reasons).  Bring the guest down, take the virtual
hard drive off of SATA, turn it into SCSI and attach it to SCSI ... and
... same error?  Checked configuration again - nothing attached to the
(virtual) SATA bus/controller, but the SATA bus/controller was still
there, ... next step, remove that, and repeat ... and ... success, all
went fine, no errors:
# virsh migrate --live --persistent --copy-storage-all --verbose balug \
> qemu+ssh://192.168.55.2/system && virsh autostart --disable balug
... and all went fine and dandy.  And then, live migrating back:
# virsh migrate --live --undefinesource --copy-storage-all --verbose \
> balug qemu+ssh://tigger.mpaoli.net./system
And that went perfectly fine too, not so much as a glitch to notice on
the guest VM itself (though the storage replication took a while, so
it's not a speedy move from perspective of the physical hosts) ... TCP
connections between guest and Internet, etc., all maintained perfectly
fine across the live migrations.
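
Both glitches above could in principle be caught before attempting a
migration.  A rough pre-flight sketch follows; the helper names are
made up for illustration, and this is not something from the actual
setup.  Note that /proc/cpuinfo spells the features with underscores
(sse4_1, sse4_2) where libvirt's error message uses dots (sse4.1,
sse4.2).

```shell
#!/bin/sh
# Sketch only: hypothetical pre-flight helpers for the two errors above.

# missing_cpu_flags "FLAGS" FLAG...
# Print each required FLAG that is absent from the space-separated
# FLAGS string (e.g. the "flags" line of the destination's /proc/cpuinfo).
missing_cpu_flags() {
    have=" $1 "
    shift
    for flag in "$@"; do
        case $have in
            *" $flag "*) ;;                 # present, nothing to report
            *) printf '%s\n' "$flag" ;;     # absent on that host
        esac
    done
}

# has_sata
# Read libvirt domain XML on stdin; succeed if it still defines a SATA
# controller (type='sata') or a disk on the SATA bus (bus='sata').
has_sata() {
    grep -Eq "(type|bus)=['\"]sata['\"]"
}

# Possible use (DEST is an assumption; balug is the guest from the post):
#   flags=$(ssh "$DEST" \
#       "sed -n 's/^flags[[:space:]]*: //p' /proc/cpuinfo | head -n 1")
#   missing_cpu_flags "$flags" popcnt sse4_1 sse4_2
#   virsh dumpxml balug | has_sata && echo "SATA still present" >&2
```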

Wee bit more stuff to do / work on ... e.g. (at least theoretically),
o Turn it into a (nearly) push-button operation (run one relatively
   simple script - or pair of scripts - partially drafted, but yet to
   polish those off).
o Investigate/test --copy-storage-inc - if suitable and safe, that may
   significantly speed the disk data copy portion of the migration.
   Some of the storage I have set up is highly optimized for physical
   storage space reduction, but consequently has very low write
   performance - which is mostly quite fine, but slows migration,
   especially back to the laptop; read performance, however, is more
   than sufficient.  E.g. on the physical host (laptop) we have:
   # ls -hnos balug-sda
   4.8G -rw------- 1 114 16G Nov 26 18:41 balug-sda
   #
   Quite efficient space utilization (deduplication + compression:
   4.8G allocated for a 16G image) - but at the cost of write
   performance (and some CPU burn, particularly on heavy writing, and
   some more suck of RAM too) - but that happens to be the trade-off I
   want the majority of the time for that storage - so that's highly
   acceptable (laptop SSD is "only" about 150 GiB ... and I've a whole
   lot of other stuff on it too - I'm fine with LUG VM taking ~5 GiB of
   physical storage ... but not gonna give it 16 GiB!).  Where it
   resides, it also does deduplication across some ISOs that closely
   correspond to the installed operating system (and also other data),
   so that also aids in reducing total physical storage space consumed.
o I'll also carefully review, and likely adjust/tweak other bits of the
   migration options and handling of the VMs after migration - most
   notably bits regarding undefine or not, and autostart or not - and
   where.  And of course test it all out more fully.  :-)
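
As a starting point for the push-button idea in the list above, one
possible shape for such a script is sketched here.  This is not the
partially drafted script mentioned; the function names and the DRY_RUN
switch are assumptions.  It only assembles the same virsh options used
earlier in this message, optionally swapping --copy-storage-all for
--copy-storage-inc, and by default just prints what it would run.
Whether --copy-storage-inc is safe with this storage remains untested.

```shell
#!/bin/sh
# Sketch only: wrap the two "virsh migrate" invocations from this post.
# DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# run CMD ARG...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# migrate DOMAIN DEST_URI DEFINE_FLAG [STORAGE]
#   DEFINE_FLAG: persistent | undefinesource
#   STORAGE:     all (default) | inc
migrate() {
    domain=$1 uri=$2 define_flag=$3 storage=${4:-all}
    case $define_flag in
        persistent|undefinesource) ;;
        *) echo "define flag must be persistent or undefinesource" >&2
           return 2 ;;
    esac
    case $storage in
        all|inc) ;;
        *) echo "storage must be all or inc" >&2
           return 2 ;;
    esac
    run virsh migrate --live "--$define_flag" "--copy-storage-$storage" \
        --verbose "$domain" "$uri"
}

# Mirroring the two commands shown earlier:
#   migrate balug qemu+ssh://192.168.55.2/system persistent &&
#       run virsh autostart --disable balug
#   migrate balug qemu+ssh://tigger.mpaoli.net./system undefinesource
```

Keeping the dry run as the default means a mistyped URI or flag shows
up in the printed command before anything touches the running guest.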


> From: "Michael Paoli" <Michael.Paoli at cal.berkeley.edu>
> Subject: Re: sf-lug site & hardware
> Date: Tue, 24 Nov 2015 06:32:02 -0800

> Just an FYI update.
>
> So, my (overly optimistic) theoretical timeline - was hoping to have
> the sf-lug site relocated onto the higher availability hardware
> (notably not on VM on my laptop) by around 2015-11-15 or so.  Have
> adjusted the target timeline a bit, after some considerations (and also
> being relatively busy with other stuff too).  Anyway, one thing I
> didn't fully take into account earlier - fan noise.  That system that
> was in the colo - 1U unit, is comparatively noisy (I've gotten a bit
> spoiled mostly not listening to fan noise of such volume - even though
> it uses a fan and airflow design that mostly avoids tiny 1U high-RPM
> fan(s) - it's still noisier than most typical desktop systems - but
> less noisy than many typical 1U servers).  So, ... I adjusted my
> (theoretical) plans a bit.  With wakeonlan, qemu-kvm live migration,
> and wee bit of infrastructure (which I was mostly planning to do
> anyway), and small bit of scripting, I could arrange to have the VM
> running on the noisier (but higher availability) hardware, mostly only
> when it wouldn't be running on my laptop at home.  And with live
> migration, the migration would be effectively "invisible" to the guest
> VM host itself, its state, connections to it and sessions on it, etc.
> Anyway, fair bit closer to having that plan fully implemented.  Current
> target timeline for completion, by 2015-11-29, or at least not later
> than 2015-12-13.  May be fair bit sooner.  I'll update once it's in
> place and fully operational (did get a fair chunk of related
> infrastructure completed yesterday and today).
>
> references/excerpts:
> https://en.wikipedia.org/wiki/Wake-on-LAN
> https://en.wikipedia.org/wiki/Live_migration
>
>> From: "Michael Paoli" <Michael.Paoli at cal.berkeley.edu>
>> Subject: Re: sf-lug site & hardware
>> Date: Thu, 12 Nov 2015 14:01:41 -0800
>
>> FYI, this morning Jim Stockford and I did retrieve the physical server
>> host from the colo, upon which, up until some months back, the sf-lug
>> web site was running.  So, that improves the hardware resource
>> situation.  I'm guestimating I'll have the sf-lug website again running
>> on VM atop this hardware by sometime this weekend or so - that should
>> improve the availability a fair bit (notably the sf-lug website then
>> won't go down when my personal laptop goes down, offline, or out the
>> door from home).
>>
>> Thanks Jim!
>>
>>
>>> From: "Michael Paoli" <Michael.Paoli at cal.berkeley.edu>
>>> Subject: Re: Have you guys thought about http://www.freelists.org/  
>>> (hosted ...)
>>> Date: Wed, 11 Nov 2015 18:26:25 -0800
>>
>>> would be down or that it wasn't (relatively) high availability (at least
>>> compared to virtual machine running on my personal laptop - which does
>>> have the sf-lug site go out when my laptop goes out ... hopefully that
>>> situation will be improved in near future ... waiting on some resources
>>> to be able to do that.)
>>>
>>> references/excerpts:
>>> http://linuxmafia.com/pipermail/sf-lug/2015q4/011454.html
>>> http://linuxmafia.com/pipermail/sf-lug/2015q4/011441.html
>>>
>>>> From: Shane Tzen <shane at faultymonk.org>
>>>> Date: Wed, 11 Nov 2015 15:56:14 -0800
>>>> Subject: Re: [sf-lug] updated/upgraded: SF-LUG - operating system  
>>>> presently hosting
>>>> To: Michael Paoli <Michael.Paoli at cal.berkeley.edu>
>>>> Cc: SF-LUG <sf-lug at linuxmafia.com>
>>>>
>>>> Have you guys thought about http://www.freelists.org/about.html ?
>>>>
>>>> Looks like various LUGs are hosted -
>>>> http://www.freelists.org/cat/Linux_and_UNIX
>>>>
>>>>
>>>> On Fri, Oct 30, 2015 at 3:41 AM, Michael Paoli <
>>>> Michael.Paoli at cal.berkeley.edu> wrote:
>>>>
>>>>> It's been updated/upgraded:
>>>>> from: Debian GNU/Linux 7.9 (wheezy)
>>>>> to: Debian GNU/Linux 8.2 (jessie)
>>>>>
>>>>> http://lists.balug.org/pipermail/balug-admin-balug.org/2015-October/002989.html
>>>>>
>>>>> Still definitely *not* high availability though (alas, still sits atop
>>>>> a virtual machine on my *laptop*!).
>>>>>
>>>>> Hopefully in not too horribly distant future (like *real soon*), the
>>>>> physical box the site was earlier running upon will be successfully
>>>>> retrieved - once that happens, some high(er) availability options
>>>>> become possible.
>>>>>
>>>>> Let me know if you notice anything awry (notwithstanding the less than
>>>>> high availability).
>>>>>
>>>>> From: "Michael Paoli" <Michael.Paoli at cal.berkeley.edu>
>>>>>> Subject: It's alive*!: Re: SF-LUG - DNS, web site, ..., etc.
>>>>>> Date: Mon, 24 Aug 2015 03:10:26 -0700
>>>>>>
>>>>>
>>>>> Anyway, have taken the liberty ...
>>>>>> it's alive* ...
>>>>>> the [www.]sf-lug.{org,com}
>>>>>> websites are available again.




