[sf-lug] traffic shaping, ...: Re: [on-list] site up, http[s] down: Re: Wierd problems trying to access linuxmafia.com

Michael Paoli Michael.Paoli at cal.berkeley.edu
Thu Jan 3 23:03:40 PST 2019


As a Raw Bandwidth customer, with most or all of their services -
and I believe the "package" you have is probably the same as or
similar to mine - there are some additional tests one can do to
rather effectively test {up,down}load bandwidth, and one can of
course also look at latency while running such tests, while
significantly reducing (at least some) other variables.

E.g., with the package(s) I'm presuming, one also gets some
FTP storage on Raw Bandwidth (www.rawbw.com).  So one can do speed
checks there, e.g. upload a "large" file (or set of files) via FTP,
and download it via FTP or (no password required) HTTP.
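
A rough sketch of such a timed round trip - the FTP host name, the
account name, and the HTTP path where user files show up are only
illustrative assumptions here; adjust to your actual account details:

$ # make an incompressible ~10 MiB test file, so compression can't skew results
$ head -c 10485760 /dev/urandom > testfile
$ # upload via FTP; curl reports its average upload speed in bytes/second
$ # (ftp.rawbw.com and YOURUSER are placeholders, curl will prompt for the password)
$ curl -sS -T testfile -u YOURUSER -o /dev/null -w 'upload: %{speed_upload} bytes/s\n' ftp://ftp.rawbw.com/
$ # pull the same file back down via HTTP (no password required);
$ # the URL path is just a guess at where user files land
$ curl -sS -o /dev/null -w 'download: %{speed_download} bytes/s\n' 'http://www.rawbw.com/~YOURUSER/testfile'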

Alternatively, for the download, one can pull down, probably mostly
via HTTP, various documents, etc. from Raw Bandwidth's site - any of
their documentation, ... or users' web page content hosted there.
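
E.g. (the URL here is just an illustrative placeholder for anything
suitably large on their site):

$ # wget prints the average download rate when the transfer finishes
$ wget -O /dev/null http://www.rawbw.com/some/large/document.pdf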

In all probability, the bottleneck would be the [A]DSL link, and not
the hosts or drive I/O at either end.
I'm also presuming that Raw Bandwidth has comparatively ample bandwidth
among its own servers - for other, larger ISPs that might not be as
valid a presumption, as their various servers may be quite highly
scattered across networks - whereas I'm presuming most or all of
Raw Bandwidth's servers are quite well connected, network-wise, at
a site or two or so.

Also be sure, especially with HTTP (or HTTPS, if used), to use a
"dumb enough" client, so as not to have the server compress and the
client decompress the data - one can control that (if the client
doesn't already skip compression by default) with a client such as
wget or curl (and probably most or all FTP clients).  Alternatively,
one can use data that's already highly compressed - e.g. via xz -9 -
then most any client/server either won't try to compress it further,
or even if it attempts to, really won't be able to (at least in most
all cases), and probably won't bottleneck on CPU even if it attempts
to compress such data further.
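
E.g., to be explicit about requesting the uncompressed representation
over HTTP (neither curl nor wget asks the server for compressed
transfer encoding by default, so this just makes the intent
unmistakable; the URL is again only illustrative):

$ curl -sS -H 'Accept-Encoding: identity' -o /dev/null -w 'download: %{speed_download} bytes/s\n' http://www.rawbw.com/
$ wget --header='Accept-Encoding: identity' -O /dev/null http://www.rawbw.com/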

(Such relatively incompressible data will, to most any
compression attempt, look effectively like (or nearly like) random
data ... and mostly won't compress worth beans - lacking significantly
large repeating patterns, and too random in its smaller repeated
patterns to be stored any more efficiently by any added compression -
so not much RAM or CPU gets sucked into further compression attempts,
thus not much of a CPU burden.  Comparative example: note that the
first compression sucks quite a bit of CPU, but the 2nd, hardly any:
$ (t1=$(mktemp) && t2=$(mktemp) && 2>"$t1" time xz -9 < /usr/share/dict/american-english-insane | 2>"$t2" time xz -9 >>/dev/null; cat "$t1" "$t2"; rm "$t1" "$t2")
3.41user 0.11system 0:03.53elapsed 99%CPU (0avgtext+0avgdata 108820maxresident)k
0inputs+0outputs (0major+38520minor)pagefaults 0swaps
0.40user 0.04system 0:03.53elapsed 12%CPU (0avgtext+0avgdata 79840maxresident)k
0inputs+0outputs (0major+36055minor)pagefaults 0swaps
$
Note that the elapsed time is the same (it's part of the same pipeline),
but the CPU load is much lighter on the 2nd pass,
whereas CPU is maxed out (the bottleneck) on the first pass (and
especially since that was purely local - no network connections
involved).
We get similar results if we use a different compressor for the 2nd pass:
$ (t2=$(mktemp) && xz -9 < /usr/share/dict/american-english-insane | 2>"$t2" time bzip2 -9 >>/dev/null; cat "$t2"; rm "$t2")
0.28user 0.00system 0:03.57elapsed 7%CPU (0avgtext+0avgdata 7508maxresident)k
72inputs+0outputs (1major+1693minor)pagefaults 0swaps
$ (t2=$(mktemp) && xz -9 < /usr/share/dict/american-english-insane | 2>"$t2" time gzip -9 >>/dev/null; cat "$t2"; rm "$t2")
0.04user 0.00system 0:03.48elapsed 1%CPU (0avgtext+0avgdata 1492maxresident)k
0inputs+0outputs (0major+131minor)pagefaults 0swaps
$ (t2=$(mktemp) && xz -9 < /usr/share/dict/american-english-insane | 2>"$t2" time gzip >>/dev/null; cat "$t2"; rm "$t2")
0.06user 0.00system 0:03.44elapsed 1%CPU (0avgtext+0avgdata 1676maxresident)k
0inputs+0outputs (0major+136minor)pagefaults 0swaps
$
)
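
So, e.g., to prepare such an already-highly-compressed file once, for
reuse as the payload of upload/download tests (the output file name
is just illustrative):

$ xz -9 < /usr/share/dict/american-english-insane > testfile.xz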

Also, if one has shell access / an account with Raw Bandwidth, then there
are of course additional ways one can test.  Note also that if one uses,
e.g., ssh, again beware of compression - ssh, by default and/or depending
how it's configured, may do or attempt to do compression (notably before
encryption, since attempting compression after encryption is effectively
futile ... so if one is going to compress data that's to be encrypted,
either do it before encrypting, or don't bother).
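
E.g., a rough sketch of timed transfers over ssh with compression
explicitly off (the shell host name and account are placeholder
assumptions; testfile.xz is the pre-compressed file from above):

$ # upload direction: push the file to the remote end's bit bucket
$ time ssh -o Compression=no YOURUSER@shell.rawbw.com 'cat > /dev/null' < testfile.xz
$ # download direction: have the remote end send it back, discard locally
$ # (assumes a copy of testfile.xz also sits in the remote home directory)
$ time ssh -o Compression=no YOURUSER@shell.rawbw.com 'cat testfile.xz' > /dev/null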

> From: "Rick Moen" <rick at linuxmafia.com>
> Subject: Re: [sf-lug] traffic shaping, ...: Re:  [on-list] site up,  
> http[s] down: Re: Wierd problems trying to access linuxmafia.com
> Date: Thu, 3 Jan 2019 20:31:58 -0800

> Quoting Michael Paoli (Michael.Paoli at cal.berkeley.edu):
>
>> There are traffic shaping packages for at least most major Linux
>> distributions.
>
> Yes, I'll be looking at those.  As usual, there is an enlightening Gentoo
> wiki page about a specific implementation of this idea that uses only
> kernel code and iproute2's 'tc' (traffic control) utility:
> https://wiki.gentoo.org/wiki/Traffic_shaping
>
> ..and a similar Arch Linux wiki page:
> https://wiki.archlinux.org/index.php/advanced_traffic_control
>
> The Gentoo page combines the kernel Fair Queue Controlled Delay AQM
> (FQ_CODEL) module to create fair bandwidth for all flows, while
> attempting to minimise buffers (and hence delays) with throttling of
> total upload and download speeds to about 90% (adjustable, obviously) of
> one's maximum link speed using several kernel QoS modules and the
> iproute2 'tc' utility.
>
> I'm thinking that's probably what I _most_ need, because of the rather
> unfortunate characteristics of aDSL:  The outbound capacity is small
> compared to inbound, with the result that requests can saturate outbound
> and then effectively everything gets clogged for new traffic such as,
> oh, say, my ssh connections to my server, at which point I currently get
> peevish and start shooting services in the foot.  ;->
>
>
> But let's also consider the uplink.
>
> My Raw Bandwidth Communications (RBC) invoice states the _nominal_
> bandwidth I pay for:
>
> 'Residential DSL 384K-1.5M/128-384K monthly w/circuit'.
>
> I take that to mean 384 kbps to 1.5 Mbps _downwards_ bandwidth, 128 kbps to
> 384 kbps _upwards_ bandwidth.  Of course, the actual bandwidth inevitably
> differs and in my case is probably lower on account of high distance
> from the telco central office (AT&T LATA1, central office aka Serving
> Wire Center MNPKCA11 [1]) at 2950 Sand Hill Road (nr. I-280), Menlo Park,
> about 2.6 to 4.0 km (1.6 to 2.5 miles) away, depending on which streets
> the copper cabling follows.  As per the chart on
> https://www.increasebroadbandspeed.co.uk/2012/graph-ADSL-speed-versus-distance
> , ADSL speed starts dropping at 2.5 km (1.5 miles) and peters out about
> 5 km (3.1 miles).
>
> What I probably should do in the near future to get (at least at one
> sampling time) the _real_ effective total bandwidth up and down is shut
> off all network services on my server, then use something like
> http://www.speedtest.net/ to measure real throughput up and down -- as a
> baseline for subsequent throttling.
>
> No time like the present, actually.  So:
>
> Speedtest.net via RBC aDSL to Speedtest's server at Monkey Brains in  
> San Francisco:
>
> 0.65 Mbps (665 kbps) down
> 0.07 Mbps (72 kbps) up
> 62 ms ping times
>
> For purposes of _that_ test, I made linuxmafia.com be quiescent by
> temporarily shutting down apache2, exim, rsyncd, vsftpd, ntpd, and
> bind9, leaving only sshd listening.  However, on reflection I worried:
> Suppose there were hidden services serving bad guys, or other host weirdness?
> So:
>
> Repeating that test with my server completely severed from the RBC aDSL,
> i.e., yanking its ethernet cable (to baseline this measure and
> double-check for possible server mischief):
>
> 0.66 Mbps (675 kbps) down
> 0.06 Mbps (62 kbps) up
> 71 ms ping times
>
> That's so close as to be well within margin of error, further supporting
> my continuing working hypothesis that the linuxmafia.com server has
> _not_ been security-compromised.  (Yay.)
>
>
>
> However, while I was at it, it occurred to me to do additional test runs
> that eliminated not only my linuxmafia.com server as a bandwidth hog but
> also my entire inside LAN (an Arduino watering system, an Apple Airport
> Extreme WAP, some aging ethernet cabling, and an ethernet switch), by
> connecting my laptop (with static IP) on ethernet directly to the aDSL
> bridge box, and disconnecting my entire house network -- leaving as
> local suspects only the house landline (AT&T's problem along with
> everything else outwards from the demarc), the demarc box, the
> RBC-supplied high-pass filter widget that splits aDSL digital from
> landline voice, the RJ-11 cable from there to the aDSL bridge box, the
> (Westell) aDSL bridge box, and the short and apparently undamaged
> ethernet cable to my laptop.  (The only one of those components that I
> suspect could in theory be dodgy is the RJ-11 cable.)
>
> I was able to get one additional good test run to Speedtest's server at
> Monkey Brains, but then I gather that I must have triggered capping on my
> IP as I was getting partial and then complete test failure, and had to
> switch around to other S.F.-based Speedtest servers -- but, with outlier
> data points in two cases for upbound, the results were consistent:
>
> Still using Speedtest's server at Monkey Brains in San Francisco:
> 0.61 Mbps (624 kbps) down
> 0.08 Mbps (82 kbps) up
> 60 ms ping times
>
> Same but to Speedtest's Race Communications server in San Francisco:
> 0.63 Mbps (645 kbps) down
> 3.96 Mbps (4055 kbps)  up
> 72 ms ping times
>
> Same but to Speedtest's AT&T server in San Francisco:
> 0.70 Mbps (717 kbps) down
> 0.28 Mbps (286 kbps) up
> 95 ms ping times
>
> Same but to Speedtest's Fastmetrics server in San Francisco:
> 0.65 Mbps (665 kbps) down
> 0.07 Mbps (72 kbps) up
> 68 ms ping times
>
>
> So, I'm thinking that the last test's set of measures are probably
> typical, at least at this hour.  I'm tempted to do more tests to see
> if there are substantial variations over time, but have curtailed those
> tests for now because it was cold out there.
>
> The Lineus Bottomus:  Correct me if I'm wrong, but everything I'm seeing
> so far suggests saturation of the (~72 kbps) uplink.
>
> My server-centric household use-case is the opposite of what aDSL is
> good for:  Normal non-server aDSL usage does mostly download traffic
> with light traffic back through the uplink.  A server does mostly
> upbound traffic with light downwards (e.g., Web request) traffic.  So,
> the moment the uplink saturates, everything goes miserable.
>
> https://wiki.sonic.net/wiki/Bandwidth_Saturation
> (Page mostly assumes the user-typical balance of traffic, but is
> otherwise useful.)  Also:
> https://newsignature.com/articles/tcp-adsl-uplink-oversubscription-bad/
>
>    [...]
>    To wrap up, it is generally a very good idea, and should be considered
>    best practice, to use only synchronous connectivity for business
>    customers who have anything more than basic internet connectivity needs.
>
> Yeah, that was why I was so bummed when my prior provider, NorthPoint
> Communications, went belly-up in 2001, because it provided residential
> _SDSL_ (Symmetric Digital Subscriber Line), which didn't have this
> problem at all.
>
> In conclusion:  QoS / traffic-shaping needed.  And I'll ask Mike Durkin
> to mail me a known-good RJ-11 patchcord.
>
>
>
> Just for giggles, here's _good_ broadband connectivity as measured to
> the same Speedtest server over my mother-in-law's Comcast Business cable
> Internet connection:
>
> 29.7 Mbps down
> 12.0 Mbps up
> 17 ms ping times
>
>
>
>
>> Note also that traffic shaping can be deployed as a pass-through for,
>> e.g. a home network, giving benefit - and relative fairness to all -
>> though traffic shaping can also be rather to quite useful even if
>> deployed on as little as one individual host.
>
> This is one of the advantages of networks having a capable bastion host,
> through which everything flows on the way in and out.  My RBC
> aDSL-connected house network lacks such a bastion host, but OTOH only
> trivial amounts of traffic in and out normally go to or from anywhere
> _but_ my server, so that's really where I'll want to apply some brakes
> on impolite bandwidth-sucking.
>
>> As for Apache, I do rather also like the idea of a robots.txt with some
>> essentially "honeypot" do not go here stuff (and especially that's not
>> otherwise linked to or at all or found by a regular web crawl), and
>> use that (via logs, or CGI, or whatever) to have such offending client
>> IPs automagically throttled (at least until unused for some while) down
>> to the bandwidth of sucking chilled molasses through a cocktail straw.
>
> I like that.  It's sneaky.
>
> You might be as surprised as I was to hear that .htaccess is deprecated
> as a place to do such things:  The hit to Apache httpd performance on
> each page load is quite considerable, and you're better off putting the
> same logic, if at all, into Apache's conffiles, as their contents are
> cached at startup.
>
> But I am increasingly thinking the problem at my address has nothing
> fundamentally to do with abusive Web traffic.  It's just uplink
> saturation.
>
>
>
> [1] For a small amusement, try to Internet-research the street address of
> _your_ telco central office (for either landline or ADSL/ADSL2/ADSL2+
> purposes).  Ever since the year 2001, utilities have rather
> pathologically implemented a 'Telling the public anything helps
> terrorists' excuse for hiding information about their infrastructure.
> Nonetheless, if you are persistent and peruse regulatory filings at CPUC
> and FCC, you can find yours, as I did mine.



