[sf-lug] (forw) (forw) Trouble ticket # CR109553011 diagnostic data

Rick Moen rick at linuxmafia.com
Fri Sep 22 22:30:08 PDT 2023


And, _here_ we are.  All that effort to clarify the problem, and so far,
it _looks_ like the only support ticket that existed (a) was wrong, and
(b) got closed.

I'm currently on the phone to level 1 support _again_, seeing if I can
get a real support case created.

----- Forwarded message from Rick Moen <rick at linuxmafia.com> -----

Date: Fri, 22 Sep 2023 22:28:30 -0700
From: Rick Moen <rick at linuxmafia.com>
To: Nick Varela <Nick_Varela at cable.comcast.com>
Subject: (forw) Trouble ticket # CR109553011 diagnostic data
Organization: If you lived here, you'd be $HOME already.

What's going on?

Mr. Varela, alas, I have _big_ problems with the handling of this 
ticket (CR109553011) so far, by Comcast Business as a whole -- at least, 
what I can see by logging into the account and viewing tickets.

Your field service technician, Alberto, visited about 12 hours ago and 
took down technical details for ticket CR109553011 including concise
diagnostic information, demonstrating the problem.  As Alberto left my 
premises about 10AM PDT, he asked me to e-mail you detailed technical
information so that the ticket could be escalated to level 2 or level 3 
Support.  I furnished the information below.

Now, 12 hours later, I check on the status of this "emergency support
ticket" at
https://business.comcast.com/account/support-tickets/689906011127102015Comcast.IMS
, and, you know what I see there?

  Ticket      Status  Service  Address            Date Opened  Last Action
  CR109553011 Closed  Internet 1105 ALTSCHUL AVE  Sep 22, 2023 Sep 22, 2023
                               UNIT HMOFC, MENLO
                               PARK CA 94025

   Summary: Cant Connect/Surf

_That_ is the "emergency support" ticket.  Problem description "Cant
Connect/Surf" (dead wrong) and "Closed" (dead wrong).

Does Comcast Business need to start over, to acknowledge and address
the described problem?

I have not been called/SMSed since Alberto left, and I have been given
no indication anyone is working on, or even acknowledges, this problem
so far.



----- Forwarded message from Rick Moen <rick at linuxmafia.com> -----

Date: Fri, 22 Sep 2023 10:42:09 -0700
From: Rick Moen <rick at linuxmafia.com>
To: Nick Varela <Nick_Varela at cable.comcast.com>
Subject: Trouble ticket # CR109553011 diagnostic data

Alberto asked me to please pass along data on the DNS problem.  

I have a home Linux server ("linuxmafia.com" aka "ns1.linuxmafia.com")
on one of my five Comcast-issued static IPs, 96.95.217.99.  The physical 
box connects directly on a short ethernet cable to the Technicolor
CGA4332COM gateway box.  Among my server's duties is authoritative DNS,
communicating with a number of remote authoritative DNS servers,
including ns1.sf-lug.org, IP 96.86.170.229, to serve domains' DNS data
to the public.  This notably includes "zone transfers" (IXFR/AXFR
operations) between authoritative DNS servers.

DNS operations _had_ been normal and correct using this gateway box,
until yesterday.

Starting 1:02pm yesterday (Thursday, Sept. 21), my server began logging
peculiar failures when attempting _all_ zone transfers, starting with
this one:

Sep 21 13:02:57 linuxmafia named[16488]: zone sflug.com/IN: refresh: non-authoritative answer from master 96.86.170.229#53 (source 0.0.0.0#0)


To investigate, from linuxmafia.com's command line, I simulated a zone
transfer.  It failed.

linuxmafia:~# dig sflug.com. axfr @96.86.170.229 +short
; Transfer failed.
linuxmafia:~#
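
The manual check above can be wrapped in a small helper for repeated
testing.  This is only a sketch: the function name axfr_status is made
up for illustration, and it assumes dig reports failure with the literal
"; Transfer failed." line seen above, or with a "timed out" / "no servers
could be reached" diagnostic.  Anything else is naively treated as success.

```shell
# axfr_status: read dig AXFR output on stdin and print "failed" or "ok".
# Assumption: failures show up as one of the diagnostic strings below;
# any other output is (naively) counted as a successful transfer.
axfr_status() {
    if grep -Eq '^; Transfer failed\.|timed out|no servers could be reached'; then
        echo failed
    else
        echo ok
    fi
}

# Usage (requires network access, so left commented out here):
#   dig sflug.com. axfr @96.86.170.229 +short | axfr_status
```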


As shown below using the traceroute utility, the DNS nameserver at IP
address 96.86.170.229 is 11 router hops away (being in Alameda County):

linuxmafia:~# traceroute -n 96.86.170.229
traceroute to 96.86.170.229 (96.86.170.229), 30 hops max, 40 byte packets
 1  96.95.217.102  3.020 ms  3.734 ms  3.714 ms
 2  100.92.140.67  46.765 ms  46.778 ms 100.92.140.66  51.110 ms
 3  96.216.9.185  20.392 ms  20.324 ms  20.009 ms
 4  68.85.154.113  18.371 ms  18.385 ms  18.291 ms
 5  96.108.99.245  20.634 ms 96.108.99.249  19.246 ms 96.108.99.245 20.520 ms
 6  68.86.143.89  21.276 ms 68.86.143.93  17.622 ms 68.86.143.89  18.264 ms
 7  162.151.86.58  22.078 ms 162.151.87.226  15.395 ms  18.973 ms
 8  162.151.78.186  18.869 ms  17.885 ms 162.151.79.134  17.828 ms
 9  68.85.191.206  17.609 ms  17.481 ms 68.85.103.154  17.425 ms
10  73.189.65.18  27.509 ms  32.286 ms  32.124 ms
11  96.86.170.229  30.625 ms  27.555 ms  23.441 ms
linuxmafia:~#


However, if traceroute is instructed to probe using TCP packets rather
than its default UDP probes, and to probe the remote target's port 53
(used for DNS operations), an immediate answer comes back in _one_ hop:

linuxmafia:~# traceroute -nTp 53  96.86.170.229
traceroute to 96.86.170.229 (96.86.170.229), 30 hops max, 40 byte packets
 1  96.86.170.229  2.009 ms  1.509 ms  1.177 ms
linuxmafia:~#

This is obviously wrong.  IP 96.86.170.229 is not one hop away.  The
Technicolor gateway box is one hop away.  It appears to be intercepting
my server's efforts to reach remote hosts' TCP port 53, purporting to
proxy-respond, and failing if asked to process a zone transfer (query
type AXFR).
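
The comparison between the two traceroute runs above can be scripted
roughly as follows.  This is a sketch under assumptions: last_hop and
verdict are hypothetical helper names, and the rule "more than one hop
normally, but one hop or fewer on TCP port 53" is just a heuristic for
spotting this kind of interception, not a definitive test.

```shell
# last_hop: read `traceroute -n` output on stdin and print the number of
# the final hop line (0 if there are no hop lines).  Note that a row of
# "* * *" still counts as a hop line.
last_hop() {
    awk '$1 ~ /^[0-9]+$/ { n = $1 } END { print n + 0 }'
}

# verdict NORMAL_HOPS PORT53_HOPS: print "intercepted" when the target is
# several hops away normally yet appears at most one hop away on port 53.
verdict() {
    if [ "$1" -gt 1 ] && [ "$2" -le 1 ]; then
        echo intercepted
    else
        echo plausible
    fi
}

# Usage (requires root and network access, so left commented out here):
#   normal=$(traceroute -n 96.86.170.229 | last_hop)
#   port53=$(traceroute -nTp 53 96.86.170.229 | last_hop)
#   verdict "$normal" "$port53"
```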

This same failure occurs with all zone transfer attempts from any of
my server's other authoritative DNS nameserver peers.  All of this
started at 1:02pm yesterday, and is consistently repeatable.

By reading documentation about the Technicolor gateway box, I see that
it has the open source caching DNS proxy daemon "dnsproxy" built into
its firmware, intended to function as a transparent DNS cache.  Perhaps
this behavior is related to that.


The Technicolor gateway's firewall settings for IPv4 are set to "Minimum
Security (Low)", and "Disable Firewall for True Static IP Subnet Only"
is checked.  For IPv6, firewall security level is set to "Typical
Security (Default)".


My server does _not_ receive its IP from DHCP:  That server is true
static-IP, locally configured.  

My 5-usable IP CIDR block is netblock 96.95.217.96/29.  My notes about
my use of those IPs:

Static IP assignments:
96.95.217.96 network
96.95.217.97 [unassigned]
96.95.217.98 [unassigned]
96.95.217.99 uncle-enzo   (this is my server, linuxmafia.com)
96.95.217.100 gigabyte miniITX
96.95.217.101 WebPowerSwitch
96.95.217.102 gateway
96.95.217.103 broadcast
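
For anyone double-checking the table above, the network and broadcast
addresses of a CIDR block can be derived with plain shell arithmetic.
This is a minimal sketch: to_dotted and cidr_range are hypothetical
helper names, and no input validation is done.

```shell
# to_dotted N: print the 32-bit integer N as a dotted-quad IPv4 address.
to_dotted() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# cidr_range ADDR/PREFIX: print the network and broadcast addresses of
# the block containing ADDR.  Runs in a subshell so no variables leak.
cidr_range() (
    addr=${1%/*}
    prefix=${1#*/}
    IFS=. read -r a b c d <<EOF
$addr
EOF
    ip=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    net=$(( ip & mask ))
    bcast=$(( net | (~mask & 4294967295) ))
    echo "network $(to_dotted "$net") broadcast $(to_dotted "$bcast")"
)

cidr_range 96.95.217.96/29
# prints: network 96.95.217.96 broadcast 96.95.217.103
```

Everything between those two addresses is a usable host address, one of
which (96.95.217.102) is taken by the gateway.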

I think the only reference to the 96.95.217.96/29 CIDR block in the
Technicolor gateway's admin WebUI is this under Gateway > Connection >
Comcast Network:  "WAN Static IP Address (IPv4):96.95.217.102"


I notice that Connected Devices > Devices, in the "Online Devices"
table, does include my server in known devices, as follows:


Host Name           DHCP/Reserved IP  RSSI Level  Connection
00:20:ED:13:BA:89   DHCP              NA          Ethernet


If I hit "edit" for that entry, and try to change the second field from
"DHCP" to "Reserved IP", entering 96.95.217.99, this change gets
disallowed with message "Reserved IP Address is not in valid range:
10.1.10.2 ~ 10.1.10.253".


"Port Forwarding", "Port Triggering", and "Port Management" are all
switched off.


I notice that this Technicolor box, unlike the defective one it replaced
around a month ago, includes a provision for "DMZ", "to allow a single
computer on your LAN to open all of its ports".  This is not configured, 
as we carried forward our configuration unchanged from the replaced
unit.


Admin WebUI page "Advanced > Static Routing" has nothing configured in it.

Admin WebUI feature "Bridge Mode" is _not_ enabled.


I would be glad to save the Technicolor gateway box's configuration to
file and send it, and/or provide screenshots of particular pages of its
admin WebUI, or anything else Support needs.



My colleague Michael Paoli's description of the problem follows:


Problem: Comcast Business Technicolor gateway severely breaking DNS!

All customer side network use of Internet UDP or TCP port 53 (DNS) on or
beyond the Comcast Business Technicolor gateway device is being
intercepted and mucked with by that device, and not going straight
out to The Internet.  This is severely breaking DNS and must be
corrected!

Some examples of the numerous problems this causes detailed below:

customer networks:
96.95.217.96/29
2603:3024:182f:d100::/64
10.1.10.0/24

From customer networks, at least all these problems are observed:

All attempts at AXFR fail, even from Internet IPs that are open to all for AXFR:
$ dig @96.86.170.229 AXFR sflug.com.

; <<>> DiG 9.18.16-1~deb12u1-Debian <<>> @96.86.170.229 AXFR sflug.com.
; (1 server found)
;; global options: +cmd
; Transfer failed.
$ dig @2001:470:1f05:19e::2 AXFR sflug.com.
;; Connection to 2001:470:1f05:19e::2#53(2001:470:1f05:19e::2) for sflug.com. failed: timed out.
;; Connection to 2001:470:1f05:19e::2#53(2001:470:1f05:19e::2) for sflug.com. failed: timed out.
;; Connection to 2001:470:1f05:19e::2#53(2001:470:1f05:19e::2) for sflug.com. failed: timed out.
$

Attempts to communicate to UDP or TCP port 53 on The Internet don't
actually work.
The Comcast Business Technicolor gateway device may fake the traffic,
but it's not actually passed to the requested servers!

There is currently nothing on these two IP addresses:
96.86.170.225
2001:470:1f05:19e::dead:beef
Yet we see as responses via the
Comcast Business Technicolor gateway device:
$ dig @96.86.170.225 +noall +answer +nottl comcast.com. A
comcast.com.            IN      A       96.99.227.0
$ dig @2001:470:1f05:19e::dead:beef +noall +answer +nottl comcast.com. A
comcast.com.            IN      A       96.99.227.0
$
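
The "canary" test above (querying addresses where no DNS server exists)
can be looped over several addresses.  This is a sketch only:
unexpected_answers is a made-up function name, the DIG variable is a
hypothetical hook for substituting a stub during testing, and the
2-second timeout with a single try is an arbitrary choice.

```shell
# unexpected_answers ADDR...: query each address, all assumed to have no
# DNS server behind them, and print any that return an answer anyway.
# DIG is a hypothetical override for testing; it defaults to dig.
unexpected_answers() {
    for addr in "$@"; do
        # +short prints only answer data; diagnostic lines start with ";"
        # and are filtered out, leaving an empty string on a timeout.
        answer=$("${DIG:-dig}" "@$addr" +short +time=2 +tries=1 comcast.com. A 2>/dev/null | grep -v '^;' || true)
        if [ -n "$answer" ]; then
            echo "$addr answered: $answer"
        fi
    done
}

# Usage (requires network access, so left commented out here):
#   unexpected_answers 96.86.170.225 2001:470:1f05:19e::dead:beef
```

On an uninterfered path this should print nothing, since a query to an
address with no server behind it simply times out.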

When we use traceroute on TCP port 53 against these Internet server IP
addresses, which are listening on port 53 TCP and UDP, we find that via
the Comcast Business Technicolor gateway device the IPv4 address appears
to be an impossibly faked 1 hop away:
# traceroute -nTp 53 96.86.170.229
traceroute to 96.86.170.229 (96.86.170.229), 30 hops max, 60 byte packets
 1  96.86.170.229  1.489 ms  3.020 ms  4.033 ms
#
and we find on IPv6 that it's unreachable, even though in reality it's open:
# traceroute -nTp 53 2001:470:1f05:19e::2 | sed -e 's/^[ 0-3][0-9] //' | sort | uniq -c | sort -bnr
     30 * * *
      1 traceroute to 2001:470:1f05:19e::2 (2001:470:1f05:19e::2), 30 hops max, 80 byte packets
#
If we repeat those same tests, targeting TCP port 443 instead of 53, we
find the true number of hops and the intermediary IP addresses.  With
port 443, the Comcast Business Technicolor gateway device isn't
inserting itself and interfering with the traffic, unlike TCP port 53
traffic, which it intercepts and corrupts:
# traceroute -nTp 443 96.86.170.229
traceroute to 96.86.170.229 (96.86.170.229), 30 hops max, 60 byte packets
 1  96.95.217.102  1.836 ms  1.979 ms  2.076 ms
 2  100.92.140.66  24.901 ms  24.858 ms  24.935 ms
 3  96.216.9.177  16.515 ms  16.673 ms 96.216.9.185  16.340 ms
 4  68.85.154.117  16.830 ms  16.958 ms  17.109 ms
 5  96.108.99.249  19.133 ms 96.108.99.245  18.959 ms  18.823 ms
 6  68.86.143.93  16.493 ms  17.326 ms 68.86.143.89  17.446 ms
 7  162.151.87.226  17.487 ms  12.051 ms  17.965 ms
 8  162.151.79.134  19.776 ms  19.934 ms 162.151.78.186  16.562 ms
 9  68.85.103.154  18.947 ms  18.980 ms  21.573 ms
10  73.189.65.18  35.630 ms  36.055 ms  35.724 ms
11  96.86.170.229  34.308 ms  34.426 ms  36.755 ms
# traceroute -nTp 443 2001:470:1f05:19e::2
traceroute to 2001:470:1f05:19e::2 (2001:470:1f05:19e::2), 30 hops max, 80 byte packets
 1  2603:3024:182f:d100:4a4b:d4ff:fe20:d676  1.797 ms  1.894 ms  2.103 ms
 2  2001:558:101e:85::2  36.449 ms 2001:558:101e:85::3  36.227 ms 2001:558:101e:85::2  36.446 ms
 3  2001:558:82:101c::1  16.912 ms 2001:558:82:101a::1  17.619 ms 2001:558:82:101c::1  18.992 ms
 4  2001:558:80:26f::1  18.063 ms  18.752 ms  19.356 ms
 5  2001:558:80:1b4::2  17.993 ms 2001:558:80:262::1  16.784 ms 2001:558:80:1b4::2  19.369 ms
 6  2001:558:80:1a9::1  20.856 ms 2001:558:80:1b4::2  18.778 ms 2001:558:80:1a9::1  20.005 ms
 7  2001:558:3:97d::1  33.979 ms 2001:558:3:97e::1  19.025 ms 2001:558:3:97d::1  20.244 ms
 8  2001:558:3:97e::1  19.682 ms 2001:558:3:97f::1  19.201 ms  24.331 ms
 9  * 2001:558:3:4a::2  23.912 ms *
10  * * *
11  2001:470:0:3d3::2  32.798 ms * *
12  2001:470:0:3d3::2  17.678 ms  17.763 ms  17.373 ms
13  2001:470:0:45::2  19.240 ms  19.341 ms  21.601 ms
14  2001:470:1f05:19e::2  29.025 ms  28.792 ms  24.763 ms
#

This needs to be fixed so customer traffic to UDP and TCP port 53 on
The Internet works and is not interfered with, distorted, or altered
by the Comcast Business Technicolor gateway device.


----- End forwarded message -----

----- End forwarded message -----


