[conspire] Slice of life

Carl Myers cmyers at cmyers.org
Fri Sep 18 16:41:56 PDT 2009


Ah yes, the mention here of the "danger of picking up someone else's changes
accidentally" relates to the comment I just made.  Another feature of the
deployment system was the ability to move just certain changes to production
while leaving others behind.  "Merging a diff" of package versions, you
might say.  That, in combination with rollback capability, was what made it so
powerful.


On Fri, Sep 18, 2009 at 03:51:06PM -0700, Rick Moen wrote:
> Date: Fri, 18 Sep 2009 15:51:06 -0700
> From: Rick Moen <rick at linuxmafia.com>
> To: conspire at linuxmafia.com
> Organization: Dis-
> Subject: Re: [conspire] Slice of life
> 
> So, just a couple of words about "the Rick test", for the few of you who
> are likewise administering significant DNS server installations using
> ISC's BIND9 software.  It was:
> 
> 
> # Each quoted piece is closed before the backslash-newline; inside single
> # quotes the backslash would be passed to egrep literally.
> /usr/sbin/named-checkconf -z -t /var/named/chroot/ /etc/named.conf | \
> egrep 'missing|not allowed|unknown|not at top of zone|'\
> 'appears to be an address|no current owner name|MAXTTL|file not found|'\
> 'may not be used with|outside epoch|in future|invalid|unsupported|no TTL|'\
> 'ignoring|TTL set to prior TTL' | sort -u
> # Should produce no output.
> 
> 
> You might be wondering what that's all about.
> 
> 
> When I took over being the main guy in charge of $FIRM's DNS, I noticed
> a recurring syndrome:  Somebody would push a DNS change out of cvs,
> sometimes picking up _other_ people's cvs checkins in so doing.  That
> person would then go to the master nameserver to bring the changes
> online.  
> 
> In cases where the change includes dropping a domain, adding a domain,
> or otherwise changing BIND9's configuration file, it's not sufficient
> to issue the command ("rndc reload") that tells BIND "Don't restart
> everything, but just reload all of the DNS zones you service from disk."
> In those cases, you have to do "service named restart", which (of
> course) first stops the BIND9 daemon completely, unloading it from
> memory, and then reloads and relaunches it.
> 
> With "rndc reload", the worst that might happen is that BIND9 would
> refuse to load a zonefile that it didn't like (syntax error, or such).  
> However, with "service named restart", something far, far worse can
> happen:  If BIND9 sees something it doesn't like in _either_ its
> configuration files _or_ any of the (potentially large number of)
> zonefiles, it will choke and die in the middle of loading zones.
> 
> Not only that:  Even though BIND9 lists the names of zones as it starts
> and loads them, the last one echoed before the daemon dies tells you
> nothing about where the problem is.  There you sit, trying to triage
> the problem, while waiting for the automated alarms to start coming in,
> and the CEO to walk over and ask "How'd you manage to break the master
> nameserver?"
> 
> 
> Some relatively recent version of BIND9 finally introduced a _separate_
> pair of utilities, named-checkzone and named-checkconf, that externally
> provide the proper input validation that remains missing from the BIND9
> daemon, itself.  named-checkzone can check any individual zonefile(s)
> for basic syntax errors -- but doesn't understand chrooting, and so
> breaks on $INCLUDE references to within a chroot jail.
> 
> named-checkconf is more useful:  By itself, it checks BIND9's conffiles
> for basic syntax errors, and _does_ understand the effects of chrooting.  
> Even better, if you include the "-z" flag, it'll also check referenced
> zonefiles, again, with correct comprehension of what chrooting is all
> about.
> 
> So:  "/usr/sbin/named-checkconf -z -t /var/named/chroot/ /etc/named.conf"
> produces a very detailed listing of any problems with, first,
> /etc/named.conf (and include files) as a BIND9 configuration set, then,
> any problems with each of the zones referenced in the conffiles.
> 
> The remaining problem is that named-checkconf's report is way, way too
> verbose.  Errors and warnings don't stick out, unless you are reading
> the hundreds of lines of output very attentively.
> 
> To fix that problem, I found (using ldd) the ISC library file that
> contains all of named-checkconf's error messages, then abstracted from
> those strings sixteen substrings that seemed the ones potentially worth 
> worrying about.  The resulting egrep incantation says "discard every
> named-checkconf output line that doesn't include one of these significant
> error strings, and show only the lines that do."
> 
> As a result, null output shows pretty clearly that everything's OK, and 
> anything non-null highlights which zone or conffile has a problem.
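
[Inline note: the "null output means OK" idea can also be turned into an exit
status, so scripts can act on it.  What follows is only a rough sketch, not
Rick's actual tooling: the function name rick_filter is made up for
illustration, it uses "grep -E" (equivalent to egrep), and it reads
named-checkconf's output on stdin so it can be tried without BIND installed.
It uses the sixteen substrings from the test as first posted.]

```shell
#!/bin/sh
# Hypothetical helper: print only the worrying lines from stdin (deduplicated)
# and return non-zero if there were any; return zero on clean input.
rick_filter() {
    pattern='missing|not allowed|unknown|not at top of zone'
    pattern="$pattern|appears to be an address|no current owner name|MAXTTL"
    pattern="$pattern|file not found|may not be used with|outside epoch"
    pattern="$pattern|in future|invalid|unsupported|no TTL|ignoring"
    pattern="$pattern|TTL set to prior TTL"

    # grep reads the function's stdin; sort -u collapses repeated messages.
    matches=$(grep -E "$pattern" | sort -u)

    if [ -n "$matches" ]; then
        printf '%s\n' "$matches"
        return 1
    fi
    return 0
}
```

Assuming the same paths as in the post, it would be used roughly like this:

    /usr/sbin/named-checkconf -z -t /var/named/chroot/ /etc/named.conf 2>&1 |
        rick_filter && echo "config and zones look OK"

The "2>&1" is a precaution of mine in case some messages go to stderr.
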
> 
> I actually just realised, by checking the unfiltered output, that I need
> to add an item to the filter list, because of warning messages like
> this:
> 
> reverse/1-26.0.168.192.in-addr.arpa:16: warning: edge.example.com.1/26.0.168.192.in-addr.arpa: bad name (check-names)
> 
> 
> (Again, I'm substituting example.com for the real domain, and
> 192.168.0.0/26 for the real CIDR IP block.)  One of my colleagues had 
> created reverse-DNS zonefile 1-26.0.168.192.in-addr.arpa for reverse
> domain 1/26.0.168.192.in-addr.arpa with the following entry:
> 
> 25   IN  PTR   edge.example.com
> 
> ...thereby committing the second most-common DNS mistake (after failing
> to increment the serial number), because he meant to say:
> 
> 25   IN  PTR   edge.example.com.
> 
> The error didn't break DNS, but it resulted, for lack of the trailing
> period, in the reverse DNS for 25.1/26.0.168.192.in-addr.arpa becoming 
> 
>    edge.example.com.1/26.0.168.192.in-addr.arpa
> 
> ...which was not what he intended.
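
[Inline note: the missing-trailing-dot mistake can also be hunted for directly
in a zonefile.  The sketch below is my own illustration, not something from
Rick's setup: ptr_missing_dot is a hypothetical name, and it only does a
simple field check on zonefile text fed to it on stdin.  A relative PTR target
is legal, so treat its output as lines to review rather than definite errors.]

```shell
#!/bin/sh
# Hypothetical helper: read zonefile text on stdin and print, with its line
# number, every PTR record whose target (the last field) lacks a trailing dot.
ptr_missing_dot() {
    awk '
        { sub(/;.*/, "") }    # strip zonefile comments
        {
            # Look for the PTR keyword anywhere before the last field.
            for (i = 1; i < NF; i++) {
                if (toupper($i) == "PTR" && $NF !~ /\.$/) {
                    print NR ": " $0
                    break
                }
            }
        }
    '
}
```

For example, "ptr_missing_dot < reverse/1-26.0.168.192.in-addr.arpa" would
list the "25   IN  PTR   edge.example.com" entry quoted above.
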
> 
> So, I guess my revised test needs to be 
> 
> # Each quoted piece is closed before the backslash-newline; inside single
> # quotes the backslash would be passed to egrep literally.
> /usr/sbin/named-checkconf -z -t /var/named/chroot/ /etc/named.conf | \
> egrep 'missing|not allowed|unknown|not at top of zone|'\
> 'appears to be an address|no current owner name|MAXTTL|file not found|'\
> 'may not be used with|outside epoch|in future|invalid|unsupported|no TTL|'\
> 'ignoring|TTL set to prior TTL|bad name' | sort -u
> 
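
[Inline note: since the danger described above is in "service named restart",
one natural use of the test is as a gate in front of the restart.  This is
only a sketch under my own assumptions: restart_if_clean is a hypothetical
name, and both the check and the restart are passed in as command strings so
that it can be tried with harmless stand-ins.]

```shell
#!/bin/sh
# Hypothetical guard: run a check command, and run the restart command only
# if the check printed nothing at all (on stdout or stderr).
restart_if_clean() {
    check_cmd=$1
    restart_cmd=$2

    problems=$(sh -c "$check_cmd" 2>&1)

    if [ -n "$problems" ]; then
        printf 'Not restarting; check reported:\n%s\n' "$problems" >&2
        return 1
    fi

    sh -c "$restart_cmd"
}
```

If the test were saved as a script (say, a hypothetical
/usr/local/sbin/rick-test), the call would be something like:

    restart_if_clean /usr/local/sbin/rick-test 'service named restart'
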
> 

-- 
Carl Myers 
PGP Key ID 3537595B
PGP Key fingerprint 9365 0FAF 721B 992A 0A20  1E0D C795 2955 3537 595B
