[sf-lug] DNS: sf-lug.com., general, and balug.org.

Mon Mar 2 02:15:50 PST 2009

Quoting Michael Paoli (Michael.Paoli at cal.berkeley.edu):

> If I'm not mistaken, since >= BIND 8, the current terminology is
> master/slave ... primary/secondary was used for older versions of BIND.

It is, but (1) I am supremely indifferent to ISC's choice of jargon du
jour, (2) I personally found the earlier terminology much clearer, and
(3) within the context of explanations to newcomers, I find using _both_
to be clearer still.

So, yes, I knew extremely well what ISC says and doesn't say, these
days, and deliberately ignored that.

> and perhaps unlike primary/secondary, master/slave also works more
> logically for chaining relationships - e.g. a DNS server - even for
> the same zone, can be both a master (to downstream slave(s)), and a
> slave (to upstream master(s)) - primary/secondary doesn't seem to fit
> quite as smoothly in a chaining construct....

Sure it does.  The term fits perfectly well as a role designation concerning 
how one nameserver communicates with another concerning a particular
zone.  Think about it.

> (particularly if one wants to avoid the messiness of adding tertiary,
> etc.

But doing that would be dumb.  Thus, irrelevant objection.

> one generally only needs to talk about the relationship between a pair
> of servers).

Which is exactly why primary / secondary is fine, and the objection of
"but then you'd need to talk about tertiary, etc." is rubbish.

> Why I jumped to checking queries against the slave 

Yeah, no problem there.  Actually, just about any starting point in
finding the problem is fine; you just have to proceed systematically
from there.  That's why you need to make sure you know where diagnostic
"dig" queries _go_ (when dealing with situations like the recent one),
rather than just omitting the "@" parameter and hoping for the best.
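
A minimal sketch of that habit, assuming a POSIX shell and the stock
"dig" utility.  The function name soa_at_each and the DIG override are
illustrative inventions, not anything standard.  Each query names its
target with "@" and turns recursion off, so the answer comes from that
server's own data rather than from whatever resolver you happen to have:

```shell
# Ask each listed nameserver directly for the zone's SOA record.
# DIG may be overridden (e.g. DIG=echo) for a dry run with no network.
soa_at_each() {
    zone=$1
    shift
    for server in "$@"; do
        printf '%s: ' "$server"
        ${DIG:-dig} +norecurse +short -t soa "$zone" "@$server"
    done
}
```

Usage would be something like "soa_at_each sf-lug.com. ns1.sf-lug.com."
followed by each other listed server.  Serials that disagree tell you
which server is stale.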

> Key thing is to at least be sure and include checking end results -
> yes, we think we fixed it - that's good ... but ... does it actually
> *work*?

Exactly!

Something Jim Dennis talks about, in his lectures on system
administration, is that the concepts of unit testing from extreme
programming are exactly what a sysadmin needs, to do each job right the
first time.  That is, you need to include in the planning and execution
of each task the thinking out and execution of a suitable means to test
the thing you're doing.  Testing should be _integral_ to each task.

I used to tell my staff at Linuxcare a variation on that concept:  I'd
tell them "Your task isn't done until it's tested."  If the task was to
set up a piece of software, then the task isn't done until you've made
that software perform its function.  If the task includes making
software start at boot time, then you need to schedule a reboot to test
your assumption that everything will work OK during startup.  (Even on
production servers, you can work in a planned reboot _sometime_.  It's
better than finding out only during _unplanned_ reboots whether startup
is OK.)

> Yes, three or more (up to appropriate reasonable limit) is better.

It's not just "better"; it's RFC-recommended best practice.  RFC 2182
section 5:

   The DNS specification and domain name registration rules require at
   least two servers for every zone.  That is, usually, the primary and
   one secondary.  While two, carefully placed, are often sufficient,
   occasions where two are insufficient are frequent enough that we
   advise the use of more than two listed servers.  Various problems can
   cause a server to be unavailable for extended periods - during such a
   period, a zone with only two listed servers is actually running with
   just one.  Since any server may occasionally be unavailable, for all
   kinds of reasons, this zone is likely, at times, to have no
   functional servers at all.
   [...]
   It is recommended that three servers be provided for most
   organisation level zones, with at least one which must be well
   removed from the others.

> Maximum ... well, depends, in certain cases that's as high as, but no
> higher than 13.

"As high as"?  Fsck no.

You can have big problems from having anywhere near that many
nameservers, and seven is the practical limit -- and the _recommended_
limit -- for almost all situations.  (The root nameservers are an
anomalous case, for reasons I'd rather not get into.)

> So, ... what really bad happens with too many?  The complete DNS
> response isn't guaranteed to all fit within a single UDP packet.

That is only the beginning of the problems, and the simplest and most
mechanistic problem, one is likely to have.  Do I _really_ need to get
into that?
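
For the mechanistic part only, here is the arithmetic as a rough sketch.
It assumes classic DNS over UDP without EDNS0, where RFC 1035 caps a
message at 512 bytes, and it models only the root zone's priming
response: names of the form a.root-servers.net., name compression, and
one IPv4 glue record per server.  The function name is made up for
illustration.

```shell
# Estimated wire size of a root priming response listing n nameservers.
# Assumes: names like a.root-servers.net., compression, A glue only.
root_priming_size() {
    n=$1
    header=12
    question=$(( 1 + 2 + 2 ))           # "." + QTYPE + QCLASS
    rr_fixed=$(( 2 + 2 + 4 + 2 ))       # TYPE, CLASS, TTL, RDLENGTH
    first_ns=$(( 1 + rr_fixed + 20 ))   # owner "." + uncompressed target
    other_ns=$(( 1 + rr_fixed + 4 ))    # owner "." + 1-letter label + pointer
    glue_a=$(( 2 + rr_fixed + 4 ))      # pointer to name + IPv4 address
    echo $(( header + question + first_ns + (n - 1) * other_ns + n * glue_a ))
}
```

"root_priming_size 13" prints 436, which fits under 512.  Nameserver
names that are long and share no common suffix compress poorly, so an
ordinary zone reaches the limit with far fewer servers.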

> > I can offer you SVLUG's nameserver as a second slave.  NS1.SVLUG.ORG, IP
> > 64.62.190.98.  Just add it to ns1.sf-lug.com's allow-transfer ACL in
> > /etc/bind/named.conf, restart BIND9, and let me know.  I'll set up slave
> > nameservice and confirm that it can pull down zones and answer queries,
> > and you then add it to the authoritative list.
> 
> For running production BIND, generally much safer to reload (and then
> inspect logs for errors, and test to confirm), than restart.

It's entirely and spectacularly irrelevant to my point whether Jim, in
the previously discussed situation, does "rndc reload" or "service named
restart".  I.e., you are wandering off onto a typical
obsessive-compulsive geek irrelevancy.  However, that being said, in
Jim's shoes, if I had a fatal configuration error within that piece of
cr__ BIND9's conffiles, I'd much rather know sooner than later.
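
Either way, the config can be vetted before BIND is asked to load it.
A sketch, assuming BIND9's named-checkconf and rndc are on the PATH and
the Debian-style path mentioned above; the function name and the
CHECKCONF/RNDC overrides are illustrative, not part of BIND.

```shell
# Reload BIND only if the configuration parses cleanly.
check_then_reload() {
    conf=${1:-/etc/bind/named.conf}
    if ${CHECKCONF:-named-checkconf} "$conf"; then
        ${RNDC:-rndc} reload
    else
        echo "errors in $conf; not reloading" >&2
        return 1
    fi
}
```

A syntax error then surfaces at once, with the running daemon left
untouched.  You still inspect the logs and query the server afterwards.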

> Jim - if you're interested, let me know - I can also point you at an
> excellent free resource for DNS slave that I found when I was
> researching such for BALUG.

I hope you're not yet another person pushing EveryDNS, with its broken
djbware-based implementation that doesn't support AXFR and ignores
NOTIFY.

I respect Ulevitch and crew, but losing the ability to have timely
updates to secondaries is a pretty sad disadvantage, especially given
that any number of people in the Valley will be glad to give you
more-competent secondary service for free, too.

> Well, really, ideally :-) SF-LUG (and BALUG) should have appropriate
> monitoring set up ... and much of that monitoring should exist on
> systems *other* than those they're monitoring....

Thank you, Captain Obvious.  ;->

> Yes, good to monitor registry bits ... but I'd treat that as a rather
> distinct matter, as compared to DNS.

Um, excuse me?  The identity of which IPs are authoritative is "a
distinct matter from DNS"?  In what universe?

> but thus far I've seen it once where a TLD registry had
> the nameservers correct in whois data, but that data didn't match
> behavior of the authoritative nameservers 

That's why you check the glue records in the parent zone, genius.
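
Concretely, that means asking one of the parent zone's own servers,
without recursion, what it delegates.  A sketch, assuming "dig";
a.gtld-servers.net. is one of the .com servers, and the function name
and DIG override are illustrative.

```shell
# Show the NS delegation (and any glue in the additional section)
# that a parent-zone server hands out for a child zone.
parent_delegation() {
    zone=$1
    parent_server=$2
    ${DIG:-dig} +norecurse -t ns "$zone" "@$parent_server"
}
```

For example, "parent_delegation sf-lug.com. a.gtld-servers.net.", then
compare the authority and additional sections against both whois and
what the zone's own nameservers claim.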

> Well, opinions on BIND9 will vary :-) ... but I certainly agree with at
> least many of Rick's points about it.

_Many_?  Are you prepared to seriously assert that...

o BIND9 is fast?
o BIND9 is RAM-thrifty?
o BIND9 is sveltely featured?
o That running a single monolithic binary for all of the several distinct
  roles of a security-sensitive network daemon is a good security model?

If not -- and I sincerely hope not -- then you and I are in complete
agreement, on that matter.

> On the other hand, there may be other factors to consider - e.g.: does
> that distribution have an NSD package?

Why does nobody bother to do simple Web-searching?
Here's the RHEL-oriented .spec file, which will do great on CentOS:
http://www.nlnetlabs.nl/svn/nsd/trunk/contrib/nsd.spec
And rpmbuild is your friend:
http://linuxhacks.org/tutorials/jakes_rpm_build_tutorial.php
Actually, the .spec file is included _inside_ the source tarball
in its "contrib" directory, which also has a README file that says:

   nsd.spec: a rpm specfile to generate binary and source rpms. 
    Put the source tarball in  /usr/src/redhat/SOURCES. Then 
          rpmbuild -ba nsd.spec

Build requirements are the usual gcc, yacc, autoconf, make, OpenSSL
unless you disable TSIG support.  I can't remember if there's anything
else, but there's not much if there is.

> Teaching/training aspects of the sf-lug.com. box?

Yeah, God forbid that people might go a tiny bit out of their way to
learn something _better_ than the same old junk.  Here's
/srv/site-docs/nsd-instructions on SVLUG's Web/DNS server, the docs for
the SVLUG sysadmin team:

  NSD is an authoritative-only DNS nameserver written from scratch by the 
  people who run the .nl (Netherlands) top-level domain.  It does not
  provide recursive service, so the machine on which it runs needs to have 
  access to a full-service nameserver somewhere, via reference in 
  /etc/resolv.conf.

  NSD's advantages are high speed, small RAM footprint, and high security.
  It does support IXFR/AXFR (etc.) zone transfers, and thus is fully usable
  for both master and slave DNS service.

  Although it uses the same zonefile format that BIND8/BIND9 does, it achieves 
  much higher performance, in part by using a hashed binary database.  
  Accordingly, whenever you modify one of its zonefiles, you must "compile" it 
  using the "zonec" compiler utility.

  Runtime control of NSD is best asserted using the "nsdc" utility, whose syntax
  and features are modeled on those of BIND's "rndc" utility.  

  NSD's main configuration file is /etc/nsd/nsd.zones, which lists zone details,
  one zone per line, and has ";"-delimited header comments describing the 
  syntax for master ("primary") records, which typically feature a "notify" 
  IP list; and slave ("secondary") records, which feature a "masters" IP list.
  Standard location for zonefiles is subdirectories "primary" and
  "secondary".  

  Your maintenance sequence will typically be like this:

  1.  "cd /etc/nsd/primary"
  2.  Edit svlug.org.zone or whatever in your choice of text editor.  Don't
      forget to increment the S/N value!
  3.  "cd .."
  4.  "zonec -v nsd.zones"  This compiles the zone revision.
  5.  "nsdc restart"
  6.  Check your work, by doing "dig -t a www.svlug.org @ns1.svlug.org".
      (Substitute an appropriate reference record for "-t a www.svlug.org"
      to reflect whatever you worked on.)  It's a good idea to at least 
      run "dig -t soa svlug.org @ns1.svlug.org" to verify that your S/N 
      update is reflected in actual DNS return values.

  Further information on NSD is at this article:
  http://hardware.newsforge.com/article.pl?sid=05/06/28/1618219&tid=65
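
The check in step 6 can be scripted.  A sketch, assuming dig's "+short"
SOA output, whose third field is the serial; the function names and the
DIG override are illustrative and not part of NSD.

```shell
# Extract the serial (third field) from "dig +short -t soa" output on stdin.
soa_serial() {
    awk 'NR == 1 { print $3 }'
}

# Succeed only if the named server reports the serial you expect.
# DIG may be overridden for a dry run with no network.
serial_is_live() {
    zone=$1
    server=$2
    expected=$3
    live=$(${DIG:-dig} +short -t soa "$zone" "@$server" | soa_serial)
    [ "$live" = "$expected" ]
}
```

Usage, with a made-up serial: "serial_is_live svlug.org. ns1.svlug.org
2009030201 && echo OK".  Silence means the server is still handing out
the old zone.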

> If one dig(1)s (okay, pun wasn't initially intended, but boy it works
> ... especially when adding (1)) a bit deeper, one may find things fairly
> interesting in/around, oh, ... say around balug.org. and new.balug.org. and
> @ns1.balug.org.
> @ns1.everydns.net.
> @ns2.everydns.net.
> @ns3.everydns.net.
> @ns4.everydns.net.
> @150.135.84.2

Ugh.  Broken EveryDNS secondary nameservice.  Ignores NOTIFY, doesn't do
AXFR.  Avoid.