Subject: Re: unknown_client_reject_code = 5xx?
From: Greg A. Woods (woodsweird.com)
Date: Fri Sep 01 2000 - 14:46:32 CDT


[ On Friday, September 1, 2000 at 19:19:12 (+0200), Brad Knowles wrote: ]
> Subject: Re: unknown_client_reject_code = 5xx?
>
> At 11:34 AM -0400 2000/9/1, Greg A. Woods wrote:
>
> > The database dump of the central nameserver should show the source
> > address (i.e. the nameserver from whence the bogus record was
> > retrieved).
>
> Nope, it just said "; cr=auth".

Oh. That's not good.

Hmm... seems you may have hosed yourself, at least with respect to
getting sufficient information for this kind of debugging....

You might not like to spend the resources, but I think turning on the
"host-statistics" option is the only way you can trace these things with
any degree of certainty. For the benefit of bystanders:

   host-statistics
           If yes, statistics are kept for every host that the
           nameserver interacts with. The default is no. Note:
           turning on host-statistics can consume huge amounts of
           memory.

I think that's normally going to be 96 bytes per "neighbour".....
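To put that figure in perspective, here is a rough back-of-the-envelope sketch. The 96 bytes per host is only the estimate quoted above, not a measured value, and the function name and the host count in the usage note are made up for illustration.

```python
# Rough memory estimate for host-statistics.
# ASSUMPTION: 96 bytes per tracked host, taken from the estimate above;
# the real per-host cost depends on the BIND build and platform.
BYTES_PER_HOST = 96


def host_stats_memory(num_hosts, bytes_per_host=BYTES_PER_HOST):
    """Return the estimated number of bytes needed to track num_hosts hosts."""
    if num_hosts < 0:
        raise ValueError("num_hosts must be non-negative")
    return num_hosts * bytes_per_host


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024 * 1024)
```

For example, host_stats_memory(1000000) comes to 96000000 bytes, a little under 92 MiB, which is where the "huge amounts of memory" warning starts to bite on a busy central nameserver.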

With that option turned on you will of course see exactly where your
currently cached data arrived from, e.g. this from my cache:

$ORIGIN skynet.BE.
ns4 81354 IN A 195.238.1.36 ;NT=78 Cr=addtnl [193.74.208.139]
ns2 81354 IN A 195.238.1.34 ;NT=67 Cr=addtnl [193.74.208.139]
ns3 81354 IN A 195.238.1.35 ;NT=16 Cr=addtnl [193.74.208.139]
ns1 81354 IN A 195.238.1.33 ;NT=68 Cr=answer [193.74.208.139]
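If you have a large dump to wade through, pulling the credibility tag and the bracketed source address out of each line is easy to script. The following is only a sketch: it assumes single-line records shaped like the ones above (owner name, TTL, IN, type, a single-token RDATA, then a ";" comment), and the function and field names are my own invention, not anything from BIND. Continuation lines with no owner name, and multi-field records such as SOA, are not handled.

```python
import re

# ASSUMPTION: one record per line, laid out like the sample dump lines above.
DUMP_LINE = re.compile(
    r"^(?P<name>\S+)\s+(?P<ttl>\d+)\s+IN\s+(?P<type>\S+)\s+(?P<rdata>\S+)"
    r"\s*;(?P<comment>.*)$"
)
CRED = re.compile(r"Cr=(\w+)", re.IGNORECASE)   # credibility tag, e.g. Cr=addtnl
SOURCE = re.compile(r"\[([0-9.]+)\]")           # [a.b.c.d] source address


def parse_dump_line(line):
    """Parse one cache dump line; return a dict, or None if it does not match."""
    m = DUMP_LINE.match(line.strip())
    if m is None:
        return None
    comment = m.group("comment")
    cred = CRED.search(comment)
    src = SOURCE.search(comment)
    return {
        "name": m.group("name"),
        "ttl": int(m.group("ttl")),
        "type": m.group("type"),
        "rdata": m.group("rdata"),
        "credibility": cred.group(1).lower() if cred else None,
        # None when the dump was made without host-statistics turned on
        "source": src.group(1) if src else None,
    }
```

A record dumped without host-statistics simply comes back with "source" set to None, which is exactly the situation with the "Cr=auth" lines quoted below.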

> The bogus information I have in the dump is:
>
>[[trimmed to the meat]]
>
> $ORIGIN REAKTIE.COM.
> reaktie3 14074 IN NS ns.xlserver.net. ;Cr=auth
> 14074 IN NS ns2.xlserver.net. ;Cr=auth
> 14074 IN A 216.71.62.112 ;Cr=auth
> ; 8713 IN MX ns.xlserver.net. hostmaster.xlserver.net. (
> ; 100021401 86400 7200 3600000 28800 );reaktie.com.;NODATA
> ;-$ ;Cr=auth

Hmmmm.... What's up with the "MX" on the SOA record there?!?!?!?!?
That's really bogus looking too!!!! I hope that's a transcription error
because it's not even possible if I read the code in db_dump.c
correctly....

> In particular, the IP address for reaktie3 is not correct. The
> correct IP address is 209.150.128.56.
>
> The reverse DNS for this zone is a little strange, and served by
> other machines. But when querying the authoritative nameservers, it
> all appears to check out. When running "doc -d reaktie.com", the
> only warnings I got were for three different unique SOAs for the .com
> zone, which is not surprising (considering how large it is, etc...).

In the meantime the only clue I can offer is that I find that the
nameservers which are supposed to be authoritative for
62.71.216.IN-ADDR.ARPA (i.e. the reverse zone for the bogus record) are
also (blindly) authoritative for REAKTIE.COM, though they apparently do
have the correct data:

$ host -C 62.71.216.in-addr.arpa
62.71.216.in-addr.arpa NS NS.HOST4U.NET
62.71.216.in-addr.arpa SOA record currently not present at NS.HOST4U.NET
62.71.216.in-addr.arpa has lame delegation to NS.HOST4U.NET
62.71.216.in-addr.arpa NS NS2.HOST4U.NET
62.71.216.in-addr.arpa SOA record currently not present at NS2.HOST4U.NET
62.71.216.in-addr.arpa has lame delegation to NS2.HOST4U.NET

14:55 [184] $ host -r -a reaktie3.REAKTIE.COM NS.HOST4U.NET
reaktie3.REAKTIE.COM MX 10 mail.reaktie3.REAKTIE.COM
reaktie3.REAKTIE.COM A 209.150.128.56
reaktie3.REAKTIE.COM NS ns.host4u.net
reaktie3.REAKTIE.COM NS ns2.host4u.net
reaktie3.REAKTIE.COM SOA ns.host4u.net hostmaster.host4u.net (
                        99022212 ;serial (version)
                        86400 ;refresh period (1 day)
                        7200 ;retry interval (2 hours)
                        3600000 ;expire time (5 weeks, 6 days, 16 hours)
                        28800 ;default ttl (8 hours)
                        )

$ host -r -a reaktie3.REAKTIE.COM NS2.HOST4U.NET
reaktie3.REAKTIE.COM MX 10 mail.reaktie3.REAKTIE.COM
reaktie3.REAKTIE.COM A 209.150.128.56
reaktie3.REAKTIE.COM NS ns.host4u.net
reaktie3.REAKTIE.COM NS ns2.host4u.net
reaktie3.REAKTIE.COM SOA ns.host4u.net hostmaster.host4u.net (
                        99022212 ;serial (version)
                        86400 ;refresh period (1 day)
                        7200 ;retry interval (2 hours)
                        3600000 ;expire time (5 weeks, 6 days, 16 hours)
                        28800 ;default ttl (8 hours)
                        )
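For anyone following along, the reverse zone name queried above follows mechanically from the bogus address 216.71.62.112. Here is a small sketch using Python's standard ipaddress module. It assumes the reverse zone is delegated on a /24 boundary, as it appears to be here; the function name is just for illustration.

```python
import ipaddress


def reverse_zone_24(addr):
    """Return the /24 in-addr.arpa zone name that would hold addr's PTR record.

    ASSUMPTION: the reverse zone is delegated at the /24 (three-octet) level.
    """
    # reverse_pointer gives the full PTR owner name,
    # e.g. "112.62.71.216.in-addr.arpa" for 216.71.62.112
    ptr = ipaddress.IPv4Address(addr).reverse_pointer
    # Drop the leftmost label (the host octet) to get the enclosing zone.
    return ptr.split(".", 1)[1]
```

So reverse_zone_24("216.71.62.112") gives "62.71.216.in-addr.arpa", the zone checked with "host -C" above.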

> So far as I can tell, their nameservers were actually configured
> correctly in this instance, and this is a problem where our
> nameserver somehow got bogus information into its database, and
> therefore caused messages to bounce.

I'm betting that your nameservers learned the bogus record from some
other nameserver that has/had a corrupted cache, though why it did so
is harder to determine. I've never yet seen incorrect data that didn't
come from either an incorrectly configured auth server, or from another
corrupted cache, or of course from intentionally spoofed responses
intended to maliciously corrupt a cache.

The only way I can see this being a legitimate mis-configuration
anywhere though would be if some TLD zone has a glue record with that
name and with the incorrect IP address. I don't see it anywhere obvious
though, correct or incorrect.....

> It wasn't so much concern for bogus information returning, but
> for accumulation of bogus data over a long period of time. I wanted
> to make sure that we restarted often enough that we didn't get a
> chance to accumulate too much cruft and propagate said cruft to other
> machines.

In the good old BIND-4 days the problems I saw a lot of with cache
corruption were due to major (top level, e.g. .CA) nameservers running in
recursive mode. Their corruption would spread like wildfire. Having
named track the source of every record it cached was critical to
finding and shooting the bad messenger!

> I am somewhat familiar with this bug -- kre was reporting on the
> bind-workers mailing list some vague problems that he wasn't too
> concerned about, I mentioned that I was getting SERVFAIL from his
> machine for the various ccTLD zones that his nameserver is supposedly
> authoritative for, he then went into high gear and did some serious
> debugging and found a strange edge condition where the machine could
> get overloaded and certain assumptions that were made in terms of how
> BIND should behave and how things should be retried were being
> violated, and between him and Paul they fixed the bug. At that time,
> kre also made his server non-recursive, and this helped greatly
> reduce the load of queries it was having to answer.

Yeah, that's the one -- there really ought to be a rule that registered
nameservers are not ever allowed to run in recursive mode! :-)

> In fact, this situation has only reinforced my conviction that
> the *only* safe response code to use when faced with potential DNS
> failures that you cannot guarantee are not your fault, is a 4xx
> temporary failure code. Of course, you obviously disagree, and you
> are welcome to hold that opinion.

Yup, to me it means the exact opposite! ;-) If you're going to do any
DNS-based validation of SMTP envelopes then you really do not want any
e-mail hiding in a queue for days on end....

Yes, this does imply that if people want your e-mail to bounce in that
case then they need only corrupt the DNS!

-- 
							Greg A. Woods

+1 416 218-0098 VE3TCP <gwoodsacm.org> <robohack!woods> Planix, Inc. <woodsplanix.com>; Secrets of the Weird <woodsweird.com>