no more recursive clients: quota reached



Hi,

I'm not really sure what to do about this. I'm running Bind 9 on FreeBSD. Suddenly this morning I began noticing the following in /var/log/messages:

Aug 26 12:48:56 netlink named[295]: client 207.191.185.6#60614: no more recursive clients: quota reached
Aug 26 12:48:56 netlink named[295]: client 207.191.185.6#51149: no more recursive clients: quota reached
Aug 26 12:48:58 netlink named[295]: client 207.191.185.6#56825: no more recursive clients: quota reached

The client in question (206.191.185.6) is our mail server. I read that one should not allow recursive queries from outside of your network, but the mail server is within our network. Also on the mail server, the mail queue currently has about 40 entries. It usually has from 2 - 5 or is empty.
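To see which clients are actually hitting the quota, and how often, the log can be tallied per client address. The following is only a rough sketch: the function name `blocked_clients` is made up for illustration, and it assumes each log entry sits on a single line in the form shown above.

```python
import re
from collections import Counter

# Matches "client <address>#<port>: no more recursive clients: quota reached".
# The address is captured loosely so IPv6 clients are counted too.
_QUOTA = re.compile(
    r"client (\S+?)#\d+: no more recursive clients: quota reached"
)


def blocked_clients(log_text: str) -> Counter:
    """Count 'quota reached' log entries per client address."""
    return Counter(m.group(1) for m in _QUOTA.finditer(log_text))


if __name__ == "__main__":
    # /var/log/messages is the path mentioned in the post; adjust as needed.
    with open("/var/log/messages", errors="replace") as fh:
        for address, hits in blocked_clients(fh.read()).most_common(10):
            print(f"{hits:6d}  {address}")
```

If a single address accounts for nearly all of the entries, that host is the first place to look.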

Our DNS server is not heavily used, so I assumed it would be OK to increase the number of recursive queries allowed. In /etc/named.conf I inserted the following:

recursive-clients       5000;

then restarted bind. That didn't seem to help much: I am still getting the same error message in /var/log/messages on an intermittent basis. Also, if I do an rndc status I see the following:

number of zones: 14
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 564/1000
tcp clients: 0/100
server is up and running

The line recursive clients: 564/1000 bothers me. Did my change to /etc/named.conf not get picked up? It appears that the max recursive clients is still at bind's default of 1000.
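One way to check mechanically whether a new limit took effect is to parse the "recursive clients" line out of the rndc status output and compare the last number with what was configured. This is a minimal sketch, not an official tool: the names `parse_recursive_clients` and `quota_took_effect` are hypothetical, and it assumes the line looks like either current/max or, as some BIND versions print it, current/soft/max.

```python
import re
from typing import NamedTuple, Optional


class RecursiveClients(NamedTuple):
    current: int
    soft_limit: Optional[int]  # only present in the three-field form
    limit: int


# Accepts "recursive clients: 564/1000" and "recursive clients: 71/3900/4000".
_LINE = re.compile(
    r"^recursive clients:\s*(\d+)(?:/(\d+))?/(\d+)\s*$", re.MULTILINE
)


def parse_recursive_clients(status_text: str) -> Optional[RecursiveClients]:
    """Extract the recursive-clients counters from `rndc status` output."""
    m = _LINE.search(status_text)
    if m is None:
        return None
    current, soft, limit = m.groups()
    return RecursiveClients(
        int(current), int(soft) if soft is not None else None, int(limit)
    )


def quota_took_effect(status_text: str, expected_limit: int) -> bool:
    """True if the running server reports the limit we configured."""
    parsed = parse_recursive_clients(status_text)
    return parsed is not None and parsed.limit == expected_limit
```

With the output above, `quota_took_effect(text, 5000)` would be false, which points at the running named not having read the edited file.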

Any ideas on how I should go about solving/fixing this?

Thanks,

Lisa Casey


Lisa Casey Wed, 26 Aug 2009 10:38:52 -0700

At Wed, 26 Aug 2009 13:37:09 -0400,

True.  It's also true that
recursive-clients       5000;
will increase the quota in question to 5000.  So the only sensible
explanation I can think of is that you made an error in updating the
configuration file.

BTW, it would always be helpful to identify the exact version of
BIND9 when you ask something like this.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.


JINMEI Tatuya / 神明達哉 Wed, 26 Aug 2009 10:47:53 -0700

I'd suggest you check your connectivity and routing.

    We see this behaviour occasionally, but only ever as a
    consequence of a back-hoe incident or similar catastrophe which
    isolates one of our campuses where there is a local resolving
    server.

    Best regards,

    Niall O'Reilly

    University College Dublin IT Services


Niall O'Reilly Thu, 27 Aug 2009 09:18:14 -0700

Although it may not be a problem on your end of the network.  You
could be seeing a spike in DNS queries because somebody really, really
wants to talk to a remote location that is having problems.

DNStop may be able to help you pinpoint what DNS queries are giving
you problems: http://dns.measurement-factory.com/tools/dnstop/

Run it on the DNS server to see if there are any queries that you are
seeing get repeated continuously.

--
Dave


Dave Sparro Fri, 28 Aug 2009 06:00:10 -0700

What version of BIND, what version of FreeBSD?

On FreeBSD the default location for named.conf is /etc/namedb (for
historical reasons). Have you changed the location of named.conf with
an option in /etc/rc.conf?
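A quick way to find out which copy of named.conf actually carries the new setting is to scan the likely locations. The sketch below is only illustrative: the two candidate paths are the ones mentioned in this thread, the helper names are made up, and the pattern is deliberately naive (it does not follow include statements, skip comments, or check that the statement sits inside the options block).

```python
import re
from pathlib import Path
from typing import Dict, Iterable, Optional

# Locations mentioned in this thread; a given install may use another path.
CANDIDATES = ("/etc/named.conf", "/etc/namedb/named.conf")

_STMT = re.compile(r"^\s*recursive-clients\s+(\d+)\s*;", re.MULTILINE)


def quota_in_config(config_text: str) -> Optional[int]:
    """Return the first recursive-clients value found in the text, if any."""
    m = _STMT.search(config_text)
    return int(m.group(1)) if m else None


def scan_configs(paths: Iterable[str] = CANDIDATES) -> Dict[str, Optional[int]]:
    """Map each existing candidate file to its recursive-clients value."""
    found: Dict[str, Optional[int]] = {}
    for name in paths:
        path = Path(name)
        if path.is_file():
            found[str(path)] = quota_in_config(path.read_text(errors="replace"))
    return found


if __name__ == "__main__":
    for name, value in scan_configs().items():
        print(f"{name}: recursive-clients = {value}")
```

If the setting shows up in one file but rndc status still reports the old limit, named is most likely reading a different file.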

Doug


Doug Barton Fri, 28 Aug 2009 09:54:35 -0700

I concur.  1000 is a lot of simultaneous queries.  Perhaps your site
is busy enough to generate that many "legitimate" queries, but
hitting that 1000 mark can also be a symptom of something slowing or
black-holing queries.  When I've seen "quota reached" logging,
typically further investigation reveals that there were network
connectivity issues at the time.

Your example of 564/1000, if that's typical, suggests that perhaps you
truly do have enough normal queries to top out occasionally.  On the
other hand, if you usually see fewer than 100, but it occasionally
shoots to 1000, that could be a specific app doing something (e.g.,
monthly web access log analysis), but could also be network issues.
(In some cases, it might be useful to set up a separate nameserver
dedicated to the demanding app.)
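The distinction drawn here (a count that is always substantial versus one that is usually low and occasionally shoots up) can be checked against a series of readings taken over time. The following is a rough sketch under stated assumptions: the function name and the 50% and 90% thresholds are arbitrary choices for illustration, not anything BIND defines.

```python
from statistics import median
from typing import Sequence


def classify_usage(
    samples: Sequence[int],
    limit: int,
    busy_fraction: float = 0.5,   # assumed: "typically busy" above half the quota
    near_fraction: float = 0.9,   # assumed: "near the quota" above 90% of it
) -> str:
    """Classify periodic readings of the recursive-clients count."""
    if not samples:
        raise ValueError("need at least one sample")
    typical = median(samples)
    peak = max(samples)
    if typical >= busy_fraction * limit:
        # Routinely high: raising the quota may be the right fix.
        return "sustained load"
    if peak >= near_fraction * limit:
        # Usually low but occasionally near the limit: look for a
        # specific application or a network problem at those times.
        return "spike"
    return "headroom"
```

For instance, readings that hover around 564 against a limit of 1000 come out as "sustained load", while readings mostly under 100 with one at 1000 come out as "spike".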

The age of the queries can also be revealing.

John


John Wobus Fri, 28 Aug 2009 10:11:14 -0700

Dear list users,

I'd like to understand a point about quotas on recursive clients;
reading books, manuals and this list's archives hasn't made it
entirely clear to me.

I have the classical error logs :

17-Mar-2010 12:14:44.026 client: warning: client 129.88.30.5#57960: no more recursive clients: quota reached

I have a lot of these... (two thousand unique clients blocked over the
last two weeks on my main resolver)

Is this quota global for all clients? I.e. one rogue client sending
massive amounts of recursive requests would blow the quota for everyone.
Or is it per client? It seems unlikely to me but I'm not clear on that
point.

Is increasing the quota limit the only solution?

It seems odd to me to hit the default bind limit on my servers when they
are not open recursive servers and only clients on my networks (a few
thousand clients for three recursive resolvers) can interrogate them.

The problem is particularly crucial because one of the clients is a
router behind which many of my clients are NATed. Each time the quota
is reached on the servers they use, all the clients behind the router's
address are blocked and get network timeouts.

I'm going to increase the quota, but if you can tell me if this is the
right thing to do or if I should be looking for something else that
would be great.

Best regards,

Oliver Henriot


Oliver Henriot Wed, 24 Mar 2010 07:45:07 -0700

See the BIND ARM for the option recursive-clients

As in:

options {
        recursive-clients       4000;
};

I don't recall what the default is (maybe 1000), but our environment required an increase to 4000.

You may also want to look at these options:  tcp-clients X; clients-per-query N; max-clients-per-query P;

The defaults may vary by BIND version. Furthermore, settings may vary with the environment the DNS server is in.

Assuming the BIND version supports the rndc utility, one can see a snapshot of the current settings and activity.
For example:
# rndc status
version: 9.6.0-P1 (version.bind/txt/ch disabled)
CPUs found: 2
worker threads: 2
number of zones: 14
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 71/3900/4000
tcp clients: 0/200
server is up and running

HTH -- Chris


Fr34k Wed, 24 Mar 2010 08:13:28 -0700

It is the length of the queue of all outstanding recursive queries.
This depends not just on the RATE of queries coming in, but also the
time it takes to resolve them. (If the queue fills up, BIND gives up
on the ones that have been outstanding longest.)
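The relationship between rate and resolution time can be put into rough numbers with Little's law (average number outstanding = arrival rate x average time each one takes). This is a back-of-the-envelope sketch; the function names and the figures in the note below are illustrative assumptions, not measurements from anyone's server.

```python
def expected_outstanding(queries_per_second: float, seconds_to_resolve: float) -> float:
    """Little's law: average number of recursive queries in flight."""
    return queries_per_second * seconds_to_resolve


def sustainable_rate(limit: int, seconds_to_resolve: float) -> float:
    """Query rate at which the in-flight count would reach the quota."""
    return limit / seconds_to_resolve
```

As an example, 200 recursive queries per second that each take half a second keep about 100 queries in flight. If the same 200 per second start waiting 5 seconds each because answers are not coming back, about 1000 are in flight, which is the default quota. The query rate did not change; only the time to resolve did.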

Monitor the count with "rndc stats" to find out whether the outstanding
query queue is often close to the limit, or is spiking. In any case,
when the queue is large, take a look at it by using "rndc recursing"
(dumps the queue to "named.recursing" in BIND's current directory). You
may find that you have a lot of queries for some domain that is failing
to resolve in a timely fashion (we've had problems like that with people
trying to use RBLs from which we are blocked, for example).
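To spot a domain that dominates the dump, the query names can be tallied. The sketch below rests on an assumption about the file format: it expects each outstanding query to appear as a quoted 'name/type/class' token, which should be checked against a real named.recursing file before relying on it. The function name and the crude "last two labels" grouping are also just illustrative choices.

```python
import re
from collections import Counter
from typing import List, Tuple

# Assumed token format: 'www.example.com/A/IN'
_QUERY = re.compile(r"'([^'/\s]+)/[A-Za-z0-9-]+/[A-Za-z0-9-]+'")


def busiest_domains(
    dump_text: str, labels: int = 2, top: int = 10
) -> List[Tuple[str, int]]:
    """Count outstanding queries grouped by the last `labels` name labels."""
    counts: Counter = Counter()
    for m in _QUERY.finditer(dump_text):
        name = m.group(1).rstrip(".").lower()
        parts = name.split(".")
        counts[".".join(parts[-labels:])] += 1
    return counts.most_common(top)


if __name__ == "__main__":
    # Path is relative to BIND's working directory, as described above.
    with open("named.recursing", errors="replace") as fh:
        for domain, pending in busiest_domains(fh.read()):
            print(f"{pending:6d}  {domain}")
```

Grouping by the last two labels is crude (it lumps together everything under a suffix such as co.uk), so the `labels` argument may need adjusting.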

You should also bear in mind the possibility of network problems, as
others have suggested. And firewall software might be mangling certain
outgoing queries, or the responses to them, making them appear to time
out.

--
Chris Thompson


Chris Thompson Wed, 24 Mar 2010 10:10:42 -0700

Yes, and it is the Baofeng attack
< https://www.dns-oarc.net/files/workshop-200911/Ziqian_Liu.pd[..] >


Stephane Bortzmeyer Fri, 26 Mar 2010 01:17:08 -0700

Typically you can increase the default without harm, e.g., double it
or multiply it by 10 if you have a recent-vintage server with typical
memory and speed, but something might be causing the behavior that is
impervious to such a change or that needs some other kind of attention.
Such a problem might solely stem from sheer load, but quite often stems
from queries that are not receiving answers and are just sitting there
until they time out.

One of your clients might be making up names and trying them:
many would receive negative responses but a percentage would receive
no response and sit.  Or it could be that some specific locally-popular
domain's nameservers are down or unreachable.  Or it could be
intermittent network problems. Or some kind of long-term
routing/connectivity issue, e.g. the consequences of firewalling.

If there are short episodes with tons of these log entries, that hints
at short problems with your Internet connection, or a specific app that
is causing the issue when it runs.  If your Internet connectivity
goes away in such a manner that packets "disappear", then the number
of outstanding recursive queries typically rises steadily until the
quota is reached.

If you look at the number of clients at random times and it is always
substantial and/or close to the quota, it may be that increasing the
quota is the right solution.

rndc lets you view the outstanding queries and see how long they've
been waiting, which provides a lot of insight into what is happening.

John Wobus
Cornell IT


John Wobus Fri, 26 Mar 2010 07:04:54 -0700
