After several years of running BIND9 on our DNS caching servers, we
have finally ditched it and switched to D. J. Bernstein’s dnscache. On an average day
our SMTP and spam filtering servers send 1400 queries per second to
each of our DNS servers during peak hours. We made the switch because, as we’ve grown, we have seen more reliability problems, performance problems and general weirdness
with BIND.
Most notably, when the BIND cache would reach about 250 MB, its
performance deteriorated noticeably. It would respond slowly and even
drop queries. I have heard this is caused by BIND’s internal data
structures not efficiently getting rid of old cache records. Instead
BIND tries to cache every record until it expires and when it does
reach some internally calculated limit, BIND starts to discard new
cache records instead of old records. This causes the server’s
performance to take a nose dive, and our pagers to go off. Time to
run "service named stop; sleep 2; service named start" again!
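To make the difference concrete, here is a minimal Python sketch of the behaviour we would have preferred: a fixed-size cache that throws away its oldest entry to make room for a new one, rather than refusing the new one. This is only an illustration of the idea. The BoundedCache class and the example names are made up for this post, and it is not the actual code of BIND or dnscache.

```python
from collections import OrderedDict


class BoundedCache:
    """Fixed-capacity cache that drops its oldest entry to admit a new one."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self._entries = OrderedDict()  # insertion order doubles as age order

    def put(self, name, record):
        # Re-inserting an existing name moves it to the "newest" end.
        if name in self._entries:
            del self._entries[name]
        # When full, evict the oldest entry instead of rejecting the new one.
        while len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)
        self._entries[name] = record

    def get(self, name):
        return self._entries.get(name)


cache = BoundedCache(max_entries=2)
cache.put("a.example", "192.0.2.1")
cache.put("b.example", "192.0.2.2")
cache.put("c.example", "192.0.2.3")  # evicts a.example, the oldest entry
print(cache.get("a.example"))  # None
print(cache.get("c.example"))  # 192.0.2.3
```

With a policy like this, memory use stays flat and fresh answers always get cached, so there is nothing to restart when the cache fills up.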
Also, BIND didn’t efficiently cache records from our rbldnsd servers
behind it. We could never really figure out why so many requests were
reaching rbldnsd and not hitting the BIND cache. Now with dnscache, we
have a good view of exactly what it is doing and have fine-tuned the
SOAs in rbldnsd so that dnscache caches our spam DB lookups exactly how
we want it. No more weirdness going on behind the scenes.
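For anyone wondering why the SOA matters here: a DNSBL lookup for an address that is not listed normally comes back as NXDOMAIN, and RFC 2308 says a cache should hold a negative answer like that for the smaller of the SOA record's own TTL and the SOA MINIMUM field. The Python sketch below just shows that rule. The function name and the example numbers are hypothetical, and how rbldnsd and dnscache each apply the rule should be checked against their own documentation.

```python
def negative_cache_ttl(soa_record_ttl, soa_minimum):
    """Seconds a negative answer may be cached, per the RFC 2308 rule.

    The rule takes the smaller of the TTL on the SOA record itself and the
    MINIMUM field inside the SOA record.
    """
    if soa_record_ttl < 0 or soa_minimum < 0:
        raise ValueError("TTL values must be non-negative")
    return min(soa_record_ttl, soa_minimum)


# Hypothetical values: the SOA record is served with a one-hour TTL, but its
# MINIMUM field is five minutes, so "not listed" answers live for five minutes.
print(negative_cache_ttl(soa_record_ttl=3600, soa_minimum=300))  # 300
```

Lowering either value makes the cache re-ask the blocklist server sooner for addresses that were not listed. Raising both keeps more lookups in the cache.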
Mr. Bernstein has a lot of nasty things to say about BIND. Don’t believe all of his hype, but do trust the fact that DJB’s code is
much simpler, more reliable and possibly more powerful than BIND. BIND
is overkill for almost every use. It tries to be all things for all
systems, whereas DJB keeps things simple and provides a different server for
each purpose. I like simplicity. The install process was a little bit awkward on a Linux system though,
with daemontools and the stupid errno patch.
FYI, our dnscache servers are AMD Athlon 3200s with 1 GB RAM, and they
each handle 1400 queries per second using only 15% CPU. Currently
we have a 100 MB cache size, but we are still tuning that.
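
For reference, dnscache reads its cache size from a CACHESIZE file in its daemontools env directory (conventionally /service/dnscache/env), and the DATALIMIT file there has to be somewhat larger so the process is allowed to allocate the cache. The Python sketch below writes both files. The helper name and the 5 MB headroom are assumptions made for illustration, not recommended values, and the demo writes to a temporary directory instead of a live service directory.

```python
import os
import tempfile


def write_dnscache_limits(env_dir, cache_bytes, headroom_bytes=5_000_000):
    """Write CACHESIZE and DATALIMIT files into a daemontools-style env dir.

    DATALIMIT is set to the cache size plus some headroom (an assumed
    figure here) so the process has room for memory besides the cache.
    """
    data_limit = cache_bytes + headroom_bytes
    for name, value in (("CACHESIZE", cache_bytes), ("DATALIMIT", data_limit)):
        with open(os.path.join(env_dir, name), "w") as f:
            f.write(f"{value}\n")
    return data_limit


# Demo against a throwaway directory; a real setup would point at the
# service's env directory and then restart dnscache to pick up the change.
with tempfile.TemporaryDirectory() as env_dir:
    limit = write_dnscache_limits(env_dir, cache_bytes=100_000_000)
    print(sorted(os.listdir(env_dir)))  # ['CACHESIZE', 'DATALIMIT']
    print(limit)  # 105000000
```

dnscache only reads these values at startup, so a change takes effect after the service is restarted, for example with "svc -t /service/dnscache".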