"Oh, I'm not particular as to size," Alice hastily replied; "only one doesn't like changing so often, you know..."
"Are you content now?" said the Caterpillar.
"Well, I should like to be a little larger, sir, if you wouldn't mind...."
Other factors to keep in mind are the host's connectivity, the software it runs (BIND and otherwise), maintaining the homogeneity of your name servers, and security:
It's doubly important that your primary master name server be well connected. The primary needs good connectivity to all the slaves that update from it, for reliable zone transfers. And, like any name server, it'll benefit from fast, reliable networking.
Just which tasks are "name server-intensive"? Surfing the Web can be name server-intensive. Sending electronic mail, especially to large mailing lists, can be name server-intensive. Programs that make lots of remote procedure calls to different hosts can be name server-intensive. Even running certain graphical user environments can tax your name server. X Windows-based user environments, for example, query the name server to check access lists (among other things).
The astute (and precocious) among you may be asking, "But how do I know when my name servers are overloaded? What do I look for?" An excellent question!
Memory utilization is probably the most important aspect of a name server's operation to monitor. named can get very large on a name server that is authoritative for many zones. If named's size, plus the size of the other processes you run, exceeds the size of your host's real memory, your host may swap furiously ("thrash") and not get anything done. Even if your host has more than enough memory to run all its processes, large name servers are slow to start and slow to spawn new named processes (e.g., to handle zone transfers). Another problem, peculiar to BIND 4: since a BIND 4 name server creates new named processes to handle zone transfers, it's quite possible to have more than one named process running at one time -- one answering queries and one or more servicing zone transfers. If your BIND 4 master name server already consumes 5 or 10 megabytes of memory, count on two or three times that amount being used occasionally.
Another criterion you can use to measure the load on your name server is the load the named process places on the host's CPU. Correctly configured name servers don't use much CPU time, so high CPU usage is often symptomatic of a configuration error. Programs such as top can help you characterize your name server's average CPU utilization.[54] Unfortunately, there are no absolute rules when it comes to acceptable CPU utilization. We offer a rough rule of thumb, though: 5% average CPU utilization is probably acceptable; 10% is a bit high, unless the host is dedicated to providing name service.
[54]top is a very handy program, written by Bill LeFebvre, that gives you a continuous report of which processes are sucking up the most CPU time on your host. The most recent version of top is available via anonymous FTP from ftp://eecs.nwu.edu as file /pub/top/top-3.4.tar.Z.
To get an idea of what normal figures are, here's what top might show for a relatively quiet name server:

    last pid: 14299;  load averages:  0.11,  0.12,  0.12    18:19:08
    68 processes: 64 sleeping, 3 running, 1 stopped
    Cpu states: 11.3% usr, 0.0% nice, 15.3% sys, 73.4% idle, 0.0% intr, 0.0% ker
    Memory: Real: 8208K/13168K act/tot  Virtual: 16432K/30736K act/tot  Free: 4224K

      PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
       89 root       1    0  2968K  2652K sleep    5:01  0.00%  0.00% named

Okay, that's really quiet. Here's what top shows on a busy (though not overloaded) name server:

    load averages:  0.30,  0.46,  0.44     system: relay     16:12:20
    39 processes: 38 sleeping, 1 waiting
    Cpu states: 4.4% user, 0.0% nice, 5.4% system, 90.2% idle, 0.0% unk5, 0.0% unk6, 0.0% unk7, 0.0% unk8
    Memory: 31126K (28606K) real, 33090K (28812K) virtual, 54344K free
    Screen #1/ 3

      PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
    21910 root       1    0  2624K  2616K sleep  146:21  0.00%  1.42% /etc/named

Another statistic to look at is the number of queries the name server receives per minute (or second, if you have a busy name server). Again, there are no absolutes here: a fast Pentium III running NetBSD can probably handle thousands of queries per second without breaking a sweat, while an older Unix host might have problems with more than a few queries a second.
To check the volume of queries your name server is receiving, it's easiest to look at the name server's internal statistics, which you can configure the server to write to syslog at regular intervals.[55] For example, you could configure your name server to dump statistics every hour (actually, that's the default for BIND 8 servers), and compare the number of queries received between hours:

    options {
        statistics-interval 60;
    };

[55]Some older BIND name servers need coercion to dump their statistics: the ABRT signal (IOT on older systems). BIND 4.9 name servers automatically dump stats every hour, but 4.9.4 through 4.9.7 name servers, once again, need to be coerced with ABRT.
BIND 9 name servers don't support the statistics-interval substatement, but you can use rndc to tell a BIND 9 name server to dump statistics on the hour, for example in crontab:

    0 * * * * /usr/local/sbin/rndc stats

You should pay special attention to peak periods. Monday morning is often busy, because many people like to respond to mail they've received over the weekend first thing on Mondays.
You might also want to take a sample starting just after lunch, when people are returning to their desks and getting back to work -- all at about the same time. Of course, if your organization is spread across several time zones, you'll have to use your own good judgment to determine a busy time.
Here's a snippet from the syslog file on a BIND 8.2.3 name server:

    Aug 1 11:00:49 terminator named[103]: NSTATS 965152849 959476930 A=8 NS=1 SOA=356966 PTR=2 TXT=32 IXFR=9 AXFR=204
    Aug 1 11:00:49 terminator named[103]: XSTATS 965152849 959476930 RR=3243 RNXD=0 RFwdR=0 RDupR=0 RFail=20 RFErr=0 RErr=11 RAXFR=204 RLame=0 ROpts=0 SSysQ=3356 SAns=391191 SFwdQ=0 SDupQ=1236 SErr=0 RQ=458031 RIQ=25 RFwdQ=0 RDupQ=0 RTCP=101316 SFwdR=0 SFail=0 SFErr=0 SNaAns=34482 SNXD=0 RUQ=0 RURQ=0 RUXFR=10 RUUpd=34451
    Aug 1 12:00:49 terminator named[103]: NSTATS 965156449 959476930 A=8 NS=1 SOA=357195 PTR=2 TXT=32 IXFR=9 AXFR=204
    Aug 1 12:00:49 terminator named[103]: XSTATS 965156449 959476930 RR=3253 RNXD=0 RFwdR=0 RDupR=0 RFail=20 RFErr=0 RErr=11 RAXFR=204 RLame=0 ROpts=0 SSysQ=3360 SAns=391444 SFwdQ=0 SDupQ=1244 SErr=0 RQ=458332 RIQ=25 RFwdQ=0 RDupQ=0 RTCP=101388 SFwdR=0 SFail=0 SFErr=0 SNaAns=34506 SNXD=0 RUQ=0 RURQ=0 RUXFR=10 RUUpd=34475

The number of queries received is dumped in the RQ field of each XSTATS line. To calculate the number of queries received in the hour, just subtract the first RQ value from the second one: 458332 - 458031 = 301.
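If you'd rather not do the subtraction by hand, the comparison is easy to script. What follows is only a rough sketch in Python, not part of BIND: the function names are our own invention, and it assumes that the first number after XSTATS is the time of the dump in seconds, which is what the two sample lines suggest (they differ by exactly 3600).

```python
import re

# \b keeps "RQ=" from matching inside another counter such as "RURQ=".
RQ_FIELD = re.compile(r"\bRQ=(\d+)")
# Assumption: the first number after XSTATS is the dump time in seconds.
XSTATS_HEADER = re.compile(r"XSTATS (\d+) (\d+)")


def parse_xstats(line):
    """Return (dump_time, queries_received) from one XSTATS syslog line."""
    header = XSTATS_HEADER.search(line)
    rq = RQ_FIELD.search(line)
    if header is None or rq is None:
        raise ValueError("not an XSTATS line with an RQ field")
    return int(header.group(1)), int(rq.group(1))


def queries_between(earlier, later):
    """Return (queries, seconds) elapsed between two XSTATS dumps."""
    t0, rq0 = parse_xstats(earlier)
    t1, rq1 = parse_xstats(later)
    if t1 <= t0 or rq1 < rq0:
        # Counters start over when the name server restarts.
        raise ValueError("dumps out of order, or the name server restarted")
    return rq1 - rq0, t1 - t0


# The two XSTATS lines from the syslog snippet above.
ELEVEN = (
    "Aug 1 11:00:49 terminator named[103]: XSTATS 965152849 959476930 "
    "RR=3243 RNXD=0 RFwdR=0 RDupR=0 RFail=20 RFErr=0 RErr=11 RAXFR=204 "
    "RLame=0 ROpts=0 SSysQ=3356 SAns=391191 SFwdQ=0 SDupQ=1236 SErr=0 "
    "RQ=458031 RIQ=25 RFwdQ=0 RDupQ=0 RTCP=101316 SFwdR=0 SFail=0 SFErr=0 "
    "SNaAns=34482 SNXD=0 RUQ=0 RURQ=0 RUXFR=10 RUUpd=34451"
)
TWELVE = (
    "Aug 1 12:00:49 terminator named[103]: XSTATS 965156449 959476930 "
    "RR=3253 RNXD=0 RFwdR=0 RDupR=0 RFail=20 RFErr=0 RErr=11 RAXFR=204 "
    "RLame=0 ROpts=0 SSysQ=3360 SAns=391444 SFwdQ=0 SDupQ=1244 SErr=0 "
    "RQ=458332 RIQ=25 RFwdQ=0 RDupQ=0 RTCP=101388 SFwdR=0 SFail=0 SFErr=0 "
    "SNaAns=34506 SNXD=0 RUQ=0 RURQ=0 RUXFR=10 RUUpd=34475"
)

queries, seconds = queries_between(ELEVEN, TWELVE)
print(queries, seconds)  # prints "301 3600"
```

Returning the elapsed seconds along with the query count lets you turn the result into a rate even when the two dumps aren't exactly an hour apart.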
Even if your host is fast enough to handle the number of queries it receives, you should make sure that the DNS traffic isn't placing an undue load on your network. On most LANs, DNS traffic is too small a proportion of the network's bandwidth to worry about. Over slow leased lines or dialup connections, though, DNS traffic could consume enough bandwidth to merit concern.
For a rough estimate of the volume of DNS traffic on your LAN, multiply the number of queries received (RQ) plus the number of answers sent (SAns) in an hour by 800 bits (100 bytes, a rough average size for a DNS message), and divide by 3600 (seconds per hour) to find the bandwidth utilized. This should give you a feeling for how much of your network's bandwidth is consumed by DNS traffic.[56]
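As a worked example, here is that estimate applied to the hour covered by the syslog snippet shown earlier: RQ grew by 301 (458332 - 458031) and SAns grew by 253 (391444 - 391191). The short Python sketch below just restates the arithmetic; the function name is ours, and the 800-bit message size is the same rough average used above, not a measured value.

```python
BITS_PER_MESSAGE = 800      # 100 bytes, the rough average assumed in the text
SECONDS_PER_HOUR = 3600


def dns_bandwidth_bps(queries, answers, seconds=SECONDS_PER_HOUR):
    """Estimate DNS bandwidth in bits per second over the sampling period."""
    return (queries + answers) * BITS_PER_MESSAGE / seconds


# Deltas between the 11:00 and 12:00 dumps on terminator.
queries_received = 458332 - 458031   # RQ
answers_sent = 391444 - 391191       # SAns

print(round(dns_bandwidth_bps(queries_received, answers_sent)))  # prints 123
```

About 123 bits per second is nothing to worry about on any LAN. The same arithmetic on a name server handling hundreds of queries per second gives a figure worth comparing against the capacity of a slow link.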
[56]For a nice package that automates the analysis of BIND's statistics, look for Nigel Campbell's bindgraph in the DNS Resources Directory's tools page, http://www.dns.net/dnsrd/tools.html.
To give you an idea of what's normal, the last NSFNET traffic report (in April 1995) showed that DNS traffic constituted just over 5% of the total traffic volume (in bytes) on their backbone. The NSFNET's figures were based upon actual traffic sampling, not calculations like ours using the name server's statistics.[57] If you want to get a more accurate idea of the traffic your name server is receiving, you can always do your own traffic sampling with a LAN protocol analyzer.
[57]We're not sure how representative of the current state of the Internet these numbers are, but it's extremely difficult to wheedle equivalent numbers out of the commercial backbone providers that succeeded the NSFNET.
Once you've found that your name servers are overworked, what then? First, it's a good idea to make sure your name servers aren't being bombarded with queries by a misbehaving program. To do that, you'll need to find out where all the queries are coming from.
If you're running a BIND 4.9 or 8.1.2 name server, you can find out which resolvers and name servers are querying your name server just by dumping the statistics. These name servers keep statistics on a host-by-host basis, which is really useful in tracking down heavy users of your name server. BIND 8.2 or newer name servers don't keep these statistics by default; to induce them to keep host-by-host statistics, use the host-statistics substatement in your options statement, like this:[58]

    options {
        host-statistics yes;
    };

[58]BIND 9 doesn't support the host-statistics substatement -- or keeping per-host statistics, for that matter -- as of 9.1.0.
For example, take these statistics:

    +++ Statistics Dump +++ (829373099) Fri Apr 12 23:24:59 1996
    970779  time since boot (secs)
    471621  time since reset (secs)
    0       Unknown query types
    185108  A queries
    6       NS queries
    69213   PTR queries
    669     MX queries
    2361    ANY queries
    ++ Name Server Statistics ++
    (Legend)
        RQ RR RIQ RNXD RFwdQ RFwdR RDupQ RDupR RFail RFErr RErr RTCP RAXFR
        RLame ROpts SSysQ SAns SFwdQ SFwdR SDupQ SFail SFErr SErr RNotNsQ SNaAns SNXD
    (Global)
        257357 20718 0 8509 19677 19939 1494 21 0 0 0 7 0 1 0 824 236196 19677 19939 7643 33 0 0 256064 49269 155030
    [15.17.232.4]
        8736 0 0 0 717 24 0 0 0 0 0 0 0 0 0 0 8019 0 717 0 0 0 0 8736 2141 5722
    [15.17.232.5]
        115 0 0 0 8 0 21 0 0 0 0 0 0 0 0 0 86 0 1 0 0 0 0 115 0 7
    [15.17.232.8]
        66215 0 0 0 6910 148 633 0 0 0 0 5 0 0 0 0 58671 0 6695 0 15 0 0 66215 33697 6541
    [15.17.232.16]
        31848 0 0 0 3593 209 74 0 0 0 0 0 0 0 0 0 28185 0 3563 0 0 0 0 31848 8695 15359
    [15.17.232.20]
        272 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 272 0 0 0 0 0 0 272 7 0
    [15.17.232.21]
        316 0 0 0 52 14 3 0 0 0 0 0 0 0 0 0 261 0 51 0 0 0 0 316 30 30
    [15.17.232.24]
        853 0 0 0 65 1 3 0 0 0 0 2 0 0 0 0 783 0 64 0 0 0 0 853 125 337
    [15.17.232.33]
        624 0 0 0 47 1 0 0 0 0 0 0 0 0 0 0 577 0 47 0 0 0 0 624 2 217
    [15.17.232.94]
        127640 0 0 0 1751 14 449 0 0 0 0 0 0 0 0 0 125440 0 1602 0 0 0 0 127640 106 124661
    [15.17.232.95]
        846 0 0 0 38 1 0 0 0 0 0 0 0 0 0 0 809 0 37 0 0 0 0 846 79 81
    -- Name Server Statistics --
    --- Statistics Dump --- (829373099) Fri Apr 12 23:24:59 1996

After the Global entry, each host is broken out by IP address in brackets. Looking at the legend, you can see that the first field in each record is RQ, or queries received. That gives us a good reason to look at hosts 15.17.232.8, 15.17.232.16, and 15.17.232.94, which appear to be responsible for about 88% of our queries.
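On a name server with hundreds of clients, picking out the heavy hitters by eye gets tedious. Here's a small Python sketch of one way to rank hosts by their RQ counter. It isn't a BIND tool, the names in it are made up for illustration, and it assumes only what the dump above shows: each record starts with a label ((Global) or a bracketed IP address) and its first number is RQ.

```python
import re

# A bracketed IPv4 address followed by the record's first counter (RQ).
HOST_RECORD = re.compile(r"\[(\d+(?:\.\d+){3})\]\s+(\d+)")
GLOBAL_RECORD = re.compile(r"\(Global\)\s+(\d+)")


def heavy_hitters(dump, top_n=3):
    """Return the top_n hosts as (address, rq, share of all queries)."""
    match = GLOBAL_RECORD.search(dump)
    if match is None:
        raise ValueError("no (Global) record in statistics dump")
    total = int(match.group(1))
    hosts = [(addr, int(rq)) for addr, rq in HOST_RECORD.findall(dump)]
    hosts.sort(key=lambda host: host[1], reverse=True)
    return [(addr, rq, rq / total) for addr, rq in hosts[:top_n]]


# The Global and per-host records from the dump above.
SAMPLE_DUMP = """
(Global) 257357 20718 0 8509 19677 19939 1494 21 0 0 0 7 0 1 0 824 236196 19677 19939 7643 33 0 0 256064 49269 155030
[15.17.232.4] 8736 0 0 0 717 24 0 0 0 0 0 0 0 0 0 0 8019 0 717 0 0 0 0 8736 2141 5722
[15.17.232.5] 115 0 0 0 8 0 21 0 0 0 0 0 0 0 0 0 86 0 1 0 0 0 0 115 0 7
[15.17.232.8] 66215 0 0 0 6910 148 633 0 0 0 0 5 0 0 0 0 58671 0 6695 0 15 0 0 66215 33697 6541
[15.17.232.16] 31848 0 0 0 3593 209 74 0 0 0 0 0 0 0 0 0 28185 0 3563 0 0 0 0 31848 8695 15359
[15.17.232.20] 272 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 272 0 0 0 0 0 0 272 7 0
[15.17.232.21] 316 0 0 0 52 14 3 0 0 0 0 0 0 0 0 0 261 0 51 0 0 0 0 316 30 30
[15.17.232.24] 853 0 0 0 65 1 3 0 0 0 0 2 0 0 0 0 783 0 64 0 0 0 0 853 125 337
[15.17.232.33] 624 0 0 0 47 1 0 0 0 0 0 0 0 0 0 0 577 0 47 0 0 0 0 624 2 217
[15.17.232.94] 127640 0 0 0 1751 14 449 0 0 0 0 0 0 0 0 0 125440 0 1602 0 0 0 0 127640 106 124661
[15.17.232.95] 846 0 0 0 38 1 0 0 0 0 0 0 0 0 0 0 809 0 37 0 0 0 0 846 79 81
"""

top = heavy_hitters(SAMPLE_DUMP)
for addr, rq, share in top:
    print(f"{addr:15} {rq:7} {share:6.1%}")
print(f"combined share: {sum(share for _, _, share in top):.0%}")
```

Because the patterns only care about a label followed by whitespace and a number, the same function should work whether the counters follow the address on the same line or on the next one.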
If you're running an older name server, the only way to find out which resolvers and name servers are sending all those darned queries is to turn on name server debugging. (We'll cover this in depth in Chapter 13, "Reading BIND Debugging Output".) All you're really interested in is the source IP addresses of the queries your name server is receiving. When poring over the debugging output, look for hosts sending repeated queries, especially for the same or similar information. That may indicate a misconfigured or buggy program running on the host, or a foreign name server pelting your name server with queries.
If all the queries appear legitimate, add a new name server. Don't put the name server just anywhere, though; use the information from the debugging output to help you decide where best to run one. In cases where DNS traffic is gobbling up your Ethernet, it won't help to choose a host at random and create a name server there. You need to consider which hosts are sending all the queries, then figure out how to best provide them name service. Here are some hints to help you decide: