MultiNet Tech Tip: Tracking "Server Failed" Error from DNS

 

Introduction

When a DNS lookup returns "server failed" in response to a query, one of the DNS servers that was queried sent back a response packet with the response code SERVFAIL.
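
For reference, the response code occupies the low four bits of the flags word in the 12-byte DNS message header (RFC 1035), and SERVFAIL is code 2. The Python sketch below is not part of MultiNet; the function name rcode_of and the sample header are made up for illustration.

```python
import struct

# RCODE values defined in RFC 1035, section 4.1.1.
RCODE_NAMES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
               3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}

def rcode_of(packet: bytes) -> str:
    """Return the response-code name from a raw DNS message."""
    if len(packet) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    # Header layout: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (16 bits each).
    _id, flags, _qd, _an, _ns, _ar = struct.unpack("!6H", packet[:12])
    code = flags & 0x000F          # RCODE is the low 4 bits of the flags word
    return RCODE_NAMES.get(code, f"RCODE{code}")

# Hypothetical header: QR=1 (response), RD=1, RA=1, RCODE=2.
sample = struct.pack("!6H", 0x1234, 0x8182, 1, 0, 0, 0)
print(rcode_of(sample))
```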

 

Possible Causes

Some possible causes of this problem are:

  • Connectivity was lost due to network failure.
  • The boot file contains an incorrect IP address.
  • The query may have been sent to a secondary name server for an expired zone (the expire time set in the Start of Authority [SOA] record elapsed before the secondary could transfer new zone data from the primary).
  • The name server may have encountered an error when trying to perform the lookup. Typically, too much data was cached for a given zone (frequently the root zone).
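
The expired-zone case comes down to simple arithmetic on the SOA expire field. The Python sketch below is illustrative only: the function name zone_is_expired is hypothetical, and the exact comparison a given name server uses is an implementation detail. The value 604800 is the expire value shown in the root SOA record in the example later in this tech tip.

```python
def zone_is_expired(last_good_refresh: float, now: float, expire: int) -> bool:
    """True if a secondary should treat its copy of the zone as expired.

    last_good_refresh and now are timestamps in seconds; expire is the
    SOA expire field, also in seconds.
    """
    return (now - last_good_refresh) > expire

EXPIRE = 604800                                # 7 days, in seconds
print(zone_is_expired(0, 6 * 86400, EXPIRE))   # last refreshed 6 days ago
print(zone_is_expired(0, 8 * 86400, EXPIRE))   # last refreshed 8 days ago
```

The first call reports that the zone is still valid; the second reports that it has expired, which is the point at which a secondary stops answering for the zone.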

 

Locating the Source of the Error

First check the zone's IP entry in the boot file, and make sure it really is the correct IP address for the master server.

Then try to ping the server to check connectivity. If the ping is successful, you need to look further.

Use nslookup and turn on norecurse to track the path of the name servers down to the ultimate source of the SERVFAIL code.
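
At the protocol level, norecurse corresponds to clearing the RD (recursion desired) bit in the query header, so the server answers only from what it already knows or returns a referral. The Python sketch below builds such a query by hand to show where that bit lives. It sends nothing over the network, and the function name build_query and the query ID are assumptions made for illustration.

```python
import struct

def build_query(name: str, qtype: int = 1, recurse: bool = True,
                query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (class IN) for name; qtype 1 is an A record."""
    flags = 0x0100 if recurse else 0x0000      # 0x0100 is the RD bit
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!6H", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".") if label
    ) + b"\x00"
    question = qname + struct.pack("!2H", qtype, 1)   # QTYPE, QCLASS 1 = IN
    return header + question

pkt = build_query("servfail.calvin.yoyodyne.com.", recurse=False)
flags = struct.unpack("!H", pkt[2:4])[0]
print(f"RD bit set: {bool(flags & 0x0100)}")
```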

The source of the "server failed" error can be any of the following: a name server that the resolver is configured to query (typically, the error originates in the local name server; see the logical MULTINET_NAMESERVERS), any configured forwarder, any of the root name servers, or any name server in the path of delegation from the root down to the official name servers for the zone in question.

Use nslookup to walk through all possible name servers until you locate the source of the problem.
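
The walk can be thought of as a breadth-first search over referrals. The Python sketch below illustrates the idea with toy data modelled on the example later in this tech tip. It does not query any real server, and the names find_servfail_source, ask, and TOY are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A reply is (rcode, referral); referral lists the servers the reply points to.
Reply = Tuple[str, List[str]]

def find_servfail_source(start: List[str],
                         ask: Callable[[str], Optional[Reply]]) -> List[str]:
    """Visit servers breadth-first, following referrals, and return those
    that answer SERVFAIL. ask(server) returns None when a server is silent."""
    pending, seen, culprits = list(start), set(), []
    while pending:
        server = pending.pop(0)
        if server in seen:
            continue
        seen.add(server)
        reply = ask(server)
        if reply is None:              # no response; nothing to follow
            continue
        rcode, referral = reply
        if rcode == "SERVFAIL":
            culprits.append(server)
        pending.extend(referral)
    return culprits

# Toy data only. Servers missing from the table are treated as silent.
TOY: Dict[str, Optional[Reply]] = {
    "161.44.128.71": ("NOERROR", ["treefrog.com", "NS1.WESTNET.NET", "rip.psg.com"]),
    "treefrog.com": ("NOERROR", ["serv2.calvin.yoyodyne.com", "hobbes.aces.net"]),
    "serv2.calvin.yoyodyne.com": ("SERVFAIL", []),
    "hobbes.aces.net": None,
}
print(find_servfail_source(["161.44.128.71"], TOY.get))
```

With nslookup you perform the same walk by hand, using the server command to move from one name server to the next.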

 

Identifying the Problem

Once you find the name server that is generating the SERVFAIL error, what can you do?

  • If the problem is due to an expired zone, make sure:
    • that the primary name server is properly providing authoritative name service for the zone
    • that the secondary is correctly configured to transfer the zone
  • If the problem is due to a name server having too much data cached for a given zone, find and correct the cause. One possible cause is that the root name servers have all changed their names (as occurred in September 1995). If this is the case, update the root cache files on all name servers in your network and restart the name servers.

 

Example

This section contains an example of how to track down the source of a "Server failed" error message on servfail.calvin.yoyodyne.com.

 $ multinet nslookup servfail.calvin.yoyodyne.com.
 Server:  HQ.TGV.COM
 Address:  161.44.128.70

 *** HQ.TGV.COM can't find SERVFAIL.calvin.yoyodyne.com.: Server failed
 $

The fact that HQ.TGV.COM reported SERVFAIL does not necessarily mean that HQ.TGV.COM is the source of the "Server failed" message.

 

Tracking the Error Down

  1. Get the list of name servers that your resolver might be querying. You may need to check each of them.
      $ sho log multinet_nameservers ! get resolver's list of nameservers
      "MULTINET_NAMESERVERS" = "161.44.128.70" (LNM$SYSTEM_TABLE)
    
  2. Check the BIND boot file (DOMAIN-NAME-SERVICE.CONFIGURATION) on your name server for any forwarders, and check each forwarder as well. In this example, name server 161.44.128.70 does not have any forwarders.
  3. Start nslookup. Turn on norecurse. Walk down through the DNS. (You may have to double back.)
      $ multinet nslookup
      Default Server:  catbert.ABC.com
      Address:  161.44.128.71
    
      > set norecurse
      > servfail.calvin.yoyodyne.com.
      Server:  catbert.ABC.com
      Address:  161.44.128.71
    
      Name:    servfail.calvin.yoyodyne.com
      Served by:
      - treefrog.com
                128.196.128.234, 128.196.128.233
                yoyodyne.com
      - NS1.WESTNET.NET
                128.138.213.13
                yoyodyne.com
      - rip.psg.com
                147.28.0.39
                yoyodyne.com
    

    In the example, the SERVFAIL is not coming straight from the local name server, 161.44.128.71. With norecurse on, the local name server returns a referral to the yoyodyne.com name servers and the error does not occur.

  4. The next step is to try the authoritative name servers for yoyodyne.com.
      > server treefrog.com.
      Default Server:  treefrog.com
      Addresses:  128.196.128.234, 128.196.128.233
    
      > servfail.calvin.yoyodyne.com.
      Server:  treefrog.com
      Addresses:  128.196.128.234, 128.196.128.233
    
      Name:    servfail.calvin.yoyodyne.com
      Served by:
      - serv2.calvin.yoyodyne.com
                192.192.192.2
                servfail.calvin.yoyodyne.com
      - hobbes.aces.net
                192.192.192.1
                servfail.calvin.yoyodyne.com
    

    The example shows how to follow the path of delegation. The problem may have come from any other name server in the delegation path between the root and servfail.calvin.yoyodyne.com.

  5. Query some other servers.
     > server serv2.calvin.yoyodyne.com.
     Default Server:  serv2.calvin.yoyodyne.com
     Address:  192.192.192.2
    
     > servfail.calvin.yoyodyne.com.
     Server:  serv2.calvin.yoyodyne.com
     Address:  192.192.192.2
    
     *** serv2.calvin.yoyodyne.com can't find servfail.calvin.yoyodyne.com.: Server failed
    

    This is it! serv2.calvin.yoyodyne.com itself is returning the SERVFAIL code.

    (If the "Server failed" error had not occurred here, you would have had to keep trying by querying other servers.)

  6. To be certain, check another name server for servfail.calvin.yoyodyne.com.
      > server hobbes.aces.net.
      *** Can't find address for server hobbes.aces.net.: Non-authoritative answer
    

    This answer occurs because norecurse is on.

  7. Turn off norecurse temporarily so that the server name can be resolved, or specify the server by IP address. In this example, use the IP address of hobbes.aces.net instead.
      > server 192.192.192.1
      Default Server:  hobbes.ACES.NET
      Address:  192.192.192.1
    
      > servfail.calvin.yoyodyne.com.
      Server:  hobbes.ACES.NET
      Address:  192.192.192.1
    
      hobbes.ACES.NET can't find servfail.calvin.yoyodyne.com.: No response from server
    

    hobbes.aces.net isn't responding, and serv2.calvin.yoyodyne.com is returning SERVFAIL. Maybe serv2.calvin.yoyodyne.com is a secondary for the zone and has expired it, or maybe serv2.calvin.yoyodyne.com has a bad root name server cache.

  8. Next, test serv2.calvin.yoyodyne.com to see whether it can resolve other names, starting with the root zone.
      > server serv2.calvin.yoyodyne.com.
      Default Server:  serv2.calvin.yoyodyne.com
      Address:  192.192.192.2
    
      > set TYPE=any
      > .
      Server:  serv2.calvin.yoyodyne.com
      Address:  192.192.192.2
    
      Non-authoritative answer:
      (root)  nameserver = F.ROOT-SERVERS.NET
      (root)  nameserver = G.ROOT-SERVERS.NET
      (root)  nameserver = A.ROOT-SERVERS.NET
      (root)  nameserver = H.ROOT-SERVERS.NET
      (root)  nameserver = B.ROOT-SERVERS.NET
      (root)  nameserver = C.ROOT-SERVERS.NET
      (root)  nameserver = D.ROOT-SERVERS.NET
      (root)  nameserver = E.ROOT-SERVERS.NET
      (root)  nameserver = I.ROOT-SERVERS.NET
      (root)
              origin = A.ROOT-SERVERS.NET
              mail addr = HOSTMASTER.INTERNIC.NET
              serial = 1995092000
              refresh = 10800 (3 hours)
              retry   = 900 (15 mins)
              expire  = 604800 (7 days)
              minimum ttl = 86400 (1 days)
    
      Authoritative answers can be found from:
      (root)  nameserver = F.ROOT-SERVERS.NET
      (root)  nameserver = G.ROOT-SERVERS.NET
      (root)  nameserver = A.ROOT-SERVERS.NET
      (root)  nameserver = H.ROOT-SERVERS.NET
      (root)  nameserver = B.ROOT-SERVERS.NET
      (root)  nameserver = C.ROOT-SERVERS.NET
      (root)  nameserver = D.ROOT-SERVERS.NET
      (root)  nameserver = E.ROOT-SERVERS.NET
      (root)  nameserver = I.ROOT-SERVERS.NET
      F.ROOT-SERVERS.NET      internet address = 39.13.229.241
      G.ROOT-SERVERS.NET      internet address = 192.112.36.4
      A.ROOT-SERVERS.NET      internet address = 198.41.0.4
      H.ROOT-SERVERS.NET      internet address = 128.63.2.53
      B.ROOT-SERVERS.NET      internet address = 128.9.0.107
      C.ROOT-SERVERS.NET      internet address = 192.33.4.12
      D.ROOT-SERVERS.NET      internet address = 128.8.10.90
      E.ROOT-SERVERS.NET      internet address = 192.203.230.10
      I.ROOT-SERVERS.NET      internet address = 192.36.148.17
    

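    This looks fine: the server returns a complete set of root name servers. If the output had not looked right, the next step would be to turn on debug or d2 in nslookup (set debug or set d2) to see the response packets in detail.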

  9. Try to look up some random name.
      > rs.internic.net.
      Server:  serv2.calvin.yoyodyne.com
      Address:  192.192.192.2
    
      Authoritative answers can be found from:
      INTERNIC.NET    nameserver = RS0.INTERNIC.NET
      INTERNIC.NET    nameserver = ds0.INTERNIC.NET
      INTERNIC.NET    nameserver = noc.cerf.NET
      RS0.INTERNIC.NET        internet address = 198.41.0.5
      ds0.INTERNIC.NET        internet address = 198.49.45.10
      noc.cerf.NET    internet address = 192.153.156.22
    

    This name server looks good, so now you can assume there's something wrong with the servfail.calvin.yoyodyne.com zone.

(In this example, you would need to look on serv2.calvin.yoyodyne.com to find out more.)
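
The reasoning in steps 5 through 9 can be summarized as a small decision function. The Python sketch below is only a restatement of that reasoning, not a diagnostic tool; the function name classify and its return strings are made up for illustration.

```python
def classify(target_rcode: str, root_query_ok: bool, other_name_ok: bool) -> str:
    """Summarize what the checks on a single name server suggest.

    target_rcode  - response code the server gave for the failing name
    root_query_ok - True if a query for the root zone looked normal
    other_name_ok - True if the server handled an unrelated name normally
    """
    if target_rcode != "SERVFAIL":
        return "not the source; keep walking the delegation path"
    if root_query_ok and other_name_ok:
        return "zone-specific problem (for example, an expired secondary zone)"
    return "server-wide problem (for example, a bad root name server cache)"

# The findings for serv2.calvin.yoyodyne.com in the example above.
print(classify("SERVFAIL", True, True))
```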