The Penguin's Practical Network Troubleshooting Guide - page 2
Troubleshooting a Non-responsive Server
Suppose you have a remote Web server that is not responding. I know this is horridly obvious, but sometimes people forget that the first step in network troubleshooting is always to confirm connectivity, and for this we have our little friend ping. There is more to ping than you may realize, so let's take a closer look. First, always make sure you are connected to the network. I've been bitten by this more than once. Next, ping localhost:
$ ping -c4 -a localhost
-c4 sends four ICMP ECHO_REQUESTs, and -a makes it ping audibly. A reply from localhost tells you the local TCP/IP stack is working. Then ping the IP address of the box you're trying to connect to, and then ping its hostname. With three simple commands you have confirmed that your NIC is up, and tested both connectivity and DNS.
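The three pings can be wrapped in a small script so you run them the same way every time. This is only a minimal sketch: check_connectivity is a made-up helper name, not a standard tool, and it assumes a ping that accepts -c as shown above.

```shell
#!/bin/sh
# check_connectivity IP HOSTNAME  (hypothetical helper, not a standard tool)
# Runs the three pings in order and stops at the first one that fails.
check_connectivity() {
    ip=$1
    host=$2

    # Step 1: loopback. A failure here points at the local network stack.
    if ! ping -c4 localhost >/dev/null 2>&1; then
        echo "localhost: FAILED (local network stack problem)"
        return 1
    fi
    echo "localhost: ok"

    # Step 2: the remote IP. No name lookup is involved at this step.
    if ! ping -c4 "$ip" >/dev/null 2>&1; then
        echo "$ip: FAILED (connectivity problem)"
        return 2
    fi
    echo "$ip: ok"

    # Step 3: the hostname. The IP already answered, so a failure suggests DNS.
    if ! ping -c4 "$host" >/dev/null 2>&1; then
        echo "$host: FAILED (IP answers, so suspect DNS)"
        return 3
    fi
    echo "$host: ok"
}
```

Call it as check_connectivity 192.0.2.10 www.example.com, where both values are placeholders for your own server. The return code tells you which step failed.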
ping messages give some clues as to where the problem lies. This example shows that the hostname resolves, and there is a route to the host, but ping is receiving no responses of any kind:
$ ping -c10 somename.com
PING somename.com (220.127.116.11) 56(84) bytes of data.
--- somename.com ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 9999ms
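If you want to check this from a script, the summary line is the part to look at. Here is a rough sketch that pulls out the loss percentage; packet_loss is a hypothetical function name, and it assumes the summary line has the same wording as the output above.

```shell
#!/bin/sh
# packet_loss: read ping output on stdin and print the loss percentage
# from the summary line (for example 100). Prints nothing if no summary
# line in the expected format is found.
packet_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}
```

For example, ping -c10 somename.com | packet_loss would print 100 for the run shown above.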
You can try pinging the IP to see if it's a DNS problem:
$ ping -c10 220.127.116.11
PING 220.127.116.11 (220.127.116.11) 56(84) bytes of data.
--- 220.127.116.11 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 8999ms
Nope, it's not DNS. Chances are the entire remote network is offline, because if only the one host were down you should at least get a "Destination Host Unreachable" message from the network's border router. But the problem could be anywhere between you and the remote machine, and pinpointing an Internet trouble spot is difficult and frustrating. In the olden days traceroute was a good tool for this, but in these here modern times a lot of network admins configure their routers to not respond to traceroute packets. ping is often blocked as well, so silence alone does not prove the host is down.
A good alternative is to use tcptraceroute. Where traceroute sends UDP datagrams or ICMP ECHO requests, tcptraceroute sends TCP SYN packets, so its probes are much less likely to be blocked. A nice bonus is that tcptraceroute traverses NAT firewalls. Use it like this:
$ tcptraceroute somename.com
Selected device wan, address 184.108.40.206, port 32783 for outgoing packets
Tracing the path to somename.com (220.127.116.11) on TCP port 80 (www), 30 hops max
1 18.104.22.168 18.383 ms 15.855 ms 14.915 ms
2 router.foo.net (22.214.171.124) 16.884 ms 15.412 ms 14.670 ms
3 126.96.36.199 15.942 ms 16.928 ms 14.914 ms
4 188.8.131.52 45.727 ms 44.255 ms 43.988 ms
5 184.108.40.206 57.315 ms 55.858 ms 63.676 ms
6 tbr1-p012501.st6wa.ip.att.net (220.127.116.11) 56.307 ms 60.762 ms 54.591 ms
7 18.104.22.168 60.220 ms 59.547 ms 52.862 ms
8 POS2-0.BR1.SEA1.ALTER.NET (22.214.171.124) 51.870 ms * 51.498 ms
9 0.so-4-2-0.XL1.SEA1.ALTER.NET (126.96.36.199) 55.560 ms 52.386 ms 55.570 ms
10 0.so-7-0-0.XL1.DCA6.ALTER.NET (188.8.131.52) 117.896 ms 115.958 ms 121.841 ms
11 0.so-6-0-0.WR1.IAD6.ALTER.NET (184.108.40.206) 119.130 ms 120.162 ms 136.860 ms
12 so-1-0-0.ur1.iad6.web.wcom.net (220.127.116.11) 117.899 ms 126.721 ms 118.613 ms
13 18.104.22.168 120.843 ms 117.494 ms 116.865 ms
14 * * *
15 uu-3-166.hostdomains.com (22.214.171.124) 'open' 120.232 ms 145.716 ms 176.254 ms
This shows that tcptraceroute can trace the remote server all the way to its origin, which is the fictional hostdomains.com, a Web hosting service. So now we know the pipeline is open end-to-end, but the somename.com server is not reachable, and we know which service provider to nag to fix it. If it were your own remote location, this would tell you that the problem is at your remote site.
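On a long trace it helps to pick out the interesting lines automatically. The two functions below are a sketch under one assumption: that the output looks like the sample above, with a hop number in the first column and "ms" after each round-trip time. The function names are made up for this example.

```shell
#!/bin/sh
# last_responding_hop: read tcptraceroute-style output on stdin and print
# the final hop line that contains at least one round-trip time.
last_responding_hop() {
    awk '$1 ~ /^[0-9]+$/ && / ms/ { last = $0 }
         END { if (last != "") print last }'
}

# silent_hops: print the numbers of hops that answered none of the
# three probes (shown as "* * *" in the output).
silent_hops() {
    awk '$1 ~ /^[0-9]+$/ && $2 == "*" && $3 == "*" && $4 == "*" { print $1 }'
}
```

Save the trace to a file first, for example tcptraceroute somename.com > trace.txt, then run last_responding_hop < trace.txt and silent_hops < trace.txt. For the trace above, the first prints the hop 15 line and the second prints 14.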
If it's possible, set up a direct dial-in connection to any remote server that you are running. This lets you find out quickly whether it's up, and also lets you run troubleshooting tests from the other direction.
Other Ping Messages
Destination Host Unreachable means that ping got as far as a router that is local to the remote host, but it cannot reach the remote host.
unknown host means you entered the wrong hostname, DNS is broken, or you are not connected to the network. Ping the IP to see if it's a DNS or connectivity problem.
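These messages can be turned into quick hints by a script. The sketch below uses a hypothetical function, explain_ping_error, that reads saved ping output and prints a one-line hint. The message strings are assumptions about what your ping prints: "unknown host" is the wording used above, and some newer ping versions report "Name or service not known" instead.

```shell
#!/bin/sh
# explain_ping_error: read ping output (stdout and stderr) on stdin and
# print a one-line hint based on the messages described above.
explain_ping_error() {
    output=$(cat)
    case $output in
        *"Destination Host Unreachable"*)
            echo "A router answered but the host did not: check the remote host or its local network" ;;
        # Assumption: the wording of the lookup failure varies by ping version.
        *"unknown host"*|*"Name or service not known"*)
            echo "Name lookup failed: check the hostname, DNS, and your own connection, then ping the IP" ;;
        *" 100% packet loss"*)
            echo "No replies at all: host or network down, or ping is being blocked" ;;
        *" 0% packet loss"*)
            echo "All replies received: basic connectivity is fine" ;;
        *)
            echo "No known message found: read the output by hand" ;;
    esac
}
```

Capture both output streams when you use it, for example ping -c4 somename.com 2>&1 | explain_ping_error.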
If you are on a multi-homed box, use ping's -I flag to select a single interface:
# ping -I eth1 'IP or hostname'
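To test every interface on a multi-homed box in one go, you can loop over them. This is a minimal sketch: ping_via_interfaces is a made-up name, the interface names are whatever your system uses, and it may need to run as root like the command above.

```shell
#!/bin/sh
# ping_via_interfaces TARGET IFACE...  (hypothetical helper)
# Pings TARGET once through each named interface and reports the result.
ping_via_interfaces() {
    target=$1
    shift
    for iface in "$@"; do
        if ping -c4 -I "$iface" "$target" >/dev/null 2>&1; then
            echo "$iface: replies received"
        else
            echo "$iface: no replies"
        fi
    done
}
```

For example, ping_via_interfaces somename.com eth0 eth1 prints one line per interface, which shows at a glance which path is broken.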