Traceroute is a very handy tool, written by Van Jacobson, that shows you the route packets take from one host to another. If you know how to interpret its results, it can also help you debug network problems.
How Does it Work?
Every IP packet carries a value, the Time To Live (TTL), that specifies how many hops it can go through before it is no longer forwarded on. Each router along the way decrements that value, and the router that sees it reach zero drops the packet; it will usually also send a message back to the source host saying, "Hey, sorry, but your packet died here." Traceroute cleverly manipulates this value: the first round of packets it sends to the designated host can only go through one hop before dying. The first hop gets those packets, sees that it's not supposed to forward them any further, and sends a message back to the source host telling it that the packets died. When traceroute receives the "your packet died here" message from the router, it knows that router is the first hop. It then sends a second round of packets that can go through two hops, and the cycle continues until it gets a response from the final destination. For each probe, traceroute displays the RTT (Round Trip Time): the difference between the time the probe was sent and the time its response arrived.
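The probing cycle described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not how any real traceroute is implemented: the path is a plain list of names, no packets are sent, and the function names send_probe and trace are made up for this example.

```python
from typing import List, Tuple


def send_probe(path: List[str], hop_limit: int) -> Tuple[str, str]:
    """Simulate one probe along `path` (routers first, destination last).

    A probe allowed `hop_limit` hops dies at the router in that position,
    unless the limit is large enough to reach the destination.
    """
    if hop_limit >= len(path):
        return path[-1], "reached"
    return path[hop_limit - 1], "died-here"


def trace(path: List[str], max_hops: int = 30) -> List[Tuple[int, str]]:
    """Raise the hop limit one step at a time until the destination answers."""
    hops = []
    for hop_limit in range(1, max_hops + 1):
        responder, kind = send_probe(path, hop_limit)
        hops.append((hop_limit, responder))
        if kind == "reached":
            break
    return hops


if __name__ == "__main__":
    # Hypothetical three-node path using names from the example output.
    example_path = ["border1", "host-c-129.vcso.verio.net", "neon.neo.com"]
    for number, name in trace(example_path):
        print(number, name)
```

The loop stops either when the destination answers or when max_hops is exhausted, which mirrors the "30 hops max" shown in the traceroute header.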
Let's take a look at an example traceroute:
[onyx]:[3:30am]:[/home/rnejdl] > traceroute www.neo.com
traceroute to www.neo.com (18.104.22.168), 30 hops max, 40 byte packets
1 border1 (22.214.171.124) 2.053 ms 2.054 ms 2.322 ms
2 host-c-129.vcso.verio.net (126.96.36.199) 3.644 ms 1.195 ms 1.180 ms
3 int1-s1-0.dlls.tx.verio.net (188.8.131.52) 1.810 ms 1.795 ms 1.828 ms
4 ge-5-0-0.a10.dllstx01.us.ra.verio.net (184.108.40.206) 2.125 ms 2.070 ms 2.020 ms
5 ge-6-0.r00.dllstx01.us.bb.verio.net (220.127.116.11) 1.975 ms 1.969 ms 2.013 ms
6 ATM3-0.BR1.DFW9.ALTER.NET (18.104.22.168) 3.269 ms 3.175 ms 2.989 ms
7 140.at-5-0-0.XR2.DFW9.ALTER.NET (22.214.171.124) 3.583 ms 3.342 ms 3.167 ms
8 184.at-2-0-0.TR2.DFW9.ALTER.NET (126.96.36.199) 3.483 ms 3.249 ms 3.448 ms
9 128.at-5-1-0.TR2.LAX9.ALTER.NET (188.8.131.52) 35.689 ms 35.958 ms 35.937 ms
10 196.ATM6-0.XR2.LAX4.ALTER.NET (184.108.40.206) 38.666 ms 38.775 ms 38.837 ms
11 192.ATM4-0.GW6.LAX4.ALTER.NET (220.127.116.11) 38.846 ms 38.879 ms 39.045 ms
12 softawareoc3-gw.customer.alter.net (18.104.22.168) 34.409 ms 34.260 ms 34.437 ms
13 newDuke-p1.softaware.com (22.214.171.124) 34.901 ms 35.183 ms 35.111 ms
14 stonewall.softaware.com (126.96.36.199) 35.395 ms 35.748 ms 35.395 ms
15 flint-softaware.mla.cyberverse.net (188.8.131.52) 42.941 ms 37.415 ms 50.875 ms
16 tucker.mla.cyberverse.net (184.108.40.206) 56.009 ms 62.434 ms 47.725 ms
17 neon.neo.com (220.127.116.11) 98.993 ms 76.639 ms 72.790 ms
[onyx]:[3:30am]:[/home/rnejdl] >
From this traceroute, you can see that it took 17 hops to go from onyx.training.verio.net to www.neo.com and that the round trip time was roughly 72-98 ms (based on the three numbers on the last line). Keep in mind that the RTTs reported are the round trip times from the source host to that router hop. Each line already covers the whole trip from the source to that hop and back, so you don't add the times from earlier lines to it. Each hop adds some time to the path, so you'd expect each hop to take a little longer to reach than the last. That is pretty much the case here, except for slight fluctuations on the order of milliseconds due to network traffic.
Now an important thing to know when using traceroute is what the asterisks/stars mean. If you see traceroute print out a star instead of a round trip time, that means that either your probe packet got dropped, or the reply back to you for that probe got lost along the way. This is usually referred to as "packet loss," and we will discuss this later.
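To make the output format concrete, here is a minimal sketch of reading one hop line, treating each star as a lost probe. It assumes the layout seen in the examples in this article (hop number, hostname, address in parentheses, then either a time followed by "ms" or a star per probe); the names parse_hop and loss_fraction are made up for this illustration, and other traceroute versions may print lines differently.

```python
from typing import List, Optional, Tuple


def parse_hop(line: str) -> Tuple[int, List[Optional[float]]]:
    """Return (hop number, per-probe RTTs in ms); a lost probe is None."""
    fields = line.split()
    hop = int(fields[0])
    rtts: List[Optional[float]] = []
    for i, token in enumerate(fields[1:], start=1):
        if token == "*":
            rtts.append(None)                  # no reply for this probe
        elif token == "ms":
            rtts.append(float(fields[i - 1]))  # the number just before "ms"
    return hop, rtts


def loss_fraction(rtts: List[Optional[float]]) -> float:
    """Fraction of probes at this hop that got no reply."""
    return sum(r is None for r in rtts) / len(rtts)


if __name__ == "__main__":
    hop, rtts = parse_hop("11 dybbuk.hellers.com (188.8.131.52) 85.513 ms * 75.824 ms")
    print(hop, rtts, loss_fraction(rtts))
```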
Understanding the Route
To understand how to interpret a route, you will need to know a little bit about interpreting reverse DNS. Whenever a traceroute is done, the program looks up the reverse DNS of each host as it goes and prints that information as part of each line. This can give you clues about each network that a packet goes through as it travels from you to the final destination. Let's go through an example and show how to interpret it.
[onyx]:[4:07am]:[/home/rnejdl] > traceroute www.idsoftware.com
traceroute to charon.idsoftware.com (18.104.22.168), 30 hops max, 40 byte packets
1 border1 (22.214.171.124) 2.103 ms 1.995 ms 2.407 ms
2 host-c-129.vcso.verio.net (126.96.36.199) 1.324 ms 1.158 ms 1.169 ms
3 int1-s1-0.dlls.tx.verio.net (188.8.131.52) 1.827 ms 1.866 ms 1.858 ms
4 ge-5-0-0.a10.dllstx01.us.ra.verio.net (184.108.40.206) 2.136 ms 2.084 ms 2.097 ms
5 ge-6-0.r00.dllstx01.us.bb.verio.net (220.127.116.11) 2.056 ms 1.954 ms 1.981 ms
6 ATM3-0.BR1.DFW9.ALTER.NET (18.104.22.168) 3.122 ms 3.078 ms 3.344 ms
7 140.at-6-0-0.XR2.DFW9.ALTER.NET (22.214.171.124) 3.260 ms 3.507 ms 3.330 ms
8 284.ATM7-0.XR2.DFW4.ALTER.NET (126.96.36.199) 4.436 ms 4.342 ms 4.306 ms
9 194.ATM9-0-0.GW1.DFW1.ALTER.NET (188.8.131.52) 5.085 ms 4.628 ms 5.285 ms
10 savvis-dfw-gw.customer.ALTER.NET (184.108.40.206) 5.571 ms 5.964 ms 5.462 ms
11 idsoft-1.CR-1.usdlls.savvis.net (220.127.116.11) 11.800 ms 9.365 ms 10.870 ms
12 charon.idsoftware.com (18.104.22.168) 10.687 ms 9.927 ms 11.687 ms
[onyx]:[4:07am]:[/home/rnejdl] >
In this example, we have tracerouted to www.idsoftware.com from a host within Verio's network. We can now analyze each hop along the way.
From this traceroute, we can tell that www.idsoftware.com is hosted by ID Software themselves in the Dallas/Fort Worth metroplex. We also know that ID Software is a customer of savvis.net, which is in turn a customer of Alter.net.
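The habit of reading the network off each reverse DNS name can be sketched in a few lines. This is a rough heuristic, assuming the last two labels of a hostname identify the network; that holds for the names in this article but not for every domain (for example, names under two-part suffixes such as co.uk). The function names network_of and networks_crossed are made up for this illustration.

```python
from typing import List


def network_of(hostname: str) -> str:
    """Guess the owning network from the last two labels of a hostname."""
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def networks_crossed(hostnames: List[str]) -> List[str]:
    """Collapse a list of hop hostnames into the sequence of networks seen."""
    crossed: List[str] = []
    for name in hostnames:
        net = network_of(name)
        if not crossed or crossed[-1] != net:
            crossed.append(net)
    return crossed


if __name__ == "__main__":
    # A few fully qualified hop names taken from the idsoftware.com example.
    hops = [
        "host-c-129.vcso.verio.net",
        "ge-6-0.r00.dllstx01.us.bb.verio.net",
        "ATM3-0.BR1.DFW9.ALTER.NET",
        "savvis-dfw-gw.customer.ALTER.NET",
        "idsoft-1.CR-1.usdlls.savvis.net",
        "charon.idsoftware.com",
    ]
    print(" -> ".join(networks_crossed(hops)))
```

Note that a name like savvis-dfw-gw.customer.ALTER.NET still belongs to Alter.net's naming; it is the name itself that hints at the customer on the other side of the link.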
Let's look at another traceroute:
[onyx]:[4:07am]:[/home/rnejdl] > traceroute www.7up.com
traceroute to 7up.com (22.214.171.124), 30 hops max, 40 byte packets
1 border1 (126.96.36.199) 2.093 ms 2.019 ms 2.336 ms
2 host-c-129.vcso.verio.net (188.8.131.52) 3.700 ms 1.171 ms 1.193 ms
3 int1-s1-0.dlls.tx.verio.net (184.108.40.206) 1.810 ms 1.883 ms 1.856 ms
4 ge-5-0-0.a10.dllstx01.us.ra.verio.net (220.127.116.11) 2.144 ms 2.113 ms 2.027 ms
5 ge-6-0-0.r01.dllstx01.us.bb.verio.net (18.104.22.168) 2.096 ms 2.070 ms 2.176 ms
6 p1-0-0-0.r01.oremut01.us.bb.verio.net (22.214.171.124) 59.328 ms 59.354 ms 59.319 ms
7 pvu1.vwhpvu1.verio.net (126.96.36.199) 59.516 ms 60.039 ms 59.890 ms
8 7up.com (188.8.131.52) 60.132 ms 60.405 ms 60.496 ms
[onyx]:[4:11am]:[/home/rnejdl] >
This traceroute follows much the same path as the last one up to hop 5. Hop 6 is the Verio router in Orem, Utah that Iserver uses. The router on hop 7, pvu1.vwhpvu1.verio.net, is the border router for Iserver (the router that connects Iserver to the Verio backbone). Finally, we can see that www.7up.com is hosted on an Iserver platform.
We will look at one more traceroute that shows another example of what you might see.
[onyx]:[4:22am]:[/home/rnejdl] > traceroute www.hellers.com
traceroute to www.hellers.com (184.108.40.206), 30 hops max, 40 byte packets
1 border1 (220.127.116.11) 3.743 ms 2.090 ms 2.430 ms
2 host-c-129.vcso.verio.net (18.104.22.168) 1.351 ms 1.187 ms 1.126 ms
3 int1-s1-0.dlls.tx.verio.net (22.214.171.124) 1.810 ms 1.801 ms 1.810 ms
4 ge-5-0-0.a10.dllstx01.us.ra.verio.net (126.96.36.199) 2.132 ms 2.070 ms 2.022 ms
5 ge-6-0.r00.dllstx01.us.bb.verio.net (188.8.131.52) 2.143 ms 2.231 ms 2.022 ms
6 p1-1-0-2.r03.mclnva01.us.bb.verio.net (184.108.40.206) 45.922 ms 45.934 ms 45.967 ms
7 p4-0-1.r00.mclnva02.us.bb.verio.net (220.127.116.11) 45.999 ms 45.979 ms 45.805 ms
8 fa-3-0-0.a06.vinnva01.us.ra.verio.net (18.104.22.168) 46.159 ms 46.191 ms 46.215 ms
9 fa-0.n01.vinnva01.us.ra.verio.net (22.214.171.124) 47.217 ms 46.802 ms 46.860 ms
10 router.hellers.com (126.96.36.199) 73.894 ms 73.852 ms 74.150 ms
11 dybbuk.hellers.com (188.8.131.52) 85.513 ms * 75.824 ms
[onyx]:[4:22am]:[/home/rnejdl] >
This last traceroute is to www.hellers.com. This website is reached through the Verio network, but it is hosted by a customer of Verio's, so it is not Verio's responsibility beyond maintaining connectivity.
Caveats and Quirks
Before we continue, there are a couple of caveats to using traceroute that you should be aware of, so you don't accidentally misinterpret the results.
The first caveat is that sometimes it will look like the last hop on a traceroute dropped a packet when it really didn't. This comes from two things: that host is the actual final destination of your traceroute probes, and certain operating systems handle ICMP in a particular way. (ICMP, the Internet Control Message Protocol, is one protocol that machines on the Internet use to send messages to each other, and the "Your packet died here" message that traceroute relies on is an ICMP message.) Since the last hop is your destination, instead of sending you back an ICMP message saying "Sorry, your packet died here," that host will send back a different ICMP message saying "Hi, your packet made it here, but this port is unreachable." This is because traceroute purposefully sets the probe packet's destination port to some large port number that will most likely be unreachable at the destination host, precisely because it wants to receive that "port unreachable" message back. The caveat is that some operating systems, such as IOS (which Cisco routers run) and Sun Solaris, purposefully drop ICMP responses like "port unreachable" if they have to send too many of them in a short period of time. They do this presumably as a security precaution. So, if you were to add more delay between probes, you wouldn't see this erroneous packet loss.
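The two kinds of reply described above can be told apart by the ICMP type and code fields. The sketch below is only an illustration: the numeric values come from the ICMP specification (RFC 792) rather than from this article, and the function name classify_reply and its return strings are made up for this example.

```python
from typing import Optional

# Values from the ICMP specification (RFC 792), not from this article.
ICMP_DEST_UNREACHABLE = 3   # "destination unreachable" message type
ICMP_TIME_EXCEEDED = 11     # "time exceeded" message type
CODE_TTL_EXCEEDED = 0       # hop limit ran out in transit
CODE_PORT_UNREACHABLE = 3   # nothing is listening on the probed port


def classify_reply(icmp_type: Optional[int], icmp_code: Optional[int]) -> str:
    """Map one reply (or a timeout, given as None) to what traceroute infers."""
    if icmp_type is None:
        return "no reply (printed as *)"
    if icmp_type == ICMP_TIME_EXCEEDED and icmp_code == CODE_TTL_EXCEEDED:
        return "intermediate hop: your packet died here"
    if icmp_type == ICMP_DEST_UNREACHABLE and icmp_code == CODE_PORT_UNREACHABLE:
        return "destination: packet arrived, port unreachable"
    return "other ICMP message"


if __name__ == "__main__":
    print(classify_reply(11, 0))
    print(classify_reply(3, 3))
    print(classify_reply(None, None))
```

A destination that rate-limits its "port unreachable" replies ends up in the first branch for some probes, which is exactly the false packet loss described above.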
Another caveat of traceroute is that ICMP, which is the protocol traceroute relies on to get responses from each hop, is usually the lowest priority protocol. So if one router is really busy it might decide to drop ICMP messages, and you will see lots of packet loss, but that router might be forwarding on more common, higher priority traffic just fine.
Also, some sites will filter ICMP for various reasons, so it might appear in a traceroute that a site might be unreachable, but in fact it is reachable.
Tracking Down Network Problems
Now that you have a basic understanding of traceroute, it's time to learn how to use it to track down network problems. The first kind of problem that traceroute can help you debug is a loss or lack of connectivity to a site. If you appear to be having problems reaching a remote site, like a web site, do a traceroute to that site. If the traceroute reaches the site fine, then chances are that you have connectivity to the host and the problem is with the web server on that host (it may have crashed, for example). But if the packets start to die somewhere along the path, it's likely that some router along the way, or the host itself, is down. Here is an example traceroute:
traceroute to 184.108.40.206 (220.127.116.11): 1-30 hops, 38 byte packets
1 SF-rt5-fe9-0.geo.net (18.104.22.168) 0.48 ms 0.440 ms 0.378 ms
2 SF-core1-h1.geo.net (22.214.171.124) 0.618 ms 0.571 ms 0.521 ms
3 SF-rt2-f0.geo.net (126.96.36.199) 1.19 ms 1.94 ms 1.13 ms
4 * * *
5 * * *
Just remember that such a traceroute can also be an example of a firewall that is filtering packets, or a router that throws away the kinds of packets that traceroute depends on when it gets overloaded.
Debugging Network Slowdowns
Using traceroute's results to see which hops IP packets take from you to a remote host is really straightforward. However, using traceroute's results to work out where "slowness" occurs on a path is fairly tricky, for a number of reasons. The first is that traceroute only shows you the hops from you to a remote host, not the hops from the remote host back to you, and the two are often different: pretty much every Tier 1 ISP on the Internet uses closest-exit routing, which often results in asymmetric routes (completely different routes from host A to B than from host B to A). So, the best way to determine where network slowness is occurring is to do a traceroute from host A to host B, and then another traceroute from host B back to host A. By looking at both, a trained eye can usually get a pretty good idea of where the slowness is occurring.
For instance, host A might be on the west coast using ISP X, and host B might be on the east coast using ISP Y. The path from host A to host B will then probably exit ISP X as soon as it can, most likely at some peering point on the west coast and enter ISP Y's network from there onto host B. Conversely, the path from host B to host A will most likely exit ISP Y's network as soon as it can on the east coast, and enter ISP X's network and continue on to host A.
Here's an example:
traceroute to web-proxy.geo.net (188.8.131.52)
1 E40-RTR-E40-SERVER72-ETHER.MIT.EDU (184.108.40.206) 4 ms 4 ms 4 ms
2 EXTERNAL-RTR-FDDI.MIT.EDU (220.127.116.11) 4 ms 4 ms 4 ms
3 cambridge2-br2.bbnplanet.net (18.104.22.168) 4 ms 4 ms 4 ms
4 cambridge1-br1.bbnplanet.net (22.214.171.124) 4 ms 78 ms 105 ms
5 nyc1-br2.bbnplanet.net (126.96.36.199) 12 ms 12 ms 12 ms
6 nynap.bbnplanet.net (188.8.131.52) 12 ms 12 ms 16 ms
7 sprint-nap.geo.net (184.108.40.206) 94 ms 82 ms 74 ms
8 SF-rt5-a1.geo.net (220.127.116.11) 70 ms 78 ms 74 ms
9 SF-core1-h1.geo.net (18.104.22.168) 82 ms 78 ms 74 ms
10 SF-rt2-f0.geo.net (22.214.171.124) 273 ms 234 ms 98 ms
11 web-proxy.geo.net (126.96.36.199) 133 ms 90 ms 82 ms

traceroute to BIG-SCREW.MIT.EDU (188.8.131.52), 30 hops max, 40 byte packets
1 SF-rt2-f2.geo.net (184.108.40.206) 1.218 ms 1.219 ms 1.479 ms
2 SF-core1-f0.geo.net (220.127.116.11) 0.704 ms 0.68 ms 0.678 ms
3 MAE-West-h0.geo.net (18.104.22.168) 3.926 ms 3.402 ms 4.285 ms
4 sanjose1-br1.bbnplanet.net (22.214.171.124) 5.071 ms 4.839 ms 6.973 ms
5 su-bfr.bbnplanet.net (126.96.36.199) 6.695 ms 6 ms 8.342 ms
6 chicago1-br2.bbnplanet.net (188.8.131.52) 71.597 ms 70.278 ms 70.166 ms
7 boston1-br1.bbnplanet.net (184.108.40.206) 76.612 ms 74.881 ms 75.66 ms
8 boston1-br2.bbnplanet.net (220.127.116.11) 74.099 ms 77.012 ms 76.715 ms
9 cambridge2-br1.bbnplanet.net (18.104.22.168) 75.399 ms 75.376 ms 74.932 ms
10 ihtfp.mit.edu (22.214.171.124) 78.895 ms 76.066 ms 76.434 ms
11 E40-RTR-FDDI.MIT.EDU (126.96.36.199) 77.556 ms 76.115 ms 75.627 ms
12 BIG-SCREW.MIT.EDU (188.8.131.52) 76.484 ms 76.226 ms 77.748 ms
Note the vastly different paths that these two traceroutes take from host A to host B and from host B to host A, each with a different number of hops. The first traceroute shows that the path from MIT to geo.net goes through the Sprint NAP, an exchange point in New Jersey. This makes sense, since MIT is on the east coast and BBN is using closest-exit routing. The second traceroute shows that the path from geo.net in San Francisco back to MIT goes through MAE West, an exchange point in the San Francisco Bay Area and the closest exit point for geo.net.
Now, to make the issue more confusing, the second reason why tracking down network "slowness" is tricky is that in networking there is no "slow" or "fast"; instead there are bandwidth and latency, two different concepts that both determine how "fast" a network is. (If you are unclear on the difference between bandwidth and latency, check out a cool paper written by Stuart Cheshire called "It's the Latency, Stupid".)
Tracking Down Packet Loss
So now we know that bandwidth is how many packets you can stuff in your pipe and that latency is the delay, and that packet loss can adversely affect both. So, in general, when trying to track down network "slowness", you should be looking for packet loss. But this can get kind of tricky because packet loss is random. So, you might actually be getting packet loss at hop #2, but with the default 3 probes per hop, maybe all 3 will get back OK. Then at later hops you will start noticing the packet loss that really occurs at hop #2, but it might look like it's occurring at hop #3. So, it's usually better to do more than 3 probes per hop.
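To see why three probes per hop can easily miss real loss, here is a small back-of-the-envelope sketch. It assumes each probe is lost independently with the same probability, which is a simplification (real loss is often bursty); the function name chance_loss_goes_unseen is made up for this illustration. Many traceroute implementations let you raise the probe count with a -q option, but check your own version's manual page.

```python
def chance_loss_goes_unseen(loss_rate: float, probes: int) -> float:
    """Probability that every probe at a hop comes back despite the loss.

    Assumes each probe is lost independently with probability `loss_rate`.
    """
    return (1.0 - loss_rate) ** probes


if __name__ == "__main__":
    # With 10% loss, compare the default 3 probes against 10 and 20 probes.
    for probes in (3, 10, 20):
        unseen = chance_loss_goes_unseen(0.10, probes)
        print(f"{probes:2d} probes: {unseen:.1%} chance the hop looks clean")
```

Under that assumption, a hop with 10% loss looks perfectly clean about 73% of the time with three probes, and the chance shrinks as the probe count goes up.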
Let's try to debug a bad traceroute and see what might be causing the problem. So as not to make any specific ISP look bad, some hostnames and IP addresses have been changed to protect the innocent. Let's say you're connected to GeoNet via a T1, and you have another office in Chicago that is connected via a different ISP. One day you notice some definite slowness in transferring files and/or logging into machines at the remote site, and you want to see where the problem lies. So you decide to do some traceroutes. A traceroute from your GeoNet-connected office shows you:
traceroute to chicago4.mycompany.com (3184.108.40.206): 1-30 hops, 38 byte packets
1 router.SanFrancisco.mycompany.com (220.127.116.118) 3.52 ms 2.75 ms 2.63 ms
2 some_interconnect.geo.net (166.90.420.231) 71.5 ms 3.71 ms 3.5 ms
3 SF-core1-h1.geo.net (18.104.22.168) 3.23 ms 3.20 ms 3.25 ms
4 MAE-West-h0.geo.net (22.214.171.124) 7.30 ms 13.7 ms 6.33 ms
5 mae-west.other-isp.net (126.96.36.1996) 21.0 ms 31.4 ms 29.7 ms
6 core2.SanFrancisco.other-isp.net (254.70.100.245) 21.44 ms 32.2 ms 32.5 ms
7 core1.Denver.other-isp.net (254.70.40.229) 73.1 ms * 97.4 ms
8 border3.Chicaco.other-isp.net (254.70.56.23) 62.3 ms 86.23 ms 53.88 ms
9 my-company-t1.Chicago.other-isp.net (254.70.111.34) 120.43 ms 95.3 ms 86.44 ms
10 router.Chicago.other-isp.net (3188.8.131.52) * * 112.42 ms
11 chicago4.mycompany.com (3184.108.40.206) 132.34 ms * 104.12 ms
So looking at this traceroute, you can see that there is some packet loss, but it's hard to tell exactly where it starts. It could be the link between hops 6 and 7, but it's hard to know for sure. So, being an educated tracerouter, you decide to do a traceroute from Chicago back to your office in San Francisco. You get:
traceroute to sf13.mycompany.com (220.127.116.117): 1-30 hops, 38 byte packets
1 router.Chicago.other-isp.net (318.104.22.168) 3.85 ms 2.64 ms 4.15 ms
2 my-company-t1.Chicago.other-isp.net (254.70.111.33) 5.16 ms 3.94 ms 7.22 ms
3 border3.Chicaco.other-isp.net (254.70.56.23) 3.62 ms 4.28 ms 5.15 ms
4 core1.Denver.other-isp.net (254.70.40.229) 25.8 ms 27.2 ms 23.7 ms
5 core2.SanFrancisco.other-isp.net (254.70.100.245) 141.0 ms * 49.7 ms
6 pb-nap.geo.net (22.214.171.124) 123.43 ms * 76.22 ms
7 SF-rt3-f0.geo.net (166.90.354.7) * 94.12 ms 102.32 ms
8 some_interconnect.geo.net (166.90.420.232) 85.24 ms * 97.3 ms
9 sf13.mycompany.com (126.96.36.1997) 117.31 ms 234.42 ms 99.19 ms
So now you have more to go on. First of all, you see that this route is an asymmetric one: the first route is 11 hops and the route back is 9 hops. The number of hops doesn't make any significant difference in how fast your connection is, but an asymmetric route can make packet loss and latency increases appear to occur between two hops when the problem isn't really there. This is because the packet loss or increase in latency might be between two hops you don't even see, since the route back to you is completely different.
So now you can make an educated guess as to where the packet loss might be occurring. Based on the first traceroute, it looked like the bad link might be between core2.SanFrancisco.other-isp.net and core1.Denver.other-isp.net, and the route back in the other direction suggests that this assumption might be correct. At this point, your best bet is to copy and paste your traceroutes and send them to the appropriate NOC (Network Operations Center). With this type of information, you have a much better chance of getting the problem tracked down than if you just sent an e-mail saying "my connection to my Chicago office is slow." It also gives you a better understanding of how traffic is exchanged on the Internet.
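The reasoning above, comparing where loss first shows up in each direction, can be sketched as follows. The lost-probe counts are transcribed from the two example traceroutes; the function name first_lossy_link is made up for this illustration, and "the first hop that shows a star" is only a heuristic, for the reasons given earlier about random loss and ICMP handling.

```python
from typing import List, Optional, Tuple

Hop = Tuple[str, int]  # (hostname, number of probes that came back as *)


def first_lossy_link(hops: List[Hop]) -> Optional[Tuple[str, str]]:
    """Return (previous hop, first hop showing loss), or None if no loss."""
    for index, (host, lost) in enumerate(hops):
        if lost > 0:
            previous = hops[index - 1][0] if index > 0 else "(source)"
            return previous, host
    return None


# San Francisco to Chicago, counts read off the first traceroute.
forward = [
    ("router.SanFrancisco.mycompany.com", 0), ("some_interconnect.geo.net", 0),
    ("SF-core1-h1.geo.net", 0), ("MAE-West-h0.geo.net", 0),
    ("mae-west.other-isp.net", 0), ("core2.SanFrancisco.other-isp.net", 0),
    ("core1.Denver.other-isp.net", 1), ("border3.Chicaco.other-isp.net", 0),
    ("my-company-t1.Chicago.other-isp.net", 0),
    ("router.Chicago.other-isp.net", 2), ("chicago4.mycompany.com", 1),
]

# Chicago back to San Francisco, counts read off the second traceroute.
reverse = [
    ("router.Chicago.other-isp.net", 0), ("my-company-t1.Chicago.other-isp.net", 0),
    ("border3.Chicaco.other-isp.net", 0), ("core1.Denver.other-isp.net", 0),
    ("core2.SanFrancisco.other-isp.net", 1), ("pb-nap.geo.net", 1),
    ("SF-rt3-f0.geo.net", 1), ("some_interconnect.geo.net", 1),
    ("sf13.mycompany.com", 0),
]

if __name__ == "__main__":
    print("forward:", first_lossy_link(forward))
    print("reverse:", first_lossy_link(reverse))
```

Both directions point at the same pair of routers, core2.SanFrancisco and core1.Denver, which is what makes that link the best guess to report.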
In summary, traceroute is a network diagnostic tool that will show you the hops your Internet traffic takes from your host to a remote location. It will also tell you how long it takes for packets to get from your host to each hop as well as if packets get lost along the way, which can be useful in tracking network problems. Since routes on the Internet are often asymmetric, it's usually a good idea to do traceroutes in both directions if possible when trying to debug network slowness. In doing so, you can provide your ISP with crucial information that can help them to fix the network problem.
Here are some exercises that you can do to practice your traceroute skills and learn to interpret the output better.
If you have read through this article and gone through the exercises, then you should be ready to take a quiz on this to see how well you retained this knowledge.
Last modified: March 12 2013.