For this assignment we were tasked with using the “traceroute” command-line tool to gain insight into the paths that packets take across IP addresses when we try to access certain websites.
I chose to analyze traffic from adultswim.com (a media streaming site), duckduckgo.com (a search engine), drive.google.com (Google Drive), popvapor.com (a vendor), reddit.com (a highly trafficked social network), pornhub.com (not an American company, self-explanatory), classes.nyu.edu (an NYU-hosted website), and samekrystal.com (my personal website, hosted by GoDaddy). I ran traceroute in three locations, under three different conditions each: at my childhood home in Woodbridge, Connecticut, at my apartment in Downtown Brooklyn, and in NYU’s Interactive Telecommunications Program (ITP) facilities. In each location, I ran “traceroute” twice with Mullvad VPNs, one European (Zurich by way of Oslo) and one American (Atlanta), and once without a VPN.
I used the traceroute mapper tool provided by Tom Igoe to visualize my traceroute logs, and have included my logged files HERE for individuals with NYU credentials. I applied for a free trial of MaxMind’s GeoIP tool to conduct further analysis, and used their IP batch lookup tool to get a more granular understanding of my traceroute logs. My GeoIP results and pivot table analyses can be found HERE for individuals with NYU credentials.
Overarching Insights, Patterns
- It should be noted that my results are not very accurate for reasons covered below.
- I found this resource helpful for understanding what goes on during traceroute, how to better analyze my logs, and why I should be skeptical of my results and insights.
- Internet Service Providers (ISPs)
After running my logs through GeoIP batch lookup, I was better able to tabulate which ISPs my packets ran through. Though they are not perfect, my insights are more accurate and precise with GeoIP than with traceroute mapper. I have included maps of common network patterns for fun.
|ISP||Instances|
|Core Back Bone||9|
|GTT Communications Inc.||44|
|Interoute Communications Limited||5|
|Level 3 Communications||26|
|Managed Network Systems||1|
|New York University||70|
|Private Layer INC||50|
|Tata Communications (america)||4|
|Total Server Solutions L.L.C.||46|
|Verizon Internet Services||20|
|VPS Datacenter, LLC||8|
Certain ISPs are attributable to specific domains. For instance, all the Microsoft Corporation instances came from DuckDuckGo traceroutes. All but two of the Google instances came from Google Drive traceroutes. Half of the New York University ISP instances came from NYU Classes traceroutes, though the other half came from traceroutes that I conducted in ITP’s facilities. The Verizon and Optimum ISPs are attributable to my apartment and home traceroutes, as these are the services I use.
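A tally like the one above, attributing ISPs to domains, could be sketched in Python as follows. This is only an illustration and not the pivot table analysis that was actually used: the sample rows are made up (with documentation-only IP addresses), and the column names `domain`, `ip`, and `isp` are assumptions rather than the real headers of the GeoIP batch lookup export.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. The column names and values are assumptions,
# not the actual GeoIP batch lookup export.
SAMPLE_CSV = """domain,ip,isp
duckduckgo.com,203.0.113.5,Microsoft Corporation
duckduckgo.com,203.0.113.6,Microsoft Corporation
drive.google.com,198.51.100.20,Google
classes.nyu.edu,192.0.2.44,New York University
reddit.com,192.0.2.45,New York University
"""


def tally_isps(csv_text):
    """Count hops per ISP, and per (ISP, domain) pair."""
    isp_counts = Counter()
    pair_counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        isp = row["isp"].strip()
        if not isp:
            # Skip hops for which the lookup returned no ISP.
            continue
        isp_counts[isp] += 1
        pair_counts[(isp, row["domain"].strip())] += 1
    return isp_counts, pair_counts


isp_counts, pair_counts = tally_isps(SAMPLE_CSV)
for isp, count in isp_counts.most_common():
    print(isp, count)
```

Counting `(ISP, domain)` pairs alongside the plain ISP totals is what makes it possible to say that a given ISP shows up only in traceroutes to one domain.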
RASCOM Telecommunications pattern
Telia Carrier pattern
I saw the triple asterisk (* * *) frequently while running traceroute, particularly when I was warned that the sites I was tracing had multiple addresses (Reddit and Google Drive). I have more research to do, but two possible reasons for these triple asterisks are that the probes timed out, or that some routers are configured to drop or rate-limit the ICMP messages that traceroute relies on, in order to protect against DDoS attacks.
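To see how often a route goes silent, the hops that returned only asterisks can be picked out of a saved log. The sketch below assumes the usual Unix-style traceroute text output, in which each hop line starts with the hop number. The sample log is hypothetical and uses documentation-only addresses.

```python
import re

# Hypothetical traceroute output; not taken from the actual logs.
SAMPLE_LOG = """traceroute to example.com (192.0.2.10), 64 hops max, 52 byte packets
 1  192.0.2.1 (192.0.2.1)  1.234 ms  1.100 ms  0.987 ms
 2  * * *
 3  198.51.100.7 (198.51.100.7)  12.5 ms * 13.1 ms
 4  * * *
"""

# A hop line is a hop number followed by the rest of the line.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")


def silent_hops(log_text):
    """Return the hop numbers for which every probe went unanswered."""
    hops = []
    for line in log_text.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header line or continuation line
        tokens = match.group(2).split()
        if tokens and all(token == "*" for token in tokens):
            hops.append(int(match.group(1)))
    return hops


print(silent_hops(SAMPLE_LOG))
```

A hop with only some asterisks (hop 3 in the sample) is not counted, since at least one probe was answered there.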
Mullvad is based in Sweden, so any traffic routed there was (from what I can tell) passing through their servers.
My European VPN consistently ran packets through Private Layer Inc. in Aargau, Switzerland, subsequently (occasionally) to a server in Moscow, and ultimately to a myriad of obfuscated American host servers belonging to Private Layer Inc. I know for a fact that this VPN is partially hosted in Oslo, though this information never came through in my GeoIP or traceroute IP results.
My American VPN ran through host servers in Roswell, Atlanta, and Kennesaw, Georgia, linked to Total Server Solutions L.L.C.
The Mystery of Panama City
I have yet to understand exactly why traceroute mapper picked up Panama City as a packet stop. This route was not revealed in my GeoIP data, and even when I looked at GeoIP’s confidence intervals for locations near Latin America, this insight did not seem plausible.
Individual Domain Insights
I have abstained from trying to analyze traceroute data based on my traceroute mapper maps, as my GeoIP data (though it may be incomplete) is more precise. Even so, this IP data is lacking in accuracy, because packets can take different routes to the same destination. Had I run more traceroutes on my domains, I would have had a more complete picture of where packets to these domains might go. Furthermore, had I changed the parameters of my traceroutes, such as the port that I probed and the amount of time that I waited for each hop to respond, I could have obtained more complete routes and insights.
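As a rough sketch of what changing those parameters might look like, the helper below assembles a traceroute command with a different destination port, per-probe wait time, probe count, and hop limit. The helper name `build_traceroute_command` is hypothetical. The flags `-p`, `-w`, `-q`, `-m`, and `-I` are the ones documented for the common BSD/macOS and Linux traceroute implementations, but their exact behavior (and whether `-I` needs elevated privileges) should be checked against the local man page.

```python
def build_traceroute_command(host, port=None, wait_seconds=None,
                             queries=None, max_hops=None, use_icmp=False):
    """Assemble a traceroute argument list without running it."""
    command = ["traceroute"]
    if use_icmp:
        command.append("-I")  # probe with ICMP echo instead of UDP
    if port is not None:
        command += ["-p", str(port)]  # destination port for the probes
    if wait_seconds is not None:
        command += ["-w", str(wait_seconds)]  # seconds to wait per probe
    if queries is not None:
        command += ["-q", str(queries)]  # probes sent per hop
    if max_hops is not None:
        command += ["-m", str(max_hops)]  # maximum TTL before giving up
    command.append(host)
    return command


print(" ".join(build_traceroute_command("reddit.com", port=443, wait_seconds=2)))
```

The resulting list can be passed to `subprocess.run(command, capture_output=True, text=True)` to capture the log for each parameter combination.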
Before I received my GeoIP results, it was difficult for me to glean insightful or remarkable data about Adult Swim. After GeoIP, I realized that my traceroutes timed out almost every time, after being passed through a series of anonymized Hurricane Electric hosting servers. In the instances where my traceroutes did not time out, I found the terminal server for Adult Swim, ASN-TBS-1, owned by Turner Broadcasting (Adult Swim’s parent company). I found it interesting that the ASN-TBS-1 server was listed as a residential server and not a business server; perhaps this is a cost-saving measure.
DuckDuckGo had the most ISPs of any of my traceroute targets. This is attributable to Microsoft’s host server transparency. Microsoft inserts itself into DuckDuckGo’s packet routes through host servers attributable to MSN and Azure. While my GeoIP data did not explicitly say which city Microsoft ran its traffic through, with traceroute mapper we can assume that either Azure or MSN is hosted in Seattle, though we cannot tell how accurate this information may be. I find Microsoft’s involvement in DuckDuckGo’s traffic ironic given DuckDuckGo’s emphasis on privacy.
Two things stuck out to me about Google Drive’s traffic. First, some of Google Drive’s packets ran through servers on Cherokee land, which apparently belong to Google. Second, I found out that Google has its own backbone, whose hosts show up under Google’s 1e100.net domain. The high concentration of California-based traffic is attributable to Google’s servers as well.
Much of PopVapor’s traffic ran through Level 3 business servers in Miami and Fort Lauderdale, Florida. PopVapor’s traffic was unique in that it ran through GTT Communications (residential) servers in Massachusetts, VPS Datacenter hosting servers in undisclosed locations, and a high-frequency exchange between various Comcast servers in New Jersey. This last pattern was part of the longest traceroute log that I recorded, and I am curious as to what incentive Comcast had to bounce this traffic around so much. It was not readily apparent from my traceroute map.
Interestingly, Pornhub has the most intricate traffic routing in the greater New York area and in Europe, passing through the most backbones and unique backbone companies. This may be because Pornhub is a foreign company that stores much of its data in compliance with Europe’s General Data Protection Regulation, which I understand to require that data generated in a country be stored on servers in that country. This would imply that videos on one Pornhub page could be stored in multiple different countries (this is an assumption).
If I were to give Reddit’s traceroute logs a superlative, it would be “most obfuscated”. The only transparent information that I received from Reddit was information attributable to my VPNs and home wireless networks. Otherwise, it appears as though all of my traceroutes timed out before they revealed anything interesting (to me). Presumably, Reddit blocks the port that traceroute probes by default.
For NYU Classes, I saw the involvement of Comcast with my American VPN. With my European VPN, I saw a unique configuration of German traffic routed through multiple backbone services, and I wonder if this has to do with how NYU’s proprietary VPN traffic is routed. Unsurprisingly, Classes’ traceroute logs had the highest instance of NYU Domain servers.
My Personal Website
My GeoIP results confirmed that GoDaddy has servers in Phoenix, Arizona, though they did not clarify the mapped DMV-area traffic. However, my GeoIP results revealed traffic run through Vermont-based servers attributed to the European backbone Telia Company, a unique configuration attributable to my European VPN traffic.
I think it would behoove me to run more traceroutes with different parameters, such as latency tolerance (the time to wait for each probe) and the ports probed. I would also like to figure out how to gain insight into packet transmission latency: different traceroutes took different amounts of time, and I would like to better understand these implications and patterns by looking at the round-trip times that traceroute reports for each hop. Lastly, I would really like to figure out what happened with my supposed Panama City traffic. Perhaps running my traceroute logs through another tool provided by Tom Igoe could help.
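One possible starting point for the latency question is to pull the round-trip times out of a saved log and average them per hop. The sketch below assumes the usual Unix-style traceroute text output, in which each probe's time is printed as a number followed by "ms". The sample log is hypothetical and uses documentation-only addresses.

```python
import re
from statistics import mean

# Hypothetical traceroute output; not taken from the actual logs.
SAMPLE_LOG = """traceroute to example.com (192.0.2.10), 64 hops max, 52 byte packets
 1  192.0.2.1 (192.0.2.1)  1.0 ms  2.0 ms  3.0 ms
 2  * * *
 3  198.51.100.7 (198.51.100.7)  10.0 ms * 20.0 ms
"""

HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
# A round-trip time is a number followed by a space and "ms".
RTT = re.compile(r"(\d+(?:\.\d+)?) ms\b")


def hop_latencies(log_text):
    """Map each answering hop number to its mean round-trip time in ms."""
    latencies = {}
    for line in log_text.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header line or continuation line
        times = [float(value) for value in RTT.findall(match.group(2))]
        if times:
            # Hops that only returned asterisks have no times and are skipped.
            latencies[int(match.group(1))] = mean(times)
    return latencies


print(hop_latencies(SAMPLE_LOG))
```

Comparing these per-hop averages across the VPN and non-VPN runs would show where along a route the extra time is being spent. A large jump between two consecutive hops often marks a long-distance link, though it can also just reflect a router that is slow to answer probes.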