What is TULIP
TULIP is a web application being developed by the MAGGIE-NS team from the National University of Sciences and Technology (NUST) School of Electrical Engineering and Computer Sciences (SEECS) (formerly known as NIIT) and the Stanford Linear Accelerator Center (SLAC) Internet End-to-end Performance Monitoring (IEPM) project. TULIP's purpose is to geolocate a specified target host (identified by IP name or address) using ping RTT delay measurements to the target from reference landmark hosts whose positions are well known. Knowing the speed of light in fibre or copper (roughly 0.6*c; we take 1 ms of RTT as equivalent to 100 km), the minimum RTT of 5 pings (see Ref 1 for the error distance versus the number of probes) from each landmark site gives a rough estimate of the fibre + copper cable distance from the landmark to the target host. Lateration is applied to these distance estimates to estimate the position of the specified host on the globe. We are focusing on a platform-agnostic, open, non-proprietary tool (cf. Traceware from Digital Island or EdgeScape from Akamai) that can be used to evaluate the effectiveness of this technique for hosts outside the U.S. and Europe, specifically in less well developed countries. There is a map and a table of the TULIP landmarks.
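The RTT-to-distance rule of thumb above can be sketched in a few lines. This is only an illustration, not TULIP's actual code: the function name and the use of `None` for a lost ping are assumptions, and the 100 km per ms constant is the rough conversion quoted above.

```python
# Rule of thumb from the text: 1 ms of ping RTT is roughly 100 km of cable.
KM_PER_MS_RTT = 100.0


def estimate_distance_km(rtts_ms):
    """Rough landmark-to-target distance from a handful of ping RTTs (in ms).

    Lost pings are represented as None (an assumption of this sketch).
    The minimum RTT is used because it is the sample least affected by queuing.
    """
    valid = [r for r in rtts_ms if r is not None]
    if not valid:
        return None  # the target did not answer this landmark at all
    return min(valid) * KM_PER_MS_RTT


# Five pings from one hypothetical landmark, one of them lost.
samples = [23.4, 21.9, None, 22.3, 25.0]
print(estimate_distance_km(samples))
```

The result is an upper bound in spirit: indirect routes and router delays make the cable distance, and hence the estimate, longer than the true geographical distance.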
Lateration is the calculation of position information based on distance measurements. Calculating an object's position in two dimensions requires distance measurements from 3 non-collinear points (hence Trilateration). Multilateration computes the position of an object by measuring its distance from multiple reference positions. We use Multilateration following "Wireless Positioning Technologies and Applications" by Alan Bensky (2008). The multilateration algorithm was designed for Wireless Sensor Networks; we use it with a little tweaking of parameters such as Time of Arrival and distance, which were originally based on wireless sensor location.
Also see the Problem of Apollonius and Descartes' Theorem for tangent circles.
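As an illustration of lateration, the sketch below estimates a position from three or more landmark distances. It is a simplification under stated assumptions, not the algorithm TULIP uses: it works on a flat plane with x/y coordinates in km rather than on the globe, and it linearizes the circle equations by subtracting the first one from the others, then solves the resulting system by least squares. The function name is hypothetical.

```python
import math


def multilaterate(landmarks, distances):
    """Least-squares position estimate on a flat plane.

    landmarks: list of (x, y) positions in km (at least 3, not collinear)
    distances: estimated distance in km from each landmark to the target
    """
    (x0, y0), d0 = landmarks[0], distances[0]
    rows = []
    for (xi, yi), di in zip(landmarks[1:], distances[1:]):
        # Subtracting circle 0 from circle i removes the x^2 and y^2 terms:
        # 2(xi-x0)x + 2(yi-y0)y = d0^2 - di^2 + xi^2 - x0^2 + yi^2 - y0^2
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = d0**2 - di**2 + xi**2 - x0**2 + yi**2 - y0**2
        rows.append((a, b, c))

    # Normal equations of the 2-unknown least-squares problem.
    saa = sum(a * a for a, b, c in rows)
    sab = sum(a * b for a, b, c in rows)
    sbb = sum(b * b for a, b, c in rows)
    sac = sum(a * c for a, b, c in rows)
    sbc = sum(b * c for a, b, c in rows)
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("landmarks are collinear or too few")
    x = (sac * sbb - sab * sbc) / det
    y = (saa * sbc - sab * sac) / det
    return x, y


# Four hypothetical landmarks and exact distances to a target at (3, 4).
marks = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
dists = [math.dist(m, (3.0, 4.0)) for m in marks]
print(multilaterate(marks, dists))
```

With real RTT-derived distances the circles rarely meet in a point, which is why a least-squares fit over more than three landmarks is useful.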
- If one knows where a host is located then one can choose what content to send to the host, for example what language to use, what local services to recommend etc. Typically this does not demand accurate geographical locating, often determining the state or country is enough.
- It can be useful for security to pin-point the location of a suspicious host (assuming it has not blocked pings).
- It can be used to help determine from where to get a replicated service.
- Applications that try to draw maps of host locations, such as Visual Traceroute, require accurate locations of routers.
- By determining the geographical path data travels on, one can analyze the efficiency of a network. For example, by determining the route used between countries in Africa, and even within countries in Africa, one can see that traffic frequently goes via Europe or North America, vastly increasing the RTTs and using more expensive transcontinental links.
- It can be used to supplement or verify the information in databases such as Whois, DNS, Geo IP Tools and PingER.
- The pings from multiple landmarks can help identify hosts that have proxies. For example many web servers in developing countries have proxy web servers in N. America or Europe.
- Hosts can move. For example host names belonging to companies that are acquired can move to new locations, or host names that are associated with a show (e.g. SuperComputing) that moves to new locations can also move. Then there are hosts (e.g. laptops, PDAs) that are inherently mobile.
- Projects such as Zooknic Internet Intelligence study the geography of the Internet industry, providing maps of the Internet domains in the world and their relations to economic growth.
- TULIP is being used in Phantom OS as a location estimation service for making self configuring sub-grids. Phantom OS is a Grid OS being developed at SEECS (formerly NIIT), Pakistan in collaboration with UWE, UK.
- TULIP can also be used to make ping requests from multiple landmarks to see whether the target is accessible by ping from multiple sites. If it is accessible from many landmarks but not all, then:
  - It is possible that the landmark is not working. To test this try other targets and see if they all fail from a given landmark.
  - The target may be blocking ping access from some landmarks.
  - There may be some network problems (e.g. routing) between the landmark and target. Reviewing the ping output may assist in determining whether the target host's name is known to the landmark.
- TULIP can show up anomalies, e.g. a host masquerading as another host. In this case multiple hosts may show up with inconsistent min-RTTs. For example we have seen a case of a mail server registered in Iran (Geo-IP Tools and traceroutes showed it in Tehran), yet the IP address had min-RTTs from US and Canadian landmarks of less than a few tens of ms.
- TULIP can help identify if a host is connected via a satellite link. In this case minimum RTT from most or all landmarks will be >~ 400 msec.
- TULIP can help identify hosts that are replicated. For example root name servers (e.g. 18.104.22.168) or servers (e.g. gfx1.hotmail.com, yahoo.com) with identical names or IP addresses that show up in many different regions. In this case the replicated host as seen from multiple landmarks will have impossibly short minimum RTTs that are less than the RTTs between the landmarks.
- By using the landmarks in the various regions, TULIP can be used to identify which regions have large RTTs to which regions. This may be used to identify where it would be advantageous to place a server in a region to reduce the RTTs and hence improve service. It may also be used to verify Service Level Agreements that involve RTTs.
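Two of the checks in the list above (satellite links and replicated hosts) lend themselves to simple heuristics. The sketch below is an assumption-laden illustration rather than TULIP's code: the function names, the 80% threshold for "most landmarks", and the pairwise test (the two landmark-to-target RTTs summing to less than the landmark-to-landmark RTT) are all choices made here for clarity.

```python
from itertools import combinations

SATELLITE_RTT_MS = 400.0  # threshold quoted in the text (>~ 400 msec)


def looks_like_satellite(min_rtts_ms, fraction=0.8):
    """True if most landmarks see a minimum RTT above ~400 ms.

    min_rtts_ms: dict mapping landmark name -> minimum RTT to the target (ms)
    fraction: share of landmarks that must exceed the threshold (assumed 0.8)
    """
    if not min_rtts_ms:
        return False
    slow = sum(1 for rtt in min_rtts_ms.values() if rtt > SATELLITE_RTT_MS)
    return slow / len(min_rtts_ms) >= fraction


def looks_replicated(min_rtts_ms, inter_landmark_rtt_ms):
    """True if some pair of landmarks is impossibly close to the target.

    inter_landmark_rtt_ms: dict mapping frozenset({a, b}) -> min RTT between
    landmarks a and b. A single host cannot be nearer to both landmarks than
    the landmarks are to each other (assuming roughly symmetric routes).
    """
    for a, b in combinations(sorted(min_rtts_ms), 2):
        between = inter_landmark_rtt_ms.get(frozenset((a, b)))
        if between is not None and min_rtts_ms[a] + min_rtts_ms[b] < between:
            return True
    return False


# Hypothetical measurements: a target that answers within 5 ms on two continents.
rtts = {"landmark-us": 5.0, "landmark-eu": 4.0}
between = {frozenset(("landmark-us", "landmark-eu")): 150.0}
print(looks_replicated(rtts, between), looks_like_satellite(rtts))
```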
Ways to Locate
There is no tie between Internet architecture and geography. For example, unlike the phone system, where phone numbers provide countries, areas and exchanges in areas, the Internet IP address is not designed to provide any location information. In fact, it needs to be understood that the methods used to derive the location of Internet hosts were not originally designed for this. As mentioned previously, however, it can be important to know the location of a host. It is also very useful to have multiple ways to find the location of a host, both since all methods have their problems, and also to look for agreements or discrepancies. The paper Distributed Traceroute Approach to Geographically Locating IP Devices investigates and evaluates existing (2003) methods and solutions. Basically there are three major ways of locating a host:
- By using databases such as whois, DNS, or location specific databases.
- By using traceroutes and extracting locations from router names.
- By using ping Round Trip Times.
Databases may give the location of a host, in particular a router, as the location of its owner rather than of the host itself. Two examples for hosts are www.seecs.edu.pk and www.hec.edu.pk, which one would expect to be located in Islamabad since that is where ns.seecs.edu.pk is located, but which are actually in Texas. Examples for routers include address 22.214.171.124, which GeoIP Tools locates in Kansas although it has < 5ms minimum RTT from Liverpool and < 1ms (< 100km) from Rutherford Lab near Oxford in the UK. The router is in ASN 3356, which is Level 3 Communications, based in the U.S. GeoIP Tools also locates mia2-fiu-1-us.mia.seabone.net in Europe (presumably since it is owned by a European ISP, Seabone); however, it is located in the US.
When trying to map the topology of a traceroute onto a map of the world, such errors cause problems. For example, a traceroute from Brazil to Costa Rica apparently (according to GeoIP Tools) goes to Florida, then to Italy, back to Florida and then to Costa Rica. Actually it goes to Florida and then to Costa Rica. Hostip.info gets it right. Another example is from Brazil to a Venezuelan (according to Geo IP Tools) node. Again the route according to Geo IP Tools goes via Italy, and again Hostip.info gets it right. Geo IP Tools also shows the end host www.unerg.edu.ve in Venezuela while Hostip.info says it is in the US. Other tests (RTT, Octant) make us believe it is in the US, possibly Dallas (Octant) or Florida (TULIP).
We are starting to compare Hostip with locations from the PingER database of known hosts, and to see how well TULIP does for various regions (see for example TULIP estimates from Europe to PingER hosts).
- Domain Name Services (DNS) may also help in locating a host. The DNS LOC (location) resource record is designed to make this data available. In addition the names of routers often contain their location (e.g. city), so a traceroute may help identify what a host is near. Tools that do this include VisualRoute, NeoTrace and GTrace. See reference 1 for a comparison, for the U.S., of the DNS method with ping RTTs and a cluster technique.
- Autonomous Systems (AS): Given an IP or host name you can use Fixed Orbits to find the relevant AS. Then using a table of AS number to name you can find out more about the AS (e.g. contacts, HQ site etc.). Another source for finding ASes is Team Cymru's whois database.
- Whois databases: Examples of sites that provide information from such sources include IP2location (max 20 requests per day unless you sign up), Maxmind (which has a free downloadable GeoLite database and also an IPv6 database), DNSstuff Geolocation, and Hostip.info. Unfortunately the information is often missing, inaccurate or stale. Also, a large block of geographically dispersed IP addresses may be assigned to a single entity, and the Whois database may contain a single entry for all of them.
- Also see NetGeo from CAIDA, which though no longer maintained has many useful links. It first consults a database of previously located hosts; if this fails it uses DNS, then a traceroute, with a WHOIS database lookup as a last resort. It is now a commercial product from NetGeo Inc.
- Geo IP Tool (also see the explanation) and IP-address.com display the location of a selected host/address using Google maps. Geo IP Tool uses a database and probably has the best overall coverage and accuracy. However, it often fails for routers. GeoBytes requires one to provide the IP address (not the name) which is a slight inconvenience. It provides lat/long as well as City, Country, population, currency etc.
- GeoTool: this seems a promising new entry; it uses Maxmind's database (see above).
- Networldmap determines geographical information by acquiring location information from willing participants.
- Quova has a large (2.4 billion addresses) database of IP addresses to locations that they can provide access to for organizations.
- Traceroute: Typically such methods use regular expressions to deduce the location of a router (e.g. a router with the name 500.Serial3-11.GW8.BOS1.ALTER.NET is using the Boston, US airport code (BOS) and is in the city of Boston, Massachusetts).
- Round Trip Times: methods typically use the minimum RTT from several landmarks to the target host to triangulate the position of the target host.
- TULIP uses a Trilateration algorithm.
- Similar tools to TULIP are the Constraint Based Geolocation of Hosts (see reference 2) and Octant from Cornell. Both of these focused on the technique, only worked in the U.S. and are no longer providing a service.
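The traceroute approach in the list above, extracting a location from a router name, can be illustrated with a small regular expression. This is a minimal sketch, not the method of any tool named here: the lookup table is a hypothetical two-entry stand-in for a real airport-code database, and the rule "a DNS label of three letters optionally followed by digits" is an assumption that will not fit every ISP's naming scheme.

```python
import re

# Hypothetical, deliberately tiny table of airport codes to cities.
AIRPORT_CITIES = {
    "BOS": "Boston, US",
    "MIA": "Miami, US",
}

# A label such as "BOS1": three letters, then optional digits.
CODE_LABEL = re.compile(r"([A-Z]{3})\d*")


def guess_city(router_name):
    """Return a city guessed from an airport code in the router name, or None."""
    for label in router_name.upper().split("."):
        match = CODE_LABEL.fullmatch(label)
        if match and match.group(1) in AIRPORT_CITIES:
            return AIRPORT_CITIES[match.group(1)]
    return None


print(guess_city("500.Serial3-11.GW8.BOS1.ALTER.NET"))
```

A label that matches the pattern but is not in the table (for example "NET") is simply ignored, so the quality of the guess depends entirely on the table.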
If you want to find the great circle distance and know the latitude and longitude coordinates of the two ends, then you can use the Movable Type Scripts web page. World Gazetteer provides access to data with lat/longs, cities, countries & populations (download data). If you want to calculate it for yourself then see Deriving the Haversine Formula. You can also make a name server lookup for a host, or if you don't know the exact name try DomainSurfer. There is also an Atlas of Cyberspace that provides maps and graphic representations of the geographies of the new electronic territories of the Internet, the World-Wide Web and other emerging Cyberspaces, and the Corpex sponsored Cyber Geography Research.
If you need to find the latitude and longitude of a place whose location you can find on a map, then try the Latitude & Longitude finder or Latitude & Longitude finder 2. If you need to find the location of a known latitude and longitude then use Google Maps, Latitude, Longitude Popup.
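For readers who want to compute the great circle distance themselves, the Haversine formula mentioned above can be written as follows. This is a generic sketch: the function name is an assumption, and the mean Earth radius of 6371 km treats the Earth as a sphere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; the Earth is treated as a sphere


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great circle distance in km between two lat/long points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A quarter of the way round the equator.
print(great_circle_km(0.0, 0.0, 0.0, 90.0))
```

Comparing this distance with the RTT-derived estimate for a known host gives a feel for how indirect the network route is.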
Versions of TULIP
We have developed two versions of TULIP. The first (TULIP1) was developed to understand the feasibility. The second (TULIP2) evolved from ideas and experiences encountered with TULIP1.
TULIP1 is based on Java, and Java Web Start must be installed on your system. The applet does all the work: it gets the request for the target from the user, sends the requests to ping the target to the landmarks, gets the results back, makes the analysis and displays the results in graphical form. It requires a configuration file that provides the name and location of each landmark and the URLs for the pings and traceroutes. At SLAC this is kept at /afs/slac/www/comp/net/wan-mon/tulip/Sites.txt.
The user's browser accesses a form to make a request to locate a target. This is sent to tulip-viz.cgi at www-wanmon.slac.stanford.edu. It uses a Google visualization package to display a sortable table of the landmarks and their RTTs to the target. When it has gathered all the RTTs from the responding landmarks, it uses Google Maps to display a map of the RTT circles used in the lateration and various location estimates for the target. The requests to the landmarks to ping the target are made by reflector.cgi running on www-wanmon.slac.stanford.edu. Having a central reflector enables more control over the requests and their impact, as well as keeping logs. It also enables us to use a single cookie to access PlanetLab landmarks.
For hosts in the world at large it is important to have landmarks that enable the host to sit within a triangle of landmarks (see references 4 and 5). Thus we are very interested in getting more landmarks that cover the world. We also need the latitude and longitude (lat/long) of each landmark. Most Internet hosts are located in developed countries of North America, Europe and East Asia. Thus we need landmarks to cover these areas. Many countries in the developing world do not have direct access to other nearby countries (see for example the Case Studies on S. Asia, Africa and Palestine) but go via Europe or the US. Thus the route is very indirect and extended, so distances estimated by RTT will be too long. Thus we also need landmarks in such developing countries, in particular those with a large Internet presence, e.g. countries with > 1 million connected Internet users (see Internet World Statistics). We have three main sources of landmarks:
- One possible source of ping servers is sites such as Public Route Servers and Looking Glass Sites, Advanced Internet Routing Resources, and Cisco.net.
- A second source is PlanetLab. As of June 2007 there are about 370 PlanetLab hosts, with about 50% in the U.S., 25% in Europe, 5% in Japan etc. In many cases there is more than one host at a site with the same lat/long, so we use only a subset of the hosts. Among the PlanetLab sites there are about 18 covering China, 2 in Puerto Rico, 6 in Brazil and 2 in Uruguay. To utilize them one sends a script to be executed. We have such a script and will integrate it. We have a map of the PlanetLab sites.
- We get landmarks installed at interesting places. The requirement on such a landmark is to install a reverse traceroute/ping server (see Traceroute Servers for HENP & ESnet) on a web server at the landmark site. Instructions for downloading and installing the traceroute server are available at http://www.slac.stanford.edu/comp/net/wan-mon/traceroute-srv.html#code. After it is installed please let us know so we can add the landmark to the TULIP configuration file (see the bottom of this page for our email addresses).
- It is possible one might be able to use the WatchMouse ping servers, if one could find their locations (lat/long).
Currently (May 2007) we have landmarks in about 30 countries (see map), so we have a way to go.
The SLAC traceroute/landmark server that is frequently used by landmark servers: rejects attempts to traceroute to a broadcast address; does not allow a remote host name to be greater than 255 characters, to prevent buffer overflow attempts; does not allow a remote host in a different domain to do a traceroute to a host within the same domain as the web server; limits the maximum number of traceroute processes running in the server, to reduce the chance of a denial of service; starts the traceroute after 3 hops if the client/browser and server are in different domains, in order to hide internal routing information from outsiders; has a blacklist of sites that are blocked.
The use of a central reflector to manage all the requests enables us to provide a single IP address that landmarks can enable access from, while disabling requests from other hosts.
A major concern is that the target is pinged simultaneously from multiple landmarks. This can look like a scan of multiple hosts when the target host responds to the ping requests. It can also look like a denial of service attack, especially for hosts with limited available bandwidth, such as are found in developing countries. We thus limit the number of pings from a landmark to a target to 5. We are also looking at tiering the landmarks. The top tier will enable us to locate the region of the world, and then the second tier can be used to find the location in that region. This reduces the number of landmarks used and divides them in time into two or more sets. We are thus studying how to tier the N. American and European hosts.
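The tiering idea can be sketched as a two-step selection. Everything in this example is hypothetical (the function name, the landmark names, the region labels and the data layout), and choosing the region of the top-tier landmark with the smallest minimum RTT is an assumption about how the region might be picked, since the tiering design was still under study.

```python
def pick_second_tier(tier1_min_rtts_ms, tier1_region, tier2_by_region):
    """Choose the second-tier landmarks to query after a top-tier pass.

    tier1_min_rtts_ms: dict top-tier landmark -> minimum RTT to the target (ms)
    tier1_region: dict top-tier landmark -> region label
    tier2_by_region: dict region label -> list of second-tier landmarks
    """
    # Assumption: the target is in the region of the closest top-tier landmark.
    closest = min(tier1_min_rtts_ms, key=tier1_min_rtts_ms.get)
    region = tier1_region[closest]
    return region, tier2_by_region.get(region, [])


# Hypothetical top-tier results for one target.
tier1 = {"top-na": 180.0, "top-eu": 35.0, "top-asia": 240.0}
regions = {"top-na": "N. America", "top-eu": "Europe", "top-asia": "E. Asia"}
tier2 = {"Europe": ["eu-1", "eu-2", "eu-3"], "N. America": ["na-1", "na-2"]}
print(pick_second_tier(tier1, regions, tier2))
```

Only the second-tier landmarks of one region then ping the target, which spreads the probes over time and keeps the total number of pings down.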
TULIP only allows one copy of the client to be running on a client host. TULIP also hides the URLs used for the landmarks to reduce the possibility of people gleaning the URLs for a denial of service attack. Editing the landmark URLs requires a password known only to the developers.
We have also considered whether the knowledge that a machine, and possibly its usual owner, can be accurately located may raise privacy issues. This may require us to add some fuzz to results. So far this has not been done.
There is a centralized log of < 100 Mbytes, with time-stamped records of all requests, the requesting host (client), the target, landmark, result (RTT, loss), errors etc. The log is truncated to the last 20% of the records when it reaches its maximum size. This is analyzed for response time of landmarks, abusing clients, and types of failures.
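The truncation rule for the log can be illustrated as follows. This is a sketch of the policy only, not the reflector's code: the function name, the in-memory list of records and the one-byte newline per record are assumptions.

```python
def truncate_log(records, max_bytes, keep_percent=20):
    """Keep only the newest records once the log reaches its size limit.

    records: list of log lines, oldest first
    max_bytes: size limit of the log
    keep_percent: share of the newest records kept after truncation
    """
    # Size as it would be on disk, assuming UTF-8 and one newline per record.
    size = sum(len(r.encode("utf-8")) + 1 for r in records)
    if size < max_bytes:
        return records
    keep = max(1, len(records) * keep_percent // 100)
    return records[-keep:]


# 100 hypothetical five-byte records against a 400-byte limit.
log = ["r%03d" % i for i in range(100)]
print(len(truncate_log(log, 400)))
```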
Problems with TULIP
- If the target host is connected via a geostationary satellite (minimum RTTs > 400ms) then the triangulation will not be accurate.
- Some IP names and addresses actually refer to multiple hosts in very different locations. Examples are some of the Internet's root name servers and yahoo.com. Such hosts will appear to be close to multiple locations, so TULIP will be confused.
- Sometimes name service can be very slow, causing the pings to time out. In such a case you can try giving it the IP address instead of the name.
- If TULIP complains that it cannot load since another copy is already running, then delete the file C:\Program Files\Mozilla Firefox\pres
- Some landmarks appear to be unable to resolve IP names and so you will need to provide the IP address rather than the name if these landmarks are to provide information.
- A single IP name may resolve to multiple IP addresses. In this case it is probably best if the user is asked to choose which IP address they want.
- Sometimes routes are very indirect. This can add greatly to the RTT and hence give bad distance estimates. Examples include:
  - Between India and Pakistan the routes go via the US or Canada.
  - To get to E. Asia from Europe the undersea fibre goes around Spain, through the Mediterranean, the Red Sea, around India, then past Singapore and around the East coast of Asia. This is much more indirect than a great circle route which crosses the Asia land mass. On the other hand the route from Europe to Australia is more direct via mainland US and Hawaii.
  - The geographical distance from ICTP, Trieste in Italy to Ljubljana, the capital of Slovenia, is ~ 60 miles. However, the Internet route goes via Milan and Vienna and is a factor of ten longer.
Development and More Information
Development is at the School of Electrical Engineering and Computer Sciences (SEECS, formerly known as NIIT), National University of Sciences and Technology (NUST), Pakistan and at SLAC, by Qasim Bilal Lone (SEECS and SLAC), Shahryar Khan (SEECS and SLAC) and Les Cottrell (SLAC). More information may be found at:
We gratefully acknowledge the cooperation of the landmark sites, in particular PlanetLab, and those installing the SLAC reverse traceroute/ping server in developing regions (not Australia, N. America, Europe, Japan) of the world where there are fewer landmarks. These include:
- South Africa: TENET (Cape Town)
- Democratic Republic of the Congo (Kinshasa)
- Burkina Faso (Ouagadougou)
- E. Asia
  - China: IHEP (Beijing)
  - Hong Kong: UST (Kowloon)
  - Korea: KHU (Suwon)
  - Singapore: NOC (Singapore)
  - Taiwan: TWAREN and NCHC (Taipei)
  - Thailand: UNINET (Bangkok)
- Latin America
  - Bolivia: University Mayor de San Simon (La Paz)
  - Brazil: RNP (Brasilia), SPRACE and UNESP (Sao Paulo), UERJ (Rio de Janeiro)
  - Mexico: CUDI (Juarez)
- Middle East
  - Israel: ILAN (Tel Aviv)
  - Palestine: AQU (Jerusalem), IUGAZA (Gaza City)
- Russia: BINP (Novosibirsk), ITEP, KIAE (Moscow)
- S. Asia
  - India: CDAC (Mumbai and Pune), VSNL (Mumbai)
  - Pakistan: SEECS, NUST (formerly NIIT) Islamabad, Micronet, NCP and PERN (Islamabad)
  - Sri Lanka: LERN (Colombo)
For documentation of work on TULIP see the TULIP Wiki.
- "An Investigation of Geographic Mapping Techniques for Internet Hosts", V. N. Padmanabhan and L. Subramanian.
- "Constraint-Based Geolocation of Internet Hosts", B. Gueye, M. Crovella, A. Ziviani, S. Fdida (2004).
- "Constraint-Based Geolocation of Internet Hosts", B. Gueye, M. Crovella, A. Ziviani, S. Fdida (December 2006).
- "An Empirical Evaluation of Landmark Placement on Internet Coordinate Schemes", Sridhar Srinivasan, Ellen Zegura.
- "Geometric Exploration of the Landmark Selection Problem", Liying Tang and Mark Crovella.
- Geolocation Software.
- Chapter 7, "Time of Arrival and Time Difference of Arrival", from Wireless Positioning Technologies and Applications, by Alan Bensky.
- "Solving Passive Multilateration Equations using Bancroft's Algorithm", M. Geyer, A. Daskalakis.
- "Probabilistic Model of Triangulation", Xiaoyun Li, David K. Hunter.