Author: Les Cottrell. Created: November 20, 2000
There were 8 representatives: Fred - AT&T, Chuck - CNRI, Sharad - HP, Rick - Agilent, Axel - Ericsson, Jeff - Intel, Les - SLAC. Visual has dropped out since they bought up InverseNet. Telcordia could not make it. There is interest from Alcatel and Genuity (Steve Blumenthal). In discussions with Charles Brownstein, Sri Kumar said it would be good if the XIWT took a leadership role in mining/analyzing the PingER data to indicate what information is available. Charles has not yet had success in replacing Jeremy at CNRI.
Fred had sent out a proposal, A Measurement Infrastructure for IPEX: Collaboration for Measurement, Validation and Research Into Anomaly Detection. Fred, who was on a phone link, made a brief presentation. The primary statement is that they would like to go ahead with passive monitoring. There is believed to be considerable interest from large service providers, many of whom are already doing, or trying to do, this in practice. There is one public passive measurement deployment, from NLANR using OCxMon. There are a couple of private efforts, including one at SprintLabs. Most public efforts are based on active measurements such as AMP/Surveyor/RIPE/PingER. A big question is what we add to existing measurements; we could extend the deployment, but this is not a primary goal. Passive measurements look like a less well covered area. The addition to the NLANR effort would be to extend it beyond the academic, government, or public NAP sites that NLANR covers. A second goal is to extend the PingER deployment. A third, longer-term goal is that the IPEX measurement infrastructure will be designed to provide "field test" facilities to member organizations.
Homogeneity of monitoring probes is a very important goal.
He proposes to use Sun Netras; it is hoped/expected that Sun may donate a dozen machines. Their advantages are that they have remote power on/off and reboot, they are certified for and can be deployed in central offices, they are rack mountable, etc. The measurement host could be connected to a switch port which could have the traffic spanned to it. This could be either 100 Mbps or GE. An alternative would be to use the OCxMon solution with an optical splitter.
The NIMI infrastructure looks like a good place to start. A lot of thought has gone into robustness, scalability, dynamism, security. Tcpdump, CoralReef and also Neville Brownlee's NetraMet look like candidates, and could be adapted for our use.
Possible ways to analyze the data include characterization of applications, protocol behavior, combining active and passive measurements, looking at IPv6, understanding Internet traffic (e.g. multi-fractals), traffic matrices, and security & intrusion detection (using packet headers to characterize intrusion activities). Most analyses of Internet traffic so far have been from private traces, so it would be good to get analyses of public traces. One can use tcpdpriv to anonymize the data and ensure privacy.
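As a rough illustration of the anonymization step, the sketch below maps IP addresses to consistent pseudonyms with a keyed hash. The key and function names are hypothetical, and unlike tcpdpriv's prefix-preserving mode this simple scheme does not retain subnet structure:

```python
import hmac
import hashlib
import ipaddress

# Hypothetical key; in practice it would be generated per trace and kept secret.
KEY = b"per-trace-secret"

def anonymize_ip(addr: str) -> str:
    """Map an IPv4 address to a consistent pseudonym via a keyed hash.

    Note: this is NOT prefix-preserving like tcpdpriv's -A50 mode;
    addresses in the same subnet map to unrelated pseudonyms.
    """
    digest = hmac.new(KEY, addr.encode(), hashlib.sha256).digest()
    # Use 4 bytes of the digest as the anonymized address.
    return str(ipaddress.IPv4Address(int.from_bytes(digest[:4], "big")))

a = anonymize_ip("192.0.2.1")
assert a == anonymize_ip("192.0.2.1")  # same input, same pseudonym within a trace
assert a != "192.0.2.1"
```

Because the mapping is deterministic under a fixed key, flows can still be correlated within a trace while the real endpoints stay hidden.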
For deployment, hardware would need to be funded or donated. Network connectivity will depend on the appearance of the network at the site. Topology needs thought as to where to locate the probes to get the widest interesting coverage. Commercial traces would be very valuable; much of today's data does not cover the financial sector, dial-up, or "broadband" users.
There have been problems with span ports and reflecting data. Fred has had success with Alteon and Extreme switches but has seen performance issues with Cisco. We would need to do some lab testing first.
There may be an issue with spanning switches at the remote sites, so we may want a default switch. This may depend on the ISPs, such as AT&T, UUNET, BellSouth, and Genuity, who are involved in XIWT. We may need to show proof of concept by deploying the idea at a couple of sites and demonstrating the value. There will also need to be guidance on where to place the probes (e.g. outside the firewall), which probably means specifying the traffic types that one is interested in measuring, and the problems that might be solved with access to the data.
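For illustration, spanning traffic from an uplink port to the probe's port on a Cisco switch might look like the fragment below (interface names are hypothetical and the exact syntax varies by platform and software version):

```
! Illustrative Cisco IOS SPAN configuration, not from the meeting:
! mirror all traffic on Gi0/1 (both directions) to the probe on Gi0/2
monitor session 1 source interface GigabitEthernet0/1 both
monitor session 1 destination interface GigabitEthernet0/2
```

Lab testing would need to confirm that the span port keeps up at full line rate, since this is where the reported performance issues appear.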
The prevailing wisdom at the ISPs is to use GE interfaces.
A big question is what the long-term goals are and how we demonstrate success (what are the measures of success). We need a clear statement of what we add. We want to characterize "our" segment of the Internet, but may not need a very wide deployment since we would be deploying at exchange points and/or transport networks. On the other hand, it would also be good to get some measurements at the edges for special applications, such as those used by a financial organization.
There was a discussion of why the Netra solution was chosen. It was agreed that Fred would add a section on how Netras meet the requirements better than other solutions such as Intel/Solaris.
It was also agreed that the management/administration of the platforms would be done centrally via ssh access.
For deliverables, the main one is the datasets, which are defined by their format. Fred points out that they have a lot of analysis/reporting tools that may be of value and may become part of the base set. There may be large value in making these tools publicly available.
The next step is to complete the proposal and then shop it around to potential clients. This may require defining a set of profiles and also showing the benefits one gets from the measurements, i.e. what a site/organization would expect to get from the project. Since some/many ISPs do passive measurements already, is there a big benefit from doing it collaboratively? In addition, we may need to indicate the required level of involvement for a site to get value from the project. Is the involvement tiered, or is it all or nothing?
An alliance with NIMI would be good. NIMI is funded by DARPA until the end of 2001. NIMI only traces packets addressed from or to it, so it is not a general passive monitor. A probable concern at some organizations will be privacy/security issues with having a sniffer on the network tracing packets. In some cases anonymizing the data will help allay such concerns. VPN traffic can be separated out, but cannot be analyzed.
We will need to add something about the archiving requirements: the space expected to be needed, backup, reliability, and how users find information, i.e. cataloguing, discovery, etc. The amount of data saved can vary enormously depending on what is collected, at what granularity, and for how long. The amount of storage may be driven by budget/costs. Whether it is kept on tape vs. disk, and how long it remains available, also needs some thought. Fred has about 30 TBytes of data stored. We may follow the NLANR/CAIDA datacube model to make the information available.
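As a back-of-the-envelope check on how the storage volume scales with what is collected, the sketch below estimates daily storage for a header-only trace. All parameters (link rate, utilization, packet size, bytes kept per packet) are illustrative assumptions, not figures from the meeting:

```python
def header_trace_gb_per_day(link_mbps: float, utilization: float,
                            avg_pkt_bytes: int, hdr_bytes: int = 64) -> float:
    """Rough daily storage (GB) for a trace keeping hdr_bytes per packet."""
    bytes_per_sec = link_mbps * 1e6 / 8 * utilization   # traffic actually carried
    pkts_per_sec = bytes_per_sec / avg_pkt_bytes        # assumed mean packet size
    return pkts_per_sec * hdr_bytes * 86400 / 1e9       # seconds/day -> GB

# e.g. a 100 Mbps link at 30% utilization, 500-byte average packets,
# keeping 64 bytes of header per packet:
print(round(header_trace_gb_per_day(100, 0.3, 500), 1), "GB/day")
```

Even under these modest assumptions a single probe produces tens of GB per day, which is why granularity and retention period dominate the budget.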
There was a discussion of using GPS to get accurate time synchronization, i.e. more accurate than is available with NTP. There was concern about deploying GPS and the need to get aerials installed, etc. It was agreed that we do need to be able to synchronize the timing and should specify what the synchronization accuracy should/will be, assuming we start out with NTP only.
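For reference, NTP derives a client's clock offset and the round-trip delay from the four timestamps of one request/response exchange. A minimal sketch of that calculation (the example timestamps are made up, chosen so the client is 5 ms behind the server over a symmetric 20 ms round trip):

```python
def ntp_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """Clock offset per NTP's on-wire algorithm (RFC 5905):
    t1 = client send, t2 = server receive, t3 = server send, t4 = client receive."""
    return ((t2 - t1) + (t3 - t4)) / 2

def ntp_delay(t1: float, t2: float, t3: float, t4: float) -> float:
    """Round-trip network delay for the same exchange."""
    return (t4 - t1) - (t3 - t2)

# Hypothetical exchange: client clock 5 ms behind, 10 ms each way.
off = ntp_offset(0.000, 0.015, 0.016, 0.021)   # -> 0.005 s
rtt = ntp_delay(0.000, 0.015, 0.016, 0.021)    # -> 0.020 s
```

The achievable accuracy is bounded by asymmetry in the two path delays, which is why GPS (or another local reference) is needed when sub-millisecond timestamps matter.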
It was agreed that we should continue to collect the XIWT PingER data, and make it available.
So the next step is to flesh out the proposal, and then to shop it around to members to solicit interest. Possible participants would include: those running networks and/or transiting traffic (e.g. Verio/NTT, AT&T); a broadband/dial-up center (e.g. EarthLink, ENRON); a content site (e.g. WestGroup; Intel does web hosting); a corporate network (one would be of interest, more would be better so we can compare and validate, e.g. does the corporation do a lot of tunneling or streaming, how does a small company compare with a large one); a hosting site (e.g. Exodus); a wireless provider (e.g. BellSouth, AT&T or Ericsson customers; perhaps SBC or Verizon might be interested); an ecommerce site; and a research high-performance computing site (e.g. SLAC). Such companies have resources that may be added to the proposal. We feel we can be successful with a fairly small but diverse deployment. Then one would assess interest and get an idea of what people are interested in doing. Is there value to members who do not/cannot deploy the probes on the WAN, e.g. who deploy tools internally/privately on their LAN without sharing data? Getting participation (of people willing to install etc.) is a key ingredient to moving forward. Is there a minimal target number of participants? Then get together a phone conference to see who will really proceed, then go for an extended seminar, say in January, for people signing up, then go to DARPA to redirect the proposal. Fred has a lot of requests for access to his repository; this could be made available under the current proposal umbrella. It would be interesting to use this data to illustrate some applications, such as web streaming or traffic mixes at various places.
Another question, orthogonal to the benefits seen by participants, is what a member has to ante up in order to participate. For example: meeting/workshop participation, release of data, a dataset subscription to selected data available early on, locating a probe with access to the network, and providing ssh access for administration. What does it take to be a participant, and what distinguishes active vs. passive participants?
A timeline was proposed:
DARPA wants the data to drive models and to validate models. It may be necessary to send out an email to the DARPA modeling PI folks saying: this is the data we are hoping to capture and make available; is it useful to you, and in what form; does it need extending, either in location or in what is measured; and how should it be aggregated? IPv6 is another possible area to expand into, especially for the wireless folks. Wireless traces would be very interesting for the DARPA community. Another interest is deducing throughput from loss and delay measurements made passively and comparing the results with active measurements.
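One common way to estimate TCP throughput from passively observed loss rate and RTT is the Mathis et al. model, throughput ≈ MSS / (RTT · √p). A minimal sketch, with the model's constant factor assumed to be 1 and illustrative numbers:

```python
from math import sqrt

def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Approximate steady-state TCP throughput (bits/s) from the
    Mathis et al. model: throughput ~ MSS / (RTT * sqrt(p)).
    The model's constant C is taken as 1 here."""
    return mss_bytes * 8 / (rtt_s * sqrt(loss))

# e.g. 1460-byte MSS, 100 ms RTT, 1% packet loss:
print(round(mathis_throughput_bps(1460, 0.1, 0.01) / 1e6, 2), "Mbps")
```

Comparing such passively derived estimates against active transfer measurements at the same sites would be one concrete validation exercise for the DARPA modeling community.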