Held at SLAC October 10, 2002
Rough Notes by Les Cottrell
Attendees: Brian Tierney, Martin Stoufer, Dan Gurney, Jin Guojun (LBL), Les Cottrell, Connie Logg, Jiri Navratil, Warren Matthews, plus Gary Buhrmaster, Paola Grosso & Jerrod Williams for the SCNP part of the meeting (SLAC).
See Self-Configuring Network Monitor Project, and Self-Configuring Network Monitor Project: an Infrastructure for Passive Network Monitoring.
The client sends a request packet to a remote host on port 5050. SCNPs along the route see this activation packet and turn on sniffing for packets between the client and the remote host. Intermediate SCNP boxes do not have accounts. The monitoring has a maximum duration of 8 minutes. The request packet indicates which port (e.g. ssh port 22) to sniff traffic for. The end host does nothing except send an ICMP port unreachable packet back. In the long run they will add PKI so that monitoring can be done by a 3rd party. The sniffed headers are sent back to the source. The box is extremely useful for tracking problems down. Brian showed an example of identifying packet loss within ESnet between ORNL & LBL.
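The activation step above can be sketched roughly as follows. This is only an illustration: the notes do not give the packet format, so the JSON payload and the function names are hypothetical, and UDP is assumed because the end host answers with an ICMP port unreachable. Only port 5050 and the 8 minute limit come from the notes.

```python
import json
import socket

ACTIVATION_PORT = 5050     # port the request packet is sent to (from the notes)
MAX_DURATION_S = 8 * 60    # maximum monitoring duration (from the notes)


def build_activation_request(sniff_port, duration_s):
    """Build a HYPOTHETICAL activation payload; the real format is not in the notes."""
    if not 0 < sniff_port < 65536:
        raise ValueError("sniff_port must be a valid TCP/UDP port number")
    # Never ask for more than the 8 minute maximum.
    duration = min(duration_s, MAX_DURATION_S)
    body = {"sniff_port": sniff_port, "duration_s": duration}
    return json.dumps(body, sort_keys=True).encode("ascii")


def send_activation_request(remote_host, sniff_port, duration_s):
    """Send the request toward the remote host (UDP is an assumption).

    Monitors on the path would see the packet; the end host itself only
    returns an ICMP port unreachable.
    """
    payload = build_activation_request(sniff_port, duration_s)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (remote_host, ACTIVATION_PORT))
    return payload
```

For example, build_activation_request(22, 3600) asks for ssh traffic and has its duration clipped to 480 seconds.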
LBL has a modified version of tcptrace that can display multiple traces as they pass through the SCNPs.
They have SCNPs at LBNL, NERSC, & ORNL, and are looking to add boxes at SLAC, PSC, NIKHEF, & CERN. The boxes run FreeBSD, for which SLAC has no official expertise/support, and there are no current plans or progress on porting the SCNP to Linux. So there is an issue of how to support the box, e.g. applying new patches. We will take as an action item to see whether someone at SLAC can maintain the SCNP. If not, a possibility is for Jin to support it as he supports the other SCNP hosts at LBL. It could be placed outside the firewall. They have made modifications to FreeBSD to improve the GE drivers etc. The system is stripped down and, to first order, runs only ssh and the SCNP application.
The SCNPs have two GE interfaces and one 100 Mbits/s Ethernet interface. They use a passive multi-mode tap. SLAC may have a bonded GE connection for its border in a year, in which case changes will be needed in the SCNP.
Web services right now provide network performance data for resource brokers (PingER, IEPM-BW). They provide predictions and reasonable TCP parameters (windows, streams). There is a bbcp-wrapper that goes to the web service to get hard-coded window and stream settings and then runs bbcp.
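As a rough illustration of what "reasonable TCP parameters" means, the usual starting point is the bandwidth-delay product split across the parallel streams. The sketch below is not the web service's method (the notes say its settings are hard-coded); the function name and the example link figures are assumptions.

```python
import math


def suggest_tcp_settings(bandwidth_bps, rtt_ms, streams):
    """Return (total_window_bytes, per_stream_window_bytes).

    The total window is the bandwidth-delay product: bits/s times
    round-trip time, converted to bytes. Each of the parallel streams
    then needs roughly an equal share of it.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    total = math.ceil(bandwidth_bps * rtt_ms / 8000)  # /1000 for ms, /8 for bits
    per_stream = math.ceil(total / streams)
    return total, per_stream


# Illustrative numbers only: a 1 Gbit/s path with an 80 ms round-trip time.
total, per_stream = suggest_tcp_settings(1_000_000_000, 80, 8)
print(total, per_stream)
```

With these example figures the path needs about 10 MB in flight, or 1.25 MB per stream with 8 streams.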
Web services are planned to replace the existing Globus toolkit based on OGSA. Has been talking to BaBar folks, but they are focusing on LDAP.
There was discussion on the naming conventions (e.g. do we say msec, milliseconds, or ms).
Uses "document based" SOAP for subscribe request / subscribe reply and query request / query reply. It is a library, not really a working system. For subscribe, data transport is separate (they use netlogger). There is not much code (mainly built into Python). The GMA idea is to use web services to request the data (e.g. what data, target, how often etc.). Then the data comes over a separate channel. He showed examples for ping. The DAMED GGF group is coming up with naming conventions. CIM has a definition, but it is very large & complex. Globus OGSA is also in this space.
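A minimal sketch of what a document-style subscribe request might carry (what data, which target, how often) is given below. The element names and helper functions are hypothetical and are not taken from the library discussed; a real request would also be wrapped in a SOAP envelope, and the data itself would arrive over a separate channel.

```python
import xml.etree.ElementTree as ET


def build_subscribe_request(metric, target, interval_s):
    """Serialize a HYPOTHETICAL subscribe request document."""
    root = ET.Element("subscribeRequest")
    ET.SubElement(root, "metric").text = metric            # what data
    ET.SubElement(root, "target").text = target            # measured host
    ET.SubElement(root, "intervalSeconds").text = str(interval_s)  # how often
    return ET.tostring(root, encoding="unicode")


def parse_subscribe_request(document):
    """Parse the document back into a plain dict, as a producer might."""
    root = ET.fromstring(document)
    if root.tag != "subscribeRequest":
        raise ValueError("not a subscribe request")
    return {
        "metric": root.findtext("metric"),
        "target": root.findtext("target"),
        "interval_s": int(root.findtext("intervalSeconds")),
    }


doc = build_subscribe_request("ping", "host.example.org", 60)
print(parse_subscribe_request(doc))
```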
Web100 now has the ability to read and set the AI and MD constants, and can also implement the Floyd limited slow start and fast TCP options. The limited slow start does not make a big difference to bulk throughput; the fast TCP appears to be a bit disappointing.
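For reference, the limited slow start rule (RFC 3742) can be sketched as below. This is a simplified version in units of segments rather than bytes, with the RFC's suggested max_ssthresh of 100 segments as the default; it is not the Web100 code.

```python
def slow_start_increment(cwnd, max_ssthresh=100):
    """Per-ACK congestion window increase, in segments, during slow start.

    Up to max_ssthresh this is ordinary slow start (one segment per ACK).
    Above it, the increase is divided by K so the window grows by at most
    max_ssthresh / 2 segments per round-trip time.
    """
    if cwnd <= max_ssthresh:
        return 1.0
    k = int(cwnd / (0.5 * max_ssthresh))
    return 1.0 / k


for cwnd in (50, 150, 400):
    print(cwnd, slow_start_increment(cwnd))
```

The rule applies only while cwnd is below ssthresh; after that, normal congestion avoidance takes over.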
Brian sees big differences between UDP and TCP throughputs, factors of 10 in some cases. This is especially so if one blasts UDP to push back TCP and then runs a 2nd UDP before the TCP streams have recovered. HSTCP appears to help by 10-20% in some cases, but needs a cwnd of 38 segments before the Floyd improvement kicks in.
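The 38 segment threshold comes from the HighSpeed TCP response function (RFC 3649), where the AI and MD constants become functions a(w) and b(w) of the window. The sketch below uses the RFC's default parameters and is meant only to show the shape of those functions, not any particular kernel or Web100 implementation.

```python
import math

# Default parameters from RFC 3649.
LOW_WINDOW = 38       # below this, standard TCP behaviour
HIGH_WINDOW = 83000
LOW_P = 1e-3          # loss rate corresponding to LOW_WINDOW
HIGH_P = 1e-7         # loss rate corresponding to HIGH_WINDOW
HIGH_DECREASE = 0.1   # decrease factor b at HIGH_WINDOW


def hstcp_params(cwnd):
    """Return (a, b): segments added per RTT and fraction removed on loss."""
    if cwnd <= LOW_WINDOW:
        return 1.0, 0.5  # standard TCP: AI = 1, MD = 0.5
    # Position of cwnd between the two windows on a log scale (0..1).
    frac = (math.log(cwnd) - math.log(LOW_WINDOW)) / (
        math.log(HIGH_WINDOW) - math.log(LOW_WINDOW)
    )
    b = (HIGH_DECREASE - 0.5) * frac + 0.5
    p = math.exp(frac * (math.log(HIGH_P) - math.log(LOW_P)) + math.log(LOW_P))
    a = cwnd ** 2 * p * 2.0 * b / (2.0 - b)
    return a, b


for w in (38, 1000, 83000):
    a, b = hstcp_params(w)
    print(w, round(a, 2), round(b, 3))
```

Below 38 segments the function returns the standard values, which is why short or lossy transfers see no benefit.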
It would be interesting to try UDP and multi-streams on iepm-bw.