Author: Les Cottrell. Created: Jan 17 '02
There were about 175 attendees. There was wireless connectivity on the first day. The meeting was at the Reston Hyatt near Dulles airport. Richard Mount and I represented SLAC, Richard as PI for PPDG, and me for PingER and INCITE.
Send presentations to Mary Anne and Thomas.
NSF Middleware Initiative (NMI) for $10M. Bridge between network and applications. Goals: does it work, is it being used, and does it solve the middleware needs? First release in April. Next round of proposals due March 1st.
MAGIC coordinates middleware and grid activities: it bridges national and institutional efforts, works to get federal agencies to coordinate activities, and seeks international agreement.
The NSF grants are not available to DoE Labs.
Looking at special interest groups across the various problems being addressed. Came up with 4 topics. Short-term view: what to do to get started over the next 6 to 12 months. Want an engineering view of what panelists expect to produce in the next 6 months, and how they will interact with one another's projects. Panels represent producers and consumers. Some of the joint topics may result in special interest groups. Chairs to take note of early issues that emerge, the actions agreed upon (if any), and suggestions for ongoing dialogue. There were 4 panels, in concurrent pairs: Distributed Data and Meta Data (1) and Security and Identification (2); swap at 9:45am to the 2nd pair.
Terascale (TFlops, TBytes); want an operational grid for sharing data. Federated set of servers, with analysis and data servers. Using the Hierarchical Resource Manager (HRM) from Arie Shoshani of LBL for "heavy" lifting of data (heavy is TeraByte scale, 100s of TB by mid-decade, for a single operation). Metadata is a big issue; will not define it, but need to ensure ESG is interoperable with what is being developed in the community. Will rely heavily on GridFTP and RFT.
Mainly talking about metadata rather than heavy data lifting. It is lightweight, flexible middleware to support creation & use of metadata and annotation. Goals: improved completeness, accuracy & availability of the scientific record; reduced barriers to integration & evolution in portals/scientific computing environments (allow conversion between different metadata schemas); integration of records capabilities with primary data sources.
Challenges: managing storage resources in an unreliable environment; heterogeneity (MSS, HPSS, Castor, Enstore, various disk systems and attachments, system-attached disks, NAS, parallel ...); optimization issues (avoid extra file transfers, caches, multi-tier storage system organization). They are modifying GridFTP to use HRM in blocking mode.
Layered approach: at the bottom is data movement (optimized endpoint management, bulk parallel transfer, rate-limited transfer, disk/network scheduling); above this is a data transfer service (end-to-end file transfer with link optimization, performance guarantees, admission control); the top layer is collective data management (collection management, priority, fault recovery, replication, resource selection). Key ideas: co-reservation of resources, intelligent adaptive recovery. Goal is high quality APIs instantiated in software that is deliverable.
Want to create scalably sharable network storage. Key technology is the Internet Backplane Protocol (IBP), a storage-level equivalent of IP. Byte arrays are not files (weaker semantics): size- and duration-limited volatile allocations, best-effort reliability & availability, no directory structure or accounting, no caching or replication. A single file abstraction is put together from multiple distributed data resources.
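The byte-array semantics described here (size- and duration-limited volatile allocations, no directories, best-effort service) can be illustrated with a toy model. This is a sketch of the idea only, not the real IBP wire protocol or API; all class and method names below are invented for illustration.

```python
import time

# Toy model of IBP-style storage allocations (NOT the actual IBP protocol):
# byte arrays that are size-limited and duration-limited, with best-effort
# semantics -- an expired allocation simply disappears.

class ByteArrayDepot:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.allocs = {}          # handle -> (bytearray, expiry_time)
        self.next_handle = 0

    def allocate(self, size, duration_s, now=None):
        """Return a handle for a volatile byte array, or None if refused."""
        now = time.monotonic() if now is None else now
        self._expire(now)
        used = sum(len(b) for b, _ in self.allocs.values())
        if size > self.capacity - used:
            return None           # best effort: the depot may refuse
        h = self.next_handle
        self.next_handle += 1
        self.allocs[h] = (bytearray(size), now + duration_s)
        return h

    def read(self, handle, now=None):
        """Read the whole allocation; None if it expired or never existed."""
        now = time.monotonic() if now is None else now
        self._expire(now)
        entry = self.allocs.get(handle)
        return None if entry is None else bytes(entry[0])

    def _expire(self, now):
        # Duration-limited: allocations vanish when their lease runs out.
        self.allocs = {h: (b, t) for h, (b, t) in self.allocs.items()
                       if t > now}
```

Higher layers (such as the single-file abstraction spanning multiple depots) would be built on top of these weak primitives, which is the point of the "storage equivalent of IP" analogy.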
Showed how loss in slow start gives very slow (linear) ramp up in throughput. Showed fractal behavior of jitter in message transfers. As increase competing UDP load with TCP then TCP behavior becomes chaotic. Showed how with netlets can remove end-to-end jitter of TCP which should be useful for realtime applications.
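The slow-start observation can be made concrete with a toy congestion-window model (a sketch under textbook TCP Reno assumptions, not the speaker's actual simulation): a loss during slow start halves ssthresh and drops the connection into linear congestion avoidance, so filling a large pipe takes thousands of round trips instead of a handful.

```python
# Toy TCP Reno window model, counting round trips (cwnd in segments).
# Slow start doubles cwnd each RTT; after a loss, growth is linear
# (one segment per RTT) -- hence the very slow ramp up noted in the talk.

def rtts_to_reach(target, loss_at=None):
    """RTTs until cwnd >= target segments.
    loss_at: cwnd at which a single loss occurs (None = no loss)."""
    cwnd, ssthresh, rtts = 1, float("inf"), 0
    while cwnd < target:
        if loss_at is not None and cwnd >= loss_at:
            # Loss: halve ssthresh and cwnd (recovery modeled as instant).
            ssthresh, cwnd, loss_at = cwnd // 2, max(1, cwnd // 2), None
            continue
        cwnd = cwnd * 2 if cwnd < ssthresh else cwnd + 1
        rtts += 1
    return rtts

# Example: a 1 Gbit/s, 100 ms RTT path needs roughly 8600 1460-byte
# segments in flight to stay full.
print(rtts_to_reach(8600))               # no loss: 14 RTTs of doubling
print(rtts_to_reach(8600, loss_at=32))   # early loss: ~8600 RTTs, linear
```

The bandwidth and RTT figures are illustrative, but the two-orders-of-magnitude gap between the loss-free and lossy cases is the structural point.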
Commodity high-performance distributed computing relies on the Internet. Will develop inference & analysis tools. Want fast, dynamic inference of the available bandwidth and the location of bottlenecks along a path. Internal network measurements are not available, so want an end-to-end model. Will develop lightweight chirp and fatboy path probing, both active and passive. Want to do tomography to infer what is going on in the network cloud. Create a new generation of bandwidth protocols. One question is what is the probability the new protocols will be deployed.
Two tools: pathrate (capacity estimation) and pathload (available bandwidth estimation). There have been many attempts, starting with pathchar; the early ones did not work at current link speeds. Uses variable-length packet streams, pairs and packet trains. Will develop a better GUI.
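For reference, the packet-pair principle behind capacity estimators like pathrate can be sketched in a few lines. The real tool statistically filters many pair and train measurements (its dispersion histograms are multi-modal); this shows only the core arithmetic.

```python
# Packet-pair dispersion sketch: two back-to-back packets of size L leave
# the narrow link spaced delta = L / C apart, so the bottleneck capacity
# can be estimated as C = L / delta from the receiver-side spacing.

def capacity_from_dispersion(pkt_bytes, dispersion_s):
    """Estimate bottleneck capacity in bits/s from packet-pair dispersion."""
    return pkt_bytes * 8 / dispersion_s

# 1500-byte packets arriving 120 microseconds apart imply ~100 Mbit/s.
print(capacity_from_dispersion(1500, 120e-6))   # ~1e8 bits/s
```

In practice cross traffic can compress or stretch the pair spacing, which is why pathrate needs many samples and mode detection rather than this single-shot formula.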
Goal is to develop a network-aware operating system. Develop/deploy network probes/sensors, develop a network metrics database, develop transport protocol optimizations, develop a network-tuning daemon. Will develop a network tools analysis framework. Auto-tuning gets close to hand tuning. Concerned about the overall impact of active probing.
Infrastructure for passive monitoring. Want to look into the interior of the network while minimizing impact on it. Uses a fiber splitter. Based on libpcap for packet capture and bro (used for hacker signature capture). Can only monitor own traffic. Want to put a monitoring box close to each router in ESnet. Activation is sent by UDP to all monitors along the path. Focus is on capture tools, not on analysis. Have a prototype setup at LBL and NERSC. The monitor host system is installed and maintained by the net administrator.
Re-examine protocol-stack issues and interactions in the context of cluster & grid computing. Adaptive flow-control gives dynamic right-sizing. Not TCP-unfriendly. Improved throughput by a factor of 7-8x at SC2001 from Denver to LANL. Applies to bulk-data transfers where the bandwidth-delay product fluctuates. Have a kernel mod; will develop an application-layer/user-space version. An alpha for Linux 2.4 kernel space is available. Have a packet spacing algorithm. Wu does not have an RFC.
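The scale of the buffer problem that dynamic right-sizing addresses follows from bandwidth-delay product arithmetic. A sketch (the path figures below are illustrative, not from the talk):

```python
# Bandwidth-delay product: the TCP window (and socket buffers) must hold
# at least bandwidth * RTT of data to keep a long fat pipe full. Static
# defaults are sized for LANs and starve wide-area bulk transfers.

def bdp_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to fill the pipe."""
    return int(bandwidth_bps * rtt_s / 8)

# OC-12-class path (622 Mbit/s) coast to coast (~70 ms RTT): the needed
# window is megabytes, far above a typical 64 KB default of that era.
print(bdp_bytes(622e6, 0.070))   # ~5.4 MB
```

Dynamic right-sizing continuously re-estimates this product and grows the window to match, which is why it pays off exactly when bandwidth or delay fluctuates.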
Interactive QoS across heterogeneous hardware and software. http://www.cc.gatech.edu/systems/projects/IQEcho
Motivation: there is no terascale test network, so develop a simulator. SSFnet is a portable simulator written in Java; it has a domain modeling language (DML), network elements, and protocols. Renesys SSFnet is shared memory, proprietary, not 64-bit clean, and has no scheduler. Will add POS and MPLS (NIST is doing ATM), the Web100 MIB, JavaNPI and namespace extensions, add hinting to DML, and build examples of ESnet and Internet2.
Web100 went into an IETF RFC at the last meeting. TCP will need to evolve (maybe via eXperimental TCP implementations (XTP)), e.g. new startup algorithms, and addressing new technologies such as lambda switching. N.b. the Internet only works due to the commons sharing concept and fairness. It is a major activity to develop and deploy high-quality TCP into standard operating systems; it can take years. There can be problems with research not evolving into an operational infrastructure (the "throw it over the wall" concept). Thomas asked how to maintain communication among ongoing projects. A bigger problem is the tie back to the middleware community. Middleware folks want to know what one can do with the monitoring tools, so we need to identify deliverables from network research to middleware. Need to continue the dialogue between applications and networking. The objective is to advance Science.
Mailing lists will be set up: measurement & analysis focus group; transport protocols focus group; interacting with applications communities focus group.
Need a common CP (Certificate Policy) and CPS. Trust management is at the resource end. A certificate is like a DMV license or passport: it gives reasonable identification that someone is who they say they are; it does not say what they are entitled to (e.g. whether they can pay for something).
They are looking for production systems, with long-term support, for software to be put in the hands of users; need heavy lifting (it may take days/weeks to move data) and face large heterogeneity in OSs, protocols, applications, and mass storage systems. Metadata description is a challenge. Error propagation is a problem: how is one told that something did not work, what does one do about it, how does one tell the user, and how is the error passed up the hierarchy?
I met with Rolf Riedi of Rice to discuss INCITE, how to proceed with automated chirp measurements, and arranging visits for a student to SLAC and Jiri Navratil to Rice. It appears the best time for Jiri to go to Rice will be the end of March (after March 25th, when Rolf returns from vacation). The student is Hong Kong Chinese so I will work on seeing what is needed to prepare for her visit. She would like to come as soon as possible since this quarter she has a light load. We agreed we need a C/perl analysis program that can be called from an application. This will be led by Rice, since they understand the analysis needed. Rolf does not feel this is very hard. This analysis code will be used to reduce the data so it is easy to report on (e.g. in a time series graph) or to compare with something else. Rolf will assist with coming up with reasonable parameters with which to call chirp. Typical optimum chirp sizes (# of packets sent in a chirp) are in the range 6-10. We should also save the results from one chirp run to use as input parameters to the next chirp run. SLAC will keep the raw chirp data for up to a month (about 30MBytes), and make it available (e.g. via FTP or HTTP) for Rice to pick up and keep a permanent copy. Some handshaking will be needed so SLAC will know Rice has got the data and it can be deleted. Rolf encouraged SLAC to make contact with Vinny while he is in the Bay Area (working as an intern for Sprint).
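A chirp train's geometry can be sketched as follows. A chirp is a short packet train with exponentially shrinking inter-packet gaps, so a single train sweeps a range of probing rates; available bandwidth is read off where the one-way delays start rising. Parameter names and default values below are illustrative (chosen to match the 6-10 packet sizes noted above), not INCITE's actual interface.

```python
# Sketch of a chirp probe schedule: packet k+1 follows packet k after a
# gap of first_gap / gamma**k, so successive gaps probe geometrically
# increasing instantaneous rates with only n_packets packets on the wire.

def chirp_send_times(n_packets=8, first_gap_s=1e-3, gamma=1.5):
    """Send-time offsets (seconds) for the packets of one chirp."""
    times, t = [0.0], 0.0
    for k in range(n_packets - 1):
        t += first_gap_s / gamma ** k
        times.append(t)
    return times

def probe_rates(pkt_bytes, first_gap_s=1e-3, gamma=1.5, n_packets=8):
    """Instantaneous rate (bits/s) probed by each successive gap."""
    return [pkt_bytes * 8 * gamma ** k / first_gap_s
            for k in range(n_packets - 1)]

# An 8-packet chirp of 1500-byte packets with these illustrative settings
# sweeps rates from ~12 Mbit/s up past 130 Mbit/s in under 3 ms.
print(chirp_send_times())
print([r / 1e6 for r in probe_rates(1500)])
```

Saving one run's results to parameterize the next, as agreed above, would amount to adjusting first_gap_s and gamma so the swept rate range brackets the previous available-bandwidth estimate.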
I met with Brian Tierney of LBL and Micah Beck of UTK to arrange visits to SLAC later this month. I had a long discussion with Constantinos Dovrolis of U Delaware about pathrate (for capacity measurements) and pathload (for available bandwidth measurements). I resolved some questions on pathrate, and we discussed how it should be used for automated long term measurements. We will get an early beta release of pathload. I had shorter discussions with kc claffy of CAIDA and Matt Mathis of PSC. I worked with Guojun Jin of LBL to make progress in getting him an account at SLAC for assistance with Pipechar testing. Thomas Dunigan of ORNL and I talked about Web100, in particular porting webd to SLAC and setting up an appropriate host at SLAC with GE access to run it on. I had brief separate discussions with Thomas Ndousse, George Seweryniak and Mary Anne Scott, all of DoE, concerning funding. I talked to Jim Leighton about the need to get higher bandwidth to Renater. Jim and George Seweryniak are trying to get funding to upgrade the ESnet backbone, which is getting close to saturation as more sites get OC12 connections (the backbone at best is currently 2*OC12).
SciDAC wants to put together a monthly newsletter. SciDAC will have a booth at SC2002.