ESSC Meeting at PNNL, Richland, Washington, Sep 23-25, 1998.

Rough notes by Les Cottrell

Contents

Washington Update - George Seweryniak
DOEBN/ENsim video from Washington
Enterprise Network Infrastructure Strategic Information Management (ENsim)
Pacific Northwest GigaPop - Steve Corbato
ESnet End-to-end Performance Monitoring - Les Cottrell
Advanced Applications update & discussion
QWEST Technical Update - Wes Kaplow, QWEST Communications
Cisco Technical Update - Andy Bechtolsheim
Multisession bridge update - Gary Roediger
SSP Networking Requirements
ESnet Review Summary - Larry Price
ESnet Project Update - Jim Leighton
ATM Test Tool - Bill Wing
New ESCC Chairman
International Issues - Chris Witzig
ESnet Site Coordinating Committee - Roy Whitney

Washington Update - George Seweryniak

The ESnet review gave ESnet an outstanding grade. The actions needed are: network monitoring needs to continue; international bandwidth needs to improve; and university connectivity needs to improve. The program plan has been a big seller. ESnet did very well in 1998. There is nothing in NGI for 1999; they will try again in 2000. A long range implementation plan is needed.

There is a new network being put in place for DOE business services. It is called DOEBN. It will affect all sites that have DOE employees on site. Additional security planning will be needed. The MICS budget in 1999 took a $12M hit. The ESnet budget for FY99 is currently about $13M; they are hoping for another $1.5M. At the lower figure the impact will fall on staffing at LBNL.

Network requirements will outstrip the funding for bigger pipes, especially international pipes. We need to look at closer work with other agencies and Internet 2, especially for university connections.

Tom Rowlett has joined MICS. Originally his goal was to work on NGI, but he has now been refocused to work on PKI. He wants to help bring the research community into helping the business side, and is also looking at differentiated services (policy issues of the department).

DOEBN/ENsim video from Washington

Brief overview of the corporate business network. The proposal is being reviewed by the DOENC.

It is a "closed" network, with "adequate" response, to serve as DOE Inranet, to be centrally organized and managed. It needs increased protection of proprietary or sensitive data. Will upgrade the existing frame relay ~ 40 sites DOEBN to make it corporate network using ATM/frame relay. There are over 100 DOE locations of which it is unclear how many have DOE employees. It will implement DES encryption and will have a NMIC (Network Management Information Center) at HQ. They are working on capacity planing. Want to run voice & video, enhanced encryption/securty, and will need network disaster planning. There has been a discussion with Price, Whitney and Leighton to look at how it interworks with ESnet, in particular for ESnet sites that only have a couple of DOE employees with a need to get tocorportae applications over the corporate network and may be using dialup today. They may be able to use encrypted VPN over ESnet. More information may be available at http://cio.doe.gov/sim/eni/blt/

Enterprise Network Infrastructure Strategic Information Management (ENsim)

Its purpose is to meet legislative mandates using best business practices. They will document the current infrastructure. It will provide high-level strategic direction for DOE and produce a business plan for 3-5 years out.

The scope is to connect all DOE sites. DOE networks include ESnet, DOEBN (DOENET), video teleconferencing, an old IBM simulation network, an emergency network (ECN). They want to rationalize the use of FTS circuits (currently there are 53 pages of FTS circuits). They want to minimize the overlap yet interface the networks.

There has been a lot of activity, with 2 workshops (the 2nd on 9/28) and a 3rd on 11/17. They are running surveys and have several teams. A business case will then be written, reviewed, and sent to the CIO. See http://cio.doe.gov/sim/eni/blt/ for current information.

Pacific Northwest GigaPop - Steve Corbato

The evolution is from NSFnet to I2/NGI. This includes regional networks => GigaPops, one backbone => multiple high performance and commodity connections, AUPs => COUs, a challenge to the bandwidth myth (Qwest, L3), DS1/DS3 => OC-{12,48}; regional aggregation still makes sense.

The University of Washington has had an IP-only campus backbone since 1989. They connect to NWNet (now Verio-Northwest). They have been involved in the Washington K-20 network and partnered with the Seattle city fiber project. These enabled them to get fiber within the city and in the state and freed them from the RBOC tyranny of bandwidth-dependent tariffs. They have also worked with a Seattle Northwest NAP (SNNAP). The P/NW GigaPop got an NSF HPC award in 12/1996. They are just now bringing up the vBNS connections. They run an NOC which is highly regarded. They are an Internet 2 / UCAID partner, and have members on several committees.

The GigaPop uses IP as the common bearer service. They are using frame-based L2 technologies. These include Gigabit Ethernet and SONET, and they will pursue DWDM later. This is the frame analog of the cell-based MREN (Chicago). There is a Gigapoint for backbone connectivity with a redundant high-availability switching core; they intend to have multiple I2/NGI and commodity NSP links. They recognize they have a geographically challenged backhaul (covering Alaska, Idaho etc.)

Currently they have 4 Catalyst 5000 core switches located at 2 sites in Seattle. One is a "neutral carrier hotel (Westin) in downtown Seattle" in which they have 3000 sq-ft of space. They have both a primary and a secondary network. The switches are interconnected with Gigabit Ethernet. They have lit fiber interconnectivity at OC-48 => OC192 using SONET. They have major NSP/IXCs including Worldcom/UUnet, Sprint, Qwest (Abilene), Starcom, but not MCI (C&W) or L3. They also have several ISPs including Verio, InterNAP, IXA. There is leased space at UW. The Westin is the obvious choice for the Gigapoint in the region. They have colocation space available. They are looking at how to bring in ESnet via an ATM switch connected to the Sprint backbone, using a PVC for ESnet.

They have connectivity to vBNS in place and will connect to Abilene. They are very interested in connecting to the federal mission networks (ESnet, NREN), the DARPA supernets (HSCC (Qwest)), and NTON2 (GST to be carrier), which is being worked on with Bill Lennon, as well as to commodity NSPs such as UUnet.

The participation includes the Puget Sound area including UW (4 fiber miles to the Westin), Microsoft (one site has a server farm with 5000 machines and an OC48 connection; in addition there is the Microsoft campus), NOAA, and Boeing. For the Puget Sound they have plans for a local fiber testbed. East of the Cascades is more difficult. There are PNL (OC3), WSU (DS3), UI (DS3), MSU (DS3). Oregon is trickier still. UO wanted to be a GigaPoP. Both UO and OSU have HPC awards, but there is no agreement as to where the GigaPoP should be located; Corvallis, Eugene and Portland are all contenders. Alaska is even more of a challenge. A fiber ship has just arrived in Alaska, so they should have lit fibers soon.

Interest in Western Canada includes CA*Net3, BCNet, UBC, UVic, Simon Fraser U, TRIUMF, UAlberta and UCalgary. Seattle to Vancouver is ~240km. There is fiber between Seattle and Vancouver. They are looking to provide connectivity between the Seattle GigaPoP and CA*Net. The air gap may be as little as 150 ft. They are looking at OC3 over SONET.

The NSF cooperative agreement with MCI expires April 2000. vBNS is an IP-over-ATM-over-SONET service on the MCI commercial ATM cloud (Hyperstream) with OC12 connections and OC48 in the backbone in some places. The Seattle GigaPoP will be connected up to vBNS this week. It has 2 connections: one to the SF Bay Area, the other to Denver.

The Abilene backbone is a UCAID partnership with Qwest, Nortel and Cisco. It is a complementary I2 GigaPop interconnect. It will be IP-over-PPP-over-SONET. There will be OC48 => 2 wavelengths => 2 dark fibers (2 years); Qwest will backhaul for GigaPops. Abilene will provide transit to other high performance networks such as ESnet, but it will not be a general-purpose transit network. It is a testbed for the I2 Class of Service model (which they hope to deploy next year). Qwest has fiber under construction between Salt Lake City and Portland/Seattle which will provide a redundant path (the other path runs up the west coast from San Francisco).

ESnet End-to-end Performance Monitoring - Les Cottrell

My presentation can be found at http://www.slac.stanford.edu/grp/scs/net/talk/essc-sep98/

Advanced Applications update & discussion

The path forward is to: get some high-level driving applications; identify cross-cutting technologies; identify components of the cross-cutting technologies; and identify who is going to push forward those components.

The application areas include: remote experimental operations, distributed parallel computing, remote/shared code development, remote & distributed data access, collaborative engineering, visualization, teleconferencing and videoconferencing.

A tentative proposal would be to provide services aimed at collaborations of 2-20 people. For example this includes an Mbone video service a la CERN/Caltech; an electronic notebook; a collaborative environment like PictureTalk/audio bridge; a shared code development environment; and a collaborative engineering environment.

Components include non-backbone connectivity, international access, university access, increased bandwidth [A], network & application security [A] [B], differentiated services (brokers) [B], multicast support [A] [B], resource location services (directory services) [A] [B], performance monitoring & diagnostics [A], and higher level protocol support (testbed support) [B]. In the above, [A] means we know how to do it, [B] means it needs to be investigated, maybe with a pilot.

Richard Stevens said that in NSF the Partnership for Advanced Computational Infrastructure (PACI, the follow-on to the Supercomputing Centers program), which consists of 90 sites, has agreed to eliminate the use of clear-text passwords on the network by the end of this year. Maybe in ESnet we need a proposal that within 9 months we would eliminate the use of plain-text passwords on the network, within 18 months we would have a PKI testbed in place, and within 36 months we would have it deployed.

QWEST Technical Update - Wes Kaplow, QWEST Communications

This talk covered trends in long haul networks. The old AT&T monopoly, until it was broken by MCI in the 80's, did not foster competition. Technology has also improved with fiber optics, electro-optical components (erbium-doped optical amplifiers, WDM), and multi-gigahertz electronics. The large geographical area has also meant it takes a long time to deploy the infrastructure, but now a lot of it is in place or well on the way, with very aggressive time schedules. Qwest is laying fiber conduit at 4-5 miles per hour.

Until recently (< 1988) one multiplexed a T-carrier signal over 30 miles, then electro-optically detected and regenerated it, and there was only one channel of operation. The maximum speed was just over 1 Gbps. To upgrade, one had to change every component in the path. Today's systems allow multiple (8-16 in the short term) wavelengths on top of 10Gbps (instead of 1 Gbps), with an optical amplifier every 60 miles and a regenerative unit every 3-4 optical amplifiers. This reduces the incremental cost of upgrading (mainly by adding extra wavelengths). Regeneration is still a major cost: it requires detecting the signal, cleaning it up, and retransmitting it. The next generation will be able to go 1600km with nothing electrical (an optical repeater every 80-140km); at each 1600km one needs regeneration.
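
A rough back-of-envelope sketch of what those spacings imply (the route length below is a hypothetical coast-to-coast example, not a figure from the talk):

    # Span arithmetic for a next-generation long-haul DWDM route, using the
    # spacings quoted above: optical repeaters every 80-140 km, electrical
    # regeneration every ~1600 km. The 4500 km route is a hypothetical example.
    import math

    route_km = 4500           # hypothetical coast-to-coast route
    amp_spacing_km = 100      # optical repeater every 80-140 km; take ~100 km
    regen_spacing_km = 1600   # all-optical reach before regeneration

    amplifiers = math.ceil(route_km / amp_spacing_km) - 1      # ~44 repeaters
    regenerators = math.ceil(route_km / regen_spacing_km) - 1  # 2 regen sites

    # Contrast with the pre-1988 systems above, which regenerated electrically
    # every ~30 miles (~48 km): the same route would need ~93 regeneration sites.
    print(amplifiers, "optical repeaters,", regenerators, "regeneration sites")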

The best customer price for a cross-country OC192 is currently ~$20M/year.

The difficulty of getting to terabit networks is that this capacity is >> current US capacity (160Gbps). In addition, customer applications are still in the Mbps range. The SONET infrastructure is designed for high availability and efficient multiplexing. Until recently there was no dark fiber available.

There are testbeds (e.g. MIT/LL) going to 12*10Gbps WDM over 1600km, eliminating or dramatically reducing the amount of much more expensive electro-optic regeneration. In addition there are new optical components such as all-optical cross-connects (e.g. DARPA's successful MONET program) which also provide high reliability.

One can reduce costs by taking bundled discounts and reducing redundancy. In the future, higher bandwidths will come with higher levels of multiplexing.

The new non-traditional network providers using the current best-in-class technology have significant advantages. A new pricing structure is in the making. Expect Moore's law to continue. Expect to see new leading-edge technologies enable revolutionary new services at revolutionary prices.

Cisco Technical Update - Andy Bechtolsheim

In the LAN the bandwidth gains were: 1981 shared 10 Mbps Ethernet, 1994 switched 10 Mbit (10x), 1998 switched 100 Mbit (100x), 2000 switched Gbps (1000x). LAN growth is driven by large centralized servers, major aggregation over backbones, and by uniprocessor performance doubling every 18 months.

Internet growth at 8x per year is faster than the LAN growth. Total available fiber capacity in the country in 1997 was 10Tbps (not necessarily lit; in-use fiber was almost 2 orders of magnitude lower). Say there are 266M Americans, each with T1 access: this requires ~407 Tbps. Multicast could be a major fraction of the Internet in 2000. Top-end routers: GSR 10-40Mpps (98-99), GSR+ 100+Mpps (99-01), GSR++ 200+Mpps (2000-02), TSR 1000+Mpps (2002). 10 Gbps Ethernet by year 2000 over 10-20km, or up to 100km with a long-reach laser. They see 10Gbps Ethernet as a cheaper alternative to OC192 SONET. In the LAN, 10Gbps is seen as the way to link together the backbone switches, with 1 Gbps to the edge switches. Expect 50% price erosion/year for a fixed bandwidth.
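
The arithmetic behind the ~407 Tbps figure is a one-liner; a minimal sanity check (the exact result depends on whether the T1 line rate or payload rate is used):

    # Sanity check of the aggregate-demand figure quoted above.
    population = 266e6     # Americans, as quoted
    t1_bps = 1.544e6       # T1 line rate in bits/s (payload is 1.536 Mbps)

    aggregate_tbps = population * t1_bps / 1e12
    print("Aggregate demand: %.0f Tbps" % aggregate_tbps)   # ~411 Tbps

    # Compare with the quoted 1997 installed fiber capacity of ~10 Tbps:
    print("Ratio to installed fiber: %.0fx" % (aggregate_tbps / 10))  # ~41x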

Multisession bridge update - Gary Roediger

SSP Networking Requirements

There are three major applications: global systems (read: long range weather prediction) - $40M, combustion (read: diesel engine simulation) - $40M, and basic science - $20M. Supporting this is computing science and enabling technologies at $45M, and under that platforms and network infrastructure at $100M. The hoped-for funding is given above but is expected to be lower.

The challenge is to provide affordable 50+ TFLOPS by 2005. To do this may take 1000's of processors, which will be location independent, so networking is critical. In addition they expect SSP to generate multi-terabytes per run, with multiple runs/year.

Estimates for 2004 are: 100Gbps - 1 Tbps for tier 1 sites (large sources & sinks of data); 30Gbps-100Gbps for sites with a large number of users (tier 2); and 1Gbps for standard universities (tier 4). The idea is that universities get handled by Internet 2, and ESnet focuses on the tier 1 and 2 sites.

Rick showed the components that give rise to these numbers. They consider the bytes/session and then the number of sessions. Major contributors (sessions, Gbps/session) are remote databases (20-100, 0.1-1Gbps), collaboration (30-50, 0.1-1Gbps), remote desktop visualization (20-100, 0.1-1Gbps), and data exploration (5-20, 0.5-5Gbps). The above plus some smaller apps gives 0.01-0.66 Tbps. If one adds remote I/O (2-3, 30-600Gbps) & computations (1-3, 100-1000Gbps), then those add 0.06-1.8Tbps and 0.1-3Tbps respectively. The total (0.17-5.46Tbps) is a big stretch to do affordably.
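
A minimal sketch reproducing the estimate, multiplying session counts by per-session rates and summing the ranges given above (the four smaller classes listed give ~0.01-0.35 Tbps on their own; the remaining smaller apps take that high end to the quoted 0.66 Tbps):

    # Reconstruction of the 2004 bandwidth estimate from the figures above:
    # each entry is ((min_sessions, max_sessions), (min_Gbps, max_Gbps)).
    apps = {
        "remote databases":             ((20, 100), (0.1, 1.0)),
        "collaboration":                ((30, 50),  (0.1, 1.0)),
        "remote desktop visualization": ((20, 100), (0.1, 1.0)),
        "data exploration":             ((5, 20),   (0.5, 5.0)),
        "remote I/O":                   ((2, 3),    (30, 600)),
        "computations":                 ((1, 3),    (100, 1000)),
    }

    lo = sum(s[0] * r[0] for s, r in apps.values()) / 1000  # Gbps -> Tbps
    hi = sum(s[1] * r[1] for s, r in apps.values()) / 1000
    # ~0.17 - ~5.15 Tbps; the quoted 5.46 includes further smaller apps.
    print("Aggregate: %.2f - %.2f Tbps" % (lo, hi))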

ESnet Review Summary - Larry Price

The following are the major points made in the review with ESSC responses in italics.

The committee reviewed the ESnet project as outstanding. Agree.

They recommended that network monitoring activities be expanded, with more support in the ESnet core staff. Note how we close the loop and the present success. Independent evaluation is good.

They recommended a rolling strategic implementation plan with a horizon of 5 years, updated semi-annually. This will be a new task for ESnet (a few-page plan). Annual or semi-annual updates to be determined - as need arises.

We need to ensure ESnet strategies are adapting to technologies and programmatic needs. They want to designate a high-level networking integration staff person to work with the ESnet networking group and with ER planning groups (for example the current SSI group). This staff person could also liaise with DOE network researchers if the right talent can be attracted. ESnet needs a multi-year (e.g. 3 year) strategy for testing new and emerging technologies. ESnet plans to do this.

Provide a strategy for testing new and emerging technologies. Agree; there are resource issues, and the technologies are in the 3 year plan.

ESnet should engineer connections to a selected set of GigaPOPs. Universities are a continuing priority for us. We agree with the strategy of connecting to GigaPOPs and work is under way.

Regarding international connectivity: the complexity and high cost of international connectivity and the associated sociology require a strategy. The present course must be changed by active collaboration. Agree that this is a priority. We need to come up with a plan to focus action on improving the situation; this will require a new initiative on international connectivity.

Extra funding is required to meet future program needs of SSI, DOE 2000, NGI & ASCI (agree; find ways to interface with the initiatives); to pursue a clear strategy for international connections; to deploy differentiated services in an operational manner (evaluate tradeoffs versus higher bandwidth cost effectiveness); and to offer infrastructure for research & development networking in addition to production.

ESnet Project Update - Jim Leighton

Domestic Issues

Packet size is still growing, averaging about 450 bytes. Bytes accepted are still growing, at about 10**13 bytes/month. Packet growth is a bit less aggressive.

The UC Berkeley GigaPOP will be connected at OC3 via LBNL with a no-cost local loop. It will be a CalREN2 and Abilene peering point; they are waiting for peers to arrive. ANL has an OC12c ATM under test: it sustained 570Mbps in testing, with low-level cell loss at the interconnect; they expect to test for a month. LBNL has been upgraded to OC12c ATM.

The vBNS contract expires March 2000 with no intent to renew/rebid. The future Internet 2 peering focus is on Abilene, so ESnet will emphasize connections to the Internet 2 GigaPOPs. There are existing connections at Chicago, SDSC and Perryman (the latter may not survive the MCI transition). The next connection is with UCB (see above). They are considering Atlanta and Seattle, and prefer Atlanta. Other alternatives for Seattle are to go via PNNL or to use NTON.

There is no particular interest on the part of ESnet in enhancing inter-agency interconnects, so they will not spend additional funds on NGI interconnects. (NGIXes are an effort by the LSN/JET committee to define NGI eXchange points, which will likely include FIX-W, Chicago and DC.) They will provide an ESnet OC3c connection for SC98.

SecureNET (ASCI) testing is in progress. They get 48Mbps end-to-end on an IP OC3 connection. They are not blaming the network; high speed end-to-end performance is hard to achieve (host, backplane, local I/O, and application issues).

They have begun discussions with Qwest on SSP, for example to put in a testbed between Albuquerque and the Bay Area.

Many of the public NAPs at which ESnet is peering are beginning to see congestion on the other (non-ESnet) side. This is beginning to cause problems and may lead ESnet to have to use private peering.

International access:

ESnet has had a phone conference with NSF to encourage peering for university access to foreign collaborators at Perryman (DFN, INFN, CERN, DANTE). ESnet proposed to do this in exchange for partial funding of an interconnect between Perryman and STAR-TAP. NSF needs to provide a plan. It will cost ESnet about $100K/year. The Germans do not appear to be very interested.

There was a meeting with DFN to develop plans for the next generation WINS. They will increase the transatlantic link to OC3 (i.e. 3*T3 from the current 2*T3), and also double the preferred bandwidth to ESnet from 1*T1 to 2*T1.

They will upgrade the Italian link from T1 to E1. KEK plans to replace the current T1 to LBL with a 10Mbps slice (out of an NACSIS OC3) connecting to Chicago (so traffic from KEK to SLAC will go to Palo Alto, then to Chicago, and then via ESnet to SLAC), to be operational in Fall '98. CERN now has 4 E1s to the US, and they look lightly loaded. ESnet does not exchange much traffic with JANET. JANET expects to have a POP located in the NY area, and plans to peer with ESnet there.

The VCSS provides a very nice, easy to use audio bridge as well as PictureTalk support. It is in pilot mode, not quite production. It will need to be advertised. The bridges can be linked to one another and also to the video conferencing. There is SSL support for security.

Differentiated services will require a lot of new technology to be evaluated, tested and even developed: for example traffic shaping, queue management, policing, classifiers, marking, etc. They are looking at the overall architecture for an ESnet DS approach. They are deploying clipper testbed facilities. They are testing OC12.
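
As one concrete example of the machinery to be evaluated, policing is typically a token-bucket check against a traffic contract; a minimal sketch (the rate and burst figures are hypothetical, not ESnet parameters):

    import time

    class TokenBucketPolicer:
        # Packets conforming to the contracted rate (plus an allowed burst)
        # pass; excess packets would be dropped or re-marked to a lower class.
        def __init__(self, rate_bps, burst_bytes):
            self.rate = rate_bps / 8.0   # token refill rate, bytes/second
            self.burst = burst_bytes     # bucket depth
            self.tokens = burst_bytes
            self.last = time.time()

        def conforms(self, pkt_bytes):
            now = time.time()
            # Refill tokens for the elapsed interval, capped at bucket depth.
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if pkt_bytes <= self.tokens:
                self.tokens -= pkt_bytes
                return True              # in-profile: forward unchanged
            return False                 # out-of-profile: drop or re-mark

    # Example: police a flow to 1 Mbps with a 15 kB burst allowance.
    policer = TokenBucketPolicer(rate_bps=1000000, burst_bytes=15000)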

They are looking at throughput performance issues to see why end-to-end performance does not match available network bandwidth. Often this is a host configuration issue. This includes a systematic study of the interaction between TCP fallback, router traffic, IP and ATM queue management, ATM policing, etc.
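
A common instance of the host-configuration problem is an undersized TCP window: throughput is bounded by window size divided by round-trip time. A quick illustration (the 70 ms RTT is a hypothetical cross-country figure):

    # Bandwidth-delay product: a TCP connection cannot exceed window/RTT.
    window = 65535     # bytes: the classic maximum without window scaling
    rtt = 0.070        # seconds: hypothetical cross-country round trip

    print("Max throughput: %.1f Mbps" % (window * 8 / rtt / 1e6))  # ~7.5 Mbps

    # To fill an OC3 (~135 Mbps usable payload) at this RTT, the window must
    # be at least bandwidth * RTT:
    print("Window needed: %.0f KB" % (135e6 * rtt / 8 / 1024))     # ~1154 KB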

They are looking at automating the help desk, with a Web-based front end and an expert system. They are also evaluating Web caching from Cisco but are not very happy with it: it needs new features and bug fixes, and is fairly limited in features. It is a transparent cache as opposed to a proxy cache (squid, harvest). Transparent web caching is considered evil by many, especially ISPs, since they no longer deliver the traffic. The recommendation is that ESnet not deploy it; the benefits do not outweigh the possible downsides (e.g. a misbehaving web cache can hose users in a very frustrating way, there may be privacy issues, and web hit statistics may be affected).

Advanced Technology

They are deploying PIM in the ESnet routers. They are about 1/3 done; SLAC has still to be done. IPv6 activities are being led by Bob Fink. He thinks transition/deployment is a few years away. They want to hire one person for advanced technology work.

Miscellaneous

The Sprint contract has been extended from August 1999 to August 2001. The extension will have lower ATM port costs, and they are working on reducing the costs of the local loop. They are considering a follow-on contract with another vendor to be run in parallel.

ESnet is Y2K compliant and they passed the audit. They stated that the vendor claims compliance and that the router is not reliant on the year-date for operation. They completed testing for the Y2K rollover and leap-year oddities. They have a contingency plan and procedures to bypass problems if needed.

ATM Test Tool - Bill Wing

We need a tool that looks at & warns of congestion ahead of time. This needs to be done at the ATM level. Bill has built one based on a commercially available (Duke) ATM sniffer, a GPS and a Linux PC. The one-off cost to build is about $20K. There might be interest in building several for ESnet sites. It would be useful to build the monitoring into the PingER monitoring.
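
For the IP side, folding such alarms into PingER-style monitoring could look something like the sketch below (the host name and loss threshold are hypothetical; PingER itself works differently in detail):

    # Minimal PingER-style congestion warner: ping a host periodically and
    # warn when packet loss crosses a threshold. The host and threshold are
    # hypothetical examples, not part of Bill's ATM tool.
    import re
    import subprocess

    HOST = "example.net"
    LOSS_WARN_PERCENT = 5.0

    def ping_loss(host, count=10):
        # Run the system ping and parse out the packet-loss percentage.
        out = subprocess.run(["ping", "-c", str(count), host],
                             capture_output=True, text=True).stdout
        m = re.search(r"(\d+(?:\.\d+)?)% packet loss", out)
        return float(m.group(1)) if m else None

    loss = ping_loss(HOST)
    if loss is not None and loss >= LOSS_WARN_PERCENT:
        print("WARNING: %.0f%% loss to %s - possible congestion" % (loss, HOST))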

New ESCC Chairman

The new ESCC chairman will be Bill Wing of ORNL. He will take over from Roy Whitney, who is stepping down. The initial term is for 3 years.

International Issues - Chris Witzig

This was a working group of Jean Noel LeBoef, Jim Leighton, Larry Price and Chris Witzig. They created a list of interesting international sites for each program. They came up with 92 sites: 50 in Western Europe, 17 in Eastern Europe & Russia, 17 in the Far East, 3 in Canada, and 3 in Australia. ARM gave 9, in quite exotic places (including Antarctica), NP 24, HEP 14, Fusion 1, MICS 32, and Molecular Biology 2. Germany had 17, France 10, UK 8, Switzerland 5, and others 15. Some sites are counted twice if mentioned by more than one program. There is no prioritization even though some programs did prioritize, and some programs submitted more sites than others. We will add these sites to the ESnet monitoring.

They have started on a draft of a paper making the case for better international connectivity. They are working on the draft and will circulate it by Email shortly after this meeting. They would like to incorporate input from the ESSC at this meeting.

The outline contains background stressing the importance of international connections for all programs and the current imbalance of funding between Europe and the US (Europe funds 90% of the US-Europe research network connectivity (270Mbps); see http://www.dante.net/policy/co-fund.html). Following this is a section on the present role of international connections for each program, then the future requirements and a list of specific recommendations.

After a long discussion it was agreed that something needs to be added on what it would cost to get what, so that a proposal can be made to DOE/MICS.

ESnet Site Coordinating Committee - Roy Whitney

There is an active set of task forces (TF) and working groups (WG). The next meeting is at FNAL, Oct. 19-23, 1998. Highlights include network upgrades and SSP infrastructure, network monitoring, differentiated services, Gbps nets, IPv6, Email, security & firewalls, enhanced supercomputing conference support (moving beyond the weapons labs), PKI & DCE, Entrust/Netscape/MSIE-ready smart cards, and interfacing with DOEBN.

Security may have a big impact. The LBNL PKI project is going very well; there is a formally named individual who will be replacing Bill Johnston of LBNL. The globus project (http://www.globus.org/) uses X509 certificates in a GSSAPI implementation. Five labs are running Entrust, and classified networks are using Entrust (not to protect classified information from disclosure). DOE continues to work on its PKI policy. The driver is commerce. The ESCC needs to interface with the SLCCC technical WG on security.

A Kerberized ssh is being installed on ASCI IBM SP2 nodes. DCE cross-cell service accreditation in the classified environment between LLNL, LANL and SNL is under way. DFS is being used to keep foreign users from exploring code.

Issues include:

The SLCCC is working on showing how the laboratories connect to the DOE mission, with the laboratories as a system. Other issues for the SLCCC include DOE information systems (e.g. individual labs' personnel systems, badging systems, ES&H, etc.). Security is another big issue. The SSP is a big issue which could lead to $250M flowing into high-end software development etc. in DOE and NSF. ESnet could naturally be seen as the network provider, at least for some of the major sites, especially since the NSF vBNS is going away and being replaced by the university Abilene network.