November 5, 2003
There were about 20 attendees from FNAL, ESnet, DoE, NSF, StarLight, CENIC/NLR, Caltech, ORNL, plus some companies.
The goal is to push forward a testbed proposal for high-performance networking for ESnet, to support science applications. Want to learn how it would be supported by the community and how we want to use it. Hope for co-funding with NSF. Strong support from upper levels in DoE (Ed Oliver). There were previous workshops to address the requirements for ESnet, and now there is a DoE Science Networking Challenge, Roadmap to 2008 document available on the web. UltraNet will be a multi-year effort.
Abilene, with GigaPoPs and OC192c & OC48c links across the country (OC48 HOU-ATL, SNV-DEN, DEN-KC, IND-ATL), Juniper T640s. The only 10Gbps connections to Abilene at the moment are from CalREN2 at LA, Seattle/Washington & StarLight. Have MPLS capability, but not much used at the moment. Focus areas: multicast, IPv6, large flows e2e; Abilene observatory to support the network research community, an open measurement & experimental platform; dedicated capability experimentation, e.g. QoS-enabled MPLS tunnels; network security; advanced restoration techniques (bold face is of interest to today's discussions). QoS/MPLS removes congestion, but there is no congestion on Abilene at the moment, so not much demand; open to suggestions (e.g. give "riff-raff" traffic scavenger service).
It is stylish to talk about scheduled transfers, but the shared backbone continues to surprise (e.g. the latest LSR of 5.6Gbps with large MTUs). Regional networking is changing fundamentally: GigaPoPs are being replaced by metro and regional networks (Regional Optical Networks - RONs). Distances are important: <60km needs no amps, 500-1000km needs amps, >1000km needs OEO regeneration. Many RONs are emerging (e.g. CalREN, I-WIRE, I-LIGHT, SURA Crossroads, Florida LambdaRail, NC LambdaRail, Connecticut Education Network ...), which may have different emphases on production and research.
Emerging dedicated capability options in the US: Abilene MPLS tunnels over 10Gbps IPv4/IPv6; NLR GE VLAN over a 10Gbps lambda; I2-NLR muxed circuit over 10Gbps; NLR individual 10Gbps lambda; FiberCo dark fiber route-miles.
Sustaining dependable flows in the 10Gbps range over shared IP may be difficult; scaling GLIF optical/TDM beyond a limited number of sites is difficult; dedicated capabilities provide a vehicle for testing and for very high-end requirements. Potential end-states: shared IP nets with lambda resources dynamically balanced & optimized to meet changing needs; dedicated lambda (or subrate lightpath) resources visible to e2e apps in response to application needs; hybrids (but of what nature?).
HOPI (Hybrid Optical and Packet Infrastructure, prelude to the process for a 3rd-generation I2 Internet architecture): convening a design team; objective is short-term trials leading to a scalable long-term hybrid architecture; target deliverables early 2004, with a 2005-2006 time frame for implementation (in time for LHC); in the interim, observe the SURFnet process carefully.
~$80M ($51M committed) budget for the full national backbone (CAPEX + 5-yr OPEX); contributions assumed to be sunk costs; added lambdas can be provisioned with pricing tied to incremental costs. Committed: SAN, LAX, SFO, Portland, SEA, DEN, CHI, Cleveland, Pittsburgh, DC, Raleigh, ATL, & Jacksonville. SNV-CHI 12/03 buildout schedule (fairly firm within 1 month). Contact Dave Reese for NLR equipment co-located at Level(3) PoPs; dark fiber is the best choice; colo available via Qwest or NLR; space in CHI is constrained.
Operational since summer 2001, with 1GE and 10GE switch/routers (Force10) for high-performance access to participating networks, 40 racks, no cross-connect fees. www.startap.net/starlight/NETWORKS/. Increased space in the last year, new racks spoken for, looking to build out another 40 racks. Also a GE lambda exchange for the US, Canada, Europe, Asia & S. America for experimental networks. Also a 1&10Gbps MEMS-switched (software-controlled patch panel) research net hub. Have fibers and circuits from SBC, Qwest, AT&T, Global Crossing, T-Systems, Looking Glass, RCN & I-WIRE. Chicago hosts the NSF DTFnet (a 4*10Gbps net for TeraGrid) & DTF/ETF links to Abilene; NLR USAWaves, others coming. For international links, coming to StarLight avoids transit issues.
TransLight is a global-scale networking initiative to support prototypes with lambda scheduling; this is the operational part of GLIF. It enables grid researchers to experiment with deterministic provisioning of dedicated circuits and compare results with standard & experimental aggregated Internet traffic. Tests include moving large amounts of data, supporting real-time collaboration & visualization, and enabling globally distributed lambda grid computing.
Any special handling requires packet identification on entry to the net: special end-host addressing; unique ingress/egress interfaces; tagging of ingress packets. Once identified, packets can be routed differently (e.g. QoS). Create LSPs using RSVP to signal along a predefined LSP. Need policing to protect production traffic. An option is to use MPLS class of service (e.g. to give Scavenger CoS): applied on the ingress interface before queued packets are sent into memory, or re-written on the egress interface before the output queue.
Juniper has "filter-based forwarding" to allow separate routing instances with separate routing tables for testbed and production. Scavenger service could be used to prevent over-running production traffic (support is in place on ESnet); sites can tag as they please.
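The notes mention sites tagging their own traffic for scavenger service. As a minimal sketch (not from the meeting), an end host can mark bulk-transfer traffic with DSCP CS1, the codepoint commonly used for scavenger/less-than-best-effort classes, via the standard IP_TOS socket option:

```python
import socket

# DSCP CS1 ("scavenger") is DSCP value 8; the TOS byte carries the DSCP
# in its upper six bits, so the value to set is 8 << 2 = 0x20.
SCAVENGER_TOS = 8 << 2

def scavenger_socket():
    """Create a TCP socket whose outgoing packets carry CS1, so routers
    configured with a scavenger class can deprioritize them under load."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, SCAVENGER_TOS)
    return s

s = scavenger_socket()
print(s.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))  # 32 on Linux
```

The marking is advisory: it only has effect where the network (as on ESnet, per the notes) has a scavenger queue configured.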
On average, 60-75% of the traffic on a network is pass-through; the rest terminates at that site. For European networks the pass-through fraction is even higher. This is a problem for end-to-end provisioning, i.e. the backbone bandwidths are >> the end links: 3Tbps aggregate demand against 600Gbps link capacity. Want photonic switching (vs. OEO) since it requires less space and less power; in addition want it agile (i.e. not needing manual intervention). Need to know wavelength reachability, power control etc.
Technology for photonic switching is largely in place (e.g. MEMS switches, tunable lasers). Tunable filters (Fabry-Perot) are a bit more exotic, but are under development. So one can build a digital photonic control layer. Then wavelengths can be provisioned on demand, and the approach is technology-agile (i.e. scales with evolving technology); it also enhances MTTR/availability.
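One consequence of all-photonic switching (no OEO regeneration) is wavelength continuity: the same lambda must be free on every link of the path. A toy first-fit assignment illustrates the control-layer bookkeeping; the topology and link names here are illustrative, not from the meeting:

```python
# Toy first-fit wavelength assignment under the wavelength-continuity
# constraint of an all-photonic path (no OEO conversion mid-path).

def assign_wavelength(path_links, in_use, num_lambdas):
    """Return the first wavelength index free on every link of the path,
    marking it in use; return None if the path is blocked."""
    for lam in range(num_lambdas):
        if all(lam not in in_use[link] for link in path_links):
            for link in path_links:
                in_use[link].add(lam)
            return lam
    return None

# lambda 0 busy on one link, lambda 1 busy on the other:
in_use = {"SNV-DEN": {0}, "DEN-CHI": {1}}
print(assign_wavelength(["SNV-DEN", "DEN-CHI"], in_use, 4))  # 2
```

Blocking probability under this constraint is what makes agile (software-driven) reconfiguration attractive compared to manual patching.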
What are we trying to do: get beyond the limitations of TCP/IP and firewalls, and take advantage of the real throughput available in optical nets. Need hero efforts to get >6Gbps. Firewalls throw it all away. The TCP/IP limitation is 13Gbps. We have optical circuit switching, but it does not scale, so it is not useful for commercial networks; it is good, however, for DoE high-performance sites. Build a sparse lambda-switching testbed, connecting hubs to DoE's largest science users. Separately fund research projects (e.g. hi-perf protocols, control, visualization) that will exercise the network and directly support applications at the host institutions.
Will use off-hours capacity from ESnet. Possibly as much as 3*OC48 between SNV and CHI on the N route. Dedicated lambda on NLR: one 10GE SNV-CHI, at least one more in year 2 or 3. One or two dedicated 10GE to ORNL. Progression of switching technologies: start with Force10 E600 (wide selection of ports, can do VLAN switching today, MPLS) or Cisco 15454 (explicit sub-lambda switching (OC12, OC48), will never do MPLS); migrate to Calient all-optical or hybrid. Logistical storage depot. Progression of experimental point-to-point transport technologies (e.g. FC-IP).
The net will be a 10Gbps lambda, with out-of-band scheduling. Only experimental traffic. Nagi Rao is concerned about making sure that the UltraNet community/applications are distinguishable from Internet2. PNNL wants in, and NLR goes via Seattle.
How do we get production loads on the testbed? The equipment at the labs (e.g. disk/compute farms) is a substantial investment and is not feasible to move to the PoPs.
The roadmap requested about $17M for this activity.
Chicago-SNV is paced by NLR; expect first user traffic by June 2004.
Thomas will fund SciDAC projects to use this infrastructure in FY04.
Long-lived, expensive experiments with large data generation/storage requirements. The cost of reproducing HEP data is high, so there are long-term storage requirements. FNAL has over a PByte of data stored. They build and integrate their own data systems.
SAM consumes 200TB/month. Need access from UltraNet, StarLight and ESnet to the FNAL core network and the large storage systems for CMS, D0, CDF. These storage systems represent millions of dollars of investment (say $3-4M/year); they will be modified to take advantage of UltraNet.
FNAL is trying to reach StarLight. It has major collaborators (say ten tier-2 sites: 5 for CMS, 5 for ATLAS) across US states. FNAL sees a broad range of activities: access to hi-perf nets, ease of investigation, ...
EU-US Grid network research for hi-perf nets, inter-domain QoS, advanced bandwidth reservation. EU-US Grid interoperability using GLUE. LHCnet & DataTAG testbed for a CERN-US production testbed, to experiment with massive file transfers across the Atlantic and hi-speed peering at 10Gbits/s with Abilene & GEANT. Now have OC192 CHI-CERN. It will be used for both production and testing, so it is difficult to get 10Gbps dedicated to testing. DoE/HENP funded part of the OC192 trans-Atlantic link. Have powerful Linux farms, native IPv6, QoS, LBE; looking at MPLS and GMPLS; getting hands-on experience with operation of Gbps nets, multi-platform and multi-technology. This is IP based, complementary to UltraNet. Caltech to downtown LA Feb 2004. To get from LA to StarLight could use CENIC/Abilene (shared), the Level(3) MPLS backbone, NLR/UltraNet, or UltraLight. With both dedicated and shared paths one can compare, and also learn how to take advantage of different path types (dedicated, shared, QoS, VLAN, 15454 TDM-style, MPLS etc.), experimenting to see which works best. By 2007 need 2-4 production 10Gbps links CERN-Chicago, and need to know how to use them. Reach economics for fiber optics favor 10Gbps (multiples) vs 40Gbps.
The Land Speed Record stands at 5.64Gbps for 1 hour with no packet loss (source rate limited), using an Intel 10GE NIC and Itanium, MTU 8152 bytes, Cisco GSR, 760x, Force10 (StarLight), Juniper T640. Used TCP/Reno.
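A quick bandwidth-delay-product calculation shows why records like this require big windows and jumbo frames: a single TCP stream can carry at most one window of data per round trip. The RTT below is an illustrative assumption, not a figure from the record:

```python
# Bandwidth-delay product for a single long-haul TCP stream.
rate_bps = 5.64e9   # record throughput, from the notes
rtt_s = 0.150       # assumed long-haul RTT (illustrative, not from the record)
mtu = 8152          # MTU used in the record, per the notes

window_bytes = rate_bps * rtt_s / 8      # bytes that must be in flight
packets_in_flight = window_bytes / mtu

print(f"required window ~ {window_bytes / 1e6:.0f} MB")      # ~106 MB
print(f"packets in flight ~ {packets_in_flight:,.0f}")       # ~13,000
```

With Reno, a single loss halves that ~100MB window, and recovering it at one MTU per RTT takes thousands of round trips; hence the emphasis on loss-free paths and hi-perf protocol research elsewhere in these notes.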
UltraLight submitted to MPS/Physics "Physics at the Information Frontier". First hybrid packet-switched & circuit switched network. Learn how to use 10Gbps lambda efficiently, scheduled or sudden overflow demands handled by provisioning added wavelengths (GE, n*GE, eventually 10GE).
A genome is an organism's complete set of DNA. Proteins perform almost all life functions & make up most cell structures. DoE Genomics/Proteomics is very complex, with huge computational needs. 10Gbps from MS (8.5-17TB/day) => 15TB datasets.
Vision: creation of a national experimental infrastructure for developing cost-effective advanced network technologies to ensure US leadership in large-scale science endeavors by accelerating the discovery process. Office of Science & inter-agency support. National & international coverage & visibility; join an ongoing effort. Accelerate science discovery: access to instruments, tera-scale computing resources, PB data archives, remote visualization. Advanced technologies to enable new cost-effective business models for science networks. Integrated (applications - R&D, Middleware - R&E, Network - R&E). If not tied to applications then it is doomed from day 1.
Technology goals: agile hybrid network technology, user-level (application-level) provisioning, transport protocol transparent, multiple IP/optical domain.
Potential application communities: must advocate advanced networking; prototype applications; access to UltraNet; potential cross-cut activities (i.e. not just MICS but also other program offices). SciDAC applications would be the first to run on UltraNet.
Two problems that UltraNet can address: how to utilize 10Gbps transport; how to schedule a rich set of sub-10Gbps circuits. Then can grow to multiple 10Gbps lambdas later.
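The second problem, scheduling sub-10Gbps circuits on a shared lambda, is at heart an admission-control question: accept a reservation only if committed bandwidth never exceeds the lambda over the requested interval. A toy sketch (all names and values illustrative, not a design from the meeting):

```python
# Toy admission control for sub-rate circuit reservations on one
# 10 Gbps lambda: a request is accepted only if total committed
# bandwidth stays within capacity at every instant of its interval.
CAPACITY_GBPS = 10.0

accepted = []  # list of (start, end, gbps) reservations

def admit(start, end, gbps):
    """Accept the request iff capacity holds throughout [start, end)."""
    # Load only increases at interval start points, so it suffices to
    # check 'start' plus every accepted start falling inside the window.
    points = {start} | {s for s, e, g in accepted if start < s < end}
    for t in points:
        load = sum(g for s, e, g in accepted if s <= t < e)
        if load + gbps > CAPACITY_GBPS:
            return False
    accepted.append((start, end, gbps))
    return True

print(admit(0, 4, 6.0))  # True
print(admit(2, 6, 6.0))  # False: would need 12 Gbps during [2, 4)
print(admit(4, 8, 6.0))  # True: starts after the first circuit ends
```

The out-of-band scheduler mentioned earlier in the notes would sit above logic like this, translating accepted reservations into VLAN/TDM/MPLS provisioning actions.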
Management issues: prioritize experiments; coordinate traffic exchange; communication between UltraNet engineering & application communities; technology transfer. Management team: UltraNet engineering; ESnet engineering (they will absorb the technology, so they cannot be left out of the loop); application reps/grids; international reps; StarLight. Monthly design meetings: status changes; design changes.
Broader vision: funding; FY06 budget submission - March FY04 at latest. Need a compelling argument for a $10M increase. The workshop projected a need of $13M for research (ESnet stays stable and is separate). Need PR for the Associate Directors of the Labs.
LSN inter-agency actions: develop an assessment & vision of a national experimental infrastructure for developing advanced networks for the next Internet - high-capacity testbeds; wireless testbeds; cyber-security testbeds. Actions: an inter-agency testbed (DoE, NSF, DARPA, NASA), experimental networks (regional networks, JETnets plans, StarLight). Applications are what makes DoE unique.
Dan Orbach & Ed Oliver are in favor of UltraNet.