Visit to RAL & Amsterdam Peering Workshop

R. L. A. Cottrell

March 22 - March 25, 1999

Visit to:

Rutherford Appleton laboratory, Abingdon, England

and

Attendance at workshop on International peering and Bandwidth management Technologies and Architectures, Amsterdam, Netherlands.

Traveler: Roger. L. A. Cottrell, Assistant Director, SLAC Computer Services, Stanford Linear Accelerator Center, Stanford University, Stanford, California 94309.

Dates of Trip: March 19 - March 25, 1999

Purpose of Visit: To visit RAL in connection with network and computing services, and to attend the workshop the peering workshop.

Visit to RAL

RAL Security

RAL has set up a computer security committee with representatives from the major areas of computing at RAL. As part of this they have set up an email list of the system administrators for the various sets of systems. In the past they used software from Optimal to look for acceptable use of the network and extended that for a bit to also look for intrusions. More recently they are using NetXray to scan traffic at the border of RAL. They record all TCP open requests apart frm HTTP, SMTP and NNTP, and they can hold this data in core (before it is looked at later) for 90% of the weekends today. A new version of NetXray will allow spinning off the data to disk which will allow longer holding times. Their use pushes NetXray to the limit sionce it can only trigger on 20 daata patterns and 20 booleans. They have developed a set of filters for NetXray that allow them to manually scan the results at the start of each workday via NetXrays GUI interface. Among other things they look at IMAP and finger, and look for DNS scans. If they spot scans they they apply blocks in the border routers ACLs for the host making the scan. Typically they keep 60-70 blocks and retain a block for ~ 40 days.

They have devised a scheme for setting aside 128 IP addresses in each subnet they have 1024 addresses in each subnet). These are reserved for servers which need access from the outside world. Machines with other addresses are blocked from responding to a TCP open request. They hope to introduce this scheme in the next few months. This initial scheme will not address UDP or X or FTP issues. In the second stage they will add further blocks via ACL's in group routers which are centrally controlled. This blocking will employ a block all, allow specific protocol strategy. Thus people will need to request/register to allow servers to have access through the routers. In the third stage they will look at how to take advantage of private IP addresses. (e.g. for printers and financial applications). The fact that RAL/DL is a split into 2 sites separated by a couple hundred miles adds complexity to the design.

RAL Y2K

They are using Oracle Financials 10.7 which is Y2K compatible. They run it on Data General machines. They have run acceptance tests where they rolled the clock forward.

For PCs they are using SMS to update clienst, It has a steep learning curve, and upgrading is difficult due to user customization of platforms. It is not completely automatic yet. They have stick on labels that they put on each PC that is Y2K compliant to indicate its BIOS, OS & Office 97 applications are Y2K compliant.

Their phone switch is not Y2K compatible in particular in the area of billing back. Thus they are putting out an RFP to upgrade their switch.

UK A&R Networking

BaBar in the UK has requested and received 800K pounds Sterling for large scale distributed databases. The idea is to have 5TB at RAL, and 2-3TB elsewhere. Only the universities could bid on this money, and Manchester is assigned the money. LHCB has been awarded 500K pounds Sterling. There is also a JANET infrastructure fund of about 600K pounds Sterling. CDF/D0 and BaBar have put bids in, and hope to hear in April. It is expected to be used to help with off-line analysis. One thought is to use the funds (if granted) to provide some form of dedicated bandwidth form the UK to SLAC. JANET has costed a 2Mbps connection to US at about 150K punds Sterling/year.

The UK A&R network is upgrading the transatlantic link from 90 Mbps to 155Mbps, and later this year (~May 1999) to 2*155Mbps. There is a proposal that 10% of the 2*155Mbps be reserved for providing quality of service/managed bandwidth for special projects (one such project could be for BaBar use). For long term (> 3 months) projects there would be a one-off set up charge and a quartely capacity charge (1K pounds/quarter/megabit). If/when this comes to pass a challenge will be how to connect in the right people and things like acceptable use policies etc. This will require close cooperation between JANET/RAL/ESnet/BaBar/SLAC to come up with an effecive plan.

Traceping

John MacAllister is working on extending tarceping so it can monitor all the PingER Beacon sites. The test results can be found at http://av9.physics.ox.ac.uk:8097/. The main scaling problem has been the cpu cycles required for the analysis. Currently the analysis is srunning on a DEC Alpha 3400. With 50 beacon sites there are 50 processes running simultaneously and each takes 1-2% of the cpu. The design allows the monitoring to be done on a separate machine to the analysis. The communication (i.e. the data is sent from the monitoring machine(s) to the analysis/collection machine(s)) is via email. Today there are Traceping monitor/probe processes running at DESY, CERN, Oxford, Munich, Gran-Sasso & SLAC (SLAC & Oxford are also analysis/collector sites, Oxford acts as the cllector/analysis site for the other monitor sites).

We discussed the next steps needed to get Traceping installed at the PingER monitoring sites. I am identifying which PingER monitoring sites can/ are willing to run Traceping today. John will then work with those sites to install/configure it at those sites. This will require John to get an estimate of the storage space required today. Les will also identify which monitoring sites can/are willing to run Traceping if it is ported to Unix/WNT and identify which patforms are most popular. John will work on porting Traceping to Perl so it can run on Unix or WNT (currently it runs on VMS).

Peering Workshop

Workshop on international peering and bandwidth management technologies and architectures

Introduction

This meeting was organized by Cisco (the organizer was Bob Aiken <raiken@cisco.com>)to identify and evaluate the various possible peering architecgures, models, and technologies available to the European Research and Education (R&E) community to support bandwidth and peering management for trans-European and trans-Atlantic network connectivity.

Overview & History - Bob Aiken

A brief history of the internet peering: ARPANET IMPS, NSFNET/POPs regionals, FIXes (US federal), CIX (commercial), NAPs, (neutral peering points where did not have to worry about acceptable use policy (AUP) or class of use (COU)), commercial NAPs/private peering.

Issues: distributed peering versus centralised peering, QoS, open to public or private, use of a public routing database, congestion issues/scalability of public peering points, cost sharing/business models, security of data & traffic patterns, relationships of large ISPs & small ISPs (how large do you have to be to have bi-lateral peering), media - switch/bus/ring / ..., value added services (time, caching, security management ...).

ESnet - Jim Leighton

US Agency peering domestic uses public peering points of MAEs & NAPs typically L-2 interconnect points and policy free and the lost successful to date are ATM based; and GigaPoPs are essentially aggregation hubs for the US R&E community, form the principle customer set for vBNS & Abilene and also accept external connections (e.g. ESnet is connected to several GPoPs); the NGIXen is the evoloution of the FIX concept & interconnects for JET (NGI) networks, is of arguable value even among JET network managers, costly & little exchange of traffic among agencies.

Peering between production networks and experimental networks has its own concerns: research networks are unstable (new code, route flaps, can advertise bogus routes), need apllications to test new technology.

US Agency peering international have dramatic changes underway: seems to be abandoning the traditional half-circuit model, many international peers now arriving at US peering points with multi-megabit links e.g. STAR-TAP, Perryman (DFN, CERN, INFN), Telehouse (DFN, DANTE, INFN soon), 60 Hudson, NY (Surfnet), allow sUS agencies to peer for cost on a domestic interconnect. There are 3 major peering points in the US - San Francisco, Chicago & NY/Washington.

GigaPoPs - Paul Love (I2)

I2 (Internet 2) started 2.5 years ago, with 34 universities signed up with membership dues & commiment to end-to-end high performance connectivity, and developing high end applications. Current membership is 147 universities, 27 affiliates and 50 corporate members.

Provides high-quality widely available interconnect among participating GigaPoPs/universities with an IP bearer service. Routing uses OSPF, BGP4 & a routing arbiter database, measurements using Surveyor, traffic utiliztion, OC3MON, end-to-end flows, QoS/Qbone. Has OC48 packet over SONET, & OC12 POS, will have OC48 trancontinental by end April. Have 10 core nodes. Will add Washington DC to core.

Abilene important to connect to other high performance nets: vBNS, federal agencies, non-US NRENs etc. Paul feels NGIX are important to interconnect through. Will peer with NGIXes in SFO & Washington, Startap, CaNet3.

GigaPoP architecture idea came from George Strawn of NSF. Idea was a catalyst for I2, it is an aggregation point that provides economies of scale, not limited to university members, high speed local traffic stays local. There are a variety of business models and technical approaches. They all support IP as a common bearere service, inter-GigaPop routing policy & design, measurements, trouble ticket sharing. GigaPoPs connect to GigaPoPs and to the backbone. There are about 30 GigaPoPs today.

There are GigaPops with: star topologies, reduces problems of high bandwidth in core, but single point of failure and long back-hauls; distributed optimizes backhauls reduces politics, but bandwidth between nodes can be a bottleneck; combination e.g. two stars. As an example Pacific Northwest uses Gbit Enet with 3 physical points 10 miles apart, try to avoid single points of failure. A 2nd example is the Great Plains network with 6 state networks connected to 3 hubs (Souix Falls (dept of interior has a lot of earth sciences data)), Kansas City, Minneapolis). CalREN-2 is distributed over N & S California with OC12 (vBNS) between N & S, OC48 rings in Bay Area & S.Cal, connect to ESnet & Abilene in N, & in S ESnet and Abilene & NREN in future. No campus can put more than OC3 onto the network. The Southern Crossroads (SoX) has a single hub at Georgia Tech.

MAE-East experiences - Steve Feldman (MCI/Worldcom)

Started in 1992 with shared Ethernet over DS3 ring, in 1993 migrated to switched Ethernet, in 1994 added FDDI, in 1995 added switched FDDI (Gigaswitch), in 1997 multiple Gigaswitches with star topology, 1998 MAE-ATM introduced. Today has a core switch with 5 FDDI trunks to edge switches whre cusotmer routers connect. There are about 116 connections to MAE-East with 2.1Gbps peak period traffic (was 2.0Gbps in Nov '98 & 1.6Gbps in Aug 98). Gigaswitch is an old design which has head of line blocking problems. There have been problemns with overloaded trunks, overloaded access points and solution does not scale.

So introduced MAE-ATM services based on Cisco/Stratacom BPX switches. They provide fixed-bandwidth PVCs among providers with virtual private peering. Port speeds are up to OC12c today and it is not interconnected with the FDDI MAE. They have 3 ATM switches today. They have a web based PVC provisioning tool to allow customers to provision their own PVCs. They also enforce no over-subscription & bilateral agreements.

They are looking to replace the FDDI with Fast Ethernet (since few people are building fast FDDI switches, all the development is going into fast Ethernet with Gbit trunks (learnt hard way that do not want edge connections to overload the core), eventually they will be going to Gbit Ethernet switches), evaluating bigger ATM switches (estimated need by summer).

Larger ISPs set up private peering, MAEs are used to connect large to medium & medium to medium ISPs.

CANARIE: optical peering - Bill St Arnaud (Canarie/CA*net3)

Deploying an IP over DWDM network over Canada. Using 2 wavelengths but have 8 available to Canarie, so will have 80Gbps available. Then in each province they have GigaPoPs. The have issued an RFP to add extra wavelengths. Plan to use 10Gbps Ethernet when available. It is all IP, no SONET or ATM., i.e. router (GSR 12000 with OC48 connect) connects directly to fiber. They share the fiber with the commercial carrier. Commercial carrier has 4 wavelengths, Canarie has 8 wavelengths, the fiber currently carries 16 wavelengths. They can configure the wavelengths to provide asymmetric traffic. They use 2 rings so have failover (hope to get to 1/3 second restoral). The elimination of ATM & SONET has big cost savings (quoted 50% of cost). There has been a point of contention between US & Europe due to the asymmetric nature of traffic built on top of symmetric links which since most traffic flows US to Europe means Europe to US get a free ride. Can configure different wavelengths to use different technologies, e.g. one wavelength could be ATM, another IP over SONET, another IP over DWDM etc. So once one gets access to dark fiber the connection can be very cheap. Next year will be able to build a 400km ring with new amplifiers being developed. Today have 64 wavelengths possible, next year expect 128 wavelengths.

Expect to see High Performance Network Service Providers (HPNSP) peering at multiple points, each point being for a different community (e.g. ANX) rather than using the general exchange point.

APAN, STARTAP, TransPac - Linda Winkler

TransPac is a 73Mbps VBR-nrt with IP over ATM with exchange point to the STARTAP in Chicago. There is policy routing done in Tokyo.

Policy routing can be costly in cpu utilization in Cisco router, e.g. for an policy in the access list at position 1 get 15% utilization (Cisco 7500/RSP4 with 11.3 IOS single 14kpps packet stream), 50 22%, 100 30%, 200 57%, 300 83%, 350 98%.

STARTAP moving to 3 Ascend switches with OC12 trunks. CERN imminent connection, France imminent, Israel imminent, Netherlands, NORDUnet imminent.

CERN - Olivier Martin

CERN Internet Connection Point (CIXP) started in 1989, official IXP since 1996 with a regional scope for France & Switzerland, open to all ISPs having POPs in CH & FR with at least 3 peering agreements in place.

CERN-US with IN2P3 & WHO. MCI/Perryman POP will be relocated to C&W Chicago, UCAID membership under consideration. 2*E1 upgrade to 12Mbps 1/4/99 then 20Mbps 1/10/99 via C&W (ATM-VBR-nrt). Will have STM-1 to STARTAP.

Bandwidth management to provide fair share of capacity to partners while allowing unused bandwidth to be usable by other partners. Since mid-1998, FR based solution in place (Cisco/Stratacom IGX). FR does not scale to higher speeds, will have to use ATM. Use WFQ which works well for telnet. CAR capable IOS from Cisco but buggy & disabled. Packet video uses a repackaging if MBONE tools (vic, vat, rat) behind a web interface & a conference room reservation system. Does not use MBONE rather uses reflectors.

Lot of progress on traffic statistics, network probes etc. Much remains to be done on QoS.

SURFnet, GigaPort & AMS-IX - Eric Jambeau

SURFnet private compant located in Utrecht users are research & educational users in Netherlands. Currently has up to 622Mbps. Has sigend an MOU with UCIAD so can connect to Abilene and vBNS. Has 4 GigaPoPs interconnected at 622Mbps with Cisco 12000 (GSR) routers. Plus 16 connection PoPs at 155Mbps. One PoP in NY at Teleglobe (60 Hudson St.) also connections to JaNet & Nordunet. There are 200 connections to network with speeds 64K - 155M. Coverting multicast PIM-SM.

200Mbps to N. America with CAR & WFQ. To Europe 155Mbps (TEN-155). AMS-IX (Amsterdam Internet eXchange) at 200 Mbps with 35 peers. Soon hope to connect to Abilene at 75Mbps end March 1999. STAR-TAP at 45Mbps will connect in April 1999, and will also peer with Abilene/vBNS there. NORDUnet will have its own connection to STARTAP.

GigaPort has 2 components: a GigaNet (SURFnet) to be the NGI in the Netherlands; GigawWorks for applications & services (Telematics Institute). There was a grant of 85M US$ for 3 years. Will need external connectivity & will run pilots for access networks (e.g. cable modems, mobile networks).Hope for 1st trial of GigaNet in 1999 with IP over DWDM at four PoPs at 2.5G. For 2002 hope to have 2.5Gbps to N. America.

AMS-IX is the oldest & one of largest IXs in Europe, it is an association under Dutch law. It has 2 locations SARA & NIKHEF in Amsterdam. There is switching hardware & a router with a Gig Ether trunk. 10BaseT & 100BaseT connections, 55 ISP conected. Have experiments with MBGP & MSDP, IPv6 in cooperation with 6TAP, soon they will have fast Etherchannel connections and later Gig Ethernet connections. Looking at more colocation facilities.

SuperJanet - Kevin Hoadley (UKERNA)

SuperJANET started in 1992 with 140/155Mbps ATM core with 70+ sites on SMDS (10 Mbps). SuperJANET III (1998) 155Mbps ATM core, plus regional metropolitan area networks (MAEs).

SJIII backbone supplied by Cable & Wireless with 16 regional nodes (4 core SWC switches). MANs supplied by local cable/power companies, varying from 155Mbps SDH to dark fiber all currently running ATM and are locally managed.

Not really a multiservice metwork, there is wide use of video but curcuit based on CBR @Mbps CBRs and it consumed most of the ATM network. Currently video uses ISDN at 384kbps, the exception is within Scotland with M-JPEG video on ATM with 60Mbps reserved.

The external links are: 2*155Mbps to US from Teleglobe (May) with transit to Global Internet, plus FM space at 60 Hudson St; TEN-155; London Internet eXchange (LINX); CERNET with China at 512kbps.

Diversified peering: where do they go from Hudson St; where do they go within UK (looking at private peering), MANs doing their own thing; where else in the world - real interest in more than US/EU, interest in Singapore, Malaysia, Hong Kong.

US-Russian ATM link & MIRnet project - Valerii Vasenin

Have 6Mbps Chicago to Moscow. MIRnet to begin in a few days. Bandwidth limited by finances available on Russian side.

TEN-155 in Europe - Jan Novak (DANTE)

Core is UK, NL, FR, DE, with dual links SE & CH, plan to add IT ~ 20 countires are connected. Full mesh of STM connections to core plus SE, CH. This eliminates transit traffic which is a big savings on switch & router interfaces.

Renater peering & QOS approach for NRN - Jean Marc Uze (Renater)

Peer with US (155 Mbps), to SFINX (Service for French Internet eXchanbede) 10Mbps to Spain, & 155Mbps to TEN-155. US via France Telecom peering at STARTAP and Sprint service points at Chicago, Pennsauken, Stockton & Relay. They claim there is no congestion France to US (congestion in reverse direction)..

Signed MOU with STARTAP to alows peeing with ESnet, CA*net, NASA, Abilene, vBNS.

The problems with a national backbone are how to go from big pipes to QoS. Will have to do `since can't afford high speed links. Will increase bandwidth as budgets allow.

GARRnet - Christina Vistoli (INFN/CNAF)

GARR IT2 is to provide the oportunioty to achieve competeence in advanced use of networking for R&E with an experimental testbed for adv. net services and the development of applications that need these network services. It is in cooperation with private industry research partners. Connection with US Labs is a typical request. Testbed to be provided by regional networks & service providers. It will be a dedicated network QoS capable, multi-protocol (multicast, IPv6), peering with other test beds. Only used for experiments of advanced IP services. Will focus on infra services such as QoS/DiffServ considering ATM (since largely available, often used & has much experience), leased lines (e.g. SDH/SONET) directly into routers, optical network with WDM. To do QoS have to distinguish traffic. Will experiment with RSVP over ATM (e.g. for remote control of experimental devices). Also experiment with DiffServ to DiffServ mapping and RSVP to DiffServ mapping. They will also experiment with multicast since it can reduce traffic. They want to test IPv6. The advanced network applications are video conferencing ...

ScotIX - Gordon Howell

Doesn't exist yet, i.e. vaporware. The objectives are to make it attractive to NSPs, it can be use as a colocation service center, financially self sustaning, neutrally operated, provides benefit to Scotland.

The models considred were: governemnat owned & operated (usually in developing countries); totally non-profit (e.g. Israel); non-profit core with commercial facilities; totally private for profit (needs a big market). Their choice was non-profit with commercial facilities. This was used successfully for LINX, maintains an important public remit, keeps a strategic initiative (e.g. not short term objective), adds a lot of value to the co-lo operation. Mebers will esentially be anyone with an ASN, the core exchange is a non-profit operation, right is granted by license to provide facilities to house exchange and its members, facilities operator selected by tender.

Is the tender based on biggest money gain versus quality. They chose quality "hardened" facilities, they want it to be a a location for serious international information services, a market leader, neutral, suitable location & facilities, experienced.

There are 11 sponsoring members includding BT. The market is IP peering to the N. England, London & UK ISPs for redundancy, access to the Scottish market, peering to N. Atlantic & N. Sea countries, USA.

Vienna Internet eXchange - Christian Panigl

VIX connected to ACOnet (the Austrian National network). ACOnet is connected at 10Mbps to EBONE. Also ACOnet has 34Mbps to TEN-155. For US use EBONE, but main part of from US traffic come via 8Mbps satellite link. ACOnet is not a legal entity, it is run by Vienna university.

The VIX is also just a name. It started in early 1996 with 5 members on a shared 10Base2. , in jan 1997 went to Catalyst 5000, and recently to 5509. Not for profit, best effort, lowest cost on U Vienna grounds, bilateral peering, no route server, no public statistics measurements. 99.9% availability. Member prerequistites require own IP allocations, their own AS, international Internet connectivity independent of existing members, prefixes & peering documented in RIPE database.

Charge by rack space. Have 33 ISPs, most are not Austrian (e.g. IBM, Swiss Com, EBONE, ...)

Polish Network - Stanislaw Starzak

Pol-34 has 34Mbps ATM over SDH backbone & 2 Mbps tail circuits. They had to install their own fibers so now own them since infrastructure was not in place. Now have to fight PT&Ts since it is in competition. They decided to create their own countrywide ATM infrastructure to interconnect the MANs being built up. The larger MANs interface via ATM at 155Mbps. They use VLANs for different applications. The power companies have fibers for communications which go to most of the right places for Pol-34. So Pol-34 obtained fibers from the power company Tel-Energo. They have a satellite link to Stockholm. They have plans to extend the 34Mbps links and increase the core to 155Mbps.

The VLANs address political requirements of dividing the bandwidth and separating the major applications for charging, e.g. for Polish Internet, international Internet, high power computer clusters, metacomputer applications, & multimedia appliations.

They have various services including W3 caches, FTP & news, metacomputing, multicasting, videoconferences, distributed library services, distance education, medical tele-diagnosis, & multimedia databases.

IP Bandwidth Sharing - Paul Ferguson (Cisco)

Major differences in IntServ & DiffServ is state vs. no state. RSVP requires end-system to be involved (i.e. needs to be implemented/supported in end stations such as Windows desktops), whereas DiffServ does not. RSVP has some scaling concerns when individual flows grow beyond a few hundred (or perhaps a few thousand). This may be samewhat alleviated in the near future with an RSVP reservation aggregation scheme.

DiffServ with expedited forwarding (EF) per hop behavior (PHB == local behavior at each hop) has: strict shaping to conform incoming EF traffic to available capacity; aggregate EF ingress <= % of link capacity set aside for this "service" in the core; packets get marked as EF get priority transmisson; has fairly good data protection.

DifServ AF PHB: packets are simply marked with relative priority; the service provider can interpret handling at-will; provides soft or "squishy" differentiation. QoS is simply bandwidth management.

How does this relate to hard guarantees on end-to-end delay and jitter? RSVP gives a bound on end-to-end maximal queuing times which basically bound delay for flows. Does not provide jitter control, does protect flows & guarantee bandwidth. DiffServ's EF PHB parallels the IntServ controlled load service. In the packet world jitter is an issue that is generally uncontrollable at an absolute level. A consistent queuing scheme might make it more predictable but can never guarantee it. Probably the most effective method of dealing with jitter is to adapt at the end system (e.g. RTP based monitoring).

There are 2 worlds, the global Internet and private organizational networks. PATH & RESV are not always evil depends on where you use and what you want. It is difficult to maintain a balance of economies of scale and sustain network performance.

Using ATM as a Peering Technology -

At a peering point interconnect routers by ATM switches (2 switches for fail safe). Lots of acronyms e.g. Peak Cell Rate (PCR), Cell Loss Rate (CLR), Cell Delay Variation (CDV), Committed Bit Rate (CBR), Early Packet Discard (EPD), Explicit Forward Congestion Indication (EFCI). User perception is high application goodput, low latency, min rate guarantee, access to additional bandwidth on demand, minimize WAN costs. Available Bit Rate is what is left after what is reserved for CBR and is dynamically used by Variable Bit Rate (VBR) is subtracted from the trunk bandwidth allows use of excess bandwidth in the network.

Goodput is the throughput at the application level. It is not the same as link utilization (i.e. retries/drops do not apply to goodput) it counts delivery end-to-end. To deliver goodput requires big buffers since congestion for ABR can exist for a long time. Big buffers provides better statistical multiplexing, a combination of big buffers and feedback is needed for congestion collapse.

A retransmit causes at least one round trip time delay.Big buffers however conflict with low latency for real time (e.g. video games, device control). Multiple queues allow one to penalize separately. Congestion collapse occurs when exceed 100% load and goodput falls. Congestion collapse can occur if sustained load exceeds capacity, as load increases, more traffic is dropped triggering more retransmissions. How big is big depends on RTT which in turn varies by bandwidth (for 1ms RTT then DS3 = 96bits, 20Gbps = 45184 bits, international links say RTT 250 ms at DS3 = 24000bits, 20Gbps 11,296,000bits), i.e. what matters is the bandwidth * delay product.

TCP congestion avoidance uses timeouts to indicate loss & requires ~ 2*RTT worth of buffer to keep link full. ABR network congestion is conveyed explicitly via RM (Resource Management) cells and is protocol independent, again requires 2*RTT worth of buffer to keep link full.

Ideally feedback should be to the application. If there is no feedback the cell loss affects a random user and random data. Would prefer a scheme where the discard affects the desired user, and the desired data. This requires the host to be ABR aware. So congestion must be pushed back to the edges. A way to do this is to do the congestion control via the ATM Virtual Paths (VP) or Virtual Circuits (VCs). This may be done via Switched Virtual Circuits (SVC) or Permanent Virtual Circuits (PVC) with the former being more flexible and scalable. There are also Soft PVC (SPVC) . SPVCs are re-routed by the ATM switch upon failure. PVCs provide easier for accounting than SVCs.

Between ATM switches need a routing protocol. This is PNNI (Private Network to Network Interface). It is complicated and requires a steep learning curve like IP routing.

MPLS + BGP policies to produce scaleable IP VPNs - Alain Fiocco

MPLS tags packets to classify them at the edge, then in the core the packets are forwarded using tags (as opposed to the IP address). This enables ATM switches to act as routers & creates new IP capabilities via a flexible classification.

Label Distribution Protocol (LDP) distributes the label so the core knows what to do with packets containing the label.. Inside the core the switches route the packets according to the label. These are a new class of service called Tagged Virtual Circuits (TVC).

The Tag has 20 bits for label, 3 bits for COS, TTL 8 bits, 1 bit (S) for bottom of stack. Can be used over Ethernet, PPP links, FR, ATM PVCs etc.

Using MPLS to create VPNs does not use tunnels. The goals include: provide a private Internet with security, QoS, performance, flexibility, private addressing, scalability, management & provisioning.

Security is ensured by: route distribution is done in BGP and is not spoofable (the VPN_ID is not carried in the packet); VPN isolation is guaranteed (packets from one VPN won't go into another VPN).

Inter VPN communications require going through a managed site which could be a firewall.

It appears to be very powerful but complex and will probably have a steep learning curve.

Charging/Accounting - Kevin Hoadley (UKERNA)

The background is the rapid growth in transatlantic traffic at teh same time as bandwidth costs have been falling, but the needs outstripped the costs. So the solution is to charge for the congested resource, i.e. the in-bound transatlantic traffic. The cost is $0.03/Mbyte. To do this they use Netflow & NetraMet to process flow records, with online bills (but not online payment) and a simple tariff clasification plus itemization (at a cost). The problem is how to present to customers so they can give feedback. Show utilization with delay of 30 minutes. The itemization is for the university to provide a list of adresses (e.g. Web, email, FTP, DNS, ICMP, Quake etc.), and a name to be associated with the item. Then billing is provided for each item with burrow down by department (university also provides the department). They provide discount for off hours use, discounts (e.g. for smurf attack if recognized within 10 days - this is how long they retain the data). No differential charging for different applications (e.g. do not charge different for UDP vs TCP). They do not provide a rebate for lost packets. They recover about 2/3 of the cost of the transatlantic link by this mechanism.

It is simple, but not congestion sensitive, and may not provide the right incentives to users (don't make trivial use at congested times). The simple model may not hold for long, the transatlantic costs have dropped dramatically, but WAN (JaNet) costs are falling less quickly. So the question is whether to charge for total use, should it be distance sensitive? Counting packets is relatively easy (though there is the issue of the cpu cost in the router to run Netflow especially as one goes to high speeds), fitting the data to a network & the cost model is more interesting. Unclear whether there was any change in user behavior, some institutions took this as a reason to look at what their users were doing which may have had an effect. Prior to this charging there was no contract with the customer. There can be confusion for the customer who accesses a Greek real audio station which is only accessible via the US.

March 19:	Leave Menlo Park
March 20:	Arrive London
March 22:	Visit RAL
March 22:>Arrive Amsterdam
March 23-24:	Attend peering workshop
March 25:	Leave Amsterdam, arrive Menlo Park