1
|
- Troubleshooting Network Performance for Production Science Data Grids
- Presented by Warren Matthews at CHEP’03, San Diego, March 24-28, 2003
|
2
|
|
3
|
- What is the problem?
- What is PIPES?
- Network performance monitoring
- Problem identification
|
4
|
- The Data Grid consists of many components that must interoperate
|
5
|
- The resource broker must be fully informed
- Measurement is required!
|
6
|
- Internet2
- End-to-end performance initiative
- PI Performance Evaluation System (PIPES)
- PIPES Monitoring Platform (PMP)
- Overlap with goals of HENP
- Tremendous resources
|
7
|
- Package developed at SLAC
- Measurement Engine
- Iperf, bbftp, bbcp, ping, traceroute
- ABwE, OWAMP, UDPmon, GridFTP
- Job Manager
- Data Storage and data server
- Analysis Engine
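
The slides do not show how the Job Manager drives the Measurement Engine. As a minimal sketch, assuming each tool is started as an external command and that tests run one at a time so they never overlap, it could look like the following. The function name `run_measurements`, the job format, and the result dictionary are illustrative assumptions, not part of the SLAC package.

```python
import subprocess


def run_measurements(jobs, timeout_seconds=60):
    """Run measurement commands one after another and collect their output.

    jobs is a list of (name, argv) pairs; argv is the command line as a list.
    Running sequentially is an assumption made here to keep tests from
    competing with each other for bandwidth.
    """
    results = {}
    for name, argv in jobs:
        try:
            done = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout_seconds
            )
            results[name] = {
                "returncode": done.returncode,
                "output": done.stdout,
                "error": None,
            }
        except subprocess.TimeoutExpired:
            # The tool hung or the path is too slow; record it and move on.
            results[name] = {"returncode": None, "output": "", "error": "timeout"}
        except OSError as exc:
            # The tool is not installed or could not be started.
            results[name] = {"returncode": None, "output": "", "error": str(exc)}
    return results
```

A job list might contain an entry such as `("ping", ["ping", "-c", "4", "remote.example.org"])`, where the host name is a placeholder. The raw output would then be handed to the data storage and analysis stages.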
|
8
|
|
9
|
|
10
|
|
11
|
- Typical Scenario
- User complains file transfer is slow
- Net admin runs ping, traceroute, iperf test
- Complain to upstream provider
- Proactive
- What do we mean by throughput?
- How do we know there was a performance hit?
- Our approach: look at diurnal changes
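
The slide names diurnal changes as the approach but gives no algorithm. One way to read it, sketched here under that assumption, is to compare each new throughput measurement against the typical value for the same hour of day. The function names, the use of the median, and the `drop_fraction` of 0.5 are all illustrative choices.

```python
from collections import defaultdict
from statistics import median


def hourly_baseline(history):
    """Build {hour_of_day: median throughput} from (hour, mbps) samples."""
    by_hour = defaultdict(list)
    for hour, mbps in history:
        by_hour[hour].append(mbps)
    return {hour: median(values) for hour, values in by_hour.items()}


def is_performance_hit(hour, mbps, baseline, drop_fraction=0.5):
    """Flag a measurement that falls well below what is usual for that hour."""
    expected = baseline.get(hour)
    if expected is None:
        return False  # no history for this hour, so make no judgement
    return mbps < drop_fraction * expected


# Hypothetical history: fast at 03:00, slower at 15:00 when the link is busy.
history = [(3, 90), (3, 100), (3, 110), (15, 40), (15, 50), (15, 60)]
baseline = hourly_baseline(history)
```

With this baseline, 45 Mbit/s is unremarkable at 15:00 but counts as a hit at 03:00, which is the point of accounting for the daily cycle.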
|
12
|
|
13
|
- Too much to keep track of
- Rather not wait for complaints
- Automated Alarms
- Rolling average à la RIPE-TT
- May not be the best approach
- AMP Automated Detection System
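
A generic rolling-average alarm can be sketched as below. This is not the RIPE-TT or AMP implementation; the class name, the window of 24 samples, and the 0.7 threshold are assumptions chosen for illustration.

```python
from collections import deque


class RollingAverageAlarm:
    """Raise an alarm when a sample drops well below the recent average."""

    def __init__(self, window=24, threshold=0.7):
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def update(self, value):
        """Add a sample; return True if it is below threshold * rolling mean.

        No alarm is raised until the window has filled, so the first
        `window` samples only build up the average.
        """
        alarm = False
        if len(self.samples) == self.samples.maxlen:
            mean = sum(self.samples) / len(self.samples)
            alarm = value < self.threshold * mean
        self.samples.append(value)
        return alarm


alarm = RollingAverageAlarm(window=3, threshold=0.7)
results = [alarm.update(v) for v in [100, 100, 100, 95, 50]]
```

Note that in this sketch bad samples are added to the window too, so a sustained drop gradually pulls the average down and the alarm stops firing. That is one weakness of a plain rolling average.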
|
14
|
|
15
|
|
16
|
- Could be over an hour before alarm is generated
- More frequent measurements impact the network and measurements overlap
- Low impact tools allow finer grained measurement
- Use NWS multi-variate method
- Use SciDAC ABwE tool
- Use PingER, OWAMP
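
The trade-off above can be put into rough numbers. The sketch below assumes an alarm needs a fixed number of consecutive bad samples and that measurements are evenly spaced. The function names and the example figures are illustrative, not values taken from the slides.

```python
def detection_delay_minutes(interval_minutes, samples_needed):
    """Worst-case minutes from the start of a problem until the alarm.

    A problem that begins just after one measurement is first seen a full
    interval later, and each further confirming sample costs one more interval.
    """
    return interval_minutes * samples_needed


def measurement_duty_cycle(test_seconds, interval_minutes):
    """Fraction of the time the path carries test traffic.

    A value of 1.0 or more means consecutive tests would overlap.
    """
    return test_seconds / (interval_minutes * 60)


# Example: a 30-second test every 10 minutes, 3 bad samples to confirm.
delay = detection_delay_minutes(10, 3)
load = measurement_duty_cycle(30, 10)
```

Shortening the interval reduces the delay but raises the duty cycle in proportion, which is why low-impact tools are needed for finer-grained measurement.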
|
17
|
|
18
|
- Many monitoring projects exist; publish data so that they can interoperate
- MDS
- Web Services
- GGF NMWG
|
19
|
- Alarm System
- Multiple tools
- Multiple measurement points
- Trigger further measurements
- Cross-reference off-site statistics
- Informant database
- No measurement is ‘authoritative’
- Cannot even believe a measurement
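
Since no single measurement is authoritative, an alarm can be made to depend on agreement between tools and measurement points. The following is a minimal sketch of that idea. The function `assess`, its return values, and the thresholds are assumptions for illustration, and the site names are hypothetical.

```python
def assess(reports, min_tools=2, min_sites=2):
    """Combine per-tool, per-site verdicts into one decision.

    reports is a list of (tool, site, degraded) tuples. A single bad report
    only triggers further measurements; an alarm needs degraded reports from
    at least min_tools distinct tools and min_sites distinct sites.
    """
    bad = [(tool, site) for tool, site, degraded in reports if degraded]
    if not bad:
        return "ok"
    tools = {tool for tool, _ in bad}
    sites = {site for _, site in bad}
    if len(tools) >= min_tools and len(sites) >= min_sites:
        return "alarm"
    return "measure-again"


single = assess([("iperf", "site-a", True), ("ping", "site-a", False)])
confirmed = assess([("iperf", "site-a", True), ("ping", "site-b", True)])
```

Here `single` asks for more measurements, while `confirmed` raises the alarm because two tools at two sites agree.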
|
20
|
|
21
|
- MAGGIE
- Measurement and Analysis package built on NIMI/Akenti
- EGEE
- Production-quality Data Grid for Europe
|
22
|
- IEPM Home Page
- IEPM-BW
- I2 E2E and PIPES
- RIPE-TT
- AMP Automated Event Detection
- NWS
- ABwE
|
23
|
|