Trouble Shooting Network Performance for Production Science Data Grids
Presented by Warren Matthews at CHEP’03, San Diego, March 24-28, 2003
What is the problem?
What is PIPES?
Network performance monitoring
Problem identification
Network Monitoring for the Grid
The Data Grid consists of many components that must interoperate
The resource broker must be fully informed
Measurement is required!
Internet2
End-to-end performance initiative
PI Performance Evaluation System (PIPES)
PIPES Monitoring Platform (PMP)
Overlap with goals of HENP
Tremendous resources
Package developed at SLAC
Measurement Engine
iperf, bbftp, bbcp, ping, traceroute
ABwE, OWAMP, udpmon, gridftp
Job Manager
Data storage and data server
Analysis Engine
Typical Scenario
User complains file transfer is slow
Net admin runs ping, traceroute, iperf test
Complain to upstream provider
Proactive
What do we mean by throughput?
How do we know there was a performance hit?
Our approach is diurnal changes
Too much to keep track of
Rather not wait for complaints
Automated Alarms
Rolling average à la RIPE-TT
May not be the best approach
AMP Automated Detection System
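Illustration (not from the talk): a minimal sketch of a rolling-average alarm of the kind the slide mentions. It is not the RIPE-TT or AMP algorithm; the function name, the window size, and the 50% drop threshold are all assumptions chosen for the example.

```python
from collections import deque
from statistics import mean


def rolling_average_alarms(samples, window=5, drop_fraction=0.5):
    """Return the indices of samples that fall below `drop_fraction`
    times the mean of the preceding `window` samples.

    `window` and `drop_fraction` are hypothetical defaults, not values
    taken from RIPE-TT or any other project.
    """
    history = deque(maxlen=window)  # holds the most recent `window` samples
    alarms = []
    for i, value in enumerate(samples):
        # Only compare once a full window of history is available.
        if len(history) == window and value < drop_fraction * mean(history):
            alarms.append(i)
        history.append(value)
    return alarms


# Throughput in Mbit/s (made-up numbers): one sharp drop at index 5.
throughput = [100, 100, 100, 100, 100, 40, 100]
print(rolling_average_alarms(throughput))
```

Because every sample, including a bad one, enters the average, a sustained drop stops raising alarms once it dominates the window. This may be one reason a plain rolling average "may not be the best approach".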
Could be over an hour before alarm is generated
More frequent measurements impact the network and measurements overlap
Low impact tools allow finer grained measurement
Use NWS multi-variate method
Use SCIDAC ABwE tool
Use PingER, OWAMP
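Illustration (not from the talk): a back-of-the-envelope sketch of the trade-off above between alarm latency and measurement load. All numbers and names are hypothetical. It assumes tests run one at a time, and that an alarm needs a fixed number of consecutive measurement rounds.

```python
def schedule_summary(interval_min, test_duration_min, n_targets, samples_needed):
    """Summarize a measurement schedule.

    interval_min      -- minutes between the start of successive rounds
    test_duration_min -- minutes one test occupies the link
    n_targets         -- remote hosts tested sequentially in each round
    samples_needed    -- rounds required before an alarm can fire
    All parameters are assumptions made for this example.
    """
    busy_min = test_duration_min * n_targets  # time one full round takes
    return {
        "busy_fraction": busy_min / interval_min,
        "overlaps": busy_min > interval_min,  # next round starts before this one ends
        "alarm_delay_min": interval_min * samples_needed,
    }


# Heavy tool, 30-minute rounds: no overlap, but 90 minutes to an alarm.
print(schedule_summary(30, 1, 20, 3))
# Same tool every 10 minutes: faster alarm, but rounds now overlap.
print(schedule_summary(10, 1, 20, 3))
```

A low-impact tool corresponds to a much smaller `test_duration_min`, which permits a shorter interval without overlap.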
Many monitoring projects; publish data to allow them to interoperate
MDS
EDG NM Schema
Web Services
GLUE NE Schema
GGF NMWG
Hierarchy Doc
Tools Doc
Alarm System
Multiple tools
Multiple measurement points
Trigger further measurements
Cross reference off site stats
Informant database
No measurement is ‘authoritative’
Cannot even believe a measurement
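Illustration (not from the talk): since no single measurement is authoritative, one way to read the points above is to confirm an alarm only when independent tools agree. The sketch below assumes a simple quorum rule; the function name, the data layout, and the quorum of two are hypothetical, not the design of the alarm system described here.

```python
def confirmed_alarm(flags, quorum=2):
    """Decide whether to raise an alarm from several independent checks.

    flags  -- dict mapping (tool, measurement_point) to True if that
              measurement currently looks degraded
    quorum -- number of distinct tools that must agree (an assumed value)
    Returns (raise_alarm, degraded_sources).
    """
    degraded = sorted(key for key, bad in flags.items() if bad)
    distinct_tools = {tool for tool, _point in degraded}
    return len(distinct_tools) >= quorum, degraded


# Hypothetical readings from two sites.
flags = {
    ("iperf", "siteA"): True,
    ("ping", "siteA"): True,
    ("abwe", "siteB"): False,
}
print(confirmed_alarm(flags))
```

When only one tool reports a problem, the function returns `False`. That is the point at which a system like the one outlined above would trigger further measurements instead of alerting.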
Toward a Monitoring Infrastructure
MAGGIE
Measurement and Analysis package built on NIMI/Akenti
EGEE
Production-quality Data Grid for Europe
IEPM Home Page
IEPM-BW
I2 E2E and PIPES
RIPE-TT
AMP Automated Event Detection
NWS
ABwE