Use cases for Xrootd.
---------------------

Comments are marked inline with "===>".

Admin should be able to stop the server remotely (a la oostopams).
Admin should be able to audit the server (get its state and debug info) remotely, i.e. via a call to the server.
Admin should be able to turn on debugging remotely, i.e. via a call to the server.
Admin should be able to read the server's log file(s) remotely, via a call. Log files include the main log file, error messages, trace output, and pid file.
Server should be able to log its host's and its process's cpu and memory utilization and other useful parameters.
===> Remote administration is essential in a distributed environment.

Admin should be able to dump/load the server's configuration remotely.
===> 24/7 availability is essential. Stopping 1000+ clients for some simple task like reconfiguration is bad bad bad.

Admin should be able to give a signal to dlb to rescan the file system for new/gone files.

Xrootd and dlb should coexist on the same host with ams(es), so a host can be used for both objy and root data.
===> It's very difficult to move objy data around, and not possible at all without disrupting access for hours. Therefore the aim should be not to phase out objy servers, but to deploy the same servers for root data, considering of course overall load and file commitment.

Load balancing should not be applied to files not yet backed up (typically, in the user's production environment). (*)

Dlb should be able to distinguish whether a file is closed or still active (may be written into later). If the file is active, it must not be staged anywhere else, even for r/o access. (*)

Data Management use cases (wrt xrootd)
--------------------------------------

Admin wants to check whether a file is on disk, and on which host(s), and/or in hpss.
Admin wants to set a pool of hosts for readonly data.
Admin wants to set a pool of hosts for users' production data. This is totally separate from the readonly hosts.
Admin wants to dynamically add a host to, or remove a host from, the readonly or write pools.
Admin wants to disable access to certain data sets, should the need arise. This means tcl files should not be generated for user jobs.
?? Other ways to prevent a user from accessing some data ?? A la inhibit ?? Via Xrootd ??

Disaster use cases.
-------------------

DLB should be dynamically and remotely configurable not to redirect requests to specific hosts, either forever or for a specified time.
Xrootd should not stop working if hpss goes down.
When a new file is created, Xrootd should not check whether it is in hpss or not.
===> In the new model, files are typically combined into large files before archiving, and each job creates a unique file.
When a data host is down, xrootd should automatically avoid this host. It should report to the administrator, via some messaging mechanism, that the host is down.
When a file system on a host crashes, xrootd should automatically recover. It should report that the FS is down.
DLB should check the xrootd "health" of a data server and its filesystems as part of the load measure, and should report if it finds something wrong.
Anything that prevents xrootd or dlb from doing its job, like network problems or afs troubles, should be reported.
Reporting: dlb should be able to send messages to some other application for further error handling.
===> Reporting error conditions in a timely manner is essential. It doesn't make a lot of sense to build another monitoring system if dlb is already doing so.

Recovery use cases:
-------------------

Admin should be able to close file descriptors selectively, or all of them.
===> Used for substituting files on disk, like during conditions sweeps.
Admin should be able to ask Xrootd to re-stage a file from the mass storage.
===> In case the file is corrupted.

Testing use case
----------------

Dlb sensors should be able to simulate various load conditions on a host in order to test its functionality.
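
The testing use case above could be approached roughly as sketched below. This is only an illustration, not the actual dlb sensor interface: the names LoadSample, SimulatedSensor, PROFILES and pick_host, the three load profiles, and the "lowest peak utilization wins" rule are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import Mapping, Protocol


@dataclass(frozen=True)
class LoadSample:
    cpu: float     # fraction of cpu in use, 0.0 to 1.0
    memory: float  # fraction of memory in use, 0.0 to 1.0


class LoadSensor(Protocol):
    """Anything that can report a load sample (real or simulated)."""

    def sample(self) -> LoadSample: ...


# Hypothetical load profiles: (lowest, highest) utilization to report.
PROFILES = {
    "idle": (0.0, 0.1),
    "busy": (0.5, 0.7),
    "overloaded": (0.9, 1.0),
}


class SimulatedSensor:
    """Reports scripted load instead of measuring the host."""

    def __init__(self, profile: str, seed: int = 0) -> None:
        if profile not in PROFILES:
            raise ValueError("unknown load profile: " + profile)
        self._low, self._high = PROFILES[profile]
        self._rng = random.Random(seed)  # seeded, so test runs are repeatable

    def sample(self) -> LoadSample:
        return LoadSample(
            cpu=self._rng.uniform(self._low, self._high),
            memory=self._rng.uniform(self._low, self._high),
        )


def pick_host(sensors: Mapping[str, LoadSensor]) -> str:
    """Stand-in for the balancer: choose the host with the lowest peak load."""

    def peak(host: str) -> float:
        s = sensors[host].sample()
        return max(s.cpu, s.memory)

    return min(sensors, key=peak)


if __name__ == "__main__":
    hosts = {
        "hostA": SimulatedSensor("overloaded"),
        "hostB": SimulatedSensor("idle"),
    }
    print(pick_host(hosts))
```

Because the balancer only sees the sample() call, a simulated sensor can be swapped in for a real one on a given host without touching the selection logic, which is the point of the use case.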