Maintaining the Clustering Hint Server (CHS)
Starting CHS
Run source startCHS.
It produces some debug output about starting transactions and similar activity. If a CHS is already running against your federation, it prints the message:
Unable to get exclusive server lock
Before doing anything else, check whether the server is really running (see "Checking if CHS is running" below).
Checking if CHS is running
Run source pingCHS.
It should report either "server is running" or "server is not running".
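For scripted checks, the two replies quoted above can be turned into a yes/no answer. The following is a minimal sketch only: the function name chs_is_running is hypothetical, and it assumes the output of pingCHS has already been captured as a string and contains one of the two quoted phrases.

```python
from typing import Optional


def chs_is_running(ping_output: str) -> Optional[bool]:
    """Classify captured pingCHS output (hypothetical helper).

    Returns True or False for the two documented replies, and None when
    neither phrase appears, so that unexpected output is never mistaken
    for a healthy server.
    """
    text = ping_output.lower()
    # Check the negative phrase first so the two cases stay unambiguous.
    if "server is not running" in text:
        return False
    if "server is running" in text:
        return True
    return None
```

A None result means the output matched neither phrase and should be read by a person before any further action is taken.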
Initialization of CHS
Initialization produces messages like the following, echoing each command as it is executed (the whole procedure may take up to 20 minutes):
=====> : Stream20 esd Stream20-BaBar
/objy/databases/chs_rep3/scripts/./../bins/BdbClustHintConsole /objy/databases/chs_rep3/scripts/./../chs.ior -initGroup default g Stream20 esd Stream20-BaBar 0
Stopping CHS
Run source stopCHS.
It should print no message when the server stops cleanly. Sometimes CHS cannot exit because some containers are still in use. In that case the error message looks like this:
Error: %s(): Server is not ready to exit (some containers are in use).
Try running with try2Recover flag
In this case, check for locks on the federation and run recovery (see below). If nothing helps, kill the process.
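The decision described above can be sketched as a small helper for use in a wrapper script. This is an illustration under assumptions: the function name and the returned labels are hypothetical, and the output of stopCHS is assumed to have been captured as a string.

```python
def next_step_after_stop(stop_output: str) -> str:
    """Suggest a follow-up action from captured stopCHS output (hypothetical helper)."""
    text = stop_output.strip()
    if not text:
        # No message at all is the documented sign of a clean stop.
        return "stopped"
    if "Server is not ready to exit" in text:
        # Containers are still in use: check federation locks, then run recovery.
        return "check locks and run recovery"
    # Anything else is not covered by this page and needs a manual look.
    return "inspect output"
```

Killing the process is deliberately not one of the returned labels; it stays a manual last resort.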
Recovery of used containers
Run source recoverCHS.
On success it prints no message. On some occasions, such as a deadlock or another code failure, recovery may hang. If it hangs, kill it and restart CHS.
The BdbClustHintConsole utility provides low-level control of CHS. Make sure you know what you are doing before using it. It lets you enable or disable automatic precreation; close active databases; delete, initialize, and close a group; list databases and groups; precreate databases; print metadata; run recovery; and stop CHS.
See BdbClustHintConsole -help for exact syntax.
Looking for problems
Check the size of the CHS process in memory. Consider restarting CHS on a weekly or biweekly basis to limit the impact of memory leaks.
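One way to watch the memory size is to read the resident set size that ps reports. The sketch below is an assumption-laden illustration: the helper names and the limit are hypothetical, the process ID of CHS must be supplied by the operator, and it assumes a ps that accepts -o rss= and reports kilobytes, which is common but not guaranteed on every platform.

```python
import subprocess


def parse_rss_kb(ps_output: str) -> int:
    """Parse the single number printed by `ps -o rss= -p <pid>`."""
    return int(ps_output.strip())


def resident_kb(pid: int) -> int:
    """Ask ps for the resident set size of one process (assumed to be in KB)."""
    result = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return parse_rss_kb(result.stdout)


def should_restart(rss_kb: int, limit_kb: int) -> bool:
    """True when the process has grown past an operator-chosen limit."""
    return rss_kb > limit_kb
```

The limit is a site decision; this page gives no recommended value.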
Error lines in the log file contain the token "ERR"; grep for it to find CHS errors. It also helps to grep for "rror", which catches Objectivity and other errors. If the problem is with precreation, look in /tmp on the host where precreation ran; the log file is stored there.
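The two searches above can be combined into a single pass over the log. This is a minimal sketch: the function name is hypothetical, and the caller supplies the log lines because the log file name is not given here.

```python
import re
from typing import Iterable, List, Tuple

# "ERR" marks CHS errors; "rror" matches both "Error" and "error".
ERROR_PATTERN = re.compile(r"ERR|rror")


def find_error_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) for every line matching either token."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if ERROR_PATTERN.search(line)
    ]
```

To use it, open the log file and pass the file object as lines; the line numbers in the result make it easy to jump to the surrounding context.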
Page Owner: Jacek Becla
Last Update: June 13, 2002