Another tool Gary has been looking at, currently in beta test, displays router statistics in Web format. There is a row of thumbnail plots for each router interface, with each plot giving the data for a different day. Each thumbnail plots the packets in, the packets out, and the sum of in + out on an hourly basis for the day. It requires Perl, Tcl, Expect, gnuplot, giftool (to make GIF images transparent) and netpbm (from Berkeley) to convert the images from PBM format to GIF format. It accesses the Cisco router statistics via telnet/Expect using a non-privileged password. Gary has made an important modification to allow the graphs to be plotted with a common scale value. He has also used similar graphical output to plot how many ports are in use on an hourly basis on their SLIP/PPP servers.
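The common-scale modification can be illustrated with a minimal sketch. This is not Gary's code (the tool itself is built on Perl, Expect and gnuplot); the data shape and the function names `hourly_sum` and `common_scale` are assumptions made only for illustration. The idea is to find one y-axis maximum across all of the days so that every thumbnail for an interface can be compared by eye.

```python
from typing import Dict, List, Tuple

# Hypothetical data shape: day label -> (hourly packets in, hourly packets out).
DaySeries = Tuple[List[int], List[int]]


def hourly_sum(packets_in: List[int], packets_out: List[int]) -> List[int]:
    """Return the in + out total for each hour."""
    return [i + o for i, o in zip(packets_in, packets_out)]


def common_scale(days: Dict[str, DaySeries]) -> int:
    """Return one y-axis maximum that fits the in, out, and sum curves of every day."""
    peak = 0
    for packets_in, packets_out in days.values():
        # With non-negative counts the sum curve is never below the in or
        # out curve, so its maximum is enough to set the scale.
        peak = max(peak, max(hourly_sum(packets_in, packets_out), default=0))
    return peak


if __name__ == "__main__":
    # Made-up two-hour samples, purely to show the call.
    sample = {
        "mon": ([10, 20], [5, 5]),
        "tue": ([100, 1], [50, 2]),
    }
    print(common_scale(sample))
```

The resulting value would then be given to the plotting program as the upper limit of the y axis for every thumbnail, instead of letting each plot autoscale.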
Gary was working with a package called Emanate from SNMP Research to implement a site-specific server MIB to monitor UDP and TCP services, in order to tell whether a server process is congested, how many clients it has, and what the IP addresses of the clients are.
I also browsed through their network/computing pages and got demos of their forms for reporting problems to the help desk, and of their Web interface for browsing what problems there are on the net now and recently, and what outages are scheduled. They use a majordomo email list for reporting scheduled outages (as does SLAC), and have built a nice Web interface to the information. They also have a Web form interface so that you can find out what has been assigned to you (or someone else). For example, it will say what workstations are assigned to you and what accounts you have (e.g. Unix, Appletalk etc.). They also have a public domain help desk management system (not Remedy and not Gnats) that is being converted to use Oracle. Finally, users can register their own devices. If an unregistered device is found on the net (by harvesting MAC addresses from bridge tables) then the responsible person is tracked down and asked to register it within two weeks. Failure to do this results in the address being blocked at the bridges. This is coupled with a move to recover network costs through charge-back based on attached devices (exceptions will be made for "local-only" devices such as printers, which will be neither visible nor accessible remotely).
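The registration check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `normalize` and `review`, the data structures, and the address formats handled are hypothetical, and the step of actually harvesting MAC addresses from the bridges is left out.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, Set, Tuple

# Grace period taken from the text: two weeks to register a device.
GRACE = timedelta(days=14)


def normalize(mac: str) -> str:
    """Lower-case a MAC address and use ':' separators (assumed input formats)."""
    return mac.replace("-", ":").lower()


def review(
    seen: Iterable[str],
    registered: Iterable[str],
    asked_on: Dict[str, date],
    today: date,
) -> Tuple[Set[str], Set[str]]:
    """Split unregistered MACs into (still pending, candidates for blocking).

    `asked_on` maps a normalized MAC to the date the responsible person was
    asked to register it; a MAC with no entry has not been followed up yet.
    """
    known = {normalize(m) for m in registered}
    pending: Set[str] = set()
    block: Set[str] = set()
    for mac in map(normalize, seen):
        if mac in known:
            continue  # registered devices need no action
        asked = asked_on.get(mac)
        if asked is not None and today - asked > GRACE:
            block.add(mac)  # deadline passed without registration
        else:
            pending.add(mac)  # not yet asked, or still within two weeks
    return pending, block


if __name__ == "__main__":
    # Made-up addresses and dates, purely to show the call.
    pending, block = review(
        seen=["00-00-0C-AA-BB-01", "00:00:0c:aa:bb:02"],
        registered=["00:00:0C:AA:BB:01"],
        asked_on={"00:00:0c:aa:bb:02": date(1996, 1, 1)},
        today=date(1996, 1, 20),
    )
    print(sorted(pending), sorted(block))
```

Keeping the "pending" and "block" sets separate reflects the two stages in the policy: first the responsible person is tracked down and asked, and only after the deadline is the address blocked at the bridges.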
Gary is very interested in moving monitoring up the chain to applications. He has been following the applications MIB activities in the IETF. He has also been looking at a free SNMP v1 and SNMP v2* package from Wes Hardaker at UC Davis, which is compatible with the Host Resources MIB for HP-UX, SunOS, Solaris and DEC Ultrix. He has it running on 150 hosts. It provides information on disk I/O and swap status. There may be some sociological implications, since systems folks may feel this is their bailiwick.