Two Minute Overview of SLAC's UNIX Batch System
4-22-96

Here are a few pointers on getting started with UNIX batch computing
using the SCS compute farm. The compute farm is under the
responsibility of SCS' High Performance Computing team (aka the HPC
team). UNIX batch is controlled by a system called LSF, Load Sharing
Facility, for handling job scheduling, queues, etc. All of the batch
machines offered by SCS under LSF start with the name "pinto" though
other groups may have machines available for their own use under LSF.

NOTE: LSF is only available on systems that are so licensed -- this
includes any of the interactive compute farm machines (they start with
the prefix "morgan") and the xhosts (currently hebe, phrixus, and
ganymede, and later a set of machines with the generic name of vesta).

From time to time the HPC team in SCS gives seminars on how to use the
compute farm and an introductory document to supplement that seminar
is being prepared. But in the meantime, especially since we seem to
always have things more pressing than seminars, here are some ways to
learn more.

>>>>> IF YOU READ ANYTHING AT ALL, PLEASE READ THIS PART <<<<<

To try running a batch job, simply give the command:

    bsub hostname

This will submit a trivial batch job that consists of just the UNIX
command "hostname" (which just prints out the name of the host
computer). Try it out. You'll get a message like:

    Job <7624> is submitted to default queue .

Then you'll get output back in your mail that will look something like
this:

    Date: Mon, 22 Apr 1996 19:48:22 -0700
    From: lsbatch system
    Subject: batch job 7624: Done
    Sender: LS Batch System
    To: randym@SLAC.Stanford.EDU
    Message-id: <9604230248.AA18250@pinto22.SLAC.Stanford.EDU>
    X-Envelope-to: randym@MAILBOX.SLAC.Stanford.EDU
    Content-transfer-encoding: 7BIT

    Job was submitted from host by user .
    Job was executed on host(s) , in queue , as user .
    was used as the home directory.
    was used as the working directory.
    Started at Mon Apr 22 19:48:20 1996
    Results reported at Mon Apr 22 19:48:22 1996

    Your job looked like:

    ------------------------------------------------------------
    # LSBATCH: User input
    hostname
    ------------------------------------------------------------

    Successfully completed.

    CPU time used is: 0.3 sec.

    The output (if any) follows:

    pinto22

>>>>> EASY, RIGHT? READ ON FOR THOSE STILL WANTING MORE INFORMATION. <<<<<

1. You can use the "man" command to learn about LSF by typing
   "man lsfintro". This will in turn give pointers to other commands,
   such as "bsub", that will be useful. Also, "man lsfbatch" has useful
   information.

2. You can also try "man xlsbatch" to learn about a graphical user
   interface to much of the job submission system. The "xlsbatch"
   command runs in any X windowing environment that is also licensed
   for LSF.

3. The full LSF User's Guide in PostScript form is online at
   /usr/local/doc/Reference/lsf/lsf.userguide.ps. While it is a little
   long to print, it has lots of good information. I'd turn to it once
   you've tried a little bit of LSF and want a great deal more
   information. On the other hand, most people using LSF have never
   needed this manual.

4. The HPC team member responsible for LSF is Ed Russell, ext. 2902,
   esr@slac.stanford.edu. Ed is always interested in helping people
   use LSF or listening to suggestions about configuration changes,
   wishlist features, etc.

5. The HPC team has a mailing list, farmers@mailbox.slac.stanford.edu,
   used to send time-critical information. The "farmers" mailing list
   is used to inform various individuals and groups who have
   identified themselves as users of the SCS compute farm facility,
   both batch (machines of the "pinto" class) and interactive
   ("morgan" class). To subscribe to this list, send mail to
   listserv@SLAC.Stanford.EDU with the following in the body of the
   mail:

       subscribe farmers YourName - SLAC/YourExperiment

   substituting YourName, YourExperiment, and YourEmailAddress
   appropriately.
   The mailing list is run by a program called majordomo. More
   information about majordomo, such as how to unsubscribe and how to
   find out what lists exist and which ones you're on, can be obtained
   by sending mail to listserv with the following in the body of the
   mail:

       help

   From there you can discover other commands.

6. The HPC team also has a news group called slac.comp.computefarm.
   Items are posted there that are not time-critical in nature and/or
   fit into an environment suitable for dialog. Anything posted to the
   farmers@mailbox.slac.stanford.edu mailing list also appears on this
   news group.

I hope that is enough to get you started. After you've read a little
bit, you might want to simply try the xlsbatch command and submit a
trivial job or two, even just a single command if you wish, to get
some experience. And of course, I'm always interested in feedback.

(There is also a mechanism for dealing with large staged files on
working disk space that reside on silo cartridges. But that's another
story.)

-- Randy Melen, SCS/High Performance Computing Team,
   randym@slac.stanford.edu, x2841
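
For anyone who wants to script around the "bsub hostname" step
described above, here is a minimal sketch in Python. It only pulls the
job number out of the submission message and out of the subject of the
result mail. The function names (parse_job_id, parse_mail_subject) are
made up for illustration, and the patterns assume the message and
subject look like the samples shown in this note; other LSF versions
or configurations may word them differently.

```python
import re

# Assumed format, taken from the sample message in this note:
#   Job <7624> is submitted to default queue .
_SUBMIT_RE = re.compile(r"Job <(\d+)> is submitted")

# Assumed format, taken from the sample mail subject in this note:
#   batch job 7624: Done
_SUBJECT_RE = re.compile(r"batch job (\d+): (\w+)")


def parse_job_id(message):
    """Return the job number from a bsub submission message, or None."""
    match = _SUBMIT_RE.search(message)
    return int(match.group(1)) if match else None


def parse_mail_subject(subject):
    """Return (job number, status word) from the result mail subject, or None."""
    match = _SUBJECT_RE.search(subject)
    return (int(match.group(1)), match.group(2)) if match else None


if __name__ == "__main__":
    # Sample strings copied from the example output in this note.
    print(parse_job_id("Job <7624> is submitted to default queue ."))
    print(parse_mail_subject("batch job 7624: Done"))
```

Both helpers return None when the text does not match, so a caller
can tell "no job number found" apart from a real job number.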