
Batch system in a nutshell


SLAC's batch system runs on the SCS UNIX compute farm and is based on LSF (Load Sharing Facility).

13 Feb 2006


Map:

  • bsub command
  • bjobs command
  • Command summary of LSF batch system
  • Examples of LSF batch system
  • Batch Exit Codes
  • More information, help for LSF batch system

    bsub command

    Introduction:

       Submits a command to the batch system.
    

    Syntax:

       bsub [options]   command [argument]
    

    Major Options:

         -c <hh:mm>      [amount of CPU time]
         -q <queue>      [job queue] 
    
    

    Minor Options:

         -J <jobname>    [specify job name]
         -m <host>       [run job on this machine]
         -R <resource>   [run job on this resource]
    
    

    Execution Options:

         -E <command>    [specify pre-run command]          
         -L <shell>      [specify a login-shell]          
         -nr             [job is not rerunnable from the beginning or the last checkpoint]
         -r              [job is rerunnable from the beginning or the last checkpoint]
    
    

    I/O Options:

         -i <infile>     [specify standard input file]
         -o <outfile>    [specify standard output file]
         -e <errfile>    [specify standard error file]
    
    

    Example of bsub:

        bsub -q bldrecoq -m build02 gmake all
        bsub -q bldrecoq -m build02 ls -la /u1/drjohn/bfdist/releases/nightly
        bsub -q bldrecoq -m build02 'ls -la /u1/drjohn/bfdist/releases/nightly/DbiEvent/*'
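The options listed above can be combined in a single submission. The following is a minimal sketch, not a site-approved script: submit_job is a hypothetical helper name, the output file names are an assumed convention, and the DRY_RUN variable is introduced here only so the command line can be inspected without an LSF installation. Only the -q, -c, -J, -o and -e options described above are used.

```shell
#!/bin/sh
# Sketch: assemble a bsub command line from the options described above.
# submit_job and DRY_RUN are hypothetical names used for illustration.

submit_job() {
    queue=$1      # -q: job queue
    cpu=$2        # -c: amount of CPU time, hh:mm
    name=$3       # -J: job name
    shift 3       # the remaining arguments are the command to run

    # Send standard output and standard error to files named after the job.
    set -- bsub -q "$queue" -c "$cpu" -J "$name" \
        -o "$name.out" -e "$name.err" "$@"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"     # dry run (the default): only print the command line
    else
        "$@"          # real submission; requires LSF on this host
    fi
}

# Prints the bsub command that would build a test release in queue bldrecoq.
submit_job bldrecoq 00:30 nightly gmake all
```

Setting DRY_RUN=0 in the environment would make the helper run bsub instead of printing the command.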
    
    

    bjobs command

    Introduction:

       Queries the status of jobs in the batch system.
    

    Syntax:

         bjobs [options]
    

    Major Options:

         -u <user>        [specify user; "all" means all users]
    
    

    Minor Options:

         -a               [all jobs]
         -l               [long form]
    
    

    Example:

         bjobs            [query my jobs in the batch queue]
         bjobs -u mark    [query all jobs submitted by user mark]
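When many jobs are queued, it can help to summarize the bjobs listing by job state. The sketch below assumes the default bjobs layout, in which a header line is followed by one line per job and the third column holds the state (for example RUN or PEND); check this against the output on your system. count_by_status is a hypothetical helper, and the sample listing is invented for illustration.

```shell
#!/bin/sh
# Sketch: count jobs per state from bjobs-style output on standard input.
# Assumes one header line and the job state in the third column.

count_by_status() {
    awk 'NR > 1 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}

# Invented sample text standing in for real bjobs output.
count_by_status <<'EOF'
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
388999  mark    RUN   bldrecoq   build01     build02     gmake      Feb 13 10:02
389000  mark    PEND  bldrecoq   build01                 gmake      Feb 13 10:05
389001  mark    PEND  bldrecoq   build01                 ls         Feb 13 10:06
EOF
```

On a host with LSF, the same helper could be fed live data with: bjobs -u mark | count_by_status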
    
    

    Command summary of LSF batch system

    Major batch queue commands:

         bkill    [kill batch jobs.]
         bsub     [submit a job for batched execution.]
         bmod     [modify the parameters of a submitted job.]
    

    Minor batch queue commands:

         bacct    [generate accounting information about batch jobs.]
         bchkpnt  [checkpoint batch jobs.]
         bmig     [migrate a job.]
         brestart [restart a job from its checkpoint files.]
    

    Suspend/resume commands:

         bbot     [move a pending job to the bottom (end) of its queue.]
         bresume  [resume suspended batch jobs.]
         bstop    [suspend batch jobs.]
         bswitch  [switch pending jobs from one queue to another.]
         btop     [move a pending job to the top (beginning) of its queue.]
    

    Query commands:

         bjobs    [display the status and other information about batch jobs.]
         bqueues  [display the status and other information about batch job queues]
         bhosts   [display the status and other information about batch server hosts]
         bhpart   [display information about batch host partitions]
         busers   [display information about Batch users]
         bugroup  [display the user group names and their memberships]
         bmgroup  [display the host group names and their memberships]
         bparams  [display the info about the configurable system parameters]
         bpeek    [display the stdout and stderr output produced so far by a batch job]
         bhist    [display the processing history of batch jobs.]
    

    Examples of LSF batch system:

    General examples:

         bsub -c00:30 gmake all       [build test release]
         bjobs                        [find my batch job]
         bkill 388999                 [kill this job]
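The submit, query, kill sequence above needs the job ID that bsub reports. The sketch below extracts it from the reply line, assuming bsub answers in the form "Job <388999> is submitted to queue <bldrecoq>."; that wording is an assumption about the local LSF version, and job_id_from_reply is a hypothetical helper. The reply text here is a sample, and the kill command is only printed, not run.

```shell
#!/bin/sh
# Sketch: pull the job ID out of a bsub reply line read from standard input.
# Assumes the reply starts with "Job <NNN>"; prints nothing otherwise.

job_id_from_reply() {
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Sample reply text, not captured from a real submission.
reply='Job <388999> is submitted to queue <bldrecoq>.'
id=$(printf '%s\n' "$reply" | job_id_from_reply)

# Print the command that would kill the job instead of running it.
echo "bkill $id"
```

On a host with LSF, the reply could come from a real submission, for example: id=$(bsub -c00:30 gmake all | job_id_from_reply)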
    
    

    Use specific host:

         bqueues -m <host>            [which queues support this machine]
         bsub -q <queue> -m <host> <commands..>    [run on this machine]
    
         [Note]: selecting the host with -m does not currently work
         for build10. Use the following instead:
          bsub -q <queue> -R sol7 <commands..>    [run on build10]
    
    

    More information or help for LSF system:

    For help or more information about the LSF batch system, see the web page "High Performance Computing at SLAC", provided by SCS.


    Maintained by Terry Hung. Send suggestions and additions to
    terryh@slac.stanford.edu 650-926-3618