Computing at SLAC

Best Practices Using Batch

Writing Output from Batch Jobs

There are a number of different options available for writing output from batch:

  • Local space provides the best performance when writing your batch output. By local space, we mean /tmp or /scratch on the batch worker. Not all batch workers have a /scratch; you can check an individual batch worker with the command "bhosts -l" followed by the worker's host name. Look for lines like the following under "CURRENT LOAD USED FOR SCHEDULING:"
                 lammpi_load scratch
     Total               0.0    38.0
     Reserved             -      0.0
    The entry 38.0 under scratch tells you how much /scratch space is currently available, in GB.

  • NFS space can be used for writing output from batch jobs, but every write has to go across the network, which is slower than writing locally. Another problem with NFS is the severe impact when the filesystem becomes full or nearly full. An NFS server can become unresponsive if numerous batch jobs are trying to write to an already full filesystem, because the server has to spend its time checking the filesystem and then denying each request. Since NFS servers often export a number of filesystems, users of filesystems that are not full will also suffer: it is the server that is the victim, not just the full filesystem.

  • AFS space is usually NOT a good place to write from batch. AFS copes poorly when multiple machines are all reading and writing the same file at the same time. The same goes for changing the status of entries in a directory. Every time the file or directory is changed, each system that is trying to read it needs to recontact the server for the update. This puts quite a load on the AFS server and can slow it down, with a noticeable impact on all users: interactive sessions will appear to hang, and in severe circumstances users will see "Lost contact with fileserver xxxx" messages. If it is desirable to store results in AFS, it is best to write locally to /tmp or /scratch space and then copy the results into AFS, once, at the end of the job.
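
The "write locally, then copy once" advice in the last item can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a SLAC-provided tool: the function names pick_local_dir and run_with_local_staging, the "batchjob_" directory prefix, and the produce_output callback are all hypothetical, and final_dir stands for whatever AFS or NFS directory the results should end up in.

```python
import os
import shutil
import tempfile


def pick_local_dir(candidates=("/scratch", "/tmp")):
    """Return the first candidate that exists and is writable on this worker."""
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    # Fall back to the system default temp directory (an assumption).
    return tempfile.gettempdir()


def run_with_local_staging(produce_output, final_dir, local_root=None):
    """Write results locally, then copy them to final_dir once, at the end.

    produce_output is a hypothetical callback that receives the local
    working directory and writes all of its files there.
    """
    local_root = local_root or pick_local_dir()
    # A private per-job directory avoids collisions with other jobs.
    work_dir = tempfile.mkdtemp(prefix="batchjob_", dir=local_root)
    try:
        produce_output(work_dir)  # every intermediate write stays local
        os.makedirs(final_dir, exist_ok=True)
        copied = []
        # Only top-level regular files are copied in this sketch.
        for name in sorted(os.listdir(work_dir)):
            src = os.path.join(work_dir, name)
            if os.path.isfile(src):
                shutil.copy2(src, os.path.join(final_dir, name))
                copied.append(name)
        return copied
    finally:
        # Free the local space for the next job, even if the job failed.
        shutil.rmtree(work_dir, ignore_errors=True)
```

With this arrangement the remote filesystem sees one copy per output file at the end of the job, rather than a stream of small writes while the job runs.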

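The /scratch check described in the first item can also be automated. The sketch below pulls the "Total" value of the scratch column out of text shaped like the sample shown there. It assumes the column header line contains the word "scratch" and that the "Total" row has one whitespace-separated value per column; real "bhosts -l" output may differ between LSF versions, and the function name scratch_available_gb is hypothetical.

```python
def scratch_available_gb(bhosts_text):
    """Return the 'Total' value in the scratch column, or None if absent."""
    lines = bhosts_text.splitlines()
    for i, line in enumerate(lines):
        columns = line.split()
        if "scratch" not in columns:
            continue
        # Position of the scratch column within this header line.
        index = columns.index("scratch")
        # The first Total row after the header belongs to it.
        for row in lines[i + 1:]:
            fields = row.split()
            if fields and fields[0] == "Total":
                values = fields[1:]  # drop the row label
                if index >= len(values):
                    return None
                try:
                    return float(values[index])
                except ValueError:
                    # Non-numeric entry such as "-"; treat as unknown.
                    return None
        return None
    return None
```

The input text could come from running the command with Python's subprocess module on a machine where LSF is installed; a job could then fall back to /tmp when the function returns None or a value too small for its output.
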
Owner: Renata Dart