Using KLSUB for JOB submission on the CCD farm
Check out the DNQS Documentation maintained by CCD.
The standard offline JOB will run either INTERACTIVELY or in CCD BATCH
on any supported machine architecture. Here is how to adapt your JOB
to this style:
1) Your ax61 account comes with a file called .cshrc
to which you should append the following line:
source ~klue/.cshrc.klue
If you wish, you can simply replace your .cshrc file with
the one on the klue account.
2) Your ax61 account comes with a file called .login
to which you should append the following line:
source ~klue/.login.klue
If you wish, you can simply replace your .login file with
the one on the klue account.
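For example, both appends in steps 1) and 2) can be done in one shot from
the shell (run this once only, or you will accumulate duplicate source lines):

```shell
# Append the klue setup lines to your own dotfiles.
echo 'source ~klue/.cshrc.klue' >> ~/.cshrc
echo 'source ~klue/.login.klue' >> ~/.login
```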
3) Use the JOB structure of the file ~klue/test.job/871data.
The JOB will execute both INTERACTIVELY and in CCD BATCH. It
is also machine independent, in principle. However, in order
to run it either way, you need to specify the data file upon
which you want it to run. For example:
INTERACTIVELY: 871data -f=/e871_7/data/run.13590
IN BATCH:      klsub 871data -f=/e871_7/data/run.13590
The output from the BATCH job will be placed in the directory
from which you submitted the job and will be called something
like 871data.printer. For now, JOBs can be submitted only from ax61.
The default QUEUE for klsub submission is e871. You can specify a different
QUEUE when you submit a job using the -q= option. DNQS options should be
placed before the JOB filename and JOB options should come after the JOB
filename. For example,
klsub -q=rss -x ax61 871data -x -g -f=/bnl871/data/run.13590
Here one has requested the rss QUEUE, instead of e871. One has vetoed
QUEUEs which feed node ax61, since that machine is usually overloaded.
The JOB 871data is invoked with the -x option to turn on script tracing.
Debugging hooks are turned on for the JOB by the -g option. The JOB will
read data from the NFS-mounted partition /bnl871/data.
You can define ENVIRONMENT variables for the inside of a JOB from the
command line when you submit the JOB using a new option:
-Denvironment_variable=value
For example, you can change the JOBNAME to be different from the name
of the JOBSCRIPT which you are running. The files produced by the JOB
will have names like ${JOBNAME}.printer, etc., instead of the default
${JOBSCRIPT}.* names. Using -DJOBNAME=... lets you run several JOBs
simultaneously from the same JOB script, without conflicts among JOBs
submitted from the same directory.
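Inside the JOB script this can be arranged with the same default-value
idiom used below for TAPE00. A minimal sketch (JOBSCRIPT here is a
hypothetical stand-in for however the real script derives its own name):

```shell
# Sketch: let an externally supplied JOBNAME override the default name.
# JOBSCRIPT is a made-up stand-in for the script's own name.
JOBSCRIPT=871data
export JOBNAME; JOBNAME=${JOBNAME-$JOBSCRIPT}
echo "${JOBNAME}.printer"
```

Run with no JOBNAME in the environment this prints 871data.printer;
submitted with -DJOBNAME=pass1 it would print pass1.printer instead.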
Another use of the -D... option allows redefinition of the JOB's TAPE00
or TAPE99 variables from outside the JOB. If you submit the JOB with:
-DTAPE00=my_input_file
this will override the TAPE00 definition in the JOB script, provided
the TAPE00 line within the JOB has the new form:
export TAPE00; TAPE00=${TAPE00-"${DATAROOT}/${DATAFILE}"}
This construction only sets TAPE00 to ${DATAROOT}/${DATAFILE} provided
TAPE00 has not been previously defined.
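You can check the behavior of that idiom directly at a shell prompt
(the DATAROOT and DATAFILE values here are illustrative only):

```shell
# ${VAR-default} substitutes the default only when VAR is unset,
# so a value supplied at submission time via -DTAPE00=... wins.
DATAROOT=/e871_7/data; DATAFILE=run.13590
export TAPE00; TAPE00=${TAPE00-"${DATAROOT}/${DATAFILE}"}
echo "TAPE00=$TAPE00"
```

Run as-is this prints TAPE00=/e871_7/data/run.13590; run with TAPE00
already in the environment (as klsub's -DTAPE00 arranges) it prints
the preset value unchanged.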
The option:
-DTAPE99=/dev/null
pipes event output to the null device (throws it away), if the JOB has:
export TAPE99; TAPE99=${TAPE99-${JOBNAME}.tape99}
The sample JOBs in ~klue/dist/jobs now reflect this construction for
several variables.
One can use the -D... mechanism to send environment variables to the
Mortran of the JOB, using:
CHARACTER*256 value_string; "Here 256 is arbitrary, but must be enough"
CALL GETENV('varname',value_string);
to retrieve their values from within the Mortran.
The -D... construction works for JOBs run INTERACTIVELY and in BATCH.
The position of the option is unimportant within the klsub invocation.
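Conceptually, -Dvarname=value just plants varname=value in the JOB's
environment, where the script or the Mortran GETENV call can see it.
You can mimic the effect at a shell prompt (MYCUT is a made-up
variable name, not one the JOBs actually use):

```shell
# A variable placed in the environment of a child process is visible
# inside it, just as a -D... definition is visible inside the JOB.
MYCUT=2.5 sh -c 'echo "MYCUT inside the job is $MYCUT"'
```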
You can use the DNQS commands (qstop, qstat, qrm, qdate, quse, qtime)
to query and manage your jobs in the QUEUEs. E871 has additional
commands like "throng" which lists the status of all KLJOBs from
the E871 Collaboration.
To use the qtime or qstop tools you must have a .rhosts file that
contains a list of nodes your batch jobs might run on.
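A .rhosts file for this purpose is simply one host name per line,
optionally followed by your username on that host. The farm node names
below are hypothetical placeholders, not the actual node list:

```
ax61 myname
farm1 myname
farm2 myname
```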