Using edg to run a BaBar job on the grid

This page provides the hints needed to run a simple BaBar job on the grid. We assume that the user is logged on at SLAC; the instructions are easily adapted to any other site.

This project is evolving rapidly, so expect some changes to occur soon.

The EDG setup instructions come mainly from Gilbert.

Main topics:

  1. How to set up your EDG environment
  2. How to run a simple Beta application 
  3. How to run a more complicated application using a Storage Element and the Replica Catalog

Prerequisites

You need a valid certificate issued by a recognized certification authority, and you must have registered that certificate with the BaBar Virtual Organization (VO). The following link gives instructions to perform these preliminary steps.
 

Protecting your private key

Your private key (userkey.pem) should be stored in such a way that it cannot be stolen by someone else. To do so you need to do the following:
  • Create a ~/private/.globus directory. Because it is under ~/private, only you can read it.
  • Copy userkey.pem into ~/private/.globus
  • Create a soft link from ~/.globus/userkey.pem to ~/private/.globus/userkey.pem
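The three steps above can be sketched as a small shell function. This is only an illustration: the function name protect_key and the scratch directory are hypothetical, the key is moved rather than copied (an assumption, so that no unprotected copy is left behind), and the chmod is an extra precaution not mentioned above. The demonstration runs on a dummy file, not on a real key.

```shell
#!/bin/sh
# protect_key HOME_DIR (hypothetical helper): move userkey.pem under
# HOME_DIR/private/.globus and leave a symbolic link where globus expects it.
protect_key() {
    home_dir=$1
    mkdir -p "$home_dir/private/.globus"
    # Assumption: move instead of copy, so no second copy of the key remains.
    mv "$home_dir/.globus/userkey.pem" "$home_dir/private/.globus/userkey.pem"
    # Extra precaution (not in the steps above): owner read-only.
    chmod 400 "$home_dir/private/.globus/userkey.pem"
    ln -s "$home_dir/private/.globus/userkey.pem" "$home_dir/.globus/userkey.pem"
}

# Demonstration in a scratch directory with a dummy key file.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/.globus"
echo "dummy key" > "$demo_home/.globus/userkey.pem"
protect_key "$demo_home"
ls -l "$demo_home/.globus/userkey.pem"
```

On a real account one would call protect_key "$HOME" once, after checking that ~/private is indeed readable only by the owner.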

Setting up the edg environment

The EDG User Interface (UI) is now correctly set up on every noric machine.
  • Create a directory edg-test in your home directory ($HOME/edg-test is used in the examples below)
  • Create a directory /nfs/babar/grid01/the_first_letter_of_your_logonID/your_logonID
  • Create a link from ~/.globus/.gass_cache to /nfs/babar/grid01/the_first_letter_of_your_logonID/your_logonID/.gass_cache. This directory is used internally by globus/edg
  • Define an alias (csh/tcsh syntax): alias edgUI_env 'source /afs/slac.stanford.edu/package/bbrgrid/edgUI-env.csh'
  • Execute this alias each time you connect to SLAC and want to use edg
Before running any globus or EDG command, you need to obtain a valid proxy with "grid-proxy-init".
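As an illustration of the directory naming rule above, the following sketch builds the per-user path from a logon ID. The helper name grid_dir is hypothetical, and the commands that actually create the directory and the link are left as comments because they only make sense on a noric machine.

```shell
#!/bin/sh
# grid_dir LOGON_ID (hypothetical helper): print the per-user grid directory,
# i.e. /nfs/babar/grid01/<first letter of logon ID>/<logon ID>.
grid_dir() {
    logon_id=$1
    first_letter=$(printf '%s' "$logon_id" | cut -c 1)
    printf '/nfs/babar/grid01/%s/%s\n' "$first_letter" "$logon_id"
}

target=$(grid_dir boutigny)
echo "$target"

# On a noric machine one would then run, for example:
#   mkdir -p "$target/.gass_cache"
#   ln -s "$target/.gass_cache" "$HOME/.globus/.gass_cache"
```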

A simple EDG application

This section explains how to set up a very basic EDG application to run a Beta job. The executable is sent to the remote site, and the log files and the output n-tuples are transferred back to the submission site, through the Resource Broker.

Building a Beta application

  • Install a test release in the usual way, replacing "boutigny" by your name and the release number by any suitable release
    newrel -t 12.2.1 12.2.1 -s $BFROOT/work/b/boutigny
    cd 12.2.1
    srtpath 12.2.1 Linux24
    addpkg BetaUser
    addpkg workdir
    gmake workdir.setup
    setenv BetaBdbMicro yes (for Objy; use "setenv BetaKanga yes" instead for Kanga)
    setenv BetaPhysMicro yes
    setenv PhysMicro yes
    setenv PhysAll yes
    setenv BetaRootTuple yes (only if you want to produce ROOT-tuples instead of HBOOK)
    gmake BetaUser.all
  • Create a suitable tcl file in workdir: test.tcl
  • Do a dry run of your executable in order to create a dumped tcl file (dumped.tcl). This now works, thanks to hard work by Asoka (see: http://babar-hn.slac.stanford.edu:5090/HyperNews/get/BaBarGrid/50.html )
  • In release 12.2.2 you need the following tags:
    Framework ads07Aug02
    RooModules ads07Aug02
    BdbModules ads30Jul02
  • In release 12.4.0 no tags are necessary 
  • Then type:
    physboot (for Objy)
    BetaApp
    ev dumpOnBegin dumped.tcl
    source test.tcl
    exit

  • If you wish you can now copy the BetaApp executable and the dumped.tcl file to another directory, ${HOME}/edg-test/Objy in the current example.
  • By default, the "input add collection name" command in BdbEventInput checks that the collection exists in the current federation. In a real grid job one should of course not assume that the input collection exists locally. To turn off this existence check, use the -novalidate option: "input add -novalidate collection name"

Preparing the edg job

  • In the edg-test directory, you need to create a wrapper shell script Beta.sh containing the following lines:
    #!/bin/csh
    set pwd = `pwd`;
    cd $BFDIST/releases/12.2.1
    srtpath 12.2.1 Linux24
    physboot
    cd $pwd
    ln -s $BFDIST/releases/12.2.1 PARENT
    BetaApp dumped.tcl

     
    • The link is necessary for the Beta job to retrieve some data files. 
    • This script assumes that the $BFDIST variable is set correctly on the remote host
    • The physboot step sets LD_LIBRARY_PATH correctly for Objy
  • Still in the edg-test directory you need to create a file Beta.jdl in Job Description Language in order to give all the necessary instructions to edg to run the application
      • Executable = "Beta.sh";
        InputSandbox = {"$HOME/edg-test/Beta.sh","$HOME/edg-test/Objy/BetaApp","$HOME/edg-test/Objy/dumped.tcl"};
        StdOutput = "result.out";
        StdError = "result.err";
        OutputSandbox = {"result.out","result.err","framework.hbook"};
        Rank = other.MaxCpuTime;
        Requirements  = Member ( other.RunTimeEnvironment , "BABAR-Test-slac-01");
        Environment = {"BetaBdbMicro=yes","BetaPhysMicro=yes","PhysMicro=yes","PhysAll=yes"};
         
      • The BABAR-Test-slac-01 target corresponds to the SLAC Computing Element (CE); "CCIN2P3" targets IN2P3, and simply "BABAR" targets the UK sites at large.
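Since only the RunTimeEnvironment target changes from one site to another, the JDL file can be generated from a template. The sketch below is one possible way to do this with a here-document. The helper name write_jdl and its arguments are hypothetical; the JDL lines themselves are the ones listed above. The sandbox directory is passed in single quotes so that $HOME stays literal in the file, as in the original JDL.

```shell
#!/bin/sh
# write_jdl OUTFILE SANDBOX_DIR TARGET (hypothetical helper): write a JDL file
# for the Beta job, with TARGET as the required RunTimeEnvironment tag.
write_jdl() {
    outfile=$1
    sandbox_dir=$2
    target=$3
    cat > "$outfile" <<EOF
Executable = "Beta.sh";
InputSandbox = {"$sandbox_dir/Beta.sh","$sandbox_dir/Objy/BetaApp","$sandbox_dir/Objy/dumped.tcl"};
StdOutput = "result.out";
StdError = "result.err";
OutputSandbox = {"result.out","result.err","framework.hbook"};
Rank = other.MaxCpuTime;
Requirements = Member ( other.RunTimeEnvironment , "$target");
Environment = {"BetaBdbMicro=yes","BetaPhysMicro=yes","PhysMicro=yes","PhysAll=yes"};
EOF
}

# Demonstration: write the SLAC variant to a temporary file and display it.
jdl_file=$(mktemp)
write_jdl "$jdl_file" '$HOME/edg-test' BABAR-Test-slac-01
cat "$jdl_file"
```

Calling write_jdl with "BABAR" as the last argument would produce the variant aimed at the UK sites.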

    Submitting the job

  • Job submission is done with the command:
      • dg-job-submit Beta.jdl
    • This will return a URL which is a unique job ID
    • If you are not using the default Resource Broker (RB) you will have to provide a special configuration file and pass it to dg-job-submit with the -config parameter. The configuration file can be built from $EDG_LOCATION/etc/UI_Config_ENV.cfg
    • You can then check your job status with:
        • dg-job-status 'URL'
      • Pay attention to the mandatory quotes
    • Once you get the status "Output Ready" you can retrieve the output sandbox with:
        • dg-job-get-output 'URL'
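The status check can be repeated automatically until the job is finished. The following is a minimal sketch of such a polling loop, not a tested EDG tool. It assumes that the text "Output Ready" appears verbatim in the output of the status command. The names wait_for_output and fake_status are hypothetical, and fake_status is only a stand-in that lets the loop be tried without a grid connection.

```shell
#!/bin/sh
# wait_for_output MAX_POLLS INTERVAL COMMAND [ARGS...] (hypothetical helper)
# Runs COMMAND until its output contains "Output Ready" or MAX_POLLS attempts
# have been made. Returns 0 if the status was seen, 1 otherwise.
wait_for_output() {
    max_polls=$1
    interval=$2
    shift 2
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        polls=$((polls + 1))
        if "$@" 2>&1 | grep -q "Output Ready"; then
            return 0
        fi
        sleep "$interval"
    done
    return 1
}

# Stand-in for dg-job-status, for demonstration only: it reports the job as
# ready on the third call. Its text is a placeholder, not real EDG output.
state_file=$(mktemp)
echo 0 > "$state_file"
fake_status() {
    calls=$(cat "$state_file")
    calls=$((calls + 1))
    echo "$calls" > "$state_file"
    if [ "$calls" -ge 3 ]; then
        echo "Output Ready"
    else
        echo "not ready yet"
    fi
}

wait_for_output 5 0 fake_status
result=$?
echo "exit code $result after $polls polls"
```

On the UI machine the stand-in would be replaced by the real command, for example wait_for_output 60 60 dg-job-status "$JOB_URL" followed by dg-job-get-output "$JOB_URL", where JOB_URL is a hypothetical variable holding the URL returned by dg-job-submit.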


    If things go wrong, it may be useful to check the status of the Resource Broker (RB).

    A More Complex EDG Application

    Here we describe a more sophisticated application using the Storage Element (SE) and the Replica Catalog (RC)

    Preparing the BaBar job

    The BaBar executable is prepared in the same way as above, using a recent release (12.4.0c). Let's assume that we have:

    • An executable: "MyAnalysisApp"
    • An "all.tcl" tcl file generated with the Framework "ev dumpOnBegin" command

    Goal

    We want to be able to do the following:
    • Copy the executable to a Storage Element
    • Record it in the Replica Catalog with a Logical File Name (LFN)
    • Submit the job through the RB
    • The RB selects a Computing Element (CE) and checks whether the executable is available on the closest SE
    • If yes, the job is submitted to the CE
    • The job produces an n-tuple
    • The n-tuple is copied to the SE and the entry is recorded in the RC

    Recording the executable on the RC

    At the moment we are using an RC maintained in Manchester by Alessandra Forti.
    To use the RC you need an RC configuration file like the following:
    RC_REP_CAT_MANAGER_DN=cn=Manager,dc=gridpp,dc=ac,dc=uk
    RC_REP_CAT_MANAGER_PWD=xxxxxxxx (*)
    RC_REP_CAT_URL=ldap://bfb.hep.man.ac.uk:9011/rc=BaBarReplicaCatalog,dc=gridpp,dc=ac,dc=uk
    RC_LOGICAL_COLLECTION=ldap://bfb.hep.man.ac.uk:9011/lc=file collection,rc=BaBarReplicaCatalog,dc=gridpp,dc=ac,dc=uk

    (*) Please ask for the password

    Define an environment variable RC_CONFIG_FILE containing the path name of this file.

    Find the address and the mount point of the SE where you want to record the executable. This information can be obtained with an LDAP query to the site gatekeeper. For instance, targeting IN2P3:

    ldapsearch -x -H ldap://ccgridli08.in2p3.fr:2135 -b 'Mds-Vo-name=local,o=grid' objectclass=CloseStorageElement CloseSE MountPoint
    where ccgridli08.in2p3.fr is the name of the GateKeeper

    This command returns information like the following:

    .....
    # ccgridli07.in2p3.fr, ccgridli08.in2p3.fr:2119/jobmanager-bqs-A, ccgridli08.in2p3.fr, local, Grid
    dn: closeSE=ccgridli07.in2p3.fr,ceId=ccgridli08.in2p3.fr:2119/jobmanager-bqs-A,hn=ccgridli08.in2p3.fr,Mds-Vo-Name=local,o=Grid
    CloseSE: ccgridli07.in2p3.fr
    MountPoint: /edg/StorageElement/prod

    .....

    So we see that the name of the SE closest to ccgridli08.in2p3.fr is ccgridli07.in2p3.fr and that the file system mount point is: /edg/StorageElement/prod
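Reading these two values by eye works for a single site, but they can also be extracted automatically. The sketch below parses the CloseSE and MountPoint lines with awk and builds a gsiftp URL from them, using the babar/boutigny subdirectory of this example. The helper name get_attr is hypothetical, and the sample text is copied from the output shown above rather than obtained from a live query.

```shell
#!/bin/sh
# Sample lines copied from the ldapsearch output shown above.
sample='CloseSE: ccgridli07.in2p3.fr
MountPoint: /edg/StorageElement/prod'

# get_attr NAME (hypothetical helper): print the value of the first
# "NAME: value" line read from standard input.
get_attr() {
    awk -v name="$1" -F': ' '$1 == name { print $2; exit }'
}

close_se=$(printf '%s\n' "$sample" | get_attr CloseSE)
mount_point=$(printf '%s\n' "$sample" | get_attr MountPoint)

# URL of a per-user directory on the close SE.
se_url="gsiftp://$close_se$mount_point/babar/boutigny"
echo "$se_url"
```

In real use the sample text would be replaced by the output of the ldapsearch command piped into get_attr.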

    We can create a directory on the storage element with the command:

    edg-gridftp-mkdir gsiftp://ccgridli07.in2p3.fr/edg/StorageElement/prod/babar/boutigny

    We can now copy and register the executable file with:
    edg-replica-manager-copyAndRegisterFile -l boutigny/MyAnalysisApp \
        -s bbr-gate01.slac.stanford.edu//afs/slac.stanford.edu/u/eb/boutigny/rel/12.4.0/bin/Linux24/MyAnalysisApp \
        -d ccgridli07/edg/StorageElement/prod/babar/boutigny/MyAnalysisApp -e

    One can check the presence of the file in the RC with:
    ldapsearch -L -S "uc=*" "filename=*" uc path filename -b dc=gridpp,dc=ac,dc=uk -h bfb.hep.man.ac.uk -p 9011 -P 2 -x

    JDL file

    We assume that the relevant files are stored at SLAC in the directory $HOME/edg-test/IN2P3.

    Executable = "Master.sh";
    Arguments = "boutigny/MyAnalysisApp";
    InputSandbox = {"$HOME/edg-test/IN2P3/job.sh","$HOME/edg-test/IN2P3/all.tcl","$HOME/edg-test/IN2P3/Master.sh","$HOME/edg-test/RC/RC.conf"};
    StdOutput = "result.out";
    StdError = "result.err";
    OutputSandbox = {"result.out","result.err"};
    InputData = "LF:boutigny/MyAnalysisApp";
    ReplicaCatalog = "ldap://bfb.hep.man.ac.uk:9011/lc=file collection,rc=BaBarReplicaCatalog,dc=gridpp,dc=ac,dc=uk";
    Rank = other.MaxCpuTime;
    Requirements  = Member (other.RunTimeEnvironment ,"CC-IN2P3") && other.OpSys == "RH 7.2" && other.MaxCpuTime > 400 && other.MaxCpuTime < 50000;
    Environment = {"BetaBdbMicro=yes","BetaPhysMicro=yes","PhysMicro=yes","PhysAll=yes","BetaRootTuple=yes","BdbMicro=yes","NEVENTS=1000","NTUPLE=test.root"};
    DataAccessProtocol = "gridftp";
    The combination of the InputData, ReplicaCatalog and DataAccessProtocol instructions allows the RB to select a CE whose close SE holds the executable. If the executable is not found, the job is rejected before being sent to any CE.

    In this JDL the "Requirements" instruction is written to target a specific CE at IN2P3. It can of course be made less selective, so that any CE accessible to BaBar may be chosen.

    Also notice that we need to send the RC.conf file through the Input Sandbox.

    The scripts

    We have chosen to split the execution script into two parts. The first contains the BaBar analysis itself and is very similar to the basic example.

    job.sh

    #!/bin/tcsh

    set curdir = $PWD

    cd $BFDIST/releases/12.4.0c
    srtpath 12.4.0c Linux24
    setboot
    setenv OO_FD_BOOT /afs/in2p3.fr/group/babar/data/bootfiles/theData/physics-analysis/V1/9102/BaBar.BOOT
    cd $curdir
    ln -s $BFDIST/releases/12.4.0c PARENT

    chmod +x Executable
    Executable all.tcl

    The second script, Master.sh, fetches the executable from the SE, runs job.sh, and then stores the output n-tuple on the SE:

    #!/bin/bash
    exe=$1
    GSF=`edg-brokerinfo getSelectedFile $exe gridftp`
    echo GSF: $GSF
    TFN=`echo $GSF | cut -c 8-300`
    echo TFN: $TFN
    globus-url-copy gsiftp$TFN file://$PWD/Executable
    chmod +x job.sh
    ./job.sh
    CE=`edg-brokerinfo getCE | cut -d ":" -f 1` 
    echo CE: $CE
    CSE=`edg-brokerinfo getCloseSEs`
    echo CSE: $CSE
    MP=`edg-brokerinfo getSEMountPoint $CSE`
    echo MP: $MP
    globus-url-copy file://$PWD/$NTUPLE gsiftp://$CSE$MP/babar/boutigny/Ntuples/$NTUPLE
    edg-replica-manager-registerEntry -l boutigny/Ntuples/$NTUPLE -s $CSE$MP/babar/boutigny/Ntuples/$NTUPLE -c RC.conf
    With this script, the node name of the closest SE and its mount point are determined automatically with the "edg-brokerinfo" command, so the script is generic and can run on any CE.
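The two "cut" calls in this script are terse, so here is a small sketch of what they do to their input. The GSF value is hypothetical: it assumes that "edg-brokerinfo getSelectedFile ... gridftp" returns a name starting with the seven characters "gridftp", which is what the script's "cut -c 8-300" implies. The CE_ID value assumes that "edg-brokerinfo getCE" returns an identifier of the same form as the ceId seen in the LDAP output.

```shell
#!/bin/sh
# Hypothetical stand-in for the output of
#   edg-brokerinfo getSelectedFile boutigny/MyAnalysisApp gridftp
GSF='gridftp://ccgridli07.in2p3.fr/edg/StorageElement/prod/babar/boutigny/MyAnalysisApp'

# Drop the first seven characters ("gridftp"), keeping "://host/path",
# then prepend "gsiftp" to get a URL that globus-url-copy accepts.
TFN=$(echo "$GSF" | cut -c 8-)
SRC_URL="gsiftp$TFN"
echo "SRC_URL: $SRC_URL"

# Assumed form of the CE identifier (same form as the ceId in the LDAP output).
CE_ID='ccgridli08.in2p3.fr:2119/jobmanager-bqs-A'

# Keep only the host name, i.e. everything before the first colon.
CE=$(echo "$CE_ID" | cut -d ":" -f 1)
echo "CE: $CE"
```

Using "cut -c 8-" instead of "cut -c 8-300" avoids truncating very long file names, at no cost.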


     last modified March 11,  2003 by
    Dominique Boutigny, <boutigny@in2p3.fr>