Computing Questions & Answers
This page tries to answer any question you might have about BaBar
Computing planning that you cannot find answered elsewhere. Please
email your questions to Stephen Gowdy.
We will post questions of general interest here with answers.
For technical computing questions, please refer to the BaBar Computing
FAQ.
Topics
Modifications to the Computing Model
- What is the specific problem we need to solve?
- What is our proposed solution?
- How much CPU and human time will it take to meet a request?
- So, I'll only get the events requested?
Rogue Wave Migration
- Why are we doing it?
- When does it need to be complete?
The Mini
- What is the mini?
- It must be fairly large then?
- Can we use it for analysis now?
Modifications to the Computing Model
From the beginning we fully expected the detailed implementation of
the Computing Model to be dynamic, responding to the inevitable changes
in physics or other circumstances. We hope that the overall Computing
Model strategy does not change substantially. Below we address some of
the plans that we propose to adopt in the summer of 2002, for
reprocessing of Run 1 and Run 2 data and for OPR when we resume running
in the fall.
Q: What is the specific problem we need to solve?
A: It was envisioned that, once we began distributing events to remote
sites, this would be done by replicating streams, i.e. making
independent collections of events which inevitably overlap
significantly. The Computing Model anticipated that the replication
factor (total size of all streams including isPhysics divided by the
size of isPhysics alone) would be reduced to 2.0. Currently this factor
is ~2.5 and is expected to increase, not decrease, with the skim
changes implemented in November 2001. This would mean hundreds of TB of
additional disk space for a 500 fb-1 sample, or push us to heavy use
of staging. In addition, many remote sites have been quite unhappy with
the size of the streams, since the skims of interest to them are much
smaller than the stream that they receive. The proposed system seemed
to satisfy the needs of neither the physics nor the Computing
communities.
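The replication factor defined above can be written out as a short
calculation. The following is only a sketch: the function names are
hypothetical, and any sizes passed in are placeholders rather than
measured BaBar stream sizes.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Replication factor as defined in the answer above:
// (total size of all streams, including isPhysics) / (size of isPhysics).
// Sizes may be in any unit, as long as every stream uses the same one.
double replicationFactor(double isPhysicsSize,
                         const std::vector<double>& otherStreamSizes) {
    assert(isPhysicsSize > 0.0);  // the ratio is undefined otherwise
    const double total = std::accumulate(otherStreamSizes.begin(),
                                         otherStreamSizes.end(),
                                         isPhysicsSize);
    return total / isPhysicsSize;
}

// Storage needed beyond a single copy of isPhysics for a given factor.
double extraStorage(double isPhysicsSize, double factor) {
    return (factor - 1.0) * isPhysicsSize;
}
```

With a factor of 2.5, the streams take up one and a half times the
size of isPhysics in additional space; with the anticipated factor of
2.0, the additional space would equal the size of isPhysics.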
Q: What is our proposed solution?
A: We propose to continue to operate as we have in 2001. Since remote
sites are using Kanga, not Objectivity, for analysis at present, there has
been no demand for export of Objectivity streams, so we have made only the
AllEvents stream with pointers to the other 21 streams. This saves disk
space and makes life easier for OPR for instance. We propose to produce
and distribute "custom" streams to sites as requested - each site would
specify which skims they want.
Q: How much CPU and human time will it take to meet a request?
A: We expect the system to be fully automated. Maintaining it will
still take some human effort, probably adding 0.2 FTEs to the database
administration load at each Tier-A site. The CPU time required should
be fairly low, as events will be selected using existing collections.
Q: So, I'll only get the events requested?
A: Initially it will be easier and faster to implement it such
that you get each event in each collection you request. This means that
if the same event is in two or more collections you request you will
get it two or more times. We hope to develop procedures to avoid such
event duplication.
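The difference between the initial behaviour and a de-duplicated
export can be sketched as follows. This is an illustration only:
EventId, Collection and both function names are hypothetical, and the
"keep the first occurrence" rule is just one possible procedure, not
the one that will necessarily be adopted.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for an event identifier; the real event key
// is not described on this page.
typedef unsigned long EventId;
typedef std::vector<EventId> Collection;

// Initial behaviour described above: every event of every requested
// collection is exported, so shared events appear more than once.
std::vector<EventId> exportNaive(const std::vector<Collection>& requested) {
    std::vector<EventId> out;
    for (std::size_t i = 0; i < requested.size(); ++i) {
        out.insert(out.end(), requested[i].begin(), requested[i].end());
    }
    return out;
}

// One possible de-duplication: keep only the first occurrence of
// each event across all requested collections.
std::vector<EventId> exportUnique(const std::vector<Collection>& requested) {
    std::set<EventId> seen;
    std::vector<EventId> out;
    for (std::size_t i = 0; i < requested.size(); ++i) {
        for (std::size_t j = 0; j < requested[i].size(); ++j) {
            // insert() returns a pair; .second is true for a new element.
            if (seen.insert(requested[i][j]).second) {
                out.push_back(requested[i][j]);
            }
        }
    }
    assert(out.size() == seen.size());  // sanity check: no repeats remain
    return out;
}
```

The size difference between the two results is the number of
duplicated events that a site would receive under the initial scheme.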
Rogue Wave Migration
Q: Why are we doing it?
A: The primary motivation for doing this now is that the vendor's
licensing terms changed in April 2001. The fees for using this product
now include a deployment element charged per CPU, with a list price of
$500 per CPU. Given BaBar's large computing needs and the reduction in
hardware costs, this seems an unreasonably large burden on the
collaboration.
There are a couple of other reasons why this is a good time for
this change. We chose Rogue Wave Tools.h++ because no version of the
Standard C++ Library functioned on all our platforms while we were
developing our software. The standard has now been finalised and
compilers are getting very close to implementing it. New students and
postdocs should learn to use standard tools that they can apply both
inside HEP and in industry.
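As a rough illustration of the "standard tools" meant here, the sketch
below uses only the Standard C++ Library for string and container
handling. The function names are hypothetical, and the Tools.h++
classes named in the comments are given only as examples of what such
code might replace; this is not taken from BaBar's actual migration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// std::string in place of a vendor string class
// (for example Tools.h++'s RWCString; illustrative only).
std::string makeLabel(const std::string& base, const std::string& suffix) {
    return base + "-" + suffix;
}

// std::vector and std::sort in place of a vendor ordered collection
// (for example RWTValOrderedVector<T>; illustrative only).
std::vector<std::string> sortedNames(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    return names;
}

// Linear search with a standard algorithm.
bool contains(const std::vector<std::string>& names, const std::string& name) {
    return std::find(names.begin(), names.end(), name) != names.end();
}

// Bounds-checked element access.
const std::string& nameAt(const std::vector<std::string>& names,
                          std::size_t index) {
    assert(index < names.size());
    return names[index];
}
```

Code written this way compiles with any conforming compiler and needs
no per-CPU deployment license.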
Q: When does it need to be complete?
A: We currently plan on completing this migration for the 12
series release. This is due in March 2002.
We can renew the current license agreement for one more year from
April if we wish to be able to buy new licenses for this product.
Beyond April 2003 we may not be able to buy new licenses under the
old licensing scheme, but we never lose the licenses we already hold.
Trying to live with only those has two main problems: moving to new
versions of compilers or operating systems will become increasingly
difficult, and new collaborators would be unable to use any of our
software that utilises Rogue Wave.
The Mini
Q: What is the mini?
A: It is a data layer that contains more information than is
available in the micro database. It allows a much more detailed
analysis and some level of reprocessing (e.g. you can redo track fits
but you cannot make new tracks).
Q: It must be fairly large then?
A: Not really. Currently (Dec 2001) it needs relatively little
additional disk space if you already have the micro on disk: about
6 kB/event. However, this version does not contain any information
from the Emc, so we expect the size to increase next year.
Q: Can we use it for analysis now?
A: Not easily at present. There are ongoing developments to
make this easier. We will hear about this at the December
Collaboration Meeting.
This page is maintained by Stephen J. Gowdy and Jim Smith.