Proposed organization of BaBar physics planning and its interaction with the computing system

Neil Geddes, Tom Glanzman, Frank Porter, David Quarrie

January 6, 1995

This note addresses the nature of the physics planning activities of the BaBar experiment, and describes the distinction between these activities and the computing system. A model is proposed which we believe allows the physics activities to proceed while permitting the computing system to be developed in a coherent manner.

Computing System

People are urged to read the computing chapter in the TDR for a more detailed discussion of many of these ideas. In the TDR, the computing system is defined as:

"(i) the on-line data acquisition, control, and monitoring hardware and software; (ii) the off-line computing hardware -- CPU ('farm' and desktop), storage, X-displays, and networking -- whether local to SLAC or at remote locations; (iii) the software environment in which the computing work is done; and (iv) the collaboration code itself. All of these are important items for the BaBar computing, although some of it, including most of the second item, is properly considered institutional infrastructure."

It is important to recognize that the computing system, in particular item (iv), is built by many people within the collaboration. Thus the computing "group", that is, the people responsible for constructing the computing system, is made up of many people. Perhaps especially because of the large number of people involved, it is crucial that the construction of the computing system be uniformly coordinated. A source of confusion appears to be that the coordinators of the effort are regarded by some as the "computing group" itself, which is not our view.
Because so many people are likely to be involved, and the identities of these people will change over the lifetime of the experiment (and because historically so many people believe that they write good software and don't fully appreciate the design requirements), the computing system (the reconstruction software in particular) is particularly vulnerable to bad, incoherent, or absent design and implementation. It is for this reason that we feel that its construction should be centrally coordinated within a coherent computing subsystem organization.

On the other hand, as has been stressed on various occasions, the computing subsystem is not responsible for specifying what is written (in terms of reconstruction/simulation packages) or, to first order, by whom. The distinction between what is the responsibility of the computing subsystem and what is the responsibility of the other subsystems, or of the collaboration as a whole, may be thought of as a distinction between how code is written and what is written (e.g., which event generation tools, or which algorithms for reconstruction). The question of "what" is written will be decided largely by the relevant other detector subsystems in the case of reconstruction and detector simulation, and largely by the "physics group(s)", which we propose below, in the case of physics simulation and analysis tools.

A further point is that the longer-term maintenance of any code produced is likely to fall within the responsibilities of the computing subsystem (though, we hope, not the physical act of maintaining the code!), and it is they/we who would have to find people to do it if no one else "volunteered". All of this will of course be much easier with a well-engineered system, and consequently we should be supported in imposing certain requirements on contributors.

We discuss the development of two pieces of the computing system: the reconstruction code and the simulation code.
It appears that it is the development of these pieces about which people have differing views on the proper organizational structure. We first give the organizational structure which the computing coordinators have been working towards, and then we discuss difficulties with other approaches.

The envisioned computing organizational structure was discussed at the time of the LOI, and has evolved slightly since then. A schematic of the organization may be found in the PostScript file ~frankp/tex/babar/comp/organ2.4.ps. This figure, or its similar precursors, has been discussed with various members of the executive board.

Within the computing system, we have the concept of "packages", for example: reconstruction for each detector system, simulation for each detector system, the data model, the code framework, a physics utility package (e.g., object-oriented 4-vectors and their manipulations), histogram/tuple management, event display, and calibration. The point of these examples is to describe how we might factorize the software into manageable pieces AND to identify those components which are of interest to all other pieces (such as histogram/tuple management), so that each subsystem does not go off and obtain or develop its own mechanisms. A particularly large benefit of the organizational scheme described here is the ability to identify and take advantage of these areas of commonality.

In the organizational structure, the reconstruction code and the simulation code are considered to be "packages", which are developed, maintained, and run within an overall framework. Each package, in this view, has a designated "package coordinator". This is a role, which may be filled by anything from a fraction of a person to more than one person. The reconstruction and simulation tasks are sufficiently large that they really depend on a number of packages at a finer level, e.g., pattern recognition in the vertex chamber. Each of these packages also has a package coordinator.
We note that the package coordinators will ordinarily be people who are themselves heavily involved in designing and writing the package code. Even though the simulation and reconstruction packages depend upon others, we emphasize that they still get their own coordinators, who will in some sense have a role in coordinating the contributions from the packages upon which their package depends. (We would expect these people to be the executive officers of the subsystem computing group discussed below.)

Of course, to make sure everything fits together in a sensible way, there must be coordination between the coding of the packages written for the various other detector subsystems. The approach which we had proposed to the executive board, and which we had begun to implement, is to form a group of representatives drawn from the other detector subsystems. The idea is that this group should consist of key people interested in the software relevant to the other subsystems. They thus effectively contribute to two subsystems at once -- the computing subsystem and their detector subsystem. The computing aspects of this work, however, must be coordinated within the context of the computing system, and that is the purpose of forming this group.

We think that the same group should coordinate both the reconstruction and the detector simulation development, and there may be more than one representative per subsystem. In fact, the membership of the subsystem group should be relatively open: as long as the number of people is kept reasonable, and we don't get conflicting opinions on every question from multiple representatives of the same subsystem, we don't care who is on it. The main point is to make sure that all subsystems are participating and communicating.
Unless and until it becomes apparent that we need a more formal method (to avoid the group getting bogged down in numerous religious arguments), this would be the place for people with legitimate concerns to bring them up for discussion.

It appears to us that there is no disagreement about who is actually writing what code -- the issue seems to be what the organizational structure is. We have tried to describe our model for this.

An alternative proposal has been suggested: creating an organization parallel to the "computing system", which would coordinate the construction of the reconstruction and simulation code. This looks to us like the creation of a second "computing system". We do not think that the creation of a parallel structure is an effective way of going about this task.

Consider the current situation, which evolved out of the working group structure. In the context of the working groups, there was a group charged with considering the physics and developing the simulation tools required. There was also a group charged with evaluating the computing requirements and designing a computing system to meet them. For very good reasons, these two groups worked fairly independently -- they had quite different purposes. However, this structure cannot continue without modification. Physics studies will continue, of course, but the simulation and reconstruction tools must become packages integrated within the computing system for the experiment. This will require a shift for some people toward participation in the computing system, with a longer-range view. The creation of parallel computing organizations, which appears to be more or less a continuation of the present situation, is not conducive to this shift.

Physics Group(s)

It appears to us that there is general agreement that there should be a group concerned with the physics of the experiment, and that this group is certainly separate from the computing subsystem (and from any of the other subsystems). We propose here a model for this activity.

We start with the premise that the vast majority of people in the collaboration are keenly interested in the physics, as are many theorists outside the collaboration. A mechanism should be set up to plan for the physics, and to develop the analysis tools which will be required. The mechanism, at this stage, should involve coordination of activities. We thus need a coordinating body (perhaps the word "board" means this to some people, but we have trouble with this word, because it implies to us a decision-making body, and we think that the physics decisions will have to evolve out of the activities of the physicists and communication with the collaboration, rather than being made by a board). The physics coordinator(s) would formally report to the spokesperson, but in practice this would be a largely autonomous collaboration-wide effort.

A possible function of the coordinating body would be to set up a number of "physics working groups". These working groups might be organized by physics topic or type of analysis. For example, there could be groups which pursue tau physics, CP violation, rare B decays, charm physics, tagging algorithms, etc. The groups would benefit from including both theorists and experimenters. These groups may well develop computer code, and will need to use the computing system, in whatever state it exists. Some of the computer code would likely develop eventually into packages which could be included in the computing system (e.g., event generators or tagging packages). Because of this, there may be an additional coordination role of communication with the computing system, although it is possible that this can proceed more efficiently at the level of individuals.
We note that the physics groups work in parallel with the other activities of getting the experiment built and running. Among the experimentalists, we do not imagine anyone being solely in a physics group without also working on a detector subsystem.