# FAST PARALLEL PIPELINED READOUT ARCHITECTURE FOR A COMPLETELY FLASH DIGITIZING SYSTEM WITH MULTI-LEVEL TRIGGER

Robert D. Cousins, Jacobo Konigsberg, Jonathan Kubic, Philip L. Mélèse

Physics Department, University of California, Los Angeles, CA 90024

Gregory W. Hart

Los Alamos National Laboratory, Los Alamos, NM 87545

William R. Molzon<sup>a</sup>

University of Pennsylvania, Philadelphia, PA 19104

George M. Irwin, Dale A. Ouimette<sup>b</sup>, Jack L. Ritchie<sup>c</sup>, Quang H. Trang

Physics Department and Stanford Linear Accelerator Center, Stanford University, Stanford, CA 94309

Robert J. Whyley<sup>d</sup>

College of William and Mary, Williamsburg, VA 23185

## Abstract

We have built, and used to take physics data, a digitizing and readout system for Brookhaven AGS Experiment 791, a high-rate search for rare kaon decays. All digitization of charge and time information is "flash" (performed in less than 200 ns), followed by front-end buffering and a pipelined readout with massive parallelism. A data transfer rate of 0.5 Gigabyte/sec into dual-port memories in eight 3081-emulating processors has been achieved. A readout-supervising circuit coordinates the three levels of event triggering and the movement of data throughout the system. The host Micro-VAX is interrupted only for the uploading of packets of fully filtered events from the 3081/E's. Digitizing and data transfer from the front end to the 3081/E's contribute negligible deadtime to the experiment.

Submitted to Nuclear Instruments and Methods

\*Work supported in part by the Department of Energy, contract DE-AC03-76SF00515 and the National Science Foundation.

# 1. Introduction

In searches for rare processes in high energy physics experiments, one must frequently operate and read out detectors in environments with increasingly high event rates. This is the case for Brookhaven Experiment 791<sup>1</sup>, a search for rare kaon decays, with emphasis on the lepton-flavor-violating decay $K_L^0 \rightarrow \mu e$. From the millions of particles traversing the spectrometer each second, electronic circuits must select a small fraction to be written to magnetic tape. As is now common, we use a multi-level event-triggering scheme, whereby increasingly time-consuming selection criteria are applied in sequence to each event which passes all previous criteria. As these trigger criteria are applied, the event information must be stored in some fashion (e.g., an analog signal travelling down a cable, or a binary word in digital memory), and the information must be transferred among various parts of the system. Since analog-to-digital conversion (ADC) and time-to-digital conversion (TDC) require many microseconds in traditional commercial units, deadtime problems can arise unless severe constraints are placed on the fraction of events which is digitized. Even if these problems are solved, there remains deadtime introduced by the transfer of large amounts of data into concentrated memory buffers. Our system uses "flash" digitizers and fast parallel pipelined readout to solve these problems, resulting in a virtually deadtimeless readout.

This readout architecture design has four distinctive features which contribute to its high performance. The first is that all digitization is "flash". That is, all times and charges are digitized in less than 200 ns. The second feature is an extra set of latches in the extreme front end data modules, to which all data in an event can be quickly (a few tens of ns for system signal propagation) transferred upon completion of digitization. The third feature is a highly parallel pipelined dataway from the front-end data modules through the crate controllers to the memory buffers. The fourth is a set of custom-designed dual-port memory boards for eight 3081 Emulators<sup>2</sup> in which the computations of an online software event filter are performed. The combination of these four features allows us to move data quickly and efficiently out of the front end and into the 3081/E processors. The dataway (third feature) had its origins in a system<sup>3,4</sup> developed at UCLA (by a different group) for the CERN ISR Experiment R608, and has been extended for use in our multiprocessor environment. It is capable of effective data transfer rates of a gigabyte per second, of which we have needed only to implement about half the full rate. The dual-port memory (fourth feature) follows a suggestion of P. Kunz of the Stanford Linear Accelerator Center. It eliminates the need for a large memory buffer external to the 3081/E's, with associated complexities. These dual-port memories in the 3081/E's have been dubbed "Turbo" memories.

Excluding the 3081/E's (which are TTL), the system contains about 230 boards with over 5000 channels. There are over 30,000 ECL integrated circuits from the 10K series, and a smaller number of TTL, analog, and 100K ECL IC's. The reliability has been excellent, with on the order of 5 post-burn-in IC failures in over 1000 hours of beam-on operation amongst several thousand hours with system power on.

In this article, we restrict our detailed descriptions to the readout hardware. The online data acquisition software, the Level 1 trigger hardware, the Level 2 trigger hardware, and the Level 3 trigger software are each significant projects developed partially or completely by others in the E791 collaboration. We mention in passing the details of each only where applicable to the discussion of the readout hardware.

# 2. System Overview

Figure 1 is a block diagram showing the inter-connections of the modules in the readout architecture. The readout of an event is controlled by the Readout Supervisor, a set of 12 modules residing in a CAMAC crate. (The CAMAC dataway is used for set-up, diagnostic exercises, and occasional commands from the host computer; it is not used for the high-speed data transfers.) The host computer is a DEC Micro-VAX II connected to two 6250-BPI magnetic tape drives. In the diagram, a line with an arrow indicates a cable with all signals going in that direction. A line without an arrow indicates a cable with signals going in both directions (on different wires in the cable). Most lines from fanout circuitry to *Turbo* memories have been suppressed in order to reduce the clutter; every fanout module is in fact connected to every *Turbo* memory.

On the left are the front end crates which contain the custom modules for digitizing spectrometer data. Modules which were used for physics data-taking include a 32-channel 6-bit TDC with 2.5 ns least count, a 12-channel bilinear 8-bit ADC, and a 96-bit coincidence latch. A prototype 16-channel 8-bit TDC with 200 ps least count was used in a brief test run, and is in production for future physics runs. The generic module readout is discussed in Section 3, which ends with a brief description of the specifications of each module. More detailed descriptions of the internal circuits are given elsewhere<sup>5,6</sup>. An active extender board has been built to facilitate debugging of these circuits.

The crates are custom-built with custom backplanes. With a 14.5 in. by 15.5 in. card size for the modules, the crates superficially resemble FASTBUS crates. Our crates have only 18 stations ("slots"), a choice which is natural for our addressing scheme and which greatly aids in cooling. The centers of adjacent modules are separated by 0.9 in., which allows for ample air flow to cool the 1000 Watts typically dissipated by a crate-full of modules. Of the 18 stations, 16 are for digitizing "data modules", 1 is for the control and fanout module called the "Crate Scanner", and 1 is a spare station, normally empty or used to power auxiliary fanout modules which do not communicate with the backplane. The choice of 16 data stations is matched to our use of 4 bits to specify slot position in the raw data. The control and spare stations are the middle stations of the crate in order to minimize lengths of backplane connections. Furthermore, crucial timing signals are independently driven to the left and right halves of the crate. The backplanes contain large power busses to accommodate high currents (e.g., 160 A at -5 V in the TDC crates) with negligible voltage drops. The crates were custom-made commercially<sup>7</sup> from steel by an inexpensive process, and have low impedance to air flow. We can stack three vertically in one standard 19 in. rack. Forced cool air flows from beneath the false floor up through the crates without the aid (or hindrance!) of additional fans. With care taken, the blowers of the main air conditioning units provide all the pressure needed.

The Crate Scanner collects the data words from the modules in its crate and passes them toward the 3081/E processors. The data are collected by a sparse data scan over the crate backplane, and augmented with header words (identifiers, word counts). Section 4 below contains a list of the backplane signal lines by which the data modules communicate with the scanner. The details of the scanner itself are in Section 5. The Crate Scanner drives the data onto eight independently-driven ECL 17-differential-pair connectors. Eight cables from these connectors can take the signals to the 3081/E Turbo memory board inputs and to diagnostic modules. Thus, each crate can be independently connected to each 3081/E processor. Each 3081/E processor can accommodate up to 24 separate 17-pair external cables on its Turbo memory boards. It is this massive cross-connection which, under control of the Readout Supervisor, provides the ability to dump data quickly into any of the processors in a highly parallel fashion. Diagnostic systems consisting of LeCroy 4302 CAMAC memory buffers, associated fast logic, and a stand-alone computer, can "snoop" at the data without interfering with the readout. (The system was made compatible with the LeCroy 4302 module at the suggestion of W.K. McFarlane. We have used 4302 modules extensively in test stations during production of crate scanners and data modules; in a primitive version of the readout system during E791 beam tests prior to the construction of the 3081/E's; and in the current online diagnostic system.) As the total number of 3081/E's plus a diagnostic system has evolved beyond eight, an additional fanout module has been added to some crates.

It is possible to inter-connect the Crate Scanners of two different crates with two cables, and to read out this pair of crates through only one of the Crate Scanners, and thus to a single memory port on each 3081/E. (This configuration is not shown in Fig. 1). This feature is desirable in order to reduce the number of memory ports needed, and can aid in balancing the amount of memory filled through each port. In such a configuration, one of the Crate Scanners, called the "master", is cabled as usual to the 3081/E's. The other Crate Scanner, called the "slave", is read out via a short cable from one of its 8 output connectors into a "from slave" connector on the master scanner. A separate 5-pair cable connecting the two scanners is used for control lines. These signals are described in Sec. 6. In this configuration, both scanners are also connected to the Readout Supervisor by separate 17-pair cables.

The Readout Supervisor drives various control and clock lines which move data through the readout sequences. It communicates with the Crate Scanners, the Level 1 and Level 2 logic, the 3081/E's, and the host computer ($\mu$VAX) in order to coordinate the event readout. Section 7 enumerates the signals between the Readout Supervisor and the Crate Scanners. The Readout Supervisor also performs miscellaneous tasks (event counting, timing, etc.) which need to be done in hardware, and which were conveniently incorporated into its design. Section 8 contains a detailed functional description of the Readout Supervisor, followed by a description in Section 9 of its implementation in hardware.

Our 3081/E processors were loaded with a maximum of 6 *Turbo* memory boards. Each board has 4 front connectors which accept the 17-pair cables from the Crate Scanners, thus allowing data to be written directly to the processor's memory. (Only the cables from one scanner are shown in Fig. 1.) Thus, a 3081/E has enough connectors for 24 cables. Using the master/slave capabilities of the Crate Scanners, one can read out 48 crates into a 3081/E. With some changes, insertion of more memory boards into the 3081/E could increase this number up to a maximum of 96. (In this case, one would suffer a slight loss of error-detection capability, since there is a redundant crate identifier which can only count to 63.) In our actual system, we read out a maximum of 24 crates. In one physics run, these went to separate memory ports, and in another physics run, most were paired in the master/slave configuration with no apparent loss of performance.

The host computer communicates with the 3081/E's via a DR11W parallel interface<sup>8</sup> which is connected to a custom 3081/E processor interface module. The communication between the host computer and the Readout Supervisor is via CAMAC. (Such communication is rare.) The Readout Supervisor and the 3081/E controllers communicate through single lines, discussed in Section 10, by which the Readout Supervisor can re-start individual processors when their memories are full. Section 11 provides more details on the modifications and enhancements made to the standard<sup>2</sup> SLAC 3081/E in order to incorporate it into this system. There is no direct line of communication between the host computer and the individual Crate Scanners. The host computer can, during diagnostic tests, send signals to the Crate Scanners via the Readout Supervisor.

The principle of the pipelined readout is illustrated in Fig. 2. Each word that is read out from the extreme front end to the 3081/E's is strobed through one of many parallel three-stage pipelines. Each front-end module has a 12-bit register called the "1st Pipeline Register" (P.R.), each Crate Scanner has a 16-bit "2nd Pipeline Register", and each of the 24 ports on the 3081/E has a 16-bit "3rd Pipeline Register". (Four bits are added to the data word by the Crate Scanner.) Data are shifted one stage to the right each time there is a strobe from the Readout Supervisor (about every 100 ns). The timing of the strobe is such that the registers are clocked in the order 3rd, 2nd, 1st in the same clock cycle. In a crate, the data from 1st P.R.'s are placed into the 2nd P.R.'s in priority order (lowest numbered channels first) during successive strobes. The high level of parallelism is due to the fact that there is a separate cable from each Crate Scanner to a 3081/E port dedicated to that scanner on each 3081/E. Thus, with our initial 24 ports, each receiving a 16-bit word every 100 ns, the instantaneous readout rate (after two initial strobes to fill the pipeline) is 0.48 Gbyte/sec. There are inefficiencies when some crates have more data than others, but these can be minimized by judicious placement of data modules among the various crates. In the actual physics runs, the readout rate was so fast that little attention was paid to this balancing of the data among the memory ports.
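The pipeline timing and the quoted rate can be checked with a short toy model. The sketch below is only an illustration, not the hardware logic: the function names `peak_rate_bytes_per_sec` and `run_pipeline`, and the representation of a module's hits as a Python list, are assumptions made here. It clocks the registers in the order 3rd, 2nd, 1st on each strobe and recomputes the rate from 24 ports, 16-bit words, and a 100 ns strobe period.

```python
def peak_rate_bytes_per_sec(ports=24, bits_per_word=16, strobe_ns=100):
    """Instantaneous transfer rate once every pipeline is full."""
    bytes_per_strobe = ports * bits_per_word // 8
    return bytes_per_strobe * 1e9 / strobe_ns


def run_pipeline(words):
    """Push a list of words through one three-stage pipeline.

    Returns (memory, strobes): the words written to memory, in order, and
    the number of strobes needed. On each strobe the registers are clocked
    in the order 3rd, 2nd, 1st, so every word advances exactly one stage.
    """
    pending = list(words)      # hits still waiting on the front-end module
    p1 = p2 = p3 = None        # 1st, 2nd, and 3rd Pipeline Registers
    memory = []
    strobes = 0
    while pending or p1 is not None or p2 is not None:
        strobes += 1
        p3 = p2                                    # 3rd register is clocked first
        if p3 is not None:
            memory.append(p3)                      # and its content is written to memory
        p2 = p1                                    # then the 2nd register
        p1 = pending.pop(0) if pending else None   # then the 1st register
    return memory, strobes


if __name__ == "__main__":
    print(peak_rate_bytes_per_sec())        # 480000000.0 bytes/sec
    print(run_pipeline([0x101, 0x102, 0x103]))
```

In this model a port with N words needs N + 2 strobes, the extra two being the strobes that fill the pipeline.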

In the sections that follow, details are given for the protocol used by the various modules. We begin with the front end and follow the data to the host computer.

# 3. Front End Digitizing Modules

The front end modules (ADC's, TDC's, etc.) have the capability to perform the initial digitization, store an old event while remaining live for another event (or store two events while dead), and respond to readout strobes with an on-board sparse data scan and backplane data transfer. A block diagram of a generic module is shown in Fig. 3. The front panel connector accepts a cable from a detector element in the spectrometer. The signals on the cable are gated at the input to the module by signals which are suitably shaped and timed fanouts of the Level 1 trigger. The relevant digitization is performed by the block labelled "Flash Digitize". At the conclusion of digitization, the data reside in some form of latch where they can remain until further control signals are received. This latch is called the "Stage 1 Latch". In addition, for each channel there is a flip-flop which is set if there was a hit on that channel during the event. These flip-flops are called "Stage 1 Hit FF's".

A second set of latches and flip-flops, called "Stage 2 Latch" and "Stage 2 Hit FF's", have their inputs (D's) connected to the outputs (Q's) of the Stage 1 latches and flip-flops. The Readout Supervisor can transfer data in all crates from Stage 1 to Stage 2 by asserting a signal called SHIFT on the cables to the Crate Scanners. The Crate Scanners pass this signal onto the backplanes, where it is picked up by the data modules and used to clock the Stage 2 Latches and Hit FF's, and to clear the Stage 1 Hit FF's. Another signal, called ABORT, allows the Readout Supervisor to clear all Stage 1 Hit FF's without shifting. As will be discussed in detail in a later section, these two operations (SHIFT and ABORT) are all the Readout Supervisor needs in order to move events efficiently toward readout, subject to Level 2 trigger decisions.
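The effect of SHIFT and ABORT on the two buffering stages can be summarized in a small behavioral model. This is a sketch under stated assumptions, not a description of the actual circuits: the class `FrontEndModule`, its attribute names, and the `digitize` helper are hypothetical, and analog details such as resetting the ADC circuits are left out.

```python
class FrontEndModule:
    """Toy model of the two-stage event buffering on one data module."""

    def __init__(self, n_channels):
        self.stage1_latch = [0] * n_channels
        self.stage1_hit = [False] * n_channels
        self.stage2_latch = [0] * n_channels
        self.stage2_hit = [False] * n_channels

    def digitize(self, hits):
        """Record a Level 1 event; hits maps channel number to digitized value."""
        for channel, value in hits.items():
            self.stage1_latch[channel] = value
            self.stage1_hit[channel] = True

    def shift(self):
        """SHIFT: clock Stage 1 into Stage 2, then clear the Stage 1 Hit FF's."""
        self.stage2_latch = list(self.stage1_latch)
        self.stage2_hit = list(self.stage1_hit)
        self.stage1_hit = [False] * len(self.stage1_hit)

    def abort(self):
        """ABORT: clear the Stage 1 Hit FF's without shifting."""
        self.stage1_hit = [False] * len(self.stage1_hit)


if __name__ == "__main__":
    module = FrontEndModule(n_channels=4)
    module.digitize({1: 37})   # first event arrives
    module.shift()             # event moved to Stage 2; Stage 1 is live again
    module.digitize({2: 12})   # second event arrives while the first waits
    module.abort()             # second event rejected; first is untouched
    print(module.stage2_hit, module.stage2_latch)
```

The usage lines show the point of the second stage: an event held in Stage 2 survives while a later event in Stage 1 is accepted or discarded.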

Finally, each module contains readout and control circuitry which performs a sparse data scan of the channels on the module and passes the hit data to the Crate Scanner. This circuitry is used for an event which has passed both the Level 1 and Level 2 triggers (and thus must be read into *Turbo* memory). At the appropriate time (see Section 8 below), the Readout Supervisor shifts the event into Stage 2 on all modules. Once in Stage 2 with a Level 2 "yes", an event can be read out. If it has at least one hit channel, the module asserts a direct backplane line (called HIT) from its station to the Crate Scanner. When the Crate Scanner is ready to read out the module, it asserts another direct line (called ENAB) to that station; this line activates the readout circuitry on the module, and allows it to use the backplane bus lines. The 16 independent HIT and ENAB lines from the Crate Scanner to the data stations are easily accommodated on the custom backplane.

The set of flip-flops called the "1st Pipeline Register" is central to the readout circuitry. In response to strobes on the RDCLK line from the Readout Supervisor via the Crate Scanner, data from hit channels are strobed into the 1st Pipeline Register. These data are passed from the enabled station to the Crate Scanner along the backplane data lines. The interpretation of the bits in these data depends on the particular function of the front end module. Typically, some of the bits are digitized time or charge, and the remaining bits form a "channel sub-address" designating which channel on the module had the hit. A physically separate piece of the 1st Pipeline Register also appears on the Crate Scanner (see below) and contains 4 bits designating the station number of the hit. (Throughout this description, the words "station number" mean a number from 00 to 15, numbering only the data stations. The control and spare stations are not numbered in this sense.)

The 1st Pipeline Register gets its name from its function as the first of three registers (shown in Fig. 2) which are used in pipeline fashion to move data from the front end modules to the Crate Scanners and ultimately to the 3081/E Turbo memory. In response to strobes from the Readout Supervisor (on the line called RDCLK to the Crate Scanners and on the line called WRTCLK to the 3081/E's), the data in the 2nd register are clocked into the 3rd register (and written into the Turbo memory), the data from the 1st register are clocked into the 2nd register, and data from the next hit are clocked into the 1st register by the front-end module control circuitry. It is this last operation which concerns us most in this section.

Each front-end data module contains priority selecting logic which looks at the Stage 2 Hit FF's and selects the hit channel with the lowest sub-address. The outputs of the Stage 2 Latch for the selected channel are gated onto an internal bus which goes to the inputs of the 1st Pipeline Register. When RDCLK arrives, the bus bits are clocked into the Register (along with the sub-address itself), and the Stage 2 Hit FF of the selected channel is cleared. The selection logic then moves on to the hit channel next highest in priority, so that new data are clocked into the Register on each RDCLK strobe. The readout of the module continues until all hits on the module have been read out, at which time the module ceases to assert its backplane HIT line.
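The on-board sparse data scan can be sketched as a generator in which each iteration stands for one RDCLK strobe. The function name `sparse_scan` and the list representation of the Stage 2 Latches and Hit FF's are assumptions for illustration only.

```python
def sparse_scan(stage2_latch, stage2_hit):
    """Yield (sub_address, value) for each hit channel, lowest sub-address first.

    Each iteration models one RDCLK strobe: the selected channel is clocked
    into the 1st Pipeline Register and its Stage 2 Hit FF is cleared.
    """
    hit = list(stage2_hit)               # work on a copy of the Hit FF's
    while any(hit):                      # the module asserts HIT while this holds
        sub_address = hit.index(True)    # priority select: lowest sub-address
        hit[sub_address] = False         # clear the Hit FF of the selected channel
        yield sub_address, stage2_latch[sub_address]


if __name__ == "__main__":
    latches = [5, 0, 9, 3]
    hits = [True, False, False, True]
    print(list(sparse_scan(latches, hits)))   # [(0, 5), (3, 3)]
```

Only hit channels cost a strobe, which is why the readout time of a module scales with its occupancy rather than with its channel count.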

The following subsections contain brief specifications of the digitizing modules. More complete descriptions are presented elsewhere<sup>5,6</sup>.

## 3.1. Digital 6-bit TDC with 2.5 ns least count

Each module measures on each of 32 channels the time of arrival, relative to a common "start", of a "stop" signal which arrives during an event gate. At the heart of the all-digital board, ECL 100K IC's form a 6-bit gray code clock with 2.5 ns least count, providing a range of 160 ns, matched to maximum detector response times. All clocks in one crate are driven by a 100 MHz oscillator located in the Crate Scanner. A single quarter-cycle phase split on the TDC board, along with inversion, provides all the necessary rising edges. Each channel has an independent 6-bit set of flip-flops; these form the Stage 1 Latches described in general above. These FF's and the readout circuitry are ECL 10K. The "stop" for the associated channel clocks the instantaneous state of the gray code clock lines into these 6 bits. By suitable (time-consuming) modification, either the first or the last such stop to arrive on that channel during the event gate is recorded. We recorded the first stop during the runs described here, in order to avoid problems from multiple-pulsing of detector elements. The rms error on the time, as measured with a commercial 100 ps least count TDC, is less than 0.8 ns for each of the 5216 channels constructed. Each module requires 9.5 A at -5.0 V (a compromise between the preferences of ECL 10K and 100K) and 2 A at -2 V.
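For readers who want to see how a latched gray code value maps back to a time, here is a minimal sketch. It assumes the clock uses the standard reflected binary gray code and that a count of zero corresponds to the common start; neither detail is stated above, and the function names are hypothetical.

```python
LEAST_COUNT_NS = 2.5   # least count, from the text
N_BITS = 6             # width of the gray code clock, from the text


def binary_to_gray(n):
    """Reflected binary gray code of an ordinary integer."""
    return n ^ (n >> 1)


def gray_to_binary(gray):
    """Inverse of binary_to_gray."""
    binary = 0
    while gray:
        binary ^= gray
        gray >>= 1
    return binary


def tdc_time_ns(gray):
    """Time of the stop relative to the start, assuming count 0 is the start."""
    return gray_to_binary(gray) * LEAST_COUNT_NS


if __name__ == "__main__":
    print((1 << N_BITS) * LEAST_COUNT_NS)      # 160.0 ns full range
    print(tdc_time_ns(binary_to_gray(63)))     # 157.5 ns, the last bin
```

A gray code is a natural choice for a clock sampled asynchronously by the stop signal, since only one bit changes between adjacent counts and a latch caught during a transition is off by at most one least count.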

## 3.2. Fast integrating 8-bit bilinear ADC

These modules are used to digitize the charge in photo-multiplier tube signals from a large lead glass array, from a Čerenkov counter, and from various scintillation counters. The circuit is bilinear: for small pulses, each additional count corresponds to 150 femto-Coulomb, while for large pulses, each additional count corresponds to 470 fC. We thus obtain a wide dynamic range (effectively 10 bits times the lower charge quantity) with the 8-bit ADC, while preserving high resolution for the smaller pulses. A separate Sony CX20052A 20 MHz flash ADC chip on each channel performs the digitization in less than 200 ns, with the 8 digital output signals constituting the Stage 1 Latches described in general above. Gates for the DC-coupled inputs are common to the 12 channels on each module, and can range from 20 ns to 150 ns. Extensive tests of the 528 constructed channels have demonstrated the integral non-linearity to be less than 1 count. Each module requires 5 A at -5.2 V, and 2 A or less at 2.0 V,  $\pm 15$  V, and  $\pm 6.0$  V.

## 3.3. 96-bit coincidence latch

This versatile module latches the state of 96 ECL differential pairs, grouped into 3 sets of 32 channels sharing a common clock (latch signal). Signals may be latched into either the Stage 1 Hit FF's (described above) or directly into the Stage 2 Hit FF's, as is desirable for parts of our trigger diagnostics. A third version allows the DC levels at the input of the module to be read out without a latching clock; this feature is used to read out information (trigger bits) which the Readout Supervisor has stored until readout time. Auxiliary output connectors allow the contents of the latches to be examined by external circuits (for higher level trigger calculations) in addition to the normal readout path into the 3081/E's. Signals from the Readout Supervisor coordinate this activity. Unlike the other modules, no sparse data scan is performed. (In our system, this is more efficient than encoding the addresses of hit bits.) Each group of 32 signals is loaded into three words at readout time; switches allow one to select how many of the 9 words to read out. Fifteen modules were used for physics data taking.

## 3.4. Fast Analog 8-bit TDC with 200 ps least count

This 16-channel module has been developed for those signals for which we desire tighter timing than that provided by the less expensive 2.5-ns least count TDC (Section 3.1). On each channel, a Sony 8-bit flash ADC chip (see Section 3.2) is used to measure the voltage across a capacitor which is charged by a constant current source between the time of a common "start" and the "stop" signal arriving on that channel. With about 200 ps least count, the range is 50 ns. Power required is 7 A at -5.2 V, 3.5 A at -2 V, and 0.5 A each of $\pm 15$ V. The prototype was tested in the actual readout system as well as on the bench, and production of more modules is underway.

# 4. Backplane signal lines

At each station, there is a 96-pin backplane connector, chosen to be sufficient for all the lines used by the control station. It was not cost-effective to have a smaller-sized connector for the data stations, though it would have sufficed. We list here a brief description of the function of each of the backplane lines:

ABORT: from scanner to all data stations, clearing all Stage 1 Hit FF's, and resetting ADC circuits.

SHIFT: from scanner to all data stations, clocking all bits from Stage 1 Latches and Hit FF's into Stage 2, clearing all Stage 1 Hit FF's, and resetting ADC circuits.

RESET: from scanner, resets clock on digital TDC's. Quiescent true between events, goes to false during an event for duration of one-shot on scanner.

RDCLK: from scanner, used during readout to clock pipeline registers and drive circuitry for sparse data scan.

HIT: a separate line from each station to the Crate Scanner (lines are labelled HITN, n=0,15, at the control station), asserted by the data module when there are data to be read out from that station.

ENAB: a separate line from the Crate Scanner to each station (labelled ENABN, n=0,15, at the control station), asserted by the Crate Scanner to enable that station for readout.

RTSEL: from scanner, has dual role of gating open front panel inputs to TDC's when not asserted, and, when asserted, allowing computer-controlled TEVEN and TODD to clock Stage 1 Hit FF's during diagnostic tests.

D00-D11: Backplane data bus lines for transfer of data from data stations to the Crate Scanner.

TOG0: diagnostic line for TDC's, from scanner under computer control. Can be used to toggle the phase 0 input to the gray code clock.

TOG1: diagnostic line for TDC's, from scanner under computer control. Can be used to toggle the phase 1 input to the gray code clock.

TEVEN: ("Test even") from scanner under computer control. Rising edge clocks Stage 1 Latches, and high level sets Stage 1 Hit FF's, on all even-numbered channels of TDC's when RTSEL is asserted.

TODD: ("Test odd") from scanner under computer control. Rising edge clocks Stage 1 Latches, and high level sets Stage 1 Hit FF's, on all odd-numbered channels of TDC's when RTSEL is asserted.

TALL: ("Test all") from scanner relayed from front-panel input on scanner. When high, sets all Hit FF's on the ADC's. Used for pedestal tests; the circuitry in the fast logic which generates a Level 1 trigger for a pedestal test also pulses a high state onto the front-panel input for TALL.

# 5. Crate Scanner

The Crate Scanner, under control of the Readout Supervisor, collects data during a readout sequence from the data modules in its crate (or from the scanner of a nearby "slave" crate) and sends these data toward the 3081/E processors. In addition, the Crate Scanner accepts various control and timing signals from the Readout Supervisor and fast logic, and relays them along the backplane to the data modules. The scanner also contains special circuitry to generate various waveforms that are required by particular data modules (currently, we are implementing this feature only for the 2.5-ns least count TDC's).

A simplified block diagram of the E791 Crate Scanner is shown in Fig. 4. The parts related to master/slave operation are omitted, leaving the parts used by an isolated scanner. We describe first the parts related to readout, and then describe the parts devoted to smaller miscellaneous tasks. Prior to commencement of readout, the Readout Supervisor has arranged for all data of the event of interest to reside in the Stage 2 Latches and FF's in the front end digitizing modules in all crates. This is accomplished by a SHIFT, which also resets the readout circuitry.

Each front end data module asserts its backplane HIT line if it has any hits. A priority selection circuit on the scanner looks at the 16 HIT lines from the stations in the crate and asserts the ENAB line going to the station with highest priority (lowest station number). A line called CRINFO (for "crate information") to the Readout Supervisor is asserted by the scanner, and remains asserted until the last word in the crate is read out. The Readout Supervisor begins a stream of pulses on the RDCLK line to the scanners; this stream lasts until the crate with the most data has been read out. The pulse train has an adjustable period of 100-150 ns, with a duty cycle near 50%.

At each RDCLK, the 12 backplane data lines D00-D11 from the 1st Pipeline Register at the selected station are clocked into the "2nd Pipeline Register", which is a set of 16 flip-flops on the scanner. The 4 station address bits from the piece of the 1st Pipeline Register on the scanner are also clocked into the 2nd Pipeline Register. The outputs of the 16 flip-flops of the 2nd Pipeline Register are fanned out to drivers which drive the 8 cables to the 3081/E's and/or diagnostic modules.
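As a concrete picture of the 16-bit word formed in the 2nd Pipeline Register, the sketch below combines a 4-bit station number with the 12 backplane data bits. The bit placement (station number in the four most significant bits) is an assumption made for illustration; the actual bit assignment is not specified here, and the function names are hypothetical.

```python
def pack_data_word(station, data12):
    """Combine a station number (0-15) and 12 data bits into one 16-bit word.

    Assumed layout: station number in bits 15-12, backplane data in bits 11-0.
    """
    if not 0 <= station <= 15:
        raise ValueError("station number must fit in 4 bits")
    if not 0 <= data12 <= 0xFFF:
        raise ValueError("backplane data must fit in 12 bits")
    return (station << 12) | data12


def unpack_data_word(word):
    """Split a 16-bit word back into (station, data12) under the same layout."""
    return (word >> 12) & 0xF, word & 0xFFF


if __name__ == "__main__":
    word = pack_data_word(5, 0x123)
    print(hex(word), unpack_data_word(word))   # 0x5123 (5, 291)
```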

After clocking the 2nd Pipeline Register, RDCLK continues on to clock the piece of the 1st Pipeline Register on the scanner, then onto the backplane to clock the 1st Pipeline Register on the selected data module. The piece on the scanner then contains the 4-bit encoded address of the station whose 1st Pipeline Register is enabled (determined by the priority selection circuitry looking at the 16 HIT lines).

In order to facilitate the reduction of data in the 3081/E processors and offline software, the 16 stations in the crate can be divided into 4 (or fewer) regions numbered 0 to 3. The boundaries of these regions are set by three sets of DIP switches located on the scanner. The switches are set to be the station addresses of the last modules in each of regions 0, 1, and 2. (Region 3 always ends with station 15.) The transfer of data from 1st to 2nd Pipeline Register is interrupted for one RDCLK pulse whenever a region boundary is crossed (i.e., whenever the region number for the next word is different from the region number of the most recent word). At this pulse, a 16-bit "Region Word", generated by the scanner, is clocked into the 2nd Pipeline Register. The bits in this word contain the region number of the region just finished, the word count for that region, a master/slave bit, and a fine event number (for error checking). This RDCLK pulse is not sent over the backplane.

This readout sequence of data words followed by a region word is repeated until the region word of the last region with data is strobed into the 2nd Pipeline Register. On the next RDCLK, a "Crate Word", also generated by the scanner, is clocked into the 2nd Pipeline Register. The bits in this word contain the crate ID number (from DIP switches on the scanner) and an inclusive crate word count. This word is then transferred to the 3rd Pipeline Register (in the Turbo memory) on the next RDCLK.
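The order of words leaving one crate can be illustrated with a short sketch. It uses symbolic tuples rather than real 16-bit layouts, since the bit assignments of the Region Word and Crate Word are not given here. The names `region_of` and `crate_stream` are hypothetical, and three details are assumptions: the region word count covers data words only, the crate word count covers every word sent including the Crate Word itself, and a crate with no hits sends only its Crate Word.

```python
def region_of(station, last_stations):
    """Region number (0-3) of a data station.

    last_stations holds the three DIP-switch settings: the last station of
    regions 0, 1, and 2. Region 3 always ends with station 15.
    """
    for region, last in enumerate(last_stations):
        if station <= last:
            return region
    return 3


def crate_stream(hits, last_stations, crate_id):
    """Build the sequence of words sent by one Crate Scanner.

    hits is a list of (station, data) pairs already in readout order,
    that is, in order of increasing station number.
    """
    stream = []
    current = None      # region of the most recent data word
    count = 0           # data words seen so far in the current region
    for station, data in hits:
        region = region_of(station, last_stations)
        if current is not None and region != current:
            # Region boundary crossed: one strobe is spent on a Region Word.
            stream.append(("REGION", current, count))
            count = 0
        current = region
        stream.append(("DATA", station, data))
        count += 1
    if current is not None:
        stream.append(("REGION", current, count))   # last region with data
    stream.append(("CRATE", crate_id, len(stream) + 1))
    return stream


if __name__ == "__main__":
    hits = [(0, 10), (1, 11), (5, 12), (15, 13)]
    for word in crate_stream(hits, last_stations=(3, 7, 11), crate_id=9):
        print(word)
```

With the switch settings in the usage example, stations 0-3 form region 0, stations 4-7 region 1, stations 8-11 region 2, and stations 12-15 region 3; region 2 has no data and so contributes no Region Word.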

As the Readout Supervisor sends the RDCLK pulses to the Crate Scanners, it sends similar pulses to the memory inputs of the 3081/E's. These pulses are called WRTCLK, since they are used to effect the writing of the data to memory. In Fig. 2, the timing is such that the WRTCLK pulse clocks the data from the 2nd pipeline register (on the scanner) into the 3rd pipeline register (on the Turbo memory board) just before RDCLK arrives at the scanner in order to clock new data into the 2nd pipeline register. The 17th pair of the cable from the Crate Scanner to the Turbo memory carries a signal called ENABP3 (for "Enable Pipeline Register #3") which is used to gate the WRTCLK strobe to the 3rd Pipeline Register. The Crate Scanner is responsible for ensuring that ENABP3 is asserted only when there is valid data on the 16 data lines at the input to the 3rd Pipeline register (furthermore, there are some constraints on times during which transitions on ENABP3 are allowed). ENABP3 is similar, but not identical, to the CRINFO line from the scanner to the Readout Supervisor. A low level on ENABP3 prevents further memory write cycles in the Turbo memory and incrementation of the Turbo memory address register associated with that scanner. The Readout Supervisor monitors the different CRINFO lines from all the Crate Scanners, and terminates the event readout sequence when all are clear.

As mentioned above, it is also possible to read out a second crate through the same cable to the 3081/E. This second, "slave" crate also has a scanner which is identical to that of the first, "master" crate, but which is cabled differently (thus activating some different circuitry). In order to accommodate the various configurations, the front panel of each Crate Scanner has 4 connectors for master/slave communication and for communication with the Readout Supervisor. Figure 5 is a sketch of the front panels and connecting cables for a pair of crates used as master and slave. The labels and number of differential pairs on each connector are: "from slave" (17), "from R.S." (17), "to/from R.S. or master" (5), and "to/from slave" (5). The signals on these cables are listed in Sections 6 and 7. Also shown in Fig. 5 are 2 of the 8 connectors for cables to 3081/E's. These 2 are on the front panel for easy access; the other 6 are mounted on the printed circuit board and can be accessed by removing the Crate Scanner from the crate. Several auxiliary coaxial cables are not shown.

A 5-pair cable connects the "to/from R.S. or master" connector on the slave scanner front panel with the "to/from slave" connector on the master scanner front panel. The "Slave present" and "Master present" levels on pairs in this cable are detected by circuitry on each scanner, so that each knows its role.

The master scanner first reads out the modules in its own crate, including its own Crate Word. It then passes subsequent RDCLK pulses over one of these 5 pairs to the slave Crate Scanner, which performs a complete readout of the slave crate, driving the 16-bit data words onto its "to memory or master" connectors, one of which is connected to a cable to the "from slave" connector on the master scanner. The Region Words and the Crate Word for the slave crate are generated in the slave scanner, and properly inserted among the slave data words. The master scanner merely gates these data from the slave's Pipeline Registers into the master's 2nd Pipeline Register, from where they are sent onto the cables from the master scanner to the 3081/E's. The 17th pair of the cable from the slave to the master has a signal called SINFO, which the slave sets as long as it has data for the master to read out. With a slave present, the master waits until both crates have been read out before clearing CRINFO and ENABP3 to tell the Readout Supervisor and the *Turbo* memory that all information has been transferred.

The "from slave" and "to/from slave" connectors are left disconnected on both slave scanners and isolated scanners (scanners which do not have a slave). On a master scanner (with slave), all connectors are used.

The Crate Scanner has a front-panel input which accepts a Level 1 trigger strobe on a single ECL differential pair from the fast logic. The rising edge of this signal activates "start" circuitry to generate special waveforms required by the digitizing modules. There is a 100 MHz oscillator (an ECL 100K NOR gate fed back to itself with suitable delay) with fanout on coaxial cables to the 16 TDC modules in such crates. There is also a one-shot which fires to make an event gate for the TDC's (this gate is sent to the data modules via the backplane signal RTSEL). The Crate Scanner also has the front-panel input for the signal called TALL, which is sent to all ADC's for setting Hit FF's during pedestal tests. These special circuits have been put on every scanner, so that all scanners are interchangeable.

The Readout Supervisor must know when any of the 24 ports of *Turbo* memory is full. A port is defined to be "full" if there is not room in the port for the data from another event of the largest possible size. It does not suffice for the Readout Supervisor merely to count the RDCLK's which it broadcasts, because in general there is not a port which uses all RDCLK's from every event. Thus, we have placed a counter on each Crate Scanner which counts those RDCLK's which arrive when ENABP3 is true for that scanner. Before the Readout Supervisor starts to fill up a new 3081/E, it pulses a signal called RMEMCT to preset the counters on the scanners. The counters are preset to DIP switch-selectable values chosen so that the counter will overflow at that value corresponding to a full port. Later, when that counter overflows, the scanner asserts a line called FULMEM (for "full memory"). When the Readout Supervisor sees any FULMEM line go true, it finishes the current event, and then switches to a new 3081/E. Since the Readout Supervisor only begins reading out an event when all FULMEM lines are false, it never needs to switch 3081/E's in the middle of an event.
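The choice of preset value can be made concrete with a small sketch. This is a minimal Python model under stated assumptions: the counter is taken to be a simple up-counter of `counter_bits` bits that asserts FULMEM when it overflows, and the counter width and the maximum event size used below are hypothetical (neither is given in the text). The names `counter_preset` and `MemoryCounter` are illustrative only.

```python
def counter_preset(port_words, max_event_words, counter_bits):
    """Preset such that the counter overflows on the first word that
    leaves less than max_event_words of free space in the port."""
    modulus = 1 << counter_bits
    counts_to_full = port_words - max_event_words + 1
    if not 0 < counts_to_full <= modulus:
        raise ValueError("threshold does not fit in the counter")
    return modulus - counts_to_full


class MemoryCounter:
    """Model of the per-scanner word counter (assumed up-counter)."""

    def __init__(self, preset, counter_bits):
        self.preset = preset
        self.modulus = 1 << counter_bits
        self.value = preset
        self.fulmem = False

    def rmemct(self):
        # RMEMCT from the Readout Supervisor reloads the DIP-switch preset.
        self.value = self.preset
        self.fulmem = False

    def rdclk(self, enabp3):
        # Only RDCLK's that arrive while ENABP3 is true are counted.
        if enabp3:
            self.value += 1
            if self.value >= self.modulus:
                self.value -= self.modulus
                self.fulmem = True
```

Because readout of an event starts only while FULMEM is false, at least `max_event_words` of space remain at the start of every event in this model, so an event that sets FULMEM part-way through can still be completed.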

Finally, there is some diagnostic circuitry. In addition to that already described for the TDC's, there is a way to mask off data stations, so that they will not be read out. This can be useful for suppressing "hot" channels in various detector elements without having to physically disconnect cables. It also facilitates various diagnostic tests. This masking is accomplished by a subroutine on the host computer, which uses the Readout Supervisor to communicate with 16 flip-flops on each Crate Scanner. The computer selects a crate by writing a 6-bit port ID into a register in the Readout Supervisor. It then clocks a serial stream of bits into the 16 FF's on the selected scanner, using lines called MASKD and MASKCK (a third line called MASKSV is also set when masking is done on a slave crate). The Readout Supervisor broadcasts MASKD and MASKSV to all scanners, but only sends the clock, MASKCK, out the port selected by the 6 ID bits. The host computer loops over all possible crate ID's; and for each ID, sends out the desired stream of bits. At the end of the sequence (which is done infrequently), each of the 16 mask FF's tells the Crate Scanner whether or not to read out the corresponding data station. The host computer can then verify that the masks are all set by reading out a diagnostic event and checking that data came from the desired modules.
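The serial loading of the station mask can be sketched as a 16-stage shift register. This is a minimal Python illustration; the bit order (which flip-flop corresponds to which station) and the polarity (1 meaning "read out") are assumptions, and `shift_in` and `mask_bitstream` are hypothetical names, not routines from the host software.

```python
N_STATIONS = 16


def shift_in(bits):
    """Clock a serial bit stream into a 16-FF shift register.

    Each element of bits is the MASKD level present at one MASKCK pulse.
    Returns the FF contents, index 0 being the stage nearest the input.
    """
    ffs = [0] * N_STATIONS
    for maskd in bits:
        ffs = [maskd] + ffs[:-1]
    return ffs


def mask_bitstream(stations_to_read):
    """Bit stream that leaves FF[s] == 1 exactly for the chosen stations.

    The bit for station 15 is sent first so that it ends up in the last
    stage after 16 clocks (assumed mapping of FF index to station).
    """
    return [1 if s in stations_to_read else 0
            for s in reversed(range(N_STATIONS))]
```

For example, `shift_in(mask_bitstream({0, 3, 15}))` leaves ones in stages 0, 3, and 15 and zeros elsewhere.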

# 6. Signals on the Master/Slave Cables

When two Crate Scanners are connected in a master/slave configuration as in Fig. 5, they communicate over the twisted-pair cables mentioned above. One of these cables is a 17-pair cable from the "to memory or master" connector of the slave to the "from slave" connector of the master. It carries the 16-bit words read out from the slave crate, and the SINFO signal which tells the master that there is more information in the slave yet to be read out.

The second cable connecting master and slave has 5 pairs, only 4 of which are used. The cable connects the "to/from R.S. or master" connector on the slave crate with the "to/from slave" connector on the master crate. (With the cabling done this way, the former connector is used in its "to/from master" mode). The signals on this cable are:

SPRES: Set to logical high by the slave; the master detects this high, knows that a slave is present, and interprets signals accordingly.

MPRES: Set to logical high by the master; the slave detects this high, and thus knows that a master is present. I.e., it knows that the cable in its "to/from R.S. or master" connector is connected to the master, not to the Readout Supervisor. (The Readout Supervisor sets this line low when connected.)

CKSLV: from scanner; the system RDCLK, gated by master so that slave receives it only when master wants to clock slave's Pipeline Registers.

MASKCK: The clock for the station mask flip-flops (see above), from Readout Supervisor via master.

# 7. Signals between Readout Supervisor and Crate Scanners

The control of the Crate Scanners by the Readout Supervisor is discussed above, and is elaborated on further in this section. Communication takes place over two cables, one 17-pair and one 5-pair. The signals on the 17-pair cables are all broadcast from the Readout Supervisor to the "from Readout Supervisor" connectors on all scanners (including slaves). This requires parallel fanout to separate cables to each scanner. Fanout modules for this purpose have been built, and are powered by placing them in the "spare" stations in the crates (or in empty data stations).

The following 16 signals are broadcast over the 17-pair cables (one pair is not used):

ABORT, SHIFT, RDCLK: described above, since they are also on the crate backplane. Generated by the Readout Supervisor; they may also be controlled by the host computer during diagnostic tests.

RESET, RTSEL, TOGO, TOGI, TEVEN, TODD, MASKD, MASKSV: all under control of the host computer during diagnostics and setup. During fast data acquisition, RESET and RTSEL are generated on the Crate Scanner in response to the front-panel Level 1 strobe.

RMEMCT: signal to reset the memory counter on each Crate Scanner which counts the number of data words sent to its 3081/E port (irrelevant for slave crates).

FE0-FE3: 4-bit Fine Event number. This number is copied onto all Region Words by the scanners. Contains redundant information useful for error checking.

The 5-pair cables carry signals which are used for communication between the Readout Supervisor and individual scanners. There is a separate cable from the Readout Supervisor to each non-slave scanner. It has the following signals:

MASKCK: clock for station mask FF's in Crate Scanners. The host computer puts a port ID into a register on the Readout Supervisor, and toggles MASKCK. The Readout Supervisor passes MASKCK onto the 5-pair cable pointed to by the ID.

FULMEM: Signal from each (non-slave) Crate Scanner to the Readout Supervisor. A scanner sets FULMEM when its memory counter has passed a preset threshold, indicating that there is not room for another event in its 3081/E port. The FULMEM signals from all the scanners are OR'ed by the Readout Supervisor; the Readout Supervisor switches to a new 3081/E at the end of the event in which the OR goes true.

CRINFO: Signal from each (non-slave) Crate Scanner to the Readout Supervisor. A scanner sets CRINFO when a SHIFT is done, and clears it when the last word of information in an event is transferred to the 3081/E. The CRINFO signals from all the scanners are OR'ed by the Readout Supervisor; the Readout Supervisor keeps sending out RDCLK strobes until the OR goes false.

MPRES: Set false by the Readout Supervisor, telling the scanner that it is not a slave. The scanner then interprets the other signals on this 5-pair cable accordingly.

SPARE1: A spare, under computer control, but not used.

# 8. Readout Supervisor: Functional Description

The Readout Supervisor, under control of the host computer ( $\mu$ VAX), coordinates trigger decisions with the appropriate action. More sophisticated control is introduced when the "Level 2 trigger" circuit is activated. The Level 2 circuit is an event filter, the purpose of which is to reduce the number of events read out into the 3081/E's. The time for a Level 2 decision is less than a microsecond. In the physics data taken thus far, the circuit was tested successfully, but not activated. From the point of view of the Readout Supervisor, all events simply passed Level 2. We describe here the more general case, where events can fail Level 2.

For each Level 2 "no", the Readout Supervisor deletes the corresponding event from the front end electronics by a method discussed below. For each Level 2 "yes", it effects the transfer of the corresponding event from the front end electronics to the *Turbo* memory. When a 3081/E's memory is full, the Readout Supervisor initiates the start sequence for that 3081/E, and begins to fill the next available 3081/E. The host computer keeps the supervisor informed of the availability of the several 3081/E's. The Readout Supervisor also performs various miscellaneous bookkeeping and diagnostic tasks, discussed below.

The deadtime circuit for the entire experiment is within the realm of the Level 1 trigger, which is beyond the scope of this paper. The Readout Supervisor tells the deadtime circuit that it is busy by asserting a front-panel signal called RSBUSY. When RSBUSY is false, the entire front end electronics is ready for another Level 1 trigger. When a Level 1 trigger arrives, the Readout Supervisor immediately sets RSBUSY true. It stays true until the Readout Supervisor has arranged for the front end electronics to be ready for another event.

As discussed above, the front end digitizing modules can store two complete events in their Stage 1 and Stage 2 latches and FF's. The Readout Supervisor keeps track of the presence and Level 2 status of events in these two stages. If Stage 1 has an event in it, then the experiment is necessarily dead, because there is no place to put the next event. Thus, the over-riding concern of the Supervisor is to keep Stage 1 empty as much as possible, in order to maximize the live time of the experiment. It has available the SHIFT line to move an event from Stage 1 to Stage 2, simultaneously clearing the Stage 1 Hit FF's, and the ABORT line to clear the Stage 1 Hit FF's. The way it accomplishes this task can be illustrated by analyzing various cases and the corresponding action to be taken. The following list of cases gives the situation, followed by the action taken on the SHIFT and ABORT lines. In each case, the supervisor must also update its records of which events are in what stages. Note that it is never necessary for the Readout Supervisor to reset Stage 2 latches or hit FF's; they merely can be overwritten the next time there is a SHIFT. Cases involving changing 3081/E's and between-spill calibration events have been omitted from the list.

1. Beginning of the run. Host computer sets various registers; Readout Supervisor asserts ABORT to clear Stage 1 FF's, clears RSBUSY.

2. There is an event in Stage 1, and no event in Stage 2. Assert SHIFT, then clear RSBUSY, thus allowing the experiment to be re-enabled.

3. There is an event in Stage 2 but no event in Stage 1; the Level 2 decision for the event arrives and is "no". No action on SHIFT and ABORT lines (experiment remains live).

4. There is an event in Stage 2 but no event in Stage 1; and the Level 2 decision for the event arrives and is "yes". Begin readout of the event from Stage 2 to the *Turbo* memory (experiment remains live).

5. There are events in Stages 1 and 2; the Level 2 decision for the event in Stage 2 arrives and is "no". Assert SHIFT, then clear RSBUSY, thus allowing the experiment to be re-enabled.

6. There are events in Stages 1 and 2; the Level 2 decision for the event in Stage 2 arrives and is "yes". Begin readout of the event from Stage 2 to the *Turbo* memory (experiment remains dead).

7. An event in Stage 2 is being read out, and another event is in Stage 1; the Level 2 decision for the event in Stage 1 arrives and is "no". Assert ABORT, then clear RSBUSY, thus allowing the experiment to be re-enabled. (This capability helps to give the system such a low overall dead time).

8. An event in Stage 2 is being read out, and another event is in Stage 1; the Level 2 decision for the event in Stage 1 arrives and is "yes". No action (experiment remains dead).

9. The readout of an event in Stage 2 finishes, and there is no event in Stage 1. No action (experiment remains live).

10. The readout of an event in Stage 2 finishes, and there is an event in Stage 1 for which the Level 2 decision has not yet been made. Assert SHIFT, then clear RSBUSY, thus allowing the experiment to be re-enabled.

11. The readout of an event in Stage 2 finishes, and there is an event in Stage 1 for which a Level 2 "yes" has previously arrived. Assert SHIFT, then clear RSBUSY, thus allowing the experiment to be re-enabled; then begin readout of the event now in Stage 2.
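The case list above can be condensed into a small decision function. The following is a minimal Python sketch of that logic, not the actual RAM program. The trigger names (`RUN_START`, `L1`, `L2_NO`, `L2_YES`, `RD_DONE`), the action labels, and the function name `supervisor_step` are hypothetical. The handling of a Level 1 trigger that arrives while Stage 2 is occupied (no immediate action) is inferred from the cases rather than stated in them.

```python
def supervisor_step(trigger, s1, s2, reading, s1_l2_yes=False):
    """Return (case number, actions) for one supervisor decision.

    s1, s2: Stage 1 / Stage 2 hold an event when the trigger is handled
            (for "L1" the new event is the one now in Stage 1).
    reading: the Stage 2 event is currently being read out.
    s1_l2_yes: a Level 2 "yes" was already recorded for the Stage 1 event.
    """
    if trigger == "RUN_START":
        return 1, ["ABORT", "CLEAR_RSBUSY"]
    if trigger == "L1":
        if not s2:
            return 2, ["SHIFT", "CLEAR_RSBUSY"]
        return None, []  # Stage 2 occupied: wait, experiment stays dead
    if trigger in ("L2_NO", "L2_YES"):
        yes = trigger == "L2_YES"
        if reading and s1:
            # The decision refers to the event waiting in Stage 1.
            return (8, []) if yes else (7, ["ABORT", "CLEAR_RSBUSY"])
        if s2 and s1:
            return (6, ["RD_START"]) if yes else (5, ["SHIFT", "CLEAR_RSBUSY"])
        if s2:
            return (4, ["RD_START"]) if yes else (3, [])
    if trigger == "RD_DONE":
        if not s1:
            return 9, []
        if s1_l2_yes:
            return 11, ["SHIFT", "CLEAR_RSBUSY", "RD_START"]
        return 10, ["SHIFT", "CLEAR_RSBUSY"]
    raise ValueError("combination not covered by the listed cases")
```

Only cases 6 and 8 leave the experiment dead without an immediate action that re-enables it, which is the situation of two closely spaced events that both pass Level 2.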

From a study of the above cases, it is clear that the only time that the readout circuitry can introduce appreciable dead time is when two events are close together in time and both pass Level 2. In that case, the experiment is dead until the readout of the event in Stage 2 finishes, and the other event is shifted from Stage 1 to Stage 2 (see cases 8 and 11 above). This occurs only rarely for Level 1 and Level 2 rates that are in a range that can be physically handled by the detectors. A detailed simulation of the readout architecture during its design phase assumed Level 1 rates of 250 kHz, Level 2 "yes" rates of 50 kHz, Level 2 decision time of 1  $\mu$sec, and event readout time of 2  $\mu$sec. The deadtime due to pile-up in the readout buses was only 2%. In actual practice, the trigger rates have been lower and the readout time slightly higher, with the net result that the readout-related deadtime is even less.

Several miscellaneous tasks are handled by the Readout Supervisor. First, we discuss three counters in the Readout Supervisor: the spill counter, the event counter, and the time counter. The bits of these counters all leave the Readout Supervisor on flat ribbon cables in order to be latched by a latch data module (Sec. 3.3) and thus read out every event. The 12-bit "spill counter" counts the number of spills in the run, and is reset at the beginning of the run. This counter is also VAX-readable via CAMAC.

The 20-bit "event counter" counts Level 1 triggers, and is reset at the beginning of every spill. The low-order 4 bits form the fine event number, which is sent to all scanners during readout (the scanners put them in the region words for redundancy). These 4 bits are stored in the Readout Supervisor by the SHIFT operation, so that the Readout Supervisor remembers what the Level 1 number was for an event which is read out sometime later. The required match between the fine event number on each region word and the low-order 4 bits of the latched event counter provides a powerful check on the integrity of the data buffers.
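The fine-event-number check amounts to comparing the low-order 4 bits of the latched event counter with the 4 bits carried by each Region Word. A minimal Python sketch, with hypothetical function names and with the Region Word fine event numbers assumed to be already extracted as integers:

```python
EVENT_COUNTER_BITS = 20
FINE_EVENT_MASK = 0xF  # low-order 4 bits of the event counter


def fine_event_number(event_counter):
    """Low-order 4 bits of the 20-bit Level 1 event counter."""
    return event_counter & FINE_EVENT_MASK


def region_words_consistent(latched_event_counter, region_fine_numbers):
    """True if every Region Word carries the expected fine event number."""
    expected = fine_event_number(latched_event_counter)
    return all(fe == expected for fe in region_fine_numbers)
```

A single mismatch indicates that a buffer somewhere in the chain holds data from a different Level 1 trigger, up to the 1-in-16 ambiguity inherent in a 4-bit number.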

The 24-bit "time counter" counts cycles of a 10-MHz oscillator. The time counter is latched and held for 250 ns when a Level 1 trigger arrives, facilitating subsequent latching in the latch data module. A digital filter is used to ensure that the time is latched only when the counter bits are stable. The counter is reset at the start of every spill.
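As a quick check of the range of this counter, using only the numbers quoted above (24 bits, 10 MHz clock):

$$
\Delta t = \frac{1}{10\ \text{MHz}} = 100\ \text{ns}, \qquad T_{\text{max}} = 2^{24}\,\Delta t = 16\,777\,216 \times 100\ \text{ns} \approx 1.68\ \text{s}
$$

Each count therefore corresponds to 100 ns, the 250 ns hold spans 2.5 clock periods, and the counter would wrap only after about 1.68 s without a reset.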

The Readout Supervisor has an 8-bit CAMAC-addressable register with bits that the  $\mu$ VAX sets to indicate which 3081/E processors are empty. At the beginning of a run, the  $\mu$ VAX sets as many bits as there are working 3081/E processors (normally 8). The Readout Supervisor picks the first 3081/E with a set bit, and routes data to that 3081/E until one of the following happens: 1) one of the Crate Scanners asserts its FULMEM line, 2) the end-of-spill (EOS) signal arrives, and a bit in the Readout Supervisor has previously been set by the  $\mu$ VAX to cause a 3081/E change at EOS, or 3) a high state is pulsed onto a front-panel ECL differential pair (this external FULMEM line allows for the capability to force a change of 3081/E's after a preset number of triggers, etc.). When one of these happens, the Readout Supervisor clears the bit for that processor and switches to the next processor that has a set bit (it may have to wait for one). A bit remains clear until the processor finishes and interrupts the  $\mu$ VAX, which then uploads the data from the processor, and then sets the bit.
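The bookkeeping on the availability register can be sketched as follows. This is a minimal Python illustration: "first" is assumed to mean the lowest-numbered set bit, and the function names are hypothetical.

```python
N_PROCESSORS = 8


def next_processor(avail):
    """Index of the first 3081/E whose availability bit is set, or None.

    avail: the 8-bit register maintained by the host (bit i set means
    processor i is empty). None means the supervisor must wait.
    """
    for i in range(N_PROCESSORS):
        if avail & (1 << i):
            return i
    return None


def mark_filled(avail, i):
    """Supervisor clears bit i when it stops routing data to processor i."""
    return avail & ~(1 << i) & 0xFF


def mark_uploaded(avail, i):
    """Host sets bit i again after uploading the processor's output."""
    return avail | (1 << i)
```

Starting from `0b11111111`, successive calls to `next_processor` and `mark_filled` step through processors 0, 1, 2, and so on, until the host begins setting bits again.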

The Readout Supervisor keeps a record of the Level 1 and Level 2 trigger bits associated with an event. These bits are input to the Readout Supervisor on ECL differential pairs from the fast logic and from the Level 2 logic. When an event with a Level 2 "yes" is readied for readout, the Readout Supervisor drives these bits onto front-panel connectors. These bits may be used to gate LeCroy 4302 memory modules (typically using the LeCroy 4508 memory lookup) which are snooping on the event. The Readout Supervisor can handle up to 16 L1 bits and 16 L2 bits, so that one may choose among 32 possible bits for gating the 4302 modules.

# 9. Readout Supervisor: Hardware Implementation

The Readout Supervisor consists of 12 wire-wrapped CAMAC modules, of which there are 8 distinct types. CAMAC was chosen for convenient packaging and ease of incorporating extensive diagnostic facilities. During readout of physics data, the CAMAC dataway itself is used only for sending infrequent control signals. Since most of the readout system is ECL while CAMAC is TTL, level conversion takes place at the front of many of the Readout Supervisor modules. Internally, Readout Supervisor modules use ECL logic where high speed action is required, and FTTL logic where high density circuits (such as RAMs) are needed or where communication with the CAMAC backplane is required.

The Readout Supervisor modules are constructed to operate as independent "subroutines" that only occasionally interact with the main supervisor module. Communication between modules is accomplished using both ECL and TTL signals on ribbon-cable backplanes located above the normal CAMAC backplane.

Two "Trigger Latch" modules (TLM) accept external trigger bits, keep track of them during subsequent shifting, aborting, etc., and drive the results externally toward latch modules at readout time. For the physics data-taking described here, only one of these modules was used. A "Crate Scanner Output" module (CSO) drives the 17-pair cable (see Section 7) to the crate scanners (it is fanned out by ECL fanout modules). Four "Crate Scanner I/O" modules (CSI) drive the 5-pair cables (see Section 7) to the 24 crate scanners. A "Counter" module (CTR) contains the time counter, event counter, and spill counters for the events in the two Stages, with outputs to drive the appropriate numbers to the latch modules at readout time. Two modules called "3081/E selector" (SEL) and "3081/E driver" (DVR) coordinate the routing of the raw data to the various 3081/E's, setting WRTCLK and STARTE at the appropriate times. (STARTE restarts a 3081/E, as explained in Sec. 10.) The selector module receives a "seek request" signal from the main supervisor and initiates a search for the next available 3081/E by first starting the current 3081/E and then selecting the next 3081/E from the  $\mu$VAX-maintained availability register. When a 3081/E is selected (which may take an indefinite length of time) the module sends a "seek done" signal back to the main supervisor.

Finally, the "Level 2 Processor" module (L2P) coordinates the Level 2 trigger with the main supervisor. This module contains a CAMAC programmable delay line that defines the length of time needed by the Level 2 trigger logic. It also keeps track of the lower two bits of the fine event number as an internal consistency check.

The heart of the Readout Supervisor is a "State Machine" that recognizes 16384 distinct states (many illegal) of the readout system. This main supervisor is implemented with a set of four  $16K \times 4$  25 ns static RAMs configured with latch circuits, a digital filter, and a multi-tap digital delay line to operate as an asynchronous, data driven, "state machine".

Initially, the RAM is loaded via CAMAC with the complete pattern of states and actions. Once a run begins, the machine cycles unattended, coordinating the data transfer throughout the system, generating the necessary strobes, etc. The host  $\mu$ VAX need communicate only to set and clear bits to inform the Readout Supervisor when the various 3081/E's are uploaded and are ready to take more raw data. Diagnostic circuits allow the RAM to be written and read via CAMAC, and can "single step" the State Machine for debugging.

A block diagram of the State Machine in the main Readout Supervisor module (RSM) is shown in Fig. 6. (CAMAC dual port and diagnostic circuitry are omitted for clarity.) Normally, the State Machine is idle. The arrival of an external event, such as a Level 1 or Level 2 trigger, causes the machine to initiate a cycle. The event sets its corresponding trigger flip-flop to indicate that a cycle request is pending. An additional Level 1 trigger flip-flop also asserts RSBUSY to hold off the front end of the experiment.

A machine cycle is started when the "cycle" flip-flop is clocked by one or more of the 5 trigger flip-flops. A 14-bit latch then records the instantaneous state of the 5 trigger flip-flops, 3 trigger modifier bits, and 6 previous machine state bits. The output of the latch forms the input address of a 16K-address by 16-data-bit static RAM array. The addressed word appears on the RAM output data lines within 25 ns. The RAM data word is a machine instruction determined according to a state table. Part of the instruction indicates a new state of the machine, and part indicates some action to be taken as a result of the current cycle.
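The formation of the RAM address from the 14 latched bits can be sketched as a simple bit-packing operation. This is a minimal Python illustration; the ordering of the three groups within the address is an assumption (the actual wiring is not given here), and `ram_address` is a hypothetical name.

```python
TRIGGER_BITS = 5    # trigger flip-flops
MODIFIER_BITS = 3   # trigger modifier bits
STATE_BITS = 6      # previous machine state bits
ADDRESS_BITS = TRIGGER_BITS + MODIFIER_BITS + STATE_BITS  # 14 -> 16K words


def ram_address(trigger_ffs, modifier_bits, previous_state):
    """Pack the three latched groups into one 14-bit RAM address.

    Assumed layout (high to low): trigger | modifier | previous state.
    """
    if not (0 <= trigger_ffs < 1 << TRIGGER_BITS
            and 0 <= modifier_bits < 1 << MODIFIER_BITS
            and 0 <= previous_state < 1 << STATE_BITS):
        raise ValueError("field out of range")
    return ((trigger_ffs << (MODIFIER_BITS + STATE_BITS))
            | (modifier_bits << STATE_BITS)
            | previous_state)
```

With all 14 bits set the address is 16383, the last location of the 16K-word array; each location holds one 16-bit instruction word.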

The "cycle flip-flop" also sends a 10 ns wide pulse down a multi-tap digital delay line. The 50 ns tap strobes a latch that holds 6 "new machine state" data bits. These bits are fed back to the RAM address latch to form the "previous machine state" bits for the next cycle.

The 60 ns tap strobes 5 "action data bit" gates to initiate an activity such as SHIFT or ABORT. It also strobes 5 "machine clear" gates that clear only the flip-flop that initiated the cycle. An 80 ns tap then attempts to "recycle" the State Machine, in case any triggers arrived during the cycle or if there were multiple triggers initially. If two or more triggers were present at the start of the cycle, the RAM instruction determines which trigger to deal with first. If, for example, a Level 1 and Level 2 trigger were both present simultaneously, the State Machine cycle would deal with the Level 1 situation, clear the Level 1 flip-flop, and generate a re-trigger to deal with the Level 2 situation. If the input RAM address corresponds to an illegal situation, such as a spurious Level 1 trigger arriving when there are already events in Stage 1 and Stage 2, a HALT instruction stops the State Machine and prevents further cycles. While taking physics data, no "spurious" HALT's occurred; the infrequent occurrences were definitively associated with hard failures (power, etc.).

The contents of the RAM are constructed by a  $\mu$ VAX-resident FORTRAN program. The RAM address bits are divided into three groups:

1) "Machine state address bits" describe the condition of the State Machine as the result of the last machine cycle. The 6 machine state address bits are:

| Bit          | Meaning                                                         |
|--------------|-----------------------------------------------------------------|
| S1           | An event is in Stage 1                                          |
| S2           | An event is in Stage 2                                          |
| S1L2         | The Level 2 decision for the event in Stage 1 is "yes"          |
| S2L2         | The Level 2 decision for the event in Stage 2 is "yes"          |
| SEEKING      | A 3081/E seek cycle is in progress                              |
| SEEK PENDING | A 3081/E seek cycle request is pending (readout is in progress) |

2) "Machine trigger bits" indicate which external event caused the State Machine to cycle. The 5 machine trigger bits are:

| Bit       | Meaning                                                |
|-----------|--------------------------------------------------------|
| L1        | Level 1 trigger                                        |
| L2        | Level 2 trigger                                        |
| RD DONE   | Readout cycle completed                                |
| SEEK DONE | 3081/E processor seek completed                        |
| SEEK REQ  | 3081/E processor seek request (FULMEM or end-of-spill) |

3) "Machine trigger modifier bits" provide additional information about the machine trigger. The 3 trigger modifier bits are:

| Bit         | Meaning                     |
|-------------|-----------------------------|
| L2 DECISION | Level 2 decision is "yes"   |
| BEAM SPILL  | A beam spill is in progress |
| MODE        | Unassigned, hardwired to 0  |

The RAM output data bits are also divided into three groups:

1) "Machine state data bits" record the new state of the State Machine as a result of the last cycle. The 6 machine state data bits are the same as the machine state address bits given above.

2) "Machine action data bits" initiate a supervisor function with a pulse from the State Machine cycle delay line circuit. The 5 machine action data bits are:

| Bit         | Function                                    |
|-------------|---------------------------------------------|
| SEEK 3081/E | Seek pulse to 3081/E selector module        |
| SHIFT       | SHIFT pulse to crate scanners via CSO       |
| ABORT       | ABORT pulse to crate scanners via CSO       |
| RD START    | Readout Start pulse starts RDCLK and WRTCLK |
| HALT        | Halts Readout Supervisor cycles             |

3) "Machine clear data bits" clear individual trigger bit flip-flops. If a FF is cleared, the State Machine considers the trigger request completed. If a FF is not cleared, the State Machine circuit will automatically cycle again until all FF's are cleared. Multiple trigger requests can be present simultaneously but may be serviced one at a time or all at once depending on how the RAM software is written. FF's are cleared with a pulse derived from the State Machine delay line circuit. The 5 machine clear pulse data bits (CLR SEEK DONE, CLR SEEK REQ, CLR RD DONE, CLR L2, and CLR L1) each clear their respective FF's.
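The three groups of output bits (6 state, 5 action, 5 clear) account for all 16 data bits of a RAM word. A minimal Python sketch of encoding and decoding such an instruction word follows. The bit names are those listed in this section, but the assignment of names to bit positions is an assumption, and `encode` and `decode` are hypothetical helpers rather than part of the FORTRAN program.

```python
STATE_BITS = ("S1", "S2", "S1L2", "S2L2", "SEEKING", "SEEK PENDING")
ACTION_BITS = ("SEEK 3081/E", "SHIFT", "ABORT", "RD START", "HALT")
CLEAR_BITS = ("CLR SEEK DONE", "CLR SEEK REQ", "CLR RD DONE", "CLR L2", "CLR L1")

# Assumed layout: bit i of the 16-bit word corresponds to FIELDS[i].
FIELDS = STATE_BITS + ACTION_BITS + CLEAR_BITS


def encode(names):
    """Build a 16-bit instruction word with the named bits set."""
    word = 0
    for name in names:
        word |= 1 << FIELDS.index(name)
    return word


def decode(word):
    """Return the set of bit names that are set in an instruction word."""
    return {name for i, name in enumerate(FIELDS) if (word >> i) & 1}
```

For instance, an instruction that moves an event to Stage 2 and retires the Level 1 request could be written as `encode({"S2", "SHIFT", "CLR L1"})` in this representation.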

With the current RAM program, the state machine services only one pending cycle request at a time. There can be up to 5 requests pending simultaneously. In principle, all 5 requests can be dealt with in a single machine cycle. However, we found it unnecessary to introduce this level of sophistication. Multiple cycle requests are serviced in the order in which the "machine trigger bits" are listed above. A Level 1 trigger is given highest priority so that the experiment's dead time is minimized.
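The one-request-per-cycle policy with fixed priority can be sketched as follows. This is a minimal Python illustration of the ordering only, assuming no new requests arrive while the pending ones are being serviced; `service_order` is a hypothetical name.

```python
# Priority order: the order in which the machine trigger bits are listed.
PRIORITY = ("L1", "L2", "RD DONE", "SEEK DONE", "SEEK REQ")


def service_order(pending):
    """Order in which pending cycle requests are serviced, one per cycle."""
    remaining = set(pending)
    unknown = remaining - set(PRIORITY)
    if unknown:
        raise ValueError(f"unknown trigger bits: {sorted(unknown)}")
    order = []
    while remaining:
        # Each machine cycle services the highest-priority pending request;
        # the recycle tap then starts another cycle for the rest.
        current = next(t for t in PRIORITY if t in remaining)
        order.append(current)
        remaining.discard(current)
    return order
```

With Level 1, Level 2, and a seek request pending together, the Level 1 request is handled in the first cycle, followed by the other two in successive automatic recycles.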

Each state machine cycle takes about 60 ns. (This could be reduced to about 40 ns.) Automatic recycles can occur every 80 ns. Like much of the system, the performance was sufficiently fast without tuning the timing as tightly as possible.

# 10. Signals from the Readout Supervisor to the 3081/E Processors

The Readout Supervisor communicates with a 3081/E controller via a single signal called STARTE (TTL differential pair), the purpose of which is to re-start the 3081/E. The supervisor pulses this line when the *Turbo* memory on that processor is full or when (if option is selected) the proton spill has ended. A separate line goes from the supervisor to each of the 3081/E's.

In addition, the Readout Supervisor sends the 100-150 ns WRTCLK strobes to all *Turbo* memory boards. WRTCLK is synchronous with RDCLK, the clocks to the scanners, differing only slightly in phase and duty cycle. The WRTCLK strobes clock the data from the Crate Scanners into the 3rd Pipeline Registers on the memory boards, and initiate write cycles to the memory (Fig. 2). Because valid data does not arrive at the 3rd Pipeline Register until the first two RDCLK strobes have been sent to the Crate Scanners, the Crate Scanners do not assert ENABP3 (the WRTCLK gate) on their cables to the 3081/E's until just before the third RDCLK/WRTCLK. The module which generates the WRTCLK signal for the selected 3081/E is in the Readout Supervisor. The WRTCLKs for each 3081/E are fanned out to all of its *Turbo* boards in a fanout module.
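The two-strobe latency of the pipeline can be reproduced with a toy model of the three registers. This is a minimal Python sketch under simplifying assumptions: one crate with no Region or Crate Words, data words represented as integers, and ENABP3 taken to be true whenever the 2nd Pipeline Register holds a valid word. The name `run_pipeline` is hypothetical.

```python
def run_pipeline(words):
    """Clock data words through the three pipeline registers.

    Returns (memory contents, strobe number of the first write,
    total number of RDCLK/WRTCLK strobes issued).
    """
    p1 = p2 = None          # 1st (module) and 2nd (scanner) registers
    memory = []             # words written via the 3rd register
    source = iter(words)
    first_write = None
    strobe = 0
    while True:
        strobe += 1
        # WRTCLK arrives just before RDCLK: the 3rd register takes the
        # current content of the 2nd, but only if ENABP3 is asserted.
        enabp3 = p2 is not None
        if enabp3:
            memory.append(p2)
            if first_write is None:
                first_write = strobe
        # RDCLK first clocks the 2nd register, then the 1st.
        p2 = p1
        p1 = next(source, None)
        if p1 is None and p2 is None:
            return memory, first_write, strobe
```

For three data words the model writes the first word on the third strobe and needs five strobes in total, i.e. two more than the number of words.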

# 11. 3081/E Processors

The memories of the eight 3081/E processors serve as large buffers into which raw data may be dumped at a high rate. The cables and signals that effect this transfer are described above. In this section we briefly mention the modifications for E791 to the memory and processor interface of the standard<sup>2</sup> SLAC 3081/E (the processor boards themselves are standard). As discussed above, "*Turbo*" memory boards were constructed by adding external ports to the standard memory boards. This required a slight increase in the board size, which also resulted in the production of custom crates<sup>7</sup> similar to those in the rest of the readout system. On the *Turbo* memory boards, there is a separate memory address register for each of four 16-bit columns of memory. Each *Turbo* board then accommodates one cable from each of four Crate Scanners. The four columns of 16-bit words fill up independently, each at the rate for hits in the corresponding crate. Each memory address register increments only when its ENABP3 line is asserted, though the Readout Supervisor sends RDCLK strobes to the whole 3081/E as long as any scanner still needs them. During processing, the 3081/E processor actually considers these 4 sets of 16-bit words to be organized as 64-bit words, though smaller units are addressable.
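The independent filling of the four columns, and their later interpretation as 64-bit words, can be sketched with a toy Python model. It is an illustration under assumptions, not the board design. The class name, the method names, and the choice of column 0 as the least significant 16 bits are all hypothetical, since the text does not specify the bit ordering.

```python
class TurboBoardModel:
    """Toy model of one Turbo board: four independent 16-bit columns."""

    COLUMNS = 4

    def __init__(self, depth):
        self.depth = depth
        self.columns = [[0] * depth for _ in range(self.COLUMNS)]
        self.address = [0] * self.COLUMNS  # one address register per column

    def write_strobe(self, data, enabp3):
        """Apply one write strobe to all columns.

        Only columns whose ENABP3 flag is set store their word and
        increment their address register.
        """
        for col in range(self.COLUMNS):
            if enabp3[col]:
                if self.address[col] >= self.depth:
                    raise OverflowError(f"column {col} is full")
                self.columns[col][self.address[col]] = data[col] & 0xFFFF
                self.address[col] += 1

    def word64(self, row):
        """Return one row of the four columns packed into a 64-bit integer.

        Column 0 occupies the low 16 bits (an arbitrary, assumed ordering).
        """
        value = 0
        for col in range(self.COLUMNS):
            value |= self.columns[col][row] << (16 * col)
        return value
```

Because every strobe reaches all four columns but only the enabled ones advance, a crate with few hits leaves its column mostly empty while a busy crate's column fills quickly.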

Each *Turbo* board can accommodate up to 2 Mbytes of static memory ($64\mathrm{K} \times 1$, 55 ns TTL). In the master/slave configuration described in Section 5, we loaded *Turbo* memory boards with 1 Mbyte of memory, resulting in a 128K word buffer for each of the four 16-bit ports on the board. Three *Turbo* boards could then accommodate the 24 crates which are read out. In subsequent physics data-taking runs, we can increase the number of memory chips loaded.
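The buffer figures quoted above can be checked with a few lines of Python. The sketch assumes binary units (1 Mbyte = 2^20 bytes, 1K = 1024) and assumes that each master/slave pair of crates shares one cable to a memory port. The second assumption is our reading of the master/slave configuration and is not stated explicitly here. The function names are illustrative.

```python
BYTES_PER_MBYTE = 2**20  # assumed binary megabyte
PORTS_PER_BOARD = 4      # four 16-bit ports per Turbo board (from the text)
BYTES_PER_WORD = 2       # 16-bit words


def words_per_port(mbytes_loaded):
    """Number of 16-bit words available to each port of one board."""
    return mbytes_loaded * BYTES_PER_MBYTE // (PORTS_PER_BOARD * BYTES_PER_WORD)


def crates_served(boards, crates_per_cable):
    """Number of crates read out, given one cable per memory port."""
    return boards * PORTS_PER_BOARD * crates_per_cable


print(words_per_port(1) // 1024)  # buffer depth per port in K words, 1 Mbyte loaded
print(crates_served(3, 2))        # crates served by three boards in master/slave mode
```

Under these assumptions, 1 Mbyte per board yields 128K words per port, and three boards with two crates per cable account for the 24 crates. Loading the full 2 Mbytes would double the depth of each port.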

A processor is off while its memory is being loaded with raw data. When the memory is full, the Readout Supervisor re-starts the processor, which then finds the data in the *Turbo* memory boards. Events which pass the Level 3 software filter are uploaded to the host $\mu$VAX. Communication with the $\mu$VAX is via a DR11W interface<sup>8</sup>, which is connected to a custom processor interface in the 3081/E.

## Acknowledgements

We thank all those who participated in discussions regarding the system design and in its construction, including K. Biery, G. Bonneaud, J. Ellett, C. Friedman, P. Kunz, W.K. McFarlane, W. Slater, R. VanBerg, and D. Wagner. This work was supported by our respective institutions and by the U.S. Department of Energy, the National Science Foundation, and the Alfred P. Sloan Foundation.

## References

<sup>a</sup>Present address: Physics Department, University of California, Irvine, CA 92717

<sup>b</sup>Present address: Analytek Ltd., 10261 Bubb Rd., Cupertino, CA 95014

<sup>c</sup>Present address: Physics Department, University of Texas at Austin, Austin, TX 78712

<sup>d</sup>Present address: MCI Communications Corp., 8283 Greensboro Dr., McLean, VA 22102

- [1] R. Cousins, et al., "Study of Very Rare $K_L$ Decays", AGS Proposal 791 (June 1984)
- [2] P.F. Kunz, et al., "The 3081/E Processor", SLAC-PUB-3069, CERN/DD/83/3 (1983)
- [3] P. Chauvat, et al., "The Performance of a 16000 Wire Mini-drift MWPC System," CERN-EP/82-198. Published in Collider Detectors: Present Capabilities and Future Possibilities, Proceedings of the DPF Workshop, ed. by S.C. Loken and P. Nemethy (LBL, Berkeley) 1983.
- [4] J. Ellett and A. Russ, R608 TDC Readout System Manual, UCLA-HEP-8001.
- [5] R. Cousins, C. Friedman, and P. Melese, "32-Channel Digital 6-bit TDC with 2.5 ns Least Count", contributed to the IEEE 1988 Nuclear Science Symposium, Orlando, FL
- [6] K. Biery, D. Ouimette, and J. Ritchie, "A Fast Integrating 8 Bit Bilinear ADC", contributed to the IEEE 1988 Nuclear Science Symposium, Orlando, FL
- [7] Calmark Corporation, San Gabriel, CA 91776
- [8] Model MV-F-DR11-W of MDB Systems Inc., Orange, CA 92267



Figure 1. Block diagram showing the inter-connections of the modules in the readout system. Most of the connections between the crate controllers and the *Turbo* memory boards are not shown.



Figure 2. Principle of the pipelined readout. The WRTCLK and RDCLK strobes are closely related, as described in the text.



Figure 3. Block diagram of a generic digitizing data module.



Figure 4. Block diagram of the E791 Crate Scanner. Functions related to master/slave operation (described in the text) are not shown.



Figure 5. Sketch of the Crate Scanner front panels and connecting cables for a pair of crates used as master and slave.



Figure 6. Block diagram of the State Machine in the main Readout Supervisor module. Auxiliary CAMAC and diagnostic circuitry is not shown.
