# The Readout Architecture of the ATLAS Pixel System

Roberto Beccherle, on behalf of the ATLAS Pixel Collaboration

Istituto Nazionale di Fisica Nucleare, Sez. di Genova, Via Dodecaneso 33, I-16146 Genova, Italy

# 1 Introduction

This paper describes the readout architecture of the ATLAS Pixel System [1, 2, 3]. The Pixel Detector is the innermost sub-detector of the ATLAS experiment. The elementary building blocks are pixel modules with an active area of  $16.4 \times 60.8 \text{ mm}^2$ . They are arranged on 3 layers of cooling staves in the barrel region and on  $2 \times 3$  ring-shaped "disks" in the forward and backward regions.

In this paper I will give a general overview of the system, focusing on the active electronic components. Next I will describe system initialisation and configuration, followed by the data-taking phase. I will also present some architecture simulations and results that show the safety margins of the chosen architecture. Finally, some experimental results from test beam and irradiation measurements will be shown.

# 2 The ATLAS Pixel Detector System

A picture of an ATLAS Pixel Detector module is shown in figure 1. The pixel module is a sandwich of several layers. From bottom to top we first find the front-end (FE) read-out chips, then the sensor in the middle and a high-density kapton film with two routing layers ("flex hybrid") on top. Finally, on top of the kapton we have SMD components (such as decoupling capacitors and a temperature sensor) and the Module Controller Chip (MCC). The sensor diodes are connected to the preamplifier inputs with the high-density In or SnPb bump-bonding technique. The interconnection is completed by wire bonds from the FE chips to the kapton, on the two long sides of the module, and from the MCC to the kapton. On the pigtail, which is connected to the flex circuit by wire bonding, are two VCSEL driver chips (VDC), which encode data on the two outgoing data lines of the module, and a PIN diode receiver chip (DORIC). The DORIC chip amplifies the PIN diode signal and regenerates the 40 MHz Clock and Data/Cmd signals needed by the MCC. All data communication to and from the module electronics is carried by these three serial data lines.

Figure 1: Picture of an ATLAS Pixel module.

Figure 2: Block diagram of the ATLAS Pixel Detector system architecture.

Figure 2 shows a block diagram of all active components of the ATLAS Pixel module. There are two distinct data flows in the system. The first one, from right to left, is a configuration data stream coming from the control room, via the Read Out Driver (ROD), to the MCC and then to the FE chips. During the data-taking phase, instead, data from the FE chips are decoded and reformatted by the MCC before being sent to the control room via the ROD. There is only one downstream link (which carries the encoded Data and Clock signals) and up to two 40 MHz upstream links, allowing a total data transfer rate of up to 160 Mbit/s [4, 5].
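The 160 Mbit/s figure follows from the number of upstream links and the clock frequency once one takes into account that, as stated in the MCC data-taking section, each output line samples data on both clock edges:

$$
B_{\max} = N_{\text{links}} \times 2\ \text{edges} \times 40\ \text{MHz} = 2 \times 2 \times 40\ \text{Mbit/s} = 160\ \text{Mbit/s}
$$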

### 2.1 FE: Main Characteristics

Each FE chip has 2880 analog channels organised in 18 column pairs, 1440 digital readout blocks, digital control logic in the bottom part of the chip and some error handling capabilities.

The analog part of each channel of the FE chips is composed of a DC feedback preamplifier, a differential second stage amplifier, a differential discriminator and a digital output to the control logic.

The digital readout is column-pair based and consists of the following blocks: an 8 bit Gray-coded 40 MHz differential time stamp distributed to each pixel for measuring the leading and trailing edges of the detected signal, a local RAM to store this information, and a shared bus to transfer hits at 20 MHz to the End of Column (EoC) logic.
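The usual motivation for Gray coding a distributed counter is that consecutive values differ in exactly one bit, so a value latched while the counter is changing is wrong by at most one count. The following is a minimal sketch of the standard binary-reflected Gray code for an 8 bit counter; the function names are illustrative and are not taken from the FE design.

```python
def to_gray(n: int) -> int:
    """Convert a binary count to binary-reflected Gray code."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Recover the binary count from a Gray-coded value."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


BITS = 8  # width of the time stamp quoted in the text

# Consecutive time stamps, including the 255 -> 0 wrap-around,
# differ in exactly one bit, and the encoding is reversible.
for t in range(1 << BITS):
    nxt = (t + 1) % (1 << BITS)
    changed = to_gray(t) ^ to_gray(nxt)
    assert bin(changed).count("1") == 1
    assert from_gray(to_gray(t)) == t
```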

The EoC logic provides 64 buffers for data storage during the Lev1 latency (up to 6.4 $\mu s$) and a read-out sequencer (up to 16 pending events). As soon as the Lev1 signal is received, the sequencer requests all associated hits, which are then sent out.
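The range of the 8 bit time stamp bounds the latency that can be covered. Assuming one count per 25 ns period of the 40 MHz clock:

$$
t_{\max} = 2^{8} \times \frac{1}{40\ \text{MHz}} = 256 \times 25\ \text{ns} = 6.4\ \mu\text{s}
$$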

The Control Logic provides a 20 bit Command Register, a 166 bit Global Register and a Pixel shift register that gives access to the configuration bits of every single pixel (there are 14 bits per pixel).

Some error handling capabilities, like EoC buffer overflow detection and the ability to disable single pixels, are also provided.

### 2.2 MCC: Main Characteristics

The MCC has to provide system startup and initialisation capability.

It also decodes the data/command signals coming from the ROD. A simple serial protocol is used for all communication between the ROD and the MCC, and also between the MCC and the FE chips (Slow, Fast and Trigger commands).

It provides trigger, timing and control: the MCC has to distribute Triggers to all FE chips and keep event synchronisation at the module level. Serial data from 16 FE chips is received and accumulated in local FIFOs.

During event building, complete module events are reconstructed and some data compression is performed. A scoreboard mechanism allows event building to start as soon as all enabled FE chips have finished sending the data of one complete event. Event data is sent to the DAQ (via the VDC) at a selectable speed, ranging from 40 to 160 Mbit/s.

One very important feature of this chip is error handling. The MCC has to deal with FIFO overflows and with misalignment between the data from the FE chips and the BCID information, and it must be able to disable defective or noisy FE chips.

### 2.3 ROD or TPLL + TPCC

Optical components (VDC, DORIC and optical fibres) are still in a development stage and are therefore not yet fully integrated in the system. The system works properly and integration will occur soon.

The Read Out Driver (ROD) for the whole system is a joint effort between the Pixel and SCT communities and is still in an early prototyping stage. Most of the operational tests performed so far on the Pixel system have used ad hoc hardware:

- **Turbo Pixel Control Card (TPCC):** controls power distribution to the module and regenerates the clock and data signals; up to 4 modules can be operated with one card (data commands and clock sent to all modules, one multiplexed output line); supports temperature measurement.
- **Turbo Pixel Low Level card (TPLL):** VME interface, clock generation and synchronisation, data FIFO, trigger FIFO, 16 MByte of on-board SRAM supporting module-level histogramming, an FPGA for encoding/decoding the MCC serial data protocol, and support for all 4 MCC output modes.

This year, for the first time, a full radiation characterisation of the electronics was performed. Test beam operation was also performed successfully, for the first time within the DAQ-1 framework.

# 3 System Initialisation and Configuration

At power up the whole system has to be correctly initialised. The FE chips and the MCC are connected in a star topology in which each FE has dedicated parallel connections to the MCC. Each FE has one 40 MHz LVDS data and clock line and 3 slower (5 MHz) common configuration lines. FE chips can be addressed by the MCC either in "broadcast" mode or one by one using their geographical address. In order to reduce the number of electrical connections to the module, the system does not provide a reset pin. The MCC Command Decoder is designed so that at power up, after a finite amount of time, it returns to the idle state, where it can accept a Global Reset command that puts the chip in its default state. At this point global configuration data (number of enabled FE chips, desired output mode, etc.) can be stored in the MCC register bank. The next phase is a Global Reset of all FE chips of the module. Once the whole system is initialised, configuration of the individual FE chips can begin.

### 3.1 Configuration: MCC

The MCC Command Decoder accepts 3 types of serial commands:

- **Slow commands:** These commands are used during FE and MCC configuration;
- **Fast commands:** These are data synchronisation commands that can be issued without exiting data-taking mode;
- **Trigger commands:** LEV1 commands from the ROD are distributed to all enabled FE chips.

There are 8 configuration registers, each 16 bits wide, inside the MCC. They allow all features of the chip to be configured, such as:

- enabling only a certain number of FE chips on the module;
- output data mode selection: 40 Mbit/s, 80 Mbit/s and 160 Mbit/s on one or two output lines are supported;
- selecting the level of error checking/reporting.

In addition there is the ability to self test most of the chip circuitry. One can, for example, send data that simulate data coming from certain FE chips and perform real event building on these data.

Calibration of all FE pixels is possible: an analog delay line provides a signal to a chopper circuit inside the FE chip, which injects charge into each individual pixel cell (2 different charge ranges may be selected). FE configuration data is sent on a dedicated line and validated by a load signal.

### 3.2 Configuration: FE

Each pixel of the FE chip has 14 distinct configuration bits: a preamplifier kill bit, a preamplifier mask bit, the Hit Bus circuitry enable, the Calibration enable, a 5 bit threshold adjust and a 5 bit feedback current trim (trimming of the ToT).
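The six fields listed above add up to 1 + 1 + 1 + 1 + 5 + 5 = 14 bits. The sketch below packs them into one 14 bit word. The field names and, in particular, the bit ordering are assumptions made for illustration; the actual ordering in the FE pixel register is not given here.

```python
from dataclasses import dataclass


@dataclass
class PixelConfig:
    kill: bool             # preamplifier kill bit
    mask: bool             # preamplifier mask bit
    hitbus: bool           # Hit Bus circuitry enable
    calib: bool            # calibration (charge injection) enable
    threshold_trim: int    # 5 bit threshold adjust, 0..31
    feedback_trim: int     # 5 bit feedback current trim, 0..31

    def pack(self) -> int:
        """Pack into a 14 bit word (bit order is an assumption)."""
        if not (0 <= self.threshold_trim < 32 and 0 <= self.feedback_trim < 32):
            raise ValueError("trim values must fit in 5 bits")
        word = int(self.kill)
        word |= int(self.mask) << 1
        word |= int(self.hitbus) << 2
        word |= int(self.calib) << 3
        word |= self.threshold_trim << 4
        word |= self.feedback_trim << 9
        return word

    @classmethod
    def unpack(cls, word: int) -> "PixelConfig":
        """Inverse of pack(), using the same assumed bit order."""
        return cls(
            kill=bool(word & 1),
            mask=bool((word >> 1) & 1),
            hitbus=bool((word >> 2) & 1),
            calib=bool((word >> 3) & 1),
            threshold_trim=(word >> 4) & 0x1F,
            feedback_trim=(word >> 9) & 0x1F,
        )


cfg = PixelConfig(False, True, True, False, threshold_trim=17, feedback_trim=31)
word = cfg.pack()
```

A round trip through `pack()` and `unpack()` returns the original settings, and every packed word stays below $2^{14}$.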

A global shift register that snakes through the whole pixel array gives read and write access to the 14 control bits.

A 20 bit Command Register and an  $\approx$  200 bit Global Register, located in the bottom part of the chip, control the whole configuration phase of the chip.

Eleven 8 bit DACs control all critical bias currents and voltages needed on the chip, and 1 further DAC is used for charge injection.

Configuration loading of the whole module takes  $\approx 130$  ms.
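As a rough cross-check, and not necessarily the way this figure was obtained, the pixel configuration bits alone account for a time of this order if one assumes that they are shifted in serially at the 5 MHz rate of the configuration lines, one FE chip after another, and that the Global and Command Register contents are negligible:

$$
N_{\text{bits}} = 16 \times 2880 \times 14 = 645\,120, \qquad t \approx \frac{645\,120}{5\ \text{MHz}} \approx 129\ \text{ms}
$$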

Figure 3 shows a module with 16 FE chips before and after the tuning procedure. Threshold tuning is performed in concurrent mode (all FE chips enabled at once). The electrical performance of a fully loaded module is not degraded with respect to that of single chips.

Another nice feature of the FE chips is the self triggering mode. A self Trigger generator can use the internal Fast OR signal (active each time there is a hit somewhere in the whole pixel array) to generate its own Lev1 signal with a programmable latency. This allows the chip to be read out even with a source.

When this mode is active the MCC sends a Trigger command to the enabled FE chip which will produce output data once the Fast OR fires. The MCC is able to collect data in this way just for one FE chip at a time and therefore, in order to image a whole module, one has to perform a Source Scan stepping through all 16 FE chips in a row.



Figure 3: The tuning steps of a whole module.

# 4 Data Taking

The data-taking phase can start immediately after system initialisation.

The system has to be set in Run Mode, after which only Trigger and Fast commands are allowed; any Slow command returns the system to Configuration Mode. The architecture of the module is Data Push, i.e. as soon as the MCC receives a Trigger command it immediately forwards it to the FE chips, which start sending their data. Up to 16 pending Triggers are allowed in the system.

Trigger commands are 5 bit commands that allow automatic correction of possible bit flips inside the command while preserving correct timing information. A Pending Event FIFO keeps track of how many events still have to be processed. If more than 16 Triggers are received before an event is fully reconstructed, the excess Triggers are simply dropped and this information is propagated to the ROD by inserting a Warning word in the corresponding event. This mechanism allows the ROD to insert empty events to maintain synchronisation with the data flow. Event Counter Reset and BCID Reset commands (Fast commands) can be issued to keep correct event synchronisation.
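The pending-trigger bookkeeping described above can be sketched as a bounded FIFO. This is a behavioural illustration only: the class and method names are hypothetical, and attaching the count of dropped triggers to the next accepted event is an assumption about what "the corresponding event" means.

```python
from collections import deque

MAX_PENDING = 16  # number of pending triggers quoted in the text


class PendingEventFifo:
    """Bounded queue of accepted triggers awaiting event building."""

    def __init__(self, depth: int = MAX_PENDING):
        self.depth = depth
        self.pending = deque()  # entries: (bcid, lev1id, skipped_before)
        self.skipped = 0        # triggers dropped since the last accepted one

    def on_trigger(self, bcid: int, lev1id: int) -> bool:
        """Accept a trigger, or drop it if the FIFO is already full."""
        if len(self.pending) >= self.depth:
            self.skipped += 1
            return False
        # 8 bit BCID and 4 bit Lev1 information are stored with each trigger
        self.pending.append((bcid & 0xFF, lev1id & 0xF, self.skipped))
        self.skipped = 0
        return True

    def on_event_built(self) -> dict:
        """Remove the oldest event and report any warning attached to it."""
        bcid, lev1id, skipped_before = self.pending.popleft()
        return {"bcid": bcid, "lev1id": lev1id, "warning_skipped": skipped_before}


fifo = PendingEventFifo()
accepted = [fifo.on_trigger(bcid=i, lev1id=i) for i in range(18)]
```

In this usage example the first 16 triggers are accepted and the last two are dropped; the warning count is carried by the next trigger that is accepted once a slot has been freed.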

### 4.1 Data-Taking: FE

An 8 bit time stamp is distributed to all pixels in the column. After a hit has been detected, the time stamps of the leading and trailing edges of the discriminator signal are stored in a local RAM. Hits are transferred from the pixel column logic as soon as the trailing edge has been detected, and the hit data is stored in the EoC buffers. Transfer rates of 20 MHz are possible. The hit pixel is then cleared. After the trigger latency, the data is either cleared from the EoC buffers or sent to the MCC if a trigger occurred. Data are sent out as soon as the Trigger signal is detected (data push architecture). 64 EoC buffers are used on the FE for hit storage during the Lev1 latency.
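Because the time stamp is only 8 bits wide, any arithmetic on the stored leading and trailing edge values has to be done modulo 256. The sketch below shows the time over threshold (ToT) and a trigger match computed that way. The function names and the exact form of the trigger comparison are assumptions made for illustration, not a description of the EoC logic.

```python
TS_BITS = 8
TS_MOD = 1 << TS_BITS  # the time stamp wraps every 256 counts of 25 ns


def time_over_threshold(leading: int, trailing: int) -> int:
    """ToT in 25 ns counts from two wrapped 8 bit time stamps."""
    return (trailing - leading) % TS_MOD


def matches_trigger(leading: int, trigger_ts: int, latency: int) -> bool:
    """True if the leading edge occurred `latency` counts before the trigger."""
    return (leading + latency) % TS_MOD == trigger_ts % TS_MOD


# A hit that straddles the counter wrap-around still gives the right ToT.
tot = time_over_threshold(leading=250, trailing=5)
```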

### 4.2 Data-Taking: MCC

Only enabled FE chips participate in event building. 16 parallel data streams are received and stored in 16 independent FIFOs. Each event is identified by an End-of-Event (EoE) word. EoE information is stored in the Scoreboard. As soon as all 16 EoE words of an event are collected, event building is performed. 8 bit BCID and 4 bit Lev1 information is stored with each incoming Trigger. Event building collects data from all 16 FIFOs and formats the output data stream according to the selected output mode. Up to two output lines (each sampling data on both clock edges) can be used in order to provide a data throughput of up to 160 Mbit/s. Data consistency checking between hits from the same FE chip and between EoE words from different FE chips is performed. FIFO data overflow, which produces loss of hits, may occur and is signalled in the data flow.
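The role of the Scoreboard can be illustrated with a small behavioural model: it counts the EoE words waiting in each receiver FIFO and reports an event as ready once every enabled FE chip has delivered one. The class and method names are hypothetical and the model ignores the FIFO contents themselves.

```python
class Scoreboard:
    """Counts complete events waiting in each FE receiver FIFO."""

    def __init__(self, enabled_fes):
        self.eoe_count = {fe: 0 for fe in enabled_fes}

    def record_eoe(self, fe: int) -> None:
        """Called when an End-of-Event word from chip `fe` is stored."""
        if fe in self.eoe_count:  # disabled chips are ignored
            self.eoe_count[fe] += 1

    def event_ready(self) -> bool:
        """True once every enabled FE has delivered a full event."""
        return all(n > 0 for n in self.eoe_count.values())

    def pop_event(self) -> None:
        """Mark one event as built, consuming one EoE per enabled FE."""
        if not self.event_ready():
            raise RuntimeError("event not complete")
        for fe in self.eoe_count:
            self.eoe_count[fe] -= 1


sb = Scoreboard(enabled_fes=[0, 1, 2])
sb.record_eoe(0)
sb.record_eoe(1)
ready_before = sb.event_ready()  # chip 2 has not finished yet
sb.record_eoe(2)
ready_after = sb.event_ready()
```

Counting EoE words per chip, rather than keeping a single flag, lets a fast chip run ahead by several events without confusing the builder.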

# 5 Architecture Simulations

Detailed simulation of the architecture has been performed in order to validate its performance and to assess the available safety margins.

This simulation has been performed with a modular C++ program developed by our community, called SimPix, that has been used in many different stages of the project. Besides architecture validation it allows hardware testing and debugging, and can directly be interfaced to different hardware components like a logic state analyser, a probe station, dedicated hardware for Test Beam operation and irradiation tests at CERN's PS.

The main parameters of concern for the architecture validation of the MCC were the FIFO sizes needed to avoid data loss, the occupancy of the output bandwidth, and the available safety margins with respect to the number of minimum bias events and the trigger rate.

Figure 4: Simulation results as a function of FIFO size.

Figure 5: Safety margins as a function of pile-up.

The main features of SimPix are the ability to perform an independent simulation of all read-out electronic modules, the ability to use simulated physics data as input to the simulation, time correlation between simulated events, an electronics description that is more detailed than a parametric model and faster than a low-level one, and the ability to perform an entire detector simulation. The analysed conditions were:

- 1500 b-jet events produced by Higgs boson decay ($m_H = 100\ \text{GeV}/c^2$);
- an average number of 24 pile-up events;
- an average Lev1 trigger selection rate of 100 kHz;
- an electronic noise occupancy of $10^{-5}$ hits/pixel/BCO;
- the module selected for simulation was in the B-layer at $\eta = 0$ (worst case);
- an ideal Front-End model, which does not introduce any inefficiencies (due to the 64 EoC buffers per column pair);
- an MCC receiver FIFO size ranging from 32 to 128 words;
- an output link data speed selectable between 40, 80 and 160 Mbit/s.

A real model (Verilog, C++) of the MCC was used in order to simulate all possible inefficiencies.
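For scale, the noise occupancy listed among the simulation conditions corresponds to well under one noise hit per module per bunch crossing (BCO), taking 16 FE chips of 2880 channels each:

$$
N_{\text{noise}} = 16 \times 2880 \times 10^{-5} \approx 0.46\ \text{hits/module/BCO}
$$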

### 5.1 Simulation Results and Safety Margins

Figure 4 shows the main results of the simulation. The output link occupancy does not depend strongly on the FIFO size, and even with only 1 link available the occupancy is acceptable. The receiver FIFO occupancy is below 5% if an output data rate of 80 or 160 Mbit/s is used. There is a strong correlation between the number of skipped Lev1 triggers and the number of lost hits. Here too we can see that, if the higher output modes are selected, there are no inefficiencies at all.

Figure 6: Hit map of a complete module in logarithmic scale.

Figure 5 is a study of all main parameters as a function of the number of pile-up events. The architecture is robust enough to support up to 80 pile-up events without losing efficiency. Analogous results show that the chosen architecture is also stable for trigger frequencies up to 200 kHz.

In conclusion, the results show a large safety margin when operating the MCC with 128 word FIFOs and 160 Mbit/s output links.

This is the chosen configuration for the chip that has been manufactured.

# 6 Test Results

In this section I will present some preliminary results from test beam data and from the irradiation of the MCC to  $\approx 50$  Mrad at the CERN PS irradiation facility.

### 6.1 Test-Beam: Hit map of a module

Extensive measurements of 3 modules were performed this summer at the CERN SPS proton facility. The VME based system was run, for the first time, in the DAQ-1 framework.

Data acquisition was performed at almost 7 kHz (9,000 events per spill).

No system failures were observed during the whole test period.

Figure 6 shows the hit map of a complete module in logarithmic scale. The beam profile can clearly be seen and is centred in between 4 different FE chips.

As can be seen from the figure, the inter-chip region is covered using longer pixels (600  $\mu m$  instead of 400  $\mu m$ ) in the left-right direction and a ganged pixel structure in the top-bottom direction. The ganged pixel structure is realised by connecting the first eight pixel diodes on the detector side to the first four analog cells of the FE chip. In this way each of the first four analog cells collects the charge of two sensor cells and sees twice the capacitance at the preamplifier input.

Both effects can clearly be seen in the inter-chip region of the hit map. They are also confirmed by track extrapolation from the analysed data.

### 6.2 Irradiation at CERN PS

Both Single FE chip assemblies and the MCC were successfully irradiated at the PS facility at CERN (24 GeV protons) during this year.

The main goals of the MCC irradiation programme were to verify the correct operation of the chip during irradiation and to study Single Event Upset (SEU) effects. A total accumulated dose of more than 65 Mrad was reached. We irradiated 7 MCC chips powered at 2.2 V and 1.8 V.

Before each particle spill, configuration data was written to 6 chips (both registers and FIFOs). After the spill the data was read back and compared with the written data in order to determine the static bit-flip probability. The remaining MCC was operated with data taking synchronised to the particle spill in order to study possible problems during normal chip operation. During one week of continuous operation we never observed a state transition in the chip that caused an unrecoverable error requiring a hard reset, confirming the robustness of the Command Decoder architecture. At 1.8 V we observed twice as many errors as at 2.2 V, both for full-custom memory cells and for standard-cell flip-flops. Errors in the full-custom FIFOs are  $\approx 2$  times more frequent than in standard-cell flip-flops.

All chips were fully functional after the irradiation period!

If we try to figure out the probability of a "system critical" bit flip induced by SEU we have to take into account the fact that not all memory cells of the chip have the same weight.

In the whole B-layer there are  $13.4 \times 10^6$  FIFO bits and  $0.6 \times 10^6$  flip-flops. As an example, for the FIFOs only the corruption of an End-of-Event (EoE) word by a bit flip ("event overrun") represents a real problem. EoE words are tagged by only 3 of the 21 bits forming a FIFO word. A bit flip in one of these three bits corrupts all subsequent events, as the data become misaligned, and a data path reset is required to recover. It can be shown that this happens every 6 s at full luminosity in the whole B-layer if the electronics is run at 1.8 V, and every 11 s if it is run at 2.2 V.

Not all flip-flop upsets have the same importance: an educated guess is that fewer than 10% of the flip-flops have an important effect at the system level. With this assumption, for the whole B-layer at full luminosity, a critical failure would happen every 3 s when running the detector at 1.8 V and every 6 s at 2.2 V.
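If the FIFO and flip-flop failure mechanisms are treated as independent, their rates add. The combined intervals below are not quoted in the measurements and are given only as a rough estimate under that assumption, for the lower and the higher supply voltage respectively:

$$
T_{\text{low}} = \left(\frac{1}{6\ \text{s}} + \frac{1}{3\ \text{s}}\right)^{-1} = 2\ \text{s}, \qquad T_{\text{high}} = \left(\frac{1}{11\ \text{s}} + \frac{1}{6\ \text{s}}\right)^{-1} = \frac{66}{17}\ \text{s} \approx 3.9\ \text{s}
$$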

Such an error rate is of course critical for optimal performance of the system and one has to provide the ability, at the ROD level, to issue frequent data path synchronisation commands in order to minimise the dead time of the detector.

The next version of the MCC, due in the first months of 2003, will try to address these issues. There is of course no simple solution to this problem, but using triple redundant logic with a majority voting mechanism on the most critical blocks, together with Hamming code correction techniques, may drastically reduce it.
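The principle of triple redundancy with majority voting can be shown with a generic sketch. It is not a description of the planned MCC design: the class name, the 16 bit default width and the scrubbing step are assumptions made for illustration.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote of three copies of a register."""
    return (a & b) | (a & c) | (b & c)


class TmrRegister:
    """Register stored three times and read back through a voter."""

    def __init__(self, value: int, width: int = 16):
        self.mask = (1 << width) - 1
        self.copies = [value & self.mask] * 3

    def flip_bit(self, copy: int, bit: int) -> None:
        """Emulate a single event upset in one of the three copies."""
        self.copies[copy] ^= 1 << bit

    def read(self) -> int:
        """Return the voted value; one corrupted copy is outvoted."""
        return majority(*self.copies)

    def scrub(self) -> None:
        """Rewrite all copies with the voted value so upsets cannot accumulate."""
        self.copies = [self.read()] * 3


reg = TmrRegister(0xA5A5)
reg.flip_bit(copy=1, bit=3)  # upset in a single copy
value_after_upset = reg.read()
```

A single upset in any one copy leaves the voted value unchanged. Two upsets in the same bit position of different copies would defeat the voter, which is why such schemes are usually combined with periodic rewriting of the copies.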

# 7 Conclusions

In this paper I have described the architecture of the ATLAS Pixel Detector subsystem, separating the system initialisation and configuration phases from the data-taking phase.

Results of a detailed simulation of the detector showed that the chosen architecture fulfils all the severe system requirements and that there is some safety margin with respect to data link failure, increased luminosity and an increase in minimum bias events.

This year our collaboration performed various tests with real structures, and I presented results from both test beam and chip irradiation. Test beam analysis shows excellent system performance and stability, while irradiation of the electronics has proven that the chosen technology is fully functional up to 65 Mrad. No system failure due to irradiation has ever been observed.

SEU measurements were performed on the MCC chip and preliminary results show that the system is operational at full ATLAS luminosity with some limitations. Therefore the next version of our electronics, due at the beginning of 2003, will implement some radiation hardening techniques on the most critical blocks of the design in order to reduce, as much as possible, the effect of SEU.

# References

- [1] ATLAS Pixel Collaboration, ATLAS Pixel Detector Technical Design Report, CERN/LHCC/98-13.
- [2] F. Ragusa, Nucl. Instrum. Meth. A **447** (2000) 184.
- [3] V. Vrba, Nucl. Instrum. Meth. A **465** (2001) 27.
- [4] P. Fischer, Nucl. Instrum. Meth. A **465** (2001) 153.
- [5] R. Beccherle *et al.*, "MCC: The Module Controller Chip for the ATLAS Pixel Detector", Nucl. Instrum. Meth. A **492** (2002) 117.