/afs/slac/g/esa/esalib/daqctrl/doc/cntrl.doc

###OVERVIEW
I. OVERVIEW OF COMPUTER/PROCESSES NETWORK

   IF THE SYSTEM IS HAPPY, SOME LIGHTS ON RACK 48 (near the top) will be flashing:
      BUSY (upper right)
      A2N  (2nd row right)
   Sometimes the system can be restarted by doing I.B.d (EMERGENCY).

###DAQCTRL
A. DAQCNTRL:
   1. Normally run from the ESASUN2 workstation in rack 38.
      a. Log on to account esaexp (password is hidden on border of monitor).
      c. Type runcntrl
   2. Responsible person: Steve Rock. Backup: Perry Anthony.

###VMESERV
B. VMESERV: Reads data from CAMAC and sends it to LDATSERV to put on tape.
   1. Runs on VME.
   2. Control from Ambassador terminals in front of Rack 47.
   3. To START:
      a) Press RESETS SIMULTANEOUSLY on Local & Remote WEB II (bottom of 3) on panels in rack 48.
      b) Type NBO on Lower Keyboard of Ambassador "LOCAL VME SERVER", Rack 47.
      c) Type NBO on Upper Keyboard of Ambassador "REMOTE VME SERVER", Rack 25.
      d) When the connection to VMESERV is established (see DAQCTRL DISPLAY, Rack 38 upper right:
         VMESERV has a number and a *. If there is no *, push "PROCESSES" on the bottom line,
         then turn on VMESERV [click]):
            Click on "EMERGENCY" (upper left).
            Click on "ID QUERY".
   4. Responsible person: Perry Anthony. Backup: Zen Szalata.

###LDATSERV
C. LDATSERV:
   1. Receives data from VMESERV via reflective memory.
   2. Writes data to DISK (which will be copied to tape).
   3. Sends data to analysis.
   4. Sends data to TAPE (when turned on).
   5. Runs on ESASV2 (VME).
   6. Responsible person: Zen Szalata. Backup: Perry Anthony.

The ESA VME based Real Time Data Acquisition System (DAQ) has a number of input/output panels in Rack 48 that display the status of just what the system thinks it is doing. These panels are labeled "Spider's Hub", "Local Web Status & Control", and "Remote Web Status & Control". These panels are in the middle of the rack just above the VME crate and tape drives.

"Spider's Hub"

This panel is divided into two sections. The top section provides TTL output signals and contains 32 Lemo connectors and associated LEDs.
The bottom section provides TTL input signals and also contains 32 Lemo connectors and associated LEDs. The top section shows the status of the various processes running on the real time cpu's in the DAQ system. The bottom section provides input for polarization bits and run states. In both sections, 0 is on the right and 31 is on the left.

How to read "Spider's Hub"

Status (top) Section:

Important lights:

   number  function       normal state when data is flowing
   0       computer busy  should be flickering
   1       buffer full    should be off
   2       run state      should be on
   3       log state      should be on
   4       read beam      depends on type of run
   5       read 2.75      depends on type of run
   6       read 5.5       depends on type of run

All other lights reflect the state of individual processes in the real time system and are useful only to an expert.

When the DAQ system is in a run state, LEDs 4, 5, and 6 indicate which CAMAC branches are being read. 6 is for the 5.5 spectrometer, 5 is for the 2.75 spectrometer, and 4 is for the beam/moller system. If an LED is on, that branch will be read out when the computer receives an interrupt. At least one of these LEDs must be on for data to be read:

   Normal Run:   all three LEDs on.
   LED Run:      only LEDs 5 and 6 are on.
   TDC Test:     only LEDs 5 and 6 are on.
   Pedestals:    all three LEDs on.
   BeamMonitor:  all three LEDs on.

LED 3 indicates that data will be logged and should never be off. The control program has no way to control this state, and something is seriously wrong if this LED is off. Reload the DAQ system if this light is off.

LED 2 on indicates that the DAQ system has been instructed by the control program to acquire data when an interrupt is received. If this light is not on, no data is being read out.

LED 1 indicates the state of the data buffers used to pass data between the real time system and the data servers that log the data to tape and serve the analysis programs.
If this light is on, all the buffers are full and the real time system is waiting for LDATSERV to empty the buffer. If this light stays on for a long period of time, the data servers may be in a bad state and should be looked at. When this light is on, the Busy LED (LED 0) will also be on and the A2N to the electronics will be suppressed.

LED 0 is the computer busy LED. It is used to suppress A2N's to the electronics when the computer is busy servicing the previous event.

LED 0 in the lower (input) section of "Spider's Hub" indicates triggers coming into the computer. If LED 0 in the output section is on solid, LED 0 in the input section will not flicker, and gates to the electronics will be suppressed.

"Local (Remote) Web Status & Control" panels:

These two panels are connected directly to the real time cpu's. The most important LEDs on these panels are the FAILLED and STSLED located in the upper row of lights on each panel. If either of these lights is on, the associated cpu (local or remote) has crashed. These panels also contain the Reset push buttons for each cpu. To reload the DAQ system, the reset buttons on both panels must be pushed in simultaneously and then released. This will reset the real time cpus and result in "Bug>" prompts appearing on the console screens for the real time cpus. The console screen for the local cpu is on the left side of the table in front of rack 47. The console screen for the remote cpu is in the middle of rack 25. After pressing the reset buttons, type "nbo" at the "Bug>" prompt on both console screens.

Various Crash States:

If data is not flowing, the state of the real time system can be quickly checked to determine if it is the problem. First look at the "Spider's Hub" panel.
Look for the following patterns in the top row of LEDs:

Normal state:

   LED 0 flickering on the top row (if this LED is not on solid, then LED 0 on the lower row will also be flickering; if LED 0 on the top row is on solid, LED 0 on the bottom row will be off).
   LED 1 off
   LED 2 on
   At least one of LEDs 4, 5, 6 on.

   Data should be flowing since this is the normal state. However, it is possible that the real time system has crashed in such a way that the LEDs are left in the normal state. To check this, look at the "Web Status & Control" panels and verify that the FAILLEDs and STSLEDs on both panels are off. If this is the case, then there is nothing wrong with the real time system and the problem must be with the data servers. If any of the FAILLEDs or STSLEDs is on, then one of the cpus has crashed and the system must be reloaded by simultaneously pressing the reset buttons on both panels, and then typing "nbo" on both real time consoles.

Real Time System Busy:

   LED 0 on solid (LED 0 in the bottom row is off).
   LED 1 on
   LED 2 on
   At least one of LEDs 4, 5, 6 on.

   The data servers are not emptying the data buffers. Check the data servers. Nothing is wrong with the real time system.

Real Time System Hung or Crashed:

   LED 0 on solid (LED 0 in the bottom row is off).
   LED 1 off
   LED 2 on
   At least one of LEDs 4, 5, 6 on.

   The real time system is hung or crashed. If the "Local (Remote) Web Status & Control" panels do not have a FAILLED or STSLED on, then the system is not crashed, but is hung up in some way (that is, it forgot to turn off the busy). In this case, push "endrun" in the control program, which should clear the busy, then start a new run. If a FAILLED or STSLED is on on one of the "Web Status & Control" panels, then the real time system has crashed and must be reloaded (by pushing both reset buttons simultaneously and typing nbo on both real time consoles).

###MAGNET
D. DAQMAGB:
   1. Controls magnets, NMR, Hall probes.
   2. Runs in batch on ESAU6.
   3. Submit with RUNMAGB from account [ONLDAQ] logged onto ESAU6.
   4. Controls run from any terminal logged onto account ONLDAQ on ESAU6. Submit controls using RUNMAGI.
   5. Responsible person: Peter Bosted.

###MCCSTAT
F. MCCSTAT:
   1. Runs on MCC VAX.
   2. Gets beam information from the MCC data base.
   3. Responsible person: Owen Saxton.

*** DESCRIPTION OF DAQCNTRL (CONTROL PROGRAM) ********

###TITLE DISPLAY
II. TITLE DISPLAY
   A. RUN STATUS: (first column)
      1. BEG RUN or END RUN: Are we in an official data taking run or have we completed one? Date and time the Official Run started.
      2. RUN/PAUSE: Are we taking data NOW?
      3. SPECIAL AND CALIBRATION RUNS: e.g. PEDESTAL, COSMIC, PROFILE ...
      4. TRANSIENT MESSAGES:
         a) CLEAR: clearing of histograms etc. in progress.
         b) DUMP: writing output to disk files (like at end run) in progress.
         c) CHKPOINT
      5. TAPE LOG: Are we writing to tape?
      6. RUNTYPE
   B. RUN STATUS ETC. (second column)
      1. Time that the Official Run began.
      2. Number of seconds since the last checkpoint. This will count up to the CONSTANT "sec/chk", then automatically create a checkpoint (see Sec VII).
      3. Checkpoint number: how many checkpoints since BeginRun. The Official Run will end automatically after "chk/run" checkpoints, where "chk/run" is one of the CONSTANTS.
      4. TAPE STATUS (MOUNTED, DISMOUNTED)
      5. TAPE NUMBER (6 ALPHA-NUMERIC CHARACTERS)
      6. Megabytes on tape.
      7. Maximum megabytes allowed on tape. (Changeable through Constants.)
      8. File number on tape.
   C. BEAM/SPECTROMETER
      1. TOROID: Beam current in electrons/spill from MCCSTAT.
      3. Beam Energy from MCC flip coil.
      4. Energy Slit width (Full Width) from MCC.
      5. Repetition rate of Linac. Should be 119. From VMESERV.
      7. Tagger Magnet current.
   E. DAQ PROBLEMS/INFORMATION (5th column)
      1. Interrupts: Number of interrupts this run (1 per spill).
         AVERAGE FOR RUN ON LEFT, INSTANTANEOUS ON RIGHT
      1b. Interrupts from CID.
      2. LOST: Should be the percentage of interrupts that could not be serviced for some reason (tape too slow, previous event too long ...).
      3. BUSY (lost data due to computer being busy).
      7. Camac errors in Beam Branch.
      8. Camac errors ?

###BUTTONS(TOP)
II. **** EXPLANATION OF BUTTONS ****
*** TOP ***
   1. APPLICATION: For experts only.
   2. DISPLAY
      a) PEDESTALS: Beam/Tagger. Controlled by button at lower left.
      b) CRATE VERIFIER for Branch 0, Crates 1, 3, 5 (Controls, HV, Scalers).
      c) CONTROL FLAGS: status of daqctrl, in order left to right. The first 24 are displayed on the rack 38 lights from right to left.
      e) REMOTE PROCESS STATUS:
         i) Process ID of DAQCNTRL (= -1 when not connected to NETSERV).
         ii) List of Remote Processes that we should be connected to. See Section X. for the purpose of each process. If the process is connected, the number to the right is the Process ID as determined by NETSERV. If the process is not connected to NETSERV, the PID = -1.
         iii) Communication can be enabled (dark, red) or disabled (light) by pushing on the button. We can be connected to a process (communication is potentially possible) yet choose NOT to communicate (no hard feelings I hope). If we are not connected to a process, communication is impossible, so it is best not to try.
      f) MCC STUFF: Magnets, toroids etc. from the Linac data base.
      g) SAM (Smart Analog Monitor) Voltages.

###EMERGENCY
   3. EMERGENCY: Various buttons to get out of trouble. This will probably grow with time.
      b. INIT VELO: Initialize the Veto Modules for recording beam polarization.
      c. ID QUERY: Tries to reestablish contact with the remote processes.
      d. CAMAC TABLE READ: Read the file ONL$DAT:CAM-DB-ESA.DAT, which has the location and instructions for the camac modules this program uses (mostly in Crate 1, Rack 48).
      e. UNINHIBIT CRATE 1: UnInhibit the Camac Crate.
      f. CLEAR ALL CNTFLAGS: Set ALL the Control Flags (shown under DISPLAY) = OFF. This includes: Begrun, EndRun, Chkpnt ....
      g. RESTART UPDATE: Use if the update (1 Hz for overview etc.) has stopped.

###CALIBRATE
   4. CALIBRATE BUTTONS
      a) PROFILE: Does a PROFILE DETECTOR (Big UMass Wheel) Run. Reads out the 60 hodoscope elements, Lead Glass, Cerenkov Crystal, pion detectors and Beam Crate.
         Does not read out the Quartz Moller Detector.
      d) PED: Does a pedestal run for the PROFILE detector and beam ADCs.
         i) Asks if you want to turn the beam off. You must turn the beam back on yourself. (Beam MUST be OFF during the Ped run, but it may already be off for other reasons, so you may choose not to suppress it.)
         ii) Fixed duration set by the constant Ped-Evts (number of events). Nominally set to 120. (See Section VIII.)
         iii) Program VMESERV calculates Mean and Width for each ADC. These are put on tape and sent to Analysis processes.
         iv) Pedestals are displayed using buttons DISPLAY/PEDESTALS.
      e) COSMIC TEST: Reads out PROFILE DETECTOR ADCs and TDCs. The TDCs have a live time of 4 us/spill to maximize cosmic ray collection. (Normally 0.5 us.)
      f) TDC TEST: Sends Test Pulses to PROFILE TDCs and ADCs from the common input to the discriminators.
      g) Various Feed Back Calibrations.

###RUN COMMANDS
   7. RUN COMMANDS (far right column)
      a. RUN: Starts data taking by opening the electronic gates.
      b. PAUSE: Pauses or halts data taking by closing the electronic gates.
      c. BEGIN 1 RUN: Begins an Official Data taking Run by doing the Begin Run Sequence in item V.
         Must be Logging to Tape (except with override).
         Tape Must be Mounted (see RUN/TAPE BUTTON).
         Computer will want to be connected to remote processes.
         The run will automatically end when the tape is full or the maximum number of checkpoints is reached (see CONSTANTS "chk/run").
      d. END 1 RUN: Ends an Official Data taking run by doing the END RUN Sequence in item VI.
      e. BEGIN RUN SERIES: Does a series of runs, automatically dismounting the tape when a run is finished and mounting the next tape. This will go on for a number of hours determined by CONSTANT "run_series". Ended immediately by pushing Button END 1 RUN.
      f. END RUN SERIES: Will end the series of runs after the current run is complete.
      g. ABORT: Ends a Run without creating printer files or an End Run summary.
      h. CLEAR: Clears all accumulated data, scalers, histograms etc. DO NOT DO THIS DURING AN OFFICIAL RUN.
      l. CHKPOINT: Forces a Checkpoint to occur. See item VII.
      m. DUMP:
         i) Adds additional data to the Dump file DAQCNTRL.DUMP (which has stuff in it mostly from END RUN).
         ii) Renames DAQCNTRL.DUMP to DAQCNTRL.PRINT.
         iii) Opens a new Dump file.
         iv) Prints DAQCNTRL.PRINT.

###BUTTONS (BOTTOM)
III. EXPLANATION OF BUTTONS ****(BOTTOM)*****

###SNSLIN
   A. SNSLIN: Historical and stupid name for various flags that control the flow of the program. CNTRLCHECK checks the values of some SNSLINS at the beginning of a RUN (or RUN Series) to see if they are ready.
      a) TAPE LOG: ON to enable writing to tape. The tape must also be mounted. If tape = mounted and TAPE LOG = off, then data will NOT be sent to tape. (The same flag can also be set from the RUN/TAPE display.)
      b) CAMAC: Turns CAMAC ON OR OFF. Should be ON (solid).
      c) NET MSG: Enables diagnostic messages to be printed to the screen. Normally OFF.
      d) MIN DUMP: ON = Minimum Dump, OFF = Maximum Dump.
      h) DIAGNOSTICS: Turns on various diagnostic messages.

###CONSTANTS
   13. CONSTANTS: Various integers which can be changed interactively. See Section VIII. There are several basic categories.
      a) Counter Limits: e.g. seconds/checkpoint.
      b) TimeOut Limits: Internal waits for various responses.
      c) Diagnostic Limits: e.g. Pedestals must be less than xx.

###ERRORS
III. ERROR SEVERITIES:
   A. Errors from Remote Processes (like Magnets). There are 3 severity levels:
      I = Information.
      W = Warning.
      F = Fatal: automatically go to PAUSE mode and sound the alarm (Rack 38).
   B. The MINIMUM level of error that we respond to can be set using the MENUS in the SEVERITY BUTTON (lower left).
      1) NONE: Do not print any error messages.
      2) ALL: Print all error messages.
      3) >=WARNING: Print and respond to Warning and Fatal messages only.
      4) FATAL: Print and respond to Fatal messages only.
   C. ERRORS to the Message Window and the Log file can be separately controlled.
   D. Almost all errors are generated by the remote programs and only reported and edited by DAQCTRL.

###BEGIN RUN SEQUENCE
V. Begin Run Sequence.
   CntrlActBegRunDo
   A. Enable Signals to 2249SG ADC.
   B. CntrlActStartTest: Make sure that we are logging and that the toroid calibrator is off.
   C. Dump done?
   D. Check to make sure that we have at least Default values of various Constants, Switches, Error Severity Levels etc.
   E. Clear 2280 ADC.
   F. Increment run number and Mount tape if necessary. Open New File.
   G. Send Begin Run to processes: magnet, job and vme.
   H. Clear.
   I. Fill in Run Summary File.
   J. Begin RUN message on Screen.
   K. Begin Run Label to Tape.
   L. Checkpoint all global sections to tape.
   M. Go into Run State.

###END RUN SEQUENCE
VI. END RUN SEQUENCE:
   A. CntrlActEndRun.
      1. Pause (stop data taking).
      2. Do a Checkpoint.
         a) End Run Message to Remote processes. They should create a print file.
         b) Write data to tape.
      3. EndRun label to tape.
      4. Write End Run information (HV, Scalers, Magnets etc.) to printer file.
      5. Remote Analysis processes should write run summary.

VII. CHECKPOINT:
   A. Purpose: To collect and write to tape the NON-EVENT DATA (Scalers, Magnets, Target pol, beam information, Miscellaneous voltages, High Voltages ...).
   B. Frequency:
      1. Automatically occurs while in an Official Run. The time interval is set by the CONSTANT sec/chk (seconds/checkpoint). See item VIII. Will probably be set for about 2 minutes (120).
      2. Number of seconds since last checkpoint is shown in OVERVIEW at top of Screen (see item IX).
   C. Duration: About 1 second.
   D. CntrlChkPnt: Sequence
      1. Pause.
      2. Write Checkpoint label to tape.
      3. Get Beam Information from BCSCS (beam steering computer).
      4. Checkpoint message to remote processes (get data ready for tape).
      5. Write non-event data to tape.
      6. Send non-event data to analysis processes (see snslin status, item VII).
      7. Write End Checkpoint Label to Tape.
      8. End Checkpoint messages to remote processes.
      9. End Run Automatically on Number of Checkpoints? The run will end automatically after a certain number of checkpoints determined by the CONSTANT "CHK/RUN". See CONSTANTS, item VIII.
      10. Increment Checkpoint number (see OVERVIEW at top of screen, item IX).
      11. Set chkpnt seconds = 0 (number of seconds since last checkpoint). See OVERVIEW, item IX.
      12. Run.

###CONSTANTS(MORE)
VIII. CONSTANTS
   Constant integers controlling program flow. Constants are changed by pushing the CONSTANT BUTTON in the Menu Window, then pushing the button of the desired constant, then entering the new value in the dialog window followed by Carriage Return.
   A) SEC/CHK: Number of seconds in a checkpoint. Typically 150 (2.5 min).
   B) Chk/Run: Number of checkpoints in a run. Typically about 40.
   C) Ped-Evts: Number of Events in each crate for a pedestal run. Typically 100.
   D) sec/moller: Moller run will end automatically after this many seconds.
   E) Mbyt/DoNotStart: Program will close the tape if you try to begin a run with more than this number of megabytes on it. It will then open a new tape and proceed.
   F) MBYTS_MAX: Maximum Megabytes on Tape. Will probably be around 850. For local tape running (to Exabyte) set to about 4000. LDATSERV sets this automatically to 850 when it starts up.
   G) Hrs/Run Series: Number of hours in a run series.
   H) CkP_Tim: Maximum time allowed for a checkpoint. Typically 4 sec.
   I) Timeout: Time out period in seconds waiting for a remote response. Typically 4 sec.
   J) Dmp_Tim: Time allowed for a Dump. Typically 10 sec.
   K) UpdtSec: Time between updates of displayed information from the Beam Steering and Data Acquisition Programs. Typically 2 to 5 sec.
   L) TCalSec: Time allocated for a Toroid Calibration Run. Typically 15 sec.
   M) Pol-Sec: Time interval between polling remote processes to see if they are active. The processes are polled sequentially, e.g. Pol-Sec=2 means one poll every 2 seconds, with an individual process (total of 7) polled every 14 sec. Typically 1.
   N) T-Tape: Time allowed to mount or dismount a tape. Typically 20 sec.
   O) T-VMESERV: Time out for VMESERV messages, some of which take a long time.
   P) Ped-ADC-Limit: Maximum value of ADC pedestals before they turn red on the display.
   Q) Ped_ADC-Sig_l: Maximum value of ADC pedestal sigmas before they turn red.
   R) Ped-FADC-Limit: Maximum value of FADC pedestals before they turn red.
   S) Ped-FADC-Sig-L: Maximum value of FADC pedestal sigmas before they turn red.
   T) Trunc_warn_e-6: Truncation display turns red if truncations are greater than this number multiplied by 1.E-6.

IX. SNSLIN
   Historical and stupid name for various flags that control the flow of the program. (See Section IX for details.)
   THERE WILL BE A ROUTINE WHICH CHECKS THE VALUE OF THE SNSLINS AT THE BEGINNING OF A RUN TO ASSURE THAT THEY ARE REASONABLE.
   There are several basic categories.

###PROBLEMS:
X. PROBLEMS
   A. Checkpoint message (in RED) stays on and will not go off:
      Solution: Use Emergency Button "CLEAR CHKPOINT FLAG". May have to push "RUN" also.
   B. Metheus and Main Display do not update.
      Solution: Check to see if cntFlag beg_in_progress is ON (DISPLAY, CntFlag). Can clear it from EMERGENCY.
   C. BUFFERS FULL:
      Solution: Turn Logging Off, then on again.
   D. 0 in LRS 2280 ADC (2.75 and 5 deg)
      Solution:
      1) Try EMERGENCY -- Clear2280.
      2) If that does not work: EMERGENCY -- CAMAC OFF (**Do NOT Use CAMAC ON**)
         GO INTO the 2.75 deg Hut. For both the 2.75 and 5 deg LRS2280 CAMAC Crates:
            Crate Controllers: Switch to OFFLINE
                               Switch to ACL
                               Push Clear Z
                               Push Clear C
                               Restore ACL switch
                               Restore ONLINE
         Upstairs: Do a PEDESTAL RUN to reinitialize the ADCs and check them.
   E. WAITING message (in Red) stays on and will not go off:
      Solution: EMERGENCY -- Clear Waiting.
   F. CHANGING CAMAC MODULES
      Always turn off CAMAC READ before turning off power.
      Use EMERGENCY -- CAMAC OFF (**Do NOT Use CAMAC ON**)
      Must start some kind of Run (Ped, Led ..) to restart CAMAC Read.
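
As a quick aid for the problems above, the LED patterns listed under "Various Crash States" can be written down as a small lookup. The Python sketch below is only an illustration and is not part of DAQCNTRL: the names PanelState, diagnose and ACTIONS are made up here, the panel readings are assumed to be typed in by hand, and checking the FAILLED/STSLED lights before anything else is an assumption about the order of checks. A pattern this document does not describe is reported as UNKNOWN rather than guessed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PanelState:
    """Hypothetical hand-entered snapshot of the Rack 48 panels."""
    busy_solid: bool = False    # top-row LED 0 on solid (False = flickering)
    buffer_full: bool = False   # top-row LED 1
    run_state: bool = True      # top-row LED 2
    log_state: bool = True      # top-row LED 3
    branches: Tuple[bool, bool, bool] = (True, True, True)  # top-row LEDs 4, 5, 6
    fail_or_sts: bool = False   # any FAILLED or STSLED lit on either Web panel


# Suggested actions, paraphrased from the text of this document.
ACTIONS = {
    "CRASHED": "Press both reset buttons together, then type nbo on both consoles.",
    "RELOAD": "Log state LED is off: reload the DAQ system.",
    "NOT_READING": "Run state or all branch LEDs off: no data is being read out.",
    "SERVERS_BUSY": "Data servers are not emptying the buffers: check data servers.",
    "HUNG": "Push endrun in the control program, then start a new run.",
    "NORMAL": "Real time system looks fine: if no data flows, check data servers.",
    "UNKNOWN": "Pattern not described in this document: call an expert.",
}


def diagnose(p: PanelState) -> str:
    """Map a panel snapshot to one of the states named in ACTIONS."""
    if p.fail_or_sts:                     # a cpu has crashed (assumed checked first)
        return "CRASHED"
    if not p.log_state:                   # LED 3 should never be off
        return "RELOAD"
    if not p.run_state or not any(p.branches):
        return "NOT_READING"
    if p.busy_solid and p.buffer_full:    # "Real Time System Busy" pattern
        return "SERVERS_BUSY"
    if p.busy_solid:                      # busy stuck on with buffers not full
        return "HUNG"
    if p.buffer_full:                     # buffer full without solid busy: not covered
        return "UNKNOWN"
    return "NORMAL"


# Example: busy LED on solid, buffer-full LED off, no FAILLED/STSLED lit.
print(ACTIONS[diagnose(PanelState(busy_solid=True))])
```

The checks are ordered so that the most drastic condition (a crashed cpu) wins over everything else, which matches the advice above that the LEDs can be left in the normal state after a crash.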