LCLS Controls

SLC-Aware IOC Design

 

 

 

The SLC Executive Thread and SLC-IOC Initialization

 

Contents / Quick Links

Introduction

slcIOC Initialization, Restart and Shutdown

SLC Executive Design

slcExec Thread Description and External Interfaces

slcExec Thread Detailed Design

 

1         Scope

This document describes a software design for the SLC Executive Thread of the SLC-aware IOC.  It includes a description of the SLC-aware IOC initialization, startup, and shutdown functions, and a design of the SLC Executive thread.

2         Introduction

The SLC-aware IOC is responsible for responding to the existing SLC Control System, and to the EPICS based controllers of the Linac Coherent Light Source project.  The SLC-aware IOCs (slcIOCs) include two main components: a set of SLC-aware threads, and an EPICS IOC.

The SLC-aware threads are responsible for duplicating the action of the SLC iRMXmicro for the SLC command subset assigned to the slcIOCs. They maintain an SLC Database, just as all SLC iRMXmicros do.  This SLC Database is synchronized with the full SLC Control System Database maintained on the VMS controller.

The EPICS component of the slcIOCs will contain all hardware interfaces and communicate with the EPICS environment and control system as a typical EPICS IOC. It too maintains a database of information for the EPICS IOC.

It is a design goal for the slcIOC that the SLC-aware thread can be started and stopped without interfering with the performance of the EPICS IOC component of the slcIOC.  This will allow the SLC Control System to synchronize the SLC Database without requiring a reboot of the entire slcIOC. 

It is also a design goal that the SLC-aware threads are portable to several different processor architectures and operating systems.  The SLC-aware threads are developed using the EPICS Operating System Independent libraries wherever possible to maintain that operating system and architecture independence.

The SLC Executive Thread is responsible for initializing, starting, monitoring, and stopping the SLC-aware threads.  The functions of the SLC Executive Thread are integrated with the full initialization of the slcIOC, including the EPICS IOC component.  Therefore, the complete slcIOC initialization process is included in this document.

2.1      Background

See the SLC-Aware IOC Functional Requirements document for discussion and background on the SLC-Aware IOC.  The following statements are paraphrased from the functional requirements document. 

 

The SLC control system at SLAC is currently used on most of the LINAC. It is the only control system in sectors 20-30, which will be used by the LCLS mostly intact. LCLS will replace much of the electronics in these sectors, as well as control components.  The Injector for LCLS will use all new controls, except for the high power RF components, which are existing SLC klystrons and modulators. The corrector magnets in the LINAC that will be used for LCLS will all have new EPICS based controllers. From the undulator to the experimental stations, all new controls will be done in EPICS. Note that all SLC data from the existing LINAC will be available to the EPICS environment.

The motivation to implement an SLC aware, EPICS IOC, is to allow the new elements of the LCLS control system to use EPICS, while still taking advantage of the high level applications on the SLC control systems. These high level applications include: Correlation plots, energy management, beam steering, beam based alignment, emittance measurements, and slow feedback.

2.2      References

BUG: SLC Control System Introduction - an overview of the original SLC control system.

 

BUG: SLC Micro Structure - describes the structure of the SLC iRMXmicro.

 

2.3      Requirements

See the SLC-Aware IOC Functional Requirements document by Stephanie Allison.

 

3         slcIOC Initialization, Restart and Shutdown

The slcIOC is initialized using a startup command file.  The startup file initializes the slcIOC in a two-step process.  First the IOC portion is started and initialized as a typical EPICS IOC.  Second, the SLC-aware threads are initialized through a slcStart() function call.

 

The startup command file is tailored for the slcIOC’s target platform, and for its development vs. production status.  The startup command file indicates which proxy to connect to (dev or prod). The startup command file also names the slcIOC. Therefore, the startup command file is unique for each slcIOC / target processor / dev vs. prod combination.

3.1      EPICS IOC Initialization

3.1.1      UNIX / Linux platforms

On UNIX platforms the <app>Main.cpp file contains the main() routine and performs initialization of the slcIOC.  It reads in and executes commands from the startup command file.  The startup file begins the slcIOC thread initialization with the slcStart() command.

3.1.1.1  The main() routine PDL

int main(void)
{
      Read startup command file and execute commands
            Read environment variables
                  Read IOC_NAME
                  Read PROXY_IP_ADDRESS
            Call iocCmlogDStart() (optional, for RTEMS and vxWorks)
            Call iocCmlogStart()
            Load EPICS databases
                  Restart/Stop subroutine record
                  Others TBD
            Call iocInit()  - EPICS initialization function
            Call slcStart()  - slcIOC initialization
      Call iocsh(NULL) to put the IOC shell into interactive mode.
}

3.1.2      Other platforms TBD

3.2      SLC-aware Threads Initialization

The SLC executive is started after both CMLOG initialization and EPICS iocInit() during IOC startup.  SLC-aware thread startup errors will not interfere with normal EPICS IOC functionality.

3.2.1      slcStart()

The slcStart command is issued by the startup file.  slcStart() performs the following functions:

  1. Initializes the SLC globals
  2. Starts the slcExec thread
  3. Executes the Restart function

 

All slc globals are initialized to a known initial value.

 

The slcExec thread is started, and its thread id and active indicator are set.

 

The slcRestart() function is called, which in turn tells the slcExec thread to start up the remaining SLC-aware threads.

3.2.2      SLC-Aware thread global initialization

The SLC-aware thread global variables are initialized to a known initial value that indicates inactivity or the nonexistence of resources.  These globals remain valid throughout the life of the slcIOC, and therefore must be maintained in a known state at all times.

3.2.3      slcExec Initialization Process

Within the slcStart() function, the slcExec thread is started.  The slcExec thread will create a message queue, and then indicate it is initialized and in its “ready” state by setting its thread active flag true.  It then begins to loop, waiting for a message.

3.2.4      Restart command

Once the slcExec thread has been started, the slcStart() function will execute the slcRestart command.  This command sends an IOC_RESTART message to the slcExec message queue to start up all the SLC-aware threads. Upon successful completion of the slcRestart command, the SLC-aware threads have been started and the SLC Database download is in progress.

3.3      SLC-aware thread globals

The SLC-aware thread globals are initialized on the slcStart() function call.

 

There are several global variables, structures, and resources available to all the SLC-aware threads. The following list describes each global and its use:

3.3.1      slcJobs_as array

slcJobs_as[jobcode] is an array of 32 elements of the following structure:

 

typedef struct {
  char           jobName_a[5];   /* null-terminated 4-character job name */
  slcThread_te   hdlr_e;         /* associated hdlr thread enumeration */
  dbupdate_ps   *dbupdate_p;     /* pointer to the dbupdate structure */
} slcJob_ts;

The following table describes each field in the structure:

Field Name      Represents
hdlr_e          The corresponding message-queue thread enumeration. Initialized to slcInvalid or the known value.
jobName_a[5]    Null-terminated string containing the 4-character SLC job name. Initialized to a known value.
dbupdate_p      Pointer to the dbupdate structure; see the DatabaseService design document.
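The initialization described in the table can be sketched as follows. This is only an illustration under stated assumptions: the slcThread_te enumerators other than slcInvalid, the opaque dbupdate_ps placeholder, the use of an empty string as the "known value" for the job name, the sample job name "MAGN", and the helper names slcJobsInit and slcJobFind are all hypothetical and are not taken from the slcIOC sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLC_NUM_JOBS 32                 /* array size given in this section */

/* Hypothetical stand-ins for types defined elsewhere in the slcIOC sources. */
typedef enum { slcInvalid = -1, slcExec = 0, slcMagnetHdlr } slcThread_te;
typedef struct dbupdate_s dbupdate_ps;  /* opaque here; see DatabaseService */

typedef struct {
    char          jobName_a[5];   /* null-terminated 4-character job name */
    slcThread_te  hdlr_e;         /* associated handler thread enumeration */
    dbupdate_ps  *dbupdate_p;     /* database update structure, if any */
} slcJob_ts;

slcJob_ts slcJobs_as[SLC_NUM_JOBS];

/* Put every entry into a known state meaning "no job assigned". */
void slcJobsInit(void)
{
    int i;
    for (i = 0; i < SLC_NUM_JOBS; i++) {
        memset(slcJobs_as[i].jobName_a, 0, sizeof slcJobs_as[i].jobName_a);
        slcJobs_as[i].hdlr_e     = slcInvalid;
        slcJobs_as[i].dbupdate_p = NULL;
    }
}

/* Return the job code whose name matches the first four characters of
 * name, or -1 if no entry matches. */
int slcJobFind(const char *name)
{
    int i;
    assert(name != NULL);
    if (name[0] == '\0') return -1;     /* empty names mark unused entries */
    for (i = 0; i < SLC_NUM_JOBS; i++) {
        if (strncmp(slcJobs_as[i].jobName_a, name, 4) == 0) return i;
    }
    return -1;
}
```

Because the array is indexed by job code, a lookup by name is a linear scan over at most 32 entries, which is cheap enough that no additional index is needed.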

 

The slcJobs_as[] array has the following uses:

3.3.2      slcSockets_as array

slcSockets_as[socket enum]

 

The slcSockets_as[] is defined as an array of structures.  Currently the structure has but one field, but is defined as a structure to allow for design changes. The structure follows:

typedef struct {
  SOCKET sd;
} slcSocket_ts;

 

 

3.3.3      dbExists boolean

dbExists

3.3.4      slcIocName null-terminated string

slcIocName

3.3.5      slcProxyIPAddr null-terminated string

slcProxyIPAddr

3.3.6      dbDownloadEvent semaphore

The dbDownloadEvent semaphore is implemented with the EPICS OSI Event library. dbDownloadEvent is used as follows:

 

3.3.7      slcThreads_as array

slcThreads_as[slcThread_e]

 

typedef struct {
  epicsThreadId         tid_ps;
  epicsBoolean          active;
  epicsBoolean          stop;
  epicsMessageQueueId   mqId_ps;
  char                  destination_a[5];
} slcThread_ts;

 

The following table describes the fields of the slcThread_ts structure:

Name               Represents
tid_ps             Thread identifier. Initialized to NO_THREAD.
active             True if the thread is active / looping; set by the thread. Initialized to epicsFalse.
stop               True if the thread should stop; set by slcExec. Initialized to epicsFalse.
mqId_ps            Message queue identifier. Initialized to NULL.
destination_a[5]   Destination of the reply message for the message the thread is currently working on; used by threads with message queues. Initialized to V017.

 

The slcThreads_as[] array has the following uses:

·        Used by slcExec to monitor “active” threads, and to command threads to stop.

·        Used by all threads to store and find message queues, and to check on status of other threads.

·        Used by all threads to store the destination of the reply message (also known as the source of the incoming message).  This destination is used in error reporting to route error messages back to the SLC Control System SCP whose request resulted in the error.
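A minimal sketch of how the slcThreads_as[] array might be initialized and then scanned by slcExec is given below. It is written in plain C so that it compiles without the EPICS headers: threadId_t and msgQueueId_t stand in for epicsThreadId and epicsMessageQueueId, int stands in for epicsBoolean, and the value chosen for NO_THREAD, the enumerator list, and the function names slcThreadsInit and slcAnyThreadActive are assumptions made for illustration. Skipping the slcExec entry during the scan is also an assumption, based on the fact that slcExec itself is always active.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins so the sketch compiles without EPICS headers (assumptions). */
typedef void *threadId_t;               /* plays the role of epicsThreadId */
typedef void *msgQueueId_t;             /* plays the role of epicsMessageQueueId */
#define NO_THREAD ((threadId_t)0)
typedef enum { slcExec = 0, slcMsgRecv, slcDbRecv, SLC_NUM_THREADS } slcThread_te;

typedef struct {
    threadId_t    tid_ps;
    int           active;               /* set by the thread itself */
    int           stop;                 /* set by slcExec */
    msgQueueId_t  mqId_ps;
    char          destination_a[5];
} slcThread_ts;

slcThread_ts slcThreads_as[SLC_NUM_THREADS];

/* Set every entry to the initial values listed in the table above. */
void slcThreadsInit(void)
{
    int i;
    for (i = 0; i < SLC_NUM_THREADS; i++) {
        slcThreads_as[i].tid_ps  = NO_THREAD;
        slcThreads_as[i].active  = 0;
        slcThreads_as[i].stop    = 0;
        slcThreads_as[i].mqId_ps = NULL;
        strcpy(slcThreads_as[i].destination_a, "V017");
    }
}

/* Return 1 if any thread other than slcExec reports itself active. */
int slcAnyThreadActive(void)
{
    int i;
    for (i = 0; i < SLC_NUM_THREADS; i++) {
        if (i != slcExec && slcThreads_as[i].active) return 1;
    }
    return 0;
}
```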

3.3.8      void * smMsgPvt_p and lrgMsgPvt_p

These are global pointers to the Message Service small and large memory pools for message handling. They are used by the Message Service threads and all job threads with message queues, for storing request and reply messages.

 

NOTE: The Database Service does not use memory pools for messages.  Threads that communicate with the Database Service must use different messaging utilities than threads communicating with the Message Service.

4         SLC Executive Design

4.1      slcExec Thread Description and External Interfaces

The slcExec thread exists throughout the lifetime of the IOC. slcExec is responsible for responding to the Restart and Stop commands by starting and stopping all other SLC-aware threads in the proper order.  It maintains a message queue where it receives the IOC_RESTART command and the IOC_STOP command. While the slcIOC is in a normal running state, the slcExec thread is responsible for supervising the status of the slcIOC threads.

 

4.1.1      Block Diagram


4.1.2      External Interfaces

4.1.2.1  Message Queue

The slcExec thread uses a message queue to receive commands from the Message Service, from EPICS Channel Access (via an EPICS subroutine record) and from the EPICS IOC Shell. The message queue is implemented with the EPICS OSI Message Queue library (epicsMessageQueue).

4.2      slcExec Thread Detailed Design

The slcExec thread is implemented with the EPICS OSI Thread library (epicsThread).

4.2.1      Startup

Within the slcStart() function, the slcExec thread is started.  When started the slcExec thread performs the following:

  1. Checks to be sure the slcExec thread is not already running
  2. Creates a message queue for incoming messages
  3. Sets its slcThreads_as[slcExec].active=True to indicate it is in the running state
  4. Goes into a loop waiting for messages
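The message loop in step 4 handles one command at a time, which is what makes Restart and Stop run to completion before the next message is looked at. The sketch below models that behaviour in plain C under several assumptions: the queue is replaced by an array that is drained in order (the real thread would block on its EPICS OSI message queue), the actual thread start and stop work is reduced to counters, and the names slcExecCtx_ts, slcState_te, slcExecHandle and slcExecDrain are hypothetical.

```c
#include <assert.h>

typedef enum { IOC_RESTART, IOC_STOP } slcExecMsg_te;
typedef enum { SLC_INACTIVE, SLC_RUNNING } slcState_te;    /* hypothetical */

typedef struct {
    slcState_te state;       /* are the other SLC-aware threads running? */
    int         startCount;  /* how many times threads were started */
    int         stopCount;   /* how many times threads were stopped */
} slcExecCtx_ts;

/* Stand-in for stopping all other SLC-aware threads. */
static void doStop(slcExecCtx_ts *ctx_p)
{
    if (ctx_p->state == SLC_RUNNING) {
        ctx_p->stopCount++;
        ctx_p->state = SLC_INACTIVE;
    }
}

/* Stand-in for starting all other SLC-aware threads. */
static void doStart(slcExecCtx_ts *ctx_p)
{
    ctx_p->startCount++;
    ctx_p->state = SLC_RUNNING;
}

/* Handle exactly one message; returns only when it is fully processed. */
void slcExecHandle(slcExecCtx_ts *ctx_p, slcExecMsg_te msg)
{
    assert(ctx_p != 0);
    switch (msg) {
    case IOC_RESTART:
        doStop(ctx_p);       /* no effect if nothing is running */
        doStart(ctx_p);
        break;
    case IOC_STOP:
        doStop(ctx_p);
        break;
    }
}

/* Drain an in-memory "queue" strictly in arrival order. */
void slcExecDrain(slcExecCtx_ts *ctx_p, const slcExecMsg_te *msgs_a, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        slcExecHandle(ctx_p, msgs_a[i]);
    }
}
```

Because each message is handled inside a single function call, a Stop that arrives while a Restart is in progress simply waits in the queue until the Restart has finished.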

4.2.2      Restart

The Restart function causes the slcExec thread to stop all other SLC-aware threads that are currently running using the Stop function (see next section), then start all threads up again in an initial state.  All messages and database updates waiting in queues will be lost, and the full SLC Database will be reloaded.  This function is similar to an IPL, yet only restarts the SLC-aware threads of the slcIOC, leaving the EPICS IOC and EPICS Database portion running normally.

 

Once the Restart command is received by slcExec and the restart process is initiated, it cannot be interrupted.  It must run to completion before the next message is accepted by slcExec.  For instance, the Restart command will complete a full restart before slcExec accepts and completes a Stop command.

4.2.2.1  Initiated by

4.2.2.1.1   Message from slcStart()

One time, during the initialization after a boot, the slcStart() initialization function will call the slcRestart() function, which sends the IOC_RESTART message to the slcExec thread message queue. 

4.2.2.1.2   IOC Shell command slcRestart

An IOC Shell command, slcRestart, will cause the ioc shell to call the slcRestart() function, which sends the IOC_RESTART message to the slcExec thread message queue.

4.2.2.1.3   EPICS subroutine record

The EPICS Database contains an slcRestProc subroutine record.  This record will initiate the Restart or Stop function call, depending on the REST.A field.  When REST.A=0 the EPICS slcRestProc subroutine record will call the slcRestart() function, which sends the IOC_RESTART message to the slcExec thread message queue. This record can be activated through EPICS Channel Access, by any channel access client, such as an EDM display.

 

This command will use Channel Access security to help ensure that the SLC-aware threads are not restarted erroneously.

4.2.2.1.4   Message from VMS control system

The VMS Control System can issue a Restart via the IPL command.  This causes the VMS message service to send the MSG_IOC_SLCNOTIFY message, with the data portion = ‘BOOT’, to the slcIOC.  The Message Service msgHdlr thread will receive this command, and call the slcRestart() function, which sends the IOC_RESTART message to the slcExec thread message queue.

4.2.2.2  Functional Flow

The slcExec thread performs the following in response to the IOC_RESTART command:

  1. If threads are running, call slcStop()
  2. Create dbDownloadEvent semaphore for database related threads to wait on
  3. Create all Database Service threads
  4. Wait for these threads to set their global active=True. This is an infinite wait.
  5. Send a DB_DOWNLOAD_REQ message to the dbSend message queue.
  6. Wait for the dbExists flag=True. This is an infinite wait.
  7. Create the msgRecv and dbRecv threads
  8. Wait for the Recv threads to set their global active=True.  This is an infinite wait.
  9. Create the cstrAsync thread.  Wait for that thread to set the asyncExists flag to true. This is an infinite wait.
  10. Create all “job” threads (xxxHdlr’s) as indicated by JMSK.
  11. Wait for these job threads to set their global active=True.  This is an infinite wait.
  12. Create the msgRecv thread and send the “MSG_IMALIVE” message to the SLC Control System.
  13. Go back to main loop waiting for messages.

 

(The slcIOC can hang on any of the five infinite waits in the previous initialization steps.  In all five cases a timeout could be implemented – TBD until some testing is performed.)
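One possible shape for the timeout mentioned in the note above is a polling wait that gives up after a fixed time. The following is a sketch only, not the chosen design, since the document leaves the timeout as TBD. The names waitForFlag, sleepFn_t, fakeSleep and sleptSec are hypothetical. The sleep call is passed in as a function pointer so that the example runs without blocking; in the slcIOC it would presumably be a real sleep such as the EPICS OSI epicsThreadSleep().

```c
#include <assert.h>

/* Stand-in for a real sleep call; it only accumulates the requested
 * time so the sketch can run instantly (assumption for illustration). */
double sleptSec = 0.0;
void fakeSleep(double sec) { sleptSec += sec; }

typedef void (*sleepFn_t)(double sec);

/* Poll *flag_p until it becomes nonzero or until timeoutSec has been
 * spent sleeping.  Returns 1 if the flag was seen set, 0 on timeout. */
int waitForFlag(const volatile int *flag_p, double timeoutSec,
                double pollSec, sleepFn_t sleep_pf)
{
    double waited = 0.0;
    assert(flag_p != 0 && sleep_pf != 0 && pollSec > 0.0);
    while (!*flag_p) {
        if (waited >= timeoutSec) {
            return 0;               /* caller decides how to recover */
        }
        sleep_pf(pollSec);
        waited += pollSec;
    }
    return 1;
}
```

A caller such as the Restart flow could then log an error and return to the main message loop on a 0 result instead of hanging the slcExec thread.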

 

At this point the responsibilities of the slcExec thread end.  For completeness, we describe the Database Service functional flow until completion of the full SLC-aware thread initialization.  The Database Service continues with the database download as follows:

  1. Database Service downloads the database
  2. Database Service sets the dbExists flag to true
  3. Database Service signals the dbDownloadEvent semaphore to allow database-dependent threads to proceed.

 

At this point all SLC-aware threads are active and able to perform their functions.

4.2.3      Stop

The Stop function causes the slcExec to stop all SLC-aware threads that are currently active.   All messages and database updates waiting in queues will be lost, and the SLC-aware portion of the slcIOC will be in an inactive state.  The slcExec thread will remain alive and available to respond to any new commands.  This function is similar to a RESET, yet only stops the SLC-aware threads of the slcIOC, leaving the EPICS IOC and EPICS Database portion running normally.

 

Once the Stop command is received by slcExec and the process is initiated, it cannot be interrupted.  It must run to completion before the next message is accepted by slcExec.  For instance, the Stop command will complete a full stop before slcExec accepts and completes a Restart command.

4.2.3.1  Initiated by

4.2.3.1.1   IOC Shell command

An IOC Shell command, slcStop, will call the slcStop() function, which sends the IOC_STOP message to the slcExec thread message queue.

4.2.3.1.2   EPICS subroutine record

The EPICS Database contains an slcRestProc subroutine record.  This record will initiate the Restart or Stop function call, depending on the REST.A field.  When REST.A=1 the EPICS slcRestProc subroutine record will call the slcStop() function, which sends the IOC_STOP message to the slcExec thread message queue. This record can be activated through EPICS Channel Access, by any channel access client, such as an EDM display.

This command will use Channel Access security to help ensure that the SLC-aware threads are not stopped erroneously.

4.2.3.1.3   Message from VMS control system

The VMS Control System can issue a Stop via the RESET command.  This causes the VMS message service to send the MSG_IOC_SLCNOTIFY message, with the data portion = ‘RSET’, to the slcIOC.  The Message Service msgHdlr thread will receive this command, and call the slcStop() function, which sends the IOC_STOP message to the slcExec thread message queue.
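All of the initiators described for Restart and Stop reduce to one of two commands. The sketch below shows that mapping for the two data-driven entry points: the A field of the subroutine record (0 for Restart, 1 for Stop) and the data portion of MSG_IOC_SLCNOTIFY ('BOOT' for Restart, 'RSET' for Stop). The enumeration slcCmd_te and both function names are hypothetical, and returning SLC_CMD_NONE for any other input is an assumption, since this document does not define that case.

```c
#include <assert.h>
#include <string.h>

typedef enum { SLC_CMD_NONE, SLC_CMD_RESTART, SLC_CMD_STOP } slcCmd_te;

/* Subroutine record entry point: A = 0 selects Restart, A = 1 selects Stop. */
slcCmd_te slcCmdFromRecordA(double a)
{
    if (a == 0.0) return SLC_CMD_RESTART;
    if (a == 1.0) return SLC_CMD_STOP;
    return SLC_CMD_NONE;        /* other values: not defined by this document */
}

/* MSG_IOC_SLCNOTIFY entry point: the data portion carries a 4-character
 * code that is not necessarily null terminated, hence memcmp. */
slcCmd_te slcCmdFromNotify(const char *data_p, size_t len)
{
    assert(data_p != NULL);
    if (len >= 4 && memcmp(data_p, "BOOT", 4) == 0) return SLC_CMD_RESTART;
    if (len >= 4 && memcmp(data_p, "RSET", 4) == 0) return SLC_CMD_STOP;
    return SLC_CMD_NONE;
}
```

In the design described here, a Restart result would lead to a call of slcRestart() and a Stop result to a call of slcStop(), each of which only posts a message to the slcExec queue.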

4.2.3.2  Functional Flow

The slcExec thread performs the following in response to the IOC_STOP message:

  1. Set the global stop flag of every thread to True.
  2. Get a small message buffer from the pool and initialize it as an IOC_STOP message. This message is sent to all non-Database Service threads.
  3. Initialize a local dbmsgmail_ts struct as an IOC_STOP message for the Database Service threads. The Database Service does not use buffer pools for its messaging.
  4. Send the IOC_STOP message to all threads with message queues.
  5. Close all socket connections.
  6. Wait for all thread global “active” flags to be set false.  The socket connection closure forces the *Recv threads to break out of a recv call and check their global “stop” flag.  Once they detect the stop, the *Recv threads terminate. At that point all threads are stopped.
  7. Destroy the global dbDownloadEvent semaphore.
  8. Set all thread global “stop” flags back to False.
  9. Possibly re-initialize all globals and do final global resource cleanup (TBD).
  10. Go back to the loop waiting for messages.
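The stop/active handshake in steps 1, 6 and 8 can be illustrated with a single-threaded simulation. This is a deliberately simplified sketch: the worker threads are replaced by a workerPoll() function that slcExec's stand-in calls in turn, message sending and socket closure (steps 2 to 5) are omitted, and the names threadFlags_ts, workerPoll and execStopAll are hypothetical.

```c
#include <assert.h>

/* The two handshake flags from slcThread_ts; other fields are omitted. */
typedef struct {
    int active;   /* set and cleared by the thread itself */
    int stop;     /* set and cleared by slcExec */
} threadFlags_ts;

/* One pass of a worker's loop: when it sees the stop request it cleans
 * up and marks itself inactive.  Returns the resulting active flag. */
int workerPoll(threadFlags_ts *t_p)
{
    assert(t_p != 0);
    if (t_p->active && t_p->stop) {
        t_p->active = 0;
    }
    return t_p->active;
}

static int anyActive(const threadFlags_ts *t_a, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        if (t_a[i].active) return 1;
    }
    return 0;
}

/* Request a stop, wait until every worker is inactive, then clear the
 * stop flags again.  Returns the number of polling passes needed. */
int execStopAll(threadFlags_ts *t_a, int n)
{
    int i, passes = 0;
    assert(t_a != 0 && n >= 0);
    for (i = 0; i < n; i++) t_a[i].stop = 1;           /* step 1 */
    while (anyActive(t_a, n)) {                        /* step 6 */
        /* Stand-in for the worker threads running concurrently. */
        for (i = 0; i < n; i++) (void)workerPoll(&t_a[i]);
        passes++;
    }
    for (i = 0; i < n; i++) t_a[i].stop = 0;           /* step 8 */
    return passes;
}
```

Clearing the stop flags only after every thread has gone inactive matters: if they were cleared earlier, a thread that had not yet checked its flag would miss the request.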

 

4.2.4      Global Data

The slcExec thread uses the following globals:

4.2.5      Resource Management

The slcExec thread allocates an EPICS OSI message queue upon startup. The slcExec message queue is destroyed as part of the normal cleanup process before the slcExec thread terminates, but since the slcExec exists throughout the lifetime of the slcIOC, this thread never exits in the normal course of events.

4.2.6      Message Logging

The slcExec thread will log all status and error messages using the slcCmlogLogMsg utility (described in the General Purpose Utilities document). In most cases, the error messages logged by the slcExec thread are generated by utility functions it calls. The slcExec thread logs the following messages:

 

MICR_EXNET_INITFAIL – socket library initialization error

MICR_EEXIST – a requested thread does not exist

MSG_NOMBX – a requested message queue does not exist

MICR_CREMBX – error creating a message queue

MICR_SENDMBX – error sending to a message queue

MICR_READ_MBX – error reading from a message queue

(possibly more – TBD)

4.2.7      Diagnostics

TBD

4.2.8      Major Routines

The major routines of the slcExec thread, the globals, initialization process, and the interface commands to the iocsh and the SLC Control System are defined in two directories.  The exec directory contains the thread, command and initialization sources.  The util directory contains the source and definitions related to the slcIOC globals. 

 

Each major routine is listed and a short description follows:

4.2.8.1  void slcStart(void)

slcStart is defined in the exec/slcMain.c and .h files.  It is the function call and iocsh command that will initialize the slcIOC threads. Typically it is called only during boot time.

4.2.8.2  void slcInitGlobals(void)

slcInitGlobals is defined in the util/slcGlob.c and .h files.  This routine is called only once, at boot time, by the slcStart function.  It initializes all slcIOC global variables, resources, and structures to a known value.  Pointers to resources such as thread Ids, event Ids, message queue Ids, and socket descriptors are initialized to a known value that indicates the resource is not allocated yet (such as NULL or an invalid flag).

4.2.8.3  void slcRestart(void)

slcRestart is defined in the exec/slcMain.c and .h files.  slcRestart is the function call and iocsh command to restart the slcIOC threads.

 

The exec/slcSub.c file contains the slcRestProc subroutine record used to implement the SLC Control System IPL command.

4.2.8.4  void slcStop(void)

slcStop is defined in the exec/slcMain.c and .h files.  slcStop is the function call and iocsh command to stop the slcIOC threads without stopping the EPICS IOC functionality.

 

The exec/slcSub.c file contains the slcRestProc subroutine record used to implement the SLC Control System RESET command.

4.2.8.5  void slcExecThread(void)

slcExecThread is defined in the exec/slcExec.c and .h files. This is the main routine and message queue loop for the slcExec thread. To initialize, it attaches the socket library, creates a message queue, and sets the active flag to true.  It then goes into the main loop, waiting at the message queue for messages.  This routine normally will never terminate.

4.2.8.6  void cleanup(void)

cleanup is defined in the exec/slcExec.c file.  It performs all resource cleanup for the slcExec thread when it terminates (which never happens in normal operations).  It releases the socket library, destroys the slcExec message queue and destroys the dbDownloadEvent semaphore.  Finally, it sets the slcThreads_as[].tid_ps to NO_THREAD.

4.2.8.7  epicsBoolean stopThreads(void)

slcStopThreads is defined in the exec/slcExec.c file.  It sets all thread ‘stop’ flags to true, sends an IOC_STOP message to all threads with message queues, and closes all sockets.  It then waits for all threads to set their active flags to false.  Once all threads are inactive it sets all stop flags back to false and destroys the dbDownloadEvent.

 

At this time (TBD testing) this routine always returns True; it will hang on the wait if there is an error.

4.2.8.8  epicsBoolean startThreads(void)

slcStartThreads is defined in the exec/slcExec.c file. This routine starts all non-*Recv threads, waits for them to set their active flags to true, starts the *Recv threads, waits for them to set their active flags to true, then returns.

 

At this time (TBD testing) this routine always returns True; it will hang on the waits if there is an error.

4.2.8.9  epicsBoolean anyJobActive(void)

anyJobActive is defined in the exec/slcExec.c file.  This routine checks the active flag of each thread and returns TRUE if any thread is still active.

4.2.8.10                startDatabaseService

startDatabaseService is defined in the exec/slcExec.c file. This routine creates the dbDownloadEvent, then creates the three Database Service threads (dbRecv, dbSend, and dbHdlr).  It waits for these threads to become active, then sends the DB_DOWNLOAD_REQ message to the dbSend thread.  It then waits for the dbExists flag to become True before returning.  If there is an error in this process, this routine currently will NOT return. It can cause the slcIOC to hang, requiring a hard reboot.

4.2.8.11                waitForAsyncReady

waitForAsyncReady is defined in the exec/slcExec.c file. After the Database Service is started and the database has been successfully downloaded, the cstrAsync thread is started.  This routine is called to wait for the asyncExists flag to be set to True, signaling that the async function table has been successfully initialized. This routine currently will NOT return if the cstrAsync thread does not initialize successfully. It uses a short sleep interval (0.5 sec) between checks within the loop.

4.2.8.12                startHdlrThreads

startHdlrThreads is defined in the exec/slcExec.c file. This routine is called to start the Hdlr threads after the asyncExists flag is true.  It first reads JMSK to find out which Hdlr threads must be started for this particular slcIOC, then starts up each required Hdlr thread.  The msgHdlr, msgSend and cstrHdlr threads are always started by this routine as well.

 

The cstrHdlr thread is always started last in this routine, so that the other threads have been created and their thread ids (tid_ps) have been initialized.  The cstrHdlr checks these thread Ids against the required job list to verify that all required “jobs” are running.
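How JMSK selects the Hdlr threads is not spelled out in this document. The sketch below assumes, purely for illustration, that JMSK is a 32-bit mask in which bit n set means that the job with job code n (an index into the 32-entry slcJobs_as[] array) must be started. The function name slcJobsFromMask is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SLC_NUM_JOBS 32     /* matches the size of slcJobs_as[] */

/* Collect, in ascending order, the job codes whose bits are set in jmsk.
 * ASSUMPTION: bit n of JMSK corresponds to job code n.
 * jobs_a must have room for SLC_NUM_JOBS entries.
 * Returns the number of job codes written. */
int slcJobsFromMask(uint32_t jmsk, int *jobs_a)
{
    int code, n = 0;
    assert(jobs_a != 0);
    for (code = 0; code < SLC_NUM_JOBS; code++) {
        if (jmsk & ((uint32_t)1 << code)) {
            jobs_a[n++] = code;
        }
    }
    return n;
}
```

A startup routine could walk the resulting list to create each required Hdlr thread, and only afterwards create the cstrHdlr thread, so that the thread ids it checks are already filled in.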

4.2.8.13                waitForRecvThreadActive

waitForRecvThreadActive is defined in the exec/slcExec.c file.  This routine loops forever, checking the active flags of the dbRecv and msgRecv threads until they are both TRUE. It uses a short sleep interval (0.5 sec) between checks within the loop.

4.2.8.14                epicsBoolean allThreadsTerminated(void)

allThreadsTerminated is defined in the exec/slcExec.c file.  This routine returns TRUE if every thread's tid_ps == NO_THREAD.

4.2.8.15                epicsBoolean waitForThreadsTerminated(void)

waitForThreadsTerminated is defined in the exec/slcExec.c file. This routine loops forever, calling allThreadsTerminated() until it returns TRUE.  It uses a short sleep interval (0.5 sec) between calls within the loop.

 

 

 

 


 

Contact: Diane Fairley

Last Modified:

Apr. 11, 2005 by dfairley.  Changed description of thread startup order and related routines.

Feb. 28, 2005 by dfairley.  Changed slcRestartSub -> slcSub, TEST_xxx -> IOC_xxx

Feb. 1, 2005 by dfairley.  Added these links and contact