LCLS Controls
SLC-Aware IOC Design
Message Service
Message Service Description and External Interfaces
The following document describes a software design for the SLC-aware IOC (slcIOC) Message Service. The document includes a description of the functionality of the slcIOC Message Service and the design of each thread involved in the Message Service.
Communication among processes in the SLC Control System passes either through the database or through the SLC message service. Messages are used to convey commands to the iRMXmicros and command completion responses back to the SLC Control System. They are also used for error messages. The slcIOCs must mimic the iRMXmicros in action and timing, to minimize changes in the SLC Control System.
The slcIOC Message Service accepts a subset of SLC Control System request messages, plus a very few new messages, all structurally the same as those sent to SLC micros. These messages are queued for other tasks to process. If necessary, the other tasks then queue replies to the requests, each reply normally consisting of status and data (if applicable), which the slcIOC Message Service sends back to the requesting SLC Control System process. The slcIOC Message Service also processes MSG messages, specifically the MSG_IOC_SLCNOTIFY message that is sent when an slcIOC is RSET or IPL’ed from the SLC Control System.
The SLC Control System produces message data that is little-endian, in VMS data formats, and the message data bytes are packed. The slcIOCs are required to perform the necessary conversions, unpacking and packing to allow the slcIOC to interpret this data properly, for any of the operating systems and architectures of the slcIOC.
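The byte-order part of this conversion can be sketched as follows. This is a minimal illustration, not the slcIOC conversion utilities themselves: the function names are hypothetical, and only integer fields are covered (VMS floating point and time formats need separate handling). Reading the bytes individually gives the same result on little-endian and big-endian hosts.

```c
#include <stdint.h>

/* Hypothetical helpers: decode little-endian wire data into native
 * integers without depending on the host byte order. */
uint16_t le16_to_host(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t le32_to_host(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Encode a native 16-bit value as two little-endian bytes. */
void host_to_le16(uint16_t v, unsigned char *p)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)(v >> 8);
}
```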
The following paragraph is extracted from the document Notes by Tony Gromme, 3/2004 and is relevant to the slcIOC Message Service:
There are four "message service" pathways between SCP and iRMXmicro: One for most request and reply transactions; two for database transactions (which are performed by a process named DBEX running on the SLC control system); and one for TIMEX (which is the name of another process running on the SLC control system). The slcIOC Message Service will manage all request and reply transactions initiated by the SLC Control System. All request and reply transactions will arrive at the slcIOC via TCP/IP. These messages go through a "proxy" node. This proxy node acts as a router for SLC Control System messages, minimizing the number of connections required between the SLC control system processes and the Micros.
BUG:Message Communications – describes the SLC Message Service.
Notes by Tony Gromme, 3/2004 – further discussion of the SLC Message Service is included.
SLC Executive Thread and SLC-IOC Initialization by Diane Fairley
See the SLC-Aware IOC Functional Requirements by Stephanie Alison.
The Message Service includes three threads, shown in green in the block diagram below: the msgRecv thread, the msgHdlr thread and the msgSend thread. The msgRecv and msgSend threads share the message service socket and are responsible for receiving and sending SLC Control System messages. The msgHdlr thread handles MSG function messages. The slcExec thread is responsible for starting and stopping the message service threads (and all other threads) during startup, upon restart, and upon shutdown of the SLC-aware threads.
The msgSend and msgHdlr threads (and all other job threads) each maintain a Message Queue to receive the incoming messages that the thread is responsible for handling.
The Message Service also maintains two memory pools for temporary storage of incoming messages, and outgoing reply messages. There is a pool of small buffers that can handle most Message Service messages, and a pool of large buffers for the large amounts of reply data sent by certain jobs. Please see the Message Service Memory Pools section for a detailed explanation of the message handling design.
The Message Service Socket is the interface between the slcIOC Message Service and the SLC Control System. It is connected, as a client, to the Proxy Forward Server. The msgRecv thread creates and connects this socket with the Proxy. The msgRecv thread is also responsible for re-connecting the socket when a connection, receive or send failure occurs.
When the slcIOC is commanded to stop, the slcExec thread is responsible for destroying the socket; this allows the msgRecv thread to terminate. See msgRecv Thread Detailed Design for a complete explanation.
The msgSend thread message queue is used by all threads for sending message service messages to the SLC Control System. The “job” threads of the slcIOC will send their replies to the msgSend message queue.
The msgRecv thread will send all MSG messages to the msgHdlr message queue. The slcExec thread will send an IOC_STOP message to the msgHdlr thread.
The msgRecv thread is responsible for receiving SLC Control System requests at the Message Service socket and forwarding them on to the proper “job” thread. The msgRecv thread gets a buffer from the small memory pool and copies the incoming TCP message data into the buffer. The msgRecv thread converts the forward header and msg header portions of the message data from VMS format to native format before forwarding the message to the job thread message queue.
The job thread receives an incoming message at its message queue and unpacks / converts the function specific data. It then completes the function requested, releases the received message buffer back to the small memory pool, and formulates a reply message. The reply message is stored in a new buffer, gotten from either the small or large message pool, depending on the size of the data. The reply data is converted and packed for the SLC Control system, but the msg header and fwd header portions are left in the native format. The job thread then sends the reply message on to the msgSend message queue.
The msgSend thread is responsible for accepting reply messages from the “job” threads and sending them to the SLC Control System. When required, it will break up large reply messages into multiple packets before sending them. The msgSend thread assumes the “job” data portion of the message is already formatted and packed properly for the SLC Control System. It must format the msg header and forward header portions before sending to the SLC Control System. Finally, it must release the reply message buffer back to its proper memory pool when finished.
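The arithmetic behind splitting a large reply can be sketched as below. This document does not specify the chunking protocol, so the function names and the per-packet limit passed as chunk_max are assumptions made only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: number of packets needed to carry total_bytes
 * when each packet holds at most chunk_max bytes (ceiling division). */
size_t chunk_count(size_t total_bytes, size_t chunk_max)
{
    if (chunk_max == 0)
        return 0;
    return (total_bytes + chunk_max - 1) / chunk_max;
}

/* Hypothetical helper: byte length of chunk number idx (0-based).
 * Every chunk is full except possibly the last; indexes past the
 * end of the message yield 0. */
size_t chunk_len(size_t total_bytes, size_t chunk_max, size_t idx)
{
    size_t start;

    if (chunk_max == 0)
        return 0;
    start = idx * chunk_max;
    if (start >= total_bytes)
        return 0;
    return (total_bytes - start < chunk_max) ? total_bytes - start
                                             : chunk_max;
}
```

For example, a 5000-byte reply with a 2048-byte limit needs three packets, the last carrying 904 bytes.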
All SLC messages have a simple structure consisting of 12 words of forward header information, followed by 10 words of message header information. Request messages sent to a Micro or slcIOC then include up to 1002 words of data. The slcIOC and Micros may send reply messages with the same size forward and message headers, and up to 4072 words of data. NOTE: This message structure is shared by the SLC Control System, the existing iRMXmicros, and the new slcIOCs. If changed for one, it must be changed for all. The structure called msgmail_ts, or lrgmail_ts, is as follows:
Name | Data type | Represents
fwd_hdr_ts | |
ip_port_u | ip_port_tu (int4u) | Lower half of the IP address, and the port number
len | int4u | Message bytecount minus this fwdheader
user | user_field_ts (int2u) | User defined; chunk count for large buffers
cmd | int1u | Fwd_server command, e.g.
crc | int1u | 8-bit crc over header; currently set to 0x55
msgheader_ts | |
source[4] | char | Name of Alpha job that sent the message
dest[4] | char | Name of destination micro
timestamp[2] | int4u | VMS format timestamp
func | int2u | Function code (job code + function code)
datalen | int2u | Word count of the data in reqdata
reqdata[] | Array of int2u | Function-specific message data; packed. Max size is NETVAXMSGLEN - sizeof(msgheader_ts)/2 for messages from the SLC Control System, and NETMICMSGLEN - sizeof(msgheader_ts)/2 for messages from the slcIOC to the SLC Control System.
The forward header information is used by the Proxy server to interpret which slcIOC, or which VMS process, is to receive the message.
NOTE: It is assumed that the normal byte alignment is no more than 4-bytes for all the possible slcIOC processor architectures. By making this assumption the forward header and message header are guaranteed to be ‘packed’ in the native structures, so there is no need for the message service to pack and unpack these portions of the incoming message. Conversion from little-endian to native, and conversion of data formats such as floating point and time, is still required.
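The alignment assumption in this NOTE can be checked at compile time. The sketch below restates both headers with fixed-width types, using the field widths from the table above. The sketch_ type names and the two check typedefs are hypothetical; the real definitions live in the project's own headers. With those widths the headers occupy 12 and 20 bytes, and the build fails if a compiler inserts padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical restatement of the forward header. */
typedef struct {
    uint16_t ipaddr_lo;     /* low 16 bits of the IP address      */
    uint16_t conn_id;       /* port number                        */
    uint32_t datalen;       /* byte count excluding this header   */
    uint8_t  user[2];       /* user defined                       */
    uint8_t  cmd;           /* forward server command             */
    uint8_t  crc;           /* currently 0x55                     */
} sketch_fwd_hdr_ts;

/* Hypothetical restatement of the message header. */
typedef struct {
    char     source[4];     /* sending job name                   */
    char     dest[4];       /* destination micro name             */
    uint32_t timestamp[2];  /* VMS format timestamp               */
    uint16_t func;          /* function code                      */
    uint16_t datalen;       /* word count of reqdata              */
} sketch_msgheader_ts;

/* A negative array size is a compile error, so these typedefs only
 * compile when the structures contain no padding. */
typedef char fwd_hdr_is_packed[(sizeof(sketch_fwd_hdr_ts) == 12) ? 1 : -1];
typedef char msgheader_is_packed[(sizeof(sketch_msgheader_ts) == 20) ? 1 : -1];
```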
The following paragraphs are extracted from the document Notes by Tony Gromme, 3/2004. They give an overview of the SLC Message Service message headers and proxy communication. This must be duplicated by the slcIOCs in order to communicate properly with the SLC Control System.
Any message transmitted in either direction through the TCP proxy must begin with a "TCP proxy forward header", defined by C structure fwd_hdr_ts in include file REF_C_INC:MSGHEAD.HC. (Note that fwd_hdr_ts header is absent from messages transmitted via SLCnet.) The TCP proxy (PX00 in prod network and PX01 in dev network, each running Linux) supports a TCP connection to each interested process in the (prod or dev) Alpha, and four TCP connections to each micro that has Ethernet. The proxy itself is as passive as possible, in the sense that none of the TCP connections just mentioned is established by the proxy.
From a micro, to establish such a connection with the (prod or dev) proxy, you connect specifying proxy TCP port 6060 (named PROXIES_HOST_MSG_PORT below), and since you probably can't choose TCP port number for your own end of this connection, you send a first message specifying one of following four "MICROS_HOST_":
MICROS_HOST_CTL_PORT 6090 - TCP port on micro for control messages, including ack message from DBEX to micro acknowledging data message from micro to DBEX.
For the given connection, this initial "registration" fwd_hdr_ts also contains the least significant 16 bits of the micro's own (your own) IP address; the proxy assumes that IP addresses truncated to least significant 16 bits are unique across all its client micros. After receiving this initial "registration" message on this connection, the proxy can recognize messages from the Alpha destined for this connection. For clarity here is a simplified restatement (without unions) of structure fwd_hdr_ts:
typedef struct
{
    unsigned short ipaddr_lo,
                   conn_id;
    unsigned long  datalen;
    unsigned char  user[2],
                   cmd,
                   crc;
} fwd_hdr_ts;
This initial "registration" fwd_hdr_ts the micro sends to the proxy (of course goes no further than the proxy and) must contain:
For subsequent messages which the micro sends (you send) to the proxy, ipaddr_lo and conn_id become one 32-bit destination field, which is the 4-byte ascii Alpha process name unique abbreviation (in network byte order in the sense that most significant character of name is transmitted first). Field cmd must contain value PX_FORWARD_ALIAS_FUNC, and field crc always contains 0x55. For each message stream which the micro receives from the proxy (i.e., for each connection), the micro assumes the datalen ( = bytecount not including fwd_hdr_ts itself) in each message's fwd_hdr_ts will in effect point exactly to the next incoming fwd_hdr_ts, and if a message received does not begin with a recognizable & correct fwd_hdr_ts then the micro should terminate that connection, reconnect, and send another initial "registration" fwd_hdr_ts, and hope for the best.
(End of excerpt)
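The framing rule in the excerpt (each datalen points exactly to the next fwd_hdr_ts) can be sketched as a check over a received byte stream. This is an illustration under stated assumptions: the byte offsets follow the simplified fwd_hdr_ts above with no padding, datalen is taken to be little-endian on the wire, and the function name, status enum and max_data parameter are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FWD_HDR_BYTES 12u   /* size of the simplified fwd_hdr_ts */
#define FWD_HDR_CRC   0x55u /* fixed crc value given in the notes */

typedef enum { FRAME_OK, FRAME_NEED_MORE, FRAME_BAD } frame_status;

/* Examine the bytes at buf (avail of them) and decide whether a
 * complete, plausible message starts there. On FRAME_OK, *frame_len
 * is the header plus data length, i.e. the offset of the next
 * forward header. FRAME_BAD means the caller should drop the
 * connection and register again. */
frame_status next_frame(const unsigned char *buf, size_t avail,
                        uint32_t max_data, size_t *frame_len)
{
    uint32_t datalen;

    if (avail < FWD_HDR_BYTES)
        return FRAME_NEED_MORE;          /* header not complete yet */
    if (buf[11] != FWD_HDR_CRC)
        return FRAME_BAD;                /* not a recognizable header */

    /* Assumption: datalen sits at byte offset 4, little-endian. */
    datalen = (uint32_t)buf[4]
            | ((uint32_t)buf[5] << 8)
            | ((uint32_t)buf[6] << 16)
            | ((uint32_t)buf[7] << 24);
    if (datalen > max_data)
        return FRAME_BAD;                /* byte count out of range */
    if (avail - FWD_HDR_BYTES < datalen)
        return FRAME_NEED_MORE;          /* data not complete yet */

    *frame_len = FWD_HDR_BYTES + (size_t)datalen;
    return FRAME_OK;
}
```

A receiver would call this repeatedly, advancing by frame_len after each FRAME_OK, so that a single corrupted header is detected at the next boundary rather than silently shifting every later message.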
PX_REGISTER_PORT_FUNC
In summary, the PX_REGISTER_PORT_FUNC command is placed in the cmd field of the forward header by the msgRecv thread whenever it attempts to make a connection with the Proxy. It is used to “register” the slcIOC with the proxy. There is no corresponding “unregister” command.
PX_FORWARD_ALIAS_FUNC
The PX_FORWARD_ALIAS_FUNC command is placed in the forward header of all outgoing messages by the msgSend thread. It tells the Proxy to pass the message on to an SLC Control System process identified by the “alias” contained in the ip_port_tu field, a 4-byte ASCII name. In the outgoing path, slcIOC to Control System, the ip_port_tu field is NOT a port; it is a process name.
The message header includes source and destination specifiers, a time stamp, a function code, and a data length word. The source and destination are 4-character ASCII names that identify either a VMS process or an slcIOC. VMS names are of the form V0nn, where nn is the SLCNET interrupt used by the process; slcIOC names are the standard database names. The function code has two bytes: a high byte to identify the Facility (job) and a low byte to specify the explicit function to be executed. MSG is an example Facility.
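The split of the function code into its two bytes can be sketched as below. The helper names are hypothetical, and the code assumes the func field has already been converted to native byte order.

```c
#include <stdint.h>

/* Hypothetical helpers for the 16-bit function code:
 * high byte = Facility (job), low byte = function within it. */
uint8_t func_facility(uint16_t func)
{
    return (uint8_t)(func >> 8);
}

uint8_t func_code(uint16_t func)
{
    return (uint8_t)(func & 0xFFu);
}

uint16_t func_make(uint8_t facility, uint8_t code)
{
    return (uint16_t)(((unsigned)facility << 8) | code);
}
```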
NOTE: It is assumed that the normal byte alignment is no more than 4-bytes for all the possible slcIOC processor architectures. By making this assumption the forward header and message header are guaranteed to be ‘packed’ in the native structures, so there is no need for the message service to unpack and pack these portions of the incoming message. Conversions from little-endian to native, and conversions of data formats such as floating point and time, are still required.
The reqdata portion of the message structure contains data specific to the function code. This data is packed, and is structured according to the needs of the receiving “job” thread.
NOTE: the slcIOC “job” threads share the structure of this function specific data with the SLC Control System and the iRMXmicros.
The size of the Message Service messages is variable, from 22 bytes in a single command type message, to a possible several thousand bytes of data in a reply message. The Message Service is guaranteed to never have a large incoming message, and the design of the incoming message handling depends on that guarantee. Incoming messages are limited to NETVAXMSGLEN*2 bytes (this is 2048 bytes).
The msgSend thread and the msgHdlr threads (and all other job threads) each maintain a Message Queue to accept messages the thread is required to handle. These queues are implemented with their own pre-allocated memory pools, which are managed by the Message Queue utilities provided in the EPICS OSI Message Queue library. This library requires that messages put into, and taken out of, the queues are actually copied (with a memcpy). There is concern that these multiple copies may slow down the message service, as messages are passed from queue to queue.
To address this problem the Message Service maintains two pre-allocated memory pools for temporary storage of messages received from the SLC Control System, and for storing the reply messages. The EPICS libCom memory manager freeList is used to create and manage these memory pools. This library of routines allocates a large pool of memory, and manages it as many smaller blocks of memory (all of the same size). One memory pool holds small buffers, each sized to the maximum size of an incoming message. This pool can handle most message traffic. A second memory pool of large buffers is maintained for reply messages that are larger than the maximum incoming message size.
The message service uses these pre-allocated buffers for messages, and passes a pointer to the buffer from queue to queue. This avoids both multiple copies of large messages, and memory fragmentation due to repeated direct memory allocations and deallocations.
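The idea of a pool of equally sized, pre-allocated buffers can be sketched in plain C as below. This is a stand-in written for illustration, not the EPICS freeList API: the pool_ names are hypothetical, and the sketch has no locking, which a pool shared between threads would need.

```c
#include <stddef.h>
#include <stdlib.h>

/* A free block stores the link to the next free block in its own
 * first bytes, so the pool needs no separate bookkeeping memory. */
typedef struct pool_block { struct pool_block *next; } pool_block;

typedef struct {
    void       *storage;    /* one allocation backing all blocks */
    pool_block *free_head;  /* singly linked list of free blocks */
    size_t      block_size;
} pool_ts;

/* Allocate nblocks buffers of block_size bytes in one piece.
 * Returns 0 on success, -1 if the memory cannot be allocated. */
int pool_init(pool_ts *p, size_t block_size, size_t nblocks)
{
    size_t i;

    if (block_size < sizeof(pool_block))
        block_size = sizeof(pool_block);
    /* Round up so every block starts on a pointer-aligned address. */
    block_size = (block_size + sizeof(void *) - 1)
               / sizeof(void *) * sizeof(void *);

    p->storage = malloc(block_size * nblocks);
    if (p->storage == NULL)
        return -1;
    p->block_size = block_size;
    p->free_head = NULL;
    for (i = 0; i < nblocks; i++) {
        pool_block *b = (pool_block *)((char *)p->storage + i * block_size);
        b->next = p->free_head;
        p->free_head = b;
    }
    return 0;
}

/* Take one buffer from the pool; NULL when the pool is exhausted. */
void *pool_get(pool_ts *p)
{
    pool_block *b = p->free_head;
    if (b != NULL)
        p->free_head = b->next;
    return b;
}

/* Return a buffer previously obtained with pool_get. */
void pool_put(pool_ts *p, void *buf)
{
    pool_block *b = (pool_block *)buf;
    b->next = p->free_head;
    p->free_head = b;
}

void pool_destroy(pool_ts *p)
{
    free(p->storage);
    p->storage = NULL;
    p->free_head = NULL;
}
```

Because all memory is obtained once in pool_init, repeated get and put calls cannot fragment the heap, which is the property the design relies on.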
This is a message structure used internally for passing the Message Service message buffer pointers between threads via message queues. It has the following structure:
typedef struct {
    void           *msgptr_p;
    unsigned short  msgsize;
} msgptr_ts;
The size of, and pointer to, the allocated message buffer is inserted in the message queue. This pointer may be cast to the msgmail_ts, or lrgmail_ts structure by the receiving thread when read out of the message queue.
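The effect of queueing a descriptor instead of the message itself can be sketched with a small ring buffer. This is a single-threaded stand-in for illustration only; the real design uses the EPICS OSI Message Queue library, and the ptrq_ names and QDEPTH are hypothetical. Each send or receive copies only the few bytes of a msgptr_ts, regardless of how large the message buffer is.

```c
#include <stddef.h>
#include <string.h>

/* Restatement of the descriptor described above. */
typedef struct {
    void           *msgptr_p;  /* pointer to the pooled message buffer */
    unsigned short  msgsize;   /* size of that buffer                  */
} msgptr_ts;

#define QDEPTH 8               /* hypothetical queue capacity */

typedef struct {
    msgptr_ts slots[QDEPTH];
    size_t    head;            /* index of the oldest entry   */
    size_t    count;           /* number of queued entries    */
} ptrq_ts;

/* Copy the descriptor into the queue. Returns -1 when full. */
int ptrq_send(ptrq_ts *q, const msgptr_ts *m)
{
    if (q->count == QDEPTH)
        return -1;
    memcpy(&q->slots[(q->head + q->count) % QDEPTH], m, sizeof *m);
    q->count++;
    return 0;
}

/* Copy the oldest descriptor out. Returns -1 when empty. */
int ptrq_receive(ptrq_ts *q, msgptr_ts *m)
{
    if (q->count == 0)
        return -1;
    memcpy(m, &q->slots[q->head], sizeof *m);
    q->head = (q->head + 1) % QDEPTH;
    q->count--;
    return 0;
}
```

The receiving thread owns the buffer once it has read the descriptor, and is therefore the one that must release it back to its pool.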
All Message Service Threads ‘own’ an slcThreads_ts structure stored in the global slcThreads_as array. The fields are commonly used as follows:
WORK IN PROGRESS. Message Logging and Diagnostics are not complete.
The msgRecv thread performs the following to initialize:
The normal processing loop follows:
As it is not uncommon for a socket connection to have failures, the msgRecv thread has an outer loop to try to re-establish a connection if there is a problem with TCP communications. This is done using the tcpSockConnect utility. The msgRecv thread will continue to attempt to make a connection until successful, or until the thread is stopped by slcThreads_as[msgRecv].stop=TRUE.
Each failed attempt will generate and log an error message.
In general, the msgRecv thread spends most of its time waiting at the sdMsgSrvc socket as part of step 2. above. The only way to break out of this wait is to close the sdMsgSrvc socket. The slcExec thread will take care of this whenever it receives a Restart or Stop command. Once the sdMsgSrvc socket is closed, the msgRecv thread will return (with an error) from the wait at the socket, and is free to complete the following steps:
Check that the slcThreads_as[msgRecv].stop flag == epicsTrue. If so then
Beyond the global threads structure, the msgRecv thread uses the following globals:
The msgRecv thread manages the slcSockets_as[sdMsgSrvc] Message Service socket. Whenever a failure is detected the msgRecv thread is responsible for re-connecting with the Proxy. The EPICS osiSock library is used to implement the sockets programming.
Since all incoming messages from the SLC Control System are expected to be small the msgRecv thread will only request buffers from the small memory pool.
The msgRecv thread will get a buffer from the small memory pool for each incoming message. If there is a failure in receiving the message from TCP/IP, the msgRecv thread will release the buffer back to the pool, before looping back to try to re-connect and receive again.
If the message data is received and copied to the message buffer successfully, a pointer to the message buffer will be forwarded to a “job” thread. It is the responsibility of the “job” thread to release this buffer when done with it.
In most cases messages are logged by the utilities used by the msgRecv thread. The slcCmlog utilities are used for logging the following status and error messages:
MICR_EXNET_INITFAIL – failed to initialize the socket library
MICR_EXNET_SOCKFAIL – failed to create a socket, setsockopts, or bind to the socket
MICR_EXNET_CONNFAIL – failed to connect the socket to the proxy port
MICR_EXNET_FWDREGFAIL – the registration with the proxy fails.
MICR_EXNET_EOFERR – 0 bytes were read
MICR_EXNET_RECVFAIL – failed on the recv socket call
MICR_EXNET_FWDCRC – forward header CRC did not pass
MICR_EXNET_FWDBCMAX – received a forward header with a bytecount greater than the allowed maximum
MICR_EXNET_FWDBCSHT – received a data buffer containing fewer bytes than indicated in the forward header.
MICR_EEXIST – a requested thread does not exist
MSG_NOMBX – a requested message queue does not exist
MICR_SENDMBX – error sending to a message queue
MICR_MALLOC – a message buffer could not be allocated
Others TBD
TBD
msgRecvThread is the main procedure and loop of the msgRecv thread. It initializes the msgRecv resources and globals, maintains the Message Service socket connection, and waits for messages at the socket. It cleans up resources when terminating.
This routine is called just before the thread terminates. It sets the active flag to false and calls the osiSockRelease epicsSockets utility. The msgRecv thread does not have a message queue, but it still sets mqId_ps to NULL and tid_ps to NO_THREAD.
This routine assigns the correct proxy and host identifiers before calling the tcpSockConnect utility to make a connection to the Proxy.
vmsstat_t getMessage(msgmail_ts **const msg_ps, int *cnt_p)
This routine manages the network read functionality. First it gets a small message buffer using the msgQGetSmallBuffer utility, then uses the tcpGetBuffer utility to read the incoming message into the message buffer. It returns the status code generated by the utility routines.
unsigned short prepMessage( msgmail_ts *msg_ps)
This routine swaps the bytes of the message header func and dest fields using the cvtSSwap utility. It swaps the bytes in place, assuming a 4-byte boundary for all slcIOC processors (see NOTE). It then replaces the timestamp with the current time. It also translates the function code into an slcIOC thread enumeration and stores this “job_id” into the msg_ps->job_id field. It then calculates the actual byte length of the message. The byte length is returned.
WORK IN PROGRESS. Message Logging and Diagnostics are not complete.
The msgSend thread performs the following as initialization
As it is not uncommon for a socket connection to have failures, the msgSend thread has an outer loop to try to wait for a good connection if there is a problem with TCP communications, using the tcpSockCheck utility. It is the msgRecv thread’s responsibility to actually re-establish the connection. The msgSend thread will continue to check for a good connection until the message is sent successfully, or until the thread is stopped by slcThreads_as[msgSend].stop=TRUE.
Normally, the msgSend thread will detect the stop=TRUE flag quickly within its main loop. It may be possible, though, for the msgSend thread to get stuck on a socket send call. The only way to break out of this wait is to close the sdMsgSrvc socket. The slcExec thread will take care of this whenever it receives a Restart or Stop command. Once the sdMsgSrvc socket is closed, the msgSend thread will return (with an error) from the wait at the socket, and is free to complete the following steps:
When the slcThreads_as[msgSend].stop flag == epicsTrue
The msgSend thread reads the following global:
The msgSend thread will wait for the msgRecv thread to re-connect when a failure occurs at the socket.
The msgSend thread will release the reply message buffer back to the pool after the message has been sent. If the send fails, it will retry until the message is successfully sent, or until the msgSend thread is stopped.
The msgSend thread creates a message queue for incoming messages. The message queue is destroyed before the thread exits. The msgSend thread uses the messageQRelease utility to destroy the queue, which first reads out each message in the queue and releases the message buffer it points to, then destroys the queue.
The msgSend thread will log all status and error messages using the slcCmlogLogMsg utility (described in the slcCmlog Utilities section of the General Purpose Utilities document). The msgSend thread logs the following messages:
MICR_EXNET_SENDFAIL – a failure occurred during a socket send
MICR_CREMBX – error creating a message queue
MICR_READ_MBX – error reading from a message queue
Others TBD
TBD
msgSendThread is the main procedure and loop of the msgSend thread. It initializes the msgSend resources and globals, waits for a good socket connection, and waits for messages at the message queue. It cleans up resources when terminating.
This routine is called just before the thread terminates. It sets the active flag to false and destroys the msgSend message queue using the messageQRelease utility.
vmsstat_t msgSendBuffer(const unsigned long dataBytes, const lrgmail_ts *msg_ps)
This routine fills in the forward header of the message then sends the message using the tcpSendBuffer utility. It accepts the message as a lrgmail_ts pointer regardless of the actual size of the buffer. It relies on the msgheader.datalen to avoid overreaching.
This routine swaps the bytes of the message header fields using the cvtSSwap and cvtLSwap utilities. It swaps the bytes in place, assuming a 4-byte boundary for all slcIOC processors (see NOTE).
WORK IN PROGRESS. Message Logging and Diagnostics are not complete.
The msgHdlr thread performs the following as initialization
The msgHdlr thread represents the logic of any message-queue driven thread in the slcIOC. It accepts incoming message commands and acts on them. The basic loop is:
The msgHdlr thread performs the functions related to the following messages from the SLC Control System, and from the slcExec thread.
The slcExec thread sends the IOC_STOP message to tell the msgHdlr thread to terminate itself. The msgHdlr terminates without a reply.
The MSG_IOC_SLCNOTIFY message is sent by the SLC Control System to either IPL (restart the slc-aware threads) or Reset (stop the slc-aware threads) the slcIOC. The msgHdlr saves the reqdata portion of this message, replies with a response status message to msgSend, then waits for a second message, IOC_IOC_SLCNOTIFY, from the msgSend thread. See below.
When the msgSend thread receives the MSG_IOC_SLCNOTIFY response message from the msgHdlr thread, it sends an IOC_IOC_SLCNOTIFY message back to the msgHdlr. The msgSend thread is responsible for making sure the reply goes out on TCP to the SLC Control System before the msgHdlr actually acts on this command. Otherwise, the command could stop the msgSend thread before it sends out the reply.
When msgHdlr receives IOC_IOC_SLCNOTIFY, it executes the slcRestart() function if the reqdata portion contains the string “BOOT”, or the slcStop() function if the reqdata portion contains the string “RSET”.
Termination is initiated by the slcExec thread setting the slcThreads_as[msgHdlr].stop flag to TRUE, or when the msgHdlr thread receives an IOC_STOP message.
When the slcThreads_as[msgHdlr].stop flag == epicsTrue, or IOC_STOP is received:
The msgHdlr thread uses the following globals:
The msgHdlr thread creates a message queue for incoming messages. The message queue is destroyed before the thread exits. The msgHdlr thread uses the messageQRelease utility to destroy the queue, which first reads out each message in the queue and releases the message buffer it points to, then destroys the queue.
Once it is done acting upon the incoming message the msgHdlr thread releases the message buffer in which it was stored. The msgHdlr thread allocates the memory block required to store the reply data for the current reply message.
Most messages logged by the msgHdlr thread are generated by utility functions. The msgHdlr thread will log all status and error messages using the slcCmlogLogMsg utility (described in the slcCmlog Utilities section of the General Purpose Utilities document). The msgHdlr thread logs the following messages:
MICR_EEXIST – a requested thread does not exist
MSG_NOMBX – a requested message queue does not exist
MICR_SENDMBX – error sending to a message queue
MICR_CREMBX – error creating a message queue
MICR_READ_MBX – error reading from a message queue
MICR_MALLOC – error when trying to allocate a message buffer
TBD
msgHdlrThread is the main procedure and loop of the msgHdlr thread. It initializes the msgHdlr resources and globals, and waits for messages at the message queue. It cleans up resources when terminating.
This routine is called just before the thread terminates. It sets the active flag to false and destroys the msgHdlr message queue by calling the messageQRelease utility.
This routine is called when the IOC_STOP message is received. It sets the slcThreads_as[msgHdlr].active flag to false. The thread main loop will detect the active flag is false and call the cleanup function, which causes the thread to exit.
vmsstat_t replyIOCNotify(const msgmail_ts * msg_ps)
This routine is called when the MSG_IOC_SLCNOTIFY message is received. Since this request ultimately shuts down all the slc-aware threads, this message must be handled in two parts. First, the reply is sent to the SLC Control system, indicating that the request was received. This reply must get out before the threads are stopped. Once the reply message is sent out by the msgSend thread, the msgSend thread must then send an IOC_IOC_SLCNOTIFY message back to this msgHdlr thread, telling it to go ahead and perform the requested SLCNOTIFY function.
This replyIOCNotify routine will store the reqdata portion of the incoming message in a local static variable for later use. The data is 4 characters that do not need to be converted. It then sends a response message back to the SLC Control System. This response is the status code MICR_OKOK copied into the first two reqdata words, and the response bit set in the msgheader.func field.
This function returns the following status codes:
MICR_OKOK – upon success
MICR_EEXIST – if the msgSend thread or the msgSend message queue does not exist.
MICR_SENDMBX – if there is an error sending a message to the msgSend message queue.
vmsstat_t doIOCNotify(const msgmail_ts * msg_ps)
This routine is called when the IOC_IOC_SLCNOTIFY message is received from the msgSend thread. This message indicates the SLC Control System has requested a stop or restart, and the msgHdlr is halfway through managing it.
This doIOCNotify routine will examine the local static variable for the current SLCNOTIFY reqdata setting. If it contains the 4 characters “BOOT”, the slcRestart() function is called. If it contains the characters “RSET”, the slcStop() function is called.
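The decision made on the saved reqdata can be sketched as below. The enum and function name are hypothetical, and the sketch only classifies the request rather than calling slcRestart() or slcStop(). Because the saved data is four raw characters with no terminating NUL, memcmp with an explicit length is used instead of strcmp.

```c
#include <string.h>

typedef enum { NOTIFY_RESTART, NOTIFY_STOP, NOTIFY_UNKNOWN } notify_action;

/* Classify the 4-character SLCNOTIFY request data.
 * "BOOT" maps to a restart and "RSET" to a stop; anything else is
 * reported as unknown so the caller can decide how to handle it. */
notify_action classify_notify(const char reqdata[4])
{
    if (memcmp(reqdata, "BOOT", 4) == 0)
        return NOTIFY_RESTART;
    if (memcmp(reqdata, "RSET", 4) == 0)
        return NOTIFY_STOP;
    return NOTIFY_UNKNOWN;
}
```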
At this time doIOCNotify always returns MICR_OKOK.
Contact: Diane Fairley
Last Modified:
Feb. 28, 2005. by dfairley. Removed TEST message handling, added MSG_IOC_SLCNOTIFY handling
Feb. 1, 2005. by dfairley. Added these links and contact