Minutes from the 10/27/04 LCLS SLC IOC meetings.
Next Meeting: 11/03/04, 9:30am, in B5, second floor.

(1) MSG/DB design review moved to 11/10 due to the down. Steph - intro, Diane - message service, Debbie - database service. We will probably need more than one meeting for this. We will have a rough draft of what we'll present/discuss by next meeting. We will include functional requirements, design concepts and assumptions, and system data flow diagrams. It is not necessary to go into detail on each individual routine, but we should list the utility routines available to the BPM and MGNT apps.
(2) PRIMARY.MAP Data File and Dictionary - Ken will combine the two files into one. Debbie will enter all unique SLC "primary" names and "primary secondary" names in the dictionary for name validity error checking. All primary/secondary file entries that match the downloaded IOC ST0 database will be stored as "primary unit secondary" in the dictionary. In addition, each of these matching "primary unit secondary" entries will be added to a linked list, managed by the associated "primary" dictionary entry, for use with ALL* commands. A dbput/dbget/dblist of either a non-existent primary or unit (except ALL*) returns an error. Currently on the existing micros, a dbget/dblist of an invalid secondary returns good status but does not update the db list. We will make this an error for IOCs, since PV names have to be mapped to prim/unit/secn and we have to expect typos in PV names.
(3) DB IO Routines - Debbie has finished porting dblist and dbunits from assembly to C, using the ST0 offset as the dblist pointer. She can now test and debug her ST0 endian swap work. She currently uses linear progression for ALL* units. She will now work on the dictionary that combines ST0 and the other database info, and on hashing. Because of the down, the design review, and her other work, she won't finish the DB IO routines until later in Nov.
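A minimal sketch of the dictionary behavior described in items (2) and (3), for discussion only. It assumes a simple chained hash table. All names in it (dictAddName, dictAddUnit, dictCheck, dictCountAll, the status codes, the bucket count, and the key layout) are hypothetical and are not taken from Debbie's code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DICT_BUCKETS 64   /* assumption: illustrative size only */
#define DICT_KEYLEN  32   /* assumption: names are short enough to fit */

typedef enum { DICT_OK, DICT_BAD_PRIM, DICT_BAD_UNIT, DICT_BAD_SECN,
               DICT_NO_MEM } dictStatus;

typedef struct dictEntry {
    char key[DICT_KEYLEN];       /* "prim", "prim secn" or "prim unit secn" */
    char secn[DICT_KEYLEN];      /* set only on "prim unit secn" entries */
    int unit;                    /* set only on "prim unit secn" entries */
    struct dictEntry *hashNext;  /* next entry in the same hash bucket */
    struct dictEntry *listNext;  /* next "prim unit secn" entry of this primary */
    struct dictEntry *listHead;  /* primary entries only: head of that list */
} dictEntry;

static dictEntry *buckets[DICT_BUCKETS];

static unsigned dictHash(const char *s)
{
    unsigned h = 5381;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h % DICT_BUCKETS;
}

static dictEntry *dictFind(const char *key)
{
    dictEntry *e;
    for (e = buckets[dictHash(key)]; e; e = e->hashNext)
        if (strcmp(e->key, key) == 0) return e;
    return NULL;
}

/* Return the entry for key, creating it if needed; NULL on failure. */
static dictEntry *dictAdd(const char *key)
{
    dictEntry *e = dictFind(key);
    unsigned h;
    if (e) return e;
    if (strlen(key) >= DICT_KEYLEN) return NULL;
    e = calloc(1, sizeof *e);
    if (!e) return NULL;
    strcpy(e->key, key);
    h = dictHash(key);
    e->hashNext = buckets[h];
    buckets[h] = e;
    return e;
}

/* Build "prim secn" (unit < 0) or "prim unit secn"; -1 if it does not fit. */
static int dictKey(char *buf, const char *prim, int unit, const char *secn)
{
    int n = (unit < 0) ? snprintf(buf, DICT_KEYLEN, "%s %s", prim, secn)
                       : snprintf(buf, DICT_KEYLEN, "%s %d %s", prim, unit, secn);
    return (n < 0 || n >= DICT_KEYLEN) ? -1 : 0;
}

/* Register a valid primary and "primary secondary" name (from PRIMARY.MAP). */
int dictAddName(const char *prim, const char *secn)
{
    char key[DICT_KEYLEN];
    if (!dictAdd(prim) || dictKey(key, prim, -1, secn)) return -1;
    return dictAdd(key) ? 0 : -1;
}

/* Validate a name: unknown primary, secondary, or unit is an error. */
dictStatus dictCheck(const char *prim, int unit, const char *secn)
{
    char key[DICT_KEYLEN];
    if (!dictFind(prim)) return DICT_BAD_PRIM;
    if (dictKey(key, prim, -1, secn) || !dictFind(key)) return DICT_BAD_SECN;
    if (unit < 0 || dictKey(key, prim, unit, secn) || !dictFind(key))
        return DICT_BAD_UNIT;   /* unit/secondary pair not in the download */
    return DICT_OK;
}

/* Store a "primary unit secondary" entry and link it to its primary. */
dictStatus dictAddUnit(const char *prim, int unit, const char *secn)
{
    char key[DICT_KEYLEN];
    dictEntry *p = dictFind(prim), *e;
    if (!p) return DICT_BAD_PRIM;
    if (dictKey(key, prim, -1, secn) || !dictFind(key)) return DICT_BAD_SECN;
    if (unit < 0 || dictKey(key, prim, unit, secn)) return DICT_BAD_UNIT;
    if (dictFind(key)) return DICT_OK;          /* already linked */
    if (!(e = dictAdd(key))) return DICT_NO_MEM;
    e->unit = unit;
    strcpy(e->secn, secn);
    e->listNext = p->listHead;                  /* push onto the ALL* list */
    p->listHead = e;
    return DICT_OK;
}

/* ALL* support: count the units of prim that carry secn; -1 if prim unknown. */
int dictCountAll(const char *prim, const char *secn)
{
    dictEntry *p = dictFind(prim), *e;
    int n = 0;
    if (!p) return -1;
    for (e = p->listHead; e; e = e->listNext)
        if (strcmp(e->secn, secn) == 0) n++;
    return n;
}
```

With the illustrative names QUAD and BDES, a call to dictAddName("QUAD", "BDES") followed by dictAddUnit("QUAD", 101, "BDES") makes dictCheck("QUAD", 101, "BDES") return DICT_OK, while a mistyped secondary such as "BDEX" returns DICT_BAD_SECN rather than good status. The per-primary list replaces a linear scan over all units when handling ALL*.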
(4) SLC IOC "IPL" - Sequence of events when triggered by the SCP (note that the prototype doesn't yet do all this):
    (1) SCP IPL sets the IOC online in the micr_active_mask, sends the "BOOT" message, and waits for the reply. Any SCP can now attempt to send messages to the IOC even though it may not be ready to accept messages, in which case the SCP will time out waiting for a reply.
    (2) IOC msg tasks receive the BOOT message and reply.
    (3) SCP IPL gets the reply or times out waiting for it. For now the SCP will ignore absence of reply. Once the reply is implemented, if the reply is absent, step (4) may be skipped and the SCP action finishes.
    (4) SCP IPL waits for the IOC to copy CSTR:VTIM into CSTR:MTIM with a timeout of >=90 seconds, while DBEX is responding to the IOC's dbdownload request.
    (5) IOC msgHdlr tells the slcExec to restart, which then stops and restarts ALL tasks in the proper order.
    (6) IOC slcExec sends a message to the IOC dbSend to download, which forwards it to DBEX.
    (7) DBEX sets the IOC online in the micr_active_mask if it is not already set.
    (8) DBEX and the IOC dbSend and dbRecv work to get the database downloaded.
    (9) While the database is being downloaded, the IOC msg tasks will process any MSG messages (i.e., from PARANOIA, which polls "online" IOCs to find those that are no longer responsive). The msg tasks do not need to access the database. Any non-MSG messages received during this time will be thrown away and a message logged (with func code) for each discarded message. We will want to review this action later - we may want to allow some messages to queue up for a short time before discarding.
    (10) Any IOC task that needs access to the SLC database will need to wait for the database to be ready.
    (11) If IOC dbRecv finishes the database download successfully, it then sends a message to dbHdlr to convert ST0 data to new format.
    dbHdlr reads additional database information from the PRIMARY.MAP file on the NFS server and creates a dictionary with ST0 and the PRIMARY.MAP information.
    (12) If IOC dbRecv or dbHdlr has any error, a specific message is logged and no further action is taken. The SCP IPL will time out. Another IPL is required.
    (13) IOC dbHdlr then allows any task waiting on the database to proceed.
    (14) IOC dbHdlr copies CSTR VTIM to CSTR MTIM and sends a message to dbCtrl to send CSTR MTIM to DBEX as a ST3 update.
    (15) DBEX updates CSTR MTIM.
    (16) SCP IPL sees the update and the action finishes.
(5) SLC IOC restart when not triggered by the SCP - only steps (5) to (15) apply.
(6) DBEX Up and Down Messages - the IOC will store a flag that DBEX is up or down in some global area. It will also store the database major/minor version that comes with the DBEX up message. We may use the flag to decide whether or not to attempt sending database updates. We may use the DBEX-up message to trigger an update of all data that has changed since the last successful update.
(7) SLC Task Restart - Resource Allocation and Freeing: Diane has studied this and written a discussion of issues and recommendations. We'll get this posted on the web and discuss it in the next meeting. Note that most EPICS utilities call cantproceed (which will log a message and suspend the task) when there is a fatal error allocating a resource. This will force an IOC reboot, which is a reasonable action for this kind of error. Note that the error that is logged is from errlogPrintf (so it won't have all the tags that come with iocCmlogLogMsg). Diane and Debbie both plan to use cantproceed in their own code to suspend the task on a fatal error.
(8) EPICS vs SLC Typedefs - Diane has noticed a lot of similarity between EPICS and SLC typedefs, in particular for data types (for example, SLC int2u is EPICS epicsUInt16). We've agreed that Diane and Debbie should use EPICS typedefs whenever possible.
This may mean fewer include files to ftp from the VMS machines.
(9) IPLing from EPICS and IOC Health - Diane has a subroutine record ready to test. It allows restart or stop using CA (in case the user has no SCP or cannot access iocsh). Diane is working on the EDM display. Note that this display is the beginning of the SLC IOC "health" display, which will eventually show things like CPU, file descriptor, and memory usage from the non-SLC side of the IOC, and crate status if available. It can also show data from CSTR and have a button to reboot the whole IOC. We can also add records to display diagnostics from the SLC tasks (i.e., any information that may be stored in global structures, etc).
(10) CMLOG - James still hasn't been able to start iocCmlog. RonM needs to check that the VMS errlog report will work properly for SLC IOCs.
(11) Magnet Job Functional Requirements - Dayle is working on understanding what happens for each function code. She is shooting to be done by Dec 1.
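For reference when we review item (4), a minimal sketch of the message handling in steps (9), (10) and (13): MSG messages are always processed, and other messages are discarded and logged with their func code until the database is ready. The names here (msgFilter, dbSetReady, msgDiscardCount) are hypothetical, and the plain global flag is an assumption for illustration. In the IOC the flag would need proper task synchronization, and the log call would go through the real logging facility rather than stderr.

```c
#include <stdio.h>

typedef enum { MSG_PROCESS, MSG_DISCARD } msgAction;

static int dbReady = 0;             /* hypothetical: set once dbHdlr finishes */
static unsigned discardCount = 0;   /* number of messages thrown away */

/* Called by dbHdlr in step (13); cleared again when a restart begins. */
void dbSetReady(int ready)
{
    dbReady = ready ? 1 : 0;
}

/*
 * Decide what to do with an incoming message.
 * isMsgType is nonzero for MSG messages (e.g. PARANOIA polling), which
 * never need the database. Everything else needs the database.
 */
msgAction msgFilter(int isMsgType, int funcCode)
{
    if (isMsgType || dbReady)
        return MSG_PROCESS;
    /* Step (9): discard and log one message per discarded message. */
    fprintf(stderr, "database not ready, discarding message (func code 0x%x)\n",
            (unsigned)funcCode);
    discardCount++;
    return MSG_DISCARD;
}

unsigned msgDiscardCount(void)
{
    return discardCount;
}
```

If we later decide to queue some messages for a short time instead of discarding them, only the discard branch of msgFilter would need to change.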