Minutes from the 02/09/05 LCLS SLC IOC meetings (morning and afternoon):

(1) New Meeting Room - Next week we'll meet Wed, 9:30, in B280C, Rm 180, the EFD conference room.

(2) cstrAsync and async utilities: Review tentatively scheduled for Feb 17, 1:30 in the MCC conference room; RonM to follow up with Ginger. We discussed more requirements and implementation issues. It was decided that the DB handler (Debbie) will create and initialize the async job function tables using an async utility init routine (RonM). All async threads that need to access the function tables will have to wait (via an event flag or semaphore; see the epicsEvent sketch below) until the table initialization is finished before proceeding. One functional requirement is that the cstrAsync thread periodically checks for inactive threads and updates the appropriate mask in the SLC database so that the SCP async display (and maybe the SIP alarm handler?) shows the problem. The idea is to have cstrAsync tag threads that have exited on their own initiative due to a fatal error, such as a non-existent secondary. We discussed whether cstrAsync should do more than just monitor the situation. It could also:
(a) Send a stop request to the SLC exec to bring down the whole SLC interface.
(b) Or perhaps the task that exits on a fatal error should send a stop request right away.
(c) But it may be better to have users notice the problem and attempt a restart manually (and if the dead thread is not critical, they can time the restart to be less invasive). BTW, this is the way EPICS works - when a task suspends or exits, it doesn't crash the IOC. Instead, people start noticing (sometimes subtle) problems and notify controls. This doesn't happen very often.
(d) We agreed that an automatic SLC restart should not be attempted, since the fatal error would probably just happen again.
(e) Should cstrAsync at least log a message for the dead threads periodically so that users are reminded of the problem?
Currently, the function cycling times are read once from the SLC database at initialization (they are ST 1 and can only be changed with dbedit). However, should we allow them to be dbedit'ed from iocsh? If yes, cstrAsync should periodically read them from the database and stuff the values into the async function tables. The same comment applies to the SLC database update metering parameters. Each job has a number of async functions, and we currently plan to have one async thread per job that performs all the functions at the right times. Debbie asks why we don't have one thread per function. One answer is that we think the order in which the functions are performed may be important; another is the extra overhead needed for the additional threads. Also, each function in a particular job uses similar dblists and has similar initialization that would need to be duplicated.

(3) Database Service: Debbie is finished with the db utilities (except dbupdate and the hi/lo logic). She is working on the iocsh routines and has all the easy ones finished (slcdbGetMeta, slcdbExists, slcdbVersion), along with slcdbDumpHash, which she uses for debugging the db download. She is currently working on the first iocsh routine that uses a dblist, slcdbDumpUnits, which is letting her debug her db utilities. Then she'll do slcdbDump followed by slcdbEdit. We've asked her to change dblput to add a third argument for job ID, to change dbupdate to take a job ID instead of a name, and to add a new dbupdate utility that takes a job name as input, for use by iocsh. The idea is that threads know which job ID they are and can pass it to the utilities, instead of making each utility internally look up the job ID and add unnecessary overhead (prototypes are sketched below).
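Regarding the table-init handshake in item (2): a minimal sketch, assuming we use the EPICS libCom epicsEvent API for the event flag. Only the epicsEvent calls are real EPICS API; every other name here (asyncTablesReady, asyncUtilInit, etc.) is invented for illustration, and a counting semaphore would work equally well.

    /* Hedged sketch: async threads block until the DB handler has
     * filled in the async job function tables. */
    #include <epicsEvent.h>

    static epicsEventId asyncTablesReady;

    /* Async utility init routine (RonM): called once at startup,
     * before any async thread is spawned. */
    void asyncUtilInit(void)
    {
        asyncTablesReady = epicsEventCreate(epicsEventEmpty);
    }

    /* DB handler (Debbie): called after the function tables are
     * created and initialized. */
    void asyncTablesInitDone(void)
    {
        epicsEventSignal(asyncTablesReady);
    }

    /* Entry point of each per-job async thread. */
    void asyncJobThread(void *arg)
    {
        /* epicsEventSignal releases only one waiter, so each thread
         * re-signals to pass the event on to the next waiter. */
        epicsEventMustWait(asyncTablesReady);
        epicsEventSignal(asyncTablesReady);
        /* ... safe to use the job's function table from here on ... */
    }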
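And for the dblput/dbupdate change just described in item (3), a header-style sketch of the intent; the type and the exact argument lists are assumptions, not the agreed interface.

    /* Hedged sketch of the direction, not the real signatures. */
    typedef int slcJobId;

    /* dblput gains a third argument for the caller's job ID. */
    int dblput(void *dblist, void *entry, slcJobId job);

    /* dbupdate now takes a job ID instead of a job name... */
    int dbupdate(slcJobId job);

    /* ...and a thin name-taking wrapper (for iocsh) does the lookup
     * once, so the per-call utilities never have to. */
    extern slcJobId slcJobIdFromName(const char *jobName); /* hypothetical */

    int dbupdateByName(const char *jobName)
    {
        slcJobId job = slcJobIdFromName(jobName);
        return (job < 0) ? -1 : dbupdate(job);
    }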
Tasks left for Debbie, in this order:
* finish iocsh routines to test db utilities
* dbupdate, hi/lo logic, and the dbCtl thread
* cleanup
* add any missing diagnostics

(4) Message Service and General Utilities: We will remove the file name and lineno arguments from slcCmlogLogMsg so that Debbie and Diane don't have to change their existing code to add null arguments for them; nobody currently plans to use these arguments. Diane points out that most messages are logged inside utilities, and the filename/lineno of a utility is not helpful (the caller's filename/lineno would be helpful, but we don't want to change all the utilities to pass the caller's filename/lineno as arguments). If need be, we'll add slcCmlogLogMsgFile that includes these arguments (a before/after sketch follows this item's task list). Diane has our shared development environment up to date, and it's built for solaris, linux, RTEMS, and vxWorks. People will be copying it to their own areas, changing and testing, then copying the results back to the shared area with email to the other active developers; we're not quite ready to CVS import yet. Note that in the linux build, a collision was found between a macro defined in a VMS include file (msgdef.hc) and one in a linux sys socket include file. To get around these collisions, we are planning to FTP only a very few include files from VMS and just copy the few defines we need into our own include files. We think the potential for mismatch between IOC and VMS is slim, since most of what we are copying hasn't changed in years.
Tasks left for Diane, in this order:
* implement code for the CTL socket needed for Debbie's db update logic
* finish data conversion utilities
* handle the SLC_NOTIFY function code (for SCP restart)
* test large buffer transfer, maybe using TEST_ECHO_MWORD?
* retest proxy up/down
* cleanup
* add any missing diagnostics
* correct thread priorities
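Going back to the slcCmlogLogMsg change at the top of item (4), a hedged before/after sketch; everything other than the filename/lineno pair is invented for illustration.

    /* Old form (being removed) - callers had to supply file/lineno,
     * which in practice were always null:
     *
     *   int slcCmlogLogMsg(int severity, const char *fileName,
     *                      int lineNo, const char *fmt, ...);
     */

    /* New common form - no file/lineno arguments. */
    int slcCmlogLogMsg(int severity, const char *fmt, ...);

    /* Variant to add only if someone eventually needs the location;
     * called as slcCmlogLogMsgFile(sev, __FILE__, __LINE__, "...", ...) */
    int slcCmlogLogMsgFile(int severity, const char *fileName, int lineNo,
                           const char *fmt, ...);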
(5) iocCmlog and CMLOG: RonM is building CMLOG for solaris with GCC 3.1.1 in order to let us build soft SLC IOCs with iocCmlog on solaris; people are using the solaris gnu debugger and don't want to switch midstream to linux or native solaris. iocCmlog is almost done except for testing on RTEMS, and Steph will CVS import it on Friday. Steph will send instructions to people on how to use it instead of the stubbed iocCmlog. Steph will also create one Makefile that will build one SLC IOC library and an example IOC, so we don't need to maintain one Makefile per slc directory.

(6) Magnet Job: Kristi is looking at the code and at the SCP for requirements. She is evaluating:
(a) How do we map each magnet database secondary to the EPICS database? What is needed, and what can be ignored? How do bitmasks from SLC map to EPICS? In one bitmask, some bits are read, some bits are write, and some bits have different meanings depending on the device being controlled.
(b) How do we keep the two sides synchronized? For example, when a setpoint is changed by VMS, EPICS and SLC will not synchronize until a separate message to trim the magnet is sent; until then, the two sides differ - what might the consequences be? When a setpoint is changed in EPICS, when does the SLC side get updated, and how soon after the change is it done? Does Kristi need to post monitors on setpoints and update them on change (a hedged CA sketch appears at the end of these minutes), or can the setpoint update wait for the next async cycle?
(c) On the SCP side, virtual camac (writing from VMS to the hardware directly, without sending a message to the magnet job) is used for things like turning the magnet on and off. We have no choice but to find these instances and add CA calls if the "micro" is SLC-aware. One concern is finding all the areas in the SCP where virtual camac is done - or, more broadly, all the areas of the SCP that control devices in ways that may differ from what is expected and will break with an SLC-aware IOC.

(7) Development: For the record, current tests are being done on these platforms:
XL01 - slcs6 (solaris 8)
XL02 - slcsun1 (solaris 8)
XL03 - noric06 (linux RHEL 3)
No RTEMS testing yet.
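Returning to Kristi's question in item (6b): if we do post monitors on the EPICS setpoints, it would look roughly like this Channel Access client sketch. The PV name is a placeholder, and the update-the-SLC-side step is left as a comment, since that mapping is exactly what is being evaluated.

    /* Hedged sketch: subscribe to a magnet setpoint so the SLC side can
     * be updated on change instead of waiting for the next async cycle. */
    #include <stdio.h>
    #include <cadef.h>

    static void setpointChanged(struct event_handler_args args)
    {
        if (args.status != ECA_NORMAL)
            return;
        printf("setpoint changed to %g\n", *(const double *)args.dbr);
        /* ... update the SLC database / notify the micro here ... */
    }

    int monitorSetpoint(void)
    {
        chid ch;
        SEVCHK(ca_context_create(ca_enable_preemptive_callback), NULL);
        /* "MAG:XX01:BDES" is a placeholder PV name. */
        SEVCHK(ca_create_channel("MAG:XX01:BDES", NULL, NULL,
                                 CA_PRIORITY_DEFAULT, &ch), NULL);
        SEVCHK(ca_pend_io(5.0), "connect timeout");
        SEVCHK(ca_create_subscription(DBR_DOUBLE, 1, ch, DBE_VALUE,
                                      setpointChanged, NULL, NULL),
               "subscribe failed");
        SEVCHK(ca_flush_io(), NULL);
        return 0;
    }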