Meeting Minutes.
Epics Meeting 04/24/02

Subject: CMLOG Throttling Phase II
Throttling topics related to the alarm handler (ALH), channel watcher (CW), and cmlog.

Attendees: SAA, LUCHINI, JSILVA, NCS, KKU, LAZMO, JROCK, RONM
Interested Parties: CLARK, RONC

Stephanie's email, which pointed out several important issues, is included at the bottom of this page, after the meeting minutes. This page serves as a semi-formal requirements document for this phase of the CMLOG throttling project.

-------------------- MESSAGE DEFINITIONS: --------------------

First, some cmlog throttling message definitions:

a) Initialization Message:
-------------------------
Throttle initialization is when messages like the ones below get logged.
No messages have been throttled yet; they are logged when the user issues the
setThrottle() call (OR DOES IT HAPPEN WHEN THE USER ISSUES THE THROTTLEON() CALL?)

18-APR-2002 10:39:04  SLCSUN1  Starting throttle -- host(slcsun1) user(jrock)
                               total_dropped(0)
18-APR-2002 10:39:04  SLCSUN1    tag(device) limit(2) delta(10.0s)
18-APR-2002 10:39:04  SLCSUN1    substr("li31:sbst:1:devt_phsb")


b) Throttle In Progress Message: 
-------------------------------
This message occurs periodically after messages have begun to be throttled.

c) Throttle Summary Message:
---------------------------
I mentioned that I thought there was another type of message issued.  I found it.
It's the message that comes out when the user issues a throttleShow() call.
It has a format like this:
"throttle summary -- %d total%s"
  "throttle"
  "throttle"
  ...
Does this summary message go to the error log? I think so.
Note that neither ALH nor CW will ever ask for a summary message.

d) Remove Throttle Message:
--------------------------
When the user issues a call to throttleRemove (n), this message gets logged:
"Removing throttle --" 
Note that neither ALH nor CW will ever remove throttles.

e) Re-starting throttle Message:
-------------------------------
When the user does a setThrottle() call for a throttle that already exists,
this message is logged:
"Re-starting throttle -- "
Note that neither ALH nor CW will ever restart throttles.

------------------ MEETING DECISIONS ------------------

1. INITIALIZATION MESSAGES
--------------------------
We no longer want initialization messages.  When the user does a setThrottle()
call, no throttle messages will be logged until throttling begins.
This is different from just eliminating the current initialization message
because currently, the first message after initialization is a throttle-in-
progress message, which occurs after the first set of messages is throttled.

2. THROTTLE BEGINS AND ENDS MESSAGES
------------------------------------
Throttle-begins and throttle-ends messages will be added.
The throttle-begins message occurs when cmlog detects a noisy message
based on the throttling parameters set by the application.  From then on,
cmlog counts the dropped messages and provides throttle-in-progress messages.
The throttle-ends message occurs when a message is no longer noisy and 
throttle-in-progress messages are no longer provided.
These messages need to show the substring followed by something like 
" throttling begins" or " throttling ends".

3. THROTTLE IN-PROGRESS MESSAGES
--------------------------------
The throttle in-progress message should show the substring followed
by something like " dropped x msgs in the last y seconds".  x is the 
number of times the message has happened since the throttle-begins or 
last in-progress message and y is the throttling time.
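
A minimal sketch of how the texts in decisions 2 and 3 might be built.  The
helper names are hypothetical and the wording only follows the "something like"
texts above; none of this is existing cmlog code.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helpers; names and exact wording are assumptions that follow
// the "something like" texts in the minutes.
std::string throttleBeginsText(const std::string& substr) {
    return substr + " throttling begins";
}

std::string throttleEndsText(const std::string& substr) {
    return substr + " throttling ends";
}

// dropped:      messages seen since throttle-begins or the last in-progress
//               message (the "x" in the minutes).
// deltaSeconds: the throttling time for this throttle (the "y").
std::string throttleInProgressText(const std::string& substr,
                                   int dropped, double deltaSeconds) {
    assert(dropped >= 0);
    std::ostringstream out;
    out << substr << " dropped " << dropped << " msgs in the last "
        << deltaSeconds << " seconds";
    return out.str();
}
```

For example, throttleInProgressText("li31:sbst:1:devt_phsb changed", 3, 10.0)
gives "li31:sbst:1:devt_phsb changed dropped 3 msgs in the last 10 seconds".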

4. WE NEED TO ADD SOME KIND OF TIMER DRIVEN CODE TO PUT OUT IN-PROGRESS
AND THROTTLE-ENDS MESSAGES ON TIME
------------------------------------------------------------------------
If this is too much trouble, then just add a NO_LOG (dummy) call to the API.
Then, the application is responsible for calling this API call periodically
to get in-progress or end throttling messages put out near when the events 
occur.  This call would cause the cmlog code to go through its table of 
throttles and output any appropriate messages.
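
The table walk described above might look roughly like this.  It is only a
sketch: the Throttle struct and flushThrottles() are hypothetical stand-ins
for the cmlog table and the proposed NO_LOG call, and time is passed in as
plain seconds so that either a timer or the application can drive it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory throttle entry; not the real cmlog table layout.
struct Throttle {
    std::string substr;   // substring the throttle matches on
    double delta;         // throttling time in seconds
    bool active;          // true between throttle-begins and throttle-ends
    double windowStart;   // start time of the current throttling period
    int dropped;          // messages dropped since begins or last in-progress
};

// Stand-in for the proposed NO_LOG (dummy) call: walk the table and return
// any in-progress or throttle-ends texts that are due at time 'now'.
std::vector<std::string> flushThrottles(std::vector<Throttle>& table, double now) {
    std::vector<std::string> out;
    for (Throttle& t : table) {
        assert(t.delta > 0.0);
        // Skip throttles that are idle or whose period has not elapsed yet.
        if (!t.active || now - t.windowStart < t.delta) continue;
        if (t.dropped > 0) {
            // Still noisy: report the count and start a new period.
            out.push_back(t.substr + " dropped " + std::to_string(t.dropped) +
                          " msgs in the last " +
                          std::to_string(static_cast<long>(t.delta)) + " seconds");
            t.dropped = 0;
            t.windowStart = now;
        } else {
            // A full quiet period has passed: throttling ends.
            out.push_back(t.substr + " throttling ends");
            t.active = false;
        }
    }
    return out;
}
```

With this shape, a timer inside cmlog and an application calling the dummy API
periodically would both end up in the same routine.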

5. ALL MESSAGES NEED TO HAVE THE SAME FIELDS
--------------------------------------------
CMLOG throttling must use the same values as the message being logged
for the host, facility (or sys), device, message, domain, and error tags.  
The status, severity, and value must all be set to "THROTTLED" since the 
real values are throttled away.  
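
As a sketch of this rule, assuming a hypothetical LogRecord struct whose
fields mirror the tags named above (this is not the real cmlog data
structure, and the separate 'text' field is an assumption):

```cpp
#include <cassert>
#include <string>

// Hypothetical record; field names mirror the tags listed in the minutes.
struct LogRecord {
    std::string host, facility, device, message, domain, error;
    std::string status, severity, value;
    std::string text;  // assumption: where the throttle text itself goes
};

// Build a throttle record from the message being throttled.
LogRecord makeThrottleRecord(const LogRecord& original, const std::string& text) {
    assert(!text.empty());
    LogRecord r = original;   // same host, facility, device, message, domain, error
    r.status = "THROTTLED";   // the real values are throttled away
    r.severity = "THROTTLED";
    r.value = "THROTTLED";
    r.text = text;
    return r;
}
```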

6. GARBAGE COLLECTION AND DEFAULT THROTTLING IN CMLOG WERE REJECTED
-------------------------------------------------------------------
Garbage collection involves a user settable state where cmlog would
automatically remove a PV from throttling if the PV has not been noisy
for some amount of time.  Default throttling means that there would be
some fallback if none is specified by the user.
Both ideas were rejected as too complex to define and implement.

However, both ALH and CW will be changed to allow the database designer
to specify PVs that take "default" throttling, if the PV will never 
be noisy, or "noisy" throttling, if the PV is expected to be noisy
at times.  The per-PV specification will be done in the ALH and CW config
files, meaning that a change requires a process restart.  PVs will default
to "default" throttling.  The throttling parameters for both "default" and 
"noisy" throttling will be specified as command line arguments with default 
values being something like 5 count/10 seconds for "default" and 
2 count/10 seconds for "noisy".
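
One way ALH or CW might hold these settings is sketched below.  All names
are hypothetical; the 5/10 and 2/10 values are the "something like" defaults
above, and in practice they would be overridden from the command line while
the per-PV classes would be read from the config file.

```cpp
#include <cassert>
#include <map>
#include <string>

struct ThrottleParams {
    int limit;            // messages allowed per period
    double deltaSeconds;  // length of the period
};

enum class ThrottleClass { Default, Noisy };

// Hypothetical holder for the two parameter sets and the per-PV choice.
struct ThrottleConfig {
    ThrottleParams defaultParams{5, 10.0};  // command line may override
    ThrottleParams noisyParams{2, 10.0};    // command line may override
    std::map<std::string, ThrottleClass> perPv;  // filled from the config file

    // PVs not listed in the config file take "default" throttling.
    ThrottleParams paramsFor(const std::string& pv) const {
        assert(!pv.empty());
        auto it = perPv.find(pv);
        bool noisy = (it != perPv.end() && it->second == ThrottleClass::Noisy);
        return noisy ? noisyParams : defaultParams;
    }
};
```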

7. THERE ARE EXPECTED TO BE 1,000 TO 10,000 THROTTLES ISSUED BY THE 
APPLICATIONS
------------------------------------------------------------------

8. AN EFFORT WILL BE MADE TO IMPROVE THE SEARCH ALGORITHM THE CMLOG
USES FOR THROTTLING.
-------------------------------------------------------------------

9. MAINTAIN TWO LISTS IN CMLOG?
-------------------------------
The throttling design should consider keeping throttles in 2 lists.
The first list could keep throttles that are not in-progress (have
either not started or have already ended).  The second list could keep the
throttles that are in-progress which will usually be the smaller
of the two lists.  The second list should always be searched first
since a noisy message will be the next likely message to be logged.
Also, the second list is the only list to be processed when it's
time for an in-progress or throttle-ends message.

Throttles are moved from the first to the second list when throttling
begins and are moved from the second back to the first when throttling
ends.
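
A rough sketch of the two-list idea follows.  The class and member names are
hypothetical, plain substring matching is assumed, and std::list is used only
because splice() moves an entry between lists without copying it; the real
design may well pick a different container.

```cpp
#include <cassert>
#include <list>
#include <string>

struct Throttle {
    std::string substr;  // substring the throttle matches on
    int dropped;         // messages dropped in the current period
};

class ThrottleTable {
public:
    // New throttles start out on the first (not in-progress) list.
    void add(const std::string& substr) { idle_.push_back(Throttle{substr, 0}); }

    // Search the in-progress list first, then the idle list.
    Throttle* find(const std::string& text) {
        for (Throttle& t : active_)
            if (text.find(t.substr) != std::string::npos) return &t;
        for (Throttle& t : idle_)
            if (text.find(t.substr) != std::string::npos) return &t;
        return nullptr;
    }

    void begin(const std::string& substr) { transfer(idle_, active_, substr); }
    void end(const std::string& substr) { transfer(active_, idle_, substr); }

    std::size_t activeCount() const { return active_.size(); }
    std::size_t idleCount() const { return idle_.size(); }

private:
    // Move the entry with the given substring from one list to the other.
    static void transfer(std::list<Throttle>& from, std::list<Throttle>& to,
                         const std::string& substr) {
        for (auto it = from.begin(); it != from.end(); ++it) {
            if (it->substr == substr) {
                to.splice(to.end(), from, it);
                return;
            }
        }
    }

    std::list<Throttle> idle_;    // first list: not in-progress
    std::list<Throttle> active_;  // second list: in-progress
};
```

Only the in-progress list would need to be walked when it is time for an
in-progress or throttle-ends message.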

10. VXWORKS BUILD
----------------- 
Ronm will not do any more vxworks builds with Mike's 2 current fixes
for MAX_THROTTLE crashes.  This will be left for the downtime.  The
fixes will not yet be sent to Jie for release.  However, RonM may
consider updating his web pages with the patch description.

11. SUBSTRING ANCHORING
-----------------------
A blank will be included at the end of the PV name to make it unique.
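
The effect of the trailing blank can be seen in a small sketch.  It assumes
plain substring matching against the message text; the function names are
hypothetical, and the PV names are the ones from issue (6) of Stephanie's email.

```cpp
#include <cassert>
#include <string>

// Plain substring match, standing in for how a throttle's substring is
// compared against the message text (an assumption).
bool matches(const std::string& messageText, const std::string& substr) {
    return messageText.find(substr) != std::string::npos;
}

// Append the blank that anchors the end of the PV name.
std::string anchoredSubstr(const std::string& pvName) {
    assert(!pvName.empty());
    return pvName + " ";
}
```

Without the blank, a throttle on "LI31:XCOR:41:B" also matches
"LI31:XCOR:41:BDES changed to 0, MAJOR".  With the blank it no longer does,
while "LI31:XCOR:41:B changed to 0, MAJOR" still matches.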

12. MAX_THROTTLES
-----------------
The default value will be set larger.

13. LAYER ON TOP OF GENERAL CMLOG THROTTLING SPECIFICALLY 
FOR PV THROTTLING?
-----------------------------------------------------------
The designer should consider adding a layer in between applications
like ALH and CW (and possibly used by the archiver later) and general 
cmlog throttling to be used specifically for PV (or device tag) throttling.

------------------------ OTHER REQUIREMENTS ------------------------

These issues came up after the meeting:

1) Add a complete description of how throttling works to the throttling
document including when messages are issued and under what conditions.  This 
should include initialization, in-progress, and summary messages, etc.  Maybe 
Mike could just copy what I wrote above about the messages into the document 
(after correcting any errors I made).

2) Add a complete description of all new functionality to the programmer's 
document.  This should include programming examples.

3) After the new code is developed and tested, we will deliver it to Jie 
for inclusion into the official cmlog release.

4) The code needs to be portable to all cmlog supported platforms (as all 
the rest of cmlog is).  We will test it on as many platforms as possible.

5) SUBSTRING
------------
ALH and CW should make the substring be the PV name, followed by a blank, 
followed by "changed", meaning throttling would be on the "text" tag 
instead of the "device" tag.  Then the throttling-begins, in-progress,
and end messages will have the word "changed" in the text and will be
more readable.

6) vxWorks shortcuts.  (added 8/14/02, Ronm)
In the version of throttling that Lazmo wrote, there is a set of shortcuts
provided for vxWorks.  Those shortcuts should still work in this new 
version of cmlog throttling.  Here's an example of the shortcut:
cml_string("text",100,10,"CAU-E-EPICS_ERROR");

7) Wildcarding of the 'val' field will be supported in the 
cmlogFilterSLAC::setThrottle() call.

If 'val' is set to the wildcard character "", and *ctag is set to some tag
name, then, all messages that contain the '*ctag' will be treated the same
no matter what the value is for that tag.
In effect, any message that comes through with the given *ctag will be 
throttled if more than 'limit' of them have come through in 'deltaTime'.

To refresh your memory, this is the syntax of the C++ setThrottle() call:
  int cmlogFilterSLAC::setThrottle(char     *ctag,
                                   int       limit,
                                   double    deltaTime,
                                   CTYPE     val);
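
The wildcard rule described above could be sketched as follows.  This is not
the cmlogFilterSLAC implementation: TagThrottle, Message, and shouldThrottle()
are hypothetical, the wildcard is modelled as a simple flag, and a fixed
counting period is assumed because the minutes do not say how the period
is measured.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical throttle entry; the real cmlogFilterSLAC internals may differ.
struct TagThrottle {
    std::string tag;      // plays the role of *ctag
    bool anyValue;        // true when 'val' was given as the wildcard
    std::string val;      // substring to match when anyValue is false
    int limit;            // messages allowed per period
    double deltaTime;     // period length in seconds
    double windowStart;   // start of the current counting period
    int count;            // matching messages seen in this period
};

typedef std::map<std::string, std::string> Message;  // tag name -> value

// Returns true when the message should be throttled (dropped).
bool shouldThrottle(TagThrottle& t, const Message& msg, double now) {
    assert(t.limit >= 0 && t.deltaTime > 0.0);
    Message::const_iterator it = msg.find(t.tag);
    if (it == msg.end()) return false;  // message does not carry the tag
    if (!t.anyValue && it->second.find(t.val) == std::string::npos) return false;
    if (now - t.windowStart >= t.deltaTime) {  // start a new counting period
        t.windowStart = now;
        t.count = 0;
    }
    ++t.count;
    return t.count > t.limit;
}
```

With anyValue set, messages for different devices all count against the same
limit, which is the "treated the same no matter what the value is" behavior.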
    

------------------------- STEPHANIE'S ORIGINAL EMAIL -------------------------

From saa@SLAC.Stanford.EDU Thu Apr 25 11:57:26 2002
Date: Tue, 23 Apr 2002 17:52:05 -0700
From: "Allison, Stephanie" 
To: "Rock, Judith E.",
     "MacKenzie, Ronald R.",
     "Laznovsky, Michael",
     "Spencer, Nancy",
     "Zelazny, Michael S.",
     "Chestnut, Ronald P.",
     "Luchini, Kristi",
     "Silva, James"
Subject: Issues with Throttling of ALH and CW PV Change Messages

Hello,

Discussion with Judy and MikeZ has spurred a couple of issues with
using CMLOG throttling for throttling PV change messages.  We can
discuss these tomorrow plus add other issues to the list:

(1) Look and Feel of the Throttled PV Message
----------------------------------------------------------------------------------------------

Currently, when throttling begins, these messages come out:

18-APR-2002 10:39:04  MCCDEV  li31:sbst:1:devt_phsb changed to 0, MAJOR
18-APR-2002 10:39:04  SLCSUN1  Starting throttle -- host(slcsun1) user(jrock) total_dropped(0)
18-APR-2002 10:39:04  SLCSUN1    tag(device) limit(2) delta(10.0s)
18-APR-2002 10:39:04  SLCSUN1    substr("li31:sbst:1:devt_phsb")

The throttled message should:
  (a) use the same host as the original message 
      (in this case, MCCDEV instead of SLCSUN1)
  (b) use the same facility or sys as the original message 
      (in this case, "Alarm" instead of "cmlogClient")
  (c) use the same application (not VMS) error code as the original message
      (in this case, 1 instead of 0)
  (d) set status/severity to something like "THROTTLED"/"UNKNOWN" instead of "N/A"/"0".
  (e) take only one line instead of three
  (f) The throttling-in-process message should look something like this:

18-APR-2002 10:40:04  MCCDEV  li31:sbst:1:devt_phsb has changed x times in the last y seconds

      where x and y come from the throttling counter/timer.

(2) Throttling-In-Process Messages
----------------------------------------------------------------------------------------------

The throttling-in-process message should come out periodically and should be
spaced by the throttling time.  For instance, if # of messages is set to 2 and the 
throttle time is 10 seconds and a PV changes 5 times in 5 seconds, quiet for 7 seconds, 
changes 5 times in 5 seconds, quiet for over 10 seconds, and does the same thing an
hour later, messages like these should come out:

18-APR-2002 10:39:04  MCCDEV  li31:sbst:1:devt_phsb changed to 0, MAJOR
18-APR-2002 10:39:05  MCCDEV  li31:sbst:1:devt_phsb changed to 1, NO_ALARM
18-APR-2002 10:39:15  MCCDEV  li31:sbst:1:devt_phsb has changed 3 times in the last 10 seconds
18-APR-2002 10:39:25  MCCDEV  li31:sbst:1:devt_phsb has changed 5 times in the last 10 seconds
18-APR-2002 11:39:04  MCCDEV  li31:sbst:1:devt_phsb changed to 0, MAJOR
18-APR-2002 11:39:05  MCCDEV  li31:sbst:1:devt_phsb changed to 1, NO_ALARM
18-APR-2002 11:39:15  MCCDEV  li31:sbst:1:devt_phsb has changed 3 times in the last 10 seconds
18-APR-2002 11:39:25  MCCDEV  li31:sbst:1:devt_phsb has changed 5 times in the last 10 seconds

(3) Start-Throttle/End-Throttle Messages
----------------------------------------------------------------------------------------------

Do we really need beginning and ending throttling messages for PV changes?
How do they help the user?  How do they help us?
If we do decide to keep them, they should look something like this:

18-APR-2002 10:39:05  MCCDEV  li31:sbst:1:devt_phsb has changed, start throttle
and
18-APR-2002 10:39:35  MCCDEV  li31:sbst:1:devt_phsb has not changed, end throttle

(4) Setting Throttling Parameters
----------------------------------------------------------------------------------------------

Currently, the design for CW requires that we change code and rebuild if
we want to change either the throttling time or # of messages before throttling
begins.  The ALH design allows us to set these at the time the process is started
(command line arguments).  CW should allow us to change them on the command line.

(5) Throttling per PV
----------------------------------------------------------------------------------------------

For both ALH and CW, the design is that we either throttle all PVs or none 
based on a command line option.  And the throttling parameters are also set 
once for all PVs.  We should have the ability to either exempt some PVs from 
throttling or to set the throttling parameters higher for a small subset 
of PVs so that we don't lose real alarms.  I think it'd be easier to provide
the ability to exempt PVs from throttling by adding an option to the config
file.  For ALH, we could add another option to the PV "alarm channel mask".
For CW, we could add /NOTHROTTLE to the PV in the channel list file.

(6) Using PV Name as SubString in CMLOG Throttling
----------------------------------------------------------------------------------------------

MikeZ discovered that using a PV name as a substring may have troubles.
If there is a PV called "LI31:XCOR:41:B" and it gets throttled, when
"LI31:XCOR:41:BDES" comes around and needs throttling, it'll get confused
with the first substring.  To make a PV name unique, a blank space must be 
appended to the end of the PV name when setting the substring for CMLOG.  
Luckily for ALH and CW, there is always a blank space between the PV name 
and "changed".  This is klugey and may break if we decide to change our 
message texts or use PV throttling in a new package.

(7) PV Throttling vs General CMLOG Throttling
----------------------------------------------------------------------------------------------

Is PV throttling so unique that it should be handled separately 
from general CMLOG throttling?   Should we do a special version of
CMLOG "device" throttling?

What have I missed?  See you tomorrow,

Steph
AUTHOR: Ron MacKenzie 4/24/02
REVISED Stephanie Allison 4/24/02