Proxy Server Recovery

This document describes how to recover the proxy server system from problems that prevent communication between the SLC-aware IOCs and the VMS SLC control system. These problems produce a flood of proxy-server-related error messages in the error logs. They also cause bad status to be displayed for micros on both the SLC-Aware IOC EDM display, which may be accessed from the lclshome display, and the VMS SCP Network Micro Index panel. Another symptom is a high percentage (e.g., 90%) of time spent in the iowait state for both CPUs on the proxy server machine, PX00.

The problems may be triggered by the reboot of a SLC-aware IOC. They are usually caused by rebooting more than one SLC-aware IOC within a short time period. They are common on ROD days, when SLC-aware IOCs are booted. They may also occur when power is cycled (e.g., when there is a power failure).

Proxy Server Recovery Approach

These problems cause the proxy server software running on PX00 to enter a state in which communication between one or more SLC-aware IOCs and the VMS SLC control system is lost. Once communication is lost with one SLC-aware IOC, the problem often spreads to a loss of communication with others.

The recovery approach involves first resetting all of the SLC micro names associated with the SLC-aware IOCs listed on the SLC-Aware IOC EDM display, using a button macro on the VMS SLC control system. This will stop the proxy server network traffic. Once the error log no longer shows messages indicating problems with the proxy server, the SLC-aware IOCs listed on the SLC-Aware IOC EDM display may be rebooted individually with a sufficient delay (e.g., one minute) between reboots. This should give the proxy server software an opportunity to restore network traffic without reentering a state in which communication begins to be lost.
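
The staggered reboot described above can be illustrated with a minimal Python sketch. It is not the implementation behind any button on the EDM display. The function name `reboot_staggered`, the `reboot` callback, and the injectable `sleep` parameter are all hypothetical names introduced for illustration; only the one-minute delay comes from the text above.

```python
import time
from typing import Callable, Iterable, List


def reboot_staggered(
    ioc_names: Iterable[str],
    reboot: Callable[[str], None],
    delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Reboot IOCs one at a time, pausing delay_s seconds between reboots.

    `reboot` is a hypothetical callback that reboots a single IOC by name.
    `sleep` is injectable so the pacing can be checked without waiting.
    Returns the names in the order they were rebooted.
    """
    names = list(ioc_names)
    rebooted: List[str] = []
    for index, name in enumerate(names):
        reboot(name)
        rebooted.append(name)
        # Pause between reboots, but not after the last one.
        if index < len(names) - 1:
            sleep(delay_s)
    return rebooted
```

Passing a recording function as `sleep` makes it possible to confirm the pacing in a dry run before anything is actually rebooted.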

It is important to note that the PX00 machine should NOT be rebooted in an effort to help the recovery. Rebooting it will likely cause the proxy server software running on this machine to soon enter a state in which all communication between SLC-aware IOCs and the VMS SLC control system is lost.


  1. Invoke a button macro from the SCP to reset all SLC-aware IOCs. This action will stop the SLC-aware IOCs from continuing to try to communicate with the proxy server system. First, bring up a SCP and go to the IOC & VXI Reboot Panel:

    NOTE: It is very important that this SCP panel be brought up using this path from the LCLS INDEX (and NOT the PEP-II INDEX).

    Then select the "RESET AWARE IOCs" button. Finally, answer "y" to the "Are you sure?" prompt.

  2. Wait until the button macro activity stops and the "IOC & VXI Reboot Panel" is redisplayed.
  3. From the lclshome EDM display, bring up the SLC-aware IOC display:
  4. Select the "Reboot All" button to reboot all of the SLC-Aware IOCs and indicate on the pop-up dialog box that you really want to perform this action.
  5. While the "Reboot All" procedure is executing, the "IOC Status" column for each SLC-aware IOC row may be monitored to see the status changing for each IOC as it reboots. The "IOC Status" color box for the IOC currently being rebooted will change from green, to yellow, to red, and back to green.
  6. After 20 minutes have passed since the selection of the "Reboot All" button, check the SLC-Aware IOC display to verify that the status of all IOCs is good. The "IOC Status" color box for each IOC should be green.
  7. Also check the message log to verify that proxy server error messages no longer appear. To bring up the LCLS CMLOG Viewer to monitor messages:

    Proxy server error messages using the LCLS CMLOG Viewer have the following form:

    It is preferable to use the "errdsp" command from a VMS MCC machine account, since messages often scroll by quickly in the LCLS CMLOG Viewer. Messages using the "errdsp" command have the following form:

    Check whether any messages of the form described above contain the string "TAG(not a tcp port) = 6090". If there are no such messages, the status of the proxy server should be good. If there are any messages with this string, follow the "Fix Proxy Status for a SLC-Aware IOC" procedure shown below.
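
The check in step 7 can be sketched in Python, assuming the messages have been saved or captured as plain text lines. This is an assumption: neither "errdsp" nor the CMLOG Viewer is known here to export text in this way. The function names are hypothetical, and the only detail taken from this procedure is the "TAG(not a tcp port) = 6090" string itself.

```python
from typing import Iterable, List

# The marker string named in step 7 of the procedure.
TAG_6090 = "TAG(not a tcp port) = 6090"


def find_6090_messages(lines: Iterable[str]) -> List[str]:
    """Return every line that contains the 6090 marker string."""
    return [line.rstrip("\n") for line in lines if TAG_6090 in line]


def proxy_status_looks_good(lines: Iterable[str]) -> bool:
    """True when no captured line contains the 6090 marker string."""
    return not find_6090_messages(lines)
```

A plain substring match is used because the full message layout is not reproduced here. Any line carrying the marker is treated as a hit, wherever it appears in the message.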

Fix Proxy Status for a SLC-Aware IOC

This procedure should be used when there are error messages that have the string "TAG(not a tcp port) = 6090" after completing the procedure described above.

  1. The "IP = xxx" part of these messages indicates the hex code value for a SLC-aware IOC. For example, if this part of the message is "IP = B15", the hex code value is B15, and the SLC-aware IOC associated with this code may be found on the SLC-Aware IOC display by finding a match in the "ID" column (IOC name IOC-IN20-BP02 and SLC micro name IB20, in this case).
  2. After identifying the SLC micro name associated with the hex code value in the "6090" messages, reset the SLC micro from a SCP. For instance, if the SLC micro name is IB20, do the following actions on a SCP to reset this micro: and enter the SLC micro name IB20 followed by selecting "OK". Then select the "Reset Micro" button.
  3. Toggle the status for the micro to "OFF". First, go to the "LCLS Cluster Status" SCP panel: Then find the "xxx BPMP STATUS" button for the micro and toggle the status for this micro from "ON" to "OFF" (for example, toggle "IB20 BPMP STATUS ON" to "IB20 BPMP STATUS OFF").
  4. Reboot the SLC-aware IOC from the SLC-Aware IOC display. To reboot a SLC-aware IOC from this display, select the "SLC-Aware..." button corresponding to the SLC-aware IOC in the "Diag Status" column and then select "Reboot..." on the resulting popup dialog.
  5. Toggle the status for the micro to "ON" from the "LCLS Cluster Status" SCP panel in the same manner as it was toggled to "OFF" previously.
  6. Verify that the status of this SLC-aware IOC is now good. First, verify that no more "6090" error messages are being generated using the "errdsp" VMS utility or the CMLOG Viewer. Next go to the SCP "LCLS Cluster Status" panel: Select the "DISPLY MICRO STATUS" button. Then select the "Last Page" button and, if necessary, the "Prev Page" button until you find the line for the SLC-aware IOC (e.g., IB20). The status of the SLC-aware IOC should be good: the color should be green (rather than blue) and the "stat" value should be equal to 1.

    For more detailed status information, select the "Auto Check Status" button from this panel.

    If the status indication is not good, please contact the SLC-aware IOC engineer (currently all of the SLC-aware IOCs are BPM IOCs, and the responsible engineer for these IOCs is Sonya Hoobler), or Bob Hall and Debbie Rogind, for expert-level recovery.
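
The hex-code lookup in step 1 of this procedure can be sketched in Python as follows. The table holds only the single example given above (B15, IOC-IN20-BP02, IB20). The "ID" column on the SLC-Aware IOC display remains the real source of the mapping. The names `IOC_TABLE` and `lookup_ioc` are hypothetical, and the regular expression assumes the field appears literally as "IP = " followed by hex digits.

```python
import re
from typing import Dict, Optional, Tuple

# Matches the "IP = xxx" field and captures the hex code value.
IP_FIELD = re.compile(r"IP = ([0-9A-Fa-f]+)")

# hex code -> (IOC name, SLC micro name); only the documented example.
IOC_TABLE: Dict[str, Tuple[str, str]] = {
    "B15": ("IOC-IN20-BP02", "IB20"),
}


def lookup_ioc(message: str) -> Optional[Tuple[str, str]]:
    """Return (IOC name, SLC micro name) for the hex code in a message.

    Returns None when the message has no "IP = xxx" field or when the
    code is not in the table.
    """
    match = IP_FIELD.search(message)
    if match is None:
        return None
    return IOC_TABLE.get(match.group(1).upper())
```

Returning None for an unknown code, rather than guessing, keeps the operator pointed back at the display's "ID" column whenever the table is incomplete.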


  1. Proxy Server Expert Recovery :

Author:  Bob Hall 29-Jan-2009