Proxy Server Expert Recovery
This explanation of proxy server recovery covers the same recovery from proxy
server problems as the "Proxy Server Recovery" document, but it is oriented
towards experts rather than those who just need a simple recovery procedure.
This document has been prepared from information provided by Nancy Spencer
and Kristi Luchini in email messages.
This document provides information for recovering the proxy server system
from problems that prevent communication between the SLC-aware IOCs and
the VMS SLC control system. These problems result in a flood of proxy
server related error messages in error logs. They also result in bad
status displayed for micros both on the SLC-Aware IOC EDM display, which may
be accessed from the lclshome display, and on the VMS SCP Network Micro Index
panel. Another symptom of these problems is a high percentage (e.g., 90%)
of time spent in the iowait state for both CPUs on the proxy server machine.
The problems may be triggered by the reboot of a SLC-aware IOC. They are
usually caused by rebooting more than one SLC-aware IOC in a short time
period. They are common on ROD days when SLC-aware IOCs are booted. They
also may occur when power is cycled (e.g., when there is a power failure).
Proxy Server Recovery Approach
These problems cause the proxy server software running on PX00 to enter
a state where communication between one or more SLC-aware IOCs and the VMS
SLC control system is lost. Once communication is lost with one SLC-aware
IOC, the problem often spreads to a loss of communication with others.
The recovery approach involves first using a button macro on the VMS SLC
control system to reset all of the SLC micros associated with the SLC-aware
IOCs listed on the SLC-aware IOCs EDM display. This will stop the
proxy server network traffic. After there are no longer error log messages
indicating problems with the proxy server, the SLC-aware IOCs listed on
the SLC-aware IOCs EDM display may be rebooted individually with a sufficient
delay (e.g., one minute) between reboots. This should allow the proxy server
software an opportunity to restore network traffic without reentering a state
where communication begins to be lost.
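The pacing rule above (one reboot at a time, with a delay between them) can
be sketched in Python. This is only an illustration of the timing logic. The
"reboot" callable is hypothetical: this document describes reboots performed
from the EDM display and does not describe any scripted reboot interface.
The 60 second default mirrors the "one minute" example above and is not a
verified minimum.

```python
import time
from typing import Callable, Iterable, List


def staggered_reboot(
    ioc_names: Iterable[str],
    reboot: Callable[[str], None],
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Reboot each IOC in turn, pausing between consecutive reboots.

    `reboot` is a caller-supplied (hypothetical) action for one IOC.
    `sleep` is injectable so the pacing can be checked without waiting.
    """
    done: List[str] = []
    for index, name in enumerate(ioc_names):
        if index > 0:
            # Wait before every reboot except the first one.
            sleep(delay_seconds)
        reboot(name)
        done.append(name)
    return done
```

In practice the operator would also watch the error log between reboots and
stop if proxy server error messages reappear, which this sketch does not do.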
It is important to note that the PX00 machine should NOT be rebooted in an
effort to help in the recovery. Rebooting it will likely cause the proxy server
software running on this machine to soon enter a state where all communication
between SLC-aware IOCs and the VMS SLC control system becomes lost.
Individual SLC-aware IOCs that are associated with proxy server error
messages may be reset from a SCP. One may see these error messages using the
CMLOG Viewer, which may be accessed from lclshome.
Alternatively, one may see these error messages by logging into a VMS
account on the MCC machine and entering the "errdsp" command (this is
usually the preferred method of seeing the error messages since it is
easier to spot messages using "errdsp" than using the CMLOG Viewer).
An example of a message using "errdsp" that indicates a failure to
connect to a SLC-aware IOC is:
    28-JAN-2009 11:35:58 %MSG-I-PX_DIAG_INFO, PX00 port register, IP = B15, TAG(not a tcp port)=6090
Once an error message such as the one above has been identified, the next step
is to determine the name of the SLC-aware IOC associated with the IP hex
abbreviation. In the message above, this IP hex abbreviation is "B15".
The "dump_gs" command on the MCC machine may be used to translate between
IP hex abbreviations and SLC-aware IOC names. For example, the command
"dump_gs b15" produces the following output:
    We are searching micorname global section for string: B15..
    Id Name SSA Type
    65 IB20 165 IPDB
    XXIPXXXX nd= ioc-in20-bp02
For each SLC-aware IOC associated with proxy server error messages, reset its
SLC micro. This is done using the SCP on VMS. For example, in the output
above the name of the SLC-aware IOC associated with the B15 message code is
IOC-IN20-BP02 and its SLC micro name is IB20. To reset this SLC micro, do the
following actions on a
and enter the SLC micro name IB20 followed by selecting "OK". Then select the "Reset Micro" button.
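The lookup described above (error message, then IP hex abbreviation, then SLC
micro name and IOC name) can be sketched in Python. This is a minimal sketch
that assumes the "errdsp" message and the "dump_gs" output always look like
the single samples shown in this document. The function names are
illustrative and are not part of any existing tool.

```python
import re
from typing import Dict, Optional


def ip_hex_from_message(message: str) -> Optional[str]:
    """Return the IP hex abbreviation from an 'IP = xxx' field, if present."""
    match = re.search(r"\bIP\s*=\s*([0-9A-Fa-f]+)", message)
    return match.group(1).upper() if match else None


def parse_dump_gs(output: str) -> Dict[str, str]:
    """Extract the SLC micro name and IOC name from dump_gs output.

    Assumes the layout of the sample: an 'Id Name SSA Type' header row
    followed by one data row, and a later line containing 'nd= <ioc name>'.
    """
    result: Dict[str, str] = {}
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    for i, line in enumerate(lines):
        if line.split() == ["Id", "Name", "SSA", "Type"] and i + 1 < len(lines):
            fields = lines[i + 1].split()
            if len(fields) >= 2:
                result["micro"] = fields[1]  # second column is the micro name
        node = re.search(r"\bnd=\s*(\S+)", line)
        if node:
            result["ioc"] = node.group(1).upper()
    return result
```

With the sample message and sample output from this document, the first
function yields the abbreviation B15 and the second yields the micro name
IB20 and the IOC name IOC-IN20-BP02.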
Perform an "auto check status" on the SCP:
LCLS Index Panel -> Network Micro Index -> Micros:
followed by selecting the appropriate micro button (e.g., IB20) and selecting the
"Auto Check Status" button. The date/time lines for the last time run on the display should
be very close to the current date/time (ignore the "BPMOTIME" and "BPMOCHCK" lines that
appear in white). If they are very close to the current date/time, the next step of
rebooting the SLC-aware IOC may be skipped.
Reboot the SLC-aware IOC. First, from the lclshome EDM display bring up the SLC-aware
LCLS Index Panel -> Network Micro Index -> Micro Diagnostic -> LCLS Cluster Status -> Auto Check Dsplay
To reboot the SLC-aware IOC, select the "SLC-Aware..." button corresponding to the
SLC-aware IOC in the "Diag Status" column and then select "Reboot..." on the resulting
Monitor the error messages using either the CMLOG Viewer or the VMS errdsp utility
to verify that the proxy server error message seen previously no longer appears.
Repeat the "auto check status" on the SCP as described in step 4 above to verify
that the SLC-aware IOC was rebooted successfully.
lclshome -> Network (global) -> SLC-aware IOCs
One method of checking the status of each SLC-aware IOC listed on the SLC-aware IOC
EDM display is to check the status for each on the SCP Network Micro Index panel:
On this panel first select the "Disply Last Page" button and then the "Disply Prev
Page" button. The first SLC-aware IOC name listed on this display is IA20.
To verify that a SLC-aware IOC status is good, check to make sure the HSTA and IPL
time column values are good and are green.
LCLS Index Panel -> Network Micro Index ->
When there are proxy server problems, issuing a "top" command on the PX00 proxy
server machine can indicate approximately 90% IO wait.
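As an alternative to reading the figure off "top", the iowait percentage can
be computed from two samples of /proc/stat. This is a hedged sketch that
assumes PX00 runs Linux, which this document does not state explicitly. On
Linux the fifth value on each "cpu" line of /proc/stat is the cumulative
iowait time. The function names are illustrative.

```python
from typing import List


def cpu_times(stat_text: str, cpu: str = "cpu") -> List[int]:
    """Return the cumulative time counters for one cpu line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cpu:
            return [int(value) for value in fields[1:]]
    raise ValueError("no line for %r in stat text" % cpu)


def iowait_percent(before: List[int], after: List[int]) -> float:
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    if len(deltas) < 5:
        raise ValueError("iowait counter not present")
    # Fields: user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest time is already included in user time, so it is left out.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total
```

A typical use would read /proc/stat, wait a few seconds, read it again, and
pass both samples to iowait_percent. Passing "cpu0" or "cpu1" to cpu_times
gives the per-CPU figure instead of the aggregate.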
The SLC-aware IOC EDM display has a Help button that provides helpful information
regarding proxy server recovery.
The following URL provides SLC-aware IOC documentation:
SLC-aware IOC documentation
Author: Bob Hall 03-Feb-2009