SLAC CPE Software Engineering Group
Stanford Linear Accelerator Center
System Admin

MCC Time Change

Semiannual Time Change
Procedures

( $255$DUA7:[DOC.SYSTEM]TIME_CHANGE.TXT )

Ken Fairfield
03-Nov-1995
Revised 13-Apr-2001, Ed Miller
Revised 29-Oct-2001, Ken Brobeck
Revised 25-Oct-2002, Ken Brobeck
Revised 22-OCT-2003, Ken Brobeck
Revised 30-MAR-2004, Ken Brobeck
Revised 24-OCT-2005, Bob Hall
Revised 26-OCT-2007, Bob Hall
Revised 28-OCT-2008, Bob Hall
Revised 27-OCT-2009, Bob Hall
Revised 02-Nov-2009, Ken Brobeck

All nodes in the cluster are running NTP (Network Time Protocol)
to synchronize their clocks. [Note also that there is still a
procedure, run once a week, that uses SYSMAN to synchronize the
system clocks, although this is unnecessary with NTP running.
See MON_SETTIME.COM in the MGR_CLEANUP: directory on node MCC.]

Our NTP configuration uses an SCS node, NS3, as our primary
time server. (It in turn uses a pair of "standard" primary
servers "on the net", one at NASA and the other at
APPLE.COM, plus a 3rd source as a backup, a node in SCS which has
its own WWV receiver.) Within the MCC VMScluster, we fall back to
MCCDEV's local clock if NS3 is unreachable, e.g., should the
network connection between MCC and SCS fail.

We are now using lcls-prod01 as our primary NTP server for our
LCLS environment with MCC and MCCDEV as backup.

Nevertheless, Multinet must also be reconfigured on each node
in the cluster to reset the Multinet "timezone" parameter to the
new, correct value, and the MULTINET_TIMEZONE logical name must be
redefined. Apparently the Multinet logical name does not get
redefined if NTP is not allowed to change the time by itself. In
addition, any node that reboots reads the timezone value stored in
the Multinet configuration files, so the configuration files must
be set correctly for PST vs. PDT.

With the upgrade to VMS 7.1 in January of 1999, we must also
execute SYS$MANAGER:UTC$TIME_SETUP.COM for UTC timezone support.
UTC$TIME_SETUP (1) writes the new Timezone Differential Factor to
SYS$SYSTEM:SYS$TIMEZONE.DAT, (2) redefines SYS$TIMEZONE_DIFFERENTIAL
with the new value, and (3) updates a cell in VMS memory,
EXE$GQ_TDF. Since a node booting into the cluster retrieves the
cluster system time _and_ TDF via the connection manager, all nodes
in the cluster must agree upon the system time and TDF (stored as
noted on each system in the EXE$GQ_TDF memory cell) and therefore,
all nodes must run the above procedure so they agree on the TDF.
Unfortunately, _every_ node writes a new SYS$TIMEZONE.DAT file, so
we need to PURGE them after the procedure has been run. [Note: UTC
management is slated to change in VMS 7.3. These procedures should
be revisited when VMS is upgraded.]

Note that the TDF for PST is -08:00 hours, or -480 minutes, or
-28800 seconds. The TDF for PDT is -07:00 hours, or -420 minutes, or
-25200 seconds.
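
The three forms of each TDF above are the same offset expressed in
different units. As a quick cross-check, here is a minimal Python
sketch that derives the minute and second values from the hour
values. It is not part of any SLAC procedure; the names TDF,
tdf_minutes and tdf_seconds are illustrative assumptions.

```python
from datetime import timedelta

# TDF values from the text, expressed as offsets from UTC.
# The dictionary and function names are hypothetical, for illustration only.
TDF = {
    "PST": timedelta(hours=-8),
    "PDT": timedelta(hours=-7),
}


def tdf_seconds(zone):
    """Return the TDF for the given zone in whole seconds."""
    return int(TDF[zone].total_seconds())


def tdf_minutes(zone):
    """Return the TDF for the given zone in whole minutes."""
    return tdf_seconds(zone) // 60


if __name__ == "__main__":
    for zone in ("PST", "PDT"):
        print(zone, tdf_minutes(zone), "minutes,", tdf_seconds(zone), "seconds")
```

Comparing the printed values against what UTC$TIME_SETUP reports
after a change is one way to confirm that the intended zone was
selected.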

There are at this writing two Alpha systems in the cluster which
run DECnet-Plus rather than DECnet IV: MCCA2 and SLCW53.
DECnet-Plus includes support for DTSS, an alternative to NTP, which
on system startup defines several SYS$TIMEZONE* logicals not found
on nodes running DECnet IV. DTSS is more difficult (impossible?) to
control in terms of defeating the automatic time change, and in some
circumstances, interferes with the NTP daemon's ability to change
the system clock. Therefore we stop the DTSS "clerk" process
shortly after system startup. As a result, we must be sure to
redefine the logicals SYS$TIMEZONE_DAYLIGHT_SAVING and
SYS$TIMEZONE_NAME (below) which DTSS would otherwise change.

The several configuration changes and logical name redefinitions
listed above that need to be made at each time change have been
collected into a command procedure,

CLU$MANAGER:CHANGE_TIME_TO.COM

As of this writing, that command file does the following:

1) Executes UTC$TIME_SETUP.COM
2) Runs MULTINET CONFIGURE to change the timezone parameter
3) Redefines PMDF_TIMEZONE, MULTINET_TIMEZONE, NEWS_TIMEZONE
and NEWS_GMT_OFFSET
4) Redefines SYS$TIMEZONE_DAYLIGHT_SAVING and SYS$TIMEZONE_NAME
on DECnet-Plus nodes
5) Reinitializes DTSS so that the DECnet-Plus nodes will
_NOT_ jump forward/backward on the next reboot

As noted below, there are still a few files that must be edited by
hand after this procedure has been executed. Also as noted below,
this procedure must be executed on _all_ nodes in the cluster.

In the following it is assumed that node MCC is running the
PRODUCTION Control System, node MCCDEV is running the DEVELOPMENT
Control System, and node MCC is the only cluster node (as noted
above) which accesses NS3 directly and subsequently acts as the
NTP time server for the rest of the cluster. Should different nodes
be performing these functions in the future, the following should be
modified to reflect the appropriate node names.



------------------------------------------------------------------------------
Fall Time change, PDT -> PST (first Sunday in November)
------------------------------------------------------------------------------

----------------------------------
BEFORE official Time Change
----------------------------------

1) Shut down NTP on all VMS servers prior to time change

MCC::System> MULTINET NETCONTROL NTP SHUTDOWN

2) Remove the logical: "SYS$TIMEZONE_RULE"

deassign/system/exec "SYS$TIMEZONE_RULE"

3) On all nodes in the cluster, ensure that the NTP parameter
WAYTOOBIG has been set to 1800 seconds:

MCC::System> MCR SYSMAN
SYSMAN> SET envi/clus
SYSMAN> DO MULTINET NETCONTROL NTP WAYTOOBIG 1800

-----
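
The reason for the 1800-second limit can be illustrated with a
small Python sketch. It assumes, as this procedure implies but does
not state outright, that WAYTOOBIG is the largest clock offset the
NTP daemon will correct on its own. The function name and the exact
comparison are illustrative assumptions, not Multinet's actual
code. Under that assumption a one-hour (3600-second) offset exceeds
the limit, so NTP leaves the clock for us to change by hand, while
ordinary drift of a few seconds is still corrected.

```python
# Hypothetical model of the WAYTOOBIG check; not Multinet's implementation.
WAYTOOBIG_SECONDS = 1800  # value set in step 3 above


def ntp_would_correct(offset_seconds, waytoobig=WAYTOOBIG_SECONDS):
    """Return True if an offset of this size is within the assumed limit."""
    return abs(offset_seconds) <= waytoobig


if __name__ == "__main__":
    for offset in (2, -45, 3600, -3600):
        print(offset, "seconds:", ntp_would_correct(offset))
```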

When it is convenient for the Control Room, some time following
the official Sunday time change, and you are ready to change the
system time, continue with the following steps.

Preparation:
Have 5 sessions ready:
MCC::SLCSHR for warmslcx operations
MCCDEV::SLCSHR (same)

MCC SCP for micro time change (and EPICS displays)
MCCDEV SCP for micro time change

SYSTEM priv account for cluster-wide SYSMAN operations

*** Make sure control room (x2150) is ready for a one-hour history
*** buffer outage. NOTE: don't try to do this in early AM when
*** daily history buffer update jobs are in progress.

*** Make sure control room knows that MPS_CTRL will go away
*** for a minute or so (This will not affect beam).

*** Also should mention SLCCAS will be going away temporarily
*** (probably not a problem) and that BIVSC may report anomalous
*** (low?) poll rates for a few minutes after time change.

4) When ready, from SLCSHR or another suitable account, stop the
HSTBSLC (history buffers), SLCCAS (Channel Access Server) and
MPS_CTRL processes on MCCDEV:

********************************************************************
Note: One should broadcast an obnoxious message so people have at
least a CHANCE to know that history buffers will not be working
for that hour.
********************************************************************

MCCDEV::SLCSHR> warmslcx hstbslc,slccas,mps_ctrl /kill

5) Repeat step (4) on MCC (the Production system). However, note
that one additional process must be killed on MCC, HSTBEPICS
(Epics history buffers). Execute the following command on MCC:

MCC::SLCSHR> warmslcx hstbslc,hstbepics,slccas,mps_ctrl /kill

6) Use SYSMAN to reset the system time cluster-wide. It is
simplest to explicitly specify a time delta of 1 hour. [If for
some reason this command is not successful cluster-wide, follow
with a cluster wide setting of absolute time, not delta time.]

MCC::System> mcr sysman
SYSMAN> set envi/clus
SYSMAN> conf set time "-1:"
SYSMAN>

7) While still running SYSMAN, execute CLU$MANAGER:CHANGE_TIME_TO:

SYSMAN> DO @clu$manager:change_time_to pst

- Or run this individually on each server

***********************************************************************
NOTE: On MCCA2 we had two DCPS symbionts get hung up and use
40% to 50% of the cpu time. (Compute Bound)
***********************************************************************

8) Make the micros happy. On an SCP:

push the:
"NETWRK Micro Index" button for Network Micro Index panel
push the:
"MICRO DIAG- NOSTIC" button for SLCNET Micro Diagnostics panel
push the:
"Micro Time Change" button (right hand side, 3rd from bottom)
a) "Who & Why:"
Brobeck: Time Change
b) "Micro name (or ALL*):"
ALL* - case sensitive

*** The Micro Time Change must be done both on MCC
*** and on MCCDEV. It requires SLC_MICRO_BOOT privilege.

9) Restart MPS_CTRL and SLCCAS on MCC (and then on MCCDEV):
Optional: restart BIVSC to prevent (or reduce) its reporting
of anomalous poll rates for a few minutes after time change:

MCC::SLCSHR> warmslcx mps_ctrl,slccas
MCC::SLCSHR> warmslcx bivsc/restart [OPTIONAL]
and
MCCDEV::SLCSHR> warmslcx mps_ctrl,slccas
MCCDEV::SLCSHR> warmslcx bivsc/restart [OPTIONAL]

*** NOTE: clients of SLCCAS are supposed to automatically reconnect
*** when it restarts. However, on at least one occasion, BaBar's
*** experience was that they had to reboot SENDBIP IOC afterwards.

10) Restart NTP server on all VMS servers

MCC::System> MULTINET NETCONTROL NTP START


-------------------------------
WAIT AN HOUR
-------------------------------

11) Wait an hour before restarting the history buffers. That was the
point of turning them off! When the allotted time has passed,
start both the "standard" history buffer process and the EPICS
history buffer process.

MCC::SLCSHR> warmslcx hstbslc,hstbepics
and
MCCDEV::SLCSHR> warmslcx hstbslc

12) When any of the DECnet-Plus nodes are subsequently rebooted,
verify that their SYS$TIMEZONE* logicals are properly defined,
and that their clocks have not been shifted an additional hour.
[This clock shift occurred in 4/2001, but we believed we had
fixed this problem by specifying ENABLE DTSS SET CLOCK FALSE
(rather than TRUE) in SYS$MANAGER:NET$DTSS_CLERK_STARTUP.NCL.
But the same problem recurred in 10/2001. So we've now ALSO added
a brute force fix in SYSTARTUP_VMS_1.COM: if the clock has been
shifted by +/- 1 hour, then shift it back.]
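
Steps 6 and 7 above go together: the clock moves back one hour and
the TDF moves from the PDT value to the PST value, so the UTC time
seen by the cluster should be unchanged. The Python sketch below
works through that arithmetic for one arbitrary example reading.
The function names and the sample timestamp are assumptions made
for illustration; nothing here is run as part of the procedure.

```python
from datetime import datetime, timedelta

TDF_PDT = timedelta(hours=-7)  # -07:00, from the TDF note earlier
TDF_PST = timedelta(hours=-8)  # -08:00


def to_utc(local_time, tdf):
    """UTC is the local time minus the (negative) differential factor."""
    return local_time - tdf


def fall_back(local_time):
    """Apply the "-1:" delta from step 6 to a local clock reading."""
    return local_time - timedelta(hours=1)


# Arbitrary example reading taken while the cluster is still on PDT.
local_pdt = datetime(2009, 11, 2, 10, 30, 0)
local_pst = fall_back(local_pdt)

utc_before = to_utc(local_pdt, TDF_PDT)
utc_after = to_utc(local_pst, TDF_PST)

if __name__ == "__main__":
    print("local before:", local_pdt, " UTC:", utc_before)
    print("local after: ", local_pst, " UTC:", utc_after)
```

If only one of the two changes were made, the two UTC values would
differ by an hour, which is the kind of disagreement between nodes
that the procedure is meant to avoid.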



---------------------------------------------------------
Spring Time change, PST -> PDT (second Sunday in March)
---------------------------------------------------------

1) Shut down NTP on all VMS servers prior to time change

MCC::System> MULTINET NETCONTROL NTP SHUTDOWN

2) Remove the logical: "SYS$TIMEZONE_RULE" on all VMS servers

deassign/system/exec "SYS$TIMEZONE_RULE"

3) On all nodes in the cluster, ensure that the NTP parameter
WAYTOOBIG has been set to 1800 seconds:

MCC::System> MCR SYSMAN
SYSMAN> SET envi/clus
SYSMAN> DO MULTINET NETCONTROL NTP WAYTOOBIG 1800

When it is convenient for the Control Room, some time following
the official Sunday time change, and you are ready to change the
system time, continue with the following steps.

Preparation:
Have 5 sessions ready:
MCC::SLCSHR for warmslcx operations
MCCDEV::SLCSHR (same)

MCC SCP for micro time change
MCCDEV SCP for micro time change

SYSTEM priv account for cluster-wide SYSMAN operations

Make sure control room knows that MPS_CTRL will go away
for a minute or so (This will not affect beam).

Also should mention SLCCAS will be going away temporarily
(probably not a problem) and that BIVSC may report anomalous
(low?) poll rates for a few minutes after time change.

4) Stop the SLCCAS and MPS_CTRL processes on MCCDEV:

MCCDEV::SLCSHR> warmslcx slccas,mps_ctrl /kill

5) Stop the SLCCAS and MPS_CTRL processes on MCC:

MCC::SLCSHR> warmslcx slccas,mps_ctrl /kill

6) Use SYSMAN to reset the system time cluster-wide. It is
simplest to explicitly specify a time delta of 1 hour. [If for
some reason this command is not successful cluster-wide, follow
with a cluster wide setting of absolute time, not delta time.]

MCC::System> mcr sysman
SYSMAN> set envi/clus
SYSMAN> conf set time "+1:"
SYSMAN>

7) While still running SYSMAN, execute CLU$MANAGER:CHANGE_TIME_TO:

SYSMAN> DO @clu$manager:change_time_to pdt

8) Make the micros happy. On an SCP:

push the:
"NETWRK Micro Index" button for Network Micro Index panel
push the:
"MICRO DIAG- NOSTIC" button for SLCNET Micro Diagnostics panel
push the:
"Micro Time Change" button (right hand side, 3rd from bottom)
a) "Who & Why:"
Brobeck: Time Change
b) "Micro name (or ALL*):"
ALL* - case sensitive

*** The Micro Time Change must be done both on MCC
*** and on MCCDEV. It requires SLC_MICRO_BOOT privilege.

9) Restart MPS_CTRL and SLCCAS on MCC (and then on MCCDEV):
Optional: restart BIVSC to prevent (or reduce) its reporting
of anomalous poll rates for a few minutes after time change:

MCC::SLCSHR> warmslcx mps_ctrl,slccas
MCC::SLCSHR> warmslcx bivsc/restart [OPTIONAL]
and
MCCDEV::SLCSHR> warmslcx mps_ctrl,slccas
MCCDEV::SLCSHR> warmslcx bivsc/restart [OPTIONAL]

NOTE: clients of SLCCAS are supposed to automatically reconnect
when it restarts. However, on at least one occasion, BaBar's
experience was that they had to reboot SENDBIP IOC afterwards.

10) Restart NTP servers on all VMS nodes

MCC::System> MULTINET NETCONTROL NTP START

11) When any of the DECnet-Plus nodes are subsequently rebooted,
verify that their SYS$TIMEZONE* logicals are properly defined,
and that their clocks have not been shifted an additional hour.
[This clock shift occurred in 4/2001, but we believed we had
fixed this problem by specifying ENABLE DTSS SET CLOCK FALSE
(rather than TRUE) in SYS$MANAGER:NET$DTSS_CLERK_STARTUP.NCL.
But the same problem recurred in 10/2001. So we've now ALSO added
a brute force fix in SYSTARTUP_VMS_1.COM: if the clock has been
shifted by +/- 1 hour, then shift it back.]
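
The idea behind the brute force fix can be sketched in Python as
follows. This is only an illustration of the logic described above,
not the contents of SYSTARTUP_VMS_1.COM. The function name, the
reference clock argument and the five-minute tolerance are all
assumptions made for the example.

```python
from datetime import datetime, timedelta

ONE_HOUR = timedelta(hours=1)


def undo_hour_shift(local, reference, tolerance=timedelta(minutes=5)):
    """Return a corrected clock reading.

    If `local` differs from `reference` by roughly one hour in either
    direction (within `tolerance`), remove that one-hour shift.
    Otherwise return `local` unchanged.
    """
    offset = local - reference
    for shift in (ONE_HOUR, -ONE_HOUR):
        if abs(offset - shift) <= tolerance:
            return local - shift
    return local


if __name__ == "__main__":
    # Arbitrary reference time standing in for a trusted cluster clock.
    ref = datetime(2001, 10, 29, 12, 0, 0)
    shifted = ref + ONE_HOUR + timedelta(seconds=20)
    print("shifted:  ", shifted)
    print("corrected:", undo_hour_shift(shifted, ref))
```

Only offsets close to a whole hour are touched, so ordinary drift
of a few seconds is left for NTP to handle.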

-----



 


 

Modified: 27-Oct-2014
Created by: Ken Brobeck Oct 27, 2014