Principles of Operation


Introduction

The nanokernel clock discipline algorithm includes two separate but interlocking feedback loops. The PLL/FLL algorithm operates with updates produced by a synchronization daemon such as NTP, while the PPS algorithm operates with an external PPS signal and a modified serial or parallel port driver. Both algorithms include grooming provisions that significantly reduce the impact of disruptions due to clockhopping, outages and network delay jitter. In addition, the PPS algorithm can continue to discipline the system clock frequency even if the synchronization sources or daemon fail.

PLL/FLL Algorithm

The PLL/FLL algorithm is similar to the NTP Version 4 clock discipline algorithm, which is specially tailored for typical Internet delay jitter and computer clock oscillator wander. However, the kernel algorithm provides better accuracy and stability than the NTP algorithm, as well as a wider operating range. Figure 1 shows the functional components of both algorithms.

Figure 1. Clock Discipline Feedback Loop

While the figure shows some components implemented in the kernel, the NTP synchronization daemon includes equivalent components. Both algorithms operate as a hybrid of phase-lock and frequency-lock feedback loops. The phase difference Vd between the reference source θr and system clock θc is determined by the synchronization protocol, in this case NTP. The value is then groomed by the NTP clock filter and related algorithms to produce the phase update Vs argument to the hardupdate() routine in the kernel. This value is processed by the loop filter to produce the phase prediction x and frequency prediction y. These predictions are used by the clock adjust process, which runs at intervals of 1 s, to produce a correction term Vc. This value adjusts the system clock oscillator frequency so that the system clock displays the correct time.

It is important to point out that the various performance data displayed on these pages were derived from the system clock control signal Vc, since this is the best estimator of the time error. However, this estimator does not include the clock reading error, which depends on the resolution and access time of the oscillator counter. While the reading error with a modern architecture including a processor cycle counter is a few nanoseconds, older architectures may have reading errors of 1000 ns or more. In addition, the Vc signal necessarily varies with time, so the value depends on when it is sampled.

Figure 2. PLL/FLL Prediction Functions

The x and y predictions are developed from the phase update Vs shown in Figure 2. As in the NTP algorithm, the phase and frequency are disciplined separately in both PLL and FLL modes. In both modes x is the value Vs, but the actual phase adjustment is calculated by the clock adjust process using an exponential average with an adjustable weight factor. The weight factor is calculated as the reciprocal of the time constant specified by the API. The time constant can range from 1 s to an upper limit determined by the Allan intercept [1], which is set arbitrarily at 1024 s. In PLL mode it is important for the best stability that the update interval does not significantly exceed the time constant for an extended period.
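One plausible reading of the phase adjustment described above can be sketched as follows. This is a minimal illustration, not the kernel code: the function name clock_adjust_phase and the use of floating point are assumptions, and the residual is taken to be the part of Vs not yet applied to the clock.

```python
def clock_adjust_phase(residual_ns, time_constant_s):
    """One clock adjust step, assumed to run once per second.

    Applies a fraction 1/time_constant of the remaining phase error and
    returns (adjustment applied this second, residual still to apply).
    """
    weight = 1.0 / time_constant_s          # reciprocal of the time constant
    adjustment = residual_ns * weight
    return adjustment, residual_ns - adjustment


# Hypothetical example: a 1000 ns phase update with a 4 s time constant.
residual = 1000.0
applied = []
for _ in range(3):
    step, residual = clock_adjust_phase(residual, time_constant_s=4)
    applied.append(step)
```

With these inputs the residual decays geometrically, so a larger time constant spreads the same correction over more seconds.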

The frequency is disciplined quite differently in PLL and FLL modes. In PLL mode, y is computed using an integration process as required by feedback loop engineering principles; however, the integration gain is reduced by the square of the time constant, so becomes essentially ineffective above the Allan intercept. In FLL mode, y is computed directly using an exponential average with weight 0.25. This value, which was determined from simulation with real and synthetic data, is a compromise between rapid frequency adaptation and adequate glitch suppression.
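The two frequency updates might be sketched as below. The exact gain expressions are assumptions made for illustration: the PLL form simply divides an integration term by the square of the time constant, and the FLL form treats the offset divided by the update interval as the residual frequency error. Offsets are in microseconds, intervals in seconds, and frequencies in PPM.

```python
FLL_WEIGHT = 0.25  # averaging weight stated in the text


def pll_frequency(y_ppm, vs_us, interval_s, time_constant_s):
    """Integrate the phase update; gain falls with the time constant squared."""
    return y_ppm + vs_us * interval_s / (time_constant_s ** 2)


def fll_frequency(y_ppm, vs_us, interval_s):
    """Exponential average of the residual frequency error (offset/interval)."""
    residual_ppm = vs_us / interval_s
    return y_ppm + FLL_WEIGHT * residual_ppm


# Hypothetical values: the same 64 us update at two time constants.
short_tc = pll_frequency(0.0, 64.0, 16, 16)
long_tc = pll_frequency(0.0, 64.0, 16, 1024)
fll = fll_frequency(10.0, 2048.0, 1024)
```

The comparison between short_tc and long_tc shows why the PLL integration contributes very little once the time constant approaches the Allan intercept.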

Extensive experience with simulation and practice has developed reliable models for timekeeping in the typical Internet and workstation environment. At relatively small update intervals, white phase noise dominates the error budget and a phase-lock (PLL) algorithm performs best. However, at relatively large update intervals, random-walk frequency noise dominates and a frequency-lock (FLL) algorithm performs best. The optimum crossover point between the PLL and FLL modes, as determined by simulation and analysis, is the Allan intercept. Accordingly, the PLL/FLL algorithm operates in PLL mode at update intervals of 256 s and smaller and in FLL mode for intervals of 1024 s and larger. Between 256 s and 1024 s the mode is specified by the API. This behavior parallels the NTP daemon behavior, except that in the latter the weight given the FLL prediction is linearly interpolated from zero at 256 s to unity at 1024 s.
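The mode selection rule and the daemon's interpolation can be sketched as follows. The function names and the boolean api_selects_fll are hypothetical; only the 256 s and 1024 s thresholds come from the text.

```python
PLL_MAX_S = 256    # PLL mode at or below this update interval
FLL_MIN_S = 1024   # FLL mode at or above this update interval


def kernel_mode(update_interval_s, api_selects_fll):
    """Mode used by the kernel PLL/FLL algorithm."""
    if update_interval_s <= PLL_MAX_S:
        return "PLL"
    if update_interval_s >= FLL_MIN_S:
        return "FLL"
    return "FLL" if api_selects_fll else "PLL"


def daemon_fll_weight(update_interval_s):
    """Weight the NTP daemon gives the FLL prediction (linear interpolation)."""
    if update_interval_s <= PLL_MAX_S:
        return 0.0
    if update_interval_s >= FLL_MIN_S:
        return 1.0
    return (update_interval_s - PLL_MAX_S) / (FLL_MIN_S - PLL_MAX_S)
```

At an update interval of 640 s, halfway between the thresholds, the daemon would weight the FLL prediction at 0.5, while the kernel uses whichever mode the API selects.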

PPS Algorithm

PPS signals produced by an external source can be interfaced to the kernel using a serial or parallel port and a modified port driver. The on-time signal transitions cause a driver interrupt, which captures a timestamp and related data. The driver calls the hardpps() routine, which implements the PPS algorithm. This algorithm is functionally separate from the PLL/FLL algorithm; however, the two algorithms have interlocking control functions designed to provide seamless switching between them in cases when either the synchronization daemon fails to provide phase updates or the PPS signal fails or operates outside nominal tolerances.

Figure 3. PPS Prediction Functions

The PPS algorithm shown in Figure 3 is called at each PPS on-time signal transition. The arguments include a system clock timestamp and a virtual nanosecond counter sample. The virtual counter can be implemented using the processor cycle counter (PCC) in modern computer architectures or the clock counter in older architectures. The intent of the design is to discipline the clock phase using the timestamp and to discipline the clock frequency using the virtual counter. This makes it possible, for example, to stabilize the system clock frequency using a precision PPS source, such as a cesium or rubidium oscillator, while using an external time source, such as a radio or satellite clock or even another time server, to discipline the phase. With frequency reliably disciplined, the interval between updates from the external source can be greatly increased. Also, should the external source fail, the system clock will continue to provide accurate time limited only by the accuracy of the precision source.

Values passed to the hardpps() routine are rigorously groomed to ensure correct frequency, reject glitches and reduce incidental jitter. However, the design tolerates occasional dropouts and noise spikes. A frequency discriminator rejects timestamps more than ±500 PPM from the nominal frequency of 1 Hz. The virtual counter samples are processed by an ambiguity resolver that corrects for counter rollover and anomalies when a tick interrupt occurs in the vicinity of the second rollover or when the PPS interrupt occurs while processing a tick interrupt. The latter appears to be a feature of at least some Unix kernels, which rank the serial port interrupt priority above the timer interrupt priority.
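The frequency discriminator test can be illustrated with a short sketch. It assumes the check is made on the interval between successive PPS timestamps, expressed in nanoseconds; the function name is hypothetical.

```python
NANOSECOND = 1_000_000_000   # nominal PPS interval in ns (1 Hz)
MAX_FREQ_PPM = 500           # tolerance stated in the text


def pps_interval_ok(interval_ns):
    """Accept an interval only if it lies within +/-500 PPM of one second."""
    limit_ns = NANOSECOND * MAX_FREQ_PPM // 1_000_000   # 500,000 ns
    return abs(interval_ns - NANOSECOND) <= limit_ns
```

Under this assumption a missed pulse, which shows up as an interval near 2 s, fails the test in the same way as a grossly off-frequency signal.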

The discriminator samples are processed by a 3-stage shift register operating as a median filter. The median value of these samples is the phase estimate and the maximum difference between them is the jitter estimate. The PPS phase correction is computed as the exponential average of the phase estimate with weight equal to the reciprocal of the frequency calibration interval described below. In addition, a jitter statistic is computed as the exponential average of the jitter estimate with weight 0.25 and reported as the jitter value in the API.
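A minimal sketch of the median filter and the two exponential averages follows. The class and attribute names are hypothetical, floating point stands in for whatever fixed-point arithmetic a kernel would use, and the filter is assumed to produce no output until all three stages are filled.

```python
from collections import deque


class PpsMedianFilter:
    """3-stage shift register used as a median filter (illustrative only)."""

    def __init__(self):
        self.stages = deque(maxlen=3)   # oldest sample drops out automatically
        self.phase_correction = 0.0     # averaged phase estimate, ns
        self.jitter = 0.0               # averaged jitter estimate, ns

    def update(self, sample_ns, calib_interval_s):
        self.stages.append(sample_ns)
        if len(self.stages) < 3:
            return None
        low, median, high = sorted(self.stages)
        jitter_estimate = high - low
        # Phase: weight is the reciprocal of the calibration interval.
        self.phase_correction += (median - self.phase_correction) / calib_interval_s
        # Jitter: fixed weight of 0.25.
        self.jitter += 0.25 * (jitter_estimate - self.jitter)
        return median, jitter_estimate


# Hypothetical samples: one 900 ns outlier between two nearby values.
f = PpsMedianFilter()
first = f.update(100, 4)
second = f.update(900, 4)
third = f.update(120, 4)
```

The outlier never becomes the phase estimate because the median of the three stages is 120 ns, although it does inflate the jitter estimate to 800 ns.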

Typical PPS signal interfacing designs seldom include provisions to suppress large spikes when connecting cables pick up electrical transients due to light switches, air conditioners and water pumps, for example. These turn out to be the principal hazard to PPS synchronization performance. In the PPS algorithm a spike (popcorn) suppressor rejects phase outliers with amplitude greater than 4 times the jitter statistic. This value, as well as the jitter averaging weight, was determined by simulation with real and synthetic PPS signals. Each occurrence of this condition sets a bit in the status word and increments the jitter counter in the API. Surviving phase samples discipline the system clock only if enabled by the API.
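The popcorn suppressor test might look like the sketch below. The state dictionary, the key names, and the bit value assigned to STA_PPSJITTER are assumptions for illustration; only the factor of 4 and the side effects (a status bit and a jitter counter) come from the text.

```python
POPCORN_FACTOR = 4        # threshold multiple stated in the text
STA_PPSJITTER = 0x0200    # assumed status bit value, for illustration only


def accept_phase_sample(phase_ns, jitter_stat_ns, state):
    """Return True if the sample survives; otherwise record the rejection."""
    if abs(phase_ns) > POPCORN_FACTOR * jitter_stat_ns:
        state["status"] |= STA_PPSJITTER
        state["jitcnt"] += 1
        return False
    return True


# Hypothetical run with a jitter statistic of 200 ns.
state = {"status": 0, "jitcnt": 0}
kept = accept_phase_sample(500, 200, state)
spike = accept_phase_sample(-5000, 200, state)
```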

The PPS frequency is computed directly from the virtual counter difference between the beginning and end of the calibration interval, which varies from 4 s to a maximum specified by the API. When the system is first started, the clock oscillator frequency error can be quite large, in some cases 100 PPM or more. In order to avoid ambiguities throughout the performance envelope, the counter differences must not exceed the tick interrupt interval, which can be less than a millisecond for some systems. The choice of a minimum calibration interval of 4 s ensures that the frequency remains valid for frequency errors up to ±250 PPM with a 1-ms tick interval.
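The arithmetic behind the 4 s minimum can be checked with a short sketch: a frequency error of 1 PPM accumulates 1000 ns of drift per second, so 250 PPM over 4 s accumulates exactly one 1-ms tick. The function names are hypothetical.

```python
NS_PER_PPM_SECOND = 1000   # a 1 PPM error accumulates 1000 ns each second


def counter_drift_ns(calib_interval_s, freq_error_ppm):
    """Drift accumulated over the calibration interval, in nanoseconds."""
    return calib_interval_s * freq_error_ppm * NS_PER_PPM_SECOND


def interval_is_unambiguous(calib_interval_s, freq_error_ppm, tick_ns):
    """True if the accumulated drift does not exceed one tick interval."""
    return counter_drift_ns(calib_interval_s, freq_error_ppm) <= tick_ns
```

For example, counter_drift_ns(4, 250) gives 1,000,000 ns, which just fits a 1-ms tick, while an 8 s interval at the same error would not.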

The actual PPS frequency is calculated by dividing the virtual counter difference by the calibration interval length. In order to avoid divide instructions and intricate residuals management, the length is always a power of 2, so division reduces to a shift. However, due to signal dropouts or noise spikes, either the length may not be a power of 2 or the signal may appear outside the valid frequency range. Each occurrence of this condition sets a bit in the status word and increments the error counter in the API.
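The shift-based division and the power-of-2 check can be sketched as below. This is an illustration under assumptions: the counter difference is taken to be the accumulated drift in nanoseconds, the result is in ns/s, and returning None stands in for whatever error handling the kernel performs.

```python
def pps_frequency(counter_diff_ns, interval_s):
    """Divide the counter difference by the interval using a shift.

    Returns None when the interval is not a positive power of 2, which the
    text describes as an error condition.
    """
    if interval_s <= 0 or interval_s & (interval_s - 1):
        return None
    shift = interval_s.bit_length() - 1     # log2 of the interval
    return counter_diff_ns >> shift         # arithmetic shift, sign preserved
```

A dropout that stretches a 64 s interval to, say, 60 s or 65 s fails the power-of-2 test, so the sample is discarded rather than divided incorrectly.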

The required frequency adjustment is computed and clamped not to exceed ±100 PPM. This acts as a damper in case of abrupt changes that can occur at reboot, for example. Each occurrence of this condition sets a bit in the status word and increments the wander counter in the API. The PPS frequency is adjusted accordingly, but controls the system clock only if enabled by the API. In addition, a wander statistic is calculated as the exponential average of frequency adjustments with weight 0.25. The statistic is reported as the wander value in the API, but not otherwise used by the algorithm.
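The clamp and the wander statistic might be sketched as follows. Several details are assumptions: the state dictionary and its keys, the bit value given to STA_PPSWANDER, and the choice to average the magnitude of the clamped adjustment. Only the ±100 PPM limit and the 0.25 weight come from the text.

```python
MAX_ADJ_PPM = 100.0       # clamp stated in the text
WANDER_WEIGHT = 0.25      # averaging weight stated in the text
STA_PPSWANDER = 0x0400    # assumed status bit value, for illustration only


def adjust_pps_frequency(adjustment_ppm, state):
    """Clamp the adjustment, apply it, and update the wander statistic."""
    clamped = max(-MAX_ADJ_PPM, min(MAX_ADJ_PPM, adjustment_ppm))
    if clamped != adjustment_ppm:
        state["status"] |= STA_PPSWANDER
        state["wander_count"] += 1
    state["pps_freq"] += clamped
    state["wander"] += WANDER_WEIGHT * (abs(clamped) - state["wander"])
    return clamped


# Hypothetical reboot transient: a requested 300 PPM step.
state = {"status": 0, "wander_count": 0, "pps_freq": 0.0, "wander": 0.0}
applied = adjust_pps_frequency(300.0, state)
```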

Operation Controls

It is important at this point to observe that the PPS frequency determination is independent of any other means to discipline the system clock frequency and operates continuously, even if the system clock is being disciplined by the synchronization daemon or PLL/FLL algorithm. The intended control strategy is to initialize the PPS discipline state variables, including PPS frequency, median filter and related values, while the synchronization daemon is grooming the initial protocol values to set the clock.

When the NTP daemon recognizes from the API that the PPS frequency has settled down, it switches the clock frequency discipline to the PPS signal, but continues to discipline the clock phase using its own algorithm. When the mitigated phase offset is reduced well below ±0.5 s, to ensure unambiguous seconds numbering, the daemon switches the clock phase discipline to the PPS signal. Should the synchronization source or daemon malfunction, the PPS signal continues to discipline the clock phase and frequency until the malfunction has been corrected.

The daemon continues to monitor the PPS phase offset and mitigated phase offset, in order to detect a possible PPS signal malfunction. If a significant discrepancy is discovered between the PPS phase offset and the mitigated phase offset, the daemon disables at least the PPS phase discipline and, if necessary, the PPS frequency discipline as well.

References

  1. Mills, D.L. Adaptive hybrid clock discipline algorithm for the Network Time Protocol. IEEE/ACM Trans. Networking 6, 5 (October 1998), 505-514.