Kernel Application Program Interface
The kernel application program interface (API) is used by the NTP protocol daemon (or equivalent) to discipline the system clock and set various parameters necessary for its correct operation. It is also used by application programs to read the system clock and determine its health and expected error values. Following is a description of the interface, as well as the control and monitoring variables involved.
The API consists of two Unix system calls, ntp_gettime() and ntp_adjtime(). The ntp_gettime() call returns the current system time in either timespec format in seconds and nanoseconds, or timeval format in seconds and microseconds, as determined by the particular operating system. In addition to the time value, this system call returns variables representing the maximum credible error and estimated error of the time value in microseconds and the current offset of International Atomic Time (TAI) relative to Coordinated Universal Time (UTC) when available. The ntp_adjtime() call is used to set and read certain kernel state variables according to a set of mode bits in the call. Setting the variables requires superuser permission, but reading them requires no special permissions. Both system calls return a code indicating the current status of the system clock; that is, whether a leap second is pending or whether the clock is synchronized to a working reference source.
Following is a description of the various values used by the API, including state variables and control/status bits. Detailed calling sequences and structure definitions are in the timex.h header file included in the distribution.
The following parameters defined in the timex.h header file establish the performance envelope of the kernel clock discipline loop. Included are the current values for these parameters, although they may be changed in the future. Note that changing these values may adversely affect overflow and rounding behavior and require re-engineering of the code segments.
- Phase errors greater than MAXPHASE (0.5 s) or frequency errors greater than MAXFREQ (500 PPM) are beyond the range of the clock discipline algorithm. Values that exceed these limits are clamped to the limits before being used to discipline the system clock.
- For update intervals less than MINSEC (256 s), the clock discipline algorithm always operates in PLL mode; while, for update intervals more than MAXSEC (2048 s), the algorithm always operates in FLL mode. Between these two limits the mode is selected by the STA_FLL bit in the status word.
- The range of time constants supported by the clock discipline algorithm is limited to the range 0 through MAXTC (10). The time constant is expressed as a power of two, so that zero corresponds to 1 s and 10 corresponds to 1024 s.
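The limits above can be illustrated with a short sketch. It is not the kernel code: the function names are hypothetical, MAXPHASE is expressed in nanoseconds as an assumption about units, and the boundary handling at exactly MINSEC follows the description of FLL mode given for the status word.

```c
#include <stdio.h>

/* Values quoted in the text; the authoritative definitions are in timex.h. */
#define MAXPHASE 500000000L /* 0.5 s, expressed here in ns (assumed unit) */
#define MAXFREQ  500L       /* 500 PPM */
#define MINSEC   256L       /* s */
#define MAXSEC   2048L      /* s */
#define MAXTC    10

/* Clamp v to the symmetric range [-limit, +limit]. */
long clamp_to_limit(long v, long limit)
{
    if (v > limit)
        return limit;
    if (v < -limit)
        return -limit;
    return v;
}

/* Time constant is a power of two: 0 -> 1 s, MAXTC (10) -> 1024 s. */
long tc_to_seconds(int tc)
{
    if (tc < 0)
        tc = 0;
    if (tc > MAXTC)
        tc = MAXTC;
    return 1L << tc;
}

/* Returns 1 if FLL mode is in effect for this update interval, else 0. */
int fll_in_effect(long interval, int sta_fll)
{
    if (interval > MAXSEC)
        return 1;                       /* always FLL */
    return interval > MINSEC && sta_fll; /* otherwise STA_FLL decides */
}
```

For example, a phase error of 0.7 s would be clamped to 0.5 s before use, and an update interval of 512 s runs in FLL mode only when STA_FLL is set.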
The various functions of the clock discipline algorithm are controlled and monitored by the status word. The bits of this word are read and written using the ntp_adjtime() system call, but superuser privilege is required to write them. The following read/write bits are defined by the API.
- STA_PLL: Master enable switch for the PLL/FLL loop. The algorithm is responsive to time and/or frequency updates if set; otherwise, no change in the current time or frequency will be made other than to complete a pending phase adjustment. This bit does not affect the PPS loop.
- STA_PPSFREQ: Enables the PPS frequency discipline independent of the STA_PLL bit.
- STA_PPSTIME: Enables the PPS phase discipline independent of the STA_PLL bit.
- STA_FLL: Selects the operating mode when the time constant is in the range 0 through 10. If set, operation is in FLL mode; otherwise, operation is in PLL mode.
- STA_INS, STA_DEL: Control the system clock behavior in the vicinity of a leap second insertion or deletion. See the Return Codes and the Leap-Second State Machine section on this page for how these bits are used.
- STA_UNSYNC: Set by the synchronization daemon to indicate an unsynchronized or out-of-tolerance condition, but otherwise has no effect on the clock discipline algorithm.
- STA_FREQHOLD: Set by the synchronization daemon to freeze the current frequency while allowing the phase to be disciplined as usual. This bit is not used by the NTP Version 4 daemon and is included only for legacy purposes.
The following read-only status bits are defined by the API.
- STA_PPSSIGNAL: Indicates the presence of a valid PPS signal. It is set by a valid PPS update and reset after about two minutes during which no signal is present.
- STA_PPSJITTER: Indicates a condition of excessive PPS phase jitter. See the Principles of Operation page for further details.
- STA_PPSWANDER: Indicates a condition of excessive PPS frequency wander. See the Principles of Operation page for further details.
- STA_PPSERROR: Indicates a calibration error in the PPS frequency measurement algorithm. See the Principles of Operation page for further details.
- STA_CLOCKERR: Set by the external clock driver to indicate a fault in the hardware or driver.
- STA_NANO: Set to indicate nanosecond mode or reset to indicate microsecond mode. See the MOD_NANO and MOD_MICRO mode control bits for further information.
- STA_MODE: Set by the kernel to indicate FLL mode is in effect. FLL mode is in effect if the interval since the last update is greater than MAXSEC or if the interval since the last update is greater than MINSEC and the STA_FLL bit is set in the status word.
- STA_CLK: Set to indicate the external clock is in use or reset to indicate the normal kernel clock variable is in use. See the MOD_CLKA and MOD_CLKB mode control bits for further information.
Return Codes and the Leap-Second State Machine
Occasionally, it is necessary to adjust the system clock in response to leap seconds as declared by national standards laboratories. The adjustments are in integral seconds that may be inserted or deleted in the local timescale. The mechanism to recognize and disseminate the leap events themselves is beyond the scope of the API.
A leap event is implemented using a state machine in the kernel. Normally, the machine is in TIME_OK state and nothing special happens at midnight UTC. In order to arm the machine for the event, the ntp_adjtime() system call sets the STA_INS or STA_DEL bit in the status word, which initializes the machine in either TIME_INS or TIME_DEL state to insert or delete the second, respectively. If in TIME_INS after second 86,399 of the day, the machine repeats that second, numbered 86,400 of the current day, and transitions to TIME_OOP. One second later it transitions to TIME_WAIT. If in TIME_DEL after second 86,398 of the current day, it sets the system clock one second in the future, numbered second 0 of the following day, and transitions to TIME_WAIT. The machine remains in this state until the STA_INS and STA_DEL bits are both reset in the status word, after which it transitions to TIME_OK.
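The transitions just described can be modeled with a small sketch that advances a toy clock one second at a time. This is an illustration only, not the kernel implementation: the struct, the function name, and the numeric values of the constants are assumptions (the real definitions are in timex.h), and arming is modeled at the start of each tick rather than at the ntp_adjtime() call.

```c
#include <stdio.h>

/* Illustrative values; the authoritative definitions are in timex.h. */
enum { STA_INS = 0x0010, STA_DEL = 0x0020 };
enum leap_state { TIME_OK, TIME_INS, TIME_DEL, TIME_OOP, TIME_WAIT };

/* Hypothetical toy clock: day number plus second of the day. */
struct sketch_clock {
    long day;
    long sec;               /* 0..86399, or 86400 during an inserted second */
    enum leap_state state;
};

/* Advance the toy clock by one second given the current status word. */
void leap_tick(struct sketch_clock *c, int status)
{
    /* Arm from TIME_OK; leave TIME_WAIT once both bits are reset. */
    if (c->state == TIME_OK) {
        if (status & STA_INS)
            c->state = TIME_INS;
        else if (status & STA_DEL)
            c->state = TIME_DEL;
    } else if (c->state == TIME_WAIT && !(status & (STA_INS | STA_DEL))) {
        c->state = TIME_OK;
    }

    if (c->state == TIME_INS && c->sec == 86399) {
        c->sec = 86400;             /* inserted second of the current day */
        c->state = TIME_OOP;
    } else if (c->state == TIME_OOP) {
        c->sec = 0;                 /* insertion done, next day begins */
        c->day++;
        c->state = TIME_WAIT;
    } else if (c->state == TIME_DEL && c->sec == 86398) {
        c->sec = 0;                 /* skip second 86399 */
        c->day++;
        c->state = TIME_WAIT;
    } else if (++c->sec == 86400) {
        c->sec = 0;                 /* ordinary midnight rollover */
        c->day++;
    }
}
```

Starting at second 86,398 with STA_INS set, successive ticks yield seconds 86,399, 86,400 (TIME_OOP), and then 0 of the next day (TIME_WAIT); the machine returns to TIME_OK on the first tick after both bits are cleared.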
It is extremely important to recognize the assumption in this design that the system clock is read by a routine that always makes the system time appear monotonic; that is, it never runs backward. During an inserted leap second the system clock will appear to advance by one unit each time it is read, regardless of its prior value. Once the leap second has expired, the clock will resume normal operation. This is not a property of the API itself, but rather an intrinsic property of the system clock reading mechanism.
The current state of the machine is determined by the state word maintained by the kernel. When no error conditions are in effect, the value returned by the ntp_gettime() and ntp_adjtime() system calls is the current value of this word; otherwise, an error value is returned. Specific reasons for the error can be determined from the status word returned by the ntp_adjtime() system call. The following return codes are defined by the API.
- TIME_OK: No leap second warning is in effect.
- TIME_INS: A leap second will be inserted following second 86,399 of the current day.
- TIME_DEL: A leap second will be deleted following second 86,398 of the current day.
- TIME_OOP: A leap second insertion is in progress. Time might not be precisely coordinated between NTP server sites. This state occurs only during the actual leap second insertion and lasts for only one second.
- TIME_WAIT: A leap second insertion or deletion has completed, but the leap warning bits STA_INS or STA_DEL remain set.
- TIME_ERROR: The system clock is currently not synchronized to a reliable server. This state is declared when one or more of the following conditions are met.
  - STA_UNSYNC or STA_CLOCKERR is set. Either the clock is not synchronized to a reliable server or a failure has occurred in the external clock or clock driver, if provisioned.
  - STA_PPSSIGNAL is reset and either STA_PPSFREQ or STA_PPSTIME is set. The synchronization daemon has requested use of the PPS signal, but the signal has not been detected during the last two minutes.
  - STA_PPSTIME and STA_PPSJITTER are both set. The synchronization daemon has requested PPS time discipline, but the jitter has exceeded the limit.
  - STA_PPSFREQ is set and either STA_PPSWANDER or STA_PPSERROR is set. The synchronization daemon has requested PPS frequency discipline, but either the frequency wander has exceeded the limit or a frequency measurement has failed due to a glitch.
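The four conditions that lead to the unsynchronized state can be written as a single predicate on the status word. The sketch below is an illustration under assumptions: the function name is hypothetical and the bit values are placeholders for the definitions in timex.h.

```c
#include <stdio.h>

/* Illustrative bit values; the authoritative definitions are in timex.h. */
enum {
    STA_PPSFREQ   = 0x0002,
    STA_PPSTIME   = 0x0004,
    STA_UNSYNC    = 0x0040,
    STA_PPSSIGNAL = 0x0100,
    STA_PPSJITTER = 0x0200,
    STA_PPSWANDER = 0x0400,
    STA_PPSERROR  = 0x0800,
    STA_CLOCKERR  = 0x1000
};

/* Returns 1 if any of the listed error conditions holds, else 0. */
int clock_in_error(int status)
{
    /* Unsynchronized, or external clock/driver fault. */
    if (status & (STA_UNSYNC | STA_CLOCKERR))
        return 1;
    /* PPS discipline requested but no PPS signal present. */
    if (!(status & STA_PPSSIGNAL) && (status & (STA_PPSFREQ | STA_PPSTIME)))
        return 1;
    /* PPS time discipline requested but jitter exceeded. */
    if ((status & STA_PPSTIME) && (status & STA_PPSJITTER))
        return 1;
    /* PPS frequency discipline requested but wander or calibration error. */
    if ((status & STA_PPSFREQ) && (status & (STA_PPSWANDER | STA_PPSERROR)))
        return 1;
    return 0;
}
```

Note that STA_PPSJITTER alone does not trigger the error state in this sketch; it matters only when STA_PPSTIME has been requested.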
The kernel can also produce two additional return codes. These are system-dependent error numbers defined in the /usr/include/errno.h header file. Note that the values of these error numbers may collide with the above return codes in some systems.
- EPERM: Not superuser - attempt to change kernel variables without root privilege.
- EINVAL: Invalid argument - attempt to set both MOD_MICRO and MOD_NANO or MOD_CLKB and MOD_CLKA simultaneously.
Mode Control Bits
The following mode bits specify which kernel values are to be changed in the ntp_adjtime() system call, as well as the format used for time values and whether an external clock is selected.
- These bits control which fields of the timex structure are used to update the corresponding kernel variables. The bits may be set in any combination. See the description below for which bits control which variable.
- These two bits control the scale used in the API (but not the actual operations used by the clock discipline algorithm). Only one of the two bits should be set. MOD_NANO selects seconds and nanoseconds (timespec format), while MOD_MICRO selects seconds and microseconds (timeval format). This applies to both the time value returned by ntp_gettime() and the offset used by ntp_adjtime(). Note that not all kernels can support nanosecond format. The recommended behavior is to select one or the other format and inspect the STA_NANO bit in the status word to determine the actual kernel mode. The default when the kernel is first booted is seconds and microseconds for legacy compatibility.
- These two bits control the operation of an external clock, if present in the architecture. Only one of the two bits should be set. MOD_CLKB sets the STA_CLK bit in the status word, while MOD_CLKA resets it. The behavior in response to the STA_CLK bit is beyond the scope of the current implementation.
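As a rough illustration of how the mode word constraints and the privilege rule fit together, the sketch below validates a mode word before any kernel variable would be touched. The function name, the bit values, the order of the checks, and the treatment of any nonzero mode word as a write are all assumptions made for the example; the actual definitions are in timex.h.

```c
#include <errno.h>
#include <stdio.h>

/* Illustrative bit values; the authoritative definitions are in timex.h. */
enum {
    MOD_OFFSET    = 0x0001,
    MOD_FREQUENCY = 0x0002,
    MOD_MAXERROR  = 0x0004,
    MOD_ESTERROR  = 0x0008,
    MOD_STATUS    = 0x0010,
    MOD_MICRO     = 0x1000,
    MOD_NANO      = 0x2000,
    MOD_CLKB      = 0x4000,
    MOD_CLKA      = 0x8000
};

/* Returns 0 if the request is acceptable, else an errno value. */
int check_modes(unsigned modes, int is_superuser)
{
    /* Mutually exclusive pairs may not be set together. */
    if ((modes & MOD_MICRO) && (modes & MOD_NANO))
        return EINVAL;
    if ((modes & MOD_CLKA) && (modes & MOD_CLKB))
        return EINVAL;
    /* Assumption: any nonzero mode word counts as a write request. */
    if (modes != 0 && !is_superuser)
        return EPERM;
    return 0;   /* a zero mode word only reads the variables */
}
```

A caller with no special privilege can therefore pass a zero mode word to read the current values, while a request such as MOD_OFFSET | MOD_STATUS is accepted only for the superuser.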
The ntp_adjtime() System Call
The ntp_adjtime() system call is used to set and read kernel variables. It operates using the timex structure described in the timex.h header file. This structure is used both to change the values of certain kernel variables and to return the current values. Root privilege is required to change the values. Following are the variables that can be read and written. The return codes are described in the Return Codes and the Leap-Second State Machine section on this page.
- If the MOD_OFFSET bit is set in the mode word, this variable updates the kernel time offset, in nanoseconds if the STA_NANO bit is set in the status word, or in microseconds if not.
- If the MOD_FREQUENCY bit is set in the mode word, this variable updates the kernel frequency offset. This is ordinarily done only when the clock discipline algorithm is first started, since the frequency is automatically determined by the algorithm after that. The format of this variable is in PPM with a 16-bit fraction field.
- If the MOD_MAXERROR bit is set in the mode word, this variable updates the kernel maximum error in microseconds. The error is automatically updated by the clock discipline algorithm after that and until the next update. The value is not used for any purpose other than to provide a conduit from the synchronization daemon to the user applications.
- If the MOD_ESTERROR bit is set in the mode word, this variable updates the kernel estimated error in microseconds. The value is not used for any purpose other than to provide a conduit from the synchronization daemon to the user applications.
- If the MOD_STATUS bit is set in the mode word, this variable updates the read/write bits in the status word.
- This variable is actually two variables, one write-only and the other read-only. When written, the value represents the maximum calibration cycle length expressed in seconds as a power of two. When read, the value represents the current calibration cycle length expressed in seconds as a power of two.
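Two of the encodings above lend themselves to a brief worked example: the frequency offset, which is in PPM with a 16-bit fraction (that is, scaled by 2^16), and the calibration cycle length, which is a power of two in seconds. The helper names below are hypothetical, and the conversion truncates toward zero as a simplifying assumption.

```c
#include <stdio.h>

/* Convert a scaled frequency (PPM with 16-bit fraction) to plain PPM. */
double freq_to_ppm(long freq)
{
    return (double)freq / 65536.0;
}

/* Convert plain PPM to the scaled representation (truncates toward zero). */
long ppm_to_freq(double ppm)
{
    return (long)(ppm * 65536.0);
}

/* Calibration cycle length in seconds from its power-of-two exponent. */
long cycle_seconds(int log2_len)
{
    return 1L << log2_len;
}
```

With this scaling, the 500 PPM frequency limit corresponds to a scaled value of 32,768,000, and an exponent of 4 denotes a 16 s calibration cycle.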
The following variables are read-only.
- Precision of the system clock, in nanoseconds if the STA_NANO bit is set in the status word, or in microseconds if not.
- Frequency tolerance of the clock discipline algorithm, in the same units as the freq variable described above. More precisely, the maximum clock oscillator frequency error that can be corrected by the clock discipline algorithm.
The following variables are read-only and present only if the PPS clock discipline code has been compiled and configured in the kernel. They are included in the timex structure definition to ensure portability.
- Frequency calculated by the PPS loop, in the same units as the freq variable described above. This calculation is independent of all other means to adjust the system clock frequency. If enabled by the API and the PPS signal is within nominal limits, the clock frequency should be identical to this value.
- Average phase jitter, in nanoseconds if the STA_NANO bit is set in the status word, or in microseconds if not.
- Average wander, in the same units as the freq variable described above.
- Count of excess phase jitter occurrences. See the Principles of Operation page for further details.
- Count of calibration intervals. See the Principles of Operation page for further details.
- Count of calibration error occurrences. See the Principles of Operation page for further details.
- Count of excess frequency wander occurrences. See the Principles of Operation page for further details.
The ntp_gettime() System Call
The ntp_gettime() system call is used to read the current system time and related error variables. It uses the ntptimeval structure described in the timex.h header file. This structure includes the following variables. The return codes are described in the Return Codes and the Leap-Second State Machine section on this page.
- Current system clock time, in timespec format if STA_NANO is set in the status word; in timeval format if not.
- Maximum credible absolute error of the system clock, in microseconds. The value is initialized by the ntp_adjtime() system call and incremented by the kernel after that. The value is not used for any purpose other than to provide a conduit from the synchronization daemon to the user applications.
- Estimated RMS error of the system clock, in microseconds. The value is initialized by the ntp_adjtime() system call, but not used for any purpose other than to provide a conduit from the synchronization daemon to the user applications.
- Current offset of TAI relative to UTC, as provided by the Autokey protocol and the NIST leapseconds file.
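A minimal sketch of reading the clock and its error bounds from an application follows. It assumes a system that declares ntp_gettime() and struct ntptimeval in <sys/timex.h> (for example Linux with glibc, or FreeBSD); that header location and the function name print_clock_health are assumptions, since this page only names timex.h. Only the seconds field is printed so the code works whether the kernel returns timespec or timeval format.

```c
#include <stdio.h>
#include <sys/timex.h>

/* Reads the clock once and prints the time with its error bounds.
 * Returns the ntp_gettime() return code, or -1 if the call failed. */
int print_clock_health(void)
{
    struct ntptimeval ntv;
    int rc = ntp_gettime(&ntv);

    if (rc == -1) {
        perror("ntp_gettime");
        return rc;
    }
    printf("seconds:  %lld\n", (long long)ntv.time.tv_sec);
    printf("maxerror: %ld us\n", (long)ntv.maxerror);
    printf("esterror: %ld us\n", (long)ntv.esterror);
    if (rc == TIME_ERROR)
        puts("clock is not synchronized to a reliable server");
    return rc;
}
```

No special privilege is needed for this call. An application that cares about accuracy would typically check the return code before trusting the error values.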