VxWorks API Reference : OS Libraries

dcacheCbio

NAME

dcacheCbio - Disk Cache Driver

ROUTINES

dcacheDevCreate( ) - Create a disk cache
dcacheDevDisable( ) - Disable the disk cache for this device
dcacheDevEnable( ) - Reenable the disk cache
dcacheDevTune( ) - modify tunable disk cache parameters
dcacheDevMemResize( ) - set a new size to a disk cache device
dcacheShow( ) - print information about disk cache
dcacheHashTest( ) - test hash table integrity

DESCRIPTION

This module implements a disk cache mechanism via the CBIO API. This is intended for use by the VxWorks DOS file system, to store frequently used disk blocks in memory. The disk cache is unaware of the particular file system format on the disk, and handles the disk as a collection of blocks of a fixed size, typically the sector size of 512 bytes.

The disk cache may be used with SCSI, IDE, ATA, floppy, or any other type of disk controller. The underlying device driver may comply either with the CBIO API or with the older block device API.

This library interfaces to device drivers implementing the block device API via the basic CBIO BLK_DEV wrapper provided by cbioLib.

Because the disk cache complies with the CBIO programming interface on both its upper and lower layers, it is both an optional and a stackable module. It can be used or omitted depending on resources available and performance required.

The disk cache module implements the CBIO API, which is used by the file system module to access the disk blocks, or to access bytes within a particular disk block. This allows the file system to use the disk cache to store file data as well as Directory and File Allocation Table blocks, on a Most Recently Used basis, thus keeping a controllable subset of these disk structures in memory. This results in minimized memory requirements for the file system, while avoiding any significant performance degradation.

The size of the disk cache, and thus the memory consumption of the disk subsystem, is configured at initialization time (see dcacheDevCreate( )), allowing the user to trade off memory consumption against performance. Additional performance tuning capabilities are available through dcacheDevTune( ).

Briefly, here are the main techniques deployed by the disk cache:

Some of these techniques are discussed in more detail below; others are described in various professional and academic publications.

DISK CACHE ALGORITHM

The disk cache is composed internally of a number of cache blocks, each the same size as the physical disk block (sector). These cache blocks are maintained in a list in "Most Recently Used" order; that is, a block is moved to the top of the list whenever it is used. When a block needs to be relinquished and made available to hold a new disk block, the Least Recently Used block is chosen for this purpose.
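
The list ordering described above can be illustrated with a minimal host-side sketch. It is not the library's implementation: the names DemoCache, demoCacheInit and demoCacheAccess are hypothetical, the cache holds only three blocks, and a linear scan stands in for whatever lookup structure the real cache uses.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NBLOCKS  3        /* hypothetical tiny cache, for illustration */
#define DEMO_NO_BLOCK (-1L)    /* marks a slot that holds no disk block */

typedef struct {
    long blk[DEMO_NBLOCKS];    /* blk[0] is most recently used, last is LRU */
} DemoCache;

void demoCacheInit(DemoCache *c)
{
    size_t i;

    assert(c != NULL);
    for (i = 0; i < DEMO_NBLOCKS; i++)
        c->blk[i] = DEMO_NO_BLOCK;
}

/* Access a disk block: returns 1 on a hit, 0 on a miss.
 * Either way the block ends up at the top (MRU position) of the list;
 * on a miss, the block at the bottom (LRU position) is relinquished. */
int demoCacheAccess(DemoCache *c, long blockNo)
{
    size_t pos = DEMO_NBLOCKS - 1;   /* default: reuse the LRU slot */
    size_t i;
    int hit = 0;

    assert(c != NULL && blockNo >= 0);
    for (i = 0; i < DEMO_NBLOCKS; i++) {
        if (c->blk[i] == blockNo) {
            pos = i;
            hit = 1;
            break;
        }
    }
    /* Shift the entries above pos down by one, then put blockNo on top. */
    for (i = pos; i > 0; i--)
        c->blk[i] = c->blk[i - 1];
    c->blk[0] = blockNo;
    return hit;
}
```

In this sketch, accessing blocks 10, 11 and 12, then 10 again, leaves 11 at the bottom of the list, so a subsequent access to block 13 relinquishes 11 rather than 10.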

In addition to the regular cache blocks, some of the memory allocated for cache is set aside for a "big buffer", which may range from 1/4 of the overall cache size up to 64KB. This buffer is used for:

Because there is significant overhead involved in accessing the disk drive, read-ahead improves performance significantly by reading groups of blocks at once.
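
As a rough illustration of read-ahead, the sketch below computes how many blocks a single disk request would cover when a block is missing from the cache. The function name demoReadAheadCount is hypothetical, and counting the missing block in addition to the read-ahead blocks is an assumption of this sketch, not a statement about the library.

```c
#include <assert.h>

/* Number of blocks to fetch in one disk request when block missBlk is
 * absent from the cache: the missing block plus up to readAhead blocks
 * that follow it, never running past the end of the device. */
unsigned long demoReadAheadCount(unsigned long missBlk,
                                 unsigned long readAhead,
                                 unsigned long nBlocksOnDev)
{
    unsigned long remaining;

    assert(missBlk < nBlocksOnDev);
    remaining = nBlocksOnDev - missBlk;   /* blocks from missBlk to the end */
    if (readAhead >= remaining)
        return remaining;                 /* clamp at the end of the device */
    return readAhead + 1;
}
```

With a read-ahead of 8 on a 1000-block device, a miss on block 100 becomes one 9-block request, while a miss on block 995 is clamped to the 5 blocks that remain.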

TUNABLE PARAMETERS

Certain operational parameters that control the disk cache are tunable. A number of preset parameter sets are provided, selected according to the size of the cache. These should suffice for most purposes, but under certain types of workload it may be desirable to tune these parameters to better suit the particular workload pattern.

See dcacheDevTune( ) for a description of the tunable parameters. It is recommended to call dcacheShow( ) after calling dcacheDevTune( ) in order to verify that the parameters were set as requested, and to inspect the cache statistics, which may change dramatically. Note that the hit ratio is a principal indicator of cache efficiency, and should be inspected during such tuning.

BACKGROUND UPDATING

A dedicated task is created to update the disk with blocks that have been modified in the cache. The time period between updates is controlled by the tunable parameter syncInterval. The task's priority should be set above that of any CPU-bound tasks, to ensure that it can wake up frequently enough to keep the disk synchronized with the cache. There is only one such task for all configured cache devices. The task name is tDcacheUpd.

The updating task also has the responsibility to invalidate disk cache blocks for removable devices which have not been used for 2 seconds or more.

There are a few global variables which control the parameters of this task, namely:

dcacheUpdTaskPriority
controls the default priority of the update task, and is set by default to 250.

dcacheUpdTaskStack
is used to set the update task stack size.

dcacheUpdTaskOptions
controls the task options for the update task.

All of the above global variables must be set before dcacheDevCreate( ) is called for the first time, with the exception of dcacheUpdTaskPriority, which may be modified at run time and takes effect almost immediately. Note that this priority is not entirely fixed: when critical disk operations are performed and the FIOFLUSH ioctl is called, the calling task temporarily lends its priority to the update task to ensure that the flush operation completes.

REMOVABLE DEVICES

For removable devices, disk cache provides these additional features:

disk updating
is performed such that modified blocks will be written to disk within one second, so as to minimize the risk of losing data in case of a failure or disk removal.
error handling
includes a test for disk removal, so that if a disk is removed from the drive while an I/O operation is in progress, the disk removal event will be set immediately.
disk signature
which is a checksum of the disk's boot block, is maintained by the cache control structure, and it will be verified against the disk if it was idle for 2 seconds or more. Hence if during that idle time a disk was replaced, the change will be detected on the next disk access, and the condition will be flagged to the file system.

NOTE
It is very important that every removable disk has a unique volume label or volume serial number, which is stored in the disk's boot sector during formatting. Swapping disks that have identical boot sectors may result in failure to detect the change, leading to unpredictable behavior and possible file system corruption.
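
The signature check can be pictured with the following host-side sketch. The checksum algorithm used by the disk cache is not given in this document, so a plain additive checksum is assumed here, and all names (demoBootSignature, demoNeedsSignatureCheck, demoDiskChanged, DEMO_IDLE_SECONDS) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_IDLE_SECONDS 2UL  /* idle time after which the check is made */

/* Assumed signature: a simple additive checksum over the boot block. */
unsigned long demoBootSignature(const unsigned char *bootBlk, size_t nBytes)
{
    unsigned long sum = 0;
    size_t i;

    assert(bootBlk != NULL);
    for (i = 0; i < nBytes; i++)
        sum += bootBlk[i];
    return sum;
}

/* Returns 1 if the device has been idle long enough that the boot block
 * should be re-read and compared before the next access proceeds. */
int demoNeedsSignatureCheck(unsigned long nowSec, unsigned long lastAccessSec)
{
    assert(nowSec >= lastAccessSec);
    return (nowSec - lastAccessSec) >= DEMO_IDLE_SECONDS;
}

/* Returns 1 if the boot block no longer matches the saved signature,
 * that is, the disk appears to have been replaced. */
int demoDiskChanged(unsigned long savedSig,
                    const unsigned char *bootBlk, size_t nBytes)
{
    return demoBootSignature(bootBlk, nBytes) != savedSig;
}
```

Two disks whose boot blocks are byte-for-byte identical produce the same signature under any checksum, which is why a swap between such disks cannot be detected this way.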

CACHE IMPLEMENTATION

Most Recently Used (MRU) disk blocks are stored in a collection of memory buffers called the disk cache. The purpose of the disk cache is to reduce the number of disk accesses and to accelerate disk read and write operations, by means of the following techniques:

Overall, the main performance advantage arises from a dramatic reduction in the amount of time spent by the disk drive seeking, thus maximizing the time available for the disk to read and write actual data. In other words, you get efficient use of the disk drive's available throughput. The disk cache offers a number of operational parameters that can be tuned by the user to suit a particular file system workload pattern, for example, delayed write, read ahead, and bypass threshold.

The technique of delaying writes to disk means that if the system is turned off unexpectedly, updates that have not yet been written to the disk are lost. To minimize the effect of a possible crash, the disk cache periodically updates the disk. Modified blocks of data are not kept in memory for more than a specified period of time. With a small update period, the worst-case loss of data from a crash is limited to the changes made during that period. For example, an update period of 2 seconds is assumed to be large enough to optimize disk writes effectively, yet small enough to make the potential loss of data a reasonably minor concern. It is possible to set the update period to 0, in which case all updates are flushed to disk immediately. This is essentially equivalent to using the DOS_OPT_AUTOSYNC option in earlier dosFsLib implementations.

The disk cache allows you to trade disk performance against memory consumption: the more memory allocated to the disk cache, the higher the observed "hit ratio", and therefore the better the performance of file system operations.

Another tunable parameter is the bypass threshold, which defines how much data constitutes a request large enough to justify bypassing the disk cache. When the application makes a sufficiently large read or write request, the disk cache is circumvented and data is transferred directly between the disk controller and the user data buffer. The use of bypassing, in conjunction with support for contiguous file allocation and access (via the FIOCONTIG ioctl( ) command and the DOS_O_CONTIG open( ) flag), should provide performance equivalent to that offered by the raw file system (rawFs).
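
The two decisions discussed above, whether a request bypasses the cache and whether a modified block is due to be written, might be sketched as follows. The names are hypothetical, and treating a request of "at least the threshold" as large enough is an assumption. That choice is at least consistent with dcacheDevDisable( ), which disables caching by setting the bypass count to zero.

```c
#include <assert.h>

typedef enum {
    DEMO_VIA_CACHE,   /* request is served through cache blocks */
    DEMO_BYPASS       /* data moves directly between driver and user buffer */
} DemoPath;

/* Route a request of nBlocks blocks. A threshold of 0 sends every
 * request around the cache, since nBlocks >= 0 always holds. */
DemoPath demoRoute(unsigned int nBlocks, unsigned int bypassCount)
{
    assert(nBlocks > 0);
    return (nBlocks >= bypassCount) ? DEMO_BYPASS : DEMO_VIA_CACHE;
}

/* Returns 1 if a modified block that has been dirty for ageSec seconds
 * must be written now. An update period of 0 means every modification
 * is flushed immediately. */
int demoMustFlush(unsigned int ageSec, unsigned int updatePeriodSec)
{
    return updatePeriodSec == 0 || ageSec >= updatePeriodSec;
}
```

For instance, with a threshold of 64 blocks a 4-block request goes through the cache while a 64-block request is transferred directly.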

PARTITION INTERACTION

The dcache CBIO layer is intended to operate atop an entire fixed disk device. When using the dcache layer with the dpart CBIO partition layer, it is important to place the dcache layer below the partition layer.

For example:

+----------+
| dosFsLib |
+----------+
     |
+----------+
|   dpart  |
+----------+
     |
+----------+
| dcache   |
+----------+
     |
+----------+
| blkIoDev |
+----------+

ENABLE/DISABLE THE DISK CACHE

The function dcacheDevEnable( ) is used to enable the disk cache, and dcacheDevDisable( ) is used to disable it. When the disk cache is disabled, all I/O bypasses the cache layer.

SEE ALSO

dosFsLib, cbioLib, dpartCbio

-------------

Each cache block can be in one of five different states at any time, and state transitions may occur only while the mutex is held. The three basic states are:

EMPTY
a block does not contain any disk data
CLEAN
a block contains an unmodified copy of a certain disk block
DIRTY
a block contains a disk block which has been modified in memory.

There is also an UNSTABLE state, used only while the mutex is held, which indicates that a block is being modified in memory and its data is not valid. A block is never left in this state after the mutex is released.
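
One way to picture the states named above is as a small transition function. This is only a sketch under assumptions: the event names and the exact transitions are hypothetical, and only the four states mentioned in this text are modeled.

```c
#include <assert.h>

typedef enum { BLK_EMPTY, BLK_CLEAN, BLK_DIRTY, BLK_UNSTABLE } DemoBlkState;

typedef enum {
    EV_FILL,         /* block loaded with data read from disk */
    EV_MODIFY,       /* block contents changed in memory */
    EV_FLUSH,        /* block written back to disk */
    EV_INVALIDATE    /* block contents discarded */
} DemoBlkEvent;

/* State of a block after an operation completes and the mutex is
 * released. UNSTABLE exists only while the operation is in progress,
 * so it is never accepted as a starting state here. */
DemoBlkState demoNextState(DemoBlkState s, DemoBlkEvent e)
{
    assert(s != BLK_UNSTABLE);
    switch (e) {
    case EV_FILL:       return (s == BLK_EMPTY) ? BLK_CLEAN : s;
    case EV_MODIFY:     return BLK_DIRTY;
    case EV_FLUSH:      return (s == BLK_DIRTY) ? BLK_CLEAN : s;
    case EV_INVALIDATE: return BLK_EMPTY;   /* any unwritten change is lost */
    }
    return s;
}
```

A block would then typically cycle from EMPTY to CLEAN on a read, to DIRTY on a write, and back to CLEAN once the update task has flushed it.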

Removable Device Support Details

It is worth noting that we don't trust the block driver's ability to set its readyChanged flag correctly. Some drivers set it needlessly; others fail to set it when a disk is in fact replaced. Hence we devised an independent approach to this issue: we assume that if a disk is replaced while the device is active, we will get an error, and we also assume that it takes at least 2 seconds to replace a disk. Hence, if the disk has been idle for more than 2 seconds, we check the checksum of its boot block against a previously registered signature.

Issues to revisit or implement:
 + boot block number is hardcoded.
 + separate removable detection into a separate CBIO module below dcache


OS Libraries : Routines

dcacheDevCreate( )

NAME

dcacheDevCreate( ) - Create a disk cache

SYNOPSIS

CBIO_DEV_ID dcacheDevCreate
    (
    CBIO_DEV_ID subDev,       /* block device handle */
    char *      pRamAddr,     /* where it is in memory (NULL = KHEAP_ALLOC) */
    int         memSize,      /* amount of memory to use */
    char *      pDesc         /* device description string */
    )

DESCRIPTION

This routine creates a CBIO layer disk data cache instance. The disk cache unit accesses the disk through the subordinate CBIO device driver, provided with the subDev argument.

A valid block device BLK_DEV handle may be provided instead of a CBIO handle, in which case it will be automatically converted into a CBIO device by using the wrapper functionality from cbioLib.

Memory to be used for caching disk data may be provided by the caller with pRamAddr; if pRamAddr is passed as NULL, the memory is allocated by dcacheDevCreate( ) from the common system memory pool. memSize is the amount of memory to use for disk caching; if 0 is passed, a default value is calculated based on available memory. pDesc is a string describing the device. It is used later by dcacheShow( ), and is useful when there are many cached disk devices.

A maximum of 16 disk cache devices are supported at this time.

RETURNS

A disk cache device handle, or NULL if there is not enough memory to satisfy the request or the subDev handle is invalid.

SEE ALSO

dcacheCbio



dcacheDevDisable( )

NAME

dcacheDevDisable( ) - Disable the disk cache for this device

SYNOPSIS

STATUS dcacheDevDisable
    (
    CBIO_DEV_ID dev           /* CBIO device handle */
    )

DESCRIPTION

This function disables the cache by setting the bypass count to zero and saving the previous value. If a previous value has already been saved, the process is not repeated.

RETURNS

OK if the cache is successfully disabled, or ERROR otherwise.

SEE ALSO

dcacheCbio



dcacheDevEnable( )

NAME

dcacheDevEnable( ) - Reenable the disk cache

SYNOPSIS

STATUS dcacheDevEnable
    (
    CBIO_DEV_ID dev           /* CBIO device handle */
    )

DESCRIPTION

This function re-enables the cache if we disabled it. If we did not disable it, then we cannot re-enable it.

RETURNS

OK if the cache is successfully enabled, or ERROR otherwise.

SEE ALSO

dcacheCbio



dcacheDevTune( )

NAME

dcacheDevTune( ) - modify tunable disk cache parameters

SYNOPSIS

STATUS dcacheDevTune
    (
    CBIO_DEV_ID dev,          /* device handle */
    int         dirtyMax,     /* max # of dirty cache blocks allowed */
    int         bypassCount,  /* request size for bypassing cache */
    int         readAhead,    /* how many blocks to read ahead */
    int         syncInterval  /* how many seconds between disk updates */
    )

DESCRIPTION

This function allows the user to tune some disk cache parameters to obtain better performance for a given application or workload pattern. These parameters are checked for sanity before being used, hence it is recommended to verify the actual parameters being set with dcacheShow( ).

Following is the description of each tunable parameter:

bypassCount
In order to achieve maximum performance, Disk Cache is bypassed for very large requests. This parameter sets the threshold number of blocks for bypassing the cache, resulting usually in the data being transferred by the low level driver directly to/from application data buffers (also known as cut-through DMA). Passing the value of 0 in this argument preserves the previous value of the associated parameter.

syncInterval
The Disk Cache provides a low priority task that periodically writes all modified blocks to the disk. This parameter controls the time between these updates, in seconds. The longer this period, the better the throughput is likely to be, at the risk of losing more data in the event of a failure. For removable devices this interval is fixed at 1 second. Setting this parameter to 0 results in immediate writes to disk when requested, minimizing the risk of data loss at the cost of somewhat degraded performance.

readAhead
In order to avoid accessing the disk in small units, the Disk Cache reads many contiguous blocks at once when a block that is absent from the cache is needed. Increasing this value increases read performance, but a value that is too large may cause frequently used blocks to be removed from the cache, resulting in a low Hit Ratio and an increased number of seeks, which slows performance dramatically. Passing the value of 0 in this argument preserves the previous value of the associated parameter.

dirtyMax
Routinely, the Disk Cache keeps modified blocks in memory until it is specifically instructed to write these blocks to the disk, or until the specified time interval between disk updates has elapsed, or until the number of modified blocks is large enough to justify an update. Because the disk is updated in an ordered manner, and blocks are written in groups when adjacent blocks have been modified, a larger dirtyMax parameter minimizes the number of seek operations, but a value that is too large may decrease the Hit Ratio, thus degrading performance. Passing the value of 0 in this argument preserves the previous value of the associated parameter.
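
The effect of ordered, grouped updates can be illustrated with a small host-side sketch that sorts a set of dirty block numbers and counts how many separate write requests result when each run of adjacent blocks is written together. The function names are hypothetical and the sketch says nothing about how the library itself tracks dirty blocks.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparison for unsigned long block numbers. */
static int demoCmpBlk(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;

    return (x > y) - (x < y);
}

/* Sort the dirty block numbers in place and return how many separate
 * write requests are needed when each run of adjacent blocks is written
 * as one group. Block numbers are assumed to be distinct. */
size_t demoCountWriteGroups(unsigned long *dirty, size_t n)
{
    size_t groups;
    size_t i;

    assert(dirty != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(dirty, n, sizeof dirty[0], demoCmpBlk);
    groups = 1;
    for (i = 1; i < n; i++) {
        if (dirty[i] != dirty[i - 1] + 1)
            groups++;    /* gap: a new request, and probably a seek */
    }
    return groups;
}
```

For the dirty set {12, 5, 11, 6, 40, 10, 7}, the sorted order is 5, 6, 7, 10, 11, 12, 40, which collapses seven block writes into three requests. Allowing more dirty blocks to accumulate gives such runs more opportunity to form.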

RETURNS

OK, or ERROR if the device handle is invalid. A parameter value that is out of range is silently corrected.
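
The combination of "0 preserves the previous value" and silent correction of out-of-range values could look like the sketch below. The function demoTuneParam and the range limits passed to it are hypothetical, since the actual limits applied by dcacheDevTune( ) are not listed here. The rule shown applies to bypassCount, readAhead, and dirtyMax; for syncInterval, 0 has its own meaning, as described above.

```c
#include <assert.h>

/* Resolve one tunable: 0 keeps the previous value; any other request is
 * clamped into the range [lo, hi] without reporting an error. */
int demoTuneParam(int previous, int requested, int lo, int hi)
{
    assert(lo <= hi);
    if (requested == 0)
        return previous;
    if (requested < lo)
        return lo;
    if (requested > hi)
        return hi;
    return requested;
}
```

Because a corrected value is not reported as an error, the caller only learns the effective setting by inspecting it afterwards, which is why a call to dcacheShow( ) after tuning is recommended.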

SEE ALSO

dcacheCbio, dcacheShow( )



dcacheDevMemResize( )

NAME

dcacheDevMemResize( ) - set a new size to a disk cache device

SYNOPSIS

STATUS dcacheDevMemResize
    (
    CBIO_DEV_ID dev,          /* device handle */
    size_t      newSize       /* new cache size in bytes */
    )

DESCRIPTION

This routine is used to resize the dcache layer. This routine is also useful after a disk change event, for example a PCMCIA disk swap. The routine pccardDosDevCreate( ) in pccardLib.c uses this routine for that function. This should be invoked each time a new disk is inserted on media where the device geometry could possibly change. This function will re-read all device geometry data from the block driver, carve out and initialize all cache descriptors and blocks.

RETURNS

OK, or ERROR if the device is invalid, if the device geometry is invalid (EINVAL), or if there is not enough memory to perform the operation.

SEE ALSO

dcacheCbio



dcacheShow( )

NAME

dcacheShow( ) - print information about disk cache

SYNOPSIS

void dcacheShow
    (
    CBIO_DEV_ID dev,          /* device handle */
    int         verbose       /* 1 - display state of each cache block */
    )

DESCRIPTION

This routine displays various information regarding a disk cache, namely current disk parameters, cache size, tunable parameters and performance statistics. The information is displayed on the standard output.

The dev argument is the device handle. If it is NULL, all disk caches are displayed.

RETURNS

N/A

SEE ALSO

dcacheCbio



dcacheHashTest( )

NAME

dcacheHashTest( ) - test hash table integrity

SYNOPSIS

void dcacheHashTest
    (
    CBIO_DEV_ID dev
    )

DESCRIPTION

SEE ALSO

dcacheCbio