One important variable governing how the Odf system components work together as a network file system (NFS) is the interval between consecutive attempts of each ROM (as NFS client) to issue requests and to transmit data to the server, presently bbr-srv02.
The corresponding variable in the ROM kernel code (VxWorks) is called "nfsReXmitSec"; per
  OdfVxWorks/target/src/config/usrNetwork.c:      nfsReXmitSec = ODF_NFS_REXMIT_SEC;
  OdfVxWorks/target/src/config/mv2304/config.h:   #define ODF_NFS_REXMIT_SEC 1
Two counteracting effects need to be considered in estimating the nfsReXmitSec value for which data can be written most quickly (from all ROMs to the server, as in a calibration run):
On one hand, the server cannot process all ROM requests at once, and the average response time rises steeply, even exponentially, with the rate of requests that cannot be serviced right away (and must therefore be retransmitted).
On the other hand, any delay in retransmitting a ROM request means a delay in processing it and in finishing the write operation as a whole.
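The second effect can be illustrated with a toy model. The following is a minimal sketch, not a description of the actual NFS client or server: it assumes a server that accepts a fixed number of requests per one-second tick and drops the rest, and clients that retry a dropped request only after rexmit_sec seconds. The function name simulate and all of its parameters are hypothetical. The model deliberately leaves out the first effect (rising server response time under load), so by itself it cannot locate the optimum.

```python
def simulate(n_clients, requests_each, capacity, rexmit_sec):
    """Return the finish time (in ticks) of each client in a toy retry model.

    Assumptions (not taken from the real system): the server accepts at most
    `capacity` requests per tick, clients are served in index order, and a
    rejected client waits `rexmit_sec` ticks before trying again.
    """
    remaining = [requests_each] * n_clients   # requests still to be served
    next_send = [0] * n_clients               # earliest tick of the next attempt
    finish = [None] * n_clients               # tick at which the client is done
    t = 0
    while any(f is None for f in finish):
        served = 0
        for c in range(n_clients):
            if finish[c] is not None or next_send[c] > t:
                continue                      # done already, or still waiting
            if served < capacity:
                served += 1
                remaining[c] -= 1
                if remaining[c] == 0:
                    finish[c] = t + 1         # service takes one tick
                else:
                    next_send[c] = t + 1      # next request on the next tick
            else:
                next_send[c] = t + rexmit_sec # dropped: retry after the timeout
        t += 1
    return finish


slow_retry = simulate(n_clients=4, requests_each=1, capacity=2, rexmit_sec=8)
fast_retry = simulate(n_clients=4, requests_each=1, capacity=2, rexmit_sec=1)
print(slow_retry, fast_retry)
```

With 4 clients, one request each, and a capacity of 2 per tick, a retry interval of 8 gives finish times [1, 1, 9, 9], whereas an interval of 1 gives [1, 1, 2, 2]: the clients that were turned away sit idle for the whole retry interval even though the server is free again after one tick.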
Historically, working with a slower server, an optimum nfsReXmitSec value had been determined as 8 (seconds), which was also used for bbr-srv02 until recently (June '03).
However, for bbr-srv02 it was found that the processing of ROM requests and the related activity of disk and (virtual) memory approached saturation only when nfsReXmitSec was reduced to 1, which was accompanied by a speed-up of typical write operations by more than a factor of three.
More information on determining NFS performance and on tuning of corresponding parameters (in particular for NFS servers running the Sun5.8 operating system, as does bbr-srv02) can be found here.
Timing of example write operations, 700 kB each from 100 ROMs.
("Time" = time after common start, in seconds; "ROMs" = number of ROMs which finished at this time.)

nfsReXmitSec = 8 (first trial):
  Time:   0   1   2 ...   6   7   8   9  10 ...  14  15  16  17  18 ...  25  26 ...  33 ...  41
  ROMs:   0   7  10 ...  11   0   0   0  11 ...   4   0   0  11  11 ...  11   6 ...  16 ...   3

nfsReXmitSec = 8 (second trial):
  Time:   0   1   2 ...   6   7   8   9  10 ...  14  15  16  17  18 ...  25  26
  ROMs:   0   1  10 ...  10   0   0   0  29 ...   7   0   0   5  19 ...  16   1

nfsReXmitSec = 8 (third trial):
  Time:   0   1 ...   6   7   8   9 ...  14  15  16  17 ...  25 ...  33 ...  41 ...  49 ...  57
  ROMs:   0   4 ...   7   0   0   9 ...   9   0   0  21 ...  10 ...  10 ...   5 ...  17 ...   8

nfsReXmitSec = 1 (first trial):
  Time:   0   1   2   3   4   5   6   7   8
  ROMs:   0   0   5  28  22  29   9   7   0

nfsReXmitSec = 1 (second trial):
  Time:   0   1   2   3   4   5   6   7   8
  ROMs:   0   0   4  25  19  36  11   5   0
Noticeable for nfsReXmitSec = 8 are the long tails of ROMs finishing the write operation at consecutive intervals of 8 seconds, apparently after as many as about seven retransmission attempts.
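The 8-second spacing and the count of about seven retransmissions can be checked against the third nfsReXmitSec = 8 trial. The sketch below assumes that each listed time of that trial (1, 9, 17, ... 57 seconds) marks the start of a new group of finishing ROMs; the variable names are illustrative only.

```python
# Times (s) listed in the third nfsReXmitSec = 8 trial at which ROMs finish
# again after a gap (assumed here to mark the start of each group).
burst_times = [1, 9, 17, 25, 33, 41, 49, 57]
rexmit_sec = 8

# Separation between consecutive groups of finishing ROMs.
gaps = [later - earlier for earlier, later in zip(burst_times, burst_times[1:])]

# Number of retransmission intervals between the first and the last group.
retransmissions = (burst_times[-1] - burst_times[0]) // rexmit_sec

print(gaps)             # [8, 8, 8, 8, 8, 8, 8]
print(retransmissions)  # 7
```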
In contrast, with nfsReXmitSec = 1, the distribution of how long the ROMs take to finish has no significant gaps or fluctuations (on the scale of 1 second, the resolution at which the completion times were recorded here). Meanwhile, disk, CPU, and memory activity of the server are correspondingly evenly distributed (also on this scale) and close to saturation.
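For the two nfsReXmitSec = 1 trials the timing data are complete, so simple summary numbers can be derived from them. The following sketch is only an illustration of that arithmetic; the helper summarize is hypothetical, and the aggregate rate is a rough figure because completion times were resolved only to 1 second.

```python
times = list(range(9))  # seconds after common start, 0..8

# Number of ROMs finishing at each second, copied from the timing data.
trials = {
    "first":  [0, 0, 5, 28, 22, 29, 9, 7, 0],
    "second": [0, 0, 4, 25, 19, 36, 11, 5, 0],
}

KB_PER_ROM = 700  # size of each write operation, in kB


def summarize(counts):
    """Return (ROM count, mean finish time in s, last finish time in s)."""
    total = sum(counts)
    mean = sum(t * n for t, n in zip(times, counts)) / total
    last = max(t for t, n in zip(times, counts) if n > 0)
    return total, mean, last


for name, counts in trials.items():
    total, mean, last = summarize(counts)
    rate = total * KB_PER_ROM / last  # aggregate kB/s until the last ROM is done
    print(f"{name}: {total} ROMs, mean {mean:.2f} s, last {last} s, {rate:.0f} kB/s")
```

Both trials account for all 100 ROMs, with mean finish times of 4.30 s and 4.40 s and the last ROM finishing at 7 s, which corresponds to roughly 10000 kB/s aggregate.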
For read operations, e.g. one particular file stored at the server being read by all ROMs during configuration, no noticeable dependence of the time to completion on the value of nfsReXmitSec was found. Letting all ROMs read a file of 2 MB typically took about 10 seconds, and a file of 32 MB about 60 seconds.
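The read timings can be turned into rough aggregate rates. The sketch below assumes that "all ROMs" means the same 100 ROMs as in the write test, which the text does not state explicitly; the names used are illustrative.

```python
N_ROMS = 100  # assumption: same number of ROMs as in the write test

# (file size in MB, typical time in s for all ROMs to finish reading)
read_tests = [(2, 10), (32, 60)]

# Total data delivered to all ROMs, divided by the time to completion.
rates = [size_mb * N_ROMS / seconds for size_mb, seconds in read_tests]

for (size_mb, seconds), rate in zip(read_tests, rates):
    print(f"{size_mb} MB file: about {rate:.1f} MB/s delivered in total")
```

Under that assumption the 2 MB file corresponds to about 20 MB/s and the 32 MB file to about 53 MB/s delivered in total.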
Tools:
Using the system commands nfsstat, netstat, sar, mpstat and vmstat,
the script "odfStats" records statistics about the file system, network,
disk, CPU and memory activities (concurrently, once per second) in
five ASCII files: "odfStats_ Concise extracts from these files can be obtained using the scripts
"odfStats_nfs", "odfStats_ip", "odfStats_iostat", "odfStats_sar", and
"odfStats_mpstat". (The files "odfStats_vmstat.
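The contents of "odfStats" are not reproduced here. As a rough illustration of the idea of recording several statistics concurrently, once per second, into separate ASCII files, the following is a minimal sketch under stated assumptions: the functions sample and record_all and the output file names are hypothetical and are not taken from the actual script.

```python
import subprocess
import threading
import time
from datetime import datetime


def sample(cmd, out_path, samples, interval=1.0):
    """Run `cmd` `samples` times, `interval` seconds apart, appending to out_path."""
    with open(out_path, "a") as out:
        for _ in range(samples):
            started = time.monotonic()
            result = subprocess.run(cmd, capture_output=True, text=True)
            # Prefix each sample with a timestamp line so files can be aligned later.
            out.write(f"# {datetime.now().isoformat()}\n{result.stdout}")
            out.flush()
            # Sleep for whatever is left of the interval after the command ran.
            time.sleep(max(0.0, interval - (time.monotonic() - started)))


def record_all(commands, samples, interval=1.0):
    """Sample several commands concurrently: one thread and one file per command.

    `commands` maps an output file path to the command (argument list) to run.
    """
    threads = [
        threading.Thread(target=sample, args=(cmd, path, samples, interval))
        for path, cmd in commands.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

A call such as record_all({"sample_nfs.log": ["nfsstat", "-s"], "sample_net.log": ["netstat", "-i"]}, samples=10) would poll the cumulative NFS server and network interface counters ten times, one second apart; rates then follow from the differences between consecutive samples.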