StreamDevice format converters work very similarly to the format
converters of the C functions printf() and scanf().
However, StreamDevice provides more converters, and you can
also write your own.
Formats are specified in quoted strings as arguments of out or in commands.
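A format converter consists of the % character,
optionally a field name in parentheses (),
optionally flags out of the characters *# +0-?=,
optionally an integer width field,
optionally a period character (.) followed by an integer precision field (output only for most formats),
and a conversion character.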
The flags *# +0- work like in the C functions printf() and scanf().
The flags ? and = are extensions.
The * flag skips data in input formats.
Input is consumed and parsed, and a mismatch is an error, but the read data is dropped.
This is useful if the input contains more than one value.
Example: in "%*f%f"; reads the second floating point number.
The # flag may alter the format, depending on the converter (see below).
The ' ' (space) and + flags print a space or a + sign before positive numbers, where negative numbers would have a -.
The 0 flag says that numbers should be left-padded with 0 if width is larger than required.
The - flag specifies that output is left-justified if width is larger than required.
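
The printf-style output flags described above can be tried out quickly in Python, whose % operator follows the same conventions for these flags. This is only an illustration of the flag semantics, not StreamDevice code; the function name show_flags is made up for this sketch.

```python
def show_flags():
    """Illustrate the printf-style output flags with Python's % operator."""
    return {
        "plus": "%+d" % 5,          # + flag: sign before positive numbers
        "space": "% d" % 5,         # space flag: blank where a - would be
        "zero": "%05d" % 42,        # 0 flag: left-pad with zeros up to width
        "left": "%-5d|" % 42,       # - flag: left-justify within width
        "alt_hex": "%#010x" % 255,  # # flag with 0 padding: leading 0x
    }
```

The last entry corresponds to a format such as out "%#010x";.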
The ? flag makes failing input conversions succeed with a default zero value (0, 0.0, or "", depending on the format type).
The = flag compares input with the current value.
It is only allowed in input formats.
Instead of reading a new value from input, the current value is formatted (like for output) and then compared to the input.
in "%f";              | float
out "%(HOPR)7.4f";    | the HOPR field as 7 char float with precision 4
out "%#010x";         | 0-padded 10 char alternate hex (with leading 0x)
in "%[_a-zA-Z0-9]";   | string of chars out of a charset
in "%*i";             | skipped integer number
in "%?d";             | decimal number or nothing (read as 0)
in "%=.3f";           | compare input to the current value formatted as a float with precision 3
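
The behaviour of the = flag can be pictured with a small Python sketch: the current value is formatted as for output and the result is compared with the received bytes. The function name compare_input and its arguments are hypothetical and only model the description above; they are not part of StreamDevice.

```python
def compare_input(current_value, fmt, received):
    """Model of the = flag: format the current value, then compare with input.

    Returns True when the received text starts with the formatted value.
    """
    expected = fmt % current_value  # formatted like for output
    return received.startswith(expected)
```

With in "%=.3f"; and a current value of 1.5, the input would have to start with 1.500 for the comparison to succeed.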
Every conversion character corresponds to one of the data types DOUBLE, LONG, ENUM, or STRING.
In contrast to printf() and scanf(), it is not required to specify a variable for the conversion.
The variable is typically the VAL or RVAL field of the record, selected automatically depending on the data type.
Not all data types make sense for all record types.
Refer to the description of supported record types for details.
StreamDevice makes no distinction between float and double, nor between short, int, and long values.
Thus, data type modifiers like l or h do not exist in StreamDevice formats.
To use other fields of the record, or even fields of other records on the same IOC, for the conversion, write the field name in parentheses directly after the %.
For example, out "%(EGU)s"; outputs the EGU field formatted as a string.
Use in "%(otherrecord.RVAL)f"; to write the floating point input value into the RVAL field of otherrecord.
If no field is given for another record, .VAL is assumed.
When a record name conflicts with a field name, use .VAL explicitly.
This feature is very useful when one line of input contains many values that should be distributed to many records.
If otherrecord is passive and the field has the PP attribute (see Record Reference Manual), the record will be processed.
It is your responsibility to ensure that the data type of the record field is compatible with the data type of the converter.
Note that using this syntax is far less efficient than using the default field.
At the moment, it is not possible to set otherrecord to an alarm state when anything fails.
Some formats are not actually converters.
They format data which is not stored in a record field, such as a checksum.
No data type corresponds to those pseudo-converters, and the %(FIELD) syntax cannot be used.
%f, %e, %E, %g, %G
Output: %f prints fixed point, %e prints exponential notation, and %g prints either fixed point or exponential notation, depending on the magnitude of the value.
%E and %G use E instead of e to separate the exponent.
With the # flag, output always contains a period character.
Input: All these formats are equivalent. Leading whitespace is skipped.
With the # flag, additional whitespace between sign and number is accepted.
When a maximum field width is given, leading whitespace only counts toward the field width when the space flag is given.
%d, %i, %u, %o, %x, %X
Output: %d and %i print signed decimal, %u unsigned decimal, %o unsigned octal, and %x or %X unsigned hexadecimal.
%X uses upper case letters.
With the # flag, octal values are prefixed with 0 and hexadecimal values with 0x or 0X.
Input: %d matches signed decimal, %u matches unsigned decimal, and %o matches unsigned octal.
%x and %X both match upper or lower case unsigned hexadecimal.
Octal and hexadecimal values can optionally be prefixed.
%i matches any integer in decimal, or prefixed octal or hexadecimal notation.
Leading whitespace is skipped.
With the - flag, negative octal and hexadecimal values are accepted.
With the # flag, additional whitespace between sign and number is accepted.
When a maximum field width is given, leading whitespace only counts toward the field width when the space flag is given.
%s, %c
Output: %s prints a string.
If precision is specified, this is the maximum string length.
%c is a LONG format in output, printing one character!
Input: %s matches a sequence of non-whitespace characters and %c matches a sequence of non-null characters.
The maximum string length is given by width.
The default width is infinite for %s and 1 for %c.
Leading whitespace is skipped with %s, except when the space flag is given, but never with %c.
The empty string matches.
With the # flag, %s matches a sequence of non-null characters instead of non-whitespace characters.
%[charset]
This is an input-only format.
It matches a sequence of characters from charset.
If charset starts with ^, the format matches all characters not in charset.
Leading whitespace is not skipped.
Example: %[_a-z] matches a string consisting entirely of _ (underscore) or letters from a to z.
%{string0|string1|...}
This format maps an unsigned integer value to a set of strings.
The value 0 corresponds to string0 and so on.
The strings are separated by |.
Example: %{OFF|STANDBY|ON} maps the string OFF to the value 0, STANDBY to 1, and ON to 2.
When using the # flag, integer values can be assigned to the strings using =.
Example: %#{neg=-1|stop=0|pos=1|fast=10}.
If one of the strings contains | or } (or = if the # flag is used), a \ must be used to escape the character.
In output, depending on the value, one of the strings is printed.
In input, if any of the strings matches, the value is set accordingly.
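
As a rough model of this mapping, the following Python sketch builds the string/value pairs for both the plain and the # form and uses them in both directions. All function names are hypothetical, backslash escapes are ignored, and the "first string that matches wins" rule in enum_input is an assumption of this sketch, not a statement about the StreamDevice implementation.

```python
def parse_enum_spec(spec, alternate=False):
    """Turn 'OFF|STANDBY|ON' (or 'neg=-1|stop=0' with alternate=True) into pairs."""
    pairs = []
    for index, item in enumerate(spec.split("|")):
        if alternate:
            # '#' form: each string carries an explicit integer value.
            name, _, value = item.partition("=")
            pairs.append((name, int(value)))
        else:
            # Plain form: values count up from 0.
            pairs.append((item, index))
    return pairs


def enum_output(pairs, value):
    """Output direction: print the string that belongs to the value."""
    for name, number in pairs:
        if number == value:
            return name
    raise ValueError("no string for value %d" % value)


def enum_input(pairs, text):
    """Input direction: return the value of the first string the input starts with."""
    for name, number in pairs:
        if text.startswith(name):
            return number
    raise ValueError("no string matches the input")
```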
%b, %Bzo
This format prints or scans an unsigned integer represented as a binary string (one character per bit).
The %b format uses the characters 0 and 1.
With the %B format, you can choose two other characters to represent zero and one.
With the # flag, the bit order is changed to little endian, i.e. least significant bit first.
Examples: %B.! or %B\x00\xff.
%B01 is equivalent to %b.
In output, if width is larger than the number of significant bits, then the flag 0 means that the value should be padded with the chosen zero character instead of spaces.
If precision is set, it specifies the number of significant bits.
Otherwise, the highest 1 bit defines the number of significant bits.
In input, leading spaces are skipped. A maximum of width characters is read. Conversion stops with the first character that is not the zero or the one character.
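
The following Python sketch models the bit-string conversion in both directions. It is a simplified illustration with made-up function names: width padding and the skipping of leading spaces are left out, and printing a single zero character for the value 0 is an assumption.

```python
def binary_output(value, zero="0", one="1", precision=None, little_endian=False):
    """Format an unsigned integer as a string with one character per bit."""
    # Without precision, the highest 1 bit defines the number of bits
    # (assumption: the value 0 still prints one zero character).
    bits = max(value.bit_length(), 1) if precision is None else precision
    chars = [one if (value >> i) & 1 else zero for i in range(bits)]  # LSB first
    if not little_endian:
        chars.reverse()  # default: most significant bit first
    return "".join(chars)


def binary_input(text, zero="0", one="1", little_endian=False):
    """Parse bit characters until the first character that is neither zero nor one."""
    digits = []
    for ch in text:
        if ch not in (zero, one):
            break
        digits.append(1 if ch == one else 0)
    if little_endian:
        digits.reverse()
    value = 0
    for digit in digits:
        value = (value << 1) | digit
    return value
```

For instance, binary_output(6, ".", "!") corresponds to the format %B.! applied to the value 6.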
%r
The raw converter does not really "convert".
A signed or unsigned integer value is written or read in the internal (usually two's complement) representation of the computer.
The normal byte order is big endian, i.e. most significant byte first.
With the # flag, the byte order is changed to little endian, i.e. least significant byte first.
With the 0 flag, the value is unsigned, otherwise signed.
In output, the prec (or sizeof(long), whichever is less) least significant bytes of the value are sign extended or zero extended (depending on the 0 flag) to width bytes.
In input, width bytes are read and put into the value.
If width is larger than the size of a long, only the least significant bytes are used.
If width is smaller than the size of a long, the value is sign extended or zero extended, depending on the 0 flag.
Example: out "%.2r";
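
The byte-level effect of the raw converter can be sketched in Python as follows. The function names are hypothetical, the sizeof(long) limit is not modelled, and letting width default to prec is an assumption of this sketch rather than documented behaviour.

```python
def raw_output(value, prec, width=None, little_endian=False, unsigned=False):
    """Write the prec least significant bytes of value, extended to width bytes."""
    width = prec if width is None else width  # assumption: width defaults to prec
    low = value & ((1 << (8 * prec)) - 1)     # keep the prec least significant bytes
    if not unsigned and low >> (8 * prec - 1):
        low -= 1 << (8 * prec)                # top bit set: treat as negative (sign extend)
    data = (low & ((1 << (8 * width)) - 1)).to_bytes(width, "big")
    return data[::-1] if little_endian else data


def raw_input(data, little_endian=False, unsigned=False):
    """Read all bytes of data as a signed (default) or unsigned integer."""
    order = "little" if little_endian else "big"
    return int.from_bytes(data, order, signed=not unsigned)
```

Under these assumptions, out "%.2r"; with the value -2 would send the two bytes 0xFF 0xFE.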
%R
The raw converter does not really "convert".
A float or double value is written or read in the internal (maybe IEEE) representation of the computer.
The normal byte order is big endian, i.e. most significant byte first.
With the # flag, the byte order is changed to little endian, i.e. least significant byte first.
The width must be 4 (float) or 8 (double). The default is 4.
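
Assuming the internal representation is IEEE 754, the resulting bytes can be reproduced with Python's struct module. The function names below are made up for this illustration.

```python
import struct


def raw_float_output(value, width=4, little_endian=False):
    """Pack a number as a 4 byte float or 8 byte double (IEEE 754 assumed)."""
    if width not in (4, 8):
        raise ValueError("width must be 4 (float) or 8 (double)")
    code = ("<" if little_endian else ">") + ("f" if width == 4 else "d")
    return struct.pack(code, value)


def raw_float_input(data, little_endian=False):
    """Unpack 4 or 8 bytes into a Python float."""
    if len(data) not in (4, 8):
        raise ValueError("need 4 or 8 bytes")
    code = ("<" if little_endian else ">") + ("f" if len(data) == 4 else "d")
    return struct.unpack(code, data)[0]
```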
%D
Packed BCD is a format where each byte contains two binary coded decimal digits (0 ... 9).
Thus a BCD byte is in the range from 0x00 to 0x99.
The normal byte order is big endian, i.e. most significant byte first.
With the # flag, the byte order is changed to little endian, i.e. least significant byte first.
The + flag defines that the value is signed, using the upper half of the most significant byte for the sign.
Otherwise the value is unsigned.
In output, precision decimal digits are printed in at least width output bytes.
Signed negative values have 0xF in their most significant half byte followed by the absolute value.
In input, width bytes are read. If the value is signed, a one in the most significant bit is interpreted as a negative sign. Input stops with the first byte (after the sign) that does not represent a BCD value, i.e. where either the upper or the lower half byte is larger than 9.
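
A minimal Python sketch of packed BCD in the default big endian byte order may help to picture the encoding. The function names are hypothetical, the precision field and the # flag are not modelled, and the value is assumed to fit into the given width.

```python
def bcd_output(value, width, signed=False):
    """Encode an integer as width bytes of packed BCD, most significant byte first."""
    digits = abs(value)
    out = bytearray(width)
    for i in range(width - 1, -1, -1):  # fill from the least significant byte
        digits, pair = divmod(digits, 100)
        out[i] = ((pair // 10) << 4) | (pair % 10)
    if signed and value < 0:
        out[0] |= 0xF0  # 0xF in the most significant half byte marks a negative value
    return bytes(out)


def bcd_input(data, signed=False):
    """Decode packed BCD, stopping at the first byte that is not valid BCD."""
    negative = False
    value = 0
    for i, byte in enumerate(data):
        hi, lo = byte >> 4, byte & 0x0F
        if i == 0 and signed and hi & 0x8:
            negative, hi = True, 0  # most significant bit set: negative sign
        if hi > 9 or lo > 9:
            break
        value = value * 100 + hi * 10 + lo
    return -value if negative else value
```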
%<checksum>
This is not a normal "converter", because no user data is converted.
Instead, a checksum is calculated from the input or output.
The width field is the byte number from which to start calculating the checksum.
Default is 0, i.e. the first byte of the input or output of the current command.
The last byte is prec bytes before the checksum (default 0).
For example, in "abcdefg%<xor>" the checksum is calculated from abcdefg, but in "abcdefg%2.1<xor>" only from cdef.
Normally, multi-byte checksums are in big endian byte order, i.e. most significant byte first.
With the # flag, the byte order is changed to little endian, i.e. least significant byte first.
The 0 flag changes the checksum representation from binary to hexadecimal ASCII (2 bytes per checksum byte).
In output, the checksum is appended.
In input, the next byte or bytes must match the checksum.
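
How width and prec select the bytes that enter the checksum can be sketched in Python. The helper names checksum_range and xor_checksum are made up for this illustration; only the plain xor checksum is shown.

```python
def checksum_range(data, width=0, prec=0):
    """Return the bytes covered by the checksum.

    width is the index of the first byte, prec the number of bytes
    skipped directly before the checksum.
    """
    return data[width:len(data) - prec]


def xor_checksum(data):
    """XOR of all bytes."""
    result = 0
    for byte in data:
        result ^= byte
    return result
```

With the data abcdefg, checksum_range(b"abcdefg", 2, 1) yields cdef, matching the %2.1<xor> example above.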
Available checksum names:
%<sum> or %<sum8>
%<sum16>
%<sum32>
%<negsum>, %<nsum>, %<-sum>, %<negsum8>, %<nsum8>, or %<-sum8>
%<negsum16>, %<nsum16>, or %<-sum16>
%<negsum32>, %<nsum32>, or %<-sum32>
%<notsum> or %<~sum>
%<xor>
%<xor7>
%<crc8>
%<ccitt8>
%<crc16>
%<crc16r>
%<ccitt16>
%<ccitt16a>
%<ccitt16x>, %<crc16c>, or %<xmodem>
%<crc32>
%<crc32r>
%<jamcrc>
%<adler32>
%<hexsum8>
%/regex/
This input-only format matches Perl compatible regular expressions (PCRE). It is only available if a PCRE library is installed.
If PCRE is not available for your host or cross architecture, download the source code from www.pcre.org and try my EPICS compatible Makefile to compile it like a normal EPICS application.
The Makefile is known to work with EPICS 3.14.8 and PCRE 7.2.
In your RELEASE file, define the variable PCRE so that it points to the install location of PCRE.
If PCRE is already installed on your system, use the variables PCRE_INCLUDE and PCRE_LIB instead to provide the install directories of pcre.h and the library.
If you have PCRE installed in different locations for different (cross) architectures, define the variables in RELEASE.Common.<architecture> instead of the global RELEASE file.
If the regular expression is not anchored, i.e. does not start with ^, leading non-matching input is skipped.
A maximum of width bytes is matched, if specified.
If prec is given, it specifies the sub-expression whose match is returned.
Otherwise the complete match is returned.
In any case, the complete match is consumed from the input buffer.
If the expression contains a /, it must be escaped.
Example: %.1/<title>(.*)<\/title>/ returns the title of an HTML page, skips anything before the <title> tag, and leaves anything after the </title> tag in the input buffer.
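
The matching rules can be approximated with Python's re module, which is not PCRE but behaves the same for a simple pattern like this one. The function regex_input is a hypothetical helper; the width limit is not modelled.

```python
import re


def regex_input(pattern, buffer, group=0):
    """Search buffer for pattern; return (selected match, remaining input).

    group plays the role of prec: 0 selects the complete match,
    n selects the n-th sub-expression.
    """
    match = re.search(pattern, buffer)  # unanchored: leading input is skipped
    if match is None:
        raise ValueError("input does not match")
    # The complete match is consumed, whichever group is returned.
    return match.group(group), buffer[match.end():]
```

Note that the / inside the pattern needs no escaping here, because Python does not use / as a delimiter.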
%m
This exotic and experimental format matches numbers in the format [sign] mantissa sign exponent, e.g. +123-4 meaning 123e-4 or 0.0123. Mantissa and exponent are decimal integers.
The sign of the mantissa is optional.
Compared to the standard %e format, this format does not contain the characters . and e.
Output formatting is ambiguous (e.g. 123-4 versus 1230-5). I chose the following convention:
Format precision defines the number of digits in the mantissa.
No leading '0' in the mantissa (except for 0.0 of course).
The number of digits in the exponent is at least 2.
Format flags +, -, and space are supported in the usual way (always sign, left justified, space instead of + sign).
Flags # and 0 are unsupported.
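
The input side of this format can be modelled with a short Python sketch. The function name parse_mantissa_exponent is made up, and the sketch assumes that no whitespace occurs inside the number.

```python
import re


def parse_mantissa_exponent(text):
    """Parse '[sign] mantissa sign exponent', e.g. '+123-4' for 123e-4."""
    match = re.match(r"([+-]?\d+)([+-]\d+)", text)
    if match is None:
        raise ValueError("not in mantissa/exponent format")
    mantissa, exponent = int(match.group(1)), int(match.group(2))
    # Build the value through the usual e notation to avoid rounding surprises.
    return float("%de%d" % (mantissa, exponent))
```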
%T(timeformat)
This format reads or writes timestamps and converts them to a double number.
The value represents the number of seconds since 1970 (the UNIX epoch).
The precision of a double is large enough for microseconds (but not for nanoseconds).
This format is probably best used in combination with a redirection to the TIME field. In this case, the value is converted to EPICS timestamps (seconds since 1990 and nanoseconds).
The timestamp format understands the usual converters that the C function strftime() understands. In addition, fractions of a second can be specified and the time zone can be set in the format string.
Example: %(TIME)T(%d %b %Y %H:%M:%.3S %z) may print something like 3 Sep 2010 15:45:59 +0200.
Fractions of a second can be specified as %.nS (seconds with n fractional digits), as %0nf or %nf (n fractional digits), or as %N (nanoseconds).
In input, n is the maximum number of digits parsed; there may actually be fewer digits in the input.
If n is not specified (%.S or %f), a default value of 6 is used.
In input, the time zone can be specified in the format like %+hhmm or %-hhmm for cases where the parsed time stamp does not specify the time zone, where hhmm is a 4 digit number specifying the offset in hours and minutes.
In output, the system function strftime() is used to format the time. There may be differences in the implementation between operating systems.
In input, StreamDevice uses its own implementation because many systems are missing the strptime() function and additional formats are supported.
Day of the week can be parsed but is ignored because the information is redundant when used together with day, month and year and more or less useless otherwise. No check is done for consistency.
Because of the complexity of the problem, locales are not supported. Thus, only the English month names can be used (week day names are ignored anyway).
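
The conversion from the double value (seconds since 1970) to an EPICS timestamp (seconds since 1990 plus nanoseconds) can be sketched in Python. The function name unix_to_epics is hypothetical, and both epochs are assumed to be in UTC; this illustrates the arithmetic only, not the StreamDevice implementation.

```python
from datetime import datetime, timezone

# Seconds between the UNIX epoch (1970) and the EPICS epoch (1990), both in UTC.
EPICS_EPOCH_OFFSET = int(datetime(1990, 1, 1, tzinfo=timezone.utc).timestamp())


def unix_to_epics(seconds):
    """Split seconds since 1970 into (seconds since 1990, nanoseconds)."""
    whole = int(seconds // 1)
    nanos = int(round((seconds - whole) * 1e9))
    if nanos == 1_000_000_000:  # rounding carried into the next second
        whole, nanos = whole + 1, 0
    return whole - EPICS_EPOCH_OFFSET, nanos
```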
Dirk Zimoch, 2011