******************************************************************
* *
* Stanford Data Center *
* Stanford University *
* Stanford, Ca. 94305 *
* *
* (c)Copyright 1994 by the Board of Trustees of the *
* Leland Stanford Junior University *
* All rights reserved *
* Printed in the United States of America *
* *
******************************************************************
SPIRES (TM) is a trademark of Stanford University.
How data is arranged is an integral part of any computer application. The input data must be presented to the computer in a form it can interpret; similarly, the data output by the computer must be arranged so that it can be read and understood by the user.
In the SPIRES data base management system, input and output data is frequently processed through "formats", programs that may gather and arrange the data for the computer to place in the SPIRES file or for the computer to display to the user. SPIRES system formats, such as the standard format and the prompting input format called $PROMPT, are available for use with any SPIRES data base. However, many users want or need to create custom formats designed especially for the records in a particular subfile.
As a SPIRES user, you may want to develop custom formats for many different reasons, but each reason can probably be generalized to one of the following:
- your requirements cannot be handled by one of the system formats;
- a custom format is probably cheaper to use than a generalized system format that can do the same tasks; or
- you want to learn how to write SPIRES formats.
This manual is designed to teach you how to create your own custom SPIRES formats. Before reading it, however, you should be familiar with the material in the SPIRES primer "A Guide to Data Base Development" or in the reference manual "SPIRES Searching and Updating". Although reading the SPIRES primer "A Guide to Output Formats" is also highly recommended as a good background to this manual, it is not essential. That document introduces the SPIRES formats language and teaches you how to design output formats with the most popular features, following one example step-by-step through the format design process.
Besides teaching you how to create formats, this manual serves as the primary reference source on formats -- material from it is displayed when you issue the SPIRES command EXPLAIN to find out more information online about formats topics. If you are just learning formats, the size of this manual might be overwhelming (another reason why the primer serves as a better introduction to the topic). Note that sections whose titles are preceded by "(*)" usually consist of specialized material describing how to handle rare and unusual formatting problems. The (*) prefix indicates that reading the section is not necessary for most users.
The SPIRES Data Base Management System is developed and maintained by the Data Base Management Division of the Center for Information Technology, Stanford University. The formats language and processor were designed by Bill Kiefer. This manual was written by John Klemm, who thanks Becky Morton, Dick Guertin, Jack Hamilton, June Genis and especially Bill Kiefer, John Sack, Lynn McRae and Sandy Laws for their help in its preparation.
In this manual, examples of sessions are shown with prompts and messages from SPIRES as they appear on the terminal, often in uppercase; commands you type are shown in lowercase. (You may use upper- or lowercase interchangeably when you are actually using SPIRES.) For example:
-OK TO CLEAR? ok
Here SPIRES types "-OK TO CLEAR?" and you type "ok".
In formal command syntax descriptions, uppercase letters denote command verbs or other command elements to be entered exactly as shown; a value for lowercase terms and characters must be supplied by you. For example:
SELECT subfile-name
To use this particular command, you type the command verb "SELECT" with the name of the desired subfile, for instance, "Restaurant":
-> select restaurant
where "->" is the prompt from SPIRES.
Brackets ([]) denote optional parts of a command's syntax. Braces ({ }) indicate that you must specify one (and only one) of the alternatives within the braces. Within the braces or brackets, a vertical line (|) separates possible choices. Neither brackets nor braces are to be typed as part of the command. For example:
TYPE [PAUSE|KEEP]
could be entered as
TYPE or TYPE PAUSE or TYPE KEEP
depending on whether you want one of the options.
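The notation can be checked mechanically. The following Python sketch is not part of SPIRES; it is a hypothetical helper (the name expand_syntax is an assumption made for illustration) that expands a one-level syntax description into the command forms it allows. It handles only a single, un-nested level of brackets and braces.

```python
import re
from itertools import product


def expand_syntax(pattern: str) -> list[str]:
    """Expand one-level [a|b] (optional) and {a|b} (required) groups."""
    # Split the description into literal text, [...] groups and {...} groups.
    tokens = re.findall(r"\[[^\]]*\]|\{[^}]*\}|[^\[\{]+", pattern)
    choices = []
    for tok in tokens:
        if tok.startswith("["):
            # Brackets: any one alternative, or nothing at all.
            choices.append([""] + [alt.strip() for alt in tok[1:-1].split("|")])
        elif tok.startswith("{"):
            # Braces: exactly one of the alternatives must be chosen.
            choices.append([alt.strip() for alt in tok[1:-1].split("|")])
        else:
            # Literal text is entered exactly as shown.
            choices.append([tok.strip()])
    # Combine one choice from each position, dropping the empty ones.
    return [" ".join(part for part in combo if part) for combo in product(*choices)]


print(expand_syntax("TYPE [PAUSE|KEEP]"))
```

For "TYPE [PAUSE|KEEP]" the helper returns the three forms listed above: TYPE, TYPE PAUSE and TYPE KEEP.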
Sections and subsections of this manual whose titles are preceded by an asterisk in parentheses -- (*) -- may be considered optional reading at that point. Usually the material provides details that most users will not need about how to handle uncommon situations, or technical information about how SPIRES internally handles some particular piece of formats code.
This chapter, in a general way, describes formats conceptually and technically. A couple of simple examples of formats and their "format definitions" will also be examined, to show some of the capabilities of SPIRES formats. This chapter will also describe the structure of the rest of the manual.
Conceptually, a format is a design for the arrangement of data. The primary purpose of the design is to facilitate the interpretation of the data, whether it is a person or a computer doing the interpreting. For example, the SPIRES standard format, in which goal record data is presented in the form
element-name = value;
tells you (on output) or SPIRES (on input) the name of each piece of data, where the data begins, where it ends, and thus, what it is. The format is a template for mapping data, specifying what data elements go where and suggesting relationships between the elements. You cannot make sense out of any data presented to you unless you understand how it is "formatted", i.e., how it is arranged.
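To make the idea of a format as a two-way template concrete, here is a minimal Python sketch of the "element-name = value;" layout. It is an illustration only, not SPIRES code: the function names are hypothetical, and the sketch deliberately ignores the quoting of special characters and the indentation of structures that the real standard format provides.

```python
def to_standard_format(record):
    """Output direction: render (element-name, value) pairs as text lines."""
    return [f"{name} = {value};" for name, value in record]


def parse_standard_line(line):
    """Input direction: recover the element name and value from one line."""
    # The trailing semicolon marks where the value ends.
    name, _, value = line.rstrip().rstrip(";").partition(" = ")
    return name, value


record = [
    ("NAME", "Horace Greeley"),
    ("ADDRESS.LINE", "47 Domingo West"),
    ("ADDRESS.LINE", "Youngman, CA 94922"),
]
for line in to_standard_format(record):
    print(line)
```

The same layout serves both directions: the element name says what the data is, the equals sign shows where the value begins, and the semicolon shows where it ends.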
Technically, in SPIRES a format is a program that processes data, usually as the data is placed inside the data base ("input formats") or is displayed from the data base ("output formats"). In its most basic form, the program tells SPIRES the source of the data and its destination. However, many other capabilities are based on that foundation. The program may, for example, modify or test the data, if desired. In addition, it may take advantage of standard programming facilities, such as looping, branching or subroutines.
The SPIRES formats language has a very rich and eclectic vocabulary, including pieces from file definition (e.g., processing rules), protocols (e.g., labels, variable groups and procedural statements) and other parts of SPIRES, such as system variables and functions.
A format program, written in the formats language, is called a "format definition". Just as a SPIRES file definition is a goal record in the public subfile FILEDEF, a format definition is a goal record in the subfile FORMATS, entered in the standard SPIRES format. Because of this, the format program structure is guided by the structure of FORMATS goal records; and at the detail level, much of the program statement syntax is affected by the rules for data entry via the standard SPIRES format.
In the next sections, examples of input and output formats will be briefly discussed in order to demonstrate some of these points.
Before stating the specific capabilities of formats, let's examine the procedure for creating two simple formats (one input, one output) for a subfile whose goal records consist of names and addresses. This procedure will introduce some concepts and terminology that will simplify the discussions of formatting capabilities later. [See B.1.1, C.1.1.]
The steps involved in creating a SPIRES format are:
1) design the format (both the layout and the program);
2) write the format definition;
3) add the definition to the FORMATS subfile;
4) compile the format definition; and
5) test, modify and use the format.
Each of those steps is discussed below as we create an output format for the subfile.
The first step is the conceptual one. Generally, the first question to ask is, what is the purpose of the format? In this particular case, the purpose of the format is to put the names and addresses into our active file so that a LABELER program can read the data, converting it into mailing labels. Often a specific need, such as mailing labels, creates the purpose, but a more general goal, such as the desire to display the data more attractively or to make data entry easier, can certainly be a reasonable purpose.
The LABELER program we want to use has certain requirements about the data input to it:
- Each record should be preceded by an exclamation point, which serves as a delimiter.
- On the next line should be the NAME element; on the next lines should be the occurrences of the ADDRESS.LINE element.
- No record may use more than five lines, including the exclamation point.
- No line may exceed 36 characters in length.
So a record to be displayed in this LABELER format would look like this:
!                    <-- the exclamation-point delimiter
Horace Greeley       <-- the NAME element value
47 Domingo West      <-- the first occurrence of ADDRESS.LINE
Youngman, CA 94922   <-- the next occurrence of ADDRESS.LINE
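The LABELER requirements listed above are simple enough to state as a check. The Python sketch below is hypothetical (the LABELER program's real behavior is not documented here, and the function name is an assumption); it only tests a list of lines against the three stated rules about the delimiter, the line count and the line length.

```python
MAX_LINES = 5   # lines per record, including the "!" delimiter line
MAX_WIDTH = 36  # characters per line


def check_labeler_record(lines):
    """Return a list of problems; an empty list means the record is acceptable."""
    problems = []
    if not lines or lines[0].rstrip() != "!":
        problems.append("record must begin with a '!' delimiter line")
    if len(lines) > MAX_LINES:
        problems.append(f"record uses {len(lines)} lines; limit is {MAX_LINES}")
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_WIDTH:
            problems.append(
                f"line {number} is {len(line)} characters; limit is {MAX_WIDTH}"
            )
    return problems


sample = ["!", "Horace Greeley", "47 Domingo West", "Youngman, CA 94922"]
print(check_labeler_record(sample))
```

A check like this is useful mainly as a statement of the design constraints: the format to be written must never produce a record that fails it.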
Our SPIRES file was designed partially around these requirements -- for example, each line of the address as it is to appear on a mailing label is a separate occurrence of the ADDRESS.LINE element. The NAME element is only allowed to have one occurrence, while the ADDRESS.LINE element may have up to three. Also, both elements are limited to 36 characters in length per occurrence. These goal record characteristics, specified in the file definition, suggest that file designers often create files with specific formats in mind from the beginning -- format design requirements may affect some aspects of the file design.
At this point, it is a good idea to look at the format as a template laid out on a grid, such as a piece of graph paper. The grid will represent one goal record displayed via the format. The size of the grid is determined here by the LABELER program: 5 rows by 36 columns is the maximum size of a record. Below is a representation of that grid, each "." representing one position in the grid:
....................................
....................................
....................................
....................................
....................................
In formats terminology, this 5x36 grid is called a "frame". (Later, the term "frame" will also mean the part of the program that moves the data to and from the grid.) Within the frame, we can position element values and textual strings as we like: they can be centered or left- or right-adjusted within the frame, or they can be restricted to certain columns, or they can wrap around from one row to the next, and so forth. (The "and so forth" will be discussed later.) We will position the exclamation point and the elements in a very simple manner, all left-adjusted within the frame:
!...................................
(NAME)..............................
(ADDRESS.LINE)......................
(ADDRESS.LINE)......................
(ADDRESS.LINE)......................
The strings in parentheses indicate the elements, whose values would appear there instead.
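If it helps to think of the frame in programming terms, the grid can be modeled as a small two-dimensional array. The Python sketch below is an illustration under stated assumptions, not a description of how SPIRES stores a frame: the function names are hypothetical, and values that would run past the right edge are simply truncated here, whereas a real format may be told to wrap them.

```python
ROWS, COLS = 5, 36  # the frame dimensions dictated by the LABELER program


def new_frame(rows=ROWS, cols=COLS):
    """Create an empty grid; None marks a position with nothing in it."""
    return [[None] * cols for _ in range(rows)]


def put(frame, row, text, col=0):
    """Place text left-adjusted at (row, col), truncating at the frame edge."""
    width = len(frame[row])
    for offset, ch in enumerate(text[: width - col]):
        frame[row][col + offset] = ch


def render(frame, blank="."):
    """Show the grid as text, with '.' standing for each unused position."""
    return ["".join(blank if ch is None else ch for ch in row) for row in frame]


frame = new_frame()
put(frame, 0, "!")
put(frame, 1, "Horace Greeley")
put(frame, 2, "47 Domingo West")
put(frame, 3, "Youngman, CA 94922")
rendered = render(frame)
print("\n".join(rendered))
```

The last row stays empty for this record because it has only two occurrences of ADDRESS.LINE.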
External specifications guided the design of this particular format; for another format, you might have more creative freedom. Whatever the situation, however, it is recommended that you lay out your format design in a grid such as the one above, especially when you are just beginning to write format definitions. Some more specific guidelines for designing formats will be presented later.
The next step is to use the format definition language to write a format definition. The format definition is both a program and a goal record, and the format language reflects this dual role. When we treat it as a program, we say it consists of instructions, or "statements"; when we treat it as a goal record, it consists of "elements", which must follow the standard rules for record input. (In general, we will refer to the elements in a format definition as "statements", to avoid confusion with the elements in records of your subfile to be handled by the format.) Below is the format definition for the LABELER format, as collected in the active file:
1. ID = GQ.JNK.ADDRESS.LABELS;
2. FILE = GQ.JNK.ADDRESS.BOOK;
3. RECORD-NAME = REC01;
4. FRAME-ID = LABEL.OUT;
5. DIRECTION = OUTPUT;
6. FRAME-DIM = 5,36;
7. USAGE = DISPLAY;
8. LABEL = DELIMITER;
9. VALUE = '!';
10. PUTDATA;
11. LABEL = NAME;
12. GETELEM;
13. PUTDATA;
14. LABEL = ADDRESS.LINE;
15. GETELEM;
16. PUTDATA;
17. LOOP;
18. FORMAT-NAME = LABELER;
19. FRAME-NAME = LABEL.OUT;
20. FRAME-TYPE = DATA;
We can break that format definition into three main sections:
Lines 1 through 3 identify the format definition. These statements provide information about the format definition itself and the file to which it applies. The ID element serves as the key of the format definition goal record when it is placed in the FORMATS subfile. The FILE and RECORD-NAME statements name the file and the particular record-type within the file for which the format is being created.
Lines 4 through 17 make up the Frame Definitions section. This section is usually the largest section in the format definition; most, if not all, of the instructions on how the goal record is to be processed are stated here. Although a format may have multiple frames, often one will suffice. Here, the single "frame definition" begins with some statements about the frame itself, including its name and dimensions. Then the remainder of the frame definition consists of "label groups", which specify the work to be done.
Each label group begins with a LABEL statement (which, by the way, has nothing to do with the fact that this format will be used to create mailing labels) and handles one value or one element. For example, the label group in lines 11 through 13 GETs the ELEMent named by the LABEL statement (NAME) and PUTs that DATA into the grid at a default location (the first column of the next row). Format execution within a frame begins with the first label group and proceeds from one to the next, executing the statements within each.
Label groups are often more complex, using more statements than are shown in this example. For instance, they may test the accessed element value and choose not to put it into the frame, or they may be used entirely to control the execution flow of the frame without accessing data elements or values at all.
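The execution model just described can be sketched in Python. This is an analogy only, not SPIRES code. The data structures and the function name are hypothetical, and the sketch assumes that LOOP repeats its label group once for each occurrence of the element, which the definition above suggests but does not spell out.

```python
def run_output_frame(label_groups, record, max_rows=5, max_cols=36):
    """Execute label groups in order, filling one row per value put."""
    rows = []
    for group in label_groups:
        if "value" in group:
            # A VALUE statement supplies a constant instead of an element.
            values = [group["value"]]
        else:
            # GETELEM: fetch the element named by the LABEL statement.
            occurrences = record.get(group["label"], [])
            # Assumption: LOOP repeats the group for every occurrence.
            values = occurrences if group.get("loop") else occurrences[:1]
        for value in values:
            # PUTDATA: default position is the first column of the next row.
            if len(rows) < max_rows:
                rows.append(value[:max_cols])
    return rows


LABEL_OUT = [
    {"label": "DELIMITER", "value": "!"},
    {"label": "NAME"},
    {"label": "ADDRESS.LINE", "loop": True},
]
record = {
    "NAME": ["Horace Greeley"],
    "ADDRESS.LINE": ["47 Domingo West", "Youngman, CA 94922"],
}
print("\n".join(run_output_frame(LABEL_OUT, record)))
```

Each dictionary in LABEL_OUT corresponds to one label group of the LABELER definition, in the same order.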
Lines 18 through 20 make up the Format Declaration section. In this section, the frames, which are the building blocks of the format, are put together and given a name used in the SET FORMAT command (the name specified in the FORMAT-NAME statement). When that command is issued, SPIRES fetches the frames under that FORMAT-NAME; SPIRES will execute them later when a command that uses this format is issued. (Remember that more than one frame definition may be specified in the format definition.)
Though only three sections were shown in this definition, many formats contain a fourth section, called "Vgroups", that defines user variables to be used during frame processing. Statements in the Frame Definitions and Format Declaration sections may assign, test, change and display variable values; these capabilities are used frequently in more complicated format applications.
Granted, the above presentation does not teach you how to write a format definition; that is, it does not teach you specifically how to convert a design on a grid into a format definition. It does, however, introduce some of the statements allowed and show you that some format definitions can be rather simple. The rest of the manual will teach you which formats statements to specify for a given design feature of your formats.
To compile a format definition, you must first place it in the FORMATS subfile. If the format definition shown above is in your active file, the procedure is simple:
COMMAND> spires
-Welcome to SPIRES-3 ... If in trouble, try 'HELP'.
-> select formats
-> add
->
SPIRES examines the format definition in your active file, and if it follows the rules for goal records in the FORMATS subfile (e.g., it has a value for the key element ID, and it has the proper elements in the Frame Definitions section), then the record is added to the subfile.
Once SPIRES has accepted your format's blueprint (that is, once you have successfully added the format definition to the FORMATS subfile), you may compile it:
-> select formats (if not already selected)
-> compile gq.jnk.address.labels
-Compiled format: LABELER
-Format definition compiled
->
The COMPILE command names the format definition in the FORMATS subfile that is to be compiled. SPIRES in effect "builds" the format from the definition, checking the syntax of the statements and creating the compiled code to be used by SPIRES when the format is invoked. If a syntax error is found by the compiler, an error message is issued, and you must correct the format definition record in the FORMATS subfile, and then try compiling again. When the format definition is compiled, the format may be used.
You may now select the appropriate subfile, and set the format:
-> select address book
-> set format labeler
->
One of the record output commands, probably DISPLAY or TYPE, may now be used to display records through the format. If the format is not satisfactory, adjustments can be made to the format definition, which you then "recompile".
The procedure for format creation consists of the same steps regardless of the type of format. Next we will create an input format for the ADDRESS BOOK subfile.
We want to make data entry into the subfile very easy. Only two elements are collected: the single occurrence of NAME and the multiple (up to 3) occurrences of ADDRESS.LINE. An input format may prompt the user for the appropriate data (as the system format $PROMPT does) or may read the data from a data set such as your active file (as the standard SPIRES format does). For our simple input format, let's assume we will collect the data for each record in our active file, in a manner similar to the way the data is displayed by our output format above -- that is, the name will be on the first line, and the address lines will be found on the next one to three lines.
Again, we lay out a grid, representing a frame. This time, since we will not need the exclamation point delimiter required by the LABELER program, we can eliminate one row, leaving the grid as 4 rows by 36 columns:
(NAME)..............................
(ADDRESS.LINE)......................
(ADDRESS.LINE)......................
(ADDRESS.LINE)......................
Below is the appropriate input format definition. You will notice several new statements, but basically this format definition is similar to the output format definition above:
1. ID = GQ.JNK.ADDRESS.INPUT;
2. FILE = GQ.JNK.ADDRESS.BOOK;
3. RECORD-NAME = REC01;
4. FRAME-ID = INPUT.DATA;
5. DIRECTION = INPUT;
6. FRAME-DIM = 4,36;
7. USAGE = FULL;
8. LABEL = NAME;
9. GETDATA;
10. PUTELEM;
11. LABEL = ADDRESS.LINE;
12. GETDATA;
13. UPROC = IF $ENDATA THEN RETURN;
14. PUTELEM;
15. LOOP;
16. FORMAT-NAME = INPUT;
17. FRAME-NAME = INPUT.DATA;
18. FRAME-TYPE = DATA;
This format definition has the same three sections; the major differences are in the statements within the Frame Definition section:
- the DIRECTION has changed from OUTPUT to INPUT.
- the GETELEMs and PUTDATAs have been switched around, becoming GETDATAs and PUTELEMs, reflecting the switch from output to input.
- the value of the USAGE statement has changed from DISPLAY to FULL.
- a new statement, the UPROC (for User PROCessing), controls the handling of multiple occurrences of ADDRESS.LINE: If the end of the data has been reached, then no more processing is to be done in that label group.
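The flow of the input frame can be sketched in Python as follows. This is an analogy under stated assumptions, not SPIRES code. The function name is hypothetical, running out of input lines stands in for the $ENDATA test, and the limit of three address lines is taken from the 4-row frame.

```python
def run_input_frame(buffer_lines, max_address_lines=3):
    """Read a name and up to three address lines from the input buffer."""
    lines = iter(buffer_lines)
    record = {"NAME": [], "ADDRESS.LINE": []}

    # Label group NAME: GETDATA reads the first row, PUTELEM stores it.
    name = next(lines, None)
    if name is not None:
        record["NAME"].append(name.strip())

    # Label group ADDRESS.LINE: repeated by LOOP for each remaining row.
    for _ in range(max_address_lines):
        line = next(lines, None)  # GETDATA
        if line is None:
            # Plays the role of: UPROC = IF $ENDATA THEN RETURN;
            break
        record["ADDRESS.LINE"].append(line.strip())  # PUTELEM
    return record


print(run_input_frame(["Rex Alldrug", "18 Tough Row", "Tohoe, Nevada 84522"]))
```

Without the end-of-data test, the loop would try to store an address line that was never entered.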
The format definition would be added to the FORMATS subfile and compiled, just as shown earlier for the output format. Then you would probably test the format like this:
-> select address book
-> set format input
-> collect
1. > Rex Alldrug
2. > 18 Tough Row
3. > Tohoe, Nevada 84522
4. > *** <-- ATTN is pressed
-> add
-Added record 312
-> display 312
ID = 312;
NAME = Rex Alldrug;
ADDRESS = 18 Tough Row;
ADDRESS = Tohoe, Nevada 84522;
->
With this basic outline of the process for creating SPIRES formats, we are now better equipped to consider the capabilities of particular types of formats.
Let's examine the process of how formats work and how they are used in more detail.
All formats are created to process data. In almost all cases, a format maps data from a record-type within a SPIRES file to a character array or vice versa. (A special type of format called a "global format" [See D.2.5.] is not associated specifically with any data base.)
No format can be used until a SET FORMAT command (or an equivalent, such as automatic format selection when a subfile is selected) has been issued. When SPIRES receives a SET FORMAT command, it looks for the record containing the compiled characteristics of the named format in a subfile called FORCHAR (for FORmat CHARacteristics). FORCHAR records are created by SPIRES when a format definition is compiled; users, including format definers, rarely need to access this subfile themselves. The compiled code is then loaded into user memory for use by subsequent commands.
Also at this time, SPIRES initializes any user-defined variable groups ("vgroups") that can be used by frames that will be executed. Any user variables used during format execution (e.g., to hold element values for testing later in the format) must be defined either in the Vgroups section of the format definition, or in a separate record for the VGROUPS subfile. [See B.9.3.] When the compiled format is set, room in user memory is reserved for these variables, and initial values, if any, are assigned to them.
Sometimes a format may have a special type of frame called a "startup frame", which is executed as soon as SPIRES "loads" the format (i.e., when the SET FORMAT command is issued). The startup frame is typically used to send information to the terminal about how the format is used or to set certain variables or make certain tests, just as "select-commands" may be used when a subfile is selected. [See D.3.]
All of the commands listed below may cause format execution if a format is set:
Input               Output      Other
ADD                 TYPE        XEQ FRAME
UPDATE              DISPLAY
MERGE               SCAN
BATCH (in SPIBILD)  TRANSFER
ADDUPDATE           OUTPUT
When any of the above commands is issued and a format is set, SPIRES checks to see whether execution of any frames within that format is appropriate. This decision is based on the values of several statements within the format definition, specifically DIRECTION and USAGE in the Frame Definitions section and FRAME-NAME and FRAME-TYPE within the Format Declaration section. [See B.3.2, B.3.4, B.5.2.]
A single format may have frames to be used when the ADD command is executed and different frames to be used when the DISPLAY command is executed. Similarly, a single format may have two different "sets" of output frames, one set of which is invoked by a DISPLAY or TYPE command, and the other which is invoked by a TRANSFER command, or even by a TYPE or DISPLAY command that is preceded by the "USING frame" prefix. [See D.1.1.1.] The ability to have several contrasting uses for a format can make applications simple -- you may not have to keep switching formats back and forth to alternately add and display records, for instance. On the other hand, it creates a minor terminology problem, because a format used for adding records, which we are tempted to call an "input format", may be defined so that it can also be used to display records, in which case it also qualifies as an "output format".
That description of the terminology problem also suggests its solution. A given format may be both an input and an output format, so calling one an "input format" does not rule out the possibility that it may also be defined for use as an output format. The term "input format" simply implies that the format definition contains frames that can be executed when data is being mapped into the data base. Similarly, "output format" implies that the format definition contains frames that can be executed when data is being mapped from the data base. A format then may be either or both. [See D.2 to see how it can be neither.]
For example, the two formats defined in the previous section can be combined into a single format:
1. ID = GQ.JNK.ADDRESS.FORMATS;
2. FILE = GQ.JNK.ADDRESS.BOOK;
3. RECORD-NAME = REC01;
4. FRAME-ID = LABEL.OUT;
5. DIRECTION = OUTPUT;
6. FRAME-DIM = 5,36;
7. USAGE = DISPLAY;
8. LABEL = DELIMITER;
9. VALUE = '!';
10. PUTDATA;
11. LABEL = NAME;
12. GETELEM;
13. PUTDATA;
14. LABEL = ADDRESS.LINE;
15. GETELEM;
16. PUTDATA;
17. LOOP;
18. FRAME-ID = INPUT.DATA;
19. DIRECTION = INPUT;
20. FRAME-DIM = 4,36;
21. USAGE = FULL;
22. LABEL = NAME;
23. GETDATA;
24. PUTELEM;
25. LABEL = ADDRESS.LINE;
26. GETDATA;
27. UPROC = IF $ENDATA THEN RETURN;
28. PUTELEM;
29. LOOP;
30. FORMAT-NAME = INPUT.AND.LABELS;
31. FRAME-NAME = LABEL.OUT;
32. FRAME-TYPE = DATA;
33. FRAME-NAME = INPUT.DATA;
34. FRAME-TYPE = DATA;
Though only the single format INPUT.AND.LABELS is set, different frames, each with a different purpose, would be executed when different commands are issued. When a TYPE or DISPLAY command is issued, for instance, the output frame LABEL.OUT would be executed; when an ADD command is issued, however, the input frame INPUT.DATA would be executed. Hence, a single format serves as both an input and an output format.
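The choice of frame can be pictured as a simple lookup. The Python sketch below is a deliberate simplification and not the actual SPIRES selection logic: it considers only the DIRECTION statement, whereas SPIRES also consults USAGE, FRAME-NAME and FRAME-TYPE. The table and function names are hypothetical, and only a few commands are included.

```python
# Direction of data flow for a few commands that can cause format execution.
COMMAND_DIRECTION = {
    "ADD": "INPUT",
    "UPDATE": "INPUT",
    "MERGE": "INPUT",
    "TYPE": "OUTPUT",
    "DISPLAY": "OUTPUT",
    "TRANSFER": "OUTPUT",
}

# The two frames declared in the INPUT.AND.LABELS format.
FRAMES = [
    {"frame_id": "LABEL.OUT", "direction": "OUTPUT"},
    {"frame_id": "INPUT.DATA", "direction": "INPUT"},
]


def frames_for(command, frames=FRAMES):
    """Return the frames whose DIRECTION matches the command's data flow."""
    direction = COMMAND_DIRECTION.get(command.upper())
    return [f["frame_id"] for f in frames if f["direction"] == direction]


print(frames_for("display"))
print(frames_for("add"))
```

Because this sketch ignores USAGE, it cannot distinguish a frame meant for DISPLAY from one meant for TRANSFER; the real rules are covered in the sections cited above.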
Suppose now that a command is issued that will cause one or more frames of the set format to be executed. Again, for example, suppose that the set format has frames that will be executed when a TYPE command is issued. SPIRES will then begin executing the appropriate data frames, following the instructions in the label groups for each frame, beginning at the first label group and continuing straight through (unless some branching instruction is encountered) for each record. Data frames are executed once per record. Certain types of frames, available in "report mode", are executed once per "group" of records or at the end of all records when multiple records are output.
If we return to the conceptual notion of a frame being a grid in which data is positioned, we can think of a format being a collection of such grids. Many formats may have only one grid, but a format may consist of multiple grids; the grids may appear one after the other, or they may be grids superimposed upon grids. The latter capability is particularly important when data elements within structures are being processed. Thus, just as you can position a data element within a frame, you can construct a frame and position it within another frame.
As a program then, a format can proceed from frame to frame, or a frame may invoke another frame, like a subroutine, that gets control, executes and then returns control to the "calling" frame. Such a frame is called an "indirect frame" because it is invoked not directly by a user command but instead by another frame.
When SPIRES begins executing a frame, it first establishes a "buffer" in user memory -- the buffer can be considered the internal equivalent to a grid. For output, the data is positioned individually in the buffer according to the directions given in the label groups. For input, input data is placed in the buffer until it is full and then individual pieces are read from the buffer, according to the label groups. (The exception to this is the prompting input format, in which the data being input is not read from a data set but is retrieved by prompting the user for data values at the terminal.)
This manual will show you how to write formats, based on the concepts shown in this section. In Part B, we will concentrate on output formats. In a task-related approach, we will begin constructing simple output formats, adding to our capabilities and skills as we need them. Our format creation procedure a few sections back will be covered in detail; from that point of view, Part B is worth studying even if you just want to write input formats, since the procedure, as you saw, is very similar for both types of formats.
In Part C, we will examine input formats. Input formats for adding and modifying (either via the UPDATE or MERGE command) records will be covered, including discussions on the major methods of providing the input data -- reading input data sets, prompting the user, and providing the data in the format itself (normally done in merge processing). Though the direction of data flow has changed, the basic format definition concepts do not, and the chapters of Part C build on the statements and techniques introduced in Part B.
In Part D, some miscellaneous topics will be covered, including formats used for both input and output and formats used for neither. Also discussed there will be single format definitions that define multiple formats.
The concluding section, Part E, provides several appendices to the manual, including descriptions of all SPIRES system variables associated with formats (they are used from time to time throughout the manual) and explanations for error messages that you might receive when you compile a format definition.
Output formats can be designed to display data from a SPIRES data base. The format may be designed for displaying individual records or for reporting on groups of records (providing subtotals and totals across records, for example).
The first few chapters of this part of the manual will lead you step-by-step through the procedure of creating output formats. Later in this part, special features and capabilities of output formats will be discussed, such as the ability to call other formats from within a format, the ability to access records in other data bases from within a format, and even the ability to update records while displaying them.
It is important to realize, especially if you will be writing input formats, that most of the topics discussed in this part concerning output formats also relate to input formats. Statements discussed here in regard to output often are identical to, or have their opposite counterpart in, statements in input formats. Please be aware that the details of such topics covered in this part may not be repeated in the part on input formats -- cross references to this part will be provided there instead.
One final reminder before beginning this chapter: Remember that the term "output formats" is misleading in that a single format can be used for many different purposes (input and output) by many different commands. The term is used to identify formats that can be used for output, but not necessarily exclusively for output.
Before you decide to create an output format (and go to the trouble of learning how if you have never done so before), you should seriously consider whether a custom-designed format is necessary. SPIRES allows you to display record data in various ways without forcing you to learn the formats language. If you are not acquainted with these "system formats" (general-purpose formats that can often be used by any data base) and features, which are individually discussed below, you are encouraged to read about them in the references cited.
The data in records to be displayed may be controlled in two ways: 1) you may control the specific elements to be shown; or 2) you may control specific element values to be shown, based on their values. In the first case, the control is handled by element lists; in the second, it is handled by element filters. An example of element filters will be shown later in this section.
The most commonly seen system format is the standard SPIRES format, whose general form is:
element-name = value;
Here are some notes on the standard SPIRES format:
- It may be used by any record-type of any file.
- It is the default format for any record-type unless another format is so named in the file definition.
- It is in effect after the CLEAR FORMAT command is issued.
- It is used to display parts of records when a SET ELEMENTS command is in effect or when the "element list" option is added to a TYPE command.
- By indenting elements within structures, it shows you the data hierarchy of the record.
- It is no more expensive and usually less expensive to use than a custom-designed format.
- With the capitalization of element names, the trailing semicolons, and the quotation marks when special characters (like semicolons and quotation marks) appear in the value, the data may look unattractive and, at worst, be difficult to read, especially for large records.
The $PROMPT format also displays element names and values, but in a different form from the standard format. The element names are shown on the left; their values appear several blanks later on the right. Structures are labeled, and the format makes it clear which elements are in them. In general, it is more expensive to use than similar, custom-designed output formats, though you can generate a custom format from $PROMPT. Not only is one of these generated formats as cheap to use as a custom-designed one, it is also simple to create, saving you from writing a complex format definition. [See D.4.1.]
If you are not familiar with the format, select a subfile, issue the SET FORMAT $PROMPT command, and display some records with it. The format is also useful for input, as its name suggests. It is described fully in the SPIRES manual "Searching and Updating", section D.3.
The $REPORT format displays records in a tabular form. The elements are arranged horizontally in columns across the page. (Because the $REPORT format is a "report" format [See B.10.] it is designed primarily to produce output for printed reports, though by no means is it limited to that.) The documentation for it can be found in part C of "Searching and Updating".
Here are some additional notes on the $REPORT format:
- Only the elements specified in the SET FORMAT $REPORT command are displayed.
- It includes features especially for multiple record displays -- totals, averages, maximums, minimums, and the like, computed across all the records or across groups of records, may be requested.
- Titles, page numbers, and "headers" and "footers" for each page may be requested.
Element filters allow you to restrict the elements displayed to particular occurrences or to occurrences with particular values. Filters affect the output regardless of what format is set. Although formats can restrict the occurrences displayed of a particular element, filters provide a more general facility.
Here is an example of a SET FILTER command:
-> set filter for children where age < 10
This command tells SPIRES that later record-display commands should process only the occurrences of the CHILDREN structure that have an AGE element value of less than ten for any given record. Any records displayed after the command has been issued will be affected -- no occurrences of the structure that have an AGE value greater than or equal to ten will be displayed, no matter what format is used.
A command such as "SET FILTER FOR CHILDREN (1/3) WHERE AGE < 10" will limit displays to only the first three occurrences of the structure that represent children less than ten years old. Another form of the command, such as "SET FILTER FOR CHILDREN IN 2/4", identifies the specific occurrences (the second through fourth) that should be processed.
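The selection rules behind these three commands can be modeled outside SPIRES. The following Python sketch is an illustration only, not SPIRES code: the function "apply_filter", its parameter names, and the sample CHILDREN occurrences are all hypothetical, and the sketch assumes that the "(1/3)" form counts only the occurrences that pass the WHERE test, as the description above suggests.

```python
def apply_filter(occurrences, where=None, first_matching=None, positions=None):
    """Return the occurrences a display command would process under a filter.

    where          -- predicate on one occurrence (models WHERE AGE < 10)
    first_matching -- (lo, hi), 1-based, counted over the occurrences that
                      pass `where` (models the "(1/3)" form)
    positions      -- (lo, hi), 1-based, counted over the stored occurrences
                      (models the "IN 2/4" form)
    """
    selected = list(occurrences)
    if positions is not None:
        lo, hi = positions
        selected = selected[lo - 1:hi]
    if where is not None:
        selected = [occ for occ in selected if where(occ)]
    if first_matching is not None:
        lo, hi = first_matching
        selected = selected[lo - 1:hi]
    return selected


# Hypothetical occurrences of the CHILDREN structure in one record.
children = [
    {"NAME": "Ann", "AGE": 12},
    {"NAME": "Bob", "AGE": 9},
    {"NAME": "Cy", "AGE": 4},
    {"NAME": "Dee", "AGE": 15},
    {"NAME": "Eli", "AGE": 7},
    {"NAME": "Fay", "AGE": 2},
]


def is_under_ten(child):
    return child["AGE"] < 10


# set filter for children where age < 10
under_ten = apply_filter(children, where=is_under_ten)
# SET FILTER FOR CHILDREN (1/3) WHERE AGE < 10
first_three_under_ten = apply_filter(children, where=is_under_ten,
                                     first_matching=(1, 3))
# SET FILTER FOR CHILDREN IN 2/4
second_to_fourth = apply_filter(children, positions=(2, 4))
```

In this model the record itself is never changed; the filter only decides which occurrences a later display command gets to see, whatever format is set.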
Filters are a very powerful feature that can be used instead of output formats or in conjunction with them, depending on the application. They have no effect on output formats used for the TRANSFER command [See B.3.4.] nor do they generally affect input formats. If your application involves both formats and protocols, you may find it easier to limit the displayed values with SET FILTER commands in the protocol than with specific code in the format definition. Complete documentation for filters is in chapter 21 of "SPIRES Technical Notes". [See B.4.8.5a for information about how filters may be set within an output format.]
Dynamic elements can be used to show element values in forms different from those in which the standard format would output them. For a very simple example, suppose an element were displayed like this:
SPEED = 1800;
You might issue a DEFINE ELEMENT command to create the dynamic element RPM:
-> define element rpm as @@speed || ' RPM'
The RPM element is based on the value of the SPEED element for any given record. If you issue the command "TYPE SPEED, RPM", you might get the following result:
SPEED = 1800; RPM = 1800 RPM;
Dynamic elements may be defined by anyone with access to the selected subfile; they only last for that user as long as he or she has the subfile selected. They may concatenate or perform computations with several element values. They may use SPIRES functions and system variables. They may use either the internal (i.e., pre-OUTPROC) or external (post-OUTPROC) form of the elements.
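For readers who find it easier to think in a general-purpose language, the RPM example can be imitated as follows. This is a rough Python model, not SPIRES code: the names "standard_line", "type_elements" and "dynamic_elements" are hypothetical, the record is a plain dictionary, and the quoting rules of the standard format are ignored.

```python
def standard_line(name, value):
    """Render one element in the general 'element.name = value;' form."""
    return f"{name.upper()} = {value};"


# Hypothetical stored record containing only the SPEED element.
record = {"SPEED": "1800"}

# Models: define element rpm as @@speed || ' RPM'
# The dynamic element is computed from the record; it is not stored in it.
dynamic_elements = {"RPM": lambda rec: rec["SPEED"] + " RPM"}


def type_elements(rec, names, dynamic):
    """Model of 'TYPE SPEED, RPM': one display line per requested element."""
    lines = []
    for name in names:
        key = name.upper()
        value = dynamic[key](rec) if key in dynamic else rec[key]
        lines.append(standard_line(key, value))
    return lines


result = type_elements(record, ["speed", "rpm"], dynamic_elements)
for line in result:
    print(line)
```

The point of the model is that the dynamic element lives beside the record rather than in it, which matches the statement that such elements last only while the defining user has the subfile selected.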
Dynamic elements are a very helpful tool; like filters, they may be used instead of or in conjunction with output formats, though if you intend to use them in a format, they must be represented by variables in GETELEM statements. [See B.4.2.1.] For more information about them, see "SPIRES Technical Notes", section 20.
Despite the power of all of these system formats and features, considered separately or in combinations, they do not handle all the situations that custom formats do. Moreover, a customized format has certain advantages over them.
In regard to the system formats described above, a customized format does exactly what you want -- you do not have to compromise your needs with generalized capabilities. In regard to the features such as dynamic elements, a customized format has the advantage of being compiled, and compiled code is more efficient to execute than uncompiled equivalents.
Generally, the individual capabilities described above for both the formats and features can be coded in output formats as well. But specifically, here are a few other capabilities of customized formats that may not be as easy to handle by other methods:
- conditionally displaying elements depending on the values of other elements;
- displaying structural data in a way that clearly and attractively shows the data's structural hierarchy;
- retrieving data from other records in other record-types ("subgoal processing");
- calling other formats ("load formats").
Output format record-processing can be considered a two-step operation. First, the data is retrieved from the record according to the instructions given in the format definition (specifically, in the definition of the frame being executed) and placed, after any specified alterations, in the "buffer". The buffer, an area in main memory, is associated with the frame currently executing.
Second, when execution of the frame is complete, the buffer is "flushed" (sent) to the output device, usually the terminal screen or active file. If the format contains multiple frames to be executed, SPIRES will go on to the next one, starting this process over again. Similarly, if multiple records are being processed and the last frame has been executed for a given record, SPIRES will go on to the next record, starting the process over again.
The concept of the buffer as an intermediate storage place for data is important to keep in mind, especially for output formats. For example, if an instruction within a frame tells SPIRES to immediately display a message at the terminal using the star ("*") "Uproc" [See B.4.8.13.] you might be surprised at first to see the message on your screen ahead of the data already processed by the format. But again, the formatted data is being held in the buffer until the frame executes completely, whereas the message from the star Uproc is sent to the terminal immediately. [See B.3.3.]
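The ordering effect described above can be reproduced with a toy model. The Python sketch below is not SPIRES code; "run_frame", the step tuples, and the sample record are hypothetical, and the model assumes only what the text states: formatted data waits in the buffer until the frame finishes, while a message goes to the terminal at once.

```python
def run_frame(record, steps):
    """Execute one frame; return what the terminal receives, in order."""
    device = []   # lines as they arrive at the terminal
    buffer = []   # formatted data held until the frame completes
    for kind, payload in steps:
        if kind == "element":
            # Step one: retrieve the value and place it in the buffer.
            buffer.append(f"{payload}: {record[payload]}")
        elif kind == "message":
            # Models the immediate message: it bypasses the buffer.
            device.append(payload)
        else:
            raise ValueError(f"unknown step kind: {kind}")
    # Step two: the frame is complete, so the buffer is flushed.
    device.extend(buffer)
    return device


record = {"TITLE": "Backup tape", "YEAR": "1994"}
steps = [
    ("element", "TITLE"),
    ("message", "Now processing the record"),
    ("element", "YEAR"),
]
shown = run_frame(record, steps)
```

Although the message step is coded after the TITLE step, the message is the first line in "shown", ahead of both data lines.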
Output formats can be designed to display individual records or to produce reports processing multiple records. However, regardless of whether the format is designed for single record displays or multiple-record reports, only one record is processed per execution of the format -- that is, the frames that process records are written to process individual records. Commands that cause multiple records to be displayed (e.g., DISPLAY ALL) cause the format to be executed one time per record.
Many output formats are written with a single frame to process the data. The "data frame" can handle all record-level elements in the records, but elements within structures must usually be handled by separate frames that are invoked from the data frame. Within a frame, each element is usually "processed" (i.e., retrieved from the record, modified if desired and placed in the buffer) by a group of statements called a "label group"; although several label groups may be used and are sometimes necessary to handle an element, the norm is one label group per element.
All format definitions must be placed in the public subfile FORMATS before they can be compiled and used. Hence they have a dual role: a format definition is a collection of instructions in a particular language, and it is a goal record in a subfile. The rest of this manual discusses the format definition language, but this section will discuss the format definition as a goal record.
Almost all format definitions are entered into the FORMATS subfile as records in the standard SPIRES format, so the syntax of statements in a format definition will be shown as they would be entered in that format. For example, the ID statement's syntax is:
ID = gg.uuu.anyname;
Thus, in the FORMATS subfile, the value for the element ID must have the form "gg.uuu.anyname". If you enter the format definition in the standard format, you must begin the statement with the element name and end it with the semicolon. The equals sign, although the syntax statements do not mark it as optional, is always optional.
The maximum length for any single element value in a FORMATS goal record is 3072 bytes. That means, for example, that no single occurrence of the COMMENTS statement may be longer than 3072 characters (about 40 72-character lines). However, COMMENTS statements are allowed to occur multiple times wherever they appear in a format definition, so very long comments could be split into multiple occurrences. [See B.2.2.]
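If you prepare a definition with a program before adding it to the FORMATS subfile, a long comment can be divided mechanically. The following Python sketch shows one way to do so; it is an illustration under stated assumptions, not part of SPIRES. The name "split_comment" is hypothetical, the text is assumed to contain only single-byte characters (so that characters and bytes are counted alike), and runs of blanks are collapsed to single blanks.

```python
import textwrap

MAX_VALUE_BYTES = 3072  # per-value limit stated in the text above


def split_comment(text, limit=MAX_VALUE_BYTES):
    """Split text at word boundaries into pieces of at most `limit` characters.

    Each piece is meant to become the value of one COMMENTS occurrence.
    """
    return textwrap.wrap(text, width=limit)


# A 5000-character comment made of 1000 four-letter words.
long_comment = "word " * 1000
pieces = split_comment(long_comment)
print(len(pieces), [len(p) for p in pieces])
```

Each piece would then be entered as its own COMMENTS statement, still subject to the quoting rules of the standard format.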
The rules for data entry using the standard SPIRES format are discussed in detail in the SPIRES manual "Searching and Updating", section D.1.2. The most important ones to remember are the following:
COMMENTS = "ABC;DEF";
COMMENTS = "ABC""DEF""GHI";
LABEL;
VARIABLE = STRING1;
  LENGTH = 1;
  OCC = 5;
VARIABLE = STRING2;
  LENGTH = 2;
  OCC = 5;
The last rules are important to stress. Many of the statements in a format definition are elements within structures. For example, a frame definition is an occurrence of a structure; each label group in a frame is an occurrence of a structure within the frame structure.
All structures in a FORMATS goal record have key elements, such as VARIABLE in the VARIABLES structure shown above. Because a statement introducing or ending a structure may thus look like any other statement, you can easily lose your bearings in the format definition hierarchy.
Also important to remember is that SPIRES allows you to enter the elements in a structure (or the record-level elements, for that matter) in almost any order. However, they will be rearranged into the order shown by the SHOW ELEMENTS command for the FORMATS subfile. Thus the order in which you enter the statements may not be the order in which they are stored and compiled and executed. That is occasionally a problem for people learning to write label groups. [See B.4.]
An appendix in the back of this manual shows the proper order of statements in a format definition record. [See E.1.] In addition, the examples throughout will show you the order in which to code the statements.
A format definition has four major parts: the Identification section, the Vgroups section, the Frame Definitions section and the Format Declaration section. The first section represents most of the record-level elements in a format definition record; the others represent multiply-occurring structures. For example, there may be more than one frame definition, each one representing one occurrence of the FRAME-DEF structure.
Each section of the definition is covered in a separate chapter of Part B. [See B.2, B.3, B.5.] In addition, one chapter covers label groups, the multiply-occurring structure within the FRAME-DEF structure. [See B.4.] Other chapters in Part B will discuss how to compile the created format definition, how to use it, and how to handle other special situations in output formats.
This chapter describes the format definition statements that relate to the format definition as a whole. The key of the format definition record is the first statement, ID. Other statements, FILE and RECORD-NAME, identify the file and the record-type within the file to which the formats defined in this record will apply. Other statements (such as AUTHOR and DEFDATE) may contain other information about the record that will be useful to you.
In terms of the FORMATS goal record, these statements are record-level elements. Following the RECORD-NAME statement come the VGROUPS, FRAME-DEF and FORMAT-DEC structures, discussed in later chapters. Several other record-level elements, such as GEN-FILE, follow these structures, but they are discussed in later chapters. [See B.4.5.2, C.10.]
The ID statement specifies the key element value of the format definition for the FORMATS subfile -- no other format definition in the FORMATS subfile may have the same value.
The form of the ID statement is:
ID = gg.uuu.anyname;
where "gg.uuu" is your account number and "anyname" is a character string containing any characters other than blanks.
Usually, to make identification easier, a format definition's ID references the file to which it applies. For example, if a file is named XR.RMN.TAPES, a format definition for one of its record-types might be specified as:
ID = XR.RMN.TAPES.LIST;
The ID value is not the value used in the SET FORMAT command; that value is specified in the FORMAT-NAME statement in the format declaration section. [See B.5.1.] The ID value is used when you want to update the record in the FORMATS subfile and when you compile the format definition, so it is usually advisable to keep it relatively short and easy to type. A reasonable limit to suggest here is that the entire value be no longer than 30 characters, but the absolute limit is over 100.
You may replace the "gg.uuu." portion of the value with a period or an asterisk:
ID = .TAPES.LIST; or ID = *TAPES.LIST;
Both are equivalent to the first form shown above for account number XR.RMN.
The COMMENTS statement is free-form and multiply occurring. It is the first of several COMMENTS statements allowed throughout the definition, appearing in almost every section. The COMMENTS statement here is often used to describe the purpose of the format(s) being defined, though of course you may use it for anything you want.
For example,
ID = GQ.JNK.RECORDS.LISTING;
COMMENTS = This format definition is for the LISTING format
           of the RECORDS subfile.;
Be aware that, if this were collected in your active file, the extra blanks on the second line of the COMMENTS value (preceding "of the RECORDS subfile") would be retained in the value. It is shown this way here, and in similar examples throughout the document, to make long values easier to read than the way you probably should enter them:
ID = GQ.JNK.RECORDS.LISTING;
COMMENTS = This format definition is for the LISTING format
of the RECORDS subfile.;
AUTHOR = John Klemm, DBMG, 497-4420.;
That style of presentation, though more accurate, can make examples more difficult to read.
Here is an example demonstrating "block comments", which are often easier to read because you enter them the way you want to see them:
ID = GQ.DOC.LEAVE.DISPLAY;
COMMENTS = *************************************************;
COMMENTS = * This format definition defines several        *;
COMMENTS = * output formats for the LEAVE subfile. The     *;
COMMENTS = * input formats are defined in the FORMATS      *;
COMMENTS = * subfile record GQ.DOC.LEAVE.INPUT.            *;
COMMENTS = *************************************************;
Certainly the asterisk border draws your attention to the comments inside.
Some users maintain their format definitions in WYLBUR or ORVYL data sets. Rather than transferring the definition from the FORMATS subfile, making changes, and then updating it, they make changes to their own copy of the definition and then update the copy in the FORMATS subfile. This procedure lets them take advantage of the "-" element, sometimes called the "dash element" or the "throwaway element". If you add a record that includes throwaway element values, those values are indeed thrown away, and not included in the stored record; the throwaway element is available for all SPIRES subfiles. People who maintain their own copies of a format definition can use the throwaway element to place comments anywhere they please within the definition, since SPIRES will ignore them.
COMMENTS = This comment will be kept in the FORMATS goal record.;
- This comment will be discarded from the record when stored.;
Yet another type of comment is allowed in UPROC statements. [See B.4.8.14.]
Like the COMMENTS statement, the AUTHOR statement is optional, multiply occurring and free-form. It is meant to include your name, as the definition's author, along with a phone number in case the SPIRES system programmers need to get in touch with you. Such a need seldom arises, of course, but when it does, the appearance of this statement is very helpful.
Here is an example of an AUTHOR statement:
AUTHOR = Clare Quilty, Admissions Office, 555-3030;
The DEFDATE, MODDATE and MODTIME statements, automatically supplied when you add or update your format definition in the FORMATS subfile, provide the date the record was first added to the subfile (DEFDATE), as well as the date (MODDATE) and time (MODTIME) that it was last updated. They are provided for your convenience.
The FILE statement, which is singly occurring, contains the full name (including the account number prefix) of the file to which the format definition applies.
For example,
ID = GQ.JNK.RECORDS.DISPLAY;
FILE = GQ.JNK.RECORDS;
If the file belongs to your account, you may replace the "gg.uuu." portion of the file name with either a period or an asterisk:
FILE = .RECORDS; or FILE = *RECORDS;
SPIRES will replace those characters with your account number for record storage.
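The substitution that SPIRES performs for the period and asterisk shorthands, in both the ID and FILE statements, amounts to a simple string rule. The Python sketch below illustrates it; "expand_account_prefix" is a hypothetical name, and the function is a model of the behavior described here, not the code SPIRES uses.

```python
def expand_account_prefix(value, account):
    """Expand a leading '.' or '*' into the account number plus a period.

    value   -- an ID or FILE value as typed, e.g. '.RECORDS' or '*RECORDS'
    account -- the account number in 'gg.uuu' form, e.g. 'GQ.JNK'
    """
    if value.startswith("."):
        # The period shorthand already supplies the separating period.
        return account + value
    if value.startswith("*"):
        # The asterisk stands for the account number and its period.
        return account + "." + value[1:]
    return value  # already a full name; leave it unchanged


print(expand_account_prefix(".RECORDS", "GQ.JNK"))
print(expand_account_prefix("*TAPES.LIST", "XR.RMN"))
```

Both shorthands produce the same stored value, so the choice between them is purely a matter of typing preference.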
You can write a format definition for a record-type within any file; however, to use the format, you must be able to access the record-type, usually by selecting the subfile for which it is the goal record-type.
If you are not the file owner, you may not know the file name. That information may be obtained by selecting the appropriate subfile and issuing the SHOW SUBFILE INFORMATION command.
This statement and the RECORD-NAME statement [See B.2.6.] may be omitted if you are writing a general file format or a global format. [See B.14, D.2.5.]
The RECORD-NAME statement, which is singly occurring, identifies the record-type in the previously named file to which the format definition applies. If you do not know the name of the record-type, it may be discovered by selecting the subfile for which the format is being written and issuing the SHOW SUBFILE INFORMATION command.
Here is an example of the RECORD-NAME statement:
RECORD-NAME = REC01;
The record name will never be longer than six characters.
A format definition and all the formats it defines can be used with only one record-type of one file (unless it is a general file format, described later in this manual). Usually a format is written for the goal record-type of a subfile, but not always -- the format definition is tied to a specific record-type of a file, not to a subfile, via the FILE and RECORD-NAME statements. It is possible to access other record-types from a format, however, through subgoal processing. [See B.12, B.14.]
Below is a sample identification section, combining the statements described in this chapter:
ID = GQ.JNK.RECORDS.DISPLAY;
COMMENTS = This format definition is for my RECORDS subfile.;
COMMENTS = "It is also an example in ""SPIRES Formats"".";
AUTHOR = John Klemm, DRG, 497-4420;
FILE = GQ.JNK.RECORDS;
RECORD-NAME = REC01;
Remember that this definition is a goal record in the standard SPIRES format. Hence, in the second COMMENTS statement, the inclusion of quotation marks around the title "SPIRES Formats" means that the entire element value must be enclosed in quotation marks, and the internal quotation marks must be doubled. Similarly, if a semicolon (;) appears within a value, the value must be enclosed in quotation marks. Within the value, using apostrophes (') instead of quotation marks (") and avoiding semicolons are alternatives to consider as well. [See B.1.3.]
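If you generate statements with a program, the two quoting rules just described are easy to apply mechanically. The Python sketch below does so; "standard_format_statement" is a hypothetical helper, and it covers only the two cases named in this section (semicolons and quotation marks within a value), not every data-entry rule of the standard format.

```python
def standard_format_statement(name, value):
    """Build one 'NAME = value;' statement for the standard SPIRES format.

    A value containing a semicolon or a quotation mark is enclosed in
    quotation marks, and any internal quotation marks are doubled.
    """
    if ";" in value or '"' in value:
        value = '"' + value.replace('"', '""') + '"'
    return f"{name} = {value};"


print(standard_format_statement("COMMENTS", "ABC;DEF"))
print(standard_format_statement(
    "COMMENTS", 'It is also an example in "SPIRES Formats".'))
print(standard_format_statement("RECORD-NAME", "REC01"))
```

The second call produces the same statement as the second COMMENTS line in the sample identification section.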
The next section of the format definition, Vgroups, will be discussed later. For the time being, just remember that we will be able to use variables declared in the Vgroups section later in the frames that we write. The next two chapters will discuss the Frame Definitions section.
The Frame Definitions section, which contains sets of formatting instructions, does most and often all of the work for a format. The section consists of one or more "frame definitions". Most frame definitions consist of two parts: frame identification statements, which provide the name of the frame and put limits on how the frame may be used (for example, for input or for output), and label groups, the individual subroutines that specify the processing to occur.
Though many formats have only one frame to be executed, formats commonly have multiple ones. There are several reasons why this is so. The most common reason is that the goal record-type for which you are writing the output format has structures in it. In most such cases, the processing of the elements within the structure must be specified in a separate frame definition. How to handle structures is the subject of a later chapter. [See B.8.] Other reasons for using multiple frames, such as the ability to share the same code between multiple formats, will be discussed in later chapters of Part B.
The remainder of this chapter will focus on the basic statements of frame identification. Some others, such as SUBTREE and LOAD-FORMAT, are involved with the above-mentioned reasons for having multiple frames, and thus will be covered in later chapters. Discussion of the label groups in a frame definition will appear in the next chapter.
The FRAME-ID statement signals the beginning of a frame definition. It provides a name for the defined frame that will be necessary in various situations later. In particular, this name is used in the format declaration section to tell SPIRES which frames can be used when a format is executed. [See B.5.2.]
The frame name may be from one to sixteen characters long. No blanks are allowed in the name. Few special characters (i.e., not alphabetic or numeric) are allowed in the name either, though the useful exceptions are periods, hyphens and underscores, which are commonly used as substitutes for blanks.
For example,
FRAME-ID = HEADING;
FRAME-ID = DATA.DISPLAY;
FRAME-ID = ADDRESS-STRUC;
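A proposed frame name can be checked against these rules before the definition is compiled. The Python sketch below is deliberately conservative: it accepts only letters, digits, periods, hyphens and underscores, although the text says only that few other special characters are allowed, so the exact character set is an assumption. The name "is_valid_frame_name" is hypothetical.

```python
import re

# One to sixteen characters; letters, digits, period, hyphen, underscore.
# The set of permitted special characters is an assumption of this sketch.
FRAME_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,16}")


def is_valid_frame_name(name):
    """Return True if `name` satisfies the conservative frame-name rule."""
    return FRAME_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["HEADING", "DATA.DISPLAY", "ADDRESS-STRUC", "TWO WORDS"]:
    print(candidate, is_valid_frame_name(candidate))
```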
Just after the FRAME-ID statement, you can add another COMMENTS statement in the same form as the one described earlier. [See B.2.2.] The COMMENTS statement here will most likely describe how the frame will be used, or, if it is an indirect frame, which other frame or frames called it. [See B.4.8.7.]
The DIRECTION statement specifies the direction of the data mapping. A frame used for data output will have the value OUTPUT for this statement.
For example,
DIRECTION = OUTPUT;
Frames for input will have the value INPUT. A third value, INOUT, is used primarily in formats used in full-screen applications [See D.1.2.] though it may also be specified for frames containing code that is shared between input and output formats. [See D.1.]
In other words, the output commands DISPLAY, TYPE, SCAN and TRANSFER may cause execution of frames of DIRECTION = OUTPUT. The input commands ADD, UPDATE, MERGE and BATCH (in SPIBILD) cannot cause execution of such frames; instead, they may cause execution of frames of DIRECTION = INPUT. Neither of these statements is meant to imply that such frames will necessarily be executed when one of those commands is executed; that depends on other factors as well. [See B.3.4, B.5.2.]
By default, if no value is coded for the DIRECTION statement (i.e., the statement is omitted from the frame definition), it is given the value of OUTPUT. Getting into the habit of explicitly coding it is recommended, however, since forgetting to code it properly on an input frame would cause a compilation error.
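The relationship between commands and frame directions can be summarized as a small lookup. The Python sketch below is a model for study, not SPIRES code. The name "may_execute" is hypothetical, and the treatment of INOUT frames as eligible for both groups of commands is an assumption based on the description of INOUT as covering code shared between input and output formats. As in the text, a True result means only that execution is possible, not that it will necessarily happen.

```python
OUTPUT_COMMANDS = {"DISPLAY", "TYPE", "SCAN", "TRANSFER"}
INPUT_COMMANDS = {"ADD", "UPDATE", "MERGE", "BATCH"}


def may_execute(command, direction=None):
    """Return True if `command` may cause a frame of `direction` to execute.

    An omitted direction defaults to OUTPUT, as the DIRECTION statement does.
    """
    direction = (direction or "OUTPUT").upper()
    command = command.upper()
    if direction == "OUTPUT":
        return command in OUTPUT_COMMANDS
    if direction == "INPUT":
        return command in INPUT_COMMANDS
    if direction == "INOUT":
        # Assumption: shared frames are eligible for either group.
        return command in OUTPUT_COMMANDS | INPUT_COMMANDS
    raise ValueError(f"unknown DIRECTION value: {direction}")


print(may_execute("DISPLAY"))           # direction omitted, so OUTPUT
print(may_execute("ADD"))               # an input command, OUTPUT frame
print(may_execute("ADD", "INPUT"))
```

The second call shows why the default matters: an input frame whose DIRECTION statement was forgotten is treated as an output frame.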
Generally speaking, an indirect frame has the same direction as the frame that calls it, but that is not always the case. [See B.4.8.7, B.16.3, C.13.3, D.1.2.]
The FRAME-DIM statement defines the size of the two-dimensional array (also known as the format "buffer") into which the data for an output frame is placed. In other words, the values given are the dimensions of the frame being defined. This statement also specifies whether SPIRES will process the frame "line by line" or as a "fixed frame" (see below). This statement is very important -- it may only be omitted when the frame being defined is not placing data in or reading data from the buffer. [See B.4.8.7.]
The syntax of the FRAME-DIM statement is:
FRAME-DIM = nrows, ncols; or FRAME-DIM = nrows; or FRAME-DIM = , ncols;
where "nrows" and "ncols" are integers representing the number of rows down and the number of columns across the frame respectively. The only restriction on the size of these two values is that their product (nrows times ncols) must be less than 65,536 (64K). (For purposes of comparison, a standard terminal screen, 24 rows by 80 columns, would have a product of 1,920, and a printer page of 60 rows by 150 columns would have a product of 9,000.)
If the value of "nrows" is given as "0" (zero) or omitted, as in the last form shown above, then "line by line processing" goes into effect (see below). If the value of "ncols" is 0, then the width of the destination area (such as the active file) will be used; in other words, the width is dynamically set, based on the final destination of the data. For the active file and the terminal, both in SPIRES and batch SPIRES, the value used for "ncols" is the value of the system variable $LENGTH, which is 72 by default. If the format is used to display records in other device services areas, the width of the area will be used for "ncols". [See the note below about SET HCLIP, and see the SPIRES manual "Device Services" for more details.]
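The rules for interpreting a FRAME-DIM value can be collected into a short routine. The Python sketch below is a model for checking your own values, not SPIRES code. The name "parse_frame_dim" is hypothetical; the sketch assumes that an omitted "ncols" behaves like a value of 0, and it applies the size limit to the effective width rather than to the coded value.

```python
MAX_PRODUCT = 65536   # nrows times ncols must be less than this
DEFAULT_LENGTH = 72   # default value of $LENGTH, per the text


def parse_frame_dim(value, dest_width=DEFAULT_LENGTH):
    """Interpret the value of a FRAME-DIM statement, e.g. '24,80' or ',68'."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) > 2:
        raise ValueError("FRAME-DIM takes at most two values")
    nrows = int(parts[0]) if parts[0] else 0
    ncols = int(parts[1]) if len(parts) == 2 and parts[1] else 0
    if nrows < 0 or ncols < 0:
        raise ValueError("dimensions may not be negative")
    # A width of 0 means "use the width of the destination area".
    effective_ncols = ncols or dest_width
    if nrows * effective_ncols >= MAX_PRODUCT:
        raise ValueError("nrows times ncols must be less than 65,536")
    return {
        "nrows": nrows,
        "ncols": effective_ncols,
        "line_by_line": nrows == 0,  # 0 or omitted nrows
    }


print(parse_frame_dim("24,80"))
print(parse_frame_dim(",68"))
print(parse_frame_dim("30"))
```

Under these assumptions a value of "30" would yield a fixed frame 30 rows deep and as wide as the destination area, 72 columns by default.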
When the frame is executed, SPIRES will construct a buffer in memory having the dimensions given. Subsequent instructions within label groups may reference any position within the frame by citing its row and column numbers. References to positions outside the frame dimensions (too high a row or column number, for instance) may cause an S808 error when the frame is executing -- they will not cause a compilation error. [See B.6.2.] When the frame finishes executing, the buffer is "flushed" (that is, released from main memory and sent to the specified output device, such as your terminal).
The SET HCLIP Uproc allows you to display records in a device services area when the right margin of your format's FRAME-DIM would otherwise be too wide for the area. HCLIP stands for "horizontal clipping", and the effect is that when records are displayed in a device services area, the right portion of the data output by the format will simply not be displayed, rather than giving you S808 or S825 errors. You will see only what will fit in the dimensions of the device services area.
A primary use of this Uproc is in report formats coded for use in Prism. A report might be too wide to fit on Prism's screen, so one option is simply to forbid users to display the report online (i.e. they must print the report in order to see it). Alternatively, if the SET HCLIP Uproc is used in the format, you can allow the application users to display the portion of the report that will fit on the Prism screen. (The entire wider report would still be generated for printing.)
Code SET HCLIP in an initial frame, so that the setting can be established before frame dimensions are set. The HCLIP setting is reinitialized each time a multi-record output command is issued; there is also a SET NOHCLIP Uproc to turn it off.
The following principles should be considered when setting frame dimensions:
These guidelines seem to suggest that you should get the dimensions exactly right -- "too large" means inefficiency, "too small" means an error if the record being formatted is larger than anticipated. Ideally that is true, but it is somewhat impractical when you are dealing with records whose sizes vary. Given only these principles on "fixed frame dimensions", you should probably risk erring on the side of "too large".
However, there are other possibilities to consider. A neat alternative is to take advantage of "flush processing". You set "nrows" to a low, reasonable number and then specify the SET FLUSH Uproc. [See B.4.8.10, E.2.1.18.] Then, if a label group tries to place data in a row beyond the last row of the frame, the partially completed buffer is flushed, as described above, and format processing continues, constructing and releasing rows of formatted data one by one. Once "flush processing" begins, each row is sent to the output device as soon as some value is positioned in the next row.
For example, if a data record that would require 35 rows is placed in a buffer of 30 rows that allows flush processing, as soon as SPIRES tries to place data in row 31, the first 30 rows would be flushed to, say, the active file. Then, when SPIRES tries to place data in row 32, row 31 is flushed, and so on. This process can continue indefinitely; the record may require just a few extra rows beyond the fixed frame dimensions, or hundreds. The SET FLUSH Uproc, which is specified in the format declaration section [See B.5.] grants you this flexibility.
One significant limitation of "flush processing" should be kept in mind: Once a frame or a row has been flushed, you cannot put any more data into it. Compare the two "frames" shown below. The string AAAAAAA represents an occurrence of the element A, the string BBBBBBBB represents an occurrence of the element B, and the strings made up of periods represent blanks. Both frames have fixed dimensions, but Frame 2 also has SET FLUSH. For both frames, you want to do the same thing: place all the occurrences of element A on the left, and all the occurrences of element B on the right, both sets of occurrences beginning on the first line. Note that all occurrences of A will be processed before any occurrences of B:
Frame 1 (frame-dim = 2,17)      Frame 2 (frame-dim = 1,17; flush)
AAAAAAA..BBBBBBBB               AAAAAAA..........
AAAAAAA..BBBBBBBB               AAAAAAA..BBBBBBBB
                                .........BBBBBBBB
Presumably you want both to look like Frame 1. However, because the first row of Frame 2 is flushed when the second occurrence of element A is positioned, the first occurrence of element B cannot be positioned there; the row is already gone. The best that can be done at that point is to begin the occurrences of element B on the same row as the last occurrence of element A, as shown. Summarizing this example, we can say that multiply occurring elements to be positioned side by side with other multiply occurring elements should not usually be done within flush processing.
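The behavior of the two frames can be imitated with a small model. The following Python sketch is not SPIRES code and makes several assumptions: the class "Frame", its methods, and the function "side_by_side" are hypothetical names; blanks are shown as periods, as in the diagram; and an attempt to write outside a frame is reported with a Python exception rather than a SPIRES error code.

```python
class Frame:
    """Toy model of an output frame buffer with optional flush processing."""

    def __init__(self, nrows, ncols, flush=False):
        self.ncols = ncols
        self.flush = flush
        self.first = 1        # lowest row still held in the buffer
        self.last = nrows     # highest row the buffer can hold right now
        self.rows = {}        # row number -> list of characters
        self.output = []      # rows already sent to the output device

    def _release(self, through):
        """Send rows self.first..through to the output device."""
        for row in range(self.first, through + 1):
            cells = self.rows.pop(row, ["."] * self.ncols)
            self.output.append("".join(cells))

    def put(self, row, col, text):
        """Place text in the buffer starting at (row, col), both 1-based."""
        if row < self.first:
            raise ValueError(f"row {row} has already been flushed")
        if col < 1 or col + len(text) - 1 > self.ncols:
            raise ValueError("value falls outside the frame width")
        if row > self.last:
            if not self.flush:
                raise ValueError(f"row {row} is outside the frame")
            # Flush processing: release everything above the new row and
            # continue with a one-row buffer.
            self._release(row - 1)
            self.first = self.last = row
        cells = self.rows.setdefault(row, ["."] * self.ncols)
        cells[col - 1:col - 1 + len(text)] = text

    def finish(self):
        """The frame is complete: flush whatever is left in the buffer."""
        self._release(self.last)
        self.first = self.last + 1
        return self.output


def side_by_side(frame, a_values, b_values):
    """Place all A occurrences on the left, then all B occurrences on the right."""
    for row, value in enumerate(a_values, start=1):
        frame.put(row, 1, value)
    # B should start on row 1, but it can start no higher than the lowest
    # row still in the buffer.
    for row, value in enumerate(b_values, start=frame.first):
        frame.put(row, 10, value)
    return frame.finish()


a_occurrences = ["AAAAAAA", "AAAAAAA"]
b_occurrences = ["BBBBBBBB", "BBBBBBBB"]
frame1_rows = side_by_side(Frame(2, 17), a_occurrences, b_occurrences)
frame2_rows = side_by_side(Frame(1, 17, flush=True), a_occurrences, b_occurrences)
print("\n".join(frame1_rows))
print("\n".join(frame2_rows))
```

The model reproduces both diagrams: the fixed frame keeps row 1 available for element B, while the flushed frame has already released it by the time element B is processed.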
Line-by-line processing takes flush processing to the extreme. Each row of the frame is constructed and then is flushed when SPIRES tries to put data in the next row. Because SPIRES is working with a smaller frame (a single row at a time), line-by-line processing is slightly more efficient than fixed-frame processing. If you use a format extensively, line-by-line processing could represent a significant savings over time. On the other hand, line-by-line processing shares the same minor restriction as flush processing -- once you have written data into row 2, you cannot put any in row 1.
There may be ways around this restriction (such as another way to design the format or specify the instructions), but it is generally better to use fixed frame dimensions and code straightforwardly than to use line-by-line processing with a few kluges. The extra coding you do or the extra processing SPIRES must do in the latter case may far outweigh any efficiency advantages gained by using line-by-line processing. Formats that put out row after row of data down the page are natural candidates for line-by-line processing, though.
As mentioned earlier, line-by-line processing is requested by coding "0" for "nrows" in the FRAME-DIM statement:
FRAME-DIM = 0,68;
Other aspects of frame dimensions are discussed elsewhere as appropriate.
You can change the frame dimensions set for a frame by using the SET NROWS and SET NCOLS "Uprocs". (A Uproc is a statement requesting a particular procedure to be executed at that point in the format.) [See B.4.5.5.] Their syntaxes are:
  UPROC = SET NROWS = nrows;
  UPROC = SET NCOLS = ncols;
where "nrows" and "ncols" are non-negative integers, integer variables, or expressions whose results can be converted to integers. If SET NROWS or SET NCOLS appears in a label group within the frame, the value of "nrows" or "ncols" must be less than the corresponding FRAME-DIM value. The frame dimensions of the buffer will change immediately. Further changes to the frame dimensions may be made in the frame, as long as they have successively smaller values. (In general, you should not set "nrows" to zero within a frame in an attempt to switch to line-by-line processing. However, you can do it if you have not already put any data into the buffer and if the SET FLUSH Uproc is in effect.)
These UPROCs can thus be used, for example, to change the number of rows for fixed-dimension frames such as initial or header frames once you know how many rows of data they have used. [See B.10.3.1.]
The SET NROWS and SET NCOLS Uprocs can also change the frame dimensions of a frame before it is entered, if they are coded as UPROCs in the frame declaration of the Format Declaration section. (They would be coded after the FRAME-NAME statement for the frame they are to be applied to.) [See B.5.2.] Then they would be executed before the frame itself was executed. Here the value of "nrows" or "ncols" may be larger than the corresponding FRAME-DIM value, but whatever it is, it will override that FRAME-DIM value when the frame is subsequently entered. Setting the value of "nrows" to zero will set line-by-line processing for the frame when it is entered.
The USAGE statement, in combination with the DIRECTION statement, determines which commands can cause execution of the frame being defined. For output frames, the most common usage is DISPLAY, indicating that DISPLAY, TYPE and SCAN commands can cause their execution.
The syntax of the USAGE statement is:
USAGE = value [, NAMED];
where "value" is one of the usage values shown below, and NAMED is an additional option, which is described below.
For an output frame, four values are allowed:
By default for DIRECTION=OUTPUT frames, the USAGE is DISPLAY. That is, if no USAGE statement is coded, or if USAGE = NAMED is coded by itself, the primary usage value is DISPLAY.
For most simple output formats, USAGE = DISPLAY is coded. Later we will see how the other values, including NAMED, are commonly used. [See D.1.1.1.]
Next in the frame definition usually come the label groups, which will specify the processing that should be done: which elements should be accessed, where their values should be placed, etc. Alternatively, other statements may appear next, shifting execution control to other formats. [See B.11.]
Label groups are more common, and they are the subject of the next chapter. First, however, here is the start of a sample frame definition, preceded by the Identification section, using the statements described in this chapter:
ID = GQ.DOC.LEAVE.DISPLAY;
FILE = GQ.DOC.LEAVE.SYSTEM;
RECORD-NAME = REC01;
FRAME-ID = EMPLOYEE;
COMMENTS = This frame will be used to access elements in
the EMPLOYEE goal records.;
DIRECTION = OUTPUT;
FRAME-DIM = 10,68;
USAGE = DISPLAY;
Note that there is another statement that may appear between the DIRECTION and FRAME-DIM statements, the SUBTREE statement. SUBTREE is specified for indirect frames that are used to access element structures, and will be discussed later, along with indirect frames. [See B.8.2.]
The second part of a frame definition is a collection of program instructions arranged in "label groups". Most frames have multiple label groups, at least one for each element being processed -- a single label group generally retrieves only one element.
Label groups basically have two purposes: to handle a single data value, and to control format execution. A single label group may do either or both. Specifically, label group statements have five functions:
- 1) to access the value;
- 2) to place the value;
- 3) to control (test and/or change) the value;
- 4) to control (test and/or change) the placement of the value;
- 5) to control the execution of the label groups.
More than twenty different statements are available within a label group for an output frame, each one serving at least one of the above functions. The wide variety of possibilities is just barely suggested by their names, shown below:
  Access the value     Place the value      Control the value
  ----------------     ---------------      -----------------
  GETELEM              PUTDATA              DEFAULT
  VALUE                                     OUTPROC
  IND-STRUCTURE                             INPROC
                                            INSERT
                                            UPROC
                                            ENTRY-UPROC

  Control the display      Control the execution
  -------------------      ---------------------
  TITLE, TSTART            LABEL
  MARGINS                  IND-FRAME
  MAXROWS                  DEFAULT
  START                    UPROC
  LENGTH                   ENTRY-UPROC
  BREAK                    LOOP
  XSTART                   COMMENTS
  UPROC
  ENTRY-UPROC
Several other label group statements are available for input and "inout" frames. [See C.3.]
When SPIRES executes a frame, it begins with the first label group, executes the instructions described therein, and then proceeds to the next one. In general, the statements within a label group are executed independently, but some of them will have an effect on others -- as an extreme example of this, a UPROC = JUMP statement could possibly "undo" all the rest of the statements that preceded it in the label group. So there are two ways to look at a label group: first, as a collection of individual instructions, executed one by one; and second, as a single "super-instruction" that executes all at once, in most cases handling one data element.
The latter view is preferable for several reasons. First, whenever execution branching occurs (e.g., skipping some instructions, or looping back to earlier instructions), execution always resumes at the start of a label group. You cannot jump into the middle of a label group.
Second, a single execution of a label group handles a single value. Each label group has a value (actually in two forms, represented by the system variables $CVAL and $UVAL) and many label groups deal exclusively with their value -- retrieving it from the data base record, testing and adjusting it, and placing it in the buffer. Although these individual activities can be split into multiple label groups, the SPIRES formats language is designed to handle all a value's processing, in most cases, in one label group of statements.
Third, because the statements in a label group are actually elements in the LABEL-GROUP structure of a format definition record, they will be compiled and executed in the order in which they are stored in the FORMATS subfile, which may be different from the order in which you coded them. [See B.1.3, E.1.] In other words, you may code the three label group statements GETELEM, PUTDATA, and UPROC = JUMP in that order, but when the record is added to the FORMATS subfile, the order will be changed to GETELEM, UPROC = JUMP, and PUTDATA, and the statements would be executed in the changed order. (The PUTDATA statement would never execute, because the JUMP Uproc tells SPIRES to jump to the next label group.) Hence, treating a label group as a single, large instruction whose component statements work together in a standard order is more reliable than treating it as just a collection of instructions that will be executed. This subject will come up again in examples later in this chapter and the next.
Below is a list of the statements most commonly found in output format label groups, showing the order in which they are stored in the FORMATS record and are executed.
  LABEL
  ENTRY-UPROC
  TSTART
  TITLE
  GETELEM          <---(Only one of these 3 statements
  VALUE                 per label group)
  IND-STRUCTURE
  IND-FRAME
  DEFAULT
  MARGINS
  MAXROWS
  LENGTH
  START
  OUTPROC
  INSERT
  BREAK
  ADJUST
  UPROC
  DISPLAY
  PUTDATA
  SORT
  LOOP
  XSTART
This chapter covers most of the label group statements listed above; a couple of them used primarily with structures, report formats or full-screen applications are discussed later. [See B.8.3, B.10.8, B.13.] The first few sections cover the most basic statements: LABEL, GETELEM, VALUE and PUTDATA. The remaining statements are then discussed in more or less the order of the categories shown above.
The LABEL statement is the first and only required statement in a label group. It identifies the beginning of the label group and, if given a value, identifies the label group itself.
The syntax of the LABEL statement is:
LABEL = label.name;
where "label.name" is a string of one to sixteen characters. Like the FRAME-ID statement, the LABEL value should contain only alphanumeric characters and not special characters, with the common exceptions of periods, hyphens or underscores. [See B.2.1.] No blanks are allowed in the label name either.
If no label name is given (i.e., the LABEL statement is given a null value), the syntax is:
LABEL;
Label names are specified for several different reasons. Assigning a name lets you "jump" to that particular label group explicitly, using the XEQ PROC and JUMP Uprocs. The name will also be used as the default value for a subsequent GETELEM, PUTELEM or REMELEM statement in the label group. [See B.4.2, C.3.3, C.5.1.2.] The name will also be used by SPIRES to identify the label group in error messages if an error within the label group is detected during compilation. (If no label name is given, the label group is identified by a count from the last named label group.) And if you compile your format with the LABELS option, your label names will be used in SET FTRACE output. (SET FTRACE is a formats tracing and debugging command.) [See B.6.2, B.7.2.1.]
Here are some examples of LABEL statements:
  LABEL = ADDRESS;
  LABEL = ITEM.NAME;
  LABEL = START-LOOP;
The GETELEM statement tells SPIRES to retrieve an occurrence of the named element from the record being processed for handling by the current label group. In general, other statements within the label group will then position the element value in the frame, though that is not required -- the GETELEM statement simply retrieves the occurrence, and the rest of the label group determines what is done to it.
The most explicit form of the GETELEM statement is:
GETELEM = element.name;
where "element.name" is the name of the element to be retrieved. Another form takes advantage of the label name supplied in the LABEL statement:
GETELEM;
Here the name given in the previous LABEL statement is presumed to be the name of the element to be retrieved by GETELEM. [See B.4.1.]
For example, the two sets of LABEL and GETELEM statements below are equivalent:
  LABEL = PHONE.NUMBER;        LABEL;
  GETELEM;                     GETELEM = PHONE.NUMBER;
Either set could be the beginning of a label group that is to retrieve the PHONE.NUMBER data element. If both the LABEL and the GETELEM statements are given values, the value given in the GETELEM statement is the element that will be retrieved.
When the element has multiple occurrences and you want to retrieve all of them, one by one, you must use the LOOP statement. [See B.4.8.4.] However, if you only want to retrieve one of the occurrences, you may request it explicitly:
GETELEM = element.name(n);
where "n" represents an integer, either 0 (zero) or positive. (Beware: the first element occurrence is numbered 0, the second is 1, the third 2, etc.) Another form, "GETELEM = element.name::n", is now considered obsolete, though it may still be used; it was replaced by the form shown above, because the obsolete form is confusing in regard to variables, whose subscripts are indicated similarly. [See B.4.2.1 for information on using variables for the element name or the occurrence number.]
If no occurrence number is given, then the next occurrence (usually the first occurrence, number 0) is retrieved by GETELEM. Note, however, that the two statements below:
  GETELEM = element.name;
  GETELEM = element.name(0);
are not exactly equivalent. If you use the LOOP statement to retrieve multiple occurrences of the element, code the first statement rather than the second, for the second tells SPIRES to always retrieve the first occurrence. With the first statement, a LOOP statement will cause SPIRES to always retrieve the "next" occurrence. [See B.4.8.4.] Alternatively, in some situations you can use the SET STARTOCC Entry-Uproc to cause a loop to begin with a specified occurrence. [See B.4.8.5.]
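The difference between the two statements can be pictured with a small model. The Python sketch below is not SPIRES code; the class "ElementReader" and its method names are hypothetical, and the phone numbers are invented sample data. It only illustrates that occurrence numbers start at 0 and that a looped retrieval without an occurrence number advances, while a retrieval pinned to occurrence 0 does not.

```python
class ElementReader:
    """Toy model of one element's occurrences; not SPIRES code."""

    def __init__(self, occurrences):
        self.occurrences = list(occurrences)
        self.position = 0  # occurrence number the next loop pass will retrieve

    def get_next(self):
        """Models GETELEM = element.name; under LOOP: return the next occurrence."""
        if self.position >= len(self.occurrences):
            return None  # no more occurrences
        value = self.occurrences[self.position]
        self.position += 1
        return value

    def get_at(self, n):
        """Models GETELEM = element.name(n); occurrence numbers start at 0."""
        return self.occurrences[n] if 0 <= n < len(self.occurrences) else None


phones = ElementReader(["555-0100", "555-0101", "555-0102"])
looped = [phones.get_next() for _ in range(3)]   # three different occurrences
pinned = [phones.get_at(0) for _ in range(3)]    # the first occurrence, three times
```

In the model, "looped" holds all three values in order, whereas "pinned" holds the first value three times, which is the behavior to avoid when coding a loop.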
Several other, less common forms for specifying the element to be retrieved are discussed later. [See B.4.2.1.]
The GETELEM statement is not usually coded for elements that are structures, though it can be. A structure is most often processed with an indirect frame. [See B.8.]
When a GETELEM statement fails to retrieve a value (that is, no occurrences or, in the case of a loop, no more occurrences of that element exist), the rest of the label group is skipped, and execution resumes with the next label group. The DEFAULT statement can be used to force SPIRES to continue executing the current label group in such cases. [See B.4.5.1.] Do not confuse "no occurrence" with a "null occurrence" where the element occurs but has no value. A null occurrence will not cause the rest of the label group to be skipped; however, the retrieved value is null.
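The distinction between "no occurrence" and a null occurrence can be sketched as follows. This Python model is not SPIRES code; the function "run_label_group", the sentinel NO_OCCURRENCE and the sample strings are hypothetical, and supplying a null value when DEFAULT is coded without a value is an assumption made for the sake of the model.

```python
NO_OCCURRENCE = object()  # marks "GETELEM retrieved nothing", distinct from ""


def run_label_group(occurrence, default_coded=False, default_value=""):
    """Return ("skipped", None) or ("executed", value) for one label group.

    occurrence is NO_OCCURRENCE when the element does not occur (or no more
    occurrences remain), "" for a null occurrence, or the retrieved string.
    """
    if occurrence is NO_OCCURRENCE:
        if not default_coded:
            return ("skipped", None)         # rest of the label group is skipped
        return ("executed", default_value)   # DEFAULT keeps the label group going
    return ("executed", occurrence)          # includes the null occurrence ""


missing = run_label_group(NO_OCCURRENCE)
null_value = run_label_group("")
defaulted = run_label_group(NO_OCCURRENCE, default_coded=True, default_value="(none)")
```

Only the first call skips the label group; the null occurrence is processed like any other value, and the third call continues because DEFAULT is coded.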
When a GETELEM statement is executed, values for several important system variables are established. The most important are $UVAL ("Unconverted VALue") and $CVAL ("Converted VALue"). These two variables represent two forms of the element value retrieved, and they may be used to test or alter the value. [See B.4.5.4.] In fact, it is $CVAL that will be placed in the output buffer by the PUTDATA statement. [See B.4.4.]
Some other forms of the value for the GETELEM statement, though not commonly used, are discussed in this "optional" section. The forms are:
  1) GETELEM = #variable;
  2) GETELEM = @n;
  3) GETELEM = @element.name;
Each of these forms is discussed below. In addition, a use of the GETELEM statement with structures is discussed.
1) The first form may be used when the name of the element to be retrieved is stored in the given string variable. Alternatively, if it is a four-byte value of type HEX, then it represents the "element ID" ($ELEMID) of a particular element. [See E.2.3.10.] In either case, SPIRES uses the variable to determine the element whose values are to be retrieved.
Although a LOOP statement may cause the label group to be executed repeatedly, SPIRES will only do the variable substitution the first time through. In other words, changing the value of the variable within the label group will not cause the label group to retrieve a different element if the re-execution of the label group is caused by the LOOP statement. [See B.4.8.4.] However, if you leave the label group and return to it later, the current value of the variable will be used.
2) When "@n" (where "n" is an integer) is given as the value of GETELEM, SPIRES uses that integer as the absolute element number within the record (or within the structure, if the frame is an indirect frame processing a structure). Elements are numbered from 0 (zero) both at record-level and within a structure; the slot number key of a slot record-type is always number 0. This form may not be specified using variables.
3) The "@element.name" form is related to the "@n" form discussed above. It may be coded in an indirect frame that processes structures if the frame definition contains multiple SUBTREE statements. [See B.8.2.] This form tells SPIRES to convert the element name to the absolute element number, as in "@n". Then, this indirect frame can be called to process other structures, as listed in the SUBTREE statements, even though the element names in the GETELEM statements do not match the element names in the structure being processed. SPIRES will change the element names to the element numbers for the first structure listed in the SUBTREE statement and then use those element numbers when retrieving elements in the other structures.
As mentioned earlier, the GETELEM statement is seldom used to retrieve an entire structure. [See B.4.2.] In most cases, an indirect frame is coded to process the elements within the structure individually. However, a structure may be retrieved all at once with a GETELEM statement, usually if the structure is defined with the $STRUC or $STRUC.OUT system proc (A33) for an OUTPROC. [See B.8, C.5.3.]
You may also use GETELEM with a structure to find out how many occurrences of the structure exist:
  LABEL = ADDRESS.STRUC;
  GETELEM;
  DEFAULT;
  UPROC = LET NUM.ADDRESSES = $ELOCC;
The user variable NUM.ADDRESSES is set to the value of the system variable $ELOCC, which contains the number of occurrences of the data element processed by the GETELEM statement. [See B.9.3, E.2.2.26.]
Because an element occurrence can be specified in the same way as an occurrence of a variable in an array, using the symbol '::', confusion can arise. For instance, examine the following statement:
GETELEM = #ELEMENT::#N;
Does this statement refer to the "Nth" occurrence of the element represented by #ELEMENT or to the "Nth" value of the variable array represented by #ELEMENT? SPIRES assumes the latter case -- #N represents the occurrence number of the variable, not of the element.
To specifically request the occurrence of the element rather than the variable:
GETELEM = #ELEMENT(#N); <- the "Nth" occurrence of the
element named in #ELEMENT
Thus, these forms are equivalent:
GETELEM = ELEM::#N; <- the Nth occurrence of ELEM
and
GETELEM = ELEM(#N); <- the Nth occurrence of ELEM
In both cases, #N represents the occurrence number of the element ELEM. However, when the element name is in a variable, these two forms are not equivalent:
GETELEM = #ELEMNAMES::#N; <- the element named in the Nth
item in the ELEMNAMES array
and
GETELEM = #ELEMNAMES(#N); <- the Nth occurrence of the element
named in the variable ELEMNAMES
To specify both variable and element occurrence:
GETELEM = #ELEMNAMES::#N(#OCC); <- the OCCth occurrence of the
element named in the Nth
item in the ELEMNAMES array
You can also specify the element occurrence number with an indexed variable:
  GETELEM = ENTRY.DATE(#OCC::I);
  GETELEM = #ELEM(#OCCNUM::I);
The letter "I" as an occurrence number for the variable indicates that the definition of the variable included the INDEXED-BY statement, pointing to another variable whose value is to be used as the occurrence number for the first variable. [See the discussion of the INDEXED-BY statement in the manual "SPIRES Protocols" for further information.]
Sometimes in an output frame you have other values that are not elements that you want to be placed in the frame. The VALUE statement can be used to give the label group a value to process, just as it processes an element value accessed by the GETELEM statement. In fact, the GETELEM and VALUE statements are mutually exclusive: you may not code both of them in a single label group.
Specifically, the VALUE statement can be used:
- when you want to place a string value (e.g., some text) in the frame when the value is not directly tied to an element being positioned (cf. the TITLE statement).
- when you want to place the result of an expression (perhaps the value of a system or user variable) in the frame.
- when you want a value to be processed by an action or system proc whose processing is unavailable or difficult to simulate through system functions. [See B.4.5.2.]
The syntax of the VALUE statement is:
VALUE = expression;
where "expression" is an expression following the same rules as expressions in a LET command or LET Uproc. [See B.4.8.10.] The type of the value (e.g., string, integer) depends on the result of the evaluation of the expression; it is not by default converted to a string value (see below).
Each individual part of the expression must not exceed 256 characters in length. The VALUE statement follows the LABEL statement in a label group definition.
For example, here is a VALUE statement specifying a string value to be placed in the frame:
LABEL = RECORD.TITLE; VALUE = 'Personnel Record';
This value might appear at the top of a personnel record, identifying the data that will follow. Note that string values should usually be enclosed in apostrophes. [See B.4.3.1.]
Values that are expressions or whose type is not string may require extra care in handling:
LABEL; VALUE = 3 + 4;
During format execution, SPIRES will evaluate the expression. By default, arithmetic is done using packed decimal values, so the pieces of this particular expression, "3" and "4", are converted to packed numbers, and the result, "7", is also a packed decimal. However, before the value is output, i.e., placed in the format, it needs to be converted to a string value. The easiest way to accomplish this is to apply the $STRING function to the expression:
VALUE = $STRING(3 + 4);
Another alternative would be to code an OUTPROC, such as $PACK.OUT, that would convert the value to a string. [See B.4.5.2.] But regardless of your method, it is important to know the type of the evaluated expression and, if it is not a string and you are going to position the value in an output frame, to convert it to a string. [See B.4.5.4 for a further discussion of type conversions in this regard.]
The value of the evaluated expression is assigned to the system variable $UVAL. When SPIRES executes the VALUE statement (just like the GETELEM statement), the values of several important system variables are established, in particular $UVAL and $CVAL. [See B.4.5.]
Although it is not absolutely required, character strings in the VALUE statement should be enclosed in apostrophes:
VALUE = 'END OF DATA';
If a string value is not enclosed in apostrophes or quotation marks, blanks within the value will be ignored when it is evaluated. For example,
VALUE = END OF DATA;
would become ENDOFDATA for processing. It is interpreted as three separate strings to be concatenated together. (Blanks not surrounded by apostrophes or quotation marks are considered concatenation operators if no specific operator is given.)
It is possible, though not recommended, to use quotation marks (") instead of apostrophes (') around a string. However, because the VALUE statement is an element in a FORMATS goal record written in the standard SPIRES format, the quotation marks around the string must be doubled, and the entire value must be enclosed in quotation marks.
Here are some examples of values, the first being a mixed expression:
  1a. "The sum of 3+4 is " $STRING(3+4)
  1b. ""The sum of 3+4 is "" $STRING(3+4)
  1c. VALUE = """The sum of 3+4 is "" $STRING(3+4)";
Step A shows the original value. Step B shows the quotation marks doubled, and step C shows the entire value being placed in quotation marks, as the value is assigned to the VALUE statement.
But compare that method to the apostrophe method:
VALUE = 'The sum of 3+4 is ' $STRING(3+4);
Apostrophes are much easier to use than quotation marks in this context.
Special characters in strings do not require much special handling. As always, the characters to be careful with are the apostrophe, the quotation mark, and the semicolon. If the string contains an apostrophe and is to be surrounded by apostrophes, the internal apostrophe must be doubled:
VALUE = 'This is Mickey''s watch.';
This is also true for quotation marks, except that once again, the rules for quotation marks within element values for standard format record input must be followed too:
  2a. He said "Hello" and left.
  2b. "He said ""Hello"" and left."
  2c. VALUE = """He said """"Hello"""" and left.""";
The original value is shown in step A. Step B shows the value in quotation marks to indicate that it is a string value, and step C shows the value as given in a VALUE statement, showing the entire value in quotation marks and all other quotation marks doubled.
If a value contains a semicolon, the entire value of the VALUE statement must be enclosed in quotation marks:
VALUE = "'END OF DATA; END OF OUTPUT'";
This example shows that the value of the VALUE statement is the string 'END OF DATA; END OF OUTPUT', including the apostrophes.
To summarize, when you have a character string that is either part of or the entire value of the VALUE statement:
- First, place the string in apostrophes (preferred) or quotation marks.
- Second, if the string is enclosed in apostrophes and contains internal apostrophes, double the internal apostrophes. Similarly, if the string is enclosed in quotation marks and contains internal ones too, double the internal ones.
- Third, apply the rules for standard format record input to the entire value of the VALUE statement: if the value contains quotation marks, double them and place the entire value in quotation marks; and if the value contains a semicolon, place the entire value in quotation marks.
These rules also apply to strings in LET and SET Uprocs. [See B.4.8.10.]
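The three rules summarized above are mechanical enough to express as a short routine. The Python sketch below is an illustration only; the function names "quote_string" and "value_statement" are hypothetical helpers, not part of SPIRES. It reproduces the examples shown earlier in this section.

```python
def quote_string(text, quote="'"):
    """Rules 1 and 2: enclose the string, doubling any internal copies of the quote."""
    return quote + text.replace(quote, quote * 2) + quote


def value_statement(expression):
    """Rule 3: apply the standard-format input rules to the whole value."""
    if '"' in expression:
        # Double every quotation mark, then enclose the entire value.
        expression = '"' + expression.replace('"', '""') + '"'
    elif ";" in expression:
        # A semicolon only requires enclosing the entire value.
        expression = '"' + expression + '"'
    return "VALUE = " + expression + ";"


watch = value_statement(quote_string("This is Mickey's watch."))
hello = value_statement(quote_string('He said "Hello" and left.', quote='"'))
end_of_data = value_statement(quote_string("END OF DATA; END OF OUTPUT"))
total = value_statement('"The sum of 3+4 is " $STRING(3+4)')
```

Comparing "watch" with "hello" shows why apostrophes are preferred: the apostrophe form needs one doubled character, while the quotation-mark form needs every quotation mark doubled twice over.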
The PUTDATA statement tells SPIRES to place the current value of the system variable $CVAL in the frame. That value usually represents the value created when the GETELEM or VALUE statement in the label group is executed.
The syntax of the PUTDATA statement is:
PUTDATA;
or
PUTDATA = n;
where "n" is an integer. The most common form is the first; the second form is discussed at the end of this section. [See B.8.1 for its use with indirect frames.]
The data value will be positioned in the frame starting in the row and column designated by the START statement [See B.4.6.1.] or, if no START statement is coded in the label group, at the default starting position.
A label group (or even all the label groups in a frame) may be as simple as the following:
  LABEL = TITLE;            LABEL;
  GETELEM;          or      VALUE = '*';
  PUTDATA;                  PUTDATA;
In the left example, the value of element TITLE is retrieved (GETELEM) and placed in the frame starting in the default position (PUTDATA). In the right example, the string value "*" is placed in the frame in the default position. In both examples, the default position would be column 1 of the next row of the frame (the equivalent of START = X,1).
Several system variables are reset by the successful execution of the PUTDATA statement, in particular $CROW (current row) and $CCOL (current column). [See E.2.2.14, E.2.2.15.]
When SPIRES executes a PUTDATA statement, it assumes that $CVAL is a string value; in other words, if $CVAL is not a string value but is, for instance, a packed decimal, SPIRES will not convert it to a string value before placing it in the frame. Instead, SPIRES will effectively "retype" the $CVAL variable to a string, as if the $RETYPE function were used. [See B.4.5.4 for a further discussion of data types and the PUTDATA statement.]
For example, suppose you have a label group that looks like this:
LABEL; VALUE = 3/2; PUTDATA;
The value of $CVAL is "1.5", but it is a packed decimal value. Internally, it is stored as the hexadecimal characters "01 5C FF", along with the information that the value should be interpreted as a packed decimal. However, SPIRES ignores that last piece of information when the PUTDATA is executed, interpreting the value as a string. Hence the value that should be interpreted as a packed decimal is interpreted as a string when it is placed in the buffer, and the result is "garbage", data that is useless to you, in the format. (Note: There are situations where you might want packed decimal or integer data to be output without conversion to character strings, so that the data can be submitted to some other program that can read it in that form. In such cases, you would use the technique shown above intentionally, and the results would not be garbage to you.)
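The effect of "retyping" rather than converting can be imitated outside SPIRES. The Python sketch below rests on two assumptions that go beyond this manual: that the first two bytes follow the conventional packed-decimal layout (two digits per byte, with a trailing sign nibble), and that characters are interpreted as EBCDIC (code page 037 is used here). The function "pack_digits" is a hypothetical helper, and the third byte shown in the text ("FF") is not modeled.

```python
def pack_digits(digits, negative=False):
    """Pack decimal digits two per byte, ending with a sign nibble (C plus, D minus)."""
    nibbles = digits + ("D" if negative else "C")
    if len(nibbles) % 2:
        nibbles = "0" + nibbles   # pad on the left to a whole number of bytes
    return bytes.fromhex(nibbles)


packed = pack_digits("15")        # the digits of 1.5; the decimal point is kept elsewhere
as_text = packed.decode("cp037")  # "retype": read the same bytes as characters
```

Here "packed" is the two bytes 01 5C, and reading them as characters yields a control character followed by an asterisk rather than the string "1.5", which is the kind of "garbage" described above.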
As shown above, the PUTDATA statement can be coded with an integer value to specify different processing for special circumstances.
"PUTDATA = 1" has an effect when all of the following conditions are true:
- 1) the current frame has fixed dimensions;
- 2) the current frame does not have SET FLUSH in effect; and
- 3) the current PUTDATA would cause the frame to overflow because the value is too long to fit in the remainder of the buffer and no other placement statement in the label group (e.g., LENGTH, MAXROWS) prevents the value from being too long.
When this situation arises and "PUTDATA = 1" is coded, the part of the value that fit in the buffer is blanked out, the system flag variable $OVERFLOW is set, and execution of the current frame stops. If the frame is an indirect frame, control returns to the calling frame; if the frame is a data frame, execution control continues with the next data frame, or if none exists, returns to command level. [See B.4.8.7, B.4.8.8, B.5.2.] The flag $OVERFLOW can be checked to determine whether those actions have taken place. [See E.2.1.23.]
If the situation occurs when the "1" option is not specified on the PUTDATA statement, the frame will overflow, causing the format processing to stop for the current record; error message S808 will be displayed.
A TITLE statement appearing in a label group having "PUTDATA = 1" will not be affected by the "1" option, meaning that the title could cause an S808 error. You should consider handling the title in a separate label group (positioning it with a VALUE statement rather than TITLE) with its own "PUTDATA = 1" if you want to use both the overflow-handling option and titles. [See B.4.7.]
"PUTDATA = 2" has an effect when the label group's value (that is, $CVAL) is null (i.e., has a length of zero). Normally, the value would not be placed in the buffer, meaning that the current row and column numbers would not reflect the placement of that value -- they would still reflect the placement of the last value placed in the buffer. If "PUTDATA = 2" is coded, then the current row and column numbers will be updated as if the null value had actually been placed in the buffer. [See B.4.6.1.]
"PUTDATA = 3" ensures that a value's length does not change if it wraps to other rows. With this option, if values are to be broken on blanks, succeeding blanks will not be stripped from the value when it wraps to the next line.
"PUTDATA = 4" acts like HOLDATA/FLUSHDATA on the containing label group.
"PUTDATA = 5" is intended for use when HOLDATA processing is to take place and the $CVAL being output is large (several lines) and would otherwise force the current screen out as a blank or nearly blank screen. PUTDATA = 5 tells the FORMATS processor to flush the beginning lines of the value onto the page despite the HOLDATA process.
Although a frame definition may contain label groups with only LABEL, GETELEM and PUTDATA statements, most label groups have more. The rest of this chapter will discuss the three other categories of label group statements, beginning with the statements that control the value that will be placed in the frame. Section B.4.6 will discuss those statements used to position the value within the frame, and B.4.7 will cover the statements used to control program execution.
In all of these sections, references will be made to the value being positioned in the frame. We will call that value "$CVAL", which is the name of a SPIRES system variable containing the value. It represents the "Converted VALue", that is, the value derived from a GETELEM or VALUE statement after it has been processed by INPROC, OUTPROC or INSERT statements. [See B.4.5.2, B.4.5.3.]
To be more specific, when SPIRES executes a label group containing either a GETELEM or VALUE statement, it establishes values for, among others, the two system variables $CVAL and $UVAL. For an element, $UVAL represents the internal, stored form of the element; $CVAL initially represents the external form, the form derived by processing the value through the OUTPROCs coded in the file definition or in the label group.
For a value or expression given in the VALUE statement, $UVAL and $CVAL are originally established with the same value, which is the result of the expression. Then, if an OUTPROC or INPROC statement appears in the label group, $UVAL is processed through that to establish a new $CVAL.
For either the GETELEM or VALUE value, if the INSERT statement appears, strings are then inserted in or appended to $CVAL, giving it a new value. Finally, the value of $CVAL can still be changed before it is placed in the frame, using the SET CVAL Uproc. [See B.4.5.4.]
One other statement allows you to set the value of $CVAL in some cases: the DEFAULT statement, which can be used to provide a default value when a GETELEM statement fails to find an element occurrence. [See B.4.5.1.]
When SPIRES executes a GETELEM statement and no element occurrence is retrieved, the remainder of the label group is skipped, and execution resumes with the next label group. However, when the DEFAULT statement is coded and the GETELEM statement fails:
- the remainder of the label group will be executed;
- if given in the DEFAULT statement, a "default" value will be supplied to $CVAL and $UVAL for further processing by the label group; and
- the $DEFAULT flag variable is set (see below).
The DEFAULT statement has no effect when $CVAL is established by a VALUE statement rather than a GETELEM statement. It also has no effect in input frames.
The syntax of the DEFAULT statement is:
DEFAULT [= value];
where "value" is a character string representing the value to be used when no value is retrieved by the GETELEM statement. This section will discuss the use of the DEFAULT statement's value. Later, the use of the DEFAULT statement to control label group execution will be discussed. [See B.4.8.4.]
The value supplied must be either a literal character string or a string variable; no other expressions are allowed. Like character strings given in the VALUE statement, this value should usually be enclosed in apostrophes. The same rules given for strings in the VALUE statement apply here. [See B.4.3.]
Here is an example of a DEFAULT statement:
LABEL = ADDRESS;
  GETELEM;
  DEFAULT = 'No address';
If no value is retrieved by the preceding GETELEM statement, the value "No address" will be provided instead.
When a default value is accessed, it becomes the value of both $UVAL and $CVAL system variables. The default value will not be processed through any OUTPROC for the element, whether the OUTPROC is given in the file definition or in the label group itself. [See B.4.5.2.] The INSERT statement, if coded, will be applied to the default value, however. [See B.4.5.3.]
The system flag variable $DEFAULT is always set when default value processing occurs; you can test it if you need to know whether the default value is being provided. [See B.4.8.1, E.2.1.20.]
If no value is given in the DEFAULT statement (meaning that "DEFAULT;" was coded), then the value of $UVAL will be null. However, other value-control statements, such as INSERT, will still be applied to the null value to create $CVAL. [See B.4.5.3.] Unless "PUTDATA = 2" is coded, a null value for $CVAL will not be "placed" in the buffer, meaning that the current row and column numbers will not be updated to reflect the positioning of the null value. [See B.4.4.] It is important to consider how a null element value or a null DEFAULT value will be processed by each label group where it might occur, since how it is handled may affect the placement of other values later.
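The rules in this section can be summarized in a small model. The Python sketch below is not SPIRES code; it only imitates, under simplifying assumptions, how an output label group arrives at $UVAL and $CVAL when GETELEM succeeds or fails. The function name establish_value, its parameters and the dictionary it returns are all hypothetical, "outproc" stands for whichever OUTPROC rules would apply to a retrieved value, and only the first form of INSERT (insertion in front of the value) is modeled.

```python
def establish_value(occurrence, default=None, outproc=None, inserts=()):
    """Model how one output label group establishes $UVAL and $CVAL.

    occurrence -- the stored value GETELEM retrieved, or None if it found none
    default    -- None if no DEFAULT statement is coded; "" models "DEFAULT;"
    outproc    -- a function standing in for the applicable OUTPROC rules
    inserts    -- strings from INSERT statements (front-insertion form only)

    Returns None when the rest of the label group would be skipped.
    """
    if occurrence is None:
        if default is None:
            return None               # no DEFAULT: remainder of group is skipped
        uval = cval = default         # default becomes both $UVAL and $CVAL
        default_flag = True           # models the $DEFAULT flag
        # Note: the default value is NOT processed through any OUTPROC.
    else:
        uval = occurrence             # internal, stored form
        cval = outproc(uval) if outproc else uval
        default_flag = False
    for text in inserts:              # INSERT is applied even to a default value
        cval = text + cval
    return {"uval": uval, "cval": cval, "default_flag": default_flag}
```

In this model, calling establish_value(None, default="") corresponds to coding "DEFAULT;" alone: the result is a null $CVAL unless an INSERT string is supplied, which is the case that "PUTDATA = 2" is meant to address.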
When a GETELEM statement is executed and a value is retrieved, its internal value is processed through any OUTPROC rules given in the file definition for that element. The OUTPROC statement is used in an output frame for one of two purposes:
- If it is preceded by a GETELEM statement, the OUTPROC statement will override all of the OUTPROC processing rules for the element being retrieved that are coded in the file definition.
- If it is preceded by a VALUE statement, the OUTPROC statement causes the given value to be processed through the processing rules provided here.
The syntax of the OUTPROC statement is the same in a format definition as it is in a record definition:
OUTPROC = rule string;
where "rule string" contains one or more processing rules (actions, system procs, or user-defined processing rules). If multiple rules are given, they are separated by single slashes (/), optionally surrounded by blanks.
Another form of the OUTPROC statement can be coded if you want to override file definition OUTPROC processing but not replace it with something else:
OUTPROC;
With this form, no OUTPROC processing will occur at all, and $CVAL and $UVAL will be equivalent.
A typical use of the OUTPROC statement is to display a different form of a stored date than the one chosen in the file definition:
LABEL = DATE.RECEIVED;
  GETELEM;
  OUTPROC = $DATE.OUT(DAY.MONTH,,UPLOW,FULL);
  PUTDATA;
The OUTPROC string usually consists of actions and system procs. A user-defined proc can be included, but it must be defined at the end of the format definition, or in an EXTDEF subfile record that is referenced in the EXTDEF statement at the end of the format definition (see below).
The A62 or A124 actions or the $CALL system proc, used to invoke USERPROCs, may be coded in the OUTPROC string. However, the actual USERPROC definition must be in the record definition of the record named by the RECORD-NAME statement.
The $STRUC, $STRUC.IN and $STRUC.OUT system procs (action A33) can be coded to retrieve a structure element as if it were a single element. [See B.8.1.]
The $LOOKUP proc (A32 rule) has an important security limitation worth noting here: If someone other than the file owner writes a format for one record-type in a file, and if that format contains $LOOKUP procs or A32 actions in either INPROC or OUTPROC statements, then users of the format must have been granted subgoal access to the accessed record-type; otherwise, an error during format processing will occur. The access may be granted only by the file owner in the file definition, through either the SUBGOAL statement in the subfile section or the FILE-ACCESS statements, where the account numbers of the format users must be given an access level of SEE (or a level incorporating SEE access).
Similarly, the file owner may require that access by some or all accounts to an element be limited to the external form of the element as determined by the file definition's processing rules. (This restriction is made with the combination of the OUTPROC-REQ and PRIV-TAG statements in the file definition.) If a label group containing an OUTPROC statement tries to retrieve such an element, no value is retrieved and, unless the DEFAULT statement appears in the label group, the remainder of the label group is skipped. [See B.4.5.1.] If default processing is requested, the value of $UVAL will be the same as $CVAL, i.e., the value after the file definition's OUTPROC rules are executed and any INSERT statements in the label group are applied.
SPIRES allows you to create your own procs (collections of system procs and actions) where each string of processing rules is identified by a single name. That name can then be used in place of the string in OUTPROC and INPROC statements, for example, to identify the processing rule string that should replace it. The proc facility is discussed in detail in section C.10 of the manual "SPIRES File Definition".
Procs may be coded in OUTPROC and INPROC statements in format definitions. A proc may be used in a format if its definition appears in one of the following places:
- at the end of the format definition. All the statements allowed for proc definitions in a file definition may appear here, following the GEN-FILE statement. [See E.1.]
- in a record in the public subfile EXTDEF ("EXTernal DEFinitions"). (EXTDEF records are described in the aforementioned section of "SPIRES File Definition".) The name of the EXTDEF record must be coded in the format definition in the EXTDEF-ID statement:
EXTDEF-ID = gg.uuu.idname;
The EXTDEF-ID statement also appears at the end of the format definition, following any proc definitions. (Both proc definitions and EXTDEF-ID statements are allowed in a single format definition.)
EXTDEF-ID used to be called PROCDEF, which is allowed as an alias.
Sometimes a processing rule that you want to use is only available as an INPROC. For example, you might want to verify that a value created by an expression in a VALUE statement is 5 characters long, and the $LENGTH system proc is only available as an INPROC. You can code an INPROC instead of an OUTPROC if the following three conditions are met:
- the value being processed was created by a VALUE statement and not a GETELEM statement;
- no OUTPROC statement is coded in the label group (you may not have both an INPROC and an OUTPROC in the same label group); and
- the INPROC string does not include any INCLOSE rules.
Because the value of a virtual element is created by its processing rule strings, understanding how the presence or absence of an OUTPROC statement will affect it is important. If no OUTPROC statement is coded in a label group retrieving a virtual element, then the OUTPROC statement in the file definition will be used to create the $CVAL form of the value for the label group, while the OUTPROC followed by the INPROC will be executed to create $UVAL.
However, if an OUTPROC statement is coded in the label group, that OUTPROC string will be used to create the value of $CVAL, but the value of $UVAL varies; specifically:
- if the virtual element is a redefining virtual element, then $UVAL is the internal value of the redefined element (i.e., the stored value the virtual element is based on);
- if the virtual element is not a redefining virtual element, then $UVAL is null.
This is the only situation in which the presence of an OUTPROC statement in the format definition will change the value of $UVAL -- in other situations, $UVAL is not affected by OUTPROC statements.
Processing rule errors can occur during record output, though they occur more frequently during input processing. The techniques used for handling them in input formats also work in output formats. [See C.3.6.] Several system variables are set when a processing rule error occurs, and they may be tested in the label group to determine whether an error has occurred. For example,
LABEL = NUMERIC.CODE;
  GETELEM = CODE;
  OUTPROC = $VERIFY(LIKE,NUMERIC,,D);
  INSERT = 'Numeric Code: ';
  UPROC = IF $APROCERR THEN JUMP ALPHA.CODE;
  PUTDATA;
In this example, SPIRES verifies that the retrieved CODE value contains only numerals (the $VERIFY system proc). If it contains other characters, the error flag is set. The Uproc tests the flag $APROCERR, which is set if any processing rule executed during the execution of the current label group has caused an error. If that flag is set, SPIRES is to proceed to the label group ALPHA.CODE, preventing any further execution of the NUMERIC.CODE label group.
The "D" parameter in the $VERIFY proc represents the error level for the rule -- it overrides the default "S" level for that particular proc, which would cause an error message to be displayed at the terminal. [See C.3.6.4.] Note however that "S" level errors that occur during format processing do not stop the execution of the format -- this is true for both input and output formats. For output, the record will be completely processed by the format, even though a serious-error message may be displayed on the terminal screen. Remember that "S" level errors that occur during output processing when no format is set (i.e., when the standard SPIRES format is set) will prevent any further output of the record.
The INSERT statement can be used to add a character string to the beginning, middle or end of $CVAL. Typically, it is used to identify the data, usually in a more attractive way than the standard format "element-name =" prefix, which could be considered a type of "insert".
For example,
LABEL = HOME.PHONE;
  GETELEM;
  INSERT = 'Home phone number: ';
  PUTDATA;
On output, the value should look something like this:
Home phone number: 938-2958
There are three forms available for the INSERT statement:
INSERT = string-expression;
INSERT = END, string-expression;
INSERT = n, string-expression;
The first form requests that the given string expression be inserted in front of the current value of $CVAL. The second form requests that it be appended to the end of $CVAL. The third requests that the string be inserted in front of the "nth" character of $CVAL; that is, the first character of the inserted string will begin at the "nth" character position in $CVAL, with the remainder shifted to the right; "n" is an integer. If "n" is used, and the current value of $CVAL has fewer than "n" characters, the insert string will be appended directly to the end of the value.
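The three forms can be compared side by side in a short model. The following Python sketch is only an illustration of the positioning rules just described, not SPIRES code; the function name apply_insert and its "where" parameter are hypothetical. Rejecting a position smaller than 1 is an assumption, since the text says only that "n" is an integer.

```python
def apply_insert(cval, text, where=None):
    """Model the three INSERT forms applied to a $CVAL string.

    where=None   models  INSERT = text;        (insert in front of the value)
    where="END"  models  INSERT = END, text;   (append to the end of the value)
    where=n      models  INSERT = n, text;     (insert before the nth character)
    """
    if where is None:
        return text + cval
    if where == "END":
        return cval + text
    n = int(where)
    if n < 1:
        # Assumption: positions are counted from 1, so smaller values are rejected.
        raise ValueError("n is a 1-based character position")
    # Slicing past the end yields an empty string, so a value with fewer
    # than n characters simply gets the text appended.
    return cval[:n - 1] + text + cval[n - 1:]
```

For instance, apply_insert("9382958", "-", 4) places the hyphen in front of the fourth character, and apply_insert("abc", "!", 10) appends the string because the value has fewer than ten characters.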
The string expression may have one of the following forms or a concatenation of them:
'literal string'
#user.variable
$system.variable
Note that END or "n" may not be expressed by variables but must be directly coded.
Because special characters are often used as insert characters, be careful to follow the data entry rules for the INSERT statement (i.e., putting a literal value in apostrophes) and for the record as a whole. [See B.1.3.] For example, to request that a semicolon be inserted at the end of a value, you would code the following INSERT statement:
INSERT = "END,';'";
Multiple INSERT statements are allowed, each one changing the value of $CVAL:
LABEL = AMOUNT;
  GETELEM;                    <-- e.g., $CVAL = '5 dozen'
  INSERT = 'This makes ';     <-- $CVAL = 'This makes 5 dozen'
  INSERT = END,'.';           <-- $CVAL = 'This makes 5 dozen.'
  PUTDATA;
Even if $CVAL is null, the insertion will occur. This situation might arise when the DEFAULT statement is coded without a default value, setting $CVAL to null. [See B.4.5.1.] When the insertion is then applied, $CVAL becomes the insertion string.
Unlike the TITLE statement, which is only applied once in a label group, the INSERT statement will apply to all occurrences of an element processed in the label group with a LOOP statement. [See B.4.7 for a comparison of the TITLE, INSERT and VALUE statements, B.4.8.4 for the LOOP statement.]
One final way to alter the value of $CVAL is the SET CVAL Uproc. (Uprocs are command-like statements allowed in a label group.) [See B.4.5.5.] Since a label group's Uprocs are executed after the GETELEM, VALUE, DEFAULT, OUTPROC, INPROC and INSERT statements, the SET CVAL Uproc can completely override the processing of these statements. For example,
LABEL = AUTHOR;
  GETELEM;
  DEFAULT = 'Anonymous';
  OUTPROC = $NAME;
  INSERT = 'written by ';
  UPROC = SET CVAL = 'Author Unknown';
  PUTDATA;
No matter what value $CVAL had before the Uproc was executed, it has the value "Author Unknown" afterwards, and that is the value that would be placed in the frame. That label group might just as well have been written as:
LABEL = AUTHOR;                            or    LABEL = AUTHOR;
  UPROC = SET CVAL = 'Author Unknown';             VALUE = 'Author Unknown';
  PUTDATA;                                         PUTDATA;
The syntax of the SET CVAL Uproc is:
UPROC = SET CVAL = expression;
where "expression" can consist of literal strings, user or system variables (including $CVAL itself, referring to its value before this Uproc is executed), or system functions. For example,
UPROC = SET CVAL = 'This makes ' || $CVAL || '.';
which is equivalent to the final example in the previous section, which used multiple INSERT statements. [See B.4.5.3.]
Though quotation marks around the "value" of the UPROC statement are usually unnecessary, they are added below to help show what the statement means:
UPROC = "SET CVAL = expression";
That is, the value of the UPROC statement is "SET CVAL = expression". Note that both equals signs are optional, though the second one is usually included for readability's sake.
The SET CVAL Uproc is often used in conjunction with other Uprocs, in particular the IF...THEN Uproc:
UPROC = IF $CVAL = '0' THEN SET CVAL = 'None';
Remember that the SET CVAL Uproc is the last chance to change the value of $CVAL before it is placed in the frame. It is easy to forget, for example, that INSERT statements will be applied to $CVAL before the SET CVAL statement is executed.
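The pitfall mentioned here is easy to reproduce in a model. The Python sketch below is not SPIRES code; it assumes only the ordering described above (INSERT statements first, then the SET CVAL Uproc, then PUTDATA). The names final_cval and zero_to_none, and the insert string 'Count: ', are hypothetical.

```python
def final_cval(value, inserts=(), set_cval=None):
    """Model the value placed in the frame by one label group."""
    cval = value
    for text in inserts:           # INSERT statements are applied first
        cval = text + cval
    if set_cval is not None:       # the SET CVAL Uproc runs last, before PUTDATA
        cval = set_cval(cval)
    return cval


def zero_to_none(cval):
    # Models: UPROC = IF $CVAL = '0' THEN SET CVAL = 'None';
    return "None" if cval == "0" else cval
```

With no INSERT, final_cval("0", set_cval=zero_to_none) yields "None". With inserts=["Count: "], the test against '0' no longer matches, because by then $CVAL is already 'Count: 0'.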
In an output frame, the SET CVAL Uproc will convert the result of the expression to a string. In contrast, the VALUE statement will not do that but will leave the type of the expression alone. However, for output, it is assumed that you will convert the "VALUE value" to a string somewhere before the PUTDATA with an OUTPROC or SET CVAL Uproc. [See B.4.3, B.4.5.2.]
For example, these two label groups do not produce equivalent results:
LABEL;                      LABEL;
  VALUE = 3+4;                UPROC = SET CVAL = 3+4;
  PUTDATA;                    PUTDATA;
In both cases, the result of the expression is a packed decimal value, but the SET CVAL statement converts the result to a string; the VALUE statement does not make that conversion. However, the PUTDATA statement assumes that $CVAL is a string, and so it tries to read the packed decimal $CVAL as a string, producing "garbage characters" from the label group on the left. The character string "7" would be produced from the label group on the right.
The label group on the left could be repaired in several ways:
1) change VALUE statement to:  VALUE = $STRING(3+4);
2) add:  OUTPROC = $PACK.OUT;
3) add:  UPROC = SET CVAL = $STRING($CVAL);
4) add:  UPROC = SET CVAL = $CVAL;
If you used one of these methods, $CVAL would be properly converted to a string for placement in the frame. In terms of efficiency, the second way is best -- using processing rules is more efficient than using functions. (Note that the fourth method exploits the fact that the SET CVAL Uproc automatically converts the value to a string.)
The SET CVAL Uproc (pronounced "YOU-prock") is one of many Uprocs to be discussed in this manual. [See B.4.5.4.] A Uproc is a statement specifying a procedure to be executed at that point in the frame execution.
Uproc statements closely follow the syntax of protocol statements (but see the notes below); in fact, many of them have the same names and purposes. Uproc statements may appear in label groups containing GETELEM and/or PUTDATA statements, or in label groups by themselves. Some of the tasks of Uprocs are:
- to assign or change variable values;
- to transfer execution processing to another label group;
- to transfer execution processing to a subroutine;
- to display messages to the terminal;
- to prompt for input from the terminal;
- to stop format execution;
- to change values in other statements in the label group;
- to test variable values and perform other Uprocs accordingly.
Uproc statements are of the form:
UPROC = statement;
UPROC = statement;
  . . . . . . . .
UPROC = statement;
Uprocs are executed in the order in which they appear in the label group definition. The allowable statements fall into several categories. Below is a list of all the Uprocs allowed in a format, though not all are allowed in every frame or every label group. Details on most of them are provided in other sections of this manual; consult the index for details on specific ones. (Some, such as REPROMPT and the SET DISPLAY Uprocs, are discussed in the manual "SPIRES Device Services".)
*             RETURN        EVAL          -
LET           IF            JUMP/GOTO     THEN
ASK           ELSE          XEQ PROC      SET
BEGINBLOCK    ENDBLOCK      WHILE         ENDWHILE
REPEAT        UNTIL         LEAVE         ITERATE
ABORT           DOPROC
BACKOUT         FLUSH
BUILD RECORD    HOLD
CASE            REPROMPT
EJECT PAGE      STOPRUN
EJECT COLUMN    REFERENCE
ALLOCATE        SET FILTER
DEALLOCATE      CLEAR FILTER
SET <flag>          SET <integer>       SET <string>
SET JUSTIFY         SET NROWS           SET ADJUST
SET FLUSH           SET NCOLS           SET CVAL
SET REPORT          SET STARTROW        SET UCODE
SET [NO]TESTFAIL    SET STARTCOL        SET ERROR
SET SUPPRESS        SET COLUMNS         SET PADCHAR
SET SKIPEL          SET COLWDTH         SET PROMPT
SET SKIPF           SET HDRLEN          SET ASK
SET PUTSTRUC        SET FTRLEN          SET DISPLAY
SET REMOVEL         SET LINEPP          SET TDISPLAY
SET NOBREAK         SET PAGECTL         SET EDISPLAY
SET NEWPAGE         SET MARGIN          SET TCOMMENT
SET FRONTPAGE       SET PUTOCC          SET PARM
SET SKIPROW         SET PAGENO          SET SORTDATA
SET [UN]PROTECT     SET MESSAGES        SET SORTKEY
SET LARGE           SET MAXLEVELS       SET UVAL
SET NEWCOL          SET ELENGTH         SET SCANROW
SET REPEAT          SET RECNO           SET [NO]VIALCTR
SET [NO]VIALLCTR    SET [NO]TESTREF     SET AUTOTAB
SET NOPUTFLAG       SET TESTSUBG        SET HCLIP
The following statements can be used only in Entry-Uprocs. [See B.4.5.6.]
SET BUILDEND      SET LOOP BACKWARDS      SET SUBGOAL
SET BUILDSEP      SET STARTOCC
The major differences between Uprocs in the Formats language and commands available in the Protocol language are:
- Not all protocol commands are allowed in formats (see list above).
- The SET Uproc may evaluate expressions in format Uprocs.
- The prompt in the ASK Uproc may not have string expressions in format Uprocs, though the SET PROMPT Uproc may.
- The ASK Uproc allows only certain command choices in its NULL and ATTN clauses in format Uprocs.
- The "/" prefix is not allowed in formats; literals in a Uproc must be enclosed in apostrophes:
UPROC = *'The time is ' $TIME;
(Command:) /*The time is $TIME
- * Uprocs do not display a "*" at the terminal in formats.
- A RETURN Uproc after an XEQ PROC Uproc returns to the next label group in Formats (except XEQ PROC from a NULL or ATTN clause in an ASK Uproc, which returns and re-asks the question).
- Block constructs established by the BEGINBLOCK, REPEAT and WHILE Uprocs must be entirely contained within a single format label group. Within the label group, each construct must be entirely contained within the Entry-Uprocs or within the Uprocs.
Comments for Uprocs are usually handled by the "-" Uproc. You may code a Uproc statement without a value, in order to provide spacing in your format definition:
UPROC = IF $CVAL = 0 THEN SET CVAL = 'None';
UPROC;
UPROC = - The above Uproc handles the value of zero.;
UPROC;
Though they are more difficult to create and read, comments may also be appended to Uprocs by using the semicolon delimiter to separate the Uproc from the comment. Remember though that because the UPROC statement is an element in a FORMATS goal record, the entire value must be placed in quotation marks if internal semicolons appear:
UPROC = "IF $CVAL = 0 THEN SET CVAL = 'None' ; handles zero";
An Entry-Uproc is a special kind of Uproc statement [See B.4.5.5.] with a significant difference in timing: the Entry-Uproc is executed at the beginning of the label group, whereas a standard Uproc is executed near the end of the label group. Specifically, Entry-Uprocs are executed before any GETELEM statement (or, in an input format, a GETDATA statement), whereas Uprocs are executed after any GETELEM statement, though before a PUTDATA statement.
As a reminder, here is the order in which statements discussed so far in this chapter would be executed within a given label group:
LABEL
ENTRY-UPROC    <---(before GETELEM or VALUE statement)
GETELEM
VALUE
DEFAULT
OUTPROC
INSERT
UPROC          <---(after GETELEM or VALUE, but before PUTDATA)
PUTDATA
As with Uprocs, you can have more than one Entry-Uproc in a label group if you wish. In fact, much of what was said about Uprocs also applies to Entry-Uprocs, with these differences:
- The syntax and rules for Uprocs [See B.4.5.5.] also apply for Entry-Uprocs, but, since an Entry-Uproc executes before a value is retrieved (with GETELEM) or created (with VALUE), any Uproc that is used to test or convert the label group's value would have no meaning as an Entry-Uproc.
- In a label group that loops, Entry-Uprocs in the label group will be executed only once, at the beginning of the first iteration of the loop. [See B.4.8.4 for more on the LOOP statement.]
An Entry-Uproc can be especially useful for setting up an initial condition or environment in its label group. Because a Uproc executes only near the end of its label group, it cannot affect initial conditions (e.g., before an element occurrence is first retrieved) unless it is coded in a previous label group. The two sets of statements below accomplish the same task, but the second is slightly more compact:
LABEL = SET.INITIAL;
  UPROC = * 'The following locations stock item ' #ITEM;
  UPROC = * '------------------------------------------------';
LABEL = STORE.LOC;
  GETELEM;
  INSERT = 'Location: ';
  PUTDATA;
  LOOP;

LABEL = STORE.LOC;
  ENTRY-UPROC = * 'The following locations stock item ' #ITEM;
  ENTRY-UPROC = * '------------------------------------------------';
  GETELEM;
  INSERT = 'Location: ';
  PUTDATA;
  LOOP;
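As a rough illustration of the timing involved, the Python sketch below traces the order of events in a looping label group like STORE.LOC. It is a simplified model rather than SPIRES code: the function name run_looping_label_group and its parameters are hypothetical, and the sketch puts terminal messages and frame values into one list simply to show their sequence.

```python
def run_looping_label_group(entry_messages, occurrences, insert_text):
    """Trace the events produced by one looping output label group.

    entry_messages -- text displayed by "*" Entry-Uprocs
    occurrences    -- the element occurrences retrieved by GETELEM
    insert_text    -- the string from a front-insertion INSERT statement
    """
    # Entry-Uprocs run once, before the first GETELEM of the loop.
    events = list(entry_messages)
    for value in occurrences:
        # INSERT is applied to every occurrence processed by the loop.
        events.append(insert_text + value)
    return events
```

Given two entry messages and three occurrences, the trace holds the two messages once, followed by three "Location: ..." values, which matches the heading-then-list layout the example is after.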
When in doubt which of the two types of statement you need, ask yourself at what point in the label group you want the statement to take effect: if it affects a retrieved or created element value, then you probably want a Uproc. If, on the other hand, it should take effect before a value is retrieved (or created), and if it does not need to be reset for multiple iterations of the label group, then perhaps it should be an Entry-Uproc. [For the sake of economy, this manual sometimes uses the term Uproc in cases where either a Uproc or Entry-Uproc could be equally appropriate.]
A section later in this chapter will describe two procedural statements, SET STARTOCC and SET LOOP BACKWARDS, that must always be coded as Entry-Uprocs, not Uprocs, because they must take place within the label group where the occurrence is retrieved, but before the GETELEM statement is executed. [See B.4.8.5.]
The SET BUILD collection of Entry-Uprocs provides a handy way to display multiple occurrences of an element as though all the occurrences really formed a single occurrence. Two of them, SET BUILD SEPARATOR and SET BUILD END, let you concatenate multiple occurrences with a minimum of effort. Additionally, the SET BUILD statements provide more flexibility in preparing a value for output, with their effects on the value taking place "at the last second" -- they don't affect the value of $CVAL as, say, the INSERT statement does.
Here are the 6 SET BUILD Entry-Uprocs:
SET BUILD SEPARATOR 'string'   (or SET BUILDSEP)
  - concatenates multiple occurrences within a single looping label group, gluing them together with the specified separator string (e.g., a comma and space).
SET BUILD END 'string'   (or SET BUILDEND)
  - used with SET BUILDSEP, appends the specified string to the end of the value (e.g., an ending period).
* SET BUILD PREFIX 'string'   (or SET BUILDPRE)
  - used to add a string to the beginning of the value.
* SET BUILD CONTAINER 'string'   (or SET BUILDCON)
  - specifies a container character used to surround the entire value if the value contains that character or others you specify.
* SET BUILD SURROUND 'string'   (or SET BUILDSUR)
  - specifies a character or pair of characters used to surround the entire value; compare with SET BUILD CONTAINER.
* SET BUILD ESCAPE 'string'   (or SET BUILDESC)
  - specifies an escape character and other characters that should be preceded by the escape character if they appear in the value.
(* = available as Uprocs as well as Entry-Uprocs, though they work less efficiently as Uprocs)
SET BUILD SEPARATOR concatenates multiple occurrences within a single looping label group, appending a specified character set (such as a comma and space) at the end of each occurrence but the last one. [See B.4.8.4 for information on looping.] SET BUILD END, issued within the same label group, lets you append a different character set (perhaps a period) after the last concatenated occurrence.
These Entry-Uprocs create a new mode of operation within the label group, causing positioning statements such as START, MARGINS, and ADJUST to be executed only once, after the looping is finished. (The XSTART statement would not be needed or used at all.) Since the PUTDATA statement will also execute only after all occurrences have been concatenated, end-users of the format will probably never even know that the "single" value displayed is stored as multiple occurrences in the data base.
The syntax of these Entry-Uprocs is as follows:
SET BUILD SEPARATOR 'string'
SET BUILD END 'string'
where "string" consists of the character(s) you wish to append to the occurrences as they are displayed.
Thus, the Entry-Uprocs might look like this:
ENTRY-UPROC = SET BUILDSEP ', ';
ENTRY-UPROC = SET BUILDEND '.';
ENTRY-UPROC = "SET BUILDSEP '; '";
You can use SET BUILDSEP without SET BUILDEND, but SET BUILDEND makes no sense except as a partner to SET BUILDSEP.
For an example of how these Entry-Uprocs work, consider a "restaurant" record with multiple values for CUISINE, as below:
NAME = Castle Grand;
:
CUISINE = Continental;
CUISINE = Crepes;
CUISINE = Quiche;
:
A looping label group could retrieve all the occurrences, separating each occurrence by a comma and space, and appending a period at the end:
LABEL = CUISINE;
ENTRY-UPROC = SET BUILDSEP ', ';
ENTRY-UPROC = SET BUILDEND '.';
:                        <---(Statements such as
GETELEM;                      TSTART, TITLE,
:                        <--- and START not shown)
PUTDATA;
LOOP;
The resulting display would show CUISINE as a "single" occurrence:
Name: Castle Grand
Cuisine: Continental, Crepes, Quiche.
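The effect of the two Entry-Uprocs on the CUISINE occurrences can be imitated in a few lines. This Python sketch is only a model of the concatenation described above, not SPIRES code; the function name build_value is hypothetical. Returning None for an empty list is an assumption, based on the rule that a label group is skipped when GETELEM retrieves no occurrence.

```python
def build_value(occurrences, separator, end=""):
    """Model SET BUILDSEP and SET BUILDEND on the occurrences of one element.

    The separator goes after each occurrence except the last; the end
    string goes after the last occurrence.
    """
    if not occurrences:
        # Assumption: with no occurrence retrieved, nothing is output.
        return None
    return separator.join(occurrences) + end
```

Calling build_value(["Continental", "Crepes", "Quiche"], ", ", ".") gives the single displayed value from the example, while the stored data remains three separate occurrences.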
It is important to remember that the concatenated occurrences only seem to be a single occurrence -- technically, they remain multiple occurrences. Thus, if the label group included an INSERT statement (or a SET CVAL Uproc) to insert or append a string value, the specified string would be inserted or appended within every one of the multiple occurrences. To insert a single value at the beginning of the concatenated value, use the TITLE statement instead. [See B.4.7.]
Note that if a file definition uses the $BUILD proc to concatenate element values, the $BUILD processing will be overridden by a format label group that uses the SET BUILDSEP Entry-Uproc -- the Entry-Uproc takes precedence.
The SET BUILD PREFIX Uproc, as well as the remaining ones described here, does not cause multiple occurrences to be concatenated. The "building" applies to a single element value, unless SET BUILDSEP is also in effect for the label group. If SET BUILDSEP is in effect, then any of these other SET BUILD Uprocs that are included in the label group will be executed only once, when the final concatenated value is being produced for output.
SET BUILD PREFIX adds a prefix to the value in $CVAL just prior to output. This is similar to INSERT except that SET BUILD PREFIX does not actually change the value of $CVAL as INSERT does.
SET BUILD PREFIX 'string'
The SET BUILD CONTAINER Uproc is similar in principle to the way the standard format works in SPIRES. There, for example, if a value contains a semicolon, then the entire value is surrounded by quotes. And if the value contains quotes, the quotes are doubled, and then the entire value is surrounded by quotes.
With SET BUILD CONTAINER, you can specify the characters that get this type of treatment in the label group.
SET BUILD CONTAINER 'chars'
If only one character is specified, that is the character used to surround the value if the value contains other occurrences of the character. Additionally, those other occurrences of that character will be doubled.
If multiple characters are specified, the first is treated as described above. Also, if any of the other characters appear within the value, then the entire value will be surrounded by the first character.
If you specified:
UPROC = "SET BUILD CONTAINER '"";'";
that would be equivalent to the way the standard format in SPIRES works. Note the doubling of the quotation marks within the Uproc's value, and that the entire Uproc value must be put in quotation marks.
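The container rules can be restated as a short model. The following Python sketch is not SPIRES code; it simply applies the two rules given above to a string, and the function name build_container is hypothetical. Rejecting an empty character list is an assumption.

```python
def build_container(value, chars):
    """Model SET BUILD CONTAINER 'chars' applied to an output value.

    The first character of chars is the container character. The value is
    surrounded by it if the value contains any of the listed characters,
    and occurrences of the container character inside the value are doubled.
    """
    if not chars:
        # Assumption: at least the container character must be given.
        raise ValueError("at least one character is required")
    container = chars[0]
    if not any(c in value for c in chars):
        return value                       # nothing triggers the container
    doubled = value.replace(container, container * 2)
    return container + doubled + container
```

With chars set to a quotation mark followed by a semicolon, a value such as a;b comes back surrounded by quotation marks, and a value that already contains quotation marks has them doubled before being surrounded.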
The SET BUILD SURROUND Uproc is similar to the SET BUILD CONTAINER Uproc except that SPIRES will always surround the value with the surround character(s), no matter what characters may appear in the value.
SET BUILD SURROUND 'chars'
If only one character is specified, that is the character used to surround the value on each side. If two characters are specified, the first is used on the left side of the value, and the second on the right.
If you specified:
UPROC = SET BUILD SURROUND '<>';
then the value "ABC" would come out as "<ABC>".
SURROUND and CONTAINER should not both be specified; if they are, SURROUND will be given precedence.
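As an illustration only, here is a small Python model of the SURROUND rule. It is a sketch of the behavior described above, not SPIRES code, and the name "build_surround" is hypothetical.

```python
def build_surround(value, chars):
    """Model of SET BUILD SURROUND: always wrap the value.

    One character is used on both sides; with two characters the first
    goes on the left and the second on the right.
    """
    if not chars:
        return value
    left = chars[0]
    right = chars[1] if len(chars) > 1 else chars[0]
    return left + value + right


print(build_surround("ABC", "<>"))
print(build_surround("ABC", "|"))
```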
The SET BUILD ESCAPE Uproc will precede each of a set of characters in the output value with an "escape" character.
SET BUILD ESCAPE 'chars'
The first character is the escape character. Other characters may follow.
SPIRES will scan the value for occurrences of any of the characters, including the escape character. If any appear in the value, SPIRES precedes them with the escape character.
For example:
UPROC = SET BUILD ESCAPE '\*';
If the value contains any backslashes or asterisks, SPIRES will precede each of them with a backslash.
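The ESCAPE rule can be sketched in Python as follows. This is a model of the description above rather than SPIRES code; "build_escape" is a hypothetical name.

```python
def build_escape(value, chars):
    """Model of SET BUILD ESCAPE.

    chars[0] is the escape character; every character of `chars`
    (the escape character included) is preceded by it in the output.
    """
    if not chars:
        return value
    escape = chars[0]
    return "".join(escape + c if c in chars else c for c in value)


# Backslash is the escape character; asterisks are escaped as well.
print(build_escape("a*b\\c", "\\*"))
```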
Unless told otherwise, the PUTDATA statement places the current value of $CVAL starting in the first column of the next row of the frame. The statements discussed in the sections of B.4.6 provide you with much flexibility in positioning a value within a frame.
The first ones to be discussed relate to the starting row-column positioning of the value: START, XSTART, SET STARTROW Uproc and SET STARTCOL Uproc. Just as the value-control statements were involved with the system variable $CVAL, the value-placement-control statements are involved with the system variables $CCOL (Current COLumn) and $CROW (Current ROW). The values of these two variables represent the current position in the frame, that is, the location where the last data placed in the frame ended. [See B.4.6.1 for an example using the current position.]
The subsections of B.4.6 cover how the value is to be positioned, such as how many rows long it can be and whether it should be centered in the row: MARGINS, MAXROWS, LENGTH, BREAK, SET ADJUST Uproc, SET JUSTIFY Uproc and so forth. These statements establish or interact with the value's "field", that is, the area in which the value will or can be placed.
The START statement specifies the starting row and column for the positioning of $CVAL in the frame.
The START statement may be specified in any of the ways below:
START = row, column;
or
START = row;
or
START = , column;
where "row" is one of the following:
and where "column" is one of the following:
The row specifications are described above as they are used in fixed-dimension frames, as opposed to line-by-line ones. See the discussion below for details on how they apply to line-by-line processing.
The "row" or "column" value may not be an expression, other than those explicitly allowed above, such as "X+1". If you must use an expression, use the SET STARTROW or SET STARTCOL Uprocs instead. [See B.4.6.2.]
Here is an example of a frame definition using START statements, followed by a picture demonstrating the outcome of its execution:
FRAME-ID = EXAMPLE;
FRAME-DIM = 5,30;
DIRECTION = OUTPUT;
LABEL 1; VALUE = '1'; PUTDATA;
LABEL 2; VALUE = 'two'; PUTDATA;
LABEL 3; VALUE = '3'; START *,*+2; PUTDATA;
LABEL 4; VALUE = '4'; START X,X; PUTDATA;
LABEL 5; VALUE = 'five'; START 1,10; PUTDATA;
LABEL 6; VALUE = '6'; START *,*; PUTDATA;
LABEL 7; VALUE = '7'; START *+4; PUTDATA;
LABEL 8; VALUE = '8'; START 2,20;
LABEL 9; VALUE = '9'; START *,*+3; PUTDATA;
Below is the produced frame, surrounded by a box, with column and row numbers shown on the outside:
    ....v....1....v....2....v....3
   +------------------------------+
 1 |1        fiv6                 |
 2 |two 3                         |
 3 |     4                        |
 4 |                              |
 5 |7  9                          |
   +------------------------------+
Here are some notes on each label group:
In the example above, we could have replaced all the Xs and "*+3"s and so forth with explicit numbers, because we knew the exact lengths of all the values we were positioning. The * and X values are most often useful when you are handling values of varying or not easily determined lengths. For example, suppose you want the two elements REVIEWER.NAME and REVIEW.DATE to appear like this:
(Reviewed by Lola Cola on Sunday, November 3, 1981.)
You do not know how long the reviewer's name will be, but you do not need to:
LABEL = REVIEWER.NAME;
GETELEM;
START = 5,1;
INSERT = '(Reviewed by ';
PUTDATA;
LABEL = REVIEW.DATE;
GETELEM;
START = *,*+2;
OUTPROC = $DATE.OUT(DAY.MONTH,,UPLOW,FULL);
INSERT = 'on ';
INSERT = END, '.)';
PUTDATA;
In the REVIEW.DATE label group, $CVAL is positioned two columns to the right of where the REVIEWER.NAME value ended, regardless of its length. Even if the name had been so long that it wrapped around into the next row, the REVIEW.DATE value would have been placed properly following it. [See B.4.6.3 for a discussion of how long values wrap around.]
Two other types of START statements having the same syntax shown above are allowed in a label group: XSTART and TSTART. XSTART is used when multiple occurrences of an element are being processed by the label group with the LOOP statement. [See B.4.8.4.] TSTART is used when a TITLE statement appears in the label group to indicate the starting position for the TITLE. [See B.4.7.]
When a frame is being processed line-by-line, either because the FRAME-DIM statement so declares [See B.3.3.] or because SET FLUSH processing is in effect [See B.3.3.], the "row" options cited above for the START statement have slightly different meanings and uses. This is because once you have started putting data on a new row in line-by-line processing, you cannot return to an earlier one.
That might seem to make row numbers useless, but that is not the case. Each time data is placed in a new row, that row ($CROW) is equivalent to row 1. In other words, in line-by-line processing:
START = *,3; is equivalent to START = 1,3;
Similarly,
START = *+2,1; is equivalent to START = 3,1;
So, in line-by-line processing, remember that if one label group specifies a starting row of 3, and the next label group specifies a starting row of 5, the second value positioned will start on the fifth row from the end of the first value, counting the row where the first value ends as "1". Also in line-by-line mode, the row "*+1" is always equal to "X". (Though that is often true in a frame with fixed dimensions, it is not true when the last data positioned was placed above, i.e., in a lower-numbered row than, other data already positioned.)
The two Uprocs SET STARTROW and SET STARTCOL can be used instead of or in conjunction with the START statement. [See B.4.6.1.] Like the START statement, these Uprocs override the default starting position for a value being placed in the frame. Unlike the START statement, they may contain expressions for values and may be conditionally controlled, using the IF... THEN Uproc. [See B.4.8.1.] These statements override the START statement when there is a conflict.
The syntax of these two Uprocs is:
UPROC = SET STARTROW = expression;
UPROC = SET STARTCOL = expression;
where "expression" is an expression that evaluates as an integer. The expression may not contain the "*" or "X" characters allowed in the START statement; instead, you should use the $CROW and $CCOL variables as appropriate.
Also like the START statement, the SET STARTROW and SET STARTCOL Uprocs do not change the values of $CROW and $CCOL. Those values change only when data is actually placed in the frame by a PUTDATA statement. [See B.4.4.] Hence, if no PUTDATA occurs in the label group in which a SET STARTROW or SET STARTCOL Uproc appears, the Uproc has no effect whatsoever.
These two Uprocs have no effect in input frames.
SPIRES, by default, uses the column specification in the FRAME-DIM statement to establish margins for values placed in the frame. [See B.3.3.] For example, if "FRAME-DIM = 3,68;" is coded, representing a frame three rows high by 68 columns across, SPIRES will place data no further to the left than column 1 and no further to the right than column 68. If a value would spill over the right margin, by default it "wraps around" into the next row, starting in column 1 and running toward column 68. If it is still too long, it wraps around again into the next row, and so forth. When the value wraps around, SPIRES looks for a blank character (by default) at which to split the value (see example below).
The MARGINS statement lets you change the default margins for the current value. Its syntax is:
MARGINS = lcol, rcol;
where "lcol" and "rcol", representing the leftmost and rightmost column margins respectively into which data will be placed, are each one of the following:
Generally speaking, the starting position of a value is controlled by the START statement; how it wraps around into other rows is handled by the MARGINS statement. Specifically, the right margin of the first row's worth of the value is controlled by the "rcol" value in the MARGINS statement, as is the right margin for subsequent rows. The left margin of the second row, where the wrap-around resumes, is controlled by the "lcol" value of the MARGINS statement.
However, if no START statement appears, or if the starting column is given as "0" for default, then the default starting column is "lcol" of the MARGINS statement. In other words, placement of a value begins where START (or MARGINS, if no specific starting column is given in the START statement) specifies; the remainder of the row and subsequent rows are controlled by MARGINS.
For example:
LABEL = QUOTATION; GETELEM; MARGINS = 1,55; START = 1,5; PUTDATA;
Here is a value processed by that label group (the column numbers are shown for clarity):
....v....1....v....2....v....3....v....4....v....5....v
    The Oct. 7 paychecks will show the new 5 percent
reduction in withholding tax rates, according to
Payroll Manager Bob Behr. "It's not much, but it's
something," says Behr. "It's better than a poke in the
eye with a sharp stick."
The value began in the position given by the START statement (row 1, column 5); for the left margin of subsequent rows, the MARGINS value "1" was used; for the right margin, the value "55" was used. Had the START statement been omitted, the quotation would have started in column 1, since "lcol" in the MARGINS statement was "1".
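The interaction between START and MARGINS can be summarized as the column range available on each row of a wrapped value. The Python sketch below is only a model of the rules stated above; the function name "row_spans" and its parameters are hypothetical.

```python
def row_spans(lcol, rcol, nrows, start_col=0):
    """Return (first_col, last_col) for each row a wrapped value occupies.

    Row 1 begins at the START column, or at lcol when the starting
    column is 0 (the default); later rows begin at lcol. Every row
    ends at rcol.
    """
    if nrows < 1:
        return []
    first = start_col if start_col > 0 else lcol
    return [(first, rcol)] + [(lcol, rcol)] * (nrows - 1)


# MARGINS = 1,55 with START = 1,5, as in the QUOTATION label group.
print(row_spans(1, 55, 3, start_col=5))
# The same margins with no START statement.
print(row_spans(1, 55, 2))
```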
Note that a MARGINS statement only affects the current label group. It does not carry over into subsequent label groups.
The MARGINS statement controls the margins for long values that wrap around into other rows. The BREAK statement controls what character or characters the value can be split upon. By default, the value can be split across rows only at blanks. If no blank appears in the value before the end of the row, the value is split with as many characters on the row as possible, that is, with characters up to the right margin.
Here is how the BREAK character is used: If $CVAL will not fit in one row within the given margins, SPIRES temporarily puts as much of the value as will fit, and then scans the value backwards from that point, looking for the break character. The portion of the value up to and including the break character is kept in the row, and the remainder of the value is then started in the next row, and the process begins again. This continues until the value is exhausted, or until the MAXROWS value is exceeded. [See B.4.6.6.] This procedure is slightly different when the break character is a blank; it is described below.
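The scan-backwards procedure for a non-blank break character can be modeled in a few lines of Python. This is a sketch under stated assumptions, not SPIRES code: the name "wrap_on_break" is hypothetical, MAXROWS is ignored, and a row containing no break character is assumed to be filled completely, as the text describes for the default blank.

```python
def wrap_on_break(value, width, break_chars):
    """Split `value` into rows of at most `width` characters.

    Each row keeps the text up to and including the rightmost break
    character that fits. An empty `break_chars` models BREAK = NONE.
    """
    rows = []
    rest = value
    while len(rest) > width:
        chunk = rest[:width]
        # Rightmost position of any break character, plus one.
        cut = max((chunk.rfind(c) for c in break_chars), default=-1) + 1
        if cut == 0:
            cut = width  # no break character in this row: fill it
        rows.append(rest[:cut])
        rest = rest[cut:]
    if rest:
        rows.append(rest)
    return rows


print(wrap_on_break("alpha/beta/gamma/delta", 12, "/"))
print(wrap_on_break("abcdefghij", 4, ""))
```

With two break characters, taking the rightmost occurrence of either one corresponds to "the character allowing the most characters in the row".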
The BREAK statement has several forms:
BREAK = character;
The character specified will be used as a break character, instead of the default blank. If the character is a special character, the value of the statement should be in quotation marks ("character") not apostrophes ('character').
BREAK = two characters;
This form means that either character can be used as the break character. In any given situation, the character allowing the most characters in the row will be used. Like the first form, the second should be used with quotation marks if special characters are specified.
BREAK = NONE;
This form means that no break character will be used at all. Values that would wrap around into another row will fill the first row completely and then resume, perhaps "midword", in the next row.
BREAK = TRUNC;
This special form indicates that no wrapping into the next row is to occur at all. If the value would go beyond the right margin, it is truncated there. Compare this method of fixing a value length (i.e., the MARGINS and BREAK statements) with the LENGTH and MAXROWS statements. [See B.4.6.5, B.4.6.6.]
Note that the default blank break character does not apply in label groups that call indirect frames. BREAK = NONE becomes the default for such label groups, but it can be overridden by explicitly coding a BREAK statement. [See B.4.6.4, B.8.1.]
When the break character is a blank, the break processing is slightly different from the way described above, though in a manner you would expect. Suppose that "rcol" represents the column of the right margin. If the character in "rcol" is not a blank, but the next character is, the value will be split at that blank, and the next non-blank character will begin the value in the next row. (Only when a blank is the break character can the "rcol+1" character affect the break, and then only when that character is a blank will it be discarded.)
For example, suppose you have the following values, called A and B:
....v....1....v....2....v....3....v....4
This is the value that we will call A.
On the other hand, this is value B.
If the margins in effect are 1 and 18 (for "lcol" and "rcol" respectively), and the values start in column 1, here is what they would look like on output (note the use of the character "x" at the end of rows to indicate null characters):
....v....1....v...
This is the value
that we will call
A. xxxxxxxxxxxxxxx

On the other hand,
this is value B. x
For value B, the blank following "hand," was suppressed -- it did not start the next row. For value A, the blank following "value" did appear on the first row since there was room for it there.
In some situations, such as cases where you want to overwrite a previously written field, it is important to realize that a blank used as a break character in the value will be output at the end of the row, just as a non-blank break character would be, as shown above for value A. (There is an exception to that statement. In the situation described above, where the "rcol+1" character is a blank and causes the break, that blank is suppressed. Similarly, if multiple blanks appear around the point of the break, those blanks up through "rcol" will be output; the rest will be suppressed, and the next row will begin with the next non-blank character.)
For example, here is value C:
....v....1....v....2....v....3....v..
Value C has          internal blanks.
Using the same margins and starting column as the above example, SPIRES produces the following:
....v....1....v...
Value C has
internal blanks. x
The internal blanks through column 18 are output; the blanks in columns 19, 20 and 21 are suppressed.
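The special handling of a blank break character can be checked against values A, B and C with a short Python model. This is a sketch of the rules as described in this section, not SPIRES code; the name "wrap_at_blanks" is hypothetical, and the width of 18 corresponds to margins 1 and 18.

```python
def wrap_at_blanks(value, width):
    """Split `value` into rows, breaking at blanks.

    If the character just past the right margin ("rcol+1") is a blank,
    the row is filled through rcol and the blanks that follow are
    suppressed. Otherwise the row ends with the last blank that fits,
    and that blank stays at the end of the row.
    """
    rows = []
    rest = value
    while len(rest) > width:
        if rest[width] == " ":
            rows.append(rest[:width])
            rest = rest[width:].lstrip(" ")
        else:
            cut = rest.rfind(" ", 0, width) + 1
            if cut == 0:
                cut = width  # no blank in the row: fill it completely
            rows.append(rest[:cut])
            rest = rest[cut:]
    if rest:
        rows.append(rest)
    return rows


value_a = "This is the value that we will call A."
value_b = "On the other hand, this is value B."
value_c = "Value C has" + " " * 10 + "internal blanks."
for value in (value_a, value_b, value_c):
    print(wrap_at_blanks(value, 18))
```

In the printed rows, the trailing blank after "value" in A is kept, the blank after "hand," in B is dropped, and C keeps its blanks only through column 18.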
Concerning the nulls displayed as "x"s in the examples above: When the buffer is constructed internally, it is "empty", i.e., full of blanks. When values are positioned in the buffer, they overwrite the blanks. In some situations, such as that described above where values contain blanks, it is necessary to distinguish between the buffer's "empty" blanks and the value's blanks, so the term "nulls" refers to the blanks in the buffer which are not overwritten by any values. Once the frame processing has finished, the differences between the two types are irrelevant -- all blanks in the buffer are treated the same. [See B.4.8.7, B.8.1.]
The LENGTH statement can be used to limit the length of $CVAL that will be placed in the output frame. LENGTH is specified as the number of characters allowed in the output value. If $CVAL is longer than that, it will be truncated as it is placed in the frame; if it is shorter, it will be padded with a "pad character" if one is in effect. By default, the pad character is null, meaning no padding occurs. [See B.4.6.7.] Note that $CVAL is not itself changed by the LENGTH statement; that is, $CVAL will not include the additional pad character nor be shortened to "length" characters. The LENGTH statement does not affect anything in the label group until the PUTDATA statement is executed.
The syntax of the LENGTH statement is:
LENGTH = length;
where "length" is an integer or a variable of type INTEGER that may be indexed but not subscripted.
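Here is a small Python model of the truncate-or-pad behavior of LENGTH. It is a sketch only: the name "apply_length" is hypothetical, and padding on the right is an assumption based on the default left adjustment of values.

```python
def apply_length(cval, length, padchar=""):
    """Return the text that would be placed for a given LENGTH.

    Longer values are truncated; shorter ones are padded only when a
    pad character is in effect. The input value itself is not changed.
    """
    placed = cval[:length]
    if padchar:
        placed = placed.ljust(length, padchar[0])
    return placed


print(apply_length("Stanford", 4))
print(apply_length("ab", 5))
print(apply_length("ab", 5, "."))
```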
The MAXROWS statement establishes the maximum number of rows in which the value can be stored. By combining MAXROWS with the margins in effect, SPIRES creates an area in the frame in which the data value will be placed. If the value would exceed the area, it is truncated. If the value is smaller than the given area, the pad character, if set, is used to fill the remainder.
The syntax of the MAXROWS statement is:
MAXROWS = nrows;
where "nrows" is an integer or an integer variable.
A fairly common use for MAXROWS is when you have an element that must begin on a specific row number but which follows a textual element whose length may vary. The MAXROWS statement may be used in that case to limit the size of the textual element.
Note that row X and row *+1 may not be the same after a label group with this statement executes -- X would represent the "next row" after "nrows" from the MAXROWS statement, while *+1 would represent the next row past the row in which the positioned value actually ended, which may be one or more rows shy of "nrows".
Though more commonly used in full-screen applications, the SET PADCHAR Uproc and the PADCHAR statement may be used to provide a pad character to fill out fields established by LENGTH and/or MAXROWS statements. [See B.4.5.5 for more information about Uprocs.] (Remember that, like many Uprocs, SET PADCHAR can also be used as an Entry-Uproc, which would mean that it would be executed at the beginning of its label group, rather than at the end.)
The syntax is quite simple:
PADCHAR = 'character';
UPROC = SET PADCHAR = 'character';
Only one character is allowed; if multiple characters are specified, only the first will be used. A string variable may be used in place of 'character' if desired.
As an Entry-Uproc, SET PADCHAR will affect the current label group and any subsequent ones. (As a Uproc, it would not affect the label group in which it appeared, only subsequent ones.) The pad character remains in effect until another SET PADCHAR Uproc (or Entry-Uproc) is encountered. The pad character is reset to null each time the format is executed, meaning that it should be set in a data frame that will be executed when the pad character is needed.
The PADCHAR statement sets a pad character that applies only to the label group in which it appears. If a pad character has been set by a SET PADCHAR Uproc, it is overridden for the label group's execution, but still applies to all subsequent label groups.
To turn off padding explicitly, set the pad character to null like this:
UPROC = SET PADCHAR = '';
Unlike most SET Uprocs, SET PADCHAR does not have a system variable directly associated with it -- that is, there is no $PADCHAR variable.
The SET ADJUST Uproc lets you right- and/or left-adjust or center the value within the given margins. By default, values are left-adjusted in their fields.
The SET ADJUST Uproc has the following syntax:
UPROC = SET ADJUST = adjust;
where "adjust" is one of the following values, or a variable containing one of them:
- RIGHT (or R) -- causes the value to be right-adjusted.
- LEFT (or L) -- causes the value to be left-adjusted.
- CENTER (or C) -- causes the value to be centered on each row.
- JUSTIFY (or J) -- causes the value to be justified.
Any other value has no effect.
Here is a demonstration of the effects of the SET ADJUST Uproc:
LABEL = REMARKS;
GETELEM;
MARGINS = 1,28;
START = 1,1;
UPROC = SET ADJUST = adjust;
PUTDATA;
when "adjust" = LEFT:           when "adjust" = RIGHT:
....v....1....v....2....v...    ....v....1....v....2....v...
Egg covers wide area; has          Egg covers wide area; has
small amount of thick white;    small amount of thick white;
yolk is somewhat flattened        yolk is somewhat flattened
and enlarged.                                  and enlarged.

when "adjust" = JUSTIFY:        when "adjust" = CENTER:
....v....1....v....2....v...    ....v....1....v....2....v...
Egg  covers  wide  area; has     Egg covers wide area; has
small amount of thick white;    small amount of thick white;
yolk  is  somewhat flattened     yolk is somewhat flattened
and enlarged.                          and enlarged.
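The four adjustments can be modeled for a single row in Python. This is a sketch, not SPIRES code: the name "adjust_row" is hypothetical, and the way odd leftover blanks are divided for CENTER and JUSTIFY is an assumption, since the text does not say how SPIRES distributes them.

```python
CODES = {"RIGHT": "R", "R": "R", "CENTER": "C", "C": "C",
         "JUSTIFY": "J", "J": "J"}


def adjust_row(text, width, adjust="LEFT"):
    """Place `text` in a row of `width` columns according to `adjust`."""
    key = CODES.get(adjust.strip().upper(), "L")  # anything else: left
    pad = width - len(text)
    if pad <= 0 or key == "L":
        return text.ljust(width)
    if key == "R":
        return " " * pad + text
    if key == "C":
        left = pad // 2  # assumption: the odd blank goes on the right
        return " " * left + text + " " * (pad - left)
    words = text.split()
    gaps = len(words) - 1
    if gaps < 1:
        return text.ljust(width)
    spaces = width - sum(len(w) for w in words)
    base, extra = divmod(spaces, gaps)
    out = ""
    for i, word in enumerate(words[:-1]):
        # Assumption: the leftmost gaps receive the extra blanks.
        out += word + " " * (base + (1 if i < extra else 0))
    return out + words[-1]


for mode in ("LEFT", "RIGHT", "CENTER", "JUSTIFY"):
    print(repr(adjust_row("yolk is somewhat flattened", 28, mode)))
```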
Two other statements can be used as well, though neither is as versatile as the SET ADJUST Uproc. The ADJUST statement may appear in a label group following the BREAK statement. It may have the same values as the SET ADJUST Uproc (i.e., RIGHT, JUSTIFY, etc.) although the value cannot be a variable. For example,
ADJUST = CENTER;
The SET JUSTIFY Uproc is an abbreviated form of the "SET ADJUST = JUSTIFY" Uproc -- its syntax is simply:
UPROC = SET JUSTIFY;
Again, because it handles all the situations, the SET ADJUST Uproc is the recommended method.
The SET REPEAT Uproc causes the value of $CVAL to be duplicated to fill the field defined by the LENGTH and/or MAXROWS statements. The value of $CVAL is not changed, only the value as it is placed in the frame.
The syntax of this Uproc is very simple:
UPROC = SET REPEAT;
This Uproc is frequently used to make borders or division lines:
LABEL; VALUE = '-'; LENGTH = 40; UPROC = SET REPEAT; PUTDATA;
The result on output would be:
----------------------------------------
(....v....1....v....2....v....3....v....4)
(The columns are shown beneath for clarity.)
If you want blanks between occurrences of the repeated value, you need to add a blank to the end (or perhaps to the beginning) of $CVAL, possibly by using the SET CVAL Uproc or the INSERT statement, or, as in the example above, the VALUE statement:
VALUE = '- ';
If no LENGTH or MAXROWS statement appears in the label group, the value will be repeated to fill the row in which it begins, up to the given (or default) right margin.
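The effect of SET REPEAT on a single-row field can be sketched in Python. The name "repeat_fill" is hypothetical, and cutting the final repetition at the field boundary is an assumption; the text says only that the value is duplicated to fill the field.

```python
def repeat_fill(cval, length):
    """Repeat `cval` until a field of `length` columns is filled."""
    if not cval or length <= 0:
        return ""
    times = -(-length // len(cval))  # ceiling division
    return (cval * times)[:length]


print(repeat_fill("-", 40))
print(repeat_fill("- ", 7))
```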
The SET REPEAT Uproc has no effect in input frames. Also, there is no $REPEAT variable associated with the SET REPEAT Uproc.
Do not confuse the SET REPEAT Uproc with the LOOP statement, which tells SPIRES to begin executing the label group again. SET REPEAT does not affect the execution of the label group; it affects the value being placed in the buffer.
The SET SKIPROW Uproc may be used in an output format label group to cause double spacing of the value being output to the buffer. That is, if the value being output would wrap into multiple rows, double spacing between rows of output will occur. Note though that double spacing between occurrences of the element or between different element values must be accomplished with the START and XSTART statements. [See B.4.6.1, B.4.8.4.]
The "skiprow" condition is cleared at the start of each label group. Although SET is part of the Uproc's syntax, there is no $SKIPROW variable.
The TITLE statement specifies a string to be placed in the frame independently from the value of the label group:
TITLE = string-expression;
where "string-expression" is an expression consisting of strings and/or variables that evaluate as strings.
The starting position for the TITLE string is specified with the TSTART statement, which immediately precedes the TITLE statement. It has the same syntax as the START and XSTART statements. [See B.4.6.1.]
TSTART = row, column;
If no TSTART statement is coded, the title is placed in the same default position as $CVAL is when no START statement is coded: the next row (X), starting in column 1.
The TITLE, VALUE and INSERT statements can all be used to create headings for values being placed in the frame. However, in most situations, one of them will clearly be preferable to the other two, depending on your particular requirements. Relevant points concerning each statement are shown below:
In summary, headings and titles not directly related to element occurrences are best handled by the VALUE statement. Headings for elements, especially multiply occurring ones, are best handled by the TITLE statement. If the heading is to appear on the same row as the element value and the element will have only one occurrence, the INSERT statement may be a bit easier to use than TITLE and TSTART. [In full-screen applications, other considerations regarding protected and unprotected fields may apply to these statements. See the manual "SPIRES Device Services" for further details.]
It is important to remember that a format is a program to be executed, and as such, it needs to have standard procedural capabilities, such as loops, subroutines, and branching. This section will discuss the label group statements, some of which have been introduced already, that control the execution of a frame. Although most of them are Uprocs (i.e., coded as values in UPROC statements), or Entry-Uprocs, some of them are statements themselves.
For example, one of the most important of these statements is the LABEL statement, discussed earlier. [See B.4.1.] Whenever branching occurs, execution continues at a LABEL statement, the beginning of a label group; you cannot jump into the middle of a label group. The rest of section B.4.7 will discuss other execution-control statements.
Most of the procedural statements are particular Uprocs. [See B.4.5.5.] Many of them, particularly the LET, SET and IF... THEN Uprocs, are commonly used with system or user variables. Variables are discussed in detail in a later chapter. [See B.9.]
The procedural statements for output frames are divided into eight major categories: condition tests, subroutine calls, branches, exits, variable manipulation, terminal input/output, error handling and comments. Error handling is covered in detail in the input portion of this manual -- the techniques described there are generally applicable to output formats also. [See C.3.6.]
Using the IF... THEN Uproc, you can specify that a particular Uproc be executed only if a given "simple" condition (or series of simple conditions) is met. Each simple condition usually involves the testing of a variable, perhaps comparing it to some specific value.
The syntax of the IF... THEN Uproc is:
UPROC = IF condition THEN uproc;
where "uproc" is another Uproc allowed in the format, and "condition" is either a simple condition (defined below) or the result of a series of simple conditions combined with the logical operators AND or OR. [Note that the word THEN in an IF... THEN Uproc may be replaced by a colon.]
A "simple" condition may be one of the following:
= (is equal to)
~= (is not equal to)
> (is greater than)
>= (is greater than or equal to)
< (is less than)
<= (is less than or equal to)
The complete condition that is tested in an IF... THEN Uproc may consist of more than one "simple" condition, combined using the AND or OR logical operators:
UPROC = IF $CVAL = '94305' OR = '94306' THEN LET CODE = 'Local';
You can negate the condition by putting the condition in parentheses and preceding it with the word NOT:
UPROC = IF NOT ($CVAL = '94305' OR '94306') THEN BEGINBLOCK;
The rules for coding and combining conditions in a formats Uproc are the same as the rules described in the "SPIRES Protocols" manual for the IF... THEN command; see the Protocols manual for details on such topics as when parentheses are necessary and when a repeated "term1" may be omitted in the compound condition.
After an IF... THEN Uproc, two other Uprocs may be used whose outcomes depend on the result of the IF... THEN comparison. They are the THEN and ELSE Uprocs:
UPROC = THEN uproc;
UPROC = ELSE uproc;
If the previous IF... THEN comparison is true, the Uproc given in the THEN Uproc will be executed; otherwise, it will not. Vice versa, if the previous IF... THEN comparison is false, the Uproc given in the ELSE Uproc will be executed; otherwise it will not.
Here are some examples showing how these Uprocs might be used. In the first example below, if the value of $CVAL is "N/A", it will be changed:
UPROC = IF $CVAL = 'N/A' THEN SET CVAL = 'Not Applicable';
Below, a function is used in the comparison:
UPROC = IF $SIZE($CVAL) > 60 THEN LET LONGVAL = $CVAL;
This Uproc declares that whenever $CVAL is longer than sixty characters, the user variable LONGVAL will be assigned the value of $CVAL.
In the next example, a user-defined flag variable, #NOPHONE, is tested. Both of these Uprocs are equivalent:
UPROC = IF #NOPHONE THEN RETURN;
UPROC = IF #NOPHONE = $TRUE THEN RETURN;
In other words, if the NOPHONE flag is set, then the RETURN Uproc should be executed. [See B.4.7.7.]
Here is a series of Uprocs, using the THEN and ELSE Uprocs:
(1) UPROC = IF $CVAL < 10 THEN LET SMALLCLASS = $TRUE;
(2) UPROC = THEN IF $CVAL = 1 THEN SET CVAL = '1 Student';
(3) UPROC = ELSE SET CVAL = $CVAL ' Students';
If $CVAL = 1:                    If $CVAL = 9:
(1) #SMALLCLASS = $TRUE          (1) #SMALLCLASS = $TRUE
(2) $CVAL = '1 Student'          (2) 2nd THEN not executed
(3) ELSE not executed            (3) $CVAL = '9 Students'

If $CVAL = 11:
(1) THEN not executed
(2) neither THEN executed
(3) $CVAL = '11 Students'
The above example is interesting in that it shows how several IF... THEN Uprocs can interact. The sample values for $CVAL and their impact on the Uprocs show how the two variables $CVAL and #SMALLCLASS are set. Note that the ELSE in the third Uproc refers to the IF... THEN Uproc in line 2 when $CVAL is less than 10 but to the one in line 1 when $CVAL is not less than 10. (This example was presented here to illustrate how IF... THEN Uprocs can interact, not to recommend using them this way -- such coding is admittedly confusing.)
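The interaction of the three Uprocs can be traced with a small Python model in which one flag remembers the result of the most recently executed IF... THEN test. This is a sketch of the behavior described above, not SPIRES code; the names "run_uprocs" and "last_if" are hypothetical.

```python
def run_uprocs(cval):
    """Trace Uprocs (1) to (3) for a numeric $CVAL.

    Returns (smallclass_flag, resulting_text).
    """
    smallclass = False
    text = str(cval)

    # (1) IF $CVAL < 10 THEN LET SMALLCLASS = $TRUE
    last_if = cval < 10
    if last_if:
        smallclass = True

    # (2) THEN IF $CVAL = 1 THEN SET CVAL = '1 Student'
    # The inner IF runs, and replaces the remembered result,
    # only when the previous test was true.
    if last_if:
        last_if = cval == 1
        if last_if:
            text = "1 Student"

    # (3) ELSE SET CVAL = $CVAL ' Students'
    if not last_if:
        text = text + " Students"

    return smallclass, text


for sample in (1, 9, 11):
    print(sample, run_uprocs(sample))
```

For 11, the inner IF in (2) is never executed, so the ELSE in (3) still sees the false result from (1).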
The THEN and ELSE Uprocs almost always refer to the last executed IF... THEN Uproc, even if it is in an earlier label group or even an earlier frame or calling frame. However, if the Uproc in an IF... THEN Uproc is an XEQ PROC Uproc, e.g.,
UPROC = IF $FREC THEN XEQ PROC FIRST.RECORD;
the result of the IF-test will be remembered by SPIRES when it returns (if it went there) from the FIRST.RECORD procedure. In other words, even though IF... THEN Uprocs may appear in the FIRST.RECORD proc (and THEN and ELSE Uprocs in the proc will be affected by them), any THEN and ELSE Uprocs executed after SPIRES returns from the proc will be controlled by the IF... THEN Uproc shown above. [See B.4.7.5.]
This section describes the block-construct Uprocs:
BEGINBLOCK & ENDBLOCK
WHILE & ENDWHILE
REPEAT & UNTIL
ITERATE
LEAVE
[This information is condensed from the reference manual "SPIRES Protocols", which contains more detailed information about block constructs.]
A block construct is a set of Uprocs delimited by two Uprocs, one at the start and one at the end, that define the type of block. The Uprocs within the block execute in order, one by one, but they can also be regarded as a self-contained entity that executes as a whole or not at all:
UPROC = If $CVal = 'NAME' Then BeginBlock;
UPROC = Set CVal = $UserName;
UPROC = Set Parm = 'Name = ' $UserName;
UPROC = EndBlock;
UPROC = Set Value = $LSub($ProcValue,' ');
In this example, if the value of $CVAL is NAME, then all the Uprocs between BEGINBLOCK and ENDBLOCK will be executed; if $CVAL's value is something else, none of those Uprocs will be executed.
There are three sets of block constructs, each of which has three parts:
(1)  BEGINBLOCK     WHILE condition(s)     REPEAT
(2)    uprocs         uprocs                 uprocs
(3)  ENDBLOCK       ENDWHILE               UNTIL condition(s)
BEGINBLOCK and ENDBLOCK open and close a simple block of Uprocs. The other two pairs are called "looping block constructs". WHILE and ENDWHILE open and close a block that loops as long as (WHILE) conditions stated at the beginning of the block are true. REPEAT and UNTIL open and close a block that loops until conditions stated at the end of the block are true.
In the diagram, you can see how similar the constructs are. Statement 1 opens the block of Uprocs (2), and statement 3 closes it. Statement 1 must always be paired with its corresponding statement 3; otherwise a compilation or execution error will occur. Statement 3 cannot be the object of an IF, THEN or ELSE Uproc; for example, "ELSE ENDBLOCK" is an illegal Uproc.
IMPORTANT: In formats, you may use these constructs in both UPROC and ENTRY-UPROC statements. However, a block construct must be entirely contained within the Uprocs or within the Entry-Uprocs of a single label-group. In other words, it cannot continue from one label-group to another, nor can it continue from the Entry-Uprocs into the Uprocs of a single label-group.
These constructs are very powerful. Besides their effectiveness in organizing code, they also make it possible to loop within a label-group. Otherwise, looping (with the LOOP statement) can be done only for the entire label-group.
The Uprocs within the block can be any other Uprocs, including other block constructs. However, block constructs inside of others must be completely nested therein; they cannot overlap.
Legal:                 Illegal:
BeginBlock             BeginBlock
...                    ...
Repeat                 Repeat
...                    ...
Until condition        EndBlock
...                    ...
EndBlock               Until condition
The JUMP and GOTO Uprocs should not be used indiscriminately within block constructs, either. [Many programmers would contend that JUMP and GOTO have no place within block-construct programming at all.] Jumping out of a block construct can cause logic problems with the $ELSE system flag, which controls THEN/ELSE processing, and is not generally recommended. Be aware that if you do use JUMP to leave a block, $ELSE will retain its value from within the block; it will not revert to the value it had when the block began execution, as it would if you left the block normally.
Looping block constructs provide another way to leave from inside: the LEAVE Uproc. LEAVE causes execution to continue with the next Uproc outside of the looping block, just past the ENDWHILE or UNTIL Uproc. LEAVE is not available in BEGINBLOCK...ENDBLOCK constructs.
Here is an example demonstrating use of the LEAVE Uproc:
UPROC = Let n = 0;
UPROC = While #n < 10;
UPROC = Let XRef(#n) = $GetUVal(CrossRef,#n,'No value');
UPROC = If #XRef(#n) = 'No value' Then Leave;
UPROC = Let n = #n + 1;
UPROC = EndWhile;
The idea behind these Uprocs is to place the first ten occurrences of the CROSSREF element into the first ten occurrences of the #XREF variable. However, when no value exists for an occurrence, the $GETUVAL function returns the value "No value", which in the next Uproc is treated as the signal to leave the block construct.
Another useful Uproc for looping block constructs is the ITERATE Uproc, which transfers execution to the last statement in the block, either ENDWHILE or UNTIL.
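As a rough sketch of ITERATE (not an example from the reference manual), the Uprocs below count how many of the first ten CROSSREF occurrences hold a value. The user variables #N, #VAL and #FOUND are hypothetical and are assumed to have been declared. Note that the counter is incremented before the ITERATE Uproc: since ITERATE transfers execution directly to the ENDWHILE, any Uprocs between it and the end of the block are skipped for that pass, and a counter updated after it would never change.

```
UPROC = Let n = 0;
UPROC = Let Found = 0;
UPROC = While #n < 10;
UPROC = Let Val = $GetUVal(CrossRef,#n,'No value');
UPROC = Let n = #n + 1;
UPROC = If #Val = 'No value' Then Iterate;
UPROC = Let Found = #Found + 1;
UPROC = EndWhile;
```

Where LEAVE abandons the loop altogether, ITERATE abandons only the current pass.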
Block constructs may be nested to 31 deep in a label-group.
By default, of course, execution of a frame begins with the first label group and proceeds from one to the next until the end of the frame. Several statements and Uprocs are available to alter the standard flow, and are discussed in this and the next few sections of B.4.8.
Certainly one of the most commonly used is the JUMP Uproc. It tells SPIRES to stop executing the current label group immediately and begin executing the next label group or another named label group. Its syntax is:
UPROC = JUMP [label.name];
where "label.name" is the name of the label group at which execution is to resume. If no label name is given, execution continues with the next label group, which is equivalent to 'CASE 1'.
The GOTO Uproc is the same as the JUMP Uproc, except that a label name is required:
UPROC = GOTO label.name;
Variables may not be used for "label.name" for either statement (cf. the CASE Uproc).
Here is an example using the JUMP Uproc. As is often the case with Uprocs, JUMP may be the object of an IF...THEN or ELSE Uproc:
LABEL = RECEIVED;
GETELEM;
DEFAULT;
UPROC = IF $DEFAULT THEN JUMP COMMENTS;
PUTDATA;
LABEL = RECEIVED.DATE;
...
LABEL = ACCEPTED.BY;
...
LABEL = COMMENTS;
GETELEM;
...
If no value is retrieved for the RECEIVED element, the $DEFAULT flag is set and thus the other elements relating to the receiving are skipped. [See B.4.5.1, B.4.8.1.]
Once again it is important to remember the order in which statements in a label group are executed. Uprocs always precede PUTDATA statements, so in the following label group, if the JUMP is executed, the PUTDATA will not be:
LABEL;
GETELEM = DATE;
UPROC = IF $CVAL = 'N/A' THEN JUMP;
PUTDATA;
The named label group may precede or follow the current label group. Of course, if it precedes the current one, beware of creating an infinite loop. Normally you would not name the current label in the JUMP Uproc; if you do need to execute the same label group again, you would probably use the LOOP statement. [See B.4.8.4.]
The CASE Uproc, discussed next, provides an alternate method for jumping to label groups. [See B.4.8.3.]
The CASE Uproc has the same purpose as the JUMP Uproc. However, instead of providing a label name, you give an integer value "n", and execution will resume at the "nth" label group from the current one. Also unlike JUMP, a variable can be used as the value.
The syntax of the CASE Uproc is:
UPROC = CASE n;
where "n" is an integer (positive, negative, or 0) or an integer variable. If "n" is positive, execution will jump to the "nth" label group from the current one. If "n" is negative, execution will jump to the "nth" label group before the current one. If "n" is zero, SPIRES will begin executing the current label group again -- unlike LOOP, however, this statement would cause the first element occurrence to be processed again, if none is specified in the GETELEM statement. (For 0 and negative values, beware of infinite loops.)
Like the JUMP Uproc, CASE may be the object of an IF...THEN or ELSE Uproc, and 'CASE 1' is equivalent to 'JUMP' with no label.
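A sketch of CASE used with a variable appears below. The label names and the user variable #SKIP are hypothetical, and the sketch assumes that the variable is written with its pound sign, as user variables are in other Uproc expressions.

```
LABEL = BRANCH;
UPROC = CASE #SKIP;
LABEL = SUMMARY;
...
LABEL = DETAIL;
...
LABEL = TOTALS;
...
```

Under the rules above, a #SKIP value of 1 would resume execution at SUMMARY (just as a JUMP with no label name would), a value of 2 at DETAIL, and a value of 3 at TOTALS. Unless further branching is coded, execution then proceeds from the chosen label group through the ones that follow it.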
The LOOP statement tells SPIRES to begin executing the current label group again. Basically it has two purposes: 1) to let you retrieve multiple occurrences of elements; and 2) to provide a general looping capability that includes "counter variables". The two purposes are discussed separately below.
The syntax of the LOOP statement is:
LOOP = n;
where "n" is an integer or an integer variable representing the maximum number of times SPIRES should restart the execution of the label group. That is, if "n" is 2, the label group will execute once and then twice more for the loop, for a total of three times.
An alternate form, most common for element processing, is:
LOOP;
This form tells SPIRES to continue looping until something happens within the label group to cause the label group processing to stop (see below).
Several system variables are commonly accessed during loop processing, including $LOOPCT, $ELOCC, $LASTOCC, $CURROCC, and $TRUEOCC. [See E.2.2.16, E.2.2.26, E.2.2.27.]
When you want to retrieve all occurrences of an element, you generally use the LOOP statement. Here is a simple example:
LABEL = PHONE.NUMBER;
GETELEM;
PUTDATA;
LOOP;
The first occurrence of PHONE.NUMBER is placed in the default starting position, and then the looping occurs. The second time through the label group, SPIRES retrieves the second occurrence of the element, placing it in the default XSTART position (which is START = X,n; where "n" is the starting column for the first occurrence positioned), and so forth. The LOOP statement, when used in label groups having GETELEM statements, causes SPIRES to retrieve the next occurrence of the element each time a loop occurs; no specific reference to an element occurrence needs to be made. [See B.4.2.]
When SPIRES loops back to the start of the label group and the GETELEM retrieves no more occurrences, then the loop is broken. SPIRES then skips the rest of the label group, just as it does when there are no occurrences of an element at all; processing resumes at the next label group. That is the most common method for ending a loop.
When multiple occurrences of elements are being positioned in a frame, the START statement can affect the placement of the second through "nth" occurrences. As noted above, the default positioning for those occurrences is in row X (which is not necessarily *+1), starting in the same column given in the START statement.
The XSTART statement, following a LOOP statement, may be used to override the default starting position for the "extra" occurrences:
LABEL = COMMENTS;
GETELEM;
START = 3,5;
MARGINS = 1,65;
PUTDATA;
LOOP;
XSTART = *+2,5;
Subsequent occurrences of the COMMENTS element will be positioned beginning two rows from the end of the previous occurrence, starting in column 5. The XSTART statement has the same form and allows the same values as the START statement. [See B.4.6.1.] You can also use the Uprocs SET STARTROW and SET STARTCOL when you are working with multiple element occurrences. [See B.4.6.2.]
Note that the XSTART statement would not be used in a looping label group concatenating occurrences with SET BUILDSEP and SET BUILDEND Entry-Uprocs. [See B.4.5.7.]
The other statements in a label group will be applicable to multiple element occurrences. The INSERT statement, for example, is applied to all occurrences. As a result, it is usually replaced by the TITLE and TSTART statements when only the first occurrence should have the inserted string. [See B.4.8.]
The DEFAULT statement interacts with LOOP in an interesting way.
LABEL = COURSES1;        LABEL = COURSES2;
GETELEM;                 GETELEM;
DEFAULT = 'None';        DEFAULT = 'None';
PUTDATA;                 PUTDATA;
LOOP;                    LOOP = 5;
If there are no occurrences of the COURSES1 element, the default value will be displayed once, and no looping will occur. However, if there are no occurrences of the COURSES2 element, the default value will be displayed six times, because of the value given for the LOOP statement. SPIRES realizes in the first situation that you want the default value to be used only once, rather than to create an infinite loop. On the other hand, since a specific number is given on the right, SPIRES keeps looping that many times.
Similarly, if one or more occurrences of COURSES1 are retrieved, the default value will never be displayed. But if fewer than six occurrences of COURSES2 are retrieved, the default value will be used to bring the number up to that total.
In other words, when a loop count is given and a default value is in effect, SPIRES will loop that many times, regardless of the number of element occurrences that actually exist, supplying the default value each time that the next occurrence does not exist. If you want to supply a loop count and a default value, but you do not want the default value used if there are any values at all, you should test for default processing in a Uproc, testing the value of the $DEFAULT flag.
The LOOP statement and the JUMP statement are very different in effect:
LABEL = CHILDREN;        LABEL = CHILDREN;
GETELEM;                 GETELEM;
PUTDATA;                 UPROC = JUMP CHILDREN;
LOOP;                    PUTDATA;
The label group on the right is practically useless. All Uprocs are executed before a PUTDATA statement, and a PUTDATA statement is executed before a LOOP statement. That means that the JUMP statement on the right occurs before any value is placed in the frame. Moreover, in the example on the right, the GETELEM statement would always retrieve the first occurrence of the element; only the LOOP statement tells SPIRES to retrieve the next element occurrence when no specific occurrence is requested. Since the first occurrence is retrieved over and over again, the GETELEM statement would never fail (unless there were no occurrences of CHILDREN in the first place) and hence the label group would become an "infinite loop", which SPIRES would stop executing after 32,767 loops.
The LOOP statement can also be handy when you need a general looping capability for a label group. By checking the system variable $LOOPCT (for "loop count") you can keep track of how many loops have occurred:
LABEL;
UPROC = LET TOTAL = #TOTAL + #ADDEND::$LOOPCT;
LOOP = 19;
In this example, the variable TOTAL is augmented each time through the loop with the next occurrence of the ADDEND variable. The particular occurrence added is controlled by $LOOPCT, beginning with occurrence number 0. In other words, the label group adds together the first twenty occurrences of the ADDEND variable.
Because the LOOP statement only restarts the label group it is in, it cannot be used when the loop must include several label groups. In such cases, you may use the JUMP or CASE statements, probably augmented with a user-declared variable to serve as a counter. [See B.4.8.2, B.4.8.3.] Alternatively, you could put the label groups into a separate indirect frame called from a label group with a LOOP statement. [See B.4.8.7.]
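The JUMP approach might be sketched like this. The label names and the user variable #COUNT are hypothetical, and #COUNT is assumed to have been declared.

```
LABEL = SETUP;
UPROC = LET COUNT = 0;
LABEL = HEADING;
...
LABEL = DETAIL;
...
LABEL = AGAIN;
UPROC = LET COUNT = #COUNT + 1;
UPROC = IF #COUNT < 5 THEN JUMP HEADING;
LABEL = AFTER.LOOP;
...
```

Here the label groups HEADING through AGAIN would execute five times before execution falls through to AFTER.LOOP. Since the JUMP names a preceding label group, the counter test is what keeps the loop from becoming infinite. Also, because JUMP does not advance to the next element occurrence the way LOOP does, any GETELEM statement inside such a loop would need to specify the occurrence it wants.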
In addition to controlling or converting an element's value, a label group can also control which occurrence of the element it retrieves, using an Entry-Uproc, which is executed at the beginning of a label group before a GETELEM or IND-STRUCTURE statement. [See B.4.5.6 for a general description of Entry-Uprocs; the IND-STRUCTURE statement, which retrieves occurrences of structures, is discussed later in this manual in chapter B.8.]
The next few paragraphs discuss two Entry-Uprocs, SET LOOP BACKWARDS and SET STARTOCC, which can be used (together or separately) to control which element occurrences a label group with a LOOP statement will retrieve.
The SET LOOP BACKWARDS Entry-Uproc lets you retrieve element occurrences in reverse order from how they are stored in the record: last becomes first and first becomes last. The syntax of the Entry-Uproc is simplicity itself:
SET LOOP BACKWARDS
Thus, if a record contained eight occurrences of the NAME element, the label group below would first retrieve the eighth occurrence, then the seventh, then the sixth, and so on:
LABEL = NAME;
ENTRY-UPROC = SET LOOP BACKWARDS;
GETELEM;
PUTDATA;
LOOP;
The SET STARTOCC Entry-Uproc lets you specify the starting occurrence that the GETELEM (or IND-STRUCTURE) statement in a label group should retrieve. The syntax is straightforward:
SET STARTOCC = n;
Here "n" is a number corresponding to the occurrence's actual position in the record, where "0" represents the element's or structure's first occurrence, "1" represents its second occurrence, and so on.
The label group containing a SET STARTOCC Entry-Uproc will almost certainly contain a LOOP statement as well; each iteration of the loop retrieves the next occurrence of the element beginning with the occurrence named in the Entry-Uproc. Thus the label group below would retrieve occurrences of the COURSES element beginning with the fourth occurrence of the element (assuming there was one). It would skip the first three occurrences altogether, and would loop until there were no more occurrences to retrieve:
LABEL = COURSES;
ENTRY-UPROC = SET STARTOCC = 3;
GETELEM;
PUTDATA;
LOOP;
The two Entry-Uprocs can be used together to cause occurrence retrieval to begin at an "nth" occurrence and loop backwards from that point. Remember to include a DEFAULT statement [See B.4.5.1, B.4.7.4.] if there is a chance that a specified occurrence will be absent from the record; otherwise, the format will skip the rest of the label group without retrieving "subsequent" occurrences, even though those occurrences come first in the actual record:
LABEL = COURSES;
ENTRY-UPROC = SET LOOP BACKWARDS;
ENTRY-UPROC = SET STARTOCC = 3;
GETELEM;
DEFAULT = '---';
PUTDATA;
LOOP = 3;
Some additional comments on SET LOOP BACKWARDS and SET STARTOCC:
- Note that, for reasons of timing, these statements must be coded as Entry-Uprocs, not as Uprocs.
- Be careful not to code an occurrence number as part of the GETELEM statement, since that number would override the value in the SET STARTOCC Entry-Uproc. (It would also cause the loop to retrieve a single occurrence over and over.)
- If one or both of these Entry-Uprocs is in effect, and you need to keep track of true occurrence numbers (which is more likely in an input format than during output), you should use the variable $CURROCC, not $LOOPCT, as your counter. Instead of $CURROCC you can use $TRUEOCC if there is any chance that an element filter is in effect.
The SET FILTER and CLEAR FILTER Uprocs (or Entry-Uprocs) let you set and clear filters as a format executes, in order to control which data is processed by the format. The syntax parallels that of the interactive SET FILTER and CLEAR FILTER commands:
UPROC = SET FILTER (type) FOR elem-name WHERE expression;
UPROC = CLEAR FILTER FOR elem-name;
UPROC = CLEAR FILTERS;
Filters set with the Uproc act as overlay filters (even though the term OVERLAY is not used in the command). That is, they are additive to other filters that may be in effect for the type of operation the format is executing (e.g. DISPLAY). As with the SET FILTER OVERLAY command, the SEQUENCE, OCC, and IN LIMIT options of the SET FILTER command are not allowed in SET FILTER Uprocs.
The CLEAR FILTER FOR elem-name and CLEAR FILTERS Uprocs clear only the filters set in your format with SET FILTER Uprocs, and not other filters that may have already been in effect.
The SET FILTER and CLEAR FILTER Uprocs may only be used in data or indirect frames. There is one restriction to note: these Uprocs are not allowed if your format uses the FORMAT-OPTIONS = GENELEM or FORMAT-OPTIONS = GENVIRT statement. [See B.6.2.]
All filters established during execution of a format will be cleared when the format is exited, after each record is processed. In other words, you must make sure the filter you want is set for each record processed by the format. For example, don't set your filters in a startup or initial frame.
If you are setting merge filters in an input format, note that you must also execute the BUILD RECORD Uproc in order for the merge filter to work. That is, if the merge filter is set in the format, the record must be built while still under format control. In the absence of BUILD RECORD, the format is exited, the filter set in the format is cleared, and so when the record is built the filter will not be in effect. [See C.3.4.4.]
If your WHERE clause in the SET FILTER Uproc contains a variable, the variable is evaluated at the time SET FILTER is executed. The element's Inprocs are also executed at that time. If there is a mismatch between the variable value and the element's processing rules, you may receive processing rule errors.
Here is a very simple example of how you might use SET FILTER and CLEAR FILTER Uprocs. The effect here is that when these output frames are executed, only DONATION structures containing DATE values greater than 1985 will be displayed.
FRAME-ID = DETAIL;
DIRECTION = OUTPUT;
FRAME-DIM = 100,72;
:
:
LABEL;
UPROC = set filter for donation where date > 1985;
LABEL = DATE;
IND-STRUCTURE = DONATION;
IND-FRAME = DONATION;
LOOP;
LABEL;
UPROC = clear filters;
:
:
FRAME-ID = DONATION;
DIRECTION = OUTPUT;
SUBTREE = DONATION;
LABEL = DATE;
GETELEM = DATE;
MARGINS = 8,15;
START = *+1,8;
PUTDATA;
LABEL = LOCATION;
GETELEM = LOCATION;
MARGINS = 30,72;
START = *,30;
PUTDATA;
In formats, SPIRES lets you handle subroutines basically in two ways: with the XEQ PROC Uproc, in which the called subroutine is within the current frame definition, and with the IND-FRAME (for INDirect FRAME) statement, in which the called subroutine is a different frame altogether. Generally, you use the IND-FRAME procedure when the same subroutine needs to be called from several different frames; if the subroutine will be called only from one frame, the XEQ PROC Uproc is typically used. This section will discuss XEQ PROC; the IND-FRAME statement is discussed in the next section and later in Part B with regard to structure processing. [See B.4.7.6, B.8.2.]
The syntax of the XEQ PROC Uproc is:
UPROC = XEQ PROC [label.name];
where "label.name" is the name of a label group in the same frame. SPIRES will jump to the named label group and begin executing it. If "label.name" is not specified, SPIRES will jump to the next label group in the frame.
At the end of the label group or label groups that comprise the "proc" (subroutine), you place the RETURN Uproc:
UPROC = RETURN;
When SPIRES encounters this statement, it returns to the next label group after the label group containing the XEQ PROC Uproc that caused the subroutine to be executed. (Note then that it does not return to the statements after the XEQ PROC Uproc in that label group.) There is one exception to that rule: when the XEQ PROC is in the NULL or ATTN clause of an ASK Uproc, SPIRES will return to the ASK Uproc from the subroutine and execute the "ask" again. [See B.4.7.9.]
The RETURN Uproc may be used in other situations besides returning from a subroutine. [See B.4.7.7.]
Here is an example of a label group that calls a subroutine:
LABEL = MONTHLY.QUANTITY;
GETELEM;
UPROC = IF $CVAL < 5 THEN XEQ PROC SMALL.ORDER;
PUTDATA;
LABEL = NEXT.ONE;
Note that the PUTDATA is not executed in MONTHLY.QUANTITY if the XEQ PROC is executed. Control is "returned" to the NEXT.ONE label group when the RETURN statement is encountered in the SMALL.ORDER proc.
Within the subroutine itself, if you do not code a RETURN statement, SPIRES will continue executing one label group after another until it reaches the end of the frame (or some branching statement), at which point the XEQ PROC will be cancelled and execution of the current frame will stop.
By convention, subroutines are placed at the end of a frame. To prevent SPIRES from executing them when it gets to the end of the frame, people generally code a RETURN Uproc in the last label group before the subroutines begin:
LABEL = COMMENTS;
GETELEM;
PUTDATA;
LABEL = EXIT;
UPROC = RETURN;
LABEL = SUBROUTINE1;
(etc.)
Here, the COMMENTS label group is the last "mainstream" label group of the frame. The EXIT label group tells SPIRES to stop processing the frame and "return" one level, halting execution of this particular frame. [See B.4.7.7.] The next label group, the first of the subroutines, does not execute unless an XEQ PROC Uproc (or some other branching Uproc) elsewhere invokes it.
The XEQ PROC Uproc is thus involved with nested procedures, unlike JUMP, which involves simple branching. A subroutine called by XEQ PROC may even call another subroutine, etc., to a maximum depth of eight subroutines.
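Putting the pieces of this section together, a frame with one subroutine might be laid out as sketched below. The body of the SMALL.ORDER proc is invented for illustration; it merely counts small orders in a hypothetical user variable #SMALLCT, which is assumed to have been declared.

```
LABEL = MONTHLY.QUANTITY;
GETELEM;
UPROC = IF $CVAL < 5 THEN XEQ PROC SMALL.ORDER;
PUTDATA;
LABEL = NEXT.ONE;
...
LABEL = EXIT;
UPROC = RETURN;
LABEL = SMALL.ORDER;
UPROC = LET SMALLCT = #SMALLCT + 1;
UPROC = RETURN;
```

When the quantity is less than 5, SMALL.ORDER executes and its RETURN sends control to NEXT.ONE, bypassing the PUTDATA in MONTHLY.QUANTITY. The RETURN in the EXIT label group ends the frame before SPIRES can fall into the subroutine.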
As the previous section suggested, you may want a subroutine that can be called from several frames. Since the XEQ PROC Uproc can only specify a label group within the current frame, it is not suitable for this situation. Instead, you must call an "indirect frame". Note that indirect frames are more often used when element structures are being formatted; they are discussed in more detail later. [See B.8.]
Indirect frames are a general tool in formats, useful in a variety of situations. Though the discussion in this section specifically concerns their use in output formats as subroutines, most of the discussion applies to their use in general, with cross references provided when that is not the case. When an indirect frame is invoked from a "calling frame", format execution control is transferred to the indirect frame. Execution control returns to the calling frame when the indirect frame finishes executing, either because the end of the frame has been reached, or because a RETURN Uproc is encountered.
The indirect frame may retrieve record elements, but not elements in structures unless the appropriate IND-STRUCTURE statement is coded. [See B.8.2.]
An indirect frame is invoked by the IND-FRAME statement:
IND-FRAME = frame.name;
where "frame.name" is the name of another frame defined in the format definition. Thus "frame.name" must follow the rules for frame names given for the FRAME-ID statement. [See B.3.1.]
The IND-FRAME statement usually appears right after the LABEL statement when structure processing is not involved. Neither GETELEM nor VALUE statements (except in subgoal processing) are allowed in the same label group as an IND-FRAME statement.
LABEL = GO.TO.SUB1;
IND-FRAME = SUB1;
When the GO.TO.SUB1 label group begins executing, control is given immediately to the frame SUB1.
Besides coding the IND-FRAME statement, you must also declare the frame as an indirect one in the format declaration section. [See B.5.2.]
To some extent, you can think of an IND-FRAME statement as replacing a GETELEM or VALUE statement in a label group. (Note however that in subgoal processing [See B.12.] both a VALUE and an IND-FRAME statement will appear in the calling label group.)
Generally speaking, when an indirect frame is used for a subroutine, no frame dimensions are coded in it, indicating that the frame dimensions of the calling frame will be in effect for any data placement done by the indirect frame. [See B.8.1.] Other frame-identification statements that are often coded in the indirect frame definition are DIRECTION and USAGE, usually with the same values as those of the calling frame. [See B.3.2, B.3.4, D.1.2.]
When execution is transferred to an indirect frame, it executes as any other frame does, one label group after another, unless some type of branching occurs. The indirect frame may itself call subroutines (using either XEQ PROC Uprocs or IND-FRAME statements). As you continue to "descend" into nested subroutines, remember that the RETURN Uproc will return you one level each time it is encountered. You will also return from the indirect frame to the calling frame automatically when the end of the indirect frame is encountered.
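The overall arrangement of a calling frame and an indirect frame used as a subroutine might look roughly like this. The frame and label names other than GO.TO.SUB1 and SUB1 are hypothetical, and the format declaration section, where SUB1 must also be declared as an indirect frame, is not shown.

```
FRAME-ID = MAIN;
DIRECTION = OUTPUT;
FRAME-DIM = 100,72;
LABEL = GO.TO.SUB1;
IND-FRAME = SUB1;
LABEL = NEXT.ONE;
...
FRAME-ID = SUB1;
DIRECTION = OUTPUT;
LABEL = FIRST.STEP;
...
LABEL = LAST.STEP;
UPROC = RETURN;
```

No frame dimensions are coded for SUB1 in this sketch, so those of MAIN would apply. The RETURN in LAST.STEP is included only for clarity, since reaching the end of SUB1 would return control to MAIN in any case.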
The indirect frame is allowed to call itself, but care must be taken to avoid "infinite nesting"; nesting is limited to about ten levels, beyond which a serious error occurs. (You may be able to descend deeper than ten levels in some circumstances.)
The calling label group may have other statements in it, for SPIRES returns to the calling label group once it has finished executing the indirect frame. A frequent companion of the IND-FRAME statement is the LOOP statement, which causes the indirect frame to be executed again. [See B.4.7.4.] Uproc statements may also appear; note that they are executed each time that SPIRES returns from an indirect frame before any looping occurs:
LABEL;
IND-FRAME = SET.VALUES;
UPROC = IF #COUNTER > 4 THEN XEQ PROC CLR.TEMP.VALS;
LOOP;
The above example contrasts the manner in which the two types of subroutines handle returns. SPIRES returns to the next statement in the calling label group when it returns from the indirect frame, but it will "return" to the next label group when it finishes executing the subroutine invoked by the XEQ PROC Uproc. The XEQ PROC Uproc here in effect breaks the loop.
By default, when SPIRES is through executing label groups in a frame, execution of that frame stops. SPIRES may then continue to other frames of the format, if appropriate, or stop format processing completely, returning you to command level, if all appropriate frames have been executed. (Exception: For indirect frames invoked by calling frames, SPIRES will return to the calling frame when finished with the indirect one.)
The formats language provides other means of escape from a frame besides the default method. Perhaps the most common, and least dramatic, has already been introduced:
UPROC = RETURN;
When the RETURN Uproc is encountered and SPIRES is not in the middle of an XEQ PROC "proc" within the executing frame, no further processing of that frame occurs. [See B.4.8.6 for its use with the XEQ PROC Uproc.] If it is an indirect frame, control returns to the calling frame. If not, the next appropriate frame as declared in the format declaration section, if any, begins executing. [See B.5.2.]
Two other available Uprocs will stop format processing completely.
UPROC = ABORT [QUIET|NOERROR];
UPROC = STOPRUN [QUIET|NOERROR];
The ABORT Uproc causes SPIRES to stop format processing for the current record; no more frames will be executed for that record. However, if multiple records are being processed (e.g., you have issued a TYPE command), SPIRES will begin format processing of the next record.
The STOPRUN Uproc is similar to the ABORT Uproc, except that when multiple records are being processed, a STOPRUN will also stop SPIRES from processing any further records for the issued command. (In a report, however, STOPRUN will execute any ending frames that are present, in order to provide totals up to the point where execution stopped.) [See B.10.6.]
After either an ABORT Uproc or a STOPRUN Uproc, the system flag variable $ABORT is set. [See E.2.1.11.]
For either Uproc, the QUIET option prevents SPIRES from displaying the error message normally associated with an ABORT or STOPRUN. The NOERROR option not only suppresses the error messages (like QUIET) but also does not cause the setting of $NO or $SNUM. (In the absence of the NOERROR option, ABORT and STOPRUN turn on the $NO flag and set $SNUM to 847 and 848, respectively.)
The RETURN, ABORT and STOPRUN Uprocs often appear in THEN clauses of IF... THEN Uprocs, where some critical value is being tested:
LABEL = ACCOUNT.CHECK;
GETELEM = ACCOUNT;
UPROC = IF $CVAL ~= $ACCOUNT THEN * 'Wrong account number.';
UPROC = THEN * 'You can see only your own records.';
UPROC = THEN STOPRUN QUIET;
If the user's account number is not the same as the account number in the record, a few messages are displayed and no further format processing occurs for the issued command.
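By contrast, ABORT with the NOERROR option could be used to pass over individual records silently, as in the sketch below. The STATUS element and its value 'CLOSED' are hypothetical.

```
LABEL = STATUS.CHECK;
GETELEM = STATUS;
UPROC = IF $CVAL = 'CLOSED' THEN ABORT NOERROR;
PUTDATA;
```

For a record whose STATUS is CLOSED, no further frames would execute and no error message would appear. Because NOERROR is specified, $NO and $SNUM would not be set, and under a multiple-record command SPIRES would go on to the next record.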
Another Uproc, BACKOUT, causes an immediate exit from the current format processing, telling SPIRES to behave as if a frame of the type required for the issued command did not exist in the set format. For example, if the BACKOUT Uproc is encountered during format processing under a DISPLAY command, the record would be displayed in the standard SPIRES format. Note though that if the BACKOUT Uproc occurs after the buffer has already been flushed (e.g., in line-by-line processing), the flushed data has already been output and the BACKOUT Uproc will not affect it; the current contents of the buffer will be discarded, however.
You cannot "pause" during the execution of a frame to return to command mode -- once you return to command mode, the current execution of the format has stopped. It is possible, however, to pass a command to WYLBUR from inside a format, using the $ISSUECMD function. For example,
UPROC = EVAL $ISSUECMD('SEND GQ.DOC Format LOOK used by ' $ACCT);
The EVAL Uproc causes the $ISSUECMD function to be evaluated, which passes the SEND command to WYLBUR to execute immediately. [See B.9.3.3.] The format pauses while WYLBUR executes the command, but as soon as the WYLBUR command is executed, the format resumes execution.
The syntax of $ISSUECMD is:
$ISSUECMD(command)
where "command" is a string expression. The function will return a "1" ($TRUE) if it succeeds, "0" ($FALSE) if it fails.
The command specified in the $ISSUECMD function must not be a SPIRES command; only ORVYL and WYLBUR commands are allowed. Be aware that some commands may affect your session such that the format cannot continue executing: some examples are CALL SPIRES and LOGOFF.
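Since the function returns a value, the result can be tested rather than discarded. The sketch below assumes a declared user variable #OK (a hypothetical name) and uses the "*" Uproc to display a message:

```
UPROC = LET OK = $ISSUECMD('SEND GQ.DOC Format LOOK used by ' $ACCT);
UPROC = IF #OK = 0 THEN * 'The notification could not be sent.';
```

The choice between EVAL and LET here depends only on whether the format needs to know that the command succeeded.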
Although variable handling will be covered in detail in a later chapter [See B.9.] it is such an important part of the procedural language that it is worth introducing here. This brief discussion specifically introduces the LET and SET Uprocs, which can be used to assign values to system and user variables.
Variables are used for a number of different programming reasons. In a SPIRES format, some of the likely reasons include:
- to hold an element value from one label group for checking in another;
- to hold a loop counter when the loop (done with a JUMP statement, for instance) extends across several label groups;
- to hold a data value being constructed over several label groups so that it can be placed in the frame all at once;
- to hold information about the current processing, such as the system variable $LABEL, which holds the name of the currently executing label group.
Examples of such uses will be shown throughout the manual. [See B.10.11, for instance.]
Values may often be assigned to system variables, identified by the dollar sign that begins their names (e.g., $CVAL, $PROMPT). In a format, a SET Uproc is used for this purpose:
UPROC = IF $CVAL = 0 THEN SET CVAL = 'None';
Not all system variables may be set. A list of those that may be set from a format appears elsewhere. [See B.4.5.5.]
You may also assign values to user variables with the LET Uproc:
UPROC = LET VALUE = #VALUE || $CVAL;
In that example, the current value of the system variable $CVAL is concatenated to the current value of the user variable #VALUE to make a new #VALUE. [See B.9.3 to learn how user variables are declared and allocated in a format.]
In general, the LET and SET statements follow the same rules as the LET and SET commands in command mode. For example, the variable name right after the LET or SET verb does not have a dollar sign or pound sign. On the other hand, two differences are: 1) in formats, a SET Uproc allows expressions for the value; and 2) the LET and SET Uprocs are element values, which must follow standard format entry rules, meaning, for example, that they must end with semicolons. [See B.2.7 for data entry rules.]
In some circumstances the format design may require the user at the terminal to provide information during format execution. The ASK Uproc, which requests and handles user input from a format, allows the user to interact with the format as it executes.
When the ASK Uproc is executed, SPIRES sends a prompt to the terminal, stopping format execution until a response is sent back. The value given as the response is assigned to the system variable $ASK, unless the response was a null response (usually just a carriage return) or an attention (ATTN/BREAK key), in which cases the format designer may decide how to handle the response. The value in the $ASK variable may be extracted or used through other Uprocs, as desired.
The syntax of the ASK Uproc is:
UPROC = ASK [UPPER] [EXACT] [PROMPT='string'] [NULL='uproc'] ...
... [NOTRIM] [NOECHO] [ATTN='uproc'];
If both EXACT and NOTRIM are specified, only EXACT will be in effect (see below). The UPPER option indicates that the response should be converted to uppercase when it is assigned to the $ASK variable (see examples below); otherwise, the case of the response remains unchanged. The value "string" is a string of text for SPIRES to use as the prompt to the user; "uproc" is one of the following Uprocs:
RETURN                     JUMP [label.name]
ABORT [QUIET|NOERROR]      XEQ PROC [label.name]
STOPRUN [QUIET|NOERROR]    '' (see below)
REPROMPT [QUIET]           - comment
BACKOUT
The Uproc given in the NULL clause will be executed if a null response (a simple carriage return, or all blanks and a carriage return) is given to the prompt. The Uproc given in the ATTN clause will be executed if the user presses the ATTN/BREAK key in response to the prompt. Though each of those Uprocs is described elsewhere in this manual, several points concerning how some of them may be used are covered in the examples below.
The NOECHO option prevents SPIRES from echoing the response back to the terminal as it is typed. What the user types thus does not show on the screen.
Here is the simplest form of the ASK Uproc:
UPROC = ASK;
If no PROMPT clause is given in the ASK Uproc, the current value of the system variable $PROMPT will be used for the prompt. (That variable may be set using the SET PROMPT Uproc. Unlike the PROMPT clause, the SET PROMPT Uproc can take a string expression, as opposed to merely a string, for its value.) Regardless of what the prompt is, it will always be preceded by a colon (:) and followed by a blank, unless the EXACT option is added, which tells SPIRES to display the prompt without the colon and trailing blank.
Here is what might appear on the terminal screen if a user displays a record through a format that has the above Uproc:
-> display 45
:
The colon is the prompt for input from the user. (The $PROMPT variable had apparently not been set.)
There are four possible types of responses by the user and counter-responses from SPIRES for that Uproc:
The third and fourth situations are the default reactions when SPIRES receives a null or attention response and no NULL or ATTN clauses are coded. Coding one of them will cause the appropriate Uproc to be executed in that situation instead:
UPROC = SET PROMPT = 'Enter the date to be printed on top:';
UPROC = ASK NULL='XEQ PROC EXPLAIN.DATE' ATTN='STOPRUN';
Here, if SPIRES receives an attention response, no further format processing is done. If SPIRES receives a null response to the prompt, execution control will go to the EXPLAIN.DATE label group, presumably for more information about the prompt to be displayed to the user. (Remember that a RETURN from an XEQ PROC that is part of a NULL or ATTN clause will return SPIRES to the ASK prompt for reprompting. [See B.4.7.5.] This is the only situation in which RETURN will not return to the next label group.) In both cases, $ASK will be set to null.
Note that if JUMP or XEQ PROC are not followed by a label name, the next label group in the frame is assumed.
The PROMPT clause on the ASK Uproc does not set or clear the value of the $PROMPT variable, which exists outside of formats. Hence, if you issue an ASK Uproc without setting $PROMPT or without using a PROMPT clause, SPIRES may prompt the user with some prompt left over from an earlier format or protocol. You may want to set $PROMPT to a null string first. [See B.5.2 for an example.]
Be aware that the maximum length for a useful prompt is 158 characters, not including the colon or blank that SPIRES will add (160 if the EXACT option is coded). A prompt longer than that will automatically cause an ATTN response to the ASK Uproc.
Technical note: The total length of the prompt and the user's response may not exceed 168 characters. The allowed length for the user's response can be determined by subtracting the length of the prompt (including the colon or blank that SPIRES would add, if present) from 168. If the user's response exceeds that length, it is truncated. An exception: if the prompt's length is zero, the user's response may not exceed 167 characters.
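The length limits above amount to a small piece of arithmetic, which the following Python sketch works through. Python is used here purely as a modelling language; this is not SPIRES code, and the function names are hypothetical. The sketch assumes that "the prompt's length is zero" refers to the prompt as displayed, i.e., an empty prompt with the EXACT option coded.

```python
# Model of the ASK length limits described above (not SPIRES code).
MAX_TOTAL = 168             # prompt as displayed plus the user's response
MAX_DISPLAYED_PROMPT = 160  # 158 characters plus colon and blank, or 160 with EXACT


def displayed_prompt(prompt: str, exact: bool = False) -> str:
    """Return the prompt as it would be sent to the terminal."""
    # Without EXACT, a leading colon and a trailing blank are added.
    return prompt if exact else ":" + prompt + " "


def max_response_length(prompt: str, exact: bool = False):
    """Return the longest response that is kept, or None if the prompt is too long."""
    shown = displayed_prompt(prompt, exact)
    if len(shown) > MAX_DISPLAYED_PROMPT:
        return None  # per the text, an over-long prompt causes an ATTN response
    if len(shown) == 0:
        return 167   # stated exception; assumed to mean a zero-length displayed prompt
    return MAX_TOTAL - len(shown)


if __name__ == "__main__":
    print(max_response_length("Enter the date to be printed on top:"))
```

Under these assumptions, a 158-character prompt without EXACT leaves room for a response of only 8 characters, so long prompts are best avoided when a long answer is expected.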
Two system variables, $ASK and $PARM, are commonly used to provide information to the format. The $ASK value is normally set by the user's response to an ASK Uproc; the $PARM value is set by a SET FORMAT or SELECT command. [See B.7.1, E.2.3.9.] The values of both variables are accessible by the format.
The $PARM variable is generally used to provide information about the format. For instance, consider this sample command:
-> set format $prompt name rank serial.number
The string of element names following "$prompt" is the value assigned to $PARM. In this case, it names the elements to be prompted for and displayed by the format. The value of $PARM is examined by the startup frame of the $PROMPT format when that command is issued. [See D.3.] (Because the $PARM variable can change between the time the SET FORMAT command is issued and the time the format is used, i.e., via a SET GLOBAL FORMAT command, you should retrieve the $PARM value in a startup frame.) Since the $PARM value is set when a SET FORMAT command is issued, it is seldom used to provide data to the format on a record-by-record basis.
On the other hand, $ASK is often used for that purpose:
LABEL;
UPROC = SET PROMPT = 'Do you want to see more of this record?';
UPROC = ASK UPPER ATTN='JUMP ABORT';
UPROC = IF $ASK = 'NO' THEN JUMP FINISHED;
This label group might appear in an output frame that displays part of a record and then asks whether the user needs to see more of it. The label group would be executed for each record displayed.
Of course, an important difference in the usage of $ASK and $PARM that the above examples demonstrate is in the ways they are set. One, $PARM, is set because the user voluntarily supplies it; the other is set because the format prompts the user for it. That difference is markedly shown by the example below, which might be part of a startup frame:
LABEL;
UPROC = IF $PARM ~= '' THEN LET USERPARM = $PARM;
UPROC = ELSE ASK PROMPT = 'Format Parameters? (RETURN=None)';
UPROC = ELSE LET USERPARM = $ASK;
If $PARM is given a value in the SET FORMAT command, the user variable #USERPARM is assigned its value. If it is not given a value in the SET FORMAT command, the format prompts for a value, which SPIRES assigns to $ASK, and then the format assigns the value of $ASK to #USERPARM. In other words, when using this format, you can supply the value of #USERPARM on your own via $PARM or through prompting via $ASK.
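The decision made by that label group can be restated in a few lines of Python. This is only a model of the logic, not SPIRES code: the function name and the "ask" callback, which stands in for the ASK Uproc, are hypothetical.

```python
# Model of the startup-frame logic above (not SPIRES code).
def resolve_userparm(parm: str, ask) -> str:
    """Prefer the value supplied through $PARM; otherwise prompt for one."""
    if parm != "":
        return parm
    # No parameter was given on SET FORMAT, so fall back to prompting.
    return ask("Format Parameters? (RETURN=None)")


if __name__ == "__main__":
    # A canned reply stands in for the user at the terminal.
    print(resolve_userparm("", lambda prompt: "name rank"))
```

The user is prompted only when $PARM is empty, so someone who already knows the parameters can skip the prompt entirely by supplying them on the SET FORMAT command.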
Both $ASK and $PARM can also be set directly, using the SET ASK and SET PARM Uprocs and commands.
Sometimes you may want to send data to the terminal screen that is informational about the format execution, rather than part of the data to be formatted. For example, in debugging situations, you might want to look at the value of a variable as the format executes. The star Uproc can be used for those situations:
UPROC = * string-expression;
where "string-expression" may consist of character strings (surrounded by apostrophes), user or system variables that can be converted to strings, or some combination thereof.
For example:
LABEL = COURSES;
GETELEM;
DEFAULT;
UPROC = IF $DEFAULT THEN * 'No courses for record ' $KEY;
PUTDATA;
LOOP;
Here is what the terminal session might look like:
-> in active display 45923
No courses for record 45923
->
The formatted record is placed in your active file but the message from the * Uproc is sent to your terminal screen. Note that the appropriate value for the system variable $KEY was substituted by SPIRES before the message was displayed; $KEY represents the key of the current record. Issue the command EXPLAIN $KEY VARIABLE for more details on the $KEY system variable. You do not need to precede the * Uproc with a slash (/) to force evaluation of the expression, as you would with the * command; in fact, the "/" is not allowed in formats.
Note that the terminal message may appear in the middle of the formatted record if you display the record at your terminal. [See B.1.2, B.3.3.]
Note that if you want to direct custom messages to the trace log (with the SET TLOG command) you should use the $ISSUEMSG function rather than the * Uproc, as in this example:
UPROC = IF $FTRACE THEN EVAL $ISSUEMSG('Danger! Danger!',W);
For most tracing situations, $ISSUEMSG is preferable to the * Uproc, since output from the * Uproc will only appear on the terminal screen and never as part of the trace log. Issue the command EXPLAIN SET TLOG COMMAND for information about trace logs, and EXPLAIN $ISSUEMSG FUNCTION for an explanation of the function.
The * command and the * Uproc in formats are similar but they are not identical. When a * command (without an "IN area" prefix) is executed by SPIRES, the asterisk will be displayed when the string is sent to the terminal; the asterisk is not displayed when a * Uproc is executed. Also, the * command may be prefixed by "IN area":
-> in active continue *END OF ACTIVE FILE
From a format, the * Uproc's expression will always be displayed at the terminal. Note too that string expressions do not have to be placed in apostrophes in the command; in the Uproc they almost always do.
UPROC = * 'END OF FORMAT EXECUTION';
"Timing problems" with the terminal input and output Uprocs ASK and * sometimes arise because of the way SPIRES handles the frame buffer. These problems are often solved with the FLUSH Uproc.
For example, consider the following frame definition:
FRAME-ID = COUNTER;
FRAME-DIM = 0,80;
LABEL = ONE;
VALUE = 'One';
PUTDATA;
LABEL = TWO;
VALUE = 'Two';
PUTDATA;
LABEL = THREE;
UPROC = ASK PROMPT 'Three?';
LABEL = FOUR;
VALUE = 'Four';
PUTDATA;
If a DISPLAY command is issued, the terminal display from that frame would be:
One
:Three?
Two
Four
The reason that these appear out of order from what one would expect is that SPIRES does not "flush" the buffer, i.e., write the buffer to the terminal, until the format attempts to place data onto the next line. Hence, label group THREE executes before the buffer containing the data placed by label group TWO is flushed because another label group may be placing more data onto that same line. But as soon as another label group tries to place data into a later line (as FOUR does), the buffer is flushed.
To circumvent such timing problems, the FLUSH Uproc is available:
UPROC = FLUSH;
This Uproc causes the current contents of the buffer to be flushed immediately. So if you add it to label group THREE:
LABEL = THREE;
UPROC = FLUSH;
UPROC = ASK PROMPT 'Three?';
then the buffer will be flushed to the terminal before the ASK Uproc is executed when a DISPLAY command has been issued:
One
Two
:Three?
Four
The FLUSH Uproc is particularly important in report formats. [See B.10.3.6.]
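The ordering effect described above can be reproduced with a toy model. The Python sketch below is not SPIRES code and greatly simplifies the frame buffer: it assumes the buffer holds a single pending line, that placing data on a later line writes the pending line first, that the ASK prompt goes directly to the terminal, and that the last line is written when the frame ends. All class and function names are hypothetical.

```python
# Toy model of the frame buffer timing described above (not SPIRES code).
class FrameBufferModel:
    """One pending line that is written out when a later line is started."""

    def __init__(self):
        self.pending = None   # line placed in the buffer but not yet written
        self.terminal = []    # lines in the order they reach the terminal

    def putdata(self, text):
        # Placing data on the next line forces the previous line out first.
        self.flush()
        self.pending = text

    def flush(self):
        # FLUSH: write the buffer contents immediately, if there are any.
        if self.pending is not None:
            self.terminal.append(self.pending)
            self.pending = None

    def ask(self, prompt):
        # The prompt bypasses the buffer; a colon and a blank are added.
        self.terminal.append(":" + prompt + " ")

    def end_frame(self):
        # Assumption: whatever is still pending is written when the frame ends.
        self.flush()


def run_counter(flush_before_ask):
    """Replay label groups ONE through FOUR of the COUNTER frame."""
    buf = FrameBufferModel()
    buf.putdata("One")
    buf.putdata("Two")
    if flush_before_ask:
        buf.flush()
    buf.ask("Three?")
    buf.putdata("Four")
    buf.end_frame()
    return buf.terminal


if __name__ == "__main__":
    print(run_counter(flush_before_ask=False))
    print(run_counter(flush_before_ask=True))
```

In this model the only difference between the two runs is the explicit flush before the prompt, which is exactly the change made to label group THREE.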
Comments are allowed as Uprocs, written similarly to the form they take in protocols:
UPROC = - text of comment;
They are often used within groups of Uprocs to comment on the processing being done, step by step. Like COMMENT elements in a format, these are ignored by the compiler. Keep in mind the rules for special characters if your text includes quotation marks or semicolons. [See B.2.7.] Do not confuse the "-" Uproc with the "-" throwaway element. The "-" Uproc is not discarded from the stored record as the throwaway element is. [See B.2.2.]
UPROC = IF $DEFAULT THEN JUMP ORDER.NUMBER;
UPROC = - If no value, skip the next three label groups.;
UPROC = IF $CVAL ~= 'NONE' THEN SET CVAL = $CASE($CVAL,10);
UPROC = - Unless the value is NONE, change the value to;
UPROC = - lowercase, and capitalize each separate word.;
Unlike COMMENT elements, which are gathered together at the beginning of a label group, the "-" Uproc can be interspersed throughout a series of Uprocs. [See B.4.9.]
An additional statement within label groups, the SITE statement, makes it more convenient to transport a SPIRES format from one site to another, by ensuring that only label groups appropriate to a particular site will be compiled and executed there.
The syntax of the SITE statement is as follows:
SITE = sitecode;
Possible values for "sitecode" are MTS, CMS, TSO, and STS -- the last value stands for Stanford and other ORVYL sites.
When the SITE statement is included in a label group, that label group will only execute at the site(s) named:
SITE = STS; <--in label group for STS sites only
At sites other than STS a label group with the above SITE value would be ignored by SPIRES, and would not even become part of the compiled format. This means that the same format could include more than one label group with the same name, without invoking an error message, assuming each of these label groups pointed to a different site:
LABEL = ACCOUNT;
VALUE = $ACCT;
LENGTH = 6;
:
PUTDATA;
SITE = STS;
LABEL = ACCOUNT;
VALUE = $ACCT;
LENGTH = 8;
:
PUTDATA;
SITE = CMS,TSO;
In a way, the SITE statement acts like a procedural statement in that it causes its label group to be executed only conditionally, depending on the "sitecode" value.
You can include more than one sitecode in the SITE statement, and can precede the sitecode(s) with a ~ character to indicate that the label group should NOT be compiled and executed at the named sites:
SITE = CMS,TSO; <--label group directed to CMS,TSO sites
SITE = ~CMS,TSO; <--label group directed to all except
CMS,TSO sites
Note that variables declared either in the VGROUPS subfile or in the VGROUPS section of a format definition can also be filtered by site. [See B.9.3.1 and the manual "SPIRES Protocols".]
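The effect of the SITE statement on label groups can be modelled as a simple filter. The Python sketch below is not SPIRES code; the dictionary layout and function names are hypothetical, and it assumes that a label group with no SITE statement applies at every site.

```python
# Model of SITE filtering of label groups (not SPIRES code).
def site_applies(site_value, current_site):
    """Decide whether a label group with this SITE value is kept at a site."""
    if site_value is None:
        return True  # assumption: no SITE statement means all sites
    value = site_value.strip()
    negated = value.startswith("~")   # ~ excludes the sites that follow
    if negated:
        value = value[1:]
    codes = {code.strip().upper() for code in value.split(",")}
    return (current_site.upper() in codes) != negated


def compiled_label_groups(label_groups, current_site):
    """Return only the label groups that would be compiled at the site."""
    return [g for g in label_groups if site_applies(g.get("SITE"), current_site)]


if __name__ == "__main__":
    groups = [
        {"LABEL": "ACCOUNT", "LENGTH": 6, "SITE": "STS"},
        {"LABEL": "ACCOUNT", "LENGTH": 8, "SITE": "CMS,TSO"},
    ]
    for site in ("STS", "CMS", "MTS"):
        print(site, compiled_label_groups(groups, site))
```

Because at most one of the two ACCOUNT label groups survives at any given site in this model, the duplicate label name never causes a conflict.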
As you probably expect by now, each of your label groups can have one or more occurrences of the COMMENTS statement to document the function or purpose of the label group, or mention any special features it contains. Below is an example of such a comment.
LABEL = ACQUISITION.DATE;
COMMENTS = This label group displays the ACQUISITION.DATE
element in the same row as the previous element, adjusting
it to the right margin. Although a maximum LENGTH was
coded, I doubt that any value will be that long.;
GETELEM;
LENGTH = 30;
START = *,50;
OUTPROC = $DATE.OUT(MONTH,,UPLOW,SQU);
INSERT = 'AD: ';
UPROC = SET ADJUST RIGHT;
To place more detailed comments within a group of Uprocs in a label group, you can use the "-" Uproc. [See B.4.8.14.]
So far, your format definition consists of two sections: the Identification section and the Frame Definitions section, consisting of one or more frame definitions. The final section required in a format definition is the Format Declaration section. Here you specify one or more format names to be used in a SET FORMAT command, and then identify the frames that can be executed under that format name. In other words, the Format Declaration section tells the compiler how to "package" a set of logically related frames that are used together. Hence you can consider this section as the place where a format is actually defined. (The Frame Definitions section defines the specific building blocks of a format, but unless those frames are specified for a specific format in the Format Declaration section, the frames are not used.) You may define multiple formats in a single format definition; a single frame may be used in multiple formats.
As part of a FORMATS subfile goal record, the Format Declaration section consists of one or more occurrences of the FORMAT-DEC structure, each occurrence describing a single format:
FORMAT-NAME = TEST;
ALLOCATE = GQ.DOC.LOCAL;
FRAME-NAME = OUTPUT2;
FRAME-TYPE = INDIRECT;
FRAME-NAME = OUTPUT1;
FRAME-TYPE = DATA;
UPROC = LET COUNT = 0;
FORMAT-NAME = TEST2;
...
Each occurrence of the structure begins with the FORMAT-NAME statement, the key of the structure. Several statements regarding sort space and variables may follow the FORMAT-NAME. [See B.10.8 for information on the SORT-SPACE and ALLOCATE statements.] Next come the "frame declarations", where the frames that comprise the particular format are listed. They may be followed by an ACCOUNTS statement, which lets you restrict access to the named format.
You may have up to 16 occurrences of the FORMAT-DEC structure in a format definition, meaning that you can define up to 16 different formats in a single definition.
The FORMAT-NAME statement, which begins the Format Declaration section, specifies the name of the format being defined here. Its value is used in the SET FORMAT command to identify the format to be set. Its syntax is:
FORMAT-NAME = format.name;
where "format.name" is a name consisting of one to sixteen alphanumeric characters. The special characters "." (period), "_" (underscore) and "-" (hyphen) are also allowed in the name. Although the format name may contain internal blanks (unless it is a general file format or a global format), choosing names without them is preferable. [See B.7.1, B.14, D.2.5.]
Here are some sample FORMAT-NAME statements:
FORMAT-NAME = DISPLAY;
FORMAT-NAME = TEST REPORT;
The FORMAT-NAME must be unique among all formats for the record-type. When you try to compile the format definition record and SPIRES finds another format already existing with that name, you will be asked if it is okay to replace it with the new one, if the old one is yours. If it is not yours, SPIRES will tell you to change the FORMAT-NAME statement. [See B.6.2.]
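The naming rules for FORMAT-NAME can be checked mechanically before a definition is added. The following Python sketch is not part of SPIRES; it simply encodes the rules stated above, under the assumptions that internal blanks count toward the sixteen-character limit and that "alphanumeric" means the letters A-Z and the digits 0-9. The function name and the "allow_blanks" parameter are hypothetical.

```python
# Model of the FORMAT-NAME rules stated above (not SPIRES code).
import re

# First and last characters may not be blanks; up to 16 characters in all.
_FORMAT_NAME = re.compile(r"[A-Za-z0-9._-](?:[A-Za-z0-9._ -]{0,14}[A-Za-z0-9._-])?")


def is_valid_format_name(name: str, allow_blanks: bool = True) -> bool:
    """Check a candidate format name against the stated rules."""
    # General file formats and global formats may not contain blanks.
    if not allow_blanks and " " in name:
        return False
    return _FORMAT_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("DISPLAY", "TEST REPORT", "A.VERY.LONG.FORMAT.NAME"):
        print(candidate, is_valid_format_name(candidate))
```

A check like this only covers the form of the name; whether the name is unique for the record-type can only be determined by SPIRES at compile time.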
The FRAME-NAME and FRAME-TYPE statements identify the frames that collectively constitute the format named by the FORMAT-NAME statement. Specifically, the FRAME-NAME statement names a frame defined in the Frame Definitions section of the format definition, and the FRAME-TYPE statement specifies the circumstances under which the frame is to be executed. Up to 100 frames can be specified in a single format, i.e., per FORMAT-NAME.
The syntax of the FRAME-NAME statement is:
FRAME-NAME = frame.name;
where "frame.name" is the name given in a FRAME-ID statement earlier in the format definition.
The syntax of the FRAME-TYPE statement is similarly simple:
FRAME-TYPE = frame.type;
where "frame.type" is either:
- INDIRECT, meaning that the frame is called from another frame via the IND-FRAME statement [See B.4.8.7.] or
- DATA, meaning that the frame will be executed when a record-processing command is issued. (Whether or not it is executed for a given command also depends on the value of the USAGE and DIRECTION statements for the frame.)
There are several other frame types as well, most of which are used in report processing: INITIAL, ENDING, HEADER, FOOTER and SUMMARY. [See B.10.] Another type is used under partial record processing: STRUCTURE. [See B.15.] The XEQ type is most commonly used in global formats and full screen processing. [See D.2.]
Another type, STARTUP, is executed as soon as the SET FORMAT command is issued. Because no records are being processed at that point, a startup frame may not contain data base references (such as GETELEM statements) -- it is generally used to set or test various system and user variables. [See D.3.]
When coding FRAME-NAME and FRAME-TYPE statements, you should remember two rules:
Some Uprocs are also allowed for each frame in the "frame declaration":
LET            * message
SET            EVAL
IF ... THEN    ABORT
THEN           STOPRUN
ELSE           HOLD
BEGINBLOCK     ENDBLOCK
REPEAT         UNTIL
WHILE          ENDWHILE
LEAVE          ITERATE
Any Uprocs here will be executed before the label groups of the frame to which these apply. Also allowed at this point are COMMENTS statements, usually concerning the particular frame. [See B.2.2.]
Here is an example group of FRAME-NAME and FRAME-TYPE statements for a format of four frames: the two indirect frames CHECK1 and CHECK2 and the two data frames PART1 and PART2.
FRAME-NAME = CHECK1;
FRAME-TYPE = INDIRECT;
FRAME-NAME = CHECK2;
FRAME-TYPE = INDIRECT;
FRAME-NAME = PART1;
FRAME-TYPE = DATA;
FRAME-NAME = PART2;
FRAME-TYPE = DATA;
COMMENTS = Second data frame;
FRAME-NAME;
FRAME-TYPE = STARTUP;
UPROC = SET PROMPT = '';
Note that if CHECK2 were called from CHECK1, it would have to be declared before CHECK1 here. Note also that no startup frame was actually defined; however, the SET PROMPT Uproc will be executed when the format is set. In other words, at the time when the startup frame would be executed, the Uprocs associated with that frame are executed, even though no frame specifically exists. This technique is only available for startup, initial, group-start, group-end, and summary frames. [See B.10.5, D.3.]
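The declaration-order requirement for indirect frames lends itself to a quick check before compiling. The Python sketch below is a hypothetical helper, not a SPIRES facility; the "calls" mapping of which frame calls which is an assumption supplied by the format designer, since the example above does not say which frames call CHECK1 or CHECK2.

```python
# Model of the "declare an indirect frame before its caller" rule (not SPIRES code).
def declared_too_late(declared_order, calls):
    """Return (caller, callee) pairs where the called frame is not declared first."""
    position = {name: index for index, name in enumerate(declared_order)}
    problems = []
    for caller, callees in calls.items():
        if caller not in position:
            continue  # only frames actually declared in this format matter
        for callee in callees:
            # A called frame must appear earlier in the list than its caller.
            if callee not in position or position[callee] > position[caller]:
                problems.append((caller, callee))
    return problems


if __name__ == "__main__":
    declared = ["CHECK1", "CHECK2", "PART1", "PART2"]
    # Hypothetical call structure: CHECK1 calls CHECK2.
    print(declared_too_late(declared, {"CHECK1": ["CHECK2"]}))
```

With the declaration order shown in the example, the hypothetical call from CHECK1 to CHECK2 is reported, which matches the note that CHECK2 would then have to be declared first.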
In formats containing many thousands of lines of code, the compiled code of the last compiled frames is slightly less efficient than the code of earlier ones. In other words, you may see some efficiency improvement if you move more frequently used frames higher in the list of declared frames, and move the less frequently used ones to the end.
For example, in a large report format that has numerous data frames, an initial frame, an ending frame, etc., you might want to declare the data frames first (after any indirect frames they call, of course), since these will be executed many times during the report -- presumably once for each record. The initial and ending frames, since they are executed only once at the beginning and end of the report respectively, would be declared at the end. This way, the most frequently executed frames are declared first, and their code is compiled more efficiently.
Remember, this difference in compiled-code efficiency only affects large formats. None of the examples shown in this manual, for example, would be affected. [See B.10.11.]
By default, anyone with access to a set of goal records (e.g., through the SELECT command) can set any formats available for those goal records. However, you can limit the use of a format to accounts specified in the optional ACCOUNTS statement, which follows the frame declarations. Its syntax is like the ACCOUNTS statement in a file definition:
ACCOUNTS = accounts;
where "accounts" consists of one or more account forms, separated by commas. The commonly used account forms are:
FORM      DESCRIPTION                             EXAMPLE
gg.uuu    a specific account                      GQ.DOC
gg....    all accounts in a specific group        GQ....
g.....    all accounts in a specific community    G.....
PUBLIC    all accounts                            PUBLIC
(In a sense, the default, i.e., when no ACCOUNTS statement is coded, is PUBLIC, so it is seldom used here.)
When a user tries to set a format to which access has not been allowed, the message "-Privileged command" is issued.
The account specified in the ID statement will always have access to the format, whether or not it is listed in the ACCOUNTS statement. Note that the file owner does not automatically have such access. In summary, if an ACCOUNTS statement is present, only the format "owner" and the accounts listed may use the format.
Although a format may not be available to all users because of its ACCOUNTS statement, its name will still appear in the list of formats displayed by the SHOW FORMATS command for all users.
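The access rules for the ACCOUNTS statement can be summarized in a short model. The Python sketch below is not SPIRES code and its function names are hypothetical; it assumes that the trailing periods in the group and community forms act as a wildcard for the remaining characters of the account, which is an interpretation of the table above rather than a documented matching algorithm.

```python
# Model of ACCOUNTS-based access to a format (not SPIRES code).
def form_matches(form: str, account: str) -> bool:
    """Check one account form (gg.uuu, gg...., g....., PUBLIC) against an account."""
    form = form.strip().upper()
    account = account.strip().upper()
    if form == "PUBLIC":
        return True
    prefix = form.rstrip(".")
    if prefix != form:
        # Assumed wildcard: GQ.... covers group GQ, G..... covers community G.
        return account.startswith(prefix)
    return account == form


def may_set_format(account: str, owner: str, accounts=None) -> bool:
    """Decide whether an account may set the format."""
    if accounts is None:
        return True  # no ACCOUNTS statement: in effect, PUBLIC
    if account.strip().upper() == owner.strip().upper():
        return True  # the account in the ID statement always has access
    return any(form_matches(form, account) for form in accounts.split(","))


if __name__ == "__main__":
    print(may_set_format("GQ.JNK", owner="GQ.DOC", accounts="GQ.DOC"))
    print(may_set_format("GQ.JNK", owner="GQ.DOC", accounts="GQ...."))
```

Note that the file owner receives no special treatment in this model, in keeping with the statement above that the file owner does not automatically have access.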
Here is a sample Format Declaration section, perhaps slightly more complex than most:
FORMAT-NAME = SHORT;
FRAME-NAME = SUBSTR1; FRAME-TYPE = INDIRECT;
FRAME-NAME = START.OUTPUT; FRAME-TYPE = DATA;
FORMAT-NAME = LONG;
FRAME-NAME = SUBSTR1; FRAME-TYPE = INDIRECT;
FRAME-NAME = START.OUTPUT; FRAME-TYPE = DATA;
FRAME-NAME = SUBSTR2; FRAME-TYPE = INDIRECT;
FRAME-NAME = END.OUTPUT; FRAME-TYPE = DATA;
UPROC = SET PROMPT = '';
ACCOUNTS = GQ.DOC;
Two formats are defined here. The first, SHORT, consists of the single data frame START.OUTPUT, which calls the indirect frame SUBSTR1. The second, LONG, can only be set by account GQ.DOC. It consists of the two data frames START.OUTPUT and END.OUTPUT, which are executed in that order. Each data frame may call one or more indirect frames: START.OUTPUT may call SUBSTR1, and END.OUTPUT may call SUBSTR1 and SUBSTR2.
At this point you may have a complete format definition, having an Identification section, a Frame Definitions section and a Format Declaration section. On the other hand, you may still need other formats features, such as a way to handle structures or to create and use your own variables. Those and other slightly more advanced topics will be covered later in this part of the manual.
However, at some point your format definition will be completely written. The next steps are to add it to the SPIRES public subfile FORMATS, compile it, and test it. This process is covered in detail in this chapter. Regardless of the complexity of your format, you will follow the procedure described here to make it usable.
This chapter also discusses how to recompile a format definition when you want to make changes to it and how to get rid of it completely ("zap" it) when you are done with it.
Before it can be compiled, your format definition must be added to the public subfile FORMATS. As suggested by all the examples in this manual, you should collect the format definition in your WYLBUR active file using the standard SPIRES format, i.e.,
element.name = value;
or, as it has been presented here:
statement = value; as in ID = GQ.DOC.MANUALS.LIST;
To add the definition in your active file to the FORMATS subfile, you issue the SPIRES commands SELECT FORMATS and ADD:
-> select formats
-> add
->
Just as with any other subfile, SPIRES will check your record against the information in the file definition for the FORMATS subfile, making sure that required elements such as the key ID are supplied, that quotation marks around statement values are correctly matched, and so forth. If no errors are detected, the prompt will be returned with no error messages, as shown in the example above. However, if an error in the record is detected, an error message will be displayed; use the EXPLAIN command to find out more information about the specific error if necessary.
A typical error detected at this point is shown below in this portion of a format definition:
LABEL;
GETELEM = NUMBER.ORDERED;
IF $CVAL = 0 THEN SET CVAL = 'None';
PUTDATA;
Here is what SPIRES' response might look like if you tried to add that record:
-> add
-Unknown mnemonic: IF
-Error at or before line 17.
-Update terminated, code=S2
->
This common mistake, omitting the "UPROC =" in front of the Uproc itself, must be corrected using WYLBUR text editing commands before you issue the ADD command to try again.
If you want a printed listing of your format definition, a nicely formatted version can be obtained with the PERFORM PRINT command:
-> select formats
-> perform print gq.doc.manuals.list
You must first select the FORMATS subfile and then issue the PERFORM PRINT command, followed by the ID of the format definition desired. You may only print copies of your own format definitions, of course. The format definition will be printed on a high-speed printer, with each section of the definition (each frame definition, each vgroup definition, each format declaration section, etc.) printed on a separate page. For more information about the PERFORM PRINT command, issue the command EXPLAIN PERFORM PRINT.
There are several ways to track down a format definition when you have forgotten its ID value.
The easiest approach is to set the format and then issue the SHOW FRAMES command. The SHOW FRAMES display includes the format ID for the format that is currently set. [See D.1.1.2.] (When the format is set, the format ID is also contained in the $FORMATID system variable, or in $GLOFORMATID if a global format.)
The file owner can find all the compiled formats that are defined for a file by issuing the PERFORM FORMAT LIST command:
PERFORM FORMAT LIST [filename]
If the "filename" option is omitted, the display will show the formats for the file containing the selected subfile. Then if no subfile is selected, you will be prompted for the "filename". The IN ACTIVE prefix may be added to the command so that the display will be placed in your active file.
For more information about this command, EXPLAIN PERFORM FORMAT LIST.
Instead of using SHOW FRAMES or PERFORM FORMAT LIST, you may use the FORMATS subfile's indexes to help find the ID of your format definitions, if the name of the file to which the format applies is known:
-> select formats
-> show indexes
Goal Records: ID
Qualifier: ID
Simple Index: FILE
Sub-Index: RECORD, RECORD-NAME
Simple Index: FORMAT, FORMAT-NAME
Simple Index: ALLOC, ALLOCATE
-> find file gq.doc.documents
-RESULT: 5 IDS
-> type id
...
You could also use the sub-index or qualifier if you also knew the value of the RECORD-NAME or FORMAT-NAME statements respectively. For example,
-> find file = gq.doc.documents @ record-name = rec01
-RESULT: 3 IDS
or
-> find file = gq.doc.documents and format-name = report
-RESULT: 1 ID
->
If just seeing the ID of the format definition record would refresh your memory, you can use the SHOW KEYS command under Global FOR -- only the IDs of your own format definitions will be shown:
-> select formats
-> for subfile
+> set scan prefix gq.doc
+> show keys all
GQ.DOC.ABBREV.DISPLAY
GQ.DOC.DOCUMENTS.RPT
(etc.)
+>
Of course, you could alternatively issue a SET ELEMENTS command such as SET ELEMENTS ID FILE FORMAT-NAME and then DISPLAY ALL under Global FOR, allowing you to see more information about each of your format definitions. (Though not necessary, the SET SCAN PREFIX account helps SPIRES find your records more efficiently; omitting it will not allow you to see format definitions belonging to other accounts, however.)
Occasionally you may need to find the formats for a file that reference a particular element. Since elements may be referred to in GETELEM, PUTELEM, REMELEM, IND-STRUCTURE and LABEL (when one of the other statements appears in the label group but with no value), not to mention within Uprocs, Inprocs and Outprocs, searching for the appropriate formats can be challenging.
A virtual element called ELEM-USE in the LABEL-GROUP structure of a FORMATS record contains any element named in the GETELEM, PUTELEM, REMELEM, IND-STRUCTURE and LABEL statements. [There's not much that can be done to help you find elements named elsewhere, except to look through the format definitions with text-editor commands.] This element only contains legitimate element-name values; other types of values that may appear in those statements, such as "@n" (referring to an element by number) or "#variable", do not appear. In addition, any occurrence number following an element name is stripped off.
Here's how you might use ELEM-USE. Suppose you need to find all the formats for a file that refer to the TITLE-ARTICLE element. You might try this way:
-> select formats
-> find file gq.jnk.recordings
-Result: 15 IDS
-> for result where elem-use = title-article
+> stack all
-Stack: 3 IDS
+>
Besides the other limitations already noted, be aware that the value in ELEM-USE is the name of the element as it appears in the label group, which could be an alias rather than the primary mnemonic.
To compile your format, select the FORMATS subfile and issue the COMPILE command:
[IN ACTIVE [CLEAR|CONTINUE]] COMPILE [gg.uuu].idname ...
... [STATISTICS] [NOWARN] [LABELS]
where "idname", with or without the account prefix, is the value given in the ID statement. [See B.2.1.]
The STATISTICS option displays data about the size of various internal tables created by SPIRES as it creates the compiled code for the format. This option, seldom used unless the format is very large (say, over 1000 lines), is described in detail at the end of this section. If you use it, you may also use the IN ACTIVE option to direct the statistics to your active file rather than your terminal.
Note: Whether or not you include the STATISTICS option, SPIRES will by default warn you if any table is over 90 percent full. You can suppress those warning messages by appending the NOWARN option to your COMPILE command.
The LABELS option tells SPIRES to store label group names in the compiled format record, for use in SET FTRACE output. If the format is compiled without the LABELS option, SET FTRACE will identify label groups by positional numbers rather than by names.
Alternatively, you can include a FORMAT-OPTIONS = LABELS; statement in your format definition to specify that the compiled format should always include the label group names. In this case, you don't need to include the LABELS option every time you issue the COMPILE or RECOMPILE command.
Note that the FORMAT-OPTIONS statement also takes values other than LABELS, to control how SPIRES stores element references in the compiled format. See the explanation at the end of this section.
-> compile gq.doc.document
-Compiled format: DOCUMENT.LIST
-Format definition compiled
->
The informational messages indicate that the format whose FORMAT-NAME is DOCUMENT.LIST was compiled successfully.
When SPIRES compiles a format definition, it first examines the first format defined in the Format Declaration section. There, where each frame to be used is identified, SPIRES finds the appropriate frame names, gets each frame definition from the Frame Definitions section in turn, and compiles it.
Compiling a frame involves two simultaneous processes: 1) verifying that the statements in the frame are syntactically and factually correct (e.g., whether the named file and record-type exist, whether the Uprocs specified are allowed in a given situation, etc.); and 2) converting the high-level formats language into a lower-level SPIRES-internal language. The result of the conversion is called the "compiled characteristics" of the frame. Indirect frames must be compiled before the frames that call them, which is why the indirect ones are declared before their calling frames in the Frame Definitions section.
SPIRES will compile each format defined in the Format Declaration section individually, sending the "Compiled format:" message back to the terminal for each. When all formats are compiled, the message "Format definition compiled" will be displayed.
When all the frames for all the formats are compiled, SPIRES places the compiled characteristics for each format in a public subfile called FORCHAR (for FORmat CHARacteristics). Then when you set a format, SPIRES retrieves the format characteristics from that subfile, reading them into main memory. You will probably never use the FORCHAR subfile directly. [The key of the FORCHAR record includes the names of the file and record-type as well as the format name, which prevents format definitions for the same record-type from having the same format name.]
If SPIRES finds errors in the format definition, it will report them to you with error messages, like this:
-> compile gq.doc.docform
* RECORD-ID = REC01
* * FORMAT-NAME = DOCUMENT.LIST
* * * FRAME-ID = RECORD.DISPLAY
* * * * LABEL = COPIES.LEFR
-Invalid elem mnemonic - Item = COPIES.LEFR
-Format DOCUMENT.LIST was not compiled
-Error during format processing
->
The example shows a common error message. Probably the element name COPIES.LEFT was mistyped as COPIES.LEFR. SPIRES tells you where the error is by preceding the message with starred lines of information naming the format, frame and label group involved. More details about the error (or errors) can be obtained through the EXPLAIN command (EXPLAIN INVALID ELEM MNEMONIC, for example).
If multiple formats are being compiled in a single format definition, SPIRES will immediately place the compiled characteristics of a successfully compiled format into the FORCHAR subfile, display the message "Compiled format: ..." and move on to the next format. If an error is detected during the compilation, then none of the subsequent formats will be moved into FORCHAR. The message "Format ... was not compiled" is your signal that the compilation was not successful and that the format in question was not moved into FORCHAR.
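The behavior described above can be pictured with a small Python model. This is only a sketch, not how SPIRES is implemented: the function name, the format names and the "has_error" flag are all hypothetical, and the model assumes that compilation stops at the first format in error, which is one reading of the paragraph above.

```python
def compile_definition(formats):
    """Model a COMPILE of one format definition.

    `formats` is a list of (format_name, has_error) pairs, in the order
    the formats are declared.  Returns the names that reached FORCHAR
    and the messages that would be shown.
    """
    forchar = []    # formats whose compiled characteristics were stored
    messages = []
    for name, has_error in formats:
        if has_error:
            # Assumption: nothing after the first failing format is stored.
            messages.append(f"-Format {name} was not compiled")
            messages.append("-Error during format processing")
            break
        forchar.append(name)   # stored immediately, before the next format
        messages.append(f"-Compiled format: {name}")
    else:
        messages.append("-Format definition compiled")
    return forchar, messages


# Hypothetical definition with three formats; the second contains an error.
stored, shown = compile_definition(
    [("FIRST.LIST", False), ("SECOND.LIST", True), ("THIRD.LIST", False)]
)
print(stored)
for line in shown:
    print(line)
```

In this model the first format stays in FORCHAR even though the compilation as a whole failed, which is why the "Compiled format:" messages are worth reading individually.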
If an error is detected and the format definition does not compile, you should change the format definition appropriately. The procedure is:
- select the FORMATS subfile if not already selected (SELECT FORMATS)
- put the format definition into your active file if it is not there already (TRANSFER gg.uuu.idname)
- make changes using WYLBUR editing commands
- put a copy back into the FORMATS subfile (UPDATE)
- try compiling it again, following the procedure described above.
Of course, when the format does compile, you may try using it, which is a topic discussed in the next chapter. [See B.7.1.] However, successfully compiling a format definition does not necessarily mean having a format that works the way you want. You may decide to make changes to the format definition, which means recompiling it. And, when you are all done with it, you will want to destroy the format definition and the FORCHAR record. Those subjects are discussed in the next sections. [See B.6.3, B.6.4.]
At some point, a format may become too large for one or more of the internal tables that are constructed by SPIRES when it is compiled, in which case you may need to move parts of it to "load formats", compiled separately. [See B.11.] By monitoring the table sizes as you compile and recompile a growing format, you can avoid the surprise (if not the heartbreak) of having to create load formats, perhaps at a time when you have the least amount of time to do so.
If you add the STATISTICS option to the COMPILE or RECOMPILE command, SPIRES will display data like this:
-> compile gq.jnk.recordings.display statistics
Format Statistics for SELECTIONS CAT
2422 of 16384 Bytes (14%) Frame/Allocate Table
2058 of 12264 Bytes (17%) Label Group Overflow
* 3849 of 4088 Bytes (94%) Primary Label Groups
2321 of 16368 Bytes (14%) Long Strings
1110 of 4088 Bytes (27%) Short Strings
4456 of 8184 Bytes (54%) Functions/Expressions
60 of 16384 Bytes ( 0%) Local Vgroups
40 of 4088 Bytes ( 0%) Frame Type Tables
16332 of 32768 Bytes (49%) Overall Limit
-Format definition compiled
->
The data shows that the "Primary Label Groups" table is nearing its limit; those tables over 90 percent full are marked with an asterisk.
In most cases where SPIRES attempts to fill a table beyond its capacity, the format compilation will fail. However, some tables may have "soft" limits that can be exceeded without causing SPIRES to stop compiling. If a table fills beyond 100 percent without stopping the compilation, the percentage shown will be "**". Unfortunately, there is no definitive way to tell whether the compiled code in those tables is defective; instead, you are advised to divide the format definition into separate load formats if any of the tables exceed 100 percent.
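If you direct the statistics to your active file and save them, you could check them with a short script outside SPIRES. The Python sketch below is one way to do that, under the assumption that each line has the layout shown in the example above; the function name and the regular expression are illustrative only and are not part of SPIRES.

```python
import re

# Sample lines copied from the STATISTICS display shown above.
SAMPLE = """\
2422 of 16384 Bytes (14%) Frame/Allocate Table
2058 of 12264 Bytes (17%) Label Group Overflow
* 3849 of 4088 Bytes (94%) Primary Label Groups
2321 of 16368 Bytes (14%) Long Strings
1110 of 4088 Bytes (27%) Short Strings
4456 of 8184 Bytes (54%) Functions/Expressions
60 of 16384 Bytes ( 0%) Local Vgroups
40 of 4088 Bytes ( 0%) Frame Type Tables
16332 of 32768 Bytes (49%) Overall Limit
"""

# Assumed layout: optional "*", bytes used, "of", table limit, "Bytes",
# a parenthesized percentage (digits or "**"), then the table name.
LINE = re.compile(r"^\*?\s*(\d+) of (\d+) Bytes \(\s*(?:\d+|\*\*)%\)\s+(.+)$")


def tables_over(text, threshold=90.0):
    """Return (name, percent) pairs for tables fuller than `threshold`."""
    flagged = []
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if match is None:
            continue  # not a statistics line (prompts, messages, titles)
        used, limit = int(match.group(1)), int(match.group(2))
        percent = 100.0 * used / limit
        if percent > threshold:
            flagged.append((match.group(3), round(percent, 1)))
    return flagged


for name, percent in tables_over(SAMPLE):
    print(f"{name}: {percent}% full")
```

The percentage is recomputed from the byte counts rather than read from the display, so a table shown as "**" would still be reported with its true value.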
The FORMAT-OPTIONS statement may be added to your format definition in order to control how SPIRES stores label group names or element references in your compiled format. The syntax is:
FORMAT-OPTIONS = value, [value, value];
The possible "values" are:
ID = *MYREC;
FILE = *MYREC; (RECDEF key)
RECORD = REC1; (any value)
GEN-FILE;
....
FORMAT-ID = MYFORMAT;
....
FORMAT-OPTIONS = TEMPFILE;
There are currently two restrictions on use of the GENELEM or GENVIRT options. First, you may not use those options if the format contains SET FILTER or CLEAR FILTER Uprocs. And second, you may not use them if the format has multiple SUBTREE statements in a frame.
If format size and efficiency are vital concerns for you, note that GENELEM and GENVIRT cause some tables in the compiled format to grow larger. These options also cause some drop in efficiency at execution time, since the element names will be converted to numbers whenever the label group is executed.
Even though GENELEM and GENVIRT eliminate one situation in which changes in file definitions require recompilation of formats, you should still be vigilant about when formats need to be recompiled. For example:
Once your format is compiled, any changes you make to the format definition in the FORMATS subfile will require you to recompile it in order for those changes to take effect. The RECOMPILE command is used to recompile a format definition that is already compiled:
[IN ACTIVE [CLEAR|CONTINUE]] RECOMPILE idname ...
... [STATISTICS] [NOWARN] [LABELS]
where "idname" is the value given in the ID statement, the same value you provided in the COMPILE command. [Some users may be able to compile and recompile formats belonging to other accounts, in which case the "gg.uuu." prefix must appear before the "idname".]
The STATISTICS option can be added to request information about the sizes of internal tables being constructed by SPIRES as the format is compiled. It can be important information to monitor for large applications. The option is described in detail in the previous section on the COMPILE command. [See B.6.2.] You may use the IN ACTIVE prefix in conjunction with the STATISTICS option to direct the statistics to your active file instead of your terminal.
By default, SPIRES will warn you if any of the internal tables exceed 90 percent of their capacity. If you add the NOWARN option to the command, you suppress these messages.
The LABELS option tells SPIRES to store label-group names in the compiled format record, for use in SET FTRACE output. If the format is compiled without the LABELS option, SET FTRACE will identify label groups by positional numbers rather than by names. If the format definition includes a FORMAT-OPTIONS = LABELS; statement, then the LABELS option on the RECOMPILE command is unnecessary. [See B.6.2.]
You may also need to recompile a format if the order or names of elements in the record-type of the file changes, or if the order or name of the record-type changes. Strictly speaking, it is only necessary to recompile if the format references elements whose positions are affected by such changes.
If your format definition includes a FORMAT-OPTIONS = GENELEM; or FORMAT-OPTIONS = GENVIRT; statement you may not need to recompile the format when a new element is added to a file definition. But see the previous section for caveats about use of these statements. [See B.6.2.]
When you no longer need the formats in a particular format definition (e.g., you are destroying the file), you should let SPIRES discard the compiled characteristics in the FORCHAR subfile and the format definition in the FORMATS subfile. That will happen when you issue the ZAP FORMAT command.
ZAP FORMAT[S] idname [SOURCE]
where "idname" is the value given in the ID statement of the format definition, with or without the account number prefix. Without the SOURCE option, only the compiled code in FORCHAR will be removed; if you include the SOURCE option, the format definition in the FORMATS subfile will also be removed. If you believe it is possible you will want to use the format definition again, you may want to omit that option.
Only the format definer (i.e., the account given in the ID statement), the file owner, and any accounts to which the file owner has given METACCT access to the FORCHAR subfile may zap a particular format.
If you zap a format and later realize you made a mistake, the format definition and possibly the compiled characteristics can be recovered if the system file containing the FORMATS and FORCHAR subfiles has not been processed (i.e., you must realize the mistake the day you zapped it). Try selecting the FORMATS subfile in SPIRES and dequeuing the format definition, or alternatively, processing the records under FOR TRANSACTIONS. That can help you recover the format definition itself, which you can compile. If you need more help, contact your SPIRES consultant before that system file is processed overnight.
You are not charged for storage of format definitions or their compiled characteristics, and you are not required to eliminate them when you no longer need them. Still, to keep the system files from becoming cluttered with unused formats, you are asked to zap formats that you will not be using again.
Once you have compiled your format, you may return to SPIRES and try using it. After selecting the appropriate subfile, you issue the SET FORMAT command, discussed in the next section. If the named format has data frames with DIRECTION = OUTPUT and USAGE = DISPLAY (the only ones covered so far), then output commands such as TYPE and DISPLAY will display the requested record or records via those frames.
If your format does not work properly, you will need to debug it, which is the topic of the second section in this chapter.
To use a compiled format, you must first issue the SET FORMAT command, which looks like this in its most basic form:
SET FORMAT format.name [,parm|'parm'|"parm"]
where "format.name" matches one of the names given in the FORMAT-NAME statements of the format definition. These names will also appear in the list of formats displayed by the SHOW FORMATS command. (New format names will not show up until the day after they have first been compiled.)
The "parm" option may be added to the end of the command, letting you pass information to the format dynamically (via the $PARM variable). The "parm" option is used, for example, by the system format $PROMPT:
-> set format $prompt name city phone.number
The character string "name city phone.number" would be assigned to the system variable $PARM for processing within the format. Note that the parameter was delimited from the format name with a blank, a form of the SET FORMAT command not shown in the syntax statement above. A blank as delimiter is not allowed in general, but is allowed for system formats such as $PROMPT.
Other forms of the SET FORMAT command are necessary when you are setting a global format or a general-file format. [See B.14, D.2.5.] For a complete list of other forms: [See B.11.2.]
You may also "reset" a format by issuing the SET FORMAT command with the "*" option:
SET FORMAT * [parm|,parm|'parm'|"parm"]
This causes the "startup" frame of the format currently set to be executed again, if there is one. [See B.5.2.] However, static variables are not reset to their original allocated values by the "SET FORMAT *" command. You can do that by setting the format again, identifying the format by name in the SET FORMAT command. [See B.9.3.]
To return to the SPIRES standard format, issue the command CLEAR FORMAT.
When you first try your format, it may not work exactly as you expected, especially if it is a complicated one. A data value may have been positioned in the wrong place, or a title may not have appeared, for example. You will need to re-examine your format definition, make changes as appropriate, and recompile it. [See B.6.3.] In most cases, the problem is caused by a minor error or omission -- it may be that a simple typographical error is the cause of what appears to be a major problem.
Here's one important tip to keep in mind: When testing a format, be sure that you are familiar with the data in the sample record or records you use. You may be expecting a certain element to appear that does not even exist in the record, for example. Try displaying the record in the standard SPIRES format if you have any doubts about its contents.
A simple "troubleshooter" list for debugging formats appears at the end of this chapter. [See B.7.3.]
The format tracing facility, invoked in its simplest form by the SET FTRACE command, is a very useful tool when you are trying to debug complicated formats. You may set format tracing for any format you can use.
As a record is processed through a format when the tracing facility is turned on, SPIRES will display messages on the terminal screen describing its progress through the format. Events noted include: entry and exit to the format; entry and exit to each frame; call to and return from load formats; subgoal access (including the key of the accessed record and information about the record-type); and errors as they occur. [See B.11, B.12 for information about load formats and subgoal access.]
A number of options are available on the SET FTRACE command, to provide detailed or specialized tracing. Here are the variations:
SET FTRACE             basic format tracing
SET FTRACE JUMP        tracing of JUMP, CASE, XEQ PROC commands
SET FTRACE SNAPSHOT    shows control information, e.g. for PUTDATA
SET FTRACE VARIABLES   shows static variable values as they change
SET FTRACE FRAMES      limits detailed tracing to specified frames
SET FTRACE BRIEF       suppresses basic tracing for other frames
SET FTRACE ALL         same as issuing SET FTRACE, SET FTRACE
                       JUMP, SET FTRACE SNAPSHOT, SET FTRACE
                       FRAMES and SET FTRACE VARIABLES
The VARIABLES and FRAMES options also take lists of variable and frame names:
SET FTRACE VARIABLES [variable-name1, variable-name2, ...]
SET FTRACE VARIABLES + variable-name3, variable-name4, ...
SET FTRACE VARIABLES - variable-name1, variable-name2, ...
SET FTRACE FRAMES frame-name1, frame-name2, ...
SET FTRACE FRAMES + frame-name3, frame-name4, ...
SET FTRACE FRAMES - frame-name1, frame-name2, ...
SET FTRACE alone enables the basic tracing mechanism. You may type one or more of the other forms of the command in order to turn on additional trace information. Each variation except SET FTRACE ALL is described below. For online explanations, type EXPLAIN SET FTRACE followed by the option you are interested in (e.g. EXPLAIN SET FTRACE SNAPSHOT).
Here is an example of output from a simple SET FTRACE command:
-> select drinks
-> set format display
-> set ftrace
-> in active display 1
* Format DISPLAY
Enter frame DRINK
Label no. 3
Enter frame INGREDIENTS
Leave frame INGREDIENTS at Label no. 2
Label no. 3 $Loopct = 1
Enter frame INGREDIENTS
Leave frame INGREDIENTS at Label no. 2
Label no. 3 $Loopct = 2
Enter frame INGREDIENTS
Leave frame INGREDIENTS at Label no. 2
Leave frame DRINK at Label no. 6
* Leave format
->
In the above example, the record is placed in the active file; the tracing information is displayed at the terminal, line by line, as SPIRES executes the format. (You may instead direct the tracing information to a "trace log" for later examination.) [See B.7.2.7.]
The format DISPLAY consists of the data frame DRINK, which contains the indirect frame INGREDIENTS. INGREDIENTS is an indirect-structure frame that is executed several times, as the example shows. [See B.8.] When SPIRES leaves a frame, the name of the frame is displayed along with the number of the last label-group executed in that frame (the first label-group being number 1). [This format is created and explained in detail in the SPIRES manual "A Guide to Output Formats".]
This is just a simple example. More complex formats will have more complex tracing messages.
Note that you can compile your format with the LABELS option (on the COMPILE or RECOMPILE commands) or include a FORMAT-OPTIONS = LABELS; statement in your format definition, in order to see label-group names in your FTRACE output rather than just the positional numbers shown in the example above. [See B.6.2.]
The SHOW FTRACE command will display the formats tracing that is currently in effect:
-> show ftrace
SET FTRACE
SET FTRACE SNAPSHOT
SET FTRACE BRIEF
SET FTRACE FRAMES INDEX.BRIEF
To turn off all format tracing, issue the CLEAR FTRACE or CLEAR FTRACE ALL command. Setting a format or selecting a subfile will also turn it off.
You may instead turn off specific forms of tracing with one or more of these commands:
CLEAR FTRACE JUMP
CLEAR FTRACE SNAPSHOT
CLEAR FTRACE VARIABLES
CLEAR FTRACE FRAMES
CLEAR FTRACE BRIEF
(The basic format tracing will remain on, even if you turn off all the specialized tracing mechanisms.)
You can set tracing for global formats in effect by issuing the command SET GLOBAL FTRACE, and turn it off with CLEAR GLOBAL FTRACE. All of the FTRACE variations described above may also be used for global formats, e.g. SET GLOBAL FTRACE SNAPSHOT, SHOW GLOBAL FTRACE. [See D.2.5.]
Format tracing may also be used to help debug formats used in SPIBILD. [See C.9.]
SET FTRACE JUMP produces a trace of all JUMP (GOTO), CASE, and XEQ PROC commands executed (plus RETURNs from XEQ PROC). Here is a simple example of SET FTRACE JUMP output:
* Format INPUT
Enter frame ADD
Label no. 2 $Loopct = 2
Jump to Label no. 3
Label no. 7
Jump to Label no. 5
Label no. 7
Jump to Label no. 5
Label no. 7
Jump to Label no. 5
Label no. 7
Jump to Label no. 5
Label no. 5
Jump to Label no. 8
Label no. 11
Jump to Label no. 5
Label no. 5
Jump to Label no. 12
Leave frame ADD at Label no. 12
* Leave format
SET FTRACE SNAPSHOT displays selected information at certain control points of a format's execution, e.g. whenever PUTDATA, PUTELEM, REMELEM, or IND-STRUCTURE statements are executed. Here is an example:
* Format DISPLAY
Enter frame DRINK
Label no. 1
Access element: QUANTITY $Currocc = 0
External value = 'Makes 1 drink'
Data output: Start = 1,50 End = 1,68
Label no. 2
Access element: NAME $Currocc = 0
External value = 'Harvey Wallbanger'
Data output: Start = 1,1 End = 1,17
Label no. 3
Ind-structure: INGREDIENTS $Currocc = 0
Title value = 'Ingredients:'
Data output: Start = 1,10 End = 1,21
Enter frame INGREDIENTS
Label no. 1
Access element: AMOUNT $Currocc = 0
External value = '1 oz.'
Data output: Start = 1,5 End = 1,24
Label no. 2
Access element: CONSTITUENT $Currocc = 0
External value = 'vodka'
Data output: Start = 1,27 End = 1,31
Leave frame INGREDIENTS at Label no. 2
Label no. 3 $Loopct = 1
Ind-structure: INGREDIENTS $Currocc = 1
Enter frame INGREDIENTS
Label no. 1
Access element: AMOUNT $Currocc = 0
External value = '4 oz.'
Data output: Start = 1,5 End = 1,24
Label no. 2
Access element: CONSTITUENT $Currocc = 0
External value = 'orange juice'
Data output: Start = 1,27 End = 1,38
Leave frame INGREDIENTS at Label no. 2
Label no. 3 $Loopct = 2
Ind-structure: INGREDIENTS $Currocc = 2
Enter frame INGREDIENTS
Label no. 1
Access element: AMOUNT $Currocc = 0
External value = '1/2 oz.'
Data output: Start = 1,5 End = 1,24
Label no. 2
Access element: CONSTITUENT $Currocc = 0
External value = 'Galliano'
Data output: Start = 1,27 End = 1,34
Leave frame INGREDIENTS at Label no. 2
If you would like to see the values of static variables as they change during format execution, issue the SET FTRACE VARIABLES command. Here is a very simple example, involving only one variable:
* Format INPUT
Enter frame ADD
Label no. 5
ELEMNAME = BLOOD.TYPE
Label no. 5
ELEMNAME = PHONE.NUMBER
Label no. 5
ELEMNAME = CAN.BE.CALLED
Leave frame ADD at Label no. 12
* Leave format
The command SET FTRACE VARIABLES traces all your static variables. You may also include a variable list on the command, to limit the tracing to specific variables. In subsequent SET FTRACE VARIABLES commands, you can add to or subtract from the list with a plus or minus sign. For example, this series of commands establishes tracing for all static variables except those named USERCODE and USERNAME:
-> set ftrace variables
-> set ftrace variables - usercode, username
And here is another example showing how you can add to or subtract from variable lists:
-> set ftrace variables longphone
-> set ftrace variables + areacode, prefix
-> set ftrace variables - prefix
(After these commands, you end up tracing the LONGPHONE and AREACODE variables.)
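The bookkeeping in the second example can be modeled in a few lines of Python. This is a hedged sketch of the list arithmetic only: the function name is hypothetical, the model covers commands that carry an explicit variable list, and it does not attempt to represent the bare SET FTRACE VARIABLES form that traces every static variable.

```python
PREFIX = "set ftrace variables"


def apply_command(traced, command):
    """Return the new list of traced variables after one command."""
    text = command.strip().lower()
    if not text.startswith(PREFIX):
        raise ValueError("not a SET FTRACE VARIABLES command: " + command)
    args = text[len(PREFIX):].strip()
    if not args:
        raise ValueError("the bare command (trace all) is not modeled here")
    # A leading "+" adds to the list, "-" subtracts; otherwise replace it.
    sign = args[0] if args[0] in "+-" else ""
    names = [n.strip().upper() for n in args[len(sign):].split(",") if n.strip()]
    if sign == "+":
        return traced + [n for n in names if n not in traced]
    if sign == "-":
        return [n for n in traced if n not in names]
    return names


traced = []
for command in (
    "set ftrace variables longphone",
    "set ftrace variables + areacode, prefix",
    "set ftrace variables - prefix",
):
    traced = apply_command(traced, command)
print(traced)  # ['LONGPHONE', 'AREACODE']
```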
If you have turned on JUMP, VARIABLES, or SNAPSHOT tracing, the SET FTRACE FRAMES command will limit the detailed tracing to the specified frames. The basic tracing still occurs for other frames, unless you have also issued the SET FTRACE BRIEF command (see below). SET FTRACE FRAMES has no effect if JUMP, VARIABLES, or SNAPSHOT tracing is not in effect. [See B.7.2.2, B.7.2.3, B.7.2.4.]
In most cases, your SET FTRACE FRAMES command would include a frame list. In subsequent SET FTRACE FRAMES commands, you can add to or subtract from the frame list with a plus or minus sign.
SET FTRACE FRAMES frame-name1, frame-name2, ...
SET FTRACE FRAMES + frame-name3, frame-name4, ...
SET FTRACE FRAMES - frame-name1, frame-name2, ...
SET FTRACE FRAMES without a frame list usually has no effect. (If no formats tracing at all is on when the command is issued, it will turn on the basic tracing, for all frames.) But you can use that form of the command to specify an "exclusion list" for tracing of frames. For example, this series of commands would let you see snapshot tracing for all frames except those named HEADER and FOOTER:
-> set ftrace snapshot
-> set ftrace frames
-> set ftrace frames - header, footer
SET FTRACE BRIEF is useful in conjunction with SET FTRACE FRAMES. [See B.7.2.5.] It suppresses all tracing information for frames other than those requested in your SET FTRACE FRAMES commands. In the absence of SET FTRACE BRIEF, you see detailed tracing (JUMP, VARIABLES, or SNAPSHOT tracing) for the frames you selected with SET FTRACE FRAMES, and basic tracing for other frames.
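How the FRAMES and BRIEF settings combine can be summarized as a small decision function. The Python sketch below is a reading of the rules stated above, not a description of SPIRES internals. It assumes that basic tracing is on, that a frame list of None means no SET FTRACE FRAMES list is in effect, and that BRIEF does nothing unless detailed tracing and a frame list are both in effect (a case the text above does not spell out). The function and frame names are hypothetical.

```python
def trace_level(frame, detailed_on, frame_list=None, brief=False):
    """Return 'detailed', 'basic' or 'none' for one frame.

    detailed_on -- True if JUMP, VARIABLES or SNAPSHOT tracing is set
    frame_list  -- frames named by SET FTRACE FRAMES, or None if no list
    brief       -- True if SET FTRACE BRIEF has been issued
    """
    if not detailed_on:
        # SET FTRACE FRAMES has no effect without detailed tracing.
        return "basic"
    if frame_list is None or frame in frame_list:
        return "detailed"
    # Frame not selected: basic tracing, unless BRIEF suppresses it.
    return "none" if brief else "basic"


selected = ["INGREDIENTS"]
for frame in ("DRINK", "INGREDIENTS"):
    print(frame, trace_level(frame, True, selected),
          trace_level(frame, True, selected, brief=True))
```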
Normally, the SET FTRACE output is simply displayed at the terminal as the format executes. For complicated debugging tasks, you may find it convenient to send the tracing information to a "trace log" for later examination. The SET TLOG command accomplishes this task. To look at the tracing, issue the command SHOW TLOG. Or, to place the tracing information in your active file, type IN ACTIVE SHOW TLOG. For example:
-> set ftrace snapshot
-> set ftrace variables
-> set tlog
-> display 1
<the record displays as usual, without tracing>
-> in active show tlog
<the tracing is placed in your active file, where you can
study the sections you need to debug>
For details about using trace logs, EXPLAIN SET TLOG.
The format definer can request that specific information be displayed at the terminal during format execution by using the $FTRACE system variable, which indicates whether the tracing facility is on or off. For instance, you can code a statement such as:
UPROC = IF $FTRACE THEN * 'Value of $CVAL for DATE is ' $CVAL;
This Uproc, coded in a label group for the DATE element presumably, would only send the message to the terminal when the tracing facility was on. Hence, you can keep track of element and variable values as they are changed by coding Uprocs such as that.
Unlike protocols, which may have BREAK XEQ commands to temporarily halt their execution, formats cannot be stopped between label groups or between frames in order to test variable values. SET FTRACE SNAPSHOT and SET FTRACE VARIABLES may provide detailed enough tracing of changing values, but if not, you may find it helpful to use $FTRACE as described here.
Note that if you want to direct custom messages to the trace log (with the SET TLOG command) you should use the $ISSUEMSG function rather than the * Uproc, as in this example:
UPROC = IF $FTRACE THEN EVAL $ISSUEMSG('Danger! Start of problem',W);
For most tracing situations, $ISSUEMSG is preferable to using the "*" command, since output from the "*" command will only appear on the terminal screen and never as part of the trace log. [EXPLAIN SET TLOG for information about trace logs. EXPLAIN $ISSUEMSG for an explanation of the function.]
Whenever you are debugging a format, make sure you have not disabled the display of system diagnostics via a SET MESSAGES = 0 command. You should at least use the default setting, SET MESSAGES = 2. This applies to the SET MESSAGES Uproc allowed in the format definition too. [See C.3.6.4.]
If nothing appears except a prompt when you display a record:
- you may have omitted a FRAME-DIM statement.
If the record is displayed in the standard SPIRES format rather than through yours:
- you may have omitted a FRAME-TYPE = DATA statement in the format declaration section; or
- you may have coded the wrong USAGE value for a frame; or
- you may have issued the SET ELEMENTS command and not issued a CLEAR ELEMENTS command since.
If a data element you are expecting to see does not appear at all:
- the element may not have a value for this particular record; or
- the element value may have been suppressed by SET FILTER commands or "priv-tags" in effect; or
- you may have omitted the GETELEM and/or PUTDATA statement; or
- you may have overwritten the value with another value placed in the frame. Check that no other data element might have been placed in the same position.
If only one of several occurrences of an element appears:
- you may have omitted the LOOP statement; or
- you may have overwritten the values with other values placed in the frame; or
- the record may not have had multiple occurrences of that element; or
- the element may be in a structure and you did not code an indirect frame to get it; or
- multiple occurrences may have been suppressed by SET FILTER commands in effect.
If a value appears in the wrong position:
- you may have coded its START or XSTART or TSTART position incorrectly; or
- the starting position may reference * or X, and the previous label group may not have set it properly. For example, if the previous label group that placed a value did not have a DEFAULT statement and no element value was retrieved, * and X would not have been reset.
If "garbage" appears somewhere in the display:
- a value may have overwritten part of another. Check that the START statements are coded appropriately.
- you may have omitted a semicolon to end a statement. For example, "VALUE = 123 START = *,*+2;" which clearly was meant to have a semicolon after the "123", would make the value of $CVAL be "123START=*,*+2".
- you may not have converted a value to the proper type (string) before it was displayed. [See B.4.3.]
The above is only a list of suggestions; it does not represent a complete listing of all possible explanations for an error.
The previous chapters of Part B have discussed the basics of output format creation. The remaining chapters of this part will cover other aspects of output formats that are certainly important but perhaps not as fundamental. In this chapter, you will learn how to handle structures in an output format.
Generally, a data frame (i.e., a frame of FRAME-TYPE = DATA) can retrieve only record-level elements, that is, elements not within structures. To retrieve all the elements within a structure requires the use of an indirect frame. [See B.4.8.7.] (There are some situations where you do not need an indirect frame; they are described at the end of this section.) The indirect frame is called by the IND-FRAME statement; additionally, the IND-STRUCTURE statement identifies the structure that will be processed by the indirect frame. [See B.8.2.]
The elements within the structure are individually processed in the indirect frame. That frame contains label groups with GETELEM, PUTDATA and other statements to handle the individual element occurrences just as the data frame does for the record-level elements. The indirect frame also includes a SUBTREE statement that specifies which structure of the goal record is being processed by the frame. [See B.8.1.]
Two important factors to consider are the relationship between the frame dimensions of the calling frame and the indirect frame, and the impact of the PUTDATA statement when it is placed on the label group that calls the indirect frame. They are discussed in the next sections.
There are two situations where formats do not need to have indirect frames to process structures. One is a general situation, and the other is quite specific.
First the general case. You may be able to treat the structure as if it were a single element value, using the system procs $STRUC.OUT or $STRUC (action A33) in the OUTPROC string for the structure. These processing rules are often coded for multiply occurring structures whose individual elements are singly occurring. Issue the command EXPLAIN A33 RULE or EXPLAIN $STRUC PROC for more information about them.
Suppose for instance that the PHONE structure contains the elements AREA.CODE, PREFIX and SUFFIX:
LABEL = PHONE;
GETELEM;
OUTPROC = $STRUC.OUT(3,-);
PUTDATA;
The OUTPROC tells SPIRES to get the first occurrence of each of three elements in the structure and put them together, separated by hyphens, into a single value like this: "415-497-4420".
Note that the structure is treated as a single element value when the structure processing rules ($STRUC.OUT, A33, etc.) are in effect. Hence, in the format you do not have the flexibility of individually positioning each element of the structure if you retrieve the structure this way, unless you then split the value into its component parts yourself.
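The joining behavior described above can be modeled outside SPIRES. The following is a minimal Python sketch, not SPIRES code: the function name struc_out, the dictionary layout of the structure occurrence, and the choice to skip elements that do not occur are all assumptions made for illustration.

```python
# Hypothetical model of the joining described for $STRUC.OUT(3,-).
# Plain Python for illustration only; none of these names exist in SPIRES.

def struc_out(occurrence, element_names, count, separator):
    """Join the first occurrence of the first `count` elements with `separator`."""
    values = []
    for name in element_names[:count]:
        occurrences = occurrence.get(name, [])
        if occurrences:                    # assumption: absent elements are skipped
            values.append(occurrences[0])  # first occurrence only
    return separator.join(values)

# One occurrence of the PHONE structure from the example above.
phone = {"AREA.CODE": ["415"], "PREFIX": ["497"], "SUFFIX": ["4420"]}
joined = struc_out(phone, ["AREA.CODE", "PREFIX", "SUFFIX"], 3, "-")
print(joined)
```

The result is a single string, which is why the individual elements can no longer be positioned separately once the structure is retrieved this way.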
The second situation, though not common, does give you positioning flexibility, in that it permits you to position each element of the structure individually, each being handled in its own label group. The second situation arises when the structure containing the elements you want to process is singly occurring, or you want to process only the first occurrence of the structure.
Using the PHONE example above and assuming only the first PHONE occurrence is needed, you might code the following label groups:
LABEL = AREA.CODE;
GETELEM;
INSERT = '(';
INSERT = END, ')';
PUTDATA;
LABEL = PREFIX;
GETELEM;
START = *,*+1;
PUTDATA;
LABEL = SUFFIX;
GETELEM;
INSERT = '-';
START = *,*+1;
PUTDATA;
Here no mention of the structure PHONE appears. Only the elements within the structure are specified. (If they were not unique, i.e., if other elements having the same element names were in the record-type, the element names could be preceded by "PHONE@", as in "GETELEM = PHONE@PREFIX", to specify a structural path for SPIRES to follow to the desired element.)
All occurrences of the elements within that occurrence of the structure may be retrieved (e.g., using the LOOP statement). Remember though that you cannot use this method to retrieve elements within occurrences of the PHONE structure other than the first in the record.
The rest of this chapter describes the format definition code needed when you must use indirect frames to process structures.
Basically an indirect frame that formats data can work in one of two ways. Either it can place its data directly into the calling frame, just as if its label groups were in the calling frame's definition, or it can place its data into a buffer of its own, so that the entire buffer is positioned in the calling frame all at once when the indirect frame has completed execution. Which method SPIRES uses depends on whether a PUTDATA statement is coded on the calling label group; this holds whether or not the indirect frame is being used to handle structures.
If no PUTDATA statement appears there, then the first method is used. Although execution control switches to the indirect frame, all data processed by it is placed directly into the calling frame. The current row and column numbers from the calling frame are thus carried into the indirect frame. In other words, the main impact of the indirect frame for structure processing with this method is that it now allows you access to elements within some structure, but otherwise acts as if you were still in the calling frame. In fact, any FRAME-DIM statement you code in the indirect frame will be ignored, since the frame dimensions of the calling frame are used when the PUTDATA is omitted from the calling label group.
For example, here is a calling label group on the left, and the indirect frame it calls on the right.
LABEL;                          FRAME-ID = DO.CLASSES;
IND-STRUCTURE = CLASSES;        DIRECTION = OUTPUT;
IND-FRAME = DO.CLASSES;         SUBTREE = CLASSES;
TSTART = X+1,1;                 USAGE = DISPLAY;
TITLE = 'Classes taken:';       LABEL = CLASS.NUMBER;
LOOP;                           GETELEM;
                                START = *+1,5;
                                PUTDATA;
                                LABEL = CLASS.NAME;
                                GETELEM;
                                START = *,15;
                                PUTDATA;
                                LABEL = DESCRIPTION;
                                GETELEM;
                                MARGINS = 35,64;
                                MAXROWS = 5;
                                START = *,35;
                                PUTDATA;
The starting positions declared in the indirect frame refer to locations in the calling frame. Since there is no PUTDATA statement in the calling label group, no FRAME-DIM statement is necessary in the indirect frame -- it would be ignored anyway.
Note the use of the TITLE and TSTART statements to provide a title for the values processed by the indirect frame. This is one of the few situations in a label group where the statements are not executed in the order shown -- the TITLE and TSTART statements are executed before the indirect frame is called.
Finally, note the LOOP statement on the calling label group, which directs SPIRES to execute the indirect frame again to process the next occurrence of the structure. Remember that if there were no occurrences of the CLASSES structure at all, the calling label group would not be executed, and hence the indirect frame would not be executed either.
Coding the DEFAULT statement on the calling label group will cause the indirect frame to be executed even though the structure does not occur; all GETELEM statements in the indirect frame will fail, unless those label groups have their own DEFAULT statements. [See B.4.5.1.] If SPIRES is executing an indirect frame under DEFAULT processing, it sets the system variable $NODATA. [See E.2.1.27.]
On the other hand, if a PUTDATA does appear on the calling label group, then a buffer is built apart from the calling frame, using the frame dimensions given for the indirect frame. As it executes, the indirect frame places data into the buffer, which has its own row and column numbers independent of those in the calling frame. When the indirect frame is finished executing, the calling label group regains control, handling the placement of the indirect frame's buffer in the calling frame. The rows of the buffer are put together end to end, so that it becomes a long string value that is assigned to $CVAL. Then, using the PUTDATA and other value placement statements in the calling label group, SPIRES positions $CVAL in the calling frame just like any other value. Any DISPLAY attributes set in the indirect frame are lost because $CVAL only contains data, not attributes.
Here is another calling label group and the indirect frame, but this calling label group has a PUTDATA statement on it. The result is the same as that of the example above:
LABEL;                          FRAME-ID = DO.CLASSES;
IND-STRUCTURE = CLASSES;        DIRECTION = OUTPUT;
IND-FRAME = DO.CLASSES;         SUBTREE = CLASSES;
TITLE = 'Classes taken:';       FRAME-DIM = 5,60;
TSTART = X+1,1;                 USAGE = DISPLAY;
MARGINS = 5,64;                 LABEL = CLASS.NUMBER;
START = *+1,5;                  GETELEM;
PUTDATA;                        START = 1,1;
LOOP;                           PUTDATA;
XSTART = X,5;                   LABEL = CLASS.NAME;
                                GETELEM;
                                START = *,11;
                                PUTDATA;
                                LABEL = DESCRIPTION;
                                GETELEM;
                                MARGINS = 31,60;
                                MAXROWS = 5;
                                START = *,31;
                                PUTDATA;
The structure data is placed in the DO.CLASSES buffer just as it was placed in the calling frame's buffer before. The FRAME-DIM statement assigns 60 columns to each row of the DO.CLASSES frame, which matches the number of columns allotted to it by the calling label group in the START and MARGINS statements (column 5 to column 64 is 60 columns). It is usually crucial that these numbers match; otherwise, after the indirect frame is converted to a long one-dimensional string value, it will not be placed properly into the calling frame.
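To see why the column counts should match, it may help to model the process in ordinary code. The following Python sketch is an illustration only, not SPIRES code: the helper names flatten and place and the sample row contents are invented, and the model assumes that the calling frame simply wraps the long string across the columns between its margins.

```python
# Hypothetical model of an indirect frame's buffer being turned into one
# long string and then wrapped into the calling frame's margins.

def flatten(rows, width):
    """Pad each buffer row to `width` columns and join the rows end to end."""
    return "".join(row.ljust(width) for row in rows)

def place(value, left, right):
    """Wrap a long string into rows spanning columns left..right (1-based)."""
    width = right - left + 1
    return [value[i:i + width].rstrip() for i in range(0, len(value), width)]

buffer_rows = ["101  Algebra", "205  Geometry"]   # invented sample data

matched = place(flatten(buffer_rows, 60), 5, 64)     # 60-column window
mismatched = place(flatten(buffer_rows, 60), 5, 54)  # 50-column window

print(matched)
print(mismatched)
```

With a 60-column window the rows come back exactly as they were built. With a 50-column window the second row begins ten columns too far to the right, because the padding from the first 60-column row spills over into it.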
Note that blank rows at the end of the indirect frame are not returned, just as blank rows at the end of a data frame are not displayed. However, any blank character positions at the end of the last row having data will be included at the end of $CVAL. (That will affect the value of "X" for column positioning for the next START statement, for example.) Generally they will not cause any problems, but you may code an OUTPROC of system proc $TRIM (action A51) in the calling label group to remove them.
You can see that this second example is much more complicated than the first, so usually the first method is preferable. However, there may be times when you must use the latter method to get the desired results. For example, suppose structure ALPHANUM contains singly occurring, optional elements ALPHA and NUM. Your format design positions them like this:
....v....1....v....2....v....3....v....4
(record key)       (ALPHA)   (NUM)
                   (ALPHA)   (NUM)
                   (etc.)
Each occurrence of the structure appears on a new row, except for the first. In this situation, it is easiest to treat the structure as a single value to be positioned in the data frame:
(the calling label group)       (the indirect frame)
LABEL = ALPHANUM;               FRAME-ID = ALPHNUM.OUT;
IND-FRAME = ALPHNUM.OUT;        FRAME-DIM = 1,40;
IND-STRUCTURE = ALPHANUM;       DIRECTION = OUTPUT;
START = *,20;                   LABEL = ALPHA;
PUTDATA;                        GETELEM;
LOOP;                           START = 1,1;
XSTART = *+1,20;                PUTDATA;
                                LABEL = NUM;
                                GETELEM;
                                START = 1,11;
                                PUTDATA;
Why was the PUTDATA in the calling label group necessary? First of all, without the PUTDATA there, each new occurrence of the structure would have overwritten the previous one -- they would have all started on the same row with the record key. (Remember, the START and XSTART statements in the calling label group would be ignored.) Changing the starting row for the elements in the indirect frame to anything else would still cause similar problems. Problems could also arise if there were occurrences of the structure that did not have the ALPHA element. Try changing the starting rows for the indirect frame elements and adding or removing the PUTDATA statement on the calling label group in various combinations -- you will clearly see why this is the best (though not only) solution to this design problem.
The rest of this chapter will discuss the other statements needed for coding indirect frames.
The first few statements in an indirect frame definition, that is, the frame identification statements, generally look very much like those of the calling frame. The differences usually involve the SUBTREE and FRAME-DIM statements. The FRAME-DIM statement, whose values (or even whose inclusion) depend on the presence or absence of a PUTDATA statement in the calling label group, has already been discussed. [See B.8.1.]
The SUBTREE statement names a structure of the goal record, indicating the hierarchical path from the record level that SPIRES should follow to access the data elements processed by this frame.
SUBTREE = structure.name;
SUBTREE may also name a phantom structure. [See B.12.1.]
If the structure does not have a unique name in the goal record-type, then you must indicate the path SPIRES must take to get to that specific structure, beginning with the record-level structure or a unique structure name within that path:
SUBTREE = structure.1@structure.2@...@structure.n;
The SUBTREE statement tells SPIRES that all elements referenced in GETELEM and IND-STRUCTURE statements in the frame will be in that structure. Any element named by those statements that is not in the structure will cause a compilation error.
The SUBTREE statement is not required for indirect frames that process structures, though it is recommended unless you need to retrieve other elements from outside the structure in such a frame. If you omit the SUBTREE statement, you can use GETELEM statements to retrieve other record-level elements from within the indirect frame, which normally allows access only to the elements within the "subtree".
Warning: If you do retrieve "extra-structural" elements from within the indirect frame with GETELEM statements, the frame may not correctly retrieve occurrences of the elements within the structure (other than the first occurrences) from that point on in the indirect frame. In other words, once an extra-structural element is accessed, the indirect frame no longer has the structural-access power it did have. Moreover, if such an indirect frame is being executed repeatedly because of a LOOP statement, "infinite loop" problems can arise -- SPIRES may access the same structural occurrence repeatedly.
If you need to retrieve extra-structural elements from within the indirect frame, use one of the $GETxVAL functions (like $GETCVAL) instead of the GETELEM statement in order to retain the structural-access power of the frame.
Occasionally a goal record-type contains a "floating structure", i.e., two or more structures from different subtrees that have identical definitions. (An ADDRESS structure containing STREET.ADDRESS, CITY, STATE and ZIP elements might appear in several places within a record-type, for instance.) A single indirect frame may be used to process all of the floating structure's appearances if multiple SUBTREE statements are coded. Each one must define a unique hierarchical path to the floating structure. The IND-STRUCTURE statement in the calling label group tells SPIRES which subtree to access.
The "@n" option on the GETELEM statement is often useful when you want to use the same frame to retrieve from multiple subtrees. [See B.4.2.1.] In this case, the structures processed by the indirect frame do not have to be identical (that is, they do not necessarily have the same elements with the same names) though they would probably be very similar.
The SUBTREE statement may also have the form:
SUBTREE;
which indicates that the frame is a multiple-subtree frame, designed to handle element retrieval from any structure or at the record level. It allows a frame of FRAME-TYPE = STRUCTURE to be a general frame for any subfile. [See B.15, B.14.]
The calling label group needs to have an IND-STRUCTURE statement if elements within the desired structure are to be processed in an indirect frame. What other statements are necessary in the label group depends on whether a PUTDATA statement will be included as well. [See B.8.1.]
The IND-STRUCTURE statement has this syntax:
IND-STRUCTURE = structure.name;
where IND-STRUCTURE may be abbreviated to IND-STRUC and where "structure.name" has the same form as "element.name" in a GETELEM statement. You may, for example, request a specific occurrence of the structure by using the form: "structure.name(n)", where "n" is an integer representing the occurrence number desired, counting from 0 (zero) for the first. [See B.4.2, B.4.2.1.]
A calling label group often has a LOOP statement to cause SPIRES to return to the indirect frame to process the next occurrence of the structure.
Here is a label group that calls an indirect frame for structure processing:
LABEL = CALL.ORDER.FRAME;
IND-FRAME = ORDER.FRAME;
IND-STRUCTURE = ORDER;
UPROC = LET TOTAL = #TOTAL + #QUANTITY;
LOOP;
The Uproc, which is presumably keeping a running total of some values retrieved in the indirect frame, is executed for each occurrence of the structure, just before the LOOP statement tells SPIRES to re-execute the indirect frame. When no more occurrences of the structure exist (which SPIRES determines when trying to execute the IND-STRUCTURE statement), the loop is broken and SPIRES continues to the next label group.
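The order of events in that label group can be pictured with a short model. This Python sketch is an illustration only, not SPIRES code: the record layout and the function names run_order_frame and process_orders are invented, and the stand-in for the indirect frame simply returns the quantity it found.

```python
# Hypothetical model of a calling label group that loops over the
# occurrences of the ORDER structure and keeps a running total.

def run_order_frame(order):
    """Stand-in for the indirect frame: returns the quantity it retrieved."""
    return order["QUANTITY"]

def process_orders(record):
    total = 0
    for order in record.get("ORDER", []):   # LOOP: one pass per occurrence
        quantity = run_order_frame(order)   # the indirect frame executes first
        total = total + quantity            # then the Uproc updates the total
    return total                            # no occurrences left: the loop ends

record = {"ORDER": [{"QUANTITY": 2}, {"QUANTITY": 5}]}   # invented sample data
print(process_orders(record))
```

A record with no ORDER occurrences never enters the loop, so the total is left at its starting value.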
As you have seen in earlier examples, SPIRES variables can be indispensable in many format coding situations. System variables such as $CVAL keep track of the value to be placed in the frame, or its starting position in the frame, or its current length in characters, or some other useful information. User variables can be created to hold values to be used in other label groups, other frames, or even outside of the format.
The system variables always exist; that is, memory is always allocated for them. User variables must be defined, compiled and then "allocated", which means telling SPIRES to reserve memory for them and perhaps assign initial values to them. Both types of variables typically make appearances in Uprocs and VALUE statements, though they are allowed in many other statements as well.
Basically variables may be used the same way in formats as they can be in protocols; most of the syntax rules are the same. The next section will briefly cover those basics and describe the differences from protocol use in detail. [See B.9.1.] The following section will discuss the use of system variables in output formats, briefly describing the most useful ones. The remainder of the chapter will cover user variables, explaining how to define, compile and allocate them. [See B.9.2.] Local vgroups ("variable groups") defined within the format definition and global vgroups defined and compiled separately will both be discussed. [See B.9.3.]
This section will review basic syntax rules concerning the use of system and user variables in format definition code. More information about variables in general can be found in the manual "SPIRES Protocols".
System variables, whose names each begin with a dollar sign, usually contain values set by SPIRES. Some of them, however, may be set explicitly by you in the format, such as $PROMPT, $ASK and $CVAL, by issuing a SET Uproc:
UPROC = SET PROMPT = 'How many occurrences do you want?';
The SET Uproc in a format, unlike the SET command, allows expressions:
UPROC = SET CVAL = $UVAL * 12 || ' months';
All of the system variables explicitly related to formats are explained in detail later in this manual. [See E.2.] Note that other system variables are available within a format too (for instance, $SELECT, which contains the name of the selected subfile), though they may not be settable.
User-defined variables are identified by the pound sign that begins each of their names, e.g., #ANSWER. A value may be assigned to a user variable in one of two ways: either in a LET Uproc or command, or in a VALUE statement executed when the variable is allocated. [See B.4.8.10, B.9.3.1.] During format execution, only the LET Uproc can do it:
UPROC = LET ANSWER = $ASK;
Note that the pound sign (or the dollar sign for system variables) is omitted when the variable is on the left side of the equals sign in a LET or SET Uproc or command. Any variables on the right side of the equality operator must have the pound or dollar sign, as appropriate, if substitution of their values is desired.
In regard to variable substitution, note that the "/" prefix, used in commands to force substitution, is not available in formats; appropriate substitution occurs automatically as long as the variable is not part of a string enclosed by apostrophes or quotation marks. For example, compare these two statements, the first a command, the other a formats Uproc:
/* The time is $TIME.
UPROC = * 'The time is ' $TIME '.';
Both would have the same effect (though the latter display would not include the "*" at its beginning). In a command, the literals do not have to be enclosed by quotation marks or apostrophes, while they almost always do in Uprocs. On the other hand, you do not need (and in fact cannot use) the "/" to force variable substitution for $TIME in the Uproc.
Sometimes you may want to check or use variable values assigned during format execution outside of the format. However, most of the system variables directly related to formats have no meaningful value outside of format execution. For example, if you issue the command SHOW EVAL $CVAL to see the last value of $CVAL inside the format that just executed, the value returned will probably be garbage; it is certainly not reliable. User variables and some system variables settable from within a format, such as $PARM, $PROMPT and $ASK, can be used to pass values into and out from formats. [See B.9.3 for global vgroups.]
Generally speaking, you must be somewhat more conscious of variable types when you work with variables in formats. All system variables (except $CVAL, $UVAL and $PVAL, which may vary in type) have a specific type associated with them, such as string, integer or flag (the most common ones). Similarly, all user variables available in a format have a type associated with them; user variables may not be created dynamically in formats in the same way as they can in command mode, so they are defined as having some type or another. [See B.9.3.1.]
Remember then that when you assign a value to a variable, that value must be convertible to the type of the variable; otherwise a conversion error will occur and format processing will stop. For example, if INTEGER is an integer variable, the following Uproc will stop the format from processing the current record further:
UPROC = LET INTEGER = 123.45;
Though that particular error is easy to see, it becomes somewhat more obscure when the value being assigned is another variable. For instance, now suppose that the value of the string variable VALUE is "123.45":
UPROC = LET INTEGER = #VALUE;
That still fails, but it would not fail if the value of VALUE were "123", because that value could be converted to an integer.
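The rule can be pictured with a small model in a general-purpose language. This Python sketch is an illustration only, not SPIRES code: the function name assign_integer is invented, and the model assumes that only an optionally signed string of digits counts as convertible to an integer.

```python
# Hypothetical model of assigning a value to an integer variable.
# Assumption: only optionally signed digit strings are convertible.

def assign_integer(value):
    """Return `value` as an int, or raise ValueError if it is not convertible."""
    text = str(value).strip()
    if text.lstrip("+-").isdigit():
        return int(text)
    raise ValueError(f"cannot convert {text!r} to an integer")

print(assign_integer("123"))          # a string value that converts cleanly

for bad in (123.45, "123.45"):        # a literal and a string value that do not
    try:
        assign_integer(bad)
    except ValueError as err:
        print("conversion error:", err)
```

In the model, as in the text above, what matters is the value at the time of assignment and not the type of the variable it came from.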
Some format statements allow variables as values, such as MAXROWS. The type of the variable used for such statements often does not matter, as long as the substituted value when the statement is executed is convertible to the type the statement expects. For example, a string variable may be given for the value of the MAXROWS statement as long as the variable's value is convertible to an integer when the statement is executed. [See B.4.6.6.]
Below is a list of system variables that are useful in output formats, each with a brief description. Variables used primarily within report formats are listed later. [See B.10.10.] Details about all the variables are provided later in the manual. [See E.2.]
The variable type is given in parentheses after the variable name.
The first group of variables is reset each time a new label group is entered:
$UVAL (varies)   - the unconverted (i.e., before OUTPROCs or
                   INSERTs) value of the label group.
$CVAL (varies)   - the converted (i.e., after OUTPROCs or INSERTs)
                   value of the label group.
$PVAL (varies)   - the unconverted value of the previous label group.
$ULEN (int)      - the length of the unconverted value.
$CLEN (int)      - the length of the converted value.
$ELOCC (int)     - the number of occurrences of the element specified
                   by GETELEM.
$LASTOCC (int)   - the occurrence number of the final occurrence of
                   an element specified by GETELEM (usually $ELOCC-1),
                   numbered from 0.
$LOOPCT (int)    - value of the LOOP counter for the current element,
                   beginning at "0" for the first time through.
$DEFAULT (flag)  - set when a DEFAULT statement is coded and the
                   GETELEM or IND-STRUCTURE statement fails to
                   retrieve an element in the record.
$PROCERR (flag)  - set when an S- or E-level error is returned
                   from OUTPROC or INPROC execution.
$APROCERR (flag) - set when any error is returned from OUTPROC
                   or INPROC execution.
$REPEAT (flag)   - can be set, causing SPIRES to duplicate the
                   output value to fill the available space for it.
$SKIPROW (flag)  - can be set, causing SPIRES to double space
                   the value as it is output.
$LABEL (string)  - the name of the currently executing label group.
The variables below are not reset when a new label group starts executing:
$CROW (int)      - the current row position in the frame, i.e., "*".
$CCOL (int)      - the current column position, i.e., "*".
$SROW (int)      - the starting row of the most recently positioned
                   value in the frame.
$SCOL (int)      - the starting column of the most recently
                   positioned value in the frame.
$LROW (int)      - the number of the highest row used in the frame
                   (i.e., X-1).
$LCOL (int)      - the number of the last column processed (X-1).
$NROWS (int)     - the number of rows assigned to the current frame.
$NCOLS (int)     - the number of columns assigned to the frame.
$RECNO (int)     - the number of records processed so far, including
                   the current record, under this command.
$FREC (flag)     - set when the first record is being processed
                   during a multiple-record processing command.
$GPROCERR (flag) - set when an S- or E-level error has occurred
                   during INPROC or OUTPROC execution.
$ABORT (flag)    - set when an ABORT or STOPRUN Uproc is executed;
                   useful only outside of the format.
$PARM (string)   - the parameter list from the SET FORMAT (or SET
                   GLOBAL FORMAT) command.
Many more variables may be useful within output formats -- this is just a list of those that are particularly useful there.
As soon as your formats involve more complicated processing than just "GETting an ELEMent" and "PUTting the DATA" into the output grid, you will find that user variables are indispensable. Within the format, they are useful for holding values across label groups and frames. (For example, you might want to concatenate elements together and position them as a single value in the frame, requiring you to hold the earliest elements in a variable as you retrieve the later ones.) In complex applications where protocols and formats must work together, user variables may pass values back and forth between the protocols and formats.
Although you can create variables dynamically in command mode, you must usually predefine, compile and allocate them if you want to use them in a format. This process, called creating a "vgroup" (for "variable group"), can be specified completely within the format definition. Such vgroups are called "local vgroups". Alternatively, you may define and compile the vgroup separately, with only the allocation statement appearing within the format definition. That type is called a "global vgroup", because its variables could be referenced by other format definitions or by protocols.
Whether the vgroups you want to use in a format are local or global or a mixture (you may use multiple vgroups in a format), the vgroup definitions are similar. Complete information on the statements in a global vgroup record is given in the manual "SPIRES Protocols", section 4.2.1.1. Since the statements for a local vgroup in the format definition are practically all the same as for a global vgroup, only the basic rules for them are given in this manual, appearing in the next section. [See B.9.3.1.] The brief section that follows will discuss the ALLOCATE statement. [See B.9.3.2.]
To create a local vgroup, you must first define it in the Vgroups section of the format definition. That section begins after the identification section, immediately following the RECORD-NAME statement.
Multiple vgroups may be defined in this section, each vgroup beginning with the VGROUP statement (see below). Any variable in any vgroup in the format definition may be used in a format, as long as the format has an ALLOCATE statement naming that variable's vgroup. [See B.9.3.2.]
A vgroup definition contains from 1 to 256 variable definitions under an umbrella name:
VGROUP = vgroupname;
where "vgroupname" is a name from 1 to 16 characters long, which may include alphanumeric characters or the special characters period (.), hyphen (-) or underscore (_). It may be prefixed by the format definer's account number, which can extend its length to 23 characters, but that is not considered good practice. For example,
VGROUP = DISPLAY.VARS;
After the VGROUP statement and COMMENTS statements (optional) come the individual variable definitions, each of which begins with a VARIABLE statement that names the variable:
VARIABLE = variablename;
Here "variablename" is a name from 1 to 16 characters long. Generally speaking, it should contain only letters and numbers.
Other statements describing the variable may follow:
The OCC statement declares the number of occurrences that the variable may have. The default, if no OCC statement is coded, is one occurrence for each variable. (Multiple occurrences of a variable are individually referenced by an index: the variable name is followed by "::n" where "n" is a number from 0 to 32,767 representing the desired occurrence, counting from 0, or "n" is an integer variable representing such a number. Alternatively, the "::INDEX" feature can be used to handle such references.) The form of the OCC statement will be different for two- or three-dimensional arrays. [See "SPIRES Protocols", Section 4.3.]
A separate length statement specifies the length in bytes of each occurrence. It must conform to the restrictions inherent in TYPE. Default lengths are provided for each type, as shown in the TYPE chart.
The TYPE statement, which is probably the most important of them, describes the type of value that the variable will represent. The allowed types (and their allowed and default lengths per occurrence) are:
Type       Allowed lengths   Default lengths
STRING     1-32,765*         80
INTEGER    1, 2, 4           4
REAL       4, 8              4
PACKED     1-16              4
FLAG       1                 1
CHAR       1-256             16
LINE       4                 4
HEX        1-32,767*         4
DYNAMIC    1-32,765          0

* The upper limit applies to singly occurring variables; if the
  variable occurs more than once, the upper limit is 253 for
  string variables and 255 for hex.
The total size of ALL variables in a single vgroup cannot be greater than 65,536 bytes. You can define multiple vgroups if you need more variable storage than that.
The VALUE statement lets you assign initial values to the occurrences of the variable, one value per occurrence as given in the OCC statement. The given values, each a string, will be converted to the appropriate type during compilation. If special characters appear in any strings, including blanks or commas, those strings should be enclosed in apostrophes. No single value should exceed 255 characters in length.
For flag variables, you may use the values $TRUE and 1 to represent "true", and $FALSE and 0 to represent "false".
To assign the same value to multiple occurrences in a row, you can put the value in parentheses, preceded by the number of occurrences to be assigned that value. For example,
VALUE = '', 26('ABC');
The first occurrence of the variable will have a null value, but the next 26 will have the value ABC.
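The repetition shorthand can be pictured as a simple expansion. This Python sketch is an illustration only, not SPIRES code: the function name expand_initial_values is invented, and as a simplification the model handles only values enclosed in apostrophes.

```python
# Hypothetical model of expanding a list of initial values in which an item
# of the form n('text') stands for n consecutive occurrences of 'text'.
import re

def expand_initial_values(spec):
    """Expand items like 26('ABC') into repeated values; other items count once."""
    values = []
    # Each item is either n('text') or a single quoted 'text'.
    pattern = r"(\d+)\('([^']*)'\)|'([^']*)'"
    for count, repeated, single in re.findall(pattern, spec):
        if count:
            values.extend([repeated] * int(count))
        else:
            values.append(single)
    return values

occurrences = expand_initial_values("'', 26('ABC')")
print(len(occurrences), repr(occurrences[0]), occurrences[1])
```

For the example above the expansion yields 27 values in all, one null value followed by 26 copies of ABC, so the variable would need at least that many occurrences.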
Other useful statements include COMMENTS (just like other COMMENTS statements in a format definition), SITE (limiting the vgroup to a particular type of site, such as CMS or STS), REDEFINE, INDEXED-BY and DISPLAY-DECIMALS. For more information about the last four statements, see section 4.1.1 of the manual "SPIRES Protocols", or issue an EXPLAIN command.
A typical vgroup definition might look like this:
VGROUP = LOCAL;
VARIABLE = DOCS.ORDERED;
TYPE = STRING;
OCC = 10;
VARIABLE = NUMBER.ORDERED;
TYPE = INT;
OCC = 10;
VARIABLE = ANY.ORDERED;
TYPE = FLAG;
VALUE = 1;
Dynamic variables may be used in a format if they are handled in either of the following ways:
- the dynamic variables are defined in a vgroup definition as type DYNAMIC; or
- they are handled only by the system functions $DYNPUT and $DYNGET.
You cannot create a dynamic variable in a format simply by referencing it in a LET Uproc, as you might do with a LET command in a protocol. That is, the following Uproc will not compile properly unless the variable TOTAL is in a defined vgroup:
UPROC = LET TOTAL = $CVAL * 12;
Dynamic variables in formats may be useful in the following situations:
- 1) when an array has an indefinite number of occurrences (and the range of possible occurrences is large, e.g., there may be one occurrence or 100, but you have no way of knowing ahead of time) and/or each occurrence has an indefinite length (and the range of possible lengths is similarly large);
- 2) when you will use only widely separated occurrences of an array (e.g., you might place values only in occurrences 1, 50 and 5000-5020 of an array, but still need the large occurrence numbers);
- 3) when a vgroup is too large as defined (if a vgroup is too large when it is compiled, the error message VGROUP TOO LARGE -- TRY DYNAMIC TYPE will appear); or
- 4) when you want to pass values to USERPROCs (which may not use static variables for communication outside the USERPROC) or to other formats that do not or may not allocate the same vgroups as the current format.
[See the manual "SPIRES Protocols" for more information on using dynamic variables.]
To use a vgroup in a format you must allocate it, i.e., tell SPIRES to allocate memory for its variables. The ALLOCATE statement is placed in the format declaration section, after the FORMAT-NAME statement:
ALLOCATE = vgroupname [, HIDDEN, TEMPORARY, CONTROLLED];
For "vgroupname", if the vgroup is a local vgroup (that is, its definition appears in the format definition), then the name given in the VGROUP statement should be specified here. If the vgroup is a global vgroup, you must specify the vgroup name including the account number in one of the following forms:
ORV.gg.uuu.vgroupname
@gg.&uuu.vgroupname
&uuu.vgroupname
If the name appearing in the ALLOCATE statement is the name of both a local vgroup and a global vgroup, the local vgroup will be used.
If you have multiple vgroups to be allocated for a single format, you can code multiple ALLOCATE statements. As many as 16 vgroups are allowed per format.
The HIDDEN, TEMPORARY, and CONTROLLED options are explained at the end of this section.
The vgroup that you name in your ALLOCATE statement is allocated when the format is set, and cleared from main memory when the format is cleared. If you want the global vgroup that was allocated in your format to remain in memory when the format is cleared (e.g., to use the values of the variables with another format) you must issue an ALLOCATE command sometime before clearing the format, perhaps before setting the format in the first place. SPIRES then realizes that the global vgroup is not completely tied to the format, and will let it remain when the format is cleared. [See the manual "SPIRES Protocols" for information on the ALLOCATE command.]
In most cases, when a format is set, a user of the format may examine or change variable values in the format's allocated vgroups -- the variables are not reserved exclusively for the format's use. The user may examine variable values by issuing SHOW STATIC VARIABLES or SHOW EVAL commands, for example; the first command displays the values of all variables in all allocated vgroups, while the second can be used to show the value of individual occurrences of variables. [See "SPIRES Protocols" for details on these commands.]
However, you can use the HIDDEN option on the ALLOCATE statement if you need to "hide" a vgroup from users of the format. When a vgroup is hidden, it cannot be seen with commands such as SHOW ALLOCATED or SHOW STATIC VARIABLES -- not even by the format's owner. Furthermore, its variable values can't be accessed or changed by any protocol or interactive command. The only access to the variables is through the format containing the ALLOCATE statement, or (for global vgroups) through load formats associated with the format.
Perhaps the main benefit of the HIDDEN option is that it allows two or more formats to be set simultaneously on different paths, and to access the same global vgroup, without any of the variables clashing. This is useful for system formats such as $REPORT and $PROMPT, and for formats generated from these system formats. Thus, if you generate a report format from $REPORT statements, for instance, you may find a hidden vgroup allocated in the format definition. [See B.10.] If you decide you need to alter that vgroup, you may remove the HIDDEN option.
The TEMPORARY and CONTROLLED options on the ALLOCATE statement let you define working areas for local vgroups dynamically, and offer you ways to more effectively manage core for your application. These two options are allowed only for local vgroups.
If you use the TEMPORARY option, the vgroup will be allocated and initialized each time the format is called (or each time the format is entered if the vgroup is in a load format). The vgroup will then be discarded when the format or load format is exited. Because a temporary vgroup is cleared from memory when the format is left, it will not be shown when you issue a command such as SHOW STATIC VARIABLES.
A CONTROLLED vgroup will be assigned only when an ALLOCATE Uproc is executed, and it will be discarded by a DEALLOCATE Uproc (or when the format is exited, if the vgroup is also TEMPORARY). The syntax of these Uprocs is simply:
UPROC = ALLOCATE vgroupname;
UPROC = DEALLOCATE vgroupname;
For example, you may want to define a vgroup that is used only during format startup. Your ALLOCATE statement could include the CONTROLLED option, and your startup frame could include ALLOCATE and DEALLOCATE Uprocs. In this case, the vgroup's memory space would only be assigned during startup.
Variables in global vgroups that were allocated by some means other than the format's own ALLOCATE statement can be accessed, though only indirectly, via the $STATPUT and $STATGET functions. The $STATPUT function assigns a value to a static variable, and the $STATGET function retrieves one. More information about them can be found in the online manual "SPIRES Protocols". [EXPLAIN $STATPUT FUNCTION.]
By the way, $STATPUT and $STATGET cannot be used to find values of variables in hidden vgroups (see above).
Sometimes during format execution, you may want to reset individual variables or even entire vgroups back to their original allocated values. For a single variable, the LET Uproc is commonly used:
UPROC = LET COUNTER = 0;
Some system functions are particularly useful in this regard:
- $VGROUPINIT (or $VINIT), which re-establishes the values that the named vgroup had when it was first allocated;
- $ASET, which assigns a given value to all or a subset of the variable occurrences in the named array;
- $DYNASET, which assigns a given value to all or a subset of the occurrences in a dynamic array;
- $DYNPUT, which assigns a given value to a dynamic element; and
- $DYNZAP, which eliminates the named dynamic element.
These functions are discussed in detail in the manual "SPIRES Protocols".
To use these functions from within a format, you must either assign their returned value to a variable with a LET Uproc, e.g.,
UPROC = LET RETURNED = $VINIT(GQ.DOC.LOCAL);
or use the EVAL Uproc:
UPROC = EVAL $VINIT(GQ.DOC.LOCAL);
The EVAL Uproc tells SPIRES to evaluate the expression that follows but to discard the returned value. Its syntax is:
UPROC = EVAL expression;
where "expression" may consist of functions, variables and strings. Usually functions whose purpose is to execute some process (such as resetting a vgroup) rather than return a value comprise the expression in an EVAL Uproc. Since the action of the $VINIT function (rather than the value returned from it) is what is important to us, the EVAL Uproc lets us execute the function while avoiding unnecessary variable assignments. The other functions named above may also be used with the EVAL Uproc.
Using the material discussed so far in this manual, you could construct a format to display an entire set of goal records, perhaps for a printed copy of your data. However, as attractive as the format might be, you would probably still notice that its "hardcopy" was not likely to be confused with, say, a hand-typed report.
For instance, your output would not have page numbers and would probably not have introductory information on the first page; records would be split across page boundaries; there would be no summary information reporting total or average values across the entire set of records.
A "report format" can add these and other extra features, making your output a more attractive and informative document. Report formats have even been used to create camera-ready copy for catalogs and bibliographies.
In general, reports are created (using SPIRES formats language or one of the easier methods listed elsewhere in this section) to accomplish one or more of the following tasks:
- to cause proper page layout, including:
- creating carriage control;
- adding "header" and/or "footer" information to appear at the top or bottom of each page of the report; and
- preventing records from splitting across page boundaries;
- to provide introductory ("initial") or concluding ("ending") material;
- to provide summary information, such as totals, counts, or averages, across all the records or across groups of records;
- to create indexes or other tables from the record data.
There is more than one way to create a report definition in SPIRES, and only the most complex reports require you to code your own SPIRES format by hand. Below, the different methods to create a report in SPIRES are listed in order from the easiest to the most complex (where the most complex method is also the method offering the greatest freedom to customize the way that the report executes):
The utility Report Definer lets you create an effective tabular report complete with simple summaries and page-layout control, simply by filling in a series of screens with your report specifications -- only the most basic SPIRES knowledge is needed. [See Part C of "SPIRES Searching and Updating" for details.]
The system format $REPORT lets you make table-like reports of more complexity than Report Definer offers. Initial, ending, header, footer and summary information are all relatively easy to specify, in commands issued interactively or stored online. [Like Report Definer, $REPORT is discussed in Part C of "Searching and Updating".]
Although you must learn a special subcommand vocabulary to use $REPORT, you do not need to code a report format -- thus for many (probably most) reporting needs, you can use Report Definer or $REPORT and bypass the rest of this chapter entirely.
You can also generate a $REPORT definition into a SPIRES report format, by issuing the GENERATE FORMAT command when your $REPORT format is set on its selected subfile, and following the instructions given. The GENERATE FORMAT command creates a format record for you, using your $REPORT commands, and adds the record to the FORMATS subfile -- all you need do then is compile it. [Like the utilities discussed above, $REPORT code generation is discussed in Part C of "Searching and Updating".]
$REPORT definitions usually execute more efficiently in generated form, so even users who have no interest in the report format language should benefit from generating their $REPORT definition. In addition, a $REPORT-generated format provides a good starting point if you decide you want to customize the report format further.
You can also code a report format from scratch if you prefer, using the techniques described in the upcoming pages.
The rest of this chapter assumes that you want either to write your own report format from scratch or, preferably, to modify a format generated from $REPORT. In other words, this chapter is for developers who need a more customized format than even $REPORT provides.
A report format is basically an output format whose additional facilities (e.g., headers) you provide by coding some additional frames. When a report format is set and you have established "report mode" by issuing the SET REPORT command or Uproc (or by using the WITH REPORT prefix), [See B.10.2.] any commands causing multiple record output will produce more sophisticated results than in a standard output format.
Specifically, any frame whose frame-type is initial, group-start (summary), group-end, or ending, will be executed at the appropriate time. Also, if the command is prefixed with the IN ACTIVE or "IN file" prefix, header and footer frames will be executed, and automatic page control, via carriage control characters in column 1, will be provided.
All of these new frame-types are optional; you should use only those you need. Below is a brief description of each one:
The diagram below shows how these frames might appear in a typical report for five records:
Page 1
+--------------------------+
| |
| Initial |
| Frame |
| |
+--------------------------+
Page 2
+--------------------------+
| Header |
| |
| - - - - - - - - - - - - -|
| Group-Start |
| - - - - - - - - - - - - -|
| Data (Record 1) |
| - - - - - - - - - - - - -|
| Data (Record 2) |
| - - - - - - - - - - - - -|
| Group-End |
| - - - - - - - - - - - - -|
| Group-Start |
| - - - - - - - - - - - - -|
| Data (Record 3) |
| |
| - - - - - - - - - - - - -|
| Footer |
+--------------------------+
Page 3
+--------------------------+
| Header |
| |
| - - - - - - - - - - - - -|
| Data (Record 4) |
| - - - - - - - - - - - - -|
| Group-End |
| - - - - - - - - - - - - -|
| Group-Start |
| - - - - - - - - - - - - -|
| Data (Record 5) |
| - - - - - - - - - - - - -|
| Group-End |
| - - - - - - - - - - - - -|
| Ending |
| - - - - - - - - - - - - -|
| Footer |
+--------------------------+
The diagram shows that the initial frame is put on a page by itself. The initial page is the only one that does not by default allow a header and/or footer frame on it.
The group-start and group-end frames are each shown here three times, presumably because in the five records, the first two have the same value for the break element of these frames, the next two have a different value and the last has yet a different value. The group-start frame is also executed before the first data frame, that is, when the first record is retrieved for processing. Similarly, the group-end frame is executed after the last record is processed.
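The logical order of the frames, ignoring page boundaries and the header and footer frames, can be modeled outside SPIRES. The Python sketch below is only an illustration of the rules just described, not SPIRES code; the function name `frame_sequence` is hypothetical, and it assumes a single break element and at least one record.

```python
def frame_sequence(break_values):
    """Return the logical order of report frames for a list of records.

    break_values holds the break-element value of each record, in the
    order the records are processed. Headers, footers and paging are
    deliberately left out of this model.
    """
    frames = ["initial"]
    previous = None
    started = False
    for value in break_values:
        if not started:
            # Group-start always runs before the first record,
            # even if that record's break value is null.
            frames.append("group-start")
            started = True
        elif value != previous:
            # A changed break value closes one group and opens the next.
            frames.append("group-end")
            frames.append("group-start")
        frames.append("data")
        previous = value
    if started:
        # Group-end runs once more after the last record.
        frames.append("group-end")
    frames.append("ending")
    return frames


if __name__ == "__main__":
    # Five records: two share one value, two share another, one stands alone.
    for frame in frame_sequence(["A", "A", "B", "B", "C"]):
        print(frame)
```

With the five break values shown, the model yields three group-start and three group-end frames, matching the diagram.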
Many system variables relate to report formats, from the flag variable $REPORT (which tells whether report processing is in effect) to the integer variable $LLEFT (which tells how many lines are remaining on the page). The values of some of these variables are maintained by SPIRES; others are set by you, often in the startup or initial frames. Most of the variables will be discussed as they are needed; all will be listed in a later section. [See B.10.10.]
In addition to system variables, almost all report formats employ user variables. If you are not familiar with their use in formats, please read the earlier chapter. [See B.9.3.]
Other tools useful for report formats, though not necessarily restricted to them, include:
- subgoal processing and phantom structures [See B.12.]
- the FOR INDEX command and the $PATHKEY system variable [See the manual "Sequential Record Processing in SPIRES: Global FOR", section 2.14.]
- the SPISORT facility and element occurrence paths [See the manual "SPIRES Technical Notes", section 1.]
- sort arrays. [See B.10.7.3, B.10.8.]
Those are the basic additional tools that may be useful in a report format. Section B.10.2 provides details on the SET REPORT command, and the remaining sections in this chapter discuss these tools in detail. A complete report format definition and some pages of sample output appear at the end of this chapter. [See B.10.11, B.10.12.]
"Timing problems" are probably the major concern in creating a report format -- the order in which the frames were shown in the diagram is not necessarily the order in which they would be executed. Timing problems may include frames splitting across pages when that is not desired, or counter variables that are not incremented properly because they are not coded in the right place. Understanding how SPIRES determines which frames should be executed when can be helpful in solving such problems.
Suppose that the report diagrammed in the last section is being produced. Five records will be processed, and they are sequenced by CITY. The first two records have the value ALTOONA, the next two have BILOXI, and the fifth has CHATTANOOGA. The records are in a stack, and the command IN ACTIVE TYPE has just been issued. Both the SET FORMAT and SET REPORT commands had already been issued.
First, assume the frames all have fixed frame dimensions. The first event triggered by the command is the execution of the initial frame. Besides creating title page type information, the initial frame in this example sets some system variables, including the number of lines per page, the number of lines that the header will need and the number of lines that should be reserved for the footer. [See B.10.4, B.10.5.] When the initial frame is finished executing, the buffer in which it was constructed is flushed; the output goes to the active file.
Although the first data page begins with a header, the header frame does not execute next, because no data for that page has been created yet. Instead, SPIRES accesses the first record, retrieving the CITY element to determine whether its value is different from that of the previous record. There was no previous record, but the retrieved value is different from no value, so the group-start frame executes next. (Note though that the group-start frames will always execute before the first record is processed, even if the retrieved break-element value is null.)
As it executes, SPIRES keeps track of the number of rows left on the page, lowering the number as the row represented by "X" (the "next row" or $LROW+1) comes down the page. Although the header frame has not been executed yet, SPIRES has subtracted the number of lines allowed for it (specified in the initial frame) from the number of lines left on the page so that it has an accurate reflection of the data lines left.
Because SPIRES has accessed the first record before executing the group-start frame, the group-start frame may retrieve element values from that record. For example, the CITY value could be retrieved so that the group-start frame could announce which city is represented by the following records. More than one group-start frame could execute at this point, depending on how many "break elements" and their respective group-start frames you have coded. [See B.10.7 for the order in which they would execute.]
After the first group-start frame is executed, SPIRES realizes that it cannot be put on the page until the header has been placed there, so execution of the header frame begins. (SPIRES can only place data at the end of the active file, so the header must be placed there before the group-start buffer.) A separate buffer is established for the header frame, and the header data is placed within it. When the header frame has executed, its buffer is flushed to the active file, followed by the group-start frame buffer. It is not common for SPIRES to have multiple format buffers in memory at once, but in this situation it is necessary.
(Why isn't the header executed before the group-start frame so that multiple buffers are not necessary? For one, the header frame may need data from the first record to display as a guide word at the top of the report page. If it is executed before the group-start or a data frame, no record would be available from which to provide such values to the format. More importantly though, a new header is only created when a buffer of data for the new page is waiting to be output.)
Next the data frame is executed, processing the first record. Presumably this format has only one data frame, but there could be more -- each is flushed to the active file when execution completes, while SPIRES keeps track of the number of lines left on the page. (See below for what happens when that number becomes 0.) At this point, the first data record has been completely processed.
The second record is retrieved, SPIRES notes that its CITY element has the same value as the previous record's, and the data frame is executed again. When the third record is retrieved, the CITY element is found to be different, which triggers the group-end, followed by the group-start frame again. Next the data frame is executed for the third record.
At some point, the end of the lines on the page that are available for data will be reached. This will happen when SPIRES is flushing the buffer after completing execution of a data, group-start or group-end frame. Either the buffer will just fit with no lines left over or the buffer will be too large and only part of it can be put on the page.
In the case in which it just fits, SPIRES will execute the footer frame and place it on the page. It will then retrieve the next record and continue almost as it did at the start of the first page: it will execute the data, group-start or group-end frame (whichever is next), but the header frame will not be executed until one of those frames is about to be put on the page.
In the case in which the buffer is too large, SPIRES will place as many rows of the buffer in the remainder of the page as possible, unless you have told SPIRES not to place a buffer at the bottom of a page when it will not completely fit there. [See B.10.3.4.] When as much has been placed there as possible, SPIRES executes the footer for the page; then, knowing that it has more data to put on the next page, SPIRES executes the header frame for the new page and then finishes emptying the buffer.
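The two cases above (a buffer that just fits, and a buffer that is too large) can be illustrated with a small model. The Python sketch below is not SPIRES code and simplifies heavily: the name `paginate` and its parameters are hypothetical, each buffer is reduced to a frame name and a row count, buffers are always allowed to split, and a final partial page is simply closed with a footer.

```python
def paginate(buffers, lines_per_page, header_rows, footer_rows):
    """Distribute frame buffers over pages.

    buffers is a list of (frame_name, row_count) pairs in flush order.
    Returns a list of pages; each page is a list of (name, rows) pairs.
    """
    data_lines = lines_per_page - header_rows - footer_rows
    if data_lines <= 0:
        raise ValueError("no room left for data on the page")
    pages = []
    current = None  # page being filled, or None if no page is open
    left = 0        # data lines still free on the open page
    for name, rows in buffers:
        remaining = rows
        while remaining > 0:
            if current is None:
                # The header runs only now, when a buffer is waiting
                # to be placed on the new page.
                current = [("header", header_rows)]
                left = data_lines
            take = min(remaining, left)
            current.append((name, take))
            remaining -= take
            left -= take
            if left == 0:
                # Page is full: run the footer and close the page.
                current.append(("footer", footer_rows))
                pages.append(current)
                current = None
    if current is not None:
        current.append(("footer", footer_rows))
        pages.append(current)
    return pages


if __name__ == "__main__":
    for number, page in enumerate(
        paginate([("group-start", 1), ("data", 4), ("data", 4)], 10, 2, 1), 1
    ):
        print(number, page)
```

Note that the header for a new page is created inside the loop, only after a buffer with rows still to place has been found; that mirrors the point that a header is never executed for a page with no data waiting.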
This procedure continues from one record to the next, from one page to the next. When the final record has been processed by the data frame, the group-end frame executes, followed by the ending frame. When the final ending frame is flushed to the page, followed by the final footer, the report processing ends.
If you press the ATTN/BREAK key during report format processing, SPIRES will stop processing records, but the next group-end, footer and ending frames will still be executed to finish the report.
Any or all of the report frames may be established as line-by-line except for headers and footers. [See B.3.3.] The procedure described above is the same for line-by-line frames except that the buffer is made up of only one row, which is flushed when data is placed in the next row, meaning that the buffer is flushed much more often. Also, footer and header frames may execute when only part of a data, group-start or group-end frame has been executed.
Line-by-line processing restricts your ability to have SPIRES prevent records from splitting across pages. By the time SPIRES knows a record is too large to fit, some of it is already on the page. Line-by-line processing also prevents you from holding groups of records in the buffer at one time to see whether the entire group will fit on the remainder of the page, if that is a consideration. [See B.10.3.7.]
Report mode processing can occur only when report mode has been enabled. Report mode is enabled in one of the following ways:
- 1) by issuing the SET REPORT command after a format has been set.
- 2) by including the SET REPORT Uproc in a startup frame, which is executed when the format is set. [See D.3.]
- 3) by adding the WITH REPORT prefix to the TYPE, DISPLAY or XEQ FRAME command.
The syntaxes of the SET REPORT command and SET REPORT Uproc are:
SET REPORT
UPROC = SET REPORT;
The command may be issued whenever a format is set, though it is most often issued after a format designed as a report format is set (see below). The Uproc may only be issued in a startup frame or in the format declaration section Uprocs associated with the startup frame. [See B.5.2, D.3.]
Report mode affects only those record output commands that can process multiple records: TYPE, OUTPUT, SCAN and some forms of DISPLAY under Global FOR (DISPLAY ALL, DISPLAY REST and "DISPLAY n", where "n" is greater than 1). Even if these commands only process one record, report mode will be in effect; the number of records that are actually processed is irrelevant. When the record-processing command is issued, the frames related to report processing will be executed appropriately, in addition to the data frames.
Other record output commands (other forms of the DISPLAY command, TRANSFER) are not affected by report mode -- only the data frames in the format will be executed for them.
If the format does not have output frames, the SET REPORT command will have no effect. If no format is set (i.e., the SPIRES standard format is in effect), the SET REPORT command will return the message NO FORMAT SET, and report mode will not be set.
To clear report mode, so that the multiple-record processing commands will not cause report mode processing, you can issue one of the following commands:
CLEAR REPORT
CLEAR FORMAT
SET FORMAT format-id
Note that the "SET FORMAT *" command, which causes the startup frame to be re-executed, will not clear report mode, though it may be reset if the SET REPORT Uproc appears within the startup frame.
As an alternative to using SET REPORT to turn on report mode processing for all subsequent record output commands, the WITH REPORT prefix may be added to a single command. WITH REPORT will cause frames related to report processing in the currently set format to be executed appropriately for the specified record-processing command or XEQ FRAME command only.
The WITH REPORT prefix may be added to the TYPE or DISPLAY command, or to the XEQ FRAME command, or to the GENERATE SET command, when a display set is being created. [See 1.9 in SPIRES Technical Notes for an explanation of display sets.] The WITH REPORT prefix may be used in conjunction with other command prefixes, such as IN ACTIVE. Commas may be used to separate prefixes, for clarity, if desired, e.g.:
-> in active clear, with report, display all
What effect does report mode processing have when none of the special report frames exist? The primary result is the appearance of carriage control to cause page ejects when the output is directed to your active file (or some other file data area). The format data is shifted one column to the right (though you do not account for this shift in your data placement) and a "1" (one) is placed in column 1 in the first line and, by default, every sixtieth line thereafter. Nothing other than carriage control characters will be placed in column one. These characters will be used by the printer to cause a new page to start with that line if you include the CC option on the PRINT command. Another effect is that the "****" and ";" lines that normally appear between records in multiple-record displays are suppressed.
The number of lines that will be output per page is controlled by the system integer variable $LINEPP (lines per page). By default, its value is 60, the most common number of lines per page for printers. It can be set to a different value with the SET LINEPP command or Uproc.
SET LINEPP = n
UPROC = SET LINEPP = n;
where "n" is an integer from 0 to 32767. The Uproc is normally coded in the initial frame.
In most cases, the carriage control put out by default for non-report output formats is not very useful. If you are printing your data on one of the printers that by default prints 60 lines per page, the printer does not need carriage control every 60 lines to tell it to start a new page -- it would do that anyway. However, if you want to print a different number of lines per page, say, fewer than 60, it might be handy.
Note that the "1" carriage control symbol will be placed every $LINEPP lines, whether or not that line is in the middle of a record. That is, we have not yet told SPIRES that we do not want records split across pages. [See B.10.3.4 for information about SET NOBREAK.]
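The default carriage control behavior can be pictured with a short model. The Python sketch below is an illustration only, not SPIRES code: the function name `add_carriage_control` is hypothetical, it handles only positive values of the lines-per-page setting, and it assumes that lines without a page eject receive a blank in column 1.

```python
def add_carriage_control(lines, linepp=60):
    """Shift each line one column right and mark page starts with "1".

    A "1" goes in column 1 of the first line and of every linepp-th
    line after it, whether or not that line falls mid-record.
    """
    if linepp <= 0:
        raise ValueError("this sketch handles only positive linepp values")
    marked = []
    for index, line in enumerate(lines):
        control = "1" if index % linepp == 0 else " "
        marked.append(control + line)
    return marked


if __name__ == "__main__":
    sample = ["record line %d" % n for n in range(1, 8)]
    for line in add_carriage_control(sample, linepp=3):
        print(line)
```

Because the model counts lines and nothing else, a "1" can land in the middle of a record, which is exactly the splitting problem described above.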
One of the most useful features of report formats is your ability to control not only the layout of numerous elements for a record but also the layout of numerous records on a page. When designing a report, most people begin with the design of the data frame or frames, keeping in mind how the design would look on a page of paper. Once that is done, consideration is given to the layout of an entire page:
- Should carriage control be provided automatically, or should it be controlled by statements in the format?
- Should multiple records appear on a single page?
- Should individual records be prevented from splitting across page boundaries?
- Should groups of records be prevented from splitting across page boundaries?
- Should page numbers or other information appear at the top or bottom of each page?
- Should the output appear in a single or multiple columns on the page?
In the previous section, you were introduced to the carriage control capabilities established by the SET REPORT command. [See B.10.2.] In this section and its subsections, that topic, as well as the others suggested by the above questions, will be discussed in detail.
A report format may have one or more header frames. A frame is identified as a header frame in the FRAME-TYPE statement in the format declaration section. Whenever SPIRES is ready to flush the frame's buffer and that buffer would be placed on a new page, any header frames are executed in order of declaration, with their output going to the top of the new page. (If part of the buffer would go on the previous page, then the footer frames, if any, would be executed first.) The exception to the rule is that no header is placed on the initial frame's page. [See B.10.4.]
The header frame usually places one or more of the following on the report page:
- the page number
- a guide word, identifying the first record on that page, for instance
- the date of the report
- the title of the report
- column headings for the record data displayed beneath
- a border separating the header from the data
- other informational notes
As with any frame, you must add code to two places in the format definition: 1) you must write a frame definition for the header frame; and 2) you must declare it a header frame to be used by the format in the format declaration section.
The second task is the easier -- you just add two statements in the appropriate format declaration section at the end of the format definition:
FRAME-NAME = header.frame.name;
FRAME-TYPE = HEADER;
where "header.frame.name" is the name of the header frame as given in its FRAME-ID statement.
The frame definition of course depends on your needs. Its frame dimensions must be fixed -- they may not specify line-by-line processing. Note however that the number of rows specified will be placed on the page, whether or not you place data in the last ones. That is different from fixed-dimension data frames, whose last blank lines are not output. All of the report-specific frames (header, footer, initial, ending, group-start and group-end) are the same way. If you do not intend to put data in one or more of the last rows of the header frame, you can use the SET HDRLEN Uproc to prevent SPIRES from putting out the extra blank rows. [See E.2.2.5 for more details on its use.] Alternatively, you may use the SET NROWS Uproc to change the number of rows in the current frame, a technique that will work for the other report frame-types too. [See B.3.3.1.]
Some of the header functions listed above can be carried out by label groups such as those shown in the example below. For example, the $UDATE (or $DATE) and $PAGENO variables contain the current date and page number respectively. Other values to be positioned in the header may be character strings, either literals or variables. In any case, they are generally placed into the frame by means of VALUE and PUTDATA statements. Other statements, such as START, MARGINS, TITLE, LOOP and UPROC, are also allowed.
A header frame may also contain GETELEM and IND-STRUCTURE statements, allowing access to elements within the record currently being processed (i.e., the record whose placement on the page causes the header frame to be executed). You could use this feature, for example, to retrieve element values for guide words that appear at the top of the page, like those in a dictionary or phone book. If there is no "current" record (for instance, during execution of an ending frame), then NODATA processing is in effect, meaning that any values specified with a VALUE, TITLE or DEFAULT statement will be displayed, and the $NODATA flag will be set.
Warning: There can be timing considerations that can cause problems if you use GETELEM within header frames. Be careful to honor any structure processing going on in the data frame at the time the header frame is executed, just as you would in the data frame itself. [See B.8.2.] For instance, if a structure is being processed by an indirect frame called by the data frame at the time the buffer is flushed to a new page, then using the GETELEM statement in the header frame to retrieve elements from another structure or from record-level may result in SPIRES losing its position within the processing of the data-frame structure. If this is a problem, you may want to use one of the $GETxVAL functions (like $GETXVAL) to retrieve the values in the header instead, or perhaps even better, do the GETELEM within the data frame and store it in a variable for later positioning in the header frame.
Note that you can also call indirect frames for structure processing or for subgoal processing if you want to access data records from a header frame. [See B.12.]
Below is a typical header frame definition. Parenthesized letters to the left refer to the notes that follow:
FRAME-ID = HEADER;
FRAME-DIM = 3,60;
DIRECTION = OUTPUT;
(A) LABEL = GUIDE.WORD;
GETELEM = TITLE;
START = 1,1;
LENGTH = 20;
OUTPROC = $CAP;
PUTDATA;
LABEL = DATE;
VALUE = $UDATE;
START = 1,30;
OUTPROC = $DATE.OUT(MONTH,,UPLOW,SQU);
PUTDATA;
LABEL = PAGE.NUMBER;
(B) VALUE = 'Page ' $PAGENO;
START = 1;
UPROC = SET ADJUST RIGHT;
PUTDATA;
LABEL = BORDER;
VALUE = '-';
START = 2,1;
UPROC = SET REPEAT;
PUTDATA;
That definition might produce a header that looked like this (row and column numbers are shown for clarity):
    (....v....1....v....2....v....3....v....4....v....5....v....6)
(1)  FLAPJACKS ON PARADE          Aug. 19, 1982          Page 114
(2)  ------------------------------------------------------------
(3)
Line 3 is output as part of the header frame even though no value was placed there.
Notes on the frame definition:
A) This label group shows the technique used when you want an element value from the first record on each page to be displayed in the header. The record being output to a new page, which causes the execution of the header frame, will be accessible by the GETELEM statement.
B) This demonstrates the point made earlier in the manual that VALUE statements must account for the variable type. If "Page" were put in an INSERT statement, then $PAGENO would need to be converted to string by the $STRING function; otherwise, that integer variable would be read as a string, and the proper conversion to string would not take place. As shown, the VALUE statement concatenates a literal string to $PAGENO, which forces $PAGENO to be converted to a string first. [See B.4.3.]
Below are some miscellaneous notes on headers:
Multiple header frames are allowed. They will be executed in the order in which they are declared in the format declaration section.
Often situations arise where you want some text to appear on the first data page of the report, before the data and possibly before the header for that page. One possible solution to this problem is to process that text in a separate header frame that is only executed for the first record. The format declaration section might contain a Uproc such as the following for that frame:
FRAME-NAME = FIRST.REC.HEADER;
FRAME-TYPE = HEADER;
UPROC = IF ~$FREC THEN SET SKIPF;
In other words, if the first record of the report is not being processed, skip this frame. Setting $SKIPF is the most obvious way to prevent a header frame from being placed on the page if one is defined. [See E.2.1.17 for another way, using the SET SUPPRESS Uproc.] If you do not use the SET SKIPF or SET SUPPRESS technique, the buffer established for the frame by the frame dimensions would be placed on the page, even if no data would be positioned in the header frame. [See E.2.1.3, E.2.1.16.]
Whenever the flushing of the buffer would cause data to be placed on a new page, the header frames are executed. If the data frame is doing line-by-line processing or if it has fixed dimensions but the Uproc SET NOBREAK has not been set (meaning that the frame is allowed to split across pages), then the record may continue after the header on the top of the next page. Be aware that if you retrieve an element value to place in the header, it will come from that record (the continuing one) rather than from the first complete record on the page.
At the other end of the page from the header frame(s) can come the footer frame(s). Footer frames serve purposes similar to those of headers. [See B.10.3.1.] They are handled similarly as well -- you must write a frame definition for them and you must add their names, along with the proper FRAME-TYPE statement, to the format declaration section. However, the most significant difference (besides their opposite orientations) is the importance of the variable $FTRLEN (for "footer length"), as opposed to the relative insignificance of $HDRLEN.
Like a header frame, a footer frame is identified as such in the format declaration section:
FRAME-NAME = footer.frame.name;
FRAME-TYPE = FOOTER;
Multiple footer frames are allowed; they will be executed in the order in which they are declared here.
The frame definition itself might look very similar to the one shown for a header in the previous section. One possible difference might be that the border row of hyphens would be placed before the row of information, rather than after it as it appears in the sample header frame.
When constructing a page, SPIRES needs to know the number of lines to reserve for the footer frame or frames. Thus you must issue the SET FTRLEN Uproc, usually in the initial frame, to tell SPIRES how many lines to save:
UPROC = SET FTRLEN = 3;
That Uproc would be used if the footer frame would require three lines, or if three different footer frames required a total of three lines. The value of $FTRLEN should reflect the total number of lines needed for all footer frames. [See E.2.2.6 for more about $FTRLEN.]
The frame dimensions of a footer frame may not specify line-by-line processing. The number of rows specified in the FRAME-DIM statement will be placed on the page, whether or not data has been placed in the last ones. All the report-specific frame-types work this way. However, you can use the SET NROWS Uproc to delete the last unused rows of a report-specific frame-type, if desired. [See B.3.3.1.]
Values are generally placed in a footer frame with the VALUE and PUTDATA statements, with assistance from START, MARGINS, TITLE, and other statements. Just as they are in header frames, GETELEM statements are also allowed, and subgoal access to other records and record-types is allowed through indirect frames. [See B.12.]
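As a sketch of how these pieces fit together, the fragments below reserve two lines for a footer and define a two-row footer frame with the border row above the page number. The frame name FOOTER and the dimensions are chosen only for illustration, and the label groups simply reuse statements from the sample header frame shown earlier; treat them as a starting point rather than a tested format. In the format declaration section:
FRAME-NAME;
FRAME-TYPE = INITIAL;
UPROC = SET FTRLEN = 2;            <--total rows needed by all footer frames
FRAME-NAME = FOOTER;               <--hypothetical frame name
FRAME-TYPE = FOOTER;
And in the frame definition:
FRAME-ID = FOOTER;
FRAME-DIM = 2,60;                  <--fixed dimensions; line-by-line is not allowed
DIRECTION = OUTPUT;
LABEL = BORDER;
VALUE = '-';
START = 1,1;
UPROC = SET REPEAT;
PUTDATA;
LABEL = PAGE.NUMBER;
VALUE = 'Page ' $PAGENO;
START = 2,1;
UPROC = SET ADJUST RIGHT;
PUTDATA;
Because the FRAME-DIM statement specifies two rows, the value given in the SET FTRLEN Uproc is also 2.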
Element values to be displayed in the footer are best retrieved in a data frame and saved in a variable to be used later by the footer frame. The GETELEM statement, used successfully in header frames, may not work as well here, since the record being processed when the footer is executed may be starting on the top of the next page.
A major part of report mode processing is the creation of carriage control in column 1 of the output when the output is directed to the active file (or some other file data area). Rudimentary carriage control is provided in report mode by default and was discussed earlier. [See B.4.2.] However, SPIRES provides you with several options in carriage control, giving you much more flexibility than the default of starting a new page every sixty lines. Those options will be discussed in this and the next few sections. [See B.10.3.4, B.10.3.5, B.10.3.6, B.10.3.7.]
Several important system variables will be mentioned throughout these sections, in particular $LINEPP and $LLEFT. The variable $LINEPP represents the number of lines allowed per page of printed output; the default is 60. The other variable, $LLEFT, represents the number of lines left on the page that can be used for output; its value is derived by subtracting the number of lines written on the page from $LINEPP. It does not include the current row. Thus, if $LLEFT is 1, then one more row past the current one can be used for output. [See E.2.2.3.] Whether report mode processing will create carriage control at all depends on the value of the integer variable $PAGECTL ("page control"). Though any of four values may be specified, the two discussed here are 0 (the default when data goes to the active file) and 4.
When $PAGECTL is 0, carriage control symbols are placed in column one of the data; the formatted data is shifted to the right one column. Header and footer frames, if defined and declared, will be executed and positioned on the pages as appropriate. New pages will be triggered by a "1" in column 1. The other carriage control symbol used by SPIRES is "-" (hyphen) to cause triple spacing.
When $PAGECTL is 4, no carriage control symbols are output automatically. No right-shifting of your data takes place. Header and footer frames, if defined and declared for the format, will be executed and positioned on the pages as appropriate, using the values in the variables $LINEPP, $HDRLEN and $FTRLEN to compute their positions. Extra blank lines, rather than hyphens in column 1, will be used to ensure that the footer appears in the right place.
There are two reasons why you might want to use $PAGECTL = 4. First, you might not want any carriage control at all, and possibly you do not have header or footer frames either. However, you want to take advantage of other report mode features, such as group-start frames, without having to have a reserved column for carriage control.
Second, you might want to create your own carriage control, reserving the first column of your frames for carriage control symbols you place there with VALUE or TITLE statements in your label groups. This also allows you to use some of the other symbols, such as a "+" for overstrike, or "8" for the bottom of the current page. You could use the "+" symbol with underscore characters to underline values, for example. Warning: The variable $LLEFT will not be aware of any carriage control you generate and will be counting each line of output as producing a single line. This may cause problems if you also have headers and footers that SPIRES is trying to place in the proper page positions. Some of these problems may be solved by increasing $LINEPP accordingly. Be sure, if you do that, to restore $LINEPP to its original value in the header frame.
Further details on $PAGECTL, including its effect on initial frames and its other values, appear elsewhere. [See E.2.2.2.]
The SET NOBREAK Uproc can be issued at any point within a fixed-dimension frame to tell SPIRES not to split the frame across page boundaries. Specifically, when the $NOBREAK flag is set, the buffer currently being filled will not be placed on the current page unless enough room for it is available there, according to the value of $LLEFT. Instead, the remainder of the page (except for the footer frame, which is then executed and placed there) will be left blank, and the contents of the buffer will follow the header at the top of the next page.
The SET NOBREAK Uproc takes no value or options:
UPROC = SET NOBREAK;
This Uproc sets the $NOBREAK flag variable. [See E.2.1.2.] It is often placed in the frame declaration in the format declaration section, though it can be placed anywhere in a frame. It applies only to the current buffer being written. For example, if it is set for a data frame, it will not be in effect for any other data frames in the format, unless it is set for them individually.
Setting $NOBREAK will not guarantee that the entire frame will be placed on a single page, merely that if it cannot fit on the remainder of the current page, it will start at the top of the next one. If the frame is too long for the next page, it will break onto the following page. To prevent a frame from being too large to fit on a single page, you need to keep the number of rows low in the FRAME-DIM statement. Note however that a serious error (S808) will occur if SPIRES tries to put data in rows beyond those of the assigned frame.
Another Uproc, HOLD, can be used to keep several frames together so that they are not split across page boundaries. For example, if your format has multiple data frames, and you do not want to allow a page break between frames, you must use the HOLD Uproc. [See B.10.3.7.] If you want to allow your data frame to split across pages but only at certain rows, define it as several data frames instead, with each one having $NOBREAK set.
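For example, a long data frame that may break only between its upper and lower halves might be declared as two data frames, as sketched below. The frame names are hypothetical, and FRAME-TYPE = DATA is assumed here as the declaration for a data frame, since this section does not show one.
FRAME-NAME = PERSON.TOP;           <--hypothetical frame name
FRAME-TYPE = DATA;                 <--assumed frame-type value
UPROC = SET NOBREAK;
FRAME-NAME = PERSON.BOTTOM;        <--hypothetical frame name
FRAME-TYPE = DATA;
UPROC = SET NOBREAK;
Each of the two frames would have fixed frame dimensions. SPIRES could then start a new page between PERSON.TOP and PERSON.BOTTOM but not in the middle of either one, unless one of them is itself too long for a page.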
The EJECT PAGE and SET NEWPAGE Uprocs give you even more control over paging than SET NOBREAK. [See B.10.3.4.] The SET NEWPAGE Uproc tells SPIRES to place the buffer currently being written onto a new page when it is output. What has previously been written into the buffer together with what will be written into it before it is output will be placed on the next page.
The EJECT PAGE Uproc operates somewhat differently. What has already been placed in the buffer before the EJECT PAGE Uproc is flushed, being placed on the current page. Any footer frames would then be executed. The $NEWPAGE variable is automatically set and the remainder of the frame is then executed; data resumes being placed in the now empty buffer. When that buffer is output, it will appear on a new page. Note that when the EJECT PAGE Uproc is executed, the header frames are not -- it is the second writing of the buffer, which will appear on the new page, that triggers them.
The SET FRONTPAGE Uproc specifies that not only do you want a new page, but that the output should print on the front of the paper. When $FRONTPAGE is on and a page eject occurs, the "1" carriage control character is replaced with an "F" and $FRONTPAGE is turned off. SET FRONTPAGE is used in conjunction with (and not instead of) SET NEWPAGE and/or EJECT PAGE Uprocs.
The "F" carriage control character has an effect when you are printing in duplex mode on the 4090 printer in Forsythe. SET FRONTPAGE may be abbreviated to SET FRONT or SET FPAGE, and you can turn $FRONTPAGE off explicitly with a SET NOFRONTPAGE Uproc.
None of these Uprocs has any options:
UPROC = SET NEWPAGE;
UPROC = EJECT PAGE;
UPROC = SET FRONTPAGE;
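Since SET FRONTPAGE works only together with a page eject, the two would usually be coded as a pair. The sketch below assumes a group-start frame named DEPT.START (a hypothetical name) whose groups should each begin on the front of a sheet. It also assumes that SET FRONTPAGE and SET NEWPAGE are acceptable in the frame declaration, as SET NOBREAK and EJECT PAGE are.
FRAME-NAME = DEPT.START;           <--hypothetical frame name
FRAME-TYPE = GROUP-START;
UPROC = SET FRONTPAGE;
UPROC = SET NEWPAGE;
When the buffer for that frame is output, the page eject would carry the "F" carriage control character instead of "1", and $FRONTPAGE would be turned off again.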
Beware of using absolute row numbers in frames with EJECT PAGE Uprocs. If a START statement before the EJECT PAGE Uproc places data in row 9, and a START statement after it places data in row 10, the new page will begin with nine rows of blanks after the header. This happens because the row 10 data goes into row 10 of the empty buffer; there are no longer nine rows of data ahead of it, but instead nine rows of blanks. In this situation, it is better to place values using "X" and "*" for row numbers; after flushing, $CROW is no longer 9 but is now 1.
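The label groups below sketch the safer arrangement. The label names and values are invented for illustration, and the use of "*" to mean the current row follows the suggestion above rather than a tested format.
LABEL = SUMMARY;
VALUE = 'End of summary';
START = 9,1;
PUTDATA;
LABEL;
UPROC = EJECT PAGE;                <--rows 1 through 9 are flushed to the current page
LABEL = DETAIL;
VALUE = 'Details';
START = *,1;                       <--not 10,1, which would leave nine blank rows
PUTDATA;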
Note that the $NEWPAGE flag is set in three ways: 1) by SPIRES during processing of the first record, i.e., when $FREC is set; 2) by SPIRES when the EJECT PAGE Uproc is encountered; and 3) by you when you explicitly issue a SET NEWPAGE Uproc.
The FLUSH Uproc causes the buffer to be sent immediately to the appropriate device (usually the terminal or active file). The buffer is then emptied and subsequent label groups can resume placing data in it. The same warning given about absolute row numbers for the EJECT PAGE Uproc applies to the FLUSH Uproc. [See B.10.3.5.]
In most circumstances, SPIRES automatically flushes the buffer. For instance, the buffer is flushed whenever a frame finishes executing (unless it is an indirect or XEQ frame). So only a few circumstances require the FLUSH Uproc.
One situation is when the HOLD Uproc is used to hold a group of formatted records in the buffer to be placed on the page as a group. The HOLD Uproc tells SPIRES not to flush the buffer when a new frame begins but to wait until the FLUSH Uproc is given. [See B.10.3.7.]
The syntax of the FLUSH Uproc is quite simple:
UPROC = FLUSH;
The SET NOBREAK Uproc tells SPIRES to place the current buffer onto the current page only if there is room for all of it there; otherwise, it is to be placed on a new page. The HOLD Uproc lets you extend this principle to groups of records.
When the HOLD Uproc is in effect, the current buffer is not flushed to the page when SPIRES finishes processing a data frame; instead, the buffer is "held" in memory, and the next data frame is processed (perhaps for the next record), with its data following that of the previous data frame in the buffer. At some point the buffer must be flushed (this may be accomplished in several ways), but you can determine whether the buffer can fit onto the remainder of the given page and, if it will not, you can set the $NEWPAGE flag to force the entire buffer to be placed on a new page.
The most direct way to flush the buffer is to issue the FLUSH Uproc. [See B.10.3.6.] The current buffer is immediately flushed to the page, and execution of the frame continues with the "holding" resumed.
The HOLD Uproc affects all initial, data, group-start and group-end frames that begin execution after the HOLD Uproc is executed. If it is executed within a frame definition, it will not affect that frame.
The HOLD Uproc is "turned off" and the buffer is flushed just before the ending frame begins executing -- in other words, all frames executed after the HOLD Uproc is executed and before the ending frame is executed are affected by HOLD.
The HOLD Uproc does not have any options:
UPROC = HOLD;
The HOLD Uproc may only be used with frames of fixed dimensions. However, the number of rows specified in the FRAME-DIM statement should be high enough to allow for the maximum number of records you expect to be held, and the row positions in START statements should be specified in relative positions ("*" and X values) rather than absolute ones (such as 5 or 10). Because the buffer is not flushed between records, row 1 for the first record is the same row as row 1 for the second record. Coding relative values for starting positions will prevent overlaying records on each other.
If the HOLD Uproc will affect multiple frames (e.g., data and group-start and group-end frames), the FRAME-DIM statement of only the first frame executed after the HOLD Uproc is executed will apply. The other frames must have FRAME-DIM statements, but the particular values will be ignored (see the example below).
Several different events may cause the buffer to be flushed:
- the FLUSH Uproc is encountered;
- the buffer fills up;
- the final record is processed.
After the buffer is flushed, HOLD processing automatically resumes.
Here is the type of situation in which HOLD processing is most useful: Suppose you have a stack of personnel records sequenced by DEPARTMENT. You want to get as many records on a page as possible, but you want all of the records for any given department on a single page. In other words, a page may have records for two or more departments, but all records for those departments should appear on that page. This could be accomplished with group-start and/or group-end frames [See B.10.7.] and the HOLD and FLUSH Uprocs.
At least three frames would be held in the buffer: the group-start frame, the group-end frame and one or more data frames for the records. Because the group-start frame executes before any of the data frames when a new DEPARTMENT value is detected, that frame definition contains the large frame dimensions:
FRAME-ID = DEPT.START;
FRAME-DIM = 200,60;
The FRAME-DIM statements of the other frames are ignored by SPIRES when a HOLD Uproc is in effect, so their values are not important, but they must not be omitted.
The HOLD Uproc will apply to the next frame executed, not the current one, so it may be added either to the initial frame or to the group-start frame declaration in the format declaration section:
FRAME-NAME = DEPT.START;
FRAME-TYPE = GROUP-START;
UPROC = HOLD;
Thus, HOLD gets set right before the group-start frame is executed.
The buffer should be flushed before the group-start frame is executed again, probably at the end of the group-end frame:
FRAME-ID = DEPT.END;
FRAME-DIM = 0,0;
...
LABEL;
UPROC = FLUSH;
UPROC = RETURN;
Though you might have considered placing the FLUSH Uproc just ahead of the HOLD Uproc in the group-start frame declaration, FLUSH is not allowed there; it must be within a frame definition.
The only other requirement is to tell SPIRES to begin the buffer on a new page if it is too large to fit in the rest of the current page; this is best done by setting the $NOBREAK flag.
FRAME-ID = DEPT.START;
FRAME-DIM = 200,60;
LABEL;
UPROC = SET NOBREAK;
Of course, this implementation does not guarantee that all records for one department will fit on one page. However, records for such departments will each begin on a new page. The point of this design is to prevent a group of records from starting at the bottom of a page (i.e., after other groups of records) if they will not all fit on that page.
The HOLD Uproc is also useful in input formats for processing multiple records in SPIBILD. [See C.9.1.]
A report format (and sometimes a display format) occasionally needs to know if any data has been output beginning at a certain point in a record. For instance, a format may want to output a blank line before and after a group of fields, but want to avoid putting out two blank lines if no fields in the group occur.
The Uproc SET NOPUTFLAG and the variable $PUTFLAG together help you determine whether data has been output in a particular frame, beginning at a point in the frame that you yourself specify.
$PUTFLAG is set to $FALSE at the beginning of any frame (except a header frame, an indirect frame, or an overflow frame in Prism). The variable is set to true as soon as any PUTDATA statement or TITLE statement executes within the frame. Equally important, you can set its value to $FALSE yourself by issuing the SET NOPUTFLAG Uproc.
Thus, to check a group of fields to see whether any PUTDATA statements were executed for the group, you can code SET NOPUTFLAG before beginning the group and check the value of $PUTFLAG at the end of the group. (This "group of fields" is often coded as an indirect frame, [See B.8.1.] which is why the value of $PUTFLAG is not reset at the beginning of an indirect frame.)
Another use of the SET NOPUTFLAG/$PUTFLAG duo is to tell a header frame (or an overflow frame in Prism) whether data has been output beginning from some specified point. For instance, consider a bibliographic report where author, title, and publisher are treated as a unified data segment. You don't care if part of this segment is output on a page following the rest of the segment, but you do want your header to know, when it executes, whether the segment is beginning or is continuing from a previous page.
To declare that the data segment is just beginning (and that no part of it has been output), you can set $PUTFLAG to false with the SET NOPUTFLAG Uproc, coding it to execute just before the series of label groups defining your segment of data will begin to execute. If the header or overflow frame executes after SET NOPUTFLAG and before the label groups in the segment, it will know from $PUTFLAG that the segment has not yet begun.
For instance:
FRAME-ID = AUTHORTITLE;
:
LABEL = PRE.SEGMENT;
UPROC = Set NoPutFlag; <--At this point, $PUTFLAG is false
LABEL = AUTHOR;
:
PUTDATA; <--After this executes, $PUTFLAG is true
LABEL = TITLE;
:
LABEL = PUBLISHER;
etc.
As always in reports, the timing of your test can be crucial. For instance, TITLE and PUTDATA statements within a header or overflow frame will reset $PUTFLAG to be true.
SPIRES allows you to have reports with multiple columns of data on a single page, like a dictionary or a magazine. Several records may appear in the first column, a few more in the next, and so forth. (Warning: be aware of the terminology distinction between multiple columns of data on a page and columns within a frame. The intended meaning of the word at any given point in this section should be clear by the context.)
The number of columns to appear on the page is controlled by the $COLUMNS variable. To set the number of columns on the page, code the SET COLUMNS Uproc in the initial frame:
UPROC = SET COLUMNS = n;
where "n" is an integer from 1 up.
Think of the pages of a report with a single, narrow column of data (without headers or footers) running end to end. The FRAME-DIM statement might look like this:
FRAME-DIM = 0,30;
meaning it is a line-by-line frame with thirty columns per row.
Next, shift the pages so that some are side by side rather than end to end, using the number in $COLUMNS to determine how many to place next to each other. Then add a header and footer stretching across all the columns at the top and bottom, and you have a simple idea of how SPIRES constructs a page of multiple columns.
Actually, when multiple column processing is requested, SPIRES creates a special buffer for the page image. The regular buffer in which the frame data is placed is in turn placed in the page buffer. SPIRES determines the size of the page buffer by multiplying the number of columns requested ($COLUMNS) by the assigned width of each column ($COLWDTH). Generally you should set $COLWDTH to be slightly more than the number of columns in the FRAME-DIM statement, so that a good margin between the columns of data will be left.
UPROC = SET COLWDTH = n;
where "n" is an integer.
Both the SET COLWDTH and the SET COLUMNS Uprocs should appear prior to the entry into the frames to which they apply. Note that you cannot have separate header and footer frames for each column, though you may design your headers and footers so that particular data appears at the top or bottom of each column.
When SPIRES is placing data into the page buffer, it keeps track of $LLEFT as before. When no more lines are left, it begins the next column, rather than the next page. If it finishes the last column on the page, the footer for the entire page is executed, followed by the header for the next page, and then it begins placing data in the first column of the next page.
Just as you can keep records from breaking across pages, you can prevent them from breaking across columns by using the SET NOBREAK Uproc. [See B.10.3.4.] You may also use the SET NEWCOL Uproc, which is similar to the SET NEWPAGE Uproc, causing the current buffer to be placed in the next column. [See B.10.3.5.] Also, you may use the EJECT COLUMN Uproc to cause SPIRES to flush the buffer into the page buffer immediately and set the $NEWCOL flag, which is similar to the effect of the EJECT PAGE Uproc. [See B.10.3.5.]
When multiple column processing is set (i.e., when $COLUMNS is greater than 1), it will apply to group-start, group-end and ending frames as well. You can turn it off for them by changing the value of $COLUMNS before entering them. Executing an EJECT PAGE Uproc at that point is probably advisable too.
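To tie these settings together, here is a sketch of a two-column report. The values are examples only: each column of data is 30 characters wide, and $COLWDTH is set to 35 to leave a five-character margin, giving a page buffer 70 characters wide (2 x 35). The Uprocs are placed in an initial frame declaration so that they execute before the data frames are entered; the frame name ENTRY is hypothetical. In the format declaration section:
FRAME-NAME;
FRAME-TYPE = INITIAL;
UPROC = SET COLUMNS = 2;           <--two columns per page
UPROC = SET COLWDTH = 35;          <--30 characters of data plus a margin of 5
And in the data frame definition:
FRAME-ID = ENTRY;                  <--hypothetical frame name
FRAME-DIM = 0,30;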
Note: Though we are generally reluctant to say that any given task is impossible to do in SPIRES, there is apparently no general way to balance the columns on the final page of data so that all the columns end on the same row in the middle of the page.
As soon as a multiple-record output command is issued when report mode is set, the initial frame is executed. The initial frame generally has two purposes: as a grid containing data to be placed on a page, it is used to provide introductory material for the report, such as a cover sheet or preface; as a frame to be executed, it is used to initialize or reset variable values for report processing. The second purpose is discussed later. [See B.10.5.]
An initial frame is identified as such in the frame declaration part of the format declaration section, using the INITIAL value for the FRAME-TYPE statement:
FORMAT-NAME = DISPLAY;
FRAME-NAME = INITIAL.A;
FRAME-TYPE = INITIAL;
Multiple initial frames may be specified. They will be executed in the order in which they are specified in the format declaration section.
The frame definition for an initial frame usually contains label groups using VALUE statements to assign a value for placement within the frame. For example, a simple initial frame definition might look like this:
FRAME-ID = INITIAL.A;
FRAME-DIM = 60,65;
LABEL = TITLE;
VALUE = 'Restaurant Reviews (by City)';
START = 10,1;
UPROC = SET ADJUST CENTER;
PUTDATA;
LABEL = DATE;
VALUE = $UDATE;
START = X+1,1;
OUTPROC = $DATE.OUT(MONTH,,UPLOW,FULL);
UPROC = SET ADJUST CENTER;
PUTDATA;
That definition would put the title on line 10, followed by the date on line 12, with both values centered.
GETELEM statements are not allowed in initial frames. However, an initial frame may access elements in records via an indirect frame, using subgoal processing. [See B.12.]
If the initial frame has fixed frame dimensions, the number of rows specified in the FRAME-DIM statement will be placed on the page, whether or not data has been placed in the last ones. All the report-specific frame-types work this way. However, you can use the SET NROWS Uproc to delete the last unused rows of a report-specific frame-type, if desired. [See B.3.3.1.]
Pages produced by initial frames will not have headers or footers provided for them. Headers and footers are not generated until data frames begin executing. If you want headers and footers on the initial page, you will have to create them yourself in the initial frame, or have the prefatory material appear on the first page of data (see below), or call the header and footer frames as indirect frames from the initial frames. (They would also need to be declared indirect frames in the format declaration section.)
By default, the carriage control provided by SPIRES puts each initial frame on a page by itself -- the data frames begin on the next page. To handle the carriage control yourself, you must change the value of the $PAGECTL variable. [See E.2.2.2.]
If you want to put prefatory material on the same page as the first page of data, you could have label groups to handle that material in a group-start frame or in the data frame, but you would tell SPIRES to skip those label groups unless $FREC (the flag variable set when the first data record is being processed) is set. [See B.10.3.1.]
Besides its use for prefatory materials, the initial frame can be used to set system and user variables to particular values before the report processing begins. Setting the page height ($LINEPP) and telling SPIRES the number of lines to reserve for the header or footer ($HDRLEN, $FTRLEN) are typically done using the initial frame. User variables that will be used in other frames may also be re-initialized by the initial frame. [See B.10.6, B.10.7.2 for information on variables in group-start, group-end and ending frames.]
Generally the Uprocs used to set such variables are placed in the format declaration section rather than in the initial frame definition, since they often do not relate to initial-frame data placement at all. In fact, the use of the initial frame to initialize variables is more common than its use to display prefatory material, so SPIRES lets you request initial frame processing without your having to define an initial frame:
FORMAT-NAME = SYSTEM REPORT;
FRAME-NAME;
FRAME-TYPE = INITIAL;
UPROC = SET LINEPP = 80;
UPROC = SET HDRLEN = 2;
UPROC = LET ERRORCOUNT = 0;
No initial frame is named in the FRAME-NAME statement, so no initial frame needs to be defined. However, the Uprocs will be executed right before the point where an initial frame would be executed during format processing.
It might seem strange to re-initialize user variables in an initial frame when presumably they have not even been used, but it may be necessary for several reasons. First of all, you may indeed have already used them, i.e., you have already used the format to produce a report and are now producing a different one without resetting the format. The user variables in the vgroup may still have the values they had at the end of the last report; a counter variable might now be set to 35 instead of 0, for instance. The initial frame, which is executed whenever a multiple-record processing command is issued, could thus reset the variables.
Second, you might be using a global vgroup whose variables have pre-assigned values. The initial frame could reset some or all of them, if desired. Setting and resetting vgroup variables can be done with several functions, $VGROUPINIT, $ASET and $DYNASET, which are described in the manual "SPIRES Protocols".
Using the startup frame to initialize system variables is not generally recommended for two reasons. First, since it is only executed when the SET FORMAT command is issued, variables will not be reset if several reports are created without resetting the format each time. Second, the user may change the values between the time the format is set and the time it is executed, which would probably defeat the purpose of the format initializing them. [See D.3.]
Ending frames are most commonly used to present information gathered during report processing, such as a count of the number of records displayed or the average value of an element across all the records. They may also be used, of course, to present ending material such as indexes or footnotes or "legends" that explain data values in the records.
A frame is declared an ending frame in the format declaration section, using the FRAME-TYPE statement:
FRAME-NAME = END.REPORT; FRAME-TYPE = ENDING;
Multiple ending frames are allowed; they will be executed in the order in which they are declared here.
Unlike initial frames, ending frames are not by default placed on their own pages -- they follow the final record of the data, beginning on the same page (if there is room, of course). An EJECT PAGE Uproc can be added to the frame declaration section if you want the ending frame to begin on its own page. That page will also have headers and footers, however. If you want those suppressed, set a user flag variable in the ending frame, and check that flag before entering the header and footer frames, i.e., in the frame declaration section for those frames. Then, if the flag is set, execute the SET SKIPF Uproc. [See E.2.1.16.]
Like the initial frame, the ending frame may not include GETELEM statements, although subgoal processing through indirect frames is allowed. [See B.12.]
If the ending frame has fixed frame dimensions, the number of rows specified in the FRAME-DIM statement will be placed on the page, whether or not data has been placed in the last ones. However, you can add the SET NROWS Uproc to delete the last unused rows of an ending frame, if desired. [See B.3.3.1.]
Statistics based on the data presented during the report (such as index information) are usually collected in user variables during report processing, or in the $SORTKEY and $SORTDATA system variables. [See B.10.8.] SPIRES does not automatically keep statistical information about the data in the report, the only notable exception being the variable $RECNO, in which SPIRES keeps a count of the records processed since the multiple-record output command was issued. [See E.2.2.1.]
So, for example, if the data records represent families and you want to know the total number of children for all the records in the report, you might establish an integer variable called TOTALCHILDREN. First, you would define it in the Vgroup section of the format. [See B.9.3.1.]
VARIABLE = TOTALCHILDREN; OCC = 1; TYPE = INT;
Then, you might set TOTALCHILDREN to 0 in the initial frame, probably at its declaration in the format declaration section:
FRAME-NAME = INIT; FRAME-TYPE = INITIAL; UPROC = LET TOTALCHILDREN = 0;
Then, in the data frame, you would increment the variable for each child. How this is done would depend on the children's element. For example, if the element CHILDREN were an integer value representing the number of children in the family, the label group in the data frame might look like this:
LABEL = CHILDREN;
GETELEM;
INSERT = END, ' children';
START = *+1,5;
UPROC = LET TOTALCHILDREN = #TOTALCHILDREN + $UVAL;
PUTDATA;
On the other hand, if CHILD were a multiply occurring element whose values were the names of the children, the label group might look like this:
LABEL = CHILD;
GETELEM;
TITLE = 'Children:';
TSTART = X+1,1;
START = X,2;
UPROC = LET TOTALCHILDREN = #TOTALCHILDREN + 1;
PUTDATA;
LOOP;
Each time the label group is executed and another child is processed, the counter variable is incremented by one. If you were not displaying the element, the label group could be much simpler:
LABEL = CHILD; GETELEM; UPROC = LET TOTALCHILDREN = #TOTALCHILDREN + $ELOCC;
where the $ELOCC variable represents the number of occurrences of the accessed element.
The ending frame would report the total number of children. Other computations to that value could be made in that frame as well:
FRAME-ID = END;
FRAME-DIM = 4,65;
LABEL;
VALUE = 'Total number of children is ' #TOTALCHILDREN;
START = 3,1;
UPROC = SET NOBREAK;
PUTDATA;
LABEL;
VALUE = $PACK(#TOTALCHILDREN) / $RECNO;
INSERT = 'Average number of children per family is ';
START = 4,1;
PUTDATA;
The example shows how the average number of children per record processed might be computed in the ending frame. Note that the TOTALCHILDREN value was converted to a packed decimal variable to provide greater accuracy in the division for the average than the division of two integers would provide.
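The accumulation pattern just described can be modeled outside SPIRES. The following Python sketch is only an illustration, not SPIRES code: the record layout and the names "families", "total_children" and "recno" are hypothetical stand-ins for the data records, the TOTALCHILDREN variable and $RECNO. It suggests why the total is converted to a decimal type before the division.

```python
from decimal import Decimal

# Hypothetical family records; "children" plays the role of the CHILD element.
families = [
    {"name": "Adams", "children": ["Abigail", "John Q.", "Charles"]},
    {"name": "Taft", "children": ["Robert", "Helen"]},
    {"name": "Polk", "children": []},
]

total_children = 0   # initial frame: LET TOTALCHILDREN = 0
recno = 0            # stands in for $RECNO

for family in families:                        # data frame, once per record
    recno += 1
    total_children += len(family["children"])  # like adding $ELOCC

# Ending frame: integer division drops the fraction, decimal division keeps it.
integer_average = total_children // recno
decimal_average = Decimal(total_children) / Decimal(recno)

print("Total number of children is", total_children)
print("Average number of children per family is", round(decimal_average, 2))
```

In this model the integer result loses the fractional part of the average, which is the loss of accuracy the packed decimal conversion is meant to avoid.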
Group-start and group-end frames ("break frames") give you the capability of providing summary information about groups of records within the set of records processed in the report. For comparative purposes, if you think of an initial frame as being useful for providing introductory information about the report that follows, a group-start frame is useful for introductory information about the group of records that follows. Similarly, if you think of an ending frame as being useful for report totals and other information about the preceding report, a group-end frame is useful for subtotals and other information about the preceding group of records in the report.
After arranging the records for your report in order by some element (using the SEQUENCE command, for example), you can tell SPIRES to execute a pair of group-start and group-end frames whenever the element value changes from one record to another (an event called a "summary break" or simply a "break"). Thus, if the records were sequenced by element SEX, a group-end frame could be executed when that value changed, reporting the total number of women's records just processed, followed by a group-start frame announcing that the next group of records in the report will be about men.
Specifically, the group-start frame is executed before each new group of records begins processing, including the very first group of records (i.e., right after the initial frame is executed). It can retrieve element values from the first record in each group, a feature which is most often used to retrieve the new value of the break element, so it can be displayed at the beginning of the group of records. (Note: the obsolete frame-type "summary" is equivalent to "group-start".)
The group-end frame is executed after each group of records has been processed, including the very last group (i.e., just before the ending frame is executed). It can retrieve element values from the last record in each group, a feature which is most often used to retrieve the "old value" of the break element, so it can be displayed at the end of the group. Like the ending frame, a group-end frame is often used to display totals, counts, averages or other statistics gathered during record processing. Variables containing such values are often reset to zero in the group-end frame after they have been placed in the buffer so that the gathering of statistics for the next group will start properly.
As you can see, both a group-end and a group-start frame may execute between two groups of records. If you created a blood donors report where the break element was SEX, the report might look like this between the two groups (the frame-types are indicated at the right):
...
Martha Washington   Menlo Park    5        <-- Data frames
Edith Wilson        Palo Alto     8            for FEMALE
Ellen Wilson        Menlo Park   13
** Number of Women Donors: 35              <-- Group-End
** Average number of pints donated: 3.54
----------------------------------------
Blood Donor Records for Men:               <-- Group-Start
John Adams          Sunnyvale     5        <-- Data frames
John Q. Adams       San Mateo     1            for MALE
...
That example shows a fairly typical use of these two frame-types.
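The timing of the two frame-types can be sketched as a classic "control break" loop. The Python below is a hypothetical model, not SPIRES code; the records, the "run_report" function and the output wording are invented for illustration. It assumes the records are already in order by the break element.

```python
# Hypothetical donor records, already sequenced by the break element "sex".
donors = [
    {"name": "Martha Washington", "sex": "FEMALE", "pints": 5},
    {"name": "Edith Wilson", "sex": "FEMALE", "pints": 8},
    {"name": "John Adams", "sex": "MALE", "pints": 5},
    {"name": "John Q. Adams", "sex": "MALE", "pints": 1},
]

def run_report(records, break_elem):
    """Return output lines, running group-end/group-start when break_elem changes."""
    lines = []
    count = 0
    pints = 0
    previous = None
    for record in records:
        if previous is None or record[break_elem] != previous[break_elem]:
            if previous is not None:
                # Group-end sees the last record of the old group.
                lines.append(f"** {previous[break_elem]} donors: {count}, pints: {pints}")
                count = pints = 0   # reset the statistics for the next group
            # Group-start sees the first record of the new group.
            lines.append(f"Blood Donor Records for {record[break_elem]}:")
        lines.append(f"{record['name']} {record['pints']}")   # data frame
        count += 1
        pints += record["pints"]
        previous = record
    if previous is not None:
        # The final group-end runs just before the ending frame would.
        lines.append(f"** {previous[break_elem]} donors: {count}, pints: {pints}")
    return lines

report_lines = run_report(donors, "sex")
print("\n".join(report_lines))
```

Note that the sketch, like the format, does no sorting of its own; if the records were not already grouped, a break would occur at every change of value.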
The group-start and group-end frames are generally used in pairs, though they do not have to be. You may not need to display data at the end of a group, for example, so you may decide not to use group-end frames. If you do use both, you should be sure to specify the same break element and the same "break level" for both, so that they will be handled as a pair. [See B.10.7.1.] But even if you do not use both, it is useful to think of them in pairs, especially if you request multiple break frames.
You can request multiple group-start and/or group-end frames for multiple break elements. For example, if you have sequenced the records by two elements (for example, SEQUENCE SEX CITY), you can request a group-end and a group-start frame be executed when the CITY value changes and another pair when the SEX value changes. When SEX changes, both pairs will be executed -- since a group of records for a particular sex is then broken into subgroups by city, SPIRES assumes that when the SEX element value changes, a pair of frames for CITY is desired (even if the CITY value does not actually change from one record to the next).
For instance, if the example report above also included break frames for the element CITY, the report might look like this at the point where the SEX element changed:
...
Claudia Johnson     Palo Alto     2        <-- Data frames
Edith Wilson        Palo Alto     8
* Number of Women Donors from Palo Alto: 5 <-- Group-End
* Average number of pints donated: 3.9         for CITY
** Number of Women Donors: 35              <-- Group-End
** Average number of pints donated: 3.54       for SEX
-------------------------------------------
Blood Donor Records for Men:               <-- Group-Start
                                               for SEX
* in Atherton:                             <-- Group-Start
                                               for CITY
Chester Arthur      Atherton      5        <-- Data frames
James Garfield      Atherton      6
* Number of Men Donors from Atherton: 2    <-- Group-End
* Average number of pints donated: 5.5         for CITY
* in Burlingame:                           <-- Group-Start
                                               for CITY
Bill Taft           Burlingame    2        <-- Data frames
...
Note the nesting of the break frames as shown in the example: pairs of break frames for CITY are surrounded by pairs of break frames for SEX. The way to request such a hierarchy will be discussed in the next section.
Remember that the format will not arrange the records in the order shown -- they must have already been arranged in order, by a command such as SEQUENCE SEX CITY.
Group-start and group-end frames, like all other frames, require code in the format definition in two separate places: they must be defined in the frame definitions section and declared in the format declaration section. In the next section, the statements in the format declaration section will be discussed, followed by a section on break frame definitions. [See B.10.7.1, B.10.7.2.] After that will be a brief discussion of sorting techniques that can be used to arrange the records for processing, including information on the SPISORT program and the path information it provides. [See B.10.7.3.]
In the format declaration section, each break frame (at least one for each break element) is declared individually. For each break frame, two statements must be added in addition to the FRAME-NAME and FRAME-TYPE statements:
FRAME-NAME = frame.name;
FRAME-TYPE = GROUP-START;   (or GROUP-END)
BREAK-LEVEL = n;
BREAK-ELEM = element.name;
The BREAK-ELEM statement names the element in the record whose value is to be monitored as records are processed. When that element value changes, the named break frame is executed. The named element may be a virtual element. (For an element within a phantom structure, use the "#variable" form shown below, setting the variable to the name of the phantom element.)
Several different forms besides the one shown above are allowed:
BREAK-ELEM = #variable;
BREAK-ELEM = structure.name@structure.name@...element.name;
BREAK-ELEM = VALUE, variable.name;
BREAK-ELEM = PATH, element.name;
The first form may be used if the name of the break element will be contained in the named string variable. If the variable is a four-byte hex variable, it may contain the element's $ELEMID rather than its name. [See E.2.3.10.]
The second form should be used if the element name is not unique in the record-type. It gives the hierarchical path to the right element.
The third form names a variable whose value is examined between records; if it changes, it triggers the execution of the summary frame. This provides a method for creating summaries when the controlling value is not an element in the records. Note that the pound sign (for user variables) or the dollar sign (for system variables) should precede the variable name.
The fourth form, which was often used when the SPISORT program sorted the goal records, is now obsolete in most situations where it was used. That is, under FOR SET processing, if the named break element is a multiply occurring element and the DEFINE SET command that created the set noted that the element was multiply occurring, then the appropriate occurrence of the break element will automatically be used in break frame processing. [See B.10.7.3.]
The BREAK-LEVEL statement helps SPIRES determine which break frames should be executed. The value given, an integer from 1 to 10, generally reflects the position of the break element within the SEQUENCE or DEFINE SET command. For example, if the records have been arranged by the command "SEQUENCE A B C D E", then the break frames whose break element is A would be assigned a break level of "1" (the highest break level), the frames for B would have "2", etc., with E having "5" (the lowest level). Each pair of group-start and group-end frames should specify the same break level. [See B.10.7.]
When a break occurs at a given level, all the break frames at that level and at lower (higher-numbered) levels are executed. For example, if a break occurs at level 2, group-start and group-end frames for levels 2, 3, 4 and 5 are executed, the group-end frames first, and then the group-start frames. Note however, that the group-end frames and then the group-start frames are executed in the order in which they are declared in the frame declaration section, not in break level order. Hence, most formats using this feature declare group-end frames in "5 4 3 2 1" break-level order, so that the lower-level group-end frames will execute before the higher ones when multiple ones will be executed. On the other hand, group-start frames are usually declared in "1 2 3" order.
For example, here is part of the frame declaration section for the format used in the example at the end of the last section. [See B.10.7.] The records were arranged by the command SEQUENCE SEX CITY.
FORMAT-NAME = REPORT;
...
FRAME-NAME = CITY.END;
FRAME-TYPE = GROUP-END; BREAK-LEVEL = 2; BREAK-ELEM = CITY;
FRAME-NAME = SEX.END;
FRAME-TYPE = GROUP-END; BREAK-LEVEL = 1; BREAK-ELEM = SEX;
FRAME-NAME = SEX.START;
FRAME-TYPE = GROUP-START; BREAK-LEVEL = 1; BREAK-ELEM = SEX;
FRAME-NAME = CITY.START;
FRAME-TYPE = GROUP-START; BREAK-LEVEL = 2; BREAK-ELEM = CITY;
...
Here, the break frames of level 2 are specified before those of level 1 so that they will be executed first when a break occurs at level 1.
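The execution rule described above (a break at one level triggers all lower levels; group-end frames run before group-start frames; each set runs in declaration order) can be modeled in a few lines. This Python sketch is an illustration only: the tuple layout and the "frames_for_break" function are hypothetical, though the frame names mirror the declarations shown above.

```python
# Each declared break frame: (name, frame type, break level), in declaration order.
declared_frames = [
    ("CITY.END", "GROUP-END", 2),
    ("SEX.END", "GROUP-END", 1),
    ("SEX.START", "GROUP-START", 1),
    ("CITY.START", "GROUP-START", 2),
]

def frames_for_break(frames, break_level):
    """List the frames run for a break at break_level, in execution order."""
    # A break at level n also triggers every lower (higher-numbered) level.
    triggered = [frame for frame in frames if frame[2] >= break_level]
    # Group-end frames run first, then group-start, each in declaration order.
    ends = [name for name, kind, _ in triggered if kind == "GROUP-END"]
    starts = [name for name, kind, _ in triggered if kind == "GROUP-START"]
    return ends + starts

print(frames_for_break(declared_frames, 1))   # SEX changed
print(frames_for_break(declared_frames, 2))   # only CITY changed
```

Reordering the entries in "declared_frames" changes the result, which is why the declaration order matters in the format itself.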
In some situations, you may want to use both group-start and group-end frames but not have a pair of them for each break level. Suppose, for example, that a report about students has student records sequenced by DEPARTMENT, DEGREE.PROGRAM, and GRAD.YEAR, and that group-start frames are defined for all three break elements but a group-end frame is defined for only GRAD.YEAR. What happens when, between two records, the DEPARTMENT and/or DEGREE.PROGRAM changes but GRAD.YEAR does not?
First, SPIRES would check to see whether it should execute the group-end frame for GRAD.YEAR. But the only basis SPIRES has for making that decision is whether the value of GRAD.YEAR changed from one record to the next. Since it did not change, the group-end frame would not be executed. Clearly, however, you would want it to be executed, since a higher-level break is occurring for the group-start frames.
To ensure that execution, you must tell SPIRES that there are higher break levels for the group-end frame too:
FORMAT-NAME = STUDENT SUMMARY;
...
FRAME-NAME = GRAD.YEAR.END;
FRAME-TYPE = GROUP-END;
BREAK-LEVEL = 3; BREAK-ELEM = GRAD.YEAR;
* FRAME-NAME;
* FRAME-TYPE = GROUP-END;
* BREAK-LEVEL = 2; BREAK-ELEM = DEGREE.PROGRAM;
* FRAME-NAME;
* FRAME-TYPE = GROUP-END;
* BREAK-LEVEL = 1; BREAK-ELEM = DEPARTMENT;
FRAME-NAME = DEPARTMENT.START;
FRAME-TYPE = GROUP-START;
BREAK-LEVEL = 1; BREAK-ELEM = DEPARTMENT;
FRAME-NAME = DEGREE.START;
FRAME-TYPE = GROUP-START;
BREAK-LEVEL = 2; BREAK-ELEM = DEGREE.PROGRAM;
FRAME-NAME = GRAD.YEAR.START;
FRAME-TYPE = GROUP-START;
BREAK-LEVEL = 3; BREAK-ELEM = GRAD.YEAR;
The code marked with asterisks tells SPIRES about the other two break elements whose values should be checked between records. Between records, SPIRES examines all the values of the break elements for the group-end frames to determine which group-end frames should be executed. After they have been executed, SPIRES checks the break elements for the group-start frames to determine which of them should be executed.
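To see why the unnamed declarations matter, consider this Python model of the check made between two records. It is a sketch under stated assumptions, not the SPIRES algorithm itself: the function "group_end_frames_to_run", the tuple layout and the sample values are hypothetical, and an unnamed frame is represented by None.

```python
def group_end_frames_to_run(declared, previous, current):
    """declared: list of (frame name or None, break level, break element).

    Returns the names of the group-end frames to run between two records.
    """
    # Find the highest break level (lowest number) whose element value changed.
    changed = [level for _, level, elem in declared if previous[elem] != current[elem]]
    if not changed:
        return []
    top = min(changed)
    # Run every named frame at that level or lower (higher-numbered).
    return [name for name, level, _ in declared if level >= top and name is not None]

prev_rec = {"DEPARTMENT": "HISTORY", "DEGREE.PROGRAM": "MA", "GRAD.YEAR": 1994}
next_rec = {"DEPARTMENT": "MUSIC", "DEGREE.PROGRAM": "MA", "GRAD.YEAR": 1994}

only_grad_year = [("GRAD.YEAR.END", 3, "GRAD.YEAR")]
with_placeholders = [
    ("GRAD.YEAR.END", 3, "GRAD.YEAR"),
    (None, 2, "DEGREE.PROGRAM"),   # FRAME-NAME; with no frame named
    (None, 1, "DEPARTMENT"),       # FRAME-NAME; with no frame named
]

print(group_end_frames_to_run(only_grad_year, prev_rec, next_rec))
print(group_end_frames_to_run(with_placeholders, prev_rec, next_rec))
```

With only the GRAD.YEAR declaration, the change of department goes unnoticed; with the placeholder declarations, the level 1 break is detected and the level 3 frame runs.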
A group-start frame is generally used to announce the beginning of a new group of records, usually displaying the new element value that caused the summary break. A group-end frame is generally used to display statistical information gathered from the preceding group of records, usually displaying the shared element value that caused the records to be grouped together.
The break frames are different from other report-specific frames (e.g., header, ending) because they may have GETELEM statements to access data values. But which record is used to obtain such values: the last record in the previous group or the first record in the new group? The answer depends on the frame-type: the group-end frame has access to the elements in the last record of the previous group, while the group-start frame has access to those in the first record of the next group.
Suppose you are creating a printed author index to a bibliographic data base. Each time a new author triggered a break, the group-start frame below would be executed, and the new author's name would appear two lines after the preceding data.
FRAME-ID = AUTHOR.START;
FRAME-DIM = 2,60;
LABEL;
GETELEM = AUTHOR;
START = 2,1;
PUTDATA;
The next data frame to execute would process records having the new value for the AUTHOR element.
A group-end frame definition, on the other hand, would be more likely to concentrate on statistical information:
FRAME-ID = AUTHOR.END;
FRAME-DIM = 3,60;
LABEL;
VALUE = '- There are ' || #AUTHORENTRIES || ' entries for ';
START = 2,1;
UPROC = LET TOTALENTRIES = #TOTALENTRIES + #AUTHORENTRIES;
UPROC = LET AUTHORENTRIES = 0;
PUTDATA;
LABEL = AUTHOR;
GETELEM;
START = *,*+1;
INSERT = END, '.';
PUTDATA;
Assuming that the variable AUTHORENTRIES is a counter that is augmented each time a data record is processed for a given author, the above frame definition would produce a line such as this after the bibliographic entries for that author:
- There are 29 entries for Edward Eggleston.
Note that the counter was reset to zero for the next group, but not before its value was added to a TOTALENTRIES variable, which presumably will be used in the ending frame to display the total number of entries for all authors in the data base. [See B.10.6.]
Page control problems when data and break frames are involved are often solved by using the system variable $LLEFT. [See E.2.2.3.] For example, you may want to ensure that a group-start frame will not appear by itself at the bottom of a page with the records it heralds starting at the top of the next one. A possible solution is to check the value of $LLEFT before entering the group-start frame (i.e., in the frame declaration portion of the format declaration section):
FRAME-NAME = AUTHOR.START;
FRAME-TYPE = GROUP-START;
UPROC = IF $LLEFT < 10 THEN EJECT PAGE;
Thus, if there are fewer than ten lines left on the page (not counting the lines for the footer), the AUTHOR.START frame will be placed at the top of the next one. A better solution, if the group-start frame has fixed dimensions, would be to check whether $LLEFT was less than zero at the end of the frame, and if so, issue the SET NEWPAGE Uproc. Other solutions might involve the HOLD, FLUSH and EJECT PAGE Uprocs. [See B.10.3.5, B.10.3.6, B.10.3.7.]
If a break frame has fixed frame dimensions, the number of rows specified in the FRAME-DIM statement will be placed on the page, whether or not data has been placed in the last ones. However, you can use the SET NROWS Uproc to delete the last unused rows of a report-specific frame-type, if desired. [See B.3.3.1.]
You do not have to put data into a break frame. Instead you may just want to perform calculations. This means you might not need a break frame definition. Instead, you can code a null value for the frame name in the FRAME-NAME statement in the format declaration section, as you can for an initial or startup frame, and perform your calculations there. [See B.5.2.] But if you do code a frame definition, it should not contain a FRAME-DIM statement.
There are four basic ways to get records into a particular arrangement by element value in SPIRES:
- 1) You can issue the SEQUENCE command to arrange the records in a stack or result by element values [See SEQUENCE COMMAND in the "SPIRES Searching and Updating" manual.];
- 2) You can run a SPISORT job to arrange the records in a set by element values [See SPISORT in the manual "SPIRES Technical Notes", section 1.];
- 3) You can use the key order of the records by processing them under the FOR SUBFILE command [See FOR SUBFILE COMMAND in the SPIRES manual "Global FOR".];
- 4) You can take advantage of the order of the records as given in a simple index by using the FOR INDEX command [See FOR INDEX COMMAND in the manual "Global FOR".].
Each of these particular methods is discussed in detail in other SPIRES manuals. Any of these methods may be used in connection with a report format. Of course, if group-start or group-end frames are involved in the format, you would want to choose a method that arranges the records by the break elements.
The SPISORT and FOR INDEX methods are worth discussing further because both methods allow a record to appear multiple times if it contains multiple occurrences of a break element. For example, if you are processing records under FOR INDEX ACCOUNT.NUMBER, a record with two or more account numbers can be processed several times, once for each number.
However, suppose you were creating a directory by account number. Your format displays the account number followed by the name of the person to whom it belongs. In such a case, you only want the one account number displayed that caused the record to appear at that particular point in the order. Under FOR INDEX ACCOUNT.NUMBER, the following label group, by itself, will not guarantee that the right occurrence of ACCOUNT.NUMBER is accessed:
LABEL = ACCOUNT.NUMBER; GETELEM; PUTDATA;
That label group will retrieve the first occurrence of the ACCOUNT.NUMBER, which may not be the occurrence that caused the record to be displayed at that point. If you use the FOR INDEX method, you should use filter processing in combination with the $PATHKEY variable to filter out the irrelevant element occurrences. An example of this technique is explained in the SPIRES manual "Technical Notes", chapter 21.
If you use the SPISORT program to sort the records and process them under the FOR SET command, the above label group would retrieve the appropriate occurrence of ACCOUNT.NUMBER. The GENERATE SET command, used to create the record set for SPISORT, produces "path information" to tell SPIRES how to retrieve the appropriate occurrence of the element. In other words, when SPIRES is processing records under a FOR SET command, it has access not only to each record but also to directional information telling which occurrence of a sequenced element caused the record to appear at a particular point in the sorted set.
Unless directed otherwise, by the UNFILTERED option on the FOR SET command, SPIRES will automatically apply the path information to the records processed under FOR SET. Then, only the occurrence of the element that caused the record to appear at that position in the record set is available for display.
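The idea of path information can be illustrated with a small model. The Python below is hypothetical and does not show how SPIRES stores path information; the records, the "entries" list and both helper functions are invented. Each entry in the sorted set remembers which occurrence of the sort element placed the record there.

```python
# Hypothetical records; "accounts" is a multiply occurring element.
records = [
    {"name": "Garfield", "accounts": ["G300", "A100"]},
    {"name": "Arthur", "accounts": ["B200"]},
]

# One entry per occurrence; "path" is the occurrence number that put the
# record at this position in the sorted set.
entries = [
    {"record": record, "path": index}
    for record in records
    for index in range(len(record["accounts"]))
]
entries.sort(key=lambda entry: entry["record"]["accounts"][entry["path"]])

def first_occurrence(entry):
    """What a plain retrieval sees when no path information is applied."""
    return entry["record"]["accounts"][0]

def path_occurrence(entry):
    """What is retrieved when the path information is applied."""
    return entry["record"]["accounts"][entry["path"]]

naive = [(first_occurrence(e), e["record"]["name"]) for e in entries]
directory = [(path_occurrence(e), e["record"]["name"]) for e in entries]

print(naive)
print(directory)
```

In the "naive" list the Garfield record shows account G300 at both of its positions, while the "directory" list shows the occurrence that actually caused each placement.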
More information about SPISORT and FOR SET processing can be found in the manuals "Technical Notes" (SPISORT) and "Sequential Record Processing: Global FOR" (FOR SET command).
In most report format situations, the automatic filtering done by FOR SET processing (described above) is desirable -- usually only the occurrence of an element that caused the record to sort in its particular place is needed. However, situations may arise where the automatic filtering is undesirable, so the UNFILTERED option on the FOR SET command tells SPIRES not to automatically filter the occurrences of the sorted elements.
But there may be still other situations where you need both: you want to know which occurrence of the element caused the record to be sorted where it was (perhaps you want to mark it with an asterisk) and you want to display all the occurrences besides. In this situation, you need to use the UNFILTERED option on the FOR SET command and code the PATH or NPATH options described below in the format definition to identify the particular element occurrence.
To tell SPIRES to use the path information to access the appropriate occurrence in a label group, you use the PATH option on the GETELEM statement:
GETELEM = PATH, element.name;
where "element.name" is the name of the element to be accessed. The "element.name" may be replaced by a variable if desired.
Remember that path information is only available for data sets created by the DEFINE SET and GENERATE SET commands. If no path information is available for an element and the PATH option appears on the GETELEM statement, the first occurrence of the element within the record will be used.
The PATH option may also be used on the BREAK-ELEM statement or on the IND-STRUCTURE statement. [See B.10.7.1, B.8.3.] Suppose the set of records is sorted by ZIPCODE, an element within the multiply occurring structure ADDRESS. In your report format, you not only want to access the appropriate ZIPCODE but also the related CITY and STATE elements. So, in the label group that calls the indirect frame to process the ADDRESS structure you replace this:
IND-STRUCTURE = ADDRESS;
with this:
IND-STRUCTURE = PATH, ZIPCODE;
SPIRES will use the path information leading to ZIPCODE to find the proper occurrence of the ADDRESS structure. Then the indirect frame can retrieve the other elements in that particular occurrence of the structure. Note that in the IND-STRUCTURE statement you code not the structure name but the name of the element within the structure that has the path information.
One other variation on the PATH option, primarily useful in general file formats where you might not know the names of the elements with path information, is shown below:
GETELEM = NPATH, n;
where "n" is an integer or an integer variable. This form, also available in BREAK-ELEM and IND-STRUCTURE statements, replaces the element name with a number corresponding to the "nth" element in the sort list of the DEFINE SET command. For example, if the set is defined with "DEFINE SET TEST ELEM A, B, C", then "GETELEM = NPATH, 3;" is equivalent to "GETELEM = PATH, C;". Unlike the PATH option, this method will cause a format error if no path information is available, since SPIRES would not know what element to retrieve. [See B.14.]
The SET MAXLEVELS Uproc may be used to set the maximum break level to be executed by the report format. It allows a report format to execute the summary frames of all levels or no levels or some number in between. In other words, a single report format can be directed to execute all summary frames or just those down to a given break level.
The syntax of the Uproc is:
UPROC = SET MAXLEVELS = n;
where "n" is an integer or an integer expression. The number given will be the lowest break level (i.e., the highest number) of summary frame that will be executed. If, for example, the Uproc "SET MAXLEVELS = 3" is executed, summary frames with break levels of 1, 2 and 3 will be executed, while frames with levels of 4 or higher will not. [See B.10.7.1.]
There is no $MAXLEVELS variable, nor is there a SET MAXLEVELS command -- "maxlevels" can only be set within a format definition. If you want the capability of dynamically changing "maxlevels", you will need to prompt for it, or set its value in another variable before executing the format. For example, you might add the following label group to the initial frame:
LABEL = MAXLEVELS;
UPROC = SET PROMPT = 'Maxlevels (RETURN=ALL)?';
UPROC = ASK NULL='-';
UPROC = IF $ASK = '' THEN SET MAXLEVELS = 10;
UPROC = ELSE SET MAXLEVELS = $ASK;
If the user gives a null response to the prompt, all summary frames will be executed, which is achieved by setting "maxlevels" to 10 or higher (10 is the limit for a break level).
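The effect of "maxlevels" on the break frames can be sketched as a simple filter. This Python model is illustrative only; the frame names and both functions are hypothetical, and it assumes the response to the prompt is either null or an integer.

```python
# Hypothetical break frames as (name, break level) pairs.
break_frames = [("DEPT.END", 1), ("PROGRAM.END", 2), ("YEAR.END", 3), ("SECTION.END", 4)]

def maxlevels_from_response(response, limit=10):
    """Turn the prompt response into a maxlevels value; null means all levels."""
    response = response.strip()
    if response == "":
        return limit               # 10 is the limit for a break level
    return int(response)

def frames_to_execute(frames, maxlevels):
    """Keep the frames whose break level does not exceed maxlevels."""
    return [name for name, level in frames if level <= maxlevels]

print(frames_to_execute(break_frames, maxlevels_from_response("3")))
print(frames_to_execute(break_frames, maxlevels_from_response("")))
```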
Occasionally you may need a way to sort values within a format. For example, during data frame execution you want to gather information that will appear as an index created in an ending frame. So you need a place to store the indexed values and their page numbers, as well as a way to sort them once they are all gathered.
SPIRES provides two methods to do this: triples, which are a general tool, and the sort facility built into the formats processor. Neither one is restricted to report formats, although both are more commonly used there than in non-report formats.
Triples are a special type of dynamic variable. Triples have more flexibility and more capabilities than the format sorting method: for example, their values can be accessed outside of the format, which is not true of the other method. However, computer memory is not handled as efficiently by triples, which could cause problems for applications that use a lot of memory already. Triples are discussed in detail in the manual "SPIRES Protocols", section 7a.
The format sort facility uses the system variable arrays $SORTKEY (sort value) and $SORTDATA (sort data). When the format is set, memory is set aside for sorting, according to the SORT-SPACE statement (see below). At some point in the format, you initialize the sort space and begin assigning values to the $SORTKEY and $SORTDATA arrays (for example, an index entry goes to $SORTKEY and its page number goes to $SORTDATA). When all the data has been gathered, you tell SPIRES to sort the values in the $SORTKEY array, in either ascending or descending order. You then can extract the values from the two arrays: the values in $SORTKEY are now in sort order, while the values in $SORTDATA have been rearranged so that each is in the same position in the array as its "partner" in $SORTKEY.
The restrictions on the format sort facility are discussed later. Here are the specific procedural steps to follow in coding it within a format.
First, you code the "SORT = INITIALIZE" statement at some point before you begin gathering the data. This statement initializes the sort space, discarding any values left in it. (INITIALIZE may be abbreviated to INIT in the SORT statement.) If the data will be gathered in a looping process, make sure that the SORT = INITIALIZE statement is outside of the loop. For example, if the values to be sorted are multiple occurrences of a retrieved element, place the SORT statement ahead of the label group that retrieves it; otherwise, the sort space will be reinitialized each time a new value is accessed. Similarly, if the values are gathered from multiple records before the sorting occurs, be sure that the SORT = INITIALIZE statement is not executed for each record; you could put it in the initial frame.
You then begin gathering the values, assigning them either to $SORTKEY or $SORTDATA:
UPROC = SET SORTKEY = value;
UPROC = SET SORTDATA = data;
It is the values in $SORTKEY that will be sorted. You do not need to use the $SORTDATA array unless you have data associated with each occurrence of $SORTKEY that should be carried along with it (like the page number for the index entry in the example above).
You do not handle occurrence numbers of the variables in the array; each time you put another value into either array, it goes into the next available position. (However, if you do use both arrays because you are handling the pairs of values, you should place the value to be sorted into $SORTKEY first and then put the corresponding value into $SORTDATA; if you place a value into $SORTDATA first, the values may not match properly.) You cannot remove or change values placed in the array except by reinitializing the sort space.
Once all the data has been gathered, another form of the SORT statement is specified:
SORT = {ASCENDING|DESCENDING|AU|DU};
This statement tells SPIRES to sort the $SORTKEY array in either ascending or descending order. (ASCENDING and DESCENDING may be abbreviated to A and D. AU and DU also sort in ascending or descending order, but when a $SORTKEY value is not unique, they discard the "extra" pairs of $SORTKEY and $SORTDATA values, keeping only one pair for each distinct $SORTKEY value.) From this point on, unless another SORT = INITIALIZE statement is encountered, each reference to $SORTKEY will retrieve the next occurrence in the array. For example,
LABEL = LOOP;
  UPROC = IF $SORTKEND THEN RETURN;
LABEL = INDEX.ENTRY;
  VALUE = $SORTKEY ', ' $SORTDATA;
  START = X,1;
  PUTDATA;
LABEL;
  UPROC = JUMP LOOP;
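The pairing and sorting behavior of the two arrays can be illustrated outside SPIRES with a short Python model. This is only a sketch of the behavior described in the text, not SPIRES code: the function name sort_pairs is hypothetical, keys are compared as plain Python strings, and the choice of which duplicate pair survives under AU or DU (here, the first one entered) is an assumption, since the text says only that one pair is kept.

```python
def sort_pairs(keys, data, order="A"):
    """Model SORT = A | D | AU | DU on parallel key and data lists.

    Hypothetical helper: returns new (keys, data) lists in sort order,
    with each data value kept in the same position as its partner key.
    """
    if order not in ("A", "D", "AU", "DU"):
        raise ValueError("order must be A, D, AU or DU")
    if len(keys) != len(data):
        raise ValueError("this model expects one data value per key")

    # Sort the pairs by key only. Python's sort is stable, so pairs with
    # equal keys stay in the order in which they were entered.
    pairs = sorted(zip(keys, data), key=lambda pair: pair[0],
                   reverse=order.startswith("D"))

    if order.endswith("U"):
        # ASSUMPTION: keep the first pair for each distinct key value.
        unique = []
        for key, value in pairs:
            if not unique or unique[-1][0] != key:
                unique.append((key, value))
        pairs = unique

    return [key for key, _ in pairs], [value for _, value in pairs]


if __name__ == "__main__":
    entries = ["sorting", "formats", "sorting", "arrays"]  # index entries
    pages = ["12", "3", "40", "7"]                         # page numbers
    print(sort_pairs(entries, pages, "A"))
    print(sort_pairs(entries, pages, "AU"))
```

In the model, as in the facility itself, only the keys take part in the comparison; the data values are simply carried along.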
You can retrieve a given value from $SORTKEY and $SORTDATA only once. If you need to test or manipulate each value, put it into a variable right away. Consider this bad example:
LABEL;
  UPROC = IF $SORTKEY > 10 THEN LET HIGHVAL = $TRUE;
LABEL;
  VALUE = $SORTKEY;
  PUTDATA;
Only every other value of $SORTKEY would be processed by the PUTDATA statement; the others are tested and "discarded" in the Uproc. This method would be a better way to handle the problem:
LABEL;
  UPROC = LET SORTEDVAL = $SORTKEY;
  UPROC = IF #SORTEDVAL > 10 THEN LET HIGHVAL = $TRUE;
LABEL;
  VALUE = #SORTEDVAL;
  PUTDATA;
The flag variable $SORTKEND, tested in the LOOP label group, is set (set to $TRUE, i.e., 1) when the last sort value in the array has been retrieved. Similarly, the flag variable $SORTDEND is set when the last sort data in $SORTDATA has been retrieved. Both flags are cleared (set to $FALSE, i.e., 0) when the SORT = A or SORT = D statement is encountered.
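The retrieve-once rule and the end flag can be modeled in a few lines of Python. This is a hedged sketch rather than a description of SPIRES internals: the class SortedArray and the functions bad_pattern and good_pattern are hypothetical names, integers stand in for the sorted values, and raising an error when nothing is left to retrieve is an assumption, since the text does not say what SPIRES does in that case.

```python
class SortedArray:
    """Model of a sorted $SORTKEY (or $SORTDATA) array read one value at a time."""

    def __init__(self, values):
        self._values = list(values)
        self._next = 0
        self.end = False  # plays the role of $SORTKEND / $SORTDEND

    def take(self):
        """Each reference retrieves the next occurrence; none can be re-read."""
        if self._next >= len(self._values):
            # ASSUMPTION: the manual does not describe this case.
            raise IndexError("no values left in the array")
        value = self._values[self._next]
        self._next += 1
        # The flag is set once the last value has been retrieved.
        self.end = self._next == len(self._values)
        return value


def bad_pattern(values):
    """Two references per pass: the test consumes a value that is never output."""
    array, high_seen, printed = SortedArray(values), False, []
    while not array.end:
        if array.take() > 10:        # first reference, used only for the test
            high_seen = True
        if array.end:
            break
        printed.append(array.take())  # second reference gets the NEXT value
    return printed, high_seen


def good_pattern(values):
    """One reference per pass: save the value, then test and output the copy."""
    array, high_seen, printed = SortedArray(values), False, []
    while not array.end:
        sorted_val = array.take()    # like LET SORTEDVAL = $SORTKEY
        if sorted_val > 10:
            high_seen = True
        printed.append(sorted_val)
    return printed, high_seen


if __name__ == "__main__":
    print(bad_pattern([3, 8, 12, 15]))
    print(good_pattern([3, 8, 12, 15]))
```

With four sorted values, the bad pattern outputs only the second and fourth, which matches the "every other value" behavior described for the bad example.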
The sample report at the end of this chapter shows another use of the sort facility. [See B.10.11, B.10.12.]
The last coding requirement is the SORT-SPACE statement, which appears in the format declaration section:
SORT-SPACE = n;
where "n" is the number of bytes to be allocated, in thousands. It can be any positive integer up to 64. Unless your format is part of a large application that uses a great deal of memory (as evidenced by S198 or CORE EXHAUSTED error messages), you can probably set "n" to an arbitrarily high number within that limit. If you do need to worry about memory management, try to determine the maximum number of $SORTKEY and $SORTDATA pairs of entries and the average total length of each pair. Add 8 bytes to the average length of the pair for system data associated with it. Then multiply the number of pairs by that adjusted average length to get an approximation of the number of bytes of sort space needed.
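The estimate just described is simple arithmetic, and a small Python sketch may make it concrete. The function name estimate_sort_space is hypothetical, and treating "thousands" as units of exactly 1000 bytes is an assumption; the 8-byte allowance per pair and the limit of 64 come from the text.

```python
import math

PAIR_OVERHEAD_BYTES = 8   # system data associated with each pair (from the text)
MAX_SORT_SPACE = 64       # largest value allowed for SORT-SPACE (from the text)


def estimate_sort_space(max_pairs, avg_pair_length):
    """Return a value of "n" for SORT-SPACE = n, in thousands of bytes.

    max_pairs:       largest expected number of $SORTKEY/$SORTDATA pairs
    avg_pair_length: average combined length in bytes of one pair
    """
    if max_pairs < 0 or avg_pair_length < 0:
        raise ValueError("counts and lengths must not be negative")
    total_bytes = max_pairs * (avg_pair_length + PAIR_OVERHEAD_BYTES)
    # ASSUMPTION: one unit of "n" is 1000 bytes; round up, with a minimum of 1.
    n = max(1, math.ceil(total_bytes / 1000))
    if n > MAX_SORT_SPACE:
        raise ValueError(f"about {n} thousand bytes needed; the limit is {MAX_SORT_SPACE}")
    return n


if __name__ == "__main__":
    # 500 pairs averaging 40 bytes each: 500 * (40 + 8) = 24000 bytes
    print(estimate_sort_space(500, 40))
```

For 500 pairs averaging 40 bytes each, the sketch yields 24, which would correspond to SORT-SPACE = 24.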
If both a load format and its calling format include a SORT-SPACE statement, the allocation requested by the calling format will be assigned and the SORT-SPACE statement in the load format will be ignored. The load format's SORT-SPACE will be assigned only if the calling format does not include a SORT-SPACE statement. [See B.11.]
assigning the integer variable value to $SORTKEY:
UPROC = SET SORTKEY = #INTEGER;
assigning the sorted "string" value back to #INTEGER:
UPROC = LET INTEGER = $RETYPE($SORTKEY,INT);
When designing report formats, people often want different data values to be printed with different character sets. For example, some elements should be displayed with italic characters while others should be displayed with bold ones. Two different methods are available, one simple, one complicated.
The simple method is, unfortunately, the less general. It is tied to special multiple-font character sets (such as MT10) available on the IBM 3800 printer at Stanford. Values that are to be in an italic font are processed by the $ITALIC system proc as an OUTPROC; values to be boldfaced are processed by the $BOLD system proc as an OUTPROC. These procs can be included as part of the OUTPROC statement in a label group:
LABEL = ORDER.NUMBER;
  GETELEM;
  OUTPROC = $ITALIC;
  INSERT = 'Order number: ';
  PUTDATA;
Remember that the OUTPROC statement in the format will completely override the OUTPROC statement in the record definition. Note too that the insert text shown in the example above will not be italicized because the OUTPROC applies only to the element value retrieved from the record. [See B.4.5.2.] To have the entire value be italicized, try this method:
LABEL = ORDER.NUMBER;
  GETELEM;
  OUTPROC = $INSERT('Order Number: ') / $ITALIC;
  PUTDATA;
The resulting report must be printed on the 3800 laser printer with one of the multiple-font character sets that were designed for this purpose, such as the MT10 or MT12 font.
When the italicized or boldfaced values are displayed on a terminal, they will not print properly -- that is, you will not be able to read them. In fact, they may cause your terminal to ring the bell, blank the screen, and act peculiarly in general. Adding the MIXED option to the LIST command in WYLBUR, which causes unusual characters to be displayed in their hexadecimal equivalent, may be helpful when you want to see the output data at a terminal. For more information about the use of the $BOLD and $ITALIC system procs, see the manual "SPIRES System Procs" or EXPLAIN $BOLD PROC or EXPLAIN $ITALIC PROC. More information about the multiple-font character sets is available in the I.T.S. documents "Character Sets for the 3800 Laser Printer" and "Using the 3800 Laser Printer".
The second method, which allows you to mix various character fonts of your own choosing, can be very complicated when done in a report format -- it is much easier to use in non-report formats where you are not relying on the $LLEFT variable to keep track of the number of lines left on the page. For this method, which produces output for either the IBM 3800 or the Xerox 9700 printer, column 1 of your format should be reserved for carriage control that you will provide yourself, and column 2 will be used for character set selection.
The principle here is that each row of the format can be printed with only one character set, as specified by the number placed in column 2. When you want a line of output to contain more than one character set, you must place the characters to be italicized, for example, on one row of output, and the characters to be boldfaced on the next row of output but with the overstrike carriage control character (+) in column 1:
(....v....1....v....2....v....3....v....4....v....5)
 1Italic text can appear on one line.
+2                                    and bold text
When printed with the CC (carriage control) option, these two rows of output would be printed as one, with the second row merged into the first. More details on preparing a data set for this type of printing are given in the manual "Using the 3800 Laser Printer".
Writing a report format to create this type of data set is complicated for several reasons:
- 1) you must handle the carriage control yourself, after setting $PAGECTL to 4;
- 2) $LLEFT will count each row of output as one line's worth, whether or not it has the overstrike character in it. You must adjust it yourself, usually by adding 1 to $LINEPP each time an overstrike line is created.
An impact printer may be able to simulate boldface type by overstriking, that is, by printing the same characters several times on top of each other. The relevant lines of such a data set might look like this:
(....v....1....v....2....v....3....v....4....v....5)
 This line will be overstruck three times.
+This line will be overstruck three times.
+This line will be overstruck three times.
+This line will be overstruck three times.
Similarly, text may be underlined by printing the text and then printing the underscore characters on the same line, requiring the overstrike carriage control symbol to prevent the printer from going to a new line. The relevant lines of the data set usually look like this:
(....v....1....v....2....v....3....v....4....v....5)
 The third word in this line is underscored.
+          ____
The same steps outlined above are necessary for such effects: you must handle the carriage control yourself after setting $PAGECTL to 4, and you must temporarily increase $LINEPP to account for the extra lines added to the page.
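To make the layout of such rows concrete, here is a minimal Python sketch that builds the two rows for an underscored word and counts the overstrike rows that must be compensated for. It is not SPIRES code: the function names are hypothetical, and the use of a blank in column 1 for a normally advanced row is an assumption based on conventional carriage control. The "+" overstrike character and the adjustment of $LINEPP by one per overstrike row come from the text.

```python
def underline_rows(text, word):
    """Return two output rows: the text itself and an overstrike row of underscores.

    Column 1 of each row is the carriage control character; the data
    starts in column 2.
    """
    start = text.find(word)
    if start < 0:
        raise ValueError(f"{word!r} not found in text")
    text_row = " " + text                                  # ASSUMPTION: blank = normal advance
    overstrike_row = "+" + " " * start + "_" * len(word)   # "+" = print on the same line
    return [text_row, overstrike_row]


def overstrike_count(rows):
    """Count the rows that print on top of the previous row."""
    return sum(1 for row in rows if row.startswith("+"))


def adjusted_linepp(linepp, rows):
    """Lines-per-page value after allowing one extra row per overstrike row."""
    return linepp + overstrike_count(rows)


if __name__ == "__main__":
    rows = underline_rows("The third word in this line is underscored.", "word")
    for row in rows:
        print(row)
    print(adjusted_linepp(60, rows))
```

In a format, the same bookkeeping would be done with system variables: each overstrike row put out calls for adding 1 to $LINEPP so that $LLEFT still reflects the number of printed lines remaining.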
As you have no doubt discovered, system variables are quite important in report formats. A list of system variables useful within output formats in general appeared earlier in the manual. [See B.9.2.] The list below describes variables useful within report formats in particular. Details for each variable are discussed in an Appendix. [See E.2.]
The variable type is given in parentheses after the variable name. An asterisk preceding the name indicates that it can be set explicitly by the format or by the user.
* $REPORT    (flag)   - set within a format when report mode processing
                        is in effect.
  $NOREPORT  (flag)   - set within a format when report mode processing
                        is not in effect.
* $NEWPAGE   (flag)   - can be set to tell SPIRES to start the current
                        buffer on the next page.
* $FRONTPAGE (flag)   - can be set to cause "1" to be replaced by "F"
                        when a page eject occurs (to force printing on
                        the front of a page).
* $NOBREAK   (flag)   - can be set to indicate that the current buffer
                        should begin a new page if it cannot fit on
                        the current page.
  $FREC      (flag)   - set while the first record is being processed
                        when a multiple-record processing command was
                        issued.
  $LREC      (flag)   - set after the last record has been processed
                        when a multiple-record processing command was
                        issued.
* $PAGECTL   (int)    - can be set to specify the type of carriage
                        control provided by SPIRES.
  $RECNO     (int)    - the number of records processed thus far under
                        the multiple-record processing command.
* $LINEPP    (int)    - can be set to indicate the number of lines
                        allowed per page.
  $LLEFT     (int)    - represents the number of lines left for data on
                        the current page after the current line.
* $HDRLEN    (int)    - can be set to indicate the total number of lines
                        allowed for all the header frames on the page.
* $FTRLEN    (int)    - can be set to indicate the total number of
                        lines to be left for all the footer frames
                        on the page.
* $PAGENO    (int)    - the current page number of the report.
* $MARGIN    (int)    - can be set to the number of blank characters for
                        a default left margin.
* $SKIPF     (flag)   - can be set to tell SPIRES to stop further
                        processing of the current frame.
* $SUPPRESS  (flag)   - can be set to tell SPIRES to process the data
                        but not put out any data.
* $COLUMNS   (int)    - when greater than 1, indicates that multiple
                        column processing will occur; value represents
                        the number of columns.
* $COLWDTH   (int)    - the width in bytes of each column in multiple
                        column processing.
  $CURCOL    (int)    - the number of the current column on the page
                        during multiple column processing.
* $NEWCOL    (flag)   - can be set to tell SPIRES to start the current
                        buffer in the next column.
  $FLINES    (int)    - the number of lines in the buffer when it was
                        last flushed.
* $SORTKEY   (string) - the next value in the sorted array.
  $SORTKEND  (flag)   - set when the final $SORTKEY value has been
                        retrieved.
* $SORTDATA  (string) - the data value associated with the most
                        recently accessed value of $SORTKEY.
  $SORTDEND  (flag)   - set when the final $SORTDATA value has been
                        retrieved.
Below is a sample report format definition for the public subfile BLOOD DONORS. (This file was created and described in detail in the SPIRES primer "A Guide to Data Base Development".) The report uses many of the features described in this chapter.
The format is used to create a phone list of blood donors. The donor records are sequenced by the BLOOD.TYPE, CITY and DATE (representing the date the donor last gave blood) elements, and the report is arranged by the first two categories, making it easy to find appropriate donors when necessary. Donors who have given a gallon of blood are treated specially by the blood bank, and their entries are preceded by asterisks to denote their achievement. Records are selected for the report with the search FIND CAN.BE.CALLED = YES, to include only donors who said they could be called.
The report includes a title page with the date of the report as well as a note concerning who may use the list. Each page within the report has a header and footer, including such information as the page number, the date and the blood-type of donors on that page.
Other features of the report include subtotal information for each blood-type as well as grand totals at the end of the report. Also at the end is a summarization by city, a tally created with the sorting facility.
The format definition below does not include comments, except to describe the variables that are defined. Detailed comments follow the definition, along with some sample pages of output. [See B.10.12.]
 1.  ID = GQ.JNK.DONORS.REPT;
 2.  MODDATE = WED. JAN. 12, 1983;
 3.  DEFDATE = SUN. JAN. 9, 1983;
 4.  FILE = GQ.JNK.BLOOD.DONORS;
 5.  RECORD-NAME = REC01;

 6.  VGROUP = LOCAL;
 7.    VARIABLE = PEOPLEPT;
 8.      OCC = 1;
 9.      TYPE = INT;
10.      COMMENTS = Number of people per blood-type.;
11.    VARIABLE = PINTSPT;
12.      OCC = 1;
13.      TYPE = INT;
14.      COMMENTS = Number of pints donated per blood-type.;
15.    VARIABLE = GALLONPT;
16.      OCC = 1;
17.      TYPE = INT;
18.      COMMENTS = Number of gallon donors per blood-type.;
19.    VARIABLE = PEOPLETOTAL;
20.      OCC = 1;
21.      TYPE = INT;
22.      COMMENTS = Total number of donors.;
23.    VARIABLE = PINTSTOTAL;
24.      OCC = 1;
25.      TYPE = INT;
26.      COMMENTS = Total number of pints donated.;
27.    VARIABLE = GALLONTOTAL;
28.      OCC = 1;
29.      TYPE = INT;
30.      COMMENTS = Total number of gallon donors.;
31.    VARIABLE = CITY;
32.      OCC = 1;
33.      TYPE = STRING;
34.      COMMENTS = Value of CITY element for sort array.;
35.    VARIABLE = CHECKCITY;
36.      OCC = 1;
37.      TYPE = STRING;
38.      COMMENTS = Used to check for new city in sort
39.        array.;

40.  FRAME-ID = DONOR;
41.    DIRECTION = OUTPUT;
42.    FRAME-DIM = 6,67;
43.    USAGE = DISPLAY;
44.    LABEL = NAME;
45.      GETELEM;
46.      START = 1,3;
47.      UPROC = LET PEOPLEPT = #PEOPLEPT + 1;
48.      PUTDATA;
49.    LABEL = PHONE.NUMBER;
50.      GETELEM;
51.      START = *,40;
52.      INSERT = 'Ph: ';
53.      UPROC = SET ADJUST RIGHT;
54.      PUTDATA;
55.    LABEL = DATE.GIVEN;
56.      GETELEM;
57.      START = 2,4;
58.      INSERT = 'Last Donated: ';
59.      PUTDATA;
60.    LABEL = TOTAL.PINTS;
61.      GETELEM;
62.      START = 2,40;
63.      INSERT = 'Total pints donated: ';
64.      UPROC = LET PINTSPT = #PINTSPT + $UVAL;
65.      UPROC = SET SORTKEY = #CITY;
66.      UPROC = SET SORTDATA = $STRING($UVAL);
67.      PUTDATA;
68.    LABEL = GALLON.DONOR;
69.      VALUE = '*';
70.      START = 1,1;
71.      UPROC = IF $PVAL < 8 THEN JUMP;
72.      UPROC = LET GALLONPT = #GALLONPT + 1;
73.      PUTDATA;
74.    LABEL = ADDRESS;
75.      GETELEM;
76.      START = 3,4;
77.      PUTDATA;
78.      LOOP;
79.    LABEL = BLANK.LINE;
80.      VALUE = ' ';
81.      START = X,1;
82.      UPROC = SET NOBREAK;
83.      UPROC = IF $LLEFT = 0 THEN JUMP;
84.      PUTDATA;
Comments:
47. In this Uproc, we begin to collect the subtotal information tha