*  A Guide To Output Formats -- A SPIRES Primer
+  Introductions
+1  Introduction to this Primer
+2  Introduction to Formats
1  Designing an Output Format
1.1  A Sample Design
1.2  Frames, Frame Types, and Frame Dimensions
2  Writing a Format Definition -- An Outline of the Definition
3  Beginning the Format Definition: The Identification Section
4  Controlling Format Execution -- the Frames Definitions Section
4.1  The Identification Elements of the FRAME-DEF Structure
4.2  Label Groups
4.2.1  Constructing the Label Groups for Our Sample Format
       The LABEL, GETELEM and PUTDATA Statements
       Text Placement and the INSERT Statement
       Positioning: START, LENGTH, MARGINS and MAXROWS Statements
       More on Text Placement: The VALUE, TITLE and TSTART Statements
       Calling Indirect Frames: The IND-STRUCTURE and IND-FRAME Statements
       Accessing Multiple Occurrences of an Element: the LOOP Statement
       The UPROC Statement
       The Label Groups for the Indirect Frame; the DEFAULT Statement
5  Linking Frames Together: The Format Declaration Section
6  Adding and Compiling the Format Definition
7  Frame Dimensions, SET FLUSH and Line-by-line Format Processing
7.1  Fixed Frame Dimensions and the SET FLUSH Uproc
7.2  Line-by-line Processing
8  Using Variables and UPROCs
8.1  Some System Variables Used Only in Formats
8.2  User Variables
8.3  Some Other Useful Uprocs
9  Other Formatting Capabilities
:  Appendix
:A  Sample Format Definition
:29  SPIRES Documentation

*  A Guide To Output Formats -- A SPIRES Primer

*                                                                *
*                     Stanford Data Center                       *
*                     Stanford University                        *
*                     Stanford, Ca.   94305                      *
*                                                                *
*       (c)Copyright 1994 by the Board of Trustees of the        *
*               Leland Stanford Junior University                *
*                      All rights reserved                       *
*            Printed in the United States of America             *
*                                                                *

        SPIRES (TM) is a trademark of Stanford University.

+  Introductions

+1  Introduction to this Primer

This document is one of the primers about the SPIRES data base system. Prior to the appearance of this document, most SPIRES users who created output formats did so after learning the SPIRES file definition language; however, knowledge about SPIRES file definitions is not needed if you want to learn how to write simple output formats.

Other SPIRES primers, "A Guide to Searching" and "A Guide to Data Base Development", which teach basic search techniques for using SPIRES subfiles, briefly discuss the formatting of goal records through simple means. For instance, the "TYPE element-list" command displays only the named elements to you. Also, the system format $REPORT lets you create simple reports to display particular pieces of goal records. (In fact, you might want to consider whether those two features will meet your formatting needs before you invest time in this primer.) Besides teaching those commands, the two primers also provide a short introduction to using custom formats, designed expressly for the subfile and accessed through the SHOW FORMATS and SET FORMAT commands.

This primer will teach you how to create your own output formats for SPIRES subfiles, whether or not you even own a subfile or have defined one. Though this document can be useful to people who have read only "A Guide to Searching", most of it is really aimed at SPIRES users who have read the basic SPIRES primer "A Guide to Data Base Development". That is because creating a "format definition" involves updating a SPIRES subfile, a topic not taught in the first primer. Still, you will learn here the specific commands involved in updating the subfile; the point is that you will understand better what is happening if you are familiar with the updating procedure, which is also covered in "SPIRES Searching and Updating", part D. Moreover, you will find that the more you know about SPIRES files and subfiles, and in particular about file definitions, the easier it will be to understand this material. Also, to understand chapters 8 and 9 of this primer, you will need to be familiar with SPIRES protocols and the file definition language, respectively, both of which are the subjects of other SPIRES manuals and of future primers. Those two chapters should be considered optional if what has been described up to that point is sufficient for your needs.

In an effort to make reading easier, we did not include cross-references to the reference manual "SPIRES Formats" in this primer. The cross-references here refer to other sections of this manual. For further explanation of a formats statement or concept, you should consult "SPIRES Formats" or use the EXPLAIN command.

+2  Introduction to Formats

SPIRES formats provide a method for moving data from one form to another. They can move data from records stored within a SPIRES file to a terminal screen or to a piece of paper, converting the data along the way into a legible, meaningful form. Or, vice versa, they can take information typed into a terminal or entered into the computer on cards or tape and convert it into records in a SPIRES file, stored in special internal forms. Formats that move data from within a SPIRES file out to you are called output formats; those that put data into a SPIRES file are called input formats. (A third type of format, "inout", is usually used for processing records on full-face terminals rather than those in line-by-line mode.) The topic of this primer is output formats; however, much of what you will learn about creating output formats applies to creating any type of format. If you are interested in writing input formats, it might be easier for you to learn to write output formats first.

Most output formats are designed to display the goal records of a particular subfile, but you may write a format for any record-type within a file.

Here is a brief terminal session, examining a record in the standard SPIRES format and the same record in a format designed for the goal record:

Above is an example of the standard SPIRES format, often used when adding or updating records. It is designed to be functional rather than attractive. However, if you reset the DISPLAY format designed especially for this subfile:

Now the record looks more like a recipe. Thus, many subfiles in SPIRES use customized output formats to display their data in more interesting ways than the standard format provides. Also, since an output format does not have to display every element in the record, formats are often used to limit the displayed material to only what a user wants or needs. Because SPIRES is a generalized system with many different types of files, there are many different ways that formats are used. In most cases, however, output formats are designed for either or both of the purposes above: to display data more attractively or functionally and to subset the information in a record so that only certain elements are displayed. You might want to select some public subfiles and see whether they have custom formats, and if so, display some records through those formats to see how they were designed.

After examining a record in both the standard SPIRES format and a custom format, you probably have some idea of what a format can do:

 - it can get elements from the record and place them almost anywhere in the output display;
 - it can position character strings ("text") or variable values anywhere therein as well;
 - multiple values can be placed on one line;
 - values can be centered on a line, or justified so that the right margin is not ragged;
 - computations or value-checking can be done, with the results controlling the placement or non-placement of a value in the format.

The possibilities for format design are not limited to those just listed, although this primer almost is. One of the aims of this document is to show you many of these formatting capabilities; as you read it, you will probably think of more ways to apply them.

Creating a SPIRES Format

Creating a format is similar in many ways to creating a SPIRES file. First, you must write a "format definition". This is a record that contains the characteristics of the format or formats you are creating; it tells SPIRES where to get and what to do with the data that you will eventually be examining through the format. (More than one format can be defined in a format definition; generally though, for the sake of simplicity, this manual will talk about one format per format definition.)

Second, you add the format definition to a public subfile called FORMATS. Just as SPIRES checks an added record in any other subfile for correct syntax, required elements, and so forth, SPIRES examines your added format definition to be sure that it adheres to the FORMATS subfile's rules for added records. Anyone can add records to FORMATS, and in fact, you are allowed to add format definitions for any SPIRES file. The catch is that you may not be able to use them. For example, if you cannot select a particular subfile, you cannot use any formats you may define for it. If you can only display records in a subfile, you cannot use any input formats that you may write for it. Note then that formats cannot be used to bypass subfile security.

Third, you must compile the format definition using the COMPILE command. Any errors that SPIRES detects in the format definition cause a compilation error; you must then transfer the record from the FORMATS subfile, update it, and try compiling again.

Fourth, when the format definition is successfully compiled, you will want to try your format. After selecting the appropriate subfile, you set the new format and display records. Hopefully, the format works the way you wanted; if not, you can update the record in FORMATS and then recompile it.

An Example of a Simple Format Definition

To show you that this process is relatively uncomplicated, we will create a format definition here in the Introduction. It will be a simple output format to create mailing labels from a subfile of local names and addresses called MEMBERS. We will need to know several pieces of information about the subfile. First, however, we need to decide what elements from the goal record we will display in the format. The elements available to us can be shown:

We will use the NAME and LOCAL-ADDRESS elements. (It is important that you know what types of values a given element can have before you start. For instance, what can be found in LOCAL-ADDRESS? Only the street address? Or does it include the city and state and zip code? In this case, LOCAL-ADDRESS has a separate occurrence for each line of an address on an envelope, including the city, state and zip code.) In order for a special label program to create mailing labels, we want the formatted records to look like this:

     !             <--- exclamation point as delimiter for LABELER program
     NAME          <--- the NAME element value
     LOCAL-ADDRESS <--- the 1st occurrence of the LOCAL-ADDRESS
     LOCAL-ADDRESS <--- the 2nd occurrence of that element

Thus, our format design is complete. The next step is to write the format definition.

The format definition is not only a set of instructions ("statements") telling SPIRES how to format data but is also a record in the FORMATS subfile (containing "elements"), so it must follow the standard rules for record input. [See "SPIRES Searching and Updating", section D.1.2.] (When discussing a format definition, we often call the elements of the definition "statements" so that they are not confused with the elements to be displayed by the format.)

Here is the complete format definition for the format desired.

     2.    AUTHOR = John Klemm, Data Resources Group, 497-4420;
     3.    COMMENTS = Output Format Definition for MEMBERS subfile;
     4.    FILE = GQ.JNK.MEMBERS;
     5.    RECORD-NAME = REC01;
     6.    FRAME-ID = LABEL-OUT;
     7.       DIRECTION = OUTPUT;
     8.       FRAME-DIM = 5,36;
     9.       USAGE = DISPLAY;
    10.       LABEL;
    11.          VALUE = '!';
    12.          PUTDATA;
    13.       LABEL;
    14.          GETELEM = NAME;
    15.          PUTDATA;
    16.       LABEL = LOCAL-ADDRESS;
    17.          GETELEM;
    18.          PUTDATA;
    19.          LOOP;
    20.    FORMAT-NAME = LABELS;
    21.       FRAME-NAME = LABEL-OUT;
    22.          FRAME-TYPE = DATA;

We can break that format definition down into three main sections:

1) Identification Section (lines 1-5)

These elements give information about the format definition itself and the file to which it applies. The ID element is the key of the format record; that is, it is a unique "name-tag" that is used to distinguish this particular record from any other in the FORMATS subfile. [See "SPIRES Searching and Updating", sections D.1.1 and D.1.3 for further explanation of keys.] FILE and RECORD-NAME are the names of the file and the particular record type in the file for which the definition is being written. Unless you are the file owner, you probably do not know these values, but you can get them with the SHOW SUBFILE INFORMATION command. [See chapter 3.]

2) Frame Definitions Section (lines 6-19)

These elements and structures make up the body of the format. They give SPIRES directions for processing the goal record -- they do the work. If you look at some of the statements (lines 14 and 15, for example), you will see that they tell SPIRES to GET an occurrence of an ELEMent and PUT that DATA into the format in a specific location. This work may be handled by one or by many such "frames", depending on the complexity of the task. Here there is only one frame. [See 1.2.]

3) Format Declaration Section (lines 20-22)

In this section the frames, the building blocks of a format, are put together and given a name (FORMAT-NAME). When a format is set, using the value of FORMAT-NAME in a SET FORMAT command, and a record is displayed, the frames named below that FORMAT-NAME are executed -- that is, the instructions given in the frame definitions are carried out.

Those 22 lines above constitute a complete, simple format definition. To use the single format defined, you would collect it in your active file, add the definition to the FORMATS subfile and then compile the format definition using the COMPILE command:

Though not very many formats are as simple as the above example, creating any format involves the same process, though with more detail, depending on the complexity of the desired format. The main purpose of this primer is to take you through this process step by step, explaining it to you on the way. The first six chapters cover those first two steps of creating output formats -- writing the format definition and adding it to the FORMATS subfile. The sixth chapter also teaches you how to compile the format. The last three chapters briefly discuss some more advanced aspects of format design and definition.

You might learn more from this manual if you try to create a format as you read. Most of the examples will be based on a simple task -- to make an output format for DRINKS, a public subfile. (If you are not familiar with this subfile, you might want to select it, search for some records and display them, etc.) Pieces of the format definition created for DRINKS will be displayed throughout the primer; the definition is printed completely in Appendix A.

If you do try to write a format while you read, please be sure that it will be "simple" -- otherwise, you may find that you are concentrating more on writing your sample format definition than on learning this material. Generally, files with only a few elements are the easiest to use. You can design a format for DRINKS, if you like. The PEOPLE subfile is a good one to use. Though more complicated, the RESTAURANT subfile can also be used.

1  Designing an Output Format

The first step in creating an output format is to design it. You should have a clear picture in your mind of what a record being displayed in your format will look like; ideally, you should transfer that image to a piece of paper. Writing a format definition is translating that image into special instructions for SPIRES to follow when it is displaying records to you.

Designing your format can be simple or complex, depending of course on your needs. For example, if you merely want to display one element of a record using your format, then designing it (and then converting your design into a format definition) will be trivial. However, if you need to display many elements, some of them perhaps multiply occurring, and center some of them and right-justify others, and not display one element if another element occurs, and so forth, then your design (and your format definition) will be more complicated. SPIRES can certainly handle many types of display formats, but not every possibility can be discussed in this primer. If you want to know for sure whether you can actually create a particular format you have designed, feel free to discuss it with the SPIRES consultant.

The following are some of the formatting capabilities you will learn here. They should give you an idea of what you can do in an output format. You can:

In this chapter we will design our sample format for DRINKS and learn about "frames" which make up a format.

1.1  A Sample Design

Suppose you want to design an output format for the DRINKS subfile. You would like it to look like a standard recipe -- that is, like the example shown a few pages ago. Your design might look like this (the column numbers at the top, not included in the design, are shown for clarity):

     (NAME)                                              Makes (QUANTITY)

     Ingredients:    (AMOUNT)  (CONSTITUENT)
                     (AMOUNT)  (CONSTITUENT)

       (INSTRUCTIONS,  e.g.,  "Mix fifteen  gallons of gin  with  fifteen
     gallons of fruit punch in any clean bathtub.  Always remember to put
     in the drain stopper.")

     From: (SOURCE).

     Contributed by (CONTRIBUTOR).

That is a reasonable design for a simple output format. The elements of the record are indicated by their names in parentheses. "Makes", "Ingredients:", "From:" and "Contributed by" are literal strings to appear in the display.

We will limit the display to 68 characters in width, but the vertical height (the number of lines) may vary, depending on the number of ingredients and the length of INSTRUCTIONS. [See 1.2, where we discuss "choosing frame dimensions", including why we chose "68".] Here is a brief description of the formatting characteristics we want to specify for each element:

(An account number is a six-character value used at Stanford University, where this application was developed, that is similar in purpose to a "userid" in CMS.) In this format, if the CONTRIBUTOR value is not an account number, it should be displayed; if it is an account number, the contributor's desire for anonymity will be honored. If displayed, the element should appear on the second line after SOURCE, be preceded by the words "Contributed by" and be followed by a period.

When you design your format, you must decide what horizontal and vertical dimensions you want, which is the subject of the next section. The major points to consider are, first, the size of the records and how much of them will be displayed, and second, whether the format is being designed for online use or for printed documents.

1.2  Frames, Frame Types, and Frame Dimensions

A "frame" is the two-dimensional area into which values are placed when a record is processed. A frame is also the set of instructions that creates that area and places those values within it (the term could be considered an abbreviation for "frame definition" in this context). The term "frame" thus has two related meanings, either or even both of which may be intended when it is used. In most cases, the context will determine the intended meaning.

More than one frame can be executed by a single format. Usually if your format definition defines just one format and all of the goal record elements being accessed are record-level elements (i.e., none of them are from structures), then only one frame is needed. It is called the "data frame", and "data" is the value entered for FRAME-TYPE in the Format Declaration section of the format definition. [See 5.] Every output format must have at least one data frame. A data frame can access all record-level elements in the goal record.

Often, however, your goal record contains structures, and you want to access element values within those structures. In that case you need to use an "indirect frame". As you are writing the data frame definition, when you arrive at the point where you want to begin processing elements in the structure, you have the data frame "call" (that is, "transfer execution to") the indirect frame. The indirect frame then processes the elements within the first occurrence of the structure, usually placing them directly into the data area defined by the data frame. When the indirect frame is done, it returns execution control back to the data frame at the point where the indirect frame was called. If there are more occurrences of the structure, then the data frame can (if so instructed) call the indirect frame again to process the next occurrence, and so forth. When all occurrences of the structure are processed, the data frame continues processing after the point where the indirect frame was called.

Thus, an indirect frame is a frame within a frame. The data frame calls the indirect frame, which executes and then returns control to the data frame. Note too that if a goal record structure itself contains a structure, then the indirect frame which processes it must, in turn, call another indirect frame to process the elements within the innermost nested structure. Frame processing can be nested like this to a depth of ten frames -- that same limit also applies to structure nesting in file definitions.

Examining our designed DRINKS format above, we see that we will need two frames: a data frame to process all record-level elements, such as NAME, INSTRUCTIONS, and so forth, and an indirect frame to process the two elements within the INGREDIENTS structure.
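
The call-and-return relationship between the two frames can be pictured with a short Python sketch. This is only an analogy under stated assumptions: the function names, the dictionary layout of the record and the sample values are hypothetical, and a plain list of lines stands in for the data area.

```python
def run_indirect_frame(occurrence, out):
    """Process the elements inside one occurrence of INGREDIENTS."""
    out.append(f"{occurrence['AMOUNT']}  {occurrence['CONSTITUENT']}")
    # Returning here hands control back to the data frame.


def run_data_frame(record):
    """Process record-level elements, calling the indirect frame as needed."""
    out = []                                   # stands in for the data area
    out.append(record["NAME"])                 # record-level element
    # Call the indirect frame once per occurrence of the structure.
    for occurrence in record.get("INGREDIENTS", []):
        run_indirect_frame(occurrence, out)
    out.append(record["INSTRUCTIONS"])         # resume after the calls
    return out


# Invented sample record, for illustration only.
drink = {
    "NAME": "Sample Punch",
    "INGREDIENTS": [
        {"AMOUNT": "1 cup", "CONSTITUENT": "juice"},
        {"AMOUNT": "2 cups", "CONSTITUENT": "soda"},
    ],
    "INSTRUCTIONS": "Mix and serve.",
}
print("\n".join(run_data_frame(drink)))
```

Note that the indirect frame writes into the same output area as the data frame, which mirrors the usual arrangement described above.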

Next, we need to determine frame dimensions for these two frames. These are values specified in the frame definitions that tell SPIRES how much computer memory to allocate in which to construct the data area before moving it to your terminal or active file. Frame dimensions are pairs of numbers -- the first number represents the number of rows (lines) that the frame can use (its vertical height), and the second number represents the number of columns per row (its width). During frame execution, SPIRES establishes a grid-like array internally using these dimensions after which placement of elements and values can begin, with the frame definition specifying array locations (row, column) where value placement should begin. The precise location (say, third row, first column) can be given for an element being positioned in the format; however, in most cases, a more general location is given, such as "the first column of the next row after the previous value is positioned". This flexibility saves you from worrying about exactly how many rows an element value needs to take; whatever the number is, the next element can start in the next row.
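
To picture the grid, consider this toy Python model. It is a sketch, not a description of SPIRES internals: the class name `Frame`, the method names and the error handling are all assumptions, and the model ignores features such as wrapping a long value onto following rows.

```python
class Frame:
    """A toy rows-by-columns character grid (rows and columns are 1-based)."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.grid = [[" "] * cols for _ in range(rows)]
        self.last_row = 0  # highest row that has received a value so far

    def put(self, text, row=None, col=1):
        """Place text at (row, col); row=None means 'the next row'."""
        if row is None:
            row = self.last_row + 1
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise ValueError("position outside the frame")
        if col - 1 + len(text) > self.cols:
            raise ValueError("text does not fit on the row")
        self.grid[row - 1][col - 1:col - 1 + len(text)] = list(text)
        self.last_row = max(self.last_row, row)

    def lines(self):
        """Return the rows up to the last one used, without trailing blanks."""
        return ["".join(r).rstrip() for r in self.grid[:self.last_row]]


frame = Frame(5, 20)
frame.put("AB", row=3, col=1)   # a precise location: third row, first column
frame.put("CD")                 # a general location: start of the next row
frame.put("EF", col=4)          # next row again, starting in column 4
print(frame.lines())
```

The two calls without a row argument show why the general form is convenient: neither needs to know how many rows the earlier values used.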

So how many rows by how many columns do we want our data frame to be? What about our indirect frame? How do we determine these values?

There are basically two factors to consider: first, the size of the records and how much of each will be displayed; and second, whether the format is being designed for online use (simple display at a terminal) or for printed documents. Generally, the first factor relates to the number of rows and the second to the number of columns.

The second is usually easier to determine since the format's use is an important design consideration, and thus the number of columns per row is easier to set from the start. If the format is primarily going to be used online, then a row should be no longer than the number of columns available on your terminal. If it is being used for printed documents, you need to consider paper size, character set size and so forth. Ideally, our format will be used for both purposes. The best size for a terminal display is usually 68 columns. Most CRTs have 80-column rows, and choosing 68 leaves room for line numbers on the side when we put the records into the active file. This length will also be good for printed output. Naturally, different purposes and situations will require different row lengths. You will have to consider your own particular needs before setting that number.

Deciding the number of rows seems the more difficult task. One goal record might have one occurrence of each element; another might have five of each. Thus, if we want the format to display all occurrences of each element, each new occurrence beginning on a new row, the goal records could vary tremendously in height. Our task is made much simpler by the fact that we do not need to know exactly how many rows the largest record will need; we just need to choose a number higher than whatever that amount might be. [See 7.1 to learn when even that is not crucial.] We do not need to be precise because SPIRES will use only the number of rows necessary to follow the frame instructions. If fewer rows than specified are needed, then only those rows will be used; SPIRES will discard all the extra blank rows at the end.

So, since the number need not be exact, we can quickly estimate a number for each frame. For the data frame, if there were 10 INGREDIENTS (quite unusual for a drink!) and twelve rows' worth of INSTRUCTIONS (inconceivable!), this large record would require about 30 rows. Since that would be a truly unusual record, let's set the number of rows for the data frame to 30. Thus, the frame dimensions for the data frame are 30 rows by 68 columns.
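
The estimate can be checked with a little arithmetic. The Python sketch below counts rows according to one reading of the sample design (one row each for the NAME, SOURCE and CONTRIBUTOR lines, plus four blank separator rows); that breakdown and the function name `estimate_rows` are assumptions made for illustration.

```python
def estimate_rows(ingredients, instruction_rows):
    """Estimate the rows needed by the DRINKS design for one record."""
    name_line = 1          # NAME and "Makes (QUANTITY)"
    source_line = 1        # "From: (SOURCE)."
    contributor_line = 1   # "Contributed by (CONTRIBUTOR)."
    blank_lines = 4        # separators between the five blocks
    return (name_line + ingredients + instruction_rows
            + source_line + contributor_line + blank_lines)


# The unusually large record from the text: 10 ingredients and
# twelve rows of instructions.
print(estimate_rows(10, 12))
```

Under those assumptions the large record needs 29 rows, so a frame height of 30 leaves a little room to spare.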

Of course, if your goal record contains elements that are fixed in occurrence and length, you might be able to determine the exact number of rows needed. However, as long as you remember that the precise number of rows is not needed by SPIRES (as long as your estimate is greater than or equal to the number of rows needed for a large record), you do not need to worry. Also, do not worry about being too high in your estimate; unless it is extremely high (say, you have set it to 10 times more rows than you'll ever need), the efficiency of format processing will not be affected very much.

The indirect frame can be handled quite simply: we do not need to specify frame dimensions. If they are omitted, the indirect frame operates within the same data area defined by the data frame.

There is another way to choose frame dimensions that frees you from setting the number of rows, called "line-by-line" processing. Because it is a more complicated concept, it will not be discussed until much later in the primer. [See 7.]

One final rule for frame dimensions: the only limit to frame size is that for each data frame, the product of the two dimensions must be less than 65,536.
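
That rule is easy to test before you commit to a pair of dimensions. Here is a minimal Python check; the function and constant names are hypothetical, and only the limit itself comes from the text.

```python
MAX_PRODUCT = 65_536  # rows * columns must be less than this


def frame_dim_ok(rows, cols):
    """Return True if a data frame of rows x cols respects the size limit."""
    return rows > 0 and cols > 0 and rows * cols < MAX_PRODUCT


print(frame_dim_ok(30, 68))     # the DRINKS data frame: 2,040 positions
print(frame_dim_ok(1024, 64))   # exactly 65,536 positions: too large
```

At 68 columns, the tallest frame that passes this check has 963 rows.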

Now that we have a broad conception of frames and frame dimensions, we are ready to begin the format definition. More details to flesh out your understanding of frames will be presented later. [See 4.]

2  Writing a Format Definition -- An Outline of the Definition

Once you have designed the format, you will find that it is relatively easy to write the format definition, a set of instructions telling SPIRES where to get data, how to process it, and where to put it to display it. You then add the format definition to the FORMATS subfile.

Like a record in any SPIRES subfile, a format definition contains certain required elements, such as a key. Some of the elements at the record level are structures. Here are the most important elements at the record level of records in the FORMATS subfile in the order in which they appear:

These group themselves into the four main parts of a format definition:

The first part identifies the record's name (ID) and the file and record name of the file for which the formats defined therein will be used. [See 3.]

If SPIRES will need to create and use any special variables to follow the instructions given by the format definition, then those variables must be declared in the Vgroups section, part two of the definition. [See 8.2.]

The instructions for each frame of the format are given in the Frame Definitions section, part three. Each occurrence of the FRAME-DEF structure represents one frame to be a part of one or more formats, and it includes the information that tells SPIRES to move data from the record to the area where the record is being displayed, performing any desired computations or value manipulations along the way. The frames in the Frame Definitions section do the formatting work; they are the building blocks of a format. [See 4.]

There are two main tasks for the last section of the format definition, the FORMAT-DEC structure. First, the frames making up your format must be linked together. The frames defined in the Frame Definitions section are only building blocks that can be entered in any order. In the Format Declaration section, you give SPIRES a list of the frames you want your format to use, generally in the order in which you want them to appear in the format. The second task is to give this ordered list a format name to be used in the SET FORMAT command. [See 5.]
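
One way to keep the four parts straight is to model them as a small data structure. The Python sketch below is only an analogy: the class and field names are hypothetical, and the sample values are taken from the MEMBERS example in the Introduction (its ID is left out because it is not shown there).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class FrameDef:
    frame_id: str
    frame_dim: Optional[Tuple[int, int]] = None    # (rows, columns)
    label_groups: List[str] = field(default_factory=list)


@dataclass
class FormatDec:
    format_name: str          # the name given in SET FORMAT
    frame_names: List[str]    # frames in the order they should run


@dataclass
class FormatDefinition:
    ident: Dict[str, str]            # part 1: ID, FILE, RECORD-NAME, ...
    vgroups: List[str]               # part 2: declared variables, if any
    frame_defs: List[FrameDef]       # part 3: the building blocks
    format_decs: List[FormatDec]     # part 4: named, ordered frame lists

    def frames_for(self, format_name):
        """Resolve a format name to its frames, in declared order."""
        by_id = {f.frame_id: f for f in self.frame_defs}
        for dec in self.format_decs:
            if dec.format_name == format_name:
                return [by_id[name] for name in dec.frame_names]
        raise KeyError(format_name)


members = FormatDefinition(
    ident={"FILE": "GQ.JNK.MEMBERS", "RECORD-NAME": "REC01"},
    vgroups=[],
    frame_defs=[FrameDef("LABEL-OUT", (5, 36))],
    format_decs=[FormatDec("LABELS", ["LABEL-OUT"])],
)
print([f.frame_id for f in members.frames_for("LABELS")])
```

The lookup in `frames_for` reflects the point made above: frame definitions may be entered in any order, and it is the declaration that fixes which frames a format uses and in what sequence.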

3  Beginning the Format Definition: The Identification Section

Although a format definition does contain a set of formatting directions, it is also a record in a subfile. It is important to remember that it contains elements, one of which is the key to the record, and that the elements are entered in the SPIRES default format (i.e., "element-name = value;"). The following identification elements are the first parts of the record that you collect in your active file in order to add the record to the FORMATS subfile:

ID = gg.uuu.anyname;

The ID element is the key of the FORMATS record. The first seven characters must be your six-character account number (in the form gg.uuu) followed by a period. The value "anyname" can be any alphanumeric string of 16 or fewer characters, none of which can be blanks. (Blanks are often indicated with periods, which are allowed.) Of course, since it is the key of a record, it must be unique -- for each format definition you write, "anyname" must be different.

Because the key of the record contains your account number, SPIRES can ensure that only someone logged on to your account can update or remove your format definitions. Also, you cannot add format definitions if the "gg.uuu" part of the ID is any account other than your own.

The value you provide as the ID will be used when you transfer, update or display the format definition from the FORMATS subfile. You will also specify it when you compile the format definition. However, the ID is not used in the SET FORMAT command; the element FORMAT-NAME in the Format Declaration section of the definition will specify that. [See 5.]
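
The rules for the ID value can be summarized as a quick check. The Python sketch below is not how SPIRES validates the key; it assumes that the account parts consist of letters and digits, that "anyname" may contain letters, digits and periods, and that comparison with your account ignores case. The function name and sample values are hypothetical.

```python
import re

# Assumed shape: gg.uuu. followed by 1 to 16 letters, digits or periods.
ID_PATTERN = re.compile(r"[A-Za-z0-9]{2}\.[A-Za-z0-9]{3}\.[A-Za-z0-9.]{1,16}")


def looks_like_format_id(value, account):
    """Check an ID against the rules above for the given account (gg.uuu)."""
    return (ID_PATTERN.fullmatch(value) is not None
            and value.upper().startswith(account.upper() + "."))


print(looks_like_format_id("GQ.JNK.MY.LABELS", "GQ.JNK"))   # well formed
print(looks_like_format_id("GQ.JNK.MY LABELS", "GQ.JNK"))   # contains a blank
```

A check like this catches only malformed values; uniqueness of the key can be determined only by the FORMATS subfile itself.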

COMMENTS = remarks;

This element is optional; it is often used to specify the purpose of the format definition. COMMENTS elements are available throughout the format definition. This is a good place to remind you that if the element value contains any internal semicolons (;), the entire value must be placed in quotation marks (COMMENTS = "remarks;remarks";). Similarly, if internal quotation marks are necessary, the entire value must be surrounded by quotation marks and the internal ones must be doubled (COMMENTS = "... ""remarks"" ...";). Remember that these rules apply to all elements of the format definition, of course.
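
If you generate statements with a program, the two quoting rules can be applied mechanically. The following Python sketch shows one way to do so; the function names are hypothetical, and the sketch covers only the semicolon and quotation-mark rules stated above.

```python
def quote_value(value):
    """Quote a value if it contains a semicolon or a quotation mark."""
    if ";" in value or '"' in value:
        # Double any internal quotation marks, then wrap the whole value.
        return '"' + value.replace('"', '""') + '"'
    return value


def statement(name, value):
    """Build one 'element-name = value;' line."""
    return f"{name} = {quote_value(value)};"


print(statement("COMMENTS", "remarks;remarks"))
print(statement("COMMENTS", 'some "remarks" here'))
print(statement("AUTHOR", "name"))
```

Values that contain neither character are passed through unchanged, so ordinary statements look exactly as they do in the examples in this primer.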

AUTHOR = name;

This optional element usually contains your name, address and phone number.

MODDATE = date; DEFDATE = date;

These two elements are automatically supplied by SPIRES when you add the record to the FORMATS subfile. DEFDATE contains the date the record was added; MODDATE contains the date the record was last updated.

FILE = gg.uuu.filename; RECORD-NAME = record-name;

These two elements connect the formats being defined to a SPIRES file and a particular record-type (or goal record) within the file. If you are the file definer, then you know these values from the file definition. FILE is the name of the SPIRES file as it appears as the FILE element in the file definition; RECORD-NAME is the name of the record type within the file for which the format is being written. So, if the format is designed to display goal records of a subfile, then "record-name" is the same name as the GOAL-RECORD element in the subfile section of the file definition, both values referring to the RECORD-NAME element for the appropriate record definition in the file definition. If you are not the file owner, you can find out these values by selecting the subfile and issuing the SHOW SUBFILE INFORMATION command:

The FILE then is GQ.JNK.DRINKS and the RECORD-NAME is REC1; these are the values to use in the format definition. Note that "gg.uuu" in the value for FILE is the account under which the file is stored. It need not be the same as the account used for the format definition ID, though it is if the format definer is the file owner.

Here is our format definition for DRINKS so far, collected in the active file:

Note from the example that the ID value often repeats the file name, followed by a period and the name of the format (FORMAT-NAME) being defined. This practice is completely optional, but if you have many format definitions for many files, it helps you identify a particular ID more easily.
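As a rough sketch, an identification section along these lines might be collected in the active file. The FILE and RECORD-NAME values are the ones found above; the ID is left as the "gg.uuu.anyname" placeholder, since the actual account and format name are not given here and must be supplied by the format definer:

```text
ID = gg.uuu.anyname;
  COMMENTS = Placeholder ID shown - substitute your own account and name;
  FILE = GQ.JNK.DRINKS;
  RECORD-NAME = REC1;
```

MODDATE and DEFDATE are not entered, since SPIRES supplies them when the record is added.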

Though the next format section is Vgroups, where you tell SPIRES what variables to create for use in the frames, you will not know what they are till you have written the frame definitions, which are thus the subject of the next chapter.

4  Controlling Format Execution -- the Frames Definitions Section

The FRAME-DEF structure, as you surely have guessed, is the most complicated part of the format definition, for here you define the frames that will make up your format, specifying the processing of element values, text and perhaps calculations. Though there is a finite number of control words in the SPIRES frame vocabulary (slightly more than you will learn in this primer), the number of combinations that can be used in frames is practically infinite.

The elements found within a frame (that is, one occurrence of the FRAME-DEF structure) are FRAME-ID, DIRECTION, SUBTREE (in indirect frames), FRAME-DIM, USAGE, COMMENTS and LABEL-GROUPs. LABEL-GROUP is a multiply occurring structure that contains the specific processing instructions. Each label group can get one element (one or all occurrences) from the goal record being processed, perhaps modify that value in some way, and then put the value into the frame in a specific location. Label groups can also direct SPIRES to make computations or they can call indirect frames. You will learn how to write label groups in section 4.2. The other elements are discussed in the following section.

4.1  The Identification Elements of the FRAME-DEF Structure

The following elements serve as identifying and descriptive elements for the frame:

FRAME-ID = framename;

FRAME-ID is the key of the FRAME-DEF structure; it uniquely identifies the frame being defined. The name used here will also appear in the FORMAT-DEC structure later. [See 5.] The "framename" can be from one to sixteen characters long, with no blanks allowed.

DIRECTION = direction;

This element tells SPIRES whether the frame will be used in an OUTPUT, an INPUT, or an INOUT format -- that is, it tells SPIRES in which direction data will flow. Since this primer teaches you how to write output formats, the value you will use here will be OUTPUT.

SUBTREE = structure-name;

This statement appears in the definitions of indirect frames used to process structures. It names a structure within a record that can be addressed by the current frame.

FRAME-DIM = nrows,ncols;

These are the frame dimensions determined earlier. [See 1.2.] These two values define the size of the two-dimensional data area into which the elements and other values will be placed. Here, "nrows" is the maximum number of rows to be used, and "ncols" is the number of columns. Remember that we can omit this statement for an indirect frame if we want it to have the same dimensions as the frame that calls it.

If "nrows" is 0, then the format operates in "line-by-line" mode. [See 7.]

USAGE = value;

This element, like DIRECTION, tells SPIRES how the format will be used. Its value is usually DISPLAY for output formats, which is the default if no value is given. As with DIRECTION, this primer uses only the one value, DISPLAY.

COMMENTS = text;

Again, if you want to make any comments (for instance, about the frame's use or purpose), you can use this element.

Below are the beginnings of the two frame definitions we will need for the DRINKS subfile format we are creating. On the left is the data frame, and on the right is the indirect frame for the INGREDIENTS structure, as indicated in the COMMENTS element.


These frame definitions are far from complete, of course. In the next section, we will add the label groups, the sets of instructions that place data into the data area.

Naturally, in the actual format definition, these two frame definitions would not appear side by side -- one follows the other. The order in which the frame definitions appear here does not matter, as we will see later. [See 5.]
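Written one after the other, the beginnings of the two frame definitions might be sketched as follows. The frame names DRINK and INGREDIENTS and the 68-column width come from elsewhere in this primer; "nrows" is left as a placeholder because the row count chosen in the design chapter is not repeated here, and the COMMENTS wording is only illustrative:

```text
FRAME-ID = DRINK;
  DIRECTION = OUTPUT;
  FRAME-DIM = nrows,68;
  USAGE = DISPLAY;
  COMMENTS = Data frame for the DRINKS format;

FRAME-ID = INGREDIENTS;
  DIRECTION = OUTPUT;
  SUBTREE = INGREDIENTS;
  USAGE = DISPLAY;
  COMMENTS = Indirect frame for the INGREDIENTS structure;
```

The indirect frame omits FRAME-DIM here, so it would take the dimensions of the frame that calls it.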

4.2  Label Groups

In this section we will put together the heart of the format definition, the label groups, which are the actual instructions that tell SPIRES what to put in the data area. The label groups are multiply occurring structures within the FRAME-DEF structure; yet you will see that these structures, appearing in SPIRES default format, strongly resemble lines of programming commands. During format execution, SPIRES moves line by line through each label group, starting with the first one in the frame definition and continuing by default to the next one each time it completes the instructions in the current one. So, with the elements in the FORMATS record looking especially like commands, it is worth repeating that they are also elements in a record and must be entered into the active file in the SPIRES default format.

Each label group can access one element in the goal record, modify that value if desired and then position it within the data area. Label groups can also be used to make computations, check values or even control frame execution. A single label group is a unit that should be used to do just a single thing; "one label group, one action" could be a motto. Only one element or structure (and its multiple occurrences) or a single variable or text value can be processed by a single label group.

Unlike the last two chapters, which first explained the statements of a particular section of the format definition and then showed you the appropriate parts of our sample definition, this chapter will create, step by step, each label group for the example, explaining the statements used along the way.

Just as SPIRES naturally begins at the first label group in the frame definition ("at the top") and proceeds down through the label groups, we will find it easier to begin our label groups by working with the first element at the top of the sample format, which is NAME, and then go down, from one row to the next, ending with the CONTRIBUTOR element. Though we could write sets of label groups for rows 1, 3 and 4 and then return to row 2, for example, you can see how much more clear and organized the format definition will be if we proceed from row to row, top to bottom, if possible. (Sometimes it may be necessary to return to an earlier row, perhaps to place a total created from the sum of elements placed further down in the format.) Also, as you will see later, processing the values in any other order than top to bottom may limit your flexibility in positioning them. [See 7.]

Below is a list of the elements that we will be using in the label groups, along with brief descriptions of their use. Detailed descriptions will be given in the sections indicated as we create the label groups for our sample format. The elements are presented here in the order in which they appear in the label group and thus the order in which they are executed. Entering them in a different order within a label group will not alter the order of execution, since SPIRES will rearrange them in the order below when the record is added to the FORMATS subfile.

A label group begins with the LABEL element:

LABEL = name;

LABEL is the key of the LABEL-GROUP structure. Each time it appears, it signals the start of a new label group. [See]

Most label groups then identify the value to be processed by the label group. It might be an element or a structure in the goal record, or a character string or variable value. Depending on what it is, one or a pair of the statements below is used to tell SPIRES to begin processing that value.

GETELEM = element-name;

The GETELEM statement tells SPIRES to get an occurrence of the specified element from the goal record for further processing. [See]

VALUE = value;

If you want to place a specific character string or a variable value instead of an element value into the frame, that value is specified here. If a VALUE statement is used, the GETELEM statement is not, and vice versa. [See,]

IND-STRUCTURE = structure-name;

This label group element is one of two statements needed to call an indirect frame for structure processing. It names the structure in the goal record to be processed by the indirect frame. [See]

IND-FRAME = frame-id;

IND-FRAME is the second statement needed to call an indirect frame for structure processing, naming the frame that is to be executed at this point. [See]

The next statements alter the value, declare where the value will be placed within the data area (they do not actually place it -- that happens later in the label group) or, in the case of TITLE and TSTART, provide another value to be placed in the data area as well. None of these elements is required -- they all have a reasonable default value or action if they do not occur -- so very seldom do more than a few of them appear in a single label group.

TITLE = string-expression;

This statement specifies that the indicated string is to be positioned as a title for the value being processed. Using another element, TSTART, the TITLE can be placed independently of the value, e.g., on the row above the value. If more than one occurrence of an element is being displayed in the format, the TITLE is only placed in the frame once, when the first occurrence is placed there. [See,]

TSTART = row,column;

Similar to START (described below), this element specifies the starting position for the TITLE string, described above. [See]

DEFAULT; or DEFAULT = value;

If there are no occurrences of the element or structure, then the GETELEM or IND-STRUCTURE statement fails, and SPIRES stops processing the label group immediately, proceeding to the next one. You can use the DEFAULT statement to specify that the label group should be completely processed regardless of the success or failure of those statements. In addition, it can provide a default value to be processed instead. [See]

MARGINS = lcol,rcol;

If the value being positioned is too long to fit on a single row of the data area, you will probably want it to wrap around into the next row. Here you state what columns you want to be the left and right margins of the value when it wraps around.

MAXROWS = nrows;

If the value being positioned is too long to fit on one row of the frame and will wrap around into the next row, then MAXROWS can be used to limit the number of rows for wrap-around by a single value. If more than "nrows" (a number) would be needed for the value, it is truncated. [See]

START = row,column;

This statement tells SPIRES where to begin value placement in the frame. The "row" and "column" values may either be explicit numbers or more general locations. If no START statement appears, the PUTDATA will place the value in a default location, usually starting in the next row of the frame, first column. [See]

LENGTH = number;

This element limits the length of the value. If the value has more characters than allowed by the indicated LENGTH, the value is truncated to that length; if shorter, it is padded with blanks to the indicated LENGTH. [See]

INSERT = string; or INSERT = END,string; or INSERT = n,string;

The character string provided is inserted before, after or within the value being processed, becoming part of the value.

After any of the above manipulations are made and the positioning is determined, user processing is allowed:

UPROC = statement;

Uprocs are used to handle many different special needs; they resemble SPIRES commands, and in particular, the Protocols language of SPIRES. If you want to assign values to variables for computation, make other changes to the value being positioned, do conditional testing, jump out of the label group to other label groups to alter the processing flow of the frame, display messages to or ask for input from the terminal, Uprocs give you these capabilities. Uprocs solve a number of problems that cannot be handled any other way, and since they resemble commands, they are straightforward to use. [See]

Then the value is actually placed into the data area:

PUTDATA;

This statement tells SPIRES to place the value being processed by the label group, either the GETELEM or the VALUE (only one can exist in a single label group), into the frame. If other statements within the label group have changed that value, such as INSERTs or UPROCs, then the altered value is used by the PUTDATA statement. [See]

Finally, if the element or structure is multiply occurring, we tell SPIRES to execute the label group again:

LOOP; or LOOP = n;

This statement causes the label group to be repeated, providing access to multiple occurrences of an element or structure. [See]

XSTART = row,column;

The XSTART statement can be used after a LOOP statement to indicate the positioning for the subsequent values. If XSTART is not declared, then these values are placed in a default position: the next row, starting in the same column as did the first value processed by the label group. [See]

COMMENTS = text;

Again, here is a field for any comments you might have. In particular, this COMMENTS statement is meant for remarks about the purpose or activity of the LABEL-GROUP structure in which it appears.

4.2.1  Constructing the Label Groups for Our Sample Format

In this section we will write the label groups for the two frames of our sample format -- the DRINK and INGREDIENTS frames. We will use the elements described in the previous section and learn some more details about them as we use them. As we proceed through the frame definitions, we will be learning which statements to use to solve common formatting problems, giving several different solutions at times and explaining why one solution is preferable to another in a given situation. Remember that this primer is not a reference manual -- you will not learn everything there is to know about each statement here. More details about these options can be learned by studying the reference manual "SPIRES Formats" or by using the EXPLAIN facility.

As a reminder, below is our sample format design again:

     (NAME)                                              Makes (QUANTITY)

     Ingredients:    (AMOUNT)  (CONSTITUENT)
                     (AMOUNT)  (CONSTITUENT)

       (INSTRUCTIONS,  e.g.,  "Mix fifteen  gallons of gin  with  fifteen
     gallons of fruit punch in any clean bathtub.  Always remember to put
     in the drain stopper.")

     From: (SOURCE).

     Contributed by (CONTRIBUTOR).

where the words in parentheses are the elements that we want to display.

In the last section, we decided that, when writing label groups in a frame definition, it is easier to begin with the label groups for the first row in the frame and work down the frame. Our first problem here is, which frame do we begin with? Some format definers prefer to work with the indirect frames first, treating them as smaller parts of a whole that are to be assembled first. Others prefer to write the data frame first, writing the indirect frames as they are called by the data frame. Still others prefer to write the data frame first, including the "calls" to the indirect frames, and then write the indirect frames later.

For conceptual reasons, this primer will follow the third method; several topics introduced in the label groups of the indirect frame are too advanced to be used for the first label groups taught. You may decide to use a different method, depending on the complexity of the format definition you want to write.

When constructing label groups, remember that the order in which you input the statements of a label group may not be the order in which they are stored within the FORMATS subfile, and thus the order in which they are executed. That order can be seen by issuing the SPIRES commands SELECT FORMATS and SHOW ELEMENTS FOR LABEL-GROUP. This is especially important to remember when you want to use Uprocs, as we shall see. [See]

Let's begin then with the data frame DRINK and with the first element in the first row.

The LABEL, GETELEM and PUTDATA Statements

The first element to be processed is the NAME element, whose appearance is to begin in column 1, row 1 of our format.

All label groups must have one occurrence of the LABEL element, because LABEL is the key of the LABEL-GROUP structure. Though the value may be null (as in "LABEL;") which is enough to begin a new label group, a LABEL value serves as an identification sticker. A frame definition may be very long, and LABEL values can help you find your way through a definition, particularly if the values are well chosen. Though it may be allowed, you should not have two or more LABELs with the same value in the same frame, primarily for the sake of the clarity of the definition. When you try to compile the format definition, you will be told if there are duplicate LABEL values, and in some cases, the format may not compile. [See in the discussion of the JUMP Uproc.]

Often when a label group will process an element from the goal record, the name of the element is used as the value for the LABEL. So our DRINK frame begins like this:

(Generally this primer will not repeat COMMENTS elements used to explain earlier concepts. Note that all COMMENTS do appear in the complete format definition as it appears in Appendix A.)

We can assign LABEL almost any string value we want (with some restrictions, e.g., no embedded blanks are allowed). The reason we often employ the name of the element being processed will be shown in a moment.

In any label group that gets an element from the goal record and then puts it into the frame, the GETELEM statement is always the second statement and the PUTDATA statement is usually the last one.

The GETELEM value is the name of the element that is to be retrieved from the goal record. The GETELEM statement tells SPIRES to get an occurrence of the named element for further processing by the label group. If no value is given in the GETELEM statement, SPIRES will retrieve the element named in the preceding LABEL statement. Hence, the advantage to assigning the name of the element being processed to LABEL is that it serves two purposes there: it identifies the label group and also the element to be accessed. (This is a programming convenience that you may not wish to use; you might prefer to give a value for LABEL only when you want to highlight the start of a new section of the frame, or, as we will see later, give a value only when you want to identify a label group as a "branch-point", when we use the JUMP Uproc to cause execution to "jump" to some other label group. Also, you might prefer to associate the GETELEM directly with the element being retrieved by putting the element name with the GETELEM rather than with the LABEL statement.)

Our first label group looks like this so far:

It could just as validly look like this:

but we will use the first form in our example, so that we can refer to a single label group more easily. During format processing, both of the above pairs of lines tell SPIRES that it has begun a new label group and it should retrieve the first occurrence of the NAME element from the goal record.
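The two equivalent forms just described can be sketched as follows, assuming only the LABEL and GETELEM syntax given in section 4.2. In the first pair, LABEL carries the element name and GETELEM has no value; in the second, LABEL is null and GETELEM names the element:

```text
LABEL = NAME;
  GETELEM;

LABEL;
  GETELEM = NAME;
```

The indentation is only for readability.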

Now that SPIRES has "gotten" the element value, we want to put it into the frame, a step accomplished by the PUTDATA statement:

PUTDATA places the value into the frame. No value goes into the frame until an accompanying PUTDATA statement in the label group is executed. By default, SPIRES positions the value in the first column of the next row of the data area. (Obviously, the next row when we begin is the first row.) The NAME element then will be placed in row 1, starting in column 1. Our first label group is complete.
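Putting the three statements together, the completed label group would presumably read as follows (a sketch based on the statement forms described in this section):

```text
LABEL = NAME;
  GETELEM;
  PUTDATA;
```

With no START statement, the value goes to the default position, which for this first label group is row 1, column 1.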

The above label group is the simplest label group that completely processes an element. It gets the first occurrence of an element from the goal record and puts it in the default position in the data area. If we wanted a format that did nothing else but display single record-level elements (that is, elements not within structures), one above the next, we would use nothing but simple label groups like these. Usually however, we want something more complicated -- we want to insert a step or two into this process of getting an element and then putting the value into the data area. We will learn how to translate our more elaborate needs in the sections that follow.

What happens if there is no occurrence of the element -- that is, what if the GETELEM fails because the element does not exist in the goal record being displayed? The answer is simple: no further processing of the current label group occurs. SPIRES immediately proceeds to the next label group in the frame. If you want, you can specify that SPIRES continue processing the current label group, using the DEFAULT statement. [See]

If you want to access a single occurrence of an element other than the first one, you can use an option on the GETELEM command. The GETELEM value must be given with the element name as shown:

The parentheses indicate that what is inside is an occurrence modifier. Here, "n" is a number or a variable that represents either "0" (zero) or a positive integer. If "n" is 0 (zero), then the first occurrence of the element will be accessed. If "n" is 1, then the second occurrence will be accessed, and so forth. There is a different way to access all of the occurrences rather than just a single one -- it requires the LOOP statement. [See]
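A sketch of the occurrence modifier, assuming the form "element-name(n)" implied by the description above. The label SOURCE2 and the idea that SOURCE has more than one occurrence are purely hypothetical, chosen only to illustrate the zero-based counting:

```text
LABEL = SOURCE2;
  GETELEM = SOURCE(1);
  PUTDATA;
  COMMENTS = Hypothetical group - gets the second occurrence of SOURCE;
```

Because counting starts at zero, SOURCE(0) would name the first occurrence, the same one retrieved when no modifier is given.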

Internal vs. External Form of an Element

If you are familiar with SPIRES file definitions, you know that elements may be displayed in a form quite different from the way they are stored internally. For instance, a DATE element in a record may be displayed to you in the ten-character form "MM/DD/CCYY" ("07/01/1854"), but to save space, it is probably stored internally in a special form that requires only four characters. When a record is added to the subfile, the DATE value given is converted into this "four-byte" form by INPROCs (INput PROCessing rules), chosen for each element by the file designer. Similarly, during record output, the stored value is reconverted to a character string of, say, ten characters as shown above, when it passes through OUTPROCs. Some elements in a goal record may have neither INPROCs nor OUTPROCs; others might have very complex ones. [See "SPIRES File Definition", section B.4, for more information about processing rules.]

You probably do not need to know what OUTPROCs exist for the elements retrieved by your format. The GETELEM and PUTDATA statements will process the converted value (the value after any OUTPROCs have been applied) unless you specify otherwise. [See 9.] Or, in other words, the element value shown to you when you display the record in the standard SPIRES format will be the value processed in your format, since the standard SPIRES format processes elements through their OUTPROCs. Thus, if you are not familiar with the file definition, you should be familiar with the way the elements are displayed through the standard SPIRES format. (Again it is worthwhile to stress that the more information you have or the better you understand the goal record's characteristics, the better your format definition will be.)

Text Placement and the INSERT Statement

We would like the value of the next element, QUANTITY, to have the word "Makes" (a "text" or "character string") appear in front of the actual value. Often you will want to insert a character string before or after or even within a value. The INSERT statement, which appears in the label group that processes the value, will answer this need.

Actually, text can be placed in the frame with several different statements: the INSERT, the TITLE and the VALUE statements:

The INSERT statement is used:

The TITLE statement is used:

The VALUE statement is used:

If we consider a label group as a five-step process that first gets a value, alters it if necessary, establishes its position within the data area, does any user processing and finally, puts the value into the data area, then the INSERT and TITLE statements are part of the "value alteration" stage, while VALUE is one of the members of the first stage.

For QUANTITY, we want the element value preceded by the word "Makes". Since we don't want the word "Makes" to appear if the element should not occur, we should use either a TITLE or an INSERT statement. Also, since we want to right-adjust the entire value "Makes (QUANTITY)" (that is, we want to position it so that the last character is placed in column 68, the right margin), we should use INSERT; otherwise, we would have to position the TITLE separately, rather than as part of the entire value. [See for uses of the VALUE and TITLE statements.]

So, our second label group, for QUANTITY, begins like this:

We usually indicate literal character strings by surrounding them with apostrophes. If a character string has internal apostrophes, they must be doubled (that is, they must be pairs of apostrophes, not quotation marks). Although quotation marks can be used to delineate strings, it is generally easier to use apostrophes; remember that if quotation marks do appear in any element value, that entire value must be surrounded by quotation marks and the quotation marks within the value must be doubled.
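Following these rules, the beginning of the QUANTITY label group might be sketched like this. The trailing blank inside the apostrophes is there so that the word "Makes" is separated from the value that follows it:

```text
LABEL = QUANTITY;
  GETELEM;
  INSERT = 'Makes ';
```

No PUTDATA statement appears yet, so nothing is placed in the frame by this partial group.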

Next we would add the PUTDATA statement, except that we want to position this value on the first row of the format, which is somewhere other than the default position. [See] We will work further on this label group in the next section.

There are other options available on the INSERT statement. One is used when you want a character string to be placed at the end of the value, instead of before it:
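Using the "INSERT = END,string;" form listed in section 4.2, a group that appends a string might look like the sketch below. Applying it to CONTRIBUTOR to supply the closing period from the sample design is our own illustration, not necessarily how the complete format handles it:

```text
LABEL = CONTRIBUTOR;
  GETELEM;
  INSERT = END,'.';
  PUTDATA;
  COMMENTS = Hypothetical group - appends a period to the value;
```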

If you replace the word END with an integer "n", SPIRES will insert the 'character string' in front of the "nth" character of the value.

Positioning: START, LENGTH, MARGINS and MAXROWS Statements

Our next problem with the QUANTITY element is that we want to place it on the same row as the NAME element. By default, it would be placed in the first column of the next row. To tell SPIRES we want the value to begin somewhere other than the default position, we use the START statement:

The START statement has two values: the first is the row number and the second is the column number where the value should begin in the data area. (If only one value is given, it represents "row", and "column" becomes the default value of column 1.)

If we want the QUANTITY element to begin in column 50 of row 1, we insert the START statement into the label group:

which tells SPIRES to place the first occurrence of the QUANTITY element (there should be only one), including the INSERT string, into the data area, starting in row 1, column 50. (Actually, we want the QUANTITY value to be right-justified against column 68, the right-hand margin. However, allowing for a longer value such as "Makes seven gallons", whose 19 characters would fit perfectly from columns 50 to 68, we will begin the area for the value at column 50. If the value is shorter than 19 characters, such as "Makes 1 drink", we will have it right-justified within that area, using a UPROC we will learn later.)
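With the START statement added, the QUANTITY label group might be sketched as follows. The statements are shown in the order in which they were introduced; as noted in section 4.2, the order in which they are entered does not affect the order of execution:

```text
LABEL = QUANTITY;
  GETELEM;
  INSERT = 'Makes ';
  START = 1,50;
  PUTDATA;
```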

The LENGTH Statement

Now that there are two values on the first row, a new problem comes to mind: what happens if the NAME element is longer than 49 characters or the QUANTITY value (including the string "Makes") is longer than 19? (Only 19 characters can be placed within columns 50 to 68, our row limit from our FRAME-DIM statement.)

Let's look at the first case first. Suppose, for instance that the NAME element is 65 characters long, and the QUANTITY (with "Makes ") is 13:

During format processing, SPIRES would place the NAME in columns 1 through 65 and then place the QUANTITY value on top of the NAME value in columns 50 through 62 like this:

To prevent that type of overlay, we can use the two elements LENGTH and MARGINS. The LENGTH statement, specified with a value, tells SPIRES how long in characters the value being processed is to be. If the value is longer than the allowed length, it is truncated to that length. If it is shorter, it is padded with blanks to the specified length.

We can therefore write the first two label groups as follows:

After execution of these label groups, row 1 would look like this:

The length of 47 in the NAME label group was chosen to leave at least two columns blank (48 and 49) before the QUANTITY value could begin. You might wonder why we used a LENGTH value of 19 in the QUANTITY label group when the value positioned could not possibly go beyond column 68 because of our frame dimensions. The reason will be explained later in this section.
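With the LENGTH values of 47 and 19 described here, the first two label groups might be sketched as:

```text
LABEL = NAME;
  GETELEM;
  LENGTH = 47;
  PUTDATA;

LABEL = QUANTITY;
  GETELEM;
  INSERT = 'Makes ';
  START = 1,50;
  LENGTH = 19;
  PUTDATA;
```

NAME then occupies at most columns 1 through 47, and QUANTITY at most columns 50 through 68.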

The MARGINS Statement

The solution above is a simple one. But suppose we do not want the NAME truncated to 47 characters. Needless to say, there will be very few, if any, records with a NAME element that long, but we should still consider what we want to happen if there are any. Once again we see the importance of being familiar with the subfile and the goal records in order to make this decision: have we taken care of this formatting problem adequately?

If we want to solve this new problem, we can allow the long value to use columns 1 through 47 of row 1 and then "wrap around" and continue in row 2, using the MARGINS statement, instead of LENGTH:

where "lcol" represents the left column margin and "rcol" the right column one. If a MARGINS statement occurs, then the value will begin in the location designated by the START statement and continue to the column designated as "rcol" in the MARGINS statement, then continuing in the next row, starting in "lcol" and continuing once again to "rcol". This process would repeat till the end of the value.

There are three important points to mention here about wrap-around and the MARGINS statement. First, remember that wrap-around can happen automatically without a MARGINS statement. If the value is longer than the amount of space left on the row in which it is being placed, it will automatically wrap around into the next row, beginning in column 1, unless there is a LENGTH statement to limit the length of the value when it is positioned. (That is the reason why we put a LENGTH statement in the QUANTITY label group above. With the LENGTH statement, the value will be truncated to 19 characters if it is longer than 19; otherwise, the value would wrap around into the next row. If we were concerned that the QUANTITY value could be longer than 19 characters, we could use a MARGINS statement in that label group too.)

Second, the "lcol" value does not represent the starting column for the first row in which a "wrap-around" value is placed; it is used only to designate the left column where any part of the value that wraps around should begin. The START statement is still used to specify the column where the value should first begin. As usual, if no START statement appears, the value begins in the default position.

Third, by default during MARGINS processing, SPIRES will only end one row and wrap around to the next row at a "blank" or "space" in the value; that is, SPIRES only "breaks" on blanks, so a word will not be split, part on one row and the rest on the next.

Here are the two label groups with a MARGINS statement. We no longer restrict the length of the NAME with a LENGTH statement.

The record with the long NAME would then look like this on output:

You can see that the NAME value was split at the blank between "Orange" and "Blossom" and thus, not all 47 of the allotted columns in the first row were used. You can tell SPIRES to split the value at characters other than a blank if you want. [See 9.]

The MAXROWS Statement

Earlier we said that the value would keep wrapping around from one row to the next and to the next until the value ended. Using the MAXROWS statement, we can limit the number of rows used:

where "nrows" is a positive integer. If the value would use more than "nrows" rows, it is truncated.
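The wrap-around rules for MARGINS and MAXROWS can be modelled outside SPIRES. The following Python sketch is not SPIRES code; the function name "wrap_value" and its parameters are invented for illustration. It assumes that the blank at which a row is broken is dropped, and that a word too long for a row is simply cut at the right margin, a case the text above does not cover.

```python
def wrap_value(value, start_col, lcol, rcol, maxrows=None):
    """Model of MARGINS wrap-around: return one (start_column, text) pair per row."""
    if start_col > rcol or lcol > rcol:
        raise ValueError("starting column and left margin must not exceed rcol")
    rows = []
    col = start_col            # the first row begins at the START column
    rest = value
    while rest:
        width = rcol - col + 1                 # columns available in this row
        if len(rest) <= width:
            piece, rest = rest, ""
        else:
            # Break at the last blank that still lets the piece fit in the row.
            cut = rest.rfind(" ", 0, width + 1)
            if cut <= 0:                       # no usable blank: hard split (assumption)
                piece, rest = rest[:width], rest[width:]
            else:
                piece, rest = rest[:cut], rest[cut + 1:]
        rows.append((col, piece))
        col = lcol                             # later rows begin at the left margin
        if maxrows is not None and len(rows) == maxrows:
            break                              # MAXROWS reached: the remainder is dropped
    return rows


print(wrap_value("alpha beta gamma", 1, 3, 10))
# [(1, 'alpha beta'), (3, 'gamma')]
```

Passing maxrows=1 in the same call keeps only the first row, which corresponds to the truncation that MAXROWS causes.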

Some Special Values for the START Statement

Looking at our next format specification, we want a blank row between the top row or rows and the row with the word "Ingredients:" in it. To get a blank row, we first make sure that no values above it will wrap around into it and then we simply don't have any values START in it. But which row (by number) will be a blank row? If the NAME and QUANTITY elements fit on one row, the blank row should be row 2. But if the NAME element wraps around into the next row (row 2), the blank row should be row 3, with "Ingredients:" in row 4. How do we specify that sometimes the word should appear on row 3 and sometimes on row 4?

During format processing, SPIRES keeps track of several important numbers, among which are the column and row numbers where the last positioned value ended. These numbers are accessible to you in several different ways. The most useful way is through the START statement. The row and column numbers in which the last value ended are represented by an asterisk ("*"). In other words, to tell SPIRES that we want a value to begin in the same row and the next column from where the last value ended, our START statement for that label group would be

The first "*" means the same row where the last value ended; "*+1" here means the next column after where the last value ended. The value would thus be appended to the previous value.

would position the value beginning in column 1 of the second row after the row where the last value ended. The minus sign ("-") is also available in such expressions, e.g., "*+1,*-5".

Note that the "*" value always refers to the position of the last successfully placed value. If a GETELEM fails so that SPIRES skips to the next label group, the "*" value will not be reset by the "skipped" label group, since no value was placed in the data area. As a result, you might have some trouble if the next label group uses "*" in the START statement to refer to a position that was meant to be established by the skipped label group. Usually, this problem can be solved using the DEFAULT statement. [See]

The "*" value will not directly solve our current problem, however. The last value placed, the QUANTITY element, very definitely ends in the first row, thanks to the LENGTH statement. If the text "Ingredients:" were to start in row "*+2" and the label group that processes it were to be the next one written, then that text would always appear in row 3. Since we said we wanted it to appear in row 4 sometimes, this solution will not do.

One solution would be to switch the first two label groups around, and include an explicit START value of "1,1" for NAME:

Then the next label group would place "Ingredients:" beginning in "*+2,10". In "line-by-line" format processing, that would be the only solution to this problem. [See 7.2.]

A more straightforward solution is supplied by another number that SPIRES keeps -- the number of the next row beyond the furthest row into which data has been placed. That number is represented by an uppercase "X" when used in the START statement. If we did not switch the first two label groups around, and if the next label group used

then the value would start in column 10 of row 3 or 4, since "X", the next row beyond the furthest row used, would be either row 2 or 3, depending on whether the NAME value had wrapped around, and we want that row, row "X", to be blank.
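To see why "X+1" gives the right row in both cases while "*+2" does not, the arithmetic can be worked through in a few lines of Python. This is only a model: the function "resolve" is hypothetical, and it handles just the forms discussed here (a plain number, or "*" or "X" with an optional "+n" or "-n").

```python
def resolve(expr, star, x):
    """Turn a START row or column expression into a number.

    star: row (or column) where the last positioned value ended
    x:    next row beyond the furthest row used so far
    """
    expr = expr.strip()
    if expr[0] in "*X":
        base = star if expr[0] == "*" else x
        offset = int(expr[1:]) if len(expr) > 1 else 0   # "+1" -> 1, "-5" -> -5
        return base + offset
    return int(expr)                                      # an explicit number


# QUANTITY, the last value placed, always ends in row 1, so "*" is 1.
# NAME fits on one row:  furthest row used is 1, so "X" is 2.
# NAME wraps into row 2: furthest row used is 2, so "X" is 3.
print(resolve("X+1", star=1, x=2))   # 3
print(resolve("X+1", star=1, x=3))   # 4
print(resolve("*+2", star=1, x=3))   # 3, even though row 3 should stay blank
```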

There is also an "X" that can be used for a column value. It does not mean "one column past the highest numbered column stored into" -- usually, it just means the same as "*+1". However, if the last successfully completed label group has a LENGTH statement whose value allows a longer length than the actual value being positioned, then "X" as a column number represents the next column after the column in which a value of "LENGTH" length would end. On the other hand, "*+1" represents the column number right after the last character of the actual value. For example, if a label group positions a value in columns 1 through 15, with a LENGTH value of 15, then if the positioned value is only ten characters long, the "X" column would be column 16, while "*+1" would be 11. This difference between "X" and "*+1" as column numbers occurs only in the situation where a LENGTH or, similarly, a MARGINS statement is used to position the preceding value.
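The difference between "X" and "*+1" as column numbers is simple arithmetic, shown here as a small Python sketch. The function name "next_columns" is invented for this illustration, and the sketch covers only the LENGTH case described above, not MARGINS.

```python
def next_columns(start_col, value, length=None):
    """Return the columns that "*+1" and "X" would refer to after placing value."""
    shown = value if length is None else value[:length]   # LENGTH truncates long values
    star_plus_1 = start_col + len(shown)                  # just past the actual text
    if length is None:
        x_col = star_plus_1                               # no LENGTH: the two agree
    else:
        x_col = start_col + length                        # just past the reserved field
    return star_plus_1, x_col


# A ten-character value placed in column 1 with a LENGTH of 15:
print(next_columns(1, "ABCDEFGHIJ", length=15))   # (11, 16)
```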

Now that we have the "*" and "X", we have much more flexibility in positioning values within our data area. You will probably find them indispensable as you write your own formats.

One final note about the QUANTITY label group: according to our specifications, we want the value "right-adjusted" so that it ends in column 68. We will tackle that problem with a UPROC later; until then, we will consider these two label groups complete, as they appear in the above section on "The MARGINS Statement", and we will use "X+1,10" as the starting position of "Ingredients:".

More on Text Placement: The VALUE, TITLE and TSTART Statements

The last section discussed positioning, concluding with the placement of the next value, the text "Ingredients:". In this section, we will continue the earlier discussion on the placement of text (character strings), as opposed to elements from the goal record, a topic that we began with the INSERT statement. [See]

There are two different ways to handle the "Ingredients:" text: we can consider it simply as a piece of text we want placed in the frame no matter what, or as a title, part of an element's label group, that is positioned only if the element occurs and if the label group is completely executed. We cannot use the INSERT statement here because "Ingredients:" will not be positioned with (i.e., within or next to) the element value, but on a separate row. Also, we only want the string to occur once, while the INGREDIENTS structure may occur several times.

Let's consider the first of the two solutions, in which we want to place some text in the frame, independent from anything else in the frame. Naturally we do not need a GETELEM statement, since we are not accessing elements in the record. To give SPIRES a value to process in the label group, we use the VALUE statement:

The value given in the VALUE statement is processed exactly like an element value accessed by the GETELEM statement. We specify a starting position and then use the PUTDATA statement to place the value into the frame.

The VALUE statement can also be used when you want to place SPIRES variables or your own variables and expressions within the frame. For example, there is a SPIRES variable called $TIME that holds the current time. If you would like the current time to be positioned in your frame, you could write an expression such as shown below:

During format execution, a value such as "The time is 10:23:56" would be placed in your frame in the default position. [See, 8, for more details on variables.] Remember that we can only process one element or one value per label group; hence, if a VALUE statement is used, the label group can have no GETELEM, and vice versa.

Our other choice, expressed earlier, is to consider the text "Ingredients:" as a title, processed in the same label group as the INGREDIENTS structure which it announces. While the VALUE statement is a simple, general way to process a character string, the TITLE and TSTART statements are more applicable to this situation. They allow us to specify that the string "Ingredients:" should be placed in the frame only if there are occurrences of the INGREDIENTS structure. (Surely, there should be occurrences of INGREDIENTS; still, it is preferable in theory to relate the string to the structure by using the TITLE statement.)

The TITLE and TSTART statements specify the text and its starting position. Here is the beginning of the INGREDIENTS label group:

TITLE is usually a character string, as in the VALUE statement, though it can be a variable. TSTART is specified the same way as START. [See] If TSTART and START are both specified in a label group, then TSTART is assigned first; then, if START uses "*" for its row number, "*" is the same row where the TITLE was placed by TSTART. This is not surprising since TSTART occurs before START in the label group.

Here is our frame so far, with the beginning of the INGREDIENTS label group:

Calling Indirect Frames: The IND-STRUCTURE and IND-FRAME Statements

The INGREDIENTS label group which we began in the last section will process the INGREDIENTS structure of the goal record. However, since a single label group can only process one element, we will need to call an indirect frame to process the elements in the structure, as we determined earlier. [See 1.2.] To call an indirect frame for structure processing, we use the IND-STRUCTURE and IND-FRAME statements. The IND-STRUCTURE statement tells SPIRES which structure in the goal record to access; the IND-FRAME statement tells SPIRES which frame to transfer execution to in order to process the elements in that structure. Neither a GETELEM nor a VALUE statement is used in a label group containing an IND-STRUCTURE statement.

Since the structure is called INGREDIENTS and the indirect frame is INGREDIENTS, the label group looks like this:

When SPIRES is executing this label group, it first looks for an occurrence of the INGREDIENTS structure in the goal record. If at least one exists, then the TITLE is positioned and placed in the data area. Execution control then passes to the indirect frame INGREDIENTS, which will actually position and place values from the structure into the data area, using GETELEM and PUTDATA statements. When that frame has been completely executed, control returns to the calling frame and the calling label group, execution continuing with any UPROC and LOOP statements in the calling frame.

We do not use a PUTDATA statement in this "calling" label group; the PUTDATA statements necessary will be included in the indirect frame.

Only one occurrence of the structure will be processed, as the label group currently stands. As with GETELEM, if no occurrences of the structure exist, SPIRES will jump to the next label group without doing any processing. Since most drinks contain more than one ingredient, and since in general we need a way to process more than one occurrence of an element, we will learn a method in the next section to handle this.

We will write the indirect frame itself later. [See] To finish the calling frame, we do not need to know how many rows the indirect frame will use; remember that the "*" and "X" values can help us guarantee correct placement of the data later in the calling frame, regardless of how many rows the indirect frame needs.

Accessing Multiple Occurrences of an Element: the LOOP Statement

Up until the INGREDIENTS structure, we have been processing elements that presumably occur only once in each goal record -- NAME and QUANTITY. Since INGREDIENTS is multiply occurring and since a recipe is useless without all the ingredients listed, we need a way to indicate that SPIRES should get more than one (or all) occurrences of an element or structure. That is done with the LOOP statement, which tells SPIRES to return to the beginning of the label group and execute it again, with the exception of TITLE and TSTART, which are processed only the first time. If the label group accesses an element using either a GETELEM or an IND-STRUCTURE statement, then the next occurrence of the element or structure will be accessed each time the looping occurs.

The LOOP statement can appear with or without a value. If a value is supplied (a number, or a variable containing a number) then the label group will be repeated that number of times. For instance, if "LOOP = 1;" is coded, the label group will be repeated once for a total of two times.

If no value is given, as in our example above, then the label group will be repeated until some condition is met that discontinues the process. The condition that usually breaks a loop is running out of occurrences of the element being processed by the label group. We know that when no occurrences of an element exist, processing of the label group stops; similarly, during looping, as soon as no more occurrences of the element exist, processing of the repeating label group stops.

Another way to break a loop is with the Uproc statement JUMP. [See] Be sure if you do not give a value for LOOP that you have something else within the label group that will eventually cause SPIRES to stop looping and proceed to the next label group.

One other statement can be used with the LOOP statement. The XSTART statement specifies the starting row and column of subsequent occurrences of the element being processed. If no XSTART values are given, then the next value is placed in the next row (X), beginning in the same column as that in which the previous value positioned by the looping label group began. The XSTART statement has the same form as the START statement.
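The default placement rule for a looping label group, when no XSTART is given, can be pictured with a short Python model. It is a sketch under a simplifying assumption: every occurrence fits on a single row, so "X" is always the row right after the previous occurrence. The function name "loop_positions" is hypothetical.

```python
def loop_positions(start_row, start_col, count):
    """Return the (row, column) where each of count occurrences would begin."""
    positions = []
    row = start_row
    for _ in range(count):
        positions.append((row, start_col))   # same column as the first occurrence
        row += 1                             # "X": next row beyond the furthest row used
    return positions


# Three occurrences of an element whose first value starts in row 5, column 21:
print(loop_positions(5, 21, 3))   # [(5, 21), (6, 21), (7, 21)]
```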

Generally XSTART is not used when the looping is applied to an indirect frame, for the indirect frame's label groups usually supply their own starting positions. However, here is an example of part of an output format that would need to use the XSTART statement:

If "Dates Modified:" were a TITLE and the DATE element began in row 5, column 21, that label group might look like this:

Continuing the Frame Definition

The next two elements are relatively simple to code, using methods we have already learned. The first is the INSTRUCTIONS element, the description of how to mix the drink.

Here we have specified that the element value should begin two rows after the "highest-numbered" row already used (that number being the last row used by the indirect frame). Then the value should continue along that row, wrap around into the next row beginning in column 1, and so forth. (Actually, the MARGINS statement here is superfluous because the default action is to wrap around, using the frame dimensions for the MARGINS. However, as a reminder to ourselves that we expect the value to wrap around, we will code the MARGINS statement anyway.) Should there be more than one occurrence of the INSTRUCTIONS element, then subsequent occurrences begin two rows after the end of the previous one, starting in column 3.

Here then is an interesting example where the format design might influence the record input, rather than the other way around. It might not have occurred to you that there could be more than one occurrence of the INSTRUCTIONS element. However, with the format designed this way, if you want some particularly long, detailed INSTRUCTIONS to be printed as two paragraphs, then you would simply input the data as two separate occurrences of the element when adding the record. Here is one more type of application that shows the importance of considering what formats you want to design for your subfile, if you are a file designer. Specific formatting capabilities may influence your file design.

Before proceeding to the next element, let us remark that this label group, like the QUANTITY label group, is still incomplete. In our specifications, we indicated that the lengthy value of INSTRUCTIONS should be "justified" -- that is, the last character of each row of the value should appear in column 68, making both the left and right margins even, not ragged. This is done, of course, by putting extra blank spaces in the rows of text to fill them out. Like right-adjustment, which we want for the QUANTITY value, justification is specified in a Uproc, which is the subject of the next section.

The next element is the SOURCE element, a singly occurring element that contains text. We want the string "From:" to appear at the beginning of the element and a period to appear at the end. The entire value should start two lines after the INSTRUCTIONS value has ended.

The END value preceding the character string in the second INSERT statement indicates that the string should be appended to the value before it is placed in the frame. [See] Note too that the MARGINS statement indicates that if the value is longer than 68 characters, it will wrap around into the next row, starting in column 5.

The UPROC Statement

In previous sections we have left two label groups incomplete -- the two that involved right-adjusting and justifying the value being processed. The statements that solve these problems are Uprocs, special command-like statements that allow us to handle quite a wide variety of processing problems that could not be solved with the statements learned so far. In fact, the variety is so wide that often the only point in common between these problems is that they can only be solved using Uprocs.

In the last section we left the INSTRUCTIONS label group incomplete. We want to justify the value -- that is, we want each row in the instructions to begin in the same column and end in the same column, just as this paragraph you are reading does. Here is the complete INSTRUCTIONS label group, including the Uproc that will cause the value to be justified:

Notice the command-like syntax of the UPROC statement above. As we learn more Uprocs, you will see more clearly how much a format definition internally resembles a set of instructions, a computer program, though externally it remains a record in a SPIRES subfile.

As a general rule, Uprocs must be placed directly ahead of the PUTDATA statement, if it occurs. (A label group may contain nothing but a LABEL statement and UPROCs.) Remember that you yourself can scatter them around a label group wherever you like; however, SPIRES will rearrange them when you add the format definition to the FORMATS subfile, placing all of them together before the PUTDATA statement. Thus, Uprocs may not be placed, for example, before a GETELEM statement. Even if you do put them there, they will not be executed before the GETELEM because of the rearrangement by SPIRES on record input; they will be executed after the GETELEM, but before the PUTDATA.

There are several categories of Uprocs:

Of the twelve most commonly used Uprocs, our format definition will require four: "IF...THEN...", JUMP, SET JUSTIFY, and SET ADJUST. [See 8.3 for details on the others.] Let's continue with our sample format definition and learn how to use these four Uprocs and thus Uprocs in general.

The last element in the data frame is the CONTRIBUTOR. According to the subfile specifications, this element can be either the name of the person who added the record, if contributed voluntarily, or that person's account number (stored as "gg.uuu"), if the name was not supplied. For our format, we decided earlier that we want to display the name if it occurs in the record, but not display anything if an account number is given. How can we determine whether the value is a name or an account number? We will use two tests -- if the element value is exactly six characters, and if the third character is a period ("."), then it is almost surely an account number. ("Dr. No" would fit those conditions too, but he probably would not add DRINKS to our subfile and identify himself.) If both of those conditions are true, then we do not want to do a PUTDATA for this element:

That seemingly complicated UPROC statement combines three Uprocs: one JUMP and two "IF...THEN..." statements.

The "IF...THEN..." Uproc is used to order a certain action if a specified condition is true. For example: if the "unconverted value" of the element is six characters long, then if the character following the second character in the unconverted value is a period, proceed immediately to the next label group. That, in fact, is a translation of the above Uproc.

JUMP tells SPIRES to "jump" immediately to the next label group, doing no further processing of the current one. If JUMP is followed by a label name:

then execution continues at the named label group in the frame. Thus, JUMP can be used to skip label groups or "loop" to an earlier label group. The label group to which execution is transferred must be in the same frame; you cannot JUMP to a different frame. (Some format definers give a value to the LABEL statement only when the label group is the destination, or a "branch-point", for a JUMP.) In our example, the JUMP Uproc will be executed only when the two "IF" conditions are true, meaning that the element value was probably an account number. Thus, SPIRES will jump to the next label group without completing the current one, skipping over the PUTDATA at the end of the CONTRIBUTOR label group.

SPIRES System Variables and Functions

The "IF...THEN..." and JUMP Uprocs are relatively straightforward concepts. What seem to complicate the sample UPROC statement are the expressions $SIZE($UVAL) and $SUBSTR($UVAL,2,1). What do they mean?

In formats, the dollar sign ("$") when it is not part of a character string (i.e., not part of a value surrounded by apostrophes) indicates to SPIRES that what follows is either a system variable, for which SPIRES maintains a current value, or a system function, which performs some operation and returns a value. Both represent some value. Whether the expression is a system function or variable, the dollar sign here identifies it as something that must be evaluated before the statement containing the expression is executed -- that is, SPIRES must substitute the variable or function value before executing the expression that contains it.

In the case of $SIZE($UVAL), $SIZE is a function that returns the length in characters of the argument (a character string) enclosed in parentheses. All functions consist of a dollar sign, followed by a function name, followed by an argument list enclosed in parentheses. The argument list may contain one or several values, depending on the function, that are used by the function to compute a value. The argument for $SIZE here is a system variable, $UVAL, which is explained below. So, "IF $SIZE($UVAL)=6" can be translated as "If the value of $UVAL is six characters long".

The function $SUBSTR has three arguments, "(string1,int2,int3)". Using "string1" the function returns a "substring" that is "int3" characters long, beginning with the character after skipping "int2" characters in "string1". So $SUBSTR('KLEMM',2,1) returns the value "E", while $SUBSTR('GQ.JNK',2,1) returns a period ("."). "IF $SUBSTR($UVAL,2,1) = '.'" therefore means "If the first character after the second character in $UVAL is a period". [See "SPIRES Protocols", section 7, for complete information about functions.]
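Because $SUBSTR counts skipped characters rather than a starting position, it is easy to be off by one. The Python sketch below mimics the two functions as they are described above; "size", "substr" and "is_probable_account" are illustrative names, not part of SPIRES.

```python
def size(string):
    """Model of $SIZE: the length of the string in characters."""
    return len(string)


def substr(string, skip, length):
    """Model of $SUBSTR: skip that many characters, then take length characters."""
    return string[skip:skip + length]


def is_probable_account(uval):
    """The two tests from the CONTRIBUTOR label group, applied to the stored value."""
    return size(uval) == 6 and substr(uval, 2, 1) == "."


print(substr("KLEMM", 2, 1))            # E
print(substr("GQ.JNK", 2, 1))           # .
print(is_probable_account("GQ.JNK"))    # True
print(is_probable_account("Dr. No"))    # True, the false positive noted earlier
print(is_probable_account("Jane Doe"))  # False
```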

System variables are values maintained by SPIRES, though you can "set" some of them. They may change frequently ($TIME changes every two seconds) or they may remain the same for as long as you use SPIRES ($ACCOUNT contains your account number). Most of the system variables are discussed in "SPIRES Protocols", chapter 6. Other system variables used primarily or exclusively with formats are discussed here [See 8.1.] and in detail in "SPIRES Formats", chapter V.

$UVAL and $CVAL are very important system variables used in label group processing. $UVAL stands for "Unconverted VALue", that is, the value of the element currently being processed as it is stored in the SPIRES file or the value given in the VALUE statement. $CVAL represents the "Converted VALue", the value after being processed by any OUTPROCs [See] and after any INSERT values have been applied. $CVAL thus represents the value that will be placed in the frame by the PUTDATA statement. Both of these variables only exist during the execution of a label group that contains either a GETELEM or a VALUE statement.

Generally you should not use $UVAL unless you are familiar with the file definition and specifically, the form of the elements within the file; otherwise, problems might occur. For example, a goal record with the element EMPLOYED might display the values YES or NO externally, but actually store the values 1 and 0 internally; $UVAL would then be either 1 or 0, not YES or NO, and unless you were the file owner, there might be no way for you to know that. In this case, we, as the file owner, know how the element CONTRIBUTOR is stored (either as a text string supplied by the contributor or as an account number), so we can use $UVAL. Note also that we could use $CVAL instead; however, because $CVAL includes the INSERT string, we would have to consider that in our functions:

Using $UVAL seems easier to follow in this case. Again though, if we did not know how CONTRIBUTOR was stored internally, we would have used $CVAL instead.
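To make the adjustment for the INSERT string concrete, here is the arithmetic as a Python sketch. It assumes that the only difference between the converted and unconverted values is the leading INSERT string 'Contributed by ' (that is, no OUTPROC changes the value); the function names are invented for the illustration.

```python
INSERT_TEXT = "Contributed by "      # the INSERT string of the CONTRIBUTOR label group


def account_test_on_uval(uval):
    """Six characters long, with a period after the first two."""
    return len(uval) == 6 and uval[2:3] == "."


def account_test_on_cval(cval):
    """The same test, shifted to allow for the inserted text at the front."""
    skip = len(INSERT_TEXT) + 2      # inserted characters plus the two before the period
    return len(cval) == len(INSERT_TEXT) + 6 and cval[skip:skip + 1] == "."


cval = INSERT_TEXT + "GQ.JNK"
print(len(INSERT_TEXT), len(cval))   # 15 21
print(account_test_on_cval(cval))    # True
```

Both constants in the $CVAL version depend on the exact INSERT text, so changing the wording of the INSERT would silently break the test; that is one more reason the $UVAL form is easier to maintain here.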

Remember that variables (and functions too) can also be used in VALUE statements, if you want to place the value represented directly into the frame.


We have one more UPROC to enter in the format definition as written so far. We want to alter the positioning of the QUANTITY element in the first row so that the value ends in column 68, i.e., right-adjust the value.

The effect of the START, LENGTH and UPROC statements is this: the value is placed in row 1 so that it ends in the 19th column starting from 50 (column 68). If the value is longer than 19 characters, it will be truncated. Note too that if no LENGTH statement were supplied, a value longer than 19 characters would be wrapped into the next row and both "halves" would be right-adjusted within their respective rows.
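The column arithmetic behind right-adjustment can be checked with a short Python sketch. It is a model only: "right_adjust" is a hypothetical name, the sample value is made up, and the sketch assumes that a value longer than LENGTH keeps its leading characters when it is truncated.

```python
def right_adjust(value, start_col, length):
    """Return (first_column, last_column, text) for a right-adjusted value."""
    shown = value[:length]                        # LENGTH limits the value
    last_col = start_col + length - 1             # right edge of the field
    first_col = last_col - len(shown) + 1         # slide the text up against that edge
    return first_col, last_col, shown


# A field starting in column 50 with a LENGTH of 19 ends in column 68.
print(right_adjust("2 servings", 50, 19))   # (59, 68, '2 servings')
```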

You can also use SET ADJUST CENTER to center the value or SET ADJUST LEFT, which is the default, of course. Whenever you use the SET ADJUST Uproc, the value is adjusted between the columns specified as the START column and either the column determined by the LENGTH statement (equivalent to the mathematical expression "STARTcol+LENGTH-1") or the column of the right-hand margin, or between the columns designated in the MARGINS statement, depending on what is in effect in the label group.

The Label Groups for the Indirect Frame; the DEFAULT Statement

With CONTRIBUTOR, the last label group of the data frame, complete, we must define the indirect frame INGREDIENTS, which is called from the data frame.

We will place the definition of the indirect frame after the data frame in the format definition. The order, however, does not matter here -- the order in which the frames are listed again in the Format Declaration section will. [See 5.]

So, here is the end of the data frame and the beginning of the indirect frame, which we wrote earlier. [See 4.1.]

         INSERT = 'Contributed by ';
         UPROC = IF $SIZE($UVAL) = 6 THEN IF $SUBSTR($UVAL,2,1) = '.'
           THEN JUMP;

From here we will write the appropriate label groups for this frame, just as we did for the data frame before.

By the way, you may be wondering where SPIRES will "jump" if the CONTRIBUTOR element accessed is an account number. Since there is no "next" label group in the DRINK data frame to jump to, no further frame processing occurs, and SPIRES would go to the next data frame (if there were one) as specified in the Format Declaration section. In fact, regardless of the JUMP, this rule specifies what happens whenever SPIRES reaches the end of a frame: If the frame is an indirect frame, SPIRES returns to the calling label group of the frame that called it; if the frame is a data frame, then SPIRES either proceeds to the next data frame (if there is one specified in the Format Declaration section of the definition) or else is through processing the format. You could explicitly state that all processing should be complete at this point by inserting the following label group after the PUTDATA statement of the CONTRIBUTOR label group:

Details on the RETURN Uproc are given later. [See 8.3.]

We have two goal record elements to place in the INGREDIENTS frame: AMOUNT and CONSTITUENT. For each ingredient in the recipe, there will be a CONSTITUENT element (the name of the ingredient) and usually an AMOUNT, which tells how much of the CONSTITUENT to use. We want the two elements of the structure to appear on the same row, with the AMOUNT on the left, right-adjusted so that it ends in column 24 and with the CONSTITUENT to begin in column 27.

Here is one way we might write the label groups:

Translating the above into words, we are telling SPIRES to place the AMOUNT element so that it ends in column 24 (the 20th column starting from column 5). The value will begin no further left than column 5 so that the ingredients will be indented four columns from the rest of the recipe. The CONSTITUENT is then placed on the same row, running from column 27 to 68, wrapping around to column 30 if necessary. These two label groups will be executed at the time the indirect frame is called by the data frame in the INGREDIENTS label group, and because of the LOOP statement there, this frame and thus these two label groups will be executed for each occurrence of the INGREDIENTS structure. They are not executed when the data frame has finished executing.

The DEFAULT statement

This pair of label groups is acceptable as written, as long as there is always an occurrence of the AMOUNT in each INGREDIENTS structure. But that may not always be true. Suppose the CONSTITUENT is simply "ice" and there is no amount given. The AMOUNT label group would be skipped because the GETELEM would fail to find an occurrence of AMOUNT, and thus part of the format might look like this:

                1 1/2 oz.  whiskey
                    4 oz.  icenge juice

The value "ice" was placed in column 27 of row "*", and "*" referred to the row in which the last value, "orange juice", had been successfully put. Thus, "ice" was overlaid upon the preceding CONSTITUENT.
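The overlay is easy to reproduce with a model of a single row of the data area. In this Python sketch the row is just a string and "put" is a hypothetical helper that writes text starting at a given column, replacing whatever is already there.

```python
def put(row, col, text):
    """Write text into row starting at column col (columns are numbered from 1)."""
    end = col - 1 + len(text)
    row = row.ljust(end)                       # pad with blanks up to the end of the text
    return row[:col - 1] + text + row[end:]    # overwrite only the columns the text covers


row = put("", 27, "orange juice")   # CONSTITUENT of the previous occurrence
row = put(row, 27, "ice")           # AMOUNT was skipped, so "*" still names this row
print(row[26:])                     # icenge juice
```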

Clearly, we need a way to set the correct row whether the AMOUNT occurs or not. As before with the NAME and QUANTITY elements, we could switch the order of the two label groups, assuming that CONSTITUENT always occurs, even if AMOUNT does not. [See] A better solution here is the DEFAULT statement. This statement, in conjunction with a GETELEM or IND-STRUCTURE statement, specifies that the rest of the label group should be processed even if the data element or structure occurrence does not exist. The label group is executed as if a null value had been retrieved. However, a null value will not be positioned in the data frame, so the correct row would not be set; any UPROC statements in the label group would be executed though. The DEFAULT statement can also have a value, such as:

In this case, the string expression is used as if it were the value retrieved and is positioned accordingly.

Remember that the DEFAULT value is used only when the element does not occur at all. No DEFAULT action is taken when LOOP is coded and the looping is broken because all occurrences of the element have been processed. Note however that if a value is given for LOOP (e.g., "LOOP = 5;") and only two occurrences of the element exist, the DEFAULT action will be taken four times. Note also that if both DEFAULT and LOOP are coded, the DEFAULT action will be taken if the GETELEM or IND-STRUCTURE fails the first time.
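The interaction between LOOP and DEFAULT can be summarized as a small Python model. This is a sketch of the rules as stated above, not of SPIRES internals; the function "label_group_passes" and its arguments are invented. One point is an assumption: when LOOP has no value and the very first GETELEM fails, the model takes the DEFAULT action once and then stops.

```python
def label_group_passes(occurrences, loop=None, default=None):
    """Return the values the label group would process, in order.

    loop:    None = no LOOP statement, "all" = LOOP with no value, n = "LOOP = n;"
    default: None = no DEFAULT statement, otherwise the DEFAULT value ("" for null)
    """
    if loop is None:
        max_passes = 1
    elif loop == "all":
        max_passes = None              # repeat until something breaks the loop
    else:
        max_passes = loop + 1          # the first pass plus n repetitions
    processed = []
    i = 0
    while max_passes is None or i < max_passes:
        if i < len(occurrences):
            processed.append(occurrences[i])
        elif default is not None and (max_passes is not None or i == 0):
            processed.append(default)  # the element is missing on this pass
            if max_passes is None:
                break                  # assumption: one DEFAULT, then the loop ends
        else:
            break                      # no occurrence and no DEFAULT action
        i += 1
    return processed


# "LOOP = 5;" means six passes; with two occurrences, DEFAULT is used four times.
print(label_group_passes(["a", "b"], loop=5, default="-"))
# ['a', 'b', '-', '-', '-', '-']
```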

Because the DEFAULT value will be processed just as if it were retrieved from the record, any INSERTs will be included when the value is finally placed into the frame.

$DEFAULT, another system variable, is also useful for default value processing. [See 8.1.]

That finally completes the frames for our sample format definition. We now have only the Format Declaration section to write. Though we have spent many pages constructing these frames, we have learned or at least been introduced to the most important concepts and capabilities of formats and the language that governs the transformation of these concepts into frames of a format. In the next chapter, we will learn how to construct the Format Declaration section. Later, we will return to frame definitions briefly to discuss the procedures used if you need to use variables other than the system variables discussed in the last section. [See 8.2.]

5  Linking Frames Together: The Format Declaration Section

The last section of the format definition is usually very simple to write. The Format Declaration section defines a format by specifying a format name (FORMAT-NAME) and then listing the frames that make up that format. This section is necessary because more than one format can be defined in a single format definition, and a single frame can be used by more than one format. So the Format Declaration section in effect packages a set of frames to be used together by a format.

Like the Frame Definitions section, the Format Declaration section is a multiply occurring structure. Each occurrence of the structure represents the information about one format. Here are the elements within each occurrence of the structure:


FORMAT-NAME = format-name;

This is the name of the format, which is used in the SET FORMAT command. It must be different from any other format name for the goal record of the subfile, whether defined by you or not. FORMAT-NAME is the key of the FORMAT-DEC structure. The value can contain up to 16 alphanumeric characters, including blanks. The name chosen often indicates how the format will be used. The FORMAT-NAME for our sample will be DISPLAY, suggesting that it is an output format used with the DISPLAY or TYPE commands.

ALLOCATE = vgroup-name;

This statement specifies that at least one of the frames used by the above-named format uses variables that you have defined in the VGROUP section earlier. [See 8.2.]

FRAME-NAME = frame-name; FRAME-TYPE = type;

These two elements are actually in a structure called FRAME-DEC within the Format Declaration section. Here we declare which frames should be executed and in what order. For each occurrence of the FRAME-DEC structure, there is one FRAME-NAME and its respective FRAME-TYPE.

The FRAME-NAME statement gives the FRAME-ID of a frame defined earlier in the Frame Definitions section that is to be executed when the format named above (FORMAT-NAME) is used. The FRAME-TYPE announces whether the named frame is a DATA or INDIRECT frame. (There are several other types as well, discussed in "SPIRES Formats", section IV.2.2.) For our example format then, we must have two occurrences of the FRAME-DEC structure -- one for the data frame DRINK and one for the indirect frame INGREDIENT.

The order in which these two frames are declared is crucial. There are two important rules to remember:

So our example format definition ends like this:

It is possible to have more than one format defined by a single format definition. For example, if two formats you had designed shared the exact same indirect frame, your Format Declaration section could look like this:

When you set a particular format, you use only those frames in the format definition that are declared in the specific Format Declaration section for that FORMAT-NAME.

Our format definition is now complete. If you compare the entire definition with that of the very simple format in chapter 1, you will see that our newly finished definition is several times longer than that first one. But in terms of the number of elements processed, the procedures by which they are displayed and, in general, the level of sophistication, our DRINKS format is many times more impressive.

Both format definitions achieve their aims, of course -- the simplicity of our mailing address format versus the complexity of our recipe format is an irrelevant issue. What is more important is that both definitions meet their design specifications. You might want a more or less complicated format than our format for the DRINKS subfile. The DRINKS example was meant to give you a fairly straightforward example, illustrating most of the common display attributes that formats may have and many of the formatting problems that format designers face. Perhaps it has also shown you the thinking that might go into designing a format and writing the format definition.

6  Adding and Compiling the Format Definition

Now that we have written the format definition, the hardest part of our job is behind us. The rest of the format procedure is standardized -- we will add the definition to the FORMATS subfile, then compile it, and then select the subfile and use our new format.

To add the format definition, we collect it in our active file, and then, in SPIRES,

where the returned prompt indicates that the format definition has been successfully added as a record in the FORMATS subfile.

Of course, you may not be so fortunate:

and you would want to issue the command "EXPLAIN S260" to see the meaning of that error message, and then correct your definition accordingly and issue the ADD command again. Error messages at this point tell you what mistakes the definition contains as a record in a subfile, such as that a required element is missing, or that there are too many occurrences of a particular element. Often the mistake is that a FORMATS element name has been misspelled or there is an extra or missing semicolon. Adding the format definition to the subfile is no guarantee that it will create a working format. That step is next.

Once you have added your format record, you need to compile it. The "compile" step both verifies that statements in the definition are syntactically and factually correct (e.g., whether the file referred to exists, whether the elements accessed exist, whether the Uprocs specified are allowed, and so forth) and converts the high-level formats language into a lower-level machine language. The resulting translation, called the "compiled characteristics" of the format, is stored automatically in the SPIRES subfile FORCHAR (FORmat CHARacteristics). Then, when you "set" a format, SPIRES brings the compiled characteristics of the format into the computer's main memory.

To compile your format, issue the COMPILE command while the FORMATS subfile is selected:

where the value supplied is the ID element, the key of the format definition. The "gg.uuu." account prefix is optional, since you can only compile your own format definitions.

Again, like adding your record to the FORMATS subfile, compiling your format will not always be so simple. You may get diagnostic error messages, which look like this:

(This error is shown as an example only; the error as stated does not exist in the format definition we wrote.) The lines with asterisks that precede the error message help you locate the position in the format definition where the error was found. These error messages, which can seem cryptic, can be EXPLAINed. For instance, you can issue the command EXPLAIN MORE THAN ONE ELEMENT SOURCE and an explanation will be given to you.

If you do get a compilation error, you need to correct the format definition:

and then make corrections to the definition in your active file and UPDATE. Then issue the COMPILE command again. Repeat this procedure if necessary until the format definition compiles.

Of course, just because a format definition successfully compiles does not mean that the goal records will look exactly as you had hoped. If you later decide to change your format definition after it has already been compiled, you need to UPDATE the record in the FORMATS subfile, and then issue the RECOMPILE command:

which has the same syntax as the COMPILE command. [See 9 for information about the "format tracing" facility, a handy debugging tool.]

Using the Format

Assuming our record compiled, let's select our subfile and use our output format:

The SET FORMAT command is issued using the FORMAT-NAME from the format definition. To see what formats are available, you can issue the SHOW FORMATS command. [See "SPIRES Searching and Updating", section B.4.6.]

Destroying the Format

If you want to remove a compiled format that you defined for one of your files so that it can no longer be used, issue the "ZAP FORMAT" command in SPIRES:

where the value supplied is the ID, the key of the format definition. If you specify the SOURCE option, the "source record" (the format definition) will also be removed from the FORMATS subfile; otherwise, only the compiled characteristics, kept in the FORCHAR subfile, will be destroyed, meaning that you could then COMPILE the definition again (not RECOMPILE, since a compiled version no longer exists).

Note these restrictions on the ZAP FORMAT command, however: the file owner can ZAP any and all formats for his file; the format definer can ZAP only formats he has written.
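The relationship between the stored definition and its compiled characteristics can be summarized with a toy Python model. It is a sketch only: the class `FormatsModel`, its method names, and the sample ID are hypothetical, and the real commands perform validation that is not modeled here. The one rule it does encode is the one stated above, that RECOMPILE needs an existing compiled version while COMPILE does not.

```python
class FormatsModel:
    """Toy model of the FORMATS and FORCHAR subfiles for one account."""

    def __init__(self):
        self.formats = {}  # stands in for FORMATS (format definitions)
        self.forchar = {}  # stands in for FORCHAR (compiled characteristics)

    def add(self, format_id, definition):
        self.formats[format_id] = definition

    def update(self, format_id, definition):
        if format_id not in self.formats:
            raise LookupError("no such format definition")
        self.formats[format_id] = definition

    def compile(self, format_id):
        self.forchar[format_id] = "compiled:" + self.formats[format_id]

    def recompile(self, format_id):
        if format_id not in self.forchar:
            raise LookupError("no compiled version exists; use compile")
        self.compile(format_id)

    def zap(self, format_id, source=False):
        self.forchar.pop(format_id, None)      # compiled form always goes
        if source:
            self.formats.pop(format_id, None)  # definition goes only on request


model = FormatsModel()
model.add("GG.UUU.DRINKS", "v1")   # hypothetical ID
model.compile("GG.UUU.DRINKS")
model.zap("GG.UUU.DRINKS")         # without SOURCE: definition survives
model.compile("GG.UUU.DRINKS")     # so it can be compiled again
```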

7  Frame Dimensions, SET FLUSH and Line-by-line Format Processing

The DISPLAY format for the DRINKS subfile uses two frames, of which the data frame has fixed frame dimensions of 30 rows by 68 columns. No record displayed with this format will ever be more than 30 rows long. In fact, if we had a very large record that needed, say, 32 rows, we would get an error message telling us that a label group had tried to place data beyond the highest row allowed. Of course, during format design, we decided that 30 rows for this particular format was much more than we would probably ever use. [See 1.2.] However, with other subfiles, unexpectedly large records could be a problem. Aspects of this problem and several solutions will be discussed in this chapter.

7.1  Fixed Frame Dimensions and the SET FLUSH Uproc

One solution that comes to mind for handling large records is to set the number of rows to a very high number, perhaps double what we think we might ever need. Though this is generally the simplest solution, it is not efficient in terms of "core management" (i.e., use of the computer's main memory), because we would be telling SPIRES to reserve a lot of memory that we would usually never need.

A neater solution is to leave the frame dimensions at a low, reasonable number and then use the SET FLUSH Uproc to provide special processing if a large record needs more rows than allowed by the FRAME-DIM statement. Then, when a label group tries to place data in a row beyond the frame dimensions, the partially completed format is "flushed" (that is, released from the main memory and sent to your terminal or active file) and format processing continues, sending each subsequent row of formatted data to the destination as the row is constructed. For example, if a giant DRINKS record would need 42 rows and our frame dimensions allow only 30 rows but we have "UPROC = SET FLUSH;" in effect, then, during format processing, when SPIRES tries to position a value in row 31, the first 30 rows would be sent to your terminal (or active file) and then, when a value went into row 32, row 31 would be sent, and so on. Once "flush processing" begins, each row is sent to the destination as soon as some value is positioned in the next row, just like line-by-line processing. [See 7.2.]

Flush processing has one limitation that fixed frame processing does not: once flush processing has begun, you cannot return to an earlier flushed row to place a value, since that value is gone from the main memory. As soon as a row is flushed, it can no longer be accessed; only later rows can be used. For our sample format, once we are beyond the QUANTITY element, which is the only value that possibly could be placed in an "earlier" row than the value of "*", all values are placed one row after another. [See] Thus, using the SET FLUSH Uproc, we could set "FRAME-DIM=3,68;" in the data frame, to cause flush processing soon after the NAME and QUANTITY elements were placed.
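Flush processing and its limitation can be illustrated with a short Python model. This is a sketch under stated assumptions, not a description of SPIRES internals: the class `FlushingFrame` is hypothetical, column positioning is ignored, and the text does not say how SPIRES reports an attempt to use a flushed row, so the model simply raises an exception.

```python
class FlushingFrame:
    """Toy model of a fixed-size frame with SET FLUSH in effect."""

    def __init__(self, nrows):
        self.nrows = nrows
        self.rows = {}       # row number -> text still held in memory
        self.flushed_to = 0  # highest row already sent to the destination
        self.output = []     # stands in for the terminal or active file

    def _flush_through(self, last_row):
        for row in range(self.flushed_to + 1, last_row + 1):
            self.output.append(self.rows.pop(row, ""))
        self.flushed_to = max(self.flushed_to, last_row)

    def put(self, row, text):
        if row <= self.flushed_to:
            raise ValueError(f"row {row} has already been flushed")
        if row > self.nrows:
            # Beyond the frame dimensions: send every earlier row.
            self._flush_through(row - 1)
        self.rows[row] = self.rows.get(row, "") + text

    def finish(self):
        if self.rows:
            self._flush_through(max(self.rows))
        return self.output


frame = FlushingFrame(nrows=3)
for row, text in enumerate(["a", "b", "c"], start=1):
    frame.put(row, text)  # rows 1 to 3 fit; nothing is sent yet
frame.put(4, "d")         # rows 1 to 3 are flushed together
frame.put(5, "e")         # row 4 is flushed as row 5 is started
```

With `nrows=3`, the first three rows are sent as a block when row 4 is used, and from then on each row is sent as soon as the next one is started.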

The SET FLUSH Uproc is not placed in the frame definition label groups but instead in the Format Declaration section, as an element in the FRAME-DEC structure:

It appears in the occurrence of the FRAME-DEC structure that describes the data frame. [See 5.] Any UPROCs here are executed prior to frame execution.

7.2  Line-by-line Processing

Carrying flush processing to an extreme, it is possible to process and flush every row of the format individually. This processing, called "line-by-line mode", is slightly more desirable than fixed frame processing because it uses less memory and is easier for SPIRES to handle, resulting in a slightly lower cost that could become a significant saving if you use the format extensively.

Since flush processing and line-by-line processing are essentially the same, the limitation cited earlier for flush processing is a limitation here too. Once a row has been flushed, it can no longer be used to position values; you cannot go back to row 6 after processing row 7. In fact, giving a row value like "6" during line-by-line processing will cause SPIRES to skip ahead 6 lines, leaving the next five blank. Hence, if we have two or more values to position on a single row, then if the first one positioned wraps around into the next row, the other values cannot be positioned in that first row as planned. Still, many format designs are not affected by this restriction and can take advantage of line-by-line processing.

To specify line-by-line processing, you simply give "0" (zero) as the value of "nrows" in the FRAME-DIM statement of the data frame definition, e.g.:

You do not need a SET FLUSH Uproc if you set "nrows" to "0".

Can our DRINKS format be processed line-by-line? Yes, it can. The only problem, as suggested in the last section, is that if the NAME element on row 1 spills over into row 2 or beyond, we could not return to row 1 to place the QUANTITY element there. On the other hand, as shown earlier, we could switch the NAME and QUANTITY label groups so that the QUANTITY element is placed on row 1 before the NAME element begins, which allows the latter to wrap around into row 2. [See] Remember that there is no rule that says that the values on a row must be placed there from left to right. Here we simply place the right-most element, QUANTITY, on the row before NAME, the left one. That adjustment is made in our format as shown in the Appendix.

8  Using Variables and UPROCs

This section will briefly discuss some further aspects of variables and Uprocs that are relevant to format definitions. Unlike the other chapters of this primer, this section is written for people with some knowledge of the SPIRES protocols language. The "prerequisite" reading is sections 4 through 7 of "SPIRES Protocols". To use the capabilities discussed here, you will need to understand that material.

8.1  Some System Variables Used Only in Formats

Below is a list of the most useful SPIRES system variables that are used exclusively by formats, along with a brief explanation of how each is used. More information can be found by EXPLAINing any of the terms.

The variables in the first group exist only during a single label group. They are all reset when the next label group begins executing.

(Note that $ULEN is equivalent to "$SIZE($UVAL)", the expression we used in a UPROC in the DRINKS format we wrote earlier.)

These integer variables are not reset whenever a new label group begins:

8.2  User Variables

If you need to manipulate data in your format before displaying it, you will probably want or need to create your own variables. These variables must generally follow the rules for user variables given in section 4 of "SPIRES Protocols". One difference between using your own variables in formats and in protocols is worth mentioning: all user-defined variables that are used in a format must first be defined in the Vgroups section of the format definition and then allocated (assigned a position in the main memory) in the Format Declaration section before they can be used in the label groups. They are initialized when the SET FORMAT command is issued.

The Vgroups section is a multiply occurring structure that follows the RECORD-NAME element and contains these elements:

VGROUP = vgroupname;

This element is the key of the structure. It must have a value, usually given in the form "gg.uuu.vgroupname" where "gg.uuu" is your account number. This value is used later in the ALLOCATE statement in the Format Declaration section to indicate which VGROUP should be allocated when the format is set. The value must be less than 24 characters long.

VARIABLE = name; OCCURS = occs; LENGTH = len; TYPE = var-type;

These elements make up the structure VARIABLES in the Vgroups section. The element VARIABLE is the key of the VARIABLES structure; its value is the name of the variable you want to use. The other elements are optional and have defaults if they are not specified. If no OCCURS is given, the variable is assumed to have only one occurrence. If neither TYPE nor LENGTH is given, the variable is assumed to be a string variable of maximum length 80. [See "SPIRES Protocols", section 4, for other defaults.]

Near the end of your format definition, you must include an ALLOCATE statement in each occurrence of the FORMAT-DEC structure that includes frames using defined variables. So, if our format for the DRINKS subfile used some variables, the Vgroups and Format Declaration sections might look like this:

You can use your variables in VALUE statements, in Uprocs, or anywhere else that system variables are allowed in a format definition. Remember that your variables are indicated by a pound sign ("#MYVARIABLE", for instance), system variables by a dollar sign.

8.3  Some Other Useful Uprocs

Only four Uprocs were discussed earlier. They were: "IF...THEN...", JUMP, SET ADJUST and SET JUSTIFY. Several others will be discussed in this section. [See]

Variable Assignment Uprocs

UPROC = LET variable = expression;

Now that we can use our own variables, we need a way to assign values to them. Generally, the LET Uproc is specified in the same form as the LET command in SPIRES protocols. Here are some samples:

If you are used to writing protocols, the hardest aspect of adding UPROC statements to label groups is remembering to include the element name UPROC and the semicolon at the end of the value. Also, remember that if quotation marks appear within the value, they must be doubled, and quotation marks must then surround the entire element value:
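The quoting rule can be expressed as a small Python helper. This is a sketch for illustration: the function `uproc_statement` and the variable #MSG are hypothetical, and "quotation marks" is assumed to mean the double-quote character.

```python
def uproc_statement(command):
    """Build a UPROC statement, quoting the value when it contains quotes."""
    if '"' in command:
        # Double every embedded quotation mark, then wrap the whole value.
        command = '"' + command.replace('"', '""') + '"'
    return f"UPROC = {command};"


plain = uproc_statement("SET FLUSH")
quoted = uproc_statement('LET #MSG = "none"')
print(plain)
print(quoted)
```

A value without quotation marks passes through unchanged; one that contains them comes out with each mark doubled and the whole value enclosed in quotation marks.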

UPROC = SET variable = expression;

This Uproc, whose syntax is similar to the LET statement above, is used to assign values to those system variables that can be reset by you. Some of the variables that are useful in this context are $CVAL, $PROMPT and $ASK.

Note the use of $CVAL in this label group:

Here, changing $CVAL changes the value that will be processed by the PUTDATA statement. At this point, it is too late to change the value any other way than by setting $CVAL. For example,

Changing the value of #TEMP after the VALUE statement will not change the value of $CVAL, the value which will be processed by the PUTDATA, because it was computed before the UPROC statement is executed. Instead of the above, $CVAL could be set to "No value".
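The order of evaluation described here can be modeled in a few lines of Python. This is only a sketch of the sequence "VALUE, then UPROC, then PUTDATA" as the text describes it; the function `run_label_group` and the dictionaries standing in for user and system variables are assumptions made for illustration.

```python
def run_label_group(user_vars, value_expr, uprocs):
    """Evaluate VALUE, run the UPROCs, then return what PUTDATA would place."""
    system = {"CVAL": value_expr(user_vars)}  # VALUE is computed first
    for uproc in uprocs:
        uproc(user_vars, system)              # UPROCs run afterwards
    return system["CVAL"]                     # PUTDATA processes $CVAL


def set_temp(user_vars, system):
    user_vars["TEMP"] = "No value"  # like assigning #TEMP in a LET Uproc


def set_cval(user_vars, system):
    system["CVAL"] = "No value"     # like assigning $CVAL in a SET Uproc


# Changing #TEMP is too late: $CVAL was already computed from the old value.
late = run_label_group({"TEMP": ""}, lambda v: v["TEMP"], [set_temp])
# Setting $CVAL directly does change what is placed.
direct = run_label_group({"TEMP": ""}, lambda v: v["TEMP"], [set_cval])
```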

Some Execution Flow Uprocs


The RETURN Uproc has several slightly different uses. If you are in the middle of an indirect frame and you want execution control to return immediately to the calling frame without any further processing of the indirect frame, this Uproc will return it there. If you are in a data frame, RETURN will cause record processing to stop at that point; SPIRES will then proceed to the next data frame, proceed to the next record (if processing multiple records), or return command control to you.


The ABORT Uproc will cause SPIRES to stop format processing of the current goal record immediately, regardless of the current location within the format definition. An error message will be displayed at the terminal, unless the QUIET or NOERROR option is specified. If processing multiple records, SPIRES will begin processing the next one.


The STOPRUN Uproc will cause SPIRES to stop all format processing immediately, even during multiple record processing. An error message will be sent (unless the QUIET or NOERROR option is used), and SPIRES will return control to you at the terminal.

Input/Output Uprocs


ASK is used to ask for input from the terminal user when the format is executing. It works like the ASK command in protocols. The basic syntax of the command is:

The options, if used, must be specified in the order shown. The "string" is the question or statement to be displayed at the terminal, prompting the user for a response. The user's response is then assigned to the system variable $ASK, which can then be tested using various SPIRES functions. The NULL and ATTN clauses allow special processing when the user's response is either just a carriage return (NULL) or a press of the BREAK/ATTN key (ATTN). These clauses can each specify one of a limited set of Uprocs, including RETURN, ABORT, STOPRUN and "JUMP label-name". Whichever is chosen, it must be enclosed in apostrophes, as shown in the syntax statement above. If you specify "NULL='';" then SPIRES will continue executing the current label group if a NULL response is entered. By default, if no ATTN option appears and BREAK/ATTN is the user's response, the current label group will continue executing; by default, if no NULL option appears and the user gives a null response to the prompt, the prompt is repeated.

You can assign the prompt string to the system variable $PROMPT in a SET Uproc before the ASK:

The current value of $PROMPT will be used by the ASK command if no PROMPT='string' is given. Note that if PROMPT='string' is given, it does not reset $PROMPT.
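The interaction between PROMPT='string' and $PROMPT can be sketched in Python as follows. The function `ask` and the dictionary standing in for the system variables are hypothetical; the NULL and ATTN clauses are not modeled.

```python
def ask(system, answer, prompt=None):
    """Return the prompt that would be shown and record the reply in $ASK."""
    # An explicit PROMPT wins; otherwise the current $PROMPT is used.
    shown = prompt if prompt is not None else system["PROMPT"]
    system["ASK"] = answer  # the user's response goes into $ASK
    return shown            # note that $PROMPT itself is never changed


system = {"PROMPT": "Continue?", "ASK": ""}
first = ask(system, "yes")                    # uses $PROMPT
second = ask(system, "no", prompt="Really?")  # uses the explicit string
third = ask(system, "yes")                    # $PROMPT is still in effect
```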

UPROC = * 'string expression';

The "*" or "star" Uproc allows you to send information to the terminal during format execution. It is especially useful for debugging a format. For instance, you could insert into your format definition several Uprocs such as:

which would display the name of the currently executing label group to you, thus telling you where SPIRES is at a particular moment. The string expression can be any text surrounded by apostrophes, or it can be a system or user variable, or it can be a mixture. No functions are allowed here, however.

A message sent to the terminal is not part of the format or the frame being processed, i.e., it is not positioned in the frame. If you are displaying records at your terminal, it may seem that messages are; however, if you issue the displaying command with the IN ACTIVE prefix, the distinction will be clear: the messages come to your terminal and the data goes to your active file.

The star command works similarly in protocols. The only difference is that from a protocol, the expression looks like this when sent to your terminal:

From a format, the asterisk is not included:


UPROC = -comments;

The hyphen, or dash, indicates that what follows is a comment. Comment Uprocs can be scattered throughout a group of UPROC statements. They are not compiled.

This particular method of including comments should not be confused with the dash element that can be used as a "throw-away" COMMENTS element for every SPIRES subfile. Unlike that dash element, the hyphen Uproc is not thrown away, but remains part of the record when it is added to the FORMATS subfile.

9  Other Formatting Capabilities

With this last section, the primer concludes by suggesting other capabilities you might want to explore that are explained in "SPIRES Formats". (The numbers in brackets indicate sections in that manual to which you should refer.)

In addition to these capabilities for output formats, "SPIRES Formats" also provides further details about all of the topics discussed in this primer as well as information about input formats.

:  Appendix

:A  Sample Format Definition

Here is the complete format definition that we created for the DRINKS subfile:

:29  SPIRES Documentation

I. Primers

II. User Language

III. Application Development

IV. Reference Guides (Cards)

V. Prism

VI. SPIRES Aids and Study Materials

VII. Other Related Documents

(The following documents are not SPIRES documents per se, but describe utilities and programs that may be useful in developing SPIRES applications.)

Obtaining Documentation

The above documents (except any marked "in preparation") may be obtained through the PUBLISH command on the Forsythe computer at Stanford University. If you do not use SPIRES at Stanford, contact your local system administrator to find out how SPIRES documents are made available there.

Updates to SPIRES Manuals

SPIRES manuals are updated regularly as changes are made to the system. This does not mean that all manuals are out of date with each new version of SPIRES. The changes to the documentation match those made to SPIRES: they are usually minor and/or transparent. Not having the most current version of a manual may mean you do not have all the most recent information about all the latest features, but the information you do have will usually be accurate.

A public subfile, SPIRES DOC NOTES, contains information about changes to SPIRES manuals. Using this subfile, you can determine whether the manual you have has been updated and if so, how significant those updates are. You need to know the date your manual was published, which is printed at the top of each page. For details on the procedure, issue the command SHOW SUBFILE DESCRIPTION SPIRES DOC NOTES.


* UPROC   8.3
FRAMES   1.2
IF...THEN... UPROC   8.3

UPROC, *   8.3
UPROC, -   8.3
UPROC, ASK   8.3
UPROC, IF...THEN...   8.3
UPROC, LET   8.3
UPROC, SET   8.3