SPIKE.MISC

INDEX
* General Documentation
B
B.7 Understanding and Coding Index Records
B.7.1 How Indexing Works
B.7.2 Understanding Simple Indexes
B.7.3 Understanding Qualifiers
B.7.4 Understanding Sub-Indexes
B.7.5 Understanding Combined Indexes
B.7.6 The Impact of Global FOR and ALSO on Indexing
B.7.7 Index Definition
B.7.8 Coding Simple Indexes
B.7.9 Coding Simple Indexes with Qualifiers
B.7.10 Coding Combined Indexes
B.7.11 Coding Sub-Indexes
B.7.12 Index Record and Goal Record Elements
B.7.13 Index Records as Goal Records
B.7.14 Index Records for Non-Removed Record Types
B.7.15 Ensuring the Validity of Index Records
B.7.16 Personal Name Algorithm Details
B.8 Understanding and Coding the Linkage Section
B.8.1 Functions of the Linkage Section
B.8.2 The Global Parameters Section
B.8.3 Individual Index Linkages
B.8.4 Simple Indexes
B.8.5 Sub-Indexes
B.8.6 Global Qualifiers
B.8.7 Local Qualifiers
B.8.8 Combined Indexes
B.8.9 Coding SRCPROC Rules
B.8.10 The NOPASS Statement
B.8.11 Coding PASSPROC Rules
B.8.12 Choosing the "Fetcher" PASSPROC
B.8.13 Other Actions in a PASSPROC Rule String
B.8.14 How Passing Works
1 Triples
1.1 Triple Functions
1.1.1 $MAKE
1.1.2 $MADE
1.1.3 $UNMAKE
1.1.4 $UNMAKETRIPLE
1.1.5 $LOOKUP
1.2 Decomposing Triples
1.2.1 $ATTRIBUTE
1.2.2 $OBJECT
1.2.3 $VALUE
1.3 Groups
1.3.1 $GROUP
1.3.2 $GROUPSIZE
1.3.3 $GROUPELEM
1.3.4 $GROUPSORT
1.4 SHOW & CLEAR TRIPLES
2 Add Function Documentation
2.1 SPIRES Functions -- Background
2.2 SPIRES Functions -- Implementation
2.3 SPIRES Functions -- Installation
2.4 SPIRES Functions -- Documentation
2.5 SPIRES Functions -- Distribution
3 Add Variable Documentation
3.1 SPIRES Variables -- Background
3.2 SPIRES Variables -- Implementation
3.3 SPIRES Variables -- Installation
3.4 SPIRES Variables -- Documentation
3.5 SPIRES Variables -- Distribution
8 The Host Language Interface (HLI)
9 UPDCLOSE Processing: INCLOSE
9.1 Intermediate Form
9.2 Structure Processing Dsect
9.2.1 FATL
9.2.2 NXTL
9.2.3 LSTL
9.2.4 NXTH
9.2.5 LSTH
9.2.6 SLOC
12 Partial Record Processing: Partial FOR
12.1 Introduction
12.2 Current Capabilities
12.3 Concept of Partial Processing
12.4 Record Level Commands
12.5 Record Navigation
12.6 Partial Processing Commands
12.7 Partial Processing UPDATE and MERGE Capabilities
12.8 General Information
12.9 The FOR * command
12.10 Partial Processing to the rescue

* General Documentation

B

B.7 Understanding and Coding Index Records

B.7.1 How Indexing Works

Let's consider what happens when we want to build an index. Suppose we had a subfile called "TABLE OF CONTENTS"; each record in the subfile is a chapter number (the key of the record) and a chapter title. If an appropriate format were written, the table of contents for the first seven chapters of Part B of this manual might look like this:

Chapter   1: Goal Record Concepts and Definition
Chapter   2: Goal Record Keys, Slot and Removed Records
Chapter   3: Structures
Chapter   4: Processing Rules: INPROC, INCLOSE, OUTPROC
Chapter   5: FILEDEF Subfile and SPICOMP Compiler
Chapter   6: File Structure: Tree and Slot, Goal and Index Records
Chapter   7: Understanding and Coding Index Records

An index based on the words appearing in those chapter titles might look like this:

AND                INDEX               REMOVED
  1, 2, 5, 6, 7      6, 7                2
CODING             INCLOSE             RULES
  7                  4                   4
COMPILER           INPROC              SLOT
  5                  4                   2, 6
CONCEPTS           KEYS                SPICOMP
  1                  2                   5
DEFINITION         OUTPROC             STRUCTURE
  1                  4                   6
FILE               PROCESSING          STRUCTURES
  6                  4                   3
FILEDEF            RECORD              SUBFILE
  5                  1, 2                5
GOAL               RECORDS             TREE
  1, 2, 6            2, 6, 7             6

In fact, an index at the end of a book is a good example of the structure of a simple SPIRES index. The record definition for this index would look like this:

RECORD-NAME = REC02;
  REQUIRED;
      KEY = TITLE-WORD;
      ELEM = GOAL-RECORD-KEY;

The element "TITLE-WORD" contains one of the words in each of the titles. Since it is the key of the record, it can only occur once in each record. So, each word ("AND", "CODING", etc.) in the above index is the key of a separate record in the index record-type. But notice that for each "TITLE-WORD" there may be several occurrences of "GOAL-RECORD-KEY", each occurrence pointing to the goal record in which the title word occurs.

The record in this index record-type for the TITLE-WORD "GOAL" would look something like this:

TITLE-WORD = GOAL;
      GOAL-RECORD-KEY = 1;
      GOAL-RECORD-KEY = 2;
      GOAL-RECORD-KEY = 6;

This is an index record whose key is a "word" (here, "GOAL") and which contains a set of pointers to the goal records that have that word in their titles. Every record stored in this index has the same shape: its key is a word from a chapter title, and it holds one or more pointers, each to a goal record whose TITLE element contains that word.

A SPIRES search of this index could look like this:

-? find title-word goal
-RESULT:  3 CHAPTER(S)

While a DISPLAY <key> command searches the goal record tree for the key named, an index record is searched by the FIND <searchterm> <key> command. In the example above, SPIRES locates the record that has the key-value "goal" in the index named in the FIND command, and counts the pointers to the goal record data set to indicate how many records are likely to be in the search result. (The number reported may not be entirely accurate, since a single goal record may have several pointers to it in the same index record; if this is the case, SPIRES will report a corrected number in the search result after it has examined the records via the TYPE or OUTPUT commands.)
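The build-and-search cycle described above can be sketched in Python. This is only an illustration of the idea, not how SPIRES is implemented: the names TITLES, build_index and find are invented here, and the sketch deliberately stores one pointer per word occurrence (an assumption) so that the difference between the raw pointer count and the corrected record count is visible.

```python
import re
from collections import defaultdict

# Goal records: chapter number (the key) and chapter title, from the sample
# table of contents.
TITLES = {
    1: "Goal Record Concepts and Definition",
    2: "Goal Record Keys, Slot and Removed Records",
    3: "Structures",
    4: "Processing Rules: INPROC, INCLOSE, OUTPROC",
    5: "FILEDEF Subfile and SPICOMP Compiler",
    6: "File Structure: Tree and Slot, Goal and Index Records",
    7: "Understanding and Coding Index Records",
}


def build_index(titles):
    """Return {TITLE-WORD: [GOAL-RECORD-KEY, ...]}, one entry per index record."""
    index = defaultdict(list)
    for key in sorted(titles):
        for word in re.findall(r"[A-Za-z]+", titles[key]):
            # Assumption: one pointer is stored per occurrence of the word.
            index[word.upper()].append(key)
    return dict(index)


def find(index, word):
    """Return (raw pointer count, distinct goal record keys) for one word."""
    pointers = index.get(word.upper(), [])
    return len(pointers), sorted(set(pointers))


INDEX = build_index(TITLES)
print(find(INDEX, "goal"))  # raw count and the chapters actually pointed to
print(find(INDEX, "and"))   # "and" occurs twice in the title of chapter 6
```

In this sketch the word "and" yields six pointers but only five distinct chapters, which is the kind of discrepancy the parenthetical note above describes.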

An index record thus looks very much like a goal record as far as SPIRES is concerned. It has a key of fixed or varying length, depending upon the nature of the data being passed to the index; it has a multiply occurring element called a pointer (that may be the key of a structure). Since the FIND command attempts to locate records by key, the most efficient structure for storing and locating index records will be a tree structure. Typically, the records in an index are not REMOVED, since they are usually quite small, allowing a large number of them to fit into a single tree block. Index records are thus structured and stored in a manner identical to that for goal records, except that we can take advantage of their small size.

The simple definition for the TITLE-WORD index shown above is probably not what most file definers would specify, especially if the goal record (the chapter titles) were a REMOVED record type. If the goal record is REMOVED, then its indexes do not usually store goal record keys, but addresses of goal records in the residual data set. The index definition would probably look more like this:

RECORD-NAME = REC02;
  REQUIRED;
    KEY = TITLE-WORD;
  OPTIONAL;
    ELEM = GOAL-REC-LOCATION; TYPE = LCTR;

Index records exhibit a new type of element, the locator, denoted by "TYPE=LCTR." This element refers to ("locates") a goal record, not by its key, but by its address (location) in the residual data set. To see why this is done, consider the sequence of events for a FIND and TYPE command. SPIRES searches an index and accumulates a list of pointers to the goal records, and reports on the number of pointers found. If each of the pointers were in the form of a goal record key, then the TYPE command would cause SPIRES to read blocks of the goal record tree until it found the location of the referenced record in the residual; then SPIRES would access the residual. In almost all cases, the middle step of searching the goal record tree for the record's location in the residual can be eliminated by storing that location itself as the pointer, rather than the key of the record; this optimization can only be done when the goal records are REMOVED.
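The saving from a locator can be sketched in Python by counting accesses. The dictionaries GOAL_TREE and RESIDUAL and the addresses in them are invented stand-ins, and each lookup is counted as a single access, which is a simplification (a real tree search may read several blocks).

```python
# Hypothetical stand-ins for the two data sets of a REMOVED goal record type.
RESIDUAL = {
    1000: "Chapter 1 record",
    1040: "Chapter 2 record",
    1080: "Chapter 6 record",
}
GOAL_TREE = {1: 1000, 2: 1040, 6: 1080}  # goal record key -> residual address


def fetch_by_key(keys):
    """Pointers are goal record keys: search the tree, then read the residual."""
    accesses = 0
    records = []
    for key in keys:
        address = GOAL_TREE[key]          # middle step: find the location
        accesses += 1
        records.append(RESIDUAL[address])
        accesses += 1
    return records, accesses


def fetch_by_locator(locators):
    """Pointers are locators: read the residual directly."""
    records = [RESIDUAL[address] for address in locators]
    return records, len(locators)


by_key, key_accesses = fetch_by_key([1, 2, 6])
by_locator, locator_accesses = fetch_by_locator([1000, 1040, 1080])
print(key_accesses, locator_accesses)
```

Both routes return the same records; the locator route simply skips the tree lookup for each one.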

To be precise, a pointer to a non-REMOVED record type is not declared TYPE=LCTR, since it contains a goal record key rather than a pointer to a location in the residual data set. Only the pointer to a REMOVED record type can be TYPE=LCTR.

Since SPIRES creates and maintains indexes automatically, the file definer must tell SPIRES how and what information is to go from the goal record to a certain index. The file definer specifies how this is done in the "Linkage Section" that follows all of the index record definitions and precedes the "Subfile Section" that specifies subfile name and privileges.

As its name implies, the Linkage Section links the goal and index records; it defines how information is passed from the goal to the index records when file updating is done, and it defines how indexes are to be searched. The details of coding the Linkage Section are covered in the next chapter "Understanding and Coding the Linkage Section." [See B.8.] With this brief look at the structure of a very simple index record, we can now consider the different methods of indexing available to the file definer. Each method has a search and retrieval situation for which it is particularly well suited. For any file that will be searched often, or will contain more than one thousand records, indexing plans should be discussed with the SPIRES consultant. For each indexing strategy described below, guidelines for its use are also presented.

B.7.2 Understanding Simple Indexes

Simple indexes may be defined for files of any size. Their structure and use by the system is "simple" and efficient. Here is a picture of two records in a simple index:

REQUIRED (or FIXED) KEY  1
      OPTIONAL POINTER  3
      OPTIONAL POINTER  2
      OPTIONAL POINTER  1
REQUIRED (or FIXED) KEY  2
      OPTIONAL POINTER  2
      OPTIONAL POINTER  1

Also, simple indexes are the only type of index for which a thesaurus and/or synonym can be maintained. [See C.2, C.3.]

If an element to be indexed has only a few possible values, it may be best to "search" this element using Global FOR, or perhaps index it as a "qualifier" (see below). Any time a search request would retrieve a large percentage of the records in a file (seventy percent or so), simple indexes may not be the best search mechanism. For example, an index built on the sex (male or female) of people in a personnel file may or may not be necessary, depending upon the search situation. If that index will not be searched frequently, it may be cheaper to search the goal records sequentially (using FOR or ALSO) than to pay the cost of building, updating, and storing the very large index record entries. If a search result will frequently be narrowed by a "sex" criterion, then "sex" might be added to another index or indexes as a qualifier.

B.7.3 Understanding Qualifiers

Qualifiers provide a search flexibility for large files, allowing search requests to be narrowed by the specification of criteria that would be inefficient to search and index otherwise (such as the language in which a program is written in the MASTERLIST subfile--only four or so possibilities exist). Qualifiers should be used sparingly: they must be stored redundantly in each index to which they apply, generating high storage costs. Let's look at the structure of a simple index with one qualifier.

If we were to qualify the title-word index shown in the beginning of this chapter with a STATUS element, allowing only the values "Preliminary," "Current," and "Out of Date," the structure of the index and a sample record in it would be something like this:

Structure:                   Sample Record:

RECORD-NAME = REC02;         TITLE-WORD = AND;
  REQUIRED;                    POINTER = 3;
   KEY = TITLE-WORD;             STATUS = CURRENT;
  OPTIONAL;                    POINTER = 2;
   ELEM = POINTER-STR;           STATUS = PRELIMINARY;
     TYPE = STR;               POINTER = 1;
STRUCTURE = POINTER-STR;         STATUS = OUT OF DATE;
  REQUIRED;
   KEY = POINTER;
     TYPE = LCTR;
   ELEM = STATUS;  OCC = 1;

As you can see, the qualifier is stored with each pointer. Thus, a qualifier takes up quite a bit of space, relative to a simple index on an element with only a few values (such as STATUS). But, the time required to search on the basis of a qualifier is less than that for searching two indexes, especially if one of them has only a few large records (entries) in it. This is because a qualifier search request narrows a search by operating off an existing search result stack; a search involving two indexes requires SPIRES to build two search results, and AND them together.
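The difference between the two search paths can be sketched in Python. The data mirrors the sample record above; STATUS_INDEX is a hypothetical separate simple index on STATUS, included only for comparison, and the function names are invented for this sketch.

```python
# Index with a qualifier: each pointer carries its STATUS value.
TITLE_WORD_INDEX = {
    "AND": [(3, "CURRENT"), (2, "PRELIMINARY"), (1, "OUT OF DATE")],
}

# Hypothetical alternative: a separate simple index on STATUS.
STATUS_INDEX = {"CURRENT": [3], "PRELIMINARY": [2], "OUT OF DATE": [1]}


def find_with_qualifier(word, status):
    """Narrow the existing pointer list by the value stored with each pointer."""
    return sorted(p for p, s in TITLE_WORD_INDEX.get(word, []) if s == status)


def find_with_two_indexes(word, status):
    """Build two search results, then AND them together."""
    first = {p for p, _ in TITLE_WORD_INDEX.get(word, [])}
    second = set(STATUS_INDEX.get(status, []))
    return sorted(first & second)


print(find_with_qualifier("AND", "CURRENT"))
print(find_with_two_indexes("AND", "CURRENT"))
```

The answers agree, but the second path has to read a second index record, and an index on an element with only a few values has very large records.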

So, if search time is more important than storage cost, and you will frequently want to qualify a search request by a certain criterion or criteria (there can be more than one qualifier for an index), a qualifier may be appropriate.

Several other facts about qualifiers will influence a decision on their use: 1) they may only be used with the AND and AND NOT logical operators; 2) they allow the full range of relational operators, such as ">" and "<"; 3) they can only be used after a search request involving the index to which they are attached. For example, assume DATE is a qualifier to a TITLE index:

-? find title a month of sundays
-RESULT: 1 BOOK(S)
-? and date after July 1972
-RESULT: 1 BOOK(S)
-? or date before 1975
-ILLEGAL USE OF QUALIFIER
(here the qualifier was used with the OR logical operator)

-? find date after July 1972
-ILLEGAL USE OF QUALIFIER
(here the qualifier was used without a preceding search result)

One additional requirement is that the qualifier must occur in any index(es) to which it applies; note that a global qualifier usually is a REQUIRED element in the OPTIONAL pointer structure--if the POINTER occurs, then the qualifier must occur also. For both global and local qualifiers, this means that the element being passed from the goal record as the qualifier may not be optional, or a default value must be supplied by pass processing rules if the element does not occur in the goal record.

Qualifiers may also be "local" or "global." A local qualifier may only be used after the index to which it applies has been named in a search request. A global qualifier may be used any time after the first FIND command referencing any index. Global qualifiers are stored redundantly on every pointer of every index; they are thus quite expensive from a standpoint of storage costs.
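The usage rules for qualifiers can be summarized as a small rule checker in Python. This models only the rules stated in this section; the Session class, its methods and the AUTHOR and STATUS names are hypothetical, and the sketch assumes the same message is given for every violation.

```python
class QualifierError(Exception):
    """Raised where the sample session prints -ILLEGAL USE OF QUALIFIER."""


class Session:
    def __init__(self, qualifiers):
        # Maps a qualifier name to the set of index names it is attached to
        # (local qualifier), or to None (global qualifier).
        self.qualifiers = qualifiers
        self.indexes_named = set()  # indexes named since the last FIND

    def find(self, term):
        if term in self.qualifiers:
            # A qualifier cannot start a search.
            raise QualifierError("ILLEGAL USE OF QUALIFIER")
        self.indexes_named = {term}

    def combine(self, operator, term):
        if term not in self.qualifiers:
            self.indexes_named.add(term)
            return
        attached_to = self.qualifiers[term]
        if attached_to is None:
            usable = bool(self.indexes_named)                # global
        else:
            usable = bool(self.indexes_named & attached_to)  # local
        if operator not in ("AND", "AND NOT") or not usable:
            raise QualifierError("ILLEGAL USE OF QUALIFIER")


def allowed(session, operator, term):
    """Return True if the qualified request would be accepted."""
    try:
        session.combine(operator, term)
        return True
    except QualifierError:
        return False


session = Session({"DATE": {"TITLE"}, "STATUS": None})
session.find("TITLE")
print(allowed(session, "AND", "DATE"))  # accepted: TITLE has been named
print(allowed(session, "OR", "DATE"))   # rejected: OR is not permitted
```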

B.7.4 Understanding Sub-Indexes

Sub-indexes are almost exclusively used with personal name indexes. The personal name search processing rule (SRCPROC A38) breaks a search value into two portions: last name, and first names. After searching the index record on the last name, the first names are used to determine which sub-index structures define the pointer groups. If no first names were given in the search request, all pointer groups in the index record are logically OR'd together.

Sub-indexes can be used in other ways besides personal names, and are searched by specifying the commercial "at" ("@") character. For example, we might make CITY a sub-index of STATE and search as follows:

-? find state california @ city palo alto

or make SEAT and ROW sub-indexes of SECTION:

-? find section h @ row aa @ seat 52

Sub-indexes can be useful when things logically fit inside other things, as cities do in states, or seats do in sections. They allow you to choose a subset of the index as a result.
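One way to picture a sub-index is as a second level of keys inside each index record. The Python sketch below uses invented pointer values and an invented find_state function, and it assumes that omitting the sub-index value ORs all pointer groups together, as described above for personal names.

```python
# Each STATE index record holds pointer groups keyed by the CITY sub-index.
STATE_INDEX = {
    "CALIFORNIA": {"PALO ALTO": [11, 12], "SAN JOSE": [13]},
    "OREGON": {"PORTLAND": [21]},
}


def find_state(state, city=None):
    """Model of: find state <state> @ city <city>."""
    groups = STATE_INDEX.get(state.upper(), {})
    if city is None:
        # No sub-index value given: all pointer groups are OR'd together.
        return sorted({p for pointers in groups.values() for p in pointers})
    return sorted(groups.get(city.upper(), []))


print(find_state("california", "palo alto"))
print(find_state("california"))
```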

B.7.5 Understanding Combined Indexes

Several elements can be passed to a single combined index (only one combined index may be defined per goal record), requiring somewhat less storage space than several simple indexes. Passing several elements to a single combined index is necessary when the number of indexes defined for a goal record would require that more record-types be defined in a file definition than are allowed (64 is the maximum). Combined index organization is most efficient when the data elements are numerics (often requiring relational operators for effective searching) and short alphanumerics (such as codes).

However, there are significant disadvantages to combined indexing strategies when a file is large (more than eight thousand records) and is searched or updated frequently. As the file gets large, the cost of updating a combined index with many entries gets progressively greater; combined indexes are also somewhat more time consuming and expensive to search than simple indexes, usually requiring several disk accesses to retrieve the large records they contain from the residual data set. Also, many of the elaborate search and pass processing rules (SRCPROC and PASSPROC) available for simple indexes are not available for combined indexes. In addition, the BROWSE command cannot be used to inspect the contents of a combined index.

All of the advantages and disadvantages of combined indexes arise either from the search and update techniques they require, or from their structure, which is similar to indexes with local qualifiers. In a simple index, there is one index record for each unique value passed from the goal records; if several goal records had the same value, then the one index record for that value would have multiple occurrences of pointers to the goal records. For combined indexes, however, there is one index record for each element-mnemonic in the goal record that passes to the index, and every unique value that that mnemonic has forms an occurrence of a pointer structure containing the pointer and the value of the element in the goal record that is being pointed to. For example: if a goal record passes TEMPERATURE, AGE and DATE to a combined index, the goal and index records would look like this:

Goal Records:             Combined Index Records:

PATIENT-NUMBER = 12345;   KEY = 0,1; (TEMPERATURE)
TEMPERATURE = 99;           POINTER-GROUP;
AGE = 60;                     POINTER = 54321;
DATE = 01/21/70;              ELEM-VALUE = 101;
                            POINTER-GROUP;
                              POINTER = 12345;
                              ELEM-VALUE = 99;

PATIENT-NUMBER = 54321;   KEY = 0,2;    (AGE)
TEMPERATURE = 101;          POINTER-GROUP;
AGE = 44;                     POINTER = 54321;
DATE = 06/05/69;              ELEM-VALUE = 44;
                            POINTER-GROUP;
                              POINTER = 12345;
                              ELEM-VALUE = 60;

                          KEY = 0,3;    (DATE)
                            POINTER-GROUP;
                              POINTER = 54321;
                              ELEM-VALUE = 06/05/69;
                            POINTER-GROUP;
                              POINTER = 12345;
                              ELEM-VALUE = 01/21/70;

Note that the records shown in the right column are created and maintained by passing. The key of the record is a combination of the structure and element number of the element that is being passed from the goal record; these keys are computed by SPIRES.

When a search request is made against a combined index,

-? find age > 64

the single index record containing all of the AGE values in the goal record is read, then all the values (ELEM-VALUE, above) in the index record read in are scanned, and pointer groups not meeting the criteria are weeded out of the search result. Because a single record containing all AGE values exists in a combined index on AGE, a command such as

-? find age

is possible; the result will be all records in which the AGE element passed a value to the index.
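The read-then-scan behavior of a combined index can be sketched in Python using the two patient records shown earlier. For readability the sketch keys each index record by the element mnemonic rather than by the computed structure and element numbers; the function names are invented for illustration.

```python
import operator

GOAL_RECORDS = {
    12345: {"TEMPERATURE": 99, "AGE": 60, "DATE": "01/21/70"},
    54321: {"TEMPERATURE": 101, "AGE": 44, "DATE": "06/05/69"},
}

OPERATORS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def build_combined_index(records):
    """One index record per element; each holds (POINTER, ELEM-VALUE) groups."""
    index = {}
    for key, record in records.items():
        for element, value in record.items():
            index.setdefault(element, []).append((key, value))
    return index


def find(index, element, op=None, value=None):
    """Read the single record for the element, then scan its values."""
    groups = index.get(element, [])
    if op is None:
        # "find age": every record that passed a value to the index.
        return sorted({pointer for pointer, _ in groups})
    test = OPERATORS[op]
    return sorted({pointer for pointer, v in groups if test(v, value)})


INDEX = build_combined_index(GOAL_RECORDS)
print(find(INDEX, "AGE", ">", 50))
print(find(INDEX, "AGE"))
```

Every search on AGE scans the whole AGE record, which is why the cost grows with the number of goal records passing to it.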

A combined index record may grow quite large if 1) it contains many values because the number of goal records passing to it is large, or 2) the values passing to it are long, such as lengthy character strings. Since a large record must be read and then scanned, searching a combined index, particularly in medium and large sized files, may give a noticeably slower response than searching a simple index in the same subfile. Also, updating such a large record is more time consuming and thus expensive than updating a simple index; because the large records in a combined index may often overflow the 2048-byte limit for an ORVYL file block, multiple disk accesses may be necessary to search for or update a single record.

If the file is not large, or if the elements being indexed do not occur in a majority of the goal records, or if updating is not done nightly, then combined indexes are quite suitable for numerics and short alphanumerics, such as codes.

B.7.6 The Impact of Global FOR and ALSO on Indexing

The Global FOR and ALSO commands provide substitutes for a combined index in a large file. Of course, these methods involve sequential rather than indexed searching, and will be noticeably slower (more elapsed and CPU time required) than a combined index search unless the existing search result is small.

The ALSO command always examines all the goal records pointed to in an existing search result; this capability is also available using the Global FOR commands. In contrast to the FOR and ALSO sequential search, an index search request preceded by another search request operates as any compound search request: two or more subsets of pointers are built, one for each of the search criteria, then put together into a single search result. For this reason, if a search request requiring relational operators were always preceded by search requests yielding a relatively small search result, such a request might be performed most efficiently using a Global FOR or ALSO command.

Another consideration: if searching is done sequentially by the Global FOR or ALSO commands, then no expenses are incurred for building, updating and storing combined or simple indexes. If search requests against the values in some elements will be quite infrequent, it may be advisable to use sequential search techniques rather than indexed search techniques. Retrieval may be slower, but costly indexes of little use will not be maintained.

There are some cautions to the use of sequential searching techniques, however. Unlike the FIND command, the ALSO command cannot initiate a search; it must always operate on a preceding search result--in this respect it is like a Qualifier. Unlike the ALSO command, the Global FOR commands need not operate via a search result, but can.

The search criteria for Global FOR commands are specified in the WHERE clause. Two additional operators are available in the Global FOR WHERE clause: OCCURS and LENGTH; these are not available to the FIND or ALSO commands. OCCURS allows a user to specify search criteria based on the number of occurrences of an element, and LENGTH allows criteria based on the length of any single occurrence. For example, suppose you wished to print mailing labels from your subfile's records, but first wanted to print all addresses that would not fit on standard labels. This might be done as follows:

-? select mailing list
-? set format large-labels
-? for tree where address occurs > 4 or address length > 35
+? in active clear clean display all
<-- records are placed in active file and displayed
+? endfor
-END OF GLOBAL FOR
-?

This would place in the active file the subset of all goal records that had more than four lines of ADDRESS or had any occurrence of the element ADDRESS that was longer than the width of a label, 35 characters.
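The meaning of the OCCURS and LENGTH criteria in that WHERE clause can be sketched as a sequential filter in Python. The sample records and the function name are invented; only the two thresholds (four lines, 35 characters) come from the example.

```python
RECORDS = [
    {"NAME": "A", "ADDRESS": ["1 Main St", "Palo Alto, CA"]},
    {"NAME": "B", "ADDRESS": ["Room 5", "Bldg 2", "Dept X", "1 Main St", "Palo Alto"]},
    {"NAME": "C", "ADDRESS": ["x" * 36]},
]


def too_big_for_label(record, max_lines=4, max_width=35):
    """Model of: where address occurs > 4 or address length > 35."""
    lines = record.get("ADDRESS", [])
    too_many = len(lines) > max_lines                      # OCCURS
    too_long = any(len(line) > max_width for line in lines)  # LENGTH
    return too_many or too_long


# Every record is examined in turn; no index is consulted.
oversized = [record["NAME"] for record in RECORDS if too_big_for_label(record)]
print(oversized)
```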

Note from the above example that the Global FOR commands do not automatically provide you with a count of the number of records meeting the criteria specified in the (optional) WHERE clause. This is because the FOR command itself does not initiate a search of the file; the file is not searched until another command is issued that specifies what is to be done with the records--remove them, display them, dequeue them, etc. A count can be obtained, however, and a new set of WHERE criteria specified if the number is too small. The following example shows this process, which involves several examinations of the goal records in the search result, and is therefore rather time-consuming and expensive:

-? find name smith
-RESULT:   22 PERSON(S)
-? for result where sex = male and eyes = blue
+? skip last
+? show level
-FOR RESULT    4       22
+? for result where sex = male
+? skip last
+? show level
-FOR RESULT   11       22
+? endfor
-END OF GLOBAL FOR

The system's response to the SHOW LEVEL command gives two numbers: The second indicates the number of records examined--here it is 22, the same as the number of records in the search result. The first number indicates how many of the records examined met the criteria specified in the WHERE clause--4 for the first WHERE clause and 11 for the second.

The same results are more directly obtained by the use of the ALSO command, which gives an indication of the number of records meeting the criteria immediately, just as a FIND or other index search command does. For example:

-? find name smith
-RESULT:   22 PERSON(S)
-? also sex = male and eyes = blue
-RESULT:    4 PERSON(S)
-? backup
-RESULT:   22 PERSON(S)
-? also sex = male
-RESULT:   11 PERSON(S)
-?

If further index searching commands (e.g. AND, OR) are necessary, then the ALSO command must be used, since the "result" of a Global FOR command is not a set of pointers in a search result. The pointers in a search result can be combined logically with the pointers meeting the criteria specified in subsequent search commands. If, however, the records meeting the WHERE criteria are to be displayed at the terminal or placed in the active file, regardless of their number, then Global FOR is a far more efficient way to do this than the ALSO command.

Compare the two search scenarios following:

-? find name smith        -? find name smith
-? also eyes string blue  -? for result where eyes string blue
-? type                   +? display all
... ...                   ... ...
-?                        +? endfor

The second series of search commands is almost twice as efficient as the first. With the ALSO command, the system must read the goal records to examine the EYES element, then read the records meeting the criteria a second time when a TYPE command is given. With the FOR RESULT command, the record is read to examine the EYES element, then, while the record is still in main memory, it is displayed on the terminal. The net effect is that a record is accessed only once when FOR RESULT is used.
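The difference in record accesses can be sketched in Python with a store that counts its reads. CountingStore, the sample records and the function names are invented for this illustration, and the STRING operator is modeled as a simple substring test.

```python
class CountingStore:
    """Goal records with a counter of how many times a record is read."""

    def __init__(self, records):
        self.records = records
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.records[key]


SMITHS = {
    1: {"NAME": "SMITH", "EYES": "blue"},
    2: {"NAME": "SMITH", "EYES": "brown"},
    3: {"NAME": "SMITH", "EYES": "blue-green"},
    4: {"NAME": "SMITH", "EYES": "hazel"},
}


def has_blue_eyes(record):
    return "blue" in record["EYES"]


def also_then_type(store, result, predicate):
    """ALSO reads each record to test it; TYPE reads the survivors again."""
    narrowed = [key for key in result if predicate(store.read(key))]
    return [store.read(key) for key in narrowed]


def for_result_display(store, result, predicate):
    """FOR RESULT tests and displays a record while it is still in memory."""
    displayed = []
    for key in result:
        record = store.read(key)
        if predicate(record):
            displayed.append(record)
    return displayed


also_store = CountingStore(SMITHS)
for_store = CountingStore(SMITHS)
also_shown = also_then_type(also_store, [1, 2, 3, 4], has_blue_eyes)
for_shown = for_result_display(for_store, [1, 2, 3, 4], has_blue_eyes)
print(also_store.reads, for_store.reads)
```

The closer the number of matching records is to the size of the search result, the closer the ALSO path comes to reading every record twice.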

In addition, Global FOR commands can be used for many record management functions other than what has been described. Here we have just exhibited its capabilities with respect to those of the ALSO command. The Global FOR commands facilitate a full range of data base and record management functions unavailable otherwise. All file owners and managers should be familiar with the capabilities of Global FOR for sequential subfile search and subsetting. Consult "SPIRES/370 Searching and Updating" for an introduction to Global FOR searching.

B.7.7 Index Definition

Having considered the different indexing options available to the file definer, and having described the functional differences among them, we can now attack the practical problem of coding the record definitions for the different indexes a file will have.

Subfiles may have one, several or no index records defined. There usually is one index record definition for each simple index in a file. One index record definition could be for a combined index in the subfile (remember that only one such index can be defined per subfile). Through a process called "passing", a combined index typically receives values from more than one element in the goal record, while a simple index typically receives values from only one element in the goal record. However, it is entirely possible for a combined index to have its values passed from a single goal record element. And it is also possible for more than one element in the goal record to pass to a simple index; this situation is known as "multiple passers." There may not be more than one combined index per subfile, but there may be more than one combined index defined in a file that has more than one subfile. There may be a large number of simple indexes, provided that the total number of records defined for a file (goal and index records) does not exceed sixty-four.

The different kinds of indexes a subfile has influence the kinds of records defined for all indexes in a subfile. This is due to one of the primary rules of coding index record definitions: all pointer groups to the same goal record must "look alike" in terms of their structure. This means that if there is a combined index or a simple index with a qualifier for a goal record, the pointer groups in all indexes to that goal record must exhibit the structure of a combined index or simple index with a qualifier.

It is fairly easy to reduce the definition of most index records to a "recipe," and "A Guide to Coding Index Record Definitions" [See D.5.] gives recipes for the indexes encountered in most SPIRES file definitions. The following sections describe how to code index record definitions in such a way that you can see the reason for their structure.

B.7.8 Coding Simple Indexes

If all indexes to a single goal record are to be simple indexes, then the structure of each index record might look something like the following:

RECORD-NAME = record-name;       RECORD-NAME = record-name;
  REQUIRED;                        FIXED;
    KEY = element-name;              KEY = element-name;  LEN = n;
  OPTIONAL;                        OPTIONAL;
    ELEM = pointer-name;             ELEM = pointer-name;
      TYPE = LCTR;                     TYPE = LCTR;

The only essential difference between the two is that one declares the key to be fixed in length, the other declares it to be varying in length. What is the key? The key of a simple index record is, in almost all cases, the value of an element passed from the goal record. Thus, if you were passing a fixed binary number, such as a price or date, you would want to specify that the key is fixed. The length is the same as the length of the stored value in the goal record.

It is wise to choose names carefully for the elements whose values are shown in lower case. "Record-name" can be up to six characters long; its value is used by SPIRES to sort record definitions for both goal and index records into alphabetical sequence. By tradition, the name REC01 has often been used for the goal record, and REC02, REC03, etc., have been chosen for the index records.

The "element-name" may be anything up to sixteen characters long. For simplicity, it is usually best to give this element the same name as the name of the goal record element that passes its value to this index record.

The "pointer-name" again may be anything, but the name you choose must be coded in the linkage section. One additional requirement falls on the "pointer-name": it must be given the same name in all indexes for a particular goal record. This is the second primary rule for coding indexes. The pointer element is often given the mnemonic name "POINTER." In the example shown above, the pointer element is an optional multiply occurring simple element. If possible, it is best for the pointer element to be fixed length. That's because SPIRES can do logical operations more efficiently when the pointer element is fixed length, and fixed length elements take less overhead in the index records.

Only one other comment need be made about these two index record definitions. The pointer-name is said to be "TYPE=LCTR;" (fixed length of 4 bytes). The pointer-element is the "thing" that SPIRES tallies when it reports the number of records in a search result. It is also the element in which SPIRES stores the reference back to the goal record. If the file definer has coded "REMOVED;" for the goal record definition, then this "reference back" or pointer is usually in the form of the four-byte location (a "locator") of the goal record in the residual dataset. If the goal-record has not been removed, then TYPE=LCTR may not be coded. (However, even if the goal-record has been removed, in certain circumstances you may not want TYPE=LCTR.)

Let's look now at a very simple bibliographic file definition which contains two indexes, one for titles and one for dates. The date element will pass its value to the FIXED key of an index. Note that the following definition is incomplete in that it doesn't say how this "passing" is to occur. This process is defined in the next chapter.

FILE = GA.SPI.BIBLIOGRAPHY;
AUTHOR = JOHN SACK;
BIN = 907;
RECORD-NAME = BOOK;
  SLOT; REMOVED;
  FIXED;
    ELEM = DATE;  LEN = 4;  OCC = 1;
      INPROC = A31; OUTPROC = A76;
  REQUIRED;
    ELEM = TITLE;
      INPROC = A40;
RECORD-NAME = REC02;
  COMMENT = Simple index on TITLE.;
  REQUIRED;
    KEY = TITLE;
  OPTIONAL;
    ELEM = POINTER;
      TYPE = LCTR;
RECORD-NAME = REC03;
  COMMENT = Simple index on DATE.;
  FIXED;
    KEY = DATE;   LEN=4;
  OPTIONAL;
    ELEM = POINTER;
      TYPE = LCTR;

B.7.9 Coding Simple Indexes with Qualifiers

A qualifier adds another level of "depth" to a simple index record definition: it introduces a structure, the same kind of structure that was defined by "TYPE=STR" in the goal record.

The pointer element, which was only a simple data element containing a reference to a goal record, is now a structure. The structure is always a keyed structure, and the key is always the pointer element. The structure itself is optional, but its key is fixed if the key is TYPE=LCTR. This introduces the third rule of index definition: if the pointer element is in a structure, then it must be the key of the structure.

What of the other elements in the structure? Typically there is only one, the qualifier itself; if there is more than one qualifier, then there will be more than one qualifier element in the structure. The qualifier elements should always be defined in the index record definition with OCC=1. If the goal record element that is passed to the qualifier does not occur in the goal record, then special pass-processing rules (PASSPROC rules, covered in the next chapter) should be coded to provide a default value.

Let's examine the skeleton of a simple index with one qualifier. Next to it is shown a TITLE index in which there is a SUBJECT qualifier.

RECORD-NAME = record-name;       RECORD-NAME = REC02;
  REQUIRED; (or FIXED)             REQUIRED;
    KEY = element-name;              KEY = TITLE;
  OPTIONAL;                        OPTIONAL;
    ELEM = pointer-structure;        ELEM = POINTER-STR;
    TYPE = STR;                      TYPE = STR;
STRUCTURE = pointer-structure;   STRUCTURE = POINTER-STR;
  FIXED;                           FIXED;
    KEY = pointer-name;              KEY = POINTER;
    TYPE = LCTR;                     TYPE = LCTR;
  OPTIONAL;                        OPTIONAL;
    ELEM = qualifier-name;           ELEM = SUBJECT;
    OCC = 1;                         OCC = 1;

We can now expand the example file definition which had two simple indexes on DATE and TITLE to include a qualifier on the TITLE index. The new definition will illustrate another primary rule of coding indexes: if a pointer structure is used in one index, it must be coded in all indexes to that goal record, whether it is necessary to the structure of the specific index record in which it occurs or not. To see this, notice how the definition of the DATE index record has changed from its appearance in the previous example. Before, it was only a simple index; now, it looks like a simple index with a qualifier--even though no qualifier is passed to the DATE index.

FILE = GA.SPI.BIBLIOGRAPHY;
AUTHOR = JOHN SACK;
BIN = 907;
RECORD-NAME = BOOK;
  SLOT; REMOVED;
  FIXED;
    ELEM = DATE;  LEN = 4;  OCC = 1;
      INPROC = A31; OUTPROC = A76;
  REQUIRED;
    ELEM = TITLE;
      INPROC = A40;
    ELEM = SUBJECT;
      COMMENTS = Note that SUBJECT must always occur in the
      goal record, otherwise it can't be a qualifier.;
RECORD-NAME = REC02;
  COMMENT = Simple index on TITLE, with SUBJECT as qualifier.;
  REQUIRED;
    KEY = TITLE;
  OPTIONAL;
    ELEM = POINTER-STR;
    TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINTER;
    TYPE = LCTR;
  OPTIONAL;
    ELEM = SUBJECT;
    OCC = 1;
RECORD-NAME = REC03;
  COMMENT = Simple index on DATE.;
  FIXED;
    KEY = DATE;   LEN = 4;
  OPTIONAL;
    ELEM = POINTER-STR;
      TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINTER;
    TYPE = LCTR;
  OPTIONAL;
    ELEM = DUMMY;
    OCC = 1;

In general, pointer groups for indexes that apply to the same goal record must have identical structure; this is so SPIRES can AND and OR pointer groups when manipulating search results. If you need to violate this general rule, then the first index record-type defined in the linkage section for the goal-record is taken as the model for the other index record-types. If this record-type has the appropriate structure defined for it (as described above), then more specific rules can be used for other index record-types.
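
A small Python sketch may help show why identical structure matters. It reduces each pointer group to its key (the locator) and treats AND and OR as set intersection and union. This is an assumed simplification for illustration, not a description of SPIRES internals, and the sample search results are invented.

```python
def and_groups(a, b):
    """AND two search results: goal records that appear in both."""
    return sorted(set(a) & set(b), reverse=True)

def or_groups(a, b):
    """OR two search results: goal records that appear in either."""
    return sorted(set(a) | set(b), reverse=True)

# Hypothetical locators returned by a TITLE search and a DATE search.
title_hits = [0x30, 0x10]
date_hits = [0x30, 0x20]
```

The comparison of pointers from different indexes is only meaningful because both lists hold values of the same form; that is the point of the rule above.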

The specific rules for pointer group structures are as follows:

If there is no REQUIRED section, then all pointer groups must be identical through the length of the FIXED section. If one pointer group structure is declared with a LEN attribute, then all must be declared that same way, and only FIXED elements are allowed. This is the most efficient form of pointer group structure. If the pointer group structure is not declared with a LEN attribute, then any (or all) may have OPTIONAL elements.

If there is a REQUIRED section, then all pointer groups must be identical through the end of that section. If the only REQUIRED element is the KEY of the pointer group, then any (or all) may have OPTIONAL elements. If there are non-key REQUIRED elements, then if one pointer group has OPTIONAL elements declared, all must declare OPTIONAL elements.

Note that if an index record-type is to have multiple qualifiers passed to it, then the following definition is appropriate:

RECORD-NAME = record-name;       RECORD-NAME = REC02;
  REQUIRED; (or FIXED)             REQUIRED;
    KEY = element-name;              KEY = TITLE;
  OPTIONAL;                        OPTIONAL;
    ELEM = pointer-structure;        ELEM = POINTER-STR;
    TYPE = STR;                      TYPE = STR;
STRUCTURE = pointer-structure;   STRUCTURE = POINTER-STR;
  FIXED;                           FIXED;
    KEY = pointer-name;              KEY = POINTER;
    TYPE = LCTR;                     TYPE = LCTR;
  OPTIONAL;                        OPTIONAL;
    ELEM = qualifier-name1;          ELEM = SUBJECT;
    OCC = 1;                         OCC = 1;
    ELEM = qualifier-name2;          ELEM = PUBLISHER;
    OCC = 1;                         OCC = 1;

B.7.10 Coding Combined Indexes

A combined index record cosmetically looks very similar to a simple index with one qualifier. Two important differences must be noted. First, the KEY of the combined index record is always fixed with a length of two bytes. This is because the key of such a record is an encoded form of the element name that is being passed to this index. [See B.7.5.] If the elements DATE, AGE, and TEMPERATURE all pass to a combined index, there will be three keys, and hence three records, in the index. (The keys, with which the file definer and searcher need never be concerned, tell SPIRES the structure and element number of the element in the goal record definition.) Second, the element that previously named the qualifier is now given a generic name, usually something mnemonically significant, like "VALUE", since each occurrence of it contains one value passed from the goal record.

Below, a skeleton record definition is presented. Next to it is shown a combined index that contains a DATE occurrence. Notice that the word DATE never appears in the index definition. The linkage section specifies which element(s) will pass to the combined index record.

RECORD-NAME = record-name;       RECORD-NAME = REC03;
  FIXED;                           FIXED;
    KEY = element-number;            KEY = ELEM-NO;
      LEN = 2;                         LEN = 2;
  OPTIONAL;                        OPTIONAL;
    ELEM = pointer-structure;        ELEM = POINTER-STR;
    TYPE = STR;                      TYPE = STR;
STRUCTURE = pointer-structure;   STRUCTURE = POINTER-STR;
  FIXED;                           FIXED;
    KEY = pointer-name;              KEY = POINTER;
    TYPE = LCTR;                     TYPE = LCTR;
  OPTIONAL;                        OPTIONAL;
    ELEM = value-name;               ELEM = VALUE;
    OCC = 1;                         OCC = 1;

Notice how these record definitions follow one of the indexing rules: if the pointer (here, TYPE=LCTR) is in a structure, it must be the key of the structure.

As the following example shows, all of the other rules are followed: 1) if one index has a pointer structure (because it is a simple index with a qualifier or because it is a combined index) then all indexes must have pointer structures; 2) all pointer elements must have the same name in each index.

The following example is similar to the previous two in some ways. The goal record contains TITLE, SUBJECT and DATE elements; COST will now be introduced and placed in the combined index. Note how the definition is quite different from the first example, which showed only simple indexes; but its only difference from the previous example, which showed simple index qualifiers, is the addition of a combined index.

FILE = GA.SPI.BIBLIOGRAPHY;
AUTHOR = JOHN SACK;
BIN = 907;
RECORD-NAME = BOOK;
  SLOT; REMOVED;
  FIXED;
    ELEM = DATE;  LEN = 4;  OCC = 1;
      INPROC = A31; OUTPROC = A76;
  REQUIRED;
    ELEM = TITLE;
      INPROC = A40;
    ELEM = SUBJECT;
      COMMENTS = Note that SUBJECT must always occur in the
      goal record, otherwise it can't be a qualifier.;
  OPTIONAL;
    ELEM = COST;
      LEN = 4;
      INPROC = A42:1; OUTPROC = A81,7;
RECORD-NAME = REC02;
  COMMENT = Simple index on TITLE, with SUBJECT as a qualifier.;
  REQUIRED;
    KEY = TITLE;
  OPTIONAL;
    ELEM = POINTER-STR;
    TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINTER;
    TYPE = LCTR;
  OPTIONAL;
    ELEM = SUBJECT;
    OCC = 1;
RECORD-NAME = REC03;
  COMMENT = Simple index on DATE.;
  FIXED;
    KEY = DATE;   LEN = 4;
  OPTIONAL;
    ELEM = POINTER-STR;
      TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINTER;
    TYPE = LCTR;
  OPTIONAL;
    ELEM = DUMMY;
    OCC = 1;
RECORD-NAME = REC04;
  COMMENT = Combined index, to which COST will be passed.;
  FIXED;
    KEY = ELEM-NO;
    LEN = 2;
  OPTIONAL;
    ELEM = POINTER-STR;
      TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINTER;
    TYPE = LCTR;
  OPTIONAL;
    ELEM = VALUE;
    OCC = 1;

Note that the occurrence of the VALUE element is always 1. The structure containing this element occurs once for each value of an element passed from the goal record. A record passing two COST values would cause two POINTER-STRs to occur, each with a single POINTER and VALUE. When SPIRES retrieves such records, it reports a result, which is the number of POINTER-STRs that met the criteria specified; this count may be high, since a single record could be represented in the POINTER-STR list more than once. SPIRES will correct any erroneous result count after it has been asked to TYPE the records in the search result.
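
The difference between the reported count and the number of goal records can be sketched in Python. The model below is an illustration under assumptions: the encoded element number `COST`, the locators and the cost values are invented, and the functions are not SPIRES code.

```python
def pass_to_combined_index(index, elem_no, locator, values):
    """Add one POINTER-STR occurrence per value passed from the goal record."""
    groups = index.setdefault(elem_no, [])
    for value in values:
        groups.append({"POINTER": locator, "VALUE": value})

def raw_count(groups):
    """Count reported at search time: one per matching POINTER-STR."""
    return len(groups)

def corrected_count(groups):
    """Count after duplicates are resolved: one per distinct goal record."""
    return len({group["POINTER"] for group in groups})

combined = {}
COST = 4  # hypothetical encoded element number used as the index key
pass_to_combined_index(combined, COST, 0x10, [1250, 1495])  # two COST values
pass_to_combined_index(combined, COST, 0x20, [1250])
matches = [g for g in combined[COST] if g["VALUE"] > 1000]
```

The record at locator 0x10 contributes two POINTER-STRs, so the raw count is one higher than the number of records actually found.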

Notice that the POINTER-STR in REC02 (TITLE) has the same "form" as the POINTER-STR in REC03 (DATE) and REC04. SUBJECT, DUMMY, and VALUE all occupy the same position.

B.7.11 Coding Sub-Indexes

Sub-indexes, usually used only for personal name indexing, provide a variation on the theme of simple indexes and simple indexes with qualifiers. Sub-indexes cannot be defined as part of a combined index, but may be defined for simple indexes in subfiles that have combined indexes.

Sub-indexes provide a way of searching data that has a hierarchical organization. Two simple hierarchies might have the following structures:

PERSONAL NAME:                   AIRLINE TICKET:
  Last Name                        Flight
    First Name                       Section
    Middle Name                        Row
                                         Seat

It would not be useful to find all people with a first name of "John" in a subfile unless you had first established that you were interested only in people whose last name was "Smith." It would also not be helpful for an airline reservation system to be able to find all seats with the number 13 unless a particular flight had been established to restrict the domain of the search.

Two types of sub-indexes can be defined, one for subfiles that contain only simple indexes and no qualifiers (no pointer structures would be involved in this case) and one for subfiles that contain either qualifiers or combined indexes. The following example shows the two types of record definitions; each is defined for a personal name index. The key of such an index is the person's last name, and the key of the sub-index (which is a structure) is the rest of the person's name (first, middle, etc.). Note that the first name structure is not a pointer structure: the pointer or pointer structure is an optional element in the first name structure.

RECORD-NAME = REC02;                   RECORD-NAME = INDEX5;
  REQUIRED;                              REQUIRED;
    KEY = LAST-NAME;                       KEY = LAST-NAME;
  OPTIONAL;                              OPTIONAL;
    ELEM = FIRSTNAME-STRUCT;               ELEM = FIRSTNAME-STR;
      TYPE = STR;                            TYPE = STR;
  STRUCTURE = FIRSTNAME-STRUCT;        STRUCTURE = FIRSTNAME-STR;
    REQUIRED;                            REQUIRED;
      KEY = FIRST-NAME;                    KEY = FIRST-NAME;
      ELEM = POINTER;                    OPTIONAL;
        TYPE = LCTR;                       ELEM = POINTER-STR;
                                             TYPE = STR;
                                       STRUCTURE = POINTER-STR;
                                         FIXED;
                                           KEY = POINTER;
                                             TYPE = LCTR;
                                         OPTIONAL;
                                           ELEM = VALUE;
                                             OCC = 1;

B.7.12 Index Record and Goal Record Elements

Up to this point, the definition of an index record looks very similar to the definition of a goal record: there are keys, elements of fixed or varying length and optional elements, and there are structures. Several file definition elements have not appeared: SLOT, SLOTCHECK, REMOVED, INPROC, OUTPROC, and ALIASES.

SLOT, SLOTCHECK and REMOVED are rarely coded for index records. However, one element is often coded for index records that is not usually coded for the first record-type (usually the goal record) in the file definition; this is COMBINE. As explained in "Tree and Slot, Goal and Index Records," [See B.6.5.] COMBINE specifies that the data sets created by the compiler for the record definitions that specify COMBINE are to be merged into a single data set or file. Any tree structured data set can be combined with any other tree structured data set; slot record-types cannot be combined with each other, or with any other record-type. Except in the largest files (over 100,000 records) with several subfiles, or files in which there are large table-lookup files, COMBINE should be used whenever possible. When there are table-lookup record-types, it is often a good idea to COMBINE them with each other, and to COMBINE the goal and index record-types together. This allows flexibility in erasing and recreating the table files with the ZAP DATA SET command. [See B.10.17.]

The COMBINE element is coded just after the RECORD-NAME element as follows:

RECORD-NAME = REC01;
      :
 record definition
      :
RECORD-NAME = REC02;
 COMBINE = REC01;
      :
 record definition
      :
RECORD-NAME = REC03;
 COMBINE = REC01;
      :
 record definition
      :
      :
     etc.

Note that COMBINE is not coded for the goal record, REC01, in the above example, since it is the record with which other record-types are combined. The record-type named in the COMBINE statement must have been defined earlier in the file definition; it may not be defined further down. All of the file definitions in "Annotated File Definition Examples" [See D.7.] use the COMBINE feature wherever possible.

B.7.13 Index Records as Goal Records

What about coding INPROC, OUTPROC and ALIASES for index record definitions? These file definition elements may be coded for index records, but often are not. Since SPIBILD maintains the indexes, there is no need for ALIASES, and any INPROC, INCLOSE or OUTPROC rules that are coded are ignored when SPIBILD is updating the indexes as part of its processing. If the SPIBILD process will create new records in a record-type (as is usually the case with index records), it is important that no FIXED or REQUIRED elements be defined that will not be created by SPIBILD. If a required element is not present when SPIBILD attempts to create a new record in a record-type, a PASS ERROR with a code of S419 will occur. This is a serious error.

Generally file owners are encouraged to include INPROCs and OUTPROCs for the key of each index record, because these affect the results displayed by the BROWSE command. When index values are displayed with the BROWSE command, the values are processed through the OUTPROCs for the key. Also, if a value is given in the BROWSE command (such as BROWSE DATE-ADDED 7/1/80), that value is processed through the key's INPROCs as well. Without such INPROCs and OUTPROCs on the key of the index record, browsing the index can be a pointless exercise for the subfile user. (Actions A32 and A65 will not be executed during "BROWSE processing", however.) If only an INPROC is defined for an index record key and the INPROC sets the type for the key, e.g., an A31 identifies the stored key as a hexadecimal one, then SPIRES will convert the values displayed to string values when the BROWSE command is issued. This is not usually as valuable, or as straightforward, as putting the appropriate INPROCs and OUTPROCs in the index record definition.

It is also important to code INPROCs and OUTPROCs (and perhaps even ALIASES) if the index is to be used as a goal record that can be selected (using the SELECT command) or attached (using the ATTACH command).

In the following example, suppose the SUBJECT index record (REC03) can be selected as a goal record. An element called CROSS-REFERENCE has been added, so that the file owner can add cross-reference records to the SUBJECT index. Also, an OUTPROC action 32 has been added on the pointer, and it will convert the pointer on output by referring to the goal record it locates and looking up the TITLE element. The details of coding action 32 are covered in "Indirect Record-Access: Action 32 and SUBGOAL Processing." [See C.5.]

FILE = GA.SPI.BIBLIOGRAPHY;
AUTHOR = JOHN SACK;
BIN = 907;
RECORD-NAME = BOOK;
  SLOT; REMOVED;
  FIXED;
    ELEM = DATE;  LEN = 4;  OCC = 1;
      INPROC = A31; OUTPROC = A76;
  REQUIRED;
    ELEM = TITLE;
      INPROC = A40;
    ELEM = SUBJECT;
  OPTIONAL;
    ELEM = COST;
      LEN = 4;
      INPROC = A42:1;  OUTPROC = A81,7;
RECORD-NAME = REC02;
  FIXED;
    KEY = ELEM-NO;
    LEN = 2;
  OPTIONAL;
    ELEM = POINTER-STR;
    TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINTER;
    TYPE = LCTR;
  OPTIONAL;
    ELEM = VALUE;
    OCC = 1; LEN = 4;
    COMMENTS = This combined index will hold a value
      for the COST element, which is fixed length
      of four bytes.;
RECORD-NAME = REC03;
  REQUIRED;
    KEY = SUBJECT;
      INPROC = A30/ A40; ALIASES = S, SUB;
    ELEM = CROSS-REFERENCE;
      INPROC = A30/ A40; ALIASES = C, CR, REF;
      OUTPROC = A36,'See Also: ',0;
  OPTIONAL;
    ELEM = POINTER-STR;
      TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINT-BACK;  ALIASES = POINTER;
    TYPE = LCTR;
      OUTPROC = A32:7,1,2;
  OPTIONAL;
    ELEM = DUMMY;
    OCC = 1;

B.7.14 Index Records for Non-Removed Record Types

If the goal record does not have REMOVED specified, the file definer may not use the TYPE=LCTR specification in defining index records. Whenever the pointer to the goal record is the goal record key, rather than a location of the goal record in the residual data set, then TYPE=LCTR may not be specified.

This is always the situation when the goal record is not REMOVED. It may also occur for REMOVED goal records if the file definer has chosen to pass the goal record's key rather than the goal record's residual location.

All of the sample index definitions shown so far have assumed that a location in the residual data set is being stored rather than a key. However, it is fairly simple to see the implications of storing a key for index record definition:

 - "TYPE=LCTR;" cannot be used.

 - If the goal record key is of fixed length, then code "LEN=n;" in place of
 "TYPE=LCTR;" where "n" is the fixed length of the goal record key.

 - If the goal record key is varying in length, then the pointer element cannot be in the
 FIXED section of the index record or structure in which it is defined.

Usually you will pass the locator rather than the goal record key as the pointer. However, it can be very useful to pass the key at times: if the key itself is stored in the index record, the key can be examined directly when the index is used as a goal record. Also, if the techniques of "goal-to-goal passing" or "self-indexing goal records" are being used, the key usually must be passed. [See C.12.] Passing the goal record key to indexes may be a good idea if any of the following is true:

 - the key is fixed-length and short (fewer than five characters)

 - the goal record is SLOT

 - the indexes will be used to produce lists of goal record keys.

B.7.15 Ensuring the Validity of Index Records

Index records are normally maintained entirely by SPIBILD, in accordance with the rules the file definer specifies in the linkage section of the file definition. [See B.8.] The file definer does not need to take any explicit action to ensure that information in the indexes is valid, but must ensure that a null value isn't passed as the key of an index record.

When index records can be transferred and updated as goal records, [See B.7.13.] the file owner must ensure that a user cannot incorrectly alter the linkages between goal and index records built by SPIBILD.

The most important ingredient or rule of this linkage is that all keys along the structural path from the index record's key to the pointer group (or pointer element) are in descending sort order. This is the way they are automatically created by SPIBILD. The best way to ensure this is to make these elements non-updateable. This is usually done with a PRIV-TAG specification. [See B.9.4.]

All structures along the pointer group path up to and including the pointer element itself must be in descending sort order. This can be ensured by coding an A138:0 as the INPROC for all structures along this path.

For example, in a personal name index [See B.7.11.] the sub-index structure must be sorted in descending order by its key (a person's first name), and:

 - if the pointer-group is a structure, it must be sorted in descending order by its key,
 which is the pointer

 - if the pointer-group is just a simple element, then it must be sorted in descending
 order.
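
The ordering requirement can be expressed as a simple check. The Python sketch below is only an illustration of the rule for a personal name index record; the dictionary layout and the element names `SUB-INDEX` and `POINTERS` are invented for the example, and keys are assumed to be unique within a structure.

```python
def is_descending(keys):
    """True when each key sorts strictly after the one that follows it."""
    return all(a > b for a, b in zip(keys, keys[1:]))

def path_is_valid(record):
    """Check the sub-index keys, then the pointers under each sub-index key."""
    first_names = [entry["FIRST-NAME"] for entry in record["SUB-INDEX"]]
    if not is_descending(first_names):
        return False
    return all(is_descending(entry["POINTERS"]) for entry in record["SUB-INDEX"])

good = {"KEY": "SMITH", "SUB-INDEX": [
    {"FIRST-NAME": "MARY", "POINTERS": [0x30, 0x10]},
    {"FIRST-NAME": "JOHN", "POINTERS": [0x20]},
]}
# The same record after a careless update reversed the sub-index entries.
bad = {"KEY": "SMITH", "SUB-INDEX": list(reversed(good["SUB-INDEX"]))}
```

`path_is_valid(good)` holds, while `path_is_valid(bad)` does not, which is the kind of damage the non-updateable elements are meant to prevent.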

B.7.16 Personal Name Algorithm Details

The following describes in detail the way the "personal name algorithm" matches (or does not match) values in search commands against values in index records.

Assume that a NAME index has been built using the personal name algorithm and that it is being searched by a corresponding search term. The algorithm first places the search value into the standard form: surname, non-surnames. Thus:

NAME John Michael Vincent

becomes: NAME Vincent, John Michael

Note that the non-surname portion of the search value may be null as in: NAME Smith

The algorithm then establishes a null "master" pointer group set into which matching pointer groups will be OR'd (if found).

The algorithm then proceeds to retrieve records based on the surname portion of the search value (including truncated search retrieval). Each record retrieved contains sub-index portions keyed on non-surnames. For each record retrieved, these sub-index portions are processed as follows:

The first non-surname in the search value is used to scan the names in a sub-index key. The scan matches through the length of the shorter of any two names. Thus, "David" in the sub-index key and "D" in the search value match (D=D). Likewise, "ANN" in the sub-index key and "ANNE" in the search value match (ANN=ANN). The scan of the sub-index key names continues until a match is made. Then the next non-surname in the search value is used and the scan continues in a similar manner.

If all non-surnames in the search value succeed in matching, then the associated pointer groups in this sub-index portion are OR'd into the "master" set. Note that for a search value with a null non-surname, all sub-index keys are assumed to match.

If all the non-surname fields in the sub-index key are scanned before all the fields in the search value have been used, then this sub-index portion is ignored (does not match).

For example:

If the index contains "Smith, Jane Anne Marie" then:

 - if the search is for "Smith, J M", it succeeds.

 - if the search is for "Smith, Ann", it succeeds.

 - if the search is for "Smith, Ma", it succeeds.

 - if the search is for "Smith, Jane M Ann", it fails.

 - if the search is for "Smith, Anabel Marie", it fails.

 - if the search is for "Smith, Mary", it fails.

If the index had contained "Smith, J A M" then the last two searches above would have succeeded.
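
The matching rules above can be traced with a short Python sketch. It is a reading of the description in this section, not the SPIRES implementation: it assumes values are compared in upper case, it compares surnames exactly (truncated surname search is left out), and the function names are invented.

```python
def standard_form(value):
    """Split a name into (surname, [non-surnames]), upper-cased."""
    value = value.strip().upper()
    if "," in value:
        surname, rest = value.split(",", 1)
        return surname.strip(), rest.split()
    words = value.split()
    if not words:
        return "", []
    return words[-1], words[:-1]

def names_match(a, b):
    """Two names match if they agree through the length of the shorter."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]

def sub_index_matches(search_names, key_names):
    """Scan the sub-index key left to right, one search name at a time."""
    position = 0
    for wanted in search_names:
        while position < len(key_names) and not names_match(wanted, key_names[position]):
            position += 1
        if position == len(key_names):
            return False  # key exhausted before all search names were used
        position += 1     # continue the scan after the matched name
    return True

def matches(index_value, search_value):
    index_surname, key_names = standard_form(index_value)
    surname, search_names = standard_form(search_value)
    return surname == index_surname and sub_index_matches(search_names, key_names)
```

For instance, `matches("Smith, Jane Anne Marie", "Smith, J M")` holds, while `matches("Smith, Jane Anne Marie", "Smith, Mary")` does not. A search value with no non-surnames, such as "Smith", matches every sub-index key because the loop has nothing to scan for.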

There is an option on the personal name algorithm used in building indexes that allows multiple index entries to be built from a single name. This is particularly useful with married women's names. For example:

Duke Asten, Patty

would be indexed as both of the following:

Duke, Patty
Asten, Patty Duke

This allows retrieval by maiden name to still succeed.
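
One way to picture this option is the Python sketch below. It is an assumption built from the single example above: each word of the surname portion becomes a surname in its own entry, and the surname words that precede it are appended to the non-surnames. How SPIRES treats surnames of three or more words is not stated here, so the generalization in the loop is a guess.

```python
def surname_entries(name):
    """Return one (surname, non-surnames) entry per word of the surname portion."""
    surname_part, rest = name.split(",", 1)
    surnames = surname_part.split()
    given = rest.split()
    entries = []
    for i, surname in enumerate(surnames):
        # Earlier surname words follow the given names (assumed rule).
        entries.append((surname, given + surnames[:i]))
    return entries

entries = surname_entries("Duke Asten, Patty")
```

For the example name this produces the two entries shown above: Duke with "Patty", and Asten with "Patty Duke".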

Finally, it should be noted that the personal name algorithm can be used in searching "funny names". All that is required is that there be a sub-index keyed by terms that look like non-surnames. This can be accomplished by using A38 in the Goal record as INPROC/OUTPROC rules, and then passing that element to the sub-index (without forcing to upper case). For example, assume a structure contains the following items in the Goal records:

STRUC-NAME = something;
  KEY = FAMILY;
  ELEM = PARENT;  INPROC = A30/A38;  OUTPROC = A38;
  ELEM = CHILD;  INPROC = A30/A38;  OUTPROC = A38;

FAMILY is just a surname; PARENT and CHILD are non-surnames. Thus,

FAMILY = JONES;
  PARENT = JOHN MICHAEL;
  PARENT = MARY ANN;
  CHILD = JIMMY;
  CHILD = ANN MARIE;

Upon passing, FAMILY is passed to the key of an index record just like a standard index (A169). The PARENT and CHILD elements are then passed to the sub-index key (A167:5,PARENT,CHILD). The SRCPROC for FAMILY does something like: SRCPROC = A38/A14,#; specifying both "personal name algorithm" and truncated search. A search request could then be formed like: FIND FAMILY JONES, JOHN

The index is still a "personal name" index, but it is built from separate elements within the Goal records. This same kind of process could be applied to any single-word element and multi-word "qualifiers" as long as these multi-word elements can be upper case in the goal record AND can be searched by non-surname matching rules.

B.8 Understanding and Coding the Linkage Section

B.8.1 Functions of the Linkage Section

The previous chapters of this manual have covered the definition of the record-types that will make up a SPIRES file. Two kinds of record-types have been examined in detail: goal records and index records.

The linkage section, as its name implies, links the goal and index records for two purposes: searching and passing. The linkage section controls the search process by specifying in the "SEARCHTERMS" statement the names of the components of an index to be searched. The linkage section also specifies, in the "SRCPROC" statement, the processing rules to be applied to values in a search request. Passing, which is the process of using information in a goal record to build an index record, is controlled by specifying the goal record information to be passed. The source of this information is specified by the "GOALREC-ELEM" statement or by processing rules coded in the "PASSPROC" statement.

Thus we can see at least two different parts of a file definition. The first part, defining the goal and index records, is a description of data structures. The second part, defining the linkage between goal and index records, is devoted to procedural rather than descriptive statements. These procedural statements provide for passing and searching. A third part of the file definition, defining the privileges of any user or group of users with respect to a subfile, is described in the next chapter "Defining Subfile Privileges." [See B.9.]

The linkage section itself can be subdivided into small sections: 1) a single group of statements defining certain global relationships between a goal record and all its index record(s), and 2) groups of statements describing the specific processing of the linkage between the goal record and a single index record. (1) is discussed in "The Global Parameters Section" [See B.8.2.] and (2) is described in "Individual Index Linkages" [See B.8.3.] There usually is one individual index linkage for each index record you have defined. The structure of these parts is fairly simple; the definition of the linkage is in terms of SEARCHTERMS, SRCPROC, GOALREC-ELEM and PASSPROC statements. If a combined index, qualifiers, or sub-indexes are used, then one or two additional elements must be specified in the linkage definition for the index record in which they occur. The only difficulty usually encountered in defining linkage sections is in coding the various PASSPROC rules, and occasionally in coding the SRCPROC rules; we will not consider the definition of these processing rule strings in detail until the end of this chapter.

B.8.2 The Global Parameters Section

The linkage section for any particular goal record begins with some "global" information that is common to all indexes belonging to that goal record. This information always includes the name of the goal record to which the entire linkage section applies, the name given to a search result for the goal record, and the name of the pointer element in all of the indexes. Any global qualifiers (qualifiers that are passed to all indexes) are specified here also.

Linkage sections are coded following the record definitions of the goal and index records. The linkage section begins with the global parameters portion:

GOALREC-NAME = name coded in the RECORD-NAME element for
      the goal record;
PTR-ELEM = name given to the pointer in every one of the
      index-record definitions;
EXTERNAL-NAME = name you want displayed in search result
      counts, such as RESULT: 16 RESTAURANT(S).
      It must be less than 16 characters and contain no
      blanks; SPIRES adds the '(S)' automatically.;
GOALREC-KEY = the name of the key element in the goal
      record. This will default to the first element
      defined in the goal record if no value is coded.;
PASSPROC = A170; if the PTR-ELEM in the indexes is TYPE=LCTR
      and the goal records are REMOVED.  Otherwise,
PASSPROC = A167:5...; or A169:1; when the PTR-ELEM in the
      indexes is not TYPE=LCTR; regardless of goal records
      being REMOVED.
      If PASSPROC is not coded, then GOALREC-KEY is passed
      to the PTR-ELEM in upper case (equivalent to A169:0).

Because of the rarity of global qualifiers, the additions they require to the global section will not be covered until later in this chapter. [See B.8.6.] Let's begin coding the linkage section by defining the global parameters section for a very simple bibliographic file. In this file, we have one goal record, BOOK, and two index records, REC02 and REC03. Let's say that we want the search result to be called "CITATION".

The file definition, up through the global portion of the linkage section, looks like this:

FILE = GA.SPI.BIBLIOGRAPHY;
AUTHOR = JOHN SACK;
RECORD-NAME = BOOK;
  REMOVED;  SLOT;
  FIXED;
    ELEM = DATE;  LEN = 4;  OCC = 1;
      INPROC = AS31; OUTPROC = A76;
  REQUIRED;
    ELEM = TITLE;
      INPROC = A40;
RECORD-NAME = REC02;
  REQUIRED;
    KEY = TITLE;
  OPTIONAL;
    ELEM = POINT-BACK;     <--|
      TYPE = LCTR;            |
RECORD-NAME = REC03;          |
  FIXED;                      |
    KEY = DATE;  LEN = 4;     |
  OPTIONAL;                   |
    ELEM = POINT-BACK;     <--| - note that these statements
      TYPE = LCTR;            |   specify the same name.
                              |
GOALREC-NAME = BOOK;          |
PTR-ELEM = POINT-BACK;     <--|
EXTERNAL-NAME = CITATION;
PASSPROC = A170;

The GOALREC-NAME statement names the goal record by specifying its RECORD-NAME. The EXTERNAL-NAME statement declares what a search result will be called when SPIRES reports the result count after a search command such as FIND. The PTR-ELEM statement names the element in each index record that is to receive the pointer back to the goal record. You may choose any element name you wish, but it must be the same in all index records. In our example, it happens to be POINT-BACK. The PASSPROC specifies A170 because the pointer element is TYPE=LCTR and the goal records are REMOVED. A170 specifies that the information passed to the pointer element in each index will be the address of the goal record in the residual data set. If the pointer element is not TYPE=LCTR, then the information passed to the pointer element in each index should be the key of the goal record, which is usually specified by a GOALREC-KEY statement. If the goal records are not REMOVED, then the pointer element cannot be TYPE=LCTR and A170 cannot be used.
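
As a rough sketch of the alternative just described, here is how the global parameters section might look if the key of the goal record were passed instead of its locator. The key element name ID is hypothetical (it is not part of the sample file), and the sketch assumes the POINT-BACK elements in REC02 and REC03 are defined without TYPE = LCTR:

GOALREC-NAME = BOOK;
PTR-ELEM = POINT-BACK;
EXTERNAL-NAME = CITATION;
GOALREC-KEY = ID;
PASSPROC = A169:1;

Here A169:1 passes the key without forcing it to uppercase. If the PASSPROC statement were omitted, the key would be passed in uppercase instead.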

B.8.3 Individual Index Linkages

After any global parameters section, a linkage between the goal record and each index record must be defined. The definition of these linkages is, in structure, fairly straight-forward, and looks like the following:

INDEX-NAME = name of the index record whose KEY is to
      receive a goal record value;
SEARCHTERMS = the names and aliases to be allowed in
      a search command such as FIND;
SRCPROC = processing rules applied to the values in a
      search request;
GOALREC-ELEM = name of the element in the goal record
      that is passing its value to this index record;
PASSPROC = processing rules applied to goal record
      elements before placing them in the index record;
PRIV-TAG = number; to limit the use of these SEARCHTERMS.

PTR-GROUP = name of the pointer structure, or of the
      pointer element if there is no pointer structure,
      in the index record;

The structure of this skeleton can be slightly complicated by the inclusion of linkage information for a sub-index, local qualifiers, or a combined index. These cases will be covered later in this chapter. [See B.8.5, B.8.7, B.8.8.]

A "recipe" for coding the global and individual parameters of the linkage section is given in "A Guide to Coding the Linkage Section Definition." This guide covers all types of linkage definition. [See D.6.] The different kinds of index records coded in the preceding chapter will serve as examples of simple, personal name, qualified and combined index linkage sections. We will take each possibility in turn, leaving the detailed consideration of PASSPROC rule strings to the end of this chapter.

The PTR-GROUP statement, for any particular index, names a multiply occurring element in the index that is either a STRUCTURE element whose KEY is the PTR-ELEM, or a simple ELEM which is the PTR-ELEM. If the subfile needs only simple pointers back to goal records (no combined index or qualifiers), then PTR-ELEM and PTR-GROUP refer to the same simple pointer ELEM in all indexes. But if there is a need for combined index or qualifier terms, then PTR-GROUP for each index refers to a multiply occurring STRUCTURE which will contain those terms. The KEY of each STRUCTURE is the PTR-ELEM (pointer element).

B.8.4 Simple Indexes

Here are the two individual linkages to the index records REC02 and REC03. In the complete file definition we are building towards, these would be added right after the global parameters section with which the previous example ended.

INDEX-NAME = REC02;
      SEARCHTERMS = TITLE, T;
      SRCPROC = A45,/ A47,2/ A11:3,#;
      GOALREC-ELEM = TITLE;
      PASSPROC = A166/ A45,/ A47,2;
   PTR-GROUP = POINT-BACK;

INDEX-NAME = REC03;
      SEARCHTERMS = DATE, DAT, D;
      SRCPROC = A56, 'Not Date Format'/ AS31;
      GOALREC-ELEM = DATE;
      PASSPROC = A169:1;
   PTR-GROUP = POINT-BACK;

The INDEX-NAME statements name the particular index records that will be linked to the goal record; here they are REC02 and REC03. The SEARCHTERMS statement is similar to the ALIASES statement in the goal record definition. Here SEARCHTERMS specifies the name or names that can be used to access an index in a search command such as FIND. [See B.9.4 to see how PRIV-TAG can restrict the use of SEARCHTERMS.]

The SRCPROC statement specifies processing that is to be performed on search values given in search commands. This processing is usually equivalent to a combination of both INPROC and PASSPROC rules used to determine the form in which goal record values are to be placed in the index record. That is, SRCPROC rules are usually coded to "translate" incoming search values into values that might be found in the index records. The SRCPROC for REC02:

SRCPROC = A45,/ A47,2/ A11:3,#;

breaks a search value up into individual words ("A45,", which breaks on blanks), then excludes any words of two or fewer characters (A47,2), and allows special truncated search if a word of more than three characters contains a "#" (A11:3,#).

Notice that the PASSPROC for this index contains similar rules:

PASSPROC = A166/ A45,/ A47,2;

A166 specifies that the goal record element value (or values) named in GOALREC-ELEM is to be fetched, and is later to be processed by A45 or A38 (both are actions that "split" a value into parts). The rules "A45,/ A47,2" make sure that only individual words are passed to the key of the index records, and that no words of two or fewer characters are passed. This part of the rule string is identical to a portion of the SRCPROC rule string.

The SRCPROC and PASSPROC rules coded for REC03 are as follows:

SRCPROC = A56, 'Not Date Format'/ AS31;
PASSPROC = A169:1;

The SRCPROC rules coded will convert a date in a search value to the internal form of a date, just as was done by an INPROC=AS31 statement in the goal record definition. The PASSPROC rule specifies only that the element whose name is coded in the GOALREC-ELEM statement be fetched and stored in the index record without the standard conversion to uppercase on passing. Values that are stored in character form should always be forced to uppercase on passing. Any other form of a value (e.g., binary, floating-point, packed decimal) should not be forced to uppercase. No translation by a matching AS31 is necessary in the PASSPROC, since the date is stored in the appropriate format in the goal record via the INPROC=AS31. Part of the power of SPIRES indexing methods is that values can appear in the goal record in one form, and can be passed and searched in a more usable (for the searcher) form.

The PTR-GROUP statement names the same element as the PTR-ELEM statement because our indexes have been defined to use only simple pointer elements (no qualifiers or combined index).

Although each INDEX-NAME refers to a different RECORD-NAME in our example, it is possible for any RECORD-NAME to be referenced by more than one INDEX-NAME. Such a case usually occurs when different elements within the goal records are to be passed to the same index, but those elements require different PASSPROC or SRCPROC rules.
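
A hedged sketch of such a case follows. It assumes a hypothetical SUBTITLE element in the BOOK records (not part of the sample file) whose words are to be passed to the same REC02 index as TITLE, but without the exclusion of short words. The search terms and rule strings for the second linkage are illustrative only:

INDEX-NAME = REC02;
      SEARCHTERMS = TITLE, T;
      SRCPROC = A45,/ A47,2/ A11:3,#;
      GOALREC-ELEM = TITLE;
      PASSPROC = A166/ A45,/ A47,2;
   PTR-GROUP = POINT-BACK;

INDEX-NAME = REC02;
      SEARCHTERMS = SUBTITLE, ST;
      SRCPROC = A45,/ A11:3,#;
      GOALREC-ELEM = SUBTITLE;
      PASSPROC = A166/ A45,;
   PTR-GROUP = POINT-BACK;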

B.8.5 Sub-Indexes

The general form of SUB-INDEX linkage is similar to INDEX-NAME:

SUB-INDEX = name of an index record STRUCTURE whose KEY
      is to receive a goal record value;
SEARCHTERMS = the names and aliases to be allowed in
      a search command such as FIND;
SRCPROC = processing rules applied to the values in a
      search request;
GOALREC-ELEM = name of the element in the goal record
      that is passing its value to this sub-index;
PASSPROC = processing rules applied to goal record
      elements before placing them in the sub-index;
PRIV-TAG = number; to limit the use of these SEARCHTERMS.

When SUB-INDEX terms are added to a simple index, the effect is to introduce additional structural levels to the hierarchy leading from the KEY of the INDEX-NAME record to the PTR-GROUP element. SUB-INDEX names a keyed structure in the index record. The KEY of that STRUCTURE receives the goal record's value being passed for the sub-index term. A personal name index is a good example of a simple index with a sub-index structure. Let's modify our sample file definition to include a PERSON element in the BOOK records, and another index record, REC04:

FILE = GA.SPI.BIBLIOGRAPHY;
AUTHOR = JOHN SACK;
RECORD-NAME = BOOK;
  REMOVED;  SLOT;
  FIXED;
    ELEM = DATE;  LEN = 4;  OCC = 1;
      INPROC = AS31; OUTPROC = A76;
  REQUIRED;
    ELEM = TITLE;
      INPROC = A40;
    ELEM = PERSON;
      INPROC = A40/ A41:1;

RECORD-NAME = REC04;
    KEY = LAST-NAME;
    ELEM = FIRST-NAME-STR;
      TYPE = STR;
STRUCTURE = FIRST-NAME-STR;
    KEY = FIRST-NAME;
      INPROC = A38;
      OUTPROC = A38;
    ELEM = POINT-BACK;
      TYPE = LCTR;

GOALREC-NAME = BOOK;
PTR-ELEM = POINT-BACK;
EXTERNAL-NAME = CITATION;
PASSPROC = A170;

INDEX-NAME = REC04;
      SEARCHTERMS = PERSON, NAME, PN, N;
      SRCPROC = A44,'.',/ A38/ A11:1,#;
      GOALREC-ELEM = PERSON;
      PASSPROC = A166/ A44,'.',/ A38;

   SUB-INDEX = FIRST-NAME-STR;
      SEARCHTERMS = NULL;
      PASSPROC = A165;

   PTR-GROUP = POINT-BACK;

No GOALREC-ELEM was needed for the SUB-INDEX term in this example because the PASSPROC A165 indicates the value to be passed to FIRST-NAME had already been created by A38 in the PASSPROC associated with INDEX-NAME. This is usually the case with personal name sub-indexes, but not for other sub-index structures. The SEARCHTERMS of the SUB-INDEX for personal name are not usually used in a search request because A38 in the SRCPROC of the INDEX-NAME provides the necessary search values for the SUB-INDEX. [See B.9.4 to see how PRIV-TAG can restrict the use of SEARCHTERMS.]

Let's examine the index record definition and linkage definition for a sub-index that is not for a personal name. Suppose the following hierarchy were needed for an airline reservation system:

Index:  Flight Number
  Sub-Index:  Section Number
    Sub-Index:  Seat Number

So, SEAT is inside SECTION which is inside FLIGHT. The index record definition for this structure would look like this:

RECORD-NAME = REC04;
    KEY = FLIGHT-NUMBER;
    ELEM = SECTION-NUMBER-STR;
      TYPE = STR;
STRUCTURE = SECTION-NUMBER-STR;
    KEY = SECTION-NUMBER;
    ELEM = SEAT-NUMBER-STR;
      TYPE = STR;
STRUCTURE = SEAT-NUMBER-STR;
    KEY = SEAT-NUMBER;
    ELEM = POINT-BACK;
      TYPE = LCTR;

The linkage definition for this index record would look like this:

INDEX-NAME = REC04;
      SEARCHTERMS = FLIGHT-NUMBER, FLIGHT;
      SRCPROC = <appropriate search processing rules>;
      GOALREC-ELEM = FLIGHT;
      PASSPROC = A169:1;

   SUB-INDEX = SECTION-NUMBER-STR;
      SEARCHTERMS = SECTION-NUMBER, SECTION;
      SRCPROC = <appropriate search processing rules>;
      GOALREC-ELEM = SECTION;
      PASSPROC = A171,0/ A169:1;

   SUB-INDEX = SEAT-NUMBER-STR;
      SEARCHTERMS = SEAT-NUMBER, SEAT;
      SRCPROC = <appropriate search processing rules>;
      GOALREC-ELEM = SEAT;
      PASSPROC = A171,0/ A169:1;

   PTR-GROUP = POINT-BACK;

Note the use of A171 to pass a default value of SECTION and SEAT if no value is found in the goal record. This will ensure that the index record is created, even if it is incomplete. A171 is also used this way in passing qualifier elements. [See B.8.6, B.8.7.]

The SEARCHTERMS of a SUB-INDEX are specified with a leading @-sign in a search request along with the SEARCHTERMS of the INDEX-NAME. For example, FIND FLIGHT 27 @SECTION B @SEAT 9 requests a specific hierarchy within the REC04 index.

B.8.6 Global Qualifiers

In order to have qualifiers in an index record, PTR-GROUP should specify a structure element in all index records. PTR-ELEM specifies the KEY of the structure, which receives the pointer back to the goal record, and the other elements within the structure receive qualifier values. The "form" of the structure must be the same across all index records associated with a particular GOALREC-NAME. That is, the number of FIXED, REQUIRED, and OPTIONAL elements must be the same in each definition of the structure, and the LENgth and OCCurrence attributes associated with corresponding elements must be the same within each structure.

Global qualifiers are specified in the global parameters section of a linkage description just prior to the first INDEX-NAME section. The statements of the QUAL-ELEM section are:

QUAL-ELEM = name of the element in the index record's
      PTR-GROUP structure that is to receive a goal
      record value;
SEARCHTERMS = the names and aliases to be allowed in
      a search command, such as AND;
SRCPROC = processing rules applied to the values in a
      search request;
GOALREC-ELEM = name of the element in the goal record
      that is passing its value to this qualifier;
PASSPROC = processing rules applied to goal record
      elements before placing them in the qualifier;
PRIV-TAG = number; to limit the use of these SEARCHTERMS.

The SEARCHTERMS of any QUAL-ELEM are specified in a search request following the AND or AND NOT logical operators. [See B.9.4 to see how PRIV-TAG can restrict the use of SEARCHTERMS.]

Let's alter our sample file definition and linkage section to pass DATE as a global qualifier instead of building a separate DATE index (REC03). We will make DATE a global qualifier of both TITLE (REC02) and PERSON (REC04) indexes. Since PTR-ELEM must become the KEY of a PTR-GROUP structure, we will have to alter the index record definitions. The revised definition might look like:

FILE = GA.SPI.BIBLIOGRAPHY;
AUTHOR = JOHN SACK;
RECORD-NAME = BOOK;
  REMOVED;  SLOT;
  FIXED;
    ELEM = DATE;  LEN = 4;  OCC = 1;
      INPROC = AS31; OUTPROC = A76;
  REQUIRED;
    ELEM = TITLE;
      INPROC = A40;
    ELEM = PERSON;
      INPROC = A40/ A41:1;

RECORD-NAME = REC02;
    KEY = TITLE;
    ELEM = POINTER-STR;
      TYPE = STR;  LEN = 8;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINT-BACK;
      TYPE = LCTR;
    ELEM = DATE-QUALIFIER;
      LEN = 4;  OCC = 1;

RECORD-NAME = REC04;
    KEY = LAST-NAME;
    ELEM = FIRST-NAME-STR;
      TYPE = STR;
STRUCTURE = FIRST-NAME-STR;
    KEY = FIRST-NAME;
      INPROC = A38;
      OUTPROC = A38;
    ELEM = POINTER-STR;
      TYPE = STR;  LEN = 8;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINT-BACK;
      TYPE = LCTR;
    ELEM = DATE-QUALIFIER;
      LEN = 4;  OCC = 1;

GOALREC-NAME = BOOK;
PTR-ELEM = POINT-BACK;
EXTERNAL-NAME = CITATION;
PASSPROC = A170;

QUAL-ELEM = DATE-QUALIFIER;
   SEARCHTERMS = DATE, DAT, D;
   SRCPROC = A56, 'Not Date Format'/ AS31;
   GOALREC-ELEM = DATE;
   PASSPROC = A169:1;

INDEX-NAME = REC02;
      SEARCHTERMS = TITLE, T;
      SRCPROC = A45,/ A47,2/ A11:3,#;
      GOALREC-ELEM = TITLE;
      PASSPROC = A166/ A45,/ A47,2;

   PTR-GROUP = POINTER-STR;

INDEX-NAME = REC04;
      SEARCHTERMS = PERSON, NAME, PN, N;
      SRCPROC = A44,'.',/ A38/ A11:1,#;
      GOALREC-ELEM = PERSON;
      PASSPROC = A166/ A44,'.',/ A38;

   SUB-INDEX = FIRST-NAME-STR;
      SEARCHTERMS = NULL;
      PASSPROC = A165;

   PTR-GROUP = POINTER-STR;

Notice that we defined the POINTER-STR as consisting of entirely FIXED information, and included LEN=8 with TYPE=STR. The "form" of the pointer group structure is the same in all indexes.

If the DATE element within the BOOK records occurred multiple times, only the first occurrence would be passed to the global qualifier. And if DATE hadn't occurred at all, either A171 would need to be specified in the PASSPROC to supply a default value, or else a null value would be passed to the global qualifier.
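
With the definition above, search requests using the global qualifier might take the following general shape. This is only a sketch: the search values are placeholders, and the external form of a date accepted by the SRCPROC is not specified here.

FIND TITLE <word> AND DATE <date value>
FIND NAME <personal name> AND NOT DATE <date value>

In both requests the DATE term follows AND or AND NOT, so it is treated as a qualifier of the TITLE or PERSON index rather than as a search of a separate DATE index.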

There is a special case of global qualifier worth mentioning. If the keys of goal records are passed to PTR-ELEM, then the pointer element in the indexes referred to by the PTR-ELEM can also be referred to by a global QUAL-ELEM. The QUAL-ELEM would not specify a GOALREC-ELEM since the key of the goal records had already been passed to PTR-ELEM. The SRCPROC would correspond to the INPROC of the goal record's keys, and the PASSPROC must be A165. The SEARCHTERMS statement provides you with search names that allow you to use the PTR-ELEM as a qualifier, which means you can qualify your search requests by goal record key criteria.

If this special QUAL-ELEM is the only qualifier defined for the GOALREC-NAME, and there is no combined index, then PTR-GROUP, PTR-ELEM, and this QUAL-ELEM can all refer to the same simple element in the indexes. This is the only exception to the rule about PTR-GROUP structures being required when qualifiers are defined.
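
A minimal sketch of this special case is shown below. It assumes that the goal record key is an element named ID (a hypothetical name, not part of the sample file), that the key is passed to the pointer element instead of a locator, and that POINT-BACK is a simple element in every index:

GOALREC-NAME = BOOK;
PTR-ELEM = POINT-BACK;
EXTERNAL-NAME = CITATION;
GOALREC-KEY = ID;
PASSPROC = A169:1;

QUAL-ELEM = POINT-BACK;
   SEARCHTERMS = ID;
   SRCPROC = <rules corresponding to the INPROC of the key>;
   PASSPROC = A165;

Each INDEX-NAME section would then code PTR-GROUP = POINT-BACK, so that PTR-GROUP, PTR-ELEM and the QUAL-ELEM all name the same element.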

B.8.7 Local Qualifiers

All the rules for PTR-GROUP structures and PTR-ELEM keys apply for local qualifiers just as they do for global qualifiers. [See B.8.6.] Local qualifiers are specified in the linkage section for any particular index by adding QUAL-ELEM sections just after the PTR-GROUP statement.

Let's alter our sample file definition and linkage section again to pass DATE as a local qualifier of the TITLE index (REC02) instead of making it a global qualifier in all indexes. We will keep the personal name index (REC04) introduced in the SUB-INDEX section, but it will not have a qualifier.

FILE = GA.SPI.BIBLIOGRAPHY;
AUTHOR = JOHN SACK;
RECORD-NAME = BOOK;
  REMOVED;  SLOT;
  FIXED;
    ELEM = DATE;  LEN = 4;  OCC = 1;
      INPROC = AS31; OUTPROC = A76;
  REQUIRED;
    ELEM = TITLE;
      INPROC = A40;
    ELEM = PERSON;
      INPROC = A40/ A41:1;

RECORD-NAME = REC02;
    KEY = TITLE;
    ELEM = POINTER-STR;
      TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINT-BACK;
      TYPE = LCTR;
  OPTIONAL;
    ELEM = DATE-QUALIFIER;
      OCC = 1;

RECORD-NAME = REC04;
    KEY = LAST-NAME;
    ELEM = FIRST-NAME-STR;
      TYPE = STR;
STRUCTURE = FIRST-NAME-STR;
    KEY = FIRST-NAME;
      INPROC = A38;
      OUTPROC = A38;
    ELEM = POINTER-STR;
      TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINT-BACK;
      TYPE = LCTR;
  OPTIONAL;
    ELEM = DUMMY;
      OCC = 1;

GOALREC-NAME = BOOK;
PTR-ELEM = POINT-BACK;
EXTERNAL-NAME = CITATION;
PASSPROC = A170;

INDEX-NAME = REC02;
      SEARCHTERMS = TITLE, T;
      SRCPROC = A45,/ A47,2/ A11:3,#;
      GOALREC-ELEM = TITLE;
      PASSPROC = A166/ A45,/ A47,2;

   PTR-GROUP = POINTER-STR;

   QUAL-ELEM = DATE-QUALIFIER;
      SEARCHTERMS = DATE, DAT, D;
      SRCPROC = A56, 'Not Date Format'/ AS31;
      GOALREC-ELEM = DATE;
      PASSPROC = A169:1;

INDEX-NAME = REC04;
      SEARCHTERMS = PERSON, NAME, PN, N;
      SRCPROC = A44,'.',/ A38/ A11:1,#;
      GOALREC-ELEM = PERSON;
      PASSPROC = A166/ A44,'.',/ A38;

   SUB-INDEX = FIRST-NAME-STR;
      SEARCHTERMS = NULL;
      PASSPROC = A165;

   PTR-GROUP = POINTER-STR;

Notice the DUMMY element in the pointer group structure of REC04. It is there to make the structure "form" identical to the structure defined in REC02, which has a DATE-QUALIFIER element. Also notice that both the DUMMY element and DATE-QUALIFIER element were declared OPTIONAL. That's because the DUMMY element will not occur within REC04 occurrences of the POINTER-STR.

In this sample definition, the DATE element in the goal records always occurs since it is a FIXED element. However, if it had been an OPTIONAL element which did not occur, then A171 should be coded in the PASSPROC to pass some default value to the local qualifier; otherwise no index entries would be created for TITLE. All local qualifier and sub-index sections must define values for an index entry to be created. If the goal record elements which supply values for local qualifiers or sub-index terms are multiply occurring, or a PASSPROC rule specifies multiple passer elements, then multiple index entries can be created. [See B.8.14.]
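
As a sketch of the A171 case, the local qualifier section for REC02 might be coded as follows if DATE were an OPTIONAL element. The form "A171,0" is borrowed from the sub-index example earlier in this chapter; whether 0 is a sensible default for a date value is an assumption made only for illustration:

   QUAL-ELEM = DATE-QUALIFIER;
      SEARCHTERMS = DATE, DAT, D;
      SRCPROC = A56, 'Not Date Format'/ AS31;
      GOALREC-ELEM = DATE;
      PASSPROC = A171,0/ A169:1;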

Problems may be encountered if a variable length qualifier is passed to a fixed-length qualifier element in the index record. If this is being done, the following PASSPROC should be included with any other qualifier PASSPROCs:

A36:1,' ',n

where "n" is the value of the LEN statement on the qualifier element (i.e., the fixed-length field size).
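
For instance, a character qualifier passed to a fixed-length element of eight bytes might be coded along the following lines. The LANGUAGE element and LANGUAGE-QUALIFIER (LEN = 8) are hypothetical names, and placing A36 after the rule that fetches the value is an assumption:

   QUAL-ELEM = LANGUAGE-QUALIFIER;
      SEARCHTERMS = LANGUAGE, LANG;
      SRCPROC = <appropriate search processing rules>;
      GOALREC-ELEM = LANGUAGE;
      PASSPROC = A169:0/ A36:1,' ',8;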

B.8.8 Combined Indexes

CINDEX-VALUE is just a special case of local qualifier. PTR-GROUP must be a structure with PTR-ELEM as its KEY.

Let's alter our file definition to make DATE a combined index. REC03 will now be used to define a combined index record-type. Remember, the "form" of the PTR-GROUP structure must be the same in all index definitions. Here is the revised file definition, including the linkage sections for both the simple index on TITLE and the combined index on DATE (the PERSON index has been dropped).

FILE = GA.SPI.BIBLIOGRAPHY;
AUTHOR = JOHN SACK;
RECORD-NAME = BOOK;
  REMOVED;  SLOT;
  FIXED;
    ELEM = DATE;  LEN = 4;  OCC = 1;
      INPROC = AS31; OUTPROC = A76;
  REQUIRED;
    ELEM = TITLE;
      INPROC = A40;
    ELEM = PERSON;
      INPROC = A40/ A41:1;

RECORD-NAME = REC02;
    KEY = TITLE;
    ELEM = GROUP-STR;
      TYPE = STR;
STRUCTURE = GROUP-STR;
  FIXED;
    KEY = POINT-BACK;
      TYPE = LCTR;
  OPTIONAL;
    ELEM = DUMMY;
      OCC = 1;

RECORD-NAME = REC03;
  FIXED;
    KEY = ELEM-NO;  LEN = 2;
  OPTIONAL;
    ELEM = POINTER-STR;
      TYPE = STR;
STRUCTURE = POINTER-STR;
  FIXED;
    KEY = POINT-BACK;
      TYPE = LCTR;
  OPTIONAL;
    ELEM = VALUE;
      OCC = 1;

GOALREC-NAME = BOOK;
PTR-ELEM = POINT-BACK;
EXTERNAL-NAME = CITATION;
PASSPROC = A170;

INDEX-NAME = REC02;
      SEARCHTERMS = TITLE, T;
      SRCPROC = A45,/ A47,2/ A11:3,#;
      GOALREC-ELEM = TITLE;
      PASSPROC = A166/ A45,/ A47,2;

   PTR-GROUP = GROUP-STR;

INDEX-NAME = REC03;
      SEARCHTERMS = DUMMY-VALUE;
      PASSPROC = A167,DATE;

   PTR-GROUP = POINTER-STR;

      CINDEX-VALUE = VALUE;
      PASSPROC = A169:1;

Compare the linkage definitions for REC02, a simple index, and REC03, a combined index. Notice that the PTR-GROUP statements refer to different structure names in each index, but the "form" of those structures is the same, and they have the same KEY name. The PTR-GROUP structure names are usually the same, but that is not a requirement, as this example demonstrates.

Also notice that the combined index linkage has a "dummy" SEARCHTERMS statement coded, no SRCPROC or GOALREC-ELEM statements, two PASSPROC statements, and a new kind of statement, "CINDEX-VALUE".

When searching a combined index, the searcher may use the element name or alias of any of the goal record elements passing to the combined index; the SEARCHTERMS statement must be coded, but its value is meaningless. The index names that are reported when a user issues the SHOW SEARCH TERMS command are picked up from the P+ values of PASSPROC=A167. Note from the description of this action [See D.1.7.] that the order of the P+ parameters is not important unless some of the elements being passed are inside structures; in this case, the order must be the order in which the elements would be displayed if a record from the file were displayed in the standard output format.

It is this first PASSPROC, A167, that specifies the goal record elements that are passed to the combined index; this is why no GOALREC-ELEM statement is needed. Instead of a SRCPROC rule string, SPIRES passes all search values through the INPROC rules for the particular goal record element being searched. (Only one SRCPROC rule can be coded in a combined index definition: SRCPROC = A6; in the INDEX-NAME section.)

The CINDEX-VALUE statement is only coded in the linkage to a combined index, immediately following the PTR-GROUP statement. It names an element in the index record's PTR-GROUP structure that will receive data values being passed from the goal record elements (See A167 in the PASSPROC of INDEX-NAME). In the sample file, this element has the name "VALUE", hence the statement CINDEX-VALUE=VALUE in the linkage definition.

The final statement, a second PASSPROC, is always coded in combined index linkages. If the elements being passed are in binary form (as is often the case in combined indexes), such as the DATE element, then A169:1 is the only rule coded for this statement. If the elements are values that must be converted to uppercase, then A169:0 (or simply A169) is coded. If some elements being passed must not be converted to uppercase and others must be, then A162 is also coded, as explained later. [See B.8.11.]
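
To illustrate the one SRCPROC rule permitted in a combined index, here is the REC03 linkage from the sample file with SRCPROC = A6 added to the INDEX-NAME section. Its position between SEARCHTERMS and PASSPROC follows the order of the general linkage skeleton and is otherwise an assumption:

INDEX-NAME = REC03;
      SEARCHTERMS = DUMMY-VALUE;
      SRCPROC = A6;
      PASSPROC = A167,DATE;

   PTR-GROUP = POINTER-STR;

      CINDEX-VALUE = VALUE;
      PASSPROC = A169:1;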

B.8.9 Coding SRCPROC Rules

If the values of an element have been altered by an INPROC or PASSPROC then the same processing rules are generally coded in the SRCPROC rule string to apply a similar transformation on the values a user might specify in search commands.

The following SRCPROC rules are available to modify the search process. Other actions may also be used in SRCPROC rule strings.

A6   -  Allows the implicit insertion of AND NOT or OR between
        search values; if this is not specified, then the AND
        operator will be the default.
A12  -  Allows the implicit insertion of relational operators
        between search values for qualifier and combined
        index terms.

The following are restricted to SRCPROC of INDEX-NAME.

A11  -  Allows a truncation character at the end or in the
        middle of a search value.
A13  -  Allows any of several truncation characters at the end
        of a search value; used for code stem searching.

A14  -  Allows a truncation character at the end of a
        search value.
A15  -  Allows the next (alphabetically) lower value with
        identical initial characters to be retrieved if the
        search value itself is not found.
A16  -  Allows the next (alphabetically) higher value to be
        retrieved if the search value itself is not found.

If no SRCPROC statement is coded, and thus no SRCPROC rules are specified, the default SRCPROC will be used: "A45,". This will automatically cause search values to be broken on blanks.
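
For example, a linkage coded without a SRCPROC statement, such as this variation on the REC02 linkage (a sketch based on the simple index example):

INDEX-NAME = REC02;
      SEARCHTERMS = TITLE, T;
      GOALREC-ELEM = TITLE;
      PASSPROC = A166/ A45,/ A47,2;
   PTR-GROUP = POINT-BACK;

would be treated as if "SRCPROC = A45,;" had been coded. Search values would be broken into words, but short words would not be excluded from the search value, and truncated searching with "#" would not be allowed, since A47 and A11 are no longer part of the SRCPROC.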

B.8.10 The NOPASS Statement

The NOPASS statement may be specified in the linkage section of a file definition. If it is coded, SPIBILD will not attempt to update any of a subfile's indexes when it is processing records. The indexes can still be searched, however.

The statement, "NOPASS;", is placed after the last statement in the linkage section of the goal record whose indexing is to be "turned off." The file definition must then be recompiled. Subsequent SPIBILD processing will not attempt to pass any information to the subfile's indexes.

In order to re-start index updating, the NOPASS statement must be removed from the file definition, and the definition must then be recompiled. Note that NOPASS stops passing to all indexes in a single linkage section; it cannot be used to disable one of several indexes selectively.
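
Using the simple index example from earlier in this chapter, a linkage section with passing turned off might look like the following sketch:

GOALREC-NAME = BOOK;
PTR-ELEM = POINT-BACK;
EXTERNAL-NAME = CITATION;
PASSPROC = A170;

INDEX-NAME = REC02;
      SEARCHTERMS = TITLE, T;
      SRCPROC = A45,/ A47,2/ A11:3,#;
      GOALREC-ELEM = TITLE;
      PASSPROC = A166/ A45,/ A47,2;
   PTR-GROUP = POINT-BACK;

INDEX-NAME = REC03;
      SEARCHTERMS = DATE, DAT, D;
      SRCPROC = A56, 'Not Date Format'/ AS31;
      GOALREC-ELEM = DATE;
      PASSPROC = A169:1;
   PTR-GROUP = POINT-BACK;

NOPASS;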

B.8.11 Coding PASSPROC Rules

The rules for coding strings of PASSPROCs are more rigid and difficult than the rules for coding INPROCS, OUTPROCS or SRCPROCS. The descriptions of the PASSPROC rules in the last part of this manual [See D.1.7, D.2.6, D.3.] and in the ACTIONS subfile are very concise; a problem with their brevity is that the choices a file definer has in coding PASSPROC rule strings are not easily distinguished. This section will focus on the central choices that must be made in coding PASSPROC strings.

The first PASSPROC encountered in most file definitions is in the global parameters of the linkage section. For example:

GOALREC-NAME = BOOK;
PTR-ELEM = POINT-BACK;
EXTERNAL-NAME = CITATION;
PASSPROC = A170;

As has been mentioned, PASSPROC=A170 is used if the goal record is REMOVED (as most sample definitions in this manual are) and the PTR-ELEM is declared TYPE=LCTR. This rule says that the PTR-ELEM in each index record will receive the locator of the goal record in the residual data set.

It is also possible to pass the key of the goal record to the PTR-ELEM, instead of the locator of the goal record in the residual data set. If the goal records are not REMOVED then you must pass the key. To do this, one of the following PASSPROC rules must be coded instead of A170:

PASSPROC = A169:1;
PASSPROC = A167:5...;

Neither of these rules forces the key to uppercase. When no PASSPROC is coded, the default action is to force the key to uppercase, which should only be done when the goal record's key is already an uppercase value.

The first rule is used when only the GOALREC-KEY is to be passed. That includes passing the slot-number of SLOT records. The second rule is usually used with multiple passer elements, but can be used with just the goal record's key specified.

Although it is not frequently done, it is possible to pass the key of the goal record as the pointer even if the goal record is a REMOVED record-type. This should be avoided if the key is varying in length or if the key is more than four bytes long.

There are two other places in the linkage section where the choice of a PASSPROC is simple. The first of these is the second PASSPROC statement coded in a combined index linkage definition:

INDEX-NAME = REC03;
      SEARCHTERMS = DUMMY-VALUE;
      PASSPROC = A167,DATE;
   PTR-GROUP = POINTER-STR;
      CINDEX-VALUE = VALUE;
      PASSPROC = A169:1;

A169:1 is used when the value or values being passed are stored as numbers and hence must not be forced to uppercase. A169:0 would be used if character strings were being passed to this index. If both character and binary data were being passed to this index, then A162:1 would also be coded to exclude certain elements' values from uppercase conversion. For example:

PASSPROC = A169:0/ A162:1,DATE;

would force all values to uppercase upon passing, except those from the DATE element.

The second case in which the choice of a single PASSPROC is simple is the second PASSPROC string coded in a personal name index:

INDEX-NAME = REC03;
      SEARCHTERMS = NAME, PN, N;
      SRCPROC = A44,'.',/ A38/ A11:1,#;
      GOALREC-ELEM = NAME;
      PASSPROC = A166/ A44,'.',/ A38;
   SUB-INDEX = FIRST-NAME-STR;
      SEARCHTERMS = NULL;
      PASSPROC = A165;
   PTR-GROUP = POINTER-STR;

Here, only PASSPROC=A165 can be coded because A38 in the PASSPROC of INDEX-NAME supplies both the KEY of the index record and the KEY of the sub-index structure.

B.8.12 Choosing the "Fetcher" PASSPROC

Choosing the PASSPROC rule that fetches the element value or values from the goal record is a matter of selecting one rule from among sixteen that are shown in a table below. However, the table itself requires some explanation of the terminology that is often found in SPIRES processing rule descriptions.

The terms "single passer" and "multiple passer" need definition. A single passer situation occurs when only one goal record element is passing its value or values to an index. It does not matter whether this element is itself singly or multiply occurring, or whether A45 is used to break a single occurrence into multiple occurrences. A multiple passer situation occurs when more than one element in the goal record passes to a single index. For example, if HOME-PHONE and BUSINESS-PHONE elements in a goal record were both passed to a PHONE index, this would be a multiple passer situation.

Combined indexes present a special problem, since they contain two PASSPROC rule strings. However, the choice of PASSPROC rules is fairly straight-forward. [See B.8.8 for a complete discussion.] The first PASSPROC is always "A167:0"; this defines the elements from the goal record to be passed to the index. The second PASSPROC may be either A169:0 or A169:1, depending on whether the elements being passed are to be forced to upper case in passing or not; see the entry for a single passer without A38 or A45 in the following table.

One other factor affects the choice of PASSPROC rules from the table. If values "fetched" (obtained) from the goal record are to be broken apart in passing by action 45 or by action 38 (the latter for personal name indexes), then you must select a different PASSPROC rule than you would if A45 or A38 were not coded in the same rule string.

The P1 parameter on the PASSPROC that fetches the value from the goal record is determined by whether or not the value is to be forced to uppercase in passing. Values that are converted to an internal form, such as fixed binary or date values, must not be converted to "uppercase" in passing, since this would change their value. The P1 parameter is also determined by whether or not the value should be processed through the OUTPROC rules associated with that element, thus passing the external form of the value to the index.

                 Without                With
               A38 or A45            A38 or A45
         |--------------------|---------------------|
         |                    |                     |
 Single  |  A169:0            |  A166:0             |--Force Upper
 Passer  |                    |                     |
         |            A169:1  |            A166:1   |--Don't Force
         |                    |                     |
         |  A169:8            |  A166:8             |--Force Upper, but
         |                    |                     |  Pass External Form
         |            A169:9  |            A166:9   |--Don't Force, but
         |                    |                     |  Pass External Form
         |--------------------|---------------------|
         |                    |                     |
Multiple |  A167:1            |  A167:2             |--Force Upper
 Passers |                    |                     |
         |            A167:5  |            A167:6   |--Don't Force
         |                    |                     |
         |  A167:9            |  A167:10            |--Force Upper, but
         |                    |                     |  Pass External Form
         |            A167:11 |            A167:12  |--Don't Force, but
         |                    |                     |  Pass External Form
         |--------------------|---------------------|
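
The table can be read as a lookup on four yes/no questions. The following Python sketch is only an illustration of that lookup; the names FETCHER_TABLE and choose_fetcher are hypothetical and are not part of SPIRES.

```python
# Hypothetical helper that mirrors the table above; not part of SPIRES.
# Key: (multiple_passers, broken_by_a38_or_a45)
# Value: (rule, P1 values in table row order:
#         force upper, don't force, force + external, don't force + external)
FETCHER_TABLE = {
    (False, False): ("A169", (0, 1, 8, 9)),
    (False, True):  ("A166", (0, 1, 8, 9)),
    (True,  False): ("A167", (1, 5, 9, 11)),
    (True,  True):  ("A167", (2, 6, 10, 12)),
}

def choose_fetcher(multiple_passers, with_break, force_upper, external_form):
    """Return the fetcher rule (for example 'A169:0') for one table cell."""
    rule, p1_values = FETCHER_TABLE[(multiple_passers, with_break)]
    # Rows alternate force / don't force; the external-form rows come last.
    row = (0 if force_upper else 1) + (2 if external_form else 0)
    return f"{rule}:{p1_values[row]}"

# A single passer, no A38 or A45, value forced to upper case:
print(choose_fetcher(False, False, True, False))
```

For a personal name index, which uses A38, the same call with with_break set to True selects an A166 rule instead.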

B.8.13 Other Actions in a PASSPROC Rule String

The following describes the syntax for any PASSPROC rule string. You enter the table at the word "PASSPROC" and follow the paths defined. The symbol "::" is read "is defined as." Terms on the left side of a "::" are defined by the term(s) that appear on the right side of the "::". Terms on the right side of the "::" that are listed directly under another term (or terms) on that side are an alternative definition for the term on the left side of the "::". The symbol "|" means "or," and also separates alternative definitions.

 <Term>    indicates a term that must occur once, i.e., is required
 (Term)    indicates a term that may occur once, i.e., is optional
 (0,Term)  indicates a term that may occur any number of times,
           i.e., may be omitted, or may occur once or more than once
 A-number  indicates a required processing rule.  If no P1
           parameter is specified, then all P1 parameters are
           included.  If a P1 parameter is specified, then only
           that P1 parameter is allowed.

PASSPROC         ::  <MULTIPLE-PASSER>
                  |  <SINGLE-PASSER>
                  |  <SIMPLE-PASSER>

MULTIPLE-PASSER  ::  (DEFAULT) <MULTIPLE-FETCHER> (0,MIDDLE) <BREAK>

SINGLE-PASSER    ::  (DEFAULT)  <SINGLE-FETCHER>  (0,END)

SIMPLE-PASSER    ::  <DEFAULT> | A165 | A167:0 | A170

DEFAULT          ::  A171

MULTIPLE-FETCHER ::  A166 | A167:2 | A167:6

SINGLE-FETCHER   ::  A167:1 | A167:5 | A169

MIDDLE           ::  A22  | A32  | A36  | A40  | A43  | A44  | A46  | A47
                  |  A48  | A55  | A62  | A161 | A162 | A163 | A168

BREAK            ::  A45  (0,END)
                  |  A38

END              ::  <MIDDLE> | A52 | A164

For example, this syntax shows that the following is illegal syntax:

PASSPROC = A167:2/ A52/ A45,;

because A52 (as an END rule) must follow A45 (which is a BREAK rule). This syntax also shows that A38 must be the last rule in any PASSPROC in which it is coded.
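
The syntax above is small enough to check mechanically. The following Python sketch is one possible checker, written under stated assumptions: it looks only at each rule's A-number and P1 parameter, it assumes that "/" never appears inside a quoted parameter, and it rejects A167 coded without an explicit P1 because the syntax lists A167 only with specific P1 values. The names classify and is_valid_passproc are hypothetical.

```python
import re

# MIDDLE rules from the syntax above.
MIDDLE = {22, 32, 36, 40, 43, 44, 46, 47, 48, 55, 62, 161, 162, 163, 168}
RULE = re.compile(r"A(\d+)(?::(\d+))?")
# One letter per rule: D=DEFAULT, X=other SIMPLE-PASSER rule, M=MULTIPLE-FETCHER,
# S=SINGLE-FETCHER, m=MIDDLE, K=A45, N=A38, e=A52 or A164.
SHAPE = re.compile(r"D|X|D?Mm*(?:K[me]*|N)|D?S[me]*")

def classify(action, p1):
    """Map one rule to its letter, or '?' if the syntax does not list it."""
    if action == 171:
        return "D"
    if action in (165, 170) or (action == 167 and p1 == 0):
        return "X"
    if action == 166 or (action == 167 and p1 in (2, 6)):
        return "M"
    if action == 169 or (action == 167 and p1 in (1, 5)):
        return "S"
    if action in MIDDLE:
        return "m"
    if action == 45:
        return "K"
    if action == 38:
        return "N"
    if action in (52, 164):
        return "e"
    return "?"

def is_valid_passproc(rule_string):
    """Check a rule string such as "A167:2/ A45,/ A52" against the syntax."""
    letters = []
    for piece in rule_string.rstrip("; ").split("/"):
        match = RULE.match(piece.strip())
        if match is None:
            return False
        p1 = int(match.group(2)) if match.group(2) else None
        letters.append(classify(int(match.group(1)), p1))
    return SHAPE.fullmatch("".join(letters)) is not None

print(is_valid_passproc("A167:2/ A52/ A45,;"))   # the illegal example above
print(is_valid_passproc("A166/ A44,'.',/ A38"))  # personal name PASSPROC
```

The checker says nothing about whether the parameters of each rule are themselves valid; it only tests the order of the rules.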

B.8.14 How Passing Works

The SPIBILD and FASTBILD processors make use of the Linkage Sections of a File Definition (GOALREC-NAME sections) in the following ways:

Each record type of a file is processed in ascending order by record-type number (all REC1's, then all REC2's, etc.). Goal records associated with each record-type are processed either in the order in which they are input under BATCH command processing, or in ascending key order if thru the DEFQ. Note that UPDATE and REMOVE commands under BATCH processing normally cause DEFQ entries to be created and later processed in key order. ADD commands under BATCH processing are handled in the order they are received.

Limiting the discussion to just the PROCESS command in SPIBILD simplifies the explanation, and makes it possible for you to determine what happens in a "recovery" situation. For simplicity, assume all records come from the DEFQ. (Note that the process of passing index information is optimized during BATCH MERGE requests in SPIBILD. Only indexed information that is changed between the old version of a record and a new version is passed. This minimizes CPU and I/O activity.)

Phase 1

If the record-type does NOT have an associated Linkage Section (or sections), then skip to Phase 2.

For ADD-type records, a copy of the record is first added to the TREE if the record type is declared as REMOVED. If the record fails to add because a copy already exists in the TREE, then it is assumed that "recovery" is taking place, and the record is updated in the TREE. The original DEFQ record is now sent to the passing process marked for pointer INSERT. For UPDATE-type or REMOVE-type records, the original TREE copy of the record is read up and sent to the passing process marked for pointer DELETE. Then, the original UPDATE-type record from the DEFQ is sent to the passing process marked for pointer INSERT.

The passing process is controlled by the Linkage Section. For each record received (marked for pointer INSERT or DELETE), the PTR-ELEM and Global QUAL-ELEM portion of the Linkage is processed first. One and only one item is placed in the "pass stack" (called PS) for each item in this Common portion. PASSPROC A165 may be used to cause no entry, or something like A167:1 (multiple passer) can be used to cause the first occurrence of the first existing element in the list to be placed in PS. If no value can be found and A165 or A171 was not used, then a "null" value is placed in PS.

Once the Common portion of the Linkage Section is finished, processing begins on each INDEX-NAME portion. The INDEX-NAME, SUB-INDEX, and Local QUAL-ELEM terms constitute one INDEX-NAME portion. These terms are "traversed" in both a forward and backward manner beginning with a forward scan starting with INDEX-NAME and running thru the last term. Each term causes one value to be retrieved from the goal record. If no value can be found during a forward scan, and A165 or A171 was not used, then NO INDEX ENTRY is defined, and backward scan commences. If the forward scan can be completed, then an INDEX ENTRY is defined, and backward scan commences. During backward scan, each term attempts to retrieve a value from the goal record from where it left off during the previous forward scan. If another value can be retrieved, then forward scan resumes from this point, and if it continues forward thru the last term, then another INDEX ENTRY is defined. However, if during any forward scan, structural boundaries have to be crossed to satisfy a term as compared to a previous term, then the backward scan will NOT be able to pick up inside the same occurrence of the structure again. The common root of the two paths closest to the record level determines the next forward path.

Here are some examples using a single INDEX-NAME section with different layouts of elements in goal records. Note: only important items are shown.

INDEX-NAME = index;
  GOALREC-ELEM = WORD;
PTR-GROUP = ptstr;
QUAL-ELEM = qual1;
  GOALREC-ELEM = DATE;
QUAL-ELEM = qual2;
  GOALREC-ELEM = TIME;

The layout of WORD, DATE, and TIME in the goal records can vary in many ways. Structural boundaries can make a big difference in what happens. Consider the following:

 Record-level  WORD(1)  WORD(2)  DATE(1)  DATE(2)  TIME(1)  TIME(2)
 --------------|--------|--------|--------|--------|--------|

All elements are doubly occurring, and all are at the Record-level. The following INDEX ENTRIES will be constructed:

WORD(1)  DATE(1)  TIME(1)
WORD(1)  DATE(1)  TIME(2)
WORD(1)  DATE(2)  TIME(1)
WORD(1)  DATE(2)  TIME(2)
WORD(2)  DATE(1)  TIME(1)
WORD(2)  DATE(1)  TIME(2)
WORD(2)  DATE(2)  TIME(1)
WORD(2)  DATE(2)  TIME(2)
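
When every element is at the record level, the list above is simply every combination of one WORD, one DATE, and one TIME, with the last term varying fastest. The following Python sketch reproduces the list for this layout only; it is not a model of how SPIBILD scans a record.

```python
from itertools import product

# Element occurrences in the goal record, all at the record level.
words = ["WORD(1)", "WORD(2)"]
dates = ["DATE(1)", "DATE(2)"]
times = ["TIME(1)", "TIME(2)"]

# product() varies the last list fastest, which matches the order in which
# the backward scan picks up the next TIME before the next DATE.
entries = list(product(words, dates, times))

for entry in entries:
    print("  ".join(entry))
```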

Now consider the following layout:

                              TIME(1)             TIME(2)
                     DATE(1)  |          DATE(2)  |
   WORD(1)  WORD(2)  |--------|          |--------|
  -|--------|--------|-------------------|

Here, TIME is the key of a structure which has been defined as an element of another structure whose key is DATE, and that structure is defined at the record level along with WORD.

The following INDEX ENTRIES will be created:

WORD(1)  DATE(1)  TIME(1)
WORD(1)  DATE(2)  TIME(2)
WORD(2)  DATE(1)  TIME(1)
WORD(2)  DATE(2)  TIME(2)

After completing the first forward pass, an attempt to pick up another value of TIME on a backward scan would have required leaving the structural bounds defined by the first DATE. On a backward scan, structural bounds defined by a previous element can't be crossed, so the backward scan continues by trying to get another DATE value. The boundary now is the record-level defined by WORD. Since the DATE structure has its root at that level, SPIBILD can proceed forward again beginning with DATE(2), and then TIME(2). Scanning backward again takes us all the way back to WORD(1) from which SPIBILD now picks up WORD(2) and scans forward again. If there had been other occurrences of the TIME structure associated with DATE(1), then SPIBILD would have picked all of them up along with DATE(1), but none of them with DATE(2). The same thing would have happened if TIME had been simply a multiply occurring element inside the DATE structure instead of being the key of a multiply occurring structure defined inside the DATE structure. That is,

                     DATE(1)  TIME(1)    DATE(2)  TIME(2)
   WORD(1)  WORD(2)  |--------|          |--------|
  -|--------|--------|-------------------|
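
For this nested layout, the entries listed earlier can be reproduced by pairing each TIME value only with the DATE occurrence that contains it, and then combining those pairs with every WORD. The following Python sketch does just that; the data layout and the name nested_entries are hypothetical, and the sketch covers only this layout, not the general scanning rules.

```python
# Each DATE occurrence owns the TIME values found inside its structure.
words = ["WORD(1)", "WORD(2)"]
date_structures = [
    ("DATE(1)", ["TIME(1)"]),
    ("DATE(2)", ["TIME(2)"]),
]

def nested_entries(words, date_structures):
    """Combine every WORD with each (DATE, TIME) pair from the same structure."""
    entries = []
    for word in words:
        for date, times in date_structures:
            for time in times:
                entries.append((word, date, time))
    return entries

entries = nested_entries(words, date_structures)
for entry in entries:
    print("  ".join(entry))
```

Adding a second TIME value to the list for DATE(1) produces extra entries with DATE(1) only, as described above.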

Now, consider the following layout:

                                         WORD(1)  WORD(2)
                     DATE(1)  DATE(2)    |--------|
   TIME(1)  TIME(2)  |--------|----------|
  -|--------|--------|

Here, WORD is a multiply occurring element of a structure defined as an element of another structure containing a multiply occurring DATE element, and that structure is defined at the record-level along with a multiply occurring TIME element.

The following INDEX ENTRIES will be created:

WORD(1)  DATE(1)  TIME(1)
WORD(1)  DATE(1)  TIME(2)

Why? Because on the forward scan, SPIBILD has to abandon the structure containing WORD to retrieve DATE, and then SPIBILD has to abandon that structure to retrieve TIME. We can't go back into the same occurrences of those structures on the backward scan. The common root would be the next occurrence of a structure at the record-level. Interestingly enough however, the following example does give us more INDEX ENTRIES (Note the similarity to the first structural layout).

                              WORD(1)             WORD(2)
                     DATE(1)  |          DATE(2)  |
   TIME(1)  TIME(2)  |--------|          |--------|
  -|--------|--------|-------------------|

Here, WORD is the key of a structure which has been defined as an element of another structure whose key is DATE, and that structure is defined at the record level along with TIME.

The following INDEX ENTRIES will be created:

WORD(1)  DATE(1)  TIME(1)
WORD(1)  DATE(1)  TIME(2)
WORD(2)  DATE(2)  TIME(1)
WORD(2)  DATE(2)  TIME(2)

Once all INDEX ENTRIES for an INDEX-NAME section have been placed in PS, the next INDEX-NAME section is processed. An INDEX-NAME section is considered completed when a backward scan for the INDEX-NAME term fails to retrieve a value. When all INDEX-NAME sections have been completed, passing for the record is finished, but the information concerning index updates only exists in PS. When PS fills up or no more records are retrieved from the DEFQ, then PS is processed by sorting it down to unique INDEX ENTRIES cancelling duplicate entries and any INSERT/DELETE pairs. The result is a list of INSERT and/or DELETE entries to be applied to the indexes. The appropriate index records are added, removed, or updated using the PS information. Because of the sort, an index record need only be read once in order to apply all PS information associated with that record.
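
The reduction of PS can be pictured with a short Python sketch. It assumes that each PS item is an (action, entry) pair, that identical items are duplicates, and that an entry marked for both INSERT and DELETE cancels out completely. The function name reduce_pass_stack and the tuple layout are hypothetical; the real PS format is not described here.

```python
def reduce_pass_stack(ps):
    """Sort PS down to unique entries and cancel INSERT/DELETE pairs."""
    # Sorting by entry first groups all work for one index record together.
    unique = sorted(set(ps), key=lambda item: (item[1], item[0]))
    inserts = {entry for action, entry in unique if action == "INSERT"}
    deletes = {entry for action, entry in unique if action == "DELETE"}
    cancelled = inserts & deletes
    return [(action, entry) for action, entry in unique if entry not in cancelled]

# An UPDATE passes the old record for DELETE and the new record for INSERT.
# Here APPLE is unchanged, FIG was dropped, and PEAR is new (passed twice).
ps = [
    ("DELETE", ("WORD", "APPLE", 7)),
    ("DELETE", ("WORD", "FIG", 7)),
    ("INSERT", ("WORD", "APPLE", 7)),
    ("INSERT", ("WORD", "PEAR", 7)),
    ("INSERT", ("WORD", "PEAR", 7)),
]
print(reduce_pass_stack(ps))
```

Only the FIG delete and the PEAR insert survive, so the unchanged APPLE index record is never read.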

When all DEFQ records for one record-type have been passed, processing continues with Phase 2 below.

Phase 2

All the records in the DEFQ associated with one record-type are read sequentially. For each ADD-type record, the record is added to the TREE if no Linkage Section was defined, or the record type is not removed; otherwise, these ADD-type records are ignored. For each REMOVE-type record, the TREE copy of the record is removed. For each UPDATE-type record, the TREE copy is replaced by the DEFQ copy. When all the records for a particular record-type have been processed, the next logical record-type is processed starting back at Phase 1. When all record-types have been processed, the DEFQ is cleared and the PROCESS command is finished.
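
The Phase 2 rules for one record-type can be summarized in a few lines of Python. This is only a sketch: the TREE is modeled as a dictionary, each DEFQ item as a (kind, key, record) tuple, and the name phase2 is hypothetical.

```python
def phase2(tree, defq, has_linkage, is_removed):
    """Apply the DEFQ records of one record-type to the TREE (a dict here)."""
    for kind, key, record in defq:
        if kind == "ADD":
            # With a Linkage Section and a REMOVED record type, the record
            # was already added to the TREE in Phase 1, so it is ignored here.
            if not has_linkage or not is_removed:
                tree[key] = record
        elif kind == "REMOVE":
            tree.pop(key, None)
        elif kind == "UPDATE":
            tree[key] = record
    return tree

tree = {"K1": "old copy", "K3": "to be removed"}
defq = [("ADD", "K2", "new"), ("UPDATE", "K1", "new copy"), ("REMOVE", "K3", None)]
print(phase2(dict(tree), defq, has_linkage=False, is_removed=False))
```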

1 Triples

There is a "type" associated with all variables in SPIRES. The normal types are as follows:

STRING, CHAR, LINE, PACKED, REAL, INT, and FLAG.

CHAR is a "fixed length" form of STRING. LINE is the type associated with line-number variables like $WDST or $WDSR. The others represent the most commonly used types.

A 'VALUE' is an association of the value of a field with the name of the field. A Triple (or 3-Tuple) is used to associate more than 2 entities. A triple might be used if it was also necessary to know who owned the field. For example: BIRTHDATE (the attribute) and 476322 (the employee number whose birthdate this is) and 19540611 (the value). Thus a triple is an association of three pieces of information. Such an association is itself an entity with TYPE=TRIPLE. A collection of triples forms another entity with TYPE=GROUP. Thus, there are additional types that variables can assume. They are:

GROUP, TRIPLE, and PSEUDO ($ANY or $NEW).

1.1 Triple Functions

Triples are constructed and manipulated by $-functions. The functions are:

LET tri = $MAKE(att,obj,val)
LET int = $MADE(att,obj,val)
LET int = $UNMAKE(att,obj,val)
LET int = $UNMAKETRIPLE(#tri)
LET tri = $LOOKUP(att,obj,val)
LET att = $ATTRIBUTE(#tri)
LET obj = $OBJECT(#tri)
LET val = $VALUE(#tri)
LET gro = $GROUP(att,obj,val)
LET int = $GROUPSIZE(#gro)
LET tri = $GROUPELEM(#gro,#int)
LET gro = $GROUPSORT(#gro,field)   [field = ATT, OBJ, or VAL]

Within the parens in the list above, "att", "obj", and "val" represent variables and/or values of any type.

1.1.1 $MAKE

$MAKE guarantees that the triple exists and returns a type TRIPLE result. There are two common ways to make a triple:

LET T = $MAKE(A,B,C)
EVAL    $MAKE(A,B,C)

EVAL makes the triple but doesn't assign it to a variable. This is the most common way of creating triples. The PSEUDO variable $NEW can be used in any field to make a unique triple when fewer than 3 values are available. Each use of $NEW causes a different PSEUDO value to be generated.

1.1.2 $MADE

$MADE returns an integer count of the number of triples that satisfy the criteria. The PSEUDO variable $ANY can be used to indicate a non-specific criterion. For example:

EVAL $MAKE(A,B,C)
EVAL $MAKE(A,B,X)

LET N = $MADE(A,B,$ANY)

would set N to 2 since there are two triples that have "A" as the att-field, "B" as the obj-field, and anything in the val-field. $MADE($ANY,$ANY,$ANY) returns the count of all triples that currently exist.

1.1.3 $UNMAKE

$UNMAKE unmakes the triple specified. If $ANY is coded as a parameter, $UNMAKE unmakes all triples that qualify.

EVAL $UNMAKE($ANY,$ANY,$ANY)

unmakes all triples unconditionally. No error occurs on an attempt to unmake a triple that already is unmade. As in the case of $MAKE, $UNMAKE guarantees that the triple doesn't exist. However, if the triple has been assigned to a variable, then the variable must be eliminated before space for the triple is released. Thus:

LET T = $ZAP
EVAL $DYNZAP(T,0,'')

1.1.4 $UNMAKETRIPLE

$UNMAKETRIPLE unmakes a specified triple. For example:

LET T = $MAKE(A,B,C)

then

EVAL $UNMAKE(A,B,C)
EVAL $UNMAKETRIPLE(#T)

both unmake the same triple. $UNMAKETRIPLE returns an integer which is the use-count of the triple unmade. Note that the triple-type variable must still be eliminated before space for the triple is released.

1.1.5 $LOOKUP

$LOOKUP returns the first triple which satisfies the criteria.

LET T = $LOOKUP(A,B,$ANY)

would return the first triple encountered that had "A" in the att-field, "B" in the obj-field, and anything in the val-field.
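
The behavior of $MAKE, $MADE, $UNMAKE, and $LOOKUP described above can be modeled with a small Python class. This is a sketch for illustration only, not how SPIRES stores triples. The class name TripleStore, the ANY marker (standing in for $ANY), the use of insertion order for "first", and the count returned by unmake are all assumptions. Use-counts, $NEW, and triples still referenced by variables are not modeled.

```python
ANY = object()  # stands in for the PSEUDO variable $ANY

class TripleStore:
    def __init__(self):
        self._triples = []  # kept in the order the triples were made

    def _matches(self, att, obj, val):
        pattern = (att, obj, val)
        return [t for t in self._triples
                if all(p is ANY or p == field for p, field in zip(pattern, t))]

    def make(self, att, obj, val):
        """Guarantee that the triple exists and return it."""
        triple = (att, obj, val)
        if triple not in self._triples:
            self._triples.append(triple)
        return triple

    def made(self, att, obj, val):
        """Count the triples that satisfy the criteria."""
        return len(self._matches(att, obj, val))

    def unmake(self, att, obj, val):
        """Unmake all triples that qualify; no error if none exist."""
        matched = self._matches(att, obj, val)
        self._triples = [t for t in self._triples if t not in matched]
        return len(matched)  # assumption: the count of triples unmade

    def lookup(self, att, obj, val):
        """Return the first triple that satisfies the criteria, or None."""
        matched = self._matches(att, obj, val)
        return matched[0] if matched else None

store = TripleStore()
store.make("A", "B", "C")
store.make("A", "B", "X")
print(store.made("A", "B", ANY))
print(store.lookup("A", "B", ANY))
```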

1.2 Decomposing Triples

The $ATTRIBUTE, $OBJECT, and $VALUE functions are used to isolate the three components of a triple, any of which may be another triple. A triple-type variable cannot be displayed by the /* command because triples do not convert to strings.

1.2.1 $ATTRIBUTE

$ATTRIBUTE returns the att-field of a specified triple. Thus:

LET X = $MAKE(#T,"APPLES",$INT(9))
LET R = $ATTRIBUTE(#X)

sets R to the value that T had when the triple was made. The "type" of R would be the same as the type associated with T (which might be triple or group).

1.2.2 $OBJECT

$OBJECT returns the obj-field of a specified triple.

LET X = $MAKE(#T,"APPLES",$INT(9))
LET O = $OBJECT(#X)

sets O to the string-value "APPLES".

1.2.3 $VALUE

$VALUE returns the val-field of a specified triple.

LET X = $MAKE(#T,"APPLES",$INT(9))
LET V = $VALUE(#X)

sets V to the integer-value of 9.

1.3 Groups

Triples are 3 values, given that a value may itself be a triple. Groups are N values, all of which must be triples. Thus groups are N-Tuples which may be referenced like a list.

1.3.1 $GROUP

$GROUP makes a group out of all triples that qualify. Thus:

LET G = $GROUP(A,B,$ANY)

makes a group out of all triples that have "A" in the att-field, "B" in the obj-field, and anything in the val-field.

LET G = $ZAP

eliminates the G variable and the associated group.

1.3.2 $GROUPSIZE

$GROUPSIZE returns an integer representing the size (i.e. number of members) of the specified group. For example:

EVAL $MAKE(A,B,C)
EVAL $MAKE(A,B,X)
LET G = $GROUP(A,B,$ANY)
LET N = $GROUPSIZE(#G)

Note that LET N = $MADE(A,B,$ANY) returns the same number, but doesn't require a group-type variable to exist.

1.3.3 $GROUPELEM

$GROUPELEM returns the nth triple in a specified group. Thus:

LET T = $GROUPELEM(#G,1)

sets T to the first triple in group G. If #N represents the number of members in a group, then the second argument of the $GROUPELEM function ranges from 1 through #N (inclusive).

1.3.4 $GROUPSORT

$GROUPSORT sorts a specified group by a specified field.

LET G = $GROUPSORT(#G,VAL)

would sort group G by the val-field. The sort is always done in ascending character (unsigned) order.
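
The four group functions can be pictured as operations on a list of triples. The following Python sketch is an illustration only: the function names are hypothetical, triples are modeled as tuples, ANY stands in for $ANY, and Python string comparison stands in for the character ordering used by $GROUPSORT.

```python
ANY = object()  # stands in for the PSEUDO variable $ANY
FIELDS = {"ATT": 0, "OBJ": 1, "VAL": 2}

def group(triples, att, obj, val):
    """Collect all triples that qualify, like $GROUP."""
    pattern = (att, obj, val)
    return [t for t in triples
            if all(p is ANY or p == field for p, field in zip(pattern, t))]

def group_size(members):
    """Number of members, like $GROUPSIZE."""
    return len(members)

def group_elem(members, n):
    """The nth member, counting from 1, like $GROUPELEM."""
    if not 1 <= n <= len(members):
        raise IndexError("n must range from 1 through the group size")
    return members[n - 1]

def group_sort(members, field):
    """Sort by ATT, OBJ, or VAL in ascending character order, like $GROUPSORT."""
    index = FIELDS[field]
    return sorted(members, key=lambda t: str(t[index]))

triples = [("A", "B", "PEAR"), ("A", "B", "APPLE"), ("A", "Z", "FIG")]
g = group(triples, "A", "B", ANY)
print(group_size(g), group_elem(g, 1), group_sort(g, "VAL"))
```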

1.4 SHOW & CLEAR TRIPLES

CLEAR TRIPLES is a SPIRES command that unmakes all triples (if any exist). Likewise, the command EVAL $UNMAKE($ANY,$ANY,$ANY) will unmake all triples. A subsequent $MADE($ANY,$ANY,$ANY) would return a zero count. However, the SHOW TRIPLES command might still find triples in the system even though they have been unmade, providing they are still referenced by variables. Therefore, SHOW TRIPLES finds all that exist for any reason, and $MADE (or $LOOKUP) finds all that are still made.

The "IN area " prefix may be used with SHOW TRIPLES to place the result in an area (like IN ACTIVE). Each triple that is still made is shown as a percent-sign followed by a number. Unmade triples that are still referenced are shown with a leading X instead of a % character. The 3 components of each triple are then shown following the number, with commas separating each component. Here are some examples:

LET TRIPLE = $MAKE($DATE,"Apples",$ANY)
EVAL $MAKE(A,"Apple sauce",#TRIPLE)
EVAL $MAKE(A,B,C)

%7562744: 06/27/80, Apples,%0
%7561704: A, Apple sauce,%7562744
%7561896: A, B, C

Following CLEAR TRIPLES, we would have the following:

X7562744: 06/27/80, Apples,%0

That triple is unmade but is still referenced by the variable called TRIPLE. After doing a CLEAR DYNAMIC VARIABLES, EVAL $DYNZAP(TRIPLE,0,''), or LET TRIPLE = $ZAP, the triple would be eliminated.

In Formats, all variables that are to be used as triples or groups must be declared TYPE=DYNAMIC. Also, CLEAR TRIPLES and CLEAR DYNAMIC VARIABLES commands cannot be used in the Format, so the EVAL command must be used instead:

EVAL $UNMAKE($ANY,$ANY,$ANY)
EVAL $DYNZAP(variable,subscript,'')

2 Add Function Documentation

2.1 SPIRES Functions -- Background

Spires allows functions in only certain commands. The BNF is what controls this selectivity. These are the commands:

   IF, LET, WHILE, UNTIL, EVAL, SHOW EVAL, SET TCOMMENT

These same commands may occur in protocols, either compiled or uncompiled. An uncompiled protocol is nothing more than a series of commands contained in an active file or the type-XEQ element of a Spires record. Type-XEQ has the same stored structure as an input active file. Therefore, either way Spires gets your protocol, the executed code looks the same.

UPROC of USERPROC and Formats are another place where functions can occur. The same basic set of commands is involved, with SHOW EVAL replaced by SET VALUE.

Functions are recognized by their starting $ character, a name and then an open parenthesis. The exception is $ZAP, which is used stand-alone in LET.

   LET variable = $ZAP

All other $functions have a parenthesized parameter list, with one or more parameters. The number of parameters is determined by where the function's name is found in the element-list of the .$2FUN record-type of the $DATA file. In other words, functions are organized into structures within that record-type, and which structure they are in determines the number of parameters:

  Record .$2FUN of GG.SPI.DATA

  $0  Functions with variable number of arguments.
  $1  Functions of 1 argument
  $2  Functions of 2 arguments
  $3  Functions of 3 arguments
  $4  Functions of 4 arguments
  $5  Functions of 1 argument
  $6  Functions of 2 arguments
  $8  Functions with variable number of arguments in RECDEF ($FUN.name)

The list above shows the structure names. There can not be a $7. Spires limits each structure to 31 names. That's why $1 and $5 are both for 1 argument functions. $5 is overflow for $1. Likewise for $2 and $6. The Structure/Element number is what identifies each function. The reason for the limits goes way back to the original implementation, in which structure/element numbers get compressed down to a single byte. That limited us to about 255 PL360 functions. There can't be a Hex-00 code.

The relation between the structure/element numbers and the final one-byte code for each function is as follows:

   $1  -> 01 thru 1F
   $2  -> 21 thru 3F
   $3  -> 41 thru 5F
   $4  -> 61 thru 7F
   $5  -> 81 thru 9F
   $6  -> A1 thru BF
   $0  -> C0 thru FF  (this is a special case, up to 63 functions)

   $8  -> $SYSXEQ with the given function name as the 1st parameter.
          There could be as many as 127 functions in the $8 structure.

$0 names the functions with one or more (indefinite) arguments. $MATCH is an example. $8 names functions that are found as compiled RECDEF records with ID = $FUN.name as their key. These functions are implemented using USERPROC = PROD and USERPROC = TEST, and it is the UPROC statements that do the work. UPROC = SET VALUE ... returns the answer for these functions. Actually, $8 function names are just a shortcut to doing a $SYSXEQ function.
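
The ranges listed above for $1 through $6 follow a regular pattern: each structure starts a new block of hex-20 codes, and the element number is added to the start of the block. The following Python sketch shows that arithmetic. The name function_code is hypothetical, and the sketch deliberately leaves out $0 and $8, since the exact element-to-code assignment for those structures is not spelled out here.

```python
def function_code(structure, element):
    """One-byte code for element 1..31 of structure $1..$6."""
    if structure not in range(1, 7):
        raise ValueError("only structures $1 through $6 are modeled")
    if not 1 <= element <= 31:
        raise ValueError("each structure is limited to 31 names")
    # $1 starts at hex 00, $2 at hex 20, and so on; element 0 is never used.
    return (structure - 1) * 0x20 + element

for structure in range(1, 7):
    first = function_code(structure, 1)
    last = function_code(structure, 31)
    print(f"${structure}  -> {first:02X} thru {last:02X}")
```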

You can see a complete list of system functions by doing the following:

  -> select system functions
  -> show elements

Yes, it is that simple. The SYSTEM FUNCTIONS subfile is .$2FUN of $DATA. By the way, record-names that begin with a dot are pseudo-record-types. If any records are added to such record-types, they exist in the DEFQ only.

2.2 SPIRES Functions -- Implementation

SPIRES Functions are implemented as follows. Start by determining how many parameters the function will have, and whether it will be implemented in PL360 code or in RECDEF records. You have to SELECT FILEDEF, TRANSFER $DATA, and add your new $function name at the end of one of the structures in .$2FUN's record definition. You can add an alternate spelling for your function using the ALIAS statement. You then update the file definition and recompile it, and then dump characteristics:

   -> master
   -> select filedef
   -> transfer $DATA
   -> ..edit       <--- you add your function name as described below.
   -> update
   -> /recompile $key
   -> select system functions
   -> dump char to FUNS.OBJ rep entry FUNSTABS build

For RECDEF records, $8 is the structure which will receive the function name. You'll code the function by writing a RECDEF record with ID = $FUN.name. Look at existing $FUN.name records in RECDEF to get a flavor of how they are coded. They always have a USERPROC = PROD; and USERPROC = TEST; and most of the time TEST simply calls upon PROD to do the work. You could have separate implementations if you want to test changes. In that case you'd also have to issue the SET TMACROS command to tell Spires to use the TEST code, not PROD.

For PL360 code, you'll append the name to $0, $3, $4, $5, or $6. $1 and $2 are both full. $0 is for variable number argument functions, although many functions in $0 are almost always referenced with a fixed number of arguments.

The PL360 code is in PARSEM.TXT, in several Global Procedures all with names that begin with APPLY, such as APPLY1HI or APPLY2, etc. Each procedure has a GOTO jump list that branches to the specific code associated with each of the functions named by the related structure(s). So APPLY2 and APPLY2HI are dealing with functions from $2 and $6.

When I added the $DENCODE function, I added "ELEM = DENCODE;" to the end of the $6 structure because this is a two-parameter function. I then added new PL360 code to APPLY2HI, just after the current GOTO jump list, and I had to add another jump to the list to branch around this new code. I also had to update an EQUATE value that specifies the currently known limit for functions with two parameters. There is a warning message in the code about that detail. This is what the PL360 code looks like:

       GOTO FUN239; |-- $LSTRIP --|
       GOTO FUN240; |-- $RSTRIP --|
 |-- **** Be sure to update APPLY2MX limit **** --|

 FUN241:  |-- $DENCODE --|
    GET2CHAR;  STM(R1,R2,B13);  R1 := R9;  R2 := R10;
    IF R1 <= 0 THEN GOTO EXITNULL;   LM(R9,R10,B13);
    REDUCE(R9);  R3 := 7;  IF R9 > R3 THEN R9 := R3;
    R4 := R1;  R3 := PARSEND;  |-- Get answer space --|
    R1 := R1 + R3 - NXTSPACE;  IF > THEN GETCORE;
    IF R10 = 0 THEN R9 := DEC R10;  B13(0/8) := 2.78281L;
    IF R9 >= 0 THEN EX(R9,XC(0,C13,C10));  |-- Key --|
    BEGIN  PROCEDURE CYPHER (R14);
       BEGIN  B13 := R1 OR #A5A5A5A5;
          UNPK(0,0,C13,C13);  UNPK(0,0,C13(3),C13(3));
          XC(1,C13(2),C13);  XC(0,C13(1),C13(3));
       END;  LM(R9,R10,B13);
       R1 := R9;  CYPHER;  B13(4) := B13;
       R1 := R10;  CYPHER;  |-- Values ready --|
    END;  R1 := R3;  R10 := R3;  R9 := R4;  R6 := R2;
    R14 := R4++R4;  REDUCE(R14);  WHILE R4 > 0 DO
    BEGIN  R3 := B13(4) * R14 =: B13(4) =: B13(8);
       R3 := B13 * R14 =: B13 XOR B13(8) =: B13(8);
       B1 := B6 XOR B13(8);  R1 := @B1(4);
       R6 := @B6(4);  R4 := R4 - 4S;
    END;  PARSEND := R1;  RETCORE;
    R1 := R9;  R2 := R10;  GOTO EXITSTR;

 FUN240:  |-- $RSTRIP --|

What was added was the GOTO FUN240; line, and all the FUN241 code. What existed before looked something like this:

       GOTO FUN239; |-- $LSTRIP --|
 |-- **** Be sure to update APPLY2MX limit **** --|

 FUN240:  |-- $RSTRIP --|

Of course the real trick here is to come up with the PL360 code to implement your function. In this DENCODE example, I found code in other parts of the system that implemented the $ENCRYPT(NUM) proc, and the SET CRYPT command. These were combined to create DENCODE, and the result is stored in PARSEND (extended as needed). Every function then returns a length in R1, a value or address in R2, and a type-code in R3. In this case, EXITSTR supplies the type.

Functions can return different types. EXITNULL returns a null. There are also INT, HEX, PACK, FLAG, REAL, LINE, CHAR and STRING types. CHAR is a fixed-length string, which may be blank padded. There are special types infrequently used. They involve TRIPLES.

2.3 SPIRES Functions -- Installation

SPIRES Functions are installed into SPIRES/SPIBILD as follows. You've already created the FUNS.OBJ when you dumped characteristics from .$2FUN of the $DATA file. If you created your function with a RECDEF named $FUN.name, then you must compile that RECDEF to create a RECHAR record. If you created your function with PL360 code (PARSEM.TXT modifications), then you must compile PARSEM.TXT to get all remaining object decks.

In SPIRES, you should do the following:

   -> set xeq spiproto
   -> ..compall pl360.parsem

This should generate all the object decks associated with PARSEM.TXT.

You must now move your FUNS.OBJ, from where it was stored when it was created by the dump char command, into the "obj" directory used for linking new versions of SPIRES/SPIBILD. An old version of FUNS.OBJ should already be in your spisrc/obj directory. Your new FUNS.OBJ should be in the spisys directory, the same place all GG.SPI files are located. Replace the old with the new as follows (Mac OS X example):

   $ cd
   $ mv spisys/FUNS.OBJ spisrc/obj/.

In SPIRES, you should now do the following:

   -> set xeq spiproto
   -> ..emlink

This should move any object decks created by compiling PARSEM.TXT, and then link all versions of SPIRES and SPIBILD. The new versions will be in the spisrc/obj directory.

WARNING: Be sure you know where things are located. GG.SPI files are along some path that usually leads to spisys. GQ.DOC files are SPIKE files, and are usually some other place. MA.INT is where "spisrc/pl360" and "spisrc/obj" are located, although these paths can all be differently named.

Be sure your .empath file defines /paths for these accounts. Also be sure you have "link...." files in spisrc/obj, such as "link.spiresx". You need link files for: spibildh, spibildx, spiresh, spiresx.

And be sure you have "compall" and "emlink" in spiproto.

2.4 SPIRES Functions -- Documentation

You must document your function. There are several places where such documentation exists. First is within .$2FUN of the $DATA file. Second is within PARSEM.TXT or a RECDEF/RECHAR with $FUN.name.

You should update the FILEDEF.BI text file in spisrc/bi to reflect the ELEM (and ALIAS) statements you added to the $DATA file. FILEDEF.BI is a seed file used to build $DATA from scratch.

You should then create a new record in the PROTOCOLS MANUAL subfile. This is a SPIKE-file, and is easiest to work with using the ..spike protocol defined in both SPIPROTO and SPIKE PROTOCOLS. If you have SET XEQ to either SPIPROTO or SPIKE PROTOCOLS, then you can use ..spike to select a SPIKE-file. There is a SPIKE.INDEX in the GQ.DOC directory that lists all the SPIKE-files accessible using the ..spike protocol. For our purposes, this is enough:

   -> set xeq spike protocols
   -> ..spike proto

All currently defined functions are documented in this SPIKE-file, which has the PROTOCOLS MANUAL subfile name. You see this when you EXPLAIN any existing function. The header of the explanation indicates which section of the PROTOCOLS MANUAL contains the documentation.

You should use the WRITE format to alter SPIKE-file records. You can TRANSFER an existing section to get an idea of how they are constructed. In fact, you can use an existing record as a model for a new record. Be sure to come up with a unique section number.

The -XP and -KT lines define the explainable terms. Usually, the first -XP phrase is the key phrase. All aliases follow, with semi-colon separations.

Once you have created a new section in the PROTOCOLS MANUAL documenting your function, you must update the EXPLAIN subfile. This is done with the ..spikexp protocol, which takes the same argument as the ..spike protocol.

   -> set xeq spike protocols
   -> ..spikexp proto

At this point you create a result or stack of sections to be output to the EXPLAIN subfile, or use Global-For to create a display of records.

  -> ..spikexp proto
  :System (ORVYL, UNIX or CMS)?
  :Job (EXPLAIN, SYNTAX, EXAMPLE)?
  * PROTOCOLS MANUAL
  * Establish a Global-FOR, and in active display.
  *   or obtain a RESULT or STACK , and OUTPUT
  *   Then -------
  * ..upd.exp
  *   which does: SELECT EXPLAIN
  *               SET FORMAT EXPLAIN UPDATE
  *               INPUT ADDupd
  * To publish, ..orvyl.html
  -> stack 7.3.2a
  -Stack: 1 RECORD
  -> output clr
  -> ..upd.exp
  -? SELECT EXPLAIN
  -? SET FORMAT EXPLAIN UPDATE
  -? INPUT ADDUPD
  - ADD  @Line     40.     Key = $DENCODE FUNCTION
  -   Requests/Success:
   ADD       1       1
   SUM       1       1
  -End of batch input
  ->

2.5 SPIRES Functions -- Distribution

Once you've created SPIRES/SPIBILD by linking FUNS.OBJ and any object decks associated with functions defined by PL360 code in PARSEM.TXT, you have a complete implementation. You simply copy all versions of SPIRES/SPIBILD to other sites. If you used RECDEF/RECHAR to create $FUN.name functions, you must copy the RECDEF (source) to other sites and compile them there.

You do NOT have to migrate the $DATA file definition unless you want the other site(s) to have a $DATA file. If no one is going to SELECT SYSTEM FUNCTIONS, you don't need $DATA. Of course you need it at the site where you created the function.

You should also copy the explanations added or updated in the EXPLAIN subfile. The best way to do that is by creating a LOAD file from the DEFQ of the EXPLAIN subfile after doing the ..upd.exp protocol within the ..spikexp process. You then INPUT LOAD loadfile UPDATE at each site that has an EXPLAIN subfile.

3 Add Variable Documentation

3.1 SPIRES Variables -- Background

SPIRES allows system variables in any command that is parsed by a / or that is one of the following commands:

   IF, LET, WHILE, UNTIL, EVAL, SHOW EVAL

UPROC statements in USERPROCs and Formats are another place where variables can occur. The same basic set of commands is involved, with SHOW EVAL replaced by SET VALUE.

System variables are recognized by their starting $ character, followed by a name or a name in apostrophes. For example:

   if $cap($cginame) = 'CGI-BIN' then /* $cginame
   /find name $select
   let myvar = $date || ' ' || $time
   /wdse x$'date'd.$'time't

Note: in the wdse above, the output might appear as: x11/02/05d.12:15:25t

The nature of each $variable is determined by where the variable's name is found in the element-list of the .$1VAR record-type of the $DATA file. In other words, variables are organized into structures within that record-type, and which structure they are in determines how they are obtained, and usually what 'type' they return. In the list below, the 'type' is given first, then possibly a brief description, and then an example.

  Record .$1VAR of GG.SPI.DATA
  $01 FLAG (0=true), $NOECHO
  $02 FLAG (1=true), $ECHO
  $03 INT  (2-byte), $LENGTH
  $04 STRING, $SELECT
  $05 STRING, $ASK
  $06 LINE, $DELTA
  $07 CHAR, $ACCT
  $08 Variable type - dynamically determined, $DATE
  $09 INT, Formats, $ULEN
  $10 STRING, Formats, $UVAL
  $11 FLAG, Formats, $FREC
  $12 FLAG, Formats, $VIAPTR
  $13 FLAG, Formats, $FLUSH
  $14 FLAG, Formats, $CHANGED
  $15 INT (1-byte), $FORTYPE
  $16 STRING, $FORMAT
  $17 FLAG (byte), $ELSE
  $18 FLAG (bit), $DEBUG
  $19 STRING, USERPROC, $VELEM
  $20 INT (4-byte), $RESULT
  $21 STRING, Site Dependent, $MACHINE
  $22 HEX, $GETXPATH
  $23 STRING, Prism, $PRISMID
  $24 STRING, $PROGRAM

Type CHAR is a fixed-length STRING, and may be returned with trailing blanks. Conversion to STRING will delete those blanks. Except for $08, all variables declared in a particular structure are stored in the same manner within the system, and return the same type. But $08 can return different types because all of these variables are dynamically constructed. The Structure/Element number is what identifies each variable, and determines how it should be retrieved.
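
The structure list and the CHAR-to-STRING rule can be restated as a small Python sketch. The table simply copies the types from the list above; the names STRUCTURE_TYPES and char_to_string, and the sample account value, are hypothetical and exist only for this illustration.

```python
# Structure number -> type, copied from the .$1VAR list above.
# None marks $08, whose type is determined dynamically.
STRUCTURE_TYPES = {
    1: "FLAG", 2: "FLAG", 3: "INT", 4: "STRING", 5: "STRING",
    6: "LINE", 7: "CHAR", 8: None, 9: "INT", 10: "STRING",
    11: "FLAG", 12: "FLAG", 13: "FLAG", 14: "FLAG", 15: "INT",
    16: "STRING", 17: "FLAG", 18: "FLAG", 19: "STRING", 20: "INT",
    21: "STRING", 22: "HEX", 23: "STRING", 24: "STRING",
}


def char_to_string(value):
    """Model CHAR -> STRING conversion: trailing blanks are deleted."""
    return value.rstrip(" ")


# A CHAR variable such as $ACCT may come back blank padded.
padded = "GG.SPI  "          # hypothetical value, for illustration only
trimmed = char_to_string(padded)
```

Only trailing blanks are removed; leading or embedded blanks are left alone.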

As you can see above, there are several INT, STRING and FLAG groups. That's because each group contains variables all stored the same way. For example, STRING variables can be stored as follows:

  1.  INTEGER pointer -> one-byte length, followed by the string.
  2.  INTEGER pointer -> halfword length, followed by the string.
  3.  One-byte length, and INTEGER pointer -> string.
  4.  Halfword length, and INTEGER pointer -> string.

In all of the examples above, anything before the arrow (->) is within a DSECT at a known location. A pointer can point anywhere.
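
The four layouts can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: memory is modeled as a byte array, pointers are 4-byte big-endian offsets into it, a length and its pointer sit next to each other with no alignment padding, and the function names are hypothetical. The real retrieval code is PL360 in PARSEM.TXT.

```python
import struct


def read_form1(mem, dsect_off):
    """Form 1: pointer in the DSECT -> one-byte length, then the string."""
    (ptr,) = struct.unpack_from(">I", mem, dsect_off)
    length = mem[ptr]
    return bytes(mem[ptr + 1:ptr + 1 + length])


def read_form2(mem, dsect_off):
    """Form 2: pointer in the DSECT -> halfword length, then the string."""
    (ptr,) = struct.unpack_from(">I", mem, dsect_off)
    (length,) = struct.unpack_from(">H", mem, ptr)
    return bytes(mem[ptr + 2:ptr + 2 + length])


def read_form3(mem, dsect_off):
    """Form 3: one-byte length and pointer in the DSECT; pointer -> string."""
    length = mem[dsect_off]
    (ptr,) = struct.unpack_from(">I", mem, dsect_off + 1)
    return bytes(mem[ptr:ptr + length])


def read_form4(mem, dsect_off):
    """Form 4: halfword length and pointer in the DSECT; pointer -> string."""
    length, ptr = struct.unpack_from(">HI", mem, dsect_off)
    return bytes(mem[ptr:ptr + length])


# Build a toy memory image holding one example of each form.
mem = bytearray(64)
struct.pack_into(">I", mem, 0, 16)        # form 1 pointer at offset 0
mem[16:22] = b"\x05HELLO"                 # one-byte length + string
struct.pack_into(">I", mem, 4, 32)        # form 2 pointer at offset 4
struct.pack_into(">H", mem, 32, 5)        # halfword length
mem[34:39] = b"HELLO"
mem[8] = 5                                # form 3 length at offset 8
struct.pack_into(">I", mem, 9, 17)        # form 3 pointer -> string bytes
struct.pack_into(">HI", mem, 40, 5, 17)   # form 4 length + pointer
```

In forms 1 and 2 the length travels with the string, so the DSECT holds only a pointer; in forms 3 and 4 the length stays in the DSECT and the pointer addresses the string bytes directly.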

You can see a complete list of system variables by doing the following:

  -> select system variables
  -> show elements

Yes, it is that simple. The SYSTEM VARIABLES subfile is .$1VAR of $DATA. By the way, record-names that begin with a dot are pseudo-record-types. If any records are added to such record-types, they exist in the DEFQ only.

3.2 SPIRES Variables -- Implementation

SPIRES Variables are implemented as follows. Start by determining how the variable will be set, where that setting will be stored, and how it will be retrieved. You have to SELECT FILEDEF, TRANSFER $DATA, and add your new $variable name at the end of one of the structures in .$1VAR's record definition. You can add alternate spellings for your variable using the ALIAS statement. You then update the file definition, recompile it, and dump characteristics:

   -> master
   -> select filedef
   -> transfer $DATA
   -> ..edit       <--- you add your variable name as described below.
   -> update
   -> /recompile $key
   -> select system variables
   -> dump char to VARS.OBJ rep entry VARSTABS build

It is simplest to implement Prism variables ($23 structure). They can be set using their ELEM name, which must appear in the PRISMVAR table in COMSPI.TXT as a one-byte length followed by a quoted name, like "CGINAME" shown below.

 % diff old.DATA.FDEF DATA.FDEF
 424a425    ; $23 appendage
 >      ELEM = CGINAME;

 % diff old.COMSPI.TXT COMSPI.TXT
 1226c1226  ; PRISMLEN change
 <             64,36,0,0,0,0,0,0);
 ---
 >             64,36,16,0,0,0,0,0);
 1241c1241  ; PRISMNAM change
 <             0);   |-- Add here & in $DATA ST23 --|
 ---
 >             7, "CGINAME",   0);  |-- Add here & in $DATA ST23 --|

PRISMLEN contains the maximum length of the STRING, and PRISMNAM has the variable name's length followed by the quoted variable's name. You can set such variables with a SET command in SPIRES, or with an EVAL $SET(...) function call. You can reference any $variable by its ELEM or ALIAS name.

 -> set cginame = 'CGI-BIN'
 -> eval $set(cginame, $syseval('ENV:REQUEST_URI'))

SET only allows a simple string, while EVAL $SET allows an expression.
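
The PRISMNAM layout described above (a one-byte length followed by the name) can be sketched in Python. This is an illustration only: the diff suggests the table closes with a zero byte, and the sketch assumes so; it also assumes ASCII text for readability, and the function names are hypothetical. The maximum length of 16 is the value added to PRISMLEN in the diff.

```python
def parse_name_table(table):
    """Walk length-prefixed names until a zero length byte (assumed terminator)."""
    names = []
    pos = 0
    while pos < len(table) and table[pos] != 0:
        length = table[pos]
        names.append(table[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return names


def fits(value, max_len):
    """True if a value is within the PRISMLEN-style maximum length."""
    return len(value) <= max_len


# One entry, as in the diff: 7, "CGINAME", then the closing 0.
table = bytes([7]) + b"CGINAME" + bytes([0])
names = parse_name_table(table)
ok = fits("CGI-BIN", 16)
```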

The next simplest to define are the $08 variables that are dynamically constructed. You should look at PARSEM.TXT near the 'ST8:' label to see what has been done already. Examples of dynamically constructed variables include $DATE, $TIME, $UDATE, $UTIME, etc.

FLAG variables are the next simplest, if there already is some BYTE somewhere with an available bit, and you need a FLAG variable. You can use $18 and modify V18 in PARSEM.TXT giving the address of the byte and what bit position to test. Again, it is best to examine other V18 bit-flags to see where they are located and how they are set. You may have to make SPIRES.BNF changes to define the commands for re/setting your bit-flag.

Almost anything else will require extensive modification and recompilation. Adding anything to R11DSECT.TXT to hold new items has a ripple effect. Both R11DSECT.TXT and "r11dsect.h" must be kept in sync, and that's very difficult to accomplish because a new SPIRES needs a new Emulator, at the same time. All source code that refers to either R11DSECT.TXT (pl360) or "r11dsect.h" (c) must be recompiled and relinked. Unfortunately, old TRAP-versions of SPIRES/SPIBILD won't work with a new Emulator, or vice-versa.

GETVAR in PARSEM.TXT is where $variables are retrieved. Variables can return different types. There are INT, HEX, PACK, FLAG, REAL, LINE, CHAR and STRING types. CHAR is a fixed-length string, which may be blank padded. There are special types infrequently used. They involve TRIPLES, and $ANY is an example.

Variables always return a 'type' in R3, an 'address' or 'value' in R2, and a 'length' in R1. INT and LINE types return a 'value' in R2, not an 'address'. STRING, CHAR, HEX, PACK and FLAG return an 'address' in R2. There are no REAL $variables, but if there were, they would return a REAL value in F01, and R2 would be meaningless.
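
The register convention can be summarized in a few lines of Python. This is only a restatement of the paragraph above, not the GETVAR code; the function name r2_meaning is hypothetical.

```python
VALUE_TYPES = {"INT", "LINE"}                            # R2 holds the value
ADDRESS_TYPES = {"STRING", "CHAR", "HEX", "PACK", "FLAG"}  # R2 holds an address


def r2_meaning(var_type):
    """Say what R2 holds for a given variable type (R3 = type, R1 = length)."""
    if var_type in VALUE_TYPES:
        return "value"
    if var_type in ADDRESS_TYPES:
        return "address"
    if var_type == "REAL":
        return "meaningless"   # a REAL would be returned in F01 instead
    raise ValueError("unknown variable type: " + var_type)
```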

3.3 SPIRES Variables -- Installation

SPIRES Variables are installed into SPIRES/SPIBILD as follows. You've already created the VARS.OBJ when you dumped characteristics from .$1VAR of the $DATA file. If you created your variable in COMSPI.TXT as a Prism Variable, then you must recompile both SPIRES and SPIBILD to get object decks. If you created your variable within PARSEM.TXT, then you must compile PARSEM to get all remaining object decks.

In SPIRES, you should do the following:

   -> set xeq spiproto
   -> ..compall pl360.parsem
   -> ..compall pl360.spires
   -> ..compall pl360.spibild

You must now move your VARS.OBJ, from where it was stored when it was created by the dump char command, into the "obj" directory used for linking new versions of SPIRES/SPIBILD. An old version of VARS.OBJ should already be in your spisrc/obj directory. Your new VARS.OBJ should be in the spisys directory, the same place all GG.SPI files are located. Replace the old with the new as follows (Mac OSX example):

   $ cd
   $ mv spisys/VARS.OBJ spisrc/obj/.

In SPIRES, you should now do the following:

   -> set xeq spiproto
   -> ..emlink

This should move any object decks created by compiling PL360 code, and then link all versions of SPIRES and SPIBILD. The new versions will be in the spisrc/obj directory.

And be sure you have "compall" and "emlink" in spiproto.

3.4 SPIRES Variables -- Documentation

You must document your variable. There are several places where such documentation exists. First is within .$1VAR of the $DATA file. Second is within PARSEM.TXT or COMSPI.TXT, and possibly more.

You should update the FILEDEF.BI text file in spisrc/bi to reflect the ELEM (and ALIAS) statements you added to the $DATA file. FILEDEF.BI is a seed file used to build $DATA from scratch.

You should then create a new record in the PROTOCOLS MANUAL subfile. For Prism variables, update the PRISM APPLICATIONS MANUAL subfile. These are SPIKE-files, and are easiest to work with using the ..spike protocol defined in both SPIPROTO and SPIKE PROTOCOLS. If you have SET XEQ to either SPIPROTO or SPIKE PROTOCOLS, then you can use ..spike to select a SPIKE-file. There is a SPIKE.INDEX in the GQ.DOC directory that lists all the SPIKE-files accessible using the ..spike protocol. For our purposes, this is enough:

   -> set xeq spike protocols
   -> ..spike proto
   . . or . .
   -> ..spike prism app

All currently defined variables are documented in these SPIKE-files. You see this when you EXPLAIN any existing variable. The header of the explanation indicates which section of the PROTOCOLS MANUAL or the PRISM APPLICATIONS MANUAL contains the documentation.

The -XP and -KT lines define the explainable terms. Usually, the first -XP phrase is the key phrase. All aliases follow, with semi-colon separations.

Once you have created a new section in the PROTOCOLS MANUAL documenting your variable, you must update the EXPLAIN subfile. This is done with the ..spikexp protocol, which takes the same argument as the ..spike protocol.

   -> set xeq spike protocols
   -> ..spikexp proto

At this point you create a result or stack of sections to be output to the EXPLAIN subfile, or use Global-For to create a display of records. [See 2.4.]

When CGINAME was added as a Prism variable, section 17 of the PRISM APPLICATIONS MANUAL was updated to add this variable to the existing list.

3.5 SPIRES Variables -- Distribution

Once you've created SPIRES/SPIBILD by linking VARS.OBJ and any object decks associated with variables defined by PL360 code, you have a complete implementation. You simply copy all versions of SPIRES/SPIBILD to other sites.

You do NOT have to migrate the $DATA file definition unless you want the other site(s) to have a $DATA file. If no one is going to SELECT SYSTEM VARIABLES, you don't need $DATA. Of course you need it at the site where you created the variable.

8 The Host Language Interface (HLI)

HLI is not defined in the Unix/Linux environments. The historic mainframe description is covered in the DEVICE manual.

9 UPDCLOSE Processing: INCLOSE

9.1 Intermediate Form

UPDCLOSE is called after all TRAMERGE processing is done. The record is complete except for the application of the INCLOSE rules. UPDCLOSE operates on the Intermediate form of the record, which has the following organization in core:

      +---------------+  <-- VALORG
      |  Value region |
      +---------------+  <-- CVE
      |               |
      |    (space)    |
      |               |
      +---------------+  <-- TABORG
      |   Entries 1   |
      +---------------+  <-- CTE
      |               |
      |    (space)    |
      |               |
      +---------------+  <-- CTO
      |   Entries 2   |
      +---------------+  <-- TABEND

The region from VALORG to TABORG is determined by SUPERVAL. The region from TABORG to TABEND is determined by SUPERMAX. When UPDCLOSE is called, all entries are in Entries 1. The form of an entry is:

      +-------+-------+--------------+--------------+
      | ST/EL |  SEQ  |     LENGTH   |     DISP     |
      +-------+-------+--------------+--------------+

Each entry is 12 bytes long, with LENGTH and DISP taking full words. ST/EL is the Structure/Element number of the element. SEQ contains an occurrence number and control bits. The top bit of SEQ determines whether DISP points to values in the VALORG region or to other entries in the TABORG through TABEND region.
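
As a rough illustration, the entry can be packed and unpacked with Python's struct module. The sketch assumes big-endian byte order and that ST/EL and SEQ are one halfword each, which is what the diagram and the 12-byte total suggest but the text does not state outright. It reports the top bit of SEQ without interpreting it, since the polarity is not given here. The names ENTRY and unpack_entry are hypothetical.

```python
import struct

# Assumed layout: halfword ST/EL, halfword SEQ, fullword LENGTH, fullword DISP.
ENTRY = struct.Struct(">HHII")
TOP_BIT = 0x8000


def unpack_entry(raw):
    """Split one 12-byte entry into its four fields plus the SEQ top bit."""
    st_el, seq, length, disp = ENTRY.unpack(raw)
    return {
        "st_el": st_el,
        "seq": seq,
        "top_bit": bool(seq & TOP_BIT),  # selects what DISP points to
        "length": length,
        "disp": disp,
    }


raw = ENTRY.pack(0x0102, 0x8001, 5, 48)   # made-up sample values
entry = unpack_entry(raw)
```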

9.2 Structure Processing Dsect

The structure processing dsect used by UPDCLOSE has the following form:

      +---------------+
      |     FATL      |
      +---------------+
      |     NXTL      |
      +---------------+
      |     LSTL      |
      +---------------+
      |     NXTH      |
      +---------------+
      |     LSTH      |
      +-------+-------+
      |  CEL  |  LEL  |
      +---+---+-------+
      | F | S |       |
      +---+---+-------+
      |     SLOC      |
      +---------------+

F is a flags byte called GENVFLG, and S is a byte called STRNO containing the current structure number. CEL and LEL are half-words containing the current and "last+1" element numbers within the current structure. All the other terms are full-words.
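
The dsect layout can be mirrored in Python for reading dumps or experimenting. This is a sketch under assumptions: big-endian byte order, and the two unnamed bytes after F and S in the diagram treated as padding. The names Dsect, DSECT and unpack_dsect are hypothetical.

```python
import struct
from collections import namedtuple

Dsect = namedtuple(
    "Dsect", "FATL NXTL LSTL NXTH LSTH CEL LEL GENVFLG STRNO SLOC"
)

# 5 fullwords, 2 halfwords, 2 bytes, 2 pad bytes (assumed), 1 fullword.
DSECT = struct.Struct(">5I2H2B2xI")


def unpack_dsect(raw):
    """Decode one structure processing dsect into named fields."""
    return Dsect._make(DSECT.unpack(raw))


# Made-up values, for illustration only.
raw = DSECT.pack(0, 24, 96, 960, 960, 1, 3, 0, 2, 36)
d = unpack_dsect(raw)
```

With these assumptions the dsect occupies 32 bytes per level.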

9.2.1 FATL

FATL contains the relative location of the "father" entry that points to this structure. FATL for one level is SLOC for the previous level.

9.2.2 NXTL

NXTL contains the relative location of the last occurrence of CEL in the Entries 1 section. "Last" here means the entry at the highest core address, since entries are built and sorted as follows:

      +------------+ <-- DISP of entry at FATL
      | Elem 2 (2) |
      +------------+
      | Elem 2 (1) |
      +------------+
      | Elem 1 (3) |
      +------------+
      | Elem 1 (2) |
      +------------+ <-- SLOC  (working on 2nd occ)
      | Elem 1 (1) |
      +------------+ <-- NXTL  (For CEL = 1)
      | Elem 0 (2) |
      +------------+
      | Elem 0 (1) |
      +------------+ <-- LSTL  (last elem, section 1)

NXTL (like all others) points beyond the occurrence. When TABORG-IMRS is used as a base, these vectors point at the ST/EL field of the occurrence.

9.2.3 LSTL

LSTL contains the relative location of the end of the structure still remaining in section 1. If portions of a structure are moved to section 2, LSTL is adjusted.

9.2.4 NXTH

NXTH contains the relative location of where in section 2 a portion from section 1 begins. If portions of a structure are moved to section 2, NXTH minus the length moved becomes the new NXTH, and is where the moved portion is placed.

9.2.5 LSTH

LSTH contains the relative location of the end of a portion in section 2. LSTH-NXTH is the length of the portion in section 2.
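
The NXTH and LSTH arithmetic can be shown with a small worked example in Python. The offsets are invented, and the sketch assumes LSTH stays fixed while portions are moved into section 2, which the text does not state explicitly. The function names are hypothetical.

```python
def move_to_section2(nxth, moved_length):
    """New NXTH after moving a portion: NXTH minus the length moved."""
    return nxth - moved_length


def section2_length(nxth, lsth):
    """Length of the portion currently held in section 2."""
    return lsth - nxth


# Section 2 starts empty (NXTH == LSTH), then three 12-byte entries move in.
nxth, lsth = 960, 960
nxth = move_to_section2(nxth, 3 * 12)
portion = section2_length(nxth, lsth)
```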

9.2.6 SLOC

SLOC contains the relative address of the current entry being operated upon.

12 Partial Record Processing: Partial FOR

12.1 Introduction

The purpose of this paper is to briefly document all the work which has been done so far concerning Partial Processing. The main emphasis will be on the commands available and how they interact with each other and the user. Any restrictions or limitations will be pointed out as appropriate.

12.2 Current Capabilities

While it is possible with whole record modification to add or remove partial data from records, the whole record must be "unpackaged" and presented to the user and then entirely "repackaged" after the modifications are made. This is costly, particularly if only a small part of the overall record is involved. Furthermore, CRTs and other fixed-dimension devices do not have the text editor's practically unlimited capacity; hence, the capability must exist for the presentation and manipulation of records in a piecemeal, or partial fashion.

12.3 Concept of Partial Processing

Global-FOR allows linear traversal of a set of records by means of commands such as SKIP, DISPLAY, REMOVE, etc. The set of added records (FOR ADDS) is distinctly different from the set of updated records (FOR UPDATES). Each set of records could be thought of as multiple occurrences of a structural element, and those occurrences constitute a single level of control called the record level. But other levels exist due to the hierarchical nature of records. Structural elements within records form subtrees, and the REFERENCE command provides access to these subtrees by means of Partial-FOR commands. Such accessing is called Partial Processing.

12.4 Record Level Commands

At the record level, the following commands establish what is called a "referenced record":

REFerence [key-value]  [NOUPDate]
REFerence [occurrence] [NOUPDate]
GENerate REFerence

The two forms of the REFERENCE command establish a referenced record by retrieving an existing goal record from the data base. If the NOUPDATE option is given, the UPDATE command (shown below) is blocked, and secure-switch 10 record locking is not performed. The GENERATE REFERENCE command establishes an empty referenced record.

At the record level, a referenced record may be returned to the data base (updating the DEFQ) by the following commands:

UPDate [key-value] [CLEAR]
ADD [CLEAR]

The key-value in the UPDATE command is not needed if the referenced record was established by a REFERENCE command, and the key in the record being updated is the same as the original key.

The CLEAR option on UPDATE or ADD may be either CLEar or CLR. If CLEAR is not specified, the referenced record is retained. This allows the user to do further modification of the referenced record, including its key, so that another ADD or UPDATE can be done using the same basic referenced record. When CLEAR is specified, and the UPDATE or ADD command succeeds, the referenced record is released.

A referenced record may be released at any time by the command:

CLEAR REFerence

12.5 Record Navigation

Once a referenced record is established, Partial-FOR commands may refer to data elements, beginning with those defined at the record level; and once Partial-FOR has chosen an element, other partial processing commands can be used to manipulate the chosen element. One of those partial processing commands is another form of the REFERENCE command, and if Partial-FOR has chosen a structural element, the REFERENCE of an existing occurrence of that structure causes Partial-FOR to then refer to the data elements defined for that structure. Therefore, the structural hierarchy of a referenced record may be traversed by nested pairings of Partial-FOR against a structural element, and REFERENCE of an existing occurrence of that structure.

The general form of the Partial-FOR command is:

FOR <element> [WHERE clause]

The "For element" command provides the basic mechanism for all other partial processing. It specifies either a structural or simple data element, and the optional WHERE clause can specify criteria to be applied in locating occurrences of that element. If the element is a simple data element, the WHERE clause may only refer to that element. Therefore, the special form

FOR <element> <relation> <value>

may be used as a shorthand for

FOR <element> WHERE <element> <relation> <value>

Note that for a structural element, the WHERE clause may refer to any element contained at any level within the structure.
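
The shorthand can be illustrated as a plain string rewrite in Python. This is not how SPIRES parses the command; it is a hypothetical helper (expand_for) that only shows the equivalence between the two forms, using made-up element names and values.

```python
def expand_for(command):
    """Rewrite 'FOR elem rel value' as 'FOR elem WHERE elem rel value'."""
    parts = command.split(None, 2)
    if len(parts) < 3 or parts[0].upper() != "FOR":
        return command                      # nothing to expand
    verb, element, rest = parts
    if rest.split(None, 1)[0].upper() == "WHERE":
        return command                      # already in the long form
    return "{} {} WHERE {} {}".format(verb, element, element, rest)


long_form = expand_for("FOR B = B1111")
```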

Partial-FOR may only refer to a particular set of data elements, those at the last referenced structural level. If no Partial-FOR commands are in effect for a referenced record, Partial-FOR may refer to only record level elements. The SHOW REFERENCE ELEMENTS command lists all the primary element names that may be specified in a subsequent Partial-FOR command. The form of the command is:

SHOw REFerence ELEments

As stated earlier, it is possible to move down several levels in the structural hierarchy of a referenced record by nested pairings of Partial-FOR and REFERENCE against structural elements.

The ENDFOR command is used to back up one or more levels. When used with Partial-FOR, the general form of this command is:

ENDFor [element]

ENDFOR cancels Partial-FOR commands, and if no element name is given, the Partial-FOR associated with the current level is cancelled (moving back up one level). Otherwise, the specified element name must correspond to one of the active Partial-FOR commands, and all levels up through that Partial-FOR are cancelled.

A condition known as a "referenced state" exists immediately following either the ENDFOR of a Partial-FOR, or the REFERENCE of an existing occurrence of a structural element specified by Partial-FOR, or the establishment of a referenced record by the record-level REFERENCE or GENERATE REFERENCE command. When a referenced state exists, Partial-FOR commands may only refer to data elements at the next level in the hierarchy. Otherwise, Partial-FOR commands may replace themselves at the current level since they may only refer to elements at that level. Therefore, a referenced state shifts by one level the allowed set of data elements that Partial-FOR may specify. A Partial-FOR command issued when a referenced state exists moves to the next level in the hierarchy; but when a referenced state does not exist, Partial-FOR commands remain at the current level.
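
The interplay of Partial-FOR, REFERENCE, ENDFOR and the referenced state can be modeled with a small Python class. It is a simplified sketch: it tracks only the stack of active Partial-FOR elements and one flag, does not check whether an element is structural, and ignores the other commands that cancel a referenced state. The class, its method names, and the element names S, T, V, W and H are all illustrative.

```python
class PartialFor:
    def __init__(self):
        self.levels = []        # active Partial-FOR elements, outermost first
        self.referenced = True  # record-level REFERENCE gives a referenced state

    def for_element(self, name):
        if self.referenced:
            self.levels.append(name)   # move to the next level down
        else:
            self.levels[-1] = name     # replace the FOR at the current level
        self.referenced = False

    def reference(self):
        """REFERENCE of an existing occurrence of a structural element."""
        self.referenced = True

    def endfor(self, name=None):
        if not self.levels:
            raise ValueError("no Partial-FOR in effect")
        if name is None:
            self.levels.pop()          # cancel the current level only
        else:
            if name not in self.levels:
                raise ValueError("no active Partial-FOR for " + name)
            # cancel every level up through the innermost FOR on that element
            idx = len(self.levels) - 1 - self.levels[::-1].index(name)
            del self.levels[idx:]
        self.referenced = True         # ENDFOR leaves a referenced state


p = PartialFor()
p.for_element("S")     # record level -> S
p.reference()
p.for_element("T")     # nested: S -> T
p.for_element("V")     # no referenced state, so V replaces T
p.reference()
p.for_element("W")     # nested: S -> V -> W
p.endfor("V")          # cancels W and V, back at S
p.for_element("H")     # referenced state after ENDFOR: S -> H
```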

12.6 Partial Processing Commands

These commands provide manipulative control over the element named in the preceding FOR command:

FOR **
FOR * [WHERE clause]
SKIp [Occurrence] [END clause]
REFerence [Occurrence] [END clause]
DISplay [Range] [END clause]
TRAnsfer [Range] [END clause] [CLEAR]
UPDate [Range] [USING line range] [END clause]
MERge [Range] [USING line range] [END clause]
REMove [Range] [END clause]
ADD [AFTER] [USING line range]
ADD BEFORE [USING line range]
SET REFerence ELEments [element-list]

The optional Occurrence or Range specifies which occurrence (or occurrences) of an existing FOR element should be processed. If the FOR command specified a WHERE clause, then only occurrences which meet the criteria are considered for processing, although every existing occurrence in any range is examined. Any WHERE clause is ignored for the ADD commands.

The allowed Occurrence or Range specifications are:

*             the "current" occurrence, which is the one
              last processed by a preceding command.
First         the first occurrence
Last          the last occurrence
NEXT          the next occurrence.
<Number>      a specific number (not 0) starting with NEXT.
REMaining     those remaining (NEXT through LAST inclusive).
REST          (same as REMAINING)
ALL           all occurrences (FIRST through LAST inclusive).

When a "FOR element" command is issued, the following conditions exist: the NEXT occurrence is the FIRST; no occurrences have been processed, so the * specification is not allowed; and a referenced state does not exist. The "FOR **" command reinstates these conditions. It cancels any referenced state, and sets NEXT equal to FIRST. On the other hand, the "FOR *" command retains any established referenced state, and does not reset NEXT back to FIRST. It only eliminates any previous WHERE clause, and establishes a new WHERE clause if one is given with the command. FOR * allows you to change WHERE clauses in the middle of processing.

For most partial processing commands, if no Occurrence or Range is specified, NEXT is assumed. The * Occurrence or Range specification is not allowed following any FOR command until some other occurrence specification has been given (including the default NEXT). It also cannot be used if the last occurrence processed was a removed occurrence (REMOVE, MERGE, and UPDATE can do "removal").
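
The Occurrence and Range specifications can be sketched as a function that returns the occurrence numbers selected. This is an illustration under assumptions: occurrences are numbered from 1, no WHERE clause is in effect, only full keywords are accepted (no abbreviations), and a numeric specification is treated as a count starting at NEXT, as for DISPLAY. An empty result stands for the END condition. The function name resolve_range is hypothetical.

```python
def resolve_range(spec, count, next_pos, current=None):
    """Return the 1-based occurrence numbers selected by spec.

    count    -- number of existing occurrences
    next_pos -- occurrence number of NEXT
    current  -- occurrence last processed, or None if * is not valid
    """
    spec = str(spec).upper()
    if spec == "*":
        if current is None:
            raise ValueError("* is not allowed until an occurrence is processed")
        return [current]
    if spec == "FIRST":
        return [1] if count else []
    if spec == "LAST":
        return [count] if count else []
    if spec == "NEXT":
        return [next_pos] if next_pos <= count else []
    if spec in ("REMAINING", "REST"):
        return list(range(next_pos, count + 1))
    if spec == "ALL":
        return list(range(1, count + 1))
    number = int(spec)
    if number <= 0:
        raise ValueError("<Number> must not be 0")
    last = min(next_pos + number - 1, count)
    return list(range(next_pos, last + 1))
```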

The optional END clause provides command processing whenever no occurrence can be found to process. The general form of the END clause is:

END (=) <Command>

where <Command> is either a single command verb like RETURN or is an enclosed string like 'JUMP LABEL'.

Whenever an occurrence cannot be found, an END condition is signaled. Such a condition is cleared by an END clause, by an ENDFOR command if in XEQ mode, or else automatically. If an END condition is not cleared, no other partial processing or FOR commands are accepted until an ENDFOR command is processed.

The partial processing commands operate as follows:

SKIP -- This command positions to the specified occurrence of the current level Partial-FOR element. The <Number> option is interpreted as a specific position beginning with NEXT=1. Thus, SKIP 1 (or just SKIP) is equivalent to SKIP NEXT, and SKIP 3 means to position to the 3rd occurrence, which would be at NEXT+2. ALL and REST (or REMAINING) are invalid occurrence specifications and are interpreted as FIRST and NEXT respectively. An END condition occurs if the specified occurrence cannot be reached. If * is a valid occurrence specification, SKIP * does nothing except eliminate a referenced state.

REFERENCE -- This command, like SKIP, positions to the specified occurrence of the current level Partial-FOR element. The ALL, REST (or REMAINING), and <Number> specifications operate as for SKIP. An END condition occurs if the specified occurrence cannot be reached. REFERENCE of a non-structural element acts the same as SKIP. REFERENCE of a structural element establishes the condition known as a referenced state. Such a state permits Partial-FOR commands to specify elements at the structure's level, and is the only mechanism for nesting Partial-FOR commands. If any Partial Processing command (SKIP, REMOVE, DISPLAY, MERGE, TRANSFER, ADD, UPDATE, or FOR **) follows a REFERENCE command, then the referenced state is eliminated and further nesting of FOR commands on elements of that structure can only occur after another REFERENCE command is issued. As FORs are nested by issuing appropriate REFERENCE and FOR commands, the SHOW REFERENCE ELEMENTS command may be used in debugging (usually in combination with SHOW LEVELS) to ascertain which level has been reached. If * is a valid occurrence specification, REFERENCE * does nothing except establish a referenced state if the current level Partial-FOR element is a structural element.
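
The positioning rule for SKIP with a <Number> can be sketched in Python. The sketch uses 0-based list indexes and returns None to stand for an END condition. Counting only the occurrences that satisfy a WHERE clause is an assumption drawn from the general rule that non-qualifying occurrences are examined but not processed. The function name and sample values are illustrative.

```python
def skip(occurrences, next_index, number=1, where=None):
    """Return the index SKIP positions to, or None for an END condition."""
    matched = 0
    for i in range(next_index, len(occurrences)):
        # every occurrence is examined; only qualifying ones are counted
        if where is None or where(occurrences[i]):
            matched += 1
            if matched == number:
                return i
    return None


values = ["B1111", "B1112", "B1121", "B1122"]
third = skip(values, 0, 3)                                   # NEXT+2
filtered = skip(values, 0, 2, where=lambda v: v.endswith("2"))
missing = skip(values, 2, 3)                                 # cannot be reached
```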

DISPLAY -- Displays a range of occurrences of the current Partial-FOR element. A <Number> specification indicates the maximum number of occurrences to process. If there are fewer than <Number>, only those which exist are processed, but the minimum is one. The elements displayed can be controlled with the SET REFERENCE ELEMENTS command, described below. If a frame exists with FRAME-TYPE = STRUCTURE, DIRECTION = OUTPUT, USAGE = ALL or DISPLAY, and has a SUBTREE matching the current Partial-FOR structural element, then such a frame would be used by the DISPLAY command to output to any specified "IN area". If the USAGE statement of the frame included the NAMED option, then "USING frame" would have to prefix the DISPLAY command to cause this frame to be used.

SET REFERENCE ELEMENTS -- This form of the SET ELEMENTS command limits the elements shown by an unformatted DISPLAY command under Partial-FOR processing to the elements listed; the plain SET ELEMENTS command will not do this for you. However, SET REFERENCE ELEMENTS affects all subsequent element and record displays until a CLEAR ELEMENTS command or a SET REFERENCE ELEMENTS command with no element list is issued. The SHOW ELEMENTS SET command can be used to determine which elements have been set by SET REFERENCE ELEMENTS.

TRANSFER -- Transfers to the ACTIVE file a range of occurrences of the current Partial-FOR element. A <number> specification operates as for DISPLAY. The TRANSFER command also remembers the starting point and count of occurrences processed for possible subsequent reprocessing by a partial processing UPDATE or MERGE command. If a frame exists with FRAME-TYPE = STRUCTURE, DIRECTION = OUTPUT, USAGE = ALL or TRANSFER, and has a SUBTREE matching the current Partial-FOR structural element, then such a frame would be used by the TRANSFER command to output to any specified "IN area". If the USAGE statement of the frame included the NAMED option, then "USING frame" would have to prefix the TRANSFER command to cause this frame to be used.

UPDATE -- Replaces a range of occurrences of the current Partial-FOR element. A <Number> specification operates as for DISPLAY. If no Range specification is given, NEXT is assumed unless there is a TRANSFER count and starting point in effect, in which case this command processes the same occurrences processed by the TRANSFER command.

MERGE -- Updates a range of occurrences of the current Partial-FOR element. If no Range specification is given, NEXT is assumed unless there is a TRANSFER count and starting point in effect, in which case this command processes the same occurrences processed by the TRANSFER command.

REMOVE -- Removes a range of occurrences of the current Partial-FOR element. A <Number> specification operates as for DISPLAY. The Examined count displayed by SHOW LEVELS is not incremented for each Processed occurrence that is totally removed. If the last occurrence processed is totally removed, then the * position specification becomes invalid. If the element being processed is a constrained structure containing a mixture of updateable elements and non-updateable or invisible elements, then REMOVE will not remove occurrences of the structure but will only remove the updateable portions from each occurrence.

ADD -- Adds new occurrences of the current Partial-FOR element. These new occurrences are added either BEFORE or AFTER any currently existing occurrence at the * position. If the * position is not valid, these occurrences are added such that the first one added would be NEXT if there were no WHERE clause.

12.7 Partial Processing UPDATE and MERGE Capabilities

Both UPDATE and MERGE have unique capabilities in Partial Processing. The first thing these commands do is input one or more occurrences of the Partial-FOR element, possibly via a FRAME-TYPE=STRUCTURE frame with proper USAGE and DIRECTION. These input occurrences usually match the range of occurrences processed by these commands on a one-for-one basis. The first occurrence found to process (according to any WHERE clause criteria) is occurrence number one, the second is occurrence number two, etc. However, it is possible to number the input occurrences such that they don't always match with the processed occurrences. Consider the following:

Levels down -->
KEY
A                 This chart depicts a sample record hierarchy.
S                 KEY, A, and S are record-level elements,  and
. T               S is a structural element consisting of T, V,
. . U             and  H  elements.  All the element names that
. . . B           begin with letters from the first part of the
. . . C           alphabet are simple elements,  and those from
. . D             the last part of the alphabet are structural.
. V
. . E             There are four levels shown,  and  structures
. . W             have  at least  two elements defined at their
. . . F           structural level.  Thus, structure V consists
. . . G           of elements E and W at the next level.
. H

A sample record might look like the following:

KEY = 1;  A(1) = A1;  A(2) = A2;
S(1);
  T(1);
    U(1);  B(1) = B1111;  B(2) = B1112;
    U(2);  B(1) = B1121;  B(2) = B1122;
    D(1) = D111;  D(2) = D112;
  T(2);
    U(1);  B(1) = B1211;  B(2) = B1212;
    U(2);  B(1) = B1221;  B(2) = B1222;
    D(1) = D121;  D(2) = D122;
  V(1);
    E(1) = E111;  E(2) = E112;
    W(1);  F(1) = F1111;  F(2) = F1112;
    W(2);  F(1) = F1121;  F(2) = F1122;
  V(2);
    E(1) = E121;  E(2) = E122;
    W(1);  F(1) = F1211;  F(2) = F1212;
    W(2);  F(1) = F1221;  F(2) = F1222;
  H(1) = H11;  H(2) = H12;
S(2);
  T(1);
    U(1);  B(1) = B2111;  B(2) = B2112;
    U(2);  B(1) = B2121;  B(2) = B2122;
    D(1) = D211;  D(2) = D212;
  T(2);
    U(1);  B(1) = B2211;  B(2) = B2212;
    U(2);  B(1) = B2221;  B(2) = B2222;
    D(1) = D221;  D(2) = D222;
  V(1);
    E(1) = E211;  E(2) = E212;
    W(1);  F(1) = F2111;  F(2) = F2112;
    W(2);  F(1) = F2121;  F(2) = F2122;
  V(2);
    E(1) = E221;  E(2) = E222;
    W(1);  F(1) = F2211;  F(2) = F2212;
    W(2);  F(1) = F2221;  F(2) = F2222;
  H(1) = H21;  H(2) = H22;

All elements are shown with an occurrence number in parentheses. That was done by declaring PRIV-TAG numbers on each element, and then assigning those numbers to either CONSTRAINT or NOUPDATE. To conserve space, multiple occurrences of the simple elements were placed on the same line. The values of the simple elements show the complete occurrence path leading to them. Thus, the value B1212 represents the occurrence of B given by the path: S(1); T(2); U(1); B(2);.

Now consider the following sequence of commands:

REFERENCE 1;  establishes referenced record.
FOR S;        we wish to process occurrences of S.
REFERENCE;    NEXT = FIRST by default at this point.
FOR T;        we wish to process T's in 1st S.
TRANSFER ALL; T's in 1st S to the ACTIVE file.
UPDATE;       same occurrences processed by TRANSFER.

The output by TRANSFER might look like the following:

T(1);
  U(1);  B(1) = B1111;  B(2) = B1112;
  U(2);  B(1) = B1121;  B(2) = B1122;
  D(1) = D111;  D(2) = D112;
T(2);
  U(1);  B(1) = B1211;  B(2) = B1212;
  U(2);  B(1) = B1221;  B(2) = B1222;
  D(1) = D121;  D(2) = D122;

If everything in T is updateable, then the UPDATE command would "replace" these same two occurrences by whatever is input as the replacement. But consider the following input:

T(2);
  U(2);  B(2) = B<1222>;  B(3) = B<1223>;
  D(2) = D<122>;
T(3);
  U(1);  B(1) = B<1311>;

The occurrence numbers shown in the input don't match with those of the occurrences to be processed. T(1) does not occur in the input, so the first occurrence processed will be "removed". T(2) refers to the second occurrence, so that occurrence will be "replaced". T(3) now comes along, and there are no more occurrences to process since we are only updating two (the same ones found by TRANSFER ALL). This and any other extra input occurrences are considered an "addition" to be inserted immediately following the last occurrence processed. Following the UPDATE, if we requested TRANSFER ALL again, we'd get:

T(1);
  U(1);  B(1) = B<1222>;  B(2) = B<1223>;
  D(1) = D<122>;
T(2);
  U(1);  B(1) = B<1311>;

On the surface, it seems as though the UPDATE caused all the occurrences to simply be renumbered. But that would not necessarily always be the case. If there had been three or more occurrences of the T structure, and a WHERE clause on the FOR T command chose non-adjacent occurrences to process, then "removal" of the first and "replacement" of the second could cause a non-processed occurrence to fall between the two processed occurrences. Here is an example, where the original record contains:

T(1);  D(1) = 111;
T(2);  D(1) = 121;
T(3);  D(1) = 131;
T(4);  D(1) = 142;

to be processed by:

FOR T WHERE NOT D STRING 2
UPDATE ALL  ;which chooses T(1) and T(3) only.

with input of:

T(2);  D(1) = <122>;  D(2) = <123>;
T(3);  D(1) = <132>;

which would result in the following:

FOR T      ;no WHERE clause, so we can see all occurrences.
DISPLAY ALL
T(1);  D(1) = 121;
T(2);  D(1) = <122>;  D(2) = <123>;
T(3);  D(1) = <132>;
T(4);  D(1) = 142;

This example serves to illustrate a couple of points. First, notice that none of the final occurrences meet the original WHERE clause criteria. Second, close examination of the result shows that the original T(1) is gone, and that what was the old T(2) is now T(1). The old T(3) is also gone, with the new T(2) taking its place. The new T(3) is an "addition", followed by the old T(4).
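
The renumbering behavior described above can be modeled outside SPIRES. The following Python sketch is only an illustration, not SPIRES code: the function name partial_update and its arguments are hypothetical, and it assumes that everything in each occurrence is updateable.

```python
def partial_update(occurrences, selected, numbered_input):
    """Model of a partial-processing UPDATE (illustrative assumption).

    occurrences    -- existing occurrence values, in record order
    selected       -- 0-based positions chosen by the Range/WHERE clause
    numbered_input -- maps an input occurrence number (1-based, relative
                      to the process range) to its replacement value
    """
    result = list(occurrences)
    removed = []
    for number, pos in enumerate(selected, start=1):
        if number in numbered_input:
            result[pos] = numbered_input[number]   # "replacement"
        else:
            removed.append(pos)                    # "removal"
    # Input numbered beyond the process range is an "addition", placed
    # immediately after the last processed occurrence.
    additions = [numbered_input[n] for n in sorted(numbered_input)
                 if n > len(selected)]
    insert_at = selected[-1] + 1 if selected else len(result)
    result[insert_at:insert_at] = additions
    # Delete from the back so earlier positions stay valid.
    for pos in sorted(removed, reverse=True):
        del result[pos]
    return result


# The WHERE example above: each T is represented by its list of D values.
original = [["111"], ["121"], ["131"], ["142"]]
chosen = [i for i, d in enumerate(original)
          if not any("2" in value for value in d)]     # NOT D STRING 2
updated = partial_update(original, chosen,
                         {2: ["<122>", "<123>"], 3: ["<132>"]})
# updated == [["121"], ["<122>", "<123>"], ["<132>"], ["142"]]
```

The model reproduces the DISPLAY ALL result shown above: the old T(1) disappears, the old T(3) is replaced, and the extra input occurrence lands just before the old T(4).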

In all of the examples thus far there has been an assumption that "everything in T is updateable". But what would happen if an occurrence of T contained non-updateable or invisible elements? The process of "removal" would not eliminate such an occurrence, but would only remove all updateable information, leaving just the non-updateable or invisible portion. Likewise, a "replacement" operation would merge the non-updateable or invisible portion of the old occurrence into the input replacement. All non-updateable material is dropped from the input data before the input is used for either "replacement" or "addition". The basic rule for UPDATE is that all updateable material in the original occurrences is discarded, and the input supplies new updateable material, either as a "replacement" or "addition".

For both UPDATE and MERGE processing, if input occurrences of the current Partial-FOR element are not numbered, they are simply considered to be sequentially numbered from 1 on up. Thus, the following are equivalent inputs:

T; D = <12>;                T(1); D(1) = <12>;
T; D = <3>; D = <4>;        T(2); D(1) = <3>; D(2) = <4>;
T; D(2) = <5>;              T(3); D(2) = <5>;

Warning: Do not mix numbered and unnumbered occurrences of the Partial-FOR element in a single input. In the example above, T was either always numbered or unnumbered. The same warning applies to multiple occurrences of each element that occurs within any particular structural occurrence. In the example above, it would be improper to have something like D(2) = <3>; D = <4>; within the second occurrence of T. Processing results are unpredictable when such mixtures occur.
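
The numbering rule can be stated as a small Python sketch. The function assign_numbers is hypothetical. Raising an error for a mixture is a choice made for this illustration only; the text above says just that the results of a mixture are unpredictable.

```python
def assign_numbers(tags):
    """tags: occurrence numbers as written in the input, None if unnumbered."""
    if all(tag is None for tag in tags):
        # Unnumbered occurrences count sequentially from 1 on up.
        return list(range(1, len(tags) + 1))
    if any(tag is None for tag in tags):
        # Assumption of this sketch: reject what SPIRES leaves unpredictable.
        raise ValueError("mixed numbered and unnumbered occurrences")
    return list(tags)


t_numbers = assign_numbers([None, None, None])   # T; T; T;  ->  [1, 2, 3]
d_numbers = assign_numbers([2])                  # D(2) = <5>;  ->  [2]
```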

It's important to realize that the processed occurrences are always considered to be numbered beginning with 1, regardless of where the range begins within the set of actual occurrences. For example, if there were ten actual occurrences of T, and six of them met WHERE clause criteria, then:

FOR T WHERE <criteria that finds 6 out of 10>
TRANSFER 3;  these would be numbered T(1), T(2), T(3).
TRANSFER 3;  and so would these next three!

Although each range of occurrences begins at a different place within the set of all occurrences, the first occurrence processed by a range is always numbered 1, the second is numbered 2, etc. This rule applies to all commands that specify a Range to process: DISPLAY, TRANSFER, UPDATE, MERGE, and REMOVE. If the input to UPDATE or MERGE has occurrences that fall beyond the process range, then those occurrences are considered "additions" to be placed immediately following the last processed occurrence of the range. All other input corresponds to occurrences within the process range. If an occurrence in the process range doesn't have corresponding input, the UPDATE command recognizes that as a signal to do "removal" processing against that occurrence, while the MERGE command simply "skips" that occurrence, leaving it unchanged.
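
As a rough illustration of the rule just stated, the Python sketch below classifies each occurrence number for a process range of a given size. The function classify and the action labels are hypothetical names chosen for this sketch; only positive occurrence numbers are considered.

```python
def classify(range_size, input_numbers, command):
    """Map each occurrence number to the action UPDATE or MERGE takes."""
    given = set(input_numbers)
    actions = {}
    for n in range(1, range_size + 1):
        if n in given:
            actions[n] = "replace" if command == "UPDATE" else "merge"
        else:
            # No corresponding input: UPDATE removes, MERGE skips.
            actions[n] = "remove" if command == "UPDATE" else "skip"
    for n in sorted(given):
        if n > range_size:
            actions[n] = "add"      # beyond the process range
    return actions


update_plan = classify(2, [2, 3], "UPDATE")
# update_plan == {1: "remove", 2: "replace", 3: "add"}
merge_plan = classify(2, [2, 3], "MERGE")
# merge_plan == {1: "skip", 2: "merge", 3: "add"}
```

The two calls correspond to a process range of two occurrences with input numbered T(2) and T(3), as in the UPDATE example earlier in this section.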

The basic rule for MERGE is that updateable material in the original occurrences is retained, unless the input supplies new updateable material, either as a "replacement" or "addition". MERGE can also indicate "removal" by specifying a negative occurrence number in the input. Consider the first example given for UPDATE, but this time for MERGE:

REFERENCE 1;  establishes referenced record.
FOR S;        we wish to process occurrences of S.
REFERENCE;    NEXT = FIRST by default at this point.
FOR T;        we wish to process T's in 1st S.
TRANSFER ALL; T's in 1st S to the ACTIVE file.
MERGE;        same occurrences processed by TRANSFER.

The output by TRANSFER might look like the following:

T(1);
  U(1);  B(1) = B1111;  B(2) = B1112;
  U(2);  B(1) = B1121;  B(2) = B1122;
  D(1) = D111;  D(2) = D112;
T(2);
  U(1);  B(1) = B1211;  B(2) = B1212;
  U(2);  B(1) = B1221;  B(2) = B1222;
  D(1) = D121;  D(2) = D122;

Again assume that everything in T is updateable, and that the input for the MERGE command was the same as shown for UPDATE:

T(2);
  U(2);  B(2) = B<1222>;  B(3) = B<1223>;
  D(2) = D<122>;
T(3);
  U(1);  B(1) = B<1311>;

The result would be:

T(1);
  U(1);  B(1) = B1111;  B(2) = B1112;
  U(2);  B(1) = B1121;  B(2) = B1122;
  D(1) = D111;  D(2) = D112;
T(2);
  U(1);  B(1) = B1211;  B(2) = B1212;
  U(2);  B(1) = B1221;  B(2) = B<1222>;  B(3) = B<1223>;
  D(1) = D121;  D(2) = D<122>;
T(3);
  U(1);  B(1) = B<1311>;

The first occurrence of T was left unchanged. Within the second occurrence, selective values of B and D were replaced or added. T(3) was an "addition" just like in the UPDATE example.

Under MERGE processing, element(-n); does "removal" processing for the n-th occurrence of the specified element. If the input had specified things like B(-2); or D(-1); then the selected occurrence (positive equivalent) would be removed. That would even be true for the Partial-FOR element itself; thus if the input had been only T(-1); the result of a command like MERGE LAST would be the same as if REMOVE LAST had been done.
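
MERGE behavior, including removal by negative occurrence number, can be modeled in Python along the following lines. This is a simplified sketch under stated assumptions: the names merge_element and partial_merge are hypothetical, every occurrence is assumed to be in the process range (as after TRANSFER ALL with no WHERE clause), each T holds only simple elements, and the value D<131> is invented for the illustration.

```python
def merge_element(values, numbered):
    """Replace or append numbered occurrences of one simple element."""
    merged = list(values)
    for n in sorted(numbered):
        if n <= len(merged):
            merged[n - 1] = numbered[n]    # existing occurrence replaced
        else:
            merged.append(numbered[n])     # new occurrence added
    return merged


def partial_merge(occurrences, numbered_input):
    """occurrences: list of dicts mapping element name to list of values."""
    result = []
    for number, occurrence in enumerate(occurrences, start=1):
        if -number in numbered_input:
            continue                       # T(-n); removes the n-th occurrence
        new = {name: list(vals) for name, vals in occurrence.items()}
        for name, numbered in numbered_input.get(number, {}).items():
            new[name] = merge_element(new.get(name, []), numbered)
        result.append(new)                 # no input means "skipped"
    for number in sorted(n for n in numbered_input if n > len(occurrences)):
        result.append({name: merge_element([], numbered)
                       for name, numbered in numbered_input[number].items()})
    return result


ts = [{"D": ["D111", "D112"]}, {"D": ["D121", "D122"]}]
merged = partial_merge(ts, {2: {"D": {2: "D<122>"}},
                            3: {"D": {1: "D<131>"}}})
# merged == [{"D": ["D111", "D112"]},
#            {"D": ["D121", "D<122>"]},
#            {"D": ["D<131>"]}]
after_removal = partial_merge(ts, {-1: {}})
# after_removal == [{"D": ["D121", "D122"]}]
```

Element-level removals such as D(-1); are left out of the sketch, since the text above does not say how they are ordered relative to replacements within the same occurrence.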

Finally, the * Occurrence or Range specification is invalid following an UPDATE or MERGE if the last occurrence processed was totally removed and no "additions" occurred. If "additions" occur, the last one added defines the * position. Otherwise, the last processed occurrence (not removed) defines the * position.

12.8 General Information

The following commands provide useful information concerning partial processing:

SHOw LEVel[S]
SHOw LeVeL[S]

The SHOW LEVELS commands show the Processed and Examined counts for Global-FOR (if applicable) and each nested Partial-FOR. Each Partial-FOR also shows the corresponding FOR element name.

The Processed and Examined counts vary according to the following rules. The "FOR *" command sets the Processed count to zero, but leaves the Examined count unchanged. The "FOR element" and "FOR **" commands both set the Processed and Examined counts to zero. Partial Processing commands which specify an Occurrence or Range specification of FIRST, LAST, or ALL also set the Processed and Examined counts to zero, after which they proceed to find at least one occurrence starting from the FIRST position. An Occurrence or Range specification of NEXT, <Number>, REST, or REMAINING begins with the Processed and Examined counts at their current values, and then they proceed to find at least one occurrence starting from the NEXT position. FIRST, LAST, NEXT, and <Number>=1 represent a single occurrence. The * specification, if valid, also represents the single occurrence at the "current" position.

The number of occurrences actually processed by a single Partial Processing command is called the "process range". The number of occurrences examined to satisfy any particular process range is the corresponding "examined range". The Processed and Examined counts are altered by adding the corresponding range amount following the completion of a command. The ADD command does not alter the Processed count, and only occurrences added by "ADD BEFORE" increment the Examined count. Also, UPDATE and MERGE can have extra "additions" inserted following the last processed occurrence. The total number of such "additions" increments the Examined count. UPDATE, MERGE, and REMOVE can do total "removal" of occurrences. The total number of such "removals" decrements the Examined count. When * is used as an Occurrence or Range specification, the Processed and Examined counts are not varied unless an UPDATE, MERGE, or REMOVE causes total "removal" of that occurrence, in which case the Examined count is decremented by one, and * becomes an invalid specification.
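
The counting rules above can be summarized in a short Python model. The class LevelCounts and its method names are hypothetical, and the numbers used in the example are invented; the sketch only restates the rules given in this section and is not a description of how SPIRES stores the counts.

```python
class LevelCounts:
    """Processed and Examined counts for one Partial-FOR level."""

    def __init__(self):
        self.processed = 0
        self.examined = 0

    def for_star(self):
        # "FOR *": Processed is reset, Examined is left unchanged.
        self.processed = 0

    def reset(self):
        # "FOR element", "FOR **", or a FIRST, LAST, or ALL specification.
        self.processed = 0
        self.examined = 0

    def finish_command(self, processed, examined, additions=0, removals=0):
        # Add the process range and examined range after a command;
        # additions raise and total removals lower the Examined count.
        self.processed += processed
        self.examined += examined + additions - removals

    def add(self, count, before):
        # ADD never changes Processed; only ADD BEFORE changes Examined.
        if before:
            self.examined += count


counts = LevelCounts()
counts.finish_command(processed=3, examined=5)              # e.g. TRANSFER 3
counts.finish_command(processed=1, examined=1, removals=1)  # e.g. REMOVE NEXT
# counts.processed == 4 and counts.examined == 5
```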

12.9 The FOR * command

When a referenced record has been established, and no Partial-FOR commands are currently active, the FOR * command may be used to establish a special mode of processing. The form of this command is just:

FOR *

There is no WHERE clause allowed. When this special mode is established, DISPLAY, TRANSFER, and MERGE commands may be used to process the entire referenced record, not as individual elements, but as a single unit. All the elements of the record can be accessed at once. Since the full record is processed each time, the DISPLAY, TRANSFER, or MERGE commands do not require a Range specification, and an END clause is meaningless. Formatted output and input can be done using record-level frames (FRAME-TYPE = DATA;).

ENDFOR (or ENDFOR *) terminates this special FOR * mode.

12.10 Partial Processing to the rescue

Occasionally a situation arises where a set of records is to be found and processed with Global-FOR WHERE criteria such that the conditions of the WHERE clause cannot be guaranteed. For example, using the sample record described in section 12.7, find the records which have "D STRING 2" and "B STRING 3" in the same occurrence of the T structure. D belongs to the T structure, but B belongs to the U structure which in turn belongs to T. They are not in "exactly the same structure". Yet it is possible some occurrence of U within a particular T could have the required B value, and D in that same T could have its required value. It would not be possible to use the @-sign to indicate "same structure processing" since B and D are in different structures. But the Global-FOR WHERE clause could at least specify that the records chosen should have both conditions satisfied, although possibly from different occurrences of the T structure.

FOR <class> WHERE D STRING 2 AND B STRING 3
REFERENCE END '<command to do when none found>'

This retrieves a candidate record. The first thing that might come to your mind is: "This requires retrieving the record, isn't that expensive?". Not really, because the record needs to be retrieved to check out the WHERE clause criteria, and doing a REFERENCE of that record doesn't retrieve it again. Next you might ask: "What good does it do to REFERENCE the record?". The answer is that Partial Processing can now determine if this record meets the true criteria, B and D conditions being met in the same occurrence of T. A protocol to do this might look like the following:

FOR <class> WHERE D STRING 2 AND B STRING 3
++NEXT.REC
REFERENCE END RETURN
FOR S
++NEXT.S
REFERENCE END 'JUMP ALL.DONE'
FOR T WHERE D STRING 2 AND B STRING 3
REFERENCE END 'JUMP NEXT.S'
ENDFOR S;  Hurray!  This record meets the conditions.
- FOR * ; record level processing
- <whatever needs to be done>
- ENDFOR *
- <UPDATE or ADD if needed>
++ALL.DONE
CLEAR REFERENCE; we are done with this record.
JUMP NEXT.REC

The difference here is that the WHERE clause on FOR T is restricted to examining the B and D criteria within each occurrence of T, not across all occurrences. So Partial-FOR WHERE clauses exhibit a "same occurrence" property for criteria that occur at different levels within a particular structural occurrence of the FOR element. Of course, "same structure" processing (indicated by the @-sign) could still be done for elements which are within a single structure, such as B and @C or F and @G, or even multiple occurrences of a single element within a structure, such as H and @H, D and @D, or B and @B.
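
The distinction between the two kinds of test can be shown with a small Python model. The record layout (nested dicts and lists), the function names, and the sample values are all assumptions made for this sketch; STRING is modeled as a simple substring test.

```python
def contains(values, text):
    """True if any value contains the given string (models STRING)."""
    return any(text in value for value in values)


def record_level_match(record):
    # Global-FOR style: D and B may come from different occurrences of T.
    d_values = [d for s in record["S"] for t in s["T"] for d in t["D"]]
    b_values = [b for s in record["S"] for t in s["T"]
                for u in t["U"] for b in u["B"]]
    return contains(d_values, "2") and contains(b_values, "3")


def same_occurrence_match(record):
    # Partial-FOR style: both conditions must hold within one T occurrence.
    for s in record["S"]:
        for t in s["T"]:
            b_values = [b for u in t["U"] for b in u["B"]]
            if contains(t["D"], "2") and contains(b_values, "3"):
                return True
    return False


# Hypothetical record: the "2" and the "3" fall in different T occurrences.
candidate = {"S": [{"T": [
    {"D": ["D12"], "U": [{"B": ["B45"]}]},
    {"D": ["D45"], "U": [{"B": ["B34"]}]},
]}]}
loose = record_level_match(candidate)       # True
strict = same_occurrence_match(candidate)   # False
```

A record like this one would be retrieved by the Global-FOR WHERE clause but rejected by the FOR T WHERE test in the protocol above.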

As a final example, assume a goal record with the following structure:

Donor Goal Record

SLOT (key)
NAME
ADDRESS
DONATION (multiply-occurring structure)
  DATE
  AMOUNT
  FUND (multiply-occurring structure)
    FUND-ID
    FUND-YR

Suppose the processing requirement is to purge all FUND structures from the subfile where FUND-ID=8 and FUND-YR < 1980 are satisfied in the same occurrence of the structure. As an additional requirement, the AMOUNT of the donation must be less than 10000. The request could be done with the following protocol:

SELECT DONORS
LET UPDCNT = 0
FOR TREE WHERE AMOUNT < 10000 AND FUND-ID=8 AND @FUND-YR < 1980
++DONOR.LOOP
  REFERENCE END='/ RETURN * #UPDCNT Donor Records Processed'
  LET CHANGE = $FALSE
  FOR DONATION WHERE AMOUNT < 10000 AND FUND-ID=8 AND @FUND-YR < 1980
  ++DONATION.LOOP
      REFERENCE END='GOTO UPDATE.RECORD'

      FOR FUND WHERE FUND-ID=8 AND @FUND-YR < 1980
          REMOVE ALL END='GOTO DONATION.LOOP'
          ENDFOR FUND
          LET CHANGE = $TRUE
          GOTO DONATION.LOOP

++UPDATE.RECORD
  IF #CHANGE THEN UPDATE
  THEN LET UPDCNT = #UPDCNT + 1
  CLEAR REFERENCE
  GOTO DONOR.LOOP
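
For comparison, the intent of this protocol can be restated as a Python sketch over in-memory data. The function purge_funds and the dict-based record layout are assumptions made for the illustration; the sketch models only the selection logic, not SPIRES record access or updating.

```python
def purge_funds(donors):
    """Remove FUND occurrences with FUND-ID = 8 and FUND-YR < 1980 from
    donations whose AMOUNT is under 10000; return the number of donor
    records changed."""
    updated = 0
    for donor in donors:
        changed = False
        for donation in donor["DONATION"]:
            if donation["AMOUNT"] >= 10000:
                continue
            # Both FUND conditions are tested within the same FUND occurrence.
            kept = [fund for fund in donation["FUND"]
                    if not (fund["FUND-ID"] == 8 and fund["FUND-YR"] < 1980)]
            if len(kept) != len(donation["FUND"]):
                donation["FUND"] = kept
                changed = True
        if changed:
            updated += 1
    return updated


donors = [{"SLOT": 1, "DONATION": [
    {"AMOUNT": 500, "FUND": [{"FUND-ID": 8, "FUND-YR": 1975},
                             {"FUND-ID": 8, "FUND-YR": 1985}]},
    {"AMOUNT": 20000, "FUND": [{"FUND-ID": 8, "FUND-YR": 1970}]},
]}]
records_changed = purge_funds(donors)   # 1
```

In the sample data only the first donation loses a FUND occurrence; the second is untouched because its AMOUNT is not less than 10000.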

INDEX

ADD COMMAND, IN PARTIAL PROCESSING 12.6
ALSO AND INDEXES B.7.6
A11 B.8.9
   B.8.4
A12 B.8.9
A13 B.8.9
A14 B.8.9
A15 B.8.9
A16 B.8.9
A162 B.8.11
A165 B.8.11
   B.8.5
A166 B.8.12
   B.8.4
A167 B.8.12
   B.8.8
A169 B.8.12
   B.8.11
   B.8.8
   B.8.4
   B.8.2
A170 B.8.11
   B.8.2
A32 B.7.16
   B.7.13
A36 B.7.13
A38 B.8.5
A44 B.8.5
A45 B.8.4
A47 B.8.4
A56 B.8.4
A6 B.8.9
   B.8.8
A71 B.7.13
CINDEX-VALUE STATEMENT B.8.8
CLEAR REFERENCE COMMAND 12.4
COMBINE STATEMENT B.7.12
COMBINED INDEX, CODING B.7.10
COMBINED INDEX, LINKAGE B.8.8
COMBINED INDEX, UNDERSTANDING B.7.5
DISPLAY COMMAND, IN PARTIAL PROCESSING 12.6
ENDFOR COMMAND, IN PARTIAL PROCESSING 12.5
EXTERNAL-NAME STATEMENT B.8.2
FOR * COMMAND, IN PARTIAL PROCESSING 12.6
FOR * COMMAND, WITH REFERENCED RECORD 12.9
FOR ** COMMAND 12.6
FOR ELEMENT COMMAND, IN PARTIAL PROCESSING 12.5
GENERATE REFERENCE COMMAND, IN PARTIAL PROCESSING 12.4
GLOBAL FOR AND INDEXES B.7.6
GOALREC-ELEM STATEMENT B.8.3
GOALREC-KEY STATEMENT B.8.2
GOALREC-NAME STATEMENT B.8.2
INDEX-NAME STATEMENT B.8.3
INDEX, ALSO AND B.7.6
INDEX, COMBINE STATEMENT B.7.12
INDEX, COMBINED LINKAGE B.8.8
INDEX, COMBINED, CODING B.7.10
INDEX, COMBINED, UNDERSTANDING B.7.5
INDEX, FUNCTION B.7.1
INDEX, GLOBAL FOR AND B.7.6
INDEX, GLOBAL QUALIFIER LINKAGE B.8.6
INDEX, MULTIPLE PASSERS B.7.7
INDEX, PASSPROC RULES B.8.11
INDEX, QUALIFIER LINKAGE B.8.7
INDEX, QUALIFIER, CODING B.7.9
INDEX, QUALIFIER, UNDERSTANDING B.7.3
INDEX, RESTRICTIONS ON B.7.7
INDEX, SIMPLE LINKAGE B.8.4
   B.8.1
INDEX, SIMPLE, CODING B.7.8
INDEX, SIMPLE, UNDERSTANDING B.7.2
INDEX, SORTING ELEMENTS IN B.7.15
INDEX, SUB-INDEX LINKAGE B.8.5
INDEX, SUB-INDEX, CODING B.7.11
INDEX, SUB-INDEX, UNDERSTANDING B.7.4
LINKAGE, COMBINED INDEX B.8.8
LINKAGE, PASSPROC RULES B.8.11
LINKAGE, QUALIFIER B.8.7
LINKAGE, QUALIFIER, GLOBAL B.8.6
LINKAGE, SIMPLE INDEX B.8.4
   B.8.1
LINKAGE, SUB-INDEX B.8.5
LOCATOR B.7.1
MERGE COMMAND, IN PARTIAL PROCESSING 12.6
MULTIPLE PASSERS B.7.7
NOPASS STATEMENT B.8.10
PASSING, DETAILED PROCESS DESCRIPTION B.8.14
PASSPROC CODING B.8.11
PASSPROC STATEMENT B.8.4
PASSPROC, CHOOSING FETCHER B.8.12
PASSPROC, RULE STRING SYNTAX B.8.13
PERSONAL NAME ALGORITHM B.7.16
POINTER B.7.1
POINTER-GROUPS, SORTING B.7.15
PROCESSING RULES, FOR INDEX RECORDS B.7.13
PTR-ELEM STATEMENT B.8.2
PTR-GROUP STATEMENT B.8.3
QUAL-ELEM STATEMENT B.8.7
   B.8.6
QUALIFIER INDEX, CODING B.7.9
QUALIFIER INDEX, UNDERSTANDING B.7.3
QUALIFIER, GLOBAL LINKAGE B.8.6
QUALIFIER, LINKAGE B.8.7
RECORD, REMOVED B.7.1
REFERENCE COMMAND, IN PARTIAL PROCESSING 12.6
   12.4
REMOVE COMMAND, IN PARTIAL PROCESSING 12.6
REMOVED B.7.1
SEARCHTERMS STATEMENT B.8.3
SET REFERENCE ELEMENTS COMMAND 12.6
SHOW LEVELS, IN PARTIAL PROCESSING 12.8
SHOW REFERENCE ELEMENTS COMMAND 12.5
SIMPLE INDEX, CODING B.7.8
SIMPLE INDEX, LINKAGE B.8.4
   B.8.1
SIMPLE INDEX, UNDERSTANDING B.7.2
SKIP COMMAND, IN PARTIAL PROCESSING 12.6
SPIRES FUNCTIONS -- BACKGROUND 2.1
SPIRES FUNCTIONS -- DISTRIBUTION 2.5
SPIRES FUNCTIONS -- DOCUMENTATION 2.4
SPIRES FUNCTIONS -- IMPLEMENTATION 2.2
SPIRES FUNCTIONS -- INSTALLATION 2.3
SPIRES VARIABLES -- BACKGROUND 3.1
SPIRES VARIABLES -- DISTRIBUTION 3.5
SPIRES VARIABLES -- DOCUMENTATION 3.4
SPIRES VARIABLES -- IMPLEMENTATION 3.2
SPIRES VARIABLES -- INSTALLATION 3.3
SRCPROC RULES B.8.9
SRCPROC STATEMENT B.8.4
SUB-INDEX STATEMENT B.8.5
SUB-INDEX, CODING B.7.11
SUB-INDEX, LINKAGE B.8.5
SUB-INDEX, UNDERSTANDING B.7.4
TRANSFER COMMAND, IN PARTIAL PROCESSING 12.6
TYPE=LCTR STATEMENT RECORD-TYPE B.7.1
UPDATE COMMAND, IN PARTIAL PROCESSING 12.6
WHERE CLAUSE, IN PARTIAL PROCESSING 12.5