INDEX
*  SPIRES File Definition
A  Introduction to SPIRES File Definition
A.1  Preface
A.2  System Overview
A.2.1  SPIRES Command and Definition Languages
A.2.2  SPIRES Processors and Programs
A.2.3  The User Interface
A.2.4  File Definition Concepts and Facilities
A.3  The Process of Defining a File
A.3.1  Design Analysis
A.3.2  File Definition
A.3.2.1  The File Definer: A SPIRES Subsystem to Simplify File Definition
A.4  Glossary of Important File Definition Terms
A.4.1  Element
A.4.2  Record
A.4.3  Structure
A.4.4  Key
A.4.5  Goal Record
A.4.6  Index Record
A.4.7  Record-type
A.4.8  Index
A.4.9  Simple Index
A.4.10  Compound Index
A.4.11  Combined Record-type
A.4.12  Removed Record-type
A.4.13  Subfile
A.4.14  File
A.4.15  Hierarchy of File Definition Components
B  Defining a SPIRES File
B.1  Goal Record Concepts and Definition
B.1.1  Element Names, Occurrences, and Lengths
B.1.2  File and Record Name Statements
B.1.3  Element Categories
B.1.4  Record Keys
B.1.5  Element Name, Occurrence and Length Statements
B.1.6  Element Aliases
B.1.7  Dummy Elements; Comment Statements
B.1.8  Optional Statements: AUTHOR, MAXVAL, NOAUTOGEN and BIN
B.1.8.1  (The AUTHOR Statement)
B.1.8.2  (The MAXVAL Statement)
B.1.8.3  (The NOAUTOGEN Statement)
B.1.8.4  (The BIN Statement)
B.1.9  Statements in the Subfile Section
B.1.10  A Complete File Definition
B.2  Goal Record Keys, Slot and Removed Records
B.2.1  Record Keys
B.2.2  Slot Keys
B.2.2a  SLOT-START Statement
B.2.3  Removed Record-Types
B.2.4  Monotonic Record-Types
B.3  Structures
B.3.1  Data Structuring
B.3.2  Coding Structures
B.3.3  Structured-Data Input
B.3.4  Keyed Structures
B.3.5  Keyed-Structure Data Input
B.3.6  Floating Structures
B.4  Processing Rules: INPROC, INCLOSE, OUTPROC
B.4.1  Functions of Processing Rules
B.4.2  INPROC and OUTPROC Rule Functions
B.4.3  Processing Rule Strings
B.4.4  Processing Rule Syntax
B.4.5  Processing Rule Restrictions
B.4.6  Understanding Processing Rule Descriptions
B.4.7  Custom Error Messages
B.4.8  INCLOSE Rules
B.4.9  Processing Rules For Numeric Data
B.4.10  Processing Rules for Dollar-and-Cents Data
B.4.11  Processing Rules for Free Form Character Strings
B.4.12  Processing Rules for Strings of Codified Data
B.4.13  Processing Rule for Personal Names to Canonical Form
B.4.14  Processing Rules for Validating Length and Occurrence
B.4.15  Processing Rules and Element Types
B.4.16  Processing Rule Tracing: SET PTRACE
B.4.17  Processing Rule Tracing for Passprocs: SET PASSTRACE
B.5  The FILEDEF Subfile and File Compilation
B.5.1  The FILEDEF Subfile
B.5.2  Adding Records to FILEDEF
B.5.3  Compiling File Definitions
B.5.4  Altering a File Definition in FILEDEF
B.5.5  ORVYL Files Created by Compilation
B.5.6  The ATTACH and SELECT Commands
B.5.7  The PROCESS Command in SPIBILD
B.5.8  Making Major Changes to a File: The ZAP FILE Command
B.5.9  Making Minor Changes to a File: The RECOMPILE Command
B.5.10  Destroying a SPIRES File
B.5.11  Summary
B.6  File Structure: Tree & Slot, Goal & Index, Removed Records
B.6.1  Introduction
B.6.2  Element Storage
B.6.3  Record Storage
B.6.4  Removed Record-Types
B.6.4.1  Very Large Databases
B.6.4.2  RES-LARGE Statement
B.6.5  Combined Record-Types
B.6.5a  Extended Tree Data Sets for Large Databases
B.6.5b  The OVERFLOW-TO and OVERFLOW-KEY statements
B.6.5c  The EXTERNAL-TYPE statement
B.6.6  Figure: Function of Goal and Index Records
B.6.7  Figure: Storage of Element Length and Occurrence Information
B.6.8  Figure: A Tree-Structured Data Set
B.6.9  Figure: Detail of the Structure of a Single File Block
B.6.10  Figure: Sample Tree After Intense Local Growth
B.6.11  Figure: Sample Tree With Well-Distributed Growth
B.6.12  Figure: Tree Showing High Number of Access Per Record
B.6.13  Figure: Previous Tree After Rebalancing
B.7  Understanding and Coding Index Records
B.7.1  How Indexing Works
B.7.2  Understanding Simple Indexes
B.7.3  Understanding Qualifiers
B.7.4  Understanding Sub-Indexes
B.7.5  Understanding Compound Indexes
B.7.6  The Impact of Global FOR and ALSO on Indexing
B.7.7  Index Definition
B.7.8  Coding Simple Indexes
B.7.9  Coding Simple Indexes with Qualifiers
B.7.10  Coding Compound Indexes
B.7.11  Coding Sub-Indexes
B.7.12  Index Record and Goal Record Elements
B.7.13  Index Records as Goal Records
B.7.14  Indexes for Non-Removed Record Types; Keys vs. Locators
B.7.15  Ensuring the Validity of Index Records
B.8  Understanding and Coding the Linkage Section
B.8.1  Functions of the Linkage Section
B.8.2  The Global Parameters Section
B.8.2.1  The SEARCHPROC Statement in the Global Parameters section
B.8.2.2  The EXTERNAL-NAME Statement
B.8.3  Individual Index Linkages
B.8.4  Simple Indexes
B.8.5  Sub-Indexes
B.8.6  Global Qualifiers
B.8.7  Local Qualifiers
B.8.8  Compound Indexes
B.8.9  Coding Searchproc Rules
B.8.10  The NOPASS Statement
B.8.11  Coding PASSPROC Rules
B.8.12  Choosing the "Fetcher" Passproc
B.8.13  Other Actions in a PASSPROC Rule String
B.9  Defining Subfile Privileges
B.9.1  The Function of the Subfile Section
B.9.2  Basic Statements in the Subfile Section
B.9.2a  Subfile Selection for "Access Lists" of Accounts
B.9.3  The SECURE-SWITCHES Statement
B.9.3.1  Secure-Switches 1 and 2
B.9.3.2  Secure-Switch 2
B.9.3.3  Secure-Switch 3
B.9.3.4  Secure-Switch 4
B.9.3.5  Secure-Switch 5
B.9.3.6  Secure-Switch 6
B.9.3.7  Secure-Switch 7
B.9.3.8  Secure-Switch 8
B.9.3.9  Secure-Switch 9
B.9.3.10  Secure-Switch 10
B.9.3.11  Secure-Switch 11
B.9.3.12  Secure-Switch 12
B.9.3.13  Secure-Switch 13
B.9.3.14  Secure-Switch 14
B.9.3.15  Secure-Switch 15
B.9.3.16  Secure-Switch 16
B.9.3.17  Secure-Switch 17
B.9.3.18  Secure-Switch 18
B.9.3.19  Secure-Switch 19
B.9.3.20  Secure-Switch 20
B.9.3.21  Secure-Switch 21
B.9.3a  The SHOW SSW and SET SSW Commands
B.9.4  Security for Individual Elements and Indexes
B.9.4.1  Views: Element Security Defined in Packets
B.9.4.2  Advanced Features of the View Facility
B.9.4.3  Specific Effects of the View Facility
B.9.4.4  Priv-Tags and the CONSTRAINT and NOUPDATE Statements
B.9.4.5  The INPROC-REQ and OUTPROC-REQ Statements
B.9.4.6  Index Security: Priv-Tags and the NOSEARCH Statement
B.9.5  The SUBGOAL Statement
B.9.6  The SELECT-COMMAND Statement
B.9.7  The PROGRAM Statement
B.9.8  The SUBCODE Statement
B.9a  Defining File Access Privileges
B.10  SPIRES File Management
B.11  Logging Database Use in SPIRES
B.12  Immediate Indexing
B.12.1  Coding Immediate Indexes
B.12.2  Efficiency Considerations for Immediate Indexes
B.12.3  Immediate Indexing and Goal-to-Goal Passing
C  Additional Facilities for the SPIRES File Definer
C.1  Recompile of an Existing File's Definition
C.1.1  The Function of the RECOMPILE Command
C.1.2  Statements You Can Change, Add or Delete Anytime
C.1.3  Statements You Can Sometimes Change, Add or Delete
C.1.4  Statements You Can Never Change, Add or Delete
C.2  [Currently not used]
C.3  Synonyms
C.3.1  The Function of a Synonym
C.3.2  Defining a SYNONYM Index
C.3.3  Adding Synonyms to Index Records
C.4  Executable Elements: Protocols and TYPE=XEQ
C.5  Indirect Record-Access: Action 32 and SUBGOAL Processing
C.5.1  The Function of Action 32
C.5.2  Action 32: Problem One
C.5.3  First Solution to Problem One
C.5.4  Second Solution to Problem One
C.5.5  Third Solution to Problem One
C.5.6  Action 32: Problem Two
C.5.7  Solution to Problem Two
C.5.8  SUBGOAL Processing
C.6  Practical Techniques for File Definers and Managers
C.6.1  Introduction
C.6.2  Proximity Searching: Information for File Developers
C.6.2.1  Proximity Searching: File Definition Requirements
C.6.2.2  Proximity Searching: How it Works
C.6.5  Examination Of Index Entries
C.6.10  Automatic Accumulation of Record Modification Dates
C.6.11  Free Global Qualifiers
C.6.12  Record Protection By Account Number
C.6.13  Same-Structure Retrieval Through Indexes
C.6.14  Phonetic Search of Personal Names
C.6.15  Non-Unique INDEX-NAME Statements
C.6.17  The QUELEVEL and RES-LEVEL Statements
C.6.18  Record Size Limits and Split Records
C.6.19  Indexing Negative Integer or Real Values
C.6.20  Encrypting Data Values
C.6.21  DEFQ-Only Record-Types
C.6.22  File Security: ORVYL Data Sets
C.6.23  Creating a Deferred Queue on Another Account
C.6.23a  Creating a Duplicate Deferred Queue
C.6.24  Checkpoint Data Sets
C.7  Compiling File Definition Code from Several Sources
C.7.1  The RECDEF Subfile
C.7.2  The EXT-REC and EXT-LINK Statements
C.7.3  Comparing the DEFINED-BY and EXT-REC / EXT-LINK Statements
C.8  FASTBILD
C.9  File Definition Compilation Diagnostics
C.9.1  Determining the Location of Errors
C.9.2  ATCHFILE, INITFILE and Other Error Messages
C.9.3  Compile Diagnostics and Errors
C.9.3.1  * ACCOUNT MISMATCH
C.9.3.2  * AET TABLE OVERFLOW
C.9.3.3  * COMBINE RECORD INVALID
C.9.3.3.0  * NAME PCT TABLE FULL
C.9.3.3.0a  * PACKED CHAR TABLE FULL
C.9.3.3.0b  * ELEMENT TABLE FULL
C.9.3.3.0c  * USERPROC TABLE FULL
C.9.3.3.1  * LABL PCT TABLE FULL
C.9.3.3.1a  * IMT TABLE FULL
C.9.3.3.1b  * AET TABLE FULL
C.9.3.3.1c  * LABEL TABLE FULL
C.9.3.3.2  * TOO MANY RECORD TYPES
C.9.3.3.2a  * TOO MANY REAL RECORD TYPES
C.9.3.3.3  * TOO MANY SLOT TYPES
C.9.3.3.4  * XEQ TYPE MUST BE VARIABLE
C.9.3.4  * DEFINED-BY RECORD TABLES NOT FOUND
C.9.3.5  * EXTRANEOUS string
C.9.3.6  * FILE EXISTS
C.9.3.7  * FILE NOT AVAILABLE
C.9.3.8  * ILLEGAL 2ND PROPERTY
C.9.3.9  * INVALID ACTION CODE
C.9.3.9a  * ELEMENT MUST BE TYPE STRUCTURE
C.9.3.10  * INVALID ACTION SYNTAX
C.9.3.10a  * ILLEGAL ACTION SEQUENCE
C.9.3.10b  * INVALID ACTION GROUP
C.9.3.11  * INVALID CINDEX
C.9.3.12  * INVALID MNEMONIC
C.9.3.13  * INVALID NAME LEN
C.9.3.13a  * INVALID LOCATOR LENGTH
C.9.3.14  * INVALID REC NAME
C.9.3.15  * INVALID SEQUENCE
C.9.3.16  * KEY ELEMENT ERROR
C.9.3.17  * LENGTH VALUE > 255
C.9.3.18  * LENGTH NOT GIVEN
C.9.3.19  * DUPLICATE ELEMENT NAMES IN STRUCTURE
C.9.3.19a  * DUPLICATE VARIABLE NAME
C.9.3.20  * NO ELEMENTS ALLOWED WITH DEFINED-BY VALUE
C.9.3.21  * NO ELEMENTS IN RECORD
C.9.3.22  * NO ELEMENTS IN STRUCTURE
C.9.3.23  * NO RECORD TO COMPILE
C.9.3.24  * NON FIXED ELEMS FOR SLOT
C.9.3.25  * ONE UNIQUE ID PER RECORD
C.9.3.26  * PROCESSING RULE TABLES FULL
C.9.3.27  * PROCESSING RULE TABLE OVERFLOW
C.9.3.28  * RECORD KEY ELEMENT MISSING
C.9.3.28a  * IMMEDIATE INDEX IS ALSO IMMEDIATE GOAL
C.9.3.28b  * TREE-DATA VALUE INVALID
C.9.3.29  * INVALID SLOTCHECK VALUE
C.9.3.29a  * INVALID ELEM MNEMONIC
C.9.3.29b  * ACTION 32 RECORD recname NOT VALID
C.9.3.29c  * USEMPROC error
C.10  Processing Rule-String Procedures
C.10.1  The PROC and RULE Statements
C.10.2  The PARM and DEFAULT Statements
C.10.3  The SYMBOL and VALUE Statements
C.10.4  Processing Rule Procedure Examples
C.10.5  The EXTDEF Subfile
C.11  User-Defined Processing Rules: Userprocs
C.11.1  Coding Userprocs
C.11.2  Uprocs (Commands) Available in Userprocs
C.11.2.1  SET Uprocs
C.11.2.2  Block-Construct Uprocs for Execution Flow
C.11.2.3  Other Uprocs for Execution Flow
C.11.2.4  Uprocs for Setting User Variables
C.11.2.5  Uprocs for Terminal Input/Output
C.11.2.6  Miscellaneous Userproc Uprocs
C.11.3  Using Variables in Userprocs
C.11.3.1  System Variables Available Only in Userprocs
C.11.3.2  User Variables in Userprocs
C.11.4  Some Interesting and Unusual Uses for Userprocs
C.11.4.1  Retrieving Other Element Values with the $GETxVAL Functions
C.11.4.2  (*) Outproc Userprocs for Record Keys and Secure-Switch 13
C.11a  Virtual Elements
C.11a.1  Examining Virtual Elements
C.11a.2  Retrieving Other Elements for Virtual Elements
C.11a.3  The REDEFINE Statement
C.11a.3.1  (A Comparison of Rule A79, $GETxVAL, and Redefining Elements)
C.11a.4  Variably-Occurring Virtual Elements
C.12  Record-Types that Serve as Goals and Indexes; Goal-to-Goal Passing
C.12.1  Index Record-Types Used as Goal Records
C.12.2  Record-Types with Goal and Index Data
C.12.3  Linking Goal-Record Types: Goal-to-Goal Passing
C.12.3.1  Rules and Suggestions for Goal-to-Goal Passing
C.12.3.2  A Solution to the Example in C.12.3
C.12.3.3  Using Qualifiers in Passing to Create Goal Records
C.12.4  Double-Headed Files
C.12.5  Chain Passing
C.12.6  Goal-to-Same-Goal Passing
C.12a  Indirect Searching
C.12a.1  Coding Details for Indirect Indexes
C.12a.2  Uses for Indirect Searching
C.12a.3  Some Technical Details on how SPIRES Handles Indirect Indexes
C.12a.4  Dynamic Indexes
C.13  File Definition Information Packets
C.13.1  Element Information Packets
C.13.1.1  The ELEMINFO (INFO) Info-element
C.13.1.2  The NOTE Info-element
C.13.1.3  The DESCRIPTION (DESC) Info-Element
C.13.1.4  The HEADING (HEAD) Info-element
C.13.1.5  The COL-HEADING (COLHEAD) Info-element
C.13.1.6  The WIDTH (WID) Info-element
C.13.1.7  The ADJUST (ADJ) Info-element
C.13.1.8  The INDENT Info-element
C.13.1.9  The MAXROWS (MAXROW) Info-element
C.13.1.10  The EDIT Info-element
C.13.1.11  The VALUE-TYPE (VTYPE) Info-element
C.13.1.12  The INDEX Info-element
C.13.1.13  The USERINFO Info-element
C.13.1.14  The INPUT-OCC (INOCC) Info-element
C.13.1.15  The DEFAULT Info-element
C.13.1.16  The SUPPLIED Info-element
C.13.1.17  The RDBMS_COLUMN Info-element
C.13.1.18  The RDBMS_DATATYPE Info-element
C.13.1.19  The RDBMS_DATALENGTH Info-element
C.13.2  System Commands and Utilities Using Element Information
C.13.3  Index Information Packets
C.13.3.1  The INDEXINFO (INFO) Info-element
C.13.3.2  The NOTE Info-element
C.13.3.3  The DESCRIPTION (DESC) Info-Element
C.13.3.4  The SOURCE (SOU) Info-element
C.13.3.5  The VALUE-TYPE (VTYPE) Info-element
C.13.3.6  The TRUNCATE (TRUNC) Info-element
C.13.3.7  The EXCLUDE (EXC) Info-element
C.13.3.8  The USERINFO Info-element
C.13.4  System Commands and Utilities Using Index Information
C.13.5  Alternate Locations for Information Packet Definitions
D  Appendices
D.1  Actions: Complete Listing By Number
D.1.1  About this Chapter
D.1.2  Actions Used Only As SEARCHPROC Rules (A6 -- A16)
D.1.2.0.0.6  * A6
D.1.2.0.0.7  * A7
D.1.2.0.0.8  * A8
D.1.2.0.0.9  * A9
D.1.2.0.1.0  * A10
D.1.2.0.1.1  * A11
D.1.2.0.1.2  * A12
D.1.2.0.1.3  * A13
D.1.2.0.1.4  * A14
D.1.2.0.1.5  * A15
D.1.2.0.1.6  * A16
D.1.2.0.1.7  * A17
D.1.3  Actions Used as INPROC, OUTPROC, SEARCHPROC or PASSPROC Rules (A21 -- A66)
D.1.3.0.2.1  * A21
D.1.3.0.2.2  * A22
D.1.3.0.2.3  * A23
D.1.3.0.2.4  * A24
D.1.3.0.2.5  * A25
D.1.3.0.2.6  * A26
D.1.3.0.2.7  * A27
D.1.3.0.2.8  * A28
D.1.3.0.2.8a  * A28 for time
D.1.3.0.2.8b  * A28 for datetime
D.1.3.0.2.9  * A29
D.1.3.0.3.0  * A30
D.1.3.0.3.1  * A31
D.1.3.0.3.2  * A32
D.1.3.0.3.3  * A33
D.1.3.0.3.4  * A34
D.1.3.0.3.5  * A35
D.1.3.0.3.6  * A36
D.1.3.0.3.7  * A37
D.1.4.0.3.8  * A38
D.1.4.0.3.9  * A39
D.1.4.0.4.0  * A40
D.1.4.0.4.1  * A41
D.1.4.0.4.2  * A42
D.1.4.0.4.3  * A43
D.1.4.0.4.4  * A44
D.1.4.0.4.5  * A45
D.1.4.0.4.6  * A46
D.1.4.0.4.7  * A47
D.1.4.0.4.8  * A48
D.1.4.0.4.9  * A49
D.1.4.0.5.0  * A50
D.1.4.0.5.1  * A51
D.1.4.0.5.2  * A52
D.1.4.0.5.3  * A53
D.1.4.0.5.4  * A54
D.1.4.0.5.5  * A55
D.1.4.0.5.6  * A56
D.1.4.0.5.7  * A57
D.1.4.0.5.8  * A58
D.1.4.0.5.9  * A59
D.1.4.0.6.0  * A60
D.1.4.0.6.1  * A61
D.1.4.0.6.2  * A62
D.1.4.0.6.3  * A63
D.1.4.0.6.4  * A64
D.1.4.0.6.5  * A65
D.1.4.0.6.6  * A66
D.1.4.0.6.7  * A67
D.1.4.0.6.8  * A68
D.1.4.0.6.9  * A69
D.1.5.0.7.0  * A70
D.1.5.0.7.1  * A71
D.1.5.0.7.2  * A72
D.1.5.0.7.3  * A73
D.1.5.0.7.4  * A74
D.1.5.0.7.5  * A75
D.1.5.0.7.6  * A76
D.1.5.0.7.6a  * A76 for date
D.1.5.0.7.6b  * A76 for datetime
D.1.5.0.7.7  * A77
D.1.5.0.7.8  * A78
D.1.5.0.7.9  * A79
D.1.5.0.8.0  * A80
D.1.5.0.8.1  * A81
D.1.5.0.8.2  * A82
D.1.5.0.8.3  * A83
D.1.5.0.8.4  * A84
D.1.5.0.8.5  * A85
D.1.5.0.8.6  * A86
D.1.6  Actions Used Only as INCLOSE Rules (A122 -- A148)
D.1.6.1.2.2  * A122
D.1.6.1.2.3  * A123
D.1.6.1.2.4  * A124
D.1.6.1.2.5  * A125 assigned element
D.1.6.1.2.6  * A126
D.1.6.1.2.7  * A127
D.1.6.1.2.8  * A128
D.1.6.1.2.9  * A129
D.1.6.1.3.0  * A130
D.1.6.1.3.1  * A131
D.1.6.1.3.2  * A132
D.1.6.1.3.3  * A133
D.1.6.1.3.4  * A134
D.1.6.1.3.7  * A137
D.1.6.1.3.8  * A138
D.1.6.1.3.9  * A139
D.1.6.1.4.0  * A140
D.1.6.1.4.6  * A146
D.1.6.1.4.7  * A147
D.1.6.1.4.8  * A148
D.1.7  Actions Used Only as PASSPROC Rules (A161 -- A171)
D.1.7.1.6.1  * A161
D.1.7.1.6.2  * A162
D.1.7.1.6.3  * A163
D.1.7.1.6.4  * A164
D.1.7.1.6.5  * A165
D.1.7.1.6.6  * A166
D.1.7.1.6.7  * A167
D.1.7.1.6.8  * A168
D.1.7.1.6.9  * A169
D.1.7.1.7.0  * A170
D.1.7.1.7.1  * A171
D.1.7.1.7.2  * A172
D.2  Quick Reference to Processing Rule Functions by Number
D.2.1  Actions Used Only As SEARCHPROC Rules (A6 -- A16)
D.2.2  Actions Used as INPROC, OUTPROC or SEARCHPROC Rules (A21 -- A37)
D.2.3  Actions Used as INPROC, OUTPROC, SEARCHPROC, PASSPROC Rules (A38-A62)
D.2.4  Actions Used Only as OUTPROC Rules (A71 -- A85)
D.2.5  Actions Used Only as INCLOSE Rules (A122 -- A148)
D.2.6  Actions Used Only as PASSPROC Rules (A161 -- A171)
D.3  Quick Reference to Processing Rules by Function-Keyword
D.3.1  Binary Conversion
D.3.2  Character Test
D.3.3  Comparison and Generation of Elements
D.3.4  Date Generation and Conversion
D.3.5  Default Value Generation
D.3.6  Dollar-and-Cents Conversion
D.3.7  Floating-Point Conversion
D.3.8  General
D.3.8a  Hexadecimal Conversion
D.3.9  Insertion of String
D.3.10  Length Test
D.3.11  Multiple Occurrence Conversion
D.3.11a  Packed Decimal Conversion
D.3.12  Personal Name Algorithm
D.3.13  Range Test
D.3.14  String Replacement, Inclusion, Exclusion, Code Translation
D.3.15  Time Generation and Conversion
D.4  FILEDEF Subfile Syntax and Semantics
D.4.1  Understanding FILEDEF Subfile Structure
D.4.2  Record Level Elements
D.4.3  Record Definition Elements
D.4.4  Key Element Definition
D.4.5  Element Definition
D.4.6  Structure Definition Elements
D.4.7  Linkage Section Definition Elements
D.4.8  Individual Index Linkage Elements
D.4.9  Qualifier Linkage Definition Elements
D.4.10  Sub-Index Linkage Definition Elements
D.4.11  Compound Index Linkage Definition Elements
D.4.12  Subfile Section Definition Elements
D.4.13  Processing Rule Procedure Definition Elements
D.4.14  SLOT Section Definition Elements
D.4.15  User Defined Processing Section
D.4.16  Subfile Section Selection Elements
D.4.17  FILE-PERMITS Definition Elements
D.4.18  Element-Information-Packet Elements
D.4.19  Index-Information-Packet Elements
D.4.20  View Definition Elements
D.4.21  Version Information Elements
D.5  A Guide to Coding Index Record Definitions
D.6  A Guide to Coding the Linkage Section Definition
D.7  Annotated File Definition Examples
D.7.1  GA.SPI.BIBLIOGRAPHY
D.7.2  GA.SPI.BIBLIOGRAPHY2
D.7.3  GA.SPI.PEOPLE
D.7.4  GA.SPI.RESTAURANT
D.7.5  XA.B14.AV
D.7.6  GA.SPI.MAIL
D.8  Passing Keys or Passing Locators to Indexes
:29  SPIRES Documentation

*  SPIRES File Definition

******************************************************************
*                                                                *
*                     Stanford Data Center                       *
*                     Stanford University                        *
*                     Stanford, Ca.   94305                      *
*                                                                *
*       (c)Copyright 1994 by the Board of Trustees of the        *
*               Leland Stanford Junior University                *
*                      All rights reserved                       *
*            Printed in the United States of America             *
*                                                                *
******************************************************************

        SPIRES (TM) is a trademark of Stanford University.

A  Introduction to SPIRES File Definition

A.1  Preface

The intent of this manual is to enable anyone with a good knowledge of SPIRES searching and updating to define a functional SPIRES file and manage its contents.

The file definition process for most files is fairly straightforward: analyze the structure of the records you want to have in your data base, define the characteristics of those records in the file definition language, then test your definition with sample data records. Next, by analyzing the requirements for retrieving those records, define the indexes required and the method by which information from the data records will be passed to the indexes. After testing the searching capabilities, you may want to define various levels of access and protection for your file.

Before tackling a production file requiring complex techniques, experiment with a file of modest requirements. This manual is meant to accompany you through your first file definition; several appendixes give details of some powerful definition techniques, which rely on a previous mastery of the basics of file definition.

Since a file definition is itself a record in a SPIRES system-owned subfile, a knowledge of SPIRES searching and updating commands is essential for entering a file definition. This knowledge is also essential for the assessment of searching requirements for a file: you must formulate your searching needs in terms of the SPIRES command facilities you understand. For example, you might be likely to overlook compound indexing techniques as a possibility for a file if you had not used the extended capabilities of compound indexes in searching a SPIRES file. An informed file definer is first an experienced SPIRES user; you are encouraged to investigate the SPIRES facilities you will need with the SPIRES consultant.

Many experienced users are not aware of the internal features of SPIRES that provide the command language. Therefore, Part A of this document is devoted to linking conceptually the internal file definition options to the external command language. A cursory view is presented of those facilities of the file definition language which are invisible to the general user.

The section following the overview gives a timetable for the file definition process, outlining the topics to be considered at each stage of the definition. It is tempting to define a file all at once, that is, defining indexes and access privileges while trying to decide the length of certain elements. This approach is haphazard at best; at its worst it is confusing--try to approach the tasks of file definition in the order in which they are presented in this manual.

If you encounter problems in learning file definition, you may contact the SPIRES consultant in SCIP User Services at Polya Hall for assistance. Extensive assistance in defining files is available for a fee through Contract Programming services.

Intimately tied to a file may be various input and output formats, as well as protocols. Formats and protocols, whose command languages are described in other SPIRES documents, can be used to present an attractive, concise and helpful interface between your file and its users.

The original file definition manual was written by J. R. Schroeder; the present document was written by J. R. Sack.

A.2  System Overview

SPIRES, the Stanford Public Information Retrieval System, is a generalized, online data base management system developed at Stanford University in the early 1970's.

The task of the original SPIRES development was to provide a file and file management system for Stanford's library automation project (BALLOTS). The versatility of the present SPIRES design can be gauged by the diversity of applications SPIRES now supports. Since 1972 well over 300 data bases have been implemented, including such applications as BALLOTS, the library automation project; bibliographic citations files, such as PLANTBIO; student record management; document inquiry and preparation systems, using SPIDOC; program library maintenance, MASTERLIST; survey data, both geo-physical and astronomical; inventory and materials tracking systems; directories, catalogs, mailing lists, and others. Present files range in size from a few dozen records to over three-quarters of a million. In sum, SPIRES serves the database needs of a large and diverse computing community.

SPIRES users design and maintain their own data bases; there is no centralized data base administrator. A number of the data base applications noted above were defined by individual users, non-professionals in data base systems, largely without individual programming aid from the data base professional staff.

Files presently in the system vary widely in complexity from unindexed files with two elements (a protocols subfile, for example), through files with ten indexes for as many elements (a personnel file, perhaps), up to files with over a hundred elements and nested data structures (the MARC or FILEDEF subfile, for example). Many of the present files are made up of more than one subfile, with the file definition describing the interrelationships between subfiles in a single file.

The command language available to SPIRES users for manipulation and management of these data bases is described in the SPIRES/370 Searching and Updating Manual; only commands that relate specifically to file definition and management will be discussed in this manual.

A.2.1  SPIRES Command and Definition Languages

The various "languages" that together support the development, management and use of data bases are:

Interactive Command Language

Most SPIRES users are familiar with this language, which is used for input and editing of data (TRANSFER, UPDATE, ADD), record retrieval (FIND, AND, OR, ALSO, FOR), and record display (SET FORMAT, SET REPORT, TYPE, OUTPUT, DISPLAY). The prospective file definer and manager should know the capabilities of the index and sequential searching commands.

Protocols Language

This facility is an extension of the command language; a protocol is a set of SPIRES, WYLBUR, MILTEN and ORVYL commands that make up a program, which can be executed by users needing guidance in manipulation of a particular data base. Using protocols you can extend the normal SPIRES command language, tailoring the interface of specific users to specific files. This tailoring is particularly valuable when the end user of a production file has no special training in SPIRES commands.

Protocols are developed and tested interactively, and feature string manipulation and arithmetic functions as well as condition testing and branching capabilities for sophisticated interactive dialog and data base manipulation.

Format Definition Language

Any file user can provide formats for input and/or output of any data base's contents by defining a set of data transformations that map information from a source (a data base or terminal, for example) to a destination (a terminal or data base).

By means of input formats, a user can be prompted for the records' element values, and be given helpful diagnostic messages and reprompts should any error conditions be raised. Also, input formats provide a tool for converting pre-existing machine readable data to SPIRES-suitable input form. Using commands for arithmetic and string operations, condition testing and branching, input formats can carry out complex data validation algorithms that are unavailable even with elaborate file-definition input editing rule sequences.

By means of output formats, products such as reports, directories and catalogs can be produced by mapping file element values, and any other computed values, onto a two-dimensional array that can be output onto a terminal, line-printer, or full-face CRT. Output formats can make SPIRES data base contents acceptable for use by batch programs, or can arrange the same data so it can be easily understood by an untrained user.

The formats facility should be considered an integral part of the file/user interface. Database contents can be graphically organized so that information structures and hierarchies are easily recognized by any user. For example, a bibliographic entry in the CATALOG file can be output in the form of a library catalog card so that any user would easily recognize which elements are the author, title, and call number by their place on the output screen. Using another format, information in a card catalog file can be selected for printing in the form of a call number on a book's spine label. A data base of computer charges can be used to produce a billing letter for an individual user and reports of system-wide charges for an accountant. Just as in an outline, formats can make use of indentation to highlight hierarchical relationships of elements in a record. Where data is logically understood as a table, table formats can be devised. Text such as catalogs can be output in a form suitable for photocomposition.

A.2.2  SPIRES Processors and Programs

There are several SPIRES processors that are important to the file definer and manager.

Online SPIRES

The SPIRES that most users know, this is the program with which most file searching and updating is done.

Batch SPIRES

This is an offline version of SPIRES that allows users to indicate a series of SPIRES commands that are to be executed during non-prime time blocks. Large searches and reports are also aided by this facility, since they are not hampered by the active file and user core size restrictions that are applied to interactive SPIRES users. Use of this facility is made easier by the OFFLINE file: user commands are added as a record to this file, then executed after file updating is done by JOBGEN (see below).

SPICOMP

This is an online program that compiles file definitions, format definitions, variable group definitions and protocols, returning error messages if the compilation process is not successful. Users no longer need SPICOMP, since all the compilation functions mentioned are incorporated into SPIRES.

JOBGEN

This is a batch program run every night against any file that has records in the deferred queue and does not have the NOAUTOGEN option set. JOBGEN submits a batch SPIBILD job (see below) that causes the passing of deferred queue records to goal and index records overnight, during non-prime time blocks; the same passing can be done online by SPIBILD. The file owner can issue online commands that cause JOBGEN to pass over the file on certain nights, or can indicate in the file definition that JOBGEN is not to be run on the file.

SPIBILD

This program can be called in either an online or batch form to pass records in the deferred queue of your file to the goal record and index record data sets.
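
The relationship between the deferred queue, JOBGEN and SPIBILD can be pictured with a small model. The Python sketch below is purely illustrative and is not SPIRES code: the class ToyFile, its fields, and the functions spibild_process and jobgen_nightly are hypothetical names invented for this example, the sample titles are made up, and the index is reduced to a simple word-to-key mapping.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFile:
    """Hypothetical stand-in for a SPIRES file; not a real SPIRES structure."""
    name: str
    noautogen: bool = False                      # models the NOAUTOGEN option
    deferred_queue: list = field(default_factory=list)
    goal_records: dict = field(default_factory=dict)
    index_records: dict = field(default_factory=dict)

    def add(self, key, title):
        # New records wait in the deferred queue until they are processed.
        self.deferred_queue.append((key, title))


def spibild_process(f):
    """Pass queued records to the goal and index record sets (toy model)."""
    for key, title in f.deferred_queue:
        f.goal_records[key] = title
        for word in title.lower().split():
            f.index_records.setdefault(word, set()).add(key)
    f.deferred_queue.clear()


def jobgen_nightly(files):
    """Process every file that has queued records and permits automatic runs."""
    processed = []
    for f in files:
        if f.deferred_queue and not f.noautogen:
            spibild_process(f)
            processed.append(f.name)
    return processed


catalog = ToyFile("CATALOG")
private = ToyFile("PRIVATE", noautogen=True)
catalog.add(1, "Data Base Design")
private.add(1, "Owner Notes")

print(jobgen_nightly([catalog, private]))   # names of the files processed
print(len(private.deferred_queue))          # records still waiting in PRIVATE
```

In this model the overnight run processes only CATALOG. The record added to PRIVATE stays in its deferred queue until the owner calls spibild_process on that file directly, which mirrors the choice between overnight and owner-initiated processing described above.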

FASTBILD

This is a batch program that greatly reduces the CPU time and I/O's necessary for adding a large number of initial records to a new (empty) file. A protocol is available to generate the necessary JCL.

FASTADD

This is a batch program similar to FASTBILD: it provides a facility for adding a large number of records to a file using less CPU time and fewer I/O's than SPIBILD. Unlike FASTBILD, it can be used on files that already contain data. A protocol is available to generate the necessary JCL.

Host Language Interface (HLI)

This facility allows access to SPIRES data bases through batch programs in PL/I, COBOL, and FORTRAN. The batch programs can provide input and/or process output from SPIRES files.

A.2.3  The User Interface

A SPIRES user can access any data base permitted to him or her through a powerful set of English-like commands. The same command language applies to all SPIRES files, in keeping with the general-purpose intent of the SPIRES system. Though the number of commands is large, a searcher develops a feel for the grammar implicit in SPIRES commands. These commands allow you to enter and edit data, retrieve records, and display the records retrieved (see A.2.1).

At Stanford these command facilities are supplemented by the text-editor, WYLBUR, which provides commands for data entry, offline listings, and remote job entry. Terminal communication is the function of another service program called MILTEN. The timesharing monitor, called ORVYL, supports the file system in which SPIRES data bases are stored, and controls virtual memory and resource scheduling for interactive SPIRES users. These three subsystems, WYLBUR, ORVYL and MILTEN, are distinct from SPIRES, and SPIRES does not duplicate their functions. At other installations these companion services are provided by different text editors, communications controllers and timesharing systems.

The SPIRES commands used for searching and updating any data base are covered in a user's manual of less than 170 pages ("SPIRES/370 Searching and Updating"), and can be learned in a four-session course.

A.2.4  File Definition Concepts and Facilities

The primary data base unit familiar to SPIRES users is the "subfile." The subfile is a set of goal records optionally linked to index records. Speaking generally, we can say that a goal record is what you retrieve from a search request--a book in the CATALOG subfile, a restaurant in the RESTAURANT subfile, a file definition in the FILEDEF subfile.

The subfile name is specified in a section of the file definition, appropriately called the "subfile section," in conjunction with a list of the computer account numbers of users or groups of users for whom you specify various levels of access to your file. There are many levels possible. You may make the entire subfile available to the public for searching and updating (the RESTAURANT subfile is an example of this privilege level). You may make all records searchable and only some updatable (such as the PEOPLE subfile), or make only certain records available for search and update (such as the FILEDEF and FORMATS subfiles). You could even prevent a user from seeing, searching and/or updating certain elements in all of the goal records of a subfile. Of course, a subfile's use can be restricted to only one account (the file owner's) if you wish.

The file definer must be aware of the difference between a file and a subfile. The difference is best shown through examples: the DOCUMENTS, MASTERLIST, and MASTERLIST SHARE CODES subfiles are all part of a SCIP-maintained file of software resources; the MARC and CATALOG subfiles belong to a single file maintained by the BALLOTS Center.

It is often useful to put subfiles that share some common information together in a single file, allowing SPIRES to look up or cross-reference information in one subfile when it is needed for input or output operations in another. Information is not stored redundantly in each subfile, because look-ups between subfiles in the same file can be performed; such a method of linkage also makes file updating more convenient, since common information need only be modified once per file, not once per subfile. It might be useful to link together subfiles of student, teacher, and course goal records in a single file, looking up a student's identification number or name when printing out a teacher's course list, and looking up a course number or title when printing out a student's transcript.

Search and retrieval of a reasonable number of records (fewer than five million) through an index is accomplished in seconds, with a maximum of five disk accesses to retrieve records in a file of a million records. For most applications SPIRES uses a "B-tree" method of record access. The chapter on SPIRES File Structure contains more information on this. [See B.6.] Rapid retrieval is possible if the file definer has specified that selected information in the goal records be passed to index records. If an index was not built when records were first added to the file, it can usually be added easily to the file at a later date, by using the original goal records to pass the requisite information to that index. In cases where indexes have not been built, perhaps because the frequency of searching for a certain kind of information would not have warranted the cost of building or maintaining an index, the file can be searched sequentially for this information. It is also possible to obtain a subset of the file via an index search, then examine or process that subset sequentially (using global FOR). However, a sequential search is usually slower and more expensive than an index search.
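The access bound quoted above reflects how slowly the number of levels in a B-tree grows with file size. The following sketch is illustrative arithmetic only: the fan-out of 16 entries per block is an assumed figure chosen for the example, not a documented SPIRES parameter, and `levels_needed` is a hypothetical helper.

```python
def levels_needed(record_count: int, fanout: int) -> int:
    """Smallest number of tree levels whose capacity covers record_count.

    Each level multiplies the number of reachable records by `fanout`
    (an assumed branching factor, not a SPIRES constant).
    """
    if record_count < 1 or fanout < 2:
        raise ValueError("record_count must be >= 1 and fanout >= 2")
    levels, capacity = 1, fanout
    while capacity < record_count:
        capacity *= fanout
        levels += 1
    return levels


# With the assumed fan-out of 16, one million records fit in five levels,
# so a lookup would read at most five blocks.
print(levels_needed(1_000_000, 16))
```

A larger fan-out reduces the number of levels further, which is why packing many entries into each block matters for access speed.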

When records have been retrieved, they can be manipulated in several ways: they can be displayed at the terminal using any predefined output format; they can be put into a user's work-area or "scratch pad" for manipulation by the text-editor; they can be made available to a batch program; they can become a source for a new series of SPIRES commands, such as a sequencing command; or they can be used to generate reports.

Keeping information in the data base current is done by removing records from, adding records to, or updating records in the file. For adding and updating, records can be moved between a subfile and the text editor's work-area using simple commands. Data is collected or modified using the text-editor's commands. SPIRES also supports use of a CRT in full face mode.

The integrity of the data is constantly verified and protected by SPIRES. Redundant information stored in each file block ensures the validity of the data inside. Modifications made by users to the data base do not take place immediately (though they are immediately visible online). Changes are held until the deferred queue records are processed by JOBGEN overnight; this reduces the chance of accidental loss of data base contents as a result of a system crash or user error.
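The idea of changes that are visible at once but applied to the stored file later can be pictured with a small model. This is a conceptual sketch only and makes no claim about how SPIRES lays out its data sets; the class `DeferredFile` and its method names are invented for the illustration.

```python
class DeferredFile:
    """Toy model: updates go to a deferred queue and are merged later."""

    _REMOVED = object()  # marker for a pending removal

    def __init__(self):
        self.goal_records = {}  # committed data, keyed by record key
        self.deferred = {}      # pending changes, keyed by record key

    def update(self, key, record):
        self.deferred[key] = record

    def remove(self, key):
        self.deferred[key] = self._REMOVED

    def get(self, key):
        # Pending changes are visible immediately, ahead of committed data.
        if key in self.deferred:
            pending = self.deferred[key]
            return None if pending is self._REMOVED else pending
        return self.goal_records.get(key)

    def process_queue(self):
        # Stand-in for the overnight (or online) pass over the queue.
        for key, pending in self.deferred.items():
            if pending is self._REMOVED:
                self.goal_records.pop(key, None)
            else:
                self.goal_records[key] = pending
        self.deferred.clear()
```

Until `process_queue` runs, the committed records are untouched, so a failure during the day leaves the previous state of the file intact.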

Data can also be validated on entry by specifying certain tests that the input values must pass. These tests can be specified in the file definition, or in an input format or protocol used to add records to the file. They may be as simple as tests based on the number of occurrences of an element, an element's length, a required range for element values, or inclusion or exclusion of elements that contain certain characters. Editing of input data can also be specified: values can be converted to binary; personal names can be changed to a canonical form; the date or time of day can be supplied, and many other kinds of editing can take place. Similar to these input processing rules are index passing rules that specify how or what kind of information is passed from an element in the goal record to an element in an index. You can specify that groups of words, individual words or phrases, all words, or all words over a certain length be passed; you can also include or exclude a certain set of words, delete or nullify punctuation, or force capitalization, when an element or elements are passed to an index.
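The word-passing options described above can be sketched in a few lines of Python. This is not SPIRES passing-rule syntax; `pass_words` and its parameters are hypothetical names that only show the effect of passing individual words, dropping short words, excluding a word list, nullifying punctuation and forcing capitalization.

```python
import string


def pass_words(value, min_length=1, exclude=frozenset()):
    """Split a value into upper-cased index terms.

    Punctuation is replaced by blanks, words shorter than min_length
    are dropped, and words in the exclusion set are not passed.
    """
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = value.translate(table).upper().split()
    excluded = {w.upper() for w in exclude}
    return [w for w in words if len(w) >= min_length and w not in excluded]


terms = pass_words("The Art of Computer Programming, Vol. 1",
                   min_length=3, exclude={"the"})
print(terms)  # ['ART', 'COMPUTER', 'PROGRAMMING', 'VOL']
```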

A.3  The Process of Defining a File

A complete file definition is never easy. But if you approach the process in the order outlined below, trying not to define ahead of your comprehension, you will avoid general confusion. If you test your first file definition at the different stages or "test points" indicated, you will be able to locate problems more easily and review the materials in a particular section of this manual or ask the SPIRES consultant in User Services for help.

Experienced file definers go through all of the following steps, and usually in the order presented.

A.3.1  Design Analysis

Analyze your data base with respect to the following:

a) What are the elementary parts of each "entry" in your file?

An entry such as a restaurant in the RESTAURANT subfile is usually called a "record." A collection of logically related entries or records that are the goal of search requests in your file is called "goal records." For example, an entry or record for an individual restaurant in the RESTAURANT subfile is a goal record in that subfile.

The elementary parts of a record in the phone book are a name, an address, and a phone number. In a SPIRES file, each of these becomes an "element" and is assigned an "element name" in the file definition.

b) How many times does each element occur?

In most entries in a phone book, for example, each name has a single address and phone number. But if we were making a file of students and the courses they take, the "courses" element in the record might occur four, five, or six times, or perhaps not at all.

So, in your file perhaps, some elements must occur once or twice, some must occur at least once but perhaps many times, and some elements may be entirely optional. A file can have elements with any combination of these possibilities.

c) Do you know the length of some of the elements?

Elements like a telephone number can be fixed in length, just as a social security number can be. Elements like dates and most numbers are of fixed length if you tell SPIRES to change them to binary (a fixed internal form) before storing them. If some elements only have a limited number of values, you may want to have SPIRES turn the value into a fixed-length code.

The "length" of an element is always the length in bytes (or characters) as the value will be stored on disk. It is cheapest to process and store elements that are either fixed in occurrence (see "b" above) or fixed in length, or fixed in both occurrence and length. Elements that vary in either or both length and occurrence are more expensive to store and process. Optionally-occurring elements that may vary in length and occurrence are the most expensive to process and store.

d) Do some elements have only certain allowable values or forms?

SPIRES can be told to check the validity of input data if you can specify the criteria for validation. Elements like a phone number and a social security number have "-" in certain required places and are of a known length. Zip codes are of a certain length and contain only numerals. Course numbers at certain institutions might always be three letters, a blank, then three numbers. You can tell SPIRES to reject input to an "age" element when the value is greater than 100 or less than 0, or to reject input to a "number of children" element when the value is negative or greater than fifteen. You may want to supply automatically a default value if an element is not input, or override an input value if one is supplied: the date and time a record is added to your file can be supplied as input data by SPIRES.
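The kinds of checks just described can be written out as a short Python sketch. In SPIRES these tests are coded as processing rules in the file definition, not as program code; the element names AGE, CHILDREN, PHONE and COURSE and the function `validate` are assumptions made for the illustration.

```python
import re

PHONE = re.compile(r"\d{3}-\d{4}")       # for example 497-4420
COURSE = re.compile(r"[A-Z]{3} \d{3}")   # three letters, a blank, three digits


def validate(record):
    """Return a list of error messages; an empty list means the input passed."""
    errors = []
    age = record.get("AGE")
    if age is not None and not 0 <= age <= 100:
        errors.append("AGE must be between 0 and 100")
    children = record.get("CHILDREN")
    if children is not None and not 0 <= children <= 15:
        errors.append("CHILDREN must be between 0 and 15")
    for phone in record.get("PHONE", []):
        if not PHONE.fullmatch(phone):
            errors.append(f"PHONE {phone!r} is not in the form ddd-dddd")
    for course in record.get("COURSE", []):
        if not COURSE.fullmatch(course):
            errors.append(f"COURSE {course!r} is not in the form AAA ddd")
    return errors
```

For example, `validate({"AGE": 101, "CHILDREN": -1})` reports two errors, while a record whose values all pass returns an empty list.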

e) Do some elements occur with other elements in a "structure"?

Elements that are grouped together form a "structure." Taken together, a street address, city, state and zip-code might be called an "address"; in a file in which several addresses occur in each record (perhaps a home address and business address in a mailing list file) there must be a way to associate or bind the first city input with the first state and zip-code input, and the second city with the second state and zip-code. This logical binding of different elements is a structure. The university affiliation of a professor may always be paired with his or her name in a subfile of conference participants, for example. Or, a job-code might always occur with a salary figure in a record of a person's employment history.

Structures can be nested within structures: the record of a student's grades for a single term, making up a course-number/grade structure, could itself be a structure that occurs several times in a goal record that is a student's transcript for several terms of work.
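The transcript example can be pictured with nested data types. The sketch below uses Python dataclasses only to show the nesting; the class and field names are invented, and the actual coding of structures in the file definition language is covered in Part B.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CourseGrade:   # innermost structure: one course and its grade
    course_number: str
    grade: str


@dataclass
class Term:          # a structure that contains another structure
    term_name: str
    courses: List[CourseGrade] = field(default_factory=list)


@dataclass
class Transcript:    # the goal record
    student_id: str
    terms: List[Term] = field(default_factory=list)


transcript = Transcript("1234", [
    Term("Autumn", [CourseGrade("CSC 101", "A"), CourseGrade("MAT 041", "B")]),
    Term("Winter", [CourseGrade("CSC 102", "A")]),
])

# Each grade stays bound to its own course, and each course to its own term.
winter_grades = [(c.course_number, c.grade) for c in transcript.terms[1].courses]
```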

f) What single element is best suited to be the key of the record?

The key of the record is a unique identifier by which one record can be distinguished from all other records. The key must be chosen carefully, since it has many consequences: the element designated as the key must only occur once in each record and the value for that key must be unique among all the records in the subfile.

In a file with a goal record of employee data, the name of a single employee is not likely to be unique, but his or her social security number is unique. So the social security number may be the best choice for the key. In a file whose goal record consists of items ordered for a store, you might be tempted to use a purchase order number as the key. But if more than one item were listed on a purchase order, then the goal record, which is the result retrieved from a SPIRES search (see "a" above), could not be an individual item; it would have to be a purchase order.

If you don't have an element that can be the key, that is, an element whose uniqueness could be guaranteed by the nature of your data, you can have SPIRES assign an integer or "slot" key for each added record; this technique is used by the RESTAURANT subfile, a "slot" subfile. SPIRES simply assigns the first record the key "1", the second record the key "2" and so on.

An "augmented key" can also be coded if you must use a non-unique key. SPIRES will simply place a suffix on any non-unique key you enter; this suffix will make the key unique. This technique is useful when personal names are used as keys, or when accession numbers are being assigned.
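The two key-assignment techniques can be modeled as follows. This is a sketch under stated assumptions: the class `KeyAssigner` is hypothetical, and the "-n" suffix format is invented for the example, since the form of the suffix SPIRES actually appends is not described here.

```python
class KeyAssigner:
    """Toy key assignment: slot keys and suffixed ("augmented") keys."""

    def __init__(self):
        self.next_slot = 1
        self.used = set()

    def slot_key(self):
        # Slot keys are simply consecutive integers: 1, 2, 3, ...
        key = self.next_slot
        self.next_slot += 1
        return key

    def augmented_key(self, value):
        # A value not seen before is used unchanged.
        if value not in self.used:
            self.used.add(value)
            return value
        # Otherwise add a suffix (invented "-n" format) until it is unique.
        n = 2
        while f"{value}-{n}" in self.used:
            n += 1
        key = f"{value}-{n}"
        self.used.add(key)
        return key
```

Entering the name SMITH three times through `augmented_key` would yield SMITH, SMITH-2 and SMITH-3 in this model.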

A.3.2  File Definition

Step One

Define your goal record elements using the file definition language to describe the data characteristics you determined in the above steps. [See A.3.1.] The language for goal record definition is described in the first three chapters of Part B, "Goal Record Concepts and Definition," "Goal Record Keys, Slot and Removed Records," and "Structures." [See B.1, B.2, B.3.]

Step Two

Add processing rules to the description of each element. These processing rules or "actions" are called INPROCS or OUTPROCS depending upon whether they affect the input or output of an element. Study "Processing Rules: INPROC, INCLOSE, OUTPROC" [See B.4.] and become familiar with the use of the appendices "Processing Rules: Complete Listing by Number," "Quick Reference to Processing Rule Functions by Number," and "Quick Reference to Processing Rules by Function-Keyword." [See D.1, D.2, D.3.]

Step Three

Test your basic goal record description. To do this you must first study "The FILEDEF Subfile and File Compilation." [See B.5.] Then:

Two additional chapters may be of help at this point: "File Definition Syntax and Semantics" and "Recompile of an Existing File's Definition." [See D.4, C.1.]

Step Four

Study "File Structure: Tree and Slot, Goal and Index Records" [See B.6.] in order to understand the structure of the ORVYL files that SPIRES has created according to your file definition. That study prepares you for defining your file's index records.

Step Five

Study "Understanding and Coding Index Records" [See B.7.] for an understanding of the various indexing techniques and when to use each: Simple Indexes, Qualifiers, Sub-Indexes and Compound Indexes. In that chapter you will learn how to describe the structure of the "index records" which are associated with the "goal record" you defined and tested previously.

Step Six

Study "Understanding and Coding the Linkage Section" [See B.8.] so you can use the file definition language to describe:

Step Seven

Make use of the appendices "A Guide for Coding Index Record Definitions" and "A Guide for Coding the Linkage Section" [See D.5, D.6.] to code the index and linkage sections of your file. These appendices provide (almost) guaranteed recipes for these two difficult-to-code parts of a file definition. You determine the indexing techniques you need, and use the recipes for index and linkage sections that are indicated.

Step Eight

Transfer your goal record file definition and add the index and linkage sections to it, update the definition, then erase and compile your file. (Before this, you may want to save any data that you have entered.) Use the online SPIBILD processor to pass information from the deferred queue and goal records to the index records you just defined, and build the indexes and goal records into a searchable file. Use the SPIRES searching and browsing commands to check your SRCPROCS and PASSPROCS, verifying that the indexes contain the information you intended.

Step Nine

Make use of the file definition language described in "Defining Subfile Privileges" [See B.9.] to specify accounts or groups of accounts that can: use the file, search it, update it, see only some elements, and update only some elements. Modify your file definition, adding this new code, and recompile it. You will be able to verify that you specified the correct privilege codes after the system FILEDEF file has been updated--that is, the day after you make any changes.

Step Ten

Make use of the file-manager commands described in "SPIRES File Management" [See B.10.] to monitor and control the status, activity and processing of your file.

A.3.2.1  The File Definer: A SPIRES Subsystem to Simplify File Definition

A subsystem of SPIRES, named the File Definer, can simplify the process of file definition. By using a concentrated language, based on a subset of the standard file definition language discussed in this manual, you can specify basic information about the file design, such as element names, which elements should be indexed, etc., and the File Definer will generate a complete file definition for you, saving you the trouble of writing and coding goal and index records and the linkage section.

The File Definer subsystem is available only when you are in SPIRES; to use it, you issue the SPIRES command ENTER FILE DEFINER. Below is an example of a sample File Definer session:

The five lines of input shown would be used by File Definer to generate a file definition of about 30 lines, including a goal record definition with the elements NAME, PHONE and COMMENT, an index record definition for the NAME element, a linkage section and a subfile section. That is much simpler than coding the record definitions and linkage sections yourself. [See B.7, B.8.] The file definition generated may then be added to the FILEDEF subfile and compiled.

Most people writing file definitions will want to use the File Definer at some point because it relieves them from the tedious task of coding index record definitions and linkage sections. Even if you cannot code the entire definition using File Definer (it has some limitations, e.g., you cannot directly code SEARCHTERMS, SRCPROC and PASSPROC statements), you can use it to create a file definition ranging from skeletal to almost complete for any file.

Naturally, there is a great deal of educational value in writing an entire file definition (goal records, index records, linkage section and all) yourself, especially if you want to learn and understand SPIRES file structure. However, letting the File Definer do the tedious work and studying the file definition it generates can be educationally rewarding as well, especially if you do so as you read this manual.

The File Definer has its own reference manual, entitled "File Definer", which is written for people already familiar with the concepts and language of file definition as taught in this manual. A primer to the File Definer, aimed at people who primarily want to create a SPIRES file quickly but who are unfamiliar with file definition concepts and language, may be found in the SPIRES primer "A Guide to Data Base Development".

A.4  Glossary of Important File Definition Terms

A.4.1  Element

A data "element" is the smallest unit of named data known to SPIRES. Data elements (or "fields" as they are called in other systems) may consist of characters, numbers or bits; they may be fixed in length or varying in length. They may also be required to occur more than once or be completely optional. Elements are things such as a person's name, a social security number, a salary, or an abstract of an article.

A.4.2  Record

A "record" consists of a series of data elements and their values. Usually, the record is a collection of all the data elements that pertain to a single entity in the entire collection of data. Thus, a record could be made up of one person's name, address, social security number, and salary. Another record in the same collection of data would have the same elements, but for a different person.

A.4.3  Structure

Within a record, elements may be grouped together in "structures," which are referenced in the same manner as elements. For example, if a person has several offices and phone numbers, the office and phone number elements might be grouped or paired together in a structure to keep the proper phone number associated with an office.

Elements that are not in structures are called "record level" elements. Elements inside a structure are "lower level" elements with respect to the record level elements. Structures that are not inside of other structures are record level structures. Structures inside of another structure are lower level structures with respect to the containing structure.

A.4.4  Key

Each record in a collection of data maintained by SPIRES has a required singularly occurring data element known as the "key." Each key within a collection of goal records must be unique to the goal record--no two records in the same goal record collection can have the same key. The key element in a personnel goal record would probably be the social security number, since it will be unique for each person. If the data records themselves do not contain unique elements that are useable as keys, SPIRES can supply unique consecutive numbers as the values for the keys in a set of goal records.

A.4.5  Goal Record

"Goal Record" is the SPIRES term for a data record that could be found as a result of a SPIRES search operation. In a collection of data about restaurants, the goal records would probably be restaurants; in a collection of data about library holdings, a goal record would probably be a book. A single record retrieved from a SPIRES search is a goal record. All of the records that have the same structure as the retrieved record or records, and hence "could" have been retrieved by a search, are referred to collectively as "the goal record" or "the goal record data set."

A.4.6  Index Record

An "index record" consists of a series of data elements and their values, just as a goal record does. However, in index records one of the data elements contains as its value an internal pointer (or pointers) to a goal record (or records). An index built out of names in the goal record would contain one index record for each name that occurs in the goal record as well as a pointer to the goal record (or records) in which the name occurs. The user has no direct interaction with indexes, though they are used by searching commands.

A.4.7  Record-type

A "record-type" must be distinguished clearly from a "record." A record-type refers to a collection of records, and may refer to either goal records or index records. There are "goal record record-types" and "index record record-types." The record-type is a collection of records that all have the same structure. In a personnel file, the goal record of social security numbers, names and salaries makes up one record-type, while a name index and a salary index are two other record-types.

A.4.8  Index

An "index" is a collection of index records created and maintained by SPIRES; one usually does not manipulate them directly. Indexes act as a "go between" between a searching command and the goal records. The values in a search request are looked up in the index, and the values in the index point to particular goal records. This is similar to the index one might find at the end of a book: such an index contains words or concepts, and each word or concept has a list of the pages on which it occurs.

There are two types of indexes available: simple and compound.

A.4.9  Simple Index

A "simple index" (or more specifically, a simple index record-type) contains one record for each entry in the index. For each unique name in a personnel file, there is one record in a NAME index that points to each goal record containing that name. The key of a simple index is the thing being indexed, a person's name, for example. Simple indexes are cheaper to search than compound indexes.
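A simple index behaves much like a lookup table from each indexed value to the goal records that contain it. The sketch below uses a Python dictionary to show the idea; the record keys, the NAME values and the helper `find_name` are made up for the example, and the "pointers" are modeled as goal record keys.

```python
from collections import defaultdict

goal_records = {
    "E001": {"NAME": "SMITH"},
    "E002": {"NAME": "JONES"},
    "E003": {"NAME": "SMITH"},
}

# One index record per distinct name; its value is the list of pointers
# (here, goal record keys) to the records containing that name.
name_index = defaultdict(list)
for key, record in goal_records.items():
    name_index[record["NAME"]].append(key)


def find_name(name):
    """Look the name up in the index, then follow the pointers."""
    return [goal_records[k] for k in name_index.get(name, [])]
```

Three goal records produce only two index records here, because SMITH occurs twice and both pointers are held in the same index record.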

A.4.10  Compound Index

A "compound index" may index several elements, and is usually used to index short numeric values or coded elements such as salaries and dates. In a compound index, there is one record for each element being indexed. If a personnel file has the elements salary, job-class, and date-hired all in a compound index, then the compound index will have three records (one for salary, one for job-class, and one for date-hired). Each record in the compound index contains all the values that exist in the goal records for a particular element; this is why compound indexes are not recommended for large files -- the compound index records become too large to be searched quickly. Compound indexes may be searched with all of the relational operators, but are more expensive to search than simple indexes.
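The contrast with a simple index can be seen in a small model in which each indexed element gets exactly one index record holding all of that element's values. This is a conceptual sketch only; the sample data and the helper `search_greater` are hypothetical.

```python
goal_records = {
    1: {"SALARY": 30000, "JOB-CLASS": "A2"},
    2: {"SALARY": 45000, "JOB-CLASS": "B1"},
    3: {"SALARY": 30000, "JOB-CLASS": "B1"},
}

# One index record per indexed element; each holds every value of that
# element together with pointers to the goal records that contain it.
compound_index = {"SALARY": {}, "JOB-CLASS": {}}
for key, record in goal_records.items():
    for element, values in compound_index.items():
        values.setdefault(record[element], []).append(key)


def search_greater(element, threshold):
    """Relational search: scan all values held in one index record."""
    hits = []
    for value, keys in compound_index[element].items():
        if value > threshold:
            hits.extend(keys)
    return sorted(hits)
```

Because every search scans the whole record for an element, the cost grows with the number of distinct values, which matches the caution above about large files.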

A.4.11  Combined Record-type

Combined record-types are record-types that are stored in the same ORVYL file. The file owner specifies in the file definition which record-types are to be combined together; if no combinations are defined, then each record-type occupies its own ORVYL file. There is a system limit of 13 physical record-types, so if a goal record is to have more than eight indexes, some of them must be combined into the same ORVYL file. SLOT record-types may not be combined with any other record-type. Record-types that are physically combined are kept conceptually separate by the logic built into SPIRES. Record-types that are physically combined with each other are different "logical record-types," even though they may occupy the same "physical record-type" (the same ORVYL file). There is a system limit of 64 logical record-types.

A.4.12  Removed Record-type

A "removed record-type" has nothing to do with records that have been deleted from the file with the REMOVE command. Removed record-types may provide increased access efficiency for some data. SPIRES access efficiency depends on a large number of records being packed in a single file block. SPIRES provides the file definer with an option of keeping only the key of a record in the file block, plus a pointer to the remainder of the record's data. The remainder of the data is kept in the "residual data set." This is called "record removal" and allows many more (partial) records to be kept in a file block than if whole records were kept intact.
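Record removal can be pictured as splitting each record into a small entry kept in the file block and a body kept elsewhere. The model below is a sketch only: a Python list stands in for the residual data set, a dictionary stands in for a file block, and the names `add_removed` and `fetch` are invented.

```python
residual = []   # stand-in for the residual data set
block = {}      # stand-in for a file block: key -> pointer


def add_removed(key, data):
    """Store the data in the residual set; keep only key + pointer in the block."""
    residual.append(data)
    block[key] = len(residual) - 1


def fetch(key):
    """Follow the pointer from the block entry to the rest of the record."""
    return residual[block[key]]


add_removed("A100", {"TITLE": "A long abstract ..."})
add_removed("A200", {"TITLE": "Another long abstract ..."})
```

The block holds only short key-and-pointer entries, so many more of them fit in one block than whole records would.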

A.4.13  Subfile

A "subfile" is defined as one set of goal records, the indexes to those goal records, and the access and update restrictions that apply to the data elements. Among the record-types that are brought into association in a subfile, a clear distinction is made between goal records and index records, since the user can only manipulate goal records, not index records. If a goal record has no indexes built for it, then the subfile consists of only the goal record and the access restrictions to it.

A.4.14  File

Several subfiles may relate to the same data and may be placed in one "file" or data base. A file thus contains all the subfiles that relate to the same data. A user can only work on a single subfile at a time, even though there may be several subfiles defined in one file. It is also possible (and is frequently the case) that only a single subfile is contained in a file.

A.4.15  Hierarchy of File Definition Components

The following chart shows the relationship of the parts of a file. The chart depicts a single file, with two subfiles. The first subfile has three record-types: one goal record and two index records. The second subfile has only a single record-type, the goal record.

The goal record of the first subfile is composed of a record with a key and several elements. One of the elements is a structure, and is thus itself composed of elements. The first index record of the first subfile is a record with only a key and a pointer.

              ------  ----------   -------------    -----
                                   | goal record-> | KEY
              |       |            |               | ELEMENT
              |       | "goal      |               | ELEMENT     ---
              |       |  record    |               |    :        |
              |       |  record-   |               | STRUCTURE---|ELEMENT
              |       |  type"     |               | ELEMENT     |ELEMENT
              |       |   or    -> |               |    :        |--:
              |       | "goal      | goal record   |
              |       |  record    | goal record   |
              |       |  dataset"  |      :        ---
              |       |            |      :          :
              |       |            |-----------      :
              |   S   |            ------------
              |   U   |            | index record  |---
              |   B   | "index     |             ->| KEY
              |   F ->|  record    |               | POINTER
              |   I   |  record- ->|               |----
      FILE -->|   L   |  type"     | index record  [
              |   E   |   or       | index record  [
              |       | "index"    | index record  [
              |       |            |      :
              |       |            |-------------
              |       |
              |       |             -------------
              |       | "index     |
              |       |  record    |
              |       |  record-   |
              |       |  type"   ->|   as above
              |       |   or       |   for
              |       | "index"    |   index-record
              |       |            |   record-type
              |       |            |
              |       -----------  -------------
              |
              |       -----------   -------------
              |       |             |
              |       | "goal       |
              |       |  record     |
              |   S   |  record-    |
              |   U   |  type"    ->|  as above
              |   B ->|    or       |  for
              |   F   | "goal       |  goal
              |   I   |  record     |  record
              |   L   |  data set"  |  record-type
              |   E   |             |
              |       |             |
              ------- -----------   ------------

B  Defining a SPIRES File

B.1  Goal Record Concepts and Definition

To begin our consideration of goal record definition, let's take a telephone directory as our example; the structure of a directory is something with which we are all familiar. Certain assumptions we are going to make for our directory file will simplify its structure.

What information is stored in a telephone directory? Usually, and most simply, a name, address, and telephone number make up each entry.

B.1.1  Element Names, Occurrences, and Lengths

If we have a single "record" or entry that consists of the three elements, name, address, and phone number, how many times will each element occur in a single record? Let's look at the question this way:

Most likely (in the simplest case), the name and address elements will occur once and only once: for each name there is one address. But it would not be unusual for a person to have several phone numbers, so we don't know how many phone numbers to expect or allow for.

Now, what can we say about the length of each of these elements? Here is a review of what we have so far:

We really don't know the length of the longest possible name and address. We could probably specify a length that couldn't be exceeded, but SPIRES does not require us to. If you do not specify a length, SPIRES stores only the length of the value input, plus two bytes of information about the length. Let's not specify a length for the NAME and ADDRESS elements.
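The storage cost of leaving the length unspecified is easy to work out. The sketch below is simple arithmetic based on the two-byte figure given above; `stored_bytes` is a hypothetical helper, the sample name and the fixed length of 20 are made up, and the model assumes one byte per character.

```python
LENGTH_OVERHEAD = 2  # bytes of length information, per the text above


def stored_bytes(value, fixed_length=None):
    """Bytes used for one value; fixed_length=None means no length was coded."""
    if fixed_length is not None:
        return fixed_length
    return len(value.encode("ascii")) + LENGTH_OVERHEAD


print(stored_bytes("JOHN DOE"))                   # 8 characters + 2 = 10
print(stored_bytes("JOHN DOE", fixed_length=20))  # always 20
```

In this model a short name costs far less when no length is coded than it would with a fixed length large enough for the longest possible name.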

The question of the length of the phone number requires some decision; let's agree that a phone number is an eight character (or eight "byte") value, such as "497-4420". (If we wanted to include the area code with each number, then the value is thirteen bytes long: "(415)497-4420" for example.) Our "file definition" now looks like this:

B.1.2  File and Record Name Statements

Let's see what this file will look like in the SPIRES file definition language. The first thing we must "code" or specify is the name of the file. This name is always an alphanumeric string preceded by the file definer's account number in the form GG.UUU. This account becomes the only account that by default can modify or compile the file definition. The file name is coded first in the definition, and looks like this:

Here, GG.UUU is the account, and "DIRECTORY" is the name chosen for this particular file. The file name (including the account) may be up to 23 characters long (longer names will be truncated by SPIRES). No one but the file owner need ever see this name. This is not the name used to select the subfile.

A file consists of sets of records; each set is called a "record-type." Most often there is a goal record record-type and several index record record-types per subfile. To simplify our discussion at this point, we will call a record-type a "record." (Though this is not true, strictly speaking. Many goal-records make up a single goal-record record-type. [See A.4.] Each of these records has a unique name.) The goal record is often called "REC01", and its name is coded
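
A sketch of that statement, in the form quoted later in this chapter:

  RECORD-NAME = REC01;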

We will see later why this name is most common, and some circumstances in which you might want to choose a different name. [See B.2.2.]

B.1.3  Element Categories

Within each record we define that record's elements. All of the elements must be in one of three categories: FIXED, REQUIRED, or OPTIONAL. Elements are segregated into these categories by their occurrence and length attributes as follows:

Notice that "Fixed" and "Varying" for FIXED and REQUIRED refer to the length attribute of the element being defined, not its occurrence attribute.

The length of an element in the FIXED section of the record definition must be specified. If the occurrence is not specified for an element in the FIXED section, then the number of occurrences is assumed to be one. On the other hand, an element defined in the REQUIRED section of the record definition need not have either length or occurrence attributes specified; the element must occur--but its occurrence and length may vary from entry to entry. Elements in the OPTIONAL section need not have either length or occurrence specified, because that element may or may not occur in a given record.

If you do specify an occurrence for an element in the OPTIONAL section or the REQUIRED section, it has a special meaning, which is different from an occurrence specification for a FIXED element. If the number of occurrences is one, then the element is "singularly occurring": for a REQUIRED element, this means that it must occur once and only once in each record; for an OPTIONAL element, this means that if it occurs at all, it can occur only once. If the number of occurrences is more than one, then the element is multiply occurring: for a multiply-occurring REQUIRED or OPTIONAL element, the number of occurrences is not checked. You may have SPIRES do a minimum and/or maximum occurrence check by specifying certain processing rules, usually A123 and A146. [See B.4.14.]

A small amount of storage space is saved when REQUIRED or OPTIONAL elements are specified as singularly occurring.

Though not required for a valid file definition, an OPTIONAL section with a dummy element should be coded in every file or record definition for which an OPTIONAL section would not otherwise occur. By coding this section you have the flexibility to add elements to the record definition even after data has been stored in the file. Such elements are always added to the OPTIONAL section; the dummy element is never coded with length and occurrence attributes. [See B.1.7.] No more than 254 elements may be coded in an OPTIONAL section.

Remember that we decided "PHONE" is fixed in length, but is not fixed in the number of times it can occur, though it must occur at least once. Elements that must occur, but for which a firm occurrence count can't be specified, are placed in the REQUIRED section, even if they can be fixed in length.

We code the category name at the head of the list of elements it describes:
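
The sketch below is illustrative only. ZIP-CODE is borrowed from an example later in this manual, PHONE-NUMBER is the longer name used later for the phone element, and the ELEM, OCC and LEN statements are explained in B.1.5.

  FIXED;
    ELEM = ZIP-CODE; OCC = 1; LEN = 5;
  REQUIRED;
    ELEM = PHONE-NUMBER; LEN = 8;
  OPTIONAL;
    ELEM = DUMMY;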

Note the order in which the categories must appear: FIXED, then REQUIRED, then OPTIONAL.

If you do not code any categories, all the elements will be OPTIONAL, with the exception of the key, which will be REQUIRED. [See B.1.4.]

B.1.4  Record Keys

In addition to placing each element in an appropriate category, we must choose an element that will be the "key" of the record. A key is required for every record (whether goal or index) defined in the file; it must be unique in value and occur only once in each record.

Now, in our phone directory, we would most likely pick the name as the key. What consequences does this have? The key of a certain record is a unique value for that record; no two records or entries in the file can have the same value for the key element. Thus, no two records in our telephone book could have the same name. (Here is where we allow ourselves to simplify with the assumption that no two people in our phone directory will have the same name. This would not be a realistic assumption for a real phone directory. The solution to this problem is found in the next chapter, "Goal Record Keys, Slot and Removed Records.")

In addition to being unique in value among the goal records, the key must always be singly occurring; that is, the occurrence attribute of the key must be one. For this reason, an occurrence number need not be specified for the key element. A key may be varying in length, such as the name in our phone directory. But a length attribute may be specified if it is known. In our phone directory, NAME would be coded as the key element as follows:
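
A sketch of the key statement, assuming no length is specified for NAME:

  REQUIRED;
    KEY = NAME;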

The key element is always coded as the first element in the category in which it is defined. Since the key must be singly occurring, but may be fixed or varying in length, it is coded as the first element in either the FIXED or REQUIRED categories.

Let's review our definition:
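
Pieced together from the statements discussed so far, the definition at this point might look like the following sketch:

  FILE = GG.UUU.DIRECTORY;
  RECORD-NAME = REC01;
    REQUIRED;
      KEY = NAME;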

Since we don't have any elements in the FIXED section, we don't code it.

B.1.5  Element Name, Occurrence and Length Statements

The next element to code is ADDRESS. For this element we can specify that it must occur once and only once, but we can't specify a length attribute. The name of an element is specified in the ELEM statement. (You are allowed to use ELEMENT instead of ELEM; however, SPIRES will change it to ELEM when you add it to the FILEDEF subfile later.) The occurrence attribute of an element is specified in the OCC statement. (Similarly, OCCURS, OCCURRENCE and OCCURRENCES may be used instead, though they will be changed to OCC.) The ADDRESS element would be coded in the REQUIRED section as follows:
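
A sketch of the ADDRESS element, using the "OCC = 1" form quoted in the text and the same layout as the record definitions shown in B.2.1:

  REQUIRED;
    KEY = NAME;
    ELEM = ADDRESS; OCC = 1;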

If we had not specified "OCC = 1", then ADDRESS could occur one or more times. (Since it is coded in the REQUIRED section, it must occur at least once if the occurrence attribute is not coded.)

We now must code the phone number element. For this element we can only say, "it must occur." We don't know how many times. In such a case, "OCC = 1" is not coded since this would limit the element to one and only one occurrence. We have decided that the length of the phone number in bytes (characters) as it will be stored on disk is eight characters. The length attribute of an element is specified in the LEN statement. (LENGTH is also allowed but will be changed to LEN.) Since it may vary in occurrence, the phone number element is coded in the REQUIRED section thus:
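
A sketch of the phone number element, with a length but no occurrence attribute:

    ELEM = PHONE-NUMBER; LEN = 8;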

Remember that the length attribute, coded by "LEN =", is the length as the value will be stored on disk, which is not necessarily the length of the value as input when the record is added; processing rules, called "actions", can manipulate the input values.

If we specify "LEN = 8" for the PHONE-NUMBER element, then all element values stored on disk will be eight bytes long. If a value is input that is longer than eight characters, the record will be rejected for input, and an error message will be issued. If a value is input that is shorter than eight characters, SPIRES will pad it with blanks to a length of eight bytes. [To allow null values for an element's input, omit the LEN statement; otherwise, SPIRES will fill the entire length with blanks, which is not the same as a null value.] Manipulation of input values can be effected more intelligently when "actions" are coded. [See B.4.]

Embedded blanks are not permitted in element names; the special characters ".", "_", "-", and "$" are allowed, though "-" is not allowed as an element name by itself; it may be embedded within a name, however. [See "SPIRES Searching and Updating", section D.1.3.1, for more information about the "-" or "throw-away" element.] The length of an element's name is limited to sixteen characters.

B.1.6  Element Aliases

Long element names are often advisable for clarity, since the value coded in the ELEM statement is the name of the element used when records are displayed.

However, it is not convenient or sensible to enter a twelve-character element name for an eight-character value: "PHONE-NUMBER = 497-4420;". SPIRES allows you to give an element a long, descriptive name such as "PHONE-NUMBER" and refer to it by several other names, such as "P" or "PN". The file definer must indicate what these other names can be by coding "aliases" for the element names in the file definition. Here is how aliases are coded:
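
A sketch using the three aliases discussed in this section. The comma-separated list on the ALIASES statement is an assumption about the exact syntax.

    ELEM = PHONE-NUMBER; LEN = 8;
      ALIASES = P, PN, NUM;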

A phone number can now be entered by "P = 497-4420;" or, more simply, by "P 497-4420;" (since the "=" is optional). We have also allowed the aliases "PN" and "NUM", which are mnemonically more significant than the terse "P". No two elements can have the same alias at the record level or in a structure.

B.1.7  Dummy Elements; Comment Statements

Since we have no OPTIONAL section, we should code an "empty" OPTIONAL section with a single dummy element. (This will allow us to add elements to the record definition at a later date without invalidating data already stored.) This section is coded as follows:
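
A sketch of the empty OPTIONAL section, matching the form used in the record definitions in B.2.1:

  OPTIONAL;
    ELEM = DUMMY;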

An item called "COMMENTS" may be coded for any element you define; no single comment can be longer than 1,024 characters.

Let's look at the record definition we have coded, adding aliases where they are useful:
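
The following sketch assembles the statements discussed so far. Only the PHONE-NUMBER aliases named in the text are shown; aliases for other elements would be coded the same way. The comma-separated ALIASES list is an assumption about the exact syntax.

  RECORD-NAME = REC01;
    REQUIRED;
      KEY = NAME;
      ELEM = ADDRESS; OCC = 1;
      ELEM = PHONE-NUMBER; LEN = 8;
        ALIASES = P, PN, NUM;
    OPTIONAL;
      ELEM = DUMMY;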

B.1.8  Optional Statements: AUTHOR, MAXVAL, NOAUTOGEN and BIN

Other statements can be coded that will make the file definition more complete: AUTHOR, MAXVAL, NOAUTOGEN and BIN. They are coded after the FILE statement, which is the first statement in our definition.

B.1.8.1  (The AUTHOR Statement)

It is important to specify the AUTHOR statement in your file definition. In case it is necessary for the data base systems staff to contact you, the AUTHOR statement should supply the necessary information. This element is usually coded after the FILE element, and is a free-form text string:
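
A sketch of the statement. The name, address and phone number shown are hypothetical placeholders, not values from any real definition.

  FILE = GG.UUU.DIRECTORY;
    AUTHOR = J. Smith, Polya Hall 117, 497-4420;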

B.1.8.2  (The MAXVAL Statement)

Another file-level element is necessary for some applications, particularly those involving long text strings such as bibliographic and abstract files. If any element values in your file will be longer than 4,096 bytes, you must code the following in your file definition:
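
A sketch of the statement, where "nnnnn" stands for the maximum value length you choose (the placeholder is illustrative, not a literal value):

  MAXVAL = nnnnn;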

The value specifies the maximum data length for any single occurrence of an element in the file. MAXVAL cannot exceed 32,760. Also, no single record in a SPIRES file can be more than 120,000 characters long.

The MAXVAL limit also applies to values processed by actions A44 and A48 and by the SET VALUE Uproc in Userprocs. [See C.11.1.1.]

B.1.8.3  (The NOAUTOGEN Statement)

An optional element "NOAUTOGEN;" may be coded in your file definition if you do not want SPIBILD automatically (i.e., nightly) to pass records from the deferred queue to the goal and index records. Normally, every night that there are records in the deferred queue, JOBGEN will generate a job to build or process them into the goal and index record data sets. Every time this job runs, a certain amount of overhead for job scheduling and initiation is incurred. With only a small number of records to be processed (say, fewer than 5), this overhead is a significant percentage of the job cost.

However, if NOAUTOGEN is coded, you must explicitly cause this job to be submitted by issuing the online SET AUTOGEN command, perhaps after allowing several records to accumulate in the deferred queue. JOBGEN will generate a SPIBILD job that night, and then reset the file to the NOAUTOGEN condition. If NOAUTOGEN is not coded, then you must take specific action to prevent overnight processing; SET NOAUTOGEN can be issued to prevent the generation of this job until you explicitly SET AUTOGEN in SPIRES or PROCESS the file in SPIBILD.

B.1.8.4  (The BIN Statement)

You may code the bin number to which you wish output from SPIRES-generated jobs to be sent. Output from compilations and automatic file building (JOBGEN) will go to the bin specified; if no bin is coded, then such output will be directed to the default bin of the file owner.

If you code PURGE for the bin, then the output will be purged if there were no batch requests processed by SPIBILD and if no errors occurred during SPIBILD processing. Otherwise, the output will be sent to the file owner's default bin. Coding PURGE is recommended because it generates output only in the event of a SPIBILD problem or a batch request, thus saving you printing charges.

If you code HOLD for the bin, the output will be directed to the default bin of the file owner but the output will be held. The file owner can fetch the output and then either purge it or release it for printing.

The bin is coded in your file definition like this:
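
A sketch of the statement, assuming it follows the same "name = value;" form as the other file-level statements:

  BIN = nnn;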

where "nnn" is the number of the bin or HOLD or PURGE as described above.

B.1.9  Statements in the Subfile Section

Though our goal record definition is now complete, there are several other things that must be coded to complete the definition of the file itself. (Remember that a file definition usually, but not necessarily, contains several record definitions.)

As noted earlier, the file name is almost never seen by the user; what the user sees is the subfile name, which is coded as the first statement in the "subfile section" of the file definition. The subfile section (or sections) follows at the end of the last record description.
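
A sketch of the subfile name statement. The name PHONE DIRECTORY is a hypothetical choice for this example.

  SUBFILE-NAME = PHONE DIRECTORY;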

Embedded blanks are allowed in the subfile name. Since this name is typed in a SELECT command, it should not be very long or otherwise difficult to type. The maximum length for a subfile name is thirty-two characters, including blanks.

The second statement in the subfile section identifies the record that will be the goal record when the subfile is selected. Because we only have one record name for our single record definition, this may seem redundant. But since most subfiles have multiple record descriptions--usually one goal record and several index records--SPIRES must be told explicitly which record is the goal record. This statement is coded as follows:
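
A sketch of that statement, assuming it is named GOAL-RECORD and takes the record name as its value:

  GOAL-RECORD = REC01;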

Remember that "REC01" was the name of the record we described and named by the statement "RECORD-NAME = REC01;".

Now we must specify what accounts are permitted to select the subfile whose name is given by the "SUBFILE-NAME" value immediately preceding.
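
A sketch of the statement, naming only the file owner's account:

  ACCOUNTS = GG.UUU;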

This permits access to the subfile only to the account specified. At a minimum, the file-owner's account should always be specified; if it is not, then the file owner must issue the ATTACH command to use the subfile.

You can permit more than one account by coding other account values:
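
A sketch with a second account. GG.XXX is a hypothetical account, and the comma separator is an assumption about the exact syntax.

  ACCOUNTS = GG.UUU, GG.XXX;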

To permit all group "GG" accounts (but not "GA" accounts), you would include "GG...." in the ACCOUNTS value. To make a subfile public, you specify "PUBLIC" as the ACCOUNTS value. The matter of controlling access to SPIRES subfiles is detailed in "Defining Subfile Privileges." [See B.9.] A complete subfile section can be coded like this:
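
The following sketch combines the three subfile statements discussed above. The subfile name is hypothetical, and GOAL-RECORD is the assumed name of the goal record statement.

  SUBFILE-NAME = PHONE DIRECTORY;
    GOAL-RECORD = REC01;
    ACCOUNTS = GG.UUU;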

B.1.10  A Complete File Definition

Here is what our complete phone directory looks like when coded in the file definition language:
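
The definition below is a sketch assembled from the statements discussed in this chapter, not a copy of a compiled definition. The subfile name is hypothetical; the GOAL-RECORD statement name and the comma-separated ALIASES list are assumptions about the exact syntax.

  FILE = GG.UUU.DIRECTORY;
  RECORD-NAME = REC01;
    REQUIRED;
      KEY = NAME;
      ELEM = ADDRESS; OCC = 1;
      ELEM = PHONE-NUMBER; LEN = 8;
        ALIASES = P, PN, NUM;
    OPTIONAL;
      ELEM = DUMMY;
  SUBFILE-NAME = PHONE DIRECTORY;
    GOAL-RECORD = REC01;
    ACCOUNTS = GG.UUU;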

The indentation shown is for the sake of clarity; you can use any indentation that is helpful to you. Also, an element's name, occurrence, length and aliases need not be defined on a single line; in fact, when SPIRES displays your file definition, each of these will be on a separate line, with indentation used to structure the definition for easy reading.

B.2  Goal Record Keys, Slot and Removed Records

B.2.1  Record Keys

Let's consider another way of defining a telephone directory file. Suppose we made the telephone number the key of the record. What would be the impact on the file? Here is a record definition in which the key is the phone number; we have also allowed name and address to occur more than once by not specifying any OCC limits.
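
A sketch of such a record definition, following the layout of the two-column example later in this section:

  RECORD-NAME = REC01;
    FIXED;
      KEY = PHONE-NUMBER;
        LEN = 8;
    REQUIRED;
      ELEM = NAME;
      ELEM = ADDRESS;
    OPTIONAL;
      ELEM = DUMMY;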

Such a directory would give you access to all the users of a particular phone number; if one person had two different phones, the name would be in two different records, each record's key being one of the phone numbers. A directory keyed on the phone number might not be useful to someone looking for John Jones' phone number, but it would be useful to someone looking for the owner of phone number 497-4420, which has been reported out of order, perhaps.

Notice that this dramatically changes how we look at or use the file. Now, all the people sharing a single office extension can be found, but one person's phone number can't be found as directly as it was in the directory keyed on name. The goal of the search--either names, as in the previous case, or phone numbers as in the present example--determines the choice of key.

Since it is unlikely that one phone number could be at more than one address (though some businesses have "extensions" in several buildings), we will code "OCC = 1" for the occurrence attribute of the ADDRESS element. But it is very likely that more than one person could be listed for each phone. For this reason, we will not code any occurrence attribute for the NAME element: we simply don't know how many times this element will occur. The occurrence of a record key must be one; the length of a record key must never be greater than 240 bytes, whether the length is fixed or not, whether it is the key of a goal record or index record.

The present definition, keyed on phone number, is different in another way from the definition keyed on name: the phone number, which was in the REQUIRED section (varying in length, required to occur) in our record keyed on name, is now in the FIXED section. The phone number was multiply occurring before, though it was fixed in length; now, since it is the key of the record, it is required to occur exactly once. Elements whose occurrence and length attributes both can be fixed are usually coded in the FIXED section.

There may be reasons why you would choose not to put a fixed length and occurrence element in the FIXED section of a record definition. Let's look at two record definitions for a phone directory keyed on phone number; we will add an element for zip code, which seems to belong in the FIXED section, being fixed in both length and occurrence.

 RECORD-NAME = REC01;              RECORD-NAME = REC01;
   FIXED;                            FIXED;
     KEY = PHONE-NUMBER;               KEY = PHONE-NUMBER;
       LEN = 8;                          LEN = 8;
     ELEM = ZIP-CODE; OCC=1;
       LEN=5;
   REQUIRED;                         REQUIRED;
     ELEM = NAME;                      ELEM = NAME;
     ELEM = ADDRESS; OCC = 1;          ELEM = ADDRESS; OCC = 1;
                                       ELEM = ZIP-CODE; OCC = 1;
                                         LEN = 5;
   OPTIONAL;                         OPTIONAL;
     ELEM = DUMMY;                     ELEM = DUMMY;

In the standard SPIRES output format, "element mnemonic = value", the elements in a record are output in the order in which they are defined: FIXED, REQUIRED, then OPTIONAL elements. (If an element occurs more than once, its occurrences are output in the order in which they were input.) Standard record output formats for each of the above definitions might be as follows:

  PHONE-NUMBER = 497-4400;         PHONE-NUMBER = 497-4400;
  ZIP-CODE = 94305;                NAME = USER SERVICES;
  NAME = USER SERVICES;            ADDRESS = POLYA HALL 117;
  ADDRESS = POLYA HALL 117;        ZIP-CODE = 94305;

So, for readability, you may want to put ZIP-CODE in the REQUIRED section of the record definition. But if you are certain to define an output format, there is no need to consider this problem.

B.2.2  Slot Keys

We have just discussed the importance of choosing the best element for the key of the record. Let's look at situations in which the choice of unique key may be difficult or impossible.

Suppose our SPIRES file was going to be a collection of abstracts from scientific journals. Our record definition might be as follows (note that the key is not specified):
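
The sketch below uses only the element names mentioned in this section (JOURNAL, YEAR, PAGE and ABSTRACT). Their placement in the REQUIRED section and their OCC values are assumptions made for illustration.

  REQUIRED;
    ELEM = JOURNAL; OCC = 1;
    ELEM = YEAR; OCC = 1;
    ELEM = PAGE; OCC = 1;
    ELEM = ABSTRACT; OCC = 1;
  OPTIONAL;
    ELEM = DUMMY;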

Now, if we wanted a search to retrieve the list of journal abstracts in which the words specified in the search request appeared, the goal record would be "articles." A search request for such a file would look like this:

How would we go about choosing a key for an "article" goal record? None of the elements defined above is very likely to be entirely unique. We could contrive a unique key by concatenating portions of the JOURNAL, YEAR and PAGE elements: NG.76.202, for example, could signify page 202 of a 1976 issue of National Geographic. However, such a key would not be convenient to enter or use. (See the "Structured Key" processing rule, A33, for one solution.)

SPIRES has a more elegant solution to the problem of a lack of a natural key. If you specify that a record type is "SLOT", SPIRES will assign a unique integer key to each record added; these keys start at one and will be incremented by one as each record enters the file. SPIRES always stores a slot key as a four byte binary number.

This simple solution could lead to problems: suppose you typed a command such as "remove 197" when it was actually record 187 that was to be removed. The file definer can protect against this kind of error in a slot file. SPIRES allows you to specify that a "check digit" be appended to each integer slot number as the record is added to the file. A check digit is a single digit that is appended to the right end of a number; it is computed by performing multiplication and addition operations on each digit of the original number, and then adding and subtracting the resulting sum to yield a single digit. Since this digit is computed from the other digits in the number, the original number's digits can be verified by seeing that the final digit is correct for a number you type at the terminal.

For example, a record in your file may have the key "2757", of which the final digit, "7", is the check digit. The value "2657" would not be a valid key, however, since the first three digits, "265", require (or compute to) a different check digit than "7". Thus, each digit becomes significant in computing the check digit, and most typographical errors in specifying a record's key (such as typing "2657" instead of "2757") will be caught when the system attempts to verify the check digit. Note that the record whose key is "2757" is not the two thousand seven hundred fifty-seventh record in the file, but the two hundred seventy-fifth; the final digit is the check digit. The system does not store the check digit with the key, but computes it each time you display the key--the digit is always shown when the key is displayed. (If you "look up" the key of a record using action 32 [See C.5.] and intend to display it, you must explicitly code a processing rule to have this digit displayed.)

A check digit is requested by coding "SLOTCHECK" on a SLOT record. On all commands requiring a key, the check digit, first computed by SPIRES, is appended to the record key by the user and recomputed and validated by the system. This digit functions similarly to a parity bit in tapes, verifying that the data is valid. The method SPIRES uses in computing the check digit is described in detail in the description of action 27. [See D.1.3.0.2.7.]

The default check-digit formula is called the Mod-11 rule; it can be explicitly requested by giving the value "0" to the SLOTCHECK statement ("SLOTCHECK = 0;"). Other formulas, described in the description of action 27, can be requested by coding different integers on the SLOTCHECK statement:

No KEY statement is coded for a slot record, since the slot number (with a check digit if one is requested) is the key of the record. SLOT and SLOTCHECK are coded as part of the record definition as follows. (REMOVED will be explained in the next section of this chapter.)
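
A sketch of the opening statements of such a record definition. Coding REMOVED on the same line as RECORD-NAME, and the indentation shown, are assumptions; SLOTCHECK without a value requests the default formula.

  RECORD-NAME = ENTRY; REMOVED;
    SLOT;
      SLOTCHECK;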

You will notice that "RECORD-NAME = ENTRY" was coded, instead of "RECORD-NAME = REC01" as before. In a slot record the name of the goal record key is determined by the value of the "RECORD-NAME" statement. One caution should be observed: the value of this element should be lower in alphabetical sequence than any other "RECORD-NAME" statements you code since the record definitions are displayed in alphabetical sequence by RECORD-NAME. In general, it is not good to code "RECORD-NAME = GOAL". If you wish to have the goal record key named something other than the RECORD-NAME, then following the "SLOT" statement, code the "SLOT-NAME = name" statement. You may also code an ALIASES statement for the slot key. [See B.1.6.]

The SLOT statement may also have a numeric value, representing its priv-tag number. [See B.9.4.4.]

A file may not have more than eight slot-type record definitions. You may, under certain circumstances, want to define a goal record as "slot" even when a natural key exists. The advantages and disadvantages of such a scheme must be weighed carefully, and are described below.

SPIRES treats slot type goal records in a special way; it keeps them in a data set that is organized sequentially rather than tree structured. If all of the elements in a slot record are fixed required (coded in the FIXED section of the record definition), then the amount of space each record requires is known exactly; when SPIRES goes to retrieve such a record, it "calculates" the record's position and goes directly to that location--it does not have to account for the varying size of each goal record stored. Thus, the major advantage of slot organization of fixed required elements is that record retrieval goes much faster (retrieval is not to be confused with searching, the process that usually precedes retrieval). A second significant advantage is that, since the goal records are structured sequentially, sequential searching by global FOR commands is faster.

The disadvantages of forcing your data base structure into a fixed slot-type record format are actually inconveniences; the file definer must decide if these inconveniences are acceptable.

One inconvenience is that records must be referred to by their slot number in TRANSFER, REMOVE, UPDATE and DISPLAY commands. In a personnel file keyed on social security number, you could remove or update an employee's record by simply giving the person's social security number. If this file were defined as a slot file, you would first retrieve the record by using a FIND command against a social security index, then TRANSFER the record retrieved using a global FOR command. As you can see, the extra expense of building a social security number index would have to be incurred.

A second inconvenience is the loss of verification that the social security number of each person was unique; if social security number were the key of the record, SPIRES would verify that no records had the same value for the key. Generally, when a natural key exists, SLOT organization is not used.

If you expect to do a lot of sequential searching of the data base using global FOR commands, then consider making the record elements FIXED and the record SLOT. How is this done? If most of the elements in a record are of fixed length and fixed occurrence, you can consider having SPIRES store a variable length element such as NAME as a fixed length element; you would choose the largest possible length for the length attribute of each variable length element. But if a record has an element that can vary greatly in length, such as ABSTRACT in our article goal record above, we would not want to waste storage space by fixing the length of this element at its longest possible value.

If not all of a SLOT record's elements are coded in the FIXED section, then the record must be "removed." In the following section we will see what record removal means, and how it is coded.

B.2.2a  SLOT-START Statement

SLOT-START is a new field in the SLOT structure of record definitions, both in FILEDEF and RECDEF. SLOT-START serves more than a single purpose -- enabling a simple way to generate keys that begin at a particular value.

Suppose you would like to generate Slot records whose first key value is something other than 1 -- say 9000000. You want the first record to have a key of 9000000, the second 9000001, then 9000002, etc. The ORVYL system on the Stanford mainframe stored the key of 9000000 in block 35573 of the record-type data set (assuming the block size is 2048 and the record-type is REMOVED). This situation poses no real disadvantages in mainframe SPIRES because ORVYL only writes a single block 35573 resulting in a data set that has two blocks -- block 0 and block 35573.

But for Unix SPIRES this would be a different matter. If you added a first record with the key 9000000 to a slot record-type in that system, SPIRES would have to fill in the gap between blocks 0 and 35573. Not only does this represent "wasted" space, but it also means 35572 extra block read requests should you attempt a sequential scan (e.g., FOR SUBFILE / DISPLAY ALL) of the subfile.

If you wish to take advantage of this option then you should code the "SLOT-START = number" statement immediately following the "SLOT" statement in your record definition. If the subfile is a NEW subfile then your work is done and the first record added to the subfile will have a key of "number".
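
A sketch using the starting value from the example above. The record name ENTRY and the REMOVED statement are carried over from the earlier slot example and are assumptions here.

  RECORD-NAME = ENTRY; REMOVED;
    SLOT;
      SLOT-START = 9000000;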

If the subfile already exists and has SLOT keys that begin from a different value and you wish to take advantage of this new option, then you should RECOMPILE using the REBALANCE option, following the same recipe that you use to rebalance a Tree data set using the CONVERT option. [EXPLAIN RECOMPILE COMMAND, WITH REBALANCE OPTION.]

Here are some answers to other questions you might ask:

B.2.3  Removed Record-Types

In our first example of a file, a telephone directory keyed on name, the length of a single record is approximately the sum of the lengths of the individual elements. It would probably not exceed fifty bytes or characters: the NAME and ADDRESS elements may take twenty bytes each, and the PHONE-NUMBER element takes an additional eight bytes.

This means that in one block of ORVYL storage, which is 2048 bytes, approximately forty records could be stored. SPIRES access efficiency depends to a great extent upon the number of records each block contains. If the average number of records per block is eighty, then one record in 512,000 can be retrieved by accessing three blocks or less, if the data is structured in tree fashion. If the average number of records in a block is only twenty (each record averaging one hundred bytes in length), then five accesses may be necessary. If the number of records per block drops to sixteen or fewer, then efficiency of record access seriously degenerates, since many file I/O's may be necessary to locate a particular record.
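
As a rough check of these figures, assume a balanced tree in which each block access narrows the search by a factor equal to the number of records per block. (This is a simplification for illustration, not a description of the actual SPIRES tree layout.)

  80 records per block:  80 x 80 x 80           =   512,000  (3 accesses suffice)
  20 records per block:  20 x 20 x 20 x 20      =   160,000  (4 accesses are not enough)
                         20 x 20 x 20 x 20 x 20 = 3,200,000  (5 accesses suffice)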

In order to keep access efficiency high, SPIRES provides the file definer with the option of removing large record types (remember that a "record type" is our "REC01," a "record" is an entry in the file), say of 60 bytes or more, from the goal record data set to a "residual data set." This removal is done for all records in a record type, and may be specified for any record-type, whether large or small in size. Only a key and a pointer to the record's location in the residual data set remain in the goal record data set when you specify record removal.

Tree (or "non-slot") record types, such as our telephone directory, should usually be removed, since the size of an entry is often over forty bytes. You can specify that a record type's contents are to be removed to a residual data set when you code the record type's name:
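
A sketch of the statement, assuming REMOVED is coded directly after the record name:

  RECORD-NAME = REC01; REMOVED;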

Slot record types, which use a different access technique from tree structured record types, are always removed unless all elements are fixed in length and occurrence. Slot record types, such as our articles file, can specify removal to a residual data set as follows:
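
A sketch for a slot record type, with the same assumption about where REMOVED is coded:

  RECORD-NAME = ENTRY; REMOVED;
    SLOT;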

Slot and tree structured data sets may be mixed in a file or data base.

Some rules-of-thumb can be stated for record removal. Remove record types if any of the following are true:

B.2.4  Monotonic Record-Types

A "monotonic" record-type is a record type for which keys will be added in monotonic order. This simply means that the key of every new record will be higher or lower (either alphabetically or numerically) than the key of the previous record added.

For example, suppose a business receives materials and assigns a key to a record in a SPIRES file. The key could be a concatenation of the year and an integer number like a SLOT number:

In this example, it is apparent that the key of a new record will always be greater than the key of all previous records. Because of this, the record type should be declared as MONOTONIC. The MONOTONIC statement is coded at the record level as follows:

MONOTONIC cannot be declared for SLOT record types.

The MONOTONIC statement provides for more balanced growth [See B.6.3.] of a file when keys are added in monotonic order; it causes SPIRES to pack each file tree block full before beginning a new block. Normally, SPIRES leaves some room for growth in each block; this is because a new key is often added that falls between two existing keys. In a MONOTONIC situation, however, this is not the case; new keys always fall at the end of previous keys.
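
To see why packing matters when keys only ever arrive in ascending order, consider the following sketch. It is a Python illustration, not a description of SPIRES internals: the function name is hypothetical, and the "split in half" policy used for the non-monotonic case is an assumption chosen only to stand in for "leaving some room for growth".

```python
def fill_blocks(keys, capacity, pack_full):
    """Distribute keys, arriving in ascending order, into blocks.

    pack_full=True  : fill each block completely, then start a new one
                      (the behavior the text attributes to MONOTONIC).
    pack_full=False : when a block overflows, split it in half so both
                      halves keep room for growth (an ASSUMED policy,
                      used here only for contrast).
    """
    blocks = [[]]
    for key in keys:
        last = blocks[-1]
        if len(last) < capacity:
            last.append(key)
        elif pack_full:
            blocks.append([key])
        else:
            half = capacity // 2
            blocks[-1] = last[:half]
            blocks.append(last[half:] + [key])
    return blocks


keys = range(1, 101)  # 100 keys, each higher than the one before
packed = fill_blocks(keys, capacity=10, pack_full=True)
split = fill_blocks(keys, capacity=10, pack_full=False)
print(len(packed), len(split))
```

With 100 ascending keys and ten keys per block, the packed strategy uses 10 blocks while the splitting strategy uses 19, because the room left behind in each split block is never used.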

Since MONOTONIC only specifies the way tree blocks are packed, and not an access method (SLOT determines both a packing method and an access method, for example), it is possible to recompile a file definition after records have been added and either add or delete the MONOTONIC statement. [See B.5.9.] It is also possible to add new keys in-between previous keys, even if MONOTONIC has been specified. If this is frequently necessary, then the MONOTONIC statement should be deleted and the file recompiled.

B.3  Structures

B.3.1  Data Structuring

Let's look at a phone directory file for a doctors' answering service. The file should allow the answering service operator to find one of several phone numbers a doctor has left for emergency calls, then choose the one that matches the time of day. We will make this a slot file to avoid the problem of duplicate names. Since not all of the elements appear in the FIXED section, we must specify that the slot goal record be removed to a residual data set.

A record as entered and retrieved from this file would look like the following:

    ENTERED:                         RETRIEVED:

    NAME = Taylor, Paul;             NAME = Taylor, Paul;
    STREET = Ash Lane;               OFFICE = Medical Center;
    CITY = Palo Alto;                STREET = Ash Lane;
    STATE = Ca;                      STREET = Stanford Ave;
    ZIP = 94305;                     CITY = Palo Alto;
    HOURS = 9-12 Daily;              CITY = Stanford;
    HOURS = 5-8 Thursday;            STATE = Ca;
    PHONE = 555-1212;                STATE = Ca;
    OFFICE = Medical Center;         ZIP = 94305;
    STREET = Stanford Ave;           ZIP = 94305;
    CITY = Stanford;                 HOURS = 9-12 Daily;
    STATE = Ca;                      HOURS = 5-8 Thursday;
    ZIP = 94305;                     HOURS = 2-5 Daily;
    HOURS = 2-5 Daily;               HOURS = 12-2 Daily;
    PHONE = 497-4420;                PHONE = 555-1212;
    HOURS = 12-2 Daily;              PHONE = 497-4420;
    PHONE = 548-7737;                PHONE = 548-7737;

This kind of data organization is obviously not acceptable. The occurrences of the address elements are not grouped with the related occurrences of HOURS and PHONE, and because each of these elements occurs a varying number of times, it is impossible to sort the elements into related structures visually.
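
The effect can be imitated in a few lines. The sketch below is a Python illustration only; the element order is inferred from the RETRIEVED column above, and the function name is hypothetical. It shows that when occurrences are simply collected element by element, nothing records which HOURS belongs with which PHONE.

```python
ENTERED = [
    ("NAME", "Taylor, Paul"),
    ("STREET", "Ash Lane"),
    ("CITY", "Palo Alto"),
    ("STATE", "Ca"),
    ("ZIP", "94305"),
    ("HOURS", "9-12 Daily"),
    ("HOURS", "5-8 Thursday"),
    ("PHONE", "555-1212"),
    ("OFFICE", "Medical Center"),
    ("STREET", "Stanford Ave"),
    ("CITY", "Stanford"),
    ("STATE", "Ca"),
    ("ZIP", "94305"),
    ("HOURS", "2-5 Daily"),
    ("PHONE", "497-4420"),
    ("HOURS", "12-2 Daily"),
    ("PHONE", "548-7737"),
]

# Order of elements as they appear in the RETRIEVED column (inferred).
ELEMENT_ORDER = ["NAME", "OFFICE", "STREET", "CITY",
                 "STATE", "ZIP", "HOURS", "PHONE"]


def retrieve_flat(entered, element_order):
    """Collect all occurrences of each element, one element at a time."""
    return [(name, value)
            for wanted in element_order
            for (name, value) in entered
            if name == wanted]


flat = retrieve_flat(ENTERED, ELEMENT_ORDER)
for name, value in flat:
    print(f"{name} = {value};")
```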

Now let's look at the entered record in terms of groups or structures of the elements:

                                         NAME
  LOCATION:        --ADDRESS:         |--STREET
                   |                  |  CITY
                   |                  |  STATE
                   |                  |--ZIP
                   |
                   | HOURS-PHONE:     |--HOURS
                   --                 |--PHONE

  LOCATION:        --ADDRESS:         |--OFFICE
                   |                  |  STREET
                   |                  |  CITY
                   |                  |  STATE
                   |                  |--ZIP
                   |
                   | HOURS-PHONE:     |--HOURS
                   |                  |--PHONE
                   |
                   | HOURS-PHONE:     |--HOURS
                   --                 |--PHONE

(The lines drawn around groups of elements represent a hierarchical structuring of the data that relates elements to each other.) SPIRES allows you to group together separate occurrences of elements into a hierarchy or "structure." The structure itself has a name, and an occurrence of the structure is stored logically as a unit; upon retrieval, the elements in a single occurrence of the structure are displayed together. The elements within a given occurrence of the structure are said to be "structurally bound" -- structural binding is an important aspect of searching techniques that needs to be considered when designing a file. [See C.6.13.]

Structures may contain other structures, nested up to ten levels, as LOCATION contains ADDRESS and HOURS-PHONE. The NAME element is called a "record level" element, since it is not contained in any structure. The LOCATION structure is called a record level structure, since it is not contained in any structure, and is defined by an element, LOCATION, occurring at the record level. ADDRESS and HOURS-PHONE are not record level structures, since they are contained in a structure; similarly, STREET, CITY, and other elements are not record level elements.
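
One way to picture the hierarchy is as nested data. The following Python sketch is only an analogy, not the SPIRES storage format: dictionaries stand in for structure occurrences and lists for multiple occurrences. Grouping both HOURS values of the first location with its single PHONE is our reading of the diagram above, and the helper name is hypothetical.

```python
record = {
    "NAME": "Taylor, Paul",              # record level element
    "LOCATION": [                        # record level structure, two occurrences
        {
            "ADDRESS": {
                "STREET": "Ash Lane", "CITY": "Palo Alto",
                "STATE": "Ca", "ZIP": "94305",
            },
            "HOURS-PHONE": [
                {"HOURS": ["9-12 Daily", "5-8 Thursday"], "PHONE": "555-1212"},
            ],
        },
        {
            "ADDRESS": {
                "OFFICE": "Medical Center", "STREET": "Stanford Ave",
                "CITY": "Stanford", "STATE": "Ca", "ZIP": "94305",
            },
            "HOURS-PHONE": [
                {"HOURS": ["2-5 Daily"], "PHONE": "497-4420"},
                {"HOURS": ["12-2 Daily"], "PHONE": "548-7737"},
            ],
        },
    ],
}


def phones_at(record, street):
    """Phones bound to the LOCATION occurrence whose ADDRESS has this street."""
    return [hp["PHONE"]
            for location in record["LOCATION"]
            if location["ADDRESS"]["STREET"] == street
            for hp in location["HOURS-PHONE"]]


print(phones_at(record, "Stanford Ave"))
```

Because each PHONE lives inside the same occurrence as its ADDRESS and HOURS, the question "which phones belong to Stanford Ave?" has a definite answer, which the flat listing could not give.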

B.3.2  Coding Structures

Structures, like elements, can have occurrence and length attributes. The structures in the doctors' phone directory are fairly complex; "Location" is multiply occurring, and has in it a singly occurring "Address" structure and a multiply occurring "Hours-Phone" structure.

Until now, all the elements we had defined were "simple" elements, such as NAME and PHONE-NUMBER. SPIRES allows another kind of element, "STR", for structured elements. If we want to code the address structure for the above record, we would begin this way:

If the structure has fixed length or occurrence attributes, or aliases, they must also be coded:

The length of a structure is the sum of the lengths of all elements in the structure. The length attribute can only be specified for structures containing only FIXED elements; otherwise the length attribute must be omitted. The occurrence attribute is the number of occurrences of the structure as a whole.

Structures, like elements, must be placed in the FIXED, REQUIRED, or OPTIONAL section of the record. If the structure contains only FIXED elements, it may be placed in the FIXED section of the record. If the structure need not occur at all (e.g., if none of the elements in the structure will be given values), it should be placed in the OPTIONAL section of the record. Note that a REQUIRED structure may contain OPTIONAL elements, and an OPTIONAL structure may contain FIXED and REQUIRED elements.

Where are the elements in a structure specified? Where do you define the occurrence and length attributes of the STREET, CITY, STATE and ZIP elements of the ADDRESS structure? The structure definition, containing element definitions, is entered at the end of the record definition. Its organization is similar to that of a record; it has FIXED, REQUIRED and OPTIONAL elements and sections in it, coded as follows:

If none of the categories, FIXED, REQUIRED, etc., are coded, all the elements of the structure are OPTIONAL, with the exception of the structure key, if there is one, which is REQUIRED.

The "name" of a structure is given in the structure declaration in the goal record description:

Here is a somewhat better phone directory of doctors, using structures to group related occurrences of elements:

B.3.3  Structured-Data Input

The information we input into the earlier file [See B.3.1.] will be input to this new file with only slight modifications. The inclusion of the structure name ("LOCATION;", "ADDRESS;", and "HOURS-PHONE;") signals the start of an occurrence of the structure. An input record for this file is shown below; an output record is identical.

B.3.4  Keyed Structures

The structures used in the above file definition and data record are called "non-keyed" structures.

A keyed structure differs from a non-keyed structure in that one element of the structure is defined to be the structure's key. Like a record key, a structure key must be singly occurring, and must be either the first element in the FIXED section or the first element of the REQUIRED section of the structure. Unlike the record key, multiple occurrences of the same structure in the same record may have the same value for their key. Also unlike a record's key, a structure's key must be the first element defined in the structure. This means that if a structure has both FIXED elements AND a REQUIRED key, the structure is treated as a "non-keyed" structure. But if any FIXED element is declared the KEY, SPIRES forces it to be the first element of a keyed structure.

Keyed structures also differ from non-keyed structures in the following ways:

Keyed structures are defined in the same way as non-keyed structures, except that one element is designated as the structure key. For example, the LOCATION structure could be defined as a keyed-structure as follows; note that ADDRESS is a keyed structure within LOCATION.

B.3.5  Keyed-Structure Data Input

A record for input to this file will be organized differently than input for a non-keyed structure. The most important difference is that the occurrence of the key element of the structure must be the first element entered for input to the structure. Here is an example:

Use of a keyed rather than a non-keyed structure has a significant effect on data entry. The structure name need not be entered; but the key element can only occur once for each occurrence of the structure. Compare data entry and organization of a non-keyed and keyed structure (on PHONE) when the same HOURS element applies to two different PHONE elements.

Two occurrences of a keyed structure are necessary for two occurrences of the structure's key. Thus, the choice of a key for a structure is almost as important as the choice of the proper key for a record since it determines how information is entered, stored and displayed. In our current LOCATION structure keyed on PHONE, different HOURS at one PHONE are entered as a single occurrence of the structure; if this structure were keyed on HOURS, every occurrence of a different HOURS element would require a new occurrence of the structure to be keyed.

Unlike record keys, structure keys need not be unique. The same phone number key element could occur twice in two different occurrences of the structure, perhaps each having different hours.
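
The grouping rule for keyed-structure input can be sketched as follows. This is a Python illustration under stated assumptions, not the SPIRES input routine: the function name is hypothetical, and we assume that each appearance of the key element (PHONE here) begins a new occurrence of the structure and that the elements that follow belong to that occurrence.

```python
def group_by_key(pairs, key_element):
    """Group (element, value) pairs into occurrences of a keyed structure.

    Each appearance of key_element starts a new occurrence; the key must
    therefore come first, and it occurs once per occurrence.
    """
    occurrences = []
    for name, value in pairs:
        if name == key_element:
            occurrences.append({key_element: value})
        elif not occurrences:
            raise ValueError(f"{name} entered before the key {key_element}")
        else:
            occurrences[-1].setdefault(name, []).append(value)
    return occurrences


entered = [
    ("PHONE", "555-1212"),
    ("HOURS", "9-12 Daily"),
    ("HOURS", "5-8 Thursday"),
    ("PHONE", "555-1212"),   # same key value again: allowed for structure keys
    ("HOURS", "2-5 Daily"),
]
grouped = group_by_key(entered, "PHONE")
print(len(grouped))
```

The two occurrences here share the key value "555-1212" but carry different HOURS, which is the situation described in the last paragraph above.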

B.3.6  Floating Structures

As we have seen, structure definitions are quite flexible and can thus be used to represent many types of data organization. Nested structures offer one kind of flexibility. Floating structures offer an additional flexibility.

Because a structure is defined independently of its position in the record, it is simple to define a structure which occurs at multiple places in the record. Such a structure is called a floating structure. The elements in a floating structure can be indexed by specifying them by their structural element path. [EXPLAIN STRUCTURAL ELEMENT PATH.]

In the following definition, the doctors' answering service phone directory has been altered to allow two ADDRESS structures, but the structure is only defined once, since it floats. ADDRESS is now a structure that occurs at the record level, where it contains either a doctor's home or business address. The home and business addresses are distinguished by the occurrence or non-occurrence of the BUSINESS element; if the address is a business address, then the element occurs but has no value (LEN = 0). Conversely, if the element doesn't occur, then it is not a business address. ADDRESS is also a structure that occurs nested in the HOURS-PHONE structure.

Sample input to this file is shown on the left; sample output is shown on the right:

B.4  Processing Rules: INPROC, INCLOSE, OUTPROC

B.4.1  Functions of Processing Rules

SPIRES provides a facility, called "processing rules", for examining, validating and modifying input and output values.

In all of the files we have defined so far, element values such as "497-4420" for a telephone number and "94305" for a zip code have been input and stored as character strings.

Suppose we could tell SPIRES to take the eight byte (character) telephone number, verify that it is in fact eight bytes, delete the "-", then convert the remaining seven digits to a four byte binary number, giving an error message if the value contains anything but digits. If we do this, eight bytes of input can be verified for correctness and stored in four bytes of space; in a telephone directory, this would mean a substantial reduction in storage costs and data entry errors.

Of course, having stored the data in a binary fashion, we want to restore it to its original form on output by converting it to a string then inserting a "-" after the third character.
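
The input and output conversions just described can be sketched in Python. This is an analogy for what the rules accomplish, not SPIRES processing rule code. The function names are hypothetical, and three details are assumptions: that the "-" is the fourth character, that the four-byte value is stored high-order byte first, and that the output side pads with leading zeros to seven digits.

```python
import struct


def inproc_phone(value):
    """Validate an eight-character phone number and store it in four bytes."""
    if len(value) != 8:
        raise ValueError("telephone number must be eight characters")
    if value[3] != "-":
        raise ValueError("expected '-' as the fourth character")
    digits = value[:3] + value[4:]          # delete the "-"
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("telephone number may contain only digits")
    return struct.pack(">I", int(digits))   # four-byte binary number


def outproc_phone(stored):
    """Restore the stored four-byte value to its original form."""
    (number,) = struct.unpack(">I", stored)
    text = f"{number:07d}"                  # assumed zero padding
    return text[:3] + "-" + text[3:]        # "-" after the third character


stored = inproc_phone("497-4420")
print(len(stored), outproc_phone(stored))
```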

Processing rules perform this checking and translation. Rules that process input to a record are called INPROCS; rules that modify stored data for output are called OUTPROCS. A third category of processing rules called INCLOSE actions are associated with INPROCS and perform a special kind of input processing. The two remaining categories of actions, SRCPROCS and PASSPROCS deal with searching and indexing, respectively; these are discussed in later chapters. [See B.8.9, B.8.11.] The general syntax rules discussed in this chapter will apply to them, however.

The basic form of processing rule is called an "action". Another type of processing rule, called a "system proc" (pronounced "prock"), is also available; system procs are an alternate way to specify an action or combination of actions using a more descriptive language and simpler (in most cases) syntax. System procs are based on "processing rule-string procedures", which are described later in this manual. [See C.10.] This chapter and the rest of this manual discuss actions as the main type of processing rule, but keep in mind that system procs may be used in place of actions, and are in fact used by the File Definer when it creates file definitions. [See A.3.2.1.] They are described fully in the SPIRES reference manual "System Procs".

B.4.2  INPROC and OUTPROC Rule Functions

In general, INPROCS can validate data on input and convert element values from one form of representation to another, perhaps more compact, form. In addition, if any of the validation rules discover an error, you can ignore it, or simply put out a warning message, or reject the value, or reject the entire record.

Let's briefly consider a few examples. Suppose we have an AGE element; we can verify that the input is between 1 and 100, then convert the number to a one byte binary value. This allows us to specify the LEN attribute of the AGE element as fixed at one. Consider another case: if we had an element that contains the number of children a couple had in the last decade, we could validate that the value was between 0 and 15, supply a default value of 0 if no value was input, and store the value as a one byte binary number; now the element could be coded in the FIXED-REQ section of the goal record definition. Similarly, a date element can be verified for one of several forms, converted to a canonical form, then stored as a four byte binary value.

String input value processing rules can compress lengthy values; a day of the week element can be converted to a one byte binary number representing numbers between one and seven after validating that the input value was a correct day of the week. On output these codes could be converted back to the full name of the appropriate day of the week. String processing rules thus allow you to input an abbreviation, store the value in its most compact form, then translate the compact form into a more verbose form for visually pleasing output.

B.4.3  Processing Rule Strings

The conversions and manipulations described are not usually performed by a single processing rule, but by a series of rules, each separated from the preceding one by a "/". Such a series of rules is often called a "processing rule string" or simply a "rule string." The result or output of one action becomes the input to the next; thus, the order in which you specify rules may be important. Processing rules usually apply to a specific element in a record, and are defined along with the ELEM, OCC, LEN and ALIASES attributes. Processing rules can be coded for a structure as a whole; in this case they are coded along with the "TYPE=STR;" attribute. An element description is thus extended as follows:

A single processing rule or action can be coded as simply as this:

This rule would convert data element input in upper-lower case to uppercase only before storing it on disk.

Processing rules usually occur in strings, each rule effecting some transformation on the element value input or, if a rule is preceded by another rule, effecting some transformation on the output of the previous rule. A simple rule string would be:

This sequence of actions would convert an upper-lowercase input value to uppercase (A30) then compress multiple blanks to a single blank (A40) before storing the data on disk.
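
The chaining of rules, where the output of one action becomes the input to the next, can be modeled in a few lines of Python. This is a sketch of the idea only: the table of actions and the function name are hypothetical stand-ins, the two lambdas merely imitate what the text says A30 and A40 do, and the rule string "A30/ A40" is written here as an assumed form using the slash separator.

```python
import re

# Hypothetical stand-ins for the two actions described in the text.
ACTIONS = {
    "A30": lambda value: value.upper(),                 # to uppercase
    "A40": lambda value: re.sub(r" {2,}", " ", value),  # compress blanks
}


def apply_rule_string(rule_string, value):
    """Apply each rule in turn, feeding each result to the next rule."""
    for rule in rule_string.split("/"):
        value = ACTIONS[rule.strip()](value)
    return value


print(apply_rule_string("A30/ A40", "Smith,   John  q."))
```

The sample value comes out as "SMITH, JOHN Q.".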

Up to 31 actions (15 if an "INCLOSE" action is in the string) can be used together, each separated by a "/". A long rule string can be continued over several lines terminated by a semicolon. As with any input value to SPIRES, the value can only be continued to another line if it can be broken at a blank. For this reason it is important to use "/ " (a slash blank) as a separator to ensure that SPIRES can break a processing-rule string into multiple lines at blanks.

A30 and A40 both exhibit the simplest form of an action: an "A" followed by a number. Processing rules are referenced by number, to allow compact expression. They typically have more complex syntax; for example:

B.4.4  Processing Rule Syntax

A general syntax for processing rules follows. Optional portions are enclosed in square brackets. Choices are configured one above another vertically. Defaults are underlined. The symbol "#" represents a number.

         -   -           -          -
         | D |           | ,P2[,P3] |
           -     -     -
         | W |   | :P1 | |          |
        A| E | # | :0  | | ,P+...   |
                 -  -  - -          -
         | S |
         -   -

Examine the following examples:

"A30" contains the only required parts of the syntax, "A" and a number, which is 30. "AE146:3,10" contains an "A", an "E", a number, P1=3, and P2=10.

Let's examine what each unit of the syntax indicates to SPIRES, and any restrictions on the units.

A

This is simply the letter "A", with which every instance of a rule begins.

D, W, E, or S

Only one of these parameters can be coded; if none is coded, then D is the default. This parameter controls the level of the response if an error flag is set on by the processing rule. For example, an action that translates integers to binary will set on the error flag if the input value contains characters other than a sign and digits. The possible error responses are:

#

The rule number. Processing rules are referenced by number, not by name.

:P1

The P-one parameter specifies the "way" in which an action works: whether the other parameters form an inclusion or exclusion list, or whether the result is stored in 1, 2 or 4 bytes, for example. If P1 is not explicitly coded, it defaults to 0. When coded, the P1 parameter is always preceded by a colon, ":", to distinguish it from other parameters. Some processing rule descriptions say that you should add "4" or "8", for example, to a given value of P1 in order to specify a slightly different way in which the action should work. Thus, if you were intending to code "3" but the description you desired said to add "8" to it, you would code "11". Alternatively, you may code "3+8", which may help you read the coded action more easily later if you need to reinterpret its meaning using the action description. (Processing rule-string procedures [See C.10.] can also take good advantage of this feature.)

,P2

The P-two parameter. For many rules, the P2 parameter controls the function of the processing rule subroutine. The P2 parameter is preceded by a comma.

,P3

The P-three parameter. For some rules, the P3 parameter controls the function of the processing rule subroutine. The P3 parameter is preceded by a comma. The P3 parameter always follows the P2 parameter; no rule uses a P3 parameter without a P2 parameter. If the P3 parameter is null, the preceding comma must still be coded in rules which require a P3 parameter.

,P+

The P-plus parameter(s). Some rules require a set of parameters. There may be as many as 255 P+ parameters in any one processing rule; each is preceded by a comma.
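
To tie the units of the syntax together, here is a small Python sketch that splits a coded action into its parts. It is an illustration of the syntax described above, not the SPIRES compiler: the function and pattern names are hypothetical, and the sketch assumes that parameter values contain no commas or other special characters.

```python
import re

# A, optional D/W/E/S, rule number, optional :P1 (sums such as 3+8 allowed),
# then zero or more comma-introduced parameters (P2, P3 or P+).
ACTION_PATTERN = re.compile(
    r"A([DWES]?)(\d+)(?::(\d+(?:\+\d+)*))?((?:,[^,]*)*)"
)


def parse_action(text):
    """Break one coded action into its syntactic units."""
    match = ACTION_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a recognizable action: {text!r}")
    response, number, p1, tail = match.groups()
    return {
        "response": response or "D",   # D is the default error response
        "number": int(number),
        "p1": sum(int(part) for part in p1.split("+")) if p1 else 0,
        "params": tail.split(",")[1:] if tail else [],
    }


print(parse_action("A30"))
print(parse_action("AE146:3,10"))
print(parse_action("A44,MALE,M"))
```

Note that a P1 coded as "3+8" comes out as 11, and that a null parameter (two commas in a row, or a trailing comma) is kept as an empty string so that its position is not lost.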

B.4.5  Processing Rule Restrictions

Restrictions which you should keep in mind when coding any processing rules:

Length and Occurrence:

Note: INCLOSE rules are a special variety of INPROC rules. They are discussed later in this chapter. [See B.4.8.]

Special Characters:

Processing rules may transform data prior to storage on disk. The RECOMPILE command may be used after adding or changing processing rules in a file definition only if data already stored on disk is not thereby invalidated. In all other cases, the ZAP and COMPILE commands must be used, and the file rebuilt. [See B.5.8.]

B.4.6  Understanding Processing Rule Descriptions

"Processing Rules: Complete Listing by Number" is a list of processing rules current at the time this manual was produced. [See D.1.] This appendix also describes the use of the actions file; [See D.1.1.] a protocol is available to produce an up-to-date actions listing. [See C.6.9.]

Understanding the explanations for each processing rule in the actions listing requires a familiarity with the syntax of a processing rule, and its requirements for P1, P2, P3 or P+ parameters. Since this familiarity is best obtained through practice, let's look at a few element definitions in which we code processing rules.

As our first example, let's interpret the following element definition:

Beginning with the first rule of the INPROC rule string we find the following explanation for A30 in the listing of rules:

This rule, most frequently used before an input value is to be tested for codes, takes input such as "Female" and converts it to "FEMALE". The output from this rule becomes the input to the next rule, A44.

The line of information in the description "Processing Rules: INPROC, OUTPROC, SRCPROC" tells you that the rule may appear in INPROC, OUTPROC and SRCPROC rule strings. It may not appear in a PASSPROC rule string.

The description of action 44, the next rule in the INPROC string, is more difficult to understand than that of action 30. Unfortunately, action 44, having several parameters, is more typical of action descriptions than is A30. Here is the description we find:

Compare the first two lines, which describe the syntax of A44, with the actual rules coded:

In the rule coded, every occurrence of the string "MALE" in the output of A30 is changed to "M". Thus, not only is "MALE" changed to "M", but "FEMALE" is changed to "FEM". The second A44, "A44,FEM,F" changes any occurrence of "FEM" to "F".

We have not coded D,W,E or S in either of the A44s (it is meaningless in A30), so D is the default. When is the error flag turned on by A44, and what is the system's response when D is the error response parameter? According to the description the error flag is set on if the P2 string, "MALE" in the first A44 and "FEM" in the second, is not found. Why wasn't an error message requested when the strings weren't found? In the case of an input value "FEMALE", both A44's will find a string to convert from P2 to P3. In the case of an input value "MALE", only the first A44 will do any conversion. For this reason, it is best to wait until the next action, A46, to determine whether an error condition exists.

The last action in the INPROC rule string is A46; its description reads as follows:

   A46 :<NUM> ,VALUE (,...., VALUE)
   A46 :P1 ,P+
      Purpose: INCLUSION, EXCLUSION, EXCLUSION LIST, VALUE
        REPLACEMENT
      Processing Rules: SRCPROC, INPROC
        The alphanumeric value is  matched against  the string  of
        codes  that make up the multiple occurrences of P+.  There
        may not be more than 255 occurrences.  If P1 is 2 then the
        value is  replaced  by  a  short  integer  containing  the
        relative  position of the code that matches the value.  If
        P1 is 1, then the value is replaced by a  byte  containing
        the  relative position of the code that matches the value.
        The short integer or byte value  is  suitable  for  output
        using  OUTPROC  action  A75  having  a duplicate string of
        codes.  If P1 is > 2, the value is unaffected if  a  match
        is  found.   For all of these P1 conditions the error flag
        is turned on if the value does not match  a  code  in  the
        list.   For  P1  =2  or  1, the highest code number + 1 is
        returned as a default value.  For P1  =  0  the  value  is
        abandoned  when a match is found and the error flag is set
        on.  P1 = 0 is used for exclusion lists.
      Processing Rules: OUTPROC
        This  action  allows  exclusion   on   output.    Elements
        containing  values matching the supplied list of values in
        this action are excluded from output.  Note  that  a  null
        valued element is always excluded.
      Processing Rules: PASSPROC
        The alphanumeric value is  matched against the  string  of
        values that make up the multiple occurrences of P+.  There
        may  not  be  more  than 255 occurrences.  If P1 = 1 and a
        match is not made, or P1 ~= 1 and a match  is  made,  then
        the  value  is not passed.  For P1 = 1, P+ is an inclusion
        list and only values contained in P+ are passed. For P1~=1
        (P1  =  0)  P+ is an exclusion list and only values  which
        are not included in P+ are passed.

Since A46 in the SEX element INPROC has a P1 parameter of 1, the value "M" will be stored as a binary 1, and the value "F" will be stored as a binary 2; any other value will be stored as a binary 3. Since "W" was coded as the error response parameter, any input to this action other than "M" or "F" will cause an error condition, a warning message will be printed, and the default action is taken, giving the "bad" value a binary value one higher than any legal code.
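
The whole INPROC string for the SEX element can be traced with a Python sketch. The functions below only imitate the behavior described in this section for A30, A44 and A46 with P1 = 1; they are hypothetical stand-ins, the warning text is invented for illustration, and the error flags from the two A44s are deliberately ignored, since the text notes that no message was requested for them.

```python
def a30(value):
    """Imitates A30: convert to uppercase."""
    return value.upper()


def a44(value, old, new):
    """Imitates A44: change every occurrence of old to new.

    Returns the new value and an error flag that is on if old was absent.
    """
    return value.replace(old, new), old not in value


def a46_p1_1(value, codes):
    """Imitates A46 with P1 = 1: position of the value in the code list.

    Positions count from 1; a value not in the list gets the highest
    code number + 1 and sets the error flag.
    """
    if value in codes:
        return codes.index(value) + 1, False
    return len(codes) + 1, True


def inproc_sex(value):
    value = a30(value)
    value, _ = a44(value, "MALE", "M")   # "FEMALE" becomes "FEM" here
    value, _ = a44(value, "FEM", "F")
    code, error = a46_p1_1(value, ["M", "F"])
    if error:
        print("warning: value is not a legal code")  # the "W" response
    return bytes([code])                 # stored in one byte


for sample in ("Female", "male", "MAIL"):
    print(sample, inproc_sex(sample)[0])
```

"Female" is stored as 2 and "male" as 1; "MAIL" passes through both A44s unchanged, draws the warning, and is stored as 3.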

B.4.7  Custom Error Messages

Suppose the following INPROC is coded for an element called SEX:

If the input for this element was:

then the error message put out looks like this:

A user who issued the command "EXPLAIN E46" would get the following explanation:

The generality, and, in fact, ambiguity of this diagnostic is necessary because A46 wears several faces depending upon the value of the P1 parameter coded by the file definer. To reduce the ambiguity of many multi-purpose diagnostics, you may want to code one or several A56's in a processing rule string. A56 facilitates customized error messages as follows:

Whenever any INPROC or OUTPROC rule in an action sequence puts out an error or warning message, the string in the nearest preceding A56 is also output. A56 might be coded in our rule string as follows:

Now, if "SEX=MAIL;" were input, the error message would be:

Note that an action 56 coded before an INCLOSE rule [See B.4.8.] will have no effect on error messages.
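
The "nearest preceding A56" rule can be sketched as a simple backward search. This Python fragment is only a model of that lookup: the function name is hypothetical, the rule string is represented as a list of (action, text) pairs because the coded form of A56 is not reproduced here, and both message texts are invented.

```python
def nearest_custom_message(rules, failing_index):
    """Return the text of the nearest A56 before the rule that failed."""
    for action, text in reversed(rules[:failing_index]):
        if action == "A56":
            return text
    return None  # no A56 precedes the failing rule


rules = [
    ("A56", "Value could not be read"),     # hypothetical message
    ("A30", None),
    ("A44", None),
    ("A56", "Sex must be MALE or FEMALE"),  # hypothetical message
    ("A46", None),
]
print(nearest_custom_message(rules, failing_index=4))  # A46 reports an error
print(nearest_custom_message(rules, failing_index=2))  # A44 reports an error
```

An error in the last rule picks up the second message, while an error in the earlier A44 picks up the first.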

B.4.8  INCLOSE Rules

In the INPROC rule strings we have examined so far, the output from one rule becomes input to the next rule in the string. INCLOSE rules, which provide the only form of looping available in processing rule strings, are the exception to this chain. The term "INCLOSE" can be considered a short form of "INPROC CLOSE-OUT." This expanded term tells us two things about INCLOSE rules: 1) they are coded in an INPROC rule string, and 2) they "close out" the rule string and hence must be the last rule or rules coded in a string of rules. INCLOSE rules, A122 through A148, can supply values such as date, time or account to a data base element; they can supply a value for one element that is the number of occurrences of another element; they can supply additional values or delete values if an element doesn't occur a specified number of times. An INCLOSE rule can be used to sort multiple occurrences of an element or of a structure by its key value.

Below is an example of the use of INCLOSE rules. Two elements are defined as FIXED, neither of which need ever be supplied by the user. These elements will contain the date on which a record was first added to a subfile and the date on which it was last updated.

In the above example, note that A126 produces an eight byte character string of the form MM/DD/YY, yet we are going to store this value in a four byte binary number. A31 performs this conversion, yet we have coded it before A126; this is because INCLOSE rules perform a type of looping. The value generated is passed back through the entire INPROC rule string where the new value is converted or validated as specified. (The generated value is usually not passed through the INCLOSE action a second time, unless the rule's description states that it is; A123 for example.) Look at the following example:

If there are fewer than three occurrences of the element (P2 of A123=3), then an additional occurrence is generated and passed through the rule string, where A46 converts it to a binary code. Note that when an INCLOSE is coded, any previous INPROC rule must find the output from the INCLOSE rules acceptable, because the value generated is passed through the entire INPROC rule string.

It is important to note that INPROC and INCLOSE rules are executed at completely different times. INPROC rules are executed when an element is recognized on input. INCLOSE rules occur when "closing out" the structure in which they occur. INCLOSE rules are executed when all elements of a structure, including all lower level structures, have been input.
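
The two phases, and the way a generated value is passed back through the INPROC rules, can be modeled as follows. This Python sketch is an analogy only. The code list, the default value and the function names are hypothetical, and generating occurrences repeatedly until a minimum count is reached is our assumption about how a rule like A123 behaves.

```python
CODES = ["LOW", "MEDIUM", "HIGH"]  # hypothetical code list


def inproc(value):
    """Stand-in for an INPROC rule string ending in a code lookup."""
    return CODES.index(value.upper()) + 1


def input_element(raw_values, minimum, default):
    """Process an element's occurrences, then close out the element."""
    # INPROC time: each value is processed as it is recognized on input.
    stored = [inproc(value) for value in raw_values]
    # INCLOSE time: runs once all input for the structure has been seen.
    # Each generated value goes back through the INPROC rules, so those
    # rules must find the generated value acceptable.
    while len(stored) < minimum:
        stored.append(inproc(default))
    return stored


print(input_element(["high"], minimum=3, default="low"))
```

One value was supplied and two were generated at close-out; all three end up as binary codes because the generated values went through the same conversion as the supplied one.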

INCLOSE rules for elements embedded in a structure will not be executed unless the structure (i.e., any element in the structure) occurs. INCLOSE rules that refer to elements in addition to the one for which they are coded (for example, A122, A131, A132) can refer only to elements in the same structure. If such a rule is coded on a record level element, then it can only refer to other record level elements. INPROC and INCLOSE rules may be coded for a structured element (an element of TYPE = STR); see A33 for an example.

Some additional rules governing INCLOSE rules are:

Now that we have looked at the syntax and semantics of processing rules, let's look at some of the different functions they may perform. The suggestions that follow are by no means exhaustive; in fact, only the more common rule strings are sampled.

B.4.9  Processing Rules For Numeric Data

Numeric data may be more compactly stored if it is converted to a binary or packed decimal representation. For example, a one-byte binary integer can represent a number in the range 0 through 255; a two-byte binary integer any number in the range -32768 through 32767; and a four-byte binary integer any number in the range -2,147,483,648 through 2,147,483,647. The savings in disk storage charges by converting integers from their character representations may be significant. Note that three-byte binary numbers are not supported by IBM.
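
The ranges quoted above follow directly from the number of bits available. The Python sketch below (the function name is hypothetical) derives them, treating the one-byte value as unsigned and the two-byte and four-byte values as signed, as the ranges in the text imply.

```python
def binary_range(n_bytes, signed):
    """Smallest and largest integer that fit in n_bytes."""
    bits = 8 * n_bytes
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


print(binary_range(1, signed=False))  # one-byte binary integer
print(binary_range(2, signed=True))   # two-byte binary integer
print(binary_range(4, signed=True))   # four-byte binary integer

# The same largest four-byte value takes ten bytes as a character string.
print(len(str(binary_range(4, signed=True)[1])))
```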

The same processing rules that convert numeric data may also be used to validate that the character string values being entered are numbers. Once numbers have been converted, it is possible to validate, by processing rules, that they are within a specific numeric range.

Action 21 will convert a value from a character string to a fixed binary integer. If the P1 parameter is 1, the result will be a one-byte binary number in the range 0 through 255. This rule may be used as follows:

Note that since the value is being converted to one byte, a length attribute of 1 can be specified. (This element may now be placed in the FIXED-REQ section of the record or structure.)

If the value as entered is not an integer number, or not in the range 0 through 255, the error flag will be set. The letter "S" in the processing rule specifies that if an error is detected a serious error message is to be typed and the entire record rejected.

Since the value as stored is now a binary integer, it should be converted on output to a character string format. The appropriate element definition thus becomes:

To validate that the AGE element is in the range 1 through 99, the INPROC and OUTPROC could be written as:

If the value is greater than 99, action 24 will report a warning message, and will set the value to 99. If the value is less than 1 (e.g., if it is equal to 0 or negative) a warning message will be issued and the value will be set to 1. Note that a P2 parameter of 2 is sufficient for action 71 to output a number no greater than 99.
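The range behavior just described (warn, then force the value to the nearest limit) can be pictured with the following Python sketch. It is an analogy under stated assumptions, not the implementation of action 24: the function name and the wording of the warnings are hypothetical.

```python
def clamp_with_warning(value, low=1, high=99):
    """Force value into [low, high].

    Returns (stored_value, warning), where warning is None when the
    value was already in range. Message wording is illustrative only.
    """
    if value > high:
        return high, f"warning: {value} is greater than {high}; {high} stored"
    if value < low:
        return low, f"warning: {value} is less than {low}; {low} stored"
    return value, None

# Every value that survives the range check prints in two characters,
# which is why an output width of 2 is enough.
widest = max(len(str(clamp_with_warning(v)[0])) for v in range(-500, 500))
```

Note that the record is still accepted in both out-of-range cases; only the stored value changes.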

INPROC action 25 is used for floating point numbers which may be represented on input as signed or unsigned real constants or in scientific notation (e.g., 1.705E-3 is the same as .001705). SPIRES will set the error flag and type a message if the value is not in the correct format, depending upon whether the rule is written as "AS25", "AE25", "AW25" or "AD25" (or "A25"). If the P1 parameter is 1, a single precision internal floating point number is generated; if P1 is 2, a double precision floating point number is generated; if P1 is 0 the value is only checked for floating point format and not converted. Action 26 is the floating point equivalent of INPROC action 24, and action 72 the floating point equivalent of OUTPROC action 71. A72 will suppress leading zeros, but all places to the right of the decimal point (the number of places is specified in the P2 parameter) will be printed (e.g., -1.75000 is possible with A72,8,5). A P1 parameter of 2 or 3 provides for stripping trailing zeros in the decimal portion (e.g., -1.75 is possible with A72:2,8,5). For example:

The packed decimal conversion rules are INPROC action 39 and OUTPROC action 80. Packed decimal conversion is very useful when a number is to be stored, but it is too large to be stored in a four-byte binary number. Though social security numbers can be stored in four-byte binary form (after "-"s have been removed), a telephone number with an area code cannot. Such a number can be stored in packed decimal format.
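To see why the telephone number needs packed decimal while the social security number does not, consider the Python sketch below. It assumes the conventional IBM packed decimal layout (two digits per byte, with the final half-byte holding the sign); whether actions 39 and 80 use exactly this layout is an assumption here, and the sample numbers are made up.

```python
def pack_decimal(number):
    """Pack an integer as decimal digits, two per byte, sign nibble last."""
    digits = str(abs(number))
    sign = "d" if number < 0 else "c"      # conventional sign nibbles
    nibbles = digits + sign
    if len(nibbles) % 2:                   # pad to a whole number of bytes
        nibbles = "0" + nibbles
    return bytes.fromhex(nibbles)

def unpack_decimal(packed):
    """Reverse pack_decimal."""
    nibbles = packed.hex()
    value = int(nibbles[:-1])
    return -value if nibbles[-1] == "d" else value

FOUR_BYTE_MAX = 2_147_483_647
ssn = 999_999_999        # nine digits, hyphens already removed
phone = 6_505_551_234    # ten digits including area code (made-up number)
```

Under this layout the ten-digit number occupies six bytes, compared with ten bytes as a character string.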

Check digits can be appended to fixed binary values by INPROC action 27 and validated by action 34. See also OUTPROC action 77.

B.4.10  Processing Rules for Dollar-and-Cents Data

The processing rules for dollars-and-cents are similar to the processing rules for floating point numbers described in the last section; in fact dollars-and-cents values are converted to single precision binary floating point numbers. For example:

An error message is produced and the value rejected if the value is not in the correct format. The OUTPROC specifies a maximum field width of 15 characters, which is more than sufficient for a single-precision floating point number and a dollar sign.

Action 85 can be used to "edit" a dollar and cents field, inserting check protection or appending a "CR" if the value is negative, for example. The use of this action is patterned after the COBOL and PL/I picture edit masks; see COBOL or PL/I documentation on the "Picture Clause" for complete documentation of the options available.

B.4.11  Processing Rules for Free Form Character Strings

Character string values, such as book titles, disease diagnoses, journal abstracts, etc., can seldom be validated on input. It is possible, however, to perform limited conversion, in most cases only to conserve storage space.

INPROC action 30 can be used to convert the value into upper case. (Ordinarily, text strings should be stored in upper and lower case for ease of reading.) Action 30 will be more useful when processing coded values. [See B.4.12.]

Actions 40 and 51 will compress extra blanks from character string values. These extra blanks are often unnecessary, and increase storage cost. A51 removes trailing blanks; A40 "squeezes" multiple blanks to a single blank and removes trailing blanks. An example of the use of A40 follows:

Blanks which have been removed by actions 40 or 51 cannot be reinserted with an OUTPROC. Note that if you use A40 you do not need to use A51.
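The difference between the two actions, and the reason A51 adds nothing once A40 is coded, can be illustrated with this Python sketch. The function names are hypothetical and the code only imitates the behavior described above; it is not how SPIRES implements either action.

```python
import re

def strip_trailing_blanks(value):
    """A51-like behavior: remove trailing blanks only."""
    return value.rstrip(" ")

def squeeze_blanks(value):
    """A40-like behavior: collapse runs of blanks to one, drop trailing blanks."""
    return re.sub(r" {2,}", " ", value).rstrip(" ")

sample = "THE  OLD   MAN  AND THE SEA   "
```

Since squeezing already removes trailing blanks, applying the trailing-blank step first changes nothing in the result. Neither operation is reversible: the original spacing is not stored anywhere.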

Using Action 57, a character string is translated on input into a packed form for compact storage, then retranslated using action 58 on output. Restrictions apply on which characters will be translated and retranslated, and which will be converted to another character.

B.4.12  Processing Rules for Strings of Codified Data

The examples of processing rules used in the introduction to this chapter highlight the facility for processing rules to examine an input value for predetermined string values. Actions are available to convert one string to another (for example, A44 or A48), or a string to a binary code (for example, A46). An example shown earlier in the text was:

Instead of the two A44's, we might have coded a single A48 instead, observing the note in the action description:

Using A48, we would have coded:

Since the value we have stored on disk is a binary number, we must code an OUTPROC with a set of codes corresponding to those in the A46 to translate the binary value stored back into the appropriate string. It is extremely important that any value output (by an OUTPROC or not) be able to go back in through the INPROCS, since it is likely that transfer and update commands will be used in any file. For this reason, the output "Androgynous" will trigger a warning message on an ADD or UPDATE command, when it is sent through the INPROC; but the record is accepted, and the value "ANDROGYNOUS" is converted to a binary 3.
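The requirement that every output value be acceptable as input can be checked mechanically. The Python sketch below is a rough illustration: apart from ANDROGYNOUS being stored as 3, the code table is hypothetical, and the functions merely stand in for an INPROC and OUTPROC pair (the warning message is not modeled).

```python
# Hypothetical code table; only ANDROGYNOUS = 3 comes from the text.
INPUT_CODES = {"MALE": 1, "FEMALE": 2, "ANDROGYNOUS": 3}
OUTPUT_STRINGS = {1: "Male", 2: "Female", 3: "Androgynous"}

def inproc(value):
    """Stand-in for the input rules: upper-case, then translate to a code."""
    return INPUT_CODES[value.upper()]

def outproc(code):
    """Stand-in for the output rule: translate the stored code to a string."""
    return OUTPUT_STRINGS[code]

def round_trips():
    """True if every output string is accepted on input as the same code."""
    return all(inproc(outproc(code)) == code for code in OUTPUT_STRINGS)
```

A table for which round_trips() is false would cause records to change, or be rejected, when they are transferred and updated without edits.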

Processing rules are available to convert character string values from one character set to another, or to validate that only legal characters are entered in an element value. Refer to the descriptions of actions 43 through 49 in the actions appendix for complete descriptions. [See D.1.4.] Action 46 can be used to test whether specified strings either occur or don't occur, that is, to form an inclusion or exclusion list. Character set translation, say "$" to "-", is performed by action 43.

B.4.13  Processing Rule for Personal Names to Canonical Form

Action 41 is available to give great flexibility to personal name entry and retrieval. Any recognizable form for a personal name can be input, and SPIRES will convert all forms to a canonical form before storage. The file owner determines the canonical form from among six forms. If a user will later want to sequence a search result by last name, then the name must be stored in a last-name first form.

B.4.14  Processing Rules for Validating Length and Occurrence

The length and occurrence attributes of an element describe the disk storage that SPIRES will reserve for an element. If an element value exceeds the length attribute, or does not occur the specified number of times, the record is rejected. More extensive validation of element length and occurrence can be made with processing rules.

Actions 22 and 23 may be used to validate the length of an element. The following element definition ensures that the value is at least 2 and no more than 15 characters long. Warning messages are issued if an error is detected; no fewer than 2 and no more than 15 characters will be stored:

If a VARYING-REQ element has no LEN specified for it, it is possible for a user to input a value of 0 length. If this element were then to be passed to an index, a PASS ERROR would occur because a zero length element was being passed. [See B.10.10.] In order to prevent this, action 23 should be used to force the value to have a minimum length, perhaps padding with blanks. For example:

would allow "POSTAL-CODE;" (a zero length value) to be input. But

would provide a warning message if "POSTAL-CODE;" were input, and would enter a blank value rather than a null value.
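The effect of minimum and maximum length rules can be summarized in a Python sketch. This is an illustration under assumptions, not the behavior of actions 22 and 23 in every detail: the function name is hypothetical, and truncating on the right and padding on the right with blanks are assumed.

```python
def enforce_length(value, minimum=2, maximum=15):
    """Return (stored_value, warnings) after applying length limits.

    Assumes over-long values are truncated on the right and short
    values are padded on the right with blanks.
    """
    warnings = []
    if len(value) > maximum:
        warnings.append(f"value longer than {maximum} characters; truncated")
        value = value[:maximum]
    if len(value) < minimum:
        warnings.append(f"value shorter than {minimum} characters; padded")
        value = value.ljust(minimum)
    return value, warnings

# A zero-length value with a minimum of 1 becomes a single blank.
stored, notes = enforce_length("", minimum=1)
```

The padded value is a blank rather than a null, so it can later be passed to an index without causing a zero-length error.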

Action 146 may be used to ensure that an element occurs fewer than a specified number of times. The following element definition will produce a warning message if a person has more than 10 children:

A123, which tests for a minimum number of occurrences, and A146, which tests for a maximum number of occurrences, are especially useful when a VARYING-REQ or OPTIONAL element is multiply occurring with a fixed number of occurrences. As noted before, [See B.1.3.] SPIRES does not validate the occurrence count of VARYING-REQ or OPTIONAL elements that have more than one occurrence specified; these elements are simply "multiply occurring" to SPIRES; A123 or A146 must be used to ensure fixed occurrence. These actions can also be useful to force a fixed number of occurrences for FIXED-REQ elements that are multiply occurring. Otherwise, SPIRES will store blanks or zeroes for occurrences that are required but do not occur in the input data.
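Coding both a minimum and a maximum test with the same count is what forces a fixed number of occurrences. The Python sketch below shows the idea; the function name is hypothetical, and treating the maximum as inclusive is an assumption made for the illustration, not a statement about the parameters of A146.

```python
def check_occurrences(values, minimum=None, maximum=None):
    """Return a list of problems with the number of occurrences."""
    problems = []
    count = len(values)
    if minimum is not None and count < minimum:
        problems.append(f"{count} occurrences; at least {minimum} expected")
    if maximum is not None and count > maximum:
        problems.append(f"{count} occurrences; at most {maximum} expected")
    return problems

# Forcing exactly three occurrences: minimum and maximum are both 3.
too_few = check_occurrences(["A", "B"], minimum=3, maximum=3)
just_right = check_occurrences(["A", "B", "C"], minimum=3, maximum=3)
```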

B.4.15  Processing Rules and Element Types

Every element in a record-type can be categorized by "type". To most users, the element type is not important -- the user gives the element a string value for input and it comes out as a string value on output, so its internal type is only significant to SPIRES. But there are programming situations where the internal type is relevant, and even situations where the programmer needs the element to be treated as a different type.

For most elements, the element type is determined by INPROC processing rules. For example, an element whose INPROC string ends with the $INT proc (A21) will be type integer. Processing rules also determine that an element is type character [Elements are called type string or type character interchangeably, and there is no difference, as there is between variables of type string and character. Elements would be exclusively called type string except that the abbreviation STR is used for elements that are type structure.] (the default, if no rules are specified), type real, type packed, and type hex. [Elements may be declared type "bits" with the TYPE statement, where "bits" means SPIRES will treat it as type hex. That is sometimes done on elements created internally (e.g., during passing) that do not have INPROC statements but which you do not want SPIRES to treat as type character. It is also done in situations where you want WHERE clauses and ALSO commands to be case-sensitive for a given element. [See B.9.3.16.]]

But the types of some elements, i.e., structures, locators and executable elements, are determined by the TYPE statement. [See B.3.2, B.7.1, B.4.] The TYPE statement has the following syntax:

where "type" is one of the following:

     STR   -  structure
     LCTR  -  locator
     XEQ   -  executable
     CHAR  -  string, character
     HEX   -  hex     (BITS is equivalent)
     REAL  -  real
     PACK  -  packed decimal
     INT   -  integer

If you want to know what type an element is, you can issue the command SHOW ELEMENT CHARACTERISTICS. Note that it will list "CHAR" elements as "String" and "BITS" elements as "Hex".

Knowing the element type is also useful when you use the system variables $CVAL and $UVAL in formats and the function $GETUVAL. The type of value returned by them is determined by the type of the element involved. For example, in an output format, the type of $UVAL changes from label group to label group, based on the type of element retrieved by the GETELEM statement.

Using the TYPE Statement to Solve Programming Problems

In most cases, you should only put the TYPE statement in an element definition when you need to. Specifically, you must use it when you are defining a structure, a locator or an executable element. But if a processing rule defines an element as type integer, for example, you do not need to include "TYPE = INT" in its definition.

However, in certain circumstances, you might want to declare an element as a different type than the processing rules (or lack of processing rules) would create. For instance, an element might be defined as such:

For some reason, the file owner has decided not to code an INPROC; the input is perhaps coming from some external source already in integer form. So it is stored as a four-byte integer. The problem is that SPIRES considers it an element of type string (since it has no INPROC rules), which could create problems for format definers who think it is type integer. Simply adding the statement "TYPE = INT" would solve those problems. [This is a somewhat unlikely example, however. The file owner is more likely to include an INPROC, such as $INT, which would set the type to integer properly, and then write an input format to handle the input from the external source. That format could then override the file definition's $INT rule for the element. A more likely example would probably be more complicated, however.]

The TYPE statement may not be used indiscriminately to change an element to any type at all. The following is allowed: an element whose INPROC rules would define it as type string or hex may be changed by the TYPE statement to INT, REAL or PACK. Also, an element of any type may be "redefined" as a hex element, using "TYPE = BITS".
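The permitted changes can be restated as a small decision function. The Python sketch below encodes only the rules given in the paragraph above; the function name is hypothetical, and two points are assumptions: that HEX is accepted wherever BITS is (the type list calls them equivalent), and that declaring the type an element already has is harmless.

```python
def type_override_allowed(rule_type, declared):
    """Can an element whose INPROC rules imply rule_type be declared as declared?"""
    rule_type, declared = rule_type.upper(), declared.upper()
    if declared in ("BITS", "HEX"):
        # Any element may be redefined as hex.
        return True
    if rule_type in ("CHAR", "HEX", "BITS") and declared in ("INT", "REAL", "PACK"):
        # String or hex elements may become integer, real or packed.
        return True
    # Assumption: restating the existing type is redundant but harmless.
    return rule_type == declared
```

For example, the four-byte element discussed earlier (type string by default) may be declared INT, while an element that is already type integer may not be redeclared REAL.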

B.4.16  Processing Rule Tracing: SET PTRACE

SET PTRACE is the basic command for activating processing-rule tracing, where SPIRES displays environmental information about the processing rules being executed as element values are processed by them. Record-processing commands that can display Ptrace information include DISPLAY, TYPE, ADD, UPDATE, MERGE, ADDUPDATE, ADDMERGE, the INPUT commands, and TRANSFER, where Inproc, Outproc and Inclose rules will be traced. It can also be invoked for tracing the execution of Searchproc rules in searches, i.e., in FIND, AND, AND NOT and OR commands. However, Ptrace does not trace Passproc rules; a separate facility, invoked by the command SET PASSTRACE, provides that tracing. [See B.4.17.]

It is also available in SPIBILD for batch processing commands such as INPUT BATCH, for Inproc and Inclose processing.

Depending on the Ptrace options selected, SPIRES can show you:

Additionally, you can ask for tracing only for specific elements.

There are eight forms of the SET PTRACE command, depending on which options you want to use. You may issue multiple SET PTRACE commands to achieve particular combinations.

Here is the full list of commands, each of which is described below:

Each has a parallel CLEAR PTRACE command to clear its particular effect. Also, "+ names" and "- names" lists can be used on the USERPROCS, VARIABLES, ELEMENTS or TYPES flavors to alter the effect of a previous command.

SET PTRACE

The basic command is:

Like all SET PTRACE commands, it should be issued after the subfile you want is selected. The command will succeed only if you have SEE access to the SPIRES file to which the subfile belongs. By default, Ptracing will then be in effect for all elements of the goal record-type except phantom elements and dynamic elements. [Ptracing of phantom elements may be a future extension of this feature.]

With the basic SET PTRACE in effect, SPIRES will display a "-Ptrace elem: <name>" and "-End ptrace: <name>" line for each element as execution of its processing rules begins and ends. Additionally, between those two will be a line displaying the value of the element at the end of its processing ($Val) as well as its length ($Vlen).

For instance, a basic SET PTRACE command might show this for a date element on output:

SET PTRACE SNAPSHOT

For more explicit information, add the SNAPSHOT or SNAP option to the command, which will show you a "snapshot" of information about each action as it is executed:

The information shown includes the action number, often with its P1 parameter, the type of the value if it is not string, and the value and its length BEFORE the action is executed.

Continuing the example of the BIRTHDATE element above, here is the Ptrace information on output under SET PTRACE SNAPSHOT:

The processing rule A76:3 works with the input value 17231222 of type Hex; A76, the action for converting date values for output, returns the value "WED DEC 22, 1723", which is processed by the second rule, A30, and the final output value is shown in the fourth line.

The information shown by Ptrace is derived from compiled code during its execution. Unfortunately, because system procs (such as $INT or $DATE) used in processing rules are converted during compilation to their action components (such as A21 or A30), any system proc that invoked the action cannot be displayed by Ptrace, which means interpreting Ptrace information may be challenging. For instance, in the example shown above, the one system proc coded in the Outproc rule string for the BIRTHDATE element was a $DATE.OUT rule, which turned into actions A76 and A30 during compilation.

[For assistance in translating system procs into actions, see the System Proc Expansion section at the end of each system proc description in the SPIRES manual "System Procs". Also see the Action descriptions at the end of this manual. [See D.1.1.]]

SET PTRACE USERPROCS

When the processing rules invoke Userprocs, neither the basic form nor the SNAPSHOT form will name the Userproc called. To get that information, use the USERPROCS form:

Besides the basic information shown for SET PTRACE, entry to and exit from each Userproc called by action A62 or A124 ($CALL proc) will be displayed as it occurs, as shown in the example below. Userprocs invoked from within other Userprocs with the XEQ USERPROC Uproc will not be shown; use SET PTRACE JUMP for that.

In the absence of a list of Userproc names on SET PTRACE USERPROCS, SPIRES will show trace information for all Userprocs called by action A62 or A124. If you add a list to the command, then only the Userprocs that match the names given will be traced. You can alter that list further with subsequent SET PTRACE USERPROCS commands, using the "+" or "-" characters in front of the list. You can also begin by adding an "exclusion list", listing the Userprocs you don't want to trace by preceding them with a minus sign; in other words, all Userprocs except those named in the exclusion list will be included in the trace.
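The interplay of inclusion lists, exclusion lists and later "+" and "-" adjustments is easier to follow as a model. The Python sketch below is one reading of the paragraph above, not a description of SPIRES internals; the class, its methods and the Userproc names are all hypothetical.

```python
class TraceFilter:
    """Model of the names list on a trace command."""

    def __init__(self):
        self.include = None      # None means "trace everything"
        self.exclude = set()

    def apply(self, spec):
        """Apply a list such as "ALPHA BETA", "+ GAMMA" or "- ALPHA"."""
        spec = spec.strip()
        if spec.startswith("+"):
            names = set(spec[1:].split())
            if self.include is not None:
                self.include |= names
            self.exclude -= names
        elif spec.startswith("-"):
            names = set(spec[1:].split())
            if self.include is not None:
                self.include -= names
            else:
                self.exclude |= names    # exclusion list: all but these
        elif spec:
            self.include = set(spec.split())
            self.exclude = set()

    def traces(self, name):
        """Would this name be traced under the current lists?"""
        if self.include is not None:
            return name in self.include
        return name not in self.exclude
```

Starting with "ALPHA BETA" traces only those two names; a later "+ GAMMA" adds a third and "- ALPHA" removes one. Starting instead with "- ALPHA" traces everything except ALPHA.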

Here is an example of Userproc tracing for an element:

SET PTRACE VARIABLES and SET PTRACE JUMP

The USERPROCS version of SET PTRACE is often accompanied by SET PTRACE VARIABLES, which shows the values of user variables in the Userprocs as they are changed, and SET PTRACE JUMP, which shows all JUMP (GOTO), XEQ USERPROC, and RETURN Uprocs within Userprocs as they execute. Neither command has any effect unless SET PTRACE USERPROCS is in effect.

When SET PTRACE VARIABLES is in effect, anytime the value of a user variable in a Userproc is changed, the new value is shown in the tracing information.

In the absence of a list of variable names on SET PTRACE VARIABLES, SPIRES will show trace information for all user variables in executing Userprocs. If you add a list to the command, then only the variables that match the names given will be traced. You can alter that list further with subsequent SET PTRACE VARIABLES commands, using the "+" or "-" characters in front of the list. Note that you omit the preceding pound sign when specifying them here. You can also begin by adding an "exclusion list", listing the variables you don't want to see by preceding them with a minus sign; in other words, all user variables except those named in the exclusion list will be included in the trace list.

With SET PTRACE JUMP, you will see information about JUMP or GOTO Uprocs as well as XEQ USERPROC Uprocs and any RETURN Uprocs that return execution control back to the calling Userproc. The XEQ USERPROC trace information names the Userproc that has been invoked. The JUMP trace information, again dependent on the compiled code, does not have the label name to which execution has been directed, but does have the (admittedly cryptic) displacement value used internally by SPIRES.

The example below, expanding on the previous one, shows the influence of SET PTRACE VARIABLES and SET PTRACE JUMP, with SET PTRACE SNAPSHOT as well:

SET PTRACE ELEMENTS and SET PTRACE TYPES

The last two forms of SET PTRACE control the scope of Ptracing, as established by the other commands.

You can request Ptracing for only specific elements by naming them in a SET PTRACE ELEMENTS command:

The first time you use the ELEMENTS option in a given PTRACE session, you specify either an inclusion list ("elements"), the elements for which you want tracing, or an exclusion list ("- elements"), the elements for which you don't want tracing. Later, you can change the list of elements being traced by adding new ones ("+ elements") or removing others ("- elements").

You can request Ptracing for only specific types of processing rules (for instance, you are interested only in Inproc rules as they are executed) with the SET PTRACE TYPES command:

where "types" can be one or more of the following: INPROC, INCLOSE, OUTPROC, SEARCHPROC. If this command is used, only the types of processing rules named will be traced. Note that INPROC includes INCLOSE.

SET PTRACE ALL

Issuing the SET PTRACE ALL command is equivalent to issuing the following commands all in a row, making it a convenient shortcut when you want full tracing:

CLEAR PTRACE

Turning off processing-rule tracing is as simple as:

CLEAR PTRACE and CLEAR PTRACE ALL are equivalent, turning off all the tracing. However, you can turn off the individual options as well:

SET PTRACE is often used in combination with SET TLOG, which directs the tracing data to a temporary log file, which can be examined with the SHOW TLOG DATA command. For more information, [EXPLAIN SET TLOG COMMAND.]

You can see what SET PTRACE commands are in effect with the SHOW PTRACE command:

Since SHOW PTRACE lists the actual commands, you can use the IN ACTIVE command to put them into your active file, where you can modify them and then re-execute them with the XEQ command; clearing the current Ptrace settings with a CLEAR PTRACE command first is recommended.

The $PTRACE flag variable can also be checked to determine if Ptrace is in effect.

B.4.17  Processing Rule Tracing for Passprocs: SET PASSTRACE

SET PASSTRACE is the basic command for activating processing-rule tracing for Passprocs, where SPIRES displays environmental information about the processing rules being executed as element values are being passed to indexes.

The Passtrace facility is available only in SPIRES, not in SPIBILD or FASTBILD. Thus, it is available when a file is processed in SPIRES (using the PROCESS command) or when a goal record-type has immediate indexing and a record transaction occurs. It displays information only about the Passproc rules that are executed; other processing-rule tracing is handled by the command SET PTRACE. [See B.4.16.] A limitation: If you are passing virtual elements whose values are generated by executing Outprocs and/or Inprocs, or if you are passing the external form of an element, then SET PASSTRACE does not display trace information for the Outprocs and Inprocs that are executed. They would not be shown if you SET PTRACE either, however.

Depending on the Passtrace options selected, SPIRES can show you:

There are six forms of the SET PASSTRACE command, depending on which options you want to use. You may issue multiple SET PASSTRACE commands to achieve particular combinations.

Here is the full list of commands:

  SET PASSTRACE                   - basic information
  SET PASSTRACE SNAPSHOT          - detail information
  SET PASSTRACE USERPROCS [names] - information about Userprocs
  SET PASSTRACE VARIABLES [names] - information about variables
                                      used in Userprocs
  SET PASSTRACE JUMP              - information about execution
                                      flow commands in Userprocs
  SET PASSTRACE ALL               - same as issuing all of the above

Each has a parallel CLEAR PASSTRACE command to clear its particular effect. The "names" option on the USERPROCS and VARIABLES forms limits the tracing to the specific Userprocs or variables named. Also, "+ names" and "- names" lists can be used on the USERPROCS and VARIABLES flavors to alter the effect of a previous command.

Aside from some basic differences in the output between Passtrace and Ptrace, the Passtrace commands work the same as their Ptrace counterparts, with similar function. Additionally, like Ptrace, Passtrace has a $PASSTRACE flag variable that you can check (e.g., in a Userproc: "UPROC = If $Passtrace Then * 'Current value is ' $Procvalue;") to determine if Passtracing is set.

Here is a sample session showing part of the output from Passtracing; this subfile, to which a record is being added, has immediate indexing. Note the use of TLOG to hold the tracing output; Passtrace commonly generates hundreds of lines of output per record, particularly for updates. (Updates generate pass entries from both the replacement copy of the record and the tree copy being replaced.)

-> select authors
-> set passtrace all
-> set tlog
-> add
-> sho tlog
-Begin passing (Add) Goal-type BMRC  , Key = 342201
 Passproc A170  $Vlen: 0
-Pass Index value:  $Vlen: 4  $Val: '<00001E01>'
 Passproc A169:1  $Vlen: 0
-Fetch Goal element FD  $Vlen: 4  $Val: <19750803>
-Pass Index value:  $Vlen: 4  $Val: <19750803>
 Passproc A169:1  $Vlen: 0
 Passproc A167:2  $Vlen: 0
-Fetch Goal element MEPN  $Vlen: 28  $Val: 'FREEMAN, HENRY GEORGE, 1902-'
 Passproc A168  $Vlen: 28  $Val: 'FREEMAN, HENRY GEORGE, 1902-'
 Passproc A44  $Vlen: 28  $Val: 'FREEMAN, HENRY GEORGE, 1902-'
 Passproc A44  $Vlen: 28  $Val: 'FREEMAN, HENRY GEORGE, 1902-'
 Passproc A43  $Vlen: 28  $Val: 'FREEMAN, HENRY GEORGE, 1902-'
 Passproc A38:1  $Vlen: 28  $Val: 'FREEMAN, HENRY GEORGE, 1902 '
-Pass Index value:  $Vlen: 7  $Val: 'FREEMAN'
-Pass Index value:  $Vlen: 13  $Val: '<05>HENRY<06>GEORGE'
 Passproc A165  $Vlen: 0
 Passproc A165  $Vlen: 0
 Passproc A167:2  $Vlen: 0
 ...

CLEAR PASSTRACE

Turning off Passproc tracing is as simple as:

CLEAR PASSTRACE and CLEAR PASSTRACE ALL are equivalent, turning off all the tracing. However, you can turn off the individual options as well:

SET PASSTRACE is often used in combination with SET TLOG, which directs the tracing data to a temporary log file, which can be examined with the SHOW TLOG DATA command. For more information, [EXPLAIN SET TLOG COMMAND.]

You can see what SET PASSTRACE commands are in effect with the SHOW PASSTRACE command:

Since SHOW PASSTRACE lists the actual commands, you can use the IN ACTIVE command to put them into your active file, where you can modify them and then re-execute them with the XEQ command; clearing the current Passtrace settings with a CLEAR PASSTRACE command first is recommended.

B.5  The FILEDEF Subfile and File Compilation

B.5.1  The FILEDEF Subfile

All of the file definitions we have written so far have at least one thing in common: they look like records in a SPIRES file. In fact, every file definition is stored and compiled from a record which you add to a highly-structured SPIRES subfile called FILEDEF. This record is your file definition itself.

The fact that all file definitions are records in the FILEDEF subfile indicates several things about the form your file definition must take in order for it to be accepted as a record in FILEDEF. Lines of our definitions, such as "FILE = GG.UUU.DIRECTORY;" are in the form "element name = value;". Lines such as "REQUIRED;" are in the form "element name;". Several lines of a file definition can actually be coded on one line as long as the ";" is present to act as a delimiter. This compression is not recommended, however, since errors while attempting to add a record to FILEDEF result in a diagnostic message which refers to a line number in the active file. (The definitions in this manual are compressed only to increase readability and save space.)

Since a single statement in a file definition is an element in a SPIRES file, you may extend it over several lines: break the value at any blank and continue it on subsequent lines. One length restriction exists, however: no single element in your file definition may have more than 4,096 characters in it. (In other words, the MAXVAL of the file containing the FILEDEF subfile is 4096.)
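The two statement forms, the role of the semicolon and the continuation rule can be pictured with a toy parser. The Python sketch below is only an illustration of the layout rules described above; it is not how FILEDEF input is processed. It assumes that values contain no semicolons or equal signs and that continuation lines are rejoined with a single blank. The function name and the sample EXP value are hypothetical.

```python
MAX_ELEMENT_LENGTH = 4096    # limit quoted in the text

def parse_statements(text):
    """Split definition text into (element name, value) pairs.

    A statement without "=" yields a value of None.
    """
    statements = []
    for chunk in text.split(";"):
        chunk = " ".join(chunk.split())      # rejoin continuation lines
        if not chunk:
            continue
        name, sep, value = chunk.partition("=")
        name, value = name.strip(), value.strip()
        if len(value) > MAX_ELEMENT_LENGTH:
            raise ValueError(f"{name}: value exceeds {MAX_ELEMENT_LENGTH} characters")
        statements.append((name, value if sep else None))
    return statements

sample = "FILE = GG.UUU.DIRECTORY;\nREQUIRED;\nEXP = A directory\n   of staff;"
parsed = parse_statements(sample)
```

Because only the semicolon ends a statement, the same three statements could have been written on a single line, though one statement per line makes line-numbered diagnostics easier to use.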

Just as with the records we have defined so far, records in the FILEDEF subfile have a unique key. The record is a file definition itself, and the key is the file name. You indicate the key of a FILEDEF record by coding the "FILE =" element of your definition. Since the key of each goal record in a subfile must be unique, you cannot define two files with the same file name. The key, or file name, is the value by which you reference your file definition when you issue a TRANSFER, DISPLAY, UPDATE or REMOVE command while FILEDEF is selected.

The FILEDEF key includes the "GG.UUU" (account) portion of the file name. When you issue a TRANSFER, DISPLAY, UPDATE or REMOVE command, a check is made to ensure that the key of the record involved begins with the account number of the logged-on user. You cannot see, modify, or remove a file definition from any account but the one that added the record, even if you have privileges to use that account's subfiles.

In addition to the FILE element, the FILEDEF subfile has an element called "RECORD-NAME" that is the key of a data element hierarchy defined as a structure. The values of the multiply occurring RECORD element are the names of the records in your file definition. There is a record description that follows each RECORD element. For each record description there is a FIXED section, a REQUIRED section, and an OPTIONAL section. Within each section are element descriptions headed by an ELEM element and followed by an OCC and LEN attribute.

If the ELEM is actually the KEY, it may or may not have a LEN specified. The KEY cannot be in the OPTIONAL section of the RECORD; it must occur in either the FIXED or REQUIRED section, but only in one of these. An additional requirement is that the KEY must be the first element defined in the one section in which it occurs.

Non-KEY elements may occur in any of the three sections. Elements defined in the FIXED section must have both an OCC and LEN specified. If an element occurs in the REQUIRED section, it need not have an OCC or LEN, but its OCC is one or more. Elements defined in the OPTIONAL section also need not have an OCC or LEN, but their OCC is zero or more. As mentioned earlier, if an OCC of greater than one is specified for a REQUIRED or OPTIONAL element, then SPIRES makes that element multiply occurring. The occurrence count of such a multiply occurring element is not checked unless appropriate INPROCs are coded.
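The placement rules in the last two paragraphs can be collected into a checklist. The Python sketch below is a simplified model, not the validation FILEDEF actually performs: the data layout, function name and element names are hypothetical, and it assumes a record with exactly one KEY element.

```python
def check_record(sections):
    """Check KEY placement and FIXED-section attributes.

    sections maps "FIXED", "REQUIRED" and "OPTIONAL" to lists of
    element dicts with "name" and optional "occ", "len" and "key" entries.
    """
    errors = []
    key_places = [(sec, i) for sec, elems in sections.items()
                  for i, elem in enumerate(elems) if elem.get("key")]
    if len(key_places) != 1:
        errors.append("exactly one KEY is expected")
    for sec, i in key_places:
        if sec == "OPTIONAL":
            errors.append("KEY may not be in the OPTIONAL section")
        elif i != 0:
            errors.append(f"KEY must be first in the {sec} section")
    for elem in sections.get("FIXED", []):
        # The OCC and LEN requirement is stated for non-KEY elements.
        if not elem.get("key") and (elem.get("occ") is None or elem.get("len") is None):
            errors.append(f"FIXED element {elem['name']} needs both OCC and LEN")
    return errors

good = {"FIXED": [{"name": "ID", "key": True, "len": 4},
                  {"name": "AGE", "occ": 1, "len": 1}],
        "REQUIRED": [{"name": "NAME"}],
        "OPTIONAL": [{"name": "PHONE"}]}
bad = {"FIXED": [{"name": "AGE", "occ": 1}],
       "REQUIRED": [{"name": "NAME"}, {"name": "ID", "key": True}],
       "OPTIONAL": []}
```

The second record fails twice: its FIXED element lacks a LEN, and its KEY is not the first element of the REQUIRED section.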

Following the record definitions, another portion of the FILEDEF record defines the subfile by specifying the SUBFILE-NAME, the GOAL-RECORD and what ACCOUNTS have access to the goal record. This section, called the "subfile section," also allows an optional element, EXP; this element contains explanatory information about the subfile. The user requests this information by issuing the EXPLAIN command.

You can have several subfile sections, though there need not be more than one. Multiple subfile sections allow you to specify different subfile names, goal records, explanations and access levels to various account groups.

B.5.2  Adding Records to FILEDEF

Once you have collected a record in your active file that meets all the requirements of the FILEDEF subfile structure, you add the record to the file definition subfile. A record is added to FILEDEF just as it is to any subfile, except that in this case the ADD command invokes some complicated validation procedures. Among the validation tests is a check that the first portion of the file name matches the account of the logged-on user.

To add a file definition, do the following in SPIRES:

If your definition contains certain kinds of errors, they will be reported to you at this time and the record will not be added; some other kinds of errors will not be caught until you try to compile the definition. Errors are usually reported for certain lines of the file definition code in your active file; the error number tells you what processing rule you violated in your file definition. These errors can be explained using the EXPLAIN command. The record was successfully added to the FILEDEF subfile if SPIRES returns only a command prompt to you.

Typical errors that are detected when you try to add a record to FILEDEF are:

For example, if you had specified "RECORD-NAME=ARTICLE" in the record description and "GOAL-RECORD=REC01" in the subfile section, no error would be reported until you tried to compile the definition.

If errors are reported, correct your record using the text editor and reissue the ADD command; as noted above, the EXPLAIN command will explain each error. Once the record has been added, you may then compile it.

B.5.3  Compiling File Definitions

To compile your file definition, select the FILEDEF subfile and issue the COMPILE command:

[IN ACTIVE [CLEAR|CONTINUE]] COMPILE filename [BIG] [STATISTICS] [NOWARN]

All these options, also available on the RECOMPILE command, are described below:

An Example of Compiling a File Definition

Despite the elaborate syntax of the COMPILE command, compiling most file definitions is quite simple; most of the options are for the benefit of large or complex files. Here's how a typical compilation session looks:

If SPIRES cannot compile the file definition, it will report errors to you. These errors will look quite different from those you may have gotten when you tried to add the record to the FILEDEF subfile. In particular, the errors caught at the time of compilation are not keyed to line numbers, but to structures and elements in the FILEDEF record. All the diagnostics you might receive from the compiler are described in "File Definition Compilation Diagnostics"; explanations and suggested solutions are provided for most error messages. [See C.9.] You may also issue the EXPLAIN command to find out more about error messages you may receive.

If errors were reported, you should modify your definition in the FILEDEF subfile, and then try compiling it again.

The STATISTICS Option on the COMPILE Command

The STATISTICS option on the COMPILE command is quite useful when you are compiling a very large file definition, because it shows you how close internal tables are getting to their size limitations. Here is a sample display:

The first number in each line is the number of bytes in the table; the second number is the maximum size for that table; the two numbers are compared in the percentage that follows.

Asterisks at the start of a line indicate a table that exceeds 90% of its allowed size. If you don't include the STATISTICS option on the COMPILE command, those lines are displayed as warning messages, which can be suppressed by using the NOWARN option.

Two important tables to watch are the "Overall Limit" and the "USERPROC Tables & Header" in the Record Statistics section. Each is a combination of most of the tables listed above it, with a limit that is less than the total of all the limits of the tables within it. So you could exceed its limit without exceeding the limits of any of the tables within it.
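The arithmetic behind the display is simple enough to sketch. The following Python illustration assumes the 90% threshold stated above; the function names, the sample numbers, and the overall limit used in the usage note are hypothetical and are not taken from SPIRES.

```python
from typing import List, Tuple


def table_usage(used_bytes: int, limit_bytes: int, threshold: float = 90.0) -> Tuple[float, bool]:
    """Return (percentage of the limit in use, whether the line would be flagged)."""
    percent = 100.0 * used_bytes / limit_bytes
    # A table is flagged only when it exceeds the threshold.
    return percent, percent > threshold


def combined_exceeds(parts: List[Tuple[int, int]], overall_limit: int) -> bool:
    """True if the combined size passes the overall limit.

    parts is a list of (used_bytes, limit_bytes) pairs for the component tables.
    """
    return sum(used for used, _ in parts) > overall_limit
```

With two hypothetical tables at 600 and 700 bytes, each limited to 1000 bytes, neither is flagged, yet `combined_exceeds([(600, 1000), (700, 1000)], 1200)` is true. That is the situation the paragraph above warns about.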

B.5.4  Altering a File Definition in FILEDEF

To alter a file definition in the FILEDEF subfile, use the following commands:

The SPIRES command does not clear the active file, so if your added definition is still in the active file, you may make modifications to it using the text editor, and issue the command:

with no preceding transfer command. You may also want to use the MERGE command when it will suit your purposes. To use any of these commands, you must be in SPIRES and have FILEDEF selected. The "account" portion of the file name is required when you are naming the key of a record in the FILEDEF subfile (e.g., on a TRANSFER, DISPLAY, UPDATE or REMOVE command).

If you do transfer your definition you will notice that it is listed in a form that may be different from the form in which you added it: every element is on a separate line, and indentation has been added to show a hierarchical relationship among elements of the definition. You may want to obtain an offline listing of the definition; you can use the PERFORM PRINT command to get one. For help, [EXPLAIN PERFORM PRINT.]

After you have issued the UPDATE command and received no error messages, you must attempt to compile the file definition record in the FILEDEF subfile again by issuing the COMPILE command. If you again receive error messages, you must modify the definition in the FILEDEF subfile before trying the COMPILE command again.

B.5.5  ORVYL Files Created by Compilation

You may find it informative to issue the ORVYL command SHOW FILES both before and after you compile your definition, so you can see exactly what disk files are created on your account by SPIRES. The disk data sets created by the COMPILE command (and erased by the ZAP command) are the following:

The goal-record and index-record data sets are created one per record-type, by default. So, the first record-type defined in the file will be stored in the data set "filename.REC1", the second in "filename.REC2", etc. Only 15 such data sets are allowed, "numbered" from REC1 through REC9, followed by RECA, RECB, RECC, RECD, RECE and RECF.
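The default naming scheme can be expressed as a small function. This is a Python sketch of the rule just described, not a SPIRES facility; `data_set_name` is a hypothetical name.

```python
# Suffix characters in order: REC1 through REC9, then RECA through RECF.
SUFFIXES = "123456789ABCDEF"


def data_set_name(filename: str, ordinal: int) -> str:
    """Return the default data set name for the Nth record-type (1-based)."""
    if not 1 <= ordinal <= len(SUFFIXES):
        raise ValueError("only 15 record data sets are allowed by default")
    return f"{filename}.REC{SUFFIXES[ordinal - 1]}"
```

Under this rule the tenth record-type of a file called MYFILE would be stored in "MYFILE.RECA", and a sixteenth record-type has no data set of its own.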

For various reasons, you may want or need to "combine" several record-types into a single data set. That would allow you to have more than 15 record-types, or to have fewer data sets in use at once. You'll use the COMBINE statement to do this. [See B.6.5.]

B.5.6  The ATTACH and SELECT Commands

As soon as you have added the file definition to the FILEDEF subfile, you (and other users to whom you've given access) can issue the SELECT command to select your subfile. (If the file definition has not been compiled, SPIRES will tell you that there is no file to access.) At that point, you may begin adding and displaying the records, and processing the file in SPIBILD in order to try the indexes.

The ATTACH command is necessary when you wish to examine some record-type other than one that is the goal record-type of a subfile. The syntax of the command is:

"Filename" should use the "ORV." form of the name (ORV.gg.uuu.filename), unless the file is your own, in which case you can omit the "ORV.gg.uuu." portion. Without any of the options (i.e., "ATTACH filename"), the first record-type of the file will be attached.

You can also specify the record-type you want to attach by either its name or number, where "record-number" is the ordinal number (1 through 64) of the record-type you wish to ATTACH. The first record-type defined in the file is number 1; the second is number 2, etc.

This command simulates the SELECT command, although no subfile is involved. However, the attached record-type may be treated as if it were a goal record-type for a selected subfile: the DISPLAY and BROWSE GOAL commands are particularly useful. The ATTACH command is very useful in allowing a file owner to examine the contents of index records, since these can be attached as if they were goal records. [See C.6.5.]

Subfile Selection When Multiple Subfiles Share a Name

Anytime SPIRES, SPIBILD or FASTBILD must attach a record-type based on a subfile name (e.g., when the SELECT command is issued, or a subfile lookup proc such as $SUBF.LOOKUP or A65 is executed, or subfile subgoal processing occurs in formats, or subfile phantom structures are used), a procedure for determining which file and which record-type to attach is followed:

The effect of this procedure is to establish an order of precedence for subfile selection, allowing your own files and their subfiles to take precedence over the system's, which in turn take precedence over other people's when you select a subfile whose name is shared by multiple files.

B.5.7  The PROCESS Command in SPIBILD

After you have added a few records to the subfile, using the ADD or INPUT BATCH commands in SPIRES, you may want to pass those records from the deferred queue to the goal record data set. Normally this process is done overnight by an automatically generated batch SPIBILD job, unless you have coded NOAUTOGEN in the file definition or issued the SET NOAUTOGEN command. If you do not want to wait overnight, you may pass the records online by using the SPIBILD processor. SPIBILD may be called from SPIRES or WYLBUR by issuing the SPIBILD command. The processor is used as follows:

If you want to process a file belonging to another account, and you have processing privileges for it [See B.9a.], you must type the entire file name, including the account number, preceded by "ORV.", as in:

For more information about SPIBILD and file-processing, see the manual "SPIRES File Management"; online, EXPLAIN PROCESS COMMAND.

B.5.8  Making Major Changes to a File: The ZAP FILE Command

[Editor's note: A fuller explanation of this process appears in the SPIRES manual "File Management", chapters 2.6 through 2.9.]

Once you have a compiled file definition you may want to change it -- put new elements in the file, add aliases or delete some elements. After you modify the definition that exists in the FILEDEF subfile, you will have to compile the definition again. However, the COMPILE command will return an error message if the disk data sets created by the COMPILE command already exist. So in order to modify a definition, you often must "zap" the file, which means discarding any and all records already in the file. (There are some changes that can be made to the file definition that will not require you to destroy the records already there because the data actually stored on disk is not invalidated. [See C.1.1.] Changes in the subfile section of the file definition do not force you to erase the records you have entered, for example.)

Let's assume that your changes will require you to erase the file's contents and ORVYL files and start again. Before you get rid of the file that already exists, you may want to salvage data that has been built into the goal records from adds or updates in the deferred queue, in order to minimize the amount of data that must be entered again after the file has been redefined.

To retain records that have been passed from the deferred queue to the goal record data set (either by JOBGEN overnight or by the online SPIBILD PROCESS command), issue the following commands:

After you destroy the current file and compile the file definition again, you can add these records into the new file, as described below.

You now can erase the disk data sets associated with your SPIRES file (those created by the COMPILE command) by issuing the ZAP FILE command in SPIRES:

For example,

The "permission" that SPIRES asks for lets you confirm that you have requested the proper file to be "zapped". (Remember, this is the most permanent change you can make to a file!) If you include the NOWARN option, SPIRES will not ask you if you are sure you want to zap that file.

If you issue the ORVYL command SHOW FILES after the ZAP FILE command, you will see that the disk files created under your account by the COMPILE command have been erased. If you enter SPIRES and select the FILEDEF subfile, you will see that the file definition for the file you zapped, which is maintained in the system FILEDEF subfile, has not been destroyed or altered. You may transfer and update this definition to reflect the changes you wish to make, and then compile the new definition with the COMPILE command.

After you have "zapped" the current file and compiled the file definition anew, you can get the old records back into your active file and then, in SPIRES, select the subfile and issue the INPUT BATCH command to put the records back into the file (that is, into the new file):

Alternatively, use the INPUT BATCH command in SPIBILD as follows:

B.5.9  Making Minor Changes to a File: The RECOMPILE Command

There are some changes you can make to your file definition that do not invalidate data already stored on disk. For example: you can add an element or elements to the OPTIONAL section; you can add aliases to any element; you can add more record types (such as the index records we will define later); you can change some processing rule sequences. A detailed list specifying the changes that can and can't be made to a previously compiled definition is in the section "Recompile of an Existing File's Definition." [See C.1.]

If you make changes like these, you need only TRANSFER and UPDATE the file definition in the FILEDEF subfile, and then issue the RECOMPILE command:

With the exception of SHARE, described below, these options are the same as those on the COMPILE command. [See B.5.3.]

The RECOMPILE command changes only the "master" ORVYL data set for your SPIRES file (filename.MSTR); data records are not altered. Take care that you do not use the RECOMPILE command when data already stored on disk will be altered; this error could cause catastrophic loss of your data, and at best is very difficult to recover from. It is a good precaution to save a copy of your old file definition before you update the FILEDEF record and RECOMPILE; if an error is made, you can recover by going back to the previous definition and recompiling it.

You can make some changes that do not require a COMPILE or RECOMPILE command. Changes in the subfile section (which begins with the SUBFILE-NAME element) only require that you update the file definition in the FILEDEF subfile; the changes will take effect immediately. However, changes in BIN, NOAUTOGEN, or MAXVAL do require a RECOMPILE.

The SHARE option allows you to recompile a file definition even when other users have subfiles of the file selected. (If you don't include the SHARE option, SPIRES won't allow you to recompile in that situation.) The only possible impact to those users might be a delay of a few seconds in processing commands they issue while the recompilation takes place. They won't see any of the changes wrought by the recompilation (e.g., new index or element names, new elements, etc.) until they completely reselect the subfile (that means they must CLEAR SELECT or select another subfile before reselecting the subfile in question).

Using the SHARE option is not recommended if you are changing the number of record-types in the file.

B.5.10  Destroying a SPIRES File

When you want to completely destroy a SPIRES file because you no longer need it, you usually follow two steps. First, you destroy the ORVYL data sets stored on your account that hold the data, by issuing the ZAP FILE command discussed earlier. Second, you remove your file definition from the FILEDEF subfile.

The first step, issuing the ZAP FILE command, is quite radical, since it immediately destroys your entire file, releasing the storage blocks back to ORVYL. If you think there is a possibility that you will need to use the data in the file again someday, you should consider saving the data somewhere else before zapping the file. The procedure for doing that was discussed earlier. [See B.5.8.]

Then you issue the ZAP FILE command:

where "account.filename" is the name of the file as given in the FILE statement of the file definition. (The "account" portion may be omitted, if desired, or replaced by an asterisk or a period, as in "ZAP FILE .MYFILE".)

For example,

If you specify the NOWARN option, SPIRES will destroy the file without asking you to confirm your request. Remember that this change is permanent; once the command starts executing, your file is gone.

The second step is to remove your file definition from the FILEDEF subfile:

If you think you might want to use the file definition again someday, you are welcome to move it from the FILEDEF subfile to the BACKFILE subfile, an archive of file definitions. The above procedure would be modified as follows:

You should remove the file definition from the FILEDEF subfile so that the names of its subfiles will no longer appear in response to the SHOW SUBFILES command. The list of subfiles an account can select is derived from the FILEDEF subfile, not from the compiled files themselves.

Other Uses of the BACKFILE Subfile

There are other reasons why a file owner might want to put a file definition into the BACKFILE subfile. These reasons depend on the fact that a file definition in BACKFILE can be compiled, creating a regular SPIRES file. However, the compiled file is different from other files in two respects:

To compile a file definition in the BACKFILE subfile, follow this procedure:

The COMPILE command has precisely the same syntax in this context as it does when you compile a file definition stored in the FILEDEF subfile. [See B.5.3.]

Record-types may be defined with DEFINED-BY statements, indicating that the record definition is in a BACKRECS subfile record and has been compiled separately. That is, if you move a file definition containing DEFINED-BY statements referring to RECDEF subfile records, you should move the record definitions from RECDEF into BACKRECS -- otherwise, the file definition cannot be compiled in BACKFILE. You can combine the procedure described above for moving a file definition from FILEDEF to BACKFILE, altering it for RECDEF and BACKREC, with the procedure for compiling record definitions, replacing RECDEF with BACKREC, which is described later in this manual. [See C.7.1.]

Similarly, if the file definition contains EXT-REC or EXT-LINK statements, the records referred to in the FILEDEF subfile should be moved (or copied, if they are used by other files) from FILEDEF to the BACKFILE subfile. These are not compiled separately, so no further work needs to be done. [See C.7.2.]

EXTDEF subfile records are handled differently, however. If you compile a file definition or a record definition from BACKFILE or BACKREC respectively, and it contains an EXTDEF-ID statement, the record referred to must be in the EXTDEF subfile, not the BACKDEFS subfile; otherwise, SPIRES will not find it. The BACKDEFS subfile is strictly for archival purposes, and is not examined by SPIRES during any compilation process. [See C.10.5.]

B.5.11  Summary

In sum, the SPIRES commands TRANSFER, UPDATE, DISPLAY and REMOVE are used to maintain your file definition record in the FILEDEF subfile. The COMPILE, RECOMPILE and ZAP commands are used to create, modify and destroy the ORVYL files associated with your SPIRES file that are stored under your account. The BATCH and PROCESS commands are used in SPIBILD to add records to the deferred queue and pass records to the record types (goal record and index records) of your file.

B.6  File Structure: Tree & Slot, Goal & Index, Removed Records

B.6.1  Introduction

In the file definitions coded up to this point, each has had only one RECORD-NAME statement; the record named has been called the "goal record." Though it is possible for a SPIRES file to have only this single record definition, typically a file definition will have several RECORD-NAME statements. Each RECORD-NAME statement after the goal record's RECORD-NAME statement signals the definition of another "record-type," whose contents may be derived from or independent of the contents of the goal record. Additional records whose contents are derived from the elements in the goal records are called "index records"; a file has as many index record-types as it has indexes. Indexes in SPIRES can serve the same purpose as indexes in books; they relate a term (or search value in SPIRES) to its location (a goal record). At this point, it may be helpful to review the terms defined in the glossary. [See A.4.]

Figure B.6.6 is an outline of the relationship between commands issued by a user and the goal and index records. Simply stated, information is passed from the goal records to the appropriate index record or records when the file is processed; this process is called "passing," appropriately enough. When a FIND command is issued, SPIRES examines the index record named in the command. For example:

would examine the author index and report on the number of SMITH's found. Each SMITH located is associated with a pointer to a particular goal record that contains the name SMITH. When a TYPE command is issued, SPIRES uses the pointers to access and retrieve the goal records themselves.

SPIRES updates the index records during overnight processing. The definition of index records is fairly straightforward, and can be reduced almost to "recipes" for the majority of files defined. The linkage between goal and index records is defined by the file owner, and can also largely be reduced to recipe. The recipes require the file definer to determine how the file will be used; the types of indexes needed often follow directly from user requirements.

How does SPIRES interpret, handle, and store data that you input to your file? This question addresses at least two levels of data organization: 1) the element level, dealing with the organization and demarcation of elements in a single record; 2) the record level, dealing with the organization of records in a single record type, i.e., goal or index. We will begin with the element level.

B.6.2  Element Storage

When coding the goal record definition, three storage categories were used: FIXED for elements fixed in both length and occurrence, REQUIRED for elements that must occur, and OPTIONAL for elements that need not occur. (Note: these storage categories are distinct from element structure categories, of which we have encountered only simple elements and TYPE = STR elements; there are also LCTR and XEQ elements.)

"FIXED", "REQUIRED" and "OPTIONAL" signify both an element type and an element storage category. Each type requires a slightly different storage schema. In an earlier chapter of this manual, it was noted that FIXED elements, those for which both LEN and OCC are determined, are the least expensive to store, while OPTIONAL elements, those for which neither LEN nor OCC need be specified, are the most expensive.

Fixed elements are the least expensive to store because they are of predefined and unvarying length and occurrence. SPIRES groups these elements together in the order in which they appear in the file definition, so that individual elements can be separated out of a packed data schema by using element locators stored in the compiled file characteristics. These characteristics are stored in the MSTR (master) file created by the COMPILE command for the following reason: Fixed elements are always stored at the beginning of a record in a SPIRES subfile. The locations of these elements never change from record to record, so the location information for all records can be stored just once, in the file characteristics (MSTR) table.

Element location information for Required elements cannot be stored entirely in the file characteristics table. Because these elements can vary in length and occurrence from record to record, "header" information must be prefixed to each element to indicate how many values occur and how long each value is. Figure B.6.7 shows schematically how the headers are arranged as required for each storage category.

If a Required element has fixed length, then the header contains the total length for all of the element's occurrences. If the element is singly occurring but of variable length, then the header still contains the total length of all occurrences, but here "all occurrences" means "one occurrence." If a Required element varies in both length and occurrence, then three kinds of information are stored in the record in front of the elements' occurrences: 1) again, the total length of all occurrences is stored at the head of the group of occurrences of an element; 2) this single header is followed by an occurrence count for the element; 3) each element occurrence is preceded by the length of that occurrence of the element. Each of these three pieces of information is stored in a two-byte field.

Note that Required (and Optional) elements really have only two categories of occurrence counts: single and multiple. An occurrence count other than one (e.g., OCC=3;) does not cause SPIRES to verify that the element occurs the specified number of times; but if OCC=1 is specified, no occurrence count bytes will be maintained. INCLOSE rules can be used to verify that an element occurs a certain number of times.

Each data element in the Required section of the record may be located by skipping over any preceding elements through the use of the length information prefixed to each element. Thus, the time and expense required to locate a Required element increases the farther into the Required section of a record the element is stored; for this reason it is advantageous to define often-accessed elements early in the Required section and put seldom used elements toward the end.
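The header layout and the skipping process described above can be modeled in a few lines. The following Python sketch is an illustration only. It assumes big-endian two-byte fields, and it assumes that the "total length" counts every byte that follows the total-length field for that element (including any occurrence count and per-occurrence lengths); the text does not specify either detail, and Figure B.6.7 is the authoritative picture. All function names are hypothetical.

```python
import struct
from typing import List


def pack_u16(n: int) -> bytes:
    """Pack a value into a two-byte field (byte order is an assumption)."""
    return struct.pack(">H", n)


def encode_simple(values: List[bytes]) -> bytes:
    """Fixed-length or singly occurring element: total length, then the values."""
    body = b"".join(values)
    return pack_u16(len(body)) + body


def encode_varying(values: List[bytes]) -> bytes:
    """Element varying in length and occurrence.

    Layout: total length, occurrence count, then each occurrence
    preceded by its own length.
    """
    body = pack_u16(len(values)) + b"".join(pack_u16(len(v)) + v for v in values)
    return pack_u16(len(body)) + body


def offset_of_element(required_section: bytes, index: int) -> int:
    """Find the Nth element (0-based) by hopping over preceding total lengths."""
    offset = 0
    for _ in range(index):
        (total,) = struct.unpack_from(">H", required_section, offset)
        offset += 2 + total  # skip the header and everything it covers
    return offset
```

The loop in `offset_of_element` runs once for every preceding element, which is why elements defined later in the Required section cost more to reach.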

Optional data elements occur in the same way as required elements, except that they need not occur at all in some records. The same sort of information that is stored with required elements is sufficient to locate optional elements, but only if there is some means of detecting the presence or absence of each optional element before the value-skipping process begins. This means of detection is a string of bits (binary digits) that is called the "optional element bit mask." If optional elements were defined in a record definition, the optional element bit mask is stored as a data element in the required section. The presence of this element is transparent to the user: it can not be displayed. Each optional element in the record is assigned a number, starting with one. The corresponding bit in the optional element bit mask will be "1" if the element occurs, and "0" if it does not. (For instance, if the fourth bit in the mask for a record is "0", the fourth possible optional element does not occur in that particular record.) In order to locate an optional element, the system checks the corresponding bit in the mask and if it is "1", then counts the number of "1" bits that precede it. The counting determines how many preceding elements in the optional section must be skipped over to reach the desired element.
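The bit-mask lookup can be sketched as follows. This Python illustration represents the mask as a list of 0 and 1 values rather than packed bits, and it treats an element numbered beyond the end of the mask as absent; both choices are assumptions made for the example, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def optional_elements_to_skip(mask: Sequence[int], element_number: int) -> Optional[int]:
    """Return how many optional elements precede the wanted one in the record.

    Optional elements are numbered from one. Returns None if the element
    does not occur in this record.
    """
    if element_number < 1:
        raise ValueError("optional elements are numbered from one")
    # Assumption: a number past the end of the mask means "does not occur".
    if element_number > len(mask) or mask[element_number - 1] == 0:
        return None
    # Count the "1" bits that precede the element's own bit.
    return sum(mask[: element_number - 1])
```

For the mask 1 0 1 1 0, the fourth optional element occurs and two occurring elements must be skipped to reach it, while the second element does not occur at all.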

The optional element bit mask is itself an element in the file. Just as SPIRES creates a fixed key element for you if your record is slot (and creates an implicit fixed section if you had not coded one before), SPIRES implicitly creates a Required element of fixed occurrence (one) and varying length if you define any optional elements. If no Required section exists, one is created for you, though only SPIRES knows it exists. However, a slot key is explicitly requested by the file definer, while the optional element bit mask is part of the overhead involved in storing optional elements.

Once a record-type has records in it, elements can only be added to the optional section. If no optional elements were defined, then no new elements can ever be defined for the record-type. But if any optional elements are coded, the optional element bit mask will occur, and other optional elements can be added to the record definition, with SPIRES simply making the bit mask (which is a varying-length element) longer. You can see now why it is advisable to code a "dummy" element in the optional section if you would not otherwise have an optional section: it allows you to add elements to your file at a later date. Elements added must go in the optional section after any previously defined elements. You may want to specify length and occurrence if they are known, but you cannot add elements to the Fixed or Required sections.

Adding elements to structure definitions has the same rules as adding elements to record definitions: new elements must go at the end of the OPTIONAL section of the structure. If no OPTIONAL elements were defined for the structure, then no elements can be added. Adding new structures to a record definition follows the same rules as adding elements: new structures must be declared in the OPTIONAL section of the record definition.

The chapter "Recompiling An Existing File's Definition" contains detailed information on recompiling a file definition. [See C.1.]

As shown in the chapter "Structures," [See B.3.] a data element may be simple and incapable of any further breakdown, or it may be a structure that consists in turn of other simple or structure elements. Data elements in any of the three storage categories may be structures. Furthermore, the elements within a structure are categorized as fixed, required, and optional. Unlike full records, structures may or may not have key data elements, but if they do, these keys must be fixed or required. There are several corollaries to all this. One is that if a structure is a fixed element, then all its elements must be fixed elements. Probably, if all the simple data elements in a structure are optional, then the structure itself should be declared optional--because if none of the structure's optional elements occur, the structure will not have occurred. (This is why INCLOSE rules for elements in a structure are not executed unless at least one element in the structure occurs.) However, a structure containing required elements may be an optional structure.

B.6.3  Record Storage

We can now look at SPIRES file structure at the record level; that is, how does SPIRES store the goal and index records?

Recall that in the definition of goal records, two types of goal records were distinguished: "tree" and "slot." For a slot goal record we coded "SLOT;" probably "REMOVED;" and possibly "SLOTCHECK;". The way SPIRES handles tree and slot goal records is quite different; since most goal records and all index records are tree structured, we will begin the discussion of record storage with a tree structured data set, such as the one shown in figure B.6.8.

At Stanford, SPIRES files are maintained in the ORVYL file system, in which files are made up of fixed length blocks. SPIRES takes the goal and index records in your file and packs them into blocks. A block is a unit of space on disk measuring 2048 bytes or characters. It is the amount of contiguous data that can be retrieved from disk into memory in one read operation by the system. A block is a physical record; it can be packed with a number of logical records such as the "records" you add to your file. Just as a block can contain several logical records, one logical record may extend over several blocks if the record is larger than 2048 bytes.

Given that one block is 2048 bytes, then 1) if you have goal records of about 200 bytes, only 10 of them will fit into a block; 2) if your goal records are only 10 bytes long, then about 200 of them will fit into a block.
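The estimates above follow from simple division. The Python sketch below deliberately ignores the space taken by block control information and trailers, so it slightly overstates how many records fit; the function names are hypothetical.

```python
BLOCK_SIZE = 2048  # bytes per ORVYL block, as stated in the text


def records_per_block(record_bytes: int) -> int:
    """Upper bound on whole records of a given size that fit in one block."""
    return BLOCK_SIZE // record_bytes


def blocks_spanned(record_bytes: int) -> int:
    """Minimum number of blocks needed to hold one record of a given size."""
    return -(-record_bytes // BLOCK_SIZE)  # ceiling division
```

By this estimate a block holds at most 10 records of 200 bytes or 204 records of 10 bytes, in line with the rough figures given above, and a single 5000-byte record extends over at least three blocks.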

Record data sets (RECORD-NAME = REC01 is the start of one record-type data set in many of the sample definitions in this manual) must be organized so that a particular record can be located by the value of its key data element (with a command such as DISPLAY <key>) in the fewest possible number of disk reads. The number of disk reads is the largest factor in the speed of searching a file. Keeping in mind that the number of blocks the system must examine to locate a particular key is identical to the number of reads, you can see why the more highly packed structure, 200 records of 10 bytes each per block, is preferable to an organization in which only 10 records are contained in a block. But this does not mean that SPIRES efficiency deteriorates as the size of the records increases.

In the introduction to this manual, the claim was made that SPIRES can locate one record in 500,000 in only four disk accesses or reads. What kind of file structure allows such efficiency? SPIRES files are structured in what is called a "B-tree." Figure B.6.8 depicts a very simple B-tree structure; we will use this figure in the discussion of file structure that follows. Many of the details of a file block's contents have been removed from Figure B.6.8; these details are concentrated in Figure B.6.9, showing only one block. The tree structured data set shown in Figure B.6.8 might show the goal records in a dictionary subfile keyed on "word," where "word" is the item you would look up to find its definition. The figure shows only the records' keys, instead of all the elements in each record. Let's assume that each record contains all the elements that are normally associated with a dictionary entry.

Some data base terminology is necessary before we describe how SPIRES would locate a record in this dictionary tree. (You will need to refer to Figures B.6.8 and B.6.9 in this short definition of terms.) The "depth" of a tree is the maximum number of disk accesses to reach a record, or the number of accesses necessary to reach the deepest record. In Figure B.6.8 the depth of the tree is three. The "average disk accesses" is the sum of the number of disk accesses to reach each record divided by the number of records; this is 2.24 in Figure B.6.8. A "node" is any record in a block; there are twenty-one nodes in the present case. A "terminal node" is any record that does not point to a block deeper in the tree; "jungle" and "zero" are terminal nodes, while "join" is a non-terminal node. The highest level block, not pointed to by any other blocks, is called the base or "root" block (also called block 0).

Each block begins with an area of "block control information" (labeled "header" in Figure B.6.9) that SPIRES uses to validate the integrity of data stored in the block. Each block ends with an area of "trailers." Trailers begin at the end of the block and move toward the center of the block. They are fixed length and are maintained in key sequence, which is either alphabetically or numerically increasing, depending on the type of the key. Trailers are used to give the location ("displacement") of each record in the block. Thus, the fixed length trailers can be used to access the varying length records. Because the trailers are kept in key sequence, the records need not be. The trailers are shown only in Figure B.6.9; Figure B.6.8 shows the records themselves in key sequence for simplicity.

After the block control information the actual records are stored, each preceded by a "branch pointer" to a block deeper in the tree. No two branch pointers point to the same block; thus, there is one and only one path from the base block to any record in the data set.

Except in unusual cases, SPIRES will ensure that each block contains at least sixteen records, so that accessing efficiency will not be impaired by a few large records. The mechanism for this is described in the SPIRES design documents, and will not be discussed here. (Figure B.6.8 is simplified by putting fewer than sixteen records in each block.)

Let's outline what happens when the command "DISPLAY HERB" is issued, where HERB is the key of a record in the tree-structured goal record shown in Figure B.6.8. SPIRES will read the base block into memory and scan the keys for "HERB". Finding the key, the search terminates, having taken only one disk access. The record is then displayed.

Suppose the command "DISPLAY DATA" is issued. Starting at the base block again, the block is read into memory and scanned. "DATA" is not found before a key greater than "DATA" in sort sequence, "HERB," is found. Right before HERB there is a branch pointer to block 2. Block 2 is read into memory and scanned until a key greater than "DATA," "ELEMENT," is found. Preceding "ELEMENT" is a branch pointer to block 6, which contains the key "DATA." Blocks 0, 2 and 6 were read to find the key "DATA". Thus, three disk accesses were required.
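The two searches just described can be traced with a short sketch. The Python below is an illustration only, not SPIRES code: the block layout (a list of branch pointer and key pairs per block) is an assumed simplification, and only the three keys and three block numbers named above are included; the rest of Figure B.6.8 is omitted.

```python
# Hypothetical, heavily reduced model of the tree in Figure B.6.8.
# Each block is a list of (branch_pointer, key) pairs in key sequence;
# the branch pointer "precedes" its key and names a deeper block (or None).
BLOCKS = {
    0: [(2, "HERB")],
    2: [(6, "ELEMENT")],
    6: [(None, "DATA")],
}


def display(key, blocks=BLOCKS, root=0):
    """Return (found, blocks_read) for a keyed lookup starting at the root."""
    blocks_read = []
    block_no = root
    while block_no is not None:
        blocks_read.append(block_no)        # one disk access per block read
        next_block = None
        for pointer, stored_key in blocks[block_no]:
            if stored_key == key:
                return True, blocks_read    # key found in this block
            if stored_key > key:
                next_block = pointer        # follow the pointer before the greater key
                break
        # This sketch does not model keys greater than every key in a block.
        block_no = next_block
    return False, blocks_read


print(display("HERB"))
print(display("DATA"))
```

In this model display("HERB") reads only block 0, while display("DATA") reads blocks 0, 2 and 6, matching the one and three disk accesses counted in the text.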

As noted before, the depth of the tree in Figure B.6.8 is three: three disk accesses are required to read the record whose key is "JUNGLE". The depth of any tree is a function of the number of records that can fit into a block: the more records per block, the shallower the tree. If eight records were put into each block in Figure B.6.8, only three blocks would be needed for the entire file, and it would never take more than two disk accesses to locate any record. If 100 records could be put into each block, only a single block would be used, making retrieval extremely efficient. Later in this chapter, we will see how this many records can be packed into a single block.
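The relationship between packing and depth can be estimated with a few lines of arithmetic. The following Python sketch assumes, as a simplification, that every block holds the same number of records and that each record carries one branch pointer to a deeper block, so a tree of depth d with n records per block holds at most n + n^2 + ... + n^d records. The function name and this capacity model are illustrative assumptions, not part of SPIRES.

```python
def min_depth(total_records, records_per_block):
    """Smallest tree depth whose capacity covers total_records.

    Assumed model: level 1 holds n records, level 2 holds n * n, and so on,
    where n is records_per_block.
    """
    if total_records < 1 or records_per_block < 1:
        raise ValueError("both arguments must be positive")
    depth = 0
    capacity = 0
    level_capacity = 1
    while capacity < total_records:
        depth += 1
        level_capacity *= records_per_block   # records that fit on this level
        capacity += level_capacity            # running total for the whole tree
    return depth


print(min_depth(21, 8), min_depth(21, 100), min_depth(500_000, 27))
```

Under this model the 21 records of Figure B.6.8 need a depth of two at eight records per block and a depth of one at 100 per block, as stated above. 500,000 records fit within four levels once roughly 27 or more entries fit in each block.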

If you add a number of records whose keys are close together (with respect to alphabetical sequence), one particular path of the tree will become longer and more heavily "travelled" during record searching operations than others. This is because more records are stored on this long path. Figure B.6.10 depicts such a situation. The unbalanced tree shown would be very unlikely; SPIRES would automatically invoke a rebalancing process to correct the situation, trying to establish a uniform depth for the tree, such as that shown in Figure B.6.11. Uniform depth is not the only quality that SPIRES optimizes for in the rebalancing process, however; SPIRES tries to optimize for a minimum average accesses per record. Figure B.6.12 shows a tree with a uniform depth of three, in which the average number of accesses per record is 2.74. The same tree can be rebalanced as shown in Figure B.6.13. The depth is still three, but the average number of accesses per record is now 2.19.
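The quantity that rebalancing tries to minimize follows directly from the definition given earlier: the sum of the accesses needed to reach each record, divided by the number of records. A minimal Python sketch is shown here; the per-level record counts are hypothetical and are not taken from Figures B.6.12 or B.6.13.

```python
def average_accesses(records_per_level):
    """records_per_level[i] is the number of records stored at depth i + 1."""
    total = sum(records_per_level)
    weighted = sum(
        depth * count
        for depth, count in enumerate(records_per_level, start=1)
    )
    return weighted / total


# Two hypothetical trees, each with 21 records and a depth of three.
before = [1, 4, 16]   # most records sit at the deepest level
after = [4, 8, 9]     # more records sit near the root

print(round(average_accesses(before), 2), round(average_accesses(after), 2))
```

Both hypothetical trees have the same depth and the same number of records, but the second has a lower average because more of its records are reached in one or two accesses.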

The dynamic rebalancing that SPIRES does is localized; it does not extend across the tree. If a general rebalancing of the tree ever becomes necessary, a utility is available. [See B.10.16.] A file manager can determine the shape of any tree by using the file status utilities described in "SPIRES File Management." [See B.10.]

B.6.4  Removed Record-Types

If SPIRES insists on sixteen or more records per block, it would seem that only records of less than 128 bytes would be allowed in the tree, since sixteen 128 byte records would exhaust the 2048 byte limit. But SPIRES records have no such length limitation; strings of textual data alone are commonly longer than 128 bytes. In the chapter describing goal records, the only limitations noted were that 1) if any single occurrence of an element (or structure) was longer than 1024 bytes, a statement such as "MAXVAL=2048" must be coded after the file name [See B.1.8.2.] and 2) no record or element can be longer than 80,000 characters.

How, then, can sixteen or more records, each longer than 128 bytes, be packed into the 2048 bytes in a block? A complete answer to this question must take us into an examination of the ORVYL files created under your account when you issue the COMPILE command.

In the previous chapter, "FILEDEF Subfile and File Compilation," [See B.5.5.] we named the following files and functions:

   MSTR  an encoded characteristics table created from your file
         definition.
   REC1  the first and following record-types you define, usually
   REC2  the goal and index records.
    :
    :
   RECn
   DEFQ  the deferred queue, containing records added or updated
         during the day.
   RES   the "RESidual" data set, whose function is described in
         the remainder of this chapter.

Having these names in mind, let's outline what happens when a record passed from the deferred queue to the tree of REC1 would, because of the record's size, cause fewer than sixteen records to exist in one of the blocks of the REC1 tree. Also, what would happen if the record itself were larger than a block (2048 bytes)?

At this point, SPIRES would not put the record directly in the tree, where it would promote accessing inefficiency, but would "remove" the record to the "residual" ORVYL file. When a record is removed to the residual, the entire record is placed in the residual, and the tree contains only the key of the removed record, and an absolute pointer to its address in the residual. Thus, for the record whose key is "CYST" in figure B.6.8, only the key and a four byte pointer are stored in the tree, while the entire record, including its key, is stored in the residual. (Note: in record-types not coded as REMOVED (see below), the Fixed elements are also placed in the key tree with the key if the key is Required; however, if the key is Fixed, the other Fixed elements are placed in the residual with the remainder of the record. In other words, for non-REMOVED record-types, the key and all Fixed data that precede it are stored in the tree, while the rest of the record may be moved to the residual -- see below.)

What are the consequences of this technique? Several small disadvantages may be obvious: 1) we have duplicate storage of the key, once in the tree and once in the residual, and 2) in order to display a record, SPIRES must do an extra disk access to the residual after having read the necessary blocks of the tree to find the pointer to a record's address in the residual. First, duplicate storage of the key is of small importance, since the key is typically only six to ten bytes. With respect to the second disadvantage, the extra disk access necessary to the residual dataset is usually only a single access: the residual is not tree-structured, but organized sequentially; this means that the pointers into it are not keys, as in tree datasets, but physical addresses of information. SPIRES does not have to hunt for a record in the residual, but can go to its block and address immediately from the pointer information in the tree.

The advantages of removing records to the residual are significant. Since the number of records that can fit in a block is now dependent upon the size of the record key and not the size of the record, many more records can fit in a block, and, since each record in a block points to a block below it, the depth or number of levels of the tree is significantly reduced. Also, if all goal records are removed to the residual, a search request of an index record will locate address pointers to the residual, rather than keys of records in the goal tree. This results in a significant savings in retrieval time and disk accesses. But, the advantages of record removal for indexed search and retrieval are only available if all records are removed.
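A rough calculation shows why removal packs so many more entries into a block. The sketch below assumes, for simplicity, that a tree entry for a removed record costs only the key plus the four-byte residual pointer, and it ignores the block header, trailers and branch pointers. The 8-byte key is just a value within the six-to-ten-byte range mentioned above, and the 200-byte record size is the one used earlier in this chapter.

```python
BLOCK_SIZE = 2048          # bytes per ORVYL block
RESIDUAL_POINTER = 4       # bytes for the pointer into the residual


def entries_per_block(entry_bytes):
    """Whole entries of the given size that fit in one block (overhead ignored)."""
    return BLOCK_SIZE // entry_bytes


not_removed = entries_per_block(200)                 # whole 200-byte records in the tree
removed = entries_per_block(8 + RESIDUAL_POINTER)    # key plus pointer only

print(not_removed, removed)
```

With these assumed sizes a block holds 10 unremoved records but about 170 key-and-pointer entries, which is why the tree for a REMOVED record-type stays shallow.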

If a block would contain fewer than sixteen records, SPIRES will automatically remove some records to the residual. It is possible, and, in fact, the norm, for the file definer to request that this process be done unconditionally. This is done by coding "REMOVED;" after the "RECORD-NAME = " statement. Record removal can be specified for both slot and tree records, and is required for slot record-types unless all of their elements are fixed length.

There are few instances in which you would not want to code REMOVED for goal records (it is very unusual to code REMOVED for index records since they are usually small). If your goal records are small (less than sixty bytes) and there are few of them, then a tree of non-removed records would never become very deep ("deep" is three levels). In this case, if most of your record retrieval were to be done by key values using the DISPLAY command rather than index searching using the FIND command, you would probably not remove records. If your records are large, or you have a large number of them, or they pass to index records and are not themselves index records, then code REMOVED for the goal records. Record-types that serve only as index records should not be REMOVED, and those that serve as both goal and index records usually should not be REMOVED. [See B.2.3.]

B.6.4.1  Very Large Databases

In most SPIRES files with more than a hundred records, the residual (RES) file is the largest of all the files associated with a SPIRES file. There is a system limit to how large a single residual file can grow: 425,984 blocks. If your file will, or is about to, grow beyond this limit, you should specify the RESIDUAL statement in your file definition.

The RESIDUAL statement is specified after the FILE name statement, and indicates the number of residual files to be allowed for your SPIRES file. The allowed values are:

Any changes in the RESIDUAL statement of a file definition require that the definition be recompiled.

Only 45 residuals may reside on the file owner's account. Alternatively, you may specify multiple RES-ACCOUNTS, one for each residual. If you use RES-ACCOUNTS, there must be as many of them as the RESIDUAL statement specifies.

where "gg.uuN" is an account that will hold the N-th residual data set. You must have WRITE access to each file for each RES-ACCOUNTS account. Therefore, RES-ACCOUNTS works much like DEFQ-ACCOUNT. [See C.6.23 for details on subjects such as default data set permits for the new residuals, etc.]

[See B.6.4.2 for information about RES-LARGE and its relationship to the RES-ACCOUNTS statement.]

B.6.4.2  RES-LARGE Statement

The RES-LARGE statement indicates that each RESIDUAL data set can grow to 13,631,488 blocks, and that locators will be 26 bits in size. RES-LARGE may be coded without RES-ACCOUNTS, in which case RESIDUALS will be large (as above), and will be stored on the FILE's account. However, RES-LARGE is usually coded with a matching set of RES-ACCOUNTS. [See B.6.4.1.]

Currently, only the Research Libraries Group files use RES-LARGE. You cannot switch a file from non-RES-LARGE to RES-LARGE without rebuilding the file, because the locators change to 26 bits.

B.6.5  Combined Record-Types

It is possible to reduce the number of disk accesses in another way. When a FIND command is issued, SPIRES attaches all ORVYL files associated with the indexes being searched in the selected SPIRES file. For a file with fourteen index datasets and a goal record dataset, this could require attaching fifteen ORVYL files: REC1, REC2,...REC9, RECA, RECB, RECC, RECD, RECE and RECF. Such a file could not have more than fourteen indexes, since a "RECG" is not allowed.

To reduce the number of attaches, it is advisable to increase the number of index records put into the same physical ORVYL file, while keeping them logically distinct. This is done by coding "COMBINE = <record-name>;" after the "RECORD-NAME = " statement.

Often all index records can be combined into the goal record dataset as follows:

By combining data sets, you may have up to 64 record types defined (i.e. RECORD-NAME statements) in a file; you are limited to thirteen otherwise. (By the way, if there are more than 32 record types in a file, issuing the SHOW SUBFILE SIZE command may not give you the number of records in a particular subfile of the file; instead you will receive the message "UNKNOWN SIZE". This won't happen if the file's available-space tables have been reformatted with the AVSPREC command, using the REFORMAT option; see the manual "SPIRES File Management" for details.)

No more than eight of the record types defined in a file may be slot, however.

The "record-name" of "COMBINE = record-name" may never be a slot record and a SLOT record type may never have a COMBINE statement coded in it; that is, each SLOT record type must be physically separate from any other record types defined. [See B.7.12.]

B.6.5a  Extended Tree Data Sets for Large Databases

As of January 2003, all new SPIRES non-SLOT tree data sets (RECn data sets) will be defined as Extended-Tree data sets by default. This means that the "EXTENDED-TREE;" statement described below is no longer needed to secure the benefits of extended tree data sets.

The RECOMPILE command will still recognize that non-extended tree data sets are to remain non-extended. If you are RECOMPILEing an existing file definition and adding record-types which will be in a new RECn data set, then that data set will be an extended tree data set. If you wish to convert an older data set to extended-tree, you must still follow the steps that enable you to perform this conversion. [EXPLAIN EXTENDED TREES, CREATING.]

You may have valid reasons to ensure that a tree data set is not extended (certain system data sets must remain this way). An option on the EXTENDED-TREE statement allows this possibility. If you code "EXTENDED-TREE = NO;" you will accomplish this goal.

Typically, in a large file, the residual dataset is the first physical dataset to grow to the ORVYL file system limit of 425,984 blocks. Getting around that limit requires the use of multiple residuals, as described earlier in this chapter. [See B.6.4.1.]

But eventually, a very large file may run into the limit on the size of record trees, which is 64K blocks per tree. By coding the EXTENDED-TREE statement with no value as part of the record definition, you can request that the data set containing the current record-type is to be an "extended-tree" data set, one which can grow to 434,280 blocks:

Once this has been added to the file definition, and the file definition has been recompiled, you need to follow a special rebalancing procedure to convert the existing trees that are affected by this change (internally speaking, that means they will have 3-byte instead of 2-byte branch block pointers). The full procedure to follow is explained in the manual "SPIRES File Management", chapter 5.6; online, [EXPLAIN EXTENDED TREE DATA SETS.]

Restriction: Slot record-types may not be in an extended tree data set; that means a slot record-type cannot be combined with a record-type that is in an extended tree data set.

Using the TREE-DATA Structure to Control Tree Depth

Some files are prone to exceptionally large amounts of updating activity in a particular key area. For example, suppose a goal record-type of transactional data has keys that begin with some particular prefix (the current year, for example) and there is a huge amount of growth in one area of the tree. That could lead to tree depth problems in that area.

The TREE-DATA structure lets you request a tree specifically for a key area of the record-type. It can occur multiple times in a record-type definition, so multiple trees can (and should) be defined. If a record definition contains TREE-DATA structures, then it must be in an extended tree data set or COMBINEd with a record-type that is in an extended tree data set.

The START-KEY statement holds the value of the lowest key that will be stored in the tree being defined by this structure. All records whose keys are greater than or equal to this starting key but are less than the start-key of the next tree will be stored in this tree. The START-KEY value will be embedded in the block header of all blocks associated with this tree.

It can be variable in length, and have one of two forms:

The optional TREE-PREFIX statement is for special cases of tree-data structures where you know every single value that will go in the tree being defined will have the identical prefix. You would know that's true if your Inproc rules guaranteed it, or if the next TREE-DATA structure's START-KEY was the very next possible prefix (see below). The TREE-PREFIX statement provides an integer specifying the number of characters in the START-KEY that will be on all keys of the tree. So, for instance, if the START-KEY of a tree is 92, and the START-KEY for the next tree is 93, then you would know for sure that all keys between 92 and 93 would go into this tree, and they would all begin with "92". So you could specify TREE-PREFIX = 2.

The TREE-PREFIX statement causes SPIRES to strip off the prefix before storage in the tree. If you have thousands of records with the same prefix, which has several bytes, this can be a significant storage savings over time.

As mentioned above, your record-type may have Inproc rules to guarantee what the possibilities for a key are. However, it is recommended that you also include TREE-DATA structures to cover the entire range of possible key values, particularly at the beginning. For example, here is a series of TREE-DATA structures for a file:

As you can see, there is no key value that would be left without a tree to call its home. The records stored in the middle three trees listed above would all be stored without the first two characters of their keys, which would be either 90, 91 or 92 as appropriate. Notice that a null value was given for the first one to handle any records with keys previous to "90"; a null value needs to be written as shown, using apostrophes. The standard null value "START-KEY;" will not compile.
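The way START-KEY and TREE-PREFIX divide the key space can be sketched in Python. This is an illustration under stated assumptions, not SPIRES code: the list of trees is reconstructed from the description above (a null start key, then "90", "91" and "92" with TREE-PREFIX = 2), and the final start key "93" is assumed. Python compares strings by character code rather than by the collating sequence SPIRES uses, which makes no difference for the all-digit prefixes used here.

```python
from bisect import bisect_right

# (START-KEY, TREE-PREFIX) pairs in ascending START-KEY order.
# The last entry ("93") is an assumed value for illustration.
TREES = [("", 0), ("90", 2), ("91", 2), ("92", 2), ("93", 0)]
START_KEYS = [start for start, _ in TREES]


def place(key):
    """Return (tree_number, stored_key) for a record key.

    A key belongs to the last tree whose START-KEY is <= the key; the
    TREE-PREFIX count of leading characters is stripped before storage.
    """
    tree_no = bisect_right(START_KEYS, key) - 1
    prefix_len = TREES[tree_no][1]
    return tree_no, key[prefix_len:]


print(place("911234"))   # lands in the "91" tree, stored without its prefix
print(place("89"))       # sorts before "90", so it falls into the first tree
```

Because the first start key is null, every possible key finds a tree, and only the trees with a TREE-PREFIX store shortened keys.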

The TREE-MONOTONIC statement tells SPIRES that the keys of the records stored in that particular tree will be entered monotonically, i.e., in order. That helps SPIRES store the records more efficiently. It should not be coded unless at least the vast majority (say 98 percent or more) of the records for that tree will be entered into the database in sequential order. If the entire record-type (as opposed to individual trees) would be monotonic, you should code the MONOTONIC statement. [See B.2.4.]

There is no fixed limit on the number of trees you can define with TREE-DATA. The limit is determined by an internal table that keeps track of numerous file characteristics. The length of the start keys will also affect the limit. But generally speaking, there is probably room for dozens, perhaps a hundred or more, TREE-DATA structures.

When you add or change TREE-DATA structures in an existing file, you need to recompile the file definition with the REBALANCE option. Again, the full procedure to follow is explained in the manual "SPIRES File Management", chapter 5.6; online, [EXPLAIN EXTENDED TREE DATA SETS.]

B.6.5b  The OVERFLOW-TO and OVERFLOW-KEY statements

In extremely large files, a record tree can possibly become too large for SPIRES to handle properly. Specifically, if the tree would contain more than 65K blocks, then it must be split over two identical record-types, using the OVERFLOW-TO and OVERFLOW-KEY statements.

To use "overflow processing", your file definition must have two almost identical record definitions. The difference between them is that one includes the OVERFLOW-TO and OVERFLOW-KEY statements. When records being added to that record-type have keys greater than that given in the OVERFLOW-KEY statement, they are added to the other record-type. In accessing records by key, SPIRES checks the requested key against the "overflow key" to determine which record-type contains the desired record. No other special statements or values need to be coded (e.g., you do not specify both record-types as GOAL-RECORDs). [See B.9.2.] Once you have coded the OVERFLOW statements, SPIRES handles the rest.

The syntax of the two statements is:

These are coded at the end of the definition of the "overflowing" record-type. The "record-name" is the name of another record-type defined in the file that has the same design as the current one. The "hex-value" is a four-byte hexadecimal value that specifies the value at which overflowing is to occur. For example, "OVERFLOW-KEY = C4000000;" specifies that all records with keys whose values are "D" or greater ("D" is "C4" in hex) are to overflow to the specified record-type. (Note: keys shorter than 4 bytes are padded on the right with zeroes (i.e., hex 00) to a length of four bytes before being compared to the overflow key value.)
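The comparison described above can be sketched in a few lines. This Python example is an illustration only: it assumes keys are EBCDIC character strings (encoded here with Python's "cp037" code page) and that only the first four bytes of a longer key take part in the comparison. Neither assumption is confirmed by the text, and the function name is hypothetical.

```python
OVERFLOW_KEY = bytes.fromhex("C4000000")   # from "OVERFLOW-KEY = C4000000;"


def goes_to_overflow(key):
    """True if a record with this key would overflow to the other record-type."""
    raw = key.encode("cp037")              # EBCDIC bytes; "D" becomes hex C4
    probe = raw[:4].ljust(4, b"\x00")      # truncate, or pad on the right with hex 00
    return probe >= OVERFLOW_KEY           # "D" or greater overflows


for sample in ("CYST", "D", "DATA", "ZERO"):
    print(sample, goes_to_overflow(sample))
```

With this overflow key, "CYST" stays in the original record-type, while "D", "DATA" and "ZERO" go to the record-type named in OVERFLOW-TO.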

Even if overflow processing is not required, it may be desirable to specify it to help keep record accessing efficient. In any event, the decision to use it should be made with your SPIRES consultant.

B.6.5c  The EXTERNAL-TYPE statement

A SPIRES file can be defined in such a way that one of its record-types is an External record-type. This is done by coding the element EXTERNAL-TYPE in the record-type definition. A subfile which has an EXTERNAL goal record-type has some properties that are quite different from other SPIRES subfiles. This subfile will have a look similar to normal subfiles in that its Deferred Queue can be used for retrieval and update but the "tree" portion of the subfile is "external" to SPIRES. That is, the "tree" is not in an ORVYL RECn data set. Rather it is located either on a SPIRES Device or on some medium foreign to the ORVYL environment.

This "foreign" information source refers to any source of information that can be moved in some fashion into a SPIRES Device area. This data could come from a WYLBUR data set, a remote database accessed through the NIO Device Area, a WYLBUR QUERY or OS data sets accessed through batch jobs or through the SUSAN Path.

The EXTERNAL-TYPE statement provides the linkage that SPIRES needs to understand the access control to this remote (external) data.

More information about this facility is available online. [EXPLAIN EXTERNAL FILES, INTRODUCTION.]

B.6.6  Figure: Function of Goal and Index Records

B.6.7  Figure: Storage of Element Length and Occurrence Information

B.6.8  Figure: A Tree-Structured Data Set

B.6.9  Figure: Detail of the Structure of a Single File Block

B.6.10  Figure: Sample Tree After Intense Local Growth

B.6.11  Figure: Sample Tree With Well-Distributed Growth

B.6.12  Figure: Tree Showing High Number of Access Per Record

B.6.13  Figure: Previous Tree After Rebalancing

B.7  Understanding and Coding Index Records

B.7.1  How Indexing Works

Let's consider what happens when we want to build an index. Suppose we had a subfile called "TABLE OF CONTENTS"; each record in the subfile is a chapter number (the key of the record) and a chapter title. If an appropriate format were written, the table of contents for the first seven chapters of Part B of this manual might look like this:

Chapter   1: Goal Record Concepts and Definition
Chapter   2: Goal Record Keys, Slot and Removed Records
Chapter   3: Structures
Chapter   4: Processing Rules: INPROC, INCLOSE, OUTPROC
Chapter   5: FILEDEF Subfile and File Compilation
Chapter   6: File Structure: Tree and Slot, Goal and Index Records
Chapter   7: Understanding and Coding Index Records

An index based on the words appearing in the chapter titles might look like this:

In fact, an index at the end of a book is a good example of the structure of a simple SPIRES index. The record definition for this index would look like this:

The element "TITLE-WORD" contains one of the words in each of the titles. Since it is the key of the record, it can only occur once in each record. So, each word ("AND", "CODING", etc.) in the above index is the key of a separate record in the index record-type. But notice that for each "TITLE-WORD" there may be several occurrences of "GOAL-RECORD-KEY", each occurrence pointing to the goal record in which the title word occurs.

The record in this index record-type for the TITLE-WORD "GOAL" would look something like this:

This is an index record whose key is some "word" (here, "GOAL") and contains a set of pointers to those goal records that contain a certain word in the title. Each record that is stored in this index contains as its key a word from a chapter title, and one or more pointers, each to a goal record whose TITLE element contains the word that is the key of the record.

A SPIRES search of this index could look like this:

While a DISPLAY <key> command searches the goal record tree for the key named, an index record is searched by the FIND <searchterm> <key> command. For example: SPIRES will locate the record that has the key-value "goal" in the index named in the FIND command, and count up the number of pointers to the goal record data set to indicate how many records are likely to be in the search result. (The number reported may not be entirely accurate, since a single particular record may have several pointers to it in the same index record; if this is the case, SPIRES will report a corrected number in the search result after it has examined the records via the TYPE or OUTPUT commands.)
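The structure of such a word index, and the pointer count reported by a search, can be sketched in Python using the seven chapter titles listed above. This is an illustration, not a description of how SPIRES builds indexes: the word-splitting rule (runs of letters, upper-cased) and the choice to store one pointer per word occurrence are assumptions.

```python
import re
from collections import defaultdict

# Goal records: chapter number (the key) and chapter title.
TITLES = {
    1: "Goal Record Concepts and Definition",
    2: "Goal Record Keys, Slot and Removed Records",
    3: "Structures",
    4: "Processing Rules: INPROC, INCLOSE, OUTPROC",
    5: "FILEDEF Subfile and File Compilation",
    6: "File Structure: Tree and Slot, Goal and Index Records",
    7: "Understanding and Coding Index Records",
}


def build_index(titles):
    """Map each title word to a list of goal record keys (one per occurrence)."""
    index = defaultdict(list)
    for chapter in sorted(titles):
        for word in re.findall(r"[A-Za-z]+", titles[chapter].upper()):
            index[word].append(chapter)
    return index


index = build_index(TITLES)
print(index["GOAL"])                             # pointers in the record keyed "GOAL"
print(len(index["AND"]), len(set(index["AND"])))  # pointers counted vs. distinct records
```

In this sketch the record keyed "GOAL" points to chapters 1, 2 and 6. The record keyed "AND" holds six pointers but only five distinct chapters, because the word occurs twice in the title of chapter 6; this is the kind of case in which an initial pointer count can exceed the number of records finally retrieved.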

An index record thus looks very much like a goal record as far as SPIRES is concerned. It has a key of fixed or varying length, depending upon the nature of the data being passed to the index; it has a multiply occurring element called a pointer (that may be the key of a structure). Since the FIND command attempts to locate records by key, the most efficient structure for storing and locating index records will be a tree structure. Typically, the records in an index are not REMOVED, since they are usually quite small, allowing a large number of them to fit into a single tree block. Index records are thus structured and stored in a manner identical to that for goal records, except that we can take advantage of their small size.

The simple definition for the TITLE-WORD index shown above is probably not what most file definers would specify, especially if the goal record (the chapter titles) were a REMOVED record type. If the goal record is REMOVED, then its indexes do not usually store goal record keys, but addresses of goal records in the residual data set. The index definition would probably look more like this:

Index records exhibit a new type of element, the locator, denoted by "TYPE=LCTR." This element refers to ("locates") a goal record, not by its key, but by its address (location) in the residual data set. To see why this is done, consider the sequence of events for a FIND and TYPE command. SPIRES searches an index and accumulates a list of pointers to the goal records, and reports on the number of pointers found. If each of the pointers were in the form of a goal record key, then the TYPE command would cause SPIRES to read blocks of the goal record tree until it found the location of the referenced record in the residual; then SPIRES would access the residual. In almost all cases, the middle step of searching the goal record tree for the record's location in the residual can be eliminated by storing that location itself as the pointer, rather than the key of the record; this optimization can only be done when the goal records are REMOVED.

To be precise, a pointer to a non-REMOVED record type is not declared TYPE=LCTR, since it contains a goal record key rather than a pointer to a location in the residual data set. Only the pointer to a REMOVED record type can be TYPE=LCTR.

Since SPIRES creates and maintains indexes automatically, the file definer must tell SPIRES how and what information is to go from the goal record to a certain index. The file definer specifies how this is done in the "Linkage Section" that follows all of the index record definitions and precedes the "Subfile Section" that specifies subfile name and privileges.

As its name implies, the Linkage Section links the goal and index records; it defines how information is passed from the goal to the index records when file updating is done, and it defines how indexes are to be searched. The details of coding the Linkage Section are covered in the next chapter "Understanding and Coding the Linkage Section." [See B.8.] With this brief look at the structure of a very simple index record, we can now consider the different methods of indexing available to the file definer. Each method has a search and retrieval situation for which it is particularly well suited. For any file that will be searched often, or will contain more than one thousand records, indexing plans should be discussed with the SPIRES consultant. For each indexing strategy described below, guidelines for its use are also presented.

B.7.2  Understanding Simple Indexes

Simple indexes may be defined for files of any size. Their structure and use by the system is "simple" and efficient. Here is a picture of two records in a simple index:

Also, simple indexes are the only type of index for which a synonym can be maintained. [See C.3.]

If an element to be indexed has only a few possible values, it may be best to "search" this element using Global FOR, or perhaps index it as a "qualifier" (see below). Any time a search request would retrieve a large percentage of the records in a file (seventy percent or so), simple indexes may not be the best search mechanism. For example, an index built on the sex (male or female) of people in a personnel file may or may not be necessary, depending upon the search situation. If that index will not be searched frequently, it may be cheaper to search the goal records sequentially (using FOR or ALSO) than to pay the cost of building, updating, and storing the very large index record entries. If a search result will frequently be narrowed by a "sex" criterion, then "sex" might be added to another index or indexes as a qualifier.

B.7.3  Understanding Qualifiers

Qualifiers provide a search flexibility for large files, allowing search requests to be narrowed by the specification of criteria that would be inefficient to search and index otherwise (such as the language in which a program is written in the MASTERLIST subfile--only four or so possibilities exist). Qualifiers should be used sparingly: they must be stored redundantly in each index to which they apply, generating high storage costs. Let's look at the structure of a simple index with one qualifier.

If we were to qualify the title-word index shown in the beginning of this chapter with a STATUS element, allowing only the values "Preliminary," "Current," and "Out of Date," the structure of the index and a sample record in it would be something like this:

As you can see, the qualifier is stored with each pointer. Thus, a qualifier takes up quite a bit of space, relative to a simple index on an element with only a few values (such as STATUS). But the time required to search on the basis of a qualifier is less than that for searching two indexes, especially if one of them has only a few large records (entries) in it. This is because a qualifier search request narrows a search by operating on an existing search result stack; a search involving two indexes requires SPIRES to build two search results, and AND them together.

So, if search time is more important than storage cost, and you will frequently want to qualify a search request by a certain criterion or criteria (there can be more than one qualifier for an index), a qualifier may be appropriate.
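The trade-off just described can be sketched with a small conceptual model. This is only an illustration in Python, not SPIRES code: the dictionary layouts, the locator numbers and the function names are all assumptions made for the example. The qualified search reads one index record and discards pointers whose stored qualifier does not match; the two-index search builds two results and ANDs them.

```python
# Conceptual model only: the record layouts and sample values are
# illustrative assumptions, not the stored SPIRES format.

# Title-word index qualified by STATUS: each pointer carries the qualifier.
TITLE_INDEX = {
    "SEARCH": [(101, "Current"), (102, "Out of Date"), (103, "Current")],
    "INDEX": [(102, "Out of Date"), (104, "Preliminary")],
}

# Separate simple index on STATUS: few keys, each with many pointers.
STATUS_INDEX = {
    "Current": [101, 103],
    "Out of Date": [102],
    "Preliminary": [104],
}


def find_with_qualifier(word, status):
    """Read one index record and drop pointers whose qualifier differs."""
    return [ptr for ptr, qual in TITLE_INDEX.get(word, []) if qual == status]


def find_with_two_indexes(word, status):
    """Build two separate search results, then AND them together."""
    by_title = {ptr for ptr, _ in TITLE_INDEX.get(word, [])}
    by_status = set(STATUS_INDEX.get(status, []))
    return sorted(by_title & by_status)
```

Both functions return the same pointers; the difference is that the second must first read the large STATUS entry in full, while the first pays for the qualifier in storage on every pointer.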

Several other facts about qualifiers will influence a decision on their use: 1) they may only be used with the AND and AND NOT logical operators; 2) they allow the full range of relational operators, such as ">" and "<"; 3) they can only be used after a search request involving the index to which they are attached. For example, assume DATE is a qualifier to a TITLE index:

One additional requirement is that the qualifier must occur in any index(es) to which it applies; note that a global qualifier usually is a Required element in the Optional pointer structure--if the POINTER occurs, then the qualifier must occur also. For both global and local qualifiers, this means that either:

Qualifiers may also be "local" or "global." A local qualifier may be used only after the index to which it applies has been named in a search request. A global qualifier may be used any time after the first FIND command referencing any index. Global qualifiers are stored redundantly on every pointer of every index; they are thus quite expensive from the standpoint of storage costs.

B.7.4  Understanding Sub-Indexes

Sub-indexes are almost exclusively used with personal name indexes. The personal name search processing rule (SRCPROC A38) breaks a search value into two portions: last name, and first names. After searching the index record on the last name, the first names are used to determine which sub-index structures define the pointer groups. If no first names were given in the search request, all pointer groups in the index record are logically OR'd together.
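As a rough illustration of this lookup, the following Python sketch models a personal name index as nested tables. It is not the behavior of SRCPROC A38 itself: the "Last, First" input form, the exact-match comparison of first names, the sample data and the function names are all assumptions made for the example.

```python
# Conceptual model of a personal name index with a first-name sub-index.
# Outer key: last name. Inner key: first names. Values: goal record pointers.
NAME_INDEX = {
    "SMITH": {"JOHN": [1, 4], "MARY": [2]},
    "JONES": {"JOHN": [3]},
}


def split_name(value):
    """Split a search value into last name and first names.

    The 'Last, First' form is an assumption for this sketch.
    """
    parts = value.split(",", 1)
    last = parts[0].strip().upper()
    first = parts[1].strip().upper() if len(parts) > 1 else ""
    return last, first


def find_name(value):
    """Search on the last name, then select pointer groups by first names."""
    last, first = split_name(value)
    sub_indexes = NAME_INDEX.get(last, {})
    if not first:
        # No first names given: OR together every pointer group in the record.
        pointers = set().union(*sub_indexes.values())
    else:
        pointers = set(sub_indexes.get(first, []))
    return sorted(pointers)
```

A request for "Smith" alone therefore returns every Smith, while "Smith, John" returns only the pointer group under that sub-index.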

Sub-indexes can be used in other ways besides personal names, and are searched by specifying the commercial "at" ("@") character. For example, we might make CITY a sub-index of STATE and search as follows:

or make SEAT and ROW sub-indexes of SECTION:

Sub-indexes can be useful when things logically fit inside other things, as cities do in states, or seats do in sections. They allow you to choose a subset of the index as a result.

Only the equality operator ("=" or a blank) may be used with sub-indexes.

B.7.5  Understanding Compound Indexes

Several elements can be passed to a single compound index (only one compound index may be defined per goal record), requiring somewhat less storage space than several simple indexes. Passing several elements to a single compound index is necessary when the number of indexes defined for a goal record would require that more record-types be defined in a file definition than are allowed (64 is the maximum). Compound index organization is most efficient when the data elements are numerics (often requiring relational operators for effective searching) and short alphanumerics (such as codes).

However, there are significant disadvantages to compound indexing strategies when a file is large (more than eight thousand records) and is searched or updated frequently. As the file gets large, the cost of updating a compound index with many entries gets progressively greater; compound indexes are also somewhat more time consuming and expensive to search than simple indexes, usually requiring several disk accesses to retrieve the large records they contain from the residual data set. Also, many of the elaborate search and pass processing rules (SRCPROC and PASSPROC) available for simple indexes are not available for compound indexes. In addition, the BROWSE command cannot be used to inspect the contents of a compound index.

All of the advantages and disadvantages of compound indexes arise either from the search and update techniques they require, or from their structure, which is similar to indexes with local qualifiers. In a simple index, there is one index record for each unique value passed from the goal records; if several goal records had the same value, then the one index record for that value would have multiple occurrences of pointers to the goal records. For compound indexes, however, there is one index record for each element-mnemonic in the goal record that passes to the index, and every unique value that that mnemonic has forms an occurrence of a pointer structure containing the pointer and the value of the element in the goal record that is being pointed to. For example: if a goal record passes TEMPERATURE, AGE and DATE to a compound index, the goal and index records would look like this:

Note that the records shown in the right column are created and maintained by passing. The key of the record is a combination of the structure and element number of the element that is being passed from the goal record; these keys are computed by SPIRES.

When a search request is made against a compound index,

the single index record containing all of the AGE values in the goal record is read, then all the values (ELEM-VALUE, above) in the index record read in are scanned, and pointer groups not meeting the criteria are weeded out of the search result. Because a single record containing all AGE values exists in a compound index on AGE, a command such as

is possible; the result will be all records in which the AGE element passed a value to the index.
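The organization and scan described above can be modeled in a few lines of Python. This is a conceptual sketch only: the element name stands in for the two-byte key that SPIRES computes, and the sample records, locator numbers and function names are assumptions made for the example.

```python
# Conceptual model of a compound index: one index record per passed
# element, holding one (pointer, value) occurrence per value passed.
GOAL_RECORDS = {
    1: {"TEMPERATURE": [98.6], "AGE": [34], "DATE": ["1980-07-01"]},
    2: {"TEMPERATURE": [101.2], "AGE": [61]},
    3: {"AGE": [29]},
}


def build_compound_index(goal_records, passed_elements):
    """Pass each occurrence of each named element to the compound index."""
    index = {}
    for locator, record in sorted(goal_records.items()):
        for element in passed_elements:
            for value in record.get(element, []):
                index.setdefault(element, []).append((locator, value))
    return index


def search(index, element, predicate=None):
    """Read the single record for the element and scan all of its values.

    With no predicate, every pointer in the record is returned, that is,
    every goal record in which the element passed a value.
    """
    entries = index.get(element, [])
    return sorted({ptr for ptr, value in entries
                   if predicate is None or predicate(value)})


COMPOUND = build_compound_index(GOAL_RECORDS, ["TEMPERATURE", "AGE", "DATE"])
```

Note that every search on AGE touches the whole AGE record, whatever the criterion; this is why the record's size matters so much to search cost.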

A compound index record may grow quite large if 1) it contains many values because the number of goal records passing to it is large, or 2) the values passing to it are long, such as lengthy character strings. Since a large record must be read and then scanned, searching a compound index, particularly in medium and large sized files, may give a noticeably slower response than searching a simple index in the same subfile. Also, updating such a large record is more time consuming, and thus more expensive, than updating a simple index; because the large records in a compound index may often overflow the 2048-byte limit for an ORVYL file block, multiple disk accesses may be necessary to search for or update a single record.

If the file is not large, or if the elements being indexed do not occur in a majority of the goal records, or if updating is not done nightly, then compound indexes are quite suitable for numerics and short alphanumerics, such as codes.

B.7.6  The Impact of Global FOR and ALSO on Indexing

The ALSO and Global FOR commands provide substitutes for a compound index in a large file. Of course, these methods involve sequential rather than indexed searching, and will be noticeably slower (more elapsed and CPU time required) than a compound index search unless the existing search result is small.

The ALSO command always examines all the goal records pointed to in an existing search result; this capability is also available using the Global FOR commands. In contrast to the FOR and ALSO sequential search, an index search request preceded by another search request operates as any compound search request: two or more subsets of pointers are built, one for each of the search criteria, then put together into a single search result. For this reason, if a search request requiring relational operators were always preceded by search requests yielding a relatively small search result, such a request might be performed most efficiently using a Global FOR or ALSO command.

Another consideration: if searching is done sequentially by the Global FOR or ALSO commands, then no expenses are incurred for building, updating and storing compound or simple indexes. If search requests against the values in some elements will be quite infrequent, it may be advisable to use sequential search techniques rather than indexed search techniques. Retrieval may be slower, but costly indexes of little use will not be maintained.

There are some cautions to the use of sequential searching techniques, however. Unlike the FIND command, the ALSO command cannot initiate a search; it must always operate on a preceding search result--in this respect it is like a Qualifier. Unlike the ALSO command, the Global FOR commands need not operate via a search result, but can.

The search criteria for Global FOR commands are specified in the WHERE clause. Two additional operators are available in the Global FOR WHERE clause: OCCURS and LENGTH; these are not available to the FIND or ALSO commands. OCCURS allows a user to specify search criteria based on the number of occurrences of an element, and LENGTH allows criteria based on the length of any single occurrence. For example, suppose you wished to print mailing labels from your subfile's records, but first wanted to print all addresses that would not fit on standard labels. This might be done as follows:

This would place in the active file the subset of all goal records that had more than four lines of ADDRESS or had any occurrence of the element ADDRESS that was longer than the width of a label, 35 characters.
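The selection logic of those two criteria can be expressed as a small Python sketch. It models only the test applied to each record, not the Global FOR commands themselves; the record layout, sample data and function name are assumptions made for the example.

```python
# Conceptual model of OCCURS and LENGTH criteria applied to ADDRESS.
MAX_LINES = 4    # more occurrences than this will not fit on a label
MAX_WIDTH = 35   # label width in characters


def address_will_not_fit(record):
    """True if ADDRESS occurs too often or any occurrence is too long."""
    lines = record.get("ADDRESS", [])
    too_many = len(lines) > MAX_LINES                      # OCCURS test
    too_long = any(len(line) > MAX_WIDTH for line in lines)  # LENGTH test
    return too_many or too_long


SAMPLE = [
    {"NAME": "A", "ADDRESS": ["12 Oak St", "Palo Alto, CA"]},
    {"NAME": "B", "ADDRESS": ["Dept", "Bldg", "Room", "Street", "City"]},
    {"NAME": "C", "ADDRESS": ["X" * 36]},
    {"NAME": "D"},
]

OVERSIZE = [record["NAME"] for record in SAMPLE if address_will_not_fit(record)]
```

Neither test depends on the value of the element, only on its count and length, which is why no index could answer the question.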

Note from the above example that the Global FOR commands do not automatically provide you with a count of the number of records meeting the criteria specified in the (optional) WHERE clause. This is because the FOR command itself does not initiate a search of the file; the file is not searched until another command is issued that specifies what is to be done with the records--remove them, display them, dequeue them, etc. A count can be obtained, however, and a new set of WHERE criteria specified if the number is too small. The following example shows this process, which involves several examinations of the goal records in the search result, and is therefore rather time-consuming and expensive:

The system's response to the SHOW LEVEL command gives two numbers: The second indicates the number of records examined--here it is 22, the same as the number of records in the search result. The first number indicates how many of the records examined met the criteria specified in the WHERE clause--4 for the first WHERE clause and 11 for the second.

The same results are more directly obtained by the use of the ALSO command, which gives an indication of the number of records meeting the criteria immediately, just as a FIND or other index search command does. For example:

If further index searching commands (e.g. AND, OR) are necessary, then the ALSO command must be used, since the "result" of a Global FOR command is not a set of pointers in a search result. The pointers in a search result can be combined logically with the pointers meeting the criteria specified in subsequent search commands. If, however, the records meeting the WHERE criteria are to be displayed at the terminal or placed in the active file, regardless of their number, then Global FOR is a far more efficient way to do this than the ALSO command.

Compare the two search scenarios following:

The second series of search commands is almost twice as efficient as the first. With the ALSO command, the system must read the goal records to examine the EYES element, then read the records meeting the criteria a second time when a TYPE command is given. With the FOR RESULT command, the record is read to examine the EYES element, then, while the record is still in main memory, it is displayed on the terminal. The net effect is that a record is accessed only once when FOR RESULT is used.
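The difference in record accesses can be made concrete with a small counting model in Python. This is a sketch under stated assumptions, not a measurement of SPIRES: the store class, the two functions and the sample data (ten records, eight with blue eyes) are hypothetical, and real costs also depend on blocking and buffering.

```python
# Conceptual model: count goal record reads under the two approaches.
class CountingStore:
    """Holds goal records and counts every read."""

    def __init__(self, records):
        self._records = records
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self._records[key]


def also_then_type(store, result, predicate):
    """ALSO narrows the result (one read each), then TYPE reads again."""
    narrowed = [key for key in result if predicate(store.read(key))]
    return [store.read(key) for key in narrowed]


def for_result_then_type(store, result, predicate):
    """FOR RESULT examines and displays each record in a single read."""
    shown = []
    for key in result:
        record = store.read(key)
        if predicate(record):
            shown.append(record)
    return shown


def make_records():
    """Ten sample records; records 3 through 10 have blue eyes."""
    return {n: {"NAME": "PERSON-%d" % n,
                "EYES": "BLUE" if n > 2 else "BROWN"}
            for n in range(1, 11)}
```

With these sample records the first approach makes 18 reads and the second 10; the closer the WHERE criteria come to matching every record, the closer the ratio comes to two.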

In addition, Global FOR commands can be used for many record management functions other than those described here; we have shown only their capabilities relative to those of the ALSO command. The Global FOR commands facilitate a full range of data base and record management functions unavailable otherwise. All file owners and managers should be familiar with the capabilities of Global FOR for sequential subfile search and subsetting. Consult "SPIRES Searching and Updating" for an introduction to Global FOR searching.

B.7.7  Index Definition

Having considered the different indexing options available to the file definer, and having described the functional differences among them, we can now attack the practical problem of coding the record definitions for the different indexes a file will have.

Subfiles may have one, several or no index records defined. There usually is one index record definition for each simple index in a file. One index record definition could be for a compound index in the subfile (remember that only one such index can be defined per subfile). Through a process called "passing", a compound index typically receives values from more than one element in the goal record, while a simple index typically receives values from only one element in the goal record. However, it is entirely possible for a compound index to have its values passed from a single goal record element.

It is also possible for more than one element in the goal record to pass to a simple index; this situation is known as "multiple passers." There may not be more than one compound index per subfile, but there may be more than one compound index defined in a file that has more than one subfile. There may be a large number of simple indexes, provided that the total number of records defined for a file (goal and index records) does not exceed sixty-four.

The different kinds of indexes a subfile has influence the kinds of records defined for all indexes in a subfile. This is due to one of the primary rules of coding index record definitions: all pointer groups to the same goal record must "look alike" in terms of their structure. This means that if there is a compound index or a simple index with a qualifier for a goal record, the pointer groups in all indexes to that goal record must exhibit the structure of a compound index or simple index with a qualifier.

It is fairly easy to reduce the definition of most index records to a "recipe," and "A Guide to Coding Index Record Definitions" [See D.5.] gives recipes for the indexes encountered in most SPIRES file definitions. The following sections describe how to code index record definitions in such a way that you can see the reason for their structure.

B.7.8  Coding Simple Indexes

If all indexes to a single goal record are to be simple indexes, then the structure of each index record might look something like the following:

The only essential difference between the two is that one declares the key to be fixed in length, the other declares it to be varying in length. What is the key? The key of a simple index record is, in almost all cases, the value of an element passed from the goal record. Thus, if you were passing a fixed binary number, such as a price or date, you would want to specify that the key is fixed. The length is the same as the length of the stored value in the goal record.

It is wise to choose names carefully for the elements whose values are shown in lower case. "Record-name" can be up to six characters long; its value is used by SPIRES to sort record definitions for both goal and index records into alphabetical sequence. By tradition, the name REC01 has often been used for the goal record, and REC02, REC03, etc., have been chosen for the index records.

The "element-name" may be anything up to sixteen characters long. For simplicity, it is usually best to give this element the same name as the name of the goal record element that passes its value to this index record.

The "pointer-name" again may be anything, but the name you choose must be coded in the linkage section. One additional requirement falls on the "pointer-name": it must be given the same name in all indexes for a particular goal record. This is the second primary rule for coding indexes. The pointer element is often given the mnemonic name "POINTER." In the example shown above, the pointer element is an optional multiply occurring simple element. If possible, it is best for the pointer element to be fixed length. That's because SPIRES can do logical operations more efficiently when the pointer element is fixed length, and fixed length elements take less overhead in the index records.

Only one other comment need be made about these two index record definitions. The pointer-name is said to be "TYPE=LCTR;" (fixed length of 4 bytes). The pointer-element is what SPIRES tallies when it reports the number of records in a search result. It is also the element in which SPIRES stores the reference back to the goal record.

If the file definer has coded "REMOVED;" for the goal record definition, then the "reference back" or "pointer" is usually in the form of the four-byte location (a "locator") of the goal record in the residual dataset. If the goal-record has not been removed, then TYPE=LCTR may not be coded; instead the goal-record key will serve as the pointer. But note: even if the goal-record has been removed, it may be advantageous to pass the key rather than a locator in some circumstances. [See B.7.14.]

Let's look now at a very simple bibliographic file definition which contains two indexes, one for titles and one for dates. The date element will pass its value to the FIXED key of an index. Note that the following definition is incomplete in that it doesn't say how this "passing" is to occur. This process is defined in the next chapter.

B.7.9  Coding Simple Indexes with Qualifiers

A qualifier adds another level of "depth" to a simple index record definition: it introduces a structure, the same kind of structure that was defined by "TYPE=STR" in the goal record.

The pointer element, which was only a simple data element containing a reference to a goal record, is now a structure. The structure is always a keyed structure, and the key is always the pointer element. The structure itself is optional, but its key is fixed if the key is TYPE=LCTR. This introduces the third rule of index definition: if the pointer element is in a structure, then it must be the key of the structure. Moreover, the pointer element should always be the first element in the structure; do not begin it with Fixed qualifier elements, for instance, if the pointer element itself is in the Required section.

What of the other elements in the structure? Typically there is only one, the qualifier itself; if there is more than one qualifier, then there will be more than one qualifier element in the structure. The qualifier elements should always be defined in the index record definition with their OCC=1. If the goal record element that is passed to the qualifier does not occur in the goal record, then special pass-processing rules (PASSPROC rules, covered in the next chapter) should be coded to provide a default value. Note that the maximum length limit for a value passed as a qualifier element is 2047 bytes.

Let's examine the skeleton of a simple index with one qualifier. Next to it is shown a TITLE index in which there is a SUBJECT qualifier.

We can now expand the example file definition which had two simple indexes on DATE and TITLE to include a qualifier on the TITLE index. The new definition will illustrate another primary rule of coding indexes: if a pointer structure is used in one index, it must be coded in all indexes to that goal record, whether it is necessary to the structure of the specific index record in which it occurs or not. To see this, notice how the definition of the DATE index record has changed from its appearance in the previous example. Before, it was only a simple index; now, it looks like a simple index with a qualifier--even though no qualifier is passed to the DATE index.
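The three primary rules stated so far (the pointer element has the same name in every index; a pointer that sits in a structure is the first element and key of that structure; a pointer structure coded in one index is coded in all) lend themselves to a mechanical check. The Python sketch below is only an illustration: the dictionary description of an index record and the function name are assumptions, and it does not attempt the more specific FIXED, REQUIRED and OPTIONAL rules.

```python
# Conceptual checker for the primary index-coding rules. Each index is
# described by its pointer element name and, if present, its pointer
# structure (the structure's key and its elements in order).
def check_pointer_rules(indexes):
    """Return a list of rule violations; an empty list means none found."""
    problems = []
    names = {d["pointer"] for d in indexes.values()}
    if len(names) > 1:
        problems.append("pointer element name differs: "
                        + ", ".join(sorted(names)))
    structured = [n for n, d in indexes.items() if d.get("structure")]
    if structured and len(structured) != len(indexes):
        problems.append("pointer structure is not coded in every index")
    for name in structured:
        d = indexes[name]
        struct = d["structure"]
        if struct["key"] != d["pointer"] or struct["elements"][0] != d["pointer"]:
            problems.append(name + ": pointer is not the first element "
                            "and key of its structure")
    return problems


GOOD = {
    "REC02": {"pointer": "POINTER",
              "structure": {"key": "POINTER",
                            "elements": ["POINTER", "SUBJECT"]}},
    "REC03": {"pointer": "POINTER",
              "structure": {"key": "POINTER",
                            "elements": ["POINTER", "DUMMY"]}},
}

BAD = {
    "REC02": GOOD["REC02"],
    "REC03": {"pointer": "PTR", "structure": None},
}
```

Here GOOD mirrors the situation described above, in which the DATE index carries a pointer structure only because the TITLE index needs one; BAD breaks both the naming rule and the all-or-none rule.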

In general, pointer groups for indexes that apply to the same goal record must have identical structure; this is so SPIRES can AND and OR pointer groups when manipulating search results. If you need to violate this general rule, then the first index record-type defined in the linkage section for the goal-record is taken as the model for the other index record-types. If this record-type has the appropriate structure defined for it (as described above), then more specific rules can be used for other index record-types.

The specific rules for pointer group structures are as follows:

The pointer element must be the key of the structure, and it must be the first element in the structure.

If there is no REQUIRED section, then all pointer groups must be identical through the length of the FIXED section. If one pointer group structure is declared with LEN attribute, then all must be declared that same way, and only FIXED elements are allowed. This is the most efficient form of pointer group structure. If the pointer group structure is not declared with LEN attribute, then any (or all) may have OPTIONAL elements.

If there is a REQUIRED section, then all pointer groups must be identical through the end of that section. If the only REQUIRED element is the KEY of the pointer group, then any (or all) may have OPTIONAL elements. If there are non-key REQUIRED elements, then if one pointer group has OPTIONAL elements declared, all must declare OPTIONAL elements.

Note that if an index record-type is to have multiple qualifiers passed to it, then the following definition is appropriate:

B.7.10  Coding Compound Indexes

A compound index record cosmetically looks very similar to a simple index with one qualifier. Two important differences must be noted. First, the KEY of the compound index record is always fixed with a length of two bytes. This is because the key of such a record is an encoded form of the element name that is being passed to this index. [See B.7.5.] If the elements DATE, AGE, and TEMPERATURE all pass to a compound index, there will be three keys, and hence three records, in the index. (The keys, with which the file definer and searcher need never be concerned, tell SPIRES the structure and element number of the element in the goal record definition.) Second, the element that previously named the qualifier is now given a generic name, usually something mnemonically significant, like "VALUE", since each occurrence of it contains one value passed from the goal record. Note that the value passed cannot exceed 2047 bytes in length.

Below, a skeleton record definition is presented. Next to it is shown a compound index that contains a DATE occurrence. Notice that the word DATE never appears in the index definition. The linkage section specifies which element(s) will pass to the compound index record.

Notice how these record definitions follow one of the indexing rules: if the pointer (here, TYPE=LCTR) is in a structure, it must be the key of the structure.

As the following example shows, all of the other rules are followed: 1) if one index has a pointer structure (because it is a simple index with a qualifier or because it is a compound index) then all indexes must have pointer structures; 2) all pointer elements must have the same name in each index.

The following example is similar to the previous two in some ways. The goal record contains TITLE, SUBJECT and DATE elements; COST will now be introduced and placed in the compound index. Note how the definition is quite different from the first example, which showed only simple indexes; but its only difference from the previous example, which showed simple index qualifiers, is the addition of a compound index.

Note that the occurrence of the VALUE element is always 1. The structure containing this element occurs once for each value of an element passed from the goal record. A record passing two COST values would cause two POINTER-STRs to occur, each with a single POINTER and VALUE. When SPIRES retrieves such records, it reports a result, which is the number of POINTER-STRs that met the criteria specified; this count may be high, since a single record could be represented in the POINTER-STR list more than once. SPIRES will correct any erroneous result count after it has been asked to TYPE the records in the search result.
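The inflated count can be seen in a short Python sketch. It models only the counting, not how SPIRES stores or corrects a result; the COST values, locator numbers and function names are assumptions made for the example.

```python
# Conceptual model: one POINTER-STR occurrence per COST value passed.
# Goal record 101 passed two COST values, so it appears twice.
COST_ENTRIES = [(101, 5.00), (101, 7.50), (102, 5.00), (103, 12.00)]


def reported_count(entries, predicate):
    """Count pointer structures that meet the criteria (may overcount)."""
    return sum(1 for _, value in entries if predicate(value))


def distinct_count(entries, predicate):
    """Count the distinct goal records behind those pointer structures."""
    return len({ptr for ptr, value in entries if predicate(value)})


def under_ten(value):
    return value < 10
```

For a search on COST values under 10, three pointer structures qualify but only two goal records lie behind them, which is the discrepancy the text describes.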

Notice that the POINTER-STR in REC02 (TITLE) has the same "form" as the POINTER-STR in REC03 (DATE) and REC04. SUBJECT, DUMMY, and VALUE all occupy the same position.

B.7.11  Coding Sub-Indexes

Sub-indexes, usually used only for personal name indexing, provide a variation on the theme of simple indexes and simple indexes with qualifiers. Sub-indexes cannot be defined as part of a compound index, but may be defined for simple indexes in subfiles that have compound indexes.

Sub-indexes provide a way of searching data that has a hierarchical organization. Two simple hierarchies might have the following structures:

It would not be useful to find all people with a first name of "John" in a subfile unless you had first established that you were interested only in people whose last name was "Smith." It would also not be helpful for an airline reservation system to be able to find all seats with the number 13 unless a particular flight had been established to restrict the domain of the search.

Two types of sub-indexes can be defined, one for subfiles that contain only simple indexes and no qualifiers (no pointer structures would be involved in this case) and one for subfiles that contain either qualifiers or compound indexes. The following example shows the two types of record definitions; each is defined for a personal name index. The key of such an index is the person's last name, and the key of the sub-index (which is a structure) is the rest of the person's name (first, middle, etc.). Note that the first name structure is not a pointer structure: the pointer or pointer structure is an optional element in the first name structure.

RECORD-NAME = REC02;                   RECORD-NAME = INDEX5;
  REQUIRED;                              REQUIRED;
    KEY = LAST-NAME;                       KEY = LAST-NAME;
  OPTIONAL;                              OPTIONAL;
    ELEM = FIRSTNAME-STRUCT;               ELEM = FIRSTNAME-STR;
      TYPE = STR;                            TYPE = STR;
  STRUCTURE = FIRSTNAME-STRUCT;        STRUCTURE = FIRSTNAME-STR;
    REQUIRED;                            REQUIRED;
      KEY = FIRST-NAME;                    KEY = FIRST-NAME;
      ELEM = POINTER;                    OPTIONAL;
        TYPE = LCTR;                       ELEM = POINTER-STR;
                                             TYPE = STR;
                                       STRUCTURE = POINTER-STR;
                                         FIXED;
                                           KEY = POINTER;
                                             TYPE = LCTR;
                                         OPTIONAL;
                                           ELEM = VALUE;
                                             OCC = 1;

The value passed as a SUB-INDEX value cannot exceed 2047 bytes.

B.7.12  Index Record and Goal Record Elements

Up to this point, the definition of an index record looks very similar to the definition of a goal record: there are keys, elements of fixed or varying length and optional elements, and there are structures. Several file definition elements have not appeared: SLOT, SLOTCHECK, REMOVED, INPROC, OUTPROC, and ALIASES.

SLOT, SLOTCHECK and REMOVED are rarely coded for index records. However, one element is often coded for index records that is not usually coded for the first record-type (usually the goal record) in the file definition; this is COMBINE. As explained in "Tree and Slot, Goal and Index Records," [See B.6.5.] COMBINE specifies that the data sets created by the compiler for each record definition specifying COMBINEd are to be merged into a single data set or file. Any tree structured data set can be combined with any other tree structured data set; slot record-types cannot be combined with each other, or with any record-type. Except in the largest files (over 100,000 records) with several subfiles, or files in which there are large table-lookup files, COMBINE should be used whenever possible. When there are table-lookup record-types, it is often a good idea to COMBINE them with each other, and to COMBINE the goal and index record-types together. This allows flexibility in erasing and recreating the table files with the ZAP DATA SET command. [See B.10.17.]

The COMBINE element is coded just after the RECORD-NAME element as follows:

Note that COMBINE is not coded for the goal record, REC01, in the above example, since it is the record with which other record-types are combined. The record-type named in the COMBINED statement must have been defined earlier in the file definition; it may not be defined further down. All of the file definitions in "Annotated File Definition Examples" use the COMBINE feature wherever possible. [See D.7.]

Warning: when many terminals will be using the same file simultaneously, the COMBINED statement is not recommended. In general in such situations, you want as many ORVYL data sets to be used as possible, rather than as few as possible. Note, however, that there is a limit of nine ORVYL data sets (filename.REC1 through filename.REC9) that can exist for a single file, so if you have more than nine record-types, the COMBINE statement will be necessary for some of them.
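The two constraints on COMBINE mentioned in this section (the named record-type must be defined earlier, and no more than nine data sets may exist) can be checked with a short Python sketch. It assumes, for illustration only, that each record-type without a COMBINE statement gets its own data set and that each combined record-type shares the data set of the record-type it names; the function name and the input form are hypothetical.

```python
# Conceptual check of a COMBINE plan for one file definition.
MAX_DATA_SETS = 9  # filename.REC1 through filename.REC9, per the text


def check_combine(record_types, combine):
    """record_types: names in definition order.
    combine: {record-type: record-type it is combined with}.
    Returns (record-types owning a data set, list of problems).
    """
    problems = []
    position = {name: i for i, name in enumerate(record_types)}
    for name, target in combine.items():
        if name not in position:
            problems.append("%s: not a defined record-type" % name)
            continue
        if target not in position or position[target] >= position[name]:
            problems.append("%s: COMBINE target %s is not defined earlier"
                            % (name, target))
    data_sets = [name for name in record_types if name not in combine]
    if len(data_sets) > MAX_DATA_SETS:
        problems.append("%d data sets exceed the limit of %d"
                        % (len(data_sets), MAX_DATA_SETS))
    return data_sets, problems
```

For a file with twelve record-types and no COMBINE statements the check reports too many data sets; combining REC02 through REC12 with REC01 leaves a single data set and no problems.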

B.7.13  Index Records as Goal Records

What about coding INPROC, OUTPROC and ALIASES for index record definitions? These file definition elements may be coded for index records, but often are not. Since SPIBILD maintains the indexes, there is no need for ALIASES, and any INPROC, INCLOSE or OUTPROC rules that are coded are ignored when SPIBILD is updating the indexes as part of its processing. If the SPIBILD process will create new records in a record-type (as is usually the case with index records), it is important that no FIXED or REQUIRED elements be defined that will not be created by SPIBILD. If a required element is not present when SPIBILD attempts to create a new record in a record-type, a PASS ERROR with a code of S419 will occur. This is a serious error.

Generally file owners are encouraged to include INPROCs and OUTPROCs for the key of each index record, because these affect the results displayed by the BROWSE command. When index values are displayed with the BROWSE command, the values are processed through the OUTPROCs for the key. Also, if a value is given in the BROWSE command (such as BROWSE DATE-ADDED 7/1/80), that value is processed through the key's INPROCs as well. Without such INPROCs and OUTPROCs on the key of the index record, browsing the index can be a pointless exercise for the subfile user. If only an INPROC is defined for an index record key and the INPROC sets the type for the key, e.g., an A31 identifies the stored key as a hexadecimal one, then SPIRES will convert the values displayed to string values when the BROWSE command is issued. This is not usually as valuable, or as straightforward, as putting the appropriate INPROCs and OUTPROCs in the index record definition.

It is also important to code INPROCs and OUTPROCs (and perhaps even ALIASES) if the index is to be used as a goal record that can be selected (using the SELECT command) or attached (using the ATTACH command).

In the following example, suppose the first index record, the SUBJECT index, can be selected as a goal record. An element called CROSS-REFERENCE has been added, so that the file owner can add cross-reference records to the SUBJECT index. Also, an OUTPROC action 32 has been added on the pointer, and it will convert the pointer on output by referring to the goal record it locates and looking up the TITLE element. The details of coding action 32 are covered in "Indirect Record-Access: Action 32 and SUBGOAL Processing." [See C.5.]

B.7.14  Indexes for Non-Removed Record Types; Keys vs. Locators

If the goal record does not have REMOVED specified, the file definer may not use the TYPE=LCTR specification in defining index records. Whenever the pointer to the goal record is the goal record key, rather than a location of the goal record in the residual data set, then TYPE=LCTR may not be specified.

This is always the situation when the goal record is not REMOVED. It may also occur for REMOVED goal records if the file definer has chosen to pass the goal record's key rather than the goal record's residual location.

All of the sample index definitions shown so far have assumed that a location in the residual data set is being stored rather than a key. However, it is fairly simple to see the implications of storing a key for index record definition:

Traditionally, SPIRES file owners have more frequently passed locators rather than goal record keys as pointers. By default, the File Definer subsystem creates index records that contain locators rather than keys. However, it can be very useful to pass the key at times: if the key itself is stored in the index record, the key can be examined directly when the index is used as a goal record. Also, if the techniques of "goal-to-goal passing" or "self-indexing goal records" are being used, the key usually must be passed. [See C.12.]

Below are some guidelines to follow in choosing whether to pass locators or to pass keys. Note that it would not be a fatal error to pass keys instead of locators or vice versa in contradiction to the second recommendation; however, it might mean that SPIRES would not handle certain search procedures as efficiently as it could.

These guidelines can be affected by other circumstances. For example, if most searches will retrieve more than half of the records in the result from the deferred queue, locator access could be more costly than key access. A detailed study discussing the key-locator decision appears in the back of this manual. Remember, for most files, the choice will not make very much difference; for most of the rest, the guidelines above will be satisfactory. [See D.8.]

B.7.15  Ensuring the Validity of Index Records

Index records are normally maintained entirely by SPIBILD, in accordance with the rules the file definer specifies in the linkage section of the file definition. [See B.8.] The file definer does not need to take any explicit action to ensure that information in the indexes is valid, but must ensure that a null value isn't passed as the key of an index record.

When index records can be transferred and updated as goal records, [See B.7.13.] the file owner must ensure that a user cannot incorrectly alter the linkages between goal and index records built by SPIBILD.

The most important ingredient or rule of this linkage is that all keys along the structural path from the index record's key to the pointer group (or pointer element) are in descending sort order. This is the way they are automatically created by SPIBILD. The best way to ensure this is to make these elements non-updateable, either in a view or with a PRIV-TAG specification. [See B.9.4.]

All structures along the pointer group path up to and including the pointer element itself must be in descending sort order. This can be ensured by coding an A138:0 as the INPROC for all structures along this path.
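The ordering requirement can be stated precisely as a check on the sequence of keys at each level of the path. The sketch below is written in Python rather than SPIRES, and it assumes that keys within one level are unique, so that "descending" means strictly descending; the function name is hypothetical.

```python
def is_descending(keys):
    """True if every key is strictly greater than the key that follows it."""
    return all(a > b for a, b in zip(keys, keys[1:]))


# Pointers as SPIBILD would leave them, and the same list after a careless
# manual update that appended a larger value at the end.
print(is_descending([907, 512, 44]))
print(is_descending([907, 512, 44, 600]))
```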

For example, in a personal name index [See B.7.11.] the sub-index structure must be sorted in descending order by its key (a person's first name), and:

B.8  Understanding and Coding the Linkage Section

B.8.1  Functions of the Linkage Section

The previous chapters of this manual have covered the definition of the record-types that will make up a SPIRES file. Two kinds of record-types have been examined in detail: goal records and index records.

The linkage section, as its name implies, links the goal and index records for two purposes: searching and passing. The linkage section controls the search process by specifying in the "SEARCHTERMS" statement the names of the components of an index to be searched. The linkage section also specifies, in the "SRCPROC" statement, the processing rules to be applied to values in a search request. Passing, which is the process of using information in a goal record to build an index record, is controlled by specifying the goal record information to be passed. The source of this information is specified by the "GOALREC-ELEM" statement or by processing rules coded in the "PASSPROC" statement.

Thus we can see at least two different parts of a file definition. The first part, defining the goal and index records, is a description of data structures. The second part, defining the linkage between goal and index records, is devoted to procedural rather than descriptive statements. These procedural statements provide for passing and searching. A third part of the file definition, defining the privileges of any user or group of users with respect to a subfile, is described in the next chapter "Defining Subfile Privileges." [See B.9.]

The linkage section itself can be subdivided into small sections: 1) a single group of statements defining certain global relationships between a goal record and all its index record(s), and 2) groups of statements describing the specific processing of the linkage between the goal record and a single index record. (1) is discussed in "The Global Parameters Section" [See B.8.2.] and (2) is described in "Individual Index Linkages" [See B.8.3.] There usually is one individual index linkage for each index record you have defined. The structure of these parts is fairly simple; the definition of the linkage is in terms of SEARCHTERMS, SRCPROC, GOALREC-ELEM and PASSPROC statements. If a compound index, qualifiers, or sub-indexes are used, then one or two additional elements must be specified in the linkage definition for the index record in which they occur. The only difficulty usually encountered in defining linkage sections is in coding the various PASSPROC rules, and occasionally in coding the SRCPROC rules; we will not consider the definition of these processing rule strings in detail until the end of this chapter.

B.8.2  The Global Parameters Section

The linkage section for any particular goal record begins with some "global" information that is common to all indexes belonging to that goal record. This information always includes the name of the goal record to which the entire linkage section applies, the name given to a search result for the goal record, and the name of the pointer element in all of the indexes. Any global qualifiers (qualifiers that are passed to all indexes) are specified here also.

Linkage sections are coded following the record definitions of the goal and index records. The linkage section begins with the global parameters portion:

Because of the rarity of global qualifiers, the additions they require to the global section will not be covered until later in this chapter. [See B.8.6.] Let's begin coding the linkage section by defining the global parameters section for a very simple bibliographic file. In this file, we have one goal record, BOOK, and two index records, REC02 and REC03. Let's say that we want the search result to be called "CITATION".

The file definition, up through the global portion of the linkage section, looks like this:

The GOALREC-NAME statement names the goal record by specifying its RECORD-NAME. The EXTERNAL-NAME statement declares what a search result will be called when SPIRES reports the result count after a search command such as FIND. The PTR-ELEM statement names the element in each index record that is to receive the pointer back to the goal record. You may choose any element name you wish, but it must be the same in all index records. In our example, it happens to be POINT-BACK.

The PASSPROC specifies A170 because the pointer element is TYPE=LCTR and the goal records are REMOVED. A170 specifies that the information passed to the pointer element in each index will be the address of the goal record in the residual data set. If the pointer element is not TYPE=LCTR, then the information passed to the pointer element in each index should be the key of the goal record, which is usually specified by a GOALREC-KEY statement. If the goal records are not REMOVED, then the pointer element cannot be TYPE=LCTR and A170 cannot be used.
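The decision described in this paragraph can be summarized as a small truth table. The following Python sketch (not SPIRES syntax) encodes it; the function name and the return strings are assumptions chosen for illustration, and only the relationship between REMOVED, TYPE=LCTR and A170 is taken from the text.

```python
def pointer_passed(goal_is_removed, pointer_is_lctr):
    """Return what is passed to the pointer element: 'locator' or 'key'.

    'locator' is the case coded as PASSPROC=A170 in the text; 'key' is the
    case where the goal record key is passed instead.
    """
    if pointer_is_lctr:
        if not goal_is_removed:
            # The text rules this combination out.
            raise ValueError("TYPE=LCTR requires a REMOVED goal record")
        return "locator"
    return "key"


print(pointer_passed(goal_is_removed=True, pointer_is_lctr=True))
print(pointer_passed(goal_is_removed=False, pointer_is_lctr=False))
```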

B.8.2.1  The SEARCHPROC Statement in the Global Parameters section

As noted earlier, you can allow users to search for records using record-key criteria in a FIND command by adding a SEARCHPROC statement to the global parameters section, e.g.:

In this example, the keys of record-type REC01 are presumably stored as 4-byte integers. When a user issues the search command FIND PERSON 295, the value "295" will be run through the $INT(4) processing rule to retrieve that record.
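To make the conversion concrete, the sketch below shows in Python what turning the search value "295" into a 4-byte integer could look like. It is only an approximation of the effect of $INT(4): the big-endian, signed layout is an assumption about the stored form, and the function name is hypothetical.

```python
def to_int4(search_value):
    """Convert a numeric search string to a 4-byte signed big-endian integer."""
    return int(search_value).to_bytes(4, byteorder="big", signed=True)


print(to_int4("295").hex())
```

A value that is not numeric raises an error here, which loosely mirrors a search value being rejected by the processing rule.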

No extra index record-type needs to be defined; the goal record-type, which is arranged in key order, of course, is used as the index. The feature is indicated to the user by the term "Goal-Index:" in front of the goal record name, instead of "Goal Record:" in the SHOW INDEXES display.

There are some restrictions on the use of this feature, however:

The SEARCHPROC string, like any other SEARCHPROC, should translate the incoming search values into values that might be found in the index records. [See B.8.4, B.8.9.]

To use this facility in searching, you type any of the values for EXTERNAL-NAME as the search term, e.g., using the example above:

All relational operators, except those restricted to compound indexes, may be used in such searches.

In a sense, added records in the deferred queue are "immediately indexed" in the goal-index -- they can be found immediately with the FIND command:

Trouble can arise if a search result like the one above containing added records is stored, using the STORE RESULT command -- if the result is restored after the file has been processed, those particular records may not be found. [They would have been represented in the stored result with a temporary deferred queue locator, not their permanent locator.]

The logical operators TAND, TNOT and TOR will not work properly in most searches involving the goal-index. Similarly, when secure-switch 7 is set, pointer groups may not be compared properly. [Technical note: when processing "goal-as-index" searches, SPIRES must create pointer groups (either locators or the actual keys) for the "goal-index" records retrieved, so that they can be compared with other pointer groups in iterative searching. If the pointer groups for the regular indexes are structures (meaning they contain qualifiers), SPIRES must build dummy structures for the "goal-index" records. These dummy structures are built with binary-zero Fixed values and null-length Required values. Thus, if qualifiers are compared in iterative searching (e.g., because of TAND), SPIRES will be comparing dummy qualifiers to real ones, creating inaccurate results.]

B.8.2.2  The EXTERNAL-NAME Statement

The EXTERNAL-NAME statement can thus have two purposes:

No name can exceed 15 characters in length nor contain any blanks.

If you are not creating a goal-index, then the value for the EXTERNAL-NAME statement will be one or possibly two values that will appear in search result messages. If only one value is given, e.g.,

it will be used as the singular form in messages, with an "S" added to create the plural form when needed:

If multiple values are given, SPIRES will use the first for the singular form, and the second for the plural. For example:

would lead to:

and

would lead to:

If you are using the EXTERNAL-NAME statement both to supply the singular and plural forms and to supply additional search terms for the goal-index, be sure to place the singular and plural terms as the first two entries in the list.

On the other hand, if you are adding terms to the EXTERNAL-NAME statement only for the purpose of supplying additional search terms, remember that you will be affecting the singular and plural terms used by SPIRES in reporting search results. SPIRES will always use the first one for the singular form and the second one for the plural.
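The rules in this section (the length and blank restrictions, and the choice of singular and plural forms) can be sketched as follows. This is a Python illustration, not SPIRES code; the function names and the ENTRY/ENTRIES values are hypothetical, and the sketch returns only the noun, not the full wording of any SPIRES message.

```python
def check_external_names(names):
    """Reject names longer than 15 characters or containing blanks."""
    for name in names:
        if len(name) > 15 or " " in name:
            raise ValueError("invalid EXTERNAL-NAME value: %r" % name)


def result_noun(names, count):
    """Pick the noun used to report a result of `count` records."""
    check_external_names(names)
    singular = names[0]
    # With a single value, the plural is formed by adding "S".
    plural = names[1] if len(names) > 1 else singular + "S"
    return singular if count == 1 else plural


print(result_noun(["CITATION"], 3))
print(result_noun(["ENTRY", "ENTRIES", "ITEM"], 2))
```

Note that any third or later value, such as ITEM above, never affects the noun; under the rules in the text it would serve only as an additional goal-index search term.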

B.8.3  Individual Index Linkages

After any global parameters section, a linkage between the goal record and each index record must be defined. The definition of these linkages is, in structure, fairly straightforward, and looks like the following:

The structure of this skeleton can be slightly complicated by the inclusion of linkage information for a sub-index, local qualifiers, or a compound index. These cases will be covered later in this chapter. [See B.8.5, B.8.7, B.8.8.]

A "recipe" for coding the global and individual parameters of the linkage section is given in "A Guide to Coding the Linkage Section Definition." This guide covers all types of linkage definition. [See D.6.] The different kinds of index records coded in the preceding chapter will serve as examples of simple, personal name, qualified and compound index linkage sections. We will take each possibility in turn, leaving the detailed consideration of PASSPROC rule strings to the end of this chapter.

The PTR-GROUP statement, for any particular index, names a multiply occurring element in the index that is either a STRUCTURE element whose KEY is the PTR-ELEM, or a simple ELEM which is the PTR-ELEM. If the subfile needs only simple pointers back to goal records (no compound index or qualifiers), then PTR-ELEM and PTR-GROUP refer to the same simple pointer ELEM in all indexes. But if there is a need for compound index or qualifier terms, then PTR-GROUP for each index refers to a multiply occurring STRUCTURE which will contain those terms. The KEY of each STRUCTURE is the PTR-ELEM (pointer element).

You are limited to 88 INDEX-NAME sections per linkage section that actually pass data from the goal record-type to an index record-type. (In some situations, you can code an INDEX-NAME section that does not cause any data to be passed. [See D.1.7.1.6.5 for an explanation of action A165 and the $PASS.OCC proc.] A section like that does not count as one of the 88.)

B.8.4  Simple Indexes

Here are the two individual linkages to the index records REC02 and REC03. In the complete file definition we are building towards, these would be added right after the global parameters section with which the previous example ended.

The INDEX-NAME statements name the particular index records that will be linked to the goal record; here they are REC02 and REC03. The SEARCHTERMS statement is similar to the ALIASES statement in the goal record definition. Here SEARCHTERMS specifies the name or names that can be used to access an index in a search command such as FIND. [See B.9.4.5 to see how PRIV-TAG can restrict the use of SEARCHTERMS.]

The SEARCHPROC statement (also seen as SRCPROC) specifies processing that is to be performed on search values given in search commands. This processing is usually equivalent to a combination of both INPROC and PASSPROC rules used to determine the form in which goal record values are to be placed in the index record. That is, SEARCHPROC rules are usually coded to "translate" incoming search values into values that might be found in the index records. The SEARCHPROC for REC02:

breaks a search value up into individual words ("A45,", which breaks on blanks), then excludes any words of two or fewer characters (A47,2), and allows special truncated search if a word of more than three characters contains a "#" (A11:3,#).
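The effect of the first two steps on a search value can be imitated in a few lines. The sketch below is Python, not a SPIRES rule string. It assumes the reading given above, that words of two or fewer characters are dropped, it applies the automatic uppercase conversion that SPIRES performs on search values by default, and it does not attempt to model the truncated-search handling of "#". The function name is hypothetical.

```python
def split_search_value(value, excluded_length=2):
    """Break a search value on blanks and drop words that are too short."""
    words = value.upper().split()
    return [word for word in words if len(word) > excluded_length]


print(split_search_value("The Art of War"))
```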

Notice that the PASSPROC for this index contains similar rules:

A166 specifies that the goal record element value (or values) named in GOALREC-ELEM is to be fetched, and is later to be processed by A45 or A38 (both are actions that "split" a value into parts). The rules "A45,/ A47,2" make sure that only individual words are passed to the key of the index records, and that no words less than two characters are passed. This part of the rule string is identical to a portion of the SEARCHPROC rule string.

The SEARCHPROC and PASSPROC rules coded for REC03 are as follows:

The SEARCHPROC rules coded will convert a date in a search value to the internal form of a date, just as was done by an INPROC=AS31 statement in the goal record definition. The PASSPROC rule specifies only that the element whose name is coded in the GOALREC-ELEM statement be fetched and stored in the index record without the standard conversion to uppercase on passing. Values that are stored in character form should always be forced to uppercase on passing. Any other form of a value (e.g., binary, floating-point, packed decimal) should not be forced to uppercase. No translation by a matching AS31 is necessary in the PASSPROC, since the date is stored in the appropriate format in the goal record via the INPROC=AS31. Part of the power of SPIRES indexing methods is that values can appear in the goal record in one form, and can be passed and searched in a more usable (for the searcher) form.

The PTR-GROUP statement names the same element as the PTR-ELEM statement because our indexes have been defined to use only simple pointer elements (no qualifiers or compound index).

Although each INDEX-NAME refers to a different RECORD-NAME in our example, it is possible for any RECORD-NAME to be referenced by more than one INDEX-NAME. Such a case usually occurs when different elements within the goal records are to be passed to the same index, but those elements require different PASSPROC or SEARCHPROC rules. For passing efficiency, it is better to use multiple-passer rules ($PASS.ELEM, A167), putting the elements all together into a single individual linkage section, when you can. However, if you do need to code separate linkage sections because of different PASSPROC or SEARCHPROC needs, and if any single goal record could be so large that it could not fit in the pass stack, then try to put the individual linkages as early as possible in the linkage section to ensure that they will pass at the same time. (This last point is noteworthy only for files with very, very large records that each pass lots of data to indexes.)

B.8.5  Sub-Indexes

The general form of SUB-INDEX linkage is similar to INDEX-NAME:

When SUB-INDEX terms are added to a simple index, the effect is to introduce additional structural levels to the hierarchy leading from the KEY of the INDEX-NAME record to the PTR-GROUP element. SUB-INDEX names a keyed structure in the index record. The KEY of that STRUCTURE receives the goal record's value being passed for the sub-index term. A personal name index is a good example of a simple index with a sub-index structure. Let's modify our sample file definition to include a PERSON element in the BOOK records, and another index record: REC04

No GOALREC-ELEM was needed for the SUB-INDEX term in this example because the PASSPROC A165 indicates the value to be passed to FIRST-NAME had already been created by A38 in the PASSPROC associated with INDEX-NAME. This is usually the case with personal name sub-indexes, but not for other sub-index structures. The SEARCHTERMS of the SUB-INDEX for personal name are not usually used in a search request because A38 in the SEARCHPROC of the INDEX-NAME provides the necessary search values for the SUB-INDEX. [See B.9.4.5 to see how PRIV-TAG can restrict the use of SEARCHTERMS.]

Let's examine the index record definition and linkage definition for a sub-index that is not for a personal name. Suppose the following hierarchy were needed for an airline reservation system:

So, SEAT is inside SECTION which is inside FLIGHT. The index record definition for this structure would look like this:

The linkage definition for this index record would look like this:

Note the use of A171 to pass a default value of SECTION and SEAT if no value is found in the goal record. This will ensure that the index record is created, even if it is incomplete. A171 is also used this way in passing qualifier elements. [See B.8.6, B.8.7.]

The SEARCHTERMS of a SUB-INDEX are specified with a leading @-sign in a search request along with the SEARCHTERMS of the INDEX-NAME. For example, FIND FLIGHT 27 @SECTION B @SEAT 9 requests a specific hierarchy within the REC04 index.
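To show how such a request separates into an index term and its sub-index terms, here is a toy parser in Python. It is not how SPIRES parses search commands; it assumes each term is followed by exactly one blank-delimited value, and the function name is hypothetical.

```python
def parse_find(command):
    """Split 'FIND term value [@term value ...]' into its parts."""
    tokens = command.split()
    if len(tokens) < 3 or len(tokens) % 2 == 0 or tokens[0].upper() != "FIND":
        raise ValueError("expected: FIND term value [@term value ...]")
    index_term = (tokens[1], tokens[2])
    sub_terms = []
    for term, value in zip(tokens[3::2], tokens[4::2]):
        if not term.startswith("@"):
            raise ValueError("sub-index terms need a leading @-sign: " + term)
        sub_terms.append((term[1:], value))
    return index_term, sub_terms


print(parse_find("FIND FLIGHT 27 @SECTION B @SEAT 9"))
```

The order of the sub-index terms in the result follows the order in which they were typed, which here matches the FLIGHT, SECTION, SEAT hierarchy.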

B.8.6  Global Qualifiers

In order to have qualifiers in an index record, PTR-GROUP should specify a structure element in all index records. PTR-ELEM specifies the KEY of the structure, and the other elements within the structure receive qualifier values. The "form" of the structure must be the same across all index records associated with a particular GOALREC-NAME. That is, the number of FIXED, REQUIRED, and OPTIONAL elements must be the same in each definition of the structure, and the LENgth and OCCurrence attributes associated with corresponding elements must be the same within each structure. The KEY of the structure receives the pointer back to the goal.
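The "same form" requirement compares structures position by position on category, length and occurrence, while element names are free to differ. The Python sketch below illustrates that comparison; it is not SPIRES syntax, the class and function names are hypothetical, and the LEN value of 4 on the pointer is an assumed figure. The element names POINT-BACK, DATE-QUALIFIER and DUMMY are borrowed from the samples discussed in this chapter.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Elem:
    name: str
    category: str                  # "FIXED", "REQUIRED" or "OPTIONAL"
    length: Optional[int] = None   # LEN, if coded
    occ: Optional[int] = None      # OCC, if coded


def same_form(a: Sequence[Elem], b: Sequence[Elem]) -> bool:
    """True if two pointer-group structures agree in everything but names."""
    if len(a) != len(b):
        return False
    return all(
        (x.category, x.length, x.occ) == (y.category, y.length, y.occ)
        for x, y in zip(a, b)
    )


rec02 = [Elem("POINT-BACK", "FIXED", length=4), Elem("DATE-QUALIFIER", "OPTIONAL")]
rec04 = [Elem("POINT-BACK", "FIXED", length=4), Elem("DUMMY", "OPTIONAL")]
print(same_form(rec02, rec04))
```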

Global qualifiers are specified in the global parameters section of a linkage description just prior to the first INDEX-NAME section. The statements of the QUAL-ELEM section are:

The SEARCHTERMS of any QUAL-ELEM are specified in a search request following the AND or AND NOT logical operators. [See B.9.4.5 to see how PRIV-TAG can restrict the use of SEARCHTERMS.]

Let's alter our sample file definition and linkage section to pass DATE as a global qualifier instead of building a separate DATE index (REC03). We will make DATE a global qualifier of both TITLE (REC02) and PERSON (REC04) indexes. Since PTR-ELEM must become the KEY of a PTR-GROUP structure, we will have to alter the index record definitions. The revised definition might look like:

Notice that we defined the POINTER-STR as consisting of entirely FIXED information, and included LEN=8 with TYPE=STR. The "form" of the pointer group structure is the same in all indexes.

If the DATE element within the BOOK records occurred multiple times, only the first occurrence would be passed to the global qualifier. And if DATE hadn't occurred at all, either A171 would need to be specified in the PASSPROC to supply a default value, or else a null value would be passed to the global qualifier.
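The three outcomes described here (first occurrence, default, or null) can be expressed compactly. The following is a Python sketch, not SPIRES code; the function name is hypothetical, `None` stands in for a null value, and the `default` argument stands in for a value supplied by A171.

```python
def global_qualifier_value(occurrences, default=None):
    """Return the value passed to a global qualifier.

    Only the first occurrence is used; with no occurrences, the default
    (if one is supplied) is passed, otherwise a null value (None here).
    """
    if occurrences:
        return occurrences[0]
    return default


print(global_qualifier_value(["1980-07-01", "1981-01-15"]))
print(global_qualifier_value([], default="0000-00-00"))
print(global_qualifier_value([]))
```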

There is a special case of global qualifier worth mentioning. If the keys of goal records are passed to PTR-ELEM, then the pointer element in the indexes referred to by the PTR-ELEM can also be referred to by a global QUAL-ELEM. The QUAL-ELEM would not specify a GOALREC-ELEM since the key of the goal records had already been passed to PTR-ELEM. The SEARCHPROC would correspond to the INPROC of the goal record's keys, and the PASSPROC must be A165. The SEARCHTERMS statement provides you with search names that allow you to use the PTR-ELEM as a qualifier, which means you can qualify your search requests by goal record key criteria.

If this special QUAL-ELEM is the only qualifier defined for the GOALREC-NAME, and there is no compound index, then PTR-GROUP, PTR-ELEM, and this QUAL-ELEM can all refer to the same simple element in the indexes. This is the only exception to the rule about PTR-GROUP structures being required when qualifiers are defined. [See C.6.11 for more information about this technique, B.8.2 for another technique that provides key searching.]

B.8.7  Local Qualifiers

All the rules for PTR-GROUP structures and PTR-ELEM keys apply for local qualifiers just as they do for global qualifiers. [See B.8.6.] Local qualifiers are specified in the linkage section for any particular index by adding QUAL-ELEM sections just after the PTR-GROUP statement.

Let's alter our sample file definition and linkage section again to pass DATE as a local qualifier of the TITLE index (REC02) instead of making it a global qualifier in all indexes. We will keep the personal name index (REC04) introduced in the SUB-INDEX section, but it will not have a qualifier.

Notice the DUMMY element in the pointer group structure of REC04. It is there to make the structure "form" identical to the structure defined in REC02, which has a DATE-QUALIFIER element. Also notice that both the DUMMY element and DATE-QUALIFIER element were declared OPTIONAL. That's because the DUMMY element will not occur within REC04 occurrences of the POINTER-STR.

In this sample definition, the DATE element in the goal records always occurred since it is a FIXED element. However, if it had been an OPTIONAL element which did not occur, then A171 should be coded in the PASSPROC to pass some default value to the local qualifier, otherwise no index entries would be created for TITLE. All local qualifier and sub-index sections must define values for an index entry to be created. If the goal record elements which supply values for local qualifiers or sub-index terms are multiply occurring, or a PASSPROC rule specifies multiple passer elements, then multiple index entries can be created. [See B.8.14.]

Problems may be encountered if a variable length qualifier is passed to a fixed-length qualifier element in the index record. If this is being done, the following PASSPROC should be included with any other qualifier PASSPROCs:

where "n" is the value of the LEN statement on the qualifier element (i.e., the fixed-length field size).

B.8.8  Compound Indexes

CINDEX-VALUE is just a special case of local qualifier. PTR-GROUP must be a structure with PTR-ELEM as its KEY.

Let's alter our file definition to make DATE a compound index. REC03 will now be used to define a compound index record-type. Remember, the "form" of the PTR-GROUP structure must be the same in all index definitions. Here is the revised file definition, including the linkage sections for both the simple index on TITLE and the compound index on DATE (the PERSON index has been dropped).

Compare the linkage definitions for REC02, a simple index, and REC03, a compound index. Notice that the PTR-GROUP statements refer to different structure names in each index, but the "form" of those structures is the same, and they have the same KEY name. The PTR-GROUP structure names are usually the same, but that is not a requirement, as this example demonstrates.

Also notice that the compound index linkage has a "dummy" SEARCHTERMS statement coded, no SEARCHPROC or GOALREC-ELEM statements, two PASSPROC statements, and a new kind of statement, "CINDEX-VALUE".

When searching a compound index, the searcher may use the element name or alias of any of the goal record elements passing to the compound index; the SEARCHTERMS statement must be coded, but its value is meaningless. The index names that are reported when a user issues the SHOW SEARCH TERMS command are picked up from the P+ values of PASSPROC=A167. Note from the description of this action [See D.1.7.] that the order of the P+ parameters is not important unless some of the elements being passed are inside structures; in this case, the order must be the order in which the elements would be displayed if a record from the file were displayed in the standard output format.

It is this first PASSPROC, A167, that specifies the goal record elements that are passed to the compound index; this is why no GOALREC-ELEM statement is needed. Instead of a SEARCHPROC rule string, SPIRES passes all search values through the INPROC rules for the particular goal record element being searched. (Only one SEARCHPROC rule can be coded in a compound index definition: SEARCHPROC = A6; in the INDEX-NAME section.)

The CINDEX-VALUE statement is only coded in the linkage to a compound index, immediately following the PTR-GROUP statement. It names an element in the index record's PTR-GROUP structure that will receive data values being passed from the goal record elements (See A167 in the PASSPROC of INDEX-NAME). In the sample file, this element has the name "VALUE", hence the statement CINDEX-VALUE=VALUE in the linkage definition.

The final statement, a second PASSPROC, is always coded in compound index linkages. If the elements being passed are in binary form (as is often the case in compound indexes), such as the DATE element, then A169:1 is the only rule coded for this statement. If the elements are values that must be converted to uppercase, then A169:0 (or simply A169) is coded. If some elements being passed must not be converted to uppercase and others must be, then A162 is also coded, as explained later. [See B.8.11.]

B.8.9  Coding Searchproc Rules

If the values of an element have been altered by an Inproc or Passproc then the same processing rules are generally coded in the Searchproc rule string to apply a similar transformation on the values a user might specify in search commands.

Numerous Searchproc rules are available to modify the search process, as listed in section D.1.2 of this manual and in the "SPIRES System Procs" manual. Other actions may also be used in Searchproc rule strings, as indicated in their descriptions.

If no Searchproc statement is coded, and thus no Searchproc rules are specified, the default Searchproc will be used: "A45,". This will automatically cause search values to be broken on blanks.

Case-Sensitive Searching

By default, SPIRES assumes all index entries are in uppercase, and thus converts search values into uppercase automatically. Hence, no rule such as A30 or $CAP is needed in the Searchproc to convert the search value to uppercase.

However, in rare circumstances, file owners want case-sensitive searching, e.g., FIND ID = 32s14 should provide a different result from FIND ID = 32S14. In those cases, the file owner needs to turn off the automatic uppercase-conversion by invoking Secure-Switch 16 in the Subfile section of the file definition. Then, any other indexes of the subfile in which uppercase search values are required should probably include $CAP or A30 in their Searchproc statements. [See B.9.3.16.]
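The interaction between the automatic conversion, Secure-Switch 16, and an explicit $CAP or A30 rule can be summarized in a short Python sketch. This is an illustration of the logic only, not SPIRES code; the function and parameter names are hypothetical.

```python
def prepare_search_value(value, secure_switch_16=False, searchproc_caps=False):
    """Return the search value as it would be compared against the index.

    secure_switch_16 -- automatic uppercase conversion is turned off
    searchproc_caps  -- the index's Searchproc includes $CAP or A30
    """
    if not secure_switch_16 or searchproc_caps:
        return value.upper()
    return value


print(prepare_search_value("32s14"))
print(prepare_search_value("32s14", secure_switch_16=True))
print(prepare_search_value("32s14", secure_switch_16=True, searchproc_caps=True))
```

The third call corresponds to the "other indexes" mentioned above: once the switch is set for the subfile, each index that still needs uppercase search values has to request the conversion itself.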

Note that case-sensitive indexes are created in passing by choosing a fetcher rule that doesn't automatically convert the passed value to uppercase. [See B.8.12.]

B.8.10  The NOPASS Statement

The NOPASS statement may be specified in the linkage section of a file definition. If it is coded, SPIBILD will not attempt to update ANY of a subfile's indexes when it is processing records. The indexes can still be searched, however.

The statement, "NOPASS;", is placed after the last statement in the linkage section of the goal record whose indexing is to be "turned off." The file definition must then be recompiled. Subsequent SPIBILD processing will not attempt to pass any information to the subfile's indexes. In order to re-start index updating, the NOPASS statement must be removed from the file definition, and the definition must then be recompiled.

Note that NOPASS stops passing to all indexes in a single linkage section; it cannot be used to disable one of several indexes selectively. To do that, consider using the PASSPROC = $NOPASS statement. [EXPLAIN $NOPASS PROC.]

You can disable passing on a case-by-case basis (as opposed to globally by changing the file definition) by using the SET PASS or SET NOPASS commands in SPIBILD. These commands also allow you to turn off passing from individual record-types. [See B.10.12.2.]

B.8.11  Coding PASSPROC Rules

The rules for coding strings of PASSPROCs are more rigid and difficult than the rules for coding INPROCS, OUTPROCS or SEARCHPROCS. The descriptions of the PASSPROC rules in the last part of this manual [See D.1.7, D.2.6, D.3.] and in the ACTIONS subfile are very concise; a problem with their brevity is that the choices a file definer has in coding PASSPROC rule strings are not easily distinguished. This section will focus on the central choices that must be made in coding PASSPROC strings.

The first PASSPROC encountered in most file definitions is in the global parameters of the linkage section. For example:

As has been mentioned, PASSPROC=A170 is used if the goal record is REMOVED (as most sample definitions in this manual are) and the PTR-ELEM is declared TYPE=LCTR. This rule says that the PTR-ELEM in each index record will receive the locator of the goal record in the residual data set.

It is also possible to pass the key of the goal record to the PTR-ELEM, instead of the locator of the goal record in the residual data set. If the goal records are not REMOVED then you must pass the key. To do this, one of the following PASSPROC rules must be coded instead of A170:

Neither of these rules forces the key to uppercase. When no PASSPROC is coded, the default action is to force the key to uppercase, which should only be done when the goal record's key is already an uppercase value.

The first rule is used when only the GOALREC-KEY is to be passed. That includes passing the slot-number of SLOT records. The second rule is usually used with multiple passer elements, but can be used with just the goal record's key specified.

Although it is not frequently done, it is possible to pass the key of the goal record as the pointer even if the goal record is a REMOVED record-type. This should be avoided if the key is varying in length or if the key is more than four bytes long.

There are two other places in the linkage section where the choice of a PASSPROC is simple. The first of these is the second PASSPROC statement coded in a compound index linkage definition:

A169:1 is used when the value or values being passed are stored as numbers and hence must not be forced to uppercase. A169:0 would be used if character strings were being passed to this index. If both character and binary data were being passed to this index, then A162:1 would also be coded to exclude certain elements' values from uppercase conversion. For example:

would force all values to uppercase upon passing, except those from the DATE element.

The second case in which the choice of a single PASSPROC is simple is the second PASSPROC string coded in a personal name index:

Here, only PASSPROC=A165 can be coded because A38 in the PASSPROC of INDEX-NAME supplies both the KEY of the index record and the KEY of the sub-index structure.

B.8.12  Choosing the "Fetcher" Passproc

Choosing the Passproc rule that fetches the element value or values from the goal record is a matter of selecting one rule from among sixteen that are shown in a table below. However, the table itself requires some explanation of the terminology that is often found in SPIRES processing rule descriptions.

The terms "single passer" and "multiple passer" need definition. A single passer situation occurs when only one goal record element is passing its value or values to an index and the GOALREC-ELEM statement is coded. It does not matter whether this element is itself singly or multiply occurring, or whether A45 is used to break a single occurrence into multiple occurrences. A multiple passer situation occurs when more than one element in the goal record passes to a single index, and so the elements to be passed are coded in the Passproc itself, not in any GOALREC-ELEM statement. For example, if HOME-PHONE and BUSINESS-PHONE elements in a goal record were both passed to a PHONE index, this would be a multiple passer situation.

Compound indexes present a special problem, since they contain two Passproc rule strings. However, the choice of Passproc rules is fairly straightforward. [See B.8.5 for a complete discussion.] The first Passproc is always "A167:0"; this defines the elements from the goal record to be passed to the index. The second Passproc may be either A169:0 or A169:1, depending on whether or not the elements being passed are to be forced to uppercase in passing; see the entry for a single passer without A38 or A45 in the following table.

One other factor affects the choice of Passproc rules from the table. If values "fetched" (obtained) from the goal record are to be broken apart in passing by action A45 or by action A38 (the latter for personal name indexes), then the Passproc rule selected must differ from the one that would be chosen if A45 or A38 were not coded in the same rule string.

The P1 parameter on the Passproc that fetches the value from the goal record is determined by whether or not the value is to be forced to uppercase in passing. Values that are converted to an internal form, such as fixed binary or date values, must not be converted to "uppercase" in passing, since this would change their value. Otherwise you would choose a rule that would convert the value to uppercase, since SPIRES will automatically convert the user's search value to uppercase (except in the rare situation of case-sensitive indexes). [See B.8.9.]

The P1 parameter is also determined by whether or not the value should be processed through the OUTPROC rules associated with that element, thus passing the external form of the value to the index.

Both actions and equivalent system procs are shown. For details about system procs, see the SPIRES manual "System Procs".

                 Without                With
               A38 or A45            A38 or A45
         |--------------------|---------------------|
         |                    |                     |
 Single  | A169:0             | A166:0              |--Force Upper
 Passer  | $PASS              | $PASS(,BREAK)       |
         |                    |                     |
         | A169:1             | A166:1              |--Don't Force
         | $PASS(UPLOW)       | $PASS(UPLOW, BREAK) |
         |                    |                     |
         | A169:8             | A166:8              |--Force Upper,
         | $PASS(,,OUT)       | $PASS(,BREAK, OUT)  |  but Pass
         |                    |                     |  External Form
         |                    |                     |
         | A169:9             | A166:9              |--Don't Force,
         | $PASS(UPLOW,,OUT)  | $PASS(UPLOW, BREAK, |  but Pass
         |                    |   OUT)              |  External Form
         |--------------------|---------------------|
         |                    |                     |
Multiple | A167:1             | A167:2              |--Force Upper
 Passers | $PASS.ELEM(elems)  | $PASS.ELEM(elems,   |
         |                    |  BREAK)             |
         |                    |                     |
         | A167:5             | A167:6              |--Don't Force
         | $PASS.ELEM(elems,  | $PASS.ELEM(elems,   |
         |  UPLOW)            |  BREAK.NUM)         |
         |                    |                     |
         | A167:9             | A167:10             |--Force Upper, but
         | $PASS.ELEM(elems,, | $PASS.ELEM(elems,   |  Pass External Form
         |  OUT)              |  BREAK, OUT)        |
         |                    |                     |
         | A167:13            | A167:14             |--Don't Force, but
         | $PASS.ELEM(elems,  | $PASS.ELEM(elems,   |  Pass External Form
         |  UPLOW, OUT)       |  BREAK.NUM, OUT)    |
         |--------------------|---------------------|
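The choice described by the table can be restated as a simple lookup. The following Python sketch is not part of SPIRES; it only transcribes the sixteen action codes from the table above. The function name and the four yes/no arguments are assumptions introduced for illustration.

```python
# Fetcher rules transcribed from the table above (illustrative only).
# Key: (multiple_passers, with_a38_or_a45, force_upper, pass_external_form)
FETCHER_RULES = {
    (False, False, True,  False): "A169:0",
    (False, False, False, False): "A169:1",
    (False, False, True,  True):  "A169:8",
    (False, False, False, True):  "A169:9",
    (False, True,  True,  False): "A166:0",
    (False, True,  False, False): "A166:1",
    (False, True,  True,  True):  "A166:8",
    (False, True,  False, True):  "A166:9",
    (True,  False, True,  False): "A167:1",
    (True,  False, False, False): "A167:5",
    (True,  False, True,  True):  "A167:9",
    (True,  False, False, True):  "A167:13",
    (True,  True,  True,  False): "A167:2",
    (True,  True,  False, False): "A167:6",
    (True,  True,  True,  True):  "A167:10",
    (True,  True,  False, True):  "A167:14",
}


def choose_fetcher(multiple_passers, with_break, force_upper, external_form=False):
    """Return the fetcher action code for one cell of the table.

    The argument names are hypothetical; they mirror the table's row and
    column labels, not any SPIRES statement.
    """
    key = (bool(multiple_passers), bool(with_break),
           bool(force_upper), bool(external_form))
    return FETCHER_RULES[key]
```

For instance, a single passer that passes binary values without A38 or A45 would call choose_fetcher(False, False, False), which yields A169:1, the rule the text recommends for values that must not be forced to uppercase.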

B.8.13  Other Actions in a PASSPROC Rule String

The following describes the syntax for any PASSPROC rule string. You enter the table at the word "PASSPROC" and follow the paths defined. The symbol "::" is read "is defined as." Terms on the left side of a "::" are defined by the term(s) that appear on the right side of the "::". Terms on the right side of the "::" that are listed directly under another term (or terms) on that side are an alternative definition for the term on the left side of the "::". The symbol "|" means "or," and also separates alternative definitions.

 <Term>    indicates a term that must occur once, i.e., is required
 (Term)    indicates a term that may occur once, i.e., is optional
 (0,Term)  indicates a term that may occur any number of times,
           i.e., may be omitted or may occur one or more times
 A-number  indicates a required processing rule.  If no P1
           parameter is specified, then all P1 parameters are
           included.  If a P1 parameter is specified, then only
           that P1 parameter is allowed.
PASSPROC         ::  <MULTIPLE-PASSER>
                  |  <SINGLE-PASSER>
                  |  <SIMPLE-PASSER>
MULTIPLE-PASSER  ::  (DEFAULT) <MULTIPLE-FETCHER> (0,MIDDLE) <BREAK>
SINGLE-PASSER    ::  (DEFAULT)  <SINGLE-FETCHER>  (0,END)
SIMPLE-PASSER    ::  <DEFAULT> | A165 | A167:0 | A170
DEFAULT          ::  A171
MULTIPLE-FETCHER ::  A166 | A167:2 | A167:6
SINGLE-FETCHER   ::  A167:1 | A167:5 | A169
MIDDLE           ::  A22  | A32  | A36  | A40  | A43  | A44  | A46  | A47
                  |  A48  | A55  | A62  | A161 | A162 | A163 | A168
BREAK            ::  A45  (0,END)
                  |  A38
END              ::  <MIDDLE> | A52 | A164

For example, this syntax shows that the following is illegal syntax:

because A52 (as an END rule) must follow A45 (which is a BREAK rule). This syntax also shows that A38 must be the last rule in any PASSPROC in which it is coded.
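The syntax above is regular, so it can be checked mechanically. The following Python sketch is only an illustration, not a SPIRES facility: it maps each rule to a one-letter class and matches the sequence against a regular expression derived from the definitions above. It assumes each rule is written as "Annn" or "Annn:p1" with no further parameters, and it treats anything else as illegal. The rule strings in the usage note are hypothetical, not taken from this manual.

```python
import re

# MIDDLE rules as listed in the syntax above.
MIDDLE = {22, 32, 36, 40, 43, 44, 46, 47, 48, 55, 62, 161, 162, 163, 168}


def classify(rule):
    """Map one rule such as 'A167:2' to a one-letter grammar class ('?' if unknown)."""
    m = re.fullmatch(r"A(\d+)(?::(\d+))?", rule.strip().upper())
    if not m:
        return "?"
    num = int(m.group(1))
    p1 = int(m.group(2)) if m.group(2) is not None else None
    if num == 171:
        return "D"                      # DEFAULT
    if num in (165, 170) or (num == 167 and p1 == 0):
        return "X"                      # the non-DEFAULT SIMPLE-PASSER rules
    if num == 166 or (num == 167 and p1 in (2, 6)):
        return "M"                      # MULTIPLE-FETCHER
    if num == 169 or (num == 167 and p1 in (1, 5)):
        return "S"                      # SINGLE-FETCHER
    if num in MIDDLE:
        return "m"                      # MIDDLE (also allowed as END)
    if num == 45:
        return "B"                      # BREAK that may be followed by END rules
    if num == 38:
        return "N"                      # BREAK that must be last
    if num in (52, 164):
        return "e"                      # rules allowed only as END
    return "?"


# MULTIPLE-PASSER | SINGLE-PASSER | SIMPLE-PASSER, in class letters.
PASSPROC = re.compile(r"D?Mm*(?:B[me]*|N)|D?S[me]*|D|X")


def is_legal(rules):
    """Return True if the list of rules fits the PASSPROC syntax above."""
    return PASSPROC.fullmatch("".join(classify(r) for r in rules)) is not None
```

Under these assumptions, is_legal(["A166:0", "A45", "A52"]) is True, while is_legal(["A166:0", "A52", "A45"]) is False because A52 precedes A45, and is_legal(["A166", "A38", "A52"]) is False because A38 is not the last rule.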

B.9  Defining Subfile Privileges

B.9.1  The Function of the Subfile Section

Preceding chapters of this manual have considered the definition of goal and index records, and most recently, the definition of the linkage section. The present chapter will describe the last major section of a file definition, the subfile section.

The subfile section determines a particular user's view of the file and its data. Different users may have different views defined. The subfile section defines what are called "privilege groups," that is, groups or individual users who have certain "rights" of access (privileges) when a particular subfile is selected. Generally, these privileges determine 1) whether a user can update a subfile or merely search it; 2) whether a user can issue commands such as SET FORMAT, SET ELEMENT, and BROWSE; 3) whether certain indexes cannot be searched; 4) whether certain structures or elements can't be updated; 5) whether certain structures or elements can't be seen; 6) what format, search modifier, or subfile explanation are in effect when a user has selected a subfile. Together, all these privileges make up a profile or view of the data in a file that particular users see.

Privilege groups in the subfile section vary in number and complexity according to the number and variety of restrictions placed on elements and indexes. Most file definers have found the definition of the subfile section to be the easiest and most straightforward part of the file definition task. In fact, you may choose to omit the section from your file definition entirely. But since this section defines, among other things, the name that will be used to access the goal record in a SELECT command, omitting it means that only the file owner has access to the goal record, and then only via the ATTACH command. Quite often during the design and testing of a file definition this is acceptable, and so the subfile section is frequently not coded until the rest of the file definition is complete.

The information in this section of the file definition is quite unlike statements in other parts of the file definition in one important respect. Information added or changed in this section does not require a RECOMPILE of the definition to take effect. The changes made here take effect as soon as the record is added to, updated in or removed from the FILEDEF subfile.

The information in the subfile section does not affect the structure of the file, just the manner or conditions under which data can be accessed.

B.9.2  Basic Statements in the Subfile Section

Look carefully at the new group of statements appearing at the end of the following skeleton file definition.

Beginning with the SUBFILE-NAME statement, the last lines of the definition make up a simple subfile section. For many files defined by users, these are all the statements ever needed in the subfile section. Here we have defined a subfile called "PROFESSORS" that can be selected by anyone in group GG or by accounts GA.SPI and GA.LHB. In addition, if any of these users issues the command EXPLAIN PROFESSORS, the six lines of explanation (EXP) provided in the file definition will be output.

Let's look at a descriptive skeleton of this simple subfile section. Note the indentation used, as it indicates which elements are structures.

ACCOUNTS is a singly occurring element in a multiply occurring structure keyed on GOAL-RECORD. GOAL-RECORD in turn is the key of a multiply occurring structure in another multiply occurring structure keyed on SUBFILE-NAME. We will see later in this chapter how this nesting of structures gives a good deal of flexibility in setting up different groups of users with different access privileges to information in the file.

The value coded in the SUBFILE-NAME statement should not be longer than 32 characters, including blanks, and should be descriptive of the contents of the subfile. The name chosen should be unique among subfile names that the users of the subfile can select. If the name is not unique in this way, a user attempting to select the subfile may be asked to specify which file he or she wants, after being given the file names of the "competing" subfiles. The user will not be asked if one of the files belongs to him or her, in which case that subfile will be used. [See B.5.6.]

The value(s) coded for the EXP statement(s) each make up a single line of terminal output when the user types the EXPLAIN command and names the subfile. Notice from the example that blank lines can be output by coding only "EXP;", and that lines with leading blanks must have their values enclosed in double quotes. The subfile explanation is not required, but is strongly recommended, especially when a subfile is made public. The explanation might include a description of the contents, uses and scope of the subfile; information about updating frequency, data elements, and any special aspects of subfile use are also helpful. At a minimum, a public subfile should give the name of a person or document that can be consulted for further information.

The GOAL-RECORD statement must have the name of the goal record that was coded in the RECORD-NAME statement in the record definition.

The value of the ACCOUNTS statement is a list of the accounts that are permitted to select the subfile; the accounts are separated from each other by commas. It is possible to privilege an entire group to select a subfile by coding the group name followed by four periods: "XA...." for example, will permit any member of group "XA" to select the subfile. (More specific or more general subsetting of permitted users is also allowed. The following forms are valid: g....., gg...., gg.u.., gg.uu., and gg.uuu. The form "g....." would determine community subfiles, and the form "gg...." would determine group subfiles, as listed by the SHOW SUBFILES command. Note that the number of periods plus the number of characters of the account number must equal six. Other forms, such as "...uuu", are not allowed.) You may permit the public (any account) to select a subfile by specifying "PUBLIC" as an account.

If a user appears in several privilege groups for a single subfile, he or she will be given the privileges associated with the most specific listing of the account number. That is, if account XX.YYY selects a subfile with a set of privileges for the PUBLIC, another set for community X, another set for group XX, and another set for account XX.YYY, the set for account XX.YYY (the most specific listing) will be in effect. This situation often arises when a particular account or accounts are allowed to update a subfile that is selectable by the public for searching only.
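The "most specific listing" rule can be sketched as follows. This Python fragment is only an illustration of the rule as described above, not SPIRES code. It assumes accounts have the six-character form gg.uuu, that trailing periods in an ACCOUNTS entry act as wildcards, and that specificity can be measured by the number of fixed leading characters; the function names are hypothetical.

```python
def pattern_prefix(pattern):
    """Return the fixed prefix of an ACCOUNTS entry, '' for PUBLIC, None if malformed."""
    if pattern == "PUBLIC":
        return ""
    prefix = pattern.rstrip(".")
    # Valid forms: g....., gg...., gg.u.., gg.uu., gg.uuu (always six characters).
    if len(pattern) != 6 or len(prefix) not in (1, 2, 4, 5, 6):
        return None
    # Inside the fixed part, a period may appear only as the separator (third character).
    for i, ch in enumerate(prefix):
        if (ch == ".") != (i == 2):
            return None
    return prefix


def most_specific(account, patterns):
    """Return the matching entry with the longest fixed prefix, or None if none match."""
    best = None
    for pattern in patterns:
        prefix = pattern_prefix(pattern)
        if prefix is None or not account.startswith(prefix):
            continue
        if best is None or len(prefix) > len(best[0]):
            best = (prefix, pattern)
    return best[1] if best else None
```

With the entries PUBLIC, X....., XX.... and XX.YYY, account XX.YYY resolves to XX.YYY, account XX.ZZZ resolves to XX...., and an account in another community resolves to PUBLIC. A form such as "...uuu" is rejected as malformed.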

In the following subfile section, two simple statements have been added to the previous subfile section:

The FORMAT statement specifies the name of a format that is to be set automatically whenever the subfile is selected in SPIRES (not in SPIBILD), just as if a SET FORMAT command had been issued. The SEARCH-MOD statement specifies a string that is to be appended to every search command issued, just as with the SET SEARCH MODIFIER command. Both of these statements take effect only once, when the subfile is selected. The user is free to issue other SET FORMAT or SET SEARCH MODIFIER commands to change these settings.

A WHERE-MOD statement, similar to the SEARCH-MOD statement, can be specified. This statement specifies a criteria clause (the clause that follows WHERE in a FOR command) that is appended to all FOR commands (except FOR REMOVES) issued against the subfile. The syntax of the statement is:

If no logical operator is included, AND is assumed. AND NOT and NOT produce the same result. Any FOR command will have this "WHERE-MOD" appended to it; unlike SEARCH-MOD where a SET SEARCH MODIFIER command can cancel it (unless there is some SECURE-SWITCH involved), the WHERE-MOD cannot be removed by the subfile user.

The WHERE-MOD statement has no effect on Global FOR requests in SPIBILD.

B.9.2a  Subfile Selection for "Access Lists" of Accounts

File owners often own multiple files that can be selected by the same sets of accounts. For example, you might own ten files that everyone in your department must be able to select. But suppose your department, like most others, has a great deal of turnover, and every time someone comes or goes, you have to change the accounts lists in ten file definitions.

Instead of trying to keep the same list of accounts in sync across ten file definitions, you could maintain the list in a single place -- a record in the EXTDEF subfile -- and refer to the list by name in the Subfile section of each file definition. That way, you could make changes to the single EXTDEF record, and they would apply immediately to each of the file definitions that referred to it.

The relevant statements in the EXTDEF record are:

You put the list of accounts in the ACCESS-ACCOUNTS statement. The ACCESS-ACCOUNTS list has the same format as the ACCOUNTS statement in the file definition. Any of the following may be specified, each separated from the next by a comma:

The ID of the EXTDEF record is then named in ACCESS-LISTS, a statement in the Subfile section of the appropriate file definitions:

As the form suggests, you can name several access lists in the ACCESS-LISTS statement. You can also name individual accounts in the ACCOUNTS statement; both the ACCESS-LISTS and ACCOUNTS statements may appear in the same Subfile section. [See B.9.2.]

If it simplifies maintenance of access lists for you, you may also want to "nest" access lists within each other by using the ACCESS-SUBLIST element in the EXTDEF record:

Anytime the current EXTDEF record is referenced as an access list during the subfile-selection process, SPIRES will first look through the accounts listed in the ACCESS-ACCOUNTS statement in the current record; then, it will look in the "secondary" access lists named in the ACCESS-SUBLIST statement in the current record.
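The lookup order just described (the record's own accounts first, then its secondary lists) can be pictured with the following Python sketch. It is an illustration only: the dictionary stands in for the EXTDEF subfile, accounts are compared by exact match (the group and community forms are not handled), and the guard against lists that name each other is an assumption, since this manual does not say how SPIRES treats such a case.

```python
def in_access_list(account, list_id, extdef, _seen=None):
    """Return True if `account` is found in access list `list_id` or its sublists.

    `extdef` is a hypothetical stand-in for the EXTDEF subfile: a dict that
    maps a record ID to a dict of statement name -> list of values.
    """
    seen = _seen if _seen is not None else set()
    if list_id in seen or list_id not in extdef:
        return False                     # already visited, or no such record
    seen.add(list_id)
    record = extdef[list_id]
    # First, the accounts listed in the current record.
    if account in record.get("ACCESS-ACCOUNTS", []):
        return True
    # Then, the secondary access lists named in the current record.
    return any(in_access_list(account, sub, extdef, seen)
               for sub in record.get("ACCESS-SUBLIST", []))


# Hypothetical data: the record IDs and accounts are invented for illustration.
EXTDEF = {
    "DEPT.ALL":   {"ACCESS-ACCOUNTS": ["GA.SPI"], "ACCESS-SUBLIST": ["DEPT.STAFF"]},
    "DEPT.STAFF": {"ACCESS-ACCOUNTS": ["GA.LHB"], "ACCESS-SUBLIST": ["DEPT.ALL"]},
}
```

Here in_access_list("GA.LHB", "DEPT.ALL", EXTDEF) is True because the account is found through the nested list, and an account that appears in neither record is rejected even though the two lists refer to each other.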

It is possible for other file owners to use the "access lists" you have put in the EXTDEF subfile as well. That means that you can share access lists with other file owners. If you want to prevent others from using your access lists, or wish to limit the accounts that can, you can add the ACCOUNTS statement to the EXTDEF record:

If specified, only the accounts listed (as well as your own account) can use the EXTDEF record. To give access to your account only, add an ACCOUNTS statement with only your account listed. Do not confuse the ACCOUNTS statement in EXTDEF with the ACCOUNTS statement in the Subfile section of a FILEDEF.

Note that other file owners have "use-only" access to your EXTDEF records; they cannot see or change them, meaning that any changes to the ACCESS-ACCOUNTS (or ACCOUNTS) statement must be made by you. (However, you may give other accounts "update" access to EXTDEF records using the "metacct" facility; EXPLAIN METACCT SUBFILE for details.)

(Stanford-only) The ACCESS-UNIVIDS Statement

You can also specify the users who can select your subfile by placing their 8-digit Stanford University ID into the ACCESS-UNIVIDS statement, e.g.,

When the $UNIVID variable is set to one of the specified University IDs, then subfiles that give access to those IDs on this access list can be selected.

Subfiles that give access via ACCESS-UNIVIDS will not show up in the SHOW SUBFILES display.

The ACCESS-MATCH Element

A virtual element called ACCESS-MATCH provides a handy way for comparing the current user's account or Stanford University ID to the lists of accounts or University IDs specified in the ACCESS-ACCOUNTS, ACCESS-UNIVIDS and ACCESS-SUBLISTS elements in the EXTDEF record. If SPIRES matches the current user to an entry in one of those elements, then it returns the matching value in the ACCESS-MATCH element. If there is no match, the element's value is null.

In other words, if a program needs to check whether the current user is in one of the access lists of the EXTDEF record, it can simply check the value of the ACCESS-MATCH element.

B.9.3  The SECURE-SWITCHES Statement

Often it is not desirable to allow the user to set another format, or to clear the format set by the file definer, just as it is sometimes not desirable to allow a particular user or group to update a file that they can select and search. Preventing certain kinds of data manipulation is the function of the SECURE-SWITCHES statement. This statement, usually used to prevent users from doing particular things, is also coded in the subfile section.

Secure-switches are specified as numbers that activate certain functions:

For example:

There are 19 secure-switches that can be used independently or in any combination. A summary list, which describes how each switch affects the targeted users, appears below; details on each switch follow this section. Online, issue the command "EXPLAIN SSW n", replacing "n" with the number of the switch you're interested in.

Two other commands are handy when working with secure-switches:

which shows the secure-switches in effect for the selected subfile; and

which allows the file owner, and users with MASTER access to the file, to set a secure-switch dynamically, usually in order to test its effect on the subfile. [See B.9.3a.]

B.9.3.1  Secure-Switches 1 and 2

Secure-switches 1 and 2 provide for variations in the action taken when an error is detected by a processing rule. For example, suppose a personnel office has a file of salary data. In order to verify that salaries entered are within a particular range, the following might be the definition of the salary elements.

The Inproc coded would abort a record being added or updated if the salary was not from 9000 to 15000. However, it might be reasonable to allow certain groups of accounts to enter salaries outside of this range, perhaps providing a warning message if such data were entered. If the Inproc were recoded with a "V" for variable error conditions, and the appropriate secure-switch or switches were specified, this would be possible. For example:

Then, if no secure-switch were coded for a particular account group, no error message would be output and the value would be accepted. If SECURE-SWITCHES=2 were coded for the privilege group, then VD would be treated as a W error, and a warning would be provided. If SECURE-SWITCHES=1,2 were coded for the privilege group, then VD would be treated as an S error, and a message would be output, and the record aborted. If no SECURE-SWITCHES statement is coded for a privilege group, then all error conditions are treated as if the "V" specification were absent; "VD" is treated as "D", for example.

If SECURE-SWITCHES=1 is coded for a privilege group, then

If SECURE-SWITCHES=2 is coded for a privilege group, then

If SECURE-SWITCHES=1,2 is coded for a privilege group, then

As shown, if both 1 and 2 are coded, all V errors are treated as S errors. So, if VW were coded for actions for which S would have been coded, it would be possible to have one account group receive only warning messages while another account group, having SECURE-SWITCHES 1 and 2 specified, would receive serious error messages.

Variable error modifiers can also be used to advantage in Searchproc rules, to force one group of accounts to search only certain values in an index, while others could search the entire index.

By the way, X is allowed as an equivalent for VD; for example, AX53 and AVD53 are the same.

B.9.3.2  Secure-Switch 2

This section is merely a place-holder for Secure-Switch 2, which is described in the previous section. [See B.9.3.1.]

B.9.3.3  Secure-Switch 3

Secure-switch 3 prevents users from adding, updating or removing goal records. It blocks ADD, UPDATE, ADDUPDATE, REMOVE, MERGE, ADDMERGE, DEQUEUE, UNQUEUE, BATCH and all the INPUT commands in SPIRES.

In an obscure way, it also affects SPIBILD. In SPIBILD, input access is limited to the file owner and users with PROCESS access to the file. [See B.9a.] However, even those users cannot do any input processing (e.g., with the INPUT BATCH command) into a subfile if secure-switch 3 is set for their accounts.

B.9.3.4  Secure-Switch 4

Under secure-switch 4, account checking by processing rules ($TEST.ACCT proc or action A53) is ignored for the account using the subfile. This switch is most often used to allow a file owner, or specially privileged accounts, to have access to all records in a subfile when normal access to the records is controlled by account number.

B.9.3.5  Secure-Switch 5

Under secure-switch 5, the BROWSE GOAL command is blocked. Global FOR commands are not affected.

B.9.3.6  Secure-Switch 6

Secure-switch 6 is meant to restrict access to records to normal search processing only; "TRANSFER key", "DISPLAY key", EXTRACT, SCAN, SET and SHOW SEARCH MODIFIER, STACK, echo of protocol commands and SHOW XEQ STACK are blocked. Most Global FOR commands are also blocked, with the exception of processing under FOR RESULT, FOR SET or FOR STACK, if the stack was created from a result or set.

B.9.3.7  Secure-Switch 7

Secure-switch 7 ensures that when OR and AND NOT operations are performed during index searching, the entire pointer structure is compared, not just the pointer; therefore qualifiers are compared with each other. In other words, OR and AND NOT will always be treated as TOR and TNOT respectively in index searching.

Secure-switch 17 has a similar effect on AND, converting it to TAND. [See B.9.3.17.]

B.9.3.8  Secure-Switch 8

Secure-switch 8 blocks any non-formatted processing or display of records, including REFERENCE, SET ELEMENT, SHOW ELEMENT, GENERATE SET, GENERATE LOAD and CLEAR FORMAT commands, as well as TYPE and OUTPUT commands followed by an element list. TRANSFER is also blocked unless a format with USAGE=TRANSFER or ALL is in effect.

B.9.3.9  Secure-Switch 9

Secure-switch 9 blocks SET FORMAT, SHOW FORMAT, TRANSFER and REFERENCE commands; hence, it allows processing either only with a format or only without one, depending on whether a default format is set for the subfile using the FORMAT statement. [See B.9.2.]

B.9.3.10  Secure-Switch 10

Secure-switch 10 prevents two users from having the same record transferred or referenced (unless the NOUPDATE option appears on the REFERENCE command) at the same time.

In addition, it blocks the "UPDATE key" command and the ADDUPDATE command. UPDATE alone, following a TRANSFER or REFERENCE command, is allowed under SSW 10.

The MERGE command is blocked under SSW 10 if the record being updated is currently transferred or referenced by another user.

Lastly, SSW 10 blocks the UPDATE option on INPUT LOAD commands in SPIRES, but allows it in SPIBILD.

B.9.3.11  Secure-Switch 11

Secure-switch 11 affects how SPIRES treats relational operators during searches involving simple indexes. When secure-switch 11 is set, only the equality operator is recognized; other operators, if they appear in the search command, are treated as part of the search value.

The session below shows how secure-switch 11 can affect a search; the user shown is the file owner, who is allowed to change the secure-switches with the SET SSW command. [See B.9.3a.] At the start, the switch is not set.

In the first FIND command, SPIRES treats the word AFTER as the relational operator. In the second, SPIRES treats it as part of the search value, since it is enclosed with the rest of the value within quotation marks.

But once secure-switch 11 is set, the word AFTER is treated as part of the value, not as a relational operator. So, in essence, the second and third FIND commands have the same effect: they both find a record with the title AFTER THE FOX.

The error message that SPIRES displays when it spots a relational operator other than the equality operator under secure-switch 11 processing is a warning error, which can be suppressed in protocols by issuing the SET WARNING MESSAGES = 0 command, if desired.

B.9.3.12  Secure-Switch 12

Secure-switch 12 has been withdrawn. It formerly provided a special type of SPIBILD processing for batch requests in which index information was passed immediately for every update or merge request processed.

B.9.3.13  Secure-Switch 13

Secure-switch 13 is used to force a record into memory for key processing when the record is accessed. If the key of the goal record has a Userproc in its Outproc statement (a Userproc in the SLOT-PROC statement [See C.11.4.2.] if the goal record is a slot), then secure-switch 13 will make sure that the record key is driven through its Outproc (or Slot-proc) when the record is accessed. Thus, for example, you could prevent a particular user from updating records by coding a Userproc that checks the account number of the user. Whenever the record was accessed by its key (using a command such as "REMOVE 13" for example) the record would first be brought into core and the Userproc on the key's Outproc would be executed.

Note that SECURE-SWITCH 13 does not guarantee that the record will always be brought into core whenever it is accessed. This disclaimer is needed because it is possible that a record is already in core. For example, suppose that a Userproc on the key forbids some user from updating a record, i.e., issuing the UPDATE command against a record. If the user has issued a Global FOR command, such as FOR TREE, and DISPLAYed the record, the record has already been driven through its key's Outproc when the record is displayed. A subsequent TRANSFER * command, followed by an UPDATE, would be permitted; the record is already in core, and thus would not be checked again.

The ADDUPDATE command cannot be used with subfiles that have SECURE-SWITCH 13 set.

B.9.3.14  Secure-Switch 14

Secure-switch 14 must be set for users to take advantage of the WITH UPDATE feature, which allows:

See the manual "SPIRES Formats" for further information on WITH UPDATE.

B.9.3.15  Secure-Switch 15

Secure-switch 15 provides a second way for SPIRES to handle record input containing see-only elements. [See B.9.4.3.]

If secure-switch 15 is set, record transactions that attempt to add, update or remove see-only elements will fail with an S281 error. If secure-switch 15 is not set, the input for the see-only elements is completely ignored, i.e., SPIRES discards it, and it has no effect on the success or failure of the transaction.
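The two behaviors can be contrasted in a short Python sketch. This is an illustration of the rule stated above, not SPIRES code; the function, the exception class and the element names are hypothetical, and only the S281 error number comes from this manual.

```python
class S281Error(Exception):
    """Stands in for the S281 error described above (hypothetical class)."""


def filter_input(record, see_only, ssw15):
    """Apply the see-only rule to input for one record transaction.

    record   -- dict of element name -> input value
    see_only -- set of element names that are see-only for this user
    ssw15    -- True if secure-switch 15 is set for this user
    """
    touched = sorted(name for name in record if name in see_only)
    if touched and ssw15:
        # Switch set: the whole transaction fails.
        raise S281Error("input for see-only element(s): " + ", ".join(touched))
    # Switch not set: input for see-only elements is silently discarded.
    return {name: value for name, value in record.items() if name not in see_only}
```

For example, with SALARY as a see-only element, input containing NAME and SALARY is reduced to NAME alone when the switch is not set, and the transaction is rejected when it is set.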

B.9.3.16  Secure-Switch 16

Secure-switch 16 allows indexes containing upper- and lowercase values to be searched successfully. Unless this secure-switch is specified, the values processed by FIND, AND, OR, AND NOT, ALSO and SYNONYM commands are always converted to uppercase by SPIRES before any Searchproc rules are applied.

If this secure-switch is specified, then no automatic conversion to uppercase is done. Thus, upper-lowercase indexes can be searched. Any uppercase indexes should have a $CAP or $UPPER proc (A30) in the Searchproc (or the user would have to remember to enter values for uppercase indexes in uppercase only). (If an A30 is specified in a Searchproc, it should precede any A14 or A11 truncated search rules.)

For compound indexes, if this secure-switch is specified, then the elements being indexed should have a $CAP or $UPPER proc (A30) as an Inproc rule, should be numeric elements, or should be passed with A169:1 (which prevents forcing to uppercase on passing).

Note that specifying this secure-switch has no effect on processing of element values against criteria in an ALSO command or WHERE clause. If it is necessary for such criteria to be "case sensitive" then the data element (or a corresponding redefined virtual element) should have TYPE=HEX specified; WHERE and ALSO criteria are not forced to uppercase for such element-types. Declaring the element TYPE=HEX has the effect of defining the element as non-character, even if it still is character. (Note: HEX is equivalent to BITS as an element type.)

B.9.3.17  Secure-Switch 17

Secure-switch 17, which affects ANDs in index searches, is similar to secure-switch 7 for AND NOT and OR. It forces SPIRES to treat ANDs as if they were TANDs; in other words, SPIRES will compare both the pointers and qualifiers when ANDing pointer groups together.

The switch is meant for use only when all indexes of the subfile are guaranteed to have qualifiers.

B.9.3.18  Secure-Switch 18

Secure-switch 18 cuts off general SELECT access to the subfile, making it available only via subgoal access (e.g., $LOOKSUBF procs, subgoal processing, phantom elements). The subfile cannot be selected, nor can records be batched into it in SPIBILD.

B.9.3.19  Secure-Switch 19

Secure-switch 19 cuts off all subgoal access (e.g., $LOOKSUBF procs, subgoal processing, phantom elements) to the subfile, making it available only through SELECT commands. This is a complementary switch to SSW 18. [See B.9.3.18.]

B.9.3.20  Secure-Switch 20

Secure-switch 20 (SSW 20) will indicate to SPIRES that all OR operations are to act just like TOR, which means all unique combinations of pointer groups occur in the result. AND NOT will operate as before, which means it will eliminate all pointer groups from the 1st set that have pointers matching at least one pointer group in the 2nd set (1st AND NOT 2nd). AND will be done in a very special manner. The result set will contain all combinations of pointer groups from both sets when the pointer matches at least once in both sets, otherwise the pointer groups are eliminated. Here is an example of this special AND, called XAND:

      1st set      2nd set       result set
        PTR = 24;    PTR = 30;     PTR = 24;
         QUAL = X;    QUAL = A;     QUAL = X;
        PTR = 24;    PTR = 24;     PTR = 24;
         QUAL = A;    QUAL = K;     QUAL = K;
        PTR = 10;    PTR = 15;     PTR = 24;
         QUAL = F;    QUAL = G;     QUAL = A;
        PTR = 7;     PTR = 12;     PTR = 7;
         QUAL = R;    QUAL = J;     QUAL = Y;
        PTR = 7;     PTR = 7;      PTR = 7;
         QUAL = H;    QUAL = Y;     QUAL = R;
        PTR = 3;     PTR = 7;      PTR = 7;
         QUAL = S;    QUAL = R;     QUAL = H;

Although AND normally reduces a result set, this new kind of AND can increase the set, since it logically does a TOR of all pointer groups for which the pointer matches at least once in both source sets. Therefore, $RESULT may be larger, but $RESCNT (the count of unique pointers) will remain the same or decrease. In the example above, $RESCNT for the 1st set was 4, while $RESCNT for the result is just 2. But all combinations of pointer groups for PTR=24 and PTR=7 are in the result (1st AND 2nd).

Note that this new AND (XAND with SSW 20) will take longer to perform since pointers need to be compared, and complete pointer groups also need to be compared.
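
The XAND behavior described above can be modeled in a few lines. The following Python sketch is an illustration only, not SPIRES code. It assumes that a pointer group can be represented as a (pointer, qualifier) pair and that "all combinations" means the union of unique pointer groups for every pointer found in both sets, which is what the example table shows. The names xand and rescnt are hypothetical.

```python
def xand(first, second):
    """Model of the SSW 20 AND: keep every unique pointer group whose
    pointer occurs at least once in both sets (an assumption drawn from
    the example table, not from SPIRES internals)."""
    common = {ptr for ptr, _ in first} & {ptr for ptr, _ in second}
    result = []
    for group in list(first) + list(second):
        if group[0] in common and group not in result:
            result.append(group)
    return result


def rescnt(groups):
    """Count unique pointers, as $RESCNT is described as doing."""
    return len({ptr for ptr, _ in groups})


# The two source sets from the example table.
first = [(24, "X"), (24, "A"), (10, "F"), (7, "R"), (7, "H"), (3, "S")]
second = [(30, "A"), (24, "K"), (15, "G"), (12, "J"), (7, "Y"), (7, "R")]
result = xand(first, second)
```

In this model the result holds six pointer groups, all for pointers 24 and 7, while the unique-pointer count drops from 4 (first set) to 2.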

Here are the major implementation features of AutoTAND, the operator-selection logic that applies under SSW 20.

If Secure-Switch 20 is set, and A6:16 through A6:30 are used to identify "classes" in searching, then SPIRES does the following with search requests:

a. Each search value defines a "mnemonic/class" code. For search values that don't use A6 classes, the code is 0. For search values that do use A6 classes, the code is a combination of "mnemonic" and "class number".

b. An internal processing stack maintains either "mnemonic/class" or "class/bit" codes. A "class/bit" code has just a single bit set for a single "class number", and no "mnemonic/class" ever matches any "class/bit" or combination of "class/bits". Entries are added to the processing stack by search values.

c. Each logical operator "processes" the last two entries in the internal processing stack. The last entry of the two is defined by the term to the right of the logical operator, and the first entry is defined by the term to the left of the logical operator. The result of the logical operation is to drop the last entry and replace the first entry by some logical combination of the original two entries. If the two entries match exactly, the result is the first entry, and the logical operator is not altered (AND remains AND which implies it will be processed as XAND with Secure-switch 20 set).

If the two entries don't match, then "mnemonic/class" entries are converted to "class/bit" entries (if needed), and the result entry depends upon the logical operator and bit patterns of the two "class/bit" entries as follows:

If the logical operator is OR or TOR, then the resulting entry is 0 if either "class/bit" entry is zero, otherwise it is the or'd bit pattern of the two "class/bit" entries.

If the logical operator isn't OR or TOR, then if bits match in both "class/bit" entries, the result is the and'd bits of the two "class/bit" entries (matching bits only) and the logical operator is changed to a T-operator (TAND or TANDNOT). But, if there are no matching bits, the result is 0 if the logical operator is already a T-operator, otherwise the result is the or'd bit pattern and an AND operator is treated as XAND. Finally, if the operator is some form of NOT, the result is set back to the original first entry (before conversion).

d. When a search command is finished being analyzed, the internal processing stack contains only one entry: the one corresponding to the final result. This information is remembered and is used as the initial entry for an iterative search.

Under AutoTAND logic (SSW 20), note that OR'ing only defines a combination class when both sides contribute classification bits. If either side is zero, the or'd class is assumed to be zero. This has an effect on any subsequent AND'ing. AND's which have class bits are TAND'd when there are matching bits from both sides. OR's can only contribute bits when both contribute, so TAND'ing only occurs when everything contributes bits. Otherwise, XAND occurs, and the bit pattern becomes the or'd bit pattern of both sides, yielding non-zero bits if either side contributes bits.

The following example may help shed some light on the situation. Assume four search terms:

  1.  NAME       (0, does NOT contribute bits).
  2.  AUTH       (bit 1)
  3.  AUTH.CLASS (bit 1)
  4.  MJR        (bit 4)

  -> find name garcia or auth.class 7
  -> find auth.class 7 or name garcia

  ; The result bit-pattern is zero because NAME
  ; on either side of the OR contributes 0.

  -> find auth.class 7 or mjr D10
  -> find mjr D10 or auth.class 7

  ; The result bit-pattern is 5 (1 or 4, 4 or 1).
  ; Both sides contribute bits, thus the or'd pattern.

  -> and auth gra

  ; If done with either result involving NAME, this
  ; is done with an XAND and has a bit-pattern of 1.

  ; If done with either result involving MJR, this
  ; is done with TAND and has a bit-pattern of 1.

  -> find name smith and name wesson

  ; Done with an XAND, but the result bit-pattern is
  ; zero because NAME contributes 0 from both sides.
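
The combination rules in step (c) can be traced with a small model. The Python sketch below is an interpretation of the rules as stated in this section, not SPIRES code. It assumes a stack entry can be represented as a (mnemonic, bits) pair, with a mnemonic of None for "class/bit" entries and for search values that use no A6 class. The names term, combine and T_FORM are hypothetical.

```python
# Non-OR operators and the T-operator each one becomes when class bits match.
T_FORM = {"AND": "TAND", "AND NOT": "TANDNOT",
          "TAND": "TAND", "TANDNOT": "TANDNOT"}


def term(mnemonic, bits):
    """Stack entry for one search value; values without A6 classes get code 0."""
    return (mnemonic, bits) if bits else (None, 0)


def combine(left, op, right):
    """Apply one logical operator to the last two stack entries.
    Returns (result_entry, operator_actually_used)."""
    if left == right:                      # exact match: operator not altered
        return left, ("XAND" if op == "AND" else op)
    a, b = left[1], right[1]               # compare as class/bit patterns
    if op in ("OR", "TOR"):
        bits = (a | b) if a and b else 0   # zero if either side is zero
        return (None, bits), op
    if a & b:                              # matching bits: use a T-operator
        result, used = (None, a & b), T_FORM[op]
    elif op in ("TAND", "TANDNOT"):        # already a T-operator, no match
        result, used = (None, 0), op
    else:                                  # no match: AND is treated as XAND
        result, used = (None, a | b), ("XAND" if op == "AND" else op)
    if "NOT" in op:                        # NOT forms restore the first entry
        result = left
    return result, used


# The four search terms from the example above.
NAME = term("NAME", 0)
AUTH = term("AUTH", 1)
AUTH_CLASS = term("AUTH.CLASS", 1)
MJR = term("MJR", 4)

name_or_class, _ = combine(NAME, "OR", AUTH_CLASS)   # find name ... or auth.class ...
class_or_mjr, _ = combine(AUTH_CLASS, "OR", MJR)     # find auth.class ... or mjr ...
after_name = combine(name_or_class, "AND", AUTH)     # and auth ... (NAME result)
after_mjr = combine(class_or_mjr, "AND", AUTH)       # and auth ... (MJR result)
two_names = combine(NAME, "AND", NAME)               # find name ... and name ...
```

Under these assumptions the model gives the same bit patterns and operators as the annotated example: 0 and 5 for the two OR searches, XAND with pattern 1 and TAND with pattern 1 for the subsequent AND, and XAND with pattern 0 for the NAME-only search.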

B.9.3.21  Secure-Switch 21

Secure-switch 21 alters the way SPIRES constructs the "-Result: <n> Records" message users see after issuing a successful FIND command. Instead of "n" coming from the value in the $RESULT variable, it will come from the $RESCNT variable when secure-switch 21 is set. Similarly, the displays from the SHOW RESULT and SHOW RESULT HISTORY commands are also affected.

Note, however, that a TYPE or DISPLAY ALL command, which would normally eliminate any pointers in the result that point to records that have been removed or that otherwise result in some error, will not eliminate those pointers when SSW 21 is set; neither $RESULT nor $RESCNT will change.

This switch is useful only when a subfile has indexes with qualifiers, which can produce misleadingly high result counts, as reflected in $RESULT. In such situations, $RESCNT is more accurate because it actually scans the search result and counts unique records. Because that takes more work, SSW 21 can make searches more expensive.

B.9.3a  The SHOW SSW and SET SSW Commands

You can see what secure-switches are in effect for a selected subfile by issuing the SHOW SSW command:

Those shown are the secure-switches in effect for the account which has the subfile selected. Unlike most SHOW commands, the SHOW SSW command may not be prefixed by IN ACTIVE in order to place the information in the active file.

The $SSW function may also be used to see whether a particular switch is in effect. See the manual "SPIRES Protocols" for more information about it; online, EXPLAIN $SSW FUNCTION.

The file owner and those with master privileges to the file can change the secure-switches in effect for the duration of the session with the SET SSW command:

where "ssw-number" is a valid secure-switch. If ON is used, the secure-switch is turned on; if OFF is used, the named switch is turned off; if neither is used, ON is assumed. This command can be useful for testing the effects of various secure-switches without having to change the file definition first.

Any changes made will be in effect only for the account under which the commands were issued and only while the subfile remains selected. To make permanent changes to the secure-switches, you must change the file definition.

B.9.4  Security for Individual Elements and Indexes

Previous sections of this chapter have shown how the Subfile section of the file definition can be used to provide subfile-level security. For example, you use the ACCOUNTS statement to specify exactly which accounts can select a subfile. And, in combination with secure-switch 3, for example, you can use the ACCOUNTS statement to specify which accounts can update records in a subfile. [See B.9.2, B.9.3.3.]

But you can also use the Subfile section to control access to individual elements and indexes. For example, you might have a subfile of personnel records accessible to a group of accounts that contains salary information that only a few of those accounts are allowed to see, and even fewer are allowed to update. This section, B.9.4 and its subsections, will discuss methods you can use to provide security for individual elements and indexes. [The Subfile section can also be used to provide security for specific records. This might be done using a combination of Userprocs and the SUBCODE and ACCOUNTS statements. [See C.11.3.]]

The contents of this section are outlined below:

B.9.4.1  Views: Element Security Defined in Packets

If you are responsible for a subfile, you may want one group of users to have no restrictions on the elements in each record; they can see and make changes to all the data. On the other hand, if the subfile contained sensitive data, e.g., grade elements in a school's registration subfile, you might want to let some accounts see the grade elements but not update them. And a third group might need access to the other elements in records of the subfile but have no need to see or update the grade data.

Each group of users needs to have a separate "view" of the goal records; the first group has an unrestricted view of all the elements, while another group's view lets its users see all but update only some of the elements. The third group's view lets its users see and update only some of the elements.

SPIRES lets you define views, such as the three described above, in the record definition. The two views that limit users might look like this:

Though each of these two views specifies only a single element, they actually define a view of all the elements in the record-type. That is, in both of these views, all elements except GRADE can be seen and updated, since only GRADE has any restrictions placed on it.

You connect these views to particular user accounts in the Subfile section of the file definition, like this:

Most users (PUBLIC) will see the records through the HIDE.GRADE view -- that is, the GRADE element is hidden from them for both seeing and updating. Account TR.OUT, with the SEE.GRADE.ONLY view, can see the value of the GRADE element but not update it. Account SH.ARK has the least restrictive view of all; it has no view restrictions since no view is assigned to that account. [See B.9.4.3 for details on how specific commands are affected by various view restrictions.]

The VIEW Statements in the Record Definition

A view definition appears under the RECORD-NAME statement for the appropriate record in the file definition. Though it is compiled as part of the file's characteristics, it has no effect at all unless a VIEW-NAME statement naming it appears in the Subfile section of the file definition.

Specifically, a view definition can have the following statements, all of which are optional:

The MODIFY-VIEW statement is described in the next section. [See B.9.4.2.] The others are described below, though not quite in the same order shown above:

The "element.list" consists of a list of one or more element names, separated by commas or blanks, to a maximum length of 3000 characters. Each of the statements whose value is an element list can be multiply occurring, so the 3000-character limit would not pose a problem if you had to list hundreds of elements.

You can specify structure names too, if you want to assign a particular view category to all the elements in the structure (or those elements in the structure not named in another category) at once. Some other special forms for element specification are available too. Structures in view definitions and other advanced topics are discussed in the following section. [See B.9.4.2.]

The VIEW-NAME Statement in the Subfile Section

The VIEW-NAME statement has the same syntax as the VIEW statement:

Its value is chosen from the list of views defined in the definition of the subfile's goal record. It is placed in the Subfile section under the ACCOUNTS statement, as shown in the example above. Only one view may be specified per account list. The VIEW-NAME statement is optional; if it is omitted for a group of accounts, no view restrictions will exist for those accounts. [See B.9.4.4 to see how they might instead be affected by CONSTRAINT and NOUPDATE statements.]

You cannot specify both a view and any of the statements that refer to priv-tags (except for NOSEARCH) in the Subfile section. The same file, and even the same record-type, may have both priv-tags and views defined in them, but only one method of element security may be declared in the Subfile section.

In the view definition itself, if no DEFAULT statement is coded, the default security level for elements not covered by the HIDDEN, SEE-ONLY and UPDATE statements depends on which of these three statements were coded. The default value is:

Remember, though, that the defaults discussed here apply when you do not use the DEFAULT statement -- you can use the DEFAULT statement to override (or confirm) the default. Coding a DEFAULT statement is recommended, to ensure the proper default for unassigned elements.

Another recommendation is that you keep the element lists in the HIDDEN, SEE-ONLY and UPDATE statements as short as possible, for efficiency's sake. For example, if all but three elements should be hidden, it is best to name the three exceptions in an UPDATE statement, and code the DEFAULT as HIDDEN, rather than name all the hidden elements in a HIDDEN statement.

For many applications, which just need to protect a couple of elements from particular groups of users, the above material will be more than sufficient when defining views. The next section covers some advanced features of the view facility.

B.9.4.2  Advanced Features of the View Facility

This section is divided into several discussions:

     1. Allowing records to be updated but not added or removed
     2. Assigning security levels to structural elements
     3. Changing one view into a similar one: MODIFY-VIEW
     4. View definitions in an EXTDEF record
     5. Alternate ways to list elements in view statements

1) Allowing Records to be Updated but not Added or Removed

If you use an element view to make a non-slot record key a "see-only" element, you prevent users from being able to add new records or remove existing ones. If it is the only see-only element, then the users can make changes to the rest of the record as desired with MERGE and UPDATE commands, but they cannot remove, add, dequeue or unqueue any records. [See B.9.4.3.]

A slot record-key, if declared "see-only", will prevent users from removing, dequeuing, or unqueuing any records, but adds are still allowed, because the slot key is assigned as part of the Inclose processing.

On the other hand, if you make any required element hidden, with no Inclose processing to supply default values, then users will not be able to add or remove records just as described above.

2) Assigning Security Levels to Structural Elements

If you name a structure in a view element-list, then all the elements within that structure are assigned that security level. For example, given the following elements in a record definition:

If a view is defined with COURSE as hidden, then COURSE.NUMBER and GRADE are hidden. If you want only GRADE to be hidden, just code "HIDDEN = GRADE".

On the other hand, suppose you want all the elements within QUARTER to be "see-only" except for those in COURSE, which should be hidden:

In essence, naming a structure in a view element-list is a shortcut way of naming all the elements within it; to say that a structure is hidden is to talk about its contents, not the container. This concept is important to understand in some situations.

For example, using the same record-type as above, suppose you want the elements YEAR and QUARTER.NUMBER to be hidden, while COURSE.NUMBER and GRADE should be see-only. Here are two different view definitions that produce the same view:

Both views have the following effect:

The SHORT view definition shows that when all but a few elements in a structure are to be assigned a particular level, it may be easier to assign that level to the structure as a whole and list the exceptions in another statement, rather than list all the elements for that level, as was done in the LONG view.

In sum, listing a structure in (say) a HIDDEN statement does not mean that the structure itself is "hidden"; instead, it means that all elements inside it will be hidden, unless they are changed by other statements. In the SHORT example above, the elements in QUARTER were first described as hidden, but then the SEE-ONLY statement changed the elements in COURSE to see-only.
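
The equivalence of the SHORT and LONG approaches can be checked with a small model. The Python sketch below is an illustration only. It assumes the sample record-type nests COURSE inside QUARTER as the discussion above implies, that statements are applied in the order listed so that later ones override earlier ones, and that unassigned elements default to UPDATE. The names STRUCTURES, leaves and resolve are hypothetical.

```python
# Hypothetical model of the sample record-type: structure -> its members.
STRUCTURES = {
    "QUARTER": ["YEAR", "QUARTER.NUMBER", "COURSE"],
    "COURSE": ["COURSE.NUMBER", "GRADE"],
}


def leaves(name):
    """Expand a structure name into the simple elements it contains."""
    if name not in STRUCTURES:
        return [name]
    found = []
    for member in STRUCTURES[name]:
        found.extend(leaves(member))
    return found


def resolve(statements, default="UPDATE"):
    """Apply (level, names) statements in order; naming a structure
    assigns the level to every element inside it."""
    levels = {element: default for element in leaves("QUARTER")}
    for level, names in statements:
        for name in names:
            for element in leaves(name):
                levels[element] = level
    return levels


short_view = resolve([("HIDDEN", ["QUARTER"]),
                      ("SEE-ONLY", ["COURSE"])])
long_view = resolve([("HIDDEN", ["YEAR", "QUARTER.NUMBER"]),
                     ("SEE-ONLY", ["COURSE.NUMBER", "GRADE"])])
```

Both calls yield the same mapping: YEAR and QUARTER.NUMBER hidden, COURSE.NUMBER and GRADE see-only.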

The Overall Security Level for a Structure

There are uses for terms such as "hidden structures" and "see-only structures", but it is important to realize that these labels are not determined absolutely by a statement such as HIDDEN, but by the final status of the structure after all the statements of the view definition have been established. That is, in the example above, we would not call QUARTER a hidden structure, since there are elements within it that are not hidden at all.

So here is a list of the types of structures that a view can establish, followed by a discussion of how the types affect users differently:

Here is another sample view, shown with its effects on the sample record-type:

The COURSE structure is see-only, because it contains a combination of see-only and hidden elements. On the other hand, QUARTER is a partial-update structure, because it contains an update element (e.g., YEAR) and a "non-update" structure (COURSE).

These distinctions can be important when you are trying to determine the effects of a view on individual users. [See B.9.4.3.]

3) Changing One View into a Similar One: MODIFY-VIEW

The MODIFY-VIEW statement lets you make minor changes to another view without having to respecify the entire view. For example, here are two views, where one modifies the other:

In this particular case, the two views are identical, except for the COURSE.NUMBER element, which is HIDDEN in PRIMARY but SEE-ONLY in SECONDARY.

The MODIFY-VIEW statement is part of the view definition in the record definition. Its syntax is:

where "view.name" is the name of another view defined for the same record-type.

When compiling a view that includes a MODIFY-VIEW statement, SPIRES first sets the levels for the HIDDEN, SEE-ONLY and UPDATE statements listed in the view named in the MODIFY-VIEW statement. Then it sets the levels for the HIDDEN, SEE-ONLY and UPDATE statements listed in the rest of the current view definition. Then it sets the default, using the DEFAULT statement in the current view definition, if there is one; otherwise, it uses the DEFAULT statement in the view named in the MODIFY-VIEW statement.

For INPROC-REQ and OUTPROC-REQ statements, all the elements named in all occurrences of these statements, either in the current view or in the view named in the MODIFY-VIEW statement, will be restricted.
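
The compilation order just described can be sketched as follows. This Python model is an illustration only, not SPIRES code. It represents each view as a dictionary keyed by statement name, and it assumes that a modified view may itself modify another view. The GRADE entry and the DEFAULT value in the sample data are hypothetical; only the treatment of COURSE.NUMBER comes from the PRIMARY and SECONDARY example above. The name compile_view is also hypothetical.

```python
def compile_view(view, views):
    """Return (levels, default, inproc_req, outproc_req) for one view."""
    levels, default = {}, None
    inproc_req, outproc_req = set(), set()

    # 1. Start from the view named in MODIFY-VIEW, if any.
    base_name = view.get("MODIFY-VIEW")
    if base_name is not None:
        levels, default, inproc_req, outproc_req = compile_view(
            views[base_name], views)

    # 2. Statements in the current view override the base view's levels.
    for level in ("HIDDEN", "SEE-ONLY", "UPDATE"):
        for element in view.get(level, []):
            levels[element] = level

    # 3. The current DEFAULT wins; otherwise the base view's is kept.
    default = view.get("DEFAULT", default)

    # 4. INPROC-REQ and OUTPROC-REQ lists accumulate across both views.
    inproc_req = inproc_req | set(view.get("INPROC-REQ", []))
    outproc_req = outproc_req | set(view.get("OUTPROC-REQ", []))
    return levels, default, inproc_req, outproc_req


VIEWS = {
    "PRIMARY": {"HIDDEN": ["COURSE.NUMBER"], "SEE-ONLY": ["GRADE"],
                "DEFAULT": "UPDATE", "OUTPROC-REQ": ["GRADE"]},
    "SECONDARY": {"MODIFY-VIEW": "PRIMARY", "SEE-ONLY": ["COURSE.NUMBER"]},
}
primary = compile_view(VIEWS["PRIMARY"], VIEWS)
secondary = compile_view(VIEWS["SECONDARY"], VIEWS)
```

In this model SECONDARY inherits everything from PRIMARY, including the default and the OUTPROC-REQ restriction, and differs only in the level of COURSE.NUMBER.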

4) View Definitions in an EXTDEF Record

You can place a record's view definitions in a record of the EXTDEF subfile if you want. Doing so could be advantageous if record-types in several different files belonging to you (or even in the same file) have the same record definition (or at least the same element names). Rather than maintaining the same view definitions in several different file definitions, you can keep them in a single EXTDEF record.

You can place the view definitions into an EXTDEF subfile record belonging to your account. Then, you tell SPIRES where the view definitions are by adding the EXT-VIEW statement to the appropriate record-definition section of the file definition:

where "gg.uuu.extdef.record" is the ID of the EXTDEF record. Remember, the EXTDEF record must be your own; you cannot name EXTDEF records belonging to other accounts. More than one may be named. In the record definition, the EXT-VIEW statement comes just before any view definitions you want to code there.

All the views in an EXTDEF record must be for the same record-type. SPIRES will issue error messages if it finds names of elements not in the current record-type it is compiling. So if you want several record-types in the same file to have view definitions that are stored in the EXTDEF subfile, you should put the definitions into separate EXTDEF records by record-type. [See C.10.5.]

5) Alternate Ways to List Elements in View Statements

The HIDDEN, SEE-ONLY, UPDATE, INPROC-REQ and OUTPROC-REQ statements all specify elements to be affected by their control. Some special ways of listing elements can simplify the coding of these elements.

If several different elements in different structures all have the same element name (e.g., PHONE.NUMBER), you can identify each one individually by using the "structural path" form of the name:

On the other hand, if you want to specify that all elements of that name should be protected similarly, precede the element name with the "@" symbol, e.g.:

All elements named PHONE.NUMBER, regardless of what structures they are in, will be hidden in that view. That technique also works for floating structures.

If you need to identify several individual elements in one structure and you would need to use the "structural path" forms of their names to identify them, a special form using parentheses is available to you. For example, suppose PHONE, AGE and WEIGHT are elements in the CHILD structure while other structures also have the same elements. To hide these three elements in the CHILD structure only, you can code either of the following:

Other structures can be nested inside; be sure to get the parentheses matched up properly though!

B.9.4.3  Specific Effects of the View Facility

The effects of element views are in most cases quite obvious, but are occasionally subtle too. Naturally, if you declare an element to be hidden in one view, you know the element's values cannot be changed through that view. But if the UPDATE command completely replaces the old version of a record with a new one, wouldn't an update of such a record cause any hidden elements to be discarded? Or do hidden elements mean an UPDATE command is not allowed?

This section will answer those kinds of questions as it discusses the effects that element views have on various SPIRES commands. In particular:

- for output commands, such as TYPE and DISPLAY:

Occurrences of any hidden elements will not appear in the output, but update and see-only elements will.

- for ADD commands:

- for TRANSFER commands:

- for UPDATE commands:

- for MERGE commands:

SPIRES treats hidden and see-only elements the same for MERGE commands as it does for UPDATE commands (see above). MERGE still differs from UPDATE in its handling of the "update" elements: UPDATE discards all the update elements from the old record and adds the new ones from the input data. MERGE, on the other hand, merges the new data into the old data of the "update" elements.

- for ADDUPDATE commands:

SPIRES will treat the record as described above for ADD commands or for UPDATE commands, depending on whether the record turns out to be an add or an update.

- for REMOVE commands:

You can remove records in a subfile that defines hidden or see-only elements if and only if those records contain no occurrences of the see-only and hidden elements.

- for UNQUEUE and DEQUEUE commands:

You cannot issue either UNQUEUE or DEQUEUE commands if the goal record has either hidden or see-only elements in your view of the subfile.

- for SHOW ELEMENT commands:

For commands that list the elements in a record-type, such as SHOW ELEMENTS and SHOW ELEMENT NAMES, hidden elements and hidden structures will not be shown to the user at all. See-only and update elements and structures, and the non-hidden portions of partial-update structures will be shown.

- for other commands and functions:

Any command in which you name an element (e.g., "TYPE element-list") will fail if you name a hidden element; you will be told that you have named an "invalid mnemonic". Functions in which you name an element, such as $ELEMTEST, may return only a null value if you name a hidden element.

B.9.4.4  Priv-Tags and the CONSTRAINT and NOUPDATE Statements

Another way to handle element security, mutually exclusive with element views [See B.9.4.1.], is with "priv-tags" assigned to elements, combined with CONSTRAINT and NOUPDATE statements coded in the Subfile section of the file definition.

The CONSTRAINT and NOUPDATE statements, associated with a given group of accounts that can select the subfile, specify "priv-tag numbers" of those elements to be affected. Any element with the matching priv-tag number is assigned the particular degree of security identified by the statement.

The syntax of each statement is shown below:

What are "priv-tags"? These are numbers that the file definer assigns to elements in the record or linkage definitions by coding a PRIV-TAG statement. These numbers may then be referenced by CONSTRAINT, NOUPDATE, INPROC-REQ or OUTPROC-REQ statements in privilege group specifications. [See B.9.4.5 for information about INPROC-REQ and OUTPROC-REQ.]

Each element in a record-type may be assigned a single priv-tag number. The PRIV-TAG statement appears in the element definition:

where "n" is a single integer from 1 to 63. If the element is a slot key, you must code the priv-tag number on the SLOT statement:

where "n" is a priv-tag integer from 1 to 9 (the slot key cannot have a priv-tag of two digits).

Only one number can be assigned to each element. Any positive integers from 1 to 63 can be chosen (except for the slot key, as noted above), and they do not have to be in sequence. Several elements may share the same priv-tag number. This is usually done to reduce the number of tags that must be specified in the CONSTRAINT and NOUPDATE statements, whenever the elements have identical restrictions placed on them.

Since CONSTRAINT and NOUPDATE apply only to priv-tags in the selected goal record, priv-tag numbers in one goal record can duplicate priv-tag numbers in other goal records in the same file.

To use priv-tags to make an element invisible to a user, the element must have both CONSTRAINT and NOUPDATE priv-tags applied. If a user can see an element but not update it, then only NOUPDATE need be specified. A simple (i.e., non-structured) data element never has CONSTRAINT specified without NOUPDATE.
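
The way CONSTRAINT and NOUPDATE combine for a simple element can be summarized in a short model. The Python sketch below is an illustration only, not SPIRES code. The function name element_access and the tag numbers in the usage lines are hypothetical; an element with no priv-tag is represented by None.

```python
def element_access(priv_tag, constraint, noupdate):
    """Classify a simple (non-structured) element for one privilege group.

    priv_tag   -- the element's priv-tag number, or None if it has none
    constraint -- priv-tag numbers named in the CONSTRAINT statement
    noupdate   -- priv-tag numbers named in the NOUPDATE statement
    """
    constrained = priv_tag in constraint
    frozen = priv_tag in noupdate
    if constrained and frozen:
        return "hidden"        # invisible: both restrictions apply
    if frozen:
        return "see-only"      # visible but not updatable
    if constrained:
        # Per the text, CONSTRAINT alone is valid only for structures.
        raise ValueError("CONSTRAINT without NOUPDATE on a simple element")
    return "update"            # unrestricted


# Hypothetical tags: 5 on one element, 1 on another, none on a third.
hidden_case = element_access(5, constraint={5}, noupdate={5, 1})
see_only_case = element_access(1, constraint={5}, noupdate={5, 1})
open_case = element_access(None, constraint={5}, noupdate={5, 1})
```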

Here are two elements, SALARY and NAME, that have PRIV-TAG values specified:

Let's now define two privilege groups for a subfile PROFESSORS with a goal record called PROF, providing varying levels of access to those two elements. Note how two privilege groups are defined for a single subfile simply by repeating the privilege-group structure key, GOAL-RECORD. A third privilege group must select the subfile under a different name, and has access to a different explanation.

Note that only one method of element security may be used in a subfile: either the priv-tag method, using the CONSTRAINT, NOUPDATE, INPROC-REQ or OUTPROC-REQ statements, or the view method discussed in the previous sections. Both views and priv-tags may be defined in the same record-type, but the Subfile section of the file definition may only refer to one of the methods.

Priv-tag values are somewhat more difficult to assign when elements being controlled by CONSTRAINT and NOUPDATE are elements in a structure. In such cases, the structure itself must always have a priv-tag value. If any element in a structure has CONSTRAINT and NOUPDATE applied to it then the restriction on the structure itself is always the same as the least stringent level of restriction on the elements.

The following table indicates what restriction should be assigned to an element whose TYPE=STR, depending upon the minimum and maximum restrictions defined for elements that are part of the structure. In this table,

        - 0 indicates that the element is not restricted; that is,
          neither CONSTRAINT nor NOUPDATE is specified.
        - 1 indicates that the element has only CONSTRAINT specified.
        - 2 indicates that the element has only NOUPDATE specified.
        - 3 indicates that the element has both CONSTRAINT and
          NOUPDATE specified.

    Maximum Restriction in Structure

       0      1       2      3
     ---------------------------
    |  0  |   1   |   1   |  1  |   0
     ---------------------------
          |   1   |   1   |  1  |   1   Minimum Restriction
           ---------------------           in Structure
                  |   2   |  2  |   2
                   -------------
                          |  3  |   3
                           -----

Thus, if one element in a structure has no restrictions applied to it (a minimum restriction of 0), and another element has CONSTRAINT and NOUPDATE (a maximum restriction of 3), then the entire structure must have CONSTRAINT applied to it (a restriction of 1); this is the least stringent restriction in the structure. (This is the only situation in which CONSTRAINT is specified without NOUPDATE.) If all elements in a structure have both CONSTRAINT and NOUPDATE applied to them (a minimum and maximum restriction of 3), then the structure itself must have a priv-tag of CONSTRAINT and NOUPDATE. If structures exist within structures, then begin assigning restrictions to the elements deepest within structures and work toward the record level.
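
The table can also be expressed as a small function, which may be convenient when working outward through nested structures. The Python sketch below is an illustration only, not SPIRES code. It uses the 0 to 3 restriction codes defined above; the function name structure_restriction and the sample restriction values are hypothetical.

```python
def structure_restriction(member_restrictions):
    """Restriction code (0-3) for a structure, given the codes of the
    elements directly inside it (at least one member is expected)."""
    lowest = min(member_restrictions)
    highest = max(member_restrictions)
    if lowest == 0 and highest > 0:
        return 1               # mixed with unrestricted: CONSTRAINT only
    return lowest              # otherwise the minimum restriction applies


# Work from the deepest structure outward (hypothetical values):
# an inner structure whose two elements are both fully restricted ...
inner = structure_restriction([3, 3])
# ... nested in an outer structure with two unrestricted elements.
outer = structure_restriction([0, 0, inner])
```

Here the inner structure gets a restriction of 3 and the outer structure a restriction of 1, matching the first case discussed above.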

It is often desirable to prevent members of a privilege group from adding or removing records, but to allow updating access to other elements in a record. This can be done simply by putting a restriction of NOUPDATE on the key of the record.

Note that if a goal record has any NOUPDATE restrictions for the user:

SPIRES treats elements that are hidden because of CONSTRAINT and NOUPDATE statements exactly as it treats hidden and see-only elements; for more information, see the earlier section on the effects of hidden and see-only elements. [See B.9.4.3.]

B.9.4.5  The INPROC-REQ and OUTPROC-REQ Statements

SPIRES by default allows a subfile's users to retrieve both the internal and external form of an element. For example, a dynamic element can be defined to provide the internal form of another element:

Sometimes, though, a file owner needs to prevent users from retrieving the internal form of an element, often for security reasons. In other words, the file owner wants to prevent users from bypassing processing rules.

The statements INPROC-REQ and OUTPROC-REQ, in conjunction with either element views or element priv-tags, can be used to require that element values be processed through the INPROC or OUTPROC rules coded in the file definition respectively. If used with element views, the statements are coded in the view definition, and name the elements to be affected. [See B.9.4.1.] If used with priv-tags, they are coded in the Subfile section of the file definition, like CONSTRAINT and NOUPDATE. [See B.9.4.4.] Either way, you can specify them for different elements and for different accounts.

Whether in view definitions or in the Subfile section (for the priv-tag method), the INPROC-REQ and OUTPROC-REQ statements share the same syntax:

where the list of element names, separated by commas or blanks, can be up to 3000 characters long, and each statement may occur multiple times in the view definition.

where each "priv-tag" is an integer from 1 to 63 corresponding to the priv-tag value assigned to elements to be affected by these statements. [See B.9.4.4.]

The impact of the INPROC-REQ statement is on input formats. Label groups in an input format may include INPROC statements, which are executed instead of the file definition's INPROC rules for the element being processed. But if the element is under the control of the INPROC-REQ statement, the INPROC statement in the format label group will not be executed. Instead, SPIRES sets the $SKIPEL flag, which means that the remainder of the label group will be executed, but any PUTELEM or REMELEM statement in the label group will be ignored. In other words, a format cannot override the file definition's INPROC rules for an element under the control of INPROC-REQ.

The OUTPROC-REQ statement affects output formats similarly. When an element is under the control of the OUTPROC-REQ statement, any OUTPROC statement in a format label group processing that element will be ignored. Instead, the rest of the label group is skipped, unless the DEFAULT statement is coded in the label group to specify default processing. [See the manual "SPIRES Formats" for more information on the effects of INPROC-REQ and OUTPROC-REQ on formats.]

The OUTPROC-REQ statement blocks access to the internal form of an element by other techniques as well. Specifically, if an element is controlled by OUTPROC-REQ:

Please refer to the documentation of these features for more specific information.

B.9.4.6  Index Security: Priv-Tags and the NOSEARCH Statement

Occasionally you may want or need to hide an index from users of your subfiles. This happens more frequently than you might think; you will probably want to hide the sub-index part of a name index, for example. [See B.7.11, B.8.5.] Or, if you have an element that is hidden from a group of users but is also indexed, you will probably want to hide the index from the users as well, so that they cannot use the index or see values in it (with the BROWSE command). [See B.9.4.1.]

To hide one or more indexes, you must use priv-tags in combination with the NOSEARCH statement coded in the linkage sections of the indexes you want to hide. The NOSEARCH statement's syntax is:

where each "n" is an integer from 1 to 63 matching "priv-tag numbers" of indexes, sub-indexes or qualifiers that are not to be searched by accounts in the privilege group.

Unless you are hiding different indexes in several combinations from several different user groups, you will probably use only one or two priv-tag numbers, which you assign in the linkage section of each index, sub-index or qualifier you want to hide:

where "n" is an integer from 1 to 63. Each index can have only one priv-tag number, but the same number may be (and most commonly is) shared between several indexes.
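The relationship between priv-tag numbers on indexes and the numbers named in NOSEARCH can be pictured with a short Python model. This is a sketch for illustration only, not SPIRES code: the function name "visible_indexes" and the index name TITLE are hypothetical, and the model assumes that an index is hidden exactly when its priv-tag number appears in the NOSEARCH statement that applies to the account.

```python
def visible_indexes(index_priv_tags, nosearch_tags):
    """List the indexes an account may search or see.

    index_priv_tags: dict mapping index name -> priv-tag number (1-63),
                     or None when the index has no priv-tag.
    nosearch_tags:   priv-tag numbers named in the NOSEARCH statement that
                     applies to the account's privilege group.
    """
    hidden = set(nosearch_tags)
    return [name for name, tag in index_priv_tags.items() if tag not in hidden]


indexes = {"NAME": None, "FIRST.NAME": 1, "TITLE": None}
print(visible_indexes(indexes, [1]))  # FIRST.NAME is hidden from this group
print(visible_indexes(indexes, []))   # no NOSEARCH numbers: every index is visible
```

Several indexes may share one priv-tag number, in which case a single NOSEARCH number hides them all.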

As suggested earlier, the most common situation involving NOSEARCH is when a file owner wants to hide the sub-index portion of a name index from users. The NOSEARCH statement will be used to prevent users from seeing the sub-index that contains the first names of the indexed people; searchers don't need to see the sub-index because the Searchproc action A38 ($PNAME proc) will search it automatically.

In other words, the Searchproc lets the user type a command such as "FIND NAME JOHN SMITH" and converts it (more or less) into the command "FIND NAME SMITH @ FIRST.NAME JOHN", so the user does not have to type the more complicated command. Because the sub-index FIRST.NAME is hidden, the user does not see it listed in the SHOW INDEXES display, even though he or she actually does use it when the name index is used.

Here is the way the linkage section for the personal name index might look:

Since you probably want to hide the sub-index from all accounts, the Subfile section of the file definition might look like this:

If you had several sub-indexes (or any other indexes) you wanted to hide from the view of these accounts, you could assign them all the priv-tag value of 1, or give them different integer values and add those values to the NOSEARCH statement.

B.9.5  The SUBGOAL Statement

The SUBGOAL statement allows members of a privilege group to access data in record-types of the file other than the goal record record-type using SPIRES formats. Such access may be necessary when indirect record-access operations such as table-lookup become too complex for action 32 ($LOOKUP proc) to handle. [See C.5.7, C.5.8.]

The SUBGOAL statement is also necessary when a user other than the file owner writes a format for a file and that format contains an action 32 ($LOOKUP) -- except for the file owner, only users of the format who have been given access to the accessed record-type by the SUBGOAL statement in the file definition can successfully use the format. (Accounts given SEE file access [See B.9a.] can also use the format.) The SUBGOAL statement is also needed for the $LOOKSUBG function to work from any accounts other than the file owner's and those given SEE access.

Phantom structure access, another way of allowing access to other record-types in a file, does not require that the SUBGOAL statement be specified. A file owner establishes phantom structures in the file definition, so the file owner has other controls (CONSTRAINT, etc.) for blocking their use.

SPIRES formats that use the SUBGOAL processing feature cannot be used except by accounts that have been given SUBGOAL privileges. This privilege is granted by coding a SUBGOAL statement in the subfile-section for the account or group of accounts to be privileged. The SUBGOAL statement specifies the RECORD-NAME values of the record-types that may be accessed. The SUBGOAL statement can specify a maximum of ten record names that can be used by a privilege-group.

For example, to give groups W7 and W8 the ability to access INDX1 and INDX2 using SUBGOAL processing, the following subfile-section would be coded:

Whenever a member of either group W7 or W8 selects the PROFESSORS subfile, he or she may invoke a format to access records in INDX1 or INDX2; the PUBLIC would not be able to invoke such a format.

Note that if you are using the subfile's goal record for subgoal processing (that is, the same record-type is both the goal record and the subgoal), you do not need to code the record name of the goal record in the SUBGOAL statement -- it is automatically available for subgoal processing. Also, it does not count as one of the ten record-types allowed, as described above.
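The rules just described (at most ten record names, with the goal record always available and not counted) can be summarized in a small Python model. It is a sketch under stated assumptions and not SPIRES code: the function name "may_use_subgoal" and the goal record name REC01 are hypothetical, and the model ignores the file owner and accounts with SEE file access.

```python
MAX_SUBGOAL_NAMES = 10  # limit on record names in one SUBGOAL statement


def may_use_subgoal(record_name, goal_record, subgoal_names):
    """Return True if a privilege group may use record_name as a subgoal.

    goal_record:   RECORD-NAME of the subfile's goal record (always allowed).
    subgoal_names: record names listed in the group's SUBGOAL statement.
    """
    if len(subgoal_names) > MAX_SUBGOAL_NAMES:
        raise ValueError("a SUBGOAL statement may name at most ten record-types")
    return record_name == goal_record or record_name in subgoal_names


# Groups W7 and W8 from the example above; REC01 is a hypothetical goal record name.
print(may_use_subgoal("INDX1", "REC01", ["INDX1", "INDX2"]))
print(may_use_subgoal("REC01", "REC01", ["INDX1", "INDX2"]))
print(may_use_subgoal("INDX3", "REC01", ["INDX1", "INDX2"]))
```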

B.9.6  The SELECT-COMMAND Statement

SELECT-COMMAND is a multiply occurring element that specifies commands to be executed whenever the subfile is selected. Hence, it can be used to set a COMPXEQ or XEQ subfile for use with the subfile, for example, or to issue informational messages, such as "subfile news", to the user selecting the subfile.

Here is an example of some SELECT-COMMAND statements at the end of a SUBFILE section:

When the subfile is selected,

Then, if you issue the SHOW XEQ command for confirmation:

All commands (of any system) are valid, including XEQ FROM to invoke a protocol. Moreover, because the SELECT-COMMANDs are treated like a protocol, you can use labels and JUMP commands as SELECT-COMMANDs. [See "SPIRES Protocols" for more information about protocols.]

Note that there is a limit of 10 occurrences of SELECT-COMMAND in each ACCOUNTS section of the SUBFILE section. Note also that even if ECHO is set, the "select commands" are not echoed at the user's terminal; only the file owner knows for sure what commands have been issued.

The select commands are executed at the very end of the selection process, after the default format is set (if a FORMAT statement is coded).

B.9.7  The PROGRAM Statement

The file owner can limit the programs in which a user can select the subfile by adding the PROGRAM statement to the subfile section of the file definition. The most common application of this statement is to make a subfile available for selection only in Prism.

The syntax of the statement is:

where "program" is a program chosen from the following list:

 SPIRES   PRISM   SPIBILD   FASTBILD   FOLIO

If the user attempts to select the subfile but is not in one of the programs specified in the PROGRAM statement, an error will occur and the select will fail. (The currently executing program is named in the system variable $PROGRAM.) If desired, the user can issue the EXPLAIN command to get an explanation of the error message.

Of course, if the PROGRAM statement is omitted, the subfile can be selected from any of the programs listed here that are available to the user.
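The selection check described in this section amounts to a simple membership test. The Python sketch below only illustrates that logic and is not SPIRES code; the function name "can_select" is hypothetical, and the case-insensitive comparison is an assumption.

```python
def can_select(current_program, program_statement=None):
    """Return True if the subfile may be selected from current_program.

    current_program:   the value of $PROGRAM, e.g. "SPIRES" or "PRISM".
    program_statement: programs named in the PROGRAM statement, or None
                       when the statement is omitted.
    """
    if not program_statement:
        return True  # no PROGRAM statement: any available program may select
    allowed = {name.upper() for name in program_statement}
    return current_program.upper() in allowed


print(can_select("PRISM", ["PRISM"]))   # selection succeeds
print(can_select("SPIRES", ["PRISM"]))  # selection fails with an error
print(can_select("SPIBILD"))            # PROGRAM omitted: selection succeeds
```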

B.9.8  The SUBCODE Statement

The SUBCODE statement in the Subfile section of a file definition allows you to establish a "name" for the particular set of users who have access to the subfile with the specific constraints, secure-switches, etc. defined there. This name is placed in the system variable $SUBCODE, which can be used by Userprocs, protocols and formats.

Here is the syntax of the SUBCODE statement:

where "value" is a string up to 32 characters in length. SPIRES will convert the value to uppercase, so case is not significant. The variable $SUBCODE is set to "value" when the user selects the subfile; the user cannot change the value, though he or she can examine it.

The statement is best considered as a label for the accounts given a particular view of a subfile as defined in the rest of the Subfile section. For example, suppose a file definition has this Subfile section:

If a format, protocol or Userproc needs to determine whether the current user is in the first group or the second, it can check the value of $SUBCODE:

If the user is one of the second set, the above command would cause the subfile log to be displayed. That is considerably cleaner than this:

B.9a  Defining File Access Privileges

By default, certain commands relating to file management can be used only by the file owner. Some of the commands provide information only (SHOW FILE STATUS or SHOW SUBFILE TRANSACTIONS, for example); others can affect the status of the file (SET AUTOGEN, or PROCESS in SPIBILD); still others can completely destroy the file (ZAP FILE). The FILE-PERMITS section, a record-level structure in the file definition, gives the file owner the ability to allow other accounts to issue various sets of these commands. For instance, other accounts may be able to process the file [See B.10.12.] or copy the entire file to their own account, and so forth.

The FILE-PERMITS structure appears at the end of the file definition, following the SUBFILE section. It contains two elements. The first is called FILE-ACCESS and is the key of the structure. The second is a multiply-occurring ACCOUNTS element.

FILE-ACCESS can have one or more of these values: SEE, UPDATE, PROCESS, MASTER and COPY. (If more than one value is used, they must be separated by commas.) The five values are described below:

            SHOW FILE ACTIVITY
            SHOW FILE BLOCK
            SHOW FILE STATUS
            SHOW RECORD OVERHEAD
            SHOW RECORD RECOVERY
            SHOW SUBFILE TRANSACTIONS
            SHOW SUBFILE LOG
            STATUS
            ATTACH
            GENERATE LOAD

The ACCOUNTS statement can specify one or more accounts which are to be given the access specified in the previous FILE-ACCESS statement. The same forms of the account that were valid for the ACCOUNTS statement in the SUBFILE section are valid here (e.g., GA.JNK, GG...., or PUBLIC). [See B.9.2.]

Here is a sample of a FILE-PERMITS structure which would appear at the end of a file definition:

All accounts have SEE access to the file. All accounts in group GG, as well as account GA.SPI, have MASTER access to the file. Two accounts, GG.WCK and GG.RLG, have COPY access to the file.
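The effect of the sample just described can be modeled in a few lines of Python. This is an illustrative sketch, not SPIRES code: the names "FILE_PERMITS", "account_matches" and "file_access" are hypothetical, the permit table is reconstructed from the description in the preceding paragraph, and the model does not assume that one FILE-ACCESS value implies another.

```python
# Permit table reconstructed from the description of the sample above.
FILE_PERMITS = {
    "SEE": ["PUBLIC"],
    "MASTER": ["GG....", "GA.SPI"],
    "COPY": ["GG.WCK", "GG.RLG"],
}


def account_matches(pattern, account):
    """Match an ACCOUNTS value (PUBLIC, a group form like GG...., or gg.uuu)."""
    pattern, account = pattern.upper(), account.upper()
    if pattern == "PUBLIC":
        return True
    if pattern.endswith("...."):              # group form: every account in the group
        return account.split(".")[0] == pattern[:-4]
    return pattern == account                 # a single, specific account


def file_access(account, permits=FILE_PERMITS):
    """Return the set of FILE-ACCESS values granted to an account."""
    return {
        access
        for access, patterns in permits.items()
        if any(account_matches(p, account) for p in patterns)
    }


print(sorted(file_access("GG.WCK")))
print(sorted(file_access("GA.SPI")))
print(sorted(file_access("XX.YYY")))  # a hypothetical account outside group GG
```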

There are, of course, still some privileges that only the file owner has. Only the file owner can see and update the file definition record in the FILEDEF subfile for the file. Also, only the file owner can issue the COMPILE and RECOMPILE commands. Destroying the file, using the command ZAP FILE, can only be done by the file owner as well.

Note that, unlike subfile access privileges [See B.9.], which take effect as soon as the file definition is added or updated in the FILEDEF subfile, the privileges defined in the FILE-PERMITS structure are compiled into the file definition characteristics, and so take effect only when the file definition is compiled.

B.10  SPIRES File Management

The contents of this chapter have been incorporated into the manual "SPIRES File Management".

"File management" is about how to take care of a SPIRES file that already exists. It is often the next subject after "file definition" in that it covers, for example, how to load data into a subfile (i.e., add multiple records to a subfile at one time), how to add new indexes or "remove" indexes no longer needed, how to ensure that the file is using its storage space most efficiently, etc.

B.11  Logging Database Use in SPIRES

The owner of a SPIRES file has the option of having SPIRES log certain information about the access and use of a database. The file definer can specify the kind of information that is to be logged and the manner in which it is to be stored. This is done with the LOG and STATISTICS statements.

The file definer can also inform SPIRES that use of a database is to be charged for (as distinct from merely logged), supplying the rates which apply for particular uses of the database. In this case, SPIRES will inform a user who attempts to select a chargeable database of the rates that apply. SPIRES will then log those charges incurred during individual users' sessions.

Details about creating and using a log appear in the manual "SPIRES File Management", chapter 4.3. Online, EXPLAIN FILE LOG.

B.12  Immediate Indexing

By default, updating a subfile does not affect its indexes. Added and updated records, as well as removal requests, are placed in the deferred queue -- not until the file is processed in SPIBILD are the indexes updated to reflect the deferred queue data. However, in some applications, immediate indexing of the deferred queue data may be desirable. A file owner may request that one or more indexes of a subfile be updated immediately when a subfile transaction would affect them, meaning that those indexes will always be synchronized with the goal records.

Only two extra steps in the file definition procedure are needed to request immediate indexing. First, the IMMEDIATE statement must be added to the linkage section of each index for which the feature is desired. Second, appropriate ORVYL data set security permits must be set to allow accounts updating the subfile to change the data sets containing the indexes. These steps will be discussed in the next section. [See B.12.1.]

Considered by themselves, the costs of "immediate" indexes are higher than those of "non-immediate" indexes. For example, processing that would occur overnight when rates are cheaper will occur whenever a subfile transaction involving immediate indexing occurs. Also, immediate indexing handles records individually, whereas overnight processing handles multiple records all at one time, which is more efficient. On the other hand, if users must search the deferred queue fairly regularly because the data they need may not be indexed, the benefits of immediate indexing (in terms of both searching costs and user convenience) may outweigh any efficiency disadvantages. Efficiency considerations are discussed in more detail later in this chapter. [See B.12.2.]

It is important to realize that the file owner declares indexes to be "immediate" on an individual basis -- in any given subfile with, say, a dozen indexes, any number from 0 to 12 of them could be immediate indexes, and the rest would not be. Thus, immediate indexing does not mean that a file's deferred queue is unnecessary. Quite to the contrary, SPIRES will still add the subfile transaction to the file's deferred queue, awaiting tree processing by SPIBILD. However, in addition, SPIRES will update the tree copies of any immediate indexes at the time of the subfile transaction. All Global FOR operations against the deferred queue (FOR UPDATES, DISPLAY ALL, for instance) will still work as they did before, since the transactions are still placed in it.
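The division of labor described above can be pictured with a small Python model. It is a sketch for illustration only and does not reflect how SPIRES or SPIBILD are implemented; the class name "Subfile", its methods, and the index names NAME and SUBJECT are hypothetical.

```python
class Subfile:
    """Toy model: some indexes are immediate, the rest wait for processing."""

    def __init__(self, immediate_indexes, other_indexes):
        self.immediate = set(immediate_indexes)
        self.trees = {name: set() for name in list(immediate_indexes) + list(other_indexes)}
        self.deferred_queue = []  # every transaction is still placed here

    def add_record(self, key, values):
        """values: dict mapping index name -> value to be indexed for this record."""
        self.deferred_queue.append((key, values))
        for name, value in values.items():
            if name in self.immediate:
                self.trees[name].add((value, key))  # tree copy updated at once

    def process(self):
        """Stand-in for overnight processing: update the non-immediate indexes."""
        for key, values in self.deferred_queue:
            for name, value in values.items():
                if name not in self.immediate:
                    self.trees[name].add((value, key))
        self.deferred_queue.clear()


sf = Subfile(immediate_indexes=["NAME"], other_indexes=["SUBJECT"])
sf.add_record(1, {"NAME": "SMITH", "SUBJECT": "CHEMISTRY"})
print(sf.trees["NAME"], sf.trees["SUBJECT"], len(sf.deferred_queue))
sf.process()
print(sf.trees["SUBJECT"], len(sf.deferred_queue))
```

In the model, the immediate index reflects the new record as soon as it is added, while the other index catches up only when the queue is processed.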

B.12.1  Coding Immediate Indexes

To request immediate indexing of one or more indexes, the file owner must follow two steps. First, the IMMEDIATE statement must be added to the linkage section of the particular index. Second, appropriate ORVYL data set permits may need to be set.

The IMMEDIATE Statement

The IMMEDIATE statement is placed at the end of the linkage section for a given index. Its syntax is simply

Below is a sample individual linkage section:

The IMMEDIATE statement here applies only to the index record-type ZIN04. An IMMEDIATE statement must be added to the linkage section of each index for which immediate indexing is desired.

Immediate indexing may be coded for any index in any SPIRES file, with the following exception in goal-to-goal passing: if a goal record-type (say, REC01) passes data via immediate indexing to an index record-type (REC02), that index record-type may not pass the same data to another index record-type (REC03) via immediate indexing. In other words, if REC01 passes data immediately to REC02, REC02 cannot itself pass the same data immediately to any other record-type. The SPIRES compile process may not detect this error, but will warn you if a record type is both an immediate index and a record type that passes data immediately. [See B.12.3 for other information on immediate indexing with goal-to-goal passing.]

If you are adding immediate indexing to a file that already exists, you must recompile the file definition after adding the IMMEDIATE statement. The deferred queue of the file must be empty.

In addition to the other ORVYL data sets, a "filename.CKPT" data set will be created when the file definition is compiled (or recompiled, after IMMEDIATE is added). [See B.5.5.] That data set will be used instead of the SYSTEM.CHKPT data set for file processing. [See C.6.24.] (Be aware of the difference in the abbreviations of "checkpoint" in the data set names.)

ORVYL Data Set Permits

When you compile a file definition, SPIRES creates ORVYL data sets in which the file data will be stored. Appropriate permits are set to allow or forbid changes to particular data sets by users. By default, for example, the DEFQ data set of a file (i.e., the ORVYL data set "ORV.gg.uuu.filename.DEFQ") is set for "public WRITE" access, which lets SPIRES place subfile transactions in the deferred queue for users with subfile access. All other data sets for the file, such as RES or MSTR, are set for public READ, allowing the data to be read but not changed by other accounts, since all updating activity goes to the deferred queue.

If, however, you define an index as immediate, SPIRES will set public WRITE on all newly created data sets for the file that might be modified during immediate-index activity. Usually that means these data sets:

SPIRES will not change the permits for data sets that already exist. Therefore, if you are recompiling a file definition, or changing an index that already exists into an immediate index, you will need to set the permits properly (i.e., to WRITE access) yourself.

Giving the data sets public WRITE access does not necessarily mean the public can update your SPIRES file, particularly if you have used the SET SPIFILE command in ORVYL. [See C.6.22.] With the SET SPIFILE command, you can completely control who has access to your file by means of the file definition.

However, if limitations of the SET SPIFILE command get in your way, the ORVYL commands SET CLP, SET PERMIT and SET NOCLP may be used to change the permits for those data sets if you want. The procedure to follow for each data set is:

where "data.set.name" is the name of the ORVYL data set whose permits are to be changed. Various forms of "account" are allowed, including "gg.uuu" for a specific account, "GROUP gg" for all members of a group, and PUBLIC.

For example, to allow users in group FF to add records to be immediately indexed to the subfile STAFF in the file FF.JNK.EMPLOYEES, the file owner account would issue the following commands:

And that process would be repeated for other data sets, as described above.

B.12.2  Efficiency Considerations for Immediate Indexes

There are two types of data processing to consider when determining the impact of immediate indexing in regard to efficiency and costs: the maintenance of the data base (in terms of index building from user updates) and the general use of the data base (e.g., for searching). In both regards, immediate indexing incurs additional overhead not incurred by non-immediate indexing. In neither regard, however, is the overhead likely to be overwhelming to the point of discouraging you from using immediate indexes as long as you have good reasons to.

(This discussion is not meant to discourage you from using immediate indexing, but to tell you about its overhead, albeit minor in most cases. For applications where immediate indexing is clearly required, this overhead would probably be considered insignificant or irrelevant. Probably the best attitude is to consider immediate indexing as a small luxury that can become expensive if overused.)

In regard to file maintenance, changing an index, say, twenty times a day for individual records is less efficient than changing it one time each day (overnight at cheaper rates) for twenty records. However, since SPIRES is using only very small amounts of CPU-time with either method, the difference on a record-to-record basis for a single immediate index will probably be minor (probably under a penny). Of course, the difference would be multiplied for each immediate index that is updated, which emphasizes again that you should only make indexes immediate indexes when users are counting on them to absolutely reflect the data base in its current state.

Be aware also that the costs of updating immediate indexes are incurred by the user updating the subfile, not by the file owner, who would pay for the overnight SPIBILD processing of the index. In that sense, immediate indexing is cheaper than non-immediate indexing, at least to the file owner. (Remember that SPIBILD processing is still necessary for files, even those having all immediate indexes. SPIBILD will not do anything to the immediate indexes, however.)

Immediate indexing causes additional overhead for searching as well as updating. SPIRES knows that non-immediate indexes will not be changing from one search request to the next and can make some assumptions for efficiency that it cannot make for immediate indexes. Also, more ORVYL data sets (such as "filename.DEFQ" or "filename.RES") are "locked" during a subfile transaction involving immediate indexing, which means that data cannot be read from them during that time. Again, only fractions of a real-time second are usually involved, but it is overhead, nonetheless.

B.12.3  Immediate Indexing and Goal-to-Goal Passing

Earlier in this chapter, you were told that an immediate index record-type could not always pass data to another immediate index record-type, which is one form of "goal-to-goal passing". [See B.12.1.] Goal-to-goal passing is discussed in detail later in this manual. [See C.12.] Other effects of immediate indexing may arise in files with goal-to-goal passing -- they are discussed in this section.

For most immediate index record-types, SPIRES updates the tree version of the index. Suppose that goal record-type REC01 passes immediately to index record-type REC02. REC02 also serves as a goal record-type; it passes data to REC03, but not immediately. Thus, records in REC02 can be updated independently of those in REC01, though presumably pointer information in REC02 records pointing to REC01 would be left alone by the user. (NOUPDATE might be specified for the pointer group through the PRIV-TAG facility.)

The immediate indexing of REC01 data into REC02 has the following effect: if REC01 passes index data to a record in REC02 that is in the deferred queue, the tree copy of the REC02 record is not changed. The new data from REC01 affects the deferred queue copy of the REC02 record. On the other hand, if REC01 passes index data to a REC02 record that is not in the deferred queue, the tree copy of the REC02 record is changed, which is the effect of immediate indexing on "normal" indexes.

What is the impact of index records being in the deferred queue rather than in the tree? For direct searches using the equality operator (but not truncated searches), SPIRES will check the deferred queue for the index record before checking the tree. This minor overhead (a minimum of one I/O per key-value in the search) is incurred by all immediate indexes, by the way. (For non-immediate indexes, SPIRES can assume that no index records are in the deferred queue and hence can always use the tree of the index record-type.)

For non-direct searches (e.g., range searches or truncated searches), SPIRES does not check the deferred queue, so any updated REC02 records that also had new index information from REC01 would not be included. Remember, though, that this situation only affects those REC02 records already in the deferred queue when the immediate indexing request appears -- otherwise, the immediate indexing will affect the tree copy of the REC02 record, which could be included in a non-direct search. You could build a separate, non-goal, immediate index for REC01 if range and truncated searches of record-type REC02 must reflect REC01 updates.
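The behavior described in the last few paragraphs can be traced with a small Python model. It is a sketch for illustration only, not a description of SPIRES internals: the function names and the sample keys ADAMS and BAKER are hypothetical, and the deferred queue is simplified to a dictionary holding the latest copy of each REC02 record.

```python
def pass_immediately(key, pointer, deferred_queue, tree):
    """REC01 passes index data to the REC02 record with the given key."""
    # If the REC02 record is in the deferred queue, only that copy changes;
    # otherwise the tree copy is changed, as for a "normal" immediate index.
    target = deferred_queue if key in deferred_queue else tree
    target.setdefault(key, []).append(pointer)


def direct_search(key, deferred_queue, tree):
    """Equality search: check the deferred queue before the tree."""
    if key in deferred_queue:
        return deferred_queue[key]
    return tree.get(key)


def range_search(low, high, tree):
    """Range search: only the tree is consulted."""
    return {k: v for k, v in tree.items() if low <= k <= high}


tree = {"ADAMS": [1], "BAKER": [2]}
deferred = {"BAKER": [2]}  # a user has updated the BAKER record in REC02

pass_immediately("ADAMS", 7, deferred, tree)  # changes the tree copy
pass_immediately("BAKER", 8, deferred, tree)  # changes the deferred queue copy

print(direct_search("BAKER", deferred, tree))  # sees the new pointer
print(range_search("A", "C", tree))            # BAKER's new pointer is missing
```

In this model the direct search returns the updated BAKER record, while the range search returns BAKER without the newly passed pointer, which is the situation the paragraph above warns about.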

The same rules apply to records in REC01 processed under the Global FOR command FOR INDEX, where the named index is REC02: deferred queue records in REC02 will be ignored.

It is worth remembering that the deferred queue will grow faster and larger when goal-to-goal passing and immediate indexing are combined, since both goal and index records are being added to the deferred queue. For most files, the deferred queue will only be receiving goal records. Remember too that records are only removed from the deferred queue when the file is processed -- each time a record is updated, even if it is updating a record already in the deferred queue, the record is added to the end of the deferred queue. This can be significant if the size of the deferred queue is limited.

C  Additional Facilities for the SPIRES File Definer

C.1  Recompile of an Existing File's Definition

C.1.1  The Function of the RECOMPILE Command

The syntax of the RECOMPILE command for file definitions is:

The SHARE option allows you to recompile the file definition even though the file is being used (e.g., a subfile is selected) by other users. The other options are also on the COMPILE command, and have the same meaning here. [See B.5.3.]

When you issue the RECOMPILE command, SPIRES will examine the FILEDEF subfile and find the latest copy of your file definition and compile it. Unlike the COMPILE command, however, RECOMPILE only causes the replacement of a previous MSTR file with a newly compiled one; no other ORVYL files are altered. [However, if the file doesn't currently exist, the RECOMPILE command is treated like a COMPILE command, compiling the file definition from scratch.]

The RECOMPILE command does not alter any of your file's data records on disk. But, if you make an incorrect change, you may make SPIRES look at the data in a different way. For example, if you add a binary-to-string conversion OUTPROC where none existed before, SPIRES will try to convert data already stored from the binary form it has been told to expect.

Because the RECOMPILE command may cause you to "lose" data already in your file, you should be able to "recover" from any recompilation errors. The steps for doing this are: 1) before you modify a file definition, save a copy of it on disk, so that you can go back to this definition days later if necessary; 2) if you find you have made an error, remove or dequeue any records added or updated since the recompile, then enter SPIBILD and process the file; 3) update your file definition in FILEDEF using the old version you had saved, and RECOMPILE; your file should now be back to its previous state.

Often, SPIRES will not complete a recompile when it detects that a change you have made will cause it to lose track of data already stored on disk. Be warned, however, that you can make many disastrous changes without SPIRES detecting a change in data structure and organization. Error messages for an invalid recompile are usually INITFILE errors such as S437 or S446. [See C.9.2.] With respect to recompiling an existing file, statements fall into these categories:

Statements in the three categories are listed in the three sections that follow.

C.1.2  Statements You Can Change, Add or Delete Anytime

The following are file definition statements that can be changed, added, or deleted without damaging the file (unless otherwise indicated, any change here requires you to recompile the file definition for the changes to go into effect):

FILE Level:

RECORD Level:

ELEMENT Level:

LINKAGE Level:

SUBFILE Level:

(Note: no recompilation is necessary; changes will take place as soon as the revised file definition is put back in the FILEDEF subfile.)

C.1.3  Statements You Can Sometimes Change, Add or Delete

The following are file definition statements that can be changed, added, or deleted under specified circumstances:

FILE Level:

RECORD Level:

ELEMENT Level:

LINKAGE Level:

C.1.4  Statements You Can Never Change, Add or Delete

The following are file definition statements that may not be changed, added or deleted once records have been built. [See B.5.8.] If you must change them, you will have to rebuild the file.

FILE Level:

RECORD Level:

ELEMENT Level:

C.2  [Currently not used]

This chapter was once the home to documentation about the THESAURUS command facility (feature, capability, function), which was removed (taken away, withdrawn, discarded, banished, given the shiv, repositioned) from SPIRES in mid-1991.

C.3  Synonyms

C.3.1  The Function of a Synonym

File definers often want to provide a way for searchers to find alternate search values or cross references. This is done with the SYNONYM capability.

Suppose you were defining a chemical abstracts database, and wanted to allow non-chemists to search an index built on the names of chemical compounds. It would be useful to give the novice a way to find out what the chemical name is for certain compounds with common names, such as salt, water, or aspirin. The following is a sample searching session which uses the SYNONYM facility as a sort of online dictionary and cross-reference tool:

It is quite easy to code the SYNONYM facility for particular indexes in a file; it only requires the addition of a single element to the record definition for the index record. The element must be at the record level (that is, it cannot be inside a structure), and is usually variable length and variable occurrence. When the SYNONYM command is issued, it names the key of a particular index record (such as SALT or WATER in the above example); SPIRES then lists at the terminal all the occurrences of the SYNONYM-type element that are in that record.

C.3.2  Defining a SYNONYM Index

The COMPOUND index record type used in the above example might have the following definition:

The name "COMPOUND-SYN" is not required; any name can be given to the synonym element, as long as "SYNONYM" is coded after it. The INPROC and OUTPROC actions are not necessary, but they do make it more convenient to add synonyms to records. A138:1 will sort the synonyms in ascending alphabetic order. A45 and A82 will simply string multiple synonyms out one after another when the file owner is adding or updating synonyms. It is also desirable, but not necessary, to put a priv-tag specifying CONSTRAINT on the POINTER element; this will make it impossible for the file manager to alter the pointers, which should never be changed except by SPIBILD processing.

C.3.3  Adding Synonyms to Index Records

Here is how a synonym might be added to an existing record, such as WATER:

The ATTACH command can only be used by the file owner, as described earlier. [See B.5.6.] If the file owner wants others to add or update synonyms, the index record would have to be made into a subfile by coding an appropriate subfile privileges section. The TRANSFER command is used to transfer the record whose key is to have the synonym attached. It might be a good idea to add synonyms for the H20 and HEAVY WATER records also, so that a SYNONYM command that named WATER, H20 or HEAVY WATER as its value would give the user the other two values as synonyms.

Suppose a record for HEAVY WATER did not previously exist in the COMPOUNDS index. In order to create a SYNONYM for HEAVY WATER, the following commands would be used:

C.4  Executable Elements: Protocols and TYPE=XEQ

A protocol is a series of SPIRES, WYLBUR, MILTEN and ORVYL commands that can be executed. Protocols can be kept in a WYLBUR library and then executed from the active file, or they can be stored in a protocols subfile.

Many SPIRES users are familiar with the PUBLIC PROTOCOLS subfile. Protocols in this subfile can be executed by commands such as:

SPIRES users who want to have their own protocol subfiles must define them, add them to the FILEDEF subfile, and compile them. The definition of a protocol subfile is simple enough that a protocol stored in the PUBLIC PROTOCOLS subfile can generate one for you. The file definition that will be added to FILEDEF by this protocol will look something like this:

The protocol, BUILD.PROTOCOLS.FILE, will prompt you for a subfile name, accounts that can select the subfile, and a bin number.

If you will frequently be using protocols to assist you in searching, updating, or reporting from a subfile, it is a good idea to define the protocol subfile as a record-type in the file you will be working with. For example, you may have a subfile of personnel data and a subfile of protocols to help you extract reports on salary data. It is very easy to make the protocols subfile just another record-type in the personnel data base. By combining the two separate subfiles in a single file definition, the process of selecting the two subfiles and accessing records in both of them is made more efficient in many cases.

As can be seen from the above file definition, a protocol record-type need contain only two elements. The first element must be the key of the record, and it must be defined in the REQUIRED section of the record definition. This element should usually have action 30 as an INPROC, to ensure that the key is always in uppercase. The second element must be in the REQUIRED or OPTIONAL section, and must be TYPE=XEQ.

TYPE=XEQ indicates to SPIRES that this is a "command" element, that is, an element that can be executed by the system as a command. The first element, the key, must be TYPE=CHAR; though this is the default if no other TYPE is specified or if no binary conversion processing rules are coded, it is important that the key be stored in its character form.

A TYPE=XEQ element can also be coded in other record-types that are not primarily protocol record-types. But if this is done, the rules for the key and the command element still must be followed: the key must be a REQUIRED character string and the first element of the record, and the command element must be the next element coded (either as REQUIRED or OPTIONAL) and be TYPE=XEQ.

Any record-type that has a TYPE=XEQ element coded in it must be one of the first five physical record-types defined in the file. This means that if there are more than five record-types defined (say a goal record, four index records, and a protocol record), and the COMBINE statement is not coded to combine some of the logical records into the same physical record, then the RECORD-NAME of the protocol record-type must be such that it is one of the first five physical record-types appearing in the file definition. Remember that SPIRES will reorder your record definitions into ascending alphabetical order by the RECORD-NAME you have specified.
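The ordering rule can be checked on paper before a definition is compiled. The following Python sketch is only an illustration and is not part of SPIRES: the function name and the record names are hypothetical, it assumes no COMBINE statement is coded (so that each logical record-type is its own physical record-type), and it uses Python's default string ordering as a stand-in for whatever collating sequence SPIRES applies to RECORD-NAME values.

```python
def xeq_record_position_ok(record_names, xeq_record, limit=5):
    """Return True if xeq_record falls within the first `limit` record-types
    once the names are put in ascending alphabetical order.

    Assumption: plain string ordering approximates the SPIRES reordering.
    """
    ordered = sorted(record_names)
    return ordered.index(xeq_record) < limit


# Hypothetical file: one goal record, four index records, one protocol record.
names = ["REC01", "ZIN01", "ZIN02", "ZIN03", "ZIN04", "ZPROTO"]
print(xeq_record_position_ok(names, "ZPROTO"))    # protocol record sorts sixth

# Renaming the protocol record so that it sorts earlier satisfies the rule.
renamed = ["PROTO", "REC01", "ZIN01", "ZIN02", "ZIN03", "ZIN04"]
print(xeq_record_position_ok(renamed, "PROTO"))   # protocol record sorts first
```

In the first case the protocol record would be the sixth record-type, so either its RECORD-NAME would have to change or some record-types would have to be combined.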

C.5  Indirect Record-Access: Action 32 and SUBGOAL Processing

C.5.1  The Function of Action 32

In general, action 32 allows the use of a record-type in a SPIRES file as a table. The key of the record-type must be the argument that will be used to look up values in the table, but the table can have several elements in it. In order to add keys and other elements to the table record-type, a subfile-privileges section must be defined that selects the table as the goal record (discussed in "Defining Subfile Privileges"). [See B.9.] If the key of the record-type is passed from an element in a goal record data set, then the table may contain goal record linkages in the form of pointers to goal records that contain the table's key as an element. Thus, the table can be a simple table of arguments and results, an index record with linkage via pointers to a goal record, a goal record of some complexity, or the same record as the record-type that contains the A32.

This brief "architectural" description of A32 is incomplete without a consideration of the various functional uses of A32 that are possible. To this end, this chapter will present various file definition problems and their solution using A32 for table lookup. Many situations arise in which information input or output from one goal record requires access to information in another goal record (either in the same record-type, or in another record-type); by considering the second record-type as a "table," you should be able to extrapolate from the examples presented in this appendix to more complex, less predictable record access operations using A32.

A32 is described as follows:

A32 :<NUM> ,<RECORD NUMBER> ,<ELEMENT>

A32 :P1 ,P2 ,P3

     Purpose:  VALUE REPLACEMENT BY ELEMENT
     Processing Rules:  SRCPROC, INPROC, OUTPROC

The value input to A32 is used as the key of a record in order to
access record-type P2; the input value and the key must be of the
same type.  If P1 is 3 then the value is replaced by the value of
singularly occurring element P3 of the record retrieved.   If  P3
is given a value of -1 and if P2 indicates a REMOVED record-type,
then the value is replaced by  a pointer to the record retrieved.
If no record is retrieved, the error flag is set on and the value
is unaffected.  If P1 is 2 then the value is unaffected  and  the
error  flag is set on if no record is retrieved.  If P1 is 1 then
the value is unaffected and the error flag is set on if a  record
is  retrieved.   The  element  P3  is  restricted to be the first
occurrence of the element within the record or structure.  P3  is
defined  as  either just an element number (record level) or as a
structure number followed by an @ symbol  and  then  the  element
number within that structure.

If 4 is added to any P1 value above, then the original  value  is
assumed  to  be  a  pointer  to  index  record  type P2.  This is
especially useful as an OUTPROC action on the pointer elements of
an index.

If 8 is added to any P1 value above, then the deferred queue will
not be examined for records of the accessed  record  type.   This
will save one I/O per transaction.

An alternate form of this rule is allowed in file definitions:

     A32 :<NUM> ,<RECORD NAME> ,<ELEMENT NAME>

which allows you to specify the name of the  record-type  and  an
element  within  that  record-type.   This form is not allowed in
formats or RECDEF definitions.  If a record name is used, then an
element name (not an element number) must be used; vice versa, if
a record number is used, then an element number must be used.

     Processing Rules:  PASSPROC

The value is used as a key of a record in order to access  record
type  P2.   The  value  is  replaced  by  the value of singularly
occurring element P3 of the record retrieved.  (This is the  same
as  a  P1  of 3 above; no other uses of this action as a PASSPROC
are allowed.)  If P1=0, then the retrieved  value  is  forced  to
upper  case; if P1=1, then the value is not forced to upper case.
If no value is retrieved, then no further pass processing for the
value takes place; no error flag is set on.  Either of the  above
forms, using record and element names or numbers, is permitted.

Note: The first record-type defined  is  record  number  1.   The
first   element   defined   is  element  number  0.   In  a  slot
record-type,  the  slot   number   is   element   0.    "Indirect
Record-Access,"  contains details and examples of the use of this
action.

Note carefully the limits of action 32. It can only be used to access the first occurrence of an element in a record or structure. If a value exists in some other occurrence of an element, or if a value exists in some other occurrence of a structure, then action 32 cannot be used. If such information must be accessed, the SUBGOAL capability is used. [See C.5.8.]
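The three basic P1 modes described above can be summarized in a short model. This Python sketch is an illustration only, not SPIRES code: the function name, the dictionary layout of the "table" and the sample values are all assumptions made for the example. The +4 (pointer) and +8 (skip the deferred queue) modifiers are stripped and otherwise ignored, and the PASSPROC form is not modeled.

```python
def a32(value, p1, table, element):
    """Model the basic P1 modes (1, 2, 3) of action 32.

    table maps a record key to a dict of element name -> list of occurrences
    (a hypothetical layout chosen for this sketch).
    Returns (new_value, error_flag).
    """
    mode = p1 & 3                 # drop the optional +4 and +8 modifiers
    record = table.get(value)     # the input value is used as the record key
    if mode == 1:                 # error flag set if a record IS retrieved
        return value, record is not None
    if mode == 2:                 # error flag set if NO record is retrieved
        return value, record is None
    if mode == 3:                 # replace value with an element of the record
        if record is None:
            return value, True    # value unaffected, error flag on
        return record[element][0], False   # first occurrence only
    raise ValueError("P1 must reduce to 1, 2 or 3")


# Hypothetical table record-type keyed by department code.
departments = {"7": {"NAME": ["PEDIATRICS", "CHILD HEALTH"]}}

print(a32("7", 3, departments, "NAME"))    # ('PEDIATRICS', False)
print(a32("99", 3, departments, "NAME"))   # ('99', True)
print(a32("7", 2, departments, "NAME"))    # ('7', False)
```

Note that the second occurrence of NAME in the sample record is unreachable through this model, which mirrors the first-occurrence restriction just described.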

C.5.2  Action 32: Problem One

Consider the following file definition problem:

A file is to be built of the faculty and staff members at a large medical center. There will be about three thousand records, and storage costs are to be kept to a minimum. Each of the personnel records will have a "DEPARTMENT" element, containing an indication of which one or ones of the thirty departments the faculty or staff member is associated with.

By building a "table" of department codes and department names, we can save a good amount of storage space. Allow the DEPARTMENT element to be input only as a code number from 1 to 30; take this code number and look up a record in another record-type, the table, that has the code number as its key. Put out a warning message if no record with that key can be found. Store only the code number on input.

On output, of course, the code number could be used to look up the full name of the department in the table, and replace the code with the full name; but this would not allow normal transfer/update processing, since a text string is being output, but only a code number is permitted as input.

The simplest solution to this dilemma would be to output only the code number for transfer/update operations, but look up the department name from its encoded value when doing formatted output for mailing lists, directories, and other publications.

A second solution would use one element to store and output the coded value; a second element that is never stored takes the coded value from the code-number element and looks up the corresponding text string on output. The element's value is converted to null on input, so that the text string it contains on output is not stored.

A third solution stores only a binary pointer to a text string in the table on input, then on output uses the pointer to access the text string stored in the table.

C.5.3  First Solution to Problem One

A file definition for the first solution would be coded as follows:

Action 32 is used as an INPROC in this example. The P1, P2 and P3 parameters specify that the action is used only to validate an input code. "AS32:2,2,0" can be dissected as follows:

Adding records to the table is quite simple once a subfile with the table record-type as its goal has been established. The subfile named TABLE MAINTENANCE is selected to add departments to the table. Note that a department's code must be added to the table subfile before a personnel record using that code is added to the faculty subfile, because of the "S" in AS32. Additions to the table are accomplished as follows:

C.5.4  Second Solution to Problem One

The second solution to the file definition problem requires only the addition of one element to the PERSON record. Remember that this solution uses the DEPARTMENT element to store and output the coded value, but uses a second element, here called TEXT, to look up the coded value in the table, replacing the code with the text value stored in the table. The value in the TEXT element is deleted on input. If DEPARTMENT is multiply occurring, then DEPARTMENT and TEXT must be singly occurring elements in a multiply occurring structure.
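The division of labor between the two elements can be pictured with a small model. The Python sketch below is an illustration under stated assumptions, not a SPIRES definition: the table contents, function names and record layout are hypothetical, and raising an exception stands in for whatever error handling the INPROC actually performs when a code is not in the table.

```python
# Hypothetical table record-type: department code -> department name.
DEPT_TABLE = {1: "ANESTHESIA", 2: "CARDIOLOGY"}


def input_person(name, department, text=None):
    """Input processing: validate the code and discard any TEXT value."""
    if department not in DEPT_TABLE:
        raise ValueError(f"no table record for department {department}")
    # TEXT is deliberately dropped, so only the code is ever stored.
    return {"NAME": name, "DEPARTMENT": department}


def output_person(stored):
    """Output processing: TEXT is derived from the stored code."""
    shown = dict(stored)
    shown["TEXT"] = DEPT_TABLE[stored["DEPARTMENT"]]
    return shown


stored = input_person("SMITH", 2)
shown = output_person(stored)
print(shown["TEXT"])    # CARDIOLOGY

# Transfer/update round trip: the displayed record, TEXT included, can be
# fed back through input processing without changing what is stored.
again = input_person(shown["NAME"], shown["DEPARTMENT"], shown["TEXT"])
print(again == stored)  # True
```

The round trip at the end is the point of the design: because TEXT is nulled on input, a record can be transferred and updated normally even though it displays a value that is never stored.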

For the second solution, the PERSON record would be redefined as follows:

In this record definition, A32 is used as both an INPROC and OUTPROC. Its use as an INPROC for the DEPARTMENT element is identical to that in the first solution; its use as an OUTPROC in TEXT is an example of A32 taking a value as the key of a record in a table and replacing the value with the value of an element in the retrieved table-record. "A32:3,2,1" functions as follows:

C.5.5  Third Solution to Problem One

The third solution to the faculty-staff department-code problem is quite different from the preceding two. In this case, the table will be made a REMOVED record-type. On input, the department name in its non-coded form is used as the key of a table entry; the name is then replaced by a locator (4 bytes in length) to its entry in the table, and then stored. On output, the locator is used to access the table; the locator is replaced by an element in the table record. Because the table must be a REMOVED record-type, it would be fairly expensive to use it also as an index record. Thus, the following file definition does not index the DEPARTMENT table as the previous definitions in this appendix have. Notice also that no "code list" is maintained; instead department titles must be input in strings identical to those in the table, otherwise no table record will be retrieved.
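The input and output halves of this approach can be modeled as follows. This Python sketch is only an analogy: the class and method names are hypothetical, and a list position packed into four bytes stands in for a real SPIRES locator, whose internal form is not described here.

```python
import struct


class RemovedTable:
    """Toy stand-in for a REMOVED record-type holding department names."""

    def __init__(self):
        self._records = []   # list position plays the role of the locator
        self._by_key = {}

    def add(self, name):
        self._by_key[name] = len(self._records)
        self._records.append(name)

    def to_locator(self, name):
        """Input: replace the name with a 4-byte locator, or fail."""
        if name not in self._by_key:
            raise KeyError(name)          # no table record retrieved
        return struct.pack(">I", self._by_key[name])

    def from_locator(self, locator):
        """Output: use the locator to fetch the name from the table."""
        (index,) = struct.unpack(">I", locator)
        return self._records[index]


table = RemovedTable()
table.add("DEPARTMENT OF PEDIATRICS")     # hypothetical table entry

locator = table.to_locator("DEPARTMENT OF PEDIATRICS")
print(len(locator))                       # 4
print(table.from_locator(locator))        # DEPARTMENT OF PEDIATRICS
```

A call such as `table.to_locator("PEDIATRICS")` fails in this model, which corresponds to the requirement that titles be input exactly as they appear in the table.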

The file definition for this third solution is as follows:

Since no coordination of codes and text values is used, table entry involves the input of only a single element for each department record; this element is the key of a record in the table. Table entry is as simple as this:

Let's examine the parameters specified for A32 in detail:

By adding 4 to the P1 parameter of A32, it becomes very useful as an OUTPROC on the pointer (TYPE = LCTR) elements of an index. If you ATTACH the index record as a goal record (see example below) and SET ELEMENTS for the key and pointer, you can display a record in the index as an indexed term followed by an element in the goal record that passed the term to the index. The following is a redefinition of the file definition used for the first solution to the department problem.

Let's examine the OUTPROCs on the table's POINTER element.

The P1 parameter of 7 on A32 says that the value is a locator to a REMOVED record-type. The P2 parameter of 1 says that the record-type is the first defined in the file. The P3 parameter of 0 indicates that the value used to access record-type P2 is to be replaced by the 0 element defined in that record, which is the slot number. A77 has a dual purpose: the retrieved binary slot number is converted to alphanumeric, and a check-digit is computed and appended to the slot number. Obviously, if SLOTCHECK were not specified for the goal record, A71:1,4 would be sufficient to convert the binary slot number. Remember that the check-digit is not stored with the slot number, but is appended on operations that display the slot number.

If the "COMMENT =" is taken off here and put on the OUTPROC above it, and the file definition recompiled, the table record can be used to produce a department by department list of the names of department members. The A32 will not retrieve the slot number as above, but will retrieve the NAME element, element number 1 in the goal record.

This file definition also contains the first example of the use of A32 in a SEARCHPROC. It is identical to the A32 coded for the INPROC of the element that passes to the table, the DEPARTMENT element. A32 as a SEARCHPROC can validate search requests to see that they do not request values that are not in the table. If such a validation is not performed, then a "-ZERO RESULT" will ambiguously occur both when a department has not been put in the table, and when no goal records have the search value as their department number.

C.5.6  Action 32: Problem Two

Consider a different file definition problem. Data has been received from a national agency containing the names and addresses of its members. You find that no attempt has been made to standardize the STATE element; that is, "California" appears as "Calif", "Ca", and "California". Almost all other states have several names. In order to make searching and sequencing on a STATE index a simple and accurate way of retrieving all members from a particular state, input must be standardized. In order to keep storage costs down, all forms of input to the STATE element must be reduced to the official postal service two-character abbreviations. Because you are printing convention badges, however, you want all abbreviations translated to the full state-name on output. A solution might be coded as follows:

C.5.7  Solution to Problem Two

Before adding members in California to the MEMBERS subfile, the following records must be added to the STATES subfile:

Thus, whenever CALIF, CA or CALIFORNIA (and their upper-lower case equivalents) are input, the INPROC AS32:3,2,0 looks the input value up in the table and changes it to the 0 element, "STORED." On output, only CA will ever be looked up, and it is changed to the number 2 element, "OUTPUT," by the OUTPROC A32:3,2,2. For this reason, OUTPUT need only occur on the records that have the two-character code as their key. Note that A32 coded on the SEARCHPROC allows searching on any of the input values, since all are changed to the canonical CA by the same kind of table lookup that is used for value translation on input.
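The three lookups can be pictured with a small model of the STATES table. This Python sketch is an illustration, not SPIRES code: the dictionary layout, the function names and the sample OUTPUT value are assumptions, and uppercasing the input stands in for whatever case handling precedes the A32 lookup.

```python
# Hypothetical contents of the STATES table record-type.
# Only the record keyed by the two-character code carries OUTPUT.
STATES = {
    "CA":         {"STORED": "CA", "OUTPUT": "California"},
    "CALIF":      {"STORED": "CA"},
    "CALIFORNIA": {"STORED": "CA"},
}


def inproc(value):
    """Input: replace any accepted spelling with the STORED element."""
    record = STATES.get(value.upper())
    if record is None:
        raise ValueError(f"no STATES record for {value!r}")
    return record["STORED"]


def outproc(stored):
    """Output: replace the stored code with the OUTPUT element."""
    return STATES[stored]["OUTPUT"]


# The search value goes through the same translation as input values.
searchproc = inproc

print(inproc("Calif"))                # CA
print(outproc(inproc("california")))  # California
print(searchproc("CA"))               # CA
```

Because every spelling is reduced to the same stored code, the index holds a single entry for the state and any accepted spelling can be used in a search.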

C.5.8  SUBGOAL Processing

In some cases, the limitation of action 32 to the first occurrence of an element in a record or structure proves to be too severe. Use of SUBGOAL processing removes this restriction; however, SUBGOAL processing can only be done via SPIRES formats.

SUBGOAL allows a format written for one subfile to retrieve all or part of a record in another subfile or record-type. There are no limitations on multiply occurring elements or structures. To use SUBGOAL processing, an account or accounts must be privileged to access data in a record-type other than the selected subfile's goal record. This is done when the file owner specifies the SUBGOAL statement in the subfile-privileges section of the file definition. [See B.9.5.]

C.6  Practical Techniques for File Definers and Managers

C.6.1  Introduction

This appendix is a potpourri of ideas, code, processes and commands that have been used to solve file definition and file management problems. There is no particular continuity maintained from one section to another, nor are the sections in any particular order; therefore, a descriptive section title has been highlighted to make browsing convenient. File definers and managers are welcome to suggest additions to this appendix, in the form of problems they have solved, or problems in search of a solution.

Don't worry if there is no continuity to the section numbers either. Sections come and go, usually leaving because they are expanded or moved into parts of other chapters or even other sections, or because they solve problems in ways that have become obsolete or archaic.

C.6.2  Proximity Searching: Information for File Developers

When you have goal records with textual data, it is common practice to create word indexes for the text elements. This works well for most searching situations, particularly when users are looking up records containing a particular word. But when they are looking for particular phrases, it may become desirable to provide a way to "lock together" the words in the phrase, to avoid finding records where the words in the search appear individually but not together.

For example, FIND TITLE MUSIC MAN might retrieve items with titles of THE MUSIC MAN; MUSIC MAKES THE MAN; THE WOMAN WHO HATED BACH'S MUSIC AND THE MAN WHO LOVED HER, etc. Sometimes then, the extra flexibility of word indexes when used with phrases becomes a hindrance rather than a benefit.

SPIRES provides a capability called proximity searching that can help with that problem as well as some others:

Descriptions of how proximity searching works from the user's point of view appear in the SPIRES manual "Searching and Updating". The subsections of C.6.2 will discuss what you need to include in a file definition to provide this feature, how the feature works, and some warnings about customizing it to your own needs. [See C.6.2.1, C.6.2.2.]

C.6.2.1  Proximity Searching: File Definition Requirements

Within a file definition, an index that supports proximity searching is defined as a combination of a word index and a personal-name index. It includes a SUB-INDEX structure that must occur at the record-level.

The Index Record Definition

The record definition for the proximity search index should be structured like this:

You may, and in some cases must, replace the words in lowercase with terms appropriate to your own file definition. The "pointer" at the end represents your standard PTR-GROUP element just as it would be defined in all other index records for that goal record-type.

You can choose either a 1-byte or 2-byte integer for the position number. If data elements you are indexing may have more than 255 words total being passed per record, be sure to choose 2-byte integers; otherwise, 1-byte will be adequate.

Just for the record, in case you are upgrading from a normal word index to a proximity search index, here is what the record definition for a normal word index usually looks like:

The difference then is the "proxstr" structure, with its position information.

The Linkage Section

Generally, two linkage sections are written for the proximity-search index. One is for proximity searching, and the other is to use the index as a standard word index. Only one of them, the proximity-searching linkage, actually handles passing; the other is coded with $NOPASS to prevent passing the element values twice.

Here is a template for how they would look. The first linkage section is for proximity searching, and the second provides a standard word-index interface to the same index, along with a sub-index that can be used in a special way described later:

Notes about the template:

Here is how a typical pair of index linkages might really look:

The sub-index entries generated by passing in the first linkage section represent the position within the element where the indexed word appeared. So, for instance, with the title TROUBLE IN TAHITI, the word TROUBLE would get a sub-index entry of 1, IN would get 2, and TAHITI would get 3.

How do excluded words affect the indexing? For instance, using the sample linkages above, the word IN in TROUBLE IN TAHITI would not be indexed. Still, SPIRES is careful enough to count the excluded words even if they are not actually indexed; in this example, TAHITI would still be assigned "3".
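The position numbering just described can be modeled in a few lines. The Python sketch below is an illustration only: the function name is hypothetical, and splitting on blanks stands in for the word-breaking rules that the real passing process applies.

```python
def word_positions(value, excluded=()):
    """Return (word, position) pairs for the words that would be indexed.

    Excluded words are not indexed, but they are still counted, so the
    positions of the words after them are unchanged.
    """
    excluded = {word.upper() for word in excluded}
    entries = []
    for position, word in enumerate(value.upper().split(), start=1):
        if word not in excluded:
            entries.append((word, position))
    return entries


print(word_positions("TROUBLE IN TAHITI"))
# [('TROUBLE', 1), ('IN', 2), ('TAHITI', 3)]
print(word_positions("TROUBLE IN TAHITI", excluded=["IN"]))
# [('TROUBLE', 1), ('TAHITI', 3)]
```

Keeping the count intact is what lets a search for adjacent words fail correctly on TROUBLE and TAHITI here, since their positions differ by two.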

Effects of Proximity Searching on Index Size

The most obvious effect is perhaps one of the least significant: in each index record, there will be an additional one- or two-byte position number for each position that a word appears in, across all the goal records. So, for instance, if 10 records have the word ANTWERP in position 1 of a word-indexed element, and 5 records have it in position 2, the ANTWERP index record will have two position numbers of one or two bytes each stored within it. That's not usually very much additional storage to consider.

More significantly, with a normal word index, each record usually contributes only a single pointer or pointer-group to a word index record. A title like CATCH AS CATCH CAN would add only one pointer for the title word CATCH even though it occurred twice in the record.

But with a proximity-search index, CATCH needs to get two pointers: one for the goal record for position 1, and one for position 3. Depending on the amount of repetition of words within the goal records, this can be a significant factor in the size of the index records.
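The difference in pointer counts can be seen by building both kinds of index over the same titles. This Python sketch is a simplified model, not the SPIRES storage layout: the function names are hypothetical, record numbers stand in for pointers, and a nested dictionary stands in for the position structure within an index record.

```python
def word_index_pointers(titles):
    """Normal word index: one pointer per word per goal record."""
    index = {}
    for record_number, title in enumerate(titles, start=1):
        for word in set(title.split()):
            index.setdefault(word, []).append(record_number)
    return index


def proximity_index_pointers(titles):
    """Proximity index: one pointer per word per position per goal record."""
    index = {}
    for record_number, title in enumerate(titles, start=1):
        for position, word in enumerate(title.split(), start=1):
            index.setdefault(word, {}).setdefault(position, []).append(record_number)
    return index


titles = ["CATCH AS CATCH CAN"]
print(word_index_pointers(titles)["CATCH"])       # [1]
print(proximity_index_pointers(titles)["CATCH"])  # {1: [1], 3: [1]}
```

The same goal record appears once in the word index entry for CATCH but twice in the proximity entry, once under position 1 and once under position 3.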

Because proximity search indexes tend to be larger than normal word indexes, their index records are more likely to split into large record segments. Large record segments require more I/O and CPU to process a search, and this affects any use of the physical index, including through the standard word index (linkage section 2). It is worthwhile to test search costs using a smaller test version of a file to see the impact of proximity searching if search costs are an important consideration.

Possible Differences from the Model

The template and example above do not have to be followed slavishly. Here are some details about variations you can have that will still support proximity searching:

1) In the index record-definition, you can make the pointer element ("pointer" in the example) be required rather than optional. That saves 3 bytes per occurrence of the "proxstr" structure. However, it also prevents you from adding any elements to that structure in the future, though that usually isn't a problem.

2) You can also make the "proxstr" structure be required rather than optional at the record-level. However, that saves only 3 bytes per index record, and again prevents you from adding any record-level elements to the index record-type, which is perhaps more likely to be a problem here than within the "proxstr" structure, as described above.

3) The "position" element must be fixed in length, either one or two bytes. Remember to use the two-byte size if more than 255 words may be passed by a single goal record; the two-byte size will handle over 65,000 words per goal record, which is actually more than a single record can contain.

4) You can pass multiple data elements to the same proximity-search index; however, the $PASS.OCC rule (A165) treats all values for the goal record as if they were in one long value. That is, the word positions are calculated as if the value being passed was one long value made up of all the occurrences of the first element being passed concatenated to all those of the second, etc.

If it is important to you that words from one source element are only compared to words from the same element, you should pass some form of qualifier along with the pointer that identifies the source element. The qualifiers will be compared along with the pointers since SPIRES uses TAND internally to handle the search. [See C.6.2.2.]

Note too that if you pass multiple elements (or multiple occurrences of a single element, for that matter), the position number for the word will match its location within the "concatenated value", not its absolute location within an occurrence.

You can use multiple index linkage sections to pass each element separately, but either each should use a separate pointer-group within the "proxstr" structure, or the common "pointer" should have a qualifier that uniquely identifies the passing element. Don't use multiple index linkage sections to build a common pointer-group that does not distinguish between source elements.
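The "one long value" behavior described in item 4 can be illustrated with a short model. This Python sketch is an illustration only: the function name and the sample titles are hypothetical, and splitting on blanks stands in for the real word-breaking rules.

```python
def concatenated_positions(occurrences):
    """Number the words of several passed values as one long value.

    `occurrences` lists, in passing order, every occurrence of every
    element passed to the same proximity-search index for one goal record.
    """
    entries = []
    position = 0
    for value in occurrences:
        for word in value.split():
            position += 1                 # the count never restarts
            entries.append((word, position))
    return entries


print(concatenated_positions(["TROUBLE IN TAHITI", "WEST SIDE STORY"]))
# [('TROUBLE', 1), ('IN', 2), ('TAHITI', 3),
#  ('WEST', 4), ('SIDE', 5), ('STORY', 6)]
```

In this model TAHITI (position 3) and WEST (position 4) look adjacent even though they come from different values, which is why a qualifier or a separate pointer-group is needed when words from different source elements must not be matched against each other.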

C.6.2.2  Proximity Searching: How it Works

Using the Proximity-Search Index and the Word Index

When you use a proximity-search index, you enter only two words as the search value, plus an optional proximity-operator between them. Sample commands include:

Full details on command syntax for using proximity-search indexes appear in the SPIRES manual "Searching and Updating".

The index, using the search terms of the second linkage section, [See C.6.2.1.] also serves as a normal word index:

An added bonus of the sub-index is that the position value of the word can be used as part of the search:

How Proximity Searching Works

When you issue a proximity search like FIND PROXTITLE DATA ENTRY, SPIRES looks for the DATA and ENTRY index records and TANDs all the pointer group occurrences that occur under the "position" values that satisfy the proximity operator. The absolute position values are not significant; what's significant is their relative difference. So in the example, if DATA has a pointer under position 35, then SPIRES would only match that up to a pointer to the same record in the ENTRY index record under position 36.

Another example: FIND PROXTITLE WOMAN N1 MAN (WOMAN within one word of MAN) would only match up a record with WOMAN at position 10 if MAN was at position 8, 9, 11 or 12. (It wouldn't match with one at 10, since two words can't occupy the same position.)
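The comparison of relative positions can be sketched as follows. This Python model is an illustration only: the function and its parameters are hypothetical, and the mapping of the two cases (default adjacency, and N1) onto those parameters is inferred from the two examples above rather than taken from the operator definitions in "Searching and Updating". The position lists are assumed to belong to the same goal record.

```python
def proximity_match(first_positions, second_positions, max_between=0, ordered=True):
    """Return True if some pair of positions satisfies the proximity test.

    max_between: how many other words may lie between the two words.
    ordered:     if True, the second word must follow the first.
    """
    for a in first_positions:
        for b in second_positions:
            gap = b - a
            if gap == 0:
                continue        # two words cannot occupy the same position
            if ordered and 1 <= gap <= max_between + 1:
                return True
            if not ordered and abs(gap) <= max_between + 1:
                return True
    return False


# FIND PROXTITLE DATA ENTRY: ENTRY must immediately follow DATA.
print(proximity_match([35], [36]))    # True
print(proximity_match([35], [37]))    # False

# FIND PROXTITLE WOMAN N1 MAN: either order, at most one word between.
for man in (8, 9, 10, 11, 12, 13):
    print(man, proximity_match([10], [man], max_between=1, ordered=False))
```

Only the difference between the two positions enters the test, which is the sense in which the absolute position values are not significant.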

C.6.5  Examination Of Index Entries

During the early stages of file building, it is a good idea to monitor the entries being passed into index records, to see if INPROCs and PASSPROCs are operating as file searching will require. To examine some of the entries in an index, it is only necessary to issue the "BROWSE simple-index-name" command, or, to get a more complete list of entries, the "BROWSE FIRST simple-index-name" command. To speed the process, the following commands might be issued:

The entire process can be done by the file owner or those with See access to the file by using Global FOR commands. The index record is attached as the goal record; [See B.5.6.] Global FOR is then used to display all keys or put them into the active file. Remember that for simple indexes, the key of the index record is the value passed to the index from the goal record. The process can be done as follows:

C.6.10  Automatic Accumulation of Record Modification Dates

Most of the sample file definitions at the end of this manual provide two elements that keep track of the date on which a record was first added to the file and the date on which it was last updated. It is often desirable to keep more information about the updating of a record than this provides. In such cases, you may want to keep track of all dates on which a record has been updated, or perhaps just the three most recent record modification dates.

This can be done using a technique shown in the following file definition code. The DATE-ADDED element will automatically supply and keep the date on which the record was first added to the file. The DATE-UPDATED and UPDATE-GENERATOR elements work together to keep track of the most recent dates on which the record was updated. It is very important that DATE-UPDATED be coded in the REQUIRED section, and UPDATE-GENERATOR be coded in the OPTIONAL section; it is also important that no OCC be specified for DATE-UPDATED and that OCC=0 and LEN=4 be specified for UPDATE-GENERATOR.

The INPROC rules of UPDATE-GENERATOR create a value that is passed back to the DATE-UPDATED element. This value is then passed through the INPROC rules of DATE-UPDATED, where an INCLOSE rule discards all but the first three occurrences of the element; the UPDATE-GENERATOR creates a fourth occurrence when its INCLOSE rule, A147, is executed. Thus, the sample definition will maintain only the four most recent record modification dates. The element SCHOOL is the key of the record, and is coded here to provide some perspective on where other elements in this record definition would be coded.

C.6.11  Free Global Qualifiers

[Editor's note: Since this section was written, the ability to use the goal record-type as an index record-type to itself, thus allowing key searching with indexes, has been added to SPIRES. [See B.8.2.] Hence, it is now less likely that a file would be defined to pass a record key as a global qualifier. The section below is still relevant, however, if you are passing keys rather than locators anyway. (In other words, you would probably not decide to pass keys rather than locators just so this "free qualifier" technique would be available.) Although you cannot initiate a search with the key criteria when this qualifier technique is used, the technique is cheaper to use in searching in terms of I/O than the goal-as-index technique.]

Suppose a company were assigning accession numbers as the keys of records. That is:

Then it might be very useful in some searches to restrict (or qualify) the search result to a particular range of keys. For example:

Here, the key of the record is being used as a global qualifier. Now, global qualifiers are quite expensive, since they are stored redundantly in every pointer group in every index. But if the key of the goal record is the desired qualifier, and the goal record key is being passed to indexes rather than locators (TYPE=LCTR elements), then a qualifier is already being passed to every index, as the goal record key. To make use of this information, you need only point SPIRES at this qualifier when you are doing searching.

The following simple definition shows this technique, as defined for a SLOT file. Note that the key of the goal record, the slot element ID, is being passed to indexes.

Note carefully the use of A165 in the QUAL-ELEM definition. This prevents SPIRES from trying to pass a real qualifier. To use this technique, you MUST be passing the key of the goal record (using some form of PASSPROC A169) rather than a locator to the goal record in the residual data set (using PASSPROC=A170).

C.6.12  Record Protection By Account Number

In shared cataloging databases it is necessary to allow all participating accounts to examine all records, but it is usually desirable to prevent any user from updating any but his or her own records. This is easily done if the account number is part of the key of the records (the FILEDEF and FORMATS subfiles do this), but usually the key cannot be made to contain the account number.

The following simple definition provides one solution to the problem. Account GA.SPI is the "master" account; it can update any account's records. Account GG.VVN is a cataloger's account; it can examine any records, but cannot update any but the ones it has added. Account HA.SPI is a public searching account; it cannot update any records at all.

The presence and relationship of the two ACCOUNT elements make this definition successful. No account but the master can put anything but its own account number in ACCOUNT-UPD; this is because of A53. When a record is added to the file, A127 automatically supplies values for both ACCOUNT elements (even though one of them, ACCOUNT-ADD, appears in NOUPDATE). When a record is updated, A137 checks that both account elements have the same values. The key is that no account (except the master account) can change ACCOUNT-ADD; it is always the account that first added the record. So only the account that added the record will be allowed to update it. However, any account can DEQUEUE any other account's records, and only the master account can REMOVE a record from the file. [See C.11 to learn how to use USERPROCs to solve security problems.]
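The update checks just described can be sketched in Python. This is a hypothetical model of the logic only, not the way SPIRES evaluates the rules: the function name may_store_update and the dictionary record are invented for illustration, and subfile-level access permissions (such as those that keep HA.SPI from updating at all) are not modeled.

```python
MASTER = "GA.SPI"  # the master account named in the text

def may_store_update(account, record, new_account_upd):
    """Model the two checks applied when a record is updated.

    account         -- the account attempting the update
    record          -- stored record; ACCOUNT-ADD cannot be changed
    new_account_upd -- the value the update supplies for ACCOUNT-UPD
    """
    # A53-style check: only the master may supply an account number
    # other than its own in ACCOUNT-UPD.
    if account != MASTER and new_account_upd != account:
        return False
    # A137-style check: both ACCOUNT elements must hold the same value.
    return record["ACCOUNT-ADD"] == new_account_upd

record = {"ACCOUNT-ADD": "GG.VVN"}  # record first added by the cataloger
```

Under this model the cataloger who added the record passes both checks, another account fails one of them whichever value it supplies, and the master passes by supplying the adder's account number.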

C.6.13  Same-Structure Retrieval Through Indexes

If elements inside structures are passed to simple indexes, there is no way to use the indexes to specify that retrieval criteria be met by element values in a single occurrence of a structure. However, using qualifiers created by action A165 in combination with the TAND, TNOT and TOR logical operators, you can solve such a problem.

For example, suppose the following records appeared in a simple file that assisted water conservationists. Each record contains observations made at a single site; each occurrence of a structure records the soil type at a certain depth at the one site.

A searcher might want to retrieve all records that showed a SOIL type of 10 at a DEPTH of 50. If both SOIL and DEPTH were passed to indexes, then SPIRES would falsely report that both of the above records met the criteria. Actually, a researcher would only want the second record to be retrieved.

Using same-structure retrieval criteria in a WHERE clause in Global FOR, the researcher could find the appropriate record:

The @-sign indicates that SPIRES should look at DEPTH values only in occurrences of the structure containing SOIL values of 10. For more information, see the manual "Sequential Record Processing in SPIRES: Global FOR" or [EXPLAIN WHERE CLAUSE.]

However, if it is important to you to have this capability available for index searches, two different file designs are available. The simpler one is to create a SOIL index where DEPTH is a local qualifier. [See B.7.3.] A search using such an index might look like this:

The disadvantage to this approach is that there is no DEPTH index that can be searched independently from the SOIL index. Depending on your needs, that may or may not create problems.

The second file design allows you to have indexes for both DEPTH and SOIL, but each indexed value is qualified locally by the occurrence number of the OBSERVATION structure in which it appears. Thus the index records for the two goal records above would look like this:

With this arrangement, you can use either index separately. But if you want to use both indexes in a search, you can use the TAND, TNOT or TOR logical operators to tell SPIRES to compare entire pointer groups, not just the pointers:

In the first two commands of the search, SPIRES compares only the pointers, and both records 1 and 2 are represented by pointers in the SOIL index record 10 and the DEPTH index record 50. However, when pointer groups are compared under TAND, only record 2's SOIL=10 and DEPTH=50 index records match.
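The difference between comparing pointers and comparing whole pointer groups can be sketched in Python. The sample records, the names build_index, and_pointers and tand_groups, and the representation of a pointer group as a (pointer, occurrence) pair are assumptions made for illustration; they are not taken from the SPIRES implementation.

```python
# Hypothetical goal records: each is a list of OBSERVATION occurrences.
records = {
    1: [{"SOIL": 10, "DEPTH": 20}, {"SOIL": 30, "DEPTH": 50}],
    2: [{"SOIL": 10, "DEPTH": 50}],
}

def build_index(records, element):
    """Map each value to pointer groups (pointer, structure occurrence),
    counting occurrences from 1."""
    index = {}
    for pointer, observations in records.items():
        for occ, obs in enumerate(observations, start=1):
            index.setdefault(obs[element], set()).add((pointer, occ))
    return index

soil_index = build_index(records, "SOIL")
depth_index = build_index(records, "DEPTH")

def and_pointers(groups_a, groups_b):
    """AND-style comparison: only the pointers are compared."""
    return {p for p, _ in groups_a} & {p for p, _ in groups_b}

def tand_groups(groups_a, groups_b):
    """TAND-style comparison: pointer and qualifier must both match."""
    return {p for p, _ in groups_a & groups_b}

plain = and_pointers(soil_index[10], depth_index[50])
same_structure = tand_groups(soil_index[10], depth_index[50])
```

With these invented records, the pointer-only comparison returns both records, while the pointer-group comparison returns only record 2, where SOIL 10 and DEPTH 50 occur in the same occurrence of the structure.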

An extra benefit of this method is that you can qualify searches by occurrence number. For example, to find records where the first observation has a DEPTH of 50:

Though the occurrence number of the structure is passed as the qualifier in this example, the same searching techniques (i.e., using TAND, etc.) are applicable when the qualifier is some other element within the structure that is unique across the record. For example, a FAMILY record with a multiply occurring CHILD structure could pass the CHILD's name as a qualifier to indexes containing data from the structure -- that qualifier would then be involved when TAND was used. A search such as "FIND CHILD.SEX = FEMALE TAND CHILD.AGE < 10" would then find families having female children under the age of ten. The structure's occurrence number could also be used as the qualifier in this situation, but if a unique value in the structure might be useful as a qualifier in its own right (such as the child's name), consider using it.

To pass occurrence numbers as qualifiers, you need to code PASSPROC A165 ($PASS.OCC system proc) with a P1 parameter of 6 to 15 on the qualifier element, as shown in the sample file definition below (lines 61, 71). That form of A165 (A165:7) tells SPIRES to pass the occurrence number, counting from 1, of the structure containing the element passed by the previous PASSPROC, i.e., the occurrence number of the structure OBSERVATION containing SOIL (line 61) or DEPTH (line 71). (Only the last element passed in the current INDEX-NAME section may be handled this way.) Other forms of A165 may request the occurrence number of the named element itself, or even several occurrence numbers at once, e.g., the occurrence numbers of both the element and the containing structure or structures. [See D.1.7.1.6.5.]

      1.   FILE = GA.SPI.OBSERVATIONS;
      2.   MODDATE = TUE. APRIL  5, 1983;
      3.   DEFDATE = TUE. APRIL  5, 1983;
      4.   RECORD-NAME = SITE;
      5.     REMOVED;
      6.     SLOT;
      7.     REQUIRED;
      8.       ELEM = OBSERVATION;
      9.         LEN = 8;
     10.         TYPE = STR;
     11.     STRUCTURE = OBSERVATION;
     12.       FIXED;
     13.         KEY = SOIL;
     14.           LEN = 4;
     15.           INPROC = A21:4;
     16.           OUTPROC = A71,10;
     17.         ELEM = DEPTH;
     18.           OCC = 1;
     19.           LEN = 4;
     20.           INPROC = A21:4;
     21.           OUTPROC = A71,10;
     22.   RECORD-NAME = REC02;
     23.     FIXED;
     24.       KEY = SOIL;
     25.         LEN = 4;
     26.     OPTIONAL;
     27.       ELEM = POINTER.STR;
     28.         TYPE = STR;
     29.     STRUCTURE = POINTER.STR;
     30.       KEY = POINTER;
     31.         TYPE = LCTR;
     32.       ELEM = OCC;
     33.         OCC = 1;
     34.         LEN = 1;
     35.   RECORD-NAME = REC03;
     36.     FIXED;
     37.       KEY = DEPTH;
     38.         LEN = 4;
     39.     OPTIONAL;
     40.       ELEM = POINTER.STR;
     41.         TYPE = STR;
     42.     STRUCTURE = POINTER.STR;
     43.       KEY = POINTER;
     44.         TYPE = LCTR;
     45.       ELEM = OCC;
     46.         OCC = 1;
     47.         LEN = 1;
     48.   GOALREC-NAME = SITE;
     49.     PTR-ELEM = POINTER;
     50.       EXTERNAL-NAME = SITE;
     51.         PASSPROC = A170;
     52.     INDEX-NAME = REC02;
     53.       SEARCHTERMS = SOIL;
     54.         SEARCHPROC = A21:4;
     55.         GOALREC-ELEM = SOIL;
     56.         PASSPROC = A169:1;
     57.       PTR-GROUP = POINTER.STR;
     58.       QUAL-ELEM = OCC;
     59.         SEARCHTERMS = OCC;
     60.           SEARCHPROC = A21:1;
     61.           PASSPROC = A165:7;
     62.     INDEX-NAME = REC03;
     63.       SEARCHTERMS = DEPTH;
     64.         SEARCHPROC = A21:4;
     65.         GOALREC-ELEM = DEPTH;
     66.         PASSPROC = A169:1;
     67.       PTR-GROUP = POINTER.STR;
     68.       QUAL-ELEM = OCC;
     69.         SEARCHTERMS = OCC;
     70.           SEARCHPROC = A21:1;
     71.           PASSPROC = A165:7;
     72.   SUBFILE-NAME = OBSERVATIONS;
     73.     GOAL-RECORD = SITE;
     74.       ACCOUNTS = GA.SPI;

Note: Although $INT and $INT.OUT were used as INPROC and OUTPROC rules on the OCC elements in REC02 and REC03, A54 ($PARAGRAPH) should be used when multiple occurrence numbers (e.g., both the structure's and the element's occurrence numbers) are being passed. If you want all AND, OR and AND NOT searches to compare whole pointer groups rather than only searches using the TAND, TNOT and TOR logical operators, code secure-switches 7 and 17 in the subfile section. [See B.9.3.7, B.9.3.17.]

C.6.14  Phonetic Search of Personal Names

In some searching situations, particularly where personal names are involved, variant spelling possibilities for parts of search values make it difficult to specify appropriate search values. The truncated search capability (and the special truncated search capability for embedded truncation) cover many possible cases. Another set of problems can arise when the sound of a search value is known, but its exact spelling, even of a root, is not known.

In these situations, a "phonetic search" capability is required. This capability allows the search values "SAC," "SOK," "SOC," "SAK," "SOCK," "SICK," and other "similar" variants to retrieve the name "SACK," for example.

By combining SPIRES processing rules in the Searchproc and Passproc strings of a simple index, so that a personal name (in last name, first name order) is converted into a generalized phonetic code, you can simulate such a capability in SPIRES. Specifically, you use the system procs $PHONETIC and $PHONETIC.SEARCH, whose syntaxes appear in the SPIRES manual "System Procs". Online, EXPLAIN either of those for syntax details.

If necessary or desirable, you can code your own version of the phonetic search rules, basing them on the two system procs, in order to fine-tune the algorithm (see below). For example, the system procs will not retrieve "SCHROEDA" as a variant of "SCHROEDER," but will retrieve "PUFFAME" or "POOPHAMIE" for "BUFFUM."

The following file definition shows how to use the procs. Users who implement it are invited to note its shortcomings and submit any improvements they make.

Below is the system proc expansion for the $PHONETIC proc. The $PHONETIC.SEARCH proc is basically the same, though it also includes some other Searchproc rules (A11, A13). WARNING: To help anyone trying to understand this system proc, we have not doubled an apostrophe within the string in the first line of the display of the proc below (after the ampersand). If you copy this proc to modify it for yourself, be sure to double that apostrophe; otherwise, the proc will not compile properly.

$PHONETIC -- A43, '!$%&'()-=@[]:+*./<>? AEIOUYBFPVCGJKQSXZDTHWLMNR',
                  'YYYYYYYYYYYYYYYYYYYYYYYYYYY22223333333344ZZ6778'/
             A36:3,'X',0/
             A48,Z8,8, XY,1, XZ,5, X,, ',YY',',1', ',YZ',',5'/
             A48,Y,,Z,/
             A48,222,2, 333,3, 444,4, 666,6, 777,7, 888,8,
                  22,2,33,3,44,4,66,6,77,7,88,8

Examine the $PHONETIC rule above. The A43 translates all characters into phonetic classes, translating each character in the value that appears in the first string (on the first line) into its counterpart in the second string (on the second line). The A36 inserts an "X" at the beginning of the value. The first A48 isolates vowel, "W" or "H" sounds that occur at the beginning of a last name or first name. The next A48 eliminates these sounds in other places. The last A48 combines multiple sounds occurring together or separated by vowel sounds. This forms the phonetic key, which is numeric, that is passed to the index.
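The steps of the $PHONETIC rule can be followed in a Python sketch. This is an approximation for study, not the SPIRES implementation: it assumes that each A48 applies its change pairs in the order listed, each across the whole value, and that the input is uppercased first. The names phonetic, change, CLASSES and STEP1 to STEP3 are invented for the sketch.

```python
# A43: character classes, read off the two aligned strings in the proc.
CLASSES = [
    ("!$%&'()-=@[]:+*./<>? AEIOUY", "Y"),  # punctuation, blank, vowels
    ("BFPV", "2"), ("CGJKQSXZ", "3"), ("DT", "4"),
    ("HW", "Z"), ("L", "6"), ("MN", "7"), ("R", "8"),
]
TABLE = str.maketrans({ch: code for chars, code in CLASSES for ch in chars})

# A48 change pairs, in the order they appear in the proc.
STEP1 = [("Z8", "8"), ("XY", "1"), ("XZ", "5"), ("X", ""),
         (",YY", ",1"), (",YZ", ",5")]
STEP2 = [("Y", ""), ("Z", "")]
STEP3 = [("222", "2"), ("333", "3"), ("444", "4"),
         ("666", "6"), ("777", "7"), ("888", "8"),
         ("22", "2"), ("33", "3"), ("44", "4"),
         ("66", "6"), ("77", "7"), ("88", "8")]

def change(value, pairs):
    """Apply each (old, new) pair in turn (assumed A48 behavior)."""
    for old, new in pairs:
        value = value.replace(old, new)
    return value

def phonetic(name):
    value = name.upper().translate(TABLE)  # A43: map to phonetic classes
    value = "X" + value                    # A36: mark the start of the value
    value = change(value, STEP1)           # isolate leading vowel/W/H sounds
    value = change(value, STEP2)           # drop those sounds elsewhere
    return change(value, STEP3)            # merge repeated sounds

same_key = phonetic("BUFFUM") == phonetic("POOPHAMIE")
different_key = phonetic("SCHROEDA") != phonetic("SCHROEDER")
```

Under these assumptions the sketch agrees with the behavior described earlier: "BUFFUM", "PUFFAME" and "POOPHAMIE" produce the same key, while "SCHROEDA" and "SCHROEDER" do not.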

The following searches will be valid:

C.6.15  Non-Unique INDEX-NAME Statements

In a file definition, it is possible for several INDEX-NAME sections to refer to the same record-type. If such a technique is used, then several things are possible:

The following linkage section shows how this capability of the system can be used to pass the same element, but specifying different SEARCHPROC rules for searching.

Note that A47 is specified as the PASSPROC in the second section to prevent the element from being passed twice to the same index.

The following example shows how two elements can be passed to the same index to conserve record-types. No value of either element should be equivalent to any value of the other element. The structure of the index record-type should be amenable to both elements. In the example below, SEX (values can be M or F) and EEO-CODE (values can be 1, 2, 3, 4, or 5) are both passed to a single index.

Consideration must be given to the kind of information one chooses to combine in this way. Searching of specific values will yield the desired result, but a search using relational operators might be affected by the presence of more than one type of value. In the above example, for instance, a search for those records whose EEO-CODE was greater than 3 would retrieve all records in the index whose key was greater than "3"; this would include not only "4" and "5", but "M" and "F" as well. In addition, the BROWSE command in either the EEO-CODE or SEX index would yield results from both since BROWSE displays a sampling of the keys of the index record named in the linkage section.

The following example shows how sub-indexes or qualifiers can be bound to appropriate goal record elements using this technique. In this example, assume that elements A and B are record-level elements in the goal record; assume that A1 and B1 are elements in structures A and B, respectively. Both A and B are to pass to the key of the index record, and A1 and B1 are to pass to the sub-index (or qualifier); it is important that A1 qualify only A and that B1 qualify only B, so that A1 never qualifies B and B1 never qualifies A.

The PRIV-TAG is coded to hide the second index and sub-index pair from searchers. This will allow them to search A/A1 or B/B1 with the same searchterm, as if the file definer had specified multiple passers on the first index and sub-index. But multiple passers would not have given the binding of structural elements to each other that would be required.

Another technique can be used to combine information from what are usually separate indexes into a single index record-type. The index record definition contains more than one pointer group -- each pointer group is used in conjunction with a particular index. A record definition to be used for two indexes might look like the following:

The two pointer elements are placed in separate (preferably fixed-length, for efficiency) structures so that they can have the same alias. This alias would be the same as the name of the pointer element for all other indexes, and would be the name used in the global part of the linkage section (i.e. PTR-ELEM statement). All the index definitions would have to be structured according to this form, that is, the pointer element in all cases should be contained in a structure (of fixed length, if other pointer groups are fixed-length).

The individual linkage sections for the separate indexes would name the same record in the INDEX-NAME statement, but specify different pointer structures (i.e., in the PTR-GROUP statement) in the index record to be accessed in passing and searching:

The use of multiple pointer elements in the same record keeps the search logic for the different indexes distinct since SPIRES uses only the pointer element named in the PTR-GROUP statement of the linkage section when it determines a result. However, the BROWSE command will display any value in the record, regardless of the particular index to which it belongs.

C.6.17  The QUELEVEL and RES-LEVEL Statements

Files with regular updating activity often must ask ORVYL, with each transaction, for blocks to add to the deferred queue or, in cases where immediate indexing is involved, the residual. Later, during file-processing, SPIRES releases the DEFQ blocks as the records are moved into the tree. Then, when updating resumes, this cycle continues.

Requesting and releasing those deferred-queue blocks adds overhead to transactions that can be reduced considerably by including the QUELEVEL statement in the file definition. The QUELEVEL statement tells SPIRES not to release any blocks used for the deferred queue unless the number of blocks in the deferred queue is greater than the number specified in the QUELEVEL statement.

The RES-LEVEL statement has a similar effect: it tells SPIBILD to increase the size of the residual data set so that the data set has the number of blocks equivalent to the "residual next block" value (basically, a count of the number of blocks of data in the residual) plus the value in RES-LEVEL. That means SPIRES will not need to increment the residual block by block during the day's transaction-processing -- the blocks will already be available.

The syntax of the QUELEVEL statement is:

where "n" is an integer, specifying the maximum number of blocks to be kept for the deferred queue after SPIBILD processing. The QUELEVEL statement is a record-level element in the file definition, appearing after the BIN statement.

The syntax of the RES-LEVEL statement is:

where "n" is an integer, specifying the number of extra blocks that the residual should have after file processing. The RES-LEVEL statement is a record-level element in the file definition.

When the QUELEVEL statement is first added to a file definition and compiled, ORVYL does not assign that number of blocks to the deferred queue. Instead, blocks are added as they are requested by SPIRES as transactions occur. Then, during file-processing, SPIBILD compares the QUELEVEL number to the actual number of blocks used. If the number of blocks used exceeds the QUELEVEL number, the excess blocks are released back to ORVYL.

For RES-LEVEL, the extra blocks are added to the residual the next time the file is processed in SPIBILD after the RES-LEVEL statement is added. And each time the file is processed in SPIBILD, extra blocks are added, as necessary, to raise the residual size so that it has the RES-LEVEL number of extra blocks. (If the file is processed in SPIRES instead of SPIBILD, the RES-LEVEL value is ignored.)
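The block arithmetic described for the two statements can be sketched in Python. The function names and parameters below are hypothetical and only restate the rules given in the text; they do not correspond to any SPIRES, SPIBILD or ORVYL interface.

```python
def defq_blocks_released(blocks_used, quelevel):
    """Blocks SPIBILD releases from the deferred queue during file
    processing: only those in excess of the QUELEVEL value."""
    return max(0, blocks_used - quelevel)

def residual_blocks_added(blocks_allocated, next_block, res_level):
    """Blocks SPIBILD adds so that the residual holds the "residual next
    block" value plus the RES-LEVEL value."""
    target = next_block + res_level
    return max(0, target - blocks_allocated)

released_small = defq_blocks_released(blocks_used=8, quelevel=20)
released_large = defq_blocks_released(blocks_used=35, quelevel=20)
added = residual_blocks_added(blocks_allocated=100, next_block=98, res_level=10)
```

In this model a deferred queue of 8 blocks with a QUELEVEL of 20 releases nothing, a queue of 35 blocks releases the 15 in excess, and a residual of 100 blocks with a next-block value of 98 and a RES-LEVEL of 10 is extended by 8 blocks.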

Of course, with both QUELEVEL and RES-LEVEL, you will be charged for these additional blocks. However, the added efficiency for subfile transactions may be more important to you, especially in large applications. For small, personalized data bases where transactions are few in number or are infrequent, the two statements are probably not useful.

What should you choose for a value for QUELEVEL? A fairly good choice would be a "high average" of the number of blocks in the deferred queue between file processings. Another formula used is to figure the average number of transactions and allow one block apiece.

For RES-LEVEL, you might use a "high average" of the number of residual blocks added to the file across several days, from the time the file was processed until just before it was next processed. (You don't need or want to count blocks added because of records moved from the deferred queue to the tree during file processing.)

C.6.18  Record Size Limits and Split Records

All records in SPIRES, both goal and index varieties, are processed in either final or intermediate form. In final form (i.e., the form in which the record is stored), a single record cannot grow beyond 80K bytes, including overhead information. The SHOW RECORD OVERHEAD command can tell you the exact size of a record in its final form in bytes. [See B.10.5.]

How much space is available for the intermediate form depends on the amount of working space established by the SET SUPERMAX and SET SUPERVAL commands. The default values, 32,768 bytes each (32K), represent the amount of internal space allotted for the data itself ($SUPERVAL) and special tables ($SUPERMAX). The amount of table space determines the number of element and structure occurrences allowed for the record, since each occurrence uses 12 bytes of table space. Hence, the number of element and structure occurrences allowed within a record is $SUPERMAX divided by 12, which is 2730 occurrences, using the default value of 32K for $SUPERMAX. [Each table entry describes one occurrence of an element or structure. Thus, one occurrence of a structure containing a singly occurring element would take two table entries, for 24 bytes. However, within the data section, only the element's value would take space.]
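The table-space arithmetic above can be checked with a short Python calculation. The function names are invented for illustration; the 12-byte entry size and the 32K default come from the text.

```python
TABLE_ENTRY_BYTES = 12        # one table entry per occurrence
DEFAULT_SUPERMAX = 32 * 1024  # default $SUPERMAX, 32,768 bytes

def max_occurrences(supermax_bytes=DEFAULT_SUPERMAX):
    """Element and structure occurrences that fit in the table space."""
    return supermax_bytes // TABLE_ENTRY_BYTES

def table_bytes(structure_occurrences, elements_per_structure):
    """Table space used by structure occurrences that each hold the given
    number of singly occurring elements."""
    entries = structure_occurrences * (1 + elements_per_structure)
    return entries * TABLE_ENTRY_BYTES

default_limit = max_occurrences()
one_structure = table_bytes(1, 1)
```

With the default value the limit works out to 2730 occurrences, and one structure occurrence containing one singly occurring element uses two entries, or 24 bytes, as stated above.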

By raising the values of $SUPERMAX and $SUPERVAL, you can raise the limits for the amount of data and the number of occurrences per goal record. The recommended maximum total of $SUPERVAL and $SUPERMAX is 128K, which is twice the default total. EXPLAIN SET SUPERVAL for more information.

A final form goal record is constructed from the intermediate form, which has a limit of $SUPERMAX/12 occurrences, as described above. However, an index record that is created and maintained by SPIBILD and/or FASTBILD can have considerably more occurrences of a pointer element.

In an index record, the number of pointer groups (the primary entity created by the passing process) is limited to about 64K divided by the length of each pointer group. Therefore, a four-byte pointer (e.g., a locator) would be limited to about 16K occurrences in the record. (How SPIRES can handle index records requiring more than 16K pointers is discussed below.) Note that for all practical purposes, four-digit numbers are sufficient for handling element occurrence counts (i.e., 0-9999).

Two commands do cause the final-form version of the record to be placed into the value portion of the intermediate version: MERGE and REFERENCE. Hence, for these two commands to work, the stored version of the record being processed cannot exceed the size of the current value of $SUPERVAL. If a record's final form is thus larger than $SUPERVAL, the REFERENCE or MERGE command will fail. The record can still be changed by using the UPDATE (or ADDUPDATE) command, which does not bring the stored final-form version of the record into the value portion of the intermediate.

When Large Index Records are Split

Index records that are removed to the residual are handled exactly as above. By default, they are limited in size to 80K bytes maximum. However, non-removed index record-types, or removed index record-types defined with the SPLIT statement (see below) can grow much larger.

In SPIBILD and FASTBILD, when an index record for a non-removed record-type grows too large, it is split into several final-form records, each with the same key. (This is the only situation in SPIRES where multiple records in the residual may have the same key value.) As index records, SPIRES can retrieve them properly when processing a search request. However, to retrieve these index records properly as goal records, you must use a format; otherwise, not all the "pieces" of the split record will be retrieved. Within the format, the SET LARGE Uproc in conjunction with the $LRGNXT flag variable must be used to process the split record properly. [See the manual "SPIRES Formats" for more information.]

SPIBILD will inform you when it creates a split index record by displaying the "SPLIT RECORD TYPE" message specifying the record-type and the key of the record. You can also determine whether your file contains split records by examining the STATUS command's output, which indicates these records in the "large record segments list". [See B.10.2.] In practice, SPIBILD splits index records when they would grow beyond 14K bytes (not the absolute limit of 80K).

The SPLIT Statement in an Element Definition

By default, splitting a record so that it can grow beyond the limit of 80K bytes can only occur under these circumstances:

An occasionally problematic aspect of this behavior is that the splitting moves all optional elements to the residual, leaving only the key behind in the tree. This isn't usually a problem for index record-types whose only element is an optionally occurring pointer or pointer-group. However, if the record-type also has other record-level optional elements, a split will force them into large record segments too, which are difficult to retrieve.

The SPLIT statement, which can be added to an element definition, tells SPIRES to split only that optional element off to large record segments, leaving all other record-level elements in the tree-level segment.

SPLIT is simply added to the element definition like this:

SPLIT can be included in only a single record-level element definition per record-type. Most commonly, it is added to the pointer element of an index record definition. That's because SPIRES handles large record segments quite well internally when they hold pointers used for searching, so it makes sense for the pointer (the element that you use directly least often) to be the element that is split off to the large record segments.

So the net result is that if a record splits, all the pointers will be split to the residual, and all the other elements in the record will remain in the tree-level part of the record.

SPLIT has several other benefits too -- when coded, it allows both Removed and non-Removed record-types to split, and also allows record-types containing Fixed or Required elements other than the key to split.

Note that you can code "SPLIT = ALL;" instead of just "SPLIT;" if you need to request that all optional elements, not just the one in whose definition SPLIT is coded, be split off. This is needed only in extremely rare cases.

The FIX ELEMENT command

Occasionally it is necessary to manually split an index record, such as when a SPLIT element has not been specified and the record grows large.

All occurrences of the <n>th element in the tree level version of the <key> index record are moved to a large record segment. <n> is the element number at the record level within the specified record. This is exactly the same as what happens automatically for SPLIT elements.

C.6.19  Indexing Negative Integer or Real Values

Only in a few SPIRES subfiles containing indexes with numeric values are both of the following statements true:

If both of these statements are true for indexes in one of your subfiles, you will find that standard index-building techniques will not allow range searches to work properly. A value such as -3 would be indexed between 2 and 4, meaning that a search such as "FIND VALUE > 1" would retrieve records with the value of "-3". This is true for both integer and real values. (It is also true for packed decimals, but their problems in this context are compounded by varying lengths and exponents. They are not recommended for use in indexes when the indexed values may be negative.)

The technique described below creates indexes that allow range operators to work properly with negative numbers. As an added benefit, the BROWSE command also works properly with such indexes.

The technique involves extensive use of rule A59 using a P1 value of 3. The action or its system proc equivalent, $LOGICAL, must be applied to the values as they are passed to the index.

Below are appropriate excerpts from a file definition using these techniques to index integer values. The element containing integer values that may be negative is called INTNUM.

If you are working with real numbers instead of integers, you replace the integer processing rules with those for handling real numbers, and, in the A59 rules, replace INT with the letter I (for Input) on the Inproc, Searchproc and Passproc, and replace INT with the letter O on the Outproc. Remember that none of this is needed, however, unless both of the statements at the start of this section are true.

C.6.20  Encrypting Data Values

File owners may decide to encrypt data in their files. A technique involving the SET CRYPT command and action A60 ($ENCRYPT proc) allows the file owner to specify that a given element have its values be translated into other values, using an encryption key established by the user. Thus, the user sets the encryption key before adding or updating records; later, if the same encryption key is not set before displaying those records, the data displayed will appear to be "garbage".

Two types of encryption are supported by action A60: one that uses the encryption key and one that uses both the encryption key and the length of the input value to convert the value. (Neither type changes the length of the input value, however.) The processing rule (either A60 or $ENCRYPT) should be placed on both the INPROC and the OUTPROC of the element. In the INPROC string, it should be the last non-INCLOSE rule; in the OUTPROC string, it should be the first rule (unless the string includes A82 ($BUILD), which should precede A60 or $ENCRYPT).

Encrypted elements are typically not indexed, although they can be. If so, A60 or $ENCRYPT should be added to the end of the SRCPROC string. Action A45 ($BREAK) to break up the value should not appear in either the SRCPROC or PASSPROC.

Few of the relational operators work properly with encrypted elements, either in index-searching commands (e.g., FIND, ALSO) or in WHERE clauses in "FOR class" commands. Generally speaking, only the equality and inequality operators ("=" and "~=") should be used; PREFIX and other forms of truncated search may work if the first method of encryption (the one that does not involve the input value's length) is used for the element.
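The behavior described above can be illustrated with a toy cipher. The Python sketch below is not the algorithm used by action A60; it is a hypothetical, length-preserving XOR scheme (the name "toy_encrypt" and its parameters are assumptions) chosen only to show why equality tests survive encryption, why a prefix search can survive the first type of encryption, and why it fails once the value's length takes part in the conversion.

```python
from itertools import cycle


def toy_encrypt(value: bytes, key: bytes, use_length: bool = False) -> bytes:
    """Length-preserving toy cipher (NOT the A60 algorithm).

    Each byte is XORed with the repeating key. With use_length=True the
    length of the value is mixed in as well, standing in for the second
    type of encryption. XOR is its own inverse, so calling the function
    again with the same arguments decrypts the value.
    """
    if not key:
        raise ValueError("the key must contain at least one byte")
    mix = len(value) & 0xFF if use_length else 0
    return bytes(b ^ k ^ mix for b, k in zip(value, cycle(key)))
```

In this model, equal inputs always give equal outputs under the same key, so "=" and "~=" behave as expected. Without the length, the encrypted form of "SMI" is the first three bytes of the encrypted form of "SMITH", so a prefix comparison still matches; with the length mixed in, the two no longer agree. Decrypting with a different key returns different bytes, which is the "garbage" effect described earlier.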

The syntax of the SET CRYPT command is:

where "string" is a string expression containing from one to eight characters. (The expression may be longer than eight characters, but those following the eighth character have no effect.) If the string contains non-alphanumeric characters, it should be enclosed within apostrophes or quotation marks.
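As a rough illustration of the eight-character rule, the following Python sketch (not SPIRES code; both function names are hypothetical) shows which part of a key string takes effect and when the string should be quoted.

```python
def effective_crypt_key(string: str) -> str:
    """Return the part of a SET CRYPT string that has any effect.

    Only the first eight characters matter; anything after the eighth
    character is ignored.
    """
    if not string:
        raise ValueError("the key must contain at least one character")
    return string[:8]


def needs_quoting(string: str) -> bool:
    """True if the string contains a non-alphanumeric character."""
    return not string.isalnum()
```

Under this model, two keys that share their first eight characters are the same key, which is worth remembering when choosing one.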

The SET CRYPT command is issued after the SELECT or ATTACH command in SPIRES. In SPIBILD, you should issue the SET CRYPT command before the ESTABLISH command when you are batching records into the subfile or processing records under Global FOR in SPIBILD. [See C.6.16.]

Encryption of data may be done for several reasons. In situations where several people may be doing data entry into a subfile but a given element's value should only be seen by that person, the file owner can request that each user employ a separate encryption key. Encryption also disguises the data in the event that someone can gain access to the ORVYL data sets in which the SPIRES file is stored; even if that user had the file definition and could figure out where the element was stored in a data set for a given record, he or she could not determine its value.

Remember though that the encryption key is not stored with the value in any way; if you forget the encryption key that was set when the encrypted element was input to the data base, you will not be able to determine its value.

Encrypted Elements, Phantom Structures and Subgoal Processing

You can access encrypted elements in other record-types using phantom structures or subgoal processing. However, whether the element is correctly decrypted depends on the mode of access and the environment established.

If the encrypted element is being retrieved through a phantom structure defined with the SUBGOAL statement or through a format using same-file subgoal access, then the value currently established by the SET CRYPT command will be applied to the encrypted element. Because you are in effect retrieving these values from the currently selected subfile, the encryption code set under the currently selected subfile is used.

On the other hand, if the encrypted element is being retrieved through a phantom structure defined with the SUBFILE statement or through a format using subfile subgoal access, then the value currently established by the SET CRYPT command for the currently selected subfile will not be applied to the encrypted element. Because you are in effect retrieving the values from another subfile, the currently selected subfile's encryption value is not used. What you must do is establish a subfile path for that subgoal subfile and establish an encryption key for it, using the commands SELECT and SET CRYPT preceded by the "THROUGH pathname" prefix. For example, suppose subfile CARS has a phantom structure connecting it to subfile DRIVERS, and an element in DRIVERS is encrypted:

Establishing the subfile path for DRIVERS is unnecessary in most uses of the phantom structure in CARS. Here, however, the encryption key needs to be established for the DRIVERS subfile, so the subfile path must be established to tell SPIRES what encryption key to use when retrieving encrypted elements from the DRIVERS subfile. If the CARS subfile had elements with encrypted values, the SET CRYPT command could be issued without a THROUGH prefix to tell SPIRES what key to use for those elements -- that key would not be applied to the encrypted elements in DRIVERS.

C.6.21  DEFQ-Only Record-Types

A special kind of record-type may be created to exist only in the deferred queue. When the file is processed, its records are discarded. Most commonly, DEFQ-only record-types (sometimes called "pseudo-record-types") are created to hold data about transactions in other record-types of the file. [See C.14 for specific information about handling and processing transaction data.] Other uses are certainly possible, though.

To declare a record-type to be "DEFQ-only", you begin its name with a period in the RECORD-NAME statement, i.e.,

where "name" is an alphanumeric string of five or fewer characters. (The maximum length for any record-type's name is six characters.)
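The naming rule can be expressed as a small check. This Python sketch is only an illustration (the function name is hypothetical, and it assumes at least one character must follow the period); it is not part of SPIRES.

```python
import re

# A period followed by one to five alphanumeric characters,
# for a total length of at most six characters.
_DEFQ_ONLY_NAME = re.compile(r"\.[A-Za-z0-9]{1,5}")


def is_defq_only_name(record_name: str) -> bool:
    """True if record_name has the form of a DEFQ-only record-type name."""
    return _DEFQ_ONLY_NAME.fullmatch(record_name) is not None
```

For instance, ".TRANS" passes the check, while "REC01" (no leading period) and ".TRANSX" (six characters after the period) do not.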

DEFQ-only record-types will be the last record-types defined in a file definition. They may not be named in GOALREC-NAME statements. That is, they may not serve as goal records for index records. They may, however, be named in GOAL-RECORD statements in the Subfile section of a file definition, allowing them to serve as goal records of a subfile.

A DEFQ-only record-type may also serve as a temporary immediate index. Obviously, since the index exists only until the deferred queue is processed, there are not many situations where this would be practical. However, if a DEFQ-only record-type itself needs an index, this would be a good way to define it. Keep in mind that the deferred queue has a limit of 64K blocks; if much indexing is done, you can reach that limit quickly.

C.6.22  File Security: ORVYL Data Sets

Because SPIRES uses the file system of its time-sharing system, data base security issues also involve the time-sharing system. If users access a SPIRES file only through SPIRES, then the security measures built into SPIRES (e.g., SECURE-SWITCH 3, ACCOUNTS statements, FILE-ACCESS statement, etc.) are sufficient. However, under the default security setup for a SPIRES file, it is possible for other users to run other non-SPIRES ORVYL programs that can retrieve (and, in the case of some data sets, change) the individual data sets of a SPIRES file. The ORVYL command SET SPIFILE can be used to block such access without limiting the access to the data sets that SPIRES (as well as the other SPIRES programs, such as SPIBILD) requires.

The syntax of the SET SPIFILE command is:

Its opposite can be used to reverse the condition:

Because ORVYL commands are limited to eight characters, SET NOSPIFILE is not allowed; the command is SET NOSPIFIL.

To see whether SPIFILE is set for an ORVYL data set:

As the syntax of these ORVYL commands suggests, the "FILE" in the term "SPIFILE" is not a SPIRES file but an ORVYL "file" or data set. Thus, each data set of a SPIRES file must be handled individually:

These commands do not affect the use of the data set by programs run under the file owner's account -- they limit access by programs run from other accounts.

If a non-SPIRES ORVYL program submitted by another user tries to read a data set that has been "set" to SPIFILE, an error code will be returned. Be aware, however, that two batch processors of the SPIRES system -- batch SPIRES and FASTBILD -- cannot access SPIFILE data sets unless the job is run under the file owner's account. (The file owner can always access the data sets.)

All the public interactive SPIRES programs (e.g., SPIBILD, SPIRES) can access SPIFILE data sets. Test versions of those programs cannot, however.

Other accounts cannot use ORVYL commands such as GET and PUT to access the data in your SPIRES files, whether or not SET SPIFILE is in effect for the data sets. Although the data sets of a SPIRES file are set to allow at least READ access to them by the public, they are also set NOCLP (NO Command Language Processing), which means they cannot be used except through an ORVYL program such as SPIRES. Even if you, the file owner, try a command such as GET ALMANAC.MSTR, the command will fail, because CLP is not set. If you really wanted to, you could SET CLP for the data set and then issue the GET command, since you are the file owner, but no other account could do that for your data sets. (Actually, however, there should be no reason for you to ever do that.)

You may also use the ORVYL command SET PERMITS to limit the accounts that can access the data sets of your ORVYL file, but be sure you do not make them more restrictive than the security you have requested in the file definition. For example, if an account is allowed to update the records in a subfile, according to the file definition, don't change the DEFQ data set to READ-only access for that account.

For most file owners, none of these extra precautions are necessary. The default NOCLP setting will keep all users from seeing and changing the data without going through SPIRES security. It will similarly keep non-SPIRES programs from changing the data without going through SPIRES security. To prevent non-SPIRES programs from seeing the data (and possibly changing the DEFQ data set) as well, use the SET SPIFILE command.

C.6.23  Creating a Deferred Queue on Another Account

It is possible to create the deferred queue of a SPIRES file on an account other than that of the file owner. This feature is not thought to be of much use generally, but certain users may find that it helps to satisfy an important need. The need that comes to mind first is to remove a very volatile data set from an account that otherwise would have a minimum of write activity for one reason or another. In the ORVYL file system especially, this can be quite useful as the deferred queue can be in a separate file system, thus allowing file maintenance to continue in the account of the file owner in read only mode while the deferred queue itself is being continually updated.

Additionally, it is needed when you want a file to have "defq-duplication", i.e., the file will make two copies of the deferred queue, with all transactions being written to both. [See C.6.23a.]

So the DEFQ-ACCOUNT feature is not meant for general use; it is meant for very unusual situations.

The feature is enabled by supplying the six-byte DEFQ-ACCOUNT value in the form "gg.uuu" at the end of a file definition record. You may then compile or recompile the file definition.

If you COMPILE a new file with DEFQ-ACCOUNT specified, two .DEFQ data sets are actually created. The first is created on the file's own account but is just a skeleton deferred queue that points to the real deferred queue on the account specified by the DEFQ-ACCOUNT value. If you try to compile a file definition where the DEFQ-ACCOUNT specifies an account that already has a deferred queue for another SPIRES file, you will get a compile error (of code S467).

If you recompile a file definition and have specified DEFQ-ACCOUNT, then if the current file does not have the deferred queue on another account, one will be created. However, you cannot change the DEFQ-ACCOUNT value of a file unless the deferred queue is empty. You should always process the file in SPIBILD if you intend either to include or remove the DEFQ-ACCOUNT value. If you are recompiling to remove a DEFQ-ACCOUNT, then the deferred queue will be erased from the old account and reestablished fully on the file's own account. Note that you cannot recompile a file that has its deferred queue on one non-file-owner account to create the deferred queue on another account. If you attempt this, you will get an S467 error. To do this properly, you must first recompile the file definition with DEFQ-ACCOUNT removed and then recompile again with the new value specified.
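The recompilation rules above amount to a small decision table. The Python sketch below models them for illustration only; the function name, its parameters and the returned descriptions are hypothetical, and the error raised for a non-empty deferred queue is an assumption (the text names a code only for the account-to-account case).

```python
from typing import Optional


def check_defq_account_change(current: Optional[str],
                              requested: Optional[str],
                              defq_empty: bool) -> str:
    """Describe the outcome of recompiling with a given DEFQ-ACCOUNT value.

    current   -- account now holding the deferred queue, or None if the
                 deferred queue is on the file's own account
    requested -- DEFQ-ACCOUNT value in the definition being recompiled,
                 or None if the statement is absent
    """
    if current == requested:
        return "no change"
    if not defq_empty:
        # Assumed failure mode: the text says only that the value
        # cannot be changed unless the deferred queue is empty.
        raise ValueError("deferred queue is not empty; process the file first")
    if current is not None and requested is not None:
        raise ValueError("S467: remove DEFQ-ACCOUNT, then recompile with the new value")
    if requested is None:
        return "deferred queue reestablished on the file's own account"
    return "deferred queue created on " + requested
```

Moving the deferred queue from one outside account to another therefore takes two recompilations in this model: one from the old account to None, and one from None to the new account.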

In order to create a file on another account, you must have WRITE permission on that account from ORVYL. That same permission must also allow erasing a file. Unlike the normal activity during new SPIRES deferred queue creation, SPIRES does not set public READ or WRITE permits on the new deferred queue that is created on another account, nor does it set NOCLP. You must take care of these things yourself on the account of the new deferred queue. [See C.6.22.]

Since a skeleton deferred queue has been left on the file account, JOBGEN is able to understand this situation and will handle it just as it does normally.

The COPY FILE and ZAP FILE commands will not be allowed for a file that has its deferred queue on another account.

C.6.23a  Creating a Duplicate Deferred Queue

Imagine the following scenario: Your file keeps track of banking transactions, like an automatic teller machine's. It all works fine, until one day, a power problem causes a file system problem, and the file needs to be rebuilt. All the backup files are around, so it is relatively easy to restore the file to the way it was this morning. But what about all the transactions that took place in the meantime? What customer accounts need to be debited for all the money handed out by the machine this morning?

In situations where the rare problem of a damaged deferred queue would be impossible to fix (as in this example, where we could not get the people making the transactions to do them again), you may want to consider the "duplicate deferred queue". This feature creates a second deferred queue for the file, but on another account; it is in effect an extension of the DEFQ-ACCOUNT feature described in the previous section. [See C.6.23.] With the "duplicate defq" feature, SPIRES writes all transactions to both deferred queues, so that there are two copies of today's data. In case of calamity, there are thus two copies of the defq data.

To request a duplicate defq, you add both the DEFQ-ACCOUNT and DUPLICATE-DEFQ statements to the end of your file definition:

The DEFQ-ACCOUNT statement tells SPIRES under which account to place the duplicate defq data set; the DUPLICATE-DEFQ statement takes no value.

DUPLICATE-DEFQ changes the meaning of the preceding DEFQ-ACCOUNT statement; instead of the named account holding the only copy of the defq data set, it holds a duplicate data set, called "ORV.gg.uuu.filename.DPFQ", where "gg.uuu" is the named account.
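The naming pattern can be sketched as follows. This Python fragment is illustrative only: the function name is hypothetical, it checks nothing beyond the six-byte "gg.uuu" shape, and it assumes that "filename" is the file name without its account prefix.

```python
def duplicate_defq_name(defq_account: str, filename: str) -> str:
    """Build the duplicate deferred queue data set name for an account.

    defq_account must be six bytes in the form "gg.uuu".
    """
    if len(defq_account) != 6 or defq_account[2] != ".":
        raise ValueError('DEFQ-ACCOUNT must have the form "gg.uuu"')
    return "ORV." + defq_account + "." + filename + ".DPFQ"
```

With a DEFQ-ACCOUNT of GQ.DOC and a file named ALMANAC, the sketch yields ORV.GQ.DOC.ALMANAC.DPFQ.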

In order to create the deferred queue copy on the other account, you, the file owner, must have WRITE access to that account from ORVYL. For other permit conditions, see the previous section on the DEFQ-ACCOUNT statement.

Once it is created, the duplicate defq will begin receiving defq transactions just as the regular one does. Of course, this redundancy means transactions will cost more since the data needs to be written twice.

The file owner and those with Master access can disable and enable defq duplication with the SET SUBFILE DISABLE FOR DUPLICATE and SET SUBFILE ENABLE FOR DUPLICATE commands. Note that disabling turns off duplication immediately; but the ENABLE version of the command only resets a flag allowing duplication, which won't start again until the file has been processed or the normal deferred queue has been re-initialized (for instance, with ZAP DEFQ). See the manual "SPIRES File Management" for more information about those two commands; online, EXPLAIN SET SUBFILE DISABLE.

You can also determine whether defq-duplication is in effect with the $DEFQTEST(DUPLICATE) function. See the manual "SPIRES Protocols"; online, EXPLAIN $DEFQTEST FUNCTION.

If a problem occurs that means you really do need to use the duplicate defq to help put the file back together again, take special care. Consider the duplicate defq as an assurance that you have all the data; but putting the file back together again is seldom as simple as restoring the file to its state after the last SPIBILD processing and then restoring the defq. For example, if the file has any immediate indexes, you will want to re-input the deferred-queue records so that those get updated appropriately with the defq data.

But if something does happen to one or the other deferred queue files, you can erase the damaged defq data set and replace it with a copy of the other one.

If you need to do this, it is highly recommended that you contact your SPIRES consultant for assistance.

C.6.24  Checkpoint Data Sets

Under some circumstances, SPIRES and SPIBILD use a "checkpoint" data set to hold data before writing it into actual file data blocks. These situations include:

When the file blocks in the checkpoint data set are successfully written to the file, the next write operation is written over the previous one in the checkpoint data set, and so forth. When all the processing is complete, the checkpoint data set should be left "empty". But if the processing fails (e.g., due to a system crash), the data in the checkpoint data set may still remain, from an interrupted write operation.

Each time one of the situations listed above happens, SPIRES or SPIBILD checks the checkpoint data set to verify that it is empty. If it isn't, they can use the data in it to complete the interrupted operation before continuing on with the new processing.
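The write-then-apply cycle described above resembles the following Python model. It is a simplified sketch, not the SPIRES or SPIBILD implementation: the class, its attributes and the simulated crash flag are all hypothetical, and real checkpoint data sets hold file blocks rather than Python dictionaries.

```python
class CheckpointedFile:
    """Toy model of a file protected by a checkpoint data set."""

    def __init__(self):
        self.blocks = {}        # block number -> data in the actual file
        self.checkpoint = None  # pending write; None means "empty"

    def recover(self) -> bool:
        """Complete an interrupted write if the checkpoint is not empty."""
        if self.checkpoint is None:
            return False
        self.blocks.update(self.checkpoint)
        self.checkpoint = None
        return True

    def write(self, changes: dict, crash_before_apply: bool = False) -> None:
        """Write changes via the checkpoint, finishing any earlier write first."""
        self.recover()
        self.checkpoint = dict(changes)      # step 1: save to the checkpoint
        if crash_before_apply:
            raise RuntimeError("simulated system crash")
        self.blocks.update(self.checkpoint)  # step 2: write the file blocks
        self.checkpoint = None               # step 3: leave the checkpoint empty
```

If a write is interrupted between steps 1 and 2, the pending data stays in the checkpoint, and the next operation applies it before doing anything else.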

For older files (say, those created prior to about 1986), the checkpoint data set is a single ORVYL data set called SYSTEM.CHKPT, shared between those files. This sharing doesn't work for files that have immediate indexing or that are used for transaction processing; hence, such files have their own checkpoint data set, called "filename.CKPT" (just as the data set of the master characteristics of a SPIRES file is called "filename.MSTR"). Similarly, two files on the same account that must share a SYSTEM.CHKPT data set cannot be processed in SPIBILD simultaneously, since SPIBILD needs exclusive use of the checkpoint data set used by the file.

All new files are created with their own checkpoint data set. But if you have an older one that doesn't and you want to use it for transaction-group processing or immediate indexing, you can add the record-level element "CHECKPOINT;" to the file's file definition. (The element has no value.) When you recompile the file definition, SPIRES will create a private checkpoint data set for the file, which SPIRES and SPIBILD will use from then on.

C.7  Compiling File Definition Code from Several Sources

When a file definition is compiled, SPIRES can retrieve the necessary pieces from several sources, not all of which are necessarily in the FILEDEF subfile. A file owner may split the file definition into several pieces for either of two reasons:

This chapter will discuss two of the techniques available that allow you to split file definition code into several pieces. Specifically, this chapter will discuss:

The technique of placing proc definitions into the EXTDEF subfile is discussed later, in the chapter on procs. [See C.10.]

C.7.1  The RECDEF Subfile

The RECDEF subfile contains record definitions for files. RECDEF and the FILEDEF subfile are linked by a "DEFINED-BY" element that points from a RECORD-NAME element in a file definition to that record's definition in RECDEF. For example, a protocols file definition might look like this in FILEDEF:

Before this definition could be compiled, a record definition would have to be added to the RECDEF subfile with the key specified in DEFINED-BY, GA.SPI.PROTOREC, and compiled. The process might look like this:

The COMPILE command for compiling record definitions in the RECDEFS and BACKRECS subfiles has the same syntax as it does when you use it to compile file definitions:

If you replace COMPILE with RECOMPILE in the syntax above, you have the syntax for the RECOMPILE command, when used to compile records in the RECDEFS and BACKRECS subfiles. [See B.5.3.]

A file may have several of its records defined by a record or records in the RECDEF subfile. If several records are to be combined physically into the same ORVYL file, then the COMBINE statement should appear after the appropriate DEFINED-BY statement in the FILEDEF record. Slot record definitions in RECDEF may have a SLOT-NAME statement coded to give the slot-number element a name different from the default, the name of the record, which is coded in the file definition.

The record may refer to user-defined processing rules ("procs") that are defined at the end of the record definition (just as they would be defined at the end of the linkage section in a file definition) or that are defined in an EXTDEF subfile record whose name is given in an EXTDEF-ID statement at the end of the record definition. [See C.10.5.] Similarly, element information packets named in the record definition may be defined at the end of the record definition or in an EXTDEF subfile record whose name appears in the EXTDEF-ID statement at the end of the record definition. [See C.13.5.]

Userprocs may also be defined at the end of a RECDEF record definition (ahead of any procs or EXTDEF-ID statement), just as they would appear at the end of the record definition in a file definition. Once the RECDEF record definition is compiled, any Userprocs in it may also be called from actions A62 and A124 ($CALL proc) in other file, record and format definitions.

Element view packets for a record may be defined in the RECDEF record, or in an EXTDEF record, or after the DEFINED-BY statement in the file definition or in any combination of those three. [See B.9.4.1.]

Advantages of Using the RECDEF Subfile

RECDEF gives the definers of large or complex files several options. Record definitions, such as index records, that are duplicated several times in a file or files can be defined and compiled a single time in RECDEF, and referenced by several DEFINED-BY elements. This reduces the time required to compile and recompile the file definition and simplifies the appearance of the definition.

Multiple authorship of files is also possible, since each record can be defined and compiled independently of other records in the file. More important, however, is the ability to "share" record definitions among several accounts working on files containing records with identical definitions. This is very helpful in large application development efforts where several programmers may be working with the same record definitions.

Like FILEDEF records, RECDEF records can be displayed, added, updated, removed and compiled only by the record's owner. However, any user who knows the name of any RECDEF record can name it in a file definition and compile the definition, even if the RECDEF record is not his or her own. Only the RECDEF record's owner can compile the actual record definition, but anyone who knows its name can have it be compiled into his or her own file. [See C.7.3 for a further discussion on RECDEF security.]

When you no longer need a RECDEF record, you should issue the ZAP RECDEF command to get rid of it:

where "recdef.id" is the name of the record definition in the RECDEF subfile. The ZAP RECDEF command destroys the compiled version; the SOURCE option tells SPIRES to remove the record from the RECDEF subfile too.

C.7.2  The EXT-REC and EXT-LINK Statements

The EXT-REC and EXT-LINK statements tell SPIRES to retrieve other file definitions from the FILEDEF subfile to use in compiling the current file definition's record-types or linkage sections. Specifically, the EXT-REC statement names a file definition whose record definitions are to be compiled as part of the current file. Similarly, the EXT-LINK statement names a file definition whose linkage sections are to be compiled as part of the current file.

These features are especially useful in two types of situations:

The EXT-REC and EXT-LINK statements are coded as follows:

where "gg.uuu.filename" is the FILE value of the other file definition. Only file definitions belonging to your own account (i.e., whose FILE values begin with your account number) can be specified. You cannot use this technique to compile record definitions and/or linkage sections from file definitions belonging to other accounts.

These two multiply-occurring statements are record-level elements in a FILEDEF record, appearing at the end of the file definition.

The file definition named in EXT-REC or EXT-LINK statements does not need to be compiled. SPIRES uses the source record, not its compiled counterpart, when compiling a file definition containing either of these statements. This procedure is thus quite different from the RECDEF procedure. [See C.7.1, C.7.3.] Note that the file definition named in the EXT-REC or EXT-LINK statement should not itself have EXT-REC or EXT-LINK statements in it; no nesting of these statements is allowed.

When EXT-REC is coded, only the record definitions from the source record are retrieved; that means that any element information packets must be defined within the record definition, and not at record-level in the source file definition. Alternatively, they can appear at record-level in the primary file definition, or be invoked from an EXTDEF record. [See C.10.5, C.13.5.]

For user-defined processing rules (procs) that are used in Inproc and Outproc rule strings in a source record definition, the rule definitions should appear in the primary file definition (after the linkage section), as well as in the source definition if that source definition is to be compiled. [See C.10.] Again, alternatively, they can be invoked from an EXTDEF record.

Userprocs that appear in Inproc and Outproc rule strings in a source record must be defined in the source record definition that refers to them; there is no alternative. [See C.11.1.]

When EXT-LINK is coded, similar rules apply. Again, only the linkage sections from the source file definition are retrieved. To be compiled into the primary file definition, any index information packets must be defined within the source linkage section, and not at record-level in the source file definition. Alternatively, they can appear at record-level in the primary file definition, or be invoked from an EXTDEF record.

For procs used in Searchproc and Passproc rule strings in the source linkage section, the rule definitions should appear in the primary file definition (after the linkage section), as well as in the source definition if that source definition is to be compiled. [See C.10.] Again, alternatively, they can be invoked from an EXTDEF record.

Userprocs that appear in Searchproc and Passproc rule strings in a source record must be defined in the record definition of the goal record-type for that linkage section, wherever it is defined.

A file definition containing EXT-REC statements may still have its own record definitions; a definition with EXT-LINK statements may still have its own linkage sections as well. However, the order of compilation becomes very important in such cases.

When SPIRES compiles the record definitions of a file definition, it first looks to see whether any EXT-REC statements exist. If so, then the file definition referred to in the first EXT-REC statement is fetched, and its record definitions are compiled. Then the record definitions in the file definition named by the next EXT-REC statement are compiled, and so forth. When no more EXT-REC statements are left, SPIRES will compile the record definitions in the actual file definition being compiled.

SPIRES requires that the record-types be compiled in order by record-name. The FILEDEF subfile always sorts the record definitions in a file definition so that they appear in order by record-type. If an EXT-REC statement appears, all of the record-types in the named file definition must sort before the record-types in the file definition being compiled.

For example,

The above arrangement will work fine since the record-type REC01 defined in the EXT-REC record GQ.DOC.PATRONS sorts before REC10 in the file definition being compiled. On the other hand, if the record name in GQ.DOC.PATRONS were REC20, the GQ.DOC.LIBRARY definition would not compile properly, since SPIRES would compile REC20 and then try to compile REC10, which would cause an error.

The same rules apply to the EXT-LINK statement. If specified, the linkage sections in the file definition it names are compiled before those in the file definition being compiled. The GOALREC-NAME statements must be in ascending order for SPIRES to compile the linkage sections properly. Those GOALREC-NAME statements in the external file definition must sort before those in the file definition being compiled. Otherwise a compilation error will occur.
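The ordering requirement can be checked mechanically. The Python sketch below models the compilation order for record-types; the function names are hypothetical, and ordinary Python string comparison stands in for whatever collating sequence SPIRES actually uses. The same check would apply to GOALREC-NAME values for EXT-LINK.

```python
def compilation_order(external_defs, local_records):
    """Return record-type names in the order they would be compiled.

    external_defs -- one list of record names per EXT-REC statement,
                     in the order the statements appear
    local_records -- record names in the file definition being compiled
    Each definition's own records are taken in sorted order.
    """
    order = []
    for records in external_defs:
        order.extend(sorted(records))
    order.extend(sorted(local_records))
    return order


def check_ascending(order):
    """Raise ValueError at the first name that does not sort after its predecessor."""
    for previous, current in zip(order, order[1:]):
        if current <= previous:
            raise ValueError(current + " would be compiled after " + previous)
```

With REC01 in the external definition and REC10 in the main one, the order is REC01, REC10 and the check passes; with REC20 in the external definition, the order is REC20, REC10 and the check fails.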

In most situations, the above warnings are not important. The most common use for EXT-LINK and EXT-REC is to split a large file definition into two or more pieces because it will not fit into the FILEDEF subfile as a single record. In such cases, all of the record definitions or all of the linkage sections are placed into a separate file definition, which is added to the FILEDEF subfile. This "dummy" file definition is not itself compiled -- it is used only by the main file definition, which calls it via EXT-REC or EXT-LINK statements.

Another use of this feature is to produce multiple files with the same file structure. Suppose, for example, that you want to have several files with the same file structure as your file GQ.DOC.PRIMARY. For each file, your file definition can be as simple as this:

In other words, all you need are FILE, EXT-REC and EXT-LINK statements and a subfile section. You then add the definition to the FILEDEF subfile and compile it.

This procedure is very handy when you want to change such files. If you want to make a change in a record definition or linkage section, you make the change in only one file definition (GQ.DOC.PRIMARY in the example), though you must still recompile all of them. That is much simpler than making the change in every file definition and then recompiling them all.

C.7.3  Comparing the DEFINED-BY and EXT-REC / EXT-LINK Statements

In most cases, a user learns about the DEFINED-BY/RECDEF or EXT-REC/EXT-LINK techniques because a file definition has become too large to fit in the FILEDEF subfile. [See C.7.1, C.7.2.] Sometimes removing unnecessary statements (perhaps COMMENTS) can shrink the definition adequately, but these other techniques are also available. This section will compare these two techniques to help you choose the technique most appropriate for your situation.

Both of these techniques are easy to use. In both cases, you simply move some code into another record or records and add another statement to the main file definition. The RECDEF technique requires some extra work, since the record definition must be separately compiled.

The separate compilation of record definitions can be a benefit, however -- it makes compilation of the actual file definition quicker. If changes are made to the file definition in areas other than the separately compiled record definitions, a recompilation of the file will not require recompilation of those record-types. Using the EXT-REC technique, all the record-types would be recompiled during a recompilation of the file, even though they may not have needed to be.

The RECDEF technique makes code management a bit more difficult, however. It adds another type of goal record (the record definition) to another system subfile, RECDEF. Moreover, if several record-types are moved into RECDEF, then each must have its own record in RECDEF. The EXT-REC technique keeps the pieces in one subfile, FILEDEF, though they are in several different goal records.

Another important distinction between the two techniques is that a file owner can compile other users' RECDEF records into his or her own file -- that is, the DEFINED-BY statement can name record definitions belonging to other accounts. (Subfile security prevents other users from displaying, adding, removing or seeing the keys of records in RECDEF other than their own, which means a user must know the key of someone else's RECDEF record in order to use it in the DEFINED-BY statement.) The EXT-REC and EXT-LINK statements can name only file definitions belonging to the user. Hence, the DEFINED-BY technique permits some code-sharing across accounts, while the EXT-REC technique does not.

C.8  FASTBILD

FASTBILD is a SPIRES program you can use to add multiple records into an empty subfile, more efficiently than in SPIBILD in many cases. Documentation for FASTBILD appears in the manual "SPIRES File Management". Online, EXPLAIN FASTBILD.

C.9  File Definition Compilation Diagnostics

C.9.1  Determining the Location of Errors

If SPIRES detects errors in a file definition during the compilation process, diagnostics will be issued. In addition to a diagnostic (such as "INVALID ACTION CODE") the compiler will generate a trace to pinpoint the location of the error.

Errors that occur in the record definitions for goal or index records would be traced as follows:

Here the definer would quickly see that the length of a fixed length (FIXED-REQ) element was omitted when the definition was added to the FILEDEF subfile. The element is "SALARY" which is inside a structure "WAGES" which is in REC01. If syntactical errors are discovered in the value of an element, the portion of the element's value that was incorrect will be reported as "EXTRANEOUS"; this message is most often issued when processing rules are incorrectly specified. If additional errors are discovered during compilation, they too would be reported and traced. Finally, the message "COMPILE ERROR" will appear to indicate the end of an unsuccessful compilation; if the compilation were successful, the "COMPILE SUCCESSFUL" message would be issued.

If the compiler detects an error in the linkage section, quite a different trace might be produced:

INVALID REC NAME. INDEX-NAME=REC02, ITEM=RC03, GOALREC-NAME=REC01
COMPILE ERROR

This "trace" indicates that the linkage section that links REC01 (the goal record) to REC02 (the index record) contains an error. The RECORD-NAME or "REC NAME" supplied was misspelled in the file definition as "RC03" (instead of "REC03").
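
When many diagnostics must be reviewed, it can help to split a linkage trace into its parts. The following is only a sketch in Python, not a SPIRES facility; the function name is hypothetical, and it assumes the trace always has the shape shown above (a message, a period, then NAME=value pairs separated by commas).

```python
def parse_linkage_trace(line):
    """Split a linkage-section diagnostic into (message, fields).

    Assumption: the line looks like
    'MESSAGE. NAME=value, NAME=value, ...' as in the example trace.
    """
    message, _, rest = line.partition(". ")
    fields = {}
    for part in rest.split(","):
        name, sep, value = part.strip().partition("=")
        if sep:  # keep only well-formed NAME=value pairs
            fields[name] = value
    return message, fields


message, fields = parse_linkage_trace(
    "INVALID REC NAME. INDEX-NAME=REC02, ITEM=RC03, GOALREC-NAME=REC01"
)
```

Here "fields" would hold the index record name, the offending item and the goal record name under the keys shown in the trace.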

The "EXTRANEOUS" message might also be issued in the linkage section if a PASSPROC or SRCPROC is invalid:

The error is in the PASSPROC rule string in the linkage section linking the goal record REC01 to the index record REC02. The part of the rule string that is labelled extraneous should be examined for invalid syntax, and the rule before the extraneous portion should be checked also.

C.9.2  ATCHFILE, INITFILE and Other Error Messages

In addition to "COMPILE ERRORS," there are two other kinds of errors that commonly occur. These are ATCHFILE and INITFILE errors. Whenever such an error occurs, an error code is given. This code number can be explained via the EXPLAIN command. For example:

The explanation for this error can be obtained by the command "EXPLAIN S56".

The following are the common ATCHFILE and INITFILE errors:

S40

An attempt to define a new file failed because an old file by the same name already exists.

S437

During a recompile of an existing file definition, the SPIRES compiler detected data set assignment inconsistencies between the new definition and the existing file. Record types which previously were combined physically must remain as they were. New record types cannot be inserted between existing record types. Remember that the record types are sorted in ascending order by record name.

S439

The file definition being compiled is attempting to assign more physical ORVYL data sets than the system allows. SPIRES allows a maximum of nine record type data sets. Perhaps several logical record types can be combined with one another to form fewer physical data sets.

S44

System tables overflowed.

S446

An attempt was made to recompile a file and change the number of record types in the file. The deferred queue of the file must be empty before this can be done. You must either PROCESS the file using SPIBILD, or issue the DEQ ALL command in SPIRES. Remember that BATCH requests issued in the SPIRES processor are retained in the DEFQ and will be thrown away with the DEQ ALL or Online SPIBILD processing.

S56

The file is busy, not currently available.

S60

The file name supplied is illegal.

S68

The file is busy, not currently available.

S72

Access is not permitted. ORVYL files are not allowed for your account. Check with the Accounting office.

S76

An attempt to define a new file failed because an old file by the same name already exists.

INITFILE errors other than the ones explained above may occur during a recompile, and commonly indicate that bad data may exist in some of the records of the file. Such errors should be brought to the attention of the SPIRES consultant before any updating or processing of the file is done.

Some other types of errors are very uncommon. They may indicate a SPIRES system problem or bad data existing in a file being recompiled. The error types are:

All are reported with a code number that can be explained. If the error message indicates a system malfunction rather than an error in your definition, please report the error to the SPIRES consultant.

C.9.3  Compile Diagnostics and Errors

The messages and explanations for compilation diagnostics follow:

C.9.3.1  * ACCOUNT MISMATCH

ACCOUNT MISMATCH

You have specified an account prefix on the file name in a COMPILE or RECOMPILE command, and the account is not the same as the one you are logged on under.

C.9.3.2  * AET TABLE OVERFLOW

AET TABLE OVERFLOW

Too many elements have been defined for the record-type. Reduce the number of elements in the record definition. There is a limit of 1,024 element definitions per record-type.

C.9.3.3  * COMBINE RECORD INVALID

COMBINE RECORD INVALID

The COMBINE statement has been coded in a SLOT record-type, or the COMBINE statement in another record-type names a SLOT record-type. A SLOT record-type cannot combine with other SLOT record-types or with non-SLOT record-types.

Another possible cause of this message is that the record-type being compiled names a record-type in its COMBINE statement that has not yet been compiled. Remember that SPIRES compiles the record-types in sort-order by their names. Hence, if REC02 has "COMBINE = REC03", this error would occur, since REC03 has not yet been compiled.
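
The ordering rule just described can be checked before compiling. The sketch below is an illustration in Python, not part of SPIRES; it assumes that the compile order is a plain ascending sort of the record-type names, and the function name is hypothetical.

```python
def combine_order_problems(combines):
    """Return (record, target) pairs whose COMBINE target is not yet compiled.

    'combines' maps a record-type name to the name in its COMBINE statement.
    Assumption: record-types compile in ascending order of their names, so a
    target that sorts after the record naming it has not been compiled yet.
    """
    problems = []
    for record, target in sorted(combines.items()):
        if target > record:
            problems.append((record, target))
    return problems


# REC02 names REC03, which sorts later, so it would be reported.
late = combine_order_problems({"REC02": "REC03"})
# REC03 names REC02, which sorts earlier, so nothing is reported.
fine = combine_order_problems({"REC03": "REC02"})
```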

C.9.3.3.0  * NAME PCT TABLE FULL

NAME PCT TABLE FULL

A PCT or "packed character table" is a table in which duplicate strings and stems are not duplicated, but overlap.

If you are compiling a file definition, too many mnemonics and aliases have been defined for the record-type. Reduce the number or the length of the mnemonics and aliases. If you were compiling a format or protocol, there are too many (or too long) strings. Reduce the number or length of them.

C.9.3.3.0a  * PACKED CHAR TABLE FULL

INPROC/OUTPROC/SRCPROC/PASSPROC ERROR -- PACKED CHAR TABLE FULL

Either too many character strings or character strings that are too long (or a combination of both) are coded into a particular type (PASSPROC, SEARCHPROC, etc.) of processing rule. Try to simplify the processing rules of this type by shortening some of the strings, or see your SPIRES consultant for more help.

C.9.3.3.0b  * ELEMENT TABLE FULL

ELEMENT TABLE FULL

This table, used in element processing for the file, has become full. Please see your SPIRES consultant for further help.

C.9.3.3.0c  * USERPROC TABLE FULL

USERPROC TABLE FULL

This table, used in USERPROC processing for the file, has become full. Please see your SPIRES consultant for further help.

C.9.3.3.1  * LABL PCT TABLE FULL

LABL PCT TABLE FULL

A packed character table has too many characters in it. Please see the SPIRES consultant.

C.9.3.3.1a  * IMT TABLE FULL

IMT TABLE FULL

An internal table has too many characters in it. Please see the SPIRES consultant.

C.9.3.3.1b  * AET TABLE FULL

AET TABLE FULL

An internal table has too many characters in it. Please see the SPIRES consultant.

C.9.3.3.1c  * LABEL TABLE FULL

LABEL TABLE FULL

An internal table has too many characters in it. Please see the SPIRES consultant.

C.9.3.3.2  * TOO MANY RECORD TYPES

TOO MANY RECORD TYPES

The number of logical record-types (defined by RECORD-NAME statements) that can be defined for a single file is sixty-four. If you have more than this number, you must define some of them in another file.

C.9.3.3.2a  * TOO MANY REAL RECORD TYPES

TOO MANY REAL RECORD TYPES

The number of physical record-types that can be defined in a single file is nine. If you must have more than nine RECORD-NAME statements, see if the COMBINE option can be used with some of them.

C.9.3.3.3  * TOO MANY SLOT TYPES

TOO MANY SLOT TYPES

Too many SLOT record-types have been defined for the file. The current limit is four. See if the augmented key processing rule (action 35) can be used to reduce the number of SLOT record-types.
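
The three limits described in this group of messages (sixty-four logical record-types, nine physical record-types, four SLOT record-types) can be counted ahead of time. The following Python sketch is only an approximation: it assumes that every record-type without a COMBINE statement occupies its own physical data set, and the dictionary layout and function name are hypothetical.

```python
MAX_LOGICAL = 64   # RECORD-NAME statements per file
MAX_PHYSICAL = 9   # physical record-type data sets per file
MAX_SLOT = 4       # SLOT record-types per file


def check_limits(record_types):
    """Return the diagnostics a definition would be expected to trigger.

    'record_types' is a list of dicts with a 'name' and optional
    'combine' (name of another record-type) and 'slot' (True/False).
    Assumption: a record-type without COMBINE is one physical data set.
    """
    diagnostics = []
    physical = sum(1 for r in record_types if not r.get("combine"))
    slots = sum(1 for r in record_types if r.get("slot"))
    if len(record_types) > MAX_LOGICAL:
        diagnostics.append("TOO MANY RECORD TYPES")
    if physical > MAX_PHYSICAL:
        diagnostics.append("TOO MANY REAL RECORD TYPES")
    if slots > MAX_SLOT:
        diagnostics.append("TOO MANY SLOT TYPES")
    return diagnostics


ten_plain = [{"name": "REC%02d" % n} for n in range(1, 11)]
```

With ten uncombined record-types, as in "ten_plain", the sketch reports the physical limit; combining one of them with another brings the count back to nine.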

C.9.3.3.4  * XEQ TYPE MUST BE VARIABLE

XEQ TYPE MUST BE VARIABLE

An element whose type is XEQ must be coded in the REQUIRED section of a record definition, and no LEN can be specified.

C.9.3.4  * DEFINED-BY RECORD TABLES NOT FOUND

DEFINED-BY RECORD TABLES NOT FOUND

The RECHAR record referenced in a DEFINED-BY statement does not exist or has not been compiled. Check to see that the proper account and RECDEF name have been given.

C.9.3.5  * EXTRANEOUS string

EXTRANEOUS: string

The most common error message is "EXTRANEOUS," with some string following the error message. This almost always indicates that a processing rule string has been incorrectly coded. If such an error occurs in a record definition, then a trace will be provided to locate which processing rule string for which element is at fault.

But if the error occurs in the examination of a PASSPROC or SEARCHPROC rule string, a trace will probably not be given. In this case, the occurrence of the invalid rule string following the EXTRANEOUS message can be located most easily with WYLBUR or Global FOR. In a few cases, only the messages

will be output, with no indication as to what was extraneous or erroneous. This would indicate that a required PASSPROC rule, such as A45 or A38, was not found where it was expected by the system.

C.9.3.6  * FILE EXISTS

FILE EXISTS

During a COMPILE command, the system sensed that one or more of the ORVYL files it was building for a new file already exist under your account; probably, a file with the same name has been compiled previously. You may ZAP the old file (if it is a SPIRES file), change the name of the file you are trying to compile, or use the RECOMPILE command (if you are changing the definition of an old file). Use the RECOMPILE command only if the new definition will not invalidate data in the present file.

C.9.3.7  * FILE NOT AVAILABLE

FILE NOT AVAILABLE

Another user has a subfile of the file you are trying to recompile selected. You cannot recompile while any user has the file selected or attached, unless you add the SHARE option to the RECOMPILE command. [See B.5.9.]

C.9.3.8  * ILLEGAL 2ND PROPERTY

ILLEGAL 2ND PROPERTY

Two elements in the same record-type have been defined as SPLIT, or two elements have been defined as SYNONYM, or one element has been defined as both SYNONYM and SPLIT. Or, MAXVAL may have been given a value greater than 32,767, or an element has been coded without a mnemonic. Also, it is possible that an element has been defined as both REDEFINES and TYPE=STR.

C.9.3.9  * INVALID ACTION CODE

INVALID ACTION CODE

An action with an invalid action number has been coded, or the action coded cannot be used in the defined context (such as using an OUTPROC as a PASSPROC).

C.9.3.9a  * ELEMENT MUST BE TYPE STRUCTURE

ELEMENT MUST BE TYPE STRUCTURE

An action has been coded that can be used only with elements that are structures, and this element has not been defined as a structure.

C.9.3.10  * INVALID ACTION SYNTAX

INVALID ACTION SYNTAX

A processing rule or rules have been incorrectly coded. Check the rule immediately following the EXTRANEOUS message (and the entire rule string if necessary) for the following:

Other errors may result in this message, but these are the most frequent. If the syntax error occurs on a PASSPROC, try: EXPLAIN PASSPROC SYNTAX.

C.9.3.10a  * ILLEGAL ACTION SEQUENCE

ILLEGAL ACTION SEQUENCE

The actions are not allowed to be in the order given. For example, this message is issued when an INCLOSE rule is followed by a non-INCLOSE rule, since INCLOSE rules, if they appear at all, must end an INPROC string.

C.9.3.10b  * INVALID ACTION GROUP

INVALID ACTION GROUP

This error message indicates that a processing rule has been coded in an inappropriate statement -- for example, a rule restricted to use as an INPROC has been coded in an OUTPROC statement.

C.9.3.11  * INVALID CINDEX

INVALID CINDEX

The CINDEX statement has named an element that is not defined in the compound index record-type, or the element has been defined improperly.

C.9.3.12  * INVALID MNEMONIC

INVALID MNEMONIC

A PASSPROC rule, such as A167, has named an element that is not defined, possibly because of a misspelling. The invalid element name is given in the ITEM message.

C.9.3.13  * INVALID NAME LEN

INVALID NAME LEN

One single term in a SEARCHTERMS statement or EXTERNAL-NAME statement is more than sixteen characters long.

C.9.3.13a  * INVALID LOCATOR LENGTH

INVALID LOCATOR LENGTH

The element was identified as a locator, which is by definition 4 bytes long, but a LEN statement for the element gives a different length. Unless it really isn't a locator, you might as well remove the LEN statement.

C.9.3.14  * INVALID REC NAME

INVALID REC NAME

An INDEX-NAME or GOAL-RECORD statement has named a record-type that is not defined by a RECORD-NAME statement.

C.9.3.15  * INVALID SEQUENCE

INVALID SEQUENCE

A SUB-INDEX statement has not named an element that is the name of a structure in the index record-type being linked to. Possibly, the element named is either not a structure element, or it is a structure in a structure.

If no SUB-INDEX statement is involved, there may be a mismatch between the PTR-ELEM name and the name of the key of the PTR-GROUP structure. You may be able to correct this problem by adding aliases to the element.

C.9.3.16  * KEY ELEMENT ERROR

KEY ELEMENT ERROR

A REQUIRED key has been given a LEN specification, and there is a FIXED section of the record definition; move the key element to the FIXED section. Or, the key element has been defined in the OPTIONAL section, or it has been defined in both the FIXED and REQUIRED sections. Note that these errors can also occur when the key of a structure has been specified improperly.

C.9.3.17  * LENGTH VALUE > 255

LEN VALUE > 255

The length of a fixed length element cannot be longer than 255 characters.

C.9.3.18  * LENGTH NOT GIVEN

LENGTH NOT GIVEN

An element in the FIXED section of the record or structure does not have a LEN specification.

C.9.3.19  * DUPLICATE ELEMENT NAMES IN STRUCTURE

DUPLICATE ELEMENT NAMES IN STRUCTURE -- ITEM = value

Two or more elements in the same structure or at the record level have been given the same name or share a common alias.

C.9.3.19a  * DUPLICATE VARIABLE NAME

DUPLICATE VARIABLE NAME -- ITEM = value

Two or more variables in the same USERDEFS section of the record definition have been given the same name.

C.9.3.20  * NO ELEMENTS ALLOWED WITH DEFINED-BY VALUE

NO ELEMENTS ALLOWED WITH DEFINED-BY VALUE

If a record definition contains the DEFINED-BY statement, no other statements may be coded in the record definition except the COMBINE statement.

C.9.3.21  * NO ELEMENTS IN RECORD

NO ELEMENTS IN RECORD

A RECORD-NAME statement was coded, but no elements were defined for the record.

C.9.3.22  * NO ELEMENTS IN STRUCTURE

NO ELEMENTS IN STRUCTURE

An element of TYPE=STR has been defined, but no STRUCTURE statement appears to define the elements in that structure.

C.9.3.23  * NO RECORD TO COMPILE

NO RECORD TO COMPILE

The file name you have given in the COMPILE or RECOMPILE command does not match the FILE statement of any file definition stored under your account. Check your spelling.

C.9.3.24  * NON FIXED ELEMS FOR SLOT

NON FIXED ELEMS FOR SLOT

A SLOT record-type has been coded, but not all elements in it are in the FIXED section. Either REMOVED must be coded for the record-type, or all elements must be fixed in length and occurrence.

C.9.3.25  * ONE UNIQUE ID PER RECORD

ONE UNIQUE ID PER RECORD

The system can only generate one unique value or slot number per record type. Thus, the slot generation INPROC cannot be coded in a SLOT record-type, and it cannot be coded more than once in any record-type.

C.9.3.26  * PROCESSING RULE TABLES FULL

PASSPROC/INPROC/OUTPROC TABLES FULL

Too many processing rules have been defined for the record-type or linkage, or too many strings are in actions coded for the record-type or linkage.

C.9.3.27  * PROCESSING RULE TABLE OVERFLOW

PROCESSING RULE TABLE OVERFLOW

Too many processing rules have been coded for the record-type. The system limit is currently about 2,500 for INPROC rules and about 2,500 for OUTPROC rules.

C.9.3.28  * RECORD KEY ELEMENT MISSING

RECORD KEY ELEMENT MISSING

A non-slot record-type does not have a key element specified, or the key element is incorrectly specified.

C.9.3.28a  * IMMEDIATE INDEX IS ALSO IMMEDIATE GOAL

IMMEDIATE INDEX IS ALSO IMMEDIATE GOAL

This is a warning message. A record-type serving as an immediate index has also been defined as a goal record-type that has its own immediate indexes. That is allowed.

But you should ensure that the same data being passed immediately to that record-type is not then being passed immediately to the other indexes, because that will not work. Unfortunately, SPIRES cannot determine whether it is the same data or not; all it can do is warn you that the file definition has the potential for this type of trouble.

C.9.3.28b  * TREE-DATA VALUE INVALID

TREE-DATA VALUE INVALID

This message usually indicates a problem with the START-KEY statement in the record-type being compiled. Some possible reasons for invalid values:

Another possibility is that the TREE-PREFIX statement does not specify a valid number of characters based on the length of the START-KEY value. [See B.6.5a.]

C.9.3.29  * INVALID SLOTCHECK VALUE

INVALID SLOTCHECK VALUE

Only "SLOTCHECK;" or "SLOTCHECK=n;" where "n" is 0 to 4 can be coded. EXPLAIN A27 for information about the different check-digit formulas represented by the values 0 to 4.

C.9.3.29a  * INVALID ELEM MNEMONIC

INVALID ELEM MNEMONIC ITEM = name

A "name" given in an action that may accept the name of an element (such as A32 or $LOOKUP) is invalid, or is missing.

C.9.3.29b  * ACTION 32 RECORD recname NOT VALID

ACTION 32 RECORD recname NOT VALID

If SPIRES displays this error message, replacing "recname" with the name of the record-type being accessed by action A32 or system proc $LOOKUP, the named record-type is not defined in the file definition. Check for spelling errors.

C.9.3.29c  * USEMPROC error

USEMPROC error

Usually this USEMPROC error indicates a looping type of problem in a record definition, such as a structure that contains itself as an element. Each time SPIRES tries to compile the structure, it must first compile the structure inside, a looping process that quickly fills up some internal tables called USEMPROC tables. Please see your SPIRES consultant if this does not seem to explain the problem for your situation.

C.10  Processing Rule-String Procedures

Many times the same sequence of processing rules will appear in several places in a file definition. Repeatedly coding the same rules is both tedious and error prone. To avoid this, you may want to define processing rule procedures.

C.10.1  The PROC and RULE Statements

A procedure, or "PROC," is a processing rule or a string of processing rules that may be called from any INPROC, OUTPROC, SEARCHPROC or PASSPROC in a file definition. ("System procs", a collection of procedures that can be invoked by any file definition or format, are based on the "proc" capability. See the reference manual "SPIRES System Procs" for further information.)

The declaration or definition of a PROC requires only two statements:

PROC definitions are placed near the end of the file definition, just before the subfile section. (They are placed at the very end of format definitions.)

The PROC statement may specify almost any name for the PROC that does not look like an action. For example, "ALPHA.IN" is valid, but "AS148.IN" is not. To avoid confusion with system procs, it should not begin with a dollar sign. The PROC name may be up to sixteen characters long, and may not contain a blank or any of the following: ;:"/',<>()|&~@=
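
The naming rules above can be summarized as a small check. This is a sketch in Python, not SPIRES code. The manual does not define exactly what "looks like an action" means, so the pattern used here (the letter A, an optional letter, then a digit) is an assumption chosen to accept "ALPHA.IN" and reject "AS148.IN"; the function name is hypothetical.

```python
import re

# Blank plus the characters listed above as not allowed in a PROC name.
FORBIDDEN = set(" ;:\"/',<>()|&~@=")

# Assumption: an action-like name is 'A', an optional letter, then a digit.
ACTION_LIKE = re.compile(r"^A[A-Z]?\d")


def proc_name_problems(name):
    """Return a list of reasons the PROC name may be rejected (empty if none)."""
    problems = []
    if not name or len(name) > 16:
        problems.append("length must be 1 to 16 characters")
    if any(ch in FORBIDDEN for ch in name):
        problems.append("contains a blank or forbidden character")
    if ACTION_LIKE.match(name.upper()):
        problems.append("looks like an action")
    if name.startswith("$"):
        problems.append("begins with a dollar sign (advisory)")
    return problems
```

The dollar-sign test is reported as advisory because the text only recommends against it.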

The syntax of the RULE statement is identical to the syntax of any INPROC, OUTPROC, SEARCHPROC or PASSPROC rule string. The RULE statement may even contain the name of another PROC, as long as that PROC's definition does not in turn call another PROC; that is, a single level of nesting is allowed. Be aware that if you include any system procs in a proc definition, you are in fact using that single level of nesting. [The $PHONETIC.SEARCH proc, since it calls another system proc, cannot be called from a user-defined proc.]

Here is an example of how an element definition might use PROCs in its INPROC and OUTPROC statements:

The PROCs that would be defined are:

PROC definitions may be specified in any order, but SPIRES will always place them in ascending order by PROC name. No two PROCs may have the same name.

C.10.2  The PARM and DEFAULT Statements

For more sophisticated and elaborate processing rule procedures, certain parts can be defined as "variable." When the procedure is called, a value may be specified for the "parm" (the parameter or variable). If no value is specified, a default can be applied.

There can be several parms for a particular RULE statement. Each parm can have a default, but the default is not required. Each parm is named in a PARM statement; if a default applies for a particular PARM, the default must be specified in a DEFAULT statement immediately after the PARM statement to which it applies.

The following example shows a simple use of the PARM and DEFAULT statements. Action 24, range test for minimum and maximum from a previous binary conversion, is used.

The '&' is used to indicate that the name of a PARM follows. PARM names may be up to sixteen characters. PARM values may be as long as 255 characters. The 255-characters limit applies to default values, user-supplied values or values defined in the VALUE statement. [See C.10.3.]

This PROC could be invoked in a file definition by any of the following forms:

The PARMs are evaluated positionally within the parentheses. The first value encountered within the parentheses is taken to be the value of the first PARM declared in the PROC (in the example, it is the PARM called MINIMUM). If no value is specified (but a comma is coded to "hold" its place), then the DEFAULT, if any, for the PARM is used.

If the last PARM values are not needed by a particular PROC, then no trailing commas need be coded.

The use of the apostrophe (') character in coding processing rules can also be made easier by defining appropriate PARM (and DEFAULT) statements. For example:

Then:

Note how a null string is specified for a DEFAULT. This is not really necessary, since every PARM has a DEFAULT of null if no DEFAULT is specifically coded. Thus, the absence of a DEFAULT statement indicates a DEFAULT of the null string.

If the PROC had been coded as:

Then:

If a "null" value is supposed to replace a parameter which has a non-null default value, as in the above example, then a null apostrophe string ('') is used as a parameter value.
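
The positional rules in this section can be modelled in a few lines. The Python sketch below is an illustration only, not how SPIRES implements PARMs. It assumes the caller's values have already been split at the commas, with None standing for a place held only by a comma or left off the end; the function name is hypothetical.

```python
def resolve_parms(declared, supplied):
    """Pair each declared PARM with its value, applying defaults.

    'declared' is a list of (name, default) in PARM-statement order;
    a default of None means no DEFAULT statement was coded.
    'supplied' lists the caller's values in order; None marks a place
    held by a comma, and missing trailing values are treated the same.
    """
    resolved = {}
    for position, (name, default) in enumerate(declared):
        value = supplied[position] if position < len(supplied) else None
        if value is None:
            # No value given: use the DEFAULT, or null if none was coded.
            value = default if default is not None else ""
        elif value == "''":
            # A null apostrophe string overrides a non-null default.
            value = ""
        resolved[name] = value
    return resolved


# MINIMUM has a DEFAULT of 0; MAXIMUM has no DEFAULT statement.
range_parms = [("MINIMUM", "0"), ("MAXIMUM", None)]
```

For instance, supplying only a second value leaves MINIMUM at its default, while supplying '' for MINIMUM replaces the default with a null value.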

C.10.3  The SYMBOL and VALUE Statements

When defining a PARM, it is possible that literals may be more meaningful than various numeric P1 (or P2, P3 or P+) parameters. For example, it may be more meaningful to allow a value for a PARM for A49 (tests to see if the input value contains or does not contain a specified character string) to be "LIKE" or "UNLIKE" rather than the P1 parameters of 0 or 1.

A particular PARM can have one or more literals defined for it. These literals are defined in SYMBOL statements immediately following the PARM statement (or DEFAULT statement if one is present) for which the SYMBOL applies. Each SYMBOL must have one and only one VALUE defined for it; the value is specified in a VALUE statement immediately after the SYMBOL statement.

For example, the following rule definition makes A49 easier to use:

The above PROC could be invoked by any of the following:

The symbol, when specified in the proc, should not be enclosed in apostrophes or quotation marks; if it is, it will be interpreted as a literal value to be passed to the rule.

The rules specified above [See C.10.2.] for specifying null DEFAULT values apply to the VALUE statement also.
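
The effect of SYMBOL and VALUE on a supplied PARM value can be pictured as a table lookup. The Python sketch below is a simplified model, not SPIRES code; it assumes symbols are matched exactly as typed, and the function name is hypothetical.

```python
def apply_symbols(value, symbols):
    """Replace a symbol with its VALUE; leave quoted or unknown values alone.

    'symbols' maps each SYMBOL name for one PARM to its VALUE.
    A value enclosed in apostrophes or quotation marks is treated as a
    literal to be passed to the rule, even if it spells a symbol name.
    """
    if value[:1] in ("'", '"'):
        return value
    return symbols.get(value, value)


# The LIKE and UNLIKE symbols suggested above for the P1 parameter of A49.
a49_type = {"LIKE": "0", "UNLIKE": "1"}
```

In this model, UNLIKE becomes 1, but 'UNLIKE' in apostrophes is passed through unchanged as a literal string.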

C.10.4  Processing Rule Procedure Examples

The following PROC definitions show further examples of the use of the PARM, DEFAULT, SYMBOL and VALUE statements.

PROC = VALIDATE; RULE = AS46:&TYPE,&VALUES;
                 PARM = TYPE; DEFAULT = 3;
                              SYMBOL = ACCEPT; VALUE = 3;
                              SYMBOL = REJECT; VALUE = 0;
                 PARM = VALUES;
PROC = TEST;     RULE = AS49:&TYPE,&CHARS;
                 PARM = TYPE; DEFAULT = 0;
                              SYMBOL = LIKE; VALUE = 0;
                              SYMBOL = UNLIKE; VALUE = 1;
                 PARM = CHARS; DEFAULT = '1234567890';
                              SYMBOL = LETTERS; VALUE = 'ABCDEF
                                                GHIJKLMNOPQRST
                                                UVWXYZ';
                              SYMBOL = NUMBERS; VALUE = '1234567890';
                              SYMBOL = BLANKS;  VALUE = ' ';
PROC = UPPER;    RULE = A30;
PROC = BLANKOUT; RULE = A40/A51;
PROC = REPLACE;  RULE = A44:&LENGTH,&INPUT,&OUTPUT;
                 PARM = INPUT;
                 PARM = OUTPUT;
                 PARM = LENGTH; DEFAULT = 0;
PROC = TRANSLATE; RULE = A43,&INPUT,&OUTPUT;
                  PARM = INPUT;
                  PARM = OUTPUT; DEFAULT = ' ';
PROC = RANGE;     RULE = AS24,&MAXIMUM,&MINIMUM;
                  PARM = MINIMUM; DEFAULT = 0;
                  PARM = MAXIMUM;
PROC = BINARY-IN; RULE = AS21:&SIZE;
                  PARM = SIZE; DEFAULT = 4;
                               SYMBOL = FULLWORD; VALUE = 4;
                               SYMBOL = FULL; VALUE = 4;
                               SYMBOL = HALFWORD; VALUE = 2;
                               SYMBOL = HALF; VALUE = 2;
                               SYMBOL = BYTE; VALUE = 1;
                  PARM = ERROR; DEFAULT = S;
PROC = LENGTH;    RULE = AW22:&RECOVERY,&LENGTH/
                         AW23:&PAD,&LENGTH;
                  PARM = LENGTH;
                  PARM = RECOVERY; DEFAULT = 0;
                               SYMBOL = RETAINALL; VALUE = 0;
                               SYMBOL = RETAINFIRST; VALUE = 1;
                               SYMBOL = RETAINLAST; VALUE = 3;
                               SYMBOL = DROPFIRST; VALUE = 3;
                               SYMBOL = DROPLAST; VALUE = 4;
                  PARM = PAD; DEFAULT = 0;
                               SYMBOL = PAD; VALUE = 1;
PROC = INSERT;    RULE = A36:&HOW,&WHAT,&WHERE;
                  PARM = WHAT; DEFAULT = ' ';
                  PARM = HOW; DEFAULT = 3;
                               SYMBOL = AFTER; VALUE = 0;
                               SYMBOL = BEFORE; VALUE = 3;
                               SYMBOL = FRONT; VALUE = 1;
                               SYMBOL = BACK; VALUE = 2;
                  PARM = WHERE; DEFAULT = 0;
PROC = NAME;      RULE = A41:&ORDER;
                  PARM = ORDER; DEFAULT = 1;
                               SYMBOL = LASTFIRST; VALUE = 1;
                               SYMBOL = FIRSTLAST; VALUE = 0;
PROC = FETCH;     RULE = A169:&TYPE;
                  PARM = TYPE; DEFAULT = 1;
                               SYMBOL = CHARACTER; VALUE = 1;
                               SYMBOL = NUMBER; VALUE = 0;
PROC = LOOKUP;    RULE = AS32:&HOW,&RECORD,&ELEMENT;
                  PARM = RECORD;
                  PARM = ELEMENT; DEFAULT = 0;
                               SYMBOL = KEY; VALUE = 0;
                  PARM = HOW; DEFAULT = 3;
                               SYMBOL = REPLACE; VALUE = 3;
                               SYMBOL = VERIFY; VALUE = 2;
PROC = STORE;     RULE = A61;
PROC = RESTORE;   RULE = A61:&MORE;
                  PARM = MORE; DEFAULT = 1;
                               SYMBOL = HOLD; VALUE = 1;
                               SYMBOL = RELEASE; VALUE = 2;
PROC = BREAK;     RULE = A45,&CHARACTER;
                  PARM = CHARACTER; DEFAULT = ' ';
                               SYMBOL = BLANK; VALUE = ' ';
PROC = BINARY;    RULE = A&ERROR21:4;
                  PARM = ERROR;

C.10.5  The EXTDEF Subfile

File owners who develop many files often discover that they want to use the same PROCs from one file to another. Since it would be rather tedious to have to put them in by hand each time or even to use text-editing magic, SPIRES has a subfile, EXTDEF (formerly called PROCDEF), in which collections of PROCs can be kept.

An EXTDEF record consists of an ID element (the key of the record, in the form 'gg.uuu.anyname', where 'gg.uuu' is your account), and PROC structures like those you would include in a file definition in the PROC section:

It may also include element and index information packets. [See C.13.5.]

You add such a record to the public subfile EXTDEF. To access PROCs in your EXTDEF record, you must code the EXTDEF-ID (alias PROCDEF) statement in your file definition. Its position in the definition is right after the PROC statements, i.e., ahead of the subfile section.

Naturally, you would supply the name of the record you want to use. When you compile your file definition, SPIRES will access the referenced EXTDEF record or records (more than one can be called), find the named PROC, and compile its rules into the file characteristics. The file definition stored in FILEDEF is not affected; that is, the processing rules are not merged into the stored file definition. You can only see the processing rules by selecting EXTDEF and examining the appropriate record. This characteristic does complicate file definition debugging, but most file owners who have used EXTDEF records find the extra convenience of not having to enter the same PROCs over and over again worth this very small inconvenience.

You can also reference an EXTDEF record from a format, placing the EXTDEF-ID statement at the very end of the format definition.

By default, anyone can code the names of your EXTDEF records in an EXTDEF-ID statement and thus use your PROCs. You can change this public access by coding an ACCOUNTS statement in your EXTDEF record:

The ACCOUNTS statement has the same form as the ACCOUNTS statement in a file definition. [See B.9.2.] Only those accounts given specific access by the ACCOUNTS statement can compile file definitions and formats using your PROCs. Note that even though others can, by default, access your EXTDEF record during compilation of their file definitions, only you can TRANSFER, DISPLAY, DEQUEUE, UPDATE or REMOVE your EXTDEF records.

If you include both PROCs and an EXTDEF-ID statement in your file definition or format, SPIRES will look for a named PROC in the set within the file definition or format before looking for it in the EXTDEF record. Also, if you have more than one EXTDEF-ID statement, during compilation, SPIRES will look for a named PROC in the last-named one, then in the preceding one, and so forth, going backwards through the list.
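
The search order just described (PROCs in the definition first, then the EXTDEF records from the last-named back to the first) can be sketched as follows. This is a Python illustration of the stated order only; the function name and the record keys in the example are hypothetical.

```python
def find_proc(name, local_procs, extdef_ids, extdef_records):
    """Return (source, rule) for the first definition of 'name', or None.

    'local_procs' maps PROC names coded in the definition to their rules.
    'extdef_ids' lists the EXTDEF-ID values in the order they were coded.
    'extdef_records' maps each EXTDEF record key to its own PROC mapping.
    """
    if name in local_procs:
        return "(file definition)", local_procs[name]
    # Go backwards through the EXTDEF-ID statements, last-named first.
    for extdef_id in reversed(extdef_ids):
        procs = extdef_records.get(extdef_id, {})
        if name in procs:
            return extdef_id, procs[name]
    return None


records = {
    "gg.uuu.first": {"UPPER": "A30", "STORE": "A61"},
    "gg.uuu.second": {"UPPER": "A30"},
}
ids = ["gg.uuu.first", "gg.uuu.second"]
```

With both records naming UPPER, the sketch picks the copy in the last-named record; a PROC of the same name coded in the definition itself would take precedence over both.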

The public subfile BACKDEFS can be used to store EXTDEF records that you no longer need but wish to archive.

C.11  User-Defined Processing Rules: Userprocs

System procs and actions can be considered "system-defined" processing rules -- though many are quite elaborate, they work in predefined, limited ways. On the other hand, the SPIRES procedural language exemplified by protocols and formats has considerably more power and flexibility. You have the power to construct your own processing, using variables, If...Then commands, block structures, etc.

For example, if you need to check an input value to determine that it is a date, the $DATE proc is very useful. But if you need to test the date to be certain it falls within a certain range of years, or to be certain the input date is a weekday, or to be certain it is a date no more than ten days earlier, you need more power than system procs or actions have by themselves.

Fortunately, much of the power of the SPIRES procedural language is available through special processing rules called "Userprocs", for "User-defined processing rules". About 30 different "commands", all system variables, almost all functions and user-defined variables are available in Userprocs, providing a broad palette of processing possibilities for elements.

You may invoke a Userproc from any processing rule string, whether it is an Inproc, Outproc, Searchproc or Passproc. Userprocs can be invoked as Inclose rules in an Inproc rule string, to supply element values or even to create values from other elements. Userprocs may be called from the INPROC or OUTPROC statements in a format definition, and could even be used from a protocol using the $PROCSUBG function.

For this extra flexibility, you give up some efficiency. If you wrote a Userproc that exactly duplicated the processing that a system proc or action did, it would be less efficient than the proc or action. On the other hand, if you write a Userproc that does work that no system proc or action could do, efficiency is a moot issue.

The range of Userproc uses is infinite -- this manual can do no more than suggest a few. Any programmer's imagination can invent many creative uses for them. This chapter will thus concentrate on the basic rules for defining them, and is organized in sections as follows:

C.11.1  Coding Userprocs

There are two major pieces involved with each use of a Userproc:

The $CALL System Proc and Actions A62 and A124

Detailed information about the $CALL system proc appears in the "System Procs" manual, while information about the actions appears at the end of this manual. Here though are summaries of syntax:

The $CALL System Proc

"Name" is the name of the Userproc to be called, up to 16 characters in length. If the Userproc is stored in a compiled, external record definition in the RECDEF subfile, you precede the Userproc name with the key of the RECDEF record, followed by a pound sign, as in "GG.UUU.INSTRUCTOR#SalaryCheck". [See C.7.1.]

"Type" represents whether $CALL is being used as an Inclose rule (the value "Inclose") or anything else, such as an Inproc or Outproc (in which case the value should be "Normal").

"P1" is an integer from 0 to 15 (default = 0) whose value is placed in the system variable $P1, available for use in the Userproc. [See C.11.3.]

"Error" represents the error level for the proc; its default value is "D". Errors occur if the Userproc executes a SET ERROR Uproc.

Actions A62 and A124

You use A124 if you want to use the Userproc as an Inclose rule; otherwise, you use A62. The parameters are the same as their counterparts in $CALL. "E" represents the error level, of course; if omitted, the default is "D" level errors. "P1" can be an integer from 0 to 15; it really has no meaning unless you use $P1 in the Userproc. "Userproc-name" is the 1- to 16-character Userproc name. If the Userproc is stored in a compiled, external record definition in the RECDEF subfile, use the form described above for $CALL. [See C.7.1.]

The Userproc

Userprocs are usually defined within the USERDEFS section of a record definition, specifically the record definition containing the element that calls it, though they can be stored in and invoked from another record definition placed in the RECDEF subfile. If the processing rule appears in a Searchproc or Passproc rule string, SPIRES will look for the Userproc in the goal record-type's definition (i.e., the record-type named in that linkage section's GOALREC-NAME statement).

The USERDEFS section has two main sub-sections: variable definitions (which are optional) and Userprocs:

The variable definitions follow the same rules as variable definitions in VGROUPS records. They may include these statements:

Additional information about user-defined variables appears later in this chapter [See C.11.3.] but for detailed information about these statements, see the manual "SPIRES Protocols" or use the EXPLAIN command online.

For the time being, note these facts about user-defined variables carefully:

The Userproc name in the $CALL proc or A62 or A124 action must match the name of a Userproc in the USERDEFS section. The name is followed by Uprocs, which are statements specifying individual actions to be taken.

For instance, here is a simple Userproc:

This Userproc takes the element value at the time the Userproc is called (stored in the variable $PROCVALUE), multiplies it by 12, places it back in the variable and then returns to the processing rule string, returning the multiplied value as a string for any further processing. [Unfortunately, the SET VALUE Uproc sets a variable whose name can be given as $VAL, $PROCVAL or $PROCVALUE, but not $VALUE. You may, however, abbreviate SET VALUE to SET VAL, which gives you a parallel to the $VAL alias.]

More examples of Userprocs appear in the rest of this chapter. [See C.11.4 especially.]

The specific Uprocs allowed are discussed in the next section. [See C.11.2.] In general terms, you will find Uprocs have these capabilities:

Errors

Userprocs can set the processing-rule error flag explicitly, using the SET ERROR Uproc (described in the next section). But unforeseen problems can occur with a Userproc that will also cause the error flag to be set. The list below describes the kinds of errors that can happen, preceded by a "message" used in the system error message as described following the list:

If a Userproc fails for one of these reasons, then execution control returns immediately to the processing rule that called it, setting the error flag, setting the error level to S (for Serious) and setting $UCODE to "USERPROC name (msg)", where "name" is the name of the Userproc that failed and "msg" is the appropriate message shown on the left above.

Also, any SET VALUE Uproc is cancelled, and therefore no new value is returned. Any SET DELETE Uproc is also cancelled. (These two Uprocs are discussed in the next section.)

In sum, the effect is that A124 or the $CALL proc used as an Inclose does not delete any values and does not add any new value. For A62 or $CALL not used as an Inclose, the value before the Userproc is retained. The action's error code becomes Serious for diagnostic processing.

C.11.2  Uprocs (Commands) Available in Userprocs

Below is a list of all the Uprocs that can be coded in a Userproc. Each is described in some detail in the indicated subsection of this section. Online, use the EXPLAIN command to get the specific information you want.

C.11.2.1  SET Uprocs

This section describes the SET Uprocs that are available in Userprocs: SET VALUE, SET ERROR, SET UCODE, SET LASTRULE, SET DELETE, SET OCC, SET ASK, SET PROMPT and SET PARM.

Be aware as you look at the syntax of these Uprocs that neither "string-expression" nor "integer-expression" may contain functions. A string expression may consist of one or more strings or variables concatenated together, however. An item marked simply "expression" may include functions, as in the SET VALUE Uproc shown below.

SET VALUE = expression

This Uproc controls the value that will be returned to the processing rule string when the Userproc ends. The Uproc sets the system variable $PROCVALUE (aliases $PROCVAL and $VAL) to the value of the evaluated expression, which can be any type of expression allowed in a LET command. (You can abbreviate SET VALUE to SET VAL if you want a direct association between the Uproc and the name of the variable, in its $VAL alias.)

NOTE: The expression's value will always be converted to a string value for storage in $PROCVALUE; if the conversion fails, an E62 or E124 error will occur. [See C.11.1.] The conversion to string will cause problems if a processing rule following the Userproc assumes the value returned from the Userproc is in some other form, such as an integer. Using the $RETYPE function may be necessary, as described in the documentation for the $PROCVALUE variable. [See C.11.3.1.]

If the length of the value exceeds the file's MAXVAL length, it will be truncated to MAXVAL characters.

If the Userproc is not called as an Inclose rule, the SET VALUE Uproc creates a value that replaces the original element value. If the Userproc is called as an Inclose rule, SET VALUE creates a value that will be used to create an additional occurrence of the element. (Remember that an Inclose rule will execute regardless of whether the element has any occurrences; if the element has none, it will get one only if the Userproc executes a SET VALUE Uproc.)
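The rules above for SET VALUE (conversion to string, truncation to MAXVAL, and the difference between Inclose and other calls) can be summarized in a small model. The Python below is a sketch under stated assumptions, not SPIRES code: the function names are invented for illustration, a failed conversion is not modelled, and placing the new Inclose occurrence at the end of the list is an assumption the text does not confirm.

```python
from typing import List


def set_value(result: object, maxval: int) -> str:
    """Model SET VALUE: convert to a string, then truncate to MAXVAL characters."""
    return str(result)[:maxval]


def apply_set_value(occurrences: List[str], new_value: str,
                    inclose: bool) -> List[str]:
    """Return the element's values after the Userproc ends.

    Non-Inclose call: the new value replaces the value being processed
    (represented here as the single item in `occurrences`).
    Inclose call: the new value becomes an additional occurrence
    (appended at the end, which is an assumption of this sketch).
    """
    if inclose:
        return occurrences + [new_value]
    return [new_value]
```

Note that in this model an integer result such as 60 comes back as the string "60", which is the situation the NOTE above warns about.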

Example

This example shows a simple use of the SET VALUE Uproc:

The first three characters of $PROCVALUE, the current element value, are concatenated to a hyphen.

But in the next example, note that, despite the best attempts of the file definer, the value returned from the Userproc is a string:

SET ERROR [code]

This Uproc sets the processing-rule error flag. It can optionally be followed by one of the error codes: D, W, E or S. If the code is specified, it unconditionally replaces the original processing rule's error code.

Example

If the system proc $CALL(testval,,,W) invoked a Userproc containing the Uproc "SET ERROR S", the error flag would be set, and the error condition would be S for Serious, not W for Warning.
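The override rule can be stated as a one-line decision. The following Python sketch merely restates it; the function name is hypothetical, and the sketch assumes that a SET ERROR Uproc was actually executed (otherwise no error is flagged at all).

```python
from typing import Optional

VALID_CODES = ("D", "W", "E", "S")


def level_after_set_error(rule_level: str = "D",
                          set_error_code: Optional[str] = None) -> str:
    """Error level in effect after SET ERROR [code] has executed.

    rule_level     -- the level coded on $CALL, A62 or A124 (default "D")
    set_error_code -- the optional code on the SET ERROR Uproc
    """
    if set_error_code is None:
        # SET ERROR with no code: the rule's own level stays in effect.
        return rule_level
    if set_error_code not in VALID_CODES:
        raise ValueError("code must be one of D, W, E or S")
    # A code on SET ERROR unconditionally replaces the rule's level.
    return set_error_code
```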

SET UCODE = string-expression

This Uproc sets the system variable $UCODE, replacing its current value. It is displayed as part of the standard SPIRES error message for elements and is meant to give customized information about the error:

This message and error were established using these two Uprocs:

$UCODE can also be set with the $MSG system proc, action A56, and the SET UCODE Uproc in formats.

If the value exceeds 128 characters, it is truncated to 128.

SET LASTRULE

The SET LASTRULE Uproc tells SPIRES that the current processing rule (A62 or $CALL when not used as an Inclose) will be the last processing rule executed for this element; no further processing rules in the string will be executed.

NOTE: SET LASTRULE has no effect when the Userproc is called as a Passproc rule, or when called as an Inclose rule. Note too that when invoked during Inproc processing, it does not prevent any Inclose rules from executing.

SET DELETE

The SET DELETE Uproc is useful only when the Userproc is called from an Inclose rule. It indicates that occurrences of the element are to be deleted. By default, all occurrences except for the one created by any SET VALUE Uproc will be deleted. However, using the SET OCC Uproc described below, you can limit the occurrences to be deleted.

SET OCC = n;

This Uproc, useful only when the Userproc is called from an Inclose rule and only when SET DELETE is in effect (see above), indicates how many occurrences of the element are to be deleted. If "n" is positive, the first "n" occurrences are deleted; if "n" is negative, the last occurrences are deleted.
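The combined effect of SET DELETE and SET OCC can be pictured as list slicing. The Python sketch below is an illustration only: the function name is invented, a negative "n" is assumed to mean "the last |n| occurrences", and the case n = 0 is rejected because the text does not describe it.

```python
from typing import List, Optional


def occurrences_after_delete(existing: List[str],
                             occ: Optional[int] = None) -> List[str]:
    """Occurrences that survive SET DELETE in an Inclose Userproc.

    existing -- the occurrences present before the Userproc ran; a value
                created by SET VALUE is never deleted and is not included
    occ      -- the value given to SET OCC, or None if SET OCC was not used
    """
    if occ is None:
        return []                  # default: every existing occurrence is deleted
    if occ > 0:
        return existing[occ:]      # the first n occurrences are deleted
    if occ < 0:
        return existing[:occ]      # the last |n| occurrences are deleted (assumed)
    raise ValueError("the effect of SET OCC = 0 is not described in the text")
```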

SET ASK = string-expression

The SET ASK Uproc sets the value of the $ASK variable, which otherwise holds the response returned by the user to the last ASK Uproc or command. However, $ASK, like $PROMPT and $PARM (see below), is sometimes used simply to hold a value temporarily, or to pass a value to the world outside of the Userproc (e.g., a format or protocol). The value of $ASK may not exceed 256 characters. If you try setting it to a value longer than 256 characters, it will be truncated. [The SET ASK command works differently than the SET ASK Uproc in that respect. The command will not truncate a value longer than 256 characters, but will discard it completely and set the variable to null. The same is true for SET PROMPT and SET PARM too.]

SET PROMPT = string-expression

The SET PROMPT Uproc sets the value of the $PROMPT variable, whose value is generally used as a prompt for user input by the ASK command or Uproc. [See C.11.2.5.]

The value cannot exceed 256 characters; longer values will be truncated. However, practically speaking, Orvyl restricts an ASK command's prompt plus its response to a total of 168 characters, so keep that in mind if you are setting a prompt for the ASK Uproc or command.

SET PARM = string-expression

The SET PARM Uproc sets the value of the system variable $PARM, which is also set by the SET PARM Uproc in formats, the SET PARM command and the SET FORMAT and SET GLOBAL FORMAT commands. It doesn't have any fixed use, though its main purpose is to hold the parameter list specified on the SET FORMAT and SET GLOBAL FORMAT commands. How you use it is up to you.

Like $ASK and $PROMPT, $PARM cannot exceed 256 characters in length. Longer values will be truncated.
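The difference between the Uproc and command forms for over-long values of $ASK, $PROMPT and $PARM is easy to overlook, so here it is restated as a minimal Python model. The function names are hypothetical; only the 256-character limit and the truncate-versus-discard behavior come from the text above.

```python
LIMIT = 256  # maximum length of $ASK, $PROMPT and $PARM


def set_by_uproc(value: str) -> str:
    """SET ASK / SET PROMPT / SET PARM Uproc: a longer value is truncated."""
    return value[:LIMIT]


def set_by_command(value: str) -> str:
    """The corresponding commands: a longer value is discarded (set to null)."""
    return value if len(value) <= LIMIT else ""
```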

C.11.2.2  Block-Construct Uprocs for Execution Flow

Many Uprocs are available in Userprocs for condition testing and execution flow. A major group of them, discussed in this section, are called "block-construct" Uprocs:

Others are discussed in the next section. [See C.11.2.3 or EXPLAIN UPROCS, FOR EXECUTION FLOW online.]

Block-Construct Uprocs in General

[This information is condensed from the reference manual "SPIRES Protocols", which contains more detailed information about block constructs.]

A block construct is a set of Uprocs delimited by two Uprocs, one at the start and one at the end, that define the type of block. The Uprocs within the block execute in order, one by one, but they can also be regarded as a self-contained entity that executes as a whole or not at all:

In this example, if the value of $VELEM is NAME, then all the Uprocs between BEGINBLOCK and ENDBLOCK will be executed; if $VELEM's value is something else, none of those Uprocs will be executed.

There are three sets of block constructs, each of which has three parts:

BEGINBLOCK and ENDBLOCK open and close a simple block of Uprocs. The other two pairs are called "looping block constructs". WHILE and ENDWHILE open and close a block that loops as long as (WHILE) conditions stated at the beginning of the block are true. REPEAT and UNTIL open and close a block that loops until conditions stated at the end of the block are true.

In the diagram, you can see how similar the constructs are. Statement 1 opens the block of Uprocs (2), and statement 3 closes it. Statement 1 must always be paired with its corresponding statement 3; otherwise a compilation or execution error will occur. Statement 3 cannot be the object of an IF, THEN or ELSE Uproc; for example, "ELSE ENDBLOCK" is an illegal Uproc.

The Uprocs within the block can be any other Uprocs, including other block constructs. However, block constructs inside of others must be completely nested therein; they cannot overlap.

The JUMP and GOTO Uprocs should not be used indiscriminately within block constructs, either. [Many programmers would contend that JUMP and GOTO have no place within block-construct programming at all.] Jumping into or from the middle of a block construct can lead to possible logic problems with the $ELSE system flag, which controls THEN/ELSE processing, and is not generally recommended. However, using JUMP to move around within a block construct normally causes no problems. Be aware that if you do use JUMP to leave a block, $ELSE will retain its value from within the block; it will not revert to the value it had when the block began execution, which it would do if you left the block normally.

Looping block constructs provide another way to leave from inside: the LEAVE Uproc. LEAVE causes execution to continue with the next Uproc outside of the looping block, just past the ENDWHILE or UNTIL Uproc. LEAVE is not available in BEGINBLOCK...ENDBLOCK constructs.

Here is an example demonstrating use of the LEAVE Uproc:

The idea behind these Uprocs is to place the first ten occurrences of the CROSSREF element into the first ten occurrences of the #XREF variable. However, when no value exists for an occurrence, the $GETUVAL function returns the value "No value", which in the next Uproc is treated as the signal to leave the block construct.
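For readers more familiar with other languages, the same logic might look roughly like this in Python, where "break" plays the part of LEAVE. This is an analogy rather than a translation: "get_uval" is a hypothetical stand-in for $GETUVAL, and the choice of a WHILE-style loop is an assumption of the sketch.

```python
from typing import List

NO_VALUE = "No value"  # what the text says $GETUVAL returns for a missing occurrence


def get_uval(occurrences: List[str], n: int) -> str:
    """Hypothetical stand-in for $GETUVAL: the n-th occurrence (1-based)."""
    return occurrences[n - 1] if 1 <= n <= len(occurrences) else NO_VALUE


def collect_xrefs(crossref: List[str], limit: int = 10) -> List[str]:
    """Copy up to `limit` occurrences of CROSSREF into a list standing for #XREF."""
    xref: List[str] = []
    i = 1
    while i <= limit:              # WHILE: condition tested at the top of the block
        value = get_uval(crossref, i)
        if value == NO_VALUE:
            break                  # LEAVE: continue just past the end of the block
        xref.append(value)
        i += 1
    return xref                    # execution resumes here, after the loop
```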

Another useful Uproc for looping block constructs is the ITERATE Uproc, which transfers execution to the last statement in the block, either ENDWHILE or UNTIL.

Limits and Other Technical Considerations

Block constructs may be nested up to 31 deep in a Userproc.

In technical terms, block constructs are basically fancy JUMP statements, with some condition handling (IF/THEN/ELSE) built in. This can be useful to know if you find yourself stretching the medium of block constructs by using JUMP or GOTO statements in spite of the earlier warning. If you do jump out of a block, do not depend on the value of the $ELSE flag (which affects the processing of THEN/ELSE) to reflect the value it had when you went into the block.

C.11.2.3  Other Uprocs for Execution Flow

Besides the block-construct Uprocs, other Uprocs are available for condition testing and execution flow in Userprocs: IF...THEN, THEN, ELSE, ++label, JUMP, GOTO, XEQ USERPROC, XEQ PROCEDURE and RETURN. All but XEQ USERPROC have equivalents in the Protocols language, and further details about those can be found in the reference manual "SPIRES Protocols".

IF condition THEN uproc

If the condition is true, then the Uproc that follows THEN will be executed:

The condition must follow the same rules for the IF command in SPIRES. See the manual "SPIRES Protocols", or online, [EXPLAIN IF COMMAND.]

If you are unfamiliar with the IF command, please read the referenced documentation; don't try to extrapolate too much from the fairly elementary examples shown below.

The term THEN must follow the condition; ELSE (see below) is not allowed. A colon is allowed as a valid abbreviation for THEN in an IF Uproc:

The "uproc" that follows THEN may be any other Uproc allowed in a Userproc (including another IF Uproc), EXCEPT

Here are some typical IF Uprocs:

The $RECTEST function tests whether a record with a key having the value of $ASK exists; if the returned value from the function is less than zero, the record doesn't exist. Thus, the Uproc says that execution control should jump to the NoRecord label if the record doesn't exist.

This Uproc will set the value of $PROCVALUE to the string "0" if its current value is "None" or if $ASK has a value of "0". (Remember, $VAL is an alias for $PROCVALUE.)

This Uproc will set the value of $PROCVALUE to the current value of $ASK if neither its current value nor $ASK is "0".

The user will be prompted for a "New value?" if the system variable $ASK has no current value, i.e., it has a null value. (If $ASK has a non-null value, then the condition "$ASK" is true, and the condition "~$ASK" is false. But if $ASK has a null value, then the conditions have opposite values.)

"THEN uproc" and "ELSE uproc"

This pair of Uprocs work in conjunction with the previously executed IF Uproc in the Userproc. The THEN Uproc will be executed if the conditions in the previous IF Uproc are true; if false, the THEN Uproc won't be executed. ELSE works oppositely: it will be executed if the conditions in the previous IF Uproc are false; if true, the ELSE Uproc won't be executed.

As noted above for the IF Uproc, the Uproc that follows THEN or ELSE may be any other Uproc valid in a Userproc, EXCEPT for

Unlike in the IF Uproc, THEN on its own cannot be abbreviated to a colon.

Here is an example showing IF, THEN and ELSE Uprocs working together:

If the user responds to a prompt for a "New Value?" with an ATTN/BREAK, then the Userproc sets $PROCVALUE to the current date and the $PARM variable to the string "REJECT". But any other response causes the ELSE Uprocs to execute, rather than the THEN ones; $PROCVALUE is set to the user's input value ($ASK) and $PARM is cleared (i.e., set to a null string).

Be particularly careful if you nest IF Uprocs, e.g., "IF condition THEN IF condition THEN ...", "ELSE IF ...", etc. The block-construct Uprocs can generally be used for equivalent results but with a clarified structure. [See C.11.2.2.]

++label

The label Uproc identifies a point to which execution may branch, using the JUMP, GOTO and XEQ PROCEDURE Uprocs. You may also use labels simply to identify particular sections of a Userproc or to break it up into smaller pieces for clarity.

The label name may contain alphanumeric characters, with a period (.) being the only special character allowed. The name may contain up to 16 characters; if you use more, the compiler will truncate the length to 16 internally. You will get a warning message during compilation, but it will not cause any problems, unless the first 16 characters match the first 16 of another label, in which case the compilation will fail.

The label name may not contain any blanks, either. Any characters following a blank are considered a comment.
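The naming rules for labels (16 significant characters, a warning on truncation, failure on a collision, and everything after a blank treated as a comment) can be checked mechanically. The Python sketch below is a rough model, not the SPIRES compiler: the function name is invented, names are compared case-sensitively, and two identical labels are treated as a collision, all of which are assumptions of the sketch.

```python
from typing import Dict, List, Tuple

MAX_LABEL = 16  # significant characters in a label name


def check_labels(lines: List[str]) -> Tuple[List[str], List[str]]:
    """Return (internal_names, warnings) for label texts given without the '++'.

    Raises ValueError where the text says compilation would fail.
    """
    seen: Dict[str, str] = {}
    warnings: List[str] = []
    for text in lines:
        name = text.split(" ", 1)[0]   # characters after a blank are a comment
        short = name[:MAX_LABEL]       # the compiler keeps the first 16 characters
        if len(name) > MAX_LABEL:
            warnings.append(f"label {name} truncated to {short}")
        if short in seen:
            raise ValueError(f"labels {seen[short]} and {name} are not distinct")
        seen[short] = name
    return list(seen), warnings
```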

"JUMP label" and "GOTO label"

JUMP and GOTO are equivalent Uprocs. Execution control goes immediately to the named label group, which must be within the current Userproc.

XEQ USERPROC userproc-name

The XEQ USERPROC Uproc transfers execution to the named Userproc, which must be defined in the same record-type as the current one. When the called Userproc finishes executing or executes a RETURN Uproc (see below), execution control returns to the original Userproc, continuing from the XEQ USERPROC.

Nesting of XEQ USERPROC Uprocs is allowed, up to five deep. If any XEQ PROCEDURE Uprocs (see below) are nested within XEQ USERPROC calls, they count in the limit of 5 too.

XEQ PROCEDURE label

The XEQ PROCEDURE (or XEQ PROC) Uproc transfers execution control to the named label group. Unlike JUMP or GOTO, which do not keep track of where they were called from, XEQ PROC, like XEQ USERPROC and the calling processing rule, remembers where it is when it transfers control. A RETURN Uproc will always return to that point.

This is particularly useful when you want to invoke the same procedure from several different points in a Userproc. Each point calls the procedure using an XEQ PROC Uproc, and the procedure ends with a RETURN, to return control to whichever place called it.

Nesting of procedures is allowed, up to five deep. But any XEQ USERPROC calls nested within XEQ PROC calls count in the limit of 5 too.

RETURN

The RETURN Uproc returns execution back where it came from, one level up, in regard to XEQ USERPROC, XEQ PROCEDURE and the Userproc itself. A RETURN Uproc is supplied automatically at the end of every Userproc.

Here is an example showing two different uses of the RETURN Uproc:

If all the Uprocs of the Userproc are shown, the RETURN Uproc at the end is superfluous; a RETURN will be executed anyway. In any case, if execution goes to NewTax, the RETURN at the end will return control to the ELSE Uproc, the second one shown. Of course, it will not execute, since the previous condition was true. (Note that even if the NewTax label group had its own IF/THEN/ELSE processing in it, the condition established before the XEQ PROC occurred would be re-established upon return from it.)

On the other hand, the first RETURN Uproc stops the execution of the current Userproc, returning control either to the processing rule string or another Userproc that called it. In this case, it prevents the Userproc from executing the NewTax procedure at the end, unless it's called from the IF Uproc.

C.11.2.4  Uprocs for Setting User Variables

User variables defined in the Variables section of the USERDEFS structure can be set in two ways:

This section will discuss the LET Uproc. A special type of user variable called a "triple" can be set using special triples functions, often in conjunction with the EVAL Uproc, which is discussed in a later section. [See C.11.2.6.]

LET variable = expression

The "variable" must be one of those defined by the VARIABLES section of the USERDEFS structure containing this Uproc. It may be subscripted if it was defined as having multiple occurrences. The variable name should not be preceded by a "#", though the subscript, if a variable, should be:

The "expression" must follow the same rules as expressions in LET commands, described in the reference manual "SPIRES Protocols". The result of the expression must either be the same type as the variable or be convertible to that type. If SPIRES cannot convert it to the right type, an E62 or E124 "processor" error will occur. [See C.11.1.]

Here are a couple of examples of LET Uprocs:

The variable TOTAL is assigned the sum of the first three occurrences of variable A. Variable PART1 gets the first part of the string $PROCVALUE, up to the first blank within it, as its value.

C.11.2.5  Uprocs for Terminal Input/Output

This section describes two Uprocs used to communicate between the user and the Userproc. The "* Uproc" (called the "star Uproc") displays messages on the terminal screen. The ASK Uproc asks the user for additional input during the Userproc's execution.

* string-expression

The star Uproc displays the value of the string expression at the user's terminal.

The expression may contain a mixture of strings and variables that SPIRES can convert to strings for display:

which might show up on the user's terminal as:

Note that the "star" (asterisk) does not appear in the display, as it does in star commands.

ASK PROMPT = string-expression

The ASK Uproc in Userprocs works similarly to the ASK command and ASK Uproc in formats, though it has some noteworthy limitations different from the other varieties. There are no options on the ASK Uproc in a Userproc; PROMPT is required.

In these three cases, the user will receive the following three prompts (assume $PROMPT is set to "Value?"), with the underscore indicating the position of the terminal's cursor:

The prompting is different from the way the ASK command works. Here, if no prompt is explicitly given, a colon is used. If you want to use the value of $PROMPT as the prompt, you must request it specifically, as shown in the third example.

How SPIRES handles the response to a Userproc's ASK Uproc is slightly different too. If the user responds with a carriage return, the value of $ASK is set to null. If the response is ATTN/BREAK, the value of $ASK is set to ' <ATTN>' (i.e., the string "blank, less-than, A, T, T, N, greater-than"). Otherwise, the user's response is limited to 160 characters, and is placed in the $ASK variable. Any leading blanks in the user's response are discarded; trailing blanks are retained.
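The three cases can be summarized in a few lines. The following Python model is only a sketch: the function name and the "attn" flag are invented, and stripping leading blanks before applying the 160-character limit is an assumption, since the text does not say in which order the two happen.

```python
ATTN_VALUE = " <ATTN>"   # blank, less-than, A, T, T, N, greater-than
MAX_RESPONSE = 160       # limit on the user's response


def ask_value(response: str, attn: bool = False) -> str:
    """Value placed in $ASK after an ASK Uproc in a Userproc."""
    if attn:
        return ATTN_VALUE            # the user pressed ATTN/BREAK
    if response == "":
        return ""                    # carriage return only: $ASK is null
    # Leading blanks are discarded; trailing blanks are retained.
    return response.lstrip(" ")[:MAX_RESPONSE]
```

Because the ATTN/BREAK value begins with a blank and ordinary responses lose their leading blanks, a user cannot produce that value by typing it.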

If a Userproc's ASK Uproc is executed in a batch SPIRES or batch SPIBILD job, an E62 or E124 "processor" error will occur.

Here is a sample Userproc containing star and ASK Uprocs:

The "If $Ask" Uprocs near the end handle the cases where the user presses only the Return key or only the ATTN/BREAK key as the response.

C.11.2.6  Miscellaneous Userproc Uprocs

The comment and EVAL Uprocs don't fit neatly into other categories and are thus described under a category of their own.

- comment

The comment Uproc lets you put comments into the Userproc where you see fit. COMMENTS elements are available in the USERPROCS structure, but they appear at the end of each Userproc, which is no help if you need to comment the execution flow within the Userproc.

Comment Uprocs have no effect on the compiled code. Begin the comment one space after the "hyphen" that introduces it.

EVAL $function(expression)

The EVAL Uproc, whose object must be a function, evaluates the expression but does nothing with the value returned from the function. Thus, it is most often used with a function whose usefulness is in its execution rather than its returned value. Some functions that you might use in an EVAL statement are:

UPROC;

For formatting reasons, you may include a Uproc with a null value:

SPIRES simply ignores it during execution. It may be a useful technique if you are trying to make your Userprocs easier to read.

C.11.3  Using Variables in Userprocs

You may use both system variables and user variables in Userprocs. System variables such as $TIME and $SUBCODE may be retrieved and tested and used in other computations to establish a value for the element. User variables are frequently useful in complex Userprocs where many values are being manipulated together.

User variables are defined in the VARIABLES section of the USERDEFS structure, just as they are in a vgroup. They may be set using LET Uprocs, and their values are available to other Uprocs, such as SET VALUE or ASK PROMPT.

Some system variables are available only in Userprocs, and user variables defined for use within Userprocs are available only to those Userprocs -- they are unavailable outside of the Userproc and are reset each time the Userproc returns to the processing rule string.

But other system variables and dynamic variables are available if you need to pass values in and out from the Userproc, independently of the element value itself.

In the first subsection of this chapter, the system variables used exclusively in Userprocs are described. [See C.11.3.1.] In the second, the use of user variables, including dynamic variables, is covered in brief. [See C.11.3.2.]

C.11.3.1  System Variables Available Only in Userprocs

Most, but not all, system variables are available for use in Userprocs. [Even system variables for formats (e.g., $CROW) are available, if the Userproc is called only from a format.] However, five system variables are available only within Userprocs. Outside of Userprocs, they are not recognized. Described in detail below, they are:

$PROCVALUE ($PROCVAL, $VAL)

When a Userproc begins execution, this variable contains the current value of the element (i.e., the value at that point in the processing rule string). If the Userproc is called as an Inclose rule, its value is the value of the first occurrence of the element, or if the element has no values already, its value is null.

$PROCVALUE's type varies depending on the situation; however, the SET VALUE Uproc, the only way a Userproc can set $PROCVALUE, always sets it to a string value. Further information about $PROCVALUE's type appears at the end of this section.

$VELEM

$VELEM is a string variable containing the primary mnemonic of the element calling this Userproc. When the Userproc is called from an Inproc, Outproc or Passproc string, $VELEM is always the element's primary mnemonic, even though the element may have been called by an alias. If the Userproc is invoked from a Searchproc rule string, $VELEM contains the searchterm named in the user's search expression. (There is no such thing as a "primary search term".)

This variable is mostly useful when a single Userproc is called from several different processing rule strings.

$VOCC

For Outproc and Passproc Userprocs, this integer variable contains the number of the current element occurrence within its containing structure or, if not in a structure, in the record itself, starting with "1" for the first occurrence. When used in an Inclose Userproc, it contains the count of the number of occurrences of the element (0 if there are none). (Its range is 0 to 4095, the maximum number of occurrences of an element.) For Searchproc and non-Inclose Inproc Userprocs, its value is always 1.
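Since the meaning of $VOCC depends on where the Userproc is called from, a compact restatement may help. The Python sketch below simply encodes the cases in the paragraph above; the function name and its parameters are hypothetical.

```python
def vocc(rule_type: str, inclose: bool = False,
         current_occurrence: int = 1, occurrence_count: int = 0) -> int:
    """Value of $VOCC for a Userproc called from the given kind of rule string.

    rule_type          -- "Inproc", "Outproc", "Searchproc" or "Passproc"
    inclose            -- True if the Userproc is called as an Inclose rule
    current_occurrence -- number of the occurrence being processed (from 1)
    occurrence_count   -- how many occurrences the element has (0 to 4095)
    """
    kind = rule_type.upper()
    if kind in ("OUTPROC", "PASSPROC"):
        return current_occurrence    # position within the structure or record
    if kind == "INPROC" and inclose:
        return occurrence_count      # count of occurrences, 0 if none
    if kind in ("INPROC", "SEARCHPROC"):
        return 1                     # always 1 in these cases
    raise ValueError(f"unknown rule type: {rule_type}")
```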

$P1

$P1 is an integer variable containing the value of the P1 parameter of the $CALL, A62 or A124 rule that called the Userproc. Its range is 0 to 15.

$PASSDEL

$PASSDEL is a flag variable that may be tested in Userprocs called during passing (i.e., from a Passproc). Its value is normally false; however, when passing is occurring for the purpose of deleting (say, the goal record is being removed, so SPIRES is generating the pass values so it knows what to remove from the index records), then $PASSDEL is true.

Specifically, during passing situations, $PASSDEL is true or false according to the following:

So $PASSDEL is only true during the passing of an old record. You can use this fact to control passing. For example, you may have decided to stop indexing a particular value, but you'd like the pointers for all existing occurrences of that value to be deleted from the index over time. You could write:

Values would be passed for deletion (when $PASSDEL is TRUE), but would be suppressed when the specified conditions occurred when they were being passed for addition.

The Userproc may be called by the Passproc, or may be part of the Inproc or Outproc of the passed elements if those rules are called during passing.

(*) Technical Details About $PROCVALUE's Type

Many processing rules, such as $INT (A21), convert a value from one type to another; that is, the value is changed to an equivalent value with a different type. When a Userproc begins execution, the $PROCVALUE variable, containing the current element value, has the same type as the element does at the time the Userproc is called. However, it will always be a string variable after being set by a SET VALUE statement, which can cause a problem if the Userproc returns the value to a rule string that expects the value to be another type, a problem discussed below.

First, compare these three Inproc strings:

In string 1, $PROCVALUE will be a string variable at the start of the Int1 Userproc, just as it was input by the user. In string 2, it will be an integer variable, since the $INT proc converts the input value to an integer. (If the conversion had failed, the string would have been abandoned before reaching the $CALL proc.)

String 3, where the Userproc is called as an Inclose rule, is slightly more complicated. If the user inputs no values for the element, then when the Userproc is called, $PROCVALUE is a null string. (Remember, an Inclose rule will execute whether or not a value was input for the element.) But if the element does have values, then $PROCVALUE contains the first occurrence, and is the same type as that occurrence.

$PROCVALUE in Searchprocs is the same as in Inprocs: its value is a string variable if the Userproc is called before a type-conversion rule; if called after such a rule, it has that type.

In Passprocs, $PROCVALUE has the same type as the stored element, unless it follows another type-conversion rule, in which case it has the same type as the conversion. (Remember that a Passproc may request the external form of an element, which could mean a type-conversion, probably to string, before the Passproc's Userproc gets the value.)

In Outprocs, $PROCVALUE has the same type as the stored element, unless it follows a type-conversion rule, in which case it has the same type as the conversion.

Suppose that you use the second Inproc string that was shown above:

When the Userproc is called, the current value of the element is type integer. But if you use the SET VALUE Uproc to change the value, which returns a string value, what happens to the value when it is returned to the processing rule?

To set the element's value to "5" for storage as an integer, here are three different approaches you might consider, only the third of which works properly in this situation:

In all cases, the processing rule string assumes that the value being returned from the Userproc is already an integer -- the conversion was already made, back in the $INT proc. Whatever the returned value is, the processing rule interprets it as an integer, even if it is really a string.

In the first two cases, both return with the string "5", not the integer 5. The second case represents a good try, but it doesn't take into account the absolute determination of the SET VALUE Uproc to return a string value; it will always try to convert the expression to a string value. So in the second case, the value is converted from the string "5" to the integer 5 and then back to the string "5". When the processing rule string gets that value back, it will interpret the value as an integer, not as a string, with incorrect results. (It will in fact think the value is the integer "-180338624", a far cry from 5. This is worth trying for yourself if you are skeptical.)

The third case works because it changes the value to the integer 5, then tells SPIRES to assume the value is a string value, not whatever type it was. (In other words, the $RETYPE function tells SPIRES to change the element's type without actually converting it to its equivalent form in that type.) Because the $RETYPE function has returned a string value, the SET VALUE Uproc doesn't try to convert the value to a string; it already is one.

Then, when the value is returned to the processing rule string, its value is still that of the integer 5, even though SET VALUE is returning a string. The processing rule doesn't care what type is returned; it simply "knows" the value being returned is an integer and interprets it as such.
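The misinterpretation described above can be reproduced outside SPIRES. The following Python sketch is only a model, not SPIRES code. It assumes that the string "5" is held as the single EBCDIC character X'F5', that it is padded on the right with EBCDIC blanks (X'40') to four bytes, and that the rule string reads those four bytes as a signed integer. None of these assumptions is stated in this manual, but together they do reproduce the figure quoted above. All function names are hypothetical.

```python
# Model of a processing rule string reading a returned value as an integer.
# Assumptions (not from the manual): EBCDIC code page 037, blank padding
# to 4 bytes, signed big-endian integers.

def as_rule_string_sees_it(raw: bytes) -> int:
    """Interpret the first 4 bytes as a signed integer, blank-padding short values."""
    padded = raw.ljust(4, b"\x40")[:4]
    return int.from_bytes(padded, byteorder="big", signed=True)

def returned_as_string(text: str) -> bytes:
    """Model of the first two approaches: the value comes back in character form."""
    return text.encode("cp037")

def returned_retyped(number: int) -> bytes:
    """Model of the third approach: the integer's bytes are relabeled, not converted."""
    return number.to_bytes(4, byteorder="big", signed=True)

wrong = as_rule_string_sees_it(returned_as_string("5"))
right = as_rule_string_sees_it(returned_retyped(5))
print(wrong, right)
```

Under those assumptions, the character form yields -180338624 while the relabeled integer yields 5, which is why only the approach that leaves the bytes untouched stores the intended value.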

This is a complicated concept, without a doubt. It is usually only relevant if you need to handle an element value in its internal form in a Userproc. Most Userprocs handle the value in its external "string" form, before any type-conversion rules in an Inproc, and after them in an Outproc.

For more information about the $RETYPE function, see the manual "SPIRES Protocols" or online EXPLAIN $RETYPE FUNCTION.

C.11.3.2  User Variables in Userprocs

User variables are often very handy in Userprocs. In general, they have the same capabilities they have in a format or in an allocated vgroup. Their primary limitation is that they are re-initialized each time a Userproc is called from a processing rule string (but not when one Userproc is called from another).

To use user variables in a Userproc, you must define them at the beginning of the USERDEFS structure, just as they are in a vgroup definition. [Dynamic elements are an exception to that rule, as they are to many others. They may be defined here; if they are not, they must be handled using the $DYNPUT, $DYNGET and $DYNZAP functions.] You can then set and change them in a Userproc with the LET Uproc, and retrieve their values in other Uprocs as desired. [See C.11.2.4.]

A variable definition begins with the VARIABLE statement, which names the variable:

Here "variablename" is a name from 1 to 16 characters long. Generally speaking, it should contain only letters and numbers.

Other statements describing the variable may follow:

OCC = number.occurrences;

This statement declares the number of occurrences that the variable may have. The default, if no OCC statement is coded, is one occurrence for each variable. (Multiple occurrences of a variable are individually referenced by an index: the variable name is followed by "::n" where "n" is a number from 0 to 32,767 representing the desired occurrence, counting from 0, or "n" is an integer variable representing such a number. Alternatively, the "::INDEX" feature can be used to handle such references.) The form of the OCC statement will be different for two- or three-dimensional arrays. [See "SPIRES Protocols", section 4.2.2.5.]

LEN = length;

This statement specifies the length in bytes of each occurrence. It must conform to the restrictions inherent in TYPE. Default lengths are provided for each type, as shown in the TYPE chart below.

TYPE = type;

This statement, which is probably the most important of them, describes the type of value that the variable will represent. The allowed types (and their allowed and default lengths per occurrence) are:

     Type        Allowed lengths      Default lengths
     STRING      1-32,765*            80
     INTEGER     1, 2, 4              4
     REAL        4, 8                 4
     PACKED      1-16                 4
     FLAG        1                    1
     CHAR        1-256                16
     LINE        4                    4
     HEX         1-32,767*            4
     DYNAMIC     1-32,765             0

     * The upper limit applies to singly occurring variables; if the
       variable occurs more than once, the upper limit is 253 for
       string variables and 255 for hex.

VALUE = value0, value1, value2, ...;

This statement lets you assign initial values to the occurrences of the variable, one value per occurrence as given in the OCC statement. The given values, each a string, will be converted to the appropriate type during compilation. If special characters appear in any strings, including blanks or commas, those strings should be enclosed in apostrophes. No single value should exceed 255 characters in length.

For flag variables, you may use the values $TRUE and 1 to represent "true", and $FALSE and 0 to represent "false".

To assign the same value to multiple occurrences in a row, you can follow the first occurrence with an integer in parentheses representing the number of occurrences to be assigned the value, e.g.:

The first occurrence of the variable will have the value "0", while the next ten will have the value "1".

Other useful statements include COMMENTS (just like other COMMENTS statements in a file definition), REDEFINE, INDEXED-BY and DISPLAY-DECIMALS. For more information about the last three statements, see section 4.2.2.1 of the manual "SPIRES Protocols", or issue an EXPLAIN command.

A typical set of variable definitions might look like this:

(*) Using Dynamic Variables with Userprocs

Dynamic variables may be used in Userprocs if they are handled in either of the following ways:

You cannot create a dynamic variable in a Userproc simply by referencing it in a LET Uproc, as you might do with a LET command in a protocol. That is, the following Uproc will not compile properly unless the variable TOTAL is in a defined vgroup:

Dynamic variables in Userprocs may be useful in the following situations:

[See the manual "SPIRES Protocols" for more information on using dynamic variables.]

(*) Using Variables in Global Vgroups with Userprocs

Another way to share variables between formats, protocols and Userprocs is with a global vgroup. From within a Userproc, you can work with these variables only via the functions $STATGET and $STATPUT. Descriptions of these functions as well as of global vgroups in general appear in the manual "SPIRES Protocols"; alternatively, use the EXPLAIN command to get information about these terms online.

C.11.4  Some Interesting and Unusual Uses for Userprocs

This section demonstrates both typical and unusual uses for Userprocs, showing the interactions of some of the ingredients most often used.

Its current subsections are:

Both subsections become quite technical toward the end, and are not for the faint of heart. The technical sections are marked by the "(*)" sign.

More subsections will probably be added in the future.

C.11.4.1  Retrieving Other Element Values with the $GETxVAL Functions

Probably the most common application of Userprocs is to fetch a value from another element, which is then used to change or test the current element value. This is frequently the case with virtual elements, where the Userproc retrieves another element value and uses it to establish a value for the otherwise valueless virtual element. [See C.11a.]

The $GETUVAL, $GETCVAL, $GETIVAL and $GETXVAL functions (or, to refer to them all succinctly, the $GETxVAL functions) are extremely important in Outproc, Passproc and Inclose Userprocs, since they are the vehicles by which other element values are retrieved. They fetch the internal or external form of the named element. (Be aware that these functions are not available in Searchproc and non-Inclose Inproc Userprocs.)

In this example, the Userproc establishes a value for the estimated shipping date (EST.SHIP.DATE), a virtual element, by adding 14 days to the ENTRY.DATE element:

 (1) ELEM = ENTRY.DATE;
       INPROC = $Date/ $Gen.Date(Add); OUTPROC = $Date.Out;
     VIRTUAL;
     ELEM = EST.SHIP.DATE;
       INPROC = $Date; OUTPROC = $Call(CheckShipDate);
     ...
 (6) USERDEFS;
       VARIABLE = Entry.Date; TYPE = HEX;
       VARIABLE = Days; TYPE = INT;
       USERPROC = CheckShipDate;
         UPROC = Let Entry.Date = $GetUval(Entry.Date);
(11)     UPROC = Let Days = $Days(#Entry.Date);
         UPROC = Set Value = $DateOut($XDate(#Days + 14),0);

The CheckShipDate Userproc is called as an Outproc rule from EST.SHIP.DATE (line 5). In line 10, the Userproc fetches the first occurrence of the ENTRY.DATE element, using the $GETUVAL function, which returns the unconverted (i.e., internal) form of its value. The computation in line 11 converts the entry date into an integer ($DAYS function). In line 12, the number 14 (two weeks in days) is added to that value, which is then reconverted into a date ($XDATE and $DATEOUT functions), whose value is returned to the processing rule for output.
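The arithmetic performed by the Userproc (date to day number, add 14, day number back to date) can be illustrated with a short Python sketch. This is an analogy only: Python's ordinal day numbers stand in for whatever day count the $DAYS function produces, and the function name estimated_ship_date is hypothetical.

```python
from datetime import date

def estimated_ship_date(entry_date: date, lead_days: int = 14) -> date:
    """Add a fixed number of days to a date by way of an integer day count."""
    days = entry_date.toordinal()               # date -> integer day number
    return date.fromordinal(days + lead_days)   # integer day number -> date

# An entry date of February 20, 1990 gives a ship date in early March.
print(estimated_ship_date(date(1990, 2, 20)).isoformat())
```

Working through an integer day count, rather than adjusting the day-of-month field directly, is what lets the result roll over month and year boundaries correctly.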

The syntax of the $GETxVAL functions is shown briefly below. Full details appear in the manual "SPIRES Protocols"; online, EXPLAIN $GETUVAL FUNCTION, or whichever one you want.

   $GETUVAL(elemname,occ-number,default,structural-occ-map,FILTERED)
   $GETCVAL(elemname,occ-number,default,structural-occ-map,FILTERED)
   $GETIVAL(elemname,occ-number,default,structural-occ-map)
   $GETXVAL(elemname,occ-number,default,structural-occ-map)

The options are:

$GETUVAL and $GETIVAL both retrieve the internal form of the element. $GETCVAL and $GETXVAL both retrieve the external form of the element.

The difference between $GETUVAL and $GETIVAL relates to filters (see the description of the FILTERED option above) and to virtual elements. For virtual elements, $GETUVAL usually returns a useless value: if the virtual element redefines another element, the internal stored form returned by $GETUVAL is the internal form of that other element, which you could pick up directly; for other types of virtual elements, $GETUVAL returns a null value. On the other hand, $GETIVAL executes the virtual element's Outproc and then Inproc rules to create the returned value.

The only difference between $GETCVAL and $GETXVAL is the filtering, as described above.

Here are the rules that define which elements may be retrieved by a $GETxVAL function in Inclose, Outproc and Passproc Userprocs:

(*) Retrieving Element Occurrences Across Structural Occurrences

The structural occurrence map is primarily needed in cases where you are traversing a record in a linear manner, that is, moving into a series of nested structures, without any other means of determining and maintaining structural boundaries. A Userproc that is called by a structural element has no way of its own to assure structural integrity. In this case, you could use the $GETxVAL function to set the occurrence path, and then use that path to access other values in the same structure occurrence. For example, consider the following element hierarchy:

FUND, PATRON, and GIFT are all multiply-occurring structures, and TOTAL is a virtual element that calls a Userproc to total all the AMT values in the PATRON structure, providing a grand total of all gift amounts for each patron. The Userproc looks like this:

     1. USERDEFS;
     2.   VARIABLE = CT;
     3.     LEN = 4;
     4.     TYPE = INT;
     5.   VARIABLE = NEW.VAL;
     6.     LEN = 4;
     7.     TYPE = INT;
     8.   VARIABLE = POS;
     9.     TYPE = STRING;
    10.   VARIABLE = TOTAL;
    11.     LEN = 4;
    12.     TYPE = INT;
    13.   USERPROC = TOTAL;
    14.     UPROC = LET CT = 0;
    15.     UPROC = EVAL $GETUVAL(GIFT);
    16.     UPROC = IF $GETSPATH = '' THEN RETURN;
    17.     UPROC = LET POS = $XSTR($GETSPATH,DROP,LAST,5);
    18.     UPROC = ++LOOP;
    19.     UPROC = LET NEW.VAL = $GETUVAL(AMT,#CT,,#POS);
    20.     UPROC = IF $GETSPATH = '' THEN JUMP FINISH;
    21.     UPROC = LET TOTAL = #TOTAL + #NEW.VAL;
    22.     UPROC = LET CT = #CT + 1;
    23.     UPROC = JUMP LOOP;
    24.     UPROC = ++FINISH;
    25.     UPROC = SET VAL = #TOTAL;
    26.     UPROC = RETURN;

The Userproc is executed each time a TOTAL value is to be displayed for an occurrence of the PATRON structure. The EVAL Uproc in line 15 retrieves the first occurrence of the GIFT structure in order to set the value of the $GETSPATH variable, which holds the "structural occurrence map" describing which occurrence of each containing structure gets us to this occurrence.

We can use the map to retrieve all the AMT values in this PATRON structure. To do that, we must back up a level -- up from this occurrence of GIFT to this occurrence of PATRON -- by removing the last five characters from $GETSPATH before assigning this value to the variable POS. This changes the structural occurrence map from, for example, '0001.0002' to '0001', so that instead of pointing to a specific GIFT structure, we are pointing to a specific PATRON structure.
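The path trimming and the totaling loop can be modeled in Python as follows. This is a sketch under stated assumptions, not a rendering of SPIRES internals: the record is represented as a plain dictionary from a PATRON-level path to that patron's list of AMT values, a missing occurrence is signaled by None (standing in for a null $GETSPATH), and all function names are hypothetical.

```python
def drop_last(path: str, count: int) -> str:
    """Remove the last `count` characters of a structural occurrence map."""
    return path[:-count] if len(path) > count else ""

def get_amt(record: dict, occurrence: int, patron_path: str):
    """Return the AMT value at `occurrence` under the patron, or None if absent."""
    amounts = record.get(patron_path, [])
    if 0 <= occurrence < len(amounts):
        return amounts[occurrence]
    return None

def total_for_patron(record: dict, gift_path: str) -> int:
    """Back up from a GIFT occurrence to its PATRON and sum every AMT there."""
    pos = drop_last(gift_path, 5)   # e.g. '0001.0002' -> '0001'
    total = 0
    ct = 0                          # occurrence counter, starting at 0
    while True:
        value = get_amt(record, ct, pos)
        if value is None:           # no more occurrences: leave the loop
            break
        total += value
        ct += 1
    return total

record = {"0001": [25, 50, 100], "0002": [10]}
print(total_for_patron(record, "0001.0002"))
```

The loop ends when a lookup fails rather than by counting occurrences in advance, which mirrors the JUMP FINISH test in the Userproc above.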

You could not get the structural occurrence map of the PATRON structure by using $GETUVAL(PATRON) because the $GETUVAL must name an element in the current structure, which is the structure pointed to by the established structural occurrence map. Otherwise, SPIRES returns to the record level, in which case the default occurrence number is "0". In other words, if you wish to find out which occurrence of a structure you are in, the element referenced by the EVAL $GETUVAL statement must be contained by that structure. You can then use $XSTR to retain only the part of the value placed in $GETSPATH that you need to determine the structure's path.

Note that this capability is available in Userprocs only when $GETxVAL is used on an INCLOSE, OUTPROC, or PASSPROC processing rule. It is also available directly from the command level during Partial Processing of a referenced record, and in Uprocs within format definitions.

In some cases -- probably rare ones -- in which element values are being generated by INCLOSE rules (such as $GEN.DATE), the $GETxVAL function will not be able to retrieve the newly-generated values, depending on the location of the $GETxVAL call relative to the element being generated. Remember, the general rule is that the element doing the INCLOSE retrieval must occur after the generated element, and the structure that contains the generated element must not contain the retrieving element in any way (i.e., same structure or nested structure).

C.11.4.2  (*) Outproc Userprocs for Record Keys and Secure-Switch 13

In some SPIRES applications, some of the security needed for a subfile must be built into the data itself. For example, in a salary file, it might be necessary to prohibit an employee from looking at or changing his or her own record. When a value within a record is used to permit or block access to the record to various users, a Userproc in the record key's Outproc rule string is often used to do the security processing.

The key's Outproc string is a useful place for such processing for the simple reason that SPIRES runs the record key through its Outproc rule string before and/or after almost every input or output operation involving a record. Even if SPIRES fetches a record via its pointer for subgoal processing in a format, the key is run through its Outproc before any other element values are fetched.

Hence, if a Userproc on the key's Outproc determines that the record should not be seen by the current user for one reason or another, it can withhold the record, similarly to the way the $TEST.ACCT proc (A53) can block a user from seeing records whose keys don't include his or her account number.

A slot key can also call a Userproc, even though its input and output processing is otherwise handled automatically, without INPROC and OUTPROC statements for it in the file definition. To request a Userproc for the Outproc string for a slot key, code the SLOT-PROC statement after the SLOT-NAME statement:

where "Userproc-name" is the name of a Userproc defined in the same record definition. (Or you can name a compiled RECDEF record and a Userproc defined within it, using the form "gg.uuu.recdefkey#userproc-name".) The Userproc will be executed after normal output processing of the key (i.e., conversion from integer to string).

The blocking of the record is commonly done by setting the record key's value to null, using the SET VALUE Uproc. [It can also be done with SET ERROR E or SET ERROR S Uprocs, to cause an E-level or S-level error.] In the example below, for instance, the file owner (GQ.DOC) has determined that records with keys greater than 1000 should be unavailable to other users:

If another user tries TRANSFER 1001, the command will fail with an S269 error. In other cases, such as an attempt to display the record under Global FOR (e.g., FOR SUBFILE, DISPLAY ALL), the records would be skipped over entirely.

But suppose the Userproc includes $GETUVAL functions to retrieve other element values. There are times when SPIRES works with the key of a record without having retrieved the entire record, such as under the REMOVE, DEQUEUE or UNQUEUE commands or SHOW KEYS in Global FOR. In such cases, the $GETUVAL function fails, returning the default value.

Secure-Switch 13 was created to handle that situation; it forces the record into memory anytime the key is being processed so that all the record's elements are available to $GETUVAL functions in the key's Outprocs. If Secure-Switch 13 is set, the REMOVE command, etc., will work the same as the DISPLAY command.

WARNING: Unfortunately, even with Secure-Switch 13, situations arise where a command such as REMOVE will not bring the record into main memory, and again $GETUVAL functions can fail, i.e., return the default value. These situations occur in Global FOR processing, usually involving "*" (current record) processing.

For example, if you have a Userproc that prohibits the REMOVE command if a record contains a certain DATE value, then a sequence of commands such as FOR SUBFILE, DISPLAY, REMOVE * may still allow the REMOVE command to succeed for records with that DATE value. This happens because SPIRES doesn't bring the record back into memory again, because it supposedly is already there. The best way to circumvent this problem is to avoid issuing the command against the current record under Global FOR and to substitute instead the same command outside of Global FOR, using the VIA SUBFILE prefix. For instance, don't issue the REMOVE * command in that circumstance; instead, issue the command VIA SUBFILE /REMOVE $KEY, which will bring the record back into memory again.

C.11a  Virtual Elements

In a file definition you can define a kind of data element called a Virtual element; in conjunction with Userproc capabilities, it allows you to have different views and uses for the same data. Virtual elements do not really exist in the record, but when specifically retrieved by a SPIRES command, they can link to other existing data elements to form their own values, thus behaving as if they really exist.

Virtual elements reside in their own section of a record or structure, the VIRTUAL section, which follows the OPTIONAL section of element definitions.

By default, virtual elements are singly occurring, and when accessed, that single value will have a zero length, i.e., it is a null value. Virtual elements may redefine other elements, however, in which case they may have multiple occurrences and have non-zero lengths. [See C.11a.3.] Virtual elements may also be defined as "variably occurring", in which case they may have no occurrences, one occurrence or multiple occurrences. [See C.11a.4.]

A record or structure may not consist of only virtual elements; at least one non-virtual data element must be defined or an error will be produced during compilation. A virtual element within a structure by default always exists if the structure exists, and does not exist if the structure does not exist.

C.11a.1  Examining Virtual Elements

As stated earlier, virtual elements can be "seen" by specifying them in particular SPIRES commands. Currently, these commands can be used to access them:

Under Partial Processing, "FOR elem" will allow DISPLAY of virtual elements, but ADD, UPDATE, MERGE, and REMOVE are invalid commands.

Normally when a record is displayed in the standard SPIRES format, virtual elements are not shown. To see virtual elements as well as the rest of the record when using the DISPLAY command or the TYPE command not followed by an element list, you can use an option on the SET ELEMENTS command:

where "virtual-element-list" consists of the names of the virtual elements you would like displayed. All other elements in the record will also be displayed when the SET ELEMENTS command is in effect. Note that you can specify virtual elements in the normal SET ELEMENTS command if you do not want to see the entire record.

Both internal ($UVAL) and external ($CVAL) forms of a virtual element are available in formats. The external form can also be accessed via the $GETCVAL or $GETXVAL function; the internal form via the $GETIVAL function. The internal form is not accessible with $GETCVAL.

The external form of a virtual element is derived by executing the element's Outproc rule string (which creates the value). To get an internal form, SPIRES then processes the external form through any Inproc rule string that exists for the virtual element. If there is no Inproc rule for the virtual element, then the external form and internal form are identical. Note that it is the internal form that SPIRES uses when a virtual element is indexed, and also when a virtual element is named in a WHERE clause of a "FOR class" command (that is, the value given in the WHERE clause is processed through the Inproc and compared to the internal values of the virtual element).

An Inproc rule on a virtual element is also executed if a value for that element is provided when a record is being added or updated. However, the resulting value will be discarded since there is no place for a virtual element to be stored, unless the virtual element is a redefining element. [See C.11a.3.]

C.11a.2  Retrieving Other Elements for Virtual Elements

Virtual elements can only link to other elements during Outproc, Passproc or Inclose processing. Outproc A62 ($CALL proc) can name a Userproc to retrieve other elements via the $GETUVAL, $GETCVAL, $GETXVAL and $GETIVAL functions. The result of a series of manipulations with this data can be output as the value of the virtual element by coding "UPROC = SET VALUE = value;".

These functions will retrieve the value of the specified data element from a record in core. The value retrieved will be in its "unconverted" form, i.e., in the form stored in the record, from $GETUVAL or $GETIVAL, or in its "converted" form, i.e., in its external form after processing through Outproc rules, from $GETCVAL or $GETXVAL.

The syntax of the four $GETxVAL functions was presented in the previous chapter. [See C.11.4.1.] For full details, see their explanations in the manual "SPIRES Protocols", or online EXPLAIN each individually.

Currently these functions will operate in a Userproc only if called via an Outproc, Passproc or Inclose action. All but $GETCVAL can retrieve another virtual element. When a $GETxVAL function is used with an Inclose action, it can only retrieve element values provided on input by the user or element values that replace other element values from Inclose processing. However, the function cannot retrieve element values created (as opposed to replaced) by the system in other Inclose actions for the record, unless the following conditions are met:

The value type will be the type of the retrieved value unless that value did not exist. In that case, the returned type will be string, whether it is the default value or a null value.

Caution: if the value accessed has a possibility of non-existence, it is generally a good idea to code a default value.

If it is important to know whether the value exists or not, the default string value can be tested as an indicator of the element's non-existence.

Also, if the value of the function is to be stored as a non-string value, it is a good idea to assign a default value that will convert into the new type, since the null value may precipitate a conversion error.

Here's an example that demonstrates some of the above points. Suppose a file definition contains the following elements:

Then a Userproc can be coded to sum values of ELEMA.

The code accesses the first two occurrences of ELEMA (keep in mind that the occurrence numbers start with "0" for the first one, "1" for the second, etc.) and adds them to produce the final element value. The default string of 0 is coded to provide a value of SUMA that is produced by the SET VALUE command.

The INPROC = $INT coded for element SUMA serves to set the value type and also to provide proper conversion activity during WHERE processing.
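The role of the default value in such a sum can be illustrated with a small Python model. It is a sketch only: the function get_val is a hypothetical stand-in for a $GETxVAL call that takes an occurrence number (counting from 0) and a default, and it follows the rule stated above that a missing value comes back as a string.

```python
def get_val(occurrences: list, occ_number: int, default: str = "") -> object:
    """Return the requested occurrence, or the default as a string if it is missing."""
    if 0 <= occ_number < len(occurrences):
        return occurrences[occ_number]
    return str(default)

def sum_first_two(occurrences: list) -> int:
    """Add the first two occurrences, treating missing ones as 0."""
    first = int(get_val(occurrences, 0, "0"))
    second = int(get_val(occurrences, 1, "0"))
    return first + second

print(sum_first_two([7, 5, 100]), sum_first_two([7]), sum_first_two([]))

# Without a usable default, the null string cannot be converted to an integer.
try:
    int(get_val([], 0))
except ValueError:
    print("conversion error on null default")
```

Coding "0" as the default keeps the conversion from failing when the element has fewer than two occurrences, which is the point made in the cautions above.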

C.11a.3  The REDEFINE Statement

Previously, virtual elements have been restricted to being singly occurring and being used only for output and passing. There is a special type of virtual element, called a "redefining" element, that goes beyond both of these restrictions; that is, it can be used for input as well as output, and it can occur more than once. However, in gaining these capabilities, it loses some of the other capabilities and advantages of virtual elements; this will be discussed later in this section.

An element in a record (element A) can be "redefined" by a virtual element at the same structural level (element B) such that, during output, all occurrences of element A can also be processed through the OUTPROCs of element B. Similarly, on input, any values entered for element B, the redefining virtual element, are processed through B's INPROCs but are stored as occurrences of element A. Thus redefining elements can be thought of as merely different views of, or paths to, real elements elsewhere in the record. A virtual element cannot be redefined by another virtual element.

Like standard virtual elements, redefining elements can be used in innumerable applications; only a few suggestions and rules can be stated here.

Since a redefining element is a virtual element, it is defined in the VIRTUAL section of the record or the structure definition in which it occurs. For example,

As shown above, the REDEFINE statement occurs within the virtual element's definition and names the element being redefined by the virtual element. In the example, the PHONE-NUMBER element would contain values like "415-497-4420" (without storing the hyphens) but the virtual element would return "497-4420". Of course, the major benefit derived here is that if more than one PHONE-NUMBER occurs, then that number of PH-WITHOUT-CODE values occurs. Regular virtual elements can and must occur only once.

Because redefining elements are virtual elements, they will not be displayed during output processing unless they are specifically named, as in a DEFINE TABLE, SET ELEMENTS, or "TYPE element-list" command. [See C.11a.1.]

Both PHONE-NUMBER and PH-WITHOUT-CODE are record-level elements. Remember that in all cases, an element and the virtual element that redefines it must both occur at the record level or within the same structure.

For input, if an element value is entered for a redefining element (such as PH-WITHOUT-CODE above) then the value is processed through any INPROCs that exist for the redefining element yet is stored as an occurrence of the redefined element. Thus, if an element value can take several forms and each form requires separate editing through processing rules, each of the different forms can be written as the INPROC to a separate redefining element. For instance, suppose that you will be entering dates into a record, but sometimes you might have the Julian date (e.g., "54.182" for the 182nd day of 1954, i.e., July 1) which the A31 processing rule cannot handle. If the virtual element JULIAN-DATE-IN redefines the element DATE-IN and its INPROCs computed a 4-byte hexadecimal form of the date represented by the Julian date, then a Julian date could be entered as a value for JULIAN-DATE-IN, and it would be stored like any other date as a value of the element DATE-IN.
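The conversion that JULIAN-DATE-IN's INPROCs would have to perform can be sketched in Python. This is an illustration of the date arithmetic only; it does not produce the 4-byte hexadecimal form that SPIRES stores. The function name and the assumption that two-digit years belong to the 1900s are hypothetical.

```python
from datetime import date, timedelta

def julian_to_date(text: str, century: int = 1900) -> date:
    """Convert a 'yy.ddd' Julian date such as '54.182' to a calendar date."""
    year_part, day_part = text.split(".")
    year = century + int(year_part)
    day_of_year = int(day_part)
    result = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    if day_of_year < 1 or result.year != year:
        raise ValueError(f"day {day_of_year} is not valid for {year}")
    return result

print(julian_to_date("54.182").isoformat())
```

The range check rejects values such as day 366 of a non-leap year, the kind of editing that a separate redefining element lets you apply to one input form without affecting the others.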

There are several important points to remember about using redefining elements for input. First, the value being input is processed only through the redefining element's INPROC rules and not also through the redefined element's INPROCs. That means that in the example above, if DATE-IN converts incoming values into the 4-byte hexadecimal form for dates, the redefining element JULIAN-DATE-IN should do the same in its INPROCs; otherwise, problems will arise when the values are returned through OUTPROCs. Note that if there is any INCLOSE rule on the redefined element's INPROCs, it will be applied to all elements being stored for the redefined element, even those values which were input through the redefining element; no INCLOSE rules are allowed on the redefining (the virtual) element's INPROCs.

Second, it is helpful to consider the redefining element as an alias for the redefined element on input, meaning that values being input through the redefining element are treated for storage purposes exactly like values being input through the redefined element. For example, if only one date value can be stored as DATE-IN and if a value for JULIAN-DATE-IN appears in the input data before one for DATE-IN, then the JULIAN-DATE-IN value would be stored and the other would not. (If DATE-IN could be multiply occurring, both values would be stored, with the value from the virtual element being stored as the first occurrence, in this particular case.)
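The alias behavior can be pictured with a small Python model. It is a sketch under stated assumptions, not a description of SPIRES internals: the names `REDEFINES` and `store_occurrences` are hypothetical, INPROC conversion is omitted so the values are stored as entered, and a value that exceeds the occurrence limit is simply not stored.

```python
# Maps each redefining element to the element it redefines.
REDEFINES = {"JULIAN-DATE-IN": "DATE-IN"}

def store_occurrences(input_pairs, max_occurrences):
    """Return the values stored as DATE-IN, taken in input order."""
    stored = []
    for name, value in input_pairs:
        target = REDEFINES.get(name, name)   # alias resolves to DATE-IN
        if target == "DATE-IN" and len(stored) < max_occurrences:
            stored.append(value)
    return stored

# The JULIAN-DATE-IN value appears in the input before the DATE-IN value.
input_pairs = [("JULIAN-DATE-IN", "54.182"), ("DATE-IN", "07/04/54")]

single = store_occurrences(input_pairs, max_occurrences=1)
multiple = store_occurrences(input_pairs, max_occurrences=2)
print(single, multiple)
```

With a single occurrence allowed, only the value entered through the virtual element is kept; with two allowed, both are kept in input order.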

Third, it is important to remember that if any value is given for a redefining element on input, whether or not there are any INPROC rules for that redefining element, then SPIRES will try to store that value as an occurrence of the redefined element. (For standard virtual elements, any value supplied on input is of course thrown away, since no value is stored.) The significance of this point is that you may intend your redefining element to be used only for record display, such as the above example with PH-WITHOUT-CODE. However, if you update a copy of a record that includes a value for the PH-WITHOUT-CODE rather than updating a transferred copy of the record (which would not include any virtual elements) then that value would be stored in PHONE-NUMBER, somewhat chaotically in this case. If you follow normal updating procedures, that will not be a problem. To remove the possibility of such problems, you can code an INPROC for the redefining element (A52) that throws away any values being entered through that element.

Fourth, though you may use the redefining element to input only a few occurrences of the redefined element, if you display the record and examine the values of the redefining virtual element, all occurrences of the redefined element will be displayed, not just the ones that were input through the virtual element. That point should be obvious -- again, on output, there is one occurrence of the redefining element for each occurrence of the redefined element in the record.

Fifth, it can be useful to pass redefining elements into indexes, like standard virtual elements. When a redefining element is passed to an index, it is first created by processing the internal form of the redefined element through the redefining element's OUTPROCs and is then transformed into an internal form by processing it through any INPROCs that exist for the redefining element. If you pass the external form of the element, then just the OUTPROC value is passed.

A very important general point about redefining elements is that the element type is determined by the INPROCs coded with it -- that is, a redefining element might be type HEX when the redefined element is INTEGER, or the redefining element can redefine an entire structure so that it looks like a string on output, a capability which is most useful when all the elements in the structure are fixed-length. This is important to remember, for instance, when you need to do some type of conversions or perhaps some arithmetic where the redefined element is not of the proper type for such computation.

Finally, it is possible to define TYPE = STR; doing so makes the element into a Virtual Structure, or Phantom Structure. For more information, [EXPLAIN PHANTOM STRUCTURES.]

C.11a.3.1  (A Comparison of Rule A79, $GETxVAL, and Redefining Elements)

There are three methods for retrieving values of a different element within an element's processing rule strings. One is a processing rule, A79; the second, $GETxVAL, is a function used in a Userproc; and the third is virtual redefining elements, signified by the REDEFINE statement in the record definition. Here are some guidelines to help you decide which method to use when you need to access different elements:

While in most cases choosing the correct method is simple, there are times when such a decision might not be easy. Suppose for example that you have a virtual element that retrieves the value of a singularly occurring element and processes it through some Outproc processing rules (though not an A62 with a Userproc) that are very different from those of the retrieved element.

First, assume that you will use the element only for output display. If you are getting the other element value only in order to process it through standard processing rules (that is, no Userproc will be needed) it would not be very efficient to enter a Userproc only for the purpose of using the $GETxVAL function and then to return that value to the rest of the rule string. It would be more efficient either to use the A79 rule to begin the Outproc string for the virtual element or to use the REDEFINE capability.

But suppose that the retrieved element does not exist. If you use the REDEFINE capability, then the virtual element will not occur, since the redefined element does not. If instead you use the A79 rule, which will retrieve a null value for the non-occurring element, then the virtual element will process the null value through the Outproc rule string, and the virtual element will exist (though its value may not be very useful).

But if you use the $GETxVAL function in a Userproc to access the element value, you can set a default value for cases where no value is retrieved. That value can be returned to the rule string or the Userproc can test for the default value and take some other action. Thus, each of the three methods is useful in this situation, depending on your needs in regard to efficiency and to default processing.

These same factors apply if you have decided to pass the virtual element to an index. Remember that SPIBILD creates the pass value for a virtual element by executing the virtual element's Outproc rule string and then (for the internal form) executing its Inproc rule string.

For virtual elements that redefine others, SPIBILD will try to pass an occurrence for each occurrence of the element being redefined. If that element has none, no occurrences of the virtual element will be created and passed.

Non-redefining virtual elements always have one occurrence, so some value will be passed. Again, if the element being retrieved doesn't occur, A79 will return a null value, whereas the $GETxVAL function in a Userproc will return the default value coded within it.
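The difference among the three methods when the retrieved element does not occur can be summarized in a Python model. This is a sketch, not SPIRES code: the three function names are hypothetical, the null value is represented by an empty string, and the Outproc rule string is stood in for by a simple uppercase conversion.

```python
def outproc(value: str) -> str:
    """Stand-in for the virtual element's Outproc rule string."""
    return value.upper()

def redefine_style(source_values):
    # One virtual occurrence per occurrence of the redefined element.
    return [outproc(v) for v in source_values]

def a79_style(source_values):
    # Always one occurrence; a missing element yields a null value.
    value = source_values[0] if source_values else ""
    return [outproc(value)]

def getxval_style(source_values, default):
    # Always one occurrence; a missing element yields the coded default.
    value = source_values[0] if source_values else default
    return [outproc(value)]

missing = []  # the retrieved element has no occurrences
print(redefine_style(missing))            # no occurrences at all
print(a79_style(missing))                 # one occurrence, null value
print(getxval_style(missing, "unknown"))  # one occurrence, default value
```

The same three outcomes apply to the values SPIBILD would pass to an index: nothing, a null-derived value, or a default-derived value.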

C.11a.4  Variably-Occurring Virtual Elements

So far, a virtual element has been able to have:

But in some cases you may need the number of occurrences of a virtual element to vary, depending for instance on the total number of occurrences of two other elements, or on the number of occurrences of values in a dynamic variable array.

Suppose you have a person's record with these elements:

You might want to create a virtual element called FAMILY.MEMBER.NM, where the collection of occurrences is the PERSON.NAME element, the optional SPOUSE.NAME occurrence, and all the occurrences, if any, of CHILD.NAME:

Such a virtual element is called "variably-occurring" -- it may occur 0, 1 or multiple times. The OCCURS statement in its definition is what distinguishes it from other virtual elements:

Notice the value for OCCURS is VARIABLE; VARIABLE is allowed as the value for OCCURS only for this kind of virtual element.

OCCURS = VARIABLE affects the execution of the virtual element's Outproc statement in the following way: as long as the Userproc called by a $CALL proc (action A62) (which should be the first processing rule in the Outproc string) executes a SET VALUE statement, SPIRES will re-start the Outproc string when it finishes the Outproc string for the current occurrence. In other words, it will keep looping through the Outproc string until the Userproc does not execute a SET VALUE statement.

In the Userproc example below, for instance, the Userproc will be executed 6 times; the first 5 times, it will return 5 values from a dynamic array called, obviously enough, DynamicArray. On the sixth time through, it will SET LASTRULE, and then return without having done a SET VALUE; that will break the loop.

Notice that when looping through the Outproc string for a variably-occurring virtual element, SPIRES does not re-initialize the Userproc's local variables (Counter, in the example above) except, of course, for the first time through. In other words, Counter, an integer variable, starts at 0 the first time through.
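The looping behavior just described can be modeled in Python. The sketch below is not a Userproc and does not use SPIRES syntax: the class `DynamicArrayUserproc` and the function `run_variable_outproc` are hypothetical, returning a value stands for executing SET VALUE, and returning `None` stands for executing SET LASTRULE and returning without a SET VALUE.

```python
class DynamicArrayUserproc:
    """Model of a Userproc that returns one array value per call."""

    def __init__(self, dynamic_array):
        self.dynamic_array = dynamic_array
        self.counter = 0   # initialized once, not on every pass
        self.calls = 0     # bookkeeping for this illustration only

    def __call__(self):
        self.calls += 1
        if self.counter < len(self.dynamic_array):
            value = self.dynamic_array[self.counter]
            self.counter += 1
            return value   # models SET VALUE
        return None        # models SET LASTRULE with no SET VALUE

def run_variable_outproc(userproc):
    """Restart the Outproc string until no value is set."""
    occurrences = []
    while True:
        value = userproc()
        if value is None:
            break          # the loop ends on the pass with no SET VALUE
        occurrences.append(value)
    return occurrences

userproc = DynamicArrayUserproc(["A", "B", "C", "D", "E"])
values = run_variable_outproc(userproc)
print(values, userproc.calls)
```

With five values in the array, the model makes six calls: five that produce an occurrence and a sixth that ends the loop.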

The SET LASTRULE Uproc is useful if the $CALL proc (A62, A124) is not the only rule in the processing rule string. For example, suppose you had a virtual element definition like this:

When the GETNAMES Userproc runs the last time (i.e., when it does not do a SET VALUE), it will return a null value. The next rule, $INSERT, sets the error flag if the input value is null. The error will be avoided if SET LASTRULE is executed, telling SPIRES to do no further processing of the rule string once the Userproc ends.

SET LASTRULE is particularly important if the rule string contains a second $CALL proc, whose Userproc would almost certainly contain a SET VALUE Uproc. That can cause an infinite loop.

To finish up the original example, here is one way to write the GETNAMES Userproc used to generate values for the FAMILY.MEMBER.NM element defined above:

The first time through, the Counter variable would be 0 -- SPIRES would retrieve the "required" element value Person.Name and assign it to the virtual element with SET VALUE. The second time through (#Counter is now 1), if there is an occurrence of SPOUSE.NAME, then it would be retrieved and assigned with SET VALUE. And then, subsequent times through, any and all occurrences of CHILD.NAME would be retrieved and assigned.
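The sequence of values that GETNAMES produces can be sketched in Python. This is a model of the logic only, under stated assumptions: the function names and the sample record are hypothetical, the record is represented as a dictionary of occurrence lists, and the counter is used as a position in the combined list of names rather than as the Userproc itself would test it.

```python
def getnames(record, counter):
    """Return the name for this pass, or None when no names remain."""
    names = list(record["PERSON.NAME"])        # the required element
    names += record.get("SPOUSE.NAME", [])     # optional, at most one
    names += record.get("CHILD.NAME", [])      # zero or more occurrences
    return names[counter] if counter < len(names) else None

def family_member_nm(record):
    """Collect occurrences until a pass sets no value."""
    occurrences, counter = [], 0
    while (value := getnames(record, counter)) is not None:
        occurrences.append(value)
        counter += 1                           # persists across passes
    return occurrences

record = {
    "PERSON.NAME": ["Pat Lee"],
    "CHILD.NAME": ["Sam Lee", "Kim Lee"],      # no SPOUSE.NAME occurs
}
print(family_member_nm(record))
```

A record with no spouse and no children yields a single occurrence, the PERSON.NAME value.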

NOTE: If you intend to attempt to retrieve variably occurring virtual elements by means of $LOOKSUBF or $LOOKSUBG, then be sure to use REPLACE instead of EXTERNAL for the second parameter of the function.

C.12  Record-Types that Serve as Goals and Indexes; Goal-to-Goal Passing

Any record-type in a file can serve as a goal record-type for a subfile. A record-type used as an index record-type for one subfile can thus be a goal record-type for another. There are several ways to look at this capability:

Both of these ways can be considered "goal-to-goal passing" -- that is, data from one goal record-type is passed via SPIBILD to an index record-type that also serves as a goal for a different subfile. (The term "goal-to-goal passing" commonly refers to the second way, i.e., the SPIBILD processing of files containing two or more goal record-types, each of which has some non-pointer elements, and at least one of which serves as an index record-type for the other.)

This chapter will explain what it means for a file to have multiple sets of goal records. (We will talk about files with two sets of goal records in this discussion, but files with more than two sets follow the same principles.) To determine the best file design for your situation, you must determine the relationship between the two sets. Let's call the two sets of goal records REC01G and REC03G; several record-types used only as indexes may exist as well, under various names. Here are some questions to consider:

This chapter begins with a discussion of the simplest circumstances in which a record-type serves both index and goal functions. More complicated file designs, yielding richer and more complex features, are covered as the chapter continues. The last three sections discuss special types of "goal-to-goal" passing: "double-headed files", "chain passing" and "goal-to-same-goal passing".

C.12.1  Index Record-Types Used as Goal Records

As a file owner, you can examine records in any record-type of your file, regardless of whether the record-type serves as the goal record-type of a subfile. By issuing the ATTACH command, you can request that a particular record-type be set up for you to access, almost as if it were the goal record-type of a subfile. [See C.6.5.] If the record-type is used only as an index, the ATTACH procedure allows you to examine the complete index records -- either individually (DISPLAY key) or sequentially (using Global FOR).

Besides being used to examine index records, the ATTACH procedure may be used to access the goal records to which the index records point, using subgoal processing. This allows you to create reports in which the goal records are arranged in the order in which they are indexed. [See the manual "SPIRES Formats" for information on subgoal processing. This particular capability is also available through the Global FOR command FOR INDEX. See the SPIRES manual "Sequential Processing in SPIRES: Global FOR".]

Only the file owner and accounts given SEE, UPDATE, MASTER or COPY file privileges can use the ATTACH command on the given file. [See B.9a.]

If you (or others) will be directly accessing the index records frequently, you might want to make the index record-type be the goal record of a subfile. This choice has several advantages over the ATTACH method:

The subfile section of the file definition might look like this:

The above demonstrates the most basic way in which a record-type (REC03G in the example) can perform the dual roles of goal and index. The record-type, whose records are created solely by passing, serves primarily as an index record-type for REC01G. However, by selecting DATE-ORDERED INDEX, you can examine and process those index records directly, since they are now also goal records. Secure-switch 3 is added here to prevent users from adding, updating or removing any of the records, since the creation and maintenance of these records is handled by SPIBILD during passing.

It is important to remember that any record-type within a file can be made a goal record-type: just add a subfile section to the file definition in which the record-type is named the goal record-type of some subfile, as shown above.

C.12.2  Record-Types with Goal and Index Data

In the previous section, record-type REC03G serves primarily as an index record-type for REC01G. The only data in the REC03G record-type, besides the key element, is a pointer back to REC01G. But suppose the REC03G record-type, as the goal record-type of a second subfile, contains more data than just a key and a set of pointers. That is, suppose records in REC03G are created by adding them directly to the REC03G subfile, as opposed to (or in addition to) being created by passing from REC01G. Why would a file be constructed like this? And what advantages or disadvantages are derived from such a design?

For example, suppose that REC01G contains personnel records. One of the elements is a DEPARTMENT code that is to be indexed. During record input, the DEPARTMENT code is processed by $LOOKUP, the table lookup proc (i.e., A32); the input code is "looked up" as the key of a record in REC03G, the goal record of the DEPARTMENT CODES subfile, thus ensuring it is a legitimate value. [See C.5.2, C.5.3 for a similar example.] Later, during SPIBILD processing, pointers from REC01G would be added to the appropriate records in REC03G. Thus, you would add goal records to REC03G directly for table-lookup purposes; SPIBILD would add the pointers to make them index records.

A twist on that design would be to have another element in REC03G (an abbreviation, for instance) used as a replacement value for the "looked-up" element of REC01G. Then, on passing, the value is reconverted (probably by another table lookup) to the longer value for searching purposes; the longer value, matching the record keys in REC03G, can thus be indexed in REC03G.

There are several advantages to such a design. First, you save storage space. If the table-lookup records were in a different record-type from the index records, the keys of one set (most of which, if not all, match the other set) would be stored redundantly. Second, you save a record-type. Few file designs require more than the allowed number of record-types (64); still, this procedure could be helpful in very large applications.

This design might seem to adversely affect searching the REC03G index, because there may be records in it used for table lookup that do not contain pointers back to REC01G. Remember, however, that SPIRES counts pointer groups when reporting the number of records in a search result, so if an index record does not have any pointers, it will not affect the result. One possible related disadvantage: if you browse the REC03G index, you can see all the key values in the index, and some of them may not have pointers associated with them. However, that possibility can arise with any index, not just in this situation. On the other hand, an advantage of this arrangement is that users entering records can browse the index and see all the values allowed for the element, whether or not a record for each value is already in the subfile.
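The effect of pointerless records on searching and browsing can be illustrated with a Python model. It is a simplified sketch, not SPIRES behavior in detail: the department codes and pointer values are hypothetical, and each pointer in a list stands for one pointer group.

```python
# REC03G modeled as key -> list of pointers back to REC01G.
# "ENGR" was added for table lookup only; no employee points to it yet.
rec03g = {
    "ACCT": [101, 205],
    "ENGR": [],
    "SALES": [317],
}

def search_result_count(index, key):
    """The result size depends on the pointers, not on the record's existence."""
    return len(index.get(key, []))

def browse(index):
    """Browsing shows every key, with or without pointers."""
    return sorted(index)

print(search_result_count(rec03g, "ENGR"))   # a pointerless record adds nothing
print(browse(rec03g))                        # yet its key is visible in a browse
```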

In this situation, the subfile for which REC03G is the goal record-type must allow you to add records, since the table-lookup procedure happens on input before any pointers are created by SPIBILD. Instead of coding Secure-switch 3, which prevents adding and updating records, you should use some form of element security on the pointer element to prevent any accidental changes to the pointer elements when you are updating the table-lookup portion of a REC03G record. [See B.9.4.] This protective device is recommended for similar situations discussed later. [See C.12.3.1 for other suggestions and rules about record-type design.]

It is very important that the pointer-groups be maintained in descending-sort sequence, as SPIBILD creates them. The proc $SORT(DESCEND.ERR) could be considered the default INPROC rule for pointer groups created by SPIBILD.

Suppose that REC03G contains three elements: a key element, a pointer element back to REC01G, and some third element. Suppose also that a given record does not contain any occurrences of the third element. If SPIBILD processing causes all the pointers in that record to go away, SPIBILD may remove that record from the REC03G record-type. In other words, if the only two elements occurring in an index record are the key and pointers and if SPIBILD processing removes all the pointers, SPIBILD may remove the record as well. To prevent this, you must be sure other elements occur in the index record, possibly by adding required elements.

C.12.3  Linking Goal-Record Types: Goal-to-Goal Passing

Earlier, we considered making a record-type that was primarily an index into a goal record-type as well, for the purpose of directly examining the index records, among other reasons. [See C.12.1.] Then we made the record-type have a dual purpose from the start: REC03G was used both for table-lookup (as goal records) and for searching (as index records). Still, REC03G's overall purpose was to support record-type REC01G. [See C.12.2.]

In this section, the next logical development will be examined: having REC03G serve as a useful goal record-type apart from any connection it might have to goal record-type REC01G. That is, REC03G might be like any other goal record-type, having many different elements and structures and even indexes of its own; its subfile is useful on its own terms. However, it also still serves as an index record-type to REC01G, containing pointers back to that record-type.

For example, suppose that REC01G is the goal record-type for a subfile of employee records, while REC03G is the goal record-type for a subfile of job descriptions, whose keys are position titles.

Because the JOB.TITLE element is the key of REC03G, REC03G could be used as an index record-type for REC01G by adding a pointer element to REC03G and coding the linkage section for REC01G to pass pointers from REC01G to REC03G as appropriate.

Advantages of this design again include the savings of storage space and of a record-type. An additional advantage is mutual subgoal access. A goal record displayed from the EMPLOYEES subfile can not only name the position an employee has but also include the job description, using subgoal access in a format. Using the JOB.TITLE element in the displayed record to access the appropriate record in REC03G via subgoal, the format makes the record appear to contain structures that are actually records in another record-type. Similarly, a record displayed from the JOB DESCRIPTIONS subfile could list the employees who have jobs of that particular description, along with their departments, again using subgoal access through a format. [See the manual "SPIRES Formats" for information on subgoal access through formats.]

How would such a file be designed? Below is a general diagram of how we want SPIBILD passing to proceed. The arrows indicate elements being passed:

  EMPLOYEE record    JOB DESCRIPTION record    SALARY INDEX record

   NUMBER (key)-|  |-> JOB.TITLE (key)---|  |-> SALARY.MINIMUM (key)
                |  |                     |  |
   NAME         |  |   JOB.DESCRIPTION   |--|-> POINTER.JOB
                |  |                        |
   JOB.TITLE ---|--|   SALARY.MINIMUM ------|
                |
   DEPARTMENT   |----> POINTER.EMP

Note here that the names that we gave to the record-types (REC01G and REC03G) have been removed for the time being. The reason for this is that the names of the record-types given in RECORD-NAME statements will determine the order in which they are processed by SPIBILD, as specified in the linkage sections. That is, the records in the deferred queue for REC01G will be processed before those for REC03G, because REC01G precedes REC03G in sort order. Controlling the order in which record-types are processed by controlling their names is a useful capability that needs to be discussed; naming the record-types at this point would effectively eliminate some design possibilities.
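The effect of record-type names on processing order can be seen in a short Python sketch. It assumes that a plain character sort reproduces the sort order SPIBILD uses for names of this RECnn form; the list of names is hypothetical.

```python
# Record-type names as they might be coded in RECORD-NAME statements.
record_names = ["REC03G", "REC01G", "REC02I"]

# Deferred-queue processing follows the sort order of the names,
# not the order in which the record-types were defined.
processing_order = sorted(record_names)
print(processing_order)
```

Renaming a record-type therefore moves it earlier or later in the processing sequence, which is why the names are left open at this stage of the design.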

Several different designs are possible for this file; which one is best can be determined by considering the consequences of various designs. In the next section, some design rules and suggestions will be discussed, and in the section after that, a specific design will be proposed. [See C.12.3.1, C.12.3.2.]

C.12.3.1  Rules and Suggestions for Goal-to-Goal Passing

This section is a guide to various considerations of file design involving some of the more complicated aspects of goal-to-goal passing, as suggested by the example in the previous section. In this section, the term GOAL will represent a goal record-type, the term INDEX will represent an index record-type, and the term INDEX-GOAL will represent a record-type that serves as both goal and index.

First, remember that users who can update INDEX-GOAL records should not be allowed to change the index information, that is, the pointers, in those records. To prevent users from doing that, use some form of element security in the Subfile section, such as the HIDDEN statement in a view, or priv-tags with the NOUPDATE statement. [See B.9.4, C.12.2.]

It was pointed out earlier, in another context, that index record-types should not usually be REMOVED to a residual data set. [See B.7.1.] Because of their indexing function, INDEX-GOAL record-types should generally not be REMOVED either, if possible, particularly when they themselves have indexes. Otherwise, record segmentation problems and SPIBILD timing problems may arise. On the other hand, very large INDEX-GOAL records may have problems if they are not REMOVED, as described below.

When you are deciding which elements in the INDEX-GOAL will be Fixed, which will be Required, etc., it is important to know whether all INDEX-GOAL records will be created by a user adding them to the subfile (when they are goal records) or whether SPIBILD may create records during passing (when they are index records). If the latter is the case, then no elements except the key should be Required in INDEX-GOAL. Why? First, pointer elements should always be made Optional. [See B.7.8.] Second, if a SPIBILD-created record containing only the key and an optional pointer is added to INDEX-GOAL, then passing errors will occur, since Required elements are missing. (However, this rule may be broken if the Required elements are being passed as qualifiers via SPIBILD -- the rule given applies to elements that would be added by users, not to those that are passed via SPIBILD.)

On the other hand, if you will not be letting SPIBILD create any of the records, there are some important advantages in having at least one Fixed or Required element other than the key at the record level. (Again, however, it should not be the pointer element.) This is particularly true when the INDEX-GOAL records can become very large (over 14K bytes in size), whether from a large amount of goal record data or a large number of pointers from SPIBILD.

When an index record contains only a key and optional elements (and the record-type is not REMOVED), it can only grow to about 14K bytes before it is split into "large record nodes" by SPIBILD. SPIRES can easily handle large record nodes when the record-type is being used as an index -- that is, a search command that accesses the large record nodes of a given index record will be processed properly. However, treating the split record as a goal record can be a problem. For instance, the DISPLAY or TRANSFER commands will only access part of the record by default, and that part is unlikely to contain the goal record data you want. To access that data, you would have to write a format that accessed the split data using the SET VIALLCTR or SET LARGE Uproc. [See "SPIRES Formats".]

So, for an INDEX-GOAL with potentially large records, there are a couple of recommended choices. First, you could code at least one Fixed or Required element other than the key. But be aware there are some limits involved with this choice. Remember that SPIBILD cannot usually create such records -- the passing must be done to records already in the record-type, unless the data being passed includes the Fixed or Required elements. And, if any such record grows beyond about 32K bytes, it cannot be updated by you (SPIBILD can update index records up to about 40K, the absolute limit for a record).

The second, and generally recommended, choice is to code a SPLIT statement on the pointer element. This allows SPIBILD to move the pointers into "large record nodes", leaving your important INDEX-GOAL information at the tree-level. [See C.6.18.]

Suppose now that you do not expect to have INDEX-GOAL records larger than 14K bytes. If you do want to have Fixed or Required elements in INDEX-GOAL but also allow INDEX-GOAL records to be created by passing, a reasonable compromise is to place those elements within an optional structure. Below left might be your original design for INDEX-GOAL, with the compromise solution shown on the right:

In summary:

Arranging Record-Types for SPIBILD Processing

Traditionally, most files having a single goal record-type have it named REC01, and the index record-types named REC02, REC03, etc. When two or more goal record-types are in a file, more attention needs to be given to choosing these names, because the order in which the goal record names would sort is the order in which they would be processed in SPIBILD. Specifying that order properly can ensure maximum SPIBILD efficiency.

In most "goal-to-goal" applications, the index data being passed from GOAL to INDEX-GOAL is completely different data from the index data being passed from INDEX-GOAL to INDEX. For instance, in the EMPLOYEE and JOB DESCRIPTION example [See C.12.3.] the EMPLOYEE subfile indexes the JOB.TITLE element (the GOAL to INDEX-GOAL link) and the JOB DESCRIPTION subfile indexes the SALARY.MINIMUM element (the INDEX-GOAL to INDEX link). Here the data passed from GOAL to INDEX-GOAL is not related to that passed from INDEX-GOAL to INDEX. In this situation, it basically does not matter whether SPIBILD processes the GOAL or INDEX-GOAL first.

However, some goal-to-goal applications exist that pass data from GOAL to INDEX-GOAL and then pass some of the same data from INDEX-GOAL to INDEX. In such cases, the order of SPIBILD processing is very important: if INDEX-GOAL to INDEX happened before GOAL to INDEX-GOAL, the data in GOAL that was supposed to arrive at INDEX would not make it.

Only one file design pattern (called a "chain-passing" design) allows passing of this nature to occur. A file that needs to pass data that way must use that design; a file that does not need to should not use a chain-passing design, because it makes SPIBILD processing less efficient than other designs do.

For the rest of this chapter, it is assumed that chain passing is not required. Chain passing will be discussed in detail later. [See C.12.5.]

In terms of file design, there is basically only one rule: An INDEX-GOAL should not pass to INDEX record-types that have later sort-order names if an earlier sort-order GOAL passes to the INDEX-GOAL, unless chain passing is desired. The diagram below shows the situation that should be avoided when chain passing is not involved (that is, it is the chain-passing design):

During SPIBILD processing, REC01G passes index data to REC03G; later, REC03G passes index data to REC04I. In most cases, this causes very inefficient SPIBILD processing.
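The naming rule can be expressed as a small check in Python. This is a design aid sketched under assumptions, not a SPIRES facility: the function name `find_chains` is hypothetical, passing links are written as (sender, receiver) pairs, a plain character sort stands in for record-name sort order, and the NOCHAIN option is not modeled.

```python
def find_chains(links):
    """Return (GOAL, INDEX-GOAL, INDEX) triples that form a chain.

    A chain exists when an earlier-sorting GOAL passes to an INDEX-GOAL
    that in turn passes to a later-sorting INDEX.
    """
    chains = []
    for goal, index_goal in links:
        for sender, index in links:
            if sender == index_goal and goal < index_goal < index:
                chains.append((goal, index_goal, index))
    return chains

# The design to avoid: REC01G -> REC03G -> REC04I.
chained = find_chains([("REC01G", "REC03G"), ("REC03G", "REC04I")])

# The index for the INDEX-GOAL sorts before it: no chain.
index_first = find_chains([("REC01G", "REC03G"), ("REC03G", "REC02I")])

# The INDEX-GOAL sorts first and receives from a later-sorting GOAL: no chain.
goal_last = find_chains([("REC01G", "REC02I"), ("REC03G", "REC01G")])

print(chained, index_first, goal_last)
```

Only the first arrangement is reported, since it is the only one in which data arriving at the INDEX-GOAL can still be passed onward in the same SPIBILD run.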

You can break the chain-passing chain with a NOCHAIN option, which must be placed on the goalrec-name that receives information from a lower record-type. Thus, NOCHAIN on REC03G breaks REC01G -> REC03G -> REC04I chain-passing. Basically, this says that the receiving index (which is also a goal) indicates NOCHAIN. The reason is quite simple: REC01G might pass to more than one goal record-type, some of which are meant to chain-pass, and others which are not meant to chain-pass. NOCHAIN must be placed on the receiving index-goal record-type to allow such selectivity.

NOCHAIN on REC03G says to treat REC03G as a normal index, not chained. This means that what REC01G passes to REC03G is not passed to REC04I. Meanwhile, REC01G -> REC05G -> REC06I is still chain-passing. A better solution would be to place the INDEX for the INDEX-GOAL before it rather than after it:

Of course, the above is only a model. Other possibilities may be useful, such as REC01G being the INDEX-GOAL, which passes to REC02I and then receives from REC03G, the GOAL, which is in fact even more efficient for SPIBILD:

The numbers indicate the order of the SPIBILD processing.

By the way, why is the latter of these two designs more efficient for SPIBILD processing? Remember that SPIBILD processes the record-types in sort order by name. In the first design, when the GOAL passes to the INDEX-GOAL, SPIBILD must check the deferred queue (DEFQ) to see if the INDEX-GOAL records being updated with pointer information are already there, as a result of being updated as goal records. In the second design, when the GOAL passes to the INDEX-GOAL, the INDEX-GOAL records have already been processed (i.e., moved into the tree) so SPIBILD does not have to look in the DEFQ each time index information for an INDEX-GOAL record is passed. In that case, SPIBILD knows that the latest copy of each INDEX-GOAL record is in the tree. Hence, if any INDEX-GOAL records are likely to be in the DEFQ when GOAL records are too, the latter design shown above is preferable for SPIBILD efficiency's sake.

In fact, the most efficient files, in terms of SPIBILD processing, have all index record-types before their goal record-types (in record-name sort order). That organization ensures that index records are not in the deferred queue when SPIBILD processing of the goal records occurs. But for single-subfile files, this organization is not really relevant, since you would probably not update (i.e., put into the DEFQ) index records yourself.
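
The efficiency argument above depends only on where the INDEX-GOAL falls in record-name sort order relative to the GOAL. Here is a small Python sketch of that reasoning. The function name is hypothetical, and ordinary string sorting is assumed to stand in for the SPIBILD sort order.

```python
def defq_check_needed(record_types, goal, index_goal):
    """Return True if the INDEX-GOAL is processed after the GOAL.

    In that case INDEX-GOAL records may still be in the deferred queue
    when the GOAL passes to them, so the queue must be checked.
    Assumption: plain string sorting approximates SPIBILD's order.
    """
    order = sorted(record_types)
    return order.index(index_goal) > order.index(goal)


names = ["REC01G", "REC02I", "REC03G"]

# First design: REC01G is the GOAL, REC03G is the INDEX-GOAL.
first_design = defq_check_needed(names, goal="REC01G", index_goal="REC03G")

# Second design: REC03G is the GOAL, REC01G is the INDEX-GOAL.
second_design = defq_check_needed(names, goal="REC03G", index_goal="REC01G")
```

In this model the first design needs the deferred queue check and the second does not, which matches the preference stated above.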

C.12.3.2  A Solution to the Example in C.12.3

Let's return to the question of how to design the file for the EMPLOYEES and JOB DESCRIPTIONS subfiles.

The size of both subfiles will be reasonably small -- say, 1500 employee records and 900 job descriptions. Each employee would have one job description (that is, each record in EMPLOYEES would point to a single record in JOB DESCRIPTIONS) but the same job description might serve many employees. Still, no JOB DESCRIPTIONS record is likely to exceed 12K bytes in length.

As a data entry validation, we will insist that when records are added to the EMPLOYEES subfile, the JOB.TITLE element must be verified (via table-lookup) to be a value for which a JOB DESCRIPTIONS record already exists. This decision means that SPIBILD will not create new JOB DESCRIPTIONS records when passing index data from EMPLOYEES to JOB DESCRIPTIONS, since only values matching keys of records already in JOB DESCRIPTIONS will be allowed for the JOB.TITLE element in EMPLOYEES.
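
The effect of this validation can be sketched in Python. This is only a model of the decision, not of the $LOOKUP rule itself: `add_employee`, the sample titles and the sample names are all hypothetical.

```python
def add_employee(employees, job_titles, record):
    """Add an EMPLOYEES record only if its JOB.TITLE is already a key
    in JOB DESCRIPTIONS (modeled here as a set of titles)."""
    if record["JOB.TITLE"] not in job_titles:
        raise ValueError("no JOB DESCRIPTIONS record for " + record["JOB.TITLE"])
    employees.append(record)


# Hypothetical sample data.
job_titles = {"ANALYST", "EDITOR"}
employees = []

add_employee(employees, job_titles, {"NAME": "Lee", "JOB.TITLE": "EDITOR"})

try:
    add_employee(employees, job_titles, {"NAME": "Kim", "JOB.TITLE": "PILOT"})
    rejected = False
except ValueError:
    rejected = True
```

Because every accepted JOB.TITLE matches an existing key, passing from EMPLOYEES never needs to create a JOB DESCRIPTIONS record.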

We will call the EMPLOYEES record-type (the GOAL) REC03G. We will follow the sample diagram shown at the end of the last section: the JOB DESCRIPTIONS goal record-type (the INDEX-GOAL) will be REC01G, its index (INDEX) will follow it as REC02I, and REC03G (GOAL) will come last.

Here is the structural outline of the record-types REC01G and REC03G:

   RECORD-NAME = REC01G;                 RECORD-NAME = REC03G;
     COMMENTS = INDEX for EMPLOYEES,     COMMENTS = GOAL for EMPLOYEES;
              GOAL for JOB DESCRIPTIONS;
                                         REMOVED;
   REQUIRED;                             SLOT; SLOT-NAME = NUMBER;
     KEY = JOB.TITLE;                    REQUIRED;
     ELEM = JOB.DESCRIPTION;               ELEM = NAME;
     ELEM = SALARY.MINIMUM;                ELEM = JOB.TITLE;
       (etc.)                                INPROC = $LOOKUP(VERIFY,
   OPTIONAL;                                            REC01G)
     ELEM = POINTER.EMP; TYPE = LCTR;      ELEM = DEPARTMENT;
       COMMENT = LCTR from REC03G;           (etc.)

Although goal record-type REC03G was coded as REMOVED, REC01G was not -- it serves as an index record, and index records should not be REMOVED.

Just to complete the record definitions, here is the definition for REC02I:

   RECORD-NAME = REC02I;
   COMMENT = INDEX for REC01G;
   REQUIRED;
     KEY = SALARY.MINIMUM;
   OPTIONAL;
     ELEM = POINTER.JOB;
       COMMENT = Pointer from JOB DESCRIPTIONS;

What is being passed as the pointer to REC02I? Not a locator, since REC01G is not REMOVED, but the key of REC01G. Thus, you would not want to code TYPE = LCTR on the pointer element here.

The linkage sections would then be coded. Basically they would look like this:

   GOALREC-NAME = REC01G;             GOALREC-NAME = REC03G;
     PTR-ELEM = POINTER.JOB;            PTR-ELEM = POINTER.EMP;
     PASSPROC = $PASS;                  PASSPROC = $PASS.LCTR;
     INDEX-NAME = REC02I;               INDEX-NAME = REC01G;
       GOALREC-ELEM = SALARY.MINIMUM;     GOALREC-ELEM = JOB.TITLE;
       PASSPROC = $PASS(BINARY);          PASSPROC = $PASS;
         PTR-GROUP = POINTER.JOB;         PTR-GROUP = POINTER.EMP;

Again note the differences required, here in the global portion of the linkages, because REC03G is REMOVED and REC01G is not.

C.12.3.3  Using Qualifiers in Passing to Create Goal Records

In the example in the previous section, only two elements were passed from GOAL to INDEX-GOAL: the JOB.TITLE element and the EMPLOYEE locator. It is possible to pass more elements than just the goal record pointer and the index value, by passing qualifiers from GOAL to INDEX-GOAL. [See B.7.9, B.8.6, B.8.7.] This capability is particularly useful when designing files where choosing what a goal record should represent is a dilemma.

For instance, suppose you are designing a data base for the fund-raising committee of a foundation that conducts fund drives through the mail. The foundation wants to keep track of people to whom they mail their requests and from whom they receive donations, but they want reports based on specific donations as well. For example, below are two designs distinguished by their choice of goal record. In both designs, the SOLICITATION.NUM element represents a unique value printed on the donation request form to help keep track of the donations.

In the DONORS subfile, each person has only one record. Obviously, this is a major advantage when address labels need to be generated; the second design allows one person to have many records, and eliminating duplicates for address labels would be a major task. On the other hand, reports about donations in a given month would not be as straightforward to create as they would be with the DONATIONS design (though record filters, i.e., the SET FILTER command, are not difficult to use to create such a report for DONORS).

A way to have the advantages of both designs is to use qualifiers to create the DONATIONS records from the DONORS records. All the data in DONATIONS records is already found in the DONORS records; data entered for the DONORS records could be passed via qualifiers to create DONATIONS records.

We will call the DONORS goal record-type (i.e., GOAL) REC01G and the DONATIONS goal record-type (i.e., INDEX-GOAL) REC03G. The DONATIONS goal record-type must be designed with the non-key elements, including the pointer from DONORS, in a structure. Each of the elements within the structure will be singly occurring, and a value must be given to each (see the $PASS.DEF proc in the linkage below). Here is an outline of the REC03G structure:

Here is an outline of the appropriate parts of the linkage section:

It is likely that some of the elements in REC03G will in turn be indexed, requiring a chain-passing design -- that is, the record names of its indexes will have to follow REC03G in sort-order. [See C.12.5.]

A serious problem in most applications such as this one is finding a key element for the INDEX-GOAL record (e.g., the DONATIONS goal record), since it must be a unique value derived from the GOAL record. In this particular application, the structural element SOLICITATION.NUM was defined to be a unique value, but perhaps most other such files do not have an element like that one. Often a virtual element in the GOAL created by appending the structural occurrence to the GOAL key is passed as the key of the INDEX-GOAL record. For example, the occurrence number of the DONATION structure could be added to the slot number of the DONORS goal record to create the DONATIONS key. This choice also allows you, in formats, to use subgoal access from the INDEX-GOAL record to get the appropriate structural occurrence within the "parent" GOAL record in order to access any elements within the structure that you did not pass from GOAL to INDEX-GOAL.
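
One way to picture such a derived key is the Python sketch below. The fixed-width layout is an assumption made for illustration (the text above does not specify how the occurrence number is appended), and both function names are hypothetical.

```python
def donation_key(donor_slot, occurrence, width=3):
    """Build a DONATIONS key from the DONORS slot number and the
    occurrence number of the DONATION structure.

    Assumption: the occurrence is zero-padded to a fixed width so that
    different (slot, occurrence) pairs cannot produce the same key.
    """
    if occurrence >= 10 ** width:
        raise ValueError("occurrence does not fit in the key")
    return f"{donor_slot}{occurrence:0{width}d}"


def split_donation_key(key, width=3):
    """Recover (donor_slot, occurrence), as subgoal access would need."""
    return int(key[:-width]), int(key[-width:])


key = donation_key(12, 3)
parent = split_donation_key(key)
```

Splitting the key gives back the parent record and the structural occurrence, which is what makes subgoal access from the INDEX-GOAL record possible.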

C.12.4  Double-Headed Files

Sometimes file designs require two goal record-types to pass to each other. That is, there are two INDEX-GOAL record-types, each of which serves as an index to the other.

For example, suppose you have a file of data about students and the courses they take. One goal record-type represents each student, and another represents each course. Here are some of the elements in each:

In this situation, when SPIBILD processes the added records in the STUDENTS goal record-type, the COURSE.IDs that identify classes a student is taking are passed to the COURSES goal record, with a pointer back to the STUDENTS record. Similarly, when SPIBILD processes the added records in the COURSES goal record-type, the STUDENT.IDs that identify students taking a class are passed to the STUDENTS goal record, with a pointer back to the COURSES record.

In SPIRES lingo, this type of file is called "double-headed", because there are two goal record-types, neither of which is dominant over the other -- in SPIRES's view, they are treated equally.

Even though both goal record-types serve as indexes, either or both can be REMOVED, in contrast to what was stated about files where only one record-type was INDEX-GOAL. [See C.12.3.1.] However, you should pass the keys of one INDEX-GOAL, not the locators, to the other. For example, you would not want the COURSE.ID element in the Student records to contain locators passed from the other GOAL (as opposed to keys). That would make data entry very difficult: when you wanted to enter Student records with COURSE.IDs, you would need to know the locator (if one had even been assigned at that point) rather than the key (which is presumably a meaningful piece of data to you) of each Course record for which you want to enter a COURSE.ID.

The rule to keep in mind in designing a double-headed file is that indexes for the second INDEX-GOAL (as determined by their RECORD-NAME order) should have record names that come before it in sequential order. For example, REC01I and REC03I would be good INDEXes for either double-headed INDEX-GOAL REC02G or REC04G. However, REC05I would only be a good INDEX for REC02G, since it comes after REC04G (the second of the INDEX-GOALs). As before, this rule applies when chain passing is not part of the design. If chain passing is intended, then the rule must be broken. [See C.12.3.1.]
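
The rule can be expressed as a simple name comparison. The Python sketch below is only a checking aid: the function name is hypothetical, and ordinary string comparison is assumed to match record-name order.

```python
def follows_double_headed_rule(index_name, goal_name, heads):
    """Check the placement rule for a double-headed file.

    `heads` holds the two INDEX-GOAL record names. Only indexes of the
    second head (in name order) are constrained: they must sort before it.
    """
    second_head = max(heads)
    if goal_name != second_head:
        return True
    return index_name < second_head


heads = ("REC02G", "REC04G")

checks = {
    ("REC01I", "REC04G"): follows_double_headed_rule("REC01I", "REC04G", heads),
    ("REC03I", "REC04G"): follows_double_headed_rule("REC03I", "REC04G", heads),
    ("REC05I", "REC02G"): follows_double_headed_rule("REC05I", "REC02G", heads),
    ("REC05I", "REC04G"): follows_double_headed_rule("REC05I", "REC04G", heads),
}
```

Only the last pairing, REC05I as an index for REC04G, fails the check.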

Below is a diagram that demonstrates possible double-headed file structures that follow the rule. GOAL1 and GOAL2 represent the two goal record-types that are indexes of each other. INDEX1 represents optional index record-types for GOAL1, and INDEX2 represents optional index record-types for GOAL2. In ascending order by record name, the preferred file structure order is:

Here are the skeletons of the two INDEX-GOAL record definitions as well as the two linkage sections.

                     (Record Definitions)
     RECORD-NAME = REC01G;           RECORD-NAME = REC03G;
       FIXED;                          FIXED;
         KEY = STUDENT.ID; LEN = 7;      KEY = COURSE.ID; LEN = 6;
       OPTIONAL;                       OPTIONAL;
         ELEM = NAME;                    ELEM = TITLE;
         ELEM = ADDRESS;                 ELEM = INSTRUCTOR;
         ELEM = COURSE.ID; LEN = 6;      ELEM = STUDENT.ID; LEN = 7;
                      (Linkage Sections)
     GOALREC-NAME = REC01G;          GOALREC-NAME = REC03G;
       PTR-ELEM = STUDENT.ID;          PTR-ELEM = COURSE.ID;
       GOALREC-KEY = STUDENT.ID;       GOALREC-KEY = COURSE.ID;
       PASSPROC = $PASS;               PASSPROC = $PASS;
       INDEX-NAME = REC03G;            INDEX-NAME = REC01G;
         GOALREC-ELEM = COURSE.ID;       GOALREC-ELEM = STUDENT.ID;
         PASSPROC = $PASS;               PASSPROC = $PASS;
         PTR-GROUP = STUDENT.ID;         PTR-GROUP = COURSE.ID;

Notice that the non-key elements in both record-types were declared Optional. Thus, records for either record-type may be created during SPIBILD passing.
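
The mutual passing can be modeled in Python as follows. This is a simplified illustration, not a description of SPIBILD internals: records are plain dictionaries, the sample keys are hypothetical, and `pass_pointers` is an invented helper.

```python
def pass_pointers(source, target, value_elem, ptr_elem):
    """For each value of `value_elem` in a source record, store the source
    key in `ptr_elem` of the target record with that key. The target
    record is created if it does not exist yet."""
    for key, record in source.items():
        for value in record.get(value_elem, []):
            target_record = target.setdefault(value, {})
            pointers = target_record.setdefault(ptr_elem, [])
            if key not in pointers:
                pointers.append(key)


# Hypothetical sample data: 7-character student keys, 6-character course keys.
students = {"S000001": {"NAME": "Lee", "COURSE.ID": ["MATH01"]}}
courses = {"HIST02": {"TITLE": "History", "STUDENT.ID": ["S000002"]}}

# REC01G (STUDENTS) passes to REC03G (COURSES), then the reverse.
pass_pointers(students, courses, "COURSE.ID", "STUDENT.ID")
pass_pointers(courses, students, "STUDENT.ID", "COURSE.ID")
```

After both passes, a COURSES record for MATH01 and a STUDENTS record for S000002 exist even though neither was entered directly, which is why the non-key elements must be Optional.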

It is recommended that you make the keys of both goal record-types Fixed elements; similarly, the pointer elements should reflect those lengths, as shown in the example above. That helps conserve storage space in the indexes and also makes searching more efficient.

C.12.5  Chain Passing

In some goal-to-goal passing situations, where you have GOAL, INDEX-GOAL and INDEX record-types, you want SPIBILD to pass some of the same data from INDEX-GOAL to INDEX that was passed from GOAL to INDEX-GOAL. When data passes from one record-type to another to another, SPIBILD is doing "chain passing".

A particular file structure is required when chain passing is desired; as noted before, unless chain passing is intended, this file structure should be avoided, or else the NOCHAIN statement should be added to the end of the INDEX-GOAL's linkage section. [See C.12.3.1.] In words, the required design must follow this rule: if data will be passed from GOAL through INDEX-GOAL to INDEX, then the sequential order of the record-type names must match the order GOAL, INDEX-GOAL, INDEX. For example, REC01G (GOAL), REC03G (INDEX-GOAL) and REC04I (INDEX) would cause chain passing; if the INDEX were REC02I, chain passing would not occur, since it would then come between the GOAL and the INDEX-GOAL. Pictorially, the required design looks like this:

and not like this:

Here is a very simple example of a file that takes advantage of chain passing. Basically the GOAL is a person's record, the INDEX-GOAL is a name index (by last name, with the first name in a sub-index structure) and INDEX is a "first name" index.

Here is a diagram showing the passing scheme for this file:

  REC01G (GOAL)      REC02G (INDEX-GOAL)         REC03I (INDEX)

   SLOT (lctr) -|  |-> LAST.NAME (key) ---|   |--> FIRST.NAME (key)
                |  |                      |   |
   NAME --------|----> FIRST.NAME --------|---|
                |                         |
                |----> POINTER            |------> LAST.NAME
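
The same scheme can be traced with a small Python model. The sample names, the slot numbers and the "Last, First" form of NAME are assumptions made for the illustration.

```python
# REC01G (GOAL): slot number -> NAME. Hypothetical data in "Last, First" form.
people = {101: "Smith, Ann", 102: "Smith, Bob", 103: "Jones, Ann"}

# REC02G (INDEX-GOAL): LAST.NAME key -> entries with FIRST.NAME and POINTER.
last_name_index = {}
for slot, name in people.items():
    last, first = [part.strip() for part in name.split(",")]
    entry = {"FIRST.NAME": first, "POINTER": slot}
    last_name_index.setdefault(last, []).append(entry)

# REC03I (INDEX): FIRST.NAME key -> LAST.NAME values pointing back to REC02G.
# This second step is the "chain": it reuses data that arrived from REC01G.
first_name_index = {}
for last, entries in last_name_index.items():
    for entry in entries:
        first_name_index.setdefault(entry["FIRST.NAME"], []).append(last)
```

The first-name index is built entirely from values that REC01G passed to REC02G; nothing is read from the GOAL a second time.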

This example demonstrates several points about chain passing. First, consider the relationship of most goal and index record-types:

If the GOAL contains a key element and a value element, the INDEX inverts it, so that the value becomes the key of the index record. If the INDEX record were to be passed to another index, that index would re-invert the record, making it resemble the original goal record. But that suggests that there would be little or no need to build the second index, since the goal record could serve as the index to the index record. [See C.12.4.]

Chain passing is usually needed because several elements are passed via qualifiers or sub-indexes to the INDEX-GOAL, and some of those elements in turn must be indexed. Cases requiring such treatment almost always involve goal records containing structures that themselves need to be treated as goal records. For example, three system subfiles are linked via chain passing: FILEDEF, SUBFILE and ACCT (only the first may be directly selected by the public). Here are the main elements involved:

In the file definition (the FILEDEF record), the subfile section (the structure identified by the SUBFILE-NAME element) provides SPIRES with information on subfiles and privileges for them. During SPIBILD processing overnight, the elements shown are passed to the SUBFILE goal record-type. SPIRES uses the SUBFILE records when a SELECT command is issued to find the file and goal-record names so that the appropriate ORVYL files can be attached for use. (Obviously, it would be very inefficient for SPIRES to examine each file definition record, i.e., use Global FOR, to access this needed information each time a SELECT command is issued.)

SPIBILD then passes the ACCT and SUBFILE-NAME elements to another index, which serves as the goal record-type for the ACCT subfile. That subfile is accessed by SPIRES when a SHOW SUBFILES command is issued: it finds the appropriate records (the "gg.uuu" record, the PUBLIC record, the group "gg" record, etc.) and displays the SUBFILE-NAME values stored therein.

This example also suggests a common characteristic, though not a requirement, of such files: usually, the records in INDEX-GOAL are created and maintained by SPIBILD.

When you design a file with chain passing, try to minimize the amount of redundant data passed and stored. You might consider "chain-passing" only those elements that will in turn be passed to another index, since you can use subgoal processing in formats to access the data in the original GOAL records.

Chain passing is inefficient for files that do not require data to be passed from GOAL through INDEX-GOAL to INDEX. This is because SPIBILD, as it processes GOAL records during chain passing, creates or updates INDEX-GOAL records, which are then put back in the deferred queue, rather than put into the INDEX-GOAL tree. Then the INDEX-GOAL records are processed. So if none of the INDEX-GOAL records created or updated by SPIBILD during GOAL processing will have data passed through to the INDEX record, then the INDEX-GOAL records should be put in the tree and not back in the deferred queue to be processed again.

C.12.6  Goal-to-Same-Goal Passing

An unusual variation on goal-to-goal passing is the case where a goal record-type passes to itself: it serves as its own index. Goal-to-same-goal passing is allowed on any non-REMOVED record-type.

Goal-to-same-goal passing can be used when cross-references to and from other goal records are desired. For example, suppose you have a subfile of "index cards" in which you keep notes for a paper you are writing:

The elements REFERENCE.TO and REFERENCED.BY are optional elements. You enter values for REFERENCE.TO; values for REFERENCED.BY will be created by SPIBILD from values of CARD.NUMBER in other records with appropriate REFERENCE.TO values. For instance, a record that looks like the one on the left would cause SPIBILD to create the one on the right:

Of course, chances are that there already is a record 14, so SPIBILD would merge the REFERENCED.BY element into that record. (In this situation, you could use a table-lookup processing rule to verify that any "referenced" records already exist when the record is added. To allow forward references, i.e., references to later cards that have not yet been written and added, you might set the error level for the lookup rule to W, for warning. Thus, if you add a record that references a non-existent record, you receive a warning message, but the error does not cause the record to be rejected.)
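
The way REFERENCED.BY values arise from REFERENCE.TO values can be sketched in Python. The card numbers other than 14 and the NOTE element are hypothetical, and `build_referenced_by` models only the outcome, not SPIBILD's processing.

```python
def build_referenced_by(cards):
    """Derive REFERENCED.BY from REFERENCE.TO across all cards.

    A referenced card that does not exist yet is created with only the
    REFERENCED.BY element (a forward reference)."""
    for number in list(cards):  # snapshot: cards may be created below
        for target in cards[number].get("REFERENCE.TO", []):
            target_card = cards.setdefault(target, {})
            referenced_by = target_card.setdefault("REFERENCED.BY", [])
            if number not in referenced_by:
                referenced_by.append(number)
    return cards


cards = {
    12: {"NOTE": "first draft", "REFERENCE.TO": [14]},
    13: {"NOTE": "open question", "REFERENCE.TO": [20]},  # forward reference
    14: {"NOTE": "background"},
}
build_referenced_by(cards)
```

Card 14 already existed, so the reference is merged into it; card 20 did not, so it is created holding nothing but the reference.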

If SPIBILD might create new goal records, you must make all elements except the key Optional, just as in other goal-to-goal passing designs, unless other elements will be passed via sub-indexes or qualifiers, in which case those elements do not have to be Optional. [See C.12.3.1.]

If you allow "forward referencing", that is, if you allow SPIBILD to create goal records because the referenced goal record has not yet been added, a very important limitation goes into effect: you may not batch or load records into the subfile from SPIBILD. Why not? Suppose records 15 through 17 are batched into the subfile, and record 15 references record 17. When SPIBILD processes record 15, it will create a record 17 with the reference data. Then when record 17 is batched, it will be rejected if it is being added, since a record 17 already exists. On the other hand, if record 17 is being updated (or "addupdated"), the batched record 17 will completely replace the earlier SPIBILD-created record 17, which means the reference data will be lost.

One way to circumvent this problem is to use ADDMERGE to add the records: either use the INPUT ADDMERGE command in SPIBILD, or use INPUT BATCH, placing "ADDMERGE;" at the start of each record in standard SPIRES format. (See the manual "SPIRES File Management" for details.)
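
The three outcomes described above (rejection, replacement and merging) can be compared in a Python sketch. The function and the NOTE element are hypothetical, and the merge is modeled simply as combining the elements of the two records.

```python
def batch_record(subfile, key, record, mode):
    """Apply one batched record and return a short outcome string."""
    if mode == "ADD":
        if key in subfile:
            return "rejected"
        subfile[key] = dict(record)
        return "added"
    if mode == "ADDUPDATE":
        subfile[key] = dict(record)  # the whole record is replaced
        return "stored"
    if mode == "ADDMERGE":
        subfile.setdefault(key, {}).update(record)  # existing elements kept
        return "merged"
    raise ValueError(f"unknown mode: {mode}")


def after_spibild():
    """State after record 15 was processed: SPIBILD has created record 17."""
    return {17: {"REFERENCED.BY": [15]}}


card_17 = {"NOTE": "text of card 17"}  # the batched record 17

outcomes = {}
for mode in ("ADD", "ADDUPDATE", "ADDMERGE"):
    subfile = after_spibild()
    status = batch_record(subfile, 17, card_17, mode)
    outcomes[mode] = (status, subfile[17])
```

Only the ADDMERGE case ends with a record 17 that holds both the batched data and the reference created from record 15.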

You can also circumvent this restriction by adding all your records to the deferred queue through SPIRES using one of the INPUT commands there. Then, to create all the index references, process the file in SPIBILD.

C.12a  Indirect Searching

The technique known as "indirect search" links subfile records together, using an index for one subfile with index keys that match the keys in a second subfile's goal record-type. The indexes belonging to the second subfile can then be used in searches for goal records in the first.

For example, suppose you have two subfiles, STUDENTS and COURSES. As you will see later, they may be in the same file, but for the time being, assume they belong to two different files. Here is a list of the relevant elements and indexes of each:

Because the key values appearing in the COURSE.ID index in the STUDENTS subfile are the same as the key values of goal records in the COURSES subfile, the indirect search technique can link the subfiles. That would allow the PROFESSOR index to be added to the list of indexes for the STUDENTS subfile. A search command issued under the STUDENTS subfile such as:

would find students taking courses taught by that professor, even though the PROFESSOR index is built for and stored in the COURSES subfile. This type of index is actually called an "indirect index" in SPIRES terminology.

An indirect index is very simple to set up. It can be done either permanently in the file definition, or dynamically using the DEFINE INDEX command when needed. The DEFINE INDEX technique is covered later in this chapter. [See C.12a.4.]

In the file definition, you simply add an index section to the linkage section of the subfile getting the indirect index (the STUDENTS subfile, in this example). This "indirect linkage" is tied to the index record-type whose keys match the keys in the other subfile's goal records. Its other noteworthy features are the $SEARCH.SUBF proc (action A8) in the SEARCHPROC string, and the $NOPASS proc (a combination of actions A169, A47 and A52) in the PASSPROC string.

For example, here is the complete linkage section for the STUDENTS subfile's goal record-type, including the linkage for the PROFESSOR index:

Notice then that there are two linkages for index ZIN02 -- one normal linkage, and one indirect linkage. The first one, because of its PASSPROC rules, will handle all the index building of ZIN02; the indirect one will not.

The $SEARCH.SUBF proc tells SPIRES:

In the example, a search such as FIND PROFESSOR = CHIPS would send SPIRES to the COURSES subfile, use the value CHIPS in a search of the PROFESSOR index there, retrieve the key values (COURSE.ID values) of those records, and use them to search ZIN02 for STUDENTS goal records.
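
The two-step lookup just described can be modeled in Python. The index contents are hypothetical sample data and `indirect_find` is an invented name; the sketch shows only the order of the lookups.

```python
# PROFESSOR index in the COURSES subfile: value -> COURSE.ID keys.
professor_index = {"CHIPS": ["LATIN1", "GREEK2"]}

# ZIN02 in the STUDENTS file: COURSE.ID -> pointers to STUDENTS records.
zin02 = {"LATIN1": ["S1", "S2"], "GREEK2": ["S2", "S3"]}


def indirect_find(value, indirect_index, direct_index):
    """Search the indirect index first, then use each key it returns
    as a search value in the direct index."""
    result = []
    for key in indirect_index.get(value, []):
        for pointer in direct_index.get(key, []):
            if pointer not in result:
                result.append(pointer)
    return result


students_found = indirect_find("CHIPS", professor_index, zin02)
```

A student taking two of the professor's courses is reported once, since the pointers from both course keys are combined into a single result.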

Complete details on action A8 appear later in this manual. See the SPIRES manual "System Procs" for information on its system proc counterpart, $SEARCH.SUBF.

The remainder of this chapter will discuss applications that can take advantage of the indirect search technique, as well as particular details of the feature that file owners may need to know. [See C.12a.1, C.12a.2.]

In general, the following terminology will be used in discussions of indirect search:

C.12a.1  Coding Details for Indirect Indexes

This section will present details about indirect searching that may be important to the application developer trying to use the feature. The information presented in the previous section, along with the description of action A8 (or system proc $SEARCH.SUBF), will be adequate for many applications. [See C.12a.] In general, the material here will discuss restrictions that will not apply to most applications.

Before noting restrictions, however, it is worth mentioning that the BROWSE command may be issued freely against the indirect index. For instance, using the example in the previous section, the command BROWSE PROFESSOR could be issued from either the STUDENTS or the COURSES subfile, with the same effect.

Another capability that has interesting effects is that an indirect index may point to another indirect index, and so forth. That means a search could span across many different indexes, subfiles and record-types before returning a search result for the current subfile.

In terms of restrictions, the most important applies to the direct index, i.e., the index of the direct goal, which will be used in the final part of the search. That index must be a simple index or a goal-index, though it may have qualifiers and sub-indexes (see below). In other words, the A8 action ($SEARCH.SUBF proc) must be specified in the SEARCHPROC string of a simple index or goal-index, not of a compound index, a qualifier, or a sub-index. [If you add the processing rule to a goal-index's SEARCHPROC string, be aware that the keys of the goal-index (i.e., the keys of the Direct Goal) must match the keys of the Indirect Goal for successful matches to be found.]

The direct index can, in fact, be the goal record-type itself, as an index. [See B.8.2.1.] In order to do this, you will need to include A10 ($GOAL.INDEX proc) in the SEARCHPROC string.

The indirect index named by A8 must also be a simple index, which serves the indirect goal. It too may have qualifiers and sub-indexes. However, they cannot be used directly via the indirect index, with the exception of sub-indexes associated with action A38 ($PNAME), which allows personal name indexing to work properly via the indirect index. [You may be able to use action A7 ($REPARSE proc) to create sub-index or qualifier references for the indirect index to use. Remember to code the A7 in the SEARCHPROC in the index's linkage to the indirect goal, not in the SEARCHPROC in the index linkage for the direct goal.]

As mentioned above, the direct index may have qualifiers or sub-indexes. You can use them as part of an indirect search. Of course, they must be coded as part of the indirect linkage in the direct goal's linkage section if you want to use them in the indirect search. Remember not to pass anything to them, since that is taken care of by the normal linkage for the direct index.

There is a restriction, though, if the direct index is a personal name index -- that is, if it contains a sub-index and it is normally processed with action A38 ($PNAME). That means that the direct index and the indirect goal both use the last name as the record key, and the first name is stored elsewhere. The restriction is that actions A8 and A38 ($SEARCH.SUBF and $PNAME) cannot co-exist in the same SEARCHPROC string. In other words, A38 cannot be used in the A8 SEARCHPROC string, so first names cannot be used directly in the search. (As restrictive as this ban sounds, it would probably not affect many files -- not many applications would want an indirect goal to use a personal last name as its record key.)

The general rule is that the SEARCHPROC containing the A8 cannot also contain A38 ($PNAME) nor can it contain A45 ($BREAK, $BREAK.HEX or $WORD). However, the SEARCHPROC of the index pointed to by the A8 can have any of these processing rules.

The SEARCHPROC containing the A8 may not also contain any of the truncation SEARCHPROC rules, such as A14 ($SEARCH.TRUNC). Again, this restriction does not apply to the SEARCHPROC of the index pointed to by the A8. It may have any of the truncation rules, which can be used in the indirect index search.

Alternatively, you can avoid that other SEARCHPROC string completely if you want. By coding a P1 parameter of "1" on A8 (or coding the LASTRULE parameter in $SEARCH.SUBF), you can tell SPIRES that you do not want the current search value processed by the SEARCHPROC string of the index being pointed to. Instead, the value should go directly to index processing "as is".

Another important restriction is that A8 must be the last rule in the SEARCHPROC string. If it is not, an error will occur when the action is executed. In fact, errors caused by violating these restrictions will be detected during execution (an E8 error will occur), not during file definition compilation.
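
Since these violations surface only at execution time, it may help to check a SEARCHPROC string by hand before compiling. The Python sketch below is a hypothetical aid, not a SPIRES facility. It covers only the rules named above, and A14 is the only truncation rule it knows about because it is the only one named here.

```python
# Actions that the text above says may not share a SEARCHPROC with A8.
FORBIDDEN_WITH_A8 = {
    "A38": "$PNAME",
    "A45": "$BREAK, $BREAK.HEX or $WORD",
    "A14": "$SEARCH.TRUNC",
}


def check_a8_searchproc(actions):
    """Return a list of problems for a SEARCHPROC given as action names.

    An empty list means none of the restrictions modeled here is broken.
    """
    problems = []
    if "A8" not in actions:
        return problems
    if actions[-1] != "A8":
        problems.append("A8 must be the last rule")
    for action in actions:
        if action in FORBIDDEN_WITH_A8:
            problems.append(f"{action} cannot share a SEARCHPROC with A8")
    return problems


ok = check_a8_searchproc(["A8"])
bad = check_a8_searchproc(["A38", "A8"])
```

The same rules may still appear in the SEARCHPROC of the index that A8 points to, so that string should not be run through this check.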

One final restriction: an S299 error will result if all three of the following conditions are met:

C.12a.2  Uses for Indirect Searching

This section will discuss some ways in which the indirect search feature can be used. In all cases, indirect search broadens the user's searching power. The suggestions given here do not represent a complete catalog of its uses -- like many other SPIRES features, it may solve problems completely unconsidered by its designers.

One use for indirect search was discussed earlier. [See C.12a.] In the STUDENTS and COURSES subfiles, it allowed users to find student records in the STUDENTS subfile based on information about the courses those students were taking that was stored in the COURSES subfile.

In that example, the subfiles were declared to be in two different files, but that is not a requirement. In fact, the two subfiles may be in the same file, and the direct index may even be the same record-type as the indirect goal, in a simple form of goal-to-goal passing. [See C.12.2.] For example, here is a possible outline of the two goal record-types, showing how the COURSES goal record-type could contain both its own goal data and the pointer data for the STUDENTS subfile:

RECORD-NAME = REC01;              RECORD-NAME = REC04;
  COMMENT = STUDENTS goal-record;   COMMENT = COURSES goal-record
  REMOVED;                                    and STUDENTS index;
  FIXED;                            FIXED;
    KEY = STUDENT.ID; LEN = 8;        KEY = COURSE.ID; LEN = 10;
  OPTIONAL;                         OPTIONAL;
    ELEM = COURSE.ID; LEN = 10;       ELEM = PROFESSOR;
    ELEM = STUDENT.POINTER;             TYPE = LCTR;

The linkage section for the STUDENTS subfile would still be the same as it was in the earlier example, except that the name of the pointer element changed from POINTER to STUDENT.POINTER. Pointer information would be passed by SPIBILD from REC01 to REC04. There would also be index record-types for REC04; for the COURSES subfile they would serve as direct indexes, but they could also serve as indirect indexes for STUDENTS. In the earlier example, the PROFESSOR element was passed to a direct index for COURSES, becoming an indirect index for STUDENTS.

Thus, it is not uncommon for the indirect goal and the direct index to be combined in the same record-type, using goal-to-goal passing techniques. The set of keys is stored only once, which could be a significant savings for a large file. So consider this technique if you are designing two related files from scratch. But also consider adding indirect indexing to files that already exist with subfiles linked by goal-to-goal passing in this way.

There are some other interesting uses of indirect indexing with special types of files that use goal-to-goal passing, specifically, double-headed files and files that use goal-to-same-goal passing. [See C.12.4, C.12.6.] One goal-to-same-goal application that uses indirect search is a genealogy data base that uses indirect indexes to search for people by a parent, grandparent, great-grandparent, grandchild, etc., using nested indirect searching with only a single direct index, which is the goal itself! [See C.12a.1.]

Indirect Indexes and Coded Data

Let's take the STUDENTS/COURSES example a step further. Suppose that COURSE.ID is a code that cannot be decoded by the untrained eye, and that another element in the COURSES subfile, COURSE.NAME, is the uncoded name. If the COURSES subfile created an index for that element, the STUDENTS subfile could claim it as an indirect index, giving users a way to find students by the actual course name, rather than by the course code supplied for administrative purposes.

Aren't there ways to handle this problem without indirect indexing? Yes, but they have some distinct disadvantages. You might consider storing the COURSE.NAME in the STUDENTS record along with the COURSE.ID. That would cause problems if the COURSE.NAME was changed -- many STUDENTS records would have to be changed.

Another technique is to make COURSE.NAME a virtual element in the STUDENTS record; it could be looked up in the COURSES subfile when the record was retrieved. But this solution brings back another problem: the virtual element value cannot be indexed properly. When you index a virtual element that is based on the value of another element, changing the other element will not cause the virtual element to be re-indexed automatically. All the records that would be affected by the change would have to be placed in the deferred queue for re-passing by SPIBILD.

With the indirect search technique, the course name can easily be changed in one record in the COURSES subfile, and all its indexes will be changed automatically. When those indexes are searched indirectly from the STUDENTS subfile, the course name changes will be in effect without any reprocessing of the STUDENTS goal records.

Note that you might want to add a virtual element or even a phantom structure to link the COURSE.NAME and/or other elements of the COURSES goal record into the STUDENTS subfile, so that the records retrieved by a search of the indirect index will show why they were retrieved. For example, if you issue the command FIND PROFESSOR = HILL and the records displayed for the students have only the COURSE.ID, you might be suspicious or distressed that the professor's name is not shown too.

Thus, indirect indexing can make code handling much more convenient both to the application developer and the searcher. Its power in combination with phantom structures, virtual elements, and table lookups (actions A32 and A65, system procs $LOOKUP and $SUBF.LOOKUP) is quite extensive. [See C.5.1, C.11.2.]

C.12a.3  Some Technical Details on how SPIRES Handles Indirect Indexes

If a search-expression for an indirect index is used with the equality operator (or with no relational operator), then logical operations associated with that search-expression are done at the direct-index level using the pointer-groups retrieved from the relevant direct index. This technique affects results for multiple occurrences of the indexed element.

Here is an example: suppose a file has these record-types:

REC03 is a word index for the DEPT.NAME element of goal-record-type REC02. REC01 uses REC02 as an index for the DEPT.CODE element and also uses REC03 as an indirect index (via REC02 and DEPT.CODE).

Using REC01 as the goal, consider the search:

The point of the search is to find any instructors who teach courses in both ENGLISH and MATH. A record for such an instructor would contain two DEPT.CODEs, each of which relates to a REC02 record having a DEPT.NAME of ENGLISH or MATH. No DEPT.NAME in REC02 is likely to contain both words ENGLISH and MATH.

If SPIRES combined the pointers at the indirect-index level, the search would almost surely give a zero result, because SPIRES would retrieve DEPT.CODE pointers to REC02 from REC03, "and"ing them together then. Since no REC02 record would have a DEPT.NAME that included both ENGLISH and MATH, the result of the "and"ing would be zero, leading back to a zero result for the command.

But in fact, SPIRES will not do the "and"ing until the pointers get all the way back to the direct index. That means the pointers to REC02 for MATH will lead back to the pointers (INSTRUCTOR.NAME) to REC01; the pointers for REC02 for ENGLISH will lead back to the pointers to REC01; and then those pointers will be "and"ed together, which will return the desired result.
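The difference between the two strategies can be modeled with Python sets. This is a sketch under assumed data: the dictionaries `rec03` and `rec02`, the department codes, and the instructor names are invented for illustration and say nothing about how SPIRES stores pointers.

```python
# Hypothetical data (not taken from a real file).
# REC03: word index on DEPT.NAME -> DEPT.CODE pointers to REC02
rec03 = {"ENGLISH": {"ENG"}, "MATH": {"MTH"}}
# REC02 used as the direct index: DEPT.CODE -> INSTRUCTOR.NAME pointers to REC01
rec02 = {"ENG": {"ADAMS", "BAKER"}, "MTH": {"BAKER", "CLARK"}}


def to_goal_pointers(word):
    """Carry one search value all the way back to REC01 pointers."""
    pointers = set()
    for code in rec03.get(word, set()):
        pointers |= rec02.get(code, set())
    return pointers


def and_at_indirect_level(a, b):
    """Combine the DEPT.CODE pointers first, then follow them to REC01."""
    codes = rec03.get(a, set()) & rec03.get(b, set())
    pointers = set()
    for code in codes:
        pointers |= rec02.get(code, set())
    return pointers


def and_at_direct_level(a, b):
    """Follow each value back to REC01 separately, then combine."""
    return to_goal_pointers(a) & to_goal_pointers(b)
```

With this data, combining at the indirect level yields an empty set because no single department code matches both words, while combining at the direct level yields the one instructor who appears under both departments.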

Note that this technique is also applied to iterative searches, which also logically combine the pointers together at the direct-index level.

Note however that the method of performing logical operations at the indirect-index level occurs in either of these circumstances:

C.12a.4  Dynamic Indexes

There may be situations where it is easier or even necessary to establish an indirect search link dynamically, rather than add it to and compile it in the file definition. This can be done with the DEFINE INDEX command.

We'll use the same example [See C.12a.] from earlier in the chapter: you have access to a STUDENTS and a COURSES subfile, which are linked to each other through the COURSE.ID element and index:

The dynamically defined PROF index links to the PROFESSOR index in the COURSES subfile. Keys returned from the PROFESSOR index are COURSE.IDs that SPIRES uses in searches against the COURSE.ID index in the STUDENTS subfile to return a result of students. The 493 students found in the example above are thus students taking courses for which Professor Gizmo is listed as the Professor in the COURSES subfile.

where:

It must not match the name of an existing real index in the current subfile; and if it matches the name of an existing dynamic index, this new dynamic index definition will replace that previous one.

for which the indirect index will return keys.

"ON local-index-name" is optional; you do not need it if the keys returned to the selected subfile by the indirect search (COURSE.IDs in our example) represent the keys of the goal records in the selected subfile, rather than the keys of an index for the selected subfile. That would be the case, for instance, when you have two subfiles whose keys both represent the same type of goal record.

As an example where you don't need the ON option, suppose you have two subfiles like this:

Here the housing information for students is kept in a separate subfile from the main academic data, but the key for both types of goal record is a student ID.

Using indirect indexes lets you tie these subfiles together to search on criteria from both subfiles.

Other useful commands for working with dynamic indexes are:

where "name" is the name of the specific dynamic index you want to disable. The first two commands display and disable, respectively, all the dynamic indexes defined for the currently selected subfile or subfile path.

All dynamic indexes are cleared automatically when the selected subfile is cleared or re-selected.

The DEFINE INDEX command may also be used to define dynamic aliases for indexes in the currently selected subfile, simply by naming both the index for which you want the alias and the name of the currently selected subfile as the "index-name" and the "subfile-name" respectively:

C.13  File Definition Information Packets

Element information packets and index information packets allow the file owner to place specific information in the file definition about individual elements. That information can be accessed by users through system commands such as SHOW ELEMENTS and SHOW ELEMENT DESCRIPTION, and by applications (such as the $REPORT format) through the system functions $ELEMINFO, $ELIDINFO and $INDEXINFO.

The information contained in an information packet has either or both of two purposes:

Information packets consist of structures in the file definition at the element and index definition levels. For elements, the information packet is the ELEMINFO-DEF structure allowed under the SLOT, ELEM or KEY statements -- that is, for each element in a record-type, there may be an occurrence of the ELEMINFO-DEF structure. For indexes, the information packet is the INDEXINFO-DEF structure, allowed in the individual linkage sections, in the sub-index section, and in the global and local qualifier sections.

Each occurrence of the ELEMINFO-DEF structure or INDEXINFO-DEF structure is known as an "information packet". Each element and index may have one packet. A packet consists of various statements ("info-elements"), all but one of them optional; many of them are multiply occurring within the packet. The packet does not have to contain the info-elements for the element, however -- the packet may instead refer to another packet defined at the end of the file definition or in a record in the public subfile EXTDEF, allowing you to separate the element and index information from their definitions, if desired. [See C.13.5.]

As part of the file definition or an EXTDEF record, the info-elements can only be seen by the file owner. The information packets are accessible to others when the given subfile is selected (i.e., when the appropriate record-type is attached). The three system functions $ELEMINFO, $ELIDINFO and $INDEXINFO may be used to retrieve the values of the various statements in the information packets. The first two retrieve element information -- which one you use depends upon whether you have the element name or the four-byte element ID number -- and the third retrieves index information. Their syntaxes are very similar:

The "elemname" is the name of a real, virtual, or phantom element within the selected subfile. The "elem.ID" is a four-byte hex value derived from the $ELEMID variable. The "searchterm" is a valid index-name available to the user (meaning that indexes hidden to the user by NOSEARCH are hidden to this function too). The "info-elem" is the name of the element within the information packet containing the desired information, such as HEADING or DESCRIPTION. The "occurrence" is an integer, starting from 1, denoting which occurrence of the info-element is to be retrieved. The "default" value is optional; if it is coded and if the specified occurrence of the info-element does not exist, the default value will be returned by the function.
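The occurrence and default behavior described above can be modeled roughly as follows. This Python sketch is not the SPIRES function itself: the name `info_lookup`, the dictionary layout of the packet, and the choice to return None when no default is coded are all assumptions made for illustration.

```python
def info_lookup(packet, info_elem, occurrence=1, default=None):
    """Return the requested occurrence (counted from 1) of an info-element.

    packet: hypothetical mapping of info-element name -> list of occurrences.
    If the occurrence does not exist, the default is returned instead.
    Returning None when no default is given is an assumption of this sketch.
    """
    values = packet.get(info_elem, [])
    if 1 <= occurrence <= len(values):
        return values[occurrence - 1]
    return default


# Hypothetical packet for an element.
packet = {
    "HEADING": ["Phone Number"],
    "DESCRIPTION": ["First paragraph.", "", "Second paragraph."],
}
heading = info_lookup(packet, "HEADING")
third_desc = info_lookup(packet, "DESCRIPTION", 3)
missing = info_lookup(packet, "NOTE", 1, "(no note)")
```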

Not all of the info-elements are used by system utilities -- some are currently there for informational use only, such as the element info-element INDEX, which tells what indexes the current element is passed to. Thus, it is possible to use those info-elements to store other information than they were meant to hold. However, because system utilities created in the future may use these "informational" info-elements, it is recommended that you use all info-elements only for the purposes described in their documentation, so that they will work properly with such forthcoming utilities.

Similarly, you may discover info-elements allowed in a file definition other than those documented here. Assigning values to them may or may not be allowed, but is certainly not recommended.

The next section will discuss the various info-elements available in element information packets. [See C.13.1.] Later, the info-elements for index information packets will be covered. [See C.13.3.]

C.13.1  Element Information Packets

Element information packets may be included in any real, virtual or phantom element definition. That is, in a record definition, the ELEMINFO-DEF structure may be coded for any SLOT (SLOT-DEF structure), KEY (KEY-DEF structure) or ELEM (ELEM-DEF structure) statement, e.g.,

The ELEMINFO-DEF structure contains the key element ELEMINFO (which is given a null value, as shown above) and optional info-elements, such as NOTE and DESCRIPTION in the example. The info-elements currently useful in the ELEMINFO-DEF structure are:

Each is described separately.

All of the info-elements for a given element may be seen by issuing the command "SHOW ELEMENT INFORMATION elemname", where "elemname" is the name of the desired element in the selected goal record-type. Other SHOW ELEMENT commands will show you various subsets of this information. [See C.13.2.]

Note that the values of the info-elements picked up by various system utilities such as the $REPORT or $PROMPT formats are generally used as defaults for the element. Suppose, for example, that the element PHN.NUM has an information packet containing a value of "Phone Number" for the COL-HEADING info-element. If a command such as the following is issued:

the format will use the value "Phone Number" for the heading of that element in the display. However, the user can override the info-element "default" by including the HEADING parameter in the SET FORMAT command:

The heading "Phone" will be used, overriding the info-element heading "Phone Number". The user can always locally override an info-element default when using a system utility that takes advantage of info-elements.

C.13.1.1  The ELEMINFO (INFO) Info-element

ELEMINFO is the key element of the element information packet. It is thus required. It appears as the first element in the ELEMINFO-DEF structure. ELEMINFO usually has a null value:

The null value tells SPIRES that the info-elements for the element follow this statement.

If you want the info-elements to be kept in element packets at the end of the file definition or in element packets in an EXTDEF subfile record, you need to give ELEMINFO a value here. The value here will match the value given for ELEMINFO in the other packet. When compiling the element information for the element, SPIRES will use the name given here to find the appropriate packet at the end of the definition or in the appropriate EXTDEF record. [See C.13.5.]

C.13.1.2  The NOTE Info-element

The NOTE info-element usually contains a brief "one-line" description of the element. This description appears as part of the SHOW ELEMENT and SHOW ELEMENT DESCRIPTION commands. [See C.13.2.] Only one occurrence of NOTE is allowed in an information packet. Like all info-elements except ELEMINFO, it is optional.

Although its length may be up to 3072 characters, it is generally rather short -- 50 characters or less. It is frequently used as a means of presenting an unabbreviated or slightly expanded form of an element name:

The DESCRIPTION info-element is generally used for lengthier descriptions of the element's purpose. [See C.13.1.3.]

C.13.1.3  The DESCRIPTION (DESC) Info-Element

The DESCRIPTION info-element is an optional, multiply occurring statement describing the element. Its values are displayed by the SHOW ELEMENT DESCRIPTION command. Also, the system format $PROMPT will display the DESCRIPTION values when the user enters a help character (usually a question mark), requesting more information about the element being prompted for. Thus, DESCRIPTION generally provides details about the element, such as its uses, possible values, types of editing done by processing rules, related elements, etc.

When the SHOW ELEMENT commands and the $PROMPT format display an element description, each occurrence of the DESCRIPTION element begins on a new line. Values longer than the current setting of the $LENGTH variable will wrap into subsequent lines. For ease of entry, a paragraph of explanatory text is usually entered as a single, long occurrence of the DESCRIPTION element. A single value may be as long as 3072 characters (about 42 72-character lines). Of course, since DESCRIPTION is multiply occurring, you can have about as many paragraphs as you like.
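As a rough model of this display behavior, the Python sketch below starts each occurrence on a new line and wraps long values at a given length. It uses the standard textwrap module as a stand-in for SPIRES's own wrapping, which may break lines differently; the function name `render_description` and the treatment of a null occurrence as one blank line are assumptions of the sketch.

```python
import textwrap


def render_description(occurrences, length=72):
    """Lay out DESCRIPTION occurrences, each starting on a new line.

    length plays the role of the $LENGTH setting (72 by default).
    A null (empty) occurrence is rendered as one blank line.
    """
    lines = []
    for value in occurrences:
        if value == "":
            lines.append("")
        else:
            lines.extend(textwrap.wrap(value, width=length))
    return lines


short = render_description(["aaa bbb ccc"], length=7)
paragraphs = render_description(["First paragraph.", "", "Second paragraph."])
```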

Blank lines between paragraphs can be produced by adding an occurrence of DESCRIPTION with a null value:

Tables, examples and other text that you do not want aligned as described above can be entered such that each line of text as you want it displayed is a separate occurrence of DESCRIPTION. Just be sure that each line is shorter than the value of $LENGTH set by the user. (The default for $LENGTH, which is seldom changed, is 72.) Lines with leading blanks should be enclosed in quotation marks (") as shown in the example below:

Those three "paragraphs" would appear like this:

C.13.1.4  The HEADING (HEAD) Info-element

The singly-occurring HEADING info-element usually contains a short description of the element. It is often used as a "super-mnemonic" for the element, since it may contain blanks and may exceed 16 characters, breaking two limits of element names.

The value can be used to represent the element name in utilities such as the $REPORT and $PROMPT formats. The value will also be displayed by the SHOW ELEMENTS command.

Various system utilities may use the HEADING value when displaying the "name" of the element. For example, the $REPORT format may use this value if the related element appears in a report. The value will be used as the column heading for the element. (It may be overridden locally by the use of the HEADING parameter in the SET FORMAT $REPORT command, and overridden globally by the COL-HEADING info-element, as described below.)

The value may be any length, but since column headings in $REPORT are limited to 36 characters, a warning message will appear if you exceed that length when the file definition is added to the FILEDEF subfile. (If the width of the area allocated for the element by $REPORT is shorter than HEADING, then HEADING will be truncated to that width for the report.)

For $REPORT horizontal reports, if COL-HEADING exists for a given element, it will override HEADING. So if you want the $REPORT heading to differ from the "heading" shown by the various SHOW ELEMENT commands, use COL-HEADING instead. [See C.13.1.5.] COL-HEADING also supports multiple-row headings, while HEADING does not.

The $PROMPT format will use the HEADING value for the element's prompt string when records are added, merged or displayed.

C.13.1.5  The COL-HEADING (COLHEAD) Info-element

The COL-HEADING info-element contains a value or values to be used as headings for the element in formats. It is currently used by the $REPORT format to construct a column heading for the given element. The $REPORT format will check for occurrences of this info-element if no HEADING parameter appears in the SET FORMAT command for the element. If neither of those is available, $REPORT will use the HEADING info-element instead. [See C.13.1.4.]

The COL-HEADING info-element may have as many occurrences as you want, and each occurrence may have any length desired. However, a warning message will be displayed if any input occurrence for COL-HEADING exceeds 36 characters. The $REPORT format will use only the first 36 characters total from the first three occurrences, constructing a multiple-row heading.
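The order in which $REPORT picks a column heading can be summarized in a short Python sketch. It models only the precedence described here; the function name, the packet layout, and the final fallback to the element name are assumptions, and the 36-character limit is deliberately not modeled.

```python
def report_heading(set_format_heading, packet, element_name):
    """Choose heading rows for one column, highest priority first.

    1. HEADING parameter in the SET FORMAT command
    2. COL-HEADING info-element (up to three occurrences, one row each)
    3. HEADING info-element
    4. element name (an assumption of this sketch, not stated in the text)
    """
    if set_format_heading is not None:
        return [set_format_heading]
    col_headings = packet.get("COL-HEADING", [])
    if col_headings:
        return col_headings[:3]
    heading = packet.get("HEADING")
    if heading:
        return [heading]
    return [element_name]


# Hypothetical packet for the PHN.NUM element.
packet = {"COL-HEADING": ["Phone", "Number"], "HEADING": "Phone Number"}
overridden = report_heading("Phone", packet, "PHN.NUM")
from_colhead = report_heading(None, packet, "PHN.NUM")
from_heading = report_heading(None, {"HEADING": "Phone Number"}, "PHN.NUM")
```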

C.13.1.6  The WIDTH (WID) Info-element

The singly-occurring WIDTH info-element is a positive integer specifying the default width of the display area for the element value. It may be used by applications to determine how much space to reserve for an element value.

For example, for the $REPORT format, it overrides the default width of the element assigned by $REPORT but is itself overridden if a width value appears for the element in a SET FORMAT $REPORT command.

For example,

would create a field 30 characters wide for the given element in a report created with $REPORT unless the SET FORMAT $REPORT command specified a width value for the element.

C.13.1.7  The ADJUST (ADJ) Info-element

ADJUST, a singly occurring info-element, can tell applications how to adjust the element data in a fixed-width field. An element that is numeric, for example, is usually "right-adjusted" in the field. Only one of a limited number of values is allowed:

where "adjust" is one of the following:

SPIRES will store the full-length version of the info-element value in the file definition -- that is, if you specify C for ADJUST, SPIRES will change it to CENTER for storage.

ADJUST is used by the $REPORT format to determine the adjustment of the value in its display. Like COL-HEADING, ADJUST overrides the default adjustment of the element done by $REPORT but is overridden by any explicit adjustment of the element specified in the SET FORMAT command. For LEFT, RIGHT and CENTER, if the length of the element value exceeds the allowed length, the value will wrap into the next row.

C.13.1.8  The INDENT Info-element

The singly-occurring INDENT info-element has an integer value that tells applications how to handle element values that wrap into subsequent rows on output. Its syntax is:

where "n" is a positive or negative integer. If positive, $REPORT will indent the value for the first row that number of spaces, creating a "paragraph" indent. If negative, $REPORT will indent the value in subsequent rows that number of spaces, creating a hanging indent.

The $REPORT format will use the INDENT info-element if it is available. However, an INDENT parameter used in the SET FORMAT $REPORT command will override it. Also, an ADJUST parameter in the SET FORMAT $REPORT command will cause the INDENT value in the packet to be ignored if the adjustment specified is TRUNCATE.
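The effect of positive and negative INDENT values on a wrapped value can be illustrated with Python's textwrap module. This is only a sketch of the two indent styles, not of $REPORT's actual line-breaking; the function name `wrap_with_indent` is hypothetical.

```python
import textwrap


def wrap_with_indent(value, width, indent):
    """Wrap a value into rows of at most `width` characters.

    indent > 0: indent the first row (a "paragraph" indent).
    indent < 0: indent every row after the first (a hanging indent).
    """
    pad = " " * abs(indent)
    if indent > 0:
        return textwrap.wrap(value, width=width, initial_indent=pad)
    if indent < 0:
        return textwrap.wrap(value, width=width, subsequent_indent=pad)
    return textwrap.wrap(value, width=width)


paragraph_style = wrap_with_indent("aaa bbb ccc ddd", 8, 2)
hanging_style = wrap_with_indent("aaa bbb ccc ddd", 8, -2)
```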

C.13.1.9  The MAXROWS (MAXROW) Info-element

The MAXROWS info-element, which is singly occurring, holds a positive integer that tells an application the maximum number of rows to allow for all occurrences of the element for a single record. This info-element is currently used by the $REPORT format. The default value used by $REPORT is 100 rows. Hence, the info-element is generally used only if the file owner knows that the element will contain so many or such large occurrences that more than 100 rows would be needed to display them all.

In $REPORT, if several elements have MAXROWS values, the format will use the highest one. If the SET FORMAT $REPORT OPTION command is issued with the MAXROWS option, that value will override the MAXROWS value set from elements already specified. However, if other elements with MAXROWS info-elements are added to the report later, the maximum MAXROWS value among them will be used, if it is larger than the one currently set by the format.
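The interaction between element MAXROWS values and the MAXROWS option can be traced with a small Python model. The class below is hypothetical, and it assumes that the 100-row default acts as a floor that element values can only raise; the text does not say what happens when every element's MAXROWS is below 100.

```python
class ReportMaxRows:
    """Hypothetical model of how the effective MAXROWS value evolves."""

    DEFAULT = 100  # default row limit used by $REPORT

    def __init__(self):
        self.current = self.DEFAULT

    def add_element(self, maxrows=None):
        """Add an element; a larger MAXROWS info-element raises the limit."""
        if maxrows is not None and maxrows > self.current:
            self.current = maxrows

    def set_option(self, maxrows):
        """The MAXROWS option overrides whatever the elements set so far."""
        self.current = maxrows


limit = ReportMaxRows()
limit.add_element(150)
limit.add_element(120)   # lower than the current value, so no change
after_elements = limit.current
limit.set_option(110)    # the option overrides the element-derived value
after_option = limit.current
limit.add_element(200)   # a later element with a larger value wins again
after_later_element = limit.current
```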

C.13.1.10  The EDIT Info-element

The singly-occurring info-element EDIT provides an edit mask for applications to apply to the value. It could also be applied to any computations done with the element, such as totals or averages. It often matches the edit mask, if any, coded in the element's OUTPROC -- the reason for the duplication is that the edit mask coded in the OUTPROC cannot be accessed for other uses by an application.

The syntax of the EDIT info-element is:

where "edit-mask" is a valid SPIRES edit mask and "divisor" is an integer representing a "power-of-ten divisor" by which the element value should be divided before the edit mask is applied.

The $REPORT format uses the EDIT info-element. The edit mask specified here will be used not only for occurrences of the element in the detail lines of the report but also for totals and other functions in which the element is used. For detail lines, it can be overridden by an EDIT parameter for the element in the SET FORMAT command; for the functions, it can be overridden in each function's parameter list in the SET FORMAT command. For more information on valid edit masks in SPIRES, [EXPLAIN EDIT MASKS.] For more information on using edit masks with $REPORT, [EXPLAIN $REPORT FORMAT, EDIT PARAMETER.]

C.13.1.11  The VALUE-TYPE (VTYPE) Info-element

VALUE-TYPE is a singly-occurring info-element whose value describes the type of value the element represents. For example, a VALUE-TYPE of DATE indicates that element values are dates, stored in the 4-byte hex form favored by SPIRES. The type entered here is more general than the strict data type specified by the TYPE statement in an element definition (e.g., TYPE = STRUCTURE or TYPE = LCTR).

The syntax of the VALUE-TYPE statement is:

where "vtype" is one of the following values:

The first three characters may be used to specify each value-type. (The expanded, full form will be stored, however.) Only one value-type is allowed per element.

The value given for VTYPE does not have to match the real attributes of the stored element. An element stored as a string (a zipcode element, for instance) could be called NUMERIC; however, its values must be convertible to numbers by SPIRES.

The VALUE-TYPE info-element affects the $PROMPT format when its value is STRUCTURED: the $PROMPT format will prompt for and display the structure as a single value, not as the individual elements within it. As the description of STRUCTURED above suggests, an element should not have STRUCTURED for a VTYPE value unless it is processed by $STRUC (A33) processing rules.

The $REPORT format will use the VALUE-TYPE info-element if the value is NUMERIC or TEXT, in situations where $REPORT options are specified to globally affect NUMERIC or TEXT values (such as the DEFAULT option).

C.13.1.12  The INDEX Info-element

The INDEX info-element specifies the indexes to which the element is passed. This info-element is multiply occurring, with each value representing a searchterm identifying an index. The multiple occurrences may be entered as a single occurrence of INDEX, whose syntax is:

Although they may be entered as a single occurrence, multiple occurrences will be stored separately.

C.13.1.13  The USERINFO Info-element

The multiply-occurring USERINFO info-element may be used to hold any data the file owner desires. In most cases in which it is used, USERINFO holds information to help drive specific applications.

Its syntax is simply:

where "text" is any textual information desired. Null occurrences are allowed too, as they are for DESCRIPTION. [See C.13.1.3.]

C.13.1.14  The INPUT-OCC (INOCC) Info-element

The singly-occurring INPUT-OCC info-element indicates the maximum number of element occurrences that system utilities should prompt for during input.

where "n" is an integer value from 1 to 32767.

The $PROMPT format uses this info-element, if specified, to determine the maximum number of times to prompt for occurrences of a multiply-occurring element. Its most common value is "1", specified when the user is allowed to enter multiple occurrences of an element as a single occurrence, which is broken apart into multiple occurrences by the $BREAK (A45) rule. Hence the user would be prompted only once, though multiple values could be specified.

Remember that this info-element, like the others, does not itself affect the values or the number of values allowed for the element it describes. If you want to absolutely control the number of occurrences of an element, you must use the OCC statement and/or the $MAX.OCC, $MIN.OCC, or $DEFAULT (A123, A146) rules. However, if you have used one of those methods to limit the number of occurrences of an element to, say, five, you might want to limit the number of times the $PROMPT format will prompt for occurrences to five as well, by coding this info-element.

C.13.1.15  The DEFAULT Info-element

The singly-occurring DEFAULT info-element serves two purposes. First, it tells system utilities that an Inclose rule for the element will supply a default value for that element if none is provided. Second, it tells utilities what that value is, so that it can be shown or described to the user. Its syntax is:

where "value" is optional. The value may be either the actual default value that will be provided (e.g., "Palo Alto" for a CITY element), or it may be a description of the value (e.g., "your account number" for an ACCOUNT element). SPIRES utilities will assume that "value" is a description of the value, not the actual value; thus, no SPIRES utility will use "value" as the default input value for an element.

The $PROMPT format uses this info-element in several ways. If a value is supplied, it will be shown to the user when the element is prompted, along with the suggestion that a null response to the prompt will supply the default value. Moreover, if the element is a Fixed or Required element, the $PROMPT format will allow the user to skip the element by simply pressing the return key, regardless of the value of DEFAULT (i.e., it can be null). (If there is no DEFAULT info-element, $PROMPT requires the user to provide some value for a Fixed or Required element, since it does not know about Inclose processing.)

C.13.1.16  The SUPPLIED Info-element

The singly-occurring SUPPLIED info-element is similar to the DEFAULT info-element. [See C.13.1.15.] An occurrence of this info-element means that a value for the element will be supplied by SPIRES during record input, whether the user provides a value or not. If the user does provide a value, that value will be discarded. The value of the SUPPLIED info-element, if specified, describes the value that will be supplied by the utility for the element during input. A DATE.UPDATED element might have a SUPPLIED info-element, for example.

The syntax of the SUPPLIED statement is:

where "value" is optional. The value, if specified, should be the value (or a description of the value) that a utility would supply for the element during record input.

The $PROMPT format uses the SUPPLIED info-element in two ways. First, if the SUPPLIED info-element (whether it has a value or not) is specified for an element, then $PROMPT will not include it when establishing a default element list. That is, if the user issues the command SET FORMAT $PROMPT without specifying an element list, elements having a SUPPLIED info-element will not be included. But even if such an element is explicitly named in the $PROMPT element list, $PROMPT will bypass any prompt for new input for it when a record is being added; the current value will be displayed during MERGE and DISPLAY operations, however. If a SUPPLIED value is specified, that value will be shown during an ADD operation before $PROMPT proceeds to the next element:

The value of the SUPPLIED info-element is "Today's Date". (The underscore character "_" indicates the cursor position after the ADD command has been issued; the user does not get to input any value for the DATE.UPDATED element.) Again, the SUPPLIED value is displayed only when the element is actually named in the $PROMPT element list; otherwise, the element will be completely bypassed during input and display processing.
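The way $PROMPT treats SUPPLIED elements can be summarized in a short Python sketch. The function names and the packet layout are hypothetical; the sketch models only the two rules described in this section, not the display of current values during MERGE and DISPLAY operations.

```python
def default_prompt_list(elements):
    """Build the default element list, leaving out SUPPLIED elements.

    elements: list of (name, packet) pairs, where packet is a hypothetical
    mapping of info-element names to values. SUPPLIED counts as present
    whether or not it has a value.
    """
    return [name for name, packet in elements if "SUPPLIED" not in packet]


def prompts_on_add(packet):
    """During ADD, a SUPPLIED element is never prompted for, even if named."""
    return "SUPPLIED" not in packet


elements = [
    ("NAME", {}),
    ("DATE.UPDATED", {"SUPPLIED": "Today's Date"}),
    ("ACCOUNT", {"SUPPLIED": ""}),
]
default_list = default_prompt_list(elements)
```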

C.13.1.17  The RDBMS_COLUMN Info-element

The singly-occurring RDBMS_COLUMN info-element may be used to relate a SPIRES data element to a particular RDBMS table Column name. A number of SPIRES applications are able to utilize this correspondence when passing data to or from an RDBMS database. Note that naming conventions of SPIRES data elements can be very different from those in relational systems (e.g., SPIRES data element names have a maximum length of 16 characters while an RDBMS system may allow up to 32 characters in a Column name).

The $COLUMNTEST function is available for applications that wish to find a SPIRES data element name that corresponds to a given RDBMS column name.

C.13.1.18  The RDBMS_DATATYPE Info-element

This singly-occurring info-element may be used to specify the RDBMS Column data type of the RDBMS_COLUMN that corresponds to a SPIRES data element. This data type is generally not represented in the same manner that it is in SPIRES. For example, a data element of type STRING might be represented by "RDBMS_DATATYPE = VARCHAR;".

C.13.1.19  The RDBMS_DATALENGTH Info-element

This singly-occurring info-element may be used to specify the RDBMS Column length value of the RDBMS_COLUMN that corresponds to a SPIRES data element. This data length value may differ from the WIDTH info-element, as they serve different purposes. The WIDTH value tells SPIRES output processes the width of the field for output purposes, whereas RDBMS_DATALENGTH holds the maximum width of a Column in an RDBMS table.

C.13.2  System Commands and Utilities Using Element Information

This section lists the SPIRES commands or utilities that take advantage of element information packets, if they are coded. The specific info-elements used by each are noted as well.

The SHOW ELEMENTS command

The output from a SHOW ELEMENTS command looks like this:

For each element, the primary mnemonic, followed by its aliases, is displayed, all in uppercase. After the dash (--), the HEADING info-element is displayed. To the right, starting in column 40, is the NOTE info-element. Wrapping is done as needed, as the example shows.

If the SHOW ELEMENTS command is followed by the name of a single element, the information described above is listed for that element only (and all the elements within it if it is a structure).

The SHOW ELEMENT DESCRIPTION command

The SHOW ELEMENT DESCRIPTION command displays descriptive information about the elements in the subfile -- all of them, in fact, unless a specific element is named in the command, e.g., "SHOW ELEMENT DESCRIPTION DATE".

The first line gives the element's primary mnemonic and aliases, followed by the value of the HEADING info-element (after the dash). The next line displays the NOTE info-element. Following a blank line come the occurrences of the DESCRIPTION element. [See C.13.1.3.]

Again, if the element named is a structure, the same information for all the elements within the structure will be displayed.

The SHOW ELEMENT INFORMATION command

The SHOW ELEMENT INFORMATION command shows all the element information for all elements in the subfile, unless a specific element is named, as in the example below:

Each info-element coded for the element is displayed, with all but the HEADING (in the first line, following the dash as always) identified with titles. With the exception of the info-elements WIDTH, ADJUST, INDENT and EDIT, no indication appears in the display if a given info-element is not coded. If none of the four exceptions is coded, that line of the display is omitted; but if any one of the four is coded (as the example shows), the entire line is displayed.

If the element named in the SHOW ELEMENT INFORMATION command is a structure, the element information for all the elements within it will also be displayed.

The $REPORT Format

If an element having element information is named in a SET FORMAT command that is setting or resetting the $REPORT format, then $REPORT will use various info-elements to set up the display parameters for that element, unless they are explicitly overridden in the SET FORMAT command. The info-elements used by $REPORT are:

Details about their use with $REPORT appear in their individual descriptions.

The $PROMPT Format

If an element having element information is used by the $PROMPT format, then $PROMPT will use various info-elements to control the execution of the format. The info-elements may be overridden in the SET FORMAT $PROMPT command. The info-elements used by $PROMPT are:

Details about their use with $PROMPT appear in their individual descriptions.

C.13.3  Index Information Packets

An index information packet contains information about a particular index of the subfile. It may be coded in individual linkage sections, in the sub-index sections and in global and local qualifier sections of the file definition. Here, for example, is a linkage section for a simple index that contains an information packet:

Note the similarities to element information packets. First, the packet begins with the key element INDEXINFO, which has a null value -- this announces the start of the index information packet. Also, three of the info-elements, NOTE, DESCRIPTION and VALUE-TYPE (and USERINFO, not shown in the example) have counterparts in element information packets.

The complete list of info-elements available in index information packets is:

Each is described separately. Remember that each is optional, with the exception of INDEXINFO.

Index information packets do not have to be stored in the index definition -- they may appear at the end of the file definition in record-level INDEXINFO-DEF structures, or in EXTDEF subfile records. In such cases, the INDEXINFO element in the index definition is given a non-null value, to tell SPICOMP what information packet in one of these alternate locations to use. [See C.13.5.]

C.13.3.1  The INDEXINFO (INFO) Info-element

INDEXINFO is the key element of the index information packet, and so is required. It appears as the first element in the INDEXINFO-DEF structure. INDEXINFO currently may have only a null value:

The null value tells SPIRES that the info-elements for the index follow this statement.

If you want the info-elements to be kept in packets at the end of the file definition or in packets in an EXTDEF subfile record, you need to give INDEXINFO a value here. The value here will match the value given for INDEXINFO in the other packet. When compiling the index information for the index, SPIRES will use the name given here to find the appropriate packet at the end of the definition or in the appropriate EXTDEF record.

C.13.3.2  The NOTE Info-element

The NOTE info-element usually contains a brief "one-line" description of the index. This description appears as part of the SHOW INDEX DESCRIPTION command. [See C.13.2.] Only one occurrence of NOTE is allowed in an information packet. Like all info-elements except INDEXINFO, it is optional.

Although its length may be up to 3072 characters, it is generally rather short -- 50 characters or less. The DESCRIPTION info-element is generally used for lengthier descriptions of the index's purpose.

C.13.3.3  The DESCRIPTION (DESC) Info-Element

The DESCRIPTION info-element is an optional, multiply occurring statement describing the index. Its values are displayed by the SHOW INDEX DESCRIPTION command. DESCRIPTION usually provides details about the index, such as its uses, possible values, types of editing done by processing rules, related indexes, etc.

When the SHOW INDEX DESCRIPTION command displays an index description, each occurrence of the DESCRIPTION element begins on a new line. Values longer than the current setting of the $LENGTH variable will wrap into subsequent lines. For ease of entry, a paragraph of explanatory text is usually entered as a single, long occurrence of the DESCRIPTION element. A single value may be as long as 3072 characters (about 42 72-character lines). Of course, since DESCRIPTION is multiply occurring, you can have about as many paragraphs as you like.

DESCRIPTION info-elements for indexes follow the same rules as those for elements. [See C.13.1.3 for suggestions on coding them.]

C.13.3.4  The SOURCE (SOU) Info-element

The SOURCE info-element names the elements whose values are passed to the index. It is multiply occurring, since multiple elements may be passed. The syntax is:

where "elem-name" is the name of an element being passed to the index. SPIRES verifies that each value follows the rules of form for element names; however, it cannot verify that each name is the actual name of an element in the goal record-type nor that the element named is actually passed to the index.

Although you can enter multiple element names in one value for SOURCE, SPIRES will split them into separate occurrences for storage.

C.13.3.5  The VALUE-TYPE (VTYPE) Info-element

The VALUE-TYPE info-element describes the values being passed to the index. It is similar to the VALUE-TYPE info-element for element information packets. [See C.13.1.11.] Most of the allowed values there are valid here too.

The syntax of the VALUE-TYPE info-element for index information packets is:

where "vtype" is one of the following:

Only one value-type is allowed per index.

C.13.3.6  The TRUNCATE (TRUNC) Info-element

The singly-occurring TRUNCATE info-element, if coded, indicates that the index allows truncated searching and, if it has a single-character value, what the truncation character is. Its syntax is:

where "x" is the truncation character named in the appropriate SRCPROC, a single character. You may code a null value (i.e., "TRUNCATE;") to indicate that the index is a date index allowing the special year-only or month-and-year-only truncation.

C.13.3.7  The EXCLUDE (EXC) Info-element

EXCLUDE is a multiply-occurring info-element representing values that are excluded from passing to the index. These are values intercepted by a PASSPROC rule such as $EXCLUDE (A46) so that they will not be indexed. The most common such values are articles, prepositions and pronouns when a word index is being built. The syntax of the EXCLUDE info-element is:

where "value" is a value being excluded. Multiple values may be entered at once, as indicated by the syntax, but they will be stored as separate occurrences.

C.13.3.8  The USERINFO Info-element

The multiply-occurring USERINFO info-element may be used to hold any data the file owner desires. In most cases in which it is used, USERINFO holds information to help drive specific applications.

Its syntax is simply:

where "text" is any textual information desired. Null occurrences are allowed too, as they are for DESCRIPTION. [See C.13.1.3.]

C.13.4  System Commands and Utilities Using Index Information

This section lists the SPIRES commands or utilities that take advantage of index information packets, if they are coded. The specific info-elements used by each are noted. Note that the SHOW INDEXES command does not use index information packets.

The SHOW INDEX DESCRIPTION command

This command displays all occurrences of the DESCRIPTION info-element for all the indexes or the named index if one is named in the command. Unlike the SHOW ELEMENT DESCRIPTION command, which displays the values of several other info-elements as well, the SHOW INDEX DESCRIPTION command displays only DESCRIPTION values.

The SHOW INDEX INFORMATION command

This command displays all the info-elements in the index information packet for all indexes or for only the named index. For example,

C.13.5  Alternate Locations for Information Packet Definitions

You do not have to define the actual element or index information packet in the element or index definition. You may instead place packet definitions in one of these places:

A file owner might want to take advantage of this capability for any of several reasons. First, it allows information packets to be shared by multiple record-types in a single file or by multiple files. Second, it allows the element and index information to be separate from the definitions themselves -- that might be important to you if you consider information packets as a convenience rather than as an integral part of the file definition "program".

To take advantage of the capability, you must tell SPIRES in the element or index definition to look elsewhere for the information packet during compilation. To do so, you give the ELEMINFO or INDEXINFO info-element, which would begin the information packet, a value:

If the ELEMINFO or INDEXINFO info-element has a null value, SPIRES assumes that the information packet for the element or index follows immediately. However, if ELEMINFO or INDEXINFO does have a value for "string" (up to 16 characters in length), SPIRES will look for an information packet with a matching ELEMINFO or INDEXINFO value somewhere else. Any other info-elements appearing in the element or index definition after an ELEMINFO or INDEXINFO statement having a value will be ignored.

The procedure SPIRES follows for locating the appropriate element information packet during compilation is:

A similar procedure is followed for index information packets during compilation, except that the RECDEF step is skipped, since index information packets are not allowed in RECDEF records.

The procedure for using the EXTDEF subfile (i.e., adding records, the EXTDEF-ID statement, etc.) is documented elsewhere in this manual. [See C.10.5.] EXTDEF records may contain PROC definitions as well as element and index information packets.

It is important to remember the following when you are separating information packets from the elements or indexes to which they apply:

D  Appendices

D.1  Actions: Complete Listing By Number

D.1.1  About this Chapter

This chapter contains a description of each of the actions that can be used in a SPIRES file definition.

Most of the action descriptions also reference system procs that incorporate them. More information about the named system procs can be found online through the EXPLAIN command (e.g., EXPLAIN $CHANGE PROC) or in the SPIRES reference manual "System Procs", which also describes how system procs can be used to simplify processing rule specification.

D.1.2  Actions Used Only As SEARCHPROC Rules (A6 -- A16)

D.1.2.0.0.6  * A6

A6 :<NUM>

A6 :P1

The implicit ANDs between search values are replaced with the following depending on the value of P1:

Note that action 45 should be called to break up the search values, but it should precede the A6:

This rule applies only to simple index terms and does not apply to qualifier or sub-index or compound index terms.

D.1.2.0.0.7  * A7

A7

A7

A7 reparses the search value, which should have been modified to begin with a different SEARCHTERM mnemonic in a previous rule. The input value may be any legal search expression or a combination of search expressions linked by logical operators (AND, AND NOT, OR). The A7 rule then sends the value to be reprocessed.

This rule can thus be used to "disguise" searches that would require multiple indexes, sub-indexes or qualifiers so that they seem to be simple index searches to the user. For example, suppose the AT "index", as it appears to the user, represents an author-title search handled by A7. The user might issue a command such as

but the SEARCHPROC for AT includes rules, such as A36, A48 or A62, that convert "SHAKESPEARE/HAMLET" to "AUTHOR SHAKESPEARE AND TITLE HAMLET". When the A7 at the end of the SEARCHPROC is executed, the new value is reprocessed, as if the original command had been:

Note too that the value handled by the A7 is treated as if it were in parentheses, e.g.,

is treated by SPIRES (after the "AT" SEARCHPROC, including the A7, is executed) as

rather than as:

which would probably retrieve only "Jungle Book" by Kipling, because the final "AND TITLE JUNGLE BOOK" would be applied to the entire preceding search, not just to "AUTHOR KIPLING".

A7 must be the final rule in the SEARCHPROC string. A45 ($BREAK) may not be used in the same SEARCHPROC string. No relational operator other than "=" (or blank) may appear in the search expression referring to an "index" processed by A7. The search value may not be null when the A7 is executed.

The "reprocessed" SEARCHPROC may include searchterms that also use A7, but those searchterms in turn may not lead to further A7 processing; in other words, A7s may be nested, but only one level deep. The "reprocessed" SEARCHPROC may, however, use any of the relational operators; this is true regardless of the setting of secure-switch 11. [See B.9.3.11.]

Since the A7-handled searchterm is not associated with an actual index, the linkage section that defines it should include the $NOPASS proc, e.g.,

Since neither passing to nor searching of the named index record-type actually occurs, any index record-type for the subfile can be named there.

D.1.2.0.0.8  * A8

A8 :<NUM>, <STRING>, <STRING>

A8 : P1, P2, P3

This action requests indirect search processing, discussed in detail earlier in this manual. [See C.12a.] The P3 parameter names a searchterm for the index in another subfile that will be used as the indirect index; the P2 parameter names that subfile. SPIRES will go to that subfile, search the indirect index with the current search value, retrieve the keys of the goal records that search finds, and use those keys to search the current index. The keys in the current index must thus match the keys in the goal records of the named subfile.

If P1 = 1, the search value is applied directly to the indirect index; the value is not processed through the indirect index's SEARCHPROC string. If P1 = 0 (more common), the search value is processed through the indirect index's SEARCHPROCs. See below for P1 = 2.

A8 must be the last SEARCHPROC in the rule string; it may not appear in a rule string with A38 or A45. It may be used only in simple indexes or goal-indexes; it may not appear as a SEARCHPROC for qualifiers, compound indexes or sub-indexes. [Some of these restrictions do not apply to the indirect index as it exists as an index in the subfile named by P2. See the section on "Indirect Search" for more information.]

The error flag is set if any of the following conditions occur:

If the subfile name in P2 contains more than one word, or any unusual characters, it should be placed in single quotes, i.e., apostrophes, as in 'SYS PROTO'. Also, if more than one subfile has the subfile name, you can qualify the subfile name with the file name, prefixed by an ampersand, as in '&GG.SPI.FORMATS SYS PROTO'.

If P1 = 2, when SPIRES returns from the indirect goal with keys, it will not check the tree of the current subfile to make certain that the record keys exist there. This makes the action more efficient, but is practical only in situations where you know for certain that any keys returned by the indirect search exist in the subfile. (Note, however, that SPIRES will still check the deferred queue of the current subfile for adds and removes; it just will not verify that the other keys are in the tree.)

D.1.2.0.0.9  * A9

A9 :<NUM>

A9 :P1

For P1 = 0, this Searchproc, in a sense, "does nothing". By default, if no SEARCHPROC rule string is specified for an index, SPIRES supplies A45, meaning that the value is broken into words at blanks. Any SEARCHPROC rule overrides the default, but A9:0 does so without altering the value in any way whatsoever.

For P1 > 0, the action controls the handling of relational operators other than the equality operator (=) when it appears in a Searchproc string for simple indexes and goal-indexes. Specifically, depending on the precise value, it puts the relational operator back into the value for search purposes.

For example, if P1=3, here are some sample search commands, as interpreted by SPIRES:

command                       interpretation
-------------------------     --------------------------
FIND TITLE MASK OF ZORRO      FIND TITLE "MASK OF ZORRO"
FIND TITLE AFTER DARK         FIND TITLE "AFTER DARK"
FIND DATE >1985               FIND DATE "> 1985"

The last example shows that when the relational operator is reattached to the value, it is always followed by a blank. The operators that are words will be converted to uppercase, unless secure-switch 16 is in effect, which means they will retain the case in which they were input by the user.
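The P1=3 behavior in the table above can be sketched in Python. This is an illustration only, not SPIRES code: the function name "reattach_operator", the idea that the operator and the value arrive as two separate strings, and the "preserve_case" flag (standing in for secure-switch 16) are all assumptions made for the example.

```python
def reattach_operator(operator, value, preserve_case=False):
    """Put a relational operator back in front of a search value.

    Hypothetical helper: `operator` is the relational operator that was
    split off the search request ("" or "=" when there is none), and
    `value` is the remainder of the request.
    """
    if operator in ("", "="):
        # Equality (or no operator at all): the value is left untouched.
        return value
    if not preserve_case:
        # Word operators are uppercased unless secure-switch 16 is in effect.
        operator = operator.upper()
    # The reattached operator is always followed by a blank.
    return operator + " " + value


print(reattach_operator("mask", "OF ZORRO"))  # MASK OF ZORRO
print(reattach_operator(">", "1985"))         # > 1985
```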

In most uses, A9 should appear as one of the first rules in the Searchproc string, so that subsequent rules can act upon the entire value.

D.1.2.0.1.0  * A10

A10

A10

A10 provides some additional flexibility to the goal-index searching capability. [See B.8.2.1.] The easiest way to request that the goal record-type be used as an index to the records (i.e., that users can search by record key) is by adding a SEARCHPROC to the Global Parameters section of the linkage.

As an alternative, or to allow multiple searchterms and SEARCHPROCs to point to the same goal-index (perhaps for purposes of indirect searching), you can code additional individual index linkages that use A10 to announce that the named index is the goal record-type so special processing can be done.

Part of a typical linkage section using A10 is shown below. Record-type REC01 is the goal record-type for the subfile, as shown by the global linkage section at the start. It is named as a goal-index twice, first declared in the global index and then in the individual index linkage that follows:

When anyone uses the PERFORMER index, SPIRES searches for the given value in the PERFORMER index of the SELECTIONS subfile; the keys it retrieves there presumably will match keys in the current subfile, which it will use to retrieve records there.

Though the pointer group named in the INDEX-NAME section is not used by SPIRES, it must be declared in order for the record to go into the subfile. Moreover, the element named as PTR-ELEM must exist in the goal record-type, even though the A10 effectively tells SPIRES not to use it. The common approach to this problem is to add the PTR-ELEM value as an alias to the goal record's key element, which meets the requirement.

D.1.2.0.1.1  * A11

A11 :<NUM>,<SINGLE CHARACTER>

A11 :P1 ,P2

If any character following the (P1+1)th character is the character P2, then the characters preceding it constitute the start of each search key, and the characters which follow it (if any) constitute a string of characters which must match somewhere within the remainder of the search keys. Note: Typically, search keys are entire phrases, not single words.

This rule applies only to simple index terms and does not apply to qualifier or sub-index or compound index terms.

If an inequality operator (except ~=) or a range operator such as BEFORE is used in a search request and the SEARCHPROC uses A11, the portion of the search value after and including character P2 is discarded for the search.
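The way A11 splits a search value can be sketched in Python as follows. This is a rough model under stated assumptions, not the SPIRES implementation: the function names, the use of the first qualifying occurrence of P2, and the sample values are all hypothetical.

```python
def split_a11(value, p1, p2):
    """Split a search value at the first P2 found after the (P1+1)th character.

    Returns (start, contained), or None when P2 does not occur there.
    Using the *first* such occurrence is an assumption of this sketch.
    """
    # The (P1+1)th character is 0-based index P1, so the characters that
    # follow it begin at 0-based index P1 + 1.
    pos = value.find(p2, p1 + 1)
    if pos == -1:
        return None
    return value[:pos], value[pos + 1:]


def key_matches(key, start, contained):
    """True if `key` begins with `start` and `contained` occurs in the rest."""
    return key.startswith(start) and contained in key[len(start):]


parts = split_a11("STAN#UNIV", 2, "#")
print(parts)                                       # ('STAN', 'UNIV')
print(key_matches("STANFORD UNIVERSITY", *parts))  # True
print(key_matches("STANLEY PARK", *parts))         # False
```

An empty second part (P2 as the last character of the value) reduces the test to a plain match on the start of the key.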

D.1.2.0.1.2  * A12

A12 :<NUM>

A12 :P1

This action supplies an operator for compound index or qualifier requests if none has been supplied in the search request. Thus, 'Mnemonic Value' becomes 'Mnemonic Operator Value' with operator specified by P1 from the following list:

  Operator  P1   Operator  P1   Operator  P1   Operator  P1   Operator  P1
     =       1      <=      4    Before    6    String   10    Prefix   13
    ~=       2       >      5    After     5    Mask     11    Word     14
    >=       3       <      6    With      9    Having   12    Suffix   15

If this action is not included in the SEARCHPROC of an access or qualifier or P1 does not have a value taken from the above list, then 'Mnemonic Value' assumes 'Mnemonic = value' in the search request.

Note: This is the only action that can be specified in the SEARCHPROC of a compound index, and it applies to all elements in the index.
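As a sketch of the defaulting logic, the P1 table above can be written as a Python lookup. The names "OPERATOR_BY_P1" and "supply_operator" are hypothetical, and because the table assigns 5 to both ">" and AFTER and 6 to both "<" and BEFORE, the choice of the symbol form for those two entries is an assumption.

```python
# P1 values taken from the table above. 5 and 6 are shared by ">"/AFTER
# and "<"/BEFORE; this sketch arbitrarily shows the symbol form.
OPERATOR_BY_P1 = {
    1: "=", 2: "~=", 3: ">=", 4: "<=", 5: ">", 6: "<",
    9: "WITH", 10: "STRING", 11: "MASK", 12: "HAVING",
    13: "PREFIX", 14: "WORD", 15: "SUFFIX",
}


def supply_operator(mnemonic, value, p1=None, operator=None):
    """Return 'Mnemonic Operator Value', supplying an operator if needed."""
    if operator is None:
        # No operator in the request: use the one named by P1, or "="
        # when A12 is absent or P1 is not in the table.
        operator = OPERATOR_BY_P1.get(p1, "=")
    return f"{mnemonic} {operator} {value}"


print(supply_operator("DATE", "1987", p1=13))  # DATE PREFIX 1987
print(supply_operator("DATE", "1987"))         # DATE = 1987
```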

D.1.2.0.1.3  * A13

A13 ,<CHARACTER STRING>

A13 ,P2

If there are record keys whose first N characters match the N characters of the search value and whose (N+1)th character is one of the P2 characters, then the pointer groups in these records are "or"ed onto the result stack.

This rule applies only to simple index terms and does not apply to qualifier or sub-index or compound index terms.
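The key test that A13 applies can be modeled in a few lines of Python. The function name and the sample keys are hypothetical; the sketch covers only the additional keys selected by this rule and says nothing about how the exact-match key itself is handled.

```python
def a13_extra_keys(keys, search_value, p2):
    """Keys whose first N characters equal the search value and whose
    (N+1)th character is one of the characters in P2."""
    n = len(search_value)
    return [
        key for key in keys
        if len(key) > n and key.startswith(search_value) and key[n] in p2
    ]


keys = ["SMITH", "SMITH, J", "SMITH-JONES", "SMITHSON"]
print(a13_extra_keys(keys, "SMITH", ",-"))  # ['SMITH, J', 'SMITH-JONES']
```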

D.1.2.0.1.4  * A14

A14 :<NUM> ,<SINGLE CHARACTER>

A14 :P1 ,P2

If the final (Nth) character of the search value is character P2, then the pointer groups of records with keys whose first N-1 characters match those of the search value are "or"ed onto the result stack. The P1 parameter may be used to set a minimum length for this character string. The error flag will be set if N-1 is less than or equal to P1.

Note: SEARCHPROC=A31/ A44,#0000#,#00#/ A14,#00#; will cause automatic truncated searching. July 27, 1952 will be retrieved by: FIND DATE 1952 or by FIND DATE JULY 1952.

This rule applies only to simple index terms and qualifiers; it does not apply to sub-index or compound index terms. Note that for qualifiers, if the user doesn't provide a relational operator and if another one isn't provided by A12 ($SEARCH.QUAL), then the default relational operator is PREFIX. So, for example, if DATE is a qualifier, the search command AND DATE 6/87 is interpreted as AND DATE PREFIX 6/87, which would select all the DATE qualifier values in June of 1987.
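The truncation test and the minimum-length check can be sketched in Python. This is a model under assumptions, not SPIRES code: the function name is hypothetical, a raised ValueError stands in for the error flag, and returning None stands in for "no truncation requested".

```python
def a14_truncated_search(keys, search_value, p1, p2):
    """Model of A14: P2 as the final character requests a truncated search.

    Returns the matching keys, or None if the value does not end in P2.
    """
    if not search_value.endswith(p2):
        return None  # not a truncated request; ordinary searching applies
    stem = search_value[:-1]  # the first N-1 characters
    if len(stem) <= p1:
        # "The error flag will be set if N-1 is less than or equal to P1."
        raise ValueError("truncated search value is too short")
    return [key for key in keys if key.startswith(stem)]


keys = ["CAT", "COMPILE", "COMPUTE", "COMPUTER"]
print(a14_truncated_search(keys, "COMP#", 2, "#"))
# ['COMPILE', 'COMPUTE', 'COMPUTER']
```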

D.1.2.0.1.5  * A15

A15

A15

If no record is found with a key that matches the search value and if the record with the alphabetically next smaller key has a match between its key and the initial characters of the search value, then that record is retrieved.

This rule applies only to simple index terms and does not apply to qualifier or sub-index or compound index terms.

D.1.2.0.1.6  * A16

A16

A16

If no record is found with a key that matches the search value, the alphabetically next higher record is retrieved.

This rule applies only to simple index terms and does not apply to qualifier or sub-index or compound index terms.
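The difference between A15 and A16 can be illustrated with a sorted list of keys in Python. The function names are hypothetical, the reading of A15's condition (the search value must begin with the whole of the next smaller key) is an interpretation, and Python's string ordering is only a stand-in for the collating sequence SPIRES uses.

```python
import bisect


def a15_lookup(sorted_keys, value):
    """Exact match, else the next smaller key if `value` begins with it."""
    i = bisect.bisect_left(sorted_keys, value)
    if i < len(sorted_keys) and sorted_keys[i] == value:
        return sorted_keys[i]
    if i > 0 and value.startswith(sorted_keys[i - 1]):
        return sorted_keys[i - 1]
    return None


def a16_lookup(sorted_keys, value):
    """Exact match, else the alphabetically next higher key."""
    i = bisect.bisect_left(sorted_keys, value)
    return sorted_keys[i] if i < len(sorted_keys) else None


keys = ["ABC", "ABD", "XYZ"]
print(a15_lookup(keys, "ABCDEF"))  # ABC
print(a16_lookup(keys, "ABCDEF"))  # ABD
```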

D.1.2.0.1.7  * A17

A17

A17

This rule is used on the Searchproc for a proximity-search index, that is, a word index that has a special sub-index with positional information that can be used to look for words next to or near each other within the record. [See C.6.2.]

This action should be coded with S-level error-handling.

The error flag is set for any of the following reasons:

D.1.3  Actions Used as INPROC, OUTPROC, SEARCHPROC or PASSPROC Rules (A21  -- A66)

D.1.3.0.2.1  * A21

A21 :<NUM>

A21 :P1

The alphanumeric integer constant is converted to fixed binary. If P1 is 4, the four-byte result replaces the nonconverted value. (Four bytes is the standard length of integer values in SPIRES.) If P1 is 2, the two-byte result replaces the nonconverted value. If P1 is 1, the one-byte result replaces the nonconverted value. If P1 is 0, the nonconverted value is retained. (P1 may also be 3, for a three-byte result, though that is used very infrequently.)

Leading or trailing blanks are allowed, and a leading sign (+ or -) may be surrounded by blanks. The error condition is set on if the value contains anything other than a sign and digits, or if the value is too big for the designated space.

For P1=1, the error condition is set on if the value is smaller than -256 or larger than 255; if a value from -1 through -256 is entered, it is translated according to the table below without signalling an error condition.

For P1=2, the error condition is set on if the value is smaller than -65536 or larger than 65535; however, if a value from 32768 through 65535 or a value from -32769 through -65536 is entered, it is translated according to the table below without signalling an error condition.

For P1=3, the error condition is set on if the value is smaller than -16,777,216 or larger than 16,777,215; but if a value from -16,777,216 to -8,388,609 or a value from 8,388,608 to 16,777,215 is entered, it is translated according to the table below without setting the error flag.

For P1=4, the error condition is set on if the value is smaller than -2,147,483,647 or larger than 2,147,483,647.

The fact that values outside certain ranges are translated without warning for P1=1, 2 or 3 makes these P1 values unacceptable if the range is later to be validated by A24; in that case, a P1 of 4 should be used on A21.

If values outside the "stored" ranges are input, and they are inside the "entry" ranges noted above for P1, then they will be converted to values inside the "stored" range. If values outside the "entry" ranges are input, an error is signalled.

If the error flag is set, the returned value is the nonconverted number if P1 is 0 or zero if P1 is 1, 2, 3 or 4. If 8 is added to any of the above P1 values, then a null input value is allowed.

D.1.3.0.2.2  * A22

A22 :<NUM> ,<NUM>

A22 :P1 , P2

The value is tested to see that it is not longer than P2 characters. For P1 of 0 through 4, the error flag is set if the value contains more than P2 characters.

The default recovery action depends on the P1 parameter as follows:

D.1.3.0.2.3  * A23

A23 :<NUM> ,<NUM>

A23 :P1 ,P2

The value is tested to see that it is not shorter than P2 characters. The error condition is set on if the value contains too few characters. If P1 is 0, then the default recovery is to retain the original value. If P1 is 1, then the default recovery is to add enough blanks to the right end of the value to make it P2 characters long.
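A minimal Python sketch of the two recovery behaviors follows. The function name and the (value, error flag) return shape are assumptions; P1 values other than 0 and 1 are not described here, so the sketch treats them like 0.

```python
def a23_min_length(value, p1, p2):
    """Return (result, error_flag) for a minimum-length test of P2 characters."""
    if len(value) >= p2:
        return value, False
    if p1 == 1:
        # Default recovery: pad on the right with blanks to P2 characters.
        return value.ljust(p2), True
    # P1 = 0 (and, by assumption, anything else): keep the original value.
    return value, True


print(a23_min_length("AB", 1, 5))  # ('AB   ', True)
print(a23_min_length("AB", 0, 5))  # ('AB', True)
```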

D.1.3.0.2.4  * A24

A24 :<NUM> ,<NUM> ,<NUM>

A24 :P1 ,P2 ,P3

If the incoming value is binary, then the following applies:

The binary value is tested to see that it is not numerically greater than P2 or numerically less than P3. The error condition is set on if either of the limits is exceeded. The value is replaced with the value of the violated limit. If the converted value is 2 or 4 bytes long and the sign bit is set, then the value is taken as negative in the tests. Likewise, if the sign bit of P2 or P3 is set, those values are taken to be negative. The value must be binary; this action usually follows an A21 (convert string to binary). If the value is null, it is treated like 0 (zero). The P1 parameter has no effect when the action is used for binary values (but see the discussion below).

Here are the boundary limits for P2 and P3: -2139999999 <= P <= 2139999999.
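For binary values, the range test amounts to a clamp with an error flag, which can be sketched in Python as below. The function name and the (value, error flag) return shape are assumptions; note that P2 is the upper limit and P3 the lower one.

```python
def a24_binary(value, p2, p3):
    """Return (result, error_flag); P2 is the maximum, P3 the minimum."""
    if value is None:
        value = 0  # a null value is treated like 0 (zero)
    if value > p2:
        return p2, True  # replaced with the violated upper limit
    if value < p3:
        return p3, True  # replaced with the violated lower limit
    return value, False


print(a24_binary(150, 100, 1))  # (100, True)
print(a24_binary(50, 100, 1))   # (50, False)
```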

If the incoming value is a packed decimal, the action provides several types of range tests, depending on the value of P1. (See also A67, which provides a simple range-testing capability for packed values.) EXPLAIN PACKED DECIMALS for more information about some of the concepts used below.

For all the packed range tests described below, P2 represents the maximum limit and P3 represents the minimum limit. Both P2 and P3 must be integers. (Note: if P3 > P2, then P2 is raised automatically to equal P3.) In all cases, the error flag is set if the specified range is exceeded. Unlike the case for integers described above, the input packed value is never modified by this action. If the value is null, it is treated like 0 (zero).

For P1=0, the range specifies "decimal places" allowed in the input value. For example, AS24:0,2,0 means the input packed value may have from 0 to 2 decimal places; otherwise, a serious error occurs.

For P1=1, the range specifies the range of the exponent allowed. The maximum allowed range for the exponent of a packed value is -128 to 127.

For P1=2, the range specifies the allowed magnitude of the value.

For P1=3, the range specifies the allowed precision of the value.

For P1=4, the range specifies the allowed sign of the value (i.e., whether the value can be positive or negative). In this case, P2 and P3 must be integer values from -2 to 2, as follows:

For example, AW24:4,0,-2 would allow values from -@@ up to and including 0; positive packed values would set the error flag. The action AS24:4,1,-1 would allow all numeric values except positive or negative infinity.

D.1.3.0.2.5  * A25

A25 :<NUM>

A25 :P1

The alphanumeric real constant is converted to floating point.

If P1=2, the double precision result replaces the value.
If P1=1, the single precision result replaces the value.
If P1=0, the nonconverted value is retained.

The error condition is set on if the value contains unrecognizable characters or is outside the allowed range; the value returned when the error flag is set is "0" (zero). The default value is the nonconverted number if P1 is 0 or zero if P1 is 1 or 2.

The range of allowed values for a single precision result is:

  -7.237E75 <--  -5.4E-79   0   5.4E-79  --> 7.237E75

allowing as many as seven places of accuracy.

D.1.3.0.2.6  * A26

A26 ,<REAL NUM> ,<REAL NUM>

A26 ,P2 ,P3

The result of a previous floating point conversion that returned a converted value is tested to see that it is not numerically greater than P2 or less than P3. The error condition is set on if either of the limits is exceeded. The value is replaced with the value of the violated limit.

D.1.3.0.2.7  * A27

A27 :<NUM>

A27 :P1

The last digit of the input value is set aside, and the remainder of the input value is processed by action 21. The result of action 21 is then processed by one of the following, depending on the P1 value: a MOD-11 rule (P1 is 0, 1, 2 or 4), a student services rule (P1 is 8-15), the "modified student services rule" (P1 is 16-23), the Luhn rule (P1 is 24-31), or the "ABA" rule (P1 is 3, 5, 6 or 7). The result from each of these rules is a single digit.

If the set-aside digit matches the calculated digit, the input value is considered valid, otherwise an error condition is flagged. The value stored is the result from the action 21 processing and uses the P1 parameter the same as action 21.

If you want A27 to use the MOD-11 rule, use the same P1 value as you would use for A21. To use one of the other rules, do one of the following:

The MOD-11 rule (P1 is 0, 1, 2 or 4)

The check digit computation for MOD-11 is performed as follows: Each digit in the original number (the number to be augmented by a check digit) is multiplied by 2+n, where "n" is the number of digits between the digit being multiplied and the decimal point. The results are then summed. For example, if the original number is "175", then:

The sum, "35" in this example, is then subtracted from the next larger multiple of 11; the result is the check digit appended to the value. If the result of the subtraction is 10, then the check digit is 0. Continuing the above example:

The check digit on "175" is "9", yielding "1759".
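The MOD-11 computation described above can be sketched in Python. The function name is hypothetical. The description does not say what happens when the sum is already an exact multiple of 11 (the subtraction would then yield 11), so this sketch raises an error for that case rather than guess.

```python
def mod11_check_digit(number):
    """Compute the MOD-11 check digit for a non-negative integer."""
    digits = [int(ch) for ch in str(number)]
    # The rightmost digit has n = 0, so the multipliers run 2, 3, 4, ...
    # from right to left.
    total = sum(d * (2 + n) for n, d in enumerate(reversed(digits)))
    next_multiple = (total // 11 + 1) * 11  # next larger multiple of 11
    result = next_multiple - total          # always in the range 1..11
    if result == 10:
        return 0
    if result == 11:
        raise ValueError("case not covered by the rule as described")
    return result


print(mod11_check_digit(175))  # 9, giving "1759"
```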

The Student Services rule (P1 is 8-15)

The check digit computation for P1>=8 but <16 (the "student services" rule) is done as follows: After leading zeroes are discarded, the number consists of n digits: the leftmost non-zero digit is called "D(1)", the next "D(2)" and so forth, to the last digit "D(n)", which is the check digit. First compute the following total:

Then divide the total by 10, and the remainder is the check digit "D(n)". For example, if the number 1575 is entered:

The value of D(4) should be the check digit. The computation confirms that:

Dividing 45 by 10 returns a remainder of 5, which was the check digit given.

The Modified Student Services rule (P1 is 16-23)

The modified student services rule is the same as the student services rule described above, except that leading zeroes are taken into account, rather than discarded. If this version is used, be sure that both input and output values are padded to the proper length with zeroes. For input, that might mean coding the rule "A36:2,0,n" to specify that the value be overlaid from the right on a field of "n" zeroes. (That can be done on output by adding 1 to the P1 parameter on A77.)

The Luhn rule (P1 is 24-31)

The Luhn formula for computing a modulus-10 check digit is:

1) Starting with the low-order digit (units position), double the value of it and every other digit (all odd positions).

2) Add all the digits of those products to all the even-position digits of the original number.

3) Subtract the sum total from the next higher multiple of 10.

4) If the result is 10, then the check digit is 0; otherwise, the check digit is the result.

For example, suppose you have the number 534. First, double the 4 and the 5, getting 8 and 10 respectively. Second, add the digits of those products (8, 1 and 0) to the even-position digit, the 3; 8+1+0+3 = 12. Then subtract 12 from 20, the next higher multiple of 10, and the result is 8, the check digit.
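The four Luhn steps can be sketched as follows. This is a minimal Python illustration of the formula as described here, not the SPIRES implementation; the function name is hypothetical.

```python
def luhn_check_digit(number: str) -> int:
    """Compute the Luhn modulus-10 check digit for a string of digits."""
    total = 0
    # Position 1 is the low-order (units) digit.
    for position, ch in enumerate(reversed(number), start=1):
        digit = int(ch)
        if position % 2 == 1:
            # Odd positions are doubled, then the digits of the product are added.
            product = digit * 2
            total += product // 10 + product % 10
        else:
            # Even positions are added as they are.
            total += digit
    # Subtract from the next higher multiple of 10; a result of 10 becomes 0.
    return (10 - total % 10) % 10


print(luhn_check_digit("534"))  # 8 + 1 + 0 + 3 = 12, and 20 - 12 = 8
```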

The ABA rule (P1 is 3, 5, 6 or 7)

The ABA rule for computing a check digit is:

1) If the value is shorter than 8 digits, right-adjust the value within a field of 8 zeroes (00000000).

2) Take the right-most eight digits and multiply each digit, left to right, by the number shown in the table below:

3) Add the individual products together, and retain only the final digit of the sum.

4) If that digit is 0, that is the check digit. If it is not 0, subtract that digit from 10 to get the check digit.

Example: You want the check digit for the number 534. Step 1: Right-adjusting it in the field of 8 zeroes, you get "00000534". Step 2: Multiplying each digit by the appropriate number for its respective position from the table, you get 5 (the 6th position) times 1 (product=5), 3 times 3 (product=9), 4 times 7 (product=28). Step 3: Adding the products together, you get 5+9+28=42. Keep only the final digit, 2. Step 4: Subtract that digit from 10 to get the check digit, 8.
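The ABA steps can be sketched in Python as shown here. The weight table is an assumption: the sketch uses the standard ABA routing-number weights 3, 7, 1, 3, 7, 1, 3, 7, which agree with the weights 1, 3 and 7 used for positions 6 through 8 in the example above. The function and constant names are hypothetical.

```python
# Assumed weights for positions 1 through 8, left to right (standard ABA pattern).
ABA_WEIGHTS = (3, 7, 1, 3, 7, 1, 3, 7)


def aba_check_digit(number: str) -> int:
    """Compute the ABA check digit for a string of digits."""
    # Steps 1 and 2: right-adjust in a field of 8 zeroes, keep the right-most 8 digits.
    digits = number.rjust(8, "0")[-8:]
    # Step 3: sum the weighted digits and keep only the final digit of the sum.
    last = sum(int(d) * w for d, w in zip(digits, ABA_WEIGHTS)) % 10
    # Step 4: 0 stays 0; otherwise subtract the digit from 10.
    return 0 if last == 0 else 10 - last


print(aba_check_digit("534"))  # 5*1 + 3*3 + 4*7 = 42, final digit 2, 10 - 2 = 8
```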

D.1.3.0.2.8  * A28

A28 :<NUM>

A28 :P1

For TIME conversion, [See D.1.3.0.2.8a.]

or DATETIME conversion, [See D.1.3.0.2.8b.]

D.1.3.0.2.8a  * A28 for time

A28 :<NUM>

A28 :P1

For time conversion, the alphanumeric time value is converted to a 2-byte hex or 4-byte binary value, depending on the value of P1. If P1 = 0 or 1, the time is converted to a 2-byte hex value; if P1 = 2 or 3, then it is converted to a 4-byte integer. The resultant value can then be output using action A73.

Time values stored in 2 bytes are in 2-second increments, from 0:00:00 to 36:24:30. Odd-numbered seconds are rounded down to the nearest even number of seconds; for example, 8:23:11 becomes 8:23:10 and 13:24:25 becomes 13:24:24. Values higher than 36:24:31.999 cause the error flag to be set; in addition, SPIRES then sets the value to the limit 36:24:30.

Time values stored in integers have a much larger range of values: from 0.000 seconds to 24 days, 20 hours, 31 minutes, 23.647 seconds, in millisecond increments. Values that exceed this amount cause the error flag to be set, and the value is replaced with the integer -1.

For both forms, a null value will set the error flag and return either 36:24:30 for the 2-byte hex form or -1 for the integer form.
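The limits and rounding of the two storage forms can be illustrated with a short Python sketch. It works on an already-parsed number of seconds or milliseconds and does not parse the input forms. The function names are hypothetical, and None stands in for a null value. The limits themselves follow from the text: 36:24:30 is 65535 two-second units, and 24 days, 20:31:23.647 is 2**31 - 1 milliseconds.

```python
from typing import Optional, Tuple

TWO_BYTE_LIMIT_SECONDS = 36 * 3600 + 24 * 60 + 30  # 36:24:30
FOUR_BYTE_LIMIT_MS = 2**31 - 1                     # 24 days, 20:31:23.647


def store_two_byte(seconds: Optional[float]) -> Tuple[int, bool]:
    """Return (stored time in seconds, error flag) for the 2-byte form."""
    # Null values and values above 36:24:31.999 set the flag and store the limit.
    if seconds is None or seconds >= TWO_BYTE_LIMIT_SECONDS + 2:
        return TWO_BYTE_LIMIT_SECONDS, True
    # Round down to a 2-second increment.
    return int(seconds // 2) * 2, False


def store_four_byte(milliseconds: Optional[int]) -> Tuple[int, bool]:
    """Return (stored milliseconds, error flag) for the 4-byte integer form."""
    if milliseconds is None or milliseconds > FOUR_BYTE_LIMIT_MS:
        return -1, True
    return milliseconds, False


print(store_two_byte(8 * 3600 + 23 * 60 + 11))  # stored as 8:23:10, no error
print(store_four_byte(2**31))                   # out of range: -1 with error flag
```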

The standard forms of time specification are based on a 24-hour clock. If P1 = 0 or 2, then time may be specified as seconds, as minutes and seconds, or as hours, minutes and seconds:

If P1 = 1 or 3, then time may be specified as hours, as hours and minutes, or as hours, minutes and seconds:

To this basic form may be added a days option, a decimal option and an AM/PM option, each of which is described below. More examples of valid input forms are shown in the description of A73 output forms.

The days option allows you to precede the time value with a number of days, followed by the word "days", "day" or "d":

If you specify days, they may not be included in the "colon" portion of the value; "4:03:24:12" to mean "4 days, 3 hours, etc." is not allowed.

On the other hand, the hours, minutes and seconds may also be indicated with words rather than in the colon form:

These may be abbreviated to their singular form or to 1 letter as well -- for example, "hour" or "h" -- or to reasonable abbreviations such as "hr" or "mins". You cannot mix words and the colon form except for "days" with the colon form; "4 days, 3 hours, 24:12" would cause an error. If "hours", "minutes" or "seconds" words are used, then all the units must have words (or abbreviations); "3 hours, 24, 12 seconds" would cause an error. The only other "words" allowed are "A.M." and "P.M.", with or without periods, in upper, lower or "uplow" case.

The decimal option lets you specify a decimal value on the smallest portion of the value:

A value such as "3.24:12" would cause an error. Remember for the 2-byte hex form that values that are not exactly a two-second increment will round down to a 2-second increment. If 3:24:11.999 is input, it will be stored as 3:24:10.

The AM/PM option allows the input value to have "AM" or "PM" (in upper, lower or mixed cases, with or without periods) at the end of the time value. It may be used with any of the forms above. However, an error will occur if the hours portion exceeds 12. For example, "13:00:00 pm" would set the error flag.

NOON or MIDNIGHT, optionally preceded by the number 12 (or 12:00 or 12:00:00) and optionally abbreviated to three characters or more, are allowed values. They may be in upper, lower or mixed case.

For DATETIME conversion, [See D.1.3.0.2.8b.]

D.1.3.0.2.8b  * A28 for datetime

A28:8

For DATETIME conversion, the input value is a combination of DATE and TIME. This is the functional equivalent of SQL's DATETIME values.

The internal form of this new value is an 8-byte HEX value that is the concatenation of the 4-byte HEX DATE and 4-byte INTEGER TIME. The external form is a DATE followed by a separator followed by a TIME which must include a colon (:) character, or just a DATE, NA or N/A.

The "time" portion must be in "hh:mm:ss.fff" form, with "h:" as a minimum. Time consists of decimal digits, colon characters, and possibly a period character between seconds and fractions of seconds. Time range: "0:" through "23:59:59.999" which become 00000000 through 05265BFF as the time portion of the internal form (tttttttt). Values that exceed this amount cause the error flag to be set, and the value is replaced with the integer -1.

The external form is scanned from right to left looking for the digits, colons, and periods. If a colon is not found, the external form is assumed to consist of only a date. If a time is supplied, the separator between date and time may be any character other than digits, colons or periods. Blank is the preferred character, which is what will be output when the time portion of the internal form is not the integer -1 (indicating an unknown time). [See D.1.3.0.2.8a.]

The "date" portion may be any form of date allowed by $DATEIN with the NA option. That means dates without the century will default to the current century. You may also use "relative" dates, such as today, yesterday, tomorrow. N/A and NA and NONE are also accepted by this input process creating 99990909FFFFFFFF as the 8-byte HEX form. If the "date" portion fails to convert properly, the resultant 8-byte datetime value will be: 00000000FFFFFFFF.
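The layout of the 8-byte internal form can be sketched in Python, starting from an already-converted date and time. This is an illustration only: the function name is hypothetical, the date is passed as the eight digits CCYYMMDD, None stands in for NA (date) or for a missing time, and the case of a date that fails to convert (00000000FFFFFFFF) is not modelled.

```python
from typing import Optional

LAST_MS_OF_DAY = 86_399_999  # 23:59:59.999


def datetime_hex(ccyymmdd: Optional[str], time_ms: Optional[int]) -> str:
    """Return the 8-byte internal DATETIME value as 16 hex digits."""
    if ccyymmdd is None:
        # NA, N/A or NONE on input.
        return "99990909FFFFFFFF"
    if time_ms is None or not 0 <= time_ms <= LAST_MS_OF_DAY:
        # Integer -1: time unknown (date only) or out of range.
        time_part = "FFFFFFFF"
    else:
        # 4-byte integer count of milliseconds since 0:00:00.
        time_part = format(time_ms, "08X")
    return ccyymmdd + time_part


print(datetime_hex("19760312", LAST_MS_OF_DAY))  # 1976031205265BFF
print(datetime_hex("19760312", None))            # 19760312FFFFFFFF
```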

Data elements which hold DATETIME values probably should be defined with LENGTH = 8; the TYPE defaults to HEX if you use the inproc rule described below.

The default external form is: "mm/dd/ccyy hh:mm:ss", which is output by A76. [See D.1.5.0.7.6b.]

D.1.3.0.2.9  * A29

A29 :<NUM> ,<NUM>

A29 :P1 ,P2

The alphanumeric hex string is converted into bit string form. If P2 is zero then the value which is stored will be half the length of the original value (rounded up to an even number). If the length of the original value is odd, the error flag is set if P1=0, otherwise an odd length value is not an error.

If P2 is not zero then it specifies a limiting length for the stored number. The error flag is set if the original value is greater than 2 * P2 characters in length. In this case only P2 characters are stored disregarding the leftmost characters. If the original value is less than or equal to 2 * P2 characters in length, the value is padded on the left with zeroes and P1 is tested. If P1=0 then the value stored will be half the length of the original (rounded up to an even number). Otherwise P2 characters are stored.

The hex string is tested for erroneous characters or blanks.

D.1.3.0.3.0  * A30

A30: <NUM>

A30: P1

This processing rule provides several different ways to handle the case (upper or lower or mixed) of the value. If P1=0, all lowercase letters in the value are replaced by uppercase letters. If P1=1, all uppercase letters in the value are converted to lowercase.

If P1 is 2 or 3, then only the first letter of each word in the value is converted to uppercase (P1=2) or lowercase (P1=3). If P1 is 4 or 5, only the first character of the entire value is converted to uppercase (P1=4) or lowercase (P1=5). ("Word" here means any string whose component characters are lexically greater than HEX 7F and which is delimited by characters lexically less than HEX 80 (except for an apostrophe -- see below); this would include alphabetic or numeric words, separated by blanks, commas, or most other punctuation. Refer to an EBCDIC chart for details.)

If P1=10, the entire value is converted to lowercase and then the first character of each word is converted to uppercase. If P1=11, the entire value is converted to uppercase and then the first character of each word is converted to lowercase.

If P1=12, the entire value is converted to lowercase and then the first character of the entire value is converted to uppercase. If P1=13, the entire value is converted to uppercase and then the first character of the entire value is converted to lowercase.

Although an apostrophe is actually hex 7D, it is treated specially if it appears within a word, i.e., it is not next to blanks or any other characters lexically less than hex 80. It is treated as if it were part of the word rather than as a separator character between words. So for example, the value "MOE'S", if processed through A30:10, would be converted to "Moe's". This is usually the preferred behavior; however, it will also convert a value like "O'TOOLE" to "O'toole".
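The behavior of P1=10, including the word definition and the apostrophe exception, can be sketched in Python. This is an approximation for illustration, not the SPIRES implementation: the function names are hypothetical, and Python's "cp037" codec is assumed to be a suitable EBCDIC code page for deciding whether a character is lexically above hex 7F.

```python
def ebcdic_code(ch: str) -> int:
    """EBCDIC code of a character, using code page 037 (an assumption)."""
    return ch.encode("cp037")[0]


def is_word_char(value: str, i: int) -> bool:
    """True if value[i] belongs to a word under the hex 80 rule."""
    if ebcdic_code(value[i]) >= 0x80:
        return True
    if value[i] == "'":
        # An apostrophe counts as part of a word only when both neighbors
        # are word characters.
        return (0 < i < len(value) - 1
                and ebcdic_code(value[i - 1]) >= 0x80
                and ebcdic_code(value[i + 1]) >= 0x80)
    return False


def a30_p10(value: str) -> str:
    """Lowercase the whole value, then uppercase the first character of each word."""
    lowered = value.lower()
    out = []
    in_word = False
    for i, ch in enumerate(lowered):
        if is_word_char(lowered, i):
            out.append(ch if in_word else ch.upper())
            in_word = True
        else:
            out.append(ch)
            in_word = False
    return "".join(out)


print(a30_p10("MOE'S"))    # Moe's
print(a30_p10("O'TOOLE"))  # O'toole
```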

D.1.3.0.3.1  * A31

A31 :<NUM>

A31 :P1

[Note: This action has been incorporated into action A70, which you are encouraged to use instead.]

The alphanumeric date is converted to a 4-byte hexadecimal value of the form CCYYMMDD if P1 is 0. If P1 is 1 the value is unaffected. If P1 is 2, the action works the same as P1 = 0, except that some special "accounting" dates are allowed: August 32, 33 and 34 of any year. If the CC part of the year is not supplied as part of the input value, the current century is assumed. The error flag is set if the value is not one of a standard set of date formats, or if the converted date would be ambiguous.

          12 March 76                              = 03/12/1976
          03/76                                    = 03/--/1976
          The 12th day of March in '76             = 03/12/1976

Many relative date forms are also allowed, which express dates in terms of their relation to the current date. Here are some of the forms allowed:

Notice that the value always converts to a specific date, or a specific month and year, or a specific year. A value that would convert to a range of days within a month (such as "last week") will not be recognized, and will set the error flag.

A full description of allowed forms appears in the manual "SPIRES Searching and Updating" -- online, EXPLAIN DATES, INDEXING.

This rule does check the validity of dates, and only valid dates are allowed within a given month. For instance, "September 31" would cause an error. And unless P1 = 2 to allow the special accounting dates (see above), "August 32, 1987" would cause an error too. Also, "Feb. 29, 1980" would be acceptable, but "Feb. 29, 1979" would cause an error. The only exception (allowed date that shouldn't be) is "Feb. 29" in any century year not evenly divisible by 400 (these are not leap years), which does not cause an error.
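The day-of-month checks described in this paragraph can be sketched in Python. The sketch covers only the validity test for a numeric year, month and day, not the parsing of date forms. The function name and the "accounting" flag (standing in for P1 = 2) are hypothetical.

```python
DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def day_is_accepted(year: int, month: int, day: int, accounting: bool = False) -> bool:
    """Return True if the date would pass the validity check described in the text."""
    if not 1 <= month <= 12:
        return False
    # Special accounting dates: August 32, 33 and 34 of any year.
    if accounting and month == 8 and 32 <= day <= 34:
        return True
    limit = DAYS_IN_MONTH[month - 1]
    # Every fourth year gets Feb. 29. Century years not divisible by 400 are
    # deliberately NOT excluded here, mirroring the exception noted in the text.
    if month == 2 and year % 4 == 0:
        limit = 29
    return 1 <= day <= limit


print(day_is_accepted(1987, 9, 31))                   # False
print(day_is_accepted(1987, 8, 32, accounting=True))  # True
print(day_is_accepted(1900, 2, 29))                   # True (the noted exception)
```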

A potential pitfall for users is that SPIRES accepts an input value of "12/31", treating it not as "December 31 of this year" but as "December 1931". If you add 4 to the P1 value you choose, SPIRES will not accept an input value unless it has all three pieces: a day, a month, and a year.

D.1.3.0.3.2  * A32

A32 :<NUM> ,<RECORD NUMBER> ,<ELEMENT>

A32 :P1 ,P2 ,P3

The value input to A32 is used as the key of a record in order to access record-type P2; the input value must match the internal form of the record key (although the types need not match). If P1 is 3 then the value is replaced by the first occurrence of element P3 (in its internal form) of the record retrieved. Similarly, if P1 is 0, then the value is replaced by the first occurrence of element P3 in its external form.

If P3 is given a value of -1 and if P2 indicates a REMOVED record-type, then the value is replaced by the locator to the record retrieved. If no record is retrieved, the error flag is set on and the value is unaffected.

If P1 is 2 then the value is unaffected and the error flag is set on if no record is retrieved. If P1 is 1 then the value is unaffected and the error flag is set on if a record is retrieved.

The element P3 is restricted to be the first occurrence of the element within the record or structure. P3 is defined as either just an element number (record level) or as a structure number followed by an @ symbol and then the element number within that structure. P3 may be any element in the record, including virtual elements.

If 4 is added to any P1 value above, then the original value is assumed to be a locator to index record-type P2. This is especially useful as an OUTPROC action on the locator elements of an index.

If 8 is added to any P1 value above, then the deferred queue will not be examined for records of the accessed record-type. This will save one I/O per transaction.

An alternate form of this rule is allowed in file definitions:

which allows you to specify the name of the record-type and an element within that record-type. This form is not allowed in format definitions. If a record name is used, then an element name (not an element number) must be used; vice versa, if a record number is used, then an element number must be used.

The value is used as a key of a record in order to access record type P2. The value is replaced by the value of singularly occurring element P3 of the record retrieved. (This is the same as a P1 of 3 above; no other uses of this action as a PASSPROC are allowed.) If P1=0, then the retrieved value is forced to uppercase; if P1=1, then the value is not forced to uppercase. If no record or no value in the record is retrieved, then no further pass processing for the value takes place; no error flag is set on. Either of the above forms, using record and element names or numbers, is permitted.

Note: The first record-type defined is record number 1. The first element defined is element number 0. In a slot record-type, the slot number is element 0. "Indirect Record-Access" contains details and examples of the use of this action. [See C.5.]

Efficiency considerations: In a file definition, A32 causes the base block of the record-type being accessed to be locked in core. In a format, the action causes the base block of the accessed record-type to be locked in core, as is the retrieved record, so that subsequent access to the same record in that record-type will cause no additional I/O.

D.1.3.0.3.3  * A33

A33 ,<PARM> (...,<PARM>)

A33,P+

This rule is used on TYPE=STR data elements. The rule allows a single value to be input for an entire occurrence of the structure. (It does not preclude values being input for the elements in the structure individually.) The input value is broken down into individual components according to the parameters. Each component represents a single occurrence of an element of the structure, each element in sequence. Each component is processed by the INPROC rules (if any) associated with the corresponding element within the structure.

The rule may also be used as an OUTPROC (see below) to put the elements of the structure back together into a single value.

The parameters indicate how the original input value is to be broken into components:

1)  L<number>       Component is next <number> characters.
2)  X<delimiter>    Component is all characters up to but not
                    including the first occurrence of <delimiter>.
                    The next component would begin with the
                    <delimiter> character.
3)  J<delimiter>    Component is all characters up to but not
                    including the first occurrence of <delimiter>.
                    The next component would begin with the first
                    character following the <delimiter>.
4)  I<delimiter>    Component is all characters up to and including
                    the first occurrence of <delimiter>.
                    The next component would begin with the first
                    character following the <delimiter>.
    Delimiters are specified as:

    1) N               Any numeric character, 0 thru 9.
    2) A               Any alphabetic character, a thru Z.
    3) S               Any non-alphanumeric, special characters.
    4) '<single char>' A specific character other than apostrophe.
    5) ''''            An apostrophe character.
    6) #xx#            Any Hex-character (xx), such as: #FF#

    (Note: In EBCDIC, special characters are 00 thru 7F, alphabetic
     are 80 thru EF, and numeric are F0 thru FF.)

  Example: Assume:  ELEM = X;      INPROC = A33,L5,XA,J'/',I'+';
                    TYPE = STR;   OUTPROC = A33,L5,XA,J'/',I'+';
                STRUCTURE = X;
                  FIXED;
                    KEY =  X1;  LEN = 5;
                    ELEM = X2;  LEN = 4;  INPROC = A21:4;
                                          OUTPROC = A71,15;
                  REQUIRED;
                    ELEM = X3;  OCC = 1;
                    ELEM = X4;  OCC = 1;

           With the following input:     X = APPLE-234sample/input;

           The components are:  X1 = APPLE;
                                X2 = -234;     (converted to binary)
                                X3 = sample;
                                X4 = input;

           With the following input:
               X = BREAD17my test case/your output;
           The components are:  X1 = BREAD;
                                X2 = 17;    (converted to binary)
                                X3 = my test case;
                                X4 = your output;
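The way the L, X, J and I parameters break up an input value can be sketched in Python. This is an illustration, not the SPIRES implementation. The function names and the (kind, argument) parameter representation are hypothetical, the N, A and S classes are approximated with Python's character tests rather than EBCDIC ranges, the #xx# hex form is omitted, and taking the rest of the value when a delimiter is absent is an assumption consistent with the X4 components above.

```python
def matches(ch: str, delim: str) -> bool:
    """Test a character against a delimiter specification."""
    if delim == "N":
        return ch.isdigit()
    if delim == "A":
        return ch.isalpha()
    if delim == "S":
        return not ch.isalnum()
    # Quoted single character; '''' denotes an apostrophe.
    return ch == delim[1:-1].replace("''", "'")


def split_a33(value: str, params) -> list:
    """Break value into components according to (kind, argument) parameters."""
    parts, rest = [], value
    for kind, arg in params:
        if kind == "L":
            parts.append(rest[:arg])
            rest = rest[arg:]
            continue
        cut = next((i for i, ch in enumerate(rest) if matches(ch, arg)), None)
        if cut is None:
            # Delimiter not present: take what is left (assumption).
            parts.append(rest)
            rest = ""
        elif kind == "X":   # delimiter starts the next component
            parts.append(rest[:cut])
            rest = rest[cut:]
        elif kind == "J":   # delimiter is dropped
            parts.append(rest[:cut])
            rest = rest[cut + 1:]
        elif kind == "I":   # delimiter stays with this component
            parts.append(rest[:cut + 1])
            rest = rest[cut + 1:]
        else:
            raise ValueError(f"unknown parameter kind: {kind}")
    return parts


# Equivalent of A33,L5,XA,J'/',I'+'
RULE = [("L", 5), ("X", "A"), ("J", "'/'"), ("I", "'+'")]
print(split_a33("APPLE-234sample/input", RULE))
```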

When this rule is used as an OUTPROC for a TYPE=STR data element, the individual element values within the structure are processed by their OUTPROC rules (if any) and then combined together to form a single value for the TYPE=STR element according to the parameters. Normally the INPROC and OUTPROC rules (A33) are the same. (In the standard SPIRES format, a structure that has A33 coded for an OUTPROC but not for an INPROC will ignore the A33 OUTPROC -- for the A33 to be recognized on output, it must also be coded for input. Note, however, that a custom format may use A33 in an OUTPROC even though the structure's INPROC does not have an A33.)

J<delimiter> specifications cause <delimiter> to be appended to the individual value. The characters appended for <delimiter> specifications of N, A, or S are: N='9', A='X', S=' '. L<number> specifications cause blank padding if the value is shorter than <number> characters.

Note: A33 may not be used with A45 or A37 in the same INPROC. If an INPROC on an individual element within a structure input via A33 contains an A45 or A37, then only the first occurrence of that element will be retained in the record. If A33 is used with A61 (Save and Restore Value), then no A61 can follow the A33 in the rule string, and no save area value may exist when A33 is encountered.

D.1.3.0.3.4  * A34

A34 ,<NUM>

A34 ,P2

The last digit in the alphanumeric value is used for check digit testing. All but the last digit are converted into a fixed binary value and the result is divided by P2. If the remainder of the division does not match the check digit, then the error flag is set on. The alphanumeric value is not replaced.

The limit of the value (without the check digit) is 2,147,483,647.
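The A34 test can be sketched in Python as follows. The function name is hypothetical, and returning False stands in for setting the error flag.

```python
def a34_passes(value: str, p2: int) -> bool:
    """Return True if the last digit equals (leading digits) mod P2."""
    body, check = value[:-1], value[-1:]
    if not body.isdigit() or not check.isdigit():
        return False
    number = int(body)
    # The value without its check digit must fit in a fixed binary fullword.
    if number > 2_147_483_647:
        return False
    return number % p2 == int(check)


print(a34_passes("1234", 7))  # 123 mod 7 is 4, so the check digit matches
```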

D.1.3.0.3.5  * A35

A35 :<NUM> ,<SINGLE CHARACTER>

A35 :P1 ,P2

This action may be coded only for the key element of a non-slot record type. It ensures that the input value is a unique key, even though a record "with the same key" has already been added, by adding new characters at the end, usually as counters.

A35 works in two very different ways, depending on the value of P1. The first method described appends an integer to the value; the second appends a character string, and is in general used for variable-length values stored as character strings.

Augment value with appended integer (P1 < 4)

If P1 is less than 4, SPIRES will add a 1-, 2- or 3-byte integer to the end of the value, which makes the value unique. This integer value is always handled separately by SPIRES, regardless of the type of the remainder of the element.

A35 must be the first action coded in either an INPROC or OUTPROC string. For an INPROC string, the P1 parameter specifies the length of a value which is added to the key by the system to generate a unique key value. This augmented portion is a one, two or three byte binary value that is concatenated to the fully converted key (but only after all other INPROC Actions are complete).

Thus if P1 = 1 a one-byte value is used with a range of 0 through 254. If P1 = 0 a two-byte value of 0 through 32766 will be added. If P1 = 3 a three-byte value from 0 to 16,777,214 will be added.

P2 is a character SPIRES will look for in the input value. If the character appears, then SPIRES assumes that the characters to the right of the P2 character are to be converted and saved as the known augmented portion of the record key for record search purposes. (If the P2 character ends the input value, the value will be augmented as if the character were not present.)

If the P2 character is not found, then if the request is an ADD, SPIRES will augment the key with a binary value one greater than the highest augmented portion currently existing for that key. For example, if P2=# and the input value is KEYV, then if no other KEYV key exists, KEYV#0 is assigned as the key. If a value of KEYV#50 exists, then the key will be KEYV#51 following the ADD. To access a specific record, the fully qualified key must be given (e.g., DISPLAY KEYV#17).
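The assignment of the next augmented key on an ADD can be illustrated with a small Python sketch that works on external key forms. It is not how SPIRES stores the counter internally. The function name, the list of existing keys, and the default limit of 254 (the one-byte range) are assumptions for the example.

```python
def next_augmented_key(base: str, existing_keys, sep: str = "#", limit: int = 254) -> str:
    """Return the key assigned when `base` is added without a counter."""
    prefix = base + sep
    counters = [int(key[len(prefix):]) for key in existing_keys
                if key.startswith(prefix) and key[len(prefix):].isdigit()]
    # One greater than the highest existing counter; gaps are not reused.
    counter = max(counters) + 1 if counters else 0
    if counter > limit:
        raise OverflowError("no augmentation values left for this key")
    return f"{prefix}{counter}"


print(next_augmented_key("KEYV", []))                      # KEYV#0
print(next_augmented_key("KEYV", ["KEYV#17", "KEYV#50"]))  # KEYV#51
```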

For an OUTPROC or SEARCHPROC string, P1 and P2 must have the same values as does the corresponding INPROC Action. For the OUTPROC, the fully converted key value will be augmented upon output by the P2 character, followed by the converted binary value whose internal length is determined by the P1 value.

Augment value with string counter (P1 >= 4)

This technique differs from the first one in several ways:

The augmentation is done with letters or numbers, from 1 to 4 characters in length, depending on the P1 value chosen:

Suppose the following INPROC is coded for the key of a record:

Assuming the records would otherwise be added to the subfile successfully, below are some sample keys being input and the results. Remember, augmentation doesn't occur unless the augmentation character ("-" in these examples) ends the value.

The S438 error indicates that all possible values may have been assigned. Though that's not quite true, you can see by the direction of the example that SPIRES doesn't look for empty places but augments by the value after the highest value for that key.

Contrary to the example, however, the input key is unlikely to contain the augmentation already; rather, it will end with the augmentation character so that the augmentation will be done automatically.

General Information

Note: This action can be used to conserve slot record-types in a file by allowing a record that would be slot to be non-slot. A file may have only four slot record-types, none of which may be COMBINED with other record-types. A35 can also be used to reduce the use of the expensive SEQUENCE command when producing reports. If reports are most often produced using one element as the sort-key, then it may be advantageous to make that element the record-key; SPIRES will DISPLAY records in key-sequence when record subsetting was done entirely under the Global FOR command FOR TREE or FOR SUBFILE.

D.1.3.0.3.6  * A36

A36 :<NUM> ,<CHARACTER STRING> ,<NUM>

A36 :P1, P2 ,P3

If P1 is 0 then: The P2 string is appended to the data value if P3 is 0, otherwise the P2 string is inserted into the data value to the left of the P3-rd character of the data value, counting from the left end of the string. No insertion is attempted if the original data value is shorter than P3 characters, and the error flag is turned on.

If P1 is 3 then: The P2 string is inserted before the data value if P3 is 0, otherwise the P2 string is inserted to the right of the P3-rd character of the data value, counting from the right end of the string. No insertion is attempted if the original data value is shorter than P3 characters.

If P1 is 1 or 2 then: The P2 string is replicated as many times as necessary to create a string which is P3 characters long. Then the original data value is overlaid onto this string either at the front if P1=1 or at the rear if P1=2. The error flag is turned on if the length of the original value is greater than P3, and the original value is unmodified.

In all cases, the error flag is turned on if the final value exceeds MAXVAL characters, in which case the original value is unaffected.
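The overlay behavior for P1 = 1 and P1 = 2 can be sketched in Python. The sketch covers only those two P1 values and does not model the MAXVAL check. The function name is hypothetical, and P2 is assumed to be a non-empty string.

```python
def a36_overlay(p1: int, p2: str, p3: int, value: str):
    """Return (result, error flag) for A36 with P1 = 1 (front) or P1 = 2 (rear)."""
    if p1 not in (1, 2):
        raise ValueError("this sketch covers only P1 = 1 and P1 = 2")
    if len(value) > p3:
        # Value too long: flag the error and leave the value unmodified.
        return value, True
    # Replicate P2 until the fill string is P3 characters long.
    fill = (p2 * p3)[:p3]
    if p1 == 1:
        return value + fill[len(value):], False   # overlay at the front
    return fill[:p3 - len(value)] + value, False  # overlay at the rear


print(a36_overlay(2, "0", 8, "534"))  # ('00000534', False)
```

For instance, the rear overlay with zeroes is the kind of padding a rule such as "A36:2,0,8" would request.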

D.1.3.0.3.7  * A37

A37 ,<NUM>

A37 ,P2

If the value has more than P2 characters in it, the first P2 characters are treated as an occurrence of the element and the remainder are treated as a second occurrence. This action may not be used with A45.

D.1.4.0.3.8  * A38

A38 :<NUM>

A38 :P1

For INPROC, the value is forced to upper case and then blanks are squeezed in the same manner as for A40. A blank is then prefixed to the value, and all blank bytes are then replaced by a byte containing the length of the non-blank portion which follows (maximum of 255). For OUTPROC, all length bytes are replaced by blank bytes and the leading blank is eliminated. This action is most frequently used with the "remaining" element of a personal name in an index record. (See PASSPROC).

This action processes a name for the key of an index record (or structure) and the key of an associated structure element. The name being processed is split into two parts, depending on commas. The key of the index record (or structure) is defined by the "surname" part and the key of the associated structure element is defined by the "remaining" part. The two parts are determined as follows:

   1)  Value without commas:

The last non-blank portion of the value is the "surname" part, and all other non-blank portions of the value constitute the "remaining" part.

   2)  Value with one or more commas:

The last non-blank portion of the value preceding the first comma defines the "surname" part, and all the non-blank portions following that comma until the next comma or end-of-value (whichever comes first) followed by the remaining non-blank portions of the value preceding the "surname" part constitute the "remaining" part.

For PASSPROC, P1=0 operates as just described, but if P1=1 then PASSPROC will build an additional "surname" and "remaining" part if the value contains one or more commas, and more than one non-blank portion precedes the first comma. In such a case, the first non-blank portion of the value defines another "surname" and all the non-blank portions following the first comma up to the second comma or end-of-value (whichever comes first), defines the associated "remaining" part.

PASSPROC Examples:

      NAME                        SURNAME     REMAINING
      Smith, John                 Smith       John
      Sir Walter Raleigh          Raleigh     Sir Walter
      Jesus Christ, Lord          Christ      Lord Jesus
          (additional for A38:1)  Jesus       Lord
      Kirk Patrick, John Thomas   Patrick     John Thomas Kirk
          (additional for A38:1)  Kirk        John Thomas
      Mao Tse Tung,               Tung        Mao Tse
          (additional for A38:1)  Mao
      William Bryan,,Lawyer       Bryan       William
          (additional for A38:1)  William
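The PASSPROC splitting rules can be sketched in Python. This is an illustration of the surname/remaining rules only, with no case conversion or length-byte encoding. The function name and the list-of-pairs return value are hypothetical.

```python
def a38_passproc(name: str, p1: int = 0):
    """Return a list of (surname, remaining) pairs for a personal name."""
    segments = name.split(",")
    before = segments[0].split()  # non-blank portions before the first comma
    if not before:
        return []
    surname = before[-1]
    if len(segments) == 1:
        # No commas: the last portion is the surname, the rest is "remaining".
        return [(surname, " ".join(before[:-1]))]
    after = segments[1].split()   # portions between the first and second comma
    pairs = [(surname, " ".join(after + before[:-1]))]
    if p1 == 1 and len(before) > 1:
        # Additional pair: the first portion becomes another surname.
        pairs.append((before[0], " ".join(after)))
    return pairs


print(a38_passproc("Sir Walter Raleigh"))
print(a38_passproc("Kirk Patrick, John Thomas", p1=1))
```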

PASSPROC Sample Definition:

      INDEX-NAME = Rec-name;  or  SUB-INDEX = Structure;
        SEARCHTERMS = Search-mnemonics;
          PASSPROC = ---- / A38;
        SUB-INDEX = Structure;
          SEARCHTERMS = Dummy-mnemonic;
            PASSPROC = A165;
      Note:  A165 must be used as shown when A38 is used.

As a SEARCHPROC, A38 is used to convert a name supplied in a search command into a valid form for index searching. The flexibility in forms of a name that the searcher may supply depends on the P1 parameter. In most applications, the default (P1 = 0) is selected, since it gives the searcher the most possibilities. In such cases, the first name or names may be omitted or abbreviated to single letters (see the example below). However, first names must be in the precise order in which they appear in the retrieved record (see examples 6 and 11 below).

If P1 is 1, then no abbreviating of the first name(s) is allowed. Also, if more than one first name appears in the search value, then they must be in the order in which they appear in the index with no first names in between missing (see example 5 below). If P1 is 2, then either the complete first name in order or no first name will retrieve the values. If P1 is 3, then the first names of the stored value are checked through the length of the supplied search value's first names; indexed values that are shorter than the search value are skipped, as are those that don't match, and a match occurs only when all first names of the search value match those of the index value in the order given.

Suppose the user is trying to find records for Carl Philipp Emanuel Bach. The chart below indicates for the various P1 parameters whether a given search value would retrieve such a record.

Note for this example that P1=0 and P1=3 are the same. Where they differ is in the handling of search values that are longer than the stored value. Suppose for example, that the stored value is Carl P. Bach. The chart below indicates for the various P1 parameters whether a given search value would retrieve that record.

(All other examples from the first group would have NO in all columns.)

When A38 ($PNAME) is used as a Searchproc, the error flag is set when the user's search value contains more than just a last name and one of the following conditions is true:

In either situation, the error flag is set, the first-name portion of the value is discarded, and the search is executed.

For more information about how searching works with names, [EXPLAIN PERSONAL NAME INDEXING.]

D.1.4.0.3.9  * A39

A39 :<NUM> ,<NUM>

A39 :P1, P2

The value is converted to packed decimal form. The input value can be character or binary (1-4 bytes). For example, "10", "12.345", "-428E_5" and "-1776E56" are possible character values.

Leading and trailing blanks are stripped from the value. What remains can be:

All parts are optional, though there must be digits somewhere in the value. Otherwise, the error flag is set. (There is one exception to these rules: the values @@ and -@@, which indicate positive and negative infinity, can be entered as well.)

The input value is first adjusted so that it begins with an integer followed by an exponent. For example, "12.345" is adjusted to "12345E_3", and "1.2E4" is adjusted to "12E3". The value stored consists of one byte containing the exponent (ranging from -128 to 127), a byte to contain a sign character and the final digit of the integer, and then a byte for each two digits in the remainder of the value:

where "d" represents a digit, "s" represents the sign character, and "exp" represents the exponent. Thus, the minimum length for a stored value is two bytes. The storage length is controlled by the P1 and P2 parameters.

Basically, P2 states a length in bytes for the stored value. For example, if P2 is 2, then the stored value will be two bytes long. P2 must not exceed 256. If the converted value will not fit into P2 bytes, the error flag is set. If the converted value would fit in fewer than P2 bytes, then zeroes are prefixed to the value to make it P2 bytes long (unless 4 is added to P1; see below).

If P2 gives a "length" of 0, then the converted value will be stored in the minimum number of bytes possible, meaning that the storage length will vary depending on the precision of the number. The length will never exceed 256 bytes, however.

You can specify that the stored value should be no more than P2 bytes long but should be stored in the minimum number of bytes possible (like a combination of P2=0 and a non-zero P2) by coding the P2 maximum length and adding 4 to the P1 parameter.

Also for P1: Begin with P1=0 and add to it according to the options desired as described below. (You may add nothing or choose several different options, increasing the value of P1 as directed for each.)

By default, when P1 is 0, any form described above is allowed for input. If 1 is added to the P1 value, then decimal point and exponent input are not allowed. For example, "123" would be allowed, but not "1.23" or "123E5". The error flag is set if an illegal value is entered.

By default, trailing zeroes after a decimal point are left on the input value (e.g., 3.5000). If 2 is added to the P1 value, then trailing zeroes for values having negative exponents are removed after the value is adjusted for storage; the exponent is raised accordingly. For example, the input value 3.5000 is first adjusted to 35000E_4. Then, if 2 has been added to P1, the value to be stored would be changed to 35E_1. Each zero stripped raises the exponent by 1; as long as the final exponent created is less than or equal to zero, trailing zeroes are removed. The value 350.00 would be stored as 350E0.
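The zero-stripping rule for "add 2 to P1" can be sketched as follows. The function name is hypothetical, and the value is assumed to have already been adjusted to a digit string plus an exponent (negative exponents are ordinary negative integers here).

```python
def strip_trailing_zeroes(integer, exponent):
    """Remove trailing zeroes while the exponent is negative.

    Each zero removed raises the exponent by 1, so stripping stops
    as soon as the exponent reaches zero or a non-zero digit is met.
    """
    while exponent < 0 and len(integer) > 1 and integer.endswith("0"):
        integer = integer[:-1]
        exponent += 1
    return integer, exponent

print(strip_trailing_zeroes("35000", -4))  # ('35', -1)   i.e. 3.5000
print(strip_trailing_zeroes("35000", -2))  # ('350', 0)   i.e. 350.00
```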

By default, if the input value is outside the allowed ranges, the error flag is set. If 8 is added to P1, then SPIRES will permit the input value to be outside the allowed ranges, converting the value to "@@" (positive infinity, for positive values with exponents larger than 127), "0" (zero, for values with exponents smaller than -128) or "-@@" (negative infinity, for negative values with exponents larger than 127).

Packed decimal values are usually processed by either A80 or A85 on output. Several other valuable processing capabilities for packed decimal values are provided by A55.

D.1.4.0.4.0  * A40

A40

A40

Groups of more than one blank in the value are replaced by a single blank. All leading and trailing blanks are stripped. The result replaces the value. (Since A40 removes trailing blanks, coding "A40/A51" is unnecessary.)
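As an illustration only, the effect of A40 on a value can be imitated in Python; the function name is hypothetical, and only the blank character is treated as white space.

```python
import re

def a40(value):
    """Collapse runs of blanks to one blank, then strip both ends."""
    return re.sub(r" {2,}", " ", value).strip(" ")

print(repr(a40("  Carl   P.  Bach ")))  # 'Carl P. Bach'
```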

D.1.4.0.4.1  * A41

A41 :<NUM>

A41 :P1

The input value is assumed to be in one of the forms specified below. The title portion, preceded by a comma or double comma, is optional.

The input value is converted to one of the forms above as specified by the following table:

  Input value of form:    1   2   3   4   5   6
  Final form for P1=0:    1   1   1   4   4   6
  Final form for P1=1:    2   2   2   5   5   6

When P1=2, values input with a comma are left as input, but values input without a comma are processed as for P1=1.

Example:  When P1=0,
          A) John Von Schroeder,,Dr.
             would not be converted.
          B) Schroeder, John Von,Dr.
             would be converted to:  John Von Schroeder,,Dr.
          C) Schroeder, John
             would be converted to:  John Schroeder

          When P1=1,
          A) John Von Schroeder,,Dr.
             would be converted to:  Schroeder, John Von,Dr.
          B) Von Schroeder, John,Dr.
             would be converted to:  Schroeder, John Von,Dr.
          C) John Schroeder
             would be converted to:  Schroeder, John

          When P1=2,
          A) Von Schroeder, John
             would not be converted.
          B) John Von Schroeder
             would be converted to:  Schroeder, John Von

Note: If an element processed by A41 as an INPROC is to be used to sequence results in last name alphabetical order, then the last name must be stored first, as with A41:1 or A41:2.

When A41 is a Searchproc, it may also have the values 4 through 7 for P1, which make it work similarly to A38 with values 0 through 3. The difference is that the stored value in the index record is a single value, and does not have the first and middle names stored in a sub-index. In other words, the record key should be the entire name, not just the last name.

This type of index can be created with....

As a Passproc, A41:0 and A41:1 behave just like A38:0 and A38:1, except that they build the index's key as a single value, beginning with the surname portion of the name (LAST) followed optionally by a comma and the first/middle names portion. The value is not split into pieces to go into a separate sub-index, the way A38 name indexes are normally built.

Another difference from A38 to note: titles are not passed, i.e., anything following a second comma is discarded, along with that comma.

For more information about how searching works with names, [EXPLAIN PERSONAL NAME INDEXING.]

D.1.4.0.4.2  * A42

A42 :<NUM>

A42 :P1

The value is tested to see if it obeys the conventions for dollars-and-cents or decimal quantities. The error flag is set if the value is not in the proper form, which must be one of the following: ($)D, ($)D., ($)D.CC, or ($).CC, where D is a string of digits, CC is one or two digits (depending on P1), and ($) is an optional dollar sign. Leading and trailing blanks are stripped, and there may be a leading "+" or "-" sign either before or after the dollar sign.

With P1 values of 0 to 7, A42 is designed for dollar-cents values. If P1=0 or 4, the input value will be unaffected. If P1=1 or 5, the input value is replaced by a single precision (4 byte) floating point equivalent, whose value may be output with A81:0 or A81:1. If P1=2 or 6, the input value is replaced by a (4 byte) binary equivalent, whose value may be output by A81:2 or A81:3. The binary value is always an integer in "cents". (The range of values for a binary value is -21474836.47 to 21474836.47.) For P1 values of 4 or more, the CC field on input may be one or two digits. If P1 is less than 4, then CC, if it occurs, must be two digits; if it is not, then an error condition will result. Thus, for P1 of 4 or more, $3.5 is equivalent to $3.50.
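The form check and the conversion to integer cents (the P1=2 and P1=6 cases) can be sketched in Python. This is a hedged illustration: the function name and the allow_one_digit_cents flag (standing in for "P1 of 4 or more") are hypothetical, None stands in for the error flag, and a value carrying a sign on both sides of the dollar sign is assumed to be an error.

```python
import re

def a42_cents(value, allow_one_digit_cents=False):
    """Return the value as integer cents, or None where the error
    flag would be set (bad form or out of the four-byte range)."""
    text = value.strip(" ")
    cents = r"\d{1,2}" if allow_one_digit_cents else r"\d{2}"
    # Forms: ($)D  ($)D.  ($)D.CC  ($).CC, sign before or after "$".
    pattern = rf"([+-]?)\$?([+-]?)(?:(\d+)(?:\.({cents})?)?|\.({cents}))"
    m = re.fullmatch(pattern, text)
    if m is None or (m.group(1) and m.group(2)):
        return None
    sign = -1 if "-" in (m.group(1), m.group(2)) else 1
    dollars = int(m.group(3) or "0")
    cc = m.group(4) or m.group(5) or "0"
    total = sign * (dollars * 100 + int(cc.ljust(2, "0")))
    if abs(total) > 2147483647:
        return None
    return total

print(a42_cents("$3.50"))                             # 350
print(a42_cents("$3.5"))                              # None
print(a42_cents("$3.5", allow_one_digit_cents=True))  # 350
```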

With P1 values of 8 to 15, A42 is used to store decimal values. The stored value may have from 0 to 7 places of decimal accuracy, depending on the value of P1; the value "P1-8" is the number of places of accuracy. The input value is assumed to have that number of decimal places; if it does not, zeroes (and a decimal point, if necessary) are appended. For example, if P1 is 11, then the value is given three decimal places of accuracy. The input value "1.234" would be unaffected; "1.23" would have a zero appended: "1.230"; the value "1" would be changed to "1.000". The decimal point is then removed from the value, which is then converted to a four-byte integer for storage. The stored value may be reconverted by the edit processing rule A85. Thus the above values would be stored as 1234, 1230 and 1000 respectively.

For P1 of 8 to 15, the error flag is set if the input value contains more than "P1-8" decimal places or if the value after the decimal point is removed is less than -2,147,483,647 or greater than 2,147,483,647.
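The decimal case (P1 of 8 to 15) can be sketched in Python as follows. The function name is hypothetical, None stands in for the error flag, and for brevity the optional dollar sign is not accepted here.

```python
import re

def a42_decimal(value, p1):
    """Scale a decimal string to a four-byte integer with P1-8
    implied decimal places; return None where the error flag is set."""
    if not 8 <= p1 <= 15:
        raise ValueError("this sketch covers P1 values 8 to 15 only")
    places = p1 - 8
    m = re.fullmatch(r"([+-]?)(\d*)(?:\.(\d*))?", value.strip(" "))
    if m is None or not (m.group(2) or m.group(3)):
        return None                      # no digits at all
    fraction = m.group(3) or ""
    if len(fraction) > places:
        return None                      # too many decimal places
    # Pad with zeroes, then drop the decimal point.
    scaled = int(m.group(1) + (m.group(2) or "0") + fraction.ljust(places, "0"))
    if abs(scaled) > 2147483647:
        return None                      # outside the allowed range
    return scaled

print(a42_decimal("1.234", 11))  # 1234
print(a42_decimal("1.23", 11))   # 1230
print(a42_decimal("1", 11))      # 1000
```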

D.1.4.0.4.3  * A43

A43 ,<CHARACTER STRING> ,<CHARACTER STRING>

A43 ,P2 ,P3

The value is converted into a different character set. Each character of the original value is looked up in the P2 string. If the character is found in the P2 string then it is replaced by the corresponding character from the P3 string. The translated result replaces the value. If the P3 value is a single character then each occurrence of any of the P2 characters is replaced by the single character in P3. If P3 is not a single character then strings P2 and P3 must be the same length.
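The character-for-character translation can be imitated with Python's built-in translation tables. This is a sketch only; the function name is hypothetical, and raising an exception for mismatched lengths is an assumption about how a coding error might be reported.

```python
def a43(value, p2, p3):
    """Replace each character found in p2 by the character at the
    same position in p3 (or by p3 itself if p3 is one character)."""
    if len(p3) == 1:
        p3 = p3 * len(p2)
    if len(p2) != len(p3):
        raise ValueError("P2 and P3 must be the same length")
    return value.translate(str.maketrans(p2, p3))

print(a43("a-b_c", "-_", " "))    # a b c
print(a43("hello", "el", "ip"))   # hippo
```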

D.1.4.0.4.4  * A44

A44 :<NUM>, <CHARACTER STRING> ,<CHARACTER STRING>

A44 :P1, P2 ,P3

The value is scanned for the P2 string. When an occurrence of this string is found, it is replaced by the P3 string, if P1=0. If P1=1, then the P2 string and the 1 character following it are replaced by P3. If P1=2, then the P2 string and the 2 characters following it are replaced by P3. P1 values of 0 to 7 may be used in this manner to indicate the number of characters after the P2 string that are to be deleted or changed. There must be at least P1 characters following string P2 for processing to occur. P3 may be null. The error flag is set if the P2 string is not found, if insufficient characters follow the P2 string for processing, or if the converted value exceeds MAXVAL characters in length. In all cases of errors, the original value is retained.
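A Python sketch of the P1 = 0 to 7 behavior follows. It is an illustration under stated assumptions: the function name is hypothetical, the result is returned as a (value, error flag) pair, and every occurrence of P2 is assumed to be replaced in a single left-to-right scan (the text does not say whether only the first occurrence is changed).

```python
def a44(value, p1, p2, p3, maxval=None):
    """Replace each P2 string, plus the P1 characters after it, by P3.

    Returns (new_value, error_flag); on any error the original
    value is kept, as the rule description requires.
    """
    if not 0 <= p1 <= 7 or not p2:
        raise ValueError("this sketch covers P1 values 0 to 7 only")
    pieces, pos, found = [], 0, False
    while True:
        hit = value.find(p2, pos)
        if hit < 0:
            break
        end = hit + len(p2) + p1
        if end > len(value):
            return value, True        # too few characters after P2
        pieces += [value[pos:hit], p3]
        pos, found = end, True
    if not found:
        return value, True            # P2 string not found
    result = "".join(pieces) + value[pos:]
    if maxval is not None and len(result) > maxval:
        return value, True            # converted value too long
    return result, False

print(a44("ID:X123", 1, "ID:", ""))   # ('123', False)
print(a44("abc", 0, "zz", "y"))       # ('abc', True)
```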

For P1 values of 8 to 15, processing is as for P1 of 0 to 7 respectively, but the length of the value being processed must be:

or the value is not processed. This allows processing only if all of a value matches the P2 string.

D.1.4.0.4.5  * A45

A45 : <NUM> ,<SINGLE CHARACTER>

A45 : P1 ,P2

The value is really a string of values separated by delimiters. The delimiter may be a space or character P2 surrounded by optional spaces. The string is broken up into its separate parts, and each part is treated as an occurrence of the element; the delimiters and any leading and trailing blanks are discarded.

If P1 is non-zero, individual values may contain the delimiter character if the individual value is surrounded by apostrophes (') or quotation marks ("). In particular, if P1 is 1, then the value will not be broken up at delimiters that are between apostrophes. If P1 is 2, the value will not be broken up at delimiters that are between quotation marks. If P1 is 3 (1+2), then the value will not be broken at delimiters that are between pairs of apostrophes or quotation marks. Note that the apostrophes or quotation marks will be considered part of the individual value. For example,

If P1 were 0, the same value would be split into separate values at every single comma, regardless of the quotation marks and apostrophes.
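The breakup logic for P1 values of 0 to 3 can be sketched in Python. This is only an illustration: the function name is hypothetical, a blank is assumed to be the delimiter when P2 is not given, and empty pieces (for example, between two adjacent delimiters) are assumed to be dropped.

```python
def a45(value, p1=0, p2=" "):
    """Split value at the delimiter p2, honoring the quoting options
    selected by p1 (1 = apostrophes, 2 = quotation marks, 3 = both)."""
    protectors = ("'" if p1 & 1 else "") + ('"' if p1 & 2 else "")
    parts, current, open_quote = [], [], None
    for ch in value:
        if open_quote is not None:
            current.append(ch)            # inside quotes: never split
            if ch == open_quote:
                open_quote = None
        elif ch in protectors:
            open_quote = ch               # quote mark stays in the value
            current.append(ch)
        elif ch == p2:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    # Discard surrounding blanks and (by assumption) empty pieces.
    return [p.strip(" ") for p in parts if p.strip(" ")]

print(a45('a, "b, c" ,d', 2, ","))  # ['a', '"b, c"', 'd']
print(a45('a, "b, c" ,d', 0, ","))  # ['a', '"b', 'c"', 'd']
```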

Note: The P2 parameter of A45 must be a single character. If the delimiter is several characters, use A44 to change this delimiter string to a single character that would not occur in the value (such as an unprintable hex character) and then use A45 to break on that character. For example: INPROC=A44,'//',#01#/A45,#01#;

Note: All actions following the A45 in a processing rule string are repeated for each occurrence of the element that is generated by the A45. All actions up to and including the A45 are executed only once, operating on the entire original value.

Note: the SEARCHPROC for a sub-index should not contain an A45. The SEARCHPROC for the simple index containing a sub-index may have an A45 coded in it; however, value breakup from the A45 and sub-index use should not occur in the same search command. For example, if the breakup character is a blank, either of these commands will work properly:

But this command would fail:

This rule cannot come between a pair of A61s in a rule string. See the details under "A61".

You should not use more than one A45 per rule string.

If the value consists of nothing but delimiters and blanks, then a null value is returned when the rule is used in an INPROC; for PASSPROC, the value is ignored; for