INDEX
*  SPIRES File Definition
A  Introduction to SPIRES File Definition
A.1  Preface
A.2  System Overview
A.2.1  SPIRES Command and Definition Languages
A.2.2  SPIRES Processors and Programs
A.2.3  The User Interface
A.2.4  File Definition Concepts and Facilities
A.3  The Process of Defining a File
A.3.1  Design Analysis
A.3.2  File Definition
A.3.2.1  The File Definer: A SPIRES Subsystem to Simplify File Definition
A.4  Glossary of Important File Definition Terms
A.4.1  Element
A.4.2  Record
A.4.3  Structure
A.4.4  Key
A.4.5  Goal Record
A.4.6  Index Record
A.4.7  Record-type
A.4.8  Index
A.4.9  Simple Index
A.4.10  Compound Index
A.4.11  Combined Record-type
A.4.12  Removed Record-type
A.4.13  Subfile
A.4.14  File
A.4.15  Hierarchy of File Definition Components
B  Defining a SPIRES File
B.1  Goal Record Concepts and Definition
B.1.1  Element Names, Occurrences, and Lengths
B.1.2  File and Record Name Statements
B.1.3  Element Categories
B.1.4  Record Keys
B.1.5  Element Name, Occurrence and Length Statements
B.1.6  Element Aliases
B.1.7  Dummy Elements; Comment Statements
B.1.8  Optional Statements: AUTHOR, MAXVAL, NOAUTOGEN and BIN
B.1.8.1  (The AUTHOR Statement)
B.1.8.2  (The MAXVAL Statement)
B.1.8.3  (The NOAUTOGEN Statement)
B.1.8.4  (The BIN Statement)
B.1.9  Statements in the Subfile Section
B.1.10  A Complete File Definition
B.2  Goal Record Keys, Slot and Removed Records
B.2.1  Record Keys
B.2.2  Slot Keys
B.2.2a  SLOT-START Statement
B.2.3  Removed Record-Types
B.2.4  Monotonic Record-Types
B.3  Structures
B.3.1  Data Structuring
B.3.2  Coding Structures
B.3.3  Structured-Data Input
B.3.4  Keyed Structures
B.3.5  Keyed-Structure Data Input
B.3.6  Floating Structures
B.4  Processing Rules: INPROC, INCLOSE, OUTPROC
B.4.1  Functions of Processing Rules
B.4.2  INPROC and OUTPROC Rule Functions
B.4.3  Processing Rule Strings
B.4.4  Processing Rule Syntax
B.4.5  Processing Rule Restrictions
B.4.6  Understanding Processing Rule Descriptions
B.4.7  Custom Error Messages
B.4.8  INCLOSE Rules
B.4.9  Processing Rules For Numeric Data
B.4.10  Processing Rules for Dollar-and-Cents Data
B.4.11  Processing Rules for Free Form Character Strings
B.4.12  Processing Rules for Strings of Codified Data
B.4.13  Processing Rule for Personal Names to Canonical Form
B.4.14  Processing Rules for Validating Length and Occurrence
B.4.15  Processing Rules and Element Types
B.4.16  Processing Rule Tracing: SET PTRACE
B.4.17  Processing Rule Tracing for Passprocs: SET PASSTRACE
B.5  The FILEDEF Subfile and File Compilation
B.5.1  The FILEDEF Subfile
B.5.2  Adding Records to FILEDEF
B.5.3  Compiling File Definitions
B.5.4  Altering a File Definition in FILEDEF
B.5.5  ORVYL Files Created by Compilation
B.5.6  The ATTACH and SELECT Commands
B.5.7  The PROCESS Command in SPIBILD
B.5.8  Making Major Changes to a File: The ZAP FILE Command
B.5.9  Making Minor Changes to a File: The RECOMPILE Command
B.5.10  Destroying a SPIRES File
B.5.11  Summary
B.6  File Structure: Tree & Slot, Goal & Index, Removed Records
B.6.1  Introduction
B.6.2  Element Storage
B.6.3  Record Storage
B.6.4  Removed Record-Types
B.6.4.1  Very Large Databases
B.6.4.2  RES-LARGE Statement
B.6.5  Combined Record-Types
B.6.5a  Extended Tree Data Sets for Large Databases
B.6.5b  The OVERFLOW-TO and OVERFLOW-KEY statements
B.6.5c  The EXTERNAL-TYPE statement
B.6.6  Figure: Function of Goal and Index Records
B.6.7  Figure: Storage of Element Length and Occurrence Information
B.6.8  Figure: A Tree-Structured Data Set
B.6.9  Figure: Detail of the Structure of a Single File Block
B.6.10  Figure: Sample Tree After Intense Local Growth
B.6.11  Figure: Sample Tree With Well-Distributed Growth
B.6.12  Figure: Tree Showing High Number of Access Per Record
B.6.13  Figure: Previous Tree After Rebalancing
B.7  Understanding and Coding Index Records
B.7.1  How Indexing Works
B.7.2  Understanding Simple Indexes
B.7.3  Understanding Qualifiers
B.7.4  Understanding Sub-Indexes
B.7.5  Understanding Compound Indexes
B.7.6  The Impact of Global FOR and ALSO on Indexing
B.7.7  Index Definition
B.7.8  Coding Simple Indexes
B.7.9  Coding Simple Indexes with Qualifiers
B.7.10  Coding Compound Indexes
B.7.11  Coding Sub-Indexes
B.7.12  Index Record and Goal Record Elements
B.7.13  Index Records as Goal Records
B.7.14  Indexes for Non-Removed Record Types; Keys vs. Locators
B.7.15  Ensuring the Validity of Index Records
B.8  Understanding and Coding the Linkage Section
B.8.1  Functions of the Linkage Section
B.8.2  The Global Parameters Section
B.8.2.1  The SEARCHPROC Statement in the Global Parameters section
B.8.2.2  The EXTERNAL-NAME Statement
B.8.3  Individual Index Linkages
B.8.4  Simple Indexes
B.8.5  Sub-Indexes
B.8.6  Global Qualifiers
B.8.7  Local Qualifiers
B.8.8  Compound Indexes
B.8.9  Coding Searchproc Rules
B.8.10  The NOPASS Statement
B.8.11  Coding PASSPROC Rules
B.8.12  Choosing the "Fetcher" Passproc
B.8.13  Other Actions in a PASSPROC Rule String
B.9  Defining Subfile Privileges
B.9.1  The Function of the Subfile Section
B.9.2  Basic Statements in the Subfile Section
B.9.2a  Subfile Selection for "Access Lists" of Accounts
B.9.3  The SECURE-SWITCHES Statement
B.9.3.1  Secure-Switches 1 and 2
B.9.3.2  Secure-Switch 2
B.9.3.3  Secure-Switch 3
B.9.3.4  Secure-Switch 4
B.9.3.5  Secure-Switch 5
B.9.3.6  Secure-Switch 6
B.9.3.7  Secure-Switch 7
B.9.3.8  Secure-Switch 8
B.9.3.9  Secure-Switch 9
B.9.3.10  Secure-Switch 10
B.9.3.11  Secure-Switch 11
B.9.3.12  Secure-Switch 12
B.9.3.13  Secure-Switch 13
B.9.3.14  Secure-Switch 14
B.9.3.15  Secure-Switch 15
B.9.3.16  Secure-Switch 16
B.9.3.17  Secure-Switch 17
B.9.3.18  Secure-Switch 18
B.9.3.19  Secure-Switch 19
B.9.3.20  Secure-Switch 20
B.9.3.21  Secure-Switch 21
B.9.3a  The SHOW SSW and SET SSW Commands
B.9.4  Security for Individual Elements and Indexes
B.9.4.1  Views: Element Security Defined in Packets
B.9.4.2  Advanced Features of the View Facility
B.9.4.3  Specific Effects of the View Facility
B.9.4.4  Priv-Tags and the CONSTRAINT and NOUPDATE Statements
B.9.4.5  The INPROC-REQ and OUTPROC-REQ Statements
B.9.4.6  Index Security: Priv-Tags and the NOSEARCH Statement
B.9.5  The SUBGOAL Statement
B.9.6  The SELECT-COMMAND Statement
B.9.7  The PROGRAM Statement
B.9.8  The SUBCODE Statement
B.9a  Defining File Access Privileges
B.10  SPIRES File Management
B.11  Logging Database Use in SPIRES
B.12  Immediate Indexing
B.12.1  Coding Immediate Indexes
B.12.2  Efficiency Considerations for Immediate Indexes
B.12.3  Immediate Indexing and Goal-to-Goal Passing
C  Additional Facilities for the SPIRES File Definer
C.1  Recompile of an Existing File's Definition
C.1.1  The Function of the RECOMPILE Command
C.1.2  Statements You Can Change, Add or Delete Anytime
C.1.3  Statements You Can Sometimes Change, Add or Delete
C.1.4  Statements You Can Never Change, Add or Delete
C.2  [Currently not used]
C.3  Synonyms
C.3.1  The Function of a Synonym
C.3.2  Defining a SYNONYM Index
C.3.3  Adding Synonyms to Index Records
C.4  Executable Elements: Protocols and TYPE=XEQ
C.5  Indirect Record-Access: Action 32 and SUBGOAL Processing
C.5.1  The Function of Action 32
C.5.2  Action 32: Problem One
C.5.3  First Solution to Problem One
C.5.4  Second Solution to Problem One
C.5.5  Third Solution to Problem One
C.5.6  Action 32: Problem Two
C.5.7  Solution to Problem Two
C.5.8  SUBGOAL Processing
C.6  Practical Techniques for File Definers and Managers
C.6.1  Introduction
C.6.2  Proximity Searching: Information for File Developers
C.6.2.1  Proximity Searching: File Definition Requirements
C.6.2.2  Proximity Searching: How it Works
C.6.5  Examination Of Index Entries
C.6.10  Automatic Accumulation of Record Modification Dates
C.6.11  Free Global Qualifiers
C.6.12  Record Protection By Account Number
C.6.13  Same-Structure Retrieval Through Indexes
C.6.14  Phonetic Search of Personal Names
C.6.15  Non-Unique INDEX-NAME Statements
C.6.17  The QUELEVEL and RES-LEVEL Statements
C.6.18  Record Size Limits and Split Records
C.6.19  Indexing Negative Integer or Real Values
C.6.20  Encrypting Data Values
C.6.21  DEFQ-Only Record-Types
C.6.22  File Security: ORVYL Data Sets
C.6.23  Creating a Deferred Queue on Another Account
C.6.23a  Creating a Duplicate Deferred Queue
C.6.24  Checkpoint Data Sets
C.7  Compiling File Definition Code from Several Sources
C.7.1  The RECDEF Subfile
C.7.2  The EXT-REC and EXT-LINK Statements
C.7.3  Comparing the DEFINED-BY and EXT-REC / EXT-LINK Statements
C.8  FASTBILD
C.9  File Definition Compilation Diagnostics
C.9.1  Determining the Location of Errors
C.9.2  ATCHFILE, INITFILE and Other Error Messages
C.9.3  Compile Diagnostics and Errors
C.9.3.1  * ACCOUNT MISMATCH
C.9.3.2  * AET TABLE OVERFLOW
C.9.3.3  * COMBINE RECORD INVALID
C.9.3.3.0  * NAME PCT TABLE FULL
C.9.3.3.0a  * PACKED CHAR TABLE FULL
C.9.3.3.0b  * ELEMENT TABLE FULL
C.9.3.3.0c  * USERPROC TABLE FULL
C.9.3.3.1  * LABL PCT TABLE FULL
C.9.3.3.1a  * IMT TABLE FULL
C.9.3.3.1b  * AET TABLE FULL
C.9.3.3.1c  * LABEL TABLE FULL
C.9.3.3.2  * TOO MANY RECORD TYPES
C.9.3.3.2a  * TOO MANY REAL RECORD TYPES
C.9.3.3.3  * TOO MANY SLOT TYPES
C.9.3.3.4  * XEQ TYPE MUST BE VARIABLE
C.9.3.4  * DEFINED-BY RECORD TABLES NOT FOUND
C.9.3.5  * EXTRANEOUS string
C.9.3.6  * FILE EXISTS
C.9.3.7  * FILE NOT AVAILABLE
C.9.3.8  * ILLEGAL 2ND PROPERTY
C.9.3.9  * INVALID ACTION CODE
C.9.3.9a  * ELEMENT MUST BE TYPE STRUCTURE
C.9.3.10  * INVALID ACTION SYNTAX
C.9.3.10a  * ILLEGAL ACTION SEQUENCE
C.9.3.10b  * INVALID ACTION GROUP
C.9.3.11  * INVALID CINDEX
C.9.3.12  * INVALID MNEMONIC
C.9.3.13  * INVALID NAME LEN
C.9.3.13a  * INVALID LOCATOR LENGTH
C.9.3.14  * INVALID REC NAME
C.9.3.15  * INVALID SEQUENCE
C.9.3.16  * KEY ELEMENT ERROR
C.9.3.17  * LENGTH VALUE > 255
C.9.3.18  * LENGTH NOT GIVEN
C.9.3.19  * DUPLICATE ELEMENT NAMES IN STRUCTURE
C.9.3.19a  * DUPLICATE VARIABLE NAME
C.9.3.20  * NO ELEMENTS ALLOWED WITH DEFINED-BY VALUE
C.9.3.21  * NO ELEMENTS IN RECORD
C.9.3.22  * NO ELEMENTS IN STRUCTURE
C.9.3.23  * NO RECORD TO COMPILE
C.9.3.24  * NON FIXED ELEMS FOR SLOT
C.9.3.25  * ONE UNIQUE ID PER RECORD
C.9.3.26  * PROCESSING RULE TABLES FULL
C.9.3.27  * PROCESSING RULE TABLE OVERFLOW
C.9.3.28  * RECORD KEY ELEMENT MISSING
C.9.3.28a  * IMMEDIATE INDEX IS ALSO IMMEDIATE GOAL
C.9.3.28b  * TREE-DATA VALUE INVALID
C.9.3.29  * INVALID SLOTCHECK VALUE
C.9.3.29a  * INVALID ELEM MNEMONIC
C.9.3.29b  * ACTION 32 RECORD recname NOT VALID
C.9.3.29c  * USEMPROC error
C.10  Processing Rule-String Procedures
C.10.1  The PROC and RULE Statements
C.10.2  The PARM and DEFAULT Statements
C.10.3  The SYMBOL and VALUE Statements
C.10.4  Processing Rule Procedure Examples
C.10.5  The EXTDEF Subfile
C.11  User-Defined Processing Rules: Userprocs
C.11.1  Coding Userprocs
C.11.2  Uprocs (Commands) Available in Userprocs
C.11.2.1  SET Uprocs
C.11.2.2  Block-Construct Uprocs for Execution Flow
C.11.2.3  Other Uprocs for Execution Flow
C.11.2.4  Uprocs for Setting User Variables
C.11.2.5  Uprocs for Terminal Input/Output
C.11.2.6  Miscellaneous Userproc Uprocs
C.11.3  Using Variables in Userprocs
C.11.3.1  System Variables Available Only in Userprocs
C.11.3.2  User Variables in Userprocs
C.11.4  Some Interesting and Unusual Uses for Userprocs
C.11.4.1  Retrieving Other Element Values with the $GETxVAL Functions
C.11.4.2  (*) Outproc Userprocs for Record Keys and Secure-Switch 13
C.11a  Virtual Elements
C.11a.1  Examining Virtual Elements
C.11a.2  Retrieving Other Elements for Virtual Elements
C.11a.3  The REDEFINE Statement
C.11a.3.1  (A Comparison of Rule A79, $GETxVAL, and Redefining Elements)
C.11a.4  Variably-Occurring Virtual Elements
C.12  Record-Types that Serve as Goals and Indexes; Goal-to-Goal Passing
C.12.1  Index Record-Types Used as Goal Records
C.12.2  Record-Types with Goal and Index Data
C.12.3  Linking Goal-Record Types: Goal-to-Goal Passing
C.12.3.1  Rules and Suggestions for Goal-to-Goal Passing
C.12.3.2  A Solution to the Example in C.12.3
C.12.3.3  Using Qualifiers in Passing to Create Goal Records
C.12.4  Double-Headed Files
C.12.5  Chain Passing
C.12.6  Goal-to-Same-Goal Passing
C.12a  Indirect Searching
C.12a.1  Coding Details for Indirect Indexes
C.12a.2  Uses for Indirect Searching
C.12a.3  Some Technical Details on how SPIRES Handles Indirect Indexes
C.12a.4  Dynamic Indexes
C.13  File Definition Information Packets
C.13.1  Element Information Packets
C.13.1.1  The ELEMINFO (INFO) Info-element
C.13.1.2  The NOTE Info-element
C.13.1.3  The DESCRIPTION (DESC) Info-Element
C.13.1.4  The HEADING (HEAD) Info-element
C.13.1.5  The COL-HEADING (COLHEAD) Info-element
C.13.1.6  The WIDTH (WID) Info-element
C.13.1.7  The ADJUST (ADJ) Info-element
C.13.1.8  The INDENT Info-element
C.13.1.9  The MAXROWS (MAXROW) Info-element
C.13.1.10  The EDIT Info-element
C.13.1.11  The VALUE-TYPE (VTYPE) Info-element
C.13.1.12  The INDEX Info-element
C.13.1.13  The USERINFO Info-element
C.13.1.14  The INPUT-OCC (INOCC) Info-element
C.13.1.15  The DEFAULT Info-element
C.13.1.16  The SUPPLIED Info-element
C.13.1.17  The RDBMS_COLUMN Info-element
C.13.1.18  The RDBMS_DATATYPE Info-element
C.13.1.19  The RDBMS_DATALENGTH Info-element
C.13.2  System Commands and Utilities Using Element Information
C.13.3  Index Information Packets
C.13.3.1  The INDEXINFO (INFO) Info-element
C.13.3.2  The NOTE Info-element
C.13.3.3  The DESCRIPTION (DESC) Info-Element
C.13.3.4  The SOURCE (SOU) Info-element
C.13.3.5  The VALUE-TYPE (VTYPE) Info-element
C.13.3.6  The TRUNCATE (TRUNC) Info-element
C.13.3.7  The EXCLUDE (EXC) Info-element
C.13.3.8  The USERINFO Info-element
C.13.4  System Commands and Utilities Using Index Information
C.13.5  Alternate Locations for Information Packet Definitions
D  Appendices
D.1  Actions: Complete Listing By Number
D.1.1  About this Chapter
D.1.2  Actions Used Only As SEARCHPROC Rules (A6 -- A16)
D.1.2.0.0.6  * A6
D.1.2.0.0.7  * A7
D.1.2.0.0.8  * A8
D.1.2.0.0.9  * A9
D.1.2.0.1.0  * A10
D.1.2.0.1.1  * A11
D.1.2.0.1.2  * A12
D.1.2.0.1.3  * A13
D.1.2.0.1.4  * A14
D.1.2.0.1.5  * A15
D.1.2.0.1.6  * A16
D.1.2.0.1.7  * A17
D.1.3  Actions Used as INPROC, OUTPROC, SEARCHPROC or PASSPROC Rules (A21 -- A66)
D.1.3.0.2.1  * A21
D.1.3.0.2.2  * A22
D.1.3.0.2.3  * A23
D.1.3.0.2.4  * A24
D.1.3.0.2.5  * A25
D.1.3.0.2.6  * A26
D.1.3.0.2.7  * A27
D.1.3.0.2.8  * A28
D.1.3.0.2.8a  * A28 for time
D.1.3.0.2.8b  * A28 for datetime
D.1.3.0.2.9  * A29
D.1.3.0.3.0  * A30
D.1.3.0.3.1  * A31
D.1.3.0.3.2  * A32
D.1.3.0.3.3  * A33
D.1.3.0.3.4  * A34
D.1.3.0.3.5  * A35
D.1.3.0.3.6  * A36
D.1.3.0.3.7  * A37
D.1.4.0.3.8  * A38
D.1.4.0.3.9  * A39
D.1.4.0.4.0  * A40
D.1.4.0.4.1  * A41
D.1.4.0.4.2  * A42
D.1.4.0.4.3  * A43
D.1.4.0.4.4  * A44
D.1.4.0.4.5  * A45
D.1.4.0.4.6  * A46
D.1.4.0.4.7  * A47
D.1.4.0.4.8  * A48
D.1.4.0.4.9  * A49
D.1.4.0.5.0  * A50
D.1.4.0.5.1  * A51
D.1.4.0.5.2  * A52
D.1.4.0.5.3  * A53
D.1.4.0.5.4  * A54
D.1.4.0.5.5  * A55
D.1.4.0.5.6  * A56
D.1.4.0.5.7  * A57
D.1.4.0.5.8  * A58
D.1.4.0.5.9  * A59
D.1.4.0.6.0  * A60
D.1.4.0.6.1  * A61
D.1.4.0.6.2  * A62
D.1.4.0.6.3  * A63
D.1.4.0.6.4  * A64
D.1.4.0.6.5  * A65
D.1.4.0.6.6  * A66
D.1.4.0.6.7  * A67
D.1.4.0.6.8  * A68
D.1.4.0.6.9  * A69
D.1.5.0.7.0  * A70
D.1.5.0.7.1  * A71
D.1.5.0.7.2  * A72
D.1.5.0.7.3  * A73
D.1.5.0.7.4  * A74
D.1.5.0.7.5  * A75
D.1.5.0.7.6  * A76
D.1.5.0.7.6a  * A76 for date
D.1.5.0.7.6b  * A76 for datetime
D.1.5.0.7.7  * A77
D.1.5.0.7.8  * A78
D.1.5.0.7.9  * A79
D.1.5.0.8.0  * A80
D.1.5.0.8.1  * A81
D.1.5.0.8.2  * A82
D.1.5.0.8.3  * A83
D.1.5.0.8.4  * A84
D.1.5.0.8.5  * A85
D.1.5.0.8.6  * A86
D.1.6  Actions Used Only as INCLOSE Rules (A122 -- A148)
D.1.6.1.2.2  * A122
D.1.6.1.2.3  * A123
D.1.6.1.2.4  * A124
D.1.6.1.2.5  * A125 assigned element
D.1.6.1.2.6  * A126
D.1.6.1.2.7  * A127
D.1.6.1.2.8  * A128
D.1.6.1.2.9  * A129
D.1.6.1.3.0  * A130
D.1.6.1.3.1  * A131
D.1.6.1.3.2  * A132
D.1.6.1.3.3  * A133
D.1.6.1.3.4  * A134
D.1.6.1.3.7  * A137
D.1.6.1.3.8  * A138
D.1.6.1.3.9  * A139
D.1.6.1.4.0  * A140
D.1.6.1.4.6  * A146
D.1.6.1.4.7  * A147
D.1.6.1.4.8  * A148
D.1.7  Actions Used Only as PASSPROC Rules (A161 -- A171)
D.1.7.1.6.1  * A161
D.1.7.1.6.2  * A162
D.1.7.1.6.3  * A163
D.1.7.1.6.4  * A164
D.1.7.1.6.5  * A165
D.1.7.1.6.6  * A166
D.1.7.1.6.7  * A167
D.1.7.1.6.8  * A168
D.1.7.1.6.9  * A169
D.1.7.1.7.0  * A170
D.1.7.1.7.1  * A171
D.1.7.1.7.2  * A172
D.2  Quick Reference to Processing Rule Functions by Number
D.2.1  Actions Used Only As SEARCHPROC Rules (A6 -- A16)
D.2.2  Actions Used as INPROC, OUTPROC or SEARCHPROC Rules (A21 -- A37)
D.2.3  Actions Used as INPROC, OUTPROC, SEARCHPROC, PASSPROC Rules (A38-A62)
D.2.4  Actions Used Only as OUTPROC Rules (A71 -- A85)
D.2.5  Actions Used Only as INCLOSE Rules (A122 -- A148)
D.2.6  Actions Used Only as PASSPROC Rules (A161 -- A171)
D.3  Quick Reference to Processing Rules by Function-Keyword
D.3.1  Binary Conversion
D.3.2  Character Test
D.3.3  Comparison and Generation of Elements
D.3.4  Date Generation and Conversion
D.3.5  Default Value Generation
D.3.6  Dollar-and-Cents Conversion
D.3.7  Floating-Point Conversion
D.3.8  General
D.3.8a  Hexadecimal Conversion
D.3.9  Insertion of String
D.3.10  Length Test
D.3.11  Multiple Occurrence Conversion
D.3.11a  Packed Decimal Conversion
D.3.12  Personal Name Algorithm
D.3.13  Range Test
D.3.14  String Replacement, Inclusion, Exclusion, Code Translation
D.3.15  Time Generation and Conversion
D.4  FILEDEF Subfile Syntax and Semantics
D.4.1  Understanding FILEDEF Subfile Structure
D.4.2  Record Level Elements
D.4.3  Record Definition Elements
D.4.4  Key Element Definition
D.4.5  Element Definition
D.4.6  Structure Definition Elements
D.4.7  Linkage Section Definition Elements
D.4.8  Individual Index Linkage Elements
D.4.9  Qualifier Linkage Definition Elements
D.4.10  Sub-Index Linkage Definition Elements
D.4.11  Compound Index Linkage Definition Elements
D.4.12  Subfile Section Definition Elements
D.4.13  Processing Rule Procedure Definition Elements
D.4.14  SLOT Section Definition Elements
D.4.15  User Defined Processing Section
D.4.16  Subfile Section Selection Elements
D.4.17  FILE-PERMITS Definition Elements
D.4.18  Element-Information-Packet Elements
D.4.19  Index-Information-Packet Elements
D.4.20  View Definition Elements
D.4.21  Version Information Elements
D.5  A Guide to Coding Index Record Definitions
D.6  A Guide to Coding the Linkage Section Definition
D.7  Annotated File Definition Examples
D.7.1  GA.SPI.BIBLIOGRAPHY
D.7.2  GA.SPI.BIBLIOGRAPHY2
D.7.3  GA.SPI.PEOPLE
D.7.4  GA.SPI.RESTAURANT
D.7.5  XA.B14.AV
D.7.6  GA.SPI.MAIL
D.8  Passing Keys or Passing Locators to Indexes
:29  SPIRES Documentation

*  SPIRES File Definition

******************************************************************
*                                                                *
*                     Stanford Data Center                       *
*                     Stanford University                        *
*                     Stanford, Ca.   94305                      *
*                                                                *
*       (c)Copyright 1994 by the Board of Trustees of the        *
*               Leland Stanford Junior University                *
*                      All rights reserved                       *
*            Printed in the United States of America             *
*                                                                *
******************************************************************

        SPIRES (TM) is a trademark of Stanford University.

A  Introduction to SPIRES File Definition

A.1  Preface

The intent of this manual is to enable anyone with a good knowledge of SPIRES searching and updating to define a functional SPIRES file and manage its contents.

The file definition process for most files is fairly straightforward: analyze the structure of the records you want to have in your data base, define the characteristics of those records in the file definition language, then test your definition with sample data records. Next, by analyzing the requirements for retrieving those records, define the indexes required and the way in which information from the data records will be passed to the indexes. After testing the searching capabilities you may want to define various levels of access and protection for your file.

Before tackling a production file requiring complex techniques, experiment with a file of modest requirements. This manual is meant to accompany you through your first file definition; several appendixes give details of some powerful definition techniques, which rely on a previous mastery of the basics of file definition.

Since a file definition is itself a record in a SPIRES system-owned subfile, a knowledge of SPIRES searching and updating commands is essential for entering a file definition. This knowledge is also essential for the assessment of searching requirements for a file: you must formulate your searching needs in terms of the SPIRES command facilities you understand. For example, you might be likely to overlook compound indexing techniques as a possibility for a file if you had not used the extended capabilities of compound indexes in searching a SPIRES file. An informed file definer is first an experienced SPIRES user; you are encouraged to investigate the SPIRES facilities you will need with the SPIRES consultant.

Many experienced users are not aware of the internal features of SPIRES that provide the command language. Therefore, Part A of this document is devoted to linking conceptually the internal file definition options to the external command language. A cursory view is presented of those facilities of the file definition language which are invisible to the general user.

The section following the overview gives a timetable for the file definition process, outlining the topics to be considered at each stage of the definition. It is tempting to define a file all at once, that is, to define indexes and access privileges while still trying to decide the length of certain elements. This approach is haphazard at best and confusing at worst. Try to approach the tasks of file definition in the order in which they are presented in this manual.

If you encounter problems in learning file definition, you may contact the SPIRES consultant in SCIP User Services at Polya Hall for assistance. Extensive assistance in defining files is available for a fee through Contract Programming services.

Intimately tied to a file may be various input and output formats, as well as protocols. Formats and protocols, whose command languages are described in other SPIRES documents, can be used to present an attractive, concise and helpful interface between your file and its users.

The original file definition manual was written by J. R. Schroeder; the present document was written by J. R. Sack.

A.2  System Overview

SPIRES, the Stanford Public Information Retrieval System, is a generalized, online data base management system developed at Stanford University in the early 1970's.

The task of the original SPIRES development was to provide a file and file management system for Stanford's library automation project (BALLOTS). The versatility of the present SPIRES design can be gauged by the diversity of applications SPIRES now supports. Since 1972 well over 300 data bases have been implemented, including such applications as BALLOTS, the library automation project; bibliographic citations files, such as PLANTBIO; student record management; document inquiry and preparation systems, using SPIDOC; program library maintenance, MASTERLIST; survey data, both geo-physical and astronomical; inventory and materials tracking systems; directories, catalogs, mailing lists, and others. Present files range in size from a few dozen records to over three-quarters of a million. In sum, SPIRES serves the database needs of a large and diverse computing community.

SPIRES users design and maintain their own data bases; there is no centralized data base administrator. A number of the data base applications noted above were defined by individual users, non-professionals in data base systems, largely without individual programming aid from the data base professional staff.

Files presently in the system vary widely in complexity from unindexed files with two elements (a protocols subfile, for example), through files with ten indexes for as many elements (a personnel file, perhaps), up to files with over a hundred elements and nested data structures (the MARC or FILEDEF subfile, for example). Many of the present files are made up of more than one subfile, with the file definition describing the interrelationships between subfiles in a single file.

The command language available to SPIRES users for manipulation and management of these data bases is described in the SPIRES/370 Searching and Updating Manual; only commands that relate specifically to file definition and management will be discussed in this manual.

A.2.1  SPIRES Command and Definition Languages

The various "languages" that together support the development, management and use of data bases are:

Interactive Command Language

Most SPIRES users are familiar with this language, which is used for input and editing of data (TRANSFER, UPDATE, ADD), record retrieval (FIND, AND, OR, ALSO, FOR), and record display (SET FORMAT, SET REPORT, TYPE, OUTPUT, DISPLAY). The prospective file definer and manager should know the capabilities of the index and sequential searching commands.

Protocols Language

This facility is an extension of the command language; a protocol is a set of SPIRES, WYLBUR, MILTEN and ORVYL commands. The commands make up a program that can be executed by users who need guidance in manipulating a particular data base. Using protocols you can extend the normal SPIRES command language, tailoring the interface of specific users to specific files. This tailoring is particularly valuable when the end user of a production file has no special training in SPIRES commands.

Protocols are developed and tested interactively, and feature string manipulation and arithmetic functions as well as condition testing and branching capabilities for sophisticated interactive dialog and data base manipulation.

Format Definition Language

Any file user can provide formats for input and/or output of any data base's contents by defining a set of data transformations that map information from a source (a data base or terminal, for example) to a destination (a terminal or data base).

By means of input formats, a user can be prompted for the records' element values, and be given helpful diagnostic messages and reprompts should any error conditions be raised. Also, input formats provide a tool for converting pre-existing machine readable data to SPIRES-suitable input form. With commands for arithmetic and string operations, condition testing and branching, an input format can perform complex data validation algorithms that are unavailable even with elaborate sequences of file-definition input editing rules.

By means of output formats, products such as reports, directories and catalogs can be produced by mapping file element values, and any other computed values, onto a two-dimensional array that can be output onto a terminal, line-printer, or full-face CRT. Output formats can make SPIRES data base contents acceptable for use by batch programs, or can arrange the same data so it can be easily understood by an untrained user.

The formats facility should be considered an integral part of the file/user interface. Database contents can be graphically organized so that information structures and hierarchies are easily recognized by any user. For example, a bibliographic entry in the CATALOG file can be output in the form of a library catalog card so that any user would easily recognize which elements are the author, title, and call number by their place on the output screen. Using another format, information in a card catalog file can be selected for printing in the form of a call number on a book's spine label. A data base of computer charges can be used to produce a billing letter for an individual user and reports of system-wide charges for an accountant. Just as in an outline, formats can make use of indentation to highlight hierarchical relationships of elements in a record. Where data is logically understood as a table, table formats can be devised. Text such as catalogs can be output in a form suitable for photocomposition.

A.2.2  SPIRES Processors and Programs

There are several SPIRES processors that are important to the file definer and manager.

Online SPIRES

This is the SPIRES that most users know: the program with which most file searching and updating is done.

Batch SPIRES

This is an offline version of SPIRES that allows users to indicate a series of SPIRES commands that are to be executed during non-prime time blocks. Large searches and reports are also aided by this facility, since they are not hampered by the active file and user core size restrictions that are applied to interactive SPIRES users. Use of this facility is made easier by the OFFLINE file: user commands are added as a record to this file, then executed after file updating is done by JOBGEN (see below).

SPICOMP

This is an online program that compiles file definitions, format definitions, variable group definitions and protocols, returning error messages if the compilation process is not successful. Users no longer need SPICOMP, since all the compilation functions mentioned are incorporated into SPIRES.

JOBGEN

This is a batch program run every night against any file that has records in the deferred queue and does not have the NOAUTOGEN option set. JOBGEN submits a batch SPIBILD job (see below) that causes the passing of deferred queue records to goal and index records overnight, during non-prime time blocks; the same passing can be done online by SPIBILD. The file owner can issue online commands that cause JOBGEN to pass over the file on certain nights, or can indicate in the file definition that JOBGEN is not to be run on the file.

SPIBILD

This program can be called in either an online or batch form to pass records in the deferred queue of your file to the goal record and index record data sets.
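
The relationship between the deferred queue, JOBGEN and SPIBILD described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class SketchFile, the functions spibild_pass and jobgen_nightly, and the sample records are hypothetical names invented here, and the dictionaries stand in for data sets whose real layout is described later in this manual.

```python
from dataclasses import dataclass, field


@dataclass
class SketchFile:
    """Hypothetical stand-in for a SPIRES file; not the real storage layout."""
    name: str
    noautogen: bool = False                             # mirrors the NOAUTOGEN option
    deferred_queue: list = field(default_factory=list)  # records waiting to be passed
    goal_records: dict = field(default_factory=dict)    # key -> record
    index_records: dict = field(default_factory=dict)   # element value -> set of keys


def spibild_pass(f, indexed_element):
    """Pass every queued record to the goal and index record sets."""
    passed = 0
    for record in f.deferred_queue:
        key = record["key"]
        f.goal_records[key] = record
        f.index_records.setdefault(record[indexed_element], set()).add(key)
        passed += 1
    f.deferred_queue.clear()
    return passed


def jobgen_nightly(files, indexed_element):
    """Process only files that have queued records and have not opted out."""
    processed = []
    for f in files:
        if f.deferred_queue and not f.noautogen:
            spibild_pass(f, indexed_element)
            processed.append(f.name)
    return processed


# Illustrative data only.
restaurants = SketchFile("RESTAURANT")
restaurants.deferred_queue.append({"key": 1, "city": "PALO ALTO"})
private = SketchFile("PRIVATE", noautogen=True)
private.deferred_queue.append({"key": 7, "city": "MENLO PARK"})

processed = jobgen_nightly([restaurants, private], "city")
print(processed)
```

In this sketch the file marked noautogen keeps its queued record until its owner runs the pass explicitly, which corresponds to calling SPIBILD yourself.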

FASTBILD

This is a batch program that greatly reduces the CPU time and I/O's necessary for adding a large number of initial records to a new (empty) file. A protocol is available to generate the necessary JCL.

FASTADD

This is a batch program similar to FASTBILD: it provides a facility for adding a large number of records to a file using less CPU time and fewer I/O's than SPIBILD. Unlike FASTBILD, it can be used on files that already contain data. A protocol is available to generate the necessary JCL.

Host Language Interface (HLI)

This facility allows access to SPIRES data bases through batch programs in PL/I, COBOL, and FORTRAN. The batch programs can provide input and/or process output from SPIRES files.

A.2.3  The User Interface

A SPIRES user can access any data base permitted to him or her through a powerful set of English-like commands. The same command language applies to all SPIRES files, in keeping with the general-purpose intent of the SPIRES system. Though the number of commands is large, a searcher develops a feel for the grammar implicit in SPIRES commands. These commands allow you to enter and edit data, retrieve records, and display them, as outlined in A.2.1.

At Stanford these command facilities are supplemented by the text-editor, WYLBUR, which provides commands for data entry, offline listings, and remote job entry. Terminal communication is the function of another service program called MILTEN. The timesharing monitor, called ORVYL, supports the file system in which SPIRES data bases are stored, and controls virtual memory and resource scheduling for interactive SPIRES users. These three subsystems, WYLBUR, ORVYL and MILTEN, are distinct from SPIRES, and SPIRES does not duplicate their functions. At other installations these companion services are provided by different text editors, communications controllers and timesharing systems.

The SPIRES commands used for searching and updating any data base are covered in a user's manual of less than 170 pages ("SPIRES/370 Searching and Updating"), and can be learned in a four-session course.

A.2.4  File Definition Concepts and Facilities

The primary data base unit familiar to SPIRES users is the "subfile." The subfile is a set of goal records optionally linked to index records. Speaking generally, we can say that a goal record is what you retrieve from a search request--a book in the CATALOG subfile, a restaurant in the RESTAURANT subfile, a file definition in the FILEDEF subfile.

The subfile name is specified in a section of the file definition, appropriately called the "subfile section," in conjunction with a list of the computer account numbers of users or groups of users for whom you specify various levels of access to your file. There are many levels possible. You may make the entire subfile available to the public for searching and updating (the RESTAURANT subfile is an example of this privilege level). You may make all records searchable and only some updatable (such as the PEOPLE subfile), or make only certain records available for search and update (such as the FILEDEF and FORMATS subfiles). You could even prevent a user from seeing, searching and/or updating certain elements in all of the goal records of a subfile. Of course, a subfile's use can be restricted to only one account (the file owner's) if you wish.

The file definer must be aware of the difference between a file and a subfile. The difference is best shown through examples: the DOCUMENTS, MASTERLIST, and MASTERLIST SHARE CODES subfiles are all part of a SCIP-maintained file of software resources; the MARC and CATALOG subfiles belong to a single file maintained by the BALLOTS Center.

It is often useful to put subfiles that share some common information together in a single file, allowing SPIRES to look up or cross-reference information in one subfile when it is needed for input or output operations in another. Information is not stored redundantly in each subfile, because look-ups between subfiles in the same file can be performed; such a method of linkage also makes file updating more convenient, since common information need only be modified once per file, not once per subfile. It might be useful to link together subfiles of student, teacher, and course goal records in a single file, looking up a student's identification number or name when printing out a teacher's course list, and looking up a course number or title when printing out a student's transcript.

Search and retrieval of a reasonable number of records (fewer than five million) through an index is accomplished in seconds, with a maximum of five disk accesses to retrieve records in a file of a million records. For most applications SPIRES uses a "B-tree" method of record access. The chapter on SPIRES File Structure contains more information on this. [See B.6.] Rapid retrieval is possible if the file definer has specified that selected information in the goal records be passed to index records. If an index was not built when records were first added to the file, it can usually be added easily to the file at a later date, by using the original goal records to pass the requisite information to that index. In cases where indexes have not been built, perhaps because the frequency of searching for a certain kind of information would not have warranted the cost of building or maintaining an index, the file can be searched sequentially for this information. It is also possible to obtain a subset of the file via an index search, then examine or process that subset sequentially (using global FOR). However, a sequential search is usually slower and more expensive than an index search.

When records have been retrieved, they can be manipulated in several ways: they can be displayed at the terminal using any predefined output format; they can be put into a user's work-area or "scratch pad" for manipulation by the text-editor; they can be made available to a batch program; they can become a source for a new series of SPIRES commands, such as a sequencing command; or they can be used to generate reports.

Keeping information in the data base current is done by removing records from, adding records to, or updating records in the file. For adding and updating, records can be moved between a subfile and the text editor's work-area using simple commands. Data is collected or modified using the text-editor's commands. SPIRES also supports use of a CRT in full face mode.

The integrity of the data is constantly verified and protected by SPIRES. Redundant information stored in each file block ensures the validity of the data inside. Modifications made by users to the data base do not take place immediately (though they are immediately visible online). Changes are held until the deferred queue records are processed by JOBGEN overnight; this reduces the chance of accidental loss of data base contents as a result of a system crash or user error.
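The deferred-queue behavior just described can be pictured with a small Python model. This is only an illustrative sketch and not SPIRES code: the class name DeferredFile and its methods are hypothetical, and the real deferred queue is far more elaborate. It shows only the idea that a change is visible online at once but reaches the stored goal records later.

```python
class DeferredFile:
    """Toy model of a file whose updates wait in a deferred queue."""

    def __init__(self):
        self.goal_records = {}    # key -> record, as of the last overnight pass
        self.deferred_queue = {}  # key -> pending record (None marks a removal)

    def update(self, key, record):
        # The change is queued; the stored goal records are not touched yet.
        self.deferred_queue[key] = record

    def remove(self, key):
        # A removal is queued in the same way as an update.
        self.deferred_queue[key] = None

    def display(self, key):
        # Online users see the pending version first, so changes look immediate.
        if key in self.deferred_queue:
            return self.deferred_queue[key]
        return self.goal_records.get(key)

    def process_queue(self):
        # Stand-in for the overnight pass: apply each pending change, then clear.
        for key, record in self.deferred_queue.items():
            if record is None:
                self.goal_records.pop(key, None)
            else:
                self.goal_records[key] = record
        self.deferred_queue.clear()


f = DeferredFile()
f.update("1", {"NAME": "Blue Fox"})
visible_before = f.display("1")       # already visible online
stored_before = dict(f.goal_records)  # but not yet in the goal records
f.process_queue()
stored_after = dict(f.goal_records)
```

Until process_queue runs, the stored goal records are untouched, which is the property that limits the damage a crash or a mistaken update can do.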

Data can also be validated on entry by specifying certain tests that the input values must pass. These tests can be specified in the file definition, or in an input format or protocol used to add records to the file. They may be as simple as tests based on the number of occurrences of an element, an element's length, a required range for element values, or inclusion or exclusion of elements that contain certain characters. Editing of input data can also be specified: values can be converted to binary; personal names can be changed to a canonical form; the date or time of day can be supplied, and many other kinds of editing can take place. Similar to these input processing rules are index passing rules that specify how or what kind of information is passed from an element in the goal record to an element in an index. You can specify that groups of words, individual words or phrases, all words, or all words over a certain length be passed; you can also include or exclude a certain set of words, delete or nullify punctuation, or force capitalization, when an element or elements are passed to an index.
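The index passing options listed above (individual words, words over a certain length, excluded words, punctuation handling, forced capitalization) can be sketched in Python. The function pass_words and its parameters are hypothetical names chosen for this illustration; in SPIRES these choices are expressed as processing rules in the file definition, not as program code.

```python
import string

# Map every punctuation character to a blank.
_PUNCT_TO_BLANK = str.maketrans(string.punctuation, " " * len(string.punctuation))


def pass_words(value, min_length=1, exclude=(), capitalize=True):
    """Return the words of an element value that would be passed to an index."""
    cleaned = value.translate(_PUNCT_TO_BLANK)
    if capitalize:
        # Force capitalization of both the value and the exclusion list.
        cleaned = cleaned.upper()
        exclude = [word.upper() for word in exclude]
    excluded = set(exclude)
    # Keep words that are long enough and not on the exclusion list.
    return [w for w in cleaned.split() if len(w) >= min_length and w not in excluded]


words = pass_words("The Art of Computer Programming, Vol. 1",
                   min_length=3, exclude=("the",))
```

Here words holds ART, COMPUTER, PROGRAMMING and VOL: the short words "of" and "1" fall below the length limit and "the" is on the exclusion list.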

A.3  The Process of Defining a File

A complete file definition is never easy. But if you approach the process in the order outlined below, trying not to define ahead of your comprehension, you will avoid general confusion. If you test your first file definition at the different stages or "test points" indicated, you will be able to locate problems more easily and review the materials in a particular section of this manual or ask the SPIRES consultant in User Services for help.

Experienced file definers go through all of the following steps, and usually in the order presented.

A.3.1  Design Analysis

Analyze your data base with respect to the following:

a) What are the elementary parts of each "entry" in your file?

An entry such as a restaurant in the RESTAURANT subfile is usually called a "record." The logically related entries or records that are the goal of search requests in your file are called "goal records." For example, an entry or record for an individual restaurant in the RESTAURANT subfile is a goal record in that subfile.

The elementary parts of a record in the phone book are a name, an address, and a phone number. In a SPIRES file, each of these becomes an "element" and is assigned an "element name" in the file definition.

b) How many times does each element occur?

In most entries in a phone book for example, each name has a single address and phone number. But if we were making a file of students and the courses they take, the "courses" element in the record might occur four, five, or six times, or perhaps not at all.

So, in your file perhaps, some elements must occur once or twice, some must occur at least once but perhaps many times, and some elements may be entirely optional. A file can have elements with any combination of these possibilities.

c) Do you know the length of some of the elements?

Elements like a telephone number can be fixed in length, just as a social security number can be. Elements like dates and most numbers are of fixed length if you tell SPIRES to change them to binary (a fixed internal form) before storing them. If some elements only have a limited number of values, you may want to have SPIRES turn the value into a fixed-length code.

The "length" of an element is always the length in bytes (or characters) as the value will be stored on disk. It is cheapest to process and store elements that are either fixed in occurrence (see "b" above) or fixed in length, or fixed in both occurrence and length. Elements that vary in either or both length and occurrence are more expensive to store and process. Optionally-occurring elements that may vary in length and occurrence are the most expensive to process and store.

d) Do some elements have only certain allowable values or forms?

SPIRES can be told to check the validity of input data if you can specify the criteria for validation. Elements like a phone number and a social security number have "-" in certain required places and are of a known length. Zip codes are of a certain length and contain only numerals. Course numbers at certain institutions might always be three letters, a blank, then three numbers. You can tell SPIRES to reject input to an "age" element when the value is greater than 100 or less than 0, or input to a "number of children" element when the value is negative or greater than fifteen. You may want to supply automatically a default value if an element is not input, or override an input value if one is supplied: the date and time a record is added to your file can be supplied as input data by SPIRES.
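The kinds of tests described in this item can be written out as a short Python sketch. The function names are hypothetical, and the five-digit zip code and the "497-4420" phone layout are assumptions made for the illustration; in SPIRES the same checks would be requested through processing rules.

```python
import re


def valid_age(value):
    # Digits only, and within the range 0 through 100.
    return re.fullmatch(r"[0-9]+", value) is not None and 0 <= int(value) <= 100


def valid_zip(value):
    # Assumed here to be exactly five numerals.
    return re.fullmatch(r"[0-9]{5}", value) is not None


def valid_course(value):
    # Three letters, a blank, then three numbers.
    return re.fullmatch(r"[A-Za-z]{3} [0-9]{3}", value) is not None


def valid_phone(value):
    # Assumed layout: three numerals, "-", four numerals.
    return re.fullmatch(r"[0-9]{3}-[0-9]{4}", value) is not None
```

A value such as "-1" or "101" fails valid_age, and "CS 101" fails valid_course because it has only two letters.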

e) Do some elements occur with other elements in a "structure"?

Elements that are grouped together form a "structure." Taken together, a street address, city, state and zip-code might be called an "address"; in a file in which several addresses occur in each record (perhaps a home address and business address in a mailing list file) there must be a way to associate or bind the first city input with the first state and zip-code input, and the second city with the second state and zip-code. This logical binding of different elements is a structure. The university affiliation of a professor may always be paired with his or her name in a subfile of conference participants, for example. Or, a job-code might always occur with a salary figure in a record of a person's employment history.

Structures can be nested within structures: the record of a student's grades for a single term, making up a course-number/grade structure, could itself be a structure that occurs several times in a goal record that is a student's transcript for several terms of work.
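The nested transcript example can be pictured as nested Python data. This is a sketch only: the element names (STUDENT-ID, TERM, TERM-NAME, COURSE, COURSE-NUMBER, GRADE) are hypothetical, and a real SPIRES structure is declared in the file definition, not built in a program.

```python
# A goal record (a transcript) holding a multiply occurring TERM structure,
# each occurrence of which holds a multiply occurring COURSE structure.
transcript = {
    "STUDENT-ID": "123-45-6789",
    "TERM": [
        {
            "TERM-NAME": "Autumn",
            "COURSE": [
                {"COURSE-NUMBER": "CSC 101", "GRADE": "A"},
                {"COURSE-NUMBER": "MTH 120", "GRADE": "B"},
            ],
        },
        {
            "TERM-NAME": "Winter",
            "COURSE": [
                {"COURSE-NUMBER": "CSC 102", "GRADE": "B"},
            ],
        },
    ],
}


def grades_for(record, term_name):
    """Return (course number, grade) pairs for one term of a transcript."""
    for term in record["TERM"]:
        if term["TERM-NAME"] == term_name:
            # Each grade stays bound to its own course number.
            return [(c["COURSE-NUMBER"], c["GRADE"]) for c in term["COURSE"]]
    return []


autumn = grades_for(transcript, "Autumn")
```

Because each grade lives inside the same structure occurrence as its course number, the pairing cannot be lost, which is the "logical binding" that structures provide.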

f) What single element is best suited to be the key of the record?

The key of the record is a unique identifier by which one record can be distinguished from all other records. The key must be chosen carefully, since it has many consequences: the element designated as the key must only occur once in each record and the value for that key must be unique among all the records in the subfile.

In a file with a goal record of employee data, the name of a single employee is not likely to be unique, but his or her social security number is unique. So the social security number may be the best choice for the key. In a file whose goal record is comprised of items ordered for a store, you might be tempted to use a purchase order number as the key, but if more than one item were listed on a purchase order then the goal record, which is the result retrieved from a SPIRES search (see "a" above), could not be individual items but would have to be purchase orders.

If you don't have an element that can be the key, that is, an element whose uniqueness could be guaranteed by the nature of your data, you can have SPIRES assign an integer or "slot" key for each added record; this technique is used by the RESTAURANT subfile, a "slot" subfile. SPIRES simply assigns the first record the key "1", the second record the key "2" and so on.

An "augmented key" can also be coded if you must use a non-unique key. SPIRES will simply place a suffix on any non-unique key you enter; this suffix will make the key unique. This technique is useful when personal names are used as keys, or when accession numbers are being assigned.
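The two techniques for obtaining unique keys, slot keys and augmented keys, can be sketched in Python. The names SlotSubfile and augment_key are hypothetical, and the "-2", "-3" suffix style is an assumption made for the illustration; this text does not say what suffix SPIRES actually uses.

```python
class SlotSubfile:
    """Toy model of a slot subfile: each added record gets the next integer key."""

    def __init__(self):
        self.records = {}
        self.last_slot = 0

    def add(self, record):
        # The first record gets key 1, the second key 2, and so on.
        self.last_slot += 1
        self.records[self.last_slot] = record
        return self.last_slot


def augment_key(key, existing_keys):
    """Return key unchanged if it is unused; otherwise add a suffix (assumed form)."""
    if key not in existing_keys:
        return key
    n = 2
    while f"{key}-{n}" in existing_keys:
        n += 1
    return f"{key}-{n}"


restaurants = SlotSubfile()
first = restaurants.add({"NAME": "Blue Fox"})
second = restaurants.add({"NAME": "Red Hen"})

names = {"SMITH, JOHN"}
new_key = augment_key("SMITH, JOHN", names)
```

In this sketch first is 1, second is 2, and the second John Smith receives the key "SMITH, JOHN-2".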

A.3.2  File Definition

Step One

Define your goal record elements using the file definition language to describe the data characteristics you determined in the above steps. [See A.3.1.] The language for goal record definition is described in the first three chapters of Part B, "Goal Record Concepts and Definition," "Goal Record Keys, Slot and Removed Records," and "Structures." [See B.1, B.2, B.3.]

Step Two

Add processing rules to the description of each element. These processing rules or "actions" are called INPROCS or OUTPROCS depending upon whether they affect the input or output of an element. Study "Processing Rules: INPROC, INCLOSE, OUTPROC" [See B.4.] and become familiar with the use of the appendices "Processing Rules: Complete Listing by Number," "Quick Reference to Processing Rule Functions by Number," and "Quick Reference to Processing Rules by Function-Keyword." [See D.1, D.2, D.3.]

Step Three

Test your basic goal record description. To do this you must first study "The FILEDEF Subfile and File Compilation." [See B.5.] Then:

Two additional chapters may be of help at this point: "File Definition Syntax and Semantics" and "Recompile of an Existing File's Definition." [See D.4, C.1.]

Step Four

Study "File Structure: Tree and Slot, Goal and Index Records," [See B.6.] in order to understand the structure of the ORVYL files that SPIRES has created according to your file definition. That study prepares you for defining your file's index records.

Step Five

Study "Understanding and Coding Index Records" [See B.7.] for an understanding of the various indexing techniques and when to use each: Simple Indexes, Qualifiers, Sub-Indexes and Compound Indexes. In that chapter you will learn how to describe the structure of the "index records" which are associated with the "goal record" you defined and tested previously.

Step Six

Study "Understanding and Coding the Linkage Section" [See B.8.] so you can use the file definition language to describe:

Step Seven

Make use of the appendices "A Guide for Coding Index Record Definitions" and "A Guide for Coding the Linkage Section" [See D.5, D.6.] to code the index and linkage sections of your file. These appendices provide (almost) guaranteed recipes for these two difficult-to-code parts of a file definition. You determine the indexing techniques you need, and use the recipes for index and linkage sections that are indicated.

Step Eight

Transfer your goal record file definition and add the index and linkage sections to it, update the definition, then erase and compile your file. (Before this, you may want to save any data that you have entered.) Use the online SPIBILD processor to pass information from the deferred queue and goal records to the index records you just defined, and build the indexes and goal records into a searchable file. Use the SPIRES searching and browsing commands to check your SRCPROCS and PASSPROCS, verifying that the indexes contain the information you intended.

Step Nine

Make use of the file definition language described in "Defining Subfile Privileges" [See B.9.] to specify accounts or groups of accounts that can: use the file, search it, update it, see only some elements, and update only some elements. Modify your file definition, adding this new code, and recompile it. You will be able to verify that you specified the correct privilege codes after the system FILEDEF file has been updated--that is, the day after you make any changes.

Step Ten

Make use of the file-manager commands described in "SPIRES File Management" [See B.10.] to monitor and control the status, activity and processing of your file.

A.3.2.1  The File Definer: A SPIRES Subsystem to Simplify File Definition

A subsystem of SPIRES, named the File Definer, can simplify the process of file definition. By using a concentrated language, based on a subset of the standard file definition language discussed in this manual, you can specify basic information about the file design, such as element names, which elements should be indexed, etc., and the File Definer will generate a complete file definition for you, saving you the trouble of writing and coding goal and index records and the linkage section.

The File Definer subsystem is available only when you are in SPIRES; to use it, you issue the SPIRES command ENTER FILE DEFINER. Below is a sample File Definer session:

The five lines of input shown would be used by File Definer to generate a file definition of about 30 lines, including a goal record definition with the elements NAME, PHONE and COMMENT, an index record definition for the NAME element, a linkage section and a subfile section. That is much simpler than coding the record definitions and linkage sections yourself. [See B.7, B.8.] The file definition generated may then be added to the FILEDEF subfile and compiled.

Most people writing file definitions will want to use the File Definer at some point because it relieves them from the tedious task of coding index record definitions and linkage sections. Even if you cannot code the entire definition using File Definer (it has some limitations, e.g., you cannot directly code SEARCHTERMS, SRCPROC and PASSPROC statements), you can use it to create a file definition ranging from skeletal to almost complete for any file.

Naturally, there is a great deal of educational value in writing an entire file definition (goal records, index records, linkage section and all) yourself, especially if you want to learn and understand SPIRES file structure. However, letting the File Definer do the tedious work and studying the file definition it generates can be educationally rewarding as well, especially if you do so as you read this manual.

The File Definer has its own reference manual, entitled "File Definer", which is written for people already familiar with the concepts and language of file definition as taught in this manual. A primer to the File Definer, aimed at people who primarily want to create a SPIRES file quickly but who are unfamiliar with file definition concepts and language, may be found in the SPIRES primer "A Guide to Data Base Development".

A.4  Glossary of Important File Definition Terms

A.4.1  Element

A data "element" is the smallest unit of named data known to SPIRES. Data elements (or "fields" as they are called in other systems) may consist of characters, numbers or bits; they may be fixed in length or varying in length. They may also be required to occur more than once or be completely optional. Elements are things such as a person's name, a social security number, a salary, or an abstract of an article.

A.4.2  Record

A "record" consists of a series of data elements and their values. Usually, the record is a collection of all the data elements that pertain to a single entity in the entire collection of data. Thus, a record could be made up of one person's name, address, social security number, and salary. Another record in the same collection of data would have the same elements, but for a different person.

A.4.3  Structure

Within a record, elements may be grouped together in "structures," which are referenced in the same manner as elements. For example, if a person has several offices and phone numbers, the office and phone number elements might be grouped or paired together in a structure to keep the proper phone number associated with an office.

Elements that are not in structures are called "record level" elements. Elements inside a structure are "lower level" elements with respect to the record level elements. Structures that are not inside of other structures are record level structures. Structures inside of another structure are lower level structures with respect to the containing structure.

A.4.4  Key

Each record in a collection of data maintained by SPIRES has a required singularly occurring data element known as the "key." Each key within a collection of goal records must be unique to the goal record--no two records in the same goal record collection can have the same key. The key element in a personnel goal record would probably be the social security number, since it will be unique for each person. If the data records themselves do not contain unique elements that are useable as keys, SPIRES can supply unique consecutive numbers as the values for the keys in a set of goal records.

A.4.5  Goal Record

"Goal Record" is the SPIRES term for a data record that could be found as a result of a SPIRES search operation. In a collection of data about restaurants, the goal records would probably be restaurants; in a collection of data about library holdings, a goal record would probably be a book. A single record retrieved from a SPIRES search is a goal record. All of the records that have the same structure as the retrieved record or records, and hence "could" have been retrieved by a search, are referred to collectively as "the goal record" or "the goal record data set."

A.4.6  Index Record

An "index record" consists of a series of data elements and their values, just as a goal record does. However, in index records one of the data elements contains as its value an internal pointer (or pointers) to a goal record (or records). An index built out of names in the goal record would contain one index record for each name that occurs in the goal record as well as a pointer to the goal record (or records) in which the name occurs. The user has no direct interaction with indexes, though they are used by searching commands.

A.4.7  Record-type

A "record-type" must be distinguished clearly from a "record." A record-type refers to a collection of records, and may refer to either goal records or index records. There are "goal record record-types" and "index record record-types." The record-type is a collection of records that all have the same structure. In a personnel file, the goal record of social security numbers, names and salaries makes up one record-type, while a name index and a salary index are two other record-types.

A.4.8  Index

An "index" is a collection of index records created and maintained by SPIRES; one usually does not manipulate them directly. Indexes act as a "go between" between a searching command and the goal records. The values in a search request are looked up in the index, and the values in the index point to particular goal records. This is similar to the index one might find at the end of a book: such an index contains words or concepts, and each word or concept has a list of the pages on which it occurs.

There are two types of indexes available: simple and compound.

A.4.9  Simple Index

A "simple index" (or more specifically, a simple index record-type) contains one record for each entry in the index. For each unique name in a personnel file, there is one record in a NAME index that points to each goal record containing that name. The key of a simple index is the thing being indexed, a person's name, for example. Simple indexes are cheaper to search than compound indexes.

A.4.10  Compound Index

A "compound index" may index several elements, and is usually used to index short numeric values or coded elements such as salaries and dates. In a compound index, there is one record for each element being indexed. If a personnel file has the elements salary, job-class, and date-hired all in a compound index, then the compound index will have three records (one for salary, one for job-class, and one for date-hired). Each record in the compound index contains all the values that exist in the goal records for a particular element; this is why compound indexes are not recommended for large files -- the compound index records become too large to be searched quickly. Compound indexes may be searched with all of the relational operators, but are more expensive to search than simple indexes.
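The difference between the simple index of A.4.9 and the compound index described here can be sketched in Python. The data and the function names are hypothetical, and real index records hold internal pointers, not the record keys used below.

```python
from collections import defaultdict

goal_records = {
    "111": {"NAME": "JONES", "SALARY": 30000, "JOB-CLASS": "A1"},
    "222": {"NAME": "SMITH", "SALARY": 25000, "JOB-CLASS": "B2"},
    "333": {"NAME": "JONES", "SALARY": 25000, "JOB-CLASS": "A1"},
}


def build_simple_index(records, element):
    # One index record per unique value; each points to the goal records holding it.
    index = defaultdict(list)
    for key, record in records.items():
        index[record[element]].append(key)
    return dict(index)


def build_compound_index(records, elements):
    # One index record per indexed element; each holds all values of that element.
    return {element: build_simple_index(records, element) for element in elements}


name_index = build_simple_index(goal_records, "NAME")
compound = build_compound_index(goal_records, ["SALARY", "JOB-CLASS"])

# A relational search (SALARY greater than 26000) scans one compound index record.
high_paid = [key
             for value, keys in compound["SALARY"].items() if value > 26000
             for key in keys]
```

The simple NAME index has two records (JONES and SMITH), while the compound index has exactly two records however many goal records exist, one for SALARY and one for JOB-CLASS. Each of those records grows with the file, which is why compound indexes suit smaller files.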

A.4.11  Combined Record-type

Combined record-types are record-types that are stored in the same ORVYL file. The file owner specifies in the file definition which record-types are to be combined together; if no combinations are defined, then each record-type occupies its own ORVYL file. There is a system limit of 13 physical record-types, so if a goal record is to have more than eight indexes, some of them must be combined into the same ORVYL file. SLOT record-types may not be combined with any other record-type. Record-types that are physically combined are kept conceptually separate by the logic built into SPIRES. Record-types that are physically combined with each other are different "logical record-types," even though they may occupy the same "physical record-type" (the same ORVYL file). There is a system limit of 64 logical record-types.

A.4.12  Removed Record-type

A "removed record-type" has nothing to do with records that have been deleted from the file with the REMOVE command. Removed record-types may provide increased access efficiency for some data. SPIRES access efficiency depends on a large number of records being packed in a single file block. SPIRES provides the file definer with an option of keeping only the key of a record in the file block, plus a pointer to the remainder of the record's data. The remainder of the data is kept in the "residual data set." This is called "record removal" and allows many more (partial) records to be kept in a file block than if whole records were kept intact.
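Record removal can be pictured with a short Python sketch. The class RemovedRecordType is a hypothetical name, and the list index used as a "pointer" stands in for whatever internal pointer SPIRES keeps; only the split between key-plus-pointer and residual data is taken from the description above.

```python
class RemovedRecordType:
    """Toy model: the file block keeps key and pointer; the data lives elsewhere."""

    def __init__(self):
        self.block = {}     # key -> pointer into the residual data set
        self.residual = []  # the residual data set

    def add(self, key, data):
        # Store the bulk of the record in the residual data set.
        self.residual.append(data)
        # Keep only the key and a pointer in the file block.
        self.block[key] = len(self.residual) - 1

    def fetch(self, key):
        # Follow the pointer from the block to the residual data.
        return self.residual[self.block[key]]


rt = RemovedRecordType()
rt.add("111", {"NAME": "JONES", "ABSTRACT": "x" * 500})
rt.add("222", {"NAME": "SMITH", "ABSTRACT": "y" * 500})
```

The block entries stay small no matter how long the abstracts are, so many more keys fit in one block.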

A.4.13  Subfile

A "subfile" is defined as one set of goal records, the indexes to those goal records, and the access and update restrictions that apply to the data elements. Among the record-types that are brought into association in a subfile, a clear distinction is made between goal records and index records, since the user can only manipulate goal records, not index records. If a goal record has no indexes built for it, then the subfile consists of only the goal record and the access restrictions to it.

A.4.14  File

Several subfiles may relate to the same data and may be placed in one "file" or data base. A file thus contains all the subfiles that relate to the same data. A user can only work on a single subfile at a time, even though there may be several subfiles defined in one file. It is also possible (and is frequently the case) that only a single subfile is contained in a file.

A.4.15  Hierarchy of File Definition Components

The following chart shows the relationship of the parts of a file. The chart depicts a single file, with two subfiles. The first subfile has three record-types: one goal record and two index records. The second subfile has only a single record-type, the goal record.

The goal record of the first subfile is composed of a record with a key and several elements. One of the elements is a structure, and is thus itself composed of elements. The first index record of the first subfile is a record with only a key and a pointer.

              ------  ----------   -------------    -----
                                   | goal record-> | KEY
              |       |            |               | ELEMENT
              |       | "goal      |               | ELEMENT     ---
              |       |  record    |               |    :        |
              |       |  record-   |               | STRUCTURE---|ELEMENT
              |       |  type"     |               | ELEMENT     |ELEMENT
              |       |   or    -> |               |    :        |--:
              |       | "goal      | goal record   |
              |       |  record    | goal record   |
              |       |  dataset"  |      :        ---
              |       |            |      :          :
              |       |            |-----------      :
              |   S   |            ------------
              |   U   |            | index record  |---
              |   B   | "index     |             ->| KEY
              |   F ->|  record    |               | POINTER
              |   I   |  record- ->|               |----
      FILE -->|   L   |  type"     | index record  [
              |   E   |   or       | index record  [
              |       | "index"    | index record  [
              |       |            |      :
              |       |            |-------------
              |       |
              |       |             -------------
              |       | "index     |
              |       |  record    |
              |       |  record-   |
              |       |  type"   ->|   as above
              |       |   or       |   for
              |       | "index"    |   index-record
              |       |            |   record-type
              |       |            |
              |       -----------  -------------
              |
              |       -----------   -------------
              |       |             |
              |       | "goal       |
              |       |  record     |
              |   S   |  record-    |
              |   U   |  type"    ->|  as above
              |   B ->|    or       |  for
              |   F   | "goal       |  goal
              |   I   |  record     |  record
              |   L   |  data set"  |  record-type
              |   E   |             |
              |       |             |
              ------- -----------   ------------

B  Defining a SPIRES File

B.1  Goal Record Concepts and Definition

To begin our consideration of goal record definition, let's take a telephone directory as our example; the structure of a directory is something with which we are all familiar. Certain assumptions we are going to make for our directory file will simplify its structure.

What information is stored in a telephone directory? Usually, and most simply, a name, address, and telephone number make up each entry.

B.1.1  Element Names, Occurrences, and Lengths

If we have a single "record" or entry that consists of the three elements, name, address, and phone number, how many times will each element occur in a single record? Let's look at the question this way:

Most likely (in the simplest case), the name and address elements will occur once and only once: for each name there is one address. But it would not be unusual for a person to have several phone numbers, so we don't know how many phone numbers to expect or allow for.

Now, what can we say about the length of each of these elements? Here is a review of what we have so far:

We really don't know the length of the longest possible name and address. We could probably specify a length that couldn't be exceeded, but SPIRES does not require us to. If you do not specify a length, SPIRES stores only the length of the value input, plus two bytes of information about the length. Let's not specify a length for the NAME and ADDRESS elements.

The question of the length of the phone number requires some decision; let's agree that a phone number is an eight character (or eight "byte") value, such as "497-4420". (If we wanted to include the area code with each number, then the value is thirteen bytes long: "(415)497-4420" for example.) Our "file definition" now looks like this:

B.1.2  File and Record Name Statements

Let's see what this file will look like in the SPIRES file definition language. The first thing we must "code" or specify is the name of the file. This name is always an alphanumeric string preceded by the file definer's account number in the form GG.UUU. This account becomes the only account that by default can modify or compile the file definition. The file name is coded first in the definition, and looks like this:

Here, GG.UUU is the account, and "DIRECTORY" is the name chosen for this particular file. The file name (including the account) may be up to 23 characters long (longer names will be truncated by SPIRES). No one but the file owner need ever see this name. This is not the name used to select the subfile.

A file consists of sets of records; each set is called a "record-type." Most often there is a goal record record-type and several index record record-types per subfile. To simplify our discussion at this point, we will call a record-type a "record." (Though this is not true, strictly speaking. Many goal-records make up a single goal-record record-type. [See A.4.] Each of these records has a unique name.) The goal record is often called "REC01", and its name is coded in the RECORD-NAME statement: "RECORD-NAME = REC01;".

We will see later why this name is most common, and some circumstances in which you might want to choose a different name. [See B.2.2.]

B.1.3  Element Categories

Within each record we define that record's elements. All of the elements must be in one of three categories: FIXED, REQUIRED, or OPTIONAL. Elements are segregated into these categories by their occurrence and length attributes as follows:

Notice that "Fixed" and "Varying" for FIXED and REQUIRED refer to the length attribute of the element being defined, not its occurrence attribute.

The length of an element in the FIXED section of the record definition must be specified. If the occurrence is not specified for an element in the FIXED section, then the number of occurrences is assumed to be one. On the other hand, an element defined in the REQUIRED section of the record definition need not have either length or occurrence attributes specified; the element must occur--but its occurrence and length may vary from entry to entry. Elements in the OPTIONAL section need not have either length or occurrence specified, because that element may or may not occur in a given record.

If you do specify an occurrence for an element in the OPTIONAL section or the REQUIRED section, it has a special meaning, which is different from an occurrence specification for a FIXED element. If the number of occurrences is one, then the element is "singularly occurring": for a REQUIRED element, this means that it must occur once and only once in each record; for an OPTIONAL element, this means that if it occurs at all, it can occur only once. If the number of occurrences is more than one, then the element is multiply occurring: for a multiply-occurring REQUIRED or OPTIONAL element, the number of occurrences is not checked. You may have SPIRES do a minimum and/or maximum occurrence check by specifying certain processing rules, usually A123 and A146. [See B.4.14.]

A small amount of storage space is saved when REQUIRED or OPTIONAL elements are specified as singularly occurring.

Though not required for a valid file definition, an OPTIONAL section with a dummy element should be coded in every file or record definition for which an OPTIONAL section would not otherwise occur. By coding this section you have the flexibility to add elements to the record definition even after data has been stored in the file. Such elements are always added to the OPTIONAL section; the dummy element is never coded with length and occurrence attributes. [See B.1.7.] No more than 254 elements may be coded in an OPTIONAL section.

Remember that we decided "PHONE" is fixed in length, but is not fixed in the number of times it can occur, though it must occur at least once. Elements that must occur, but for which a firm occurrence count can't be specified, are placed in the REQUIRED section, even if they can be fixed in length.

We code the category name at the head of the list of elements it describes:
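As a rough sketch, the REQUIRED category for our directory might be headed like this. The ELEM statements are explained in B.1.5; the element names are the ones used later in this chapter, and the layout is only an assumption. A FIXED section, if we had one, would come before REQUIRED, and an OPTIONAL section would come after it.

```spires
REQUIRED;
  ELEM = NAME;
  ELEM = ADDRESS;
  ELEM = PHONE-NUMBER;
```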

Note the order in which the categories must appear: FIXED, then REQUIRED, then OPTIONAL.

If you do not code any categories, all the elements will be OPTIONAL, with the exception of the key, which will be REQUIRED. [See B.1.4.]

B.1.4  Record Keys

In addition to placing each element in an appropriate category, we must choose an element that will be the "key" of the record. A key is required for every record (whether goal or index) defined in the file; it must be unique in value and occur only once in each record.

Now, in our phone directory, we would most likely pick the name as the key. What consequences does this have? The key of a certain record is a unique value for that record; no two records or entries in the file can have the same value for the key element. Thus, no two records in our telephone book could have the same name. (Here is where we allow ourselves to simplify with the assumption that no two people in our phone directory will have the same name. This would not be a realistic assumption for a real phone directory. The solution to this problem is found in the next chapter, "Goal Record Keys, Slot and Removed Records.")

In addition to being unique in value among the goal records, the key must always be singly occurring; that is, the occurrence attribute of the key must be one. For this reason, an occurrence number need not be specified for the key element. A key may be varying in length, such as the name in our phone directory. But a length attribute may be specified if it is known. In our phone directory, NAME would be coded as the key element as follows:
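A minimal sketch, modeled on the "KEY = PHONE-NUMBER;" statement shown in B.2.1. No OCC or LEN is given, since the key occurs once by definition and the name varies in length:

```spires
KEY = NAME;
```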

The key element is always coded as the first element in the category in which it is defined. Since the key must be singly occurring, but may be fixed or varying in length, it is coded as the first element in either the FIXED or REQUIRED categories.

Let's review our definition:
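Assembled from the statements discussed so far, the definition at this point would read roughly as follows (the indentation is an assumption, chosen only for readability):

```spires
FILE = GG.UUU.DIRECTORY;
  RECORD-NAME = REC01;
    REQUIRED;
      KEY = NAME;
```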

Since we don't have any elements in the FIXED section, we don't code it.

B.1.5  Element Name, Occurrence and Length Statements

The next element to code is ADDRESS. For this element we can specify that it must occur once and only once, but we can't specify a length attribute. The name of an element is specified in the ELEM statement. (You are allowed to use ELEMENT instead of ELEM; however, SPIRES will change it to ELEM when you add it to the FILEDEF subfile later.) The occurrence attribute of an element is specified in the OCC statement. (Similarly, OCCURS, OCCURRENCE and OCCURRENCES may be used instead, though they will be changed to OCC.) The ADDRESS element would be coded in the REQUIRED section as follows:
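A sketch of the statement, in the same form as the ADDRESS line of the record definitions shown in B.2.1:

```spires
ELEM = ADDRESS; OCC = 1;
```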

If we had not specified "OCC = 1", then ADDRESS could occur one or more times. (Since it is coded in the REQUIRED section, it must occur at least once if the occurrence attribute is not coded.)

We now must code the phone number element. For this element we can only say, "it must occur." We don't know how many times. In such a case, "OCC = 1" is not coded since this would limit the element to one and only one occurrence. We have decided that the length of the phone number in bytes (characters) as it will be stored on disk is eight characters. The length attribute of an element is specified in the LEN statement. (LENGTH is also allowed but will be changed to LEN.) Since it may vary in occurrence, the phone number element is coded in the REQUIRED section thus:
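A sketch of the statement, assuming the element is named PHONE-NUMBER as in the rest of this chapter. Note that no OCC statement appears:

```spires
ELEM = PHONE-NUMBER; LEN = 8;
```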

Remember that the length attribute, coded by "LEN =", is the length as the value will be stored on disk, which is not necessarily the length of the value as input when the record is added; processing rules, called "actions", can manipulate the input values.

If we specify "LEN = 8" for the PHONE-NUMBER element, then all element values stored on disk will be eight bytes long. If a value is input that is longer than eight characters, the record will be rejected for input, and an error message will be issued. If a value is input that is shorter than eight characters, SPIRES will pad it with blanks to a length of eight bytes. [To allow null values for an element's input, omit the LEN statement; otherwise, SPIRES will fill the entire length with blanks, which is not the same as a null value.] Manipulation of input values can be effected more intelligently when "actions" are coded. [See B.4.]

Embedded blanks are not permitted in element names; the special characters ".", "_", "-", and "$" are allowed, though "-" is not allowed as an element name by itself; it may be embedded within a name, however. [See "SPIRES Searching and Updating", section D.1.3.1, for more information about the "-" or "throw-away" element.] The length of an element's name is limited to sixteen characters.

B.1.6  Element Aliases

Long element names are often advisable for clarity, since the value coded in the ELEM statement is the name of the element used when records are displayed.

However, it is not convenient or sensible to enter a twelve-character element name for an eight-character value: "PHONE-NUMBER = 497-4420;". SPIRES allows you to give an element a long, descriptive name such as "PHONE-NUMBER" and refer to it by several other names, such as "P" or "PN". The file definer must indicate what these other names can be by coding "aliases" for the element names in the file definition. Here is how aliases are coded:
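A sketch of the PHONE-NUMBER element with the three aliases discussed in the next paragraph. The ALIASES keyword is the one named in B.2.2; the comma-separated list form is an assumption:

```spires
ELEM = PHONE-NUMBER; LEN = 8;
  ALIASES = P, PN, NUM;
```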

A phone number can now be entered by "P = 497-4420;" or, more simply, by "P 497-4420;" (since the "=" is optional). We have also allowed the aliases "PN" and "NUM", which are mnemonically more significant than the terse "P". No two elements can have the same alias at the record level or in a structure.

B.1.7  Dummy Elements; Comment Statements

Since we have no OPTIONAL section, we should code an "empty" OPTIONAL section with a single dummy element. (This will allow us to add elements to the record definition at a later date without invalidating data already stored.) This section is coded as follows:
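A sketch of such a section, in the same form as the OPTIONAL sections of the record definitions shown in B.2.1:

```spires
OPTIONAL;
  ELEM = DUMMY;
```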

An item called "COMMENTS" may be coded for any element you define; no single comment can be longer than 1,024 characters.

Let's look at the record definition we have coded, adding aliases where they are useful:
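Putting the pieces together gives roughly the following. The PHONE-NUMBER aliases are the ones discussed in B.1.6; the aliases N and ADDR are hypothetical, added here only for illustration, and the list form of ALIASES is an assumption.

```spires
RECORD-NAME = REC01;
  REQUIRED;
    KEY = NAME; ALIASES = N;
    ELEM = ADDRESS; OCC = 1; ALIASES = ADDR;
    ELEM = PHONE-NUMBER; LEN = 8; ALIASES = P, PN, NUM;
  OPTIONAL;
    ELEM = DUMMY;
```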

B.1.8  Optional Statements: AUTHOR, MAXVAL, NOAUTOGEN and BIN

Other statements can be coded that will make the file definition more complete: AUTHOR, MAXVAL, NOAUTOGEN and BIN. They are coded after the FILE statement, which is the first statement in our definition.

B.1.8.1  (The AUTHOR Statement)

It is important to specify the AUTHOR statement in your file definition. In case it is necessary for the data base systems staff to contact you, the AUTHOR statement should supply the necessary information. This element is usually coded after the FILE element, and is a free-form text string:
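A sketch of the statement. The name, address and phone number shown are hypothetical placeholders; any free-form text identifying you would serve:

```spires
FILE = GG.UUU.DIRECTORY;
  AUTHOR = Pat Smith, Polya Hall 117, 497-4400;
```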

B.1.8.2  (The MAXVAL Statement)

Another file-level element is necessary for some applications, particularly those involving long text strings such as bibliographic and abstract files. If any element values in your file will be longer than 4,096 bytes, you must code the following in your file definition:
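A sketch of the statement, assuming the form "MAXVAL = number". The value 8000 is hypothetical, chosen only because it lies above 4,096 and below the upper limit given in the next paragraph:

```spires
MAXVAL = 8000;
```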

The value specifies the maximum data length for any single occurrence of an element in the file. MAXVAL cannot exceed 32,760. Also, no single record in a SPIRES file can be more than 120,000 characters long.

The MAXVAL limit also applies to values processed by actions A44 and A48 and by the SET VALUE Uproc in Userprocs. [See C.11.1.1.]

B.1.8.3  (The NOAUTOGEN Statement)

An optional element "NOAUTOGEN;" may be coded in your file definition if you do not want SPIBILD automatically (i.e., nightly) to pass records from the deferred queue to the goal and index records. Normally, every night that there are records in the deferred queue, JOBGEN will generate a job to build or process them into the goal and index record data sets. Every time this job runs, a certain amount of overhead for job scheduling and initiation is incurred. With only a small number of records to be processed (say, fewer than 5), this overhead is a significant percentage of the job cost.

However, if NOAUTOGEN is coded, you must explicitly cause this job to be submitted by issuing the online SET AUTOGEN command, perhaps after allowing several records to accumulate in the deferred queue. JOBGEN will generate a SPIBILD job that night, and then reset the file to the NOAUTOGEN condition. If NOAUTOGEN is not coded, then you must take specific action to prevent overnight processing; SET NOAUTOGEN can be issued to prevent the generation of this job until you explicitly SET AUTOGEN in SPIRES or PROCESS the file in SPIBILD.

B.1.8.4  (The BIN Statement)

You may code the bin number to which you wish output from SPIRES-generated jobs to be sent. Output from compilations and automatic file building (JOBGEN) will go to the bin specified; if no bin is coded, then such output will be directed to the default bin of the file owner.

If you code PURGE for the bin, then the output will be purged if there were no batch requests processed by SPIBILD and if no errors occurred during SPIBILD processing. Otherwise, the output will be sent to the file owner's default bin. Coding PURGE is recommended because it generates output only in the event of a SPIBILD problem or a batch request, thus saving you printing charges.

If you code HOLD for the bin, the output will be directed to the default bin of the file owner but the output will be held. The file owner can fetch the output and then either purge it or release it for printing.

The bin is coded in your file definition like this:
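A sketch of the statement, assuming the keyword is BIN as the section title suggests:

```spires
BIN = nnn;
```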

where "nnn" is the number of the bin or HOLD or PURGE as described above.

B.1.9  Statements in the Subfile Section

Though our goal record definition is now complete, there are several other things that must be coded to complete the definition of the file itself. (Remember that a file definition usually, but not necessarily, contains several record definitions.)

As noted earlier, the file name is almost never seen by the user; what the user sees is the subfile name, which is coded as the first statement in the "subfile section" of the file definition. The subfile section (or sections) follows at the end of the last record description.

Embedded blanks are allowed in the subfile name. Since this name is typed in a SELECT command, it should not be very long or otherwise difficult to type. The maximum length for a subfile name is thirty-two characters, including blanks.

The second statement in the subfile section identifies the record that will be the goal record when the subfile is selected. Because we only have one record name for our single record definition, this may seem redundant. But since most subfiles have multiple record descriptions--usually one goal record and several index records--SPIRES must be told explicitly which record is the goal record. This statement is coded as follows:
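A sketch of the statement. The keyword GOAL-RECORD is an assumption; this section does not spell it out:

```spires
GOAL-RECORD = REC01;
```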

Remember that "REC01" was the name of the record we described and named by the statement "RECORD-NAME = REC01;".

Now we must specify what accounts are permitted to select the subfile whose name is given by the "SUBFILE-NAME" value immediately preceding.
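A sketch of the statement, using the ACCOUNTS keyword referred to later in this section and the file owner's account:

```spires
ACCOUNTS = GG.UUU;
```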

This permits access to the subfile only to the account specified. At a minimum, the file-owner's account should always be specified; if it is not, then the file owner must issue the ATTACH command to use the subfile.

You can permit more than one account by coding other account values:
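For example, a sketch with two additional accounts. GG.XXX and GA.YYY are hypothetical accounts, and the comma-separated form is an assumption:

```spires
ACCOUNTS = GG.UUU, GG.XXX, GA.YYY;
```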

To permit all group "GG" accounts (but not "GA" accounts), you would include "GG...." in the ACCOUNTS value. To make a subfile public, you specify "PUBLIC" as the ACCOUNTS value. The matter of controlling access to SPIRES subfiles is detailed in "Defining Subfile Privileges." [See B.9.] A complete subfile section can be coded like this:
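A sketch of such a section for our directory. The subfile name PHONE DIRECTORY is hypothetical, and the GOAL-RECORD keyword is an assumption not spelled out in this section:

```spires
SUBFILE-NAME = PHONE DIRECTORY;
  GOAL-RECORD = REC01;
  ACCOUNTS = GG.UUU;
```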

B.1.10  A Complete File Definition

Here is what our complete phone directory looks like when coded in the file definition language:
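The following sketch assembles the statements discussed in this chapter. The AUTHOR text and the subfile name are hypothetical, and the GOAL-RECORD keyword and the list form of ALIASES are assumptions; treat it as an outline rather than a tested definition.

```spires
FILE = GG.UUU.DIRECTORY;
  AUTHOR = Pat Smith, Polya Hall 117, 497-4400;
  RECORD-NAME = REC01;
    REQUIRED;
      KEY = NAME;
      ELEM = ADDRESS; OCC = 1;
      ELEM = PHONE-NUMBER; LEN = 8; ALIASES = P, PN, NUM;
    OPTIONAL;
      ELEM = DUMMY;
  SUBFILE-NAME = PHONE DIRECTORY;
    GOAL-RECORD = REC01;
    ACCOUNTS = GG.UUU;
```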

The indentation shown is for the sake of clarity; you can use any indentation that is helpful to you. Also, an element's name, occurrence, length and aliases need not be defined on a single line; in fact, when SPIRES displays your file definition, each of these will be on a separate line, with indentation used to structure the definition for easy reading.

B.2  Goal Record Keys, Slot and Removed Records

B.2.1  Record Keys

Let's consider another way of defining a telephone directory file. Suppose we made the telephone number the key of the record. What would be the impact on the file? Here is a record definition in which the key is the phone number; we have also allowed name and address to occur more than once by not specifying any OCC limits.
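A sketch of such a record definition, following the form of the definitions shown later in this section. Neither NAME nor ADDRESS carries an OCC statement here:

```spires
RECORD-NAME = REC01;
  FIXED;
    KEY = PHONE-NUMBER;
      LEN = 8;
  REQUIRED;
    ELEM = NAME;
    ELEM = ADDRESS;
  OPTIONAL;
    ELEM = DUMMY;
```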

Such a directory would give you access to all the users of a particular phone number; if one person had two different phones, the name would be in two different records, each record's key being one of the phone numbers. A directory keyed on the phone number might not be useful to someone looking for John Jones' phone number, but it would be useful to someone looking for the owner of phone number 497-4420, which has been reported out of order, perhaps.

Notice that this dramatically changes how we look at or use the file. Now, all the people sharing a single office extension can be found, but one person's phone number can't be found as directly as it was in the directory keyed on name. The goal of the search--either names, as in the previous case, or phone numbers as in the present example--determines the choice of key.

Since it is unlikely that one phone number could be at more than one address (though some businesses have "extensions" in several buildings), we will code "OCC = 1" for the occurrence attribute of the ADDRESS element. But it is very likely that more than one person could be listed for each phone. For this reason, we will not code any occurrence attribute for the NAME element: we simply don't know how many times this element will occur. The occurrence of a record key must be one; the length of a record key must never be greater than 240 bytes, whether the length is fixed or not, whether it is the key of a goal record or index record.

The present definition, keyed on phone number, is different in another way from the definition keyed on name: the phone number, which was in the REQUIRED section (varying in length, required to occur) in our record keyed on name, is now in the FIXED section. The phone number was multiply occurring before, though it was fixed in length; now, since it is the key of the record, it is required to occur exactly once. Elements whose occurrence and length attributes both can be fixed are usually coded in the FIXED section.

There may be reasons why you would choose not to put a fixed length and occurrence element in the FIXED section of a record definition. Let's look at two record definitions for a phone directory keyed on phone number; we will add an element for zip code, which seems to belong in the FIXED section, being fixed in both length and occurrence.

 RECORD-NAME = REC01;              RECORD-NAME = REC01;
   FIXED;                            FIXED;
     KEY = PHONE-NUMBER;               KEY = PHONE-NUMBER;
       LEN = 8;                          LEN = 8;
     ELEM = ZIP-CODE; OCC = 1;
       LEN = 5;
   REQUIRED;                         REQUIRED;
     ELEM = NAME;                      ELEM = NAME;
     ELEM = ADDRESS; OCC = 1;          ELEM = ADDRESS; OCC = 1;
                                       ELEM = ZIP-CODE; OCC = 1;
                                         LEN = 5;
   OPTIONAL;                         OPTIONAL;
     ELEM = DUMMY;                     ELEM = DUMMY;

In the standard SPIRES output format, "element mnemonic = value", the elements in a record are output in the order in which they are defined: FIXED, REQUIRED, then OPTIONAL elements. (If an element occurs more than once, its occurrences are output in the order in which they were input.) Standard record output formats for each of the above definitions might be as follows:

  PHONE-NUMBER = 497-4400;         PHONE-NUMBER = 497-4400;
  ZIP-CODE = 94305;                NAME = USER SERVICES;
  NAME = USER SERVICES;            ADDRESS = POLYA HALL 117;
  ADDRESS = POLYA HALL 117;        ZIP-CODE = 94305;

So, for readability, you may want to put ZIP-CODE in the REQUIRED section of the record definition. But if you are certain to define an output format, there is no need to consider this problem.

B.2.2  Slot Keys

We have just discussed the importance of choosing the best element for the key of the record. Let's look at situations in which the choice of unique key may be difficult or impossible.

Suppose our SPIRES file were going to be a collection of abstracts from scientific journals. Our record definition might be as follows (note that the key is not specified):
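A sketch of such a definition, using only the element names that this section mentions (JOURNAL, YEAR, PAGE and ABSTRACT). The choice of categories and occurrence limits is an assumption, and a real abstracts file would likely carry more elements:

```spires
REQUIRED;
  ELEM = JOURNAL; OCC = 1;
  ELEM = YEAR; OCC = 1;
  ELEM = PAGE; OCC = 1;
OPTIONAL;
  ELEM = ABSTRACT;
```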

Now, if we wanted a search to retrieve the list of journal abstracts in which the words specified in the search request appeared, the goal record would be "articles." A search request for such a file would look like this:

How would we go about choosing a key for an "article" goal record? None of the elements defined above is very likely to be entirely unique. We could contrive a unique key by concatenating portions of the JOURNAL, YEAR and PAGE elements: NG.76.202, for example, could signify page 202 of a 1976 issue of National Geographic. However, such a key would not be convenient to enter or use. (See the "Structured Key" processing rule, A33, for one solution.)

SPIRES has a more elegant solution to the problem of a lack of a natural key. If you specify that a record type is "SLOT", SPIRES will assign a unique integer key to each record added; these keys start at one and will be incremented by one as each record enters the file. SPIRES always stores a slot key as a four byte binary number.

This simple solution could lead to problems: suppose you typed a command such as "remove 197" when it was actually record 187 that was to be removed. The file definer can protect against this kind of error in a slot file. SPIRES allows you to specify that a "check digit" be appended to each integer slot number as the record is added to the file. A check digit is a single digit that is appended to the right end of a number; it is computed by performing multiplication and addition operations on each digit of the original number, and then adding and subtracting the resulting sum to yield a single digit. Since this digit is computed from the other digits in the number, the original number's digits can be verified by seeing that the final digit is correct for a number you type at the terminal.

For example, a record in your file may have the key "2757", of which the final digit, "7", is the check digit. The value "2657" would not be a valid key, however, since the first three digits, "265", require (or compute to) a different check digit than "7". Thus, each digit becomes significant in computing the check digit, and most typographical errors in specifying a record's key (such as typing "2657" instead of "2757") will be caught when the system attempts to verify the check digit. Note that the record whose key is "2757" is not the two thousand seven hundred fifty-seventh record in the file, but the two hundred seventy-fifth; the final digit is the check digit. The system does not store the check digit with the key, but computes it each time you display the key--the digit is always shown when the key is displayed. (If you "look up" the key of a record using action 32 [See C.5.] and intend to display it, you must explicitly code a processing rule to have this digit displayed.)

A check digit is requested by coding "SLOTCHECK" on a SLOT record. On all commands requiring a key, the check digit, first computed by SPIRES, is appended to the record key by the user and recomputed and validated by the system. This digit functions similarly to a parity bit in tapes, verifying that the data is valid. The method SPIRES uses in computing the check digit is described in detail in the description of action 27. [See D.1.3.0.2.7.]

The default check-digit formula is called the Mod-11 rule; it can be explicitly requested by giving the value "0" to the SLOTCHECK statement ("SLOTCHECK = 0;"). Other formulas, described in the description of action 27, can be requested by coding different integers on the SLOTCHECK statement:

No KEY statement is coded for a slot record, since the slot number (with a check digit if one is requested) is the key of the record. SLOT and SLOTCHECK are coded as part of the record definition as follows. (REMOVED will be explained in the next section of this chapter.)
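A sketch of the start of such a record definition, using the record name ENTRY discussed in the next paragraph. The order and placement of the REMOVED, SLOT and SLOTCHECK statements are assumptions; the element sections are coded after them as before.

```spires
RECORD-NAME = ENTRY;
  REMOVED;
  SLOT;
    SLOTCHECK;
```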

You will notice that "RECORD-NAME = ENTRY" was coded, instead of "RECORD-NAME = REC01" as before. In a slot record the name of the goal record key is determined by the value of the "RECORD-NAME" statement. One caution should be observed: the value of this element should be lower in alphabetical sequence than any other "RECORD-NAME" statements you code since the record definitions are displayed in alphabetical sequence by RECORD-NAME. In general, it is not good to code "RECORD-NAME = GOAL". If you wish to have the goal record key named something other than the RECORD-NAME, then following the "SLOT" statement, code the "SLOT-NAME = name" statement. You may also code an ALIASES statement for the slot key. [See B.1.6.]

The SLOT statement may also have a numeric value, representing its priv-tag number. [See B.9.4.4.]

A file may not have more than eight slot-type record definitions. You may, under certain circumstances, want to define a goal record as "slot" even when a natural key exists. The advantages and disadvantages of such a scheme must be weighed carefully, and are described below.

SPIRES treats slot type goal records in a special way; it keeps them in a data set that is organized sequentially rather than tree structured. If all of the elements in a slot record are fixed required (coded in the FIXED section of the record definition), then the amount of space each record requires is known exactly; when SPIRES goes to retrieve such a record, it "calculates" the record's position and goes directly to that location--it does not have to account for the varying size of each goal record stored. Thus, the major advantage of slot organization of fixed required elements is that record retrieval goes much faster (retrieval is not to be confused with searching, the process that usually precedes retrieval). A second significant advantage is that, since the goal records are structured sequentially, sequential searching by global FOR commands is faster.

The disadvantages of forcing your data base structure into a fixed slot-type record format are actually inconveniences; the file definer must decide if these inconveniences are acceptable.

One inconvenience is that records must be referred to by their slot number in TRANSFER, REMOVE, UPDATE and DISPLAY commands. In a personnel file keyed on social security number, you could remove or update an employee's record by simply giving the person's social security number. If this file were defined as a slot file, you would first retrieve the record by using a FIND command against a social security index, then TRANSFER the record retrieved using a global FOR command. As you can see, the extra expense of building a social security number index would have to be incurred.

A second inconvenience is the loss of verification that the social security number of each person was unique; if social security number were the key of the record, SPIRES would verify that no records had the same value for the key. Generally, when a natural key exists, SLOT organization is not used.

If you expect to do a lot of sequential searching of the data base using global FOR commands, then consider making the record elements FIXED and the record SLOT. How is this done? If most of the elements in a record are of fixed length and fixed occurrence, you can consider having SPIRES store a variable length element such as NAME as a fixed length element; you would choose the largest possible length for the length attribute of each variable length element. But if a record has an element that can vary greatly in length, such as ABSTRACT in our article goal record above, we would not want to waste storage space by fixing the length of this element at its longest possible value.

If not all of a SLOT record's elements are coded in the FIXED section, then the record must be "removed." In the following section we will see what record removal means, and how it is coded.

B.2.2a  SLOT-START Statement

SLOT-START is a new field in the SLOT structure of record definitions, both in FILEDEF and RECDEF. SLOT-START serves more than a single purpose -- enabling a simple way to generate keys that begin at a particular value.

Suppose you would like to generate Slot records whose first key value is something other than 1 -- say 9000000. You want the first record to have a key of 9000000, the second 9000001, then 9000002, etc. The ORVYL system on the Stanford mainframe stored the key of 9000000 in block 35573 of the record-type data set (assuming the block size is 2048 and the record-type is REMOVED). This situation poses no real disadvantages in mainframe SPIRES because ORVYL only writes a single block 35573 resulting in a data set that has two blocks -- block 0 and block 35573.

But for Unix SPIRES this would be a different matter. If you added the first record of 9000000 to a slot record-type in that system, SPIRES would have to fill in the gap between 0 and 35573. Not only does this represent "wasted" space, but it also represents 35572 extra block read requests should you attempt to do a sequential scan (e.g., FOR SUBFILE / DISPLAY ALL) of the subfile.

If you wish to take advantage of this option then you should code the "SLOT-START = number" statement immediately following the "SLOT" statement in your record definition. If the subfile is a NEW subfile then your work is done and the first record added to the subfile will have a key of "number".
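A sketch for the example above, assuming a removed slot record named ENTRY. The placement of REMOVED is an assumption:

```spires
RECORD-NAME = ENTRY;
  REMOVED;
  SLOT;
    SLOT-START = 9000000;
```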

If the subfile already exists and has SLOT keys that begin from a different value and you wish to take advantage of this new option, then you should RECOMPILE using the REBALANCE option, following the same recipe that you use to rebalance a Tree data set using the CONVERT option. [EXPLAIN RECOMPILE COMMAND, WITH REBALANCE OPTION.]

Here are some answers to other questions you might ask:

B.2.3  Removed Record-Types

In our first example of a file, a telephone directory keyed on name, the length of a single record is approximately the sum of the lengths of the individual elements. It would probably not exceed fifty bytes or characters: the NAME and ADDRESS elements may take twenty bytes each, and the PHONE-NUMBER element takes an additional eight bytes.

This means that in one block of ORVYL storage, which is 2048 bytes, approximately forty records could be stored. SPIRES access efficiency depends to a great extent upon the number of records each block contains. If the average number of records per block is eighty, then one record in 512,000 can be retrieved by accessing three blocks or less, if the data is structured in tree fashion. If the average number of records in a block is only twenty (each record averaging one hundred bytes in length), then five accesses may be necessary. If the number of records per block drops to sixteen or fewer, then efficiency of record access seriously degenerates, since many file I/O's may be necessary to locate a particular record.
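The figures in this paragraph are consistent with a simple model in which each block access narrows the search by a factor equal to the number of records per block. That model is an assumption made here for illustration, not a description of SPIRES internals. Under it, the arithmetic works out as follows:

```latex
\[
\left\lfloor \frac{2048}{50} \right\rfloor = 40 \ \text{records per block}
\]
\[
80^{3} = 512{,}000
\]
\[
20^{4} = 160{,}000 \;<\; 512{,}000 \;\le\; 20^{5} = 3{,}200{,}000
\]
```

So three accesses suffice at eighty records per block, while at twenty records per block four accesses fall short of 512,000 records and a fifth is needed.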

In order to keep access efficiency high, SPIRES provides the file definer with the option of removing large record types (remember that a "record type" is our "REC01," a "record" is an entry in the file), say of 60 bytes or more, from the goal record data set to a "residual data set." This removal is done for all records in a record type, and may be specified for any record-type, whether large or small in size. Only a key and a pointer to the record's location in the residual data set remain in the goal record data set when you specify record removal.

Tree (or "non-slot") record types, such as our telephone directory, should usually be removed, since the size of an entry is often over forty bytes. You can specify that a record type's contents are to be removed to a residual data set when you code the record type's name:
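A sketch for our telephone directory, assuming the REMOVED statement (named in B.2.2) directly follows the record name:

```spires
RECORD-NAME = REC01; REMOVED;
```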

Slot record types, which use a different access technique from tree structured record types, are always removed unless all elements are fixed in length and occurrence. Slot record types, such as our articles file, can specify removal to a residual data set as follows:
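A sketch for the articles record, assuming the same placement of REMOVED, followed by the SLOT statement:

```spires
RECORD-NAME = ENTRY; REMOVED;
  SLOT;
```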

Slot and tree structured data sets may be mixed in a file or data base.

Some rules-of-thumb can be stated for record removal. Remove record types if any of the following are true:

B.2.4  Monotonic Record-Types