WWW URL and File Naming Scheme

SLAC     16 May 1997


Table of Contents

Goals for the SLAC Naming Scheme
Definitions of Production Space
Mapping between Filenames and URLs
Linking to and from Production Files
Organization of Files and Subdirectories
Naming Files and Subdirectories
Some Implementation Details
Some Notes on Change Management
Acknowledgments
Appendix A: Naming Examples
Appendix B: Group Codes

Goals for the SLAC Naming Scheme

Files from diverse sources on a wide array of subjects need to be placed into SLAC's WWW production file space. This document draws on the experience gained working with diverse groups over the past few years to articulate guidelines for naming those files and, hence, their URLs. This document deals only with text and related static files (such as PostScript and image files). It does not treat the naming of scripts or dynamically generated pages, such as those generated through CGI scripts accessing Oracle or SPIRES databases. Naming CGI, Java, and other scripts will be treated in a forthcoming document.

SLAC has collected most of its production WWW files into a single name-space, /afs/slac.stanford.edu/www/, and uses this document as a guide in naming files and their related URLs in that space. This document assumes a basic knowledge of the UNIX operating system and the AFS file management system at SLAC.

Goals for the SLAC naming scheme are to:

We expect that pursuing these goals will make it easier for:

In addition to the main production server, www.slac.stanford.edu, and its production areas, there are a few specialized SLAC Web servers with their own production file spaces. Their owners, too, may find this document useful in regularizing their naming structures.

Definitions of Production Space

Production pages and directories are those that are linked directly or indirectly from the SLAC Home Page.

Production pages should reside in production file space. For the main SLAC server, this means that production pages reside somewhere in the /afs/slac.stanford.edu/www/ subdirectory tree. This single location provides for easier maintenance of the rules file, space, performance, and the information architecture itself.

Note that in this document, the word "page" may also be understood to mean "a page and its related files (such as image and PostScript files)."

For more detailed descriptions, see SLAC Web Definitions.

Mapping between Filenames and URLs

In the naming scheme for the primary SLAC production space, URLs and filenames are identical except for their prefixes. Filenames have the following prefix:

    /afs/slac.stanford.edu/www

which can be abbreviated at SLAC:

    /afs/slac/www

URLs have the prefix:

    http://www.slac.stanford.edu

For example, the filename for the SLAC Experiment E144 Home Page would be:

    /afs/slac.stanford.edu/www/exp/e144/e144.html

and the fully qualified URL would be:

    http://www.slac.stanford.edu/exp/e144/e144.html

Here the home page follows a common convention: its name repeats the directory name with .html appended.
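The prefix swap described above can be sketched in a few lines of Python. This is only an illustration, not a SLAC tool: the function names are hypothetical, and the only facts taken from this document are the three prefixes and the directory-name home page convention.

```python
# Prefixes as given in this document.
FILE_PREFIX = "/afs/slac.stanford.edu/www"
SHORT_FILE_PREFIX = "/afs/slac/www"
URL_PREFIX = "http://www.slac.stanford.edu"


def filename_to_url(filename: str) -> str:
    """Swap the AFS prefix (long or abbreviated form) for the URL prefix."""
    for prefix in (FILE_PREFIX, SHORT_FILE_PREFIX):
        if filename == prefix or filename.startswith(prefix + "/"):
            return URL_PREFIX + filename[len(prefix):]
    raise ValueError(f"not in production WWW space: {filename}")


def url_to_filename(url: str) -> str:
    """Swap the URL prefix for the long form of the AFS prefix."""
    if url == URL_PREFIX or url.startswith(URL_PREFIX + "/"):
        return FILE_PREFIX + url[len(URL_PREFIX):]
    raise ValueError(f"not on the main production server: {url}")


def conventional_home_page(directory: str) -> str:
    """Apply the common convention: directory name repeated, plus .html."""
    directory = directory.rstrip("/")
    name = directory.rsplit("/", 1)[-1]
    return f"{directory}/{name}.html"
```

For instance, conventional_home_page("/exp/e144") gives /exp/e144/e144.html. Because the convention is common rather than universal, a real tool would need to check which top file a directory actually contains.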

Linking to and from Production Files

Under normal circumstances, there should not be hypertext links from production space to non-production space, such as to files in a user's home directory (~USERNAME/public_html/). Non-production space is for testing and sharing informally.

Symbolic Links

SLAC has structured its UNIX file architecture such that those items that are designed to be Web-viewable reside in a clearly defined, Web-accessible space, and all other items reside in a separate and distinct part of the file architecture, which is generally presumed to be Web-invisible. This separation of function serves in part as a low-level form of insurance against the inadvertent release of inappropriate material to the World Wide Web (WWW). Use of symbolic links from Web-viewable space (the two main locations of SLAC's Web documents are /afs/slac/www/* and ~USERNAME/public_html/*) into other parts of the connected UNIX file space compromises the functional separation of these two spaces and is therefore generally discouraged.

While their use is generally discouraged, using symbolic links can help smooth the active migration of files from one place in the production Web file space to another, allowing users seamless access during the transition period.

Caution must be exercised when using symbolic links pointing from globally visible Web space to space that is normally seen only by those logged in to a SLAC host because such symbolic links may lead to unpleasant visibility surprises.
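One way to find such links before they cause surprises is to walk the Web-viewable tree and report every symbolic link whose target resolves outside it. The following Python sketch assumes that the checker runs on a host where the tree is mounted and that "outside" simply means "not under the given root"; the function name is hypothetical.

```python
import os


def find_escaping_symlinks(web_root: str) -> list[tuple[str, str]]:
    """Return (link, resolved target) pairs for links that leave web_root."""
    root = os.path.realpath(web_root)
    escaping = []
    # os.walk does not descend into symlinked directories by default,
    # but it still lists them in dirnames, so both lists are checked.
    for dirpath, dirnames, filenames in os.walk(web_root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                target = os.path.realpath(path)
                inside = target == root or target.startswith(root + os.sep)
                if not inside:
                    escaping.append((path, target))
    return escaping
```

Links that stay within the tree, such as those used temporarily during a migration, are not reported.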

Organization of Files and Subdirectories

This section covers the guidelines for organizing the production WWW space itself.

Functional vs. Organizational Files

Files can usually be divided into two basic categories, functional and organizational. Functional files, such as /comp/platform.html and the subdirectories /comp/telecom and /library, treat subjects of interest to overlapping sets of SLAC and external users. Organizational files, such as those in the /grp/irm/telecom subdirectories, describe some part of the SLAC organizational structure. There is overlap between the two categories.

The functional/organizational distinction appears at the top of the SLAC WWW file space and is continued lower down in the hierarchy. See "Appendix A: Naming Examples", below. Documents published for one or more major audiences outside the generating working group often belong in a functional subdirectory tree. These pages are usually relatively formal and generally have a broader audience than organizational pages. Revisions to these pages tend to be technical or procedural.

Documents targeted for an audience within a given group often belong in organizational space along with that group or department's home page. Pages of this kind are often more casual. Revisions to organizational pages (like other organizational elements of an institution) are likely to happen more frequently and have more impact on the existing information structure than are the revisions to the functional pages. Note that while there is an effort to group pages into these two separate categories, pages in organizational space often do include functional pages -- functional pages that are directed toward those inside the working group. The SCS group page is a good example of this.

Group Space

Files in organizational space are named /grp/CODE/filename, where CODE is usually the two or three character BINLIST code. There are more than sixty of these in use. See "Appendix B: Group Codes" for a list.

These codes, such as cd, pur, and scs, are often already recognized by SLAC people. The BINLIST codes frequently focus on operational components of the SLAC organizational structure that seem to be less likely to change than the hierarchical levels above them. In any case, keeping the hierarchy flatter means there are fewer components subject to change than if the names reflected all levels of the current organization chart.

There are a few exceptions to using CODE. For example, the BINLIST code for the SLAC Library is lib and for TechPubs is pub, but these have other contexts in UNIX (such as /usr/local/lib and /usr/local/pub). Also, a group, such as the Library, may be particularly identified with its name more than its code.
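As an illustration, the /grp/CODE/filename pattern and its exceptions could be expressed as follows. This is a sketch, not an official table: the function name is hypothetical, and the two exceptions shown are the ones listed under "Some exceptions in subdirectory naming" in Appendix A.

```python
# Groups that use a name instead of their BINLIST code (see Appendix A).
CODE_EXCEPTIONS = {"lib": "library", "pub": "techpubs"}


def group_path(code: str, filename: str = "") -> str:
    """Build the URL path for a group's organizational space."""
    code = code.lower()  # subdirectory names are kept in lower case
    name = CODE_EXCEPTIONS.get(code, code)
    path = "/grp/" + name
    if filename:
        path += "/" + filename
    return path
```

Here group_path("SCS", "scs.html") yields /grp/scs/scs.html, while group_path("LIB") yields /grp/library.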

Restricting Access

The WWW default is to allow access to pages by anyone on the Web wherever he or she is. Access to some files, however, should be restricted to SLAC users. The best we can do now, with a reasonable degree of security, is to restrict WWW visibility to those logged in to a host in the slac.stanford.edu domain (with a SLAC IP number -- 134.79...) when the fully qualified URL includes the word slaconly, such as /pubs/slaconly/tip. Therefore, if you need to restrict access to SLAC users only, put slaconly (use only lowercase) somewhere in the fully qualified URL. Naming a file in this way will make it inaccessible on the Web to SLAC collaborators working remotely and should therefore be used with caution in the distributed collaborator environment.

Note that users with appropriate AFS privileges (including those elsewhere in the HEP community with a current AFS token at SLAC) may read any file in /afs/slac/www space including those with slaconly in their names by maneuvering through the file system.
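The slaconly rule can be summarized in a short Python sketch. It does not describe how the SLAC server actually implements the check. It assumes that the marker is matched as a case-sensitive substring of the URL and that a SLAC address is recognized by the leading "134.79." of its dotted form. The function names are hypothetical.

```python
RESTRICT_MARKER = "slaconly"   # lower case only, per this document
SLAC_IP_PREFIX = "134.79."     # assumption: simple prefix match on the address


def is_restricted(url: str) -> bool:
    """True when the marker appears anywhere in the fully qualified URL."""
    return RESTRICT_MARKER in url


def may_view(url: str, client_ip: str) -> bool:
    """Unrestricted pages are open to all; restricted ones need a SLAC address."""
    if not is_restricted(url):
        return True
    return client_ip.startswith(SLAC_IP_PREFIX)
```

A collaborator working from a non-SLAC address would be refused by may_view for any URL containing slaconly, which is the remote-access cost noted above. The check governs Web visibility only; it says nothing about AFS access.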

Naming Files and Subdirectories

The most important criterion to remember in naming subdirectories is that names should be kept as short as they can be without sacrificing meaning, particularly leftward (toward the top of the file hierarchy) in the filename. For example, use

    /afs/slac/www/accel/

rather than

    /afs/slac/www/accelerator/

Name-length limitations in AFS volume names are particularly important in re-establishing the file system after certain crashes, and thus short subdirectory names or clear abbreviations of those names are important. Short names are also faster to type and consistent with the UNIX style of labeling things. Very long names in URLs have caused some browser displays to break. It may be useful to remember that fully qualified pathnames are not true English; they are English-like names.

In addition to trying to keep directory names within the 3-8 character range, the following recommendations for naming subdirectories and files should help to promote a coherent and usable name space:

  1. Once names for types of things have been chosen, use them consistently. Name the same kind of subdirectory with the same name wherever it appears. For example, for subdirectories containing materials relating to talks, name the subdirectories talk wherever they appear:
    /afs/slac/www/grp/irm/telecom/talk
    /afs/slac/www/grp/scs/net/talk
    Consistent names help people recognize directories and files when they encounter them or even unearth them via a find command.

  2. Most subdirectories contain a master or top file -- this is the file that you want someone browsing the file hierarchy to see first. In naming this top file there are several conventions in use:

  3. Use periods in filenames only to indicate file types, such as .html, .ps, .ps.Z, .pdf, and .gif; otherwise, avoid them. These endings are reserved for designated MIME types and other file formats. Naming your file with such an extension may result in your filename meaning something that you never intended it to mean. For example, if you were to name all of your PhotoShop(TM) files as something.ps, many programs would assume that the files were PostScript files when they weren't and would therefore be unable to read them.

  4. Try to avoid subdirectory names that are the same as major CGI script names. You may review a list of common production scripts in http://www.slac.stanford.edu/slac/www/tool/summary.html. Maintaining distinct page and script names will help users recognize the nature of the file.

  5. If you're using a formatted-file-to-HTML converter like rtftohtml, don't fight the converter's naming standards. If you're creating HTML files yourself, be aware of the trade-off between using one larger file (which requires more time to transfer over the net) and using several smaller files (which may require longer URLs, have more URLs and files to keep track of, and make searching within a document more difficult).

The following naming conventions may also be useful. They are designed to speed typing and take some of the guesswork out of developing and searching for filenames.

  1. Restrict subdirectory names to lower case. Such names are easier to type and it's not always possible to guess the correct filename capitalization. Even if everyone memorized The Chicago Manual of Style, proper capitalization of UNIX filenames would still be governed by the restrictions of the UNIX operating system rather than those of the English language.

  2. Use the generic singular form rather than the plural in path- and filenames (/afs/slac/www/comp/form or /afs/slac/www/icon), unless a name is the name of something well-known in the plural, like /pubs or /stats. Fostering a consistent use of the singular will narrow the range of possibilities that searchers and maintainers are required to remember. Again, pathnames are not natural language; they merely borrow from the structures of natural language.

  3. Mash terms together without hyphens unless the result is misleading to read or pronounce. For example, use /slac/www/wwwtech, not /slac/www/www-tech; but use /emp/emp-opp, not /emp/empopp. As long as it's clear, typing fewer characters rather than more is faster. Establishing compound words rather than hyphenated words as the standard will make it easier to remember or guess what would be the likely name for a given subdirectory (as well as allowing a broader range of subdirectories to be displayed in a subdirectory listing).

  4. Where spaceholders are necessary, use hyphens (-) rather than underscores (_). While both are in common use, SLAC's goal of fostering naming consistency encourages users to choose one and stay with it. Since typing a hyphen does not require use of the shift key and its use is somewhat more common in the UNIX environment, it seems to be the logical choice for our default. Note that public_html is an unfortunate exception that was brought to SLAC as the default of the CERN server.

  5. Have the file owner choose the name of the file itself, without necessarily applying the strictures for pathnames and subdirectory names described above. Free use of capitalization, hyphens, and longer names in filenames may increase readability and serve as an aid to memory. Filenames are much more like a form of document title than are the pathnames. For this reason, a distinction between pathname and filename conventions may be desirable.

For examples of these guidelines as they are applied in the production SLAC WWW AFS space, see "Appendix A: Naming Examples" at the end of this document.
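Several of the subdirectory recommendations above are mechanical enough to check automatically. The Python sketch below is a hypothetical helper, not an existing SLAC tool. It covers only the 3-8 character range, lower case, hyphens rather than underscores, and the reservation of periods for file types. Its findings are advisory, since production space contains accepted names that fall outside the range.

```python
def check_subdir_name(name: str) -> list[str]:
    """Return advisory notes for one subdirectory name (empty list if none)."""
    notes = []
    if not 3 <= len(name) <= 8:
        notes.append("length outside the 3-8 character range")
    if name != name.lower():
        notes.append("contains upper-case letters")
    if "_" in name:
        notes.append("uses an underscore; prefer a hyphen")
    if "." in name:
        notes.append("contains a period; periods indicate file types")
    return notes
```

For example, check_subdir_name("accel") returns an empty list, while check_subdir_name("public_html") reports both the length and the underscore. Choices that need judgment, such as singular versus plural or whether a compound reads clearly without a hyphen, are left to the person naming the directory.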

Some Implementation Details

Contact SCS (SCS Help Desk, ext. 4357) for initial creation of your directory in production AFS WWW space. You will be expected to know basic AFS file management commands. SCS will establish appropriate Access Control Lists (ACL) and groups. Groups will generally be set up in pairs where one group controls the membership of the group that actually has the write privileges into the AFS directories for your pages.

Time Frames

If you need help installing a new page that fits into the existing information architecture and that does not require the creation of a new subdirectory at the top of the file tree, a one-working-day turnaround is the goal for installation. Advance warning is always appreciated.

For new pages concerning an area of information that requires the creation of a new subdirectory, a one-working-day turnaround is the goal to set up the AFS groups and to request the AFS volume from the Server Support group. That group aims to have the AFS space set up within two working days. This three-day time frame assumes that the area fits into the existing information architecture and that the requestor has provided all the information described in "How to Install Pages in the Production SLAC Web," such as AFS-privileged user names and space estimates.

Finding an appropriate subdirectory name for a new kind of information area, particularly at the highest subdirectory level in AFS WWW space (/xorg, for example), may take quite a bit longer. The time is needed to think through how the creation of a new subdirectory may affect other aspects of file and page linking design. This is particularly true of subdirectories that could easily fit in more than one directory.

Finding a good place for pages to reside at the outset can help minimize the URL changes which complicate page maintenance for everyone.

Roles

The WWW AFS Registrar designated by the WWW Coordinating Committee is responsible for naming subdirectories at the first level below /afs/slac/www/. This Registrar will work to keep the high-level taxonomy sensible and consistent in light of specific user needs and system requirements. In the short run, Joan Winters has agreed to serve as the registrar with Ilse Vinson as backup registrar and Pat Kreitz as the "higher authority." Individual groups may find designating their own group registrars useful.

Some Notes on Change Management

SLAC needs to determine where source for production WWW files that are not self-defining (such as .ps or .pdf) should be kept. In some cases a pointer file to where the source is kept may suffice, but this may well prove to be less than stable over time.

It is also recommended that SLAC develop or acquire tools to ease migration of pages through the system, including providing for file/URL renaming over the years. Cleaning out the obsolete files (and, where appropriate, putting them into /archive/YYYY, where YYYY is the year of last update) will keep the WWW information space easier to use.
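As a small illustration of the /archive/YYYY convention, the following Python sketch derives the archive directory for a file. It assumes that the file's modification time is an acceptable stand-in for the "year of last update", which will not hold if files have been copied or touched since their last real revision. The function name is hypothetical.

```python
import datetime
import os


def archive_dir_for(path: str) -> str:
    """Return /archive/YYYY, taking YYYY from the file's modification time."""
    mtime = os.path.getmtime(path)
    year = datetime.datetime.fromtimestamp(mtime).year
    return f"/archive/{year:04d}"
```

A migration tool built on this would still need a person to decide which obsolete files are important enough to archive rather than delete.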

Acknowledgments

This work is an outgrowth of an effort started by Tony Johnson (see "New URL Scheme for SLAC WWW Server"). He has continued to be very helpful in discussions along the way. In addition, real-world examples provided by Ilse Vinson, Andrea Chan, Brooks Collins, George Crane, Laurie Gennari, Diana Gregory, Karen Heidenreich, P.A. Moore, and others have helped significantly to flesh out the model. Feedback from the WWW Technical Committee has been very useful in developing some of the concepts. Any inadequacies in the document are, of course, our own.

Appendix A: Naming Examples

The following examples show these guidelines at work in the SLAC WWW AFS space.

Some key pages:

/ the SLAC Welcome Page (the default SLAC server page)

/slac/highlighted.html (the Highlighted SLAC Home Page)

/slac/detailed.html (the Detailed SLAC Home Page)

/slac/disclaimer.html (the SLAC Disclaimers, Copyright, and Other Fine Print)

Some first-level subdirectories (functional first):

/accel (accelerator)

/archive (important files that no longer have a current use)

/bis (business information systems)

/comp (computing)

/discourse (old mailing list items)

/edu (proposed for pages supporting SLAC's educational mission; some currently in /gen/edu)

/emp (employment)

/eprise (pages relating to enterprise databases)

/esh (environment, safety, and health)

/exp (experiment, often multi-institutional)

/icon (SLAC-supported icons such as the SLAC seal)

/library (library)

/phys (physics)

/pubs (SLAC periodicals, etc.)

/slac (fairly formal pages of broad interest to the SLAC working community)

/visitor (proposed polished pages dealing with specialized information that is directed at visitors to the SLAC site)

/spires (pages relating to SPIRES applications)

/welcome (polished, introductory information about SLAC)

/xorg (proposed multi-institutional groups that include SLAC, such as task forces and committees)

/grp (group- or department-oriented information)

Some second-level subdirectories (functional first):

/accel/pepii (multi-institutional)

/accel/nlc (multi-institutional)

/archive/1994

/archive/1996

/bis/acct (proposed)

/bis/budget (proposed)

/bis/commits (proposed)

/bis/pers (proposed for personnel systems)

/bis/procure

/bis/snap (proposed)

/bis/stores (proposed)

/comp/future

/comp/intro

/comp/mac

/comp/net

/comp/phys

/comp/security

/comp/telecom

/comp/unix

/comp/vendor

/comp/winnt

/edu/ssi (proposed replacement for /gen/meeting/ssi)

/edu/ssp (proposed)

/emp/emp-opp (employment opportunities)

/esh/training

/esh/slaconly

/exp/babar

/exp/e143

/exp/e144

/exp/e154

/exp/mq

/exp/sld

/icon/usr (user-contributed icons of wide interest to the SLAC community)

/pubs/beamline

/pubs/slaconly

/slac/announce

/slac/map

/slac/www

/spires/doc

/spires/form

/spires/query

/xorg/nmtf

/grp/ad

/grp/arb

/grp/bsd

/grp/cd

/grp/do

/grp/efd

/grp/pao

/grp/pep (possible for SLAC members of the PEP-II group)

/grp/pe

/grp/rd

/grp/scs

/grp/xorg (proposed for cross-SLAC groups, such as task forces and committees)

A few lower-level subdirectories (functional first):

/bis/procure/req/slaconly

/comp/telecom/phone-dir

/comp/telecom/phone-users-guide

/exp/sld/figure/top20

/slac/www/how-to-use

/slac/www/resource

/slac/www/stats

/slac/www/tool

/slac/www/tool/search

/grp/scs/net

/grp/scs/systems

/grp/xorg/wwwtech (proposed)

A few examples of conventional filenames:

/accel/pepii/home.html

/library/nobel.html

/slac/www/resource/resource.html

/grp/scs/mission.html

/grp/scs/scs.html

/grp/scs/orgchart.gif

Some exceptions in subdirectory naming:

/grp/techpubs

/grp/library

Appendix B: Group Codes

On 16 September 1996, Diana Gregory supplied the following list of group codes valid in the SPIRES BINLIST subfile:

AAO Affirmative Action Office

ACC Accounting Office

AD Accelerator Department

ARA Accelerator Research Department-A

ARB Accelerator Research Department-B

ARD Accelerator Research Department

BAS Business Applications Support Group

BBR BABAR

BSD Business Services Division

BU Budget Office

CB Crystal Ball Project

CD Controls Department

CG Computation Research Group

CYO Cryogenics Operations

DO Director's Office

DOE US Department of Energy

EA Experimental Group A

EB Experimental Group B

EC Experimental Group C

EE Experimental Group E

EFD Experimental Facilities Department

EG Experimental Group G

EI Experimental Group I

EK Experimental Group K

EPR Environmental Protection and Restoration

ESA End Station A Users

ESH Environment, Safety, and Health Division

FAC Facilities Office

FD Palo Alto Fire Department, Station 7 (ESH)

IRM Information Resource Management and Technology Transfer

IS Information Services

KLY Klystron and Microwave

LIB Library

MD Mechanical Design

ME Mechanical Engineering

MED Medical Department (ESH)

MET Metrology

MFD Mechanical Fabrications Department

NPS Nuclear Physics at SLAC

OHP Operational Health Physics (ESH)

PAO Public Affairs Office

PCD Power Conversion Department

PE Plant Engineering

PEL Physical Electronics

PEP Positron Electron Project

PER Personnel Department

PPO Planning and Assessment Department (ESH)

PRC Property Control

PUB Publications

PUR Purchasing

RD Research Division

RPG Radiation Physics (ESH)

SCS SLAC Computing Services

SEC Security

SHA Safety, Health, and Assurance Department (ESH)

SLD SLAC Large Detector

SSR Stanford Synchrotron Radiation Lab

TD Technical Division

THP Theoretical Physics

TR Travel

TSP Accelerator Theory and Special Projects

VAC Vacuum Group

WM Waste Management (ESH)


Joan Winters with Jennifer Masek