UNIX File System Backups at SLAC

UNIX File Systems

Two network file systems are in primary use on UNIX at SLAC today: NFS (Network File System) and AFS (Andrew File System). They are backed up by different systems, on different schedules, and with different recovery capabilities. However, a few underlying policies were followed in setting up both backup systems.

Backups are performed automatically every day and should be viewed primarily as a disaster recovery mechanism, not as an archival system. This means that backups are not retained forever: the maximum retention is generally one year, and some backups are kept for as little as two weeks. See below for the backup retention policies for NFS and AFS files.

NFS Backup and Recovering Your Own Files

By default, we do not back up NFS file systems, since they are quite large and often used only as temporary work space. Special requests to back up NFS file systems should be sent via email to unix-admin.

Those NFS file systems that are backed up use IBM Tivoli Storage Manager (ITSM) software. The ITSM server is currently a Sun Solaris SPARC host with a disk cache and multiple attached tape drives. ITSM supports clients running on various platforms; we currently support Solaris SPARC and Red Hat Enterprise Linux x64/x86 clients.

Files are recovered from ITSM by using the dsm (GUI) or dsmc (command line) programs on flora public machines. Users may recover any file owned by their account using either interface, though the graphical interface may be easier to use. See the ITSM Restore Web page for instructions on restoring your files. To request the restoration of files you do not own, send email (with an explanation) to unix-admin.
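As a rough illustration of the command-line route, the sketch below wraps a dsmc restore call in a small shell function. The helper name restore_own_file, the DSMC override variable, and the example paths are hypothetical and not part of any SLAC tooling; -inactive and -pick are standard dsmc restore options that list active and inactive versions and let you choose which to restore. The ITSM Restore Web page remains the authoritative set of instructions.

```shell
# Hypothetical helper: restore a file you own from ITSM, picking the version
# interactively. Set DSMC=echo to preview the command without running it.
restore_own_file() {
    src=$1    # file or pattern to restore (hypothetical example: /nfs/mygroup/notes.txt)
    dest=$2   # destination; a destination directory should end with a slash
    "${DSMC:-dsmc}" restore -inactive -pick "$src" "$dest"
}
```

For example, `DSMC=echo restore_own_file /nfs/mygroup/notes.txt /tmp/restored/` prints the arguments that would be passed to dsmc, which is a cheap way to check a command before running it on a flora public machine.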

ITSM is an incremental backup system. It backs up only the files that changed since the last backup, and maintains information on the state of the client file system. It is possible to restore the file system to the last backup state, and to restore some older versions of deleted files. ITSM is not configured to restore the file system to the state it had at a specific point in time, i.e., it may not be possible to restore a directory to the way it looked on June 5th at 12pm or any other particular day or time. Such a policy would use significantly more tape space when files change frequently.

ITSM Schedule and Retention Policy

The ITSM backup runs nightly, usually starting sometime between midnight and 5:00AM on ITSM client machines.

ITSM maintains backup data for both active and inactive file versions. The active version is the most recent backup of a file that still exists on the client's file system. Inactive versions are the older backups of such a file, and all backups of a file that has been deleted from the file system but still exists on tape. Unless otherwise stated, the STANDARD retention policy is as follows:

  • Up to 30 versions of a particular file are kept on tape as long as the file exists on the client's file system.
  • Only the most recently backed up version on tape is active. All other versions (up to 29) on tape are inactive.
  • Once a file on tape goes inactive, it expires after 30 days and gets deleted off tape.
  • If a file is deleted from a client's disk, all tape versions will become inactive and start expiring off tape as they reach 30 days. The last remaining inactive version (which is also the most current version) will be kept for 1 year, after which it expires and gets deleted too.
  • As long as a file remains on a client's disk, it will remain active on tape and not expire.

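    What does all this mean? A file that still exists on disk keeps its most recent backup on tape indefinitely, along with up to 29 older versions, each of which expires 30 days after it goes inactive. Once a file is deleted from disk, its most recent version can still be restored for 1 year. We will notify a file owner in advance if a backup is kept for less than the standard 1 year period.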
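The standard retention rules listed above can be sketched as a small shell function. This is an illustration only, under stated assumptions: version_status is a hypothetical name, one year is taken as 365 days, a version is treated as expired once it has been inactive for exactly 30 (or 365) days, and the 30-version cap is not modeled.

```shell
# version_status ON_DISK IS_NEWEST DAYS_INACTIVE
#   ON_DISK        yes/no: does the file still exist on the client's disk?
#   IS_NEWEST      yes/no: is this the most recently backed up version?
#   DAYS_INACTIVE  days since this version went inactive (ignored if active)
# Prints "active", "inactive", or "expired".
version_status() {
    on_disk=$1
    is_newest=$2
    days_inactive=$3
    if [ "$on_disk" = yes ] && [ "$is_newest" = yes ]; then
        # Newest version of a file still on disk: never expires.
        echo active
    elif [ "$on_disk" = no ] && [ "$is_newest" = yes ]; then
        # Last remaining version of a deleted file: kept for 1 year (assumed 365 days).
        if [ "$days_inactive" -lt 365 ]; then echo inactive; else echo expired; fi
    else
        # Any older version: expires 30 days after going inactive.
        if [ "$days_inactive" -lt 30 ]; then echo inactive; else echo expired; fi
    fi
}
```

For instance, `version_status no yes 200` describes the newest backup of a file deleted 200 days ago, which is still within its one-year window.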

    AFS Backup and Recovering Your Own Files

    AFS backup is provided by the native AFS backup system. The unit of AFS file storage and backup is the volume. Typically, each user's home directory is a single volume. For the first level of backup, AFS creates a copy of each volume at midnight each night. This copy is called a "backup volume". You can find this backup volume from the .backup link in each home directory. If you have just deleted or damaged a file that existed at midnight, type "cd ~/.backup" to find a version of it from the previous day and copy it back into your home directory.
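Copying a file back from the midnight copy might look like the sketch below. The function name recover_from_backup and the BACKUP_DIR and TARGET_DIR override variables are hypothetical; by default the function reads from ~/.backup and writes into your home directory, as described above.

```shell
# Hypothetical helper: copy one file from last night's backup volume back
# into the home directory, overwriting the current (damaged) copy if any.
recover_from_backup() {
    rel=$1   # path relative to the home directory, e.g. notes/todo.txt
    src="${BACKUP_DIR:-$HOME/.backup}/$rel"
    dst="${TARGET_DIR:-$HOME}/$rel"
    if [ ! -e "$src" ]; then
        echo "no copy of $rel in the backup volume" >&2
        return 1
    fi
    # Recreate the parent directory in case it was deleted along with the file.
    mkdir -p "$(dirname "$dst")" || return 1
    # -p preserves the modification time and permissions of the backup copy.
    cp -p "$src" "$dst"
}
```

Running `recover_from_backup notes/todo.txt` would restore ~/notes/todo.txt as it existed at midnight; changes made since then are not in the backup volume.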

    Note that backup volumes are automatically "mounted" for home directory volumes only. This means that you must manually mount backup volumes for group volumes or user sub-volumes if you need to recover files from the midnight copy. To do this you will need the volume name. The easiest way to get this is to execute the "fs listquota" command on the directory in question. For example, if you had accidentally removed a file from the directory /afs/slac/g/babar/data/data01, you would type

    > fs listquota /afs/slac/g/babar/data/data01
    
    Volume Name                   Quota      Used %Used   Partition
    g.babar.data.01              500000    223503   45%         32%
    
    The first column lists the name of this volume as "g.babar.data.01". To get the name of the backup volume, append ".backup", then mount it in your home directory with the command:
    > fs mkmount ~/bdata01 g.babar.data.01.backup
    

    and reference it at ~/bdata01. (You may pick any name in place of bdata01 as long as the directory doesn't already exist.) The only privileges you need for fs mkmount are insert and administer for the directory you are mounting in (such as your home directory).

    We recommend doing such mounts in your home directory to avoid creating directory "loops". For example, it is tempting to mount the .backup volume in the volume you're dealing with, because that is frequently your current directory. However, if you mount a volume's .backup volume within itself and leave the mount there, then from tomorrow onward .backup, .backup/.backup, .backup/.backup/.backup, and so on will all exist. This causes real problems for recursive commands like "ls -lR", "find", and "du". We also recommend that you remove the mount when you are done with it, so that it does not clutter your home directory. You can remove it with

    > fs rmmount ~/bdata01
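The lookup and mount steps above can be combined in a short script. This is a sketch, not a supported tool: backup_volume_name and mount_backup_volume are hypothetical names, the FS variable exists only so the fs command can be substituted for testing, and the parsing assumes fs listquota prints a "Volume Name" header followed by one data line, as in the example output shown earlier.

```shell
# Read `fs listquota` output on stdin and print the backup volume name,
# i.e. the first column of the data line with ".backup" appended.
backup_volume_name() {
    awk '/Volume Name/ { seen = 1; next }
         seen && NF    { print $1 ".backup"; exit }'
}

# Hypothetical helper: mount the backup volume of DIR at MOUNTPOINT.
# MOUNTPOINT must not already exist and should be outside DIR's own volume.
mount_backup_volume() {
    dir=$1
    mountpoint=$2
    vol=$("${FS:-fs}" listquota "$dir" | backup_volume_name) || return 1
    if [ -z "$vol" ]; then
        echo "could not determine the volume name for $dir" >&2
        return 1
    fi
    "${FS:-fs}" mkmount "$mountpoint" "$vol"
}
```

With these definitions, `mount_backup_volume /afs/slac/g/babar/data/data01 ~/bdata01` would perform the same two steps as the manual example, and `fs rmmount ~/bdata01` removes the mount afterwards.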
    

    AFS Backup Schedule and Retention Policy

    The AFS backup is a series of full and incremental backups, designed to provide complete coverage of recent changes, and sparser coverage going back in time. A level 0 backup is a full backup of the AFS file system. A level 1 backup is an incremental backup of all changes since the previous level 0 backup. A level 2 backup is an incremental backup of all changes since the previous level 1 backup. The schedule of AFS backups is as follows:

    Level 0: A full backup is performed starting at midnight on the first Sunday of each month. This backup is retained for six months. After six months, only the quarterly (January, April, July, October) backups are kept. The quarterly backups are retained for one year.

    Level 1: An incremental backup is performed starting at midnight every Sunday morning (except for the first Sunday of each month). These backups are retained for two months.

    Level 2: An incremental backup is performed starting at midnight Monday through Saturday. These backups are retained for two weeks.
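To make the schedule concrete, the sketch below maps a calendar day to the backup level that starts at midnight on that day. The function name afs_backup_level is hypothetical; the inputs follow the conventions of `date +%w` (0 for Sunday) and `date +%d` (day of month), and the "first Sunday" test relies on the fact that the first Sunday of a month always falls on day 1 through 7.

```shell
# afs_backup_level DAY_OF_WEEK DAY_OF_MONTH
#   DAY_OF_WEEK   0 = Sunday ... 6 = Saturday (as printed by `date +%w`)
#   DAY_OF_MONTH  1..31 (as printed by `date +%d`, leading zero allowed)
# Prints 0 (full), 1 (weekly incremental), or 2 (daily incremental).
afs_backup_level() {
    dow=$1
    dom=${2#0}   # drop a leading zero such as the "08" printed by date +%d
    if [ "$dow" -ne 0 ]; then
        echo 2   # Monday through Saturday
    elif [ "$dom" -le 7 ]; then
        echo 0   # first Sunday of the month
    else
        echo 1   # any other Sunday
    fi
}
```

Calling `afs_backup_level "$(date +%w)" "$(date +%d)"` reports the level scheduled for today.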

    The result of that schedule is that a volume can be retrieved from the daily backups for the first two weeks, then from the weeklies for the first two months, then from the monthlies for the first six months, and then from the quarterlies for one year.
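The coverage described in the paragraph above can be expressed as a lookup from the age of the version you need to the finest backup series that still holds it. This is only an approximation under stated assumptions: afs_backup_source is a hypothetical name, and months are rounded to 30 days (two months as 60 days, six months as 180 days, one year as 365 days).

```shell
# afs_backup_source AGE_DAYS
# Prints which backup series still covers a version AGE_DAYS old:
# daily, weekly, monthly, quarterly, or none.
afs_backup_source() {
    age_days=$1
    if   [ "$age_days" -le 14 ];  then echo daily      # level 2, kept two weeks
    elif [ "$age_days" -le 60 ];  then echo weekly     # level 1, kept two months
    elif [ "$age_days" -le 180 ]; then echo monthly    # level 0, kept six months
    elif [ "$age_days" -le 365 ]; then echo quarterly  # quarterly level 0, kept one year
    else echo none
    fi
}
```

The older the version, the coarser the series: a state from 90 days ago can only be recovered as of the nearest monthly full backup, not as of a particular day.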

    Users cannot yet retrieve files from the AFS backups themselves, except for files in the user's .backup subdirectory, which is created each midnight. See the AFS Restore Web page or send email to unix-admin to request the retrieval of a file from backup.



    For corrections or comments, please send email to unix-admin. Please include this URL so we know to which page you're referring.

    Last modified: 25 July 2006