For UNIX, there are two primary network file systems in use at
SLAC today: NFS (Network File System) and AFS (Andrew File
System). They are backed up in different ways, and have different
schedules and different capabilities. However, there are a few
underlying policies that were followed in setting up both backup systems.
Backups are performed automatically on a daily basis and should be viewed primarily as a disaster recovery mechanism, not as an archival system. This means that the backups are not retained forever: the maximum is generally a year, but can be as short as 2 weeks. See below for the backup retention policies for NFS and AFS files.
By default, we do not back up NFS file systems since they are quite large and often used only as temporary work space. Special requests to back up NFS file systems should be sent via email to unix-admin.
NFS file systems that are backed up use IBM Tivoli Storage Manager (ITSM, or commonly just TSM) software. The ITSM server is currently a Sun Solaris host with a disk cache and multiple attached tape drives. TSM supports clients running on various platforms; we currently support Solaris SPARC, Solaris x86 and Red Hat Enterprise Linux x64/x86 clients.
Files are recovered from TSM by using the dsmj (GUI) or dsmc (command line) programs. If you backed up files from a flora public machine, then you can also restore them from a flora machine. Users may recover any file owned by their account using either interface, though the graphical interface may be easier to understand. See the TSM Restore Web page for instructions on restoring your files. To request the restoration of files you do not own or that were backed up directly from an NFS server, send email (with an explanation) to unix-admin.
TSM is an incremental backup system. It backs up only the files that changed since the last backup, and maintains information on the state of the client file system. It is possible to restore the file system to the last backup state, and to restore some older versions of deleted files. TSM is not configured to restore the file system to the state it had at a specific point in time, i.e., it may not be possible to restore a directory to the way it looked 4 weeks ago at 12pm or any other particular day or time. Such a policy would use significantly more tape space since the backup server would be forced to keep a copy of every file version going back to that date.
The TSM backup runs each night on TSM client machines, usually starting sometime between 12:00 AM and 5:00 AM.
TSM maintains backup data for both active and inactive file versions. An active version of a file is the most recent backup copy of a file stored in TSM for a file that currently exists on a file server or workstation. An active version remains active and exempt from deletion until: 1) replaced by a new backup version or 2) TSM detects, during an incremental backup, that the user has deleted the original file from a file server or workstation. An inactive version of a file is a copy of a backup file in TSM that either is not the most recent version, or the corresponding original file was deleted from the client file system.
Unless otherwise stated, the STANDARD retention policy is as follows:
What does all this mean? Basically, if a file still exists on disk, the last 31 days' worth of its backup versions are also on tape. But once a file is deleted from disk, the last backup copy is kept for 366 days while all older copies expire as they reach 31 days of age. So if the file is changing every day and then gets deleted, only the last 31 days' worth will remain on tape. If the file is changing weekly and then gets deleted, then only about 4-5 inactive versions will remain on tape, spanning up to 31 days.
We will notify file owners in advance if their backups are not using the STANDARD retention policy.
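The expiry rules described above can be restated as a small shell sketch. The function name tsm_version_kept and its yes/no arguments are hypothetical. The sketch assumes that ages are counted in whole days and that a version expires on the day it reaches the limit; the text does not say whether age is measured from the backup time or from the time the version became inactive. It only restates the STANDARD policy as described here and does not query or configure TSM.

```shell
# Hypothetical helper: tsm_version_kept AGE_DAYS IS_NEWEST FILE_EXISTS
# Exits 0 if a backup version would still be on tape under the STANDARD
# policy as described in the text, and non-zero if it would have expired.
#   AGE_DAYS     age of the backup version in days (assumed unit)
#   IS_NEWEST    "yes" if this is the most recent backup copy of the file
#   FILE_EXISTS  "yes" if the original file still exists on disk
tsm_version_kept() {
    local age=$1 newest=$2 exists=$3
    if [ "$newest" = yes ] && [ "$exists" = yes ]; then
        return 0                # active version: exempt from deletion
    elif [ "$newest" = yes ]; then
        [ "$age" -lt 366 ]      # last copy of a deleted file: 366 days
    else
        [ "$age" -lt 31 ]       # older versions: expire at 31 days
    fi
}
```

For example, "tsm_version_kept 40 no yes" fails because an older version aged 40 days has expired, while "tsm_version_kept 200 yes no" succeeds because the last copy of a deleted file is kept for 366 days.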
AFS backup is provided by the native AFS backup system. The unit of AFS file storage and backup is the volume. Typically, each user's home directory is a single volume. For the first level of backup, AFS creates a copy of each volume at midnight each night. This copy is called a "backup volume". You can find this backup volume from the .backup link in each home directory. If you have just deleted or damaged a file that existed at midnight, type "cd ~/.backup" to find a version of it from the previous day and copy it back into your home directory.
Note that backup volumes are automatically "mounted" for home directory volumes only. This means that you must manually mount backup volumes for group volumes or user sub-volumes if you need to recover files from the midnight copy. To do this you will need the volume name. The easiest way to get this is to execute the "fs listquota" command on the directory in question. For example, if you had accidentally removed a file from the directory /afs/slac/g/babar/data/data01, you would type
> fs listquota /afs/slac/g/babar/data/data01
Volume Name        Quota    Used  %Used  Partition
g.babar.data.01   500000  223503    45%        32%
The first column lists the name of this volume as "g.babar.data.01". To get the name of the backup volume, append ".backup", then mount it in your home directory with the command:
> fs mkmount ~/bdata01 g.babar.data.01.backup
and reference it at ~/bdata01. (You may pick any name in place of bdata01 as long as the directory doesn't already exist.) The only privileges you need for fs mkmount are insert and administer for the directory you are mounting in (such as your home directory).
We recommend doing such mounts in your home directory to avoid creating directory "loops". For example, it is tempting to mount the .backup volume in the volume you're dealing with, because that is frequently your current directory. However, if you mount a volume's .backup volume within itself, and you leave the mount there, then tomorrow and thereafter, .backup and .backup/.backup and .backup/.backup/.backup etc. will exist. This causes real problems for recursive commands like "ls -lR", "find", and "du". We also recommend that you remove the mount when you are done with it, so that it does not clutter your home directory. You can remove it with
> fs rmmount ~/bdata01
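The mount, copy, and unmount steps above can be combined into one shell function. This is a minimal sketch, not a supported tool: the function name restore_from_backup, the temporary mount name bk.$$, and the use of awk to take the volume name from the second line of "fs listquota" output are all assumptions made for illustration. It uses only the fs subcommands shown above.

```shell
# Hypothetical helper: copy one file out of a volume's midnight backup copy.
# Usage: restore_from_backup DIR FILE DEST
#   DIR   directory inside the volume the file was lost from
#   FILE  path of the file relative to the top of that volume
#   DEST  where to put the recovered copy
restore_from_backup() {
    local dir=$1 file=$2 dest=$3 vol mnt status
    # Assumption: the volume name is the first column of the second
    # output line, as in the listquota example in the text.
    vol=$(fs listquota "$dir" | awk 'NR == 2 { print $1 }')
    [ -n "$vol" ] || { echo "cannot determine volume for $dir" >&2; return 1; }
    # Mount under the home directory to avoid a .backup loop in the volume.
    mnt="$HOME/bk.$$"
    [ ! -e "$mnt" ] || { echo "$mnt already exists" >&2; return 1; }
    fs mkmount "$mnt" "$vol.backup" || return 1
    cp -p "$mnt/$file" "$dest"
    status=$?
    # Remove the temporary mount point whether or not the copy worked.
    fs rmmount "$mnt"
    return $status
}
```

A call might look like "restore_from_backup /afs/slac/g/babar/data/data01 myfile ~/myfile.recovered", where myfile stands for whatever file was lost. The function needs the same insert and administer privileges on the home directory that fs mkmount requires.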
The AFS backup is a series of full and incremental backups, designed to provide complete coverage of recent changes, and sparser coverage going back in time. A level 0 backup is a full backup of the AFS file system. A level 1 backup is an incremental backup of all changes since the previous level 0 backup. A level 2 backup is an incremental backup of all changes since the previous level 1 backup. The schedule of AFS backups is as follows:
Level 0: A full backup is performed starting at midnight on the first Sunday of each month. This backup is retained for six months. After six months, only the quarterly (January, April, July, October) backups are kept. The quarterly backups are retained for one year.
Level 1: An incremental backup is performed starting at midnight every Sunday morning (except for the first Sunday of each month). These backups are retained for two months.
Level 2: An incremental backup is performed starting at midnight Monday through Saturday. These backups are retained for two weeks.
The result of that schedule is that a volume can be retrieved from the daily backups for the first two weeks, then from the weeklies for the first two months, then from the monthlies for the first six months, and then from the quarterlies for one year.
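The schedule above can be expressed as a short shell sketch that reports which backup level runs on a given date and how long that backup is kept. The function names afs_backup_level and afs_retention are hypothetical, and the sketch assumes GNU date (for the -d option). It only restates the schedule as written here and does not consult the AFS backup system.

```shell
# Hypothetical helper: print the AFS backup level (0, 1 or 2) that the
# schedule in the text assigns to a date given as YYYY-MM-DD.
# Assumes GNU date for the -d option.
afs_backup_level() {
    local dow dom
    dow=$(date -d "$1" +%u) || return 1   # 1 = Monday ... 7 = Sunday
    dom=$(date -d "$1" +%d) || return 1   # day of month, 01..31
    dom=${dom#0}                          # drop a leading zero
    if [ "$dow" -ne 7 ]; then
        echo 2        # Monday through Saturday: daily incremental
    elif [ "$dom" -le 7 ]; then
        echo 0        # first Sunday of the month: full backup
    else
        echo 1        # any other Sunday: weekly incremental
    fi
}

# Hypothetical helper: print the retention period the text gives per level.
afs_retention() {
    case "$1" in
        0) echo "six months (quarterly backups: one year)" ;;
        1) echo "two months" ;;
        2) echo "two weeks" ;;
        *) return 1 ;;
    esac
}
```

For instance, "afs_retention $(afs_backup_level 2013-05-12)" reports two months, because 12 May 2013 was a Sunday but not the first Sunday of the month.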
AFS backups are not yet retrievable by users with the exception of those files that are located in the user's .backup subdirectory created each midnight. See the AFS Restore Web page or send email to unix-admin to request the retrieval of a file from backup.
For corrections or comments, please send email to unix-admin. Please include this URL so we know to which page you're referring.
Last modified: 09 May 2013