SLAC ESD Software Engineering Group
UNIX SYSTEM ADMIN
NFS Server Setup on Sun Storage 7310
1) Sun Unified Storage 7310 cluster cabled with one J4400 disk array
2) allocate three nodenames/IPs for each node
one for NFS server (node 0): node0_nfs_ip/x
connect to node0's NET port 2
one for ILOM/SP (remote console) on LCLS-UTIL: node0_sp_ip/x
connect to node0's NET MGT port
one for BUI on LCLS-UTIL: node0_bui_ip/x
connect to node0's NET port 0
Will get renamed to mccfs2
- mccfs4 172.27.8.13 NGE2
- mccfs4-mgt 172.27.7.16 NET MGT
- mccfs4-bui 172.27.7.25 NGE0
one for NFS server (same as node0): node0_nfs_ip/x
connect to node1's NET port 2
one for ILOM/SP (remote console) on LCLS-UTIL: node1_sp_ip/x
connect to node1's NET MGT port
one for BUI on LCLS-UTIL: node1_bui_ip/x
connect to node1's NET port 1
- mccfs4 (same as node0) for active/passive clustering
- mccfs3-mgt 172.27.7.15 NET MGT
- mccfs3-bui 172.27.7.24 NGE1
1) set up DHCP for ILOM/SP
Configure the DHCP servers (mccsrv01 and mccsrv02), and add node0_sp_ip and node1_sp_ip.
This makes the SPs ready for setting up the BUI.
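The DHCP entries might look like the following ISC dhcpd.conf sketch; the MAC addresses below are placeholders (use each SP's actual MAC, e.g. from its chassis label), and the exact file layout on mccsrv01/mccsrv02 may differ:

```
host mccfs4-mgt {
    hardware ethernet 00:14:4f:00:00:01;   # placeholder: node0's SP MAC
    fixed-address 172.27.7.16;             # node0_sp_ip
}
host mccfs3-mgt {
    hardware ethernet 00:14:4f:00:00:02;   # placeholder: node1's SP MAC
    fixed-address 172.27.7.15;             # node1_sp_ip
}
```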
2) set up BUI connection
start SP/console to access node0's console
configure the network for node0's BUI (node0_bui_ip)
Hostname, DNS, IP, netmask, etc.
- nodename: node0_bui
- domain: slac.stanford.edu
- default gateway: 172.27.4.1
- mask: 255.255.252.0
- 188.8.131.52 mccsrv01
- 184.108.40.206 mccsrv02 (later)
use node0's BUI (https://node0_bui:215 for Cluster setup)
ssh node0's SP to monitor node0's console
ssh node1's SP to monitor node1's console
ensure that node1 hasn't been initially set up; otherwise, perform a factory reset
Click 'START' to begin. Set up the cluster using the displayed cluster setup screens.
1) Enter node1's BUI name (node1_bui) when prompted.
2) set up the network configuration
nge0: "node0 bui" Datalink, "node0 bui" Interface (BUI for node0)
(lock the BUI network port)
nge1: "node1 bui" Datalink, "node1 bui" Interface (BUI for node1)
Configure Routing to add a default route for node1_bui
Interface: node1_bui (nge1)
nge2: "node0 nfs" Datalink, "node0 nfs" Interface (IP for node0's NFS server)
Configure Routing to add a default route for node0
Interface: node0 (nge2)
All IPs must be static (otherwise, cluster won't work)
Enter DNS, NTP, NIS, etc.
Skip NIS, LDAP, AD
3) Storage Pool Setup
Create one storage pool
Pool name: mccsp
Data Profile: Mirrored
Log Profile: Striped
Skip "Phone Home" setup
4) assign ownership
Assign the node0 BUI network (NGE0) to node0_bui and set it as private
Assign the node1 BUI network (NGE1) to node1_bui
Assign the node0 NFS network (NGE2) to node0_bui
Assign the storage pool mccsp to node0_bui
Don't select failback when asked to confirm the change.
5) perform a "Failback"
Verify that all "singleton" resources have relocated to their host (owner)
6) BUI to node1
Go to "Configuration -> Cluster" and make node1's BUI network interface private
on node1 (the only configuration done directly on node1 in the whole cluster setup)
7) Validate Cluster Setup
On node1's Configuration -> Cluster window, perform a "Takeover". Monitor
node0's console. Verify that all of node0's singleton resources have failed
over to node1. Use "ping mccfs2" and "arp mccfs2" to find mccfs2's MAC address and make sure it belongs to node1.
On node1's Configuration -> Cluster window, perform a "Failback". Monitor
node0 and node1 consoles. Verify that all of node0's singleton resources
have failed back to node0. Use "ping mccfs2" and "arp mccfs2" to find mccfs2's MAC address and make sure it belongs to node0.
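The MAC check above can be scripted; this is a sketch (the `parse_mac` and `mac_of` helper names are ours, and the `arp -a` output format parsed here varies by OS):

```shell
# parse_mac: read 'arp' output on stdin and print the first MAC address found.
# Matches colon-separated MACs such as 00:14:4f:aa:bb:cc (17 characters).
parse_mac() {
    awk '{ for (i = 1; i <= NF; i++)
               if ($i ~ /^[0-9a-fA-F:]+$/ && length($i) == 17) { print $i; exit } }'
}

# mac_of NAME: refresh the ARP cache with one ping, then report NAME's MAC.
mac_of() {
    ping -c 1 "$1" > /dev/null 2>&1
    arp -a "$1" | parse_mac
}
```

After a takeover, `mac_of mccfs2` should print a MAC belonging to node1; after the failback it should print a MAC belonging to node0 again.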
BUI to node0, commit (this step is only for initial cluster setup and validation)
Shares -> PROJECTS (+ Add Project)
Shares -> General
Read only (unchecked)
Updated access time on read (unchecked)
Non-blocking mandatory locking (unchecked)
Data deduplication (unchecked)
Data compression (off)
Checksum (Fletcher4 (Standard))
Cache device usage (All data and metadata)
Synchronous write bias (Latency)
Database record size (8k)
Additional replication (Normal (Single Copy))
Virus scan (unchecked)
Prevent destruction (unchecked)
Permissions (RWX RWX RX)
LUNS (use default)
Shares -> PROJECTS -> Protocols
Share Mode (None)
Disable setuid/setgid file creation (unchecked)
Prevent clients from mounting subdirectories (unchecked)
This is required for IOC NFS mounting.
Anonymous user mapping (nobody)
Character encoding (default)
Security mode (Default (AUTH_SYS))
Network: LCLSCA (RW), PCD (RW), FACETCA (RW), DMZ (R)
Root Access (unchecked, except for lcls-daemon3.slac.stanford.edu and mccfs5.slac.stanford.edu)
Note that the nodenames must be fully qualified. The nodes with root access will be used to configure the shared filesystems
with root privilege.
a) Root Access for 172.27.8.0/22 has been enabled to help with data migration. It will be disabled afterwards.
b) Configuration -> Services -> NFS: Minimum supported version set to NFSv2 to support RTEMS-based hard IOCs, which run NFSv2.
Share mode (Read only)
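To confirm from a client that the appliance advertises NFSv2 for the RTEMS IOCs, the portmapper registration can be checked. A sketch; the helper names are ours, and it assumes `rpcinfo -p` lines of the form `100003 2 udp 2049 nfs`:

```shell
# parse_nfs_versions: read 'rpcinfo -p' output on stdin and print the
# distinct NFS protocol versions registered (service name "nfs").
parse_nfs_versions() {
    awk '$5 == "nfs" { print $2 }' | sort -u
}

# nfs_versions HOST: query HOST's portmapper and list its NFS versions.
nfs_versions() {
    rpcinfo -p "$1" | parse_nfs_versions
}
```

For example, `nfs_versions mccfs2` should list 2 (as well as 3) after the change.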
Shares -> PROJECTS -> Access
Shares -> PROJECTS -> Snapshots
Shares -> PROJECTS -> Replication
Create shares in project mccfs
Shares -> PROJECTS -> mccfs
(Note: for /usr/local, uncheck "Inherit from project" and manually set the mountpoint to /export/mccfs/usr/local; for all other shares, check "Inherit from project")
Data migration source (None)
Permissions (RWX RX RX)
Inherit mountpoint (checked)
Reject non UTF-8 (checked)
Case sensitivity (Mixed)
Create shares: home, u1, u2, u3
For each share (or filesystem created), be sure to update its access: Shares -> Shares, click each filesystem, select Access:
Root Directory Access:
Permissions RWX RX RX
Below is a snapshot of services enabled/disabled on our SS7310 system.
25-Oct-2013 Turned on SYSLOG
Click on "Configuration"
Click on button to the left of Syslog (edit Service Configuration)
Selected Protocol: Updated Syslog (RFC 5424)
Added mccsyslog: 220.127.116.11
Changes will be propagated to the other head automatically
1. Firmware Upgrade Procedure
Takeover and Failback:
A takeover takes all of the resources from the partner head, regardless of which head those resources are assigned to. A failback merely gives back any resources assigned to the partner head.
To upgrade node0, do a takeover from the partner head (for example, node1). This will cause a reboot of node0, and when it comes back up it is ready for an upgrade. When upgrading node1, do a takeover from node0, which will cause a
reboot of node1; once it comes up, it is ready for an upgrade.
2. Support Bundles
- Create a Support Bundle and send it to TechSupport
Go to Maintenance -> System in the BUI. To generate a support bundle, click the (+) icon next to Support Bundles. It will first
generate the bundle and then send it to TechSupport. Since we don't have a connection to TechSupport, it will fail in the upload
process and retry. At this point, click Cancel to stop the upload; instead, download the bundle to the local desktop machine
and ftp it to TechSupport. In case we want to check the bundle file ourselves, ftp the bundle file to public
(ftp ftp.slac.stanford.edu, use binary mode), unzip and untar on a Solaris machine, and examine e.g. the cifs/cifs.out file.
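The unzip-and-untar step might look like this; `bundle.tar.gz` and `cifs/cifs.out` are illustrative names (actual bundle file names differ):

```shell
# unpack_bundle FILE: unpack a downloaded support bundle (a gzipped tar)
# into the current directory. Piping through gunzip avoids GNU tar's 'z'
# flag, which Solaris tar does not have.
unpack_bundle() {
    gunzip -c "$1" | tar xf -
}

# Typical use after downloading:
#   unpack_bundle bundle.tar.gz
#   less cifs/cifs.out
```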
3. Configuration Backup
Configuration Backup tasks can be accomplished using the Configuration Backup area near the bottom of the
Maintenance > System screen in the BUI.
To create a backup, click the "Backup" button above the list of saved configurations and follow the instructions.
You will be prompted to enter a descriptive comment for the backup.
This configuration file can be sent to TechSupport for debugging, but we should NEVER use it to restore the system,
as I found it is very destructive - it can wipe out all configurations in Storage Pool, Projects and Shares.
a. all bundle files and configuration backup files are saved to Z:\unixadmin\SS7310 for safety.
b. Snapshots of system configurations are also kept in Z:\unixadmin\SS7310\Configuration Snapshot for reference.
Don't restore from a configuration backup, as it is a very destructive operation. Do it only when one of the clustered nodes (or heads) fails and needs to be replaced with a new one.
NFS Server Migration Plan
- configure mccfs2 to mount /newu1, /newlocal and /newhome from mccfs4
mount mccfs4:/export/mccfs/home /newhome
mount mccfs4:/export/mccfs/usr/local /newlocal
mount mccfs4:/export/mccfs/u1 /newu1
2. on mccfs2 as root, run the following (one at a time) and make sure each completes successfully
rsync -avSH /usr/local/ /newlocal
rsync -avSH /u1/ /newu1
rsync -avSH /home/ /newhome
3. on lcls-srv20
mount /u1, /home, /usr/local from mccfs4
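On the clients, the new mounts might be persisted with entries like these. This is a hypothetical Linux /etc/fstab sketch (Solaris clients would use /etc/vfstab, which has a different column layout), and the mount options shown are illustrative, not from the original procedure:

```
# device                          mountpoint   type  options       dump pass
mccfs4:/export/mccfs/u1          /u1          nfs   rw,hard,intr  0 0
mccfs4:/export/mccfs/home        /home        nfs   rw,hard,intr  0 0
mccfs4:/export/mccfs/usr/local   /usr/local   nfs   rw,hard,intr  0 0
```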
- reconfigure all IOCs to use the new mounting path (in $IOC/All/Prod) and in DHCP for FACET and LCLS
- rename mccfs2 (nodename/IP) to mccfs5
This is equivalent to disabling the NFS server, but has additional advantages. All NFS clients should stop writing to the NFS server. mccfs5 will be kept to continue hosting the Matlab license server, printing server, account management, and system file distribution. We can test all of these functions on mccfs5.
- on mccfs5 as root, make a final data migration (again, one at a time)
mount /newu1, /newlocal, /newhome on mccfs4
rsync -avSH --delete /u1/ /newu1
rsync -avSH --delete /usr/local/ /newlocal
rsync -avSH --delete /home/ /newhome
- rename mccfs4 to mccfs2 (nodename/IP)
- reboot all NFS clients and IOCs in an orderly fashion
- reconfigure mccfs5 to mount mccfs2 and reboot mccfs5
- test applications on Sunray, OPIs, servers (daemon and interactive)
- test IOC applications (check screen logging, save/restore, edm, etc.)
It is a restriction of the 7000 Series Appliance that all shares must be exported for NFS out of the base directory of /export. The restriction is imposed due to the underlying OS implementation and the way NFS works. Because of this, we are limited to having all shares be under the /export directory with no way to change this.
factory reset ( CLI > maintenance > system > factoryreset)
drwxr-xr-x+ 3 root root 5 Feb 24 15:53 .
The plus sign (+) indicates the presence of an ACL (access control list). ACL is an extension to the normal *nix permissions system which increases security by allowing the system more fine-tuning in who is permitted to access specific files.
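The '+' marker can be tested for in a script. A sketch; the `has_acl` helper name is ours, and it only inspects the mode string printed by `ls -ld`:

```shell
# has_acl PATH: succeed (exit 0) when 'ls -ld PATH' shows the ACL marker '+'
# appended to the mode bits, e.g. drwxr-xr-x+ .
has_acl() {
    mode=$(ls -ld "$1" | awk '{ print $1 }')
    case "$mode" in
        *+) return 0 ;;   # ACL present
        *)  return 1 ;;   # plain POSIX permissions only
    esac
}

# e.g.  has_acl /export/mccfs/home && echo "ACL set"
```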
In CLI: configuration -> services -> nfs -> nfsd_servers.
Author: Ken Brobeck and Jingchen Zhou. Last edited on 02/10/11