UNIX SERVICES
CERN UNIX CORE (batch farms)
Regarding workstation farm service, CERN notes that it depends
on the data rate required. With a Gigaswitch and an FDDI ring, you
might get 8 or 10 MBps. The advantage of keeping CPU and disk
servers separate is to optimize each. Some architectures are better
for CPU-intensive work, others for I/O. DEC Alphas are just now
becoming reasonable as disk servers. Doing 8 streams together, you
can possibly get 5-6 MBps. Using remote file I/O for tape (rfio's
tpread), you can get 800 KBps for a single stream.
CERN requires 5-10 MBps to the net for disk service, CPU service and
eventually for tape service, with peak aggregates of 25 MBps. Four
years ago this required Ultranet but now you can almost do that with
segmented FDDI and a Gigaswitch. Ultranet as a company is still
shaky but probably 1-1/2 to 2 years ahead of competitors. There
have been management problems in the company; FDDI muddied the
market for them; they spent too much time on the Cray interface and
were slow to implement workstation interfaces; and finally they
charged too little for their major effort, their software.
CERN is waiting for ATM but looking at FCS. The good news about FCS
is that vendors other than IBM are participating but the bad news is
that most see this simply as a way to connect disks. HIPPI is
suddenly becoming interesting and cheap serial HIPPI cards are
coming. It appears to be a good short-term alternative to Ultranet.
Five architectures are currently supported. They have at least 1
person per architecture plus a backup person. They have found it
important to keep systems configured similarly. In the area of
accounting, Gordon Lee has been talking via email with Lois White
here at SCS. CERN uses standard UNIX accounting and massages the
data into Excel spreadsheets for color plots.
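As a sketch of that kind of massaging: per-user summaries in the style of
"sa -m" can be turned into CSV for spreadsheet import with a short awk
pipeline. The input layout below is invented for illustration, since sa's
column format varies across UNIX flavors.

```shell
#!/bin/sh
# Sketch: convert per-user accounting summaries (in the style of "sa -m")
# into CSV for a spreadsheet.  The sample input layout is an assumption.
#
# Sample input: user, command count, CPU minutes.
cat > /tmp/sa.out <<'EOF'
alice     1042    37.12cpu
bob        310    12.50cpu
carol     2208   103.77cpu
EOF

# Emit a CSV header, then strip the trailing "cpu" suffix from field 3.
awk 'BEGIN { print "user,commands,cpu_minutes" }
     { cpu = $3; sub(/cpu$/, "", cpu); printf "%s,%s,%s\n", $1, $2, cpu }' \
    /tmp/sa.out > /tmp/sa.csv

cat /tmp/sa.csv
```

The resulting CSV imports directly into a spreadsheet for plotting.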
They found linking was very slow with NFS so they use rdist to keep
local copies of libraries. This is managed by each experiment.
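A sketch of what such an rdist setup might look like; the host names and
library path are invented for illustration:

```
# Hypothetical Distfile: push a library tree from the master to farm nodes.
HOSTS = ( node01 node02 node03 )
FILES = ( /cern/pro/lib )

${FILES} -> ${HOSTS}
        install ;
        notify root ;
```

Running "rdist -f Distfile" copies only files that have changed, so each
node links against a local copy of the libraries instead of over NFS.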
The architectures and their support are as follows:
- SGI -- This is a good multiprocessor architecture, scalable up
to 36 CPUs per system and able to handle TBs of data. Graphics is
strong, I/O is good, and the SMP design is solid. Probably the best
choice. However, local SGI support is flaky (no spares, export
license problems, etc.), and SGI was late in delivering the 150 MHz
CPUs. They are currently weak in the midrange.
- H-P -- These machines have excellent price/performance. The I/O
is expensive, though, and they have no multiprocessor design. HP-UX is
rather old at the base level but has good tools. The hardware and
software are reliable but there is a long response time on bug
fixes.
- Sun -- Until recently, they had no big machine. They do have
very cheap desktops and are widely used as a reference machine by
software developers. However Solaris is a "big pain".
- DEC -- DEC really had nothing to offer until the Alpha
arrived. Their disk systems now appear to be good products with
good prices. CERN does not have enough experience with DEC-OSF/1
yet but has been hearing about software immaturity.
- IBM -- IBM has had a good relationship with CERN and has done
joint projects. So far they have only been used as tape servers but
soon will be doing disk serving too. They seem to have very good
TCP/IP support.
RAID disk technology at CERN is, and will be, used only for home
directory reliability.
CERN also has a Meiko MPP system that uses 32 Sparc processors. The
application can write into local memory to communicate with remote
nodes. It seems to have a better network than the IBM SP-x
products. It runs Solaris 2.1 on local nodes.
With respect to AFS, CERN expects to move home directories from NFS
to AFS. Right now, all "Physics data" is in NFS but they expect to
have it all AFS-accessible. They do not use automounter in CORE;
rather they have some hard and soft NFS mounts instead. At present,
the public home directories are on Sun systems and others on SGI.
It is not clear how well AFS will handle large Physics data sets,
nor how its performance compares to rfio. The AFS token lifetime
for batch jobs is a problem to consider.
CERN uses NQS. This is historical because of their Cray and Ultrix
systems. They ported NQS to Ultrix and acquired NQS internal
expertise. They added some enhancements (e.g., clustering, a
portable interface that also runs on VMS and VM, and limits on the
number of executing jobs per user). Christiane Boissat is their
developer.
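An NQS job is an ordinary shell script handed to qsub, with the
destination queue and resource limits given as embedded "#@$" directives.
A minimal sketch, where the queue name, limits, and program are all
hypothetical:

```
#!/bin/sh
#@$-q  sim_medium        # destination queue (invented name)
#@$-lT 02:00:00          # CPU time limit (illustrative)
#@$-eo                   # merge stderr into stdout

cd $HOME/run
./mc_simulation < input.dat > output.log
```

Submitted with "qsub jobscript"; the CERN enhancement noted above would
then cap how many such jobs a user may have executing at once.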
CERN UNIX (interactive systems)
CERN's customer base is about 1400 UNIX workstations across 7
supported architectures (Sun Sunos & Solaris, Apollo, HP, SGI, IBM,
DEC Ultrix, DEC OSF/1). They also support 250 X-terminals. They
have one specialist for each of 7 architectures plus one for AFS and
one for printers. Staffing is 6 staff plus 2 contractors.
Software is centralized except for Ultrix and OSF/1. A DEC "campus
contract" is being negotiated. They expect to pay 50K SFr to buy in
plus 400K SFr annual maintenance for 1000 workstations.
General services offered to clients include:
- security advice, CRACK service
- backup service and advice (they use ADSM, had a joint
project with IBM on WDSM)
- AFS service, gradually being expanded and experimenting
with ADSM
- DCE/DFS tests (trying IBM, promised from HP)
- management tool testing with FullSail, Patrol, Tivoli
(though expensive)
- COSE desktop study (VUE evaluation)
Their current challenges are:
- not enough staff as they try to build up IBM RS6000 and
SGI support; they are trying to buy contract support for
relatively mechanical jobs
- too much time answering mail/phones and not enough
producing guides and doing development work
- developing a standard UNIX environment in collaboration
with DESY
A new challenge is that AFS is becoming strategic to them. For
example, they developed the ASIS system to do software distribution.
It started as a "pull server" and then became an NFS mount server.
The next phase is to move to AFS, where clients no longer pull
copies but simply point at the AFS-resident files.
They have no site-wide NIS domain. Therefore they will go through
some conversion moving to AFS if they want a single user name space
(as presumably they will want for a CERN-wide cell).
They have bought reference machines for each platform. These are
unmodified entry level machines for standard installation testing
and to generate standard binaries for AFS serving. They can be used
to check out problems in a standard vendor-supplied environment.
They are not for general use; these machines have their own NIS
domain.
Applications support is done from other groups plus some individuals
in CN.
For disk quotas, they have no quotas on home directories except for
one specific system and for AFS. (The Novell systems have a quota
of 50 MB.) They are considering a 100 MB quota for AFS but have not
thought it through yet. They feel very strongly that disk quotas
should not be an issue for researchers.
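Should they adopt the 100 MB figure, an AFS quota is set per volume with
the standard fs command (sizes in KB); the path here is hypothetical:

```
# Hypothetical user volume path; 102400 KB = 100 MB.
fs setquota /afs/cern.ch/user/a/auser -max 102400
fs listquota /afs/cern.ch/user/a/auser
```

Because the quota lives on the volume rather than in each client's
filesystem, it is enforced uniformly from every AFS client.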
Obviously VM migration will have some effect on interactive UNIX
systems. They have about 4200 VM users/week and 300-400 VXCERN
users/week. Physicists think they'll go to UNIX based on what the
collaborations are saying. CERN is driven by what each national lab
does and the influence those labs exert on specific experiments.
CN offers the NICE environment, a "Novell Netware club" with Novell
file servers for PCs, and CUTE, the Common UNIX and X-Terminal
Environment. CUTE will offer AFS home directory servers with
backups and hierarchical storage management, AFS ASIS readonly
binary servers based on Reference Machines, and an AFS General
Staged Data Pool.
DESY Hamburg UNIX Services
UNIX began at DESY with graphics workstations, originally storage
tubes and IBM systems, then about 25 Apollo systems, then HP
9000/700 systems and then SGI systems. There are 8 HP model 735
systems for public use and about 11 model 730s for specific use.
They are moving from 730s to 735s. The SGI systems started with
MIPS 3000-based systems, originally 6 systems and a 7th for public
UNIX service, each with 6 CPUs. These became SGI Challenge systems.
There are 7 systems with a combined total of 84 MIPS 4000 CPUs,
most using the faster 150 MHz chip. There are 2 public systems
(called x4u), 1 system for Hermes, and 2 systems for H1 (called
dice), all in the R2 YP domain, and 2 systems (called Zarah) plus 50
DEC Ultrix system for Zeus in the Zeus YP domain. They are merging
the R2 and Zeus domains. In total there is about 400 GB of disk.
There are also 2 Ampex robotic tape systems each attached to one
experiment and with 3 drives each. The Ampex equipment had only
been in production a few months as of early December, 1993. DESY
also has perhaps as many as 600 Xterminals deployed. They started
with Tektronix Xterminals, had problems, and now have mostly NCD
terminals. Most are 15" monochrome, perhaps 10% are 17" color, and
just a few are 19" monochrome.
At the Hamburg site they are using the vendors' automounters and
are testing amd. They are also running some font servers for X.
The R2
group has about 12 people doing support -- 4 or 5 for special
purpose, 3 or 4 on networking, 3 for SGIs, 2 for HPs, and 1 for
Apollos. They have a standard lockout screen that lets you choose a
maximum of 20 minutes for lockout. After that someone can use the
console. When logging in from an X terminal, they try to xhost to
the same machine that has the home directory of the user to reduce
network traffic and latency. They are not using AFS but seem to be
interested for the future. They are not doing any file sharing
between UNIX and MVS and noted that MVS NFS performed poorly. Like
SLAC, they don't have the staff to support all of the public domain
packages and so enlist volunteer help from their users for such
things.
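The amd testing mentioned above replaces vendor automount maps with amd
maps. A minimal sketch of an amd map for home directories, with invented
server and path names:

```
# Hypothetical amd map for /home; hosts and paths are made up.
/defaults   type:=nfs;opts:=rw,grpid,nosuid
alice       rhost:=r2srv1;rfs:=/export/home/alice
bob         rhost:=r2srv2;rfs:=/export/home/bob
```

amd mounts these on demand and unmounts them after an idle period, which
avoids the stale static NFS mounts that vendor automounters of the era
handled poorly.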
DESY Zeuthen UNIX Services
At the Zeuthen location near Berlin, they have about 150 people with
about 70 of those as scientists and 10-20 in the Computer Center.
The location is about 300 km from the Hamburg location, 5 hours of
travel no matter which mode is chosen. The Zeuthen location is a
complete UNIX shop now, having already moved from a VM environment on
an IBM-compatible mainframe in 1991. It took about 6 months as that
was considered "sudden". They also have HP and SGI systems with
some Sun, IBM, and Convex machines. As well, they have about 10
Macs in the H1 experiment and another 50 PCs on the network. Their
net is Ethernet with TCP/IP plus some AppleTalk and DECnet for a VMS
cluster. They use a FDDI ring between the SGI file server and the
computer servers (HP, SGI, and a Convex C3210). They have about 100
GB of SCSI disk for data. They consider it to be cheaper than
robotic storage right now though they expect robotic storage to
become cheaper for the TB range in the next 1-2 years. For now,
they move data with 8mm tapes though these sometimes experience
problems. They have about 70 X-terminals.
Zeuthen expected to install an IBM SP-1 with 10 nodes in December
1993. Why SP-1? They needed parallel computing for the
theoreticians and they wanted to investigate parallelizing Monte
Carlo simulations. They really needed a large mainframe such as a
Cray T3D or Convex. They were using part of an external Cray Y-MP
and needed their own machine.
UNIX for Zeus at DESY
A view from the Zeus project was given by Till Poser. Zeus sees
central services as needed to provide processing and reconstruction
of data. They use 18 SGI processors for batch and 18 SGI processors
for reconstruction plus about 150 GB -- 60 GB for data storage, 25
GB for Monte Carlo, and the rest for staging, spooling, etc. They
are not using Fatmen. Zeus is using rdist right now to keep
libraries in synch. They use CMZ for code management and it takes
most of an FTE to incorporate code from collaborators. Monte Carlo
production should officially be done at outside institutions.
However, they have developed a tool to scavenge workstation cycles
over the Ethernet though they need to make sure those workstations
have adequate memory. The tool is called "funnel"; it runs from an
SGI "funnel server", and they could probably handle 250,000
events/week though right now they are actually doing about 2500
events/week with just a few workstations. They will use Oracle on
UNIX to record run information.
DESY UNIX Miscellanea
IP addresses are assigned within a few minutes and name servers are
updated each night. At the moment they do not keep a central
database of IP address information.
DESY has done a lot of work to develop a consistent UNIX
environment. Details may be found in WWW from the HEPiX conference
held in Pisa in 10/93.
They have been using Legato Networker for backups but are planning
on moving to IBM's ADSM (though not at the Zeuthen location).
They are acquiring LoadLeveler, presumably for the SP-1 at Zeuthen,
but would like the HP version as well. They have used an NQS-like
system from Convex called Cxbatch for their Convex system. They
have found a German supercomputing company called Genias with a
package called Codine. It seems to be similar to LoadLeveler,
possibly propagated by DEC, and has ports for HP, Sun, IBM, DEC, and
SGI. Volkswagen is using it. A major advantage might be that it
seems not to be priced based on number of clients. The cost is
about 20K DM for the first year and about 8K DM maintenance each
year thereafter.
X-terminal support on the SGIs seems to need about *MB of memory for
the first window and then 1.8-2 MB of memory for each additional
window. DESY will be adding 1/2 GB of memory to both of the x4u
public SGI machines, bringing them up to 1GB each. DESY uses the
SGI Varsity software licensing program and also an academic DEC
software licensing program (equivalent to Stanford University's
ESL). Each SGI Challenge machine has 3 Ethernet interfaces (maximum
of 4) with 3 IOPs. Each SGI Challenge has 8 Fast/Wide SCSI disk
drives. They are negotiating a maintenance contract right now.
RAL UNIX Services
RAL has built a copy of the CERN Computer Simulation Facility (CSF)
from 6 HP 735s. This is a low I/O, high cpu farm application used
and paid for mainly by the HEP people. In the farm is a file server
(on one of the 6 HP 735s) which holds the home directories. Each
machine has 48 MB of main memory and 1.4 GB of disk for the OS and
file and swap space. Job scheduling is via the CERN/NQS system.
There is also a 5-CPU Alpha 3000 OSF/1 cluster, each node having 64
MB of main memory and 2 GB of disk; one of the nodes acts as a
file server with 24 GB. This cluster is mainly used for batch.
There is no support for memory limits, at the moment, but this is
considered to be important. LoadLeveler is not available for Alpha
OSF/1 but DEC has a similar product called the Load Sharing Facility
which is also Condor based. Accounting does not work well for the
OSF/1 cluster at the moment, and there is no policy-based batch
scheduling or reporting. Upgrades to the OSF/1 cluster
will be based on user demand.
RAL's current plans are to skip AFS and go directly to DFS. They
have DFS for the IBM RS6000 and hope to have it for OSF/1 next year.
They may become an AFS client of ASIS at CERN. On the farms they
hard mount NFS filesystems and soft mount NFS elsewhere. They do
not use NFS quotas: they are considered too easy to bypass, they
make disk space usage less efficient, and users are not demanding
them. There are 2 GB of home directory space for users. The biggest
users are moved out to other partitions as they are identified.
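The hard/soft distinction RAL draws is just an NFS mount option; an
/etc/fstab sketch with invented server and path names:

```
# Farm node: hard mount -- a batch job blocks and retries if the server
# goes away, rather than getting I/O errors mid-run.
nfssrv:/export/home  /home  nfs  rw,hard,intr  0 0

# Elsewhere: soft mount -- operations eventually time out and fail
# instead of hanging the client indefinitely.
nfssrv:/export/data  /data  nfs  rw,soft,retrans=5  0 0
```

Hard mounts on the farms protect long-running jobs from transient server
outages; soft mounts elsewhere keep a desktop usable when a server dies.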