UNIX SERVICES

CERN UNIX CORE (batch farms)

With regard to workstation farm service, CERN notes that the design depends on the data rate required. With a Gigaswitch and an FDDI ring, you might get 8 or 10 MBps. The advantage of keeping CPU and disk servers separate is that each can be optimized: some architectures are better for CPU-intensive work, others for I/O. DEC Alphas are just now becoming reasonable as disk servers. Running 8 streams together, you can possibly get 5-6 MBps. Using remote file I/O for tape (rfio's tpread), you can get 800 KBps for a single stream.

CERN requires 5-10 MBps to the net for disk service, CPU service, and eventually tape service, with peak aggregates of 25 MBps. Four years ago this required Ultranet, but now you can almost do it with segmented FDDI and a Gigaswitch. Ultranet as a company is still shaky but probably still 1-1/2 to 2 years ahead of its competitors. There have been management problems in the company; in addition, FDDI muddied the market for them, they spent too much time on the Cray interface and were slow to implement workstation interfaces, and finally they charged too little for their major effort, their software.

CERN is waiting for ATM but looking at FCS. The good news about FCS is that vendors other than IBM are participating; the bad news is that most see it simply as a way to connect disks. HIPPI is suddenly becoming interesting, and cheap serial HIPPI cards are coming; it appears to be a good short-term alternative to Ultranet.

Five architectures are currently supported. They have at least 1 person per architecture plus a backup person. They have found it important to keep systems configured similarly. In the area of accounting, Gordon Lee has been talking via email with Lois White here at SCS. CERN uses standard UNIX accounting and massages the data into Excel spreadsheets for color plots.
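
Purely as an illustration (not CERN's actual procedure), the massaging step could look something like the sketch below, which turns a per-user summary from the standard accounting tools (e.g. "sa -m") into a comma-separated file that a spreadsheet such as Excel can import. The column positions assumed here are guesses and would have to be matched to the local accounting output.

#!/usr/bin/env python
# Hypothetical sketch: convert an "sa -m" style per-user accounting
# summary into CSV for import into a spreadsheet.  The assumption that
# the user name is in column 1 and the CPU figure in column 4 is mine,
# not CERN's; adjust the indices to the local accounting output.
import csv
import subprocess
import sys

def main():
    # Per-user totals from the standard process-accounting data.
    out = subprocess.run(["sa", "-m"], capture_output=True, text=True).stdout
    writer = csv.writer(sys.stdout)
    writer.writerow(["user", "cpu"])
    for line in out.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue                              # skip headers and blank lines
        writer.writerow([fields[0], fields[3]])   # assumed column positions

if __name__ == "__main__":
    main()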

They found that linking was very slow over NFS, so they use rdist to keep local copies of libraries; this is managed by each experiment.
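
A minimal sketch of the idea, assuming hypothetical host and directory names (node01-03, /cern/pro/lib, /usr/local/cern/lib), is shown below: a generated Distfile tells rdist to push the master library tree to local disk on each farm node, transferring only files that are out of date, so that linking then runs against local disk rather than over NFS.

#!/usr/bin/env python
# Minimal sketch (hypothetical hosts and paths): keep local copies of
# the program libraries in sync with a master tree using rdist.
import subprocess
import tempfile

DISTFILE = """\
LIBDIRS = ( /cern/pro/lib )
NODES   = ( node01 node02 node03 )

${LIBDIRS} -> ${NODES}
\tinstall /usr/local/cern/lib ;
"""

def main():
    with tempfile.NamedTemporaryFile("w", suffix=".dist", delete=False) as f:
        f.write(DISTFILE)
        distfile = f.name
    # -f names the Distfile; rdist only copies files that have changed.
    subprocess.run(["rdist", "-f", distfile], check=True)

if __name__ == "__main__":
    main()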

The architectures and their support are as follows:

  1. SGI -- This is a good multiprocessor architecture, scalable up to 36 CPUs per system and able to handle TBs of data. Graphics is strong, the I/O is good, and the SMP design is good. It is probably the best choice. However, local SGI support is flaky (no spares, export license problems, etc.), and SGI was late in delivering the 150 MHz CPUs. They are currently weak in the midrange.
  2. H-P -- These machines have excellent price/performance. The I/O is expensive, though, and they have no multiple-CPU design. HP-UX is rather old at the base level but has good tools. The hardware and software are reliable, but response time on bug fixes is long.
  3. Sun -- Until recently, they had no big machine. They do have very cheap desktops and are widely used as reference machines by software developers. However, Solaris is a "big pain".
  4. DEC -- DEC really had nothing to offer until the Alpha arrived. Their disk systems now appear to be good products with good prices. CERN does not have enough experience with DEC OSF/1 yet but has been hearing about software immaturity.
  5. IBM -- IBM has had a good relationship with CERN and has done joint projects. So far IBM systems have only been used as tape servers, but they will soon be doing disk serving too. They seem to have very good TCP/IP support.

RAID disk technology at CERN is (or will be) used only for home directory reliability.

CERN also has a Meiko MPP system with 32 SPARC processors. An application can write into local memory to communicate with remote nodes. It seems to have a better network than the IBM SP-x products. It runs Solaris 2.1 on the nodes.

With respect to AFS, CERN expects to move home directories from NFS to AFS. Right now all "Physics data" is in NFS, but they expect to have it all AFS-accessible. They do not use the automounter in CORE; they use a mixture of hard and soft NFS mounts instead. At present the public home directories are on Sun systems and others are on SGI. It is not clear how AFS will work with large Physics data sets or how its performance will compare to rfio. The AFS token lifetime for batch jobs is a problem to consider.
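
To make the token-lifetime concern concrete, a hedged sketch follows of a wrapper that checks, before a long job starts, whether the AFS token held will outlive the expected job length. This is not a CERN tool, and the parsing of the "tokens" command output ("[Expires <Mon> <day> <HH:MM>]") is an assumption about its format.

#!/usr/bin/env python
# Hypothetical sketch: warn if the AFS token will expire before a batch
# job of a given length finishes.  The "tokens" output format assumed
# here ("[Expires Oct  7 04:58]") should be checked on the local system.
import re
import subprocess
import sys
from datetime import datetime

def token_expiry():
    out = subprocess.run(["tokens"], capture_output=True, text=True).stdout
    m = re.search(r"\[Expires (\w+ +\d+ +\d+:\d+)\]", out)
    if not m:
        return None                       # no token held
    stamp = " ".join(m.group(1).split())  # e.g. "Oct 7 04:58"
    return datetime.strptime(f"{datetime.now().year} {stamp}", "%Y %b %d %H:%M")

def main():
    job_hours = float(sys.argv[1]) if len(sys.argv) > 1 else 8.0
    expiry = token_expiry()
    if expiry is None:
        print("no AFS token held")
        return
    remaining = (expiry - datetime.now()).total_seconds() / 3600.0
    if remaining < job_hours:
        print(f"warning: token expires in {remaining:.1f} h, "
              f"job expected to run {job_hours:.1f} h")

if __name__ == "__main__":
    main()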

CERN uses NQS. This is historical, stemming from their Cray and Ultrix systems. They ported NQS to Ultrix and acquired internal NQS expertise. They added some enhancements (e.g., cluster support, a portable interface that also runs on VMS and VM, and limits on the number of executing jobs per user). Christiane Boissat is their developer.
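
As a purely illustrative sketch of the last enhancement (not the CERN implementation), a per-user limit on executing jobs amounts to the following selection rule when deciding which queued jobs may start; the limit of 3 and the sample job data are invented for the example.

#!/usr/bin/env python
# Illustrative sketch, not CERN's code: enforce a per-user limit on the
# number of executing jobs when deciding which queued jobs may start.
from collections import Counter

MAX_RUNNING = 3   # invented limit for the example

def select_startable(running_owners, queued_jobs):
    """running_owners: user names with a job executing now.
    queued_jobs: (job_id, owner) pairs in submission order.
    Returns the job_ids that may start without exceeding the limit."""
    counts = Counter(running_owners)
    startable = []
    for job_id, owner in queued_jobs:
        if counts[owner] < MAX_RUNNING:
            startable.append(job_id)
            counts[owner] += 1            # this job now counts as running
    return startable

if __name__ == "__main__":
    running = ["alice", "alice", "bob"]
    queued = [(101, "alice"), (102, "alice"), (103, "bob"), (104, "carol")]
    print(select_startable(running, queued))   # -> [101, 103, 104]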

CERN UNIX (interactive systems)

CERN's customer base is about 1400 UNIX workstations across 7 supported architectures (Sun SunOS & Solaris, Apollo, HP, SGI, IBM, DEC Ultrix, DEC OSF/1). They also support 250 X-terminals. They have one specialist for each of the 7 architectures plus one for AFS and one for printers. Staffing is 6 staff plus 2 contractors.

Software is centralized except for Ultrix and OSF/1. A DEC "campus contract" is being negotiated. They expect to pay 50K SFr to buy in plus 400K SFr annual maintenance for 1000 workstations.

General services offered to clients include:

Among their current challenges, a new one is that AFS is becoming strategic to them. For example, they developed the ASIS system to do software distribution. It started as a "pull server" and then became an NFS mount server. The next phase is to move to AFS and point at the files rather than pull them.

They have no site-wide NIS domain. Therefore, moving to AFS will involve some conversion if they want a single user name space (as they presumably will for a CERN-wide cell).

They have bought reference machines for each platform. These are unmodified entry-level machines used for standard installation testing and to generate standard binaries for AFS serving; they can also be used to check out problems in a standard vendor-supplied environment. They are not for general use, and they have their own NIS domain.

Applications support is provided by other groups plus some individuals in CN.

They have no quotas on home directories except on one specific system and under AFS. (The Novell systems have a quota of 50 MB.) They are considering a 100 MB quota for AFS but have not thought it through yet. They feel very strongly that disk quotas should not be an issue for researchers.

Obviously VM migration will have some effect on interactive UNIX systems. They have about 4200 VM users/week and 300-400 VXCERN users/week. Physicists think they'll go to UNIX based on what the collaborations are saying. CERN is driven by what each national lab does and the influence those labs exert on specific experiments.

CN offers the NICE environment (a "Novell Netware club" with Novell file servers for PCs) and CUTE, the Common UNIX and X-Terminal Environment. CUTE will offer AFS home directory servers with backups and hierarchical storage management, AFS ASIS read-only binary servers based on the Reference Machines, and an AFS General Staged Data Pool.

DESY Hamburg UNIX Services

UNIX began at DESY with graphics workstations, originally storage tubes and IBM systems, then about 25 Apollo systems, then HP 9000/700 systems, and then SGI systems. There are 8 HP model 735 systems for public use and about 11 model 730s for specific uses; they are moving from 730s to 735s. The SGI systems started with MIPS R3000-based machines, originally 6 systems plus a 7th for public UNIX service, each with 6 CPUs. These became SGI Challenge systems. There are 7 systems with a combined total of 84 MIPS R4000 CPUs, most using the faster 150 MHz chip: 2 public systems (called x4u), 1 system for Hermes, and 2 systems for H1 (called dice), all in the R2 YP domain, plus 2 systems (called Zarah) and 50 DEC Ultrix systems for Zeus in the Zeus YP domain. They are merging the R2 and Zeus domains. In total there is about 400 GB of disk. There are also 2 Ampex robotic tape systems, each attached to one experiment and holding 3 drives; the Ampex equipment had been in production only a few months as of early December 1993.

DESY also has perhaps as many as 600 X-terminals deployed. They started with Tektronix X-terminals, had problems, and now have mostly NCD terminals. Most are 15" monochrome, perhaps 10% are 17" color, and just a few are 19" monochrome.

They are testing amd at the Hamburg site; currently they use the automounters from the vendors. They are also running some font servers for X. The R2 group has about 12 people doing support -- 4 or 5 for special purposes, 3 or 4 on networking, 3 for SGIs, 2 for HPs, and 1 for Apollos. They have a standard lockout screen that lets you choose a maximum of 20 minutes for lockout; after that, someone else can use the console. When logging in from an X-terminal, they try to xhost to the same machine that holds the user's home directory, to reduce network traffic and latency. They are not using AFS but seem to be interested in it for the future. They are not doing any file sharing between UNIX and MVS and noted that MVS NFS performed poorly. Like SLAC, they don't have the staff to support all of the public domain packages and enlist volunteer help from their users for such things.

DESY Zeuthen UNIX Services

At the Zeuthen location near Berlin, they have about 150 people, about 70 of whom are scientists and 10-20 of whom are in the Computer Center. The location is about 300 km from Hamburg, 5 hours of travel no matter which mode is chosen. The Zeuthen location is a complete UNIX shop now, having already moved from a VM environment on an IBM-compatible mainframe in 1991; the move took about 6 months, which was considered "sudden". They have HP and SGI systems along with some Sun, IBM, and Convex machines. As well, they have about 10 Macs in the H1 experiment and another 50 PCs on the network. Their net is Ethernet with TCP/IP plus some AppleTalk and DECnet for a VMS cluster. They use an FDDI ring between the SGI file server and the compute servers (HP, SGI, and a Convex C3210). They have about 100 GB of SCSI disk for data. They consider it to be cheaper than robotic storage right now, though they expect robotics to become cheaper for the TB range in the next 1-2 years. For now, they move data with 8mm tapes, though these sometimes give problems. They have about 70 X-terminals.

Zeuthen expected to install an IBM SP-1 with 10 nodes in December 1993. Why an SP-1? They needed parallel computing for the theoreticians and wanted to investigate parallelizing Monte Carlo simulations. They really needed a large machine such as a Cray T3D or a Convex. They had been using part of an external Cray Y-MP and needed their own machine.

UNIX for Zeus at DESY

A view from the Zeus project was given by Till Poser. Zeus sees that central services are needed to provide processing and reconstruction of data. They use 18 SGI processors for batch and 18 SGI processors for reconstruction, plus about 150 GB of disk -- 60 GB for data storage, 25 GB for Monte Carlo, and the rest for staging, spooling, etc. They are not using FATMEN. Zeus is using rdist right now to keep libraries in sync. They use CMZ for code management, and it takes most of an FTE to incorporate code from collaborators. Monte Carlo production should officially be done at outside institutions. However, they have developed a tool to scavenge workstation cycles over the Ethernet, though they need to make sure those workstations have adequate memory. The tool is called "funnel" and runs from an SGI "funnel server"; they could probably handle 250,000 events/week, though right now they are actually doing about 2500 events/week with just a few workstations. They will use Oracle on UNIX to record run information.
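
Purely as a sketch of the idea, not the actual funnel code, a cycle scavenger of this kind can be pictured as a small server that hands out blocks of event numbers to whichever workstation asks for work next; the port number, block size, and one-line protocol below are invented for the example.

#!/usr/bin/env python
# Hypothetical sketch of a "funnel"-style cycle scavenger: a central
# server hands out blocks of Monte Carlo event numbers to workstations
# that connect and ask for work.  Port, block size and the text
# protocol are invented for this example.
import socketserver

TOTAL_EVENTS = 250000    # illustrative weekly target
BLOCK_SIZE = 500         # events handed out per request
next_event = 0           # first event number not yet assigned

class WorkHandler(socketserver.StreamRequestHandler):
    def handle(self):
        global next_event
        if self.rfile.readline().strip() != b"GET_WORK":
            return
        if next_event >= TOTAL_EVENTS:
            self.wfile.write(b"DONE\n")
            return
        first, last = next_event, min(next_event + BLOCK_SIZE, TOTAL_EVENTS) - 1
        next_event = last + 1
        # The workstation generates events first..last and ships the
        # output back separately (not shown).
        self.wfile.write(f"EVENTS {first} {last}\n".encode())

if __name__ == "__main__":
    # The default TCPServer is single-threaded, so the counter needs no lock.
    with socketserver.TCPServer(("", 9099), WorkHandler) as server:
        server.serve_forever()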

DESY UNIX Miscellanea

IP addresses are assigned within a few minutes and name servers are updated each night. At the moment they do not keep a central database of IP address information.
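
A sketch of the nightly name-server update step, under assumed file names, input format, and zone handling (none of which come from DESY), might look like this:

#!/usr/bin/env python
# Hypothetical sketch: regenerate BIND A records each night from a flat
# host list.  The input format ("hostname ip-address" per line) and the
# file names are assumptions made for this example.
HOSTLIST = "hosts.list"          # e.g. "x4u01 192.0.2.17" per line
ZONE_FRAGMENT = "hosts.zone"     # included by the real zone file

def main():
    records = []
    with open(HOSTLIST) as f:
        for line in f:
            line = line.split("#")[0].strip()   # drop comments and blanks
            if not line:
                continue
            host, addr = line.split()[:2]
            records.append(f"{host:<24}IN  A   {addr}")
    with open(ZONE_FRAGMENT, "w") as out:
        out.write("\n".join(records) + "\n")
    # The name server would then be reloaded (e.g. with ndc reload).

if __name__ == "__main__":
    main()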

DESY has done a lot of work to develop a consistent UNIX environment. Details may be found on WWW from the HEPiX conference held in Pisa in October 1993.

They have been using Legato Networker for backups but are planning on moving to IBM's ADSM (though not at the Zeuthen location).

They are acquiring LoadLeveler, presumably for the SP-1 at Zeuthen, but would like the HP version as well. They have used an NQS-like system from Convex called Cxbatch for their Convex system. They have found a German supercomputing company called Genias with a package called Codine. It seems to be similar to LoadLeveler, is possibly promoted by DEC, and has ports for HP, Sun, IBM, DEC, and SGI. Volkswagen is using it. A major advantage might be that it does not seem to be priced based on the number of clients. The cost is about 20K DM for the first year and about 8K DM per year for maintenance thereafter.

X-terminal support on the SGIs seems to need about *MB of memory for the first window and then 1.8-2 MB of memory for each additional window. DESY will be adding 1/2 GB of memory to both of the x4u public SGI machines, bringing them up to 1 GB each. DESY uses the SGI Varsity software licensing program and also an academic DEC software licensing program (equivalent to Stanford University's ESL). Each SGI Challenge machine has 3 Ethernet interfaces (with a maximum of 4) and 3 IOPs, plus 8 Fast/Wide SCSI disk drives. They are negotiating a maintenance contract right now.

RAL UNIX Services

RAL has built a copy of the CERN Computer Simulation Facility (CSF) from 6 HP 735s. This is a low-I/O, high-CPU farm used and paid for mainly by the HEP people. In the farm is a file server (on one of the 6 HP 735s) which holds the home directories. Each machine has 48 MB of main memory and 1.4 GB of disk for the OS, file space, and swap space. Job scheduling is via the CERN/NQS system. There is also a 5-CPU Alpha 3000 OSF/1 cluster; each node has 64 MB of main memory and 2 GB of disk, and one of the nodes acts as a file server with 24 GB. This cluster is mainly used for batch. There is no support for memory limits at the moment, but this is considered to be important. LoadLeveler is not available for Alpha OSF/1, but DEC has a similar product called the Load Sharing Facility, which is also Condor-based. Accounting does not work well for the OSF/1 cluster at the moment, and there is no policy-based batch scheduling or reporting. Upgrades to the OSF/1 cluster will be based on user demand.
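
To make concrete what a per-job memory limit would mean in practice (this is only a sketch under assumed numbers and job names, not a DEC or RAL facility), a batch wrapper could cap the address space of the job process before it starts:

#!/usr/bin/env python
# Hypothetical sketch: run a batch job under a per-process address-space
# limit, illustrating what "memory limits" would mean in practice.  The
# limit value and the default command are invented for the example.
import resource
import subprocess
import sys

MEM_LIMIT_BYTES = 64 * 1024 * 1024    # e.g. 64 MB, matching node memory

def limit_memory():
    # Runs in the child just before exec; RLIMIT_AS caps the total
    # address space the job may allocate (where the OS supports it).
    resource.setrlimit(resource.RLIMIT_AS, (MEM_LIMIT_BYTES, MEM_LIMIT_BYTES))

def main():
    cmd = sys.argv[1:] or ["./simulation_job"]    # hypothetical job
    sys.exit(subprocess.run(cmd, preexec_fn=limit_memory).returncode)

if __name__ == "__main__":
    main()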

RAL's current plans are to skip AFS and go directly to DFS. They have DFS for the IBM RS6000 and hope to have it for OSF/1 next year. They may become an AFS client of ASIS at CERN. On the farms they hard-mount NFS filesystems; elsewhere they soft-mount NFS. They do not use NFS quotas: they are considered too easy to bypass, they make disk space usage less efficient, and users are not demanding them. There is 2 GB of home directory space for users. The biggest users are moved out to other partitions as they are identified.