LCLS Controls Oracle Maintenance Scheduling and Test plans

_________________________________________________________________________________________

Known Controls Oracle database patching and maintenance schedule:

KNOWN outages:

-         Machine reboot for OS patching: SLACPROD

 

Security patches (approximate schedule):

 

SCCS’s security patch procedure description:

 

Account password changes and policy:

 

 _______________________________________________________________________________________

 

Controls Oracle overview - diagram

________________________________________________________________________________________

 

Database schemas, applications and instances relevant to LCLS operation – table view

Schema              | Application/usage                                                       | PROD Instance | PROD Host     | DEV Instance
AIDA                | control system name and value server, used for matlab apps, SCORE, etc. | SLACPROD      | slac-oracle03 | SLACDEV
SCORE               | save restore app, and APEX maintenance                                  | MCCO          | mccora2       | SLACDEV
LCLS Infrastructure | device list, model database, APEX web reporting                         | SLACPROD      | slac-oracle03 | SLACDEV
IRMIS               | currently booted PV list, feeds AIDA names list, IOC Info report        | SLACPROD      | slac-oracle03 | SLACDEV
Artemis             | Remedy problem reporting                                                | SLACPROD      | slac-oracle03 | SLACDEV
SLC database        | SLC signals used for LCLS                                               | mcc           | mcc           | mccdev

 

________________________________________________________________________________________

 

SCCS Oracle maintenance

SCCS database administration needs to apply periodic updates and patches to all Oracle instances at SLAC.  When possible, we will attempt to negotiate the timing of the updates to fit the LCLS operations calendar, and send out timely information.  Other groups also use the SLAC Oracle instances, and their schedules must be taken into account as well.

 

As is the case for all computer systems, sometimes emergency maintenance is necessary.  Email alerts will be sent as soon as we hear of any such maintenance.

 

________________________________________________________________________________

Links:

SCCS outage calendar:  https://www-internal.slac.stanford.edu/comp-out/outage.aspx

 

IRMIS: http://www.slac.stanford.edu/grp/cd/soft/database/IRMISSLAC.htm

AIDA: http://www.slac.stanford.edu/grp/cd/soft/aida/index.html

SCCS database page: https://www-internal.slac.stanford.edu/database/dbteam

________________________________________________________________________________________________

Update and patch procedure:

This is the order of events for each update and patch, as much as possible:

  1. SCCS does the Oracle upgrade/patch operation to the SLACDEV instance first.
  2. All the application users/developers test on SLACDEV, and any problems with the upgrade/patch can be resolved.
  3. Once the green light is given, SCCS does the upgrade/patch on SLACPROD.
  4. For those applications that are present on SLACPROD, users/developers test again.
  5. Finally, SCCS operates on MCCO and
  6. Of course, a final round of testing.
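
The ordering above can be sketched as a small lookup that lists which schemas need retesting after each instance is patched. This is only an illustrative sketch: the instance order is taken from the procedure above and the schema-to-instance mapping from the table in this document, while the names ROLLOUT_ORDER, SCHEMAS, schemas_to_test and rollout_plan are hypothetical and not part of any existing tool.

```python
# Illustrative sketch only: all names here are hypothetical.
# Rollout order from the procedure above; testing follows each stage.
ROLLOUT_ORDER = ["SLACDEV", "SLACPROD", "MCCO"]

# Schema -> (PROD instance, DEV instance), copied from the table in this document.
SCHEMAS = {
    "AIDA": ("SLACPROD", "SLACDEV"),
    "SCORE": ("MCCO", "SLACDEV"),
    "LCLS Infrastructure": ("SLACPROD", "SLACDEV"),
    "IRMIS": ("SLACPROD", "SLACDEV"),
    "Artemis": ("SLACPROD", "SLACDEV"),
    "SLC database": ("mcc", "mccdev"),
}


def schemas_to_test(instance):
    """Return the schemas that live on the given instance (PROD or DEV)."""
    return [
        name
        for name, (prod, dev) in SCHEMAS.items()
        if instance in (prod, dev)
    ]


def rollout_plan():
    """Return (instance, schemas to retest) pairs in rollout order."""
    return [(inst, schemas_to_test(inst)) for inst in ROLLOUT_ORDER]


if __name__ == "__main__":
    for inst, schemas in rollout_plan():
        print(f"{inst}: patch, then test {', '.join(schemas)}")
```

Keeping the mapping as plain data means that a change to the table only needs a matching change in SCHEMAS; the SLC database instances (mcc, mccdev) are listed for completeness but are not part of the rollout order given above.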

________________________________________________________________________________________

 

Test Plans:

·        Each Oracle application has its own test plan.

·        The user/owner will execute their test plan following each update or patch, first on Development, then on Production (see Update and patch procedure above).

 

SCORE Application, and APEX application (Debbie/Partha)

The SCORE application has an existing test plan on Sharepoint:

LCLS Document Storage → 01 - LCLS Systems → Electron Beam Systems → Controls → SaveRestore → SCORE Mini Test Plan.doc

 

The SCORE APEX application has an existing test plan on Sharepoint:

LCLS Document Storage → 01 - LCLS Systems → Electron Beam Systems → Controls → SaveRestore → SCORE APEX Application Mini Test Plan.doc

(Since the APEX mini test plan hasn't been run in quite a while, the first time through this I may have to update it to point to more relevant files, etc.)

 

AIDA (Greg/Bob Hall)

Per AIDA: yes, I have a test setup for AIDA. Depending on the change that has been made, I exercise different parts of it. Bob, I believe, also has a comprehensive test for AIDA, so if we were asked to formalize this, we'd base the unit testing on his script, which thoroughly exercises the AIDA unit tests.

 

For this DB patch, I did the following.

 

1) Brought up cmlogviewer on production, to monitor messages.

2) Exercised basic operations: ran the basic AIDA unit test for the Test server in each AIDA network (dev and prod).

3) Exercised all peers: if 2) was successful, ran at least 1 unit test on each AIDA peer (both VMS peers, and each unix peer). In fact I did more than 1, but that's really overkill for a DB test.

4) Exercised AIDAWEB.

5) Exercised command prompt aidaget, aidalist.

6) Exercised matlab aidaget, aidalist.

 

IRMIS (Judy)

The day after the update (i.e. the first crawl following the update):

  1. Check the daily e-mails thoroughly.
  2. Check IOC report:  http://mccas0.slac.stanford.edu/crawler/ioc_report.html
  3. Check IOC Info web page: https://seal/IRMISQueries/
  4. Run some queries with the IRMIS gui.
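
Steps 2 and 3 above could be partly scripted with a simple reachability check. The sketch below only confirms that each page answers with HTTP 200; it does not validate the page contents, and steps 1 and 4 remain manual. The URLs are the ones listed above, while the names PAGES, http_status and check_pages are hypothetical. It assumes the script runs from a host that can reach both internal servers.

```python
import urllib.error
import urllib.request

# URLs copied from the checklist above; the dictionary keys are just labels.
PAGES = {
    "IOC report": "http://mccas0.slac.stanford.edu/crawler/ioc_report.html",
    "IOC Info": "https://seal/IRMISQueries/",
}


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 404 or 500).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, TLS problem, etc.
        return None


def check_pages(pages, fetch=http_status):
    """Map each page label to True if fetch(url) returned 200."""
    return {name: fetch(url) == 200 for name, url in pages.items()}
```

Calling check_pages(PAGES) returns a dictionary such as {"IOC report": True, "IOC Info": False}, depending on which pages respond. The fetch parameter is there so the check can be exercised without network access by passing a stand-in function.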

 

Model database (Greg)

For the model_database, I just checked correct operation of APEX reporting.

This should include upload.

 

LCLS Infrastructure and APEX applications (Elie, Andrea)

 

CATER (Bill Allen)

Run test suite and examine results.

 

CAPTAR (?)