Dear All,

The phone conference to discuss HLT database access is booked for Wednesday 14th December at 17:30 CERN time. Call +41 22 767 7000 and ask for the ATLAS online database meeting organised by Richard Hawkings.

For some introduction to the ATLAS model for conditions data, have a look at section 4.4 in the ATLAS computing TDR (in particular online and Athena access, performance and scalability):
http://atlas-proj-computing-tdr.web.cern.ch/atlas-proj-computing-tdr/PDF/Computing-TDR-final-July04.pdf

I enclose below some discussion of the current status of HLT database replication possibilities, which I hope is understandable. I should apologise - I had hoped to write something more coherent as input to the meeting, but have run out of time for today.

regards,
  Richard.

--------------------------

Here is some more input to the A-Team HLT database access discussion from the last A-Team meeting (consider it a response to my 'actions' from that meeting).

Actions

>>> 1. Ask Richard to explain the mechanisms to propagate database changes from the primary to the secondary cache (subfarm) and the local cache (local disk of a farm node).

-> At the moment, this is largely undefined. There will be a 'master' online database server (Oracle), which will actually reside in the computer centre. For database scalability and network bandwidth reasons, some sort of local replication or caching will be needed to support access by the 1000s of HLT nodes. For relational databases accessed by COOL or RAL (all Athena database access should go this way) we have two possibilities:

- Make physical replicas of the data which will be needed in the next HLT run. RAL allows us to prepare these replicas as local MySQL databases or SQLite files, which can then be distributed to intermediate servers (e.g. one per rack) or even (in the case of SQLite) to each worker node. The issues here are:
    - identify the data needed for the next run
    - extract the necessary data from Oracle and copy it to the other technology
    - distribute the data (MySQL database dump, or SQLite file) to the nodes that need it
  This will clearly take some time; we need to understand whether this should be done 'from scratch' at each new 'prepare for run' step, or whether we should foresee some kind of incremental update. COOL already provides some selective replication tools which can do the 'identify + extract' step here, for data which is stored directly in COOL. For data which is stored as references from COOL to other database info, we need to provide some intelligence in the replication tool so that it also follows the links from COOL to the payload data tables. (A rough sketch of this extract-and-copy step is given below.)

- The alternative possibility is to use the RAL Frontier approach, where database SQL lookups are encoded as http requests, which can then be cached at an intermediate proxy server. Since each HLT node will make EXACTLY the same database queries at the start of run, this is potentially a very attractive approach: we would have one (or maybe more) layer(s) of proxy servers talking to the database, and the worker nodes would just talk to the proxies. Since this is done at the RAL level, it copes with both COOL and RAL database lookups, so it is quite attractive. The easiest way to update at a new run would simply be to flush all the proxies and have them re-retrieve from the database (it would probably be worth having one 'pilot' worker node go first to seed the proxy caches with the necessary data). (A second sketch of this caching idea is also given below.)
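
To make the 'identify + extract + distribute' step of the physical-replica option a bit more concrete, here is a rough sketch in Python. It is purely illustrative: the real tools would go through COOL/RAL against Oracle, whereas here sqlite3 stands in for both ends, and the table and column names (a single 'conditions' table with folder, channel, IOV and payload columns) are invented for the example.

    import sqlite3

    def extract_for_run(master_db, replica_file, run_start, run_end):
        """Copy the conditions rows whose IOV overlaps [run_start, run_end)
        from the master database into a fresh SQLite replica file."""
        src = sqlite3.connect(master_db)      # stand-in for the Oracle master
        dst = sqlite3.connect(replica_file)   # the file to ship to rack servers / nodes
        dst.execute("""CREATE TABLE IF NOT EXISTS conditions
                       (folder TEXT, channel INTEGER,
                        iov_since INTEGER, iov_until INTEGER, payload BLOB)""")
        # Select only objects whose interval of validity overlaps the run.
        rows = src.execute("""SELECT folder, channel, iov_since, iov_until, payload
                              FROM conditions
                              WHERE iov_since < ? AND iov_until > ?""",
                           (run_end, run_start))
        dst.executemany("INSERT INTO conditions VALUES (?,?,?,?,?)", rows)
        dst.commit()
        src.close()
        dst.close()

    # e.g. extract_for_run("master.db", "replica_run12345.db", t_start, t_end);
    # the replica file would then be distributed to the subfarm servers or worker nodes.

Following references from COOL folders into separate payload tables would add a further query per folder, which is exactly the extra 'intelligence' the replication tool still needs.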
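
And an equally rough sketch of the Frontier-style caching idea, again not the real Frontier/RAL code: the point is only that once a query is encoded in the request URL, identical queries from every HLT node after the first ('pilot') node are served from the proxy cache without touching Oracle. The query_master_database function and the port number are invented placeholders.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    cache = {}  # request path (encoded query) -> cached result

    def query_master_database(encoded_query):
        # Placeholder for the lookup the proxy layer would do against Oracle.
        return ("result for %s" % encoded_query).encode()

    class CachingProxy(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path not in cache:               # first (pilot) request goes to the database
                cache[self.path] = query_master_database(self.path)
            self.send_response(200)                  # every later node gets the cached copy
            self.end_headers()
            self.wfile.write(cache[self.path])

    # 'Flushing the proxies' at a new run is just cache.clear();
    # HTTPServer(("", 8000), CachingProxy).serve_forever() would run the proxy.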
The choice between these approaches depends largely on testing, which requires people, which at the moment we don't have (one of the issues for the current online database taskforce).

We also have to deal with the POOL conditions data files required by some subdetectors - here some sort of set of buffer disks is clearly required. I'm hoping we can leverage some of the DDM tools to manage these data, but again, at the moment this is completely unproven.

All of the above discussion assumes that new conditions are loaded into the HLT only at the start of a run, and this will clearly take some time (minutes?). So this is the sort of thing you can do at the start of a fill, but not at the start of a new run within a fill coming from a checkpoint transition. Different people have ideas about how one might perform 'incremental' conditions updates during a fill, but to my mind that looks very difficult without incurring significant deadtime; the architecture of the conditions database is not really designed for this (when you ask for an object, you get back its interval of validity, and you don't expect that to change 'under your feet' during a run - a small sketch at the end of this note illustrates the point).

>>> 5. Ask Richard to describe the mechanism by which new conditions data are produced and the relationship between this and the TDAQ end-of-run transition (if any). Ask about how & when they are propagated to databases. (Relates to 4.)

We can expect new conditions data to be produced during a run, and to make their way into the online conditions database. This is largely asynchronous with respect to the TDAQ end-of-run transition - data will be produced during a fill, and possibly afterwards as a result of monitoring processes. If we start the next run with some sort of cache flush, these data will be picked up automatically when Athena makes a fresh request at the start of the next run. One problem I see is synchronising the start of run: how do we decide when all the required updates are in and registered in the online conditions database? Only then can we start setting up the replicas for the HLT nodes to use (whether physical replicas or virtual Frontier ones). This process could take some time... (The last sketch below illustrates this waiting step.)
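
On the remark above about intervals of validity not changing 'under your feet': the few lines below show how an Athena-like client caches a conditions object together with the IOV returned by the database and only goes back when that interval expires, which is why an update registered inside a cached interval is simply not seen without some explicit flush. The class and the lookup callback are invented for illustration, not taken from the real IOV service.

    class ConditionsCache:
        """Caches each conditions object with the IOV the database returned."""
        def __init__(self, lookup):
            self.lookup = lookup    # function(folder, t) -> (payload, since, until)
            self.cached = {}        # folder -> (payload, since, until)

        def get(self, folder, t):
            if folder in self.cached:
                payload, since, until = self.cached[folder]
                if since <= t < until:
                    return payload  # still valid: no database access, no new data seen
            payload, since, until = self.lookup(folder, t)
            self.cached[folder] = (payload, since, until)
            return payload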
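
Finally, a sketch of the start-of-run synchronisation step mentioned under action 5: before the replicas (physical or Frontier caches) can be prepared, every folder the HLT needs must have an object registered that covers the new run. The folder list, table layout and polling interval are all invented; the point is only that this check-and-wait loop is where the start-of-run delay would come from.

    import sqlite3, time

    REQUIRED_FOLDERS = ["/LAR/Calib", "/MDT/T0", "/TRT/Align"]   # example names only

    def updates_complete(db_file, run_start):
        """True once every required folder has an object covering run_start."""
        db = sqlite3.connect(db_file)
        try:
            for folder in REQUIRED_FOLDERS:
                (n,) = db.execute("""SELECT COUNT(*) FROM conditions
                                     WHERE folder=? AND iov_since<=? AND iov_until>?""",
                                  (folder, run_start, run_start)).fetchone()
                if n == 0:
                    return False      # this folder is not yet up to date
            return True
        finally:
            db.close()

    def prepare_replicas_when_ready(db_file, run_start):
        while not updates_complete(db_file, run_start):
            time.sleep(10)            # polling: this is where the time goes
        # ... now trigger the extraction or proxy-seeding step ...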