Production location for teststand data

MichelleGower · July 29, 2019, 7:46pm

Many of you have already heard about the following in various meetings and Slack channels, but things are now complete enough to make the official announcement.

There is now a production filesystem on which teststand data will reside at NCSA (/lsstdata/offline/teststand). This filesystem is accessible from both login and compute nodes in the lsst-dev cluster as well as all the LSP nodes. Paths will be the same on all LSST systems.

Teststand sets (read-only):

auxTel from L1 Archiver: /lsstdata/offline/teststand/auxTel/L1Archiver
auxTel from ACCS/DAQ: /lsstdata/offline/teststand/auxTel/DAQ
BOT: /lsstdata/offline/teststand/BOT

Gen2 Ingestion Notes:

Mass ingestion of data prior to 2019-05-30 was done with w_2019_21 into a fresh production repo
Output of ingestImages.py is located in /log//ingest_output.log
Since the code is still changing regularly, we will use the latest weekly (not a normal production procedure)
NCSA is not actively retrying, with each new weekly stack, the ingestion of images that previously failed to ingest

Documentation:

K-T has an updated LSST Dev Guide with the new paths ready to merge to master
LDF documentation on teststand data is available at:
https://confluence.lsstcorp.org/display/DM/Data+Transfer+to+the+LDF

Support:

If files are in the storage directory, but not in the gen2repo directory/database, check the ingest_output.log in the log/ directory for ingestion errors.
If there are images that failed to ingest that are important, please make a specific request via JIRA (assignee: Michelle Gower)
If /lsstdata/offline/teststand isn’t visible on a particular machine/container, please report via an IHS ticket in JIRA.
Requests for new sets of teststand data should also be made via JIRA (assignee: Michelle Butler)
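
Before filing a ticket, the two checks above (is the filesystem visible, and does the ingest log mention the file) can be scripted. The following is only a rough sketch: the format of ingest_output.log is not described in this post, so it does a plain substring match on the file name, and the function names are made up for illustration.

```python
from pathlib import Path


def teststand_visible(root="/lsstdata/offline/teststand"):
    """Return True if the teststand filesystem is visible on this machine."""
    return Path(root).is_dir()


def find_ingest_messages(log_path, image_name):
    """Return (line_number, line) pairs from the log that mention image_name.

    Assumption: the log is plain text and names the file it is ingesting.
    This is a substring match, not a parser for the real log format.
    """
    matches = []
    with open(log_path, "r", errors="replace") as log:
        for line_number, line in enumerate(log, start=1):
            if image_name in line:
                matches.append((line_number, line.rstrip("\n")))
    return matches
```

If `find_ingest_messages` returns nothing for a file that is present in the storage directory, that is worth mentioning in the JIRA request.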

Temporary Location:
While waiting for “global” access to this production filesystem, auxTel L1 Archiver images have been duplicated in a temporary location (/project/production/tmpdataloc).

For those using this temporary location via chained Gen2 repositories, the _parent link will need to be updated from /project/production/tmpdataloc/… to /lsstdata/offline/teststand/… .
(e.g., K-T is currently working with folks to determine the time of the switchover of /project/shared/auxTel/_parent)
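
For anyone repointing their own chained repo, here is a minimal sketch of the update. It assumes that _parent is a symbolic link inside the repo directory and that you supply the full new target yourself (the part of the path after the prefix is elided above, so the sketch does not try to guess it). The function name and arguments are hypothetical.

```python
import os


def repoint_parent(repo_dir, new_target,
                   expected_old_prefix="/project/production/tmpdataloc"):
    """Point repo_dir/_parent at new_target if it still uses the old prefix.

    Returns True if the link was changed, False if it was left alone.
    Assumption: _parent is a symbolic link, not a regular file.
    """
    link = os.path.join(repo_dir, "_parent")
    if not os.path.islink(link):
        raise ValueError(f"{link} is not a symbolic link")

    current = os.readlink(link)
    if not current.startswith(expected_old_prefix):
        # Already repointed (or pointing somewhere unexpected): do nothing.
        return False

    # Create the new link under a temporary name, then rename it over the
    # old one so that _parent never goes missing in between.
    tmp_link = link + ".new"
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link)
    return True
```

Shared repos such as /project/shared/auxTel should be left to whoever is coordinating the switchover.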

Tucson L1 Archiver & ACCS/DAQ Image Schedule:
July 22nd: Both temp and production locations actively receiving any new Tucson L1 Archiver images
Aug 5th: Halt new data arrival in temp location
Sep 5th: Delete data in temp location (or earlier with users’ permission)

SLAC BOT Image Schedule:
July 22nd: Both temp and production locations actively receiving any new SLAC BOT raw images
Prior to next large BOT batch (early Aug): Halt new data arrival in temp location
Sep 5th: Delete data in temp location (or earlier with users’ permission, since it takes up a lot of /project space)

RHL · July 29, 2019, 8:13pm

I’m a little confused.

Will e.g. /lsstdata/offline/teststand/auxTel/L1Archiver be the place to point a butler, or is there another level of indirection required?
Why are we distinguishing the two sources of auxTel data? I thought that there was agreement to make the headers identical, and I thought that they were already close enough that they can be read by the same version of obs_lsst. It would be much better if all auxTel data were in the same place.
Are the reruns directories in place, and writeable by users?
Are CALIB directories in place, or what is the plan for this? It’s a bit trickier as this part of the workflow is less worked out, but users should not need to worry about where the calibs are.

ktl · July 29, 2019, 8:33pm

There is another level of indirection required if you want writable reruns (which already exists for AuxTel, hence the _parent repointing), and the Gen2 repo is a directory level below the given path.

As I understand it, the two sources of auxTel data (CCS and Archiver) will always differ slightly. They are in adjacent directories, so both are available if needed.

I don’t think CALIB directories are quite ready to be permanent, immutable data products. If that opinion is correct, they should be within or linked from the /project/shared indirect repos, not the /lsstdata repos, as is currently the case.

MichelleGower · July 29, 2019, 8:47pm

A further comment on the separate repositories for CCS and Archiver images. By repository standards, we are getting multiple versions of the same “image” from the Archiver and CCS (and, within CCS, from the mcm and non-mcm data), and these can’t live in the same repository. The previous decision was to start with one repo for Archiver and one repo for CCS because of these collisions. We can revisit the decision if there is no need to have both and if there are clear rules for collisions; right now the rule would be that the first one taken, transferred, and ingested wins.
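
To make the "first one wins" rule concrete, here is a toy sketch of what a single combined repo would do with colliding images. It is purely illustrative and is not how ingestImages.py works; the image keys and source labels are hypothetical.

```python
def ingest_first_wins(arrivals):
    """Apply a first-ingested-wins rule to a stream of images.

    arrivals: iterable of (image_key, source) pairs in ingestion order,
    where image_key is whatever the repository treats as the same "image".
    Returns (accepted, rejected): accepted maps each key to the source that
    arrived first; rejected lists the later arrivals that collided.
    """
    accepted = {}
    rejected = []
    for image_key, source in arrivals:
        if image_key in accepted:
            rejected.append((image_key, source))
        else:
            accepted[image_key] = source
    return accepted, rejected


# Hypothetical arrival order for two images seen by both systems.
accepted, rejected = ingest_first_wins([
    ("img-001", "Archiver"),
    ("img-001", "CCS"),
    ("img-002", "CCS"),
    ("img-002", "Archiver"),
])
```

With that arrival order, img-001 would keep the Archiver copy and img-002 the CCS copy, so which version survives depends only on timing. Separate repos avoid that.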

RHL · July 29, 2019, 8:57pm

If we’re getting two copies of all the files, I’m totally OK with different homes. Thanks.

RHL · July 29, 2019, 8:59pm

I think we should only be advertising the linked repos with the reruns in; the other is an internal detail.
Re CALIB, we need to do something. I’m happy to have them only in the linked repos – as in the previous para, they are the only ones users should ever know about.