Gen3 reference catalogs: one top-level symlink rather than per-file symlinks?

I am using a Gen3 version of the LSST pipelines (v23_0_1) to process DECam data. My goal is to perform large sets of raw DECam image reductions with pointings scattered across most of the sky. I have a large directory of ~130k reference catalog files (one per HTM trixel) that I would like to make use of for this purpose.

In Gen2 (v19_0_0), I could accomplish this by having a single symlink within DATA/ref_cats (this Butler repo was named DATA), for instance DATA/ref_cats/ps1_pv3_3pi_20170110, pointing to the directory containing the ~130k per-trixel reference catalog files. I believe that I’ve tried all of the -t options for butler ingest-files in v23_0_1, but have not been able to achieve this same effect. For instance, -t symlink seems to create ~130k symlinks, one per reference catalog file, rather than just one symlink for the directory containing the set of reference catalogs. My butler ingest-files command looks like:

butler ingest-files -t symlink $REPO ps1_dr1 refcats ps1_dr1.ecsv

And I end up with a directory named:

$REPO/refcats/ps1_dr1

That contains one symlink per reference catalog (~130k symlinks in this case).

Is there a way to link my large set of reference catalog files with only one Butler repo symlink like I could in Gen2? I primarily ask because, for certain file systems that I work with, inodes are at a premium. Thanks very much.

The transfer option you need is “direct”. This adds the files to the repository without copying them into the datastore and without creating any symlinks or using extra inodes. Instead, “direct” records the full path to each file where it already lives.
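As a rough sketch, the command from the question would change only in its transfer mode. The two helper names below (`ingest_refcats_direct`, `count_symlinks`) are hypothetical wrappers added for illustration and are not part of the `butler` CLI. The sketch assumes `$REPO` and `ps1_dr1.ecsv` are the same repo path and table file used in the question.

```shell
#!/bin/sh

# Same ingest as in the question, with "-t direct" instead of "-t symlink".
# Wrapped in a function (hypothetical name) so nothing runs until it is called.
ingest_refcats_direct() {
    butler ingest-files -t direct "$REPO" ps1_dr1 refcats ps1_dr1.ecsv
}

# Hypothetical helper: print the number of symlinks found under a directory,
# or 0 if the directory does not exist.
count_symlinks() {
    if [ -d "$1" ]; then
        find "$1" -type l | wc -l | tr -d ' '
    else
        echo 0
    fi
}
```

Calling `ingest_refcats_direct` and then `count_symlinks "$REPO/refcats/ps1_dr1"` is one way to check the result. In a repository where no earlier `-t symlink` ingest was run, the count would be expected to be 0, since the files are referenced in place.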

See this section of the docs for how to ingest a Gen2 refcat into a Gen3 repo: How to generate an LSST reference catalog — LSST Science Pipelines

Thanks, @timj @parejkoj ! I really thought that I recalled trying -t direct, but apparently I hadn’t. Yes, I now see that -t direct just stores the full reference catalog file paths without making any new symlinks/files, which is great. I verified that ingesting the reference cats with -t direct works all the way through calibrating a test DECam CCD. Thanks again!