I would like to completely delete all contents within the postISRCCD and icExp subdirectories of an output run (without resorting to rm -rf, unless that’s recommended…). I am using v23_0_1 of the LSST pipelines.
The following two commands appear to delete all files within the icExp and postISRCCD directories:
butler prune-datasets $REPO --purge DECam/runs/prune_test/20220926T234050Z --datasets icExp DECam/runs/prune_test/20220926T234050Z
butler prune-datasets $REPO --purge DECam/runs/prune_test/20220926T234050Z --datasets postISRCCD DECam/runs/prune_test/20220926T234050Z
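The same two invocations can also be written as a loop over dataset types, which may be handier once more dataset types need purging. This is only a sketch: it reuses exactly the arguments shown above, and the BUTLER variable and the purge_dataset_types function name are hypothetical conveniences (not part of the butler CLI) so that the loop can be dry-run with BUTLER="echo butler".

```shell
# Sketch: purge several dataset types from one run with a single loop.
# REPO is the butler repository path, as in the commands above.
# BUTLER is a hypothetical override hook; set BUTLER="echo butler" for a dry run.
BUTLER=${BUTLER:-butler}
RUN=DECam/runs/prune_test/20220926T234050Z

purge_dataset_types() {
    for dataset_type in "$@"; do
        # Same invocation as above: purge from RUN, restricted to one dataset type.
        $BUTLER prune-datasets "$REPO" --purge "$RUN" --datasets "$dataset_type" "$RUN" || return 1
    done
}

# Example usage:
# purge_dataset_types icExp postISRCCD
```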
But many empty directories are left behind:
$ find repo/DECam/runs/prune_test/20220926T234050Z/postISRCCD -type f | wc -l
0
$ find repo/DECam/runs/prune_test/20220926T234050Z/postISRCCD -type d | wc -l
34
$ find repo/DECam/runs/prune_test/20220926T234050Z/icExp -type f | wc -l
0
$ find repo/DECam/runs/prune_test/20220926T234050Z/icExp -type d | wc -l
50
In this run there are only 25 CCDs’ worth of outputs, so the number of empty directories left behind isn’t huge. But I soon want to process millions of CCDs, in which case it seems that I’d be left with O(10 million) inodes consumed by empty directories within icExp and postISRCCD. What is the recommended way to get rid of these directories, in addition to the files that they once contained?
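For concreteness, the filesystem-level fallback I have in mind would look something like the sketch below. It assumes GNU or BSD find (the -empty and -delete options are not in POSIX), and prune_empty_dirs is just a hypothetical helper name. I have not confirmed that removing directories behind the butler's back like this is safe or recommended.

```shell
# Hypothetical helper: remove every empty directory at or below the given path.
# -delete implies depth-first traversal, so nested empty directories are
# removed before their parents are tested with -empty.
# Directories that still contain files are left untouched.
prune_empty_dirs() {
    find "$1" -type d -empty -delete
}

# Example usage (paths from the run above):
# prune_empty_dirs repo/DECam/runs/prune_test/20220926T234050Z/postISRCCD
# prune_empty_dirs repo/DECam/runs/prune_test/20220926T234050Z/icExp
```

Note that the starting directory itself is also removed if it ends up empty.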
Also, a related but different question, which maybe should be its own separate forum topic: is there a way to embed butler prune-datasets directly into my YAML-defined pipeline? I think that’d be preferable to running my pipeline and then running separate butler prune-datasets commands after the fact.
I did a brief/superficial search for any instances of “prune” within all of the YAML files in our v23_0_1 installation but came up empty:
$ find lsst_stack_v23_0_1/stack -name "*.yaml" | wc -l
4523
$ find lsst_stack_v23_0_1/stack -name "*.yaml" -exec grep -i prune {} \; | wc -l
0
Thanks very much.