"No mapper" error when re-running coaddDriver

hsc
butler

(George Becker) #1

I am working on multi-band HSC data and attempting to re-run coaddDriver after a crash caused by a lack of disk space. I’ve defined the mapper following the processing tutorial (https://dmtn-023.lsst.io/), but I’m getting the error “RuntimeError: No mapper assigned to Repository”.

The contents of the rerun sci_2 were deleted prior to re-running the command below.

I’m new to using the pipeline, so thanks for any suggestions!

$ coaddDriver.py DATA --rerun sci_1:sci_2 --selectId visit=085190..085200:2 \
--selectId visit=127450..127480:2 --id tract=0 filter=HSC-R2 --cores=2 --config \
assembleCoadd.doApplyUberCal=False makeCoaddTempExp.doApplyUberCal=False

root INFO: Loading config overrride file u'/Users/gdb/lsst_stack/DarwinX86/obs_subaru/13.0-39-gf25a3b0/config/coaddDriver.py'
root INFO: Loading config overrride file u'/Users/gdb/lsst_stack/DarwinX86/obs_subaru/13.0-39-gf25a3b0/config/hsc/coaddDriver.py'
CameraMapper INFO: Loading Posix exposure registry from /Users/gdb/hsc/j0148_combined/DATA/rerun/sci_1
CameraMapper INFO: Loading Posix calib registry from /Users/gdb/hsc/j0148_combined/DATA/rerun/sci_1
CameraMapper INFO: Loading Posix calib registry from /Users/gdb/hsc/j0148_combined/DATA/rerun/sci_2
Traceback (most recent call last):
  File "/Users/gdb/lsst_stack/DarwinX86/pipe_drivers/13.0-14-gdb8c927/bin/coaddDriver.py", line 4, in <module>
    CoaddDriverTask.parseAndSubmit()
  File "/Users/gdb/lsst_stack/DarwinX86/ctrl_pool/13.0-5-g9cf35e0+7/python/lsst/ctrl/pool/parallel.py", line 424, in parseAndSubmit
    **kwargs)
  File "/Users/gdb/lsst_stack/DarwinX86/ctrl_pool/13.0-5-g9cf35e0+7/python/lsst/ctrl/pool/parallel.py", line 333, in parse_args
    args.parent = self._parent.parse_args(config, args=leftover, **kwargs)
  File "/Users/gdb/lsst_stack/DarwinX86/pipe_base/13.0-9-g1c7d9c5+11/python/lsst/pipe/base/argumentParser.py", line 521, in parse_args
    self._processDataIds(namespace)
  File "/Users/gdb/lsst_stack/DarwinX86/pipe_base/13.0-9-g1c7d9c5+11/python/lsst/pipe/base/argumentParser.py", line 626, in _processDataIds
    dataIdContainer.makeDataRefList(namespace)
  File "/Users/gdb/lsst_stack/DarwinX86/pipe_drivers/13.0-14-gdb8c927/python/lsst/pipe/drivers/utils.py", line 68, in makeDataRefList
    skymap = self.getSkymap(namespace)
  File "/Users/gdb/lsst_stack/DarwinX86/coadd_utils/13.0-3-g4045236/python/lsst/coadd/utils/coaddDataIdContainer.py", line 41, in getSkymap
    self._skymap = namespace.butler.get(namespace.config.coaddName + "Coadd_skyMap")
  File "/Users/gdb/lsst_stack/DarwinX86/daf_persistence/13.0-25-g49e493d/python/lsst/daf/persistence/butler.py", line 1377, in get
    location = self._locate(datasetType, dataId, write=False)
  File "/Users/gdb/lsst_stack/DarwinX86/daf_persistence/13.0-25-g49e493d/python/lsst/daf/persistence/butler.py", line 1298, in _locate
    location = repoData.repo.map(datasetType, dataId, write=write)
  File "/Users/gdb/lsst_stack/DarwinX86/daf_persistence/13.0-25-g49e493d/python/lsst/daf/persistence/repository.py", line 240, in map
    raise RuntimeError("No mapper assigned to Repository")
RuntimeError: No mapper assigned to Repository

(George Becker) #2

Sorry – I realized I probably also need to rerun makeDiscreteSkyMap before running coaddDriver. When I run makeDiscreteSkyMap, however, I get error messages that the data cannot be found, e.g.:

$ makeDiscreteSkyMap.py DATA --rerun sci_1:sci_2  --id visit=085190..085200:2 \
--id visit=127450..127480:2 --id visit=127406..127446:2 --id visit=085204..085206:2 \
--id visit=127610..127622:2 --id visit=127626..127658:2 --config skyMap.projection="TAN"

root INFO: Loading config overrride file u'/Users/gdb/lsst_stack/DarwinX86/obs_subaru/13.0-39-gf25a3b0/config/makeDiscreteSkyMap.py'
root INFO: Loading config overrride file u'/Users/gdb/lsst_stack/DarwinX86/obs_subaru/13.0-39-gf25a3b0/config/hsc/makeDiscreteSkyMap.py'
CameraMapper INFO: Loading Posix exposure registry from /Users/gdb/hsc/j0148_combined/DATA/rerun/sci_1
CameraMapper INFO: Loading Posix calib registry from /Users/gdb/hsc/j0148_combined/DATA/rerun/sci_1
CameraMapper INFO: Loading Posix calib registry from /Users/gdb/hsc/j0148_combined/DATA/rerun/sci_2
root WARN: No data found for dataId=OrderedDict([('visit', 85190)])
root WARN: No data found for dataId=OrderedDict([('visit', 85192)])
root WARN: No data found for dataId=OrderedDict([('visit', 85194)])
root WARN: No data found for dataId=OrderedDict([('visit', 85196)])
root WARN: No data found for dataId=OrderedDict([('visit', 85198)])
root WARN: No data found for dataId=OrderedDict([('visit', 85200)])
root WARN: No data found for dataId=OrderedDict([('visit', 127450)])
(etc.)

To manage disk space, I had moved DATA/rerun/sci_1, the rerun for the outputs from singleFrameDriver, to an external disk, and then created a symbolic link to it. Would this confuse the Butler or create some other problem?

Thanks again!


(Paul Price) #3

Please report what version of the pipeline you’re using.

Perhaps you can outline the layout of your data repo and the contents of the repositoryCfg.yaml files in the sci_1 rerun? Even better would be if you can share your data repo, e.g., by http.


(George Becker) #4

Hi Paul,

Thanks for your help on this.

I’m using version w_2017_28.

It’s possible that this is more of a file system issue than a pipeline issue. To process the raw science frames I ran singleFrameDriver with commands like

$ singleFrameDriver.py DATA --rerun sci_1 --id visit=127406..127446:2 --cores=4

I then created the sky map and ran coaddDriver using the commands above, which output to rerun/sci_2. These ran successfully the first time through. However, I ran out of disk space and could not run coaddDriver for my last filter. To free up space, I moved sci_1 and a rerun for the calibration outputs (calib_1) to an external disk and created symbolic links:

$ ls -l DATA/rerun/
total 16
lrwxr-xr-x  1 gdb  staff   52 Aug 29 17:38 calib_1@ -> /Volumes/data/hsc/j0148_combined/DATA/rerun/calib_1/
lrwxr-xr-x  1 gdb  staff   50 Aug 29 17:38 sci_1@ -> /Volumes/data/hsc/j0148_combined/DATA/rerun/sci_1/
drwxr-xr-x  4 gdb  staff  136 Aug 29 22:15 sci_2/

Now when I run makeDiscreteSkyMap I get the missing data errors above.

Here are the contents of sci_1/repositoryCfg.yaml:

!RepositoryCfg_v1
_mapper: !!python/name:lsst.obs.hsc.hscMapper.HscMapper ''
_mapperArgs: null
_parents: [../..]
_policy: null
_root: null
dirty: true

and sci_2/repositoryCfg.yaml:

!RepositoryCfg_v1
_mapper: !!python/name:lsst.obs.hsc.hscMapper.HscMapper ''
_mapperArgs: null
_parents: [../sci_1]
_policy: null
_root: null
dirty: true

I notice that if I do an ‘ls’ then I see all of the repositoryCfg.yaml files:

$ ls -l DATA/rerun/*/repositoryCfg.yaml
-rw-r--r--  1 gdb  staff  379 Aug 25 09:01 DATA/rerun/calib_1/repositoryCfg.yaml
-rw-r--r--  1 gdb  staff  151 Aug 25 14:31 DATA/rerun/sci_1/repositoryCfg.yaml
-rw-r--r--  1 gdb  staff  154 Aug 29 20:22 DATA/rerun/sci_2/repositoryCfg.yaml

However, if I run find then I only see one:

$ find repositoryCfg.yaml
./DATA/rerun/sci_2/repositoryCfg.yaml

which is the only one that is not in a symbolically linked directory. Presumably the recursive descent for find stops when it encounters the symbolic link. Could a similar problem be creating the missing-data errors? The sci_1 rerun contains the following:

$ ls -1 DATA/rerun/sci_1/
01708/
02061/
02062/
config/
repositoryCfg.yaml
schema/
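
On the find result above: a minimal sketch, in Python, of a recursive search that by default does not descend into symbolically linked directories, and of the same search with link-following switched on. The helper name find_files and the throwaway directory layout are hypothetical and only mimic the rerun layout described here; this illustrates general filesystem traversal and says nothing about how the Butler itself locates files.

```python
import os
import tempfile


def find_files(root, filename, follow_links=False):
    """Return paths (relative to root) of every file called `filename`.

    os.walk does not descend into symlinked directories unless
    followlinks=True is passed.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=follow_links):
        if filename in filenames:
            matches.append(os.path.relpath(os.path.join(dirpath, filename), root))
    return sorted(matches)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: sci_2 is a real directory, sci_1 is a symlink
    # to a directory that lives elsewhere (standing in for the external disk).
    rerun = os.path.join(tmp, "DATA", "rerun")
    external = os.path.join(tmp, "external", "sci_1")
    os.makedirs(os.path.join(rerun, "sci_2"))
    os.makedirs(external)
    for directory in (external, os.path.join(rerun, "sci_2")):
        with open(os.path.join(directory, "repositoryCfg.yaml"), "w") as handle:
            handle.write("")
    os.symlink(external, os.path.join(rerun, "sci_1"))

    without_links = find_files(rerun, "repositoryCfg.yaml")
    with_links = find_files(rerun, "repositoryCfg.yaml", follow_links=True)
    print(without_links)
    print(with_links)
```

Only the file under the real sci_2 directory is reported until link-following is enabled, which matches the find output shown above.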

Perhaps I could try leaving DATA/rerun/sci_1/ in the directory tree (not as a symbolic link) and moving/linking only the sub-directories?

If this doesn’t sound viable then I’m happy to provide more details about the data repo.

Thanks again for your help!


(George Becker) #5

That approach worked. So instead of creating a symbolic link to rerun/sci_1, I kept rerun/sci_1 in place as a real (not linked) directory and created symbolic links to all of the sub-directories instead:

$ ls -1 DATA/rerun/
calib_1/
sci_1/
sci_2/

$ ls -1 DATA/rerun/sci_1
01708@
02061@
02062@
config@
repositoryCfg.yaml
schema@

where the links are to directories of the same name on an external drive. makeDiscreteSkyMap and coaddDriver now run. I kept a real copy of repositoryCfg.yaml in rerun/sci_1/, but it might have worked to link it symbolically also.
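
The layout above could be set up with something like the following. This is a minimal sketch, not a pipeline tool: the function name link_rerun_contents, its keep_real parameter, and the temporary paths are all hypothetical, and it assumes the rerun contents have already been moved to the external location.

```python
import os
import shutil
import tempfile


def link_rerun_contents(external_rerun, local_rerun, keep_real=("repositoryCfg.yaml",)):
    """Make local_rerun a real directory whose entries point at external_rerun.

    Entries named in keep_real are copied as real files; everything else
    becomes a symbolic link to the entry of the same name on the external side.
    """
    os.makedirs(local_rerun, exist_ok=True)
    for name in sorted(os.listdir(external_rerun)):
        source = os.path.join(external_rerun, name)
        target = os.path.join(local_rerun, name)
        if name in keep_real:
            shutil.copy2(source, target)  # real copy of the config file
        else:
            os.symlink(source, target)  # link the bulky sub-directories
    return sorted(os.listdir(local_rerun))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the rerun contents on the external drive.
    external = os.path.join(tmp, "external", "sci_1")
    for sub in ("01708", "config", "schema"):
        os.makedirs(os.path.join(external, sub))
    with open(os.path.join(external, "repositoryCfg.yaml"), "w") as handle:
        handle.write("_parents: [../..]\n")

    local = os.path.join(tmp, "DATA", "rerun", "sci_1")
    entries = link_rerun_contents(external, local)
    linked = [n for n in entries if os.path.islink(os.path.join(local, n))]
    rerun_is_real = not os.path.islink(local)
    cfg_is_real = not os.path.islink(os.path.join(local, "repositoryCfg.yaml"))
    print(entries, linked, rerun_is_real, cfg_is_real)
```

Keeping the rerun directory itself real means that a relative reference such as ../.. in repositoryCfg.yaml is still interpreted from inside the original DATA tree.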

In any case, it seems the pipeline does not find the repositoryCfg.yaml file if it has to look through a directory that is symbolically linked. Thanks for mentioning this file, which turned out to be the key!


(K-T Lim) #6

I think the actual problem is that the repositoryCfg.yaml file is found but (as of the stack version you are using) it only contains relative path references, which are broken by the symlink.
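
To illustrate the point, here is a minimal sketch of how a relative parent reference such as ../.. can end up somewhere unexpected once the directory holding it is reached through a symlink. It assumes the relative path is resolved through the filesystem; the Butler's actual resolution code is not shown in this thread, and the helper resolve_parent and the temporary directory names are hypothetical.

```python
import os
import tempfile


def resolve_parent(repo_dir, relative_parent):
    """Resolve a relative parent reference in two ways.

    lexical:  collapse ".." textually, ignoring symlinks
    physical: follow symlinks first, as the operating system does
    """
    joined = os.path.join(repo_dir, relative_parent)
    return os.path.normpath(joined), os.path.realpath(joined)


with tempfile.TemporaryDirectory() as tmp:
    tmp = os.path.realpath(tmp)  # the temp dir itself may sit behind a symlink
    local_root = os.path.join(tmp, "local", "DATA")
    external_root = os.path.join(tmp, "external", "DATA")
    os.makedirs(os.path.join(local_root, "rerun"))
    os.makedirs(os.path.join(external_root, "rerun", "sci_1"))

    # rerun/sci_1 in the local tree is only a link to the external copy.
    link = os.path.join(local_root, "rerun", "sci_1")
    os.symlink(os.path.join(external_root, "rerun", "sci_1"), link)

    lexical, physical = resolve_parent(link, "../..")
    print(lexical)   # ends in local/DATA
    print(physical)  # ends in external/DATA
```

In this sketch the physical resolution lands in the external tree, which would not hold the root repository's mapper information or raw data.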