Transfer datasets from an existing repo to a new one

I am trying to transfer datasets out of a repo that holds processed data (raws, calexps, coadds, difference exposures); specifically, I want to move the difference exposures to a new repository. The existing repo is owned by a different user on our server, and I do not necessarily have write permission to their registry/datastore, so I need a repo of my own to do my own processing.

I thought this would be straightforward with:

butler transfer-datasets ${OLD_REPO} ${NEW_REPO} -d "*differenceExp" --collections ${COLLECTION} --register-dataset-types

However, this fails, I think because the new butler repo does not have the same (or any) visits defined:

$ butler transfer-datasets /epyc/users/smotherh/DEEP/PointingGroups/butler-repo /epyc/projects/DEEP_asteroids/repo -d "*differenceExp" --collections PointingGroup008/imdiff_r/20210803T142852Z --register-dataset-types
lsst.daf.butler._butler INFO: Transferring 6174 datasets into Butler(collections=[], run=None, datastore='file:///epyc/projects/DEEP_asteroids/repo/', registry='PostgreSQL@steven_deep:public')
lsst.daf.butler.cli.utils ERROR: Caught an exception, details are in traceback:
Traceback (most recent call last):
  File "/epyc/projects/lsst_stacks/stacks/w.2022.06/env/lsst-w.2022.06/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/epyc/projects/lsst_stacks/stacks/w.2022.06/env/lsst-w.2022.06/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.ForeignKeyViolation: insert or update on table "dataset_tags_00000003" violates foreign key constraint "fkey_dataset_tags_00000003_visit_instrument_id_instrument_visit"
DETAIL:  Key (instrument, visit)=(DECam, 891500) is not present in table "visit".

Please let me know if this is expected behavior/workflow, or if there is another method I should be using to transfer data from one repository to another.

Yes. This is what currently happens.

We wrote transfer-datasets specifically to help us with batch processing, where we copy records into a standalone registry for the batch job and then, at the end of the workflow, transfer the results back to the main butler repository. That did not require any dimension record transfer at the end, because the repo we were transferring back into was the valid one we had started from. In your case the new repo has no visit records, which is why the insert hits the foreign key violation.

We do intend to make the command more generally useful by having it transfer the dimension records as well, but that has not bubbled to the top of the priority list yet.

This means that, unfortunately, you will have to use the butler export/import approach: export the datasets from the old repo, then load them into the new one with butler import.
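A minimal sketch of what the export step could look like, assuming the Python Butler.export interface in lsst.daf.butler. This is not the snippet from the original reply. The function names, the export.yaml file name and the export directory are hypothetical, and the exact call signatures should be checked against the stack version you have installed.

```python
import fnmatch
import os


def select_dataset_type_names(names, pattern):
    """Return the dataset type names that match a shell-style glob, sorted."""
    return sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))


def export_datasets(old_repo, collection, export_dir, pattern="*differenceExp"):
    """Export matching datasets from old_repo into export_dir (hypothetical helper)."""
    # Imported lazily so the helper above works without the LSST stack.
    from lsst.daf.butler import Butler

    export_dir = os.path.abspath(export_dir)
    os.makedirs(export_dir, exist_ok=True)

    # No run is given, so the butler is opened read-only by default.
    butler = Butler(old_repo)

    # Filter dataset type names in plain Python rather than relying on
    # registry-side pattern matching, which has changed between versions.
    all_names = [dt.name for dt in butler.registry.queryDatasetTypes()]
    wanted = select_dataset_type_names(all_names, pattern)

    # transfer="copy" copies the files into export_dir next to the YAML
    # description; saveDatasets also records the dimension records (visits,
    # detectors, ...) that the exported datasets refer to.
    with butler.export(
        directory=export_dir,
        filename=os.path.join(export_dir, "export.yaml"),
        transfer="copy",
    ) as export:
        for name in wanted:
            refs = butler.registry.queryDatasets(name, collections=collection)
            export.saveDatasets(refs)
```

After calling export_datasets(OLD_REPO, COLLECTION, EXPORT_DIR), the import side would be something along the lines of butler import ${NEW_REPO} ${EXPORT_DIR} --export-file ${EXPORT_DIR}/export.yaml --transfer copy; confirm the options with butler import --help before running it.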