Problems while trying to create a new LSST obs package

Hello, I am a postgraduate student new to astronomy, and I am working on WFST, trying to use your well-designed LSST pipeline to process our instrument’s data. At the very beginning my aim was just the astrometry part, but I then found that the pipeline is well designed and can be applied to other instruments once the package files are set up properly. So I finally decided to create a package, obs_wfst. I have run into a lot of problems along the way, and I want to collect the questions about creating a new package in this topic.
First, about “How to create a new LSST obs package”: I am following the four steps, and here is the instrument test:

import unittest

import lsst.utils.tests
import lsst.obs.example
from lsst.obs.base.instrument_tests import InstrumentTests, InstrumentTestData


class TestExampleCam(InstrumentTests, lsst.utils.tests.TestCase):
    def setUp(self):
        physical_filters = {"example g filter",
                            "example z filter"}

        self.data = InstrumentTestData(name="Example",
                                       nDetectors=4,
                                       firstDetectorName="1_1",
                                       physical_filters=physical_filters)
        self.instrument = lsst.obs.example.ExampleCam()

if __name__ == '__main__':
    lsst.utils.tests.init()
    unittest.main()

Three of the four tests passed; the SQLite-related one failed:

self = <sqlalchemy.dialects.sqlite.pysqlite.SQLiteDialect_pysqlite object at 0x7f1442060c70>, cursor = <sqlite3.Cursor object at 0x7f14336727a0>
statement = 'INSERT INTO instrument (name, visit_max, exposure_max, detector_max, class_name) VALUES (?, ?, ?, ?, ?)'
parameters = ('WFST', 1073741824, 1073741824, 10, '_instrument.WFSTCamera'), context = <sqlalchemy.dialects.sqlite.base.SQLiteExecutionContext object at 0x7f14336816d0>

    def do_execute(self, cursor, statement, parameters, context=None):
>       cursor.execute(statement, parameters)
E       sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: instrument.name
E       [SQL: INSERT INTO instrument (name, visit_max, exposure_max, detector_max, class_name) VALUES (?, ?, ?, ?, ?)]
E       [parameters: ('WFST', 1073741824, 1073741824, 10, '_instrument.WFSTCamera')]
E       (Background on this error at: https://sqlalche.me/e/14/gkpj)

../../../../../conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-0.7.0/lib/python3.8/site-packages/sqlalchemy/engine/default.py:719: IntegrityError
================================================================================== short test summary info ==================================================================================
FAILED test_instrument.py::TestInstrument::test_register - sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: instrument.name
================================================================================ 1 failed, 3 passed in 1.79s ================================================================================

I inspected HSC’s SQLite registry and found that in the instrument table its class_name is lsst.obs.subaru.HyperSuprimeCam, so I guess mine should have a similar format. I think this problem could be solved if I could import my package like

from lsst.obs.wfst import WFSTCamera

not

import sys
sys.path.append("/home/yu/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/obs_wfst/python/obs/wfst")

from _instrument import WFSTCamera

This means the function get_full_type_name(self) cannot work properly.
So, how can I make my package interact with your interface and be importable as lsst.obs.wfst?
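
To make the question concrete, here is what I hope for (the lsst.obs.wfst import is exactly what does not work for me at the moment):

# Hypothetical: once the package is importable as lsst.obs.wfst, the class
# should carry the package in its qualified name.
from lsst.obs.wfst import WFSTCamera

print(f"{WFSTCamera.__module__}.{WFSTCamera.__qualname__}")
# Hoped for: "lsst.obs.wfst._instrument.WFSTCamera", so that the registered
# class_name becomes "lsst.obs.wfst.WFSTCamera" like HSC's
# "lsst.obs.subaru.HyperSuprimeCam", instead of the bare
# "_instrument.WFSTCamera" I get today.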

I am also wondering about the meaning of 22.0.1-44-g74bdbb4e+988f982fce in the path below; it looks like a dataset id in SQLite. I guess the package I create should also have a similarly named folder to communicate with the interface?

/home/yu/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/obs_subaru/22.0.1-44-g74bdbb4e+988f982fce/hsc/

Second, about the method makeDataIdTranslatorFactory:

    def makeDataIdTranslatorFactory(self) -> TranslatorFactory:
        # Docstring inherited from lsst.obs.base.Instrument.
        factory = TranslatorFactory()
        factory.addGenericInstrumentRules(self.getName(), calibFilterType="band",
                                          detectorKey="ccdnum")
        # DECam calibRegistry entries are bands or aliases, but we need
        # physical_filter in the gen3 registry.
        factory.addRule(_DecamBandToPhysicalFilterKeyHandler(self.filterDefinitions),
                        instrument=self.getName(),
                        gen2keys=("filter",),
                        consume=("filter",),
                        datasetTypeName="cpFlat")
        return factory

I guess this is machinery that translates from gen2 to gen3. If I want to make a purely gen3 package, do I still need it? Without it the tests won’t pass.
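
For context, the smallest version I can imagine keeping, if the test harness requires the method at all, would be just the generic rules (the import path below is what the DECam-era obs packages use; whether even this is needed is exactly my question):

from lsst.obs.base.gen2to3 import TranslatorFactory


# Method body as it would appear on the Instrument subclass.
def makeDataIdTranslatorFactory(self) -> TranslatorFactory:
    # Only the generic gen2->gen3 data ID rules, without instrument-specific
    # special cases like DECam's cpFlat filter handling above.
    factory = TranslatorFactory()
    factory.addGenericInstrumentRules(self.getName())
    return factory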

Third, about the transmission curves: there are many files describing transmission curves for HSC:

[yu@localhost transmission]$ pwd
/home/yu/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/obs_subaru/22.0.1-44-g74bdbb4e+988f982fce/hsc/transmission
[yu@localhost transmission]$ ls
filterTraces.py                              wHSC-IB945.txt  wHSC-NB656.txt
M1-2010s.txt                                 wHSC-N1010.txt  wHSC-NB718.txt
modtran_maunakea_am12_pwv15_binned10ang.dat  wHSC-N400.txt   wHSC-NB816.txt
qe_ccd_HSC.txt                               wHSC-NB387.txt  wHSC-NB921.txt
README.txt                                   wHSC-NB468.txt  wHSC-NB926.txt
throughput_popt2.txt                         wHSC-NB515.txt  wHSC-NB973.txt
throughput_win.txt                           wHSC-NB527.txt

And in HSC’s _instrument.py there are methods that use these files to create transmission curves, for example:

        # Write transmission sensor
        sensorTransmissions = getSensorTransmission()
        datasetType = DatasetType("transmission_sensor",
                                  ("instrument", "detector",),
                                  "TransmissionCurve",
                                  universe=butler.registry.dimensions,
                                  isCalibration=True)
        butler.registry.registerDatasetType(datasetType)
        for entry in sensorTransmissions.values():
            if entry is None:
                continue
            for sensor, curve in entry.items():
                dataId = DataCoordinate.standardize(baseDataId, detector=sensor)
                refs.append(butler.put(curve, datasetType, dataId, run=run))

        # Write filter transmissions
        filterTransmissions = getFilterTransmission()
        datasetType = DatasetType("transmission_filter",
                                  ("instrument", "physical_filter",),
                                  "TransmissionCurve",
                                  universe=butler.registry.dimensions,
                                  isCalibration=True)
        butler.registry.registerDatasetType(datasetType)
        for entry in filterTransmissions.values():
            if entry is None:
                continue
            for band, curve in entry.items():
                dataId = DataCoordinate.standardize(baseDataId, physical_filter=band)
                refs.append(butler.put(curve, datasetType, dataId, run=run))

        # Write atmospheric transmissions
        atmosphericTransmissions = getAtmosphereTransmission()
        datasetType = DatasetType("transmission_atmosphere", ("instrument",),
                                  "TransmissionCurve",
                                  universe=butler.registry.dimensions,
                                  isCalibration=True)
        butler.registry.registerDatasetType(datasetType)
        for entry in atmosphericTransmissions.values():
            if entry is None:
                continue
            refs.append(butler.put(entry, datasetType, {"instrument": self.getName()}, run=run))

        # Associate all datasets with the unbounded validity range.
        butler.registry.certify(collection, refs, Timespan(begin=None, end=None))

But I didn’t find a similar method in DECam’s _instrument.py, even though the quantum graph for single-frame processing does make use of the transmission curves. So where are DECam’s transmissions? What causes this difference? I also noticed things like the bfkernel and stray light in HSC’s _instrument.py.
single_frame.pdf (23.0 KB)

There are also other questions that I will post here later. I’m just a recruit new to astronomy and still in bootcamp, so these problems may seem naive to you. Any answers, suggestions or documents you can point me to will be very helpful.

Thank you!

You should not be naming your package lsst.obs.X – it’s going to be much clearer if you use your project’s namespace.

Without seeing your code, debugging this is going to be difficult. I think you have copied and pasted some HSC code into your package. See for example the instrument record construction in obs_cfht:

This seems to indicate that your package is not lsst.obs.wfst but is obs.wfst.

Can you import lsst.obs.wfst? get_full_type_name(self) should always be able to work. What error do you get?

Delete it. You don’t need it.

Now you are getting into curated calibrations. You will see that there are obs_x_data packages that contain defects and QE curves and other text-based calibrations. Transmission curves are similar in that they are fairly static and can be ingested once. Some of them are prebuilt binaries that must be ingested; others are built differently.

Curated calibrations are ingested using the butler write-curated-calibrations command. We have some default definitions and then instruments can add some to the list. See for example:
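
The same thing can be done from Python, roughly like this (a sketch only: the repository path and the WFSTCamera class are placeholders, and register / writeCuratedCalibrations are the relevant lsst.obs.base.Instrument methods):

from lsst.daf.butler import Butler
from lsst.obs.wfst import WFSTCamera  # placeholder for your instrument class

butler = Butler("/path/to/repo", writeable=True)
instrument = WFSTCamera()
instrument.register(butler.registry)         # adds instrument/detector/filter records
instrument.writeCuratedCalibrations(butler)  # ingests defects, transmission curves, etc.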

I’m not sure how DECam handles transmission curves.

This is the brighter-fatter kernel. We are currently transitioning the calibration class for this from the HSC-specific one to a more generic form. I do not know whether your instrument needs such a correction.


Thank you!
After inspecting and correcting my code, I have passed the instrument tests.

What I wanted to express with that sentence is that I thought the error was caused by the class_name format, which should be lsst.obs.X, since Subaru’s class name in its SQLite registry is class_name: lsst.obs.subaru.HyperSuprimeCam, whereas for me the get_full_type_name(self) function returned _instrument.WFSTCamera. I now know the error was not caused by that.

I originally put my package in an independent folder:

[yu@localhost obs_wfst]$ pwd
/home/yu/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/obs_wfst
[yu@localhost obs_wfst]$ ls
config  python  tests  ups  wfst

It could only be imported as follows, which is why the class_name did not follow the lsst.obs.X format:

import sys
sys.path.append("/home/yu/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/obs_wfst/python/obs/wfst")
from _instrument import WFSTCamera

Trying to import lsst.obs.wfst gives a very common error:

ModuleNotFoundError: No module named 'lsst.obs.wfst'

And if I put the python module into the obs_lsst subfolder

/home/yu/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/obs_lsst/22.0.1-61-gbd5239c+96886d66ff/python/lsst/obs/wfst

I can import WFSTCamera with

from lsst.obs.wfst._instrument import WFSTCamera

None of the above matters now: after correcting my code, I find that both locations where I put the wfst package pass the instrument test:

test_instrument.py::TestInstrument::testMakeTranslatorFactory PASSED
test_instrument.py::TestInstrument::test_getCamera PASSED
test_instrument.py::TestInstrument::test_name PASSED
test_instrument.py::TestInstrument::test_register PASSED

Before the next step, I have some doubts about the amplifier FITS files. Even with the comments, I still can’t work out the exact meaning of headers like readoutcorner, linearity_coeffs, raw_flip_x and so on. Is there a manual with more detailed information about the amplifier settings?
For example, HSC’s 0_00.fits:

TTYPE11 = 'readoutcorner'      / readout corner, in the frame of the assembled i
TFORM11 = '1J      '           / format of field                                
TDOC11  = 'readout corner, in the frame of the assembled image'                 
TCCLS11 = 'Scalar  '           / Field template used by lsst.afw.table          
TTYPE12 = 'linearity_coeffs'   / coefficients for linearity fit up to cubic     
TFORM12 = '4D      '           / format of field                                
TDOC12  = 'coefficients for linearity fit up to cubic'                          
TCCLS12 = 'Array   '           / Field template used by lsst.afw.table          
TTYPE13 = 'linearity_type'     / type of linearity model                        
TFORM13 = '64A     '           / format of field                                
TDOC13  = 'type of linearity model'                                             
TCCLS13 = 'String  '           / Field template used by lsst.afw.table          
TFLAG1  = 'hasrawinfo'                                                          
TFDOC1  = 'is raw amplifier information available (e.g. untrimmed bounding box&'
CONTINUE  'es)?    '         
TFLAG2  = 'raw_flip_x'                                                          
TFDOC2  = 'flip row order to make assembled image?'                             
TFLAG3  = 'raw_flip_y'                                                          
TFDOC3  = 'flip column order to make an assembled image?'                       
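
For context, my current (possibly wrong) understanding is that these columns correspond to the setters on lsst.afw.cameraGeom.Amplifier.Builder, along the lines of the sketch below; all the values are made-up placeholders, so please correct me if the mapping is off:

import numpy as np

import lsst.afw.cameraGeom as cameraGeom
from lsst.geom import Box2I, Extent2I, Point2I

amp = cameraGeom.Amplifier.Builder()
amp.setName("A00")
amp.setBBox(Box2I(Point2I(0, 0), Extent2I(512, 4096)))  # trimmed region in the assembled image
amp.setReadoutCorner(cameraGeom.ReadoutCorner.LL)       # readoutcorner: corner read out first,
                                                        # in the assembled-image frame
amp.setRawFlipX(False)  # raw_flip_x: flip the raw amp in x when assembling the image?
amp.setRawFlipY(False)  # raw_flip_y: flip the raw amp in y when assembling the image?
amp.setLinearityType("Polynomial")                      # linearity_type
amp.setLinearityCoeffs(np.array([0.0, 1.0, 0.0, 0.0]))  # linearity_coeffs: fit up to cubic (4 values)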

Also, I have now written a simplified CCD.py and amplifier.fits and want to process simulated data. What package do you use to simulate the data based on the CCD and amplifier definitions, e.g. raws, biases and so on?

And in Getting started with the LSST Science Pipelines you provide a package, rc2_subset. Is there a guide on how to create a similar package?

Thank you!

This name indicates that there is something very wrong with the way you have set up your python package.

Can you try to set it up as a buildable python package? You should not need to be using sys.path.append to make things work. The reason we use _instrument.py is to show that the general user should never try to import from that file – there is an expectation that your package’s __init__.py will pull the class from _instrument.py (and the get_full_type_name code is clever enough to realize that the _ in the module name means it can be dropped from the full name).

You should be able to build a normal python package with a setup.cfg or pyproject.toml and build and install it (using pip install -e . or somesuch). Alternatively you can try to set up obs packages like we do using SCons and eups but that’s not necessary.

It would be much better for you if you can learn how to create a python package that can be installed and distributed. Note that you do not want to put it in the lsst.obs namespace.
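
As a rough illustration only (all names below are placeholders, not a prescription), the layout could look like this, with __init__.py doing the re-export so that nobody imports _instrument directly:

# Hypothetical layout for an installable package outside the lsst.obs namespace:
#
#   obs_wfst/
#       pyproject.toml        # or setup.cfg; declares the package for pip
#       python/wfst/obs/
#           __init__.py
#           _instrument.py    # defines WFSTCamera
#
# python/wfst/obs/__init__.py then re-exports the public class:
from ._instrument import WFSTCamera

__all__ = ["WFSTCamera"]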

There isn’t a centralized guide but there is a discussion here:

Some questions:

  • How are your raw files laid out? One FITS extension per amplifier or all amplifiers in one extension?
  • Does astrometadata translate work on your raw files? (a quick Python check is sketched after this list)
  • Do you know what format your defects and other curated calibrations take?
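
For the second bullet, a quick way to check from Python as well (the filename is a placeholder, and this will fail until a MetadataTranslator for your instrument exists and is registered):

# Sketch: verify that a raw header can be translated by astro_metadata_translator.
from astropy.io import fits
from astro_metadata_translator import ObservationInfo

header = fits.getheader("wfst_raw_example.fits")  # placeholder raw file
info = ObservationInfo(header)                    # raises if no translator matches
print(info.instrument, info.physical_filter, info.datetime_begin, info.exposure_id)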

We have used a package called imsim to simulate the data. I don’t know anything about the simulations though.

Thank you for your suggestions. I have used setup.cfg and pyproject.toml to package the obs package.

It seems that the pipeline function getPackageDir doesn’t work with a package built with setup.cfg and pyproject.toml, so I used the Python standard library instead: path = Path(wfst.__path__[0]).parents[1].
But I wonder whether the error below shows something serious that I don’t know about, just like before, when I used sys.path.append.

Input In [47], in <cell line: 1>()
----> 1 getPackageDir(wfst)

File ~/lsst_stack/23.0.1/stack/miniconda3-py38_4.9.2-0.8.1/Linux64/utils/g336def89a8+7bff505259/python/lsst/utils/_packaging.py:45, in getPackageDir(package_name)
     20 """Find the file system location of the EUPS package.
     21
     22 Parameters
   (...)
     42 Does not use EUPS directly. Uses the environment.
     43 """
     44 if not package_name or not isinstance(package_name, str):
---> 45     raise ValueError(f"EUPS package name '{package_name}' is not of a suitable form.")
     47 envvar = f"{package_name.upper()}_DIR"
     49 path = os.environ.get(envvar)

ValueError: EUPS package name '<module 'wfst' from '/home/yu/lsst_stack/23.0.1/stack/miniconda3-py38_4.9.2-0.8.1/Linux64/wfst/src/wfst/__init__.py'>' is not of a suitable form.

The instrument is still under construction. The CCDs we use are 9k x 9k with 16 amplifiers per CCD, and we plan to store all amplifiers in one extension. Are there any cautions for either layout, or differences in performance such as processing speed?
I also wonder about the official obs packages: CFHT, DECam, HSC and LSST all use detectors around 4k x 4k or 2k x 4k. I don’t know whether the pipeline works well for a bigger 9k x 9k CCD, particularly for the astrometry part I focus on, for example whether FitTanSipWcsTask fits well, or for other parts of the pipeline. I suppose I may need to process data to answer that.
Since I don’t yet fully understand the later steps of the pipeline, I haven’t processed the simulated data so far. As for the curated calibrations: HSC writes the bfkernel and transmission curves in the official function writeAdditionalCuratedCalibrations. At minimum we will do the transmission correction, and choose other calibrations from the official calibration classes you supply.

Thank you!

Manually messing with sys.path seems like the wrong approach.

Where is the getPackageDir coming from? I assume it’s coming from the curated calibrations handling but I’m not sure. If you are using EUPS with sconsUtils to set up the package then getPackageDir will work. If you are not using that but instead want to use something like python package resources, you will have to implement your own Instrument.getObsDataPackageDir method. The base class assumes you have a separate obs_x_data package that contains the defects etc. If you change getObsDataPackageDir you can make it point wherever you like.
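
As a very rough sketch of that option (treat it as an assumption: check the exact signature of getObsDataPackageDir in your obs_base version before copying it), with a data/ directory shipped inside the installed wfst package:

import os

from lsst.obs.base import Instrument

import wfst  # hypothetical installed package that ships a data/ directory


class WFSTCamera(Instrument):
    ...  # filterDefinitions, getCamera, register overrides, etc. elided

    @classmethod
    def getObsDataPackageDir(cls):
        # Point the curated-calibration lookup at data inside the installed
        # python package instead of an EUPS obs_wfst_data product.
        return os.path.join(os.path.dirname(wfst.__file__), "data")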

Those are some large detectors. @yusra do we have any reason to believe there will be problems with detectors that large?
