Should image process tests be run in setUpClass so results are available to all tests?

An image processing test will often involve

  1. Process the image with some piece of code. This might take 5–10 seconds.
  2. A bunch of different tests on the output of that processing.

How should running the processing and doing the tests be structured?

As a concrete example, here is a snippet of new testing code for a branch in processFile. I wrote this branch partly to trigger a discussion of this very issue. I asked @parejkoj to review it because I was relatively certain we disagreed. We had this discussion somewhat in the abstract on HipChat. I’m interested in people’s answers for a specific, but illustrative, case.

The processing run is done as part of setUpClass so that the results are available to each test, followed by separate tests. There are currently three tests, but more should be added to test the content of the results more intelligently.

The obvious alternative would be to run the processing and then a series of asserts in one test.

I welcome thoughts, discussions, and votes as we (perhaps largely I) start to go through and add more comprehensive tests to processFile and assorted obs_* packages.

import os
import shutil
import subprocess
import tempfile
import unittest

# testDataDirectory and testDataPackage are assumed to be defined at module
# level (e.g. by locating a test-data package); testDataDirectory is None if
# the data are unavailable, in which case the tests below are skipped.


def runSampleProcessFile(imageFile, outputCalexp, outputCatalog, outputCalibCatalog):
    """Run the processFile.py executable and return its console output."""
    output = subprocess.check_output(["processFile.py", imageFile,
                                      "--outputCalexp", outputCalexp,
                                      "--outputCatalog", outputCatalog,
                                      "--outputCalibCatalog", outputCalibCatalog,
                                      ])
    return output


@unittest.skipIf(testDataDirectory is None, "%s is not available" % testDataPackage)
class TestProcessFileRun(unittest.TestCase):
    """Test that processFile runs.

    Ideally processFile.py would just be a call to
    python/lsst/processFile/processFile.py parseAndRun,
    but it's not structured that way right now,
    so instead we call the executable
    and ensure that the output files are generated and non-zero in size.
    """
    @classmethod
    def setUpClass(cls):
        dataPath = os.path.join(testDataDirectory, "data")
        testImageFile = "871034p_1_MI.fits"
        testOutputCalexpFile = "871034p_1_MI.calexp.fits"
        testOutputCatalogFile = "871034p_1_MI.src.fits"
        testOutputCalibCatalogFile = "871034p_1_MI.calib.fits"
        cls.imageFile = os.path.join(dataPath, testImageFile)
        cls.tmpPath = tempfile.mkdtemp()
        cls.outputCalexp = os.path.join(cls.tmpPath, testOutputCalexpFile)
        cls.outputCatalog = os.path.join(cls.tmpPath, testOutputCatalogFile)
        cls.outputCalibCatalog = os.path.join(cls.tmpPath, testOutputCalibCatalogFile)

        # We run processFile.py here in setUpClass
        # so that the results are available to the individual tests.
        runSampleProcessFile(cls.imageFile, cls.outputCalexp,
                             cls.outputCatalog, cls.outputCalibCatalog)

    @classmethod
    def tearDownClass(cls):
        if os.path.exists(cls.tmpPath):
            shutil.rmtree(cls.tmpPath)

    def assertFileNotEmpty(self, pathname):
        sizeOfFile = os.stat(pathname).st_size
        self.assertGreater(sizeOfFile, 0)

    def testCalexpNonEmpty(self):
        self.assertFileNotEmpty(self.outputCalexp)

    def testCatalogNonEmpty(self):
        self.assertFileNotEmpty(self.outputCatalog)

    def testCalibCatalogNonEmpty(self):
        self.assertFileNotEmpty(self.outputCalibCatalog)

In this particular example, would it be more efficient to put the processFile call in setUpClass so that it only runs once?


Hah — that’s exactly what I said on GitHub before I found this thread.


Yes, that’s what I meant to do. Sorry for the brain failure leading to confusion.

Code example updated to no longer obscure the point.

You need to fix this one as well, otherwise it’s deleted at the end of the first test.


Edited code again to be a copy of code that actually works.

This seems like a perfectly reasonable thing to do. What was the alternative?

I agree that would be nice, but I don’t think it will ever happen as Robert’s plan was to make sure this is independent of CmdLineTask. That’s actually one of the reasons I went down the obs_file road to begin with. I could use ProcessCcdTask directly (though I didn’t write any tests to use it :pensive:).

You could use the same parseAndRun structure even if processFile isn’t a CmdLineTask.

But we digress from the question in this ticket.

  1. We’re allowed to disagree?

  2. Tim already caught the thing I was going to point out (setUpClass, not setUp).

That said, I would say that running the code you’re actually testing in setUpClass is even worse than running it in setUp. It would mean that every test could fail for some reason that has nothing to do with the tests themselves (if something goes wrong in setUpClass), so the messages you get out might not be helpful. It also suggests to a user that each test is independent, when they’re really all testing the same piece of functionality.

Fundamentally, I think part of the problem here is due to us using a system designed for unit tests (hence the name) to do integration tests. Unit tests should be individually very fast, and should each test one simple thing. setUp[Class] is for assembling the necessary pieces to run the test, which may take time, but shouldn’t actually involve any of the things being tested themselves. Integration tests can take longer and test how a whole system hangs together. Your use case here is definitely the latter.

In the particular example here, I would suggest calling runSampleProcessFile inside one test method and doing all the asserts in that method. This is clumsy in the unittest framework, as the first failure will cause the whole thing to bail. We can get around that via TestCase.subTest (added in Python 3.4) or by using py.test features directly (the exact names of which I don’t have handy). Either way, putting the call to the method under test and its associated asserts in one test makes it clear that one thing is being called to produce a bunch of output, all of which you’re testing.
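As a sketch of what I mean (untested; it reuses runSampleProcessFile, the path attributes, and the assertFileNotEmpty helper from the example above):

def testProcessFileOutputs(self):
    # Run the code under test inside the test itself, so a crash is
    # reported against this test rather than as a setUpClass error.
    runSampleProcessFile(self.imageFile, self.outputCalexp,
                         self.outputCatalog, self.outputCalibCatalog)

    # subTest lets each assert fail independently instead of bailing
    # out of the whole method at the first failure.
    for name, path in (("calexp", self.outputCalexp),
                       ("catalog", self.outputCatalog),
                       ("calibCatalog", self.outputCalibCatalog)):
        with self.subTest(output=name):
            self.assertFileNotEmpty(path)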

I am certainly amenable to argument here: this is partly a testing philosophy thing. I would definitely be interested in a framework that’s more suitable to integration testing, if anyone knows of one (pytest may be enough, I don’t know).

One final thought: the fact that you’re writing an integration test for many pieces of processFile suggests to me that what you really want is a bunch of actual unit tests for each thing you want to test. That way, each step would take a short amount of time and be independent of the other steps. But I’m pretty sure that’s not possible for the example you have above, though it would look something like this:

def setUp(self):
    # Prepare the file processor (Processor here is a stand-in for
    # however the processing steps would get factored out).
    self.process = Processor()

def testCalibOutput(self):
    # Each test exercises just the one step it is checking.
    self.process.doCalibOutput()
    self.assertFileNotEmpty('somefile.fits')

Ah, yes! Thank you for framing this so clearly.

@parejkoj In my nascent understanding of py.test, it looks like using a fixture might be one way to capture this idea of something that only has to be done once:

http://docs.pytest.org/en/latest/fixture.html
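For instance (a minimal sketch, not tested; it reuses the runSampleProcessFile helper and test-data names from above, and the module scope is just one choice):

import os
import shutil
import tempfile

import pytest


@pytest.fixture(scope="module")
def processFileOutputs():
    # Run the (slow) processing once per test module; every test that
    # takes this fixture as an argument sees the same output files.
    tmpPath = tempfile.mkdtemp()
    outputs = {suffix: os.path.join(tmpPath, "871034p_1_MI.%s.fits" % suffix)
               for suffix in ("calexp", "src", "calib")}
    runSampleProcessFile(os.path.join(testDataDirectory, "data", "871034p_1_MI.fits"),
                         outputs["calexp"], outputs["src"], outputs["calib"])
    yield outputs
    # Teardown runs after all tests in the module have finished.
    shutil.rmtree(tmpPath)


def test_calexp_non_empty(processFileOutputs):
    assert os.stat(processFileOutputs["calexp"]).st_size > 0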

I understand your point that one of the key things we’re testing in integration tests is whether a given thing actually runs without crashing or raising an error, and that you would like the test to give a clear message like “processFile integration test failed: crashed when running” rather than the more indirect “testCalibCatalogSize: output catalog size not within expected bounds”-type error.

Using fixtures still has the same philosophical problem: fundamentally, the results of running processFile are what you’re trying to test, and putting the run in a fixture, like putting it in setUpClass, obscures this.