Should setting up an obs_* package enable processCcd.py?

If I want to process data from a camera, should I have to know that I need to set up pipe_tasks first, or should setting up the obs_* package alone be enough?

E.g., CFHT, should I just be able to do

setup obs_cfht

or should I have to

setup pipe_tasks
setup obs_cfht

If the latter, then when I use, e.g., lsstsw/bin/rebuild to build obs_cfht, what is it that makes this build know that it needs pipe_tasks?

[This question relates to "What is the philosophy of setupOptional?", but I ask them separately in the hope of clearer discussions.]

I am strongly in favor of expecting the user to set up pipe_tasks in order to run a task from that package.

From a philosophical standpoint I think it is safest to expect users to know which package contains the task they want to use, and to explicitly set it up. That scales well, since not all tasks live in pipe_tasks or its dependencies.

From a practical standpoint, not all obs_* packages need to import pipe_tasks (though many do), and it would be a shame to expect such packages to list pipe_tasks as a dependency when it is not one. As an extreme instance, obs_test cannot depend on pipe_tasks, since it is intended to be usable by lower-level packages.

I agree with Russell. pipe_tasks is (one place) where tasks live. obs_* is configuration appropriate to an observatory/instrument; it should not be considered to include primary science driver tasks.

I agree with the basic philosophy espoused by K-T & Russell. But in fact, obs_cfht provides its own ISR tasks which derive from those defined in pipe_tasks, thereby introducing a dependency of the former on the latter. obs_* packages that, like obs_cfht, depend on pipe_tasks will provide the primary science drivers as a side effect of being set up, whereas others won’t. That inconsistency is unfortunate, but I don’t think it’s fatal.

Are the tasks in obs_cfht intended as primary science drivers, or are they replacements for other tasks that are executed by the primary science drivers once obs_cfht is set up? I thought it was the latter, so that users don’t need to know whether obs_cfht contains tasks.

This is a huge issue for anyone new to the stack; it is enormously difficult to know what lives where without having significant experience with the stack.

I think it’s very important to minimize the number of possible entry points the novice user has to decipher. "I have data from X instrument, I will start at obs_X" is an intuitive story that I would hate to disrupt without a case for how automatically setting up pipe_tasks would lead to trouble for the user.

If we want to make pipe_tasks or some other package the user-facing front end to the stack, then I hope that discussion would be predicated on how we can make that package obvious and helpful to someone who is new.

You’re correct that the tasks in obs_cfht are not primary drivers. However, setting up obs_cfht gets you the primary drivers anyway, since it depends on pipe_tasks; setting up obs_blah might not. Users don’t need to know this, but folks will notice and be confused.

I’m with Colin on this one, and in fact this seems like a good use case for setupOptional.

I would expect obs_cfht to depend on pipe_tasks if that’s the entry point. I would not expect to have to set up two packages, so if you really want to make obs_cfht independent of pipe_tasks you’d need to add a reduce_cfht package that depends on both of them, and I don’t see the point.

I don’t get this. setupOptional of pipe_tasks doesn’t seem to gain anything. Can you clarify?

If that is the requirement, then there is no alternative to having the obs_* package be the entry point, since obviously the camera-specific configuration must be available. I worry a bit that thinking of the obs_* packages as being at the “top” of the hierarchy as opposed to “drivers” (in the graphics-card sense) will encourage practices that may make it difficult to easily process data from multiple cameras at the same time. But it’s not an unreasonable choice.

My thinking there was that setupOptional will set up pipe_tasks if available, but it tells users and developers that there is not a direct dependency (e.g. no imports or #includes). Unless there is a direct dependency inside of obs_cfht, in which case it should have setupRequired.
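To make the setupRequired / setupOptional distinction concrete, here is a minimal Python sketch that reads the text of an EUPS table file and sorts the named products into required and optional. The table contents and the helper name classify_dependencies are hypothetical, made up for illustration; they are not copied from obs_cfht.

```python
import re

# Hypothetical table-file contents, for illustration only (not taken from obs_cfht).
EXAMPLE_TABLE = """\
setupRequired(afw)
setupOptional(pipe_tasks)
envPrepend(PYTHONPATH, ${PRODUCT_DIR}/python)
"""

# Matches lines such as "setupRequired(afw)" or "setupOptional(pipe_tasks)".
_DEP_LINE = re.compile(r"^\s*setup(Required|Optional)\(\s*([\w.-]+)")


def classify_dependencies(table_text):
    """Split the products named in a table file into required and optional lists."""
    deps = {"required": [], "optional": []}
    for line in table_text.splitlines():
        match = _DEP_LINE.match(line)
        if match:
            # group(1) is "Required" or "Optional"; group(2) is the product name.
            deps[match.group(1).lower()].append(match.group(2))
    return deps


print(classify_dependencies(EXAMPLE_TABLE))
```

In this made-up table, pipe_tasks would be set up when available but is flagged as not being a direct dependency, which is the signal described above.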

Looking in the package, there are "from lsst.pipe.tasks ..." imports, so it needs setupRequired pipe_tasks anyway, and the point seems moot.
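For anyone who wants to repeat that check on another obs_* package, a rough sketch along these lines would list the Python files that import lsst.pipe.tasks directly. The function name files_importing_pipe_tasks is an invention for this example, and the regular expression only catches plain top-of-line import statements, so treat a clean result as a hint rather than proof.

```python
import re
from pathlib import Path

# Matches "from lsst.pipe.tasks..." and "import lsst.pipe.tasks..." at the start of a line.
_IMPORT = re.compile(r"^\s*(?:from|import)\s+lsst\.pipe\.tasks\b", re.MULTILINE)


def files_importing_pipe_tasks(package_root):
    """Return sorted relative paths of .py files that import lsst.pipe.tasks."""
    root = Path(package_root)
    hits = []
    for path in root.rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="replace")
        if _IMPORT.search(text):
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

A non-empty result suggests the package has a direct dependency and would want setupRequired rather than setupOptional for pipe_tasks.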