Hello all, I want to run ap_pipe on different batches of visits from a given night (as 3 separate jobs) to reduce the run time. What is the best way to do this so that I end up with one common association.db file covering all the visits from the 3 jobs? Thank you
Hello, thank you for your question. ap_pipe.py is designed so that multiple calls can update the same database, so there’s no problem there.
There may be a problem with running the jobs in parallel, however; one possibility is that sources detected in visits processed simultaneously would be assigned to different objects at the same sky position. @cmorrison would be better able to tell you how well the database handles concurrent access.
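To make the concern concrete, here is a toy sketch of how simultaneous jobs could create duplicate objects. It is not the actual association code: the `associate` function, the matching radius, and the coordinates are all hypothetical, and the "database" is just a Python list.

```python
import math

MATCH_RADIUS = 1.0  # hypothetical matching radius, arbitrary units


def associate(sources, known_objects):
    """Return new objects for the sources that match nothing already known."""
    new_objects = []
    for src in sources:
        candidates = known_objects + new_objects
        if not any(math.dist(src, obj) <= MATCH_RADIUS for obj in candidates):
            new_objects.append(src)
    return new_objects


# The same sky position seen in two visits (made-up coordinates).
visit_a = [(10.0, 20.0)]
visit_b = [(10.1, 20.1)]

# Sequential runs: the second visit sees the object created by the first.
db = []
db += associate(visit_a, db)
db += associate(visit_b, db)
sequential_count = len(db)

# Simultaneous runs: both jobs read the same (empty) snapshot before
# either one writes, so each creates its own object.
snapshot = []
new_a = associate(visit_a, snapshot)
new_b = associate(visit_b, snapshot)
concurrent_count = len(snapshot + new_a + new_b)

print(sequential_count, concurrent_count)
```

In the sequential case the two detections end up on one object; in the simultaneous case they end up on two. Whether the real database prevents this depends on how it handles concurrent access, which this sketch does not model.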
Hello, thank you for your answer. So there is no guarantee of safe association with parallel jobs. Is there another way to reduce the time needed by ap_pipe? It took almost 16 hours for one visit (103 CCDs). Thank you
Maybe you could clarify something. When you said “in 3 jobs separately”, did you mean you run ap_pipe.py three times? That is what I assumed at first, and as I said, I am not sure what might happen in that case.
On the other hand, we have done parallel runs internal to the program using the --processes/-j command-line arguments, and those function without corruption (with the caveat, as @mrawls mentioned, that sources may be processed in a different order from that in which they were observed). If you are not yet using that feature, it might give you the speed-up you need.
Hello, yes, this is what I meant by 3 jobs separately (running it three times). Thank you for explaining how you performed parallel runs. I was not using the --processes option before, so it is what I need to reduce the ap_pipe processing time! Thank you very much!