We are announcing the LSST AGN Science Collaboration’s 2021 Data Challenge. The purpose of this challenge is to get more people involved in the work needed to do AGN science with the upcoming LSST data. To that end, we have produced a common exploratory dataset that can be used to develop tools for 1) parameterization of AGN light curves, 2) AGN selection, and 3) AGN photo-z. A panel of judges (consisting of the AGN SC leadership team and multiple members of other LSST science collaborations) will award prizes for derivative work that advances the goals of the LSST AGN SC and AGN science with LSST in general. We have LSSTC funding to award 1st, 2nd, and 3rd prizes of $2000, $1500, and $1000, respectively. In addition, there is $5000 of funding for participation awards (10-20 awards of $250-$500 each) and $3000 for page charges to encourage publications derived from the competition. The deadline for submissions is 17 September 2021.
More details about the competition and the data set(s) available for your use can be found in the GitHub repository RichardsGroup/AGN_DataChallenge (Information for LSSTC AGN Data Challenge).
Submissions would ideally be in the form of Jupyter notebooks, but the panel of judges will consider all reasonable submissions adhering to the following format:
- Introduction (What is the main goal your submission addresses within the data challenge? Does this goal relate to items in the AGN SC roadmap? If not, should a new item be added to the AGN SC roadmap?)
- Data (What data sets are you using for the challenge? See the details about what data are available. Why do these data sets allow you to address your main goal?)
- Methods (Describe your method to extract the information for your main goal from the data. What is innovative about your implementation/application of this method to the data?)
- Results (Summarize your results. Include plots and statistics that illustrate your results. Discuss future improvements to the method or what future features could help to improve your results.)
- Code (Provide enough code that your results can be confirmed and tested by the judges on a “blinded” subsample; see below. Prize winners will be required to make their code available to the AGN SC and/or broader LSST community.)
We imagine that most submissions will fall into three categories:
- Category 1: AGN classification, as measured by both completeness and efficiency.
- Category 2: AGN photo-z accuracy, as measured by a robust estimator of the RMS and an outlier fraction, specifically sigma_NMAD and f_out (see Lee & Chary 2020, Equation 1).
- Category 3: Other results not otherwise specified here, along the lines of “Most creative effort to do things that we haven’t thought of”, for example characterization of light curve data. This category is important because we are using real (rather than simulated) data that often lack “truth” (class and redshift). Thus there may be submissions that are exploratory in nature and do not address category 1 or 2. Such submissions could include code that allows additional data (e.g., thumbnail images) to be added to the challenge. Category 3 submissions will largely be considered for participation awards unless the submitter is able to make their relevance/importance exceedingly clear to the judges.
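As a rough illustration of the metrics named for the first two categories, the sketch below computes them with NumPy. It is not the judges' scoring code. It assumes the commonly used definitions: completeness is the fraction of true AGN that are selected, efficiency is the fraction of selected objects that are true AGN, sigma_NMAD is 1.48 times the median of |Δz − median(Δz)| / (1 + z_spec), and f_out is the fraction of objects with |Δz| / (1 + z_spec) above a threshold. The 0.15 threshold and the function names are assumptions made here; consult Lee & Chary 2020, Equation 1, for the exact form used in the challenge.

```python
import numpy as np


def completeness_efficiency(is_agn_true, is_agn_pred):
    """Return (completeness, efficiency) for a binary AGN selection.

    Assumed definitions: completeness = TP / (TP + FN),
    efficiency = TP / (TP + FP).
    """
    truth = np.asarray(is_agn_true, dtype=bool)
    pred = np.asarray(is_agn_pred, dtype=bool)
    n_true_pos = np.sum(truth & pred)
    n_true = np.sum(truth)      # all true AGN (TP + FN)
    n_selected = np.sum(pred)   # all selected objects (TP + FP)
    completeness = n_true_pos / n_true if n_true > 0 else float("nan")
    efficiency = n_true_pos / n_selected if n_selected > 0 else float("nan")
    return float(completeness), float(efficiency)


def photoz_metrics(z_spec, z_phot, outlier_threshold=0.15):
    """Return (sigma_NMAD, f_out) under the assumed common definitions.

    outlier_threshold=0.15 is an assumption, not taken from the announcement.
    """
    z_spec = np.asarray(z_spec, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)
    dz = z_phot - z_spec
    sigma_nmad = 1.48 * np.median(np.abs(dz - np.median(dz)) / (1.0 + z_spec))
    f_out = np.mean(np.abs(dz) / (1.0 + z_spec) > outlier_threshold)
    return float(sigma_nmad), float(f_out)
```

Both functions take plain arrays, so they can be applied to whatever training or test split a submission defines.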
Blinded data: Users will need to generate their own training and test sets from the data provided, but the dataset builders have set aside a “blinded” subsample that will be used to test submissions addressing categories 1 and 2 (or 3 if appropriate).
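One simple way to generate your own training and test sets is a reproducible random split of object indices, sketched below. The function name, the 20% test fraction, and the seed are illustrative assumptions; this says nothing about how the blinded subsample itself was constructed.

```python
import numpy as np


def split_indices(n_objects, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) as disjoint arrays of row indices.

    test_fraction and seed are illustrative defaults, not challenge rules.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_objects)
    n_test = int(round(test_fraction * n_objects))
    return shuffled[n_test:], shuffled[:n_test]
```

Fixing the seed means the same split can be regenerated later, which helps when results need to be confirmed by someone else.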
Judges will be guided by these categories, metrics, and the analysis of blinded data, but will not be beholden to them, as we imagine that submissions may have value beyond such statistics.