DM Doc Infrastructure Roadmap & Thoughts

In an issue over at the DESC Computing Infrastructure GitHub repo I shared the current status of documentation infrastructure and informally outlined some of our documentation roadmap. I thought it might be useful to cross-post the roadmap here to help share some of these ideas. Keep in mind that this roadmap is just an informal heads-up and not everything listed here is ticketed in JIRA yet.

  • We have Technotes that are essentially single-page Sphinx projects. We have a listing of Technotes on Community. You can typeset math in Technotes, and make bibliographies with BibTeX.
  • We’re also looking into integrating Jupyter notebooks and Sphinx/reStructuredText into a unified authoring workflow. That would let an author produce plots and tables in the same environment where the document is being written. Related to this is continuous integration of documentation. We want to integrate py.test unit tests with code in notebooks so that documents can be validated. The same writing+testing strategy will also be used for code samples found in our software documentation.
  • LSST the Docs is not only for Sphinx projects. You can think of it as a static web site hosting platform that has Git branch/tag-based versioning built into its URL scheme. You could publish a Jekyll or Pelican site with it, or even a LaTeX document (see LDM-151 as a simple example of how that might be done). The nice thing about LTD is that whenever a branch is pushed to the main repo, a new branch of documentation is automatically published. This feature is great for documentation PR reviews since no one ever needs to compile/generate the doc project locally.
  • We intend to ship a custom visual design for our Sphinx Technotes and software doc sites. The design would also be printable so that the HTML/CSS could be archived as a PDF, although we also believe in the value of being HTML-first. The visual design will allow different doc series to have their own identities (e.g., DM Technotes vs Sims Technotes vs DESC Technotes?).
  • I’m working on version switcher UI components and dashboard pages for LSST the Docs projects. They’re being built with JS/React and leverage the LSST the Docs API.
  • Although we’re not doing it yet, we plan on submitting all technotes to ADS and also archiving key versions to Zenodo. This will fulfill our goal of making technotes citeable in astronomy literature.
  • Documentation discoverability and usability are really important to us, and we recognize that we’re behind in this area. The strategy we’re taking is to allow content to live in whatever context makes sense (issues in JIRA, code on GitHub, conversations on Community, software doc projects and Technotes on LSST the Docs, papers on arXiv/ADS) and unite all of those pieces of information with a single API and search/index site. This is mentioned briefly in a DM Communications Platform inventory technote, although a lot of engineering and design work still needs to be done. The documentation index (we’re calling it DocHub) would include full-text search of documents and code via Elasticsearch in addition to curated metadata-powered index pages. It should be easy to incorporate documents from science collaborations into this API.
  • To help integrate code and documentation repos on GitHub into the DocHub API, I’m formulating a proposal to embed metadata files into all of our GitHub repositories. I’m currently doing some early research on this, but I’m leaning towards a JSON-LD approach with a codemeta vocabulary. Storing metadata this way makes it straightforward to cross-walk metadata to other schemas (e.g., for submission to Zenodo or ADS). Since manually maintaining metadata is time-consuming and error-prone, I’m thinking about ways to ‘template’ the metadata and provide an API service that renders a project’s metadata, using contextual information from a repo’s Git metadata, LICENSE files, other files, and so on.
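
To make the metadata templating idea in the last item a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a settled design: the template fields, the `render_metadata` function, the `@context` URL, and the example repository values are all hypothetical, and the property names simply follow codemeta-style terms as I understand them.

```python
import json
from string import Template

# Hypothetical metadata template. The field selection and the @context URL
# are assumptions for illustration, not a proposed schema.
CODEMETA_TEMPLATE = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "$repo_name",
    "codeRepository": "$repo_url",
    "license": "https://spdx.org/licenses/$license_id",
    "version": "$git_tag",
}


def render_metadata(template, context):
    """Fill the $placeholders in every string value of a flat template.

    Raises KeyError if the context lacks a value that the template needs,
    so incomplete metadata is caught rather than silently published.
    """
    return {
        key: Template(value).substitute(context)
        for key, value in template.items()
    }


# In a real service these values would be derived from the repo itself
# (Git tags, the LICENSE file, and so on); here they are made up.
context = {
    "repo_name": "example-technote",
    "repo_url": "https://github.com/example-org/example-technote",
    "license_id": "MIT",
    "git_tag": "v1.0.0",
}

metadata = render_metadata(CODEMETA_TEMPLATE, context)
print(json.dumps(metadata, indent=2))
```

The template stays checked into the repo while the contextual values are computed at render time, so only the parts that cannot be inferred need to be maintained by hand.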