Formats & infrastructure for design documents?

jsick · September 11, 2015, 11:09pm

Continuing the discussion from Plan for Planning:

I ask this question with some reluctance but I want some clarification on the specs for the GitHub-based documentation platform that I’m specifically being tasked to build. Essentially: am I only supporting reST documents and bootstrapping off the platform we’re building for API docs, as @ktl indicates:

or am I also supporting LaTeX-based design documents as @mjuric suggests:

My original impression is that LDM-151 is something that I leave as-is and just work on the docs currently in Word (LDM-152, LDM-230, LDM-135, LDM-129, am I missing any?).

Or is the goal to have everything, agnostic of source format, automatically built with GitHub web hooks and rendered into nice HTML on a cohesive DM design documents website?

I’m happy to do anything (notwithstanding finite story points ;)) but I just want to get a cohesive vision from @ktl, @mjuric and others for what they’d like on the output end of the design document process.

ktl · September 12, 2015, 3:38am

Sorry, but LDM-151 and the DPDD (LSE-163) also need to be built in a CI-like manner, likely with tex4ht as @timj has said. The output should have anchors suitable for linking from other documents (and JIRA issues). When we tag and release a given document, we’ll need to generate a PDF to place (manually) into DocuShare.

If it makes you feel better, I think the goal is to make these documents useful to outside developers and scientists so that they can understand the workings of the Stack at a level above the API.

frossie · September 12, 2015, 12:42pm

I am not entirely happy with this. It’s quite a lot of scope creep well in advance of having an MVP and getting consensus.

The intended remit was to move software documentation out of Confluence so it can be extended and CI’d. It was not to re-invent DocuShare or some other CMS.

It’s not clear to me that the effort-to-reward ratio in dealing with largely static design documents is worth this; moreover, they can’t be CI’d in any meaningful sense (e.g., they don’t have code examples). I am certainly not able to allocate Jonathan’s effort in W16 to this beyond maybe a “this is how a LaTeX design document could be done in RST if anyone would prefer to move to that”.

The priorities for the documentation infrastructure are (1) Sphinx instead of Doxygen, (2) live software documentation out of Confluence, and (3) CI’d examples/tutorials. I see no real gain in blowing up this task to the point that it slows down this high-priority work.

jbosch · September 12, 2015, 3:36pm

If we want to have a cross-referencing system between all the documents, I think the argument that we should keep some of them in LaTeX while using reStructuredText for the rest is a lot weaker. It’ll be a lot easier to just use one of those systems for cross-linking rather than invent our own to bridge them.

For what it’s worth, even though I’ve barely used it, I think I’d prefer reStructuredText; LaTeX is a pain, and the only reasons it’s still popular with scientists are inertia and math typesetting, but my understanding is that all of the modern markup languages can now do the latter just as well. And I think reStructuredText will make @jsick’s life easier and the presentation of the documentation better online.

In any case, I think the structure and formatting of these documents is simple enough that the choice of markup won’t be a blocker for anyone who is interested in contributing to them; none of the options we’re considering are hard to use for the basic things we’ll be doing with them.

ktl · September 12, 2015, 4:54pm

There’s no need for that. I could maybe see linking from the Apps design document (LDM-151) to the DPDD as Mario has suggested, but they’re both in LaTeX. All the other links I’m talking about are manual.

There are only two things I see here: 1) build a system that can build our in-progress design documents (which need not be but probably should be one that can build our code documents); 2) move our Word documents to something GitHub-able. There’s no need to reinvent DocuShare which does none of the above; we are continuing to use that as our “source of truth” archive.

The whole point is to make it possible for them to not be “largely static” but instead updated continuously as we progress through Construction. This documentation is just as important to the usability of the Stack by outside users and our own developers as “what does getImage() return?”, and at least some of the content we are intending to capture is currently in Confluence pages.

@frossie To be very frank (and putting this out in semi-public), nothing is higher priority for me than setting the overall DM labor budget for Construction, because otherwise there is a significant risk that DM will not be able to complete its scope. The window for modifying that labor budget if it proves to be necessary is closing very rapidly. Getting easily reviewable (by a Project Scientist with very little time), easily edited (by a large group of contributors) design documents that clearly and correctly specify what and how we are building the DM System so that we can reasonably estimate and justify that overall DM labor budget is thus very high on my priority list. CI’d examples and tutorials are not as high.

jsick · September 12, 2015, 5:00pm

The vision is great

I fully agree with K-T’s silver lining that this exercise will make the Design Docs more useful for everyone. After the LDM-151 conversion exercise I heard from new DMers that they were pleasantly surprised at how useful and informative LDM-151 was. I do think using DocuShare as both an official repository and the human-facing repo does them a disservice. By making these documents more readily, and pleasantly, available to DM and the public as beautiful websites we’ll have:

a great resource for onboarding new team members,
a great place to cross-reference in the other documentation (i.e., the software docs) for a global picture of how things are being built, and frankly
if the documents are more visible, we as a team will be more apt to make sure these docs reflect our actual plans rather than being something that gets written, archived, and left behind in the heat of the battle.

Having continuous deployment of the docs is also key to preventing Design Doc updates from being a massive chore.

Supporting reST & LaTeX for Continuous Deployment will cost Story Points

When I converted LDM-151 to reST, I could use Sphinx and readthedocs.org (RTD). RTD gives us continuous deployment of reST docs for free. If I only had to support reST, I could get on with just converting the existing docs with Pandoc and then spending some story points on extending the reST markup and fixing/re-designing RTD’s HTML/CSS/JS that seemed to be a bit buggy for the case of single-page deeply sectioned documents.
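As a rough illustration of the Pandoc conversion step mentioned above, here is a minimal sketch in Python. It only builds the command line; the helper name `pandoc_docx_to_rst` and the example file name are assumptions for illustration, not part of any existing tooling. The `--from`, `--to` and `--output` options are standard Pandoc flags.

```python
from pathlib import Path
from typing import List


def pandoc_docx_to_rst(docx_path: Path) -> List[str]:
    """Build the pandoc argv that converts one Word file to reST.

    The output file sits next to the input with a .rst suffix
    (an assumed layout, chosen only to keep the sketch short).
    """
    rst_path = docx_path.with_suffix(".rst")
    return [
        "pandoc",
        "--from", "docx",   # input format: Word
        "--to", "rst",      # output format: reStructuredText
        "--output", str(rst_path),
        str(docx_path),
    ]
```

The returned list can be handed to `subprocess.run(..., check=True)` on a machine with Pandoc installed. The converted reST would still need manual cleanup of tables, figures and cross-references.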

I feel bad for saying this, but the issue is supporting both reST+sphinx and LaTeX+tex4ht simultaneously. If I have two formats, and two build tools, I can no longer use RTD. Instead, I have to build a ‘readthedocs for LSST DM Design Documents.’ Done properly, my JIRA stories would need to look like this:

Build a web service that uses the GitHub API to listen for webhooks and build (via sphinx or tex4ht) & deploy webpages on commit.
Extend reST to support extra syntax for Design Docs (via sphinx kit, so this story does have synergy with the Stack Docs).
Implement tweaks to tex4ht workflows to support deep linking and inter-document cross referencing.
Build HTML templates and CSS so that the reST and LaTeX documents ultimately have the same look and feel to a reader. Generally make a good reader experience.
Perform content conversion of Word documents
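
To make the first story above more concrete, here is a minimal sketch of the core logic such a web service might need, assuming Python and only the standard library. The function names, and the rule "a `conf.py` means Sphinx, otherwise the first `.tex` file means tex4ht", are assumptions for illustration. `X-Hub-Signature-256` is the HMAC-SHA256 signature header GitHub sends with webhook deliveries when a secret is configured, and `htlatex` is the wrapper script shipped with tex4ht.

```python
import hashlib
import hmac
import json
from pathlib import Path
from typing import List, Tuple


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check a webhook's X-Hub-Signature-256 header against the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)


def handle_push(body: bytes) -> Tuple[str, str]:
    """Return (repository full name, pushed commit SHA) from a push payload."""
    payload = json.loads(body)
    return payload["repository"]["full_name"], payload["after"]


def choose_build_command(repo_dir: Path, out_dir: Path) -> List[str]:
    """Pick a builder from the files in a checked-out document repo.

    Assumed convention: a conf.py marks a Sphinx project; otherwise the
    first .tex file (sorted by name) is treated as the main LaTeX document.
    """
    if (repo_dir / "conf.py").exists():
        return ["sphinx-build", "-b", "html", str(repo_dir), str(out_dir)]
    tex_files = sorted(repo_dir.glob("*.tex"))
    if tex_files:
        # htlatex writes its HTML next to the source, so out_dir is unused here.
        return ["htlatex", tex_files[0].name]
    raise ValueError(f"no Sphinx or LaTeX sources found in {repo_dir}")
```

A real service would wrap these in an HTTP endpoint, clone or update the repository at the pushed commit, run the chosen command, and publish the output. Those parts are left out because they depend on hosting decisions not settled in this thread.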

Honestly, this plan looks like a lot of fun to me, but also keep in mind that I have only two weeks to implement this according to K-T’s schedule. Having only two weeks to sprint on this is nice because in principle it lets me get back to my normal work, but I fear:

It won’t be perfect when it’s deployed September 25, and worst-case, bugs could become an impediment to everyone else’s productivity on this task. In an ideal world I’d have liked to make a Design Doc platform as a cool demo rather than a step in a live project that could set me up to become everyone’s blocker
It will probably need to be continuously improved to some extent for the duration of the Design Doc update process this fall, costing additional story points

Ultimately how we do Design Docs isn’t my decision, but I hope that this post will elucidate what continuous deployment of the reST+LaTeX design docs means from my perspective.

So in an Agile sense, how about I sprint on this for two weeks and then we can decide what to do next?

ktl · September 12, 2015, 5:08pm

@jsick Thanks for your post.

I hope that mine clarifies what the immediate priorities are. While in the long run these documents need to be readable and available to the user and developer community, in the very short run (until November), they only need to be readable and available to a small number of editors and a handful of readers, primarily represented by the Project Scientist. Anything else can be sacrificed, including things like this:

I did not understand that reST+sphinx and $\LaTeX$+tex4ht was going to be complicated. If the editors and Mario find it acceptable, the first thing that comes to my mind is going with reST+RTD for the next three months to capture the proper content and then possibly moving back to $\LaTeX$ afterwards if it is still thought to be necessary.

The second thing that comes to mind is having reST+sphinx+RTD for the formerly-Word documents and a separate quick&dirty system for the DPDD and LDM-151, figuring out how to merge them into one system at a later date.

jsick · September 12, 2015, 5:24pm

Your minimum viable plan looks good @ktl. I’ll get started right away converting all the Word docs to reST on GitHub and hook up off-the-shelf RTD workflows for them. Once that’s in place I can extend reST/Sphinx to suit our needs for Design Docs.

I won’t act on the LaTeX-based docs until we get a decision from the appropriate stakeholders, particularly @mjuric, on what would be good for them (either temporarily porting them to reST or having a separate and dedicated LaTeX CD service; or even investigating services like Authorea).

frossie · September 12, 2015, 5:27pm

I am happy with @jsick’s plan, but bear in mind it is timeboxed, so LaTeX is the thing that will have to get dropped unless miracles happen. Nothing is “complicated” in the “multifit” sense, but even within the documentation scope there are many priorities, strongly expressed by many senior figures and the developers, plus SQuaRE’s own need for stack support; time and resources are finite. Not everything can be “priority #1 we need it now”.