On behalf of the crew at SQuaRE, I’m pleased to introduce LSST the Docs, Data Management’s new documentation publishing platform. LSST the Docs will allow Data Management to create and iterate on documentation more effectively, while also giving readers a better experience.
Soon, you’ll see DM’s technotes, Developer Guide, and some design documents migrate from Read the Docs to the new platform. In the upcoming Fall 2016 cycle we will begin publishing a rebooted Science Pipelines documentation site on LSST the Docs.
You can read more about the platform in SQR-006: The LSST the Docs Platform for Continuous Documentation Delivery.
Why did we build LSST the Docs?
I really admire what Read the Docs has done for open source documentation. Read the Docs has made it so much easier for developers to continuously deploy documentation alongside their projects. At one point, LSST Data Management had 39 projects published with Read the Docs. I have been, and continue to be, grateful for what Read the Docs has done for open source software and the Python community in particular.
But we learned two things from using Read the Docs. First, LSST’s projects demand a lot of flexibility in their build environments. Second, we needed more automation to help manage the fleet of documents that Data Management ships.
Read the Docs is built to be an easy-to-use integrated documentation publishing service, and that integration includes the environment where documentation is built. Unfortunately, the LSST Science Pipelines simply can’t fit in that environment, both because of the computational resources they need and because LSST speaks a different build language than most Python projects (EUPS versus pip). We already have continuous integration services for LSST projects, and it makes sense to build documentation on those as well.
Beyond EUPS, we can also envision projects where data intensive computation, testing, and figure generation are part of the documentation build process. Having flexibility in the build environment makes this possible.
We also found that Read the Docs needed a bit of administrative effort to provision new projects, configure their domain names, and set up new branch builds. While we tried to hide this administrative effort, it became a bottleneck for the team. LSST the Docs is built around an API, meaning that it’s ready to automate and integrate into LSST’s systems and workflows.
What can LSST the Docs do?
Here are some of the most exciting features of the LSST the Docs platform. See SQR-006: The LSST the Docs Platform for Continuous Documentation Delivery for additional detail.
Flexible documentation builds
Documentation can be built on any continuous integration platform. Big projects, like the LSST Science Pipelines documentation, will be built on DM’s Jenkins CI. Smaller documents, like technotes, will be built on Travis CI. We’ve written documentation describing how to set up a .travis.yml. We also have an elegant system for building multi-repository documentation for EUPS-based projects.
LSST the Docs is very flexible in how documentation is built. In essence, it’s a generator-agnostic static site publishing platform. Even Sphinx isn’t a hard dependency; alternative formats, like LaTeX documents, can be published too.
Beautiful, versioned URLs
Every documentation project has its own subdomain on lsst.io, for example ltd-keeper.lsst.io or sqr-006.lsst.io. These URLs are memorable and mean you won’t need a link shortener to refer to projects.
From these domains we publish multiple editions of documentation that map to branches on GitHub. The root URL, example.lsst.io/, hosts the master branch by default (though this is configurable). This gives us beautiful URLs for the canonical versions of the site we want readers to visit by default.
Documentation for branches of a project is published under /v/. For example, a release branch might be published to example.lsst.io/v/v1/ and a ticket branch to example.lsst.io/v/DM-1234/. Documentation for branches will be published automatically as soon as you push to GitHub. I think this feature will be tremendously valuable for documentation reviews during pull requests.
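The URL scheme described above can be sketched in a few lines of Python. This is only an illustration, not LTD Keeper’s code: the function name `edition_url` is made up, and using the branch name directly as the path segment is an assumption (the real platform may transform branch names before publishing).

```python
from urllib.parse import quote

# The source says the root URL hosts "master" by default and that this is configurable.
DEFAULT_BRANCH = "master"


def edition_url(product: str, branch: str, default_branch: str = DEFAULT_BRANCH) -> str:
    """Return the URL where documentation for `branch` would be published.

    Hypothetical helper: the default branch lives at the subdomain root,
    every other branch lives under /v/<branch>/.
    """
    root = f"https://{product}.lsst.io/"
    if branch == default_branch:
        return root
    # Assumption: the branch name is used as the path segment, percent-encoded.
    return f"{root}v/{quote(branch, safe='')}/"


print(edition_url("example", "master"))
print(edition_url("example", "v1"))
print(edition_url("example", "DM-1234"))
```

Passing a different `default_branch` models the configurable root edition mentioned above.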
As a bonus, we retain old documentation builds. Individual builds are published to example.lsst.io/builds/(id)/. This will be helpful for seeing, and sharing, A/B comparisons of your documentation. It also means that if one of the main documentation editions breaks we can immediately hot-swap to any previous documentation build without having to rebuild the documentation from scratch.
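One way to picture the hot-swap is to treat an edition as a pointer to a retained build. The sketch below is a toy model under that assumption; the `Edition` class and its method names are hypothetical and are not taken from LTD Keeper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Edition:
    """A published path (for example "/" or "/v/v1/") that points at one build.

    Hypothetical model: builds are never deleted, so switching an edition
    means changing which build id it points at.
    """
    path: str
    build_id: str
    history: List[str] = field(default_factory=list)

    def point_to(self, build_id: str) -> None:
        """Publish another retained build under this edition; nothing is rebuilt."""
        self.history.append(self.build_id)
        self.build_id = build_id

    def roll_back(self) -> str:
        """Return to the previously published build and report its id."""
        if not self.history:
            raise ValueError("no earlier build to roll back to")
        self.build_id = self.history.pop()
        return self.build_id


main = Edition(path="/", build_id="1")
main.point_to("2")       # a new build is published to the root edition
main.roll_back()         # build 2 turns out to be broken, so swap back
print(main.build_id)     # prints 1
```

Because the old build is still stored, rolling back is a bookkeeping change rather than a new documentation build.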
And don’t worry, we’ll add <link rel="canonical" href="..."> elements to our HTML templates to help search engines sort through our documentation versions.
Served by Fastly
To give readers the best experience we’re using the Fastly content distribution network for everything published by LSST the Docs. Whether you’re West Coast, East Coast, down in Chile, over in France, or anywhere else on Earth, there will be a nearby Fastly point of presence serving you docs.
Besides performance, we’re also taking advantage of the Varnish caching layer that Fastly hosts. Varnish lets us map URLs for all documentation projects, and their individual builds, to directories in a single AWS S3 bucket (see SQR-006 for details). This will allow us to scale LSST the Docs to host an enormous number of projects without breaking a sweat. (Hat tip to HashiCorp for advocating this pattern.)
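To make the idea of mapping URLs to bucket directories concrete, here is a small Python sketch. The real rules are written in Varnish configuration on Fastly and are described in SQR-006; the function name `bucket_key` and the bucket layout used here (one top-level directory per project, mirroring the URL path) are assumptions made purely for illustration.

```python
def bucket_key(host: str, path: str) -> str:
    """Map a request for <product>.lsst.io/<path> to an object key in one shared bucket.

    Hypothetical layout: the subdomain becomes the top-level directory,
    and the URL path is kept underneath it.
    """
    product = host.split(".")[0]
    # Directory-style URLs are served from an index page.
    if not path or path.endswith("/"):
        path += "index.html"
    return f"{product}/{path.lstrip('/')}"


print(bucket_key("example.lsst.io", "/"))
print(bucket_key("example.lsst.io", "/v/DM-1234/"))
print(bucket_key("sqr-006.lsst.io", "/builds/7/_static/site.css"))
```

Under a scheme like this, adding a project means adding a directory and a subdomain rather than provisioning new storage or servers.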
Last but not least, Fastly allows us to securely deliver content over TLS (i.e., HTTPS). This is nice to have for static documentation projects, but will become critical for serving interactive content with client-side JavaScript.
API Driven
Starting with our earliest whiteboard design sessions, we knew that LSST the Docs needed to be decomposed into discrete microservices with well-defined interfaces. This design gives us flexibility, and isolates details. For example, LSST the Docs can publish documentation for EUPS projects without having to be aware of EUPS. Below is an architectural diagram describing how an EUPS-based documentation project, like the Science Pipelines, is published by LSST the Docs.
<img src="/uploads/default/original/1X/0cbd2cec69547ae01a47fd66caa07173aa5ab69d.png" width="690" height="260" alt="Figure 1. LSST the Docs microservice architecture.">
At the heart of LSST the Docs is LTD Keeper, a RESTful web app. LTD Keeper maintains the state of documentation projects and builds, and coordinates the builders on CI servers (LTD Mason) and other web services (AWS S3 and Route 53, and Fastly).
This API can also be consumed by external services. For example, documents can use this API to power user interface elements that help readers find the right version of the docs. Dashboards can use the API to list documentation projects and their versions. Even ChatOps bots could use this API.
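As a sketch of the version-finder idea, the Python below turns a list of edition records into menu entries and marks the edition the reader is currently viewing. The record shape (`slug` and `url` fields inside an `editions` list) and the sample data are hypothetical stand-ins; consult the LTD Keeper documentation for the actual API resources and response schema.

```python
import json
from typing import Dict, List

# Hypothetical response body; the real LTD Keeper schema may differ.
SAMPLE_RESPONSE = json.dumps({
    "editions": [
        {"slug": "main", "url": "https://example.lsst.io/"},
        {"slug": "v1", "url": "https://example.lsst.io/v/v1/"},
        {"slug": "DM-1234", "url": "https://example.lsst.io/v/DM-1234/"},
    ]
})


def version_menu(response_text: str, current_url: str) -> List[Dict[str, object]]:
    """Build version-switcher entries from an (assumed) JSON list of editions."""
    editions = json.loads(response_text)["editions"]
    return [
        {
            "label": edition["slug"],
            "href": edition["url"],
            # Highlight the edition that the reader is on right now.
            "current": edition["url"] == current_url,
        }
        for edition in editions
    ]


for entry in version_menu(SAMPLE_RESPONSE, "https://example.lsst.io/v/v1/"):
    marker = "*" if entry["current"] else " "
    print(f"{marker} {entry['label']}: {entry['href']}")
```

A dashboard or chat bot could consume the same records and simply render them differently.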
Deployed with Kubernetes
Since this was my first major DevOps project, I wanted to cultivate modern best practices for deploying applications to the web. We decided to deploy the LTD Keeper API server (built on Flask in Python 3) in Docker containers orchestrated by Kubernetes. This is all done in the Google Container Engine. Below is a diagram of what the application deployment looks like.
<img src="/uploads/default/original/1X/7c594bec1ad157f92ccc315cf1bbb8f233c23923.png" width="690" height="467" alt="Figure 2. LTD Keeper’s Kubernetes deployment architecture.">
A Kubernetes load balancer service receives traffic from the internet and routes it to pods with Nginx containers that terminate TLS traffic. These forward the traffic, via another internal load balancer, to pods composed of a Docker container that reverse-proxies traffic and finally a container with the uWSGI-run Flask application. All of the pods are managed by Kubernetes replication controllers, meaning that it’s easy to scale the number of pods, and also to deploy updated pods without service interruptions.
The best part is that this entire infrastructure is configured and managed on the command line with a few YAML files. The LTD Keeper documentation contains complete deployment instructions.
I couldn’t be happier with Kubernetes, and I believe that this deployment architecture will be a useful template for future SQuaRE projects.
Onwards
With LSST the Docs, we are at last in a position to move forward on DM’s documentation projects, not least of which will be a reboot of the LSST Science Pipelines documentation. We look forward to migrating Science Pipelines to Sphinx during the Fall 2016 development cycle.
This platform will also enable exciting integrations and automations for the LSST DM Technote platform (SQR-000) and the DocHub project (SQR-011) for LSST documentation search and discovery.
We’re continuously improving LSST the Docs. The Fall 2016 DM-5858 epic lists some of the planned work, including dashboards for listing documentation versions and builds.
Get the code and read the docs
LSST the Docs code is MIT-Licensed open source. It’s built either natively for, or compatible with, Python 3. Here are the main repositories and their documentation:
- LTD Mason
  - Docs: https://ltd-mason.lsst.io
  - GitHub: https://github.com/lsst-sqre/ltd-mason
- LTD Keeper
You can follow the progress of LSST the Docs on JIRA by searching for the label ‘lsst-the-docs’.
The technote describing this project, its philosophy, architecture, and implementation is available at https://sqr-006.lsst.io.