Pattern for testing code that relies on RSP / Data services

frossie · February 29, 2024, 2:24am

At today’s SQuaRE office hours a developer brought us an interesting discussion of how best to test procedural or library code that relies on a running data service. This is a summary of our recommendations in case they are useful to other devs.

In this example let’s say we have some code in a package P that runs on data retrieved from service S. An example of this would be code that retrieves data from an RSP API service and then performs analytics on it.

It might be tempting to write tests so that CI (e.g. GitHub Actions) for package P hits the endpoint of service S. We are NOT fans of this approach for a number of reasons: you are hardwiring a service to hit; you have to deal with authentication issues; you rely on that service being available; you may experience errors due to problems that are not due to anything you have done in package P (false positives); you are exercising a path (client-server interaction) that is already tested elsewhere by us; and you may be spamming an endpoint with considerable load (e.g. a large database query).

It might also be tempting to stand up service S in the CI container to test against a local deployment. We do NOT recommend this option; it does not scale well with the number of services and results in slow CI with large containers.

Instead, we recommend a two-pronged approach:

1. Run standalone unit tests in CI with pre-fetched checked-in test output.
2. Use mobu end-to-end testing to test interactions with deployed services and other dependencies.

Running standalone tests

Let’s say your code is intended to be used on the results of a VO query that returns a VO Table. Instead of doing this query live on a running service in CI, store a VO Table in your repository and run tests against that. This lets your tests run quickly without relying on external services.
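
Here is a minimal sketch of that pattern, using only the Python standard library so it runs anywhere. Everything named here is hypothetical: `load_votable_rows`, `mean_magnitude`, the column names and the fixture content are made up for illustration. In a real repository the fixture would be a checked-in file (say, somewhere under your tests directory) and you would more likely read it with astropy; here it is written to a temporary file to stand in for the checked-in one.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Stand-in for a VOTable file checked into the repository.
# Hypothetical content: two rows with an object id and a magnitude.
FIXTURE_VOTABLE = """<?xml version="1.0"?>
<VOTABLE version="1.3" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE>
    <TABLE>
      <FIELD name="objectId" datatype="long"/>
      <FIELD name="mag" datatype="double"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>1</TD><TD>20.0</TD></TR>
          <TR><TD>2</TD><TD>22.0</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""

NS = {"v": "http://www.ivoa.net/xml/VOTable/v1.3"}


def load_votable_rows(path):
    """Read a TABLEDATA-serialized VOTable into a list of dicts of strings."""
    root = ET.parse(path).getroot()
    names = [field.get("name") for field in root.iterfind(".//v:FIELD", NS)]
    rows = []
    for tr in root.iterfind(".//v:TR", NS):
        cells = [td.text for td in tr.iterfind("v:TD", NS)]
        rows.append(dict(zip(names, cells)))
    return rows


def mean_magnitude(rows):
    """Hypothetical 'package P' analysis: average the mag column."""
    mags = [float(row["mag"]) for row in rows]
    return sum(mags) / len(mags)


def test_mean_magnitude_from_fixture():
    """Unit test that never contacts service S."""
    with tempfile.TemporaryDirectory() as tmp:
        fixture = Path(tmp) / "query_result.vot.xml"
        fixture.write_text(FIXTURE_VOTABLE)
        rows = load_votable_rows(fixture)
    assert len(rows) == 2
    assert mean_magnitude(rows) == 21.0
```

The point of the split is that `mean_magnitude` only ever sees rows, so it does not care whether they came from a live query or from a file on disk.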

Opinions differ on whether one should use something like pytest-vcr to obtain this output; it seems to be predominantly a matter of aesthetics and anticipated modes of failure.
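
For anyone unfamiliar with the record-and-replay idea behind tools like pytest-vcr, here is a hand-rolled sketch of the concept. To be clear, this is NOT pytest-vcr's API: `cached_fetch`, `fake_live_query` and the JSON "cassette" file are hypothetical names invented for this illustration. The first call performs the fetch and stores the response; later calls replay the stored copy without touching the service.

```python
import json
import tempfile
from pathlib import Path


def cached_fetch(cassette, fetch):
    """Replay the recorded response if one exists; otherwise fetch once and record it."""
    cassette = Path(cassette)
    if cassette.exists():
        return json.loads(cassette.read_text())
    response = fetch()
    cassette.write_text(json.dumps(response))
    return response


calls = []


def fake_live_query():
    """Hypothetical stand-in for a real query against service S."""
    calls.append(1)
    return {"rows": [[1, 20.0], [2, 22.0]]}


with tempfile.TemporaryDirectory() as tmp:
    cassette_path = Path(tmp) / "query.json"
    first = cached_fetch(cassette_path, fake_live_query)   # records
    second = cached_fetch(cassette_path, fake_live_query)  # replays

# The "service" was only contacted once; the second result came from disk.
assert first == second
assert len(calls) == 1
```

Whether you record the output this way or fetch it by hand and check it in, the resulting tests run against a file in the repository either way.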

Bonus: you can code on an airplane. What else is there to do?

Using mobu

Mobu is a harness developed by SQuaRE for end-to-end service testing and monitoring. It comes standard with all RSP (phalanx) installations. It has a number of capabilities, but in this case we are going to focus on the notebook runner.

The notebook runner takes RSP notebooks in designated GitHub repositories and constantly executes them in exactly the same manner a human user of the RSP Notebook aspect would. An example of this is the tutorial notebooks developed by CST for the science users. Mobu is constantly running them against designated containers (such as the current recommended, the latest weekly, etc.) and posts any errors to Slack status channels such as #status-data-lsst-cloud.

Our recommendation is that you provide us with a notebook that uses package P in a realistic situation. Mobu will run it, thus exercising all runtime dependencies, and alert us if there is a problem. This means that you’re not only testing your code’s interaction with service S but also any other dependencies (science pipelines, Butler, python versions, etc.). Moreover, if the execution fails, we will check that it’s not a problem with a running service before bothering you. Another major advantage is that this service independently runs on all environments; mobu will run it on the production science cluster (aka IDF prod aka data.lsst.cloud), on the staff cluster at USDF (usdf-rsp), on integration clusters, etc. This provides the widest realistic coverage of your package and can also spot problems before they affect users in production, such as a failure on an integration cluster on the latest weekly.

You can read our future development roadmap for mobu in SQR-080.

mschwamb · February 29, 2024, 4:31am

This is really helpful. Thanks for clarifying. This may be a question for the CST or for @frossie: is it possible to get this posted somewhere in a form that is not a community post on the Rubin website, or a link somewhere on the Rubin website, so the contents of this post are easily accessible in 6 months or a year’s time?

ktl · February 29, 2024, 5:23am

We have content on Community from almost 9 years ago. You can get a permalink to Frossie’s post using the “share link” button at the bottom. So it’s not clear to me that this is significantly less accessible than the alternatives. (The one factor might be that to my knowledge as of now Community is not indexed by www.lsst.io.)

frossie · February 29, 2024, 5:42am

Yup totally - we are setting up more information on mobu on a dedicated site in the future and also will be expanding developer-oriented information in phalanx.lsst.io. I just wanted to post this quickly here for now so I could link to it from Slack since it closes off a discussion thread.