Using pybind11 instead of Swig to wrap C++ code?

pschella · August 26, 2016, 11:06pm

The LSST stack currently uses Swig to wrap C++ code and make it accessible from
Python. Here we discuss using pybind11 instead.

The goal of this post is to present our current understanding of using pybind11
for the stack, to encourage discussion in preparation for a formal RFC sometime
next week. If an RFC to switch from Swig to pybind11 is adopted, it will then
be given to the System Architect to make a final decision.

We explicitly do not intend to pass this by RFC alone, and there should be
plenty of room to discuss beforehand.

Executive summary

pybind11 is a lightweight header-only library that exposes C++ types in Python.
We can use this as an alternative to Swig.

The main difference is that pybind11 uses C++ templates with compile time
introspection to generate wrapper code, where Swig uses a custom C++ parser.
pybind11 is similar in syntax and use to Boost Python, but more modern and actively
maintained.

As with Boost Python, and unlike Swig, types need to be wrapped explicitly by
specifying the classes and (member) functions. This has the downside of having
to write and maintain the wrappers but has the advantage of being explicit and
allowing wrappers to be made more Pythonic.

Moreover, unlike Swig the wrapper files are just C++11, rather than a custom language.
This makes it easier for C++ developers to read, write and understand than Swig.
With pybind11, developers only need proficiency in two languages (C++ and Python).

An additional advantage over Swig is that pybind11 more easily allows for build
parallelization, since not all headers are wrapped into one module. Overall build
speeds are comparable (for the packages wrapped so far), but partial rebuilds might
take substantially less time.

A potential downside is that pybind11 is a relatively young project. It is
actively developed at the moment (it has 32 contributors on GitHub), but
there are no guarantees that it stays this way. This risk is offset by the
small size of the library and its header-only nature, which would make taking
over maintenance relatively easy (we have already contributed major functionality
to its codebase as part of the trial).
With Swig the risk is lower, but it would be much more difficult to take over
maintenance if required.

As part of the experiment pybind11 is now used, on a separate branch, to wrap all
packages up to afw as well as afw.geom and afw.coord. The process of wrapping
the stack with pybind11 has been labour intensive; however, no major obstacles have
been encountered and I am happy with the readability of the wrappers produced.

We should now discuss the relative merits of switching on this thread and prepare
a formal RFC.

What follows is some more detailed information.

About pybind11

pybind11 is a lightweight header-only library that exposes C++ types in
Python and vice versa, mainly to create Python bindings of existing C++
code. Its goals and syntax are similar to the excellent Boost.Python
library by David Abrahams: to minimize boilerplate code in traditional
extension modules by inferring type information using compile-time
introspection.

The main issue with Boost.Python—and the reason for creating such a
similar project—is Boost. Boost is an enormously large and complex suite
of utility libraries that works with almost every C++ compiler in
existence. This compatibility has its cost: arcane template tricks and
workarounds are necessary to support the oldest and buggiest of compiler
specimens. Now that C++11-compatible compilers are widely available,
this heavy machinery has become an excessively large and unnecessary
dependency.

Think of this library as a tiny self-contained version of Boost.Python
with everything stripped away that isn’t relevant for binding
generation. Without comments, the core header files only require ~2.5K
lines of code and depend on Python (2.7 or 3.x) and the C++ standard
library. This compact implementation was possible thanks to some of the
new C++11 language features (specifically: tuples, lambda functions and
variadic templates). Since its creation, this library has grown beyond
Boost.Python in many ways, leading to dramatically simpler binding code
in many common situations.

Current status of LSST stack migration test

As part of the DM-6168 epic I have been working to migrate (part of) the LSST stack
to pybind11 in order to test its feasibility. Currently all afw dependencies have
been successfully wrapped (on the epic branch). These are:

base
daf_base
daf_persistence
ndarray
pex_config
pex_exceptions
pex_logging
pex_policy
sconsUtils
utils

In addition afw.geom and afw.coord have also been wrapped.

Overall no major issues have been identified in wrapping our code with pybind11.

Most things worked out of the box, but some special cases are described below.

sconsUtils

Nothing in sconsUtils itself is wrapped, but it does now contain build support
for pybind11 packages (in the form of scripts.pybind11 which is equivalent to
scripts.python for Swig).

pex_exceptions

We added support for translating custom exception types from C++ to Python to
pybind11. This patch was accepted upstream. All pex exceptions are now
automatically translated, and it is easy to add more custom exception types.
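
The idea behind such a translation layer can be sketched in plain C++, without pybind11 itself. A translator receives a std::exception_ptr, rethrows it, and picks the Python exception to raise based on the C++ type that was caught. In pybind11 this role is played by a function registered with py::register_exception_translator. The two exception classes and the returned names below are hypothetical stand-ins, not the actual pex_exceptions types or the upstream patch.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for custom C++ exception types.
struct NotFoundError : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct InvalidParameterError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Returns the name of the Python exception a binding layer would raise.
// The most specific handlers come first; anything else falls through.
std::string translate(std::exception_ptr p) {
    try {
        if (p) std::rethrow_exception(p);
    } catch (const NotFoundError &) {
        return "NotFoundError";
    } catch (const InvalidParameterError &) {
        return "InvalidParameterError";
    } catch (const std::exception &) {
        return "RuntimeError";
    } catch (...) {
        return "RuntimeError";  // unknown type: generic fallback
    }
    return "";  // null pointer: nothing to translate
}

// Convenience helper: run a callable and report what would be raised.
template <typename F>
std::string raisedBy(F &&f) {
    try {
        f();
    } catch (...) {
        return translate(std::current_exception());
    }
    return "";  // the callable did not throw
}
```

Adding another custom exception type then amounts to adding one more catch clause ahead of the std::exception handler.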

ndarray

pybind11 type casters were added to ndarray to allow for automatic and
transparent casting of ndarray and Eigen types to and from NumPy arrays
(similar to pybind11 built-in support for STL containers).

enum

pybind11-wrapped enums did not compare equal to integers. A patch was written,
and accepted upstream, to enable comparisons between enums and their underlying
types.
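
The comparison that the patch enables can be illustrated on the C++ side. This is only a sketch of the semantics, not the upstream patch; the Parity enum and the equalsUnderlying helper are hypothetical names.

```cpp
#include <type_traits>

// Hypothetical scoped enum standing in for a wrapped C++ enum.
enum class Parity : int { Even = 0, Odd = 1 };

// Compare an enum value with a value of its underlying integer type.
// The second parameter is not deduced, so an int literal works directly.
template <typename E>
bool equalsUnderlying(E value, typename std::underlying_type<E>::type raw) {
    static_assert(std::is_enum<E>::value, "E must be an enum type");
    return static_cast<typename std::underlying_type<E>::type>(value) == raw;
}
```

On the Python side this corresponds to an expression comparing a wrapped enum value with a plain integer and getting True when the values match.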

Compile time

It is too early in the wrapping process to truly say something about the
compile time differences. For the afw dependencies I obtained the following
crude measurements (on my system with -j3, which I should probably have
turned off). The speedup column is the Swig time divided by the pybind11 time,
so values above 1 favour pybind11.

| Package         | pybind11 (s) | swig (s) | speedup (x) |
|-----------------|--------------|----------|-------------|
| utils           | 11.83        | 13.65    | 1.15        |
| base            | 10.97        | 10.91    | 0.99        |
| pex_exceptions  | 8.63         | 10.29    | 1.19        |
| daf_base        | 25.02        | 30.43    | 1.22        |
| pex_policy      | 50.99        | 43.87    | 0.86        |
| pex_logging     | 42.30        | 37.54    | 0.89        |
| pex_config      | 14.57        | 15.60    | 1.07        |
| daf_persistence | 57.74        | 67.62    | 1.17        |

This shows the speeds to be at least roughly comparable. But a real test
would be afw itself, which has not yet been done (to the extent that
a comparison is feasible).
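
For an aggregate view, the table columns can be summed. The sketch below only re-uses the numbers from the table; the struct and function names are made up for illustration. It gives 222.05 s for pybind11 against 229.91 s for Swig, an overall ratio of roughly 1.04, with the same caveats as the per-package numbers (single crude runs with -j3).

```cpp
#include <cstddef>

// One row of the timing table above (times in seconds).
struct Timing {
    const char *package;
    double pybind11;
    double swig;
};

// Values copied from the table.
static const Timing kTimings[] = {
    {"utils", 11.83, 13.65},
    {"base", 10.97, 10.91},
    {"pex_exceptions", 8.63, 10.29},
    {"daf_base", 25.02, 30.43},
    {"pex_policy", 50.99, 43.87},
    {"pex_logging", 42.30, 37.54},
    {"pex_config", 14.57, 15.60},
    {"daf_persistence", 57.74, 67.62},
};
static const std::size_t kNumTimings = sizeof(kTimings) / sizeof(kTimings[0]);

// Sum of the pybind11 column.
double totalPybind11() {
    double total = 0.0;
    for (std::size_t i = 0; i < kNumTimings; ++i) total += kTimings[i].pybind11;
    return total;
}

// Sum of the Swig column.
double totalSwig() {
    double total = 0.0;
    for (std::size_t i = 0; i < kNumTimings; ++i) total += kTimings[i].swig;
    return total;
}

// Overall speedup in the table's convention: Swig time / pybind11 time.
double overallSpeedup() { return totalSwig() / totalPybind11(); }
```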

I think the major gain (if any) will actually come from the way the
code is wrapped.

I chose to wrap each header file into its own module (and then combine them
into the standard packageLib modules at import). This way builds are more
or less independent (the same result could be achieved without splitting
the modules, by splitting the wrappers over object files instead, and that
change could be made easily if desired).

Note that this separation is not perfect. Because pybind11 uses typeid to
index and look up types, they need to be fully defined. So sometimes additional
headers need to be included. But for things that are not actively exposed
in the wrapper, forward declarations are all that is needed, and rarely more
than one extra low-level header is required.
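
Why a full definition is needed can be seen from a toy type registry keyed on typeid. This is a minimal sketch of the general technique and not pybind11's actual internals; TypeRegistry, Point and Angle are hypothetical names.

```cpp
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical minimal registry: maps a C++ type to the name it is exposed under.
class TypeRegistry {
public:
    template <typename T>
    void add(const std::string &pythonName) {
        // typeid(T) requires T to be a complete class type here, so the
        // header defining T must be included; a forward declaration is
        // not enough.
        names_[std::type_index(typeid(T))] = pythonName;
    }

    template <typename T>
    bool contains() const {
        return names_.count(std::type_index(typeid(T))) != 0;
    }

    template <typename T>
    std::string nameOf() const {
        auto it = names_.find(std::type_index(typeid(T)));
        return it == names_.end() ? std::string() : it->second;
    }

private:
    std::unordered_map<std::type_index, std::string> names_;
};

struct Point { double x, y; };  // fully defined: can be registered
struct Angle;                   // forward-declared only: add<Angle>() would not compile

// Builds a registry with the one complete type above.
TypeRegistry makeRegistry() {
    TypeRegistry registry;
    registry.add<Point>("Point");
    return registry;
}
```

A function that merely passes an Angle pointer or reference through can still be declared against the forward declaration; only types that are registered or looked up need their defining header.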

In addition to a possible advantage for partial recompilation and parallel
builds, this approach also allows more separation between add-on Python code
and facilitates (or at least does not inhibit) a future split-up of packages.

Using pybind11 to wrap C++ code in practice

Overall, working with pybind11 to wrap LSST stack code has been a nice
experience. Wrapping tends to be straightforward, and when problems occurred
they were predictable and relatively easy to trace down and solve.

A good thing about pybind11 is that it is relatively easy to step away for a
while and then step back in. The wrapping code is readable and self-contained
which makes it easy to grasp what is going on.

In fact, the tricky part was often to figure out what it was that Swig was
actually doing in order to replicate the behavior.

A minimalistic wrapping example for a class::

#include <string>

struct Pet {
    Pet(const std::string &name) : name(name) { }
    void setName(const std::string &name_) { name = name_; }
    const std::string &getName() const { return name; }

    std::string name;
};

looks like::

#include <pybind11/pybind11.h>

namespace py = pybind11;

PYBIND11_PLUGIN(example) {
    py::module m("example", "pybind11 example plugin");

    py::class_<Pet>(m, "Pet")
        .def(py::init<const std::string &>())
        .def("setName", &Pet::setName)
        .def("getName", &Pet::getName);

    return m.ptr();
}

For a more realistic example of the wrapped code (for afw.geom) see the
pull request for ticket DM-6296.

As can be seen, the wrapping code is just standard and clear C++11. What
you see is what you get, which is what makes understanding and debugging
relatively easy.

For the most part the process of wrapping is also nice and easy.
But it can be labour intensive and verbose. This in turn can be frustrating.
In particular the following things tend to require work::

All function overloads have to be disambiguated by writing an explicit
cast to the proper function pointer. This is not difficult, but is a lot
of work. It also turned out to be hard to script.

Default arguments to functions and constructors need to be explicitly
specified. This is good for documentation but does require some manual work.

Templated types (and functions) need to be explicitly instantiated.
This is true for Swig too, which necessitates Swig %template statements
often wrapped in macros. With pybind11 the solution is to put the wrapper
declaration in another template function and call that with the desired
types (for an example see the PR mentioned above). In my opinion this
looks nicer than the Swig approach, and it requires approximately the same
amount of work, but it did take a lot of time to figure out which types
actually need to be exposed, and it creates a lot of nested templates which
may slow down compile time.

Due to the way pybind11 is written (a lot of template metaprogramming) the
compile time error messages can be somewhat intimidating at first. However,
they tend to be very regular and once you have seen a few you will know at
a glance what to do. Runtime errors are very clear.

Smart pointers and ownership do require some thought. Mostly things just
work. Pybind11 uses std::unique_ptr as its default holder type, which
ensures automatic cleanup once the Python refcount goes to zero. When
std::shared_ptr is required instead the user can easily specify this.
But one has to do that and be consistent, otherwise segfaults are bound to
occur. This problem is easy to solve once identified, but it is not always
easy to catch during compilation. Care is also needed when (raw) pointers
(or references) to internal state are returned. Then the developer has to
specify the lifetime relation between the two objects explicitly (again
easily done, but one has to be careful).
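
The overload and template points above come down to ordinary C++ mechanics, which can be shown without pybind11. In the sketch below the cast selects one overload of a member function (the resulting pointer is what would be handed to a .def() call), and a function template is called once per type to be exposed. Box, Pair, declarePair and the "PairI"/"PairD" names are hypothetical; in a real wrapper the body of declarePair would hold the py::class_ definition for Pair<T>.

```cpp
#include <string>
#include <vector>

// Hypothetical class with an overloaded member function.
struct Box {
    double width = 1.0;
    void scale(double factor) { width *= factor; }
    void scale(int times) {
        for (int i = 0; i < times; ++i) width *= 2.0;
    }
};

// &Box::scale alone is ambiguous; the cast selects exactly one overload.
auto scaleByFactor = static_cast<void (Box::*)(double)>(&Box::scale);
auto scaleByDoubling = static_cast<void (Box::*)(int)>(&Box::scale);

// Calls both overloads through the disambiguated pointers.
double applyBoth(Box box) {
    (box.*scaleByFactor)(3.0);  // width: 1 -> 3
    (box.*scaleByDoubling)(2);  // width: 3 -> 6 -> 12
    return box.width;
}

// Hypothetical class template standing in for a templated stack type.
template <typename T>
struct Pair {
    T first, second;
    T sum() const { return first + second; }
};

// One wrapper declaration per instantiation. Here "declaring" only
// records a name and uses the type once so it is instantiated for T.
template <typename T>
void declarePair(std::vector<std::string> &exposed, const std::string &suffix) {
    Pair<T> probe{T(1), T(2)};
    (void)probe.sum();
    exposed.push_back("Pair" + suffix);
}

// Calls the helper once for every type that should be exposed.
std::vector<std::string> declareAll() {
    std::vector<std::string> exposed;
    declarePair<int>(exposed, "I");
    declarePair<double>(exposed, "D");
    return exposed;
}
```

Deciding which instantiations belong in declareAll is the part that took time in practice; the mechanics themselves are short.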

Overall these are not show-stoppers, but they do cause the bulk of the time
required to produce successful wrappers.
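
The holder-type point from the list above is the one most likely to end in a segfault, so here is a plain C++ sketch of what "being consistent" means. When the C++ side shares an object through std::shared_ptr, the wrapper has to hold a std::shared_ptr as well (in pybind11, by naming it as the holder in the class declaration, e.g. py::class_<Catalog, std::shared_ptr<Catalog>>). Catalog and Registry are hypothetical names.

```cpp
#include <memory>

// Hypothetical object that the C++ side shares with its callers.
struct Catalog {
    int size = 0;
};

// The C++ API keeps one shared_ptr and hands out copies of it.
class Registry {
public:
    Registry() : catalog_(std::make_shared<Catalog>()) {}
    std::shared_ptr<Catalog> getCatalog() const { return catalog_; }
    long owners() const { return catalog_.use_count(); }

private:
    std::shared_ptr<Catalog> catalog_;
};

// Consistent case: the holder is a shared_ptr too, so the wrapper simply
// becomes one more owner and the object is destroyed exactly once.
long ownersWhileHeld(const Registry &registry) {
    std::shared_ptr<Catalog> holder = registry.getCatalog();
    return holder.use_count();  // the Registry plus the holder
}

// Inconsistent case, shown only as a comment because it is undefined
// behaviour: a unique_ptr holder built from the same raw pointer,
//     std::unique_ptr<Catalog> holder(registry.getCatalog().get());
// would delete the Catalog while the Registry still owns it.
```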

The good thing is that (in my opinion) none of this requires expert C++
developers. With a bit of training everyone should be able to write wrappers
for 90% of the code without major trouble.

The other advantage is that, for people with a bit more in-depth C++ knowledge,
the pybind11 code is relatively accessible. So when changes are required, or
a bug requires deeper digging, it is actually quite doable. The layer is also
quite thin so one does not have to dig that deep and there isn’t any magic.

That said, wrapping the whole LSST stack, and maintaining the wrappers, will
likely require a non-negligible time investment that should probably be
distributed. It is definitely something that requires buy-in from everyone
who writes or works with C++ extension modules.

Missing features and other limitations

Pybind11 (currently) has the following missing features and limitations.
None of these have posed a serious problem for wrapping our code to date.
However, if some of these are deemed essential, and are not being worked on by
upstream, they might require work from us.

Multiple inheritance

Pybind11 does not (yet) support multiple inheritance. This means that multiple
inheritance relations cannot be exposed to Python (one of the inheritance
relations must be chosen). So far this has not been a problem in wrapping our
code. Although multiple inheritance is used, it seems limited to (mostly
abstract) classes that do not need to be exposed to Python.

A patch to support multiple inheritance is reportedly in the works upstream so
it is realistic to expect it to work in the near future.

Sequence types

Pybind11 has built-in support for C++ STL containers. In particular it will
transparently convert between the following types::

std::vector, std::list <-> list
std::tuple, std::pair <-> tuple
std::map <-> dict
std::set <-> set

However, this does mean that when a function accepts std::vector in C++
it will only take a Python list and not another sequence type.
This is probably something we want to change in upstream pybind11.
The required change seems to be relatively minor, however.

Const correctness

pybind11 casts away const-ness in function arguments and return values. This is
in line with the Python language, which has no concept of const values. This
means that some additional care is needed to avoid bugs that would be caught by
the type checker in a traditional C++ program.
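
The effect can be shown in plain C++. The sketch below mimics what a binding layer does when it hands a const reference to Python: the qualifier is dropped, so a mutation that the C++ compiler would have rejected goes through. Config, Task and asPythonWouldSee are hypothetical names, and the const_cast stands in for the wrapper layer rather than reproducing pybind11 code.

```cpp
#include <string>

struct Config {
    std::string name;
};

class Task {
public:
    explicit Task(const std::string &name) : config_{name} {}
    // The C++ API promises that callers cannot modify the configuration.
    const Config &getConfig() const { return config_; }

private:
    Config config_;
};

// Python has only one kind of reference, so the const qualifier is lost
// when the object crosses the language boundary.
Config &asPythonWouldSee(const Task &task) {
    return const_cast<Config &>(task.getConfig());
}

// Renames the configuration through the "wrapped" reference and reports
// what the C++ side sees afterwards. The Task itself is not const, so the
// write is well defined; it only bypasses the intent of the API.
std::string renameThroughWrapper(Task task, const std::string &newName) {
    asPythonWouldSee(task).name = newName;
    return task.getConfig().name;
}
```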

There seems to be no good way of exposing const-ness to Python. Other wrapping
tools (including Swig) have the same problem. This is not expected to be a
problem for our code (to the extent that it isn’t already).

Rubin Observatory LSST Community forum