Doxygen markup seems to cause py3 string parsing to sometimes fail

When porting code to python 3, I came across a situation where the doxygen markup seems to tell python that we are trying to write a unicode literal, and importing the file fails with the error:

Traceback (most recent call last):
  File "tests/testPsfIO.py", line 44, in <module>
    import lsst.meas.algorithms as algorithms
  File "/Users/nate/repos_lsst/meas_algorithms/python/lsst/meas/algorithms/__init__.py", line 29, in <module>
    from .detection import *
  File "/Users/nate/repos_lsst/meas_algorithms/python/lsst/meas/algorithms/detection.py", line 211
    """
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2129-2130: truncated \uXXXX escape

The docstring itself can be seen on GitHub. Prefixing the docstring with r (telling python a raw string is desired) fixes the problem. I am wary of embedding the escape character in the docstring itself, as it may interfere with doxygen generation. Does anyone have thoughts or preferences?

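Specifically, it’s the \until line in the string. Python 3 reads \u as the start of a \uXXXX escape, which must be followed by four hex digits, so that obviously isn’t allowed in an ordinary Python 3 string.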

>>> type("\until")
  File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape

It looks like r"" might be the only option until we implement RFC-214.

>>> r"\until"
'\\until'
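To check the effect on a whole docstring rather than a one-line literal, here is a minimal sketch, assuming Python 3. The function name `detect` and the helper `compiles` are made up for illustration and are not taken from meas_algorithms. It compiles the same tiny module source twice, once with a plain docstring and once with an r-prefixed one.

```python
# Source text of a tiny, hypothetical module whose docstring contains a
# Doxygen \until command. The outer literal is raw so that the backslash
# reaches compile() exactly as it would appear in a file on disk.
PLAIN_SOURCE = r'''
def detect():
    """Run detection.

    \until print
    """
'''

# Same source, but with the docstring marked as a raw string.
RAW_SOURCE = PLAIN_SOURCE.replace('"""Run', 'r"""Run')


def compiles(source):
    """Return True if the source text compiles, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(PLAIN_SOURCE))  # plain docstring: \u is read as a Unicode escape
print(compiles(RAW_SOURCE))    # raw docstring: the backslash is kept literally

# Executing the raw version shows that the docstring still carries the
# backslash command unchanged.
namespace = {}
exec(RAW_SOURCE, namespace)
print(namespace["detect"].__doc__)
```

Under Python 3 the first call prints False and the second prints True, and the printed docstring still contains \until verbatim.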

I looked at what PEP 257 recommends and they say this:

For consistency, always use """triple double quotes""" around docstrings. Use r"""raw triple double quotes""" if you use any backslashes in your docstrings. For Unicode docstrings, use u"""Unicode triple-quoted strings""".

So it seems that using raw strings for the current generation of doxygen-marked-up docstrings is the way to go. Numpydoc won’t have this issue (maybe for latex math in docstrings, but I don’t recall ever doing anything special in those cases).

That is what I figured, but I wanted to get other input before I just did it.

I’m fairly surprised this hasn’t come up before. We use \until liberally in doxygen, so this will come up again.

Why not just use @util instead of \util and so on for all doxygen commands?

Was that an intentional change in spelling? Is \until actually identical to @until?

+1

I tend to prefer “@” over “\” in Doxygen universally (even in C++) for similar reasons - almost everything tries to interpret “\”, but only Doxygen (of the things we run on our source files) pays attention to “@”.
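If we went the “@” route, the conversion could be scripted. Doxygen accepts either prefix for its commands. The following is only a rough sketch: the `COMMANDS` tuple and the `backslash_to_at` helper are hypothetical, and a real conversion would need the full set of commands we actually use.

```python
import re

# Hypothetical, deliberately short list of Doxygen commands to convert.
COMMANDS = ("until", "param", "return", "brief", "note")

# Match a backslash followed by one of the listed command names.
_PATTERN = re.compile(r"\\(" + "|".join(COMMANDS) + r")\b")


def backslash_to_at(text):
    """Rewrite backslash-style Doxygen commands in text to the @ form."""
    return _PATTERN.sub(r"@\1", text)


# Raw literals are used here so the sample input itself is valid Python 3.
sample = r"\until print" + "\n" + r"\param[in] exposure"
converted = backslash_to_at(sample)
print(converted)

# The converted text no longer needs a raw string to be a valid docstring.
compile('"""' + converted + '"""', "<example>", "exec")
```

Only commands in the list are touched, so other backslashes (LaTeX math, for example) are left alone and would still need a raw string.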

Python 2, as is its wont, is far more relaxed about unicode in general so just silently ignores any unicode issues in strings. Python 3 cares deeply about unicode so complains.