Format C++ code using clang-format

brianv0 · December 9, 2016, 8:38pm

I haven’t tested it, but git blame -w can supposedly ignore white space and git blame -w -M can detect line movement. This doesn’t help with blame in github but the UI is powerful in that case anyway.

parejkoj · December 14, 2016, 1:23am

Note that just as with python before we ran autopep8, our C++ almost certainly does not currently conform to our current coding standards. So, whether we automatically reformat or have people do it manually, there’s going to be a lot of code to cleanup to even the current standard.

Having pre-commit hooks would be nice, but doing it inline with editor linters is also extremely nice, and helps train you to write it correctly in the first place.

pschella · December 14, 2016, 5:51pm

To give some flavour of what things would look like, I ran two files from afw (Image.h and Image.cc) through clang-format in three configurations:

style=google
style=google but with 4 spaces (instead of 2) for indentation
style=file with a custom style (derived from the Google style) as close to LSST’s as I could make it without spending a lot of time on it

The results are attached:

google-4space-Image.cc (35.5 KB)
google-4space-Image.h (26.4 KB)
google-Image.cc (35.2 KB)
google-Image.h (25.9 KB)
lsst-Image.cc (35.5 KB)
lsst-Image.h (30.6 KB)
orig-Image.cc (34.5 KB)
orig-Image.h (28.2 KB)

jbosch · December 14, 2016, 6:03pm

Thanks, @pschella!

The actual styles seem perfectly fine to me. While they do differ from what we’re doing now, they’re no less readable, and certainly the custom-style one is similar enough to what we’re doing now that I don’t think it’ll lead to any slowdown in cognition, even we kept existing files as-is until we had cause to modify them.

What’s striking to me is how much apparently isn’t handled by clang-format, in that it’s apparently left inconsistent within these files (which were good choices, by the way, given how much internal inconsistency they started with!):

Sometimes the stuff within a namespace block is indented, sometimes it isn’t.
Sometimes there’s an extra newline between functions, sometimes there isn’t.

Do you know if these are just aspects that clang-format won’t ever pay any attention to? Or are they just things missing from the google style specification?

pschella · December 14, 2016, 6:46pm

The namespaces inconsistency is just an option (I think). But my quick search didn’t yet reveal an easy way to set the number of blank lines between function declarations. That does not mean it is impossible though.

pschella · December 14, 2016, 9:25pm

Just in case people want to try it out I attached the format file that closest matches the LSST style. To use it rename it to .clang-format (without the extension) and copy it into the top level directory of your source tree. Then run clang-format -i -style=file mycode.h.

When given the file option it will look for a .clang-format file in the current directory and then all the way up the tree.

clang-format-lsst.txt (2.7 KB)

rowen · December 14, 2016, 9:47pm

Thank you @pschella. I agree with @jbosch that these all look fine to me. I would like to see the effects on some pybind11 code containing lambda functions because that tends to have very long lines. But my suspicion is we can live with any of these.

For the record: I would rather have all our code cleaned up to be readable and uniform, rather than have a mishmash of old and new. Using flags with git blame seems a small price to pay for readability.

pschella · December 14, 2016, 10:22pm

So, I’d like to wrap this up in an RFC. In accordance with @swinbank’s and @fred3m’s comments I think we should add a style file to the DM guide (version controlled with the guide itself) and then change section 6 (concerning code layout) to read “code shall be formatted identically to that produced by running it through clang-format with the specified style file”.

timj · December 14, 2016, 10:23pm

I imagine you have to retain bits in the code layout part of the style guide that won’t be managed by clang-format (or make clear that the only rules are those enforced by clang-format).

pschella · December 14, 2016, 10:26pm

Yes, and I would go with the former option. Remaining rules are in addition to running clang-format (or having your parrot type the code in the same way).

swinbank · December 14, 2016, 10:56pm

Sounds wonderful to me.

(I wrote “great” first time, but I needed to be more effusive in my praise in order to meet the minimum character count!)

price · December 15, 2016, 2:52am

It looks like @pschella’s example LSST-style formatting puts spaces around the * operator, which is verboten for us.

If we put clang-format in a commit hook, how do we take care of the differences we have in our standard compared to what it produces?

What is involved in building clang-format? If it’s going to be run by the commit hook, we’re all going to have to build it.

ktl · December 15, 2016, 4:03am

Verboten for some of us (who have been pretty vocal and influential in the past). If these spaces can’t be disabled without significantly decreasing the utility of the tool overall, then I would lean towards giving up this LSST-unique rule.

timj · December 15, 2016, 4:21am

That rule also causes us problem with PEP8 in Python.

ktl · December 15, 2016, 4:23am

Yes, but at least the tool can be configured to disable the space addition. It wasn’t clear that clang-format could be configured that way (I looked at the style file but not the documentation).

parejkoj · December 15, 2016, 6:05am

In the python case, E226 disables application of the “whitespace around operators” rule for all operators, not just * and /.

jbosch · December 15, 2016, 2:09pm

I’d be more than happy to drop the spaces-around-operators rules if we switch to automatic formatting.

If we don’t switch to automatic formatting, I’d still like to remove them or at least downgrade them to recommendations; in some contexts adding spaces around * and / can greatly improve readability, especially when there is no + or - in the expression or the operands themselves involve operators like [].

pschella · December 15, 2016, 3:28pm

It doesn’t look like it is possible to disable spacing around specific operators with clang-format. I’m sure we could patch the tool to do it and commit those changes upstream (or build our own tool with libformat). But it looks like none of the major C++ styles have this rule, so I personally would prefer dropping it.

price · December 15, 2016, 3:47pm

I would be sorry to see the rule about spaces around operators be simply discarded out-of-hand. I won’t rehash the same old arguments, but merely say that I think it’s important for the readability of our scientific code. I will admit that there are times when being flexible about it makes things more readable (exceptions that prove the rule?), but I would hate to see it broken completely and forever.

timj · December 15, 2016, 3:54pm

Yes, but it has the side effect of not fixing things at all if they are not compliant with our standard.