Recently there has been some discussion on RFC-81 about the use of properties and attributes in the stack.
While the consensus seems to be that the use of properties and attributes (instead of getters and setters) is desirable and Pythonic there are some pitfalls when it relates to extension types.
With the upcoming change to pybind11 it is trivial to add properties in the C++ wrapper. However there are a few issues to resolve:
What to do when the setter is overloaded?
What to do with immutable vs mutable types?
What to do with STL container types (that return an equivalent Python container type as a copy, thus silently ignoring modification of an element)?
What to do when the setter or getter is expensive?
What do we do with the getter and setter (keep / remove / deprecate)?
This post serves as a platform to decide upon a good course of action.
As an initial proposal I suggest the following:
Add an attribute (or property) when (at least) the following criteria are met:
The setter and getter accept and return an object of the exact same type (keep setters for conversions).
There is either a negligible cost, or the value can be cached.
The returned (set) type is not an STL container type.
Probably supplemented with a rule on mutability vs immutability for which I have no good suggestion other than: “do what NumPy does”.
For completeness sake I have to mention the other alternative. We could accept that C++ extension types will never behave entirely Pythonic (even if they are already closer with pybind11), and sidestep the issues mentioned above by sticking to getters and setters.
The rule would then be to never use properties (to avoid confusing users about what to expect) and use attributes only for (non-STL-container) public data members.
I think I’d rather allow conversions when setting properties than have a setter method that behaves differently from setting the property. Maybe the easiest thing to do would be to not use properties at all when there is an overloaded setter (which might make us reconsider having one in some places).
I definitely think that it’s ok to have a property that can be to any instance of a class hierarchy.
+1. I wouldn’t be opposed to saying in addition that it should be a constant-time operation for anything that behaves like a container.
I think “STL container type” here is just one specific case of the more generic problem of returning copies that are mutable objects. If a getter returns a copy, we should not have a property for it unless that copy is or can be made immutable OR it returns a view. For example, if a method returns a std::vector, which normally translates to a Python list, we can make the property turn that list into a tuple, making it impossible to modify in-place. Python has a built-in type that provides an immutable view to a dict that is weirdly not available except in the Python C API, but we might be able to use that to do something similar for std::map/dict returns if we feel the need. Getters that return ndarray or most afw::image types by value are also safe under this rule because those objects are views. This rule would unfortunately rule out properties for Point, Extent, and Box objects, at least for now; eventually I’d like to consider making those immutable, or providing immutable versions of them we could use for properties.
This is also tied to the question of when a C++ getter that returns by reference should return a copy in Python. Here, I think the rules should be:
If the getter returns a Python builtin (e.g. numeric types or strings), it must return a copy.
If there is a C++ getter that returns a non-const reference or pointer (regardless of whether there is also a const one), the Python getter should return a non-const reference, not a copy.
If there is only a C++ getter that returns a const reference or pointer, the Python getter should return a copy unless there is a clear negative performance impact from making the copy and we can’t work around it some other way (and in that case the danger of modifying the result should be very prominently documented).
[quote=“jbosch, post:3, topic:1566”]
If a getter returns a copy, we should not have a property for it unless that copy is or can be made immutable OR it returns a view. … This rule would unfortunately rule out properties forPoint,Extent, andBox objects, at least for now[/quote]
Can you clarify what the problem is? The obvious properties for a Point would be the elements, which are always of primitive type; I would think that using properties to both get and set them would be safe.
Ah, sorry, I didn’t mean properties for Point itself - those would indeed be safe - I meant other objects defining properties that returned a Point (because then someone could do thing.point.x = 4 and it would be a confusing no-op).
but thing.getPoint().setX(4) wouldn’t be a no-op?
That would be a no-op as well, and while that’s somewhat problematic (and perhaps another reason we want more immutable objects in general), I think there’s less of a user expectation that it sets thing's x coordinate than with a property. That may just be my own perception
getPoint() in Python returns a copy of the Point from thing and not the actual object held by thing?
Sorry, yes. I’m assuming getPoint returns a copy or something computed on-the-fly. In the case where the internal point is returned as a non-const object, there’s no problem.
As another example, what is the expected behavior of calexp.getCalib().setFluxMag0(10), and what would be the equivalent via properties? The above makes sense in terms of things we probably want to be immutable anyway, but I’m not entirely clear on how otherwise mutable objects should behave.
I’m trying to understand how calexp.getCalib().setFluxMag0(10) is not identical to calexp.calib.flux_mag0 = 10. Isn’t it the same code underneath except using @property?
It’s definitely the same code underneath in this case (more on that later). As it stands right now, Calib is a mutable object (otherwise it wouldn’t have setFluxMag0), and it’s returned via a non-const shared_ptr, so everything would just work, either with or without the property: it would set the Calib held by the Exposure, with no copies involved.
The only way I could imagine this not doing the right thing is if Calib was a mutable object that was returned as a copy (which could also happen if it was returned by constshared_ptr or reference in C++ and we decided to make it a copy in Python to avoid the possibility of invalidating the C++ owner of the Calib).
Yes, yes, point made, returning a list and changing the contents of the list after receiving it is a real problem. I like your “always return tuples” option though.
Ok, that all makes sense to me and was the answer I was both hoping for and expecting. So long as we’re careful about returning non-const shared_ptr for mutable objects, we should be fine, I guess?
Well, if we also avoided returning copies of mutable objects. There are good reasons to both sometimes, though, and then we either need to just not use properties for them (and hope users don’t assume getter methods return internals) or do something to make them immutable.
I’m getting a sense we have an idea what the “low hanging fruit” is for this to work.
I agree. I’ll try to throw together a specific RFC to formalize a proposal before I go home today. Given that we’ve hashed it out pretty thorough here (and elsewhere), hopefully it won’t generate too much further discussion.