Properties and attributes

pschella · January 20, 2017, 1:52am

Recently there has been some discussion on RFC-81 about the use of properties and attributes in the stack.

While the consensus seems to be that the use of properties and attributes (instead of getters and setters) is desirable and Pythonic there are some pitfalls when it relates to extension types.

With the upcoming change to pybind11 it is trivial to add properties in the C++ wrapper. However there are a few issues to resolve:

What to do when the setter is overloaded?
What to do with immutable vs mutable types?
What to do with STL container types (that return an equivalent Python container type as a copy, thus silently ignoring modification of an element)?
What to do when the setter or getter is expensive?
What do we do with the getter and setter (keep / remove / deprecate)?

This post serves as a platform to decide upon a good course of action.

As an initial proposal I suggest the following:

Add an attribute (or property) when (at least) the following criteria are met:

The setter and getter accept and return an object of the exact same type (keep setters for conversions).
There is either a negligible cost, or the value can be cached.
The returned (set) type is not an STL container type.

Probably supplemented with a rule on mutability vs immutability for which I have no good suggestion other than: “do what NumPy does”.

I invite your comments and suggestions.

See also:

pschella · January 20, 2017, 1:12pm

For completeness sake I have to mention the other alternative. We could accept that C++ extension types will never behave entirely Pythonic (even if they are already closer with pybind11), and sidestep the issues mentioned above by sticking to getters and setters.

The rule would then be to never use properties (to avoid confusing users about what to expect) and use attributes only for (non-STL-container) public data members.

jbosch · January 20, 2017, 3:53pm

Thanks for starting this off, @pschella!

I think I’d rather allow conversions when setting properties than have a setter method that behaves differently from setting the property. Maybe the easiest thing to do would be to not use properties at all when there is an overloaded setter (which might make us reconsider having one in some places).

I definitely think that it’s ok to have a property that can be to any instance of a class hierarchy.

+1. I wouldn’t be opposed to saying in addition that it should be a constant-time operation for anything that behaves like a container.

I think “STL container type” here is just one specific case of the more generic problem of returning copies that are mutable objects. If a getter returns a copy, we should not have a property for it unless that copy is or can be made immutable OR it returns a view. For example, if a method returns a std::vector, which normally translates to a Python list, we can make the property turn that list into a tuple, making it impossible to modify in-place. Python has a built-in type that provides an immutable view to a dict that is weirdly not available except in the Python C API, but we might be able to use that to do something similar for std::map/dict returns if we feel the need. Getters that return ndarray or most afw::image types by value are also safe under this rule because those objects are views. This rule would unfortunately rule out properties for Point, Extent, and Box objects, at least for now; eventually I’d like to consider making those immutable, or providing immutable versions of them we could use for properties.

This is also tied to the question of when a C++ getter that returns by reference should return a copy in Python. Here, I think the rules should be:

If the getter returns a Python builtin (e.g. numeric types or strings), it must return a copy.
If there is a C++ getter that returns a non-const reference or pointer (regardless of whether there is also a const one), the Python getter should return a non-const reference, not a copy.
If there is only a C++ getter that returns a const reference or pointer, the Python getter should return a copy unless there is a clear negative performance impact from making the copy and we can’t work around it some other way (and in that case the danger of modifying the result should be very prominently documented).

kfindeisen · January 20, 2017, 7:20pm

[quote=“jbosch, post:3, topic:1566”]
If a getter returns a copy, we should not have a property for it unless that copy is or can be made immutable OR it returns a view. … This rule would unfortunately rule out properties forPoint,Extent, andBox objects, at least for now[/quote]
Can you clarify what the problem is? The obvious properties for a Point would be the elements, which are always of primitive type; I would think that using properties to both get and set them would be safe.

jbosch · January 20, 2017, 7:22pm

Ah, sorry, I didn’t mean properties for Point itself - those would indeed be safe - I meant other objects defining properties that returned a Point (because then someone could do thing.point.x = 4 and it would be a confusing no-op).

timj · January 20, 2017, 7:39pm

but thing.getPoint().setX(4) wouldn’t be a no-op?

getPoint() in Python returns a copy of the Point from thing and not the actual object held by thing?

jbosch · January 20, 2017, 7:42pm

but thing.getPoint().setX(4) wouldn’t be a no-op?

That would be a no-op as well, and while that’s somewhat problematic (and perhaps another reason we want more immutable objects in general), I think there’s less of a user expectation that it sets thing's x coordinate than with a property. That may just be my own perception

getPoint() in Python returns a copy of the Point from thing and not the actual object held by thing?

Sorry, yes. I’m assuming getPoint returns a copy or something computed on-the-fly. In the case where the internal point is returned as a non-const object, there’s no problem.

pschella · January 20, 2017, 7:44pm

IMHO it would be best if Point, Extent and Box were simply immutable. Then there is no confusion.
But that has been expressed before at length.

parejkoj · January 20, 2017, 7:49pm

As another example, what is the expected behavior of calexp.getCalib().setFluxMag0(10), and what would be the equivalent via properties? The above makes sense in terms of things we probably want to be immutable anyway, but I’m not entirely clear on how otherwise mutable objects should behave.

timj · January 20, 2017, 7:57pm

I’m trying to understand how calexp.getCalib().setFluxMag0(10) is not identical to calexp.calib.flux_mag0 = 10. Isn’t it the same code underneath except using @property?

jbosch · January 20, 2017, 8:01pm

It’s definitely the same code underneath in this case (more on that later). As it stands right now, Calib is a mutable object (otherwise it wouldn’t have setFluxMag0), and it’s returned via a non-const shared_ptr, so everything would just work, either with or without the property: it would set the Calib held by the Exposure, with no copies involved.

The only way I could imagine this not doing the right thing is if Calib was a mutable object that was returned as a copy (which could also happen if it was returned by const shared_ptr or reference in C++ and we decided to make it a copy in Python to avoid the possibility of invalidating the C++ owner of the Calib).

timj · January 20, 2017, 8:02pm

C++ making easy things hard…

jbosch · January 20, 2017, 8:06pm

class A:

    def __init__(self, value1, self.value2):
        self.value1 = value1
        self.value2 = value2

    @property
    def values(self):
        return [self.value1, self.value2]

a = A(1, 2)
a.values[0] = 4
print(a.values[0])

timj · January 20, 2017, 8:11pm

Yes, yes, point made, returning a list and changing the contents of the list after receiving it is a real problem. I like your “always return tuples” option though.

parejkoj · January 20, 2017, 8:13pm

Ok, that all makes sense to me and was the answer I was both hoping for and expecting. So long as we’re careful about returning non-const shared_ptr for mutable objects, we should be fine, I guess?

jbosch · January 20, 2017, 9:10pm

Well, if we also avoided returning copies of mutable objects. There are good reasons to both sometimes, though, and then we either need to just not use properties for them (and hope users don’t assume getter methods return internals) or do something to make them immutable.

timj · January 20, 2017, 9:11pm

I’m getting a sense we have an idea what the “low hanging fruit” is for this to work.

jbosch · January 20, 2017, 9:15pm

I’m getting a sense we have an idea what the “low hanging fruit” is for this to work.

I agree. I’ll try to throw together a specific RFC to formalize a proposal before I go home today. Given that we’ve hashed it out pretty thorough here (and elsewhere), hopefully it won’t generate too much further discussion.

RHL · January 25, 2017, 12:00am

Did this RFC get filed?

kfindeisen · January 25, 2017, 12:13am

RFC-279