A recent discussion with Astropy developers on improving interoperability between LSST’s image classes and Astropy’s NDData classes led to a set of questions about how our classes behave (or should behave) that @parejkoj captured in the #dm-astropy Slack channel.
I’ve tried to answer those below. Opinions about how they ought to behave are my own, though I believe in many cases they’re shared by others (or at least there’s broad agreement that the current interface is not ideal). I certainly hope anyone who disagrees will speak up on this thread, and I hope this discussion might spawn some RFCs to fix some of the problems; I think any meaningful fixes will be backwards incompatible enough to require a frightening number of changes throughout the stack, but I don’t like the alternative of living with these problems either.
How do users access pixels?
Many of our algorithms only access pixels in C++, using standard C++ iterator patterns or a small extension to them.
Those that do access pixels in Python typically do so through NumPy views: calling getArray() (or, after DM-10541, accessing the .array attribute) returns a writeable (height, width) array view into an image.  These views ignore xy0, which I think is desirable; our Image class is conceptually just a 2-d NumPy array (guaranteed to have contiguous values within a row) combined with the xy0 offset, so accessing the array view naturally discards that offset.
Images also support direct pixel access via image.get(x, y) and image.set(x, y, value) (which we could and perhaps should expose as image[y, x]), but using those is discouraged for the same performance reasons that discourage Python programmers from writing for loops over array elements in general.  Unfortunately, our get and set methods currently do not pay attention to xy0.  They absolutely should.
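To make the distinction concrete, here’s a minimal sketch of these semantics.  ToyImage is purely hypothetical (it is not the afw API); its get/set honor xy0 the way I’m arguing ours should, while .array deliberately does not:

```python
import numpy as np

class ToyImage:
    """Hypothetical stand-in for our Image: a 2-d array plus an xy0 offset."""

    def __init__(self, array, xy0=(0, 0)):
        self.array = array       # (height, width); values contiguous within rows
        self.x0, self.y0 = xy0   # pixel coordinates of the lower-left corner

    def get(self, x, y):
        # What get *should* do: translate pixel coordinates to array indices.
        return self.array[y - self.y0, x - self.x0]

    def set(self, x, y, value):
        self.array[y - self.y0, x - self.x0] = value


img = ToyImage(np.zeros((4, 6), dtype=np.float32), xy0=(100, 200))
img.array[0, 0] = 5.0              # plain NumPy view; xy0 deliberately ignored
assert img.get(100, 200) == 5.0    # pixel (x0, y0) is array element [0, 0]
```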
Our image classes also support in-place binary operators (+=, -=, etc.) with other images, scalar values, and instances of our Function class.  We’ve deliberately not implemented the versions of these operators that return a new object to force users to think before blindly allocating memory for a temporary.  I’m ambivalent as to whether that was a good idea; it has no doubt prevented some bad code, but I imagine it’s also led to some confusion and frustration.
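In NumPy terms, the trade-off the missing operators are meant to highlight looks like this (a sketch, not afw code):

```python
import numpy as np

a = np.random.rand(4096, 4096).astype(np.float32)
b = np.random.rand(4096, 4096).astype(np.float32)

a += b      # in-place: writes into a's existing buffer, no temporary
a -= 2.5    # scalars work the same way
# c = a + b would silently allocate a second image-sized buffer for the
# result; this is the pattern the absent operators force you to notice.
```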
What happens if you try to add two MaskedImages with different xy0?
Because we only support in-place addition, the symmetry between the lhs (input/output) and rhs (input only) is broken. That sidesteps some issues that would otherwise appear.
What does happen: the two images are added element-by-element, ignoring xy0 entirely, and the xy0 of the image being updated is not changed.
What should happen: If the bounding box (including xy0) of the rhs image is wholly contained by the bounding box of the lhs image, add the images in the overlap region, modifying only that part of lhs. If this is not true, raise an exception.
If we were to support regular (non in-place) binary operators, I think it’d probably be best if those required the bounding boxes (including xy0) to be identical, and raised an exception if they were not.
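Here’s a sketch of the += semantics proposed above, in terms of the hypothetical ToyImage from earlier; the containment check stands in for what real bounding-box classes would provide:

```python
def iadd_proposed(lhs, rhs):
    """Proposed +=: rhs bbox must lie wholly inside lhs bbox; add in overlap."""
    lh, lw = lhs.array.shape
    rh, rw = rhs.array.shape
    x_off = rhs.x0 - lhs.x0   # rhs origin relative to lhs, in pixels
    y_off = rhs.y0 - lhs.y0
    if not (x_off >= 0 and y_off >= 0 and x_off + rw <= lw and y_off + rh <= lh):
        raise ValueError("rhs bounding box not contained in lhs bounding box")
    # Modify only the overlap region of lhs; lhs.xy0 is left unchanged.
    lhs.array[y_off:y_off + rh, x_off:x_off + rw] += rhs.array
    return lhs
```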
Can I slice my_image[5:10, 17:42] in an…
…Exposure?
Yes, and the result will be another Exposure that is a view into the original.  The ExposureInfo is transferred without modification (more on that below).
What does happen: the slice indices ignore my_image.xy0 and are interpreted as (x, y), so the returned view has x0 = 5, y0 = 17.
What should happen: the slice indices should be interpreted as (y, x), and the returned view’s offset should account for my_image.xy0, giving x0 = (my_image.x0 + 17), y0 = (my_image.y0 + 5).  This is what happens if you pass a bbox argument to the Exposure constructor, and IMO that’s the right way to get a subimage at present; slice indexing is too broken to be safe.
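Expressed with the hypothetical ToyImage from above, the proposed interpretation of my_image[5:10, 17:42] would look like this (a sketch of what should happen, not what the code does today):

```python
import numpy as np

def getitem_proposed(img, rows, cols):
    """Proposed my_image[rows, cols]: (y, x) order, offset shifted by xy0."""
    return ToyImage(img.array[rows, cols],
                    xy0=(img.x0 + cols.start, img.y0 + rows.start))

parent = ToyImage(np.zeros((50, 60), dtype=np.float32), xy0=(100, 200))
sub = getitem_proposed(parent, slice(5, 10), slice(17, 42))
assert (sub.x0, sub.y0) == (117, 205)   # (x0 + 17, y0 + 5), not (5, 17)
```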
…ExposureInfo?
This is not possible at present, and I’m not sure it needs to be.
The only reason I think it would need to be sliceable is if it contains components that both depend on the coordinate system of the image (e.g. PSF models and WCS transforms) and have a bounding box that needs to be modified to match the resulting subimage’s bounding box (as opposed to just indicating a possibly-larger region of validity for the component).  Depending on how we want to interpret the bounding boxes on those components, then, I think you could argue that there’s no need to ever slice an ExposureInfo.  And it’s desirable not to, because it avoids a lot of copying and/or proxy objects that just impose a new bounding box on their referent.
And that’s one big advantage of having xy0 in Exposure:  if you do slice it (with a bbox argument to the constructor, not through the broken slice syntax; see above), you don’t have to copy or modify the components, because the subimage and the original Exposure share the same coordinate system.  The even bigger advantage, of course, is that subimages also remain valid for objects (like source catalogs, model-fitting parameters, or Footprints) that are tied to the original image’s coordinate system but aren’t part of the Exposure object at all.
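A quick demonstration with the hypothetical ToyImage: because the subimage keeps the parent’s coordinate system, a catalog position is valid in both without any offset bookkeeping:

```python
import numpy as np

parent = ToyImage(np.arange(100, dtype=np.float32).reshape(10, 10), xy0=(0, 0))
sub = ToyImage(parent.array[3:8, 2:9], xy0=(2, 3))   # bbox-style subimage view
x, y = 4, 5                                          # e.g. a source centroid
assert parent.get(x, y) == sub.get(x, y)             # same pixel in both images
```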
…MaskedImage?
Yes, with the same qualification as for Exposure that the actual slice syntax is broken.  The result is another MaskedImage.
…MaskedImage.data?
I’m not sure what this means; MaskedImage doesn’t have a .data, and its arrays aren’t allocated as a single block, so there’s nothing I could even imagine filling that role.
If you mean e.g. MaskedImage.image.array, that will do regular NumPy slicing on the array view, which of course returns a regular NumPy array and deliberately does not take xy0 into account.
What is the result of slicing those objects?
(included in answers above)
How does a user get world coordinates from pixel coordinates? How much linkage needs to be present for that to work, i.e. which components of Exposure are necessary to calculate a WCS?
Because we’ve defined our pixel coordinates such that the lower-left corner of the image is xy0, not (0, 0), all you need to go from pixel coordinates to sky coordinates is exposure.getWcs().pixelToSky(position).  All you really need is the Wcs object, and you can use the same Wcs object to perform coordinate transformations when working with a subimage of the original Exposure the Wcs was associated with (see above discussion on slicing ExposureInfo).  It does not have to be attached to the Exposure at all to work.
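As a usage sketch (the module paths and constructor here are my assumptions about the current afw API; pixelToSky is the call described above):

```python
import lsst.afw.image as afwImage
import lsst.afw.geom as afwGeom

exposure = afwImage.ExposureF("calexp.fits")    # hypothetical input file
position = afwGeom.Point2D(1234.5, 567.8)       # pixel (not array) coordinates
sky = exposure.getWcs().pixelToSky(position)

# The Wcs works standalone too, including for subimages of this Exposure:
wcs = exposure.getWcs()
sky_again = wcs.pixelToSky(position)
```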
If you do care about array coordinates (i.e. indexing and slicing Image.array), you of course need both the Wcs object and the xy0 offset.  But the whole philosophy of this approach is that you shouldn’t have to care about array coordinates: you put the offset between pixel and array coordinates in your image objects, and that saves you from putting (and constantly updating) an offset in all of your other objects.
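The only place the offset enters is that pixel-to-array conversion, which is a single subtraction (ToyImage conventions again, hypothetical):

```python
import numpy as np

def array_index(img, x, y):
    """Array (row, col) for pixel (x, y): subtract xy0 and swap to (y, x)."""
    return y - img.y0, x - img.x0

img = ToyImage(np.zeros((10, 10), dtype=np.float32), xy0=(100, 200))
assert array_index(img, 105, 203) == (3, 5)   # pixel (105, 203) -> array [3, 5]
```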