Object caching in Butler is removed

We’ve run into a few problems recently with the Butler object caching mechanism (details below), so it has been removed entirely, at least for the time being. If you think it’s worth keeping (that is, reimplementing), please let me know. We would have to discuss the issues, brainstorm potential fixes, and RFC a proposal.

The rest of the tl;dr is…

The feature was relatively new, and was implemented to support Composite Objects in Butler (although it operated on every get from Butler, composite or not). When reading an object, Butler would take a hash of the ButlerLocation object and use that hash to look the object up in the cache. The ButlerLocation describes how to read the object and what data was used to find it: which repository, what path within the repository, the dataId that was used to find it, the dataset type, etc.

Two issues came up recently:

The major issue was with object mutability: if an object is read (via Butler.get) and then fetched again (via another Butler.get), exactly the same object is returned. This reduces the memory footprint and saves the trouble of reading from disk. However, there’s no Pythonic way to make the returned object const, so if the object is mutated in one place, every other place that uses it is affected too. We didn’t think this would be a problem, but it turns out it is.
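The aliasing problem can be reproduced without Butler at all. The sketch below is a toy stand-in, not the Butler implementation: `ToyCache`, `get`, `get_copy`, and the `"exposure-1"` key are all hypothetical names, and a plain dict plays the role of the object read from disk.

```python
import copy


class ToyCache:
    """Hypothetical stand-in for a get() that caches what it reads."""

    def __init__(self):
        self._cache = {}
        self.reads = 0  # counts simulated disk reads

    def _read(self, key):
        # Pretend to deserialize an object from disk.
        self.reads += 1
        return {"name": key, "pixels": [0, 0, 0]}

    def get(self, key):
        # On a cache hit, hand back the very same object again.
        if key not in self._cache:
            self._cache[key] = self._read(key)
        return self._cache[key]

    def get_copy(self, key):
        # One possible mitigation (an assumption, not a proposal from
        # the post): hand out a deep copy on every call.
        return copy.deepcopy(self.get(key))


cache = ToyCache()
first = cache.get("exposure-1")
second = cache.get("exposure-1")

first["pixels"][0] = 99       # mutate through one reference
shared = second["pixels"][0]  # the other reference sees the change

isolated = cache.get_copy("exposure-1")
isolated["pixels"][1] = 7     # does not touch the cached object
```

Here `first` and `second` are the same object, so the write through `first` is visible through `second`. Copying on every get would avoid that, but it gives up the memory saving that motivated returning the shared object in the first place.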

The second problem is an easier one to fix. Only the part of the dataId used to complete the template is recorded in the ButlerLocation, but the passed-in dataId can be a superset of what is needed to complete the template, and the whole thing is passed into the object deserializer to specify how the object should be deserialized. Two gets whose dataIds differ only in those extra keys therefore produce the same hash, even though they may deserialize to different objects. The fix would be to include the entire dataId (not just the part used to fill in the template) in the hash.
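A minimal sketch of that collision, assuming a `%(name)`-style path template and hashable dataId values. The functions `template_key` and `full_key`, the template string, and the `visit`/`ccd` keys are made up for illustration and are not Butler’s actual code or configuration.

```python
def template_key(template, data_id):
    """Cache key built only from the dataId entries the template uses."""
    used = {k: v for k, v in data_id.items() if "%(" + k + ")" in template}
    return (template, tuple(sorted(used.items())))


def full_key(template, data_id):
    """Cache key built from the entire passed-in dataId."""
    return (template, tuple(sorted(data_id.items())))


template = "raw/v%(visit)d.fits"  # hypothetical path template
a = {"visit": 1, "ccd": 2}        # 'ccd' is not used by the template ...
b = {"visit": 1, "ccd": 3}        # ... but could still steer deserialization

# Keys built from the template-only subset cannot tell a and b apart.
collides = template_key(template, a) == template_key(template, b)

# Keys built from the whole dataId keep them separate.
distinct = full_key(template, a) != full_key(template, b)
```

With the template-only key, a cache would return whatever was stored for `a` when asked for `b`. Keying on the full dataId separates the two requests, at the cost of fewer cache hits when the extra keys turn out not to matter.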