Bug in daf_persistence/blob/master/python/lsst/daf/persistence/utils.py?

Hello,

In the last version of the stack (github master), there has been modifications done to daf/persistence/registries.py which produce a little but in my code, which was working fine with version 12.0 (SqliteRegistry.lookup (version 12.0.rc1-1-gc553c11+3). The code in there is

if not hasattr(lookupProperties, '__iter__'):
         lookupProperties = (lookupProperties,)
cmd = "SELECT DISTINCT "
cmd += ", ".join(lookupProperties)

Which works fine if lookupProperties is a dictionnary.

In [18]: lookupProperties = {'a': 1, 'b': 2}

In [19]: if not hasattr(lookupProperties, '__iter__'):
   ....:     lookupProperties = (lookupProperties,)

In [20]: cmd = "SELECT DISTINCT "

In [21]: cmd += ", ".join(lookupProperties)

In [22]: lookupProperties
Out[22]: {'a': 1, 'b': 2}

For the same dictionnary, and with the last version of the stack (master on github), we get (with sequencify coming from utils.py in the same package) the following error:

lookupProperties = sequencify(lookupProperties)

cmd = "SELECT DISTINCT "
cmd += ", ".join(lookupProperties)

In [25]: ''.join(sequencify({'a': 1, 'b': 2}))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-25-e19e7f6a0aa2> in <module>()
----> 1 ''.join(sequencify({'a': 1, 'b': 2}))

TypeError: sequence item 0: expected string, dict found

Has it been seen anywhere else or is it a problem on the way I use the stack? To be precise, I get this error while doing

keys = butler.getKeys("forced_src")
butler.queryMetadata("forced_src", format=keys)

Thanks

it’s because the new version of sequencify does not try to use __iter__ because __iter__ is very unreliable in python 3 (as it includes str) and I didn’t realize that dicts were allowed (none of the tests triggered that case).

def sequencify(x):
    """Takes an object, if it is a sequence return it,
    else put it in a tuple. Strings are not sequences."""
    if isinstance(x, (Sequence, Set)) and not isinstance(x, basestring):
        pass
    else:
        x = (x, )
    return x

This fixes it:

diff --git a/python/lsst/daf/persistence/utils.py b/python/lsst/daf/persistence/utils.py
index e5b3962..651eab4 100644
--- a/python/lsst/daf/persistence/utils.py
+++ b/python/lsst/daf/persistence/utils.py
@@ -24,9 +24,9 @@
 from past.builtins import basestring
 
 try:
-    from collections.abc import Sequence, Set
+    from collections.abc import Sequence, Set, Mapping
 except ImportError:
-    from collections import Sequence, Set
+    from collections import Sequence, Set, Mapping
 
 
 # -*- python -*-
@@ -70,7 +70,7 @@ def iterify(x):
 def sequencify(x):
     """Takes an object, if it is a sequence return it,
     else put it in a tuple. Strings are not sequences."""
-    if isinstance(x, (Sequence, Set)) and not isinstance(x, basestring):
+    if isinstance(x, (Sequence, Set, Mapping)) and not isinstance(x, basestring):
         pass
     else:
         x = (x, )

@natepease can you handle this please?

sure. will track as https://jira.lsstcorp.org/browse/DM-8235

I am pondering whether a function called sequencify should return tuple(thing.keys()) if called with a Mapping. Would make more sense to me.

It seemed kind of weird to me that a dict would be passed thru sequencify unmodified. But I saw that set is also allowed transparently (where set is also not a sequence). It seems like containerize might be a more appropriate name for the function?

sequencify is a new routine added when I refactored the code for python3 and saw the same problem with __iter__ all over the place. It really does seem to be returning something sequence like and the set() exception was there to get tests passing. I don’t mind if you rename it.

I think I’ll add the dict support and leave it alone for now, unless you think it’s better to do

But that wouldn’t work to fix Nicholas’s issue (right?) and we’d have to figure out what to do.

why wouldn’t that work for @nchotard ? If you do ”, “.join(somedict) it joins the keys. That would still happen if there was an else to handle Mapping unless sequencify is actually meant to never change containers because they are meant to be used further down.

Ok, I see. Should it sort the result of thing.keys() before putting it into the tuple? IE
tuple(sorted(thing.keys()))

I don’t think there is an expectation of that but it might help for repeatability.

ok, @nchotard a fix is merged to master. please try it out when you have an opportunity.