I have a set of lists that look like this:
conditions = [
    ["condition1", ["sample1", "sample2", "sample3"]],
    ["condition2", ["sample4", "sample5", "sample6"]],
    ...]
How can I do the following things efficiently and elegantly in Python?
Find all the elements in a certain condition?
e.g. get all the samples in condition2. Right now I can do:
for cond in conditions:
    cond_name, samples = cond
    if cond_name == requested_cond:
        return samples
but that's clunky.
Find the ordered union of a list of conditions? E.g. ordered_union(["condition1", "condition2"], conditions) should return:
["sample1", "sample2", "sample3", "sample4", "sample5", "sample6"]
How can I do this efficiently in Python? Are there clever one-liners?
Solution
Ah well, if you're forced to keep that clunky data structure, you can't expect much. The one-liner equivalent of your first solution is going to be something like:
def samplesof(requested_cond, conditions):
    return next(s for c, s in conditions if c == requested_cond)
and for the second one, if you insist on one-liners, it's going to be something like:
def ordered_union(the_conds, conditions):
    return [s for c in the_conds for s in samplesof(c, conditions)]
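One caveat with the next() one-liner: when no condition matches, it raises StopIteration. Here is a minimal sketch of one way around that, assuming you would rather get an empty list back for an unknown condition. That default and the name samplesof_or_empty are my assumptions, not something the question specifies:

```python
# Sample data mirroring the question (the "..." tail is left out).
conditions = [
    ["condition1", ["sample1", "sample2", "sample3"]],
    ["condition2", ["sample4", "sample5", "sample6"]],
]

def samplesof_or_empty(requested_cond, conditions):
    # Hypothetical variant of samplesof: next() returns its second
    # argument instead of raising StopIteration when nothing matches.
    return next((s for c, s in conditions if c == requested_cond), [])

print(samplesof_or_empty("condition2", conditions))
print(samplesof_or_empty("no_such_condition", conditions))
```

If a missing condition is really an error in your setting, the original version (which raises) may be the better choice.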
There are faster ways to solve the second problem, but they're all multi-line, e.g.:
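def ordered_union(the_conds, conditions):
    # Index only the requested conditions, then look each one up in order.
    aux_set = set(the_conds)
    samples_by_cond = dict((c, s) for c, s in conditions if c in aux_set)
    return [s for c in the_conds for s in samples_by_cond[c]]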
Note that this latter approach is faster because it uses the right data structures (a set and a dict). Unfortunately it has to build them itself on every call, because the incoming conditions nested list is really the wrong data structure.
Couldn't you encapsulate conditions as a member variable of a class that builds the crucial (right, fast) auxiliary data structures just once? E.g.:
class Sensible(object):
    def __init__(self, conditions):
        self.seq = []
        self.dic = {}
        for c, s in conditions:
            self.seq.append(c)
            self.dic[c] = s

    def samplesof(self, requested_condition):
        return self.dic[requested_condition]

    def ordered_union(self, the_conds):
        return [s for c in the_conds for s in self.dic[c]]
Now that is fast and elegant!
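For completeness, here is how the class might be used. This is only a sketch: the class is the same as above, repeated so the snippet runs on its own, and the sample data and the variable name conds are made up to mirror the question.

```python
class Sensible(object):
    # Same class as above, repeated so this snippet is self-contained.
    def __init__(self, conditions):
        self.seq = []
        self.dic = {}
        for c, s in conditions:
            self.seq.append(c)
            self.dic[c] = s

    def samplesof(self, requested_condition):
        return self.dic[requested_condition]

    def ordered_union(self, the_conds):
        return [s for c in the_conds for s in self.dic[c]]

# Sample data mirroring the question (the "..." tail is left out).
conditions = [
    ["condition1", ["sample1", "sample2", "sample3"]],
    ["condition2", ["sample4", "sample5", "sample6"]],
]

conds = Sensible(conditions)   # the dict is built once, here
print(conds.samplesof("condition2"))
# The result follows the order of the requested conditions.
print(conds.ordered_union(["condition2", "condition1"]))
```

Every later lookup is a plain dict access, so the cost of scanning the nested list is paid only in the constructor.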
I'm assuming that you need self.seq (the sequence of conditions) for something else (it's certainly not needed for the two operations you mention!), and that there are no repetitions in that sequence and in the samples (whatever your actual specs are, they won't be hard to accommodate, but blindly trying to guess them when you mention nothing about them would be very hard and pointless;-).
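As one example of accommodating a different spec: if samples can repeat across conditions and the union should keep only the first occurrence of each, something like the following would do. This is a guess at one possible requirement, not something the question states; ordered_union_unique and the overlapping sample data are hypothetical.

```python
def ordered_union_unique(the_conds, samples_by_cond):
    # Hypothetical helper: walk the conditions in the requested order and
    # keep each sample only the first time it is seen.
    seen = set()
    result = []
    for c in the_conds:
        for s in samples_by_cond[c]:
            if s not in seen:
                seen.add(s)
                result.append(s)
    return result

# Made-up data in which the two conditions share some samples.
samples_by_cond = {
    "condition1": ["sample1", "sample2", "sample3"],
    "condition2": ["sample3", "sample4", "sample1"],
}

print(ordered_union_unique(["condition1", "condition2"], samples_by_cond))
# prints ['sample1', 'sample2', 'sample3', 'sample4']
```

The set gives constant-time membership checks, while the list preserves the order in which samples first appear.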