I am reading an article about removing duplicate elements from a list in Python.
There is a function defined as:
def f8(seq): # Dave Kirby
    # Order preserving
    seen = set()
    return [x for x in seq if x not in seen and not seen.add(x)]
However, I don't really understand the syntax of
[x for x in seq if x not in seen and not seen.add(x)]
What is this syntax? How do I read it?
Thank you.
Solution
First, list comprehensions are usually easy to read. Here is a simple example:
[x for x in seq if x != 2]
translates to:
result = []
for x in seq:
    if x != 2:
        result.append(x)
You can't read this code because it is unreadable, hacky code, as I stated in this question:
def f8(seq):
    seen = set()
    return [x for x in seq if x not in seen and not seen.add(x)]
translates to:
def f8(seq):
    seen = set()
    result = []
    for x in seq:
        if x not in seen and not seen.add(x): # not seen.add(x) is always True
            result.append(x)
    return result
It relies on the fact that set.add is an in-place method that always returns None, so not None evaluates to True. Because and short-circuits, seen.add(x) is only called when x has not been seen yet.
>>> s = set()
>>> y = s.add(1) # methods usually return None
>>> print s, y
set([1]) None
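To make the evaluation order concrete, here is a minimal Python 3 sketch that spells out the filter condition as a separate function. The helper name keep is hypothetical and only for illustration; it is not part of the original f8.

```python
def keep(x, seen):
    """Hypothetical helper: the comprehension's filter condition, spelled out."""
    if x in seen:
        # `x not in seen` is False, so `and` short-circuits here:
        # seen.add(x) is never called for a duplicate.
        return False
    # set.add returns None, so `not seen.add(x)` is True.
    # The call matters only for its side effect of recording x.
    return not seen.add(x)


seen = set()
trace = [(x, keep(x, seen)) for x in [1, 2, 1, 3, 2]]
print(trace)  # [(1, True), (2, True), (1, False), (3, True), (2, False)]
```

Each element is kept the first time it appears and rejected afterwards, which is why the original order is preserved.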
The reason why the code has been written this way is to sneakily take advantage of Python's list comprehension speed optimizations.
By convention, Python methods that modify a data structure in place return None (pop is one of the exceptions).
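A short Python 3 sketch of that convention, using only built-in list methods. The variable names are made up for the example.

```python
nums = [3, 1, 2]

append_result = nums.append(0)   # mutates nums, returns None
sort_result = nums.sort()        # sorts nums in place, returns None
pop_result = nums.pop()          # mutates nums AND returns the removed item

# The built-in sorted() is the non-mutating counterpart: it returns a new list.
sorted_copy = sorted([3, 1, 2])

print(append_result, sort_result, pop_result)  # None None 3
print(nums)         # [0, 1, 2]
print(sorted_copy)  # [1, 2, 3]
```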
I also noted that the currently accepted way of doing this (2.7+), which is more readable and doesn't rely on a hack, is as follows:
>>> from collections import OrderedDict
>>> items = [1, 2, 0, 1, 3, 2]
>>> list(OrderedDict.fromkeys(items))
[1, 2, 0, 3]
Dictionary keys must be unique, so the duplicates are filtered out, and an OrderedDict remembers the order in which keys were first inserted.
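On newer interpreters the same idea works with a plain dict. This is a sketch that assumes Python 3.7 or later, where regular dictionaries are guaranteed to preserve insertion order; the function name dedupe is hypothetical, and the items must be hashable.

```python
def dedupe(items):
    """Hypothetical helper: drop duplicates, keeping first occurrences in order.

    Assumes Python 3.7+, where a plain dict preserves insertion order.
    """
    # dict.fromkeys builds a dict whose keys are the items (values are None);
    # a repeated key is simply ignored, so only the first occurrence counts.
    return list(dict.fromkeys(items))


items = [1, 2, 0, 1, 3, 2]
print(dedupe(items))  # [1, 2, 0, 3]
```

On Python 2.7 or early Python 3 versions, collections.OrderedDict remains the safe choice.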