Python filter
Introduction
The Python built-in filter() function creates a new iterator from an existing iterable (like a list or dictionary), yielding only the elements for which a function we provide returns a true value. An iterable is a Python object that can be “iterated over”; that is, it returns its items one at a time, so we can use it in a for loop.
The basic syntax for the filter() function is:
filter(function, iterable)
This returns a filter object, which is an iterator. We can pass it to a function like list() to build a list of all the items it yields.
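As a minimal sketch of this call pattern (the numbers list and the is_even helper are illustrative names, not part of the original tutorial), the following shows a predicate function, the lazy filter object, and the conversion to a list:

```python
# Hypothetical data and predicate, used only for illustration.
numbers = [1, 2, 3, 4, 5, 6]

def is_even(n):
    """Return True for the values that should be kept."""
    return n % 2 == 0

evens = filter(is_even, numbers)   # a filter object; nothing is evaluated yet
even_list = list(evens)            # consume the iterator into a list
print(even_list)
```

Here list() is what actually drives the calls to is_even(); until then the filter object has done no work.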
The filter() function provides a way of filtering values that can often be more memory-efficient than a list comprehension, especially when we’re working with larger data sets. A list comprehension builds a new list immediately, so once it finishes we have two lists in memory. filter(), by contrast, creates a small object that holds a reference to the provided function and an iterator over the original list, and it produces matching items only as they are requested, which takes up less memory.
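One rough way to see the difference is to compare the size of the two result objects. The sketch below assumes a made-up list of 100,000 integers; exact byte counts vary by Python version, so it only prints them rather than promising specific numbers:

```python
import sys

# Hypothetical data set, large enough to make the size difference visible.
numbers = list(range(100_000))

eager = [n for n in numbers if n % 2 == 0]      # builds a second list right away
lazy = filter(lambda n: n % 2 == 0, numbers)    # small object; evaluated on demand

# sys.getsizeof reports the container's own size, not the items it refers to.
eager_size = sys.getsizeof(eager)
lazy_size = sys.getsizeof(lazy)
print(eager_size, lazy_size)

# Items are produced one at a time, only when requested.
first_even = next(lazy)
print(first_even)
```

Note that the saving only holds while the filter object stays lazy; wrapping it in list() builds the full second list after all.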
In this tutorial, we’ll review four different ways of using filter(): with two different iterable structures, with a lambda function, and with no defined function.
Using filter() with a Function
The first argument to filter() is a function, which we use to decide whether to include or filter out each item. The function is called once for every item in the iterable passed as the second argument, and each time it returns False (or any other falsy value), the item is dropped. Because this argument is a function, we can pass either a normal function or a lambda function, which is convenient when the expression is simple.
The following is the syntax of a lambda with filter():
filter(lambda item: expression, iterable)
With a list like the following, we can use a lambda function containing an expression against which we want to evaluate each item in the list:
creature_names = ['Sammy', 'Ashley', 'Jo', 'Olly', 'Jackie', 'Charlie']
To filter this list for the names of our aquarium creatures that start with a vowel, we can use the following lambda function:
print(list(filter(lambda x: x[0].lower() in 'aeiou', creature_names)))
Here we refer to each item in our list as x. Our expression then accesses the first character of each string (the character at index zero) with x[0]. Lowercasing that character ensures that names starting with an uppercase vowel still match the letters in the string 'aeiou'.
Finally, we pass the iterable creature_names. As described in the introduction, we apply list() to the result to create a list from the iterator that filter() returns.
The output will be the following:
Output
['Ashley', 'Olly']
The same result can be achieved using a function we define:
creature_names = ['Sammy', 'Ashley', 'Jo', 'Olly', 'Jackie', 'Charlie']
def names_vowels(x):
    return x[0].lower() in 'aeiou'
filtered_names = filter(names_vowels, creature_names)
print(list(filtered_names))
Our function names_vowels defines the expression that we use to filter creature_names.
Again, the output is as follows:
Output
['Ashley', 'Olly']
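One caveat with x[0]: it raises IndexError on an empty string. The sketch below is one possible way to guard against that, assuming a hypothetical list that contains a blank name (the original list does not). It uses a set of vowels because the substring test '' in 'aeiou' is True, which would otherwise let empty names through:

```python
# Hypothetical input containing an empty string, which is not in the original list.
names_with_blank = ['Sammy', '', 'Ashley', 'Olly']

VOWELS = set('aeiou')

def starts_with_vowel(name):
    # name[:1] is '' for an empty string instead of raising IndexError,
    # and '' is not a member of the set, so empty names are dropped.
    return name[:1].lower() in VOWELS

safe_result = list(filter(starts_with_vowel, names_with_blank))
print(safe_result)
```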
Overall, lambda functions achieve the same result with filter() as regular functions do. As the expressions for filtering our data grow more complex, defining a regular function becomes more worthwhile, because a named function is usually easier to read.
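To illustrate when a named function starts to pay off, here is a sketch of a predicate that combines two conditions. The long_vowel_name function and its min_length parameter are assumptions made for this example only:

```python
creature_names = ['Sammy', 'Ashley', 'Jo', 'Olly', 'Jackie', 'Charlie']

def long_vowel_name(name, min_length=5):
    """Keep names that start with a vowel and have at least min_length letters."""
    starts_with_vowel = name[0].lower() in 'aeiou'
    long_enough = len(name) >= min_length
    return starts_with_vowel and long_enough

# filter() calls the function with one argument, so min_length keeps its default.
long_names = list(filter(long_vowel_name, creature_names))
print(long_names)
```

The same logic would fit in a lambda, but the intermediate names and the docstring make the intent easier to follow.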
Using None with filter()
We can pass None as the first argument to filter() to have the returned iterator filter out any value that Python considers “falsy”. Generally, Python treats anything with a length of 0 (such as an empty list or empty string), anything numerically equal to 0, and the values None and False as false, hence the term “falsy”.
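To make the idea of “falsy” concrete, this short sketch runs bool() over a handful of illustrative values (the samples list is made up for the example) and shows that filter(None, ...) keeps the same items as filter(bool, ...):

```python
# Illustrative values; bool() shows how Python treats each one in a truth test.
samples = [0, 0.0, "", [], {}, None, False, 1, "0", [0]]
truthiness = [bool(value) for value in samples]
print(truthiness)

# Passing None keeps exactly the items for which bool(item) is True.
kept_with_none = list(filter(None, samples))
kept_with_bool = list(filter(bool, samples))
print(kept_with_none)
```

Note that the string "0" and the list [0] are kept: both have a nonzero length, even though their contents look like zero.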
In the following case we want to filter our list to show only the tank numbers at our aquarium:
aquarium_tanks = [11, False, 18, 21, "", 12, 34, 0, [], {}]
In this code we have a list containing integers, empty collections (a string, a list, and a dictionary), and a Boolean value.
filtered_tanks = filter(None, aquarium_tanks)
We use the filter() function with None and pass in the aquarium_tanks list as our iterable. Because we passed None as the first argument, filter() tests the truth value of each item and drops those that are considered false.
print(list(filtered_tanks))
Then we wrap filtered_tanks in a call to list() so that a list is printed.
Here the output shows only the nonzero integers. All the items that evaluate to False (the Boolean False, the integer 0, and the empty string, list, and dictionary) were removed by filter():
Output
[11, 18, 21, 12, 34]
Note: If we print filtered_tanks without using list(), we receive a filter object that looks something like this: <filter object at 0x7fafd5903240>. The filter object is an iterator, so we could loop over it with for, or we can use list() to turn it into a list, which we’re doing here because it’s a good way to review the results.
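One consequence of the filter object being an iterator is that it can be consumed only once. The following sketch reuses the aquarium_tanks list and converts the same filter object to a list twice; the second conversion yields an empty list:

```python
aquarium_tanks = [11, False, 18, 21, "", 12, 34, 0, [], {}]
filtered_tanks = filter(None, aquarium_tanks)

first_pass = list(filtered_tanks)    # consumes the iterator
second_pass = list(filtered_tanks)   # nothing is left to yield

print(first_pass)
print(second_pass)
```

If the results are needed more than once, it is usually simplest to keep the list from the first pass.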
With None we have used filter() to quickly remove the items from our list that are considered false.
Using filter() with a List of Dictionaries
When we have a more complex data structure, we can still use filter() to evaluate each of the items. For example, if we have a list of dictionaries, we want to iterate over each item in the list (one of the dictionaries), and we may also want to iterate over each key:value pair in a dictionary in order to evaluate all the data.
As an example, let’s say we have a list of each creature in our aquarium along with different details about each of them:
aquarium_creatures = [
{"name": "sammy", "species": "shark", "tank number": "11", "type": "fish"},
{"name": "ashley", "species": "crab", "tank number": "25", "type": "shellfish"},
{"name": "jo", "species": "guppy", "tank number": "18", "type": "fish"},
{"name": "jackie", "species": "lobster", "tank number": "21", "type": "shellfish"},
{"name": "charlie", "species": "clownfish", "tank number": "12", "type": "fish"},
{"name": "olly", "species": "green turtle", "tank number": "34", "type": "turtle"}
]
We want to filter this data by a search string we give to the function. To have filter() examine each dictionary and each value in the dictionaries, we construct a nested function, like the following:
def filter_set(aquarium_creatures, search_string):
    def iterator_func(x):
        for v in x.values():
            if search_string in v:
                return True
        return False
    return filter(iterator_func, aquarium_creatures)
We define a filter_set() function that takes aquarium_creatures and search_string as parameters. Inside filter_set() we define iterator_func() and pass it as the function to filter(). The filter_set() function returns the iterator produced by filter().
The iterator_func() function takes x as an argument, which represents an item in our list (that is, a single dictionary).
Next, the for loop goes through the values of the key:value pairs in the dictionary and uses a conditional statement to check whether search_string is in v, which represents a value.
As in our previous examples, if iterator_func() returns True, filter() includes the item in its results. filter_set() returns the filter object immediately, and the items are evaluated only when the object is consumed. We position return False outside of our loop so that the function checks every value in a dictionary, instead of returning after checking the first value alone.
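The loop-and-return pattern can also be written with the built-in any(). The sketch below is an alternative, not the tutorial’s own code: filter_set_any and matches are hypothetical names, the two records are repeated so the example runs on its own, and it assumes every dictionary value is a string:

```python
# Two records in the same shape as the tutorial's data, repeated here so the
# sketch runs on its own.
creatures = [
    {"name": "sammy", "species": "shark", "tank number": "11", "type": "fish"},
    {"name": "ashley", "species": "crab", "tank number": "25", "type": "shellfish"},
]

def filter_set_any(records, search_string):
    """Hypothetical variant of filter_set() that uses any() instead of a loop."""
    def matches(record):
        # any() stops at the first value that contains the search string.
        return any(search_string in value for value in record.values())
    return filter(matches, records)

matched_names = [record["name"] for record in filter_set_any(creatures, "2")]
print(matched_names)
```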
We call filter_set() with our list of dictionaries and the search string we want to find matches for:
filtered_records = filter_set(aquarium_creatures, "2")
Once the function returns, we have our filter object stored in the filtered_records variable, which we turn into a list and print:
print(list(filtered_records))
We’ll receive the following output from this program:
Output
[{'name': 'ashley', 'species': 'crab', 'tank number': '25', 'type': 'shellfish'}, {'name': 'jackie', 'species': 'lobster', 'tank number': '21', 'type': 'shellfish'}, {'name': 'charlie', 'species': 'clownfish', 'tank number': '12', 'type': 'fish'}]
We’ve filtered the list of dictionaries with the search string 2. We can see that the three dictionaries that include a tank number containing 2 have been returned. Using our own nested function allowed us to access every value and check each against the search string.
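Searching every value can match fields we did not intend; for instance, the string fish appears in the species clownfish and in the type shellfish. When only one field matters, a check against a single key may be clearer. This sketch uses a shortened, illustrative version of the records:

```python
# Shortened, illustrative records in the same shape as the tutorial's data.
creatures = [
    {"name": "sammy", "species": "shark", "type": "fish"},
    {"name": "ashley", "species": "crab", "type": "shellfish"},
    {"name": "jo", "species": "guppy", "type": "fish"},
]

# Compare one specific key instead of scanning every value.
fish_only = filter(lambda creature: creature["type"] == "fish", creatures)
fish_names = [creature["name"] for creature in fish_only]
print(fish_names)
```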
Conclusion
In this tutorial, we’ve learned the different ways of using the filter() function in Python. Now you can use filter() with your own function, with a lambda function, or with None to filter items in data structures of varying complexity.
Although in this tutorial we printed the results from filter() immediately in list format, in our programs we would more likely use the returned filter object directly and manipulate the data further.
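As a sketch of what such further processing might look like, the filter object can be fed straight into other functions that accept an iterable, such as map() and sorted(). The tank_numbers list here is an assumption made for the example:

```python
# Hypothetical list of tank numbers stored as strings.
tank_numbers = ["11", "25", "18", "21", "12", "34"]

# Keep the tank numbers containing "2", then convert them and sort the result.
with_two = filter(lambda number: "2" in number, tank_numbers)
as_ints = map(int, with_two)          # still lazy; nothing is computed yet
sorted_tanks = sorted(as_ints)        # sorted() consumes the iterator
print(sorted_tanks)
```

No intermediate list is built until sorted() pulls the values through the chain.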
If you would like to learn more Python, check out our How To Code in Python 3 series and our Python topic page.
Translated from: https://www.digitalocean.com/community/tutorials/how-to-use-the-python-filter-function