The .string attribute requires the tag to contain only text, with no nested tags. If you print .string for the first p tag, it returns None because that tag has other tags inside it.
Or, to explain it better, the documentation says:
"If a tag has only one child, and that child is a NavigableString, the child is made available as .string"
"If a tag contains more than one thing, then it’s not clear what .string should refer to, so .string is defined to be None"
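To make the quoted rule concrete, here is a minimal sketch. The markup in `sample` is a hypothetical example, not taken from the question; it only assumes that `bs4` is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, chosen only to illustrate the rule.
sample = "<p>only text</p><p>text with <b>a nested tag</b></p>"
soup_demo = BeautifulSoup(sample, "html.parser")
single, mixed = soup_demo.find_all("p")

single_string = single.string  # the tag's only child is a NavigableString
mixed_string = mixed.string    # two children (a string and a <b> tag), so None
mixed_text = mixed.get_text()  # concatenates the text of all descendants

print(single_string)
print(mixed_string)
print(mixed_text)
```

The second paragraph still has readable text. It is only .string that gives up once there is more than one child, which is why .text or get_text() is the safer thing to search against.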
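The way around this is to pass a lambda function to find_all and test the tag's full text instead of .string.

from bs4 import BeautifulSoup

# The nested <b> tag is an assumed stand-in: any tag inside the first <p> has the same effect.
html = """
<p>This has a <b>color</b> of red. Because it likes the color red</p>
<p>This paragraph has a color of blue.</p>
<p>This paragraph does not have a color.</p>
"""
soup = BeautifulSoup(html, 'html.parser')

first_p = soup.find('p')
print(first_p)
# <p>This has a <b>color</b> of red. Because it likes the color red</p>

# .string is None because the first <p> has more than one child
print(first_p.string)
# None

# .text joins the text of every descendant
print(first_p.text)
# This has a color of red. Because it likes the color red

# Match every <p> whose full text contains "color", ignoring case
paras = soup.find_all(lambda tag: tag.name == 'p' and 'color' in tag.text.lower())
print(paras)
# [<p>This has a <b>color</b> of red. Because it likes the color red</p>, <p>This paragraph has a color of blue.</p>, <p>This paragraph does not have a color.</p>]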
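As a rough sketch of why the predicate should test the full text, the snippet below wraps the lambda in two helpers and compares the results. The helper names `paragraphs_containing` and `paragraphs_with_string_containing` are hypothetical, and the `<b>` tag in the markup is an assumption standing in for whatever nested tag the first paragraph actually had.

```python
from bs4 import BeautifulSoup

# Assumed markup: the <b> tag is a stand-in for the nested tag in the first paragraph.
html = """
<p>This has a <b>color</b> of red. Because it likes the color red</p>
<p>This paragraph has a color of blue.</p>
<p>This paragraph does not have a color.</p>
"""
soup = BeautifulSoup(html, "html.parser")


def paragraphs_containing(soup, word):
    """Hypothetical helper: <p> tags whose full text contains `word`, ignoring case."""
    word = word.lower()
    return soup.find_all(
        lambda tag: tag.name == "p" and word in tag.get_text().lower()
    )


def paragraphs_with_string_containing(soup, word):
    """Same check against .string; tags whose .string is None are skipped."""
    word = word.lower()
    return soup.find_all(
        lambda tag: tag.name == "p"
        and tag.string is not None
        and word in tag.string.lower()
    )


by_text = paragraphs_containing(soup, "color")
by_string = paragraphs_with_string_containing(soup, "color")
print(len(by_text))    # 3
print(len(by_string))  # 2
```

The .string version silently drops the first paragraph, while the get_text() version keeps it. Note that a plain substring test also matches "does not have a color", so a stricter predicate is needed if the intent is to find paragraphs that actually name a color.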