BeautifulSoup contents explained: getting the text

Latest recommended article published on 2024-04-17 13:52:02

[code lang="python"]
#coding:utf-8
html_doc = """
<p>Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p>sdfsdf</p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, "lxml")
head = soup.select('p')

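# How do we extract the text "Once upon a time there were three little sisters; and their names were"?

print(head)
# select() returns a list of the matching <p> tags
print("===============")

print(head[0].contents)
# .contents splits the tag's children at each child tag: text nodes and <a> tags

print("===============")

print(head[0].contents[0])
# The first child is the leading text node, which is the text we wanted
[/code]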
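Indexing `.contents[0]` relies on the wanted text being the first child of the tag. As a minimal sketch of a slightly more defensive variant, the snippet below keeps only the direct text children of the first `<p>` and compares them with `get_text()`. The `html.parser` backend, the variable names, and the `<p>` wrapper around the sample text are assumptions made for illustration, not part of the original post.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Sample markup; the <p> wrapper is assumed for this sketch.
html_doc = """
<p>Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
"""

# html.parser ships with Python, so no extra parser package is needed here.
soup = BeautifulSoup(html_doc, "html.parser")
p = soup.select("p")[0]

# Keep only the direct text children of <p>, skipping the <a> tags.
direct_text = [child.strip() for child in p.contents
               if isinstance(child, NavigableString)]
first_text = direct_text[0]

# get_text() instead flattens the text of all descendants, link text included.
full_text = p.get_text()

print(first_text)
```

Filtering on `NavigableString` makes it explicit that only text nodes are wanted, whereas `get_text()` is the simpler choice when the link text should be included as well.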