Python Web Scraping: Parsing Pages with BeautifulSoup

夏安code

Published 2018-11-09 13:45:45

Article link: https://blog.csdn.net/Xu_programmer/article/details/83896484

BeautifulSoup is a simple, easy-to-use library for parsing HTML, although it parses relatively slowly. Basic usage is as follows:

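1. Installation

pip install beautifulsoup4 lxml   # BeautifulSoup 4 is imported as bs4; lxml is the parser used in the example below. On Python 3 use pip3 install beautifulsoup4 lxml; if you hit a permission error, try pip install --user beautifulsoup4 lxml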

2. Import

from bs4 import BeautifulSoup
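If lxml is not installed, asking BeautifulSoup for the "lxml" parser raises bs4.FeatureNotFound. The following is a minimal sketch, not part of the original post, of falling back to Python's built-in html.parser in that case; make_soup is a hypothetical helper name chosen for illustration.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup):
    """Parse markup with lxml if it is installed, else with html.parser.

    make_soup is a hypothetical helper, not part of bs4.
    """
    try:
        return BeautifulSoup(markup, features="lxml")
    except FeatureNotFound:
        # lxml is missing: use the parser that ships with Python
        return BeautifulSoup(markup, features="html.parser")


soup = make_soup("<li class='name'>bbb</li>")
print(soup.li.string)  # prints bbb with either parser
```

The built-in parser needs no extra install, so the fallback keeps a script working on machines where only beautifulsoup4 is available.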

3. Example code

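from bs4 import BeautifulSoup

html = '''
<li> aaa</li>
<li class="name">bbb</li>
'''

# Parse the HTML with the lxml parser (requires the lxml package)
soup = BeautifulSoup(html, features="lxml")

# Find every <li> tag whose class is "name"
items = soup.find_all('li', class_='name')

for item in items:
    # Prints ['name']: class is multi-valued, so it comes back as a list.
    # Any other attribute can be read from attrs the same way.
    print(item.attrs['class'])
    # Prints bbb, the text content of the tag
    print(item.string)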

That is the simplest way to use it.
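Reading attributes and text has a few edge cases worth seeing side by side. The sketch below is an illustration with made-up HTML, using the built-in html.parser so that it runs without lxml: class comes back as a list, other attributes as plain strings, and .string is None once a tag has more than one child, in which case get_text() is the safer choice.

```python
from bs4 import BeautifulSoup

# Sample markup invented for this illustration
html = '<li id="first" class="name big">bbb <b>ccc</b></li>'
soup = BeautifulSoup(html, features="html.parser")

# find returns the first matching tag, or None if nothing matches
li = soup.find('li', class_='name')

classes = li.get('class')   # multi-valued attribute: a list of class names
item_id = li.get('id')      # single-valued attribute: a plain string
missing = li.get('href')    # absent attribute: None instead of a KeyError
text = li.get_text()        # text of the tag and all of its descendants
only_string = li.string     # None, because the tag has more than one child

print(classes, item_id, missing, text, only_string)
```

Using get() rather than attrs[...] avoids a KeyError when a tag lacks the attribute, which is common on real pages.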