Simple Python Crawler Code: A Simple Python Crawler Tutorial (Part 1)


A Simple Python Crawler Tutorial


Preparation

The following third-party libraries need to be installed:

pip install requests

No screenshot was taken while installing requests; just run the command above to install it.

pip install beautifulsoup4

D:20180801scriptpython>pip install beautifulsoup4
Collecting beautifulsoup4
  Downloading https://files.pythonhosted.org/packages/1a/b7/34eec2fe5a49718944e215fde81288eec1fa04638aa3fb57c1c6cd0f98c3/beautifulsoup4-4.8.0-py3-none-any.whl (97kB)
     |████████████████████████████████| 102kB 65kB/s
Collecting soupsieve>=1.2 (from beautifulsoup4)
  Downloading https://files.pythonhosted.org/packages/35/e3/25079e8911085ab76a6f2facae0771078260c930216ab0b0c44dc5c9bf31/soupsieve-1.9.2-py2.py3-none-any.whl
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.8.0 soupsieve-1.9.2
WARNING: You are using pip version 19.1.1, however version 19.2.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

D:20180801scriptpython>

pip install lxml

C:\Users>pip install lxml
Collecting lxml
  Downloading https://files.pythonhosted.org/packages/21/ba/ca19058e1ae455c0425f72bd9fe1a0493e89f19f494b46a5c88867371def/lxml-4.4.0-cp37-cp37m-win_amd64.whl (3.7MB)
     |████████████████████████████████| 3.7MB 59kB/s
Installing collected packages: lxml
Successfully installed lxml-4.4.0
WARNING: You are using pip version 19.1.1, however version 19.2.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

C:\Users>
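Before moving on, it can help to confirm that all three installs succeeded. The following is a minimal sketch using only the standard library; the names REQUIRED and check_installed are made up for this illustration and are not part of the original tutorial. Note that beautifulsoup4 is imported under the module name bs4.

```python
import importlib.util

# pip package name -> module name used in "import ..." statements.
# beautifulsoup4 is the only one whose import name (bs4) differs.
REQUIRED = {"requests": "requests", "beautifulsoup4": "bs4", "lxml": "lxml"}


def check_installed(required=REQUIRED):
    """Map each pip package name to True if its module can be located."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in required.items()
    }


if __name__ == "__main__":
    for package, ok in check_installed().items():
        status = "installed" if ok else "missing, run: pip install " + package
        print(package + ": " + status)
```

Running the script prints one line per library, so a missing package shows up before the crawler itself fails with an ImportError.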

Let's look at the code first.

This snippet scrapes the simplest kind of content, plain text on a static page, so it is very short.

# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup

req = requests.get('http://www.huanyue123.com/book/37/37849/22075553.html')  # open the page
# req.encoding = 'GBK'  # encoding
html = req.text  # get the response body as text
bf = BeautifulSoup(html, 'lxml')  # parse the HTML with the lxml parser
texts = bf.find_all('div', class_='contentbox clear')  # find the div elements with class "contentbox clear"
print(texts[0].text)  # print the result
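To try the parsing step without touching the network, the same find_all call can be run against an inline HTML string. This is a minimal sketch rather than the author's code: the sample markup, the chapter text, and the helper name extract_chapter_text are invented for illustration. It defaults to Python's built-in "html.parser" so that it also runs where lxml is missing; passing 'lxml' as in the script above should give the same result for this input.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mimics the structure the script looks for.
SAMPLE_HTML = """
<html><body>
  <div class="header">Site navigation</div>
  <div class="contentbox clear">Chapter 1. It was a quiet night.</div>
</body></html>
"""


def extract_chapter_text(html, parser="html.parser"):
    """Return the text of the first <div class="contentbox clear">, or None."""
    soup = BeautifulSoup(html, parser)
    # A class_ value that contains a space is compared against the whole
    # class attribute string, so the class order must match exactly.
    matches = soup.find_all("div", class_="contentbox clear")
    if not matches:
        return None  # avoids the IndexError that matches[0] would raise
    return matches[0].get_text(strip=True)


if __name__ == "__main__":
    print(extract_chapter_text(SAMPLE_HTML))
```

Returning None when nothing matches is a design choice for this sketch: on a page whose layout has changed, the caller gets a value to check instead of an IndexError. If the order of the classes on a real page varies, soup.select('div.contentbox.clear') matches both classes regardless of order.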

Printed output

[Screenshot: printed output of the script]

This post only shows the output; the next one will walk through the details.
