Web Scraping

Latest recommended article published on 2024-08-02 16:41:42

源治104

Tags: web scraping, python, http

Article link: https://blog.csdn.net/weixin_62421592/article/details/122334088

Without further ado, let's go straight to Beautiful Soup.

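import requests  # HTTP client library
from bs4 import BeautifulSoup  # HTML parsing library ("Beautiful Soup")

# Send a GET request to the site; the timeout keeps the call from hanging indefinitely
r = requests.get('https://www.baidu.com/?tn=15007414_8_dg', timeout=10)
print(r.status_code)  # 200 means the request succeeded; 404 means the page was not found
print(r.encoding)  # encoding guessed from the HTTP headers
print(r.apparent_encoding)  # encoding inferred from the response content itself
r.encoding = r.apparent_encoding  # decode r.text with the encoding detected from the content
print(r.text[:500])  # r.text is the HTTP response body as a string; show the first 500 characters

demo = r.text
soup = BeautifulSoup(demo, 'html.parser')  # parse the page with Python's built-in HTML parser
print(soup.prettify())  # print the parsed HTML with one tag per line, indented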
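
Once the page is parsed, the usual next step is to pull specific pieces out of the soup rather than print the whole thing. Here is a minimal sketch of that, assuming you want the page title and every link. The `SAMPLE_HTML` string and the `extract_title_and_links` helper are hypothetical names made up for this illustration (they are not from the original post), and the sample page stands in for `r.text` so the snippet runs without a network connection.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page standing in for r.text, so the sketch runs offline.
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <a href="https://example.com/news">News</a>
    <a href="https://example.com/maps">Maps</a>
    <a>No link here</a>
  </body>
</html>
"""


def extract_title_and_links(html):
    """Return the page title and a list of (text, href) pairs for each link."""
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is None when the document has no <title> tag
    title = soup.title.string if soup.title else None
    # href=True skips <a> tags that have no href attribute
    links = [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]
    return title, links


title, links = extract_title_and_links(SAMPLE_HTML)
print(title)
for text, href in links:
    print(text, href)
```

To try it on a live page, pass `r.text` from the request above instead of `SAMPLE_HTML`.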