Python Web Scraping
阿祺的阿铖呀
Fortune News

    from urllib.request import Request, urlopen
    import bs4
    titles = []
    texts = []
    header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'}
    for i in range(2,7):
        url = 'ht…

Original · 2021-01-10 23:56:52 · 73 views · 0 comments
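The preview above defines a `header` dict with a browser `User-Agent` before it starts fetching pages. A minimal sketch of how such a header can be attached with `urllib.request.Request`, assuming a placeholder URL (`https://example.com/`, not taken from the post) and a hypothetical helper name `build_request`. Nothing is sent over the network here.

```python
from urllib.request import Request

# Same User-Agent string as in the post's preview.
HEADER = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/84.0.4147.105 Safari/537.36'
}


def build_request(url, headers=HEADER):
    """Return a Request carrying browser-like headers (hypothetical helper)."""
    return Request(url, headers=headers)


# Placeholder URL: an assumption for illustration only.
req = build_request('https://example.com/')

# urllib stores header names capitalized, so the lookup key is 'User-agent'.
print(req.get_header('User-agent'))
```

Passing `req` to `urlopen` would then perform the actual fetch; that step is left out so the sketch runs offline.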
Scraping the 前程无忧 (51job) site and writing to Excel

    from urllib.request import Request, urlopen
    import bs4
    import requests
    import re
    import json
    import xlwt
    workbook = xlwt.Workbook(encoding='utf-8')
    worksheet = workbook.add_sheet('1')
    worksheet.write(0, 0, label='工作名称')  # job title
    worksheet.write(0, 1, label='公司名称'…

Original · 2021-01-10 23:55:05 · 186 views · 0 comments
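The preview above writes a header row cell by cell with `worksheet.write(row, col, label=...)`. A minimal sketch of how scraped rows might be laid out under that header, assuming made-up sample data and two hypothetical helpers, `rows_to_cells` and `save_xls`. The rest of the post's code is truncated, so this is not the author's implementation.

```python
def rows_to_cells(header, rows):
    """Flatten a header and data rows into (row, col, value) triples.

    Row 0 holds the header; data rows start at row 1.
    """
    cells = [(0, col, label) for col, label in enumerate(header)]
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row):
            cells.append((r, c, value))
    return cells


def save_xls(cells, path):
    """Write the triples with xlwt (third-party; imported only when called)."""
    import xlwt
    workbook = xlwt.Workbook(encoding='utf-8')
    worksheet = workbook.add_sheet('1')
    for r, c, value in cells:
        worksheet.write(r, c, label=value)
    workbook.save(path)


header = ['工作名称', '公司名称']             # column labels from the preview
rows = [('Example job', 'Example company')]  # made-up sample row
cells = rows_to_cells(header, rows)
print(cells)
```

Calling `save_xls(cells, 'jobs.xls')` would produce the spreadsheet; the file name is an assumption.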
Downloading Baidu Images

    import re
    import os
    from urllib.request import Request, urlopen, urlretrieve
    import bs4
    import json
    def json_all(pn):
        links = []
        for i in range(0,pn+1):
            url = 'https://image.baidu.com/search/acjson?tn=resultjson_com&logid=6116183662344…

Original · 2021-01-10 23:51:48 · 189 views · 0 comments
Scraping paper PDFs

    import os
    from urllib.request import urlretrieve
    from urllib.request import Request, urlopen
    import bs4
    url='http://cjc.ict.ac.cn/qwjs/No2020-01.htm'
    header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)…

Original · 2021-01-10 23:48:30 · 164 views · 0 comments
Scraping and downloading images

    import os
    from urllib.request import urlretrieve
    from urllib.request import Request, urlopen
    import bs4
    url='https://www.sohu.com/a/286956359_301394'
    header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)…

Original · 2021-01-10 23:46:37 · 132 views · 0 comments
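The preview above imports `urlretrieve` and `bs4` to pull images from a page. A rough sketch of the step between fetching and downloading: collecting `<img>` URLs and choosing local file names. It swaps in the standard library's `html.parser` for `bs4` so it runs without extra packages, and uses a made-up HTML snippet. `ImgCollector`, `image_targets`, and the `images` directory are hypothetical names, not from the post.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImgCollector(HTMLParser):
    """Collect absolute URLs from the src attribute of <img> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            src = dict(attrs).get('src')
            if src:  # skip <img> tags without a src
                self.urls.append(urljoin(self.base_url, src))


def image_targets(html, base_url, out_dir='images'):
    """Return (url, local_path) pairs, numbering files in page order."""
    parser = ImgCollector(base_url)
    parser.feed(html)
    targets = []
    for i, url in enumerate(parser.urls):
        # Fall back to .jpg when the URL path has no extension (an assumption).
        ext = os.path.splitext(urlparse(url).path)[1] or '.jpg'
        targets.append((url, os.path.join(out_dir, f'{i}{ext}')))
    return targets


# Made-up sample markup; the real page content is not shown in the post.
SAMPLE_HTML = ('<p><img src="//example.com/a.png">'
               '<img src="/b.jpeg"><img alt="no src"></p>')
targets = image_targets(SAMPLE_HTML, 'https://www.sohu.com/a/286956359_301394')
for url, path in targets:
    print(url, '->', path)
```

Each pair could then be handed to `urlretrieve(url, path)` after creating the directory with `os.makedirs(out_dir, exist_ok=True)`.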
前程无忧 (51job): links and index numbers

    from urllib.request import Request, urlopen
    import bs4
    import requests
    import re
    import json
    import xlwt
    workbook = xlwt.Workbook(encoding='utf-8')
    worksheet = workbook.add_sheet('1')
    worksheet.write(0, 0, label='序号')  # index number
    worksheet.write(0, 1, label='工作名称')  # job title
    …

Original · 2021-01-10 23:45:35 · 211 views · 0 comments
Scraping Douban Music TOP250 and writing to Excel

Scraping Douban Music and writing it to Excel:

    from urllib.request import Request, urlopen
    import bs4
    import requests
    import re
    import json
    import xlwt
    workbook = xlwt.Workbook(encoding='utf-8')
    worksheet = workbook.add_sheet('1')
    worksheet.write(0, 0, label='序号')  # index number
    worksheet.write(0, 1,…

Original · 2021-01-09 13:20:30 · 394 views · 2 comments
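Scraping the rank, title, and rating from "Douban Movie Top250" and writing them to a file

Want to scrape the rank, title, and rating of every film in the Douban Movie Top250 and write them to a file? It is actually quite simple. Open PyCharm and take the Douban Movie Top250 URL: https://movie.douban.com/top250?start=0&filter= Here is my code:

    from urllib.request import urlopen, Request
    from bs4 import BeautifulSoup
    import re
    for x in range(0,250,25):
        url=("https…

Original · 2020-11-09 17:31:18 · 2271 views · 2 comments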
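The loop `for x in range(0,250,25)` in the preview suggests the list is paged through the `start` query parameter, 25 entries per page. A small sketch of generating those page URLs, assuming the URL template is the address quoted in the post with `start` substituted (the post's own `url=` line is truncated). `page_urls` is a hypothetical helper name.

```python
# Template assumed from the URL quoted in the post, with start made variable.
BASE_URL = 'https://movie.douban.com/top250?start={}&filter='


def page_urls(total=250, page_size=25):
    """Build one URL per page: start = 0, 25, 50, ... below total."""
    return [BASE_URL.format(start) for start in range(0, total, page_size)]


urls = page_urls()
print(len(urls))   # number of pages
print(urls[0])     # first page
print(urls[-1])    # last page
```

Each URL would then be requested and parsed in turn, as the post's loop does.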