Python Web Scraping
1eeMamas
Chasing the footsteps of those who came before — walking on and on, until only a solitary figure remains!
Python 3 scraper example: scraping Shuangseqiu (Double Color Ball) lottery results and saving them to a database
    import requests
    from fake_useragent import UserAgent
    from lxml import etree
    import pymysql

    class SqlHelper(object):
        def __init__(self):
            self.connect()
        def connect(self):
            s...
Original post · 2020-04-11 21:45:11
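The preview above cuts off mid-definition. As a runnable sketch of the same SqlHelper pattern, the snippet below swaps pymysql for the stdlib sqlite3 module (an in-memory database) so it works without a MySQL server; the `lottery` table name, its columns, and the sample row are hypothetical, not taken from the post.

```python
import sqlite3

class SqlHelper(object):
    """Minimal DB helper in the shape of the post's SqlHelper, using
    sqlite3 in place of pymysql so the sketch runs without MySQL."""
    def __init__(self):
        self.connect()

    def connect(self):
        # The original post would call pymysql.connect(host=..., user=..., ...)
        self.conn = sqlite3.connect(':memory:')
        self.cursor = self.conn.cursor()
        # Hypothetical schema for scraped draw records
        self.cursor.execute(
            'CREATE TABLE IF NOT EXISTS lottery (issue TEXT, numbers TEXT)')

    def insert(self, issue, numbers):
        # Parameterized insert, same idea as cursor.execute with %s in pymysql
        self.cursor.execute(
            'INSERT INTO lottery (issue, numbers) VALUES (?, ?)',
            (issue, numbers))
        self.conn.commit()

    def close(self):
        self.cursor.close()
        self.conn.close()

helper = SqlHelper()
helper.insert('2020041', '01 05 12 19 22 28 + 07')  # sample draw, made up
rows = helper.cursor.execute('SELECT * FROM lottery').fetchall()
```

Swapping the connection back to pymysql only changes `connect()` and the `?` placeholders (pymysql uses `%s`); the helper-class structure stays the same.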
Python 3 scraper: multithreading + XPath
Multithreaded scraping of Qiushibaike jokes:

    from threading import Thread
    from queue import Queue
    import requests
    from fake_useragent import UserAgent
    from lxml import etree

    class CrawlInfo(Thread):
        def __init__(self, url_queue...
Original post · 2020-04-10 21:42:38
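The worker-thread pattern in the excerpt — subclass Thread, pull URLs from a url_queue, push results to an html_queue — can be sketched end to end. The network fetch is stubbed out (str.upper stands in for requests.get) so the sketch runs offline; the page URLs follow the post's qiushibaike.com target but the page range is an assumption.

```python
from threading import Thread
from queue import Queue, Empty

class CrawlInfo(Thread):
    """Worker in the shape of the post's CrawlInfo: drain url_queue,
    put one result per URL into html_queue."""
    def __init__(self, url_queue, html_queue):
        super().__init__()
        self.url_queue = url_queue
        self.html_queue = html_queue

    def run(self):
        while True:
            try:
                # get_nowait avoids the empty()/get() race between workers
                url = self.url_queue.get_nowait()
            except Empty:
                break
            # In the post this is requests.get(url, headers=...) parsed
            # with etree; str.upper() stands in so the sketch runs offline
            self.html_queue.put(url.upper())

url_queue = Queue()
html_queue = Queue()
for page in range(1, 4):  # page range is illustrative
    url_queue.put(f'https://www.qiushibaike.com/text/page/{page}/')

threads = [CrawlInfo(url_queue, html_queue) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

results = sorted(html_queue.queue)
```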
Python 3 scraper: using jsonpath
    import requests
    from fake_useragent import UserAgent
    from jsonpath import jsonpath

    headers = {
        'User-Agent': UserAgent().chrome
    }
    url = 'http://httpbin.org/get'
    response = requests.get(url, hea...
Original post · 2020-04-10 20:42:32
Python 3 scraper: using XPath
Example: scraping a proxy list

    from lxml import etree
    import requests
    from fake_useragent import UserAgent

    headers = {
        'User-Agent': UserAgent().chrome
    }
    url = 'https://www.xicidaili.com/nn/'
    response = requests.get(url, hea...
Original post · 2020-04-10 19:59:56
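The XPath extraction itself can be shown offline. The inline HTML below is a stand-in for the xicidaili.com proxy table (the real markup and the sample IPs are assumptions), so only the parsing step is demonstrated, not the fetch.

```python
from lxml import etree

# Inline HTML standing in for the xicidaili proxy-list page,
# so the XPath sketch runs without a network request
html = '''
<table id="ip_list">
  <tr><th>IP</th><th>Port</th></tr>
  <tr><td>110.243.5.163</td><td>9999</td></tr>
  <tr><td>121.232.148.97</td><td>3000</td></tr>
</table>
'''

tree = etree.HTML(html)
# Skip the header row with position()>1, then take each cell's text
ips = tree.xpath('//table[@id="ip_list"]/tr[position()>1]/td[1]/text()')
ports = tree.xpath('//table[@id="ip_list"]/tr[position()>1]/td[2]/text()')
proxies = [f'{ip}:{port}' for ip, port in zip(ips, ports)]
```

Against the live page, `html` would be `response.text` from the requests call in the excerpt.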
Python 3 scraper: using pyquery
Example: scraping a proxy list

    from pyquery import PyQuery as pq
    import requests
    from fake_useragent import UserAgent

    headers = {
        'User-Agent': UserAgent().chrome
    }
    url = 'https://www.xicidaili.com/nn/'
    response = requests....
Original post · 2020-04-10 19:04:33
Python 3 scraper: using the re library
Example: scraping Qiushibaike

    import requests
    from fake_useragent import UserAgent
    import re

    headers = {
        'User-Agent': UserAgent().chrome
    }
    url = 'https://www.qiushibaike.com/text/'
    response = requests.get(url, he...
Original post · 2020-04-10 17:08:14
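The regex step can be demonstrated on an inline snippet shaped like a qiushibaike.com/text/ listing (the class names and joke texts here are assumptions), so no network request is needed.

```python
import re

# Inline HTML standing in for the joke-listing page
html = '''
<div class="content"><span>First joke text</span></div>
<div class="content"><span>Second joke text</span></div>
'''

# Non-greedy capture group between the span tags; re.S lets '.'
# match newlines in case a joke spans several lines
pattern = re.compile(r'<div class="content"><span>(.*?)</span>', re.S)
jokes = pattern.findall(html)
```

Against the live page, `html` would be `response.text` from the requests call in the excerpt.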
Python 3 scraper: using requests
GET request:

    import requests
    from fake_useragent import UserAgent

    headers = {
        'User-Agent': UserAgent().chrome
    }
    url = 'http://www.xxx.com/s'
    params = {
        'wd': 'python'
    }
    response = requests.get(url, headers...
Original post · 2020-04-10 00:48:31
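To see how requests assembles headers and query parameters without actually sending anything, the same request can be built and *prepared* only. The URL is the post's own placeholder; the fixed User-Agent string stands in for `UserAgent().chrome`.

```python
import requests

headers = {'User-Agent': 'Mozilla/5.0'}  # stand-in for UserAgent().chrome
params = {'wd': 'python'}

# Build the request object but stop before sending, so the sketch
# runs without network access
req = requests.Request('GET', 'http://www.xxx.com/s',
                       headers=headers, params=params)
prepared = req.prepare()

# params are URL-encoded into the final request URL
final_url = prepared.url
```

`requests.get(url, headers=headers, params=params)` performs exactly this preparation internally before sending.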
Python 3 scraper: using the urllib library
Fetching a page:

    from urllib.request import Request, urlopen

    url = 'http://www.xx.com'
    req = Request(url)
    resp = urlopen(req)

Reading the response data:

    html = resp.read().decode()

Adding request headers:

    from fake_useragent import UserAgent

    header = {...
Original post · 2020-04-10 00:12:51
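The same steps can be sketched up to, but not including, the `urlopen()` call, so the example runs offline; the URL is the post's placeholder, and the fixed User-Agent string stands in for `UserAgent().chrome`.

```python
from urllib.request import Request

url = 'http://www.xx.com'
headers = {'User-Agent': 'Mozilla/5.0'}  # stand-in for UserAgent().chrome

# Request stores the headers and parses the URL for urlopen() to use later
req = Request(url, headers=headers)

host = req.host
# urllib normalizes stored header names to capitalized form ('User-agent')
ua = req.get_header('User-agent')
```

Passing `req` to `urlopen()` would then send the request with these headers, and `resp.read().decode()` would return the page body as text.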