Web Scraping (爬虫) — posts by 璐南熙
Matching special email addresses

```python
import re  # needed by the preview; the original listing cut it off

str = "2100037220@qq.com"        # note: shadows the built-in str
str2 = "xiaoxiao@heuet.edu.com"
pattern2 = re.compile(r"\w+@(\w+\.)?\w+\.com")   # optional subdomain before .com
result = re.search(pattern2, str)
result2 = re.search(pattern2, str2)
print(result)
print(result2)
```
…

Original post · 2019-09-26 14:56:39 · 755 views · 0 comments
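The preview's pattern works, but it also matches addresses embedded in longer strings. A minimal, self-contained sketch that hardens it; the anchors and the helper name `is_dot_com_email` are my additions, not the post's:

```python
import re

# Hedged refinement of the preview's pattern: the raw string avoids escape
# warnings, and the ^/$ anchors (my addition) reject addresses embedded in
# longer text.
pattern = re.compile(r"^\w+@(\w+\.)?\w+\.com$")

def is_dot_com_email(addr):
    """True for name@domain.com, with one optional subdomain level."""
    return pattern.match(addr) is not None

print(is_dot_com_email("2100037220@qq.com"))       # True
print(is_dot_com_email("xiaoxiao@heuet.edu.com"))  # True
print(is_dot_com_email("someone@example.org"))     # False
```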
Multi-threaded crawler in practice: scraping wallpapers from 彼岸图网

The plain (single-threaded) method first:

```python
import requests
from lxml import etree
import os
from urllib import request

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'
}

def pa…
```

Original post · 2021-06-16 15:26:11 · 477 views · 0 comments
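The multi-threaded version the title promises typically pairs a thread-safe `Queue` of URLs with a pool of worker threads. A runnable skeleton of that pattern, with `fetch()` as an offline stand-in for the real `requests.get(...).content` call and placeholder URLs:

```python
import threading
from queue import Queue

# Sketch of the common multi-threaded crawler pattern: a shared Queue of
# URLs drained by worker threads. fetch() is an offline stand-in for the
# real requests.get(url, headers=...).content call; the URLs are placeholders.
def fetch(url):
    return b"bytes-for-" + url.encode()

def worker(q, results, lock):
    while True:
        url = q.get()
        if url is None:       # sentinel: this worker is done
            q.task_done()
            return
        data = fetch(url)
        with lock:            # guard the shared dict
            results[url] = data
        q.task_done()

def crawl(urls, num_threads=4):
    q, results, lock = Queue(), {}, threading.Lock()
    threads = [threading.Thread(target=worker, args=(q, results, lock))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for u in urls:
        q.put(u)
    for _ in threads:
        q.put(None)           # one sentinel per worker
    for t in threads:
        t.join()
    return results

pages = ["https://example.com/wall/1.jpg", "https://example.com/wall/2.jpg"]
print(sorted(crawl(pages)))
```

Swapping `fetch` for a real `requests.get` call (and writing `data` to disk) turns this skeleton into the post's wallpaper downloader.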
Multi-threaded crawler in practice: scraping meme images (表情包)

Version 1.0 — downloading memes with a synchronous crawler, method one:

```python
import requests
from lxml import etree
import os

def parse_page(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'
    }
    …
```

Original post · 2021-06-16 15:23:44 · 180 views · 0 comments
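A step every downloader like this needs is turning an image URL into a safe local file name before saving (the synchronous version then fetches with `urllib`'s `request.urlretrieve`). An offline sketch of just that step; the helper name `local_path` and its sanitizing regex are my additions:

```python
import os
import re
from urllib.parse import urlparse

# Derive a safe local filename from an image URL. In the post,
# request.urlretrieve(url, path) would then download to that path;
# the download itself is omitted so the snippet runs offline.
def local_path(url, folder="images"):
    name = os.path.basename(urlparse(url).path)    # e.g. "funny cat.jpg"
    name = re.sub(r'[\\/:*?"<>|\s]', "_", name)    # replace unsafe characters
    return os.path.join(folder, name)

print(local_path("https://example.com/imgs/funny cat.jpg"))
```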
Scraping Douban's Top 250 movies

```python
import requests
from lxml import etree

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0'}
base_url = 'https://movie.douban.com/top250?start='
for page in range(150, 226, 25):
    url = base_url + str(p…
```

Original post · 2020-07-09 16:15:48 · 147 views · 0 comments
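The `range(150, 226, 25)` in the preview mirrors how Douban paginates: each Top 250 page shows 25 movies, addressed by a `start` offset. A tiny sketch of the URL construction (this preview resumes at start=150; `range(0, 250, 25)` would cover all ten pages):

```python
# Top 250 list pages step by 25 items via the start query parameter.
base_url = "https://movie.douban.com/top250?start="
urls = [base_url + str(start) for start in range(150, 226, 25)]
for u in urls:
    print(u)   # start=150, 175, 200, 225
```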
Scraping the Douban movie chart with regular expressions

```python
import requests
import re
import chardet

url = "https://movie.douban.com/chart"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 …
```

Original post · 2019-09-30 15:36:36 · 1018 views · 0 comments
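The regex approach boils down to `re.findall` with a capturing group over the raw HTML. An offline demo on a hand-made two-link fragment; the `class="nbg"`/`title="..."` attribute layout shown here is an assumption about the chart markup, so verify it against the live page:

```python
import re

# Hand-made stand-in for the downloaded chart HTML (offline demo).
html = '''
<a class="nbg" href="https://movie.douban.com/subject/1/" title="电影A">x</a>
<a class="nbg" href="https://movie.douban.com/subject/2/" title="电影B">y</a>
'''
# Capture the title attribute of each class="nbg" link.
titles = re.findall(r'class="nbg"[^>]*title="([^"]+)"', html)
print(titles)   # ['电影A', '电影B']
```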
Scraping Douban with XPath

**(Another way to scrape Douban, following up on the earlier regex version)**

```python
import requests
from lxml import html

url = "https://movie.douban.com/chart"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko…
```

Original post · 2019-10-08 20:56:30 · 581 views · 0 comments
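The post uses lxml's full XPath engine (`from lxml import html`). As a dependency-free stand-in, the stdlib `xml.etree.ElementTree` supports a small XPath subset that is enough to show the idea on well-formed markup; real, messy pages still need lxml's forgiving HTML parser:

```python
import xml.etree.ElementTree as ET

# ElementTree handles only well-formed XML and a limited XPath subset,
# so this fragment is hand-made; swap in lxml.html for live pages.
doc = ET.fromstring("""
<div>
  <a class="nbg" title="电影A">x</a>
  <a class="nbg" title="电影B">y</a>
</div>
""")
titles = [a.get("title") for a in doc.findall(".//a[@class='nbg']")]
print(titles)   # ['电影A', '电影B']
```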
Scraping Douban with Scrapy

items.py:

```python
import scrapy

class Douban1Item(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    names = scrapy.Field()
    actors = scrapy.Field()
    …
```

Original post · 2019-10-11 18:36:24 · 124 views · 0 comments
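A `scrapy.Item` is essentially a declared container of `Field`s. As a dependency-free illustration of what the preview's `Douban1Item` carries, a plain dataclass with the same field names (`names`, `actors`, taken from the preview) plays that role here, so the snippet runs without Scrapy installed:

```python
from dataclasses import dataclass

# Stand-in for scrapy.Item: same declared fields as the preview's
# Douban1Item, but a plain dataclass so no Scrapy install is needed.
@dataclass
class Douban1Item:
    names: str = ""
    actors: str = ""

item = Douban1Item(names="Example Movie", actors="Actor A / Actor B")
print(item)
```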
Scraping weather forecasts with Scrapy

item.py:

```python
import scrapy

class WeatherItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    city = scrapy.Field()
    date = scrapy.Field()
    …
```

Original post · 2019-10-13 18:23:03 · 590 views · 0 comments