Web Crawlers
python
解忧杂货铺Q
Do good deeds and do not ask what lies ahead
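An embarrassing error caused by multiple Python versions and a symlink pointing at the wrong one: import pymysql ModuleNotFoundError: No module named 'pymysql'
Background: to test a one-click install script I had installed python3.7, forgetting that the server already had python3.6. I then added a symlink under /usr/bin as usual, but the replacement never actually took effect because I had not deleted the existing symlink first. After that, running python send.py kept failing and I could not work out why:
    Traceback (most recent call last):
      File "src/send.py", line 6, in <module>
        from src.fetchData import get.
Original · 2021-02-09 08:59:06 · 1083 views · 0 comments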
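A mismatch like the one described above can be checked from inside Python itself. The following is a minimal diagnostic sketch, not the author's procedure: the helper name `describe_interpreter` is hypothetical, and it only reports which interpreter is running, where the `python` command on PATH really points after symlinks are resolved, and whether that interpreter can locate the module.

```python
import importlib.util
import os
import shutil
import sys


def describe_interpreter(module_name="pymysql"):
    """Collect facts that reveal a version or symlink mismatch (hypothetical helper)."""
    # First "python" found on PATH, or None if there is none.
    python_on_path = shutil.which("python")
    # None when the running interpreter cannot find the module.
    spec = importlib.util.find_spec(module_name)
    return {
        "running": sys.executable,
        "version": "{}.{}".format(*sys.version_info[:2]),
        "python_on_path": python_on_path,
        # Follow symlinks such as /usr/bin/python to the real binary.
        "resolved_target": os.path.realpath(python_on_path) if python_on_path else None,
        "module_location": spec.origin if spec else None,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If `module_location` is None while the package was installed with a different interpreter's pip, the two commands are most likely bound to different Python versions.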
Returning dictionary-type data from pymysql in Python
    import pymysql
    import time

    # Database
    db = ""
    cur = ""
    # Current year-month-day
    today = time.strftime("%Y-%m-%d", time.localtime())
    try:
        # Database configuration
        config = {
            "host": "124.71.18.23",
            "port": 3306,
            "user": "root",
            "password": "Master@test",
            "db": 'scrap...
Original · 2021-02-06 23:36:52 · 9408 views · 2 comments
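The excerpt is cut off before the query, so the following is only a sketch of the general idea of dictionary rows, not the post's code. It builds the dictionaries by hand from `cursor.description`, which any DB-API cursor provides. To stay runnable without a MySQL server it uses the standard-library `sqlite3` module as a stand-in; the function name `fetch_all_as_dicts`, the table, and the sample row are assumptions made for illustration.

```python
import sqlite3


def fetch_all_as_dicts(connection, sql, params=()):
    """Run a query and return each row as a {column_name: value} dict."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql, params)
        # DB-API cursors expose column metadata in cursor.description;
        # the first element of each entry is the column name.
        columns = [entry[0] for entry in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


# Stand-in in-memory database (assumption: replaces the MySQL server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (city TEXT, max_temp INTEGER)")
conn.execute("INSERT INTO weather VALUES (?, ?)", ("Shanghai", 12))
rows = fetch_all_as_dicts(conn, "SELECT city, max_temp FROM weather")
print(rows)  # [{'city': 'Shanghai', 'max_temp': 12}]
conn.close()
```

With PyMySQL itself the manual step is unnecessary: passing `cursorclass=pymysql.cursors.DictCursor` to `pymysql.connect(...)` makes `fetchall()` return dictionaries directly. Note that PyMySQL uses `%s` placeholders rather than the `?` shown here for sqlite3.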
crontab scheduled task does not take effect
The log shows that the job is being executed:
    [root@AlexWong weather]# tail -f /var/log/cron
    Feb 5 20:04:01 AlexWong CROND[151652]: (root) CMD (sh /project/python/scrapy/weather/run.sh >/dev/null 2>&1)
    Feb 5 20:06:01 AlexWong CROND[151692]: (root) CMD (sh /project/python/scrapy/we.
Original · 2021-02-05 20:41:06 · 1070 views · 0 comments
crontab -e cannot save: /var/spool/cron/#tmp.AlexWong.XXXXcY46QG: Operation not permitted
Problem:
    [root@AlexWong /]# sudo crontab -e
    crontab: installing new crontab
    /var/spool/cron/#tmp.AlexWong.XXXXcY46QG: Operation not permitted
    crontab: edits left in /tmp/crontab.Mr5kao
    [root@AlexWong /]# lsattr /var/spool/cron/
    ----ia--------e----- /var/spool/.
Original · 2021-02-05 20:06:43 · 838 views · 0 comments
Scraping data with the Scrapy 2.4 framework on Python 3: running multiple spiders at the same time
Create a directory named commands containing a file named crawlall.py:
    from scrapy.commands import ScrapyCommand
    from scrapy.utils.project import get_project_settings

    class Command(ScrapyCommand):
        requires_project = True

        def syntax(self):
            return '[options]'

        def short_desc(self):
            ..
Original · 2021-02-04 19:53:20 · 309 views · 0 comments
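Scraping data with the Scrapy 2.4 framework on Python 3: assigning pipelines per spider when there are multiple spiders
Current environment: python3, scrapy2.4
1. Option 1 (the Scrapy version must be 1.1 or later)
settings.py
    # The value ranges from 0 to 1000 and determines the order in which the pipelines run (their priority); the lower the number, the earlier the pipeline runs
    ITEM_PIPELINES = {
        'weather.pipelines.WeatherPipeline': 300,
        'weather.pipelines.WeatherHourPipeline': 302,
    }
items.py
    # Define here th.
Original · 2021-02-04 19:27:27 · 463 views · 0 comments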
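Because every pipeline listed in ITEM_PIPELINES receives every item, each pipeline has to decide which items it handles. The excerpt is cut off before it shows how the post does this, so the sketch below is only one common way, assumed here for illustration: dispatching on the item class with `isinstance`. The item classes are plain `dict` stand-ins so the example runs without Scrapy installed; `WeatherHourItem` and the `saved` lists are hypothetical names.

```python
class WeatherItem(dict):
    """Stand-in for the daily item (a real one would subclass scrapy.Item)."""


class WeatherHourItem(dict):
    """Stand-in for the hourly item (hypothetical name)."""


class WeatherPipeline:
    def __init__(self):
        self.saved = []

    def process_item(self, item, spider):
        # Handle only daily items; pass everything on to the next pipeline.
        if isinstance(item, WeatherItem):
            self.saved.append(item)
        return item


class WeatherHourPipeline:
    def __init__(self):
        self.saved = []

    def process_item(self, item, spider):
        # Handle only hourly items.
        if isinstance(item, WeatherHourItem):
            self.saved.append(item)
        return item


# Simulate the priority order from settings.py: 300 runs before 302.
daily, hourly = WeatherPipeline(), WeatherHourPipeline()
items = [WeatherItem(name="today"), WeatherHourItem(hour="08:00")]
for item in items:
    for pipeline in (daily, hourly):
        item = pipeline.process_item(item, spider=None)

print(daily.saved)   # [{'name': 'today'}]
print(hourly.saved)  # [{'hour': '08:00'}]
```

Scrapy also lets a spider override project settings through its `custom_settings` class attribute, which is another way to give each spider its own ITEM_PIPELINES.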
Python Scrapy: "attempted relative import with no known parent package" when importing items from the parent directory
items.py
    import scrapy

    # Container for the scraped data
    # Field definitions
    # Daily weather
    class WeatherItem(scrapy.Item):
        # define the fields for your item here like:
        # Today
        name = scrapy.Field()
        # Weather status
        status = scrapy.Field()
        # Date
        date = scrapy.Field()
        # Maximum temperature
        max = scr..
Original · 2021-02-04 13:48:47 · 815 views · 0 comments
Date calculation methods in Python 3
    # Get a date from a day-of-month difference
    def get_date_by_diff(day):
        diff = day - datetime.datetime.now().day
        # First get the date as a datetime object
        threeDayAgo = (datetime.datetime.now() + datetime.timedelta(days=diff))
        # Convert to a timestamp
        # timeStamp = int(time.mktime(threeDayAgo.timetuple()))
        # Convert to another string format...
Original · 2021-02-04 09:47:50 · 486 views · 1 comment
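The excerpt stops before the formatting step, so here is a hedged, self-contained version of the same idea rather than the post's exact code. The `today` parameter and the `"%Y-%m-%d"` output format are assumptions added so the function can be tested with a fixed date.

```python
import datetime


def get_date_by_diff(day, today=None, fmt="%Y-%m-%d"):
    """Return the date whose day-of-month offset from `today` lands on `day`."""
    today = today or datetime.date.today()
    # Offset in days between the requested day number and today's day number.
    diff = day - today.day
    target = today + datetime.timedelta(days=diff)
    return target.strftime(fmt)


reference = datetime.date(2021, 2, 4)
print(get_date_by_diff(1, today=reference))   # 2021-02-01
print(get_date_by_diff(30, today=reference))  # 2021-03-02
```

The second call shows a property of this approach worth knowing: a day number larger than the length of the current month rolls over into the next month instead of raising an error.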
XPath and CSS selectors in Python Scrapy
    import scrapy

    class ShSpider(scrapy.Spider):
        name = 'sh'
        start_urls = [
            'https://weather.com/zh-CN/weather/today/l/7f14186934f484d567841e8646abc61b81cce4d88470d519beeb5e115c9b425a']

        def parse(self, response):
            # Daily forecast
            for li in response.c...
Original · 2021-02-03 15:12:54 · 231 views · 0 comments
Writing data scraped with Scrapy into MySQL on Python 3
WeatherPipeline.py
    # Define your item pipelines here
    import pymysql

    class WeatherPipeline:
        # Database
        def __init__(self):
            # Database configuration
            config = {
                "host": "124.76.81.53",
                "port": 3306,
                "user": "root",
                "password": "Mastertest.c.
Original · 2021-02-03 17:35:07 · 351 views · 3 comments