I'm just starting to learn web scraping; the following is a test case.
**Environment:
(1. PyCharm 2019.3
(2. Python 3.8 (hit a pip upgrade problem when moving toward 3.9 — version upgrade)
[
# Open a command prompt (Win+R, or search for "cmd")
The command that solved it for me:
python -m pip install --upgrade pip -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
When some module refuses to install, you can try:
pip install <module-name> -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
]
(3. Database: MySQL 8.0 (Workbench 8.0)
[
# In cmd:
Ran into a time-zone offset problem when connecting to the database —
mysql -u <username> -p<password>
show databases;
set global time_zone='+8:00';
]
(4. Crawl URL: http://www.ceic.ac.cn/speedsearch
[
The China Earthquake Networks Center site loads its pages via AJAX. Since this is only a test case, I just scrape the default page's data; handling the other pages would require inspecting the actual requests case by case.
]
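As a rough illustration of what "handling the other pages" might look like: once the real AJAX endpoint is found in the browser's network tab, follow-up page URLs could be generated and fed to the spider. The `?page=` query parameter below is a hypothetical placeholder, not the site's verified API.

```python
# Sketch only: candidate URLs for an AJAX-paginated list.
# "?page=" is an assumed parameter name, not confirmed for this site.
BASE_URL = "http://www.ceic.ac.cn/speedsearch"

def page_urls(num_pages):
    """Build URLs for pages 1..num_pages of the (assumed) paginated endpoint."""
    return ["%s?page=%d" % (BASE_URL, i) for i in range(1, num_pages + 1)]

print(page_urls(2))
```

In a Scrapy spider these would typically be yielded as `scrapy.Request` objects from `parse()`, with the same callback handling each page.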
(Database-connection code adapted from this blog post: https://blog.csdn.net/just_so_so_fnc/article/details/72995731)
**Scrapy project overview: (the naming is not rigorous) + database creation
Scrapy project files:
Database creation:
Commands:
create database earthdb;
create table `earthdata`(
`id` int(10) NOT NULL AUTO_INCREMENT,
`level` varchar(100) DEFAULT NULL,
`time` varchar(100) DEFAULT NULL,
`latitude` varchar(100) DEFAULT NULL,
`longitude` varchar(100) DEFAULT NULL,
`depth` varchar(100) DEFAULT NULL,
`address` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
)ENGINE=InnoDB AUTO_INCREMENT=1181 DEFAULT CHARSET=utf8;
select count(*) from earthdata;
**Code:
Spider file:
import scrapy
from ScrapyPro2.items import Scrapypro2Item


class EarthdataSpider(scrapy.Spider):
    name = 'earthData'
    # allowed_domains takes bare domain names, not URLs with paths
    allowed_domains = ['ceic.ac.cn']
    start_urls = ['http://www.ceic.ac.cn/speedsearch']

    def parse(self, response):
        # skip the header row of the results table
        dataList = response.xpath("//table[@class='speed-table1']/tr")[1:]
        for data in dataList:
            # build a fresh item per row; reusing one item object across
            # yields would let later rows overwrite earlier ones
            item = Scrapypro2Item()
            # extract_first() gives a single string (or None) instead of a list
            item["level"] = data.xpath("./td[1]/text()").extract_first()
            item["time"] = data.xpath("./td[2]/text()").extract_first()
            item["latitude"] = data.xpath("./td[3]/text()").extract_first()
            item["longitude"] = data.xpath("./td[4]/text()").extract_first()
            item["depth"] = data.xpath("./td[5]/text()").extract_first()
            item["address"] = data.xpath("./td[6]/a/text()").extract_first()
            yield item
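Scraped text often carries stray whitespace, and depending on whether `.extract()` (list of strings) or `.extract_first()` (string or None) is used, the value's shape differs. A small helper (my own addition, not from the original post) can normalize either form into one clean string before it reaches the database:

```python
def clean_text(value):
    """Normalize a scraped value to a single stripped string, or None.

    Accepts either a list of strings (Selector.extract()) or a bare
    string / None (Selector.extract_first()).
    """
    if isinstance(value, list):
        value = value[0] if value else None
    return value.strip() if isinstance(value, str) else None

print(clean_text(["  5.2\n"]))  # -> 5.2
print(clean_text(None))         # -> None
```

In the spider, each `item["..."] = ...` assignment could be wrapped with this helper so the database never stores list reprs or trailing newlines.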
items file:
import scrapy


class Scrapypro2Item(scrapy.Item):
    # data model: one field per column to scrape
    # define the fields for your item here like:
    level = scrapy.Field()
    time = scrapy.Field()
    latitude = scrapy.Field()
    longitude = scrapy.Field()
    depth = scrapy.Field()
    address = scrapy.Field()
pipelines file:
import pymysql


class Scrapypro2Pipeline(object):
    def open_spider(self, spider):
        # open one connection for the whole crawl instead of once per item
        self.conn = pymysql.Connect(
            host='localhost',
            port=3306,
            user='root',
            passwd='root',
            db='earthdb',
            charset='utf8'
        )
        self.cursor = self.conn.cursor()

    def process_item(self, item, spider):
        """
        # After the crawl you can comment out the insert below and run this
        # query instead to check the results:
        sql = "SELECT * FROM earthdata"
        self.cursor.execute(sql)
        print(self.cursor.rowcount)
        for row in self.cursor.fetchall():
            print("Id:=%s, level:=%s, time:=%s, latitude:=%s, "
                  "longitude:=%s, depth:=%s, address:=%s" % row)
        """
        sql_insert = ("INSERT INTO earthdata(level,time,latitude,longitude,depth,address) "
                      "VALUES(%s,%s,%s,%s,%s,%s)")
        self.cursor.execute(sql_insert, (item['level'], item['time'], item['latitude'],
                                         item['longitude'], item['depth'], item['address']))
        # commit the transaction, otherwise the database is never updated
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # close the cursor before the connection (the original order was reversed)
        self.cursor.close()
        self.conn.close()
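The pipeline's core pattern (parameterized INSERT followed by an explicit commit) can be tried locally without a MySQL server. The sketch below swaps pymysql for the stdlib sqlite3 module purely for illustration; the table mirrors the earthdb schema above, with sqlite's `?` placeholder standing in for pymysql's `%s`.

```python
# Runnable stand-in for the pipeline's insert/commit flow, using sqlite3
# instead of pymysql + MySQL so it needs no running database server.
import sqlite3

SCHEMA = """CREATE TABLE earthdata (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    level TEXT, time TEXT, latitude TEXT,
    longitude TEXT, depth TEXT, address TEXT)"""

def insert_item(conn, item):
    """Mirror of process_item: parameterized insert, then commit."""
    cursor = conn.cursor()
    cursor.execute(
        "INSERT INTO earthdata(level,time,latitude,longitude,depth,address) "
        "VALUES(?,?,?,?,?,?)",
        (item["level"], item["time"], item["latitude"],
         item["longitude"], item["depth"], item["address"]))
    conn.commit()  # without commit, the database never sees the row
    cursor.close()
    return item

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
insert_item(conn, {"level": "5.2", "time": "2020-01-01 12:00",
                   "latitude": "30.1", "longitude": "103.2",
                   "depth": "10", "address": "Sichuan"})
print(conn.execute("SELECT COUNT(*) FROM earthdata").fetchone()[0])  # -> 1
```

Using placeholders rather than string formatting is what keeps the insert safe against quoting problems in scraped text; the same applies to the pymysql version above.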
**Results:
(I had already closed the run's output before taking a screenshot, so no console screenshot here; below is the corresponding database state after execution.)
A small disclaimer:
I'm a beginner still learning and borrowed from many bloggers' code — ORZ, corrections welcome.