利用python爬取数据并保存到MySQL中

本文在Ubuntu环境下使用Docker搭建了MySQL和Python环境,并从高德地图交通态势图接口获取交通数据保存到MYSQL数据库中。本文罗列了配置过程中的常见问题

步骤

1 进入服务器

登入阿里云

ssh  root@地址 -P

2 搭建MYSQL环境

docker安装MySQL地址:菜鸟教程-docker安装MySQL
注: 如果使用以下mysql -h localhost -u root -p命令无法连接数据库,可以使用命令mysql -h 127.0.0.1 -u root -p

3 创建数据库和数据表

  1. 创建数据库 TRAFFIC_DATA:CREATE DATABASE 数据库名;
  2. 进入数据库use TRAFFIC_DATA;
  3. 创建数据库表:
CREATE TABLE TRAFFIC_DATA (
        traffic_data_id INT UNSIGNED AUTO_INCREMENT,
        time CHAR(25),
        status CHAR(1) NOT NULL,
        info CHAR(255),
        infocode CHAR(20),
        trafficinfo_description TEXT,
        trafficinfo_evaluation_expedite CHAR(10),
        trafficinfo_evaluation_congested CHAR(10),
        trafficinfo_evaluation_blocked CHAR(10),
        trafficinfo_evaluation_unknown CHAR(10),
        trafficinfo_evaluation_status CHAR(1),
        trafficinfo_evaluation_description CHAR(255),
        PRIMARY KEY (traffic_data_id));

CREATE TABLE ROADS(
        road_id INT UNSIGNED AUTO_INCREMENT,
        time CHAR(25),
        name CHAR(255),
        road_status CHAR(1),
        direction CHAR(100),
        angle CHAR(10),
        speed CHAR(10),
        lcodes CHAR(255),
        polyline TEXT,
        PRIMARY KEY (road_id));

注: 若表中含有text类型的属性,需要执行ALTER TABLE TRAFFIC_DATA CONVERT TO CHARACTER SET utf8mb4;命令,改变编码类型,相应的在python代码中,连接数据库时也要设置。

db = pymysql.connect(host="localhost", user="root", password="123456", database="TRAFFIC_DATA", charset="utf8mb4")

问题参考地址

4 搭建python环境

Docker安装Python:Docker安装Python

5 编写爬虫

进入用户自己的文件夹创建traffic_data.py

cd /home
cd zl(你自己创建的文件夹)
# 可以用以下命令创建文件夹
mkdir 文件夹名
# 创建文件
vim traffic_data.py

traffic_data.py

import pymysql
import requests
import json
import time

def getData(url):
    req = requests.get(url)
    data = json.loads(req.text)
    return data

# Press the green button in the gutter to run the script.
if __name__ == '__main__':
    url = "https://restapi.amap.com/v3/traffic/status/rectangle?key=166e4acf928205052a530019a0d98a55&rectangle=121.370178,31.194181;121.434894,31.239471&extensions=all"

    db = pymysql.connect(host="localhost", user="root", password="123456", database="TRAFFIC_DATA", charset="utf8mb4")
    cursor = db.cursor()

    while True:
        time.sleep(1)
        data = getData(url)
        ticks = time.time()
        trafficSql= """INSERT INTO TRAFFIC_DATA(time, status , info, infocode, trafficinfo_description, trafficinfo_evaluation_expedite, trafficinfo_evaluation_congested, trafficinfo_evaluation_blocked, trafficinfo_evaluation_unknown, trafficinfo_evaluation_status, trafficinfo_evaluation_description)
            VALUES ('%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s')""" % (ticks, data['status'], data['info'], data['infocode'], data['trafficinfo']['description'], data['trafficinfo']['evaluation']['expedite'], data['trafficinfo']['evaluation']['congested'], data['trafficinfo']['evaluation']['blocked'], data['trafficinfo']['evaluation']['unknown'], data['trafficinfo']['evaluation']['status'], data['trafficinfo']['evaluation']['description'])
        # try:
        cursor.execute(trafficSql)
        db.commit()
        # except:
        # print("INSERT TRAFFIC_DATA ERROR")

        count = 0
        for x in data['trafficinfo']['roads']:
            try:
                speed = x['speed']
            except:
                speed = -1
            InsertRoadSql = """INSERT INTO ROADS(time, name, road_status, direction, angle, speed, lcodes, polyline)
                           VALUES ('%s','%s','%s','%s','%s','%s','%s','%s')""" % (
            ticks, x['name'], x['status'], x['direction'], x['angle'], speed, x['lcodes'], x['polyline'])
            # try:
            cursor.execute(InsertRoadSql)
            db.commit()
            # except:
            # print("INSERT ROADS ERROR")
    db.close()

6 在Ubuntu下后台持续运行Python程序

nohup python3 traffic_data.py &

参考博客地址

7 杀掉进程

[root@izuf631j25k1f0qea8yp97z zl]ps -ef  | grep python
root       824     1  0  2020 ?        00:41:33 /usr/bin/python2 -Es /usr/sbin/tuned -l -P
root     17627     1  4 21:54 ?        00:02:32 python3 traffic_data.py
root     17714 17674  0 22:57 pts/0    00:00:00 grep --color=auto python
[root@izuf631j25k1f0qea8yp97z zl]#kill -9 17627
评论 4
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值