python脚本向Mysql插入一亿条测试数据

当我想要测试mysql的诸多索引时,需要用到大量的数据,重复的。不重复的,int类型的,varchar类型的,时间类型的等等,咋办呢?当你看到这篇博文的时候,恭喜你来对地方啦。

咱们废话少说,直接上代码

建表

首先在mysql的example库上创建表user

use example;
create TABLE `user` (
  `id` int(11) NOT NULL,
  `last_name` varchar(45) DEFAULT NULL,
  `first_name` varchar(45) DEFAULT NULL,
  `sex` set('M','F')  DEFAULT NULL,
  `age` tinyint(1) DEFAULT NULL,
  `phone` varchar(11) DEFAULT NULL,
  `address` varchar(45) DEFAULT NULL,
  `password` varchar(45) DEFAULT NULL,
  `create_time` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_last_first_name_age` (`last_name`,`first_name`,`age`) USING BTREE,
  KEY `idx_phone` (`phone`) USING BTREE,
  KEY `idx_create_time` (`create_time`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
插数据

loop_count 为批量插的次数

batch_size为每次批量查的数据量

两者通过自定义,实现想插多少插多少。

import MySQLdb
import MySQLdb.cursors
import random
import string
import time

# 批量插的次数
loop_count = 100000
# 每次批量查的数据量
batch_size = 100
success_count = 0
fails_count = 0
# 数据库的连接
# 使用 SSCursor (流式游标),避免客户端占用大量内存。(这个 cursor 实际上没有缓存下来任何数据,它不会读取所有所有到内存中,它的做法是从储存块中读取记录,并且一条一条返回给你。)
conn = MySQLdb.connect(host="47.97.172.176",
                       user="root",
                       passwd="123456",
                       db="test",
                       port=3306,
                       cursorclass=MySQLdb.cursors.SSCursor)
chars = 'AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz'
digits = '0123456789'


def random_generate_string(length):
    return ''.join(random.sample(chars, length))


def random_generate_number(length):
    if length > len(digits):
        digit_list = random.sample(digits, len(digits))
        digit_list.append(random.choice(digits))
        return ''.join(digit_list)
    return ''.join(random.sample(digits, length))


def random_generate_data(num):
    c = [num]
    phone_num_seed = 13100000000

    def _random_generate_data():
        c[0] += 1
        return (c[0], "last_name_" + str(random.randrange(100000)),
                "first_name_" + str(random.randrange(100000)),
                random.choice('MF'), random.randint(1, 120),
                phone_num_seed + c[0], random_generate_string(20),
                random_generate_string(10), time.strftime("%Y-%m-%d %H:%M:%S"))

    return _random_generate_data


def execute_many(insert_sql, batch_data):
    global success_count, fails_count
    cursor = conn.cursor()
    try:
        cursor.executemany(insert_sql, batch_data)
    except Exception as e:
        conn.rollback()
        fails_count = fails_count + len(batch_data)
        print(e)
        raise
    else:
        conn.commit()
        success_count = success_count + len(batch_data)
        print(str(success_count) + " commit")
    finally:
        cursor.close()


try:
    # user表列的数量
    column_count = 9

    # 插入的SQL
    insert_sql = "replace into user(id, last_name, first_name, sex, age, phone, address, password, create_time) values (" + ",".join(
        ["%s" for x in range(column_count)]) + ")"
    batch_count = 0
    begin_time = time.time()
    for x in range(loop_count):
        batch_count = x * batch_size
        gen_fun = random_generate_data(batch_count)
        batch_data = [gen_fun() for x in range(batch_size)]
        execute_many(insert_sql, batch_data)
    end_time = time.time()
    total_sec = end_time - begin_time
    qps = success_count / total_sec
    print("总共生成数据: " + str(success_count))
    print("总共耗时(s): " + str(total_sec))
    print("QPS: " + str(qps))
except Exception as e:
    print(e)
    raise
else:
    pass
finally:
    pass

效果图
在这里插入图片描述

妈妈再也不用担心我测试数据不足啦!

原文链接: https://blog.csdn.net/csujiangyu/article/details/51096978

原文由python2 书写,此文已经改良,支持python3

  • 1
    点赞
  • 8
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值