你应该了解NewSQL:CockroachDB验证文档

方案八:CockroachDB

cockroachdb是分布式、支持事务、支持SQL操作、K-V存储模式的数据库。CockroachDB的三位创始人全部来自Google,其架构受到Google的 Spanner和F1的启发,cockroach开源地址输入图片说明 具有:

  • 标准SQL接口, 使用PostgreSQL协议,支持标准SQL接口,兼容关系型数据库SQL生态;
  • 扩展能力强、高并发,支持类MPP并行查询框架;
  • 弹性扩容,持按需扩容, 自动负载均衡;
  • 多副本强一致,使用raft算法保证数据一致性;
  • 服务高可用,上去中心化,无SPOF;
  • 分布式事务,基于MVCC实现事务控制,支持SI和SSI两种隔离级别;

调研

建表
DROP TABLE IF EXISTS "tracks";  
CREATE TABLE IF NOT EXISTS "tracks" (  
"id"  SERIAL PRIMARY KEY ,  
"third_tracks_id" varchar(32)  NOT NULL DEFAULT '' ,  
"tracks_title" varchar(255) NOT NULL DEFAULT '' ,  
"tracks_title_other" varchar(255)  NOT NULL DEFAULT '',  
"tracks_title_py" varchar(64) NOT NULL DEFAULT '' ,  
"data_source" bigint DEFAULT 1 NOT NULL,  
"tags" varchar(255)  NOT NULL DEFAULT '',  
"duration" bigint DEFAULT 0 NOT NULL,  
"status" int DEFAULT 0 NOT NULL,  
"pa" int DEFAULT 0 NOT NULL,  
"announcer_name" varchar(255)  NOT NULL DEFAULT '',  
"anchor_name" varchar(255)  NOT NULL DEFAULT '',
"play_count" bigint DEFAULT 0 NOT NULL,
"own_count" bigint DEFAULT 0 NOT NULL,
"paid" int DEFAULT 0 NOT NULL,
"info" text NOT NULL,
"created_at" timestamp NOT NULL,
"updated_at" timestamp NOT NULL,
"data_updated" bigint NOT NULL,
"created" timestamp NOT NULL,
"updated" timestamp NOT NULL,
"announcer_id" varchar(256) NOT NULL DEFAULT '',
"anchor_id" varchar(256) NOT NULL DEFAULT '',
UNIQUE INDEX "idx_thirdTrackId_dataSource" (third_tracks_id ASC, data_source ASC),
INDEX "idx_announcerid_status_paid_playcount" (announcer_id ASC, status ASC, paid ASC, play_count DESC)
);
注意点

1.暂不支持数据表字段加注释; 2.不支持大量数据删除:

> DELETE FROM tracks where id <100000;
pq: kv/txn_coord_sender.go:428: transaction is too large to commit: 189948 intents

因为需要强一致性,如果删除大量数据会导致集群延时变大,如果需要删除大量数据可以采用:分段删除

alter table tracks rename to tracks_0907;
for (i=1;i<count(*);i+=2000){
	DELETE FROM tracks_0907 where id <= i;
}
性能圧测

圧测工具 go语言实现圧测脚本

圧测方法 三台机器执行圧测脚本,每一次圧测时长5-10min,每行记录约为1.6kb。因为测试集群为3台性能如下:

M02-XI3
整机SN216486580
CPU【INTEL Xeon E5-2650 V4 12C 2.2GHZ】*2
内存【LANGCHAO PC4-19200 16G】*8
硬盘【LANGCHAO SATA 3T 7.2K】*4
FLASH【LANGCHAO NVMe SSD 800G】*1
网卡【LANGCHAO INTEL 82599】*1
加速卡
RAID无硬件RAID卡

尽可能模拟将线上数据从MySQL迁移到NewSQL的场景。很多人看到mysql或者其他的数据库性能测试报告时候,看到qps都是几万左右,但是大家注意看就会发现测试的单行记录才50byte,与线上的数据是不相符的。

3台比较好的连接数为1000左右,所以测试会以1000并发为限。

圧测表格
序号SQL数据量并发qps99分位延时90分位延时SQL Byte Traffic备注
1insert1500w100666152ms117ms832kb18个newsql索引
2insert1500w300687352ms187ms872kb18个newsql索引
3insert1500w900778700ms1500ms1.2MB18个newsql索引
4insert1500w10008071500s1200ms1.3MB18个newsql索引
5insert1500w100105142ms12ms1.2MB1个newsql索引
6insert1500w300225492ms32ms3.2MB1个newsql索引
7insert1500w6004021130ms56ms6.1MB1个newsql索引
8insert1500w9005938250ms148ms8.4MB1个newsql索引
9insert1500w10006125270ms171ms8.7MB1个newsql索引
10select * from tracks WHERE id = 随机id AND status = 01500w300562530ms10ms7.3MB主键索引
11select * from tracks WHERE id = 随机id AND status = 01500w600871345ms8ms11.7MB主键索引
12select * from tracks WHERE id = 随机id AND status = 01500w100012320160ms130ms16.2MB主键索引
13select * from tracks WHERE id (随机20ids) AND status = 01500w3002134200ms140ms29.7MB主键索引
14select * from tracks WHERE id (随机20ids) AND status = 01500w6002526420ms350ms34.1MB主键索引
15select * from tracks WHERE id (随机20ids) AND status = 01500w10002650771ms640ms36.1MB主键索引
16select * from tracks WHERE id (随机50ids) AND status = 01500w300714670ms540ms23.2MB主键索引
17select * from tracks WHERE id (随机50ids) AND status = 01500w6006721700ms1300ms21.4MB主键索引
18select * from tracks WHERE id (随机50ids) AND status = 01500w6007573000ms2490ms24.7MB主键索引
19SELECT *FROM "tracks" WHERE "third_tracks_id" IN (随机1个id) AND "data_source" = 随机公司 AND "status" = 01500w300555340ms5ms4.1MB(third_tracks_id , data_source )索引
20SELECT *FROM "tracks" WHERE "third_tracks_id" IN (随机1个id) AND "data_source" = 随机公司 AND "status" = 01500w6008624120ms18ms6.0MB(third_tracks_id , data_source )索引
21SELECT *FROM "tracks" WHERE "third_tracks_id" IN (随机1个id) AND "data_source" = 随机公司 AND "status" = 01500w10001145310ms90ms8.7MB(third_tracks_id , data_source )索引
22SELECT *FROM "tracks" WHERE "announcer_id" IN (随机1个id) AND "paid" = 0 AND "status" = 0 order by play_count DESC1500w3005493160ms3ms5.1MB(announcer_id ASC, status ASC, paid ASC, play_count DESC)索引
23SELECT *FROM "tracks" WHERE "announcer_id" IN (随机1个id) AND "paid" = 0 AND "status" = 0 order by play_count DESC1500w6007825283ms23ms7.5MB(announcer_id ASC, status ASC, paid ASC, play_count DESC)索引
24SELECT *FROM "tracks" WHERE "announcer_id" IN (随机1个id) AND "paid" = 0 AND "status" = 0 order by play_count DESC1500w100011171310ms68ms10.5MB(announcer_id ASC, status ASC, paid ASC, play_count DESC )索引
25select * from tracks WHERE id = 随机id AND status = 03000w3005614123ms3ms8.3MB主键索引
26select * from tracks WHERE id = 随机id AND status = 03000w6008723130ms8ms12.4MB主键索引
27select * from tracks WHERE id = 随机id AND status = 03000w100010764320ms30ms16.2MB主键索引
28select * from tracks WHERE id = 随机id AND status = 05000w3005136159ms8ms7.8MB主键索引
29select * from tracks WHERE id = 随机id AND status = 05000w6008463180ms13ms11.8MB主键索引
30select * from tracks WHERE id = 随机id AND status = 05000w100010848220ms26ms16.3MB主键索引
数据分析
  1. 与MySQL等数据库一样,存在索引时候insert的数据会慢上许多,但是在cockroachDB中会更加明显一些。
    输入图片说明 图中可以得出:
  • 大量索引对insert的影响是巨大,会导致写库操作qps大降,在提高并发数后qps亦没有显著提升;
  • 在仅有主键索引时候,insert的qps随着并发数的提升得到相应的提升,到了6000qps后再提升并发数效果就不再明显了。
  1. 通过对主键id的获取数据量不同的压测,可以得到如下图:
    输入图片说明
    图中可以得出:
  • id数越多需要查询的range就越多qps就越低;
  • id数越多得到的数据量也就越多,网络IO变大,性能也会有所下降;
  1. 对比主键索引和复合索引的查询效率,可以得到:
    输入图片说明
    图中可以得出:
  • 有索引的查询效率会有大幅提升,并且主键索引、唯一索引与复合索引的查询效率基本相同;
  • 主键查询效率最优;
  1. 对比在不同数据记录数中查询效率:
    输入图片说明
    图中可以得出:
  • 表中数据量的大小对qps影响不是很大,起码在千万级别是可以接受的;
  • 表中数据量越少速度就越快;

转载于:https://my.oschina.net/qiangmzsx/blog/1545283

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值