管理索引的方法
TiDB中的Online DDL
mysql> admin show ddl jobs;
+--------+---------+----------------------------------+--------------+--------------+-----------+----------+-----------+---------------------+---------------------+---------------------+--------+
| JOB_ID | DB_NAME | TABLE_NAME | JOB_TYPE | SCHEMA_STATE | SCHEMA_ID | TABLE_ID | ROW_COUNT | CREATE_TIME | START_TIME | END_TIME | STATE |
+--------+---------+----------------------------------+--------------+--------------+-----------+----------+-----------+---------------------+---------------------+---------------------+--------+
| 68 | mysql | advisory_locks | create table | public | 3 | 67 | 0 | 2023-06-07 02:48:20 | 2023-06-07 02:48:20 | 2023-06-07 02:48:20 | synced |
| 66 | mysql | analyze_jobs | create table | public | 3 | 65 | 0 | 2023-06-07 02:48:20 | 2023-06-07 02:48:20 | 2023-06-07 02:48:20 | synced |
增加索引的原理
创建索引,会扫描全表
创建索原理简单而言:1、先创建元数据,2、再回填表(region)数据。
动态调整创建索引的速度
参数 | 默认值 | 说明 |
---|---|---|
tidb_ddl_reorg_worker_nct | 4 | 控制DDL操作re-organize阶段的并发度 |
tidb_ddl_reorg_batch_size | 256 | 控制每个worker一起回填数据单位,以batch为单位 |
tidb_ddl_reorg_priority | priority_low | 调整创建索引优先级,参数有 priority_low priority_normal priority_high |
tidb_ddl_error_count_limit | 512 | 失败重试次数,如果超过该次数创建索引会失败 |
在创建索引时,需要时刻关注:
- 创建索引对系统的压力,可以通过Grafana的dashboard来查看系统的压力
- 创建索引的速度,当系统压力不大的时候,可以适当调整创建索引的速度
注意: row_count 这个列来评估创建的速度
增加索引对于线上业务的影响
- 目标列被频繁更新(DML)时,默认配置会造成较为频繁的写冲突,使得在线负载较大
- 当add index的目标列仅涉及查询负载,或者与线上负载不直接相关时,可以直接使用默认配置
- add index 也可能由于不断地重试,需要很长时间才能完成
目标列上存在频繁读写的场景
- 无add index操作时:
- 有add index操作,且tidb_ddl_reorg_batch_size = 32 时:
随着两个参数(tidb_ddl_reorg_worker_int 和tidb_ddl_reorg_batch_size)的逐渐增大,影响主要来源于add index与column update并发进行冲突的写冲突,系统的表现反应在: - tikv_prewrite_latch_wait_duration 有明显的升高,造成写入变慢。
- admin show ddl 命令可以看到ddl job的多次重试,此时add index会持续非常久。
目标列上存在只读场景
- 无add index操作时:
- 有add index操作,且tidb_ddl_reorg_batch_size = 32 时:
目标列不涉及读写场景
- 无add index操作时:
- 有add index操作,且tidb_ddl_reorg_batch_size = 32 时:
索引扫描方式
PointGet / Batch PointGet
要使得优化器能够选择PointGet算子,需要满足以下几个条件:
- 返回值至多只能有一个,或者没有返回结果
- 一定要有唯一键,可以是主键或者唯一索引
Index Full Scan
index range scan
对于非唯一索引,无论等值还是范围 都是Index Range scan
索引选择规则
索引选择的维度
- 索引列涵盖了多少访问条件
- 索引是否需要回表
- 不需要回表: indexRender
- 需要回表: indexLookupRender
基于代价的选择
有以下因素决定:
- 索引的每行数据在存储层的平均长度
- 索引生成的查询范围的行数
- 索引的回表代价
- 索引查询时的范围数量
实验
观察索引的创建速度,并进行调整;
1、造数
mysql> create table t1(a int,b int);
Query OK, 0 rows affected (0.17 sec)
[root@tidb2 ~]# for i in `seq 10000`; do mysql -uroot -P4000 -h192.168.16.13 -pAa123ab! -e "insert into test.t1 values ($i,floor(rand()*10000000))"; done;
[root@tidb2 ~]# for i in `seq 33`; do mysql -uroot -P4000 -h192.168.16.13 -pAa123ab! -e "insert into test.t1 select * from test.t1 limit 100000"; done;
mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
| 3060000 |
+----------+
1 row in set (7.89 sec)
2、调慢创建索引的速度
mysql> set global tidb_ddl_reorg_worker_cnt=1;
Query OK, 0 rows affected (0.05 sec)
mysql> set global tidb_ddl_reorg_batch_size=32;
Query OK, 0 rows affected (0.10 sec)
3、创建索引,并观察创建索引的速度
mysql> create index idx_t1_a on t1(a);
Query OK, 0 rows affected (34 min 16.72 sec)
非常慢
mysql> admin show ddl jobs where table_name='t1';
+--------+---------+------------+----------------+----------------------+-----------+----------+-----------+---------------------+---------------------+---------------------+---------+
| JOB_ID | DB_NAME | TABLE_NAME | JOB_TYPE | SCHEMA_STATE | SCHEMA_ID | TABLE_ID | ROW_COUNT | CREATE_TIME | START_TIME | END_TIME | STATE |
+--------+---------+------------+----------------+----------------------+-----------+----------+-----------+---------------------+---------------------+---------------------+---------+
| 101 | test | t1 | add index | write reorganization | 1 | 99 | 2065666 | 2023-07-08 20:12:13 | 2023-07-08 20:12:13 | NULL | running |
| 100 | test | t1 | create table | public | 1 | 99 | 0 | 2023-07-08 19:56:50 | 2023-07-08 19:56:50 | 2023-07-08 19:56:51 | synced |
+--------+---------+------------+----------------+----------------------+-----------+----------+-----------+---------------------+---------------------+---------------------+---------+
8 rows in set (0.02 sec)
4、调整参数,加快创建索引的速度
mysql> set global tidb_ddl_reorg_worker_cnt=32;
Query OK, 0 rows affected (0.05 sec)
mysql> set global tidb_ddl_reorg_batch_size=512;
Query OK, 0 rows affected (0.10 sec)
5、创建索引,并观察速度
mysql> create index idx_t1_a on t1(a);
Query OK, 0 rows affected (14 min 16.72 sec)
实验:点查
1、造数
mysql> create table t1( a int,b int);
Query OK, 0 rows affected (7 min 48.55 sec)
[root@tidb2 ~]# for i in `seq 10000`; do mysql -uroot -P4000 -h192.168.16.13 -pAa123ab! -e "insert into test.t1 values ($i,floor(rand()*10000000))"; done;
2、没有索引情况下,是否可以触发点查
mysql> explain select * from t1 where a = 88;
+-------------------------+----------+-----------+---------------+-------------------+
| id | estRows | task | access object | operator info |
+-------------------------+----------+-----------+---------------+-------------------+
| TableReader_7 | 1.00 | root | | data:Selection_6 |
| └─Selection_6 | 1.00 | cop[tikv] | | eq(test.t1.a, 88) |
| └─TableFullScan_5 | 10000.00 | cop[tikv] | table:t1 | keep order:false |
+-------------------------+----------+-----------+---------------+-------------------+
3 rows in set (0.00 sec)
4、创建非唯一索引后,是否会触发point get
mysql> create index idx on t1(a);
Query OK, 0 rows affected (4.06 sec)
mysql> explain select * from t1 where a = 88;
+-------------------------------+---------+-----------+------------------------+---------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+------------------------+---------------------------------+
| IndexLookUp_10 | 1.00 | root | | |
| ├─IndexRangeScan_8(Build) | 1.00 | cop[tikv] | table:t1, index:idx(a) | range:[88,88], keep order:false |
| └─TableRowIDScan_9(Probe) | 1.00 | cop[tikv] | table:t1 | keep order:false |
+-------------------------------+---------+-----------+------------------------+---------------------------------+
3 rows in set (0.00 sec)
5、创建唯一索引,是否会触发点查
mysql> alter table t1 drop index idx;
Query OK, 0 rows affected (0.35 sec)
mysql> create unique index idx_unique on t1(a);
Query OK, 0 rows affected (4.31 sec)
mysql> explain select * from t1 where a = 88;
+-------------+---------+------+-------------------------------+---------------+
| id | estRows | task | access object | operator info |
+-------------+---------+------+-------------------------------+---------------+
| Point_Get_1 | 1.00 | root | table:t1, index:idx_unique(a) | |
+-------------+---------+------+-------------------------------+---------------+
1 row in set (0.00 sec)
6、唯一索引下的范围查询,是否会触发点查
mysql> explain select * from t1 where a < 10;
+-------------------------------+---------+-----------+-------------------------------+-----------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+-------------------------------+-----------------------------------+
| IndexLookUp_10 | 9.00 | root | | |
| ├─IndexRangeScan_8(Build) | 9.00 | cop[tikv] | table:t1, index:idx_unique(a) | range:[-inf,10), keep order:false |
| └─TableRowIDScan_9(Probe) | 9.00 | cop[tikv] | table:t1 | keep order:false |
+-------------------------------+---------+-----------+-------------------------------+-----------------------------------+
3 rows in set (0.00 sec)
7、有唯一索引,查询结果为空是否会触发point get
mysql> explain select * from t1 where a = -1;
+-------------+---------+------+-------------------------------+---------------+
| id | estRows | task | access object | operator info |
+-------------+---------+------+-------------------------------+---------------+
| Point_Get_5 | 1.00 | root | table:t1, index:idx_unique(a) | |
+-------------+---------+------+-------------------------------+---------------+
1 row in set (0.00 sec)
8、查询null 是否会触发点查
mysql> insert into t1 values (null,10000000);
Query OK, 1 row affected (0.01 sec)
m
ysql> explain select * from t1 where a = 88;
+-------------+---------+------+-------------------------------+---------------+
| id | estRows | task | access object | operator info |
+-------------+---------+------+-------------------------------+---------------+
| Point_Get_1 | 1.00 | root | table:t1, index:idx_unique(a) | |
+-------------+---------+------+-------------------------------+---------------+
1 row in set (0.00 sec)
mysql> explain select * from t1 where a is null;
+-------------------------------+---------+-----------+-------------------------------+-------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+-------------------------------+-------------------------------------+
| IndexLookUp_10 | 1.00 | root | | |
| ├─IndexRangeScan_8(Build) | 1.00 | cop[tikv] | table:t1, index:idx_unique(a) | range:[NULL,NULL], keep order:false |
| └─TableRowIDScan_9(Probe) | 1.00 | cop[tikv] | table:t1 | keep order:false |
+-------------------------------+---------+-----------+-------------------------------+-------------------------------------+
3 rows in set (0.00 sec)