mysql8 sql优化相关（持续更新...）

置顶 xuejianxinokok

已于 2023-10-20 09:31:05 修改

阅读量947

点赞数

分类专栏：数据库文章标签： mysql

于 2021-06-30 10:53:55 首次发布

本文链接：https://blog.csdn.net/xuejianxinokok/article/details/118354964

版权

数据库专栏收录该内容

57 篇文章 3 订阅

订阅专栏

能明确知道哪里会慢，为什么会慢

数据库全局优化 mysql8 系统优化知识点总结_xuejianxinokok的专栏-CSDN博客

优化的本质是减少IO,减少随机IO，减少比较和排序(费cpu)

为什么要减少随机IO?

主流的机械硬盘基本上都是 7200 转的 SATA 硬盘，在全速运转并且是顺序读写的情况下，性能也就是 150MB～160MB/s 左右；如果涉及到数据库读写等随机性较强的 IO 操作，这个性能还要再下降。
传统的机械硬盘在读写数据的时候，有三个步骤：
寻道：磁头定位到数据所在的磁道上，这是一个非常耗时的操作。目前 SATA 硬盘的平均寻道时间一般为 8~10 ms。
旋转延迟：到了特定的磁道之后，硬盘盘片需要旋转，以便要读写的扇区能够和磁头吻合。7200 转的硬盘，旋转一周的时间为 1000/120=8.33 毫秒，平均旋转延时则为 4.17 毫秒左右。
数据传输：磁头定位到需要读取的扇区之后，开始传输数据。假设硬盘顺序读取的平均速度为 150MB/s，则读取 4KB 的数据需要 4KB*1000/(150*1024KB)=0.026 毫秒，此值几乎可以忽略不计。读取 1MB 的数据需要 1MB*1000/(150MB)=6.67 毫秒，可见读取大块数据时，其真正用来读取数据的时间一般也小于寻道时间和旋转延时的和。
上面三个步骤所耗费的时间，也就是数据读取的时间。

1. 关于主键

mysql8 innodb 表必定有主键，第一种方法是自己显示定义

如果没有显示定义，mysql 自动在表中添加

DB_ROW_ID

（全局共享递增）

可以用以下sql 查看隐藏列

show extended columns from `表名`

第一种观点不能用UUID 做主键

UUIDs are Popular, but Bad for Performance — Let’s Discuss - Percona Database Performance Blog UUIDs are Popular, but Bad for Performance — Let’s Discuss技术译文 | UUID 很火但性能不佳？今天我们细聊一聊这篇文章是上一篇文章的翻译

原因:

1. innodb 聚簇索引需要排序导致页频繁拆分，这个是主要原因

2. UUID 太长，对于二级索引来说占用空间太大

3. UUID 是字符串，查询时用字符串比较效率低于整型

第二种观点可用做主键

1. 替代方案 UUID_SHORT()

2. 参考文章

The UUID in MySQL8 - 简书 The UUID in MySQL8

uuid bin mysql_The UUID in MySQL8_weixin_39531229的博客-CSDN博客 uuid 能当主键吗？

打算使用UUID，您应该阅读MySQL8.0中UUID的支持，推荐用binary(16) 存储UUID

CREATE TABLE t (id binary(16) PRIMARY KEY);
INSERT INTO t VALUES(UUID_TO_BIN(UUID()));
使用uuid_to_bin() 可能会改变MySQL的UUID实现的顺序行为

UUID是由32位16进制字符串组成(不算分隔符'-')如：

62ab1547-710f-11e8-9a58-5254007205d6

如果直接保存，则需要32个字符，utf8编码下占用96个字节，对于主键来说还是太长。
幸运的是UUID中的每个字符都是16进制字符，两个16进制字符占用一个字节，
这样可以轻松将UUID转换为binary(16),占用16个字节，所需空间大大减少，而且二进制字符串检索对比效率很高。

An option in MySQL 8 is to use the UUID_TO_BIN()
function with a second argument set to 1 which will make MySQL swap the
first and third groups of hexadecimal digits.

一个至关重要的问题是UUID的组成中将timestamp 部分的低位时间段(如毫秒)放在了前面，高位时间段(如年月日)放在了后面，
这会导致前面的字符变化很快，后面的变化很慢，从而使产生的UUID不能顺序自增。这会导致索引插入效率大大降低。

为解决这一问题，mysql8提供了两个函数：UID_TO_BIN(arg1) / BIN_TO_UUID(arg1,arg2)

UID_TO_BIN(arg1) 将UUID转化为16位二进制字符串，如果参数arg1为true则将UUID中的timestamp部分中的time-low(第一段字符)和time-high(第三段)调换，这样产生的UUID是顺序递增。

BIN_TO_UUID(arg1,arg2)将16位进制字符串转化为可读的UUID，arg1为16位二进制字符串，如果arg2省略或为false,即将二进制字符串原位转换；如果arg2为true，则将原来调换的time-low和time-high再调换回去，返回原本的uuid.

测试

创建一张表如下：

mysql8[test]>create table t (id varbinary(16) primary key,create_time timestamp default current_timestamp());

Query OK, 0 rows affected (0.34 sec)

插入几条数据，注意使用了函数uuid_to_bin：

mysql8[test]>insert into t (id)values(uuid_to_bin(uuid(),true));

Query OK, 1 row affected (0.08 sec)

mysql8[test]>insert into t (id)values(uuid_to_bin(uuid(),true));

Query OK, 1 row affected (0.08 sec)

mysql8[test]>insert into t (id)values(uuid_to_bin(uuid(),true));

Query OK, 1 row affected (0.07 sec)

查看结果：

mysql8[test]>select bin_to_uuid(id) id1,bin_to_uuid(id,true) id2, create_time from t;

+--------------------------------------+--------------------------------------+---------------------+

| id1 | id2 | create_time |

+--------------------------------------+--------------------------------------+---------------------+

| 11e87113-f079-024e-8405-5254004332fa | f079024e-7113-11e8-8405-5254004332fa | 2018-06-16 11:18:28 |

| 11e87113-f826-4134-8405-5254004332fa | f8264134-7113-11e8-8405-5254004332fa | 2018-06-16 11:18:41 |

| 11e87113-f88c-c8a6-8405-5254004332fa | f88cc8a6-7113-11e8-8405-5254004332fa | 2018-06-16 11:18:42 |

+--------------------------------------+--------------------------------------+---------------------+

3 rows in set (0.00 sec)

注意字段id1使用了函数bin_to_uuid(id)，而id2使用了bin_to_uuid(id,true)，注意他们在结果集中的区别。

如果需要按主键查询，还是需要使用对应的uuid函数：

mysql8[test]>select * from t where id=uuid_to_bin('f079024e-7113-11e8-8405-5254004332fa',true);

+------------------+---------------------+

| id | create_time |

+------------------+---------------------+

| 篓󷁎RT C2򞞠 | 2018-06-16 11:18:28 |

+------------------+---------------------+

1 row in set (0.00 sec)

总结

MySQL8通过提供两个内置函数解决了传统UUID作为主键的缺陷，也使UUID成为了设计主键的首选，特别是在数据分片的架构中，其优势十分突出。

实验

select
'5858e872-d4c0-11eb-a639-c81f66e5e38f' uuid
, uuid_to_bin('5858e872-d4c0-11eb-a639-c81f66e5e38f',true) bin
, bin_to_uuid(uuid_to_bin('5858e872-d4c0-11eb-a639-c81f66e5e38f',true),true) uuid1

bigint也要占用8个字节！考虑到其在分表领域的巨大优势，16个字节完全可接受

2. 关于索引

查字典有两种方法 1. 直接从头到尾找 (全表扫描) 2. 利用拼音索引或部首索引

注意这2点在查看explain format=tree 格式的时候能让你快速识别sql 执行是从哪里开始

聚簇索引

非聚簇索引

B+TREE 索引的三大性质：有序性，非空性，连续性

有序性：在order by,group by 的时候有序性可以减少排序，也就是说索引不仅仅在表连接和where 条件时使用，在order by 和group by 中也可以用到。

连续性：在索引覆盖且索引全扫描的时候连续性保证索引读取比较快

非空性：由于B+TREE索引条目不保存null 条目,如果在建立函数索引的时候利用这个性质可以大大提高查询效率，一般对于表示状态字段查询有奇效，因为它在索引中已经排除了大部分数据

从图上可以看出二级索引范围扫描会导致随机IO 增加，MRR 优化技术可以降低二级索引范围扫描的随机IO，原理是扫描二级索引后按ID 排序后，按排序后的ID 访问表

图片来源

Innodb 中的 Btree 实现 (一) · 引言 & insert 篇

#函数索引主要要用2层括号否则会报语法错误
ALTER TABLE demo ADD INDEX i_demo_name ( ( case name when 1 then 1 else null end ) );
#函数索引使用隐藏列来实现,所以在创建函数索引的时候会重构表，大表需谨慎操作
show extended columns from demo;

#查询的时候查询条件和建立索引条件要一样，可以从下图看到执行计划 key 列中使用了 i_demo_name 索引

SELECT * FROM demo where( case name when 1 then 1 else null end ) =1;

函数索引一般不建议加，因为一般的带函数查询都能够改写为不带函数的查询，也就是一般的函数都有反函数

但是以下json 函数可能就无法改写了

ADD INDEX idx_str2 ( ( CAST( str2 ->> '$.name' AS CHAR ( 9 ) ) ) )

参考文章

新特性解读 | MySQL 8.0 索引特性1-函数索引 MySQL 8.0 索引特性1-函数索引

新特性解读 | MySQL 8.0 索引特性2-索引跳跃扫描 MySQL 8.0 索引特性2-索引跳跃扫描

3. sql执行计划

表的连接方式 1.嵌套循环 2. HASH 连接 3. 排序合并

----------------------------嵌套循环------------------------------------

普通嵌套循环

SELECT CountryCode, country.Name AS Country,
city.Name AS City, city.District
FROM world.country
INNER JOIN world.city
ON city.CountryCode = country.Code
WHERE Continent = 'Asia';

嵌套循环伪代码

result = [] #保存结果集
for country_row in country: #3 table scan on country
if country_row.Continent == 'Asia': #2 filter 用 Continent 过滤数据
for city_row in city.CountryCode['country_row.Code']:#4 index lookup city 用CountryCode索引
result.append(join_rows(country_row, city_row))

普通嵌套循环结论和使用场景:

连接思想是每次获取驱动表一条数据，然后取被连接表上查询符合条件的数，

适用于驱动表有较小的数据量，否则会导致被连接表上的大量索引查询造成大量随机读取

被连接表的连接列上必须有索引，否则会引起多次全表扫描，引起大量IO

Block Nested Loop

由于普通嵌套循环连接在被连接表上没有索引存在时会引起多次全表扫描，必须降低被连接表的全表扫描次数。 BNL 使用 join buffer 来解决这个问题，所以合理设置 join buffer 很重要

每次join_buffer 满的时候，被连接表就会来一次全表扫描，为了减少全表扫描次数，适当调大join buffer 是必要的

Prior to MySQL 8.0.18, this algorithm was applied for equi-joins when no indexes could be used; in MySQL 8.0.18 and later, the hash join optimization is employed in such cases. Starting with MySQL 8.0.20, the block nested loop is no longer used by MySQL, and a hash join is employed for in all cases where the block nested loop was used previously.

参考文档 MySQL :: MySQL 8.0 Reference Manual :: 8.2.1.7 Nested-Loop Join Algorithms

这个算法从8.0.20 已经废掉了用hash join 代替

----------------------------HASH 连接------------------------------------

MySQL :: MySQL 8.0 Reference Manual :: 8.3.9 Comparison of B-Tree and Hash Indexes btree和hash索引适用场景

SELECT
CountryCode, country.Name AS Country,
city.Name AS City, city.District
FROM world.country IGNORE INDEX (Primary)
INNER JOIN world.city IGNORE INDEX (CountryCode)
ON city.CountryCode = country.Code
WHERE Continent = 'Asia';

hash join 官方文档 MySQL :: MySQL 8.0 Reference Manual :: 8.2.1.4 Hash Join Optimization

hash join原理: MySQL8.0 新特性 Hash Join - 天士梦 - 博客园

Beginning with MySQL 8.0.18, MySQL employs a hash join for any query for which each join has an equi-join condition, and in which there are no indexes that can be applied to any join conditions

需要满足3个条件

1. 版本>=8.0.18

2. 等值连接

3. 无索引可用

一般用于大表连接，这时候数据比较多，嵌套循环连接如果使用索引的话会产生很多随机读，不如直接 hash join 来的痛快。

join_buffer_size 会影响hash join 的效率，太小的话数据会溢出到磁盘

join_buffer_size The default is 256KB（262144 字节）. The maximum permissible setting for join_buffer_size is 4GB−1. Larger values are permitted for 64-bit platforms (except 64-bit Windows, for which large values are truncated to 4GB−1 with a warning).

可以在hint 中设置，比如设置为1M=1024*1024=1048576 字节

SELECT /*+ SET_VAR(join_buffer_size = 1048576) */ COUNT(0) FROM T

注意 HASH_JOIN or NO_HASH_JOIN 这连个hint 只在8.0.18 版本中起作用

>=8.0.19 使用 BNL and NO_BNL 干预执行计划是否使用hash join

----------------------------排序合并------------------------------------

在连接列和被连接列都有索引的情况下，可以使用排序合并连接，oracle 数据库有排序合并连接

mysql 目前不太清除是否支持

官方文档 MySQL :: MySQL 8.0 Reference Manual :: 8 Optimization

执行计划的入门文章
https://mysqlserverteam.com/mysql-explain-analyze/ MySQL EXPLAIN ANALYZE 英文原文
MySQL EXPLAIN ANALYZE_老叶茶馆-CSDN博客 MySQL EXPLAIN ANALYZE 翻译

MySQL :: MySQL 8.0 Reference Manual :: 13.8.2 EXPLAIN Statement EXPLAIN ANALYZE mysql 官方文档

MySQL 8.0.18 introduces EXPLAIN ANALYZE, which runs a statement and produces EXPLAIN output along with timing and additional, iterator-based, information about how the optimizer's expectations matched the actual execution. For each iterator, the following information is provided:

Estimated execution cost

(Some iterators are not accounted for by the cost model, and so are not included in the estimate.)
Estimated number of returned rows
Time to return first row
Time to return all rows (actual cost), in milliseconds

(When there are multiple loops, this figure shows the average time per loop.)
Number of rows returned by the iterator
Number of loops

The query execution information is displayed using the TREE output format, in which nodes represent iterators. EXPLAIN ANALYZE always uses the TREE output format. In MySQL 8.0.21 and later, this can optionally be specified explicitly using FORMAT=TREE; formats other than TREE remain unsupported.

EXPLAIN ANALYZE can be used with SELECT statements, as well as with multi-table UPDATE and DELETE statements. Beginning with MySQL 8.0.19, it can also be used with TABLE statements.

执行计划hints

MySQL :: MySQL 8.0 Reference Manual :: 8.9.3 Optimizer Hints

一定要理解hint 的意义，否则就是盲目添加

Optimizer hints apply at different scope levels: 不同级别的hint

Global: The hint affects the entire statement 全局hint 影响整个语句
Query block: The hint affects a particular query block within a statement query block hint 影响一个指定的查询块
Table-level: The hint affects a particular table within a query block
Index-level: The hint affects a particular index within a table

Hint Name	Description	Applicable Scopes
BKA, NO_BKA	Affects Batched Key Access join processing 参考 MySQL :: MySQL 8.0 Reference Manual :: 8.2.1.12 Block Nested-Loop and Batched Key Access Joins 用于优化嵌套循环表连接 MySQL implements a method of joining tables called the Batched Key Access (BKA) join algorithm. BKA can be applied when there is an index access to the table produced by the second join operand. Like the BNL join algorithm, the BKA join algorithm employs a join buffer to accumulate the interesting columns of the rows produced by the first operand of the join operation. Then the BKA algorithm builds keys to access the table to be joined for all rows in the buffer and submits these keys in a batch to the database engine for index lookups. The keys are submitted to the engine through the Multi-Range Read (MRR) interface (see Section 8.2.1.11, “Multi-Range Read Optimization”). After submission of the keys, the MRR engine functions perform lookups in the index in an optimal way, fetching the rows of the joined table found by these keys, and starts feeding the BKA join algorithm with matching rows. Each matching row is coupled with a reference to a row in the join buffer.	Query block, table
BNL, NO_BNL	Prior to MySQL 8.0.20: affects Block Nested-Loop join processing; MySQL 8.0.18 and later: also affects hash join optimization; MySQL 8.0.20 and later: affects hash join optimization only 如果强制查询走hash join 需要满足两个条件 1. 忽略相关索引 IGNORE INDEX (索引) *2.添加 /+** `NO_BNL()` */	Query block, table
DERIVED_CONDITION_PUSHDOWN, NO_DERIVED_CONDITION_PUSHDOWN	Use or ignore the derived condition pushdown optimization for materialized derived tables (Added in MySQL 8.0.22)	Query block, table
GROUP_INDEX, NO_GROUP_INDEX	Use or ignore the specified index or indexes for index scans in `GROUP BY` operations (Added in MySQL 8.0.20)	Index
HASH_JOIN, NO_HASH_JOIN	Affects Hash Join optimization (MySQL 8.0.18 only ，已经废弃	Query block, table
INDEX, NO_INDEX	Acts as the combination of JOIN_INDEX, GROUP_INDEX, and ORDER_INDEX, or as the combination of NO_JOIN_INDEX, NO_GROUP_INDEX, and NO_ORDER_INDEX (Added in MySQL 8.0.20)	Index
INDEX_MERGE, NO_INDEX_MERGE	Affects Index Merge optimization 同时使用多个二级索引	Table, index
JOIN_FIXED_ORDER	Use table order specified in `FROM` clause for join order 参考聊聊 Mysql 8.0 hint 之上篇 - 墨天轮	Query block
JOIN_INDEX, NO_JOIN_INDEX	Use or ignore the specified index or indexes for any access method (Added in MySQL 8.0.20)	Index
JOIN_ORDER	Use table order specified in hint for join order 表连接顺序设置	Query block
JOIN_PREFIX	Use table order specified in hint for first tables of join order	Query block
JOIN_SUFFIX	Use table order specified in hint for last tables of join order	Query block
MAX_EXECUTION_TIME	Limits statement execution time	Global
MERGE, NO_MERGE	Affects derived table/view merging into outer query block 把衍生表合并到外层查询 The MERGE and NO_MERGE optimizer hints can be applied to CTEs. Each CTE reference in the top-level statement can have its own hint, permitting CTE references to be selectively merged or materialized. The following statement uses hints to indicate that `cte1` should be merged and `cte2` should be materialized: 这里非常重要原理可以参考 Derived Table Merge Optimization - MariaDB Knowledge Base	Table
MRR, NO_MRR	Affects Multi-Range Read optimization 参考 MySQL :: MySQL 8.0 Reference Manual :: 8.2.1.11 Multi-Range Read Optimization 大概原理: 在二级索引范围扫描时可以按主键排序后回表查询降低二级索引的随机访问，增加缓存的命中率	Table, index
NO_ICP	无论是 MySQL 还是 Oracle，或者是 SQL Server，其实都有这么两层存在，一个是 Server 层(实例层包括内存结构和后台进程)，另一个是存储引擎层。其实也很好理解，可以类比一下我们 Windows 操作系统，比如常说的把 D 盘格式化成 NTFS 或者 FAT32，这个 NTFS 或者 FAT32 就可以理解成数据库中的存储引擎。 Affects Index Condition Pushdown optimization 参考 MySQL :: MySQL 8.0 Reference Manual :: 8.2.1.6 Index Condition Pushdown Optimization 针对扫描二级索引的一项优化改进。总的来说是通过把索引过滤条件下推到存储引擎，来减少 MySQL 存储引擎访问基表的次数以及 MySQL 服务层访问存储引擎的次数。任何需要下推到底层存储层的操作一般都有诸多限制，MySQL ICP 也不例外，ICP 限制如下： 1. ICP 尤其是对联合索引的部分列模糊查找非常有效。 Suppose that a table contains information about people and their addresses and that the table has an index defined as `INDEX (zipcode, lastname, firstname)`. If we know a person's `zipcode` value but are not sure about the last name, we can search like this: 2. ICP 的目标是减少全行记录读取(不用读取全行记录进行过滤直接在索引中过滤)，从而减少 I/O 操作，仅用于二级索引。主键索引本身即是表数据，不存在下推操作。举个例子来说 ICP 就是 Service 调用Dao 层获取数据，数据在Dao层过滤 NO_CIP 就是 Dao层返回所有数据，数据在 Service层用if 过滤 3. EXPLAIN output shows `Using index condition` in the `Extra` column when Index Condition Pushdown is used. It does not show `Using index` because that does not apply when full table rows must be read. 使用二级索引查找数据时，where 子句中属于索引的一部分但又无法使用索引的条件，MySQL会把这部分条件下推到存储引擎层，筛选之后再进行回表，这样回表的次数就减少了。 SQL优化:ICP的缺陷 - 简书什么是ICP（Index Condition Pushdown） ICP全称 Index Condition Pushdown，也就是常说的索引条件下推，在之前的一篇文章中介...https://www.jianshu.com/p/a6aaf604d0a5	Table, index
NO_RANGE_OPTIMIZATION	Affects range optimization	Table, index
ORDER_INDEX, NO_ORDER_INDEX	Use or ignore the specified index or indexes for sorting rows (Added in MySQL 8.0.20)	Index
QB_NAME	Assigns name to query block	Query block
RESOURCE_GROUP	Set resource group during statement execution	Global
SEMIJOIN, NO_SEMIJOIN	Affects semijoin strategies; beginning with MySQL 8.0.17, this also applies to antijoins --利用去重策略 SEMIJOIN(DUPSWEEDOUT), NO_SEMIJOIN(DUPSWEEDOUT)	Query block
SKIP_SCAN, NO_SKIP_SCAN	Affects Skip Scan optimization	Table, index
SET_VAR	Set variable during statement execution	Global
SUBQUERY	Affects materialization, `IN`-to-`EXISTS` subquery stratgies	Query block

使用hint 注意事项：

1. hint 格式 /*+ 空格 hint内容空格 */

Optimizer hints must be specified within /*+ ... */ comments

/*+ BKA(t1) */

/*+ BNL(t1, t2) */

/*+ NO_RANGE_OPTIMIZATION(t4 PRIMARY) */

/*+ QB_NAME(qb2) */

2. hint 的位置必须紧紧跟随在 SELECT, UPDATE, INSERT, REPLACE, and DELETE 关键字后边，

不能写在 from 前边或者字段的中间

The parser recognizes optimizer hint comments after the initial keyword of SELECT, UPDATE, INSERT, REPLACE, and DELETE statements. 例如:

3. 多个hint 怎么写

4. Naming Query Blocks 用于在全局hints 引用内部 query block

比如以下查询

用命名 query block 表示为

query block 相互引用 @查询块名

或者以如下格式表名@queryblockName

5. 一些例子

3. 子查询优化

先参考 Semi-join Subquery Optimizations - MariaDB Knowledge Base 这篇文章了解一下什么是半连接子查询

通俗的理解 semi-join（半连接）和join (连接) ：

join (连接) 是把所有匹配行添加到结果集，会有重复，而 semi-join（半连接）只会把任意且仅仅一个匹配行添加到结果集，相对于 join (连接) 来说产生的连接结果是不全的，此之所以称之为半连接

与子查询相关的hint 有 MERGE, NO_MERGE 和 SEMIJOIN, NO_SEMIJOIN

在这篇文章 Subquery Optimizations Map - MariaDB Knowledge Base 有子查询优化策略的详细解释

以下是 子查询优化策略共5种，其中第一种是最常用的

If a subquery meets the preceding criteria, MySQL converts it to a semijoin (or, in MySQL 8.0.17 or later, an antijoin if applicable) and makes a cost-based choice from these strategies:

1. Convert the subquery to a join, or use table pullout and run the query as an inner join between subquery tables and outer tables. Table pullout pulls a table out from the subquery to the outer query. 转化一个subquery 为一个join ,使用 table pullout（把子查询中的表拉出到外部）然后以 inner join 连接子查询表和外部表，参考文档 Table Pullout Optimization - MariaDB Knowledge Base

参考下边的例子

explain
# format =tree
select *
from city
where city.CountryCode in (select country.Code
from country
where country.Population < 100*1000);

用 show warnings 查看改写后的sql

from `world`.`country` join `world`.`city` where (

(`world`.`city`.`CountryCode` = `world`.`country`.`Code`)

and

(`world`.`country`.`Population` < <cache>((100 * 1000)))

)

2. Duplicate Weedout: Run the semijoin as if it was a join and remove duplicate records using a temporary table. 重复剔除策略:1. 先按普通 join 方式执行，2.把连接结果保存到零时表中然后 Weedout（剔除） Duplicate （重复）。判断是否用了duplicate weedout 看
```
在explain 中 Extra: 是否含有 Using  temporary    
```
参考 DuplicateWeedout Strategy - MariaDB Knowledge Base
SEMIJOIN(DUPSWEEDOUT), NO_SEMIJOIN(DUPSWEEDOUT)
3. FirstMatch: When scanning the inner tables for row combinations and there are multiple instances of a given value group, choose one rather than returning them all. This "shortcuts" scanning and eliminates production of unnecessary rows. 参考 FirstMatch Strategy - MariaDB Knowledge Base
SEMIJOIN(FIRSTMATCH), NO_SEMIJOIN(FIRSTMATCH)
4. LooseScan: Scan a subquery table using an index that enables a single value to be chosen from each subquery's value group. 主要思想是扫描(scan) 子查询连接列上的索引,由于不是全扫描该索引，只是获取相同索引值的其中一个所以叫 loose scan 参考LooseScan Strategy - MariaDB Knowledge Base
5. Materialize the subquery into an indexed temporary table that is used to perform a join, where the index is used to remove duplicates. The index might also be used later for lookups when joining the temporary table with the outer tables; if not, the table is scanned. For more information about materialization, see Section 8.2.2.2, “Optimizing Subqueries with Materialization”. 参考 Semi-join Materialization Strategy - MariaDB Knowledge Base

反连接

反连接基本和半连接是一样的，只是半连接在子查询和外表连接条件匹配时放到结果集中，连接条件不匹配时丢弃。

反连接是在子查询和外表连接条件匹配时把记录丢弃，不匹配时添加到结果集中。

注意匹配的情况比较好判断不存在null=null 这种情况，而反连接不匹配需要处理 null=null 这种情况，所以在not in 和not EXISTS 要考虑字段是否为null 的特殊情况。

先看一个例子

explain
# format =tree
select *
from city
where city.CountryCode not in (select country.Code
from country
where country.Population < 100*1000);

show warnings 输出

/* select#1 */ select `world`.`city`.`ID` AS `ID`,`world`.`city`.`Name` AS `Name`
,`world`.`city`.`CountryCode` AS `CountryCode`,`world`.`city`.`District` AS `District`
,`world`.`city`.`Population` AS `Population`
from `world`.`city` anti join (`world`.`country`)
on(
(
(`<subquery2>`.`Code` = `world`.`city`.`CountryCode`)
and (`world`.`country`.`Population` < <cache>((100 * 1000)))
)
) where true

Beginning with MySQL 8.0.17, the following subqueries are transformed into antijoins:

NOT IN (SELECT ... FROM ...)
NOT EXISTS (SELECT ... FROM ...).
IN (SELECT ... FROM ...) IS NOT TRUE
EXISTS (SELECT ... FROM ...) IS NOT TRUE.
IN (SELECT ... FROM ...) IS FALSE
EXISTS (SELECT ... FROM ...) IS FALSE.

In short, any negation of a subquery of the form IN (SELECT ... FROM ...) or EXISTS (SELECT ... FROM ...) is transformed into an antijoin.

参考文档

MySQL :: MySQL 8.0 Reference Manual :: 8.2.2.1 Optimizing IN and EXISTS Subquery Predicates with Semijoin Transformations 官方文档

技术分享 | MySQL 子查询优化 MySQL 子查询优化

技术分享 | Semi-join Materialization 子查询优化策略 Semi-join Materialization 子查询优化策略

4. 如何优化一个sql

1. 工具：结合 explain , EXPLAIN ANALYZE, show warnings 定位sql 为什么慢，主要关心全表扫描，表连接顺序

特别关注 EXPLAIN ANALYZE 中实际执行时间较长的步骤，或者嵌套循环中的loops 比较多的子步骤

2. 方法：改写sql或用hints 更改执行计划, hash_join是解决大表连接性能的利器，要合理利用

5. insert update delete等非查询

1. 减少大批量插入的redo

#全局关闭redo

ALTER INSTANCE DISABLE INNODB REDO_LOG;

#全局开启redo

ALTER INSTANCE ENABLE INNODB REDO_LOG;

技术分享 | MySQL 8.0.21 Disable Redo Log 性能测试 MySQL 8.0.21 Disable Redo Log 性能测试

2. 优化插入

To optimize insert speed, combine many small operations into a single large operation. Ideally, you make a single connection, send the data for many new rows at once, and delay all index updates and consistency checking until the very end.

为什么要合并很多小操作为一个大操作？

The time required for inserting a row is determined by the following factors, where the numbers indicate approximate proportions:

Connecting: (3)
Sending query to server: (2)
Parsing query: (2)
Inserting row: (1 × size of row)
Inserting indexes: (1 × number of indexes)
Closing: (1)

Use the multiple-row INSERT syntax to reduce communication overhead between the client and the server if you need to insert many rows:

INSERT INTO yourtable VALUES (1,2), (5,5), ...;

所以使用 insert 多values 语句可以合并多个插入，这样可以减少以上步骤执行次数而加快数据

新特性解读 | MySQL 8.0 在线调整 REDO作者：杨涛涛资深数据库专家，专研 MySQL 十余年。擅长 MySQL、PostgreSQL、MongoDB 等开源数据库相关的备份恢复、SQL 调优、监控运维、高可用架构设计等。目前任职于爱可生，为各大运营商及银行金融企业提供 MySQL 相关技术支持、MySQL 相关课程培训等工作。本文来源：原创投稿 *爱可生开源社区出品，原创...https://opensource.actionsky.com/20220802-mysql8-0/

参考文档

MySQL :: MySQL 8.0 Reference Manual :: 8.2.5.1 Optimizing INSERT Statements 官方insert 优化

3.load file

When loading a table from a text file, use LOAD DATA. This is usually 20 times faster than using INSERT statements

从一个文本文件装载一个表时，使用LOAD DATA INFILE，通常比INSERT语句快20倍

When performing bulk inserts, it is faster to insert rows in PRIMARY KEY order. InnoDB tables use a clustered index, which makes it relatively fast to use data in the order of the PRIMARY KEY. Performing bulk inserts in PRIMARY KEY order is particularly important for tables that do not fit entirely within the buffer pool.

导入的数据按照主键的顺序保存：这是因为InnoDB引擎表示按照主键顺序保存的，如果能将插入的数据提前按照排序好自然能省去很多时间。

最好新建一个空表，表上只有主键，没有其他唯一索引或验证，这样能加快速度

4. 利用跨表更新使用主键连接来执行更新语句

UPDATE items,month SET items.price=month.price WHERE items.id=month.id;

使用id 作为连接条件来更新相关表

5.分区裁剪

利用时间分区来trucate或drop 一个分区来删除大量历史数据，这是删除数据最快的方式

参考 mysql8 分区裁剪和锁_xuejianxinokok的专栏-CSDN博客

6.表空间导出和导入

这是除了冷备整个数据库以外迁移大量数据最快的方式

6. 事物和锁

MySQL :: MySQL 8.0 Reference Manual :: 15.7.1 InnoDB Lockinghttps://dev.mysql.com/doc/refman/8.0/en/innodb-locking.html

MySQL :: MySQL 8.0 Reference Manual :: 15.15.2.1 Using InnoDB Transaction and Locking Informationhttps://dev.mysql.com/doc/refman/8.0/en/innodb-information-schema-examples.html

第38问：分区表到底要上多少锁https://opensource.actionsky.com/20210606-mysql/

The intention locking protocol is as follows:

Before a transaction can acquire a shared lock on a row in a table, it must first acquire an IS lock or stronger on the table.
Before a transaction can acquire an exclusive lock on a row in a table, it must first acquire an IX lock on the table

Record locks always lock index records, even if a table is defined with no indexes. For such cases, InnoDB creates a hidden clustered index and uses this index for record locking.

Gap locking can be disabled explicitly. This occurs if you change the transaction isolation level to READ COMMITTED. In this case, gap locking is disabled for searches and index scans and is used only for foreign-key constraint checking and duplicate-key checking.

There are also other effects of using the READ COMMITTED isolation level. Record locks for nonmatching rows are released after MySQL has evaluated the WHERE condition. For UPDATE statements, InnoDB does a “semi-consistent” read, such that it returns the latest committed version to MySQL so that MySQL can determine whether the row matches the WHERE condition of the UPDATE.

参考 mysql8 表空间导出和导入_xuejianxinokok的专栏-CSDN博客_mysql 导出表空间

索引下推。准确来说，应该叫做索引条件下推（Index Condition Pushdown，ICP），就是过滤的动作由下层的存储引擎层通过使用索引来完成，而不需要上推到Server层进行处理。

xuejianxinokok

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
mysql8 sql优化相关（持续更新...）

能明确知道哪里会慢，为什么会慢优化的本质是减少IO,减少随机IO，减少比较和排序(费cpu)1. 关于主键mysql8 innodb 表必定有主键，第一种方法是自己显示定义如果没有显示定义，mysql 自动在表中添加DB_ROW_ID （全局共享递增）可以用以下sql 查看隐藏列show extended columns from `表名`第一种观点不能用UUID 做主键 https://www.percona.com/blog/2...
复制链接

扫一扫

专栏目录