MySQL-5.7-8.2.1.14 ORDER BY Optimization

This section describes when MySQL can use an index to satisfy an ORDER BY clause, the filesort operation used when an index cannot be used, and execution plan information available from the optimizer about ORDER BY.

本节描述MySQL在什么情况下可以使用索引来满足ORDER BY子句,不能使用索引时使用的filesort 操作,以及ORDER BY优化器提供的执行计划信息。

An ORDER BY with and without LIMIT may return rows in different orders, as discussed in Section 8.2.1.17, “LIMIT Query Optimization”.

带或不带LIMIT的ORDER BY可能返回不同顺序的行。

Use of Indexes to Satisfy ORDER BY

使用索引来满足ORDER BY

In some cases, MySQL may use an index to satisfy an ORDER BY clause and avoid the extra sorting involved in performing a filesort operation.

在某些情况下,MySQL可能会使用索引来满足ORDER BY子句,从而避免执行filesort 操作时涉及的额外排序。

The index may also be used even if the ORDER BY does not match the index exactly, as long as all unused portions of the index and all extra ORDER BY columns are constants in the WHERE clause.

即使ORDER BY与索引不完全匹配,也可以使用索引,只要索引的所有未使用部分和所有额外的ORDER BY列都是WHERE子句中的常量。

If the index does not contain all columns accessed by the query, the index is used only if index access is cheaper than other access methods.

如果索引不包含查询访问的所有列,则只有在索引访问比其他访问方法成本更低的情况下才使用索引。

Assuming that there is an index on (key_part1key_part2), the following queries may use the index to resolve the ORDER BY part. Whether the optimizer actually does so depends on whether reading the index is more efficient than a table scan if columns not in the index must also be read.

假设在(key_part1, key_part2)上有一个索引,下面的查询可以使用这个索引来解析ORDER BY部分。优化器是否真的这样做取决于如果也必须读取不在索引中的列,那么读取索引是否比表扫描更有效。

In this query, the index on (key_part1key_part2) enables the optimizer to avoid sorting:

这个查询中,索引(key_part1, key_part2)允许优化器避免排序:

However, the query uses SELECT *, which may select more columns than key_part1 and key_part2. In that case, scanning an entire index and looking up table rows to find columns not in the index may be more expensive than scanning the table and sorting the results.

但是,查询使用SELECT *,它可能选择比key_part1和key_part2更多的列。在这种情况下,扫描整个索引并通过查找表行来查找不在索引中的列可能比扫描表并对结果排序更昂贵。

If so, the optimizer is not likely to use the index. If SELECT * selects only the index columns, the index is used and sorting avoided.

如果是这样,优化器不太可能使用索引。如果SELECT *只选择索引列,则使用索引,避免排序。

If t1 is an InnoDB table, the table primary key is implicitly part of the index, and the index can be used to resolve the ORDER BY for this query:

如果t1是一个InnoDB表,表的主键是索引的隐式部分,索引可以用来解析这个查询的ORDER BY:

In this query, key_part1 is constant, so all rows accessed through the index are in key_part2 order, and an index on (key_part1key_part2) avoids sorting if the WHERE clause is selective enough to make an index range scan cheaper than a table scan: 

在这个查询中,key_part1是常量,因此通过索引访问的所有行都是key_part2的顺序,并且如果WHERE子句具有足够的选择性,使得索引范围扫描比表扫描更便宜,则在(key_part1, key_part2)上的索引可以避免排序:

In the next two queries, whether the index is used is similar to the same queries without DESC shown previously:

在接下来的两个查询中,是否使用索引类似于没有使用DESC的相同查询:

In the next two queries, key_part1 is compared to a constant. The index is used if the WHERE clause is selective enough to make an index range scan cheaper than a table scan:

 在接下来的两个查询中,将key_part1与一个常量进行比较。如果WHERE子句具有足够的选择性,使得索引范围扫描比表扫描更便宜,则使用索引:

In the next query, the ORDER BY does not name key_part1, but all rows selected have a constant key_part1 value, so the index can still be used:

在下一个查询中,ORDER BY没有命名key_part1,但是所有选中的行都有一个固定的key_part1值,所以索引仍然可以使用:

In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it may still use indexes to find the rows that match the WHERE clause. Examples:

在某些情况下,MySQL不能使用索引来解析ORDER BY,尽管它仍然可以使用索引来查找匹配WHERE子句的行。例子:

The query uses ORDER BY on different indexes:

该查询在不同的索引上使用ORDER BY:

 

The query uses ORDER BY on nonconsecutive parts of an index:

该查询对索引的非连续部分使用ORDER BY:

The query mixes ASC and DESC:

查询混合了ASC和DESC:

The index used to fetch the rows differs from the one used in the ORDER BY:

用于获取行的索引不同于ORDER BY中使用的索引:

The query uses ORDER BY with an expression that includes terms other than the index column name:

该查询使用ORDER BY和一个表达式,其中包含索引列名以外的术语:

The query joins many tables, and the columns in the ORDER BY are not all from the first nonconstant table that is used to retrieve rows. (This is the first table in the EXPLAIN output that does not have a const join type.)

该查询连接许多表,ORDER BY中的列并不都来自用于检索行的第一个非常量表。(这是EXPLAIN输出中第一个没有const连接类型的表。)

The query has different ORDER BY and GROUP BY expressions.

 该查询具有不同的ORDER BY和GROUP BY表达式。

There is an index on only a prefix of a column named in the ORDER BY clause. In this case, the index cannot be used to fully resolve the sort order. For example, if only the first 10 bytes of a CHAR(20) column are indexed, the index cannot distinguish values past the 10th byte and a filesort is needed.

在ORDER BY子句中命名的列的前缀上只有一个索引。在这种情况下,索引不能用于完全解析排序顺序。例如,如果只对CHAR(20)列的前10个字节建立索引,则索引无法区分超过第10个字节的值,因此需要一个文件排序。

The index does not store rows in order. For example, this is true for a HASH index in a MEMORY table.

 索引不按顺序存储行。例如,对于MEMORY表中的HASH索引,这是正确的。

Availability of an index for sorting may be affected by the use of column aliases. Suppose that the column t1.a is indexed. In this statement, the name of the column in the select list is a. It refers to t1.a, as does the reference to a in the ORDER BY, so the index on t1.a can be used:

用于排序的索引的可用性可能会受到列别名的使用的影响。假设列向量是t1。一个索引。在这个语句中,选择列表中的列的名称是a,它指的是t1。a,就像ORDER BY中对a的引用一样,所以是t1上的索引。A可以使用:

In this statement, the name of the column in the select list is also a, but it is the alias name. It refers to ABS(a), as does the reference to a in the ORDER BY, so the index on t1.a cannot be used:

 在此语句中,选择列表中的列的名称也是a,但它是别名。它指的是ABS(a),就像ORDER BY中对a的引用一样,所以是t1上的索引。不能使用A:

In the following statement, the ORDER BY refers to a name that is not the name of a column in the select list. But there is a column in t1 named a, so the ORDER BY refers to t1.a and the index on t1.a can be used. (The resulting sort order may be completely different from the order for ABS(a), of course.)

在下面的语句中,ORDER BY引用的名称不是选择列表中的列的名称。但是在t1中有一列叫做a,所以ORDER BY指的是t1。A和t1的下标。可以用A。(当然,结果排序顺序可能与ABS(a)的顺序完全不同。)

By default, MySQL sorts GROUP BY col1col2, ... queries as if you also included ORDER BY col1col2, ... in the query. If you include an explicit ORDER BY clause that contains the same column list, MySQL optimizes it away without any speed penalty, although the sorting still occurs.

 默认情况下,MySQL排序组col1, col2,…如果您还包括ORDER BY col1, col2,…在查询。如果您包含一个明确的ORDER BY子句,其中包含相同的列列表,MySQL会对其进行优化,而不会造成任何速度上的损失,尽管排序仍然会发生。

If a query includes GROUP BY but you want to avoid the overhead of sorting the result, you can suppress sorting by specifying ORDER BY NULL. For example:

如果查询包含GROUP BY,但您希望避免对结果进行排序的开销,可以通过指定ORDER BY NULL来禁止排序。例如:

The optimizer may still choose to use sorting to implement grouping operations. ORDER BY NULL suppresses sorting of the result, not prior sorting done by grouping operations to determine the result.

优化器仍然可以选择使用排序来实现分组操作。ORDER BY NULL禁止对结果进行排序,而不是通过分组操作来确定结果的预先排序。

Note

GROUP BY implicitly sorts by default (that is, in the absence of ASC or DESC designators for GROUP BY columns). However, relying on implicit GROUP BY sorting (that is, sorting in the absence of ASC or DESC designators) or explicit sorting for GROUP BY (that is, by using explicit ASC or DESC designators for GROUP BY columns) is deprecated. To produce a given sort order, provide an ORDER BY clause.

默认情况下GROUP BY隐式排序(也就是说,在GROUP BY列没有ASC或DESC指示符的情况下)。但是,不推荐使用隐式GROUP BY排序(即在没有ASC或DESC指示符的情况下进行排序)或对GROUP BY进行显式排序(即对GROUP BY列使用显式ASC或DESC指示符)。要产生给定的排序顺序,提供order BY子句。

Use of filesort to Satisfy ORDER BY

使用文件排序来满足ORDER BY

If an index cannot be used to satisfy an ORDER BY clause, MySQL performs a filesort operation that reads table rows and sorts them. A filesort constitutes an extra sorting phase in query execution.

如果索引不能用于满足ORDER BY子句,MySQL将执行一个filesort 操作,读取表的行并对它们进行排序。filesort 在查询执行中构成了额外的排序阶段。

To obtain memory for filesort operations, the optimizer allocates a fixed amount of sort_buffer_size bytes up front. Individual sessions can change the session value of this variable as desired to avoid excessive memory use, or to allocate more memory as necessary.

为了获得用于filesort 操作的内存,优化器预先分配固定数量的sort_buffer_size字节。单个会话可以根据需要更改该变量的会话值,以避免过度使用内存,或者根据需要分配更多内存。

filesort operation uses temporary disk files as necessary if the result set is too large to fit in memory. Some types of queries are particularly suited to completely in-memory filesort operations. For example, the optimizer can use filesort to efficiently handle in memory, without temporary files, the ORDER BY operation for queries (and subqueries) of the following form:

如果结果集太大而无法放入内存,则filesort 操作根据需要使用临时磁盘文件。有些类型的查询特别适合完全在内存中的filesort 操作。例如,优化器可以使用filesort在内存中有效地处理以下形式的查询(和子查询)的ORDER BY操作,而不需要临时文件:

Such queries are common in web applications that display only a few rows from a larger result set. Examples:

这样的查询在web应用程序中很常见,这些应用程序只显示较大结果集中的几行。例子:

Influencing ORDER BY Optimization

影响 ORDER BY 优化

For slow ORDER BY queries for which filesort is not used, try lowering the max_length_for_sort_data system variable to a value that is appropriate to trigger a filesort. (A symptom of setting the value of this variable too high is a combination of high disk activity and low CPU activity.)

对于未使用filesort 的慢速ORDER BY查询,请尝试将max_length_for_sort_data系统变量降低到适合触发filesort的值。(将该变量的值设置过高的一个症状是磁盘活动高和CPU活动低的组合。)

To increase ORDER BY speed, check whether you can get MySQL to use indexes rather than an extra sorting phase. If this is not possible, try the following strategies:

要提高ORDER BY的速度,请检查是否可以让MySQL使用索引而不是额外的排序阶段。如果这是不可能的,尝试以下策略:

Increase the sort_buffer_size variable value. Ideally, the value should be large enough for the entire result set to fit in the sort buffer (to avoid writes to disk and merge passes), but at minimum the value must be large enough to accommodate 15 tuples. (Up to 15 temporary disk files are merged and there must be room in memory for at least one tuple per file.)

增加sort_buffer_size变量值。理想情况下,该值应该大到足以容纳整个结果集,以适应排序缓冲区(以避免写入磁盘和合并传递),但至少该值必须大到足以容纳15个元组。(最多合并15个临时磁盘文件,内存中必须有空间容纳每个文件至少一个元组。)

Take into account that the size of column values stored in the sort buffer is affected by the max_sort_length system variable value.

要考虑存储在排序缓冲区中的列值的大小会受到max_sort_length系统变量值的影响。

For example, if tuples store values of long string columns and you increase the value of max_sort_length, the size of sort buffer tuples increases as well and may require you to increase sort_buffer_size.

例如,如果元组存储长字符串列的值,而您增加了max_sort_length的值,排序缓冲区元组的大小也会增加,可能需要您增加sort_buffer_size。

For column values calculated as a result of string expressions (such as those that invoke a string-valued function), the filesort algorithm cannot tell the maximum length of expression values, so it must allocate max_sort_length bytes for each tuple.

 对于作为字符串表达式的结果计算的列值(例如调用字符串值函数的列值),filesort算法不能告诉表达式值的最大长度,因此它必须为每个元组分配max_sort_length字节。

To monitor the number of merge passes (to merge temporary files), check the Sort_merge_passes status variable.

要监视合并通道的数量(合并临时文件),请检查Sort_merge_passes状态变量。

Increase the read_rnd_buffer_size variable value so that more rows are read at a time.

增加read_rnd_buffer_size变量值,以便一次读取更多的行。

Change the tmpdir system variable to point to a dedicated file system with large amounts of free space.

将tmpdir系统变量更改为指向一个具有大量空闲空间的专用文件系统。

The variable value can list several paths that are used in round-robin fashion; you can use this feature to spread the load across several directories.

变量值可以列出以轮询方式使用的几个路径;您可以使用此特性将负载分散到多个目录。

Separate the paths by colon characters (:) on Unix and semicolon characters (;) on Windows. The paths should name directories in file systems located on different physical disks, not different partitions on the same disk.

Unix上用冒号分隔路径,Windows上用分号分隔路径。路径应该命名位于不同物理磁盘上的文件系统中的目录,而不是同一磁盘上的不同分区。

ORDER BY Execution Plan Information Available

按执行计划订购信息可用

With EXPLAIN (see Section 8.8.1, “Optimizing Queries with EXPLAIN”), you can check whether MySQL can use indexes to resolve an ORDER BY clause:

使用EXPLAIN(参见8.8.1节,“用EXPLAIN优化查询”),你可以检查MySQL是否可以使用索引来解析ORDER BY子句:

If the Extra column of EXPLAIN output does not contain Using filesort, the index is used and a filesort is not performed.

如果EXPLAIN输出的“Extra”列不包含“Using filesort”,则使用索引,不执行filesort。

If the Extra column of EXPLAIN output contains Using filesort, the index is not used and a filesort is performed.

如果EXPLAIN输出的“Extra”列中包含“Using filesort”,则不使用索引,执行filesort。

In addition, if a filesort is performed, optimizer trace output includes a filesort_summary block. For example:

此外,如果执行filesort,优化器跟踪输出包括一个filesort_summary块。例如

The sort_mode value provides information about the contents of tuples in the sort buffer:

sort_mode值提供了关于排序缓冲区中元组内容的信息:

  • <sort_key, rowid>: This indicates that sort buffer tuples are pairs that contain the sort key value and row ID of the original table row. Tuples are sorted by sort key value and the row ID is used to read the row from the table.

  • <sort_key, rowid>:这表明排序缓冲区元组是包含排序键值和原始表行的行ID的对。元组按排序键值排序,行ID用于从表中读取行。

  • <sort_key, additional_fields>: This indicates that sort buffer tuples contain the sort key value and columns referenced by the query. Tuples are sorted by sort key value and column values are read directly from the tuple.

  • <sort_key, additional_fields>:这表明排序缓冲区元组包含查询引用的排序键值和列。元组按排序键值排序,列值直接从元组中读取。

  • <sort_key, packed_additional_fields>: Like the previous variant, but the additional columns are packed tightly together instead of using a fixed-length encoding.

  • <sort_key, packed_additional_fields>:与前面的变体类似,但是附加的列紧密地打包在一起,而不是使用固定长度的编码。

EXPLAIN does not distinguish whether the optimizer does or does not perform a filesort in memory. Use of an in-memory filesort can be seen in optimizer trace output. Look for filesort_priority_queue_optimization.

EXPLAIN不区分优化器是否在内存中执行文件存储。在优化器跟踪输出中可以看到内存中文件存储的使用。寻找filesort_priority_queue_optimization。

For information about the optimizer trace, see MySQL Internals: Tracing the Optimizer.

 



  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值