这里使用的代码是mysql8.0.20分支。其他分支的代码可能略有不同。
filesort在核心代码前面还有很多辅助代码,辅助部分就不说了,重点放在核心代码部分。
下面是一个调用栈:
SELECT_LEX_UNIT::ExecuteIteratorQuery(SELECT_LEX_UNIT * const this, THD * thd) (/root/mysql-8.0.20/sql/sql_union.cc:1165)
SELECT_LEX_UNIT::execute(SELECT_LEX_UNIT * const this, THD * thd) (/root/mysql-8.0.20/sql/sql_union.cc:1235)
Sql_cmd_dml::execute_inner(Sql_cmd_dml * const this, THD * thd) (/root/mysql-8.0.20/sql/sql_select.cc:945)
Sql_cmd_dml::execute(Sql_cmd_dml * const this, THD * thd) (/root/mysql-8.0.20/sql/sql_select.cc:725)
mysql_execute_command(THD * thd, bool first_level) (/root/mysql-8.0.20/sql/sql_parse.cc:4489)
mysql_parse(THD * thd, Parser_state * parser_state) (/root/mysql-8.0.20/sql/sql_parse.cc:5306)
dispatch_command(THD * thd, const COM_DATA * com_data, enum_server_command command) (/root/mysql-8.0.20/sql/sql_parse.cc:1776)
do_command(THD * thd) (/root/mysql-8.0.20/sql/sql_parse.cc:1274)
handle_connection(void * arg) (/root/mysql-8.0.20/sql/conn_handler/connection_handler_per_thread.cc:302)
pfs_spawn_thread(void * arg) (/root/mysql-8.0.20/storage/perfschema/pfs.cc:2854)
libpthread.so.0!start_thread (Unknown Source:0)
libc.so.6!clone (Unknown Source:0)
这里就从ExecuteIteratorQuery方法的第1165行开始,这部分是真正的排序和获取结果,以及发送结果到客户端的代码。
bool SELECT_LEX_UNIT::ExecuteIteratorQuery(THD *thd) {
。。。。。。
//下面是第1165行
if (m_root_iterator->Init()) {
return true;
}
{
PFSBatchMode pfs_batch_mode(m_root_iterator.get());
auto join_cleanup = create_scope_guard([this, thd] {
for (SELECT_LEX *sl = first_select(); sl; sl = sl->next_select()) {
JOIN *join = sl->join;
join->join_free();
thd->inc_examined_row_count(join->examined_rows);
}
if (fake_select_lex != nullptr) {
thd->inc_examined_row_count(fake_select_lex->join->examined_rows);
}
});
for (;;) {
int error = m_root_iterator->Read();
DBUG_EXECUTE_IF("bug13822652_1", thd->killed = THD::KILL_QUERY;);
if (error > 0 || thd->is_error()) // Fatal error
return true;
else if (error < 0)
break;
else if (thd->killed) // Aborted by user
{
thd->send_kill_message();
return true;
}
++*send_records_ptr;
if (query_result->send_data(thd, *fields)) {
return true;
}
thd->get_stmt_da()->inc_current_row_for_condition();
}
// NOTE: join_cleanup must be done before we send EOF, so that we get the
// row counts right.
}
先看一下ExecuteIteratorQuery方法其中的调用脉络:
可以看到,在ExecuteIteratorQuery方法中调用了SortingIterator::Init()方法,在SortingIterator::Init()中调用了Dosort()方法,整个核心的排序过程都在其中。接下来,SortingIterator::Init()中根据不同的排序路径选择TableRowIterator的不同子类来实际进行排序结果的获取动作。m_root_iterator->Read()代理了TableRowIterator的不同子类的读取。最后通过query_result->send_data将数据发送到客户端。
这里跟随代码看一下具体的实现。关键的部分都带上了注释。
先看看SortingIterator::Init():
bool SortingIterator::Init() {
ReleaseBuffers();
// Both empty result and error count as errors. (TODO: Why? This is a legacy
// choice that doesn't always seem right to me, although it should nearly
// never happen in practice.)
//排序真正发生的地方
if (DoSort() != 0) return true;
// Prepare the result iterator for actually reading the data. Read()
// will proxy to it.
TABLE *table = m_filesort->table;
/*
* 如果m_sort_result.io_cache没有初始化,或者分配了m_sort_result.io_cache.buffer缓冲区,这只会在merge_pass下发生
*/
if (m_sort_result.io_cache && my_b_inited(m_sort_result.io_cache)) {
// Test if ref-records was used
if (m_fs_info.using_addon_fields()) {
DBUG_PRINT("info", ("using SortFileIterator"));
if (m_fs_info.addon_fields->using_packed_addons())
//m_result_iterator是std::unique_ptr类型包裹了RowIterator的对象,
//m_result_iterator_holder.xxx是占位符
m_result_iterator.reset(
new (&m_result_iterator_holder.sort_file_packed_addons)
SortFileIterator<true>(thd(), table, m_sort_result.io_cache,
&m_fs_info, m_examined_rows));
else
m_result_iterator.reset(
new (&m_result_iterator_holder.sort_file)
SortFileIterator<false>(thd(), table, m_sort_result.io_cache,
&m_fs_info, m_examined_rows));
} else {
/*
m_fs_info->addon_field is checked because if we use addon fields,
it doesn't make sense to use cache - we don't read from the table
and m_fs_info->io_cache is read sequentially
*/
bool request_cache = !m_fs_info.using_addon_fields();
m_result_iterator.reset(
new (&m_result_iterator_holder.sort_file_indirect)
SortFileIndirectIterator(
thd(), table, m_sort_result.io_cache, request_cache,
/*ignore_not_found_rows=*/false, m_examined_rows));
}
m_sort_result.io_cache =
nullptr; // The result iterator has taken ownership.
} else {
/*m_sort_result.has_result_in_memory()为真的情况只发生在save_index函数调用中,
* 只有完全处于内存中排序的情况才会发生。
*/
DBUG_ASSERT(m_sort_result.has_result_in_memory());
if (m_fs_info.using_addon_fields()) {
DBUG_PRINT("info", ("using SortBufferIterator"));
DBUG_ASSERT(m_sort_result.sorted_result_in_fsbuf);
if (m_fs_info.addon_fields->using_packed_addons())
m_result_iterator.reset(
new (&m_result_iterator_holder.sort_buffer_packed_addons)
SortBufferIterator<true>(thd(), table, &m_fs_info,
&m_sort_result, m_examined_rows));
else
m_result_iterator.reset(
new (&m_result_iterator_holder.sort_buffer)
SortBufferIterator<false>(thd(), table, &m_fs_info,
&m_sort_result, m_examined_rows));
} else {
//使用filesort排序发生sort_merge_pass的情况下,会走到这里。
DBUG_PRINT("info", ("using SortBufferIndirectIterator (sort)"));
m_result_iterator.reset(
new (&m_result_iterator_holder.sort_buffer_indirect)
SortBufferIndirectIterator(thd(), table, &m_sort_result,
/*ignore_not_found_rows=*/false,
m_examined_rows));
}
}
return m_result_iterator->Init();
}
按照调用顺序,接下来是Dosort():
Dosort()其中方法调用的脉络,有个总体清晰的印象:
,看看DoSort()方法的实现,同样的,关键部分有注释
/*
Do the actual sort, by calling filesort. The result will be left in one of
several places depending on what sort strategy we chose; it is up to Init() to
figure out what happened and create the appropriate iterator to read from it.
RETURN VALUES
0 ok
-1 Some fatal error
1 No records
*/
int SortingIterator::DoSort() {
DBUG_ASSERT(m_sort_result.io_cache == nullptr);
//初始化排序结果存储io_cache
m_sort_result.io_cache =
(IO_CACHE *)my_malloc(key_memory_TABLE_sort_io_cache, sizeof(IO_CACHE),
MYF(MY_WME | MY_ZEROFILL));
if (m_qep_tab != null