mysql8.x版本_select语句源码跟踪

飞奔的屎壳郎

于 2023-12-24 21:12:22 发布

阅读量1k

点赞数 17

分类专栏： mysql源码学习文章标签： mysql

本文链接：https://blog.csdn.net/qq_35349982/article/details/135187029

版权

mysql源码学习专栏收录该内容

1 篇文章 0 订阅

订阅专栏

总结

源码基于8.0.34版本分析，函数执行流程含义大致如下：

do_command 方法从连接中读取命令并执行，调用 dispatch_command 对命令进行分发。
dispatch_command 调用 mysql_parse 对命令进行解析，如果遇到一条语句用 ; 分隔多条命令，则会循环调用 mysql_parse，直到出现，解析错误、线程被kill，则不再继续循环。解析时，如遇到语法错误直接返回 error，否则解析结束后，通过判断当前用户是否有该表权限来决定执行或返回错误。
mysql_parse 解析无语法错误，且权限无问题，会调用 mysql_execute_command 执行。
execute 方法执行准备工作、锁表、优化器优化、执行、清理，在内部会调用 Sql_cmd_dml::execute_inner 方法。
execute_inner 方法真正执行优化器优化、记录开销、执行语句。
ExecuteIteratorQuery 方法最终会迭代处理语句结果，读取并通过网络缓冲区逐条发送给客户端。

调用堆栈图

THD::send_result_set_row(THD * const this, const mem_root_deque<Item*> & row_items) (\root\code\mysql-8.0.34\sql\sql_class.cc:2863)
Query_result_send::send_data(Query_result_send * const this, THD * thd, const mem_root_deque<Item*> & items) (\root\code\mysql-8.0.34\sql\query_result.cc:100)
Query_expression::ExecuteIteratorQuery(Query_expression * const this, THD * thd) (\root\code\mysql-8.0.34\sql\sql_union.cc:1785)
Query_expression::execute(Query_expression * const this, THD * thd) (\root\code\mysql-8.0.34\sql\sql_union.cc:1823)
Sql_cmd_dml::execute_inner(Sql_cmd_dml * const this, THD * thd) (\root\code\mysql-8.0.34\sql\sql_select.cc:1022)
Sql_cmd_dml::execute(Sql_cmd_dml * const this, THD * thd) (\root\code\mysql-8.0.34\sql\sql_select.cc:793)
mysql_execute_command(THD * thd, bool first_level) (\root\code\mysql-8.0.34\sql\sql_parse.cc:4719)
dispatch_sql_command(THD * thd, Parser_state * parser_state) (\root\code\mysql-8.0.34\sql\sql_parse.cc:5368)
dispatch_command(THD * thd, const COM_DATA * com_data, enum_server_command command) (\root\code\mysql-8.0.34\sql\sql_parse.cc:2054)
do_command(THD * thd) (\root\code\mysql-8.0.34\sql\sql_parse.cc:1439)
handle_connection(void * arg) (\root\code\mysql-8.0.34\sql\conn_handler\connection_handler_per_thread.cc:302)
pfs_spawn_thread(void * arg) (\root\code\mysql-8.0.34\storage\perfschema\pfs.cc:3042)

源码下载

https://dev.mysql.com/get/Downloads/MySQL-8.0/mysql-boost-8.0.34.tar.gz

新建测试表

CREATE DATABASE test_db;

use test_db;

CREATE TABLE `t1` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `str1` varchar(255) DEFAULT '',
  `i1` int DEFAULT '0',
  `i2` int DEFAULT '0',
  PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;

INSERT INTO t1(str1, i1, i2) VALUES
('s1', NULL, NULL),
('s2', 20, NULL),
('s3', 30, 31),
('s4', 40, 41),
('s5', 50, 51),
('s6', 60, 61),
('s7', 70, 71),
('s8', 80, 81);

select * from t1 where i2 > 20 and (i1 = 50 or i1 = 80);

入口函数

handle_connection

connection_handler_per_thread.cc:245 -> 最后走的是do_command()函数

extern "C" {
static void *handle_connection(void *arg) {
    my_thread_init();
    for (;;) {
        THD *thd = init_new_thd(channel_info);

        while (thd_connection_alive(thd)) {
            //执行具体逻辑
            if (do_command(thd)) break;
        }
        end_connection(thd);

        delete thd;
    }
    my_thread_end();
    my_thread_exit(nullptr);
    return nullptr;
}
}

do_command

sql_parse.cc:1439 执行命令的
return_value = dispatch_command(thd, &com_data, command);

bool do_command(THD *thd) {
    bool return_value;

    /*
    将执行阻塞读取操作来从客户端接收数据。当线程收到来自客户端的下一个命令、连接被关闭或经过了"net_wait_timeout"(网络等待超时)指定的秒数时，读取操作将被中断。
    */
    net = thd->get_protocol_classic()->get_net();
    my_net_set_read_timeout(net, thd->variables.net_wait_timeout);
    net_new_transaction(net);

    /* 恢复线程超时限制值*/
    my_net_set_read_timeout(net, thd->variables.net_read_timeout);

    //执行代码逻辑
    return_value = dispatch_command(thd, &com_data, command);
    
    //thd->get_protocol_classic():这部分代码获取了当前线程(thd)所使用的协议类，即MySQL的经典协议。
    //get_output_packet():这部分代码从协议类中获取输出数据包。
    //shrink(thd->variables.net_buffer_length):这部分代码是用来缩小输出数据包的大小。
    //shrink()方法则是将输出数据包的大小调整为这个值。这样做的目的是为了减少内存占用，提高程序的性能。
    thd->get_protocol_classic()->get_output_packet()->shrink(
            thd->variables.net_buffer_length);

    return return_value;
}

dispatch_command

sql_parse.cc:5368

bool dispatch_command(THD *thd, const COM_DATA *com_data,enum enum_server_command command) {
    thd->set_command(command);
    thd->set_query_id(next_query_id());
switch (command) {
    case COM_QUERY: {
        //从数据包中读取查询语句并将其存储在thd->query中
        alloc_query(thd, com_data->com_query.query,com_data->com_query.length);
        //放置参数
        copy_bind_parameter_values(thd, com_data->com_query.parameters,com_data->com_query.parameter_count);
        //执行线程
        dispatch_sql_command(thd, &parser_state);
        //重置连接
        thd->bind_parameter_values = nullptr;
        thd->bind_parameter_values_count = 0;
        break;
    }
}

//以下是重置线程的命令操作
done:
    thd->update_slow_query_status();
    if (thd->killed) thd->send_kill_message();
    thd->send_statement_status();
    thd->reset_query();
    thd->set_command(COM_SLEEP);
    thd->lex->sql_command = SQLCOM_END;
    thd->mem_root->Clear();
    return error;
}

dispatch_sql_command

void dispatch_sql_command(THD *thd, Parser_state *parser_state) {
    DBUG_PRINT("dispatch_sql_command", ("query: '%s'", thd->query().str));
    // thd->query().str 就是传入的字符串
    //初始化了用于此查询的成本模型，并确保当前查询块为空
    lex_start(thd);

    LEX *lex = thd->lex;
    const char *found_semicolon = nullptr;

    // 检查当前执行的语句是否有错误。如果没有错误，它会调用parse_sql函数解析SQL语句，并在解析成功后调用invoke_post_parse_rewrite_plugins函数。
    // qlen用于解析SQL语句并设置查询长度
    size_t qlen = 0;
    // da 的全称是 Statement Descriptor Accessor  get_stmt_da() 函数返回一个指向语句描述符对象的指针，该对象包含了关于当前执行的语句的信息。
    bool err = thd->get_stmt_da()->is_error();
    if (!err) {
        err = parse_sql(thd, parser_state, nullptr);
        // found_semicolon 就是char*类型的 分号
        found_semicolon = parser_state->m_lip.found_semicolon;
        qlen = found_semicolon ? (found_semicolon - thd->query().str): thd->query().length;

        if (!thd->is_error() && found_semicolon && (ulong)(qlen)) {
            thd->set_query(thd->query().str, qlen - 1);
        }
    }
    //根据当前的SQL命令和触发器类型，为每个表设置正确的触发器事件类型，以便在执行语句时正确处理触发器。
    lex->set_trg_event_type_for_tables();
    //执行Sql语句
    mysql_execute_command(thd, true);
    //清理资源
    thd->lex->destroy();
    thd->end_statement();
    thd->cleanup_after_query();
}

mysql_execute_command

int mysql_execute_command(THD *thd, bool first_level) {
    int res = false;
    LEX *const lex = thd->lex;
    //first_lists_tables_same()的函数，它的作用是将第一个最外层查询的第一个本地表移动到全局表列表的第一个位置。
    // 主要用于处理包含子查询的查询，因为在这种情况下，子查询的表会先进入全局表列表。
    lex->first_lists_tables_same();

    /* Update system variables specified in SET_VAR hints.  更新hint */
    if (lex->opt_hints_global && lex->opt_hints_global->sys_var_hint)
    lex->opt_hints_global->sys_var_hint->update_vars(thd);

    //SQLCOM_SELECT  此处选择特别多，指显示select查询主线
    switch (lex->sql_command) {
        case SQLCOM_DROP_SRS: {
            res = lex->m_sql_cmd->execute(thd);
            break;
        }
    }
    return res || thd->is_error();
}

execute

bool Sql_cmd_dml::execute(THD *thd) {
    lex = thd->lex;
    Query_expression *const unit = lex->unit;
    //判断语句是否prepare
    if (!is_prepared()) {
        if (prepare(thd)) goto err;
    } else {
        /*这段代码是关于准备一个预处理语句(Prepared statement),打开在语句中引用的表，并检查执行该语句所需的权限。*/
        cleanup(thd);
        if (open_tables_for_query(thd, lex->query_tables, 0)) goto err;
        // Bind table and field information   绑定表和字段信息
        if (restore_cmd_properties(thd)) return true;
        if (check_privileges(thd)) goto err;
        if (m_lazy_result) {
            Prepared_stmt_arena_holder ps_arena_holder(thd);
            if (result->prepare(thd, *unit->get_unit_column_types(), unit)) goto err;
            m_lazy_result = false;
        }
    }
    //事务开始
    lex->set_exec_started();
    // 将成本计算置零
    thd->clear_current_query_costs();
    if (lock_tables(thd, lex->query_tables, lex->table_count, 0)) goto err;  //锁表
    // 执行语句
    if (execute_inner(thd)) goto err;

    // 释放资源
    lex->cleanup(false);
    // 保存当前语句开销
    thd->save_current_query_costs();
    // 记录当前查询到的行数
    thd->update_previous_found_rows();
    return false;
}

prepare(THD *thd)

sql\sql_select.cc:549

bool Sql_cmd_dml::prepare(THD *thd) {
    //用于对未准备好的SELECT语句进行授权预检查。这个函数会检查我们是否具有查询涉及的所有表(以及可能涉及的其他实体)的权限。
    if (precheck(thd)) goto err;
    /* 在MySQL数据库中打开表并展开视图的。在执行查询(不作为执行的一部分)时，它会获取S元数据锁而不是SW锁，以与同时进行的LOCK TABLES WRITE和全局读锁保持兼容。 */
    if (open_tables_for_query(
            thd, lex->query_tables,
            needs_explicit_preparation() ? MYSQL_OPEN_FORCE_SHARED_MDL : 0)) {
        if (thd->is_error())  // @todo - dictionary code should be fixed
            goto err;
        if (error_handler_active) thd->pop_internal_handler();
        lex->cleanup(false);
        return true;
    }
    return false;
}

lock_tables

bool lock_tables(THD *thd, Table_ref *tables, uint count, uint flags) {
    if (!(thd->lock = mysql_lock_tables(thd, start, (uint)(ptr - start), flags)))
        return true;
    thd->lex->lock_tables_state = Query_tables_list::LTS_LOCKED;
    int ret = thd->decide_logging_format(tables);
    return ret;
}

//保留堆栈
//get_lock_data(THD * thd, TABLE ** table_ptr, size_t count, uint flags) (\root\code\mysql-8.0.34\sql\lock.cc:686)
//mysql_lock_tables(THD * thd, TABLE ** tables, size_t count, uint flags) (\root\code\mysql-8.0.34\sql\lock.cc:327)
//lock_tables(THD * thd, Table_ref * tables, uint count, uint flags) (\root\code\mysql-8.0.34\sql\sql_base.cc:6899)

execute_inner

bool Sql_cmd_dml::execute_inner(THD *thd) {
  Query_expression *unit = lex->unit;
  //优化器对sql进行优化并查询，成功直接返回
  if (unit->optimize(thd, /*materialize_destination=*/nullptr,
                     /*create_iterators=*/true, /*finalize_access_paths=*/true))
    return true;

  // Calculate the current statement cost.  // 计算当前查询开销
  accumulate_statement_cost(lex);

  // 如果是explain语句，则不真正执行，否则执行
  lex->set_exec_completed();
  if (lex->is_explain()) {
    if (explain_query(thd, thd, unit)) return true; /* purecov: inspected */
  } else {
    if (unit->execute(thd)) return true;
  }

  return false;
}

accumulate_statement_cost

sql_select.cc:881

void accumulate_statement_cost(const LEX *lex) {
    Opt_trace_context *trace = &lex->thd->opt_trace;
    Opt_trace_disable_I_S disable_trace(trace, true);

    double total_cost = 0.0;
    for (const Query_block *query_block = lex->all_query_blocks_list;
         query_block != nullptr;
         query_block = query_block->next_select_in_list()) {
        if (query_block->join == nullptr) continue;

        // Get the cost of this query block.
        double query_block_cost = query_block->join->best_read;

        // 在处理子查询时。如果子查询是非缓存的(non-cacheable),那么需要估计它执行的次数，并相应地调整成本。
        const Item_subselect *item = query_block->master_query_expression()->item;
        if (item != nullptr && !query_block->is_cacheable())
            query_block_cost *= calculate_subquery_executions(item, trace);

        total_cost += query_block_cost;
    }
    lex->thd->m_current_query_cost = total_cost;
}

execute

bool Query_expression::execute(THD *thd) {
    //查询数据
    return ExecuteIteratorQuery(thd);
}

ExecuteIteratorQuery

bool Query_expression::ExecuteIteratorQuery(THD *thd) {
    Opt_trace_context *const trace = &thd->opt_trace;
    Opt_trace_object trace_wrapper(trace);
    Opt_trace_object trace_exec(trace, "join_execution");
    if (is_simple()) {
        trace_exec.add_select_number(first_query_block()->select_number);
    }
    Opt_trace_array trace_steps(trace, "steps");
    // 保存结果字段，提前声明
    mem_root_deque<Item *> *fields = get_field_list();
    Query_result *query_result = this->query_result();
    //标记开始
    set_executed();
    ha_rows *send_records_ptr;
    if (is_simple()) {
        send_records_ptr = &first_query_block()->join->send_records;
    } else if (set_operation()->m_is_materialized) {
        send_records_ptr = &query_term()->query_block()->join->send_records;
    } else {
        send_records_ptr = &send_records;
    }
    *send_records_ptr = 0;
    thd->get_stmt_da()->reset_current_row_for_condition();
    {
        //循环读取数据
        for (;;) {    // 使用 m_root_iterator 迭代器依次读取查询到的结果行 
            int error = m_root_iterator->Read();
            if (error > 0 || thd->is_error())  // Fatal error
                return true;
            else if (error < 0)
                break;
            else if (thd->killed)  // Aborted by user
            {
                thd->send_kill_message();
                return true;
            }
            ++*send_records_ptr;
            if (query_result->send_data(thd, *fields)) {
                return true;
            }
            thd->get_stmt_da()->inc_current_row_for_condition();
        }
    }
    thd->current_found_rows = *send_records_ptr;
    return query_result->send_eof(thd);
}

Read

int FilterIterator::Read() {
    for (;;) {
        int err = m_source->Read();//读数据
        return 0;
    }
}

Read

TableScanIterator::Read(TableScanIterator * const this) (\root\code\mysql-8.0.34\sql\iterators\basic_row_iterators.cc:219)
FilterIterator::Read(FilterIterator * const this) (\root\code\mysql-8.0.34\sql\iterators\composite_iterators.cc:76)
Query_expression::ExecuteIteratorQuery(Query_expression * const this, THD * thd) (\root\code\mysql-8.0.34\sql\sql_union.cc:1770)
Query_expression::execute(Query_expression * const this, THD * thd) (\root\code\mysql-8.0.34\sql\sql_union.cc:1823)
Sql_cmd_dml::execute_inner(Sql_cmd_dml * const this, THD * thd) (\root\code\mysql-8.0.34\sql\sql_select.cc:1022)
Sql_cmd_dml::execute(Sql_cmd_dml * const this, THD * thd) (\root\code\mysql-8.0.34\sql\sql_select.cc:793)
mysql_execute_command(THD * thd, bool first_level) (\root\code\mysql-8.0.34\sql\sql_parse.cc:4719)
dispatch_sql_command(THD * thd, Parser_state * parser_state) (\root\code\mysql-8.0.34\sql\sql_parse.cc:5368)
dispatch_command(THD * thd, const COM_DATA * com_data, enum_server_command command) (\root\code\mysql-8.0.34\sql\sql_parse.cc:2054)
do_command(THD * thd) (\root\code\mysql-8.0.34\sql\sql_parse.cc:1439)
handle_connection(void * arg) (\root\code\mysql-8.0.34\sql\conn_handler\connection_handler_per_thread.cc:302)
pfs_spawn_thread(void * arg) (\root\code\mysql-8.0.34\storage\perfschema\pfs.cc:3042)

int TableScanIterator::Read() {
    int tmp;
    if (table()->is_union_or_table()) {
        while ((tmp = table()->file->ha_rnd_next(m_record))) {
            /*
             ha_rnd_next can return RECORD_DELETED for MyISAM when one thread is
             reading and another deleting without locks.
             */
            if (tmp == HA_ERR_RECORD_DELETED && !thd()->killed) continue;
            return HandleError(tmp);
        }
        if (m_examined_rows != nullptr) {
            ++*m_examined_rows;
        }
    } else {
        while (true) {
            if (m_remaining_dups == 0) {  // always initially
                while ((tmp = table()->file->ha_rnd_next(m_record))) {
                    if (tmp == HA_ERR_RECORD_DELETED && !thd()->killed) continue;
                    return HandleError(tmp);
                }
                if (m_examined_rows != nullptr) {
                    ++*m_examined_rows;
                }

                // Filter out rows not qualifying for INTERSECT, EXCEPT by reading
                // the counter.
                const ulonglong cnt =
                        static_cast<ulonglong>(table()->set_counter()->val_int());
                if (table()->is_except()) {
                    if (table()->is_distinct()) {
                        // EXCEPT DISTINCT: any counter value larger than one yields
                        // exactly one row
                        if (cnt >= 1) break;
                    } else {
                        // EXCEPT ALL: we use m_remaining_dups to yield as many rows
                        // as found in the counter.
                        m_remaining_dups = cnt;
                    }
                } else {
                    // INTERSECT
                    if (table()->is_distinct()) {
                        if (cnt == 0) break;
                    } else {
                        HalfCounter c(cnt);
                        // Use min(left side counter, right side counter)
                        m_remaining_dups = std::min(c[0], c[1]);
                    }
                }
            } else {
                --m_remaining_dups;  // return the same row once more.
                break;
            }
            // Skipping this row
        }
        if (++m_stored_rows > m_limit_rows) {
            return HandleError(HA_ERR_END_OF_FILE);
        }
    }
    return 0;
}

ha_rnd_next

int handler::ha_rnd_next(uchar *buf) {
    int result;

    // Set status for the need to update generated fields
    m_update_generated_read_fields = table->has_gcol();

    MYSQL_TABLE_IO_WAIT(PSI_TABLE_FETCH_ROW, MAX_KEY, result,
                        { result = rnd_next(buf); })
    if (!result && m_update_generated_read_fields) {
        result = update_generated_read_fields(buf, table);
        m_update_generated_read_fields = false;
    }
    table->set_row_status_from_handler(result);
    return result;
}

send_data

bool Query_result_send::send_data(THD *thd,
                                  const mem_root_deque<Item *> &items) {
    Protocol *protocol = thd->get_protocol();
    protocol->start_row();
    if (thd->send_result_set_row(items)) {
        protocol->abort_row();
        return true;
    }
    
    thd->inc_sent_row_count(1);
    return protocol->end_row();
}

bool THD::send_result_set_row(const mem_root_deque<Item *> &row_items) {
    char buffer[MAX_FIELD_WIDTH];
    String str_buffer(buffer, sizeof(buffer), &my_charset_bin);
    for (Item *item : VisibleFields(row_items)) {
        if (item->send(m_protocol, &str_buffer) || is_error()) return true;
        str_buffer.set(buffer, sizeof(buffer), &my_charset_bin);
    }
    return false;
}

其他

HAVE_PSI_THREAD_INTERFACE的作用

HAVE_PSI_THREAD_INTERFACE 是一个编译器宏，用于表示是否支持 PSI(Process Status Interface)线程接口。PSI 是 Linux 内核中用于获取进程状态信息的一种机制。在 C++ 编程语言中，这个宏通常用于条件编译，以便根据编译器和系统的支持情况来选择性地包含或排除与 PSI 相关的代码。

例如，如果你想在支持 PSI 的系统上使用 psi_thread_info 结构体，你可以这样使用：

#include <linux/psi.h>
int main() {
#ifdef HAVE_PSI_THREAD_INTERFACE
    psi_thread_info info;
    // 使用 info 进行相关操作
#else
    // 不支持 PSI 的处理逻辑
#endif
    return 0;
}

TGID

GTID是全局事务标识符(Global Transaction Identifier)的缩写，它是一个唯一标识一个分布式事务的数字。在MySQL中，GTID被用于确保在主从复制环境中数据的一致性。
gtid_consistency_violation_state变量表示当前线程是否存在GTID一致性违规状态。如果存在GTID一致性违规，那么可以通过回滚操作来修复数据不一致的问题。

int mysql_execute_command(THD *thd, bool first_level) {
  int res = false;
  LEX *const lex = thd->lex;
  /* first Query_block (have special meaning for many of non-SELECTcommands) Query_block是一个关键字，它表示一个查询块的开始。对于许多非SELECT命令，这个查询块具有特殊意义。 */
  Query_block *const query_block = lex->query_block;
  /* first table of first Query_block */
  Table_ref *const first_table = query_block->get_table_list();
  /* list of all tables in query */
  Table_ref *all_tables;
  // keep GTID violation state in order to roll it back on statement failure  gtid_consistency_violation_state变量表示当前线程是否存在GTID一致性违规状态。如果存在GTID一致性违规，那么可以通过回滚操作来修复数据不一致的问题。
  bool gtid_consistency_violation_state = thd->has_gtid_consistency_violation;

飞奔的屎壳郎

关注

17
点赞
踩
17

收藏

觉得还不错? 一键收藏
0
评论
mysql8.x版本_select语句源码跟踪

HAVE_PSI_THREAD_INTERFACE 是一个编译器宏，用于表示是否支持 PSI(Process Status Interface)线程接口。在 C++ 编程语言中，这个宏通常用于条件编译，以便根据编译器和系统的支持情况来选择性地包含或排除与 PSI 相关的代码。gtid_consistency_violation_state变量表示当前线程是否存在GTID一致性违规状态。connection_handler_per_thread.cc:245 -> 最后走的是do_command()函数。
复制链接

扫一扫

专栏目录