MySQL---从代码角度理解SQL语句结构

最新推荐文章于 2022-01-13 09:15:29 发布

那海蓝蓝

最新推荐文章于 2022-01-13 09:15:29 发布

阅读量530

点赞数

分类专栏：数据库

本文链接：https://blog.csdn.net/fly2nn/article/details/61924689

版权

数据库专栏收录该内容

289 篇文章 49 订阅

订阅专栏

MySQL的SQL语句,遵循SQL标准, 可以从语法定义的角度,了解SQL语句的构成并推测其执行方式(需要知道SQL原理).

本文则是从内核的角度, 通过源码层来剖析SQL语句的构成(只以SELECT语句为例), 本文的意义,在于帮助内核实现者或者对内核对SQL实现感兴趣者掌握如何理解SQL的剖析(抛砖引玉, ^_^).

如下是MySQL源码sql_lex.h的注释,非常重要, 本文以黑色字体给出MySQL原文, 蓝色字体标识出深入解析的内容, 红色标示特别强调的内容.

/* //如下是基本概念,需要理解SELECT命令中的基本概念(查询表达式/查询块, 形象地举个例子,子查询就是一个"查询表达式", 派生表就是一个查询块)对应的MySQL表达方式(SELECT_LEX_UNIT/SELECT_LEX,需要自己阅读这2个类的定义,要不可能白看了本文...哈):
  Class SELECT_LEX_UNIT represents a query expression.
  Class SELECT_LEX represents a query block.
  A query expression contains one or more query blocks (more than one means
  that we have a UNION query).  //这是在说明二者(查询表达式/查询块)的关系. UNION前后都可是一个查询块,没有UNION的时候,就那么一个简单SELECT语句,首先其是一个"query expression",然后进一步被当作"query block"
  These classes are connected as follows:
   Both classes have a master, a slave, a next and a prev field. //这2个类中,每个类都有一个master,一个slave, 一个next和一个prev,但注意他们是不同的--名字相同但类型不同,前往不可"见名生义"认为他们是一样的.  下面则是解释它们的不同之处:
   For class SELECT_LEX, master and slave connect to objects of type
   SELECT_LEX_UNIT, whereas for class SELECT_LEX_UNIT, they connect
   to SELECT_LEX.

   //下面这2句是在说"父和子"的关系,其意义在于,从某个"子"出发可以找到其"父",从"父"出发可以找到其"子", 父子关系就是这么建立起来的. 解析SQL需要知道父子关系/邻里关系/引用关系(谁参照了谁). 也就是说,一个SQL是一个整体,解析就需要化整为零; 怎么化整为零,靠的就是用不同的对象标示出关键的"子"体,分解过程把子体标示出来,并标示出各种"联系"(如此处的父子间的关联关系)
   master is pointer to outer node.
   slave is pointer to the first inner node

   neighbors are two SELECT_LEX or SELECT_LEX_UNIT objects on
   the same level. //与一个"对象"相邻的,就是"neighbor/邻居"了,邻居一定和自己是同类项.所以此处说邻居是SELECT_LEX 或 SELECT_LEX_UNIT.

   //如下继续深入解释邻居指向的关系/指向"父"的关系/指向"子"的关系

   The structures are linked with the following pointers:
   - list of neighbors (next/prev) (prev of first element point to slave
     pointer of outer structure)
     - For SELECT_LEX, this is a list of query blocks.
     - For SELECT_LEX_UNIT, this is a list of subqueries. //哈,可以了解子查询是怎么标示的

   - pointer to outer node (master), which is
     If this is SELECT_LEX_UNIT
       - pointer to outer select_lex.
     If this is SELECT_LEX
       - pointer to outer SELECT_LEX_UNIT.

   - pointer to inner objects (slave), which is either:
     If this is an SELECT_LEX_UNIT:
       - first query block that belong to this query expression. //注意这里表示的"第一个子",如果有多个"子",怎么找呢?------呵呵呵呵,别忘了有"邻居"
     If this is an SELECT_LEX
       - first query expression that belong to this query block (subqueries).

   - list of all SELECT_LEX objects (link_next/link_prev) //注意和"next/prev"的区别.首先,这里的数据类型只是SELECT_LEX,即只能是"查询块", 可以表示"多个派生表".
     This is to be used for things like derived tables creation, where we
     go through this list and create the derived tables.

   //其他一些变量的作用(应对的是其他一些情况),如fake_select_lex,不多解释了,可自己深入探索

   If query expression contain several query blocks (UNION now,
   INTERSECT etc later) then it has a special select_lex called
   fake_select_lex. It used for storing global parameters (like ORDER BY,
   LIMIT) and executing union.

   Subqueries used in global ORDER BY clause will be attached to this
   fake_select_lex, which will allow them to correctly resolve fields of
   the containing UNION and outer selects.

   For example for following query: //用例子说明,前前后后得多对照阅读,才可切实理解其关系

   select *
     from table1
     where table1.field IN (select * from table1_1_1 union
                            select * from table1_1_2)
     union
   select *
     from table2
     where table2.field=(select (select f1 from table2_1_1_1_1
                                   where table2_1_1_1_1.f2=table2_1_1.f3)
                           from table2_1_1
                           where table2_1_1.f1=table2.f2)
     union
   select * from table3;

   we will have following structure:

   select1: (select * from table1 ...)
   select2: (select * from table2 ...)
   select3: (select * from table3)
   select1.1.1: (select * from table1_1_1)
   ...

     main unit
     fake0  //第1层
     select1 select2 select3 //使用UNION联接起来的3个查询语句,是例子的主体结构. select3结构简单,没有子部分
     |^^     |^
    s|||     ||master 
    l|||     |+---------------------------------+
    a|||     +---------------------------------+|
    v|||master                         slave   ||  //master左边的2个竖线标示select1是2个unit的master
    e||+-------------------------+             ||
     V|            neighbor      |             V|  //本行的V表示的是向下的箭头
     unit1.1<+==================>unit1.2       unit2.1
     fake1.1    //第2层，第1层的子层
     select1.1.1 select 1.1.2    select1.2.1   select2.1.1  //select1.2.1不存在，因该是MySQL原版文字错误。 如果是要对应unit1.2，则应该是“select 1.1.2”，中间不应该有空格

                                               |^
                                               ||
                                               V|
                                               unit2.1.1.1
                                               select2.1.1.1.1


   relation in main unit will be following:
   (bigger picture for:
      main unit
      fake0
      select1 select2 select3
   in the above picture)

         main unit
         |^^^^|fake_select_lex
         |||||+--------------------------------------------+
         ||||+--------------------------------------------+|
         |||+------------------------------+              ||
         ||+--------------+                |              ||
    slave||master         |                |              ||
         V|      neighbor |       neighbor |        master|V  //邻居关系
         select1<========>select2<========>select3        fake0

//如上，main unit是一个主体（表达式，即SELECT_LEX_UNIT结构对象），其slave是select1，select1如果邻居以双向链表的形式指向其兄弟select2/select3。而select1的master就是main unit。

//如下说明了词法解析的过程。

//1 主体是main unit，SELECT_LEX_UNIT结构对象，一切的master。其自己的master为NULL.  从代码里看,形式上,存在一个"根"对象,用LEX表示.可以通过sql_lex.h中的"struct LEX: public Query_tables_list"了解. 代码实现的时候,先是生成一个查询块,然后生成此查询块对应的查询表达式,让查询表达式作为查询块的master,让查询块作为查询表达式的slave.

//2 过程为，先找到slecet1查询块（SELECT_LEX），然后继续找union标示的下一个查询块select2，同理找select3

//3 当selelct3之后没有其他并列的查询块，开始递归向下，找子查询块。selelct3么有子，开始返回到上一个查询块select2，然后找select2的子。这样找到了select2.1.1和select2.1.2。并为select2.1.2找到了子select2.1.1.1.1，没有其它子后，然后继续回退找上一个union的子select1，为select1找到子select1.1.1和selelct。

//4 注意找每个查询块的时候，其本身先是一个查询表达式，然后才是一个查询块。切记。

    list of all select_lex will be following (as it will be constructed by
    parser):

    select1->select2->select3->select2.1.1->select 2.1.2->select2.1.1.1.1-+
                                                                          |
    +---------------------------------------------------------------------+
    |
    +->select1.1.1->select1.1.2