mysql优化(SQL语句的优化以及索引的应用范围)

最新推荐文章于 2024-11-09 16:20:10 发布

binger819623

最新推荐文章于 2024-11-09 16:20:10 发布

阅读量1.3k

点赞数 1

分类专栏： MYSQL 文章标签：优化 sql mysql table oracle 数据库

MYSQL 专栏收录该内容

22 篇文章 0 订阅

订阅专栏

2008-09-09 14:37

SQL 优化的原则是：将一次操作需要读取的BLOCK 数减到最低。

调整不良SQL 通常可以从以下几点切入：

检查不良的 SQL ，考虑其写法是否还有可优化内容；
检查子查询考虑SQL 子查询是否可以用简单连接的方式进行重新书写；
检查优化索引的使用；
考虑数据库的优化器;

查询的一般规则

Ø 明确指出检索的字段，尽量减少对多余的列与多余的行读取。

禁止使用 select * from table ……的方式访问表。

Ø 在一个SQL语句中，如果一个where条件过滤的数据库记录越多，定位越准确，则该where条件越应该前移。

Ø 查询时尽可能使用索引覆盖。即对SELECT的字段建立复合索引，这样查询时只进行索引扫描，不读取数据块。

Ø 在判断有无符合条件的记录时不要用SELECT COUNT （*）语句。 如：

BF: select count(*) from table where condition

在 MSSQL和Sybase中，使用 if exists (select 1 from table_name where condition)性能较好，在Oracle中，使用select 1 from table_name where condition较好。

Ø 内层限定原则

在拼写SQL语句时，将查询条件分解、分类，并尽量在SQL语句的最里层进行限定，以减少数据的处理量。

应绝对避免在order by子句中使用表达式。

正确使用INDEX的SQL

如果和查询条件相关的column上有建index,以下几点能帮助SQL正确的使用index 。

Ø 避免显式或隐含的类型转换。
where 子句中存在数据类型隐形转换的 , 如用 Numeric 型和 Int 型的列的比较时，不能使用 index 。

Ø WHERE 子句中任何对列的操作都将无法使用 index ，它包括数据库函数、计算表达式等等，所以要尽量减少在＝左边的列的运算。如：

BF ：select staff_no, staff_name from staff_member where salary*2 <= 10000;

RP ：select staff_no, staff_name from staff_member where salary <= 5000;

Ø WHERE 子句中使用不等于 (<>) 运算的，将无法使用 index 。可以用 union all 改写。如：
BF ：select staff_no, staff_name from staff_member where dept_no<>2001;

RP ：select staff_no, staff_name from staff_member where dept_no < 2001

Union all

select staff_no, staff_name from staff_member where dept_no > 2001;

Oralce中可以考虑使用函数索引。

Ø WHERE 子句中使用 substr 字符串函数的，将无法使用 index ，可以用 like 改写。如：

BF ：select staff_no, staff_name from staff_member where substr(last_name,1,4)=’FRED’;

RP ：select staff_no, staff_name from staff_member where last_name like ’FRED ％’;

Ø WHERE 子句中 ‘%’ 通配符在第一个字符的，将无法使用 index 。如 :
select staff_no, staff_name from staff_member where first_name like ‘%DON’;

这种情况的优化方式比较复杂，在后面有关index 优化的内容中我们介绍一种在oracle 中使用反向索引的优化方式。

Ø LIKE语句后面不能跟变量，否则也不会使用索引。

where Prod_name like :v_name || '%' -- 不会使用索引

如果一定要使用变量，可以使用如下技巧：
where Prod_name between :v_name and :v_name || chr(255) -- 会使用索引

Ø WHERE 子句中使用IS NULL和IS NOT NULL不会使用索引。好的设计习惯是表中尽量不使用允许为空的字段，可以根据业务逻辑，将字段设为NOT NULL的同时，提供一个DEFAULT值。另外，当表中建有索引的字段包含NULL时，索引的效率会降低。

Ø WHERE 子句中使用字符串连接 (||) 的，将无法使用 index 。我们应该改写这个查询条件。如：

BF ：select staff_no, staff_name from staff_member

where first_name||' '||last_name ='Beill Cliton';

RP ：select staff_no, staff_name from staff_member

where first_name ＝ ‘Beill’

and last_name ='Cliton';

Ø WHERE 条件中使用 ’in’ 子句的情况, 如：

BF ：select count(*) from staff_member

Where id_no in (‘0’,’1’);

WHERE 条件中的'in'在逻辑上相当于'or'，所以语法分析器会将in ('0','1')转化为id_no ='0' or id_no='1'来执行。我们期望它会根据每个or子句分别查找，再将结果相加，这样可以利用id_no上的索引；但实际上（根据showplan）,它却采用了"OR策略"，即先取出满足每个or子句的行，存入临时数据库的工作表中，再建立唯一索引以去掉重复行，最后从这个临时表中计算结果。我们可以将 or 子句分开：

RP ： DECLARE

a integer;
        b integer;
BEGIN
   select count(*) into a from stuff where id_no='0';
   select count(*) into b from stuff where id_no='1';

a := a + b;
END;

Ø 如果在 table 上创建了一个顺序为 col1 ， col2 ， col3 的复合 index 时，在查询中只有以下三种 where 条件子句能有效的使用 index ：

…where col1= @col1 and col2= @col2 and col3= @col3;

…where col1= @col1 and col2= @col2;

…where col1= @col1

子查询的调整

a. 调整具有IN和EXISTS子句的子查询

具有IN的子查询：

BF ：

select emp_id from EMP

where dep_id IN ( select dep_id from DEP

where dep_no = ‘001’) ；

具有EXISTS的子查询：

BF ：

select emp_id from EMP e

where exists ( select dep_id from DEP d

where e.dep_id = d.dep_id

and d.dep_no = ‘001’) ；

用对等连接调整具有IN和EXISTS的子查询：

上面的例子中的子查询有两种情况，dep_id unique和nounique。

当dep_id是unique,

RP ：

select e.emp_id from EMP e, DEP d

where e..dep_id = d.dep_id and d.dep_no = ‘001’ ；

当dep_id是nounique,

RP ：

select e.emp_id from EMP e,

(select distinct dep_id from DEP where dep_no = ‘001’ ) d

where e..dep_id = d.dep_id ；

需要注意的是，具有IN子句的非关联子查询和EXISTS子句的关联子查询，Oracle的优化器虽然能将其转换为标准的连接操作，但Oracle转换用的是NESTED LOOPS连接操作，而且有很多其他因素支配着SQL优化器是否将一个子查询自动转换为一个连接操作。首先，连接操作的两个数据表列通常都应该有唯一的数据索引。所以，我们应该自己重写这些子查询。

b. 调整具有NOT IN和NOT EXISTS子句的子查询

具有NOT IN的子查询：

BF ：

select emp_id from EMP

where dep_id NOT IN ( select dep_id from DEP

where dep_no = ‘001’) ；

具有NOT EXISTS的子查询：

BF ：

select emp_id from EMP e

where NOT EXISTS ( select dep_id from DEP d

where e dep_id = d. dep_id and .d.dep_no = ‘001’) ；

用外联接调整具有NOT IN和NOT EXISTS的子查询

RP ：

select e.emp_id from EMP e,DEP d

where e.dep_id = d.dep_id(+)

and d.dep_id is null

and d.dep_no (+)= ‘001’ ；

c．调整具有自连接的子查询

所谓具有自连接的子查询，实际上是在一种特殊需求下使用的具有IN子句的关联子查询。我们可以用连接的方式重写该子查询。例如以下的需求：

查询每个部门中工资高于该部门平均工资的员工ID,名称，工资，部门ID：

BF ：

select a.emp_id ,a.emp_name,a.emp_salary,a.dep_id from salary a

where a.emp_salary >

(select avg(b.emp_salary) from salary b where b.dep_id = a.dep_id ) ；

RP ：

select a.emp_id ,a.emp_name,a.emp_salary,a.dep_id

from salary a,

(select dep_id,avg(emp_salary) avg_salary from salary group by dep_id) b

where a.dep_id = b.dep_id

and a.emp_salary > b.avg_salary ；

使用绑定变量优化SQL

使用绑定变量可以提高Library Cache的Hit Ratio，减少SQL语句的重编译，从而达到提高查询效率的目的。

BF ：

SQL> alter system flush shared_pool;

系统已更改。

SQL> declare

2 type rc is ref cursor;

3 l_rc rc;

4 l_dummy all_objects.object_name%type;

5 l_start number default dbms_utility.get_time;

6 begin

7 for i in 1..1000

8 loop

9 open l_rc for

10 'select object_name

11 from all_objects

12 where object_id ='|| i;

13 fetch l_rc into l_dummy;

14 close l_rc;

15 end loop;

16 dbms_output.put_line

17 (round((dbms_utility.get_time - l_start)/100,2)||' seconds...');

18 end;

19 /

18.36 seconds...

PL/SQL 过程已成功完成。

这是从all_objects进行的单条查询，在循环1000次的情况下，Oracle对其进行的1000次编译，需要18.36秒才能完成。

RP ：

SQL> alter system flush shared_pool;

系统已更改。

SQL> declare

2 type rc is ref cursor;

3 l_rc rc;

4 l_dummy all_objects.object_name%type;

5 l_start number default dbms_utility.get_time;

6 begin

7 for i in 1..1000

8 loop

9 open l_rc for

10 'select object_name

11 from all_objects

12 where object_id =:x'

13 using i;

14 fetch l_rc into l_dummy;

15 close l_rc;

16 end loop;

17 dbms_output.put_line

18 (round((dbms_utility.get_time - l_start)/100,2)||' seconds...');

19 end;

20 /

.56 seconds...

PL/SQL 过程已成功完成。

这是改用绑定变量之后的结果。这时Oracle对该PL/SQL 进行1次编译。执行时间明显减少。我在分别提交这两个PL/SQL 之前都执行了alter system flush shared_pool;命令，以保证对比结果的真实有效。

减少查询的次数

在一次执行多条SQL时，考虑正确的SQL顺序，减少查询的次数。如：

BF:

IF NOT EXISTS(SELECT count(*) FROM Item WHERE fchrItemID=@chrItemID and fchrA=@chrA)

INSERT INTO Item (fchrItemID ,fchrItem ,fchrItemName , fchrA , flotQ , flotMQ )

VALUES (@chrItemID , @chrItem , @chrItemName , @chrA , -1*@lotQ , -1*@lotMQ )

else
UPDATE Item SET flotQ=flotQ-@lotQ,flotMQ=flotMQ-@lotMQ
WHERE fchrItemID=@chrItemID and fchrA=@chrA

对于这个 SQl来说，select和update对Item做了两次查询操作。实际上我们只需要一次查询就可以实现功能。

RP ：

UPDATE Item SET flotQ=flotQ-@lotQ,flotMQ=flotMQ-@lotMQ
WHERE fchrItemID=@chrItemID and fchrA=@chrA
IF @@rowcount = 0
INSERT INTO @List (fchrItemID ,fchrItem ,fchrItemName , fchrA , flotQ , flotMQ )
VALUES (@chrItemID , @chrItem , @chrItemName , @chrA , -1*@lotQ , -1*@lotMQ )

用union all 代替 union

数据库执行 union操作，首先先分别执行union两端的查询，将其放在临时表中，然后在对其进行排序，过滤重复的记录。

BF: select a1,b1,c1 from table1 ----query A

union

select a2,b2,c2 from table2 ----query B

当已知的业务逻辑决定query A和query B中不会有重复记录时，应该用union all代替union，以提高查询效率。

RP: select a1,b1,c1 from table1 ----query A

union all

select a2,b2,c2 from table2 ----query B