上一篇文章主要是对于join查询的简介,这次则是以join查询优化为主
双表下的join查询:
建表sql:
book表和class表中用card字段进行关联
CREATE TABLE IF NOT EXISTS `class` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`card` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `book` (
`bookid` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`card` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`bookid`)
);
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
首先当表中字段没有索引时:
我们执行如下sql:
EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;
结果:
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| 1 | SIMPLE | class | ALL | NULL | NULL | NULL | NULL | 20 | |
| 1 | SIMPLE | book | ALL | NULL | NULL | NULL | NULL | 20 | |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
2 rows in set (0.00 sec)
可以看出由于没有索引,type为all,表示在该sql会执行全表扫描,所需要读取的行数是20行,此种情况在数据量小时不会对生产造成太大影响,但一旦数据量变大,则会造成难以估量的性能耗费,为必须避免的情况。
加索引:
在book表card字段加上索引
ALTER TABLE `book` ADD INDEX Y ( `card`);
</pre><pre name="code" class="html" style="font-size:18px;">索引建好后我们再进行执行该sql:
<pre name="code" class="sql" style="font-size:18px;">EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;
+----+-------------+-------+------+---------------+------+---------+--------------------+------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
+----+-------------+-------+------+---------------+------+---------+--------------------+------+-------------+
| 1 | SIMPLE | class | ALL | NULL | NULL | NULL | NULL | 20 |
| 1 | SIMPLE | book | ref | Y | Y | 4 | dbtest2.class.card | 1 | Using index |
+----+-------------+-------+------+---------------+------+---------+--------------------+------+-------------+
可以看到,当我们在book表card字段加上字段时,type类型变为ref,为索引扫描(在我们开发过程中,一般需要达到的级别为range,如果能达到ref级别则就更好了),key为Y,表示使用了该索引,而且在rows列,book表的扫描行变为从20变为1行,效率大大提高。
同样,我们在class表上建立索引
ALTER TABLE `class` ADD INDEX idx_class_card ( `card`);
执行:
EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;
结果如下:
+----+-------------+-------+-------+---------------+----------------+---------+--------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------------+---------+--------------------+------+-------------+
| 1 | SIMPLE | class | index | NULL | idx_class_card | 4 | NULL | 20 | Using index |
| 1 | SIMPLE | book | ref | Y | Y | 4 | dbtest2.class.card | 1 | Using index |
+----+-------------+-------+-------+---------------+----------------+---------+--------------------+------+-------------+
可以看出,虽然在class表查询时也使用了索引,但是rows=20,并未起作用,这是由于本身就需要扫描左表全部,进行和右表的比对,可以理解为左表是个全表性质的范围查询,有没索引都无关,而右表则是根据左表的一个值查询,所以会用到索引。
当为inner join时:
执行:
EXPLAIN SELECT * FROM class INNER JOIN book ON class.card = book.card;
结果如下:
+----+-------------+-------+-------+---------------+----------------+---------+--------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------------+---------+--------------------+------+-------------+
| 1 | SIMPLE | class | index | NULL | idx_class_card | 4 | NULL | 20 | Using index |
| 1 | SIMPLE | book | ref | Y | Y | 4 | dbtest2.class.card | 1 | Using index |
+----+-------------+-------+-------+---------------+----------------+---------+--------------------+------+-------------+
当为right join时:
执行:
EXPLAIN SELECT * FROM class right JOIN book ON class.card = book.card;
结果:
+----+-------------+-------+-------+----------------+----------------+---------+-------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+----------------+----------------+---------+-------------------+------+-------------+
| 1 | SIMPLE | book | index | NULL | Y | 4 | NULL | 20 | Using index |
| 1 | SIMPLE | class | ref | idx_class_card | idx_class_card | 4 | dbtest2.book.card | 1 | Using index |
+----+-------------+-------+-------+----------------+----------------+---------+-------------------+------+-------------+
可以看出,表的执行顺序发生了变化,book表先执行,但需要查询20列,而class表则只需要查询1列。
总结:在进行join查询时,若是left join 则右表的连接字段必须建立索引
若是right join,则左表的连接字段必须建立索引
若是inner join ,都建索引
实际上在开发中,对于A,B表的连接字段,我们都需要建立索引。
查询机制: 以以下sql为例
SELECT * FROM class left<span style="font-family: Arial, Helvetica, sans-serif;"> JOIN book ON class.card = book.card;</span>
相当于执行了两次循环:
for:select * from classs ;
for:select * from book where class.card=book.card;
所以在查询时,需要尽量让数据量小的作为外层循环。