High Performance MySQL Study Notes (Part 1)


kikikind


Article link: https://blog.csdn.net/kikikind/article/details/5580631


1. Building your own indexes for the application

URL lookup example

select * from tUrl where url='http://www.163.com';

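Indexing the url column (a string) directly makes the B-Tree that backs the index large. Instead, you can drop the index on the url column and add an indexed url_crc column that stores a hash of the URL.

First create the table: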

create table tUrl(
id int unsigned NOT NULL auto_increment,
url varchar(255) NOT NULL,
url_crc int unsigned NOT NULL DEFAULT 0,
primary key(id)
);
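
The technique relies on url_crc being indexed, but the CREATE TABLE statement above only declares the primary key. A minimal sketch of adding the index, where the name idx_url_crc is a hypothetical choice and not something from the original notes:

```sql
-- idx_url_crc is an illustrative name, not taken from the original notes.
alter table tUrl add key idx_url_crc (url_crc);

-- List the indexes on the table to confirm the new key exists.
show index from tUrl;
```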

Then query it like this:

mysql> select * from tUrl where url_crc=CRC32('http://www.163.com') and url='http://www.163.com';
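
A quick way to check whether MySQL resolves this lookup through the url_crc index rather than a full table scan is EXPLAIN. A minimal sketch, assuming an index on url_crc has already been created (the table definition above does not declare one):

```sql
-- Assumes an index on url_crc exists; without one the lookup falls back to a scan.
explain select * from tUrl
where url_crc = crc32('http://www.163.com')
  and url = 'http://www.163.com';
```

If the key column of the EXPLAIN output is NULL, no index is being used and the hash column brings no benefit.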

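1. The index on the url_crc column performs well, because it indexes a compact integer instead of a long string.

2. When hash values collide, the exact comparison on url picks out the rows that are actually wanted.

3. If there are too many collisions, switch to a 64-bit hash: take MD5() or SHA1() and keep 16 hex digits of the result (the rightmost 16 in the example below):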

mysql> select conv(right(md5('http://www.csdn.net'), 16), 16, 10) as hash64;
+---------------------+
| hash64              |
+---------------------+
| 9270541833456602998 |
+---------------------+
Note: Maatkit (http://www.maatkit.org/) includes a user-defined function (UDF) that implements the 64-bit Fowler/Noll/Vo (FNV) hash function, which is very fast.
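
Storing a 64-bit hash needs a wider column than the int unsigned used for url_crc. The following is a sketch only: the column url_hash64 and the index idx_url_hash64 are hypothetical names, and it assumes the MD5-based expression from the example above. CONV() returns a string, so the sketch casts the result to an unsigned integer explicitly.

```sql
-- Hypothetical column and index names; not part of the original schema.
alter table tUrl
  add column url_hash64 bigint unsigned not null default 0,
  add key idx_url_hash64 (url_hash64);

-- Backfill existing rows with the rightmost 16 hex digits of MD5(url).
update tUrl
set url_hash64 = cast(conv(right(md5(url), 16), 16, 10) as unsigned);

-- Look up by hash first, then confirm with the exact url comparison.
select * from tUrl
where url_hash64 = cast(conv(right(md5('http://www.csdn.net'), 16), 16, 10) as unsigned)
  and url = 'http://www.csdn.net';
```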

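The drawback of this approach is that the hash values have to be maintained by hand. Triggers can do the maintenance instead; add one for insert and one for update: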

delimiter |

create trigger url_crc_ins before insert on tUrl for each row begin
set NEW.url_crc=crc32(NEW.url);
end;
|

create trigger url_crc_upd before update on tUrl for each row begin
set NEW.url_crc=crc32(NEW.url);
end;
|

delimiter ;
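
Triggers only fire for rows written after the triggers are created, so rows that already existed keep whatever url_crc they had (the column default is 0). A one-off backfill could look like this, on the assumption that the table is small enough to update in a single statement:

```sql
-- Recompute the hash for every row whose stored value does not match its url.
update tUrl
set url_crc = crc32(url)
where url_crc <> crc32(url);
```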

Verify that it works:

mysql> insert into tUrl(url) values( 'http://www.csdn.net' );

mysql> select * from tUrl;
+----+---------------------+------------+
| id | url                 | url_crc    |
+----+---------------------+------------+
|  2 | http://www.csdn.net | 3032599713 |
+----+---------------------+------------+
mysql> update tUrl set url='http://blog.csdn.net' where id=2;
mysql> select * from tUrl;
+----+----------------------+------------+
| id | url                  | url_crc    |
+----+----------------------+------------+
|  2 | http://blog.csdn.net | 1255720221 |
+----+----------------------+------------+

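Because I ran these tests in a virtual machine, the results may not be very accurate. I randomly generated a single table of a little over 4 million url records. Comparing the url column directly, without the crc column, took 2.x s; with the crc column the time dropped by roughly 1 s. As the record count grew past 20 million, queries became extremely slow and CPU usage was very high: with or without the crc optimization, a query took more than 30 s. For data of this size, other ways of optimizing the query are needed, such as horizontal partitioning to reduce the size of each table.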

mysql> select count(*) from queryHash;
+----------+
| count(*) |
+----------+
| 27733889 |
+----------+
1 row in set (0.01 sec)

mysql> select * from queryHash where url='http://www.ui.com';
+--------+-------------------+------------+
| id     | url               | url_crc    |
+--------+-------------------+------------+
| 121292 | http://www.ui.com | 3732015786 |
| 132385 | http://www.ui.com | 3732015786 |
| 135030 | http://www.ui.com | 3732015786 |
| 136391 | http://www.ui.com | 3732015786 |
| 138530 | http://www.ui.com | 3732015786 |
| 138555 | http://www.ui.com | 3732015786 |
| 145059 | http://www.ui.com | 3732015786 |
| 147331 | http://www.ui.com | 3732015786 |
| 150356 | http://www.ui.com | 3732015786 |
+--------+-------------------+------------+
9 rows in set (36.58 sec)

mysql> select * from queryHash where url_crc=crc32('http://www.ui.com') and url='http://www.ui.com';
+--------+-------------------+------------+
| id     | url               | url_crc    |
+--------+-------------------+------------+
| 121292 | http://www.ui.com | 3732015786 |
| 132385 | http://www.ui.com | 3732015786 |
| 135030 | http://www.ui.com | 3732015786 |
| 136391 | http://www.ui.com | 3732015786 |
| 138530 | http://www.ui.com | 3732015786 |
| 138555 | http://www.ui.com | 3732015786 |
| 145059 | http://www.ui.com | 3732015786 |
| 147331 | http://www.ui.com | 3732015786 |
| 150356 | http://www.ui.com | 3732015786 |
+--------+-------------------+------------+
9 rows in set (32.43 sec)
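
The horizontal partitioning suggested in the test notes above could take several forms; the original does not say which one was meant. One possible sketch uses MySQL's built-in table partitioning on the hash column. The table name tUrlPart and the partition count of 16 are illustrative assumptions.

```sql
-- Hypothetical partitioned variant of tUrl; name and partition count are illustrative.
create table tUrlPart (
  id int unsigned not null auto_increment,
  url varchar(255) not null,
  url_crc int unsigned not null default 0,
  -- MySQL requires every unique key, including the primary key,
  -- to contain all columns used in the partitioning expression.
  primary key (id, url_crc),
  key idx_url_crc (url_crc)
)
partition by hash (url_crc)
partitions 16;
```

With this layout, rows that share a url_crc value always land in the same partition, so an equality lookup on url_crc can be limited to one partition instead of the whole table.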