下面记录一些数据条数达千万条的数据库的操作。
创建表
mysql> create table tb_test(id int auto_increment not null primary key,firstNamevarchar(14),lastName varchar(14),birthday date,sex int(1));
Query OK, 0 rows affected (0.06 sec)
从文本文件里加载数据,数据默认用"\t"分隔,我用的数据文件是用空格分隔的。
AMD Athlon II X2 220 2.81GHz,1.93GB内存,一千万条数据,每条约40字节,总共370MB。
加载用了2分半钟。
mysql> load data local infile 'G:\\db_test_t.txt' into table tb_test FIELDS TERMINATED BY ' ';
Query OK, 10000000 rows affected (2 min 32.24 sec)
Records: 10000000 Deleted: 0 Skipped: 0 Warnings: 0
=======================================================================================================
查找非索引数据,费时9秒左右。
mysql> select * from tb_test where firstName='fjwvhhtgae' AND lastName='h';
+----+------------+----------+------------+------+
| id | firstName | lastName | birthday | sex |
+----+------------+----------+------------+------+
| 8 | fjwvhhtgae | h | 1953-12-17 | 1 |
+----+------------+----------+------------+------+
1 row in set (8.84 sec)
用id查找是很快的。
mysql> select * from tb_test where id=8;+----+------------+----------+------------+------+
| id | firstName | lastName | birthday | sex |
+----+------------+----------+------------+------+
| 8 | fjwvhhtgae | h | 1953-12-17 | 1 |
+----+------------+----------+------------+------+
1 row in set (0.01 sec)
mysql> select * from tb_test where id=9;
+----+-----------+--------------+------------+------+
| id | firstName | lastName | birthday | sex |
+----+-----------+--------------+------------+------+
| 9 | lyhrwn | jevosfhgphwr | 1992-03-07 | 0 |
+----+-----------+--------------+------------+------+
1 row in set (0.02 sec)
mysql> select * from tb_test where firstName='fjwvhhtgae' AND lastName='h';
+----+------------+----------+------------+------+
| id | firstName | lastName | birthday | sex |
+----+------------+----------+------------+------+
| 8 | fjwvhhtgae | h | 1953-12-17 | 1 |
+----+------------+----------+------------+------+
1 row in set (8.89 sec)
mysql> select * from tb_test where firstName='ml' AND lastName='w';
+---------+-----------+----------+------------+------+
| id | firstName | lastName | birthday | sex |
+---------+-----------+----------+------------+------+
| 3934851 | ml | w | 1935-01-09 | 1 |
| 5857977 | ml | w | 1915-02-09 | 1 |
| 9999999 | ml | w | 1935-08-14 | 1 |
+---------+-----------+----------+------------+------+
3 rows in set (8.89 sec)
=======================================================================================================
创建索引,一千万数据,两列复合索引,用时1.5小时左右。
mysql> create index first_last on tb_test (firstName,lastName);
Query OK, 10000000 rows affected (1 hour 32 min 45.44 sec)
Records: 10000000 Duplicates: 0 Warnings: 0
mysql> alter table tb_test add index first_last (firstName,lastName);
ERROR 1061 (42000): Duplicate key name 'first_last'
查找索引过的数据,0.0x秒,提高几百倍,非常显著。
mysql> select * from tb_test where firstName='' AND lastName='';Empty set (0.01 sec)
mysql> select * from tb_test where firstName='ml' AND lastName='w';
+---------+-----------+----------+------------+------+
| id | firstName | lastName | birthday | sex |
+---------+-----------+----------+------------+------+
| 3934851 | ml | w | 1935-01-09 | 1 |
| 5857977 | ml | w | 1915-02-09 | 1 |
| 9999999 | ml | w | 1935-08-14 | 1 |
+---------+-----------+----------+------------+------+
3 rows in set (0.05 sec)
=======================================================================================================
列出表索引
mysql> show index from tb_test;
+---------+------------+------------+--------------+-------------+-----------+--
-----------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | C
ardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+------------+--------------+-------------+-----------+--
-----------+----------+--------+------+------------+---------+
| tb_test | 0 | PRIMARY | 1 | id | A |
9970750 | NULL | NULL | | BTREE | |
| tb_test | 1 | first_last | 1 | firstName | A |
9970750 | NULL | NULL | YES | BTREE | |
| tb_test | 1 | first_last | 2 | lastName | A |
9970750 | NULL | NULL | YES | BTREE | |
+---------+------------+------------+--------------+-------------+-----------+--
-----------+----------+--------+------+------------+---------+
3 rows in set (0.19 sec)
mysql> show keys from tb_test;
+---------+------------+------------+--------------+-------------+-----------+--
-----------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | C
ardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+------------+--------------+-------------+-----------+--
-----------+----------+--------+------+------------+---------+
| tb_test | 0 | PRIMARY | 1 | id | A |
9939506 | NULL | NULL | | BTREE | |
| tb_test | 1 | first_last | 1 | firstName | A |
9939506 | NULL | NULL | YES | BTREE | |
| tb_test | 1 | first_last | 2 | lastName | A |
9939506 | NULL | NULL | YES | BTREE | |
+---------+------------+------------+--------------+-------------+-----------+--
-----------+----------+--------+------+------------+---------+
3 rows in set (0.16 sec)
=======================================================================================================
birthday没有索引
mysql> select * from tb_test where birthday='2000-02-29';
+---------+----------------+----------------+------------+------+
| id | firstName | lastName | birthday | sex |
+---------+----------------+----------------+------------+------+
| 48970 | ueayceajk | enjtxiexnkp | 2000-02-29 | 1 |
| 101024 | yjm | nxyrph | 2000-02-29 | 0 |
| 9954050 | woqjcedclerhqr | hcaj | 2000-02-29 | 0 |
+---------+----------------+----------------+------------+------+
250 rows in set (12.52 sec)
一千万数据,索引birthday,重复比较多,建索引时间3.5个小时多。
mysql> create index birthdayIndex on tb_test (birthday);
Query OK, 10000000 rows affected (3 hours 35 min 17.99 sec)
Records: 10000000 Duplicates: 0 Warnings: 0
建完索引就已经给表加上索引了。
mysql> show keys from tb_test;
+---------+------------+---------------+--------------+-------------+-----------
+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation
| Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+---------------+--------------+-------------+-----------
+-------------+----------+--------+------+------------+---------+
| tb_test | 0 | PRIMARY | 1 | id | A
| 10013711 | NULL | NULL | | BTREE | |
| tb_test | 1 | first_last | 1 | firstName | A
| 10013711 | NULL | NULL | YES | BTREE | |
| tb_test | 1 | first_last | 2 | lastName | A
| 10013711 | NULL | NULL | YES | BTREE | |
| tb_test | 1 | birthdayIndex | 1 | birthday | A
| 85587 | NULL | NULL | YES | BTREE | |
+---------+------------+---------------+--------------+-------------+-----------
+-------------+----------+--------+------+------------+---------+
4 rows in set (1.09 sec)
birthday有索引。 速度提高一倍。对于重复较多的项,索引后查找性能提升没有上面那么明显,但也很可观。
mysql> select * from tb_test where birthday='1949-10-01';
+---------+----------------+----------------+------------+------+
| id | firstName | lastName | birthday | sex |
+---------+----------------+----------------+------------+------+
| 21550 | ueb | yonv | 1949-10-01 | 0 |
| 22013 | btsxumre | ysctdupp | 1949-10-01 | 0 |
| 9801393 | oponcdsyey | fctk | 1949-10-01 | 0 |
| 9902452 | pbasqplgrhccg | wmagb | 1949-10-01 | 0 |
+---------+----------------+----------------+------------+------+
217 rows in set (6.95 sec)
数据库断开连接,重新打开,速度提高三倍,不知道什么原因。
217 rows in set (1.28 sec)
=======================================================================================================