什么是全文索引?
全文索引是将存储在数据库中的文章或者句子等任意内容信息查找出来的索引,单位是词。全文索引也是目前搜索引擎使用的一种关键技术。指定为 fulltex
创建练习表的sql:
create table command (
id int(5) unsigned primary key auto_increment,
name varchar(10),
instruction varchar(60)
)engine=MyISAM;
插入数据sql:
insert into command values('1','ls','list directory contents');
insert into command values('2','wc','print newline, word, and byte counts for each file');
insert into command values('3','cut','remove sections from each line of files');
insert into command values('4','sort','sort lines of text files');
insert into command values('5','find','search for files in a directory hierarchy');
insert into command values('6','cp','复制文件或者文件夹');
insert into command values('7','top','display Linux processes');
insert into command values('8','mv','修改文件名,移动');
insert into command values('9','停止词','is,not,me,yes,no ...');
添加全文索引:
创建表的时候创建全文索引
mysql> create table command (
id int(5) unsigned primary key auto_increment,
name varchar(10),
instruction varchar(60),
fulltext(instruction)
)engine=MyISAM;
Query OK, 0 rows affected (0.01 sec)
mysql> desc command;
+-------------+-----------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------+------+-----+---------+----------------+
| id | int(5) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(10) | YES | | NULL | |
| instruction | varchar(60) | YES | MUL | NULL | |
+-------------+-----------------+------+-----+---------+----------------+
3 rows in set (0.03 sec)
mysql> show index from command;
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| command | 0 | PRIMARY | 1 | id | A | 0 | NULL | NULL | | BTREE | | |
| command | 1 | instruction | 1 | instruction | NULL | NULL | NULL | NULL | YES | FULLTEXT | | |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.04 sec)
通过alter添加
alter table command add fulltext(instruction);
mysql> show create table command;
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| command | CREATE TABLE `command` (
`id` int(5) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT NULL,
`instruction` varchar(60) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=gbk |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.04 sec)
mysql> alter table command add fulltext(instruction);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show create table command;
+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| command | CREATE TABLE `command` (
`id` int(5) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT NULL,
`instruction` varchar(60) DEFAULT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `instruction` (`instruction`)
) ENGINE=MyISAM DEFAULT CHARSET=gbk |
+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.04 sec)
mysql> show index from command;
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| command | 0 | PRIMARY | 1 | id | A | 0 | NULL | NULL | | BTREE | | |
| command | 1 | instruction | 1 | instruction | NULL | NULL | NULL | NULL | YES | FULLTEXT | | |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.04 sec)
使用全文索引:
select * from 表名 where match (字段名) against (‘检索内容’);
mysql> select * from command;
+----+--------+----------------------------------------------------+
| id | name | instruction |
+----+--------+----------------------------------------------------+
| 1 | ls | list directory contents |
| 2 | wc | print newline, word, and byte counts for each file |
| 3 | cut | remove sections from each line of files |
| 4 | sort | sort lines of text files |
| 5 | find | search for files in a directory hierarchy |
| 6 | cp | 复制文件或者文件夹 |
| 7 | top | display Linux processes |
| 8 | mv | 修改文件名,移动 |
| 9 | 停止词 | is,not,me,yes,no ... |
+----+--------+----------------------------------------------------+
9 rows in set (0.04 sec)
mysql> select * from command where match(instruction) against('directory');
+----+------+-------------------------------------------+
| id | name | instruction |
+----+------+-------------------------------------------+
| 1 | ls | list directory contents |
| 5 | find | search for files in a directory hierarchy |
+----+------+-------------------------------------------+
2 rows in set (0.04 sec)
查看匹配度:
mysql> select id,match(instruction) against('directory') from command;
+----+-----------------------------------------+
| id | match(instruction) against('directory') |
+----+-----------------------------------------+
| 1 | 1.2109839916229248 |
| 2 | 0 |
| 3 | 0 |
| 4 | 0 |
| 5 | 1.1976701021194458 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 0 |
+----+-----------------------------------------+
9 rows in set (0.04 sec)
停止词:
is,not,me,yes,no …
mysql> select * from command;
+----+--------+----------------------------------------------------+
| id | name | instruction |
+----+--------+----------------------------------------------------+
| 1 | ls | list directory contents |
| 2 | wc | print newline, word, and byte counts for each file |
| 3 | cut | remove sections from each line of files |
| 4 | sort | sort lines of text files |
| 5 | find | search for files in a directory hierarchy |
| 6 | cp | 复制文件或者文件夹 |
| 7 | top | display Linux processes |
| 8 | mv | 修改文件名,移动 |
| 9 | 停止词 | is,not,me,yes,no ... |
+----+--------+----------------------------------------------------+
9 rows in set (0.03 sec)
mysql> select * from command where match(instruction) against('is');
Empty set
mysql> select * from command where match(instruction) against('me');
Empty set
mysql> select * from command where match(instruction) against('no');
Empty set
mysql> select * from command where match(instruction) against('not');
Empty set
mysql> select * from command where match(instruction) against('of');
Empty set
mysql> select * from command where match(instruction) against('and');
Empty set
mysql> select * from command where match(instruction) against('each');
Empty set
in boolean mode 模式:
in boolean mode:意思是指定全文检索模式为布尔全文检索(简单可以理解为是检索方式)
select * from 表名 where match (字段名) against (‘检索内容’ in boolean mode);
‘+’一定包含
‘-’一定不包含
‘ ’或
‘’通配符 :若单词以 * 前面的字串开始,则匹配
使用通配符时,只能放在词的后边,不能放前边。
用法可以参考这篇博文:in boolean mode 符号含义
mysql> select * from command;
+----+--------+----------------------------------------------------+
| id | name | instruction |
+----+--------+----------------------------------------------------+
| 1 | ls | list directory contents |
| 2 | wc | print newline, word, and byte counts for each file |
| 3 | cut | remove sections from each line of files |
| 4 | sort | sort lines of text files |
| 5 | find | search for files in a directory hierarchy |
| 6 | cp | 复制文件或者文件夹 |
| 7 | top | display Linux processes |
| 8 | mv | 修改文件名,移动 |
| 9 | 停止词 | is,not,me,yes,no ... |
+----+--------+----------------------------------------------------+
9 rows in set (0.04 sec)
mysql> select * from command where match(instruction) against('dir*' in boolean mode);
+----+------+-------------------------------------------+
| id | name | instruction |
+----+------+-------------------------------------------+
| 1 | ls | list directory contents |
| 5 | find | search for files in a directory hierarchy |
+----+------+-------------------------------------------+
2 rows in set (0.05 sec)
mysql> select * from command where match(instruction) against('dir');
Empty set
mysql> select * from command where match(instruction) against('dir* -hie*' in boolean mode);
+----+------+-------------------------+
| id | name | instruction |
+----+------+-------------------------+
| 1 | ls | list directory contents |
+----+------+-------------------------+
1 row in set (0.03 sec)
mysql> select * from command where match(instruction) against('dir* linux' in boolean mode);
+----+------+-------------------------------------------+
| id | name | instruction |
+----+------+-------------------------------------------+
| 1 | ls | list directory contents |
| 5 | find | search for files in a directory hierarchy |
| 7 | top | display Linux processes |
+----+------+-------------------------------------------+
3 rows in set (0.04 sec)
删除全文索引:
alter table 表名 drop index 字段名;
mysql> alter table command drop index instruction;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> desc command;
+-------------+-----------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------+------+-----+---------+----------------+
| id | int(5) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(10) | YES | | NULL | |
| instruction | varchar(60) | YES | | NULL | |
+-------------+-----------------+------+-----+---------+----------------+
3 rows in set (0.03 sec)
mysql> show index from command;
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| command | 0 | PRIMARY | 1 | id | A | 0 | NULL | NULL | | BTREE | | |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.04 sec)
注意点总结:
1、一般情况下创建全文索引的字段数据类型为 char、varchar、text 。其它字段类型不可以
mysql> alter table command add num int comment '号码';
Query OK, 9 rows affected (0.01 sec)
Records: 9 Duplicates: 0 Warnings: 0
mysql> desc command;
+-------------+-----------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------+------+-----+---------+----------------+
| id | int(5) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(10) | YES | | NULL | |
| instruction | varchar(60) | YES | MUL | NULL | |
| num | int(11) | YES | | NULL | |
+-------------+-----------------+------+-----+---------+----------------+
4 rows in set (0.04 sec)
mysql> alter table command add fulltext(num);
1283 - Column 'num' cannot be part of FULLTEXT index
2、全文索引不针对非常频繁的词做索引。比如is,no,not,you,me,yes这些,我们称之为停止词
3、对英文检索时忽略大小写