当前测试的 mysql 版本: 5.0.95
MYSQL MYISAM引擎支持全文索引;MYSQL 5.6 后,InnoDB 也完全支持全文索引。
全文索引支持的字段类型包括 char ,varchar , text 。
创建测试表:(36行数据)
CREATE TABLE test.ft_tab (
id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
charname CHAR(10),
collatename VARCHAR(200),
description TEXT(500),
maxlen int
) ENGINE=InnoDB;
insert into test.ft_tab (charname,collatename,description,maxlen)
select CHARACTER_SET_NAME,DEFAULT_COLLATE_NAME,DESCRIPTION,MAXLEN
from information_schema.character_sets ;
5.0 中 InnoDB 引擎创建失败,改为 MYISAM 再创建:
mysql> CREATE FULLTEXT INDEX idx_ft_tab ON test.ft_tab(description);
ERROR 1214 (HY000): The used table type doesn't support FULLTEXT indexes
mysql>
mysql> ALTER TABLE test.ft_tab ENGINE = MYISAM;
Query OK, 36 rows affected (0.02 sec)
Records: 36 Duplicates: 0 Warnings: 0
mysql> CREATE FULLTEXT INDEX idx_ft_tab ON test.ft_tab(description);
Query OK, 36 rows affected (0.01 sec)
Records: 36 Duplicates: 0 Warnings: 0
使用全文索引搜索数据:
mysql> explain select * from test.ft_tab where description like '%Windows%';
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | ft_tab | ALL | NULL | NULL | NULL | NULL | 36 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
mysql> explain select * from test.ft_tab where match(description) against('Windows');
+----+-------------+--------+----------+---------------+------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+----------+---------------+------------+---------+------+------+-------------+
| 1 | SIMPLE | ft_tab | fulltext | idx_ft_tab | idx_ft_tab | 0 | | 1 | Using where |
+----+-------------+--------+----------+---------------+------------+---------+------+------+-------------+
1 row in set (0.00 sec)
使用 IN BOOLEAN MODE 设置特定搜索:
#至少包含一个单词(或)
select * from test.ft_tab where match(description) against('Windows Central' IN BOOLEAN MODE);
#括号,至少包含一个单词(或)
select * from test.ft_tab where match(description) against('(Windows Central)' IN BOOLEAN MODE);
select * from test.ft_tab where match(description) against('+(Windows Central)' IN BOOLEAN MODE);
#括号/大于小于好,至少包含一个单词(或),大于号的排序最高,小于号的最低,同时存在的处中间
select * from test.ft_tab where match(description) against('+(>Windows <Central)' IN BOOLEAN MODE);
select * ,(match(description) against('+(>Windows <Central)' IN BOOLEAN MODE)) AS REF
from test.ft_tab where match(description) against('+(>Windows <Central)' IN BOOLEAN MODE);
#两个(多个)单词同时包含(与)
select * from test.ft_tab where match(description) against('+Windows +Central' IN BOOLEAN MODE);
#包含一个(多个)单词,且不包含另一个(多个)单词
select * from test.ft_tab where match(description) against('+Windows -Central' IN BOOLEAN MODE);
#包含一个(多个)单词,如果包含一个单词(如 Central),排序更高
select * from test.ft_tab where match(description) against('+Windows Central' IN BOOLEAN MODE);
select * ,(match(description) against('+Windows Central' IN BOOLEAN MODE)) AS REF
from test.ft_tab where match(description) against('+Windows Central' IN BOOLEAN MODE);
#双引号,内部作为一个词组查询
select * from test.ft_tab where match(description) against('"Windows Central"' IN BOOLEAN MODE);
#星号,模糊匹配,只在单词尾部
select * from test.ft_tab where match(description) against('Windows*' IN BOOLEAN MODE);
select * from test.ft_tab where match(description) against('Cen*' IN BOOLEAN MODE);
#包含一个(多个)单词,如果包含一个单词(如 ~Central),排序更低
select * from test.ft_tab where match(description) against('+Windows ~Central' IN BOOLEAN MODE);
select * ,(match(description) against('+Windows ~Central' IN BOOLEAN MODE)) AS REF
from test.ft_tab where match(description) against('+Windows ~Central' IN BOOLEAN MODE);
上面的一个查询为例,匹配度越高,匹配的结果值(如下字段 ref )越大。
mysql> select * ,(match(description) against('+(>Windows <Central)' IN BOOLEAN MODE)) AS REF
-> from test.ft_tab where match(description) against('+(>Windows <Central)' IN BOOLEAN MODE);
+----+----------+---------------------+-----------------------------+--------+-------------------+
| id | charname | collatename | description | maxlen | REF |
+----+----------+---------------------+-----------------------------+--------+-------------------+
| 7 | latin2 | latin2_general_ci | ISO 8859-2 Central European | 1 | 0.666666686534882 |
| 18 | cp1250 | cp1250_general_ci | Windows Central European | 1 | 1.16666674613953 |
| 26 | macce | macce_general_ci | Mac Central European | 1 | 0.666666686534882 |
| 28 | cp852 | cp852_general_ci | DOS Central European | 1 | 0.666666686534882 |
| 30 | cp1251 | cp1251_general_ci | Windows Cyrillic | 1 | 1.5 |
| 31 | cp1256 | cp1256_general_ci | Windows Arabic | 1 | 1.5 |
| 32 | cp1257 | cp1257_general_ci | Windows Baltic | 1 | 1.5 |
| 35 | cp932 | cp932_japanese_ci | SJIS for Windows Japanese | 2 | 1.5 |
| 36 | eucjpms | eucjpms_japanese_ci | UJIS for Windows Japanese | 3 | 1.5 |
+----+----------+---------------------+-----------------------------+--------+-------------------+
现在查询一个单词,却找不出结果。 词语 'DEC' 不属于过滤的单词,为什么查不出?
mysql> select * from test.ft_tab where match(description) against('DEC');
Empty set (0.00 sec)
查看全文索引变量,就知道了。最小的长度为 4 字符,4 个字符以下的都忽略不当作单词了!
mysql> show variables like '%ft%';
+--------------------------+----------------+
| Variable_name | Value |
+--------------------------+----------------+
| ft_boolean_syntax | + -><()~*:""&| |
| ft_max_word_len | 84 |
| ft_min_word_len | 4 |
| ft_query_expansion_limit | 20 |
| ft_stopword_file | (built-in) |
+--------------------------+----------------+
#变量说明
http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html
ft_boolean_syntax :IN BOOLEAN MODE 匹配的选项
ft_max_word_len :最长匹配的字符数
ft_min_word_len :最短匹配的字符数
ft_query_expansion_limit :使用WITH QUERY EXPANSION 搜索出的前N(20)个结果再次搜索
ft_stopword_file :停词表文件,用于设置不需要匹配的词表
#默认停止词表
http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html
设置变量,允许查询3个长度的单词:
[root@localhost]# vi /etc/my.cnf
[mysqld]
ft_min_word_len=3
#重启服务,修复表
[root@localhost]# service mysqld restart
mysql> repair table test.ft_tab;
再查询,结果出来了!
mysql> select * from test.ft_tab where match(description) against('DEC');
+----+----------+-----------------+-------------------+--------+
| id | charname | collatename | description | maxlen |
+----+----------+-----------------+-------------------+--------+
| 2 | dec8 | dec8_swedish_ci | DEC West European | 1 |
+----+----------+-----------------+-------------------+--------+
而词语“for” 却查询不出!被忽略掉了!
mysql> select * from test.ft_tab where match(description) against('for');
Empty set (0.00 sec)
变量 ft_stopword_file 没有指定文件(built-in),使用系统默认设置(参考: Full-Text Stopwords)
现在手动设置:
[root@localhost]# vi /etc/my.cnf
[mysqld]
ft_stopword_file=/usr/share/mysql/ft_stopword.txt
[root@localhost]# service mysqld restart
mysql> repair table test.ft_tab;
mysql> show variables like '%ft%';
mysql> show variables like '%ft%';
+--------------------------+----------------------------------+
| Variable_name | Value |
+--------------------------+----------------------------------+
| ft_boolean_syntax | + -><()~*:""&| |
| ft_max_word_len | 84 |
| ft_min_word_len | 3 |
| ft_query_expansion_limit | 20 |
| ft_stopword_file | /usr/share/mysql/ft_stopword.txt |
+--------------------------+----------------------------------+
再查询,结果出来了!
mysql> select * from test.ft_tab where match(description) against('for');
+----+----------+---------------------+---------------------------+--------+
| id | charname | collatename | description | maxlen |
+----+----------+---------------------+---------------------------+--------+
| 35 | cp932 | cp932_japanese_ci | SJIS for Windows Japanese | 2 |
| 36 | eucjpms | eucjpms_japanese_ci | UJIS for Windows Japanese | 3 |
+----+----------+---------------------+---------------------------+--------+
现在需要把 “for” 忽略掉,怎么设置呢?如果添加过滤词,编辑文件添加该字符:
[root@localhost]# vi /usr/share/mysql/ft_stopword.txt
'for','in','at'
[root@localhost]# service mysqld restart
再查询,找不到了!已经被忽略!~
mysql> select * from test.ft_tab where match(description) against('for');
Empty set (0.00 sec)
参考: Full-Text Search Functions