《MySQL必知必会》学习之 第18章 全文搜索

第18章 全文搜索

搜索引擎:MyISAM支持全文本搜索(InnoDB不支持)

  • 使用LIKE关键字或者正则表达式要求MySQL匹配表中所有行,随着行数增加,会很耗时,且不容易明确匹配什么不匹配什么,例如指定一个词必须匹配,一个词必须不匹配且结果不够智能化,如不会区分单个匹配的行和多个匹配的行(一行中匹配同一个行多次)

这些问题可以用全文本搜索解决,此时MySQL不需要分别查看每个行,也不需要分别分析和处理每个词

  • 全文本搜索的前提条件,必须索引被搜索的列,在CREATE TABLE时,MySQL根据子句FULLTEXT(note_text)的指示对该列进行索引,如果需要也可以指定多个列,可以在创建表时指定FULLTEXT,或者稍后,
  • 不要在导入数据时使用FULLTEXT

match()指定被搜索的列
against()指定要使用的搜索表达式

1.简单的全文本搜索 match against

#全文本搜索
mysql> select note_text from productnotes where match(note_text) against('rabbit');
+---------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                           |
+---------------------------------------------------------------------------------------------------------------------+
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now.                        |
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait. |
+---------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.02 sec)


#使用通配符进行查询

mysql> select note_text from productnotes where note_text like '%rabbit%';
+---------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                           |
+---------------------------------------------------------------------------------------------------------------------+
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait. |
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now.                        |
+---------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.02 sec)

2.查询扩展 with query expansion

mysql> select note_text from productnotes where match(note_text) against('anvils' with query expansion);
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                                                                |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Multiple customer returns, anvils failing to drop fast enough or falling backwards on purchaser. Recommend that customer considers using heavier anvils. |
| Customer complaint:
Sticks not individually wrapped, too easy to mistakenly detonate all at once.
Recommend individual wrapping.                           |
| Customer complaint:
Not heavy enough to generate flying stars around head of victim. If being purchased for dropping, recommend ANV02 or ANV03 instead.   |
| Please note that no returns will be accepted if safe opened using explosives.                                                                            |
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now.                                                             |
| Customer complaint:
Circular hole in safe floor can apparently be easily cut with handsaw.                                                                |
| Matches not included, recommend purchase of matches or detonator (item DTNTR).                                                                           |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (0.03 sec)
# 查询扩展会搜索所有记录(相关性查询)
#查询database,同时也会把含有mysql但不含database的记录也找到。
#此功能会带来诸多非相关性的查询(谨慎使用)

3.布尔文本搜索 in boolean mode

布尔操作符说明
+包含,词必须存在
-排除,词必须不出现
>包含,而且增加等级值
<包含,且减少等级值
()把词组成子表达式(允许这些表达式作为一个组被包含、排除、排列等)
~取消一个词的排序值
*词尾的通配符
“ ”定义一个短语(与单个词的列表不一样,它匹配整个短语一边包含或排除这个短语)

–关键字 in boolean mode

#全文本搜索包含此heavy的所有行

mysql> select note_text from productnotes where match(note_text) against('heavy' in boolean mode);

#全文本搜索heavy ,排除rope(以rope开头的词)的行
mysql> select note_text from productnotes where match(note_text) against('heavy -rope*' in boolean mode);

#匹配包含词rabbit和bait的行
mysql> select note_text from productnotes where match(note_text) against('+rabbit +bait' in boolean mode);

#搜索匹配短语rabbit bait而不是匹配两个词rabbit和bait
mysql> select note_text from productnotes where match(note_text) against('"rabbit bait"' in boolean mode);

#匹配rabbit和carrot,增加前者的等级,降低后者的等级
mysql> select note_text from productnotes where match(note_text) against('>rabbit <carrot' in boolean mode);

#必须匹配词safe和combination,降低后者的等级
mysql> select note_text from productnotes where match(note_text) against('>rabbit +(<combination)' in boolean mode);

#排列而不排序:在布尔方式中,不按等级值降序排序返回的行

演示排序

rank在mysql 8.0版本中数据保留字,使用时必须用’'括起来或者更改名字;

#普通全文本搜索
mysql> select note_text from productnotes where match(note_text) against('rabbit');
+---------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                           |
+---------------------------------------------------------------------------------------------------------------------+
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now.                        |
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait. |
+---------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.02 sec)

#演示排序工作
mysql> select note_text,match(note_text) against('rabbit') as 'rank' from productnotes;
+----------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+
| note_text                                                                                                                                                | rank               |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+
| Customer complaint:
Sticks not individually wrapped, too easy to mistakenly detonate all at once.
Recommend individual wrapping.                           |                  0 |
| Can shipped full, refills not available.
Need to order new can if refill needed.                                                                          |                  0 |
| Safe is combination locked, combination not provided with safe.
This is rarely a problem as safes are typically blown up or dropped by customers.         |                  0 |
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait.                                      | 1.5905543565750122 |
| Included fuses are short and have been known to detonate too quickly for some customers.
Longer fuses are available (item FU1) and should be recommended. |                  0 |
| Matches not included, recommend purchase of matches or detonator (item DTNTR).                                                                           |                  0 |
| Please note that no returns will be accepted if safe opened using explosives.                                                                            |                  0 |
| Multiple customer returns, anvils failing to drop fast enough or falling backwards on purchaser. Recommend that customer considers using heavier anvils. |                  0 |
| Item is extremely heavy. Designed for dropping, not recommended for use with slings, ropes, pulleys, or tightropes.                                      |                  0 |
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now.                                                             | 1.6408053636550903 |
| Shipped unassembled, requires common tools (including oversized hammer).                                                                                 |                  0 |
| Customer complaint:
Circular hole in safe floor can apparently be easily cut with handsaw.                                                                |                  0 |
| Customer complaint:
Not heavy enough to generate flying stars around head of victim. If being purchased for dropping, recommend ANV02 or ANV03 instead.   |                  0 |
| Call from individual trapped in safe plummeting to the ground, suggests an escape hatch be added.
Comment forwarded to vendor.                            |                  0 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+
14 rows in set (0.03 sec)
#rank中大于1的就是全文搜索匹配的行

注意事项

  • 在索引全文本数据时,短词被忽略且从索引中排除,短词定义为3个即以下字符的词

  • MySQL有一个内建的非用词列表,这些词在索引全文本数据时总是被忽略,有需要可以覆盖

  • 如果一个词出现在50%以上的行中,则将它作为一个非用词忽略,这个规则不适用于IN BOOLEAN MODE

  • 表中行数<=2行,全文本搜索不返回结果,因为50%的规则

  • 忽略单引号 如don’t索引为dont

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值