What is an inverted index, and what is it used for?

As we know, an index is typically used to check whether a given key exists, or to avoid some external sorting operations.

But what exactly is an inverted index?

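MySQL's full-text index is in fact implemented as an inverted index. An inverted index is an indexing method that, for full-text search, stores a mapping from each word to its locations in a document or a set of documents. It is commonly used in search engines and keyword-query problems.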

Suppose you have several documents with the following content:

// Document 1
Rudeus Greyrat is the main protagonist of Mushoku Tensei. He was an overweight 34-year-old Japanese NEET who is reincarnated into another world after being hit by a truck whilst saving some kids. Realizing he wasted away his previous life, he resolves to live this new one to its fullest and the series revolves around how he impacts his new world all whilst slowly growing out of his reclusive ways.
// Document 2
Sylphiette Greyrat is Rudeus' childhood friend who is part human, elf, and beast race. Following the Teleport Incident,[1][2] during which her hair turned white from mana exhaustion, she became Princess Ariel's personal bodyguard under the alias Fitts.
// Document 3
Eris Greyrat, born Eris Boreas Greyrat, is a noble girl and second cousin of Rudeus. She is a Character with a short temper but has potential in the Sword God Style.[4][5] During her journey to return home following the Teleport Incident, she grows to love Rudeus.
// Document 4
Roxy M. Greyrat, born Roxy Migurdia, is a talented Migurd mage, and a former magic tutor. Because she can't use telepathy, she leaves her village due to feeling isolated from her peers. Unable to make a stable living as an adventurer, she becomes a travelling tutor and eventually becomes Rudeus' teacher. After the Teleport Incident, Roxy helps Paul to search the world for survivors.

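An inverted index, then, records which documents each word appears in. This kind of index is central to keyword lookup in retrieval systems and to tasks such as word count in big-data processing. It looks roughly like this: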

Rudeus: [1, 2, 3, 4]
Greyrat: [1, 2, 3, 4]
is: [1, 2, 3, 4]
the: [1, 2, 3, 4]
main: [1]
protagonist: [1]
.
.
.
Following: [2, 3]
.
.
.
survivors: [4]
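
The mapping above can be built with a few lines of code. What follows is only a minimal sketch in Python, not how MySQL or any particular search engine does it. The function names (`tokenize`, `build_inverted_index`, `search`) are made up for illustration, the tokenizer is a naive lowercase letters-only split (so keys come out as `rudeus` rather than `Rudeus`), and the four documents are abridged excerpts of the ones above to keep the example short.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into runs of ASCII letters (a naive tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def build_inverted_index(docs):
    """Map each word to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return the IDs of documents that contain every word in the query (AND semantics)."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)


# Abridged excerpts of the four documents above (shortened for illustration only).
docs = {
    1: "Rudeus Greyrat is the main protagonist of Mushoku Tensei.",
    2: "Sylphiette Greyrat is Rudeus' childhood friend. "
       "Following the Teleport Incident, she became Princess Ariel's personal bodyguard.",
    3: "Eris Greyrat is a noble girl and second cousin of Rudeus. "
       "During her journey to return home following the Teleport Incident, she grows to love Rudeus.",
    4: "Roxy M. Greyrat is a talented Migurd mage who eventually becomes Rudeus' teacher. "
       "After the Teleport Incident, Roxy helps Paul to search the world for survivors.",
}

index = build_inverted_index(docs)
print(index["rudeus"])                      # [1, 2, 3, 4]
print(index["following"])                   # [2, 3]
print(search(index, "Teleport Incident"))   # [2, 3, 4]
```

A keyword query then becomes a dictionary lookup per word followed by a set intersection, instead of a scan over every document. A real system would typically also store word positions and use a proper tokenizer, which this sketch leaves out.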