[ES] The difference between term and match

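term usage

Start with the definition: term means an exact match, that is, a precise query. The search text is not analyzed (tokenized) before the search runs.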

An example makes this concrete. First, index some data:

{
  "title": "love China",
  "content": "people very love China",
  "tags": ["China", "love"]
}

{
  "title": "love HuBei",
  "content": "people very love HuBei",
  "tags": ["HuBei", "love"]
}

Now run a term query:

{
  "query": {
    "term": {"title": "love"}
  }
}

Both documents above are returned:

{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {"title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei", "love"]}
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 0.6931472,
        "_source": {"title": "love China", "content": "people very love China", "tags": ["China", "love"]}
      }
    ]
  }
}

Every document whose title contains the term love is returned. But I only want an exact match on "love China". Let's see whether the following query finds it:

{
  "query": {
    "term": {"title": "love China"}
  }
}

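Running it returns no data. Conceptually, term is an exact match against a single indexed term, so the two-word string "love China" matches nothing. How can term match several words? Use terms:

{
  "query": {
    "terms": {"title": ["love", "China"]}
  }
}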

The result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {"title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei", "love"]}
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 0.6931472,
        "_source": {"title": "love China", "content": "people very love China", "tags": ["China", "love"]}
      }
    ]
  }
}

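Everything is returned. Why? Because the values inside the terms array [ ] are ORed: a document matches as soon as it contains any one of them. To require both words at the same time, use bool with must:

{
  "query": {
    "bool": {
      "must": [
        {"term": {"title": "love"}},
        {"term": {"title": "china"}}
      ]
    }
  }
}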
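The OR versus AND behaviour described above can be sketched in a few lines of Python. This is only a toy model, not how Elasticsearch is implemented: `analyze`, `terms_query`, `bool_must_terms`, and the in-memory `DOCS` dictionary are hypothetical names introduced for illustration, and the sketch assumes the title was split into lowercase tokens at index time.

```python
import re

def analyze(text):
    """Rough stand-in for the default analyzer: split on non-alphanumerics, lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

# Hypothetical in-memory copy of the two sample titles, keyed by _id.
DOCS = {"7": "love China", "8": "love HuBei"}
INDEXED = {doc_id: set(analyze(title)) for doc_id, title in DOCS.items()}

def terms_query(values):
    """OR semantics: a document matches if it contains any of the exact values."""
    return sorted(d for d, tokens in INDEXED.items()
                  if any(v in tokens for v in values))

def bool_must_terms(values):
    """AND semantics: a document matches only if it contains every exact value."""
    return sorted(d for d, tokens in INDEXED.items()
                  if all(v in tokens for v in values))

print(terms_query(["love", "China"]))      # ['7', '8']
print(bool_must_terms(["love", "china"]))  # ['7']
print(bool_must_terms(["love", "China"]))  # []
```

In the first call only love contributes a match, which is enough under OR semantics. The last call returns nothing because the uppercase value is compared as-is against lowercase tokens.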

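Note that the query above uses lowercase china. If we search with uppercase China instead, nothing is found. Why? The title field was analyzed (tokenized) when it was stored, and here the default analyzer did the work. Let's look at how it tokenizes the text.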

Analyzer

GET test/_analyze
{"text": "love China"}

The result:

{
  "tokens": [
    {"token": "love", "start_offset": 0, "end_offset": 4, "type": "<ALPHANUM>", "position": 0},
    {"token": "china", "start_offset": 5, "end_offset": 10, "type": "<ALPHANUM>", "position": 1}
  ]
}

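The analysis yields two tokens, love and china. term only matches those tokens exactly as they are, with no transformation, so a query for China fails. A later section covers analyzers in detail.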
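To make the tokenization step concrete, here is a minimal Python sketch that mimics the output above for plain ASCII text. It is a simplification and not the real analyzer: `analyze` and `term_matches` are hypothetical helper names, and the sketch only splits on non-alphanumeric characters and lowercases each token.

```python
import re

def analyze(text):
    """Simplified analyzer: one lowercase token per alphanumeric run, with offsets."""
    tokens = []
    for position, m in enumerate(re.finditer(r"[A-Za-z0-9]+", text)):
        tokens.append({
            "token": m.group().lower(),
            "start_offset": m.start(),
            "end_offset": m.end(),
            "position": position,
        })
    return tokens

def term_matches(indexed_tokens, value):
    """term-style lookup: the value is compared unchanged against each stored token."""
    return any(t["token"] == value for t in indexed_tokens)

tokens = analyze("love China")
for t in tokens:
    print(t)

print(term_matches(tokens, "china"))       # True
print(term_matches(tokens, "China"))       # False: stored token is lowercase
print(term_matches(tokens, "love China"))  # False: no single token equals the whole string
```

The last two calls correspond to the two failed term queries earlier: the uppercase value and the two-word value are both compared verbatim against the stored tokens love and china.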

match usage

First, match against love China.

GET test/doc/_search
{
  "query": {
    "match": {"title": "love China"}
  }
}

The result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": 2,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": {"title": "love China", "content": "people very love China", "tags": ["China", "love"]}
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {"title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei", "love"]}
      }
    ]
  }
}

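Both documents are returned. Why? A match query first analyzes the search text and only then matches the resulting tokens. The title tokens of the two documents above are love, china, and hubei. The search text love China is analyzed into love and china, and these tokens are ORed, so a document matches if it contains either one. What if love and China must both match? Use match_phrase.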
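The analyze-then-OR behaviour of match can be sketched as follows. This is a toy model under stated assumptions: `analyze`, `match_query`, and the in-memory `DOCS` dictionary are hypothetical names, and the "score" is simply the number of matched tokens, used here only to show why the document matching both tokens ranks first. It is not the scoring formula Elasticsearch uses.

```python
import re

def analyze(text):
    """Rough stand-in for the default analyzer: split on non-alphanumerics, lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

# Hypothetical in-memory copy of the two sample titles, keyed by _id.
DOCS = {"7": "love China", "8": "love HuBei"}
INDEXED = {doc_id: set(analyze(title)) for doc_id, title in DOCS.items()}

def match_query(text):
    """Analyze the query text, then return documents containing ANY query token."""
    query_tokens = set(analyze(text))
    hits = []
    for doc_id, tokens in INDEXED.items():
        matched = query_tokens & tokens
        if matched:
            # Toy score: count of matched tokens (an assumption, not real relevance scoring).
            hits.append((doc_id, len(matched)))
    return sorted(hits, key=lambda h: (-h[1], h[0]))

print(match_query("love China"))  # [('7', 2), ('8', 1)]
print(match_query("China"))       # [('7', 1)]
```

Unlike the term sketch, uppercase China works here because the query text goes through the same analysis as the indexed title.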

match_phrase usage

match_phrase is a phrase search: every token of the search text must appear in the document, in the same order and at adjacent positions.

GET test/doc/_search
{
  "query": {
    "match_phrase": {"title": "love china"}
  }
}

The result:

{
  "took": 5,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": 1,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": {"title": "love China", "content": "people very love China", "tags": ["China", "love"]}
      }
    ]
  }
}
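The adjacency requirement of match_phrase can be illustrated with a short Python sketch. It is a simplified model, not the actual implementation: `analyze` and `match_phrase` are hypothetical helpers, and the sketch assumes both the field value and the phrase are split into lowercase tokens and that the phrase tokens must occur as one contiguous, ordered run.

```python
import re

def analyze(text):
    """Rough stand-in for the default analyzer: split on non-alphanumerics, lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def match_phrase(field_text, phrase):
    """True if the analyzed phrase occurs as consecutive tokens in the analyzed field."""
    doc = analyze(field_text)
    query = analyze(phrase)
    n = len(query)
    # Slide a window of the phrase length over the document tokens.
    return n > 0 and any(doc[i:i + n] == query for i in range(len(doc) - n + 1))

print(match_phrase("love China", "love china"))              # True
print(match_phrase("love HuBei", "love china"))              # False
print(match_phrase("people very love China", "love china"))  # True
print(match_phrase("love China", "china love"))              # False: order matters
```

The second call shows why only document 7 is returned above: love HuBei contains love but not china directly after it.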
