1 Preparing the data
Index the documents
POST /forum/article/_bulk
{"index":{"_id":1}}
{"articleID":"XHDK-A-1293-#fJ3","userID":1,"hidden":false,"postDate":"2017-01-01"}
{"index":{"_id":2}}
{"articleID":"KDKE-B-9947-#kL5","userID":1,"hidden":false,"postDate":"2017-01-02"}
{"index":{"_id":3}}
{"articleID":"JODL-X-1937-#pV7","userID":2,"hidden":false,"postDate":"2017-01-01"}
{"index":{"_id":4}}
{"articleID":"QQPX-R-3956-#aD8","userID":2,"hidden":true,"postDate":"2017-01-02"}
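Add a title field with a bulk partial update
Note: document 5 is not created by the request above, so its update item fails with a document_missing_exception unless that document already exists in the index.
POST /forum/article/_bulk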
{"update":{"_id":"1"}}
{"doc":{"title":"this is java and elasticsearch blog"}}
{"update":{"_id":"2"}}
{"doc":{"title":"this is java blog"}}
{"update":{"_id":"3"}}
{"doc":{"title":"this is elasticsearch blog"}}
{"update":{"_id":"4"}}
{"doc":{"title":"this is java, elasticsearch, hadoop blog"}}
{"update":{"_id":"5"}}
{"doc":{"title":"this is spark blog"}}
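2 Full-text matching on analyzed text
2.1 Search for articles whose title contains java or elasticsearch
The match query performs full-text search: it analyzes the query string into terms and, by default, matches any document that contains at least one of them.
GET /forum/article/_search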
{
"query":{
"match":{
"title":"java elasticsearch"
}
}
}
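The default "or" behaviour can be illustrated outside Elasticsearch with a small sketch. The analyze and match_any functions below are hypothetical helpers, and the regex tokenizer is only a rough stand-in for the analyzer that Elasticsearch actually applies:

```python
import re

# Titles of documents 1 to 4 from the bulk update above.
TITLES = {
    1: "this is java and elasticsearch blog",
    2: "this is java blog",
    3: "this is elasticsearch blog",
    4: "this is java, elasticsearch, hadoop blog",
}


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def match_any(query, titles):
    """Return the ids of titles that share at least one term with the query (OR semantics)."""
    query_terms = set(analyze(query))
    return [doc_id for doc_id, title in titles.items()
            if query_terms & set(analyze(title))]


matched = match_any("java elasticsearch", TITLES)
print(matched)
```

Under these assumptions every one of the four titles matches, because each contains java, elasticsearch, or both.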
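2.2 Search for articles whose title contains both java and elasticsearch
Set operator to and when every search term has to match. A plain match query defaults to or, so it cannot express this on its own.
GET /forum/article/_search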
{
"query":{
"match":{
"title":{
"query":"java elasticsearch",
"operator":"and"
}
}
}
}
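One way to read the operator setting is as a choice between must and should in an equivalent bool query of term queries. The sketch below builds that equivalent query as a Python dict; match_to_bool is a hypothetical helper, and the regex split only approximates real analysis:

```python
import json
import re


def match_to_bool(field, text, operator="or"):
    """Build the bool query of term queries that a match query roughly corresponds to."""
    terms = re.findall(r"\w+", text.lower())  # crude stand-in for the analyzer
    clauses = [{"term": {field: term}} for term in terms]
    # operator "and" requires every term (must); "or" makes each term optional (should)
    occur = "must" if operator == "and" else "should"
    return {"bool": {occur: clauses}}


and_query = match_to_bool("title", "java elasticsearch", operator="and")
print(json.dumps(and_query, indent=2))
```

With operator="and" both term queries land under must, which is why only titles containing both words are returned.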
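2.3 Search for articles that contain at least 3 of the 4 keywords java, elasticsearch, spark, hadoop
minimum_should_match sets how many of the query terms a document must match before it is returned. Here 75% of 4 terms means at least 3.
GET /forum/article/_search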
{
"query":{
"match":{
"title":{
"query":"java elasticsearch spark hadoop",
"minimum_should_match":"75%"
}
}
}
}
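The arithmetic behind the percentage can be sketched as follows. The helper names are hypothetical, the sketch handles positive percentages only, and it assumes the computed value is rounded down, as the Elasticsearch documentation describes for percentages:

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def required_matches(term_count, minimum_should_match):
    """Turn a positive percentage such as "75%" into a term count, rounding down."""
    percent = int(minimum_should_match.rstrip("%"))
    return term_count * percent // 100


def satisfies(query, title, minimum_should_match):
    """Check whether a title matches enough of the query terms."""
    query_terms = set(analyze(query))
    matched = len(query_terms & set(analyze(title)))
    return matched >= required_matches(len(query_terms), minimum_should_match)


QUERY = "java elasticsearch spark hadoop"
needed = required_matches(4, "75%")
full = satisfies(QUERY, "this is java, elasticsearch, hadoop blog", "75%")
partial = satisfies(QUERY, "this is java and elasticsearch blog", "75%")
print(needed, full, partial)
```

The title with java, elasticsearch and hadoop reaches the threshold of 3, while the title with only java and elasticsearch does not.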
2.4 Combine multiple search conditions on title with bool
GET /forum/article/_search
{
"query":{
"bool":{
"must":{"match":{"title":"java"}},
"must_not":{"match":{"title":"spark"}},
"should":[
{"match":{"title":"hadoop"}},
{"match":{"title":"elasticsearch"}}
]
}
}
}
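must requires the keyword to be present, and the must clause also contributes to the document's relevance score. Once must is satisfied, the should clauses are optional, but the more of them a document matches, the higher its relevance score.
In other words, should affects the relevance score.
Rank 1: contains java plus both should keywords, hadoop and elasticsearch ("_score":1.3375794)
Rank 2: contains java plus elasticsearch from should ("_score":0.53484553)
Rank 3: contains java but none of the should keywords ("_score":0.19856805)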
{
"took":6,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":3,
"max_score":1.3375794,
"hits":[
{
"_index":"forum",
"_type":"article",
"_id":"4",
"_score":1.3375794,
"_source":{
"articleID":"QQPX-R-3956-#aD8",
"userID":2,
"hidden":true,
"postDate":"2017-01-02",
"tag":[
"java",
"elasticsearch"
],
"tag_cnt":2,
"view_cnt":80,
"title":"this is java, elasticsearch, hadoop blog"
}
},
{
"_index":"forum",
"_type":"article",
"_id":"1",
"_score":0.53484553,
"_source":{
"articleID":"XHDK-A-1293-#fJ3",
"userID":1,
"hidden":false,
"postDate":"2017-01-01",
"tag":[
"java",
"hadoop"
],
"tag_cnt":2,
"view_cnt":30,
"title":"this is java and elasticsearch blog"
}
},
{
"_index":"forum",
"_type":"article",
"_id":"2",
"_score":0.19856805,
"_source":{
"articleID":"KDKE-B-9947-#kL5",
"userID":1,
"hidden":false,
"postDate":"2017-01-02",
"tag":[
"java"
],
"tag_cnt":1,
"view_cnt":50,
"title":"this is java blog"
}
}
]
}
}
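The ordering in this response can be reproduced with a simplified model. In the sketch below, bool_search is a hypothetical helper, and the number of matched should terms stands in for the relevance score; Elasticsearch computes the real score from term statistics, so only the order is comparable, not the values:

```python
import re

# Titles of documents 1 to 4 from the bulk update above.
TITLES = {
    1: "this is java and elasticsearch blog",
    2: "this is java blog",
    3: "this is elasticsearch blog",
    4: "this is java, elasticsearch, hadoop blog",
}


def terms(text):
    """Rough stand-in for an analyzer: the set of lowercase words in the text."""
    return set(re.findall(r"\w+", text.lower()))


def bool_search(titles, must, must_not, should):
    """Filter by must / must_not, then rank by how many should terms each title matches."""
    hits = []
    for doc_id, title in titles.items():
        words = terms(title)
        if not all(term in words for term in must):
            continue  # a required term is missing
        if any(term in words for term in must_not):
            continue  # an excluded term is present
        should_hits = sum(1 for term in should if term in words)
        hits.append((doc_id, should_hits))
    # More matched should terms ranks higher; this is a proxy, not the real score.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


ranking = bool_search(TITLES, must=["java"], must_not=["spark"],
                      should=["hadoop", "elasticsearch"])
print(ranking)
```

Document 3 drops out because it lacks java, and the remaining documents come back in the same order as the response: 4, then 1, then 2.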
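2.5 Search for java, hadoop, spark, elasticsearch, matching at least 3 of the keywords
By default, a document does not have to match any should clause; in the search above, "this is java blog" matches none of them. There is one exception: when the bool query has no must (or filter) clause, at least one should clause has to match. In the search below, should has 4 clauses, so by default matching any one of them would be enough for a document to be returned.
minimum_should_match gives precise control over how many of the 4 should clauses must match before a document is returned.
GET /forum/article/_search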
{
"query":{
"bool":{
"should":[
{"match":{"title":"java"}},
{"match":{"title":"elasticsearch"}},
{"match":{"title":"hadoop"}},
{"match":{"title":"spark"}}
],
"minimum_should_match":3
}
}
}
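The default rule described above can be written down as a small function. This is a sketch of the documented behaviour rather than Elasticsearch code, and effective_minimum_should_match is a hypothetical name:

```python
def effective_minimum_should_match(bool_query):
    """Return how many should clauses must match for a bool query body given as a dict."""
    if "minimum_should_match" in bool_query:
        return bool_query["minimum_should_match"]  # an explicit setting wins
    if not bool_query.get("should"):
        return 0  # no should clauses, nothing to require
    has_required = "must" in bool_query or "filter" in bool_query
    # With must/filter present, should is optional; otherwise one clause must match.
    return 0 if has_required else 1


should_clauses = [{"match": {"title": word}}
                  for word in ("java", "elasticsearch", "hadoop", "spark")]

only_should = effective_minimum_should_match({"should": should_clauses})
with_must = effective_minimum_should_match(
    {"must": {"match": {"title": "java"}}, "should": should_clauses})
explicit = effective_minimum_should_match(
    {"should": should_clauses, "minimum_should_match": 3})
print(only_should, with_must, explicit)
```

The three calls cover the cases in this section: should alone requires one match, should next to must requires none, and an explicit minimum_should_match of 3 overrides both defaults.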
3 Summary