1、搜索title或content中包含java beginner的帖子
GET /forum/article/_search { "query": { "dis_max": { "queries": [ { "match": { "title": "java beginner" }}, { "match": { "body": "java beginner" }} ] } } } 结果: { "took": 18, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 3, "max_score": 0.26742277, "hits": [ { "_index": "forum", "_type": "article", "_id": "1", "_score": 0.26742277, "_source": { "articleID": "XHDK-A-1293-#fJ3", "userID": 1, "hidden": false, "postDate": "2017-01-01", "tag": [ "java", "hadoop" ], "tag_cnt": 2, "view_cnt": 30, "title": "this is java and elasticsearch blog", "content": "i like to write best elasticsearch article" } }, { "_index": "forum", "_type": "article", "_id": "2", "_score": 0.19856805, "_source": { "articleID": "KDKE-B-9947-#kL5", "userID": 1, "hidden": false, "postDate": "2017-01-02", "tag": [ "java" ], "tag_cnt": 1, "view_cnt": 50, "title": "this is java blog", "content": "i think java is the best programming language" } }, { "_index": "forum", "_type": "article", "_id": "4", "_score": 0.155468, "_source": { "articleID": "QQPX-R-3956-#aD8", "userID": 2, "hidden": true, "postDate": "2017-01-02", "tag": [ "java", "elasticsearch" ], "tag_cnt": 2, "view_cnt": 80, "title": "this is java, elasticsearch, hadoop blog", "content": "elasticsearch and hadoop are all very good solution, i am a beginner" } } ] } } |
有些场景不是太好复现的,因为是这样,你需要尝试去构造不同的文本,然后去构造一些搜索出来,去达到你要的一个效果
可能在实际场景中出现的一个情况是这样的:
(1)某个帖子,doc1,title中包含java,content不包含java beginner任何一个关键词
(2)某个帖子,doc2,content中包含beginner,title中不包含任何一个关键词
(3)某个帖子,doc3,title中包含java,content中包含beginner
(4)最终搜索,可能出来的结果是,doc1和doc2排在doc3的前面,而不是我们期望的doc3排在最前面
dis_max,只是取分数最高的那个query的分数而已。如果这样说的话,(1)(2)(3)的最大的分数是一样的
2、dis_max只取某一个query最大的分数,完全不考虑其他query的分数
3、使用tie_breaker将其他query的分数也考虑进去
tie_breaker参数的意义,在于说,将其他query的分数,乘以tie_breaker,然后综合与最高分数的那个query的分数,综合在一起进行计算
除了取最高分以外,还会考虑其他的query的分数
tie_breaker的值,在0~1之间,是个小数,就ok
GET /forum/article/_search { "query": { "dis_max": { "queries": [ { "match": { "title": "java beginner" }}, { "match": { "body": "java beginner" }} ], "tie_breaker": 0.3 } } }
结果: { "took": 10, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 3, "max_score": 0.26742277, "hits": [ { "_index": "forum", "_type": "article", "_id": "1", "_score": 0.26742277, "_source": { "articleID": "XHDK-A-1293-#fJ3", "userID": 1, "hidden": false, "postDate": "2017-01-01", "tag": [ "java", "hadoop" ], "tag_cnt": 2, "view_cnt": 30, "title": "this is java and elasticsearch blog", "content": "i like to write best elasticsearch article" } }, { "_index": "forum", "_type": "article", "_id": "2", "_score": 0.19856805, "_source": { "articleID": "KDKE-B-9947-#kL5", "userID": 1, "hidden": false, "postDate": "2017-01-02", "tag": [ "java" ], "tag_cnt": 1, "view_cnt": 50, "title": "this is java blog", "content": "i think java is the best programming language" } }, { "_index": "forum", "_type": "article", "_id": "4", "_score": 0.155468, "_source": { "articleID": "QQPX-R-3956-#aD8", "userID": 2, "hidden": true, "postDate": "2017-01-02", "tag": [ "java", "elasticsearch" ], "tag_cnt": 2, "view_cnt": 80, "title": "this is java, elasticsearch, hadoop blog", "content": "elasticsearch and hadoop are all very good solution, i am a beginner" } } ] } } |