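Scoring and Sorting
● By default, Elasticsearch sorts results by each document's relevance score
● You can also sort by one or more specified fields
● Sorting by relevance score (_score) alone cannot satisfy certain requirements
● Plain relevance sorting gives no finer control over how the ranking is computed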
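Function Score Query
● A Function Score Query rescores every matching document after the query has run, applying one or more functions, and sorts by the newly computed scores.
● It provides several built-in scoring functions
● Weight: assigns each document a simple weight that is not normalized
● Field Value Factor: uses a field's value to modify _score, for example taking "popularity" or "number of likes" into account
● Random Score: gives each user a different, random ranking
● Decay functions: score a document by how close a field's value is to a reference value; the closer it is, the higher the score
● Script Score: a custom script gives full control over the scoring logic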
POST /blogs/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "popularity",
"fields": [ "title", "content" ]
}
}
}
}
}
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.13353139,
"hits" : [
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.13353139,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 0
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.13353139,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 100
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.13353139,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 1000000
}
}
]
}
}
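The goal is to place blogs with more votes relatively higher in the search results, while the relevance score of the search remains the main basis for sorting.
● New score = old score * number of votes
● When the number of votes is 0, the new score is 0, no matter how relevant the document is
● When the number of votes is very large, the vote count dominates and the score becomes huge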
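The two edge cases can be checked with plain arithmetic. The sketch below is not Elasticsearch code; it only assumes the baseline score 0.13353139 from the first response and the three vote counts of the sample documents. The helper name `multiplied_score` is made up for illustration.

```python
# Baseline relevance score taken from the first response (all three posts tie).
OLD_SCORE = 0.13353139


def multiplied_score(old_score: float, votes: int) -> float:
    """New score = old score * votes (field_value_factor with no modifier)."""
    return old_score * votes


# Vote counts of the three sample documents.
scores = {votes: multiplied_score(OLD_SCORE, votes) for votes in (0, 100, 1_000_000)}

for votes, score in scores.items():
    print(f"votes={votes:>9}  new_score={score:.6f}")
```

A post with 0 votes drops to a score of 0, and a post with a million votes outweighs any difference in relevance.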
POST /blogs/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "popularity",
"fields": [ "title", "content" ]
}
},
"field_value_factor": {
"field": "votes"
}
}
}
}
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 133531.39,
"hits" : [
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "3",
"_score" : 133531.39,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 1000000
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "2",
"_score" : 13.353139,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 100
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.0,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 0
}
}
]
}
}
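The resulting _score is far too large, so we use a modifier to smooth the curve. With log1p: new score = old score * log( 1 + number of votes )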
POST /blogs/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "popularity",
"fields": [ "title", "content" ]
}
},
"field_value_factor": {
"field": "votes",
"modifier": "log1p"
}
}
}
}
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.8011884,
"hits" : [
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.8011884,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 1000000
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.26763982,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 100
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.0,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 0
}
}
]
}
}
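The log1p scores in the response above can be reproduced by hand. This is a minimal sketch, assuming that log1p uses the base-10 logarithm and that the query score is the baseline 0.13353139 from the first response; `log1p_score` is a hypothetical helper, not part of any Elasticsearch client.

```python
import math

# Baseline relevance score taken from the first response.
OLD_SCORE = 0.13353139


def log1p_score(old_score: float, votes: int) -> float:
    """New score = old score * log10(1 + votes)."""
    return old_score * math.log10(1 + votes)


for votes in (0, 100, 1_000_000):
    print(f"votes={votes:>9}  new_score={log1p_score(OLD_SCORE, votes):.7f}")
```

The logarithm compresses a 10,000x difference in votes (100 versus 1,000,000) into roughly a 3x difference in score.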
Introducing factor: new score = old score * log( 1 + factor * number of votes ), where log is the base-10 logarithm
POST /blogs/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "popularity",
"fields": [ "title", "content" ]
}
},
"field_value_factor": {
"field": "votes",
"modifier": "log1p" ,
"factor": 0.1
}
}
}
}
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.66765755,
"hits" : [
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.66765755,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 1000000
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.13905862,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 100
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.0,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 0
}
}
]
}
}
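To get a feel for what factor does, the sketch below evaluates the formula for a few factor values. It is plain arithmetic under the same assumptions as the formula (base-10 logarithm, baseline score 0.13353139 from the first response); the factor values other than 0.1 and the helper name `factor_score` are chosen only for illustration.

```python
import math

# Baseline relevance score taken from the first response.
OLD_SCORE = 0.13353139


def factor_score(old_score: float, votes: int, factor: float) -> float:
    """New score = old score * log10(1 + factor * votes)."""
    return old_score * math.log10(1 + factor * votes)


# Compare several factors for the same vote counts.
for factor in (0.1, 1.0, 10.0):
    row = ", ".join(
        f"votes={votes}: {factor_score(OLD_SCORE, votes, factor):.7f}"
        for votes in (100, 1_000_000)
    )
    print(f"factor={factor:<4} {row}")
```

A smaller factor scales the vote count down before the logarithm is taken, which lowers the influence of votes relative to the relevance score.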
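Boost Mode and Max Boost
● Boost Mode
● Multiply: the product of the query score and the function value (the default)
● Sum: the sum of the query score and the function value
● Min / Max: the smaller / larger of the query score and the function value
● Replace: the function value replaces the query score
● Max Boost caps the function value at a maximum before it is combined with the query score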
POST /blogs/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "popularity",
"fields": [ "title", "content" ]
}
},
"field_value_factor": {
"field": "votes",
"modifier": "log1p" ,
"factor": 0.1
},
"boost_mode": "sum",
"max_boost": 3
}
}
}
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 3.1335313,
"hits" : [
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "3",
"_score" : 3.1335313,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 1000000
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.1749241,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 100
}
},
{
"_index" : "blogs",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.13353139,
"_source" : {
"title" : "About popularity",
"content" : "In this post we will talk about...",
"votes" : 0
}
}
]
}
}
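The way boost_mode and max_boost interact can be sketched in a few lines. This is an illustration rather than Elasticsearch's implementation: `apply_boost` is a hypothetical helper that covers only the modes listed earlier and assumes the cap is applied to the function value before the two numbers are combined, which is what the scores in the response above suggest.

```python
import math


def apply_boost(query_score, function_value, boost_mode="multiply", max_boost=math.inf):
    """Combine a query score with a function value, mimicking boost_mode and max_boost."""
    capped = min(function_value, max_boost)  # the cap applies to the function value only
    if boost_mode == "multiply":
        return query_score * capped
    if boost_mode == "sum":
        return query_score + capped
    if boost_mode == "min":
        return min(query_score, capped)
    if boost_mode == "max":
        return max(query_score, capped)
    if boost_mode == "replace":
        return capped
    raise ValueError(f"unsupported boost_mode: {boost_mode}")


QUERY_SCORE = 0.13353139  # baseline relevance score from the first response
for votes in (0, 100, 1_000_000):
    function_value = math.log10(1 + 0.1 * votes)  # log1p with factor 0.1
    total = apply_boost(QUERY_SCORE, function_value, "sum", max_boost=3)
    print(f"votes={votes:>9}  function={function_value:.7f}  score={total:.7f}")
```

For the post with 1,000,000 votes the function value of about 5.0 is capped at 3, so the final score is about 3.13: the final score itself can still exceed max_boost.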
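Consistent Random Scoring
● Use case: ads on a website need more exposure
● Requirement: each user should see a different random ranking, but the relative order of the results should stay the same whenever the same user comes back (consistently random)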
POST /blogs/_search
{
"query": {
"function_score": {
"random_score": {
"seed": 911119
}
}
}
}
As long as the seed stays the same, every search returns the results in the same order
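The idea behind a seeded, consistent random score can be illustrated outside Elasticsearch. The sketch below is only an analogy and makes no claim about how random_score is implemented internally: it hashes a seed together with a document id so that the same pair always maps to the same pseudo-random value. The function names and the use of SHA-256 are assumptions made for this example.

```python
import hashlib


def consistent_random_score(seed: int, doc_id: str) -> float:
    """Map (seed, doc_id) to a repeatable pseudo-random value in [0, 1)."""
    digest = hashlib.sha256(f"{seed}:{doc_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def rank(seed: int, doc_ids: list) -> list:
    """Order documents by their seeded score, highest first."""
    return sorted(doc_ids, key=lambda d: consistent_random_score(seed, d), reverse=True)


docs = ["1", "2", "3"]
print(rank(911119, docs))  # the same seed always yields the same order
print(rank(424242, docs))  # another seed may yield a different order
```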
POST blogs/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "talk",
"fields": ["title","content"]
}},
"functions": [
{
"field_value_factor": {
"field": "votes",
"modifier": "log2p"
}},
{
"random_score": {
"seed": 911119
}
}
]
}
}
}
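One way to give each user a stable but individual ranking is to derive the seed from the user. The sketch below builds the same request body as above as a Python dictionary; `build_search_body`, the `user_id` parameter, and the choice of CRC32 for deriving the seed are all assumptions made for illustration, and sending the body to the cluster is left out.

```python
import json
import zlib


def build_search_body(user_id: str, text: str) -> dict:
    """Build a function_score request body with a seed derived from the user id."""
    # CRC32 is stable across processes, unlike Python's built-in hash() for strings.
    seed = zlib.crc32(user_id.encode("utf-8"))
    return {
        "query": {
            "function_score": {
                "query": {
                    "multi_match": {"query": text, "fields": ["title", "content"]}
                },
                "functions": [
                    {"field_value_factor": {"field": "votes", "modifier": "log2p"}},
                    {"random_score": {"seed": seed}},
                ],
            }
        }
    }


body = build_search_body("alice", "talk")
print(json.dumps(body, indent=2))
```

Because the seed depends only on the user id, repeated searches by the same user carry the same seed, while different users will normally get different seeds.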