ES vector search: function_score error

"reason": "function score query returned an invalid score: NaN for doc: 17085"

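The cause is that cosineSimilarity, the scoring function used for the vector search, has to compute the norm (magnitude) of both vectors and divide by their product. If a stored vector is all zeros, its norm is 0, the division becomes 0/0, and the score comes out as NaN.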
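A minimal Python sketch of why an all-zero vector breaks the cosine calculation. This is not the Elasticsearch implementation; the helper name `cosine_similarity` and the sample vectors are made up for illustration. Python raises `ZeroDivisionError` on `0.0 / 0.0`, so the sketch returns NaN explicitly to mimic the IEEE 754 double arithmetic that the error message above points to.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    denom = norm_a * norm_b
    # Assumption: emulate double arithmetic where 0.0 / 0.0 is NaN.
    # Plain Python would raise ZeroDivisionError here instead.
    if denom == 0.0:
        return math.nan
    return dot / denom

query_vector = [0.17, 0.03, 0.02]   # hypothetical query embedding
zero_vector = [0.0, 0.0, 0.0]       # a document whose vector is all zeros

nan_similarity = cosine_similarity(query_vector, zero_vector)
# Same shape as the script in the query: (similarity + 1.0) * 0.6
nan_score = (nan_similarity + 1.0) * 0.6

print(math.isnan(nan_score))
```

NaN propagates through the `+ 1.0` and `* 0.6` steps, which is why the final function score is reported as invalid rather than as some small number.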

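So if the index contains all-zero vectors, you can replace the cosineSimilarity calculation with another vector similarity function, for example dotProduct. Note that dotProduct only matches cosineSimilarity when both vectors are normalized to unit length, so the ranking may change otherwise.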

GET reference/_search
{
  "explain": true,
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "清华大学艺术系的小明同学",
          "fields": ["name", "school", "department", "home_address"]
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "source": "(cosineSimilarity(params.query_vector, 'text_vector') + 1.0)*0.6", 
              "params": {
                "query_vector": [0.17213014641107321, 0.03529149917147957, 0.021164631815954453, 0.04276578593911642, -0.011539837034197614, -0.040505111467529525, -0.013804620485218865, 0.07954305265937577, -0.03259408482301212, 0.042030898483276985, -0.08187475423584016, 0.02331000497613773, -0.012518793076655677, 0.028751927085989483, -0.0450599553695574, 0.02574408603742391, -0.012067035100387393, 0.08755896933278114, 0.0043671871492196096]
              }
            }
          }
        },
        {
          "field_value_factor": {
            "field": "page_r", 
            "modifier": "log1p", 
            "factor": 100, 
            "missing": 1
          }
        }
      ],
      "score_mode": "max", 
      "boost_mode": "multiply"
    }
  },
  "_source": ["student_id", "school", "name"], 
  "size": 20
}

Replace cosineSimilarity with:

"source": "(dotProduct(params.query_vector, 'text_vector') + 1.0)*0.6",
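
To see what the replacement does for an all-zero document vector, here is a small Python sketch, again only an illustration and not Elasticsearch code. The names `dot_product` and `script_score` and the sample vectors are hypothetical; `script_score` just mirrors the `(similarity + 1.0)*0.6` expression from the script.

```python
import math

def dot_product(a, b):
    """Plain dot product: no norms, so no division is involved."""
    return sum(x * y for x, y in zip(a, b))

def script_score(similarity):
    # Mirrors the script expression "(similarity + 1.0)*0.6".
    return (similarity + 1.0) * 0.6

query_vector = [0.6, 0.8]    # hypothetical unit-length query vector
zero_vector = [0.0, 0.0]     # document with an all-zero vector
doc_vector = [0.8, 0.6]      # hypothetical unit-length document vector

zero_doc_score = script_score(dot_product(query_vector, zero_vector))
normal_doc_score = script_score(dot_product(query_vector, doc_vector))

# Both scores are ordinary finite numbers; the zero vector no longer yields NaN.
print(math.isnan(zero_doc_score), math.isnan(normal_doc_score))
```

Because the dot product never divides by a norm, a zero vector simply contributes a similarity of 0. Keep in mind that for vectors that are not unit length the dot product is not bounded to [-1, 1], so the `+ 1.0` offset no longer guarantees the same score range as with cosineSimilarity.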
