Using function_score in Elasticsearch (see the CSDN blog post "ES自定义评分机制:function_score查询详解" by 斗者_2013):
{
"query": {
"function_score": {
"functions": [
{
"gauss": {
"ipublish": {
"origin": 1661498609,
"offset":0,
"scale": 864000
}
}
}
],
"query": {
"dis_max": {
"queries": [
{
"constant_score": {
"boost": 10000,
"filter": {
"bool":{
"should":[
{ "match_phrase": {
"title": "万东医疗"
}}
]
}
}
}
},
{
"constant_score": {
"boost": 1,
"filter": {
"bool":{
"should":[
{ "match": {
"title": "万东医疗"
}}
]
}
}
}
}
]
}
},
"boost_mode": "sum"
}
}
}
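To make the scoring in the query above concrete, here is a minimal Python sketch of the gauss decay formula as documented for Elasticsearch, using the default decay of 0.5. It assumes ipublish holds epoch seconds (so the scale of 864000 is 10 days); the helper names gauss_decay and final_score are illustrative and not part of any Elasticsearch client.

```python
import math

def gauss_decay(value, origin, scale, offset=0.0, decay=0.5):
    """Approximate the score produced by the gauss decay function."""
    # Distance from the origin; anything inside the offset window counts as 0.
    distance = max(0.0, abs(value - origin) - offset)
    # sigma^2 is chosen so that the score equals `decay` at distance == scale.
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))

ORIGIN = 1661498609  # taken from the query above (assumed to be epoch seconds)
SCALE = 864000       # taken from the query above (10 days in seconds)

def final_score(query_score, ipublish):
    """boost_mode "sum" adds the function score to the query score."""
    return query_score + gauss_decay(ipublish, ORIGIN, SCALE)

# A phrase match (query score 10000) published exactly at the origin
print(final_score(10000, ORIGIN))
# A plain term match (query score 1) published 10 days before the origin
print(final_score(1, ORIGIN - SCALE))
```

Because the decay value never exceeds 1, it only reorders documents within the same match tier: any phrase match (10000) still ranks above any plain term match (1).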
Ignoring the TF-IDF relevance score with constant_score:
{
"query": {
"bool": {
"should": [
{
"constant_score": {
"boost": 10000,
"filter": {
"bool":{
"should":[
{ "match_phrase": {
"title": "万东医疗"
}}
]
}
}
}
},
{
"constant_score": {
"boost": 1,
"filter": {
"bool":{
"should":[
{ "match": {
"title": "万东医疗"
}}
]
}
}
}
}
]
}
}
}
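The two queries so far combine the same constant_score clauses in different ways: dis_max keeps the best clause score, while bool/should adds up the scores of all matching clauses. The sketch below illustrates the difference in plain Python. It assumes a recent Elasticsearch version in which bool simply sums the matching clause scores, and the function names are illustrative only.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Score of a dis_max query, given the scores of its matching clauses."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    # Non-best clauses contribute only through the tie_breaker (default 0).
    return best + tie_breaker * (sum(clause_scores) - best)

def bool_should_score(clause_scores):
    """Score of a bool query whose matching should clauses are summed."""
    return float(sum(clause_scores))

# A document matching the phrase also matches the plain match clause,
# so both constant_score clauses (10000 and 1) apply.
print(dis_max_score([10000, 1]))      # phrase match under dis_max
print(bool_should_score([10000, 1]))  # phrase match under bool/should
print(bool_should_score([1]))         # term-only match
```

Either way, phrase matches end up far above term-only matches; the choice mainly affects whether the small boost is added on top of the large one.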
Copying an index with reindex:
POST http://IP:9200/_reindex
{
"source": {
"index": "original_index"
},
"dest": {
"index": "new_index"
}
}
Switching an index alias:
POST http://IP:9200/_aliases
{
"actions":[
{
"add":{
"index":"new_index",
"alias":"t_simple_dict"
}
},
{
"remove":{
"index":"old_index",
"alias":"t_simple_dict"
}
}
]
}
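The reindex and alias requests above are typically used together: copy the data into a new index, then repoint the alias. Since all actions in a single _aliases request are applied atomically, the alias never points at nothing in between. Below is a small sketch that builds both request bodies in Python; the function names are hypothetical helpers, and actually sending the bodies (for example with an HTTP client) is left out.

```python
import json

def build_reindex_body(source_index, dest_index):
    """Body for POST /_reindex: copy every document from source to dest."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}

def build_alias_swap_body(alias, old_index, new_index):
    """Body for POST /_aliases: move `alias` from old_index to new_index."""
    return {
        "actions": [
            {"add": {"index": new_index, "alias": alias}},
            {"remove": {"index": old_index, "alias": alias}},
        ]
    }

reindex_body = build_reindex_body("original_index", "new_index")
swap_body = build_alias_swap_body("t_simple_dict", "old_index", "new_index")
print(json.dumps(reindex_body, indent=2))
print(json.dumps(swap_body, indent=2))
```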
Setting the index refresh interval:
PUT http://IP:9200/index_name/_settings
{
"refresh_interval": "15s"
}
Raising the maximum number of aggregation buckets:
PUT http://IP:9200/_cluster/settings
{
"persistent": {
"search.max_buckets": 20000
}
}
Bucket counts with a terms aggregation:
POST http://IP:9200/index_data/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"group_name1": {
"terms": {
"size": 20000,
"field": "field_name.keyword",
"min_doc_count": 15
}
}
}
}
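As a rough mental model of what the terms aggregation above returns, the following Python sketch counts values in memory and applies size and min_doc_count. It assumes one value per document and ignores the approximation that can occur when counts are merged across shards; terms_aggregation is an illustrative name, not an Elasticsearch API.

```python
from collections import Counter

def terms_aggregation(values, size, min_doc_count=1):
    """Group values into buckets, most frequent first, like a terms aggregation."""
    counts = Counter(values)
    # Sort by descending count, then by key to keep ties deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    buckets = [
        {"key": key, "doc_count": count}
        for key, count in ordered
        if count >= min_doc_count  # drop buckets with too few documents
    ]
    return buckets[:size]

# Hypothetical field values, one per document
values = ["a"] * 3 + ["b"] * 2 + ["c"]
buckets = terms_aggregation(values, size=20000, min_doc_count=2)
print(buckets)
```

With min_doc_count set to 15 as in the request, only values that occur in at least 15 documents would be returned.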
Range aggregation on a numeric field:
{
"size": 0,
"aggs":{
"grade_ranges":{
"range":{
"field":"event_type",
"ranges":[
{"from":0,"to":5},
{"from":5,"to":12}]
}
}
}
}
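One detail of the range aggregation is easy to miss: each range includes its from value and excludes its to value, so a document with event_type 5 lands in the second bucket, not the first. The sketch below reproduces that bucketing in Python over hypothetical values; range_aggregation is an illustrative helper.

```python
def range_aggregation(values, ranges):
    """Count values per range; `from` is inclusive and `to` is exclusive."""
    buckets = []
    for r in ranges:
        low, high = r.get("from"), r.get("to")
        count = sum(
            1
            for v in values
            if (low is None or v >= low) and (high is None or v < high)
        )
        buckets.append({"from": low, "to": high, "doc_count": count})
    return buckets

# Same ranges as the request above; the values are made up for illustration.
ranges = [{"from": 0, "to": 5}, {"from": 5, "to": 12}]
result = range_aggregation([0, 4, 5, 11, 12], ranges)
print(result)
```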
Aggregating over filtered data:
{
"size": 0,
"query": {
"bool": {
"should": [
{
"term": {
"event_type": 1
}
}
]
}
},
"aggs": {
"grade_ranges": {
"range": {
"field": "age",
"ranges": [
{
"from": 1,
"to": 2
}
]
}
}
}
}
Average aggregation:
{
"size": 0,
"query": {
"constant_score": {
"filter": {
"range": {
"price": {
"gte": 10000
}
}
}
}
},
"aggs": {
"single_avg_price": {
"avg": {
"field": "price"
}
}
}
}
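Multiple aggregations, each with its own filter (rec_filter_list, click_filter_list and click_static_field are placeholders to be filled in by the calling code):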
{
"query": {
"match_all": {}
},
"size": 0,
"aggs": {
"rec_result": {
"filter": {
"bool": {
"must": rec_filter_list
}
},
"aggs": {
"rec_agg": {
"cardinality": {
"field": "event_id"
}
}
}
},
"click_result": {
"filter": {
"bool": {
"must": click_filter_list
}
},
"aggs": {
"click_agg": {
"cardinality": {
"field": click_static_field
}
}
}
}
}
}
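The aggregations here are nested: within each filter, cardinality returns the (approximate) number of distinct values of the given field among the documents that pass that filter.
Note also that a filter aggregation accepts only a single query as its filter; to filter on several conditions, nest them inside a bool query.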
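The request above leaves its filters and one field name as placeholders. A minimal Python sketch of how such a body might be assembled is shown below. The function name, the example filter clauses, and the user_id field are all assumptions made for illustration; only the overall shape comes from the request above.

```python
import json

def build_filtered_cardinality_body(rec_filter_list, click_filter_list,
                                    click_static_field):
    """Build a search body with two filter aggregations, each wrapping a
    cardinality sub-aggregation."""
    def filtered(agg_name, filters, field):
        return {
            # Several conditions go inside one bool query.
            "filter": {"bool": {"must": filters}},
            "aggs": {agg_name: {"cardinality": {"field": field}}},
        }

    return {
        "query": {"match_all": {}},
        "size": 0,
        "aggs": {
            "rec_result": filtered("rec_agg", rec_filter_list, "event_id"),
            "click_result": filtered("click_agg", click_filter_list,
                                     click_static_field),
        },
    }

body = build_filtered_cardinality_body(
    rec_filter_list=[{"term": {"event_type": 1}}],            # hypothetical
    click_filter_list=[{"term": {"event_type": 2}},           # hypothetical
                       {"range": {"age": {"gte": 18}}}],
    click_static_field="user_id",                             # hypothetical
)
print(json.dumps(body, indent=2))
```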
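ignore_malformed is a mapping parameter that defaults to false: indexing a value of the wrong data type into the field raises an error and the document is rejected. When it is set to true, the malformed value is ignored (the field is simply not indexed for that document) and the rest of the document is indexed as usual.
When you are not sure what data types the incoming data will have, setting this parameter to true can help: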
{
"mappings":{
"properties": {
"create_date": {
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_second",
"type": "date",
"ignore_malformed": true
}
}
}
}
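Querying documents whose student field is not empty (that is, excluding documents where the field is missing or is an array with no elements):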
{
"query": {
"bool": {
"must": {
"exists": {
"field": "student"
}
}
}
}
}
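To spell out which documents the exists query above matches, here is a small Python predicate that mirrors the documented behaviour: a missing field, null, an empty array, and an array containing only nulls do not count as existing, while an empty string does. It assumes the field is indexed normally and has no null_value configured in the mapping; has_indexed_value is an illustrative name.

```python
def has_indexed_value(doc, field):
    """Return True if an exists query on `field` would match `doc`."""
    if field not in doc:
        return False          # field missing entirely
    value = doc[field]
    if value is None:
        return False          # explicit null
    if isinstance(value, list):
        # [] and [None] have no indexed values; ["a", None] has one.
        return any(item is not None for item in value)
    return True               # includes "" and other non-null scalars

# Hypothetical documents
docs = [{}, {"student": None}, {"student": []}, {"student": [None]},
        {"student": ["a", None]}, {"student": ""}]
print([has_indexed_value(d, "student") for d in docs])
```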
Filtering search results a second time with post_filter (see the Elasticsearch docs for details):
GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": { "brand": "gucci" }
      }
    }
  },
  "aggs": {
    "colors": {
      "terms": { "field": "color" }
    },
    "color_red": {
      "filter": {
        "term": { "color": "red" }
      },
      "aggs": {
        "models": {
          "terms": { "field": "model" }
        }
      }
    }
  },
  "post_filter": {
    "term": { "color": "red" }
  }
}
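The key point of post_filter is ordering: aggregations are computed on everything the main query matches, and post_filter then narrows only the returned hits. The following in-memory Python sketch mimics the request above on a few made-up shirt documents; it is only a model of the behaviour, not client code.

```python
from collections import Counter

# Hypothetical documents
shirts = [
    {"brand": "gucci", "color": "red", "model": "slim"},
    {"brand": "gucci", "color": "blue", "model": "slim"},
    {"brand": "gucci", "color": "red", "model": "polo"},
    {"brand": "dior", "color": "red", "model": "polo"},
]

def search(docs):
    # Main query: brand == gucci
    matched = [d for d in docs if d["brand"] == "gucci"]
    # "colors" aggregation sees every document matched by the query.
    colors = Counter(d["color"] for d in matched)
    # "color_red" filter aggregation with its "models" sub-aggregation.
    models = Counter(d["model"] for d in matched if d["color"] == "red")
    # post_filter is applied last and affects only the hits.
    hits = [d for d in matched if d["color"] == "red"]
    return hits, dict(colors), dict(models)

hits, colors, models = search(shirts)
print(len(hits), colors, models)
```

This is why the colors buckets can still list blue even though only red shirts come back as hits.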
Counting term frequencies in an article with Elasticsearch:
POST /test_info/_termvectors
{
"doc":{
"content":"世界是美丽的,也是多样的。唯有保持一颗包容的心,允许差异化的存在,我们才能更好的生活在这个美丽的星球。"
}
}
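The term vectors response nests per-term statistics under term_vectors, then the field name, then terms, with each term carrying a term_freq. A short Python sketch for turning that into a sorted frequency list follows. The sample response is hand-written and abridged, and the tokens in it are hypothetical, since the actual tokens depend on the analyzer configured for the content field.

```python
def term_frequencies(response, field):
    """Extract (term, term_freq) pairs from a _termvectors response,
    most frequent first."""
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    freqs = {term: info["term_freq"] for term, info in terms.items()}
    # Sort by descending frequency, then by term for stable output.
    return sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))

# Hand-written, abridged sample; the tokens are hypothetical.
sample_response = {
    "term_vectors": {
        "content": {
            "terms": {
                "美丽": {"term_freq": 2},
                "世界": {"term_freq": 1},
                "包容": {"term_freq": 1},
            }
        }
    }
}

for term, freq in term_frequencies(sample_response, "content"):
    print(term, freq)
```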