Getting Started with Elasticsearch: Counting Hot Search Terms with a terms Aggregation

qq_28757391

Article link: https://blog.csdn.net/qq_28757391/article/details/120836023

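A recent project required statistics on each user's top 5 hot search terms. The approach taken here is to record every search keyword together with the user information in Elasticsearch, then use an aggregation to count the keywords and return the 5 most frequently searched ones.

Elasticsearch version: 7.0.0

First, define the mapping of the index that stores the keywords and user information. The put-mapping request only works on an existing index, so create it beforehand if necessary (for example with PUT http://localhost:9200/hotwords_test):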

POST http://localhost:9200/hotwords_test/_mapping

{
  "properties": {
    "search_txt": {
      "type": "keyword"
    },
    "user_name": {
      "type": "text",
      "analyzer": "keyword"
    },
    "happend_time": {
      "type": "date",
      "format": "yyyy-MM-dd HH:mm:ss"
    }
  }
}
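The date pattern in the mapping has to match the timestamp strings that the client sends. As a rough client-side check, the sketch below parses a value with java.time using the four-digit-year pattern yyyy-MM-dd HH:mm:ss. The class and method names are hypothetical, and it assumes that the java.time pattern behaves the same way as the mapping's format for these values.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class HappendTimeFormat {
    // Assumed to mirror the "format" of the happend_time field in the mapping.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns true if the value can be parsed with the pattern above.
    static boolean matchesMappingFormat(String value) {
        try {
            LocalDateTime.parse(value, FORMAT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Same style of timestamp as the test data in this article.
        System.out.println(matchesMappingFormat("2021-10-17 15:11:30")); // true
        // ISO-style "T" separator does not match the pattern.
        System.out.println(matchesMappingFormat("2021-10-17T15:11:30")); // false
    }
}
```

A value that fails this check would most likely be rejected by Elasticsearch at index time as well.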

The test data is inserted into the index with the RestHighLevelClient. First add the Maven dependencies:

<dependencies>
    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>7.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>7.0.0</version>
    </dependency>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.48</version>
    </dependency>
</dependencies>

Code that indexes one test document:

import com.alibaba.fastjson.JSONObject;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.Aggregation;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import java.io.IOException;

// The search and aggregation imports above are used by the aggregation query shown later.
public class ElasticsearchTest {
    public static final String host = "localhost";
    public static final Integer port = 9200;
    public static final String index = "hotwords_test";

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the client even if indexing fails
        try (RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(
                new HttpHost(host, port, "http")))) {
            // One search event: the keyword, the user, and when the search happened
            JSONObject data = new JSONObject();
            data.put("search_txt", "大枣");
            data.put("user_name", "test");
            data.put("happend_time", "2021-10-17 15:11:30");
            String docId = indexDoc(client, index, data);
            System.out.println(docId);
        }
    }

    // Indexes one document and returns its generated id, or null if the request failed.
    public static String indexDoc(RestHighLevelClient client, String index, JSONObject data) {
        // JSONObject implements Map<String, Object>, so it can be passed as the document source
        IndexRequest request = new IndexRequest(index);
        request.source(data);
        try {
            IndexResponse response = client.index(request, RequestOptions.DEFAULT);
            return response.getId();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}
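Editing the hard-coded values between runs is one way to get test data. Another option is to build several documents up front. The following is a minimal sketch using only the JDK; the class HotwordDocs and its methods are hypothetical helpers, not part of the original code. It produces plain maps with the same three fields as the mapping.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HotwordDocs {
    // Assumed to match the date format of the happend_time field.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Builds one document with the three fields used in this article.
    static Map<String, Object> buildDoc(String keyword, String userName, LocalDateTime time) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("search_txt", keyword);
        doc.put("user_name", userName);
        doc.put("happend_time", FORMAT.format(time));
        return doc;
    }

    // Builds one document per keyword, spacing the timestamps one minute apart.
    static List<Map<String, Object>> buildDocs(String userName, LocalDateTime start,
                                               String... keywords) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (int i = 0; i < keywords.length; i++) {
            docs.add(buildDoc(keywords[i], userName, start.plusMinutes(i)));
        }
        return docs;
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2021, 10, 17, 15, 11, 30);
        for (Map<String, Object> doc : buildDocs("test", start, "大枣", "芒果", "芒果")) {
            System.out.println(doc);
        }
    }
}
```

Each map could then be passed to IndexRequest.source(Map) in a loop, in the same way the JSONObject is passed in indexDoc above.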

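Run the program several times with different search_txt values so that the index contains enough test data.

Next comes the aggregation query. It counts how many times a single user searched for each kind of fruit and prints the 5 most frequently searched keywords.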


// terms aggregation on search_txt; size(5) keeps only the 5 largest buckets
AggregationBuilder aggregationBuilder = AggregationBuilders
        .terms("value_count").field("search_txt").size(5);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.aggregation(aggregationBuilder);
// Restrict the aggregation to the documents of one user
sourceBuilder.query(QueryBuilders.termQuery("user_name", "test"));
SearchRequest searchRequest = new SearchRequest(index);
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
Aggregations aggregations = searchResponse.getAggregations();
// Only the terms aggregation was requested, so every result can be read as Terms
for (Aggregation a : aggregations) {
    Terms terms = (Terms) a;
    for (Terms.Bucket bucket : terms.getBuckets()) {
        // Each bucket is one keyword together with the number of matching documents
        System.out.println(bucket.getKeyAsString() + ":" + bucket.getDocCount());
    }
}

Console output:

甘蔗:4
芒果:4
榴莲:3
大枣:2
桃子:2
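To make clear what the query computes, here is a minimal in-memory sketch of the same idea in plain Java: filter by user, count each keyword, sort by count in descending order, and keep the first n. It is an illustration only and not how Elasticsearch implements the aggregation. The class name, the sample data, and the tie-break by keyword are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopTermsSketch {
    // Each doc is {user_name, search_txt}. Returns "keyword:count" for the top n keywords.
    static List<String> topTerms(List<String[]> docs, String user, int n) {
        // Step 1: filter by user (the term query) and count per keyword (the buckets)
        Map<String, Long> counts = new HashMap<>();
        for (String[] doc : docs) {
            if (doc[0].equals(user)) {
                counts.merge(doc[1], 1L, Long::sum);
            }
        }
        // Step 2: order by count descending; ties by keyword ascending (an assumption)
        List<Map.Entry<String, Long>> buckets = new ArrayList<>(counts.entrySet());
        buckets.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Step 3: keep at most n buckets (the size parameter)
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> bucket : buckets.subList(0, Math.min(n, buckets.size()))) {
            result.add(bucket.getKey() + ":" + bucket.getValue());
        }
        return result;
    }

    // Hypothetical sample data, not the data indexed in this article.
    static List<String[]> sampleDocs() {
        return Arrays.asList(new String[][] {
            {"test", "mango"}, {"test", "durian"}, {"test", "mango"},
            {"test", "peach"}, {"test", "durian"}, {"test", "mango"},
            {"test", "peach"}, {"test", "jujube"},
            {"other", "mango"}, {"other", "jujube"}
        });
    }

    public static void main(String[] args) {
        // Prints mango:3, durian:2, peach:2 on separate lines
        for (String line : topTerms(sampleDocs(), "test", 3)) {
            System.out.println(line);
        }
    }
}
```

Unlike this sketch, the real aggregation runs inside Elasticsearch, so only the top buckets travel back to the client rather than every document.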
