PHP + Elasticsearch + IK: using a terms aggregation to generate hot-word statistics after setting the content field's analyzer to IK

Mapping for type 201038447 in index msg2017-04:

{
  "msg2017-04": {
    "mappings": {
      "201038447": {
        "properties": {
          "@timestamp": {
            "type": "date"
          },
          "content": {
            "type": "text",
            "boost": 8,
            "analyzer": "ik_smart",
            "include_in_all": true
          },
          "createTime": {
            "type": "date"
          }
        }
      }
    }
  }
}
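In this mapping, content is a text field analyzed with ik_smart, with no sub-fields and no fielddata setting. In Elasticsearch 5.x, fielddata is disabled on text fields by default, so a terms aggregation cannot bucket on the ik_smart tokens of content as mapped above. A minimal sketch of enabling it on the existing index and type (keeping in mind that fielddata lives on the JVM heap and can be expensive for large indices):

// Sketch only: enable fielddata on the analyzed content field so that
// terms aggregations can bucket on its ik_smart tokens. The field's
// existing parameters are repeated alongside the new fielddata flag.
PUT /msg2017-04/_mapping/201038447
{
  "properties": {
    "content": {
      "type": "text",
      "analyzer": "ik_smart",
      "boost": 8,
      "include_in_all": true,
      "fielddata": true
    }
  }
}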

Index settings:

{
  "msg2017-04": {
    "settings": {
      "index": {
        "creation_date": "1492398234434",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "yiGoDhL1T3WLexG79e5uQg",
        "version": {
          "created": "5020299"
        },
        "provided_name": "msg2017-04"
      }
    }
  }
}

Environment:

Linux

Elasticsearch 5.2.2

IK analyzer plugin installed

Aggregation result:

// request
GET /msg2017-04/_search?pretty
{
  "size": 1,
  "aggs": {
    "fenci": {
      "terms": {
        "field": "content.ik_smart"
      }
    }
  }
}

// response
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 105,
    "max_score": 1,
    "hits": [
      {
        "_index": "msg2017-04",
        "_type": "7510570179@chatroom",
        "_id": "5067959408840553063",
        "_score": 1,
        "_source": {
          "wxid": "wxid_1idf7gf5jgh822",
          "msgId": "69",
          "msgSvrId": "5067959408840553063",
          "type": 0,
          "isSend": "1",
          "status": "2",
          "speakerId": "",
          "content": "rhh",
          "imei": "867464024215618",
          "room": "7510570179@chatroom",
          "roomName": "和湖光山色hzhzh",
          "roomOwner": "mikezhangsky",
          "roomMembers": "mikezhangsky;wxid_1idf7gf5jgh822;wxid_j56srpxywn5n22;wxid_90uy0wlz229e22;sun461629376",
          "roomSize": "5",
          "createTime": "2017-04-07T03:08:37",
          "@timestamp": "2017-04-17T03:14:15"
        }
      }
    ]
  },
  "aggregations": {
    "fenci": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}

I want to aggregate on the Chinese-tokenized content so that I can compute the hot words over a given time window in real time, similar to Weibo's trending search.
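Note that the aggregation above targets content.ik_smart, a sub-field that does not exist in the mapping, so Elasticsearch treats it as unmapped and returns empty buckets. Assuming fielddata has been enabled on content as sketched earlier, a hypothetical hot-word query over a time window (the dates and the size of 20 are placeholder values) might look like this:

// Sketch only: count the most frequent ik_smart tokens in content
// for documents whose createTime falls inside the chosen window.
GET /msg2017-04/_search?pretty
{
  "size": 0,
  "query": {
    "range": {
      "createTime": {
        "gte": "2017-04-01T00:00:00",
        "lte": "2017-04-17T00:00:00"
      }
    }
  },
  "aggs": {
    "hot_words": {
      "terms": {
        "field": "content",
        "size": 20
      }
    }
  }
}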
