Elasticsearch indexes (multi-field type fields: fields that are both searchable and aggregatable)

一直走别回头_

Article link: https://blog.csdn.net/u012603073/article/details/48719005

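Here is how it happened.

I had built search on top of MongoDB. When I finished and showed it to my senior colleague, he said, "Mm, fine. Now switch it over to Elasticsearch."

Inside, I was falling apart. So MongoDB was just for practice! And what on earth is ES!

Whenever I hit a problem while learning and asked him about it, he would say, "I haven't learned this either; you have to study it yourself." So I had to read the official documentation on my own, which took me quite a lot of time.

Below is a summary of what I learned.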

I. ES indexes and mappings

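I did not do the installation myself; for deployment details, see the official documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/_installation.html).

To query from PHP, I used a PHP-ES client from GitHub (https://github.com/nervetattoo/elasticsearch).

The installation commands are as follows: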

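cd /var/www/html
# Download Composer (creates composer.phar in the current directory)
curl -s http://getcomposer.org/installer | php
# Write a composer.json that requires the PHP client
echo '{'                                               >  composer.json
echo '    "require" : {'                               >> composer.json
echo '        "nervetattoo/elasticsearch" : ">=2.0"'   >> composer.json
echo '    }'                                           >> composer.json
echo '}'                                               >> composer.json

# Install the dependencies listed in composer.json
./composer.phar install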

First, an analogy with relational databases:

Relational DB -> Databases -> Tables -> Rows -> Columns
Elasticsearch -> Indices   -> Types  -> Documents -> Fields

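An Elasticsearch cluster can contain multiple indices (databases); each index can contain multiple types (tables); each type contains multiple documents (rows); and each document contains multiple fields (columns).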
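To make the analogy concrete, here is a minimal sketch of how an index, a type, and a document ID combine into a document URL in the type-based Elasticsearch versions this post targets. The index and type names come from this post; the document ID `1` and the `ES_URL` variable are assumptions for illustration, and the request is only sent when `ES_URL` is set.

```shell
#!/bin/sh
# Index and type names are the ones used in this post; the ID is hypothetical.
index="ht_email_index"
type="Email"
id="1"

# Path layout: /<index>/<type>/<id>  (database / table / row in the analogy)
doc_path="/${index}/${type}/${id}"
echo "$doc_path"

# Fetch the document only when a server address is supplied, e.g.
#   ES_URL=http://localhost:9200 sh doc_path.sh
if [ -n "${ES_URL:-}" ]; then
    curl -s -XGET "${ES_URL}${doc_path}"
fi
```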

Create the index:

curl -XPUT 'http://localhost:9200/ht_email_index/' -d '{
    "settings" : {
        "index" : {
            "number_of_shards" : 3,
            "number_of_replicas" : 1
        }
    }
}'

When creating the index, specify the number of shards and the number of replicas; the parameters are passed as JSON.
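As a quick worked example of what these two settings mean together: each primary shard gets `number_of_replicas` copies, so the cluster ends up holding primaries × (1 + replicas) shard copies in total. The sketch below does that arithmetic for the values used here and, only if the assumed `ES_URL` variable is set, reads the settings back from the server.

```shell
#!/bin/sh
# Values taken from the create-index command in this post.
primaries=3
replicas=1

# Every primary shard has `replicas` extra copies.
total=$(( primaries * (1 + replicas) ))
echo "total shard copies: $total"

# Optionally read the settings back to confirm what the index was created with.
# ES_URL is an assumed variable, e.g. ES_URL=http://localhost:9200
if [ -n "${ES_URL:-}" ]; then
    curl -s -XGET "${ES_URL}/ht_email_index/_settings?pretty"
fi
```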

Mapping:

curl -XPUT 'http://localhost:9200/ht_email_index/_mapping/Email' -d '{
"Email":{
	"properties":{
		"attachments":{"type":"string"},
		"cc_user":{"type":"string"},
		"date":{
			"type":"string",
			"index":"not_analyzed"
		},
		"dir_path":{
			"type":"string"
		},
		"from_user":{
			"type":"string",
			"fields":{
				"raw":{
					"type":"string",
					"index":"not_analyzed"
				}
			}
		},
		"message":{
			"type":"string"
		},
		"subject":{
			"type":"string"
		},
		"to_user":{
			"type":"string",
			"fields":{
				"raw":{
					"type":"string",
					"index":"not_analyzed"
				}
			}
		}
	}
}
}'

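A mapping is roughly equivalent to assigning a data type to each field. As the command shows, fields are declared as string and so on.

One thing deserves special mention. Take the "from_user" field as an example: it is defined in two forms here, an analyzed main field and a not_analyzed "raw" sub-field. A field like this is called a "multi-field type field".

The point is that some fields need to be searched at some times and aggregated at others. For search, the field should be analyzed, which is the default form in a mapping; if the value contains several words, each word can then be searched on its own. In that case the search condition uses the original field name, for example "field" => "from_user". For aggregation, the field should be not_analyzed, so that the value is treated as a single unit, which is usually what statistics need. In that case the search (aggregation) condition must use the name of the sub-field that has the not_analyzed property, for example "field" => "from_user.raw".

This point is very important!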
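The difference is easiest to see with the two request bodies side by side. This is a minimal sketch, not the query code used in the original project: the sender name `alice`, the aggregation name `senders`, and the `ES_URL` variable are assumptions for illustration. The `Content-Type` header is harmless on old versions and required by newer ones. In later Elasticsearch versions the same idea is expressed with a `text` main field and a `keyword` sub-field.

```shell
#!/bin/sh
# Full-text search: uses the analyzed main field, so single words can match.
search_body='{"query":{"match":{"from_user":"alice"}}}'

# Aggregation: uses the not_analyzed sub-field, so each whole value is one bucket.
# "size":0 suppresses the hits and returns only the aggregation result.
agg_body='{"size":0,"aggs":{"senders":{"terms":{"field":"from_user.raw"}}}}'

echo "$search_body"
echo "$agg_body"

# Send both requests only when a server address is supplied,
# e.g. ES_URL=http://localhost:9200
if [ -n "${ES_URL:-}" ]; then
    for body in "$search_body" "$agg_body"; do
        curl -s -XPOST "${ES_URL}/ht_email_index/Email/_search?pretty" \
             -H 'Content-Type: application/json' -d "$body"
    done
fi
```

Running the aggregation against "from_user" instead of "from_user.raw" would bucket by individual analyzed tokens rather than by complete sender values, which is why the sub-field matters.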

Index the data:

curl -XPUT 'http://localhost:9200/_river/ht_email_index_ik/_meta' -d '{ 
"type": "mongodb", 
"mongodb": { 
  "db": "HT_Email", 
  "collection": "Email"
}, 
"index": {
  "name": "ht_email_index", 
  "type": "Email",
  "indexAnalyzer": "ik",  
  "searchAnalyzer": "ik"
}
}'

This indexes the data from MongoDB.

Delete the index:

curl -XDELETE 'http://localhost:9200/ht_email_index'
curl -XDELETE 'http://localhost:9200/_river/ht_email_index_ik/'

If you want to rebuild the index, be sure to delete the original index first.
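The ordering can be sketched as a small script. This is an illustration under assumptions rather than a tested procedure: the `step` helper and the `ES_URL` variable are hypothetical, and requests are only sent when `ES_URL` is set. After these steps, the mapping and the river would be re-applied with the commands shown earlier.

```shell
#!/bin/sh
# Records each request in order; sends it only if ES_URL is set
# (e.g. ES_URL=http://localhost:9200).
plan=""
step() {  # usage: step METHOD PATH [JSON_BODY]
    plan="${plan}${1} ${2};"
    if [ -n "${ES_URL:-}" ]; then
        if [ -n "${3:-}" ]; then
            curl -s -X"$1" "${ES_URL}$2" -H 'Content-Type: application/json' -d "$3"
        else
            curl -s -X"$1" "${ES_URL}$2"
        fi
    fi
}

# 1. Remove the old index and the old river definition first.
step DELETE /ht_email_index
step DELETE /_river/ht_email_index_ik/
# 2. Recreate the index with the same settings as before.
step PUT /ht_email_index '{"settings":{"index":{"number_of_shards":3,"number_of_replicas":1}}}'

# Show the order in which the requests were issued.
echo "$plan" | tr ';' '\n'
```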

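[Tools]

Here is a tool for viewing index status in a web page: elasticsearch-head

http://blog.csdn.net/laigood/article/details/8193758

That blog post has a detailed introduction; take a look if you need it.

With the preparation above done, you can happily start searching.

(I did not personally carry out the steps that come before creating the index, so I cannot guarantee that those methods/tutorials actually work.)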
