Elasticsearch
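Elasticsearch is a highly scalable, highly practical full-text search tool for near-real-time data analysis, built around a RESTful API. The official documentation is now very thorough, covering everything from installation and integration with components such as Kibana and Logstash to the use of the RESTful API, along with concepts such as the inverted index, so the author will not repeat it here.
1. Official guide: https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html
2. The author's preferred alternative: https://es.xiaoleilu.com/
Below is the handler syntax that the author extracted and condensed after reading through the API documentation above.
Core syntax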
- Index names must be entirely lowercase, must not start with an underscore, and must not contain commas.
- Data types
  - Core datatypes
    - String datatype: string (superseded by text and keyword starting with ES 5.0)
    - Numeric datatypes: long, integer, short, byte, double, float
    - Date datatype: date
    - Boolean datatype: boolean
    - Binary datatype: binary
  - Complex datatypes
    - Array datatype: arrays do not need a dedicated type for their elements, for example:
      - Array of strings: [ "one", "two" ]
      - Array of integers: [ 1, 2 ]
      - Array of arrays: [ 1, [ 2, 3 ]] is equivalent to [ 1, 2, 3 ]
      - Array of objects: [ { "name": "Mary", "age": 12 }, { "name": "John", "age": 10 }]
    - Object datatype: object, for a single JSON object
    - Nested datatype: nested, for arrays of JSON objects
  - Geo datatypes
    - Geo-point datatype: geo_point, for latitude/longitude coordinates
    - Geo-shape datatype: geo_shape, for complex shapes such as polygons
  - IPv4 datatype: ip, for IPv4 addresses
  - A few datatypes that are rarely needed have been left out.
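The index naming rule above can be checked locally before a request is sent. The following is a minimal sketch: the function name `is_valid_index_name` is hypothetical, and it checks only the three constraints listed above (Elasticsearch enforces further restrictions that are not covered here).

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the name satisfies the three rules
# listed above (all lowercase, no leading underscore, no commas).
is_valid_index_name() {
    name=$1
    case $name in
        '')              return 1 ;;  # empty names are rejected
        _*)              return 1 ;;  # must not start with an underscore
        *,*)             return 1 ;;  # must not contain a comma
        *[[:upper:]]*)   return 1 ;;  # must be all lowercase
    esac
    return 0
}

for candidate in megacorp _hidden Megacorp "a,b"; do
    if is_valid_index_name "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: invalid"
    fi
done
```

The comma check matters in practice because commas separate index names in multi-index URLs.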
Cluster health
curl -XGET "http://ip:port/_cluster/health"
or
curl -XGET "http://ip:port/_cluster/health?level=shards&wait_for_status=green"
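A script usually cares only about the `status` field of the health response. The sketch below extracts it with `sed`; the helper name `health_status` and the sample response are made up for illustration, and the pattern is meant for the default cluster-level response (with `level=shards` a single-line response contains several `status` fields and this simple pattern would pick the last one).

```shell
#!/bin/sh
# Hypothetical helper: pull the "status" value out of a _cluster/health
# JSON response using sed only (no jq dependency assumed).
health_status() {
    printf '%s\n' "$1" |
        sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Made-up sample response; a live call would instead be:
#   response=$(curl -s -XGET "http://ip:port/_cluster/health")
response='{"cluster_name":"demo","status":"yellow","number_of_nodes":1}'

status=$(health_status "$response")
case $status in
    green)  echo "all primary and replica shards are allocated" ;;
    yellow) echo "all primaries are allocated, some replicas are not" ;;
    red)    echo "at least one primary shard is unallocated" ;;
    *)      echo "could not determine cluster status" ;;
esac
```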
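Index operations
- Initialize index settings
curl -XPUT "http://ip:port/megacorp/" -H 'Content-Type: application/json' -d '
{
  "settings": {
    "index": {
      "number_of_shards": 5,
      "number_of_replicas": 1,
      "translog.durability": "async",
      "translog.sync_interval": "5s",
      "blocks.onetype": true,
      "blocks.read": true,
      "blocks.write": true,
      "blocks.read_only": true
    }
  }
}'
JSON does not allow comments, so the notes on each setting are listed here:
number_of_shards: choose it with future cluster expansion in mind.
number_of_replicas: balance it against machine efficiency and keep it below the number of nodes, otherwise some replicas stay unassigned.
blocks.onetype: blocks operations on onetype (ES 5.x validates index settings strictly, so confirm that this key exists in your version before using it).
blocks.read: blocks reads on this index.
blocks.write: blocks writes on this index.
blocks.read_only: makes this index and its metadata read-only; both 1 and true are accepted as the value.
The blocks.* switches are shown together for reference only; enable just the ones you need.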
- Query index settings
curl -XGET "http://ip:port/megacorp,megacorp2/_settings?pretty"
curl -XGET "http://ip:port/_all/_settings?pretty"
- Initialize index mapping
curl -XPUT "http://ip:port/megacorp/" -H 'Content-Type: application/json' -d '
{
  "settings": {
    "index": {
      "number_of_shards": 4,
      "number_of_replicas": 3
    }
  },
  "mappings": {
    "worker": {
      "dynamic": "true",
      "properties": {
        "name": {"type": "string", "index": "analyzed", "analyzer": "standard"},
        "age": {"type": "integer", "index": "not_analyzed"},
        "sex": {"type": "integer", "index": "not_analyzed"},
        "addr": {"type": "string", "index": "not_analyzed"},
        "date": {"type": "long", "index": "not_analyzed"},
        "num": {
          "type": "object",
          "dynamic": "true",
          "properties": {
            "id": {"type": "string"},
            "gender": {"type": "string"},
            "age": {"type": "long"},
            "name": {
              "type": "object",
              "properties": {
                "full": {"type": "string"},
                "first": {"type": "string"},
                "last": {"type": "string"}
              }
            }
          }
        },
        "location": {
          "type": "geo_point",
          "geohash_prefix": true,
          "geohash_precision": "1km"
        }
      }
    }
  }
}'
- Use an analyzer
curl -XPUT 'http://ip:port/iktest?pretty' -H 'Content-Type: application/json' -d '{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    },
    "analysis": {
      "analyzer": {
        "ik": {
          "tokenizer": "ik_max_word"
        }
      }
    }
  },
  "mappings": {
    "worker": {
      "dynamic": "strict",
      "properties": {
        "name": {"type": "string", "index": "analyzed", "analyzer": "standard"},
        "age": {"type": "integer", "index": "not_analyzed"},
        "sex": {"type": "integer", "index": "not_analyzed"},
        "addr": {"type": "string", "index": "not_analyzed"},
        "date": {"type": "long", "index": "not_analyzed"},
        "num": {"type": "object", "dynamic": "true"},
        "location": {"type": "geo_point"}
      }
    }
  }
}'
- Query index mapping
curl -XGET "http://ip:port/_all/_mapping?pretty"
curl -XGET "http://ip:port/partment/_mapping?pretty"
- Delete mapping (supported only in ES 1.x; the delete-mapping API was removed in 2.0)
curl -XDELETE "http://ip:port/partment/_mapping"
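The multi-index settings query and the index creation requests above follow a regular shape, so their URL and body can be assembled by small helpers. This is only a sketch: `join_indices` and `settings_body` are hypothetical names, and `ip:port` remains the same placeholder used in the commands above.

```shell
#!/bin/sh
# Hypothetical helpers for assembling index requests.

# Join index names with commas, the separator used in multi-index URLs.
join_indices() {
    joined=""
    for idx in "$@"; do
        joined="${joined:+$joined,}$idx"
    done
    printf '%s\n' "$joined"
}

# Emit a minimal settings body for the given shard and replica counts.
settings_body() {
    printf '{"settings":{"index":{"number_of_shards":%d,"number_of_replicas":%d}}}\n' "$1" "$2"
}

base="http://ip:port"   # placeholder host and port, as in the commands above

# Print the requests instead of sending them, so the sketch runs offline.
echo "GET $base/$(join_indices megacorp megacorp2)/_settings?pretty"
echo "PUT $base/megacorp/ $(settings_body 5 1)"
```

To send the request for real, the generated body can be passed to curl, for example `curl -XPUT "$base/megacorp/" -H 'Content-Type: application/json' -d "$(settings_body 5 1)"`.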
Document CRUD operations
- Add documents
curl -XPOST 'ip:port/index/type/_bulk?pretty' -H 'Content-Type: application/json' -d'
{ "index": { "_id": 1 ,"_version":3 }}
{"name":"The quick brown fox","age":18,"sex":2,"addr":"北京","date":1388534400000}
{ "index": { "_id": 2 ,"_version":3 }}
{"name":"The quick brown fox jumps over the lazy dog","age":19,"sex":1,"addr":"上海","date":1391212800000}
{ "index": { "_id": 3 ,"_version":3 }}
{"name":"The quick brown fox jumps over the quick dog","age":22,"sex":1,"addr":"北京","date":1398902400000}
{ "index": { "_id": 4 ,"_version":3 }}
{"name":"Brown fox brown dog","age":11,"sex":1,"addr":"北京","date":1404172800000}
'
- Retrieve multiple documents
curl -XGET 'ip:port/_mget?pretty' -H 'Content-Type: application/json' -d'
{
"docs" : [
{
"_index" : "website",
"_type" : "blog",
"_id" : 2
},
{
"_index" : "website",
"_type" : "pageviews",
"_id" : 1,
"_source": "views"
}
]
}
'
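- Update a document (each update action line must be followed by a line carrying the partial document; the age value below is only illustrative)
curl -XPOST 'ip:port/index/type/_bulk?pretty' -H 'Content-Type: application/json' -d'
{"update":{"_id":1}}
{"doc":{"age":20}}
'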
- Delete a document
curl -XPOST 'ip:port/index/type/_bulk?pretty' -H 'Content-Type: application/json' -d'
{"delete":{"_id":1}}
'
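The bulk requests above all use the same newline-delimited layout: one action line, optionally followed by one source line, each ending in a newline. The sketch below generates such a body; `bulk_index` and `bulk_delete` are hypothetical helper names, and the document content is abbreviated from the sample data above.

```shell
#!/bin/sh
# Hypothetical helper: emit one bulk "index" action, i.e. an action line
# followed by the document source line, each terminated by a newline.
bulk_index() {
    printf '{"index":{"_id":"%s"}}\n%s\n' "$1" "$2"
}

# Hypothetical helper: emit one bulk "delete" action (a single line, no source).
bulk_delete() {
    printf '{"delete":{"_id":"%s"}}\n' "$1"
}

# Command substitution strips the trailing newline, so it is added back
# when the body is printed; the bulk API expects the last line to end in one.
body=$(
    bulk_index 1 '{"name":"The quick brown fox","age":18}'
    bulk_delete 2
)
printf '%s\n' "$body"
```

When the body is stored in a file, send it with `curl --data-binary @file`, because `-d` strips newlines from file input and the bulk format depends on them.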
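Query conditions
- Controlling the analyzer's segmentation granularity
analyzer=ik_max_word: finest-grained segmentation (produces the most tokens)
analyzer=ik_smart: coarsest-grained segmentation (produces fewer, longer tokens)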
curl -XGET 'ip:port/index/_analyze?analyzer=ik_max_word&pretty' -H 'Content-Type: application/json' -d'
{
"text":"好记性不如烂笔头"
}
'
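- There are too many search conditions to list here, so the author has compiled them into a separate document that is freely available to everyone. Note that it applies only to ES 5.x and ES 1.x; the syntax differs somewhat in other versions. The address is:
https://github.com/yuzhou152/elasticsearch-grammar.git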