ES never deletes or modifies data in place: a delete only marks the data as deleted, and the marked data is physically cleaned up later.

DELETE /movie_index

{
  "acknowledged": true
}
Adding documents

Format: PUT /index/type/id
PUT /movie_index/movie/1
{
  "id": 1,
  "name": "operation red sea",
  "doubanScore": 8.5,
  "actorList": [
    {"id": 1, "name": "zhang yi"},
    {"id": 2, "name": "hai qing"},
    {"id": 3, "name": "zhang han yu"}
  ]
}

PUT /movie_index/movie/2
{
  "id": 2,
  "name": "operation meigong river",
  "doubanScore": 8.0,
  "actorList": [
    {"id": 3, "name": "zhang han yu"}
  ]
}

PUT /movie_index/movie/3
{
  "id": 3,
  "name": "incident red sea",
  "doubanScore": 5.0,
  "actorList": [
    {"id": 4, "name": "zhang chen"}
  ]
}
If the index or type did not exist before, ES creates it automatically.
Looking up by id

GET movie_index/movie/1
Update: full document replacement

This is no different from an insert:

PUT /movie_index/movie/3
{
  "id": "3",
  "name": "incident red sea",
  "doubanScore": "5.0",
  "actorList": [
    {"id": "1", "name": "zhang chen"}
  ]
}
Update: a single field

POST movie_index/movie/3/_update
{
  "doc": {
    "doubanScore": "7.0"
  }
}
For a given change, use either the single-field update or the full replacement, not both.
Deleting a document

DELETE movie_index/movie/3
{
  "found": true,
  "_index": "movie_index",
  "_type": "movie",
  "_id": "3",
  "_version": 18,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}
Searching all data in a type

GET movie_index/movie/_search
Result:

{
  "took": 2,              // time taken, in milliseconds
  "timed_out": false,     // whether the request timed out
  "_shards": {
    "total": 5,           // sent to all 5 shards
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,           // 3 documents matched
    "max_score": 1,       // highest score
    "hits": [             // the results
      {
        "_index": "movie_index",
        "_type": "movie",
        "_id": 2,
        "_score": 1,
        "_source": {
          "id": "2",
          "name": "operation meigong river",
          "doubanScore": 8.0,
          "actorList": [
            {
              "id": "1",
              "name": "zhang han yu"
            }
          ]
        }
      },
      ...
    ]
  }
}
Query with a condition (match all)

GET movie_index/movie/_search
{
  "query": {
    "match_all": {}
  }
}
Query by analyzed term (match)

GET movie_index/movie/_search
{
  "query": {
    "match": {"name": "red"}
  }
}

Note the relevance scores in the result.
Query by an analyzed sub-field

GET movie_index/movie/_search
{
  "query": {
    "match": {"actorList.name": "zhang"}
  }
}
Result:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "movie_index",
        "_type": "movie",
        "_id": "2",
        "_score": 1,
        "_source": {
          "id": 2,
          "name": "operation meigong river",
          "doubanScore": 8,
          "actorList": [
            {
              "id": 3,
              "name": "zhang han yu"
            }
          ]
        }
      },
      {
        "_index": "movie_index",
        "_type": "movie",
        "_id": "1",
        "_score": 1,
        "_source": {
          "id": 1,
          "name": "operation red sea",
          "doubanScore": 8.5,
          "actorList": [
            {
              "id": 1,
              "name": "zhang yi"
            },
            {
              "id": 2,
              "name": "hai qing"
            },
            {
              "id": 3,
              "name": "zhang han yu"
            }
          ]
        }
      }
    ]
  }
}
match phrase

GET movie_index/movie/_search
{
  "query": {
    "match_phrase": {"name": "operation red"}
  }
}
Phrase query: the query text is still analyzed, but a document matches only when the terms appear next to each other and in the same order as in the phrase.
I'll skip the result here; an overly long post is hard to read.
Fuzzy query

GET movie_index/movie/_search
{
  "query": {
    "fuzzy": {"name": "rad"}
  }
}
Corrective term matching: when a word has no exact match, ES uses an edit-distance algorithm to give very close words a score as well, so they can still be found, at the cost of extra performance.
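The "closeness" used by fuzzy matching is edit distance. A minimal sketch of plain Levenshtein distance (a simplification for intuition, not ES's actual implementation) shows why "rad" can still match "red": they are a single edit apart.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via a single-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(
                dp[j] + 1,         # delete ca
                dp[j - 1] + 1,     # insert cb
                prev + (ca != cb)  # substitute ca -> cb
            )
    return dp[len(b)]

# "rad" is one substitution away from "red", so it falls within
# the small edit budget a fuzzy query allows for short terms.
print(edit_distance("rad", "red"))   # 1
print(edit_distance("rad", "blue"))  # 4
```
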
Filter: filtering after the query (post_filter)

GET movie_index/movie/_search
{
  "query": {
    "match": {"name": "red"}
  },
  "post_filter": {
    "term": {"actorList.id": 3}
  }
}

Querying first and filtering afterwards is slower. By analogy: narrowing everyone in the country down to Guangdong first and then searching within that group is much cheaper than searching everyone in the country first and only then discarding the non-Guangdong results.
Filter: filtering before the query (recommended)

GET movie_index/movie/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"actorList.id": "1"}},
        {"term": {"actorList.id": "3"}}
      ],
      "must": {"match": {"name": "red"}}
    }
  }
}
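To build intuition for what this bool query selects, here is a toy re-evaluation of it against the three sample documents from earlier (a deliberate simplification for illustration; ES does not execute queries this way):

```python
docs = [
    {"id": 1, "name": "operation red sea",
     "actorList": [{"id": 1}, {"id": 2}, {"id": 3}]},
    {"id": 2, "name": "operation meigong river",
     "actorList": [{"id": 3}]},
    {"id": 3, "name": "incident red sea",
     "actorList": [{"id": 4}]},
]

def matches(doc):
    actor_ids = {a["id"] for a in doc["actorList"]}
    # filter: every term clause must hold; in ES this does not affect scoring
    passes_filter = 1 in actor_ids and 3 in actor_ids
    # must: a match on an analyzed field, naively modeled as token containment
    passes_must = "red" in doc["name"].split()
    return passes_filter and passes_must

print([d["id"] for d in docs if matches(d)])  # [1]
```

Only movie 1 has both actor id 1 and actor id 3 while also containing the token "red", so it is the only hit.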
Filter: by range

GET movie_index/movie/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "doubanScore": {"gte": 8}
        }
      }
    }
  }
}
Range operators:

gt   greater than
lt   less than
gte  greater than or equal to
lte  less than or equal to
Sorting

GET movie_index/movie/_search
{
  "query": {
    "match": {"name": "red sea"}
  },
  "sort": [
    {
      "doubanScore": {
        "order": "desc"
      }
    }
  ]
}

This matches "red sea" against the name field, then sorts the hits by doubanScore in descending order.
Paged query

GET movie_index/movie/_search
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 1
}
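Note that from is a zero-based offset into the hits, not a page number. A tiny helper (hypothetical, not part of ES) shows the usual conversion from a 1-based page number:

```python
def page_params(page: int, size: int) -> dict:
    """Convert a 1-based page number into an ES from/size pair."""
    return {"from": (page - 1) * size, "size": size}

print(page_params(1, 10))  # {'from': 0, 'size': 10}
print(page_params(3, 20))  # {'from': 40, 'size': 20}
```
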
Selecting which fields to return

GET movie_index/movie/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["name", "doubanScore"]
}
There is a lot here, but I'm sticking with it and doing my best to write down everything I know.
Highlighting

GET movie_index/movie/_search
{
  "query": {
    "match": {"name": "red sea"}
  },
  "highlight": {
    "fields": {"name": {}}
  }
}
Aggregation

How many movies has each actor appeared in?

GET movie_index/movie/_search
{
  "aggs": {
    "groupby_actor": {
      "terms": {
        "field": "actorList.name.keyword"
      }
    }
  }
}
What is the average score of each actor's movies, sorted by score?

GET movie_index/movie/_search
{
  "aggs": {
    "groupby_actor_id": {
      "terms": {
        "field": "actorList.name.keyword",
        "order": {
          "avg_score": "desc"
        }
      },
      "aggs": {
        "avg_score": {
          "avg": {
            "field": "doubanScore"
          }
        }
      }
    }
  }
}
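What this nested aggregation computes can be checked by hand against the originally inserted sample data (before the later score update to movie 3); a plain-Python equivalent of "group by actor, average the score, sort descending":

```python
from collections import defaultdict

# Sample data as originally inserted into movie_index.
movies = [
    {"doubanScore": 8.5, "actors": ["zhang yi", "hai qing", "zhang han yu"]},
    {"doubanScore": 8.0, "actors": ["zhang han yu"]},
    {"doubanScore": 5.0, "actors": ["zhang chen"]},
]

# Bucket scores by actor, like the terms aggregation does.
scores = defaultdict(list)
for m in movies:
    for actor in m["actors"]:
        scores[actor].append(m["doubanScore"])

# Average each bucket, like the avg sub-aggregation does.
avg = {a: sum(s) / len(s) for a, s in scores.items()}
for actor, score in sorted(avg.items(), key=lambda kv: -kv[1]):
    print(actor, score)
```

For example, zhang han yu appears in two movies scored 8.5 and 8.0, so his bucket averages 8.25.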
About mapping types

Earlier we said a type can be understood as a table; so how is the data type of each field defined?

Viewing the mapping:

GET movie_index/_mapping/movie
{
  "movie_index": {
    "mappings": {
      "movie": {
        "properties": {
          "actorList": {
            "properties": {
              "id": {
                "type": "long"
              },
              "name": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "doubanScore": {
            "type": "float"
          },
          "id": {
            "type": "long"
          },
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
In practice, the data type of each field in a type is defined by the mapping. If no mapping has been set, the system automatically infers the appropriate types from the format of an incoming document:
- true/false → boolean
- 1020 → long
- 20.1 → double
- "2017-02-01" → date
- "hello world" → text + keyword
By default, only text fields are analyzed (tokenized); keyword is a string type that is never tokenized.
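The inference rules above can be sketched in a few lines of Python (a rough simplification of ES dynamic mapping, not its actual implementation):

```python
import re

def infer_es_type(value):
    """Roughly mimic ES dynamic-mapping type inference for a JSON value."""
    if isinstance(value, bool):  # bool must be checked before int in Python
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):  # date-like string
            return "date"
        return "text + keyword"  # analyzed text plus a raw keyword sub-field
    return "object"

print(infer_es_type(True))           # boolean
print(infer_es_type(1020))           # long
print(infer_es_type(20.1))           # double
print(infer_es_type("2017-02-01"))   # date
print(infer_es_type("hello world"))  # text + keyword
```
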
Besides automatic inference, a mapping can also be defined manually, but only for newly added fields that hold no data yet; once a field has data, its mapping can no longer be changed.

Note: even though fields may live under different types, a field with a given name can only have one mapping definition within an index.
Chinese word segmentation

The Chinese analysis built into Elasticsearch simply splits Chinese text character by character, with no concept of words. In practice, however, users query by words, so if documents are segmented into words, queries match user intent much more closely and also run faster.
Analyzer download: https://github.com/medcl/elasticsearch-analysis-ik
Installation

Unzip the downloaded zip package into /usr/share/elasticsearch/plugins/:

cp /opt/elasticsearch-analysis-ik-5.6.4.zip elastic
unzip elastic
rm elastic

Then restart ES:

service elasticsearch stop
service elasticsearch start
Testing

Using the default analyzer:

GET movie_index/_analyze
{
  "text": "我是中国人"
}

Observe the result.

Using the ik_smart analyzer:

GET movie_index/_analyze
{
  "analyzer": "ik_smart",
  "text": "我是中国人"
}

Observe the result.

Another analyzer, ik_max_word:

GET movie_index/_analyze
{
  "analyzer": "ik_max_word",
  "text": "我是中国人"
}

Observe the result.
You can see that different analyzers tokenize quite differently. So from now on, don't rely on the default mapping when defining a type; build the mapping manually so that you can choose the analyzer.
Building an index on Chinese word segmentation

1. Create the mapping

PUT movie_chn
{
  "mappings": {
    "movie": {
      "properties": {
        "id": {
          "type": "long"
        },
        "name": {
          "type": "text",
          "analyzer": "ik_smart"
        },
        "doubanScore": {
          "type": "double"
        },
        "actorList": {
          "properties": {
            "id": {
              "type": "long"
            },
            "name": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
2. Insert data

PUT /movie_chn/movie/1
{
  "id": 1,
  "name": "红海行动",
  "doubanScore": 8.5,
  "actorList": [
    {"id": 1, "name": "张译"},
    {"id": 2, "name": "海清"},
    {"id": 3, "name": "张涵予"}
  ]
}

PUT /movie_chn/movie/2
{
  "id": 2,
  "name": "湄公河行动",
  "doubanScore": 8.0,
  "actorList": [
    {"id": 3, "name": "张涵予"}
  ]
}

PUT /movie_chn/movie/3
{
  "id": 3,
  "name": "红海事件",
  "doubanScore": 5.0,
  "actorList": [
    {"id": 4, "name": "张晨"}
  ]
}
3. Query test

GET /movie_chn/movie/_search
{
  "query": {
    "match": {
      "name": "红海战役"
    }
  }
}

GET /movie_chn/movie/_search
{
  "query": {
    "term": {
      "actorList.name": "张译"
    }
  }
}
Custom dictionary

Edit IKAnalyzer.cfg.xml under /usr/share/elasticsearch/plugins/ik/config/:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- configure your own extension dictionary here -->
    <entry key="ext_dict"></entry>
    <!-- configure your own extension stopword dictionary here -->
    <entry key="ext_stopwords"></entry>
    <!-- configure a remote extension dictionary here -->
    <entry key="remote_ext_dict">http://192.168.67.163/fenci/myword.txt</entry>
    <!-- configure a remote extension stopword dictionary here -->
    <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
Then publish the dictionary file as a static resource with nginx at the remote_ext_dict URL configured above.
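A minimal sketch of serving that file with nginx (the web root, the server address, and the sample dictionary entries are assumptions; adjust to your environment):

```shell
# Hypothetical setup: serve /fenci/myword.txt from nginx's default web root.
mkdir -p /usr/share/nginx/html/fenci
# The dictionary is one word per line in UTF-8; these entries are examples.
printf '蓝瘦\n香菇\n' > /usr/share/nginx/html/fenci/myword.txt
nginx -s reload
# Verify it is reachable at the URL configured in remote_ext_dict:
curl http://192.168.67.163/fenci/myword.txt
```
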