es全文检索论文_ElasticSearch 全文检索— ElasticSearch 基本操作

ElasticSearch-CURL命令-创建索引注意事项

1)索引库名称必须要全部小写,不能以下划线开头,也不能包含逗号

2)如果没有明确指定索引数据的ID,那么es会自动生成一个随机的ID,需要使用POST参数

curl -XPOST http://master:9200/zimo/user/ -d '{"name" : "john"}'

创建全新内容的两种方式:

1)使用自增ID(post)

2)在url后面添加参数(get)

[hadoop@masternode elasticsearch-2.4.0]$ curl -XPUT http://masternode:9200/zimo/user/2?op_type=create -d '{"name" : "john", "age" : 28}'

{"_index":"zimo","_type":"user","_id":"2","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}

[hadoop@masternode elasticsearch-2.4.0]$ curl -XGET http://masternode:9200/zimo/user/1

或者

{"_index":"zimo","_type":"user","_id":"1","_version":4,"found":true,"_source":{"name" : "john", "age" : 28}}

ElasticSearch-CURL命令-查询索引GET

1.根据员工id查询

[hadoop@masternode elasticsearch-2.4.0]$ curl -XGET http://masternode:9200/zimo/user/1

{"_index":"zimo","_type":"user","_id":"1","_version":4,"found":true,"_source":{"name" : "john", "age" : 28}}

在任意的查询字符串中添加pretty参数,es可以得到易于识别的json结果。

[hadoop@masternode elasticsearch-2.4.0]$ curl -XGET http://masternode:9200/zimo/user/1?pretty

{"_index" : "zimo","_type" : "user","_id" : "1","_version" : 4,"found" : true,"_source": {"name" : "john","age" : 28}

}

2.检索文档中的一部分,如果只需要显示指定字段

[hadoop@masternode elasticsearch-2.4.0]$ curl -XGET 'http://masternode:9200/zimo/user/1?_source=name&pretty'{"_index" : "zimo","_type" : "user","_id" : "1","_version" : 4,"found" : true,"_source": {"name" : "john"}

}

3.查询指定索引库指定类型所有数据

[hadoop@masternode elasticsearch-2.4.0]$ curl -XGET http://masternode:9200/zimo/user/_search

{"took":17,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"zimo","_type":"user","_id":"AWVlstKrgUMcKLYQxfRR","_score":1.0,"_source":{"name" : "john"}},{"_index":"zimo","_type":"user","_id":"2","_score":1.0,"_source":{"name" : "john", "age" : 28}},{"_index":"zimo","_type":"user","_id":"1","_score":1.0,"_source":{"name" : "john", "age" : 28}}]}}

4.根据条件进行查询

[hadoop@masternode elasticsearch-2.4.0]$ curl -XGET http://masternode:9200/zimo/user/_search?q=age:28

{"took":36,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":0.30685282,"hits":[{"_index":"zimo","_type":"user","_id":"2","_score":0.30685282,"_source":{"name" : "john", "age" : 28}},{"_index":"zimo","_type":"user","_id":"1","_score":0.30685282,"_source":{"name" : "john", "age" : 28}}]}}

ElasticSearch-CURL命令-DSL查询

Domain Specific Language领域特定语言

新添加一个文档

[hadoop@masternode elasticsearch-2.4.0]$ curl -XPUT http://masternode:9200/zimo/user/3/_create -d '{"name" : "lily", "age" : 18}'

{"_index":"zimo","_type":"user","_id":"3","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}

[hadoop@masternode elasticsearch-2.4.0]$ curl -XGET http://masternode:9200/zimo/user/_search -d '{"query":{"match":{"name":"lily"}}}' (查询)

{"took":31,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.30685282,"hits":[{"_index":"zimo","_type":"user","_id":"3","_score":0.30685282,"_source":{"name" : "lily", "age" : 18}}]}}

ElasticSearch-CURL命令-MGET查询

1.使用mget API获取多个文档

先新建一个库

[hadoop@masternode elasticsearch-2.4.0]$ curl -XPUT 'http://masternode:9200/zimo2'{"acknowledged":true}

[hadoop@masternode elasticsearch-2.4.0]$ curl -XPOST http://masternode:9200/zimo2/user/1 -d '{"name" : "lucy", "age" : 20}'

{"_index":"zimo2","_type":"user","_id":"1","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}

[hadoop@masternode elasticsearch-2.4.0]$ curl -XGET http://masternode:9200/_mget?pretty -d '{"docs":[{"_index":"zimo","_type":"user","_id":2,"_source":"name"},{"_index":"zimo2","_type":"user","_id":1}]}'

{"docs": [ {"_index" : "zimo","_type" : "user","_id" : "2","_version" : 1,"found" : true,"_source": {"name" : "john"}

}, {"_index" : "zimo2","_type" : "user","_id" : "1","_version" : 1,"found" : true,"_source": {"name" : "lucy","age" : 20}

2.如果需要的文档在同一个_index或者同一个_type中,你就可以在URL中指定一个默认的/_index或者/_index/_type

[hadoop@masternode elasticsearch-2.4.0]$ curl -XGET http://masternode:9200/zimo/user/_mget?pretty -d '{"docs":[{"_id":1},{"_id":2}]}'

{"docs": [ {"_index" : "zimo","_type" : "user","_id" : "1","_version" : 4,"found" : true,"_source": {"name" : "john","age" : 28}

}, {"_index" : "zimo","_type" : "user","_id" : "2","_version" : 1,"found" : true,"_source": {"name" : "john","age" : 28}

} ]

}

3.如果所有的文档拥有相同的_index 以及 _type,直接在请求中添加ids的数组即可。

curl -XGET http://masternode:9200/zimo/user/_mget?pretty -d '{"ids":["1","2"]}'

结果同上。

ElasticSearch-CURL命令-HEAD使用

如果只想检查一下文档是否存在,你可以使用HEAD来代替GET方法,这样只会返回HTTP头部文件。

[hadoop@masternode elasticsearch-2.4.0]$ curl -i -XHEAD http://masternode:9200/zimo/user/1

HTTP/1.1 200OK

Content-Type: text/plain; charset=UTF-8Content-Length: 0

ElasticSearch-CURL命令-更新

ES可以使用PUT或者POST对文档进行更新(全部更新),如果指定ID的文档已经存在,则执行更新操作。执行更新操作的时候要注意以下细节:

1)ES首先将就得文档标记为删除状态

2)然后添加新的文档

3)就得文档不会立即消失,但是你也无法访问

4)ES会在你继续添加更多数据的时候再后台清理已经标记问删除状态的文档、

局部更新:可以添加新字段或者更新已有字段(必须使用POST)

[hadoop@masternode elasticsearch-2.4.0]$ curl -XPOST http://masternode:9200/zimo/user/1/_update -d '{"doc":{"name":"john","age":30}}'

{"_index":"zimo","_type":"user","_id":"1","_version":5,"_shards":{"total":2,"successful":2,"failed":0}}

ElasticSearch-CURL命令-删除

[hadoop@masternode elasticsearch-2.4.0]$ curl -XDELETE http://masternode:9200/zimo/user/1

{"found":true,"_index":"zimo","_type":"user","_id":"1","_version":6,"_shards":{"total":2,"successful":2,"failed":0}}

如果文档存在,found属性值为true,_version属性值+1;

如果文档不存在,found属性值为false,但是_version属性值依然+1,这个就是内部管理的一部分,它保证了我们在多个节点间的不同操作的顺序都被 正确标记了;

注意:删除一个文档也不会立即生效,它只是被标记我已删除,ElasticSearch将会在你之后添加更多所以的时候才会在后头进行删除内容的清理。

ElasticSearch-CURL命令-bulk批量操作

bulk API可以帮助我们同时执行多个请求。

格式:

action: index/create/update/delete

metadata:_index/_type/_id

request body:_source(删除操作不需要)

(action:{metadata})

{request body }

(action:{metadata})

{request body }

create和ndex的区别:如果数据存在,使用create操作失败会提示文档已经存在,使用index则可以成功执行。

//新建一个requests文件

[hadoop@masternode elasticsearch-2.4.0]$ vi requests

{"index":{"_index":"zimo","_type":"user","_id":"6"}}

{"name":"mayun","age":51}

{"update":{"_index":"zimo","_type":"user","_id":"6"}}

{"doc":{"age":52}}

[hadoop@masternode elasticsearch-2.4.0]$ ls

bin config lib LICENSE.txt modules NOTICE.txt plugins README.textile requests

//执行批量操作

[hadoop@masternode elasticsearch-2.4.0]$ curl -XPOST http://masternode:9200/_bulk --data-binary @requests;

{"took":296,"errors":false,"items":[{"index":{"_index":"zimo","_type":"user","_id":"6","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"status":201}},{"update":{"_index":"zimo","_type":"user","_id":"6","_version":2,"_shards":{"total":2,"successful":2,"failed":0},"status":200}}]}

bulk请求可以在URL中声明/_indexhuozhe /_-index/_type.

bulk一次最大可以处理多少数据量:

1)bulk会把将要处理的数据载入内存中,所以数据量是有限的;

2)最佳的数据量不是一个确定的数值,它取决于你的硬件,你的文档大小以及复杂性,你的索引以及搜索的负载;

3)一般建议是1000-5000个文档,如果你的文档很大,可以适当减少队列,大小建议是5-15M,默认不能超过100M,可以在ES的配置文件中修改这个值http.max_content_length:100mb;

4)https://www.elastic.co/guide/en/elasticsearch/reference/2.4/moudles-http.html。

ElasticSearch-CURL命令-版本控制

普通关系型数据库使用的是PCC(悲观并发控制):当我们在修改一个数据前先锁定这一行,然后确保只有读取到的这个线程可以修改这一行数据。

ES使用的是OCC(乐观并发控制):ES不会阻止某一数据的访问。然而,如果基础数据在我们读取和写入的间隔中发生了变化,更新就会失败,这时候就由程序来决定如何处理这个冲突。它可以重新读取数据来进行更新,又或者将这一情况最直接反馈给用户。

ES如何实行版本控制?(使用ES内部版本号)

1)首先需要修改的文档,获取版本号(_version)

[hadoop@masternode elasticsearch-2.4.0]$ curl –XGET http://masternode:9200/zimo/user/2

{"_index":"zimo","_type":"user","_id":"2","_version":1,"found":true,"_source":{"name" : "john", "age" : 28}}

2)在执行更新操作的时候把版本号传过去

[hadoop@masternode elasticsearch-2.4.0]$ curl http://masternode:9200/zimo/user/2?version=1 -d '{"name":"john1","age":29}'

{"_index":"zimo","_type":"user","_id":"2","_version":2,"_shards":{"total":2,"successful":2,"failed":0},"created":false}

该操作可以重复执行,每执行一次版本号都会+1。

3)如果传递的版本号和待更新的版本号不一致,更新操作将会失败。

以上就是博主为大家介绍的这一板块的主要内容,这都是博主自己的学习过程,希望能给大家带来一定的指导作用,有用的还望大家点个支持,如果对你没用也望包涵,有错误烦请指出。如有期待可关注博主以第一时间获取更新哦,谢谢!

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值