I. Client
Before you can do anything with the Java API, you need a Client.
Client client = null;
Settings settings = ImmutableSettings.settingsBuilder()
        .put("client.transport.sniff", true)
        .put("cluster.name", "Minutch")
        .build();
TransportClient transportClient = new TransportClient(settings);
transportClient.addTransportAddress(new InetSocketTransportAddress("127.0.0.1", 9300));
client = transportClient;
II. Index
Indexing is at the heart of any search engine, so let's look at how to index documents with the Java API.
1. Using Elasticsearch's built-in JSON generator
// jsonBuilder
XContentBuilder builder = jsonBuilder()
        .startObject()
        .field("user", "kimchy")
        .field("postDate", new Date())
        .field("message", "trying out Elasticsearch")
        .endObject();
// If the primaryKeyValue argument is omitted, Elasticsearch generates an id automatically
IndexRequestBuilder indexRequestBuilder = client.prepareIndex(indexName, typeName, primaryKeyValue).setSource(builder);
// Index the document
IndexResponse indexResponse = indexRequestBuilder.execute().actionGet();
/****** The indexResponse object provides a lot of information ******/
// Index name
String _index = indexResponse.getIndex();
// Type name
String _type = indexResponse.getType();
// Document ID (generated or not)
String _id = indexResponse.getId();
// Version (if it's the first time you index this document, you will get: 1)
long _version = indexResponse.getVersion();
// isCreated() is true if the document is a new one, false if it has been updated
boolean created = indexResponse.isCreated();
2. Using a JSON string
String json = "{" +
        "\"user\":\"kimchy\"," +
        "\"postDate\":\"2013-01-30\"," +
        "\"message\":\"trying out Elasticsearch\"" +
        "}";
IndexResponse response = client.prepareIndex("twitter", "tweet")
        .setSource(json)
        .execute()
        .actionGet();
3. Using a Map
Map<String, Object> map = new HashMap<String, Object>();
map.put("user", "kimchy");          // sample fields; any key/value pairs work
map.put("postDate", new Date());
map.put("message", "trying out Elasticsearch");
IndexResponse response = client.prepareIndex("twitter", "tweet")
        .setSource(map)
        .execute()
        .actionGet();
III. Get
Once a document is indexed, we can fetch it back by ID; a prepareGet query looks up documents by ID.
GetResponse response = client.prepareGet("twitter", "tweet", "1")
        .execute()
        .actionGet();
setOperationThreaded: defaults to true, meaning the operation executes on a separate thread from the calling one; set it to false to run it on the calling thread.
GetResponse response = client.prepareGet("twitter", "tweet", "1")
        .setOperationThreaded(false)
        .execute()
        .actionGet();
IV. Delete
Deleting documents: prepareDelete deletes by ID.
DeleteResponse response = client.prepareDelete("twitter", "tweet", "1")
        .execute()
        .actionGet();
setOperationThreaded: defaults to true, meaning the operation executes on a separate thread from the calling one; set it to false to run it on the calling thread.
DeleteResponse response = client.prepareDelete("twitter", "tweet", "1")
        .setOperationThreaded(false)
        .execute()
        .actionGet();
V. Update
Updating a document: under the hood Elasticsearch actually deletes the old document and re-indexes a new one, and the version number is incremented.
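The delete-and-reindex semantics can be illustrated with a small self-contained sketch (a plain in-memory map standing in for the index; `UpdateSemantics` and `VersionedDoc` are hypothetical names, not Elasticsearch classes):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for an index: an update replaces the stored
// document wholesale and bumps its version, mirroring how Elasticsearch deletes
// the old document and re-indexes the new one rather than patching it in place.
public class UpdateSemantics {
    static class VersionedDoc {
        final String source;
        final long version;
        VersionedDoc(String source, long version) { this.source = source; this.version = version; }
    }

    static final Map<String, VersionedDoc> store = new HashMap<>();

    static long index(String id, String source) {
        VersionedDoc old = store.get(id);
        long version = (old == null) ? 1 : old.version + 1; // first index => version 1
        store.put(id, new VersionedDoc(source, version));   // old doc is discarded entirely
        return version;
    }

    public static void main(String[] args) {
        System.out.println(index("1", "{\"gender\":\"female\"}")); // 1
        System.out.println(index("1", "{\"gender\":\"male\"}"));   // 2
    }
}
```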
1. Using an UpdateRequest object
UpdateRequest updateRequest = new UpdateRequest();
updateRequest.index("index");
updateRequest.type("type");
updateRequest.id("1");
updateRequest.doc(jsonBuilder()
        .startObject()
        .field("gender", "male")
        .endObject());
client.update(updateRequest).get();
2. Using prepareUpdate
client.prepareUpdate("ttl", "doc", "1")
        .setDoc(jsonBuilder()
                .startObject()
                .field("gender", "male")
                .endObject())
        .get();
3. Using a script. You can also put the script in a file and specify ScriptService.ScriptType.FILE when using it.
// Using an inline script directly
client.prepareUpdate("ttl", "doc", "1")
        .setScript("ctx._source.gender = \"male\"", ScriptService.ScriptType.INLINE)
        .get();
UpdateRequest updateRequest = new UpdateRequest("ttl", "doc", "1")
        .script("ctx._source.gender = \"male\"");
client.update(updateRequest).get();
4. Upsert: if the document we want to update does not exist, index the given document instead.
IndexRequest indexRequest = new IndexRequest("index", "type", "1")
        .source(jsonBuilder()
                .startObject()
                .field("name", "Joe Smith")
                .field("gender", "male")
                .endObject());
UpdateRequest updateRequest = new UpdateRequest("index", "type", "1")
        .doc(jsonBuilder()
                .startObject()
                .field("gender", "male")
                .endObject())
        .upsert(indexRequest);
client.update(updateRequest).get();
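The upsert behavior (index the full document if it is missing, otherwise merge only the partial doc) can be sketched without Elasticsearch (`UpsertSketch` and its map-based store are hypothetical illustrations, not ES classes):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of upsert: if the document id is absent, the full
// "upsertDoc" (the IndexRequest source) is stored; if present, only the
// partial "doc" fields are merged into the existing document.
public class UpsertSketch {
    static final Map<String, Map<String, Object>> store = new HashMap<>();

    static void upsert(String id, Map<String, Object> doc, Map<String, Object> upsertDoc) {
        Map<String, Object> existing = store.get(id);
        if (existing == null) {
            store.put(id, new HashMap<>(upsertDoc)); // document missing: index the upsert doc
        } else {
            existing.putAll(doc);                    // document exists: apply the partial update
        }
    }
}
```

On the first call for a new id the full document (both `name` and `gender`) is stored; subsequent calls only overwrite the fields in the partial doc, leaving the rest intact.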
VI. Bulk
Bulk operations let a single request carry operations on multiple documents.
BulkRequestBuilder bulkRequest = client.prepareBulk();
// either use client#prepare, or use Requests# to directly build index/delete requests
bulkRequest.add(client.prepareIndex("twitter", "tweet", "1")
        .setSource(jsonBuilder()
                .startObject()
                .field("user", "kimchy")
                .field("postDate", new Date())
                .field("message", "trying out Elasticsearch")
                .endObject()
        )
);
bulkRequest.add(client.prepareIndex("twitter", "tweet", "2")
        .setSource(jsonBuilder()
                .startObject()
                .field("user", "kimchy")
                .field("postDate", new Date())
                .field("message", "another post")
                .endObject()
        )
);
BulkResponse bulkResponse = bulkRequest.execute().actionGet();
if (bulkResponse.hasFailures()) {
    // process failures by iterating through each bulk response item
}
BulkProcessor: automatically sends a bulk request once a threshold is reached, e.g. 100 requests have accumulated, a minute has elapsed, or the buffered size exceeds 5 MB.
BulkProcessor bulkProcessor = BulkProcessor.builder(
        client,
        new BulkProcessor.Listener() {
            // 1. Called before a bulk executes, e.g. to inspect request.numberOfActions()
            @Override
            public void beforeBulk(long executionId, BulkRequest request) { ... }
            // 2. Called after a bulk executes, e.g. to check response.hasFailures()
            @Override
            public void afterBulk(long executionId, BulkRequest request, BulkResponse response) { ... }
            // 3. Called when a bulk throws an exception
            @Override
            public void afterBulk(long executionId, BulkRequest request, Throwable failure) { ... }
        })
        // 4. Execute the bulk every 10,000 actions
        .setBulkActions(10000)
        // 5. Execute the bulk once it reaches 1 GB
        .setBulkSize(new ByteSizeValue(1, ByteSizeUnit.GB))
        // 6. Flush the bulk every 5 seconds regardless
        .setFlushInterval(TimeValue.timeValueSeconds(5))
        // 7. Number of concurrent requests. 0 means bulks execute synchronously one at a time;
        //    1 means one bulk can be executing while a new bulk is being accumulated.
        .setConcurrentRequests(1)
        .build();
// You can add actions to the bulkProcessor like this:
bulkProcessor.add(new IndexRequest("twitter", "tweet", "1").source(/* your doc here */));
bulkProcessor.add(new DeleteRequest("twitter", "tweet", "2"));
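The flush conditions above (action count or buffered size crossing a threshold) can be sketched in plain Java; `BatcherSketch` is a hypothetical illustration, not the real BulkProcessor, which additionally flushes on a timer and can run flushes concurrently:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of BulkProcessor's flush logic: actions are buffered
// and sent as one batch when either the action count or the accumulated
// byte size reaches its configured threshold.
public class BatcherSketch {
    final int maxActions;
    final long maxBytes;
    final List<String> buffer = new ArrayList<>();
    long bufferedBytes = 0;
    int flushes = 0; // counts how many batches would have been sent

    BatcherSketch(int maxActions, long maxBytes) {
        this.maxActions = maxActions;
        this.maxBytes = maxBytes;
    }

    void add(String action) {
        buffer.add(action);
        bufferedBytes += action.length();
        if (buffer.size() >= maxActions || bufferedBytes >= maxBytes) {
            flush();
        }
    }

    void flush() {
        if (buffer.isEmpty()) return;
        flushes++; // here the real processor would send one bulk request
        buffer.clear();
        bufferedBytes = 0;
    }
}
```

With `maxActions = 3`, the first two `add` calls only buffer; the third triggers a flush of all three actions at once.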
The default BulkProcessor configuration:
- bulkActions: 1000
- bulkSize: 5 MB
- flushInterval: not set
- concurrentRequests: 1
Two ways to close a BulkProcessor:
bulkProcessor.awaitClose(10, TimeUnit.MINUTES);
bulkProcessor.close();
Both methods flush any remaining requests and stop accepting new actions. If concurrent requests are enabled and awaitClose is used, it returns true when all bulk requests complete within the given timeout.
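This contract is analogous to ExecutorService shutdown in the JDK, which may make the semantics easier to see; the following is plain JDK code, not Elasticsearch, and the timeout value is arbitrary:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class GracefulShutdown {
    // Like BulkProcessor.awaitClose: stop accepting new work, wait for
    // in-flight work, and report whether everything finished in time.
    public static boolean drainAndClose(ExecutorService pool) throws InterruptedException {
        pool.shutdown();                                    // no new tasks accepted from here on
        return pool.awaitTermination(10, TimeUnit.SECONDS); // true if queued tasks finished in time
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        pool.submit(() -> { /* in-flight work */ });
        System.out.println(drainAndClose(pool)); // true: the trivial task completes within the timeout
    }
}
```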