Typical Scenario Description
These typical scenarios provide a quick way to learn the Elasticsearch development process and to become familiar with the key API functions.
Scenario Description
Suppose a user is developing an application that searches book information: given a keyword, it must return the books related to that keyword, scored and sorted by relevance. The search functionality can be implemented with Elasticsearch. The workflow is as follows:
- Connect a client to the cluster
- Query the cluster health status
- Check whether a specified index exists
- Create an index with a specified number of shards
- Write data to the index
- Write data in bulk
- Query index information
- Delete an index
- Delete documents from an index
- Refresh an index
- Multi-threaded example
Sample Code
TransportClient Sample Code
Connecting a Client to the Cluster
Overview
Obtain a client and connect to a specific Elasticsearch cluster by setting the cluster name, IP address, and port. This is a prerequisite for using any of the APIs that Elasticsearch provides.
- After the Elasticsearch operations are finished, call "client.close()" to release the acquired resources.
- Before sending any request with the transport client, call the prepare() method, which performs authentication and other security-related operations.
Connecting a client to the cluster takes two steps:
- Initialize the configuration, as shown in the following code snippet:
ClientFactory.initConfiguration(LoadProperties.loadProperties());
- Obtain the client, as shown in the following code snippet:
client = ClientFactory.getClient();
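LoadProperties.loadProperties() reads the client's connection settings from a properties file. The following is a hypothetical example of such a file; the key names here are illustrative assumptions only, so check the sample project's actual configuration file for the real names:

```properties
# Hypothetical connection settings -- key names are illustrative assumptions
clusterName=my-es-cluster
esServerHost=192.168.0.1:24100,192.168.0.2:24100
isSecureMode=true
principal=elasticsearchuser
```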
Cluster Health Check
Overview
Query the current health status of the Elasticsearch cluster. The client parameter in the following method is the client obtained in "Connecting a Client to the Cluster".
public static void clusterHealth(PreBuiltHWTransportClient client) {
ClusterHealthResponse healths;
try {
healths = client.prepare().admin().cluster().prepareHealth().get();
} catch (ElasticsearchSecurityException e) {
CommonUtil.handleException(e);
return;
}
LOG.info(healths.toString());
}
Creating an Index
Specifying Settings When Creating an Index
Specify the index settings, such as the number of shards and replicas, when the index is created.
public static void createIndexWithSettings(PreBuiltHWTransportClient client, String indexName) {
GetIndexResponse response;
try {
client.prepare().admin().indices().prepareCreate(indexName)
.setSettings(Settings.builder().put("index.number_of_shards", 3).put("index.number_of_replicas", 1)).get();
response = client.prepare().admin().indices().prepareGetIndex().get();
} catch (ElasticsearchSecurityException e) {
CommonUtil.handleException(e);
return;
}
LOG.info(response.settings().toString());
}
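With the settings above (index.number_of_shards = 3, index.number_of_replicas = 1), the cluster allocates one replica copy for each primary shard, so the total number of shard copies is number_of_shards × (1 + number_of_replicas). A trivial sketch of that arithmetic:

```java
class ShardMath {
    // Total shard copies the cluster must allocate:
    // the primaries plus one full copy per configured replica.
    static int totalShardCopies(int numberOfShards, int numberOfReplicas) {
        return numberOfShards * (1 + numberOfReplicas);
    }
}
```

For the sample's 3 primaries and 1 replica this gives 6 shard copies, which is worth keeping in mind when sizing a cluster.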
Specifying a Mapping When Creating an Index
Specify the field mapping when the index is created.
public static void createIndexWithMapping(PreBuiltHWTransportClient client, String indexName) {
GetIndexResponse response;
try {
client.prepare().admin().indices().prepareCreate(indexName).addMapping("tweet", "message", "type=text").get();
response = client.prepare().admin().indices().prepareGetIndex().get();
} catch (ElasticsearchSecurityException e) {
CommonUtil.handleException(e);
return;
}
LOG.info(response.mappings().get(indexName).get("tweet").source().string());
}
Inserting Documents
Inserting a Document as a Map
Store the document to insert in a Map object; the document is inserted and the index is created at the same time.
public static void createMapDocument(PreBuiltHWTransportClient client) {
LOG.info("createMapDocument:");
Map<String, Object> json = new HashMap<>();
json.put("name", "Elasticsearch Reference");
json.put("author", "Alex Yang");
json.put("pubinfo", "Beijing,China.");
json.put("pubtime", "2016-07-16");
json.put("desc", "Elasticsearch is a highly scalable open-source full-text search and analytics engine.");
IndexResponse response;
try {
response = client.prepare().prepareIndex("book", "book").setSource(json).execute().actionGet();
} catch (ElasticsearchSecurityException e) {
CommonUtil.handleException(e);
return;
}
CommonUtil.printIndexInfo(response);
}
Inserting a Document as a String
Store the document to write as a JSON string; the document is inserted while the new index is created.
public static void createJsonStringDocument(PreBuiltHWTransportClient client) {
String json = "{" + "\"name\":\"Elasticsearch Reference\"," + "\"author\":\"Alex Yang \"," + "\"pubinfo\":\"Beijing,China. \","
+ "\"pubtime\":\"2016-07-16\","
+ "\"desc\":\"Elasticsearch is a highly scalable open-source full-text search and analytics engine.\"" + "}";
createDocument(client, "book", "book", json);
}
private static void createDocument(PreBuiltHWTransportClient client, String index, String type, String sourcecontent) {
LOG.info("createDocument:");
IndexResponse response;
try {
response = client.prepare().prepareIndex(index, type).setSource(sourcecontent, XContentType.JSON).get();
} catch (ElasticsearchSecurityException e) {
CommonUtil.handleException(e);
return;
}
CommonUtil.printIndexInfo(response);
LOG.info(index);
}
Inserting a Document as a JavaBean Object
Store the document to insert in a JavaBean object; the document is inserted while the new index is created.
Suppose the document to write is an Article object, as shown in the following code snippet:
import java.util.Date;
public class Article {
private int id;
private String title;
private String content;
private String url;
private Date pubdate;
private String source;
private String author;
public Article() { }
public Article(int id, String title, String content, String url, Date pubdate, String source, String author) {
super();
this.id = id;
this.title = title;
this.content = content;
this.url = url;
this.pubdate = pubdate;
this.source = source;
this.author = author;
}
}
The index, type, and sourcecontent parameters of "createDocument" are the name of the index to write to, the document type, and the document to insert, that is, the Article object serialized into a JSON string.
public static void createBeanDocument(PreBuiltHWTransportClient client) throws JsonProcessingException {
LOG.info("createBeanDocument:");
ObjectMapper mapper = new ObjectMapper();
AtomicInteger ids = new AtomicInteger(0);
Article article = new Article(ids.getAndIncrement(), "Elasticsearch Reference",
"Elasticsearch is a highly scalable open-source full-text search and analytics engine.",
"https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html", Calendar.getInstance().getTime(),
"https://www.gitbook.com/@imalexyang/dashboard", "Alex Yang");
String json = mapper.writeValueAsString(article);
LOG.info(json);
createDocument(client, "article", "article", json);
}
The createDocument helper used here is the same one defined in the String example above.
Writing Documents in Bulk
Overview
Elasticsearch provides the bulk API, which completes multiple operations in a single, highly efficient request and reduces network round trips.
public static void bulkIndex(PreBuiltHWTransportClient client) throws Exception {
LOG.info("bulkIndex:");
try {
BulkRequestBuilder bulkRequest = client.prepare().prepareBulk();
bulkRequest.add(client.prepare().prepareIndex("book", "book", "3").setSource(
jsonBuilder().startObject().field("name", "Elasticsearch Reference").field("author", "Alex Yang")
.field("pubinfo", "Beijing,China.").field("pubtime", "2016-07-16")
.field("desc", "Elasticsearch is a highly scalable open-source full-text search and analytics engine.").endObject()));
bulkRequest.add(client.prepare().prepareIndex("book", "book", "4").setSource(
jsonBuilder().startObject().field("name", "Lucene in Action").field("author", "Erik Hatcher")
.field("pubinfo", "ISBN 9781933988177 532 pages printed in black & white").field("pubtime", "2004-01-01").field("desc",
"Adding search to your application can be easy. "
+ "With many reusable examples and good advice on best practices, Lucene in Action shows you how.").endObject()));
BulkResponse bulkResponse = bulkRequest.get();
if (bulkResponse.hasFailures()) {
LOG.error("Batch indexing fail!");
} else {
LOG.info("Batch indexing success!");
}
} catch (ElasticsearchSecurityException e) {
CommonUtil.handleException(e);
}
}
Querying Documents
Overview
Query a document by its index name, type, and document ID.
public static void getDocument(PreBuiltHWTransportClient client, String index, String type, String id) {
LOG.info("getDocument:");
GetResponse response;
try {
response = client.prepare().prepareGet(index, type, id).execute().actionGet();
} catch (ElasticsearchSecurityException e) {
CommonUtil.handleException(e);
return;
}
boolean exists = response.isExists();
LOG.info("Index found(true or false):" + exists);
LOG.info("response:" + response.getSource());
String _index = response.getIndex();
String _type = response.getType();
String _id = response.getId();
long _version = response.getVersion();
LOG.info(_index + "," + _type + "," + _id + "," + _version);
}
Search Examples
Term Query
A string field can be of type text (for example, the body of an email) or of type keyword (for example, a postal code). Fields with exact values, such as numbers and dates, are added to the inverted index with their exact value so that they can be matched precisely. A text field, by contrast, is run through an analyzer that produces a list of terms, which are then added to the inverted index. The default (standard) analyzer drops punctuation, splits the text into individual words, and lowercases them. A term query is therefore suitable for fields that hold exact values.
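The analysis step described above can be sketched in plain Java. This is a rough imitation of the standard analyzer's default behavior (drop punctuation, split into words, lowercase), not the real Lucene analysis chain:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified imitation of the standard analyzer: split the text on any run of
// non-letter, non-digit characters (which drops punctuation) and lowercase
// each resulting term. Illustration only, not the Lucene implementation.
class StandardAnalyzerSketch {
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }
}
```

This is why the terms query below can match "elasticsearch" against a content field that originally contained "Elasticsearch": the indexed terms are lowercase.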
public static void termsQuery(PreBuiltHWTransportClient client) {
TermsQueryBuilder termsQueryBuilder = QueryBuilders.termsQuery("content", "elasticsearch", "alex");
SearchResponse searchResponse;
try {
searchResponse = client.prepare().prepareSearch("article").setQuery(termsQueryBuilder)
.setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
.highlighter(new HighlightBuilder().field("content"))
.setFrom(0).setSize(60).setExplain(true).execute().actionGet();
} catch (ElasticsearchSecurityException e) {
CommonUtil.handleException(e);
return;
}
SearchHits searchHits = searchResponse.getHits();
LOG.info("termsQuery:");
LOG.info("Total match found:" + searchHits.getTotalHits());
SearchHit[] hits = searchHits.getHits();
for (SearchHit searchHit : hits) {
LOG.info(searchHit.getSourceAsString());
//Get the highlighting field
Map<String, HighlightField> highlightFields = searchHit.getHighlightFields();
if (highlightFields.isEmpty()) {
continue;
}
HighlightField highlightField = highlightFields.get("content");
LOG.info("Highlighting field:" + highlightField.getName() + "\nHighlighting field content:" + highlightField.getFragments()[0]
.string());
Map<String, Object> sourceAsMap = searchHit.getSourceAsMap();
Set<String> keySet = sourceAsMap.keySet();
for (String string : keySet) {
LOG.info(string + ":" + sourceAsMap.get(string));
}
}
}
Scroll Search
Use a scroll search to find the documents whose field name has the given value, and delete them with a bulk request.
public static void scrollSearchDelete(PreBuiltHWTransportClient client, String index, String name, String value) {
LOG.info("scrollSearchDelete:");
try {
QueryBuilder qb = termQuery(name, value);
SearchResponse scrollResp = client.prepare().prepareSearch(index).addSort(FieldSortBuilder.DOC_FIELD_NAME, SortOrder.ASC)
.setScroll(new TimeValue(60000)).setQuery(qb).setSize(100).execute().actionGet();
//100 hits per shard will be returned for each scroll
BulkRequestBuilder bulkRequest = client.prepare().prepareBulk();
while (true) {
for (SearchHit hit : scrollResp.getHits().getHits()) {
LOG.info(hit.getIndex() + hit.getType());
LOG.info(hit.getSourceAsString());
bulkRequest.add(client.prepare().prepareDelete(hit.getIndex(), hit.getType(), hit.getId()));
}
scrollResp =
client.prepare().prepareSearchScroll(scrollResp.getScrollId()).setScroll(new TimeValue(600000)).execute().actionGet();
if (scrollResp.getHits().getHits().length == 0) {
break;
}
}
if (bulkRequest.numberOfActions() == 0) {
return;
}
BulkResponse bulkResponse = bulkRequest.get();
BulkItemResponse[] bulkItemResponses = bulkResponse.getItems();
for (BulkItemResponse bulkItemResponse : bulkItemResponses) {
LOG.info("index:" + bulkItemResponse.getIndex());
LOG.info("type:" + bulkItemResponse.getType());
LOG.info("Optype:" + bulkItemResponse.getOpType());
LOG.info("isFailed:" + bulkItemResponse.isFailed());
}
} catch (ElasticsearchSecurityException e) {
CommonUtil.handleException(e);
}
}
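The control flow of the scroll loop above, fetch a page of hits, queue a delete for each hit, fetch the next page, and stop when a page comes back empty, can be illustrated with a self-contained, in-memory paginator (a hypothetical helper, not the Elasticsearch API):

```java
import java.util.ArrayList;
import java.util.List;

// In-memory illustration of scroll-style paging: each iteration takes one
// "page" of document ids, queues them for deletion, then advances. The loop
// ends when a page is empty, mirroring the getHits().length == 0 check above.
class ScrollSketch {
    static List<String> collectIds(List<String> allIds, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<String> pendingDeletes = new ArrayList<>();
        int from = 0;
        while (true) {
            int to = Math.min(from + pageSize, allIds.size());
            List<String> page = allIds.subList(from, to); // one "scroll" page
            if (page.isEmpty()) {
                break;                                    // empty page ends the scroll
            }
            pendingDeletes.addAll(page);                  // queue delete actions
            from = to;
        }
        return pendingDeletes;
    }
}
```

In the real sample, each page fetch is a prepareSearchScroll call that keeps the search context alive for the configured TimeValue.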
Deleting an Index
Overview
Delete a specified index.
public static void deleteIndices(PreBuiltHWTransportClient client, String indices) {
DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest(indices);
AcknowledgedResponse response;
try {
response = client.prepare().admin().indices().delete(deleteIndexRequest).get();
} catch (ElasticsearchSecurityException | ExecutionException | InterruptedException e) {
CommonUtil.handleException(e);
return;
}
if (response.isAcknowledged()) {
LOG.info("Delete success!");
}
}
Bulk Operations
Overview
Bulk.java, located in the "elasticsearch-transport-client-example/src/main/java/com/huawei/fusioninsight/elasticsearch/example/bulk" directory, performs bulk operations such as bulk indexing, bulk updating, and bulk deletion.
private static void dataInput(long recordNum, String index, String type) {
long circleCommit = recordNum / bulkNum;
Map<String, Object> esJson = new HashMap<String, Object>();
for (int j = 0; j < circleCommit; j++) {
long starttime = System.currentTimeMillis();
BulkRequestBuilder bulkRequest = client.prepare().prepareBulk();
for (int i = 0; i < bulkNum; i++) {
esJson.clear();
esJson.put("id", "1");
esJson.put("name", "Linda");
esJson.put("sex", "man");
esJson.put("age", 78);
esJson.put("height", 210);
esJson.put("weight", 180);
bulkRequest.add(client.prepare().prepareIndex(index, type).setSource(esJson));
}
BulkResponse bulkResponse = bulkRequest.get();
if (bulkResponse.hasFailures()) {
LOG.error("Batch indexing fail!");
} else {
LOG.info("Batch indexing success and put data time is " + (System.currentTimeMillis() - starttime));
}
}
}
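Note that circleCommit = recordNum / bulkNum is integer division, so when recordNum is not an exact multiple of bulkNum, the remainder records are never indexed; either choose recordNum as a multiple of bulkNum or add a final partial batch. The arithmetic:

```java
class BulkBatchMath {
    // Number of full bulk requests issued by the loop above (integer division).
    static long fullBatches(long recordNum, long bulkNum) {
        return recordNum / bulkNum;
    }

    // Records silently skipped when recordNum is not a multiple of bulkNum.
    static long leftover(long recordNum, long bulkNum) {
        return recordNum % bulkNum;
    }
}
```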
BulkProcessor Bulk Loading Example
Overview
BulkProcessorSample.java, located in the "elasticsearch-transport-client-example/src/main/java/com/huawei/fusioninsight/elasticsearch/example/bulk" directory, shows how to use BulkProcessor to load data in bulk.
BulkProcessor initialization:
private static BulkProcessor getBulkProcessor(PreBuiltHWTransportClient transportClient) {
BulkProcessor.Listener listener = new BulkProcessor.Listener() {
@Override
public void beforeBulk(long executionId, BulkRequest bulkRequest) {
int numberOfActions = bulkRequest.numberOfActions();
LOG.info("Executing bulk {} with {} requests.", executionId, numberOfActions);
}
@Override
public void afterBulk(long executionId, BulkRequest bulkRequest, BulkResponse bulkResponse) {
if (bulkResponse.hasFailures()) {
LOG.warn("Bulk {} executed with failures.", executionId);
} else {
LOG.info("Bulk {} completed in {} milliseconds.", executionId, bulkResponse.getTook().getMillis());
}
}
@Override
public void afterBulk(long executionId, BulkRequest bulkRequest, Throwable throwable) {
LOG.error("Failed to execute bulk.", throwable);
}
};
BulkProcessor bulkProcessor = BulkProcessor.builder(transportClient, listener)
.setBulkActions(onceBulkMaxNum)
.setBulkSize(new ByteSizeValue(onecBulkMaxSize, ByteSizeUnit.MB))
.setConcurrentRequests(concurrentRequestsNum)
.setFlushInterval(TimeValue.timeValueSeconds(flushTime))
.setBackoffPolicy(BackoffPolicy.constantBackoff(TimeValue.timeValueSeconds(1L), maxRetry))
.build();
LOG.info("Init bulkProcess successfully.");
return bulkProcessor;
}
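The builder above configures three flush triggers: a batch is flushed when the buffered actions reach setBulkActions, when the buffered bytes reach setBulkSize, or when setFlushInterval elapses. The count/size part of that decision can be sketched as follows (a simplified illustration, not the real BulkProcessor implementation; the timer trigger is omitted):

```java
// Minimal sketch of BulkProcessor's flush decision: a buffered batch is
// flushed when either the action count or the accumulated byte size reaches
// its configured limit. The real BulkProcessor also flushes on a timer.
class FlushPolicySketch {
    private final int maxActions;
    private final long maxBytes;
    private int actions;
    private long bytes;

    FlushPolicySketch(int maxActions, long maxBytes) {
        this.maxActions = maxActions;
        this.maxBytes = maxBytes;
    }

    // Records one buffered request; returns true when it triggers a flush,
    // in which case the buffer counters are reset.
    boolean add(long requestBytes) {
        actions++;
        bytes += requestBytes;
        if (actions >= maxActions || bytes >= maxBytes) {
            actions = 0;
            bytes = 0;
            return true;
        }
        return false;
    }
}
```

With setConcurrentRequests greater than zero, flushed batches are sent asynchronously, which is why the listener's afterBulk callbacks are needed to observe the results.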
BulkProcessor loading example:
private void singleThreadBulk() {
// single thread
int bulkTime = 0;
while (bulkTime++ < totalNumberForThread) {
Map<String, Object> dataMap = Maps.newHashMap();
dataMap.put("date", "2019/12/9");
dataMap.put("text", "the test text");
dataMap.put("title", "the title");
bulkProcessor.add(transportClient.prepare().prepareIndex(indexName, indexType).setSource(dataMap).request());
// In security mode, creating an IndexRequest directly with new from multiple threads causes authentication errors with the transport client.
// Do not use new IndexRequest directly: bulkProcessor.add(new IndexRequest(indexName, indexType).source(dataMap));
}
LOG.info("This thread bulks successfully, the thread name is {}.", Thread.currentThread().getName());
}