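Preface
This post continues the previous one. Because Toutiao limits the length of a single post, the rest of the material had to be published as this second part. Thank you for your understanding and support.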
![1e96f3f5cf99e2695f0edbc99d7eea31.png](https://i-blog.csdnimg.cn/blog_migrate/cc52666e2e6568f988c30bdb945c4237.jpeg)
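Java Client
If you are comfortable with Java and would rather not drive an ElasticSearch cluster through scripts or other tools, you can write code against the Java client API that ElasticSearch provides, which gives you finer control. The Java client supports all the common operations, such as updating indices, indexing documents, searching documents, and deleting indices, and it also offers extras such as synchronous and asynchronous execution and explain queries. The code below walks through them. If you manage your Java code with Maven, add the following dependency to pom.xml:

    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>2.0.0</version>
    </dependency>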
Create an ElasticSearch client as follows:

    // create & configure client
    Settings settings = Settings.settingsBuilder()
            .put("cluster.name", "dw_search_engine")
            .put("client.transport.sniff", true)
            .build();
    final Client client = TransportClient.builder().settings(settings).build()
            .addTransportAddress(newAddress("es-01", 9300))
            .addTransportAddress(newAddress("es-02", 9300));
Register the nodes of your ElasticSearch cluster with the Client object through addTransportAddress, as above. The client then stays aware of the cluster automatically when it indexes, updates, deletes, or searches documents. The newAddress helper used above is:

    private static InetSocketTransportAddress newAddress(String host, int port)
            throws UnknownHostException {
        return new InetSocketTransportAddress(InetAddress.getByName(host), port);
    }
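Alternatively, the same settings can be specified in the elasticsearch.yml configuration file, for example:

    cluster.name: dw_search_engine
    client.transport.sniff: true
    client.transport.ping_timeout: 10s
    client.transport.nodes_sampler_interval: 10s

In that case the client has to read its settings from the configuration file when it is created; see the official documentation for details.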
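- Preparation
For indexing, we read records from a local file, convert each one into the format required for an index document, and then ask the ElasticSearch cluster to index it. The following code does some basic setup:

    final String index = "basis_device_info";
    final String type = "user";
    // index documents
    String f = "C:\\Users\\yanjun\\Desktop\\basis_device_info.txt";
    File in = new File(f);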
Records are read from the file one line at a time, and each is turned into a JSON document represented by an XContentBuilder:

    protected static XContentBuilder createSource(String[] a) throws IOException {
        return jsonBuilder()
                .startObject()
                .field("installid", a[0])
                .field("appid", a[1])
                .field("udid", a[2])
                .field("channel", a[3])
                .field("version", a[4])
                .field("osversion", a[5])
                .field("device_name", a[6])
                .field("producer", a[7])
                .field("device_type", a[8])
                .field("resolution", a[9])
                .field("screen_size", a[10])
                .field("mac", a[11])
                .field("idfa", a[12])
                .field("idfv", a[13])
                .field("imei", a[14])
                .field("create_time", a[15])
                .endObject();
    }
The sections below go through the API feature by feature, with code showing how each one is used.
- Creating an index
An index can be created directly through the Java client library:

    protected static void createIndex(final Client client, String index) {
        Map<String, String> indexSettings = Maps.newHashMap();
        indexSettings.put("number_of_shards", "4");
        indexSettings.put("number_of_replicas", "1");
        CreateIndexRequest createIndexRequest = new CreateIndexRequest(
                index, Settings.settingsBuilder().put(indexSettings).build());
        CreateIndexResponse createIndexResponse =
                client.admin().indices().create(createIndexRequest).actionGet();
        System.out.println(createIndexResponse);
    }
- Creating mappings
Creating mappings through the Java client is a little more involved, because the corresponding JSON has to be assembled by hand:

    protected static void createMappings(final Client client, String index)
            throws IOException, InterruptedException, ExecutionException {
        XContentBuilder basisInfoMapping = jsonBuilder()
                .startObject()
                    .startObject("_all")
                        .field("enabled", "false")
                    .endObject()
                    .startObject("properties")
                        .startObject("id")
                            .field("type", "string")
                        .endObject()
                        .startObject("name")
                            .field("type", "string")
                            .field("index", "analyzed")
                        .endObject()
                        .startObject("age")
                            .field("type", "integer") // "int" is not a valid field type
                        .endObject()
                        .startObject("birthday")
                            .field("type", "date")
                            .field("format", "yyyy-MM-dd HH:mm:ss")
                            .field("index", "not_analyzed")
                        .endObject()
                    .endObject()
                .endObject();
        XContentBuilder deviceInfoMapping = jsonBuilder()
                .startObject()
                    .startObject("_all")
                        .field("enabled", "false")
                    .endObject()
                    .startObject("properties")
                        .startObject("udid")
                            .field("type", "string")
                        .endObject()
                        .startObject("device_name")
                            .field("type", "string")
                            .field("index", "analyzed")
                        .endObject()
                        .startObject("provider")
                            .field("type", "string")
                            .field("index", "analyzed")
                        .endObject()
                        .startObject("os_version")
                            .field("type", "string")
                        .endObject()
                    .endObject()
                .endObject();
        // A PutMappingRequest carries a single type and source, so chaining
        // .type(...).source(...) twice keeps only the last pair. Send one request per type.
        PutMappingRequest basicInfoRequest = Requests.putMappingRequest(index)
                .type("basic_info")
                .source(basisInfoMapping);
        PutMappingRequest deviceInfoRequest = Requests.putMappingRequest(index)
                .type("device_info")
                .source(deviceInfoMapping);
        for (PutMappingRequest putMappingRequest
                : new PutMappingRequest[] {basicInfoRequest, deviceInfoRequest}) {
            System.out.println(putMappingRequest.indicesOptions());
            PutMappingResponse putMappingResponse =
                    client.admin().indices().putMapping(putMappingRequest).get();
            System.out.println(putMappingResponse);
        }
    }
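Run against an index named app_user_info, the code above gives that index two types, basic_info and device_info. You can inspect the resulting index information in the web management page of the elasticsearch_head plugin.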
- Indexing a single document
Read the data from the file, build one document per record, and index it:

    protected static void indexDocs(final Client client, final String index,
            final String type, File in) {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(in.getAbsoluteFile()));
            String line = null;
            while ((line = reader.readLine()) != null) {
                // assumes tab-separated fields; the -1 limit keeps trailing empty fields
                String[] a = line.split("\t", -1);
                if (a.length == 16) {
                    String udid = a[2]; // the udid doubles as the document id
                    IndexResponse response = client
                            .prepareIndex(index, type, udid)
                            .setSource(createSource(a))
                            .get();
                    System.out.println(response.toString());
                }
            }
        } catch (Exception e) {
            throw Throwables.propagate(e);
        } finally {
            closeQuietly(reader);
        }
    }
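
The length check `a.length == 16` only behaves as intended if empty trailing columns survive the split, which is what the `-1` limit is for. The following is a minimal standalone sketch of that behaviour. It assumes the input file is tab-separated (an assumption, the delimiter is not stated in the article), and `SplitLimitDemo`, `parse`, and `isValid` are hypothetical names used only for illustration.

```java
final class SplitLimitDemo {
    // Matches the a.length == 16 check used when indexing records.
    static final int EXPECTED_FIELDS = 16;

    /** Splits a record on tabs (assumed delimiter), keeping trailing empty fields. */
    static String[] parse(String line) {
        return line.split("\t", -1);
    }

    /** A record is usable only if it has exactly the expected number of fields. */
    static boolean isValid(String line) {
        return parse(line).length == EXPECTED_FIELDS;
    }

    public static void main(String[] args) {
        String line = "a\tb\t\t";
        // Without a limit, String.split drops trailing empty strings.
        System.out.println(line.split("\t").length);     // prints 2
        // With a negative limit, trailing empty strings are kept.
        System.out.println(line.split("\t", -1).length); // prints 4
    }
}
```

Without the negative limit, a record whose last columns (for example idfa, idfv, or imei) are empty would come back with fewer than 16 elements and be skipped silently.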
- Bulk indexing
There are several ways to index in bulk. The first is the Bulk API, where we control the size of each batch ourselves:

    protected static void indexBulk(final Client client, final String index,
            final String type, File in) {
        BulkRequestBuilder bulkRequest = client.prepareBulk();
        final int batchSize = 100;
        int counter = 0;
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(in.getAbsoluteFile()));
            String line = null;
            while ((line = reader.readLine()) != null) {
                // assumes tab-separated fields; the -1 limit keeps trailing empty fields
                String[] a = line.split("\t", -1);
                if (a.length == 16) {
                    String udid = a[2];
                    IndexRequestBuilder indexRequestBuilder = client
                            .prepareIndex(index, type, udid)
                            .setSource(createSource(a));
                    bulkRequest.add(indexRequestBuilder);
                    if (++counter >= batchSize) {
                        // batch is full: send it and start a new one
                        System.out.println(!bulkRequest.get().hasFailures());
                        counter = 0;
                        bulkRequest = client.prepareBulk();
                    }
                }
            }
        } catch (Exception e) {
            throw Throwables.propagate(e);
        } finally {
            // send the last partial batch; an empty bulk request fails validation
            if (bulkRequest.numberOfActions() > 0) {
                System.out.println(!bulkRequest.get().hasFailures());
            }
            closeQuietly(reader);
        }
    }
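
The batching pattern in the code above (collect items, send when the batch is full, then send whatever is left at the end) can be isolated from ElasticSearch entirely. The sketch below is one way to write it with the standard library only. `BatchBuffer` and its `sink` callback are hypothetical names, and the sink stands in for the bulk request being executed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // stand-in for "execute the bulk request"
    private List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one item and sends the batch as soon as it is full. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends the buffered items, if any; an empty buffer is skipped. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<T> batch = pending;
        pending = new ArrayList<>();
        sink.accept(batch);
    }

    public static void main(String[] args) {
        List<Integer> sizes = new ArrayList<>();
        BatchBuffer<Integer> buffer = new BatchBuffer<>(100, batch -> sizes.add(batch.size()));
        for (int i = 0; i < 250; i++) {
            buffer.add(i);
        }
        buffer.flush(); // final partial batch
        System.out.println(sizes); // prints [100, 100, 50]
    }
}
```

The final `flush()` plays the role of the `finally` block in the bulk-indexing method: without it the last 50 items would never be sent.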
![612c39856928285d21b6a811ca9cc494.png](https://i-blog.csdnimg.cn/blog_migrate/b72145772f9646d02cdef956871ddb02.jpeg)
The other approach uses the Bulk Processor that ElasticSearch provides. You only set a few parameters and it handles the batching for you, which makes it more flexible:

    protected static void indexUsingBulkProcessor(final Client client, final String index,
            final String type, File in) throws InterruptedException {
        String name = "device_info_processor";
        int bulkActions = 1000;                                        // flush after this many actions
        ByteSizeValue bulkSize = new ByteSizeValue(100, ByteSizeUnit.MB); // or after this much data
        TimeValue flushInterval = TimeValue.timeValueSeconds(60);      // or after this much time
        int concurrentRequests = 12;
        // create bulk processor
        final BulkProcessor bulkProcessor = BulkProcessor.builder(client,
                new BulkProcessor.Listener() {
                    public void afterBulk(long id, BulkRequest req, BulkResponse resp) {
                        System.out.println("id=" + id + ", resp=" + resp);
                    }
                    public void afterBulk(long id, BulkRequest req, Throwable cause) {
                        System.out.println("id=" + id + ", req=" + req + ", cause=" + cause);
                    }
                    public void beforeBulk(long id, BulkRequest req) {
                        System.out.println("id=" + id + ", req=" + req);
                    }
                })
                .setName(name)
                .setBulkActions(bulkActions)
                .setBulkSize(bulkSize)
                .setFlushInterval(flushInterval)
                .setConcurrentRequests(concurrentRequests)
                .build();
        // index documents
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(in.getAbsoluteFile()));
            String line = null;
            while ((line = reader.readLine()) != null) {
                // assumes tab-separated fields; the -1 limit keeps trailing empty fields
                String[] a = line.split("\t", -1);
                if (a.length == 16) {
                    String udid = a[2];
                    bulkProcessor.add(new IndexRequest(index, type, udid).source(createSource(a)));
                }
            }
        } catch (Exception e) {
            throw Throwables.propagate(e);
        } finally {
            closeQuietly(reader);
            // close bulk processor, waiting for in-flight requests to finish
            bulkProcessor.awaitClose(60, TimeUnit.SECONDS);
        }
    }
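A custom BulkProcessor.Listener gives you hooks into the process. For example, if indexing fails, the hook method can handle the failure and implement a retry; or, when indexing succeeds, it can send a callback to another service.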
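
As a rough illustration of the retry idea, here is a small bounded-retry helper written with the standard library only. It is a sketch under assumptions, not part of the ElasticSearch API: `RetrySketch` and `retry` are hypothetical names, and the `Callable` stands in for whatever a failure hook would resubmit.

```java
import java.util.concurrent.Callable;

final class RetrySketch {
    /**
     * Runs the action until it succeeds or maxAttempts is reached,
     * then rethrows the last failure.
     */
    static <T> T retry(Callable<T> action, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: fails on the first two calls, succeeds on the third.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated failure");
            }
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints "ok after 3 attempts"
    }
}
```

A production version would usually also wait between attempts and retry only the items that actually failed, rather than the whole batch.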
- Updating a document
To update some fields of a document, specify the document id and the new values of those fields:

    protected static void updateDoc(final Client client, final String index, final String type)
            throws IOException, InterruptedException, ExecutionException {
        String id = "60e90ddcb1a61622028b8d92112a646c";
        UpdateRequest updateRequest = new UpdateRequest(index, type, id);
        // only the fields listed here are changed
        updateRequest.doc(jsonBuilder()
                .startObject()
                .field("channel", "h-google")
                .field("appid", "1")
                .endObject());
        UpdateResponse response = client.update(updateRequest).get();
        System.out.println(response);
    }
If the document to update does not exist yet, you would normally have to index it first and then update it. The upsert operation folds both steps into a single request:

    protected static void upsertDoc(final Client client, final String index, final String type)
            throws IOException, InterruptedException, ExecutionException {
        String id = "fdd5ff7f56b613f0acb2c20a1ebc35e4";
        // document to index if nothing with this id exists yet
        IndexRequest indexRequest = new IndexRequest(index, type, id).source(jsonBuilder()
                .startObject()
                .field("installid", "00000BSe")
                .field("appid", "0")
                .field("udid", "fdd5ff7f56b613f0acb2c20a1ebc35e4")
                .field("channel", "A-wandoujia")
                .field("version", "3.1.1")
                .field("resolution", "960*540")
                .field("mac", "00:08:22:be:1b:b7")
                .field("device_type", "0")
                .field("device_name", "HTC")
                .field("producer", "alps")
                .field("create_time", "2015-01-17 17:15:36")
                .endObject());
        // partial document to merge in if the document already exists
        UpdateRequest updateRequest = new UpdateRequest(index, type, id).doc(jsonBuilder()
                .startObject()
                .field("resolution", "540*960")
                .field("channel", "h-baidu")
                .field("version", "3.1.1")
                .field("imei", "861622010000056")
                .endObject())
                .upsert(indexRequest);
        UpdateResponse response = client.update(updateRequest).get();
        System.out.println(response);
    }
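
One detail of upsert is easy to miss: when the document is missing, the upsert document is stored as it is and the partial document is not applied to it; the partial document is merged only when the document already exists. The sketch below models that behaviour with an in-memory map. It is a simplified illustration, not the ElasticSearch implementation: `UpsertSketch` is a hypothetical class, and it merges top-level fields only.

```java
import java.util.HashMap;
import java.util.Map;

final class UpsertSketch {
    // In-memory stand-in for an index: document id -> fields.
    private final Map<String, Map<String, Object>> store = new HashMap<>();

    /** Returns true if a new document was created, false if an existing one was updated. */
    boolean upsert(String id, Map<String, Object> partialDoc, Map<String, Object> upsertDoc) {
        Map<String, Object> existing = store.get(id);
        if (existing == null) {
            // Missing document: store the upsert document; partialDoc is not applied.
            store.put(id, new HashMap<>(upsertDoc));
            return true;
        }
        // Existing document: merge the partial document over it.
        existing.putAll(partialDoc);
        return false;
    }

    Map<String, Object> get(String id) {
        return store.get(id);
    }

    public static void main(String[] args) {
        Map<String, Object> upsertDoc = new HashMap<>();
        upsertDoc.put("channel", "A-wandoujia");
        upsertDoc.put("resolution", "960*540");
        Map<String, Object> partialDoc = new HashMap<>();
        partialDoc.put("channel", "h-baidu");
        partialDoc.put("imei", "861622010000056");

        UpsertSketch index = new UpsertSketch();
        index.upsert("doc-1", partialDoc, upsertDoc);             // first call creates the document
        System.out.println(index.get("doc-1").get("channel"));    // prints A-wandoujia
        index.upsert("doc-1", partialDoc, upsertDoc);             // second call merges the partial doc
        System.out.println(index.get("doc-1").get("channel"));    // prints h-baidu
        System.out.println(index.get("doc-1").get("resolution")); // prints 960*540
    }
}
```

Applied to the upsertDoc method above, this means the first run stores channel A-wandoujia, and only a second run changes it to h-baidu and adds the imei field.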
- Deleting a document
To delete a document, specify its id:

    protected static void deleteDoc(final Client client, final String index, final String type) {
        String id = "60e90ddcb1a61622028b8d92112a646c";
        DeleteResponse response = client.prepareDelete(index, type, id).get();
        System.out.println(response);
    }
- Searching documents
To search, build whatever query (Query) you need, optionally add filters and other settings, and then submit the search:

    protected static void searchDocs(final Client client, final String index, final String type) {
        SearchResponse response = client
                .prepareSearch(index)
                .setTypes(type)
                .setQuery(QueryBuilders.termQuery("device_name", "xiaomi"))
                .setPostFilter(QueryBuilders.rangeQuery("create_time")
                        .from("2015-01-16 00:00:00")
                        .to("2015-01-16 23:59:59"))
                .setFrom(30).setSize(10).setExplain(true) // hits 31-40, with score explanations
                .execute()
                .actionGet();
        System.out.println(response);
    }
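Queries can be built in many ways, for example a boolean query that combines clauses with AND, OR, and NOT before the search is submitted. When executing a search you can also set the starting offset and the number of hits to return, which is how pagination is implemented.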
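
For pagination, the offset passed to setFrom is usually derived from a page number. A minimal sketch of that arithmetic, assuming 1-based page numbers (`Paging` and `offset` are hypothetical names used only for illustration):

```java
final class Paging {
    /** Converts a 1-based page number and a page size into a starting offset. */
    static int offset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be at least 1");
        }
        // multiplyExact fails loudly instead of overflowing into a negative offset
        return Math.multiplyExact(page - 1, size);
    }

    public static void main(String[] args) {
        // Page 4 with 10 hits per page starts at offset 30,
        // which matches setFrom(30).setSize(10) in the search example.
        System.out.println(offset(4, 10)); // prints 30
    }
}
```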
(Original work by 时延军 (Shi Yanjun), link: http://shiyanjun.cn)
![63be4d26871cbcd62df0118adab7c697.png](https://i-blog.csdnimg.cn/blog_migrate/30028bc17a6074b92c6f344d2c639e01.jpeg)