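Preface
Elasticsearch (ES), written in Java, is one of the most popular open-source distributed search engines. It can be accessed in several ways. Users can call the REST API directly and assemble the JSON requests themselves, but hand-building JSON is not convenient enough for some users, so ES also provides SDKs that make the service easier to use. Which one you pick depends on your development language: the official client SDKs cover common languages such as Java, Python, JavaScript and PHP.
SDK reference: Elasticsearch Clients
Since I mostly work with Java in my projects, this post focuses on how to connect to and use ES from Java.
ES provides transport client access on port 9300 by default. Because this access path shares a port with the metadata exchange between ES nodes, heavy business traffic can destabilize the cluster. From 6.x onward the transport client is officially discouraged, it is deprecated in 7.x, and it was removed in 8.0. The officially recommended approach is the High Level REST Client, which uses the HTTP port, 9200 by default.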
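To make the "REST API plus hand-built JSON" option concrete, here is a minimal sketch that uses only the JDK's java.net.http package (Java 11+). It builds a search request against port 9200 but does not send it. The class name RawRestSearch, the index name my_index and the host are placeholders for illustration, not anything defined by ES.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class RawRestSearch {

    /** Builds (but does not send) a POST /{index}/_search request with a JSON body. */
    static HttpRequest buildSearch(String host, int port, String index, String jsonBody) {
        URI uri = URI.create("http://" + host + ":" + port + "/" + index + "/_search");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // Hand-written query DSL: this is the part the SDKs help you avoid
        String body = "{\"query\":{\"match_all\":{}}}";
        HttpRequest request = buildSearch("127.0.0.1", 9200, "my_index", body);
        System.out.println(request.method() + " " + request.uri());
        // Sending it needs a reachable cluster, for example:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Even for one query, the escaping and string assembly are noticeable, which is the usability gap the client SDKs are meant to close.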
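The sections below cover several ways to access ES from Java.
1. Accessing ES with the Transport client
The Transport client is deprecated in ES 7.x and new projects should use the High Level API, but it is still in use on 5.x and 6.x. Those projects are worth migrating, since that makes later ES upgrades and maintenance much easier. A simple connection example follows. Note that the port here is 9300.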
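import java.net.InetAddress;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class TransportClientFactory {
    private TransportClientFactory() {
    }

    // Lazy singleton via the initialization-on-demand holder idiom
    private static class Inner {
        private static final TransportClientFactory instance = new TransportClientFactory();
    }

    public static TransportClientFactory getInstance() {
        return Inner.instance;
    }

    public TransportClient create(String host) throws Exception {
        Settings settings = Settings.builder()
                // The default cluster name is "elasticsearch"; set it explicitly if yours differs
                .put("cluster.name", "my-elasticsearch")
                .put("client.transport.sniff", false)
                .build();
        // 9300 is the transport port, not the HTTP port
        return new PreBuiltTransportClient(settings)
                .addTransportAddress(new TransportAddress(InetAddress.getByName(host), 9300));
    }
}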
2. Accessing ES with the High Level API
ES provides two REST clients: the High Level Client and the Low Level Client. The High Level Client is built on top of the Low Level Client.
The High Level Client code follows.
Creating the RestHighLevelClient
private String host;
private String username;
private String password;
private int port;
private int connectTimeout;
private int connectionRequestTimeout;
private int socketTimeout;
private boolean certification;
private int retryCnt;
private RestHighLevelClient client;
public ESHighLevelClient() {
this.port = 9200;
this.connectTimeout = 30000;
this.connectionRequestTimeout = 30000;
this.socketTimeout = 30000;
this.certification = false;
this.retryCnt = 3;
}
public static ESHighLevelClient create() {
return new ESHighLevelClient();
}
public ESHighLevelClient host(String host) {
this.host = host;
return this;
}
public ESHighLevelClient port(int port) {
this.port = port;
return this;
}
public ESHighLevelClient username(String username) {
this.username = username;
this.certification = true;
return this;
}
public ESHighLevelClient password(String password) {
this.password = password;
this.certification = true;
return this;
}
public ESHighLevelClient connectTimeout(int connectTimeout) {
this.connectTimeout = connectTimeout;
return this;
}
public ESHighLevelClient connectionRequestTimeout(int connectionRequestTimeout) {
this.connectionRequestTimeout = connectionRequestTimeout;
return this;
}
public ESHighLevelClient socketTimeout(int socketTimeout) {
this.socketTimeout = socketTimeout;
return this;
}
//create high client
public ESHighLevelClient build() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost(host, port, "http"))
.setRequestConfigCallback(
config -> config.setConnectTimeout(connectTimeout)
.setConnectionRequestTimeout(connectionRequestTimeout)
.setSocketTimeout(socketTimeout));
if (this.certification) {
builder.setHttpClientConfigCallback(
httpClientBuilder -> {
final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
credentialsProvider.setCredentials(AuthScope.ANY,
new UsernamePasswordCredentials(username, password));
return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
}
);
}
this.client = new RestHighLevelClient(builder);
logger.info("es rest client build success {} ", client);
ClusterHealthRequest request = new ClusterHealthRequest();
ClusterHealthResponse response = this.client.cluster().health(request, RequestOptions.DEFAULT);
logger.info("es rest client health response {} ", response);
return this;
}
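On the wire, the username and password configured above would normally end up as an HTTP Basic Authorization header, assuming the cluster uses basic authentication. The sketch below only shows how such a header value is formed, using the JDK alone. The class name BasicAuthHeader and the sample credentials are made up for illustration; with the REST client, the underlying Apache HttpClient builds the real header itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuthHeader {

    /** Returns the value of an HTTP Basic Authorization header for the given credentials. */
    static String of(String username, String password) {
        String raw = username + ":" + password;
        String encoded = Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Sample credentials only; never hard-code real ones
        System.out.println("Authorization: " + of("user", "pass"));
    }
}
```

Because Base64 is an encoding and not encryption, credentials sent this way are only protected if the connection itself uses HTTPS.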
Creating an index
    public void createIndex(String index, Map<String, String> config) throws Exception {
        try {
            CreateIndexRequest request = new CreateIndexRequest(index);
            request.settings(Settings.builder()
                    .put("index.number_of_shards", 1)
                    .put("index.number_of_replicas", 1));
            // Turn the field -> type config into a {"properties": {field: {"type": type}}} mapping
            Map<String, Object> properties = new HashMap<>();
            for (Map.Entry<String, String> entry : config.entrySet()) {
                Map<String, Object> property = new HashMap<>();
                property.put("type", entry.getValue());
                properties.put(entry.getKey(), property);
            }
            Map<String, Object> mapping = new HashMap<>();
            mapping.put("properties", properties);
            // Apply the mapping; without this call the config argument has no effect.
            // Assumes the 7.x org.elasticsearch.client.indices.CreateIndexRequest (typeless mapping);
            // on 6.x the equivalent is request.mapping("_doc", mapping).
            request.mapping(mapping);
            logger.info("creating index {}, mapping: {}", index, mapping);
            // Send the request synchronously
            CreateIndexResponse createIndexResponse = this.client.indices().create(request, RequestOptions.DEFAULT);
            boolean acknowledged = createIndexResponse.isAcknowledged();
            boolean shardsAcknowledged = createIndexResponse.isShardsAcknowledged();
            logger.info("create index success, acknowledged = {}", acknowledged);
            logger.info("create index success, shardsAcknowledged = {}", shardsAcknowledged);
            logger.info("create index success, index name {}.", index);
        } catch (IOException e) {
            String message = String.format("es high level client (%s) create index failed, message %s.", index,
                    e.getMessage());
            logger.error(message, e);
            throw new Exception("create index failed, message: " + message, e);
        }
    }
Adding a document
public void putDocument(String indexName, Map<String, Object> source) throws Exception {
IndexRequest request = new IndexRequest(indexName);
request.source(source);
int count = 0;
while (true) {
count++;
try {
logger.info("put index count {} document {} ", count, source);
IndexResponse response = this.client.index(request, RequestOptions.DEFAULT);
if (response.getResult() != DocWriteResponse.Result.CREATED) {
if (count < retryCnt) {
continue;
}
throw new Exception("put index document failed, response " + response);
}
logger.info("ES putDocument, resp={}", response);
break;
} catch (IOException e) {
if (count < retryCnt) {
continue;
}
logger.error("write index message: {} ", e.getMessage());
throw new Exception("write index failed, message" + e.getMessage());
}
}
}
Adding documents in bulk
    private BulkResponse putBulkDocuments(String indexName, List<DocWriteRequest<?>> requests) {
        try {
            BulkRequest bulkRequest = new BulkRequest(indexName);
            bulkRequest.add(requests);
            return this.client.bulk(bulkRequest, RequestOptions.DEFAULT);
        } catch (Exception e) {
            logger.error("bulk documents failed, message {}", e.getMessage());
            return null;
        }
    }

    public void putDocuments(String indexName, List<Map<String, Object>> sources) throws Exception {
        // Requests still waiting to be written; shrinks to the failed items after each attempt
        List<DocWriteRequest<?>> pending = new ArrayList<>();
        for (Map<String, Object> source : sources) {
            pending.add(new IndexRequest(indexName).source(source));
        }
        for (int count = 1; count <= retryCnt; count++) {
            // Send the request synchronously
            BulkResponse bulkResponse = putBulkDocuments(indexName, pending);
            if (bulkResponse == null) {
                // The whole bulk call failed: retry everything, but the loop bound still
                // applies (the original code retried forever in this case)
                continue;
            }
            // Item responses come back in request order, so index i maps to pending.get(i)
            List<DocWriteRequest<?>> failed = new ArrayList<>();
            BulkItemResponse[] responses = bulkResponse.getItems();
            for (int point = 0; point < responses.length; point++) {
                if (responses[point].status() != RestStatus.CREATED) {
                    logger.error("put document failed, request {}, message {}",
                            pending.get(point), responses[point].getFailureMessage());
                    failed.add(pending.get(point));
                }
            }
            if (failed.isEmpty()) {
                logger.info("put document count {} success.", sources.size());
                return;
            }
            pending = failed;
        }
        throw new Exception("put index document failed, requests " + pending);
    }
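The core idea of the bulk method above is to resend only the items that failed, and to stop after a fixed number of attempts. Stripped of the ES types, that pattern can be sketched with the JDK alone. RetryFailedItems and the fake sender below are hypothetical names for illustration; the per-item boolean stands in for checking each bulk item's status.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class RetryFailedItems {

    /**
     * Sends items as one batch per attempt. sendBatch must return one flag per item,
     * in the same order, telling whether that item succeeded.
     * Returns the items that are still failing after maxAttempts (empty on full success).
     */
    static <T> List<T> sendWithRetry(List<T> items, int maxAttempts,
                                     Function<List<T>, List<Boolean>> sendBatch) {
        List<T> pending = new ArrayList<>(items);
        for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
            List<Boolean> results = sendBatch.apply(pending);
            List<T> failed = new ArrayList<>();
            for (int i = 0; i < pending.size(); i++) {
                if (!results.get(i)) {
                    failed.add(pending.get(i));
                }
            }
            pending = failed; // only the failures go into the next attempt
        }
        return pending;
    }

    public static void main(String[] args) {
        // Fake sender: the item "bad" never succeeds, everything else does
        Function<List<String>, List<Boolean>> sender = batch -> {
            List<Boolean> ok = new ArrayList<>();
            for (String s : batch) {
                ok.add(!s.equals("bad"));
            }
            return ok;
        };
        List<String> leftover = sendWithRetry(List.of("a", "bad", "b"), 3, sender);
        System.out.println("still failing after retries: " + leftover);
    }
}
```

Retrying only helps with transient failures. An item that is rejected for a permanent reason, such as a value that does not fit the mapping, will fail on every attempt, so it is usually worth logging the per-item failure reason.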
Querying. The client wraps the ES query types (term query, match query, filter query, bool query), so you can pick whichever fits your use case.
public SearchResponse queryCondition(String indexName, Map<String, String> queryTerms,
String property, long startTime, long stopTime, int from, int size) throws Exception {
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
.must(QueryBuilders.rangeQuery(property).gte(startTime).lte(stopTime));
if (queryTerms != null) {
for (Map.Entry<String, String> entry : queryTerms.entrySet()) {
String name = entry.getKey();
String value = entry.getValue();
boolQueryBuilder.must(QueryBuilders.termQuery(name, value));
}
}
searchSourceBuilder.query(boolQueryBuilder)
.from(from)
.size(size)
.timeout(new TimeValue(60, TimeUnit.SECONDS))
.sort(new ScoreSortBuilder().order(SortOrder.DESC));
SearchRequest searchRequest = new SearchRequest(indexName);
searchRequest.source(searchSourceBuilder);
try {
SearchResponse searchResponse = this.client.search(searchRequest, RequestOptions.DEFAULT);
logger.info("query condition request {} ", searchRequest);
return searchResponse;
} catch (Exception e) {
logger.error("search doc failed", e);
throw new Exception("search document failed, message" + e.getMessage());
}
}
Aggregation search
    public SearchResponse queryAgg(String indexName, Map<String, String> queryTerms, String property,
            long startTime, long stopTime, String aggField, int size) throws Exception {
        SearchRequest searchRequest = new SearchRequest(indexName);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
                .filter(QueryBuilders.rangeQuery(property).gte(startTime).lte(stopTime));
        if (queryTerms != null) {
            for (Map.Entry<String, String> entry : queryTerms.entrySet()) {
                boolQueryBuilder.filter(QueryBuilders.termQuery(entry.getKey(), entry.getValue()));
            }
        }
        // size(0): return no hits, only the aggregation buckets
        searchSourceBuilder.query(boolQueryBuilder).size(0)
                .timeout(new TimeValue(60, TimeUnit.SECONDS))
                .sort(new ScoreSortBuilder().order(SortOrder.DESC));
        // The bucket count goes on the terms aggregation; calling size(size) on the
        // search source instead would override the size(0) above
        searchSourceBuilder.aggregation(AggregationBuilders.terms("aggregation")
                .field(aggField).size(size));
        searchRequest.source(searchSourceBuilder);
        try {
            SearchResponse searchResponse = this.client.search(searchRequest, RequestOptions.DEFAULT);
            logger.info("query agg request {} ", searchRequest);
            return searchResponse;
        } catch (Exception e) {
            logger.error("search doc failed", e);
            throw new Exception("search document failed, message: " + e.getMessage(), e);
        }
    }
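It can help to see the kind of request body an aggregation search like this corresponds to. The sketch below assembles such a body by hand with the JDK only: a bool filter with one range clause and some term clauses, no hits (size 0), and a terms aggregation that carries the bucket count. AggRequestBody is a hypothetical helper, the field names in main are placeholders, and the code assumes names and values contain no characters that need JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class AggRequestBody {

    /** Builds a search body with filters, zero hits and one terms aggregation. */
    static String build(String timeField, long start, long stop,
                        Map<String, String> terms, String aggField, int buckets) {
        StringBuilder filters = new StringBuilder();
        // Range filter on the time field
        filters.append("{\"range\":{\"").append(timeField).append("\":{\"gte\":")
                .append(start).append(",\"lte\":").append(stop).append("}}}");
        // One term filter per entry
        for (Map.Entry<String, String> e : terms.entrySet()) {
            filters.append(",{\"term\":{\"").append(e.getKey()).append("\":\"")
                    .append(e.getValue()).append("\"}}");
        }
        return "{\"size\":0,"
                + "\"query\":{\"bool\":{\"filter\":[" + filters + "]}},"
                + "\"aggs\":{\"aggregation\":{\"terms\":{\"field\":\"" + aggField
                + "\",\"size\":" + buckets + "}}}}";
    }

    public static void main(String[] args) {
        Map<String, String> terms = new LinkedHashMap<>();
        terms.put("name", "elasticsearch");
        System.out.println(build("timestamp", 0L, 1000L, terms, "name", 10));
    }
}
```

Note the two different size values: the top-level one controls how many hits are returned, while the one inside terms controls how many buckets come back.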
Testing the API
@Test
public void TestCreateEngine1() throws Exception {
ESHighLevelClient client = ESHighLevelClient.create().host(host).build();
String indexName = "test_index_" + GenerateUUID.getUUID32();
Map<String, String> config = new HashMap<>();
config.put("timestamp", "date");
config.put("name", "keyword");
config.put("title", "text");
config.put("size", "long");
client.createIndex(indexName, config);
List<Map<String, Object>> sources = new ArrayList<>();
for (int i = 0; i < 5; i++) {
Map<String, Object> source = new HashMap<>();
source.put("timestamp", Calendar.getInstance().getTimeInMillis());
source.put("name", "elasticsearch");
source.put("title", "this is a article");
source.put("size", 1000);
sources.add(source);
}
Map<String, Object> source = new HashMap<>();
source.put("timestamp", "xxx");
source.put("name", "elasticsearch");
source.put("title", "this is a article");
source.put("size", 1000);
sources.add(source);
client.putDocuments(indexName, sources);
client.deleteIndex(indexName);
client.close();
}
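The High Level Client is very capable. For a deeper look, see the official documentation.
3. Accessing ES with the Jest jar
The project is no longer maintained: the latest release dates from November 2018, and it officially supports only ES 6.x and earlier, although it can still be used against later versions.
Project page: searchbox-io/Jest
Usage is similar to the High Level Client, but Jest has the advantage of executing ES DSL statements directly. We can therefore prepare DSL templates in advance and substitute the variables at query time, which greatly reduces the amount of code.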
// The inner double quotes of the DSL must be escaped inside a Java string literal
String search = "{" +
        "  \"query\": {" +
        "    \"bool\": {" +
        "      \"must\": [" +
        "        { \"match\": { \"name\": \"Michael Pratt\" }}" +
        "      ]" +
        "    }" +
        "  }" +
        "}";
jestClient.execute(new Search.Builder(search).build());
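The "preset DSL template, substitute the variables" idea can be sketched with the JDK alone. The ${name} placeholder syntax and the DslTemplate class are assumptions made for this example; Jest itself does not prescribe a template format. Values are JSON-escaped before substitution so that a quote inside a search term cannot break the query.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DslTemplate {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Escapes a value so it can be placed inside a JSON string literal. */
    static String escapeJson(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Replaces every ${key} in the template with the escaped value from vars. */
    static String render(String template, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("missing variable: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(escapeJson(value)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "{\"query\":{\"bool\":{\"must\":[{\"match\":{\"name\":\"${name}\"}}]}}}";
        System.out.println(render(template, Map.of("name", "Michael Pratt")));
    }
}
```

The rendered string could then be passed to new Search.Builder(...) as in the snippet above. Placeholders are only meant to sit inside JSON string values here; numeric or structural substitutions would need different handling.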
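4. Integrating Elasticsearch with Spring Boot
Accessing ES through the official SDK is still not very convenient: assembling JSON, handling return values and constructing query conditions all take a lot of code. Is there something for Spring projects that wraps ES data access the way JPA wraps JDBC?
Spring Boot offers two ways to work with Elasticsearch: Jest and Spring Data Elasticsearch.
Jest provides an Elasticsearch Java REST client. It supports ES up to 6.x and is no longer updated, and recent Spring Boot versions have deprecated the Jest integration. See https://github.com/searchbox-io/Jest.
Spring Data Elasticsearch is a subproject of Spring Data.
Spring Data's mission is to provide a familiar and consistent Spring-based programming model for data access while still retaining the special traits of the underlying data store. Spring Data has a number of subprojects.
Website: https://spring.io/projects/spring-data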
- Spring Data Commons
- Spring Data JPA
- Spring Data KeyValue
- Spring Data LDAP
- Spring Data MongoDB
- Spring Data Redis
- Spring Data REST
- Spring Data for Apache Cassandra
- Spring Data for Apache Geode
- Spring Data for Apache Solr
- Spring Data for Pivotal GemFire
- Spring Data Couchbase (community module)
- Spring Data Elasticsearch (community module)
- Spring Data Neo4j (community module)
spring-data-elasticsearch
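By integrating Spring Data Elasticsearch, Spring Boot gives us very convenient search support for Elasticsearch. New projects can prefer this approach.
Main features
- Supports Spring's @Configuration-based Java configuration, or XML configuration, with automatic injection of the configuration
- Provides ElasticsearchTemplate, a convenience class for operating on ES, including automatic mapping between documents and POJOs
- Feature-rich object mapping built on Spring's data conversion service
- Annotation-based metadata mapping, extensible to support more data formats
- Generates the implementation of persistence-layer interfaces automatically, so basic operations need no hand-written code (similar to MyBatis, where the implementation is derived from the interface). Custom queries are also supported
spring-data-elasticsearch connects to ES through the ES High Level Client. Earlier versions used the transport client, which is deprecated in the latest versions.
Adding the dependency with Maven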
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
<version>2.3.7.RELEASE</version>
</dependency>
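Each Spring Data Elasticsearch release targets a specific ES version, so choose versions that match.
Pay attention to the differences between Spring Boot versions and Elasticsearch versions.
Configuration file
Add the following settings to application.properties under the resources directory. Spring Boot reads them at startup and connects to ES automatically.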
spring.elasticsearch.rest.uris=127.0.0.1:9200
#spring.elasticsearch.rest.username=username
#spring.elasticsearch.rest.password=password
spring.elasticsearch.rest.connection-timeout=3000
spring.elasticsearch.rest.read-timeout=3000
Settings used in earlier versions, such as spring.data.elasticsearch.cluster-nodes and spring.data.elasticsearch.cluster-name, are deprecated in newer Spring Boot versions.
Defining the model
@Document(indexName = "book_info", shards = 3, replicas = 1)
@Data
public class Book {
@Id
private String id;
@Field(type = FieldType.Text, analyzer = "standard")
private String name;
@Field(type = FieldType.Text, analyzer = "standard")
private String author;
@Field(type = FieldType.Date, format = DateFormat.basic_date)
private Date createTime;
@Field(type = FieldType.Date, format = DateFormat.basic_date)
private Date updateTime;
@Field(type = FieldType.Double)
private Double price;
}
The id field must be defined. It corresponds to the _id field in ES; if id is not set when writing data, ES generates the _id automatically.
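One detail of the model worth spelling out is the date format. The two Date fields use DateFormat.basic_date, which Elasticsearch documents as the pattern yyyyMMdd, so only the calendar day is stored and the time of day is dropped. The JDK-only sketch below mimics that round trip; BasicDateDemo is a made-up class for illustration and does not call Spring Data or ES.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class BasicDateDemo {

    // BASIC_ISO_DATE formats a date as yyyyMMdd, the same shape as the ES basic_date format
    static final DateTimeFormatter BASIC_DATE = DateTimeFormatter.BASIC_ISO_DATE;

    /** What gets written: the day only, with the time of day discarded. */
    static String toBasicDate(LocalDateTime timestamp) {
        return timestamp.format(BASIC_DATE);
    }

    /** What can be read back: midnight at the start of that day. */
    static LocalDateTime fromBasicDate(String stored) {
        return LocalDate.parse(stored, BASIC_DATE).atStartOfDay();
    }

    public static void main(String[] args) {
        LocalDateTime written = LocalDateTime.of(2021, 1, 5, 13, 45);
        String stored = toBasicDate(written);
        System.out.println("stored: " + stored + ", read back: " + fromBasicDate(stored));
    }
}
```

This is a likely explanation for why createTime and updateTime in the sample query result later in the post all show midnight. If the time of day matters, a format that includes the time would be needed instead.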
Defining the Repository
Usage is similar to JPA: you can use the methods the Repository already provides, or define your own.
public interface BookRepository extends ElasticsearchRepository<Book, String> {
Book findByName(String name);
List<Book> findByAuthor(String author);
Book findBookById(String id);
}
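The custom methods above need no implementation because Spring Data derives the query from the method name. As a rough intuition only, the toy sketch below splits a method name into the property names it refers to. It is not Spring Data's actual parser: DerivedQueryNames is a hypothetical class, it understands only the And keyword, and it would mis-split a property whose name itself contains "And".

```java
import java.util.ArrayList;
import java.util.List;

final class DerivedQueryNames {

    /** Splits e.g. "findByAuthorAndName" into [author, name]. Toy version, And only. */
    static List<String> properties(String methodName) {
        int by = methodName.indexOf("By");
        if (by < 0) {
            throw new IllegalArgumentException("no 'By' in method name: " + methodName);
        }
        List<String> props = new ArrayList<>();
        for (String part : methodName.substring(by + 2).split("And")) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("empty property in: " + methodName);
            }
            // Property names start lower-case in the entity class
            props.add(Character.toLowerCase(part.charAt(0)) + part.substring(1));
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(properties("findByName"));
        System.out.println(properties("findBookById"));
        System.out.println(properties("findByAuthorAndName"));
    }
}
```

The real mechanism supports many more keywords and maps each property to a query clause; the Spring Data Elasticsearch reference lists them.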
Defining the controller
@RestController
@RequestMapping("/book")
public class BookController {
@Autowired
BookRepository bookRepository;
@PostMapping(value = "/add")
public ResponseEntity<String> indexDoc(@RequestBody Book book) {
book.setCreateTime(new Date());
book.setUpdateTime(new Date());
System.out.println("book===" + book);
bookRepository.save(book);
return new ResponseEntity<>("save executed!", HttpStatus.OK);
}
@GetMapping()
public ResponseEntity<Iterable<Book>> getAll() {
Iterable<Book> all = bookRepository.findAll();
return new ResponseEntity<>(all, HttpStatus.OK);
}
@GetMapping(value = "/{name}")
public ResponseEntity<Book> getByName(@PathVariable("name") String name) {
Book book = bookRepository.findByName(name);
return new ResponseEntity<>(book, HttpStatus.OK);
}
@PutMapping(value = "/{id}")
public ResponseEntity<Book> updateBook(@PathVariable("id") String id,
@RequestBody Book updateBook) {
Book book = bookRepository.findBookById(id);
if (book == null) {
// Nothing to update for this id
return new ResponseEntity<>(HttpStatus.NOT_FOUND);
}
// Keep the id from the path: copying it from the request body could null it
// out or change it, which would save a different document
book.setName(updateBook.getName());
book.setAuthor(updateBook.getAuthor());
book.setPrice(updateBook.getPrice());
book.setUpdateTime(new Date());
bookRepository.save(book);
return new ResponseEntity<>(book, HttpStatus.OK);
}
@DeleteMapping(value = "/{id}")
public ResponseEntity<String> deleteBook(@PathVariable("id") String id) {
bookRepository.deleteById(id);
return new ResponseEntity<>("delete execute!", HttpStatus.OK);
}
}
Query test
{
"content": [
{
"id": "0Sqc0HYBNIyq1EKBqlpn",
"name": "name",
"author": "11",
"createTime": "2021-01-05T00:00:00.000+00:00",
"updateTime": "2021-01-05T00:00:00.000+00:00",
"price": 111.0
},
{
"id": "0Cqc0HYBNIyq1EKBdVp5",
"name": "name",
"author": "11",
"createTime": "2021-01-05T00:00:00.000+00:00",
"updateTime": "2021-01-05T00:00:00.000+00:00",
"price": 111.0
}
],
"pageable": {
"sort": {
"sorted": false,
"unsorted": true,
"empty": true
},
"offset": 0,
"pageNumber": 0,
"pageSize": 2,
"paged": true,
"unpaged": false
},
"aggregations": null,
"scrollId": null,
"maxScore": 1.0,
"totalPages": 1,
"totalElements": 2,
"number": 0,
"size": 2,
"sort": {
"sorted": false,
"unsorted": true,
"empty": true
},
"first": true,
"last": true,
"numberOfElements": 2,
"empty": false
}
Conclusion
This post gave a brief introduction to the connection methods I have used at work, as a simple reference for ES users.