ElasticSearch
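1. What is Elasticsearch
Elasticsearch (ES for short) is an open-source, highly scalable, distributed full-text search engine that stores and retrieves data in near real time. It scales well, to hundreds of servers and petabytes of data.
ES is written in Java and uses Lucene at its core for all indexing and search functionality, but its goal is to hide Lucene's complexity behind a simple RESTful API and so make full-text search easy.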
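To make "full-text search" concrete, here is a toy inverted index, the general kind of term-to-documents structure that engines such as Lucene are built around. It is only a sketch: the class name, the tokenizer, and the sample sentences are assumptions for illustration, and Lucene's real data structures are far more elaborate.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each term to the IDs of the documents that contain it.
class ToyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Stand-in for an analyzer: split on non-letter/non-digit characters and lowercase.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Index a document: record its ID under every term it contains.
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Look up the IDs of all documents containing the term (sorted, possibly empty).
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "Elasticsearch is a search server based on Lucene");
        index.add(2, "Lucene is a Java library");
        System.out.println(index.search("lucene")); // [1, 2]
        System.out.println(index.search("server")); // [1]
    }
}
```

A lookup touches only the entry for the queried term rather than scanning every document, which is why this layout suits search.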
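2. Elasticsearch compared with Solr
Solr relies on ZooKeeper for distributed management, whereas Elasticsearch has distributed coordination built in;
Solr supports more data formats, whereas Elasticsearch works with JSON only;
Solr ships with more official features, whereas Elasticsearch concentrates on core functionality and leaves most advanced features to third-party plugins;
Solr outperforms Elasticsearch in traditional search applications, but is noticeably less efficient than Elasticsearch for real-time search.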
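3. Elasticsearch concepts
Elasticsearch is document oriented, meaning that it stores whole objects or documents. It does not only store them, though: it also indexes the content of each document so that it can be searched. In Elasticsearch you index, search, sort and filter documents rather than rows and columns of data. Elasticsearch compares to a traditional relational database as follows:
Relational DB -> Databases -> Tables -> Rows -> Columns
Elasticsearch -> Indices -> Types -> Documents -> Fields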
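The analogy can be made concrete with a small sketch: one "row" becomes a document whose "columns" are fields, and the document is addressed by its index, type and id. The helper `documentPath` and the class name are hypothetical; the `/{index}/{type}/{id}` shape is how the REST API of the 5.x line used in these notes addresses a document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DocumentModelSketch {
    // Hypothetical helper: REST path of a single document, /{index}/{type}/{id}.
    static String documentPath(String index, String type, String id) {
        return "/" + index + "/" + type + "/" + id;
    }

    public static void main(String[] args) {
        // One document; each map entry plays the role of a field.
        Map<String, Object> article = new LinkedHashMap<>();
        article.put("id", 1L);
        article.put("title", "Elasticsearch is a search server based on Lucene");
        article.put("content", "It provides distributed, multi-user search");

        System.out.println(documentPath("blog2", "article", "1")); // /blog2/article/1
        System.out.println(article.keySet());                      // [id, title, content]
    }
}
```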
4. Using Elasticsearch from Java
(1) Maven coordinates
<dependencies>
    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>5.6.8</version>
    </dependency>
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>transport</artifactId>
        <version>5.6.8</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-to-slf4j</artifactId>
        <version>2.9.1</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>1.7.24</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-simple</artifactId>
        <version>1.7.21</version>
    </dependency>
    <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.12</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
    </dependency>
</dependencies>
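(2) Create an index
// Imports used by the client code in this section:
//   java.net.InetAddress
//   org.elasticsearch.common.settings.Settings
//   org.elasticsearch.common.transport.InetSocketTransportAddress
//   org.elasticsearch.client.transport.TransportClient
//   org.elasticsearch.transport.client.PreBuiltTransportClient
// Build the client settings; the value passed to put() is the cluster name
Settings settings = Settings.builder().put("cluster.name", "my-elasticsearch").build();
// Create the client from the settings
TransportClient client = new PreBuiltTransportClient(settings)
    // Transport address (IP and port) of a node to connect to; several addresses
    // can be added so the client keeps working if one node goes down
    .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
// Create the index
client.admin().indices()
    // Index name
    .prepareCreate("blog2")
    // Execute the request
    .get();
// Release resources
client.close();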
(3) Create a mapping
// Create the client
Settings settings = Settings.builder().put("cluster.name", "my-elasticsearch").build();
TransportClient client = new PreBuiltTransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
// Build the mapping as JSON
XContentBuilder builder = XContentFactory.jsonBuilder()
    // {
    .startObject()
        // "article": {
        .startObject("article")
            // "properties": {
            .startObject("properties")
                // "id": {
                .startObject("id")
                    // "type": "long",
                    .field("type", "long")
                    // "store": true
                    .field("store", true)
                // }
                .endObject()
                .startObject("title")
                    // "text" is the analyzed string type in 5.x; "string" is deprecated
                    .field("type", "text")
                    .field("store", true)
                    // ik_smart comes from the IK analysis plugin, which must be installed on the cluster
                    .field("analyzer", "ik_smart")
                .endObject()
                .startObject("content")
                    .field("type", "text")
                    .field("store", true)
                    .field("analyzer", "ik_smart")
                .endObject()
            .endObject()
        .endObject()
    .endObject();
// Apply the mapping to the index through the client
client.admin().indices()
    // Index name (the index created in step (2))
    .preparePutMapping("blog2")
    // Type name
    .setType("article")
    // Mapping JSON
    .setSource(builder)
    .get();
// Release resources
client.close();
(4) Create a document
// Create the client
Settings settings = Settings.builder().put("cluster.name", "my-elasticsearch").build();
TransportClient client = new PreBuiltTransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
// Build the document content.
// The sample text stays in Chinese because the fields are analyzed with ik_smart.
XContentBuilder builder = XContentFactory.jsonBuilder()
    .startObject()
    .field("id", 1)
    .field("title", "ElasticSearch是一个基于Lucene的搜索服务器")
    .field("content", "它提供了一个分布式多用户能力")
    .endObject();
// Index the document
/**
 * Argument 1 ("blog2"): the index
 * Argument 2 ("article"): the type
 * Argument 3 ("1"): the document id
 */
client.prepareIndex("blog2", "article", "1").setSource(builder).get();
// Release resources
client.close();
(4) Create a document with Jackson
1) Create the entity class (Article, with id, title and content properties)
2) Add the Maven coordinates
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <version>2.8.1</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.8.1</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-annotations</artifactId>
    <version>2.8.1</version>
</dependency>
3) Code
// Create the client
Settings settings = Settings.builder().put("cluster.name", "my-elasticsearch").build();
TransportClient client = new PreBuiltTransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
// Describe the JSON data as an object
// {id: xxx, title: xxx, content: xxx}
Article article = new Article();
article.setId(2);
article.setTitle("搜索工作其实很快乐");
article.setContent("我们希望我们的搜索解决方案要快。");
// Serialize the Article entity to JSON
ObjectMapper objectMapper = new ObjectMapper();
String json = objectMapper.writeValueAsString(article);
// Index the document
client.prepareIndex("blog2", "article", "3")
    // The first argument is the JSON string; the second declares its content type as JSON
    .setSource(json, XContentType.JSON)
    .get();
// Release resources
client.close();
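The notes do not show the Article entity from step 1), so the following is only a plausible sketch of it. The property names are taken from the setters called in the code and from the fields in the mapping; the use of `long` for the id is an assumption based on the mapping's "long" type. Jackson's default serialization reads the public getters.

```java
// Hypothetical Article entity. In a real project it would normally be declared
// public in its own Article.java file.
class Article {
    private long id;
    private String title;
    private String content;

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

    public static void main(String[] args) {
        Article article = new Article();
        article.setId(2);
        article.setTitle("Searching is fun");
        article.setContent("We want our search solution to be fast.");
        System.out.println(article.getId() + ": " + article.getTitle());
    }
}
```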
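(5) Query by id
// client is a TransportClient created as in the previous steps
// 1. Build the query; several ids can be given
QueryBuilder queryBuilder = QueryBuilders.idsQuery().addIds("1", "2");
// 2. Execute the query against the index the documents were written to
SearchResponse searchResponse = client.prepareSearch("blog2")
    .setTypes("article")
    .setQuery(queryBuilder)
    // Paging: offset of the first hit to return
    .setFrom(0)
    // Paging: number of hits per page
    .setSize(5)
    .get();
// Get the search hits
SearchHits searchHits = searchResponse.getHits();
// Print the total number of hits
System.out.println(searchHits.getTotalHits());
// Iterate over the hits
Iterator<SearchHit> iterator = searchHits.iterator();
while (iterator.hasNext()) {
    SearchHit searchHit = iterator.next();
    // Print the document source as a JSON string
    System.out.println(searchHit.getSourceAsString());
    // The same source is also available as a Map
    Map<String, Object> document = searchHit.getSource();
}
// Release resources
client.close();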
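The from/size pair above is an offset plus a page size, so a page number has to be converted before it is passed to setFrom. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and pages are assumed to be 1-based.

```java
class Pagination {
    // Offset of the first hit on a 1-based page; the value to pass to setFrom(...).
    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 5)); // 0
        System.out.println(fromOffset(3, 5)); // 10
    }
}
```

With this helper, page 3 at five hits per page would be requested with setFrom(10) and setSize(5).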
(6) Keyword (term) query
// 1. Create the client
Settings settings = Settings.builder().put("cluster.name", "my-elasticsearch").build();
TransportClient client = new PreBuiltTransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
// 2. Set the search condition
SearchResponse searchResponse = client.prepareSearch("blog2")
    .setTypes("article")
    .setQuery(QueryBuilders.termQuery("content", "搜索"))
    .get();
// 3. Iterate over the search results
SearchHits hits = searchResponse.getHits(); // the hits; getTotalHits() gives how many documents matched
System.out.println("Hits: " + hits.getTotalHits());
Iterator<SearchHit> iterator = hits.iterator();
while (iterator.hasNext()) {
    SearchHit searchHit = iterator.next(); // one matching document
    System.out.println(searchHit.getSourceAsString()); // print the source as a JSON string
    System.out.println("title:" + searchHit.getSource().get("title"));
}
// 4. Release resources
client.close();
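A term query does not analyze the search text: it looks the text up as one exact term. The query above therefore only hits if the field's analyzer (ik_smart in the mapping) produced 搜索 as a token when the document was indexed. The sketch below shows the same effect with English text and a toy lowercasing analyzer; the analyzer and all names are assumptions for illustration, not Elasticsearch code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TermQuerySketch {
    // Toy analyzer: split on whitespace and lowercase each token.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Term lookup: the query text is compared as-is against the indexed tokens.
    static boolean termMatches(List<String> indexedTokens, String term) {
        return indexedTokens.contains(term);
    }

    public static void main(String[] args) {
        List<String> tokens = analyze("Search is fun");
        System.out.println(tokens);                           // [search, is, fun]
        System.out.println(termMatches(tokens, "search"));    // true
        System.out.println(termMatches(tokens, "Search"));    // false: the index holds the lowercased token
        System.out.println(termMatches(tokens, "search is")); // false: not a single indexed token
    }
}
```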
(7) Highlighting
// Create the client
Settings settings = Settings.builder().put("cluster.name", "my-elasticsearch").build();
TransportClient client = new PreBuiltTransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
// Build the search request
SearchRequestBuilder searchRequestBuilder = client
    .prepareSearch("blog2").setTypes("article")
    .setQuery(QueryBuilders.termQuery("title", "搜索"));
// Configure highlighting: the tags that wrap each match and the field to highlight
HighlightBuilder hiBuilder = new HighlightBuilder();
hiBuilder.preTags("<font style='color:red'>");
hiBuilder.postTags("</font>");
hiBuilder.field("title");
searchRequestBuilder.highlighter(hiBuilder);
// Execute the search
SearchResponse searchResponse = searchRequestBuilder.get();
// Get the search hits
SearchHits searchHits = searchResponse.getHits();
System.out.println("Found " + searchHits.getTotalHits() + " result(s)");
// Iterate over the hits
for (SearchHit hit : searchHits) {
    System.out.println("Document source as a String:");
    System.out.println(hit.getSourceAsString());
    System.out.println("Highlight fields as a Map:");
    System.out.println(hit.getHighlightFields());
    System.out.println("Highlighted fragments of the title field:");
    Text[] text = hit.getHighlightFields().get("title").getFragments();
    for (Text str : text) {
        System.out.println(str);
    }
}
// Release resources
client.close();
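To show what a highlighted fragment looks like, the toy below wraps each occurrence of a term in the same pre and post tags. It is plain string replacement with hypothetical names, not how Elasticsearch computes highlights (the real highlighter works from the analyzed terms of the field), so treat it only as an illustration of the output shape.

```java
class HighlightSketch {
    // Wrap every literal occurrence of term in preTag/postTag.
    static String highlight(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        return text.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        String fragment = highlight("search is fun", "search",
                "<font style='color:red'>", "</font>");
        System.out.println(fragment); // <font style='color:red'>search</font> is fun
    }
}
```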