一、Introduction
1.1 Background
1.1.1 Massive data
When searching over massive amounts of data, MySQL is simply too slow.
1.1.2 Full-text search
Full-text search splits the stored text into individual terms and matches the search keyword against those terms, instead of scanning whole columns the way a MySQL LIKE query does.
1.1.3 Highlighting
Display the matched search keyword in a special style, for example in a red font.
1.2 What ES is
ES (Elasticsearch) is a search-engine framework written in Java on top of Lucene. It provides distributed full-text search, exposes a unified RESTful web interface, and the official clients provide corresponding APIs for many languages.
Lucene: Lucene is itself the underlying search-engine library.
Distributed: emphasizes ES's ability to scale out horizontally.
Full-text search: a piece of text is analyzed into individual terms, and all of those terms are stored in a term dictionary; at search time the keyword is looked up in the dictionary to find the matching documents.
(This is the inverted index.)
RESTful web interface: operating ES is simple. You just send an HTTP request; the HTTP method and the parameters you carry determine which operation is executed.
Widely used: GitHub and Wikipedia run on ES, and Goldman Sachs uses it to maintain close to 10 TB of data every day.
1.3 Inverted index
An inverted index maps every term in the term dictionary to the list of documents that contain it, so a keyword search becomes a dictionary lookup plus a posting-list fetch instead of a scan over every document.
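The idea can be sketched in plain Java with no ES involved: map each term to the set of document ids containing it, so a keyword search is a dictionary lookup instead of a full scan. The whitespace split below is only a stand-in for a real analyzer such as ik_max_word, and the sample documents are made up for the demo.

```java
import java.util.*;

class InvertedIndexDemo {
    // term -> ids of the documents containing that term (the "term dictionary")
    static Map<String, Set<Integer>> index = new HashMap<>();

    // analyze a document (naive whitespace split) and file every term
    static void add(int docId, String content) {
        for (String term : content.toLowerCase().split("\\s+")) {
            index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // a keyword search is just a dictionary lookup
    static Set<Integer> search(String keyword) {
        return index.getOrDefault(keyword.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        add(1, "Elasticsearch is a search engine");
        add(2, "Lucene is the engine under Elasticsearch");
        System.out.println(search("engine"));  // both documents contain "engine"
    }
}
```

Real ES also stores term frequencies and positions in the posting lists, which is what scoring and phrase queries build on.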
二、Installation
2.1 Install ES & Kibana
version: '3.1'
services:
  elasticsearch:
    image: daocloud.io/library/elasticsearch:6.5.4
    restart: always
    container_name: elasticsearch
    ports:
      - 9200:9200
  kibana:
    image: daocloud.io/library/kibana:6.5.4
    restart: always
    container_name: kibana
    ports:
      - 5601:5601
    environment:
      - ELASTICSEARCH_URL=http://192.168.0.106:9200
    depends_on:
      - elasticsearch
2.2 Install the IK analyzer
https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.4/elasticsearch-analysis-ik-6.5.4.zip
If the link above is slow, use this mirror instead:
elasticsearch-plugin install http://tomcat01.qfjava.cn:81/elasticsearch-analysis-ik/releases/download/v6.5.4/elasticsearch-analysis-ik-6.5.4.zip
If you downloaded the zip to Windows yourself, you can copy the unpacked plugin into the elasticsearch docker container with:
docker cp bin/ik elasticsearch:/usr/share/elasticsearch/plugins/
Then just restart the container.
三、Basic operations
3.1 The structure of ES
3.1.1 Index
One ES service can hold many indexes.
By default every index is split into 5 shards for storage.
Every shard has one replica shard.
Replica shards mainly provide redundancy; they also help serve search requests when the search load on ES is high.
3.1.2 Type
An index can contain multiple types, roughly the counterpart of a table in MySQL.
3.1.3 Document
A document is the counterpart of a single row in MySQL.
One type can contain many documents.
3.1.4 Field
A document contains multiple fields, similar to the columns in MySQL.
3.2 ES RESTful syntax
GET requests:
http://ip:port/index : get information about an index
http://ip:port/index/type/doc_id : get a document
POST requests:
http://ip:port/index/type/_search : search; the query is described in the request body
http://ip:port/index/type/doc_id/_update : update a document
PUT requests:
http://ip:port/index : create an index; the index settings go in the request body
http://ip:port/index/type/_mappings : specify the field mappings of the documents when creating the index
DELETE requests:
http://ip:port/index : delete an index
http://ip:port/index/type/doc_id : delete a document
3.3 Index operations
3.3.1 Create an index
#create an index
PUT /person
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}
3.3.2 View index information
#view index information
GET /person
3.3.3 Delete an index
#delete the index
DELETE /person
3.4 Field types
String types:
text: usually for full-text search; the field is analyzed (split into terms)
keyword: the field is not analyzed
Numeric types:
long
integer
short
byte
double
float
half_float
scaled_float: a long plus a scaling factor represent a floating-point number, e.g. long 345 with scaling factor 100 means 3.45
Date type:
date: a concrete format can be specified for the field
Boolean type:
boolean: represents true and false
Binary type:
binary: accepts a Base64-encoded binary value
Range types:
long_range: no concrete value is needed when assigning; you only store a range, specified with gt, lt, gte, lte
float_range: same as above
double_range: same as above
date_range: same as above
ip_range: same as above
Geo type:
geo_point: stores a latitude/longitude pair
IP type:
ip: can store both IPv4 and IPv6 addresses
For the other data types, see the official documentation.
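To make scaled_float concrete: the value is stored as a long equal to the float times the scaling factor, trading precision for compact integer storage. A plain-Java sketch of the encode/decode round trip (the method names are illustrative, not ES API):

```java
class ScaledFloatDemo {
    // encode: multiply by the scaling factor and round to a long
    static long encode(double value, long scalingFactor) {
        return Math.round(value * scalingFactor);
    }

    // decode: divide the stored long by the same factor
    static double decode(long stored, long scalingFactor) {
        return (double) stored / scalingFactor;
    }

    public static void main(String[] args) {
        long stored = encode(3.45, 100);  // 345, as in the example above
        System.out.println(stored + " -> " + decode(stored, 100));
        // anything finer than 1/scaling_factor is rounded away
        System.out.println(decode(encode(3.456, 100), 100));
    }
}
```

This is why the scaling factor should be chosen to match the precision you actually need (e.g. 100 for money in cents).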
3.5 Create an index with mappings
(The # lines below are annotations; strip them before running the request.)
#create the index
PUT /book
{
  "settings": {
    #number of replicas
    "number_of_replicas": 1,
    #number of shards
    "number_of_shards": 5
  },
  "mappings": {
    #the type
    "novel": {
      #the fields it contains
      "properties": {
        #a field name
        "name": {
          "type": "text",
          #the analyzer to use
          "analyzer": "ik_max_word",
          #whether the field can be used as a query condition
          "index": true,
          #whether the field is additionally stored on its own
          "store": false
        },
        "author": {
          "type": "keyword"
        },
        "count": {
          "type": "long"
        },
        "onSale": {
          "type": "date",
          #accepted date formats
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        },
        "descr": {
          "type": "text"
        }
      }
    }
  }
}
3.6 Document operations
A document is uniquely identified in ES by the combination of _index, _type and _id.
3.6.1 Create a document
Auto-generated id: POSTing a document without an id makes ES generate a random _id.
A random id is inconvenient to remember and reference later, so we usually assign the id ourselves, which requires a PUT request:
PUT /book/novel/1
{
  "name": "红楼梦",
  "author": "曹雪芹",
  "count": 10000,
  "onSale": "1985-01-01",
  "descr": "asdsadasd"
}
3.6.2 Update a document
Overwrite-style update: simply PUT the full document to the same id again; whatever you send replaces the previous document.
doc-style (partial) update:
POST /book/novel/1/_update
{
  "doc": {
    "count": 33333
  }
}
3.6.3 Delete a document
Just send a DELETE request:
DELETE /book/novel/1
四、Connecting to ES from Java
4.1 Dependencies
<dependencies>
    <!--elasticsearch-->
    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>6.5.4</version>
    </dependency>
    <!--high-level REST client API-->
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>6.5.4</version>
    </dependency>
    <!--jackson, used below to serialize documents to JSON-->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.9.8</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.18.12</version>
    </dependency>
</dependencies>
4.2 The ESClient utility class
public class ESClient {
    public static RestHighLevelClient getClient(){
        HttpHost httpHost = new HttpHost("192.168.0.109", 9200);
        RestClientBuilder clientBuilder = RestClient.builder(httpHost);
        RestHighLevelClient client = new RestHighLevelClient(clientBuilder);
        return client;
    }
}
The test methods in the rest of this chapter assume a few shared fields in the test class: a String index, a String type, RestHighLevelClient client = ESClient.getClient(), and a Jackson ObjectMapper mapper.
4.3 Create an index
@Test
public void createIndex() throws IOException {
    /*prepare the index settings*/
    Settings.Builder settings = Settings.builder()
            .put("number_of_shards", 3)
            .put("number_of_replicas", 1);
    /*prepare the index structure (mappings)*/
    XContentBuilder mappings = JsonXContent.contentBuilder()
            .startObject()
                .startObject("properties")
                    .startObject("name")
                        .field("type", "text")
                    .endObject()
                    .startObject("age")
                        .field("type", "integer")
                    .endObject()
                    .startObject("birthday")
                        .field("type", "date")
                        .field("format", "yyyy-MM-dd")
                    .endObject()
                .endObject()
            .endObject();
    /*wrap the settings and mappings into a request object*/
    CreateIndexRequest request = new CreateIndexRequest(index).settings(settings)
            .mapping(type, mappings);
    /*connect to ES through the client and create the index*/
    CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
    System.out.println(response);
}
4.3.1 Check whether an index exists
@Test
public void exists() throws IOException {
    /*prepare the request object*/
    GetIndexRequest request = new GetIndexRequest();
    request.indices(index);
    /*execute through the client*/
    boolean b = client.indices().exists(request, RequestOptions.DEFAULT);
    /*print the result*/
    System.out.println(b);
}
4.3.2 Delete an index
@Test
public void deleteIndex() throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest();
    request.indices(index);
    client.indices().delete(request, RequestOptions.DEFAULT);
}
4.4 Document operations
4.4.1 Add a document
@Test
public void createDocument() throws IOException {
    /*prepare the JSON data*/
    Person person = new Person(1, "张三", 23, new Date());
    String json = mapper.writeValueAsString(person);
    /*prepare the request*/
    IndexRequest request = new IndexRequest(index, type, person.getId().toString());
    request.source(json, XContentType.JSON);
    /*execute the add through the client*/
    IndexResponse response = client.index(request, RequestOptions.DEFAULT);
    System.out.println(response.getResult().toString());
}
4.4.2 Update a document
@Test
public void updateDoc() throws IOException {
    /*a map holding the fields to change*/
    Map<String, Object> doc = new HashMap<>();
    doc.put("name", "郎博年");
    /*create the request object and attach the data*/
    UpdateRequest request = new UpdateRequest(index, type, "1");
    request.doc(doc);
    /*execute through the client*/
    UpdateResponse updateResponse = client.update(request, RequestOptions.DEFAULT);
    System.out.println(updateResponse.getResult().toString());
}
4.4.3 Delete a document
@Test
public void deleteDoc() throws IOException {
    /*wrap the request object*/
    DeleteRequest deleteRequest = new DeleteRequest(index, type, "1");
    DeleteResponse deleteResponse = client.delete(deleteRequest, RequestOptions.DEFAULT);
    System.out.println(deleteResponse.getResult().toString());
}
4.4.4 Bulk add
@Test
public void bulkDoc() throws IOException {
    /*prepare several JSON documents*/
    Person person = new Person(1, "张三", 23, new Date());
    Person person1 = new Person(2, "张2", 24, new Date());
    Person person2 = new Person(3, "张3", 25, new Date());
    String json1 = mapper.writeValueAsString(person);
    String json2 = mapper.writeValueAsString(person1);
    String json3 = mapper.writeValueAsString(person2);
    /*create the bulk request*/
    BulkRequest request = new BulkRequest();
    request.add(new IndexRequest(index, type, person.getId().toString()).source(json1, XContentType.JSON));
    request.add(new IndexRequest(index, type, person1.getId().toString()).source(json2, XContentType.JSON));
    request.add(new IndexRequest(index, type, person2.getId().toString()).source(json3, XContentType.JSON));
    BulkResponse bulk = client.bulk(request, RequestOptions.DEFAULT);
    System.out.println(bulk);
}
4.4.5 Bulk delete
@Test
public void bulkDelete() throws IOException {
    BulkRequest bulkRequest = new BulkRequest();
    bulkRequest.add(new DeleteRequest(index, type, "1"));
    bulkRequest.add(new DeleteRequest(index, type, "2"));
    bulkRequest.add(new DeleteRequest(index, type, "3"));
    BulkResponse responses = client.bulk(bulkRequest, RequestOptions.DEFAULT);
    System.out.println(responses);
}
五、Queries in practice
5.1 Queries
5.1.1 term query
A term query is an exact match: the search keyword is not analyzed before searching; it is matched as-is against the terms in the documents' term dictionary.
#term query
POST /sms-logs-index/sms-logs-type/_search
{
  "from": 0,
  "size": 5,
  "query": {
    "term": {
      "province": {
        "value": "北京"
      }
    }
  }
}
Java implementation:
@Test
public void search() throws IOException {
SearchRequest request=new SearchRequest(index);
request.types(type);
SearchSourceBuilder builder=new SearchSourceBuilder();
builder.from(0);
builder.size(5);
builder.query(QueryBuilders.termQuery("province","北京"));
request.source(builder);
SearchResponse response= client.search(request,RequestOptions.DEFAULT);
for (SearchHit hit : response.getHits().getHits()) {
Map<String, Object> sourceAsMap = hit.getSourceAsMap();
System.out.println(sourceAsMap);
}
}
5.1.2 terms query
Same matching mechanism as term: the specified keywords are not analyzed; they are matched directly against the term dictionary to find the corresponding documents.
terms is used when one field should match any of several values.
@Test
public void termsQuery() throws IOException {
/*create the request*/
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
/*build the query condition*/
SearchSourceBuilder builder=new SearchSourceBuilder();
builder.query(QueryBuilders.termsQuery("province","北京","山西"));
searchRequest.source(builder);
/*execute the search*/
SearchResponse response=client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
5.2 match query
match is a high-level query: it chooses a query strategy based on the type of the field you query.
- If the field is a date or a number, the query string you pass is converted to a date or a number before matching.
- If the field is not analyzable (keyword), match does not analyze your keyword.
- If the field is analyzable (text), match analyzes your query string and matches the resulting terms against the term dictionary.
- A match query is effectively several term queries whose results are combined.
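The last bullet can be made concrete without ES: analyze the query into terms, run one term lookup per term, and combine the posting sets with OR (union) or AND (intersection). The whitespace split stands in for a real analyzer, and the tiny in-memory term dictionary is made up for the demo:

```java
import java.util.*;

class MatchQueryDemo {
    // toy term dictionary: term -> doc ids (illustrative data only)
    static Map<String, Set<Integer>> dict = new HashMap<>();
    static {
        dict.put("china",  new TreeSet<>(Arrays.asList(1, 2, 3)));
        dict.put("health", new TreeSet<>(Arrays.asList(2, 3, 4)));
    }

    // match = analyze the query, then OR/AND the per-term results
    static Set<Integer> match(String query, String operator) {
        Set<Integer> result = null;
        for (String term : query.toLowerCase().split("\\s+")) {
            Set<Integer> hits = dict.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else if (operator.equals("and")) {
                result.retainAll(hits);  // intersection
            } else {
                result.addAll(hits);     // union
            }
        }
        return result == null ? Collections.emptySet() : result;
    }

    public static void main(String[] args) {
        System.out.println(match("china health", "or"));
        System.out.println(match("china health", "and"));
    }
}
```

The operator choice here mirrors the `operator` option shown in section 5.2.3 below.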
5.2.1 match_all
Queries everything with no conditions; note that ES only returns 10 hits by default.
Raise the number of returned hits with builder.size(20);
@Test
public void matchAllSearch() throws IOException {
    SearchRequest searchRequest = new SearchRequest(index);
    searchRequest.types(type);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(QueryBuilders.matchAllQuery());
    /*attach the query to the request*/
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    for (SearchHit hit : searchResponse.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }
}
5.2.2 match query
Specify a single field as the filter condition:
@Test
public void matchQuery() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchQuery("smsContent","收获安装"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse= client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
5.2.3 Boolean match query
The terms analyzed from the query string can be combined against one field with and or with or:
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "match": {
      "smsContent": {
        "query": "中国 健康",
        "operator": "and"
      }
    }
  }
}
@Test
public void matchMatchQuery() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchQuery("smsContent","中国 健康").operator(Operator.OR));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse= client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
5.3 Other queries
5.3.1 id query
GET /sms-logs-index/sms-logs-type/1
@Test
public void findById() throws IOException {
GetRequest getRequest=new GetRequest(index,type,"1");
GetResponse getResponse=client.get(getRequest,RequestOptions.DEFAULT);
System.out.println(getResponse.getSourceAsMap());
}
5.3.2 ids query
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "ids": {
      "values": ["1", "2"]
    }
  }
}
@Test
public void findAllbyIds() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
searchSourceBuilder.query( QueryBuilders.idsQuery().addIds("1","3"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse=client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
5.3.3 prefix query
A prefix query finds documents whose field value starts with the given keyword.
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "prefix": {
      "corpName": {
        "value": "途虎"
      }
    }
  }
}
@Test
public void findByPrefix() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.prefixQuery("corpName","盒马"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse=client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
5.3.4 fuzzy query (fuzzy matching)
Fuzzy matching: you type an approximation of the target and ES matches results that are close to it. prefix_length specifies how many leading characters must match exactly.
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "fuzzy": {
      "corpName": {
        "value": "盒马先生",
        "prefix_length": 2
      }
    }
  }
}
@Test
public void findByFuzzy() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.fuzzyQuery("corpName","盒马先生").prefixLength(2));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse=client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
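The "closeness" that fuzzy measures is edit distance: how many single-character insertions, deletions or substitutions turn one string into another (ES caps fuzziness at 2, and its variant also counts transpositions, which the simpler classic version below ignores). A standard dynamic-programming sketch of the distance itself:

```java
class EditDistanceDemo {
    // classic Levenshtein distance via dynamic programming
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // deletion
                                            d[i][j - 1] + 1),  // insertion
                                   d[i - 1][j - 1] + sub);     // substitution
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // 3 edits
    }
}
```

This also explains why prefix_length helps performance: the fixed prefix narrows the set of candidate terms before any distance is computed.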
5.3.5 wildcard query
The same idea as MySQL's LIKE: in the string you can use the wildcard * (any string) and the placeholder ? (exactly one character).
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "wildcard": {
      "corpName": {
        "value": "中国??"
      }
    }
  }
}
@Test
public void findByWildcard() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.wildcardQuery("corpName","中国*"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse=client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
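The LIKE analogy can be made precise: a wildcard pattern is just a regular expression in which * becomes .* and ? becomes a single-character dot. A plain-Java sketch of that translation (all other characters are regex-escaped):

```java
import java.util.regex.Pattern;

class WildcardDemo {
    // translate a wildcard pattern (* and ?) into an equivalent regex
    static String toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*')      sb.append(".*");
            else if (c == '?') sb.append(".");
            else               sb.append(Pattern.quote(String.valueOf(c)));
        }
        return sb.toString();
    }

    static boolean matches(String value, String wildcard) {
        return value.matches(toRegex(wildcard));
    }

    public static void main(String[] args) {
        System.out.println(matches("中国移动", "中国??")); // two ?s: exactly two more characters
        System.out.println(matches("中国平安保险", "中国*"));
    }
}
```

This is also why a leading * is so expensive in ES: like a regex starting with .*, it cannot use the sorted term dictionary to narrow the candidates.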
5.3.6 range query
A range query, typically on numeric fields: constrain a field with greater-than / less-than bounds.
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "range": {
      "fee": {
        "gte": 5,   #gt means greater than; the trailing e means "or equal"
        "lte": 10   #lt means less than
      }
    }
  }
}
5.3.7 regexp query
A regular-expression query: matches content against the regex you write.
PS: prefix, wildcard and regexp queries are slow; avoid them when performance matters.
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "regexp": {
      "mobile": "180[0-9]{8}"
    }
  }
}
@Test
public void findByRegexp() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.regexpQuery("mobile","139[0-9]{8}"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse=client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
5.3.8 scroll query
ES limits from + size: their sum must not exceed 10,000.
How a normal query retrieves data:
- Step 1: analyze the user's keyword into terms.
- Step 2: look the terms up in the term dictionary and collect the ids of the matching documents.
- Step 3: fetch the documents from the individual shards.
- Step 4: sort the documents by score.
- Step 5: drop the first `from` documents from the sorted result.
- Step 6: return the page.
How a scroll query retrieves data:
- Step 1: analyze the user's keyword into terms.
- Step 2: look the terms up in the term dictionary and collect the ids of the matching documents.
- Step 3: store the document ids in an ES search context.
- Step 4: fetch `size` documents from the context; the ids of the documents that were returned are removed from the context.
- Step 5: for the next page, continue fetching directly from the context.
Because the context is a snapshot, the scroll approach is not suitable for real-time queries.
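The steps above can be sketched with a plain-Java "context": the first call materializes the sorted id list under a scroll id, and each later call drains size ids from it. Real ES additionally snapshots segment state and expires the context after the scroll timeout; the names below are made up for the demo.

```java
import java.util.*;

class ScrollDemo {
    // scrollId -> remaining doc ids held in the "search context"
    static Map<String, Deque<Integer>> contexts = new HashMap<>();

    // first search: store all matching ids, hand back the scroll id
    static String openScroll(List<Integer> matchingIds) {
        String scrollId = UUID.randomUUID().toString();
        contexts.put(scrollId, new ArrayDeque<>(matchingIds));
        return scrollId;
    }

    // each scroll call takes `size` ids; taken ids leave the context
    static List<Integer> scroll(String scrollId, int size) {
        Deque<Integer> remaining = contexts.get(scrollId);
        List<Integer> page = new ArrayList<>();
        while (remaining != null && page.size() < size && !remaining.isEmpty()) {
            page.add(remaining.poll());
        }
        return page;
    }

    public static void main(String[] args) {
        String id = openScroll(Arrays.asList(10, 9, 8, 7, 6)); // ids already sorted, e.g. by fee desc
        System.out.println(scroll(id, 2)); // first page
        System.out.println(scroll(id, 2)); // next page continues from the context
        System.out.println(scroll(id, 2)); // last, partial page
        System.out.println(scroll(id, 2)); // empty: context exhausted
    }
}
```

An empty page is exactly the loop-exit condition used in the Java scroll code below.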
#scroll=1m keeps the search context alive for 1 minute
POST /sms-logs-index/sms-logs-type/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "size": 2,
  "sort": [
    {
      "fee": {
        "order": "desc"
      }
    }
  ]
}
POST /_search/scroll
{
  "scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAADgFFmNVZ3JUR2F1VDIyYlFrTzZXenBZUmcAAAAAAAA4BxZjVWdyVEdhdVQyMmJRa082V3pwWVJnAAAAAAAAOAYWY1VnclRHYXVUMjJiUWtPNld6cFlSZw==",
  "scroll": "1m"
}
Delete the scroll context:
DELETE /_search/scroll/scroll_id
Java implementation:
@Test
public void scrollQuery() throws IOException {
    //1. create the request
    SearchRequest request = new SearchRequest(index);
    request.types(type);
    //2. specify the scroll timeout
    request.scroll(TimeValue.timeValueMinutes(1L));
    //3. specify the query condition
    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.size(4);
    builder.sort("fee", SortOrder.DESC);
    builder.query(QueryBuilders.matchAllQuery());
    request.source(builder);
    //4. get the scrollId and the first page from the response
    SearchResponse resp = client.search(request, RequestOptions.DEFAULT);
    String scrollId = resp.getScrollId();
    System.out.println("----------first page---------");
    for (SearchHit hit : resp.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }
    while (true) {
        //5. loop: create a SearchScrollRequest
        SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
        //6. extend the scrollId's time to live
        scrollRequest.scroll(TimeValue.timeValueMinutes(1L));
        //7. execute the scroll and get the next page
        SearchResponse scrollResp = client.scroll(scrollRequest, RequestOptions.DEFAULT);
        //8. if there is data, print it
        SearchHit[] hits = scrollResp.getHits().getHits();
        if (hits != null && hits.length > 0) {
            System.out.println("----------next page---------");
            for (SearchHit hit : hits) {
                System.out.println(hit.getSourceAsMap());
            }
        } else {
            //9. no data left: exit the loop
            System.out.println("----------done---------");
            break;
        }
    }
    //10. create a ClearScrollRequest
    ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
    //11. specify the scrollId
    clearScrollRequest.addScrollId(scrollId);
    //12. delete the scrollId
    ClearScrollResponse clearScrollResponse = client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
    //13. print the result
    System.out.println("scroll cleared: " + clearScrollResponse.isSucceeded());
}
5.3.9 delete-by-query
Deletes all documents matched by a term, match, or other query.
If that would remove most of an index, it is usually better to create a brand-new index and copy only the documents you want to keep into it.
POST /sms-logs-index/sms-logs-type/_delete_by_query
{
  "query": {
    "range": {
      "fee": {
        "lt": 4
      }
    }
  }
}
@Test
public void deleteByQuery() throws IOException {
DeleteByQueryRequest deleteByQueryRequest=new DeleteByQueryRequest(index);
deleteByQueryRequest.types(type);
deleteByQueryRequest.setQuery(QueryBuilders.rangeQuery("fee").lt(4));
BulkByScrollResponse scrollResponse=client.deleteByQuery(deleteByQueryRequest,RequestOptions.DEFAULT);
System.out.println(scrollResponse.toString());
}
5.4 Compound (bool) query
A compound filter combines several query conditions with boolean logic:
- must: conditions combined with must all have to match (and)
- must_not: the condition must not match (not)
- should: at least one of the conditions should match (or)
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "province": {
              "value": "北京"
            }
          }
        },
        {
          "term": {
            "province": {
              "value": "武汉"
            }
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "operatorId": {
              "value": "2"
            }
          }
        }
      ],
      "must": [
        {
          "match": {
            "smsContent": "中国"
          }
        },
        {
          "match": {
            "smsContent": "平安"
          }
        }
      ]
    }
  }
}
@Test
public void boolSearch() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
BoolQueryBuilder queryBuilder=QueryBuilders.boolQuery();
queryBuilder.should(QueryBuilders.termQuery("province","武汉"));
queryBuilder.should(QueryBuilders.termQuery("province","北京"));
queryBuilder.mustNot(QueryBuilders.termQuery("operatorId",2));
queryBuilder.must(QueryBuilders.matchQuery("smsContent","中国"));
queryBuilder.must(QueryBuilders.matchQuery("smsContent","平安"));
searchSourceBuilder.query(queryBuilder);
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse=client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
5.5 boosting query
A boosting query lets us influence the score computed for the results:
- positive: only documents matching the positive query are put into the result set
- negative: if a document also matches the negative query, its score is lowered
- negative_boost: the factor (less than 1.0) that the score of such documents is multiplied by
What influences a document's score at query time:
- the more often the search keyword appears in a document, the higher its score
- the shorter the document content, the higher the score
- the specified keyword is analyzed too, and the more of its terms are matched in the term dictionary, the higher the score
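The first two factors can be illustrated with a toy score, term frequency divided by document length, so more occurrences raise the score and longer documents lower it. This is only a caricature of Lucene's real similarity (TF-IDF/BM25), which these notes do not cover:

```java
class ScoreDemo {
    // toy relevance: occurrences of the term / number of words in the doc
    static double score(String doc, String term) {
        String[] words = doc.toLowerCase().split("\\s+");
        int hits = 0;
        for (String w : words) {
            if (w.equals(term.toLowerCase())) hits++;
        }
        return (double) hits / words.length;
    }

    public static void main(String[] args) {
        // same single hit, but the shorter document scores higher
        System.out.println(score("china news", "china"));
        System.out.println(score("china news today and tomorrow", "china"));
        // more occurrences score higher
        System.out.println(score("china china news", "china"));
    }
}
```

negative_boost then simply multiplies such a score by the given factor for documents that also match the negative query.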
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "boosting": {
      "positive": {
        "match": {
          "smsContent": "收货安装"
        }
      },
      "negative": {
        "match": {
          "smsContent": "王五"
        }
      },
      "negative_boost": 0.5
    }
  }
}
@Test
public void boostingQuery() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
BoostingQueryBuilder boostingQueryBuilder=QueryBuilders.boostingQuery(QueryBuilders.matchQuery("smsContent","收货安装"),
QueryBuilders.matchQuery("smsContent","王五")).negativeBoost(0.5f);
searchSourceBuilder.query(boostingQueryBuilder);
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse=client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}
5.6 filter query
query: computes how well each document matches your conditions, produces a score, sorts by that score, and does not cache.
filter: selects documents by your conditions without computing a score, and frequently used filters are cached.
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "corpName": "盒马鲜生"
          }
        },
        {
          "range": {
            "fee": {
              "lte": 5
            }
          }
        }
      ]
    }
  }
}
5.7 Highlight query
Highlighting shows the user's keyword in a special style so the user can see why a result was retrieved. The highlighted data is itself one field of the document; that field is returned separately in a highlight section of the response.
ES provides a highlight property at the same level as query:
- fragment_size: how many characters of the fragment to show, 100 by default
- fields: which fields to highlight
- pre_tags: the opening tag to wrap matches in
- post_tags: the closing tag
POST /sms-logs-index/sms-logs-type/_search
{
  "query": {
    "match": {
      "smsContent": "盒马"
    }
  },
  "highlight": {
    "fields": {
      "smsContent": {}
    },
    "pre_tags": "<font color='red'>",
    "post_tags": "</font>",
    "fragment_size": 10
  }
}
@Test
public void highlightSearch() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchQuery("smsContent","盒马"));
HighlightBuilder highlightBuilder=new HighlightBuilder();
highlightBuilder.field("smsContent",10).preTags("<font color='red'>").postTags("</font>");
searchSourceBuilder.highlighter(highlightBuilder);
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse=client.search(searchRequest,RequestOptions.DEFAULT);
for (SearchHit hit : searchResponse.getHits().getHits()) {
System.out.println(hit.getHighlightFields().get("smsContent"));
}
}
六、Aggregation queries
ES aggregations resemble MySQL's aggregate queries, but ES is considerably more powerful, with many kinds of statistics.
6.1 Distinct count (cardinality)
cardinality first dedupes the values of the specified field over the matching documents, then counts how many distinct values there are. (Note that the count is approximate.)
POST /sms-logs-index/sms-logs-type/_search
{
  "aggs": {
    "agg": {
      "cardinality": {
        "field": "province"
      }
    }
  }
}
@Test
public void cardinality() throws IOException {
    SearchRequest searchRequest = new SearchRequest(index);
    searchRequest.types(type);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.aggregation(AggregationBuilders.cardinality("agg").field("province"));
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    Cardinality cardinality = searchResponse.getAggregations().get("agg");
    System.out.println(cardinality.getValue());
}
6.2 Range aggregations
Numeric, date and ip ranges are supported; from includes the boundary value, to excludes it.
POST /sms-logs-index/sms-logs-type/_search
{
  "aggs": {
    "agg": {
      "range": {
        "field": "fee",
        "ranges": [
          {
            "to": 5
          },
          {
            "from": 5,   #from includes the boundary value
            "to": 10     #to excludes the boundary value
          },
          {
            "from": 10
          }
        ]
      }
    }
  }
}
POST /sms-logs-index/sms-logs-type/_search
{
  "aggs": {
    "agg": {
      "date_range": {
        "field": "createDate",
        "format": "yyyy",
        "ranges": [
          {
            "from": "2000"
          }
        ]
      }
    }
  }
}
POST /sms-logs-index/sms-logs-type/_search
{
  "aggs": {
    "agg": {
      "ip_range": {
        "field": "ipAddr",
        "ranges": [
          {
            "to": "10.126.2.8"
          }
        ]
      }
    }
  }
}
@Test
public void rangeQuery() throws IOException {
    SearchRequest searchRequest = new SearchRequest(index);
    searchRequest.types(type);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.aggregation(AggregationBuilders.range("agg").field("fee")
            .addUnboundedTo(5).addRange(5, 10).addUnboundedFrom(10));
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    Range agg = searchResponse.getAggregations().get("agg");
    for (Range.Bucket bucket : agg.getBuckets()) {
        Object from = bucket.getFrom();
        Object to = bucket.getTo();
        long docCount = bucket.getDocCount();
        System.out.println(bucket.getKeyAsString() + ": " + from + " - " + to + ", " + docCount + " docs");
    }
}
6.3 Stats aggregation (extended_stats)
Returns statistics for the specified field: the max, min, average, sum of squares, and so on.
POST /sms-logs-index/sms-logs-type/_search
{
  "aggs": {
    "agg": {
      "extended_stats": {
        "field": "fee"
      }
    }
  }
}
@Test
public void extendedStats() throws IOException {
SearchRequest searchRequest=new SearchRequest(index);
searchRequest.types(type);
SearchSourceBuilder searchSourceBuilder=new SearchSourceBuilder();
searchSourceBuilder.aggregation(AggregationBuilders.extendedStats("agg").field("fee"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse=client.search(searchRequest,RequestOptions.DEFAULT);
ExtendedStats extendedStats=searchResponse.getAggregations().get("agg");
System.out.println(extendedStats.getMax());
System.out.println(extendedStats.getMin());
}