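bulk
bulk script
A bulk request performs a series of document create, update, and delete operations in a single request, which reduces the number of network round trips.
Syntax: each operation is one action line, followed by a source line for every action except delete.
Example: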
POST _bulk
{"delete":{"_index":"person","_id":4}}
{"create":{"_index":"person","_id":8}}
{"name" : "李四8","age" : 44,"address" : "北京海淀"}
{"update":{"_index":"person","_id":2}}
{"doc":{"name" : "李四儿"}}
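The line-oriented layout of the request body above can be sketched in plain Java. This is only an illustration: `BulkBodySketch` and its methods are hypothetical names, not part of any Elasticsearch client, and the sketch assumes index names and ids contain no characters that need JSON escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that assembles a _bulk request body by hand.
class BulkBodySketch {
    private final List<String> lines = new ArrayList<>();

    // Builds the action line, e.g. {"delete":{"_index":"person","_id":"4"}}
    private static String action(String op, String index, String id) {
        return "{\"" + op + "\":{\"_index\":\"" + index + "\",\"_id\":\"" + id + "\"}}";
    }

    BulkBodySketch delete(String index, String id) {
        lines.add(action("delete", index, id)); // delete has no source line
        return this;
    }

    BulkBodySketch create(String index, String id, String sourceJson) {
        lines.add(action("create", index, id));
        lines.add(sourceJson); // the document itself
        return this;
    }

    BulkBodySketch update(String index, String id, String partialDocJson) {
        lines.add(action("update", index, id));
        lines.add("{\"doc\":" + partialDocJson + "}"); // partial document wrapped in "doc"
        return this;
    }

    String build() {
        StringBuilder body = new StringBuilder();
        for (String line : lines) {
            body.append(line).append('\n'); // every line, including the last, ends with a newline
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String body = new BulkBodySketch()
                .delete("person", "4")
                .create("person", "8", "{\"name\":\"李四8\",\"age\":44,\"address\":\"北京海淀\"}")
                .update("person", "2", "{\"name\":\"李四儿\"}")
                .build();
        System.out.print(body);
    }
}
```

In practice the Java client builds this body itself; the sketch only makes the one-action-line-plus-optional-source-line structure visible.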
bulk Java API
Add the following to the test class:
/**
* Bulk operation
* @throws IOException
*/
@Test
public void testBulk() throws IOException {
// POST _bulk
// {"delete":{"_index":"person","_id":4}}
// {"create":{"_index":"person","_id":8}}
// {"name" : "李四8","age" : 44,"address" : "北京海淀"}
// {"update":{"_index":"person","_id":2}}
// {"doc":{"name" : "李四儿"}}
// 1. Create the request
BulkRequest bulkRequest=new BulkRequest();
// Delete
DeleteRequest deleteRequest=new DeleteRequest("person","4");
bulkRequest.add(deleteRequest);
// Add (an index request; call .create(true) on it for the create semantics of the script)
IndexRequest indexRequest = new IndexRequest("person").type("_doc").id("8");
Map<String, Object> sourceMap = new HashMap<>();
sourceMap.put("name", "李四8");
sourceMap.put("age", 44);
sourceMap.put("address", "北京海淀");
indexRequest.source(sourceMap);
bulkRequest.add(indexRequest);
// Update
Map<String, Object> sourceMap2 = new HashMap<>();
sourceMap2.put("name", "李四儿");
UpdateRequest updateRequest=new UpdateRequest("person","2");
updateRequest.doc(sourceMap2);
bulkRequest.add(updateRequest);
// 2. Execute the request
BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT);
// 3. Read the result; the overall status can be OK even when individual items failed
RestStatus status = bulkResponse.status();
System.out.println(status);
if (bulkResponse.hasFailures()) {
System.out.println(bulkResponse.buildFailureMessage());
}
}
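Importing data
Importing data: preparation
4.1 Requirement
Import the data in the database's Goods table into Elasticsearch.
4.2 Steps
Create the goods index
Query the Goods table
Bulk-add the rows to Elasticsearch
4.3 Preparation
1. Import the data into MySQL to simulate the business data.
2. Create the index in Elasticsearch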
PUT goods
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
},
"price": {
"type": "double"
},
"createTime": {
"type": "date"
},
"categoryName": {
"type": "keyword"
},
"brandName": {
"type": "keyword"
},
"spec": {
"type": "object"
},
"saleNum": {
"type": "integer"
},
"stock": {
"type": "integer"
}
}
}
}
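- title: product title
- price: product price
- createTime: creation time
- categoryName: category name, e.g. 家电 (home appliances), 手机 (mobile phones)
- brandName: brand name, e.g. 华为 (Huawei), 小米 (Xiaomi)
- spec: product specification, e.g. spec: {"屏幕尺寸": "5寸", "内存大小": "128G"}
- saleNum: sales volume
- stock: stock quantity
3. Add one document as a test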
POST goods/_doc/1
{
"title":"小米手机",
"price":1000,
"createTime":"2019-12-01",
"categoryName":"手机",
"brandName":"小米",
"saleNum":3000,
"stock":10000,
"spec":{
"网络制式":"移动4G",
"屏幕尺寸":"4.5"
}
}
Importing data: implementation
Code
1. Add the dependencies
<!--mybatis-->
<dependency>
<groupId>org.mybatis.spring.boot</groupId>
<artifactId>mybatis-spring-boot-starter</artifactId>
<version>2.1.0</version>
</dependency>
<!-- MySQL driver -->
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
</dependency>
2. Add the following to the configuration file
# datasource
spring:
  datasource:
    url: jdbc:mysql:///es11?serverTimezone=UTC
    username: root
    password: root
    driver-class-name: com.mysql.cj.jdbc.Driver
# mybatis
mybatis:
  mapper-locations: classpath:mapper/*Mapper.xml # path to the mapper XML files
  type-aliases-package: com.itheima.es11.domain
3. Create the entity class
com.itheima.es11.domain
public class Goods {
private int id;
private String title;
private double price;
private int stock;
private int saleNum;
private Date createTime;
private String categoryName;
private String brandName;
private Map spec;
@JSONField(serialize = false) // skip this field when serializing to JSON
private String specStr; // receives the raw JSON string from the database, e.g. "{}"
// getters and setters ...
}
4. Create the mapper
com.itheima.es11.mapper
@Repository
@Mapper
public interface GoodsMapper {
/**
* Query all goods
*/
public List<Goods> findAll();
}
5. XML mapper file
Create mapper/GoodsMapper.xml under resources
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE mapper PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN" "http://mybatis.org/dtd/mybatis-3-mapper.dtd">
<mapper namespace="com.itheima.es11.mapper.GoodsMapper">
<select id="findAll" resultType="goods">
select
`id` ,
`title` ,
`price` ,
`stock` ,
`saleNum` ,
`createTime` ,
`categoryName`,
`brandName` ,
`spec` as specStr
from goods
</select>
</mapper>
6. Test class: test method
@Test
public void importData() throws IOException {
// 1. Query all goods
List<Goods> all = goodsMapper.findAll();
// System.out.println(all.size());
// 2. Bulk-import into Elasticsearch
// 2.1 Create the request
BulkRequest bulkRequest=new BulkRequest();
for (Goods goods : all) {
// Convert the spec JSON string into a Map so it is indexed as an object
goods.setSpec(JSON.parseObject(goods.getSpecStr(),Map.class));
// Add one index request per row
IndexRequest indexRequest = new IndexRequest("goods").type("_doc").id(goods.getId()+"");
indexRequest.source(JSON.toJSONString(goods),XContentType.JSON);
bulkRequest.add(indexRequest);
}
// 2.2 Execute the request
BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT);
// 2.3 Read the result
RestStatus status = bulkResponse.status();
System.out.println(status);
}
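The import above puts every row into one bulk request, which can become large for a big table. One common approach is to split the rows into fixed-size batches and send one bulk request per batch. The partitioning step can be sketched in plain Java; `BatchSketch`, `partition`, and the batch size of 1000 are assumptions made for illustration, not values from these notes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list into consecutive batches of at most batchSize items.
class BatchSketch {
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end))); // copy so batches are independent
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 2500; i++) {
            ids.add(i);
        }
        // Each batch would become one BulkRequest in the import method.
        for (List<Integer> batch : partition(ids, 1000)) {
            System.out.println("batch of " + batch.size());
        }
    }
}
```

With such a helper, the loop in importData would build and execute one BulkRequest per batch instead of a single request for the whole table.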
If the import fails with an error saying the limit of total fields in the index has been exceeded:
Solution:
PUT /goods/_settings
{
"settings": {
"index.mapping.total_fields.limit": 2000
}
}
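Importing data in detail (for reference)
Optional; worth a glance if you have the capacity.
Key point: spec is mapped as object, so if the value is supplied as the string "{}" rather than a JSON object, Elasticsearch cannot parse it.
Searches (key topic)
matchAll script
Example: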
GET goods/_search
{
"query": {
"match_all": {}
}
}
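Notes:
1. The GET request carries a request body.
2. By default, 10 documents are returned.
3. Caching strategy.
4. Response fields:
- took: time spent on this operation, in milliseconds.
- timed_out: whether the request timed out.
- _shards: which shards were searched by this operation.
- hits: the matching records.
- hits.total: total number of documents that match the condition.
- hits.hits: the top N documents by relevance.
- hits.max_score: the highest relevance score among the matched documents.
- _score: each document has a relevance score; results are sorted by it in descending order.
- _source: the original content of the document.
Paginated query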
GET goods/_search
{
"query": {
"match_all": {}
},
"from": 0,
"size": 20
}
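The from value in the paginated query is a zero-based offset, not a page number. A minimal sketch of the usual conversion follows; `PageSketch` and its method names are hypothetical, and page numbers are assumed to start at 1.

```java
// Hypothetical helper for converting page numbers into from/size values.
class PageSketch {
    // Offset of the first document on the given 1-based page.
    static int from(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be at least 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    // Number of pages needed to show totalHits documents (rounds up).
    static int totalPages(long totalHits, int pageSize) {
        if (totalHits < 0 || pageSize < 1) {
            throw new IllegalArgumentException("totalHits must be >= 0 and pageSize >= 1");
        }
        return (int) ((totalHits + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        int pageSize = 20;
        System.out.println("page 1 starts at from=" + from(1, pageSize));
        System.out.println("page 3 starts at from=" + from(3, pageSize));
        System.out.println("45 hits need " + totalPages(45, pageSize) + " pages");
    }
}
```

The computed offset is what would be passed as from, with the page size passed as size.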
matchAll Java API
/**
* MatchAll
*/
@Test
public void testMatchAll() throws IOException {
// GET goods/_search
// {
// "query": {
// "match_all": {}
// },
// "from": 0,
// "size": 20
// }
// 1. Create the request
SearchRequest searchRequest = new SearchRequest("goods");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Query condition
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
// Pagination
searchSourceBuilder.from(0);
searchSourceBuilder.size(20);
searchRequest.source(searchSourceBuilder);
// 2. Execute the request
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// 3. Read the result
TimeValue took = searchResponse.getTook();
SearchHits hits = searchResponse.getHits();
List<Goods> list = new ArrayList<>();
for (SearchHit hit : hits) {
// Get the document as a JSON string
String sourceAsString = hit.getSourceAsString();
// Convert it to a Java object
Goods goods = JSON.parseObject(sourceAsString, Goods.class);
list.add(goods);
}
// Print the list
for (Goods goods : list) {
System.out.println(goods);
}
}
term query
The query condition is not analyzed (no tokenization); it is compared with the indexed terms as is.
Script:
GET goods/_search
{
"query": {
"term": {
"brandName": {
"value": "小米"
}
}
}
}
Java code:
/**
* termQuery
*/
@Test
public void testTermQuery() throws IOException {
// GET goods/_search
// {
// "query": {
// "term": {
// "brandName": {
// "value": "小米"
// }
// }
// }
// }
// 1. Create the request
SearchRequest searchRequest = new SearchRequest("goods");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Query condition
searchSourceBuilder.query(QueryBuilders.termQuery("brandName","小米"));
searchRequest.source(searchSourceBuilder);
// 2. Execute the request
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// 3. Read the result
TimeValue took = searchResponse.getTook();
SearchHits hits = searchResponse.getHits();
List<Goods> list = new ArrayList<>();
for (SearchHit hit : hits) {
// Get the document as a JSON string
String sourceAsString = hit.getSourceAsString();
// Convert it to a Java object
Goods goods = JSON.parseObject(sourceAsString, Goods.class);
list.add(goods);
}
// Print the list
for (Goods goods : list) {
System.out.println(goods);
}
}
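matchQuery
Flow:
- The query condition is analyzed (tokenized).
- Each resulting token is then matched exactly against the indexed terms.
- By default the results are combined as a union (OR).
Script: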
GET goods/_search
{
"query": {
"match": {
"title": "华为手机"
}
}
}
GET goods/_search
{
"query": {
"match": {
"title": {
"query": "小米手机",
"operator": "or"
}
}
}
}
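Try both or and and as the operator to see the difference.
Java API:
Everything else is unchanged; only the changed lines are shown.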
searchSourceBuilder.query(QueryBuilders.matchQuery("title","华为手机"));
searchSourceBuilder.query(QueryBuilders.matchQuery("title","华为手机").operator(Operator.OR));
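The difference between the two operators can be modeled in plain Java as set membership. This is a simplified sketch, not Elasticsearch code: it ignores scoring, and it assumes the analyzer splits 小米手机 into the two tokens 小米 and 手机. `MatchOperatorSketch` and `matches` are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of how the match operator combines query tokens.
class MatchOperatorSketch {
    // requireAll = true models operator "and"; false models operator "or".
    static boolean matches(Set<String> docTerms, List<String> queryTokens, boolean requireAll) {
        if (requireAll) {
            return docTerms.containsAll(queryTokens); // every token must be an indexed term
        }
        for (String token : queryTokens) {
            if (docTerms.contains(token)) {
                return true; // one matching token is enough
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> queryTokens = Arrays.asList("小米", "手机"); // assumed tokenization of 小米手机
        Set<String> xiaomiPhone = new HashSet<>(Arrays.asList("小米", "手机"));
        Set<String> huaweiPhone = new HashSet<>(Arrays.asList("华为", "手机"));

        System.out.println("or,  小米 phone: " + matches(xiaomiPhone, queryTokens, false));
        System.out.println("or,  华为 phone: " + matches(huaweiPhone, queryTokens, false));
        System.out.println("and, 小米 phone: " + matches(xiaomiPhone, queryTokens, true));
        System.out.println("and, 华为 phone: " + matches(huaweiPhone, queryTokens, true));
    }
}
```

Under these assumptions, or matches both documents because each contains 手机, while and matches only the document that contains both tokens.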
Pattern (fuzzy) queries: script
Observation:
GET goods/_search
{
"query": {
"match": {
"title": "象"
}
}
}
No results.
Next query:
GET goods/_search
{
"query": {
"match": {
"title": "象牙白"
}
}
}
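This one returns results. Why? The inverted index contains the term 象牙白, but it has no documents associated with the single-character term 象.
This leads to pattern queries:
- wildcard query: the query condition is not analyzed; it is matched against the indexed terms and may contain the wildcards ? (any single character) and * (zero or more characters).
GET goods/_search
{
"query": {
"wildcard": {
"title": {
"value": "象??"
}
}
}
}
- regexp query: matches terms against a regular expression.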
GET goods/_search
{
"query": {
"regexp": {
"title": "\\w+(.)*"
}
}
}
All the data that starts with a letter is returned.
- prefix query: matches terms by prefix.
GET goods/_search
{
"query": {
"prefix": {
"brandName": {
"value": "三"
}
}
}
}
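The wildcard semantics described above (? for exactly one character, * for zero or more) can be sketched in plain Java by translating the pattern into a java.util.regex pattern and testing it against a whole term. This only models the matching rule; it is not how Elasticsearch implements the query, and `WildcardSketch` and its methods are hypothetical names.

```java
import java.util.regex.Pattern;

// Plain-Java model of wildcard matching against a single indexed term.
class WildcardSketch {
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        wildcard.codePoints().forEach(cp -> {
            if (cp == '?') {
                regex.append('.');   // exactly one character
            } else if (cp == '*') {
                regex.append(".*");  // zero or more characters
            } else {
                // everything else is matched literally
                regex.append(Pattern.quote(new String(Character.toChars(cp))));
            }
        });
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // The pattern has to cover the whole term, not just a part of it.
    static boolean matchesTerm(String wildcard, String term) {
        return toPattern(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesTerm("象??", "象牙白")); // two characters follow 象
        System.out.println(matchesTerm("象??", "象牙"));   // only one character follows 象
        System.out.println(matchesTerm("象*", "象"));      // * may match nothing
    }
}
```

Because the pattern is compared with whole terms, the result depends on which terms the analyzer put into the inverted index for the field.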
Pattern (fuzzy) queries: Java API
Test class
- wildcard query
searchSourceBuilder.query(QueryBuilders.wildcardQuery("title", "象??"));
- regexp query
searchSourceBuilder.query(QueryBuilders.regexpQuery("title","\\w+(.)*"));
- prefix query
searchSourceBuilder.query(QueryBuilders.prefixQuery("brandName","三"));
Range query
range query: finds documents whose value in the given field falls within the given range.
Script:
GET goods/_search
{
"query": {
"range": {
"price": {
"gte": 2000,
"lte": 3000
}
}
}
}
Code:
searchSourceBuilder.query(QueryBuilders.rangeQuery("price").gte(2000).lte(3000));
Sorting:
Script:
GET goods/_search
{
"query": {
"range": {
"price": {
"gte": 2000,
"lte": 3000
}
}
},
"sort": [
{
"price": {
"order": "desc"
}
}
]
}
Code:
// Sort
searchSourceBuilder.sort("price", SortOrder.DESC);
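queryString
Flow:
- The query condition is analyzed (tokenized).
- Each resulting token is then matched exactly against the indexed terms.
- By default the results are combined as a union (OR).
- Multiple fields can be queried at once.
Script: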
GET goods/_search
{
"query": {
"query_string": {
"fields": ["title","categoryName","brandName"],
"query": "华为手机"
}
}
}
GET goods/_search
{
"query": {
"simple_query_string": {
"fields": ["title","categoryName","brandName"],
"query": "华为 AND 手机"
}
}
}
simple_query_string does not support connectives such as AND; the query is treated as the three words 华为, AND, and 手机.
Code:
searchSourceBuilder.query(QueryBuilders.queryStringQuery("华为手机").field("title").field("categoryName" ).field("brandName").defaultOperator(Operator.OR));
searchSourceBuilder.query(QueryBuilders.simpleQueryStringQuery("华为手机").field("title").field("categoryName" ).field("brandName"));
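Boolean query: script
1. Concept:
Combines multiple query conditions.
2. Clause types:
- must (and): the condition must hold
- must_not (not): the condition must not hold
- should (or): the condition may hold
- filter: the condition must hold; faster than must because no score is computed. Typical uses: a range of 1000-2000, a term of 华为
3. Script: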
GET goods/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"brandName": {
"value": "华为"
}
}
},
{
"match": {
"title": "电信"
}
}
],
"must_not": [
{
"term": {
"brandName": {
"value": "小米"
}
}
}
],
"should": [
{
"term": {
"brandName": {
"value": "小米"
}
}
}
],
"filter": {
"match": {
"title": "白"
}
}
}
}
}
Boolean query: Java API
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("brandName", "华为");
MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("title", "电信");
boolQueryBuilder.must(termQueryBuilder);
boolQueryBuilder.must(matchQueryBuilder);
TermQueryBuilder termQueryBuilder1 = QueryBuilders.termQuery("brandName", "小米");
boolQueryBuilder.mustNot(termQueryBuilder1);
boolQueryBuilder.should(termQueryBuilder1);
MatchQueryBuilder matchQueryBuilder1 = QueryBuilders.matchQuery("title", "白");
boolQueryBuilder.filter(matchQueryBuilder1);
searchSourceBuilder.query(boolQueryBuilder);
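How the four clause types decide whether a document matches can be modeled in plain Java with predicates. This is a rough sketch, not Elasticsearch code: scoring is ignored, `BoolClauseSketch` is a hypothetical name, and a simple substring test stands in for the match query.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of bool clause semantics (matching only, no scores).
class BoolClauseSketch {
    static <T> boolean matches(T doc,
                               List<Predicate<T>> must,
                               List<Predicate<T>> filter,
                               List<Predicate<T>> mustNot,
                               List<Predicate<T>> should) {
        for (Predicate<T> p : must) {
            if (!p.test(doc)) return false;   // every must clause has to hold
        }
        for (Predicate<T> p : filter) {
            if (!p.test(doc)) return false;   // same as must, but would not affect the score
        }
        for (Predicate<T> p : mustNot) {
            if (p.test(doc)) return false;    // no must_not clause may hold
        }
        // With a must or filter clause present, should is optional (it would only raise the score).
        if (!must.isEmpty() || !filter.isEmpty() || should.isEmpty()) {
            return true;
        }
        // Without must and filter, at least one should clause has to hold.
        for (Predicate<T> p : should) {
            if (p.test(doc)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("brandName", "华为");
        doc.put("title", "华为 电信 手机 白");

        Predicate<Map<String, String>> brandIsHuawei = d -> "华为".equals(d.get("brandName"));
        Predicate<Map<String, String>> brandIsXiaomi = d -> "小米".equals(d.get("brandName"));
        Predicate<Map<String, String>> titleHasTelecom = d -> d.get("title").contains("电信");
        Predicate<Map<String, String>> titleHasWhite = d -> d.get("title").contains("白");

        boolean hit = matches(doc,
                Arrays.asList(brandIsHuawei, titleHasTelecom),
                Collections.singletonList(titleHasWhite),
                Collections.singletonList(brandIsXiaomi),
                Collections.singletonList(brandIsXiaomi));
        System.out.println(hit);
    }
}
```

The model mirrors the clauses used in the script: must and filter narrow the result, must_not excludes, and should becomes mandatory only when no must or filter clause is present.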
Aggregation query: script
Metric aggregations: the counterpart of MySQL's aggregate functions, such as max, min, avg, and sum.
Script:
GET goods/_search
{
"query": {
"match": {
"title": "华为"
}
},
"aggs": {
"max_price": {
"max": {
"field": "price"
}
}
}
}
Bucket aggregations: the counterpart of MySQL's GROUP BY. Do not group on a text field; the request fails.
Script:
GET goods/_search
{
"query": {
"match": {
"title": "手机"
}
},
"aggs": {
"group_by_brands": {
"terms": {
"field": "brandName",
"size": 10
}
}
}
}
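The two kinds of aggregation have direct analogues in the Java streams API, which may help when reading the response: a max metric is a reduction over one field, and a terms bucket is a group-by with a document count per key. The sketch below works on in-memory lists and is only an analogy; `AggregationSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.OptionalDouble;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory analogy for a max metric and a terms bucket aggregation.
class AggregationSketch {
    // Analogue of the max metric: the largest value, or empty when there are no documents.
    static OptionalDouble maxPrice(List<Double> prices) {
        return prices.stream().mapToDouble(Double::doubleValue).max();
    }

    // Analogue of a terms bucket: one entry per distinct key with its document count.
    static Map<String, Long> countByBrand(List<String> brandNames) {
        return brandNames.stream()
                .collect(Collectors.groupingBy(b -> b, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(maxPrice(Arrays.asList(1000.0, 2999.0, 1999.0)));
        System.out.println(countByBrand(Arrays.asList("华为", "小米", "华为")));
    }
}
```

Each map entry corresponds to one bucket in the response: the key is the bucket key and the count is its doc_count.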
Aggregation query: Java API
/**
* aggsQuery
*/
@Test
public void testAggsQuery() throws IOException {
// GET goods/_search
// {
// "query": {
// "match": {
// "title": "手机"
// }
// },
// "aggs": {
// "group_by_brands": {
// "terms": {
// "field": "brandName",
// "size": 10
// }
// }
// }
// }
// 1. Create the request
SearchRequest searchRequest = new SearchRequest("goods");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Query condition
searchSourceBuilder.query(QueryBuilders.matchQuery("title","手机"));
// Aggregation
TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders.terms("group_by_brands").field("brandName").size(10);
searchSourceBuilder.aggregation(termsAggregationBuilder);
// Sort
searchSourceBuilder.sort("price", SortOrder.DESC);
searchRequest.source(searchSourceBuilder);
// 2. Execute the request
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// 3. Read the result
SearchHits hits = searchResponse.getHits();
List<Goods> list = new ArrayList<>();
for (SearchHit hit : hits) {
// Get the document as a JSON string
String sourceAsString = hit.getSourceAsString();
// Convert it to a Java object
Goods goods = JSON.parseObject(sourceAsString, Goods.class);
list.add(goods);
}
// Print the list
// for (Goods goods : list) {
// System.out.println(goods);
// }
// Read the aggregation result
Aggregations aggregations = searchResponse.getAggregations();
Map<String, Aggregation> asMap = aggregations.getAsMap();
Terms group_by_brands = (Terms) asMap.get("group_by_brands");
List<? extends Terms.Bucket> buckets = group_by_brands.getBuckets();
for (Terms.Bucket bucket : buckets) {
String keyAsString = bucket.getKeyAsString();
System.out.println("keyAsString:"+keyAsString);
long docCount = bucket.getDocCount();
System.out.println("docCount:"+docCount);
System.out.println("===============================");
}
}
Highlighting: script
The three elements of highlighting:
- the field to highlight
- the prefix tag
- the suffix tag
Script:
GET goods/_search
{
"query": {
"match": {
"title": "手机"
}
},
"highlight": {
"fields": {
"title": {
"pre_tags":"<font color='red'>",
"post_tags": "</font>"
}
}
}
}
Look closely at the result.
Highlighting: Java API
/**
* highlightQuery
*/
@Test
public void testHighlightQuery() throws IOException {
// GET goods/_search
// {
// "query": {
// "match": {
// "title": "手机"
// }
// },
// "highlight": {
// "fields": {
// "title": {
// "pre_tags":"<font color='red'>",
// "post_tags": "</font>"
// }
// }
// }
// }
// 1. Create the request
SearchRequest searchRequest = new SearchRequest("goods");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Query condition
searchSourceBuilder.query(QueryBuilders.matchQuery("title","手机"));
// Highlighting
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.field("title");
highlightBuilder.preTags("<font color='red'>");
highlightBuilder.postTags("</font>");
searchSourceBuilder.highlighter(highlightBuilder);
searchRequest.source(searchSourceBuilder);
// 2. Execute the request
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// 3. Read the result
SearchHits hits = searchResponse.getHits();
List<Goods> list = new ArrayList<>();
for (SearchHit hit : hits) {
// Get the document as a JSON string
String sourceAsString = hit.getSourceAsString();
// Convert it to a Java object
Goods goods = JSON.parseObject(sourceAsString, Goods.class);
// Get the highlight; the field is absent when nothing in title was highlighted
Map<String, HighlightField> highlightFields = hit.getHighlightFields();
HighlightField highlightField = highlightFields.get("title");
if (highlightField != null && highlightField.getFragments().length > 0) {
// Replace title with the highlighted fragment
goods.setTitle(highlightField.getFragments()[0].toString());
}
list.add(goods);
}
// Print the list
for (Goods goods : list) {
System.out.println(goods);
}
}
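Rebuilding an index
Requirement:
As business requirements change, the structure of an index may need to change.
Once an Elasticsearch index is created, fields can be added but existing fields cannot be changed, because changing a field would require rebuilding the inverted index and affect the internal cache structures, which is too expensive.
In that case, create a new index and import the data from the old index into it.
Note: index names must be lowercase.
Steps:
1. Create the index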
PUT student_v1
{
"mappings": {
"properties": {
"birthday":{
"type": "date"
}
}
}
}
2. Insert data
PUT student_v1/_doc/1
{
"birthday":"1992-01-07"
}
3. Try to change the mapping? It fails
PUT student_v1
{
"mappings": {
"properties": {
"birthday":{
"type": "text"
}
}
}
}
4. Create another index
PUT student_v2
{
"mappings": {
"properties": {
"birthday":{
"type": "text"
}
}
}
}
5. Transfer the data
POST _reindex
{
"source": {
"index": "student_v1"
},
"dest": {
"index": "student_v2"
}
}
6. Insert new data into v2
PUT student_v2/_doc/2
{
"birthday":"1992年01月07日"
}
7. Index alias
7.1 Delete v1
DELETE student_v1
7.2 Give v2 the alias student_v1
POST student_v2/_alias/student_v1
7.3 Query the data
GET student_v1/_doc/1
7.4 View the index information
GET student_v1
#1 Operations: create user_v1
PUT user_v1
{
"mappings": {
"properties": {
"name":{
"type": "text"
}
}
}
}
#2 Add the alias
POST user_v1/_alias/user
#3 Development: create, read, update, and delete through user
PUT /user/_doc/1
{
"name":"张三"
}
# Requirement: change the field type
#4 Create user_v2
PUT user_v2
{
"mappings": {
"properties": {
"name":{
"type": "keyword"
}
}
}
}
#5 Transfer the data: old index name, new index name, bulk API.
POST _reindex
{
"source": {
"index": "user_v1"
},
"dest": {
"index": "user_v2"
}
}
#6 Set the alias user on user_v2
POST user_v2/_alias/user
#7 Delete user_v1
DELETE user_v1