I had been using Elasticsearch for a while when, one day, index creation in production suddenly started failing. September had arrived, and generating the new index name for the month produced this error:
ElasticsearchStatusException[Elasticsearch exception [type=illegal_argument_exception, reason=mapper [xxx] cannot be changed from type [integer] to [long]]]
I checked the git history and found no trace of the field type ever being changed.
So I ran the code locally and triggered it from multiple threads to simulate concurrent index creation. Sure enough, the same error appeared.
1. Reproducing the problem
After repeated simulation runs, the errors fell into the following kinds:
- 1. A field type conflict on the index (integer → long)
Exception in thread "dta-async-thread13" ElasticsearchStatusException[Elasticsearch exception [type=illegal_argument_exception, reason=mapper [myIndexFeild] cannot be changed from type [integer] to [long]]]
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:2053)
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:2030)
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1777)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1734)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1696)
at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:928)
at org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate.index(ElasticsearchRestTemplate.java:694)
- 2. The same field type conflict, but with the types reversed (long → integer)
Exception in thread "dta-async-thread2" ElasticsearchStatusException[Elasticsearch exception [type=illegal_argument_exception, reason=mapper [myIndexField] cannot be changed from type [long] to [integer]]]
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:2053)
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:2030)
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1777)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1734)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1696)
at org.elasticsearch.client.IndicesClient.putMapping(IndicesClient.java:295)
at org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate.putMapping(ElasticsearchRestTemplate.java:292)
at org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate.putMapping(ElasticsearchRestTemplate.java:270)
at org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate.putMapping(ElasticsearchRestTemplate.java:265)
at tech.dta.data.utils.IndexManage.createIndexForEsEntity(IndexManage.java:45)
- 3. Index name not found.
My index names roll over by time, but the root cause of the missing index was not a failure during index creation; the error occurred while inserting data.
2. Source code
My code did have a flaw: it behaved correctly in the ordinary case but broke under concurrent, multi-threaded access.
The original code follows.
Entity class:
@Data
@Document(indexName = "my_index_result_" +
        "#{ T(com.entity.esdo.MyIndexResult).dateStr() }", createIndex = false)
@NoArgsConstructor
@AllArgsConstructor
@Alias(name = IndexNameConstant.MY_RESULT_INDEX)
public class MyIndexResult implements Serializable {
    static final long serialVersionUID = 2255410541905094030L;
    @Id
    @Field(type = FieldType.Long)
    @JsonSerialize(using = ToStringSerializer.class)
    long id;
    @Field(type = FieldType.Keyword)
    String sn;
    @Field(type = FieldType.Integer)
    int myIndexFeild;
    // For testing, the timestamp is truncated to the second
    public static String dateStr() {
        return ServiceUtils.getNowFullTime().substring(0, 14);
    }
}
The index creation method:
private final ElasticsearchRestTemplate esRestTemplate;
// ...
public String createIndexForEsEntity(Class clazz, String aliasName) {
    ElasticsearchPersistentEntity persistentEntity = esRestTemplate.getPersistentEntityFor(clazz);
    AliasQuery aliasQuery = new AliasQuery();
    if (!esRestTemplate.indexExists(clazz)) {
        synchronized (this) {
            // Check again whether the index exists, to avoid creating it twice
            String indexName = persistentEntity.getIndexName();
            boolean b = esRestTemplate.indexExists(indexName);
            if (!b) {
                log.info("to do create a index :{}", indexName);
                // Set the index lifecycle policy at creation time
                String setting = String.format(settingStr, DELELTE_AFTER_4_MONTH);
                boolean flag1 = esRestTemplate.createIndex(clazz, setting);
                log.info("index create:{}", flag1);
                boolean flag2 = esRestTemplate.putMapping(clazz);
                log.info("mapping create:{}", flag2);
                // Add the alias
                aliasQuery.setAliasName(aliasName);
                aliasQuery.setIndexName(indexName);
                boolean flag3 = esRestTemplate.addAlias(aliasQuery);
                log.info("alias add:{}", flag3);
                // Verify that the index was actually created
                int i = 0;
                while (i++ < 5) {
                    boolean flag = esRestTemplate.indexExists(indexName);
                    if (!flag) {
                        log.error("Index not created yet; sleeping before checking again.");
                        try {
                            Thread.sleep(500);
                        } catch (Exception e) {
                            log.error("Sleep interrupted:", e);
                        }
                    } else {
                        log.info("Index exists; exiting the loop.");
                        break;
                    }
                }
            } else {
                log.info("{} index already created!", indexName);
            }
        }
    }
    return persistentEntity.getIndexName();
}
I check whether a new rolling index needs to be created at data-insertion time:
// [This is where the problem lies] This is the Spring Data repository interface for the index
private final MyIndexResultDao resultDao;

@Override
public Response addMyIndexResult(MyIndexResult result) {
    // Create the rolling index if necessary
    indexManage.createIndexForEsEntity(MyIndexResult.class, IndexNameConstant.MY_RESULT_INDEX);
    Long generateId = IDGenerator.SNOW_FLAKE.generate();
    result.setId(generateId);
    // ---- The problem is right here
    resultDao.save(result);
    return Response.buildSuccess();
}
The CRUD repository for MyIndexResult, extending the interface provided by Spring Data:
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
public interface MyIndexResultDao extends ElasticsearchRepository<MyIndexResult, Long> {
}
3. Analysis
Analysis of problem 3:
At first, I suspected that the synchronized block in the index creation method was not taking effect, so that one thread was already using an index before another thread had finished creating it.
I moved synchronized onto the method declaration, but the error persisted.
Stepping through with a debugger, I found that org.springframework.data.elasticsearch.core.mapping.SimpleElasticsearchPersistentEntity#getIndexName
is not only called at String indexName = persistentEntity.getIndexName();
it was still being invoked even after index creation had completed.
So I switched to a single-threaded run, traced the code, and found the surprise: getIndexName() was indeed being called more than once.
The extra call site turned out to be the data-insertion code: resultDao.save(result);
When creating indices I always used the ElasticsearchRestTemplate object, but for inserting data I used the CRUD repository interface.
It turns out that save resolves the index name again. Since my index names were precise to the second, the index resolved at save time no longer matched the one I had just created. Hence problem 3.
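This race is easy to demonstrate in isolation. The sketch below does not call the real Spring Data SpEL machinery; it merely mimics dateStr() with a second-granularity timestamp (the instants are illustrative) to show that a name resolved at save time, one second after creation, no longer matches:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative sketch: a per-second index name is re-resolved on every
// operation, so createIndex() and save() can land on different names.
public class IndexNameDemo {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss").withZone(ZoneOffset.UTC);

    // Mirrors MyIndexResult.dateStr(): the name depends on when it is evaluated.
    static String indexNameAt(Instant now) {
        return "my_index_result_" + FMT.format(now);
    }

    public static void main(String[] args) {
        Instant createTime = Instant.parse("2021-09-01T10:00:00Z"); // index created here
        Instant saveTime = createTime.plusSeconds(1);               // save() one second later
        String created = indexNameAt(createTime);
        String saved = indexNameAt(saveTime);
        // save() resolves a name that was never created -> index_not_found
        System.out.println(created.equals(saved)); // false
    }
}
```

In production the name rolls by month rather than by second, so the window is rare, but it is exactly the same race at a month boundary.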
The fix
Switch data insertion to the ElasticsearchRestTemplate object as well:

IndexQuery indexQuery = new IndexQuery();
indexQuery.setIndexName(indexName);
indexQuery.setObject(result);
esRestTemplate.index(indexQuery);

Problem 3 no longer reproduced. Solved!
Analysis of problems 1 and 2:
The log line "index create:" had already been printed, so createIndex itself succeeded.
The failing frames are:
at org.elasticsearch.client.IndicesClient.putMapping(IndicesClient.java:295)
and at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1777)
which means the failing statement must be:
boolean flag2 = esRestTemplate.putMapping(clazz);
Tracing into the client, we find the code where the error is raised:
// From the ES 7.17 client source
private <Req, Resp> Resp internalPerformRequest(Req request,
        CheckedFunction<Req, Request, IOException> requestConverter,
        RequestOptions options,
        CheckedFunction<Response, Resp, IOException> responseConverter,
        Set<Integer> ignores) throws IOException {
    Request req = requestConverter.apply(request);
    req.setOptions(options);
    Response response;
    try {
        response = client.performRequest(req);
    } catch (ResponseException e) {
        if (ignores.contains(e.getResponse().getStatusLine().getStatusCode())) {
            try {
                return responseConverter.apply(e.getResponse());
            } catch (Exception innerException) {
                // the exception is ignored as we now try to parse the response as an error.
                // this covers cases like get where 404 can either be a valid document not found response,
                // or an error for which parsing is completely different. We try to consider the 404 response as a valid one
                // first. If parsing of the response breaks, we fall back to parsing it as an error.
                throw parseResponseException(e);
            }
        }
        throw parseResponseException(e);
    }
    try {
        return responseConverter.apply(response);
    } catch (Exception e) {
        throw new IOException("Unable to parse response body for " + response, e);
    }
}
In fact, client.performRequest(req) throws a ResponseException (a subclass of IOException); it is the calling layer that converts it, via parseResponseException, into the exception we see.
/**
* Sends a request to the Elasticsearch cluster that the client points to.
* Blocks until the request is completed and returns its response or fails
* by throwing an exception. Selects a host out of the provided ones in a
* round-robin fashion. Failing hosts are marked dead and retried after a
* certain amount of time (minimum 1 minute, maximum 30 minutes), depending
* on how many times they previously failed (the more failures, the later
* they will be retried). In case of failures all of the alive nodes (or
* dead nodes that deserve a retry) are retried until one responds or none
* of them does, in which case an {@link IOException} will be thrown.
*
* This method works by performing an asynchronous call and waiting
* for the result. If the asynchronous call throws an exception we wrap
* it and rethrow it so that the stack trace attached to the exception
* contains the call site. While we attempt to preserve the original
* exception this isn't always possible and likely haven't covered all of
* the cases. You can get the original exception from
* {@link Exception#getCause()}.
*
* @param request the request to perform
* @return the response returned by Elasticsearch
* @throws IOException in case of a problem or the connection was aborted
* @throws ClientProtocolException in case of an http protocol error
* @throws ResponseException in case Elasticsearch responded with a status code that indicated an error
*/
public Response performRequest(Request request) throws IOException {
    SyncResponseListener listener = new SyncResponseListener(maxRetryTimeoutMillis);
    performRequestAsyncNoCatch(request, listener);
    return listener.get();
}
The request is executed asynchronously; an internal failure is thrown as a ResponseException, and the layer above surfaces it as ElasticsearchStatusException[Elasticsearch exception [type=illegal_argument_exception, reason=mapper [myIndexFeild] cannot be changed from type [long] to [integer]]]
And when we inspect the index that was actually generated, its mapping really does declare myIndexFeild as long, not the integer configured in the annotation.
The root cause here is still a big question mark for me; pointers from anyone who knows the internals would be appreciated.
My suspicion is that it happens because I called putMapping with only a Class, without passing an explicit index name.
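One plausible explanation, and it is only an assumption: if another thread indexes a document into the freshly created index before putMapping runs, Elasticsearch's dynamic mapping infers whole JSON numbers as long, and the later explicit putMapping with integer then conflicts. The simulation below is not the real client API; it just models that check-then-act window with a map standing in for the index mapping:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simulation only (assumption: dynamic mapping infers whole JSON numbers
// as "long", and an explicit mapping cannot change an existing field type).
public class MappingRaceDemo {
    private final Map<String, String> fieldTypes = new ConcurrentHashMap<>();

    // Dynamic mapping: the first document to arrive fixes the field type.
    void indexDocument(String field, Object value) {
        if (value instanceof Number) fieldTypes.putIfAbsent(field, "long");
    }

    // Explicit mapping: rejected if the field already exists with another type.
    void putMapping(String field, String type) {
        String existing = fieldTypes.putIfAbsent(field, type);
        if (existing != null && !existing.equals(type)) {
            throw new IllegalArgumentException(
                    "mapper [" + field + "] cannot be changed from type ["
                    + existing + "] to [" + type + "]");
        }
    }

    public static void main(String[] args) {
        MappingRaceDemo index = new MappingRaceDemo();
        // Thread A created the index; thread B's save() sneaks in first:
        index.indexDocument("myIndexFeild", 42);
        try {
            index.putMapping("myIndexFeild", "integer"); // thread A, too late
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // the familiar "cannot be changed" message
        }
    }
}
```

This would also explain why the types appear in both orders in the stack traces: whichever of the dynamic write and the explicit putMapping lands first decides which type "wins".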
Here is my workaround:
Improved index creation code, passing the index name explicitly to putMapping:
private final ElasticsearchRestTemplate esRestTemplate;

public synchronized String createIndexForEsEntity(Class clazz, String aliasName) {
    ElasticsearchPersistentEntity persistentEntity = esRestTemplate.getPersistentEntityFor(clazz);
    // 1. Resolve the indexName first
    String indexName = persistentEntity.getIndexName();
    log.info("------------- creating index ------------:{}", indexName);
    // 2. Skip creation if the index already exists
    if (!esRestTemplate.indexExists(indexName)) {
        log.info("to do create a index :{}", indexName);
        // 1) Set the index lifecycle policy at creation time
        String setting = String.format(settingStr, DELELTE_AFTER_4_MONTH);
        boolean flag1 = esRestTemplate.createIndex(indexName, setting);
        log.info("{} index create:{}", indexName, flag1);
        // 2) Add the mapping
        String mapType;
        if (clazz.getName().contains("WorkOrderEntity")) {
            mapType = "robot_workorder_mapping";
        } else if (clazz.getName().contains("DetectionItemResultEntity")) {
            mapType = "ai_detect_result_mapping";
        } else {
            String[] split = clazz.getName().split("\\.");
            mapType = split[split.length - 1].toLowerCase();
        }
        boolean flag2 = esRestTemplate.putMapping(indexName, mapType, clazz);
        log.info("{} mapping create:{}", indexName, flag2);
        // 3) Add the alias
        AliasQuery aliasQuery = new AliasQuery();
        aliasQuery.setIndexName(indexName);
        aliasQuery.setAliasName(aliasName);
        boolean flag3 = esRestTemplate.addAlias(aliasQuery);
        log.info("{} alias add:{}", indexName, flag3);
    } else {
        log.info("{} index already exists!", indexName);
    }
    return indexName;
}
The improved code passed multi-threaded testing with no exceptions.
With 50 threads and random intervals of 0-12 seconds, all 50 documents were indeed saved; only 7 indices were created, with no errors and no interruption of the main thread.
Problems 1 and 2 appear to be solved.
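The shape of that concurrency test can be sketched as follows. The index name and thread count are illustrative, and a shared set stands in for the cluster; the point is that with the whole check-then-create sequence under one lock, 50 racing threads must result in exactly one creation:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentCreateTest {
    private final Set<String> existing = ConcurrentHashMap.newKeySet();
    private final AtomicInteger creations = new AtomicInteger();

    // Same shape as the fixed createIndexForEsEntity:
    // check, then create, all under a single lock.
    synchronized void createIfAbsent(String indexName) {
        if (!existing.contains(indexName)) {
            creations.incrementAndGet(); // stands in for createIndex + putMapping + addAlias
            existing.add(indexName);
        }
    }

    // Launch `threads` threads that all race to create the same index.
    static int run(int threads, String indexName) {
        ConcurrentCreateTest t = new ConcurrentCreateTest();
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                try {
                    start.await();
                    t.createIfAbsent(indexName);
                } catch (InterruptedException ignored) {
                } finally {
                    done.countDown();
                }
            }).start();
        }
        start.countDown(); // release all threads at once
        try {
            done.await();
        } catch (InterruptedException ignored) {
        }
        return t.creations.get();
    }

    public static void main(String[] args) {
        System.out.println(run(50, "my_index_result_20210901100000")); // prints 1
    }
}
```

Note that synchronized only protects threads within one JVM; if several application instances create indices concurrently, the idempotence still relies on Elasticsearch itself rejecting a duplicate create.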