Elasticsearch offers two ways to paginate:
1. Deep pagination (from-size): the larger the page offset, the longer the query takes.
2. Snapshot pagination (scroll): the query returns a scrollId, which is used to fetch the next page.
Because our data set is large, we chose scroll pagination. It worked fine in the development environment, but in the test environment the scrollId exceeded the maximum URL length of a GET request, so the scrollId could not be passed to the backend.
So I dug into the cause:
1. Environment: Windows, elasticsearch-5.6.9; in jvm.options, change [-Dfile.encoding=UTF-8] to GBK.
Note: the configuration options differ considerably between the 5.x and 2.x versions; look up the details for your version.
2. Java code
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.sort.SortOrder;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Logger is assumed to be a project-local utility with a static info(String) method
public class ElasticsearchTest {
private final static String HOST = "127.0.0.1";
private final static int PORT = 9300;
private TransportClient client = null;
/**
 * Get a client connection
 * (default settings)
 */
@Before
public void getConnect() throws UnknownHostException {
client = new PreBuiltTransportClient(Settings.EMPTY).addTransportAddresses(
new InetSocketTransportAddress(InetAddress.getByName(HOST), PORT));
Logger.info("connection info: " + client.toString());
}
// Alternative setup with an explicit cluster name; enable this @Before instead of getConnect()
// @Before
public void before() throws UnknownHostException {
Map<String, String> map = new HashMap<>();
map.put("cluster.name", "elasticsearch");
Settings settings = Settings.builder().put(map).build();
client = new PreBuiltTransportClient(settings).addTransportAddress(
new InetSocketTransportAddress(InetAddress.getByName(HOST), PORT));
Logger.info("connection info: " + client.toString());
}
/**
 * Close the connection
 */
@After
public void closeConnect() {
if (null != client) {
Logger.info("Closing the connection...");
client.close();
}
}
/**
 * Create an index
 * Goal: create index msg (a message queue), type tweet, id 1
 * Index names must be lowercase
 */
@Test
public void addIndex1() throws IOException {
addOne("1");
}
/**
 * Add documents in a loop
 */
@Test
public void addIndex2() throws IOException {
for (int i = 100; i < 600; i++) {
addOne(i + "");
}
}
private void addOne(String i) throws IOException {
SimpleDateFormat s = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
client.prepareIndex("msg", "tweet")
// client.prepareIndex("msg", "tweet1")
// client.prepareIndex("msg1", "tweet")
.setSource(XContentFactory.jsonBuilder()
.startObject().field("name", "张三")
.field("date", new Date())
.field("fmtTime", s.format(new Date()))
.field("msg", "中文_" + i)
.endObject()).get();
}
/**
 * Fetch the first page from the index (starts the scroll)
 */
@Test
public void getData1() {
String[] indexArray = new String[0]; // empty array = search across all indices
SearchResponse response = client.prepareSearch(indexArray)
.setFrom(0)
.setSize(100)
.addSort("date", SortOrder.ASC)
.setScroll(new TimeValue(1000 * 60 * 60))
.get();
for (SearchHit searchHit : response.getHits().getHits()) {
Logger.info(searchHit.getId() + "" + searchHit.getSourceAsString());
}
Logger.info("scrollId: " + response.getScrollId());
Logger.info("scrollId.length:" + response.getScrollId().length());
}
/**
 * Fetch the next page with a scrollId
 */
@Test
public void getData2() {
String scrollId = "1"; // placeholder: paste the scrollId returned by getData1()
SearchResponse response = client.prepareSearchScroll(scrollId)
.setScroll(new TimeValue(1000 * 60 * 60)).get();
for (SearchHit searchHit : response.getHits().getHits()) {
Logger.info(searchHit.getId() + "" + searchHit.getSourceAsString());
}
Logger.info("scrollId: " + response.getScrollId());
Logger.info("scrollId.length:" + response.getScrollId().length());
}
/**
 * Delete a document by index name, type, and document id
 */
@Test
public void deleteData() {
DeleteResponse deleteResponse = client.prepareDelete("msg", "tweet", "1").get();
Logger.info("index: " + deleteResponse.getIndex()
+ "\t type: " + deleteResponse.getType()
+ "\t document id: " + deleteResponse.getId()
+ "\t status: " + deleteResponse.status());
}
@Test
public void deleteIndex() {
client.admin().indices().prepareDelete("msg").execute().actionGet();
Logger.info("Index deleted");
}
}
Batch-adding documents, types, and indices separately shows that the scrollId grows when indices are added; adding documents or types does not change its length.
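This matches how a scroll id is constructed: as far as I can tell, it is a Base64-encoded blob containing one search-context entry for every shard the query touches, so each additional index (and its shards) adds a fixed chunk to the id. The sketch below models that growth; the token layout here is purely an illustrative assumption, not Elasticsearch's actual wire format:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ScrollIdGrowth {

    // Illustrative model: one fixed-size context token per shard,
    // concatenated and Base64-encoded like a real scroll id.
    static String fakeScrollId(int shardCount) {
        StringBuilder contexts = new StringBuilder();
        for (int i = 0; i < shardCount; i++) {
            // assumed layout: "nodeId:shardId:searchContextId;"
            contexts.append("node-1:shard-").append(i).append(":ctx-").append(i).append(";");
        }
        return Base64.getEncoder()
                .encodeToString(contexts.toString().getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // 1 index with 5 shards vs 10 indices with 5 shards each
        System.out.println("1 index   : " + fakeScrollId(5).length() + " chars");
        System.out.println("10 indices: " + fakeScrollId(50).length() + " chars");
    }
}
```

Under this model the id length grows roughly linearly with the number of shards searched, which is consistent with the observation that only adding indices inflates the scrollId.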
Possible solutions (untested):
1. Switch the GET request to POST. But the scrollId is already 30k+ characters, so every request carries several KB of payload, and the scrollId may keep growing in the future.
2. Cache the scrollId server-side with an expiry time; after each request fetches the next page, evict the previous scrollId from the cache.
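The second solution (caching the scrollId with an expiry) can be sketched as a small server-side cache: the client only ever sees a short key, and the backend resolves that key to the full scrollId. This is a minimal in-memory sketch with made-up class and method names; a real deployment would more likely use Redis with a TTL:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class ScrollIdCache {

    private static class Entry {
        final String scrollId;
        final long expiresAt;
        Entry(String scrollId, long expiresAt) {
            this.scrollId = scrollId;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public ScrollIdCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    /** Store a (possibly 30k+ character) scrollId; return a short key for the client. */
    public String put(String scrollId) {
        String key = UUID.randomUUID().toString();
        cache.put(key, new Entry(scrollId, System.currentTimeMillis() + ttlMillis));
        return key;
    }

    /** Fetch and remove: each key is used once, then replaced by the next page's key. */
    public String take(String key) {
        Entry e = cache.remove(key);
        if (e == null || e.expiresAt < System.currentTimeMillis()) {
            return null; // expired or unknown -- the caller must restart the scroll
        }
        return e.scrollId;
    }

    public static void main(String[] args) {
        ScrollIdCache cache = new ScrollIdCache(60_000);
        String key = cache.put("a-very-long-scroll-id"); // stands in for a 30k+ char scrollId
        System.out.println("client-facing key (" + key.length() + " chars): " + key);
        System.out.println("resolved scrollId: " + cache.take(key));
    }
}
```

The page-fetching endpoint would take() the old key, run prepareSearchScroll with the resolved scrollId, put() the new scrollId, and return the new short key to the client along with the page.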
If you have other solutions, please leave a comment on my blog.