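It has been a while since I last wrote anything. I just got through a busy stretch, so I am taking some time to summarize what I have been working on. It is mainly a refresher for myself, but if it also helps someone else, all the better!
First, a few observations after working with ES (Elasticsearch) for quite a while.
1. When I started one project, I wanted to put every type involved in it under a single index. As the project progressed, I found that the same field can have different data types in different types: string in some, int in others. Take the status field I use all the time (all tables are synced over from the original MySQL database): across 30-odd tables, 2 or 3 define it as string and the rest as int. Because of that, status has to be given an alias in those tables, and whenever I use data from them I have to rename the alias back to status. That adds a lot of needless hassle.
2. Putting all types under one index also leads to unexpected problems. One of them is analysis (tokenization): I needed to aggregate by name, but I did not know that up front and had mapped the name field with the ik analyzer. An aggregation on an analyzed field buckets the individual tokens rather than the whole value, so the result is not what you want. And once a field's mapping is set in ES it cannot be changed, so the only option was to delete the index and rebuild it.
3. Deleting an index is expensive. With all the data in one index, even if only a single field is mapped incorrectly, you cannot modify it; you have to back up the data, delete the index, and recreate it. That is too high a price.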
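The field-type problem from point 1 can be caught before any index is created. Here is a minimal sketch of such a check, assuming the table schemas are already available as plain maps; the class name `MappingConflictCheck`, the method `findConflicts`, and the table names `table_a` and `table_b` are hypothetical, and nothing here calls Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class MappingConflictCheck {

    /**
     * Input: table (type) name -> (field name -> data type).
     * Output: every field declared with more than one data type,
     * mapped to the set of types it was declared with.
     */
    static Map<String, Set<String>> findConflicts(Map<String, Map<String, String>> mappings) {
        Map<String, Set<String>> typesByField = new TreeMap<>();
        for (Map<String, String> fields : mappings.values()) {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                typesByField.computeIfAbsent(field.getKey(), k -> new TreeSet<>()).add(field.getValue());
            }
        }
        // Keep only the fields that were seen with at least two different types.
        typesByField.values().removeIf(types -> types.size() < 2);
        return typesByField;
    }

    public static void main(String[] args) {
        // Hypothetical schemas: "status" is an integer in one table and a string in the other.
        Map<String, String> tableA = new LinkedHashMap<>();
        tableA.put("status", "integer");
        tableA.put("name", "string");
        Map<String, String> tableB = new LinkedHashMap<>();
        tableB.put("status", "string");
        tableB.put("name", "string");

        Map<String, Map<String, String>> mappings = new LinkedHashMap<>();
        mappings.put("table_a", tableA);
        mappings.put("table_b", tableB);

        System.out.println(findConflicts(mappings)); // prints {status=[integer, string]}
    }
}
```

Any field reported here is a candidate for an alias, or a reason to put the affected tables in a separate index.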
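Now for the main topic.
Query the stang_cbid documents whose title contains "测试" and whose publish date is on or after 2017-05-01, aggregated by pubtime and sorted by pubtime.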
{
  "query": {
    "bool": {
      "filter": [
        { "match_phrase": { "title": "测试" } },
        { "range": { "pubtime": { "gte": "2017-05-01" } } }
      ]
    }
  },
  "sort": [
    { "pubtime": { "order": "desc" } }
  ],
  "aggs": {
    "group": {
      "terms": { "field": "pubtime" }
    }
  }
}
Partial update of a document:
POST /index/type/{id}/_update
{
  "doc": {
    "field1": "value",
    "field2": "value"
  }
}
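A partial update merges the "doc" body into the stored document: nested objects are merged recursively, while plain values and arrays are replaced. Here is a minimal in-memory sketch of that merge rule; the class `PartialUpdateSketch` and its `merge` method are hypothetical illustrations and are not part of the ES client.

```java
import java.util.Map;

final class PartialUpdateSketch {

    /**
     * Merges patch into source in place and returns source.
     * Nested maps are merged recursively; every other value
     * (including lists) is simply replaced by the incoming one.
     */
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> source, Map<String, Object> patch) {
        for (Map.Entry<String, Object> entry : patch.entrySet()) {
            Object current = source.get(entry.getKey());
            Object incoming = entry.getValue();
            if (current instanceof Map && incoming instanceof Map) {
                merge((Map<String, Object>) current, (Map<String, Object>) incoming);
            } else {
                source.put(entry.getKey(), incoming);
            }
        }
        return source;
    }
}
```

Fields that are not mentioned in the patch stay untouched, which is why only the changed fields need to be sent.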
Filter with bool conditions and sort by distance from a coordinate point:
{
  "query": {
    "bool": {
      "must": [
        { "match_phrase": { "projectname": "隧道" } },
        { "match_phrase": { "area_id": "26" } }
      ]
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "location": "31.88,106.25",
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}
Grouped query (note: the field used for grouping must have its index attribute set to "not_analyzed", so that it is not analyzed):
{
  "query": {
    "bool": {
      "must": [
        { "term": { "type": { "value": 2 } } },
        { "term": { "area_id": { "value": 26 } } }
      ]
    }
  },
  "aggs": {
    "top_tags": {
      "terms": {
        "field": "standards",
        "size": 20
      },
      "aggs": {
        "too": {
          "top_hits": { "_source": {}, "size": 1 }
        }
      }
    }
  }
}
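To do calculations on geographic locations, the field type must first be set to geo_point,
as follows:
"properties": {
  "location": {
    "type": "geo_point"
  }
}
My ES version is 2.4.1. I could not find this type among the listed types, but after setting it the field still ends up with the correct type.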
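There are three formats for storing the value:
JSON object: "location": {"lat": 30.23422, "lon": 107.23151}
Array: [104.21543, 30.123]
String: "30.34,121.343654"
The object and string formats put latitude first and longitude second. The array format is the exception: it follows the GeoJSON convention of [lon, lat], so longitude comes first.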
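The ordering is easy to get wrong: the object and "lat,lon" string forms are latitude first, but the array form is [lon, lat]. Here is a minimal sketch that normalizes all three forms to a {lat, lon} pair. The class `GeoPointFormats` and the method `toLatLon` are hypothetical helpers, not ES APIs, and geohash strings are deliberately not handled.

```java
import java.util.List;
import java.util.Map;

final class GeoPointFormats {

    /** Returns {lat, lon} for an object, array, or "lat,lon" string representation. */
    static double[] toLatLon(Object value) {
        if (value instanceof Map) {
            // Object form: {"lat": ..., "lon": ...}
            Map<?, ?> object = (Map<?, ?>) value;
            return new double[] {
                ((Number) object.get("lat")).doubleValue(),
                ((Number) object.get("lon")).doubleValue()
            };
        }
        if (value instanceof List) {
            // Array form: [lon, lat], longitude first (GeoJSON order).
            List<?> array = (List<?>) value;
            return new double[] {
                ((Number) array.get(1)).doubleValue(),
                ((Number) array.get(0)).doubleValue()
            };
        }
        if (value instanceof String) {
            // String form: "lat,lon", latitude first.
            String[] parts = ((String) value).split(",");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected \"lat,lon\" but got: " + value);
            }
            return new double[] {
                Double.parseDouble(parts[0].trim()),
                Double.parseDouble(parts[1].trim())
            };
        }
        throw new IllegalArgumentException("Unsupported geo_point representation: " + value);
    }
}
```

A quick sanity check when loading data: a latitude outside the range -90 to 90 usually means the two values were swapped.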
Below is the Java code:
1. Given a coordinate range, query all documents that fall inside it, and specify which fields are returned.
public Map map_tunnel(Map map) {
ElasticsearchUtil eu = new ElasticsearchUtil("123");
String indexname = "test";
int from = 0;
List list = new ArrayList<>();
// "name" is optional: a missing key leaves name as null.
Object rawName = map.get("name");
String name = rawName == null ? null : rawName.toString();
String[] fileds = new String[]{};
SearchResponse searchResponse = null;
// ignoreMalformed(true): ignore malformed coordinate data
QueryBuilder qb = QueryBuilders.geoBoundingBoxQuery("location")
    .topLeft(Double.parseDouble(map.get("lat1").toString()), Double.parseDouble(map.get("lon1").toString()))
    .bottomRight(Double.parseDouble(map.get("lat2").toString()), Double.parseDouble(map.get("lon2").toString()))
    .ignoreMalformed(true);
if (null != name) {
switch (name) {
case "plan":
name = "stang_plan_project";
fileds = new String[]{"id", "latitude", "longitude", "projectaddress", "projectname"};
break;
case "work":
name = "stang_work_project";
fileds = new String[]{"id", "latitude", "longitude", "projectaddress", "projectname"};
break;
case "tunnel":
name = "stang_tunnel";
fileds = new String[]{"id", "latitude", "longitude", "address", "name", "section", "status", "type"};
break;
default:
break;
}
if (name.contains("tunnel")) {
BoolQueryBuilder filterqb = QueryBuilders.boolQuery();
filterqb.mustNot(QueryBuilders.matchQuery("type", 2));
filterqb.filter(QueryBuilders.matchQuery("status", 3));
QueryBuilder qbss = QueryBuilders.queryFilter(filterqb);
searchResponse = eu.searchCompanySetSourceFiled(indexname, name, qb, qbss, fileds, from, maxSize);
} else {
searchResponse = eu.searchCompanySetSourceFiled(indexname, name, qb, fileds, from, maxSize);
}
} else {
name = "stang_tunnel";
fileds = new String[]{"id", "latitude", "longitude", "address", "name", "section", "status", "type","forid"};
QueryBuilder qbss = QueryBuilders.boolQuery().mustNot(QueryBuilders.matchQuery("type", 2));
searchResponse = eu.searchCompanySetSourceFiled(indexname, name, qb, qbss, fileds, from, maxSize);
}
try {
SearchHits hits = searchResponse.getHits();
for (SearchHit hit : hits) {
list.add(hit.getSource());
}
Map tmap = new HashMap<>();
tmap.put("ext", list);
tmap.put("state", true);
tmap.put("message", "操作成功");
return tmap;
} catch (Exception e) {
return OutData.softwareFormart();
} finally {
eu.close();
}
}
This is the searchCompanySetSourceFiled method. (There is not much to say about it.)
public SearchResponse searchCompanySetSourceFiled(String indexname, String type, QueryBuilder queryBuilder, String[] fileds, int from, int pageSize) {
SearchResponse searchResponse = null;
try {
searchResponse = client.prepareSearch(indexname)
    .setTypes(new String[]{type})
    .setQuery(queryBuilder)
    .setFetchSource(fileds, null)
    .setFrom(from)
    .setSize(pageSize)
    .execute()
    .actionGet();
return searchResponse;
} catch (Exception e) {
return searchResponse;
}
}
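The result looks like this:
A brief explanation of the parameters: lat1, lon1 are the coordinates of the top-left corner, and lat2, lon2 are the coordinates of the bottom-right corner.
geoBoundingBoxQuery: roughly speaking, this method builds a box from the two coordinate points you pass in. Every coordinate point that falls inside this box (rectangle) is returned by the query.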
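To make the "box" idea concrete, here is a minimal sketch of the containment test that a bounding box implies. It assumes the box does not cross the 180° meridian. The class `BoundingBoxSketch` and its `contains` method are hypothetical and only illustrate the geometry; they are not how ES evaluates the query internally.

```java
final class BoundingBoxSketch {

    /**
     * Returns true when (lat, lon) lies inside the rectangle spanned by the
     * top-left corner (lat1, lon1) and the bottom-right corner (lat2, lon2).
     * Assumes lat1 >= lat2 and lon1 <= lon2 (no wrap around the antimeridian).
     */
    static boolean contains(double lat1, double lon1, double lat2, double lon2,
                            double lat, double lon) {
        boolean latInside = lat <= lat1 && lat >= lat2; // between top and bottom edge
        boolean lonInside = lon >= lon1 && lon <= lon2; // between left and right edge
        return latInside && lonInside;
    }
}
```

Note that the top-left corner has the larger latitude and the smaller longitude. Mixing up the two corners describes an empty box.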
2. Compute the distance between two coordinate points.
public Map datalist(Map<String, Object> map) {
String lat = map.get("lat").toString();
String lon = map.get("lon").toString();
map.remove("lat");
map.remove("lon");
int from = 0;
try{
from = Integer.parseInt(map.get("from").toString());
}catch(Exception e){
from = 0;
}
ElasticsearchUtil eu = new ElasticsearchUtil("123");
BoolQueryBuilder bqb = QueryBuilders.boolQuery();
try {
if (!map.isEmpty()) {
for (Entry<String, Object> vo : map.entrySet()) {
switch (vo.getKey()) {
case "type":
QueryBuilder term = QueryBuilders.matchQuery("type", vo.getValue());
bqb.must(term);
break;
case "status":
QueryBuilder term1 = QueryBuilders.matchQuery("status", vo.getValue());
bqb.must(term1);
break;
case "roadnetwork":
QueryBuilder term2 = QueryBuilders.matchQuery("roadnetwork", vo.getValue());
bqb.must(term2);
break;
case "name":
QueryBuilder term3 = QueryBuilders.matchPhraseQuery("name", vo.getValue());
bqb.must(term3);
break;
case "area_id":
QueryBuilder term4 = QueryBuilders.matchQuery("area_id", vo.getValue());
bqb.must(term4);
break;
default:
break;
}
}
} else {
QueryBuilder term = QueryBuilders.matchAllQuery();
bqb.should(term);
}
SortBuilder sb = SortBuilders.geoDistanceSort("location")
    .point(Double.parseDouble(lat), Double.parseDouble(lon))
    .ignoreMalformed(true)
    .unit(DistanceUnit.KILOMETERS)
    .order(SortOrder.ASC);
String indexname = "test";
String[] fileds = {"address", "area_id", "city_id", "id", "latitude", "longitude", "name", "pic_url", "roadnetwork", "section", "status", "type", "length"};
SearchResponse searchResponse = eu.geoDistanceSortSearchAndSetSourceFileds(indexname, "stang_tunnel", bqb, fileds, sb, from, pageSize);
SearchHits hits = searchResponse.getHits();
List lists = new ArrayList<>();
for (SearchHit hit : hits) {
Map<String, Object> doc = hit.getSource();
// The first sort value is the distance produced by the geo-distance sort.
doc.put("distance", OutData.formartDouble((double) hit.getSortValues()[0]));
lists.add(doc);
}
map = new HashMap<>();
map.put("count", hits.getTotalHits());
lists.add(map);
return OutData.software_Formart(lists,pageSize);
} catch (Exception e) {
return OutData.softwareFormart();
} finally {
eu.close();
}
}
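location: the name of the geo_point field defined in the type's mapping.
SortBuilders.geoDistanceSort(): builds a sort on the distance between each document's point and the given point.
unit(DistanceUnit.KILOMETERS): sets the unit of the distance.
.order(SortOrder.ASC): the sort order (ascending by distance, so the nearest documents come first).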
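For intuition, the same "nearest first" ordering can be sketched in memory. This sketch uses the haversine formula with a mean Earth radius of 6371 km; that is my assumption for illustration and not necessarily the exact distance computation ES uses. The class `GeoDistanceSortSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class GeoDistanceSortSketch {

    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, an approximation

    /** Great-circle distance in kilometers between two lat/lon points (haversine formula). */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Returns a copy of points (each a {lat, lon} pair) ordered by ascending distance from the origin. */
    static List<double[]> sortByDistance(double originLat, double originLon, List<double[]> points) {
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble(
                (double[] p) -> haversineKm(originLat, originLon, p[0], p[1])));
        return sorted;
    }
}
```

In the real query the distance comes back as the first sort value of each hit, so there is no need to recompute it on the client.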
3. Query by conditions, compute the coordinate distance, then aggregate. The effect is similar to SQL GROUP BY.
public Map datagroup(Map<String, Object> map) {
ElasticsearchUtil eu = new ElasticsearchUtil("123");
int from = 0;
String type = "stang_tunnel";
try {
from = (Integer.parseInt(map.get("page").toString()) - 1) * pageSize;
} catch (Exception e) {
// Missing or invalid "page" parameter: keep from = 0 (first page).
}
map.remove("page");
String lat = map.get("lat").toString();
String lon = map.get("lon").toString();
map.remove("lat");
map.remove("lon");
String indexname = "test";
BoolQueryBuilder bqb = QueryBuilders.boolQuery();
try {
if (!map.isEmpty()) {
for (Entry<String, Object> vo : map.entrySet()) {
switch (vo.getKey()) {
case "type":
QueryBuilder term1 = QueryBuilders.matchPhraseQuery("type", vo.getValue());
bqb.must(term1);
break;
case "area_id":
QueryBuilder term2 = QueryBuilders.matchPhraseQuery(vo.getKey(), vo.getValue());
bqb.must(term2);
break;
default:
break;
}
}
} else {
QueryBuilder term = QueryBuilders.matchAllQuery();
bqb.must(term);
}
GeoPoint gp = GeoPoint.parseFromLatLon(lat + "," + lon);
SortBuilder sbd = SortBuilders.geoDistanceSort("location").points(gp).unit(DistanceUnit.KILOMETERS).order(SortOrder.ASC);
String[] fields = {"id", "area_id", "city_id", "name", "roadnetwork", "status"};
// Core part: terms aggregation on "name" with a top_hits sub-aggregation.
AbstractAggregationBuilder aab = AggregationBuilders.terms("group").field("name").size(100)
    .subAggregation(AggregationBuilders.topHits("too").setFetchSource(true).setFetchSource(fields, null).setSize(1));
SearchResponse searchResponse = eu.geoDistanceSortSearchAndSetSourceFileds(indexname, type, bqb, fields, sbd, aab, from, pageSize);
SearchHits hits = searchResponse.getHits();
List lists = new ArrayList<>();
// Core part: read the buckets of the outer "group" aggregation.
Terms terms = searchResponse.getAggregations().get("group");
List<Terms.Bucket> buckets = terms.getBuckets();
for (Terms.Bucket bucket : buckets) {
TopHits topHits = bucket.getAggregations().get("too"); // hits returned by the sub-aggregation
for (SearchHit hit : topHits.getHits()) {
Map<String, Object> doc = hit.getSource();
doc.put("area", OutData.transArea(doc.get("area_id").toString()));
doc.put("city", eu.outCity("jjt", "stang_area", Integer.parseInt(doc.get("city_id").toString())));
lists.add(doc); // added inside the loop, so a bucket without hits adds nothing
}
}
map = new HashMap<>();
map.put("ext", lists);
map.put("state", true);
map.put("message", "操作成功");
} catch (Exception e) {
System.out.println(e.getMessage());
map = OutData.softwareFormart();
} finally {
eu.close();
}
return map;
}
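AggregationBuilders.terms("group").field("name").size(100): the outer aggregation.
.subAggregation(AggregationBuilders.topHits("too").setFetchSource(true).setFetchSource(fields, null).setSize(1)): the fields returned by this inner aggregation are the ones we actually want. size is 1 because each category only needs to return one record; after all, we are only listing the categories.
setFetchSource(): sets the fields returned by the inner aggregation.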
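The terms plus top_hits combination can be pictured as the following in-memory grouping: one bucket per distinct value, a document count, and a single sample document per bucket. This is a minimal sketch under my own assumptions. The class `GroupByTopHitSketch`, the nested `Bucket` class, and `groupBy` are hypothetical, buckets keep first-seen order (ES orders them by document count by default), and the sample is simply the first document seen.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupByTopHitSketch {

    /** One group: its key, how many documents share it, and one sample document. */
    static final class Bucket {
        final String key;
        final Map<String, Object> topHit;
        long docCount;

        Bucket(String key, Map<String, Object> topHit) {
            this.key = key;
            this.topHit = topHit;
        }
    }

    /** Groups docs by the given field, keeping the first document of each group as its sample. */
    static List<Bucket> groupBy(List<Map<String, Object>> docs, String field) {
        Map<String, Bucket> buckets = new LinkedHashMap<>();
        for (Map<String, Object> doc : docs) {
            Object value = doc.get(field);
            if (value == null) {
                continue; // documents without the field fall into no bucket
            }
            String key = value.toString();
            Bucket bucket = buckets.get(key);
            if (bucket == null) {
                bucket = new Bucket(key, doc);
                buckets.put(key, bucket);
            }
            bucket.docCount++;
        }
        return new ArrayList<>(buckets.values());
    }
}
```

This mirrors what the loop over terms.getBuckets() reads back: the bucket key corresponds to the grouped name, and the single top hit carries the fields requested with setFetchSource.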
That is about all. Comments and corrections are welcome.