Elasticsearch Geolocation Aggregation

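It has been a while since I last wrote anything. I just wrapped up a busy stretch, so I am taking the time to summarize what I have been working on. It is mainly a refresher for myself, but if it helps anyone else, all the better!
First, a few impressions after working with ES for a while.
1. When I started the project I wanted to put every type it involved under a single index. As the project went on, I found that the same field can have different data types in different types: sometimes string, sometimes int. Take status, a field I use a lot. (All the tables are synced over from the original MySQL database.) Out of more than 30 tables, status is a string in two or three and an int in all the others. Because a field name must have the same mapping across all types in one index, I had to store status under an alias for those tables and then rename the alias back to status everywhere their data is used. That adds a lot of hassle.
2. Putting every type under one index also leads to unexpected problems, for example with analyzers. My case was aggregating by name. Not knowing better at the start, I had set the name field to use the ik analyzer. Aggregating on an analyzed field buckets by individual tokens instead of by the whole value, so the result is not what you want. And once a field's mapping is set it cannot be changed; the only option is to delete the index and rebuild it.
3. Deleting an index is costly. With all the data in one index, a single wrongly defined field still means backing everything up, deleting the index, and recreating it. That is far too high a price.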
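To make point 2 concrete, here is a small plain-Java sketch with no Elasticsearch dependency. It only imitates how a terms aggregation counts terms; the class and method names are made up for illustration, and whitespace splitting is an assumed stand-in for a real analyzer such as ik, which segments text differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TermsBucketSketch {

    // Buckets as seen on a not_analyzed field: each whole value is one term.
    static Map<String, Integer> bucketsForWholeValues(List<String> names) {
        Map<String, Integer> buckets = new TreeMap<>();
        for (String name : names) {
            buckets.merge(name, 1, Integer::sum);
        }
        return buckets;
    }

    // Buckets as seen on an analyzed field: every token becomes its own term.
    // Splitting on whitespace is only a stand-in for a real analyzer.
    static Map<String, Integer> bucketsForTokens(List<String> names) {
        Map<String, Integer> buckets = new TreeMap<>();
        for (String name : names) {
            // A document is counted once per distinct token it contains.
            for (String token : new HashSet<>(Arrays.asList(name.split("\\s+")))) {
                buckets.merge(token, 1, Integer::sum);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("north tunnel", "north tunnel", "south tunnel");
        System.out.println(bucketsForWholeValues(names));
        System.out.println(bucketsForTokens(names));
    }
}
```

With the three sample names, the whole-value version gives two buckets (one per name), while the token version gives three buckets (north, south, tunnel), none of which is an actual name.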
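Now for the main topic.
Query the documents in stang_cbid whose title contains "测试" and whose publish date is on or after 2017-05-01, then aggregate by pubtime and sort by it.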
{
    "query": {
        "bool": {
            "filter": [
                {
                    "match_phrase": {
                        "title": "测试"
                    }
                },
                {
                    "range": {
                        "pubtime": { "gte": "2017-05-01" }
                    }
                }
            ]
        }
    },
    "sort": [
        {
            "pubtime": {
                "order": "desc"
            }
        }
    ],
    "aggs": {
        "group": {
            "terms": {
                "field": "pubtime"
            }
        }
    }
}

Partial document update:

POST /index/type/_id/_update
{
    "doc": {
        "field1": "value",
        "field2": "value"
    }
}
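
As I understand it, the fields under "doc" are merged into the stored document: nested objects are merged recursively, while plain values and arrays are replaced. The following plain-Java sketch imitates that merge on ordinary maps. It is not the Elasticsearch client API; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class PartialUpdateSketch {

    // Returns a new map: `source` with the fields of `partial` merged in.
    // Nested maps are merged recursively; any other value is overwritten.
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> source, Map<String, Object> partial) {
        Map<String, Object> merged = new HashMap<>(source);
        for (Map.Entry<String, Object> entry : partial.entrySet()) {
            Object existing = merged.get(entry.getKey());
            Object incoming = entry.getValue();
            if (existing instanceof Map && incoming instanceof Map) {
                merged.put(entry.getKey(),
                        merge((Map<String, Object>) existing, (Map<String, Object>) incoming));
            } else {
                merged.put(entry.getKey(), incoming);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new HashMap<>();
        stored.put("field1", "old");
        stored.put("field3", "untouched");

        Map<String, Object> doc = new HashMap<>();
        doc.put("field1", "value");
        doc.put("field2", "value");

        // field1 is overwritten, field2 is added, field3 is kept.
        System.out.println(merge(stored, doc));
    }
}
```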

Bool filtering, sorted by distance from a coordinate point:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match_phrase": {
                        "projectname": "隧道"
                    }
                },
                {
                    "match_phrase": {
                        "area_id": "26"
                    }
                }
            ]
        }
    },
    "sort": [
        {
            "_geo_distance": {
                "location": "31.88,106.25",
                "order": "asc",
                "unit": "km"
            }
        }
    ]
}

Grouped query (note: the field used for grouping must have its index attribute set to "not_analyzed", i.e. it is not analyzed):

{
    "query": {
        "bool": {
            "must": [
                {
                    "term": {
                        "type": { "value": 2 }
                    }
                },
                {
                    "term": {
                        "area_id": { "value": 26 }
                    }
                }
            ]
        }
    },
    "aggs": {
        "top_tags": {
            "terms": {
                "field": "standards",
                "size": 20
            },
            "aggs": {
                "too": {
                    "top_hits": { "_source": { }, "size": 1 }
                }
            }
        }
    }
}

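To do geolocation calculations, the field type must first be set to geo_point, as follows:
"properties": {
    "location": {
        "type": "geo_point"
    }
}
I am on ES 2.4.1. I could not find this type among the listed field types, but setting it explicitly still produces the correct type.
A value can be stored in three forms:
JSON object: "location":{"lat":30.23422,"lon":107.23151}
Array: [104.21543,30.123]
String: "30.34,121.343654"
The object and string forms put latitude first and longitude second. The array form is the exception: it uses [lon, lat] order.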
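
Coordinate order is easy to get wrong, since Elasticsearch reads the string form as "lat,lon" but the array form as [lon, lat]. The sketch below is plain Java with hypothetical helper names (it is not part of the Elasticsearch client). It converts each form into one latitude/longitude pair and rejects out-of-range values, which catches many accidental swaps.

```java
import java.util.Map;

class GeoPointFormats {

    // Minimal latitude/longitude pair (hypothetical helper type).
    static class LatLon {
        final double lat;
        final double lon;

        LatLon(double lat, double lon) {
            if (lat < -90 || lat > 90 || lon < -180 || lon > 180) {
                throw new IllegalArgumentException("out of range: lat=" + lat + ", lon=" + lon);
            }
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Object form: explicit "lat" and "lon" keys, so order does not matter.
    static LatLon fromObject(Map<String, Double> value) {
        Double lat = value.get("lat");
        Double lon = value.get("lon");
        if (lat == null || lon == null) {
            throw new IllegalArgumentException("both lat and lon are required");
        }
        return new LatLon(lat, lon);
    }

    // Array form: [lon, lat], longitude first.
    static LatLon fromArray(double[] value) {
        if (value.length != 2) {
            throw new IllegalArgumentException("expected [lon, lat]");
        }
        return new LatLon(value[1], value[0]);
    }

    // String form: "lat,lon", latitude first.
    static LatLon fromString(String value) {
        String[] parts = value.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"lat,lon\"");
        }
        return new LatLon(Double.parseDouble(parts[0].trim()), Double.parseDouble(parts[1].trim()));
    }

    public static void main(String[] args) {
        LatLon p = fromArray(new double[] {104.21543, 30.123});
        System.out.println(p.lat + "," + p.lon);
    }
}
```

Passing the array in the wrong order, for example {30.123, 104.21543}, fails here because 104.21543 is not a valid latitude. A swap between two values that are both within ±90 would still go unnoticed.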
Below is the Java code.
1. Given a coordinate range, query all documents that fall inside it, and specify which fields to return.

  public Map map_tunnel(Map map) {
        ElasticsearchUtil eu = new ElasticsearchUtil("123");
        String indexname = "test";
        int from = 0;
        List list = new ArrayList<>();
        String name = "";
        try{
            name = map.get("name").toString();
        }catch(Exception e){
            name = null;
        }
        String[] fileds = new String[]{};
        SearchResponse searchResponse = null;
        QueryBuilder qb = QueryBuilders.geoBoundingBoxQuery("location").topLeft(Double.parseDouble(map.get("lat1").toString()), Double.parseDouble(map.get("lon1").toString())).bottomRight(Double.parseDouble(map.get("lat2").toString()), Double.parseDouble(map.get("lon2").toString())).ignoreMalformed(true);
        // ignoreMalformed: skip malformed coordinate data instead of failing
        if (null != name) {
            switch (name) {
                case "plan":
                    name = "stang_plan_project";
                    fileds = new String[]{"id", "latitude", "longitude", "projectaddress", "projectname"};
                    break;
                case "work":
                    name = "stang_work_project";
                    fileds = new String[]{"id", "latitude", "longitude", "projectaddress", "projectname"};
                    break;
                case "tunnel":
                    name = "stang_tunnel";
                    fileds = new String[]{"id", "latitude", "longitude", "address", "name", "section", "status", "type"};
                    break;
                default:
                    break;
            }
            if (name.contains("tunnel")) {
                BoolQueryBuilder filterqb = QueryBuilders.boolQuery();
                filterqb.mustNot(QueryBuilders.matchQuery("type", 2));
                filterqb.filter(QueryBuilders.matchQuery("status", 3));
                QueryBuilder qbss = QueryBuilders.queryFilter(filterqb);
                searchResponse = eu.searchCompanySetSourceFiled(indexname, name, qb, qbss, fileds, from, maxSize);
            } else {
                searchResponse = eu.searchCompanySetSourceFiled(indexname, name, qb, fileds, from, maxSize);
            }
        } else {
            name = "stang_tunnel";
            fileds = new String[]{"id", "latitude", "longitude", "address", "name", "section", "status", "type","forid"};
            QueryBuilder qbss = QueryBuilders.boolQuery().mustNot(QueryBuilders.matchQuery("type", 2));
            searchResponse = eu.searchCompanySetSourceFiled(indexname, name, qb, qbss, fileds, from, maxSize);
        }
        try {
            SearchHits hits = searchResponse.getHits();
            for (SearchHit hit : hits) {
                // Each hit's _source is already a Map, so add it directly
                list.add(hit.getSource());
            }
            Map tmap = new HashMap<>();
            tmap.put("ext", list);
            tmap.put("state", true);
            tmap.put("message", "操作成功"); // "operation succeeded"
            return tmap;
        } catch (Exception e) {
            return OutData.softwareFormart();

        } finally {
            eu.close();
        }
    }

This is the searchCompanySetSourceFiled method. (There is not much to say about it.)

    public SearchResponse searchCompanySetSourceFiled(String indexname, String type, QueryBuilder queryBuilder, String[] fileds, int from, int pageSize) {
        SearchResponse searchResponse = null;
        try {
            searchResponse = client.prepareSearch(indexname).setTypes(new String[]{type}).setQuery(queryBuilder).setFetchSource(fileds, null).setFrom(from).setSize(pageSize).execute().actionGet();
            return searchResponse;
        } catch (Exception e) {
            return searchResponse;
        }
    }

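The result looks like this:
[Image: query result]
A quick note on the parameters: lat1 and lon1 are the coordinates of the top-left corner, and lat2 and lon2 are those of the bottom-right corner.
geoBoundingBoxQuery: builds a box from the two corner points you pass in. Every coordinate point that falls inside this box (rectangle) is returned.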
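
The containment test behind a bounding box can be sketched in a few lines of plain Java. This only illustrates the idea and is not how Elasticsearch implements it; the class and method names are hypothetical, and the sketch assumes the box does not cross the 180° meridian.

```java
class BoundingBoxSketch {

    // True when (lat, lon) lies inside the rectangle whose top-left corner is
    // (lat1, lon1) and whose bottom-right corner is (lat2, lon2).
    // Assumption: the box does not cross the 180° meridian.
    static boolean contains(double lat1, double lon1, double lat2, double lon2,
                            double lat, double lon) {
        if (lat1 < lat2 || lon1 > lon2) {
            throw new IllegalArgumentException(
                    "top-left must be north-west of bottom-right");
        }
        // Latitude decreases from top to bottom, longitude grows from left to right.
        return lat <= lat1 && lat >= lat2 && lon >= lon1 && lon <= lon2;
    }

    public static void main(String[] args) {
        // Box from (32, 106) down to (31, 107); the point (31.88, 106.25) is inside it.
        System.out.println(contains(32, 106, 31, 107, 31.88, 106.25));
    }
}
```

The argument check is a reminder that the top-left corner must have the larger latitude and the smaller longitude. Swapping the two corners is an easy mistake when the values come straight from a request map.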

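2. Compute the distance between two coordinate points (the given point and each document's location) and sort by it.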

    public Map datalist(Map<String, Object> map) {
        String lat = map.get("lat").toString();
        String lon = map.get("lon").toString();
        map.remove("lat");
        map.remove("lon");
        int from = 0;
        try{
            from = Integer.parseInt(map.get("from").toString());
        }catch(Exception e){
            from = 0;
        }
        ElasticsearchUtil eu = new ElasticsearchUtil("123");
        BoolQueryBuilder bqb = QueryBuilders.boolQuery();
        try {
            if (!map.isEmpty()) {
                for (Entry<String, Object> vo : map.entrySet()) {
                    switch (vo.getKey()) {
                        case "type":
                            QueryBuilder term = QueryBuilders.matchQuery("type", vo.getValue());
                            bqb.must(term);
                            break;
                        case "status":
                            QueryBuilder term1 = QueryBuilders.matchQuery("status", vo.getValue());
                            bqb.must(term1);
                            break;
                        case "roadnetwork":
                            QueryBuilder term2 = QueryBuilders.matchQuery("roadnetwork", vo.getValue());
                            bqb.must(term2);
                            break;
                        case "name":
                            QueryBuilder term3 = QueryBuilders.matchPhraseQuery("name", vo.getValue());
                            bqb.must(term3);
                            break;
                        case "area_id":
                            QueryBuilder term4 = QueryBuilders.matchQuery("area_id", vo.getValue());
                            bqb.must(term4);
                            break;
                        default:
                            break;
                    }
                }
            } else {
                QueryBuilder term = QueryBuilders.matchAllQuery();
                bqb.should(term);
            }
            SortBuilder sb = SortBuilders.geoDistanceSort("location").point(Double.parseDouble(lat), Double.parseDouble(lon)).ignoreMalformed(true).unit(DistanceUnit.KILOMETERS).order(SortOrder.ASC);
            String indexname = "test";
            String[] fileds = {"address", "area_id", "city_id", "id", "latitude", "longitude", "name", "pic_url", "roadnetwork", "section", "status", "type", "length"};
            SearchResponse searchResponse = eu.geoDistanceSortSearchAndSetSourceFileds(indexname, "stang_tunnel", bqb, fileds, sb, from, pageSize);
            SearchHits hits = searchResponse.getHits();

            List lists = new ArrayList<>();
            for (SearchHit hit : hits) {
                map = hit.getSource();
                map.put("distance", OutData.formartDouble((double) hit.getSortValues()[0]));
                lists.add(map);
            }
            map = new HashMap<>();
            map.put("count", hits.getTotalHits());
            lists.add(map);
            return OutData.software_Formart(lists,pageSize);
        } catch (Exception e) {
            return OutData.softwareFormart();
        } finally {
            eu.close();
        }
    }

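location: the name of the geo_point field defined in the type.
SortBuilders.geoDistanceSort(): sorts by the distance between the given point and each document's location; the computed distance comes back as the hit's sort value.
unit(DistanceUnit.KILOMETERS): sets the unit of the distance.
.order(SortOrder.ASC): the sort order (ascending by distance, so the nearest documents come first).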
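
For intuition about what the sort value represents, here is a plain-Java sketch that computes great-circle distances and orders points nearest first. It uses the haversine formula on a sphere of radius 6371 km, which is my own choice for illustration; Elasticsearch uses its own distance calculation, so its numbers may differ slightly. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class GeoDistanceSketch {

    static final double EARTH_RADIUS_KM = 6371.0; // assumed mean Earth radius

    // Great-circle distance in kilometres between two lat/lon points (haversine).
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Returns a copy of `points` (each a {lat, lon} pair) ordered nearest first,
    // like an ascending distance sort.
    static List<double[]> sortByDistance(double lat, double lon, List<double[]> points) {
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble(
                (double[] p) -> haversineKm(lat, lon, p[0], p[1])));
        return sorted;
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(
                new double[] {40.0, 116.0},
                new double[] {31.9, 106.3});
        for (double[] p : sortByDistance(31.88, 106.25, points)) {
            System.out.println(Arrays.toString(p) + " -> "
                    + haversineKm(31.88, 106.25, p[0], p[1]) + " km");
        }
    }
}
```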
3. Query by conditions, compute the coordinate distance, then aggregate; the effect is similar to SQL's GROUP BY.

    public Map datagroup(Map<String, Object> map) {
        ElasticsearchUtil eu = new ElasticsearchUtil("123");
        int from = 0;
        String type = "stang_tunnel";
        try {
            from = (Integer.parseInt(map.get("page").toString()) - 1) * pageSize;
        } catch (Exception e) {
            // "page" is missing or not a number: keep from = 0 (first page)
        }
        map.remove("page");
        String lat = map.get("lat").toString();
        String lon = map.get("lon").toString();
        map.remove("lat");
        map.remove("lon");
        String indexname = "test";
        BoolQueryBuilder bqb = QueryBuilders.boolQuery();
        try {
            if (!map.isEmpty()) {
                for (Entry<String, Object> vo : map.entrySet()) {
                    switch (vo.getKey()) {
                        case "type":
                            QueryBuilder term1 = QueryBuilders.matchPhraseQuery("type", vo.getValue());
                            bqb.must(term1);
                            break;
                        case "area_id":
                            QueryBuilder term2 = QueryBuilders.matchPhraseQuery(vo.getKey(), vo.getValue());
                            bqb.must(term2);
                            break;
                        default:
                            break;
                    }
                }
            } else {
                QueryBuilder term = QueryBuilders.matchAllQuery();
                bqb.must(term);
            }
            GeoPoint gp = GeoPoint.parseFromLatLon(lat + "," + lon);
            SortBuilder sbd = SortBuilders.geoDistanceSort("location").points(gp).unit(DistanceUnit.KILOMETERS).order(SortOrder.ASC);
            String[] fields = {"id", "area_id", "city_id", "name", "roadnetwork", "status"};
            // Core part: terms aggregation on "name" with a top_hits sub-aggregation
            AbstractAggregationBuilder aab = AggregationBuilders.terms("group").field("name").size(100).subAggregation(AggregationBuilders.topHits("too").setFetchSource(true).setFetchSource(fields, null).setSize(1));
            SearchResponse searchResponse = eu.geoDistanceSortSearchAndSetSourceFileds(indexname, type, bqb, fields, sbd, aab, from, pageSize);
            SearchHits hits = searchResponse.getHits();
            List lists = new ArrayList<>();
            // Core part: read the buckets and the single top hit of each bucket
            Terms terms = searchResponse.getAggregations().get("group");
            List<Terms.Bucket> buckets = terms.getBuckets();
            for (Terms.Bucket bucket : buckets) {
                TopHits topHits = bucket.getAggregations().get("too"); // the top_hits sub-aggregation of this bucket
                for (SearchHit hit : topHits.getHits()) {
                    map = hit.getSource();
                    map.put("area", OutData.transArea(map.get("area_id").toString()));
                    map.put("city", eu.outCity("jjt", "stang_area", Integer.parseInt(map.get("city_id").toString())));
                    // Add inside the loop so a bucket without hits cannot re-add a stale map
                    lists.add(map);
                }
            }
            map = new HashMap<>();
            map.put("ext", lists);
            map.put("state", true);
            map.put("message", "操作成功"); // "operation succeeded"
        } catch (Exception e) {
            System.out.println(e.getMessage());
            map = OutData.softwareFormart();
        } finally {
            eu.close();
        }
        return map;
    }

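AggregationBuilders.terms("group").field("name").size(100): the outer aggregation.
.subAggregation(AggregationBuilders.topHits("too").setFetchSource(true).setFetchSource(fields, null).setSize(1)): the inner aggregation, which holds the fields we actually want. The size is 1 because each group only needs to return one record; we are just listing the categories.
setFetchSource(): sets which fields the inner aggregation returns.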
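
The shape of the result, one representative record per distinct name, can be imitated in plain Java as follows. This is only a sketch of the grouping idea with hypothetical names. It keeps the first document seen per group in input order, whereas, as far as I know, Elasticsearch orders terms buckets by document count by default and picks the top hit by its own sort.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupTopHitSketch {

    // Keeps the first document seen for each distinct value of `groupField`,
    // similar in spirit to a terms bucket that holds a single top hit.
    // Documents that lack the field are skipped.
    static List<Map<String, Object>> firstPerGroup(List<Map<String, Object>> docs,
                                                   String groupField) {
        Map<Object, Map<String, Object>> firstByKey = new LinkedHashMap<>();
        for (Map<String, Object> doc : docs) {
            Object key = doc.get(groupField);
            if (key != null) {
                firstByKey.putIfAbsent(key, doc);
            }
        }
        return new ArrayList<>(firstByKey.values());
    }

    // Small helper to build a sample document.
    static Map<String, Object> doc(int id, String name) {
        Map<String, Object> d = new HashMap<>();
        d.put("id", id);
        d.put("name", name);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc(1, "A"));
        docs.add(doc(2, "A"));
        docs.add(doc(3, "B"));
        // One record per name: the documents with id 1 and id 3.
        System.out.println(firstPerGroup(docs, "name"));
    }
}
```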
That is about all. Comments and corrections are welcome.
