Implementing Group-By Aggregation in Java (Excerpt)

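During recent development I needed to compute various occupancy-rate statistics. Doing the grouping and aggregation in MySQL was not efficient enough, so I switched to fetching all the data in a single query and doing the grouping and aggregation in Java.
Introduction
Before Java 8 introduced lambdas and the Stream API, implementing something like SQL's GROUP BY aggregation in Java code was fairly difficult. Java's support for functional programming was weak at the time, whereas Scala takes functional programming to the extreme: a group-by aggregation is only a couple of lines of Scala: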

val birds = List("Golden Eagle","Gyrfalcon", "American Robin",  "Mountain BlueBird", "Mountain-Hawk Eagle")
val groupByFirstLetter = birds.groupBy(_.charAt(0))

Output:

Map(M -> List(Mountain BlueBird, Mountain-Hawk Eagle), G -> List(Golden Eagle, Gyrfalcon), 
       A -> List(American Robin))
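
For comparison, here is a minimal sketch of the same grouping written with Java 8 streams. The class and method names (`GroupByFirstLetter`, `groupByFirstLetter`) are made up for this illustration and are not part of the original source code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByFirstLetter {

    // Groups names by their first character, similar to Scala's groupBy(_.charAt(0)).
    static Map<Character, List<String>> groupByFirstLetter(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(name -> name.charAt(0)));
    }

    public static void main(String[] args) {
        List<String> birds = Arrays.asList("Golden Eagle", "Gyrfalcon", "American Robin",
                "Mountain BlueBird", "Mountain-Hawk Eagle");
        Map<Character, List<String>> grouped = groupByFirstLetter(birds);
        // Each key is a first letter; each value keeps the input order within its group.
        grouped.forEach((letter, group) -> System.out.println(letter + " -> " + group));
    }
}
```

The iteration order of the resulting map is not specified, so the groups may print in a different order than in the Scala output above.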

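Java also has third-party libraries that help, such as Guava's Function and libraries like Functional Java. On the whole, though, running GroupBy, OrderBy, Limit and similar TopN operations on in-memory Java collections is still cumbersome. This article implements a simple grouping utility that supports custom keys and custom aggregate functions. With just a few small classes it can group first, then sort within each group, and finally take the top N of each group, which is awkward to express even in SQL.

The source code can be downloaded here.

Implementation
Suppose we have the following Person class: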

package me.lin;

class Person {

    private String name;

    private int age;


    private double salary;


    public Person(String name, int age, double salary) {
        super();
        this.name = name;
        this.age = age;
        this.salary = salary;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public double getSalary() {
        return salary;
    }

    public void setSalary(double salary) {
        this.salary = salary;
    }

    public String getNameAndAge() {
        return this.getName() + "-" + this.getAge();
    }

    @Override
    public String toString() {
        return "Person [name=" + name + ", age=" + age + ", salary=" + salary
                + "]";
    }
}

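Given a List of Person objects, we want to compute statistics grouped by age: take the first element of each group, find the highest salaries in each group, and so on. The implementation follows.

Aggregation
Define an aggregation interface that is applied to the elements of each group, analogous to count(*) and sum() in MySQL: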

package me.lin;

import java.util.List;

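/**
 *
 * Aggregation operation
 *
 * Created by Brandon on 2016/7/21.
 */
public interface Aggregator<T> {

    /**
     * Aggregates one group.
     *
     * @param key the key that identifies the group
     * @param values the elements that belong to the group
     * @return the aggregated value for the group
     */
    Object aggregate(Object key , List<T> values);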
}

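We implement a few aggregators (the tests below use CountAggregator, FirstAggregator and TopNAggregator); more complex ones can be defined freely as needed.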
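
The aggregator classes themselves are not included in this excerpt. The following is a minimal sketch of how the three aggregators used in the tests could look. The bodies are assumptions inferred from the class names and the test output, not the original source; the `Aggregator` interface is restated and everything is nested in a hypothetical `AggregatorSketch` class only so the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class AggregatorSketch {

    // Restated from the article so this sketch is self-contained.
    interface Aggregator<T> {
        Object aggregate(Object key, List<T> values);
    }

    // Like SQL count(*): the number of elements in the group.
    static class CountAggregator<T> implements Aggregator<T> {
        @Override
        public Object aggregate(Object key, List<T> values) {
            return values.size();
        }
    }

    // The first element of the group in input order, or null for an empty group.
    static class FirstAggregator<T> implements Aggregator<T> {
        @Override
        public Object aggregate(Object key, List<T> values) {
            return values.isEmpty() ? null : values.get(0);
        }
    }

    // Sorts a copy of the group with the comparator and keeps the first n elements.
    static class TopNAggregator<T> implements Aggregator<T> {
        private final Comparator<T> comparator;
        private final int n;

        TopNAggregator(Comparator<T> comparator, int n) {
            this.comparator = comparator;
            this.n = n;
        }

        @Override
        public Object aggregate(Object key, List<T> values) {
            List<T> sorted = new ArrayList<>(values); // copy: the input list may be unmodifiable
            sorted.sort(comparator);
            return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
        }
    }

    public static void main(String[] args) {
        List<Integer> group = Arrays.asList(3, 9, 1, 7);
        System.out.println(new CountAggregator<Integer>().aggregate("k", group)); // 4
        System.out.println(new FirstAggregator<Integer>().aggregate("k", group)); // 3
        System.out.println(new TopNAggregator<Integer>(Comparator.reverseOrder(), 2)
                .aggregate("k", group)); // [9, 7]
    }
}
```

Because the aggregator only sees the key and the list of group members, "sort within the group, then take the top N" stays a small, local operation regardless of how the grouping itself is done.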

Grouping
Next comes the grouping itself. For simplicity it is implemented as a utility class:

package me.lin;

import java.lang.reflect.Field;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/**
 * Utility class for grouping a Collection
 */
public class GroupUtils {


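    /**
     * Groups the elements by a property and aggregates each group.
     *
     * @param listToDeal    the data to group, comparable to the source table in SQL
     * @param clazz         the element type of the data to group
     * @param groupBy       the name of the property to group by
     * @param aggregatorMap the aggregators; each key is the aggregator's name and is used
     *                      as the key in the map of aggregated values in the result
     * @param <T>           the element type
     * @return a map from group key to that group's aggregated values, keyed by aggregator name
     * @throws NoSuchFieldException     if clazz declares no field with the given name
     * @throws SecurityException        if reflective access is denied
     * @throws IllegalArgumentException if an element is not an instance of clazz
     * @throws IllegalAccessException   if the field cannot be read
     */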
    public static <T> Map<Object, Map<String, Object>> groupByProperty(
            Collection<T> listToDeal, Class<T> clazz, String groupBy,
            Map<String, Aggregator<T>> aggregatorMap) throws NoSuchFieldException,
            SecurityException, IllegalArgumentException, IllegalAccessException {

        Map<Object, Collection<T>> groupResult = new HashMap<Object, Collection<T>>();

        // Look up the field once, outside the loop, instead of once per element
        Field field = clazz.getDeclaredField(groupBy);
        field.setAccessible(true);

        for (T ele : listToDeal) {
            // The value of the group-by property is the group key
            Object key = field.get(ele);

            if (!groupResult.containsKey(key)) {
                groupResult.put(key, new ArrayList<T>());
            }
            groupResult.get(key).add(ele);
        }


        return invokeAggregators(groupResult, aggregatorMap);
    }


    public static <T> Map<Object, Map<String, Object>> groupByMethod(
            Collection<T> listToDeal, Class<T> clazz, String groupByMethodName,
            Map<String, Aggregator<T>> aggregatorMap) throws NoSuchMethodException, SecurityException, IllegalAccessException, IllegalArgumentException, InvocationTargetException {

        Map<Object, Collection<T>> groupResult = new HashMap<Object, Collection<T>>();

        // Look up the no-argument method once, outside the loop
        Method groupByMethod = clazz.getDeclaredMethod(groupByMethodName);
        groupByMethod.setAccessible(true);

        for (T ele : listToDeal) {
            // The method's return value is the group key
            Object key = groupByMethod.invoke(ele);

            if (!groupResult.containsKey(key)) {
                groupResult.put(key, new ArrayList<T>());
            }
            groupResult.get(key).add(ele);
        }


        return invokeAggregators(groupResult, aggregatorMap);
    }

    private static <T> Map<Object, Map<String, Object>> invokeAggregators(Map<Object, Collection<T>> groupResult, Map<String, Aggregator<T>> aggregatorMap) {

        Map<Object, Map<String, Object>> aggResults = new HashMap<>();
        for (Object key : groupResult.keySet()) {
            Collection<T> group = groupResult.get(key);
            Map<String, Object> aggValues = doInvokeAggregators(key, group, aggregatorMap);
            if (aggValues != null && aggValues.size() > 0) {
                aggResults.put(key, aggValues);
            }

        }
        return aggResults;

    }


    private static <T> Map<String, Object> doInvokeAggregators(Object key, Collection<T> group, Map<String, Aggregator<T>> aggregatorMap) {
        Map<String, Object> aggResults = new HashMap<String, Object>();

        if (group != null && group.size() > 0) {

            // Invoke every aggregate function for the current key
            for (String aggKey : aggregatorMap.keySet()) {
                Aggregator<T> aggregator = aggregatorMap.get(aggKey);
                Object aggResult = aggregator.aggregate(key, Collections.unmodifiableList(new ArrayList<T>(group)));
                aggResults.put(aggKey, aggResult);
            }

        }

        return aggResults;

    }

}

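In the code above, the grouping key can be either a property of the element or a method of the element. By writing your own key methods and aggregate functions, you can build quite powerful grouping logic.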
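
On Java 8 and later, the reflective lookup by name can be replaced with a key-extractor function. The following is a minimal sketch of that alternative, not the article's implementation; `FunctionGroupBy` and `groupBy` are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class FunctionGroupBy {

    // Groups elements by the key returned from keyExtractor, keeping first-seen key order.
    static <T, K> Map<K, List<T>> groupBy(Collection<T> items,
                                          Function<? super T, ? extends K> keyExtractor) {
        Map<K, List<T>> groups = new LinkedHashMap<>();
        for (T item : items) {
            K key = keyExtractor.apply(item);
            // Create the group's list on first use, then append the element.
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(item);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "fig", "pear", "kiwi", "plum", "banana");
        // A method reference takes the place of the property or method name string.
        Map<Integer, List<String>> byLength = groupBy(words, String::length);
        System.out.println(byLength); // {5=[apple], 3=[fig], 4=[pear, kiwi, plum], 6=[banana]}
    }
}
```

With this shape a misspelled property name becomes a compile error instead of a NoSuchFieldException at run time, at the cost of requiring Java 8.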

Tests
Grouping by property
First, a test that groups by a property:

package me.lin;

import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupByPropertyTest {

    public static void main(String[] args) throws NoSuchFieldException,
            SecurityException, IllegalArgumentException, IllegalAccessException {



        List<Person> persons = new ArrayList<>();

        persons.add(new Person("Brandon", 15, 5000));
        persons.add(new Person("Braney", 15, 15000));
        persons.add(new Person("Jack", 10, 5000));
        persons.add(new Person("Robin", 10, 500000));
        persons.add(new Person("Tony", 10, 1400000));

        Map<String, Aggregator<Person>> aggregatorMap = new HashMap<>();
        aggregatorMap.put("count", new CountAggregator<Person>());
        aggregatorMap.put("first", new FirstAggregator<Person>());

        // Orders persons by salary, highest first
        Comparator<Person> comparator = new Comparator<Person>() {
            @Override
            public int compare(final Person o1, final Person o2) {
                return Double.compare(o2.getSalary(), o1.getSalary());
            }
        };
        aggregatorMap.put("top2", new TopNAggregator<Person>( comparator , 2 ));
        Map<Object, Map<String, Object>> aggResults = GroupUtils.groupByProperty(persons, Person.class, "age", aggregatorMap);


        for (Object key : aggResults.keySet()) {
            System.out.println("Key:" + key);

            Map<String, Object> results = aggResults.get(key);
            for (String aggKey : results.keySet()) {
                System.out.println("     " + aggKey + "->" + results.get(aggKey));
            }
        }

    }

}

Output:

Key:10
     count->3
     first->Person [name=Jack, age=10, salary=5000.0]
     top2->[Person [name=Tony, age=10, salary=1400000.0], Person [name=Robin, age=10, salary=500000.0]]
Key:15
     count->2
     first->Person [name=Brandon, age=15, salary=5000.0]
     top2->[Person [name=Braney, age=15, salary=15000.0], Person [name=Brandon, age=15, salary=5000.0]]

Grouping by method return value
Next, a test that groups by the return value of a method:

package me.lin;

import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupByMethodTest {

    public static void main(String[] args) throws Exception {


        List<Person> persons = new ArrayList<>();

        persons.add(new Person("Brandon", 15, 5000));
        persons.add(new Person("Brandon", 15, 15000));
        persons.add(new Person("Jack", 10, 5000));
        persons.add(new Person("Robin", 10, 500000));
        persons.add(new Person("Tony", 10, 1400000));

        Map<String, Aggregator<Person>> aggregatorMap = new HashMap<>();
        aggregatorMap.put("count", new CountAggregator<Person>());
        aggregatorMap.put("first", new FirstAggregator<Person>());

        // Orders persons by salary, highest first
        Comparator<Person> comparator = new Comparator<Person>() {
            @Override
            public int compare(final Person o1, final Person o2) {
                return Double.compare(o2.getSalary(), o1.getSalary());
            }
        };
        aggregatorMap.put("top2", new TopNAggregator<Person>(comparator, 2));
        Map<Object, Map<String, Object>> aggResults = GroupUtils.groupByMethod(persons, Person.class, "getNameAndAge", aggregatorMap);


        for (Object key : aggResults.keySet()) {
            System.out.println("Key:" + key);

            Map<String, Object> results = aggResults.get(key);
            for (String aggKey : results.keySet()) {
                System.out.println("     " + aggKey + "->" + results.get(aggKey));
            }
        }

    }

}

Test output:

Key:Robin-10
     count->1
     first->Person [name=Robin, age=10, salary=500000.0]
     top2->[Person [name=Robin, age=10, salary=500000.0]]
Key:Jack-10
     count->1
     first->Person [name=Jack, age=10, salary=5000.0]
     top2->[Person [name=Jack, age=10, salary=5000.0]]
Key:Tony-10
     count->1
     first->Person [name=Tony, age=10, salary=1400000.0]
     top2->[Person [name=Tony, age=10, salary=1400000.0]]
Key:Brandon-15
     count->2
     first->Person [name=Brandon, age=15, salary=5000.0]
     top2->[Person [name=Brandon, age=15, salary=15000.0], Person [name=Brandon, age=15, salary=5000.0]]
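
For readers on Java 8 or later, the count and top-N-per-group results can also be sketched with the standard `Collectors`. This is an alternative to the article's utility, not part of it. The nested `Person` class below is a trimmed-down stand-in for the article's class, and `StreamGroupTopN`, `topNamesByAge` and `countByAge` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class StreamGroupTopN {

    // Trimmed-down stand-in for the article's Person class.
    static class Person {
        final String name;
        final int age;
        final double salary;

        Person(String name, int age, double salary) {
            this.name = name;
            this.age = age;
            this.salary = salary;
        }
    }

    // Names of the n highest-paid persons within each age group.
    static Map<Integer, List<String>> topNamesByAge(List<Person> persons, int n) {
        return persons.stream().collect(Collectors.groupingBy(
                p -> p.age,
                Collectors.collectingAndThen(Collectors.toList(), group -> group.stream()
                        .sorted(Comparator.comparingDouble((Person p) -> p.salary).reversed())
                        .limit(n)
                        .map(p -> p.name)
                        .collect(Collectors.toList()))));
    }

    // Group sizes, like SQL "select age, count(*) ... group by age".
    static Map<Integer, Long> countByAge(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(p -> p.age, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Brandon", 15, 5000),
                new Person("Braney", 15, 15000),
                new Person("Jack", 10, 5000),
                new Person("Robin", 10, 500000),
                new Person("Tony", 10, 1400000));
        System.out.println(countByAge(persons));
        System.out.println(topNamesByAge(persons, 2));
    }
}
```

For the five sample persons from the property test, this yields Tony and Robin for age 10, and Braney and Brandon for age 15, matching the top2 results shown earlier.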

That is a simple implementation of GroupBy. If you spot any problems, please point them out; discussion is welcome.

Original: https://blog.csdn.net/bingduanlbd/article/details/51987117

Grouping and aggregating with Elasticsearch (ES) from Java code is usually done through an Elasticsearch Java client library. It typically involves the following steps:

1. **Add the dependency**: first add an Elasticsearch client library to the project, for example Spring Data Elasticsearch, or Elasticsearch's official RestHighLevelClient used directly.
2. **Create the client connection**: use the `RestHighLevelClient` class to connect to the ES cluster.
3. **Build the query**: build an aggregation query, for example with `TermsAggregationBuilder` to implement grouping (group by).
4. **Execute the query**: send the query to the Elasticsearch cluster through the client.
5. **Process the result**: extract and process the aggregation data you need from the response.

Below is a simple example using `RestHighLevelClient` that shows how to run a query in Java that groups and aggregates by a field:

```java
import java.io.IOException;

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class TermsAggregationExample {

    public static void main(String[] args) {
        // Create the RestHighLevelClient instance
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));

        try {
            // Build a terms (group-by) aggregation
            TermsAggregationBuilder aggregation =
                    AggregationBuilders.terms("group_by_field").field("field_name");
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            searchSourceBuilder.aggregation(aggregation);
            searchSourceBuilder.query(QueryBuilders.matchAllQuery());

            // Create the search request
            SearchRequest searchRequest = new SearchRequest("index_name");
            searchRequest.source(searchSourceBuilder);

            // Execute the search request
            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

            // Process the response
            Aggregations aggregations = searchResponse.getAggregations();
            Terms groupByField = aggregations.get("group_by_field");
            for (Terms.Bucket bucket : groupByField.getBuckets()) {
                String key = bucket.getKeyAsString();
                // Process the data of each group
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Close the client connection
            try {
                client.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
```

In the code above, we create a `TermsAggregationBuilder` instance to group and aggregate on the field named `field_name`, and send the query through `searchRequest` to the `index_name` index in Elasticsearch. We then process the response and extract the key of each group.