Simply put, DoubleSummaryStatistics is a class used for collecting statistical information about a set of double values. It provides a convenient way to calculate the minimum, maximum, sum, average, and count of a series of double values.
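As a minimal sketch of that API, the example below feeds a few values into a DoubleSummaryStatistics object and reads the results back. The class name SummaryBasics and the helper summarize are made up for illustration; accept and the getters are the standard methods of the class.

```java
import java.util.DoubleSummaryStatistics;

class SummaryBasics {
    // Hypothetical helper: feeds each value into a fresh statistics object.
    static DoubleSummaryStatistics summarize(double... values) {
        DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
        for (double v : values) {
            stats.accept(v); // records one value
        }
        return stats;
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics stats = summarize(2.5, 4.0, 1.5);
        System.out.println("count   = " + stats.getCount());
        System.out.println("sum     = " + stats.getSum());
        System.out.println("min     = " + stats.getMin());
        System.out.println("max     = " + stats.getMax());
        System.out.println("average = " + stats.getAverage());
    }
}
```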
Internally, DoubleSummaryStatistics uses several properties to track the collected statistical information:
count: the number of values
sum: the sum of all values
min: the minimum value
max: the maximum value
When we use DoubleSummaryStatistics to collect statistics on a series of double values, each value is passed to its accept method, which updates these properties: it increments count, adds the value to sum, and compares the value against the current min and max.
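The update step described above can be sketched with a simplified stand-in class. SimpleDoubleStats is a hypothetical name and this is not the JDK source; it only mirrors the described behaviour, assuming min and max start at positive and negative infinity so that the first value always replaces them.

```java
// Simplified, hypothetical stand-in for illustration only (not the JDK implementation).
class SimpleDoubleStats {
    private long count;
    private double sum;
    private double min = Double.POSITIVE_INFINITY; // any value is smaller
    private double max = Double.NEGATIVE_INFINITY; // any value is larger

    // Constant work per value: one increment, one addition, two comparisons.
    void accept(double value) {
        count++;
        sum += value;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    long getCount() { return count; }
    double getSum() { return sum; }
    double getMin() { return min; }
    double getMax() { return max; }

    // The average is derived on demand rather than stored.
    double getAverage() { return count > 0 ? sum / count : 0.0; }

    public static void main(String[] args) {
        SimpleDoubleStats stats = new SimpleDoubleStats();
        for (double v : new double[] {3.0, -1.0, 2.0}) {
            stats.accept(v);
        }
        System.out.println(stats.getMin() + " .. " + stats.getMax()
                + ", sum " + stats.getSum() + ", count " + stats.getCount());
    }
}
```

Because the average is computed from sum and count, only four fields need to be kept up to date per value.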
In Java, we often use the Collectors.summarizingDouble collector to convert a stream of double values into a DoubleSummaryStatistics object. During this process, the collector traverses each value in the stream, updating the properties of the DoubleSummaryStatistics object.
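A short sketch of the collector in use follows. The class and method names (SummarizingDoubleExample, fromList, fromArray) are invented for this example. The second helper shows a related standard route, DoubleStream.summaryStatistics(), which works on primitive doubles and so avoids boxing each value.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummarizingDoubleExample {
    // Collects a stream of boxed Doubles into a statistics object.
    static DoubleSummaryStatistics fromList(List<Double> values) {
        return values.stream()
                .collect(Collectors.summarizingDouble(Double::doubleValue));
    }

    // Same result from a primitive array, using a DoubleStream (no boxing).
    static DoubleSummaryStatistics fromArray(double[] values) {
        return Arrays.stream(values).summaryStatistics();
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(3.0, 1.0, 2.0);
        DoubleSummaryStatistics stats = fromList(values);
        System.out.println("min = " + stats.getMin());
        System.out.println("max = " + stats.getMax());
        System.out.println("average = " + stats.getAverage());
    }
}
```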
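When dealing with a large number of double values, DoubleSummaryStatistics needs no special algorithm: it makes a single pass over the data and does a constant amount of work per value, so the cost grows linearly with the number of values. The OpenJDK implementation also applies compensated summation to the running sum, which limits the floating-point rounding error that accumulates over many additions.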
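One refinement relevant to large inputs is how the running sum is accumulated. The sketch below shows compensated (Kahan) summation next to a plain running sum. It illustrates the general technique under the assumption that the sum is the part worth protecting; it is not the JDK's code, and the names KahanSketch, naiveSum and kahanSum are hypothetical.

```java
class KahanSketch {
    // Plain running sum: low-order bits of small values can be lost.
    static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    // Kahan summation: carries the rounding error of each addition forward.
    static double kahanSum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0; // error lost by the previous addition
        for (double v : values) {
            double corrected = v - compensation;
            double next = sum + corrected;
            compensation = (next - sum) - corrected;
            sum = next;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Near 1e16 adjacent doubles are 2 apart, so adding 1.0 alone is rounded away.
        double[] values = {1e16, 1.0, 1.0};
        System.out.println("naive: " + naiveSum(values));
        System.out.println("kahan: " + kahanSum(values));
    }
}
```

Both versions still visit every value exactly once, so the extra accuracy costs a few more arithmetic operations per value rather than an extra pass.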
Benchmark
import java.util.ArrayList;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

public class DoubleSummaryStatisticsBenchmark {
    public static void main(String[] args) {
        // Adjust size for your needs: 100 million boxed Doubles need a few GB of heap.
        int size = 100_000_000;

        // Generate random double list
        List<Double> list = new ArrayList<>(size);
        Random random = new Random();
        for (int i = 0; i < size; i++) {
            list.add(random.nextDouble());
        }

        // Benchmark using traditional loop.
        // Note: there is no JIT warm-up here, so treat the timings as rough.
        long startTimeLoop = System.nanoTime();
        // Start from the infinities so that any value replaces them.
        // (Double.MIN_VALUE is the smallest positive double, not the most negative one.)
        double minLoop = Double.POSITIVE_INFINITY;
        double maxLoop = Double.NEGATIVE_INFINITY;
        double sumLoop = 0;
        long countLoop = 0;
        for (double value : list) {
            minLoop = Math.min(minLoop, value);
            maxLoop = Math.max(maxLoop, value);
            sumLoop += value;
            countLoop++;
        }
        long endTimeLoop = System.nanoTime();
        double timeLoop = (endTimeLoop - startTimeLoop) / 1e9;

        // Benchmark using DoubleSummaryStatistics
        long startTimeStats = System.nanoTime();
        DoubleSummaryStatistics stats = list.stream()
                .collect(Collectors.summarizingDouble(Double::doubleValue));
        long endTimeStats = System.nanoTime();
        double timeStats = (endTimeStats - startTimeStats) / 1e9;

        // Print results
        System.out.println("Size: " + size);
        System.out.println("Loop Time (s): " + timeLoop);
        System.out.println("Stream Time (s): " + timeStats);
        System.out.println("Min (Loop): " + minLoop);
        System.out.println("Max (Loop): " + maxLoop);
        System.out.println("Sum (Loop): " + sumLoop);
        System.out.println("Count (Loop): " + countLoop);
        System.out.println("Min (Stream): " + stats.getMin());
        System.out.println("Max (Stream): " + stats.getMax());
        System.out.println("Sum (Stream): " + stats.getSum());
        System.out.println("Count (Stream): " + stats.getCount());
    }
}
See
https://github.com/mtopolnik/billion-row-challenge/blob/main/src/Blog1.java
DoubleSummaryStatistics getMax() method in Java with Examples - GeeksforGeeks