Summary statistics made easy

Java 8 has simplified finding summary statistics such as average, maximum, minimum, sum and count on collections and arrays. With the Stream API the reduction is performed internally, which gives a cleaner and more elegant alternative to its predecessor, the for loop.

Detailed Video Notes

Hey everyone, this is Justin at level up lunch. In today's episode we are going to look at how Java 8 has simplified summary statistics such as average, max, min and sum. To get us set up, we created a list named numbers and populated it with ten Double objects, each with the value 10. If you haven't installed Java 8, you will need to do so to run through this tutorial.
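The set-up itself is not shown in these notes, so here is a minimal sketch of how it might look. The class name NumbersSetup and the helper tenTens are hypothetical, made up for illustration; only the idea of a list holding ten Double values of 10 comes from the description above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NumbersSetup {

    // Hypothetical helper: builds a mutable list of ten Double values, each 10.0
    static List<Double> tenTens() {
        // nCopies returns an immutable view, so copy it into an ArrayList
        return new ArrayList<>(Collections.nCopies(10, 10d));
    }

    public static void main(String[] args) {
        List<Double> numbers = tenTens();
        System.out.println(numbers.size());
        System.out.println(numbers.get(0));
    }
}
```

In a JUnit test class the same list could be built in a method annotated with @Before so that every test starts with fresh data.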

[0:26]

Prior to Java 8 you might have looped over a list, adding each element to a variable to capture the sum. Then, to find the average, you would divide by the number of elements in the list. Or you might have written a utility internal to your company or used a public library like Guava. Guava has DoubleMath.mean, which finds the average of all the numbers in the list. Other public libraries exist, such as Apache Commons, which contain functionally similar operations.

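@Test
public void average_with_java() {

    // Accumulate the total in a primitive to avoid repeated boxing
    double sum = 0d;
    for (Double val : numbers) {
        sum += val;
    }

    // Keep the average in its own variable instead of reusing sum
    double average = sum / numbers.size();

    assertEquals(10, average, 0);
}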

@Test
public void average_with_guava() {

    double average = DoubleMath.mean(numbers);
    assertEquals(10, average, 0);
}

At first glance

[0:57]

Let's take a look at how we can perform these operations in Java 8. We have a list of numbers and will call stream(), then mapToDouble(), which accepts a function. A function in Java 8 is a functional interface whose single method accepts an argument and produces a result; here it is a ToDoubleFunction. We will supply that function with the method reference Double::doubleValue. Calling mapToDouble returns a DoubleStream, which is a specialized stream for working with primitive values of type double. Finally, DoubleStream.average() returns the average in an OptionalDouble container.

Replacing DoubleStream.average() with DoubleStream.min() will return the minimum value in the list.

@Test
public void stats_with_java8() {

    OptionalDouble average = numbers.stream()
            .mapToDouble(Double::doubleValue).average();

    System.out.println(average.getAsDouble());

    OptionalDouble min = numbers.stream().mapToDouble(Double::doubleValue)
            .min();

    System.out.println(min);
}

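Looking at the output you will notice that both the average and the min equal 10. The min of a list in which every value is 10 is 10, and the average is the sum (10 * 10 = 100) divided by the count (10), which is also 10. Note that the first line prints the unwrapped double because we called getAsDouble(), while the second prints the OptionalDouble itself.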

Output

10.0
OptionalDouble[10.0]
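The same pattern extends to the other convenience methods on DoubleStream. The sketch below is an illustration rather than part of the episode: the class name ConvenienceMethodsSketch and its helper methods are hypothetical. It also shows one way to deal with an empty list, where average(), min() and max() return an empty OptionalDouble.

```java
import java.util.Collections;
import java.util.List;
import java.util.OptionalDouble;

class ConvenienceMethodsSketch {

    static double max(List<Double> values) {
        // max() returns an OptionalDouble; orElse supplies a fallback for an empty list
        return values.stream().mapToDouble(Double::doubleValue).max().orElse(Double.NaN);
    }

    static double sum(List<Double> values) {
        // sum() returns a plain double and is 0.0 for an empty stream
        return values.stream().mapToDouble(Double::doubleValue).sum();
    }

    static long count(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).count();
    }

    static OptionalDouble average(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).average();
    }

    public static void main(String[] args) {
        List<Double> numbers = Collections.nCopies(10, 10d);

        System.out.println(max(numbers));
        System.out.println(sum(numbers));
        System.out.println(count(numbers));

        // An empty list has no average, so check before unwrapping
        OptionalDouble none = average(Collections.<Double>emptyList());
        System.out.println(none.isPresent());
    }
}
```

Calling getAsDouble() on an empty OptionalDouble throws NoSuchElementException, so orElse or an isPresent() check is the safer choice whenever the list might be empty.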

General purpose reduction operation

[2:57]

There are a few different ways to calculate statistics. Above we used convenience methods, while in this snippet we will show how to use the general purpose reduction operation by calling Stream.reduce. Regardless of which approach you take, these reduction methods condense the stream into a single value.

We will again call numbers.stream(), but instead of mapToDouble we will call Stream.reduce. The reduce method accepts an accumulator function that combines the running result with the next element. To sum the elements, we specify the lambda expression (a, b) -> a + b, which adds two Double values and returns a Double. Applied across the stream, it yields the sum of all elements in the list, wrapped in an Optional because the list could be empty.

@Test
public void stats_with_java8_reduce() {

    Optional<Double> sum = numbers.stream().reduce((a, b) -> a + b);

    System.out.println(sum);
}

Output

Optional[100.0]
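Stream.reduce also has an overload that takes an identity value, which returns the result directly instead of an Optional. The following sketch is an illustration under that assumption of use; the class name ReduceSketch and its methods are hypothetical, while Double::sum and Double::max are standard JDK methods.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class ReduceSketch {

    static Double sum(List<Double> values) {
        // The identity 0d is the starting value and the result for an empty list
        return values.stream().reduce(0d, Double::sum);
    }

    static Optional<Double> max(List<Double> values) {
        // Without an identity the result is an Optional, empty for an empty list
        return values.stream().reduce(Double::max);
    }

    public static void main(String[] args) {
        List<Double> numbers = Collections.nCopies(10, 10d);

        System.out.println(sum(numbers));
        System.out.println(max(numbers).isPresent());
        System.out.println(sum(Collections.<Double>emptyList()));
    }
}
```

The identity form is convenient for sums, where 0 is a natural starting point. For max or min there is no equally natural identity, so the Optional form tends to read more honestly.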

Using SummaryStatistics state object

[4:19]

Depending on what you are dealing with, another way to find the sum of a list is to call Stream.collect(), passing the collector Collectors.summarizingDouble, which returns a DoubleSummaryStatistics object for the resulting values. Like IntSummaryStatistics and LongSummaryStatistics, DoubleSummaryStatistics is a state object that holds all of the statistics: average, sum, max, min and count. They are all computed in a single pass, so if you need several of them they are easy to get at.

@Test
public void stats_with_java8_reduction_target() {

    DoubleSummaryStatistics stats = numbers.stream().collect(
            Collectors.summarizingDouble(Double::doubleValue));

    System.out.println(stats);
}

Output

DoubleSummaryStatistics{count=10, sum=100.000000, min=10.000000, average=10.000000, max=10.000000}
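Printing the object is handy for a quick look, but each statistic also has its own getter. The sketch below illustrates this; the class name SummaryGettersSketch and its helper methods are hypothetical. It also shows that a primitive double array can produce the same state object through DoubleStream.summaryStatistics().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummaryGettersSketch {

    static DoubleSummaryStatistics summarize(List<Double> values) {
        return values.stream()
                .collect(Collectors.summarizingDouble(Double::doubleValue));
    }

    static DoubleSummaryStatistics summarize(double[] values) {
        // Arrays.stream on a double[] yields a DoubleStream directly
        return Arrays.stream(values).summaryStatistics();
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics stats = summarize(Collections.nCopies(10, 10d));

        // Each statistic is read from the same object; the stream is traversed once
        System.out.println(stats.getCount());
        System.out.println(stats.getSum());
        System.out.println(stats.getMin());
        System.out.println(stats.getAverage());
        System.out.println(stats.getMax());
    }
}
```

Unlike the OptionalDouble results earlier, these getters return plain values. With no input, getAverage() returns 0.0 while getMin() and getMax() return positive and negative infinity, so checking getCount() first is a reasonable guard.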

Reduce on a list of objects

[6:24]

The next question you might be asking is what to do when the list does not hold numbers such as doubles, ints or longs. We might have a list of objects on which we want to run summary statistic operations. Taking a look, we will jump over to our class SummaryStatistics2, where in this example we create a class named Company with an attribute named revenue. In our set up method, we created a list of companies, each with a specified revenue. Let's take a look at how to perform these same operations on a list of objects.

We will call the stream method on the list of companies and, like before, call Stream.collect, passing in Collectors.summarizingDouble. Instead of passing in the method reference Double::doubleValue as in the prior example, we will pass in Company::getRevenue. This tells the collector to call Company.getRevenue on every element in the stream. The result is a DoubleSummaryStatistics object.

If we just wanted a convenience method, we can call Stream.mapToDouble, passing in the method reference Company::getRevenue, and then call the terminal method .average().

class Company {

    double revenue;

    public Company(double revenue) {
        super();
        this.revenue = revenue;
    }

    public double getRevenue() {
        return revenue;
    }
}

List<Company> companies;

@Test
public void stats_with_java8() {

    DoubleSummaryStatistics stats = companies.stream().collect(
            Collectors.summarizingDouble(Company::getRevenue));

    System.out.println(stats);

    OptionalDouble average = companies.stream()
            .mapToDouble(Company::getRevenue).average();

    System.out.println(average);
}

Output

DoubleSummaryStatistics{count=4, sum=439.770000, min=12.100000, average=109.942500, max=184.900000}
OptionalDouble[109.9425]
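The set up method that fills the companies list is not shown in these notes. The sketch below is one way it could look, using made-up revenues of 100, 200, 300 and 400 that are purely illustrative and are not the values behind the output above. The class name CompanyStatsSketch and the helper sampleCompanies are likewise hypothetical.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class CompanyStatsSketch {

    // Same shape as the Company class above, nested to keep the sketch self-contained
    static class Company {
        private final double revenue;

        Company(double revenue) {
            this.revenue = revenue;
        }

        double getRevenue() {
            return revenue;
        }
    }

    // Hypothetical sample data, not the revenues used in the episode
    static List<Company> sampleCompanies() {
        return Arrays.asList(
                new Company(100), new Company(200),
                new Company(300), new Company(400));
    }

    static DoubleSummaryStatistics summarize(List<Company> companies) {
        // getRevenue is called once per company to extract the double to summarize
        return companies.stream()
                .collect(Collectors.summarizingDouble(Company::getRevenue));
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics stats = summarize(sampleCompanies());
        System.out.println(stats.getSum());
        System.out.println(stats.getAverage());
    }
}
```

With these sample values the sum is 1000.0 and the average is 250.0.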

Thanks for joining us at level up lunch, I hope you enjoyed this episode. Have a great day!