Stream groupingBy

A common operation in SQL is the GROUP BY clause, which is used together with aggregate functions such as count. It might look something like this:

SELECT column_name, count(column_name)
FROM table
GROUP BY column_name;

In Java 8, grouping the objects in a collection by the values of one or more of their properties is simplified by using a Collector. Collector is an interface introduced in Java 8 that defines how to perform a reduction operation on a stream, and combined with the functional style it lets you express a grouping in a single statement. It might even spark a debate with your DBA: should the grouping happen in code or in the database? In the examples below, we will perform a number of group-by operations to show the simplicity and flexibility of the interface.

Setup

class StudentClass {

    private String teacher;
    private double level;
    private String className;

    public StudentClass(String teacher, double level, String className) {
        super();
        this.teacher = teacher;
        this.level = level;
        this.className = className;
    }

    //...
}

private List<StudentClass> studentClasses;

@Before
public void setUp() {

    studentClasses = new ArrayList<>();

    studentClasses.add(new StudentClass("Kumar", 101, "Intro to Web"));
    studentClasses.add(new StudentClass("White", 102, "Advanced Java"));
    studentClasses.add(new StudentClass("Kumar", 101, "Intro to Cobol"));
    studentClasses.add(new StudentClass("Sargent", 101, "Intro to Web"));
    studentClasses.add(new StudentClass("Sargent", 102, "Advanced Web"));

}

Group by teacher name

This example shows how to "group classes by a teacher's name". You pass the groupingBy method a classifier function, here a method reference that extracts the teacher name from each StudentClass, and it returns a map in which each key (a teacher name) points to a list of the matching elements. This is similar to Guava's Multimap collection, which allows for easy mapping of a single key to multiple values.

@Test
public void group_by_teacher_name() {

    Map<String, List<StudentClass>> groupByTeachers = studentClasses
            .stream().collect(
                    Collectors.groupingBy(StudentClass::getTeacher));

    logger.info(groupByTeachers);

    assertEquals(1, groupByTeachers.get("White").size());
}

Output

{
White=[StudentClass{teacher=White, level=102.0, className=Advanced Java}],
Kumar=[StudentClass{teacher=Kumar, level=101.0, className=Intro to Web}, StudentClass{teacher=Kumar, level=101.0, className=Intro to Cobol}],
Sargent=[StudentClass{teacher=Sargent, level=101.0, className=Intro to Web}, StudentClass{teacher=Sargent, level=102.0, className=Advanced Web}]
}
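To make clear what groupingBy does with the classifier function, the sketch below compares it with a hand-written loop that builds the same map. It is an illustration only: the class name GroupingLoopSketch, the trimmed-down StudentClass stand-in, and the getTeacher() getter are assumptions, since the original class elides its accessors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingLoopSketch {

    // Minimal stand-in for the article's StudentClass; getTeacher() is assumed.
    static class StudentClass {
        private final String teacher;
        private final String className;

        StudentClass(String teacher, String className) {
            this.teacher = teacher;
            this.className = className;
        }

        String getTeacher() {
            return teacher;
        }

        @Override
        public String toString() {
            return "StudentClass{teacher=" + teacher + ", className=" + className + "}";
        }
    }

    static List<StudentClass> sampleClasses() {
        return Arrays.asList(
                new StudentClass("Kumar", "Intro to Web"),
                new StudentClass("White", "Advanced Java"),
                new StudentClass("Kumar", "Intro to Cobol"));
    }

    // Collector-based grouping, as in the test above.
    static Map<String, List<StudentClass>> withCollector(List<StudentClass> classes) {
        return classes.stream()
                .collect(Collectors.groupingBy(StudentClass::getTeacher));
    }

    // Hand-written equivalent: one list per key, filled in encounter order.
    static Map<String, List<StudentClass>> withLoop(List<StudentClass> classes) {
        Map<String, List<StudentClass>> grouped = new HashMap<>();
        for (StudentClass sc : classes) {
            grouped.computeIfAbsent(sc.getTeacher(), k -> new ArrayList<>()).add(sc);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<StudentClass> classes = sampleClasses();
        // Both approaches produce the same keys and the same lists.
        System.out.println(withCollector(classes).equals(withLoop(classes))); // true
    }
}
```

The loop version is what the single groupingBy call replaces: look up the key, create the list if it is missing, and append the element.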

Group by class level

Similar to the above, this example will "show all classes by level".

@Test
public void group_by_level() {

    Map<Double, List<StudentClass>> groupByLevel = studentClasses.stream()
            .collect(Collectors.groupingBy(StudentClass::getLevel));

    logger.info(groupByLevel);

    assertEquals(3, groupByLevel.get(101.0).size());
}

Output

{
102.0=[StudentClass{teacher=White, level=102.0, className=Advanced Java}, StudentClass{teacher=Sargent, level=102.0, className=Advanced Web}],
101.0=[StudentClass{teacher=Kumar, level=101.0, className=Intro to Web}, StudentClass{teacher=Kumar, level=101.0, className=Intro to Cobol}, StudentClass{teacher=Sargent, level=101.0, className=Intro to Web}]
}
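The single-argument groupingBy makes no promise about the type or key order of the map it returns, so the order of the keys in logged output should not be relied on. If sorted keys matter, one option is the three-argument overload, which takes a map factory. The sketch below is a minimal illustration under assumptions: the class name SortedGroupingSketch, the reduced StudentClass stand-in, and its getters are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SortedGroupingSketch {

    // Reduced stand-in for the article's StudentClass; getters are assumed.
    static class StudentClass {
        private final String teacher;
        private final double level;
        private final String className;

        StudentClass(String teacher, double level, String className) {
            this.teacher = teacher;
            this.level = level;
            this.className = className;
        }

        double getLevel() {
            return level;
        }
    }

    // Same five rows as the setUp method in the article.
    static List<StudentClass> sampleClasses() {
        return Arrays.asList(
                new StudentClass("Kumar", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Java"),
                new StudentClass("Kumar", 101, "Intro to Cobol"),
                new StudentClass("Sargent", 101, "Intro to Web"),
                new StudentClass("Sargent", 102, "Advanced Web"));
    }

    // TreeMap::new is the map factory, so keys come back in ascending order.
    static Map<Double, List<StudentClass>> groupByLevelSorted(List<StudentClass> classes) {
        return classes.stream().collect(
                Collectors.groupingBy(StudentClass::getLevel,
                        TreeMap::new,
                        Collectors.toList()));
    }

    public static void main(String[] args) {
        Map<Double, List<StudentClass>> byLevel = groupByLevelSorted(sampleClasses());
        System.out.println(byLevel.keySet()); // [101.0, 102.0]
    }
}
```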

Group by aggregate

This example will "count the number of classes per level". In SQL, it might look like this:

SELECT CLASS_LEVEL, count(CLASS_LEVEL)
FROM STUDENT_CLASS
GROUP BY CLASS_LEVEL;

An overloaded groupingBy method accepts a second, downstream collector that is applied to the elements of each group. The Collectors class offers various reduction operations that can be passed here. In this case we use Collectors.counting, which counts the number of classes at each level.

@Test
public void group_by_count() {

    Map<Double, Long> groupByLevel = studentClasses.stream().collect(
            Collectors.groupingBy(StudentClass::getLevel,
                    Collectors.counting()));

    logger.info(groupByLevel);

    // counting() produces a Long, so compare as a long rather than a double
    assertEquals(2L, groupByLevel.get(102.0).longValue());
}

Output

{102.0=2, 101.0=3}
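Because the downstream collector can itself be a groupingBy, the same overload also covers grouping on more than one property, roughly what GROUP BY teacher, level would do in SQL. The following is a sketch under stated assumptions rather than a definitive implementation: the class name NestedGroupingSketch, the reduced StudentClass stand-in, and its getters are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedGroupingSketch {

    // Reduced stand-in for the article's StudentClass; getters are assumed.
    static class StudentClass {
        private final String teacher;
        private final double level;
        private final String className;

        StudentClass(String teacher, double level, String className) {
            this.teacher = teacher;
            this.level = level;
            this.className = className;
        }

        String getTeacher() {
            return teacher;
        }

        double getLevel() {
            return level;
        }
    }

    // Same five rows as the setUp method in the article.
    static List<StudentClass> sampleClasses() {
        return Arrays.asList(
                new StudentClass("Kumar", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Java"),
                new StudentClass("Kumar", 101, "Intro to Cobol"),
                new StudentClass("Sargent", 101, "Intro to Web"),
                new StudentClass("Sargent", 102, "Advanced Web"));
    }

    // Outer grouping by teacher, inner grouping by level, counting each inner group.
    static Map<String, Map<Double, Long>> countByTeacherAndLevel(List<StudentClass> classes) {
        return classes.stream().collect(
                Collectors.groupingBy(StudentClass::getTeacher,
                        Collectors.groupingBy(StudentClass::getLevel,
                                Collectors.counting())));
    }

    public static void main(String[] args) {
        Map<String, Map<Double, Long>> counts = countByTeacherAndLevel(sampleClasses());
        System.out.println(counts.get("Kumar")); // {101.0=2}
    }
}
```

An alternative design is to group on a single composite key (for example a small key class holding teacher and level), which yields a flat map instead of a nested one.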