Filter a collection with Guava

Filtering objects from a collection in Java can be hard to read and difficult to test. Let's examine a cleaner alternative to those multi-line if statements: Collections2.filter, a method on one of Google Guava's collection utility classes.

Detailed Video Notes

Hey everyone, this is Justin Musgrove from Level Up Lunch. Today we are going to talk about filtering a collection. Often you get data back from a data service or a database as a bunch of Java objects, and you want to filter them down like you would in SQL. We are going to look at it from a traditional perspective and then at how Guava's powerful utility classes help us filter down collections.

For our example we are going to look at states. There are 50 states in the United States and we may want just the Midwest states, so we will create a State object and a StateTest. In our test we will walk through how to trim the list down to just the Midwest region.

Let's go ahead and get started. I already created a project and will post it on GitHub as filtering java collections with guava. The one thing we did in our pom.xml file is add the Guava 15.0 dependency, alongside JUnit, which we will use to run some unit tests. That is already set up so you don't have to watch me create the project and import dependencies.

<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.11</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>15.0</version>
</dependency>

Set up POJO

[1:38]

We will create our first class, State, which is a simple POJO. It has a few fields (stateCode, name, regionCode and population) with generated getters and setters. We also want a constructor that takes the class fields, which Eclipse can generate for us (generate constructor from fields). Then, for demonstration purposes, we override toString so we see the object's values when we print it to the console. We will build the toString return value with another Guava utility, Objects.toStringHelper.

public class State {

    private String stateCode;
    private String name;
    private String regionCode;
    private int population;

    public State(String stateCode, String name, String regionCode,
            int population) {
    ...
    }

    ...
    @Override
    public String toString() {
        return Objects.toStringHelper(this)
                .add("stateCode", stateCode)
                .add("name", name)
                .toString();
    }
}

Set up test class

[3:54]

Next, let's create a StateTest class with a variable named states, a List of State objects. I created seed data so you won't have to watch me type, so let's copy it into our StateTest.

public class StateTest {

    List<State> states;

    @Before
    public void setUp () {

        states = Lists.newArrayList();

        states.add(new State("WI", "Wisconsin", "MDW", 5726398));
        states.add(new State("FL", "Florida", "SE", 19317568));
        states.add(new State("IA", "Iowa", "MDW", 3078186));
        states.add(new State("CA", "California", "W", 38041430));
        states.add(new State("NY", "New York", "NE", 19570261));
        states.add(new State("CO", "Colorado", "W", 5187582));
        states.add(new State("OH", "Ohio", "MDW", 11544225));
        states.add(new State("ME", "Maine", "NE", 1329192));
        states.add(new State("SD", "South Dakota", "MDW", 833354));
        states.add(new State("TN", "Tennessee", "SE", 6456243));
        states.add(new State("OR", "Oregon", "W", 3899353));
    }

    ...
}

Using if conditional

[5:20]

Let's look at this from a traditional perspective and create a method, get_mdw_traditional, which uses an if statement. Almost everyone has done this before: you write a for loop over the states and filter with an if statement. This does what we need and works well for small conditionals. But suppose we wanted to add more filters, such as where population is greater than 12312 or where name starts with "C". Every time you add a conditional check, your if statement gets bigger and harder to read.

What happens next is that another developer comes along, looks at the code and says, "Hmm, I need to filter states." They copy and paste the code, just like all lazy developers do, and change one variable. This leads to maintainability issues, duplicated code and all those good things. Let's take this further with Guava.

@Test
public void get_mdw_traditional () {

    List<State> mdwStates = Lists.newArrayList();
    for (State state : states) {

        if (state.getRegionCode().equals("MDW")) {
            mdwStates.add(state);
        }
    }

    System.out.println(mdwStates);
}
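
To make the growth described above concrete, here is a minimal sketch of the same loop with two extra checks added. It uses only java.util. The trimmed-down State class, the filterNames method and the way the conditions are combined (Midwest and above a population floor, or name starting with "C") are assumptions for illustration, not code from the video.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TraditionalFilterSketch {

    // Trimmed-down stand-in for the State POJO, just enough for this sketch.
    static class State {
        final String name;
        final String regionCode;
        final int population;

        State(String name, String regionCode, int population) {
            this.name = name;
            this.regionCode = regionCode;
            this.population = population;
        }
    }

    // Keeps Midwest states above minPopulation, plus any state whose
    // name starts with "C". Each new rule widens the if statement.
    static List<String> filterNames(List<State> states, int minPopulation) {
        List<String> names = new ArrayList<String>();
        for (State state : states) {
            if ((state.regionCode.equals("MDW") && state.population > minPopulation)
                    || state.name.startsWith("C")) {
                names.add(state.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<State> states = Arrays.asList(
                new State("Wisconsin", "MDW", 5726398),
                new State("California", "W", 38041430),
                new State("South Dakota", "MDW", 833354),
                new State("Maine", "NE", 1329192));

        System.out.println(filterNames(states, 1000000)); // [Wisconsin, California]
    }
}
```

The condition is already hard to scan with three checks, and none of them can be reused or tested on its own.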

Using Collections2.filter

[8:35]

Let's create a method, get_mdw_states_with_guava. We are just going to touch on the Collections2.filter method, but there are other powerful Guava utility classes that we will highlight; which one you use depends on the collection type you are dealing with. Collections2.filter takes two parameters: the unfiltered collection and a predicate. A predicate can be thought of as an if statement, a condition that you want to apply to each element in your collection.

@Test
public void get_mdw_states_with_guava () {

    Collection<State> mdwStates = Collections2.filter(states, State.byMdwRegion);

    System.out.println(mdwStates);
}
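
As a rough illustration of the two parameters, the sketch below passes a list and an inline anonymous Predicate to Collections2.filter. It filters plain region code strings rather than State objects to stay short; the class and method names are made up for this example. It also shows that the returned collection is a live view of the source list, which is how Guava documents Collections2.filter.

```java
import com.google.common.base.Predicate;
import com.google.common.collect.Collections2;
import com.google.common.collect.Lists;

import java.util.Collection;
import java.util.List;

public class InlinePredicateSketch {

    // Returns a filtered view containing only the "MDW" region codes.
    static Collection<String> midwestOnly(List<String> regionCodes) {
        return Collections2.filter(regionCodes, new Predicate<String>() {
            @Override
            public boolean apply(String regionCode) {
                return "MDW".equals(regionCode);
            }
        });
    }

    public static void main(String[] args) {
        List<String> regionCodes = Lists.newArrayList("MDW", "SE", "MDW", "W");
        Collection<String> filtered = midwestOnly(regionCodes);
        System.out.println(filtered);        // [MDW, MDW]

        // The result is a view, so a later change to the source shows up in it.
        regionCodes.add("MDW");
        System.out.println(filtered.size()); // 3
    }
}
```

If you need a snapshot rather than a view, copying the result into a new list (for example with Lists.newArrayList) is one option.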

Why is using filter with a predicate so different from a standard if statement? One thing we can do is extract the predicate out of the test method, and most likely we will want to put it inside the State class. The logic then lives in one spot, so if someone else wants to apply it, it is just a matter of using a collection utility and passing the predicate to a filter.
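
The notes do not show how State.byMdwRegion is declared, so the following is only a guess at its shape: a static Predicate field on a trimmed-down State class. The wrapper class and the midwestCodes helper are hypothetical and exist only to make the sketch runnable.

```java
import com.google.common.base.Predicate;
import com.google.common.collect.Collections2;
import com.google.common.collect.Lists;

import java.util.List;

public class ExtractedPredicateSketch {

    // Trimmed-down State; only the fields this sketch needs.
    static class State {
        private final String stateCode;
        private final String regionCode;

        State(String stateCode, String regionCode) {
            this.stateCode = stateCode;
            this.regionCode = regionCode;
        }

        String getStateCode() {
            return stateCode;
        }

        String getRegionCode() {
            return regionCode;
        }

        // Assumed declaration: the Midwest check lives with the class it describes.
        static final Predicate<State> byMdwRegion = new Predicate<State>() {
            @Override
            public boolean apply(State state) {
                return "MDW".equals(state.getRegionCode());
            }
        };
    }

    // Any caller can reuse the predicate instead of copying an if statement.
    static List<String> midwestCodes(List<State> states) {
        List<String> codes = Lists.newArrayList();
        for (State state : Collections2.filter(states, State.byMdwRegion)) {
            codes.add(state.getStateCode());
        }
        return codes;
    }

    public static void main(String[] args) {
        List<State> states = Lists.newArrayList(
                new State("WI", "MDW"),
                new State("FL", "SE"),
                new State("IA", "MDW"));

        System.out.println(midwestCodes(states)); // [WI, IA]
    }
}
```

Because the predicate is an ordinary object, it can also be unit tested on its own by calling apply with a single State.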

Guava has a couple of other utilities: Iterables.filter and FluentIterable, which is really kind of neat because it lets you chain multiple filters together. So in our States example, if we wanted to filter by Midwest region, by population and by states whose name starts with "C", it would be that easy.

Iterable<State> mdwStates = Iterables.filter(states, State.byMdwRegion);

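// byPopulationGT? and whereStateStartsWithC are placeholders for
// predicates that are not defined in these notes.
List<State> mdwStates = FluentIterable
                            .from(states)
                            .filter(State.byMdwRegion)
                            .filter(byPopulationGT?)
                            .filter(whereStateStartsWithC)
                            .toList();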
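
For a version of the chaining idea that compiles, here is a sketch with concrete predicates. The populationGreaterThan factory, the nameStartsWithC predicate, the trimmed-down State class and the 5000000 threshold are all hypothetical stand-ins for the placeholders above. Chained filter calls behave like AND; for an OR, Guava's Predicates.or can combine two predicates into one.

```java
import com.google.common.base.Predicate;
import com.google.common.base.Predicates;
import com.google.common.collect.FluentIterable;
import com.google.common.collect.Iterables;
import com.google.common.collect.Lists;

import java.util.List;

public class ChainedFilterSketch {

    // Trimmed-down State; only the fields this sketch needs.
    static class State {
        final String name;
        final String regionCode;
        final int population;

        State(String name, String regionCode, int population) {
            this.name = name;
            this.regionCode = regionCode;
            this.population = population;
        }
    }

    static final Predicate<State> byMdwRegion = new Predicate<State>() {
        @Override
        public boolean apply(State state) {
            return "MDW".equals(state.regionCode);
        }
    };

    // Hypothetical predicate: the state name starts with "C".
    static final Predicate<State> nameStartsWithC = new Predicate<State>() {
        @Override
        public boolean apply(State state) {
            return state.name.startsWith("C");
        }
    };

    // Hypothetical factory: builds a predicate for a given population floor.
    static Predicate<State> populationGreaterThan(final int minimum) {
        return new Predicate<State>() {
            @Override
            public boolean apply(State state) {
                return state.population > minimum;
            }
        };
    }

    // Chained filters act as AND: Midwest and above the population floor.
    static List<String> largeMidwest(List<State> states, int minimum) {
        return names(FluentIterable.from(states)
                .filter(byMdwRegion)
                .filter(populationGreaterThan(minimum))
                .toList());
    }

    // Predicates.or builds a single predicate: Midwest or name starts with "C".
    static List<String> midwestOrStartsWithC(List<State> states) {
        return names(Iterables.filter(states, Predicates.or(byMdwRegion, nameStartsWithC)));
    }

    // Collects state names in iteration order.
    private static List<String> names(Iterable<State> states) {
        List<String> names = Lists.newArrayList();
        for (State state : states) {
            names.add(state.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<State> states = Lists.newArrayList(
                new State("Wisconsin", "MDW", 5726398),
                new State("Iowa", "MDW", 3078186),
                new State("California", "W", 38041430),
                new State("Maine", "NE", 1329192));

        System.out.println(largeMidwest(states, 5000000)); // [Wisconsin]
        System.out.println(midwestOrStartsWithC(states));  // [Wisconsin, Iowa, California]
    }
}
```

Note that with the seed data in StateTest, chaining the Midwest filter with a "starts with C" filter would return an empty list, since no Midwest state in that data has a name beginning with "C".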

To recap, Guava contains very powerful collection utilities that give you the flexibility to filter collections. Hopefully this gives you another perspective the next time you need to apply this pattern.