Group and count chars in a string

While the count occurrences of a char example finds how many times a single char is present, this example groups all the characters of a string and counts how many times each one occurs. Each snippet calls toLowerCase, so upper and lower case letters are counted together, and whitespace is included in the counts.

Straight up Java

A pre-Java 8 solution is to use a HashMap where the key is the char (stored here as a single-character String) and the value is the number of times the char is present. If the map already contains the char, its count is incremented; otherwise the count is initialized to 1.

@Test
public void most_frequent_char_java() {

    // sentence is a String defined elsewhere in the test class
    Map<String, Long> charCount = new HashMap<>();
    for (char character : sentence.toLowerCase().toCharArray()) {

        // Use the single-character String as the map key
        String charAsString = Character.toString(character);
        if (charCount.containsKey(charAsString)) {
            // Seen before: increment the existing count
            long val = charCount.get(charAsString) + 1;
            charCount.put(charAsString, val);
        } else {
            // First occurrence: start the count at 1
            charCount.put(charAsString, 1L);
        }

    }

    System.out.println(charCount);

    assertEquals(7L, charCount.get("e").longValue());
}

Output

{ =8, a=2, c=2, e=7, f=1, h=5, i=4, m=1, n=4, o=1, q=1, r=2, s=5, t=7, u=1, w=2}
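
The containsKey/put branching above can be written more compactly on newer JDKs. The following is a minimal sketch, assuming Java 8 or later is available (Map.merge was added in Java 8, so this is not a pre-Java 8 technique). The class name CharCountMerge, the method countChars, and the sample text are hypothetical and are not part of the original test. A TreeMap is used here only to make the printed order predictable.

```java
import java.util.Map;
import java.util.TreeMap;

class CharCountMerge {

    // Counts each character of the lower-cased input, whitespace included.
    static Map<String, Long> countChars(String text) {
        // TreeMap keeps the keys sorted so the printed order is predictable.
        Map<String, Long> counts = new TreeMap<>();
        for (char c : text.toLowerCase().toCharArray()) {
            // merge() stores 1 for a new key, otherwise adds 1 to the stored count.
            counts.merge(String.valueOf(c), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not the sentence used in the test above.
        System.out.println(countChars("Hello World"));
        // prints { =1, d=1, e=1, h=1, l=3, o=2, r=1, w=1}
    }
}
```

The single merge call replaces both branches of the if/else, which removes the separate lookup before each update.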

Java 8

Splitting on an empty regular expression breaks the string into single-character strings, which Arrays.stream then turns into a stream. The stream is collected with Collectors.groupingBy, a reduction operation that groups equal chars together. Collectors.counting() is the downstream collector that counts the number of elements in each group; if no elements are present, its result is 0. Finally, each char and the number of times it occurs is printed using the Java 8 forEach.

@Test
public void most_frequent_char_java8() {

    // Split into single-character strings, group equal ones, count each group
    Map<String, Long> frequentChars = Arrays.stream(
            sentence.toLowerCase().split("")).collect(
            Collectors.groupingBy(c -> c, Collectors.counting()));

    frequentChars.forEach((k, v) -> System.out.println(k + ":" + v));

    assertEquals(7L, frequentChars.get("e").longValue());
}

Output

 :8
a:2
c:2
e:7
f:1
h:5
i:4
m:1
n:4
o:1
q:1
r:2
s:5
t:7
u:1
w:2
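
As a variation on the snippet above, the characters can be streamed directly instead of splitting with a regular expression. This is a hedged sketch rather than the approach used in the test: it assumes String.chars() (Java 8) and uses Character keys instead of String keys. The class name CharCountStream, the method countChars, and the sample text are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class CharCountStream {

    // Groups the lower-cased characters of the input and counts each group.
    static Map<Character, Long> countChars(String text) {
        return text.toLowerCase()
                .chars()                      // IntStream of UTF-16 code units
                .mapToObj(c -> (char) c)      // box each unit as a Character
                .collect(Collectors.groupingBy(
                        Function.identity(),  // the char itself is the group key
                        TreeMap::new,         // sorted keys for stable output
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not the sentence used in the test above.
        countChars("Hello World")
                .forEach((k, v) -> System.out.println(k + ":" + v));
    }
}
```

The three-argument form of groupingBy accepts a map factory, so the result is sorted by key without a separate sorting step.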

Google Guava

Before showing the code snippet, it is important to understand what a multiset is. Wikipedia defines a multiset, in mathematics, as "a generalization of the notion of set in which members are allowed to appear more than once... In multisets, as in sets and in contrast to tuples, the order of elements is irrelevant: The multisets {a, a, b} and {a, b, a} are equal." A Guava Multiset is like a Set but can contain the same element more than once, and it has useful summary methods such as count.
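
To make the definition concrete, the sketch below models a multiset with plain JDK classes as a map from element to count. It is only an illustration of the concept and does not use Guava; the class name MultisetSketch and the helper toCounts are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultisetSketch {

    // Hypothetical helper: models a multiset as a map from element to count.
    static Map<String, Integer> toCounts(List<String> items) {
        Map<String, Integer> counts = new HashMap<>();
        for (String item : items) {
            // Store 1 for a new element, otherwise add 1 to its count.
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> first = toCounts(Arrays.asList("a", "a", "b"));
        Map<String, Integer> second = toCounts(Arrays.asList("a", "b", "a"));

        System.out.println(first.equals(second));       // true: order is irrelevant
        System.out.println(first.getOrDefault("a", 0)); // 2
        System.out.println(first.getOrDefault("z", 0)); // 0: absent element
    }
}
```

Guava's Multiset offers the same idea behind a dedicated interface, where count likewise returns 0 for an element that is not present.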

The Guava Splitter class can split a string into fixed-length pieces; in this case it splits by a length of 1 and returns an Iterable. Passing the Iterable to HashMultiset.create builds a multiset which, in concept, uses each char as a key and its count as the value. Finally, each element and its count is printed.

@Test
public void most_frequent_char_guava() {

    // Split into single-character strings and collect them into a multiset
    Multiset<String> frequentCharacters = HashMultiset.create(Splitter
            .fixedLength(1).split(sentence.toLowerCase()));

    // Each entry pairs a distinct element with its number of occurrences
    for (Multiset.Entry<String> item : frequentCharacters.entrySet()) {
        System.out.println(item.getElement() + ":" + item.getCount());
    }

    assertEquals(7, frequentCharacters.count("e"));
}

Output

 :8
a:2
c:2
e:7
f:1
h:5
i:4
m:1
n:4
o:1
q:1
r:2
s:5
t:7
u:1
w:2