Chapter 4. Lambda Operations on Streams

4.1.  Extract stream data using map, peek, and flatMap methods

[Note]

A Stream is a sequence of elements from a source that supports sequential and parallel aggregate operations. The source is typically a Collection (e.g. java.util.List, java.util.Set) or an array that provides data to the Stream. If the source has a defined encounter order (as a List or an array does), the Stream preserves that order.


List<String> list = Arrays.asList("Java", "is", "not", "great");
list.stream()
    .filter(s -> !s.startsWith("n"))
    .map(String::toUpperCase)
    .forEach(s -> System.out.print(s + " "));

					

Output:

JAVA IS GREAT
					

A Collection (the source) is an in-memory data structure that holds values, and all of its values must be populated before you start using it. The elements of a Stream, by contrast, are computed on demand.

A Stream does not store data. It operates on the source data structure (a Collection or an array) and produces pipelined data on which we can perform specific operations. For example, we can create a Stream from a java.util.List and filter it based on a condition as shown above.

Intermediate Operations

Stream API operations that return a new java.util.stream.Stream are called intermediate operations. These operations are always lazy: computation on the source data is performed only when the terminal operation is initiated, and source elements are consumed only as needed. An intermediate operation never produces the final result. Commonly used intermediate operations are filter(...) and map(...).

  • Stream.filter(...)

    You can filter a stream using the Stream.filter(...) method:

    
    Stream<T> filter(Predicate<? super T> predicate);
    
    								

    Here is a stream filtering example:

    stream.filter(s -> !s.startsWith("n"));
    								

    The filter(...) method takes a Predicate functional interface as its parameter. The Predicate interface takes a single argument and returns a primitive boolean. The lambda expression above takes a single parameter s and returns a boolean: the negated result of the s.startsWith("n") method call.

    When you call the filter(...) method on a Stream, the Predicate passed to it is stored internally. No filtering takes place yet (lazy processing).

    The Predicate passed to filter(...) determines which items in the stream are processed further and which are excluded. If the Predicate.test(T t) method returns true for an item, the item is kept for further processing. If it returns false, the item is dropped.
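
    As a minimal, self-contained sketch of the above, the following code stores the Predicate in a variable before passing it to filter(...). The class name FilterDemo and the helper method withoutLeadingN are made up for this illustration.

    ```java
    import java.util.Arrays;
    import java.util.List;
    import java.util.function.Predicate;
    import java.util.stream.Collectors;

    class FilterDemo {

        // Hypothetical helper: keeps only the strings that do not start with "n".
        static List<String> withoutLeadingN(List<String> words) {
            // The lambda body becomes the implementation of Predicate.test(T t).
            Predicate<String> notStartingWithN = s -> !s.startsWith("n");
            return words.stream()
                    .filter(notStartingWithN)       // only stored here, nothing is tested yet
                    .collect(Collectors.toList());  // the terminal operation runs the predicate
        }

        public static void main(String[] args) {
            List<String> list = Arrays.asList("Java", "is", "not", "great");
            System.out.println(withoutLeadingN(list)); // [Java, is, great]
        }
    }
    ```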

  • Stream.map(...)

    It is possible to map the items in a stream to other objects using the Stream.map(...) method:

    
    <R> Stream<R> map(Function<? super T, ? extends R> mapper);
    
    								 

    In other words, for each item in the stream you create a new object based on that item. Here is a simple Java stream mapping example:

    stream.map(s -> s.toUpperCase());
    								 

    This example maps all strings in the stream to their uppercase equivalents.

    NOTE: this example does not actually perform the mapping (all intermediate operations are lazy). It only configures the stream for mapping. The mapping is performed once a terminal operation is invoked.

  • Stream.distinct()

    The Stream.distinct() method returns a stream consisting of the distinct elements of this stream. Whether or not an element is distinct is decided by its equals() method (inherited from, or overriding, Object.equals(Object)).

    
    Stream<T> distinct();
    
    								

    Here is a simple example:

    
    List<String> list = Arrays.asList("aA","AA","Aa", "Aa", "AA");
    long l = list.stream().distinct().count();
    System.out.println("Number of distinct elements : " + l);
    
    								

    Output:

    Number of distinct elements : 3
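
    To illustrate that distinct() relies on equals(), here is a rough sketch with a hypothetical Tag class (not part of the original example) whose equals() ignores letter case. The same five strings that gave 3 distinct values above collapse to a single element once they are wrapped in Tag objects.

    ```java
    import java.util.Arrays;
    import java.util.Locale;

    class DistinctDemo {

        // Hypothetical value class: two tags are equal if their text matches ignoring case.
        static final class Tag {
            private final String text;

            Tag(String text) {
                this.text = text;
            }

            @Override
            public boolean equals(Object o) {
                return o instanceof Tag && text.equalsIgnoreCase(((Tag) o).text);
            }

            @Override
            public int hashCode() {
                // Must be consistent with equals(), so hash the lower-cased text.
                return text.toLowerCase(Locale.ROOT).hashCode();
            }
        }

        static long countDistinctTags(String... texts) {
            return Arrays.stream(texts)
                    .map(Tag::new)
                    .distinct()   // uses Tag.equals() and Tag.hashCode()
                    .count();
        }

        public static void main(String[] args) {
            System.out.println(countDistinctTags("aA", "AA", "Aa", "Aa", "AA")); // 1
        }
    }
    ```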
    								

  • Stream.peek(...)

    Stream.peek(...) returns a stream consisting of the elements of this stream, additionally performing the action passed as a Consumer on each element as it is consumed.

    
    Stream<T> peek(Consumer<? super T> action)
    
    								

    Stream.peek(...) is extremely useful during debugging. It allows you to look at the elements as they flow past a certain point in the pipeline. Here is a simple example:

    
    Stream<String> words = Stream.of("lower", "case", "text");
    List<String> list = words
        .peek(s -> System.out.println(s))
        .map(s -> s.toUpperCase())
        .collect(Collectors.toList());
    System.out.println(list);
    
    								

    Output:

    lower
    case
    text
    [LOWER, CASE, TEXT]
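
The laziness of intermediate operations described in this section can be observed directly. The sketch below records events in a list instead of printing them, so that the order is easy to inspect. The class name LazyTraceDemo and the event strings are made up for this illustration, and the order shown assumes a sequential stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyTraceDemo {

    // Returns the events recorded while building and then running a pipeline.
    static List<String> trace() {
        List<String> events = new ArrayList<>();

        // Only intermediate operations so far: the lambdas are stored, not executed.
        Stream<String> pipeline = Stream.of("lower", "case", "text")
                .peek(s -> events.add("peek " + s))
                .map(s -> {
                    events.add("map " + s);
                    return s.toUpperCase();
                });

        events.add("pipeline built");

        // The terminal operation pulls the elements through peek and map one by one.
        List<String> result = pipeline.collect(Collectors.toList());
        events.add("result " + result);
        return events;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The first recorded event is "pipeline built", followed by a "peek" and a "map" event for each element in turn, and finally "result [LOWER, CASE, TEXT]".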
    								

Terminal Operations

Terminal operations are Stream API operations that return a result or produce a side effect. Once a terminal method is called on a stream, it consumes the stream, and the stream cannot be used afterwards. Terminal operations are eager in nature, i.e. they process all the elements in the stream (unless the operation is short-circuiting) before returning the result. Commonly used terminal methods are forEach, toArray, min, max, findFirst, anyMatch, allMatch, etc. You can identify terminal methods by their return type: they never return a Stream.
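
The following sketch shows what "consuming" a stream means in practice. It assumes the standard JDK stream implementation, which rejects a second terminal operation on the same stream with an IllegalStateException. The class and method names are made up for this illustration.

```java
import java.util.stream.Stream;

class ReuseDemo {

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTerminalOperationFails() {
        Stream<String> words = Stream.of("Java", "is", "great");
        words.forEach(s -> { });   // first terminal operation consumes the stream
        try {
            words.count();         // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;           // the stream has already been operated upon
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTerminalOperationFails()); // true
    }
}
```

If the elements are needed again, create a new stream from the source.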

  • Stream.collect(...)

    Stream.collect(...) is a terminal operation to transform the elements of the stream into a different kind of result, e.g. a java.util.List, java.util.Set or java.util.Map:

    
    <R, A> R collect(Collector<? super T, A, R> collector);
    
    								

    Stream.collect(...) accepts a Collector, an interface that consists of several operations, including supplier(), accumulator(), combiner(), and finisher():

    
    public interface Collector<T, A, R> {
        Supplier<A> supplier();
    
        BiConsumer<A, T> accumulator();
    
        BinaryOperator<A> combiner();
    
        Function<A, R> finisher();
    
        ...
    
    }
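
    To show how these four operations fit together, here is a rough sketch of a hand-written collector built with the Collector.of(...) factory method. The collector itself (bracketed(), which joins strings with " | " and wraps the result in square brackets) is a made-up example, not something provided by the JDK.

    ```java
    import java.util.StringJoiner;
    import java.util.stream.Collector;
    import java.util.stream.Stream;

    class CustomCollectorDemo {

        // Hypothetical collector: T = String, A = StringJoiner, R = String.
        static Collector<String, StringJoiner, String> bracketed() {
            return Collector.of(
                    () -> new StringJoiner(" | "),   // supplier: creates the mutable container
                    StringJoiner::add,               // accumulator: adds one element to it
                    StringJoiner::merge,             // combiner: merges two partial containers
                    joiner -> "[" + joiner + "]");   // finisher: converts container to result
        }

        public static void main(String[] args) {
            String s = Stream.of("Java", "is", "great").collect(bracketed());
            System.out.println(s); // [Java | is | great]
        }
    }
    ```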
    
    								

    Java 11 supports various built-in collectors via the java.util.stream.Collectors final class. So for the most common operations you do not have to implement a Collector yourself:

    
    List<String> list = Arrays.asList("Java", "is", "not", "great");
    List<String> filtered = list.stream()
        .filter(item -> item.startsWith("J"))
        .collect(Collectors.toList());
    System.out.print(filtered.get(0));
    
    								

    This example creates a stream, adds a filter, and collects all objects accepted by the filter in a java.util.List. The filter only accepts items (strings) which start with the character J. The resulting java.util.List thus contains all strings from the original list which start with the character J. Output is:

    Java
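
    Besides toList(), the Collectors class offers collectors for the other result types mentioned above. The sketch below uses Collectors.toSet(), Collectors.joining(...), and Collectors.groupingBy(...). The word list and the class name CollectorsDemo are chosen for this illustration only.

    ```java
    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.stream.Collectors;

    class CollectorsDemo {

        static final List<String> WORDS = Arrays.asList("Java", "is", "not", "great", "is");

        // Collects into a Set, so the duplicate "is" is stored only once.
        static Set<String> asSet() {
            return WORDS.stream().collect(Collectors.toSet());
        }

        // Concatenates all elements into one String, separated by ", ".
        static String joined() {
            return WORDS.stream().collect(Collectors.joining(", "));
        }

        // Groups the distinct words by their length into a Map.
        static Map<Integer, List<String>> byLength() {
            return WORDS.stream()
                    .distinct()
                    .collect(Collectors.groupingBy(String::length));
        }

        public static void main(String[] args) {
            System.out.println(asSet().size());    // 4
            System.out.println(joined());          // Java, is, not, great, is
            System.out.println(byLength().get(4)); // [Java]
        }
    }
    ```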
    								

  • Stream.min(...) / Stream.max(...)

    The Stream.min(...) and Stream.max(...) methods are stream processing terminal methods. Once these are called, the stream will be iterated, filtering and mapping applied, and the minimum or maximum value in the stream will be returned:

    
    Optional<T> min(Comparator<? super T> comparator);
    
    Optional<T> max(Comparator<? super T> comparator);
    
    								

    The Stream.min(...) and Stream.max(...) methods return an Optional instance which has a get() method, which you use to obtain the value. In case the Optional has no value the get() method will throw NoSuchElementException.

    The Stream.min(...) and Stream.max(...) methods take a java.util.Comparator as parameter. The Comparator.comparing(...) method creates a Comparator based on the lambda expression passed to it. In fact, the comparing(...) method takes a Function which is a functional interface suited for lambda expressions. It takes one parameter and returns a value:

    
    Comparator<Person> byLastName = Comparator.comparing(Person::getLastName);
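
    The following sketch puts the comparator to use. The Person class is not shown in the text above, so a minimal hypothetical version with only a last name is assumed here. The helper people(...) is likewise made up for the example. Note how the empty case is handled through the Optional rather than by calling get() directly.

    ```java
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;
    import java.util.Optional;
    import java.util.stream.Collectors;

    class MinMaxDemo {

        // Hypothetical Person class, reduced to the one property used here.
        static final class Person {
            private final String lastName;

            Person(String lastName) {
                this.lastName = lastName;
            }

            String getLastName() {
                return lastName;
            }
        }

        // Hypothetical helper that builds a list of persons from last names.
        static List<Person> people(String... lastNames) {
            return Arrays.stream(lastNames).map(Person::new).collect(Collectors.toList());
        }

        static Optional<String> minLastName(List<Person> people) {
            Comparator<Person> byLastName = Comparator.comparing(Person::getLastName);
            return people.stream().min(byLastName).map(Person::getLastName);
        }

        static Optional<String> maxLastName(List<Person> people) {
            Comparator<Person> byLastName = Comparator.comparing(Person::getLastName);
            return people.stream().max(byLastName).map(Person::getLastName);
        }

        public static void main(String[] args) {
            List<Person> staff = people("Smith", "Adams", "Jones");
            System.out.println(minLastName(staff).get());               // Adams
            System.out.println(maxLastName(staff).get());               // Smith
            // For an empty stream the Optional is empty, so supply a fallback.
            System.out.println(minLastName(people()).orElse("nobody")); // nobody
        }
    }
    ```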
    
    								

  • Stream.findAny()

    The Stream.findAny() finds any element in the stream, which may be cheaper than findFirst() for some streams. This is a short circuit terminal operation. A short circuit terminal operation potentially allows processing of a stream to stop early without examining all the elements.

    
    Optional<T> findAny();
    
    								

    Here is an example:

    
    List<String> list = Arrays.asList("Java", "is", "not", "great");
    Optional<String> result = list.stream()
        .filter(item -> item.contains("t"))
        .findAny();
    System.out.print(result.get());
    
    								

    This example processes the elements of the collection one by one until it finds an element that contains the character t, which is then returned by findAny(). The findAny() method stops pipeline execution immediately, so no further elements are processed. The possible output is:

    not
    								

    [Important]

    It will most likely print 'not', but there is no guarantee of this. Printing 'great' is also possible.

  • Stream.findFirst()

    The findFirst() method returns the first element of the stream. The return value is an Optional; for an empty stream it is an empty Optional.

    
    Optional<T> findFirst();
    
    								

    It returns the first element of the stream and then does not process any further elements, as it is a short-circuiting terminal operation.
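
    Here is a short sketch using the same word list as the findAny() example. Because a List has a defined encounter order, findFirst() is guaranteed to return the first match. The class name and the helper firstContaining are made up for this illustration.

    ```java
    import java.util.Arrays;
    import java.util.List;
    import java.util.Optional;

    class FindFirstDemo {

        // Hypothetical helper: first word (in encounter order) containing the fragment.
        static Optional<String> firstContaining(List<String> words, String fragment) {
            return words.stream()
                    .filter(w -> w.contains(fragment))
                    .findFirst();
        }

        public static void main(String[] args) {
            List<String> list = Arrays.asList("Java", "is", "not", "great");
            System.out.println(firstContaining(list, "t").get());          // not
            // No word contains "z", so the Optional is empty and the fallback is used.
            System.out.println(firstContaining(list, "z").orElse("none")); // none
        }
    }
    ```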

  • Stream.count()

    The Stream.count() method simply returns the number of elements in the stream after filtering has been applied.

    long count();
    								

    Here is an example:

    
    List<String> list = Arrays.asList("Java", "is", "not", "great");
    long l = list.stream()
        .filter(item -> item.startsWith("J"))
        .count();
    System.out.print(l);
    
    								

    This example iterates the stream and keeps all elements that start with the character J, and then counts these elements. The count() method returns a long which is the count of elements in the stream after filtering it:

    1
    								

Stream Pipelines

To perform a computation, stream operations are composed into a stream pipeline. A stream pipeline consists of:

  • a source (which might be an array, a Collection, a generator function, an I/O channel, etc.)

  • zero or more intermediate operations (which transform a Stream into another Stream, such as filter(Predicate p))

  • a terminal operation (which produces a result or side-effect, such as count() or forEach(Consumer c))

Streams are lazy; computation on the source data is only performed when the terminal operation is initiated, and source elements are consumed only as needed.
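
The three parts of a pipeline, and the fact that source elements are consumed only as needed, can be sketched with an infinite generator as the source. The class name PipelineDemo and the method firstEvenSquares are made up for this illustration. The pipeline terminates only because limit(...) stops requesting elements once enough have been produced.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineDemo {

    static List<Integer> firstEvenSquares(int howMany) {
        return Stream.iterate(1, n -> n + 1)     // source: infinite generator 1, 2, 3, ...
                .filter(n -> n % 2 == 0)         // intermediate: keep even numbers
                .map(n -> n * n)                 // intermediate: square them
                .limit(howMany)                  // intermediate: short-circuits the pipeline
                .collect(Collectors.toList());   // terminal: triggers the computation
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3)); // [4, 16, 36]
    }
}
```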

Applying a function to each element of a stream

Streams support the method map(), which takes a java.util.function.Function as an argument. The function is applied to each element, mapping it into a new element (the word mapping is used because it has a meaning similar to transforming but with the nuance of "creating a new version of" rather than "modifying"). For example, in the following code you pass a method reference Employee::getName to the map(...) method to extract the names of the employees in the stream:

public class Employee {
    private String name;
    public Employee(String n) {
        name = n;
    }
    public String getName() {
        return name;
    }
}
					


Stream<Employee> emps = Stream.of(new Employee("Mikalai"), new Employee("Volha"));
Stream<String> names = emps.map(Employee::getName);
List<String> staff = names.collect(Collectors.toList());
System.out.print(staff);

					

Output:

[Mikalai, Volha]
					

Because the method Employee.getName() returns a String, the stream returned by the map() method is of type Stream<String>.

For example, if you wanted to find out the length of the name of each employee, you could do this by chaining another map(...) as follows:


Stream<Employee> emps = Stream.of(new Employee("Mikalai"), new Employee("Volha"));
Stream<String> names = emps.map(Employee::getName);
Stream<Integer> lengths = names.map(n -> n.length());
List<Integer> list = lengths.collect(Collectors.toList());
System.out.println(list);

					

Output:

[7, 5]
					

Primitive stream specializations

Java 8 introduced three primitive specialized stream interfaces that support specialized methods (like max(), sum(), average()) to work with streams of numbers: IntStream, DoubleStream, and LongStream. They respectively specialize the elements of a stream to be int primitives, double primitives, and long primitives, and thereby avoid hidden boxing costs.

[Important]

The java.util.stream.Stream interface has max(...) and min(...) methods, but unlike their counterparts in IntStream, DoubleStream, and LongStream, which take no parameters, they require a Comparator to be passed in as a parameter.

The java.util.stream.Stream interface does not have average() and sum() methods.

Each of these three interfaces brings new methods to perform common numeric reductions such as sum() to calculate the sum of a numeric stream and max() to find the maximum element. In addition, they have methods to convert back to a stream of Objects when necessary.
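
As a small sketch of these numeric reductions, the following code uses IntStream.of(...) as the source and boxed() to convert back to a stream of objects. The class name IntStreamDemo and its helper methods are made up for this illustration.

```java
import java.util.List;
import java.util.OptionalDouble;
import java.util.OptionalInt;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class IntStreamDemo {

    // sum() returns a plain int; for an empty stream the result is 0.
    static int sum(int... values) {
        return IntStream.of(values).sum();
    }

    // max() takes no Comparator and returns an OptionalInt (empty for an empty stream).
    static OptionalInt max(int... values) {
        return IntStream.of(values).max();
    }

    // average() returns an OptionalDouble.
    static OptionalDouble average(int... values) {
        return IntStream.of(values).average();
    }

    // boxed() converts the IntStream back to a Stream<Integer>.
    static List<Integer> boxed(int... values) {
        return IntStream.of(values).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sum(3, 1, 2));                   // 6
        System.out.println(max(3, 1, 2).getAsInt());        // 3
        System.out.println(average(3, 1, 2).getAsDouble()); // 2.0
        System.out.println(boxed(3, 1, 2));                 // [3, 1, 2]
    }
}
```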

Mapping to a numeric stream

The most common methods you will use to convert a stream to a primitive specialized version are Stream.mapToInt(), Stream.mapToDouble(), and Stream.mapToLong(). These methods work exactly like the method Stream.map() that you saw earlier but return a specialized stream instead of a Stream<T>. For example, you can use mapToInt() as follows to calculate the length of the longest employee name:


Stream<Employee> emps = Stream.of(new Employee("Mikalai"), new Employee("Volha"), new Employee("Ivan"));
Stream<String> names = emps.map(e -> e.getName());
IntStream lengths = names.mapToInt(n -> n.length());
int i = lengths.max().getAsInt();
System.out.print(i);

					

Here, the method mapToInt() extracts the length of each name (represented as an int) and returns an IntStream as the result (rather than a Stream<Integer>). You can then call the max() method defined on the IntStream interface to find the length of the longest name. IntStream also supports other convenience methods such as sum(), min(), and average().
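
Because a stream can be consumed only once, calling sum(), min(), and average() separately would require a new stream each time. One way around this, sketched below, is IntStream.summaryStatistics(), a JDK method not mentioned in the text above that computes these values in a single pass. The example works on plain name strings instead of Employee objects to stay self-contained, and the class name is made up.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

class NameLengthStatsDemo {

    // Collects count, sum, min, max and average of the name lengths in one pass.
    static IntSummaryStatistics lengthStats(List<String> names) {
        return names.stream()
                .mapToInt(String::length)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = lengthStats(Arrays.asList("Mikalai", "Volha", "Ivan"));
        System.out.println(stats.getSum());     // 16
        System.out.println(stats.getMin());     // 4
        System.out.println(stats.getMax());     // 7
        System.out.println(stats.getAverage()); // 16 / 3 as a double
    }
}
```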

Stream.flatMap(...)


<R> Stream<R> flatMap(Function<? super T,? extends Stream<? extends R>> mapper)

					

The Stream.flatMap(...) method returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element. The function produces a stream for each input element and the output streams are flattened. In other words, flatMap(...) performs a one-to-many mapping.

The Stream.flatMap(...) operation works as follows:

  • It takes an input stream and produces an output stream using a mapping function.

  • The mapping function takes an element from the input stream and maps the element to a stream. The type of the input element and the type of the elements in the mapped stream may be different. This step produces a stream of streams. Suppose the input stream is a Stream<T>; the result of this step is then a Stream<Stream<R>>, where T and R may be the same or different.

  • Finally, it flattens the output stream (that is, a stream of streams) to produce a stream. That is, the Stream<Stream<R>> is flattened to Stream<R>.

Let's look at a simple example. We have a Stream of lists of names, and we want all the names from these lists in a single sequence. We can solve this problem using an approach like the one in the example below:


List<String> names1 = Arrays.asList("Dzmitry", "John");
List<String> names2 = Arrays.asList("David", "Laura");
Stream<List<String>> s = Stream.of(names1, names2);
s.flatMap(names -> names.stream()).forEach(System.out::println);

					

Output:

Dzmitry
John
David
Laura
					

We replace each List<String> with a Stream<String> using the stream() method, and flatMap does the rest. The functional interface associated with flatMap is the same as for map (Function), but its return type is restricted to streams rather than arbitrary values.

The flatMap(...) transforms each element of a stream into another form (just like map(...)), and generates sub-streams of the newly formed elements. Finally, it flattens all of the sub-streams into a single stream of elements. As flatMap(...) is a map type of function, it also takes a function and applies (maps) that function to each element in the stream.

The difference between map(...) and flatMap(...) is:

  • The map(...) accepts a function that returns a mapped element and then the map(...) function returns a stream of such elements (1 to 1).

  • The flatMap(...) accepts a function that returns a stream of mapped elements, and then flatMap(...) returns a single combined stream of all of the sub-streams created by each execution of the passed function (1 to 0...n).
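
The difference can be sketched side by side. In the code below, map(...) produces exactly one result per input line, while flatMap(...) produces zero or more words per line. The class name, the helper words(...), and the sample lines are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapVsFlatMapDemo {

    // Hypothetical mapping function: an empty line yields no words at all.
    static Stream<String> words(String line) {
        return line.isEmpty() ? Stream.empty() : Arrays.stream(line.split(" "));
    }

    // map: one Integer per input line (1 to 1).
    static List<Integer> lineLengths(List<String> lines) {
        return lines.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    // flatMap: each line contributes 0..n words to a single flattened stream.
    static List<String> allWords(List<String> lines) {
        return lines.stream()
                .flatMap(MapVsFlatMapDemo::words)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Java is great", "", "streams");
        System.out.println(lineLengths(lines)); // [13, 0, 7]
        System.out.println(allWords(lines));    // [Java, is, great, streams]
    }
}
```

Three input lines give three lengths with map(...), but four words with flatMap(...), because the empty line contributes nothing and the first line contributes three elements.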
