11. Streams
Programming Project 2021/22

11.2. Pipelines

Stream operations and pipelines

Stream operations are combined to form stream pipelines.

A stream pipeline consists of the following.

  • One stream source, such as a Collection, a generator function, or an I/O channel;
  • zero or more intermediate operations, such as Stream.filter or Stream.map;
  • one terminal operation, such as Stream.forEach or Stream.reduce.

Intermediate operations

  • return a new stream,
  • are always "lazy": no element is processed until a terminal operation is invoked,
  • e.g. filter(), map(), sorted().

Terminal operations

  • traverse the stream to produce a result or a side effect,
  • are mostly "eager",
  • consume the stream pipeline: once a terminal operation has been performed, the stream can no longer be used,
  • e.g. forEach(), sum(), max().
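
The point about a consumed pipeline can be checked directly. The following is a minimal sketch (the class and method names are illustrative, not part of the course code): the first terminal operation consumes the stream, and a second one on the same stream object throws an IllegalStateException.

```java
import java.util.List;
import java.util.stream.Stream;

class ConsumedStreamDemo {
  // Returns true if a second terminal operation on the same stream fails.
  static boolean reuseFails() {
    Stream<String> names = List.of("Max", "Luna", "Bear").stream();
    long count = names.count(); // first terminal operation: consumes the stream
    try {
      names.forEach(System.out::println); // second terminal operation on the same stream
      return false;
    } catch (IllegalStateException e) {
      return true; // the stream has already been operated upon
    }
  }

  public static void main(String[] args) {
    System.out.println("Reuse fails: " + reuseFails());
  }
}
```

To run several terminal operations over the same data, create a fresh stream from the source each time.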

Here is our previous pipeline.

import java.util.List;
import java.util.stream.Collectors;

public class StreamSolution2 {
  public static void main(String[] args) {
    Dog d1 = new Dog("Max", 50, 8);
    Dog d2 = new Dog("Marley", 60, 10);
    Dog d3 = new Dog("Rocky", 30, 5);
    Dog d4 = new Dog("Bear", 70, 12);
    Dog d5 = new Dog("Luna", 30, 13);
    Dog d6 = new Dog("Luna", 25, 10);

    List<Dog> dogs = List.of(d1, d2, d3, d4, d5, d6);
    System.out.println("Original list: " + dogs);

    List<String> nameList =
        dogs.stream()
            .filter(dog -> dog.height < 60 && dog.weight > 5)
            .map(dog -> dog.name)
            .distinct()
            .sorted()
            .collect(Collectors.toList());

    System.out.println("Dog names: " + nameList);
  }
}

Stream pipelines

[Figure: a Java stream pipeline]

Stream operations

[Figure: taxonomy of stream operations]

Intermediate operations

Processing intermediate operations lazily is efficient because it lets us

  • fuse multiple steps into a single pass on the data, e.g., in a filter-map-sum pipeline, filtering, mapping, and summing are done together;
  • avoid examining all the data when it is not necessary, e.g., to find the first string longer than 1000 characters, it is only necessary to examine strings until you find the desired one.
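
Both effects can be observed with a counter. The sketch below is illustrative (the class name, method names, and sample words are assumptions): peek() counts how many elements are actually pulled through the pipeline, and a length threshold of 3 stands in for the 1000 characters mentioned above.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

class LazyDemo {
  // Counts how many elements are examined before findFirst() is satisfied.
  static int examinedBeforeFirstLong(List<String> words, int minLength) {
    AtomicInteger examined = new AtomicInteger();
    Optional<String> first =
        words.stream()
            .peek(w -> examined.incrementAndGet()) // runs only for elements actually pulled
            .filter(w -> w.length() > minLength)
            .findFirst(); // stops the traversal at the first match
    return examined.get();
  }

  // Without a terminal operation, the intermediate operations never run.
  static int examinedWithoutTerminal(List<String> words) {
    AtomicInteger examined = new AtomicInteger();
    words.stream().peek(w -> examined.incrementAndGet()).filter(w -> !w.isEmpty());
    return examined.get();
  }

  public static void main(String[] args) {
    List<String> words = List.of("a", "bb", "cccc", "dd", "e");
    System.out.println(examinedBeforeFirstLong(words, 3)); // 3: "dd" and "e" are never examined
    System.out.println(examinedWithoutTerminal(words));    // 0: nothing is processed
  }
}
```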

Laziness becomes essential when the input stream is infinite: an eager pipeline would never finish processing its source.
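
As a hedged illustration (the class and method names are made up for this sketch), the pipeline below starts from the infinite source Stream.iterate(1, i -> i + 1) and still terminates, because elements are generated only on demand and limit() stops asking once enough results exist.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfiniteStreamDemo {
  // Returns the squares of the first n even positive integers.
  static List<Integer> firstEvenSquares(int n) {
    return Stream.iterate(1, i -> i + 1) // infinite source: 1, 2, 3, ...
        .filter(i -> i % 2 == 0)         // keep even numbers
        .map(i -> i * i)                 // square them
        .limit(n)                        // stop after n results
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(firstEvenSquares(3)); // [4, 16, 36]
  }
}
```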

Intermediate operations are further divided into stateless and stateful operations.

Stateless operations

  • retain no state from any previously processed element when processing a new element -- each element can be processed independently of operations on other elements,
  • e.g., filter(), map().

Stateful operations

  • incorporate state from previously processed elements when processing new elements,
  • e.g., distinct(), sorted(),
  • and may need to process the entire input before producing a result.
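
The difference shows up in the order in which elements move through a sequential pipeline. In this illustrative sketch (the class name and the "in:"/"out:" log labels are assumptions), the log records when an element enters the pipeline and when it reaches the terminal operation, with and without the stateful sorted() in between.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class StatefulDemo {
  // Records the order in which elements enter the pipeline and reach forEach().
  static List<String> trace(boolean withSorted) {
    List<String> log = new ArrayList<>();
    Stream<String> s = Stream.of("c", "a", "b").peek(x -> log.add("in:" + x));
    if (withSorted) {
      s = s.sorted(); // stateful: must see every element before emitting the first one
    }
    s.forEach(x -> log.add("out:" + x));
    return log;
  }

  public static void main(String[] args) {
    System.out.println(trace(false)); // [in:c, out:c, in:a, out:a, in:b, out:b]
    System.out.println(trace(true));  // [in:c, in:a, in:b, out:a, out:b, out:c]
  }
}
```

Without sorted(), each element passes through the whole pipeline before the next one is read; with sorted(), all elements are buffered first.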

Short-circuiting operations

Short-circuiting operations resemble boolean short-circuit evaluation: they can stop as soon as the outcome is determined, without processing the remaining elements.

An intermediate short-circuiting operation

  • may produce a finite stream when given an infinite input,
  • e.g., limit().

A terminal short-circuiting operation

  • may finish before traversing all elements in the stream,
  • e.g., findFirst(), findAny().
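
Both kinds can be sketched on an infinite stream of powers of two (the class and method names are illustrative): limit() cuts the stream short as an intermediate operation, and findFirst() ends the traversal as a terminal operation.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ShortCircuitDemo {
  // Intermediate short-circuiting: limit() turns an infinite stream into a finite one.
  static List<Integer> firstPowersOfTwo(int n) {
    return Stream.iterate(1, x -> x * 2).limit(n).collect(Collectors.toList());
  }

  // Terminal short-circuiting: findFirst() stops at the first match.
  // Assumes threshold is small enough that a match exists before int overflow.
  static int firstPowerOfTwoAbove(int threshold) {
    return Stream.iterate(1, x -> x * 2)
        .filter(x -> x > threshold)
        .findFirst()
        .get(); // safe here: the filter is guaranteed to match for small thresholds
  }

  public static void main(String[] args) {
    System.out.println(firstPowersOfTwo(5));       // [1, 2, 4, 8, 16]
    System.out.println(firstPowerOfTwoAbove(100)); // 128
  }
}
```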

[Figure: short-circuiting operations]

Reduction operations

A reduction operation takes a sequence of input elements and combines them into a single summary result, for example,

  • finding the sum of a set of numbers: [1, 2, 3, 4, 5] → 1 + 2 + 3 + 4 + 5 = 15,
  • finding the average of a set of numbers: [1, 2, 3, 4, 5] → (1 + 2 + 3 + 4 + 5) / 5 = 3,
  • finding the maximum of a set of numbers: [1, 2, 3, 4, 5] → 5,
  • accumulating elements into a list.


Here is an example of a reduction that sums elements in a list.

int sum = 0;
for (int x : numbers) {
    sum += x;
}

Here is equivalent code using streams:

int sum = numbers.stream().reduce(0, (x,y) -> x+y);

or:

int sum = numbers.stream().reduce(0, Integer::sum);

An advantage of using streams is that the reduction is inherently parallelizable, provided the combining function is associative (as addition is).

int sum = numbers.parallelStream().reduce(0, Integer::sum);
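
A minimal sketch comparing the two variants follows (the class and method names are illustrative, and numbers is assumed to be a List<Integer>). The parallel version may split the list into chunks, reduce each chunk starting from the identity 0, and combine the partial sums; because Integer::sum is associative and 0 is its identity, the result matches the sequential one.

```java
import java.util.List;

class ParallelSumDemo {
  static int sequentialSum(List<Integer> numbers) {
    return numbers.stream().reduce(0, Integer::sum);
  }

  // Same reduction, but the runtime may process chunks of the list concurrently.
  static int parallelSum(List<Integer> numbers) {
    return numbers.parallelStream().reduce(0, Integer::sum);
  }

  public static void main(String[] args) {
    List<Integer> numbers = List.of(1, 2, 3, 4, 5);
    System.out.println(sequentialSum(numbers)); // 15
    System.out.println(parallelSum(numbers));   // 15
  }
}
```

A non-associative function such as subtraction would not be safe here, since the grouping of partial results is not fixed in a parallel stream.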

Mutable reduction accumulates input elements into a mutable result container, such as:

  • a Collection
  • a StringBuilder

Immutable reduction combines input elements into an immutable value, such as:

  • an int
  • a boolean

The streams classes have general reduction operations, which repeatedly apply a combining operation:

  • reduce(): to perform an immutable reduction
  • collect(): to perform a mutable reduction
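
The two operations can be contrasted in a short sketch (the class and method names are illustrative). reduce() produces a new value at each step, whereas the three-argument collect() takes a supplier that creates the container, an accumulator that adds one element to it, and a combiner that merges two containers (used when the stream runs in parallel).

```java
import java.util.ArrayList;
import java.util.List;

class ReduceVsCollect {
  // Immutable reduction: each step yields a new Integer value.
  static int product(List<Integer> xs) {
    return xs.stream().reduce(1, (a, b) -> a * b);
  }

  // Mutable reduction into a StringBuilder: supplier, accumulator, combiner.
  static String concat(List<String> words) {
    return words.stream()
        .collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
        .toString();
  }

  // Mutable reduction into a Collection.
  static List<Integer> copy(List<Integer> xs) {
    return xs.stream().collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
  }

  public static void main(String[] args) {
    System.out.println(product(List.of(1, 2, 3, 4, 5))); // 120
    System.out.println(concat(List.of("a", "b", "c")));  // abc
    System.out.println(copy(List.of(1, 2, 3)));          // [1, 2, 3]
  }
}
```

In everyday code, collect(Collectors.toList()) or collect(Collectors.joining()) usually replaces the three-argument form.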

Common reduction operations are:

  • sum() (available on the primitive streams IntStream, LongStream, and DoubleStream)
  • max()
  • count()
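
As a sketch of these operations on the list [1, 2, 3, 4, 5] (the class and method names are illustrative), the list is first converted to an IntStream with mapToInt(), which provides sum(), average(), and max() directly.

```java
import java.util.List;

class CommonReductions {
  static int sum(List<Integer> xs) {
    return xs.stream().mapToInt(Integer::intValue).sum();
  }

  // average() returns an OptionalDouble; NaN is used here for an empty list.
  static double average(List<Integer> xs) {
    return xs.stream().mapToInt(Integer::intValue).average().orElse(Double.NaN);
  }

  // max() returns an OptionalInt; getAsInt() throws if the list is empty.
  static int max(List<Integer> xs) {
    return xs.stream().mapToInt(Integer::intValue).max().getAsInt();
  }

  static long count(List<Integer> xs) {
    return xs.stream().count();
  }

  public static void main(String[] args) {
    List<Integer> numbers = List.of(1, 2, 3, 4, 5);
    System.out.println(sum(numbers));     // 15
    System.out.println(average(numbers)); // 3.0
    System.out.println(max(numbers));     // 5
    System.out.println(count(numbers));   // 5
  }
}
```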