Collections are fundamental to many programming tasks as they let you group and process data. Java applications often process collections.
For example:
Despite their importance, processing collections was far from trivial before Java 8.
Typical processing patterns on collections resemble SQL operations such as finding, filtering, or aggregating data.
Most databases let you specify such operations declaratively.
SELECT id, MAX(weight) FROM dogs
We don't need to implement how the maximum weight is calculated; we only state that we want it. We worry less about how to implement such queries explicitly. Why can't we do something similar with collections?
How many times do you find yourself re-implementing these operations using loops over and over again?
How can we process really large collections efficiently? Ideally, to speed up the processing, you want to leverage multicore architectures. However, as we will learn, writing parallel code by hand is hard and error-prone.
Java 8 introduced a new abstraction, the Stream, that lets you process data in a declarative way.
Streams can leverage multicore architectures without you having to write a single line of multithreaded code.
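As a rough illustration of that claim, the sketch below computes the same sum twice, once with stream() and once with parallelStream(). The class name ParallelSketch and the sum-of-squares task are assumptions made for this example; whether the parallel version is actually faster depends on the data size and the machine.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class, not part of the original text.
class ParallelSketch {

    // Sequential pipeline: elements are processed one after another.
    static long sumOfSquares(List<Integer> numbers) {
        return numbers.stream()
                .mapToLong(n -> (long) n * n)
                .sum();
    }

    // Same pipeline, but parallelStream() allows the runtime to split
    // the work across several threads. No thread code is written here.
    static long sumOfSquaresParallel(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToLong(n -> (long) n * n)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 1_000_000)
                .boxed()
                .collect(Collectors.toList());
        System.out.println("Sequential: " + sumOfSquares(numbers));
        System.out.println("Parallel:   " + sumOfSquaresParallel(numbers));
    }
}
```

Both methods return the same value; only the way the work is scheduled differs.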
Let us write a couple of operations on collections of the Dog class below.
public class Dog {
    public String name;
    public int height; // in centimetres
    public int weight; // in kilograms

    public Dog(String name, int height, int weight) {
        this.name = name;
        this.height = height;
        this.weight = weight;
    }

    @Override
    public String toString() {
        return String.format("(%s: %dkg, %dcm)", name, weight, height);
    }
}
Let us print, from a list of dogs, the tallest dog and the average weight of the dogs.
import java.util.List;

public class Main {
    public static void main(String[] args) {
        Dog d1 = new Dog("Max", 50, 8);
        Dog d2 = new Dog("Marley", 60, 10);
        Dog d3 = new Dog("Rocky", 30, 5);
        Dog d4 = new Dog("Bear", 70, 12);
        Dog d5 = new Dog("Luna", 30, 13);
        Dog d6 = new Dog("Luna", 25, 10);
        List<Dog> dogs = List.of(d1, d2, d3, d4, d5, d6);
        System.out.println("Original list: " + dogs);

        // Find the tallest dog with an explicit loop.
        Dog tallest = dogs.get(0);
        for (Dog dog : dogs) {
            if (dog.height > tallest.height) {
                tallest = dog;
            }
        }
        System.out.println("The tallest dog is: " + tallest);

        // Sum the weights, then divide by the number of dogs.
        int sum = 0;
        for (Dog dog : dogs) {
            sum += dog.weight;
        }
        double averageWeight = (double) sum / dogs.size();
        System.out.println("The average weight is: " + averageWeight);
    }
}
This is how you would solve the same problem with streams.
import java.util.Comparator;
import java.util.List;

public class StreamSolution1 {
    public static void main(String[] args) {
        Dog d1 = new Dog("Max", 50, 8);
        Dog d2 = new Dog("Marley", 60, 10);
        Dog d3 = new Dog("Rocky", 30, 5);
        Dog d4 = new Dog("Bear", 70, 12);
        Dog d5 = new Dog("Luna", 30, 13);
        Dog d6 = new Dog("Luna", 25, 10);
        List<Dog> dogs = List.of(d1, d2, d3, d4, d5, d6);
        System.out.println("Original list: " + dogs);

        // max() returns an Optional; get() is safe here because the list is not empty.
        Dog tallest = dogs.stream()
                .max(Comparator.comparingInt(dog -> dog.height))
                .get();
        System.out.println("The tallest dog is: " + tallest);

        // average() returns an OptionalDouble; fall back to 0 for an empty stream.
        double averageWeight = dogs.stream()
                .mapToInt(dog -> dog.weight)
                .average()
                .orElse(0);
        System.out.println("The average weight is: " + averageWeight);
    }
}
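One detail worth noting: calling get() on the Optional returned by max throws a NoSuchElementException when the stream is empty. The sketch below shows one possible way to keep the result as an Optional and let the caller decide what to do. The class EmptySafeSketch, its nested Dog stand-in, and the helper names are assumptions introduced only for this example.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical example class, not part of the original text.
class EmptySafeSketch {

    // Minimal stand-in for the Dog class used in the text.
    static final class Dog {
        final String name;
        final int height;
        final int weight;

        Dog(String name, int height, int weight) {
            this.name = name;
            this.height = height;
            this.weight = weight;
        }
    }

    // The same sample data as in the text.
    static List<Dog> sampleDogs() {
        return List.of(
                new Dog("Max", 50, 8), new Dog("Marley", 60, 10),
                new Dog("Rocky", 30, 5), new Dog("Bear", 70, 12),
                new Dog("Luna", 30, 13), new Dog("Luna", 25, 10));
    }

    // Returns an empty Optional instead of throwing when the list is empty.
    static Optional<Dog> tallest(List<Dog> dogs) {
        return dogs.stream().max(Comparator.comparingInt(dog -> dog.height));
    }

    // Returns 0 when there are no dogs to average over.
    static double averageWeight(List<Dog> dogs) {
        return dogs.stream().mapToInt(dog -> dog.weight).average().orElse(0);
    }

    public static void main(String[] args) {
        String name = tallest(sampleDogs())
                .map(dog -> dog.name)
                .orElse("no dogs");
        System.out.println("The tallest dog is: " + name);
        System.out.println("Tallest of an empty list: " + tallest(List.of()).isPresent());
    }
}
```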
Now, let us produce a list, in alphabetical order, of the names of all dogs that are less than 60 cm tall and weigh more than 5 kg.
If we don't use streams, we could do:
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Main2 {
    public static void main(String[] args) {
        Dog d1 = new Dog("Max", 50, 8);
        Dog d2 = new Dog("Marley", 60, 10);
        Dog d3 = new Dog("Rocky", 30, 5);
        Dog d4 = new Dog("Bear", 70, 12);
        Dog d5 = new Dog("Luna", 30, 13);
        Dog d6 = new Dog("Luna", 25, 10);
        List<Dog> dogs = List.of(d1, d2, d3, d4, d5, d6);
        System.out.println("Original list: " + dogs);

        // A set removes duplicate names.
        Set<String> nameSet = new HashSet<>();
        for (Dog dog : dogs) {
            if (dog.weight > 5 && dog.height < 60) {
                nameSet.add(dog.name);
            }
        }

        // Copy into a list so the names can be sorted.
        List<String> nameList = new ArrayList<>(nameSet);
        nameList.sort(Comparator.naturalOrder());
        System.out.println("Dog names: " + nameList);
    }
}
Using streams, we could do:
import java.util.List;
import java.util.stream.Collectors;

public class StreamSolution2 {
    public static void main(String[] args) {
        Dog d1 = new Dog("Max", 50, 8);
        Dog d2 = new Dog("Marley", 60, 10);
        Dog d3 = new Dog("Rocky", 30, 5);
        Dog d4 = new Dog("Bear", 70, 12);
        Dog d5 = new Dog("Luna", 30, 13);
        Dog d6 = new Dog("Luna", 25, 10);
        List<Dog> dogs = List.of(d1, d2, d3, d4, d5, d6);
        System.out.println("Original list: " + dogs);

        List<String> nameList =
                dogs.stream()
                        .filter(dog -> dog.height < 60 && dog.weight > 5) // keep matching dogs
                        .map(dog -> dog.name)                             // Dog -> name
                        .distinct()                                       // drop duplicate names
                        .sorted()                                         // alphabetical order
                        .collect(Collectors.toList());
        System.out.println("Dog names: " + nameList);
    }
}
Using streams, we can easily filter elements.
Dog[] smallDogs = dogs.stream()
        .filter(dog -> dog.height <= 50)
        .toArray(Dog[]::new);
Sort the elements according to a given criterion.
// The stream still contains Dog objects, so the result must be a Dog[] (not a String[]).
Dog[] sortedDogs = dogs.stream()
        .sorted((o1, o2) -> o1.name.compareToIgnoreCase(o2.name))
        .toArray(Dog[]::new);
Map its elements by applying a function.
String[] dogNames = dogs.stream()
        .map(dog -> dog.name)
        .toArray(String[]::new);
Find an element that fits some criteria.
// Integer.compare avoids the overflow risk of subtracting two ints.
Dog biggestDog = dogs.stream()
        .max((o1, o2) -> Integer.compare(o1.height, o2.height))
        .get();
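Finding an element by an arbitrary condition, rather than by a maximum, can be sketched with filter combined with findFirst, or with anyMatch when only a yes/no answer is needed. The class FindSketch, its nested Dog stand-in, and the method names below are assumptions made for this illustration.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical example class, not part of the original text.
class FindSketch {

    // Minimal stand-in for the Dog class used in the text.
    static final class Dog {
        final String name;
        final int height;
        final int weight;

        Dog(String name, int height, int weight) {
            this.name = name;
            this.height = height;
            this.weight = weight;
        }
    }

    // The same sample data as in the text.
    static List<Dog> sampleDogs() {
        return List.of(
                new Dog("Max", 50, 8), new Dog("Marley", 60, 10),
                new Dog("Rocky", 30, 5), new Dog("Bear", 70, 12),
                new Dog("Luna", 30, 13), new Dog("Luna", 25, 10));
    }

    // Name of the first dog (in list order) heavier than minWeight, if any.
    static Optional<String> firstHeavierThan(List<Dog> dogs, int minWeight) {
        return dogs.stream()
                .filter(dog -> dog.weight > minWeight)
                .map(dog -> dog.name)
                .findFirst();
    }

    // True if at least one dog is shorter than maxHeight.
    static boolean anyShorterThan(List<Dog> dogs, int maxHeight) {
        return dogs.stream().anyMatch(dog -> dog.height < maxHeight);
    }

    public static void main(String[] args) {
        List<Dog> dogs = sampleDogs();
        System.out.println(firstHeavierThan(dogs, 10).orElse("none"));
        System.out.println(anyShorterThan(dogs, 30));
    }
}
```

Both findFirst and anyMatch are short-circuiting: they can stop as soon as a matching element is found.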
Streams differ from collections in several ways.
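No storage: A stream is not a data structure that stores elements; it conveys elements from a source (such as a collection, an array, or a generator function) through a pipeline of operations.
Functional in nature: An operation on a stream produces a result but does not modify its source. Filtering a stream obtained from a list, for example, yields a new stream without the filtered elements rather than removing elements from the list.
Laziness-seeking: Many stream operations, such as filtering and mapping, are evaluated lazily; no work is done until a terminal operation is invoked.
Possibly unbounded: Collections have a finite size, but streams need not. Short-circuiting operations such as limit(n) or findFirst() can allow computations on infinite streams to complete in finite time.
Consumable: The elements of a stream are visited only once during the life of the stream. As with an Iterator, a new stream must be generated to revisit the same elements of the source.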
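A few of these properties can be observed directly. The sketch below is only an illustration; the class StreamPropertiesSketch and its method names are assumptions introduced here, while Stream.iterate, limit, filter, and collect are standard java.util.stream calls.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class, not part of the original text.
class StreamPropertiesSketch {

    // Possibly unbounded: Stream.iterate never ends on its own,
    // but limit(n) makes the computation finish after n elements.
    static List<Integer> firstPowersOfTwo(int n) {
        return Stream.iterate(1, x -> x * 2)
                .limit(n)
                .collect(Collectors.toList());
    }

    // Laziness-seeking: the filter predicate does not run until a terminal
    // operation is invoked. Returns the call counts before and after collect().
    static int[] filterCallsBeforeAndAfter() {
        AtomicInteger calls = new AtomicInteger();
        Stream<String> names = Stream.of("Max", "Luna")
                .filter(name -> {
                    calls.incrementAndGet();
                    return true;
                });
        int before = calls.get();
        names.collect(Collectors.toList());
        int after = calls.get();
        return new int[] {before, after};
    }

    // Consumable: a second terminal operation on the same stream
    // throws IllegalStateException.
    static boolean reuseFails() {
        Stream<String> names = Stream.of("Max", "Luna");
        names.collect(Collectors.toList());
        try {
            names.collect(Collectors.toList());
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5));
        int[] calls = filterCallsBeforeAndAfter();
        System.out.println("Predicate calls before/after: " + calls[0] + "/" + calls[1]);
        System.out.println("Reusing a stream fails: " + reuseFails());
    }
}
```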