Java constructs for real-world applications, Part 2

Before you begin

This two-part tutorial is part of the Introduction to Java™ programming series.

Level  Topic                                                                                             Type
101    Set up your Java development environment and learn basic object-oriented programming principles  Tutorial
102    Java language basics                                                                              Tutorial
103    Writing good Java code                                                                            Tutorial
201    Java constructs for real-world applications, Part 1                                               Tutorial
202    Java constructs for real-world applications, Part 2                                               Tutorial

Although the concepts discussed in the individual tutorials are standalone in nature, the hands-on component builds as you progress through the series. I recommend that you review the prerequisites, setup, and series details before proceeding.

You can find the code examples in this tutorial in my GitHub repository. The code in the src tree is where you’ll find the examples from the tutorial (broken out by sections from the tutorial, like Regular Expressions, Generics, and so on). The code in the test tree is where you’ll find the JUnit tests that allow you to run the examples.

Objectives

The Java language is mature and sophisticated enough to help you accomplish nearly any programming task. This tutorial introduces you to additional features of the Java language that you need to handle complex programming scenarios, including:

  • Regular expressions
  • Generics
  • enum types
  • I/O
  • Serialization

Prerequisites

The content of this tutorial is geared toward programmers new to the Java language who are unfamiliar with its more-sophisticated features. The tutorial assumes that you have:

Regular expressions

A regular expression is essentially a pattern that describes a set of strings sharing that pattern. If you’re a Perl programmer, you should feel right at home with the regular expression (regex) pattern syntax in the Java language. If you’re not used to regular expression syntax, however, it can look weird. This section gets you started with using regular expressions in your Java programs.

The Regular Expressions API

Here’s a set of strings that have a few things in common:

  • A string
  • A longer string
  • A much longer string

Note that each of these strings begins with A and ends with string. The Java Regular Expressions API helps you pull out these elements, see the pattern among them, and do interesting things with the information you’ve gleaned.

The Regular Expressions API has three core classes that you use almost all the time:

  • Pattern describes a string pattern.
  • Matcher tests a string to see if it matches the pattern.
  • PatternSyntaxException tells you that something wasn’t acceptable about the pattern that you tried to define.

You’ll begin working on a simple regular-expressions pattern that uses these classes shortly. But first, take a look at the regex pattern syntax.

Note: To run the examples in this section, open test/com/jstevenperry/intro/regex/RegularExpressionTest, right-click on the test method that matches the example you want to run, and select Run As > JUnit Test.

Regex pattern syntax

A regex pattern describes the structure of the string that the expression tries to find in an input string. The pattern syntax can look strange to the uninitiated, but once you understand it, you’ll find it easier to decipher. Table 1 lists some of the most common regex constructs that you use in pattern strings.

Table 1. Common regex constructs
Regex construct    What qualifies as a match
.                  Any character
?                  Zero (0) or one (1) of what came before
*                  Zero (0) or more of what came before
+                  One (1) or more of what came before
[]                 A range of characters or digits
^                  Negation of whatever follows (that is, “not whatever”) when it is the first character inside []
\d                 Any digit (alternatively, [0-9])
\D                 Any nondigit (alternatively, [^0-9])
\s                 Any whitespace character (alternatively, [ \t\n\x0B\f\r])
\S                 Any nonwhitespace character (alternatively, [^\s])
\w                 Any word character (alternatively, [a-zA-Z_0-9])
\W                 Any nonword character (alternatively, [^\w])

The constructs ?, *, and + are called quantifiers, because they quantify what comes before them. Constructs like \d are predefined character classes. Any character that doesn’t have special meaning in a pattern is a literal and matches itself.
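Here’s a minimal sketch that exercises several of these constructs. The class name RegexConstructsSketch and the helper fullyMatches() are hypothetical names chosen for illustration; the only library call is Pattern.matches(), which reports whether the entire input matches the pattern. Note that a backslash in a regex must be doubled inside a Java string literal, so \d is written "\\d".

```java
import java.util.regex.Pattern;

class RegexConstructsSketch {

    // True only when the ENTIRE input matches the regex.
    static boolean fullyMatches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // '?' makes the preceding 'u' optional
        System.out.println(fullyMatches("colou?r", "color"));        // true
        System.out.println(fullyMatches("colou?r", "colour"));       // true
        // '+' needs at least one digit; '*' accepts none
        System.out.println(fullyMatches("\\d+", ""));                // false
        System.out.println(fullyMatches("\\d*", ""));                // true
        // '.' stands for any single character
        System.out.println(fullyMatches("a.c", "abc"));              // true
        // '^' as the first character inside [] negates the class
        System.out.println(fullyMatches("[^0-9]+", "abc"));          // true
        // Predefined classes: word characters, one whitespace, word characters
        System.out.println(fullyMatches("\\w+\\s\\w+", "A string")); // true
    }
}
```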

Pattern matching

Armed with the pattern syntax in Table 1, you can work through the simple example in Listing 1, using the classes in the Java Regular Expressions API.

Listing 1. Pattern matching with regex

    public void matches() {
        Pattern pattern = Pattern.compile("[Aa].*string");
        Matcher matcher = pattern.matcher("A string");
        boolean didMatch = matcher.matches();
        logger.info("Did match using matches() ==> " + didMatch);
        int patternStartIndex = matcher.start();
        logger.info("Pattern Start Index ==> " + patternStartIndex);
        int patternEndIndex = matcher.end();
        logger.info("Pattern End Index ==> " + patternEndIndex);
    }

Note: To run the example above, open the RegularExpressionTest JUnit test class in Eclipse, and run the testMatches() method. You can run any of the examples in this unit using this pattern.

First, Listing 1 creates a Pattern instance by calling compile(), a static method on Pattern, with a string literal representing the pattern you want to match. That literal uses the regex pattern syntax. In this example, the English translation of the pattern is:

Find a string of the form A or a followed by zero or more characters, followed by string.

Methods for matching

Next, Listing 1 calls matcher() on Pattern. That call creates a Matcher instance. The Matcher then searches the string you passed in for matches against the pattern string you used when you created the Pattern.

Every Java language string is an indexed collection of characters, starting with 0 and ending with the string length minus one. The Matcher parses the string, starting at 0, and looks for matches against it. After that process is complete, the Matcher contains information about matches found (or not found) in the input string. You can access that information by calling various methods on Matcher:

  • matches() tells you if the entire input sequence was an exact match for the pattern.
  • start() tells you the index value in the string where the matched string starts.
  • end() tells you the index value in the string where the matched string ends, plus one.

Listing 1 finds a single match starting at 0 and ending at 7. Thus, the call to matches() returns true, the call to start() returns 0, and the call to end() returns 8.

lookingAt() versus matches()

If your input string contains more text than the pattern accounts for, you can use lookingAt() instead of matches(). Like matches(), lookingAt() starts at the beginning of the input, but it doesn’t require the entire input to match; it succeeds if the pattern matches a prefix of the string. For example, consider the following string:

a string with more than just the pattern.

If you search this string for a.*string, you get a match if you use lookingAt(). But if you use matches(), it returns false, because there’s more to the string than what’s in the pattern.
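The difference can be sketched as follows. The class and method names are hypothetical; matches(), lookingAt(), and find() are the actual Matcher methods. The sketch also includes find(), which looks for a match anywhere in the input, to contrast it with lookingAt(), which is anchored at the start.

```java
import java.util.regex.Pattern;

class LookingAtSketch {

    // The whole input must match the pattern.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // The pattern must match starting at index 0, but may stop early.
    static boolean matchesPrefix(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    // The pattern may match anywhere in the input.
    static boolean matchesAnywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String regex = "a.*string";
        String input = "a string with more than just the pattern.";
        System.out.println(matchesWhole(regex, input));    // false
        System.out.println(matchesPrefix(regex, input));   // true
        System.out.println(matchesAnywhere(regex, input)); // true

        // When the match does not start at index 0, only find() succeeds.
        String shifted = "Here is a string";
        System.out.println(matchesPrefix(regex, shifted));   // false
        System.out.println(matchesAnywhere(regex, shifted)); // true
    }
}
```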

To see this for yourself, open the code for this tutorial from my GitHub repo in Eclipse, and then run the RegularExpressionTest JUnit test.

Complex patterns in regex

Simple searches are easy with the regex classes, but you can also do highly sophisticated things with the Regular Expressions API.

Wikis rely heavily on regular expressions. Wiki content is based on string input from users, which is parsed and formatted using regular expressions. Any user can create a link to another topic in a wiki by entering a wiki word, which is typically a series of concatenated words, each of which begins with an uppercase letter, like this:

MyWikiWord

Suppose a user inputs the following string:

Here is a WikiWord followed by AnotherWikiWord, then SomeWikiWord.

You could search for wiki words in this string with a regex pattern like this:

[A-Z][a-z]*([A-Z][a-z]*)+

And here’s code to search for wiki words:

    public void wikiWord() {
        String input = "Here is a WikiWord followed by AnotherWikiWord, then SomeWikiWord.";
        Pattern pattern = Pattern.compile("[A-Z][a-z]*([A-Z][a-z]*)+");
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            logger.info("Found this wiki word: " + matcher.group());
        }
    }

Run this code (using the testWikiWord() method in RegularExpressionTest), and you can see the three wiki words in your console.

Replacing strings

Searching for matches is useful, but you also can manipulate strings after you find a match for them. You can do that by replacing matched strings with something else, just as you might search for text in a word-processing program and replace it with other text. Matcher has a couple of methods for replacing string elements:

  • replaceAll() replaces all matches with a specified string.
  • replaceFirst() replaces only the first match with a specified string.

Using Matcher’s replace methods is straightforward:

    public void replace() {
        String input = "Here is a WikiWord followed by AnotherWikiWord, then SomeWikiWord.";
        Pattern pattern = Pattern.compile("[A-Z][a-z]*([A-Z][a-z]*)+");
        Matcher matcher = pattern.matcher(input);
        logger.info("Before: " + input);
        String result = matcher.replaceAll("replacement");
        logger.info("After: " + result);
    }

    public void replaceFirst() {
        String input = "Here is a WikiWord followed by AnotherWikiWord, then SomeWikiWord.";
        Pattern pattern = Pattern.compile("[A-Z][a-z]*([A-Z][a-z]*)+");
        Matcher matcher = pattern.matcher(input);
        logger.info("Before: " + input);
        String result = matcher.replaceFirst("replacement");
        logger.info("After: " + result);
    }

This code finds wiki words, as before. When the Matcher finds a match, it replaces the wiki word text with its replacement. When you run the testReplace() method, you can see the following on your console:

Jul 18, 2020 11:43:14 AM com.jstevenperry.intro.regex.RegularExpression replace
INFO: Before: Here is a WikiWord followed by AnotherWikiWord, then SomeWikiWord.
Jul 18, 2020 11:43:14 AM com.jstevenperry.intro.regex.RegularExpression replace
INFO: After: Here is a replacement followed by replacement, then replacement.

If you had used replaceFirst(), you would have seen this (run testReplaceFirst() to see it):

Jul 18, 2020 11:44:46 AM com.jstevenperry.intro.regex.RegularExpression replaceFirst
INFO: Before: Here is a WikiWord followed by AnotherWikiWord, then SomeWikiWord.
Jul 18, 2020 11:44:46 AM com.jstevenperry.intro.regex.RegularExpression replaceFirst
INFO: After: Here is a replacement followed by AnotherWikiWord, then SomeWikiWord.

Matching and manipulating groups

When you search for matches against a regex pattern, you can get information about what you found. You’ve seen some of that capability with the start() and end() methods on Matcher. But it’s also possible to reference matches by capturing groups.

In each pattern, you typically create groups by enclosing parts of the pattern in parentheses. Groups are numbered from left to right, starting with 1 (group 0 represents the entire match). The code in Listing 2 replaces each wiki word with a string that “wraps” the word.

Listing 2. Matching groups

    public void matchingGroups() {
        String input = "Here is a WikiWord followed by AnotherWikiWord, then SomeWikiWord.";
        Pattern pattern = Pattern.compile("[A-Z][a-z]*([A-Z][a-z]*)+");
        Matcher matcher = pattern.matcher(input);
        Logger.getAnonymousLogger().info("Before: " + input);
        String result = matcher.replaceAll("blah$0blah");
        Logger.getAnonymousLogger().info("After: " + result);
    }

Run the Listing 2 code, and you get the following console output:

Jul 18, 2020 11:29:17 AM com.jstevenperry.intro.regex.RegularExpression matchingGroups
INFO: Before: Here is a WikiWord followed by AnotherWikiWord, then SomeWikiWord.
Jul 18, 2020 11:29:17 AM com.jstevenperry.intro.regex.RegularExpression matchingGroups
INFO: After: Here is a blahWikiWordblah followed by blahAnotherWikiWordblah, then blahSomeWikiWordblah.

Listing 2 references the entire match by including $0 in the replacement string. Any portion of a replacement string of the form $int refers to the group identified by the integer (so $1 refers to group 1, and so on). In other words, $0 is equivalent to matcher.group(0).
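To see numbered groups other than $0 at work, here’s a small sketch. The pattern (\w+) (\w+), the class name, and the method names are assumptions made for illustration and are not part of the tutorial’s repository. Group 1 captures the first word of a pair and group 2 captures the second.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CapturingGroupsSketch {

    // Two capturing groups: (first word) space (second word)
    private static final Pattern TWO_WORDS = Pattern.compile("(\\w+) (\\w+)");

    // Swaps each pair of words by referring to the groups as $2 and $1.
    static String swapWordPairs(String input) {
        return TWO_WORDS.matcher(input).replaceAll("$2 $1");
    }

    // Returns the requested group of the first match, or null if none.
    static String groupOfFirstMatch(String input, int groupNumber) {
        Matcher matcher = TWO_WORDS.matcher(input);
        return matcher.find() ? matcher.group(groupNumber) : null;
    }

    public static void main(String[] args) {
        String input = "A longer string";
        System.out.println(groupOfFirstMatch(input, 0)); // A longer
        System.out.println(groupOfFirstMatch(input, 1)); // A
        System.out.println(groupOfFirstMatch(input, 2)); // longer
        System.out.println(swapWordPairs(input));        // longer A string
    }
}
```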

You could accomplish the same replacement goal by using other methods. Rather than calling replaceAll(), you could do this:

    public void matchingGroupsWithStringBuffer() {
        String input = "Here is a WikiWord followed by AnotherWikiWord, then SomeWikiWord.";
        Pattern pattern = Pattern.compile("[A-Z][a-z]*([A-Z][a-z]*)+");
        Matcher matcher = pattern.matcher(input);
        StringBuffer buffer = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(buffer, "blah$0blah");
        }
        matcher.appendTail(buffer);
        logger.info("After: " + buffer.toString());
    }

And you’d get the same result:

Jul 18, 2020 11:31:59 AM com.jstevenperry.intro.regex.RegularExpression matchingGroupsWithStringBuffer
INFO: After: Here is a blahWikiWordblah followed by blahAnotherWikiWordblah, then blahSomeWikiWordblah.

Generics

The introduction of generics in JDK 5.0 (released in 2004) marked a huge leap forward for the Java language. If you’ve used C++ templates, you’ll find that generics in the Java language are similar but not exactly the same. If you haven’t used C++ templates, don’t worry: This section offers a high-level introduction to generics in the Java language.

Note: To run the examples in this section, open test/com/jstevenperry/intro/generics/GenericsTest, right-click on the test method that matches the example you want to run, and choose Run As > JUnit Test.

What are generics?

When JDK 5.0 introduced generic types (generics) and the associated syntax into the Java language, some then-familiar JDK classes were replaced with their generic equivalents. Generics is a compiler mechanism whereby you can create (and use) types of things (such as classes or interfaces, and methods) in a generic fashion by harvesting the common code and parameterizing (or templatizing) the rest. This approach to programming is called generic programming.

Generics in action

To see what a difference generics makes, consider the example of a class that has been in the JDK for a long time: java.util.ArrayList, which is a List of Objects that’s backed by an array.

Listing 3 shows how java.util.ArrayList is instantiated.

Listing 3. Instantiating ArrayList

    public void listing3() {
        ArrayList arrayList = new ArrayList();
        arrayList.add("A String");
        arrayList.add(Integer.valueOf(10));
        arrayList.add("Another String");
        // So far, so good.
        log.info("Added " + arrayList.size() + " objects to arrayList");
    }

As you can see, the ArrayList is heterogeneous: It contains two String types and one Integer type. Before JDK 5.0, the Java language had nothing to constrain this behavior, which caused many coding mistakes. In Listing 3, for example, everything is looking good so far, other than the warnings about ArrayList being a raw type (more on that later). But what about accessing the elements of the ArrayList, which Listing 4 tries to do?

Listing 4. Attempt to access elements in ArrayList

    public void listing4() {
        ArrayList arrayList = new ArrayList();
        arrayList.add("A String");
        arrayList.add(Integer.valueOf(10)); // This feels wrong somehow...
        arrayList.add("Another String");
        log.info("Added " + arrayList.size() + " objects to arrayList");
        // So far, so good.
        for (int aa = 0; aa < arrayList.size(); aa++) {
            String s = (String) arrayList.get(aa);
            log.info("String from ArrayList" + s);
        }
    }

Without prior knowledge of what’s in the ArrayList, you must either check the element that you want to access to see if you can handle its type, or face a possible ClassCastException.
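The “check the element first” option might look like the following sketch. The class RawListSketch and its methods are hypothetical; the point is only that each element of a raw ArrayList comes back as Object, so the code must test it with instanceof before casting.

```java
import java.util.ArrayList;
import java.util.List;

class RawListSketch {

    // Builds the same heterogeneous raw list used in Listings 3 and 4.
    @SuppressWarnings({ "rawtypes", "unchecked" })
    static ArrayList mixedList() {
        ArrayList arrayList = new ArrayList();
        arrayList.add("A String");
        arrayList.add(Integer.valueOf(10));
        arrayList.add("Another String");
        return arrayList;
    }

    // Keeps only the elements that really are Strings, skipping the rest.
    @SuppressWarnings("rawtypes")
    static List<String> stringsOnly(ArrayList rawList) {
        List<String> result = new ArrayList<>();
        for (int aa = 0; aa < rawList.size(); aa++) {
            Object element = rawList.get(aa);
            if (element instanceof String) {
                result.add((String) element); // safe: type was checked
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(stringsOnly(mixedList())); // [A String, Another String]
    }
}
```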

With generics, you can specify the type of item that goes in the ArrayList. Listing 5 shows how, and what happens if you try to add an object of the wrong type. (Spoiler alert: The code will not even compile.)

Listing 5. A second attempt, using generics

    public void listing5() {
        ArrayList<String> arrayList = new ArrayList<>();
        arrayList.add("A String");
        // TODO: Uncomment next line to see compile error
        // arrayList.add(Integer.valueOf(10));// compiler error!
        arrayList.add("Another String");
        log.info("Added " + arrayList.size() + " objects to arrayList");
        // So far, so good.
        // Process the ArrayList
        for (int aa = 0; aa < arrayList.size(); aa++) {
            String s = arrayList.get(aa); // No cast necessary
            log.info("String from ArrayList" + s);
        }
    }

Iterating with generics

Alongside generics, JDK 5.0 added special syntax, the enhanced for loop, for dealing with entities, such as Lists, that you commonly want to step through element by element. If you want to iterate through ArrayList, for instance, you could rewrite the code from Listing 5 like so:

    public void iteratingWithGenerics() {
        ArrayList<String> arrayList = new ArrayList<>();
        arrayList.add("A String");
        // TODO: Uncomment next line to see compile error
        // arrayList.add(Integer.valueOf(10));// compiler error!
        arrayList.add("Another String");
        log.info("Added " + arrayList.size() + " objects to arrayList");
        // So far, so good.
        // Process the ArrayList
        processArrayList(arrayList);
    }

The enhanced for loop syntax, for (String s : arrayList), works for any type of object that is Iterable (that is, implements the Iterable interface).
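The body of processArrayList() isn’t shown above, so here is a hypothetical stand-in (the class name and the describeEach() method are assumptions, not the repository’s code). It accepts any Iterable<String>, which means the same loop handles an ArrayList, a List returned by Arrays.asList(), or a Set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IterationSketch {

    // Visits every element with the enhanced for loop; no index, no cast.
    static List<String> describeEach(Iterable<String> items) {
        List<String> lines = new ArrayList<>();
        for (String s : items) {
            lines.add("String from list: " + s);
        }
        return lines;
    }

    public static void main(String[] args) {
        ArrayList<String> arrayList = new ArrayList<>();
        arrayList.add("A String");
        arrayList.add("Another String");
        // Prints each description on its own line
        for (String line : describeEach(arrayList)) {
            System.out.println(line);
        }
        // Any other Iterable<String> works the same way
        System.out.println(describeEach(Arrays.asList("x", "y")).size()); // 2
    }
}
```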

Parameterized classes

Parameterized classes shine when it comes to collections, so that’s the context for the following examples. Consider the List interface, which represents an ordered collection of objects. In the most common use case, you add items to the List and then access those items either by index or by iterating over the List.

If you’re thinking about parameterizing a class, consider if the following criteria apply:

  • A core class is at the center of some kind of wrapper: The “thing” at the center of the class might apply widely, and the features (attributes, for example) surrounding it are identical.
  • The behavior is common: You do pretty much the same operations regardless of the “thing” at the center of the class.

Applying these two criteria, you can see that a collection fits the bill:

  • The “thing” is the class of which the collection is composed.
  • The operations (such as add, remove, size, and clear) are pretty much the same regardless of the object of which the collection is composed.

A parameterized List

In generics syntax, the code to create a List looks like this:

List<E> listReference = new concreteListClass<E>();

The E, which stands for Element, is the “thing” I mentioned earlier. The concreteListClass is the class from the JDK that you’re instantiating. The JDK includes several List<E> implementations, but this tutorial uses ArrayList<E>. Another way you might see a generic class discussed is Class<T>, where T stands for Type. When you see E in Java code, it’s usually referring to a collection of some kind. And when you see T, it’s denoting a parameterized class.

So, to create an ArrayList of, say, java.lang.Integer, you do this:

List<Integer> listOfIntegers = new ArrayList<Integer>();

SimpleList: A parameterized class

Now suppose you want to create your own parameterized class called SimpleList, with three methods:

  • add() adds an element to the end of the SimpleList.
  • size() returns the current number of elements in the SimpleList.
  • clear() completely clears the contents of the SimpleList.

Listing 6 shows the syntax to parameterize SimpleList.

Listing 6. Parameterizing SimpleList

package com.jstevenperry.intro.generics;

import java.util.ArrayList;
import java.util.List;

public class SimpleList<E> {
    private List<E> backingStore;

    public SimpleList() {
        backingStore = new ArrayList<E>();
    }

    public E add(E e) {
        if (backingStore.add(e))
            return e;
        else
            return null;
    }

    public int size() {
        return backingStore.size();
    }

    public void clear() {
        backingStore.clear();
    }
}

SimpleList can be parameterized with any Object subclass. To create and use a SimpleList of, say, java.math.BigDecimal objects, you might do this (the code below is from SimpleListTest):

    void testAdd_BigDecimal() {
        SimpleList<BigDecimal> simpleList = new SimpleList<>();
        simpleList.add(BigDecimal.ONE);
        assertEquals(1, simpleList.size());
        log.info("SimpleList size is " + simpleList.size());

        simpleList.add(BigDecimal.ZERO);
        assertEquals(2, simpleList.size());
        log.info("SimpleList size is " + simpleList.size());

        simpleList.clear();
        assertEquals(0, simpleList.size());
        log.info("SimpleList size is " + simpleList.size());
    }

And you would get this output:

Jul 18, 2020 11:57:02 AM com.jstevenperry.intro.generics.SimpleListTest testAdd_BigDecimal
INFO: SimpleList size is 1
Jul 18, 2020 11:57:02 AM com.jstevenperry.intro.generics.SimpleListTest testAdd_BigDecimal
INFO: SimpleList size is 2
Jul 18, 2020 11:57:02 AM com.jstevenperry.intro.generics.SimpleListTest testAdd_BigDecimal
INFO: SimpleList size is 0

Parameterized methods

At times, you might not want to parameterize your entire class, but only one or two methods. In this case, you create a generic method. Consider the example in Listing 7, where the method formatArray is used to create a string representation of the contents of an array.

Listing 7. A generic method

    public <E> String formatArray(E[] arrayToFormat) {
        StringBuilder sb = new StringBuilder();

        int index = 0;
        for (E element : arrayToFormat) {
            if (index > 0)
                sb.append(", ");
            sb.append("Element ");
            sb.append(index++);
            sb.append(" => ");
            sb.append(element);
        }

        return sb.toString();
    }

Rather than parameterize MyClass, you make generic just the one method that you want to use to create a consistent string representation that works for any element type.
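Here’s a sketch of calling the generic method with two different element types. The method body is copied from Listing 7 but made static so that the example is self-contained; the wrapper class FormatArraySketch is a hypothetical name. The compiler infers E from the argument, and because E must be a reference type, the array is Integer[] rather than int[].

```java
class FormatArraySketch {

    // Same logic as Listing 7, declared static for a standalone example.
    static <E> String formatArray(E[] arrayToFormat) {
        StringBuilder sb = new StringBuilder();

        int index = 0;
        for (E element : arrayToFormat) {
            if (index > 0)
                sb.append(", ");
            sb.append("Element ");
            sb.append(index++);
            sb.append(" => ");
            sb.append(element);
        }

        return sb.toString();
    }

    public static void main(String[] args) {
        // E is inferred as Integer
        System.out.println(formatArray(new Integer[] { 10, 20 }));
        // prints: Element 0 => 10, Element 1 => 20

        // E is inferred as String
        System.out.println(formatArray(new String[] { "A", "B", "C" }));
        // prints: Element 0 => A, Element 1 => B, Element 2 => C
    }
}
```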

In practice, you’ll find yourself using parameterized classes and interfaces far more often than methods, but now you know that the capability is available if you need it.

enum types

In JDK 5.0, a new data type was added to the Java language, called enum (not to be confused with java.util.Enumeration). The enum type represents a set of constant objects that are all related to a particular concept, each of which represents a different constant value in that set.

Note: To run the examples in this section, open test/com/jstevenperry/intro/enumtypes/GenderTest, right-click on the test method that matches the example you want to run, and choose Run As > JUnit Test.

Before enum was introduced into the language, you would have defined a set of constant values for a concept (say, gender) like so:

public class Person {

    public static final String MALE = "male";
    public static final String FEMALE = "female";
    public static final String PREFER_NOT_TO_SAY = "preferNotToSay";

}

Any code that needed to reference that constant value would have been written something like this:

public void myMethod() {
  //. . .
  String genderMale = Person.MALE;
  //. . .
}

Defining constants with enum

Using the enum type makes defining constants much more formal — and more powerful. Here’s the enum definition for Gender:

public enum Gender {
  MALE,
  FEMALE,
  PREFER_NOT_TO_SAY
}

This example only scratches the surface of what you can do with enums. In fact, enums are much like classes, so they can have constructors, attributes, and methods:

package com.jstevenperry.intro.enumtypes;

public enum Gender {

    MALE("Male"), FEMALE("Female"), PREFER_NOT_TO_SAY("PreferNotToSay");

    private String displayName;

    private Gender(String displayName) {
        this.displayName = displayName;
    }

    public String getDisplayName() {
        return this.displayName;
    }
}

One difference between a class and an enum is that an enum’s constructor is always private (you can omit the modifier, but you cannot declare the constructor public or protected). Also, an enum cannot extend (or inherit from) another class or enum. However, an enum can implement an interface.
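The following sketch shows a few things that every enum gets for free: values() to list the constants, valueOf() to look one up by name, and safe comparison with ==. The wrapper class EnumUsageSketch and the helpers parse() and listDisplayNames() are hypothetical; the Gender enum is nested here only to keep the example in one file.

```java
import java.util.Locale;

class EnumUsageSketch {

    enum Gender {
        MALE("Male"), FEMALE("Female"), PREFER_NOT_TO_SAY("PreferNotToSay");

        private final String displayName;

        Gender(String displayName) { // implicitly private
            this.displayName = displayName;
        }

        String getDisplayName() {
            return displayName;
        }
    }

    // Looks up a constant by name; falls back when the name is unknown.
    static Gender parse(String text) {
        try {
            return Gender.valueOf(text.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Gender.PREFER_NOT_TO_SAY;
        }
    }

    // Walks all constants in declaration order.
    static String listDisplayNames() {
        StringBuilder sb = new StringBuilder();
        for (Gender gender : Gender.values()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(gender.getDisplayName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(parse("female") == Gender.FEMALE); // true
        System.out.println(parse("unknown"));                 // PREFER_NOT_TO_SAY
        System.out.println(listDisplayNames());               // Male, Female, PreferNotToSay
    }
}
```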

An enum implementing an interface

Suppose you define an interface, Displayable:

package com.jstevenperry.intro.enumtypes;

public interface Displayable {
    String getDisplayName();
}

Your Gender enum (and any other enum that needs to produce a friendly display name) could implement this interface, like so:

package com.jstevenperry.intro.enumtypes;

public enum Gender implements Displayable {

    MALE("Male"), FEMALE("Female"), PREFER_NOT_TO_SAY("PreferNotToSay");

    private String displayName;

    private Gender(String displayName) {
        this.displayName = displayName;
    }

    public String getDisplayName() {
        return this.displayName;
    }
}
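The payoff of the interface is that calling code can treat different enums uniformly. In the sketch below, the Level enum and the label() method are hypothetical additions for illustration; Displayable and Gender mirror the definitions above but are nested in one class so that the example compiles on its own.

```java
class DisplayableSketch {

    interface Displayable {
        String getDisplayName();
    }

    enum Gender implements Displayable {
        MALE("Male"), FEMALE("Female"), PREFER_NOT_TO_SAY("PreferNotToSay");

        private final String displayName;

        Gender(String displayName) {
            this.displayName = displayName;
        }

        public String getDisplayName() {
            return displayName;
        }
    }

    // Hypothetical second enum that also needs a friendly name
    enum Level implements Displayable {
        BEGINNER("Beginner"), ADVANCED("Advanced");

        private final String displayName;

        Level(String displayName) {
            this.displayName = displayName;
        }

        public String getDisplayName() {
            return displayName;
        }
    }

    // Works with any Displayable, whichever enum it comes from.
    static String label(Displayable item) {
        return "[" + item.getDisplayName() + "]";
    }

    public static void main(String[] args) {
        System.out.println(label(Gender.FEMALE));  // [Female]
        System.out.println(label(Level.BEGINNER)); // [Beginner]
    }
}
```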

I/O

This section is an overview of the java.io package. You learn to use some of its tools to collect and manipulate data from various sources.

Note: To run the examples in this section, open test/com/jstevenperry/intro/io/InputOutputTest, right-click on the test method that matches the example you want to run, and choose Run As > JUnit Test.

Working with external data

More often than not, the data you use in your Java programs comes from an external data source, such as a database, direct byte transfer over a socket, or file storage. Most of the Java tools for collecting and manipulating external data are in the java.io package.

Files

Of all the data sources available to your Java applications, files are the most common and often the most convenient. If you want to read a file in your application, you must use streams that parse its incoming bytes into Java language types.

java.io.File is a class that defines a resource on your file system and represents that resource in an abstract way. Creating a File object is easy:

        File file = new File("/some/directory/tree/temp.txt");

The File constructor takes the path of the file that the object represents. The call above creates a File object that refers to temp.txt in the specified directory; it does not create anything on disk. You can pass any String to the constructor of File, provided that it’s a valid file name for your operating system, whether or not the file that it references even exists.
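A quick way to convince yourself that constructing a File object touches nothing on disk is the sketch below. The class name, method name, and generated file name are assumptions made for illustration; the file is placed in the directory named by the java.io.tmpdir system property and is deleted again at the end.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class FileObjectSketch {

    // Returns { exists right after "new File", exists after createNewFile() }
    static boolean[] existsBeforeAndAfterCreate() {
        File file = new File(System.getProperty("java.io.tmpdir"),
                "file-object-sketch-" + System.nanoTime() + ".txt");

        // Only an in-memory object so far; nothing has been written to disk.
        boolean existsAfterConstructor = file.exists();

        try {
            file.createNewFile(); // this is the call that creates the file
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        boolean existsAfterCreate = file.exists();

        file.delete(); // clean up the temporary file
        return new boolean[] { existsAfterConstructor, existsAfterCreate };
    }

    public static void main(String[] args) {
        boolean[] result = existsBeforeAndAfterCreate();
        System.out.println("After constructor: " + result[0]); // false
        System.out.println("After createNewFile(): " + result[1]); // true
    }
}
```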

This code asks the newly created File object if the file exists:

   public File createFileUnlessItExists(String filename) throws IOException {

        File file = new File(filename);
        if (file.exists()) {
            // File exists. Process it...
            log.info("File '" + file.getName() + "' exists. Cannot create it.");
        } else {
            // File doesn't exist. Create it...
            log.info("Creating file '" + file.getPath() + "'.");
            file.createNewFile();
        }

        return file;

    }

java.io.File has some other handy methods that you can use to:

  • Delete files
  • Create directories (by passing a directory name as the argument to File’s constructor and then calling mkdir() or mkdirs())
  • Determine if a resource is a file or a directory
  • More
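The methods behind that list can be sketched as follows. The class name, method name, and the generated directory name are hypothetical; mkdir(), createNewFile(), isDirectory(), isFile(), and delete() are real java.io.File methods. The sketch works inside the java.io.tmpdir directory and removes what it creates.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class FileMethodsSketch {

    // Returns { dir created, file created, dir.isDirectory(),
    //           file.isFile(), non-empty dir deleted }
    static boolean[] directoryTour() {
        File dir = new File(System.getProperty("java.io.tmpdir"),
                "file-methods-sketch-" + System.nanoTime());
        File file = new File(dir, "temp.txt");
        try {
            boolean dirCreated = dir.mkdir();           // create the directory
            boolean fileCreated = file.createNewFile(); // create a file in it
            boolean dirIsDirectory = dir.isDirectory();
            boolean fileIsFile = file.isFile();
            // delete() refuses to remove a directory that still has contents
            boolean nonEmptyDirDeleted = dir.delete();
            return new boolean[] { dirCreated, fileCreated, dirIsDirectory,
                    fileIsFile, nonEmptyDirDeleted };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Clean up: the file first, then the now-empty directory
            file.delete();
            dir.delete();
        }
    }

    public static void main(String[] args) {
        boolean[] r = directoryTour();
        System.out.println("Directory created: " + r[0]);         // true
        System.out.println("File created: " + r[1]);              // true
        System.out.println("isDirectory(): " + r[2]);             // true
        System.out.println("isFile(): " + r[3]);                  // true
        System.out.println("Deleted non-empty directory: " + r[4]); // false
    }
}
```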

The main action of Java I/O is in writing to and reading from data sources, which is where streams come in.

Using Java I/O streams

Note: Java I/O streams have been around for a long time. Don’t confuse them with the Java Streams API, which I’ll cover later in this unit. The two are very different. I’ll refer to the former as “Java I/O Streams” or “I/O Streams” and the latter (even though a relative newcomer) as “streams” because that’s how you’re most likely to see it.

You can access files on the file system by using Java I/O streams. At the lowest level, I/O streams enable a program to receive bytes from a source or to send output to a destination. Some I/O streams handle all kinds of 16-bit characters (Reader and Writer types). Others handle only 8-bit bytes (InputStream and OutputStream types). Within these hierarchies are several flavors of I/O streams, all found in the java.io package.

Byte streams read (InputStream and subclasses) and write (OutputStream and subclasses) 8-bit bytes. In other words, a byte stream can be considered a more raw type of I/O stream. Here’s a summary of two common byte streams and their usage:

  • FileInputStream / FileOutputStream: Reads bytes from a file, writes bytes to a file.
  • ByteArrayInputStream / ByteArrayOutputStream: Reads bytes from an in-memory array, writes bytes to an in-memory array.
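As a minimal sketch of byte streams in action, the following copies bytes from a ByteArrayInputStream to a ByteArrayOutputStream one byte at a time. The class and method names are hypothetical. The same read-until-minus-one loop applies to FileInputStream and FileOutputStream; in-memory streams are used here only so that the example needs no files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class ByteStreamSketch {

    // Copies every byte from an input stream to an output stream.
    static byte[] copy(byte[] source) {
        try (ByteArrayInputStream in = new ByteArrayInputStream(source);
                ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            int b;
            // read() returns the next byte as an int from 0 to 255,
            // or -1 when the end of the stream is reached
            while ((b = in.read()) != -1) {
                out.write(b);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = { 72, 105, -1 };
        byte[] copied = copy(original);
        System.out.println(Arrays.equals(original, copied)); // true
    }
}
```

Because read() reports a byte as an int in the range 0 to 255, a byte whose value is -1 arrives as 255 and is never mistaken for the end-of-stream marker.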

Character I/O streams

Character streams read (Reader and its subclasses) and write (Writer and its subclasses) 16-bit characters. Here’s a selected listing of character streams and their usage:

  • StringReader / StringWriter: Read and write characters to and from Strings in memory.
  • InputStreamReader / OutputStreamWriter (and subclasses FileReader / FileWriter): Act as a bridge between byte streams and character streams. The Reader flavors read bytes from a byte stream and convert them to characters. The Writer flavors convert characters to bytes to put them on byte streams.
  • BufferedReader / BufferedWriter: Buffer data while reading or writing another stream, making read and write operations more efficient.
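The bridge role of InputStreamReader can be sketched like this. The class and method names are hypothetical, and the choice of UTF-8 is an assumption; if you omit the charset argument, InputStreamReader uses the platform default encoding instead. The input is six bytes long, but the reader delivers five characters, because é occupies two bytes in UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ByteToCharBridgeSketch {

    // Decodes UTF-8 bytes into a String, one character at a time.
    static String decodeUtf8(byte[] bytes) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            int c;
            // read() returns a character as an int, or -1 at end of stream
            while ((c = reader.read()) != -1) {
                text.append((char) c); // cast back to char before appending
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(bytes.length);                // 6
        System.out.println(decodeUtf8(bytes).length());  // 5
    }
}
```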

Rather than try to cover I/O streams in their entirety, I’ll focus here on the recommended I/O streams for reading and writing files. In most cases, these are character streams.

Reading from a File

You can read from a File in several ways. Arguably the simplest approach is to:

  1. Create an InputStreamReader on the File you want to read from.
  2. Call read() to read one character at a time until you reach the end of the file.

Listing 8 is an example of reading from a File:

Listing 8. Reading from a File

    public List<String> readWordsUnbufferedStream(String wordsFilename) {
        long startTime = System.currentTimeMillis();

        // Return value: list of strings
        List<String> ret = new ArrayList<>();

        File wordsFile = new File(wordsFilename);

        int numberOfWords = 0;
        try (InputStreamReader reader = new InputStreamReader(new FileInputStream(wordsFile))) {
            boolean done = false;
            // While there is more in the file to read
            while (!done) {
                int charRead = reader.read();
                if (charRead == -1) {
                    done = true;
                } else {
                    StringBuilder word = new StringBuilder();
                    // Read the current word (file has one word per line)
                    while (charRead != -1 && charRead != '\n' && charRead != '\r') {
                        word.append((char) charRead); // Cast needed: append(int) would add the numeric code instead
                        charRead = reader.read();
                    }
                    ret.add(word.toString());
                    numberOfWords++;
                }
            }
        } catch (IOException e) {
            log.log(Level.SEVERE, "IOException occurred, message = " + e.getLocalizedMessage(), e);
        }

        log.info("Read " + numberOfWords + " words in " + Long.toString(System.currentTimeMillis() - startTime) + "ms");

        return ret;

    }

Note: To see the code above in action, open InputOutputTest and run the testReadWordsUnbufferedStream method in Eclipse.

Writing to a File

As with reading from a File, you have several ways to write to a File. Once again, I go with the simplest approach:

  1. Create an OutputStreamWriter on a FileOutputStream for the File you want to write to.
  2. Call write() to write the character sequence.

Listing 9 is an example of writing to a File:

Listing 9. Writing to a File

    public int saveWordsUnbufferedStream(String filename, List<String> words) {
        long startTime = System.currentTimeMillis();

        // Return value: number of words written
        int wordCount = 0;

        File file = new File(filename);
        try (OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream(file))) {
            for (String word : words) {
                if (wordCount > 0) {
                    writer.append(' ');
                }
                writer.write(word);
                wordCount++;
            }
        } catch (IOException e) {
            log.log(Level.SEVERE, "IOException occurred, message = " + e.getLocalizedMessage(), e);
        }

        log.info("Wrote " + wordCount + " words in " + Long.toString(System.currentTimeMillis() - startTime) + "ms");

        return wordCount;
    }

Buffering I/O streams

Reading and writing character streams one character at a time isn’t efficient, so in most cases you probably want to use buffered I/O instead. To read from a file using buffered I/O, the code looks just like Listing 8, except that you wrap the InputStreamReader in a BufferedReader, as shown in Listing 10.

Listing 10. Reading from a File with buffered I/O

    public List<String> readWordsBufferedStream(String wordsFilename) {
        long startTime = System.currentTimeMillis();

        // Return value: list of strings
        List<String> ret = new ArrayList<>();

        File wordsFile = new File(wordsFilename);

        int numberOfWords = 0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(wordsFile)))) {
            String line = reader.readLine();
            // While there is more in the file to read
            while (line != null) {
                ret.add(line);
                numberOfWords++;
                line = reader.readLine();
            }
        } catch (IOException e) {
            log.log(Level.SEVERE, "IOException occurred, message = " + e.getLocalizedMessage(), e);
        }

        log.info("Read " + numberOfWords + " words in " + Long.toString(System.currentTimeMillis() - startTime) + "ms");

        return ret;

    }

Writing to a file using buffered I/O is the same: You wrap the OutputStreamWriter in a BufferedWriter, as shown in Listing 11.

Listing 11. Writing to a File with buffered I/O

    public int saveWordsBufferedStream(String filename, List<String> words) {
        long startTime = System.currentTimeMillis();

        // Return value: number of words written
        int wordCount = 0;

        File file = new File(filename);
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file)))) {
            for (String word : words) {
                if (wordCount > 0) {
                    writer.append(' ');
                }
                writer.write(word);
                wordCount++;
            }
        } catch (IOException e) {
            log.log(Level.SEVERE, "IOException occurred, message = " + e.getLocalizedMessage(), e);
        }

        log.info("Wrote " + wordCount + " words in " + Long.toString(System.currentTimeMillis() - startTime) + "ms");

        return wordCount;
    }

Run the testReadWordsUnbufferedStream method and you’ll see output like this (notice the elapsed time for reading each file):

Jul 18, 2020 12:25:53 PM com.jstevenperry.intro.io.InputOutput readWordsUnbufferedStream
INFO: Read 10 words in 0ms
Jul 18, 2020 12:25:53 PM com.jstevenperry.intro.io.InputOutput readWordsUnbufferedStream
INFO: Read 10000 words in 28ms
Jul 18, 2020 12:25:53 PM com.jstevenperry.intro.io.InputOutput readWordsUnbufferedStream
INFO: Read 100000 words in 64ms
Jul 18, 2020 12:25:54 PM com.jstevenperry.intro.io.InputOutput readWordsUnbufferedStream
INFO: Read 1000000 words in 453ms

Compare that against the output from the testReadWordsBufferedStream method:

Jul 18, 2020 12:27:25 PM com.jstevenperry.intro.io.InputOutput readWordsBufferedStream
INFO: Read 10 words in 1ms
Jul 18, 2020 12:27:25 PM com.jstevenperry.intro.io.InputOutput readWordsBufferedStream
INFO: Read 10000 words in 5ms
Jul 18, 2020 12:27:25 PM com.jstevenperry.intro.io.InputOutput readWordsBufferedStream
INFO: Read 100000 words in 22ms
Jul 18, 2020 12:27:26 PM com.jstevenperry.intro.io.InputOutput readWordsBufferedStream
INFO: Read 1000000 words in 96ms

You can see that as the size of the file grows, the performance advantage of buffered I/O streams becomes obvious: in this run, reading 1,000,000 words took 453ms unbuffered but only 96ms buffered.
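
To experiment with buffered I/O outside the test suite, a self-contained round trip such as the following sketch may help. It writes one word per line to a temporary file with a BufferedWriter and reads the lines back with a BufferedReader. The class name BufferedRoundTrip, the temporary file, and the explicit UTF-8 charset are assumptions made for this illustration; the listings above rely on the platform default charset instead.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only
class BufferedRoundTrip {

    // Writes one word per line to a temp file, then reads the lines back
    static List<String> roundTrip(List<String> words) {
        List<String> ret = new ArrayList<>();
        try {
            File file = File.createTempFile("words", ".txt");
            file.deleteOnExit();

            try (BufferedWriter writer = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
                for (String word : words) {
                    writer.write(word);
                    writer.newLine();
                }
            }

            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
                String line;
                // readLine() returns null at the end of the file
                while ((line = reader.readLine()) != null) {
                    ret.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ret;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Arrays.asList("alpha", "beta", "gamma")));
    }
}
```

Naming the charset explicitly on both the writer and the reader keeps the round trip independent of the machine's default encoding.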

Java serialization

Java serialization is another one of the Java platform’s essential libraries. Serialization is primarily used for object persistence and object remoting, two use cases where you need to be able to take a snapshot of the state of an object and then reconstitute it later. This section gives you a taste of the Java Serialization API and shows how to use it in your programs.

Note: To run the examples in this section, open test/com/jstevenperry/intro/serialization/HumanResourcesApplicationTest, right-click on the test method that matches the example you want to run, and choose Run As > JUnit Test.

What is object serialization?

Serialization is a process whereby the state of an object and its metadata (such as the object’s class name and the names of its attributes) are stored in a special binary format. Putting the object into this format (serializing it) preserves all the information necessary to reconstitute (or deserialize) the object whenever you need to do so.

The two primary use cases for object serialization are:

  • Object persistence: storing the object’s state in a permanent persistence mechanism such as a database
  • Object remoting: sending the object to another computer or system

java.io.Serializable

The first step in making serialization work is to enable your objects to use the mechanism. Every object you want to be serializable must implement an interface called java.io.Serializable:

package com.jstevenperry.intro.serialization;

import java.io.Serializable;
import java.util.logging.Logger;

public class Person implements Serializable {

    /**
     *
     */
    private static final long serialVersionUID = -265820726104107219L;
.
.
}

In this example, the Serializable interface marks the objects of the Person class, and every subclass of Person, to the runtime as serializable.

Any attributes of an object that are not serializable cause the Java runtime to throw a NotSerializableException if it tries to serialize your object. You can manage this behavior by using the transient keyword to tell the runtime not to try to serialize certain attributes. In that case, you are responsible for making sure that the attributes are restored (if necessary) so that your object functions correctly.
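
The effect of transient is easy to see with an in-memory round trip. The following is a minimal sketch under stated assumptions: the Account class, its two fields, and the roundTrip() helper are hypothetical and exist only for this illustration. It serializes to a byte array rather than a file so that it runs anywhere.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class, not part of the tutorial's code
class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    String owner;
    // Skipped by serialization; it comes back as null after deserialization
    transient String sessionToken;

    Account(String owner, String sessionToken) {
        this.owner = owner;
        this.sessionToken = sessionToken;
    }

    // Serializes the object to a byte array and reads it straight back
    static Account roundTrip(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("Alice", "secret-token"));
        System.out.println(copy.owner + " / " + copy.sessionToken); // prints Alice / null
    }
}
```

Because the transient field is restored to its default value (null for object references), any code that depends on it has to re-create it after deserialization.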

Serializing an object

Now, try an example that combines what you learned about Java I/O with what you’re learning now about serialization.

Suppose you create and populate a List of Employee objects and then want to serialize that List to an OutputStream, in this case to a file. That process is shown in Listing 12 (the listing is quite long, so I’ve omitted comments for brevity).

Listing 12. Serializing an object

    public boolean serializeToDisk(String filename, List<Employee> employees) {
        boolean ret = false;// default: failed
        File file = new File(filename);
        try (ObjectOutputStream outputStream = new ObjectOutputStream(new FileOutputStream(file))) {
            log.info("Writing " + employees.size() + " Employee objects to disk (using Java serialization)...");
            outputStream.writeObject(employees);
            ret = true; // Looks good
            log.info("Done.");
        } catch (IOException e) {
            log.log(Level.SEVERE, "Cannot write to file " + file.getName() + ", message = " + e.getLocalizedMessage(), e);
        }
        return ret;
    }

The first step is to create the objects, which is done in createEmployees() using the specialized constructor of Employee to set some attribute values. Next, you create an ObjectOutputStream that wraps an OutputStream (in this case, a FileOutputStream) and then call writeObject() on it. writeObject() is a method that uses Java serialization to serialize an object to the stream.

In this example, you are storing the List object (and its contained Employee objects) in a file, but this same technique is used for any type of serialization.

To drive the code in Listing 12, you could use a JUnit test, as shown here:

    @Test
    void testSerializeToDisk() {
        List<Employee> employees = HumanResourcesApplication.createEmployees();
        String filename = ROOT_DIRECTORY + "Employees-JUnit.ser";
        boolean serializationSucceeded = humanResourcesApplication.serializeToDisk(filename, employees);
        assertTrue(serializationSucceeded);
    }

Deserializing an object

The whole point of serializing an object is to be able to reconstitute, or deserialize, it. Listing 13 reads the file you’ve just serialized and deserializes its contents, thereby restoring the state of the List of Employee objects.

Listing 13. Deserializing objects

    public List<Employee> deserializeFromDisk(String filename) {
        List<Employee> ret = new ArrayList<>();
        File file = new File(filename);
        int numberOfEmployees = 0;
        try (ObjectInputStream inputStream = new ObjectInputStream(new FileInputStream(file))) {
            @SuppressWarnings("unchecked")
            List<Employee> employees = (List<Employee>) inputStream.readObject();
            log.info("Deserialized List says it contains " + employees.size() + " objects...");
            for (Employee employee : employees) {
                log.info("Read Employee: " + employee.toString());
                numberOfEmployees++;
            }
            ret = employees;
            log.info("Read " + numberOfEmployees + " Employee objects from disk (using Java serialization).");
        } catch (FileNotFoundException e) {
            log.log(Level.SEVERE, "Cannot find file " + file.getName() + ", message = " + e.getLocalizedMessage(), e);
        } catch (IOException e) {
            log.log(Level.SEVERE, "IOException occurred: message = " + e.getLocalizedMessage(), e);
        } catch (ClassNotFoundException e) {
            log.log(Level.SEVERE, "ClassNotFoundException: message = " + e.getLocalizedMessage(), e);
        }
        return ret;
    }

Again, to drive the code in Listing 13, you could use a JUnit test like this one:

    @Test
    void testDeserializeFromDisk() {
        List<Employee> employees = HumanResourcesApplication.createEmployees();
        String filename = ROOT_DIRECTORY + "Employees-JUnit.ser";
        boolean serializationSucceeded = humanResourcesApplication.serializeToDisk(filename, employees);
        assertTrue(serializationSucceeded);
        List<Employee> deserializedEmployees = humanResourcesApplication.deserializeFromDisk(filename);
        assertEquals(employees.size(), deserializedEmployees.size());
        assertEquals(employees, deserializedEmployees);
    }

For most application purposes, marking your objects as serializable is all you ever need to worry about when it comes to serialization. When you do need to serialize and deserialize your objects explicitly, you can use the technique shown in Listing 12 and Listing 13. But as your application objects evolve, and you add and remove attributes to and from them, serialization takes on a new layer of complexity.

serialVersionUID

In the early days of middleware and remote object communication, developers were largely responsible for controlling the “wire format” of their objects, which caused no end of headaches as technology began to evolve.

Suppose you added an attribute to an object, recompiled it, and redistributed the code to every computer in an application cluster. The object would be stored on a computer with one version of the serialization code but accessed by other computers that might have a different version of the code. When those computers tried to deserialize the object, bad things often happened.

Java serialization metadata — the information included in the binary serialization format — is sophisticated and solves many of the problems that plagued early middleware developers. But it can’t solve every problem.

Java serialization uses a property called serialVersionUID to help you deal with different versions of objects in a serialization scenario. You don’t need to declare this property on your objects; by default, the Java platform uses an algorithm that computes a value for it based on your class’s attributes, its class name, and position in the local galactic cluster. Most of the time, that algorithm works fine. But if you add or remove an attribute, that dynamically generated value changes, and the Java runtime throws an InvalidClassException when it tries to deserialize an object that was serialized by the older version of the class.

To avoid this outcome, get in the habit of explicitly declaring a serialVersionUID:

import java.io.Serializable;

public class Person implements Serializable {
    /**
     *
     */
    private static final long serialVersionUID = -265820726104107219L;
.
.
}

I recommend using a scheme of some kind for your serialVersionUID version number (a value based on the current date is one option). And you should declare serialVersionUID as private static final and of type long.

You might be wondering when to change this property. The short answer is that you should change it whenever you make an incompatible change to the class, which usually means you’ve added or removed an attribute. If you have one version of the object on one computer that has the attribute added or removed, and the object gets remoted to a computer with a version of the object where the attribute is either missing or expected, things can get weird. This is where the Java platform’s built-in serialVersionUID check comes in handy.

As a rule of thumb, any time you add or remove features (meaning attributes or any other instance-level state variables) of a class, change its serialVersionUID. Better to get a java.io.InvalidClassException on the other end of the wire than an application bug that’s caused by an incompatible class change.
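
If you want to check which serialVersionUID the runtime associates with a class, the standard java.io.ObjectStreamClass class can report it. The sketch below is only an illustration: SerialVersionDemo and its nested Versioned and Unversioned classes are hypothetical names, and the value 42L is arbitrary.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical demo class, for illustration only
class SerialVersionDemo {

    // Declares its version explicitly
    static class Versioned implements Serializable {
        private static final long serialVersionUID = 42L;
        String name;
    }

    // Relies on the value the platform computes by default
    static class Unversioned implements Serializable {
        String name;
    }

    // Returns the serialVersionUID the runtime uses for a serializable class
    static long versionOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Versioned:   " + versionOf(Versioned.class)); // prints 42
        // The computed value depends on the structure of the class
        System.out.println("Unversioned: " + versionOf(Unversioned.class));
    }
}
```

The declared value stays put until you change it yourself, whereas the computed value can change whenever the class's structure changes.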

Lambda expressions

Lambda expressions were introduced to the Java language as part of Java 8. The term lambda expression comes from lambda calculus and refers to an anonymous method you create and pass as a parameter to another method (a popular technique used in functional programming). Once you understand the syntax, you’ll see how Java lambda expressions allow you to write more compact Java code without sacrificing readability (the simplicity of lambda expressions, some argue, makes the code more readable). In this section, I’ll give you an overview of Java lambda expressions with examples.

Note: To run the examples in this section, open test/com/jstevenperry/intro/lambdas/HumanResourcesApplicationTest, right-click on the test method that matches the example you want to run, and choose Run As > JUnit Test.

Goodbye, anonymous inner classes

Recall the StockOptionProcessingCallback interface from Part 1:

package com.jstevenperry.intro.lambdas;

public interface StockOptionProcessingCallback {

    public void process(StockOptionEligible employee);
}

You can implement this interface as an anonymous inner class like this (from HumanResourcesApplicationTest):

Listing 14. Anonymous inner class

        @Test
        @DisplayName("When using the long form")
        public void testLongForm() {
            List<Person> people = HumanResourcesApplication.createPeople();
            for (Person person : people) {
                boolean optionsAwarded = humanResourcesApplication.handleStockOptions(person, new StockOptionProcessingCallback() {
                    @Override
                    public void process(StockOptionEligible employee) {
                        employee.processStockOptions(1000, BigDecimal.valueOf(1.0));
                    }
                });
                if (person instanceof StockOptionEligible) {
                    assertTrue(optionsAwarded);
                } else {
                    assertFalse(optionsAwarded);
                }
            }
        }

Most of the code that creates the StockOptionProcessingCallback is boilerplate and necessary only to make the compiler happy. The real “business end” of this code is just this one method, which is called for every Person object in the list above:

                public void process(StockOptionEligible employee) {
                    employee.processStockOptions(1000, BigDecimal.valueOf(1.0));
                }

You can shorten the syntax by using a lambda expression:

Listing 15. Using a lambda expression

@Test
@DisplayName("When using a lambda expression")
public void testUsingLambdaExpression() {
    List<Person> people = HumanResourcesApplication.createPeople();
    for (Person person : people) {
        boolean optionsAwarded = humanResourcesApplication.handleStockOptions(person, (stockOptionEligible -> stockOptionEligible.processStockOptions(1000, BigDecimal.valueOf(1.0))));
        if (person instanceof StockOptionEligible) {
            assertTrue(optionsAwarded);
        } else {
            assertFalse(optionsAwarded);
        }
    }
}

I’ll break it down for you and compare Listings 14 and 15. The first difference is the syntax. The following code (from Listing 14):

    boolean optionsAwarded = humanResourcesApplication.handleStockOptions(person, new StockOptionProcessingCallback() {
        @Override
        public void process(StockOptionEligible employee) {
            employee.processStockOptions(1000, BigDecimal.valueOf(1.0));
        }
    });

is replaced with this code (from Listing 15):

    boolean optionsAwarded = humanResourcesApplication.handleStockOptions(person, (stockOptionEligible -> stockOptionEligible.processStockOptions(1000, BigDecimal.valueOf(1.0))));

The examples above do the same thing, but I think you’ll agree the second (a single statement) is more compact than the first.

In the long-form example (Listing 14), you provide all of the code to implement the StockOptionProcessingCallback interface.

In the lambda expression example (Listing 15), however, the compiler is able to recognize two things:

  • The type of the second argument to handleStockOptions() is StockOptionProcessingCallback
  • The process() method of StockOptionProcessingCallback requires a single argument of type StockOptionEligible

In general, lambda expressions look like this:

    (param1, param2) -> {
        // Implementation goes here
        System.out.println("Hi, I'm a lambda expression body. Parameter 1:" +
            param1 + " parameter 2: " + param2);
    }

Arguments to the lambda expression (which represent arguments to the method that will be called) are listed in parentheses just like they would be for any method, followed by -> and a method body. The body is surrounded by curly braces if it contains more than one statement; you can omit the curly braces if the body is a single expression, as it is in Listing 15. When there is exactly one argument, you can also omit the parentheses around it, as the stockOptionEligible parameter in Listing 15 shows. That’s it!
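
The common syntax variations can be sketched side by side. This example uses the standard functional interfaces from java.util.function instead of the tutorial's own interfaces; the class name LambdaForms and the constant names are hypothetical.

```java
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical demo class, for illustration only
class LambdaForms {

    // Two parameters and a block body with an explicit return
    static final BiFunction<Integer, Integer, Integer> ADD = (a, b) -> {
        int sum = a + b;
        return sum;
    };

    // One parameter (no parentheses needed) and an expression body (no braces or return needed)
    static final Function<String, Integer> LENGTH = text -> text.length();

    // No parameters: the empty parentheses are required
    static final Supplier<String> GREETING = () -> "Hello";

    public static void main(String[] args) {
        System.out.println(ADD.apply(2, 3));        // prints 5
        System.out.println(LENGTH.apply("lambda")); // prints 6
        System.out.println(GREETING.get());         // prints Hello
    }
}
```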

Let’s look at another example. The JUnit tests from the listings above both call the handleStockOptions() method in HumanResourcesApplication:

    public boolean handleStockOptions(final Person person, final StockOptionProcessingCallback callback) {
        boolean retVal = false;
        if (person instanceof StockOptionEligible) {
            callback.process((StockOptionEligible) person);
            retVal = true; // options awarded
        } else if (person instanceof Employee) {
            callback.process(new StockOptionEligible() {
                @Override
                public void processStockOptions(int number, BigDecimal price) {
                    log.warning("Unfortunately, Employee " + person.getName() + " is not eligible for Stock Options!");
                }
            });
        } else {
            callback.process((number, price) -> log.severe("Cannot consider awarding " + number + " options because " + person.getName() + " does not even work here!"));
        }
        return retVal;
    }

This code should look familiar from Part 1, so I won’t explain it again. There is one difference, however, so let’s look at it.

Notice the else if block creates an anonymous class to implement the StockOptionEligible interface. No surprise there. But check out the else block. The argument to the process() method of the StockOptionProcessingCallback is the implementation of the StockOptionEligible interface as a lambda expression. The long form implementation of the StockOptionEligible interface in the else if block requires 4 – 6 lines of code, whereas the lambda expression in the else block consists of a single line of code.

You can use a lambda expression as the implementation of any interface that declares a single abstract method (known as a functional interface) to make your code more concise and readable.
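
As a minimal sketch of that rule, the following example declares its own single-method interface and implements it both ways. PriceFormatter, FunctionalInterfaceDemo, and describe() are hypothetical names invented for this illustration. The @FunctionalInterface annotation is optional, but it asks the compiler to reject the interface if it ever declares more than one abstract method.

```java
// Hypothetical interface, for illustration only
@FunctionalInterface
interface PriceFormatter {
    String format(double price);
}

// Hypothetical demo class, for illustration only
class FunctionalInterfaceDemo {

    // Accepts any implementation of the single-method interface
    static String describe(double price, PriceFormatter formatter) {
        return "Price: " + formatter.format(price);
    }

    public static void main(String[] args) {
        // Long form: anonymous inner class
        String longForm = describe(9.5, new PriceFormatter() {
            @Override
            public String format(double price) {
                return "$" + price;
            }
        });

        // Short form: equivalent lambda expression
        String shortForm = describe(9.5, price -> "$" + price);

        System.out.println(longForm);                   // prints Price: $9.5
        System.out.println(longForm.equals(shortForm)); // prints true
    }
}
```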

I encourage you to run the JUnit test in the lambdas package in my GitHub repo to see this in action.

Method references

As you write more Java code, you’ll often find that the body of a lambda expression is just a single method call on an object.

Consider this interface:

package com.jstevenperry.intro.lambdas;

public interface Displayable {
    String getDisplayName();
}

You can call this method (which you’ll find in HumanResourcesApplication) that takes an object that implements the Displayable interface:

    public void handleDisplayName(final Displayable displayable) {
        log.info("Display name: " + displayable.getDisplayName());
    }

Suppose you want to pass a Person object to the method, but Person doesn’t implement the Displayable interface. That’s not a problem, and a big reason anonymous inner classes are so handy to use.

        @Test
        @DisplayName("When using the long form")
        public void testLongForm() {
            // People
            List<Person> people = HumanResourcesApplication.createPeople();
            for (Person person : people) {
                humanResourcesApplication.handleDisplayName(new Displayable() {
                    @Override
                    public String getDisplayName() {
                        return person.getName();
                    }
                });
            }
        }

This code example uses the Name attribute of Person as the implementation of Displayable using an anonymous inner class. Instead, use a lambda expression:

        @Test
        @DisplayName("When using a lambda expression")
        public void testUsingambdaExpression() {
            List<Person> people = HumanResourcesApplication.createPeople();
            people.forEach(person -> humanResourcesApplication.handleDisplayName(() -> person.getName()));

        }

Again, the lambda expression syntax shortens the example (since getDisplayName() takes no arguments, () is used as the parameter list to the lambda expression).

However, by using a method reference you can shorten the expression further:

        @Test
        @DisplayName("When using a lambda expression")
        public void testUsingambdaExpressionWithMethodReference() {
            List<Person> people = HumanResourcesApplication.createPeople();
            people.forEach(person -> humanResourcesApplication.handleDisplayName(person::getName));
        }

The syntax for a method reference looks like this: object::methodName. In this case, the method reference tells the compiler: “Where you need a method body as the implementation of getDisplayName() for Displayable, use the getName() method of the Person object that is in scope.”

You can also use method references to make static method calls, and they look like ClassName::methodName.
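
The forms of method reference you are most likely to meet can be sketched with classes from the JDK alone. In this illustration, MethodReferenceDemo and its members are hypothetical names; Integer.parseInt(), String.length(), and String.toUpperCase() are standard JDK methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical demo class, for illustration only
class MethodReferenceDemo {

    // Static method reference: ClassName::staticMethodName
    static List<Integer> parseAll(List<String> texts) {
        return texts.stream().map(Integer::parseInt).collect(Collectors.toList());
    }

    // Instance method of one particular object: object::methodName
    static Supplier<Integer> lengthOf(String text) {
        return text::length;
    }

    // Instance method of whichever object is passed in: ClassName::methodName
    static final Function<String, String> UPPER = String::toUpperCase;

    public static void main(String[] args) {
        System.out.println(parseAll(Arrays.asList("1", "2", "3"))); // prints [1, 2, 3]
        System.out.println(lengthOf("Java").get());                  // prints 4
        System.out.println(UPPER.apply("java"));                     // prints JAVA
    }
}
```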

Lambda expressions are a very handy tool to have in your Java toolbox, and can make your code much more concise and readable.

Java Streams API

The Java Streams API was introduced in Java 8 and provides a way to process a “stream” of objects by applying one or more operations as objects in the stream “flow” by. These operations are of two types: intermediate operations and terminal operations.

As previously mentioned, Java I/O streams have been around for a long time. Don’t confuse them with the Java Streams API. The two are very different. I’ll refer to the Java Streams API (even though a relative newcomer) as “streams” because that’s how you’re most likely to see it.

Note: To run the examples in this section, open test/com/jstevenperry/intro/streamsapi/HumanResourcesApplicationTest, right-click on the test method that matches the example you want to run, and choose Run As > JUnit Test.

Intermediate operations operate on objects in the stream and return a Stream object. Examples include filter(), map(), and flatMap(), which you’ll see examples of later in this section.

Terminal operations are used to perform an operation on objects in the stream, close the stream, and return an object (other than a Stream, like a List or Optional) that usually contains objects from the stream after being processed. I’ll show you how to use a Collector later in this section to collect the objects in the stream that were processed and return them in a List and a Set.

Think of a stream as an assembly line and operations as workers on the assembly line. Each worker may do something to the objects in the stream as they pass by. An intermediate operation usually does something to an object, then puts it back on the stream, where it can be processed by the next worker on the assembly line. A terminal operation is like the last worker on the assembly line, who takes the contents of the stream and puts them into a “box,” after which the stream is closed. (The analogy isn’t perfect, but hopefully you get the idea.)

There are a couple of ways your program will typically start a stream “flowing”:

  • You can create a stream from a collection of objects
  • You can create a stream from objects that are not already part of a collection
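
Both starting points can be sketched in a few lines. The class name StreamSources and the sample names are assumptions made for this illustration; stream() on a collection and Stream.of() are standard JDK methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, for illustration only
class StreamSources {

    // Start a stream from an existing collection
    static List<String> fromCollection(List<String> names) {
        return names.stream().collect(Collectors.toList());
    }

    // Start a stream from individual objects with Stream.of()
    static List<String> fromValues() {
        return Stream.of("Ann", "Bob", "Cy").collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fromCollection(Arrays.asList("Ann", "Bob", "Cy"))); // prints [Ann, Bob, Cy]
        System.out.println(fromValues());                                       // prints [Ann, Bob, Cy]
    }
}
```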

Let’s look at some examples, starting with intermediate operations.

Filter

You use the filter() method, an intermediate operation, to select objects from the incoming stream based on a function you supply that returns a boolean value, called a predicate. The predicate could be based on one (or more) of the object’s attribute values, or some other piece of information related to the object.

If the predicate returns true, the object is kept. That is, the object will be included in the Stream that is returned from the filter() method. If the predicate returns false, the object is dropped from the Stream.

In the following example, the List of Person objects is filtered to include only those Person objects that implement the BonusEligible interface:

    public List<Person> filterBonusEligible(final List<Person> people) {
        return people
                .stream()
                .filter(person -> person instanceof BonusEligible)
                .collect(Collectors.toList());
    }

The stream() method is called on the List, which starts the stream. The filter() method is called next and is passed a lambda expression that serves as the predicate, which returns true if person is an instance of BonusEligible. Any Person object that does not implement the BonusEligible interface is dropped from the Stream.

After filtering is applied (remember, filtering is an intermediate operation), the Stream is passed to the collect() method, where the contents of the Stream are “collected” as a List<Person> containing only those objects that implement the BonusEligible interface.

You can apply multiple filters to the Stream by chaining them together:

    public List<Person> filterSalaryGreaterThan(final Integer threshold, final List<Person> people) {
        return people
                .stream()
                .filter(person -> person instanceof Employee)
                .filter(person -> ((Employee)person).getSalary().intValue() > threshold)
                .collect(Collectors.toList());
    }

In this example, a List of Person objects is filtered to include only those that are Employees, then filtered again to include those employees whose salary is greater than the threshold value supplied as an argument to the method.

Map

You use the map() method, an intermediate operation, to transform the objects in the stream from one type to another, placing the transformed object into the Stream that is returned:

    public List<Manager> mapManager(final List<Person> people) {
        return people
                .stream()
                .filter(person -> person instanceof Manager)
                .map(person -> (Manager)person)
                .collect(Collectors.toList());
    }

In this example, a List of Person objects is first filtered to keep only those objects that are Managers, and then those objects are cast to Manager and placed onto the stream.

The List that is collected contains only Manager objects.

While a Manager “is-a” Person from an object-oriented standpoint, the results of map() do not have to resemble the incoming object at all. Consider this example:

    public List<String> mapNames(final List<Person> people) {
        return people
                .stream()
                .map(person -> person.getName())
                .collect(Collectors.toList());
    }

The Stream is processed to include only the name of each Person object in the Stream. Person objects go in, and Strings come out the other side. I think you’ll agree that String and Person are very different indeed!

Mapping from one object to another can be as complicated as your application needs it to be. Consider this example:

    public List<Manager> promoteBlueEyedEmployeesToManager(final List<Employee> employees) {
        return employees
                .stream()
                .filter(employee -> employee.getEyeColor().equalsIgnoreCase("BLUE"))
                .map(Manager::promote)
                .collect(Collectors.toList());
    }

In this example, the Stream of Employee objects is filtered to include only blue-eyed employees (not the strangest business rule I’ve ever seen, btw), who are then promoted to Managers via the promote() method on the Manager class (passed as a method reference, which you saw earlier in the tutorial).

I’m not sure I would want to work at such a capricious organization that would operate this way (I myself having brown eyes), but hopefully you get the idea: mapping is whatever your application needs it to be to transform one object into another (you can also chain map() calls just like you can filter() method calls, in case you were wondering).
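
Chained map() calls can be sketched with plain Strings, so the example runs without the tutorial's Person classes. The class name ChainedMapDemo and the nameLengths() method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, for illustration only
class ChainedMapDemo {

    // Each map() call receives the output of the previous one
    static List<Integer> nameLengths(List<String> names) {
        return names
                .stream()
                .map(String::trim)   // String -> String with surrounding whitespace removed
                .map(String::length) // String -> Integer
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(nameLengths(Arrays.asList(" Ann ", "Robert"))); // prints [3, 6]
    }
}
```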

The following example transforms the incoming List of Person objects into a List of just the EyeColor attributes of each:

    public List<String> mapEyeColors(final List<Person> people) {
        return people
                .stream()
                .map(Person::getEyeColor)
                .collect(Collectors.toList());
    }

This example does the same thing as the one above, but returns a Set instead of a List, so there is only one of each EyeColor:

    public Set<String> mapUniqueEyeColors(final List<Person> people) {
        return people
                .stream()
                .map(Person::getEyeColor)
                .collect(Collectors.toSet());
    }

Mapping is another powerful tool in your Java Streams API tool belt.

All of the examples so far use the collect() terminal operation to return a collection of some kind, because it’s not easy to look at a Stream directly and you will almost never want to use a Stream without a terminal operation. There are several terminal operations, and I’ll show you two of them in the next two sections.

Collect

One of the most powerful features of the Java Development Kit (JDK) is its support for working with collections of objects. It makes sense, then, that the Java Streams API would mirror this.

You’ve seen the collect() method already, but I didn’t really explain it. Let me do that now.

You use collect(), a terminal operation, to gather (that is, accumulate) the objects from a Stream into some type of collection like a List, by passing it Collectors.toList(), or a Set, by passing it Collectors.toSet().

Depending on the type of “accumulation” you wish to do, Collectors probably has a method to do it, and I would like to introduce you to one you haven’t seen, that is particularly handy: Collectors.groupingBy(). Consider the following example:

    public Map<String, List<Person>> collectByEyeColor(final List<Person> people) {
        return people
                .stream()
                .collect(Collectors.groupingBy(Person::getEyeColor));
    }

In this example, the Stream of Person objects is grouped into separate Lists by eye color. The groupingBy() method of the Collectors class does this, and it needs to know how to do the grouping; that is, which value serves as the key. The getEyeColor() method reference is used for this purpose and tells the collector: “Create a Map whose key is eye color and whose value is a List of the Person objects that have that eye color.”
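To see what the resulting Map looks like, here is a small self-contained sketch. The GroupByEyeColorDemo class, its simplified Person class, and the sample names are assumptions of mine for illustration only; the grouping call itself is the same one shown above.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByEyeColorDemo {

    // Hypothetical, simplified stand-in for the tutorial's Person class.
    static class Person {
        private final String name;
        private final String eyeColor;

        Person(String name, String eyeColor) {
            this.name = name;
            this.eyeColor = eyeColor;
        }

        String getName() {
            return name;
        }

        String getEyeColor() {
            return eyeColor;
        }
    }

    // Key: eye color. Value: every Person with that eye color.
    static Map<String, List<Person>> collectByEyeColor(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(Person::getEyeColor));
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Ann", "BLUE"),
                new Person("Bob", "BROWN"),
                new Person("Cy", "BLUE"));

        // Print each eye color followed by the names in that group.
        collectByEyeColor(people).forEach((eyeColor, group) -> {
            List<String> names = group.stream()
                    .map(Person::getName)
                    .collect(Collectors.toList());
            System.out.println(eyeColor + " -> " + names);
        });
    }
}
```

Note that the order in which the groups are printed is not guaranteed, because groupingBy() makes no promise about the ordering of the Map it returns.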

Collecting is not limited to creating collections. You can also use it to perform arithmetic operations such as counting, and to compute statistics such as min, max, and average. Learn more from the Javadoc for the Collectors class.
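As a rough illustration of counting and statistics, the sketch below uses two collectors from the JDK: Collectors.counting() (as a downstream collector for groupingBy()) and Collectors.summarizingInt(). The CollectorStatsDemo class and the sample data are hypothetical.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectorStatsDemo {

    // Counts how many times each eye color appears.
    static Map<String, Long> countByEyeColor(List<String> eyeColors) {
        return eyeColors.stream()
                .collect(Collectors.groupingBy(color -> color, Collectors.counting()));
    }

    // Computes count, min, max, sum, and average of the ages in one pass.
    static IntSummaryStatistics summarizeAges(List<Integer> ages) {
        return ages.stream()
                .collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        System.out.println(countByEyeColor(List.of("BLUE", "BROWN", "BLUE")));

        IntSummaryStatistics stats = summarizeAges(List.of(20, 30, 40));
        System.out.println("min = " + stats.getMin());
        System.out.println("max = " + stats.getMax());
        System.out.println("average = " + stats.getAverage());
    }
}
```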

Reduce

You use reduce(), a terminal operation, to “fold” a Stream of objects into a single result, such as a sum, by combining the objects one at a time.

Consider the following example:

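    // Requires: import java.util.Arrays;
    public int computeSum(final Integer... integers) {
        // A varargs parameter is an array, so create a Stream from it first.
        return Arrays.stream(integers)
                .reduce(0, (accumulated, current) -> accumulated + current);
    }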

The computeSum() method takes a variable number of Integer objects, and uses Stream’s reduce() method to add up their values.

The first argument to reduce() is the initial value, also called the identity. It is also the value that is returned if the Stream is empty.

The next argument is a BinaryOperator function that takes two arguments: (1) the accumulated value thus far, and (2) the current object from the Stream. The lambda expression adds them together and returns the sum of the two.

The reduction is applied to every object in the stream, after which the reduction terminates and a single value is returned, in this case the sum of all Integer values that are passed to computeSum().
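Here is a minimal runnable sketch of the two common forms of reduce(). The ReduceDemo class is my own illustration; it creates the Stream with Arrays.stream(). The first method supplies an identity value, so an empty Stream yields that identity. The second omits the identity, so reduce() returns an Optional that is empty when there is nothing to reduce.

```java
import java.util.Arrays;
import java.util.Optional;

class ReduceDemo {

    // With an identity: an empty Stream simply yields the identity (0).
    static int sum(Integer... integers) {
        return Arrays.stream(integers)
                .reduce(0, (accumulated, current) -> accumulated + current);
    }

    // Without an identity: the result is an Optional, empty for an empty Stream.
    static Optional<Integer> max(Integer... integers) {
        return Arrays.stream(integers)
                .reduce(Integer::max);
    }

    public static void main(String[] args) {
        System.out.println(sum(1, 2, 3, 4));
        System.out.println(sum());
        System.out.println(max(3, 9, 4));
        System.out.println(max().isPresent());
    }
}
```

For sum(1, 2, 3, 4), the accumulated value moves from 0 to 1, 3, 6, and finally 10 as each element is folded in.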

As you might have guessed, you can perform other intermediate operations before calling reduce(), such as filter() and map(). Suppose you wanted to compute the total annual payroll based on the salaries of all employees:

    public BigDecimal computeTotalPayroll(final List<Employee> employees) {
        return employees
                .stream()
                .map(Employee::getSalary)
                .reduce(BigDecimal.ZERO, (accumulated, salary) -> accumulated.add(salary));
    }

In this example, a Stream is created from a List<Employee>, each Employee is mapped to its salary, and a reduction is applied to calculate the total.
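The sketch below is one way to try this out end to end. The PayrollDemo class, its pared-down Employee class, and the computePayrollAtOrAbove() method are hypothetical and exist only for this illustration. It also shows a filter() step ahead of the reduction, and uses the method reference BigDecimal::add in place of a lambda.

```java
import java.math.BigDecimal;
import java.util.List;

class PayrollDemo {

    // Hypothetical, pared-down stand-in for the tutorial's Employee class.
    static class Employee {
        private final String name;
        private final BigDecimal salary;

        Employee(String name, BigDecimal salary) {
            this.name = name;
            this.salary = salary;
        }

        String getName() {
            return name;
        }

        BigDecimal getSalary() {
            return salary;
        }
    }

    // Map each Employee to a salary, then fold the salaries into a total.
    static BigDecimal computeTotalPayroll(List<Employee> employees) {
        return employees.stream()
                .map(Employee::getSalary)
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    // Same reduction, but only for salaries at or above the given minimum.
    static BigDecimal computePayrollAtOrAbove(List<Employee> employees, BigDecimal minimumSalary) {
        return employees.stream()
                .map(Employee::getSalary)
                .filter(salary -> salary.compareTo(minimumSalary) >= 0)
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    public static void main(String[] args) {
        List<Employee> employees = List.of(
                new Employee("Ann", new BigDecimal("50000.00")),
                new Employee("Bob", new BigDecimal("72500.50")),
                new Employee("Cy", new BigDecimal("30000.00")));

        System.out.println(computeTotalPayroll(employees));
        System.out.println(computePayrollAtOrAbove(employees, new BigDecimal("50000")));
    }
}
```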

The Java Streams API is full of all kinds of great utilities to let you work with Streams of objects, and I’ve only been able to scratch the surface. I encourage you to modify the code from my GitHub repo using the Javadoc as a reference and explore some of the other great features of the Java Streams API.

Type inference

Note: To run the examples in this section, open test/com/jstevenperry/intro/localtype/HumanResourcesApplicationTest, right-click on the test method that matches the example you want to run, and choose Run As > JUnit Test.

When you hear the term type inference with respect to the Java language, this could mean one of two things:

  • Type inference for generics
  • Local variable type inference

Type inference for generics

Type inference for generic instance creation was introduced in Java 7, and its syntax is often referred to as the “diamond operator.” The diamond operator is used where the compiler can infer the type arguments of a constructor call from its context. For example, prior to Java 7, you were required to write an assignment to a List<String> like this:

    List<String> listOfString = new ArrayList<String>();

Using the diamond operator, the Java 7 compiler could infer the type:

    List<String> listOfString = new ArrayList<>();

Because the Java 7 (and later) compiler can understand the type that the ArrayList contains from the assignment’s destination (that is, List<String>), there is no need to specify the type.

By the way, the diamond operator gets its name from the empty angle brackets, which (sort of) resemble a “diamond”: <>.
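The savings are easier to appreciate with nested generic types. In this sketch (the DiamondDemo class and its sample data are hypothetical), the compiler infers the type arguments for both the HashMap and the ArrayList from the declared type of the variable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DiamondDemo {

    static Map<String, List<Integer>> buildScores() {
        // Without the diamond: new HashMap<String, List<Integer>>()
        Map<String, List<Integer>> scores = new HashMap<>();

        // The diamond also works here: the compiler infers ArrayList<Integer>.
        scores.computeIfAbsent("alice", key -> new ArrayList<>()).add(90);
        scores.computeIfAbsent("alice", key -> new ArrayList<>()).add(85);
        scores.computeIfAbsent("bob", key -> new ArrayList<>()).add(70);
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(buildScores());
    }
}
```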

Local variable type inference

Local variable type inference, or local type inference, was introduced in Java 10 and uses the var reserved type name to declare a variable when the compiler can infer the type. Consider this example from the section on I/O:

    public File createFileUnlessItExists(String filename) throws IOException {
        File file = new File(filename);
        .
        .
        return file;
    }

It is obvious that the type of the local variable file is java.io.File, so why should you have to declare it explicitly? This is the thinking behind local type inference, which makes the explicit declaration unnecessary:

    public File createFileUnlessItExists(String filename) throws IOException {
        var file = new File(filename);
        .
        .
        return file;
    }

Using the var reserved type name, you can declare the file variable without having to explicitly specify its type.

For simple types like File it may be difficult to see the benefit, but consider a more complicated example.

    public Map<Character, Map<String, Long>> computeAlphabeticalWordMap(final List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        word -> Character.valueOf(Character.toUpperCase(word.charAt(0))),
                        Collectors.groupingBy(Function.identity(), Collectors.counting())));
    }

This method takes a list of words, and returns a Map whose key is the letter of the alphabet that each word from the input List starts with, and whose value is another map whose key is the word, and whose value is the number of times that word occurs.

The type that is returned from this method is Map<Character, Map<String, Long>>. Without local variable type inference, you would have to specify the full type of the return value to make the compiler happy, like this:

    List<String> listOfWords = /* obtain somehow */
    Map<Character, Map<String, Long>> wordMapByStartingLetter = inputOutput.computeAlphabeticalWordMap(listOfWords);

With local type inference, however, the same example looks like this:

    var listOfWords = /* obtain somehow */
    var wordMapByStartingLetter = inputOutput.computeAlphabeticalWordMap(listOfWords);

This example is definitely more compact, and arguably more readable.
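If you would like to run something like this yourself, here is a self-contained sketch. The WordMapDemo class and the sample words are my own; the method body mirrors computeAlphabeticalWordMap() from above. It also shows that var does not make Java dynamically typed: the compiler still knows the exact type of every var variable, so the explicitly typed assignments inside the loop compile without casts.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class WordMapDemo {

    // Outer key: first letter (upper case). Inner map: word -> occurrence count.
    static Map<Character, Map<String, Long>> computeAlphabeticalWordMap(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        word -> Character.valueOf(Character.toUpperCase(word.charAt(0))),
                        Collectors.groupingBy(Function.identity(), Collectors.counting())));
    }

    public static void main(String[] args) {
        var listOfWords = List.of("apple", "avocado", "banana", "apple");
        var wordMapByStartingLetter = computeAlphabeticalWordMap(listOfWords);

        // var works in a for-each loop too; the types below are still checked.
        for (var entry : wordMapByStartingLetter.entrySet()) {
            Character letter = entry.getKey();
            Map<String, Long> counts = entry.getValue();
            System.out.println(letter + " -> " + counts);
        }
    }
}
```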

Local type inference also works for try-with-resources:

    public List<String> readWordsBufferedStream(String wordsFilename) {
        .
        .
        try (var reader = new BufferedReader(new InputStreamReader(new FileInputStream(wordsFile)))) {
            .
            .
        } catch (IOException e) {
            log.log(Level.SEVERE, "IOException occurred, message = " + e.getLocalizedMessage(), e);
        }
        .
        .
    }

You can use local type inference for any local variable whose declaration includes an initializer from which the compiler can infer the type.
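Here is a small runnable variation on the try-with-resources idea. The VarTryDemo class is hypothetical, and it reads from a String through a StringReader (rather than from a file) so that you can run it without any setup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class VarTryDemo {

    // Reads one word per line, skipping blank lines.
    static List<String> readWords(String text) {
        var words = new ArrayList<String>();   // inferred as ArrayList<String>
        try (var reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    words.add(line.trim());
                }
            }
        } catch (IOException e) {
            // Not expected when reading from a String; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        for (var word : readWords("blue\nbrown\n\ngreen")) {
            System.out.println(word);
        }
    }
}
```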

As cool as local variable type inference is, however, you can’t use it everywhere. Local type inference works only for local variables (in case that wasn’t clear from the name). You can’t use it for instance variables, static variables, or method parameters.
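The following sketch (the VarLimitsDemo class is hypothetical) summarizes these restrictions. The lines that would not compile are left as comments next to a version that does.

```java
import java.util.ArrayList;
import java.util.List;

class VarLimitsDemo {

    // var count = 0;               // does not compile: fields cannot use var
    private int count = 0;

    // int add(var amount) { ... }  // does not compile: parameters cannot use var
    int add(int amount) {
        var updated = count + amount;   // fine: local variable with an initializer
        count = updated;
        return count;
    }

    static List<String> names() {
        // var pending;                 // does not compile: var needs an initializer
        // var nothing = null;          // does not compile: null has no type to infer
        var names = new ArrayList<String>();   // inferred as ArrayList<String>
        names.add("Ann");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(new VarLimitsDemo().add(5));
        System.out.println(names());
    }
}
```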

Conclusion

Thus far, the Introduction to Java programming series has covered a significant portion of the Java language, but the language is huge. This series doesn’t encompass it all.

As you continue learning about the Java language and platform, you probably will want to do further study into topics such as regular expressions, generics, and Java serialization. Eventually, you might also want to explore topics not covered in this introductory series, such as concurrency and persistence.