Thursday 13 August 2015

Java SE 8 Optional, a pragmatic approach

The Optional class in Java 8 is a useful tool to help developers manage data. But advice on how to use it varies. This is my take on one good approach to using Optional in Java 8.

Note that this article assumes you know what Optional is and how it works. See my previous article and other tutorials for more info. Also, be aware that Optional is a heavily argued topic, with some commentators liable to get rather too excited about its importance.

A pragmatic approach to Optional in Java 8

What follows is a specific approach to using Optional in Java 8 that I have found very useful. Note that the approach was developed while writing a new application, rather than maintaining an existing one. There are five basic points:

  1. Do not declare any instance variable of type Optional.
  2. Use null to indicate optional data within the private scope of a class.
  3. Use Optional for getters that access the optional field.
  4. Do not use Optional in setters or constructors.
  5. Use Optional as a return type for any other business logic methods that have an optional result.

For example:

  public class Address {
    private final String addressLine;  // never null
    private final String city;         // never null
    private final String postcode;     // optional, thus may be null

    // constructor ensures non-null fields really are non-null
    // optional field can just be stored directly, as null means optional
    public Address(String addressLine, String city, String postcode) {
      this.addressLine = Preconditions.checkNotNull(addressLine);
      this.city = Preconditions.checkNotNull(city);
      this.postcode = postcode;
    }

    // normal getters
    public String getAddressLine() { return addressLine; }
    public String getCity() { return city; }

    // special getter for optional field
    public Optional<String> getPostcode() {
      return Optional.ofNullable(postcode);
    }

    // return optional instead of null for business logic methods that may not find a result
    public static Optional<Address> findAddress(String userInput) {
      return ... // find the address, returning Optional.empty() if not found
    }
  }

The first thing to notice is that users of our address API are protected from receiving null. Calling getAddressLine() or getCity() will always return a non-null value, as the address object cannot hold null in those fields. Calling getPostcode() will return an Optional<String> instance that forces callers to at least think about the potential for missing data. Finally, findAddress() also returns an Optional. None of these methods can return null.

Within the object, the developer is still forced to think about null and manage it using != null checks. This is reasonable, as the problem of null is constrained. The code will all be written and tested as a unit (you do write tests, don't you?), so nulls will not cause many issues.
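To make the internal null handling concrete, here is a minimal sketch. The class PostalAddress and its toSingleLine() method are hypothetical names invented for this illustration, and it uses java.util.Objects.requireNonNull instead of Guava only to stay self-contained:

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical class for illustration only
final class PostalAddress {
    private final String city;      // never null
    private final String postcode;  // optional, thus may be null

    PostalAddress(String city, String postcode) {
        this.city = Objects.requireNonNull(city, "city");
        this.postcode = postcode;
    }

    // API boundary: callers only ever see an Optional, never null
    Optional<String> getPostcode() {
        return Optional.ofNullable(postcode);
    }

    // private scope: a plain != null check on the field, no Optional created
    String toSingleLine() {
        return postcode != null ? city + " " + postcode : city;
    }
}
```

The null check is confined to the one class that owns the field, which is what keeps the problem small enough to cover with unit tests.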

In essence, what this approach does is to focus on using Optional in return types at API boundaries, rather than within a class or on input. Compared to using it as a field, the Optional is now created on the fly. The key difference here is the lifetime of the Optional instance.

It is often the case that domain objects hang about in memory for a fair while, as processing in the application occurs, making each Optional instance rather long-lived (tied to the lifetime of the domain object). By contrast, the Optional instance returned from the getter is likely to be very short-lived. The caller will call the getter, interpret the result, and then move on. If you know anything about garbage collection you'll know that the JVM handles these short-lived objects well. In addition, there is more potential for HotSpot to remove the costs of the Optional instance when it is short-lived. While it is easy to claim this is "premature optimization", as engineers it is our responsibility to know the limits and capabilities of the system we work with and to choose carefully the point where it should be stressed.

While it is a minor point, it should be noted that the class could be Serializable, something that is not possible if any field is Optional (as Optional does not implement Serializable).
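The serialization point can be checked with a small experiment. This is only a sketch: NullableFieldHolder, OptionalFieldHolder and SerializationCheck are hypothetical names made up for the illustration, and the check uses standard Java serialization to an in-memory stream:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Optional;

// Hypothetical class: optional data stored as a nullable field
final class NullableFieldHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String postcode;  // may be null

    NullableFieldHolder(String postcode) { this.postcode = postcode; }

    Optional<String> getPostcode() { return Optional.ofNullable(postcode); }
}

// Hypothetical class: the same data stored as an Optional field
final class OptionalFieldHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Optional<String> postcode;  // Optional is not Serializable

    OptionalFieldHolder(Optional<String> postcode) { this.postcode = postcode; }
}

final class SerializationCheck {
    // Returns true if the object can be written using Java serialization
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException ex) {
            return false;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Under these assumptions, canSerialize() should succeed for NullableFieldHolder whether or not the postcode is null, and fail for OptionalFieldHolder even though the class itself declares Serializable.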

The approach above does not use Optional for inputs, such as setters or constructors. While accepting Optional would work, it is my experience that having Optional on a setter or constructor is annoying for the caller, as they typically have the actual object. Forcing the caller to wrap the parameter in Optional is an annoyance I'd prefer not to inflict on users (i.e. convenience trumps strictness on input).
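To show what the difference looks like at the call site, here is a rough comparison. All three classes are hypothetical, and the Map simply stands in for wherever the caller gets its data from:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical class: optional input accepted as a plain nullable parameter
final class NullableInputAddress {
    private final String postcode;  // may be null

    NullableInputAddress(String postcode) { this.postcode = postcode; }

    Optional<String> getPostcode() { return Optional.ofNullable(postcode); }
}

// Hypothetical alternative: the caller is forced to wrap the input
final class OptionalInputAddress {
    private final String postcode;  // still stored as a nullable field

    OptionalInputAddress(Optional<String> postcode) { this.postcode = postcode.orElse(null); }

    Optional<String> getPostcode() { return Optional.ofNullable(postcode); }
}

final class InputStyleDemo {
    // The caller already has the (possibly null) value and passes it straight in
    static NullableInputAddress fromForm(Map<String, String> form) {
        return new NullableInputAddress(form.get("postcode"));
    }

    // The caller must wrap the value first, only for it to be unwrapped again
    static OptionalInputAddress fromFormStrict(Map<String, String> form) {
        return new OptionalInputAddress(Optional.ofNullable(form.get("postcode")));
    }
}
```

Both versions end up with the same state and the same getter result. The only difference is the extra wrapping imposed on every caller of the second one.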

On the downside, this approach results in objects that are not beans. The return type of the getter does not match the type of the field, which can cause issues for some tools. Before adopting this approach, check that any tool you use can handle it, such as by directly accessing the field.

If adopted widely in an application, the problem of null tends to disappear without a big fight. Since each domain object refuses to return null, the application tends to never have null passed about. In my experience, adopting this approach tends to result in code where null is never used outside the private scope of a class. And importantly, this happens naturally, without it being a painful transition. Over time, you start to write less defensive code, because you are more confident that no variable will actually contain null.

The key to making this approach work beyond the basics is to learn the various methods on Optional. If you simply call Optional.get() you've missed the whole point of the class.
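As a small illustration of the difference, the sketch below handles the result of a getter like getPostcode() in two ways. The class and method names are hypothetical. Only map(), filter(), orElse(), isPresent() and get() are real Optional methods:

```java
import java.util.Optional;

// Hypothetical helper for illustration only
final class PostcodeFormatting {

    // Misses the point: isPresent() plus get() is just a null check in disguise
    static String labelWithGet(Optional<String> postcode) {
        if (postcode.isPresent()) {
            return "Postcode: " + postcode.get();
        }
        return "No postcode";
    }

    // Same result using map() and orElse(), with no branching in the caller
    static String label(Optional<String> postcode) {
        return postcode.map(pc -> "Postcode: " + pc).orElse("No postcode");
    }

    // map() and filter() chain, so a blank postcode becomes an empty Optional
    static Optional<String> normalized(Optional<String> postcode) {
        return postcode
            .map(String::trim)
            .filter(pc -> !pc.isEmpty())
            .map(String::toUpperCase);
    }
}
```

The two label methods behave identically. The second simply states the intent directly and leaves no room to call get() on an empty instance.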

For example, here is some code that handles an XML parse where either "effectiveDate" or "relativeEffectiveDate" is present:

 AdjustableDate startDate = tradeEl.getChildOptional("effectiveDate")
   .map(el -> parseKnownDate(el))
   .orElseGet(() -> parseRelativeDate(tradeEl.getChildSingle("relativeEffectiveDate")));

Breaking this down, tradeEl.getChildOptional("effectiveDate") returns an Optional<XmlElement>. If the element was found, the map() function is invoked to parse the date. If the element was not found, the orElseGet() function is invoked to parse the relative date.
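The snippet above depends on an XML element type and parse methods that are not shown here. To try the same map() and orElseGet() shape in isolation, here is a self-contained sketch in which a Map<String, String> stands in for the XML element. The names StartDateResolver, childOptional and resolveStartDate are hypothetical, and treating the relative date as a number of days after the trade date is an assumption made purely for the example:

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the XML parsing code, for illustration only
final class StartDateResolver {

    // Plays the role of getChildOptional(name): empty if the child is absent
    static Optional<String> childOptional(Map<String, String> element, String name) {
        return Optional.ofNullable(element.get(name));
    }

    // Uses "effectiveDate" if present, otherwise falls back to "relativeEffectiveDate"
    static LocalDate resolveStartDate(Map<String, String> element, LocalDate tradeDate) {
        return childOptional(element, "effectiveDate")
            .map(LocalDate::parse)
            // the supplier only runs when "effectiveDate" is absent;
            // it throws NumberFormatException if the fallback is missing too
            .orElseGet(() -> tradeDate.plusDays(Long.parseLong(element.get("relativeEffectiveDate"))));
    }
}
```

Because orElseGet() takes a supplier, the fallback parse is only attempted when it is needed, which is why the fallback lookup can safely insist that its element exists.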

For a large enterprise-style codebase that uses this approach to Optional, see OpenGamma Strata, a modern open source toolkit for the finance industry.

See also Joda-Beans code generation, which can generate this pattern (and much more).

Finally, it should be noted that some future Java version, beyond Java 9, will probably support value types. In this future world, the costs associated with Optional will disappear, and using it far more widely will make sense. I simply argue that now is not the time for an "optional everywhere" approach.

Summary

This article outlines a pragmatic approach to using Optional in Java 8. If followed consistently on a new application, the problem of null tends to just fade away.

Any comments?