Thursday, 13 August 2015

Java SE 8 Optional, a pragmatic approach

The Optional class in Java 8 is a useful tool to help developers manage data. But advice on how to use it varies. This is my take on one good approach to using Optional in Java 8.

Note that this article assumes you know what Optional is and how it works. See my previous article and other tutorials for more info. Also, be aware that Optional is a heavily argued topic, with some commentators liable to get rather too excited about its importance.

A pragmatic approach to Optional in Java 8

What follows is a specific approach to using Optional in Java 8 that I have found very useful. Note that the approach was developed while writing a new application, rather than maintaining an existing one. There are five basic points:

  1. Do not declare any instance variable of type Optional.
  2. Use null to indicate optional data within the private scope of a class.
  3. Use Optional for getters that access the optional field.
  4. Do not use Optional in setters or constructors.
  5. Use Optional as a return type for any other business logic methods that have an optional result.

For example:

  public class Address {
    private final String addressLine;  // never null
    private final String city;         // never null
    private final String postcode;     // optional, thus may be null

    // constructor ensures non-null fields really are non-null
    // optional field can just be stored directly, as null means optional
    public Address(String addressLine, String city, String postcode) {
      this.addressLine = Preconditions.checkNotNull(addressLine);
      this.city = Preconditions.checkNotNull(city);
      this.postcode = postcode;
    }

    // normal getters
    public String getAddressLine() { return addressLine; }
    public String getCity() { return city; }

    // special getter for optional field
    public Optional<String> getPostcode() {
      return Optional.ofNullable(postcode);
    }

    // return optional instead of null for business logic methods that may not find a result
    public static Optional<Address> findAddress(String userInput) {
      return ... // find the address, returning Optional.empty() if not found
    }
  }

The first thing to notice is that users of our address API are protected from receiving null. Calling getAddressLine() or getCity() will always return a non-null value, as the address object cannot hold null in those fields. Calling getPostcode() will return an Optional<String> instance that forces callers to at least think about the potential for missing data. Finally, findAddress() also returns an Optional. None of these methods can return null.

Within the object, the developer is still forced to think about null and manage it using != null checks. This is reasonable, as the problem of null is constrained. The code will all be written and tested as a unit (you do write tests don't you?), so nulls will not cause many issues.
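
As a rough sketch of what that looks like inside the class, the example below adds a formatSingleLine() method that works with the raw nullable field. The class name AddressSketch and the method formatSingleLine() are hypothetical, added purely for illustration, and Objects.requireNonNull stands in for the Guava precondition check.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical cut-down variant of the Address class, for illustration only.
final class AddressSketch {
  private final String addressLine;  // never null
  private final String city;         // never null
  private final String postcode;     // optional, thus may be null

  AddressSketch(String addressLine, String city, String postcode) {
    this.addressLine = Objects.requireNonNull(addressLine, "addressLine");
    this.city = Objects.requireNonNull(city, "city");
    this.postcode = postcode;
  }

  // public API: callers never see null
  Optional<String> getPostcode() {
    return Optional.ofNullable(postcode);
  }

  // internal logic: uses the field directly with a plain != null check
  String formatSingleLine() {
    String base = addressLine + ", " + city;
    return postcode != null ? base + " " + postcode : base;
  }
}
```

The null check never leaks out: it lives next to the field it guards and is covered by the same unit tests.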

In essence, what this approach does is to focus on using Optional in return types at API boundaries, rather than within a class or on input. Compared to using it as a field, the Optional is now created on the fly. The key difference here is the lifetime of the Optional instance.

It is often the case that domain objects hang about in memory for a fair while, as processing in the application occurs, so an Optional held in a field would be rather long-lived (tied to the lifetime of the domain object). By contrast, the Optional instance returned from the getter is likely to be very short-lived. The caller will call the getter, interpret the result, and then move on. If you know anything about garbage collection you'll know that the JVM handles these short-lived objects well. In addition, there is more potential for HotSpot to remove the costs of the Optional instance when it is short-lived. While it is easy to claim this is "premature optimization", as engineers it is our responsibility to know the limits and capabilities of the system we work with and to choose carefully the point where it should be stressed.

While it is a minor point, it should be noted that the class could be Serializable, something that is not possible if any field is Optional (as Optional does not implement Serializable).
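
The difference can be seen with a small experiment. This is only a sketch: the class names NullBacked and OptionalBacked and the helper canSerialize() are made up for the demonstration, and it uses plain Java serialization to an in-memory stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Optional;

final class SerializationSketch {

  // optional data stored as a nullable field, exposed via an Optional getter
  static final class NullBacked implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String postcode;  // may be null
    NullBacked(String postcode) { this.postcode = postcode; }
    Optional<String> getPostcode() { return Optional.ofNullable(postcode); }
  }

  // optional data stored as an Optional field
  static final class OptionalBacked implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Optional<String> postcode;  // Optional is not Serializable
    OptionalBacked(Optional<String> postcode) { this.postcode = postcode; }
  }

  // returns true if standard Java serialization of the object succeeds
  static boolean canSerialize(Object obj) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(obj);
      return true;
    } catch (NotSerializableException ex) {
      return false;
    } catch (IOException ex) {
      throw new IllegalStateException(ex);
    }
  }
}
```

With these definitions, canSerialize(new NullBacked(null)) succeeds, while canSerialize(new OptionalBacked(Optional.empty())) fails because the field holds a java.util.Optional.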

The approach above does not use Optional for inputs, such as setters or constructors. While accepting Optional would work, it is my experience that having Optional on a setter or constructor is annoying for the caller, as they typically have the actual object. Forcing the caller to wrap the parameter in Optional is an annoyance I'd prefer not to inflict on users. (i.e. convenience trumps strictness on input.)
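
To make the trade-off concrete, here is a small comparison of the two styles from the caller's side. The methods describe() and describeStrict() are hypothetical names invented for this sketch; they are not part of the Address class.

```java
import java.util.Optional;

final class InputSketch {

  // style used in this article: plain parameter, null means "no postcode"
  static String describe(String city, String postcode) {
    return postcode != null ? city + " " + postcode : city;
  }

  // alternative style: the caller must wrap the argument
  static String describeStrict(String city, Optional<String> postcode) {
    return postcode.map(pc -> city + " " + pc).orElse(city);
  }
}
```

A caller holding a plain String writes describe("London", "N1 9GU") in the first style, but has to write describeStrict("London", Optional.of("N1 9GU")) in the second, even though both produce the same result.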

On the downside, this approach results in objects that are not beans. The return type of the getter does not match the type of the field, which can cause issues for some tools. Before adopting this approach, check that any tool you use can handle it, such as by directly accessing the field.

If adopted widely in an application, the problem of null tends to disappear without a big fight. Since each domain object refuses to return null, the application tends to never have null passed about. In my experience, adopting this approach tends to result in code where null is never used outside the private scope of a class. And importantly, this happens naturally, without it being a painful transition. Over time, you start to write less defensive code, because you are more confident that no variable will actually contain null.

The key to making this approach work beyond the basics is to learn the various methods on Optional. If you simply call Optional.get() you've missed the whole point of the class.
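
As a quick illustration of what those methods look like in practice, here is a sketch using filter(), map(), flatMap() and orElse(). The helper names postcodeArea(), firstDigit() and findFirstDigit() are invented for this example, and the postcode handling is deliberately simplistic.

```java
import java.util.Optional;

final class OptionalMethodsSketch {

  // filter + map + orElse: transform the value if present and valid,
  // otherwise fall back to a default, without ever calling get()
  static String postcodeArea(Optional<String> postcode) {
    return postcode
        .filter(pc -> pc.contains(" "))
        .map(pc -> pc.substring(0, pc.indexOf(' ')))
        .orElse("UNKNOWN");
  }

  // flatMap: chain a step that itself returns an Optional
  static Optional<Integer> firstDigit(Optional<String> postcode) {
    return postcode.flatMap(pc -> findFirstDigit(pc));
  }

  private static Optional<Integer> findFirstDigit(String text) {
    for (char ch : text.toCharArray()) {
      if (Character.isDigit(ch)) {
        return Optional.of(Character.getNumericValue(ch));
      }
    }
    return Optional.empty();
  }
}
```

In each case the "missing" branch is handled by the Optional itself, so there is no isPresent()/get() pair to get wrong.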

For example, here is some code that handles an XML parse where either "effectiveDate" or "relativeEffectiveDate" is present:

 AdjustableDate startDate = tradeEl.getChildOptional("effectiveDate")
   .map(el -> parseKnownDate(el))
   .orElseGet(() -> parseRelativeDate(tradeEl.getChildSingle("relativeEffectiveDate")));

Breaking this down, tradeEl.getChildOptional("effectiveDate") returns an Optional<XmlElement>. If the element was found, the map() function is invoked to parse the date. If the element was not found, the orElseGet() function is invoked to parse the relative date.
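
For readers who want to run something similar, here is a self-contained sketch of the same map()/orElseGet() shape. It is not the real parsing code: a Map<String, String> stands in for the XML element, the relative date is assumed to be an ISO-8601 period such as "P2D" applied to a trade date, and the names EffectiveDateSketch and parseStartDate() are invented for the example.

```java
import java.time.LocalDate;
import java.time.Period;
import java.util.Map;
import java.util.Optional;

final class EffectiveDateSketch {

  // stand-in for an XML element lookup: child name -> text content (assumption)
  static Optional<String> getChildOptional(Map<String, String> el, String name) {
    return Optional.ofNullable(el.get(name));
  }

  static LocalDate parseStartDate(Map<String, String> tradeEl, LocalDate tradeDate) {
    return getChildOptional(tradeEl, "effectiveDate")
        // element found: parse the known date
        .map(text -> LocalDate.parse(text))
        // element not found: only now look up and parse the relative date
        // (fails with an exception if that child is missing too)
        .orElseGet(() -> tradeDate.plus(Period.parse(tradeEl.get("relativeEffectiveDate"))));
  }
}
```

Using orElseGet() rather than orElse() matters here, because the relative date is only looked up and parsed when the known date is absent.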

For a large enterprise-style codebase that uses this approach to Optional, see OpenGamma Strata, a modern open source toolkit for the finance industry.

See also Joda-Beans code generation, which can generate this pattern (and much more).

Finally, it should be noted that some future Java version, beyond Java 9, will probably support value types. In this future world, the costs associated with Optional will disappear, and using it far more widely will make sense. I simply argue that now is not the time for an "optional everywhere" approach.

Summary

This article outlines a pragmatic approach to using Optional in Java 8. If followed consistently on a new application, the problem of null tends to just fade away.

Any comments?

36 comments:

  1. We're using Guava's Optional on one project and we're following a similar approach. One difference is that we tend to constrain even more and use Optional for fields as well; inside methods, nulls are allowed.

  2. Another option, particularly for legacy code, is not to add Optional directly to your method signatures, but write a small set of generic liftOptional methods - the Supplier example is trivial (it could easily be replaced in calling code with a direct call to Optional.ofNullable, but illustrates the principle).

    public <T> Supplier<Optional<T>> liftOptional(Supplier<T> s){
    return ()->Optional.ofNullable(s.get());
    }

    Which can be used as follows

    Optional<String> postCode = liftOptional(address::getPostCode).get();

    For methods that accept input parameters, calling code can enforce null safety

    public <T,R> Function<Optional<T>,Optional<R>> liftOptional(Function<T,R> fn){
    return optionalT-> optionalT.map(fn);
    }

    Then we can unobtrusively make the findAddress code null safe externally.

    Optional<Address> address = liftOptional(address::findAddress).apply(Optional.ofNullable(userInput));

    Replies
    1. Try again - html didn't render well..

      public <T> Supplier<Optional<T>> liftOptional(Supplier<T> s){
      return ()->Optional.ofNullable(s.get());
      }

      Optional<String> postCode = liftOptional(address::getPostCode).get();

      (and so on)

    2. I've used a similar method in the past, altho I'd tend to pull the final .get() call directly into liftOptional and evaluate the supplier immediately - and maybe wrap the call in a try/catch to handle any potential NPEs from the supplier.

      At this point I'd probably look at http://tech.unruly.co/validation/ or something similar tho.

  3. What will happen if you try to serialize this into JSON with Gson? How will the optional fields be serialized?

  4. Has anyone done any actual benchmarks to show the benefit of this approach over having Optional fields? I ask, because if you are not using Joda-Beans, this approach is a whole lot more work to get right, and leaves you with non-bean properties (as you noted.)

    I'd like to have some actual data instead of just "it seems like this should be better." My team uses Guava Optional extensively as a field and parameter, including inside of classes deserialized by Jackson (so we create a good number of them) and we have not had problems.

    I'm also mystified that you don't have Optional parameters to setters or constructors. If you are truly using optional everywhere, then you should already have an Optional for every potentially null value in your code. Why would you unwrap that to null just to pass it to another class that knows that it is optional? It seems like you would only do that if you are following this advice (currently unsupported by any evidence) to avoid Optional fields.

    I know you are a smart guy Steven, but I'm not going to take advice that requires a lot more work for me and my team and is potentially error prone without some pretty clear evidence that we cannot afford it.

    Replies
    1. Just to note that, in my experience, when passing data into a method, it is never in an Optional box already. Forcing callers to wrap the argument just to call my method is deeply unfriendly. If I do have a situation where I can be sure that the caller already has an optional box (such as in a private or package scoped method) of course I'd use Optional as a parameter. Since less than 1% of method parameters will accept null in my current codebase, it's not a big deal anyway.

    2. A lot of the time, it's just possible to wrap the method call in an ifPresent() and just call a variant that doesn't have this parameter. Of course, it starts to be difficult when multiple parameters can all be optional.

      I must say that, over time, the few methods that I had that had an Optional parameter just were replaced by two methods, with or without this parameter.

    3. I've found that method overloading handles most cases for optional parameters, otherwise, if it gets too complicated to do that way, wrapping the input arguments in a new object/builder. I agree that having Optionals in method arguments is a bit hostile.

      However for fields, I feel like Optionals are good because it means that the design intent for that field is communicated in the code/type itself, rather than just a comment saying it's optional... It forces all code to treat it as such, and the IDE will give warnings when it is not handled correctly as being optional.

  5. Interesting. Just for the record, you can use standard java.util.Objects.requireNonNull(T) instead of Preconditions.checkNotNull(T).

  6. Optional does not serialize its wrapped object, it serializes itself. I ran into this issue with Jackson and had to define some special getters that were used to override the Optional-wrapped getters. I would not want to have to do that over all of the beans in an entire project.

    Replies
    1. Unless I misunderstand, that is not the behavior when using jackson-datatype-jdk8 ( https://github.com/FasterXML/jackson-datatype-jdk8 ). This library represents an empty Optional as null and a present Optional as the underlying value.

  7. To me it seems quite inconsistent to use Optional on output, but not on input. I think you should use Optional on input (constructors, setters) as well. Your motivation for not using Optional on input is quite weak.

    Replies
    1. For most usage I'd prefer concrete constructors/factory methods to construct an object with missing/default values, but sometimes depending on the usage that's not always clean/easy - not without some form of pattern matching on the calling side - I often find it easier to accept an Optional and lift the source value in.

      I do however also firmly believe that accepting optional values is indicative of some other problem in your API that should be rethought - or at least thought about.

  8. There are Jackson modules for both Guava and JDK8 that can serialise / deserialise both Optional impls to JSON and back.

  9. Can anyone tell me what's wrong with using Optional as a type on instance variables (point 1 in the post)?

    The only case I've heard is basically "it wasn't designed for that", which is a bit disappointing to the pragmatist because it's also really useful for that. Maybe you can get rid of point 1 and call it "an even more pragmatic approach to Optional in Java 8"?

    Replies
    1. It wasn't designed for that YET. This is advice for Java 8. When Java has value types (in 10 or 11), the cost of the Optional box will often fall to zero. If it also becomes serializable, then it would be applicable to be used more widely.

  10. I share the feeling that the asymmetry between getters and setters (not-a-bean) is undesirable. However, I draw the opposite conclusion: Never expose an Optional from a getter (and of course never use it as an instance field). I would only consider returning Optional from business logic methods, and otherwise leave the use of Optional to the caller.

    My reasons:

    Using Optional type fields makes your domain objects less re-usable and less flexible. What is optional may well depend on context, localization, use case etc. In my experience pretty much anything in a domain object can turn out to be optional in some circumstances, except its primary key.

    Hard-coding optionality in the type system makes using composition harder, because you must always consider whether an Optional is returned. That takes away some of the simplicity gained when avoiding null checks. Let's use the well-worn car insurance example (a person owns a car that has an insurance, given the person you want to get the name of the insurance company, anything may be null). With ordinary beans this runs like this:

    String getCarInsuranceNameFluent(Person thePerson) {
    Optional<String> name = Optional.ofNullable(thePerson)
    .map(Person::getCar)
    .map(Car::getInsurance)
    .map(Insurance::getName);
    return name.orElse("Unknown");
    }

    Now suppose you decide that really not everyone owns a car and not every car is insured and make this explicit using classes Person2 and Car2, returning Optional from the respective getters. The code now becomes:

    String getCarInsuranceNameFluentExplicit2(Person2 thePerson) {
    Optional<String> name = Optional.ofNullable(thePerson)
    // must use flatMap for monadic composition
    .flatMap((Person2 p) -> p.getCar())
    .flatMap((Car2 c) -> c.getInsurance())
    .map((Insurance ins) -> ins.getName());
    return name.orElse("Unknown");
    }

    In my view undesirable. I believe in most cases Optional is not a replacement for null values, but a way to make it easier to deal with them in chained computations. They don't belong in the type system.

    Replies
    1. "Hard-coding optionality in the type system makes using composition harder, because you must always consider whether an Optional is returned."

      This is exactly why we use Optionals, to force the code to deal with the fact that the value may not be present. If you receive a value that may be null and pass it around, you have no hints that it may be null outside of the small chance that someone noted that in the Javadoc. If I get an Optional, I absolutely must figure out what to do with it if it is null, or I must assert that it is not null. Goodbye null pointer exceptions.

      Using Optionals for all optional parameters has made our code a lot more robust, and we know (thanks to the type system and some preconditions) that we will not get NPEs down the stack.

      The alternative you are proposing is to not deal with null values. Unless you are just passing a value through to other code that won't care if it is null or not, you need to deal with null eventually. Optional helps make sure you do that.

    2. No, the second example would become

      String getCarInsuranceNameFluentExplicit2(Person2 thePerson) {
      Optional<String> name = Optional.ofNullable(thePerson)
      .flatMap(Person2::getCar)
      .flatMap(Car2::getInsurance)
      .map(Insurance::getName);
      return name.orElse("Unknown");
      }

    3. It is not the case that method references are necessarily better than lambdas. On many occasions I've found method references to be less readable and harder to refactor.

      In particular, that IntelliJ chooses to provide a warning when you could convert a lambda to method reference is deeply unhelpful.

    4. Ah, but method references are just function addresses, which is a genuine functional approach. You are passing the address of a function whatever you are doing (and I won't even start about closures), and when you discover that is what is really happening, then using them is more readable.

  11. Not sure if I agree 100% on the advice of not including Optional in the parameters, I wouldn't go out of my way to support them but if you know that they're likely to be retrieving from an Optional, why not allow taking the optional. I think it might also depend on what has to happen in the constructor or setter, are you simply setting a property? or is there logic around it? if someone passes a null to your method is an NPE likely to occur?

    it's also worth noting that a null optional is a static, so no real cost if it's null.

  12. Reading this post reminds me of the incredible dangers of the billion-dollar mistake of Java's type system supporting null as a primitive. How much have NullPointerExceptions really cost the economy? Newer languages like Ceylon, Kotlin and Swift have mostly omitted null as a primitive and have instead segregated it into its own ghetto in the type system. It's regrettable that null as a primitive in this case still needs to be used as a performance optimization, as a properly designed type system would make this unnecessary. Still, did anybody really know any better 20 years ago?

  13. Replying to some points:
    1) I can sort of see where liftOptional() is going, although just using Optional.ofNullable() would go a long way.

    2) "Optional everywhere" is a feasible design choice, but not one I will recommend until Optional is a value type (sometime after Java 9). We know that using Optional as a field will add another box and another dereference to code. Since memory access is a key performance issue at the hardware level, we can reason that there is the potential for trouble. Similar reasoning applies to GC. And it is also clear from reading the mailing list and the lack of Serializable that it is not currently intended for Optional to be used as a field. Can "optional everywhere" work? Sure, so long as eyes are open.

    3) "Use for input parameters" is a feasible technical choice, however my experience suggests that I very rarely have an object already in an Optional box waiting to be passed in. The safety benefits accrue mainly on querying, not inputting as well. So, while I understand the consistency and documentation points, it seems to me that users would suffer if we had optional input parameters. Looking at a Java with value types, this will continue to be a problem unless there is a mechanism to auto-box into an Optional.

    4) "Not using on getters" seems to lose most of the benefits. Specifically, there is no way to tell if a getter can, or cannot, return null. In a new codebase, it is much better to take control of null and remove it as far as practical.

    Replies
    1. I believe the idea of having total control over a codebase and being able to use Optional throughout, is unrealistic. Obviously so when dealing with pre-existing code that continues to evolve. But even when starting a new project, what about libraries you have to include, pre-Java 8 collections etc.? It's not practical. Null will always be there.

    2. "I believe the idea of having total control over a codebase and being able to use Optional throughout, is unrealistic."

      Unless, you know, you have total control over the codebase, as many of us do. Even if you are only working on a subset, you can ofNullable your inputs and orElse(null) your outputs. NPE worries go away.

      The thing that we are missing here is some context. Clearly Steven is thinking of projects with a lot of existing code not using optionals, where code performance is a major concern. I'm thinking of data driven web micro services that are pulling (potentially ugly) data off the wire, doing a little transformation, and writing to external systems. In my case, the overhead from optionals doesn't really matter because the bottleneck in our system is a DB or something external, not GC.

  14. Using a static analysis tool solves this flawlessly. The Checker Framework has the Nullness Checker. Using the @Nullable and @NonNull annotations makes it clear when there is a type mismatch.

    http://types.cs.washington.edu/checker-framework/current/checker-framework-manual.html

    "If you know anything about garbage collection you'll know that the JVM handles these short-lived objects well."

    In regard to costs, the type annotations mean no extra work when passing the return value, while Optional means some extra work. No work is less than some work!

    Replies
    1. The usefulness of the checker framework is often overestimated. It does not stop nulls at all. It provides a mechanism for an additional tool to be run that might stop nulls. It is also underspecified, has no common standard, works differently in different IDEs, makes no difference to reflective access, is very verbose and frequently annoying.

      The worst part is if you write a public API and declare a variable as "non null". This does nothing. Nothing at all. Only if a caller also uses the checker framework does it make the slightest bit of difference. As such, a public API written with the checker framework is often less safe to use as a caller (because the authors don't include the real null-safety checks that they should).

  15. Your recommendations sound reasonable, but I'd add:
    Don't use Optional in getters, if the getter returns an array or collection. Return an empty array or collection in this case.

  16. something that has been chewing at me is the usefulness when you can have a method like

    public Optional test(){
    return null;
    }

    it's sinister and not realistic, but shows that the caller still has to check for a null Optional instance before calling isPresent() or anything else on the Optional instance

    Replies
    1. The goal of Optional is not to get rid of NPE in case an implementation is buggy. It states there might be an "empty" value (and provides some elegant methods to deal with it). If the caller doesn't deal with the empty value, it is his mistake. If the method returns null, it is simply a bug in the method and NPE (on isPresent() call for example) is completely OK, you don't know how to handle null return from a method that says "i never return null".

  17. What if we have an "already existing" situation like this:
    1) Instance of X is created or not;
    2) if it is created it's passed to the class A with the method 'Optional<X> getX()';
    3) class B gets reference to X from A if it exists, and keeps it in a field, it also has 'Optional<X> getX()' method;
    4) class C gets reference to X .... and so on

    So when X "travels" between those classes it seems to be extra work to unwrap it every time it enters a new class, and then rewrap it in the getter methods.

    Replies
    1. I think the rules above still apply. If necessary, hotspot will inline the methods and remove the extra wrapping.

  18. Just to say that I much appreciate this article, and even the discussion. It seems a lot of people don't get Optional that well, but I found the overall quality of the discussion is quite high here.

  19. 6. Do not create an Optional to replace an if-else statement.

    I have seen this so many times now...

