Stephen Colebourne's blog: 2016

Monday, 26 September 2016

Code generating beans - mutable and immutable

Java has long suffered from the pain of beans. To declare a simple data class takes far too much code. At JavaOne 2016, I talked about code generation options - see the slides.

Code generation of mutable and immutable beans

The Java ecosystem is massive. So many libraries releasd as open source and beyond, which naturally leads to the question as to how those libraries communicate. And it is the basic concept of beans that is the essential glue, despite the ancient specification. How do ORMs (Hibernate etc.), Serialization (Jackson etc.) and Bean Mappers (Dozer etc.) communicate? Via getters and setters.

The essential features of beans have moved beyond the JavaBeans spec, and are sometimes referred to as POJOs. The features are:

Mutable
No-args constructor
Getters and Setters
equals() / hashCode() / toString()

But writing these manually is slow, tedious and error-prone. Code generation should be able to help us here.

But should we be using mutable beans in 2016? No, no, no!

It is time to be writing immutable data structure (immutable beans). But the only practical way to do so is code generation, especially if you want to have builders.

In my talk at JavaOne 2016, I considered various code generation approaches:

IDE code generation

This is fine as far as it goes, but while the code is likely to be correct immediately after generation, there is still no guarantee that the generated code will stay correct as the class is maintained over time.

AutoValue, Immutables and VALJOGen

These three projects - AutoValue, Immutables, VALJOGen - use annotation processors to generate code during compilation. The idea is simple - the developer writes an abstract class or interface, and the tool code generates the implementation at compile time. However, these tool all focus on immutable beans, not mutable (Immutables can generate a modifiable bean, but it doesn't match the JavaBeans spec, so many tools will reject it).

On the up side, there is no chance to mess up the equals / hashCode / toString. While the tools all allow the methods to be manually written if necessary, most of the time, the default is what you want. It is also great not to have to implement the immutable builder class manually.

On the down side, you as the developer have to write abstract methods, not fields. A method is a few more keystrokes than a field, and the Javadoc requires an @return line too. With AutoValue, this is particularly painful, as you have to write the outline of the builder class. With Immutables, there is no need for this.

Of the three projects, AutoValue provides a straightforward simple tool that works, and where the implementation class is hidden (package scoped). Immutables provides a full-featured tool, with many options and ways to generate. By default, the implementation class is publicly visible and used by callers, but there are ways to make it package-scoped (with more code written by you). VALJOGen allows full customisation of the generation template. There is no doubt that Immutables is the most comprehensive of the three projects.

All three annotation processing tools must be setup before use. In general, adding the tool to Maven will do most of the work (Maven support is good). For Eclipse and IntelliJ, these instructions are comprehensive.

Lombok

The Lombok project also uses annotations to control code generation. However, instead of acting as an annotation processor, it hacks into the internal APIs of Eclipse and the Java compiler.

This approach allows code to be generated within the same class, avoiding the need for developers to work with an abstract class or interface. This means that instead of writing abstract methods, the developer writes fields, which is a more natural thing to do.

The key question with Lombok is not what it generates, but the way it generates it. If you are willing to accept the hacky approach, IDE limitations, and the inability to debug into the generated code, then it is a neat enough solution.

For Eclipse, Lombok must be installed, which is fairly painless as there is a GUI. Other tools require other installation approaches, see this page.

Joda-Beans

The Joda-Beans project takes a third approach to code generation. It is a source code regenerator, creating code within the same source file, identified by "autogenerated start/end" comments.

Developers write fields, not abstract methods which is simpler and less code. They also generate code into the same class, which can be final if desired.

One key benefit of generating all the code to the same class is that the code is entirely valid when checked out. There is no need to install a plugin or configure your IDE in any way.

Unlike the other choices, Joda-Beans also provides functionality at runtime. It aims to add C# style properties to Java. What this means in practice is that you can easily treat a bean as a set of properties, loop over the properties and create instances using a standardized builder. These features are the ideal building block for serialization frameworks, and Joda-Beans provides XML, JSON and binary serialization that operates using the properties, generally without reflection. The trade off here is that Joda-Beans is a runtime dependency, so it is only the best option if you use the additional properties feature.

The Joda-Beans regenerator can be run from Maven using a plugin. If you use the standard Eclipse Maven support, then simply saving the file in Eclipse will regenerate it. There is also a Gradle plugin.

Comparing

Most of the projects have some appeal.

AutoValue is a simple annotation processor that hides the implementation class, but requires more code to trigger it.
Immutables is a flexible and comprehensive annotation processor that can be used in many ways.
Lombok requires the least coding by the developer, but the trade-off is the implementation via internal APIs.
Joda-Beans is different in that it has a runtime dependency adding C# style properties to Java, allowing code to reliably loop over beans.

My preference is Joda-Beans (which I wrote), because I like the fact that the generated code is in the same source file, so callers see a normal final class, not an interface or abstract class. It also means that it compiles immediately in an IDE that has not been configured when checked out from source control. But Joda-Beans should really only be used if you understand the value of the properties support it provides.

If I was to pick another tool, I would use Immutables. It is comprehensive, and providing you invest the time to choose the best way to generate beans for your needs, it should have everything you need.

Finally, it is important that readers have a chance to look at the code written and the code generated. To do this, I have created the compare-beangen GitHub project. This project contains source code for all the tools above and more, demonstrating what you have to write.

To make best use of the project, check it out and import it into your IDE. That way, you will experience what code generation means, and how practical it is to work with it. (For example, see what happens when you rename a field/method/class. Does the code generator cope?)

Summary

It is time to start writing and using immutable beans instead of mutable ones. The Strata open source market risk project (my day job) has no mutable beans, so it is perfectly possible to do. But to use immutable beans, you are going to need a code generator, as they are too painful to use otherwise.

This blog has summarised five code generators, and provided a nice GitHub project for you to do you own comparisons.

Tuesday, 20 September 2016

Private methods in interfaces in Java 9

Java SE 9 is slowly moving towards the finishing line. One new feature is private methods on interfaces.

Private methods on interfaces in Java 9

In Java 7 and all earlier versions, interfaces were simple. They could only contain public abstract methods.

Java 8 changed this. From Java 8, you can have public static methods and public default methods.

public interface HolidayCalendar {
  // static method, to get the calendar by identifier
  public static HolidayCalendar of(String id) {
    return Util.holidayCalendar(id);
  }
  
  // abstract method, to find if the date is a holiday
  public abstract boolean isHoliday(LocalDate date);
  
  // default method, using isHoliday()
  public default boolean isBusinessDay(LocalDate date) {
    return !isHoliday(date);
  }
}

Note that I have chosen to use the full declaration, with "public" on all three methods even though it is not required. I have argued that this is best practice for Java SE 8, because it makes the code clearer (now there are three types of method) and prepares for a time when there will be non-public methods.

And that time is very soon, as Java 9 is adding private methods on interfaces.

public interface HolidayCalendar {
  // example of a private interface method
  private void validateDate(LocalDate date) {
    if (date.isBefore(LocalDate.of(1970, 1, 1))) {
   throw new IllegalArgumentException();
 }
  }
}

Thus, methods can be public or private (with the default being public if not specified). Private methods can be static or instance. In both cases, the private method is not inherited by sub-interfaces or implementations. The valid combinations of modifiers in Java 9 will be as follow:

public static - supported
public abstract - supported
public default - supported
private static - supported
private abstract - compile error
private default - compile error
private - supported

Private methods on interfaces will be very useful in rounding out the functionality added in Java 8.

Saturday, 26 March 2016

Var and val in Java?

Should local variable type inference be added to Java? This is the question being pondered right now by the Java language team.

Local Variable Type Inference

JEP-286 proposes to add inference to local variables using a new psuedo-keyword (treated as a "reserved type name").

We seek to improve the developer experience by reducing the ceremony associated with writing Java code, while maintaining Java's commitment to static type safety, by allowing developers to elide the often-unnecessary manifest declaration of local variable types.

A number of possible keywords have been suggested:

var - for mutable local variables
val - for final (immutable) local variables
let - for final (immutable) local variables
auto - well lets ignore that one shall we...

Given the implementation strategy, it appears that the current final keyword will still be accepted in front of all of the options, and thus all of these would be final (immutable) variables:

final var - changes the mutable local variable to be final
final val - redundant additional modifier
final let - redundant additional modifier

Thus, the choice appears to be to add one of these combinations to Java:

var and final var
var and val - but final var and final val also valid
var and let - but final var and final let also valid

In broad terms, I am unexcited by this feature and unconvinced it actually makes Java better. While IDEs can mitigate the loss of type information when coding, I expect some code reviews to be significantly harder (as they are done outside IDEs). It should also be noted that the C# coding standards warn against excessive use of this feature:

Do not use var when the type is not apparent from the right side of the assignment.
Do not rely on the variable name to specify the type of the variable. It might not be correct.

Having said the above, I suspect there is very little chance of stopping this feature. The rest of this blog post focuses on choosing the right option for Java

Best option for Java

When this feature was announced, aficionados of Scala and Kotlin naturally started arguing for var and val. However, while precedence in other languages is good to examine, it does not necessarily apply that it is the best option for Java.

The primary reason why the best option for Java might be different is history. Scala and Kotlin had this feature from the start, Java has not. I'd like to show why I think val or let is wrong for Java, because of that history.

Consider the following code:

 public double parSpread(SwapLeg leg) {
   Currency ccyLeg = leg.getCurrency();
   Money convertedPv = presentValue(swap, ccyLeg);
   double pvbp = legPricer.pvbp(leg, provider);
   return -convertedPv.getAmount() / pvbp;
 }

Local variable type inference would apply fine to it. But lets say that the type on one line was unclear, so we choose to keep it to add clarity (as per the C# guidelines):

 public double parSpread(SwapLeg leg) {
   val ccyLeg = leg.getCurrency();
   Money convertedPv = presentValue(swap, ccyLeg);
   val pvbp = legPricer.pvbp(leg, provider);
   return -convertedPv.getAmount() / pvbp;
 }

Fine, you might say. But what if the code is written in a team that insists on marking every local variable as final.

 public double parSpread(final SwapLeg leg) {
   val ccyLeg = leg.getCurrency();
   final Money convertedPv = presentValue(swap, ccyLeg);
   val pvbp = legPricer.pvbp(leg, provider);
   return -convertedPv.getAmount() / pvbp;
 }

Suddenly, we have a mess. Some parts of the code use final to indicate a "final" (immutable) local variable. Whereas other parts of the code use val. It is the mixture that is all wrong, and it is that mixture that you do not get in Scala or Kotlin.

(Perhaps you don't code using final on every local variable? I know I don't. But I do know that it is a reasonably common coding standard, designed to add safety to the code and reduce bugs.)

Contrast the above to the alternative:

 public double parSpread(final SwapLeg leg) {
   final var ccyLeg = leg.getCurrency();
   final Money convertedPv = presentValue(swap, ccyLeg);
   final var pvbp = legPricer.pvbp(leg, provider);
   return -convertedPv.getAmount() / pvbp;
 }

This is a lot more consistent within Java. final continues to be the mechanism used everywhere to get a final (immutable) variable. And if you, like me, don't worry about the final keyword, it reduces to this:

 public double parSpread(final SwapLeg leg) {
   var ccyLeg = leg.getCurrency();
   Money convertedPv = presentValue(swap, ccyLeg);
   var pvbp = legPricer.pvbp(leg, provider);
   return -convertedPv.getAmount() / pvbp;
 }

I understand the objections many readers will be having right now - that there should be two new "keywords", one for mutable and one for immutable local variables and that both should be of the same length/weight (or that the mutable one should be longer) to push people to use the immutable form more widely.

But in Java it really isn't that simple. We've had the final keyword for many years. Ignoring it results in an unpleasant and inconsistent mess.

Summary

I don't personally like local variable type inference at all. But if we are to have it, we have to make it fit well within the existing language.

I argue that val or let simply does not fit Java, because final already exists and has a clear meaning in that space. While not ideal, I must argue for var and final var, as the only combination on offer that meets the key criteria of fitting the existing language.