Monday, 8 December 2014

What might a Beans v2.0 spec contain?

What features should be in scope and out of scope for a Beans specification v2.0?

Note: I am an independent blogger, not an Oracle employee, and as such this blog is my opinion. Oracle owns the JavaBeans trademark and controls what actually happens to the specification.

Update 2015-01-03: New home page for the Beans v2.0 effort now available!

JavaBeans v1.0

My last blog examined the contents of the current JavaBeans (TM) specification. The key feature elements can be summarized as follows:

  • Properties - accessed using getter and setter methods
  • Events - change events for properties
  • Methods - all public methods
  • BeanInfo - permits customization of properties, events and methods
  • Beans - a utility class to access the features
  • Property editor - a mechanism to edit properties, including conversion to and from a string format
  • GUI support - allows a bean to be manipulated visually

These features were a good fit for 1997, given that the original goal was to be a component system, interoperating with other similar systems like COM/DCOM. These features are no longer what we need, so any Beans v2.0 is naturally going to look a lot different.

Beans v2.0 - Data abstraction

The key goal of a Beans v2.0 spec is to define an abstraction over the data in an object.

When such an abstraction exists, frameworks and libraries become far simpler to write and interact with. For example, a serialization framework at its most basic level simply needs to iterate over all the beans and properties in an object graph in order to perform its task.

In addition to frameworks, many ad-hoc coding tasks become simpler because it is much easier to walk over the object graph of beans and properties. For example, to find all instances of a particular type within an object graph starting from an arbitrary bean.

The data abstraction needs to be fully round-trip. The spec must allow for data to be read from a bean into another format. But it must also provide the tools to allow the original bean to be recreated from the data previously read.

Beans v2.0 - Mutable vs Immutable

The JavaBean v1.0 specification implicitly defines mutable objects. It requires all beans to have a no-args constructor, thus the only way for the properties to get populated is through the use of setters. This mutability is one of the key complaints that people have about JavaBeans, and often Java itself by association.

Any Beans v2.0 specification must support immutable objects.

Immutable classes are now a common design approach to systems and are increasingly popular. Supporting them is thus vital to encompass the range of data objects in use today.

Any Beans v2.0 specification must also still support mutable objects. While there are those that turn their nose up at any mutability, it is still a perfectly valid choice for an application. The decision between mutable and immutable data objects is a trade-off. The Beans v2.0 spec needs to support both options.

The implications of supporting immutable is that an alternative to a no-args constructor and public setters will be needed to support construction. This is likely to be a "bean builder" of some kind.

Beans v2.0 - Beans and Properties

The term "bean" is often associated with a specific Java class, such as Person. The term "property" is often associated with a single field on the class, such as surname or forename.

Any Beans v2.0 spec must not have a reflection-based view of beans and properties.

Firstly, a property is not the same as a field (instance variable). Fields hold the state of an object, but not necessarily in the form that the object wishes to expose (it might be stored in an optimised format for example). As such, it must be possible for a bean to control the set of properties that it publishes.

Secondly, it can be desirable to a bean with a dynamic set of properties. This might consist of a bean with a fixed set of base properties and an extensible map of additional ones, or a bean which is entirely dynamic, much closer to the data model of many dynamic languages.

The implications of this is that there is not a one-to-one mapping between a bean and a Class, nor is there a one-to-one mapping between a property and a java.lang.reflect.Field.

The JavaBeans v1.0 spec treats indexed properties in a special way. I can see no reason to do that in Beans v2.0, particularly given the range of collection types now in use.

Beans v2.0 - Getters and Setters

The current JavaBeans v1.0 spec requires there to be a physical java.lang.reflect.Method instance for both the getter and setter. What Beans v2.0 needs to provide is an abstraction away from this, as it provides many more options.

Any Beans v2.0 spec must not require data access via a java.lang.reflect.Method.

Moving away from direct exposure of reflection raises the level of abstraction. It permits alternative object designs to be exposed as beans.

For example, consider a bean that has no public getters and setters. With the Beans v2.0 abstraction, this could still expose its data as properties, providing interoperation with frameworks, but encapsulating the data for most programmatic uses. This tackles another big criticism of JavaBeans, which is that they cause state to be exposed rather than hidden.

Another example is the HashMap based bean. In this case a Method does not exist for each property, just a single get by name. Being able to expose this as a bean provides the desired dynamic behaviour.

The implications of this are simply that the Beans v2.0 spec abstraction will simply be long the lines of the Map interface get and put.

Beans v2.0 - Annotations

The current JavaBeans v1.0 spec predates annotations in Java. But annotations are now key to many programming practices in Java.

Any Beans v2.0 spec must provide access to property and bean annotations.

Annotations are represented by implementations of the Annotation interface. It so happens that the Annotation interface is not part of reflection. It is possible to implement an annotation yourself (apart from a compiler warning).

What is needed is for the Beans v2.0 spec abstraction to provide access to the set of annotations on the bean, and the set of annotations on each property without going via reflection. With this step, and the ones above, many direct uses of reflection will completely disappear, with the Beans v2.0 abstraction being good enough.

Beans v2.0 - Events

The current JavaBeans v1.0 spec considers events to be first class parts of the abstraction.

There is no good reason to include events in the Beans v2.0 spec.

The current JavaBeans v1.0 spec is GUI focussed, and as such events were an essential element. But the Beans v2.0 spec outlined here focuses on a different goal, abstracting over the data in the object. Round-tripping data from a bean to another format and back to a bean does not require events or a GUI.

There will be those reaching for the comments box right now, demanding that events must be included. But think carefully and you should realise that you don't need to.

None of what is proposed here for Beans v2.0 takes away what works today with JavaBeans v1.0. GUI developers can still carry on using the same getters, setters and events that they do today, including bound and constrained properties. The java.beans package will continue to exist and continue to be available. In addition, JavaFX is the way forward for Java GUIs, and it has its own approach to properties.

Beans v2.0 - Conversion to/from String

The current JavaBeans v1.0 spec includes PropertyEditor. One of the use cases of the class, notably in Spring framework, is to convert a simple object (typically a VALJO) to and from a String. For example, this includes Integer, Enum, Class and Currency.

Any Beans v2.0 spec must tackle conversion to/from a String for simple types.

The most common use case for the Beans v2.0 spec will be to iterate over the object graph and process it in some way, such as to write out a JSON or XML message. In order for this to be practical, all objects in the object graph need to be either a bean, or a simple type convertible to a string. As such, simple type string conversion is very much needed to round out the spec.

The Joda-Convert project contains a simple set of interfaces and classes to tackle this problem, and I imagine any Beans v2.0 spec solution would be similar.

Beans v2.0 - But what does a bean actually look like?

Part of the goal of Beans v2.0 is to allow for different kinds of bean. Immutable as well as mutable. Private getters and setters as well as public. Dynamic map-based structures as well as field-based ones.

Any Beans v2.0 spec should not overly restrict how beans are implemented.

We know that Oracle has no current plans for properties at the language level. As such, it also makes little sense for Beans v2.0 spec to specify a single approach for how classes should be designed. Instead, the spec aims to allow classes that are designed in different ways to all agree on a common way to interact with frameworks and libraries that does not require getters, setters and a no-args constructor.

The Joda-Beans, Lombok, Immutables and Auto projects (and many others) all provide different ways to create beans without writing manual code. All could be adapted to work with the proposed Beans v2.0 spec, as could frameworks like Jackson, Hibernate, Dozer and Groovy (to name but a few).

The simple answer is that the bean could look just like it does today with getters and setters. The real answer is that the bean may look any way it likes so long as it can expose its data via the Beans v2.0 API.

Summary

The current JavaBeans v1.0 spec is very old and not that useful for the kinds of applications we build today. This blog has outlined my opinion on what direction a Beans v2.0 spec should take:

  • Focussed on data abstraction
  • Support for Mutable and Immutable beans
  • Fully abstracted from reflection
  • No requirement for getters and setters
  • No requirement for a no-args constructor
  • Support for dynamically extensible properties
  • Access to bean and property annotations
  • Conversion of simple types to and from a String
  • No support for events

Feel free to comment on the high-level scope outlined above. In particular, if you develop a project that uses the JavaBeans spec (directly or indirectly), please comment to say whether the proposed spec is likely to be sufficient to cover your use cases.

I've also started work on a prototype of the code necessary to support the spec. So also feel free to raise pull requests or join the mailing list / group.

Friday, 28 November 2014

The JavaBeans specification

Have you ever read the JavaBean (TM) specification? It really is from another era!

The JavaBean specification

The JavaBean specification, version 1.01, was the last specification that defined the approach to the common bean pattern used in many Java applications. Most developers have never read it, but if you want a history lesson, its a good read.

The spec dates from 1997, JDK 1.1 and Sun Microsystems, a scary 17 years ago. Serialization and Reflection are new technologies, and CORBA is a big deal.

The introduction starts with:

The goal of the JavaBeans APIs is to define a software component model for Java, so that third party ISVs can create and ship Java components that can be composed together into applications by end users.

It goes on to describe the key goal of being able to interact with the OpenDoc, COM/DCOM/ActiveX and LiveConnect object models, so JavaBeans can be embedded inside Word documents and Excel spreadsheets (via a "transparant bridge"). An early example is the desire to support "menubar merging", combining the menu from a JavaBean with that of the host application.

So, what is a JavaBean?:

A Java Bean is a reusable software component that can be manipulated visually in a builder tool.

The explanation goes on to consider builder tools as web page builders, visual application builders, GUI layout builders, or document editors (like a Word document). Beans may have a GUI, or be invisible, however GUI is clearly the main focus. Note that GUI beans are required to subclass java.awt.Component.

The unifying features of a bean are:

  • Introspection - so tools can analyse a bean
  • Customization - so tools can alter a bean
  • Events - AWT style event handing
  • Methods - the set of methods the bean declares as being callable
  • Properties - the set of get/set properties
  • Persistence - long term storage, primarily by serialization

Beans are expected to run in a container, and that container is expected to provide certain behaviour around persistence and gluing beans together. Notably, a connection from one bean to another can either be composition (where the bean is expected to serialize the composed bean) or a "pointer to an external bean", where it is expected that the container will recreate it.

The multi-threading section is very simplistic:

It is the responsibility of each java bean developer to make sure that their bean behaves properly under multi-threaded access. For simple beans this can generally be handled by simply making all the methods “synchronized”.

Another interesting section indicates a future plan to add a "view" layer, where multiple beans can be composed together into a view. To enable this, two methods Beans.instantiate() and Beans.getInstanceOf() are supposed to be used. Casts and instanceof checks are forbidden.

The most classic part of the spec is the scenarios section:

Buying the tools.
1.a) The user buys and installs the WombatBuilder Java application builder.
1.b) The user buys some JavaBeans components from Al’s Discount Component Store. They get a win32 floppy containing the beans in a JAR file. This includes a “Button” component and a “DatabaseViewer” component.
1.c) The user inserts the floppy into the machine and uses the WombatBuilder...

Floppy disks! Well this is 1997...

The section does highlight that bean customization is expected to make available a mechanism of serializing beans as source code, via the PropertyEditor.getJavaInitializationString() method. Standard Java object serialization is a mandatory technique, while the source code route is optional.

Event handling is familiar enough, using java.util.EventListener and java.util.EventObject (think mouse listeners for example). Beans can define their own event hierarchies and publish the events that the bean defines. The listeners are based on a naming convention:

The standard design pattern for EventListener registration is:
public void add<ListenerType>(<ListenerType> listener);
public void remove<ListenerType>(<ListenerType> listener);

And of course these methods are expected to access a Vector to store the listeners. A special case is that of limiting the number of listeners using java.util.TooManyListenersException, not something I've ever seen. Oh, and you are supposed to declare and throw checked exceptions in listeners, not unchecked.

Section 7 covers properties, described as "discrete, named attributes of a JavaBean". They are defined by a naming convention we all know well:

void setFoo(PropertyType value); // simple setter
PropertyType getFoo(); // simple getter

There is also direct support for indexed properties, but only on arrays:

void setter(int index, PropertyType value); // indexed setter
PropertyType getter(int index); // indexed getter
void setter(PropertyType values[]); // array setter
PropertyType[] getter(); // array getter

Note that the spec is slightly unclear here, using "setter" and "getter" as a placeholder for a "setFoo" type name. All getters and setters are allowed to throw checked exceptions.

Bound and constrained properties get a mention, along with PropertyChangeListener, PropertyVetoException and VetoableChangeListener. Vetoing gets complicated when there are multiple event listeners and some may have been notified of a success before the veto occurs. The spec has a suprising amount of detail on these two concepts, including a variety of ways of registering listeners.

The introspection section finally talks in detail about using reflection ad design patterns to find getters and setters, plus the ability to use a more specific BeanInfo:

By default we will use a low level reflection mechanism to study the methods supported by a target bean and then apply simple design patterns to deduce from those methods what properties, events, and public methods are supported. However, if a bean implementor chooses to provide a BeanInfo class describing their bean then this BeanInfo class will be used to programmatically discover the beans behaviour.

...within Java Beans the use of method and type names that match design patterns is entirely optional. If a programmer is prepared to explicitly specify their properties, methods, and events using the BeanInfo interface then they can call their methods and types whatever they like. ... Although use of the standard naming patterns is optional, we strongly recommend their use as standard naming conventions are an extremely valuable documentation technique

Bean properties, getters and setters, are matched based on the "get" and "set" prefix, and they may be defined in a superclass. Read-only and write-only properties are permitted. Boolean properties may use the "is" prefix, which may be used instead of, or in addition to, the normal "get" prefixed method.

Bean events are similarly matched by "addFooListener" and "removeFooListener" pattern matching. Both must be present for an event to be recognised.

Bean methods are simply defined as all public methods of the class. The list of methods includes getters, setters and event methods.

The BeanInfo class is used to selectively replace parts of the reflected information. Thus, the properties could be defined, but the events obtained by reflection. The java.beans.Introspector class is used to obtain the combined view, including the BeanInfo and reflected data.

It turns out that there are specific rules around capitalization in property names:

Thus when we extract a property or event name from the middle of an existing Java name, we normally convert the first character to lower case. However to support the occasional use of all upper-case names, we check if the first two characters of the name are both upper case and if so leave it alone. So for example,
“FooBah” becomes “fooBah”
“Z” becomes “z”
“URL” becomes “URL”

When customizing a bean, the default approach is to take the set of properties and expose an editor. This would operate as a key-value table, but where each property can nominate a java.beans.PropertyEditor to provide a GUI for defining that property. The classic example is a colour picker for a property of type Color. Again, any GUI is an AWT Component. And you can register an editor simply by naming it after the property being configured - Foo class is edited by FooEditor in the same package or a set of search packages.

The official tutorial is also instructive.

The simple view is that a bean just needs to be serializable, have a no-args constructor and have getters/setters matching a naming convention. While this is true, the actual spec has a lot more in it, including many rules that most developers do not follow, such as the use of Beans.instantiate() and Beans.getInstanceOf().

Summary

This has been a whistlestop tour of the JavaBeans spec. What hits you between the eyes when you read it is a sense of how old it is. The world has very much moved on.

More significantly however, is that the "beans" that are so common in Java code, consisting of mutable classes with getters and setters are not what the authors of the spec had in mind. As such, we probably shouldn't really call them beans.

But perhaps it is time to write up a JavaBeans v2 spec to rethink the problem being solved in the absence of a properties language feature.

Saturday, 15 November 2014

Devoxx 2014 - Whiteboard votes

As is traditional, the attendees of Devoxx vote on certain topics on whiteboards. Each result must be treated with care - there is no validation process, and only a fraction of the 3500 attendees actually bother to vote. Nevertheless, the results can be interesting.

What version of Java do you use at work right now?

A high level question focussed on Java at the day job. Java 8 hasn't surpassed Java 6 yet, but it is doing pretty well. Remember that Java 7 is End-Of-Life in April 2015!

Java 91
Java 863
Java 7160
Java 681
Java 511
Java 43
Java 31
Dalvik6
Other4


What IDE do you use?

A sense of how the IDE market is looking. The interesting data point is how Eclipse is falling - compare with 2008, where Eclipse led IntelliJ by almost 2 to 1, and NetBeans by almost 4 to 1. Sadly, the absolute number of votes is a lot lower than 6 years ago, so be careful with comparisons.

Eclipse36
NetBeans26
IntelliJ75
Vi/vim8
Other1

Beyond this, I heard various people comment on IBM's apparent withdrawal of funding for Eclipse IDE and what that might mean looking forward.

What JVM language are you using?

An attempt to see what languages people are using. Whenever this is asked, it is always an opportunity for advocates of a language to push hard, so again take the results with a good pinch of salt:

Java214
Scala64
Groovy62
Clojure15
Rhino/JS6
Ceylon3
Kotlin1
Cobol1


Is merging Scala into Java a good idea?

An odd question with a predictable answer - WTF!

Yes3
No26
WTF?! seriously?52
Depends2


What JavaScript framework/library do you use?

An open question that showed a clear dominance for Angular.

Angular51
Ember4
Reactive2
Backbone6
Knockout7
Angular-Dart3
Dojo5
JQuery9
Meteor2


Is this the end for Angular?

Angular 2.0 - "no migration path, not backwards compatible". Whether this is factually correct or not, this was the question asked.

Yes4
No47
Maybe12
Hate JavaScript24
Follow GWT1


What desktop UI technology have you used in the last 3 years?

A sense of where the desktop UI is today. Given that JavaFX has only been truly useful in Java SE 8, its not a bad showing for FX.

None39
Swing41
JavaFX15
MS .NET10
Chrome apps2
Eclipse RCP14
NetBeans platform2
QT5
VCL1
CLI4
OGLPHI1
Applet2
Broswer/HTML7


How do you create WebApps?

An open question on web technologies. Bear in mind that some options were added after others, notably JAX-RS+JavaScript, which was added long after REST+JavaScript. I think its fair to say that there are an odd mix of answers, but a clear leader in REST + JavaScript.

JSF+JavaEE44
REST+JavaScript101
JAX-RS+JavaScript9
Play23
Swing MVC76
Wicket26
JavaScript library50
Grails8
GWT30
Vaadin20
Apache Sling6
WebSockets+JavaScript6
Spray+AkkaHTTP6
Ruby on Rails2
ZK framework4
OmniFaces4
PrimeFaces10
REST+Dart2
Fluent-HTTP1
AngularJS23
Polymer3
By hand, with love2
Web motion3
Struts v1 or v211
Beehive2
JRuby3
Bootstrap2


What one thing would you add to Java (your top priority)?

A traditional Devoxx question, with a long shopping list of answers, as always. Note however that Brian Goetz indicated strongly that properties as a language feature is not happening.

Properties54
Better handling of null33
Multi-line strings33
Safe null dereference23
Interpolation of strings22
Immutability21
Value types16
Unsigned int11
Too bloated!8
Better logging7
Pattern matching8
Sub-package scope4
Tail recursion3
Solve expression problem2
Algorithmic data types2
Class versioning2
Cluster aware JVM1
Performance1
Free$ embedded support1
Inner functions1
Generators and co-routines1
Operator overloading1 (plus 2 minus 1 !)
Enterprise modularity1


Summary

Lots of interesting, and no doubt slightly dubious, statistics to examine. Use with care, and congratulations to the Devoxx team for another great conference!

Thursday, 13 November 2014

No properties in the Java language

Here at Devoxx 2014, Brian Goetz went a little further than he has before about whether properties will ever be added to Java as a language feature.

No properties for you

Properties as a language feature in Java has been talked about for many years. But ultimately, it is Oracle, the Java stewards, who get to decide whether it should be in or out.

Today, at Devoxx, Brian Goetz, Java Language Architect, went a little further than he has before to indicate that he sees little prospect of properties being added to the language at this point.

Oracle have chosen to prioritize lambdas, modularization and value types as key language features over the JDK 8 to 10 timescale. As such, there was no realistic prospect of properties before JDK 11 anyway. However, the indications today were reasonably clear - that it isn't on the radar at all.

When quizzed, Brian indicated two main reasons. First, that it is perhaps too late for properties to make a difference as a language feature, with compatibility with older getter/setter beans being a concern. Second, that there are many different, competing visions of what properties means, and there is an unwillingness to pick one.

Competing visions include:

  • Just generate getters, setters, equals, hashCode and toString
  • Generate observable properties for GUIs
  • Raising the level of abstraction, treating a property as a first class citizen
  • Immutability support, with builders for beans
  • New shorter syntax for accessing properties

While it is easy to argue that these are competing visions, it is also possible to argue that these are merely different requirements, and that all could be tackled by a comprehensive language enhancement.

(Just to note that "beans and properties" does not imply mutable objects. It is perfectly possible to have immutable beans, with or without a builder, but immutable beans are harder for frameworks to deal with. Value types are intended for small values, not immutable beans.)

As is well known, I have argued for properties as a language feature for many years. As such, I find the hardening of the line against properties to be intensely disappointing.

Every day, millions of developers interact with beans, getters, setters and the frameworks that rely on them. Imagine if someone today proposed a naming convention mechanism (starts with "get") for properties? It would be laughable in any language, never mind the world's most widely used enterprise development language.

Instead, developers use IDEs to generate getters, setters, equals and hashCode, adding no real value to their code, just boilerplate of the kind Java is notorious for.

Consider that with 9 million Java developers, there could easily be 50,000 new beans created every day, each with perhaps 100 lines of additional boilerplate getters/setters/equals/hashCode/toString - easily 5 million lines of pointless code created, per day.

(And plenty more code in serialization and mapping code that either tries to deduce what the properties are, or results in lots of additional manual mapping code)

Realistically however, we have to accept that in all likelihood Java will never have properties as a language feature.

Summary

The Java language is not getting properties.

Comments welcome, but please no "Foo language has properties" comments!

Wednesday, 12 November 2014

Converting from Joda-Time to java.time

What steps are needed to upgrade from Joda-Time to Java SE 8 date and time (JSR-310)?

From Joda-Time to java.time

Joda-Time has been a very successful date and time library, widely used and making a real difference to many applications over the past 12 years or so. But if you are moving your application to Java SE 8, its time to consider moving on to java.time, formerly known as JSR-310.

The java.time library contains many of the lessons learned from Joda-Time, including stricter null handling and a better approach to multiple calendar systems. I use the phraseology that java.time is "inspired by Joda-Time", rather than an exact derivation, however many concepts will be familiar.

Converting types

Classes not including a time-zone:

Joda-Time java.time Notes
LocalDate LocalDate Same concept - date without time or time-zone
YearMonthDay
LocalTime LocalTime Same concept - time without date or time-zone
TimeOfDay
LocalDateTime LocalDateTime Same concept - date and time without time-zone
MonthDay MonthDay Same concept - month and day without time-zone
YearMonth YearMonth Same concept - year and month without time-zone
- Year New concept - a value type for the year
- Month New concept - an enum for the month-of-year
- DayOfWeek New concept - an enum for the day-of-week
Partial - Not included in java.time

Classes including a time-zone or representing an instant:

Joda-Time java.time Notes
DateTime ZonedDateTime Same concept - Date and time with time-zone
OffsetDateTime New concept - Date and time with offset from Greenwich/UTC
MutableDateTime - Not included in java.time, use immutable ZonedDateTime instead
DateMidnight - Deprecated as a bad idea in Joda-Time, Use LocalDate or ZonedDateTime
- OffsetTime New concept - Time with offset from Greenwich/UTC
Instant Instant Same concept - Instantaneous point on timeline
DateTimeZone ZoneId Same concept - Identifier for a time-zone, such as Europe/Paris
ZoneOffset New concept - An offset from Greenwich/UTC, such as +02:00

Amounts of time:

Joda-Time java.time Notes
Duration Duration Same concept - Amount of time, based on fractional seconds
Period Period and/or Duration Similar concept - Amount of time
Joda-Time Period includes years to milliseconds, java.time only year/month/day (see also (ThreeTen-Extra PeriodDuration)
MutablePeriod - Not included in java.time, use immutable Period or Duration instead
Years - Not included in java.time, use Period instead (or ThreeTen-Extra)
Months
Weeks
Days
Hours - Not included in java.time, use Duration instead (or ThreeTen-Extra)
Minutes
Seconds
Interval - Not included in java.time (see ThreeTen-Extra)
MutableInterval - Not included in java.time

Formatting:

Joda-Time java.time Notes
DateTimeFormatter DateTimeFormatter Same concept - an immutable formatter
DateTimeFormatterBuilder DateTimeFormatterBuilder Same concept - a builder of the formatter
DateTimeFormat DateTimeFormatter Concepts exposed as static methods on DateTimeFormatter
ISODateTimeFormat
PeriodFormatter - Not included in java.time (see ThreeTen-Extra for limited support)
PeriodFormatterBuilder
PeriodFormat
ISOPeriodFormat

Other classes and interfaces:

Joda-Time java.time Notes
- Clock New concept - Standard way to pass around the current time
Chronology Chronology Similar concept - Very different API and implementation
ISOChronology IsoChronology Similar concept - Very different API and implementation
DateTimeFieldType ChronoField Similar concept - Very different API and implementation
DateTimeField
DurationFieldType ChronoUnit Similar concept - Very different API and implementation
DurationField
PeriodType - Not included in java.time
Readable* Temporal* The Readable* interfaces are most closely replaced by the Temporal* interfaces
It is strongly recommended to use the value types, not the temporal interfaces

Odds and Ends

In most cases, the table above will be sufficient to identify the class to convert to. After that, in most cases the method needed will be obvious. Some special cases are noted below.

Rounding. Joda-Time has considerable support for rounding, java.time has less, based only on truncation. See the truncatedTo() methods.

End of month. Joda-Time has the withMaximumValue() method on the property object to change a field to the last day of the month. java.time has a more general mechanism, see TemporalAdjusters.lastDayOfMonth().

Nulls. Joda-Time accepts null as an input and handles it with a sensible default value. java.time rejects null on input.

Construction. Joda-Time has a constructor that accepts an Object and performs type conversion. java.time only has factory methods, so conversion is a user problem, although a parse() method is provided for strings.

Time between. Joda-Time provides methods on amount classes such as Days.between(a, b). java.time provides a similar method on the ChronoUnit classes - ChronoUnuit.DAYS.between(a, b).

Additional fields. Joda-Time provides a fixed set of fields in DateTimeFieldType. java.time allows the set of fields to be extended by implementing TemporalUnit, see IsoFields, JulianFields and WeekFields.

Summary

This blog post has outlined the key mappings between Joda-Time and java.time in Java SE 8. If you are writing code in Java SE 8, its time to migrate to java.time, perhaps using ThreeTen-Extra to fill any gaps.

Monday, 10 November 2014

One more library for JDK 9?

On Tuesday night in Belgium at Devoxx I'm asking the question "Is there room for one more library in JDK 9"?

With JDK 9 full of modules, and the long wait until JDK 10 for more interesting things like value types, I pose this question to the community to ask what would you add to JDK 9 if you could?

But, I'm asking more than that. Any additional work for JDK 9 would have to be led by someone outside Oracle capable of drawing together a community. And whatever it is it would need to be tightly focussed - the timescale available is limited.

Think of something that could be expressed in 20 classes or less. Involves only code and no library or JVM changes. Something that the JDK doesn't have, but you think it should. Think about whether you'd put in your time to try and make it happen.

But also think about whether your idea should be in the JDK? Is it better as an external library? What makes it necessary to be in the JDK, especially a modularized JDK? Will Oracle's Java stewards support the idea?

As an example, I'll put forward Joda-Convert, a very small and simple library that provides a mechanism to convert and round-trip a simple object (like a VALJO) to a formatted string, ideal for use in serialization, such as JSON or XML (and noting that JSON is coming to JDK 9...). I'm sure that if you're reading this, you've got some other ideas!

If you've got any thoughts, ideas, or want to suggest something, please add a comment. But remember the rules above - there is no free lunch here. Are you willing to take a lead on your idea?

Thursday, 6 November 2014

Better nulls in Java 10?

Rethinking null when Java has value types.

Null in Java

Was null really the billion dollar mistake? Well, thats an interesting question, but what is certain is that null is widely used in Java, and frequently a very useful concept. But, just because it is useful doesn't mean that there isn't something better.

Optional everywhere!

There are those that would point you to adopt the following strategy - avoid using null everywhere, and use Optional if you need to express the possible absence of something. This has a certain attraction, but I firmly believe the Java is not yet ready for this (in Java SE 8) as per my previous post on Optional.

Specifically, all the additional object boxes that Optional creates will have an effect of memory and garbage collection unless you are very lucky. In addition, its relatively verbose in syntax terms.

Nullable annotations

A second approach to handling null is to use the @Nullable and @NonNull annotations. Except that it is only fair to point out that from an engineering perspective, annotations are a mess.

Firstly, there is no formal specification for nullable annotations. As such, there are multiple competing implementations - one from FindBugs, one from JSR-305, one from IntelliJ, one from Eclipse, one from Lombok and no doubt others. And the semantics of these can differ in subtle ways.

Secondly, they don't actually do anything:

  @javax.annotation.ParametersAreNonnullByDefault
  public class Foo {
    public StringBuilder format(String foo) {
      return new StringBuilder().append(foo);
    }
    public StringBuilder formatAllowNull(@Nullable String foo) {
      return new StringBuilder().append(foo);
    }
  }

Here, I've use @javax.annotation.ParametersAreNonnullByDefault to declare that all parameters to methods in the class are non-null unless otherwise annotated, that is they only accept non-null values. Except that there is nothing to enforce this.

If you run an additional static checker, like FindBugs or your IDE, then the annotation can warn you if you call the "format" method with a null. But there is nothing in the language or compiler that really enforces it.

What I want is something in the compiler or JVM that prevents the method from being called with null, or at least throws an exception if it is. And that cannot be switched off or ignored!

The nice part about the annotations is that they can flip the default approach to null in Java, making non-null the default and nullability the unusual special case.

Null with value types in JDK 10

The ongoing work on value types (target JDK 10 !!) may provide a new avenue to explore. Thats because Optional may be changed in JDK 10 to be a value type. What this might mean is that an Optional property of a Java-Bean would no longer be an object in its own right, instead it would be a value, with its data embedded in the parent bean. This would take away the memory/gc reasons preventing use in beans, but there is still the syntactic overhead.

Now for the hand-waving part.

What if we added a new psuedo keyword "nonnull" that could be used as a modifier to a class. It would be similar to @javax.annotation.ParametersAreNonnullByDefault but effectively make all parameters, return types and variables in the class non-null unless modified.

Then, what if we introduce the use of "?" as a type suffix to indicate nullability. This is a common syntax, adopted by Fantom, Ceylon and Kotlin, and very easy to understand. Basically, wherever you see String you know the variable is non-null and wherever you see String? you know that it might be "null".

Then, what if we took advantage of the low overhead value type nature of Optional (or perhaps something similar but different) and used it to represent data of type String?. Given this, a "nonnull" class would never be able to see a null at all. This step is the most controversial, and perhaps not necessary - it may be possible for the compiler and JVM to track String and String? without the Optional box.

But the box might come in useful when dealing with older code not written with the "nonnull" keyword. Specifically, it might be possible to teach the JVM to be able to auto box and un-box the Optional wrapper. It might even be possible to release a standard annotation jar that projects could use on codebases that need to remain compatible with JDK 8 or 9, where the annotation could be interpreted by a JDK 10 JVM to provide similar nullability information.

  public nonnull class Foo {
    public StringBuilder format(String foo) {
      return new StringBuilder().append(foo);
    }
    public StringBuilder formatAllowNull(String? foo) {
      return new StringBuilder().append(foo);
    }
  }

The example above has been rewritten from annotations to the new style. It is clearly a shorter syntax, but it would also benefit from proper integration into the compiler and JVM resulting in it being very difficult to call "format" with a null value. Behind the scenes, the second method would be compiled in bytecode to take a parameter of Optional<String>, not String.

The hard part is the question of what methods can be called on an instance of String?. The simple encoding suggests that you can only call the methods of Optional<String>, but this might be a little surprising given the syntax. The alternative is to retain the use of null, but with an enforced check before use.

The even harder part is what to do with things like Map.get(key) where null currently has two meanings - not found and found null.

This concept also appears to dovetail well into the value type work. This is because value types are likely to behave like primitives, and thus cannot be null (the value type has to be boxed to be able to be null). By providing proper support for null in the language and JVM, this aspect of value types, which might otherwise be a negative, becomes properly handled.

It should also be pointed out that one of the difficulties that current tools have in this space is that the JDK itself is not annotated in any way to indicate whether null is or is not allowed as an input or return type. Using this approach, the JDK would be updated with nullability, solving that big hurdle.

(As a side note, I wrote up some related ideas in 2007 - null-safe types and null-safe invocation.)

Just a reminder. This was a hand-wavy thought experiment, not a detailed proposal. But since I've not seen it before, I thought it was worth writing.

Just be careful not to overuse Optional in the meantime. Its not what it was intended for, despite what the loud (and occasionally zealot-like) fans of using it everywhere would have you believe.

Summary

A quick walk through null in Java, concluding with some hand-waving of a possible approach to swapping the default for types wrt null.

Feel free to comment on the concept or the detail!

Wednesday, 5 November 2014

Optional in Java SE 8

Java SE 8 added a new class to handle nullability - Optional - but when should it be used?

Optional in Java SE 8

Optional is not a new concept, but an old one which often results in polarized opinions. It was added to Java 8 and is used in limited places at the moment.

The class itself is a simple box around an object, which has a special marker for empty instead of using null:

  Optional<String> optional = Optional.of("Hello");
  Optional<String> empty = Optional.empty();

The goal of the class is to reduce the need to use null. Instead, Optional.empty() takes the place of null. One way to think of it is as a collection that holds either zero or one objects.

When you want to get the data out of the optional, there is a nice simple method, get(), but this will throw an exception if the optional is empty. In most cases, you will want to call one of the other methods, for example to return a default value:

  Optional<Person> personResult = findPerson(name);
  Person person = personResult.orElse(Person.GUEST);

One approach to using Optional is to use it everywhere in your entire codebase and eliminate null altogether. I don't subscribe to this approach, and nor does Brian Goetz.

Of course, people will do what they want. But we did have a clear intention when adding this feature, and it was not to be a general purpose Maybe or Some type, as much as many people would have liked us to do so. Our intention was to provide a limited mechanism for library method return types where there needed to be a clear way to represent "no result", and using null for such was overwhelmingly likely to cause errors.

The key here is the focus on use as a return type. The class is definitively not intended for use as a property of a Java Bean. Witness to this is that Optional does not implement Serializable, which is generally necessary for widespread use as a property of an object.

Where it is used as a return type, Optional can work very well. Consider the case where you are finding the first matching element in a stream. In this case, what should the stream API do if the stream is empty? Returning Optional makes it clear that this case should be considered when using the API.

For me, the pleasant surprise about Optional is that it has three friends - OptionalInt, OptionalLong and OptionalDouble that wrap an int, long and double. I've made good use of these as return types where previously I would have used Integer, Long and Double. Not only is the increase in meaning helpful, but the ability to call orElse(0) to default the missing value is a useful feature for API users.

My only fear is that Optional will be overused. Please focus on using it as a return type (from methods that perform some useful piece of functionality) Please don't use it as the field of a Java-Bean.

(As a side note, one reason not to use it in this way is that it creates an extra object for the garbage collector. If used as a method return type, the extra object is typically short-lived, which causes the gc little trouble. But when used in a Java-Bean, the object is likely to be long-lived, with a memory/gc overhead. I was bitten by something very similar to this about 13 years ago, and I can categorically say that for server systems running at scale, you don't want lots of little object boxes within your beans. Interestingly, if Optional becomes a value type in a future Java version, then the memory/gc issue goes away.)

Optional in Java SE 8

Java 8 Optional, and its three primitive friends, is very useful when used as a return type from an operation where previously you might have returned null. As written, the class is is not intended for use as a Java-bean property - please don't use it for that.

Sunday, 7 September 2014

Oracle RDBMS empty string and null

The Oracle database (RDBMS) has a "wonderful quirk" - it treats null and empty string as one and the same.

Oracle database empty string

If you have never used Oracle RDMBS you may not be aware of this particular "feature". Basically, if you insert an empty string into the database it is treated as null. Thus, a column declared as "NOT NULL" will reject the insertion of an empty string. For me, and probably most other developers, this isn't what you expect.

The Stack Overflow explanation is "I believe the answer is that Oracle is very, very old." As well as being amusing, it is as good an explanation as any other.

Last week, I had the dubious pleasure of porting the OpenGamma database infrastructure to work on Oracle. It already worked on various other databases (including HSQLDB, Postgres and SQL Server) but Oracle is rather an outlier in the world of SQL.

Most of the conversion went fine with ElSql able to handle any deviations from standard SQL. But the problem of empty strings and null was a little more vexing.

The first question was whether we could avoid needing to identify the difference between null and an empty string? I believe that if Oracle is your primary database, then you could code your application in that way. However, if you are coding the application to work on multiple different databases then this is simply not an option.

The second question was of course how to handle the problem? My research threw up three realistic possibilities:

  1. Prefix all strings by the same character, such as '@'. The string 'Stephen' would be stored in the database as '@Stephen', while the empty string would be stored as '@'.
  2. Replace the empty string with a magic value, such as '!!! EMPTY !!!'.
  3. Store an additional flag column indicating the difference between null and the empty string.

Option 1 means that every string in the database is polluted, making it hard to use the data directly by SQL (bypassing the application). Option 2 relies on the magic value never occurring in real data, and direct SQL access is again compromised (though less than option 1). Option 3 is the most "pure" solution, but would have been very hard to adopt when writing one piece of code for multiple different databases.

My chosen solution was a variation on option 2 - encoding using "one extra space". I'm documenting it here in case anyone else finds it to be a useful strategy:

  • The empty string is stored in the database as a single space (ASCII 32).
  • Any string that consists only of spaces (ASCII 32) is stored with one additional space added.
  • Any other string is stored without change

The advantage of this encoding is that the vast majority of strings are stored without being changed. This makes access by direct SQL simple and obvious. The only strings to be encoded are the empty string, and the unlikely "all space" strings. The encoding will make little difference to most algorithms or displays even if it is not decoded. The encoding is also fully reversible, provided that column length limits are not hit.

Fortunately, I was able to encode and decode the data in relatively few places. For decoding, a decorated ResultSet worked effectively. For encoding, it was possible to create a subclass of Spring's NamedParameterJdbcTemplate and JdbcTemplate to do the trick. See this commit for details.

As a final note, there are other complications with data storage in Oracle. As I understand it, NaN and null also cannot be separated. This blog only covers problems of strings, which is tricky enough!

Summary

I will continue to believe that an empty string and null are two different concepts. Oracle RDBMS disagrees. If you face the problem of creating a workaround, perhaps the option I used is worth considering.

Wednesday, 6 August 2014

StringJoiner in Java SE 8

Java SE 8 added a new class for joining strings - StringJoiner. But is it any good?

StringJoiner

Here is what the Javadoc of the new class says:

StringJoiner is used to construct a sequence of characters separated by a delimiter and optionally starting with a supplied prefix and ending with a supplied suffix.

Sounds good! Maybe we can finally stop using the Joiner class in Guava.

Unfortunately, the JDK StringJoiner is not a general purpose string joiner in the way that the Guava Joiner is. The first clue is in the API:

// constructors
 StringJoiner(CharSequence delimiter)
 StringJoiner(CharSequence delimiter, CharSequence prefix, CharSequence suffix)
 // methods
 StringJoiner setEmptyValue(CharSequence emptyValue)
 StringJoiner add(CharSequence newElement)
 StringJoiner merge(StringJoiner other)
 int length()
 String toString()

There are two constructors, one taking the separator (delimiter) and one allowing a prefix and suffix. There are then just 5 other methods. By contrast, the Guava version has 15 other methods on top of 2 factory methods.

But the real missing thing? A method to add multiple elements at once to the joiner!

Every time I want to join, I have a list, set or other iterable. With Guava I simply say:

 String joined = Joiner.on(", ").join(list);

StringJoiner has no equivalent method. You have to add the elements one by one using add(CharSequence)!

 StringJoiner joiner = new StringJoiner(", ");
 for (String str : list) {
   joiner.add(str);
 }
 String joined = joiner.toString();

I think we'd all agree that rather defeats the purpose of having a joiner at all!

However, it turns out that it is kind of possible to add multiple with the JDK, but you might not spot it:

 String joined = String.join(", ", list);

So, not too bad then?

Firstly, I don't expect the method to actually perform a useful join to be on String, I expect it to be on StringJoiner. The method on String is not referenced from StringJoiner at all.

Secondly, the method on String is static, whereas the Guava method is an instance method. This means that the Guava method can pickup additional state from the builder phase of the joiner, such as the ability to handle null. The Guava joiner can in fact handle Map joins as well thanks to its clever immutable instance-based design.

Thirdly, StringJoiner only works on CharSequence. By contrast, Guava's Joiner works on Object, which is much more useful in most circumstances.

Rationale

So, why was StringJoiner written this way?

Well, partly, it is just bad API design. But the reason why no-one noticed is because you are not supposed to actually use the class!

The whole StringJoiner API is designed to be a tool used as a Collector, the mutable reduction phase of the new Java SE 8 stream API. In this context StringJoiner itself is not visible:

 String joined = list.stream()
   .map(Object::toString)
   .collect(Collectors.joining(", "));

In the simple case, this is longer than Guava and less discoverable, plus I had to manually map to a string. However, in more advanced stream cases it is a great tool to have.

The other advantage of StringJoiner over Guava Joiner is that it handles prefixes and suffixes. This is actually really useful, the classic example being to output the '[' and ']' at the start and end of a list. Ideally, Guava would add prefix and suffix handling to their Joiner.

The good news is that some of the flaws in StringJoiner can be mitigated in a later JDK version. However, since StringJoiner is fundamentally stateful and mutable it will never be comparable to Guava's Joiner.

Commons-Lang

Amusingly, for many of the day-to-day tasks in string building, the class I developed in Commons-Lang over 12 years ago, StrBuilder is still the best option. It takes the concept of StringBuilder class and adds many additional methods. Relevant to this discussion is:

 return new StrBuilder()
   .append("something")
   .append(somethingElse)
   .appendWithSeparators(list, ", ")
   .toString();

Note how the joining occurs naturally within the middle of a fluent set of method calls. Neither Guava nor JDK joiners can be used in this way.

Summary

The Java SE 8 StringJoiner class is in my opinion nothing more than a behind-the-scenes tool. It should only be used indirectly from String.join() or Collectors.joining(). If you use it directly you are liable to be frustrated.

Personally, I plan to continue using the Guava joiner, unless I am performing a mutable reduction of a stream.

Tuesday, 1 July 2014

ThreeTen-Backport vs Joda-Time

So which project should you choose? ThreeTen-Backport or Joda-Time?

Date and Time libarary pre-Java 8

Over the years, I have produced two libraries for Java date and time.

The first is Joda-Time, a library has become the de facto standard choice for many users.

The second is JSR-310, included as java.time in Java SE 8. Today I released v1.0 of ThreeTen-Backport, which makes the same JSR-310 API available in Java SE 6 and 7, although it is under a different package name.

So which should you choose?

If you are on Java SE 8 then you should use java.time (JSR-310). It tackles many issues with Joda-Time and is better integrated with the rest of the Java core libraries. Where necessary, consider using ThreeTen-Extra for any additional functionality that isn't in the JDK.

If you are on Java SE 6 or 7, then in general you should use Joda-Time. It is the standard option that other teams will be using, and it will be easier to interoperate if you stick to using Joda-Time.

The main use case for ThreeTen-Backport is if your team is using Java SE 6 or 7 but you are planning on moving to Java SE 8 in the near future. In this case, using the backport can smooth your future migration. However it will still require a package rename for your API users.

Another use case for the backport is to assist with backporting other Java SE 8 code to Java SE 7. See the retrolambda project (which could use ThreeTen-Backport but doesn't).

I hope that helps you to decide!

Thursday, 26 June 2014

Iterable in Java SE 8

In the last few weeks I've finally had the chance to use lambdas and streams in Java SE 8 in anger. In doing so, I've found much to like, but some rough edges.

Iterable woes

The Iterable interface was added in Java SE 5 to enable the foreach loop to work. It was defined with a single method that returns an iterator:

 public interface Iterable<T> {
   Iterator<T> iterator();
 }

The simplicity of the interface was based around the primary use case, the foreach loop, which needed nothing more than a source of an iterator.

In Java SE 8, the interface changed, as it added two default methods:

 public interface Iterable<T> {
   Iterator<T> iterator();

   default void forEach(Consumer<? super T> action) {
     Objects.requireNonNull(action);
     for (T t : this) {
       action.accept(t);
     }
   }
   
   default Spliterator<T> spliterator() {
     return Spliterators.spliteratorUnknownSize(iterator(), 0);
   }
 }

Isn't this wonderful? Lots of new functionality for free! Er well, no.

Despite following the lambda-dev mailing lists, the problem I now have with Iterable isn't one I foresaw. Yet it took just a few days of active use of Java SE 8 to realise that Iterable is now a lot less useful than it was.

The problem as I see it, is that Iterable has been used for two different things in my code. The first, is as the parameter of a method, where it acted as a very abstract form of collection. The second is as a "common" interface on a domain object.

Receiving an Iterable parameter

The first case is where a method has been written to receive an Iterable parameter. In my code, I have defined many methods that take an Iterable, often alongside a varargs array:

   void addNames(Name... names);
   void addNames(Iterable<Name> names);

This pairing provides great flexibility to callers. If you have a set of actual name objects, you can use the varargs form. If you have an array of name objects, you can also use the varargs form. And if you have virtually any kind of collection, you can use the iterable form. (That abstract sense of just providing an iteration really broadens the kind of inputs allowed and blurs the line between collections and non-collections).

Code within the addName(Iterable) method will benefit to some degree from the new methods on Iterable. That is because the method receiving the Iterable knows nothing else about the input type other than it is an iterable.

Given this, the addition of the new forEach(Consumer) method is reasonable, as it provides additional looping options. But, in reality, there is little difference between these two:

   void addNames(Iterable<Name> names) {
     for (Name name : names) { ... }
   }
   void addNames(Iterable<Name> names) {
     names.forEach(() -> ...);
   }

Yes its a new method, but the actual new functionality is limited. Nevertheless, from this perspective it is a win.

By contrast, the addition of the new spliterator() method simply leads to frustration, as by itself the method has no value. What would have had value would have been a stream() method but there are good reasons why stream() was not added.

Instead, we have to write some rather ugly code to get the actually useful stream:

   void addNames(Iterable<Name> names) {
     StreamSupport.stream(names.spliterator(), false)
    .doMyWizzyStreamThing();
   }

As an aside, arrays manage this better:

   void addNames(Name[] names) {
     Stream.of(names)
    .doMyWizzyStreamThing();
   }

Now why oh why oh why is there no Stream.of(Iterable) method?

Actually, I know why, but that is a story for another day...

Streamable?

Perhaps you're thinking that the reason why Iterable isn't supported well is that there is a Streamable interface that you are supposed to use for this purpose instead?

Er, no. There was a Streamable interface, but it got removed during development. There is no JDK supplied abstraction for a type that provides streams.

Perhaps this is philosophical - you could just declare and call using a supplier:

   void addNames(Supplier<Stream<T>> streamSupplier) {
     streamSupplier.get()
    .doMyWizzyStreamThing();
   }
   Collection<T> coll = ...
   addNames(coll::stream));
   T[] array = ...
   addNames(Stream.of(array));

Definitely a different kind of use of the type system.

After much thought and playing with the feel of the options available, I decided to keep on declaring methods to take Iterable and varargs as that is friendliest to users of the API. Internally, the mapping has to be done from iterable to stream, which necessitates a convenience method in a helper for my sanity.

Iterable on domain objects

The second major way I use Iterable is where I had more difficulty.

I have found over the years that some domain objects are little more than a type-safe wrapper around a collection. Consider a FooSummary object that contains a list of FooItem. In many cases, I have made FooSummary implement Iterable<FooItem>:

 public class FooSummary implements Iterable<FooItem> {
   List<FooItem> items;
   @Override
   public Iterator<FooItem> iterator() {
     return items.iterator();
   }

Doing this allowed the object to be iterated over directly in a foreach loop.

 FooSummary summary = ...
 for (FooItem item : summary) { ... }

This approach is very useful for those domain objects that are primarily a collection of some other object, yet are distinct enough to have a class of their own.

However, this kind of approach relied on using Iterable as a kind of "common" interface, similar to Comparable or Serializable. These small "common" interfaces work on the basis that they provide a common language across a very broad spectrum of use cases. Comparable does nothing other than define the method needed to participate in comparisons. I used Iterable in exactly that sense - to accept the common language for things that can be iterated over, without any implication that they were actual collections.

In my opinion, usage of Iterable as a "common" interface has been all but destroyed by Java SE 8.

The FooSummary class is essentially a domain object. The methods it provides must make sense as a set in their own right. Adding the iterable() method was perfectly acceptable in exactly the same way as adding comparable() is. It is a well-known method with clearly defined properties and usage. However, in Java SE 8, the domain object has had two additional methods foisted upon it.

The issue is that from the perspective of a user of FooSummary, the two additional methods are net negative, rather than positive. While there is nothing wrong in theory with being able to call forEach(Consumer) or spliterator(), it needs to be a specific and separate decision to add them to the domain object.

In my cases, I definitely do not want those extra methods. As such, I have no choice but to stop treating Iterable as a "common" interface. It can now only be used by classes that really make sense to be treated as collections.

In practical terms, this change simply makes life for callers slightly more verbose:

 FooSummary summary = ...
 for (FooItem item : summary.getItems()) { ... }

Where the line between a collection and a non-collection could previously be blurred through Iterable, that option is not really present any more. As such, I feel an API design tool has been removed from my tool-box.

(Another example I played with was a Tuple interface abstracting over Pair and Triple. Again, while adding an iterator() method to Tuple is desirable, adding forEach(Consumer) and spliterator() was not. Here, none of the available options are very pleasing.)

Summary

The Iterable interface has had two default methods added in Java SE 8. These make reasonable sense when receiving an iterable. Unfortunately, the difficulty of obtaining an actual stream is rather annoying.

The more significant problem for me is that Iterable is no longer suitable for use as a "common" interface. The extra methods, and the threat of more in future releases mean that the use cases for Iterable have been reduced. It should now only be used for classes that really are collections, something that I find to be a significant reduction in usability.

Any thoughts on iterable, "common" interfaces and default methods? Has Iterable been ruined by the addition of default methods?

Friday, 21 March 2014

VALJOs - Value Java Objects

The term "value object" gets confusing in Java circles with various different interpretations. When I think about these I have my own very specific interpretation, which I'm christening VALJOs.

Introduction

The Wikipedia article on value objects gives this definition:

In computer science, a value object is a small object that represents a simple entity whose equality is not based on identity: i.e. two value objects are equal when they have the same value, not necessarily being the same object.

Classic examples of values are numbers, amounts, dates, money and currency. Martin Fowler also gives a good definition, and related wiki.

I agree with the Wikipedia definition, however there is still a lot of detail to be added. What does it take to write a value object in Java - is there a set of rules?

Recently, life has got more complicated with talk of adding value types to a future Java release:

"A future version of Java" will almost certainly have support for "value types", "user-defined primitives", "identity-less aggregates", "structs", or whatever you would like to call them.

Work on value types is focussed on extending the JVM, language and libraries to support values. Key benefits revolve and memory usage and performance, particularly in enabling the JVM to be much more efficient. Since this post isn't about value types, I'll direct curious readers to the JEP and various articles on John Rose's blog.

JVM supported value types may be the future, but how far can be go today?

VALue Java Objects - VALJOs

What I want to achieve is a set of criteria for well-written value objects in Java today. Although written for today, the criteria have one eye on what value types of the future may look like.

As the term "value object" is overloaded, I'm giving these the name VALJO. Obviously this is based on the popular POJO naming.

  • The class must be immutable, which implies the final keyword and thread-safety.
  • The class name must be simple and direct, focussed on the value.
  • Instances must be obtained using static factory methods. All constructors must be private.
  • The state that defines the value must be clearly specified. It will consist of one or more elements.
  • The elements of the state must be other values, which includes primitive types.
  • The equals() and hashCode() methods must check all the elements of the state and nothing else.
  • If the class implements Comparable, then it must check all the elements of the state and nothing else.
  • If the class implements Comparable, the compareTo() method must be consistent with equals.
  • The toString() method must return a formally-defined string fully exposing the state and nothing else.
  • The toString() for two equal values must be the same. For two non-equal values it must be different.
  • There must be a static factory method capable of creating an instance from the formal string representation. It is strongly recommended to use the method name parse(String) or of(String).
  • The clone() method should not be public.
  • Provide methods, typically simple getters, to get the elements of the state.
  • Consider providing with() methods to obtain a copy of the original with different state.
  • Other methods should be pure functions, depending only on the arguments are the state or derived state.
  • Consider providing a link to the JDK 8 value-based classes documentation.

It is important to note that the state of a VALJO is not necessarily identical to the instance variables of the Class. This happens in two different ways:

Firstly, the state could be stored using different instance variables to those in the definition. For example, the state definition and public variables may expose an int but the instance variable might be a byte, because the validated number only needs 8 bit storage, not 32 bit. Similarly, the state definition might expose an enum, but store the instance variable as an int. In this case, the difference is merely an implementation detail - the VALJO rules apply to the logical state.

Secondly, there may be more instance variables than those in the state definition. For example, a Currency is represented in ISO 4217 with two codes, a three letter string and a three digit number. Since it is possible to derive the numeric code from the string code (or vice versa), the state should consist only of the string. However, rather than looking up the number each time it is needed, it is perfectly acceptable to store the numeric code in an instance variable that is not part of the state. What would not be acceptable for Currency is including numeric code in the toString (as the numeric code is not part of the state, only the string code is).

In effect, the previous paragraph simply permits caching of related data within a VALJO so long as it can be derived from the state. Extending this logic further, it is acceptable to cache the toString() or hashCode().

On parsing, I recommend using of(String) if the state of the VALJO consists entirely of a single element which is naturally a string, such as Currency. I recommend parse(String) where the state consists of more than one element, or where the single element is not a string, such as Money.

It is acceptable for the parse method to accept alternate string representations. Thus, a Currency parsing method might accept both the standard three letter code and the string version of the numeric code.

On immutability, it is allowed to have an abstract VALJO class providing the constructor is private scoped and all the implementations are immutable private static final nested classes. The need for this is rare however.

Using VALJOs

It is best practice to use VALJOs in a specific way tailored for potential future conversion to JVM value types:

  • Do not use ==, only compare using equals().
  • Do not rely on System.identityHashCode().
  • Do not synchronize on instances.
  • Do not use the wait() and notify() methods.

The first of these rules is stronger than absolutely necessary given we are talking about normal Java objects in a current JVM. Basically, so long as your VALJO implementation promises to provide a singleton cached instance for each distinct state then using == is acceptable, because == and equals() will always return the same result. The real issue to be stopped here is using == to distinguish between two objects that are equal via equals().

Separately from these restrictions, the intention of VALJOs is that users should refer to the concrete type, not an interface. For example, users refer to the JSR-310 LocalDate, not the Temporal interface that it implements. The reason for this is that VALJOs are, of necessity, simple basic types, and hiding them behind an interface is not helpful.

Tooling

Actually writing VALJO classes is boring and tedious. The rules and implications need to be carefully thought through on occasion. Some projects provide tools which may assist to some degree.

Joda-Convert provides annotations for @ToString and @FromString which can be used to identify the round-trippable string format.

Joda-Beans provides a source code generation system that includes immutable beans. They are not instant VALJOs, but can be customised.

Auto-Value provides a bytecode generation system that converts abstract classes into immutable value objects. They are not instant VALJOs, but can be customised.

Project Lombok provides a customised compiler and IDE plugin which includes value object generation. They are not instant VALJOs, but can be customised.

In general however, if you want to write a really good VALJO, you need to do it by hand. And if that VALJO will be used by many others, such as in JSR-310, it is worth doing correctly.

Summary

The aim of this blog was to define VALJOs, a specific set of rules for value objects in Java. The rules are intended to be on the strict side - all VALJOs are value objects, but not all value objects are VALJOs.

Did I miss anything? Get something wrong? Let me know!

Monday, 10 February 2014

New project: ThreeTen-Extra for JDK 8

JDK 8 includes JSR-310, a new date and time library. But what about functionality that didn't make it into the JDK?

ThreeTen-Extra

The main ThreeTen project is now essentially complete. The project was developed and delivered via JSR-310 into OpenJDK and JDK 8.

However, as part of that process, certain pieces of functionality were rejected and/or excluded from the JDK. This was sometimes due to scope management and sometimes due to whether something was appropriate for the JDK.

The TheeTen-Extra project provides a home for that functionality.

ThreeTen-Extra is a project on GitHub that provides additional functionality. It is delivered via Maven Central and is dependent on JDK 8.

The following functionality is currently provided:

  • DayOfMonth temporal value type
  • DayOfYear temporal value type
  • AmPm temporal enum
  • Quarter temporal enum
  • YearQuarter temporal value type
  • Days, Months and Years amount value types
  • Next/previous day adjusters that skip the weekend (Saturday/Sunday)
  • Coptic calendar system
  • Support for the TAI and UTC time-scales

The project has spare space to add more functionality, so long as it is generally applicable. For example, additional calendar systems would be a good fit. Feel free to raise a pull request with your ideas.

Summary

The ThreeTen-Extra project is now available, providing an additional jar file of date/time code that builds on java.time (JSR-310) in JDK 8.

Comments welcome!

Saturday, 8 February 2014

Turning off doclint in JDK 8 Javadoc

JDK 8 includes many updates, but one is I suspect going to cause quite a few complaints - doclint for Javadoc.

Javadoc doclint

Documentation is not something most developers like writing. With Java, we were fortunate to have the Javadoc toolset built in and easy to access from day one. As such, writing Javadoc is a standard part of most developers life.

The Javadoc comments in source code use a mixture of tags, starting with @, and HTML to allow the developer to express their comment and format it nicely.

Up to JDK 7, the Javadoc tool was pretty lenient. As a developer, you could write anything that vaguely resembled HTML and the tool would rarely complain beyond warnings. Thus you could have @link references that were inaccurate (such as due to refactoring) and the tool would simply provide a warning.

With JDK 8, a new part has been added to Javadoc called doclint and it changes that friendly behaviour. In particular, the tool aim to get conforming W3C HTML 4.01 HTML (despite the fact that humans are very bad at matching conformance wrt HTML).

With JDK 8, you are unable to get Javadoc unless your tool meets the standards of doclint. Some of its rules are:

  • no self-closed HTML tags, such as <br /> or <a id="x" />
  • no unclosed HTML tags, such as <ul> without matching </ul>
  • no invalid HTML end tags, such as </br>
  • no invalid HTML attributes, based on doclint's interpretation of W3C HTML 4.01
  • no duplicate HTML id attribute
  • no empty HTML href attribute
  • no incorrectly nested headers, such as class documentation must have <h3>, not <h4>
  • no invalid HTML tags, such as List<String> (where you forgot to escape using &lt;)
  • no broken @link references
  • no broken @param references, they must match the actual parameter name
  • no broken @throws references, the first word must be a class name

Note that these are errors, not warnings. Break the rules and you get no Javadoc output.

In my opinion, this is way too strict to be the default. I have no problem with such a tool existing in Javadoc, but given the history of Javadoc, errors like this should be opt-in, not opt-out. Its far better to get slightly broken Javadoc than no Javadoc.

I also haven't been able to find a list of the rules, which makes life hard. At least we can see the source code to reverse engineer them.

Turning off doclint

The magic incantation you need is -Xdoclint:none. This goes on the command line invoking Javadoc.

Maven

If you are running from maven with maven-javadoc-plugin v3.0.0 or later, you need to use the doclint setting, as per the manual. Either add it as a global property:

  <properties>
    <doclint>none</doclint>
  </properties>

or add it to the maven-javadoc-plugin:

  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <configuration>
        <doclint>none</doclint>
      </configuration>
    </plugin>
  </plugins>

When using earlier versions of maven-javadoc-plugin (before v3.0.0), you need to use a different setting:

  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <configuration>
        <additionalparam>-Xdoclint:none</additionalparam>
      </configuration>
    </plugin>
  </plugins>

Ant

Ant also uses additionalparam to pass in -Xdoclint:none, see the manual.

Gradle

Gradle does not expose additionalparam but Tim Yates and Cedric Champeau advise of this solution:

  if (JavaVersion.current().isJava8Compatible()) {
    allprojects {
      tasks.withType(Javadoc) {
        options.addStringOption('Xdoclint:none', '-quiet')
      }
    }
  }

See also the Gradle manual.

Summary

I don't mind doclint existing, but there is no way that it should be turned on to error mode by default. Getting some Javadoc produced without hassle is far more important than pandering to the doclint style checks. In addition, it is very heavy handed with what it defines to be errors, rejecting plenty of HTML that works perfectly fine in a browser.

I've asked the maven team to disable doclint by default, and I'd suggest the same to Ant and Gradle. Unfortunately, the Oracle team seem convinced that they've made the right choice with errors by default and their use of strict HTML.

Comments welcome, but please note that non-specific "it didn't work for me" comments should be at Stack Overflow or Java Ranch, not here!

Wednesday, 5 February 2014

Exiting the JVM

You learn something new about the JDK every day. Apparantly, System.exit(0) does not always stop the JVM!

System.exit()

This is a great Java puzzler from Peter Lawrey:

  public static void main(String... args) {
    Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
      @Override
      public void run() {
        System.out.println("Locking");
        synchronized (lock) {
          System.out.println("Locked");
        }
      }
    }));
    synchronized (lock) {
      System.out.println("Exiting");
      System.exit(0);
    }
  }

What does the code do?

  1. Our code registers the shutdown hook
  2. Our code acquires the lock
  3. Our code prints "Exiting"
  4. Our code calls System.exit(0)
  5. System.exit(0) calls our shutdown hook
  6. Our shutdown hook prints "Locking"
  7. Our shutdown hook tries to acquire the lock
  8. Deadlock - Code never exits

Clearly, calling System.exit(0) and not exiting is a Bad Thing, although hopefully badly written shutdown hooks are rare. And there are also deprecated runFinalizersOnExit, another potential source of problems.

What are the alternatives?

The System.exit(0) call simply calls Runtime.getRuntime().exit(0), so that makes no difference.

The main alternative is Runtime.getRuntime().halt(0), described as "Forcibly terminates the currently running Java virtual machine". This does not call shutdown hooks or exit finalizers, it just exits.

But what if you want to try and exit nicely first, and only halt if that fails?

Well that seems like its a missing JDK method. However, a delay timer can be used to get a reasonable approximation:

  /**
   * Exits the JVM, trying to do it nicely, otherwise doing it nastily.
   * 
   * @param status  the exit status, zero for OK, non-zero for error
   * @param maxDelay  the maximum delay in milliseconds
   */
  public static void exit(final int status, long maxDelayMillis) {
    try {
      // setup a timer, so if nice exit fails, the nasty exit happens
      Timer timer = new Timer();
      timer.schedule(new TimerTask() {
        @Override
        public void run() {
          Runtime.getRuntime().halt(status);
        }
      }, maxDelayMillis);
      // try to exit nicely
      System.exit(status);
      
    } catch (Throwable ex) {
      // exit nastily if we have a problem
      Runtime.getRuntime().halt(status);
    } finally {
      // should never get here
      Runtime.getRuntime().halt(status);
    }
  }

All things being equal, that really should exit the JVM in the best way it can. In the puzzler, if you replace System.exit(0) with a call to this method, the deadlock will be broken and the JVM will exit.

Summary

System.exit(0) does not always stop the JVM. Runtime.getRuntime().halt(0) should always stop the JVM.

Final thought - if you're writing code to exit the JVM, have you considered whether you even need that code? After all, exiting the JVM doesn't play well in embedded or cloud environments...