Tuesday 4 December 2012

Annotating JDK default data

A common issue in Java development is use of the default Locale, default TimeZone and default CharSet.

JDK defaults

The JDK has a number of defaults that apply to a running JVM. The most well known are the default Locale, default TimeZone and default CharSet.

These defaults are very useful for getting systems up and running quickly. When a developer new to Java writes some code, they should get the localized answer they expect. However, in any larger environment, especially code intended to run on a server, the defaults cause a problem.

The classic example of this is the String.toLowerCase() method. Many developers use str1.toLowerCase().equals(str2.toLowerCase()) to check if two strings are equal ignoring case. But this code is not valid is all Locales! It turns out that in Turkey, there are two different upper case versions of the letter I and two different lower case versions. This causes lots of problems.

The problem applies more generally to server side applications. Any server-side code that relies on a JDK method using one of the defaults will alter its behaviour based on where and how the server is setup. This is clearly undesirable.

As is often the case, there are two competing forces - ease of use for newcomers, and bugs in larger applications. There is also the issue of backwards compatibility, meaning that the JDK methods depending on the defaults cannot be removed.

One approach has been taken by the Lucene project, scanning code using ASM. The link/tool also provides a useful list of JDK methods affected by this problem:








Thinking about the two competing forces, it seems to me that the best option would be a new annotation.

Imagine the JDK adds an annotation @DependsOnJdkDefaults. This would be used to annotate all methods in the JDK that directly or indirectly rely on a default, including all of those above. Developers outside the JDK could of course also use the annotation to mark their methods relying on defaults.

  public String toLowerCase() {
    return toLowerCase(Locale.getDefault());

Tooling like Checkstyle, IDEs and perhaps even the JDK compiler could then use the annotation to warn developers that they should not use the method. It would be possible to even envisage an IDE setting to hide the methods from auto-complete.

This would seem to balance the competing forces by providing information that could be widely used and very valuable.


I think marking methods that rely on JDK defult Locale/TimeZone/CharSet with an annotation would be a powerful tool. What do you think?

Monday 26 November 2012

Six months in

I've been busier than usual over the last few months. Thats because I now have a new focus in my life beyond ThreeTen/JSR-310 and OpenGamma.

He is our first son, now a little over six months old, and definitely into his food. A wild ride, but definitely worth it!

Friday 23 November 2012

Javadoc coding standards

Javadoc is a key part of coding in Java, yet there is relatively little discussion of what makes good Javadoc style - a coding standard.

Javadoc coding standard

These are the standards I tend to use when writing Javadoc. Since personal tastes differ, I've tried to explain some of the rationale for some of my choices. Bear in mind that this is more about the formatting of Javadoc, than the content of Javadoc.

There is an Oracle guide which is longer and more detailed than this one. The two agree in most places, however these guidelines are more explicit about HTML tags, two spaces in @param and null-specification, and differ in line lengths and sentence layout.

Each of the guidelines below consists of a short description of the rule and an explanation, which may include an example:

Write Javadoc to be read as source code

When we think of "Javadoc" we often think of the online Javadoc HTML pages. However, this is not the only way that Javadoc is consumed. A key way of absorbing Javadoc is reading source code, either of code you or your team wrote, or third party libraries. Making Javadoc readable as source code is critical, and these standards are guided by this principal.

Public and protected

All public and protected methods should be fully defined with Javadoc. Package and private methods do not have to be, but may benefit from it.

If a method is overridden in a subclass, Javadoc should only be present if it says something distinct to the original definition of the method. The @Override annotation should be used to indicate to source code readers that the Javadoc is inherited in addition to its normal meaning.

Use the standard style for the Javadoc comment

Javadoc only requires a '/**' at the start and a '*/' at the end. In addition to this, use a single star on each additional line:

   * Standard comment.
  public ...

  /** Compressed comment. */
  public ...

Do not use '**/' at the end of the Javadoc.

Use simple HTML tags, not valid XHTML

Javadoc uses HTML tags to identify paragraphs and other elements. Many developers get drawn to the thought that XHTML is necessarily best, ensuring that all tags open and close correctly. This is a mistake. XHTML adds many extra tags that make the Javadoc harder to read as source code. The Javadoc parser will interpret the incomplete HTML tag soup just fine.

Use a single <p> tag between paragraphs

Longer Javadoc always needs multiple paragraphs. This naturally results in a question of how and where to add the paragraph tags. Place a single <p> tag on the blank line between paragraphs:

   * First paragraph.
   * <p>
   * Second paragraph.
   * May be on multiple lines.
   * <p>
   * Third paragraph.
  public ...

Use a single <li> tag for items in a list

Lists are useful in Javadoc when explaining a set of options, choices or issues. These standards place a single <li> tag at the start of the line and no closing tag. In order to get correct paragraph formatting, extra paragraph tags are required:

   * First paragraph.
   * <p><ul>
   * <li>the first item
   * <li>the second item
   * <li>the third item
   * </ul><p>
   * Second paragraph.
  public ...

Define a punchy first sentence

The first sentence, typically ended by a dot, is used in the next-level higher Javadoc. As such, it has the responsibility of summing up the method or class to readers scanning the class or package. To achieve this, the first sentence should be clear and punchy, and generally short.

While not required, it is recommended that the first sentence is a paragraph to itself. This helps retain the punchiness for readers of the source code.

It is recommended to use the third person form at the start. For example, "Gets the foo", "Sets the "bar" or "Consumes the baz". Avoid the second person form, such as "Get the foo".

Use "this" to refer to an instance of the class

When referring to an instance of the class being documented, use "this" to reference it. For example, "Returns a copy of this foo with the bar value updated".

Aim for short single line sentences

Wherever possible, make Javadoc sentences fit on a single line. Allow flexibility in the line length, favouring between 80 and 120 characters to make this work.

In most cases, each new sentence should start on a new line. This aids readability as source code, and simplifies refactoring re-writes of complex Javadoc.

   * This is the first paragraph, on one line.
   * <p>
   * This is the first sentence of the second paragraph, on one line.
   * This is the second sentence of the second paragraph, on one line.
   * This is the third sentence of the second paragraph which is a bit longer so has been
   * split onto a second line, as that makes sense.
   * This is the fourth sentence, which starts a new line, even though there is space above.
  public ...

Use @link and @code wisely

Many Javadoc descriptions reference other methods and classes. This can be achieved most effectively using the @link and @code features.

The @link feature creates a visible hyperlink in generated Javadoc to the target. The @link target is one of the following forms:

   * First paragraph.
   * <p>
   * Link to a class named 'Foo': {@link Foo}.
   * Link to a method 'bar' on a class named 'Foo': {@link Foo#bar}.
   * Link to a method 'baz' on this class: {@link #baz}.
   * Link specifying text of the hyperlink after a space: {@link Foo the Foo class}.
   * Link to a method handling method overload {@link Foo#bar(String,int)}.
  public ...

The @code feature provides a section of fixed-width font, ideal for references to methods and class names. While @link references are checked by the Javadoc compiler, @code references are not.

Only use @link on the first reference to a specific class or method. Use @code for subsequent references. This avoids excessive hyperlinks cluttering up the Javadoc.

Never use @link in the first sentence

The first sentence is used in the higher level Javadoc. Adding a hyperlink in that first sentence makes the higher level documentation more confusing. Always use @code in the first sentence if necessary. @link can be used from the second sentence/paragraph onwards.

Do not use @code for null, true or false

The concepts of null, true and false are very common in Javadoc. Adding @code for every occurrence is a burden to both the reader and writer of the Javadoc and adds no real value.

Use @param, @return and @throws

Almost all methods take in a parameter, return a result or both. The @param and @return features specify those inputs and outputs. The @throws feature specifies the thrown exceptions.

The @param entries should be specified in the same order as the parameters. The @return should be after the @param entries, followed by @throws.

Use @param for generics

If a class or method has generic type parameters, then these should be documented. The correct approach is an @param tag with the parameter name of <T> where T is the type parameter name.

Use one blank line before @param

There should be one blank line between the Javadoc text and the first @param or @return. This aids readability in source code.

Treat @param and @return as a phrase

The @param and @return should be treated as phrases rather than complete sentences. They should start with a lower case letter, typically using the word "the". They should not end with a dot. This aids readability in source code and when generated.

Treat @throws as an if clause

The @throws feature should normally be followed by "if" and the rest of the phrase describing the condition. For example, "@throws IllegalArgumentException if the file could not be found". This aids readability in source code and when generated.

@param should two spaces after the parameter name

When reading the Javadoc as source code, a single space after the parameter name is a lot harder to read than two spaces. Avoid aligning the parameters in a column, as it is prone to difficulty in refactoring where parameter names are changed or added.

   * Javadoc text.
   * @param foo  the foo parameter
   * @param bar  the bar parameter
   * @return the baz content
  public String process(String foo, String bar) {...}

Define null-handling for all parameters and return types

Whether a method accepts null on input, or can return null is critical information for building large systems. All non-primitive methods should define their null-tolerance in the @param or @return. Some standard forms expressing this should be used wherever possible:

  • "not null" means that null is not accepted and passing in null will probably throw an exception , typically NullPointerException
  • "may be null" means that null may be passed in. In general the behaviour of the passed in null should be defined
  • "null treated as xxx" means that a null input is equivalent to the specified value
  • "null returns xxx" means that a null input always returns the specified value

When defined in this way, there should not be an @throws for NullPointerException.

   * Javadoc text.
   * @param foo  the foo parameter, not null
   * @param bar  the bar parameter, null returns null
   * @return the baz content, null if not processed
  public String process(String foo, String bar) {...}

While it may be tempting to define null-handling behaviour in a single central location, such as the class or package Javadoc, this is far less useful for developers. The Javadoc at the method level appears in IDEs during normal coding, whereas class or package level Javadoc requires a separate "search and learn" step.

Other simple constraints may be added as well if applicable, for example "not empty, not null". Primitive values might specify their bounds, for example "from 1 to 5", or "not negative".

Specifications require implementation notes

If you are writing a more formal specification that will be implemented by third parties, consider adding an "implementation notes" section. This is an additional section, typically at the class level, that specifies any behaviours required by implementations that are not otherwise specified, or not of general interest. See this example.

Avoid @author

The @author feature can be used to record the authors of the class. This should be avoided, as it is usually out of date, and it can promote code ownership by an individual. The source control system is in a much better position to record authors.


The ThreeTen repository has some more complete examples


Hopefully these suggestions will help you to write better Javadoc. Feel free to disagree or point to some alternative standards.

Tuesday 6 November 2012

Pitfalls of "consistent with equals"

The equals() method and the compareTo() method on Comparable are two of the most basic in Java. Yet their definitions have an interesting wrinkle around the concept of "consistent with equals".

The equals() method

The equals() method is both well defined and unclear in Java. It is well-defined in that it specifies exactly how the check for equal must work. They must be reflexive, symmetric, transitive, consistent and handle null.

However the equals() is also unclear. The Javadoc says that the method "Indicates whether some other object is "equal to" this one". Note the "equal to" part is in quotes. The key here is that it does not define how the equality should be determined. There are three standard choices:

  • the identity of the object (==), which is the default inherited from Object
  • the entire observable state of the object, such that if two objects are equal then one is substitutable for the other in other parts of the application
  • some subset of the information that makes logical sense to check equals on, such as an ID

The compareTo() method

The Comparable interface defines the concept of comparability. The Javadoc specifies that it "imposes a total ordering on the objects of each class that implements it".

Classes that implement Comparable have a natural ordering and can be sorted easily and used in collections like TreeSet and TreeMap without a separate Comparator.

The interface is well-defined, in that it specifies that the implementation must ensure a form of symmetry and transitive behaviour, just like equals(). It does not specify exactly how the comparison should be made, but defines the concept of consistency with equals.

Consistent/Inconsistent with equals

The Comparable interface says the following:

The natural ordering for a class C is said to be consistent with equals if and only if e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2) for every e1 and e2 of class C.

Basically, this requires that the concept of equals defined by compareTo is the same as that defined by equals() (apart from nulls). This is at first glance a simple requirement but, as I'm going to discuss, it has complications.

This kind of definition is especially useful when considering operator overloading. If we consider a Java-like language where == is not object identity, but comparison using a method, and the greater-than/less-than operators are also available the question is what method to call. Greater-than/less-than would in this Java-like language naturally be based on compareTo() and == would call equals().

  // our new Java-like language
  if (a < b) return "Less";      // translation ignoring nulls: if (a.compareTo(b) < 0)
  if (a > b) return "Greater";   // translation ignoring nulls: if (a.compareTo(b) > 0)
  if (a == b) return "Equal";    // translation ignoring nulls: if (a.equals(b))
  throw new Exception("Impossible assuming no nulls?");

But if compareTo() is "inconsistent with equals" then this code can throw the exception, because a.compareTo(b) can return zero when a.equals(b) is false.

Other problems can occur in collections like TreeMap:

  // Foo class is "inconsistent with equals"
  assert foo1.equals(foo2) == false;
  assert foo1.compareTo(foo2) == 0;
  TreeMap<Foo, String> map = ...
  map.put(foo1, "a");
  map.put(foo2, "b");

Here the two objects are not equal using equals() but are equal using compareTo(). In this case, the map will have size one, not size two!

Because of the problems with being "inconsistent with equals", the Javadoc says It is strongly recommended (though not required) that natural orderings be consistent with equals.

Many JDK classes implementing Comparable do so in a manner "consistent with equals". These include Byte, Short, Integer, Long, Character and String.

Some others are more interesting:

  • BigDecimal - well-known to be "inconsistent with equals", as 4.00 is not equal to 4.0, but do compare the same
  • Double/Float - provides an explicit sort order and equals check for positive and negative zero and NaN to ensure compareTo is "consistent with equals"
  • CharSet - based on the ID/name. equals is case-senstitive, while compareTo is not, although as the names are normalized it tends not to matter - dubiously "consistent"
  • *Buffer (nio) - based on remaining content of the buffer. equals and compareTo appear to be "consistent" under my testing
  • Rdn (ldap) - based on the normalized form of the state, thus is "consistent with equals"
  • ObjectStreamField (serialization) - based on the name, but with primitives sorted first. equals is not overriden, so is "inconsistent with equals"
  • ObjectName (jmx) - equals based on the canonical name, compareTo defined as "not completely specified but is intended to be such that a sorted list of ObjectNames will appear in an order that is convenient for a person to read"! "inconsistent with equals"
  • FileTime (io) - implements equals using compareTo, definitely "consistent with equals"
  • File (io) - OS-specific, implements equals using compareTo, definitely "consistent with equals"
  • URI - source code appears to be "consistent with equals"
  • UUID - source code for equals includes variant that compareTo doesn't, but it appears to be unessesary code with no effect such that it is "consistent with equals"
  • Date (java.util) - source code difficult to follow because equals changes the state of the object (normalizing the deprecated setters), whereas compareTo does not change the state. As both ultimately so the same calculation it is "consistent with equals"
  • Date/Time (sql) - subclass java.util.Date and add nothing new, so are also "consistent with equals"
  • Timestamp (sql) - horrible compareTo and equals implementations that break the contracts, but is "consistent with equals" when comparing two Timestamp objects
  • Calendar/GregorianCalendar - equals compares the state, while compareTo compares along the time-line. Definitely "inconsistent with equals"

Note: In almost all cases, I had to examine the source code or write test code to determine whether a class was or was not "consistent with equals". There is a good Adopt-a-JDK task here to clarify the Javadoc and check UUID equals.


Having seen too many issues around BigDecimal, the plan has been for JSR-310 classes to be "consistent wth equals". However, a recent thread shows how controversial this can get.

Basically, it seems really simple to define equals/compareTo for some classes. LocalDate is just a date in a single calendar system, so has an obvious ordering and equals. LocalTime is just a time, so has an obvious ordering and equals. Instant is just an instantaeous point on the time-line, so has an obvious ordering and equals.

However other cases are not so obvious. Consider OffsetDateTime:

  dt1 = OffsetDateTime.parse("2012-11-05T06:00+01:00");
  dt2 = OffsetDateTime.parse("2012-11-05T07:00+02:00");

These two date-time objects represent the same instantaneous point on the time-line. But they have different local times and different offsets from UTC/Greenwich.

So, one question to readers.... which of these options do you prefer...

  1. dt1 not equal to dt2, compareTo() takes into account the local time and offset, "consistent with equals" (with a separate Comparator to sort by instant)
  2. dt1 not equal to dt2, compareTo() based on the instantaneous point on the time-line, "inconsistent with equals"
  3. dt1 equal to dt2, compareTo() based on the instantaneous point on the time-line, "consistent with equals"
  4. dt1 equal to dt2 and do not implement Comparable
  5. dt1 not equal to dt2 and do not implement Comparable

Personally I struggle with the idea that dt1.equals(dt2) would be true, but I'm open to opinions.

BTW, the question could also be posed with respect to BigDecimal. If you could to make that class "consistent with equals" would you change the equals() method or the compareTo() method?

Monday 15 October 2012

ElSql - Library/DSL to manage SQL

For many years I've thought about writing a simple library for handling SQL and/or relational databases. At OpenGamma I got the chance, and today the ElSql library is launched (pronounced else-q-el).


ElSql is not an ORM or anything similar. Instead, it is a simple tool for managing SQL external to the application. Plus a touch of DSL goodness.

The development model that ElSql is targetted at is Spring-based, using SqlParameterSource and JdbcTemplate. You need to be comfortable writing SQL, as well as writing the code to convert data objects to and from the form needed for the database.

So what does ElSql do?

Primarily, ElSql defines a DSL file format and two public classes - a manager and a configuration class. The aim of the DSL elements in the file format is to provide some basic tags to handle common difficulties encountered when building up SQL dynamically across a variety of databases. This external file-based approach has the major benefit of allowing DBAs to discuss the SQL more easily with the development team.

Here is an example elsql file:

 -- an example comment
     SELECT @INCLUDE(CommonFields)
     FROM blogs
     WHERE id = :id
         date > :date
         active = :active
     ORDER BY title, author
   title, author, content

The aim is that at first glance it should be relatively clear what the file represents and how it works. And yes, that is significant whitespace indentation being used for defining the blocks.

The colon prefixes are variables, as defined by the Spring NamedParameterJdbcOperations.

The tags, starting with @, cover three key areas:

  • SQL LIKE vs = for wildcards
  • dynamic construction of WHERE/AND clauses where only some things are being searched for
  • paging of results

The second and third of these are shown in the example above. The @PAGING will add whatever code is necessary to perform paging on the configured database. The @AND block will only remain if the variable, such as :date exists.

The named block, @NAME, is how the application refers to the SQL:

  String sql = bundle.getSql("SelectBlogs", args);

where args is the SqlParameterSource that has to be created and passed to NamedParameterJdbcOperations anyway. (Passing in the arguments allows tags like @AND to work, by querying to see if the variable is present or not)

These three problems - wildcards, WHERE/AND and paging - have always been the three biggest issues faced when building SQL dynamically, and the tags neatly solve the problems. And of course there is now a great file we can use to talk to DBAs.

More details can be found in the user guide and configuration guide. Or look at a complete, yet simple, example.


ElSql is a library I've wanted to write for a long time and now it finally exists. Its not for everyone - you need to be able to write SQL for starters! The DSL elements are deliberately simple and limited, providing just enough to tackle the common cases hit when dynamically building SQL.

If it fits your needs, take it for a test run. See the project page on GitHub for more details, documentation and to fork it. Or download from maven com.opengamma/elsql. Feedback welcome!

Monday 3 September 2012

Still no Transparancy

Update 2012-09-06: The Expert Group mailing lists for Project Lambda are now open!

Back in October last year, I wrote about the lack of access to the expert group mailing list of Project lambda, the effort adding closures to JDK 8. Has anything changed? Well of course not.

Still no Transparancy

The blog last year was in response to the submission (in November 2010) of the original JSR which made certain promises (as required by the JCP). Chief of these was:

A publicly readable mailing list for expert group communication will be the primary working model.

After years of asking, there still is no publicly readable mailing list for the expert group.

(Everything in the previous blog still stands, so read that for more info.)

We are of course forunate enough to have the more general fully open lambda mailing list at OpenJDK. But it isn't what the JCP requires, Oracle promised or the community deserves.

While I'm sure individuals have tried their best, it seems clear that the system is broken. If a mailing list cannot be opened up in nearly 2 years then frankly what is the point?

And that is why I continue to see the JCP as an irrelevent zombie in its current form. It needs to be split in two such that Oracle's total ownership of Java SE is clear for all to see.

After all, I can live with Oracle's control. Its the broken promises I object to.

Thursday 23 February 2012

Time-zone database - Astrolabe withdraws

Just a quick blog to say that the lawsuit against the time-zone database appears to be at an end.

Pursuant to Federal Rule of Civil Procedure 41(a)(1), please take
notice that Plaintiff ASTROLABE, INC., pursuant to Federal Rule of
Civil Procedure 41(a)(1), hereby voluntarily dismisses all claims in
this action with prejudice, and without costs, as to Defendants ARTHUR

In connection with this NOTICE, the Plaintiff states that the
Defendants have neither answered Plaintiff's Complaint, nor filed a
motion for summary judgment. Accordingly, this matter may be
dismissed with prejudice and without an Order of the Court.

The EFF had this to say:

The Electronic Frontier Foundation (EFF) is pleased to announce that a copyright lawsuit threatening an important database of time zone information has been dismissed. The astrology software company that filed the lawsuit, Astrolabe, has also apologized and agreed to a 'covenant not to sue' going forward, which will help protect the database from future baseless legal actions and disruptions.

In a statement, Astrolabe said, "Astrolabe's lawsuit against Mr. Olson and Mr. Eggert was based on a flawed understanding of the law. We now recognize that historical facts are no one's property and, accordingly, are withdrawing our Complaint. We deeply regret the disruption that our lawsuit caused for the volunteers who maintain the TZ database, and for Internet users."

"We are grateful that EFF and its co-counsel at Fish & Richardson were able to step in and assist us, so that we could help ensure the TZ database would continue to be available," said Eggert and Olson.

So in the end, Astrolabe understood that facts cannot be copyrighted and withdrew the case. I am also glad to see their statement of regret.

Finally, its good to see the EFF's positive involvement in the case.

Tuesday 7 February 2012

JDK helper Math methods

Update 2013-08-16: The three methods discussed below - incrementExact, decrementExact and negateExact - have all made it into JDK 8.

How far should simple helper methods go in the JDK? Specifically for Maths.

JDK helper Math methods

A discussion is occurring on the OpenJDK core-libs list to provide some maths helper methods. But there is a disagreement as to which methods should be provided.

The following methods appear to be going in (to java.math.Math, as static methods):

  • addExact(int, int)
  • subtractExact(int, int)
  • multiplyExact(int, int)
  • toIntExact(long)

All the methods are intended to throw an exception on overflow, so you would use addExact instead of a + b if you care that the result fits into an int.

Furthermore, some of these methods are designed such that the JVM may be able to provide better low-level implementations dependent on the processor. The same methods with long as the parameter type are also included.

The discussion has arisen around some additional methods:

  • negateExact(int)
  • incrementExact(int)
  • decrementExact(int)

These are currently going to be rejected on the grounds that "they don't pull their weight in the API". So, this blog post is to canvas a wider opinion.

The proposed alternatives are either that the caller should just write the code themselves, by testing against Integer.MIN_VALUE, or to indirectly use one of the other methods. (If you're wondering how negating an integer could go wrong, read this.)

So, here are the three discussed options for negation:

  int x = ...
  // option A - dedicated method
  int a = Math.negateExact(x);
  // option B - code at the call-site
  if (x == Integer.MIN_VALUE) {
    throw new ArithmeticException();
  int b = -x;
  // option C - indirectly use another API
  int a = Math.subtractExact(0, x);

And here are the options for increment:

  int x = ...
  // option A - dedicated method
  int a = Math.incrementExact(x);
  // option B - code at the call-site
  if (x == Integer.MAX_VALUE) {
    throw new ArithmeticException();
  int b = x++;
  // option C - indirectly use another API
  int a = Math.addExact(x, 1);

For me, I find option A in both cases to be a lot clearer when reading the code, as it says three things - that I have understood the corner case, done something about it, and documented that thinking via the inserted method. Sure there is more code in the library (the JDK) to enable this, but isn't that really the point of libraries? To provide the common code so we don't have to? The alternatives are far less appealing to me.

The counter argument is that the more methods that the JDK provides, the harder it is for users to find things. And the JDK gets larger (bigger jar file) as a result.

So, dear reader, its over to you :-) How far is too far when adding methods to the JDK? Should negate, increment and decrement be provided in the JDK to help protect us against errors, or can/should we use the alternatives above?