Friday, 25 June 2010

Problems with Eclipse, Eclipse-CS and JDK 1.5

I've just spent 2 days reinstalling my machine, with the worst part being the following problem. Since I didn't find any good Google hits, I thought I'd recount the problem and fix for others to follow.

My main goal was the upgrade of my OpenGamma work Windows 7 PC from a hard disk to an SSD. As part of this, I was joining a Windows Domain. As such, I ended up reinstalling Windows 7 from scratch and recreating my entire work environment. The PC in question is 64 bit, which was key to the problems.

After the fresh install of Windows 7, I reinstalled JDK 1.5, then JDK 1.6. Only then did I install Eclipse (3.5). When I tried starting Eclipse, it failed immediately, displaying its command line arguments.

To fix this strange error (which I'd not seen before), I changed the -Dosgi.requiredJavaVersion=1.5 line in eclipse.ini to 1.6. This change got Eclipse up and running.

I installed all the Eclipse plugins I needed including Eclipse-CS for Checkstyle.

When I then imported our source code I got an UnsupportedClassVersionError referencing a number of seemingly random files. The problem marker referred to the whole file with a lovely Eclipse red cross. The only reference to checkstyle was the problem type of 'Checkstyle problem'.

By uninstalling all the checkstyle setup, I got everything to compile. So it was definitely something to do with Eclipse-CS or Checkstyle.

Some Googling did reveal that the seemingly random files all contained a Javadoc @throws clause referring to a custom exception class (not a JDK exception class). So, something in Eclipse-CS or Checkstyle couldn't cope with this, and it had something to do with an invalid Java version. (UnsupportedClassVersionError means a class was compiled under one version of the JDK and is being run under an earlier version.)

But, I had no idea what that issue was, or how to solve it. Or why it was different on this new install as opposed to my old install.

This morning I decided to go back and think about the first problem that I'd worked around - the failure of Eclipse to startup normally. I did java -version on each of the four Java installs, JDK 1.5 and 1.6 on each of the old and new disks. On the old hard disk, both were 64 bit JDK installs. On the new SSD, JDK 1.6 was 64 bit, but JDK 1.5 was 32 bit.

So, I uninstalled the two SSD JDKs, and reinstalled them, both as 64 bit. I also needed to install the browser JRE separately as that needed a 32 bit JRE.

Eclipse now started correctly (with either the OSGI 1.5 or 1.6 setting) and the Eclipse-CS/Checkstyle problem went away.

I've still no idea why the difference between a 32 and 64 bit JVM should cause an UnsupportedClassVersionError though. I've always thought that the bytecode format was independent of 32 vs 64 bit, yet the error would appear to indicate otherwise.
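For reference, the class-file format version a JVM accepts can be checked from code (the class name here is mine, purely for illustration); a JDK 1.5 JVM reports 49.0, while classes compiled for 1.6 are tagged 50.0 and trigger UnsupportedClassVersionError on the older JVM:

```java
// Prints the newest class-file format version this JVM can load.
// JDK 1.5 reports 49.0; classes compiled for 1.6 are tagged 50.0 and
// loading one on a 1.5 JVM raises UnsupportedClassVersionError.
public class ClassVersionCheck {
    public static double classVersion() {
        return Double.parseDouble(System.getProperty("java.class.version"));
    }
    public static void main(String[] args) {
        System.out.println("Supported class file version: " + classVersion());
    }
}
```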

At least it is sorted now. The newly installed SSD drive is going to have to be super-quick for the two day loss of productivity though!

Thursday, 17 June 2010

Exception transparency and Lone-Throws

The Project Lambda mailing list has been considering exception transparency recently. My fear is that the current proposal in this area goes beyond what Java's complexity budget will allow. So, I proposed an alternative.

Exception transparency

Exception transparency is all about checked exceptions and how to handle them around a closure/lambda.

Firstly, it's important to note that closures are a common feature in other programming languages, so the standard approach would be to look elsewhere to see how this is handled. However, checked exceptions are a uniquely Java feature, so that approach doesn't help.

The concept of, and a solution for, exception transparency appears in Neal Gafter's BGGA and CFJ proposals, and is referenced by the original FCM proposal. First, let's look at the problem:

Consider a method that takes a closure and a list, and processes each item in the list using the closure. For our example, we have a conversion library method (often called map) that transforms an input list to an output list:

  // library method
  public static <I, O> List<O> convert(List<I> list, #O(I) block) {
    List<O> out = new ArrayList<O>();
    for (I in : list) {
      O converted = block.(in);
      out.add(converted);
    }
    return out;
  }
  // user code
  List<File> files = ...
  #String(File) block = #(File file) {
    return file.getCanonicalPath();
  };
  List<String> paths = convert(files, block);

However, this code won't work as expected unless we specially handle it in closures. This is because the method getCanonicalPath can throw an IOException.

The problem of exception transparency is how to transparently pass the exception, thrown by the user supplied closure, back to the surrounding user code. In other words, we don't want the library method to absorb the IOException, or wrap it in a RuntimeException.
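To see the problem in compilable Java today, here is the same library method written against a hand-rolled single-method interface (the names Block and convert are illustrative, standing in for the proposed function types); because the interface has no throws clause, the closure body is forced to absorb or wrap the IOException:

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// The single-method interface stands in for the function type #O(I).
// Its apply method declares no checked exceptions, so the body below
// must wrap the IOException - exactly the lack of transparency at issue.
public class TransparencyProblem {
    interface Block<I, O> { O apply(I in); }

    static <I, O> List<O> convert(List<I> list, Block<I, O> block) {
        List<O> out = new ArrayList<O>();
        for (I in : list) {
            out.add(block.apply(in));
        }
        return out;
    }

    public static int demo() {
        List<File> files = new ArrayList<File>();
        files.add(new File("."));
        List<String> paths = convert(files, new Block<File, String>() {
            public String apply(File file) {
                try {
                    return file.getCanonicalPath();
                } catch (IOException ex) {
                    throw new RuntimeException(ex);  // forced wrapping
                }
            }
        });
        return paths.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```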

Project Lambda approach

The approach of Project Lambda is modelled on Neal Gafter's work. This approach adds additional type information to the closure to specify which checked exceptions can be thrown:

  // library method
  public static <I, O, throws E> List<O> convert(List<I> list, #O(I)(throws E) block) throws E {
    List<O> out = new ArrayList<O>();
    for (I in : list) {
      O converted = block.(in);
      out.add(converted);
    }
    return out;
  }
  // user code
  List<File> files = ...
  #String(File)(throws IOException) block = #(File file) {
    return file.getCanonicalPath();
  };
  List<String> paths = convert(files, block);

Notice how more generic type information was added - throws E. In the library method, this is specified three times - once in the generic declaration, once in the function type of the block, and once on the method itself. In short, throws E says "throws zero-to-many exceptions, where checked exceptions must follow the standard rules".

However, the user code also changed. We had to add the (throws IOException) clause to the function type. This actually locks in the exception that will be thrown, and allows checked exceptions to continue to work. This creates the mouthful #String(File)(throws IOException).

It has recently been noted that syntax doesn't matter yet in Project Lambda. However, here is a case where there is effectively a minimum syntax pain. No matter how you rearrange the elements, and what symbols you use, the IOException element needs to be present.

On the Project Lambda mailing list I have argued that the syntax pain here is inevitable and unavoidable with this approach to exception transparency. And I've gone further to argue that this syntax goes beyond what Java can handle. (Imagine some of these declarations with more than one block passed to the library method, or with wildcards!!!)

Lone throws approach

As a result of the difficulties above, I have proposed an alternative - lone-throws.

The lone-throws approach has three elements:

  1. Any method may have a throws keyword without specifying the types that are thrown ("lone-throws"). This indicates that any exception, checked or unchecked, may be thrown. Once thrown in this manner, any checked exception flows up the stack in an unchecked manner.
  2. Any catch clause may have a throws keyword after the catch. This indicates that any exception may be caught, even if the exception isn't known to be thrown by the try block.
  3. All closures are implicitly declared with lone throws. Thus, all closures can throw checked and unchecked exceptions without declaring the checked ones.
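For illustration, rules 1 and 2 might look like this on an ordinary method (hypothetical syntax - this does not compile in any current Java):

```java
// hypothetical lone-throws syntax, not valid Java today
public void copyFile(File in, File out) throws {   // rule 1: lone-throws
    // an IOException thrown here flows up the stack unchecked
}

try {
    copyFile(src, dest);
} catch throws (IOException ex) {                   // rule 2: catch-throws
    // from this point the IOException is checked again
}
```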

Here is the same example from above:

  // library method
  public static <I, O> List<O> convert(List<I> list, #O(I) block) {
    List<O> out = new ArrayList<O>();
    for (I in : list) {
      O converted = block.(in);
      out.add(converted);
    }
    return out;
  }
  // user code
  List<File> files = ...
  #String(File) block = #(File file) {
    return file.getCanonicalPath();
  };
  List<String> paths = convert(files, block);

If you compare this example to the very first one, you can see that it is identical. Personally, I'd describe that as true exception transparency (as opposed to the multiple declarations of generics required in the Project Lambda approach).

It works, because the closure block automatically declares the lone-throws. This allows all exceptions, checked or unchecked to escape. These flow freely through the library method and back to the user code. (Checked exceptions only exist in the compiler, so this has no impact on the JVM)

The user may choose to catch the IOException, however they won't be forced to. In this sense, the IOException has become equivalent to a runtime exception because it was wrapped in a closure. The code to catch it is as follows:

  try {
    paths = convert(list, block);  // might throw IOException via lone-throws
  } catch throws (IOException ex) {
    // handle as normal - if you throw it, it is checked again
  }

The simplicity of the approach in syntax terms should be clear - it just works. However, the downside is the impact on checked exceptions.

Checked exceptions have both supporters and detractors in the Java community. However, all must accept that, with projects like Spring avoiding checked exceptions, their role has been reduced. It is also widely known that newer programming languages are not adopting the concept of checked exceptions.

In essence, this proposal provides a means to accept the new reality in which checked exceptions are less important. Any developer may use the lone-throws concept to convert checked exceptions to unchecked ones. They may also use the catch-throws concept to catch exceptions that would otherwise be uncatchable.

This may seem radical, however with the growing integration of non-Java JVM languages, the problem of being unable to catch checked exceptions is fast approaching. (Many of those languages throw Java checked exceptions in an unchecked manner.) As such, the catch-throws clause is a useful language change on its own.
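In fact, the JVM half of this already works today: a well-known generic-erasure trick (often called "sneaky throws") lets plain Java throw a checked exception without declaring it, which also shows why a catch-throws style clause would be needed to catch the result cleanly:

```java
import java.io.IOException;

// Demonstrates that checked exceptions exist only in the compiler: the
// generic cast is erased at runtime, so the IOException escapes fail()
// without being declared - just as non-Java JVM languages allow.
public class SneakyThrows {
    @SuppressWarnings("unchecked")
    static <T extends Throwable> void sneakyThrow(Throwable t) throws T {
        throw (T) t;  // cast erased at runtime; the checked type escapes
    }

    static void fail() {
        // no 'throws IOException' declared, yet one is thrown
        SneakyThrows.<RuntimeException>sneakyThrow(new IOException("boom"));
    }

    public static String caughtType() {
        try {
            fail();
            return "none";
        } catch (Exception ex) {
            // 'catch (IOException ex)' would not even compile here, because
            // the compiler believes fail() cannot throw it - the situation
            // the proposed catch-throws clause addresses
            return ex.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(caughtType());
    }
}
```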

Finally, I spent a couple of hours tonight implementing the lone-throws and catch-throws parts. It took less than two hours - this is an easy change to specify and implement.

Summary

Overall, this is a tale of two approaches to a common problem - passing checked exceptions transparently from inside to outside a closure. The Project Lambda approach preserves full type-information and safeguards checked exceptions at the cost of horribly verbose and complex syntax. The lone-throws approach side-steps the problem by converting checked exceptions to unchecked, with less type-information as a result, but far simpler syntax. (The mailing list has discussed other possible alternatives, however these two are the best developed options.)

Can Java really stand the excess syntax of the Project Lambda approach?
Or is the lone-throws approach too radical around checked exceptions?
Which is the lesser evil?

Feedback welcome!

Sunday, 14 March 2010

Java language design by use case

In a blog in 2006 Neal Gafter wrote about how language design was fundamentally different to API design and how use cases were a bad approach to language design. This blog questions some of those conclusions in the context of the Java language.

Java language design by use case

Firstly, Neal doesn't say that use cases should be avoided in language design:

In a programming language, on the other hand, the elements that are used to assemble programs are ideally orthogonal and independent. ...
To be sure, use cases also play a very important role in language design, but that role is a completely different one than the kind of role that they play in API design. In API design, satisfying the requirements of the use cases is a sufficient condition for completeness. In language design, it is a necessary condition.

So, Neal's position seems very sound. Language features should be orthogonal, and designed to interact in new ways that the language designer hadn't thought of. This is one element of why language design is a different skill to API design - and why armchair language designers should be careful.

The problem, and the point of this blog, is that it would appear that the development of the Java language has never been overly concerned with following this approach. (I'm not trying to cast aspersions here on those involved - just trying to provide some background on the language).

Consider inner classes - added in v1.1. These target a specific need - the requirements of the Swing API. While they have been used for other things (as a poor man's closure), they weren't really designed as such.

Consider enums - added in v1.5. These target a single specific use case, that of a typesafe set of values. They don't extend to cover additional edge cases (shared code in an abstract superclass, or extensibility, for example) because these weren't part of the key use case. JSR-310 has been significantly compromised by the lack of shared code.

Consider the foreach loop - added in v1.5. This meets a single basic use case - looping over an array or iterable. The use case didn't allow for indexed looping, finding out if it's the first or last time around the loop, looping around two lists pairwise, and so on. The feature is driven by a specific use case.

And the var-args added in v1.5? I have a memory that suggests the use case for its addition was to enable String printf.

Finally, by accounts I've heard, even James Gosling tended to add items to the original Java builds on the basis of what he needed at that moment (a specific use case) rather than to a great overarching plan for a great language.

To be fair, some features are definitely more orthogonal and open - annotations for example.

Looking forward, Project Lambda repeats this approach. It has a clear focus on the Fork-Join/ParallelArray use case - other use cases like filtering/sorting/manipulating collections are considered second class (apparently - it's a bit hard to pin down the requirements). Thus, once again the Java language will add a use case driven feature rather than a language designer's orthogonal feature.

But is that necessarily a Bad Thing?

Well, firstly we have to consider that the Java language has 9 million developers and is probably still the world's most widely used language. So, being use case driven in the past hasn't overly hurt adoption.

Now, most in the community and blogosphere would accept that in many ways Java is actually not a very good programming language. And somewhere deep down, some of that is due to the use case/feature driven approach to change. Yet, down in the trenches most Java developers don't seem especially fussed about the quality of the language. Understanding that should be key for the leaders of the Java community.

I see two ways to view this dichotomy. One is to say that it is simply because people haven't been exposed to better languages with a more thought through and unified language design. In other words - once they do see a "better designed language" they'll laugh at Java. While I think that is true of the blogosphere, I'm rather unconvinced as to how true that is of the mainstream.

The alternative is to say that actually most developers handle discrete use-case focussed language features more easily than abstracted, independent, orthogonal ones. In other words - "use feature X to achieve goal Y". I have a suspicion that is how many developers actually like to think.

Looked at in this way, the design of the Java language suddenly seems a lot more clever. The use case driven features map more closely onto the discrete mental models of working developers than the abstract super-powerful ones of more advanced languages. Thus this is another key difference that marks out a blue collar language (pdf) (cache) from an academic experiment.

Project Lambda

I'm writing this blog because of Project Lambda, which is adding closures to the Java language. Various options have been suggested to solve the problem of referring to local variables, and whether those references should be safe across multiple threads or not. The trouble is that there are two use cases - immediate invocation, where local variables can be used immediately and safely, and deferred asynchronous invocation, where local variables would be published to another thread and be subject to data races.

What this blog suggests is that maybe these two use cases need to be representable as two language features or two clear variations of the same feature (as in C++11).
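Today's nearest equivalent, the anonymous inner class, already shows the safe half of this split: only final locals may be captured, so the local itself can never be subject to a data race. A small sketch (class and interface names are mine, not from any proposal):

```java
// Anonymous inner classes handle the immediate-invocation use case by
// only allowing final locals to be captured.
public class CaptureDemo {
    interface Task { int run(); }

    public static int invokeNow() {
        final int base = 40;  // must be final to be captured by the inner class
        Task task = new Task() {
            public int run() { return base + 2; }
        };
        // A non-final local could not be captured at all:
        //   int x = 1; new Task() { public int run() { return x++; } };
        // fails to compile, which is how Java currently sidesteps the
        // deferred-invocation data-race problem.
        return task.run();
    }

    public static void main(String[] args) {
        System.out.println(invokeNow());
    }
}
```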

Summary

Many of the changes to the Java language, and some of the original features, owe as much to a use case driven approach as to an overarching language design with orthogonal features. Yet despite this supposed "flaw" developers still use the Java language in droves.

Maybe it's time to question whether use-case focus without orthogonality in language features isn't such a Bad Thing after all?

Feedback welcome!

Sunday, 21 February 2010

Serialization - shared delegates

I've been working on Joda-Money as a side project and have been investigating serialization, with a hope of improving JSR-310.

Small serialization

Joda-Money has two key classes - BigMoney, capable of storing information to any scale and Money, limited to the correct number of decimal places for the currency.

 public class BigMoney implements Serializable {
   private final CurrencyUnit currency;
   private final BigDecimal amount;
 }
 public class Money implements Serializable {
   private final BigMoney money;
 }

A default application of serialization to these classes will generate 525 bytes for BigMoney and 599 bytes for Money. This is a lot of data to be sending for objects that seem quite simple.

Where does the size go?

Well, each serialized class has to write a header stating what the class is. For something like Money, it has to write a header for itself, BigMoney, CurrencyUnit, BigDecimal and BigInteger. The header also includes the serialization version number and the names of each field.

Of course, serialization is designed to handle complex cases where the versions of the class file differ between two JVMs. Data is populated into the right fields using the field name. But for simple classes like Money, the data isn't going to change over time.

One interesting fact is that the class header is only sent once per stream for each class. As a result, for each subsequent object after the first, the size is reduced. For default serialization of a subsequent BigMoney the size is 59 bytes, and for Money it is 65 bytes. Clearly, the header is a major overhead.
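This header-once behaviour is easy to verify by measuring the stream directly. A small sketch using BigDecimal (exact byte counts vary between JDK versions, so only the relative sizes matter):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.math.BigDecimal;

// Writes two objects of the same class to one stream and measures each:
// the first pays for all the class headers, the second is far smaller.
public class StreamSize {
    public static int[] sizes() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(new BigDecimal("12.34"));
        out.flush();
        int first = bytes.size();                  // headers + data
        out.writeObject(new BigDecimal("56.78"));
        out.flush();
        int second = bytes.size() - first;         // data only
        out.close();
        return new int[] {first, second};
    }

    public static void main(String[] args) throws IOException {
        int[] s = sizes();
        System.out.println("first=" + s[0] + " subsequent=" + s[1]);
    }
}
```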

Making the data smaller

The key to this is using a serialization delegate class. The delegate is a class that is written into the output stream in place of the original class. This approach is required because the fields are final which prevents a sensible data format from being written/read by the class itself.

 public class Money implements Serializable {
   private final BigMoney money;
   private Object writeReplace() {
     return new Ser( ... );
   }
 }

So, there is a new class Ser which will appear in the stream wherever the Money class would have been. The name Ser is deliberately short, as each letter takes up space in the stream.

The delegate class is usually written as a static inner class:

 public class Money implements Serializable {
   private final BigMoney money;
   private Object writeReplace() {
     return new Ser(this);
   }
   private static class Ser implements Serializable {
     private Money obj;
     private Ser(Money money) {obj = money;}
     private void writeObject(ObjectOutputStream out) throws IOException {
       // write data to stream, avoiding defaultWriteObject()
       // this writes the currency code and amount directly
     }
     private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
       // read data from stream to obj variable, avoiding defaultReadObject()
     }
     private Object readResolve() {
       return obj;
     }
   }
 }

The delegate class uses the low level writeObject and readObject to control the data in the stream. The readResolve method then returns the correct object back for the serialization mechanism to put in the object structure. The class is static to ensure a stable serialized form.

Simply taking control of the stream in this way will greatly reduce the overall size. The biggest gain is in writing out the BigDecimal in an efficient manner.
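To make the pattern concrete, here is a minimal runnable sketch of the delegate approach, using a simplified Money-like class (class and field names are illustrative, not Joda-Money's actual serialized form):

```java
import java.io.*;
import java.math.BigDecimal;

// A minimal writeReplace/readResolve delegate: the Ser class appears in
// the stream in place of Money, writing just the currency code and amount.
public class DelegateDemo {
    public static class Money implements Serializable {
        public final String currency;
        public final BigDecimal amount;
        public Money(String currency, BigDecimal amount) {
            this.currency = currency;
            this.amount = amount;
        }
        private Object writeReplace() {
            return new Ser(this);
        }
    }

    static class Ser implements Serializable {
        private transient Money obj;
        Ser(Money obj) { this.obj = obj; }
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.writeUTF(obj.currency);           // no field names written
            out.writeUTF(obj.amount.toString());  // a real impl would pack this tighter
        }
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            obj = new Money(in.readUTF(), new BigDecimal(in.readUTF()));
        }
        private Object readResolve() {
            return obj;  // hand the rebuilt Money back to serialization
        }
    }

    public static Money roundTrip(Money money) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(money);
        out.close();
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        return (Money) in.readObject();
    }

    public static void main(String[] args) throws Exception {
        Money money = roundTrip(new Money("GBP", new BigDecimal("12.34")));
        System.out.println(money.currency + " " + money.amount);
    }
}
```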

Even better?

My investigation has shown a technique to make the stream even smaller.

Firstly, rather than using a static inner class, use a top-level package scoped class. This will have a shorter fully qualified class name, thus a shorter header.

Secondly, look at the other classes in the package. If there are more classes that need the same treatment, why not use a single delegate class for all of them?

 public class BigMoney {
   private final CurrencyUnit currency;
   private final BigDecimal amount;
   private Object writeReplace() {
     return new Ser(Ser.BIG_MONEY, this);
   }
 }
 public class Money {
   private final BigMoney money;
   private Object writeReplace() {
     return new Ser(Ser.MONEY, this);
   }
 }
 class Ser implements Externalizable {
   static final byte BIG_MONEY = 0;
   static final byte MONEY = 1;
   private byte type;
   private Object obj;
   public Ser() {}  // required by Externalizable
   Ser(byte type, Object obj) {this.type = type; this.obj = obj;}
   public void writeExternal(ObjectOutput out) throws IOException {
     out.writeByte(type);
     switch (type) {
       // write data to stream based on the type
       // this writes the currency code and amount directly
     }
   }
   public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
     type = in.readByte();
     switch (type) {
       // read data from stream to obj variable based on the type
     }
   }
   private Object readResolve() {
     return obj;
   }
 }

So, both classes are sharing the same serialization delegate, using a single byte type to distinguish them. Since the header is written once per class per stream, there is now only one header written whether your stream contains BigMoney, Money or both.

I've also switched to using Externalizable rather than Serializable. Despite the public methods, these cannot be called on the general API because this is a package scoped class. This change doesn't affect the stream size, but should perform faster (untested!) as there is less reflection involved.

With these changes, the stream size for sending one BigMoney drops from 525 bytes to 58, and one Money from 599 bytes to 68. Sending a subsequent object of the same type drops to 24/34 bytes, whereas the default would be 59/65 bytes.

The single shared delegate approach also results in a smaller jar file, as each separate class carries a large overhead in the jar. (We've replaced two delegates with one, so the jar is smaller.)

One downside with this approach is that serialization is no longer encapsulated within the class being serialized. This may result in a constructor becoming package scoped rather than private.

The approach is also only recommended where the class and serialized format is stable, as you are fully responsible for evolution over time of the data format.

A final downside is that the object identity of referenced objects may not be preserved. For example, if the data of the BigDecimal is written out rather than a reference to the object, then a new BigDecimal object will be created for each BigMoney deserialized. The extent to which this is a problem depends on the memory structure being serialized.

The same problem applies to multiple Money objects backed by the same BigMoney. The default serialized size for the second would be just 10 bytes, whereas the basic shared delegate approach would be 24 bytes.

As a result, I recommend only writing the base class, BigMoney in this case, directly using its contents. Other classes that contain the base class, Money in this case, should write out a reference to the BigMoney from the shared delegate. This approach means that the second Money takes 14 bytes when the BigMoney is shared and 34 bytes when it isn't.

Using this final approach, the figures (in bytes) are as follows:

  Object                      Default serialization    Shared delegate
                              First     Subsequent     First     Subsequent
  BigMoney                    525       59             58        24
  Money                       599       65             68        34
  Money with shared BigMoney  599       10             68        14

Summary

The shared delegate technique offers one route to the smallest stream size for serialization. The data size for the first object was a tenth of the original, and halved for subsequent objects. However, I would recommend this as a specialist technique for low level value objects rather than general beans.

So is this worth applying to JSR-310? Feedback welcome!

Monday, 8 February 2010

New job - impact on JSR-310

This is a quick blog to outline my upcoming job change and how it affects JSR-310.

For many years I've worked for SITA, the global leader in air transport communications and IT solutions. But the time has come to move on, so from the 1st of March I'm starting a new job at a London startup, OpenGamma.

So, what can I tell you about OpenGamma? Well, not too much just yet, as it's only just coming out of stealth mode. I can say they're led by Kirk Wylie, they're building technology for the financial industry, and I'm excited about their big idea! Oh, and they're hiring (London only).

And how does this affect JSR-310?

Well, OpenGamma will be actively supporting my work on JSR-310 in work time! Clearly this will have a big impact on development pace, and we may yet make JDK 7 (but of course that's up to Sun/Oracle).

In the meantime, watch out for the Early Draft Review of JSR-310 where I'll need maximum feedback!

Monday, 23 November 2009

More detail on Closures in JDK 7

This blog goes into a little more detail about the closures announcement at Devoxx and subsequent information that has become apparent.

Closures in JDK 7

At Devoxx 2009, Mark Reinhold from Sun announced that it was time for closures in Java. This was a big surprise to everyone, and there was a bit of a vacuum as to what was announced. More information is now available.

Firstly, Sun, via Mark, have chosen to accept the basic case for including closures in Java. By doing so, the debate now changes from whether to go ahead, to how to proceed. This is an important step.

Secondly, what did Mark consider to be in and what out? Well, he indicated that non-local returns and the control-invocation statement were out of scope. There was also some indication that access to non-final variables may be out of scope (this is mainly because it raises nasty multi-threading Java Memory Model issues with local variables).

In terms of what was included, Mark indicated that extension methods would be considered. These would be necessary to provide meaningful closure-style APIs, since the existing Java collection APIs cannot be altered. The result of what was in came to be called "simple closures".

Finally, Mark offered up a possible syntax. The syntax he showed was very close to FCM:

  // function expression (block form)
  #(int i, String s) {
    System.out.println(s);
    return i + s.length();
  }

  // function expression (expression form)
  #(int i, String s) (i + s.length())

  // function types
  #int(int, String)

As such, it's easy to say that Sun has "chosen the FCM proposal". However, with all language changes, we have to look at the semantics, not the syntax!

The other session at Devoxx was the late night BOF session. I didn't attend, however according to Reinier Zwitserloot, Mark indicated that exception transparency might not be essential to the final proposal.

According to Reinier, Mark also said "It is not an endorsement of FCM. He was very specific about that." I'd like to consider that in a little more detail (below).

The final twist was when a new proposal by Neal Gafter was launched which I'm referring to as the CFJ proposal. After some initial confusion, it became clear that this was written 2 weeks before Devoxx, and that Neal had discussed it primarily with James Gosling on the basis of it being a "compromise proposal". Neal has also stated that he didn't speak to Mark directly before Devoxx (and nor did I).

There are a number of elements that have come together in the various proposals:

  Feature                             CICE                BGGA 0.5              FCM 0.5          CFJ 0.6a         Mark's announcement (Devoxx summary)
  Literals for reflection             -                   -                     Yes              -                No (No info)
  Method references                   -                   -                     Yes              Yes              Worth investigating (No info)
  Closures assignable to a
    single method interface           Yes                 Yes                   Yes              Yes              Yes (No info)
  Closures independent of a
    single method interface           -                   Yes                   Yes              Yes              Yes
  Access non-final local variables    Yes                 Yes                   Yes              Yes              No (Maybe not)
  Keyword 'this' binds to
    enclosing class                   -                   Yes                   Yes              Yes              No info (Probably yes)
  Local 'return' (binds to closure)   Yes                 -                     Yes              Yes              Yes
  Non-local 'return'
    (binds to enclosing method)       -                   Yes                   -                -                No
  Alternative 'return' binding
    (last-line-no-semicolon)          -                   Yes                   -                -                No
  Exception transparency              -                   Yes                   Yes              Yes              No info (Maybe not)
  Function types                      -                   Yes                   Yes              Yes              Yes
  Library based control structures    -                   Yes                   -                -                No
  Proposed closure syntax             Name(args) {block}  {args => block;expr}  #(args) {block}  #(args) {block}  #(args) {block}
                                                                                                 #(args) expr     #(args) (expr)

A table never captures the full detail of any proposal. However, I hope that I've demonstrated some key points.

Firstly, the table shows how the CFJ is worthy of a new name (from BGGA) as it treats the problem space differently by avoiding non-local returns and control invocation. Neal Gafter is also the only author of the CFJ proposal, unlike BGGA.

Secondly, it should be clear that at the high level, the CFJ and FCM proposals are similar. But in many respects, this is also a result of what would naturally occur when non-local returns and control invocation are removed from BGGA.

Finally, it should be clear that it is not certain that the CFJ proposal meets the aims that Mark announced at Devoxx ("simple closures"). There simply isn't enough hard information to make that judgement.

One question I've seen on a few tweets and blog posts is whether Mark and Sun picked the BGGA or FCM proposal. Well, given what Mark said in the BOF (via Reinier), the answer is that neither was "picked" and that he'd like to see a new form of "simple closures" created. However, it should also be clear that the choices announced at Devoxx were certainly closer to the FCM proposal than any other proposal widely circulated at the time of the announcement.

At this point, it is critical to note that Neal Gafter adds a huge amount of technical detail to each proposal he is involved with - both BGGA and CFJ. The FCM proposal, while considerably more detailed than CICE, was never at the level necessary for full usage in the JDK. As such, I welcome the CFJ proposal as taking key points from FCM and applying the rigour from BGGA.

As of now, there has been relatively little debate on extension methods.

Summary

The key point now is to focus on how to implement closures within the overall scope laid out. This should allow the discussion to move forward considerably, and hopefully with less acrimony.

Friday, 20 November 2009

Why JSR-310 isn't Joda-Time

One question that has been repeatedly asked is why JSR-310 wasn't simply the same as Joda-Time. I hope to explain some of the reasons here.

Joda-Time as JSR-310?

At its heart, JSR-310 is an effort to add a quality date and time library to the JDK. So, since most people consider Joda-Time to be a quality library, why not include it directly in the JDK?

Well, there is one key reason - Joda-Time has design flaws.

Now before everyone panics and abuses that line as a tweet, I need to say that Joda-Time is by far the best option currently available, and that most users won't notice the design flaws. But, I do want to document them, so the basis for the changes in JSR-310 is clear.

1) Human/Machine timelines

One element of clarity is a better understanding of the distinction between the two principal views of the timeline - Human and Machine.

Machines have one view - a single, ever increasing number. In Java we set zero as 1970-01-01T00:00Z and count in milliseconds from there.
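The JDK's existing classes show this machine view directly - a Date is nothing but that millisecond count, and a human-readable form only appears when formatting. A small sketch (class name is mine):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// A Date holds only milliseconds from the epoch; formatting zero in UTC
// reveals the agreed origin of the machine timeline.
public class EpochDemo {
    public static String epochZero() {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(new Date(0L));  // millisecond zero
    }

    public static void main(String[] args) {
        System.out.println(epochZero());  // prints 1970-01-01T00:00:00Z
    }
}
```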

Humans have a totally different view of time. We have multiple calendar systems (one primary, many others), which divide the timeline into years, months, days, hours, minutes and seconds. In addition, humans have time zones which cause the values to vary around the globe, and for there to be gaps and overlaps in the human timeline as DST starts and ends.

Much of the time, a conversion between the two timeline views is possible with a time zone, however in a DST gap or overlap, things are much less clear.

Joda-Time defines two key interfaces - ReadableInstant and ReadablePartial. Both the Instant class (simple instant in time) and the DateTime class (human view of an instant in time) are implementations of ReadableInstant. This is wrong.

DateTime is a human-timeline view of the world, not a machine-timeline view. As such, DateTime is much better thought of, and designed as, a LocalDateTime and a timezone rather than the projection of the machine timeline onto the human timeline. Thus, DateTime should not implement ReadableInstant.

2) Pluggable chronology

What is the range of values returned by this method in Joda-Time?

  int month = dateTime.getMonthOfYear();

The answer is not 1 to 12, but could be 1 to 13! (yet January is still 1 and December is 12)

The answer to this puzzler is down to pluggable chronologies. Each date/time class in Joda-Time has a pluggable chronology, which defines the calendar system in use. But most users of the API never check whether the chronology is the standard ISO chronology before calling getMonthOfYear(). Yet the Coptic chronology has 13 months in a year, and thus the method can return a range of 1 to 13.

A better solution would be to keep the date/time classes restricted to a single calendar system. That way, the result from each method call is clear, and not dependent on any other state in the class, like the chronology.

3) Nulls

Joda-Time accepts null as a valid value in most of its methods. For date/times it means 1970-01-01T00:00Z. For durations it means zero. For periods it means zero.

This approach causes random bugs if your code happens to provide a null to Joda-Time that you hadn't originally planned for. Instead of throwing an error, Joda-Time continues, and the resulting date/time is going to be different from what you want.

4) Internal implementation

Certain aspects of the internal implementation are complex, and the result of having pluggable chronologies and a misunderstanding of the machine/human divide in the timeline. Changing this is a big change to the code.

One particular area of trouble is managing DST overlaps. The behaviour of these in Joda-Time isn't that well defined.

Summary

Joda-Time isn't broken!

It does the job it was designed for, and does it much better than the JDK. And it is widely used and without too many major issues. However, after a few years, it is now clear where it could be designed better.

I took the decision that I didn't want to add an API to the JDK that had known design flaws. And the changes required weren't just minor. As a result, JSR-310 started from scratch, but with an API 'inspired by Joda-Time'.

I hope that explains the thought process behind the creation of a new API in JSR-310.