Monday 18 December 2006

Javapolis whiteboards on Java language changes

At Javapolis I 'looked after' a set of whiteboards on which all 2800+ participants were free to write up their ideas for improving their Java lives.

These were photographed at the end, and I've now extracted the information into textual format. So, if you were at Javapolis, or even if you weren't, you can see what was on the collective mind here.

As you look through them, you'll note that there were a few indicative votes taken. It appeared that developers liked the idea of closures and solving some issues with NPEs, but weren't so fussed about improving the foreach loop. There was also a vote about native XML, which showed a majority against the feature at present.

Thursday 7 December 2006

enum + switch == NPE

Enums hide most of their implementation successfully from the developer. You just use them as constants, assign them to fields, and call methods on them - no real issues and no exposed implementation details. However, they do fail on at least one count - a NullPointerException in switch statements.

When enums were introduced, the compiler was extended to support enums in switch statements. Consider an enum of traffic lights (RED, YELLOW, GREEN):

public enum TrafficLight {RED, YELLOW, GREEN}

TrafficLight trafficLights = ...
switch (trafficLights) {
  case RED: {/* do stuff */}
  case YELLOW: {/* do stuff */}
  case GREEN: {/* do stuff */}
}

This code can throw a NullPointerException on the first line. How? Well, take a look at how the compiler sees it:

switch (trafficLights.ordinal()) {
  case 0: {/* do stuff */}
  case 1: {/* do stuff */}
  case 2: {/* do stuff */}
}

Now it's clear what's going on. The enum switch statement is just syntax sugar for a regular switch statement. But in order to get the ordinal for the switch, a method is called, and that method call will throw an NPE if trafficLights is null.
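Here's a self-contained sketch of the trap (the class and method names are mine, not from any library):

```java
public class EnumSwitchNpe {
    public enum TrafficLight { RED, YELLOW, GREEN }

    // Returns a description for the light. Throws NullPointerException
    // when light is null, because the compiled switch calls light.ordinal().
    public static String describe(TrafficLight light) {
        switch (light) {
            case RED:    return "stop";
            case YELLOW: return "caution";
            case GREEN:  return "go";
            default:     return "unknown";
        }
    }

    // Demonstrates that a null enum reference really does NPE in the switch.
    public static boolean throwsNpeOnNull() {
        try {
            describe(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(TrafficLight.RED));  // stop
        System.out.println(throwsNpeOnNull());           // true
    }
}
```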

Can we solve this issue? Well, yes! Since this is just syntax sugar already, why not extend the sugar to allow us to avoid the NPE? This can be achieved by allowing a "case null" within the sugared enum switch statement:

switch (trafficLights) {
  case RED: {/* do stuff */}
  case YELLOW: {/* do stuff */}
  case GREEN: {/* do stuff */}
  case null: {/* do stuff */}  // the null case
}

which compiles to:

switch ((trafficLights != null) ? trafficLights.ordinal() : -1) {
  case 0: {/* do stuff */}
  case 1: {/* do stuff */}
  case 2: {/* do stuff */}
  case -1: {/* do stuff */}  // the null case
}

It's a simple extension, but it closes a gap in the enum syntax sugar. Perhaps one for Java SE 7?

Monday 13 November 2006

Days.daysBetween(today, christmas);

Using Java, how many days are there between now and Christmas? It's actually quite a tricky calculation to get reliably right (did you forget to handle daylight savings?).

Now you no longer need to write it yourself if you use version 1.4 of Joda-Time:

LocalDate today = new LocalDate();
LocalDate christmas = new LocalDate(2006, 12, 25);
Days daysToChristmas = Days.daysBetween(today, christmas);

So what's going on here? Well, it's pretty readable really, but we are (a) creating two date objects (no time, no time zone) and (b) calculating the days between the dates. The calculation effectively includes the start date and excludes the end date. Issues with daylight savings time are avoided by using the no-time-zone LocalDate objects. But what if you want to calculate the number of days considering time, let's say 08:00 on Christmas morning?

DateTime now = new DateTime();
DateTime christmas = new DateTime(2006, 12, 25, 8, 0, 0, 0);
Days daysToChristmas = Days.daysBetween(now, christmas);

Well, that's pretty similar! Internally, the code will count the number of whole days between the datetimes. Or put another way, how many 24-hour units there are between the two datetimes (except that it handles 23/25-hour DST dates too!). So, 08:01 on Christmas Eve returns zero days, but 07:59 returns one day.

And what's the other noticeable feature about the code above? Well, the result is being returned as an object, Days, not as an int. This allows your application to store a value in days, and actually know it is days and not some random int.

Up until now in Joda-Time, we have used the Period class for this. But that allowed you flexibility in storing a period such as "3 days, 12 hours and 32 minutes" - fine if you want that flexibility, but not if you don't. Days on the other hand can only store a number of days.

Now, you may be asking what is so special about days? Why can't I calculate the difference in months or hours? Well, of course you can! The new classes are Years, Months, Weeks, Days, Hours, Minutes and Seconds. They all operate in a similar manner, and they all implement the relevant interface, ReadablePeriod, so they can interoperate.

Oh, and one last point. The factory methods have a naming style that allows them to be statically imported on Java SE 5. I'm still not sure if I really like static imports, but at least the API supports the coding style!

As always, any feedback on Joda-Time is welcomed - here, on the mailing list or on the forum. With your help, the library can improve still further!
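As an aside, the 'whole 24-hour units' counting can be sketched in plain Java with millisecond arithmetic. Note this simplified version ignores the 23/25-hour DST days that Joda-Time handles properly - it's an illustration, not the real implementation:

```java
public class DaysBetweenSketch {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Counts whole 24-hour units between two instants (millis since epoch).
    // The real Joda-Time calculation also copes with 23/25-hour DST days;
    // this simplified sketch does not.
    public static int wholeDaysBetween(long startMillis, long endMillis) {
        return (int) ((endMillis - startMillis) / MILLIS_PER_DAY);
    }

    public static void main(String[] args) {
        System.out.println(wholeDaysBetween(0L, MILLIS_PER_DAY));      // 1
        System.out.println(wholeDaysBetween(0L, MILLIS_PER_DAY - 1));  // 0
    }
}
```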

Monday 6 November 2006

Generics, Iterables and Arrays - more banging of the head

Yes, I'm back again, with another Java 5 example that caused me to bang my head against the wall (see yesterday).

Let's recap the scenario - a CompositeCollection wraps a list of collections and makes them look like a single collection. Here's the class outline again (I've removed the simple one and two argument methods from yesterday's example as they're not relevant today):

public class CompositeCollection<E> implements Collection<E> {

  private Collection<Collection<E>> all;

  public void addComposited(Collection<E>[] arr) {
    all.addAll(Arrays.asList(arr));
  }
}

Well, we know from yesterday that varargs doesn't work here. What if we try to improve the functionality in a different way? What if we make it so that you can pass in any Iterable - wouldn't that make it simpler? After all, Iterable is the new super-interface to all things in the JDK that can be iterated around:

public class CompositeCollection<E> implements Collection<E> {

  private Collection<Collection<E>> all;

  public void addComposited(Iterable<Collection<E>> it) {
    for (Collection<E> c : it) {
      all.add(c);
    }
  }
}

This is all fine, right? I can pass in a List, a Collection or an array, right? Wrong.

Why? Well, arrays don't implement Iterable. Did you know that? Well, if you did, then three gold stars to you! But it was against my expectations.

The reason I expected an array to implement Iterable is because the new foreach loop works on Iterable objects. Thus I naturally expected all arrays to implement Iterable. But it isn't so. And that made me bang my head against the wall. Again.

Now one reason for this may be that an Iterator defines the remove method, but that could easily have thrown UnsupportedOperationException as it is defined to do.

The second possible reason is that it clashes with generics. Consider the case where arrays implement Iterable. Of course an Integer[] would have to implement Iterable<Integer>. But that generification causes problems:

Integer[] arrayInt = new Integer[10];
Iterable<Integer> iterInt = arrayInt;  // assume array implements Iterable
Iterable<Number> iterNum1 = iterInt;   // illegal due to generics

Number[] arrayNum = arrayInt;          // always been legal in Java
Iterable<Number> iterNum2 = arrayNum;  // legal or illegal???????????

Thus, an Iterable<Number> could be obtained via one sequence of code (array assign) but not another (generic assign). So, the Java 5 expert group chose not to make arrays implement Iterable - it would have clashed with generics.

And yet, the irony of all this is that there is in fact no risk here, and the final line could have been legal. Iterable is a read-only interface (wrt generics), so there wasn't actually anything that could go wrong!!!

To be fair, I'm only speculating on what the thinking was. All we know is that arrays don't implement Iterable, and the foreach loop is coded differently for arrays as opposed to Iterable instances.

And hence API designers will have to choose to make their API only accept Collection, and force users to use Arrays.asList. Or alternatively have two versions of the API method, one for collections and one for arrays.
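To illustrate, here's a sketch of the two-method approach, using Arrays.asList as the bridge from arrays to the Iterable world (the class and method names are mine):

```java
import java.util.Arrays;
import java.util.List;

public class ArrayIterableDemo {

    // One method accepts anything Iterable...
    public static <T> int count(Iterable<T> it) {
        int n = 0;
        for (T ignored : it) {
            n++;
        }
        return n;
    }

    // ...and an overload adapts arrays, since arrays don't implement
    // Iterable. Arrays.asList does the bridging for object arrays.
    public static <T> int count(T[] array) {
        return count(Arrays.asList(array));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");
        String[] array = {"a", "b"};
        System.out.println(count(list));   // 3
        System.out.println(count(array));  // 2
    }
}
```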

Sunday 5 November 2006

Generics == bashing your head against a wall

Well, I've been spending a few hours generifying commons-collections recently. The job's only just started, and my head hurts from all the times I've banged it against the wall. Is there anybody listening who might actually be able to communicate to the powers that be at Sun that generics are just way messed up?

This time, my problem is in CompositeCollection. It holds a collection of collections, and makes it look like a single collection. Simple enough use case. So here is a simple generification (showing the methods that add a composited collection to the collection of composited collections):

public class CompositeCollection<E> implements Collection<E> {

  private Collection<Collection<E>> all;

  public void addComposited(Collection<E> coll);
  public void addComposited(Collection<E> coll1, Collection<E> coll2);
  public void addComposited(Collection<E>[] colls);

}

That's great! It works and we're all happy, right? Well, not really, because those three addComposited() methods can of course be simplified using varargs:

public class CompositeCollection<E> implements Collection<E> {

  private Collection<Collection<E>> all;

  public void addComposited(Collection<E>... colls);

}

Now isn't that a much simpler API? We can pass in an array, one, two, or any number of collections we like, right? Wrong.

The problem is that calling this method is totally screwed due to the messed up generics implementation:

public class MyClass {

  private void process() {
    Collection<String> a = new ArrayList<String>();
    Collection<String> b = new ArrayList<String>();
    CompositeCollection<String> c = new CompositeCollection<String>();

    c.addComposited(a, b);    // compiler warning!!!!!!!!
  }

}

Why is there a compiler warning? Because in order to pass a and b to the method, the varargs implementation needs to create an array - Collection<String>[]. But the generics implementation can't create a generified array, yet that's what the method is demanding. Arrrgh!!

So, instead of being helpful, Java creates an ungenerified Collection[], and then the generics system warns us that it is an unchecked cast to make it a Collection<String>[]. But this conversion is all internal to the compiler! As far as the developer is concerned, there is nothing even vaguely type unsafe about what s/he is doing.

Now this is just headbangingly stupid. There is absolutely no risk of breaking the generics model here. No risk of a ClassCastException or ArrayStoreException. No, this is really just another case of generics being nearly completely broken with regards to arrays. And the varargs group didn't manage to realise/coordinate and get a decent implementation that works.

The problem for me is serious though. I am writing an API (commons-collections) that will be widely used, and will have to retain backwards binary compatibility. If I go with the varargs option (the logically correct choice), then every client of this method will get a useless Java warning - not very friendly. The alternative is to go back to the original design and have three methods - an array version, a one-arg and a two-arg. And really that's just crap.
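For completeness, here is a sketch of a third alternative - accepting a Collection of collections - which at least keeps callers warning-free, at the cost of building the collection explicitly. The class name follows the text but the code is illustrative, not the real commons-collections:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class CompositeSketch<E> {
    private final Collection<Collection<E>> all = new ArrayList<Collection<E>>();

    // Accepting a Collection of collections sidesteps the generic-array
    // creation that the varargs version forces on every caller.
    public void addComposited(Collection<? extends Collection<E>> colls) {
        all.addAll(colls);
    }

    // Total size across all composited collections.
    public int size() {
        int n = 0;
        for (Collection<E> c : all) {
            n += c.size();
        }
        return n;
    }

    public static void main(String[] args) {
        CompositeSketch<String> c = new CompositeSketch<String>();
        List<Collection<String>> colls = new ArrayList<Collection<String>>();
        colls.add(Arrays.asList("x", "y"));
        colls.add(Arrays.asList("z"));
        c.addComposited(colls);  // no unchecked warning, no generic array
        System.out.println(c.size());  // 3
    }
}
```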

Tuesday 29 August 2006

Closures - async bad, sync good

Closures. Great or not? Well, given it's properly up for discussion, I thought I'd add my opinions. (I'll assume you've read the various blogs - Neal's most recent two are perhaps the most helpful.)

Two separate use cases have been identified for consideration - async and sync. Async is where code is to be executed later, such as timer tasks. Sync is where the code is executed immediately by the current thread. The immediate question is "why is there a need for one language syntax to meet both requirements?".

Async use case

The async case is already dealt with in Java - with inner classes (typically anonymous). With inner classes, the timer task asks us to pass in 'what we want doing'. So we pass in an object that represents the 'task'. It's logically more than just a block of code.

Now some may complain that it's too much typing or too much clutter. But we already have it. The need is met. Now let's look at some issues with async closures.

a) The syntax of an async closure will be subtly different to that of a sync closure. Once written, however, you won't be able to tell which is which until you try to edit it. This fails an important language syntax readability test. In effect, the developer is being lied to, as they are writing a method with limited control flow, but it looks no different to a code block from the sync case.

b) Refactoring an anonymous inner class to a closure-converted inner class is not straightforward. (Neal uses the test of how easy it is to refactor code from a method into the closure - this is the opposite test.) This point is really about which refactoring should be easy. Moving code to execute in another thread is a serious refactoring, with potentially serious impact. Surely it's not unreasonable for that to require brainpower?

c) An async closure will produce weird error messages - 'nonlocal return/break/continue'. What do these mean, or rather, how can they be explained? Let's use this test - write an error message for the compiler, in 12 words, that a junior developer will understand.

d) May discourage re-use of 'task' instances, which as real objects should often be designed and re-used. Developers will simply be less aware that they are passing an object around (as it doesn't look like an object). Thus there may well be more inlined code.

e) Multi-method adaptor classes like those in Swing don't play well. In fact, you'll have to write an inner class...

Point (a) is really important. Thus, it's no surprise that at present I am very much on the side of inner classes for the async use case and against closures.

Sync use case

The sync use case is entirely different. Here, the whole point of the use case is that the code is executed inline within the method. Thus, it really should be pretty indistinguishable from any other block of code. And it has the potential to add real value to the language.

I like to think of this as being implemented in a very simple way. The compiler could take the first half of the closure implementation and insert it before the closure code block, and the second half after the code block. Very simple, the only downside being bloat of the client code. For example, consider:

public void read() {
  closeAfter() (InputStream in) {
    in = openStream();
    if (in == null) return;
    readStream(in);
  }
}

public static void closeAfter(void(InputStream) closure) {
  InputStream in = null;
  try {
    closure(in);
  } finally {
    in.close();
  }
}

The compiler would produce:

public void read() {
  InputStream in = null;
  try {
    in = openStream();
    if (in == null) return;
    readStream(in);
  } finally {
    in.close();
  }
}

Now, I'm no language implementation expert, but this looks very simple to do, exceptions and control flow just happen naturally. In particular, data assigned to the variable 'in' is available for use at the end of the closure implementation.

Now, I just need to provide a nice syntax for returning a value from the closure. Maybe assigning it to a parameter like we did with 'in'?

So, yes, I probably am in favour of sync closures (combined with extension methods), just so long as the syntax and implementation is simple and without external artifacts.

Of course it could be I've misunderstood something, but really the two use cases seem poles apart to me. Closures fit with the sync case, but not with the async one (in Java - in other languages it is probably very different). Trying to make one 'syntax' fit both is misleading and dangerous, as the complete syntax must also take into account errors and unusual conditions and be fully consistent across both.
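For comparison, the sync pattern above can be written in today's Java with an anonymous inner class - the 'execute around' idiom. This sketch uses my own names (StreamBlock, closeAfter) and a fixed in-memory stream, so it's an illustration of the shape, not a proposed API:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ExecuteAround {
    // One-method interface standing in for the proposed closure type.
    public interface StreamBlock {
        void call(InputStream in) throws IOException;
    }

    // Runs the block against the stream and guarantees the close -
    // the Java 5 inner-class equivalent of the closeAfter() closure sketch.
    public static void closeAfter(InputStream in, StreamBlock block) throws IOException {
        try {
            block.call(in);
        } finally {
            in.close();
        }
    }

    // Returns the first byte of a fixed stream, or -1 on error.
    public static int readFirstByte() {
        final int[] result = new int[1];
        try {
            closeAfter(new ByteArrayInputStream(new byte[] {42}), new StreamBlock() {
                public void call(InputStream in) throws IOException {
                    result[0] = in.read();
                }
            });
        } catch (IOException e) {
            return -1;
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(readFirstByte());  // 42
    }
}
```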

Wednesday 23 August 2006

Extension methods and closures

Extension methods are a C# feature that could prove very useful in Java, particularly with the possibility of closures in Java (something which should be the topic of a blog in its own right). So, what is an extension method, and why might it be useful? Consider the following code:

public class Collections {
 public static void sort(List list, Comparator comp) {
   ...
 }
}
public class MyClass {
 public void myMethod() {
  List list = ...;
  Comparator comp = ...;
  Collections.sort(list, comp);
 }
}

This is standard JDK code. To sort a list you have to use the Collections static methods. But that means you need to know that they exist - they won't auto-complete like other list methods in an IDE, for example. What if we could find a way to alter this slightly?

public class Collections {
 @extensionMethod(Sort)
 public static void sort(List list, Comparator comp) {
   ...
 }
}
public class MyClass {
 public void myMethod() {
  List list = ...;
  Comparator comp = ...;
  list.Sort(comp);
 }
}

So, by adding an attribute (or some other syntax mechanism - C# uses 'this' rather strangely) and importing the Collections class, we can then call the sort method in an OO style in the client code myMethod(). I have deliberately set up a coding standard, so that the extension method has a capital first letter. This would allow someone reading the code to spot that it is not a standard method on the class. Behind the scenes, all it does is call the static method in a static way - no magic involved.

This brings benefits in being able to effectively extend code that is locked down in some way - perhaps because it's an interface (such as the JDK List interface), or perhaps because it's a 3rd party library that you don't want to change, but is really missing a very useful little method.
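The closest thing available today is a static import (new in Java 5) - the Collections prefix disappears at the call site, though the call still doesn't hang off the list itself. A small sketch using the real Collections.sort:

```java
import static java.util.Collections.sort;

import java.util.ArrayList;
import java.util.List;

public class StaticImportSort {
    // With the static import, the call site loses the Collections prefix,
    // approximating half of what an extension method would give us.
    public static List<String> sorted(List<String> input) {
        List<String> copy = new ArrayList<String>(input);
        sort(copy);  // really Collections.sort(copy)
        return copy;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        list.add("b");
        list.add("a");
        System.out.println(sorted(list));  // [a, b]
    }
}
```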

So, how does this link to closures? Well, many closure advocates want to add methods to Collection, List and Map to allow closures to operate directly on the collection. But this simply isn't going to happen as they are interfaces and backwards compatibility must be maintained. However, extension methods offer a solution - a way to apparently add methods to the collection interfaces. Thus you could code with closures:

public class Collections {
 @extensionMethod(Each)
 public static <T> void each(Collection<T> list, void(T) closure) {
  for (T item : list) {
   closure(item);
  }
 }
}
public class MyClass {
 public void myMethod() {
  List<String> list = ...;
  list.Each() (String str) {
   System.out.println(str);
  };
 }
}
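Absent both features, the same shape can be written in plain Java 5 with a one-method interface and an anonymous inner class - more verbose, but it runs today. The names here (Block, each, upperCaseAll) are mine, for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class EachDemo {
    // A one-method interface plays the role of the void(T) closure type.
    public interface Block<T> {
        void call(T item);
    }

    // The static helper that an extension method would let us
    // invoke as list.Each(...).
    public static <T> void each(Collection<T> coll, Block<T> block) {
        for (T item : coll) {
            block.call(item);
        }
    }

    // Example use: build an upper-cased copy of a list.
    public static List<String> upperCaseAll(List<String> input) {
        final List<String> out = new ArrayList<String>();
        each(input, new Block<String>() {
            public void call(String item) {
                out.add(item.toUpperCase());
            }
        });
        return out;
    }

    public static void main(String[] args) {
        System.out.println(upperCaseAll(Arrays.asList("a", "b")));  // [A, B]
    }
}
```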

So, are there any views about extension methods? Are they really an essential addon to the closure proposal?

Which days of the week are the weekend in Egypt?

Do you know? Or more precisely, does your Java application know?

Well, yesterday I released a 0.1 version of a new project Joda-Time-I18N. This provides four lookups at present:

  • Time zone by territory
  • First day of week by territory
  • Business days by territory
  • Weekend days by territory

The data comes from the Unicode CLDR project, so should be pretty accurate (it's backed by IBM and Sun).

The code is released now to get some feedback, and to allow early users access. The 0.1 version number refers to the chance of the API changing (high), not the quality of the data (which is good). Oh, and the weekend in Egypt?

  Territory t = Territory.forID("EG");
  int weekendStartDay = t.getWeekendStart();
  int weekendEndDay = t.getWeekendEnd();

Anyway, I'd love to hear any feedback on the API, data or proposed enhancements.

Wednesday 2 August 2006

Joda-Time 1.3 released

Well, it took a while, but the next release of Joda-Time is finally here. Version 1.3 includes two main enhancements in addition to the bug fixes:

Three new datetime classes have been added - LocalDate, LocalTime and LocalDateTime. These represent a date, time or datetime without a time zone. As such LocalDate is a direct replacement for YearMonthDay, whilst LocalTime replaces TimeOfDay. Neither of the replaced classes has been deprecated yet however, due to their widespread adoption.

The new Local* classes use a more standard and reliable implementation internally, and fix some weird external semantics of the older classes. For example, it was not possible to query the dayOfWeek on a YearMonthDay (because it only holds year, month and day!). LocalDate uses a different implementation, so that restriction no longer applies.

The second major enhancement is the addition of convenience methods to change each field value - the immutable equivalent of set methods. Here is how to set the day to the first day of the month, contrasting before and after:

// previously
firstOfMonth = dt.dayOfMonth().setCopy(1);

// version 1.3
firstOfMonth = dt.withDayOfMonth(1);

Clearly, the new code is more readable and understandable. Note that as Joda-Time objects are immutable, the methods return a new instance with that field value changed. This is emphasised by the use of the verb 'with' rather than 'set'.

As always, feedback is welcomed - whether bugs or reviews, good or bad! Only with input from the community can the library improve.

Monday 31 July 2006

Generics - why no <this>?

So, am I just misunderstanding the spec, or is there no way to easily specify that a method should return the type of the object you called the method on?

Consider this example - I want to write a whole heap of methods in an abstract superclass, but I want each method to return the type of the subclass:

public abstract class A {
  <this> withYear(int year) {
    // implementation
  }
}

public class B extends A {
}

B b = new B().withYear(2006);

But there is no <this> in generics. The closest I could get was to generify the whole type (A) based on its subclass (B). But that's not really what I want to do at all.

Am I missing something?
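The closest workaround I've seen is the self-bound ('curiously recurring') generic pattern, where each subclass names itself as the type parameter - not a true <this>, and it still needs an unchecked cast inside the superclass, but the call site gets the right type. A sketch (class names are mine; the setter mutates for brevity, where real code might return a copy):

```java
public class SelfTypeDemo {
    // The recursive bound A<T extends A<T>> lets withYear return T,
    // provided every subclass names itself as the type parameter.
    public static abstract class A<T extends A<T>> {
        private int year;

        public int getYear() {
            return year;
        }

        @SuppressWarnings("unchecked")
        public T withYear(int year) {
            this.year = year;
            return (T) this;  // safe only if subclasses follow the pattern
        }
    }

    public static class B extends A<B> {
    }

    public static void main(String[] args) {
        B b = new B().withYear(2006);  // compiles: returns B, not A
        System.out.println(b.getYear());  // 2006
    }
}
```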

Thursday 6 July 2006

A better datetime library JSR?

Do you hate the JDK Date and Calendar classes? Would you like to see an alternative in the JDK? Or are you happy with the JDK classes? Maybe you are happy with pulling in a jar file?

I ask because I get asked from time to time whether Joda-Time is going to get a JSR. Each time this comes up I point out that it's a lot of work to do in my free time, and there are infrequent enough releases as it is!

So, this blog is an open question - is it worth it? Should JDK7 contain a better datetime library? Please add a comment with your views, positive and negative, so I can get some idea of people's thoughts!

Friday 16 June 2006

Generics and commons-collections

A debate has started up over in Jakarta-Commons as to whether a new version of commons-collections should be created that supports JDK1.5 generics.

Unfortunately, I happen to think that generics (as implemented) have serious flaws. There are just too many weird behaviours and odd corner cases. Just look how big the FAQ is - a sure sign of serious problems. And they also look darn ugly too!

A forked version has been created on SourceForge with generics, but that is far from ideal, as Apache can't link to it (official policy), nor can it easily be kept in sync. However, it does show that the task is possible. Could the fork be brought back to Apache? Well, possibly, but that gets bogged down in bureaucracy when the forkers don't have Apache commit privileges. Even more so when the relevant Apache committer (me) isn't particularly motivated by the issue (generics).

Anyway, I'd like to ask anyone reading this a question or two - do you want a generics JDK1.5 version of commons-collections? Would you mind if parts of the API were changed to fix API design flaws (this would be a major release, after all)? Please comment here, or at the Commons JIRA. You can also just vote for the change at JIRA.

Tuesday 23 May 2006

Immutable POJOs - Improving on getters/setters

So we all know the bean/POJO get/set convention, right? It's fine for mutable objects, but what about immutables? The convention just doesn't work there.

BigDecimal base = new BigDecimal(100d);
base.setScale(2);  // BUG!!!

Why is the above a bug? Because BigDecimal is immutable, and so the set method doesn't actually change base. Instead, the method returns a new BigDecimal instance which isn't being assigned in the example. Here's the fixed version:

BigDecimal base = new BigDecimal(100d);
base = base.setScale(2);  // FIXED by assigning
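The difference is easy to demonstrate (the class name is mine; the string constructor is used here just to get a clean toString):

```java
import java.math.BigDecimal;

public class BigDecimalImmutability {
    // Shows that setScale returns a new instance rather than
    // mutating the receiver.
    public static String[] demo() {
        BigDecimal base = new BigDecimal("100");
        BigDecimal scaled = base.setScale(2);  // must capture the result
        return new String[] {base.toString(), scaled.toString()};
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println(r[0]);  // 100    (base unchanged)
        System.out.println(r[1]);  // 100.00 (new instance)
    }
}
```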

In addition, by returning a new instance, the set method actually breaks the JavaBean spec, which says that set methods must return void.

So what can be done? Well, I'd like to argue that the time has come for a new coding convention for immutable beans/POJOs. As a convention (preferably Sun-endorsed) it would allow frameworks like JSF or Struts to handle immutable objects properly. So, what do I propose?

Mutable verb    Immutable verb
get             get
set             with
add             plus
subtract        minus

Here's an example of a datetime object using these conventions:

DateTime dt = new DateTime();

// immutable get is the same as a mutable get
int month = dt.getMonthOfYear();

// immutable sets use the 'with' verb
dt = dt.withYear(2006);
dt = dt.withMonthOfYear(5);
dt = dt.withDayOfMonth(23);

// immutable add or subtract using the 'plus'/'minus' verbs
dt = dt.plusYears(1);
dt = dt.minusDays(1);

// combinations, for example, the first day of the next month
dt = dt.withDayOfMonth(1).plusMonths(1);

If adopted as a standard, there is even the potential to write a PMD or FindBugs rule to match on the verbs and ensure that you actually assign the result, thus eliminating a potential bug.
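A minimal sketch of a hypothetical class following the proposed verbs (the class and its fields are invented for illustration):

```java
public final class ImmutableMoney {
    private final long amount;

    public ImmutableMoney(long amount) {
        this.amount = amount;
    }

    // 'get' keeps its usual meaning
    public long getAmount() {
        return amount;
    }

    // 'with' returns a changed copy instead of mutating
    public ImmutableMoney withAmount(long newAmount) {
        return new ImmutableMoney(newAmount);
    }

    // 'plus'/'minus' replace add/subtract
    public ImmutableMoney plusAmount(long add) {
        return new ImmutableMoney(amount + add);
    }

    public ImmutableMoney minusAmount(long sub) {
        return new ImmutableMoney(amount - sub);
    }

    public static void main(String[] args) {
        ImmutableMoney m = new ImmutableMoney(100);
        ImmutableMoney m2 = m.plusAmount(50).minusAmount(25);
        System.out.println(m.getAmount());   // 100 (unchanged)
        System.out.println(m2.getAmount());  // 125
    }
}
```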

So, do you use immutable objects? Do you have a coding convention like this? Should this become a Sun convention? Opinions welcome!

Tuesday 10 January 2006

Maven SourceForge Plugin v1.3 - Rapid releases!

Well, my latest little project is complete - the Maven SourceForge plug-in. Basically all I've done is reactivate an existing plug-in and make it work with the latest SourceForge website.

So what does it do? Well, it allows project administrators and release managers of open source projects hosted at SourceForge to use Maven directly to upload their releases. The plug-in:

  • Uploads the files (configurable set)
  • Logs on to the website
  • Accesses the File Release System
  • Uploads release and change notes
  • Sets the file types
  • Sends email notification (optional)
  • Submits a news item (optional)

Typically it manages this in a couple of minutes. Now that's a lot faster and a lot less hassle than doing it manually!!!

I've already released two Joda-Time subprojects as well as the maven-sourceforge-plugin itself, so I'm pretty happy that it works. So next time you want to release something to SourceForge, just do 'maven sourceforge:deploy'.