Tuesday, 29 August 2006

Closures - async bad, sync good

Closures. Great or not? Well, given it's properly up for discussion, I thought I'd add my opinions. (I'll assume you've read the various blogs - Neal's most recent two are perhaps the most helpful).

Two separate use cases have been identified for consideration - async and sync. Async is where code is to be executed later, such as timer tasks. Sync is where the code is executed immediately by the current thread. The immediate question is "why is there a need for one language syntax to meet both requirements?".

Async use case

The async case is already dealt with in Java - with inner classes (typically anonymous). With inner classes, the timer task asks us to pass in 'what we want doing'. So we pass in an object that represents the 'task'. It's logically more than just a block of code.
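
As a reminder of what that looks like today, here is a minimal sketch using java.util.Timer with an anonymous TimerTask. The class name AsyncTaskExample, the method runLater() and the 50ms delay are just my choices for illustration; the latch is only there so the calling thread can see that the task ran.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AsyncTaskExample {

    /** Schedules a task for later and reports whether it ran within one second. */
    public static boolean runLater() {
        final CountDownLatch done = new CountDownLatch(1);
        Timer timer = new Timer(true); // daemon timer thread

        // The 'task' is a real object, created here and executed later
        // on the timer thread, not on the thread that calls runLater().
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                // A 'return' here would only leave run(), never runLater().
                done.countDown();
            }
        }, 50L);

        try {
            return done.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel();
        }
    }

    public static void main(String[] args) {
        System.out.println("task ran: " + runLater());
    }
}
```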

Now some may complain that it's too much typing or too much clutter. But we already have it. The need is met. Now let's look at some issues with async closures.

a) The syntax of an async closure will be subtly different to that of a sync closure. Once written however, you won't be able to tell which is which, until you try to edit it. This fails an important language syntax readability test. In effect, the developer is being lied to, as they are writing a method with limited control flow, but it looks no different to a code block from the sync case.

b) Refactoring an anonymous inner class to a closure converted inner class is not straightforward. (Neal uses the test of how easy it is to refactor code from a method into the closure - this is the opposite test). This point is really about which refactoring should be easy. Moving code to execute in another thread is a serious refactoring, with potentially serious impact. Surely, it's not unreasonable for that to require brainpower?

c) An async closure will produce weird error messages - 'nonlocal return/break/continue'. What do these mean, or rather how can they be explained? Let's use this test - can you write an error message for the compiler in 12 words that a junior developer will understand?

d) May discourage re-use of 'task' instances, which as real objects should often be designed and re-used. Developers will simply be less aware that they are passing an object around (as it doesn't look like an object). Thus there may well be more inlined code.

e) Multi-method adaptor classes like those in Swing don't play well. In fact, you'll have to write an inner class...
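
To make point (e) concrete, here is a small sketch of a Swing-style adaptor where two methods belong together in one listener object. MouseAdapter and MouseEvent are the real java.awt.event types; the HoverTracker class and its create() method are names I've made up for the example. A single block of code has no obvious way to say which of the two methods it is implementing, so an inner class is still needed.

```java
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;

public class HoverTracker {

    /** Builds one listener object that handles two related events. */
    public static MouseAdapter create(final StringBuilder log) {
        return new MouseAdapter() {
            @Override
            public void mouseEntered(MouseEvent e) {
                log.append("entered;");
            }

            @Override
            public void mouseExited(MouseEvent e) {
                log.append("exited;");
            }
        };
    }
}
```

It would be attached in the usual way, for example with component.addMouseListener(HoverTracker.create(log)).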

Point (a) is really important. Thus, it's no surprise that at present, I am very much on the side of inner classes for the async use case and against closures.

Sync use case

The sync use case is entirely different. Here, the whole point of the use case is that the code is executed inline within the method. Thus, it really should be pretty indistinguishable from any other block of code. And it has the potential to add real value to the language.

I like to think of this as being implemented in a very simple way. The compiler could take the first half of the closure implementation and insert it before the closure code block, and the second half after the code block. Very simple, the only downside being bloat of the client code. For example, considering:

public void read() {
  closeAfter() (InputStream in) {
    in = openStream();
    if (in == null) return;
    readStream(in);
  }
}

public static void closeAfter(void(InputStream) closure) {
  InputStream in = null;
  try {
    closure(in);
  } finally {
    if (in != null) in.close();
  }
}

The compiler would produce:

public void read() {
  InputStream in = null;
  try {
    in = openStream();
    if (in == null) return;
    readStream(in);
  } finally {
    if (in != null) in.close();
  }
}

Now, I'm no language implementation expert, but this looks very simple to do: exceptions and control flow just happen naturally. In particular, data assigned to the variable 'in' is available for use at the end of the closure implementation.
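
To check the claim about control flow, here is a hand-written version of that expansion in plain Java that can actually be run. It is only a sketch: InlinedCloseAfter and TrackedStream are hypothetical names, the stream is passed in rather than opened, the body counts bytes in place of readStream(), and close() is guarded against a null stream so that the early-return path can be exercised.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class InlinedCloseAfter {

    /** Records whether close() was called, so the effect is observable. */
    public static class TrackedStream extends ByteArrayInputStream {
        private boolean closed = false;

        public TrackedStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }

        public boolean isClosed() {
            return closed;
        }
    }

    /** The block body sits inside the try; the 'closure method' wraps around it. */
    public static int read(InputStream source) throws IOException {
        InputStream in = null;
        int total = 0;
        try {
            in = source;                  // stands in for openStream()
            if (in == null) return -1;    // early return still runs the finally
            while (in.read() != -1) {
                total++;
            }
            return total;
        } finally {
            if (in != null) in.close();   // 'in' assigned in the body is visible here
        }
    }
}
```

Both the normal path and the early return pass through the finally block, with no special handling needed.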

Now, I just need to provide a nice syntax for returning a value from the closure. Maybe assigning it to a parameter like we did with 'in'?

So, yes, I probably am in favour of sync closures (combined with extension methods). Just so long as the syntax and implementation are simple and without external artifacts.

Of course it could be I've misunderstood something, but really the two use cases seem poles apart to me. Closures fit with the sync case, but not with the async one (in Java - in other languages it is probably very different). Trying to make one 'syntax' fit both is misleading and dangerous, as the complete syntax must also take into account errors and unusual conditions and be fully consistent across both.

Wednesday, 23 August 2006

Extension methods and closures

Extension methods are a C# feature that just could prove very useful in Java, particularly with the possibility of closures in Java (something which should be the topic of a blog in its own right). So, what is an extension method, and why might it be useful? Consider the following code:

public class Collections {
 public static void sort(List list, Comparator comp) {
   ...
 }
}
public class MyClass {
 public void myMethod() {
  List list = ...;
  Comparator comp = ...;
  Collections.sort(list, comp);
 }
}

This is standard JDK code. To sort a list you have to use the Collections static methods. But that means you need to know that they exist there - they won't auto complete like other list methods in an IDE for example. What if we could find a way to alter this slightly?

public class Collections {
 @extensionMethod(Sort)
 public static void sort(List list, Comparator comp) {
   ...
 }
}
public class MyClass {
 public void myMethod() {
  List list = ...;
  Comparator comp = ...;
  list.Sort(comp);
 }
}

So, by adding an attribute (or some other syntax mechanism - C# uses 'this' rather strangely) and importing the Collections class, we can then call the sort method in an OO style in the client code myMethod(). I have deliberately set up a coding standard, so that the extension method has a capital first letter. This would allow someone reading the code to spot that it is not a standard method on the class. Behind the scenes, all it does is call the static method in a static way - no magic involved.
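
To show the 'no magic' part, here is a runnable sketch of what the compiler would end up emitting under this idea: the ordinary static call that exists in the JDK today. The class SortDesugared, the method sortedByLength() and the length-based comparator are made up for the example; only Collections.sort is real.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortDesugared {

    /** Returns a copy of the input sorted by string length. */
    public static List<String> sortedByLength(List<String> input) {
        List<String> list = new ArrayList<String>(input);
        Comparator<String> comp = new Comparator<String>() {
            public int compare(String a, String b) {
                return a.length() - b.length();
            }
        };

        // With the hypothetical extension method the source would read:
        //   list.Sort(comp);
        // and it would be compiled as the plain static call below.
        Collections.sort(list, comp);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(sortedByLength(Arrays.asList("ccc", "a", "bb")));
    }
}
```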

This brings benefits in being able to effectively extend code that is locked down in some way - perhaps because it's an interface (such as the JDK List interface), or perhaps because it's a 3rd party library that you don't want to change, but is really missing a very useful little method.

So, how does this link to closures? Well, many closure advocates want to add methods to Collection, List and Map to allow closures to operate directly on the collection. But this simply isn't going to happen as they are interfaces and backwards compatibility must be maintained. However, extension methods offer a solution - a way to apparently add methods to the collection interfaces. Thus you could code with closures:

public class Collections {
 @extensionMethod(Each)
 public static <T> void each(Collection<T> list, void(T) closure) {
  for (T item : list) {
   closure(item);
  }
 }
}
public class MyClass {
 public void myMethod() {
  List<String> list = ...;
  list.Each() (String str) {
   System.out.println(str);
  };
 }
}
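
For comparison, here is roughly the nearest thing that compiles in Java today, as a sketch. The Block interface is a hypothetical stand-in for the void(T) closure type, and EachExample and upperCased() are names invented for the example. The helper has to be called statically and the 'closure' has to be an anonymous inner class, which is exactly the clutter the two proposals together would remove.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class EachExample {

    /** Hypothetical single-method callback standing in for void(T). */
    public interface Block<T> {
        void invoke(T item);
    }

    /** Plain static helper; an extension method would let it read as list.Each(...). */
    public static <T> void each(Collection<T> items, Block<T> block) {
        for (T item : items) {
            block.invoke(item);
        }
    }

    /** Uses each() with an anonymous inner class to collect upper-cased copies. */
    public static List<String> upperCased(List<String> input) {
        final List<String> out = new ArrayList<String>();
        each(input, new Block<String>() {
            public void invoke(String str) {
                out.add(str.toUpperCase());
            }
        });
        return out;
    }

    public static void main(String[] args) {
        System.out.println(upperCased(Arrays.asList("a", "b")));
    }
}
```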

So, are there any views about extension methods? Are they really an essential add-on to the closure proposal?

Which days of the week are the weekend in Egypt?

Do you know? Or more precisely, does your Java application know?

Well, yesterday I released a 0.1 version of a new project Joda-Time-I18N. This provides four lookups at present:

  • Time zone by territory
  • First day of week by territory
  • Business days by territory
  • Weekend days by territory

The data comes from the Unicode CLDR project, so should be pretty accurate (it's backed by IBM and Sun).

The code is released now to get some feedback, and to allow early users access. The 0.1 version refers to the chance of the API changing (high), not the quality of the data (which is good). Oh, and the weekend in Egypt?

  Territory t = Territory.forID("EG");
  int weekendStartDay = t.getWeekendStart();
  int weekendEndDay = t.getWeekendEnd();

Anyway, I'd love to hear any feedback on the API, data or proposed enhancements.

Wednesday, 2 August 2006

Joda-Time 1.3 released

Well, it took a while, but the next release of Joda-Time is finally here. Version 1.3 includes two main enhancements in addition to the bug fixes:

Three new datetime classes have been added - LocalDate, LocalTime and LocalDateTime. These represent a date, time or datetime without a time zone. As such LocalDate is a direct replacement for YearMonthDay, whilst LocalTime replaces TimeOfDay. Neither of the replaced classes has been deprecated yet, however, due to their widespread adoption.

The new Local* classes use a more standard and reliable implementation internally, and fix some weird external semantics of the older classes. For example, it was not possible to query the dayOfWeek on a YearMonthDay (because it only holds year, month and day!). LocalDate uses a different implementation, so that restriction no longer applies.

The second major enhancement is the addition of convenience methods to change each field value - the immutable equivalent of set methods. Here is how to set the day to the first day of the month, contrasting before and after:

// previously
firstOfMonth = dt.dayOfMonth().setCopy(1);

// version 1.3
firstOfMonth = dt.withDayOfMonth(1);

Clearly, the new code is more readable and understandable. Note that as Joda-Time objects are immutable, the methods return a new instance with that field value changed. This is emphasised by the use of the verb 'with' rather than 'set'.
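
For anyone unfamiliar with the 'with' idiom, here is a tiny self-contained sketch of the pattern. SimpleDate is a hypothetical class written just for this illustration, with no validation at all; it is not how the Joda-Time classes are implemented.

```java
public final class SimpleDate {

    private final int year;
    private final int month;
    private final int day;

    public SimpleDate(int year, int month, int day) {
        this.year = year;
        this.month = month;
        this.day = day;
    }

    public int getDayOfMonth() {
        return day;
    }

    /** Returns a copy with the day changed; this instance is left untouched. */
    public SimpleDate withDayOfMonth(int newDay) {
        return new SimpleDate(year, month, newDay);
    }

    public static void main(String[] args) {
        SimpleDate dt = new SimpleDate(2006, 8, 2);
        SimpleDate firstOfMonth = dt.withDayOfMonth(1);
        // The original keeps its value; only the returned copy has changed.
        System.out.println(dt.getDayOfMonth() + " / " + firstOfMonth.getDayOfMonth());
    }
}
```

The result of a 'with' method must be assigned to something, as calling it and ignoring the return value changes nothing.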

As always, feedback is welcomed - whether bugs or reviews, good or bad! Only with input from the community can the library improve.