Tuesday 18 December 2007

Closures - Comparing the core of BGGA, CICE and FCM

In this blog I'm going to compare the core of the three principal 'closure' proposals. This is particularly apt following the recent surge in interest after Josh Bloch's Javapolis talk.

Comparing 'closure' proposals

The three principal closure proposals are:

  • BGGA - full closures for Java
  • CICE - simplified inner classes (with the related ARM proposal)
  • FCM - first class methods (with the related JCA proposal)

It is easy to get confused when evaluating these competing proposals. So much new syntax. So many new ideas. For this blog I'll summarise, and then deep dive into one area.

The first key point that I have been making recently is that closures are not 'all or nothing' - there are parts to the proposals that can be implemented separately. This table summarises the 5 basic parts in the proposals (I've split the last one into two rows):

  Feature                           BGGA                      CICE+ARM   FCM+JCA
  Literals for reflection           -                         -          FCM member literals
  References to methods             -                         -          FCM method references
  One method callback classes       BGGA closures             CICE       FCM inner methods
  Function types                    BGGA function types       -          FCM method types
  Library based control structure   BGGA control invocation   -          JCA control abstraction
  Language based control structure  -                         ARM        -

Of course a table like this doesn't begin to do justice to any of the three proposals. Or to express the possible combinations (for example, BGGA could add support for the first two rows easily). So, instead of talking generally, let's examine a key use case where all 'closure' proposals work.

One method callbacks

The issue here is the complexity and verbosity of writing an anonymous inner class with one method. These are typically used for callbacks - where the application code passes a piece of logic to a framework to be executed at some later time. The classic example is the ActionListener:

  public void init() {
    button.addActionListener(new ActionListener() {
      public void actionPerformed(ActionEvent ev) {
        // handle the event
      }
    });
  }

This registers a callback that will be called by the framework whenever the button is pressed. Note that the code in the listener will typically run long after the init() method has completed. While this is an example from Swing, the same design appears all over Java code.

Some of the issues with this code are immediately obvious. Some are not.

  1. The declaration of the listener is very verbose. It takes a lot of code to define something that is relatively simple, and importantly, that makes it harder to read.
  2. Information is duplicated that could be inferred by the compiler. It's not very DRY.
  3. The scope of the code inside actionPerformed() is different to that of the init() method. The this keyword has a different meaning.
  4. Any variables and methods from the interface take precedence over those available in init(). For example, an unqualified call to toString(), with any number of parameters, will always resolve against the inner class, not the class that init() is declared in. In other words, the illusion that the code in actionPerformed() has full access to the code visible in init() is just that - an illusion (see the sketch below).
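
To make issues 3 and 4 concrete, here is a minimal sketch (the Example class and its toString() override are invented purely for illustration):

  import java.awt.event.ActionEvent;
  import java.awt.event.ActionListener;
  import javax.swing.JButton;

  public class Example {
    private final JButton button = new JButton();
    public String toString() { return "outer Example"; }
    public void init() {
      button.addActionListener(new ActionListener() {
        public void actionPerformed(ActionEvent ev) {
          // issue 3: 'this' is the anonymous ActionListener, not Example
          System.out.println(this);
          // issue 4: toString() resolves against the inner class, so this prints
          // something like "Example$1@1a2b3c" rather than "outer Example";
          // Example.this.toString() is needed to reach the outer method
          System.out.println(toString());
        }
      });
    }
  }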

The three closure proposals differ in how they tackle this problem area. BGGA introduces full closures. These eliminate all the problems above, but introduce new issues with the meaning of return. CICE introduces a shorthand way of creating an inner class. This solves the first issue, and some of the second, but does not solve issue 3 or 4. FCM introduces inner methods. These eliminate all the problems above, and add no new surprises.

  // BGGA
  public void init() {
    button.addActionListener({ActionEvent ev =>
      // handle the event
    });
  }
  // CICE
  public void init() {
    button.addActionListener(ActionListener(ActionEvent ev) {
      // handle the event
    });
  }
  // FCM
  public void init() {
    button.addActionListener(#(ActionEvent ev) {
      // handle the event
    });
  }

In this debate, it is easy to get carried away by syntax, and say that one of these might look 'prettier', 'more Java' or 'ugly'. Unfortunately, the syntax isn't that important - it's the semantics that matter.

BGGA and FCM have semantics where this within the callback refers to the surrounding class, exactly as per code written directly in the init() method. This is emphasised by removing all signs of the inner class. The scope confusion of inner classes (issues 3 and 4 above) disappears (as this is 'lexically scoped').

CICE has semantics where this within the callback still refers to the inner class. Only now you can't even see the inner class - it's really well hidden. This simply perpetuates the scope confusion of inner classes (issues 3 and 4 above) and gains us little other than less typing. In fact, as the inner class is more hidden, this might actually be more confusing than today.

BGGA also has semantics where a return within the callback will return from init(). This generally isn't what you want, as init() will be long gone when the button press occurs. (It should be noted that this feature is a major gain to the overall power of Java, but comes with a high complexity factor)

FCM and CICE have semantics where return within the callback returns from the callback, as in an inner class. This enables easy conversion of code from today's inner classes. (For FCM, the return is considered to be back to the # symbol, a simple rule to learn)

And here is my personal summary of this discussion:

  // BGGA
  public void init() {
    button.addActionListener({ActionEvent ev =>
      // return will return to init() - BAD for many (most) use cases
      // this means the same as within init() - GOOD and simple
    });
  }
  // CICE
  public void init() {
    button.addActionListener(ActionListener(ActionEvent ev) {
      // return will return to caller of the callback - GOOD for most use cases
      // this refers to the inner class of ActionListener - BAD and confusing
    });
  }
  // FCM
  public void init() {
    button.addActionListener(#(ActionEvent ev) {
      // return will return to caller of the callback - GOOD for most use cases
      // this means the same as within init() - GOOD and simple
    });
  }

Summary

This blog has evaluated just one part of the closure proposals. And this part could be implemented without any of the other parts if so desired. (BGGA would probably find it trickiest to break out just the concept discussed, but it could be done.)

The point is that there are real choices to be made. The simplicity and understandability of Java must be maintained. But that doesn't mean standing still.

Once again, I'd suggest FCM has the right balance here, and that anyone interested should take a look. Perhaps the right new features for Java are method references and inner methods, and no more?

Opinions welcome as always :-)

Saturday 15 December 2007

Voting on Java 7 language changes

This year at Javapolis I again looked after the whiteboards. In this post, I'll discuss the results.

Whiteboard results

The whiteboards play an important part at Javapolis in bringing people together to discuss ideas and future trends.

This year, I chose to focus the whiteboard debate around a number of specific topics. This was achieved by having votes on 10 language changes proposals, plus areas for new ideas for Spring, JSF, JavaEE and JavaSE. I'm going to discuss the language change votes here, as the photos of the other boards aren't available yet.

Language change votes

Before the conference started, there were a number of language change proposals floating about, initiated by Neal and Josh at Google. These were discussed in a late-night BOF on Thursday, and backed up by the votes on the whiteboards.

The full results are available (with more detailed examples), but here is the summary.

The strongest support was for Improving Generics, String Switch and Multi-catch. Attendees were giving a clear 'yes' to these being in Java 7.

The strongest opposition was to Typedef (type aliasing) and Extension methods and chaining. Attendees were giving a clear 'no' to these being in Java 7.

There was some support for Better null handling (with a new operator) and Properties. The balance was slightly against, but not overwhelming.

There was positive support for Using [] to access lists and maps. The balance was definitely in favour, yet there was a large minority against. This was perhaps the most surprising vote. Unfortunately, it wasn't possible to find out why there were so many 'no' votes. My view is that it is still worth pursuing this for Java 7.

Summary

Java being open source isn't going to be enough going forwards - it will also be necessary to provide an open community. Votes and direct feedback like this have a really positive effect, allowing Java's growth to be influenced by real developers and their problems.

The Javapolis Experience 2007

So, Javapolis is over for another year! Big thanks and congratulations to the Javapolis team for making it such an excellent conference.

This year I, personally, had more of a focus on the networking aspect of the conference - trying to take time to chat to many of the delegates and speakers. This is often where you make the interesting and surprising connections that will come in useful in the future.

Whiteboards

The whiteboards again played a really interesting part in getting people talking. This year, I left half the whiteboards as freeform, available for any comments. The other half were carefully organised to get a snapshot of opinions on specific topics.

These fell into three areas. Firstly, boards for collecting ideas for changes in Spring, JSF, J2EE and Java 7. Next up was a board gathering information on how many people are using each application server and each web framework.

Finally, about 10 language change proposals were described. Each of these was then offered with a vote - "Do you support this change" - Yes/No/Maybe. In the region of 150 votes were received for each of these proposals, so hopefully the results should be pretty representative. I'll discuss the results in my next blog post.

Closures

The other big topic of the conference was again closures. The first talk on the topic was a well-attended BOF from Neal Gafter discussing the current status of BGGA.

The second was a last minute change - something you'd never see at JavaOne! Josh Bloch gave a talk on 'The Closures Controversy', where he showed some examples of complexity added to the language with BGGA closures. He also showed the syntax proposals of CICE and ARM. This talk went down well with the audience, however it should be borne in mind that Neal had no opportunity to respond directly to the points made.

Finally, in another last minute change, I gave a 15 minute quick talk on FCM+JCA closures. Despite being at lunchtime on the Friday, there was still a reasonable group in the audience, and the feedback I received personally was good.

There was also a whiteboard discussing closures, with a vote - No closures vs CICE vs BGGA. The vote was about 2 to 1 in favour of BGGA before Josh's talk, while it was about 50/50 afterwards. This suggests to me that opinions on the topic are not fixed and can be swayed by the last good presentation people see.

Looking forwards, the hope is to have all three proposals (CICE, BGGA and FCM) discussed on JavaPosse.

Summary

Javapolis really is a great conference. It is extremely well run, and has a really good feel. Stefan has said that he intends to keep it at Metropolis, Antwerp next year, so since it sold out this year, I suggest you get your tickets early next year!

Monday 10 December 2007

Javapolis 2007

I am just getting ready to go to Javapolis 2007 :-)

This year I am presenting a main conference talk and a BOF:

In addition, there will be whiteboards again this year, and I will be attempting to keep the discussions on them moving forwards.

Hope to see you there!

Saturday 1 December 2007

Is the JCP broken?

Jean-Marie Dautelle has posted a question on Javalobby asking about the future of the JCP.

Java Community Process

This question leads into broader questions of governance in Java. Java is now open source, yet we still have no real clarification on how that works or affects the JCP.

In particular, who is the 'architect' overseeing the development of Java, and its future direction? Is it a named person in Sun? A hidden magical process? Or hit and hope?

Some examples:

1) Language changes. I, and many others, have suggested numerous language changes for Java 7. But who decides which, if any, go forward? How do people influence the process? How do people register opposition to changes?

At present, the only approach seems to be to work for Sun in a senior position, or work in Silicon Valley. What I'm looking for is a direction, for example 'Java will add new language features, but only to fill in gaps in the current language', or 'Java will not add any new language features'.

Personally, I've argued strongly for language change. I believe that there are quite a lot of changes, many quite small, that would help 'complete' the language. But what I really need is that architectural statement that says whether there will be any language change. If not, then I can move on, perhaps to a different language.

2) Core library changes. What if I want to add a method to one of the core libraries, such as String or Integer? How do I lobby for this to occur? Despite requests, it seems that this isn't worthy of a JSR. Yet, Google can propose a list of methods to add, and stand a big chance of succeeding.

Here, I'm arguing for an open JSR which examines all the core library utilities out there - Commons Lang/Collections/IO, Google Collections, Spring, Jodd, etc. and works out the commonality. Plus, anyone could propose their own ideas. But it needs organising, and it needs funding.

3) Modules. The shameless behaviour of Sun wrt OSGi shows the JCP in a very poor light. Anyone who believes that the JCP functions well should examine this tragedy.

When JSR-277 started, the OSGi team were told that they were not to be invited onto the 14 person 'Expert Group' despite being probably the most experienced people on the planet in managing modules. Apparently, the expert group was full at 14. Today, it has 20 people. Pure politics.

So, instead of adopting a widely proven, top-notch module system suitable for everything from mobile phones to cars to the enterprise, Sun has gone 'Not Invented Here'. It has desperately tried to reinvent OSGi and failed. The JCP now has 6 JSRs that could be solved simply by adopting OSGi. This is absurd.

Peter Kriens from OSGi has been repeatedly blogging about this, and how the weak requirements process at the JCP is a big issue.

And modules are the biggest change currently planned for Java 7, yet one that very few people are following or appreciating. Messing this one up will be a real blow to Java. It's like java.util.logging. Only far, far worse.

Issues

The issue is that the JCP, and Sun's approach to Java, seem pretty broken at present.

1) Sun ignores the JCP legal agreements it signs (ie. wrt Apache Harmony). This destroys all trust.

2) The JCP still exists to rubber stamp proposals from its big company members. Basically, a company submits code they already have, and the 'Expert Group' really exists to peer review it.

3) Neither the JCP nor Sun is giving any sense of direction to Java. There should be a clear sense of where Java is headed and what it is trying to be - a five to ten year vision.

4) The JCP is not eliminating duplicate JSRs (see OSGi). Worse, it is allowing the inferior option to succeed.

5) Individuals have too small a say. They tend to provide the innovation and disruption that keeps the big companies honest.

Remedies

1) Sun must regain the trust of the community, and stop fracturing the Java landscape. It must offer Apache Harmony the correct testing kit.

2) I would suggest a slightly larger executive committee. This should guarantee 20%, or maybe 25%, of seats to individuals, yet still provide enough room for companies such as Spring Source to get in (who just missed out at the last ballot).

3) Sun needs to either lose its special status on the JCP, or have it properly acknowledged. The former would mean it having to face a ballot every 3 years. Given what Sun is to Java, this probably isn't realistic. So, I'd argue that the EC should be honest about Sun, and that it isn't counted in the members of the EC (ie. it is the chair). Thus, I'd suggest an EC of 1 chair (Sun), 15 other companies, and 5 individuals.

4) The EC needs to stand up to Sun, and force a few issues - notably licensing and OSGi.

5) The JCP needs to setup an architecture group that takes 3 months to create and publish a five to ten year vision for Java.

A real alternative

The real remedy may however be to kill the JCP. We may not need it anymore. Consider these words:

Modularization creates a market for solutions. It is no longer required to come up with a catch-all API in the JCP, which is too often designed under political pressure. For example, J2EE would never have been a success in a free market. It received its life because it was the “ordained” API for enterprise applications by virtue of its JCP origin. It was not until an external party like Interface21 showed with Spring that Enterprise applications can actually be lean and mean as well.

So, the real solution would be to redefine Java as a core kernel, and an OSGi module system, with a central repository of modules that could be downloaded on demand. Vendors could then group together ad hoc to build new modules for everything from date-time to JSF. The market would then decide the winners. Thus the JCP would simply be the guardian of the language syntax and core libraries, an actually manageable task!

Finally...

Finally, as co-spec lead of JSR-310, I have to say that money is a problem. Despite all of the above, I have given freely of my time to run a JSR (Date and Time) which the vast majority of developers appear to really want, but which Sun had no desire whatsoever to run. To achieve this I have taken a 40% pay cut for 6 months by working 3 days a week instead of 5.

So, my advice to anyone considering running a JSR is don't. If the JCP wants to encourage individuals to continue running JSRs, then perhaps it needs to start paying them to do so.

Tuesday 27 November 2007

Java 7 - Extension methods

A recent document revealed some possible language changes that are being proposed. One of these is extension methods.

Extension methods

Extension methods allow a user to 'add' a method to an interface or class that they don't control. The original document is linked with BGGA closures, and was followed up by Peter Ahe.

The classic example of the proposal is as follows (known as use-site extension methods):

// in the application
import static java.util.Collections.sort;
List list = ...
list.sort();

Thus, the sort method appears to be a method on the list, even though we haven't actually changed the List interface. When it is compiled, the extension method is removed, and replaced with the static method call.
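
In other words, the compiled form is simply the static call (a sketch of the translation described, not a formal specification):

  // the list.sort() call above effectively compiles to a plain static method call:
  java.util.Collections.sort(list);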

Peter Ahe pointed out some flaws with this design. He also proposes declaration-site extension methods:

// in the JDK
public interface List {
  void sort() import static java.util.Collections.sort;
  ...
}

// in the application
list.sort();

In this case, it is easier for the user to find the implementation of the sort method, as it genuinely is a method on the interface, just one implemented elsewhere. Unfortunately, the downside is that now only the JDK authors can add methods to the List interface in this way.

My view is that the freedom given to the user by the use-site approach is far more desirable, but the possible side-effects are nasty. My preference would be to add a visible marker to show that this isn't a regular method:

// in the application
import static java.util.Collections.sort;
List list = ...
list.do.sort();

Note the '.do.'. This clearly signals to the reader that something 'out of the ordinary' is going on.

In addition, I would argue that the static method should be marked to indicate that it can be used as an extension method, probably using an annotation:

// in the JDK
public class Collections {
  @ExtensionMethod
  public static void sort(List list) { ... }
}

This provides the final piece of the puzzle, preventing inappropriate methods from being used as extension methods. While it does suffer from the same issue of pushing control back to the library author (the JDK), it avoids the issue of which methods get added to the List interface.

The point is that as many methods as appropriate can be tagged with @ExtensionMethod. There is no conceptual overhead in doing so, and it doesn't expand the conceptual weight or complexity of the List interface itself.
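
For completeness, the marker annotation itself could be as simple as this (the name, retention and target here are my assumptions, not part of any official proposal):

  import java.lang.annotation.ElementType;
  import java.lang.annotation.Retention;
  import java.lang.annotation.RetentionPolicy;
  import java.lang.annotation.Target;

  /** Marks a static method as usable via the extension method call syntax. */
  @Retention(RetentionPolicy.CLASS)
  @Target(ElementType.METHOD)
  public @interface ExtensionMethod {
  }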

Summary

I've outlined an alternative proposal for extension methods, that tries to focus on freedom for users, without compromising readability or allowing confusing options.

Opinions welcome, as always :-)

Sunday 11 November 2007

Kijaro - for Java language changes

I'd like to announce the creation of a new project - kijaro. Kijaro is designed as a place where ideas for changes to the Java language can be implemented.

Kijaro

The kijaro project has been setup following various discussions on blogs, mailing lists and email. Its aim is to provide a very open place for those interested in implementing a change to javac to gather and code.

Kijaro is similar in its scope to the Kitchen Sink Language. The KSL project has been open for quite a while, but has yet to see any new features. KSL also aims to have code reviews and experienced compiler writers involved, which can be seen as quite formal.

Kijaro aims to be lightweight in rules:

  • Documentation. Each new language feature must have some form of associated document, even if it's just a blog. It doesn't have to be much, but should have an outline of why the feature is needed and the syntax implications.
  • Backwards compatibility. On svn TRUNK all existing Java code must compile.
  • Comments. Each change must have a comment so we can find it later, such as 'FCM-MREF'.

So far, the proponents of three language enhancements have expressed an interest in working at kijaro:

In fact, the FCM and Properties code is already checked in.

So, do you have a favourite language change that you want to see implemented in Java 7? Or, would you like to download and try out one of these changes? Then, please join us at kijaro! The more ideas we get implemented the better!

After all, real working prototypes tend to produce good feedback and really encourage the decision makers that any change is practical.

Tuesday 6 November 2007

Implementing FCM - part 1

It's been a long coding weekend, but after much head scratching, I might finally be getting the hang of changing the Java compiler. My goal? To implement first class methods (FCM).

Implementing FCM

Currently, on my home laptop I have an altered version of javac, originally sourced from the KSL project, that supports method literals (section 2.1 of the spec). What this means in practice is that you can write:

  Method m = String#substring(int,int);

This will now compile and return the correct Method object. The left hand side can be a reference to an inner class such as Map.Entry without problem. It will also validate the standard access rules, preventing you from referencing a private/package/protected method when you shouldn't.
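
For example, based on the description above, literals like these should compile (assuming java.lang.reflect.Method and java.util.Map are imported):

  Method substring = String#substring(int,int);   // method on a top-level class
  Method entryKey = Map.Entry#getKey();           // method on a nested interface
  // a literal referencing a private or package-scoped method you could not normally
  // call is rejected at compile time, just like a normal method invocation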

In fact, the implementation is currently more generous than this, as the left hand side can also be a variable. This is needed for later phases of FCM, but isn't really appropriate for a method literal.

Making a language change like this is definitely intimidating though, mostly because the compiler feels like such a big scary beast. The first two parts - Scanner and Parser - are actually pretty easy. Once those are done you have an AST (Abstract Syntax Tree) representing your code. In this case that involves a new node, currently called 'MethodReference'.

After that it gets more confusing. The main piece of work should be in Lower, which is where syntax sugar is converted to 'normal' Java. For example the new foreach loop is converted to the old one here. Since method literals are just syntax sugar, Lower is where the change goes.
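
As an illustration, the method literal from earlier is roughly equivalent to this hand-written reflection code (how the generated code actually caches the Method and handles the checked exception is a compiler detail I'm glossing over):

  // Method m = String#substring(int,int);  desugars to something like:
  Method m;
  try {
    m = String.class.getMethod("substring", int.class, int.class);
  } catch (NoSuchMethodException ex) {
    throw new NoSuchMethodError(ex.getMessage());
  }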

Unfortunately, to get to Lower, you also have to fight with Attr, which is where the types get allocated, checked and validated. That was a real head scratcher, and I'm sure a compiler expert would laugh at my final code. It does work though, and after all that's the main thing.

The next step will be to convert 'MethodReference' to 'MemberReference' and enable Field and Constructor literals. After that, the real FCM work begins!

Publishing

So, by now you probably want to have a download link. But I'm not going to provide one. As the code is GPL, I can't just publish a binary, the source has to come too. And personally, I'd rather actually have it in a SVN repo somewhere before I let it escape at all (because that can probably count as the GPL source publication). So, I'm afraid you'll have to be patient :-)

This probably should go in the KSL, but right now I don't know whether it will. The trouble with the KSL is that it still has a few guards and protections around it. And to date, no-one other than Sun has committed there, even in a branch. I'm feeling more comfortable in creating a separate, 'fewer rules', project at sourceforge. In fact, it would be great if other javac hackers like Remi wanted to join too ;-)

One final thing... if anyone wants to help out with this work I'd much appreciate it - scolebourne|joda|org. Compiler hacking isn't easy, but it's a great feeling when you get it working!

Monday 29 October 2007

Joda-Time 1.5 released

Finally released today, after a long wait, is Joda-Time version 1.5. This release fixes a number of bugs, and contains a raft of enhancements, which should make it a desirable upgrade for most users.

One enhancement is to provide a way to handle awkward time zones where midnight disappears. In these zones, the daylight savings time (DST) spring cutover results in there being no midnight. Unfortunately, the LocalDate.toDateTimeAtMidnight() method requires midnight to exist and throws an exception if it doesn't. Version 1.5 adds a LocalDate.toDateTimeAtStartOfDay() method which handles this case by returning the first valid time on the date.
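
For example (the zone and date below are just one illustration of a DST gap at midnight):

  // Joda-Time 1.5
  DateTimeZone zone = DateTimeZone.forID("America/Sao_Paulo");
  LocalDate date = new LocalDate(2007, 10, 14);          // midnight does not exist in this zone
  DateTime start = date.toDateTimeAtStartOfDay(zone);    // first valid time on the date (01:00 local)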

The Period class has also received a lot of new methods. You can now easily convert to a Duration and other periods using the standard ISO definitions of seconds/minutes/hours/days. You can also normalize using these definitions, converting "1 year 15 months" to "2 years 3 months".
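
A quick sketch of these Period additions (method names as per the 1.5 release):

  Period period = new Period(1, 15, 0, 0, 0, 0, 0, 0);     // 1 year, 15 months
  Period normalized = period.normalizedStandard();          // 2 years, 3 months
  Seconds seconds = Period.hours(2).toStandardSeconds();    // 7200 seconds, using the ISO definitions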

Finally, it is now easier to tell if a given instant is in summer or winter time. This uses the new DateTimeZone.isStandardOffset() method.
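
For example, a simple check (the zone is chosen just for illustration):

  DateTimeZone london = DateTimeZone.forID("Europe/London");
  DateTime july = new DateTime(2007, 7, 1, 12, 0, 0, 0, london);
  boolean summerTime = !london.isStandardOffset(july.getMillis());   // true - July in London is BST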

Full details are available in the release notes or just go ahead and download! Maven jar, sources and javadoc are also available.

Friday 5 October 2007

JSR-310 and Java 7 language changes

Part of the difficulty I'm finding with designing JSR-310 (Dates and Times) is that I constantly come across gaps in the language of Java. My concern is that these gaps will shape the API to be less than it should be. Let me give some examples:

BigDecimal

JSR-310 is considering adding classes representing durations. (So is JSR-275, but that's another story.)

The aim of the duration classes is to meet the use case of representing "6 days", or "7 minutes". As a result, the first set of code uses int to represent the amount.

 public class Days {     // code is for blog purposes only!
   private int amount;
   ...
 }

However, what happens when you get down to seconds and milliseconds? Do we have one class that represents seconds, and a separate class that represents milliseconds? That seems rather naff. What would be better is to have a class to represent seconds with decimal places:

 public class Days {
   private double amount;
   ...
 }

But double is a no-no. Like float, it is unreliable - unable to represent some decimal values, and with sometimes unexpected answers from the maths. Of course, the answer is BigDecimal:

 public class Days {
   private BigDecimal amount;
 }

So, why am I, and others, reticent to use this 'correct' solution? It's because we don't have BigDecimal literals and operators. This is a clear case where language-level choices are affecting library-level design for the worse.

Immutables

JSR-310 is based around immutable classes, which are now a well recognised best practice for classes like dates and times. Unfortunately, the Java language is not well setup for immutable classes.

Ideally, there should be language level support for immutable classes. This would involve a single keyword to declare a class as immutable, and could allow certain runtime optimisations (so I'm told...).

 public immutable class DateTime {
   ...
 }

Unfortunately, it is probably too late for Java to do much on this one.

Self types

Another missing Java feature affecting JSR-310 is self-types. These are where a superclass can declare a method that automatically causes all subclasses to return the same type as the subclass:

 public abstract class AbstractDateTime {
   public <this> plusYears(Duration duration) {
     return factory.create(this.amount + duration);   // pseudo-code
   }
 }
 public final class DateTime extends AbstractDateTime {
 }
 // usage
 DateTime result = dateTime.plusYears(duration);

The thing to note is the 'this' generic-style syntax. The effect is that the user of the subclass is able to use the method and get the correct return type - a DateTime rather than an AbstractDateTime.

This can be achieved today by manually overriding the method in each and every subclass. However that doesn't work if you want to add a new method to the abstract superclass in the future and have it picked up by every subclass (which is what you want in a JSR, as a JSR doesn't have a perfect crystal ball for its first release).

Again, a language-level missing feature severely compromises a library-level JSR.

Operator overloading

Another area where JSR-310 may choose the 'wrong' approach is int-wrapper classes. For JSR-310 this would be a class representing a duration in years.

If I were to give you a problem description that said you need to model 'the number of apples in a shop', and to be able to add, subtract and multiply that number, you'd have a couple of design options.

The first option is to hold an int and perform regular maths using +, - and *. The downside is that only javadoc tells you that the int is a number of apples, instead of a number of oranges. This is exactly the situation before generics.

The second option is to create an Apples class wrapping the int. Now, you cannot confuse apples with oranges. Unfortunately, you also cannot use +, - and *. This is despite the fact that they are obviously valid in this scenario. Using plus/minus/multipliedBy/dividedBy methods just doesn't have the same clarity.
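
As a sketch of the trade-off (the Apples class is invented purely for illustration):

  // option 1 - a plain int; nothing stops you adding apples to oranges
  int apples = 6;
  apples = (apples + 2) * 3;

  // option 2 - a wrapper class; type-safe, but verbose without operator overloading
  public final class Apples {
    private final int amount;
    public static Apples apples(int amount) { return new Apples(amount); }
    private Apples(int amount) { this.amount = amount; }
    public Apples plus(Apples other) { return new Apples(amount + other.amount); }
    public Apples minus(Apples other) { return new Apples(amount - other.amount); }
    public Apples multipliedBy(int scalar) { return new Apples(amount * scalar); }
    public int getAmount() { return amount; }
  }
  // usage: apples(6).plus(apples(2)).multipliedBy(3) instead of (6 + 2) * 3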

Given this choice today, most architects/designers seem to be choosing the first option of just using an int. Language designers should really be looking at that decision and shaking their head. The whole point of the generics change was to move away from reliance on javadoc. Yet here, in another design corner, people still prefer a lack of real safety and javadoc to 'doing the right thing'. Why? Primarily because of the lack of operator overloading.

Ironically, I think this is an area that really might change in the future. But if JSR-310 has chosen to use int rather than proper classes by that point, a great opportunity will have passed.

Summary

My assertion that language-level features (or rather missing features) affect library design isn't really news. The interesting thing with JSR-310 is just how constraining the missing features are proving to be.

The trouble is that with no clear idea on where the future of the Java language lies, a JSR like 310 cannot make good choices which will fit well with future language change. The danger is that we simply create another date and time library that doesn't fit well with the Java language of 2010 or later.

Opinions welcome on whether JSR-310 should completely ignore potential language changes, or try to make a best guess for the future.

Monday 1 October 2007

Java 7 - Properties terminology

The debate around properties is hotting up. I'm going to try and contribute by defining a common terminology for everyone to communicate in.

Properties terminology

This is a classification of the three basic property types being debated. Names are given for each type of property. My hope is that the names will become widely used to allow everyone to explain their opinions more rapidly.

Note that although none of the examples show get/set methods, all of the options can support them. It should also be noted that the interface/class definitions are reduced and don't include all the possible methods.

Type 1 - Bean-independent property

(aka per-class, property-proxy, property adaptor)

This type of property consists of a type-safe wrapper for a named property on a specific type of bean. No reference is held to a bean instance, so the property is a lightweight singleton, exactly as per Field or Method. As such, it could be held as a static constant on the bean.

 public interface BeanIndependentProperty<B, V> {
   /**
    * Gets the value of the property from the specified bean.
    */
   V get(B bean);
   /**
    * Sets the value of the property on the specified bean.
    */
   void set(B bean, V newValue);
   /**
    * Gets the name of the property.
    */
   String propertyName();
 }

This type of property is useful for meta-level programming by frameworks. The get and set methods are not intended to be used day in day out by most application developers. Instead, the singleton bean-independent property instance is passed as a parameter to a framework which will store it for later use, typically against a list of actual beans.

A typical use case would be passing the properties of a bean to a comparator, for example comparing surnames, and if they are equal then forenames. Clearly, the comparator needs to be defined completely independently of any actual bean instance.

 Comparator<Person> comp = new MultiPropertyComparator<Person>(
   Person.SURNAME, Person.FORENAME   // bean-independent property defined as static constant
 );
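
To show why only the get method is needed by the framework, here is one possible implementation of such a comparator (MultiPropertyComparator is purely illustrative, not from any library):

  import java.util.Comparator;

  public class MultiPropertyComparator<B> implements Comparator<B> {
    private final BeanIndependentProperty<B, ? extends Comparable>[] properties;
    public MultiPropertyComparator(BeanIndependentProperty<B, ? extends Comparable>... properties) {
      this.properties = properties;
    }
    @SuppressWarnings("unchecked")
    public int compare(B bean1, B bean2) {
      for (BeanIndependentProperty<B, ? extends Comparable> property : properties) {
        int result = ((Comparable<Object>) property.get(bean1)).compareTo(property.get(bean2));
        if (result != 0) {
          return result;
        }
      }
      return 0;
    }
  }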

Bean-independent properties can be implemented in two main ways. The first option uses reflection to access the field:

 public class Person {
  public static final BeanIndependentProperty<Person, String> SURNAME =
      ReflectionIndependentProperty.create(Person.class, "surname");
  private String surname;
 }

The second option uses an inner class to access the field:

 public class Person {
  public static final BeanIndependentProperty<Person, String> SURNAME =
      new AbstractIndependentProperty("surname") {
        public String get(Person person) { return person.surname; }
        public void set(Person person, String newValue) { person.surname = newValue; }
      };
  private String surname;
 }

As bean-independent properties are static singletons, they should be stateless and immutable.

Type 2 - Bean-attached property

(aka per-instance)

This type of property consists of a type-safe wrapper for a named property on a specific instance of a bean. As the bean-attached property is connected to a bean, it must be accessed via an instance method.

 public interface BeanAttachedProperty<B, V> {
   /**
    * Gets the value of the property from the stored bean.
    */
   V get();
   /**
    * Sets the value of the property on the stored bean.
    */
   void set(V newValue);
   /**
    * Gets the name of the property.
    */
   String propertyName();
 }

This type of property is useful for frameworks that need to tie a property on a specific bean to a piece of internal logic. The most common example of this is many parts of the Beans Binding JSR, such as textfield bindings.

The get and set methods are not intended to be used day in day out by most application developers. Instead, the bean-attached property instance is passed as a parameter to a framework which will store it for later use.

A typical use case would be defining the binding from a person's surname to a textfield. In this scenario we are binding the surname on a specific bean to the textfield on a specific form.

 bind( myPerson.surnameProperty(), myTextField.textProperty() );

Bean-attached properties can be implemented in two main ways. The first option uses reflection to access the field. This example shows the attached property being created anew each time (on demand); however, it could also be cached or created when the bean is created:

 public class Person {
  public BeanAttachedProperty<Person, String> surnameProperty() {
    return ReflectionAttachedProperty.create(this, "surname");
  }
  private String surname;
 }

The second option uses an inner class to access the field. Again, this example shows the attached property being created anew each time (on demand); however, it could also be cached or created when the bean is created:

 public class Person {
  public BeanAttachedProperty<Person, String> surnameProperty() {
    return new AbstractAttachedProperty("surname") {
        public String get() { return surname; }
        public void set(String newValue) { surname = newValue; }
      };
  }
  private String surname;
 }

Bean-attached properties are merely pointers to the data on the bean. As such, they should be stateless and immutable.

Type 3 - Stateful property

(aka bean-properties, property objects, Beans 2.0, Eclipse IObservableValue)

Note that stateful properties have been implemented in the bean-properties project, however that project goes well beyond the description below.

This type of property consists of a stateful property on a specific instance of a bean. The property is a fully fledged object, linked to a specific bean, that holds the entire state of the property, including its value. This approach is based around an alternative to the standard beans specification, however get/set methods can be added if desired.

 public class StatefulProperty<B, V> {
   private B bean;
   private V value;
   /**
    * Constructor initialising the value.
    */
   public StatefulProperty(B bean, V initialValue) {
     this.bean = bean;
     value = initialValue;
   }
   /**
    * Gets the value of the property from the stored bean.
    */
   public V get() { return value; }
   /**
    * Sets the value of the property on the stored bean.
    */
   public void set(V newValue) { value = newValue; }
   /**
    * Gets the name of the property.
    */
   public String propertyName() { ... }
 }

This type of property is intended for application developers to use day in day out. However it is not intended to be used in the same way as normal get/set methods. Instead, developers are expected to code a bean and use it as follows:

 // bean implementation
 public class Person {
  public final StatefulProperty<Person, String> surname =
      new StatefulProperty<Person, String>(this, "surname");
 }
 // usage (instead of get/set)
 String s = myPerson.surname.get();
 myPerson.surname.set("Colebourne");

The important point to note is that the surname field on the bean is final and a reference to the stateful property. The actual state of the property is within the stateful property, not directly within the bean.

A stateful property fulfils the same basic API as a bean-attached property (type 2). As such it can be used in the same use cases. However, because a stateful property is an object in its own right, it can have additional state and metadata added as required.

The key difference between bean-attached (type 2) and stateful (type 3) is the location of the bean's state. This impacts further in that bean-attached properties are intended primarily for interaction with frameworks, whereas stateful properties are intended for everyday use.

Language syntax changes

Both bean-independent (type 1) and bean-attached (type 2) properties benefit from language syntax change. This would be used to access the property (not the value) to pass to the framework in a compile-safe, type-safe, refactorable way:

 // bean-independent - accessed via the classname
 BeanIndependentProperty<Person, String> property = Person#surname;
 
 // bean-attached - accessed via the instance
 BeanAttachedProperty<Person, String> property = myPerson#surname;

Stateful properties (type 3) do not need a language change to access the property as they are designed around it, making it directly available and fully safe.

The second area of language change is definition of properties. This is a complicated area, which I won't cover here, where all three types of property would benefit from language change.

The third area of language change is access to the value of a property. Again, there are many possible syntaxes and implications which I won't cover here.

Combination

The full benefit of bean-independent (type 1) and bean-attached (type 2) properties occurs when they are combined. Additional methods can be added to each interface to aid the integration, and any language syntax change becomes much more powerful.
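
For example, one obvious integration method would let a bean-independent property be attached to a specific bean on demand (the method name here is my assumption, not part of any proposal):

  public interface BeanIndependentProperty<B, V> {
    V get(B bean);
    void set(B bean, V newValue);
    String propertyName();
    /**
     * Binds this property to a specific bean, yielding a bean-attached property.
     */
    BeanAttachedProperty<B, V> attachTo(B bean);
  }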

Summary

I hope that these definitions will prove useful, and provide a common terminology for the discussion of properties in Java.

Opinions are welcome, and I will make tweaks if necessary!

Friday 28 September 2007

JSR-275 - Suitable for Java 7?

I've been evaluating JSR-275 - Units and Quantities tonight in relation to JSR-310. It's not been pretty.

JSR-275 - Units and Quantities

JSR-275 is a follow on from JSR-108, and aims to become part of the standard Java library in the JDK. It provides a small API for quantities and units, and a larger implementation of physical units, conversions and formatting. The goal is to avoid using ints for quantities, with their potential for mixing miles and metres.

So where's the problem?

Well, it starts with naming. There is a class called Measure which is the key class. But it's what I would call a quantity - a number of units, such as 6 metres or 32 kilograms.

Measure is generified by two parameters. The first is the value type. You'd expect this to be a Number, but weirdly it is any class you like.

You'd expect the second generic parameter to be the unit type, right? Er, no. It's the Quantity. So, not only is the quantity class not called Quantity, there is another class which doesn't have much to do with a quantity that is called Quantity. Confused yet?

The unit class is at least called Unit. Unit also has a generic parameter, and again it's Quantity.

So, what is this mysterious Quantity class in JSR-275?

Well, it's the concept of length or temperature or mass. Now actually that is a very useful thing for JSR-275 to have in the API. By having it in the API, it allows the generic system to prevent you from mixing a length with a mass. Which must be a Good Thing, right?

 Measure<Integer, Length> distance = Measure.valueOf(6, KILO(METRES));
 int miles = distance.intValue(MILES);

The first thing to notice is that the variable is held using a Length generic parameter. There is no support in JSR-275 to specify the variable to only accept metres or miles. Personally, I think that's a problem.

The second thing to notice is how easy conversion is. It just happened when it was requested. But is that what you necessarily want? We just converted from kilometres to miles without a thought. What about lost accuracy? Are those units really compatible?

Well now consider dates and times:

 Measure<Integer, Duration> duration = Measure.valueOf(3, MONTHS);
 int days = duration.intValue(DAYS);

Er, how exactly did we manage to convert between months and days? What does that really mean? Well, assuming I've understood everything, what it does is use a definition of a year as 365 days, 5 hours, 49 minutes, and 12 seconds, and a month being one twelfth of that. It then goes ahead and does the conversion from months to days using that data.
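
Working that definition through (my arithmetic, just to make the point):

  // 1 year   = 365 days 5h 49m 12s = 31,556,952 seconds
  // 1 month  = 31,556,952 / 12     =  2,629,746 seconds
  // 3 months =  7,889,238 seconds  = about 91.3 days, so intValue(DAYS) yields 91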

I doubt that the result will be what you want.

In fact, perhaps what has happened here is that we have substituted the unsafe int with a supposedly safe class. With JSR-275 we relax and assume all these exciting conversions will just work and nothing will go wrong. In short, we have a false sense of security. And that's perhaps one of the worst things an API can do.

I then looked at the proposal for integrating JSR-275 and JSR-310. One idea was for the JSR-310 duration field classes (Seconds, Minutes, Days, Months, etc) to subclass Measure. But that isn't viable as Measure defines static methods that are inappropriate for the subclasses. Measure also defines too many regular methods, including some that don't make sense in the context of JSR-310 (or at least the JSR-310 I've been building).

Looking deeper, what I see is that JSR-275 is focussed on conversions between different units. There is quite a complex API for expressing the relationships between different units (multipliers, powers, roots, etc). This then feeds right up to the main public API, where a key 'service' provided by the API is instant conversion.

But while conversion is important, for me it has come to dominate the API at the expense of what I actually wanted and expected. My use case is a simple model to express a quantity - a Number plus a Unit.

Alternative

So, here is my mini alternative to JSR-275:

 public abstract class Quantity<A extends Number, U extends Unit> {
  public abstract A amount();
  public abstract U unit();
 }
 public abstract class Unit {    // grams or celsius
  public abstract String name();
  public abstract Scale scale();
 }
 public abstract class Scale {   // mass or temperature
  public abstract String name();
 }
 
 Quantity<Integer, MilesUnit> distance = ...

So, what have we lost? Well, all the conversion for a start. In fact, all we actually have is a simple class, similar to Number, but for quantities. And the unit is very bare-bones too.

We've also lost some safety. There is now no representation of the distance vs duration vs temperature concept. And that could lead us to mix mass and distance. Except that the API will catch it at runtime, so it's not too bad.

And what have we gained? Well, it's a much smaller and simpler API, but less functional. However, by being simpler, it's easier for other teams like JSR-310 to come along and add the extra classes and conversions we want. For example, implementations of Quantity called Days and Months that don't provide easy conversion.
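
For example, JSR-310 could then add something like this on top (every name below is illustrative, not from any actual spec):

  public final class TimeScale extends Scale {
    public static final TimeScale INSTANCE = new TimeScale();
    public String name() { return "time"; }
  }
  public final class DaysUnit extends Unit {
    public static final DaysUnit INSTANCE = new DaysUnit();
    public String name() { return "days"; }
    public Scale scale() { return TimeScale.INSTANCE; }
  }
  public final class Days extends Quantity<Integer, DaysUnit> {
    private final int amount;
    public static Days days(int amount) { return new Days(amount); }
    private Days(int amount) { this.amount = amount; }
    public Integer amount() { return Integer.valueOf(amount); }
    public DaysUnit unit() { return DaysUnit.INSTANCE; }
    // deliberately no conversion to Months - that needs calendar context
  }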

In essence, this is just the quantity equivalent of Number - thus it's a class which the JDK lacks.

I'm sure there are many holes in this mini-proposal (I did code it and try to improve on it, but ran into generics hell again). The main thing is to emphasise simplicity, and to separate the whole reams of code to do with conversion and formatting of units. I'm highly unconvinced at the moment that the conversion/formatting code from JSR-275 belongs in the JDK.

Summary

Having spent time looking at JSR-275 I'm just not feeling the love. It's too focussed on conversion for my taste, and prevents the simple subclassing I actually need.

Opinions welcome on JSR-275!

Thursday 27 September 2007

Java 7 - Update on Properties

Whilst I was on holiday there was a bit of discussion about properties in Java.

Shannon Hickey announced Beans Binding version 1.0. This is the JSR that intends to develop an API to connect together swing components in a simple and easy to use manner. The idea is that you call a simple API to setup a binding between a GUI field and your data model.

During development, a lot of discussion on the JSR was about the interaction with the possible language proposal for properties. As a result, the v1.0 API includes a Property abstract class, with subclasses for beans and the EL language from J2EE.

Per-class or per-instance

One key issue that has to be addressed with a property proposal is whether to go per-class or per-instance. The per-class approach results in usage like Method or Field, where the bean must be passed in to obtain the value:

 Person mother = ...;
 Property<Person, String> surnameProperty = BeanProperty.create("surname");
 String surname = surnameProperty.getValue(mother);

Here, the surname property refers to the surname on all Person beans. The caller must pass in a specific person instance to extract the value of the surname.

On the other hand, the per-instance approach results in usage where the property itself knows what bean it is attached to:

 Person mother = ...;
 Property<Person, String> surnameProperty = BeanProperty.create("surname", mother);
 String surname = surnameProperty.getValue();

Here, the surname property refers to the specific surname property on the mother instance. There is thus no need to pass in any object to extract the value of the surname, as the property knows that it is attached to mother.

Which is better? And which did Beans Binding adopt?

Well, it's actually the case that for the Swing use case, the per-class approach is really the only option. Consider a table where each row is the representation of a bean in a list. The columns of this table are clearly going to be properties, but which type? Well, when defining the binding, we want to do this without any reference to the beans in the list - after all, the beans may change and we don't want to have to redefine the binding if they do.

So, we naturally need the per-class rather than per-instance properties. And, that is what the Beans Binding project defines.

Is there ever a need for the per-instance approach? In my opinion yes, but that can be added to the per-class based property functionality easily.

Adding language support

Having seen the Beans Binding release, Remi Forax released a version of his property language compiler that integrates:

 Person mother = ...;
 Property<Person, String> surnameProperty = LangProperty.create(Person#surname);
 String surname = surnameProperty.getValue(mother);

As can be seen, the main advantage is that we now have a type-safe, compile-time checked, refactorable property Person#surname. This is a tremendous leap forward for Java, and one that shouldn't be underestimated. And the # syntax can easily be extended to fields and methods (FCM).
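
For reference, the same # syntax applied to fields and methods would look something like this (sketched in the FCM style, assuming Person has a getSurname() method; the exact resolution rules are up to the proposals):

  Field field = Person#surname;          // field literal
  Method method = Person#getSurname();   // method literal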

But, why do we see this new class LangProperty? Well, Remi's implementation returns an object of type java.lang.Property, not a Beans Binding org.jdesktop.Property. This frustrating difference means that two classes named Property exist, and conversion has to take place.

What this really demonstrates is the need for Sun to declare if Java 7 will or won't contain a language change for properties, and for a real effort to occur to unify the various different JSRs currently running in the beans/properties arena. It would be a real disaster for Java to end up with two similar solutions to the same problem (although I should at least say I'm very happy that the two parts are compatible :-).

Summary

Properties support in Java is progressing through a combination of JSR and individual work. Yet Sun is still silent on Java 7 language changes. Why is that?

Opinions welcomed as always!

Friday 31 August 2007

Java 7 - Self-types

I just screamed at Java again. This time my ire has been raised by the lack of self-types in the language.

Self-types

So what is a self-type? Well, I could point you at some links, but I'll try and have a go at defining it myself...

A self-type is a special type that always refers to the current class. It's used mostly with generics, and if you've never needed it then you won't understand what a pain it is that it isn't in Java.

Here's my problem to help explain the issue. Consider two immutable classes - ZonedDate and LocalDate - both declaring effectively the same method:

 public final class ZonedDate {
   ZonedDate withYear(int year) { ... }
 }
 public final class LocalDate {
   LocalDate withYear(int year) { ... }
 }

The implementations only differ in the return type. This is essential for immutable classes, as you need the returned instance to be the same type as the original in order to call other methods (like withMonthOfYear).
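
In other words, the covariant return type is what keeps chained calls compiling:

  // only works if withYear() returns ZonedDate rather than some common supertype
  ZonedDate adjusted = zonedDate.withYear(2008).withMonthOfYear(6);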

Now consider that both methods actually contain the same code, and I want to abstract that out into a shared abstract superclass. This is where we get stuck:

 public abstract class BaseDate {
   BaseDate withYear(int year) { ... }
 }
 public final class ZonedDate extends BaseDate {
   // withYear removed as now in superclass, or is it?
 }
 public final class LocalDate extends BaseDate {
   // withYear removed as now in superclass, or is it?
 }

The problem is that we've changed the effect of the method as far as ZonedDate and LocalDate are concerned, because they no longer return the correct type, but now return the type of the superclass.

One solution is to override the abstract implementation in the subclass, just to get the correct return type (via covariance):

 public abstract class BaseDate {
   BaseDate withYear(int year) { ... }
 }
 public final class ZonedDate extends BaseDate {
   ZonedDate withYear(int year) {
     return (ZonedDate) super.withYear(year);
   }
 }
 public final class LocalDate extends BaseDate {
   LocalDate withYear(int year) {
     return (LocalDate) super.withYear(year);
   }
 }

What a mess. Imagine doing that for many methods on many subclasses. It's a large amount of pointless boilerplate code for no good reason. What we really want is self-types, with a syntax such as <this>:

 public abstract class BaseDate {
   <this> withYear(int year);
 }
 public final class ZonedDate extends BaseDate {
   // withYear removed as now correctly in superclass
 }
 public final class LocalDate extends BaseDate {
   // withYear removed as now correctly in superclass
 }

The simple device of the self-type means that the withYear method will now appear to have the correct return type in each of the subclasses - LocalDate in LocalDate, ZonedDate in ZonedDate and so on. And without reams of dubious boilerplate.

Summary

Self-types are a necessary companion for abstract classes where the subclass is to be immutable (and there are probably other good uses too...).

Opinions welcome as always!

Wednesday 11 July 2007

Java 7 - What to do with BigDecimal?

What happens in Java when we have to deal with large decimal numbers? Numbers that must be accurate, of unlimited size and precision? We use BigDecimal, of course. Yet, I have a sense that we tend to curse every time we do. Why is that?

BigDecimal

I think that the main reason why we dislike working with BigDecimal is that it's so much more clunky than working with a primitive type. There are no literals to construct them easily. It's only in Java 5 that constructors taking an int/long have been added.

And then there are the mathematical operations. Calling methods is just a whole lot more verbose and confusing than using operators.

  // yucky BigDecimal
  BigDecimal dec = new BigDecimal("12.34");
  dec = dec.add(new BigDecimal("34.45")).multiply(new BigDecimal("1.12")).subtract(new BigDecimal("3.21"));

  // nice primitive double
  double d = 12.34d;
  d = (d + 34.45d) * 1.12d - 3.21d;

Finally, there is the performance question. Perhaps this is a Java Urban Myth, but my brain associates BigDecimal with poor performance. (Note to self... really need to benchmark this!).

So, I was wondering if anything can be done about this? One solution would be a new primitive type in Java - decimal - with all the associated new bytecodes. Somehow I doubt this will happen, although it would be nice if it did.

More realistic is operator overloading (let's just consider BigDecimal for now, and not get into the whole operator overloading debate...). Overloading for BigDecimal has definitely been talked about in Sun for Java 7, and it would make sense. However, as can be seen in my example below, there is really a need for BigDecimal literals to be added at the same time:

  // now
  BigDecimal dec = new BigDecimal("12.34");
  dec = dec.add(new BigDecimal("34.45")).multiply(new BigDecimal("1.12")).subtract(new BigDecimal("3.21"));

  // just operator overloading
  BigDecimal dec = new BigDecimal("12.34");
  dec = dec + new BigDecimal("34.45") * new BigDecimal("1.12") - new BigDecimal("3.21");

  // with literals
  BigDecimal dec = 12.34n;
  dec = dec + 34.45n * 1.12n - 3.21n;

As you can see, the literal syntax makes a big difference to readability (I've used 'n' as a suffix, meaning 'number', for now). Of course, the main issue with operator overloading is precedence, and that would need quite some work to get it right. I would argue that if literals are added at the same time, then precedence must work exactly as per other primitive numbers in Java.

One possibility to consider is replacing BigDecimal with a new class. I don't know if there are backwards compatibility issues holding back the performance or design of BigDecimal, but it's worth investigating. Obviously, it's difficult though, as BigDecimal is, like Date, a very widely used class.

A new idea?

One idea I had tonight was to write a subclass of BigDecimal that uses a long for internal storage instead of a BigInteger. The subclass would override all the methods of BigDecimal, implementing them more quickly because of the single primitive long storage. If the result of any calculation overflowed the capacity of a long, then the method would just return the standard BigDecimal class. I'd love to hear if anyone has had this idea before, or thinks that it would work well.
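
To make the idea concrete, here is a minimal sketch - FastDecimal is an invented name, only add() is shown, and a real version would need every BigDecimal method overridden with proper scale handling:

  import java.math.BigDecimal;
  import java.math.BigInteger;

  public class FastDecimal extends BigDecimal {
    private final long unscaled;   // unscaled value held in a primitive long
    private final int scale;

    public FastDecimal(long unscaled, int scale) {
      super(BigInteger.valueOf(unscaled), scale);  // keep the superclass state consistent
      this.unscaled = unscaled;
      this.scale = scale;
    }

    @Override
    public BigDecimal add(BigDecimal augend) {
      if (augend instanceof FastDecimal && scale == ((FastDecimal) augend).scale) {
        long other = ((FastDecimal) augend).unscaled;
        long sum = unscaled + other;
        if (((unscaled ^ sum) & (other ^ sum)) >= 0) {   // no overflow of the long
          return new FastDecimal(sum, scale);
        }
      }
      return super.add(augend);   // overflow or mixed types: fall back to standard BigDecimal
    }
  }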

By the way, I thought of this in relation to JSR-310. It could be a specialised implementation of BigDecimal just for milliseconds, allowing arbitrary precision datetimes if people really need it.

And finally...

And finally, could I just say that I hate the new Java 5 method on BigDecimal called plus(). Believe it or not, the method does absolutely nothing except return 'this'. Who thought that was a good idea???

Summary

Hopefully this random collection of thoughts about BigDecimal can trigger some ideas and responses. There is still time before Java 7 to get a better approach to decimal numbers, and stop the use of inaccurate doubles. Opinions welcome as always.

Monday 2 July 2007

JSR-310 - What running a JSR really means

Choosing to take on the development of a JSR (JSR-310, Date and Time API) wasn't an easy decision. I knew at the start that it would involve a lot of effort.

My original plan was to work on the JSR solely in my free time (evenings and weekends), but after 4 months the reality was that I hadn't got enough done. So that's why, starting today, I'm taking unpaid leave two days a week from my day-job for 3 months to try and get the task done.

So, I guess that this post should serve as a warning to anyone thinking of starting a JSR. It's going to cost you :-)

PS. JSR-310 is run in a fully open manner, so if you've got something to contribute to the debate, please join the mailing list or add to the wiki!

Friday 8 June 2007

Speaking at TSSJS Europe on Closures

I will be speaking on the subject of Java Closures at TSSJS Europe in Barcelona on Wednesday June 27th. I'll be introducing closures, showing what they can do, and comparing the main proposals. It should be a good trip!

Saturday 12 May 2007

Closures - comparing options impact on language features

At JavaOne, closures were still a hot topic of debate. But comparing the different proposals can be tricky, especially as JavaOne only had BGGA based sessions.

Once again, I will be comparing the three key proposals in the 'closure' space - BGGA from Neal Gafter et al, CICE from Josh Bloch et al, and FCM from Stefan and myself, with the optional JCA extension. In addition, I'm including the BGGA restricted variant, which is intended to safeguard developers when the closure is to be run asynchronously, and the ARM proposal for resource management.

Here, I want to compare the behaviour of key program elements in each of the main proposals with regard to lexical scope. By lexical scoping I mean which features of the surrounding method and class are available to code within the 'closure'.

Neal Gafter analysed the lexically scoped language constructs in Java. The concept is that 'full closures' should not impact on the lexical scope. Thus any block of code can be surrounded by a closure without changing its meaning. The proposals differ on how far to follow this rule, as I hope to show in the table (without showing every last detail):

Is the language construct lexically scoped when surrounded by a 'closure'?
                      Inner class  CICE      ARM  FCM      JCA      Restricted  BGGA
                                                                    BGGA
  variable names      Part (1)     Part (1)  Yes  Yes      Yes      Part (7)    Yes
  methods             Part (2)     Part (2)  Yes  Yes      Yes      Yes         Yes
  type (class) names  Part (3)     Part (3)  Yes  Yes      Yes      Yes         Yes
  checked exceptions  -            -         Yes  Yes (4)  Yes (4)  Yes (4)     Yes (4)
  this                -            -         Yes  Yes      Yes      Yes         Yes
  labels              -            -         Yes  - (5)    Yes      - (5)       Yes
  break               -            -         Yes  - (5)    Yes      - (5)       Yes
  continue            -            -         Yes  - (5)    Yes      - (5)       Yes
  return              -            -         Yes  - (6)    Yes      - (8)       Yes

(1) Variable names from the inner class hide those from the lexical scope
(2) Method names from the inner class hide those from the lexical scope
(3) Type names from the inner class hide those from the lexical scope
(4) Assumes exception transparency
(5) Labels, break and continue are constrained within the 'closure'
(6) The return keyword returns to the invoker of the inner method
(7) The RestrictedClosure interface forces local variables to be declared final
(8) The RestrictedClosure interface prevents return from compiling within the 'closure'
BTW, if this is inaccurate in any way, I'll happily adjust.
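
To make note (1) concrete, here is a small plain-Java illustration (the class and names are invented for this example) - the field declared in the anonymous class hides the identically named parameter of the enclosing method, which is why the inner class column only scores 'Part':

  public class ScopeDemo {
    public void init(final int count) {
      Runnable r = new Runnable() {
        private int count = 0;          // hides the 'count' parameter of init()
        public void run() {
          System.out.println(count);    // prints 0 - the field wins over the parameter
        }
      };
      r.run();
    }
  }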

So, more 'Yes' markers means the proposal is better, right? Well, not necessarily. In fact, I really don't want you to get that impression from this blog.

What I'm trying to get across is the kind of problems that are being tackled. Beyond that you need to look at the proposals in more detail with examples in order to understand the impact of the Yes/No and the comments.

Let me know if this was helpful (or inaccurate), or if you'd like any other comparisons.

Wednesday 9 May 2007

JavaOne JSR-310 Date and Time API BOF tonight!

If you're at JavaOne and hate Date and Calendar then come along to our BOF tonight! It's at 21:55 in MC105 - BOF-2794.

We'll be having a short talk about what's wrong with Date and Calendar, what current alternatives are out there, and the initial thoughts of the JSR group. There will also be a mini-puzzler with mini-chocolate reward!!!

The main aim of the BOF, however, is to have a discussion - what calendar systems should we support, what kind of API, should we address SQL access? We'd love to hear your input, so come along if you can!

Monday 7 May 2007

First time at JavaOne

So, it's my first time at JavaOne. I'm about to head off to the warm-up CommunityOne event.

My main task for the week is the BOF for JSR-310. This is on Wednesday evening at 21:55. I'd love to see you there if you're around.

Apart from that, I'll be following all the debates on Java language changes and generally wandering around. If you see me (and recognise me) then say hi.

Thursday 3 May 2007

Closures options - FCM+JCA is my choice

In my last post I drew a diagram showing how I viewed the main proposals in terms of complexity. It has since been pointed out to me that I didn't clearly identify whether the JCA part of FCM was included or not. So I'll redraw the diagram and express my opinion.

The three key proposals in the 'closure' space are - BGGA from Neal Gafter et al, CICE from Josh Bloch et al, and FCM from Stefan and myself. FCM has an optional extension called JCA which adds control abstraction [1].

Control abstraction [1] is the part of the closures language change that allows a user to write an API method which acts, via a static import, as though it were a built-in keyword. This is an incredibly powerful facility in a programming language. However, reading many forums, it is also a facility that many Java developers find concerning.

Given this, I'll redraw the diagram from yesterday:

 --------------------------------------------------------------------------
  Simpler                                                     More complex
 --------------------------------------------------------------------------
   No change   Better inner   Method-based   Method-based   Full-featured
                 classes        closures     closures and   closures with
                                             control abst.   control abst.
 --------------------------------------------------------------------------
 |        No control abstraction [1]      |    Control abstraction [1]   |
 --------------------------------------------------------------------------
   No change      CICE*           FCM           FCM+JCA         BGGA
 --------------------------------------------------------------------------

* It should also be noted that Josh Bloch has written the ARM proposal for resource management that tackles a specific control abstraction [1] use case.

So, what is my preference?

Firstly, I believe that Java should be enhanced with FCM-style closures (where the 'return' keyword returns to the invoker of the closure).

Secondly, I believe that control abstraction [1] would be a very powerful and expressive language enhancement. Personally, I would like to see Java include control abstraction, but I am very aware that it is a real concern for others in the community. I am also not yet fully confident about the risks involved in rushing into control abstraction [1].

Thus, my preferred option is FCM+JCA for Java 7. However I would also accept a potentially lower risk option of including just the FCM-style solution in Java 7 and adding JCA-style control abstraction [1] in Java 8.

Feel free to add your preference to the comments.

[1] Update: Neal Gafter has indicated that I am using conflicting terminology wrt 'control abstraction'. Neal uses the term 'control abstraction' to refer to any closure which can abstract over an arbitrary block of code, including interacting with break, continue and return. Whereas, in this blog post, I have been referring to control abstraction as adding API-based keywords (described in paragraph 3), which Neal refers to as the 'control invocation syntax'. Apologies for any confusion.

Wednesday 2 May 2007

Cautious welcome on closure JSR - too soon for consensus

Neal Gafter announced a draft closures JSR a few days ago. I'd like to give the JSR a cautious welcome, but also wish to emphasise that the 'consensus' term could be misleading.

At present, there are three key proposals in the 'closure' space - BGGA from Neal Gafter et al, CICE from Josh Bloch et al, and FCM from Stefan and myself. Conceptually, these proposals meet differing visions of how far Java should be changed, and what is an acceptable change.

 ----------------------------------------------------------------
  Simpler                                           More complex
 ----------------------------------------------------------------
   No change    Better inner     Method-based     Full-featured
                  classes          closures         closures
 ----------------------------------------------------------------
   No change       CICE              FCM              BGGA
 ----------------------------------------------------------------

In deciding to support this JSR, a number of factors came into play for me:

  • I want to see closures in Java 7 - unfortunately it may already be too late for that
  • I'm not a language designer, and don't have the knowledge to run a JSR myself on this topic
  • The text of the draft JSR is clearly phrased as a derivation of the BGGA proposal
  • There has been positive feedback on Java forums about each of the three proposals - the Java community does not have a settled opinion yet
  • Many Java developers feel uncomfortable with any change in this area
  • Groovy, a Java derived language, uses an approach similar to FCM
  • This is the only JSR proposal currently available to support
  • Private emails

In the end, taking all of these factors into account, I took the judgement that this JSR was currently the only game in town and it would be better for me to participate rather than stand outside. And that is why I give it a cautious welcome.

Unfortunately, the extensive list of pre-selected Expert Group members (not including FCM) and Expert Group questionnaire came as rather a surprise, especially given the 'consensus' tag.

On the point of consensus, some commentators have assumed that this means a compromise proposal has been created, or agreement reached. This isn't the case.

Looking forward, my preference would be for Sun to finally play their hand on this topic. As a neutral player, they are well placed to encourage all parties to take a step back for a minute and put the focus firmly on what is best for Java going forward. Let the Sun shine :-)

Tuesday 10 April 2007

ASF open letter to Sun

Despite a great deal of behind the scenes effort, the Apache Software Foundation (ASF) has been forced to write an open letter to Sun Microsystems. This letter is regarding the inability of the ASF to obtain a version of the testing compatibility kit for Java SE, as needed by Apache Harmony, under a suitable license.

As the open letter details, Sun is attempting to avoid its contractual obligations from the JSPA, which is the legal agreement of the Java Community Process (JCP). This seriously places at risk the ability of the JCP to act as a truly independent open standards organisation.

Considerable efforts (over 7 months) have gone into trying to avoid reaching this point. It is hoped that now the issue is public, the Java community can bring to bear pressure on Sun to meet its contractual obligations quickly and amicably.

Further questions may be answered in the FAQ.

On this issue, Geir Magnussen Jr acts as the official ASF representative. I am speaking here simply to provide information.

Java Control Abstraction for First-Class Methods (Closures)

Stefan Schulz, Ricky Clarkson and I are pleased to announce the release of Java Control Abstraction (JCA). This is a position paper explaining how we envisage the First-Class Methods (FCM) closures proposal being extended to cover control abstraction.

Java Control Abstraction

So, what is control abstraction? And how does it relate to FCM? Well, it's all about being able to add methods to an API that can appear as though they are part of the language. The classic example is iteration over a map. Here is the code we write today:

  Map<Long,Person> map = ...;
  for (Map.Entry<Long, Person> entry : map.entrySet()) {
    Long id = entry.getKey();
    Person p = entry.getValue();
    // some operations
  }

and here is what the code looks like with control abstraction:

  Map<Long,Person> map = ...;
  for eachEntry(Long id, Person p : map) {
    // some operations
  }

The identifier eachEntry is a special method implemented elsewhere (and statically imported):

  public static <K, V> void eachEntry(for #(void(K, V)) block : Map<K, V> map) {
    for (Map.Entry<K, V> entry : map.entrySet()) {
      block.invoke(entry.getKey(), entry.getValue());
    }
  }

As can be seen, the API method above has a few unique features. Firstly, its parameter list has two halves separated by a colon: the first half is a method type representing the code to be executed (the closure), and the second half holds any other parameters. As shown, the closure block is invoked in the same way as in FCM.

The syntax allowed inside the block is not governed by the rules of FCM. Instead, the developer may use return, continue, break and exceptions and they will 'just work'. Thus in JCA, return returns from the enclosing method, not just from the closure block back to the higher order method. This is the opposite of FCM. This behaviour is required because the JCA block has to act like a built-in keyword.
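
To make that concrete, here is a small sketch in the proposed JCA syntax (the enclosing findPerson method is invented for illustration) - the return leaves findPerson itself, exactly as it would in a hand-written for loop:

  public Person findPerson(Map<Long, Person> map, Long target) {
    for eachEntry(Long id, Person p : map) {
      if (id.equals(target)) {
        return p;      // leaves findPerson, not just the block
      }
    }
    return null;       // reached only if no entry matched
  }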

One downside of the approach is that things can go wrong because the API writer has access to a variable that represents the closure. The API writer could store this in a field and invoke it later, after the enclosing method has completed. If this occurred, any return/continue/break statements would no longer operate correctly, as the original enclosing method would no longer be on the call stack, and a weird and unexpected exception would be thrown.

The semantics of a pure FCM method invocation are always safe, and there is no way to get one of these unexpected exceptions. But, for JCA control abstraction we could find no viable way to stop the weird exceptions. Instead, we have chosen to specifically separate the syntax of FCM from the syntax of control abstraction in JCA.

Our approach is to accompany the integration of control abstraction into Java with a strong set of messages. Developers would be encouraged to use both FCM callbacks and JCA control abstractions, but only to write FCM-style APIs, not JCA ones.

Writing the API part of any control abstraction (including JCA) is difficult to get right (or, more accurately, easy to get wrong). As a result, some coding shops may choose to ban the writing of control abstraction APIs, and a separate syntax makes such a ban easy for tools to enforce. It is expected, of course, that the majority of the key control abstractions will be provided by the JDK, where experts will ensure that the control abstraction APIs work correctly.

Summary

This document has taken a while to produce, especially by comparison with FCM. In the end this indicated to us that writing a control abstraction is probably going to be a little tricky irrespective of what choices the language designer makes. By separating the syntax and semantics from FCM we have clearly identified the control abstraction issue in isolation, which can only be a good thing.

Feedback always welcome!

Wednesday 4 April 2007

First-Class Methods: Java-style closures - v0.5

Stefan and I are pleased to announce the release of v0.5 of the First-class Methods: Java-style closures proposal.

Changes

Since v0.4, we have focussed on understanding the relationship between the various use cases for closures and FCM. We identified the following three use case groups:

  • Asynchronous - callbacks, like swing event listeners - where the block is invoked by a method other than the higher order method to which it was passed
  • Synchronous - inline callbacks, like list sorting, filtering and searching - where the block is invoked by the higher order method and completes before it returns
  • Control abstraction - keywords, like looping around a map, or resource acquisition

Having identified the three groups, we found that we could extend FCM from asynchronous to synchronous with a couple of simple changes. But we became convinced that the control abstraction use case is completely different. This is because the meaning of return, continue and break needs to be entirely different in the control abstraction use case.

As a result of our deliberations, these are the main changes from v0.4:

1) Local variables can now be accessed read/write by all FCM inner methods. This enables the synchronous closure use case, and does not greatly compromise the asynchronous use case. We were concerned about the possibility of race conditions, but we identified that this is caused by concurrent access to the local variable, and could be addressed by a warning annotation - @ConcurrentLocalVariableAccess:

  public void addUpdater(@ConcurrentLocalVariableAccess #(void()) updater) {
    // invoke updater on a new thread
    // marked with annotation as local variable could be accessed
    // by new thread and original at the same time
  }

  public void process() {
    int val = 6;
    addUpdater(#{
      int calc = val;    // OK, can read local variables
      val = 7;           // OK, can update local variables
    });
    System.out.println(val);  // race condition - warning from annotation
  }

2) Added brief section on transparency. When considering synchronous closures it is desirable to not have to be concerned with every checked exception. Although we discussed alternatives, we simply reference the BGGA work here.

3) Removed the creation of stub methods for named inner methods. This prevents FCM from creating MouseListener instances, but is safer. As a result, we also removed the method compound concept.

Summary

This version tidies up the principal loose ends of FCM and provides rationale for our decisions. We think it represents a simple and safe extension to Java which would be of great value to developers. It should also be noted that we are not intending to extend FCM further to cover the control abstraction use case.

If you've any more feedback, please let us know!

Saturday 31 March 2007

Closures - Outside Java

Trying to agree semantics for closures in Java isn't easy. Maybe we should look at what some other Java-like languages have done - Nice, Groovy and Scala.

The first thing to notice is that they all have closures. Java really is the odd one out now. The devil is in the detail though. Specifically, what do these languages do with the interaction between continue/break/return and the closure?

Nice

  // syntax for the higher order function
  String find(String -> boolean predicate) {
    ...
    boolean result = predicate(str);
    ...
  }

  // syntax 1 to call the higher order function
  String result = strings.find(String str => str.length() == 2);

  // syntax 2 to call the higher order function
  String result = strings.find(String str => {
    // other code
    return str.length() == 2;
  });

Nice has two variants of calling the closure. Syntax 1, where there is just a single expression, has no braces and no return keyword. Syntax 2 has a block of code in braces, and uses the return keyword to return a value back to the higher order function.

There is a control abstraction syntax variant, where again the return keyword returns back to the higher order function. Similarly, the continue and break keywords may not cross the boundary between the closure and the main application.

Groovy

  // syntax for the higher order function
  def String find(Closure predicate) {
    ...
    boolean result = predicate(str)
    ...
  }

  // syntax 1 to call the higher order function
  String result = strings.find({String str -> str.length() == 2})

  // syntax 2 to call the higher order function
  String result = strings.find({String str -> 
    // other code
    return str.length() == 2;
  })

Groovy has many possible variants of calling the closure (with optional types). Semicolons at the end of lines are always optional, as is specifying the return keyword. So syntax 1 and 2 are naturally the same in Groovy anyway. However, the example does show that if you do use the return keyword, then it returns from the closure to the higher order function.

There is a control abstraction syntax variant, where again the return keyword returns back to the higher order function. Similarly, the continue and break keywords may not cross the boundary between the closure and the main application.

Scala

  // syntax for the higher order function
  def find(predicate: (String) => Boolean): String = {
    ...
    val result = predicate(str)
    ...
  }

  // syntax 1 to call the higher order function
  val result = strings.find((str: String) => str.length() == 2)

  // syntax 2 to call the higher order function
  val result = strings.find((str: String) => {
    // other code
    str.length() == 2
  })

In Scala, like Groovy, semicolons at the end of lines are always optional. However, the key difference with Groovy is that the return keyword is always linked to the nearest enclosing method, not the closure. If that method is no longer running when the closure is invoked, then a NonLocalReturnException may be thrown.

There is a control abstraction syntax variant, where again the return keyword will return from the enclosing method. Scala does not support the continue and break keywords.

BGGA and FCM

Hopefully, the above is clear enough to show that Nice and Groovy operate in a similar manner (return always returns to the higher order function), whereas Scala is different (return will always return from the enclosing method, but you may get an exception). This is especially noticeable when Nice/Groovy is used in a control abstraction syntax manner, because then the return keyword looks odd as it returns from an 'invisible' closure (the code often just looks like a keyword).

So how do the Java proposals compare?

BGGA follows the Scala model. If you write a return within the closure block it will return from the enclosing method. And you might get a NonLocalReturnException if that enclosing method has completed processing. BGGA also handles continue and break similarly.

FCM follows the Nice/Groovy model. If you write a return within the closure block, it will return to the higher order method, and there are no possible exceptions. There is no way to return from the enclosing method. Similarly, FCM prevents continue and break from escaping the boundary of the closure block.
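
For illustration, a rough sketch of the running find example in FCM syntax (the strings.find higher order method is assumed, and the exact inner method syntax is taken loosely from the proposal) - the return hands the boolean back to find, never out of the enclosing method:

  String result = strings.find(#(String str) {
    // other code
    return str.length() == 2;   // returns the value to find(), not from the enclosing method
  });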

Which model is right? Well, both have points in their favour and points against.

Scala/BGGA is more powerful as it allows return from the enclosing method, but it comes with the price of a weird exception - NonLocalReturnException. FCM is thus slightly less powerful, but has no risk of undesirable exceptions.

Personally I can see why Scala's choice makes sense in Scala - because the language generally omits semicolons and return statements. But I'm not expecting us to remove semicolons from Java any time soon, or make return optional on all methods, so Scala's choices within a Java context seem dubious.

Feedback

Feedback welcomed of course, including examples from other languages (C#, Smalltalk, Ruby, ...).

And what about adding a new keyword to FCM to allow returning from the enclosing method? Such as "break return"? It would expose FCM to NonLocalReturnException though...

Sunday 18 March 2007

First-Class Methods: Java-style closures - v0.4

Stefan and I are pleased to announce the release of v0.4 of the First-class Methods: Java-style closures proposal.

Changes

Since v0.3, we have tried to incorporate some of the feedback received on the various forums. The main changes are as follows:

1) Constructor and Field literals. It is now possible to create type-safe, compile-time checked instances of java.lang.reflect.Constructor and Field using FCM syntax:

  // method literal:
  Method m = Integer#valueOf(int);

  // constructor literal:
  Constructor<Integer> c = Integer#(int);

  // field literal:
  Field f = Integer#MAX_VALUE;

2) Extended method references. Invocable method references have been renamed to method references. There are now four types, static, instance, bound and constructor:

  // static method reference:
  #(Integer(int)) ref = Integer#valueOf(int);

  // constructor reference:
  #(Integer(int)) ref = Integer#(int);

  // bound method reference:
  Integer i = ...
  #(int()) ref = i#intValue();

  // instance method reference:
  #(int(Integer)) ref = Integer#intValue();

3) Defined mechanism for accessing local variables. We have chosen a simple mechanism - copy-on-construction. This mechanism copies the values of any local variables accessed by the inner method to the compiler-generated inner class at the point of instantiation (via the constructor).

The effect is that changes made to the local variable are not visible to the inner method. It also means that you do not have write access to the local variables from within the inner method. The benefits are simplicity and thread-safety:

  int val = 6;
  StringBuffer buf = new StringBuffer();
  #(void()) im = #{
    int calc = val;    // OK, val always equals 6
    val = 7;           // compile error!
    buf.append("Hi");  // OK
    buf = null;        // compile error!
  };

4) Stub methods are only generated for methods of return type void. This addresses a concern that it was too easy to create invalid implementations using named inner methods.

5) Added method compounds. This addresses the inability to override more than one method at a time. Unlike inner classes, each inner method in a method compound remains independent of the others; thus, in the example below, the mouseClicked method cannot call the mouseExited method.

  MouseListener lnr = #[
    #mouseClicked(MouseEvent ev) {
      // handle mouse click
    }
    #mouseExited(MouseEvent ev) {
      // handle mouse exit
    }
  ];

6) Added more detail on implementation throughout the document, with specific examples.

We believe that we have addressed the key concerns of the community with this version of FCM. But if you've any more feedback, please let us know!

Monday 12 March 2007

Configuration in Java - It sure beats XML!

Is the Java community slowly remembering what's good about Java? I'm talking about static typing.

For a long time, Java developers have been forced by standards committees and framework writers to write (or generate) reams of XML. The argument for this was flexibility, simplicity, standards.

The reality is that most Java developers will tell you of XML hell. These are files where refactoring doesn't work. Files where auto-complete doesn't work. Files where errors occur at runtime instead of compile-time. Files that get so unwieldy that we ended up writing tools like XDoclet to generate them. Files whose contents are not type-safe.

In other words, everyone forgot that Java is a statically-typed language, and that is one of its key strengths.

Well, it seems to me that things are slowly changing. The change started with annotations, which are slowly feeding through to new ideas and new frameworks. What prompted me to write this blog was the release of Guice. This is a new dependency injection framework based around annotations and configuration using Java code. For example:

  bind(Service.class)
    .annotatedWith(Blue.class)
    .to(BlueService.class);

I've yet to use it, but I already like it ;-)

So, Guice allows us to replace reams of XML configuration in a tool like Spring with something a lot more compact, that can be refactored, that's type-safe, supports auto-complete and that's compile-time checked.

But let's not stop with module definitions. Many applications have hundreds or thousands of pieces of configuration stored in XML, text or properties files as arbitrary key-value pairs. Each has to be read in, parsed and validated before it can be used. And if you don't validate, then an exception will occur.

My question is why don't we define this configuration in Java?

public class Config {
  public static <T> List<T> list(T... items) {
    return Arrays.asList(items);
  }
}
public class LogonConfig extends Config {
  public int userNameMinLength;
  public int userNameMaxLength;
  public int passwordMinLength;
  public int passwordMaxLength;
  public List<String> invalidPasswords;
}
public class LogonConfigInit {
  public void init(LogonConfig config) {
    config.userNameMinLength = 6;
    config.userNameMaxLength = 20;
    config.passwordMinLength = 6;
    config.passwordMaxLength = 20;
    config.invalidPasswords = list("password","mypass");
  }
}

So, we've got a common superclass, Config, with useful helper methods. A POJO, LogonConfig, without getters and setters (why are they needed here?). And the real class, LogonConfigInit, where the work is actually done. (By the way, please don't get hung up on the details of this example. It's late, and I can't think of anything better right now.)

So, is the Java class LogonConfigInit really any more complicated than an XML file? I don't think so, yet it has so many benefits in terms of speed, refactoring, validation, ... Heck we could even step through it with a debugger!

What's more, it's easy to have variations, or overrides. For example, if the XxxInit class were for database config, you could imagine decorating it with per-machine setups - one for production, one for QA, and so on.
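
For example, an override might be as simple as this sketch, reusing the hypothetical LogonConfig classes from above:

  public class ProductionLogonConfigInit extends LogonConfigInit {
    @Override
    public void init(LogonConfig config) {
      super.init(config);              // start from the standard values
      config.passwordMinLength = 10;   // then tighten them for production
    }
  }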

So, what about reloading changes to configuration at runtime? That's why we use text/XML files, isn't it? Well, once again, that needn't be true. Another of Java's key strengths is its ability to dynamically load classes, and to control the classloader.

All we need is a module that handles the 'reload configuration' button, invokes javac via the Java 6 API, compiles the new version of LogonConfigInit and runs it to load the new config. Apart from the plumbing, why is this fundamentally any different to reloading a text/xml file?
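
As a sketch of what that module might look like (the class name, the source directory layout and the use of reflection are all just assumptions for illustration), the standard javax.tools compiler API plus a throwaway classloader would be enough:

  import java.io.File;
  import java.net.URL;
  import java.net.URLClassLoader;
  import javax.tools.JavaCompiler;
  import javax.tools.ToolProvider;

  public class ConfigReloader {
    // compile the updated source file and load it with a fresh classloader
    public LogonConfig reload(File sourceDir) throws Exception {
      JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
      int result = compiler.run(null, null, null,
          new File(sourceDir, "LogonConfigInit.java").getPath());
      if (result != 0) {
        throw new IllegalStateException("LogonConfigInit failed to compile");
      }
      // a new classloader picks up the freshly compiled class file
      URLClassLoader loader = URLClassLoader.newInstance(new URL[] {sourceDir.toURI().toURL()});
      Object init = loader.loadClass("LogonConfigInit").newInstance();
      LogonConfig config = new LogonConfig();
      init.getClass().getMethod("init", LogonConfig.class).invoke(init, config);
      return config;
    }
  }

The throwaway classloader is the key part - dropping it and creating a new one on the next reload is what allows the updated class to be picked up.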

Well, there's my challenge to the Java community - how about a rock-solid configuration framework with full class compilation and reloading, and not a drop of XML in sight? Just compile-time checked, statically-typed, 100% pure Java. Who knows, maybe it already exists!

As always, your opinions on the issues raised are most welcome!