Showing posts with label closures. Show all posts

Monday, 3 September 2012

Still no Transparency

Update 2012-09-06: The Expert Group mailing lists for Project Lambda are now open!


Back in October last year, I wrote about the lack of access to the expert group mailing list of Project Lambda, the effort adding closures to JDK 8. Has anything changed? Well of course not.

Still no Transparency

The blog last year was in response to the submission (in November 2010) of the original JSR which made certain promises (as required by the JCP). Chief of these was:

A publicly readable mailing list for expert group communication will be the primary working model.

After years of asking, there still is no publicly readable mailing list for the expert group.

(Everything in the previous blog still stands, so read that for more info.)

We are of course fortunate enough to have the more general fully open lambda mailing list at OpenJDK. But it isn't what the JCP requires, Oracle promised or the community deserves.

While I'm sure individuals have tried their best, it seems clear that the system is broken. If a mailing list cannot be opened up in nearly 2 years then frankly what is the point?

And that is why I continue to see the JCP as an irrelevant zombie in its current form. It needs to be split in two such that Oracle's total ownership of Java SE is clear for all to see.

After all, I can live with Oracle's control. It's the broken promises I object to.

Wednesday, 26 October 2011

Transparency in action

Some days I think I have infinite patience. Today was a day I was reminded that I don't.

Here is what Oracle said in the JCP submission for Project Lambda (November 2010). Interspersed is what has actually happened (non-contentious points snipped):

2.14 Please describe the anticipated working model for the Expert Group working on developing this specification.

A publicly readable mailing list for expert group communication will be the primary working model.

The expert group mailing list is not publicly readable. This has effectively created a 'them and us' environment where no rationale can be seen and no real input can be provided.

2.15 Provide detailed answers to the transparency checklist, making sure to include URLs as appropriate:
...
- The Expert Group business is regularly reported on a publicly readable alias.
We intend this to be the case.

Expert group business has occasionally been reported to the public lambda-dev list. However, feedback could in no way be described as "regular". Today (the straw that finally broke the camel's back of my patience) I found that the most likely syntax for method references was being touted in an IBM developerWorks article without any input from the main public mailing list at all. I'm sorry, but you cannot talk about transparency and then ignore the only vaguely transparent element in the system when making key decisions.

- The public can read/write to a wiki for my JSR.
Rather than a wiki which the Expert Group must take time out to visit, we intend to implement a pair of mailing lists following the approach of JSR 294 and JSR 330. First, Expert Group business will be carried out on a mailing list dedicated to Expert Group members. Second, an "observer" mailing list which anyone may join will receive the traffic of the Expert Group list. The observer list allows the public to follow Expert Group business, and is writable to allow the public to make comments for other observers to see and make further comment on. Expert Group members are under no obligation to join the observer list and partake in public discussion. In addition to these lists we will maintain a private Expert-Group-only list for administrative and other non-technical traffic.

Nope. No observer list with public archives. And obviously not writeable. All meaningful discussions are in private and hidden.

- I read and respond to posts on the discussion board for my JSR on jcp.org.
In lieu of tracking the jcp.org discussion board, and in light of the considerable public attention that the observer list is likely to receive, the Specification Lead (or his designates) will read the observer list and respond directly as appropriate.

Obviously this fails too.

- There is an issue-tracker for my JSR that the public can read.
We intend this to be the case.

Anyone seen an issue tracker? Not me.

Now, I've been patient on this as I know privately some of the reasons why there is no transparency. But frankly enough is enough. If the management team cared enough about this they would have escalated the priority of this sufficiently by now to have removed the roadblocks.

After all, not even members of the JCP Executive Committee get to see what is going on, as per the comments for Java 7:

...
------------------------------------------------------------------------------
On 2011-07-18 Keil, Werner commented:
... want to emphasize again the concern about intransparent ways this umbrella JSR and some of the underlying EGs have worked.

Hoping greater transparency in the spirit of JCP.next and also more satisfactory licensing terms follow with Java 8 and beyond.
------------------------------------------------------------------------------
On 2011-07-16 London Java Community commented:
...
We note that the archives for some of the Expert Groups that comprise this JSR have not been made public. It is most regrettable that this did not happen prior to this being put to final vote.

We trust that no further platform JSRs will be submitted without full access to EG archives - we would be very unlikely to support any such JSR.

Summary

However you look at it, transparency simply isn't happening with key JCP Java SE projects. (The same issues afflict other Java SE projects.)

But don't worry, it's all going to get magically better! Oracle recently signed up for more transparency in JSR-348. Is anyone out there still willing to believe it's actually going to happen?

Thursday, 17 June 2010

Exception transparency and Lone-Throws

The Project Lambda mailing list has been considering exception transparency recently. My fear with the proposal in this area is that the current proposal goes beyond what Java's complexity budget will allow. So, I proposed an alternative.

Exception transparency

Exception transparency is all about checked exceptions and how to handle them around a closure/lambda.

Firstly, it's important to note that closures are a common feature in other programming languages. As such, it would be a standard approach to look elsewhere to see how this is handled. However, checked exceptions are a uniquely Java feature, so this approach doesn't help.

Neal Gafter's BGGA and CFJ proposals define both the concept of exception transparency and a solution to it, and the original FCM proposal references that work. First, let's look at the problem:

Consider a method that takes a closure and a list, and processes each item in the list using the closure. For our example, we have a conversion library method (often called map) that transforms an input list to an output list:

  // library method
  public static <I, O> List<O> convert(List<I> list, #O(I) block) {
    List<O> out = new ArrayList<O>();
    for (I in : list) {
      O converted = block.(in);
      out.add(converted);
    }
    return out;
  }
  // user code
  List<File> files = ...
  #String(File) block = #(File file) {
    return file.getCanonicalPath();
  };
  List<String> paths = convert(files, block);

However, this code won't work as expected unless we specially handle it in closures. This is because the method getCanonicalPath can throw an IOException.

The problem of exception transparency is how to transparently pass the exception, thrown by the user supplied closure, back to the surrounding user code. In other words, we don't want the library method to absorb the IOException, or wrap it in a RuntimeException.
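
For comparison, here is a rough sketch of the same situation in today's Java, written with an anonymous inner class instead of a closure. The `Block` interface and the `WrappingDemo` class are names invented for this illustration and are not part of any proposal. Because `invoke` declares no checked exceptions, the user code is forced to catch and wrap the IOException inside the block, which is exactly the outcome exception transparency is meant to avoid.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WrappingDemo {
  // Hypothetical single-method interface standing in for the function type #O(I)
  interface Block<I, O> {
    O invoke(I in);  // no throws clause, so no checked exception may escape
  }

  // library method: knows nothing about IOException
  static <I, O> List<O> convert(List<I> list, Block<I, O> block) {
    List<O> out = new ArrayList<O>();
    for (I in : list) {
      out.add(block.invoke(in));
    }
    return out;
  }

  // user code: must wrap the checked exception to satisfy the compiler
  static List<String> canonicalPaths(List<File> files) {
    return convert(files, new Block<File, String>() {
      public String invoke(File file) {
        try {
          return file.getCanonicalPath();
        } catch (IOException ex) {
          throw new RuntimeException(ex);  // the wrapping we would like to avoid
        }
      }
    });
  }

  public static void main(String[] args) {
    List<File> files = Arrays.asList(new File("a.txt"), new File("b.txt"));
    System.out.println(canonicalPaths(files));
  }
}
```

The caller of `canonicalPaths` can no longer catch IOException directly; it has to unwrap a RuntimeException instead.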

Project Lambda approach

The approach of Project Lambda is modelled on Neal Gafter's work. This approach adds additional type information to the closure to specify what checked exceptions can be thrown:

  // library method
  public static <I, O, throws E> List<O> convert(List<I> list, #O(I)(throws E) block) throws E {
    List<O> out = new ArrayList<O>();
    for (I in : list) {
      O converted = block.(in);
      out.add(converted);
    }
    return out;
  }
  // user code
  List<File> files = ...
  #String(File)(throws IOException) block = #(File file) {
    return file.getCanonicalPath();
  };
  List<String> paths = convert(files, block);

Notice how more generic type information was added - throws E. In the library method, this is specified at least three times - once in the generic declaration, once in the function type of the block and once on the method itself. In short, throws E says "throws zero-to-many exceptions where checked exceptions must follow standard rules".

However, the user code also changed. We had to add the (throws IOException) clause to the function type. This actually locks in the exception that will be thrown, and allows checked exceptions to continue to work. This creates the mouthful #String(File)(throws IOException).
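
Existing Java generics can already express a limited version of this idea, because a type parameter may appear in a throws clause. The sketch below is only an approximation under stated assumptions: `ThrowingBlock` and `ThrowsParamDemo` are hypothetical names, and an ordinary type parameter stands for exactly one exception type rather than the "zero-to-many" meaning of the proposed throws E.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ThrowsParamDemo {
  // Hypothetical interface standing in for #O(I)(throws E)
  interface ThrowingBlock<I, O, E extends Exception> {
    O invoke(I in) throws E;
  }

  // library method: E appears in the type parameters, in the block type
  // and in the throws clause, mirroring the three mentions described above
  static <I, O, E extends Exception> List<O> convert(
      List<I> list, ThrowingBlock<I, O, E> block) throws E {
    List<O> out = new ArrayList<O>();
    for (I in : list) {
      out.add(block.invoke(in));
    }
    return out;
  }

  // user code: IOException is named in the block type and stays checked
  static List<String> canonicalPaths(List<File> files) throws IOException {
    ThrowingBlock<File, String, IOException> block =
        new ThrowingBlock<File, String, IOException>() {
          public String invoke(File file) throws IOException {
            return file.getCanonicalPath();
          }
        };
    return convert(files, block);
  }

  public static void main(String[] args) throws IOException {
    System.out.println(canonicalPaths(Arrays.asList(new File("a.txt"))));
  }
}
```

Even in this cut-down form the declaration of the block type is long, which gives a feel for the syntax cost being discussed.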

It has recently been noted that syntax doesn't matter yet in Project Lambda. However, here is a case where there is effectively a minimum syntax pain. No matter how you rearrange the elements, and what symbols you use, the IOException element needs to be present.

On the Project Lambda mailing list I have argued that the syntax pain here is inevitable and unavoidable with this approach to exception transparency. And I've gone further to argue that this syntax goes beyond what Java can handle. (Imagine some of these declarations with more than one block passed to the library method, or with wildcards!!!)

Lone throws approach

As a result of the difficulties above, I have proposed an alternative - lone-throws.

The lone-throws approach has three elements:

  1. Any method may have a throws keyword without specifying the types that are thrown ("lone-throws"). This indicates that any exception, checked or unchecked may be thrown. Once thrown in this manner, any checked exception flows up the stack in an unchecked manner.
  2. Any catch clause may have a throws keyword after the catch. This indicates that any exception may be caught, even if the exception isn't known to be thrown by the try block.
  3. All closures are implicitly declared with lone throws. Thus, all closures can throw checked and unchecked exceptions without declaring the checked ones.

Here is the same example from above:

  // library method
  public static <I, O> List<O> convert(List<I> list, #O(I) block) {
    List<O> out = new ArrayList<O>();
    for (I in : list) {
      O converted = block.(in);
      out.add(converted);
    }
    return out;
  }
  // user code
  List<File> files = ...
  #String(File) block = #(File file) {
    return file.getCanonicalPath();
  };
  List<String> paths = convert(files, block);

If you compare this example to the very first one, it can be seen that it is identical. Personally, I'd describe that as true exception transparency (as opposed to the multiple declarations of generics required in the Project Lambda approach.)

It works because the closure block automatically declares the lone-throws. This allows all exceptions, checked or unchecked, to escape. These flow freely through the library method and back to the user code. (Checked exceptions only exist in the compiler, so this has no impact on the JVM.)

The user may choose to catch the IOException, however they won't be forced to. In this sense, the IOException has become equivalent to a runtime exception because it was wrapped in a closure. The code to catch it is as follows:

  try {
    paths = convert(files, block);  // might throw IOException via lone-throws
  } catch throws (IOException ex) {
    // handle as normal - if you throw it, it is checked again
  }
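
Both halves of this can be approximated in current Java, which may help show what the proposal changes. The sketch below uses the well-known "sneaky throw" generics idiom to let a checked exception escape a method with no throws clause, which is roughly the effect of lone-throws. It then shows the catching side without catch-throws: the compiler rejects `catch (IOException ex)` when it believes the try block cannot throw it, so the code has to catch Exception and test the type. All class and method names here are invented for the illustration.

```java
import java.io.IOException;

class SneakyDemo {
  // The cast to E is erased at runtime, so the JVM simply throws t unchanged,
  // while the compiler only sees "throws E" for whatever E the caller picks.
  @SuppressWarnings("unchecked")
  static <E extends Throwable> void sneakyThrow(Throwable t) throws E {
    throw (E) t;
  }

  // No throws clause, yet an IOException escapes at runtime
  static void failWithoutDeclaring() {
    SneakyDemo.<RuntimeException>sneakyThrow(new IOException("disk on fire"));
  }

  static String describeFailure() {
    try {
      failWithoutDeclaring();
      return "no exception";
    } catch (Exception ex) {
      // "catch (IOException ex)" would not compile here, because javac
      // believes the try block cannot throw IOException.
      if (ex instanceof IOException) {
        return "caught IOException: " + ex.getMessage();
      }
      throw (RuntimeException) ex;  // anything else reaching here is unchecked
    }
  }

  public static void main(String[] args) {
    System.out.println(describeFailure());  // prints "caught IOException: disk on fire"
  }
}
```

The instanceof test is the boilerplate that a catch-throws clause would replace.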

The simplicity of the approach in syntax terms should be clear - it just works. However, the downside is the impact on checked exceptions.

Checked exceptions have both supporters and detractors in the Java community. However, all must accept that given projects like Spring avoiding checked exceptions, their role has been reduced. It is also widely known that other newer programming languages are not adopting the concept of checked exceptions.

In essence, this proposal provides a means for the new reality where checked exceptions are less important to be accepted. Any developer may use the lone-throws concept to convert checked exceptions to unchecked ones. They may also use the catch-throws concept to catch the exceptions that would otherwise be uncatchable.

This may seem radical, however with the growing integration of non-Java JVM languages, the problem of being unable to catch checked exceptions is fast approaching. (Many of those languages throw Java checked exceptions in an unchecked manner.) As such, the catch-throws clause is a useful language change on its own.

Finally, I spent a couple of hours tonight implementing the lone-throws and catch-throws parts. It took less than 2 hours - this is an easy change to specify and implement.

Summary

Overall, this is a tale of two approaches to a common problem - passing checked exceptions transparently from inside to outside a closure. The Project Lambda approach preserves full type-information and safeguards checked exceptions at the cost of horribly verbose and complex syntax. The lone-throws approach side-steps the problem by converting checked exceptions to unchecked, with less type-information as a result, but far simpler syntax. (The mailing list has discussed other possible alternatives, however these two are the best developed options.)

Can Java really stand the excess syntax of the Project Lambda approach?
Or is the lone-throws approach too radical around checked exceptions?
Which is the lesser evil?

Feedback welcome!

Monday, 23 November 2009

More detail on Closures in JDK 7

This blog goes into a little more detail about the closures announcement at Devoxx and subsequent information that has become apparent.

Closures in JDK 7

At Devoxx 2009, Mark Reinhold from Sun announced that it was time for closures in Java. This was a big surprise to everyone, and there was a bit of a vacuum as to what was announced. More information is now available.

Firstly, Sun, via Mark, have chosen to accept the basic case for including closures in Java. By doing so, the debate now changes from whether to go ahead, to how to proceed. This is an important step.

Secondly, what did Mark consider to be in and what out? Well, he indicated that non-local returns and the control-invocation statement were out of scope. There was also some indication that access to non-final variables may be out of scope (this is mainly because it raises nasty multi-threading Java Memory Model issues with local variables).
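
To make the non-final variable point concrete, here is a small sketch of how the same restriction already applies to anonymous inner classes. The class and method names are made up for this example. A captured local must be final, so mutable state is usually smuggled in through a final one-element array; that workaround reintroduces shared mutable state, which is where the multi-threading concerns come from.

```java
import java.util.Arrays;
import java.util.List;

class CaptureDemo {
  static int sumWithCallback(List<Integer> values) {
    // A plain "int total = 0;" could not be modified from the inner class,
    // so a final one-element array is used as a mutable holder instead.
    final int[] total = {0};
    for (final Integer value : values) {
      Runnable add = new Runnable() {
        public void run() {
          total[0] += value;  // reads the final local, mutates the array contents
        }
      };
      add.run();
    }
    return total[0];
  }

  public static void main(String[] args) {
    System.out.println(sumWithCallback(Arrays.asList(1, 2, 3)));  // prints 6
  }
}
```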

In terms of what was included, Mark indicated that extension methods would be considered. This would be necessary to provide meaningful closure style APIs since the existing Java collection APIs cannot be altered. The result of what was in was being called "simple closures".

Finally, Mark offered up a possible syntax. The syntax he showed was very close to FCM:

  // function expressions
  #(int i, String s) {
    System.out.println(s);
    return i + s.length();
  }

  // function expressions
  #(int i, String s) (i + s.length())
  
  // function types
  #int(int, String)

As such, it's easy to say that Sun has "chosen the FCM proposal". However, with all language changes, we have to look at the semantics, not the syntax!

The other session at Devoxx was the late night BOF session. I didn't attend, however according to Reinier Zwitserloot, Mark indicated that Exception transparency might not be essential to the final proposal.

According to Reinier, Mark also said "It is not an endorsement of FCM. He was very specific about that." I'd like to consider that in a little more detail (below).

The final twist was when a new proposal by Neal Gafter was launched which I'm referring to as the CFJ proposal. After some initial confusion, it became clear that this was written 2 weeks before Devoxx, and that Neal had discussed it primarily with James Gosling on the basis of it being a "compromise proposal". Neal has also stated that he didn't speak to Mark directly before Devoxx (and nor did I).

There are a number of elements that have come together in the various proposals:

| Feature | CICE | BGGA 0.5 | FCM 0.5 | CFJ 0.6a | Mark's announcement (Devoxx summary) |
|---|---|---|---|---|---|
| Literals for reflection | - | - | Yes | - | No (No info) |
| Method references | - | - | Yes | Yes | Worth investigating (No info) |
| Closures assignable to single method interface | Yes | Yes | Yes | Yes | Yes (No info) |
| Closures independent of single method interface | - | Yes | Yes | Yes | Yes |
| Access non-final local variables | Yes | Yes | Yes | Yes | No (Maybe not) |
| Keyword 'this' binds to enclosing class | - | Yes | Yes | Yes | No info (Probably yes) |
| Local 'return' (binds to closure) | Yes | - | Yes | Yes | Yes |
| Non-local 'return' (binds to enclosing method) | - | Yes | - | - | No |
| Alternative 'return' binding (Last-line-no-semicolon) | - | Yes | - | - | No |
| Exception transparency | - | Yes | Yes | Yes | No info (Maybe not) |
| Function types | - | Yes | Yes | Yes | Yes |
| Library based Control Structure | - | Yes | - | - | No |
| Proposed closure syntax | Name(args) {block} | {args => block;expr} | #(args) {block} | #(args) {block} and #(args) expr | #(args) {block} and #(args) (expr) |

A table never captures the full detail of any proposal. However, I hope that I've demonstrated some key points.

Firstly, the table shows how the CFJ is worthy of a new name (from BGGA) as it treats the problem space differently by avoiding non-local returns and control invocation. Neal Gafter is also the only author of the CFJ proposal, unlike BGGA.

Secondly, it should be clear that at the high level, the CFJ and FCM proposals are similar. But in many respects, this is also a result of what would naturally occur when the non-local returns and control invocation are removed from BGGA.

Finally, it should be clear that it is not certain that the CFJ proposal meets the aims that Mark announced at Devoxx ("simple closures"). There simply isn't enough hard information to make that judgement.

One question I've seen on a few tweets and blog posts is whether Mark and Sun picked the BGGA or FCM proposal. Well, given what Mark said in the BOF (via Reinier), the answer is that neither was "picked" and that he'd like to see a new form of "simple closures" created. However, it should also be clear that the choices announced at Devoxx were certainly closer to the FCM proposal than any other proposal widely circulated at the time of the announcement.

At this point, it is critical to note that Neal Gafter adds a huge amount of technical detail to each proposal he is involved with - both BGGA and CFJ. The FCM proposal, while considerably more detailed than CICE, was never at the level necessary for full usage in the JDK. As such, I welcome the CFJ proposal as taking key points from FCM and applying the rigour from BGGA.

As of now, there has been relatively little debate on extension methods.

Summary

The key point now is to focus on how to implement closures within the overall scope laid out. This should allow the discussion to move forward considerably, and hopefully with less acrimony.

Thursday, 19 November 2009

Closures in JDK 7

So, on Wednesday, Mark Reinhold from Sun announced that it was time for closures in Java. This came as a surprise to everyone, but what was announced?

Closures in JDK 7

Mark took the audience through the new features of JDK 7 in his presentation at Devoxx. As part of this he showed the Fork Join framework.

Part of this framework features a set of interfaces to support functional style predicates and transforms. But implementing this in Java required 80 interfaces, all generified, and that only covered 4 of the primitive types! Basically, Mark, and others, had clearly come to the conclusion that this code was important to Java, but that the implementation with single-method interfaces, generics and inner classes was too horrible to stomach for the JDK.

Thus, 'it's time to add closures to Java'.

Mark announced that Sun would investigate, and intended to commit, closures to OpenJDK for JDK 7. While there will be no JSR yet, it isn't unreasonable to expect that there will be an open aspect to discussions.

Mark noted that the work in the closures discussion, and particularly the detail in the BGGA proposal, had allowed the options to be properly explored. He emphasised how this had resulted in the decision to reject key aspects of BGGA.

JDK 7 closures will not have control-invocation statements as a goal, nor will it have non-local returns. He also indicated that access to non-final variables was unlikely. Beyond this, there wasn't much detail on semantics, nor do I believe that there has been much consideration of semantics yet.

On syntax, Mark wrote up some strawman syntaxes. These follow the FCM syntax style:

  // function expressions
  #(int i, String s) {
    System.out.println(s);
    return i + s.length();
  }

  // function expressions
  #(int i, String s) (i + s.length())
  
  // function types
  #int(int, String)

Mark also argued that a form of extension methods would be added to allow closures to be used with existing libraries like the collections API.

I must emphasise that everything announced was a proposal and NOT based on any specific proposal (BGGA, FCM or CICE). However, the syntax and semantics do follow the FCM proposal quite closely.

In addition, Neal Gafter has produced an initial formal writeup of what was announced derived from BGGA. Based on my initial reading, I believe that Neal's document represents a good place to start from. I also hereby propose that we refer to Neal's document as the CFJ proposal (Closures for Java), as BGGA rather implies control invocation.

So, let's hope the process for closures provides for feedback, and we get the best closures in the style most suitable for Java in JDK 7.

Thursday, 28 February 2008

FCM closures - options within

The FCM closures proposal, with the JCA extension, consists of multiple parts. This blog outlines how those parts fit together.

FCM+JCA spec parts

The FCM+JCA spec contains the following elements:

  • Method literals, also constructor and field literals
  • Method references, also constructor references
  • Inner methods
  • Method types, aka function types
  • Control invocation, in the JCA extension

These five parts all fulfil different roles in the proposal. But what is often not understood is how feasible it would be to implement less than the whole specification.

Method references and literals

Reviewing response to the entire debate, it is clear to me that method references and/or method literals have generally widespread appeal. The ability to reference a method, constructor or field in a compile-time safe and refactorable way is a huge gain for a statically typed language like Java.

It should be noted that although Method References appear only in the FCM spec, they could be added to the BGGA or CICE spec without difficulty. Also, they are included in the closures JSR proposal.

There are three areas of contention with method references and literals.

Firstly, should both references and literals be supported, or just references? Or, looking at the question differently, should literals have a different syntax?

The problem here is that if a method reference and literal have the same syntax then it is unclear as to what the type of the expression is. The FCM prototype demonstrates, I believe, that this can be solved using the same syntax. The approach taken is to say that the default type is a method literal, but that it can be converted (boxed) to a method reference at construction time. Any ambiguity is an error.

 Method m = Integer#valueOf(int);
 IntToInteger i = Integer#valueOf(int);

In this example, IntToInteger is a single method interface. Because the expression Integer#valueOf(int) is assigned to a single method interface, the conversion occurs (generating a wrapping inner class).
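
As a rough guide to what the two assignments correspond to, here is a sketch in plain Java. It is not the code the prototype generates; it only assumes that the literal maps onto a reflective lookup and the reference onto a wrapping inner class, as described above. The method name `convert` on `IntToInteger` and the class name `LiteralVsReferenceDemo` are assumptions, since the original example does not show them.

```java
import java.lang.reflect.Method;

class LiteralVsReferenceDemo {
  // Single method interface from the example; the method name is an assumption
  interface IntToInteger {
    Integer convert(int value);
  }

  // Approximates the method literal: a java.lang.reflect.Method,
  // here looked up by name at runtime rather than checked at compile time
  static Method literal() throws NoSuchMethodException {
    return Integer.class.getMethod("valueOf", int.class);
  }

  // Approximates the method reference: a wrapping inner class
  static IntToInteger reference() {
    return new IntToInteger() {
      public Integer convert(int value) {
        return Integer.valueOf(value);
      }
    };
  }

  public static void main(String[] args) throws Exception {
    Method m = literal();
    System.out.println(m.invoke(null, 42));       // static method, so no receiver
    System.out.println(reference().convert(42));
  }
}
```

The string "valueOf" in the reflective lookup is the part that is neither compile-time safe nor refactorable, which is the gap the literal syntax is meant to close.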

The second issue is what a method reference can be boxed into. This is essentially a question of whether function types should be supported, and I'll cover that below.

The third issue is syntax, specifically the use of the #. Personally, I find this syntax natural and obvious to read, but I know others differ. I think it is important to get the syntax right, but final decisions on that can come later.

So, are method literals and references required when implementing FCM? I would say 'yes'. These are simple, popular constructs that naturally extend Java in a non-threatening way. Some have suggested omitting the literals as reflection is not type-safe, however this misses the point of the large number of existing frameworks and APIs that accept reflection Method as an input parameter.

Inner methods / Closures

 ActionListener lnr = #(ActionEvent ev) {
   ...
 };

This is where the key difference with BGGA lies, notably over the meaning of return and the value, and safety, of non-local returns.

Opinions on this appear to me to be impacted by the generics implementation, where the decision was made to do what feels like 'half a job'. As a result, there is a meme that runs 'we must implement closures fully or not at all'. This meme is extremely unfortunate, as it is not allowing a rational analysis of the semantics of the proposals. Anyone supporting BGGA really needs to consider the mistakes that developers will make again and again with the non-local return/last-line-no-semicolon approach.

So, are inner methods required when implementing FCM? I would say 'effectively, yes'. Although you could just implement method literals and references alone, there are even bigger gains to be had from adding inner methods. They greatly simplify the declaration of single method inner classes, and allow much of the impact of closures in the style of Java.
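
For a sense of the ceremony that inner methods would remove, here is the kind of single method inner class that has to be written today. This sketch uses Comparator rather than ActionListener purely to keep it free of GUI classes; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class InnerClassDemo {
  static List<String> sortByLength(List<String> words) {
    List<String> sorted = new ArrayList<String>(words);
    // Several lines of declaration wrapped around a single line of logic
    Collections.sort(sorted, new Comparator<String>() {
      public int compare(String a, String b) {
        return a.length() - b.length();
      }
    });
    return sorted;
  }

  public static void main(String[] args) {
    System.out.println(sortByLength(Arrays.asList("ccc", "a", "bb")));  // prints [a, bb, ccc]
  }
}
```

With an inner method, only the parameter list and the body of `compare` would need to be written.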

Function types / Method types

 #(int(String) throws IOException)

These allow a new powerful form of programming where common pieces of code can be easily abstracted. They simply act as types, but they have two different properties from other types.

Firstly, they have no name. This means an absence of Javadoc, including any semantic requirements of the API, such as thread-safety or null/not-null.

Secondly, they only describe the input and output types. This is a higher abstraction than Java has previously used, and will require a mindset shift for those using them.

So, are function types required when implementing FCM? I would say 'no, not required'. It is perfectly possible and reasonable to implement FCM without method types. In fact, that is what the prototype does. In practice, this just means that all conversions from method references and inner methods must be to single method interfaces rather than method types.

Omitting method types greatly simplifies the conceptual weight of the change. The downside is that true higher order functional programming becomes near impossible. That may be no bad thing. Java is not, and never has been, a functional programming language. It seems very odd to try and push it in that direction at this point in its life.

A better alternative would be to pursue supporting primitive types in generics. This would greatly reduce the overhead of single method interfaces required by something like the fork-join framework.

Similarly, making single method interfaces easier to write (lightweight interfaces) would be a direction to take in the absence of method types.

Control invocation

 withLock(lock) {
   ...
 }
 public void withLock(#(void()) block : Lock lock) {
   ...
 }

Control invocation forms are perhaps the only way forward in Java longer term because they allow us to escape from many of these language change debates. They allow anyone to write methods that can be used in the style of control statements. It is vital to remember that they are just methods however.
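
The "just methods" point can be seen by writing withLock in current Java, with a Runnable standing in for the block. This is only a sketch under that assumption, with invented class and helper names; it is not the JCA or BGGA form.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class WithLockDemo {
  // An ordinary method playing the role of a control statement
  static void withLock(Lock lock, Runnable block) {
    lock.lock();
    try {
      block.run();
    } finally {
      lock.unlock();  // released even if the block throws
    }
  }

  // Checks that the block ran while the lock was held and that it was released afterwards
  static boolean ranWhileLocked() {
    final ReentrantLock lock = new ReentrantLock();
    final boolean[] heldInside = {false};
    withLock(lock, new Runnable() {
      public void run() {
        heldInside[0] = lock.isHeldByCurrentThread();
      }
    });
    return heldInside[0] && !lock.isLocked();
  }

  public static void main(String[] args) {
    System.out.println(ranWhileLocked());  // prints true
  }
}
```

What the Runnable version cannot do is what makes control invocation hard: a return inside the block only leaves run(), and a checked exception cannot escape it.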

BGGA appears to build much of its spec around control invocation, and the non-local returns make perfect sense in this area.

The JCA spec defines that the calling code should be identical to BGGA, but the method invoked should be written differently. The aim of JCA is to provide an element of discouragement from using control invocation. This is because of the additional complexities in getting the code right (exception transparency, completion transparency, non-local returns etc). A different, special, syntax encourages this feature to be restricted to senior developers, or heavily code reviewed.

So, is control invocation required when implementing FCM+JCA? I would say 'no'. It is perfectly possible and reasonable to implement FCM+JCA without control invocation (although of course that means it would just be FCM!).

The inclusion or omission of method types is also linked to control invocation, as method types are a pre-requisite for control invocation in the JCA spec.

Summary of possible implementations

Thus, here are the possible FCM+JCA implementation combinations that make sense to me:

  1. Literals and References
  2. Literals, References and Inner methods
  3. Literals, References, Inner methods and Method types
  4. Literals, References, Inner methods, Method types and Control invocation

My preferred options are number 2 and number 4.

Why? Because, I believe inner methods are too useful to omit, and I believe method types are generally too complex unless you really need them. (Also Java isn't a functional programming language.)

The key point of this blog is to emphasise that FCM is not a take it or leave it proposal. There are different options and levels within it that could be adopted.

This extends to versions of Java. For example, it would be feasible to implement option 1, literals and references in Java 7, whilst adding inner methods and maybe more in Java 8.

Summary

I've shown how FCM has parts which can be considered separately to a degree. I've also indicated which combinations make sense to me.

Which combinations make sense to you?

Sunday, 24 February 2008

FCM prototype available

I'm happy to announce the first release of the First Class Methods (FCM) java prototype. This prototype allows anyone who is interested to find out what FCM really feels like.

Standard disclaimer: This software is released on a best-efforts basis. It is not Java. It has not passed the Java Compatibility Testing Kit. Having said that, in theory no existing programs should be broken. Just don't go relying on code compiled with this prototype for anything other than experimentation!

FCM javac implementation

The FCM javac implementation is hosted at Kijaro. The javac version used as a baseline is OpenJDK (the last version before the Mercurial cutover).

The prototype includes the following features:

  • Method literals
  • Constructor literals
  • Field literals
  • Static method references
  • Bound method references
  • Constructor references
  • Anonymous inner methods

The following are not implemented:

  • Method types
  • Instance method references
  • Named inner methods
  • Inner method non-final local variable access
  • Inner method exception inference
  • Inner method exception/completion transparency
  • Conversion to single abstract method classes

The download includes a README with many FAQs answered, including more information on how the types work.

The biggest outstanding area is getting generics working properly. This is a complex task however, and I took the view that it was better to release early than spend any more time trying to get generics working properly.

Nevertheless, even without full generics, you can get a really good feel for how Java would look and feel with FCM. In addition, the FCM enabled code just falls off my fingers very nicely. Now if only we could get Eclipse or NetBeans support...

Summary

One of the key requests for considering FCM as a viable proposal has been having a prototype to play with. Now the prototype is out there, it would be really great to hear some feedback, including any bugs. Comments welcome here, at the kijaro-dev mailing list or scolebourne-joda-org.

Thursday, 21 February 2008

Closures - Lightweight interfaces instead of Function types

Function types, or method types as FCM refers to them, are one of the most controversial features of closures. Is there an alternative that provides 80% of the power but in the style of Java?

Function/Method types

Function types allow the developer to define a type using just the signature, rather than a name. For example:

 // BGGA
 {String, Date => int}
 
 // FCM v0.5
 #(int(String, Date))

Apart from the different syntax, these are identical concepts in the two proposals as they currently stand. They both mean "a type that takes in two parameters - String and Date - and returns an int". This will be compiled to a dynamically generated single method interface as follows:

 public interface IOO<A, B> {
  int invoke(A a, B b);
 }

When instantiated, the generics A and B will be String and Date.
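
As a rough illustration of that compilation strategy, here is what a hand-written equivalent might look like in today's Java. The interface name IOO comes from the text above; the class name FunctionTypeSketch and the body of the implementation are invented for the example and say nothing about what a real compiler would emit.

```java
import java.util.Date;

// Hand-written stand-in for the interface the compiler would generate.
interface IOO<A, B> {
    int invoke(A a, B b);
}

final class FunctionTypeSketch {

    // Roughly what a value of the function type would be at runtime:
    // an instance of IOO<String, Date>. The logic itself is arbitrary.
    static IOO<String, Date> signedLength() {
        return new IOO<String, Date>() {
            public int invoke(String str, Date date) {
                // Positive length for dates after the epoch, negative otherwise.
                return date.getTime() > 0 ? str.length() : -str.length();
            }
        };
    }
}
```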

This is a complicated underlying mechanism. It is also one that can't be completely hidden from the developer, as the exception stack traces will show the auto-generated interface "IOO". This will certainly be a little unexpected at first. Update: Neal points out correctly that an interface name will not appear in the stacktrace!

A second complaint about function types is that there is no home for documentation. One of Java's key strengths is its documentation capabilities in the form of Javadoc. This is perhaps the unsung reason as to why Java became so popular in enterprises as a long-life, bet your company, language. Maintenance coders love that documentation. And everybody loves the ability to link to it within your IDE.

So, why are we even considering function types? Well they allow APIs to be written that can take in any closure, simply defining it in terms of the input and output types. They also allow lightweight definition - there is no need to define the type before using it.

These are highly powerful features, and they lead towards functional programming idioms. But are these idioms completely in the style of Java?

Another option

Let's start from what we would write in Java today.

 public interface Convertor {
  int convert(String str, Date date);
 }

The advantage of this is that everyone knows and understands it. It's part of the lingua franca that is Java. Now let's examine what we could do with this.

Firstly, we need to remember that it is possible to define a class or interface such as this one within a method in Java today. The scope of such a class is the scope of the method. This will come in useful later.
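
A minimal sketch of that method-level scoping, using a local class so that it compiles on any Java version (local interfaces are only accepted by the compiler from Java 16 onwards). The names LocalScopeSketch and LengthConvertor are hypothetical, and the conversion logic is arbitrary.

```java
import java.util.Date;

interface Convertor {
    int convert(String str, Date date);
}

final class LocalScopeSketch {

    static int convertOnce(String str, Date date) {
        // A local class: it is only visible inside this method body.
        class LengthConvertor implements Convertor {
            public int convert(String s, Date d) {
                return s.length();
            }
        }
        return new LengthConvertor().convert(str, date);
    }

    // LengthConvertor cannot be referenced here; its scope ended with convertOnce().
}
```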

So, let's examine what would happen if we start shortening the interface definition. For a function type equivalent, we know that there is only one method. As such, there isn't really any need for the braces:

 public interface Convertor int convert(String str, Date date);

Now, let's consider that for a function type equivalent, the method name is pre-defined as 'invoke'. As such, there is no need to include the method name:

 public interface Convertor int(String str, Date date);

Now, let's consider that for a function type equivalent, the parameter names are unimportant. As such, let's remove them (or maybe make them optional):

 public interface Convertor int(String, Date);

And that's it. I'm calling this a lightweight interface for now.

They represent a reasonable reduction of the code necessary to define a named single method interface. The syntax would be allowed anywhere an existing interface could be defined. This includes its own source file, nested in another class or interface, or locally scoped within a method. This is a longer example:

 // with function types (FCM syntax)
 public int process() {
  #(int(int, int)) add = #(int a, int b) {return a + b;};
  #(int(int, int)) mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }
 
 // with named lightweight interfaces - solution A
 interface Adder int(int,int);
 interface Multiplier int(int,int);
 public int process() {
  Adder add = #(int a, int b) {return a + b;};
  Multiplier mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }
 
 // with named lightweight interfaces - solution B
 public int process() {
  interface MathsCombiner int(int,int);
  MathsCombiner add = #(int a, int b) {return a + b;};
  MathsCombiner mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }

Solution A shows how you might define one lightweight interface for each operation. Solution B shows how you might define just one lightweight interface. It also shows that the lightweight interface could be defined locally within the same method.

What we have gained is a name for the function type. It is now possible to write Javadoc for it and hyperlink to it in your IDE.

And it can be quickly and easily grasped as simply a shorthand way of defining a single method interface. In fact, you would be able to use this anywhere in your code as a normal single method interface, implementing it using a normal class, or extending it as required.
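
To make that concrete, here is a sketch of solution B written in Java as it exists today, with the lightweight interface spelled out in full. It assumes the implied method name is invoke, as described above; the class names LightweightInterfaceSketch and Adder are invented for the example.

```java
// The full form that 'interface MathsCombiner int(int,int);' would be shorthand for.
interface MathsCombiner {
    int invoke(int a, int b);
}

final class LightweightInterfaceSketch {

    // Implemented by a normal named class...
    static final class Adder implements MathsCombiner {
        public int invoke(int a, int b) {
            return a + b;
        }
    }

    static int process() {
        MathsCombiner add = new Adder();
        // ...or by an anonymous inner class, where an inner method would go.
        MathsCombiner mul = new MathsCombiner() {
            public int invoke(int a, int b) {
                return a * b;
            }
        };
        return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
    }
}
```

Here process() multiplies 2 + 3 by 3 + 4, giving 35.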

It's also possible to imagine IDE refactorings that would convert a lightweight interface to a full interface if you needed to add additional methods. Or to convert a single method interface to the lightweight definition.

Of course it would be possible to take this further by eliminating the name:

 // example showing what is possible, I'm not advocating this!
 public int process() {
  interface int(int,int) add = #(int a, int b) {return a + b;};
  interface int(int,int) mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }

However, the developer must now mentally parse both the lines "interface int(int,int)" to see if they are the same type. Previously, they could just see that they were both "MathsCombiner". As such, I prefer keeping the name, and requiring developers to take the extra step.

I see this as an example of where the style of Java differs from other more dynamic languages. In Java you always define your types up front. In more dynamic languages, you often just code the closure. As this concept requires defining types up front, I might suggest it is more in the Java style.

Final example

One final example is from my last blog post, this time in BGGA syntax:

 // Example functional programming style method using BGGA syntax
 public <T, U> {T => U} converter({=> T} a, {=> U} b, {T => U} c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }
 
 // The same using lightweight interfaces
 interface Factory<C> C();
 interface Transformer<I, O> O(I);
 public <T, U> Transformer<T, U> converter(Factory<T> a, Factory<U> b, Transformer<T, U> c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }

Personally, I find the latter to be much more readable, even though it involves more code. Both Factory and Transformer can be defined once, probably in the JDK or framework, and have associated documentation.
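
For comparison, the same method can be sketched in Java as it stands today, using ordinary single method interfaces and an anonymous inner class. Factory and Transformer are written out in full with an assumed invoke method; ConverterSketch and lengthOrDefault are hypothetical names added only to show a call.

```java
interface Factory<C> {
    C invoke();
}

interface Transformer<I, O> {
    O invoke(I input);
}

final class ConverterSketch {

    // If the input equals the value produced by 'a', use the value from 'b';
    // otherwise fall back to transformer 'c'.
    static <T, U> Transformer<T, U> converter(
            final Factory<T> a, final Factory<U> b, final Transformer<T, U> c) {
        return new Transformer<T, U>() {
            public U invoke(T t) {
                return a.invoke().equals(t) ? b.invoke() : c.invoke(t);
            }
        };
    }

    // Example wiring: "n/a" maps to -1, anything else maps to its length.
    static Transformer<String, Integer> lengthOrDefault() {
        return converter(
            new Factory<String>() {
                public String invoke() { return "n/a"; }
            },
            new Factory<Integer>() {
                public Integer invoke() { return -1; }
            },
            new Transformer<String, Integer>() {
                public Integer invoke(String s) { return s.length(); }
            });
    }
}
```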

In addition, if I'd never seen the code before, I'd much prefer to be assigned to maintain the latter code with lightweight interfaces. Perhaps that is the key to Java's success - code that can be maintained. Write once. Read many times.

Thanks

Finally, I should note that some of the inspiration for this idea came from blogs and documents by Remi Forax and Casper Bang.

Summary

I've outlined an alternative to function types that keeps a key Java element - the type and its name. Lightweight interfaces are easy and quick to code if you don't want to document, but have the capacity to grow and be full members of the normal Java world if required.

I'd really love to hear opinions on this. It seems like a great way to balance the competing forces, but what do you think?

 

PS. Don't forget to vote for FCM at the java.net poll!

Tuesday, 19 February 2008

Evaluating BGGA closures

The current vote forces us all to ask what proposal is best for the future of Java. Personally, I don't find the BGGA proposal persuasive (in its current form). But exactly why is that?

Touchstone debate

The closures debate is really a touchstone for two other, much broader, debates.

The first is "what is the vision for Java". There has been no single guiding vision for Java through its life, unlike other languages:

In the C# space, we have Anders. He clearly "owns" language, and acts as the benevolent dictator. Nothing goes into "his" language without his explicit and expressed OK. Other languages have similar personages in similar roles. Python has Guido. Perl has Larry. C++ has Bjarne. Ruby has Matz.

The impact of this lack of guiding vision is a sense that Java could be changed in any way. We really need some rules and a guiding Java architect.

The second touchstone debate is "can any language changes now be successfully implemented in Java". This is generally a reference to generics, where erasure and wildcards have often produced havoc. In the votes at Javapolis, 'improving generics' got the highest vote by far.

The result of implementing generics with erasure and wildcards has been a loss of confidence in Java language change. Many now oppose any and all change, however simple and isolated. This is unfortunate, and we must ensure that the next batch of language changes work without issues.

Despite these broader debates that surround closures, we must focus on the merits of the individual proposals.

Evaluating BGGA

BGGA would be a very powerful addition to Java. It contains features that I, as a senior developer, could make great use of should I choose to. It is also a well written and thought out proposal, and is the only proposal tackling some key areas, such as exception transparency, in detail.

However, my basic issue with BGGA is that the resulting proposal doesn't feel like Java to me. Unfortunately, 'feel' is highly subjective, so perhaps we can rationalise this a little more.

Evaluating BGGA - Statements and Expressions

Firstly, BGGA introduces a completely new block exit strategy to Java - last-line-no-semicolon expressions. BGGA uses them for good reasons, but I believe that these are very alien to Java. So where do they come from? Well consider this piece of Java code, and its equivalent in Scala:

 // Java
 String str = null;
 if (someBooleanMethod()) {
  str = "TRUE";
 } else {
  str = "FALSE";
 }
 
 // Scala
 val str = if (someBooleanMethod()) {
  "TRUE"
 } else {
  "FALSE"
 }

There are two points to note. The first is that Scala does not use semicolons at the end of a line. The second, and more important, point is that the last-line-expression from either the if or the else clause is assigned to the variable str.

The fundamental language level difference going on here is that if in Java is a statement with no return value. In Scala, if is an expression, and the result from the blocks can be assigned to a variable if desired. More generally, we can say that Java is formed from statements and expressions, while in Scala everything can be an expression.

The result of this difference is that in Scala it is perfectly natural for a closure to use the concept of last-line-no-semicolon to return from a closure block, because that is the standard, basic language idiom. All blocks in Scala have the concept of last-line-no-semicolon expression return.

This is not the idiom in Java.

Java has statements and expressions as two different program elements. In my opinion, BGGA tries to force an expression only style into the middle of Java. The result is a horrible mixture of styles.
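
The split between the two kinds of program element can be seen in a small sketch. The closest Java comes to an expression-valued if is the conditional operator, and even that only accepts expressions in its branches, never blocks of statements. The class and method names here are made up for illustration.

```java
final class ConditionalExpressionSketch {

    // Statement form: 'if' produces no value, so the variable is assigned inside each branch.
    static String viaStatement(boolean flag) {
        String str;
        if (flag) {
            str = "TRUE";
        } else {
            str = "FALSE";
        }
        return str;
    }

    // Expression form: '?:' yields a value, but each branch must be a single expression.
    static String viaExpression(boolean flag) {
        return flag ? "TRUE" : "FALSE";
    }
}
```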

Evaluating BGGA - Non local returns

A key reason for BGGA using an alternate block exit mechanism is to meet Tennent's Correspondence Principle. The choices made allow BGGA to continue using return to mean 'return from the enclosing method' whilst within a closure.

There is a problem with using return in this way however. Like all the proposals, you can take a BGGA closure, assign it to a variable and store it for later use. But, if that closure contains a return statement, it can only be successfully invoked while the enclosing method still exists on the stack.

 public class Example {
  ActionListener buttonPressed = null;
  public void init() {
   buttonPressed = {ActionEvent ev => 
     callDatabase();
     if (isWeekend()) {
      queueRequest();
      return;
     }
     processRequest();
   };
  }
 }

In this example, the init() method will be called at program startup and create the listener. The listener will then be called later when a button is pressed. The processing will call the database, check if it is the weekend and then queue the request. It will then try to return from the init() method.

This obviously can't happen, as the init() method is long since completed. The result is a NonLocalTransferException - an unusual exception which tries to indicate to the developer that they made a coding error.

But this is a runtime exception.

It is entirely possible that this code could get into production like this. We really need to ask ourselves if we want to change the Java programming language such that we introduce new runtime exceptions that can be easily coded by accident.

As it happens, the BGGA specification includes a mechanism to avoid this, allowing the problem to be caught at compile time. Their solution is to add a marker interface RestrictedFunction to those interfaces where this might be a problem. ActionListener would be one such interface (hence the example above would not compile - other examples would though).

This is a horrible solution however, and doesn't really work. Firstly, the name RestrictedFunction and it being a marker interface are two design smells. Secondly, this actually prevents the caller from using ActionListener with return if they actually want to do so (within the same enclosing method).

The one final clue about non local returns is the difficulty in implementing them. They have to be implemented using the underlying exception mechanism. While this won't have any performance implications, and won't be visible to developers, it is another measure of the complexity of the problem.
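
The failure mode can be imitated in plain Java, which may help to show why it only surfaces at runtime. This is a sketch under stated assumptions, not how the BGGA compiler works: ReturnSignal is a hypothetical exception standing in for the control-transfer machinery, and the frame token is an invented way of identifying one invocation of the enclosing method.

```java
final class NonLocalReturnSketch {

    // Hypothetical marker exception; not BGGA's NonLocalTransferException.
    static final class ReturnSignal extends RuntimeException {
        final Object frame;

        ReturnSignal(Object frame) {
            super("non-local return requested");
            this.frame = frame;
        }
    }

    // Plays the role of the field that holds the listener in the example above.
    static Runnable saved;

    static String enclosing(boolean runNow) {
        final Object frame = new Object();  // identifies this particular invocation
        Runnable closure = new Runnable() {
            public void run() {
                throw new ReturnSignal(frame);  // models 'return;' written inside the closure
            }
        };
        saved = closure;
        try {
            if (runNow) {
                closure.run();
            }
            return "completed normally";
        } catch (ReturnSignal signal) {
            if (signal.frame != frame) {
                throw signal;  // belongs to some other invocation
            }
            return "returned early via closure";
        }
    }

    // Invokes the stored closure after its enclosing method has finished.
    static boolean lateInvocationFails() {
        enclosing(false);
        try {
            saved.run();
            return false;
        } catch (ReturnSignal signal) {
            return true;  // no enclosing invocation is left on the stack to honour it
        }
    }
}
```

While enclosing() is still running, the signal is caught and turned into an early return. Once it has finished, the same closure has nowhere to return to, and the caller just sees an exception.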

In my opinion, allowing non local returns in this way will cause nothing but trouble in Java. Developers are human, and will easily make the mistake of using return when they shouldn't. The compiler will catch some cases, with RestrictedFunction, but not others, which will act as a further level of confusion.

Evaluating BGGA - Functional programming

Java has always laid claim to be an OO language. In many ways, this has often been a dubious claim, especially with static methods and fields. Java has never been thought of as a functional programming language however.

BGGA introduces function types as a key component. These introduce a considerable extra level of complexity in the type system.

A key point is that at the lowest compiler level, function types are no different to ordinary, one method, interfaces. However, this similarity is misleading, as at the developer level they are completely different. They look nothing like normal interfaces, which have a name and javadoc, and they are dynamically generated.

Moreover, function types can be used to build up complex higher-order function methods. These might take in a function type parameter or two, often using generics, and perhaps returning another function type. This can lead to some very complicated method signatures and code.

 public <T, U> {T => U} converter({=> T} a, {=> U} b, {T => U} c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }

It should be noted that FCM also includes function types, which are named method types. However, both Stefan and I struggle with the usability of them, and they may be removed from a future version of FCM. It is possible to have FCM inner methods, method references and method literals without the need for method types.

Function types have their uses, however I am yet to be entirely convinced that the complexity is justified in Java. Certainly, they look very alien to Java today, and they enable code which is decidedly hard to read.

Evaluating BGGA - Syntax

The syntax is the least important reason for disliking BGGA, as it can be altered. Nevertheless, I should show an example of hard to read syntax.

 int y = 6;
 {int => boolean} a = {int x => x <= y};
 {int => boolean} b = {int x => x >= y};
 boolean c = a.invoke(3) && b.invoke(7);

Following =>, <= and >= can get very confusing. It is also a syntax structure for parameters that is alien to Java.

Summary

As co-author of the FCM proposal it is only natural that I should want to take issue with aspects of competitor proposals. However, I do so for one reason, and that is to ensure that the changes included in Java 7 really are the right ones. Personally, I believe that BGGA attempts something - full closures - that simply isn't valid or appropriate for Java.

It remains my view that the FCM proposal, with the optional JCA extension, provides a similar feature set but in a more natural, safe, Java style. If you've read the proposals and are convinced, then please vote for FCM at java.net.

Opinions welcome as always :-)

References

For reference, here are the links to all the proposals if you want to read more:

  • BGGA - full closures for Java
  • CICE - simplified inner classes (with the related ARM proposal)
  • FCM - first class methods (with the related JCA proposal)

Sunday, 17 February 2008

Vote for FCM!

Java.net is currently running a poll on closures, to get a feel for the strength of support for each proposal. Obviously, this poll has no power, but it is useful to see at a high level what the community's opinion is.

Personally, I'm really pleased with the level of support FCM has had in the vote so far, on this blog and privately. Now I'd like to encourage you, if you so desire, to vote and support FCM. Thanks!

Update: If you want to compare the three proposals, please take a look at my previous comparison articles - one method callbacks, control structures and type inference.

Friday, 25 January 2008

Closures - Comparing closure type inference

In this blog I'm going to compare the three principal 'closure' proposals based on how they infer types. This follows my previous comparison blogs.

Here are the links to the proposals if you want to read more:

  • BGGA - full closures for Java
  • CICE - simplified inner classes (with the related ARM proposal)
  • FCM - first class methods (with the related JCA proposal)

Type inference

Type inference is where the compiler works out the type of a given expression without the programmer needing to explicitly state the type. For example:

 String str = "Hello";

In the above, the type of str is known to be a String because it is stated by the developer. However, there is no actual need for the developer to manually state the type of 'String'. Instead, the compiler (not the Java 1.6 compiler) could work it out itself by inferring the type from the literal "Hello" - hence 'type inference'.
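
Java does already perform one limited kind of inference, for the type arguments of generic method calls. The sketch below uses only standard library methods; the class name InferenceSketch is made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class InferenceSketch {

    static List<String> greetings() {
        // T is inferred as String from the arguments; no <String> is written on the call.
        return Arrays.asList("Hello", "World");
    }

    static List<Integer> none() {
        // T is inferred as Integer from the expected return type alone.
        return Collections.emptyList();
    }
}
```

In both cases the developer still states the type of the variable or return value. It is only the generic argument on the method call that the compiler works out.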

Closures

Let's consider how the closures proposals allow some limited type inference in Java. Consider this single method interface:

 public interface PairMatcher<T,U> {
   boolean matches(T arg1, U arg2);
 }

The purpose of the interface is to allow developers to write a callback that takes two arguments and returns true if they 'match' some criteria that is important to the developer. Here is an example of such a framework method, that searches through two lists and returns the matched entry:

 public <T,U> T find(List<T> list1, List<U> list2, PairMatcher<T,U> matcher) {
   for (T t : list1) {
     for (U u : list2) {
       if (matcher.matches(t, u)) {
         return t;
       }
     }
   }
   return null;
 }

And here is how calling this code looks today using an inner class:

 List<String> stringList = ...
 List<Integer> integerList = ...
 String str = find(stringList, integerList, new PairMatcher<String,Integer>() {
   public boolean matches(String str, Integer val) {
     return (str != null && str.length() > 0 && val != null && val > 10);
   }
 });
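
Put together as one compilable unit, the inner class version might look like the sketch below. The interface and the find method follow the snippets above, with find made static and given its type parameters; FindSketch and firstMatch are hypothetical names added so the example can be called.

```java
import java.util.List;

interface PairMatcher<T, U> {
    boolean matches(T arg1, U arg2);
}

final class FindSketch {

    // Returns the first element of list1 that matches any element of list2, or null.
    static <T, U> T find(List<T> list1, List<U> list2, PairMatcher<T, U> matcher) {
        for (T t : list1) {
            for (U u : list2) {
                if (matcher.matches(t, u)) {
                    return t;
                }
            }
        }
        return null;
    }

    // The calling code: both the interface name and its generic arguments are spelled out.
    static String firstMatch(List<String> stringList, List<Integer> integerList) {
        return find(stringList, integerList, new PairMatcher<String, Integer>() {
            public boolean matches(String str, Integer val) {
                return str != null && str.length() > 0 && val != null && val > 10;
            }
        });
    }
}
```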

BGGA

With BGGA, the code can be shortened to use a closure:

 List<String> stringList = ...
 List<Integer> integerList = ...
 String str = find(stringList, integerList, {String str, Integer val =>
   (str != null && str.length() > 0 && val != null && val > 10)
 });

Obviously the code is shorter. But it is important to understand what has occurred. The type PairMatcher, and its generic arguments, have been inferred. This happened as the BGGA compiler identified which method was being called. In order to call the method, a 'closure conversion' occurred, which changed the closure to a PairMatcher with the correct generics.

So, not only is the type inferred, but the generic arguments are as well.

FCM

FCM can be used, like BGGA, to shorten the original:

 List<String> stringList = ...
 List<Integer> integerList = ...
 String str = find(stringList, integerList, #(String str, Integer val) {
   return (str != null && str.length() > 0 && val != null && val > 10);
 });

Exactly the same conversion and type inference is occurring as in BGGA. Thus, for this scenario of type inference, FCM and BGGA have the same power, just different syntax.

CICE

CICE also provides a means to shorten the original:

 List<String> stringList = ...
 List<Integer> integerList = ...
 String str = find(stringList, integerList, PairMatcher<String,Integer>(String str, Integer val) {
   return (str != null && str.length() > 0 && val != null && val > 10);
 });

I hope that this example makes it clear that there is no type inference here as there was in FCM and BGGA. Instead, the developer still has to type the interface name and, more significantly, the generic arguments.

While just typing the interface name might be regarded as a documentation benefit by some, I would suggest that it is very hard to justify the retyping of the generic arguments. Clearly, for this scenario of type inference, CICE has less power than FCM and BGGA.

Comparison

Both BGGA and FCM support the same style of type inference for the single method interface and its generic arguments. CICE does not, and with generics that results in quite a verbose result for the developer to type.

Summary

There are two basic approaches here - infer or be explicit. My view is that this is an area where inference is really needed. With CICE, I find it hard to see how this is really much of a gain over an inner class.

Opinions welcome as always :-)

Tuesday, 1 January 2008

Closures - Comparing control structures of BGGA, ARM and JCA

In this blog I'm going to compare the second major part of the three principal 'closure' proposals - control structures. This follows my previous blog where I compared one method callbacks.

Here are the links to the proposals if you want to read more:

  • BGGA - full closures for Java
  • CICE - simplified inner classes (with the related ARM proposal)
  • FCM - first class methods (with the related JCA proposal)

Comparing 'closure' 'control structure' proposals

Firstly, what do we mean by control structure problems? Well, there are three classic use cases:

 // resource management
 try {
   // open resource
   // use resource
 } finally {
   // close resource
 }

 // locking
 try {
   // lock
   // process within the lock
 } finally {
   // unlock
 }

 // looping around a map
 for (Map.Entry<String,Integer> entry : map.entrySet()) {
   String key = entry.getKey();
   Integer value = entry.getValue();
   // process map
 }

Each of the three principal closure proposals provides a solution to some or all of these issues.

BGGA

BGGA treats control structures as simply an alternative way to invoke a method. This allows any developer to write control structures, something referred to as library-defined control structures.

Thus, with BGGA, you can write a method to manage a lock. This can be called in two ways - either by passing the closure as a parameter, or by following the method call by the block.

 // library defined control structure
 public void withLock(Lock lock, {=>void} block) {
   lock.lock();
   try {
     block.invoke();
   } finally {
     lock.unlock();
   }
 }

 // call using closure as parameter
 withLock(lock, {=>
   // process within the lock
 });

 // call using control invocation syntax
 withLock(lock) {
   // process within the lock
 }

Whichever style is used, the meaning of return and this is the same. The return keyword will return from the lexically enclosing method, which is entirely sensible for control invocation. The this keyword also refers to the lexically enclosing class.
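
For contrast, here is a sketch of the nearest thing that can be written in Java today, passing a Runnable where BGGA would take a closure. WithLockSketch and demo are hypothetical names. Note that inside the anonymous class a return would only leave run(), and this would refer to the Runnable, which is exactly the difference from the BGGA semantics described above.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class WithLockSketch {

    // Library method: runs the block with the lock held and always releases it.
    static void withLock(Lock lock, Runnable block) {
        lock.lock();
        try {
            block.run();
        } finally {
            lock.unlock();
        }
    }

    // Returns {lock held inside the block, lock held after the call}.
    static boolean[] demo() {
        final ReentrantLock lock = new ReentrantLock();
        final boolean[] observed = new boolean[2];
        withLock(lock, new Runnable() {
            public void run() {
                observed[0] = lock.isHeldByCurrentThread();
            }
        });
        observed[1] = lock.isHeldByCurrentThread();
        return observed;
    }
}
```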

CICE/ARM

CICE does not support control structures. Instead, this area is covered by the ARM proposal.

The basic idea behind the ARM proposal is that control structures should only be added by language designers, and should not be able to be added to libraries.

The ARM proposal itself is merely an outline proposal and does not have detail. However, Josh Bloch has indicated that he would see ARM tackling specific issues, notably resource management and locking.

 // resource management
 try (BufferedReader r = new BufferedReader(new FileReader(file))) {
   // process resource
 }

 // locking
 protected(lock) {
   // process within the lock
 }

As this is a language change, it has to wait until a new version of Java is released. In addition, the meaning of return and this is naturally as per any other language defined control statement, such as a normal try block.
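
The resource management form is close to the try-with-resources statement that later shipped in Java 7, so it can be sketched as runnable code. The names ResourceSketch and Tracked are invented for the example. The protected(lock) form is not shown, since the Java language has no such statement.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

final class ResourceSketch {

    // A tiny resource that records whether close() was called.
    static final class Tracked implements AutoCloseable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    // The resource is closed automatically when the block exits.
    static boolean closedAfterBlock() {
        Tracked tracked = new Tracked();
        try (Tracked r = tracked) {
            // use the resource here
        }
        return tracked.closed;
    }

    // Same shape as the BufferedReader example, reading from a string instead of a file.
    static String firstLine(String text) {
        try (BufferedReader r = new BufferedReader(new StringReader(text))) {
            return r.readLine();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```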

FCM/JCA

FCM does not support control structures. Instead, this area is covered by the JCA proposal.

The basic idea behind the JCA proposal is that control structures can be added by anyone, but that they should clearly be separated from the one method callback cases. The aim of the separation is to send a message to developers - adding a control structure in a library is something that should be undertaken with care.

 // library defined control structure
 public void withLock(#(void()) block : Lock lock) {
   lock.lock();
   try {
     block.invoke();
   } finally {
     lock.unlock();
   }
 }

 // call
 withLock(lock) {
   // process within the lock
 }

It should be noted that this is similar to the BGGA design. The calling code is identical to BGGA's control invocation syntax. However, you cannot pass the block of code as a parameter to the method as you can in BGGA - there is only one calling syntax.

The library defined control structure is very different however. In JCA, the library method has a special syntax with the block before the colon, and the input parameters after the colon. This matches the caller's syntax, and clearly indicates that this is a 'special' kind of method.

Again, the meaning of return and this is lexically scoped, as with BGGA and ARM.

Comparison

Each of the three approaches has merits.

BGGA provides a full-featured approach, where a closure is treated as it would be in most other languages. Control structures can be written in libraries, and the control invocation syntax is just sugar for a normal method call.

ARM provides a use-case based approach. It solves specific problems with specific language changes. This doesn't allow ordinary developers to add control structures, but does provide valuable, reliable, enhancements.

JCA sits between BGGA and ARM, but because it allows library-defined control structures it is closer to BGGA. It differs from BGGA in that it treats control structures as effectively a separate language change to one method callbacks. By doing this, and with specific syntax, it aims to slightly discourage the extensive use of library-defined control structures.

Summary

There are two basic approaches to control structures - let language designers write them, e.g. ARM, or let anyone write them, e.g. BGGA or JCA. The BGGA and JCA proposals differ mainly on syntax, where JCA is using the syntax to warn against excessive or inappropriate use of control structures.

My own view is that everyone should have the right to write control structures, and that it will allow the Java language to keep fresh without lots of pressure on Sun to add new language features.

However, the right to add control structures comes with responsibilities. If everyone writes control structures all the time, then the resulting code might look more like a dialect of Java than Java itself. That is why I favour JCA, which aims through syntax and documentation to encourage serious consideration before writing a control structure.

Opinions welcome as always :-)

Tuesday, 18 December 2007

Closures - Comparing the core of BGGA, CICE and FCM

In this blog I'm going to compare the core of the three principal 'closure' proposals. This is particularly apt following the recent surge in interest after Josh Bloch's Javapolis talk.

Comparing 'closure' proposals

The three principal closure proposals are:

  • BGGA - full closures for Java
  • CICE - simplified inner classes (with the related ARM proposal)
  • FCM - first class methods (with the related JCA proposal)

It is easy to get confused when evaluating these competing proposals. So much new syntax. So many new ideas. For this blog I'll summarise, and then deep dive into one area.

The first key point that I have been making recently is that closures is not 'all or nothing' - there are parts to the proposals that can be implemented separately. This table summarises the 5 basic parts in the proposals (I've split the last one into two rows):

                                   BGGA                      CICE+ARM   FCM+JCA
Literals for reflection            -                         -          FCM member literals
References to methods              -                         -          FCM method references
One method callback classes        BGGA closures             CICE       FCM inner methods
Function types                     BGGA function types       -          FCM method types
Library based Control Structure    BGGA control invocation   -          JCA control abstraction
Language based Control Structure   -                         ARM        -

Of course a table like this doesn't begin to do justice to any of the three proposals. Or to express the possible combinations (for example, BGGA could add support for the first two rows easily). So, instead of talking generally, let's examine a key use case where all 'closure' proposals work.

One method callbacks

This issue in Java refers to the complexity and verbosity of writing an anonymous inner class with one method. These are typically used for callbacks - where the application code passes a piece of logic to a framework for it to be executed later in time. The classic example is the ActionListener:

  public void init() {
    button.addActionListener(new ActionListener() {
      public void actionPerformed(ActionEvent ev) {
        // handle the event
      }
    });
  }

This registers a callback that will be called by the framework whenever the button is pressed. We notice that the code in the listener will typically be called long after the init() method has completed. While this is an example from Swing, the same design appears all over Java code.

Some of the issues with this code are immediately obvious. Some are not.

  1. The declaration of the listener is very verbose. It takes a lot of code to define something that is relatively simple, and importantly, that makes it harder to read.
  2. Secondly, information is duplicated that could be inferred by the compiler. It's not very DRY.
  3. The scope of the code inside actionPerformed() is different to that of the init() method. The this keyword has a different meaning.
  4. Any variables and methods from the interface take precedence over those available in init(). For example, toString(), with any number of parameters, will always refer to the inner class, not the class that init() is declared in. In other words, the illusion that the code in actionPerformed() has full access to the code visible in init() is just that - just an illusion.
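
Issues 3 and 4 can be demonstrated with a small sketch in today's Java. The names ScopeDemo and Callback are made up for illustration; the behaviour shown is that of ordinary anonymous inner classes.

```java
final class ScopeDemo {

    interface Callback {
        String call();
    }

    @Override
    public String toString() {
        return "outer";
    }

    // An unqualified toString() inside the inner class resolves to the inner class.
    Callback unqualified() {
        return new Callback() {
            @Override
            public String toString() {
                return "inner";
            }

            public String call() {
                return toString();  // same as this.toString(), where 'this' is the Callback
            }
        };
    }

    // Reaching the enclosing instance needs the explicit ScopeDemo.this qualifier.
    Callback qualified() {
        return new Callback() {
            @Override
            public String toString() {
                return "inner";
            }

            public String call() {
                return ScopeDemo.this.toString();
            }
        };
    }
}
```

The same lookup rule applies even without the override, because every anonymous class inherits toString() from Object. The inner class is searched first, and overloads declared only in the enclosing class are not considered.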

The three closure proposals differ in how they tackle this problem area. BGGA introduces full closures. These eliminate all the problems above, but introduce new issues with the meaning of return. CICE introduces a shorthand way of creating an inner class. This solves the first issue, and some of the second, but does not solve issue 3 or 4. FCM introduces inner methods. These eliminate all the problems above, and add no new surprises.

  // BGGA
  public void init() {
    button.addActionListener({ActionEvent ev =>
      // handle the event
    });
  }
  // CICE
  public void init() {
    button.addActionListener(ActionListener(ActionEvent ev) {
      // handle the event
    });
  }
  // FCM
  public void init() {
    button.addActionListener(#(ActionEvent ev) {
      // handle the event
    });
  }

In this debate, it is easy to get carried away by syntax, and say that one of these might look 'prettier', 'more Java' or 'ugly'. Unfortunately, the syntax isn't that important - it's the semantics that matter.

BGGA and FCM have semantics where this within the callback refers to the surrounding class, exactly as per code written directly in the init() method. This is emphasised by removing all signs of the inner class. The scope confusion of inner classes (issues 3 and 4 above) disappears (as this is 'lexically scoped').

CICE has semantics where this within the callback still refers to the inner class. Only now you can't even see the inner class; it's really well hidden. This simply perpetuates the scope confusion of inner classes (issues 3 and 4 above) and gains us little other than less typing. In fact, as the inner class is more hidden, this might actually be more confusing than today.

BGGA also has semantics where a return within the callback will return from init(). This generally isn't what you want, as init() will be long gone when the button press occurs. (It should be noted that this feature is a major gain to the overall power of Java, but comes with a high complexity factor.)

FCM and CICE have semantics where return within the callback returns from the callback, as in an inner class. This enables easy conversion of code from today's inner classes. (For FCM, the return is considered to be back to the # symbol, a simple rule to learn.)
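
The inner class behaviour that FCM and CICE preserve can be checked in current Java. This is a minimal sketch under stated assumptions: the Button class below is a hypothetical stand-in for a Swing button that just stores one listener and fires it on request, and ReturnDemo is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;

final class ReturnDemo {
    // Hypothetical stand-in for a Swing button: stores a callback, runs it later.
    static final class Button {
        private Runnable listener;

        void addListener(Runnable listener) {
            this.listener = listener;
        }

        void press() {
            listener.run();
        }
    }

    static final List<String> log = new ArrayList<String>();

    static void init(Button button, final boolean skip) {
        button.addListener(new Runnable() {
            @Override
            public void run() {
                if (skip) {
                    return; // leaves run() only; init() completed long ago
                }
                log.add("handled");
            }
        });
        log.add("init done");
    }

    public static void main(String[] args) {
        Button button = new Button();
        init(button, false);
        button.press();
        System.out.println(log); // prints [init done, handled]
    }
}
```

The callback only runs once press() is called, by which time init() has already returned, so there is nothing left for a return to init() to go back to.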

And here is my personal summary of this discussion:

  // BGGA
  public void init() {
    button.addActionListener({ActionEvent ev =>
      // return will return to init() - BAD for many (most) use cases
      // this means the same as within init() - GOOD and simple
    });
  }
  // CICE
  public void init() {
    button.addActionListener(ActionListener(ActionEvent ev) {
      // return will return to caller of the callback - GOOD for most use cases
      // this refers to the inner class of ActionListener - BAD and confusing
    });
  }
  // FCM
  public void init() {
    button.addActionListener(#(ActionEvent ev) {
      // return will return to caller of the callback - GOOD for most use cases
      // this means the same as within init() - GOOD and simple
    });
  }

Summary

This blog has evaluated just one part of the closures proposals. And this part could be implemented without any of the other parts if so desired. (BGGA would probably find it trickiest to break out just the concept discussed, but it could be done)

The point is that there are real choices to be made here. The simplicity and understandability of Java must be maintained. But that doesn't mean standing still.

Once again, I'd suggest FCM has the right balance here, and that anyone interested should take a look. Perhaps the right new features for Java are method references and inner methods, and no more?

Opinions welcome as always :-)

Tuesday, 6 November 2007

Implementing FCM - part 1

It's been a long coding weekend, but after much head scratching, I might finally be getting the hang of changing the Java compiler. My goal? To implement First-Class Methods (FCM).

Implementing FCM

Currently, on my home laptop I have an altered version of javac, originally sourced from the KSL project, which supports method literals (section 2.1 of the spec). What this means in practice is that you can write:

  Method m = String#substring(int,int);

This will now compile and return the correct Method object. The left hand side can be a reference to an inner class such as Map.Entry without problem. It will also validate the standard access rules, preventing you from referencing a private/package/protected method when you shouldn't.
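
For comparison, here is roughly what has to be written today to get hold of the same Method object through reflection. This is a sketch of the long-hand form only, not the code that my altered compiler generates, and the class name MethodLiteralDemo is made up for illustration.

```java
import java.lang.reflect.Method;

final class MethodLiteralDemo {
    // Long-hand counterpart of: Method m = String#substring(int,int);
    static Method substringMethod() {
        try {
            return String.class.getMethod("substring", int.class, int.class);
        } catch (NoSuchMethodException ex) {
            // With reflection, a misspelt name or wrong parameter list
            // only shows up here, at runtime.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) throws Exception {
        Method m = substringMethod();
        System.out.println(m.invoke("closures", 0, 5)); // prints "closu"
    }
}
```

The method name is passed as a string and the lookup can fail at runtime, which is the boilerplate a method literal is meant to remove.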

In fact, the implementation is currently more generous than this, as the left hand side can also be a variable. This is needed for later phases of FCM, but isn't really appropriate for a method literal.

Making a language change like this is definitely intimidating though, mostly because the compiler feels like such a big scary beast. The first two parts - Scanner and Parser - are actually pretty easy. Once those are done you have an AST (Abstract Syntax Tree) representing your code. In this case that involves a new node, currently called 'MethodReference'.

After that it gets more confusing. The main piece of work should be in Lower, which is where syntax sugar is converted to 'normal' Java. For example the new foreach loop is converted to the old one here. Since method literals are just syntax sugar, Lower is where the change goes.
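
To illustrate the kind of rewriting that happens in Lower, the two methods below behave identically. The second is approximately the 'old' form that a foreach over an Iterable is turned into. The exact generated code and its temporary names are javac internals that I am not reproducing here, and the class name ForeachDesugarDemo is made up.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ForeachDesugarDemo {
    // The sugared form, as written by the developer.
    static int sumSugared(List<Integer> values) {
        int total = 0;
        for (Integer value : values) {
            total += value;
        }
        return total;
    }

    // Roughly the equivalent 'old' loop using an explicit Iterator.
    static int sumDesugared(List<Integer> values) {
        int total = 0;
        for (Iterator<Integer> it = values.iterator(); it.hasNext(); ) {
            Integer value = it.next();
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println(sumSugared(values));   // prints 6
        System.out.println(sumDesugared(values)); // prints 6
    }
}
```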

Unfortunately, to get to Lower, you also have to fight with Attr, which is where the types get allocated, checked and validated. That was a real head scratcher, and I'm sure a compiler expert would laugh at my final code. It does work though, and after all that's the main thing.

The next step will be to convert 'MethodReference' to 'MemberReference' and enable Field and Constructor literals. After that, the real FCM work begins!

Publishing

So, by now you probably want to have a download link. But I'm not going to provide one. As the code is GPL, I can't just publish a binary; the source has to come too. And personally, I'd rather actually have it in an SVN repo somewhere before I let it escape at all (because that can probably count as the GPL source publication). So, I'm afraid you'll have to be patient :-)

This probably should go in the KSL, but right now I don't know whether it will. The trouble with the KSL is that it still has a few guards and protections around it. And to date, no-one other than Sun has committed there, even in a branch. I'm feeling more comfortable in creating a separate, 'fewer rules', project at sourceforge. In fact, it would be great if other javac hackers like Remi wanted to join too ;-)

One final thing... if anyone wants to help out with this work I'd much appreciate it - scolebourne|joda|org. Compiler hacking isn't easy, but it's a great feeling when you get it working!

Friday, 8 June 2007

Speaking at TSSJS Europe on Closures

I will be speaking on the subject of Java Closures at TSSJS Europe in Barcelona on Wednesday June 27th. I'll be introducing closures, showing what they can do, and comparing the main proposals. It should be a good trip!

Saturday, 12 May 2007

Closures - comparing options impact on language features

At JavaOne, closures were still a hot topic of debate. But comparing the different proposals can be tricky, especially as JavaOne only had BGGA based sessions.

Once again, I will be comparing the three key proposals in the 'closure' space - BGGA from Neal Gafter et al, CICE from Josh Bloch et al, and FCM from Stefan and myself, with the optional JCA extension. In addition I'm including the BGGA restricted variant, which is intended to safeguard developers when the closure is to be run asynchronously, and the ARM proposal for resource management.

In this comparison, I want to compare the behaviour of key program elements in each of the main proposals with regard to lexical scope. By lexical scoping I am referring to which features of the surrounding method and class are available to code within the 'closure'.

Neal Gafter analysed the lexically scoped language constructs in Java. The concept is that 'full closures' should not impact on the lexical scope. Thus any block of code can be surrounded by a closure without changing its meaning. The proposals differ on how far to follow this rule, as I hope to show in the table (without showing every last detail):

Is the language construct lexically scoped when surrounded by a 'closure'?
                      Inner class  CICE      ARM  FCM      JCA      BGGA Restricted  BGGA
  variable names      Part (1)     Part (1)  Yes  Yes      Yes      Part (7)         Yes
  methods             Part (2)     Part (2)  Yes  Yes      Yes      Yes              Yes
  type (class) names  Part (3)     Part (3)  Yes  Yes      Yes      Yes              Yes
  checked exceptions  -            -         Yes  Yes (4)  Yes (4)  Yes (4)          Yes (4)
  this                -            -         Yes  Yes      Yes      Yes              Yes
  labels              -            -         Yes  - (5)    Yes      - (5)            Yes
  break               -            -         Yes  - (5)    Yes      - (5)            Yes
  continue            -            -         Yes  - (5)    Yes      - (5)            Yes
  return              -            -         Yes  - (6)    Yes      - (8)            Yes

(1) Variable names from the inner class hide those from the lexical scope
(2) Method names from the inner class hide those from the lexical scope
(3) Type names from the inner class hide those from the lexical scope
(4) Assumes exception transparency
(5) Labels, break and continue are constrained within the 'closure'
(6) The return keyword returns to the invoker of the inner method
(7) The RestrictedClosure interface forces local variables to be declared final
(8) The RestrictedClosure interface prevents return from compiling within the 'closure'
BTW, if this is inaccurate in any way, I'll happily adjust.
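
Note (1) in the 'Inner class' column can be demonstrated with current Java. This is only a small sketch: the Task base class and the HidingDemo class are made-up names, and the field is there purely to collide with the local variable.

```java
final class HidingDemo {
    // Hypothetical callback base class that happens to declare a field called name.
    static abstract class Task {
        protected String name = "from Task";

        abstract String run();
    }

    static Task create() {
        final String name = "from create()";
        return new Task() {
            @Override
            String run() {
                // Resolves to the inherited field Task.name, not the local variable.
                return name;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(create().run()); // prints "from Task"
    }
}
```

The local variable in create() is silently hidden by the field inherited from Task, so the code inside the inner class does not mean what it would mean if written directly in create().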

So, more 'Yes' markers means the proposal is better, right? Well, not necessarily. In fact, I really don't want you to get that impression from this blog.

What I'm trying to get across is the kind of problems that are being tackled. Beyond that you need to look at the proposals in more detail with examples in order to understand the impact of the Yes/No and the comments.

Let me know if this was helpful (or inaccurate), or if you'd like any other comparisons.

Thursday, 3 May 2007

Closures options - FCM+JCA is my choice

In my last post I drew a diagram showing how I viewed the main proposals in terms of complexity. It has since been pointed out to me that I didn't clearly identify whether the JCA part of FCM was included or not. So I'll redraw the diagram and express my opinion.

The three key proposals in the 'closure' space are - BGGA from Neal Gafter et al, CICE from Josh Bloch et al, and FCM from Stefan and myself. FCM has an optional extension called JCA which adds control abstraction [1].

Control abstraction [1] is that part of the closures language change which allows a user to write an API method which acts, via a static import, as though it is a built in keyword. This is an incredibly powerful facility in a programming language. However, reading many forums, this is a facility that many Java developers also find concerning.

Given this, I'll redraw the diagram from yesterday:

 --------------------------------------------------------------------------
  Simpler                                                     More complex
 --------------------------------------------------------------------------
   No change   Better inner   Method-based   Method-based   Full-featured
                 classes        closures     closures and   closures with
                                             control abst.   control abst.
 --------------------------------------------------------------------------
 |        No control abstraction [1]      |    Control abstraction [1]   |
 --------------------------------------------------------------------------
   No change      CICE*           FCM           FCM+JCA         BGGA
 --------------------------------------------------------------------------

* It should also be noted that Josh Bloch has written the ARM proposal for resource management that tackles a specific control abstraction [1] use case.

So, what is my preference?

Firstly, I believe that Java should be enhanced with FCM-style closures (where the 'return' keyword returns to the invoker of the closure).

Secondly, I believe that control abstraction [1] would be a very powerful and expressive language enhancement. Personally, I would like to see Java include control abstraction, but I am very aware that it is a real concern for others in the community. I am also not yet fully confident about the risks involved in rushing into control abstraction [1].

Thus, my preferred option is FCM+JCA for Java 7. However I would also accept a potentially lower risk option of including just the FCM-style solution in Java 7 and adding JCA-style control abstraction [1] in Java 8.

Feel free to add your preference to the comments.

[1] Update: Neal Gafter has indicated that I am using conflicting terminology with respect to 'control abstraction'. Neal uses the term 'control abstraction' to refer to any closure which can abstract over an arbitrary block of code including interacting with break, continue and return. Whereas, in this blog post I have been referring to control abstraction as adding API based keywords (described in paragraph 3), which Neal refers to as the 'control invocation syntax'. Apologies for any confusion.

Wednesday, 2 May 2007

Cautious welcome on closure JSR - too soon for consensus

Neal Gafter announced a draft closures JSR a few days ago. I'd like to give the JSR a cautious welcome, but also wish to emphasise that the 'consensus' term could be misleading.

At present, there are three key proposals in the 'closure' space - BGGA from Neal Gafter et al, CICE from Josh Bloch et al, and FCM from Stefan and myself. Conceptually, these proposals meet differing visions of how far Java should be changed, and what is an acceptable change.

 ----------------------------------------------------------------
  Simpler                                           More complex
 ----------------------------------------------------------------
   No change    Better inner     Method-based     Full-featured
                  classes          closures         closures
 ----------------------------------------------------------------
   No change       CICE              FCM              BGGA
 ----------------------------------------------------------------

In deciding to support this JSR, a number of factors came into play for me:

  • I want to see closures in Java 7 - unfortunately it may already be too late for that
  • I'm not a language designer, and don't have the knowledge to run a JSR myself on this topic
  • The text of the draft JSR is clearly phrased as a derivation of the BGGA proposal
  • There has been positive feedback on Java forums about each of the three proposals - the Java community does not have a settled opinion yet
  • Many Java developers feel uncomfortable with any change in this area
  • Groovy, a Java derived language, uses an approach similar to FCM
  • This is the only JSR proposal currently available to support
  • Private emails

In the end, taking all of these factors into account, I took the judgement that this JSR was currently the only game in town and it would be better for me to participate rather than stand outside. And that is why I give it a cautious welcome.

Unfortunately, the extensive list of pre-selected Expert Group members (not including FCM) and Expert Group questionnaire came as rather a surprise, especially given the 'consensus' tag.

On the point of consensus, some commentators have assumed that this means a compromise proposal has been created, or agreement reached. This isn't the case.

Looking forward, my preference would be for Sun to finally play their hand on this topic. As a neutral player, they are well placed to encourage all parties to take a step back for a minute and put the focus firmly on what is best for Java going forward. Let the Sun shine :-)

Tuesday, 10 April 2007

Java Control Abstraction for First-Class Methods (Closures)

Stefan Schulz, Ricky Clarkson and I are pleased to announce the release of Java Control Abstraction (JCA). This is a position paper explaining how we envisage the First-Class Methods (FCM) closures proposal being extended to cover control abstraction.

Java Control Abstraction

So, what is control abstraction? And how does it relate to FCM? Well, it's all about being able to add methods in an API that can appear as though they are part of the language. The classic example is iteration over a map. Here is the code we write today:

  Map<Long,Person> map = ...;
  for (Map.Entry<Long, Person> entry : map.entrySet()) {
    Long id = entry.getKey();
    Person p = entry.getValue();
    // some operations
  }

and here is what the code looks like with control abstraction:

  Map<Long,Person> map = ...;
  for eachEntry(Long id, Person p : map) {
    // some operations
  }

The identifier eachEntry is a special method implemented elsewhere (and statically imported):

  public static <K, V> void eachEntry(for #(void(K, V)) block : Map<K, V> map) {
    for (Map.Entry<K, V> entry : map.entrySet()) {
      block.invoke(entry.getKey(), entry.getValue());
    }
  }

As can be seen, the API method above has a few unique features. Firstly, it has two halves separated by a colon. The first part consists of a method type which represents the code to be executed (the closure). The second part consists of any other parameters. As shown, the closure block is invoked in the same way as FCM.
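
Without any language change, the closest approximation is a helper method that takes a callback interface. The sketch below is an illustration in current Java, not part of the JCA paper: EntryBlock and MapLoops are made-up names, an anonymous inner class stands in for the block, and String is used in place of Person to keep it self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MapLoops {
    // Hypothetical callback type standing in for the method type #(void(K, V)).
    interface EntryBlock<K, V> {
        void invoke(K key, V value);
    }

    // Library-style helper: calls the block once per map entry.
    static <K, V> void eachEntry(Map<K, V> map, EntryBlock<K, V> block) {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            block.invoke(entry.getKey(), entry.getValue());
        }
    }

    static String describe(Map<Long, String> people) {
        final StringBuilder out = new StringBuilder();
        eachEntry(people, new EntryBlock<Long, String>() {
            @Override
            public void invoke(Long id, String name) {
                out.append(id).append('=').append(name).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Long, String> people = new LinkedHashMap<Long, String>();
        people.put(1L, "Ann");
        people.put(2L, "Bob");
        System.out.println(describe(people)); // prints "1=Ann;2=Bob;"
    }
}
```

In this version a return inside invoke() only leaves the callback, and break or continue cannot reach the loop inside eachEntry() at all. That gap is what the JCA syntax is intended to close.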

The allowed syntax that the developer may enter in the block is not governed by the rules of FCM. Instead, the developer may use return, continue, break and exceptions and they will 'just work'. Thus in JCA, return will return from the enclosing method, not back into the closure. This is the opposite of FCM. This behaviour is required as the JCA block has to act like a built-in keyword.

One downside of the approach is that things can go wrong because the API writer has access to a variable that represents the closure. The API writer could store this in a variable and invoke it at a later time, after the enclosing method is complete. However, if this occurred, then any return/continue/break statements would no longer operate correctly, as the original enclosing method would no longer be on the call stack, and a weird and unexpected exception would be thrown.
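
This failure mode can be imitated in current Java by emulating a non-local return with an exception. To be clear, this is only a sketch of the problem and makes no claim about how JCA would actually be implemented: ReturnSignal, IntBlock, eachValue and the other names are hypothetical, and a real implementation would need to cope with nesting, which this one ignores.

```java
final class NonLocalReturnSketch {
    // Hypothetical signal used here to emulate a non-local return.
    static final class ReturnSignal extends RuntimeException {
        final int value;

        ReturnSignal(int value) {
            super("non-local return outside its enclosing method");
            this.value = value;
        }
    }

    interface IntBlock {
        void invoke(int value);
    }

    // A badly behaved API keeps hold of the block after the loop has finished.
    static IntBlock stored;

    static void eachValue(int[] values, IntBlock block) {
        stored = block;
        for (int v : values) {
            block.invoke(v);
        }
    }

    // Emulates "return v" inside the block returning from firstNegative().
    static int firstNegative(int[] values) {
        try {
            eachValue(values, new IntBlock() {
                @Override
                public void invoke(int v) {
                    if (v < 0) {
                        throw new ReturnSignal(v);
                    }
                }
            });
        } catch (ReturnSignal signal) {
            return signal.value;
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(firstNegative(new int[] {4, -7, 9})); // prints -7
        try {
            stored.invoke(-1); // firstNegative() is no longer on the call stack
        } catch (ReturnSignal escaped) {
            System.out.println("unexpected: " + escaped.getMessage());
        }
    }
}
```

While firstNegative() is still running, the signal is caught and turned into a normal return value. Once the stored block is invoked later, there is nothing left to catch it, and the caller sees an exception that makes no sense in its own context.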

The semantics of a pure FCM method invocation are always safe, and there is no way to get one of these unexpected exceptions. But, for JCA control abstraction we could find no viable way to stop the weird exceptions. Instead, we have chosen to specifically separate the syntax of FCM from the syntax of control abstraction in JCA.

Our approach is to accompany the integration of control abstraction into Java by a strong set of messages. Developers will be encouraged to use both FCM callbacks and JCA control abstractions. However, developers would only be encouraged to write FCM style APIs, and not JCA.

Writing the API part of any control abstraction (including JCA) is difficult to get right (or, more accurately, easy to get wrong). As a result, some coding shops may choose to ban the writing of control abstraction APIs, and having a separate syntax makes such a ban easy for tools to enforce. It is expected, of course, that the majority of the key control abstractions will be provided by the JDK, where experts will ensure that the control abstraction APIs work correctly.

Summary

This document has taken a while to produce, especially by comparison with FCM. In the end this indicated to us that writing a control abstraction is probably going to be a little tricky irrespective of what choices the language designer makes. By separating the syntax and semantics from FCM we have clearly identified the control abstraction issue in isolation, which can only be a good thing.

Feedback always welcome!