Thursday 28 February 2008

FCM closures - options within

The FCM closures proposal, with the JCA extension, consists of multiple parts. This blog outlines how those parts fit together.

FCM+JCA spec parts

The FCM+JCA spec, contains the following elements:

  • Method literals, also constructor and field literals
  • Method references, also constructor references
  • Inner methods
  • Method types, aka function types
  • Control invocation, in the JCA extension

These five parts all fulfil different roles in the proposal. But what is often not understood is how feasible it would be to implement less than the whole specification.

Method references and literals

Reviewing response to the entire debate, it is clear to me that method references and/or method literals have generally widespread appeal. The ability to reference a method, constructor or field in a compile-time safe and refactorable way is a huge gain for a statically typed language like Java.

It should be noted that although Method References appear only in the FCM spec, they could be added to the BGGA or CICE spec without difficulty. Also, they are included in the closures JSR proposal.

There are three areas of contention with method references and literals.

Firstly, should both references and literals be supported, or just references. Or, looking at the question differently, should literals have a different syntax.

The problem here is that if a method reference and literal have the same syntax then it is unclear as to what the type of the expression is. The FCM prototype demonstrates, I believe, that this can be solved using the same syntax. The approach taken is to say that the default type is a method literal, but that it can be converted (boxed) to a method reference at construction time. Any ambiguity is an error.

 Method m = Integer#valueOf(int);
 IntToInteger i = Integer#valueOf(int);

In this example, IntToInteger is a single method interface. Because the expression Integer#valueOf(int) is assigned to a single method interface, the conversion occurs (generating a wrapping inner class).

The second issue is what a method reference can be boxed into. This is essentially a question of whether function types should be supported, and I'll cover that below.

The third issue is syntax, specifically the use of the #. Personally, I find this syntax natural and obvious to read, but I know others differ. I think it is important to get the syntax right, but final decisions on that can come later.

So, are method literals and reference required when implementing FCM? I would say 'yes'. These are simple, popular, constructs that naturally extend Java in a non-threatening way. Some have suggested omitting the literals as reflection is not type-safe, however this misses the point of the large number of existing frameworks and APIs that accept reflection Method as an input parameter.

Inner methods / Closures

 ActionListener lnr = #(ActionEvent ev) {
   ...
 };

This is where the key difference with BGGA lies, notably over the meaning of return and the value, and safety, of non-local returns.

Opinions on this appear to me to be impacted by the generics implementation, where the decision was made to do what feels like 'half a job'. As a result, there is a meme that runs 'we must implement closures fully or not at all'. This meme is extremely unfortunate, as it is not allowing a rational analysis of the semantics of the proposals. Anyone supporting BGGA really needs to consider the mistakes that developers will make again and again with the non-local return/last-line-no-semicolon approach.

So, are inner methods required when implementing FCM? I would say 'effectively, yes'. Although you could just implement method literals and references alone, there are even bigger gains to be had from adding inner methods. They greatly simplify the declaration of single method inner classes, and allow much of the impact of closures in the style of Java.

Function types / Method types

 #(int(String) throws IOException)

These allow a new powerful form of programming where common pieces of code can be easily abstracted. They simply act as types, but they have two different properties from other types.

Firstly, they have no name. This means an absence of Javadoc, including any semantic requirements of the API, such as thread-safety or null/not-null.

Secondly, they only describe the input and output types. This is a higher abstraction than Java has previously used, and will require a mindset shift for those using them.

So, are function types required when implementing FCM? I would say 'no, not required'. It is perfectly possibly and reasonable to implement FCM without method types. In fact, that is what the prototype does. In practice, this just means that all conversions from method references and inner methods must be to single method interfaces rather than method types.

Omitting method types greatly simplifies the conceptual weight of the change. The downside is that true higher order functional programming becomes near impossible. That may be no bad thing. Java is not, and never has been, a functional programming language. It seems very odd to try and push it in that direction at this point in its life.

A better alternative would be to pursue supporting primitive types in generics. This would greatly reduce the overhead of single method interfaces required by something like the fork-join framework.

Similarly, making single method interfaces easier to write (lightweight interfaces) would be a direction to take in the absence of method types.

Control invocation

 withLock(lock) {
   ...
 }
 public void withLock(#(void()) block : Lock lock) {
   ...
 }

Control invocation forms are perhaps the only way forward in Java longer term because they allow us to escape from many of these language change debates. They allow anyone to write methods that can be used in the style of control statements. It is vital to remember that they are just methods however.

BGGA appears to build much of its spec around control invocation, and the non-local returns make perfect sense in this area.

The JCA spec defines that the calling code should be identical to BGGA, but the method invoked should be written differently. The aim of JCA is to provide an element of discouragement from using control invocation. This is because of the additional complexities in getting the code right (exception transparancy, completion transparancy, non-local returns etc). A different, special, syntax encourages this feature to be restricted to senior developers, or heavily code reviewed.

So, is control invocation required when implementing FCM+JCA? I would say 'no'. It is perfectly possibly and reasonable to implement FCM+JCA without control invocation (although of course that means it would just be FCM!).

The inclusion or omission of method types is also linked to control invocation, as method types are a pre-requisite for control invocation in the JCA spec.

Summary of possible implementations

Thus, here are the possible FCM+JCA implementation combinations that make sense to me:

  1. Literals and References
  2. Literals, References and Inner methods
  3. Literals, References, Inner methods and Method types
  4. Literals, References, Inner methods, Method types and Control invocation

My preferred options are number 2 and number 4.

Why? Because, I believe inner methods are too useful to omit, and I believe method types are generally too complex unless you really need them. (Also Java isn't a functional programming language.)

The key point of this blog is to emphasise that FCM is not a take it or leave it proposal. There are different options and levels within it that could be adopted.

This extends to versions of Java. For example, it would be feasible to implement option 1, literals and references in Java 7, whilst adding inner methods and maybe more in Java 8.

Summary

I've shown how FCM has parts which can be considered separately to a degree. I've also indicated which combinations make sense to me.

Which combinations make sense to you?

Sunday 24 February 2008

FCM prototype available

I'm happy to announce the first release of the First Class Methods (FCM) java prototype. This prototype anyone who is interested to find out what FCM really feels like.

Standard disclaimer: This software is released on a best-efforts basis. It is not Java. It has not passed the Java Compatibility Testing Kit. Having said that, in theory no existing programs should be broken. Just don't go relying on code compiled with this prototype for anything other than experimentation!

FCM javac implementation

The FCM javac implementation is hosted at Kijaro. The javac version used as a baseline is OpenJDK (the last version before the Mercurial cutover).

The prototype includes the following features:

  • Method literals
  • Constructor literals
  • Field literals
  • Static method references
  • Bound method references
  • Constructor references
  • Anonymous inner methods

The following are not implemented:

  • Method types
  • Instance method references
  • Named inner methods
  • Inner method non-final local variable access
  • Inner method exception inference
  • Inner method exception/completion transparancy
  • Conversion to single abstract method classes

The download includes a README with many FAQs answered, including more information on how the types work.

The biggest outstanding area is getting generics working properly. This is a complex task however, and I took the view that it was better to release early than spend any more time trying to get generics working properly.

Nevertheless, even without full generics, you can get a really good feel for how Java would look and feel with FCM. In addition, the FCM enabled code just falls off my fingers very nicely. Now if only we could get Eclipse or Netbeans support...

Summary

One of the key requests for considering FCM as a viable proposal has been having a prototype to play with. Now the prototype is out there, it would be really great to hear some feedback, including any bugs. Comments welcome here, at kijaro-dev mailing list or scolebourne-joda-org.

Thursday 21 February 2008

Closures - Lightweight interfaces instead of Function types

Function types, or method types as FCM refers to them, are one of the most controversial features of closures. Is there an alternative that provides 80% of the power but in the style of Java?

Function/Method types

Function types allow the developer to define a type using just the signature, rather than a name. For example:

 // BGGA
 {String, Date => int}
 
 // FCM v0.5
 #(int(String, Date))

Apart from the different syntax, these are identical concepts in the two proposals as they currently stand. They both mean "a type that takes in two parameters - String and Date - and returns an int". This will be compiled to a dynamically generated single method interface as follows:

 public interface IOO<A, B> {
  int invoke(A a, B b);
 }

When instantiated, the generics A and B will be String and Date.

This is a complicated underlying mechanism. It is also one that can't be completely hidden from the developer, as the exception stack traces will show the auto-generated interface "IOO". This will certainly be a little unexpected at first. Update: Neal points out correctly that an interface name will not appear in the stacktrace!

A second complaint about function types is that there is no home for documentation. One of Java's key strengths is its documentation capabilities in the form of Javadoc. This is perhaps the unsung reason as to why Java became so popular in enterprises as a long-life, bet your company, language. Maintenance coders love that documentation. And everybody loves the ability to link to it within your IDE

So, why are we even considering function types? Well they allow APIs to be written that can take in any closure, simply defining it in terms of the input and output types. They also allow lightweight definition - there is no need to define the type before using it.

These are highly powerful features, and they lead towards functional programming idioms. But are these idioms completely in the style of Java?

Another option

Lets start from what we would write in Java today.

 public interface Convertor {
  int convert(String str, Date date);
 }

The advantage of this is that everyone knows and understands it. Its part of the lingua franca that is Java. Now lets examine what we could do with this.

Firstly, we need to remember that it is possible to define a class or interface such as this one within a method in Java today. The scope of such a class is the scope of the method. This will come in useful later.

So, lets examine what would happen if we start shortening the interface definition. For a function type equivalent, we know that there is only one method. As such, there isn't really any need for the braces:

 public interface Convertor int convert(String str, Date date);

Now, lets consider that for a function type equivalent, the method name is pre-defined as 'invoke'. As such, there is no need to include the method name:

 public interface Convertor int(String str, Date date);

Now, lets consider that for a function type equivalent, the parameter names are unimportant. As such, lets remove them (or maybe make them optional):

 public interface Convertor int(String, Date);

And that's it. I'm calling this a lightweight interface for now.

They represent a reasonable reduction of the code necessary to define a named single method interface. The syntax would be allowed anywhere an existing interface could be defined. This includes its own source file, nested in another class or interface, or locally scoped within a method. This is a longer example:

 // with function types (FCM syntax)
 public int process() {
  #(int(int, int)) add = #(int a, int b) {return a + b;};
  #(int(int, int)) mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }
 
 // with named lightweight interfaces - solution A
 interface Adder int(int,int);
 interface Multiplier int(int,int);
 public int process() {
  Adder add = #(int a, int b) {return a + b;};
  Multiplier mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }
 
 // with named lightweight interfaces - solution B
 public int process() {
  interface MathsCombiner int(int,int);
  MathsCombiner add = #(int a, int b) {return a + b;};
  MathsCombiner mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }

Solution A shows how you might define one lightweight interface for each operation. Solution B shows how you might define just one lightweight interface. It also shows that the lightweight interface could be define locally within the same method.

What we have gained is a name for the function type. It is now possible to write Javadoc for it and hyperlink to it in your IDE.

And it can be quickly and easily grasped as simply a shorthand way of defining a single method interface. In fact, you would be able to use this anywhere in your code as a normal single method interface, implementing it using a normal class, or extending it as required.

Its also possible to imagine IDE refactorings that would convert a lightweight interface to a full interface if you needed to add additional methods. Or to convert a single method interface to the lightweight definition.

Of course it would be possible to take this further by eliminating the name:

 // example showing what is possible, I'm not advocating this!
 public int process() {
  interface int(int,int) add = #(int a, int b) {return a + b;};
  interface int(int,int) mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }

However, the developer must now mentally parse both the lines "interface int(int,int)" to see if they are the same type. Previously, they could just see that they were both "MathsCombiner". As such, I prefer keeping the name, and requiring developers to take the extra step.

I see this as an example of where the style of Java differs from other more dynamic languages. In Java you always define your types up front. In more dynamic languages, you often just code the closure. As this concept requires defining types up front, I might suggest it is more in the Java style.

Final example

One final example is from my last blog post, this time in BGGA syntax:

 // Example functional programming style method using BGGA syntax
 public <T, U> {T => U} converter({=> T} a, {=> U} b, {T => U} c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }
 
 // The same using lightweight interfaces
 interface Factory<C> C();
 interface Transformer<I, O> O(I);
 public <T, U> Transformer<T, U> converter(Factory<T> a, Factory<U> b, Transformer<T, U> c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }

Personally, I find the latter to be much more readable, even though it involves more code. Both Factory and Transformer can be defined once, probably in the JDK or framework, and have associated documentation.

In addition, if I'd never seen the code before, I'd much prefer to be assigned to maintain the latter code with lightweight interfaces. Perhaps that is the key to Java's success - code that can be maintained. Write once. Read many times.

Thanks

Finally, I should note that some of the inspiration for this idea came from blogs and documents by Remi Forax and Casper Bang.

Summary

I've outlined an alternative to function types that keeps a key Java element - the type and its name. Lightweight interfaces are easy and quick to code if you don't want to document, but have the capacity to grow and be full members of the normal Java world if required.

I'd really love to hear opinions on this. It seems like a great way to balance the competing forces, but what do you think?

 

PS. Don't forget to vote for FCM at the java.net poll!

Tuesday 19 February 2008

Evaluating BGGA closures

The current vote forces us all to ask what proposal is best for the future of Java. Personally, I don't find the BGGA proposal persuasive (in its current form). But exactly why is that?

Touchstone debate

The closures debate is really a touchstone for two other, much broader, debates.

The first "what is the vision for Java". There has been no single guiding vision for Java through its life, unlike other languages:

In the C# space, we have Anders. He clearly "owns" language, and acts as the benevolent dictator. Nothing goes into "his" language without his explicit and expressed OK. Other languages have similar personages in similar roles. Python has Guido. Perl has Larry. C++ has Bjarne. Ruby has Matz.

The impact of this lack of guiding vision is a sense that Java could be changed in any way. We really need some rules and a guiding Java architect.

The second touchstone debate is "can any language changes now be successfully implemented in Java". This is generally a reference to generics, where erasure and wildcards have often produced havoc. In the votes at Javapolis, 'improving generics' got the highest vote by far.

The result of implementing generics with erasure and wildcards has been a loss of confidence in Java language change. Many now oppose any and all change, however simple and isolated. This is unfortunate, and we must ensure that the next batch of language changes work without issues.

Despite this broader debates that surround closures, we must focus on the merits of the individual proposals.

Evaluating BGGA

BGGA would be a very powerful addition to Java. It contains features that I, as a senior developer, could make great use of should I choose to. It is also a well written and thought out proposal, and is the only proposal tackling some key areas, such as exception transparancy, in detail.

However, my basic issue with BGGA is that the resulting proposal doesn't feel like Java to me. Unfortunately, 'feel' is highly subjective, so perhaps we can rationalise this a little more.

Evaluating BGGA - Statements and Expressions

Firstly, BGGA introduces a completely new block exit strategy to Java - last-line-no-semicolon expressions. BGGA uses them for good reasons, but I believe that these are very alien to Java. So where do they come from? Well consider this piece of Java code, and its equivalent in Scala:

 // Java
 String str = null;
 if (someBooleanMethod()) {
  str = "TRUE";
 } else {
  str = "FALSE";
 }
 
 // Scala
 val str = if (someBooleanMethod()) {
  "TRUE"
 } else {
  "FALSE"
 }

There are two points to note. The first is that Scala does not use semicolons at the end of line. The second, and more important point is that the last-line-expression from either the if or the else clause is assigned to the variable str.

The fundamental language level difference going on here is that if in Java is a statement with no return value. In Scala, if is an expression, and the result from the blocks can be assigned to a variable if desired. More generally, we can say that Java is formed from statements and expressions, while in Scala everything can be an expression.

The result of this difference, is that in Scala it is perfectly natural for a closure to use the concept of last-line-no-semicolon to return from a closure block because that is the standard, basic language idiom. All blocks in Scala have the concept of last-line-no-semicolon expression return.

This is not the idiom in Java.

Java has statements and expressions as two different program elements. In my opinion, BGGA tries to force an expression only style into the middle of Java. The result is in a horrible mixture of styles.

Evaluating BGGA - Non local returns

A key reason for BGGA using an alternate block exit mechanism is to meet Tennant's Correspondance Principle. The choices made allow BGGA to continue using return to mean 'return from the enclosing method' whilst within a closure.

There is a problem with using return in this way however. Like all the proposals, you can take a BGGA closure, assign it to a variable and store it for later use. But, if that closure contains a return statement, it can only be successfully invoked while the enclosing method still exists on the stack.

 public class Example {
  ActionListener buttonPressed = null;
  public void init() {
   buttonPressed = {ActionEvent ev => 
     callDatabase();
     if (isWeekend() {
      queueRequest();
      return;
     }
     processRequest();
   };
  }
 }

In this example, the init() method will be called at program startup and create the listener. The listener will then be called later when a button is pressed. The processing will call the database, check if it is weekend and then queue the request. It will then try to return from the init() method.

This obviously can't happen, as the init() method is long since completed. The result is a NonLocalTransferException - an unusual exception which tries to indicate to the developer that they made a coding error.

But this is a runtime exception.

It is entirely possible that this code could get into production like this. We really need to ask ourselves if we want to change the Java programming language such that we introduce new runtime exceptions that can be easily coded by accident.

As it happens, the BGGA specification includes a mechanism to avoid this, allowing the problem to be caught at compile time. Their solution is to add a marker interface RestrictedFunction to those interfaces where this might be a problem. ActionListener would be one such interface (hence the example above would not compile - other examples would though).

This is a horrible solution however, and doesn't really work. Firstly, the name RestrictedFunction and it being a marker interface are two design smells. Secondly, this actually prevents the caller from using ActionListener with return if they actually want to do so (within the same enclosing method).

The one final clue about non local returns is the difficulty in implementing them. They have to be implemented using the underlying exception mechanism. While this won't have any performance implications, and won't be visible to developers, it is another measure of the complexity of the problem.

In my opinion, allowing non local returns in this way will cause nothing but trouble in Java. Developers are human, and will easily make the mistake of using return when they shouldn't. The compiler will catch some cases, with RestrictedFunction, but not others, which will act as a further level of confusion.

Evaluating BGGA - Functional programming

Java has always laid claim to be an OO language. In many ways, this has often been a dubious claim, especially with static methods and fields. Java has never been thought of as a functional programming langauge however.

BGGA introduces function types as a key component. These introduce a considerable extra level of complexity in the type system.

A key point is that at the lowest compiler level, function types are no different to ordinary, one method, interfaces. However, this similarity is misleading, as at the developer level they are completely different. They look nothing like normal interfaces, which have a name and javadoc, and they are dynamically generated.

Moreover, function types can be used to build up complex higher-order function methods. These might take in a function type parameter or two, often using generics, and perhaps returning another function type. This can lead to some very complicated method signatures and code.

 public <T, U> {T => U} converter({=> T} a, {=> U} b, {T => U} c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }

It shoud be noted that FCM also includes function types, which are named method types. However, both Stefan and myself struggle with the usability of them, and they may be removed from a future version of FCM. It is possible to have FCM inner methods, method references and method literals without the need for method types.

Function types have their uses, however I am yet to be entirely convinced that the complexity is justified in Java. Certainly, they look very alien to Java today, and they enable code which is decidedly hard to read.

Evaluating BGGA - Syntax

The syntax is the least important reason for disliking BGGA, as it can be altered. Nevertheless, I should show an example of hard to read syntax.

 int y = 6;
 {int => boolean} a = {int x => x <= y};
 {int => boolean} b = {int x => x >= y};
 boolean c = a.invoke(3) && b.invoke(7);

Following =>, <= and >= can get very confusing. It is also a syntax structure for parameters that is alien to Java.

Summary

As co-author of the FCM proposal it is only natural that I should want to take issue with aspects of competitor proposals. However, I do so for one reason, and that is to ensure that the changes included in Java 7 really are the right ones. Personally, I believe that BGGA attempts something - full closures - that simply isn't valid or appropriate for Java.

It remains my view that the FCM proposal, with the optional JCA extension, provides a similar feature set but in a more natural, safe, Java style. If you've read the proposals and are convinced, then please vote for FCM at java.net.

Opinions welcome as always :-)

References

For reference, here are the links to all the proposals if you want to read more:

  • BGGA - full closures for Java
  • CICE - simplified inner classes (with the related ARM proposal)
  • FCM - first class methods (with the related JCA proposal)

Sunday 17 February 2008

Vote for FCM!

Java.net is currently running a poll on closures, to get a feel for the strength of support for each proposal. Obviously, this poll has no power, but it is useful to see at a high level what the communities opinion is.

Personally, I'm really pleased with the level of support FCM has had in the vote so far, on this blog and privately. Now I'd like to encourage you, if you so desire, to vote and support FCM. Thanks!

Update: If you want to compare the three proposals, please take a look at my previous comparison articles - one method callbacks, control structures and type inference.