Thursday, 21 February 2008

Closures - Lightweight interfaces instead of Function types

Function types, or method types as FCM refers to them, are one of the most controversial features of closures. Is there an alternative that provides 80% of the power but in the style of Java?

Function/Method types

Function types allow the developer to define a type using just the signature, rather than a name. For example:

 // BGGA
 {String, Date => int}
 
 // FCM v0.5
 #(int(String, Date))

Apart from the different syntax, these are identical concepts in the two proposals as they currently stand. They both mean "a type that takes in two parameters - String and Date - and returns an int". This will be compiled to a dynamically generated single method interface as follows:

 public interface IOO<A, B> {
  int invoke(A a, B b);
 }

When instantiated, the generics A and B will be String and Date.
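As a rough illustration of what that generated interface amounts to, here is a hand-written equivalent in plain Java. The name IOO comes from the text above. The class FunctionTypeSketch and the constant LENGTH_PLUS_YEAR are made up for this example, and an anonymous inner class stands in for the closure syntax of either proposal.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

// Hand-written stand-in for the interface the compiler would generate.
interface IOO<A, B> {
    int invoke(A a, B b);
}

final class FunctionTypeSketch {
    // With A = String and B = Date this plays the role of {String, Date => int}.
    // The body (string length plus calendar year) is arbitrary, chosen for illustration.
    static final IOO<String, Date> LENGTH_PLUS_YEAR = new IOO<String, Date>() {
        public int invoke(String str, Date date) {
            Calendar cal = new GregorianCalendar();
            cal.setTime(date);
            return str.length() + cal.get(Calendar.YEAR);
        }
    };

    public static void main(String[] args) {
        Date date = new GregorianCalendar(2008, Calendar.FEBRUARY, 21).getTime();
        System.out.println(LENGTH_PLUS_YEAR.invoke("closures", date));
    }
}
```

The point of the sketch is only that the signature carries all the information: nothing in IOO says what the String and Date mean.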

This is a complicated underlying mechanism. It is also one that can't be completely hidden from the developer, as the exception stack traces will show the auto-generated interface "IOO". This will certainly be a little unexpected at first. Update: Neal points out correctly that an interface name will not appear in the stack trace!

A second complaint about function types is that there is no home for documentation. One of Java's key strengths is its documentation capabilities in the form of Javadoc. This is perhaps the unsung reason why Java became so popular in enterprises as a long-life, bet-your-company language. Maintenance coders love that documentation. And everybody loves the ability to link to it within their IDE.

So, why are we even considering function types? Well, they allow APIs to be written that can take in any closure, simply defining it in terms of the input and output types. They also allow lightweight definition - there is no need to define the type before using it.

These are highly powerful features, and they lead towards functional programming idioms. But are these idioms completely in the style of Java?

Another option

Let's start from what we would write in Java today.

 public interface Convertor {
  int convert(String str, Date date);
 }

The advantage of this is that everyone knows and understands it. It's part of the lingua franca that is Java. Now let's examine what we could do with this.

Firstly, we need to remember that it is possible to define a class within a method in Java today, and the same could be allowed for an interface such as this one. The scope of such a type is the scope of the method. This will come in useful later.
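A minimal sketch of method-scoped types in plain Java. The names LocalClassSketch, describe and LengthConvertor are invented for the example. The interface is kept at the top level here because a local interface declaration is only accepted by newer compilers (Java 16 onwards), whereas local classes have long been legal.

```java
import java.util.Date;

interface Convertor {
    int convert(String str, Date date);
}

final class LocalClassSketch {
    static int describe(String text) {
        // A local class: its name is visible only inside this method.
        class LengthConvertor implements Convertor {
            public int convert(String str, Date date) {
                return str.length();
            }
        }
        Convertor convertor = new LengthConvertor();
        return convertor.convert(text, new Date());
    }

    public static void main(String[] args) {
        System.out.println(describe("lightweight")); // "lightweight" has 11 characters
    }
}
```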

So, let's examine what would happen if we start shortening the interface definition. For a function type equivalent, we know that there is only one method. As such, there isn't really any need for the braces:

 public interface Convertor int convert(String str, Date date);

Now, let's consider that for a function type equivalent, the method name is pre-defined as 'invoke'. As such, there is no need to include the method name:

 public interface Convertor int(String str, Date date);

Now, let's consider that for a function type equivalent, the parameter names are unimportant. As such, let's remove them (or maybe make them optional):

 public interface Convertor int(String, Date);

And that's it. I'm calling this a lightweight interface for now.

They represent a reasonable reduction of the code necessary to define a named single method interface. The syntax would be allowed anywhere an existing interface could be defined. This includes its own source file, nested in another class or interface, or locally scoped within a method. This is a longer example:

 // with function types (FCM syntax)
 public int process() {
  #(int(int, int)) add = #(int a, int b) {return a + b;};
  #(int(int, int)) mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }
 
 // with named lightweight interfaces - solution A
 interface Adder int(int,int);
 interface Multiplier int(int,int);
 public int process() {
  Adder add = #(int a, int b) {return a + b;};
  Multiplier mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }
 
 // with named lightweight interfaces - solution B
 public int process() {
  interface MathsCombiner int(int,int);
  MathsCombiner add = #(int a, int b) {return a + b;};
  MathsCombiner mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }

Solution A shows how you might define one lightweight interface for each operation. Solution B shows how you might define just one lightweight interface. It also shows that the lightweight interface could be defined locally within the same method.
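For comparison, this is roughly what solution B would expand to in plain Java, assuming the lightweight declaration is pure shorthand for a single method interface whose method is named invoke. The class name ProcessSketch is invented, and anonymous inner classes replace the FCM #(...) syntax.

```java
final class ProcessSketch {
    // Expanded form of "interface MathsCombiner int(int,int);"
    interface MathsCombiner {
        int invoke(int a, int b);
    }

    static int process() {
        MathsCombiner add = new MathsCombiner() {
            public int invoke(int a, int b) {
                return a + b;
            }
        };
        MathsCombiner mul = new MathsCombiner() {
            public int invoke(int a, int b) {
                return a * b;
            }
        };
        // (2 + 3) * (3 + 4)
        return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
    }

    public static void main(String[] args) {
        System.out.println(process()); // prints 35
    }
}
```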

What we have gained is a name for the function type. It is now possible to write Javadoc for it and hyperlink to it in your IDE.

And it can be quickly and easily grasped as simply a shorthand way of defining a single method interface. In fact, you would be able to use this anywhere in your code as a normal single method interface, implementing it using a normal class, or extending it as required.
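A small sketch of that claim, again assuming the lightweight form is shorthand for the expanded interface shown here. Subtractor, NamedCombiner, apply and max are hypothetical names used only for illustration.

```java
final class InterfaceUseSketch {
    // Expanded form of "interface MathsCombiner int(int,int);"
    interface MathsCombiner {
        int invoke(int a, int b);
    }

    // Implemented by an ordinary named class.
    static final class Subtractor implements MathsCombiner {
        public int invoke(int a, int b) {
            return a - b;
        }
    }

    // Extended like any other interface, here to add a second method.
    interface NamedCombiner extends MathsCombiner {
        String name();
    }

    // Any code written against the interface accepts both kinds of implementation.
    static int apply(MathsCombiner combiner, int a, int b) {
        return combiner.invoke(a, b);
    }

    static NamedCombiner max() {
        return new NamedCombiner() {
            public int invoke(int a, int b) {
                return Math.max(a, b);
            }
            public String name() {
                return "max";
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(apply(new Subtractor(), 7, 2));
        NamedCombiner max = max();
        System.out.println(max.name() + " " + apply(max, 7, 2));
    }
}
```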

It's also possible to imagine IDE refactorings that would convert a lightweight interface to a full interface if you needed to add additional methods. Or to convert a single method interface to the lightweight definition.

Of course it would be possible to take this further by eliminating the name:

 // example showing what is possible, I'm not advocating this!
 public int process() {
  interface int(int,int) add = #(int a, int b) {return a + b;};
  interface int(int,int) mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }

However, the developer must now mentally parse both the lines "interface int(int,int)" to see if they are the same type. Previously, they could just see that they were both "MathsCombiner". As such, I prefer keeping the name, and requiring developers to take the extra step.

I see this as an example of where the style of Java differs from other more dynamic languages. In Java you always define your types up front. In more dynamic languages, you often just code the closure. As this concept requires defining types up front, I might suggest it is more in the Java style.

Final example

One final example is from my last blog post, this time in BGGA syntax:

 // Example functional programming style method using BGGA syntax
 public <T, U> {T => U} converter({=> T} a, {=> U} b, {T => U} c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }
 
 // The same using lightweight interfaces
 interface Factory<C> C();
 interface Transformer<I, O> O(I);
 public <T, U> Transformer<T, U> converter(Factory<T> a, Factory<U> b, Transformer<T, U> c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }

Personally, I find the latter to be much more readable, even though it involves more code. Both Factory and Transformer can be defined once, probably in the JDK or framework, and have associated documentation.
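To make the behaviour of converter concrete, here is a sketch in plain Java with Factory and Transformer written out as ordinary interfaces and the returned closure written as an anonymous inner class. The class ConverterSketch and the demo method (with its "none", 0 and length values) are assumptions made for the example only.

```java
final class ConverterSketch {
    // Expanded forms of the two lightweight interfaces.
    interface Factory<C> {
        C invoke();
    }

    interface Transformer<I, O> {
        O invoke(I input);
    }

    // If the input equals the value produced by a, return b's value;
    // otherwise fall through to the transformer c.
    static <T, U> Transformer<T, U> converter(
            final Factory<T> a, final Factory<U> b, final Transformer<T, U> c) {
        return new Transformer<T, U>() {
            public U invoke(T t) {
                return a.invoke().equals(t) ? b.invoke() : c.invoke(t);
            }
        };
    }

    // Hypothetical usage: map the marker string "none" to 0, anything else to its length.
    static Integer demo(String input) {
        Factory<String> marker = new Factory<String>() {
            public String invoke() {
                return "none";
            }
        };
        Factory<Integer> zero = new Factory<Integer>() {
            public Integer invoke() {
                return 0;
            }
        };
        Transformer<String, Integer> length = new Transformer<String, Integer>() {
            public Integer invoke(String s) {
                return s.length();
            }
        };
        return converter(marker, zero, length).invoke(input);
    }

    public static void main(String[] args) {
        System.out.println(demo("none"));
        System.out.println(demo("closure"));
    }
}
```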

In addition, if I'd never seen the code before, I'd much prefer to be assigned to maintain the latter code with lightweight interfaces. Perhaps that is the key to Java's success - code that can be maintained. Write once. Read many times.

Thanks

Finally, I should note that some of the inspiration for this idea came from blogs and documents by Remi Forax and Casper Bang.

Summary

I've outlined an alternative to function types that keeps a key Java element - the type and its name. Lightweight interfaces are easy and quick to code if you don't want to document, but have the capacity to grow and be full members of the normal Java world if required.

I'd really love to hear opinions on this. It seems like a great way to balance the competing forces, but what do you think?

 

PS. Don't forget to vote for FCM at the java.net poll!

17 comments:

  1. Do these lightweight interfaces somehow support implicit contravariance/covariance, like BGGA's function types naturally do?

    If they don't, then I think you'll have to add some '? super T', '? extends U' etc to your converter example in order to make the two versions equivalent.
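    To illustrate the point, here is one way the explicit wildcards might look if Factory and Transformer were ordinary generic interfaces. This is a sketch only; the class name VarianceSketch and the demo method are invented, and the exact wildcard placement is an assumption rather than something taken from either proposal.

```java
final class VarianceSketch {
    interface Factory<C> {
        C invoke();
    }

    interface Transformer<I, O> {
        O invoke(I input);
    }

    // Use-site wildcards: producers are "? extends", consumers are "? super".
    static <T, U> Transformer<T, U> converter(
            final Factory<? extends T> a,
            final Factory<? extends U> b,
            final Transformer<? super T, ? extends U> c) {
        return new Transformer<T, U>() {
            public U invoke(T t) {
                return a.invoke().equals(t) ? b.invoke() : c.invoke(t);
            }
        };
    }

    static Number demo(String input) {
        Factory<String> marker = new Factory<String>() {
            public String invoke() {
                return "none";
            }
        };
        Factory<Integer> zero = new Factory<Integer>() {
            public Integer invoke() {
                return 0;
            }
        };
        Transformer<Object, Integer> length = new Transformer<Object, Integer>() {
            public Integer invoke(Object o) {
                return o.toString().length();
            }
        };
        // With T = String and U = Number, zero and length are only accepted
        // because of the wildcards in converter's signature.
        Transformer<String, Number> t =
                VarianceSketch.<String, Number>converter(marker, zero, length);
        return t.invoke(input);
    }

    public static void main(String[] args) {
        System.out.println(demo("none"));
        System.out.println(demo("closure"));
    }
}
```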

  2. You wrote: 'This is a complicated underlying mechanism. It is also one that can't be completely hidden from the developer, as the exception stack traces will show the auto-generated interface "IOO". This will certainly be a little unexpected at first.'

    I've never seen an interface in a stack trace. As a practical matter, it was quite straightforward to implement, and these types are not printed "IOO" but rather using the same syntax by which they were written in the source code.

    To follow up on Mark Mahieu's point, if you want to compare apples to apples, your equivalent code for the BGGA example would have to be

    // The same using lightweight interfaces
    interface Factory C();
    interface Transformer O(I);
    public Transformer converter(Factory a, Factory b, Transformer c) {
    return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
    }

    I can't say I find it more readable.

    Can you please explain how you achieve exception transparency in this scheme?

  3. I don't see much difference with using anonymous inner classes, you can't write documentation for them either.

    If you don't like that you make them named, but I think you could do the same for Function/Method types if you want:

    interface DateConverterType extends {String, Date => int} {
    }

    it's what I do a lot nowadays to get rid of those lengthy generic definitions so you could use it in this case as well.

  4. Stephen Colebourne, 21 February 2008 11:57

    Mark, Neal: Your point with regard to wildcards is correct. But that doesn't mean it can't be solved - it requires the same implicit variance of function types. That could be added - although it may imply a separate keyword rather than reusing interface.

    The point with the comparison thus still remains valid - if the wildcards can be solved, then I would say the named form is simpler to understand.

    Neal: I've updated the post about IOO in the stack trace - it was late when I wrote that...

    Exception transparency could be addressed by making lightweight interfaces always exception transparent. I haven't thought much about the practicality of that. Note that while exception transparency is needed for JCA, FCM inner methods potentially have less need of it.

    Quintesse: The comparison with anonymous inner classes for documentation is bogus. An anonymous inner class always implements an interface or extends a class, and that is where the documentation resides.

    I understand that you can write interface DateConverterType extends {String, Date => int}, however I just have real doubts about exposing function types in the first place - my argument is that types in Java are about names, not just about signatures.

  5. Stephen, Rémi : Regarding wildcards, doesn't that mean I'd have to know which keyword the types Factory and Transformer were defined with in order to understand what I can pass to the converter() method?

  6. Stephen: That sounds like implicit declaration-site variance, which I can't see making Java's type system any easier to understand. Perhaps I've misunderstood though...

    Either way, I think the problem is in the use-site syntax (eg. in the converter method's signature) - unless it's distinct from existing interface/class syntax (as BGGA's function types are), it'll be a potential source of confusion.

  7. Stephen Colebourne, 21 February 2008 14:52

    Mark: I think this actually emphasises the differences between function types and normal types rather well. Function types have the ability to act in a completely new way wrt generics. That is surely a major change to the type system wouldn't you say? And wouldn't that be a major source of confusion?

    This is also a classic example of Josh's point about how different language features combine to make it really difficult to sensibly add new features. Function types are essentially trying to sidestep the wildcard issue by using a new 'different' syntax.

    So, the underlying question is - can a simplified interface approach be made to work with generics and wildcards without turning ugly?

  8. The desire to keep things simple is admirable, and I think everyone should consider it deeply. That said, is there anything from the BGGA proposal that is valuable?

    Personally I like very much the concept of control abstraction, however alien or not it may initially appear. To make it work properly, it requires non-local returns and exception transparency, but the benefits seem tremendous. Even the library-level (as opposed to language-level) capabilities that provide scoped use semantics (much like C# using statements) alone should make one consider the additions worth it. I see it as not adding features for features' sake, but rather adding much needed expressiveness, which should allow the addition of much needed features that depend on a language being expressive.

    One can argue that control abstraction is not necessary, that we have lived without it and will be fine in the future. This can be said about many innovations though, and the mere presence of the 'using' keyword in C# becomes an attractive point (perhaps not the only one) that developers who currently use Java may consider as a reason to jump ship and migrate. This drains the community of much needed talent, and prevents the revitalization of the platform, despite the presence of other, arguably more expressive languages on the JVM.

    Those of us who have heavily invested in Java, even if only the platform, should all think very deeply about what it would mean for the Java language to languish and die. Sure, language attachment for the sake of it is rather silly, but from a business standpoint I would argue that Java and the platform are tightly linked, and unless and until a major change of focus occurs (embracing Scala or some other language du jour and delivering a lethal injection to Java) Java is a very important language to keep up-to-date with attractive features, if only for the reason of keeping the vitality of the community and the platform.

    Lastly, the notion of Java being a "simple" language for the "average Joe" is not only a useless metric of its success, but also perhaps no longer applicable. Java is used in industrial-grade systems, and sometimes those systems require cream-of-the-crop language innovations to increase productivity. Attributing Java to only "toy things" may turn out to be a self-fulfilling prophecy, whereby only toys or perhaps nothing at all will be developed with it.

  9. Re: 'Exception transparency could be addressed by making lightweight interfaces always exception transparent.'

    I have no idea what this means. Can you please sketch the form of a specification that would have the intended effect?

  10. Why do you still need to write "add.invoke(...)" when just "add(...)" should be sufficient? I can't see why invoke is needed.

  11. > Function types have the ability to act in a completely new way wrt generics. That is surely a major change to the type system wouldn't you say?

    That's not my understanding at all, which is that BGGA's function types are a direct application of the type system already put in place in Java 5. If I'm wrong, I'd be very happy to be corrected.

    I think there are far more valid questions which have been raised about adding function types to Java, regardless of the various conclusions. I also believe confusion would occur with the lightweight interfaces you've proposed, but that doesn't mean I think that the idea is fundamentally flawed - it is after all just a slightly different perspective on the same problem.

    > This is also a classic example of Josh's example of how different language features combine to make it really difficult to sensibly add new features. Function types are essentially trying to sidestep the wildcard issue by using a new 'different' syntax.

    BGGA's function types are able to 'sidestep' the wildcard issue because they describe a single method's parameter types and return types, and are able to apply Java's normal co/contra-variant rules to the corresponding generic types. 'Lightweight interfaces' could do the same, but that's not the only problem to solve. I hope we don't end up throwing out that particular baby with the bath-water.

    Another interesting argument is that function types would be less necessary in this context if the generics vs primitives problems could be resolved. I don't think that such a resolution would mean function types would become irrelevant, but perhaps they take on a different timeliness - eg. a phased approach.

    Josh Bloch's argument about the combination of features is certainly pertinent, but from what I can see the BGGA proposal goes to pains not to preclude future solutions to the underlying problems. In doing so, it leaves itself open to some misdirected criticism.

    > So, the underlying question is - can a simplified interface approach be made to work with generics and wildcards without turning ugly?

    It certainly deserves further thought :)

  12. In some cases, such as the proposed concurrency extension we end up with a very large number of interfaces, many of which differ only in the type of the parameters/results (different primitive types so generics doesn't help). Coming up with a good set of names for all these interfaces is tedious and the result uninspiring.

    Eliminating the primitives in favour of objects would also lose all the gains of concurrency (and more). So that isn't an option.

  13. Stephen Colebourne, 22 February 2008 20:13

    Werner: I believe the problem with using add(...) instead of add.invoke(...) is namespaces. In Java, variable and method namespaces are different, and thus calling add(...) would be unclear as to whether it was accessing the method add or the variable add (ie. the closure).
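    A small sketch of the namespace point in plain Java: a field and a method can legally share the name add, so the call syntax alone decides which one is meant. The class NamespaceSketch, the IntOp interface and the deliberately different method body are invented for the example.

```java
final class NamespaceSketch {
    interface IntOp {
        int invoke(int a, int b);
    }

    // A field named add, holding the "closure"...
    static final IntOp add = new IntOp() {
        public int invoke(int a, int b) {
            return a + b;
        }
    };

    // ...and a method named add, coexisting in the same class.
    // It adds 1000 so the two are easy to tell apart.
    static int add(int a, int b) {
        return a + b + 1000;
    }

    static int viaMethod() {
        return add(2, 3); // method call syntax resolves to the method
    }

    static int viaField() {
        return add.invoke(2, 3); // field access resolves to the variable
    }

    public static void main(String[] args) {
        System.out.println(viaMethod()); // prints 1005
        System.out.println(viaField());  // prints 5
    }
}
```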

    Mark(s): I do wonder how hard allowing primitives as generic type parameters would be. The problem is what it gets compiled to, but in the case of closures and function types I suspect that there is no conflict.

    Mark T: I agree that naming of the interfaces could be tedious, essentially what we are contrasting here is naming (definitely like Java today) vs higher level structural types (function types).

    Neal: I believe that this whole idea (not a proposal) can be taken two ways. The first is as a simple shorthand for declaring single method interfaces. That way does not immediately provide any variance or transparency, but is useful if you only provide closure conversion and don't provide function types.

    The second way to look at this idea is as an alternate way of defining a BGGA function type, simply giving it a name. Thus variance and exception transparency would (I believe) be no different to a function type:

    interface Block T() throws E;

    T withLock(Lock lock, Block block) throws E {...}

    'Always make it exception transparent' meant that you could write this:

    interface Block T(); // implied 'throws E'

    T withLock(Lock lock, Block block) throws E {...}

    ie. add the throws clause when you need it. Not sure that's possible or a good idea though.

  14. There are times when one really does want a simple anonymous class/function/whatever. With inner classes the syntactic overhead is often unbearable for such trivial scenarios.

    In some of my tests I've got a main loop that iterates over an array, and compares values from that array with that of an iterator that it receives as a parameter. Some of the setup variables in the loop code are abstracted out so one can initialize the iteration range, as well as other aspects of the code. Right now I have to create a named type to do this, or at the very least a hefty AIC with the close-over-variables using a final generic tuple hack, which works, but is verbose.

    Hope you're not advocating against anonymous lambda expressions (regardless of what the actual implementation vehicle and the resulting name of the construct is, single-method-interface, function-type, big-ugly-thing, warm-fuzzy-thing) ... in the end, those things are quite useful.

  15. If you ditch wildcards:

    http://www.artima.com/weblogs/viewpost.jsp?thread=222021

    Then you can define standard interfaces in java.lang, e.g.:

    public interface Method2 {
      R call(A1 a1, A2 a2);
    }

    Then your example becomes:

    interface Adder extends Method2 {}
    interface Multiplier extends Method2 {}
    public int process() {
      Adder add = #(int a, int b) {return a + b;};
      Multiplier mul = #(int a, int b) {return a * b;};
      return mul.call(add.call(2, 3), add.call(3, 4));
    }

    Nothing new needed at all!

    Or in C3S syntax (http://www.artima.com/weblogs/viewpost.jsp?thread=182412):

    interface Adder extends Method2 {}
    interface Multiplier extends Method2 {}
    public int process() {
      final add = Adder.method(a, b) {a + b};
      final mul = Multiplier.method(a, b) {a * b};
      mul.call(add.call(2, 3), add.call(3, 4));
    }

  16. Must the lightweight interface be explicitly implemented, or can any function (method) with a compatible type signature be passed as a lightweight interface? If not, won't you still need function types to be able to write true higher order functions?

  17. Stephen Colebourne, 26 February 2008 10:06

    Odd: As I said in another reply, if this is a simple shorthand for the definition of an interface, then it would need to be explicitly implemented - and true higher order functions would not happen.

    However, if a lightweight interface is simply a named 'typedef' for a function type, then all the conversions could still occur and true higher order functions would work.
