Tuesday, 18 December 2007

Closures - Comparing the core of BGGA, CICE and FCM

In this blog I'm going to compare the core of the three principal 'closure' proposals. This is particularly apt following the recent surge in interest after Josh Bloch's Javapolis talk.

Comparing 'closure' proposals

The three principal closure proposals are:

  • BGGA - full closures for Java
  • CICE - simplified inner classes (with the related ARM proposal)
  • FCM - first class methods (with the related JCA proposal)

It is easy to get confused when evaluating these competing proposals. So much new syntax. So many new ideas. For this blog I'll summarise, and then deep dive into one area.

The first key point that I have been making recently is that closures is not 'all or nothing' - there are parts to the proposals that can be implemented separately. This table summarises the 5 basic parts in the proposals (I've split the last one into two rows):

                                  BGGA                     CICE+ARM  FCM+JCA
Literals for reflection           -                        -         FCM member literals
References to methods             -                        -         FCM method references
One method callback classes       BGGA closures            CICE      FCM inner methods
Function types                    BGGA function types      -         FCM method types
Library based Control Structure   BGGA control invocation  -         JCA control abstraction
Language based Control Structure  -                        ARM       -

Of course a table like this doesn't begin to do justice to any of the three proposals, or to express the possible combinations (for example, BGGA could add support for the first two rows easily). So, instead of talking generally, let's examine a key use case where all 'closure' proposals work.

One method callbacks

This issue in Java refers to the complexity and verbosity of writing an anonymous inner class with one method. These are typically used for callbacks - where the application code passes a piece of logic to a framework for it to be executed later in time. The classic example is the ActionListener:

  public void init() {
    button.addActionListener(new ActionListener() {
      public void actionPerformed(ActionEvent ev) {
        // handle the event
      }
    });
  }

This registers a callback that will be called by the framework whenever the button is pressed. Note that the code in the listener will typically be called long after the init() method has completed. While this is an example from Swing, the same design appears all over Java code.
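
To make the timing concrete, here is a minimal sketch in today's Java. FakeButton and ClickListener are hypothetical stand-ins that I've made up for a Swing component and its listener interface, so that the example runs without a GUI. They are not part of any real library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical one-method callback interface, standing in for ActionListener.
interface ClickListener {
    void clicked(String event);
}

// Hypothetical stand-in for a component that stores listeners for later use.
class FakeButton {
    private final List<ClickListener> listeners = new ArrayList<ClickListener>();

    void addClickListener(ClickListener listener) {
        listeners.add(listener);
    }

    // The "framework" calls this later, when the button is pressed.
    void press(String event) {
        for (ClickListener listener : listeners) {
            listener.clicked(event);
        }
    }
}

class DeferredCallbackDemo {
    final List<String> log = new ArrayList<String>();
    final FakeButton button = new FakeButton();

    void init() {
        button.addClickListener(new ClickListener() {
            public void clicked(String event) {
                log.add("handled " + event);
            }
        });
        // init() completes here; the listener body has not run yet.
        log.add("init finished");
    }

    public static void main(String[] args) {
        DeferredCallbackDemo demo = new DeferredCallbackDemo();
        demo.init();
        demo.button.press("click");
        System.out.println(demo.log); // prints "[init finished, handled click]"
    }
}
```

The listener body only runs when press() is called, by which point init() has already returned.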

Some of the issues with this code are immediately obvious. Some are not.

  1. The declaration of the listener is very verbose. It takes a lot of code to define something that is relatively simple, and importantly, that makes it harder to read.
  2. Information is duplicated that could be inferred by the compiler. It's not very DRY.
  3. The scope of the code inside actionPerformed() is different to that of the init() method. The this keyword has a different meaning.
  4. Any variables and methods from the interface take precedence over those available in init(). For example, toString(), with any number of parameters, will always refer to the inner class, not the class that init() is declared in. In other words, the impression that the code in actionPerformed() has full access to the code visible in init() is just that - an illusion.
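
Issues 3 and 4 can be seen in a small sketch using today's inner classes. The Callback interface and the ScopeDemo class are hypothetical names chosen for illustration, with Callback standing in for ActionListener.

```java
// Hypothetical one-method callback interface, standing in for ActionListener.
interface Callback {
    String call();
}

class ScopeDemo {
    @Override
    public String toString() {
        return "outer";
    }

    Callback init() {
        return new Callback() {
            @Override
            public String toString() {
                return "inner";
            }

            public String call() {
                // Issue 3: 'this' is the anonymous Callback, not the ScopeDemo.
                // Issue 4: the unqualified toString() also resolves to the
                // inner class, because its members are found first.
                // ScopeDemo.this is the explicit route to the enclosing object.
                return this.toString() + "," + toString() + ","
                        + ScopeDemo.this.toString();
            }
        };
    }

    public static void main(String[] args) {
        // prints "inner,inner,outer"
        System.out.println(new ScopeDemo().init().call());
    }
}
```

Only the qualified ScopeDemo.this form reaches the object that init() was called on.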

The three closure proposals differ in how they tackle this problem area. BGGA introduces full closures. These eliminate all the problems above, but introduce new issues with the meaning of return. CICE introduces a shorthand way of creating an inner class. This solves the first issue, and some of the second, but does not solve issue 3 or 4. FCM introduces inner methods. These eliminate all the problems above, and add no new surprises.

  // BGGA
  public void init() {
    button.addActionListener({ActionEvent ev =>
      // handle the event
    });
  }
  // CICE
  public void init() {
    button.addActionListener(ActionListener(ActionEvent ev) {
      // handle the event
    });
  }
  // FCM
  public void init() {
    button.addActionListener(#(ActionEvent ev) {
      // handle the event
    });
  }

In this debate, it is easy to get carried away by syntax, and say that one of these might look 'prettier', 'more Java' or 'ugly'. Unfortunately, the syntax isn't that important - it's the semantics that matter.

BGGA and FCM have semantics where this within the callback refers to the surrounding class, exactly as per code written directly in the init() method. This is emphasised by removing all signs of the inner class. The scope confusion of inner classes (issues 3 and 4 above) disappears (as this is 'lexically scoped').

CICE has semantics where this within the callback still refers to the inner class. Only now you can't even see the inner class, because it's really well hidden. This simply perpetuates the scope confusion of inner classes (issues 3 and 4 above) and gains us little other than less typing. In fact, as the inner class is more hidden, this might actually be more confusing than today.

BGGA also has semantics where a return within the callback will return from init(). This generally isn't what you want, as init() will be long gone when the button press occurs. (It should be noted that this feature is a major gain to the overall power of Java, but comes with a high complexity factor.)

FCM and CICE have semantics where return within the callback returns from the callback, as in an inner class. This enables easy conversion of code from today's inner classes. (For FCM, the return is considered to be back to the # symbol, a simple rule to learn.)
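
None of the proposed syntaxes compile today, so as a rough sketch here is the inner class behaviour that FCM and CICE set out to preserve. The Handler interface and the ReturnDemo class are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical one-method callback interface.
interface Handler {
    void handle(int value);
}

class ReturnDemo {
    static List<String> run() {
        final List<String> log = new ArrayList<String>();
        Handler handler = new Handler() {
            public void handle(int value) {
                if (value < 0) {
                    return; // leaves handle() only; run() is unaffected
                }
                log.add("handled " + value);
            }
        };
        handler.handle(-1); // returns early from the callback
        handler.handle(2);
        log.add("run continues"); // still reached after the early return
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "[handled 2, run continues]"
    }
}
```

The early return only ends that one call to handle(), and the enclosing run() method carries on.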

And here is my personal summary of this discussion:

  // BGGA
  public void init() {
    button.addActionListener({ActionEvent ev =>
      // return will return to init() - BAD for many (most) use cases
      // this means the same as within init() - GOOD and simple
    });
  }
  // CICE
  public void init() {
    button.addActionListener(ActionListener(ActionEvent ev) {
      // return will return to caller of the callback - GOOD for most use cases
      // this refers to the inner class of ActionListener - BAD and confusing
    });
  }
  // FCM
  public void init() {
    button.addActionListener(#(ActionEvent ev) {
      // return will return to caller of the callback - GOOD for most use cases
      // this means the same as within init() - GOOD and simple
    });
  }

Summary

This blog has evaluated just one part of the closures proposals, and this part could be implemented without any of the other parts if so desired. (BGGA would probably find it trickiest to break out just the concept discussed, but it could be done.)

The point is that there are real choices to be made here. The simplicity and understandability of Java must be maintained. But that doesn't mean standing still.

Once again, I'd suggest FCM has the right balance here, and that anyone interested should take a look. Perhaps the right new features for Java are method references and inner methods, and no more?

Opinions welcome as always :-)

17 comments:

  1. I still don't understand what's the big deal with the scope of "this" within anonymous functions. Maybe I'm in the minority, but I've never been confused about the object identity of an anonymous inner class vs its containing class. Sometimes it's annoying to type OuterClassName.this.method(), but otherwise this one aspect seems a solution in search of a problem.

    I do like that the JCA proposal only overrides the meaning of return, break and continue in control blocks and not in other anonymous functions. It makes perfect sense to do so in that one case.

    Contrast this to BGGA, where the meaning is overridden in *all* closure blocks. I agree with Josh Bloch that this is fertile territory for difficult bugs:

    Collections.sort(list, {Integer o1, Integer o2 => o1 - o2 }); // correct
    Collections.sort(list, {Integer o1, Integer o2 => o1 - o2; }); // compiler error: must return an integer!
    Collections.sort(list, {Integer o1, Integer o2 => return o1 - o2; }); // throws NonLocalReturnException at runtime

    With that said, it *would* make sense to have simple expressions expanded to their more verbose counterparts:

    Collections.sort(list, #(int(Integer o1, Integer o2)) { o1 - o2 });

    is equivalent to:

    Collections.sort(list, #(int(Integer o1, Integer o2)) { return o1 - o2; });

    This is a nice touch and accomplishes what seems to be the goal of CICE. The problem is that BGGA takes what is a nice touch in one situation and applies it to *all* other situations when it usually doesn't make sense to do so.

  2. FCM is by far my favourite proposal because it reinforces my Java mental model of classes, methods, interfaces and fields. Code written with FCM fits nicely into the existing Java language.

    When I was first learning reflection, I was surprised that typesafe method literals didn't exist. It seems like FCM fills in a gap, whereas the other two proposals add to the language.

  3. According to
    http://gafter.blogspot.com/2007/12/what-flavor-of-closures.html
    BGGA can do
    """
    Or, using the control invocation syntax:

    button.addClickListener() {
    button.setText("Hello");
    }
    """

  4. I think you need to get a job at Google or something, so more attention is paid to FCM. Method literals are a really big win even without the closures use case.

  5. You claim that FCM+JCA supports library based control structures. I fail to see how these can be supported without lexically scoping the return keyword (and others). Control structures must react the same way, no matter if they are language based or library based. In language based control structures, the return keyword (and others) refers to the outer scope.

  6. Stephen Colebourne, 19 December 2007 15:29

    @Matthew, we have discussed having single expressions be handled without the 'return' or ';'. We chose to go with simplicity, but if the demand was there it could be added easily.

    @Peter, although some of your post got lost, you appear to be asking for adding callbacks using a control structure type syntax. Personally, I find that idea too complex. I like deliberately drawing a distinction between callbacks (an object passed to a method) and control structure.

    @Faith, the meaning of 'return' in JCA is different to the meaning of 'return' in FCM. (It is lexically scoped in JCA). This enables library based control structures in exactly the way you describe and expect. Think of FCM and JCA as two related, but separate, language changes.

  7. You forgot the fourth closure proposal, the most important one, and the one most supported by real Java programmers around the world. The *no closures* proposal.

    Yes, it is ignored by the powers that be, but it is still the one with the largest support. For $DEITY sake, leave the language alone. Why can't the closure fans use or fuck up another language? As if generics weren't bad enough.

  8. I don't understand why "this" in an inner class should mean the outer class automatically. It is convenient but sugar IMO. OuterClass.this works wonderfully well. If we want it fixed you should call up the IDE vendors and have them add a Quick-fix for it.

  9. I'm with you Jesse. I haven't committed my vote, but so far FCM seems much more palatable to me than the other choices. I'm especially into typesafe reflection. Yay! For all those of us writing frameworks, typesafe reflective access to fields, methods, and (dare I hope) properties would be a big win.

  10. I'm in favor of CICE, if closures are to be added at all. It is the most consistent and intuitive for existing Java users.

    I don't think it's necessarily a problem that "this" points to the inner class, but what prevents you from making "this" point to the outer class if you're to be reworking the language anyway?

  11. The reason "this" has to point to the outer class is the control invocation syntax. It's very unnatural in a control structure to have "this" not point to the outer class. Imagine if-blocks or loop-blocks introducing a new reference for "this": very unintuitive! The control invocation syntax is the source of the complexity and needs in the BGGA proposal. Understand the control invocation syntax first, before complaining about "unnecessary" changes.

  12. Thanks for the great explanation and comparison. It would be great if you could highlight the rest of the features the proposals offer in the same way.

  13. @Richard: What would be the primary idiom for getting hold of a reference to the outer class under CICE, then?

    In doing a bunch of event/action handling in Swing, I find that declaring a member instance in the outer class and setting its reference to "this" is nasty boilerplate code. To the best of my knowledge you can't even resolve this problem by the usual Java means of going pseudo-dynamic, throwing in a bunch of introspection code.

  14. Lars Westergren, 21 December 2007 12:30

    I really like what I see about FCM+JCA, it is very readable and easy to understand and therefore currently my favored closures proposal, but I think most (all?) the articles about it that I have seen so far are from the people who submitted the proposal. I wish it would gain more attention and some critical scrutiny from people more experienced in language/VM design than myself. Also I would like to see a bigger example such as a framework done in all three styles, though I realise this is a daunting task - especially since the proposals are still changing.

    References to methods, would that be something that would make something like the Hibernate or Tapestry framework implementations easier - no need to declare methods to invoke/inspect/inject in XML files or annotations anymore, but instead directly in the language? Have I understood it correctly?

  15. Lars Westergren, 21 December 2007 12:53

    Never mind my previous question. I just read the full proposal myself and got the answer. :)

  16. As a Java/JavaScript/ActionScript 3 developer, I prefer the FCM+JCA proposal by far.

    If you want to reverse the Java=>Flash movement, you definitely have to choose this proposal. It's natural (and even more important: not scary).

  17. BTW: I would even like to see a 'function' keyword.

    Don't underestimate the 'attract script developers' need.
