Saturday, 12 May 2007

Closures - comparing the options' impact on language features

At JavaOne, closures were still a hot topic of debate. But comparing the different proposals can be tricky, especially as JavaOne only had BGGA-based sessions.

Once again, I will be comparing the three key proposals in the 'closure' space: BGGA from Neal Gafter et al, CICE from Josh Bloch et al, and FCM from Stefan and myself, with the optional JCA extension. In addition, I'm including the BGGA restricted variant, which is intended to safeguard developers when the closure is to be run asynchronously, and the ARM proposal for resource management.

In this comparison, I want to compare the behaviour of key program elements in each of the main proposals with regard to lexical scope. By lexical scoping, I mean which features of the surrounding method and class are available to code within the 'closure'.

Neal Gafter analysed the lexically scoped language constructs in Java. The concept is that 'full closures' should not impact on the lexical scope. Thus any block of code can be surrounded by a closure without changing its meaning. The proposals differ on how far to follow this rule, as I hope to show in the table (without showing every last detail):

Is the language construct lexically scoped when surrounded by a 'closure'?

                    Inner class  CICE      ARM  FCM      JCA      BGGA Restricted  BGGA
variable names      Part (1)     Part (1)  Yes  Yes      Yes      Part (7)         Yes
methods             Part (2)     Part (2)  Yes  Yes      Yes      Yes              Yes
type (class) names  Part (3)     Part (3)  Yes  Yes      Yes      Yes              Yes
checked exceptions  -            -         Yes  Yes (4)  Yes (4)  Yes (4)          Yes (4)
this                -            -         Yes  Yes      Yes      Yes              Yes
labels              -            -         Yes  - (5)    Yes      - (5)            Yes
break               -            -         Yes  - (5)    Yes      - (5)            Yes
continue            -            -         Yes  - (5)    Yes      - (5)            Yes
return              -            -         Yes  - (6)    Yes      - (8)            Yes

(1) Variable names from the inner class hide those from the lexical scope
(2) Method names from the inner class hide those from the lexical scope
(3) Type names from the inner class hide those from the lexical scope
(4) Assumes exception transparency
(5) Labels, break and continue are constrained within the 'closure'
(6) The return keyword returns to the invoker of the inner method
(7) The RestrictedClosure interface forces local variables to be declared final
(8) The RestrictedClosure interface prevents return from compiling within the 'closure'
BTW, if this is inaccurate in any way, I'll happily adjust.
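
To make the 'Inner class' column concrete, here is a minimal sketch in plain Java of notes (1), (2) and (6) and of the `this` row. The class, interface and method names (`InnerClassScopeDemo`, `Task`, `hidingDemo`, `returnDemo`) are made up for illustration and do not come from any of the proposals.

```java
class InnerClassScopeDemo {
    private final String name = "outer";

    // Hypothetical single-method interface standing in for a 'closure' type.
    interface Task {
        String run();
    }

    @Override
    public String toString() {
        return "outer.toString";
    }

    // Notes (1) and (2): members of the inner class hide the enclosing names.
    String hidingDemo() {
        Task task = new Task() {
            private final String name = "inner";   // hides the enclosing field

            @Override
            public String toString() {             // hides the enclosing method
                return "inner.toString";
            }

            @Override
            public String run() {
                // Unqualified names and 'this' resolve to the inner class instance;
                // the enclosing object needs the qualified form.
                return name + "," + toString() + "," + InnerClassScopeDemo.this.name;
            }
        };
        return task.run();
    }

    // Note (6): 'return' only leaves run(), never the enclosing method.
    String returnDemo() {
        Task task = new Task() {
            @Override
            public String run() {
                return "from run";
            }
        };
        task.run();                 // the value returned by run() is simply discarded here
        return "from returnDemo";
    }

    public static void main(String[] args) {
        InnerClassScopeDemo demo = new InnerClassScopeDemo();
        System.out.println(demo.hidingDemo());  // prints inner,inner.toString,outer
        System.out.println(demo.returnDemo());  // prints from returnDemo
    }
}
```

Under a fully lexically scoped design, the unqualified `name`, `toString()` and `return` in the block would instead refer to the enclosing method and class.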

So more 'Yes' markers means the proposal is better, right? Well, not necessarily. In fact, I really don't want you to get that impression from this blog.

What I'm trying to get across is the kind of problems that are being tackled. Beyond that you need to look at the proposals in more detail with examples in order to understand the impact of the Yes/No and the comments.

Let me know if this was helpful (or inaccurate), or if you'd like any other comparisons.

13 comments:

  1. What is "BGGA Restricted"? Is it fair to leave off ARM blocks from CICE?

  2. In addition to including the ARM blocks from CICE, it would be considerate to also include Howard Lovatt's C3S proposal. Getting ignored is not fun when it's being done to you, but I guess dishing out some of that ignorance balances things out?

  3. I don't think ARM belongs here, as it is about introducing new language constructs for specific cases and not developer-definable closure-like mechanisms.

  4. I do think ARM-blocks belong here, as they are in some sense an alternative to user-defined control constructs for Java. Adding a facility that allows users to define their own control constructs is risky. It will change the character of the language, for better or worse. The idea behind ARM-blocks is to add a small amount of syntactic sugar to solve a real, common problem without risk of destabilizing the language. In this way, it is similar in character to the for-each loop in Java 5, which I see as a great success.

    Also I want to remind people that the "checked exceptions" issue is a red herring. It is largely orthogonal to closures. It is solved by allowing for explicit "disjunctive exception types" (e.g., InstantiationException | IllegalAccessException). This solution works equally well with any of the proposals on the table including "do nothing" (AKA "Inner Classes").

    I believe disjunctive exception types would be a real improvement to the language, independent of the adoption of any of the closures proposals. It solves a real problem without a significant increase in the conceptual weight of the language.

    Note that it would also solve a common problem with code repetition in catch clauses:

    try {
        return clazz.newInstance(); // clazz is a Class reference ('class' itself is a reserved word)
    } catch (InstantiationException e) {
        logger.log(SEVERE, "instantiation failed", e);
        throw e;
    } catch (IllegalAccessException e) {
        logger.log(SEVERE, "instantiation failed", e);
        throw e;
    }

    could be replaced by:

    try {
        return clazz.newInstance();
    } catch (InstantiationException | IllegalAccessException e) {
        logger.log(SEVERE, "instantiation failed", e);
        throw e;
    }

    This seems like the sort of small, high-payoff language change that is appropriate for a mature language such as Java.
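
For reference, Java 7 later adopted a multi-catch form very close to the one in this comment. The following is a minimal sketch, assuming Java 7 or later; `MultiCatchDemo` and `create` are hypothetical names, and `Class.newInstance()` is kept only to mirror the snippet above (it has been deprecated since Java 9).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class MultiCatchDemo {
    private static final Logger logger = Logger.getLogger("MultiCatchDemo");

    // Hypothetical helper: one handler covers both exception types.
    @SuppressWarnings("deprecation") // newInstance() is used only to match the comment's example
    static Object create(Class<?> type)
            throws InstantiationException, IllegalAccessException {
        try {
            return type.newInstance();
        } catch (InstantiationException | IllegalAccessException e) {
            logger.log(Level.SEVERE, "instantiation failed", e);
            throw e; // rethrown with its original type, so the throws clause stays precise
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(create(StringBuilder.class).getClass().getName());
        try {
            create(Runnable.class); // an interface cannot be instantiated
        } catch (InstantiationException e) {
            System.out.println("caught InstantiationException");
        }
    }
}
```

The single handler logs once and rethrows, which is the behaviour the two duplicated catch clauses had.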

  5. This is a slight misunderstanding. Surely, ARM belongs to CICE as JCA belongs to FCM. But the comparison Stephen made is about closures and lexical scoping. Being language constructs, there are (or should be) no issues in using ARM with respect to lexical scoping.

  6. Just a remark on C3S, as it seems to be missing: in its current state, with such an extreme impact on language syntax, it is in my opinion not a realistic option for integration into Java. I'd rather use JRuby (or Scala) directly than try to make Java look like Ruby. And as both languages seem to embed nicely in a Java application, I cannot see a reason to.

  7. Stephen Colebourne, 12 May 2007 20:01

    I have a limited internet connection right now, but these all seem valid points. Looking at the comparison, including ARM makes sense, as it is a related independent element that should be considered.

    Including C3S is more complex, as it is more than just a simple closures proposal. It involves fairly big language syntax changes, such as omitting brackets etc. For me, that makes it hard to get a handle on what the 'closure' aspects are. If Howard is listening, please send me your equivalent column and I'll add it when I get a good internet connection.

    BGGA restricted means BGGA where the interface being implemented implements RestrictedClosure.

  8. Josh, I would be very interested in your analysis of this closure frenzy that is going on in the Java community. I would be equally interested in why you are taking the back seat on this one. You clearly have an opinion about how closures should be crafted and I think it is absolutely spot on.

    I would ask for you to make it more public as I think it is the right way forward. Your word carries a lot of weight in the Java community. Or is this subject sensitive within Google?

    Cheers,
    Mikael Grev

  9. Thank you to jnice for suggesting that C3S be included in the comparison, and thank you to Stephen for agreeing. I think jnice raises an important issue. If a proposal is to gain wide acceptance, its author needs to demonstrate awareness and understanding of the alternatives; otherwise they can hardly claim that their proposal is superior.

    With regard to Stephen's comparison table and associated notes I would suggest a few modifications:

    A. Note 1 should read field not variable (a local variable in all the proposals hides an enclosing variable).

    B. Notes 2 and 3 are repeats of note 1; similarly rows 2 and 3 are repeats of row 1. The whole concept of members could be combined into "inherited members hide enclosing names".

    C. To be even-handed with the inner class based proposals you could add a line that says "access to inherited members".

    D. Similarly to the above point, there could be two "this" lines, one for the enclosing scope this and one for the inherited scope this.

    E. The labels, break, and continue lines could be combined; they are the same in all cases.

    F. Notes 5 and 6 are really just no, i.e. you don't need the note at all.

    G. Note 8 could be more explicit. You cannot use return at all. The only form of exit is going off the end of the control block.

    H. I am with Josh Bloch: the multiple exceptions issue is so far removed from the inner class/closure stuff that it is probably best omitted.

    I. You could be more explicit with the descriptions of the rows.

    I have sent a possible alternative table to Stephen including the above suggestions and a C3S column, but unfortunately you don't seem able to post in HTML on the site and therefore I can't post the new table directly.

    An alternative comparison of inner class/closure proposals is http://www.artima.com/weblogs/viewpost.jsp?thread=202004.

  10. I also think of C3S as a different language rather than a proposal to add features to Java. If needed, it would belong better on a separate chart comparing features across languages (if it included both implemented and hypothetical languages).

  11. @Tom,

    The C3S proposal is simply syntactic sugar for existing constructs, just as CICE is. Therefore it is arguable that both these proposals are simpler than FCM/BGGA proposals that add a closure construct with different semantics than inner classes. So in the semantic sense C3S is more Java than FCM/BGGA.

    With regard to syntax, a withLock example in C3S would be:

    withLock lock, method{out.println "Hello"};

    To which you can add brackets if you like:

    withLock(lock, method{out.println("Hello")});

    Contrast this with BGGA (without control loop abstractions - I will come back to why without control loop in a moment) for example:

    withLock(lock, {=> out.println("Hello")});

    I would contend that the use of the keyword, method, is more Java than the use of the symbol, =>.

    The omitting bracket rule is an alternative to control loop abstractions that are in BGGA, FCM (JCA), and CICE (ARM). The bracket rule is simply described: when calling a method the brackets may be omitted, provided that the call is unambiguous.

    This bracket rule is much simpler than any of the other proposals for control abstractions and has the added benefit of many more use cases. It is also one of the features that make languages like Ruby and Haskell great for writing embedded domain specific languages (DSLs) in.

    In summary I would contend that C3S is Java with syntactic sugar and as such is easier to understand than proposals that change semantic meaning.

  12. My problem with the word method is that it is meaningless here. What we're looking at isn't a method. There's no object to speak of, at least none that the closure is part of.

    If one can rename functions or blocks as methods and hope to gain Javaness, then there's something wrong.

    Parentheses are no barrier to DSLs, as Lisp happily demonstrates.

  13. @Ricky,

    In CICE and C3S the block is a method just like any other inner class method. I agree that in FCM and BGGA the behaviour is different than normal methods. You can consider the CICE and C3S version as a superset of the capabilities of those available in BGGA and FCM. In particular with inner classes, CICE, and C3S the following type of code is possible:

    /* Not using generics (to keep posting simple) but using C3S syntax for inner class */
    abstract class Reducer {
        int total;
        abstract int call( int x );
        static int reduce( int[] array, int initial, Reducer reducer ) {
            reducer.total = initial; // total is an instance field, so reach it through reducer
            for ( int value : array ) { reducer.call( value ); }
            return reducer.total;
        }
    }

    ...

    int[] a = ...;
    int sum = reduce a, 0, method( x ) { total += x };

    This comes out nicer than other proposals because you have access to the field total from within the inner class method. Note also how you can write control structures, such as reduce, that return a value. Again, this is something that FCM and BGGA can't do, and the control structure rule is simple and optional: optionally miss the brackets out if there is no ambiguity.
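
For comparison, the same idea can be written today with an ordinary anonymous inner class, without any C3S syntax. This is a minimal sketch: `Reducer` follows the shape of the class in the comment, `ReduceDemo` and `sum` are hypothetical names, and having `call` return the running total is an assumption made so that the body compiles as plain Java.

```java
abstract class Reducer {
    int total;                       // state shared between reduce() and call()

    abstract int call(int x);

    // A user-written 'control structure' that returns a value.
    static int reduce(int[] array, int initial, Reducer reducer) {
        reducer.total = initial;
        for (int value : array) {
            reducer.call(value);
        }
        return reducer.total;
    }
}

class ReduceDemo {
    // Hypothetical helper showing the call site.
    static int sum(int[] values) {
        return Reducer.reduce(values, 0, new Reducer() {
            @Override
            int call(int x) {
                // 'total' is the inherited field of Reducer, visible without qualification
                return total += x;
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));  // prints 6
    }
}
```

The unqualified access to `total` works because the block is a method of a `Reducer` subclass, which is exactly the inherited-member scoping the table marks as 'Part' for inner classes.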
