Wednesday, 4 April 2007

First-Class Methods: Java-style closures - v0.5

Stefan and I are pleased to announce the release of v0.5 of the First-class Methods: Java-style closures proposal.

Changes

Since v0.4, we have focussed on understanding the relationship between the various use cases for closures and FCM. We identified the following three use case groups:

  • Asynchronous - callbacks, like Swing event listeners - where the block is invoked by a method other than the higher order method to which it was passed
  • Synchronous - inline callbacks, like list sorting, filtering and searching - where the block is invoked by the higher order method and completes before it returns
  • Control abstraction - keywords, like looping around a map, or resource acquisition

Having identified the three groups, we found that we could extend FCM from asynchronous to synchronous with a couple of simple changes. But we became convinced that the control abstraction use case is completely different. This is because the meaning of return, continue and break needs to be entirely different in the control abstraction use case.
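To make the three groups concrete, here is a rough sketch in ordinary Java rather than FCM syntax. The class and method names (UseCaseSketch, addListener, sortedByLength, firstLongerThan) are invented for illustration and do not come from the proposal. The point to notice is what return does in each case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class UseCaseSketch {
    // Asynchronous: the callback is stored now and invoked later by another method.
    private final List<Runnable> listeners = new ArrayList<Runnable>();

    void addListener(Runnable listener) {
        listeners.add(listener);   // stored, not invoked here
    }

    void fire() {
        for (Runnable listener : listeners) {
            listener.run();
        }
    }

    // Synchronous: the comparator is invoked by sort() and is done when sort() returns.
    static List<String> sortedByLength(List<String> input) {
        List<String> copy = new ArrayList<String>(input);
        Collections.sort(copy, new Comparator<String>() {
            public int compare(String a, String b) {
                return a.length() - b.length();   // returns from compare() only
            }
        });
        return copy;
    }

    // Control abstraction: inside a built-in loop, return leaves the enclosing method.
    static String firstLongerThan(List<String> input, int length) {
        for (String s : input) {
            if (s.length() > length) {
                return s;   // returns from firstLongerThan()
            }
        }
        return null;
    }

    public static void main(String[] args) {
        UseCaseSketch sketch = new UseCaseSketch();
        final StringBuilder log = new StringBuilder();
        sketch.addListener(new Runnable() {
            public void run() { log.append("fired"); }
        });
        System.out.println(log.length());   // prints 0: the listener has not run yet
        sketch.fire();
        System.out.println(log);            // prints fired
        System.out.println(sortedByLength(Arrays.asList("ccc", "a", "bb")));        // prints [a, bb, ccc]
        System.out.println(firstLongerThan(Arrays.asList("a", "bb", "ccc"), 1));    // prints bb
    }
}
```

A user-defined looping construct would want the third behaviour, while the first two want return to end only the block itself.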

As a result of our deliberations, these are the main changes from v0.4:

1) Local variables can now be accessed read/write by all FCM inner methods. This enables the synchronous closure use case, and does not greatly compromise the asynchronous use case. We were concerned about the possibility of race conditions, but we identified that this is caused by concurrent access to the local variable, and could be addressed by a warning annotation - @ConcurrentLocalVariableAccess:

  public void addUpdater(@ConcurrentLocalVariableAccess #(void()) updater) {
    // invoke updater on a new thread
    // marked with annotation as local variable could be accessed
    // by new thread and original at the same time
  }

  public void process() {
    int val = 6;
    addUpdater(#{
      int calc = val;    // OK, can read local variables
      val = 7;           // OK, can update local variables
    });
    System.out.println(val);  // race condition - warning from annotation
  }
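For comparison, a minimal sketch of the same situation in plain Java, where an inner class cannot assign to a captured local variable. It assumes a holder object (AtomicInteger here) in place of the writable local, and the addUpdater below is a hypothetical stand-in that simply starts a thread. It is meant only to show where the concurrent access arises, not how FCM would be implemented.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedLocalSketch {
    // Hypothetical stand-in for addUpdater: runs the updater on a new thread.
    static Thread addUpdater(Runnable updater) {
        Thread t = new Thread(updater);
        t.start();
        return t;
    }

    static int process() {
        // Plain Java cannot assign to a captured local, so the value lives in a holder.
        final AtomicInteger val = new AtomicInteger(6);
        Thread t = addUpdater(new Runnable() {
            public void run() {
                int calc = val.get();   // read the shared value
                val.set(calc + 1);      // update the shared value
            }
        });
        // Reading val at this point, before join(), could observe either 6 or 7.
        try {
            t.join();                   // waiting for the updater removes the race
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return val.get();
    }

    public static void main(String[] args) {
        System.out.println(process());  // prints 7
    }
}
```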

2) Added brief section on transparency. When considering synchronous closures it is desirable to not have to be concerned with every checked exception. Although we discussed alternatives, we simply reference the BGGA work here.
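The problem that transparency addresses can be sketched in plain Java. In the example below, LineHandler, eachLine and write are hypothetical names chosen for illustration. Because the callback interface declares no checked exceptions, the body has to catch and wrap the IOException even though the helper runs synchronously and the caller could perfectly well handle it.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class TransparencySketch {
    // Hypothetical callback type: its method declares no checked exceptions.
    interface LineHandler {
        void handle(String line);
    }

    // Hypothetical synchronous helper: invokes the handler for each line, then returns.
    static void eachLine(List<String> lines, LineHandler handler) {
        for (String line : lines) {
            handler.handle(line);
        }
    }

    // Stand-in for an operation that declares a checked exception.
    static void write(Appendable out, String line) throws IOException {
        out.append(line).append('\n');
    }

    static String copy(List<String> lines) {
        final StringBuilder out = new StringBuilder();
        eachLine(lines, new LineHandler() {
            public void handle(String line) {
                try {
                    write(out, line);
                } catch (IOException e) {
                    // Without exception transparency the checked exception cannot
                    // pass through eachLine(), so it has to be wrapped here.
                    throw new IllegalStateException(e);
                }
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(copy(Arrays.asList("a", "b")));   // prints a and b on separate lines
    }
}
```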

3) Removed the creation of stub methods for named inner methods. This prevents FCM from creating MouseListener instances, but is safer. As a result, we also removed the method compound concept.

Summary

This version tidies up the principal loose ends of FCM and provides rationale for our decisions. We think it represents a simple and safe extension to Java which would be of great value to developers. It should also be noted that we are not intending to extend FCM further to cover the control abstraction use case.

If you've any more feedback, please let us know!

15 comments:

  1. I'd just like to note that despite being listed as technical reviewer, I never saw this version.

  2. Stephen Colebourne, 4 April 2007 at 12:08

    @Ricky, My bad. I just checked, and you never got assigned to the doc. Sorry about that :-(

  3. I had wondered why all the docs were quiet recently. No problem.

  4. Concerning the code block style, you said this: "However, there is a key difference - the block of code cannot be assigned to a variable within the process method." Not true. As an example (assuming lock could store the action here, no good reason except to demonstrate at the moment):

    public static void withLock(Lock lock, Runnable action) {
        lock.setActionTaken(action);
        action.run();
    }

    Then, after calling withLock, you'd have access to the closure (a Runnable in this case). Again, this isn't very useful here, but it demonstrates the point.
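A runnable variant of this point in plain Java, as a sketch only. The real java.util.concurrent.locks.Lock has no setActionTaken method, so a static field named lastAction (a hypothetical name) stands in for it. The lock/unlock calls are added so that the method actually holds the lock while the action runs.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class WithLockSketch {
    // Hypothetical field standing in for lock.setActionTaken(action).
    static Runnable lastAction;

    static void withLock(Lock lock, Runnable action) {
        lock.lock();
        try {
            lastAction = action;   // the block escapes and stays reachable after withLock returns
            action.run();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        final int[] runs = {0};
        withLock(new ReentrantLock(), new Runnable() {
            public void run() { runs[0]++; }
        });
        lastAction.run();              // invoked again, this time outside the lock
        System.out.println(runs[0]);   // prints 2
    }
}
```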

    That said, I think control block closures should only be used for synchronous cases (executed 0 to N times and perhaps in a different thread). And I think either common sense coding conventions ("Don't do that!") or annotations may help to prevent people from trying to use control block closures for asynchronous uses.

  5. FCM would be great, as is your proposal; however, I would do one thing differently:

    Keep names of methods, but allow conversion with a special type-cast.

    Although not intuitive, this can lead to much shorter code. Allow the use of

    Math#min(int, int) ref = Math#min(int, int);

    instead of

    #(int(int, int)) ref = Math#min(int, int);

    Now, the parentheses can be omitted if no overloading exists, e.g.:

    final ActionListener.actionPerformed act = ..#actionPerformed;

    Further, if another language proposal is accepted, everything reduces to:

    var min = Math#min(int,int);
    final act = ..#actionPerformed;

    Finally, we want methods to be convertible where possible, using a type cast, e.g.: ( Math#min(int,int) )

    Example:

    interface Foo {
    Number foo(Object arg);
    }

    interface Bar { // does not extend Foo, but is compatible
    Integer bar(String arg);
    }

    Foo foo = ..

    Foo#foo(Object) mFoo = foo#foo(Object);
    Bar#bar(String) mBar = (Bar#bar(String)) mFoo;

    without parentheses:

    Foo#foo mFoo = foo#foo;
    Bar#bar mBar = (Bar#bar) mFoo;

    with syntax sugar:

    var mFoo = foo#foo;
    var mBar = (Bar#bar) mFoo;


    All the best with your proposal

  6. Sorry, I messed things up; the following should be right:

    interface Foo {
    Number foo(String arg);
    }

    interface Bar { // does not extend Foo, but is compatible
    Integer bar(Object arg);
    }

    Bar bar = ..

    Bar#bar(Object) mBar = bar#bar(Object);
    Foo#foo(String) mFoo = (Foo#foo(String)) mBar;

    without parentheses:

    Bar#bar mBar = bar#bar;
    Foo#foo mFoo = (Foo#foo) mBar;

    with syntax sugar:

    var mBar = bar#bar;
    var mFoo = (Foo#foo) mBar;
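The conversion asked for here (a method accepting Object and returning Integer used where one accepting String and returning Number is expected) can be sketched with the method references that later arrived in Java 8. This is not FCM syntax, and the helper name asFoo is invented for illustration. It only shows that the two signatures are compatible in the direction described.

```java
final class MethodConversionSketch {
    interface Foo {
        Number foo(String arg);
    }

    interface Bar {   // does not extend Foo, but is compatible
        Integer bar(Object arg);
    }

    // Adapts a Bar to a Foo: String is accepted where Object is expected,
    // and Integer is returned where Number is expected.
    static Foo asFoo(Bar bar) {
        return bar::bar;
    }

    public static void main(String[] args) {
        Bar bar = new Bar() {
            public Integer bar(Object arg) {
                return String.valueOf(arg).length();
            }
        };
        Foo mFoo = asFoo(bar);
        System.out.println(mFoo.foo("hello"));   // prints 5
    }
}
```

The reverse direction (treating a Foo as a Bar) would not compile, since a Foo cannot accept an arbitrary Object.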

  7. @Michael, we went through similar discussions in earlier posts. I think "var" would be great, but it would be hard to pass through many Java folks. Failing that, the main alternative typing syntax proposal is this:

    Foo foo = ...
    Integer(Object) mFoo = foo#foo(Object);
    Number(String) mBar = (Number(String))mFoo;

    There's also no need to state which class contains the methods when it comes to types.

    Well, I guess BGGA is a more well known alternative syntax:

    Foo foo = ...
    {Object => Integer} mFoo = foo#foo(Object);
    {String => Number} mBar = ({String => Number})mFoo;

    But that's backwards from normal Java (although you could argue that Java/C is backwards from common sense depending what you want to emphasize - but we're used to it).

    But in any case it looks like discussing typing syntax isn't the main issue right now.

  8. @Tom
    you're right, there's no need to state which class contains the methods when it comes to types, but:

    Finally, I realized my problem is that I dislike methods with method types in the signature, like

    void start( void() runnable );

    for the following reason:

    I find it inconsistent for the Java library to have some classes taking single-method interfaces as method arguments and others taking method types.

    This does NOT introduce any new functionality, BUT it makes me decide which variation I should use when writing a new class.

    Probably, to be most general, every newly created class will use a method type as an argument instead of a single-method interface. However, it would be best to forbid this, since the conversion to an interface is done automatically, both for consistency and because the syntax may get even worse:

    interface Foo {
    String foo(int a, long b, Object c);
    }

    void bar(Foo foo);
    void bar(Foo foo, Object arg); // overloaded

    vs.

    void bar( String(int, long, Object) foo);
    void bar( String(int, long, Object) foo, Object arg);


    Syntax sugar could reduce single-method-interface declarations, like

    interface String Foo.foo(int a, long b, Object c);


    Now, back to my first (two) comment(s): I believe I wanted a single-method interface generated automatically for every single method of a class, without recognizing it, which is of course nonsense.

  9. With the elimination of stub methods, the MouseListener use case may be out, but you can still create a MouseAdapter using the named inner method syntax:

    MouseAdapter adapter = #mousePressed(MouseEvent evt) {
        handleMousePressed(evt);
    }

    Not to beat a dead horse, but it seems to me that if we have named inner methods, we might as well allow for explicitly naming the superclass like I suggested after the 0.3 proposal, so that we can code to the interface instead of the implementation:

    MouseListener listener = MouseAdapter#mousePressed(MouseEvent evt) {
        handleMousePressed(evt);
    };

    It's not much more typing (only the interface name is added), but it is completely explicit about what is going on. And if the argument for changing the semantics of "this" applies to named inner methods, then it applies equally to "qualified" named inner methods as well.

    I think in reality most MouseListener cases will look more like this:

    canvas.addMouseListener(MouseAdapter#mousePressed(MouseEvent evt) {
        handleMousePressed(evt);
    });

    It would be interesting to see if there was a (non-ugly) way to convert this#handleMousePressed(MouseEvent) to a MouseListener, the same way that this#handleActionPerformed(ActionEvent) can be converted to an ActionListener. However there would have to be some way of specifying both the superclass (MouseAdapter) and the method to be overridden (mousePressed). It would be nice to bridge that gap so that doing MouseListeners were just as convenient as doing ActionListeners. The first idea that came to mind:

    canvas.addMouseListener((MouseAdapter#mousePressed) this#handleMousePressed);

    Sort of like "casting" the local method into a MouseAdapter. To save typing the method signature could be inferred from MouseAdapter or this.class if either one is not overloaded.

    Farther down the proposal, there is some ambiguous wording in section 5B, "Common rules and mechanisms": "The return keyword returns from the method" could mean that it returns from the inner method or the enclosing method. It would help if each instance of the word "method" here were qualified with either "enclosing" or "inner" just to make things perfectly clear.

    Then in section 7, "Open issues": it seems to me that issue #4 has already been addressed as of version 0.4. If not then please expand on what is unresolved.

    I also want to second a comment I saw on an earlier version of the proposal, suggesting an even shorter syntax for the simple case of a single expression without a semicolon: just return the value of the expression without making us type "return" and a semicolon. Modifying the PersonCache example, you get:

    PersonCache personCache = ...
    Person found = personCache.find(#(Person person) { person.getAge() >= 18 });

    This makes the simple case simpler the same way that omitting the ()'s for a no-arg method does.
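For a sense of how the expression form reads in practice, here is a sketch using the lambda syntax that Java 8 later adopted, not FCM syntax. The shapes of Person and PersonCache below are assumptions made for the example; the proposal names them but this post does not show their definitions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class ExpressionMethodSketch {
    // Assumed shape of Person: only a name and an age.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Assumed shape of PersonCache: find() returns the first match or null.
    static final class PersonCache {
        private final List<Person> people = new ArrayList<>();

        void add(Person person) {
            people.add(person);
        }

        Person find(Predicate<Person> matcher) {
            for (Person person : people) {
                if (matcher.test(person)) {
                    return person;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        PersonCache personCache = new PersonCache();
        personCache.add(new Person("Al", 12));
        personCache.add(new Person("Bea", 30));
        // Single expression: no "return" keyword and no semicolon inside the block.
        Person found = personCache.find(person -> person.getAge() >= 18);
        System.out.println(found.getName());   // prints Bea
    }
}
```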

  10. Four points (bias disclaimer, these ideas come from C3S - see my URL below):

    1. The this#name syntax gains you very little, e.g.:

    button.addActionListener( this#handleAction( ActionEvent ) );

    could be:

    button.addActionListener( #( e ) { handleAction( e ); } );

    if you allowed optional type inference, which is easy since only SAMs are allowed in the proposal!

    2. As others have pointed out it would be nice to be able to optionally qualify with class/interface name as well as have them inferred, e.g.:

    button.addActionListener( ActionListener#actionPerformed( e ) { handleAction( e ); } );

    3. Since there is no control loop syntax you can make the methods proper inner class methods and remove the restrictions associated with closures. In particular you could allow: both this pointers, multiple methods, and constructor arguments, e.g.:

    MouseAdapter ma = #( /* optional constructor arguments */ ) [
        #mouseEntered( e ) { doSomething( e ); }
        #mouseExited( e ) { doSomething( e ); }
        void doSomething( MouseEvent e ) { ... }
    ];

    4. The function types are translated into standard interface types that reside in a standard library, e.g.:

    #( void( ActionEvent ) )

    is shorthand for:

    Method1< Void, ActionEvent >

    Why bother with the new syntax at all? Just provide the library of MethodX interfaces and use them instead.
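A minimal sketch of what one such library interface might look like in plain Java. Only the name Method1 and its type-parameter order come from the comment above; the call method name and the invoke helper are assumptions made for illustration. A String argument is used instead of ActionEvent to keep the example free of AWT.

```java
final class MethodInterfaceSketch {
    // Hypothetical library interface: one argument of type A1, result of type R.
    interface Method1<R, A1> {
        R call(A1 a1);
    }

    // A higher order method written against the interface rather than a method type.
    static <R, A1> R invoke(Method1<R, A1> method, A1 arg) {
        return method.call(arg);
    }

    public static void main(String[] args) {
        Method1<Integer, String> length = new Method1<Integer, String>() {
            public Integer call(String s) {
                return s.length();
            }
        };
        System.out.println(invoke(length, "closure"));   // prints 7
    }
}
```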


    As I said at the start, I will show my bias - all of the above is straight out of the C3S proposal (see URL below).

  11. This proposal just keeps getting better. Stephen and Stefan, I appreciate that you are doing this out in the open, and especially that you are carefully considering the concerns we raise here.

    Looking back at comments on the older revisions of the proposal, I see it was Tom Palmer who suggested the "expression method," and that his suggestion was even more concise than my own, by omitting the curly braces:

    Person found = personCache.find(#(Person person) person.getAge() >= 18);

    I still agree with Aris and Tom on the method type syntax. Using # is great for method references, but not for method types. Given the choice between:

    int(int,int) ref = ...
    (void() throws InterruptedException) ref = ...

    and

    #(int(int,int)) ref = ...
    #(void() throws InterruptedException) ref = ...

    I would much rather use the former. It's not about saving keystrokes; it's the fact that the first form just feels familiar and comfortable. The second is jarring.

  12. @Matthew, would you distinguish implementations like MouseAdapter#mousePressed(MouseEvent evt) {...} and static method references like MouseAdapter#mousePressed(MouseEvent) by the parameter name? They look awfully close. My opinion is still to get rid of named inner methods completely. It makes it easier to adapt to current cases in Java, but anonymous inner classes will be clearer for such cases.

    Also, they haven't taken my suggestion for expression methods, unless I missed it.

    Side comment: I think control block syntax is vital in Java, and if it's out of scope for this proposal, I hope some other proposal with it (even one such as BGGA) can be given higher priority, even though there's a lot that's nice in FCM.

    Of course, all these proposals are hypothetical so far. But closures sure would be nice.

  13. Stephen Colebourne, 5 April 2007 at 17:01

    Lots of comments, I'll reply roughly earliest to latest...

    @Tom, Your example using withLock to make the Runnable action variable available is exactly what the sentence aims to avoid. One solution is to define the withLock method in such a way that there is no simple way to access the action variable. We'll publish some example syntax ideas soon.

    @Michael, A var keyword is probably a good idea, but many believe it to be against Java-style.

    For method signatures, I would consider it good design for API designers to use a single-method interface for asynchronous use case objects like Swing listeners, and method types for short-lived code like filtering and searching.

    @Tom, Every syntax possibility has upsides and downsides :-)

    @Matthew, Both MouseAdapter#mousePressed(MouseEvent evt) and casting (MouseAdapter#mousePressed) are ideas we've considered before. The former is viable, although you know my reservations (as Tom has just pointed out). The latter doesn't seem to use a 'cast' correctly. I suspect that API changes will handle the MouseListener case better than language changes.

    We'll try and tighten up the wording as you suggest.

    Open issue #4 refers to field references like method references. We added field literals to the proposal, not field references, so the point remains open.

    We had a look at a short expression syntax, like Person found = personCache.find(#(Person person) person.getAge() >= 18); but felt that it could have syntax conflicts. We didn't want to pursue it at this point, as the sync vs async vs control-flow is the main point of argument.

    @Howard, #1 - type inference is certainly possible but it's pretty un-Java.

    #2 - I won't rule it out, although it rather over-emphasises the class when this is a solution (first-class Methods / closures / functions) where the class should be invisible.

    #3 - We chose to remove the more complicated method compound, as using an inner class itself was clearer.

    #4 - Defining a set of MethodN interfaces in the JDK doesn't work with exception transparency. I'd prefer it if method types weren't needed, but I'm pretty convinced that they are.

    @Tom, Named inner methods are useful in a couple of cases, like ThreadLocal.initialValue, and LinkedHashMap.removeEldestEntry. Some developers also like the ability they provide to document the API being called.

  14. Hmm. I thought on the control blocks and came up with this (for example):

    for eachPair(yield Key, Value: Map map) {
        for (Map.Entry entry: map.entrySet()) {
            yield entry.getKey(), entry.getValue();
        }
    }

    Could work. And the block reference does go away. It doesn't work for invokeAndWait(), but it's hard to cover every case. I wonder what Neal would think.

  15. @Stephen,

    In C3S and BGGA the proposal for the definition of the MethodX interfaces did provide exception transparency. In C3S varargs for generic exception parameters is allowed. So in C3S the definition of Method1 is:

    public interface< R, A1, Throwable... Es > Method1 {
        R call( A1 a1 ) throws Es;
    }

    The key is that Es can be any length including empty.

    BGGA is similar but uses throws instead of ...
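The C3S declaration above is not legal Java, but the idea can be approximated with a single exception type parameter, which the language does allow in a throws clause. The sketch below assumes that restriction (one exception type instead of a variable-length list); the invoke helper, the PARSER constant and the parseOr method are invented for illustration.

```java
import java.io.IOException;

final class ThrowingMethodSketch {
    // One exception type parameter: a legal-Java approximation of "Throwable... Es".
    interface Method1<R, A1, E extends Throwable> {
        R call(A1 a1) throws E;
    }

    // The helper declares exactly what the callback declares, and nothing more.
    static <R, A1, E extends Throwable> R invoke(Method1<R, A1, E> method, A1 arg) throws E {
        return method.call(arg);
    }

    // A callback that may throw a checked exception.
    static final Method1<Integer, String, IOException> PARSER =
        new Method1<Integer, String, IOException>() {
            public Integer call(String s) throws IOException {
                if (s.isEmpty()) {
                    throw new IOException("empty input");
                }
                return Integer.parseInt(s);
            }
        };

    // The caller handles IOException directly; no wrapping is needed in the callback.
    static int parseOr(String s, int fallback) {
        try {
            return invoke(PARSER, s);
        } catch (IOException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOr("42", -1));   // prints 42
        System.out.println(parseOr("", -1));     // prints -1

        // With E bound to RuntimeException, no try/catch is required at the call site.
        Method1<Integer, String, RuntimeException> length =
            new Method1<Integer, String, RuntimeException>() {
                public Integer call(String s) {
                    return s.length();
                }
            };
        System.out.println(invoke(length, "abc"));   // prints 3
    }
}
```

Binding E to RuntimeException plays the role of the empty exception list mentioned above.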
