Tuesday, 10 April 2007

Java Control Abstraction for First-Class Methods (Closures)

Stefan Schulz, Ricky Clarkson and I are pleased to announce the release of Java Control Abstraction (JCA). This is a position paper explaining how we envisage the First-Class Methods (FCM) closures proposal being extended to cover control abstraction.

Java Control Abstraction

So, what is control abstraction? And how does it relate to FCM? Well, it's all about being able to add methods to an API that appear as though they are part of the language. The classic example is iteration over a map. Here is the code we write today:

  Map<Long,Person> map = ...;
  for (Map.Entry<Long, Person> entry : map.entrySet()) {
    Long id = entry.getKey();
    Person p = entry.getValue();
    // some operations
  }

and here is what the code looks like with control abstraction:

  Map<Long,Person> map = ...;
  for eachEntry(Long id, Person p : map) {
    // some operations
  }

The identifier eachEntry is a special method implemented elsewhere (and statically imported):

  public static <K, V> void eachEntry(for #(void(K, V)) block : Map<K, V> map) {
    for (Map.Entry<K, V> entry : map.entrySet()) {
      block.invoke(entry.getKey(), entry.getValue());
    }
  }

As can be seen, the API method above has a few unique features. Firstly, it has two halves separated by a colon. The first part consists of a method type which represents the code to be executed (the closure). The second part consists of any other parameters. As shown, the closure block is invoked in the same way as FCM.
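
For comparison, here is a rough sketch of the same method in plain Java, without any JCA syntax. The interface EntryBlock and the helper describe are hypothetical names used only for illustration; they stand in for the method type #(void(K, V)) and for a caller, and are not part of the position paper.

```java
import java.util.Map;

class EachEntrySketch {

    // Hypothetical stand-in for the method type #(void(K, V)).
    interface EntryBlock<K, V> {
        void invoke(K key, V value);
    }

    // Plain-Java counterpart of eachEntry: the block is an ordinary parameter.
    static <K, V> void eachEntry(Map<K, V> map, EntryBlock<K, V> block) {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            block.invoke(entry.getKey(), entry.getValue());
        }
    }

    // Hypothetical caller: builds "id=name;" for every entry of the map.
    static String describe(Map<Long, String> people) {
        final StringBuilder out = new StringBuilder();
        eachEntry(people, new EntryBlock<Long, String>() {
            public void invoke(Long id, String name) {
                out.append(id).append('=').append(name).append(';');
            }
        });
        return out.toString();
    }
}
```

The call site is noticeably noisier than the for eachEntry form, and a return written inside invoke only leaves the inner class method, which is the gap that control abstraction is meant to close.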

The allowed syntax that the developer may enter in the block is not governed by the rules of FCM. Instead, the developer may use return, continue, break and exceptions and they will 'just work'. Thus in JCA, return will return from the enclosing method, not merely from the closure. This is the opposite of FCM. This behaviour is required as the JCA block has to act like a built-in keyword.
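
To make the behaviour of return concrete, the following sketch emulates it in plain Java by unwinding with an exception. This is only an illustration of the observable effect; the position paper does not prescribe this implementation. The names Block, ReturnSignal, each and firstNegative are all hypothetical.

```java
import java.util.List;

class NonLocalReturnSketch {

    // Hypothetical stand-in for a one-argument block.
    interface Block<T> {
        void invoke(T item);
    }

    // Hypothetical signal used to unwind from the block to the enclosing method.
    static final class ReturnSignal extends RuntimeException {
        final Object value;

        ReturnSignal(Object value) {
            super(null, null, false, false); // control flow only, so no stack trace
            this.value = value;
        }
    }

    // Hypothetical control abstraction: invokes the block once per item.
    static <T> void each(List<T> list, Block<T> block) {
        for (T item : list) {
            block.invoke(item);
        }
    }

    // The emulated "return" inside the block ends firstNegative itself,
    // not just the current invocation of the block.
    static Integer firstNegative(List<Integer> numbers) {
        try {
            each(numbers, new Block<Integer>() {
                public void invoke(Integer n) {
                    if (n < 0) {
                        throw new ReturnSignal(n); // emulates: return n;
                    }
                }
            });
        } catch (ReturnSignal signal) {
            return (Integer) signal.value;
        }
        return null;
    }
}
```

The sketch is simplified: a real implementation would also need to check that a caught signal belongs to this particular invocation of the enclosing method.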

One downside of the approach is that things can go wrong because the API writer has access to a variable that represents the closure. The API writer could store this in a variable and invoke it at a later time after the enclosing method is complete. However, if this occurred, then any return/continue/break statements would no longer operate correctly, as the original enclosing method would no longer be on the call stack, and a weird and unexpected exception would be thrown.
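
The failure mode can be sketched in plain Java, again emulating return with an exception. Everything here is hypothetical and for illustration only: Block, ReturnSignal, the badly behaved API method misbehave, and the two driver methods.

```java
class EscapedBlockSketch {

    // Hypothetical stand-in for a no-argument block.
    interface Block {
        void invoke();
    }

    // Hypothetical signal that emulates a return from the enclosing method.
    static final class ReturnSignal extends RuntimeException {
        ReturnSignal() {
            super("enclosing method is no longer on the call stack");
        }
    }

    // Where the badly written API method keeps the block.
    static Block stored;

    // A badly written API method: it stores the block instead of invoking it.
    static void misbehave(Block block) {
        stored = block;
    }

    // The enclosing method can only catch the signal while it is still running.
    static String enclosing() {
        try {
            misbehave(new Block() {
                public void invoke() {
                    throw new ReturnSignal(); // emulates a return from enclosing()
                }
            });
        } catch (ReturnSignal signal) {
            return "returned from block";
        }
        return "completed normally";
    }

    // Invokes the stored block after enclosing() has finished and reports the outcome.
    static String invokeLater() {
        try {
            stored.invoke();
            return "no exception";
        } catch (ReturnSignal signal) {
            return "unexpected exception: " + signal.getMessage();
        }
    }
}
```

Calling enclosing() completes normally because the block never runs inside it. A later call to invokeLater() then finds nothing left to return to, so the signal surfaces as an exception in unrelated code.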

The semantics of a pure FCM method invocation are always safe, and there is no way to get one of these unexpected exceptions. But, for JCA control abstraction we could find no viable way to stop the weird exceptions. Instead, we have chosen to specifically separate the syntax of FCM from the syntax of control abstraction in JCA.

Our approach is to accompany the integration of control abstraction into Java by a strong set of messages. Developers will be encouraged to use both FCM callbacks and JCA control abstractions. However, developers would only be encouraged to write FCM style APIs, and not JCA.

Writing the API part of any control abstraction (including JCA) is difficult to get right (or, more accurately, easy to get wrong). As a result, some coding shops may choose to ban the writing of control abstraction APIs, and having a separate syntax will make this easy for tools to enforce. It is expected, of course, that the majority of the key control abstractions will be provided by the JDK, where experts will ensure that the control abstraction APIs work correctly.

Summary

This document has taken a while to produce, especially by comparison with FCM. In the end this indicated to us that writing a control abstraction is probably going to be a little tricky irrespective of what choices the language designer makes. By separating the syntax and semantics from FCM we have clearly identified the control abstraction issue in isolation, which can only be a good thing.

Feedback always welcome!

32 comments:

  1. Does your proposal attempt to address exception transparency? I read through it but didn't see any comment about that.

  2. Nevermind, I found it:

    "In particular, any exception thrown within the block will appear transparently as part of the method."

    I like that the exception transparency is implicit. Keeps the syntax clean on both ends of the interaction.

  3. All of your examples use static imports of static methods. Wouldn't it be equally valid to have instance methods do the same thing? e.g.

    Map
    map.eachEntry(String key, Object value) {
    __// do stuff
    }

    (I know that we can't add methods to Map; this is strictly an example.)

  4. "In particular, any exception thrown within the block will appear transparently as part of the method."

    Too bad, that makes it impossible to define an API that swallows certain exceptions.

  5. The abstraction seems to add little, why not write:

    Map map = ...;
    eachEntry(map, #(Long id, Person p) {
    // some operations
    });

    and deal with the exceptions and non-local returns separately.

  6. Stephen Colebourne, 10 April 2007 at 10:37

    @Matthew, Yes, control abstraction can be used on an instance method.

    @Neal, It may not be specified here, but we consider it a requirement
    that the API can swallow an exception. The way of achieving exception transparency would probably follow BGGA.

    @Howard, With your example, exceptions would probably be handled anyway.
    The big issue is non-local returns. Being able to add new structures
    that act as keywords to the language is really powerful.

  7. I'm curious about Neal's comment for the control abstraction method swallowing some of the block's exceptions. Could someone give an example of a situation where you would need to do this?

  8. @Matthew, "swallowing" is useful, when the abstraction method already handles some of the exceptions possibly thrown by the passed block.

  9. Thanks for promoting control abstraction.

    I think this is a strong misstatement: "However, developers would only be encouraged to write FCM style APIs, and not JCA. ... Writing the API part of any control abstraction (including JCA) is difficult to get right (or more accurately easy to get wrong)."

    I think it's quite easy to get right for most common cases (especially with proper language support which also isn't too fancy). But having special syntax for this could make it easy to know what you are doing and therefore easier to get right. So maybe it is better to have a special syntax separate from other forms such as what you've proposed. And the compiler might be able to help out too with the distinct syntax.

    Considering exception swallowing: I think this should be the exception (pun noticed after the fact, honestly) rather than the rule. Almost _always_ it would be safe to assume that exceptions in the block would come outside the block. Anything else would be confusing.

  10. Actually I think that "swallowing" exceptions would be fairly simple. If the control abstraction method (CAM) accepts a block type which itself throws an exception, but the CAM doesn't in turn throw that exception, then the CAM is expected to swallow the exceptions declared in the block:

    public static void eachEntry(for #(void(K, V) throws SomeException) block : Map map) {
    __for (Map.Entry entry : map.entrySet()) {
    ____block.invoke(entry.getKey(), entry.getValue());
    __}
    }

  11. My apologies. I forgot to update the method code to actually catch the exception:

    public static void eachEntry(for #(void(K, V) throws SomeException) block : Map map) {
    __for (Map.Entry entry : map.entrySet()) {
    ____try {
    ______block.invoke(entry.getKey(), entry.getValue());
    ____} catch (SomeException ignored) {}
    __}
    }

  12. Here is what the usingFileReader method from the proposal looks like when it swallows IOExceptions:

    public static void usingFileReader(#(void(FileReader) throws IOException) block : File file) {
    __FileReader reader = null;
    __try {
    ____reader = new FileReader(file);
    ____block.invoke(reader);
    __} catch (IOException ignored) {
    __} finally {
    ____if (reader != null)
    ______try { reader.close(); } catch (IOException ignored) {}
    __}
    }

  13. Stephen Colebourne, 10 April 2007 at 18:42

    @Tom, The statement that JCA is hard may be too strong. However, it definitely is easy to get wrong. At the very least there should be a checklist of things to check when implementing such a method.

    Ideally, tools like checkstyle or PMD could then encode the rules into their checking to help further. This is a key advantage of using a different API declaration syntax - that tools can easily identify it and apply specific rules.

    @Matthew, Yes, you've worked through examples of swallowing exceptions. These are an essential part of closures.

  14. Thanks, Matthew, for bringing up examples. I thought the BGGA proposal explained well enough why it makes sense to swallow exceptions.
    Matching the exceptions a block may throw against those not thrown by the abstraction method is only half the story. An abstraction method may take closures that throw more than the declared exceptions. Otherwise, one would need to declare a bunch of methods or keep adding to the exceptions allowed for a block. BGGA does provide such a mechanism, although generics are not necessarily the best tool for it. Some collector mechanism may suffice, which would also avoid having to override the generics when applying control abstraction. For example:

    public static void usingFileReader(#(void(FileReader) throws IOException, ...) block : File file) throws ... {
    __// code catching IOException
    }

    Where ... is a collator for passed through exceptions. This is necessary to allow developers to deliberately define if the abstraction method will take exceptions (and which ones) or pass exceptions.

    @Tom, I'm not sure I agree with Stephen on that specific statement. But in the end, it is more difficult to get JCA right than FCM style closures. Especially handling of non-local-transfer, i.e., break, continue, and return statements gives one strong headaches.

  15. Break, continue, and return can be made to work automatically except in the cases of execution in separate threads (e.g., "invokeAndWait()"). Or in cases of deferred execution which shouldn't be done, and that was already discussed well. Synchronous work on the current thread would be the common use case, and it would work automatically and easily.

    For swallowing exceptions, the more I think about it, the more I think it should _never_ be done by control abstraction.

    And that's what's nice about FCM assuming asynchronous and JCA assuming synchronous. Exceptions can just work correctly automatically in each case. FCM can assume you won't throw them to the surrounding block by default, and JCA can assume you will. Problem solved.

  16. @Stefan, I have read the BGGA proposal but have been following FCM more closely, so forgive me if I ask redundant questions.

    I did understand the semantics of exception transparency, but thank you for taking time to spell things out.

    I got the impression while reading the proposal that exception transparency was implicit. In other words, by using the block syntax, any exception that the block throws is also thrown by the control abstraction method. (Unless the block type in the CAM signature throws a specific exception type, in which case the CAM would be required to catch that exception.)

    However I didn't see any mention of the "..." syntax in the JCA proposal. Are you just clarifying intent with your example, or are the ellipses actually part of the intended syntax?

  17. @Tom, it's difficult to explain in a short comment, but unfortunately break, continue, and return do not work automatically, although it looks as if they would. The block of a JCA is passed as an inner method to the abstraction method, which can handle it like any other inner method, i.e., call it, store it, loop over it, or apply it concurrently (e.g., for matrix operations). There is and should be no restriction on what can be done, as this would limit the expressiveness of closures in general. Hence, all the problems of non-local transfer as described in BGGA can appear.
    It takes more than this short paragraph to explain; much more information is given in a couple of posts on Neal's blog, though.

    @Matthew, JCA is a position paper, not a proposal. We suggest a syntax to clarify the application of control abstraction that fitted our requirements. It's by no means complete nor does it fully cover all features a control abstraction may provide or need to provide in the end.
    So, yes, the "..." is only an option, a syntax I used to clarify my explanation, as is the syntax stated in the JCA document. The syntax given in BGGA is another option.

  18. My "never swallow" statement was off base. If the main purpose of a block is to swallow certain exceptions (or in other clear cases), then that sounds okay. For example (perhaps in a unit test framework using closures instead of annotations):

    assertException(NullPointerException.class) {
    ___ String a = null;
    ___ a.length();
    }

    But I'd rather have the common case easy than support this easily.

  19. Or, in other words, if I had to write it this way to make JCA simpler, so be it:

    assertException(NullPointerException.class, #{
    ___ String a = null;
    ___ a.length();
    });

    So, I'm back in the JCA should never swallow exceptions camp. (What a 360. Sorry about that.)

  20. Since JCA should always be void, if it already gets its own syntax, it could be nice to clean things up some:

    public static void eachEntry(for(K, V) block: Map map) {
    ____ ...
    }

    I understand that explicitness has value, too. So it's a case of explicit vs. redundant here. I guess Java loves redundant in most cases, but not always.

    Still, I'm okay with the syntax presented. Maybe experiment with the "for" out in front of the method name such as with BGGA, too, though. It seems more obvious in front:

    public static for eachEntry(#(void(K, V)) block: Map map) {
    ____ ...
    }

    (Not that I've become a fan of '#' in type names, but I'm dodging that a bit for now.)

  21. You can handle a non-local return, break, and continue without needing the complication of a control abstraction by naming the non-local method to return from. This is the opposite way round to BGGA, where you have to stop non-local returns. E.g.:

    Integer finder( List< Integer > list, Integer value ) {
    __find( list, #( Integer e ) {
    ____if ( e.equals( value ) ) { return#finder e; }
    __} );
    __return null;
    }

    Which is equivalent to your:

    Integer finder( List< Integer > list, Integer value ) {
    __for find( Integer e : list ) {
    ____if ( e.equals( value) ) { return e; }
    __}
    __return null;
    }

  22. Stephen Colebourne, 10 April 2007 at 23:33

    @Tom, on the syntax, it is important to remember that block is just an ordinary variable, so it should be prefixed by a genuine Java type. In this case, the type is being restricted to be a method type. This is necessary, as you could assign the block to an instance variable, or put it in a hashmap or do all sorts of weird stuff (most of which are of course a bad idea).

    What this syntax does suggest is that the alternative method type syntax from FCM may make sense:

    #(String return int throws Exception)

    hence:

    public static void eachEntry(for #(K, V) block: Map map) {
    __ ...
    }

    On the for, personally I'm not 100% convinced it's needed on the application side, but it is needed in the API to define if the method captures continue/break.

  23. I understand it's just a variable. I was proposing being inconsistent with type names in this case. Out of the options presented, I like your current spec better than "#(String return int ...)". So, I'll also prefer "for #(void(K, V)) block: ..." if you want to stay consistent.

  24. Also, I think this is more how waitUntilDone() should look (including lots of comments to explain):

    // Use Runnable instead of #(void()) here because
    // we can't allow type conversion for invokeAndWait().
    // And might as well not create a new object just for this.
    // Also a good example why we shouldn't cheat on type names as I'd proposed.
    public static void waitUntilDone(for Runnable block)
    _______ throws InterruptedException, InvocationTargetException {
    ___ try {
    _______ SwingUtilities.invokeAndWait(block);
    ___ } catch (InvocationTargetException e) {
    _______ Throwable cause = e.getCause();
    _______ if (cause != null) {
    ___________ // Method that throws even checked exceptions.
    ___________ // JCA should already cause proper compile-time checking in the outside block.
    ___________ // This would also include control-flow exceptions for break/continue/return.
    ___________ sneakyThrow(cause);
    _______ }
    _______ throw e;
    ___ }
    }

  25. Stephen Colebourne, 11 April 2007 at 12:26

    Yes, this is more how waitUntilDone should be, except for the method signature:

    public static void waitUntilDone(Runnable block : ) throws InterruptedException, InvocationTargetException {

  26. I guess I'm still a bit mixed on the exact syntax. Thanks for fixing it to be consistent with the rest of your proposal.

  27. I reviewed a bit better and saw that semantically you were using "for" like BGGA. Sorry for the error. Concerning that, I like the BGGA placement of "for" better. So, combining the two styles gives this:

    void for eachEntry(#(void(K, V)) block, Map map) { ...

    I do love to see "throws X" go away. And, beating a dead horse, I'd like this even better (just assume this applies as a general caveat on "#()" typing syntax until I mention otherwise - and I'll try to avoid mentioning it again for a while):

    void for eachEntry(void(K, V) block, Map map) { ...

  28. BTW, I like BGGA placement of "for" better because it looks the same as how it would be used. That makes it way more clear in my opinion.

  29. And realized my other syntax error now. I should have said this:

    void for eachEntry(#(void(K, V)) block: Map map) { ...

    I rely too much on IDEs, I guess.

  30. "We also considered and rejected the option of not assigning a variable name to the block being passed to the control abstraction method definition. This option increases the safety of the overall solution, effectively preventing many possible problem areas. However, it removes the ability to perform some important use cases, so was not a viable option."

    I'm curious, which use cases are prevented when no variable name is assigned to the block?

  31. Stephen Colebourne, 12 April 2007 at 23:34

    @Matthew, It is possible to implement a multithreaded execution around items in a list, where all the processing is completed before the control abstraction syntax is completed. This is hard to achieve without a variable.

    Also, something as simple as implementing one method that delegates to another is impossible without a block variable.
