Sunday, 21 January 2007

Closures - Last-line-No-SemiColon nastiness

The BGGA closures proposal currently plans to use a 'last-line-no-semi-colon' to return the result of the closure back to the closure-defining method. I personally find this rather nasty. What about an alternative?

Last-line-no-semicolon

Here is some code that traverses a complex object structure to find all the bookings that contain a specific purchase type. (Yes, I know its not OO, a lot of real world code isn't - OO isn't the point of this blog post).

List[Booking] matches = new ArrayList[Booking];
for (Booking booking : bookingList) {
  for (Purchase purchase : booking.getPurchaseList()) {
    if (purchase.matchesType(typeToFind)) {
      matches.add(booking);
      break;
    }
  }
}

And here is the code refactored to use a BGGA closure, 'findAll', using the current rules:

List[Booking] matches = findAll(Booking booking : bookingList) {
  boolean matched = false;
  for (Purchase purchase : booking.getPurchaseList()) {
    if (purchase.matchesType(typeToFind)) {
      matched = true;
      break;
    }
  }
  matched        // passes result to findAll, last-line-no-semicolon
}

So, adding the closure has made things worse. We had to introduce an artificial matched variable, and the result was returned on the last line without a semicolon, which is highly error prone and very un-Java-like.

In my prior blog post on this topic I suggested one possibility was using the goto keyword to help. But the truth is thats just horrible. So, I'll propose an alternative which I believe fits BGGA much better:

List[Booking] matches = findAll(Booking booking : bookingList) {
  for (Purchase purchase : booking.getPurchaseList()) {
    if (purchase.matchesType(typeToFind)) {
      => true;        // passes result to findAll of true 
    }
  }
  => false;           // passes result to findAll of false
}

So, this proposal is to simply use the => symbol to act as the result syntax within a closure. The true/false value is returned to the closure method which then decides whether to add the booking to the overall result or not. This fits the other use of => in the function type declaration, where it means 'returning'.

So, the overall effect is that a result can be returned from any part of the closure block, and there are no missing semicolons or weird last line effects.

Nested closures

Neal has previously raised a problem with this approach. He points out that the use of something like this means that refactoring the method further to introduce a second closure may cause problems. Here we change the inner for loop to the 'eachNotNull' closure:

List[Booking] matches = findAll(Booking booking : bookingList) {
  eachNotNull(Purchase purchase : booking.getPurchaseList()) {
    if (purchase.matchesType(typeToFind)) {
      => true;        // now within two closures! help!
    }
  }
  => false;           // passes result to findAll of false
}

As shown, the '=> true' is now within two closures, 'findAll' and 'eachNotNull'. So what does the => mean? Well, this actually isn't a new concept in Java - break and continue currently refer to the closest loop, and so are vulnerable to refactorings that introduce a new loop level. As such, I suspect that this isn't as big a deal as Neal thinks.

Option A for this situation is that the => operator returns the result to the closest surrounding closure that accepts that type of the result. In this case, the 'eachNotNull' closure has a 'Void' result, so the boolean value 'true' is still passed back to the 'findAll' closure. Thus, adding the 'eachNotNull' closure had no effect on the logic.

Option B is that any result within a second closure has to be labelled to reach its destination. This is the same rule as break/continue from a loop within a loop in current Java.

outer:   // label
List[Booking] matches = findAll(Booking booking : bookingList) {
  eachNotNull(Purchase purchase : booking.getPurchaseList()) {
    if (purchase.matchesType(typeToFind)) {
      => outer: true;    // labelled result now reaches findAll
    }
  }
  => false;              // no need to label, result goes to closest closure, findAll
}

Here, the closure is labelled 'outer' in exactly the same way as in current Java. The => result then references the label to return the result to the correct closure.

So which do you prefer? Option A (auto-seek best first matching result) or option B (explicit labelling)? I suspect that option A is too vague and complex for Java, so I'm leaning towards option B.

Fully refactored

By the way, if the code was refactored a little more, this is what you might end up with:

List[Booking] matches = findAll(Booking booking : bookingList) {
  => booking.matchesType(typeToFind);
}

To my eyes, this is the most attractive form, making it clear that a result is being returned from the closure block, and retaining the standard Java semicolon.

Summary

I'm determined to excise the evil 'last-line-no-semicolon' concept from closures - its a seriously nasty hack. Let me know your thoughts on it!

16 comments:

  1. Hi Stephen,

    The statement-form closure syntax (unfortunately) cannot be used as an expression, so your syntax is a bit out.

    I'd suggest return label:expression; as a possible syntax, e.g., return outer:true; Of course, this is confusing if you still allow unlabelled returns from within closures. And if you don't, then it's inconvenient.

    Perhaps expression-form closures (the => variety) could be restricted to a single expression, no statements. Then the last-line-return wouldn't apply.

    Lisp typically (but not everywhere) uses last-expression-return, and C has last-expression-return in its comma operator (int a=5,6,7; a is 7), so there already is a precedent for this in programming. It just doesn't smell of coffee.

    ReplyDelete
  2. You're not using the BGGA syntax. The control invocation syntax does not provide a way of returning a result. Even if it did, you've chosen a horrible way to use closures to solve your problem.

    The fact that a loop redefines the meaning of break within the loop doesn't help the refactoring problem you would be introducing with a closure return statement. You've only proven that a loop makes a poor closure. We already knew that.

    ReplyDelete
  3. This is a somewhat twofold blog entry. It seems valid to me, that using control invocation syntax on closures only should be applicable, if the closure's return type is void or java.lang.Void. This would be consistent with the general use of control statements. Though, the "withLock" example in the closures proposal is defined returning a value and is later on used applying the control invocation syntax, which seems a bit inconsistent.

    What's to think over is the scope of "break" and "continue" within closures. Basically, both statements should stay in the visible scope, but I actually see no difference to handling exceptions. Remember the "withLock" example (simplyfied):

    void withLock(Lock lock, {=> throws CEx} block) throws CEx {
    ´ lock.lock();
    ´ try {
    ´ ´ return block.invoke();
    ´ } finally {
    ´ ´ lock.unlock();
    ´ }
    }

    Applying the example like follows:

    withLock(propertiesLock) {
    ´ this.logLevel = this.properties.getLogLevel();
    ´ if (!supportedLevels.contains(this.logLevel)) {
    ´ ´ throw new CEx("Unsupported level");
    ´ }
    }

    As the proposal states and everybody would expect, the exception thrown within the closure will be handled by "withLock" to unlock the lock, first, and then propagated outside the closure.

    Now take a "forEach" closure in a Matrix API that executes on each cell and which, in my humble opinion, is no uncommon pattern for closures (actually a recurrent example of why closures would have made the for-each loop an API issue rather than a language issue):

    void forEach({Cell =>} block) {
    ´ for (List row : columns) {
    ´ ´ for (Cell cell : row) {
    ´ ´ ´ block.invoke(t);
    ´ ´ }
    ´ }
    }

    Applying "forEach" like follows:

    for (Matrix m : matrices) {
    ´ m.forEach() {C cell =>
    ´ ´ if (cell.isEmpty()) continue;
    ´ ´ ...
    ´ }
    }

    I'd expect "continue" to refer to "forEach" and not the for-loop on matrices.

    What is _not_ needed, in my point of view, is a new notion for what scope the statements refer to (as I am not aware of what's inside the code that executes my closure), but some general propagation mechanism to handle them appropriately like exceptions get propagated correctly.

    Maybe, one could argue similar for a "return" statement, but binding "continue" and "break" to a statement block seems Java-ish, while a "return" statement is rather bound to the current visible method.

    ReplyDelete
  4. Stephen Colebourne21 January 2007 14:22

    @Archimedes, @Ricky, The return keyword isn't an option because it returns from the surrounding method, not back to the closure defining method. Its just too strongly linked in developers minds to be reused.

    @Ricky, I believe that restricting expression form closures to a single expression would work, but be too restricting. As you point out, its clear where the last-line-no-semicolon has come from, but its just not Java. I would be happy to have the semicolon optional if the closure body only consists of a single expression.

    ReplyDelete
  5. As with generics, I see it with a certain distance that language constructs (should) become more and more concise, or brief. Nothing to say about better or cleaner architecture of code. But the readability is the most important thing for me in first place. A lot of errors occur because a code change is made after the existing code has not been understood correctly. Cleaner micro-architecture does not help a lot to reduce errors. At least, Generics don't do as I assume.

    ReplyDelete
  6. Stephen Colebourne21 January 2007 14:33

    @Neal, I've re-read BGGA and I see your syntax objection is stated. However I disagree with it. (Although as Stefan spotted, BGGA 0.4 is inconsistent between the withLock closure-defining-method and the use of withLock using the control-invocation-form.)

    BGGA states that the control-invocation form is not an expression, as it looks like a statement, and there would be semicolon issues. I believe this is being overly protective (I think I understand why as a language designer you chose to disallow it, but it seems too restrictive as a user). I'll blog again to explain my thoughts fully.

    I also don't understand your point about this being a horrible way to use closures or your point about loops being 'poor closures'. Surely, loops are a prime use case for closures!

    ReplyDelete
  7. Stephen Colebourne21 January 2007 14:40

    @Stefan, I'll write about my take on continue/break bindings soon. I believe that it is something that will have to be addressed to make the proposal gel with existing Java code.

    @Klaus, A key part of what I'm trying to do is to ensure that all the language change proposals blend successfully with existing Java.

    ReplyDelete
  8. What's with the names 'findAll' and 'eachNotNull'? Aren't closures anonymous? And aren't those names utterly useless anyway?

    And yes, that last-line-no-semi-colon makes the syntax confusing. Which is always bad.

    The 'Fully refactored' version goes pretty functional. How about to just forget Java (as in language) and go for Scala? (Or Fortress.)

    ReplyDelete
  9. Stephen Colebourne21 January 2007 16:02

    @Janne, findAll and eachNotNull are regular Java methods that take a closure as their final argument. They would typically be pulled in via a static import. So, the closure itself is anonymous, but the method which controls the closure and defines when and how many times the block is called is a real Java method, such as findAll.

    ReplyDelete
  10. @Stephen: regarding the "horrible way": if you'd lack in for-each loops, the solution using closures and invocation syntax would be somewhat like follows. Might look familiar.

    List[Booking] matches = new ArrayList[Booking];
    eachNotNull (Booking booking : bookingList) {
    ´ eachNotNull (Purchase purchase : booking.getPurchaseList()) {
    ´ ´ if (purchase.matchesType(typeToFind)) {
    ´ ´ ´ matches.add(booking);
    ´ ´ ´ break;
    ´ ´ }
    ´ }
    }

    Having:
    void eachNotNull (List list, {T => } block) {
    ´ for (Iterator iter = list.iterator(); iter.hasNext();) {
    ´ ´ T t = iter.next();
    ´ ´ if (t != null) block.invoke(t);
    ´ }
    }

    ReplyDelete
  11. Stephen Colebourne22 January 2007 01:05

    @Neal, The point of this blog post is that last-line-no-semicolon is a very poor way to pass a result back to the closure-control-method. I used an example to show this - clearly you disagree with the example. The key point however, is that the spec allows multiline closure-blocks, thus there can be if statements and loops, where you may want to return a result early.

    You point us at the tenants post where you discuss this. There, you dismiss the usefulness of returning early in the last sentence with a regret. I'm arguing that if we don't have it then we'll end up with some dog-ugly workarounds using local variables. Remember that developers already handle break and continue changing meaning when an additional loop level is added.

    More interesting however is that your reply indicates that you may have a different vision for the use of closures - functional - to that I've been expecting. And this probably also explains the last-line-no-semicolon disagreement.

    Put simply, personally I can't imagine wanting to write code in the functional style you gave in your example. Why? Because it doesn't fit with the way I think about the problem as a Java developer. Even after five readings I'm still struggling to follow how your example works. I have to mentally work it out each time, and that means its completely failed my code clarity and readability tests. Its just not Java.

    Now you could just dismiss me as another stupid blogger who knows Java but doesn't know FP. But I believe that non-FP Java developers will in fact be the main group of people who you need to convince that closures in Java is a good idea. That mass of developers are going to pickup Java 7 and try and use closures as I did, not in the FP style you appear to be hoping for.

    Finally you asked, why am I refactoring here. Well, I was aiming to clarify the intent of the code. I see this as no different to using the foreach loop rather than the old for loop to increase intent and remove extraneous detail.

    In summary, I like the idea of closures and know that there are lots of places I could use them to remove duplicated code and increase intent. The trouble is that I'm finding the details of BGGA more problematic than I hoped.

    ReplyDelete
  12. Re: "The point of this blog post is that last-line-no-semicolon is a very poor way to pass a result back to the closure-control-method."

    Agreed! The control invocation syntax is for controlling statements, not expressions.

    ReplyDelete
  13. Stephen, you've completely lost me with your last comment. If not for FP, why would you want closures at all? Personally, having been contaminated with Ruby but being stuck in Java 1.4, I find myself using commons-collections utility methods to solve problems like yours. That would be:

    List matches = CollectionUtils.select(bookingList, new Predicate() {
    ´ ´ public boolean evaluate(Object o) {
    ´ ´ ´ ´ Booking b = (Booking) o;
    ´ ´ ´ ´ return (null != CollectionUtils.find(b.getPurchaseList(), new Predicate() {
    ´ ´ ´ ´ ´ ´ public boolean evaluate(Object o) {
    ´ ´ ´ ´ ´ ´ ´ ´ Purchase purchase = (Purchase) o;
    ´ ´ ´ ´ ´ ´ ´ ´ return purchase.matchesType(typeToFind);
    ´ ´ ´ ´ ´ ´ }
    ´ ´ ´ ´ });
    ´ ´ }
    });

    Anything to avoid another Iterator! Why? Because I don't want to be bothered specifying *how* to get those matching bookings out of the list; I only want to code *which* bookings I want. So I can't wait for closures, even with the BGGA syntax.

    You wrote "Even after five readings I'm still struggling to follow how your example works." How about if we patch up Neal's indentation (and fix a tiny syntax error)?

    List matches = filter(bookingList, {Booking b =>
    ´ ´ any(b.getPurchaseList(), {Purchase purchase =>
    ´ ´ ´ ´ purchase.matchesType(typeToFind)
    ´ ´ })
    });

    I do agree with you about those missing semicolons (lines 3 and 4); it just feels like something is missing. But I don't see how this is hard to understand, especially since you claim to have worked with Ruby (which would make for the most elegant code (but that's probably a matter of taste):)

    matches = booking_list.select { |b|
    ´ b.purchase_list.find { |purchase|
    ´ ´ purchase.matches_type(type_to_find)
    ´ }
    }

    ReplyDelete
  14. Stephen Colebourne22 January 2007 14:18

    @Danny, My argument is that the FP style probably isn't the way that most Java developers will end up using closures. It feels alien. What I'm asking for is that closures doesn't force FP when there is an alternative style more familiar to Java.

    For example, your first example using inner classes and commons-collections to emulate closures is also pretty alien to Java IMO.

    Oh, and I never claimed I've used Ruby - I just try to follow the bigger picture, including Ruby and Groovy, and apply those concepts I think match to Java.

    ReplyDelete
  15. The funcional style feels alien, but using break and continue doesn't?

    I see merit on the conservative approach to getter/setter handling and to the null-invocation-bypass (what's the name again?).

    But using nested for loops, break and continue to exemplify edge cases is going too far.

    Man, adding closures to Java can be amazing. It's also a big responsibility. I hope you remember to punish bad habits and reward good habits whenever you get the chance, so we all benefit.

    In other words, crappy code changed to use closures should get even crappier.

    ReplyDelete