Stephen Colebourne's blog: March 2007

Saturday, 31 March 2007

Closures - Outside Java

Trying to agree semantics for closures in Java isn't easy. Maybe we should look at what some other Java-like languages have done - Nice, Groovy and Scala.

The first thing to notice is that they all have closures. Java really is the odd one out now. The devil is in the detail though. Specifically, what do these languages do with the interaction between continue/break/return and the closure?

Nice

  // syntax for the higher order function
  String find(String -> boolean predicate) {
    ...
    boolean result = predicate(str);
    ...
  }

  // syntax 1 to call the higher order function
  String result = strings.find(String str => str.length() == 2);

  // syntax 2 to call the higher order function
  String result = strings.find(String str => {
    // other code
    return str.length() == 2;
  });

Nice has two variants of calling the closure. Syntax 1, where there is just a single expression, has no braces and no return keyword. Syntax 2 has a block of code in braces, and uses the return keyword to return a value back to the higher order function.

There is a control abstraction syntax variant, where again the return keyword returns back to the higher order function. Similarly, the continue and break keywords may not cross the boundary between the closure and the main application.

Groovy

  // syntax for the higher order function
  def String find(Closure predicate) {
    ...
    boolean result = predicate(str)
    ...
  }

  // syntax 1 to call the higher order function
  String result = strings.find({String str -> str.length() == 2})

  // syntax 2 to call the higher order function
  String result = strings.find({String str -> 
    // other code
    return str.length() == 2;
  })

Groovy has many possible variants of calling the closure (with optional types). Semicolons at the end of lines are always optional, as is specifying the return keyword. So syntax 1 and 2 are naturally the same in Groovy anyway. However, the example does show that if you do use the return keyword, then it returns from the closure to the higher order function.

Scala

  // syntax for the higher order function
  def find(predicate : (String) => Boolean) : String = {
    ...
    val result : Boolean = predicate(str)
    ...
  }

  // syntax 1 to call the higher order function
  val result : String = strings.find({(str : String) => str.length() == 2});

  // syntax 2 to call the higher order function
  val result : String = strings.find({(str : String) =>
    // other code
    str.length() == 2;
  });

In Scala, like Groovy, semicolons at the end of lines are always optional. However, the key difference with Groovy is that the return keyword is always linked to the nearest enclosing method, not the closure. If that method is no longer running when the closure is invoked, then a NonLocalReturnException may be thrown.

There is a control abstraction syntax variant, where again the return keyword will return from the enclosing method. Scala does not support the continue and break keywords.

BGGA and FCM

Hopefully, the above is clear enough to show that Nice and Groovy operate in a similar manner (return always returns to the higher order function), whereas Scala is different (return will always return from the enclosing method, but you may get an exception). This is especially noticeable when Nice/Groovy is used in a control abstraction syntax manner, because then the return keyword looks odd as it returns from an 'invisible' closure (the call often just looks like a keyword).

So how do the Java proposals compare?

BGGA follows the Scala model. If you write a return within the closure block it will return from the enclosing method. And you might get a NonLocalReturnException if that enclosing method has completed processing. BGGA also handles continue and break similarly.

FCM follows the Nice/Groovy model. If you write a return within the closure block it will return to the higher order function method, and there are no possible exceptions. There is no way to return from the enclosing method. Similarly, FCM prevents continue and break from escaping the boundary of the closure block.

Which model is right? Well, both have points in their favour and points against.

Scala/BGGA is more powerful as it allows return from the enclosing method, but it comes with the price of a weird exception - NonLocalReturnException. FCM is thus slightly less powerful, but has no risk of undesirable exceptions.
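
To make the trade-off concrete, here is a rough sketch in plain Java that emulates both models with an anonymous inner class. It is only an illustration under my own assumptions, not how Scala or BGGA are specified: the Predicate interface, the NonLocalReturn exception and all the method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class ReturnModels {
    /** Hypothetical callback type standing in for a closure. */
    interface Predicate {
        boolean test(String str);
    }

    /** Hypothetical exception used to unwind back to the enclosing method. */
    static class NonLocalReturn extends RuntimeException {
        final Object value;
        NonLocalReturn(Object value) {
            this.value = value;
        }
    }

    /** The higher order function: returns the first string the predicate accepts. */
    static String find(List<String> strings, Predicate predicate) {
        for (String str : strings) {
            if (predicate.test(str)) {
                return str;
            }
        }
        return null;
    }

    /** Nice/Groovy/FCM model: return only leaves the callback. */
    static String localReturn(List<String> strings) {
        String found = find(strings, new Predicate() {
            public boolean test(String str) {
                return str.length() == 2;  // returns to find(), nothing more
            }
        });
        return "found " + found;
    }

    /** Scala/BGGA model, emulated: the "return" leaves the enclosing method. */
    static String nonLocalReturn(List<String> strings) {
        try {
            find(strings, new Predicate() {
                public boolean test(String str) {
                    if (str.length() == 2) {
                        throw new NonLocalReturn("early " + str);  // unwinds through find()
                    }
                    return false;
                }
            });
            return "nothing";
        } catch (NonLocalReturn ex) {
            return (String) ex.value;
        }
    }

    /** Hands the callback out, so its "return" has no enclosing method left to return from. */
    static Predicate leak() {
        return new Predicate() {
            public boolean test(String str) {
                throw new NonLocalReturn(str);  // leak() has already completed
            }
        };
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("a", "bb", "ccc");
        System.out.println(localReturn(strings));     // found bb
        System.out.println(nonLocalReturn(strings));  // early bb
        try {
            leak().test("bb");
        } catch (NonLocalReturn ex) {
            System.out.println("escaped: " + ex.value);  // escaped: bb
        }
    }
}
```

The last call in main is the awkward case: the callback outlives the method that created it, so the exception surfaces in whatever code happens to invoke it.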

Personally I can see why Scala's choice makes sense in Scala - because the language generally omits semicolons and return statements. But I'm not expecting us to remove semicolons from Java any time soon, or make return optional on all methods, so Scala's choices within a Java context seem dubious.

Feedback

Feedback welcomed of course, including examples from other languages (C#, Smalltalk, Ruby, ...).

And what about adding a new keyword to FCM to allow returning from the enclosing method? Such as "break return"? It would expose FCM to NonLocalReturnException though...

Sunday, 18 March 2007

First-Class Methods: Java-style closures - v0.4

Stefan and I are pleased to announce the release of v0.4 of the First-class Methods: Java-style closures proposal.

Changes

Since v0.3, we have tried to incorporate some of the feedback received on the various forums. The main changes are as follows:

1) Constructor and Field literals. It is now possible to create type-safe, compile-time checked instances of java.lang.reflect.Constructor and Field using FCM syntax:

  // method literal:
  Method m = Integer#valueOf(int);

  // constructor literal:
  Constructor<Integer> c = Integer#(int);

  // field literal:
  Field f = Integer#MAX_VALUE;
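
For comparison, this is roughly what the same three lookups look like with today's reflection API, where names are strings and mistakes only show up at runtime. The class ReflectionLookups and its method names are just for illustration.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class ReflectionLookups {
    /** Reflection equivalent of the method literal Integer#valueOf(int). */
    static Method valueOfMethod() throws NoSuchMethodException {
        return Integer.class.getMethod("valueOf", int.class);
    }

    /** Reflection equivalent of the constructor literal Integer#(int). */
    static Constructor<Integer> intConstructor() throws NoSuchMethodException {
        return Integer.class.getConstructor(int.class);
    }

    /** Reflection equivalent of the field literal Integer#MAX_VALUE. */
    static Field maxValueField() throws NoSuchFieldException {
        return Integer.class.getField("MAX_VALUE");
    }

    /** A misspelt name compiles fine and is only detected here, at runtime. */
    static boolean hasIntMethod(String name) {
        try {
            Integer.class.getMethod(name, int.class);
            return true;
        } catch (NoSuchMethodException ex) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(valueOfMethod().invoke(null, 42));  // 42
        System.out.println(maxValueField().getInt(null));      // 2147483647
        System.out.println(hasIntMethod("valueof"));           // false
    }
}
```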

2) Extended method references. Invocable method references have been renamed to method references. There are now four types, static, instance, bound and constructor:

  // static method reference:
  #(Integer(int)) ref = Integer#valueOf(int);

  // constructor reference:
  #(Integer(int)) ref = Integer#(int);

  // bound method reference:
  Integer i = ...
  #(int()) ref = i#intValue();

  // instance method reference:
  #(int(Integer)) ref = Integer#intValue();

3) Defined mechanism for accessing local variables. We have chosen a simple mechanism - copy-on-construction. This mechanism copies the values of any local variables accessed by the inner method to the compiler-generated inner class at the point of instantiation (via the constructor).

The effect is that changes made to the local variable are not visible to the inner method. It also means that you do not have write access to the local variables from within the inner method. The benefits are simplicity and thread-safety:

  int val = 6;
  StringBuffer buf = new StringBuffer();
  #(void()) im = #{
    int calc = val;    // OK, val always equals 6
    val = 7;           // compile error!
    buf.append("Hi");  // OK
    buf = null;        // compile error!
  };
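
As a rough idea of what copy-on-construction means in today's Java, the inner method above might be compiled to something like the following. This is a hand-written sketch, not the actual generated code: the class GeneratedInnerMethod is hypothetical, Runnable stands in for the #(void()) method type, and the body is simplified to the parts that compile.

```java
class CopyOnConstruction {
    /** Hypothetical shape of the compiler-generated class for the inner method. */
    static final class GeneratedInnerMethod implements Runnable {
        private final int val;           // copy of the local variable's value
        private final StringBuffer buf;  // copy of the reference, not of the buffer

        GeneratedInnerMethod(int val, StringBuffer buf) {
            this.val = val;
            this.buf = buf;
        }

        public void run() {
            int calc = val;                 // always the value at construction time
            buf.append("Hi").append(calc);  // mutating the shared object is fine
        }
    }

    static String demo() {
        int val = 6;
        StringBuffer buf = new StringBuffer();
        Runnable im = new GeneratedInnerMethod(val, buf);  // values copied here
        val = 7;                                           // not seen by the inner method
        im.run();
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());  // Hi6
    }
}
```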

4) Stub methods are only generated for methods of return type void. This addresses a concern that it was too easy to create invalid implementations using named inner methods.

5) Added method compounds. This addresses the inability to override more than one method at a time. Unlike in an inner class, the inner methods in a method compound remain independent of one another. Thus, in this example, the mouseClicked method cannot call the mouseExited method.

  MouseListener lnr = #[
    #mouseClicked(MouseEvent ev) {
      // handle mouse click
    }
    #mouseExited(MouseEvent ev) {
      // handle mouse exit
    }
  ];

6) Added more detail on implementation throughout the document, with specific examples.

We believe that we have addressed the key concerns of the community with this version of FCM. But if you've any more feedback, please let us know!

Monday, 12 March 2007

Configuration in Java - It sure beats XML!

Is the Java community slowly remembering what's good about Java? I'm talking about static typing.

For a long time, Java developers have been forced by standards committees and framework writers to write (or generate) reams of XML. The argument for this was flexibility, simplicity, standards.

The reality is that most Java developers will tell you of XML hell. These are files where refactoring doesn't work. Files where auto-complete doesn't work. Files where errors occur at runtime instead of compile-time. Files that get so unwieldy that we ended up writing tools like XDoclet to generate them. Files whose contents are not type-safe.

In other words, everyone forgot that Java is a statically-typed language, and that is one of its key strengths.

Well, it seems to me that things are slowly changing. The change started with annotations, which are slowly feeding through to new ideas and new frameworks. What prompted me to write this blog was the release of Guice. This is a new dependency injection framework based around annotations and configuration using Java code. For example:

  bind(Service.class)
    .annotatedWith(Blue.class)
    .to(BlueService.class);

I've yet to use it, but I already like it ;-)

So, Guice allows us to replace reams of XML configuration in a tool like Spring with something a lot more compact, that can be refactored, that's type-safe, that supports auto-complete and that's compile-time checked.

But let's not stop with module definitions. Many applications have hundreds or thousands of pieces of configuration stored in XML, text or properties files as arbitrary key-value pairs. Each has to be read in, parsed and validated before it can be used. And if you don't validate, then a bad value only shows up as an exception at runtime.

My question is why don't we define this configuration in Java?

import java.util.Arrays;
import java.util.List;

public class Config {
  // helper to build a list-valued setting
  public static <T> List<T> list(T... items) {
    return Arrays.asList(items);
  }
}
public class LogonConfig extends Config {
  public int userNameMinLength;
  public int userNameMaxLength;
  public int passwordMinLength;
  public int passwordMaxLength;
  public List<String> invalidPasswords;
}
public class LogonConfigInit {
  public void init(LogonConfig config) {
    config.userNameMinLength = 6;
    config.userNameMaxLength = 20;
    config.passwordMinLength = 6;
    config.passwordMaxLength = 20;
    // qualified, as LogonConfigInit does not extend Config
    config.invalidPasswords = Config.list("password","mypass");
  }
}

So, we've got a common superclass, Config, with useful helper methods. A POJO, LogonConfig, without getters and setters (why are they needed here?). And the real class, LogonConfigInit, where the work is actually done. (By the way, please don't get hung up on the details of this example. It's late, and I can't think of anything better right now.)

So, is the Java class LogonConfigInit really any more complicated than an XML file? I don't think so, yet it has so many benefits in terms of speed, refactoring, validation, ... Heck we could even step through it with a debugger!

What's more, it's easy to have variations, or overrides. For example, if the XxxInit file was for database config, you could imagine decorating it with setups by machine, one for production, one for QA, etc.
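
One way such a variation might look is a subclass that applies the standard values and then overrides a few. This is only a sketch: QaLogonConfigInit and the load() helper are made up for illustration, and the config is cut down to three fields.

```java
import java.util.Arrays;
import java.util.List;

class ConfigOverrides {
    static class LogonConfig {
        public int userNameMinLength;
        public int passwordMinLength;
        public List<String> invalidPasswords;
    }

    /** The standard (production) settings. */
    static class LogonConfigInit {
        public void init(LogonConfig config) {
            config.userNameMinLength = 6;
            config.passwordMinLength = 6;
            config.invalidPasswords = Arrays.asList("password", "mypass");
        }
    }

    /** Hypothetical QA variation: start from the standard values, then override. */
    static class QaLogonConfigInit extends LogonConfigInit {
        @Override
        public void init(LogonConfig config) {
            super.init(config);
            config.passwordMinLength = 1;  // short passwords for test accounts
        }
    }

    /** Hypothetical helper that runs an initializer against a fresh config. */
    static LogonConfig load(LogonConfigInit init) {
        LogonConfig config = new LogonConfig();
        init.init(config);
        return config;
    }

    public static void main(String[] args) {
        System.out.println(load(new LogonConfigInit()).passwordMinLength);    // 6
        System.out.println(load(new QaLogonConfigInit()).passwordMinLength);  // 1
    }
}
```

The override is ordinary Java, so the compiler checks that passwordMinLength exists and is an int, and a rename refactoring updates both classes.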

So, what about reloading changes to configuration at runtime? That's why we use text/xml files, isn't it? Well once again, that needn't be true. Another of Java's key strengths is its ability to dynamically load classes, and to control the classloader.

All we need is a module that handles the 'reload configuration' button, invokes javac via the Java 6 API, compiles the new version of LogonConfigInit and runs it to load the new config. Apart from the plumbing, why is this fundamentally any different to reloading a text/xml file?
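
The plumbing might look something like this sketch, which uses the javax.tools compiler API added in Java 6 and a fresh URLClassLoader per reload. It assumes a JDK at runtime (on a plain JRE there is no system compiler). The names ConfigReloader and ReloadableLogonConfig are hypothetical, and the config class is reduced to one field read back by reflection to keep things short.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.Writer;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class ConfigReloader {
    // Hypothetical name of the configuration class being recompiled
    static final String CLASS_NAME = "ReloadableLogonConfig";

    /** Writes the source into dir, compiles it there and loads it with a fresh class loader. */
    static Class<?> compileAndLoad(File dir, String source) throws Exception {
        File file = new File(dir, CLASS_NAME + ".java");
        Writer out = new FileWriter(file);
        try {
            out.write(source);
        } finally {
            out.close();
        }
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler (running on a JRE?)");
        }
        // null streams mean System.in, System.out and System.err
        int status = compiler.run(null, null, null, "-d", dir.getPath(), file.getPath());
        if (status != 0) {
            throw new IllegalStateException("Compilation failed for " + file);
        }
        // A new loader per reload, so the newly compiled class replaces the old one
        URLClassLoader loader = new URLClassLoader(new URL[] {dir.toURI().toURL()});
        return loader.loadClass(CLASS_NAME);
    }

    /** Simulates editing the config and pressing 'reload', returning the value read back. */
    static int reloadPasswordMinLength(File dir, int newValue) throws Exception {
        String source = "public class " + CLASS_NAME + " {\n"
            + "  public int passwordMinLength = " + newValue + ";\n"
            + "}\n";
        Class<?> type = compileAndLoad(dir, source);
        Object config = type.getDeclaredConstructor().newInstance();
        return type.getField("passwordMinLength").getInt(config);
    }

    public static void main(String[] args) throws Exception {
        File dir = Files.createTempDirectory("config").toFile();
        System.out.println(reloadPasswordMinLength(dir, 6));
        System.out.println(reloadPasswordMinLength(dir, 8));
    }
}
```

A real framework would load the config through a shared interface rather than reflection, and would report compile errors to whoever pressed the button instead of throwing.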

Well, there's my challenge to the Java community - how about a rock solid configuration framework with full class compilation and reloading and not a drop of XML in sight. Just compile-time checked, statically-typed, 100% pure Java. Who knows, maybe it already exists!

As always, your opinions on the issues raised are most welcome!

Monday, 5 March 2007

Comparing closures (2 more examples) - CICE, BGGA and FCM

As my last post comparing the three closure proposals - FCM, CICE and BGGA - seemed to be useful, I thought I'd post another. Again, I'll try not to be biased!

Example 3: Sorting

The Java Collections Framework has a callback to support sorting using the Comparator interface. This is used via the two-argument sort method on Collections, taking the List and the Comparator. This example will sort a list of strings by length, placing the "selected" string first.

The following example is from Java 6. This uses an inner class, which can access the parameter 'selected' so long as it is final:

  public void sort(List<String> list, final String selected) {
    Collections.sort(list, new Comparator<String>() {
      public int compare(String str1, String str2) {
        if (str1.equals(selected)) {
          return -1;
        } else if (str2.equals(selected)) {
          return 1;
        } else {
          return str1.length() - str2.length();
        }
      }
    });
  }

The following example is from the CICE proposal. This is shorthand for an inner class, so the parameter 'selected' must be final:

  public void sort(List<String> list, final String selected) {
    Collections.sort(list, Comparator<String>(String str1, String str2) {
      if (str1.equals(selected)) {
        return -1;
      } else if (str2.equals(selected)) {
        return 1;
      } else {
        return str1.length() - str2.length();
      }
    });
  }

The following example is from the BGGA proposal, and uses the standard-invocation syntax and closure conversion. The closure can access the parameter 'selected' without it needing to be final. The closure can only return a value from its last line, and does so by not specifying the final semicolon, hence the need for the result local variable:

  public void sort(List<String> list, String selected) {
    Collections.sort(list, {String str1, String str2 =>
      int result = 0;
      if (str1.equals(selected)) {
        result = -1;
      } else if (str2.equals(selected)) {
        result = 1;
      } else {
        result = str1.length() - str2.length();
      }
      result
    });
  }

The BGGA proposal control-invocation syntax may not be used for this example. This is because the control-invocation syntax is not permitted to yield a result as required here.

The following example is from the FCM proposal, and uses an inner method. The inner method has access to the parameter 'selected' without it needing to be final:

  public void sort(List<String> list, String selected) {
    Collections.sort(list, #(String str1, String str2) {
      if (str1.equals(selected)) {
        return -1;
      } else if (str2.equals(selected)) {
        return 1;
      } else {
        return str1.length() - str2.length();
      }
    });
  }

The example cannot be directly translated to an FCM invocable method reference. This is because the functionality being performed relies on the parameter 'selected', which is not being passed into the compare method on the Comparator. If the 'selected' variable were an instance variable then an invocable method reference could be used.

Example 4: Looping over a map

This example aims to print each item in a map.

The following example is from Java 6, and uses an enhanced foreach loop:

  Map<String, Integer> map = ...
  for (Map.Entry<String, Integer> entry : map.entrySet()) {
    String key = entry.getKey();
    Integer value = entry.getValue();
    System.out.println(key + "=" + value);
  }

The CICE proposal contains no new syntax to address this issue.

The following example is from the BGGA proposal, and uses the control-invocation syntax.

  Map<String, Integer> map = ...
  for eachEntry(String key, Integer value : map) {
    System.out.println(key + "=" + value);
  }
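
Behind a control invocation like this there has to be an ordinary method that receives the loop body as a callback. A minimal sketch of such a method in today's Java, with the body passed as an inner class, might look as follows. The EntryBlock interface and the render() helper are my own invention for illustration, not part of BGGA.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EachEntry {
    /** Hypothetical callback type taking a key and a value. */
    interface EntryBlock<K, V> {
        void invoke(K key, V value);
    }

    /** Hypothetical library method that a "for eachEntry" style call would invoke. */
    static <K, V> void eachEntry(Map<K, V> map, EntryBlock<K, V> block) {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            block.invoke(entry.getKey(), entry.getValue());
        }
    }

    /** Calls eachEntry the long-hand way, collecting the output in a string. */
    static String render(Map<String, Integer> map) {
        final StringBuilder out = new StringBuilder();
        eachEntry(map, new EntryBlock<String, Integer>() {
            public void invoke(String key, Integer value) {
                out.append(key).append('=').append(value).append('\n');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new LinkedHashMap<String, Integer>();
        map.put("a", 1);
        map.put("b", 2);
        System.out.print(render(map));  // a=1 then b=2, one per line
    }
}
```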

The FCM proposal contains no new syntax to address this issue. However the authors hope to publish a separate proposal to cover this use case at some point.

Summary

This blog contains two more comparisons of CICE, BGGA and FCM with examples. For the previous two, see the previous post.

Thursday, 1 March 2007

Comparing closures - CICE, BGGA and FCM

I've been asked in a comment to compare the new FCM closure proposal from Stefan and myself with the other two proposals, CICE and BGGA. Obviously, this is difficult without being too biased, but I'll do my best!

It's all about this

The key difference, when considering closures, between CICE and BGGA or FCM is the handling of this.

CICE is just a simplified syntax for creating an inner class. As such, the meaning of this is the same as in an inner class. As a reminder, in an inner class, this refers to the inner class instance, but there is implicit support for calling methods from the outer class. If you need to refer to this of the outer class you have to use the OuterClass.this syntax.
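
As a quick refresher on those inner class rules, here is a small example in today's Java. The Reporter interface and the class name are invented for the illustration.

```java
class ThisInInnerClass {
    String name() {
        return "outer";
    }

    /** Hypothetical callback type with a method that clashes with the outer class. */
    interface Reporter {
        String name();
        String report();
    }

    Reporter create() {
        return new Reporter() {
            public String name() {
                return "inner";
            }
            public String report() {
                // this is the inner class instance; the outer one must be qualified
                return this.name() + "/" + ThisInInnerClass.this.name();
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(new ThisInInnerClass().create().report());  // inner/outer
    }
}
```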

BGGA closures and FCM inner methods make this refer directly to the nearest surrounding class in the source code (the outer class). This results in much simpler code within the closure/inner method.

Ignoring this, the closure part of the three proposals varies principally in the detail of syntax. That's why you have to focus on the semantics, not the syntax, when comparing them.

In addition, BGGA offers the control-invocation syntax for closures, while FCM offers method literals and invocable method references.

Example 1: Adding an ActionListener to a Swing button

This is a fairly common example when using Swing. The standard solution today is to use an inner class.

The following example is from Java 6. The method within the inner class can access handleButtonPress() via the special inner class rules:

  public void init(final JButton button) {
    button.addActionListener(new ActionListener() {
      public void actionPerformed(ActionEvent ev) {
        handleButtonPress(ev);
      }
    });
  }
  public void handleButtonPress(ActionEvent ev) {
    // actually handle the button press
  }

The following example is from the CICE proposal. This is shorthand for an inner class, so the method within the inner class can still access handleButtonPress() via the special inner class rules:

  public void init(JButton button) {
    button.addActionListener(ActionListener(ActionEvent ev) {
      handleButtonPress(ev);
    });
  }

The following example is from the BGGA proposal, and uses the standard-invocation syntax and closure conversion. The closure has full access to this (the instance on which init() was called) and hence handleButtonPress():

  public void init(JButton button) {
    button.addActionListener({ActionEvent ev =>
      handleButtonPress(ev);
    });
  }

The following example is also from the BGGA proposal, but uses the control-invocation syntax. I believe that the authors of BGGA would not recommend that this syntax should be used for an ActionListener:

  public void init(JButton button) {
    button.addActionListener(ActionEvent ev : ) {
      handleButtonPress(ev);
    }
  }

The following example is from the FCM proposal, and uses an inner method. The inner method has full access to this (the instance on which init() was called) and hence handleButtonPress():

  public void init(JButton button) {
    button.addActionListener(#(ActionEvent ev) {
      handleButtonPress(ev);
    });
  }

The following example is also from the FCM proposal, and uses an invocable method reference. This syntax merely references handleButtonPress() and FCM creates the bridging ActionListener. Note however, that the signature of handleButtonPress() must match that of the listener.

  public void init(JButton button) {
    button.addActionListener(this#handleButtonPress(ActionEvent));
  }
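
To give a feel for the bridging, the generated ActionListener could be imagined as something like the class below. This is a hand-written guess in today's Java, not what the proposal specifies: HandleButtonPressBridge, listener() and the press counter are hypothetical.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;

class MethodReferenceBridge {
    private int presses;

    /** Same signature as ActionListener.actionPerformed, as the method reference requires. */
    public void handleButtonPress(ActionEvent ev) {
        presses++;  // stands in for actually handling the button press
    }

    public int presses() {
        return presses;
    }

    /** Hypothetical stand-in for the bridge behind this#handleButtonPress(ActionEvent). */
    static final class HandleButtonPressBridge implements ActionListener {
        private final MethodReferenceBridge target;

        HandleButtonPressBridge(MethodReferenceBridge target) {
            this.target = target;
        }

        public void actionPerformed(ActionEvent ev) {
            target.handleButtonPress(ev);  // simply forwards to the referenced method
        }
    }

    /** What the method reference expression would evaluate to in this sketch. */
    ActionListener listener() {
        return new HandleButtonPressBridge(this);
    }
}
```

The bridge only forwards the call, which is why the parameter and return types of the referenced method have to line up with the listener method.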

Example 2: A closure that multiplies a number

This example is using closures in the style of functional programming. This is rarely used in Java today, but can be simulated to a degree using an inner class.

The following example is from Java 6, and uses an inner class. Note how the multiplier variable must be declared final:

  final int multiplier = 3;
  IntMultiplier mult = new IntMultiplier() {
    public int multiply(int value) {
      return value * multiplier;
    }
  };
  int result = mult.multiply(6);  // result is 18 (ie. 3 * 6)
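
The IntMultiplier type isn't defined above, so here is a complete version that can be run as-is. The single-method interface and the times() factory are my assumptions about what the snippet implies.

```java
class MultiplierExample {
    /** Assumed shape of the IntMultiplier interface used in the snippet. */
    interface IntMultiplier {
        int multiply(int value);
    }

    /** Returns a multiplier that captures the (final) parameter. */
    static IntMultiplier times(final int multiplier) {
        return new IntMultiplier() {
            public int multiply(int value) {
                return value * multiplier;  // uses the captured value
            }
        };
    }

    public static void main(String[] args) {
        IntMultiplier mult = times(3);
        System.out.println(mult.multiply(6));  // 18
    }
}
```

Note that the captured multiplier is still usable after times() has returned, which is the closure-like part of the inner class.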

The following example is from the CICE proposal. The only change is that the variable is now declared to be public (it could also be declared as final, or have no modifier which also means final):

  public int multiplier = 3;
  IntMultiplier mult = new IntMultiplier() {
    public int multiply(int value) {
      return value * multiplier;
    }
  };
  int result = mult.multiply(6);

The following example is from the BGGA proposal, and uses a closure. The {int => int} is a function-type that defines the parameters and return from the closure. The closure definition names the parameter and uses the value of multiplier from its environment. The result of the closure is returned by omitting the semicolon from the last line of the closure. The closure is called using the implied invoke() method:

  int multiplier = 3;
  {int => int} mult = {int value =>
    value * multiplier
  };
  int result = mult.invoke(6);

The following example is from the FCM proposal, and uses an inner method. The #(int(int)) is a method-type that defines the parameters and return from the inner method. The method definition names the parameter and uses the value of multiplier from its environment. The result of the method is returned using the return keyword. The inner method is called using the implied invoke() method:

  int multiplier = 3;
  #(int(int)) mult = #(int value) {
    return value * multiplier;
  };
  int result = mult.invoke(6);

Summary

I've presented a quick comparison of CICE, BGGA and FCM with examples. Let me know if it was useful, and if you'd like more comparisons.