Stephen Colebourne's blog: 2008

Sunday, 21 December 2008

JDK 7 language changes - JavaEdge votes!

For the past few days I have been in the fabulous country of Israel at the JavaEdge conference. During the keynote speech, I presented 10 possible language changes for JDK 7, and asked the audience to vote on them.

These results are in more depth than the Devoxx figures simply because we could extract information about each person's preferences. For academics or statisticians the raw data is available in Open Office format.

Given ten language changes, rank them according to priority

So, the question is the same basic question as we asked at Devoxx, but with a different selection of changes to pick from. The results show that 171 people voted correctly (starting from 1 for the first preference, 2 for the second and so on). Another 41 attendees voted mostly correctly, skipping one of the preferences (for example marking a 1st, 2nd, 3rd and 5th preference, but not a 4th). A final 103 attendees didn't follow the instructions at all, and marked multiple 1st preferences, multiple 2nd preferences and so on.
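The three groups can be told apart mechanically. Here is a rough Java sketch of how that might be done, assuming each ballot is held as an array of ranks with 0 meaning "left blank". The class name, the array layout and the `OTHER` catch-all are my own assumptions for illustration, not how the raw data is actually stored. The `closeGap` method applies the adjustment used for the near-valid group, moving everything beyond the missing preference up by one.

```java
import java.util.Arrays;

class BallotCheck {
    enum Kind { VALID, NEAR_VALID, OTHER }

    // ranks[i] is the preference marked against feature i; 0 means "left blank" (assumed layout).
    static Kind classify(int[] ranks) {
        int[] marked = Arrays.stream(ranks).filter(r -> r > 0).sorted().toArray();
        if (marked.length == 0) {
            return Kind.OTHER;
        }
        for (int i = 1; i < marked.length; i++) {
            if (marked[i] == marked[i - 1]) {
                return Kind.OTHER;          // the same preference was used twice
            }
        }
        int highest = marked[marked.length - 1];
        if (highest == marked.length) {
            return Kind.VALID;              // exactly 1..n with no gaps
        }
        if (highest == marked.length + 1) {
            return Kind.NEAR_VALID;         // exactly one preference was skipped
        }
        return Kind.OTHER;
    }

    // For a NEAR_VALID ballot, move everything beyond the missing preference up by one.
    static int[] closeGap(int[] ranks) {
        int[] result = ranks.clone();
        int highest = Arrays.stream(ranks).max().orElse(0);
        for (int wanted = 1; wanted < highest; wanted++) {
            final int w = wanted;
            if (Arrays.stream(ranks).noneMatch(r -> r == w)) {
                for (int i = 0; i < result.length; i++) {
                    if (result[i] > w) {
                        result[i]--;
                    }
                }
                break;                      // only one gap is expected
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] ballot = {1, 2, 3, 5, 0, 0, 0, 0, 0, 0};
        System.out.println(classify(ballot));                   // NEAR_VALID
        System.out.println(Arrays.toString(closeGap(ballot)));
    }
}
```

Anything with duplicate preferences, or with more than one gap, lands in `OTHER` here, which is a simplification on my part.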

The features voted on were:

For-each loop over a Map
Control over for-each loops (index/remove)
List/Map access using []
Infer generics on the RHS
Multi-catch of exceptions
String switch
String interpolation
Multi line strings
Resource management
Null handling

These results are for those who completed the voting according to the rules (171 attendees).

Here are the first preference votes:

Adding the second preference votes:

Adding the third preference votes:

Adding the fourth preference votes:

Adding the fifth preference votes:

Adding the remaining votes (white is didn't vote, black is 'this is a bad idea'):

Now those that almost voted according to the rules - votes are adjusted to move everything beyond the missing column up (41 attendees).

All votes:

Now those that voted based on general preference (103 attendees).

All votes:

So, what do these results tell us? (I'm focussing the analysis on those who voted fully according to the rules, as it is hard to interpret the results from those who simply gave arbitrary rankings).

As with the Devoxx results, there is a clear winner - null handling. Null handling had 50 first preference votes, double that of second place string switch and almost a third of all first preferences. This trend continued with almost two thirds placing it in their top four.

Other popular options were String Switch, Multi-catch of exceptions, enhanced for-each loop for Maps, enhancing the for-each loop to be able to remove or find the index and ARM-style resource management.

Unpopular options (especially considering the black 'bad idea' values) were List/Map access using [] and String interpolation (${variable} in strings).

Infer generics and multi-line strings were deemed of relatively low importance but not particularly objectionable.

The other 144 votes that were not completely according to the rules are harder to interpret. If someone wants to give it a go, feel free to interpret the raw data.

Summary

Firstly, a big thank you to the JavaEdge voters, and to the AlphaCSP team in Israel. Again, we have some really interesting figures from this vote, with null handling again coming out top. Clearly, there needs to be some real thought about whether some form of null handling can be achieved.

Any feedback or thoughts are welcome!

Sunday, 14 December 2008

JDK 7 language changes - Devoxx votes!

Once again, the great Devoxx conference allowed ordinary developers to express an opinion on the future of the Java language. This time, the focus was on prioritising which changes should happen.

The figures given below are of course indicative only. They are based on those Devoxx attendees who actually participated on the whiteboards. Also, the figures are as recorded at the end of Thursday (Friday scores not included). The whiteboard photos are available which provide additional information. Bear in mind that my figures may disagree with the photos in some small way based on when I counted the results relative to when the photos were taken.

Given eight language changes, rank them according to priority

This year, rather than simple "do you like this yes/no" questions, we posed something tougher. Prompted by Alex Buckley, we asked participants to rank eight language change proposals. Voters had to mark the number one against their highest priority, two against their next highest and so on until they had no more priorities. Attendees could also vote against a proposal by marking an 'X'.

The voting system was not the simplest, and we know a few people didn't vote correctly. However, we do believe that the vast majority did vote according to the rules making these results valid. The total number of voters was around 220, so the results have good validity.

Feature	Properties	Multi catch	Null handling	List/Map syntax	Extension meths	Method pointers	Multiline Strs	Infer generics
1	33	47	70	10	6	8	9	61
2	21	56	38	16	8	10	12	49
3	30	35	27	35	10	15	21	35
4	25	26	15	42	17	13	28	19
5	10	20	15	29	13	25	29	19
6	14	15	8	17	23	21	27	15
7	14	9	2	18	24	25	24	11
8	26	19	2	9	19	23	32	8
X	29	1	12	8	28	24	15	0

Total Votes	202	228	189	184	148	164	197	217
Total votes (excluding X)	173	227	177	176	120	140	182	217
Weighted average	4.08	3.41	2.49	4.32	5.36	5.25	5.16	3.07

Here are the first preference votes:

Adding the second preference votes:

Adding the third preference votes:

Adding the fourth preference votes:

Adding the remaining votes:

So, what do these results tell us?

Well, they are in fact amazingly clear. Null-handling was the first preference favourite by a small margin, ahead of Inferring RHS generics and Multi-catch of exceptions. However, if participants had been given just three votes, then these three would have scored almost exactly equal (135/145/138) and over double that of List/Map access (61).

Next in line (ignoring properties) was List/Map access using [], which can be seen by the big jump in third and fourth preference votes. Fifth was Multi-line strings, which developers seem to not be that fussed about.

Finally, extension methods and method pointers (method references) performed poorly in the vote.

Properties was shown to be the most divisive, with both a large number of votes for and against, and fewer in the middle preferences. For example, notice the high number of first preferences, but dropping back when adding second and third preferences. In addition, properties has a high number of low preferences and votes against (coloured black). Since properties isn't going to be included in JDK 7, the properties column is really just an interesting comparison data point.

Obviously in a free vote like this without any explanations, it is easy for people to not vote for items they don't understand. This could explain some of the lower scores for Extension methods and Method references (see the total number of votes cast to confirm this). However, even taking this into account, these two don't seem to have as much support as the big three of Null-handling, Inferring RHS generics and Multi-catch exceptions.

It should also be noted that the graphs and the weighted averages are in line with one another.
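For anyone wanting to check the arithmetic, here is a small Java sketch of the two calculations. The weighted average is the mean rank across the votes for preferences 1 to 8, leaving out the 'X' votes (that is what reproduces the figures in the table), and the "three votes" totals are simply the sum of the first three rows. The class and method names are my own.

```java
import java.util.Locale;

class WeightedRank {
    // votes[i] is the number of people who gave the proposal preference i + 1.
    // 'X' (against) votes are not included in the array.
    static double weightedAverage(int[] votes) {
        int total = 0;
        int weighted = 0;
        for (int i = 0; i < votes.length; i++) {
            total += votes[i];
            weighted += votes[i] * (i + 1);
        }
        return (double) weighted / total;
    }

    // Number of voters who placed the proposal within their first n preferences.
    static int topN(int[] votes, int n) {
        int sum = 0;
        for (int i = 0; i < n && i < votes.length; i++) {
            sum += votes[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] nullHandling = {70, 38, 27, 15, 15, 8, 2, 2};
        int[] inferGenerics = {61, 49, 35, 19, 19, 15, 11, 8};
        System.out.println(String.format(Locale.ROOT, "%.2f", weightedAverage(nullHandling)));  // 2.49
        System.out.println(String.format(Locale.ROOT, "%.2f", weightedAverage(inferGenerics))); // 3.07
        System.out.println(topN(nullHandling, 3));                                              // 135
    }
}
```

A lower weighted average means a higher priority, since 1 was the top preference.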

Other language changes

In addition to the ranking above, there were some other votes.

The vote on closures was split 50/50:

Yes	41
No	41

The vote on abstract enums showed a majority in favour:

Yes	24
No	18

The vote on ARM (resource management) showed a large majority in favour:

Yes	64
No	11

The vote on some form of delegation showed a majority in favour:

Yes	36
No	9

The vote on for each over a string showed a majority in favour:

Yes	34
No	21

Summary

Firstly, a big thank you to the Devoxx voters, and to Stephan and the Devoxx organisers. I think this exercise was a huge success in participation. I'm certainly not in favour of language design by democracy, but I am very much in favour of gathering large scale information on what really causes developers pain. Clearly, handling nulls, RHS generics and catching exceptions are three big issues.

I hope to repeat the exercise again soon. And if any JUG (public or company internal) wants to do the same (maybe with different proposals) then I'd recommend it (and I've even got an Open Office presentation if you want some material - just drop me a line).

Any feedback or thoughts are welcome!

Devoxx 2008 - Whiteboard votes

Devoxx 2008 is over :-( But it was a great conference again. And Java really started to feel alive again with real discussions of JDK 7.

In this post I'll go through the general discussions that were held on the Devoxx whiteboards. The figures given are of course indicative only. They are based on those Devoxx attendees who actually participated on the whiteboards. Also, the figures are a record taken at the end of Thursday (Friday scores not included).

The whiteboard photos are also available which provide additional information. Bear in mind that my figures may disagree with the photos in some small way based on when I counted the results relative to when the photos were taken.

Are you considering using JavaFX ?

JavaFX had its big conference launch at Devoxx. This question captured developers' feelings about it:

Yes	62
Maybe	78
Next version	21
Never	45
What is jfx?	6

It seems clear that Devoxx attendees were pleasantly surprised by Java FX, and are thinking about using it.

What Testing Framework/Tools are you using ?

Testing is an essential developer activity, but what tools get used? I've broken out the big 3 unit testing tools here.

JUnit3	54
JUnit4	102
TestNG	23
Selenium	32
None	3
Testing?!	5
EasyMock	17
JMock	2
HTMLUnit	2
Fitnesse	3
SOAPUI	11
Cobertura	5

Obviously, a wide range of tools are in use, but in the unit test area JUnit still rules the roost.

REST vs SOAP

A hot topic with many developers who still have to try and convince sceptical managers that REST is a valid solution. I'm displaying the main numbers for the question here:

REST	50
SOAP	17
Both	30
JSON	6

So, developers prefer REST to SOAP. Somehow I'm not surprised!

What IDE/Editor do you use ?

A question that will always bring out strong opinions:

Eclipse	214
Netbeans	64
IntelliJ	126
Vi/vim	9
Textmate	6
Jdeveloper	5
Notepad++	7

I'd have expected a stronger showing for Eclipse than this. Maybe Eclipse users have less strong opinions about their tool than Netbeans and IntelliJ users?

Which Java VM are you using ?

How up to date are the versions of Java we use? The question asked Devoxx attendees to mark all they are using:

0.9b	2
1.0	3
1.1b	6
1.3	6
1.4	74
1.5	216
1.6	255
1.7	2
1.7 (patched)	4
1.7 (closures)	2
Java ME	13

So, more people are using Java 1.6 than Java 1.5. Interesting.

Checked exceptions ?

Checked exceptions are a feature essentially unique to Java. Many, like me, think they were a very bad idea. But what about Devoxx attendees:

Good idea	34
Bad idea	36
Both, It depends	6

Opinion seems split on this topic. Although I should note that the number of votes was lower on this issue, so maybe there is an element of "don't care".

What is your favourite JVM lang other than java ?

The JVM isn't only about Java now. What else are people using:

Groovy	49
Scala	30
JRuby	14
Jython	10
Fan	5
PHP	4
Clojure	5
Don't care	17

So, Groovy is looking pretty popular right now, with Scala seeing some good use. There is interest in additional languages too and this is healthy for the JVM's future (and personally, I was pleased to see Fan get 5 votes at this early stage in its life).

Summary

Thanks again to all the Devoxx voters. I'll write up about language change issues soon. Feel free to comment on any of the vote results.

Tuesday, 9 December 2008

Java 7 - small language changes

Today's news that there are likely to be language changes in JDK 7 was a bit of a surprise. It seemed like the chance had passed.

Small language changes for JDK 7

Joe's post isn't very long, but it is clear.

"I'll be leading up Sun's efforts" - so it's supported by Sun
"develop a set of ... language changes" - so it's more than one change
"small language changes" - small means no closures or properties
"we'll first be running a call for proposals" - community involvement
"in JDK 7" - in the next version (but why does the blog not say Java 7?)

I've used this blog, and previous visits to JavaPolis (now Devoxx) to discuss possible language changes. Some have been good ideas, others not so good. The main point was to provide a place for discussion.

The next phase after that was to implement some of the language changes. This has been achieved with the Kijaro project. As far as I'm concerned, anyone can write up an idea for a Java language change, and then I'll provide commit access at Kijaro for it to be implemented in javac - no questions asked.

Expectations

So, before we all submit lots of crazy ideas and get carried away, let's remember that Sun provided some hints at JavaOne 2008. This presentation includes the following as ruled out:

Operator overloading (user defined)
Dynamic typing
Macros
Multiple dispatch / multi-methods

And the following as 'under consideration':

Multi-catch in exceptions
Rethrowing exceptions

The following are listed as 'long term areas of interest':

Parallel algorithms
Versioning of interfaces
Delegation
Extension methods
Structural typing
Pluggable literal syntaxes

So, there is already quite a wide list on the table. Plus, there were other ideas suggested at last year's JavaPolis, by both Josh and Neal:

Variable declaration type inference (for generics)
Enum comparisons
Switch statement for strings
Chained invocations, even when method returns void

In addition to all the above, I strongly suspect that there isn't going to be a chance to tackle problems with generics. This is similar to closures and properties. There isn't the timescale or manpower to tackle these big issues in the timeframe being talked about (especially now Neal Gafter works for Microsoft).

My ideas

Well, I'll mostly save those for another day. But I would like to see a proper consideration of enhancements to help with null handling. And enhancements to the for-each loop.

Saying NO!

Finally, an appeal to Sun. Many in the community are deeply sceptical of any language changes at this point. The message should be simple - people need to feel that there is a clear means to vote or argue AGAINST a proposal, just as much as to make suggestions FOR change. Although I don't expect to make much use of the facility, I know there are many that do want to express this opinion.

Summary

This is a new departure for Sun in so openly asking for ideas from the community. We need to respond and reply with thoughtful ideas for what will and will not work.

Sunday, 9 November 2008

Java language change - Unique identifier strings

This last week I've been refactoring some very crusty old code in my day job to simplify it. One part of the refactor has prompted me to write up why language features often affect more than is immediately obvious.

The case of the Unique Identifier String

The specific code I've been working on isn't that significant - it's a set of pools that manage resources. The important feature for this discussion is that there are several pool instances each of which has a unique identifier.

When the code started out many years ago, the identifier was simple - the unique name of the pool. As a result, the unique identifier was the pool name, and that was defined as a String:

// example code - hugely simplified from the real thing...
public class PoolManager {
  public static Pool getPool(String poolName) { ... }
  ...
}
public class Pool {
  private String poolName;
  ...
}

Internally, the manager consists of maybe 25 classes (it's over-complex, no IoC, and needs refactoring, remember...). Most of the 25 classes have some kind of reference to the pool name, whether to access configuration, logging or some other reason.

At some point in the past, a new development was commissioned that affected the whole system. The new development - maintenance code - was to allow multiple sets of configuration throughout the system.

To achieve this, everywhere that accessed configuration needed a unique key for the configuration it needed to access. Again, as this was a simple lookup, a String was used. And, since the pooling component was affected, a second unique key was added:

// example code - still hugely simplified from the real thing...
public class PoolManager {
  public static Pool getPool(String poolName, String configName) { ... }
  ...
}
public class Pool {
  private String poolName;
  private String configName;
  ...
}

Now, in order to complete the change, the config name was rolled out to most of the 25 classes alongside the poolName. In effect, the true 'unique id' for the pool became the combination of the two separate keys of poolName and configName.

Now, we could debate lots about this design, but that's not the point. The point, in case you missed it, is that we now have up to 25 classes with two 'unique ids' that are really one. In addition, this creates confusion in what things mean. After all, with two keys we now need a map within a map to look up the actual pool, right? (again, I know the alternatives - this is a blog about what maintenance code does over time, and how to tackle it...)

OK, so how might we improve this using Java?

A better design

If the original developer had coded a PoolId class then the overall design would have been a lot better:

// pre-maintenance:
public class PoolId {
  private String poolName;
  ...
}
public class PoolManager {
  public static Pool getPool(PoolId poolId) { ... }
  ...
}
public class Pool {
  private PoolId poolId;
  ...
}

Now, the maintenance coder would have had a much easier task:

// post-maintenance:
public class PoolId {
  private final String poolName;
  private final String configName;   // NEW CODE ADDED
  ...
}
// NOTHING ELSE CHANGES! PoolManager and Pool stay the same!

Wow! That's a lot clearer. We've properly encapsulated the concept of the unique pool identifier. This allowed us to change the definition to add the configName during the later maintenance. This isn't rocket science of course, and there isn't anything new in this blog so far...

Now, what I want to do is ask the awkward question - Why wasn't the PoolId class written originally?

It's vital that we understand that question. It's the root cause as to why the code now needs refactoring, and why it is hard to understand and change. (And bear in mind this is just an example scenario - you should be able to think of many similar examples in your own code.)

Well, let's look at the PoolId class in more detail. In particular, let's look at the code I omitted above with some '...'.

// real version of PoolId in Java - pretty boring...
public final class PoolId {
  private final String poolName;
  
  public PoolId(String poolName) {
    if (poolName == null) {
      throw new IllegalArgumentException();
    }
    this.poolName = poolName;
  }
  public String getPoolName() {
    return poolName;
  }
  public boolean equals(Object obj) {
    if (obj == this) {
      return true;
    }
    if (obj instanceof PoolId == false) {
      return false;
    }
    PoolId other = (PoolId) obj;
    return poolName.equals(other.poolName);
  }
  public int hashCode() {
    return poolName.hashCode();
  }
  public String toString() {
    return poolName;
  }
}

Now we know why the original developer didn't write the class PoolId. Very, very few of us would - the effort required is simply too great. It's verbose, boring, and probably has enough potential for bugs that it might need its own test.
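The two-key version makes the same point, only more so. As a sketch, here is what the post-maintenance PoolId might look like with both keys, plus a registry that uses it as a single map key rather than a map within a map. The `PoolRegistry` class and the use of a plain String as a stand-in for the real Pool are hypothetical, and I've used `java.util.Objects` (from later Java versions) to keep it short, so a null argument throws NullPointerException rather than IllegalArgumentException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class PoolId {
    private final String poolName;
    private final String configName;

    PoolId(String poolName, String configName) {
        this.poolName = Objects.requireNonNull(poolName, "poolName");
        this.configName = Objects.requireNonNull(configName, "configName");
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        if (!(obj instanceof PoolId)) {
            return false;
        }
        PoolId other = (PoolId) obj;
        return poolName.equals(other.poolName) && configName.equals(other.configName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(poolName, configName);
    }

    @Override
    public String toString() {
        return poolName + "/" + configName;
    }
}

// Hypothetical registry: one flat map keyed by the composite id.
class PoolRegistry {
    // A String description stands in for the real Pool class here.
    private final Map<PoolId, String> pools = new HashMap<>();

    void register(PoolId id, String pool) {
        pools.put(id, pool);
    }

    String lookup(PoolId id) {
        return pools.get(id);
    }
}
```

Even with the helper methods, that is around thirty lines for what is conceptually "two strings that together identify a pool".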

But the way we write this class - what it actually looks like - is a language design issue!

Quick composites

It is perfectly possible to design a language that makes such classes really easy to write. For example, here is a pseudo-syntax of an imaginary language a bit like Java:

// made up language, based on Java
public class PoolId {
  property state public final String! poolName;
}

The 'property' keyword adds the get/set methods (no need for set in this case, as the field is final). The 'state' keyword indicates that this is part of the main state of the class. Adding the keyword generates the constructor, equals(), hashCode() and toString() methods. And finally, the '!' character means that the string cannot be null.

Adding another item of state is really simple:

public class PoolId {
  property state public final String! poolName;
  property state public final String! configName;
}

Suddenly, adding a new class for things like PoolId doesn't seem a hardship. In fact, we've changed implementing the right design to doing the easy thing. Basically, it's about as easy as it's ever going to get.

My real point is that if Java had a language feature like this, then there would be a much greater chance for the better design to be written. After all, most developers will always take the lazy option - and in Java that is way too many String identifiers.

So, does this imaginary language exist? Well, some get a lot closer than Java, but I don't think any language achieves quite this kind of brevity (prove me wrong!).

In addition, I'm arguing for Fan to encompass 'quick composites' like this. After all, I'd argue that most (80%?) of the classes we write could have auto-generated equals() and hashCode() based on a 'state' keyword.

Summary

As a community of Java developers, we need sometimes to realise that the language we develop in can actually hold us back. A language design feature like this is not just about saving a few keystrokes. It can fundamentally change the way lots of code gets developed simply by changing the better design from very hard/verbose to really easy. And the knock-on effects in maintenance could be huge.

Finally, I want to be clear though that I am NOT advocating a change like this in Java. Java is probably too mature now to handle big changes like this. But new languages should definitely be thinking about it.

Opinions

What languages come close to this design?
What percentage of classes in your codebase could have their equals()/hashCode() methods generated by a 'state' keyword?
Opinions welcome!

Thursday, 23 October 2008

Life in the Fan lane - less NPEs

Since I last blogged I've been contributing to the development of the ~~Fan~~ Fantom language through lots of discussions. One of the latest features to be added is targeting a major reduction in NPEs.

Fantom and NPEs

The Fantom programming language is a relatively new language for the JVM (it compiles to bytecodes). However, it also compiles to .NET CLR making it very portable, with its own set of APIs.

It's a language that is easy for Java and C# developers to grasp (familiar syntax and concepts). It also fixes up many of the weak points identified in Java, with closures properly designed in, understandable generics, much less boilerplate and clever concurrency. Further, it performs close to, or as well as Java, because it is a statically typed language.

// find files less than one day old
files = dir.list.findAll |File f->Bool| {
  return DateTime.now - f.modified < 1day
}
// print the filenames to stdout
files.each |File f| {
  echo("$f.modified.toLocale:  $f.name")
}

When I first came across the language, I immediately saw its potential as a Java successor. It is an evolutionary language. Simple for Java developers to move to, yet with clear potential productivity gains. One thing irked me though, and that was the handling of nulls, because Fantom had a model exactly like Java, where any variable could be null.

This has changed recently however, and now Fantom supports both nullable and not-null types:

 Str surname      // a variable or field that cannot hold null
 Str? middleName  // a variable or field that can hold null

At a stroke, this allows much more control within your application of the presence of null. No longer do we have to write in JavaDoc (or lengthy nasty annotations) as to the null status of a variable.

For example, you cannot compare surname to null, as that makes no sense. In the Java defensive coding style, it is often the case that variables are needlessly checked for null. This additional code gets in the way of the real business logic, and requires extra tests and analysis. With Fantom's null/not-null choice, a not-null variable can simply be relied on to be not-null, and any attempt to compare a not-null variable to null is a compile error.

Further to this, the type system allows you to block the presence of null in lists and maps. This is often an area forgotten when passing collections to methods - until the NPE strikes.

 Str[] list     // a list that cannot hold null
 Str:Int map    // a map that holds non-null strings to non-null integers
 Str:Int? map2  // a map that holds non-null strings to nullable integers

Finally, the null-safe operators (?., ?-> and ?:) are all prevented from operating on not-null variables. The ?. operator is a null-safe method invoke, that returns null if the LHS is null, so clearly this makes no sense if the LHS is a not-null variable.
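For Java readers, it may help to see what these operators save. The sketch below is the hand-written Java equivalent of roughly what an expression such as person?.address?.city ?: "unknown" stands for, where ?. stops and yields null as soon as the LHS is null and ?: supplies a default. The Person and Address classes are made up for the example.

```java
class NullSafeDemo {
    static final class Address {
        final String city;          // may be null
        Address(String city) {
            this.city = city;
        }
    }

    static final class Person {
        final Address address;      // may be null
        Person(Address address) {
            this.address = address;
        }
    }

    // Every link in the chain needs its own null check in Java.
    static String cityOf(Person person) {
        if (person == null) {
            return "unknown";
        }
        if (person.address == null) {
            return "unknown";
        }
        if (person.address.city == null) {
            return "unknown";
        }
        return person.address.city;
    }
}
```

With not-null types, the checks for any link declared not-null simply disappear, which is why the compiler rejects the null-safe operators there.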

One point to note is that not-null is the default. Why is that?

Well, it turns out that is the most common state for a variable. Most programmers intend most variables to not hold null. In converting the Fantom sourcebase, figures of 80% were not uncommon (ie. 80% of variables were originally intended to never hold null). Clearly it makes sense to make the most common case the default, and thus to make handling null a special case.

And what does the nullable/not-null variable actually represent? Well, most variables are objects, so once it gets to bytecode there is no difference. But for Int and Float, the non-null types can be converted to the primitive bytecodes for Java's long and double. This means that Fantom now has access to primitive level performance at the bytecode level which is going to allow Fantom applications to speed along nicely.

Summary

Fantom is coming along nicely. NPEs are a major headache in today's systems, and Fantom's approach will eliminate many of those errors. Further, it will eliminate much of the defensive code that is added to protect against values that will never actually be null - leaving Fantom even more decluttered.

Opinions welcome on NPE handling - something that seems to have been a complete lack of priority in Java.

Friday, 20 June 2008

Friday fun - Or is this a serious AJAX comment?

This is a genuine street sign from Nelson, New Zealand. But perhaps it's telling us that AJAX really is a dead end and we need something better?! Or that now we've started using AJAX there's no escape?!!

(Well I found it funny anyway ;-)

Tuesday, 17 June 2008

Firefox download day farce - never ignore time zones!

So, the big download Firefox world record attempt has started in a farce. Why? Because of a complete lack of understanding of and planning for date and time issues!

The 'download day' has been advertised as 17th June 2008. But there is no clear information as to when on the 17th June. So when would you assume?

Personally, I would naturally assume the 24 hours starting at midnight UTC. But no. The 24 hour period actually starts at 10am Pacific Daylight Time. Crazy!!!

The recriminations on this have already started. The main complaints include bad planning, American focus and plain stupidity. It already looks like quite a few people assume that the release has been delayed. Some appear to have given up on the download.

Again with dates and times the message is clear - time zones are a pain, but you can't ignore them!
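To make the mismatch concrete, here is a short Java sketch using the java.time API from later Java versions (Java 8 and up). It takes the advertised start, 10am Pacific time on 17 June 2008, and shows the same moment in UTC and in another zone. The class and method names are mine.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;

class DownloadDay {
    // Start of the 24 hours: 10:00 on 17 June 2008 in US Pacific time.
    static ZonedDateTime start() {
        return ZonedDateTime.of(2008, 6, 17, 10, 0, 0, 0, ZoneId.of("America/Los_Angeles"));
    }

    // The same instant, expressed on the wall clock of another time zone.
    static ZonedDateTime startIn(String zoneId) {
        return start().withZoneSameInstant(ZoneId.of(zoneId));
    }

    public static void main(String[] args) {
        System.out.println(start().toInstant());              // 2008-06-17T17:00:00Z
        System.out.println(startIn("Asia/Tokyo"));            // 02:00 on 18 June in Tokyo
        System.out.println(start().plusHours(24).toInstant());
    }
}
```

So for anyone in Tokyo, "download day" did not begin until the early hours of the 18th, and for everyone outside the Americas most of it fell on the 18th.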

Thursday, 12 June 2008

The Fantom language - Is it JavaNG?

For many moons there have been discussions about what we need to do to 'fix' Java. But what might the language look like if we applied all the possible fixes?

The Fantom programming language

Many (all?) of the ideas for creating a 'better Java' are not new, and have been talked about many times over the years. Ola Bini produced one list, which included no primitives, enhanced generics (no angle brackets), closures, method references, implementation in interfaces, some type inference, no checked exceptions and non-null variables. This is a good list and mirrors many of the discussions that have been held on this blog.

Of course, this JavaNG/Java3/BetterJava language doesn't exist does it?

Well, perhaps it does.

The ~~Fan~~ Fantom programming language is one I've been watching for a couple of months since it first launched its website. It's statically typed like Java and Scala, although there are some dynamic features.

I've been meaning to blog about it for some time, but Cedric beat me to it (and it's a good writeup too).

Basically, Fantom fixes 95% of the pain points in Java in a manner and style that is close to that which you'd naturally pick if you were creating JavaNG/Java3/BetterJava. Here is a quick rundown of the key features and changes:

Pods, Types and Slots

The basic unit of grouping code is the Pod. This is like a combination of a Java package and module. It forms a key part of the name of any type in the same way as the package name does. But because it is also the module, it is easy to find missing classes - it's like every class file knowing what jar file it is stored in. There is also a module level access specifier ('internal'), which is a cross between package scope and proposed 'module' scope in Java.

The two main types of type are classes and mixins. Classes are as expected, with single inheritance. Mixins are like interfaces, but can have implementation too, providing convenient and safe multiple inheritance if required.

Slots are the contents of classes and mixins. There are only two types of slot - fields and methods. All fields behave as properties, and are accessed via methods. You don't need to manually write the getter/setter, but you can if you need to, such as with calculated fields. Methods are pretty much as expected, except that you have to declare them as virtual in order to allow them to be overridden (and you have to declare override if you are overriding).

One key point is that all slots have unique names. This means that a field cannot have the same name as a method. This is a Good Thing, and allows method/field references to just reference the slot name (contrast this with the FCM method reference spec).

It also means that methods cannot be overloaded! But this is also a Good Thing, when combined with default values for method parameters, for example either of the last two arguments may be omitted and will default to zero:

Date withDate(Int hour, Int minute, Int second := 0, Int milli := 0) {
  ....
}

The totality is a very simple, but consistent, basic structure.
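For contrast, here is a sketch of how the default parameter values above have to be emulated in Java, where overloading is the only tool available. The `TimeOfDay` class is a made-up stand-in for the example.

```java
final class TimeOfDay {
    final int hour;
    final int minute;
    final int second;
    final int milli;

    private TimeOfDay(int hour, int minute, int second, int milli) {
        this.hour = hour;
        this.minute = minute;
        this.second = second;
        this.milli = milli;
    }

    // Java has no default parameter values, so each shorter form is a separate overload.
    static TimeOfDay of(int hour, int minute) {
        return of(hour, minute, 0, 0);
    }

    static TimeOfDay of(int hour, int minute, int second) {
        return of(hour, minute, second, 0);
    }

    static TimeOfDay of(int hour, int minute, int second, int milli) {
        return new TimeOfDay(hour, minute, second, milli);
    }
}
```

Three methods do the job of one, and each extra optional parameter adds another.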

Constructors

Constructors are defined like static methods, but use the method name 'make' by convention. This naming convention is designed to allow easy switching from creation of objects to returning cached objects (or specialised subclasses) without affecting the caller:

class Date {
  new make(Int hour, Int minute, Int second := 0, Int milli := 0) {
    ....
  }
}

The 'new' keyword indicates that the method is a constructor.
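The closest Java idiom is the static factory method. As a rough sketch (the `Hour` class and its cache are invented for illustration), callers write `Hour.make(9)` and cannot tell whether an object was created or reused, which is the flexibility the 'make' convention is after:

```java
import java.util.HashMap;
import java.util.Map;

final class Hour {
    private static final Map<Integer, Hour> CACHE = new HashMap<>();

    private final int value;

    private Hour(int value) {
        this.value = value;
    }

    // Returns a cached instance where one exists, otherwise creates and caches one.
    static synchronized Hour make(int value) {
        if (value < 0 || value > 23) {
            throw new IllegalArgumentException("hour out of range: " + value);
        }
        return CACHE.computeIfAbsent(value, v -> new Hour(v));
    }

    int value() {
        return value;
    }
}
```

In Java the choice between `new Hour(9)` and `Hour.make(9)` has to be made up front by the API designer, since switching later breaks every caller.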

Assignments and operators

All assignments use the := symbol. This clearly separates them from any other usages of =. It also enables a basic form of type inference when declaring local variables:

  value := 6
  str := "Hello world"

Note that there was no need to declare the variable type (Str/Int). Declarations of fields still require the type to be specified though.

Fantom also supports operator overloading by delegation to methods. Thus, writing a method with a special name will cause the matching operator to be allowed.

As a result, there is no need to call a.equals(b). The == operator is mapped to the equals method - this one change makes code a lot neater.
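As a reminder of what this avoids, here is the classic Java trap in a few lines. In Java, == on objects compares references, so two equal strings can still compare as different:

```java
class EqualsDemo {
    // Reference comparison: true only if both variables point at the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // Value comparison: true if the two strings have the same contents.
    static boolean sameValue(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String first = new String("pool");
        String second = new String("pool");
        System.out.println(sameObject(first, second)); // false
        System.out.println(sameValue(first, second));  // true
    }
}
```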

Immutability and Threading

Immutability (using the 'const' keyword, although with a different meaning to C) is built into the language. Fields and classes can be immutable, and these gain extra power in the language.

The main benefit is with multi-threading. The language prevents there being any shared mutable state between two threads. Only immutable objects may be freely passed between threads. If you need to pass a mutable object to another thread, then it must be serializable (using the built in serialization) and it will be sent to the new thread as a serialized message. There is thus no synchronized keyword.

Collections and Closures

Collections and Closures are the only places where anything like generics appears. The collection classes have dedicated type signatures, which look like Java array signatures:

 Str[] listOfStrings := ["one", "two", "three"]
 Str:Int mapOfStringsToInts := ["one": 1, "two": 2, "three": 3]

Note that both the list and map also have a literal syntax to define initial contents.

Closures are built in, and the API is written to work with them. The syntax is modelled after Ruby, but is pretty readable to a Java developer, using pipes to define the closure arguments:

 list := ["one", "two", "three"]
 list.each |Str val, Int index| {
  echo("$index = $val")
 }

Note how local variables can be accessed and output from within a string (like Groovy amongst others).

Dynamic coding

Fantom encourages more dynamic coding styles. If you use the dot to call a method (as per normal Java) then the method is compile time checked. If you use the arrow operator, then the method is called by reflection. Further, if the method called by reflection doesn't exist, then this error can be caught (the 'trap' method) which is like the 'method missing' concepts in other languages and enables powerful DSLs to be written.

  obj.doStuff()   // compile-time checked call, as per Java
  obj->doStuff()   // runtime reflection call

This can also be thought of as enabling duck typing.
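
To give a feel for what the arrow call saves, here is roughly the same thing written by hand in Java with reflection. This is only a sketch under my own assumptions: the helper name DynamicCall.call and the 'fallback' parameter (standing in for the 'trap' hook) are invented for illustration, and I am not claiming this is how Fantom implements it:

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class DynamicCall {
    // Calls a public no-argument method by name, or returns the fallback if it does not exist.
    static Object call(Object target, String methodName, Object fallback) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (NoSuchMethodException e) {
            // The 'method missing' case - a real hook could do something smarter here.
            return fallback;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        } catch (InvocationTargetException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

For example, DynamicCall.call("abc", "length", null) looks up String.length() at runtime, whereas a misspelt method name simply yields the fallback instead of a compile error.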

API

Fantom has its own API. It does not directly expose the Java API, although you can access it via the 'native' keyword. The benefit of this is the removal of all the bad and broken parts of the Java API.

The Fantom authors' philosophy is to provide all the useful features of the Java API, but in fewer classes. Thus, the IO API consists of just 4 key classes, with byte/char operations on the same class, and buffering assumed.

Odds and ends

Lines do not need a semicolon at the end. A newline will suffice.

There is built in serialization to a JSON like syntax. This is actually a subset of the language, and can be used to initialise objects at creation time in normal code.

Methods may have covariant returns like Java. But they can also have Self type returns.

Methods can be declared to run 'once', and then cache their value.

If, while, and basic for loops are the same as in Java. Switch statements are better, as they can't fall-through.

Exception handling is as per Java, with try, catch and finally. All exceptions are unchecked (yay!)

The default type for numbers with a decimal point is a BigDecimal equivalent, not a double equivalent. This will mean a lot fewer numerical errors.
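
The kind of numerical error meant here is easy to show in Java, where a literal with a decimal point is a double. A small sketch (the class name is illustrative):

```java
import java.math.BigDecimal;

final class DecimalDemo {
    // Binary floating point cannot represent 0.1 or 0.2 exactly.
    static double sumAsDouble() {
        return 0.1 + 0.2;
    }

    // Decimal arithmetic keeps the digits that were written.
    static BigDecimal sumAsDecimal() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    public static void main(String[] args) {
        System.out.println(sumAsDouble());  // prints 0.30000000000000004
        System.out.println(sumAsDecimal()); // prints 0.3
    }
}
```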

Fantom runs on the JVM and the .NET runtime. It manages this by writing to a temporary intermediate format, which then gets further compiled to the right bytecode.

The website is really good and detailed.

Comparing to Java

Fantom is not Java, it is its own language. Yet, it is in many, many ways the natural result of applying all the changes that the blogosphere asks for in Java. Just for that reason it pays to evaluate it closely.

Is Fantom a good language? The answer is a qualified yes. At the moment, it looks like a very well designed language. I have very few points of contention with it - the main one being there is no non-null support. Minor points include some choices for coding standards (open curly brace positioning, and when to use method parameters).

Is Fantom a new language? Not really. It is a consolidation language, which is what James Gosling claims Java was (ie. a language that doesn't invent much that is radically new). Most of the ideas have been seen elsewhere, but Fantom has a particular JavaNG feel about it.

What are my key features? Built in modules, immutability and safe threading, closures, and a really great solution to generics (ie. only support them on Lists, Maps and Closures). The local type inference, no semicolons, == for equals and interpolated strings are four minor features that make a big difference too.

Summary

Fantom is worth checking out. Whether it succeeds as a language is up to many factors - it needs an IDE for example, and a lot of luck.

The key point for me is that Fantom represents much of what JavaNG/Java3/BetterJava would look like if all our ideas were adopted. And while it has many similarities to Java, there is also quite a sense of difference. Perhaps, the biggest aspect of this is that the APIs are different. But that is perhaps inevitable if you want to get any real benefit from closures and fixing generics (by simplifying them).

And that perhaps gives us the definition of where JavaNG/Java3/BetterJava ends and BeyondJava starts. If the language is based around the Java APIs, it's a JavaNG/Java3/BetterJava language (eg. Groovy). If the language has its own APIs, it's a BeyondJava language (eg. Fantom, Scala).

All opinions welcome on Fantom and the BetterJava to BeyondJava boundary!

Friday, 2 May 2008

Enhancing Java - Multi-lingual blocks

The reality for Java is that there are many other programming languages, and many of those have features that Java developers sometimes wish they could access. But it's simply impossible to add all those features. Is there a possible alternative if we think 'outside the box'?

Multi-lingual

What I'm thinking about in this blog is the possibility of embedding Groovy, Ruby, Jython or Scala code directly within Java code.

Why might that be useful?

Well, each language has its own benefits, whether Scala's functional style or Groovy's GStrings. Including a small part of another language within the main code body could be useful, although obviously this would be a technique to be used with care.

And it doesn't have to stop at known languages. What about a dedicated 'SQL language'? Or a dedicated 'XML language'? These would be more than just DSLs, but actual languages with whatever syntax rules are most applicable.

So, what might a syntax look like:

 public String fetchRow(int id) {
   :groovy: {
     println "Row id: $id!"
   }
   :sql: {
     SELECT %text% FROM my_table WHERE row_id = %id%;
   }
   return text;
 }

The idea is that a block of code, surrounded by curly brackets, can be identified as belonging to a different language. In this case I've used the syntax of the name of the language (which would have to be imported) surrounded by colons. Note that there is nothing specific about the syntax within the block. Bear in mind that the syntax isn't that important - it's the concept that matters.

The Groovy example - just normal Groovy code - outputs the row id using an embedded string. The SQL example is an invented 'language' where a column is read by id, and then returned to the Java code as the variable text.

So, what about the detail? Well, the approach requires two parts.

Firstly, there needs to be a parser for each language that understands the relevant syntax. This will typically be a variation of the normal parser for a 'real' language like Scala or Ruby. For a new language like SQL or XML, it would be written from scratch. The parser also needs to be able to recognise when the block of code in that language is complete.

Secondly, the parser needs to be able to share variables with the surrounding code. As a basic principle, this can be thought of as a map, where the other language code can both read and write to the map. Of course this requires there to be a mapping between the various type systems - for Groovy this should be easy, other languages might find that more tricky.
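
To make the 'shared map' idea a little more concrete, here is a rough Java sketch of the contract an embedded block might compile down to. Everything here is hypothetical: the EmbeddedBlock interface, the variable names and the stand-in for the ':sql:' block are my own inventions for illustration, and no real SQL is executed:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical contract: the block reads and writes variables through a shared map.
interface EmbeddedBlock {
    void run(Map<String, Object> variables);
}

final class SharedVariablesDemo {
    static String fetchRow(int id) {
        // The surrounding Java code exposes its local variables to the block.
        Map<String, Object> variables = new HashMap<String, Object>();
        variables.put("id", id);

        // Stand-in for the ':sql:' block - it reads 'id' and writes 'text'.
        EmbeddedBlock sqlBlock = new EmbeddedBlock() {
            public void run(Map<String, Object> vars) {
                vars.put("text", "row-" + vars.get("id"));
            }
        };
        sqlBlock.run(variables);

        // After the block completes, the written value flows back into Java.
        return (String) variables.get("text");
    }
}
```

The cast on the last line hints at the type mapping problem: the map only knows about Object, so each language would need rules for converting its own types in and out.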

So how hard is this to implement? Probably pretty hard. But it does open up lots of possibilities - whether for embedded DSLs or larger blocks of code in another language.

Summary

This is an outline of an idea to allow other languages, whether existing or new, to be easily embedded directly in existing code. Any thoughts?

Monday, 28 April 2008

Plans for JavaOne

Just a quick post to outline my plans for JavaOne.

I'll be in San Francisco from Sunday, so I expect I'll be picking up my pass then. I expect the rest of the day will be more touristy, unless I get grabbed for a techie discussion!

On Monday, I've arranged a JSR-310 dates/times Expert Group session. As always though, I'm trying not to limit this to just EG members, so if you want to contribute, see the mail on the mailing list.

I'll be kept busy by the JCP during the week too. There is some JCP training on the Monday, plus, for some strange reason, I've been nominated for an award - "JCP participant of the year". Most unexpected!

Finally, of course, I'm giving a JSR-310 technical session with Michael on Thursday at 13:30, with a repeat on Friday at 13:30. The id is TS-6578.

Hope to see some of you there - and if you'd like to meet up and chat about JSR-310, Joda-Time, FCM or any of my blog posts then drop me a line at scolebourne-joda-org.

Saturday, 26 April 2008

Java 7 - For-each loop control access

I've gathered together a few more thoughts on improving the enhanced for-each loops. The basic idea is to take this very popular Java 5 feature and provide the missing parts.

Control access

One of the more frustrating parts of the Java 5 for-each loop is when you are 80% through writing a loop, and you discover you need to remove an item, or require the loop index. At that point, you have to go back and manually change the loop to one of the old formats (in Eclipse at least). This is a hassle.

Perhaps more important is that the older for loops simply aren't as clear in their intentions, aren't very DRY, and are definitely more error-prone. As a result, I've documented my proposal to improve the for-each loop with control access. For example, to access the loop index:

 Collection<String> coll = new ArrayList<String>();

 for (String str : coll : it) {
   System.out.println("Item: " + str + ", Index: " + it.index());
 }

And here is an example of removing an item:

 List<String> list  = new ArrayList<String>();

 for (String str : list : it) {
   if (str == null || it.isFirst()) {
     it.remove();
   }
 }

As can be seen, the syntax simply involves adding another colon and a 'variable' name. The 'variable' can be used to access loop control and manipulation functions. Note that the additional colon and 'variable' are of course optional for full backwards compatibility.
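
For comparison, this is the sort of 'old format' loop that the removal example has to be rewritten into today. It is a sketch only (the class and method names are illustrative), but it shows the explicit iterator and the hand-maintained index that the proposal would hide:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class LoopControlDemo {
    // Removes null items and the first item, mirroring the proposed example.
    static List<String> prune(List<String> input) {
        List<String> list = new ArrayList<String>(input);
        int index = 0;
        for (Iterator<String> it = list.iterator(); it.hasNext(); index++) {
            String str = it.next();
            if (str == null || index == 0) {
                it.remove();
            }
        }
        return list;
    }
}
```

The element type is now written twice, and the index is easy to get wrong because it lives outside the loop header.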

The document discusses two strategies for implementing the syntax - either via real Java types or as a language level feature. Please read the document for more information.

Maps

I have updated my previous document about extending for-each to maps. The download of the javac implementation remains available from Kijaro.

Summary

It seems increasingly unlikely that there is time for closures to make it into Java 7. There are also many developers expressing real doubts as to whether the complexity of control invocation is just too much for the venerable Java language.

The alternative is smaller improvements like these two. They provide an easy to grasp extension to the popular Java 5 for-each loop, that might still be possible to deliver in Java 7. Opinions welcome, as always.

Saturday, 19 April 2008

Java 7 - For-each loops for Maps

Have you ever been frustrated by the new Java 5 for-each loop because it didn't operate directly on maps?

For-each loop for Maps

I have documented a proposal to change Java to allow for-each loops on maps. I have also used the Kijaro project to implement the enhanced for-each loops!

  Map<String, Integer> map = new HashMap<String, Integer>();
 
  for (String str, Integer val : map) {
    System.out.println("Entry" + str + "=" + val);
  }

The altered version of javac can be downloaded, with the normal caveats of 'no warranty' and 'not intended for production'.
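
For reference, this is what the same loop looks like in standard Java today, going via entrySet(). The class and method names are illustrative only:

```java
import java.util.Map;

final class MapLoopDemo {
    // Builds "key=value;" pairs in the map's iteration order.
    static String describe(Map<String, Integer> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            // The key and value have to be unpacked by hand from each entry.
            String str = entry.getKey();
            Integer val = entry.getValue();
            sb.append(str).append('=').append(val).append(';');
        }
        return sb.toString();
    }
}
```

The proposal effectively folds the Map.Entry declaration and the two unpacking lines into the loop header.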

Closures

The real question here is whether we should use closures to obtain this functionality, or just code a specific language feature. Since it is far from certain that closures will appear in Java 7 due to timescale and resourcing questions, maybe we should be considering the alternatives?

Extending for-each loops to cover maps is a simple extension to the Java language that introduces no radically new concepts. Existing developers should be able to pick up the feature without any difficulty. In addition, developers that have never been exposed to the feature (but have seen a Java 5 for-each loop) should be able to read the code and grasp the meaning without tuition.

The truth is that sometimes the simple solution is the right one. Perhaps, closures are overkill for many of the uses in Java? Hopefully this document and prototype will allow people to kick the tyres on implementing this concept as a language feature allowing a fair comparison with closures.

Summary

I've released a document and prototype of For-each loop for Maps, a language change to build on the Java 5 for-each loop. All feedback welcomed!

Thursday, 28 February 2008

FCM closures - options within

The FCM closures proposal, with the JCA extension, consists of multiple parts. This blog outlines how those parts fit together.

FCM+JCA spec parts

The FCM+JCA spec contains the following elements:

Method literals, also constructor and field literals
Method references, also constructor references
Inner methods
Method types, aka function types
Control invocation, in the JCA extension

These five parts all fulfil different roles in the proposal. But what is often not understood is how feasible it would be to implement less than the whole specification.

Method references and literals

Reviewing the response to the entire debate, it is clear to me that method references and/or method literals have generally widespread appeal. The ability to reference a method, constructor or field in a compile-time safe and refactorable way is a huge gain for a statically typed language like Java.

It should be noted that although Method References appear only in the FCM spec, they could be added to the BGGA or CICE spec without difficulty. Also, they are included in the closures JSR proposal.

There are three areas of contention with method references and literals.

Firstly, should both references and literals be supported, or just references? Or, looking at the question differently, should literals have a different syntax?

The problem here is that if a method reference and literal have the same syntax then it is unclear as to what the type of the expression is. The FCM prototype demonstrates, I believe, that this can be solved using the same syntax. The approach taken is to say that the default type is a method literal, but that it can be converted (boxed) to a method reference at construction time. Any ambiguity is an error.

 Method m = Integer#valueOf(int);
 IntToInteger i = Integer#valueOf(int);

In this example, IntToInteger is a single method interface. Because the expression Integer#valueOf(int) is assigned to a single method interface, the conversion occurs (generating a wrapping inner class).
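
To show what these two lines replace, here is a sketch of the equivalent code in Java as it stands today. The shape of IntToInteger (a single 'convert' method) is my assumption, since only its name is given, and the wrapping class is written by hand rather than generated:

```java
import java.lang.reflect.Method;

// Assumed shape of the single method interface from the example.
interface IntToInteger {
    Integer convert(int value);
}

final class MethodLiteralDemo {
    // Today's stand-in for a method literal: a string-based lookup, checked only at runtime.
    static Method literal() {
        try {
            return Integer.class.getMethod("valueOf", int.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Today's stand-in for a method reference: a hand-written wrapping inner class.
    static IntToInteger reference() {
        return new IntToInteger() {
            public Integer convert(int value) {
                return Integer.valueOf(value);
            }
        };
    }
}
```

The string "valueOf" is invisible to the compiler and to refactoring tools, which is exactly the weakness the # syntax is meant to remove.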

The second issue is what a method reference can be boxed into. This is essentially a question of whether function types should be supported, and I'll cover that below.

The third issue is syntax, specifically the use of the #. Personally, I find this syntax natural and obvious to read, but I know others differ. I think it is important to get the syntax right, but final decisions on that can come later.

So, are method literals and references required when implementing FCM? I would say 'yes'. These are simple, popular, constructs that naturally extend Java in a non-threatening way. Some have suggested omitting the literals as reflection is not type-safe, however this misses the point of the large number of existing frameworks and APIs that accept reflection Method as an input parameter.

Inner methods / Closures

 ActionListener lnr = #(ActionEvent ev) {
   ...
 };

This is where the key difference with BGGA lies, notably over the meaning of return and the value, and safety, of non-local returns.

Opinions on this appear to me to be impacted by the generics implementation, where the decision was made to do what feels like 'half a job'. As a result, there is a meme that runs 'we must implement closures fully or not at all'. This meme is extremely unfortunate, as it is not allowing a rational analysis of the semantics of the proposals. Anyone supporting BGGA really needs to consider the mistakes that developers will make again and again with the non-local return/last-line-no-semicolon approach.

So, are inner methods required when implementing FCM? I would say 'effectively, yes'. Although you could just implement method literals and references alone, there are even bigger gains to be had from adding inner methods. They greatly simplify the declaration of single method inner classes, and allow much of the impact of closures in the style of Java.

Function types / Method types

 #(int(String) throws IOException)

These allow a new powerful form of programming where common pieces of code can be easily abstracted. They simply act as types, but they have two different properties from other types.

Firstly, they have no name. This means an absence of Javadoc, including any semantic requirements of the API, such as thread-safety or null/not-null.

Secondly, they only describe the input and output types. This is a higher abstraction than Java has previously used, and will require a mindset shift for those using them.

So, are function types required when implementing FCM? I would say 'no, not required'. It is perfectly possible and reasonable to implement FCM without method types. In fact, that is what the prototype does. In practice, this just means that all conversions from method references and inner methods must be to single method interfaces rather than method types.

Omitting method types greatly simplifies the conceptual weight of the change. The downside is that true higher order functional programming becomes near impossible. That may be no bad thing. Java is not, and never has been, a functional programming language. It seems very odd to try and push it in that direction at this point in its life.

A better alternative would be to pursue supporting primitive types in generics. This would greatly reduce the overhead of single method interfaces required by something like the fork-join framework.

Similarly, making single method interfaces easier to write (lightweight interfaces) would be a direction to take in the absence of method types.

Control invocation

 withLock(lock) {
   ...
 }
 public void withLock(#(void()) block : Lock lock) {
   ...
 }

Control invocation forms are perhaps the only way forward in Java longer term because they allow us to escape from many of these language change debates. They allow anyone to write methods that can be used in the style of control statements. It is vital to remember that they are just methods however.
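
The 'just methods' point can be seen by writing withLock in Java as it stands today, with a Runnable standing in for the block. This is a sketch for comparison, not the JCA design, and the class name is illustrative:

```java
import java.util.concurrent.locks.Lock;

final class WithLockDemo {
    // An ordinary method that behaves a little like a control statement.
    static void withLock(Lock lock, Runnable block) {
        lock.lock();
        try {
            block.run();
        } finally {
            // Always release the lock, even if the block throws.
            lock.unlock();
        }
    }
}
```

The calling code has to wrap its statements in 'new Runnable() { public void run() { ... } }', and a return inside that block only returns from run(), not from the enclosing method. Those are the two rough edges that control invocation syntax sets out to smooth over.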

BGGA appears to build much of its spec around control invocation, and the non-local returns make perfect sense in this area.

The JCA spec defines that the calling code should be identical to BGGA, but the method invoked should be written differently. The aim of JCA is to provide an element of discouragement from using control invocation. This is because of the additional complexities in getting the code right (exception transparency, completion transparency, non-local returns etc). A different, special, syntax encourages this feature to be restricted to senior developers, or heavily code reviewed.

So, is control invocation required when implementing FCM+JCA? I would say 'no'. It is perfectly possible and reasonable to implement FCM+JCA without control invocation (although of course that means it would just be FCM!).

The inclusion or omission of method types is also linked to control invocation, as method types are a pre-requisite for control invocation in the JCA spec.

Summary of possible implementations

Thus, here are the possible FCM+JCA implementation combinations that make sense to me:

Literals and References
Literals, References and Inner methods
Literals, References, Inner methods and Method types
Literals, References, Inner methods, Method types and Control invocation

My preferred options are number 2 and number 4.

Why? Because, I believe inner methods are too useful to omit, and I believe method types are generally too complex unless you really need them. (Also Java isn't a functional programming language.)

The key point of this blog is to emphasise that FCM is not a take it or leave it proposal. There are different options and levels within it that could be adopted.

This extends to versions of Java. For example, it would be feasible to implement option 1, literals and references in Java 7, whilst adding inner methods and maybe more in Java 8.

Summary

I've shown how FCM has parts which can be considered separately to a degree. I've also indicated which combinations make sense to me.

Which combinations make sense to you?

Sunday, 24 February 2008

FCM prototype available

I'm happy to announce the first release of the First Class Methods (FCM) java prototype. This prototype allows anyone who is interested to find out what FCM really feels like.

Standard disclaimer: This software is released on a best-efforts basis. It is not Java. It has not passed the Java Compatibility Testing Kit. Having said that, in theory no existing programs should be broken. Just don't go relying on code compiled with this prototype for anything other than experimentation!

FCM javac implementation

The FCM javac implementation is hosted at Kijaro. The javac version used as a baseline is OpenJDK (the last version before the Mercurial cutover).

The prototype includes the following features:

Method literals
Constructor literals
Field literals
Static method references
Bound method references
Constructor references
Anonymous inner methods

The following are not implemented:

Method types
Instance method references
Named inner methods
Inner method non-final local variable access
Inner method exception inference
Inner method exception/completion transparency
Conversion to single abstract method classes

The download includes a README with many FAQs answered, including more information on how the types work.

The biggest outstanding area is getting generics working properly. This is a complex task however, and I took the view that it was better to release early than spend any more time trying to get generics working properly.

Nevertheless, even without full generics, you can get a really good feel for how Java would look and feel with FCM. In addition, the FCM enabled code just falls off my fingers very nicely. Now if only we could get Eclipse or NetBeans support...

Summary

One of the key requests for considering FCM as a viable proposal has been having a prototype to play with. Now the prototype is out there, it would be really great to hear some feedback, including any bugs. Comments welcome here, at kijaro-dev mailing list or scolebourne-joda-org.

Thursday, 21 February 2008

Closures - Lightweight interfaces instead of Function types

Function types, or method types as FCM refers to them, are one of the most controversial features of closures. Is there an alternative that provides 80% of the power but in the style of Java?

Function/Method types

Function types allow the developer to define a type using just the signature, rather than a name. For example:

 // BGGA
 {String, Date => int}
 
 // FCM v0.5
 #(int(String, Date))

Apart from the different syntax, these are identical concepts in the two proposals as they currently stand. They both mean "a type that takes in two parameters - String and Date - and returns an int". This will be compiled to a dynamically generated single method interface as follows:

 public interface IOO<A, B> {
  int invoke(A a, B b);
 }

When instantiated, the generics A and B will be String and Date.

This is a complicated underlying mechanism. It is also one that can't be completely hidden from the developer, as the exception stack traces will show the auto-generated interface "IOO". This will certainly be a little unexpected at first. Update: Neal points out correctly that an interface name will not appear in the stacktrace!

A second complaint about function types is that there is no home for documentation. One of Java's key strengths is its documentation capabilities in the form of Javadoc. This is perhaps the unsung reason as to why Java became so popular in enterprises as a long-life, bet your company, language. Maintenance coders love that documentation. And everybody loves the ability to link to it within your IDE.

So, why are we even considering function types? Well they allow APIs to be written that can take in any closure, simply defining it in terms of the input and output types. They also allow lightweight definition - there is no need to define the type before using it.

These are highly powerful features, and they lead towards functional programming idioms. But are these idioms completely in the style of Java?

Another option

Let's start from what we would write in Java today.

 public interface Convertor {
  int convert(String str, Date date);
 }

The advantage of this is that everyone knows and understands it. It's part of the lingua franca that is Java. Now let's examine what we could do with this.

Firstly, we need to remember that it is possible to define a class or interface such as this one within a method in Java today. The scope of such a class is the scope of the method. This will come in useful later.
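
A quick sketch of local scoping in current Java follows. I have used a local class implementing the Convertor interface, because as far as javac is concerned local classes have long been accepted, whereas local interfaces are only accepted from Java 16 onwards. The names LocalTypeDemo and LengthConvertor are illustrative:

```java
import java.util.Date;

interface Convertor {
    int convert(String str, Date date);
}

final class LocalTypeDemo {
    static int run(String str, Date date) {
        // A local class - it is visible only inside this method.
        class LengthConvertor implements Convertor {
            public int convert(String s, Date d) {
                // The date is ignored here purely to keep the sketch short.
                return s.length();
            }
        }
        return new LengthConvertor().convert(str, date);
    }
}
```

Code outside run() cannot refer to LengthConvertor at all, which is the scoping rule that a locally defined lightweight interface would rely on.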

So, let's examine what would happen if we start shortening the interface definition. For a function type equivalent, we know that there is only one method. As such, there isn't really any need for the braces:

 public interface Convertor int convert(String str, Date date);

Now, let's consider that for a function type equivalent, the method name is pre-defined as 'invoke'. As such, there is no need to include the method name:

 public interface Convertor int(String str, Date date);

Now, let's consider that for a function type equivalent, the parameter names are unimportant. As such, let's remove them (or maybe make them optional):

 public interface Convertor int(String, Date);

And that's it. I'm calling this a lightweight interface for now.

They represent a reasonable reduction of the code necessary to define a named single method interface. The syntax would be allowed anywhere an existing interface could be defined. This includes its own source file, nested in another class or interface, or locally scoped within a method. This is a longer example:

 // with function types (FCM syntax)
 public int process() {
  #(int(int, int)) add = #(int a, int b) {return a + b;};
  #(int(int, int)) mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }
 
 // with named lightweight interfaces - solution A
 interface Adder int(int,int);
 interface Multiplier int(int,int);
 public int process() {
  Adder add = #(int a, int b) {return a + b;};
  Multiplier mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }
 
 // with named lightweight interfaces - solution B
 public int process() {
  interface MathsCombiner int(int,int);
  MathsCombiner add = #(int a, int b) {return a + b;};
  MathsCombiner mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }

Solution A shows how you might define one lightweight interface for each operation. Solution B shows how you might define just one lightweight interface. It also shows that the lightweight interface could be defined locally within the same method.
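
For anyone who wants to run something today, here is my guess at what solution B would desugar to in plain Java: an ordinary single method interface with an 'invoke' method, plus anonymous inner classes in place of the inner methods. This is a sketch of the idea, not output from any prototype:

```java
// What 'interface MathsCombiner int(int,int);' would presumably expand to.
interface MathsCombiner {
    int invoke(int a, int b);
}

final class CombinerDemo {
    static int process() {
        MathsCombiner add = new MathsCombiner() {
            public int invoke(int a, int b) {
                return a + b;
            }
        };
        MathsCombiner mul = new MathsCombiner() {
            public int invoke(int a, int b) {
                return a * b;
            }
        };
        // (2 + 3) * (3 + 4) = 5 * 7 = 35
        return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
    }
}
```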

What we have gained is a name for the function type. It is now possible to write Javadoc for it and hyperlink to it in your IDE.

And it can be quickly and easily grasped as simply a shorthand way of defining a single method interface. In fact, you would be able to use this anywhere in your code as a normal single method interface, implementing it using a normal class, or extending it as required.

It's also possible to imagine IDE refactorings that would convert a lightweight interface to a full interface if you needed to add additional methods. Or to convert a single method interface to the lightweight definition.

Of course it would be possible to take this further by eliminating the name:

 // example showing what is possible, I'm not advocating this!
 public int process() {
  interface int(int,int) add = #(int a, int b) {return a + b;};
  interface int(int,int) mul = #(int a, int b) {return a * b;};
  return mul.invoke(add.invoke(2, 3), add.invoke(3, 4));
 }

However, the developer must now mentally parse both the lines "interface int(int,int)" to see if they are the same type. Previously, they could just see that they were both "MathsCombiner". As such, I prefer keeping the name, and requiring developers to take the extra step.

I see this as an example of where the style of Java differs from other more dynamic languages. In Java you always define your types up front. In more dynamic languages, you often just code the closure. As this concept requires defining types up front, I might suggest it is more in the Java style.

Final example

One final example is from my last blog post, this time in BGGA syntax:

 // Example functional programming style method using BGGA syntax
 public <T, U> {T => U} converter({=> T} a, {=> U} b, {T => U} c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }
 
 // The same using lightweight interfaces
 interface Factory<C> C();
 interface Transformer<I, O> O(I);
 public <T, U> Transformer<T, U> converter(Factory<T> a, Factory<U> b, Transformer<T, U> c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }

Personally, I find the latter to be much more readable, even though it involves more code. Both Factory and Transformer can be defined once, probably in the JDK or framework, and have associated documentation.
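
Here is a sketch of the same method in Java as it compiles today, with the two lightweight interfaces written out in full and an anonymous inner class in place of the closure. The 'invoke' method names follow the convention described earlier, and the ConverterDemo class name is illustrative:

```java
// Full-length equivalents of the two lightweight interfaces.
interface Factory<C> {
    C invoke();
}

interface Transformer<I, O> {
    O invoke(I input);
}

final class ConverterDemo {
    // If the input equals a's value then return b's value, otherwise delegate to c.
    static <T, U> Transformer<T, U> converter(
            final Factory<T> a, final Factory<U> b, final Transformer<T, U> c) {
        return new Transformer<T, U>() {
            public U invoke(T t) {
                return a.invoke().equals(t) ? b.invoke() : c.invoke(t);
            }
        };
    }
}
```

The method signature reads the same as in the lightweight interface version, which is rather the point: only the declaration of the interfaces gets shorter.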

In addition, if I'd never seen the code before, I'd much prefer to be assigned to maintain the latter code with lightweight interfaces. Perhaps that is the key to Java's success - code that can be maintained. Write once. Read many times.

Thanks

Finally, I should note that some of the inspiration for this idea came from blogs and documents by Remi Forax and Casper Bang.

Summary

I've outlined an alternative to function types that keeps a key Java element - the type and its name. Lightweight interfaces are easy and quick to code if you don't want to document, but have the capacity to grow and be full members of the normal Java world if required.

I'd really love to hear opinions on this. It seems like a great way to balance the competing forces, but what do you think?

PS. Don't forget to vote for FCM at the java.net poll!

Tuesday, 19 February 2008

Evaluating BGGA closures

The current vote forces us all to ask what proposal is best for the future of Java. Personally, I don't find the BGGA proposal persuasive (in its current form). But exactly why is that?

Touchstone debate

The closures debate is really a touchstone for two other, much broader, debates.

The first is "what is the vision for Java". There has been no single guiding vision for Java through its life, unlike other languages:

In the C# space, we have Anders. He clearly "owns" language, and acts as the benevolent dictator. Nothing goes into "his" language without his explicit and expressed OK. Other languages have similar personages in similar roles. Python has Guido. Perl has Larry. C++ has Bjarne. Ruby has Matz.

The impact of this lack of guiding vision is a sense that Java could be changed in any way. We really need some rules and a guiding Java architect.

The second touchstone debate is "can any language changes now be successfully implemented in Java". This is generally a reference to generics, where erasure and wildcards have often produced havoc. In the votes at Javapolis, 'improving generics' got the highest vote by far.

The result of implementing generics with erasure and wildcards has been a loss of confidence in Java language change. Many now oppose any and all change, however simple and isolated. This is unfortunate, and we must ensure that the next batch of language changes works without issues.

Despite these broader debates that surround closures, we must focus on the merits of the individual proposals.

Evaluating BGGA

BGGA would be a very powerful addition to Java. It contains features that I, as a senior developer, could make great use of should I choose to. It is also a well written and thought out proposal, and is the only proposal tackling some key areas, such as exception transparency, in detail.

However, my basic issue with BGGA is that the resulting proposal doesn't feel like Java to me. Unfortunately, 'feel' is highly subjective, so perhaps we can rationalise this a little more.

Evaluating BGGA - Statements and Expressions

Firstly, BGGA introduces a completely new block exit strategy to Java - last-line-no-semicolon expressions. BGGA uses them for good reasons, but I believe that these are very alien to Java. So where do they come from? Well consider this piece of Java code, and its equivalent in Scala:

 // Java
 String str = null;
 if (someBooleanMethod()) {
  str = "TRUE";
 } else {
  str = "FALSE";
 }
 
 // Scala
 val str = if (someBooleanMethod()) {
  "TRUE"
 } else {
  "FALSE"
 }

There are two points to note. The first is that Scala does not use semicolons at the end of line. The second, and more important point is that the last-line-expression from either the if or the else clause is assigned to the variable str.

The fundamental language level difference going on here is that if in Java is a statement with no return value. In Scala, if is an expression, and the result from the blocks can be assigned to a variable if desired. More generally, we can say that Java is formed from statements and expressions, while in Scala everything can be an expression.

The result of this difference is that in Scala it is perfectly natural for a closure to use the concept of last-line-no-semicolon to return from a closure block, because that is the standard, basic language idiom. All blocks in Scala have the concept of last-line-no-semicolon expression return.

This is not the idiom in Java.

Java has statements and expressions as two different program elements. In my opinion, BGGA tries to force an expression-only style into the middle of Java. The result is a horrible mixture of styles.
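To make the distinction concrete in Java terms, here is a small sketch. Java does already have one expression form of branching, the conditional operator, and it needs no last-line-no-semicolon rule. The class and method names below are hypothetical and exist only for illustration.

```java
// Illustration only: the class and method names are hypothetical.
class StatementVersusExpression {

    // 'if' is a statement: it yields no value, so the variable
    // has to be assigned inside each branch.
    static String usingStatement(boolean flag) {
        String str;
        if (flag) {
            str = "TRUE";
        } else {
            str = "FALSE";
        }
        return str;
    }

    // '?:' is an expression: it yields a value that can be assigned
    // directly, much like the Scala 'if' shown above.
    static String usingExpression(boolean flag) {
        String str = flag ? "TRUE" : "FALSE";
        return str;
    }

    public static void main(String[] args) {
        System.out.println(usingStatement(true));
        System.out.println(usingExpression(false));
    }
}
```

Both methods return the same values; the difference is only in whether the branch itself produces the value.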

Evaluating BGGA - Non local returns

A key reason for BGGA using an alternate block exit mechanism is to meet Tennent's Correspondence Principle. The choices made allow BGGA to continue using return to mean 'return from the enclosing method' whilst within a closure.

There is a problem with using return in this way however. Like all the proposals, you can take a BGGA closure, assign it to a variable and store it for later use. But, if that closure contains a return statement, it can only be successfully invoked while the enclosing method still exists on the stack.

 public class Example {
  ActionListener buttonPressed = null;
  public void init() {
   buttonPressed = {ActionEvent ev => 
     callDatabase();
     if (isWeekend()) {
      queueRequest();
      return;
     }
     processRequest();
   };
  }
 }

In this example, the init() method will be called at program startup and create the listener. The listener will then be called later when a button is pressed. The processing will call the database, check whether it is the weekend and, if so, queue the request. It will then try to return from the init() method.

This obviously can't happen, as the init() method is long since completed. The result is a NonLocalTransferException - an unusual exception which tries to indicate to the developer that they made a coding error.

But this is a runtime exception.

It is entirely possible that this code could get into production like this. We really need to ask ourselves if we want to change the Java programming language such that we introduce new runtime exceptions that can be easily coded by accident.

As it happens, the BGGA specification includes a mechanism to avoid this, allowing the problem to be caught at compile time. Their solution is to add a marker interface RestrictedFunction to those interfaces where this might be a problem. ActionListener would be one such interface (hence the example above would not compile - other examples would though).

This is a horrible solution however, and doesn't really work. Firstly, the name RestrictedFunction and it being a marker interface are two design smells. Secondly, this actually prevents the caller from using ActionListener with return if they actually want to do so (within the same enclosing method).

The one final clue about non local returns is the difficulty in implementing them. They have to be implemented using the underlying exception mechanism. While this won't have any performance implications, and won't be visible to developers, it is another measure of the complexity of the problem.
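To see why a stored closure with a non local return can only fail at runtime, here is a rough simulation in today's Java. It is not how BGGA is specified or implemented; the NonLocalReturn exception and all the class and method names are hypothetical stand-ins. The return inside the closure is modelled as a throw that only the enclosing method knows how to catch.

```java
// Rough simulation only; this is not the BGGA implementation.
// All names here are hypothetical.
class NonLocalReturnSketch {

    // Stand-in for the hidden exception that would carry the 'return'.
    static class NonLocalReturn extends RuntimeException {
        NonLocalReturn() {
            super("enclosing method is no longer on the stack");
        }
    }

    // Where the "closure" is kept for later use.
    static Runnable stored;

    // The enclosing method: it creates the closure, stores it,
    // and also invokes it once while it is still on the stack.
    static String enclosing() {
        Runnable closure = new Runnable() {
            public void run() {
                // models 'return' meaning 'return from enclosing()'
                throw new NonLocalReturn();
            }
        };
        stored = closure;
        try {
            closure.run();
            return "reached end of enclosing()";
        } catch (NonLocalReturn ex) {
            // the transfer is matched here, so the early return works
            return "returned early from enclosing()";
        }
    }

    // Invokes the stored closure after enclosing() has completed,
    // as a button press would. Nothing matches the transfer any more,
    // so the caller just sees a runtime exception.
    static String invokeStoredLater() {
        try {
            stored.run();
            return "completed normally";
        } catch (RuntimeException ex) {
            return "runtime failure: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(enclosing());
        System.out.println(invokeStoredLater());
    }
}
```

The same closure behaves correctly in one call and fails in the other, and nothing in the code of the closure itself tells the two cases apart.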

In my opinion, allowing non local returns in this way will cause nothing but trouble in Java. Developers are human, and will easily make the mistake of using return when they shouldn't. The compiler will catch some cases, with RestrictedFunction, but not others, which will act as a further level of confusion.

Evaluating BGGA - Functional programming

Java has always laid claim to be an OO language. In many ways, this has often been a dubious claim, especially with static methods and fields. Java has never been thought of as a functional programming language however.

BGGA introduces function types as a key component. These introduce a considerable extra level of complexity in the type system.

A key point is that at the lowest compiler level, function types are no different to ordinary, one method, interfaces. However, this similarity is misleading, as at the developer level they are completely different. They look nothing like normal interfaces, which have a name and javadoc, and they are dynamically generated.

Moreover, function types can be used to build up complex higher-order function methods. These might take in a function type parameter or two, often using generics, and perhaps returning another function type. This can lead to some very complicated method signatures and code.

 public <T, U> {T => U} converter({=> T} a, {=> U} b, {T => U} c) {
   return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
 }

It should be noted that FCM also includes function types, which are named method types. However, both Stefan and I struggle with the usability of them, and they may be removed from a future version of FCM. It is possible to have FCM inner methods, method references and method literals without the need for method types.

Function types have their uses; however, I have yet to be entirely convinced that the complexity is justified in Java. Certainly, they look very alien to Java today, and they enable code which is decidedly hard to read.
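For comparison, the converter method can be written in today's Java with named single-method interfaces and anonymous inner classes. This is only a sketch: Factory and Transformer echo the names used in the lightweight interfaces post, but their method names (create and transform) and the example() helper are assumptions made for illustration.

```java
// Sketch only: Factory, Transformer and their method names are assumed
// here for illustration, not taken from the JDK or any framework.
class NamedInterfaceSketch {

    interface Factory<T> {
        T create();
    }

    interface Transformer<T, U> {
        U transform(T input);
    }

    // Same logic as the BGGA converter method, with named types
    // in place of function types.
    static <T, U> Transformer<T, U> converter(
            final Factory<T> a, final Factory<U> b, final Transformer<T, U> c) {
        return new Transformer<T, U>() {
            public U transform(T t) {
                return a.create().equals(t) ? b.create() : c.transform(t);
            }
        };
    }

    // Hypothetical usage: map the empty string to -1, anything else to its length.
    static Transformer<String, Integer> example() {
        return converter(
            new Factory<String>() {
                public String create() { return ""; }
            },
            new Factory<Integer>() {
                public Integer create() { return -1; }
            },
            new Transformer<String, Integer>() {
                public Integer transform(String s) { return s.length(); }
            });
    }

    public static void main(String[] args) {
        System.out.println(example().transform(""));      // prints -1
        System.out.println(example().transform("Java"));  // prints 4
    }
}
```

It is far more verbose than the function type version, but every type in the signature has a name that can carry javadoc.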

Evaluating BGGA - Syntax

The syntax is the least important reason for disliking BGGA, as it can be altered. Nevertheless, I should show an example of hard to read syntax.

 int y = 6;
 {int => boolean} a = {int x => x <= y};
 {int => boolean} b = {int x => x >= y};
 boolean c = a.invoke(3) && b.invoke(7);

Following =>, <= and >= can get very confusing. It is also a syntax structure for parameters that is alien to Java.

Summary

As co-author of the FCM proposal it is only natural that I should want to take issue with aspects of competitor proposals. However, I do so for one reason, and that is to ensure that the changes included in Java 7 really are the right ones. Personally, I believe that BGGA attempts something - full closures - that simply isn't valid or appropriate for Java.

It remains my view that the FCM proposal, with the optional JCA extension, provides a similar feature set but in a more natural, safe, Java style. If you've read the proposals and are convinced, then please vote for FCM at java.net.

Opinions welcome as always :-)

References

For reference, here are the links to all the proposals if you want to read more:

BGGA - full closures for Java
CICE - simplified inner classes (with the related ARM proposal)
FCM - first class methods (with the related JCA proposal)

Sunday, 17 February 2008

Vote for FCM!

Java.net is currently running a poll on closures, to get a feel for the strength of support for each proposal. Obviously, this poll has no power, but it is useful to see at a high level what the community's opinion is.

Personally, I'm really pleased with the level of support FCM has had in the vote so far, on this blog and privately. Now I'd like to encourage you, if you so desire, to vote and support FCM. Thanks!

Update: If you want to compare the three proposals, please take a look at my previous comparison articles - one method callbacks, control structures and type inference.

Monday, 28 January 2008

Java 7 - Multi-line String literals

One of the most common features in other programming languages is the multi-line String literal. Would it be possible to add this to Java?

Update, 2011-10-31: Just wanted to note that multi-line strings are not in Java 7, nor are they likely to be in Java 8. This blog post is still useful to understand some of the difficulties that would have to be tackled if they were to be included in future.

Update, 2018-01-28: This is now being considered for addition to Java, read more here.

Multi-line String literals

In Java today there is only one form of string literal, supplied in double quotes. Within those double quotes, certain characters have to be escaped.

 String basic = "Hello";
 String three = "This string\nspans three\nlines";
 String welcome = "Hello, My name is \"Stephen\", Hi!";

The first of these three examples is not complex, and would not make use of a multi-line String literal. The other two might be more readable with such a literal.

The standard for defining a multi-line String literal in both Scala and Groovy is three double quotes. This also seems like a sensible choice for Java:

  String three = """This string
spans three
lines""";

This is potentially much more readable, especially with large blocks of text. This form of literal would also avoid the need for escaping:

 String welcome = """Hello, My name is "Stephen", Hi!""";

Note that we no longer need to escape the double quotes. This would be especially useful for regular expressions.

Bear in mind that multi-line String literals are fundamentally no different to normal String literals on the key point of the object created. Both would create java.lang.String objects.

Issues

The first issue is the multi-line arrangement. Since all text within the multi-line literal is included, all lines except the first must begin from column zero. This will look odd in a piece of well-formatted Java code:

  // what a naive multi-line literal forces us to write
  public class MyClass {
    public void doStuff() {
      String three = """This string
spans three
lines""";
      System.out.println(three);
    }
  }
  
  // what we'd like to write
  public class MyClass {
    public void doStuff() {
      String three = """This string
                        spans three
                        lines""";
      System.out.println(three);
    }
  }

One possible solution is to provide a method on String that strips all whitespace after each newline. This could be called directly after the literal. Unfortunately this approach loses some efficiency as the string must be trimmed each time:

  // option with trimNewLine()
  public class MyClass {
    public void doStuff() {
      String three = """This string
                        spans three
                        lines""".trimNewLine();
      System.out.println(three);
    }
  }
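A minimal sketch of what such a method might do, written as a static helper in today's Java because String itself cannot be extended. The name trimNewLine comes from the example above; the exact behaviour (dropping spaces and tabs that directly follow a newline) is my assumption, as is the enclosing class name.

```java
// Sketch only: a hypothetical static stand-in for the proposed
// String.trimNewLine() method.
class TrimNewLineSketch {

    // Removes any run of spaces or tabs that directly follows a newline.
    // Text before the first newline is left untouched.
    static String trimNewLine(String text) {
        return text.replaceAll("\n[ \t]+", "\n");
    }

    public static void main(String[] args) {
        // Today's literal syntax, written to mimic the indented layout.
        String three = "This string\n"
                + "                        spans three\n"
                + "                        lines";
        System.out.println(trimNewLine(three));
    }
}
```

Holding the result in a static final field would at least limit the trimming to once per class load, although a literal that needs no processing at all would still be cheaper.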

Another, perhaps better, solution might be to have a syntax variation. If the opening triple quote is followed immediately by a newline, then the position of the first non-space character on the next line represents the column to begin the literal at. (The first newline would not be included in this form of the literal.) Only the space character would be permitted in earlier columns until the end of the literal. This would allow for natural formatting of this kind of string:

  // option with columns determined by first line
  public class MyClass {
    public void doStuff() {
      String three = """
              This string
              spans three
              lines""";
      System.out.println(three);
    }
  }
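The column rule is simple enough to sketch in today's Java. The code below is only an illustration of how a compiler might process the raw text between the triple quotes; the class and method names are hypothetical, and rejecting a stray character with an exception is my assumption about what "only the space character would be permitted" would mean in practice.

```java
// Sketch only: hypothetical names, showing one reading of the column rule.
class ColumnLiteralSketch {

    // 'raw' is the text between the triple quotes, assumed to use '\n'
    // line endings.
    static String applyColumnRule(String raw) {
        if (!raw.startsWith("\n")) {
            return raw; // ordinary form: every character is kept as written
        }
        // The first newline is not part of the literal.
        String[] lines = raw.substring(1).split("\n", -1);

        // The first non-space character on the first line sets the column.
        int column = 0;
        while (column < lines[0].length() && lines[0].charAt(column) == ' ') {
            column++;
        }

        StringBuilder result = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int margin = Math.min(column, line.length());
            // Only spaces are permitted in the columns before the start column.
            for (int j = 0; j < margin; j++) {
                if (line.charAt(j) != ' ') {
                    throw new IllegalArgumentException(
                        "non-space character before column " + column
                        + " on line " + (i + 1));
                }
            }
            if (i > 0) {
                result.append('\n');
            }
            result.append(line.substring(margin));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String raw = "\n        This string\n        spans three\n        lines";
        System.out.println(applyColumnRule(raw));
    }
}
```

Because the stripping depends only on the source text, a real implementation could do this work once at compile time, which avoids the efficiency concern of the trimNewLine() option.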

One final tricky issue is handling a string containing the triple double quote. The answer is probably to ignore this situation (Scala does this). It is going to be very rare, and it can be worked around using string concatenation.

Summary

Multi-line String literals should be a relatively easy addition to Java (anyone fancy adding it to Kijaro?). The main benefits would be avoiding escaping in regular expressions, and pasting in large blocks of text from other sources.

Overall, I think they would be a valuable addition to Java. But have I missed any obvious issues? Are there any other syntax options that should be considered? Opinions welcome as always :-)