Sunday 14 March 2010

Java language design by use case

In a blog in 2006 Neal Gafter wrote about how language design was fundamentally different to API design and how use cases were a bad approach to language design. This blog questions some of those conclusions in the context of the Java language.

Firstly, Neal doesn't say that use cases should be avoided in language design:

In a programming language, on the other hand, the elements that are used to assemble programs are ideally orthogonal and independent. ...
To be sure, use cases also play a very important role in language design, but that role is a completely different one than the kind of role that they play in API design. In API design, satisfying the requirements of the use cases is a sufficient condition for completeness. In language design, it is a necessary condition.

So, Neal's position seems very sound. Language features should be orthogonal, and designed to interact in new ways that the language designer hadn't thought of. This is one element of why language design is a different skill to API design - and why armchair language designers should be careful.

The problem, and the point of this blog, is that it would appear that the development of the Java language has never been overly concerned with following this approach. (I'm not trying to cast aspersions here on those involved - just trying to provide some background on the language).

Consider inner classes - added in v1.1. These target a specific need - the requirements of the Swing API. While they have been used for other things (poor man's closures), they weren't really designed as such.
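As a rough illustration of the "poor man's closure" idea, here is a minimal sketch of an anonymous inner class standing in for a callback. The names `InnerClassDemo`, `Callback` and `run` are made up for this example, and it assumes the pre-closure rule that a captured local variable must be declared final.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; an illustration, not code from any real API.
class InnerClassDemo {

    // A hand-rolled single-method interface, in the style of an event listener.
    interface Callback {
        void call(String event);
    }

    static List<String> run() {
        final List<String> log = new ArrayList<String>();
        final String prefix = "clicked: "; // locals must be final to be captured

        // The anonymous inner class captures 'log' and 'prefix' from the enclosing method.
        Callback listener = new Callback() {
            public void call(String event) {
                log.add(prefix + event);
            }
        };

        listener.call("OK");
        listener.call("Cancel");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

It works, but the ceremony around a single line of behaviour shows that the feature was shaped by the listener use case rather than by a general notion of a function value.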

Consider enums - added in v1.5. These target a single specific use case, that of a typesafe set of values. They don't extend to cover additional edge cases (shared code in an abstract superclass or extensibility for example) because these weren't part of the key use case. JSR-310 has been significantly compromised by the lack of shared code.
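To make the shared code limitation concrete, here is a small sketch. The unit names and values are invented for illustration and are not taken from JSR-310. Because every enum implicitly extends java.lang.Enum, two related enums can share an interface but not an abstract superclass, so common logic ends up duplicated.

```java
// Hypothetical example: names and values are illustrative only.
class EnumSharingDemo {

    interface Unit {
        long seconds();
    }

    // Not allowed: "enum ClockUnit extends AbstractUnit { ... }" does not compile,
    // because every enum already implicitly extends java.lang.Enum.

    enum ClockUnit implements Unit {
        MINUTE(60L), HOUR(3600L);

        private final long seconds;

        ClockUnit(long seconds) { this.seconds = seconds; }

        public long seconds() { return seconds; }

        // Logic that would naturally live in a shared superclass...
        public long toSeconds(long amount) { return amount * seconds; }
    }

    enum DateUnit implements Unit {
        DAY(86400L), WEEK(604800L);

        private final long seconds;

        DateUnit(long seconds) { this.seconds = seconds; }

        public long seconds() { return seconds; }

        // ...has to be repeated here instead.
        public long toSeconds(long amount) { return amount * seconds; }
    }

    public static void main(String[] args) {
        System.out.println(ClockUnit.HOUR.toSeconds(2));
        System.out.println(DateUnit.WEEK.toSeconds(1));
    }
}
```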

Consider the foreach loop - added in v1.5. This meets a single basic use case - looping over an array or iterable. The use case didn't allow for indexed looping, finding out if it's the first or last time around the loop, looping around two lists pairwise, and so on. The feature is driven by a specific use case.
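A short sketch of where the foreach loop stops helping. The class and method names (`ForEachDemo`, `join`, `zip`) are hypothetical. Detecting the first iteration needs a hand-managed flag, and pairwise looping means falling back to explicit iterators.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class, for illustration only.
class ForEachDemo {

    // foreach gives no index and no "is first/last" signal, so a flag is tracked by hand.
    static String join(List<String> items) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String item : items) {
            if (!first) {
                sb.append(", ");
            }
            sb.append(item);
            first = false;
        }
        return sb.toString();
    }

    // Looping over two lists pairwise is outside the use case: back to explicit iterators.
    static String zip(List<String> names, List<Integer> ages) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> n = names.iterator();
        Iterator<Integer> a = ages.iterator();
        while (n.hasNext() && a.hasNext()) {
            sb.append(n.next()).append('=').append(a.next()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("a", "b", "c")));
        System.out.println(zip(Arrays.asList("x", "y"), Arrays.asList(1, 2)));
    }
}
```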

And the var-args added in v1.5? I have a memory that suggests the use case for its addition was to enable printf-style String formatting.
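For reference, a minimal sketch of var-args in use. `String.format` is declared with an `Object...` parameter, which is what lets a printf-style call take any number of arguments. The `VarArgsDemo` class and its methods are invented for this example.

```java
// Hypothetical example class, for illustration only.
class VarArgsDemo {

    // 'int... values' arrives inside the method as a plain int[].
    static int sum(int... values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // String.format itself takes (String, Object...), the printf-style use case.
    static String describe(String name, int... values) {
        return String.format("%s has %d values summing to %d",
                name, values.length, sum(values));
    }

    public static void main(String[] args) {
        System.out.println(describe("xs", 1, 2));
    }
}
```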

Finally, by accounts I've heard, even James Gosling tended to add items to the original Java builds on the basis of what he needed at that moment (a specific use case) rather than to a great overarching plan for a great language.

To be fair, some features are definitely more orthogonal and open - annotations for example.

Looking forward, Project Lambda repeats this approach. It has a clear focus on the Fork-Join/ParallelArray use case - other use cases like filtering/sorting/manipulating collections are considered a second class use case (apparently - it's a bit hard to pin down the requirements). Thus, once again the Java language will add a use case driven feature rather than a language designer's orthogonal feature.

But is that necessarily a Bad Thing?

Well, firstly we have to consider that the Java language has 9 million developers and is probably still the world's most widely used language. So, being use case driven in the past hasn't overly hurt adoption.

Now, most in the community and blogosphere would accept that in many ways Java is actually not a very good programming language. And somewhere deep down, some of that is due to the use case/feature driven approach to change. Yet, down in the trenches most Java developers don't seem especially fussed about the quality of the language. Understanding that should be key for the leaders of the Java community.

I see two ways to view this dichotomy. One is to say that it is simply because people haven't been exposed to better languages with a more thought through and unified language design. In other words - once they do see a "better designed language" they'll laugh at Java. While I think that is true of the blogosphere, I'm rather unconvinced as to how true that is of the mainstream.

The alternative is to say that actually most developers can handle discrete use-case focussed language features more easily than abstracted, independent, orthogonal features. In other words - "use feature X to achieve goal Y". I have a suspicion that is how many developers actually like to think.

Looked at in this way, the design of the Java language suddenly seems a lot more clever. The use case driven features map more closely onto the discrete mental models of working developers than the abstract super-powerful ones of more advanced languages. Thus this is another key difference that marks out a blue collar language from an academic experiment.

Project Lambda

I'm writing this blog because of Project Lambda, which is adding closures to the Java language. Various options have been suggested to solve the problem of referring to local variables and whether those references should be safe across multiple threads or not. The trouble is that there are two use cases - immediate invocation, where local variables can be used immediately and safely, and deferred asynchronous invocation where local variables would be published to another thread and be subject to data races.
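The two use cases can be sketched with today's anonymous inner classes. This is only an approximation under stated assumptions: it does not use any proposed Project Lambda syntax, the names `CaptureDemo`, `immediate` and `deferred` are invented, and a one-element array stands in for a mutable captured local.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not Project Lambda syntax or semantics.
class CaptureDemo {

    // Immediate invocation: the callback runs on the calling thread before the
    // method returns, so unsynchronised mutable state is safe.
    static int immediate() {
        final int[] counter = {0}; // mutable box, since captured locals must be final
        Runnable r = new Runnable() {
            public void run() {
                counter[0]++;
            }
        };
        for (int i = 0; i < 1000; i++) {
            r.run();
        }
        return counter[0];
    }

    // Deferred invocation: the callback is published to other threads. A plain
    // int[] here would be a data race, so the shared state is an AtomicInteger.
    static int deferred() {
        final AtomicInteger counter = new AtomicInteger();
        Runnable r = new Runnable() {
            public void run() {
                for (int i = 0; i < 1000; i++) {
                    counter.incrementAndGet();
                }
            }
        };
        Thread t1 = new Thread(r);
        Thread t2 = new Thread(r);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(immediate());
        System.out.println(deferred());
    }
}
```

The same capture mechanism serves both methods, yet only the second needs thread-safe state, which is the tension between the two use cases.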

What this blog suggests is that maybe these two use cases need to be representable as two language features or two clear variations of the same feature (as in C++11).

Summary

Many of the changes to the Java language, and some of the original features, owe as much to a use case driven approach as to an overarching language design with orthogonal features. Yet despite this supposed "flaw" developers still use the Java language in droves.

Maybe it's time to ask whether a use case focus without orthogonality in language features is really such a Bad Thing after all?

Feedback welcome!