Monday 27 September 2010

Oracle and the Business of Java

Oracle now owns Java. But how do they view it as a business? As a Java Champion, I had some additional meetings at JavaOne which provided a slightly different take on this important topic. This also affects #bijava to some degree.

The Business of Java

JavaOne 2010 saw a new start for Java with Oracle in charge. Many of the same big names, now employed by Oracle instead of Sun, were there providing deep technical talks on the major Java topics. What was interesting is who else was there and what their roles were.

For most of us, being a senior technical lead or architect is not sufficient to enable us to implement our plans. Instead, we need a stakeholder from the business with a budget - typically a product manager.

Under Sun, I am informed that the product management side was minimal (please note that I don't have any links here, just hearsay). Contrast this with Oracle, where there is a clear, separate product manager in control of each key technology area in Java.

One public example of this was the JavaFX talk, where Richard Bair was speaking on the technology changes. In the Q&A section, whenever an awkward or forward-looking question was asked that implied commitment or spending money, he was able to bring his product manager up to answer that question. This isn't something I remember seeing with Sun.

At this point, your reaction might be "but I don't want business people in charge".

In my view, having a product manager is generally a Good Thing, and looks like it will work well for Java. The key aspect is the ability to treat Java as a product (both gratis and with paid support/extensions). The paid elements provide some of the funding going forward to invest in the features that technologists want from the platform.

Moreover, the product manager provides an independent way to balance competing priorities. Clearly, there are many, many things that might be good enhancements to Java. Which ones should go ahead is not always obvious, and an external perspective is useful. It's also one that most of us outside Sun/Oracle deal with every day.

One way that the analysis and investment decisions can be made is by Cost-Benefit Analysis (CBA). Oracle has about 100,000 employees. Of these, let's say 10% (10,000) are Java developers (there are many, many developers working on Oracle's Java-based products). It should be clear that any change that makes a developer even 1% more productive would produce massive savings just within Oracle itself (thanks to those 10,000 in-house developers). As such, many changes or enhancements can probably be justified simply on an internal CBA.

To emphasise the point, there is no need to say "change X will save the industry Y million dollars per year". Instead, there can be a calculation that says that the change will save Oracle Z million dollars per year. Clearly, an analysis of that form is far more useful to the people making the decision.
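
To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch. The figures (10,000 developers, an annual cost of $100,000 per developer) are purely hypothetical assumptions for illustration, not Oracle data.

 // Java code - a hypothetical internal cost-benefit calculation
 public class ProductivityCba {
   public static void main(String[] args) {
     int developers = 10000;            // assumed number of in-house Java developers
     double costPerDeveloper = 100000;  // assumed annual cost per developer (hypothetical)
     double productivityGain = 0.01;    // a 1% productivity improvement

     double annualSaving = developers * costPerDeveloper * productivityGain;
     System.out.printf("Estimated internal saving: $%,.0f per year%n", annualSaving);
     // prints: Estimated internal saving: $10,000,000 per year
   }
 }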

I have been reassured in conversations that this does not place product managers in absolute power. More specifically, the technical leaders (whose names we all know), will also have a say in what goes ahead and what doesn't. I saw no signs that the product manager will be deciding the syntax of Project Lambda for example!

In terms of #bijava, the point here is simply that to move forwards with a break in compatibility requires both technical and business buy-in. But there are good business reasons both for that break (competition, market share, the productivity of those 10,000 developers) and against it (risk, cost). Balancing those competing elements is probably more a product decision than a straight technical one, so it is actually useful to have these new product leaders in place.

Summary

Sometimes as developers we like to think that we could do without the business sponsor or product manager. In truth their role is vital.

Oracle now has clear product managers for all the Java technologies, something I am given to understand was lacking prior to the takeover. I see this as an unequivocal Good Thing. However, we in the community now need to respond by making the business as well as the technical case when making proposals.

Feedback welcome as always!

Thursday 23 September 2010

Checked Exceptions (#bijava)

In my recent post I brought up the idea of a backwards incompatible version of Java. I indicated that checked exceptions would be an obvious candidate for removal.

Checked exceptions

Java exceptions are divided into checked and unchecked. This distinction is not terribly neat in the API, because RuntimeException extends Exception. The purpose behind the distinction between the two isn't overly clear from the Javadoc:

 RuntimeException is the superclass of those exceptions that can be thrown during the normal operation of the Java Virtual Machine. A method is not required to declare in its throws clause any subclasses of RuntimeException that might be thrown during the execution of the method but not caught.

 The class Exception and its subclasses are a form of Throwable that indicates conditions that a reasonable application might want to catch.

Maybe this lack of clear purpose didn't help developers use exceptions in the way intended (because the intended way wasn't specified).
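
As a reminder of what the distinction means in practice, here is a minimal sketch in plain Java: the checked IOException must be caught or re-declared by every caller, while the unchecked IllegalArgumentException requires no declaration at all.

 // Java code - checked versus unchecked in practice
 import java.io.IOException;

 public class ExceptionKinds {

   // checked: every caller must catch this or re-declare it
   public void readConfig() throws IOException {
     throw new IOException("must appear in the throws clause");
   }

   // unchecked: a RuntimeException subclass needs no throws clause
   public void validate(String input) {
     if (input == null) {
       throw new IllegalArgumentException("no declaration required");
     }
   }
 }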

In my last post, I summarised my views as follows:

If you are still using or advocating checked exceptions then I'm afraid your skill set is 5 to 10 years out of date.

I base this opinion on observation of the industry, and the behaviour of its most active participants.

Key industry projects do not use checked exceptions. Spring has a specific policy of wrapping any checked exception to make it unchecked. Hibernate changed from checked to unchecked - in the words of a key committer I spoke to at JavaOne "we threw our hands up and said sorry" about using checked exceptions. Another key player in Java EE confirmed that that specification is also moving away from checked exceptions.
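
For anyone who hasn't seen the wrapping policy in action, here is a minimal sketch of the general pattern (a generic illustration, not Spring's actual exception hierarchy): the checked exception is caught at the boundary and rethrown as an unchecked one, with the original kept as the cause.

 // Java code - a generic sketch of wrapping a checked exception as unchecked
 // (illustrative only; Spring rethrows using its own unchecked hierarchy)
 import java.sql.Connection;
 import java.sql.DriverManager;
 import java.sql.SQLException;

 public class ConnectionFactory {

   public Connection openConnection(String url) {
     try {
       return DriverManager.getConnection(url);
     } catch (SQLException ex) {
       // callers are no longer forced to handle SQLException themselves
       throw new RuntimeException("Unable to open connection to " + url, ex);
     }
   }
 }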

But since there are some who have commented that they thought these weren't sufficient data points, I've gathered more information. I walked around the entire exhibition floor at JavaOne and asked each vendor what their Java API used. (Apologies in advance to any vendor where the information is inaccurate!)

These 11 vendors indicated that they expose only runtime exceptions:
Spring Source, Terracotta, JReport, Neo4J, Electric Cloud, Serva, Exo, n-software, Ricoh, Sahi, Splunk.

These 7 vendors indicated that they expose some checked exceptions, typically including the JDK exceptions like IOException or MalformedURLException:
JRebel, Artifactory, Caucho (limited by J2EE spec), JetBrains, Parasoft, JBoss (partially limited by J2EE spec), JRapid

These 5 vendors indicated that they expose a number of their checked exceptions, including ones they have written, like FooCompanyException:
Windward reports, Sonatype, Perforce (have own NPE for example), Blackberry, Intersystems Cache (all exceptions extend Throwable directly)

A number of other companies weren't able to provide the info, or didn't have an exposed public API.

So, there is no single viewpoint among the vendors. Many now only use runtime exceptions, with others only really exposing JDK checked exceptions. However, some have a wide selection of checked exceptions.

Do these data points change my analysis?
No.

In addition to finding out what a company did, I also asked a number of people the "what if" question of removing them from JDK 9 (#bijava). Most were in favour of removing them, with those having C# experience distinctly in favour. As always, there were some who still favoured checked.

The checked exception issue will continue to be emotive. For me, it's a classic case of not considering the community effect - how a feature actually gets used in practice - as being important enough.

My take is that far too often you will see a checked exception caught and ignored, or a stack trace simply printed before the code carries on regardless. Developers are human, and they are lazy. No amount of telling developers the right way to do something is ever going to change that. Developers will always use them inappropriately.
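
Here is a minimal sketch of that anti-pattern, as seen in countless codebases - the compiler is appeased, and the failure disappears:

 // Java code - the classic swallowed checked exception
 import java.io.FileInputStream;
 import java.io.IOException;
 import java.io.InputStream;
 import java.util.Properties;

 public class ConfigLoader {

   public Properties load(String fileName) {
     Properties props = new Properties();
     try {
       InputStream in = new FileInputStream(fileName);
       props.load(in);
       in.close();
     } catch (IOException ex) {
       // the compiler is satisfied, but the failure is silently discarded
       ex.printStackTrace();
     }
     return props;
   }
 }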

For me, a language feature that is proven to be used incorrectly most of the time is a language feature not worth having, however logical the basic idea may seem.

Summary

My view is that broken features in the Java language should be removed. And checked exceptions are broken because in 15 years we've found it impossible to get developers to use them properly (even the JDK library has many poor examples).

Like any religious debate, I don't expect to change minds overnight. But I ask those justifying them to look a little closer at how in 15 years, we're still using them poorly. What does that say?

Feedback welcome as always! Please use #bijava to talk about backwards incompatible Java on twitter.

Tuesday 21 September 2010

Two features for #bijava

In my last post I brought up the idea of a backwards incompatible version of Java. What might just two of the incompatible changes be?

Backwards incompatible Java - #bijava

Firstly, a reminder. The key to this proposal is that the changes still result in a language that is recognisably Java. The goal is to enhance the working lives of 10 million developers, most of whom do not have the luxury of just switching to another language. Improving their productivity by just 5% would be a huge benefit globally.

Secondly, it's obviously the case that these ideas are not new. Scala, Groovy and Fantom are all trialling different approaches to a better Java right now. What hasn't been talked about is taking some of these new approaches directly into Java once you remove the absolute hurdle of backwards compatibility.

Remove primitives, Add nullable types

The first example is primitives and nullable types. The approach here is interesting, in that we remove a language feature in order to be able to add a better one.

Exposed primitives are a pain in the Java language. They were added in version 1.0 for performance and to attract developers from C/C++. Version 1.0 of Java was slow enough as it was, so performance mattered. And attracting developers is a key reason to add any new feature.

However, primitives have caused major problems to the language as it has evolved. Today, you cannot have a generified list of primitive types for example. The split between primitive and object types causes this.

The more appropriate solution is to have every type in the source code be an object. It is then the compiler's job to optimise to primitives where it makes sense. Doing this effectively is where nullable types come in.

Nullable types allow you as a developer to specify whether any given variable can or cannot contain null. Clearly, an int is at a high level equivalent to an Integer that cannot contain null. Equivalent enough that a smart compiler could optimise the difference and use the primitive under the covers when the variable is declared as non-null.

But it turns out that academic research has shown that programmers actually intend the default of most variables to be not-null. Thus, we need to change the meaning of variable declarations.

// Java code - (ignoring closures for this example)
 public Person findPerson(List<Person> people, String surname) {
   if (surname != null) {
     for (Person person : people) {
       if (surname.equals(person.getSurname())) {
         return person;
       }
     }
   }
   return null;
 }
 // #bijava code
 public Person? findPerson(List<Person> people, String? surname) {
   if (surname != null) {
     for (Person person : people) {
       if (surname.equals(person.getSurname())) {
         return person;
       }
     }
   }
   return null;
 }

Here we have added the ? symbol to any variable that can hold null. If a variable cannot hold null, then it does not have the ? on the type. By doing this, the compiler can check the null-handling behaviour of the code. For example, the line "surname.equals(...)" would not compile without the previous check to ensure that surname was non-null.

In summary, this is a classic change which cannot be made today. Removing primitives would break code, as would changing the meaning of a variable declaration so that variables are non-null by default. Yet both are good changes.

The point here is that the resulting language is still effectively Java. We haven't scared off lots of developers. It's a challenging change for the language designer and compiler writer, but it results in much better code for 10 million developers.

Equals

The second example of an incompatible change is the equals method.

In Java today, we use the .equals() method all the time for comparing two objects. Yet for primitives we have to use ==. The reasons are ones we rarely think about, yet if we take a step back it's clearly odd.

Given how frequently the .equals() method is used, it makes perfect sense to have an operator for it. Clearly, the right choice for the operator is ==. But we can't make this change to Java as it is backwards incompatible.

But, with #bijava, this change can be made. The existing == operator is renamed to ===, and a new operator == is added that simply compiles to .equals(). (Technically, it has to handle null, which is another reason why nullable types help.)

// #bijava code
 public Person? findPerson(List<Person> people, String? surname) {
   if (surname != null) {
     for (Person person : people) {
       if (surname == person.getSurname()) {
         return person;
       }
     }
   }
   return null;
 }

As shown above, this change, seen in many other languages, has a huge impact on the readability of code. If you are working today, try spending 5 minutes replacing .equals() by == in some of your code, and see the readability benefits.

Of course this is another example of a change where we need a backwards incompatible version to gain the benefit.

Summary

Neither of these changes is really that radical. It is entirely possible to write a tool that will convert source code from the old form to the new and back again. The tool, plus the JVM bytecode, provides the backwards compatibility story necessary to reassure managers.

Some will say that these examples aren't radical enough. And they're right. (They are just two proposals of many for what would be included in #bijava.) But the key point is that #bijava must be close enough to Java (post JDK 7/8) such that that huge body of 10 million developers can be brought along without excessive training or learning needs. Specifically, each change above can be taught to a developer in less than five minutes just standing at their desk.

It also means that #bijava is not a threat to Scala, Groovy, Clojure, Fantom or whatever your favourite language is. Each of these has its own target area, and Java cannot and will not morph into them. Thus, they are free to continue to argue their own case as to why developers should just move away from Java completely.

#bijava isn't a panacea. It will not result in all the changes we might want. But it changes the mindset to allow there to be some incompatibilities between versions of Java, provided the right supporting infrastructure (modules, conversion tools) is present.

Feedback welcome as always! Please use #bijava to talk about backwards incompatible Java on twitter.
PS. I'll get comments invalidly marked as spam out as soon as I can!

The Next Big JVM Language

At JavaOne on Monday I spoke on the topic of "The Next Big JVM Language". My conclusion wasn't what I expected it to be when I started researching the topic. Read on to find my conclusion...

The Next Big JVM Language

To discuss this topic, we first have to define what a "Big Language" is. Specifically, I defined it as being a key or dominant language, with wide usage and a job market, established and with a supporting ecosystem and community. Another possible definition is that 15% of the developers in the world are using that language at a point in time. Based on these, I listed C, C++, Java, C#, COBOL, VB, Perl, PHP and JavaScript as Big Languages. Depending on your cutoff point, you could argue for others, but that seems like a good starting set.

I then looked at what Java got right and wrong. Certainly Java got a lot right - it's used by 10 million developers. But it got a lot wrong too...

Well, that's unfair! Java is 15 years old. We are applying a historical judgement to Java, and many of the choices made in Java 1.0 were appropriate then, but not now. It's much better to ask "what we have learnt from Java".

Learning from Java

I looked at some key points in the Java language that we have learnt over 15 years.

1) Checked exceptions. Spring rejects them. Hibernate rejects them. Java EE rejects them. They are a failed experiment (good in theory, bad in practice). Now, many reading this blog still hold checked exceptions dear to their hearts. But I'm afraid it's finally time to say "wake up and smell the coffee". Checked exceptions have been rejected by all the key industry API writers and leaders at this point. If you're still using or advocating checked exceptions, then I'm afraid your skill set is 5 to 10 years out of date. Period. (I know that may sound harsh, but to those in that camp: you seriously need to open your mind to what modern API design is about.)

2) Primitives and Arrays. Both these features expose low level details from bytecode. They break the "everything is an object" model. The lack of generics over primitives is a classic example of this. The correct solution is a language where the source code does not have exposed primitives or arrays, and the compiler (or perhaps the JVM) works out if it can optimise it for you.
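
A minimal sketch of the problem as it stands today - the generified list cannot hold the primitive directly, so every element is boxed on the way in and unboxed on the way out:

 // Java code - generics cannot range over primitives
 import java.util.ArrayList;
 import java.util.List;

 public class BoxingExample {
   public static void main(String[] args) {
     // List<int> values = new ArrayList<int>();   // does not compile
     List<Integer> values = new ArrayList<Integer>();
     for (int i = 0; i < 5; i++) {
       values.add(i);       // each int is boxed to an Integer object
     }
     int total = 0;
     for (Integer value : values) {
       total += value;      // and unboxed again when read back
     }
     System.out.println(total);   // 10
   }
 }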

3) Everything is a monitor. In Java and the JVM, every object is a monitor, meaning that you can synchronize on any object. This is incredibly wasteful at the JVM level. Senior JVM guys have indicated large percentage improvements in JVM space and performance if we removed the requirement that every object can be synchronized on. (Instead, you would have specific classes like Java 5 Lock)
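
To illustrate the alternative, here is a minimal sketch using the explicit lock classes from java.util.concurrent.locks rather than synchronizing on an arbitrary object:

 // Java code - an explicit Lock object instead of a built-in monitor
 import java.util.concurrent.locks.Lock;
 import java.util.concurrent.locks.ReentrantLock;

 public class Counter {
   private final Lock lock = new ReentrantLock();
   private int count;

   public void increment() {
     lock.lock();
     try {
       count++;
     } finally {
       lock.unlock();
     }
   }

   public int get() {
     lock.lock();
     try {
       return count;
     } finally {
       lock.unlock();
     }
   }
 }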

4) Static. Code in static methods is inherently less reusable and accessible than code in objects. This, together with constructors, often results in a need to have explicit factory interfaces, making APIs more complex. A better solution is singleton objects, which can be used most of the time just like a static, but can be passed as an object when required.
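
As a rough sketch of the idea in today's Java (SystemClock is a hypothetical example class), a hand-rolled singleton reads much like a static call at the use site, yet the instance can still be passed around or substituted:

 // Java code - a singleton object usable like a static, but passable as an object
 // (SystemClock is a hypothetical example class)
 public class SystemClock {

   public static final SystemClock INSTANCE = new SystemClock();

   private SystemClock() {
   }

   public long now() {
     return System.currentTimeMillis();
   }
 }

 // usage reads much like a static call:
 //   long time = SystemClock.INSTANCE.now();
 // but the instance can also be handed to other code as an ordinary object:
 //   void audit(SystemClock clock) { record(clock.now()); }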

5) Method overloading. One of the most constraining parts of the Java language specification is method resolution, where the compiler has to work out which method you intended to call. Resolution touches superclasses, interfaces, varargs, generics, boxing and primitives. It is a very complex algorithm that is difficult to extend. Having no method overloading, except via default parameters, would be a huge win.
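
As a small taste of how subtle resolution already is, in the sketch below the int argument matches the long overload by widening before boxing to Integer is even considered:

 // Java code - overload resolution prefers widening over boxing
 public class OverloadExample {

   static void print(long value) {
     System.out.println("long: " + value);
   }

   static void print(Integer value) {
     System.out.println("Integer: " + value);
   }

   public static void main(String[] args) {
     print(1);   // prints "long: 1", not "Integer: 1"
   }
 }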

6) Generics. Java generics is complex, especially around the ? extends and ? super variance clauses. It's too easy to get these wrong. The lesson from Java should be that use-site variance hasn't worked out well.
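
A minimal sketch of use-site variance as it works today - the wildcard appears at every use site, and it is easy to get the direction wrong:

 // Java code - use-site variance with ? extends and ? super
 import java.util.ArrayList;
 import java.util.List;

 public class VarianceExample {

   // ? extends: Numbers can be read out, but nothing (except null) can be added
   static double sum(List<? extends Number> values) {
     double total = 0;
     for (Number value : values) {
       total += value.doubleValue();
     }
     return total;
   }

   // ? super: Integers can be added, but elements read back are only Objects
   static void fill(List<? super Integer> target) {
     for (int i = 0; i < 3; i++) {
       target.add(i);
     }
   }

   public static void main(String[] args) {
     List<Integer> ints = new ArrayList<Integer>();
     fill(ints);
     System.out.println(sum(ints));   // 3.0
   }
 }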

I could have chosen others, but these are a selection of items where we have learnt from Java.

What might NBJL contain?

Looking forward, I argued that human factors are key in the adoption of a next mass-appeal language. I put forward a simple test for the ability of any new language to be adopted:

1) Take a piece of code in the new language. A piece of reasonable complexity that a typical developer would be expected to deal with day-to-day.

2) Give the code to a mid-level developer. Someone who is not interested in blogging, tweeting or new languages in general.

3) Can they make a reasonable guess as to what the code in the new language does? Without any help or training.

Now this is a fairly harsh definition of how far NBJL can evolve, but it is I believe quite a practical one. The truth is that we need to be able to transition to the new language without massive training programmes.

In terms of features, I covered a long list of features and issues that a new language should address.

  • C-like syntax (familiar and good-enough)
  • Static typing (dynamic is too loose and ineffective for this type of language)
  • OOP with functional elements (pure functional too hard for mainstream)
  • Easy access reflection (to escape the static typing restrictions)
  • Properties (because getters and setters are crazy)
  • Closures (capturing looping design patterns)
  • Null-handling (preferably a means to declare whether each variable can or cannot hold null)
  • Concurrency story (something better than raw threads and shared mutable state)
  • Modules (need to be thinking in terms of larger units)
  • Tools (need a language to be designed to help tool writers)
  • Extensibility (allowing some additions without going back to the language designer)

There are many other concepts that could be discussed - language design is fairly obviously a design artifact, and so different views and opinions are likely.

So, what language?

1) Clojure is a functional programming language, using syntax from the LISP family. It has some great ideas on handling state, which completely change the approach that Java developers are used to. However, it is a million miles away from Java in syntax and functional approach.
NBJL isn't Clojure.

2) Groovy is a dynamic language with optional typing. It is heavily based on Java, using the syntax and structures directly in many cases. This makes it very quick and easy to pick up and use. Its strengths are in scripting and web pages, where the dynamic and meta-programming elements shine. The dynamic nature makes it a poor choice for large bodies of core enterprise server logic.
NBJL isn't Groovy.

3) Scala is an OO/functional language, using C-like syntax. On deeper examination, it can be seen that the functional side is more significant. In particular, the culture of the language pushes users towards the more purely functional solutions. Scala is statically typed, to the extent that the generics of Scala are apparently Turing-complete. But does writing a programming language in the generics of another programming language really make sense?!! Covering all the elements of Scala's complexity really needs a separate blog post. Suffice to say that Scala simply gives developers way too much rope to hang themselves with. While it may at first glance appear to offer the better-than-Java features that are being searched for, it quickly bites your head off once you go beyond the basics - it's simply a language too complex for the mainstream.
NBJL isn't Scala.

4) Fantom is an OO language with functional elements, using C-like syntax. It has very simple and neat solutions to topics as diverse as nullable types, immutability and modules. It has a static type system with a relaxed approach. Generics are only supported on Lists, Maps and Functions, and developers cannot add their own. To compensate, the language automatically adds a cast wherever a developer would normally have needed to add one. While Fantom contains almost a complete sweep of what a sensible mainstream language should contain, it hasn't received that much attention. One point of concern is whether the type system is strong enough to attract lots of developers.
Fantom is closest to NBJL of these languages, but seems unlikely to succeed as the more relaxed type system seems to scare people off.

At a personal level, each of these four languages will teach you something new if you learn it. Clojure and Scala in particular will teach you about functional programming if you've never been exposed to it. However, NBJL is about picking a language suitable for use by everyone for all tasks, in a team and enterprise environment. While each of these four has a niche, none of them are currently placed to jump up and replace Java.

An alternate approach

There are 10 million Java developers. Any improvement that affects all 10 million has a big benefit for the cost. But none of the current languages is capable of being that next language.

Maybe we should reconsider Java?

What if we created a backwards incompatible version of Java (the language)?
Where we fixed the things we know are broken? (exposed primitives, arrays, checked exceptions, ...)
Where the changes were not too massive to create the need for formal training courses?
Where a tool converts old code to new code (and vice versa) with 100% accuracy? (providing the essential backwards compatibility)
Where features like closures and properties could be added without the current compromises?
Where you only compile modules, and never single class files?
Where the compiler/module-linker can determine that two versions of a module use the same bytecode for the operations you actually call, therefore they are actually fully compatible? (and other similar module version transformations)

What if the community asked Oracle to do this instead of JDK 8? (accepting a delay to 2013) Or as JDK 9?

Is it time to learn the lessons of Java? And apply those lessons to Java itself?

Summary

Any talk on languages is controversial, as each language has a specific world view and fans. I rejected Clojure, Groovy and Scala as NBJL even though each is a good language in its own way. I concluded that Fantom is closest to the statically typed mainstream language that is needed, yet its simple type system and some of its APIs are counting against it.

As a result, my conclusion is that the language best placed to be the Next Big JVM Language is Java itself - just a backwards incompatible version. And I suggested that maybe we should accept a 2013 release date (instead of 2012) for JDK 8 as a backwards incompatible version.

Feedback expected! Please use #bijava to talk about backwards incompatible Java on twitter.
PS. I'll get comments invalidly marked as spam out as soon as I can!

Tuesday 7 September 2010

Joda-Convert and Joda-Primitives v1.0

At the weekend I released v1.0 of two projects - Joda-Convert and Joda-Primitives.

Joda-Convert

This is a simple library targeted at converting objects to and from strings. As well as the standard interface plugin point for adding your own converters, it has annotation-based conversion. This allows you to add the ToString and FromString annotations to methods of your class and have Joda-Convert pick them up.
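
Here is a minimal sketch of the annotation-based approach, assuming the v1.0 API (the Distance class and its methods are hypothetical; @ToString, @FromString and StringConvert come from Joda-Convert):

 // Java code - annotation-based conversion with Joda-Convert
 // (Distance is a hypothetical example class)
 import org.joda.convert.FromString;
 import org.joda.convert.StringConvert;
 import org.joda.convert.ToString;

 public class Distance {

   private final int metres;

   private Distance(int metres) {
     this.metres = metres;
   }

   @FromString
   public static Distance parse(String str) {
     return new Distance(Integer.parseInt(str.replace("m", "")));
   }

   @ToString
   public String asString() {
     return metres + "m";
   }

   public static void main(String[] args) {
     String str = StringConvert.INSTANCE.convertToString(new Distance(25));
     Distance dist = StringConvert.INSTANCE.convertFromString(Distance.class, "25m");
     System.out.println(str + " " + dist.asString());
   }
 }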

See the website and previous blog for more details.

Joda-Primitives

This is a library providing collections, iterators and lists based on primitive types rather than objects. Using collections in this way can yield considerable space and performance benefits over boxed primitives.

The project began life a number of years ago as a fork from Commons Primitives. The main difference is that Joda-Primitives extends the original JDK collections interfaces for seamless integration, whereas Commons-Primitives requires wrappers to interoperate.

It is hoped that support for other collection types may be added; however, it is not at the top of my list of things to do.

See the website for more details.