Friday, 28 September 2007

JSR-275 - Suitable for Java 7?

I've been evaluating JSR-275 - Units and Quantities tonight in relation to JSR-310. Its not been pretty.

JSR-275 - Units and Quantities

JSR-275 is a follow on from JSR-108, and aims to become part of the standard Java library in the JDK. It provides a small API for quantities and units, and a larger implementation of physical units, conversions and formatting. The goal is to avoid using ints for quantities, with their potential for mixing miles and metres.

So where's the problem?

Well, it starts with naming. There is a class called Measure which is the key class. But its what I would call a quantity - a number of units, such as 6 metres or 32 kilograms.

Measure is generified by two parameters. The first is the value type. You'd expect this to be a Number, but weirdly it is any class you like.

You'd expect the second generic parameter to be the unit type, right? Er, no. It's the Quantity. So, not only is the quantity class not called Quantity, there is another class which doesn't have much to do with a quantity that is called Quantity. Confused yet?

The unit class is at least called Unit. Unit also has a generic parameter, and again its Quantity.

So, what is this mysterious Quantity class in JSR-275?

Well, its the concept of length or temperature or mass. Now actually that is a very useful thing for JSR-275 to have in the API. By having it in the API, it allows the generic system to prevent you from mixing a length with a mass. Which must be a Good Thing, right?

 Measure<Integer, Length> distance = Measure.valueOf(6, KILO(METRES));
 int miles = distance.intValue(MILES);

The first thing to notice is that the variable is held using a Length generic parameter. There is no support in JSR-275 to specify the variable to only accept metres or miles. Personally, I think thats a problem.

The second thing to notice is how easy conversion is. It just happened when it was requested. But is that what you necessarily want? We just converted from kilometres to miles without a thought. What about lost accuracy? Are those units really compatible?

Well now consider dates and times:

 Measure<Integer, Duration> duration = Measure.valueOf(3, MONTHS);
 int days = duration.intValue(DAYS);

Er, how exactly did we manage to convert between months and days? What does that really mean? Well, assuming I've understood everything, what it does is use a definition of a year as 365 days, 5 hours, 49 minutes, and 12 seconds, and a month being one twelth of that. It then goes ahead and does the conversion from months to days using that data.

I doubt that the result will be what you want.

In fact, perhaps what has happened here is that we have substituted the unsafe int with a supposedly safe class. With JSR-275 we relax and assume all these exciting conversions will just work and nothing will go wrong. In short, we have a false sense of security. And thats perhaps one of the worst things an API can do.

I then looked at the proposal for integrating JSR-275 and JSR-310. One idea was for the JSR-310 duration field classes (Seconds, Minutes, Days, Months, etc) to subclass Measure. But that isn't viable as Measure defines static methods that are inappropriate for the subclasses. Measure also defines too many regular methods, including some that don't make sense in the context of JSR-310 (or at least the JSR-310 I've been building).

Looking deeper, what I see is that JSR-275 is focussed on conversions between different units. There is quite a complex API for expressing the relationships between different units (multipliers, powers, roots, etc). This then feeds right up to the main public API, where a key 'service' provided by the API is instant conversion.

But while conversion is important, for me it has come to dominate the API at the expense of what I actually wanted and expected. My use case is a simple model to express a quantity - a Number plus a Unit.

Alternative

So, here is my mini alternative to JSR-275:

 public abstract class Quantity<A extends Number, U extends Unit> {
  public abstract A amount();
  public abstract U unit();
 }
 public abstract class Unit {    // grams or celcius
  public abstract String name();
  public abstract Scale scale();
 }
 public abstract class Scale {   // mass or temperature
  public abstract String name();
 }
 
 Quantity<Integer, MilesUnit> distance = ...

So, what have we lost? Well, all the conversion for a start. In fact, all we actually have is a simple class, similar to Number, but for quantities. And the unit is very bare-bones too.

We've also lost some safety. There is now no representation of the distance vs duration vs temperature concept. And that could lead us to mix mass and distance. Except that the API will catch it at runtime, so its not too bad.

And what have we gained? Well its a much smaller and simpler API. But less functional. However, by being simpler, its easier for other teams like JSR-310 to come along and add the extra classes and conversions we want. For example, implementations of Quantity called Days and Months that don't provide easy conversion.

In essence, this is just the quantity equivalent of Number - thus its a class which the JDK lacks.

I'm sure there are many holes in this mini-proposal (I did code it and try to improve on it, but ran into generics hell again). The main thing is to emphasise simplicity, and to separate the whole reams of code to do with conversion and formatting of units. I'm highly unconvinced at the moment that the conversion/formatting code from JSR-275 belongs in the JDK.

Summary

Having spent time looking at JSR-275 I'm just not feeling the love. Its too focussed on conversion for my taste, and prevents the simple subclassing I actually need.

Opinions welcome on JSR-275!

Thursday, 27 September 2007

Java 7 - Update on Properties

Whilst I was on holiday there was a bit of discussion about properties in Java.

Shannon Hickey announced Beans Binding version 1.0. This is the JSR that intends to develop an API to connect together swing components in a simple and easy to use manner. The idea is that you call a simple API to setup a binding between a GUI field and your data model.

During development, a lot of discussion on the JSR was about the interaction with the possible language proposal for properties. As a result, the v1.0 API includes a Property abstract class, with a subclasses for beans and the EL language from J2EE.

Per-class or per-instance

One key issue that has to be addressed with a property proposal is whether to go per-class or per-instance. The per-class approach results in usage like Method or Field, where the bean must be passed in to obtain the value:

 Person mother = ...;
 Property<Person, String> surnameProperty = BeanProperty.create("surname");
 String surname = surnameProperty.getValue(mother);

Here, the surname property refers to the surname on all Person beans. The caller must pass in a specific person instance to extract the value of the surname.

On the other hand, the per-instance approach results in usage where the property itself knows what bean it is attached to:

 Person mother = ...;
 Property<Person, String> surnameProperty = BeanProperty.create("surname", mother);
 String surname = surnameProperty.getValue();

Here, the surname property refers to the specific surname property on the mother instance. There is thus no need to pass in any object to extract the value of the surname, as the property knows that it is attached to mother.

Which is better? And which did Beans Binding adopt?

Well, its actually the case that for the swing use case, the per-class approach is really the only option. Consider a table where each row is the representation of a bean in a list. The columns of this table are clearly going to be properties, but which type? Well, when defining the binding, we want to do this without any reference to the beans in the list - after all, the beans may change and we don't want to have to redefine the binding if they do.

So, we naturally need the per-class rather than per-instance properties. And, that is what the Beans Binding project defines.

Is there ever a need for the per-instance approach? In my opinion yes, but that can be added to the per-class based property functionality easily.

Adding language support

Having seen the Beans Binding release, Remi Forax released a version of his property language compiler that integrates:

 Person mother = ...;
 Property<Person, String> surnameProperty = LangProperty.create(Person#surname);
 String surname = surnameProperty.getValue(mother);

As can be seen, the main advantage is that we now have a type-safe, compile-time checked, refactorable property Person#surname. This is a tremendous leap forward for Java, and one that shouldn't be underestimated. And the # syntax can easily be extended to fields and methods (FCM).

But, why do we see this new class LangProperty? Well, Remi's implementation returns an object of type java.lang.Property, not a Beans Binding org.jdesktop.Property. This fustrating difference means that two classes named Property exist, and conversion has to take place.

What this really demonstrates is the need for Sun to declare if Java 7 will or won't contain a language change for properties, and for a real effort to occur to unify the various different JSRs currently running in the beans/properties arena. It would be a real disaster for Java to end up with two similar solutions to the same problem (although I should at least say I'm very happy that the two parts are compatible :-).

Summary

Properties support in Java is progressing through a combination of JSR and individual work. Yet Sun is still silent on Java 7 language changes. Why is that?

Opinions welcomed as always!

Friday, 31 August 2007

Java 7 - Self-types

I just screamed at Java again. This time my ire has been raised by the lack of self-types in the language.

Self-types

So what is a self-type. Well, I could point you at some links, but I'll try and have a go at defining it myself...

A self-type is a special type that always refers to the current class. Its used mostly with generics, and if you've never needed it then you won't understand what a pain it is that it isn't in Java.

Here's my problem to help explain the issue. Consider two immutable classes - ZonedDate and LocalDate - both declaring effectively the same method:

 public final class ZonedDate {
   ZonedDate withYear(int year) { ... }
 }
 public final class LocalDate {
   LocalDate withYear(int year) { ... }
 }

The implementations only differ in the return type. This is essential for immutable classes, as you need the returned instance to be the same type as the original in order to call other methods (like withMonthOfYear).

Now consider that both methods actually contain the same code, and I want to abstract that out into a shared abstract superclass. This is where we get stuck:

 public abstract class BaseDate {
   BaseDate withYear(int year) { ... }
 }
 public final class ZonedDate extends BaseDate {
   // withYear removed as now in superclass, or is it?
 }
 public final class LocalDate extends BaseDate {
   // withYear removed as now in superclass, or is it?
 }

The problem is that we've changed the effect of the method as far as ZonedDate and LocalDate are concerned, because they no longer return the correct type, but now return the type of the superclass.

One solution is to override the abstract implementation in the subclass, just to get the correct return type (via covariance):

 public abstract class BaseDate {
   BaseDate withYear(int year) { ... }
 }
 public final class ZonedDate extends BaseDate {
   ZonedDate withYear(int year) {
     return (ZonedDate) super.withYear(year);
   }
 }
 public final class LocalDate extends BaseDate {
   LocalDate withYear(int year) {
     return (LocalDate) super.withYear(year);
   }
 }

What a mess. Imagine doing that for many methods on many subclasses. Its a large amount of pointless boilerplate code for no good reason. What we really want is self-types, with a syntax such as <this>:

 public abstract class BaseDate {
   <this> withYear(int year);
 }
 public final class ZonedDate extends BaseDate {
   // withYear removed as now correctly in superclass
 }
 public final class LocalDate extends BaseDate {
   // withYear removed as now correctly in superclass
 }

The simple device of the self-type means that the withYear method will now appear to have the correct return type in each of the subclasses - LocalDate in LocalDate, ZonedDate in ZonedDate and so on. And without reams of dubious boilerplate.

Summary

Self-types are a necessary companion for abstract classes where the subclass is to be immutable (and there are probably other good uses too...).

Opinions welcome as always!

Wednesday, 11 July 2007

Java 7 - What to do with BigDecimal?

What happens in Java when we have to deal with large decimal numbers? Numbers that must be accurate, of unlimited size and precision? We use BigDecimal, of course. Yet, I have a sense that we tend to curse every time we do. Why is that?

BigDecimal

I think that the main reason why we dislike working with BigDecimal is that its so much more clunky than working with a primitive type. There are no literals to construct them easily. Its only in Java 5 that constructors taking an int/long have been added.

And then there are the mathematical operations. Calling methods is just a whole lot more verbose and confusing than using operators.

  // yucky BigDecimal
  BigDecimal dec = new BigDecimal("12.34");
  dec = dec.add(new BigDecimal("34.45")).multiply(new BigDecimal("1.12")).subtract(new BigDecimal("3.21"));

  // nice primitive double
  double d = 12.34d;
  d = (d + 34.45d) * 1.12d - 3.21d;

Finally, there is the performance question. Perhaps this is a Java Urban Myth, but my brain associates BigDecimal with poor performance. (Note to self... really need to benchmark this!).

So, I was wondering if anything can be done about this? One solution would be a new primitive type in Java - decimal - with all the associated new bytecodes. Somehow I doubt this will happen, although it would be nice if it did.

More realistic is operator overloading (lets just consider BigDecimal for now, and not get into the whole operator overloading debate...). Overloading for BigDecimal has definitely been talked about in Sun for Java 7, and it would make sense. However, as can be seen in my example below, there is really a need for BigDecimal literals to be added at the same time:

  // now
  BigDecimal dec = new BigDecimal("12.34");
  dec = dec.add(new BigDecimal("34.45")).multiply(new BigDecimal("1.12")).subtract(new BigDecimal("3.21"));

  // just operator overloading
  BigDecimal dec = new BigDecimal("12.34");
  dec = dec + new BigDecimal("34.45") * new BigDecimal("1.12") - new BigDecimal("3.21");

  // with literals
  BigDecimal dec = 12.34n;
  dec = dec + 34.45n * 1.12n - 3.21n;

As you can see, the literal syntax makes a big difference to readability (I've used 'n' as a suffix, meaning 'number' for now. Of course the main issue with operator overloading is precedence, and that would need quite some work to get it right. I would argue that if literals are added at the same time, then precedence must work exactly as per other primitive numbers in Java.

One possibility to consider is replacing BigDecimal with a new class. I don't know if there are backwards compatability issues holding back the performance or design of BigDecimal, but its something to consider. Obviously, its difficult though, as BigDecimal is, like Date, a very widely used class.

A new idea?

One idea I had tonight was to write a subclass of BigDecimal that uses a long for internal storage instead of a BigInteger. The subclass would override all the methods of BigDecimal, implementing them more quickly because of the single primitive long storage. If the result of any calculation overflowed the capacity of a long, then the method would just return the standard BigDecimal class. I'd love to hear if anyone has had this idea before, or thinks that it would work well.

By the way, I thought of this in relation to JSR-310. It could be a specialised implementation of BigDecimal just for milliseconds, allowing arbitrary precision datetimes if people really need it.

And finally...

And finally, could I just say that I hate the new Java 5 method on BigDecimal called plus(). Believe it or not, the method does absolutely nothing except return 'this'. Who thought that was a good idea???

Summary

Hopefully this random collection of thoughts about BigDecimal can trigger some ideas and responses. There is still time before Java 7 to get a better approach to decimal numbers, and stop the use of inaccurate doubles. Opinions welcome as always.

Monday, 2 July 2007

JSR-310 - What running a JSR really means

Choosing to take on the development of a JSR (JSR-310, Date and Time API) wasn't an easy decision. I knew at the start that it would involve a lot of effort.

My original plan was to work on the JSR solely in my free time (evenings and weekends), but after 4 months the reality was that I hadn't got enough done. So thats why, starting today, I'm taking unpaid leave two days a week from my day-job for 3 months to try and get the task done.

So, I guess that this post should serve as a warning to anyone thinking of starting a JSR. Its going to cost you :-)

PS. JSR-310 is run in a fully open manner, so if you've got something to contribute to the debate, please join the mailing list or add to the wiki!

Friday, 8 June 2007

Speaking at TSSJS Europe on Closures

I will be speaking on the subject of Java Closures at TSSJS Europe in Barcelona on Wednesday June 27th. I'll be introducing closures, showing what they can do, and comparing the main proposals. It should be a good trip!

Saturday, 12 May 2007

Closures - comparing options impact on language features

At JavaOne, closures were still a hot topic of debate. But comparing the different proposals can be tricky, especially as JavaOne only had BGGA based sessions.

Once again, I will be comparing the three key proposals in the 'closure' space - BGGA from Neal Gafter et al, CICE from Josh Bloch et al, and FCM from Stefan and myself, with the optional extension JCA extension. In addition I'm including the BGGA restricted variant, which is intended to safeguard developers when the closure is to be run asynchronously, and the ARM proposal for resource management.

In this comparison, I want to compare the behaviour of key program elements in each of the main proposals with regards the lexical scope. With lexical scoping here I am referring to what features of the surrounding method and class are available to code within the 'closure'.

Neal Gafter analysed the lexical scoped semantic language constructs in Java. The concept is that 'full closures' should not impact on the lexical scope. Thus any block of code can be surrounded by a closure without changing its meaning. The proposals differ on how far to follow this rule, as I hope to show in the table (without showing every last detail):

Is the language construct lexically scoped when surrounded by a 'closure'?
  Inner class CICE ARM FCM JCA BGGA
Restricted
BGGA
variable names Part (1) Part (1) Yes Yes Yes Part (7) Yes
methods Part (2) Part (2) Yes Yes Yes Yes Yes
type (class) names Part (3) Part (3) Yes Yes Yes Yes Yes
checked exceptions - - Yes Yes (4) Yes (4) Yes (4) Yes (4)
this - - Yes Yes Yes Yes Yes
labels - - Yes - (5) Yes - (5) Yes
break - - Yes - (5) Yes - (5) Yes
continue - - Yes - (5) Yes - (5) Yes
return - - Yes - (6) Yes - (8) Yes

(1) Variable names from the inner class hide those from the lexical scope
(2) Method names from the inner class hide those from the lexical scope
(3) Type names from the inner class hide those from the lexical scope
(4) Assumes exception transparency
(5) Labels, break and continue are constrained within the 'closure'
(6) The return keyword returns to the invoker of the inner method
(7) The RestrictedClosure interface forces local variables to be declared final
(8) The RestrictedClosure interface prevents return from compiling within the 'closure'
BTW, if this is inaccurate in any way, I'll happily adjust.

So more 'Yes' markers means the proposal is better right? Well not necessarily. In fact, I really don't want you to get that impression from this blog.

What I'm trying to get across is the kind of problems that are being tackled. Beyond that you need to look at the proposals in more detail with examples in order to understand the impact of the Yes/No and the comments.

Let me know if this was helpful (or innaccurate), or if you'd like any other comparisons.