Friday, 28 September 2007

JSR-275 - Suitable for Java 7?

I've been evaluating JSR-275 - Units and Quantities tonight in relation to JSR-310. It's not been pretty.

JSR-275 - Units and Quantities

JSR-275 is a follow-on from JSR-108, and aims to become part of the standard Java library in the JDK. It provides a small API for quantities and units, and a larger implementation of physical units, conversions and formatting. The goal is to avoid using ints for quantities, with their potential for mixing miles and metres.

So where's the problem?

Well, it starts with naming. There is a class called Measure which is the key class. But it's what I would call a quantity - a number of units, such as 6 metres or 32 kilograms.

Measure is generified by two parameters. The first is the value type. You'd expect this to be a Number, but weirdly it is any class you like.

You'd expect the second generic parameter to be the unit type, right? Er, no. It's the Quantity. So, not only is the quantity class not called Quantity, there is another class which doesn't have much to do with a quantity that is called Quantity. Confused yet?

The unit class is at least called Unit. Unit also has a generic parameter, and again it's Quantity.

So, what is this mysterious Quantity class in JSR-275?

Well, it's the concept of length or temperature or mass. Now actually that is a very useful thing for JSR-275 to have in the API. By having it in the API, it allows the generic system to prevent you from mixing a length with a mass. Which must be a Good Thing, right?

 Measure<Integer, Length> distance = Measure.valueOf(6, KILO(METRES));
 int miles = distance.intValue(MILES);

The first thing to notice is that the variable is held using a Length generic parameter. There is no support in JSR-275 to specify the variable to only accept metres or miles. Personally, I think that's a problem.

The second thing to notice is how easy conversion is. It just happened when it was requested. But is that what you necessarily want? We just converted from kilometres to miles without a thought. What about lost accuracy? Are those units really compatible?

Well now consider dates and times:

 Measure<Integer, Duration> duration = Measure.valueOf(3, MONTHS);
 int days = duration.intValue(DAYS);

Er, how exactly did we manage to convert between months and days? What does that really mean? Well, assuming I've understood everything, what it does is use a definition of a year as 365 days, 5 hours, 49 minutes, and 12 seconds, and a month being one twelfth of that. It then goes ahead and does the conversion from months to days using that data.
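
To make the arithmetic concrete, here is a minimal sketch of the calculation as I understand it. It is not JSR-275 code: the class and method names are made up for illustration, and only the year length comes from the description above.

```java
// Sketch only: reproduces the arithmetic described in the post,
// not JSR-275's actual implementation. All names here are hypothetical.
final class MonthConversionSketch {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    // Assumed year length: 365 days, 5 hours, 49 minutes, 12 seconds.
    static final long SECONDS_PER_YEAR =
            365L * SECONDS_PER_DAY + 5 * 3600 + 49 * 60 + 12;

    // A "month" is taken to be exactly one twelfth of that year.
    static double monthsToDays(int months) {
        double secondsPerMonth = SECONDS_PER_YEAR / 12.0;
        return months * secondsPerMonth / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        double days = monthsToDays(3);
        System.out.println(days);       // roughly 91.31
        System.out.println((int) days); // 91 once squeezed into an int
    }
}
```

Under those assumptions three months comes out at roughly 91.31 days, so an int result of 91. Yet January to March is 90 or 91 days, and July to September is 92.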

I doubt that the result will be what you want.

In fact, perhaps what has happened here is that we have substituted the unsafe int with a supposedly safe class. With JSR-275 we relax and assume all these exciting conversions will just work and nothing will go wrong. In short, we have a false sense of security. And that's perhaps one of the worst things an API can do.

I then looked at the proposal for integrating JSR-275 and JSR-310. One idea was for the JSR-310 duration field classes (Seconds, Minutes, Days, Months, etc) to subclass Measure. But that isn't viable as Measure defines static methods that are inappropriate for the subclasses. Measure also defines too many regular methods, including some that don't make sense in the context of JSR-310 (or at least the JSR-310 I've been building).

Looking deeper, what I see is that JSR-275 is focussed on conversions between different units. There is quite a complex API for expressing the relationships between different units (multipliers, powers, roots, etc). This then feeds right up to the main public API, where a key 'service' provided by the API is instant conversion.

But while conversion is important, for me it has come to dominate the API at the expense of what I actually wanted and expected. My use case is a simple model to express a quantity - a Number plus a Unit.

Alternative

So, here is my mini alternative to JSR-275:

 public abstract class Quantity<A extends Number, U extends Unit> {
  public abstract A amount();
  public abstract U unit();
 }
 public abstract class Unit {    // grams or Celsius
  public abstract String name();
  public abstract Scale scale();
 }
 public abstract class Scale {   // mass or temperature
  public abstract String name();
 }
 
 Quantity<Integer, MilesUnit> distance = ...

So, what have we lost? Well, all the conversion for a start. In fact, all we actually have is a simple class, similar to Number, but for quantities. And the unit is very bare-bones too.

We've also lost some safety. There is now no representation of the distance vs duration vs temperature concept. And that could lead us to mix mass and distance. Except that the API will catch it at runtime, so it's not too bad.
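
A minimal sketch of how such a runtime check might look. Only Quantity, Unit and Scale come from the mini-proposal (made package-private here so it all fits in one file); SimpleScale, SimpleUnit, SimpleQuantity and requireSameScale are hypothetical names invented for illustration, and comparing scales by identity assumes each scale is a singleton.

```java
// Hypothetical sketch of the runtime check; only Quantity, Unit and Scale
// come from the mini-proposal, everything else is invented for illustration.
abstract class Scale {
    public abstract String name();
}

abstract class Unit {
    public abstract String name();
    public abstract Scale scale();
}

abstract class Quantity<A extends Number, U extends Unit> {
    public abstract A amount();
    public abstract U unit();
}

final class SimpleScale extends Scale {
    private final String name;
    SimpleScale(String name) { this.name = name; }
    @Override public String name() { return name; }
}

final class SimpleUnit extends Unit {
    private final String name;
    private final Scale scale;
    SimpleUnit(String name, Scale scale) { this.name = name; this.scale = scale; }
    @Override public String name() { return name; }
    @Override public Scale scale() { return scale; }
}

final class SimpleQuantity<A extends Number, U extends Unit> extends Quantity<A, U> {
    private final A amount;
    private final U unit;
    SimpleQuantity(A amount, U unit) { this.amount = amount; this.unit = unit; }
    @Override public A amount() { return amount; }
    @Override public U unit() { return unit; }
}

final class ScaleCheckSketch {
    // Rejects a pair of quantities whose units belong to different scales.
    // Assumes one shared Scale instance per concept (length, mass, ...).
    static void requireSameScale(Quantity<?, ?> a, Quantity<?, ?> b) {
        Scale left = a.unit().scale();
        Scale right = b.unit().scale();
        if (left != right) {
            throw new IllegalArgumentException(
                    "Cannot mix " + left.name() + " with " + right.name());
        }
    }

    public static void main(String[] args) {
        Scale length = new SimpleScale("length");
        Scale mass = new SimpleScale("mass");
        Quantity<Integer, SimpleUnit> distance =
                new SimpleQuantity<>(5, new SimpleUnit("miles", length));
        Quantity<Integer, SimpleUnit> weight =
                new SimpleQuantity<>(32, new SimpleUnit("kilograms", mass));
        try {
            requireSameScale(distance, weight);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The check only fires when the code runs, which is exactly the safety trade-off described above.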

And what have we gained? Well, it's a much smaller and simpler API. But less functional. However, by being simpler, it's easier for other teams like JSR-310 to come along and add the extra classes and conversions we want. For example, implementations of Quantity called Days and Months that don't provide easy conversion.

In essence, this is just the quantity equivalent of Number - thus it's a class which the JDK lacks.

I'm sure there are many holes in this mini-proposal (I did code it and try to improve on it, but ran into generics hell again). The main thing is to emphasise simplicity, and to separate the whole reams of code to do with conversion and formatting of units. I'm highly unconvinced at the moment that the conversion/formatting code from JSR-275 belongs in the JDK.

Summary

Having spent time looking at JSR-275 I'm just not feeling the love. It's too focussed on conversion for my taste, and prevents the simple subclassing I actually need.

Opinions welcome on JSR-275!

16 comments:

  1. By the way, the next step for making the self type happen is for someone to hack it into the KSL, if it hasn't been done already. I'd love to try it, but don't know if I'll have a chance.

  2. Great post, Stephen. I think your Quantity class is fantastic!

    I wonder if there's any way to tweak your proposed API to increase readability, but without breaking your nice separation of responsibilities. Perhaps I define an interface as a shorthand for a Unit/Number pair:
    interface Miles extends Quantity { }

    Then potentially I have a nice, JodaTime-like DSL:
    Miles homeToWork = Quantities.create(5, Miles.class);
    Miles workToStarbucks = Quantities.create(1, Miles.class);
    Miles dailyDriveDistance = homeToWork.times(2).plus(workToStarbucks.times(4));

  3. Hi Stephen

    I think you're conflating issues here. There are conversions which are known and fixed, for example inches to centimeters, seconds to milliseconds. Then there are conversions, such as those related to dates and calendars, which are dependent on some model of a calendar. The units and quantities API needs to have some way to plug in a conversion model for those types of conversions. JSR 310 needs to be a provider for those conversions. Currency conversion would be another example of a pluggable conversion system.

    I've read one article introducing 275 by the spec lead (in a German programming magazine), and the focus was somewhat more on people doing math, physics, chemistry, etc. For those use-cases, it seemed 275 was both elegant and powerful.

    Regards
    Patrick

  4. Stephen Colebourne, 29 September 2007 at 13:39

    @Patrick, You are correct up to a point. JSR-275 has come from a maths/physics/chemistry/science background, and probably works quite neatly in that arena. The problem is that I doubt that is what an average JDK user wants.

    You suggest a pluggable conversion mechanism in JSR-275. I believe that only complicates matters further. What I'm arguing for is a separation of a basic, simple quantity model from the science-derived conversions (and formatting).

  5. Stephen Colebourne, 29 September 2007 at 13:50

    @Kevin, @Jesse, I'm glad you like the outline of the simpler model :-)

    On the self-type point, I think I agree (although Number should be updated too really). Adding to the KSL is also needed. But AFAIK, there has been nothing added to the KSL despite quite a few patches being submitted to the list. And who will actually do the code work?

    On the arithmetic operations, I think I agree. I would find a place for BigDecimal which you haven't allowed for though.

    Quantity vs Amount? I'm flexible I guess. It's just not a 'Measure'.

    Jesse, your idea of a Miles interface corresponds to Joda-Time/JSR-310 well. However, it needs to be an immutable class directly subclassing Quantity for full and best use of self-types (in parameters). And the code can be written even more simply with static imports and a utility class:

    Miles homeToWork = miles(5);
    Miles workToStarbucks = miles(1);
    Miles dailyDriveDistance = homeToWork.multipliedBy(2).plus(workToStarbucks.multipliedBy(4));

    I'm arguing that JSR-310 should provide the time related ones.

  6. I also think JSR-275 is not a good addition to the JDK.

    Btw - there was a long and interesting discussion about it on JL:
    http://www.javalobby.org/java/forums/m92158973.html

  7. Is this about linguistics?

    I'm not sure that Measure is the wrong term, as it denotes "the dimensions, capacity, or amount of something ascertained by measuring" and thus is more general than Amount. The correct term might be "measured value" consisting of a value and a unit. Quantities to me might be represented by various units (e.g. length in miles or metres) but do not belong to a measured value. I could actually see units being categorized by quantities to allow for conversion (although, this might be tricky, as you said, Stephen).

    Restricting operations to the same kind is another point, which makes me think. Why can't I divide length by time and retrieve a velocity? If the framework knew about some math rules, it would be easy to have a dynamic unit being meter per second or miles per hour. The example on dailyDriveDistance actually does that already when multiplying by a value having the quantity with the dimensions of 1 (i.e., a pure number).

  8. You don't need a degree in mathematics to know what a "measure" is - well obviously it helps...

    The Quantity type parameter in measure is ugly - thx again to type erasure: V is supposed to be as well a non scalar value, like a vector of velocities - hard to define as a type parameter.

    Anyway, a simpler API that focussed on scalars (monetary units, lengths, etc.) would be more useful for a wider audience. I find scientific notions ("measure") not really suitable for the standard API.

  9. Kevin, I don't think someone is trying.
    You can join us on
    compiler-dev@openjdk.java.net
    if you want some support.

    Rémi

  10. Stephen Colebourne, 1 October 2007 at 22:52

    @Quintesse, I think your idea (which I also thought of ;-) is a definite possibility. It does allow a very simple classification of units in a quick immediate way, understandable to most developers.

    The problem I came up with was forcing users to write "? extends LengthUnit" in their APIs, which isn't as pretty as 275.

    There is also a problem with generics - imagine writing a convert(A, B) method to convert between two different units. Obviously you want them both to be lengths. It's fine if you want to write "A extends LengthUnit, B extends LengthUnit". But it's not fine if you want to write a generic bit of code (as 275 has now) that is suitable for any unit scale, whether length, mass or temperature.

  11. Stephen Colebourne, 2 October 2007 at 23:12

    @Werner, I believe there are terminology issues, and 275's choice of Quantity is one I strongly dislike.

    The description of quantity I see on Wikipedia is "Measurements of any particular quantitative property are expressed as a specific quantity, referred to as a unit, multiplied by a number" (http://en.wikipedia.org/wiki/Quantitative), i.e. a 'quantity' is a unit multiplied by a number. This is what I suggested in my blog post.

    (I believe I understand reasonably well now how 275 hangs together, I'm arguing that different class names might help your cause.)

    I am also interested in your comment wrt a common Unit/Quantity interface between 275 and 310, perhaps this is something we can explore further.

    BTW, Collectable would be unsuitable for a collection. In English, you should be able to say the following for all words ending in -able:
    Collectable = something that can be collected.
    Serializable = something that can be serialized.
    Comparable = something that can be compared.

  12. I suspect normal usage largely conflates the concepts represented by Measure, Quantity and Unit into a single entity (and then often as not drops a few rather crucial powers of ten at the same time). Scientists (and mathematicians) need all three concepts; regardless of the names given, the result may still seem alien to non-scientists.
    Quantities give meaning to Units and allow them to be combined with meaningful results. The ability to convert the numerical values is then a minor addition to the structure which permits dimensional analysis.

    So JSR-275 is 'units' for scientists/engineers. Your proposed simplification would probably leave a result that was worthless to that group.

    As you have noted, months just don't fit in this structure, although the proverbial furlongs/fortnight does.

  13. Stephen Colebourne, 8 October 2007 at 22:59

    @Mark, I think I agree with your analysis. JSR-275 is designed by and for scientists/mathematicians and probably serves that use case well. (That is also the use case in their JSR definition).

    What is at question is whether a code base like that should be the basis for unit/quantity code in the JDK. And thus if JSR-310 should use it.

    What I'm asking is whether a simplified quantity/unit could be the basis of both 275 and 310.

  14. Jean-Marie Dautelle, 9 October 2007 at 01:01

    JSR-275 is not going to tell you how many months/days there are between now and Christmas. In fact a Duration object cannot tell you either! It depends on the starting point. For example, 1 ms just before Dec 24 Midnight and Christmas is still one day away! Trying to convert a duration to a number of months or days is absurd unless you assume a constant predefined conversion factor. Stephen knows that very well, for example in JODA time, Days.toDuration() assumes a 24 hours day.
    It is clear that JSR-275 and JODA Time have the same understanding for "Duration". It is also clear that JODA time and JSR-310 have the abstract concept of Period (e.g. number of days) whose actual duration varies and therefore should not implement Measurable.

    Assuming Duration is renamed to something like DurationAmount (duration amounts could be added/subtracted), having DurationAmount implement Measurable should make everybody happy (hopefully).

  15. What might non-scientists expect of a units package? As my background is in applied mathematics, I can't answer this question.

  16. > But would your idea still include some kind of class hierarchy to abstract units like Length and Mass?
    The problem is that if you are doing any science or engineering you quickly hit expressions where the units are too messy for a fixed set of classes, and Java doesn't support any type calculus. You have to strip off the type of E, m, and c, write your E = m * pow(c, 2), then hope that the units you chose for the expression were consistent. The approach in the Amount interface above also fails - you want it to tell you that you can't write E = m * pow(c, 3), and neither only allowing it to return this type, nor only operating on values stripped of their units, suffices for that. One approach I tried for a previous employer was to use the first few primes to represent each of the dimensions, and ratios (represented by two longs) to represent combinations, which is OK as long as you're only using integer powers (non-integer powers or transcendent values are usually dimensionless), though that was to allow any units rather than giving run-time or compile-time checks. You probably could do a compile-time check in a type language about as powerful as C++'s templates (which isn't very powerful), but not with Java's generics.
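
    The primes-and-ratios idea could be sketched roughly as follows. This is an illustration under assumptions, not the code described above: the class name PrimeDimension, the choice of 2, 3 and 5 for length, mass and time, and the sameAs comparison used as a run-time check are all hypothetical.

```java
// Hypothetical sketch: each base dimension is a prime, and a combined
// dimension is a reduced ratio of two longs. Integer powers only.
final class PrimeDimension {
    static final PrimeDimension NONE = new PrimeDimension(1, 1);   // dimensionless
    static final PrimeDimension LENGTH = new PrimeDimension(2, 1); // assumed prime
    static final PrimeDimension MASS = new PrimeDimension(3, 1);   // assumed prime
    static final PrimeDimension TIME = new PrimeDimension(5, 1);   // assumed prime

    final long num;
    final long den;

    private PrimeDimension(long num, long den) {
        long g = gcd(num, den); // keep the ratio in lowest terms
        this.num = num / g;
        this.den = den / g;
    }

    private static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    PrimeDimension times(PrimeDimension other) {
        // multiplyExact throws ArithmeticException rather than overflowing silently
        return new PrimeDimension(
                Math.multiplyExact(num, other.num),
                Math.multiplyExact(den, other.den));
    }

    PrimeDimension dividedBy(PrimeDimension other) {
        return new PrimeDimension(
                Math.multiplyExact(num, other.den),
                Math.multiplyExact(den, other.num));
    }

    PrimeDimension pow(int exponent) {
        PrimeDimension base = exponent >= 0 ? this : NONE.dividedBy(this);
        PrimeDimension result = NONE;
        for (int i = 0; i < Math.abs(exponent); i++) {
            result = result.times(base);
        }
        return result;
    }

    boolean sameAs(PrimeDimension other) {
        return num == other.num && den == other.den;
    }

    public static void main(String[] args) {
        PrimeDimension velocity = LENGTH.dividedBy(TIME);
        PrimeDimension energy = MASS.times(LENGTH.pow(2)).dividedBy(TIME.pow(2));
        System.out.println(MASS.times(velocity.pow(2)).sameAs(energy)); // true
        System.out.println(MASS.times(velocity.pow(3)).sameAs(energy)); // false
    }
}
```

    With these assumed primes, energy reduces to 12/25, m * pow(c, 2) matches it, and m * pow(c, 3) gives 24/125, so the mismatch is detectable, but only at run time.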
