Thursday, 23 October 2008

Life in the Fan lane - less NPEs

Since I last blogged I've been contributing to the development of the Fantom language through lots of discussions. One of the latest features to be added targets a major reduction in NullPointerExceptions (NPEs).

Fantom and NPEs

The Fantom programming language is a relatively new language for the JVM (it compiles to bytecode). However, it also compiles to the .NET CLR, making it very portable, and it comes with its own set of APIs.

It's a language that is easy for Java and C# developers to grasp (familiar syntax and concepts). It also fixes many of the weak points identified in Java, with closures properly designed in, understandable generics, much less boilerplate and clever concurrency. Further, because it is a statically typed language, it performs close to, or as well as, Java.

// find files less than one day old
files := dir.list.findAll |File f->Bool| {
  return DateTime.now - f.modified < 1day
}
// print the filenames to stdout
files.each |File f| {
  echo("$f.modified.toLocale:  $f.name")
}

When I first came across the language, I immediately saw its potential as a Java successor. It is an evolutionary language. Simple for Java developers to move to, yet with clear potential productivity gains. One thing irked me though, and that was the handling of nulls, because Fantom had a model exactly like Java, where any variable could be null.

This has changed recently however, and now Fantom supports both nullable and not-null types:

 Str surname      // a variable or field that cannot hold null
 Str? middleName  // a variable or field that can hold null

At a stroke, this allows much more control over the presence of null within your application. No longer do we have to document the null status of a variable in JavaDoc (or in lengthy, nasty annotations).

For example, you cannot compare surname to null, as that makes no sense. In the Java defensive coding style, it is often the case that variables are needlessly checked for null. This additional code gets in the way of the real business logic, and requires extra tests and analysis. With Fantom's null/not-null choice, a not-null variable can simply be relied on to be not-null, and any attempt to compare a not-null variable to null is a compile error.
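
To make the contrast concrete, here is a minimal Java sketch of the defensive style described above. The class and method names (DefensiveGreeter, greetDefensively, greet) are hypothetical and exist only for illustration. The second method shows the business logic that remains when a value can be trusted not to be null, which is roughly what a not-null declaration such as Str surname is meant to give you.

```java
// Hypothetical example: the names here are illustrative, not from any real API.
class DefensiveGreeter {

    // Defensive Java style: the reference is checked "just in case".
    static String greetDefensively(String surname) {
        if (surname == null) {
            throw new IllegalArgumentException("surname must not be null");
        }
        return "Dear " + surname.trim();
    }

    // The business logic alone. In Java nothing stops a caller passing null here,
    // so this only stays safe by convention; a not-null parameter type lets the
    // compiler reject an obvious null argument instead.
    static String greet(String surname) {
        return "Dear " + surname.trim();
    }

    public static void main(String[] args) {
        System.out.println(greetDefensively(" Colebourne "));
        System.out.println(greet(" Colebourne "));
    }
}
```

The check in the first method is exactly the kind of code that also needs its own test, even when no caller ever passes null.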

Further to this, the type system allows you to block the presence of null in lists and maps. This is often an area forgotten when passing collections to methods - until the NPE strikes.

 Str[] list     // a list that cannot hold null
 Str:Int map    // a map that holds non-null strings to non-null integers
 Str:Int? map2  // a map that holds non-null strings to nullable integers
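
For comparison, the Java sketch below shows how a null hidden inside a collection tends to surface late, and how it can be rejected at the boundary by hand. The class and method names are hypothetical. It assumes Java 10 or later, where List.copyOf refuses collections containing null; it is only a runtime approximation of what a declaration like Str[] expresses in the type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example: the names here are illustrative only.
class NullInCollections {

    // Sums the name lengths; fails with an NPE only when it reaches a null element.
    static int totalLength(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    // Rejects null elements up front: List.copyOf (Java 10+) throws an NPE
    // if the source collection contains null.
    static List<String> requireNoNulls(List<String> names) {
        return List.copyOf(names);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("Ann", null, "Bob"));
        try {
            totalLength(names);
        } catch (NullPointerException e) {
            System.out.println("NPE while iterating");
        }
        try {
            requireNoNulls(names);
        } catch (NullPointerException e) {
            System.out.println("NPE at the boundary");
        }
    }
}
```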

Finally, the null-safe operators (?., ?-> and ?:) are all prevented from operating on not-null variables. The ?. operator is a null-safe method invocation that returns null if the LHS is null, so clearly this makes no sense if the LHS is a not-null variable.
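
For readers who think in Java, the behaviour of the three operators can be approximated with small helper methods. This is only a sketch: the helpers (safeCall, elvis, safeDynamicCall) are hypothetical, not part of any library, and the reflective version handles only public no-argument methods.

```java
import java.util.function.Function;

// Hypothetical helpers that mimic the operators' behaviour; not part of any library.
class NullSafeOps {

    // Like a?.m(): returns null when the target is null, otherwise calls the method.
    static <T, R> R safeCall(T target, Function<T, R> method) {
        return target == null ? null : method.apply(target);
    }

    // Like a ?: b: returns a unless it is null, in which case it returns b.
    static <T> T elvis(T value, T fallback) {
        return value != null ? value : fallback;
    }

    // Like a?->name: a null-safe reflective call of a public no-argument method.
    static Object safeDynamicCall(Object target, String methodName)
            throws ReflectiveOperationException {
        return target == null ? null : target.getClass().getMethod(methodName).invoke(target);
    }

    public static void main(String[] args) throws Exception {
        String middleName = null;
        String upper = safeCall(middleName, s -> s.toUpperCase());
        String shown = elvis(middleName, "(none)");
        Object length = safeDynamicCall("abc", "length");
        System.out.println(upper);
        System.out.println(shown);
        System.out.println(length);
    }
}
```

Each helper starts with a null test on its first argument. If that argument can never be null, the test is dead code, which is why applying these operators to a not-null variable is rejected.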

One point to note is that not-null is the default. Why is that?

Well, it turns out that not-null is the most common state for a variable. Most programmers intend most variables to never hold null. When converting the Fantom codebase, figures of 80% were not uncommon (i.e. 80% of variables were originally intended to never hold null). Clearly it makes sense to make the most common case the default, and thus to make handling null a special case.

And what does the nullable/not-null variable actually represent? Well, most variables are objects, so once it gets to bytecode there is no difference. But for Int and Float, the non-null types can be converted to the primitive bytecodes for Java's long and double. This means that Fantom now has access to primitive level performance at the bytecode level which is going to allow Fantom applications to speed along nicely.
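
The Java sketch below illustrates the distinction the compiler can exploit. It is not Fantom's generated code, just the Java-level difference between a primitive long (which has no null) and a boxed Long (which does); the class and method names are hypothetical.

```java
// Hypothetical example: the names here are illustrative only.
class PrimitiveVsBoxed {

    // A value that can never be null can be a primitive long: no object, no null check.
    static long doubleIt(long value) {
        return value * 2;
    }

    // A value that may be null has to be a boxed Long and must be checked before use.
    static long doubleOrZero(Long value) {
        return value == null ? 0L : value * 2;
    }

    public static void main(String[] args) {
        System.out.println(doubleIt(21L));
        System.out.println(doubleOrZero(null));
        Long missing = null;
        try {
            long broken = missing; // auto-unboxing a null Long throws an NPE
            System.out.println(broken);
        } catch (NullPointerException e) {
            System.out.println("NPE on unboxing");
        }
    }
}
```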

Summary

Fantom is coming along nicely. NPEs are a major headache in today's systems, and Fantom's approach will eliminate many of those errors. Further, it will eliminate much of the defensive code that is added to protect against values that will never actually be null - leaving Fantom even more decluttered.

Opinions welcome on NPE handling - something that seems to have been a complete lack of priority in Java.

9 comments:

  1. > Something that seems to have been a complete lack of priority in Java.

    Is the committee still debating whether it should be called NonNull or NotNull?! lol

    Nice writeup, limiting the state space seems like an obvious thing to want to do. Will there be support for the C# coalescing operator too? Seems to me a lot of current Java code (using the capability pattern) could really benefit from that.

  2. Nice. The ?. operator is roughly what Scala has in Option.map. You didn't document what ?-> and ?: are, but hopefully at least one of those is Option.flatMap.

    Scala doesn't have null-safe variables, sadly, but most Scala programmers will use Option to represent a nullable, rather than actually use null. There was talk of optimising out Option in the bytecode, but it turned out that some enterprising folks liked to use Some(null) (an Option containing one element that was null), blocking this operation.

    It would be nice if the nullable variable was a library feature rather than a language feature, but either way, this is a step forward for JVM languages.

  3. Stephen Colebourne, 23 October 2008 09:15

    The -> operator is method invocation using reflection, thus the ?-> operator is a version of it that returns null if the LHS is null, which makes no sense to call on a not-null LHS.

    The ?: operator is a shortcut for the pattern (a != null ? a : c), becoming (a ?: c). Again, this operator makes no sense to call on a not-null LHS.

    Having read the scaladoc for Option.map and Option.flatMap, I still have no idea what they do, so I can't answer that question (I find this to be true about Scala in general - most things are couched in a language only accessible with a FP background)

  4. Yes, I should have explained them.

    Option is an abstract type with two implementations, Some and None. Some(T) represents a non-null value, and None represents a null. Well, that's one view of it; you don't have to actually think of null when using it.

    anOption.map(some function) will, if it's a Some, run the function on the contained value, producing a new Option. This is your ?. operator.
    For map, if the function returns an Option, then you end up with Option[Option[Foo]] as the return type, which is clearly pretty much always useless.

    So there is flatMap too, which will 'flatten' the result. Another name for flatMap, used outside Scala, is 'bind'. Perhaps ?. is actually that, I'm not sure.

    The nice thing is that flatMap and map (and some other methods) are common between Option, List, Array, etc., so you can use very similar techniques, and sometimes just the same code, for various container types.

    Scala has 'for comprehensions' based on these method names, so, say, for (i <- 1 to 10) yield i * 5, might translate to 1 to 10 map (i => i * 5).

    C#'s LINQ works in exactly the same way, just with different syntax and different method names. Option is trivial to make work with LINQ.

  5. This isn't enough, Stephen. This is broken. You can't do it this way. nullity in static types is difficult for the same reason generics is difficult. There are three distinct nullity types:

    Definitely not null. Let's say '!'.
    Definitely allows null. Let's say '?'.
    Either state. Let's say '#'.

    Just like there are three ways to express 'A list of numbers': List<Number>, List<? extends Number>, and List<? super Number>.

    Str[] foo = ["a", "b", "c"];
    Str?[] bar = foo;
    bar.add(null);
    Str baz = foo[3];
    //baz now holds null even though it shouldn't be possible.

    So, Str[] and Str?[] aren't assignable to each other. Which means you need a way to express 'Str[] OR Str?[], and I'll only write in non-null, and do null checks on reading, so it'll be okay'. I don't see this in your examples. Without this, in my opinion, it's unusable, especially in combination with legacy code, which is littered with fairly inappropriate nullity.

    For a fuller explanation with a lot more examples about why this is far more complex than you seem to think it is, at least in this blog post, see:

    http://www.zwitserloot.com/2008/10/23/non-null-in-static-languages/

  6. @Reinier

    You are quite right that Fan's treatment of nullability in its type system is "incomplete". But designing features like nullability is an exercise in engineering trade-offs. Fan starts with the philosophy that type systems are useful, but can never be truly perfect. So we tend to make trade-offs to keep the type system simple. Honestly I don't think nullables would have made it into Fan's type system if it weren't for the benefit of getting primitive performance. So in some ways we are giving up simplicity for performance.

    To fully plug the holes you discuss would be a trade-off for more complexity to gain a more sound type system. There is also the dimension of whether that hole is best plugged statically (in the type system) or at runtime. With Fan nullability has been designed to solve the 80% common case at compile time, and the rest at runtime.

    To give a counter example to your List example: Java does indeed allow you to do that with arrays:

    Number[] number = new Integer[10];

    number[0] = new Float(3f);

    That code will compile fine, but will generate an ArrayStoreException at runtime. That is pretty similar to how Fan will work with nullables. It will allow you some flexibility at compile time, but in the end coercion from a nullable type to a non-nullable type will throw an NPE at runtime (just like a runtime cast will throw a CastException).

    So the general way I like to frame these discussions is not about whether Fan has designed a 100% foolproof nullable type system. Rather the discussion is have we designed a system with the right trade-offs?

  7. Stephen Colebourne, 24 October 2008 02:14

    Brian's explanation is key to why Fan is interesting as a Java successor. Fan is a pragmatic language, not an academic or pure one (and this approach has a tendency to throw language geeks into a spin...). It makes choices that don't obsess about types, but keep the system simple, yet still safe.

    How many users of Java 5 really believe that Java generics are the best way to tackle that particular issue? Fan has a much simpler and more-focussed approach to generics and many other points. In the end it really is about choosing the right trade-offs!

  8. I refer back to the 'legacy' point: Existing code isn't compiled with explicit null checks in place. Unless you want to change the way the JVM works, I don't see how this would work.

    I don't think it's -that- complicated. It's really just generics all over again; same caveats, same principle.

    Fan doesn't yet have the baggage of legacy code. I'm not really impressed with a new language until I see one that has a sane way of handling language changes while not placing an undue burden on upgrading developers. As far as I can tell, neither Scala nor Fan has something as simple as a 'source' keyword that helps the compiler figure out what the author of the source thought the language is supposed to look like. This seems like a 'duh' lesson that you should have learned from Java - please add it.

    Then there's the issue of letting code figure out which version it is expected to emulate, so that you can make changes to the libraries, but that is a lot more complicated. Still worth it.

  9. Stephen Colebourne, 24 October 2008 12:55

    Reinier - Fan has its own API set independent of Java. There is no legacy code to integrate with.

    There is a question of how to integrate with Java code, but step 1 of that is to declare all Java objects as nullable. Further work can then go beyond to tighten up the behaviour.

    PS. sorry about the rubbish spam filters here - they are not in my control, but I will publish anything that gets caught.