Tuesday, 23 January 2007

Java 7 - Short declarations

Peter Ahe recently blogged about shortening the code needed for declaring variables, where he proposed static factory methods. I just wanted to note down my opinion on this area and why.

Short declarations

The problem to be solved is the excessive length of defining a variable, particularly as half the information is repeated. For example:

SomeVeryLongClassname foo = new SomeVeryLongClassname();

Why do we actually need to say 'SomeVeryLongClassname' twice? The current proposals for changing this are:

// now
SomeVeryLongClassname foo = new SomeVeryLongClassname();

// option A
foo := new SomeVeryLongClassname();

// option B
final foo = new SomeVeryLongClassname();

// option C
var foo = new SomeVeryLongClassname();

// option D
SomeVeryLongClassname foo = SomeVeryLongClassname.new();

// option E
SomeVeryLongClassname foo = new ();

Now, let's repeat the options using the classic map example:

// now
Map<String, List<Integer>> foo = new HashMap<String, List<Integer>>();

// option A
foo := new HashMap<String, List<Integer>>();

// option B
final foo = new HashMap<String, List<Integer>>();

// option C
var foo = new HashMap<String, List<Integer>>();

// option D
Map<String, List<Integer>> foo = HashMap.new();

// option E
Map<String, List<Integer>> foo = new HashMap<>();

// option E2
Map<String, List<Integer>> foo = new HashMap();

And finally, let's view the syntax options when applied to initialising the variable from a method:

// now
Map<String, List<Integer>> foo = loadHashMap();

// option A
foo := loadHashMap();

// option B
final foo = loadHashMap();

// option C
var foo = loadHashMap();

// option D
Map<String, List<Integer>> foo = loadHashMap();

// option E
Map<String, List<Integer>> foo = loadHashMap();

// option E3
Map<> foo = loadHashMap();

I hope that this is useful in comparing the main options being proposed.

My view is that options A, B and C are really very similar. They differ only in the exact syntax used to perform the inference. If I were forced to choose one of those three, I would probably choose C. However, I believe that all three are inappropriate for Java.

Why? One of the key tenets of Java is static typing. I believe that this manifests itself in Java-Style as variables always being fully declared at the point of creation (i.e. on the LHS). This allows the maintenance developer to easily scan up the file and find the type of the variable they are working with (Clarity, Intent and Readability). Options A, B and C do not meet this LHS declaration Java-Style rule.

A side effect of this is that the type of foo should be able to be declared to be a Map, and not a HashMap. Again, options A, B and C have no ability to do this.
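The Map-versus-HashMap point can be illustrated with the existing syntax. This is only a minimal sketch: the class and method names are made up for illustration and do not come from any of the proposals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; all names here are illustrative assumptions.
class LhsDeclarationDemo {

    // Callers depend only on the Map interface, never on the implementation.
    static int totalLength(Map<String, Integer> lengths) {
        int total = 0;
        for (Integer length : lengths.values()) {
            total += length;
        }
        return total;
    }

    static Map<String, Integer> hashed() {
        // The LHS states the type the rest of the code works with: Map.
        Map<String, Integer> foo = new HashMap<String, Integer>();
        foo.put("map", 3);
        foo.put("hash", 4);
        return foo;
    }

    static Map<String, Integer> sorted() {
        // Only the RHS changes when the implementation is swapped.
        Map<String, Integer> foo = new TreeMap<String, Integer>();
        foo.put("map", 3);
        foo.put("hash", 4);
        return foo;
    }

    public static void main(String[] args) {
        System.out.println(totalLength(hashed()));
        System.out.println(totalLength(sorted()));
    }
}
```

Under options A, B and C as written above, the variable would simply take the type of the RHS expression, so this separation between declared type and implementation could not be stated.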

So, I believe that options A, B and C are not acceptable if we are to stick with Java-Style in Java. I understand how dynamic language programmers may have no problems with these options and the kind of type-inference that they represent. I just don't believe that model is Java-Style.

Having ruled out A, B and C, we have D and E. Option D is only really of value in the map/list generics case. As such, it seems too specialist to me.
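For comparison, something close to option D in the generics case can be approximated in current Java with a hand-written static factory method, because type arguments are already inferred for generic method calls. The sketch below is an assumption-laden illustration: the Factories class and its method names are hypothetical, not part of the JDK or of Peter Ahe's proposal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the name and methods are illustrative only.
final class Factories {
    private Factories() {}

    // K and V are inferred from the declared type of the variable being assigned.
    static <K, V> HashMap<K, V> newHashMap() {
        return new HashMap<K, V>();
    }

    // E is inferred in the same way.
    static <E> ArrayList<E> newArrayList() {
        return new ArrayList<E>();
    }
}

class FactoryDemo {
    static Map<String, List<Integer>> build() {
        // The generic arguments appear once, on the LHS only.
        Map<String, List<Integer>> foo = Factories.newHashMap();
        List<Integer> values = Factories.newArrayList();
        values.add(42);
        foo.put("answer", values);
        return foo;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

Note that this helps only with the generic arguments; for a plain class such as SomeVeryLongClassname the class name is still written twice.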

Option E has value in both the long class name and map/list generics cases. Dropping the RHS type doesn't clash with my proposal of defining the LHS as a key element of Java-Style. And without the RHS, object creation becomes significantly shorter.

(Option E2 is a really technical point to do with future reification of generics - not that important for this post.)

The downside of E is that it makes no difference when a factory or method is used on the RHS. But that's the inevitable result of defining the LHS as essential in Java-Style. This is tackled by E3, but again at the cost of losing LHS information, potentially breaking the Java-Style rule we've strived to create.

Summary

I believe that Java has a certain style, and it's not the same as scripting languages. Part of that style is that the LHS of a declaration completely specifies the type. This rules out the options discussed elsewhere (A, B and C) and leads naturally to my favourite E (although I might be persuaded to accept E3).

As always, feedback is welcome. If possible, could you mention whether you regularly use a dynamic/type-inferring language, as I suspect views on this topic vary by how much people use, or have used, those languages.

25 comments:

  1. option D is the coolest, lol :D
    SomeVeryLongClassname foo = SomeVeryLongClassname.new();

    I really like
    SomeVeryLongClassname foo = new ();
    The only problem is that it won't work on interfaces, but you can always use the class and return it from a method as the interface.

  2. First, I agree about A to C: they not only look foreign, but also place unnecessary restrictions on defining variables and encourage preferring concrete classes over interfaces (as that is the shorter syntax).

    Option E looks inconsistent and awkward to me. Especially having these cases:

    (1) ConcreteClass foo = new ();
    (2) Interface foo = new ConcreteClass<>();

    There is nothing Java-ish in (2). I wonder why both belong to the same proposal, as they have little in common. I'd rather see some mix of D and E:

    (3) ConcreteClass foo = new();
    (4) Interface foo = ConcreteClass.new();

    In (3), inference can be applied to the concrete type; in (4), inference can only be applied to the generic type, hence one has to name the concrete class anyway. new() would be the method-like variant of new (still a keyword), mapping onto constructors in a similar way. Another thing: it is quite clear in its structure. Cf. inner classes:

    (5) Foo.Bar bar = Foo.new Bar();
    (6) Foo.Bar bar = Foo.Bar.new();

    E2 could actually be seen as equivalent to (3) (despite the reification issue), whereas E3 completely obfuscates the generic part.

  3. Wow, have we really run out of ideas on how to improve java that this is what we're reduced to.. sheesh.

  4. I agree with dave on this. I can use my IDE NOW to autocomplete both occurrences of SomeVeryLongClassname.
    The benefit gained from using short declarations? Perhaps a small fraction of a second saved. The added complexity (which results from being able to do the same thing 600 different ways) outweighs that small benefit.

  5. The new() thing... that's not something I've seen before, and I'm not sure it makes sense to conflate it with the other thing you label option E.

  6. Instinctively, when reading the examples, I went for E first, and D when looking at generics.

    But they both suffer from the fact that Type(LHS) == Type(RHS) which promotes a tighter coupling.

    So even though SomeVeryLargeClassName is ridiculous to write more than once, I prefer the original syntax, especially since tool support makes it easy anyhow.

    I would, however, like to remove constructors altogether and replace them with class factory methods that could decide whether an instance is required in the first place, and hence postpone instantiation into the factory method.

  7. Stephen Colebourne, 24 January 2007 11:26

    Obviously I caused a little confusion with option E. The syntax I was driving at is:

    Concrete foo = new ();
    GenericConcrete foo = new();
    Interface foo = new Concrete();
    Interface = new Concrete<>();

    It probably would have been clearer if I'd separated generic inference (using <>) from type inference. So, let's do that and define option F as:

    Concrete foo = new ();
    GenericConcrete foo = new();
    Interface foo = new Concrete();
    Interface = new Concrete();

    As for the general comments about whether any change here is needed, the argument is mostly about trying to pull variable declarations back onto one line again, as that is more readable than a declaration spanning two lines, or one where the constructor arguments are lost off the RHS of the screen.

  8. Great stuff!!

    I have an issue with "Concrete foo = new();". I fear that refactoring becomes difficult, especially when you extract subclasses.

    At the same time, I don't see why I shouldn't be able to call:

    GenericConcrete = new GenericConcrete();
    Interface = new Concrete();

    As for the motivation, it again strikes me as a stylistic problem. As such, it should be resolved the same way language problems are solved: with conventions. "Days of the week are capitalized, paragraphs are indented."

    Oh, and getting some 21" TFT monitors can also help! :-)

  9. This is like the Null-ignore invocation blogged previously. The net result would be a loss -- a saving of keystrokes at the expense of good programming principles. As described above (e.g. @Neils), if option F was implemented, it would encourage programmers to declare variables by concrete type instead of interface.

    I constantly see newbies code lines like this:

    > ArrayList a = new ArrayList();
    > Vector v = new Vector();
    > HashMap m = new HashMap();
    > Hashtable t = new Hashtable();

    (most newbies I run into have been newbies since 1.1/1.2 days and don't know about generics).

    And I have to explain why they should use

    > List x = new ArrayList();
    > Map x = new HashMap();

    and use Vector and Hashtable as little as possible, if at all.

    Option F is a move in the wrong direction.

    ---

    Assume, for the sake of argument, that we don't use (F) for the Collections API (or java.io). When might we use it?

    Pretty much to replace cases like

    > SomeVeryLongClassname foo = new SomeVeryLongClassname();

    where SomeVeryLongClassname is one of our own constructs. But I would say that there should not be (relatively) many of these lines in code. Certainly they will appear in isolated instances. But shouldn't the new be refactored into a factory method? And shouldn't polymorphism be applied, or at least allowed to foster in the future. Option F goes against this.

  10. Please don't change something that is well known, and where there is only one way to do it.

    If you give more than one possibility to declare variables it will affect code clarity.
    If you have to read code that uses the new kind of declaration and you have never used it yourself, you will be a bit lost.

    Don't change a program because of lazy people!

    - Does it add functionality? No
    - Does it take less time to write? Yes, maybe, maybe not (use copy/paste if you're so lazy)

    - // option A
    foo := new SomeVeryLongClassname();

    --> Everything in Java is typed (Java 5 brought generics to define types in more detail), and here you go in exactly the opposite direction.

    It's the same for option B and C.

    // option D
    SomeVeryLongClassname foo = SomeVeryLongClassname.new();

    --> An object needs to be instantiated before you can call methods on it. And here it seems you call a method instead of instantiating it. "new ..." is clear, don't change it!

    // option E
    SomeVeryLongClassname foo = new ();

    --> And what about inheritance?
    Example :
    SomeVeryLongClassname foo = new ();
    SomeVeryLongClassname foo = new SomeVeryLongClassnameExtended();

    You need two different notations, that's not really beautiful.


    Don't change something that has worked for years. Add functionality, improve performance, but please, don't change good things!

  11. Stephen Colebourne, 24 January 2007 23:23

    A number of comments here have argued against making a change here at all (for type inference). Overall, I believe I agree.

    The main point of the blog was to highlight options A, B and C which are popular elsewhere (option A comes from James Gosling apparently). I very much oppose A, B and C as just not Java. Peter Ahe suggested option D, but I've yet to understand the point of it.

    And option F while neat for some scenarios is not neat for others. When examined further it doesn't greatly enhance the language, but does make it more inconsistent - not a good combination.

    However, some generics inference would be useful:
    Map<String, List<Integer>> foo = new HashMap<>();
    This is the area that I'd like to see tackled, not general object creation.

  12. Are there people actually arguing against type inference? Slava is right. Java Joe is functionally retarded.

  13. Sorry about the above post - it did say HTML syntax on - honest

  14. @Kitounet: "--> An object needs to be instantiated before being able to call methods on it. And here it seems you call a method in place of instantiating it. "new ..." is clear, dont change it!"

    That is not true. Any static method is called directly on a class, as the syntax Classname.new() implies new to be a static factory method.

    @gjfdh: The main reason for the short syntax is not to save typing but to save repetition. Especially of lengthy generic statements, where inference is available already when calling methods. Having a static factory method, this would come for free.

    @Howard: Why is it more common to omit the LHS of an assignment and how do you know? How would you assign to interfaces using an RHS-declaration only? New keywords always will bring problems with existing code using the name for variables.

    @Stephen: Peter's proposal essentially extends the inference that already exists for methods to instance creation as well. I'm not sure about all the automatic creation of static methods, or about allowing new as a name for static methods that override or create constructors. Making them aliases for constructors would suffice, IMHO. Awkward notations such as the empty <> should be avoided as non-Java.

  15. @62.143.166.84

    1. There are more use cases if the type on the left is omitted rather than on the right, e.g.:

    Now
    final Type name = new Type();
    final Type name = methodCall();
    for ( final Type name : collection ) { ... }

    Omit on right
    final Type name = new();
    final Type name = methodCall();
    for ( final Type name : collection ) { ... }

    Omit on left
    final name = Type.new();
    final name = methodCall();
    for ( final name : collection ) { ... }

    Only the omit on left gets all 3 use cases.

    2. Assignment to interfaces: Firstly, the advice of programming to interfaces is aimed at external visibility, not internal. It is a variation on Jonathan Postel's Prescription: "Be liberal in what you accept, and conservative in what you send" (he was talking about the Internet). Therefore for locals and private fields programming to the exact class is not a problem, but if you want to you can always do:

    final Interface name = Class.new( ... );

    and you still get inference of the generic parameters.

    3. I think that if a source statement were added to the start of the file that identified the Java version number then new keywords could be added. If a package or import statement following a source statement included into the name space a keyword then this would be name mangled, e.g. assert would become _assert_. This way you can add keywords and equally as importantly you can use a library from another language; that might have used a Java keyword, because that particular word isn't a keyword in the other language. I have suggested this as an RFE but the RFE hasn't gone through the Sun internal reviewing yet and hence isn't in the bug list.

    4. new as a static method: Peter Ahe has proposed that the compiler write a static factory method that calls a constructor, when new is used in Type.new( ... ) expressions, so that you get type inference. I have proposed that new in these expressions is like new Type( ... ) except that it allows type inference, i.e. no extra static method, just a direct call to the constructor.

  16. dont like 99% of the proposals :>
    the think like SomeClass.new() instead of new Someclass()... no comment :+)
    in the moment you can make
    ArrayList a=new ArrayList() ; ok thats bad and you dont like it ? cuz you write 2 times 1 thing ? LOOOL :)
    When i Use Eclipse i make :
    "new"+CTRL+SPACE, [Enter] i get :
    $SomeClass $SomeName=new $SomeClass($Params);
    i write SomeClass=Somethink one time it automatic
    is placed by the other place i do TAB write a Name Tab (del to del params ) and whopy fucking doo :+)
    it works ;>
    ye realy i dont like scary things like :
    // now
    Map<String, List<Integer>> foo = new HashMap<String, List<Integer>>();

    instead of
    foo := new HashMap<String, List<Integer>>();

    but the secound option is scary too i just dont see it.

  17. I like option A :
    x := something() ;

    Especially if something is not required just to be a class constructor. This doesn't really change the typing in Java, x is still explicitly typed, it has the type of the result of "something()", but it does make refactoring much easier in many places.

    That is, I can change the type of x over time as needed, as long as any method calls 'x.foo()' work correctly and I don't need to worry about propagating this to other variables. While interfaces help with this quite a bit, they don't quite achieve the convenience this does.

    It might also be nice to have a type query system so you can find out what type x is at any point, but that's not entirely necessary.

    By the way, option A looks like Sather, and it was very useful in that language. I also use Haskell and am very, very attached to the type inference aspects of the language.

  18. I like to know the type of the variable I'm dealing with so any short form must indicate the variable type. Dynamic typing is just a mess IMO. Hard to read, hard to track.

  19. > A side effect of this is that the type of foo
    > should be able to be declared to be a Map, and
    > not a HashMap.

    Why do you need to declare it as a Map? It is a Map anyway.

    I really like the Scala language's syntax. It combines JavaScript notation with Java's final variables:

    val finalString = ""
    var nonFinalString = ""

    In the above case you can omit the type declaration thanks to type inference.

    You could of course write:

    val finalString : String = ""
    var nonFinalString : String = ""

    If you combined this with the Nice language's option types*, we would have a winner:

    var ?nullableString : String = null
    var notNullString : String = null // doesn't compile

    *) http://nice.sourceforge.net/manual.html#optionTypesJava

  20. Everyone seems to be focusing on object initialization expressions, but anyone who has done lots of Generics in Java knows that the issue simply is not just object allocation expressions, but assigning from method returns, as well as copying method parameters into other locals. It's a PITA.

    My IDE can easily figure out that if I write

    x = foo.methodWithBigUglyGenericReturnType();

    that 'x' should be declared with the return type of that method. In fact, it offers a QuickFix to do just that for me.

    I vote for the var proposal. The static factory proposal solves only a tiny part of the pain of Tiger Generics and is not even a half-way solution.

    The "types must be on LHS" rule is part of the problem, and VAR solves it. VAR is in fact a type, just like Integer, or Object. It means "my type is Placeholder type" which means "Hello, compiler, please type inference me at compile time"

    As for type safety, VAR causes no loss of type safety. If you try to pass a VAR into another function, the compiler will detect a type mismatch. It's not really much different from writing foo(bar()) vs var x=bar(), foo(x). The former case creates an implicit temporary typed label in the compiler anyway.

  21. I like the final notation with (optional) inferred type. Later you could change the final to an interface if you don't want it final anymore.

    PS: I never liked the new operator and always thought that a .new() method notation like the one you have presented would be way neater, or even HashMap(), i.e. with optional new.

  22. @Ray Cromwell

    "Everyone seems to be focusing on object initialization expressions, but anyone who has done lots of Generics in Java knows that the issue simply is not just object allocation expressions, but assigning from method returns, as well as copying method parameters into other locals. It's a PITA. "

    --> What's the mother of the sister of my father ?
    It's my grand-mother!
    What I mean is that if you need to use generics like "<<<>><<>>><>", maybe you need to simplify some things in your code instead of trying to change the language.

    ------------------------------------------

    @Mark

    "I like to know the type of the variable I'm dealing with so any short form must indicate the variable type. Dynamic typing is just a mess IMO. Hard to read, hard to track."

    --> I totally agree.
    If we lose the types of variables, it will be a scripting language. It seems that's what a lot of people here want, unfortunately.

    ------------------------------------------

    @62.143.166.84

    "@Kitounet: --> An object needs to be instantiated before being able to call methods on it. And here it seems you call a method in place of instantiating it. "new ..." is clear, dont change it!"

    "That is not true. Any static method is called directly on a class, as the syntax Classname.new() implies new to be a static factory method."

    --> Yes, you're right, but what I mean here is that you'll lose the meaning of the "new...", the "instantiation"! Calling a static method doesn't imply object instantiation.

    For me, "Classname.new()" means "Hey you! You who don't exist yet, instantiate yourself".
    But "new Classname" means "Me, I instantiate you".

    There must be a difference between instantiation and a method call, because a program must be as easy as possible to understand; there must not be any possible confusion.

    Why do you think we say that French is the language of justice, a rich language?
    Moving a word in a sentence totally changes its meaning.
    If you know French you will understand this:

    1) J'ai un certain doute.
    (Possible translation : I'm not totally sure)
    2) J'ai un doute certain.
    (Possible translation : I have an unquestionable doubt)

    It's the same for the "new".

  23. I thought about two or three things. The first is that it is redundant. Then, that the typing is really important, so something like var can't be used. So

    A a = new();

    seems the best solution. Then, I'm thinking about two things. First, it's more and more rare these days that we instantiate a class and assign it to a variable of that same class. Except for small immutable objects (like BigDecimal), you will more often have

    List l = new ArrayList();

    The variable is an interface. And then, I still agree, it's redundant. BUT, I don't understand why it isn't the IDE that solves that. Why, to pick Eclipse at random, doesn't Eclipse propose constructors of the variable type when I type the keyword new? That would solve it all.

    And completion of the generic typing is needed too. Like:

    A a = new (super combo proposing all A constructors) A();

    or

    List l = new Ar (Ctrl+space: super combo proposing constructor compatible with List)rayList< (auto-completion putting String here) String>();

    It's not a change in Java we need, just basic IDE auto-completion. Maybe in Eclipse 3.3?

    That's how Eclipse managed to get rid of typing class casts. You now do this:

    A a = list.get(); (oh my god, compilation error, Ctrl+1, magic, the typecast is added)

    BTW, in case you're wondering, for me, generics were not built to remove casting; they were built to let us know the content of a data structure. List<String> obviously contains strings. List alone is clueless.

  24. I vote for E,E2.

    But if ... there could be also another idea.

    SomeVeryLongClassname foo();

    Map foo() HashMap;
