Wednesday, 23 November 2005

Adding auto casts to Java

OK, so after auto null checks and code blocks what else would make Java that little bit clearer to read and write? What about casts?

Once again, here is an example use case:

public int getAge(Object obj) {
  if (obj instanceof Person) {
    return ((Person) obj).getAge();
  }
  return -1;
}

Note, I've deliberately kept the use case 'stupid'. There are many good OO solutions to this kind of problem, but I believe that the vast majority of Java developers still end up writing nasty casts all the time.

So, what do I propose could be changed in Java 1.6?

public int getAge(Object obj) {
  if (obj instanceof Person) {
    return obj.getAge();
  }
  return -1;
}

The compiler can easily figure out that obj must be an instance of Person in the if statement. So why do we need the cast? The code without the cast is more readable and just as safe.
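To make the "just as safe" claim concrete, here is a minimal sketch of what the compiler would presumably have to generate for the proposed syntax: the same method with the cast written back in. The `Person` class and the `AutoCastSketch` wrapper are hypothetical, added only so the example compiles on its own.

```java
class AutoCastSketch {
    // Hypothetical Person class, defined only to keep the sketch self-contained.
    static class Person {
        private final int age;
        Person(int age) { this.age = age; }
        int getAge() { return age; }
    }

    // What the proposed 'obj.getAge()' would presumably compile to.
    // The instanceof check guarantees that this particular cast cannot fail.
    static int getAge(Object obj) {
        if (obj instanceof Person) {
            return ((Person) obj).getAge();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(getAge(new Person(30)));  // 30
        System.out.println(getAge("not a person"));  // -1
        System.out.println(getAge(null));            // -1, instanceof is false for null
    }
}
```

Because `obj` is not reassigned between the check and the use, the inserted cast can never throw a ClassCastException here.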

I still believe that casts have a useful role to play in Java, when you want to emphasise a conversion, or to cast without an instanceof. But when we've already checked with instanceof, why do we have to repeat ourselves? That's just boilerplate code.

Opinions welcome, as always.

8 comments:

  1. I've actually made the trivial changes to javac to do autocasting. It took about 15 minutes to add in a cast every time it would have normally blown chunks.

    http://www.javarants.com/C1464297901/E1994239229/index.html

    There are a couple of problems that I point out in that article, but on the whole it rocks.

  2. There is one pathological case where this causes ambiguity. Suppose you have a terrible overloaded method that does completely different things depending on the params:
    public void eat(Person person) { person.feed(new Hamburger()); }
    public void eat(Object object) { getHungriestPerson().feed(object); }

    Then the behaviour of this method is not well defined:
    public void feedDependents(Object obj) {
      if (obj instanceof Person) {
        if (obj.getAge() < 5 || obj.getAge() > 90) eat(obj);
      }
    }
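As a rough illustration of why this matters, the sketch below shows how current Java picks between the two overloads: the choice is made at compile time from the static type of the argument. The class name and the returned strings are made up for the example (the methods return a description instead of feeding anyone).

```java
class OverloadSketch {
    // Hypothetical stand-in for the Person type in the comment above.
    static class Person {}

    static String eat(Person person) { return "eat(Person)"; }
    static String eat(Object object) { return "eat(Object)"; }

    public static void main(String[] args) {
        Object obj = new Person();

        // Overload resolution uses the compile-time type of 'obj', which is Object.
        System.out.println(eat(obj));           // eat(Object)

        // With an explicit cast the static type is Person, so the other overload wins.
        System.out.println(eat((Person) obj));  // eat(Person)
    }
}
```

An autocast applied to a bare argument would turn the first call into the second, changing which method runs without any visible change at the call site.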

    This comes up in practice with List.remove(int index) and List.remove(Object value), although it's mostly autoboxing's fault in that case.
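A small sketch of the List.remove case, for readers who have not hit it. The class and helper names are hypothetical; the two `remove` overloads are the real ones from `java.util.List`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListRemoveSketch {
    static List<Integer> sample() {
        return new ArrayList<Integer>(Arrays.asList(10, 20, 30, 1));
    }

    // An int argument selects remove(int index): the element at position 1 goes.
    static List<Integer> removeAtIndexOne() {
        List<Integer> list = sample();
        list.remove(1);
        return list;
    }

    // An Integer argument selects remove(Object o): the first element equal to 1 goes.
    static List<Integer> removeValueOne() {
        List<Integer> list = sample();
        list.remove(Integer.valueOf(1));
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeAtIndexOne());  // [10, 30, 1]
        System.out.println(removeValueOne());    // [10, 20, 30]
    }
}
```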

    Otherwise I think your idea is really great, and these types of problems could be worked out!

  3. Stephen Colebourne, 24 November 2005 00:18

    Thanks for the comments.

    Sam, I read your blog and it's good to know that the changes could be made very easily. I'm not sure that I'd go as far as removing all casts though. I like the idea of only doing it when an instanceof has been used.

    However, Jesse, you make a good point about overloaded methods. I believe that it should be solved simply by calling the more specific overload (Person in the example).

  4. Want this?:

    public int getAge(Object obj) {
      if (obj instanceof Person) {
        return obj.getAge();
      }
      return -1;
    }

    What about this?:

    public int getAge(Object obj) {
      if (obj instanceof Person) {
        if (obj instanceof SpecialPerson) {
          obj = new SomethingWeird();
        }
        // is obj a Person or a SomethingWeird?
        return obj.getAge();
      }
      return -1;
    }

    Admittedly that's a weird case, but you see the point. It's not as straightforward as it looks. It's like 'definite assignment' all over again. It'll need a whole chapter in the JLS.
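To see what goes wrong, here is a runnable sketch of that weird case with the cast a naive autocast would insert written out by hand. All classes are hypothetical stand-ins, and SomethingWeird is assumed not to be a Person.

```java
class ReassignSketch {
    static class Person { int getAge() { return 40; } }
    static class SpecialPerson extends Person {}
    static class SomethingWeird {}  // assumed NOT to be a Person

    static int getAge(Object obj) {
        if (obj instanceof Person) {
            if (obj instanceof SpecialPerson) {
                obj = new SomethingWeird();
            }
            // The cast a naive autocast would insert. After the reassignment
            // above, the earlier instanceof check no longer protects it.
            return ((Person) obj).getAge();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(getAge(new Person()));  // 40
        try {
            getAge(new SpecialPerson());
        } catch (ClassCastException e) {
            System.out.println("ClassCastException after reassignment");
        }
    }
}
```

So a rule of "autocast inside the if block" is not enough; the compiler would also have to track every assignment to the variable, which is where the comparison with definite assignment comes from.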

    OK - so you could do this with magical sugar like the 1.5 foreach loops, but then you wouldn't be able to 'reach out' and change the variable:

    public int getHashcode(Object obj) {
      if (obj instanceof Person) {
        obj = new PersonWrapper(obj); // ctor takes Person
      }
      return obj.hashCode();
    }

    after compiler mangling, that might look like:

    public int getHashcode(Object obj) {
      if (obj instanceof Person) {
        Person $autocasted$obj = (Person)obj;
        $autocasted$obj = new PersonWrapper($autocasted$obj); // say ctor takes Person
      }
      return obj.hashCode(); // obj! not $autocasted$obj
    }

    You could have it insert casts only when the object under consideration is dereferenced - calling a method or accessing a field, and leave it alone when it's "naked".

    void method(Object o) {
      if (o instanceof Person) {
        o.doPersonThing(); // autocast o to Person
        o.personField = 0; // autocast o to Person
        feed(o); // o is still Object to avoid ambiguity with overloading
      }
    }

    but that's not ideal. It's asymmetrical and therefore confusing.

    Probably would be best to add a totally new construct like the foreach loop. Doesn't C# have something like this:

    void method(Object o) {
      if_instanceof(o, Person p) {
        p.personStuff();
      }
    }

    OK - that's a terrible keyword and token arrangement, but the concept is clear and unambiguous.
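For what it's worth, the concept can be approximated as a library method rather than a language construct. The sketch below is one possible shape, not a proposal: `ifInstance` is a hypothetical helper, and it assumes a Java version with lambdas (Java 8 or later). `Class.isInstance` and `Class.cast` are the real reflection methods.

```java
import java.util.function.Consumer;

class IfInstanceSketch {
    // Hypothetical Person class for the example.
    static class Person {
        String personStuff() { return "person stuff"; }
    }

    // Hypothetical helper: runs 'body' with 'o' typed as T when o is a T.
    // Returns whether the body ran.
    static <T> boolean ifInstance(Object o, Class<T> type, Consumer<? super T> body) {
        if (type.isInstance(o)) {
            body.accept(type.cast(o));
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        ifInstance(new Person(), Person.class, p -> log.append(p.personStuff()));
        ifInstance("a string", Person.class, p -> log.append("never reached"));
        System.out.println(log);  // person stuff
    }
}
```

The new name `p` exists only inside the body, so reassigning the original variable cannot invalidate it, which is the property the if_instanceof idea is after.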

  5. That all seems terrible vs. just using the left hand side as the intended type when it is possible to assign the right hand side to it.

  6. Huh? What do you mean, Sam? Can you give a line or two of code to show what you're after? There were no left-hand-sides in any of my code blocks. You mean:

    Object o = getSomething();
    Person p1 = (Person)o; // don't make me type (Person)!
    Person p2 = o; // this? what's so great about this?

  7. Whoops, sorry, Sam - I hadn't read your blog yet. I see what you're getting at. But you see too that your case only applies to assignment, and the point of this was to avoid casts *and* boilerplate assignments.

    Stephen's:
    if (o instanceof Person)
      return o.personStuff();
    else
      return null;

    yours:
    if (o instanceof Person) {
      Person p = o;
      return p.personStuff();
    } else {
      return null;
    }

  8. Stephen Colebourne, 24 November 2005 20:35

    Ouch! Good points. There is no way that I will support a change that makes code unclear to read, and '216' has highlighted a clear case where that could happen: assigning to obj.

    The two proposed alternatives are:

    1) Retain the variable assign:

    if (obj instanceof Person) {
      Person p = obj;
      return p.personStuff();
    }

    This has the problem of 'Why?', i.e. someone new to the syntax would ask 'why are we assigning that variable?' and, more importantly, 'why can't I just inline it?'.

    2) A different syntax:
    if (obj instanceof Person p) {
      return p.personStuff();
    }

    This is better than #1 for just reading the code, but causes issues when scaled or used in other ways:

    boolean b = (obj instanceof Person p);

    if (obj instanceof Person p || obj instanceof Company c) {
      // are both p and c set, or just the one that matched?
    }
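One way to see the difficulty is to write the `||` case by hand under one possible answer to that question: only the binding that matched is set and the other is left as null. This is purely an assumed semantics for illustration, and the classes are hypothetical.

```java
class BindingSketch {
    static class Person {}
    static class Company {}

    static String describe(Object obj) {
        // Hand-written equivalent of
        //   obj instanceof Person p || obj instanceof Company c
        // ASSUMING an unmatched binding is simply left as null.
        Person p = (obj instanceof Person) ? (Person) obj : null;
        Company c = (obj instanceof Company) ? (Company) obj : null;

        if (p != null || c != null) {
            // Inside the block at most one of p and c is usable, so every
            // use would need its own null check.
            return "p set: " + (p != null) + ", c set: " + (c != null);
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Person()));   // p set: true, c set: false
        System.out.println(describe(new Company()));  // p set: false, c set: true
        System.out.println(describe("neither"));      // no match
    }
}
```

Under this reading the body has to test which variable is non-null before using it, which brings back the kind of boilerplate the syntax was meant to remove.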

    Are we really just stuck with lots of casts?
