Tuesday, 30 January 2007

Java language - dynamic instanceof

The instanceof keyword is well known and well understood. It certainly isn't the most OO part of Java, but it is one that the vast majority of programs will use somewhere. Is it possible that even this familiar operator could be improved?

instanceof

Let's consider a method that finds all the Strings in a mixed collection that might contain any kind of object:

  public Collection<String> extract(Collection<?> coll) {
    Collection<String> result = new ArrayList<String>();
    for (Object obj : coll) {
      if (obj instanceof String) {
        result.add((String) obj);
      }
    }
    return result;
  }

Nothing strange here. But what if we want to pass in the type to check for at runtime?

  public <T> Collection<T> extract(Collection<?> coll, Class<T> type) {
    Collection<T> result = new ArrayList<T>();
    for (Object obj : coll) {
      if (type.isInstance(obj)) {
        result.add((T) obj);
      }
    }
    return result;
  }
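For anyone who wants to try the runtime-type version, here is a minimal, self-contained sketch. The class name ExtractDemo and the sample data are made up for illustration. It also swaps the (T) cast for Class.cast(), which does the same job without the compiler's unchecked-cast warning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// ExtractDemo is a hypothetical wrapper class, not part of the original post.
class ExtractDemo {

  // Same logic as the method above, but Class.cast() replaces the (T) cast,
  // so the compiler reports no unchecked warning.
  static <T> Collection<T> extract(Collection<?> coll, Class<T> type) {
    Collection<T> result = new ArrayList<T>();
    for (Object obj : coll) {
      if (type.isInstance(obj)) {   // false for null elements
        result.add(type.cast(obj));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Object> mixed = Arrays.<Object>asList("a", 1, "b", 2.5, null);
    System.out.println(extract(mixed, String.class));  // prints [a, b]
    System.out.println(extract(mixed, Number.class));  // prints [1, 2.5]
  }
}
```

Note that the null element is skipped in both calls, because isInstance() returns false for null, just as instanceof does.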

Suddenly, we can't use the instanceof operator. Instead we have to use the isInstance() method on the type itself. What would be neat would be a more dynamic instanceof operator:

  public <T> Collection<T> extract(Collection<?> coll, Class<T> type) {
    Collection<T> result = new ArrayList<T>();
    for (Object obj : coll) {
      if (obj instanceof type) {
        result.add((T) obj);
      }
    }
    return result;
  }

This code sample uses instanceof to check against a Class object held in a variable, rather than a hard-coded compile-time type name. Of course, it's really just syntactic sugar for type.isInstance(obj), possibly with null-handling.
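As a rough sketch of what that desugaring might amount to, the snippet below compares today's instanceof with isInstance() on null values. The helper isInstanceOf and the class NullHandlingDemo are hypothetical names, and treating a null Class reference as "no match" is only one guess at what the null-handling could mean.

```java
// NullHandlingDemo and isInstanceOf are hypothetical names for illustration.
class NullHandlingDemo {

  // One possible expansion of "obj instanceof type": a null Class reference
  // yields false instead of a NullPointerException (an assumption).
  static boolean isInstanceOf(Object obj, Class<?> type) {
    return type != null && type.isInstance(obj);
  }

  public static void main(String[] args) {
    Object text = "hello";
    Object nothing = null;

    // The operator and the method already agree, including for a null object.
    System.out.println(text instanceof String);            // true
    System.out.println(String.class.isInstance(text));     // true
    System.out.println(nothing instanceof String);         // false
    System.out.println(String.class.isInstance(nothing));  // false

    // Only the null-type case needs a decision.
    System.out.println(isInstanceOf(text, null));          // false
  }
}
```

A null object is already handled identically by both forms, so the only open design question is what happens when the Class variable itself is null.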

This strikes me as something that probably should have been in Java from the start, and has minimal potential for damage if added now. However, it's also a pretty specialised operation, rarely used, and has a simple enough workaround.

Thus, I fear this is one for the 'nice idea, but not worth the effort' pile. Any opinions?

7 comments:

  1. The right-hand-operand of instanceof is currently resolved as a type name, not a variable name. Change that rule and you change the meaning of existing code. Therefore the proposal is not backward compatible.

  2. A more general implementation of extract is filter, which looks a little like this:

    public <T> ArrayList<T> filter(Iterable<T> iterable, {T=>Boolean} function)
    {
      ArrayList<T> result = new ArrayList<T>();
      for (final T t : iterable)
        if (function.invoke(t))
          result.add(t);
      return result;
    }

    I'd rather see some form of checked type-switching, either generic methods like Lisp has, or an enhanced switch statement:

    switch (object.getClass())
    {
      case String:
        collection.add(object);
        break;
    }

    Unfortunately, it's quite tricky to 'close' a type in Java, so providing compile-time checking for this switch statement wouldn't work for some cases; i.e., the compiler wouldn't be able to tell that your switch was incomplete.

    I don't have a solution to this, just throwing an idea at the Stephen Colebourne thinking machine.

  3. @Neal: that is not changing meaning to *the user*, only to the compiler. It's not backward compatible in the sense that old compilers will break when they see it. But old programmers won't. :-)

    I also have no problem with the incomplete switch. But I have problems with returning ArrayList :-)

    IMHO, nice idea but probably not worth the effort.

  4. Stephen, I would prefer to have reified generics (see Peter Ahé's blog).

    Then we could write:
    public <T> Collection<T> extract(Collection<?> coll) {
      Collection<T> result = new ArrayList<T>();
      for (Object obj : coll) {
        if (obj instanceof T) {
          result.add((T) obj);
        }
      }
      return result;
    }
    Rémi

  5. Stephen Colebourne, 30 January 2007 at 18:19

    @Neal, Interesting point. I think it could be backwards compatible, as you could write the JLS to favour types over variables in this location in the syntax, but that's not a great idea. That said, it is very rare for a variable and class to have the same name.

    @Ricky, Your enhanced switch statement has given me an idea, which I'll write up soon.

    @Remi, I like your reified version, and that probably is the best solution here. We'll have to see what the other implications of reification are.

  6. I wonder what the hype is about introducing new keywords or symbols, or changing their meaning. I actually prefer to keep it more object-oriented by using Class.isInstance().

    On reified generics, being able to use a T.isInstance() would have the desired effect without having to break instanceof, or, alternatively, object instanceof T.class.

  7. To me, debating the OO-ness of instanceof vs. Class.isInstance is like debating the piety of Paris Hilton vs. Charles Manson. Runtime type-checking can be useful in rare circumstances, but it's usually best to find a way to refactor it out (usually in favor of polymorphism). So yeah, to me it falls squarely in the "not worth the effort" pile. ;)

    Besides, this example would be better served by closures!
