Thursday, 25 January 2007

Java 7 - Dot Equals

Comparing two objects is one of the most common tasks developers do. Yet it's one where Java feels rather, well, verbose. I find that all those .equals() calls add clutter rather than clarity. So what can we do?

Groovy equals

One of the key changes that Groovy made compared to Java was the alteration of the meaning of the == operator. Here's the change that was made:

  // Java
  if (person1.equals(person2)) { ... }

  // Groovy
  if (person1 == person2) { ... }

For those who don't know Groovy, these two extracts of code do essentially the same thing. Groovy simply makes the == operator call the equals() method. What about the not-equals case?

  // Java
  if (!person1.equals(person2)) { ... }
  // or
  if (person1.equals(person2) == false) { ... }

  // Groovy
  if (person1 != person2) { ... }

And when you actually want to check for identity, that is, whether two references point to the same object?

  // Java
  if (person1 == person2) { ... }

  // Groovy
  if (person1 === person2) { ... }

So, Groovy uses two equals symbols, ==, to mean equals() and three equals symbols, ===, to mean identity.
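
In Java terms, the distinction drawn above is between equals() (do two objects hold the same value?) and == on references (are they the very same object?). The following is a minimal sketch of that difference in today's Java. The SketchPerson class and its surname field are hypothetical and exist only for illustration.

```java
import java.util.Objects;

// Hypothetical value class, used only to illustrate equals() versus ==.
final class SketchPerson {
    private final String surname;

    SketchPerson(String surname) {
        this.surname = surname;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SketchPerson)) {
            return false;
        }
        return Objects.equals(surname, ((SketchPerson) other).surname);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(surname);
    }
}

final class EqualsVersusIdentity {
    public static void main(String[] args) {
        SketchPerson person1 = new SketchPerson("Smith");
        SketchPerson person2 = new SketchPerson("Smith");
        SketchPerson alias = person1;

        System.out.println(person1.equals(person2)); // true: same surname
        System.out.println(person1 == person2);      // false: two distinct objects
        System.out.println(person1 == alias);        // true: one object, two references
    }
}
```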

Personally, I find the Groovy code using == to mean equals() to be a lot clearer. But, clearly Java cannot be improved in the same way. Redefining the meaning of the == operator would break lots and lots of existing code and be very confusing.

Dot equals

I do believe there is a possible alternative that fits Java style. The concept is to add a new operator to the language that maps to the equals() method:

  if (person1 .= person2) { ... }

So, dot-equals is a new operator in the language, consisting of a dot immediately followed by an equals symbol. This compiles to the following standard Java code:

  if (person1.equals(person2)) { ... }

The advantage of the dot-equals operator is that it retains the visibility of the dot. The dot allows existing Java developers to link the operator to the actual method call going on in the background. The dot also provides a hint that an NPE could occur at that point.

However, it is also possible to eliminate the NPE completely from the dot-equals operator. To achieve this, the code would compile to something like this instead:

  if (System.nullSafeEquals(person1, person2)) { ... }

Given how many NPEs we suffer from, this seems like a Good Thing. But it does make the meaning of the dot-equals operator more complex. Opinions welcome on whether to eliminate NPEs or compile straight to equals().
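
To make the null-safe option concrete, here is a rough sketch of what such a helper might look like. System.nullSafeEquals does not exist in the JDK, so the class and method below are hypothetical stand-ins. Java 7 did later add java.util.Objects.equals(Object, Object), which behaves the same way.

```java
final class NullSafeEqualsSketch {

    // Hypothetical stand-in for the System.nullSafeEquals method named in the text.
    static boolean nullSafeEquals(Object a, Object b) {
        if (a == b) {
            return true;            // same object, or both null
        }
        if (a == null || b == null) {
            return false;           // exactly one side is null
        }
        return a.equals(b);         // both non-null: delegate to equals()
    }

    public static void main(String[] args) {
        System.out.println(nullSafeEquals(null, null));               // true
        System.out.println(nullSafeEquals(null, "Smith"));            // false
        System.out.println(nullSafeEquals("Smith", null));            // false
        System.out.println(nullSafeEquals("Smith", new String("Smith"))); // true
    }
}
```

Unlike a plain a.equals(b), this version gives the same answer whichever side is null, at the cost of the operator no longer being a direct method call.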

Finally, what about the not equals concept? This is where things are not so pretty:

  // option A
  if (!person1 .= person2) { ... }

  // option B
  if (person1 !.= person2) { ... }

  // option C
  if (person1 .!= person2) { ... }

I have to say that option A misses the point for me - it's too easy to lose the ! symbol. Neither option B nor option C is beautiful, but both are viable. If you've any other suggestions there, I'd love to hear them.

Summary

So, this is a proposal for a .= operator:

  // existing
  if (person1.getSurname().equals(person2.getSurname())) { ... }

  // proposed
  if (person1.getSurname() .= person2.getSurname()) { ... }

So, what am I missing? Do you love it or hate it? Is it clear and readable, or really obscure? Feedback welcomed of course.

22 comments:

  1. Oh man, ur suggestion is so bad.
    "dot" usually refers to a field/method/property.
    ur suggestion makes "=" look like a member operator of the object. Maybe u can call ".=" the short-hand of ".equals", but this makes the thing look more weird. ".>=" and ".<=" should also be considered if ".=" is allowed... But then it turns into the topic of simple operator overloading in the Java language.

  2. I like the idea of shortening to a concise operator, but agree with the above commenter that .= may be confusing. Looking at my keyboard, though, I can't find anything else that would better it. It kind of says what it does on the tin.

    The problem, as you stated, is that changing to Groovy's implementation would be a migration nightmare.

  3. Stephen Colebourne, 26 January 2007 at 12:18

    @Marc, As you asked, I have about 10 years experience of Java as my main development language, on both client and server. Please bear in mind that this blog is about having ideas and being free to talk about them. Disagreement is expected from time to time!

    For example, I don't believe that another operator is going to confuse Java developers excessively. Especially when it does what it looks like it does.

    @203, Yes, dot usually does refer to a field/method. That's the whole point of this proposal - to emphasise that a method is being called!

  4. i dont like the .= idea..
    the groovy way is perfect but it breaks existing code BUT YOU HAVE GROOVY MAN use iT ! :)
    crazy proposal :
    you "cant" use "=" in a logical operation cuz it means something else.. so the "=" in a logical operation is Clean - no existing code is using it ..
    so Why cant we make = ( to be .equals() where a boolean result is expected ).

    other idea : did you forget <> ? :)
    if ( something <> something2 ) - this is != :)
    if ( ! something <> something2 ) - yeah we have it.

  5. The main idea is to have a syntax which normally doesn't appear in regular code.

    I think that a programmer could delete the first "name" from the code below by mistake and only notice it too late (after potentially ages of debugging):

    if (this. = other.name) {
    // wow!
    }

    I think we should use a character that would never appear. Maybe the tilde?

    if ("quit" ~ other.getActionCommand()) {
    // equals behaviour
    }

    I don't know. Actually, this is a problem I don't care much about. A much more meaningful improvement would be a decent switch syntax.

  6. After years of programming in Java, calling .equals() is actually fine. After all, with Eclipse it's at most, what, 2-3 keystrokes? There's no speed bump anymore when an experienced Java programmer sees "equals()" these days.

    I do like Groovy's way of differentiating equality and identity though. How JavaScriptish! I have to admit, I can't come up with a better idea than your (bad) idea of adding this operator to the Java language. But then again, typing .eq is not that big of a deal.

  7. Stephen Colebourne, 26 January 2007 at 17:00

    @Ray, This isn't about keystrokes, but about readability. The summary examples show how the equals() method tends to 'blend' into the other methods. An operator blends far less.

    @Tiago, I don't think that this.=other.name is a likely bug, however it isn't something I'd considered. I'm definitely open to alternative characters, but tilde means 'almost equals' in my eyes.

    @All, Is this really too radical for Java developers? Or is it just that you don't like the .= syntax?

  8. I'm 203, sorry for missing the name in the post above.

    "@203, Yes, dot usually does refer to a field/method. That's the whole point of this proposal - to emphasise that a method is being called!"

    I'm not objecting to the short-hand of .equals(). I believe that if something is used very frequently, we should make it easier to use. One example in the JDK is the plus (+) operator as a short-hand for String.concat().

    To make the Java language as consistent as possible, the ".=" should be ".==". This is because = is the assignment operator and == is the isEqual boolean operator. The .= will (or may; I believe that Java developers are clever) mislead developers into thinking that it is a "member assignment method".
    ".==" is more consistent with the existing language. Furthermore, if .= is added, the + operator of String should be changed to .+ in order to emphasize that it is a method of the String class.

    One step further, what will happen when someone calls .= on a primitive? Autoboxing, or something else?
    It is undefined in ur proposal, if I read it correctly.

    IMHO, why not use some new operator to do this job? Something like:

    "~=" similar:
    .equals() can't be sure that two objects are really equal; it can only prove that they are similar.

    "<>" not equals(): borrowed from VB.

    However, it will complicate the existing Java language a bit.

    As I mentioned before, besides the short-hand of .equals(), the proposal should look further at the possibility of a short-hand for Comparable.compareTo().
    Something like ".>= .<= .< .>" using ur dot syntax. Hence, developers can write numeric expressions more easily, without most of the side-effects of operator overloading. I think developers will benefit greatly from this.

  9. The first thing to learn from this: if you want to discuss rather than be flamed, don't call something a proposal ;)

    The notations == and === are not only in Groovy; PHP and Ruby have them too. Other languages like Smalltalk and Eiffel don't have this problem, as a simple = is not assignment.

    A .= operator is something I would hate to see in Java. Maybe, the language is a bit verbose at some places, but I prefer verbosity over hidden operator overloading syntax. Operators that are used to hide complex semantics (e.g. the null-safe calls proposed earlier) usually obfuscate things rather than simplifying them.

  10. Honestly, after using the Groovy syntax, I'm really not sure why you came up with such a goofy syntax (.=).

    If this is to be done at all, I would think === would be the best choice.

  11. Stephen Colebourne, 27 January 2007 at 01:10

    I do appreciate all the comments, even the negative ones!

    @Jess, I think that === implies 'very equal', and certainly more equal than ==. That's why I don't think we could use === here.

    @Stefan, Do you always call String.concat manually? Or do you use +? (BTW, the + overload for strings is also null-safe...)

    @Mm, The syntax .== may indeed make more sense - you correctly point out that .= could be interpreted as assignment.

    And I agree that there is a broader operator overloading issue here - the dot prefix could enable .+ on numerical values for example. However I don't want to stray into the overloading debate here. Equals is used commonly enough on its own to warrant an operator.

    For primitives, if both sides are primitive, then compare as primitives, else compare as though boxed.
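
The rule above has a visible consequence when the two sides have different numeric types. This is a small sketch in plain Java of how a primitive comparison and a boxed comparison can disagree. The class and method names are hypothetical, and how a real .== operator would treat mixed types is left open here.

```java
final class BoxedComparisonSketch {

    // Both sides primitive: int is widened to long, then the values are compared.
    static boolean compareAsPrimitives(int a, long b) {
        return a == b;
    }

    // At least one side is an object: compare via equals() on the boxed values.
    static boolean compareAsBoxed(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        int i = 1000;
        long l = 1000L;

        System.out.println(compareAsPrimitives(i, l));  // true: same numeric value
        System.out.println(compareAsBoxed(i, l));       // false: Integer vs Long
        System.out.println(compareAsBoxed(i, 1000));    // true: Integer vs Integer
    }
}
```

Integer.equals() only returns true for another Integer, so "compare as though boxed" is stricter than a primitive comparison whenever the types differ.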

  12. I guess the syntax is a little jarring because it brings us back to the C++ days, with their operator-overloading member functions. Ughh, "obj.operator==(something)"!

  13. "Do you always call String.concat manually? Or do you use +? (BTW, the + overload for strings is also null-safe...) "

    No. I use StringBuilder, whenever it seems valuable. But that's no argument for introducing more shortcuts. I'd rather have proper operator overloading or an infix notation syntax than an arbitrary number of new symbols.

    Also note that the == operator is commutative, while simply calling equals is not. Hence, to have it defined similarly to ==, a .= b would have to be equivalent to the following:

    (a == null && b == null) || (a != null && a.equals(b))

    And I am not sure that this would be helpful or easy to remember when actually looking for equality on objects.

  14. Why not stray into the broader operator overloading issue? What issue? Operator overloading is the pinnacle of object-oriented development (assignment occurs depending whether such operation is supported for the assigned type in question). Test occurs using the same basic rules. It is type safe, and it makes code in many emerging domains (for Java) easy to parse for the *human*.(say linear alg.)

    Your option is to just incrementally introduce yet another half-baked "solution".

    Of course operator overloading elicits an avalanche of angry protesters each clamoring for the long-gone simplicity of the Java language, blah blah blah.

    IMO, one should refrain from proposing solutions which just tip-toe into the big ocean of capability. Operator overloading is one such ocean, and if C++ is any indication, its developers (myself included ... and I use Java as well, quite a bit) enjoy every molecule of it!

  15. What about ~~ for a null-safe equals and !~ for !equals?
    It should also work for primitives, where it would have the same meaning as == and !=.
    This could put an end to all of the auto-boxing surprises that happen nowadays.

  16. With all due respect for all those who are proposing to actually overload the meaning of the symbol for equality, I think it is the wrong overloading approach. One should be able to overload the operation, and not the semantics of a given symbol.

    i.e.,

    ~ is typically associated with "approximately".

    My great hope is that Java designers will add operator overloading; Java deserves it.

  17. You're all wrong. I have 15 years of real-world COBOL experience, 10 with .net, and 5 with Java. I've written more real world applications than you've all written lines of code (real world or not). In the real world, language features don't matter, but outsourcing, regulatory compliance, SOA and holistic people centric approaches do.

  18. a == b means "a" is the same thing "b" is. For references it means that they reference the same object. For primitives it means that they are the same value.
    a ~~ b could mean "a" is equivalent to "b". For references, it means that their referenced objects may not be the same, but may be used interchangeably. The same is true for primitives, although equivalent primitives are always the same.

  19. Another approach would be to allow methods with a single parameter to be called without the "." and the parentheses. Thus, instead of:

    a.equals(b)

    one could write:

    a equals b

    A special rule for boolean methods could allow this:

    a !equals b

    which would be equivalent to:

    !(a equals b)

    The only problem is that it isn't null-safe.

  20. Stephen Colebourne, 28 January 2007 at 23:08

    I have implemented this change in GPL javac. This didn't turn out to be too hard, and is very educational in terms of understanding how the bytecodes we use each day are put together.

    Following this discussion, I used .== as the operator, although it is in fact relatively easy to change.

    The most interesting part was that once implemented and in use, you find you absolutely have to use the null-safe version for it to make sense.

    If I get round to it, I might try and get this included in the Kitchen Sink Language - https://ksl.dev.java.net/.

  21. It's too bad that new keywords are verboten -- I think the most readable alternative would be to use == to mean .equals() and to use the keyword 'is' to denote testing for the same object

    a == b // tests via a.equals()

    a is b // tests via object identity

  22. I don't mind .==, or the general concept, but I definitely suggest you take it the one step further.
    If you think "if (o1.equals(o2))" can become a little hard to read, what about "if (o1.compareTo(o2) < 0)"?
    I say if a class implements Comparable, then going by the same convention you can also use
    .<, .>, .<=, .>= (heck, do we even need the dots in this case?)
    This avoids one of the main arguments against operator overloading as all of the above are going to call the same method to ensure consistent results, e.g. not like C++ where you would overload < and > and <= etc separately.
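
As a rough illustration of the point about consistency, the four comparison operators suggested above could all be defined in terms of a single compareTo() call. The helper names below are hypothetical; they simply spell out in today's Java what .<, .<=, .> and .>= might compile to.

```java
final class CompareOperatorsSketch {

    // a .< b
    static <T extends Comparable<? super T>> boolean lessThan(T a, T b) {
        return a.compareTo(b) < 0;
    }

    // a .<= b
    static <T extends Comparable<? super T>> boolean lessOrEqual(T a, T b) {
        return a.compareTo(b) <= 0;
    }

    // a .> b
    static <T extends Comparable<? super T>> boolean greaterThan(T a, T b) {
        return a.compareTo(b) > 0;
    }

    // a .>= b
    static <T extends Comparable<? super T>> boolean greaterOrEqual(T a, T b) {
        return a.compareTo(b) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(lessThan("apple", "banana"));       // true
        System.out.println(greaterOrEqual("apple", "banana")); // false
        System.out.println(lessOrEqual(2, 2));                 // true
        System.out.println(greaterThan(3, 2));                 // true
    }
}
```

Because every helper delegates to the same compareTo(), the four results cannot contradict one another, which is the consistency argument made in the comment.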
