Tuesday, 16 January 2007

Property access - is it an expression language?

There is lots of talk about a simpler syntax for getting and setting properties. But what are the implications?

Property value access

The basic concept is simple. The proposers of this idea (not me!!!) use a simple get or set example:

  // current code
  person.setSurname( input.getName() );

  // proposed change - not from me!
  person.surname = input.name;

As a simple example, this works fine. It's effectively an operator overload for get/set methods. I don't think it would be disastrous if implemented - just a little weak and incomplete.

First, some specific issues with it. Problem one is that the syntax clashes with public fields on any existing class. The compiler will have to have specific rules to decide whether to pick a field or a property. Or alternatively, you can't declare both with the same name. Or a syntax other than dot could be used (which has other downsides).
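A minimal sketch of the clash in today's Java. The LegacyContact class and its return values are hypothetical, chosen only to make the two access paths visibly different:

```java
// Hypothetical legacy class exposing both a public field and a getter
// with the same property name.
class LegacyContact {
    public String surname = "from-field";

    public String getSurname() {
        return "from-getter";
    }
}

class FieldClashDemo {
    // In today's Java, dot access always reads the field.
    static String viaField(LegacyContact contact) {
        return contact.surname;
    }

    // The getter has to be called explicitly.
    static String viaGetter(LegacyContact contact) {
        return contact.getSurname();
    }
}
```

If contact.surname were also allowed to mean contact.getSurname(), the compiler would need a rule to choose between these two results.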

Problem two is that the code is no longer 'transparent' with respect to what it actually does. A basic field assignment statement doesn't fail (yes, yes, I know it can, but 99.9% of the time it doesn't). With this change, what looks like an innocuous field assignment is actually a method call, which may throw an exception.
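To make that concrete, here is a small sketch. The ValidatedPerson class and its validation rule are assumptions for illustration, not part of any proposal:

```java
// Hypothetical bean whose setter validates its argument.
class ValidatedPerson {
    private String surname;

    public String getSurname() {
        return surname;
    }

    public void setSurname(String surname) {
        if (surname == null || surname.isEmpty()) {
            throw new IllegalArgumentException("surname must not be empty");
        }
        this.surname = surname;
    }
}

class TransparencyDemo {
    // Returns true if the setter rejected the value with an exception.
    static boolean setterRejects(String value) {
        ValidatedPerson person = new ValidatedPerson();
        try {
            // Under the proposed syntax this line would read:
            //   person.surname = value;
            person.setSurname(value);
            return false;
        } catch (IllegalArgumentException ex) {
            return true;
        }
    }
}
```

Written as person.surname = value, nothing at the call site would hint that the statement can throw.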

Problem three is related to two - the get and set methods could end up being remote operations performed across a network. Obviously, a remote operation may have performance and scalability issues, and should be thought about a bit more. This is very hidden from the calling code (although it could be argued that a get/set method hides it just as much).

Problem four is null handling. Consider a more complex example:

  address.postcode = input.address.postcode;

What happens if input or address is null? Do we care? Or do we just want to assign null to address.postcode? This is related to the issue I tackled in null-ignore invocation.

Problem five is scope - the proposal only covers basic get/set. But real applications use Lists and Maps. And they need indexed and mapped access too.

  List<Person> people = ...
  Person person = people[0];

  Map<String, Integer> numberToBook = ...
  Integer adultsToBook = numberToBook["Adult"];

  Requirements reqs = ...
  Integer adultsToBook = reqs.flight[0].numberToBook["Adult"];

Suddenly, the scope of the change has grown dramatically to cover List.get(int), Map.get(K) and Map.put(K, V).
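For comparison, this is roughly what the last example has to look like in today's Java. The Requirements and Flight classes are hypothetical shapes inferred from the path reqs.flight[0].numberToBook["Adult"]:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bean shapes, assumed for illustration only.
class Flight {
    private final Map<String, Integer> numberToBook = new HashMap<>();

    public Map<String, Integer> getNumberToBook() {
        return numberToBook;
    }
}

class Requirements {
    private final List<Flight> flight = new ArrayList<>();

    public List<Flight> getFlight() {
        return flight;
    }
}

class IndexedAccessDemo {
    // Hand-written form of: reqs.flight[0].numberToBook["Adult"]
    static Integer adultsToBook(Requirements reqs) {
        return reqs.getFlight().get(0).getNumberToBook().get("Adult");
    }

    // Hand-written form of: reqs.flight[0].numberToBook["Adult"] = count
    static void setAdultsToBook(Requirements reqs, int count) {
        reqs.getFlight().get(0).getNumberToBook().put("Adult", count);
    }
}
```

The read side maps onto List.get(int) and Map.get(K), and the write side onto Map.put(K, V), so any property syntax would have to know about all three.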

A full solution

So, we need a different syntax, to meet these goals:

  • no conflict with existing Java syntax
  • is isolated enough that we can be aware that an error might occur on assignment
  • can handle nulls along the way, optionally ignoring them
  • can handle indexed and mapped access

To me, this more complete set of requirements is a long way from simply an operator overload for get and set methods. These goals remind me of JSP/JSF EL. Or OGNL. These are expression languages where there is a non-Java syntax dedicated to accessing and manipulating bean properties.

Now, non-Java languages, such as Groovy, don't have a problem with this. They can design their language around the need for property and path access.

But, making the change in Java is a big deal - it's almost like adding a new sub-language within the Java language. The question is whether Java can take that level of change.

A separate sub-language

So perhaps we should be honest and admit that Java isn't suited for defining these expressions. Instead, embed a new expression language, delimited by special tokens:

  String email = ${contact.email};

  ${contact.email = input.email};

  ${address.postcode = input#address#postcode};

  List<Person> people = ...
  Person person = ${people[0]};

  Map<String, Integer> numberToBook = ...
  Integer adultsToBook = ${numberToBook["Adult"]};

  Requirements reqs = ...
  Integer adultsToBook = ${reqs.flight[0].numberToBook["Adult"]};

This seems to meet the goals in a much cleaner fashion. The expression is clearly delineated, using ${...}, to indicate it follows different syntax rules. It can interact fully with all the enclosing variables. In fact, it would be compiled to bytecode, so the resulting code would be indistinguishable from the equivalent hand-written method calls. Nulls can be ignored, here with a # syntax. Lists and maps are dealt with using a syntax that is familiar from other ELs.
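As a sketch of what a compiler might generate for ${address.postcode = input#address#postcode}, assuming that # means "yield null if the left-hand side is null" (my reading of the syntax, not a specification). The PostalAddress and FormInput classes are hypothetical:

```java
// Hypothetical beans, assumed for illustration only.
class PostalAddress {
    private String postcode;

    public String getPostcode() {
        return postcode;
    }

    public void setPostcode(String postcode) {
        this.postcode = postcode;
    }
}

class FormInput {
    private PostalAddress address;

    public PostalAddress getAddress() {
        return address;
    }

    public void setAddress(PostalAddress address) {
        this.address = address;
    }
}

class ExpressionDesugarDemo {
    // One possible expansion of: ${address.postcode = input#address#postcode}
    // Each # step is assumed to short-circuit to null instead of throwing.
    static void copyPostcode(PostalAddress address, FormInput input) {
        PostalAddress source = (input == null) ? null : input.getAddress();
        String postcode = (source == null) ? null : source.getPostcode();
        // The left-hand side uses a plain dot, so a null address still fails here.
        address.setPostcode(postcode);
    }
}
```

Under this reading, a null input or a null input.address simply assigns null to address.postcode rather than throwing a NullPointerException.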

It does rather feel like the additional ${} characters get in the way though.

Summary

Groovy has shown that you can embed a full property access syntax, GPath, within a syntax like Java. But is it viable to change Java by that degree? The expression-language concept allows a new syntax to be used in Java that doesn't clash with existing assumptions in the language.

I'm not sure I'm sold on the concept though - for a start it just seems to look a bit ugly. All I hope to do is to consider some of the issues involved, and hopefully trigger other ideas elsewhere. As usual, your thoughts are welcomed.

6 comments:

  1. This was the first one of the many dispatches in the property-related froth that made me say "ooo". I think you're on to something here.

  2. Stephen, I hope you don't get the feeling I'm picking on you. I understand you're on a very difficult quest, and I'm trying to help the best I can.

    As far as I know, public (and even protected) fields are considered harmful. Everybody already uses getters and setters, which might throw exceptions. Nobody thinks that a getter might trigger a remote operation anymore. input.getAddress().getPostcode() has the same issues as input.address.postcode.

    Isn't scope the only real-world issue?

  3. Stephen Colebourne, 16 January 2007 at 18:21

    @Tiago, All feedback is good ;-) I'm just trying to publicly brainstorm some of the issues involved. I'll certainly make mistakes, and others will have better ideas. And I'm not sure where it'll end up yet either!

    I think you are optimistic in your assumptions about public/protected fields and get/set remote operations. Certainly, while the blog/internet/OO world may recognise the value of these ideals, I bet much real code doesn't. Plus when you are within a class, you do of course access the field directly using a syntax that clashes with property access.

    I certainly agree that input.getAddress().getPostcode() has the same issues as input.address.postcode and blogged about null-ignore invocation a few days ago. But feedback there was not overwhelmingly positive. Yet when accessing nested beans, null-ignore is a real issue - and one that scripting languages all seem to deal with.

  4. @Stephen, about "A separate sub-language" in your example, I already do something like that in my code for dynamic Business Rules. This allows me to solve the null-pointer problem and some other tricks as well :) However, how about using Annotations to include the expressions in the class?

  5. Stephen Colebourne, 17 January 2007 at 21:43

    @Doron, I can't envisage how an annotation helps. This proposal is about embedding the expression inline.

  6. @Stephen, sorry, you're right... I originally thought that an Annotation could hold a sort of "script" that could be used to create a run-time object that once invoked will resolve the expression, but that proves to be just as odd as your example and even less understandable.
