Wednesday, 6 August 2014

StringJoiner in Java SE 8

Java SE 8 added a new class for joining strings - StringJoiner. But is it any good?

StringJoiner

Here is what the Javadoc of the new class says:

StringJoiner is used to construct a sequence of characters separated by a delimiter and optionally starting with a supplied prefix and ending with a supplied suffix.

Sounds good! Maybe we can finally stop using the Joiner class in Guava.

Unfortunately, the JDK StringJoiner is not a general purpose string joiner in the way that the Guava Joiner is. The first clue is in the API:

// constructors
 StringJoiner(CharSequence delimiter)
 StringJoiner(CharSequence delimiter, CharSequence prefix, CharSequence suffix)
 // methods
 StringJoiner setEmptyValue(CharSequence emptyValue)
 StringJoiner add(CharSequence newElement)
 StringJoiner merge(StringJoiner other)
 int length()
 String toString()

There are two constructors, one taking the separator (delimiter) and one allowing a prefix and suffix. There are then just 5 other methods. By contrast, the Guava version has 15 other methods on top of 2 factory methods.

But the real missing thing? A method to add multiple elements at once to the joiner!

Every time I want to join, I have a list, set or other iterable. With Guava I simply say:

 String joined = Joiner.on(", ").join(list);

StringJoiner has no equivalent method. You have to add the elements one by one using add(CharSequence)!

 StringJoiner joiner = new StringJoiner(", ");
 for (String str : list) {
   joiner.add(str);
 }
 String joined = joiner.toString();

I think we'd all agree that rather defeats the purpose of having a joiner at all!

However, it turns out that it is kind of possible to add multiple with the JDK, but you might not spot it:

 String joined = String.join(", ", list);

So, not too bad then?

Firstly, I don't expect the method to actually perform a useful join to be on String, I expect it to be on StringJoiner. The method on String is not referenced from StringJoiner at all.

Secondly, the method on String is static, whereas the Guava method is an instance method. This means that the Guava method can pickup additional state from the builder phase of the joiner, such as the ability to handle null. The Guava joiner can in fact handle Map joins as well thanks to its clever immutable instance-based design.

Thirdly, StringJoiner only works on CharSequence. By contrast, Guava's Joiner works on Object, which is much more useful in most circumstances.

Rationale

So, why was StringJoiner written this way?

Well, partly, it is just bad API design. But the reason why no-one noticed is because you are not supposed to actually use the class!

The whole StringJoiner API is designed to be a tool used as a Collector, the mutable reduction phase of the new Java SE 8 stream API. In this context StringJoiner itself is not visible:

 String joined = list.stream()
   .map(Object::toString)
   .collect(Collectors.joining(", "));

In the simple case, this is longer than Guava and less discoverable, plus I had to manually map to a string. However, in more advanced stream cases it is a great tool to have.

The other advantage of StringJoiner over Guava Joiner is that it handles prefixes and suffixes. This is actually really useful, the classic example being to output the '[' and ']' at the start and end of a list. Ideally, Guava would add prefix and suffix handling to their Joiner.

The good news is that some of the flaws in StringJoiner can be mitigated in a later JDK version. However, since StringJoiner is fundamentally stateful and mutable it will never be comparable to Guava's Joiner.

Commons-Lang

Amusingly, for many of the day-to-day tasks in string building, the class I developed in Commons-Lang over 12 years ago, StrBuilder is still the best option. It takes the concept of StringBuilder class and adds many additional methods. Relevant to this discussion is:

 return new StrBuilder()
   .append("something")
   .append(somethingElse)
   .appendWithSeparators(list, ", ")
   .toString();

Note how the joining occurs naturally within the middle of a fluent set of method calls. Neither Guava nor JDK joiners can be used in this way.

Summary

The Java SE 8 StringJoiner class is in my opinion nothing more than a behind-the-scenes tool. It should only be used indirectly from String.join() or Collectors.joining(). If you use it directly you are liable to be frustrated.

Personally, I plan to continue using the Guava joiner, unless I am performing a mutable reduction of a stream.

5 comments:

  1. Multiple addition is handled by list.forEach(joiner::add);

    ReplyDelete
  2. Why are there so setters and getters for the delimiter, prefix and suffix? Because it's supposed to be an internal class?

    ReplyDelete
  3. JDK once again devs fail to design APIs that allow doing simple things in a simple way. Just compare the over-general multiline verbosity above with Scala:

    val list = List(1, 2, 3)
    list.mkString("[", ", ", "]")

    > [1, 2, 3]

    ReplyDelete
  4. The java code is more verbose because it uses a flexible API to a specific task. Scala has some in-built api sugar for this specific scenario, of course it's more concise. It can be achieved in java in no time though.

    ReplyDelete
  5. It's not much good joining strings without a test to prove that they can be split again, even if a string can be empty or contain the delimiter. A string should only be eligible for reversible joining if the programmer affirms that it has been quoted or escaped or validated or there is some other reason why it cannot cause a delimiter collision, for example if numbers are being joined (using a delimiter and a locale where the decimal point does not happen to be the same as a comma separator).
    The StringJoiner also does not allow the programmer to estimate the capacity to save reallocation and copying.

    ReplyDelete

Please be aware that by commenting you provide consent to associate your selected profile with your comment. Long comments or those with excessive links may be deleted by Blogger (not me!). All spam will be deleted.