Stephen Colebourne's blog: scala


Tuesday, 29 November 2011

Real life Scala feedback from Yammer

Following up on my recent Scala posts, some commented that because I haven't used Scala seriously, my opinion is of little value. I responded by noting that writing the FCM closures proposal for Java, altering the javac compiler, and talking at conferences about language design might qualify me to have an opinion.

Having said all that, I do believe that the most valuable feedback comes from those that have tried Scala, and found it wanting.

Today there was a good example of this, the most important kind of feedback. Yammer, the Enterprise Social Network, ~~announced~~ provided an indication that they are moving away from Scala, and back to Java. And what's more, they provided a very detailed rationale. The original gist disappeared, so for certainty I've copied the current version inline here (my bold highlights):

Update: Twitter user coda says "@jodastephen I'm saying you're misrepresenting it. It was a personal email which did not represent the views of my employer. In the same sense your Twitter account isn't an official publication of OpenGamma, neither is my personal email.". The main point appears to be that this isn't official Yammer policy. Personally, I think the contents are more interesting than whether it's an official announcement or not, but YMMV... For the record, here is the original Hacker News thread.

Update 2011-11-30: Coda has explained how this document (originally a private email) came to be public. For the record, I got the story off Twitter, then Hacker News, and did not receive anything privately, but wanted to ensure the quality of this feedback reached a wider audience (especially given the original gist was deleted). The text of the explanation is also in a comment below.

Update 2011-12-01: A formal response has now arrived from Yammer. If you read the text below, you should also read the response linked above to get the fuller picture. Very little in tech is black and white, and, despite what some readers may think, I understand that very well.

Originally:
https://gist.github.com/7565976a89d5da1511ce

Hi Donald (and Martin),

Thanks for pinging me; it's nice to know Typesafe is keeping tabs on this, and I appreciate the tone. This is a Yegge-long response, but given that you and Martin are the two people best-situated to do anything about this, I'd rather err on the side of giving you too much to think about. I realize I'm being very critical of something in which you've invested a great deal (both financially and professionally) and I want to be explicit about my intentions: I think the world could benefit from a better Scala, and I'd like to see that work out even if it doesn't change what we're doing here.

Right now at Yammer we're moving our basic infrastructure stack over to Java, and keeping Scala support around in the form of façades and legacy libraries. It's not a hurried process and we're just starting out on it, but it's been a long time coming. The essence of it is that the friction and complexity that comes with using Scala instead of Java isn't offset by enough productivity benefit or reduction of maintenance burden for it to make sense as our default language. We'll still have Scala in production, probably in perpetuity, but going forward our main development target will be Java.

So.

Scala, as a language, has some profoundly interesting ideas in it. That's one of the things which attracted me to it in the first place. But it's also a very complex language. The number of concepts I had to explain to new members of our team for even the simplest usage of a collection was surprising: implicit parameters, builder typeclasses, "operator overloading", return type inference, etc. etc. Then the particulars: what's a Traversable vs. a TraversableOnce? GenTraversable? Iterable? IterableLike? Should they be choosing the most general type for parameters, and if so what was that? What was a =:= and where could they get one from?

A lot of this has been waved away as something only library authors really need to know about, but when an library's API bubbles all of this up to the top (and since most of these features resolve specifics at the call site, they do), engineers need to have an accurate mental model of how these libraries work or they shift into cargo-culting snippets of code as magic talismans of functionality.

In addition to the concepts and specific implementations that Scala introduces, there is also a cultural layer of what it means to write idiomatic Scala. The most vocal — and thus most visible — members of the Scala community at large seem to tend either towards the comic buffoonery of attempting to compile their Haskell using scalac or towards vigorously and enthusiastically reinventing the wheel as a way of exercising concepts they'd been struggling with or curious about. As my team navigated these waters, they would occasionally ask things like: "So this one guy says the only way to do this is with a bijective map on a semi-algebra, whatever the hell that is, and this other guy says to use a library which doesn't have docs and didn't exist until last week and that he wrote. The first guy and the second guy seem to hate each other. What's the Scala way of sending an HTTP request to a server?" We had some patchwork code where idioms which had been heartily recommended and then hotly criticized on Stack Overflow threads were tried out, but at some point a best practice emerged: ignore the community entirely.

Not being able to rely on a strong community presence meant we had to fend for ourselves in figuring out what "good" Scala was. In hindsight, I definitely underestimated both the difficulty and importance of learning (and teaching) Scala. Because it's effectively impossible to hire people with prior Scala experience (of the hundreds of people we've interviewed perhaps three had Scala experience, of those three we hired one), this matters much more than it might otherwise. If we take even the strongest of JVM engineers and rush them into writing Scala, we increase our maintenance burden with their funky code; if we invest heavily in teaching new hires Scala they won't be writing production code for a while, increasing our time-to-market. Contrast this with the default for the JVM ecosystem: if new hires write Java, they're productive as soon as we can get them a keyboard.

Even once our team members got up to speed on Scala, the development story was never as easy as I'd thought it would be. Because one never writes pure Scala in an industrial setting, we found ourselves having to superimpose four different levels of mental model — the Scala we wrote, the Java we didn't write, the bytecode it all compiles into, and the actual problem we were writing code to solve. It wasn't until I wrote some pure Java that I realized how much extra burden that had been, and I've heard similar comments from other team members. Even with services that only used Scala libraries, the choice was never between Java and Scala; it was between Java and Scala-and-Java.

Adding to the unease in development were issues with the build toolchain. We started with SBT 0.7, which offered a pleasant interface to some rather dubious internals, but by the time SBT 0.10 came out, we'd had endless issues trying to debug or extend SBT. We looked at using 0.10, but we found it to have the exact same problems managing dependencies (read: Ivy), two new, different flavors of inpenetrable, undocumented, symbol-heavy API, and an implementation which can only be described as an idioglossia. The fact that SBT plugin authors had to discover what "best practices" are in order to avoid making two plugins accidentally incompatible should have been a red flag for any tool which includes typesafety as a selling point. (The fact that I tried to write a plugin to replace SBT's usage of Ivy with Maven's Aether library should have been a red flag for me.)

We ended up moving to Maven, which isn't pretty but works. We jettisoned all of the SBT plugins I wrote to duplicate Maven functionality, our IDE integration worked properly, and the rest of our release toolchain (CI, deployment, etc.) no longer needed custom shims to work. But using Maven really highlighted the second-class status assigned to it in the Scala ecosystem. In addition to the "enterprisey" cat-calls and disbelief from the community, we found out that pointing out scalac's incremental compilation bugs had gotten that feature removed outright. Even the deprecation warning for -make: suggests using SBT or an IDE. This emphasis on SBT being the one true way has meant the marginalization of Maven and Ant -- the two main build tools in the Java ecosystem.

Cross-building is also crazy-making. I don't have any good solutions for backwards compatibility, but each major Scala release being incompatible with the previous one biases Scala developers towards newer libraries and promotes wheel-reinventing in the general ecosystem. Most Scala releases contain improvements in day-to-day programming (including compilation speed), but an application developer has to wait until all their dependencies are upgraded before they themselves can upgrade. If they can't wait, they have to take on the maintenance burden of that library indefinitely. In order to reduce their maintenance overhead, they naturally look for another, roughly equivalent library with a more responsive author. Even if the older library is better-tested, better-documented, and better-featured it will still lose out over time as developers jump ship for something that works with Scala 2.next sooner. (It's also worth noting that most companies using Scala at scale or in mission-critical capacities will not immediately upgrade; the library authors they employ will likely be similarly conservative, and the benefit their experience brings to their code will benefit the community less and less over time. As far as I've found, we're the only big startup in SF using 2.9.)

Once in production, Scala's runtime characteristics were the least subtle problem. At one point, half the team was working on a distributed database, and given the write fanout for our large networks some parts of the code could be called 10-20M times per write. Via profiling and examining the bytecode we managed to get a 100x improvement by adopting some simple rules:

1. Don't ever use a for-loop. Creating a new object for the loop closure, passing it to the iterable, etc., ends up being a forest of invokevirtual calls, even for the simple case of iterating over an array. Writing the same code as a while-loop or tail recursive call brings it back to simple field access and gotos. While I'm sure Scala will be have better optimizations in the future, we had to mutilate a fair portion of our code in order to actually ship it. (In another service, we got away with just using the ScalaCL compiler plugin and copying things to and from arrays instead of using immutable collections.)

2. Don't ever use scala.collection.mutable. Replacing a scala.collection.mutable.HashMap with a java.util.HashMap in a wrapper produced an order-of-magnitude performance benefit for one of these loops. Again, this led to some heinous code as any of its methods which took a Builder or CanBuildFrom would immediately land us with a mutable.HashMap. (We ended up using explicit external iterators and a while-loop, too.)

3. Don't ever use scala.collection.immutable. Replacing a scala.collection.immutable.HashMap with a java.util.concurrent.ConcurrentHashMap in a wrapper also produced a large performance benefit for a strictly read-only workload. Replacing a small Set with an array for lookups was another big win, performance-wise.

4. Always use private[this]. Doing so avoids turning simple field access into an invokevirtual on generated getters and setters. Generally HotSpot would end up inlining these, but inside our inner serialization loop this made a huge difference.

5. Avoid closures. Ditching Specs2 for my little JUnit wrapper meant that the main test class for one of our projects (~600-700 lines) no longer took three minutes to compile or produced 6MB of .class files. It did this by not capturing everything as closures. At some point, we stopped seeing lambdas as free and started seeing them as syntactic sugar on top of anonymous classes and thus acquired the same distaste for them as we did anonymous classes.
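To make rule 1 concrete, here is a minimal sketch. It is an illustration added to this copy, not part of the email, and the names LoopSketch, sumFor and sumWhile are hypothetical. The two methods compute the same sum; the email's claim is that on the compiler of that era the first went through a closure and a foreach call while the second stayed as plain index arithmetic. Newer compilers and JITs may narrow the gap, so treat this as a picture of the two shapes rather than a benchmark.

```scala
object LoopSketch {
  // for-comprehension form: desugars to a foreach call on the Range,
  // with the loop body passed in as a function value
  def sumFor(xs: Array[Int]): Long = {
    var total = 0L
    for (i <- 0 until xs.length) total += xs(i)
    total
  }

  // while-loop form: explicit index, no function value involved
  def sumWhile(xs: Array[Int]): Long = {
    var total = 0L
    var i = 0
    while (i < xs.length) {
      total += xs(i)
      i += 1
    }
    total
  }
}
```

Both methods return the same value for any input array; only the generated code differs.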

Now, every language has its performance issues, and the best a standard library can hope to do is to hit 80% of use cases. But what we found were pervasive issues — we could replace all of our own usages of s.c.i.HashMap, but it's a class which is extensively used throughout the standard library. It being slower than j.u.HashMap means groupBy is slower, as is a lot of other collections functionality I like.

At some point, I wondered if the positive aspects of our development experience owed less to Scala and more to the set of libraries we use, so I spent a few days and roughly ported a medium-sized service to pure Java. I broached this issue with the team, demo'd the two codebases, and was actually surprised by the rather immediate consensus on switching. There's definitely aspects of Scala we'll miss, but it's not enough to keep us around.

Already I've moved our base web service stack to Java, with Scala support as a separate module. New services are already being written on it, and given the results from our Hack Day at the beginning of this week it hasn't slowed our ability to quickly ship complex code. I'm keeping a close eye on the effects of this change, but I'm optimistic, and the team seems excited. We'll see.

So.

I've tried hard here not to offer you advice. Some of these problems could easily be specific to our team and our workload; some of them won't make a difference in how your company does; some of them aren't even your problems to solve, really. But they're still the problems we've encountered over the past two years, and they compose the bulk of what's motivating this change.

Despite the fact that we're moving away from Scala, I still think it's one of the most interesting, innovative, and exciting languages I've used, and I hope this giant wall of opinion helps you in some way to see it succeed. If there's anything here I can clarify for you, please let me know.

It's not for me to add any more to this, Yammer's opinion. There is a gold mine of information about Scala in there. And in my opinion, everyone thinking of adopting Scala should read it - in detail.

Update 2011-12-01: Just a quick reminder to read the responses from Coda/Yammer: immediate and formal.

Thursday, 24 November 2011

Scala EJB 2 feedback

My Scala/EJB post generated plenty of attention, as I expected. I left the comments there for others to discuss the post - this is my reply to a few of the points.

EJB 2 comparison

A number of comments arose about the comparison, especially those feeling it didn't make sense. Basically, EJB 1/2 was one of those technologies that appeared at first glance to have a lot of promise, targeting known pain points. But over time everyone figured out that the approach was needlessly complicated and created more problems than it solved. Collectively, developers who lived through the era wonder how on earth it got adopted - in hindsight it seems obviously bad.

As I indicated in the blog, I was trained in EJB, but it was obvious to me that the technology was deeply flawed. So, I argued against its adoption, and never had to use it in anger. I see Scala as equally deeply flawed, thus am arguing against its adoption, and endeavouring to avoid using it in anger.

Since I had that same reaction in my gut as I did with EJB, I used the analogy - a high level analogy, not a low level one. To me, Scala really does feel as bad an idea as EJB 2 did.

Fantom

The trouble with today's tweet based soundbites is that it is difficult to have a slightly subtle position on something, and I'd say my position on Fantom is subtle. I think Fantom is hugely interesting because it shows what happens when you challenge your preconceptions about what a static type system should do (and also because of its ability to turn shared mutable state into a compile error).

The subtlety is that I don't see any evidence that the majority of the static typing community of developers (ie. Java) are willing to take the radical step that Fantom offers (paring down static typing to the bare minimum). For me, I think Ceylon and Kotlin are both being seduced into adding more to the type system than developers really need.

In my Devoxx talk, and in the evaluating Fantom blog post, I made the point about the type system. I also suggested that Fantom might well appeal to those from the dynamic side of the fence who have been bitten by an absence of static typing (like Ruby).

Thus, Fantom makes a good counterpoint to Scala. They are pretty much polar opposites in the static typing space. And I find it interesting and worth noting that Fantom spends its language complexity budget on things I care about, whereas Scala (over)spends its complexity budget on things I don't care about.

Thus, while it may seem like I'm saying "use Fantom, use Fantom, please use Fantom", I'm really just using it as an effective counterpoint. Pointing out something that in my opinion has better answers to the hard questions is not the same as saying go and adopt it. Linked yes, but not the same.

The other points

A number of comments from the Scala side noted that modules (of the type I was referring to) were a problem. I will also willingly acknowledge the heritage of the word module in other contexts.

On concurrency, some got the message and some didn't. My point is that you can design a programming language such that shared mutable state does not compile. Scala talks a good game in this area, but in forensic analysis it doesn't match up.

On the type system, some feel the strength of Scala's approach is valuable, while some like me see it as way too far beyond the point of sensible returns. I also maintain that if I add a string to a list of integers I should get a compile error, not a list of Any. With type inference and implicits, there is far too much potential for things that should be errors during maintenance/refactoring to not be spotted for my taste.
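A minimal sketch of the list case, with hypothetical names (ListWideningSketch, ints, mixed). No type annotation is given on mixed, so the element type is whatever the compiler infers; as far as I know Scala 2 infers List[Any], and other versions may pick a different common supertype. The point is only that the prepend is accepted rather than rejected.

```scala
object ListWideningSketch {
  val ints: List[Int] = List(1, 2, 3)

  // Prepending a String compiles: the element type is widened to a
  // common supertype of Int and String instead of producing an error.
  val mixed = "four" :: ints

  // The widened list can be passed anywhere a List[Any] is expected.
  val asAny: List[Any] = mixed
}
```

If ints was meant to stay a list of integers, nothing at this line tells the maintainer otherwise; the mismatch only surfaces later, wherever a List[Int] is finally required.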

On syntax I was primarily driving at the open and flexible nature of the design. With optional method invocation dot, optional brackets, optional semicolons, symbols for methods/operators and implicits thrown in, it will necessarily be harder than many languages to work out what any individual piece of code does. And there are consequences. That flexibility leaves ample room for mailing list discussion about the "right way" to do something. It also makes it very difficult for IDEs and compilers to figure out what the code means - which is the reason for the slow compile speeds mentioned in a number of comments. Personally, I find the goal of the open and flexible syntax (arbitrary DSLs) to be not worth the pain. There are other neater ways to think about DSLs.
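As a rough illustration of that flexibility, this sketch (all names hypothetical) writes the same call four ways. As far as I know all four compile on current Scala 2 and Scala 3 compilers, and all give the same result.

```scala
object SyntaxSketch {
  val xs = List(1, 2, 3)

  val dotted      = xs.map(x => x + 1)    // explicit dot, brackets, named parameter
  val placeholder = xs.map(_ + 1)         // placeholder instead of a named parameter
  val infix       = xs map (_ + 1)        // method invocation dot left out
  val braces      = xs.map { x => x + 1 } // braces instead of round brackets
}
```

None of the forms is wrong, which is exactly why a team ends up discussing which one is "right".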

FUD, criticism and my goals

I was accused of spreading FUD. No big surprise there. My view is that if that was my goal I would have done a better job in showing the more complex interactions of the feature set, or just flat out lied. No, I think my goal was a bit more interesting than just FUD.

Basically, the key goal with the blog was to provide reassurance to others who feel as I do that Scala just isn't right. I opened my blog talking about the Scala community not liking dissent, and I stand by that. A number of reactions actually praised my bravery for being willing to stand up to the "Scala cult". I don't think it's my imagination to suggest that Scala's enthusiasts have managed to stifle criticism and give the impression that you'd be crazy not to use Scala. If I have inspired confidence in others to speak out, or question what they've heard, then that is a Good Thing in my book.

Beyond that, the long term theme of this blog has been that we should look again at just how much Java threw away from C, and judge new languages as much on what they threw away as what they added. For me, Scala didn't throw enough away and added too much - a lethal combination.

In the end, as Dick Wall suggested, individuals should try it for themselves. I just ask those that do to think deeply before adopting it, and as I said in a tweet - make sure you get both the positives and negatives before deciding.

My personal favourite responses

These are selected because I found them funny, or the point they made was interesting to my biased eyes. I'll let you figure out which are which!

Colebourne is a sad, old twat.
Anonymous comment

He sounds like a naughty schoolboy who misses being spanked..
AlSpeer

I developed with Scala for about a year. I learn a lot and got to work with some smart people. However, then I had to maintain code and, oh boy, it was like being forced onto a diet of live insects.
Anonymous comment

I've only evaluated Scala at the surface, but simply want to back up Stephen in agreeing with the subjective scary feel of the language. I can think of no other languages where I've scanned examples and documentation, and wanted to run away as I have with Scala.
Casper Bang comment

I'm sure the type system is genius, but I'd prefer if it was sensible genius instead of mad genius. I like my sanity.
mcv comment

Years ago, I was pretty excited about it. I saw lots of things Java couldn't do, and that would make my life easier. But then more features. And more. But I think that even before I got totally turned off by the language, I got turned off by the community first. It felt very elitist and unwelcoming to people who may be just interested in a language to 'get things done' instead of one you can put on your resume and earn an immediate 'smart like hell' badge with.
chillenious comment

What can be worse than a likeness to EJB 2? A likeness to WS-*
Paul Sandoz tweet

You know who else compared Scala with EJB? Hitler.
Runar Oli tweet (which I took in good humour)

According to @jodastephen "Scala feels like EJB 2" http://blog.joda.org/2011/11/scala-feels-like-ejb-2-and-other.html ... will the next article explain how #Scala will cause world war 3?
Mario Fusco tweet

FWIW, I've used Scala for two years, written tens of thousands of lines of code in it, and find your criticism incisive.
Coda Hale tweet

Not that I really should comment on Scala, but I feel that 50% of it would be better than 100% of it -- "too much of everything"
Tatu Saloranta tweet

i used scala for a few months. it sounded very promising, java without the verbosity. but in the end i decided to stop using it.
The biggest problem for me was readability. Scala is the first language that i've learned where at first i couldn't just read code and immediately guess what it does.
adabsurdo

My personal experience in wrapping non-trivial Java libraries in Scala and Clojure is that with Clojure it usually just works and it works quickly. In Scala I am usually reduced to an extra hour or two of adding manifests to signatures until the compiler accepts it.
I am disappointed with Scala and having lived through EJB 1, EJB 2 and then onto Spring and EJB 3, I agree with Steve it makes me feel exactly the same as I felt about EJB 1 and EJB 2 - that is I am being sold overcomplicated technology as a panacea.
Tim Clark

I have similar feelings about Scala. It's a bit like C++. The difference is: I found a subset of C++ I liked.
Glyn Normington tweet

Scala sucks, and i'm blessed to know that i'm not alone feeling that way.
Evgeny Shepelyuk tweet

And finally

I don't see myself writing a post in quite the same way about Groovy, Ceylon, Kotlin, Xtend, Clojure,... I may critique them (all do or will have flaws), but I don't see myself ripping into them in the same way. There is just something about Scala...

My final thought is that it is OK to look at Scala and decide against. You're not alone.

Tuesday, 22 November 2011

Scala feels like EJB 2, and other thoughts

At Devoxx last week I used the phrase "Scala feels like EJB 2 to me". What was on my mind?

Scala

For a number of years on this blog I've been mentioning a desire to write a post about Scala. Writing such a post is not easy, because anyone who has been paying attention to anti-Scala blog posts will know that writing one is a sure fire way of getting flamed. The Scala community is not tolerant of dissent.

But ultimately, I felt that it was important for me to speak out and express my opinions. As I said in my talk, if it was just me that had a poor opinion of Scala I would probably keep quiet (or try to figure out why I was out of step). But I perceive considerable uneasiness amongst many that have tried or looked at the language - something that reinforces my concerns.

Before I start I should mention that although I like the Fantom programming language, I also see merit in aspects of other languages - Groovy, Kotlin, Ceylon, Gosu, Xtend and many others. I also respect Clojure. By comparison, I really struggle to find positive feelings for Scala, and what positive feelings I have had have reduced over time. But why is that?

(For those not at my Devoxx talk, I tried to do two things - firstly to show Fantom off and explain how most simple comparisons to Scala rather miss the point, and secondly to point out some of the difficulties I have with Scala. To be clear, I'm not bashing Scala to promote Fantom. I'm bashing Scala because I think it's entirely the wrong direction for the future.)

For this blog post I've picked out a few key areas for discussion. I probably could have written a post 3 times as long as this one, and this one is very long as it is. There is lots to say about Scala, and very little is good.

Modules

Scala does not have a module system. By that I mean a deeply integrated system of modules that treats the basic program unit as something larger than a class, with versioning and dependencies. A key test is whether the new language compiles modules or classes.

One of the greatest issues with Java is the lack of a module system. This absence has over time caused the platform to gain cruft (like CORBA) and struggle to shed it. The multi-year effort of modularising the JDK is evidence of how complicated this work is to do if not done from the start. And of course the Java platform will always have to support code not written in modules. Beyond the core JDK, most experienced Java developers have encountered the "Jar hell" scenario, where different versions of Jar files are required by different libraries, and it easily becomes impossible to assemble the whole application.

So, I have a clear sense that proper modularisation is a Good Thing, with all the versioned goodness that comes with it. (Managing change of a large application over time remains one of the largest problems faced by most large development shops, and one I don't see Scala tackling.) Over time, Java has evolved the Maven, Ivy and OSGi approaches to modules. Each has some benefits, but none are integrated into the platform itself, which is a significant disadvantage.

Yet, in a recent thread on modules, Scala aficionados claimed that Scala does have modules. In fact the opinion was clear - "see the object keyword", "Scala objects and path dependent types encode ML-style modules", "Also see http://www.mpi-sws.org/~rossberg/" (an academic paper). On further prompting, the ML view (standard source code can be used to express modules) was expanded on, before eventually the Scala approved way of using Maven/Ivy and the sbt tool was finally explained. There wasn't any real sense that this was a problem for Scala - so long as it integrated with the Java solutions that was fine.

I claim that integrating with Maven/Ivy is not fine. It misses huge opportunities to make life better on a topic where developers face real productivity issues in the field. Hence I commented that "Scala focuses on the wrong issues".

I also noted that backwards compatibility has been a constant problem of the Scala libraries. Modules are a tool for managing versioning and compatibility issues and would almost certainly have helped Scala evolve.

Finally, I noted that modules allow an application to find all the classes in the classpath/modulepath. This allows an application to find all the classes that implement an interface or annotation easily, which allows applications to be easily assembled from their parts. Java and Scala can achieve this, but only via complex and slow classpath scanning tools, like scannotation.

Concurrency

Scala makes a big deal about concurrency. About how the functional approach will aid the creation of safe multi-threaded code.

Except it's really a bit of smoke and mirrors.

The big problem in concurrency is shared mutable state. It turns out that us developers are pretty bad at reasoning about it and using the tools at our disposal (synchronized, locks, java.util.concurrent) to manage that state. You'd expect that Scala would have tackled the concurrency problem at source - the shared mutable state - but it doesn't.

Scala (the language) does not know whether a class is immutable or not, nor does it provide a way to check if an object is immutable (Scala's libraries might help, but the language doesn't). As a result, it is perfectly possible to have a "static" (shared-thread) variable, or a "static" value of a mutable object. It's also possible to pass a mutable object to an actor and share mutable state that way.

class Mutable { var count = 0 }  // hypothetical mutable class, defined so the example compiles

object Foo {
  var bar = "Hello"        // this is shared mutable state
  val baz = new Mutable()  // so is this
}

Tackling shared mutable state is not easy in language design. It involves designing the language to know about immutability, to track it, and to only allow immutable objects to be passed by reference to another thread/actor (mutable objects can be passed by copy). Done right, it eliminates the potential for concurrency issues from shared mutable state.
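To show the kind of code the compiler accepts without complaint, here is a minimal sketch using plain JVM threads rather than actors. The names SharedStateSketch, Counter and incrementFromTwoThreads are hypothetical. Two threads update the same mutable object with no synchronisation; the increments can interleave, so the final count is not guaranteed to be twice perThread, which is why no expected output is shown.

```scala
object SharedStateSketch {
  final class Counter { var value = 0 } // hypothetical mutable class

  def incrementFromTwoThreads(perThread: Int): Int = {
    val shared = new Counter // one mutable object, visible to both threads

    val work = new Runnable {
      def run(): Unit = {
        var i = 0
        while (i < perThread) {
          shared.value += 1 // unsynchronised read-modify-write
          i += 1
        }
      }
    }

    val t1 = new Thread(work)
    val t2 = new Thread(work)
    t1.start(); t2.start()
    t1.join(); t2.join()
    shared.value // may be less than 2 * perThread if updates were lost
  }
}
```

Nothing in the language marks Counter as unsafe to share; getting this right is left to library conventions and developer discipline.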

Scala relies on library design and disciplined behaviour from developers to get this right (whereas Fantom builds this into the language, such that code equivalent to the example above will not compile). This is of course part of Scala's design approach - to give developers the power and trust that they will not abuse it. For me, this is simply another case of Scala failing to tackle the root cause of a big developer productivity issue.

Community

Scala has a loud and vocal community, especially amongst those on forums and twitter. Some of these developers have gone on to create libraries based on Scala, in all manner of areas, from web frameworks to database access. This can have the effect of making Scala appear to be the "upcoming destination" where other developers think they should invest their time.

Unfortunately there are some aspects of the community that are much more negative. Scala appears to have attracted developers who are very comfortable with type theory, hard-core functional programming and the mathematical end of programming. Frequently, there is the sense of a brainiac competition, and an awfully large amount of argument over whether solution A, B, C or D is the right one when in reality they all do exactly the same thing (Scala typically offers many ways to achieve the same end result, something that Java sought to avoid, and something that tends to create more heat than light in debates).

There is also a sense that many in the Scala community struggle to understand how other developers cannot grasp Scala/Type/FP concepts which seem simple to them. This sometimes leads Scala aficionados to castigate those that don't understand as lazy or poor quality developers, as being second class citizens. This can easily lead into derogatory comments about "Java Joe" or worse.

My experience is that most developers are perfectly clever people, and perfectly capable of understanding many things if they are explained correctly. The classic example is variance in Java generics, where ? extends is needed. I find that it is perfectly possible to explain the issues to a developer, who will pretty quickly grasp why a List of Integer cannot just be assigned to a List of Number. However, what I find is that once the discussion is complete, and the developer solves their immediate problem, the explanation will tend to slip away. The problem is not that the developer isn't smart enough, it's that the complexities of the type system aren't important enough to care about. Understanding the issue at hand, management priorities, the problem domain and the architecture/design of the large system they are working on are much more significant issues.
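For anyone who wants the variance point spelled out, here is a minimal Java sketch. The class and method names (VarianceDemo, sum) are made up for the illustration, and List.of assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;

final class VarianceDemo {

    // Accepts a List of Number or of any subtype (Integer, Double, ...).
    // The wildcard means elements can be read as Number, but no Number
    // can be added through this reference.
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>(List.of(1, 2, 3));

        // List<Number> nums = ints;  // does not compile: if it did,
        // nums.add(3.14);            // a Double could end up in a List<Integer>

        List<? extends Number> readOnly = ints; // fine: a read-only view
        System.out.println(sum(readOnly));
        System.out.println(sum(List.of(1.5, 2.5)));
    }
}
```

The two commented-out lines are the whole explanation: assigning the list would let a Double sneak into a list that is supposed to hold only Integers.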

The Scala community is also infected with modern society's desire for more, more, more without considering the consequences (more gadgets, faster car, bigger TV, more money, more spending, yet bankrupt people and countries). Scala goes all the way with its language features - everything is about maximum power. And the community revels in that power, finding and exploiting every corner case that the power grants, without truly considering the harm it does.

Type system

Every time I look at Scala it feels rather like the type system fits the phrase "if all you have is a hammer, everything looks like a nail". Whatever the problem, the type system is bound to be part of the solution.

The trouble is that a big type system is inevitably a complex type system. The concepts added to support the type system have their own terminology which is instantly inaccessible without significant learning, from higher kinds to type constructors to dependent types... It's all just a baffling mess of type theory that provides no meaningful connection to actual work that needs doing.

The trouble is that despite the pleading of aficionados, method signatures like this abound:

 def ++ [B >: A, That] (that: TraversableOnce[B])(implicit bf: CanBuildFrom[List[A], B, That]) : That

If you don't know Scala, you wouldn't have a hope of attempting to understand the code. In fact this is the equivalent to Java's addAll() on a list. (It's so complex that Scala tries to hide it from you in the documentation by giving you a simpler form instead.)
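For comparison, the Java method in question is declared on java.util.Collection as boolean addAll(Collection<? extends E> c). The sketch below wraps it in a hypothetical helper called concat (not a JDK method) that returns a new list rather than mutating the receiver, which is a little closer in spirit to what ++ does. It is only a loose analogue, not a faithful translation of the Scala signature.

```java
import java.util.ArrayList;
import java.util.List;

final class AddAllDemo {

    // Built on java.util.Collection#addAll:
    //   boolean addAll(Collection<? extends E> c)
    // Returns a new list and leaves both inputs untouched.
    static <E> List<E> concat(List<? extends E> first, List<? extends E> second) {
        List<E> result = new ArrayList<>(first);
        result.addAll(second);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> ints = List.of(1, 2);
        List<Double> doubles = List.of(3.5);

        // The element type widens to Number, the common supertype.
        List<Number> combined = concat(ints, doubles);
        System.out.println(combined);
    }
}
```

The one wildcard is the only piece of type machinery a reader has to decode.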

Oh, and by the way, I do get the idea that the humongous type system is there to catch errors at compile time and pre-check your code. But applying that logic in the opposite direction would imply that no-one gets any real work done in languages with dynamic type systems. For me, Scala's type system is way way beyond the point of sensible returns for a language feature.

Steve Yegge's analysis was perhaps the most fun:

The... the the the... the language spec... oh, my god. I've gotta blog about this. It's, like, ninety percent [about the type system]. It's the biggest type system you've ever seen in your life, by 5x. Not by an order of magnitude, but man! There are type types, and type type types; there's complexity...

They have this concept called complexity complexity<T> Meaning it's not just complexity; it's not just complexity-complexity: it's parameterized complexity-complexity. ... I mean, this thing has types on its types on its types. It's gnarly

Steve uses Scala to argue for dynamic type systems. I disagree, and consider a static type system to be useful for documentation, communicating intent in a team and for basic error checking. But I don't need the world's most complicated type system to do that - I just need something simple and effective.

In essence, Scala's type system is giving static typing in general a bad name.

Syntax

Scala's syntax is very wide open. Given a small piece of code (the kind that developers look at all day long), it is frequently difficult to reason about what that code does.

Scala has a big focus on flexible syntax with the aim of allowing a user to create a DSL in almost any form without having to write any parsing code. Instead, the developer just has to write a "normal" API, and use the language's syntactic flexibility to enable the ultimate end user to write in the desired style.

Take implicits, a technique that seems perfectly sensible to allow type conversions and object enhancements in a type safe way. Well, it may be type safe, but it's also silent and very deadly. You can look at a piece of code and not have any idea what is being converted. Unless you understand every import, every active implicit, their scope, their priorities and much more, you really don't have a clue what your code is doing. But that's OK, you didn't want to be able to understand Scala code did you???

Or take the fold operators and placeholder _, which produce delightful code like this:

 (0/:l)(_+_)

That's practically the very definition of line noise. (And I've not even shown any scalaz examples, or similar Unicode weirdness)
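For readers who don't know Scala: /: is the symbolic form of foldLeft, so the snippet above starts from 0 and adds each element of l in turn. Here is a sketch of the same computation in Java, written once as a plain loop and once with the streams API (Java 8 or later). The class and method names are invented for the illustration.

```java
import java.util.List;

final class FoldDemo {

    // Left fold written out by hand: start from 0 and add each element in turn.
    static int sumWithLoop(List<Integer> l) {
        int acc = 0;
        for (int x : l) {
            acc = acc + x;
        }
        return acc;
    }

    // The same fold using Stream.reduce(identity, accumulator).
    static int sumWithReduce(List<Integer> l) {
        return l.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> l = List.of(1, 2, 3, 4);
        System.out.println(sumWithLoop(l));
        System.out.println(sumWithReduce(l));
    }
}
```

Both versions are longer than the Scala one-liner, but each token says what it does.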

By the way, if you're looking at Scala, you may come across conference presentations, blog posts and forums that show small snippets of code in Java and Scala and show how much less code you have to write in Scala, and how much simpler it appears to be. This is a clever trick. The real complexity occurs when you start mixing each of the seemingly simple features together in a large codebase, on a big team, over a period of time, where code is read far more than it is written. That is when you start to appreciate that some of Java's restrictions are there for a reason.

Quality

When evaluating a language, it's important to get a sense of the quality of the implementation. This is useful for determining how easy it will be to maintain the current language and extend it in the future.

For this, I turn to the analysis by the core Scala committer, Paul Phillips, in a Scala podcast in June 2011 (selected elements):

The compiler is, and the libraries and language as a whole, it's awesome but the number of places where features interact is astronomical. In order to really get the lid on that many feature intersections we need a massively comprehensive test suite, that we simply don't have.
...
[Question:] It's been suggested that you are skeptical of community involvement because there is no test suite? You're afraid that if anyone touches it but you the whole world will break.
[Answer:] I, unfortunately, continue to be bitten by extremely subtle bugs that come out because of the inadequacy of our test suite.
...
[Question:] Where do you want the test suites?
[Answer:] Collections
[Question:] Anywhere in particular?
[Answer:] All of them. There's no reason that many many many of the bugs we've seen in the collections over the last couple of years should ever have happened because they should be exhaustively shown not to exist by virtue of the tests that we have, but don't have yet.
...
[Question:] And what about the compiler?
[Answer:] An exhaustive test suite for the pattern matcher would certainly aid me in the process of finally really fixing it. It would be very very helpful actually.

An incredibly complicated language with very few tests? Sounds like a poor foundation to build real world applications on to me.

Specifically, note this line - "the number of places where features interact is astronomical". This is a key aspect of Scala: each language feature is orthogonal and flexible. Implicits mean that code can be inserted almost anywhere (which slows the compiler, listen to the podcast). The ability to drop method invocation dots and brackets for parameters (to achieve DSLs) makes the meaning of code non-obvious, and leaves no spare syntax space for future enhancements. And these things combine to make a good IDE a very difficult challenge.

(If you're evaluating Scala for adoption by your team, I strongly recommend listening to the whole 40 minutes of the podcast. It will help you understand just what the real issues are with Scala, the quality of the implementation, and how difficult the language is to evolve.)

EJB 2

The EJB 2 spec was in many ways the nadir of Java EE, where huge amounts of boilerplate, XML and general complexity were foisted onto the Java industry. The spec was widely adopted, but adoption was followed by criticism. Developers found that while EJB 2 sought to reduce complexity in building an enterprise application through an abstracted higher level API, in reality it added more complexity without providing the expected gains. Documentation, best practices and tooling failed to solve the basic design issue. Spring was launched as a greatly simplified alternative, and eventually the much simpler EJB 3 was launched, a spec that had little to do with EJB 2.

As a data point, I attended a formal week's training course in EJB. At the end of the course I knew that this was a very bad technology and that I would recommend against its use at every opportunity. Scala has exactly that same feel to me.

So, at Devoxx I said that "Scala feels like EJB 2 to me". The language is a well-meaning attempt to create something with a higher abstraction level. But what got created is a language that has huge inherent complexity and doesn't really address the true issues that developers face today, yet is being pitched as being a suitable replacement for Java, something which I find to be bonkers.

At the moment, Scala is at the stage of thinking that better documentation, best practices and tooling will make a huge difference. None of these helped EJB 2.

In fact one might argue that Java's biggest flaw down the years has been the architectural over-engineering of solutions, when something simpler would have done the job. Again, Scala feels very much in the mold of that strand of the Java community, over-engineered rather than YAGNI or 80/20.

Of course, neither Spring nor EJB 3 is perfect, but the core concept of injection appears to be easy to grasp and the basic mechanism of linking them simple. In particular, having easily cut and pasted sections of documentation proved very valuable. Having code that is easy to grasp where problems can be tracked down without needing a PhD in type theory is a Good Thing, not a bad one. Having code where you can work out what it does without needing to know every last detail of the "astronomical" number of language feature intersections is a Good Thing, not a bad one.

Of course the upside for Scala of my EJB 2 comparison is that EJB 3 is a lot better. Thus, it is conceivable that Scala could reinvent itself. But I would point out that EJB 2 and 3 are essentially utterly different approaches, happening to share a common name. I would say that Scala would need a similar reinvention from scratch to solve its problems.

Summary

I don't like Scala. And that dislike is increasing. Specifically, I do not want to spend any of my future working life having to write code in it.

Had Scala stayed as a remote language, for highly specialist use cases (like Haskell or Erlang) then I would have far less of an issue. But it is being sold as the solution for mainstream development, and for that it is as utterly unsuited as EJB 2 was.

Update 2011-11-24: Rather than respond to all the comments inline, I penned a response blog.

Sunday, 24 July 2011

Reversed type declarations

I can't write about Kotlin without first talking about the folly of "reversed" type declarations. Sadly this disease has afflicted Scala, Gosu and now Kotlin (but perhaps it's not yet too late for Kotlin ;-).

Reversed type declarations

When I see a new language, one of the first things I look at is the parameters and variable declarations. For this blog I'll refer to them as "standard" (like Java, Ceylon and Fantom) and "reversed" (like Pascal, Scala, Gosu and Kotlin).

  // standard
  Type variableName
  
  // reversed
  variableName : Type

Here I compare a Java and Kotlin method, although the principle is similar for Gosu, Scala and quite a few others.

  // Java
  public void process(String str) { ... }
  
  // Reversed lang (Kotlin or any similar language like Scala or Gosu)
  fun process(str : String) { ... }

When I see the latter, I cringe. It's usually a sign that the language isn't going to win me over.

So why the big deal?

For a start, there is one extra syntax character, the colon. Given that this is unnecessary (as shown by Java), it continues to be surprising to me that languages that aim to be less verbose than Java begin with something that is more verbose.

Ignoring the annoyance of having to type the extra character, the two examples above are still fundamentally readable. The human eye (specifically my human eye, but hey I'm being opinionated...) can cope with the extra colon and "reversed" declaration order. However, add complexity and the situation changes:

  // Java
  public void process(String str, int total, List<String> input) { ... }

  // Reversed lang
  fun process(str : String, total : Int, input : List<String>) { ... }

As more parameters are added, the eye has more difficulty in picking out where one parameter ends and the next one starts. This is simply because the colon is more visually arresting than the comma, so as the eye scans the line it breaks up the parameters using the colons, not the commas. Thus at a glance I see "str", "String, total", "int, input", "List<String>". In fact, my eye sometimes doesn't see the commas at all, thus I get "str", "String total", "int input", "List<String>" which is horribly broken.

In order to actually read the information, I have to slow down and take longer. But when designing a programming language, it is rapid and quick readability that matters. Slowing me down on reading is a Bad Thing. So, beyond being unnecessary, the extra colons are actually making the code significantly harder to read (and write!).

But it gets worse:

  // Java
  public Future<Person> process(String str, List<String> input) { ... }

  // Reversed lang
  fun process(str : String, input : List<String>) : Future<Person> { ... }

Now, we have a return type, again separated by a colon. The use of the same character (yes that is another verbosity character I have to type) makes it especially difficult to visually parse. For me, the strength of the colon overrides the end bracket, thus I end up seeing "Future<Person>" as a parameter. Effectively my eye is parsing the line in a fraction of a second, but it gets to the end and has to double back to "push" the last thing it saw onto the "return type" stack. Try this one if you're struggling to see the issue. Note how the types and colons dominate and flow into one another, causing the distinctions as to their meaning (type of a parameter vs return type) to be lost:

  fun process(a : Int, b : Int, c : Int) : Int { ... }

As an aside, let's look at default parameters, which many new languages support (using an equivalent syntax for Java):

  // Java
  public Future<Person> process(String str, List<String> input, int total = 0) { ... }

  // Reversed lang
  fun process(str : String, input : List<String>, total : Int = 0) : Future<Person> { ... }

Now it is really broken! Now I've got "Int = 0" staring me in the eye, which really is not what the programmer was trying to express. Again, that visual barrier of the colon, together with the type, makes it very hard to connect the actual parameter name "total" with the value it has "0".

The real test for the syntax is the more complex case of higher order functions. This varies a lot by language, so lots of examples (hopefully accurate - I don't have time for lots of testing). I'm simulating some syntax for Java and Fantom:

  // Java
  public <T, R> List<R> transform(List<T> list, Transformer<T, R> transformer) { ... }
  public <T, R> List<R> transform(List<T> list, #R(T) transformer) { ... }  // lambda strawman
  public <T, R> List<R> transform(List<T> list, {T => R} transformer) { ... }  // BGGA
  
  // Ceylon
  shared List<R> transform<T, R>(List<T> list, R transformer(T)) { ... }
  
  // Fantom
  R[] transform<T, R>(T[] list, |T -> R| transformer) { ... }  // simulated syntax

  // Gosu
  function transform<T, R>(list : List<T>, transformer(T) : R) : List<R> { ... }
  
  // Kotlin
  fun <T, R> transform(list : List<T>, transformer : fun(T) : R) : List<R> { ... }
  
  // Scala
  def transform[T, R](list : List[T], transformer: T => R) : List[R]

The Java strawman and BGGA examples and Fantom pseudo-example demonstrate that a form can be created where higher order declarations are possible using "standard" declarations. Ceylon chooses to go down the C style route, mixing the parameter name in the middle of the return type and arguments. I don't find Ceylon's choice as readable to my eye when scanning as the Strawman/BGGA/Fantom pseudo-examples because it mixes the type and the variable name.

The three "reversed" declaration approaches are very different. Gosu (if I've read the documentation correctly) makes a very weird choice as the element after the colon is the return type of the function not the type of the variable "transformer" within the method as would be expected most of the time. Kotlin's choice is also poor, as it now means there are two colons in the parameter declaration, one to separate the variable name from the type, and one to separate the function type input from its output. Scala's is the most rational of the "reversed" declaration styles. However, I find the lack of anything surrounding the "T => R" means that the eye struggles to find the start and end of the type in more complex examples, which is essential to finding the variable name.

Of these, the Strawman/BGGA/Fantom pseudo-examples are the most readable of the first group, and Scala's is the most readable of the second group (ignoring the real Java example for a minute, and noting that Scala would be clearer with something around the function type). That is because when I'm performing the eye parsing/scanning I've been talking about, I am essentially trying to grasp the signature. To do that I need to know the number of parameters, their names and their types. Specifically, I want to put the name and type into different mental boxes. Mixing the name and type as Ceylon and Gosu do makes that harder, while Kotlin's additional colon simply creates another fence for my eye to have to jump.
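For reference, Java 8 eventually shipped neither the strawman nor the BGGA syntax. Function types are ordinary interfaces such as java.util.function.Function, so the declaration stays in the "standard" style with the type first and the name second. A minimal sketch of the transform method in that form follows. The TransformDemo class is invented for the example, and the wildcards are the usual library idiom rather than anything taken from the examples above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class TransformDemo {

    // "Standard" declaration order: the function type, then the parameter name.
    static <T, R> List<R> transform(List<T> list, Function<? super T, ? extends R> transformer) {
        List<R> result = new ArrayList<>(list.size());
        for (T item : list) {
            result.add(transformer.apply(item));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "bb", "ccc");
        List<Integer> lengths = transform(words, String::length);
        System.out.println(lengths);
    }
}
```

The name and the type still land in separate mental boxes, although the nested generics bring some visual noise of their own.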

To do this full justice, I should really have some examples of function types of function types. However this blog is already very long...

Finally, I'll point out that this isn't just confined to method parameters, but also to variable declarations such as local variables. Again, this gets complicated between languages in the detail, so I'll compare to a typical "reversed" type language using a braindead stupid example:

  // Java
  private String process(int total) {
    String str = Integer.toString(total);
    return str;
  }

  // Reversed lang
  fun process(total : Int) : String {
    val str : String = Integer.toString(total)
    return str
  }

Again, the "reversed" example is telling me that "String = ...", not "str = ...". In logical terms, it's utterly broken.

OK, OK, I can hear you yelling "type inference":

  // Java
  private String process(int total) {
    val str = Integer.toString(total);
    return str;
  }

  // Reversed lang
  fun process(total : Int) : String {
    val str = Integer.toString(total)
    return str
  }

So, I cheated right? By inventing a Java type inference syntax. Well, I'm making the point that type inference need not be limited to new languages, Java or any language using the "standard" type declaration style can have it too (and Fantom and Ceylon do). Thus, we should judge the variable declarations by the long form, even if it is not used for local variables all the time. And as shown above, the long form is awful in the "reversed" style. I am most emphatically not assigning the total to "String", I'm assigning it to "str", and that is what the code should say.
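For reference, Java did later gain local variable type inference. Since Java 10 the keyword is var rather than the val simulated above. A minimal sketch, with an invented class name and both forms side by side:

```java
final class InferenceDemo {

    // Long form: the type comes first and the name sits next to the "=".
    static String processExplicit(int total) {
        String str = Integer.toString(total);
        return str;
    }

    // Inferred form (Java 10+): "var" replaces the type,
    // and the name still sits next to the "=".
    static String processInferred(int total) {
        var str = Integer.toString(total);
        return str;
    }

    public static void main(String[] args) {
        System.out.println(processExplicit(42));
        System.out.println(processInferred(42));
    }
}
```

In both forms the thing immediately to the left of the equals sign is the variable name.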

I'm sure if you've read this far you have a number of comments. Perhaps you believe tooling solves the issue, maybe syntax colouring? Well, I'll simply say that while tooling helps, you should still be able to understand the language without it, even if just for command line diffs around a version control system. Or perhaps you're objecting to my methodology of trying to visually parse a line in a glance? It's how I work; don't you scan code too?

Let me be clear, in none of the above do I mention the task of the compiler. My sole focus is on the developer reading the code and, an order of magnitude less, on the developer writing it.

Any argument in support of "reversed" type declarations should never be based on relevance to the compiler or some other element of type theory.

My view is that the usability to the mainstream developer is what matters. And that is primarily about ease of reading what is written. I have endeavoured to show that the reverse style hampers readability, and is unnecessary to achieve the same goals of a more complex type system that are sometimes used as justification.

Summary

I'm arguing that the "reversed" type declaration style is flat out harder to visually parse, and should therefore be rejected by language authors, even if they believe they have sound compiler or type theory rationales. Programming languages exist primarily for developers, not to aid the compiler or underlying theories. "Write once, Read many" must be the first law of language design!

I am thus hugely disappointed that Kotlin, which has many fine features taken from Fantom, did not think through this choice in more detail, and I plead with the authors to change their minds before Kotlin is locked down.

Opinions welcome, and I'm sure there will be lots...