Thursday, 22 March 2018

JPMS modules for library developers - negative benefits

Java 9 introduced a major new feature - JPMS, the Java Platform Module System. After six months I've come to the conclusion that JPMS currently offers "negative benefits" to open source library developers. Read on to understand why.

Modules for library developers

Java 8 is probably the most successful Java release ever. It is widely used and widely liked. As such, almost all open source libraries run on Java 8 (as library authors want their code to be used!). Some libraries with a long history also still run on older versions. Joda-Convert has a Java 6 baseline, while Joda-Time has a Java 5 baseline. Others have a Java 8 baseline, such as ThreeTen-Extra.

Java 9 was released in September 2017, but it is not a release that will be supported for a number of years. Instead, it had a lifetime of six months and is now obsolete because Java 10 is out. And in six months' time Java 11 will be out, making Java 10 obsolete, and so on.

While most releases last six months, some are luckier. Java 11 will be a "long term support" (LTS) release with security and bug support for a few years (Java 8 is also an LTS release). Thus, even though Java 10 is out, Java 8 is still the sensible Java version for open source library developers to target right now because it is the current LTS release.

But what happens when Java 11 comes out? Since Java 8 will be unsupported relatively soon after Java 11 is released, you'd think that the sensible baseline would be 11. Unfortunately I believe many companies will be sticking with Java 8 for a long time. An aggressive open source project might move quickly to a Java 11 baseline, but doing so would be a risky strategy for adoption.

The module-path

Before discussing the JPMS options for open source library developers, it is important to cover the distinction between the class-path and the module-path. The class-path that we all know and love is still present in Java 9+, and it mostly works in the same way.

The module-path is new. When a jar file is on the module-path, any module-info it contains is used to apply the new stricter JPMS rules. For example, a public method is no longer callable unless its package has been exported from the module it is contained in (and that module is required by the caller's module).

The basic idea is simple: you put old-fashioned non-modular jar files on the class-path, while you put modular jar files on the module-path. Nothing enforces this, however, and it turns out this is a bit of a problem. There are thus four possibilities:

  • modular jar on the module-path
  • modular jar on the class-path
  • classic non-modular jar on the module-path
  • classic non-modular jar on the class-path

To be sure your library works correctly, you need to test it both on the class-path and on the module-path. For example, service loading is very different on the module-path compared to the class-path. And some resource lookup methods also work completely differently.
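To make the service loading difference concrete, here is a minimal sketch. The Codec interface and the ServiceLookup class are hypothetical names invented for this illustration. The lookup call is identical in both environments, but the registration is not: on the class-path, providers are listed in a META-INF/services file named after the service type, whereas an explicit module on the module-path declares them with "provides ... with" in module-info.java, and the consuming module must declare "uses" or the lookup fails with a ServiceConfigurationError.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceLookup {

    /** Hypothetical service type, used only for this illustration. */
    public interface Codec {
        String name();
    }

    /**
     * Collects the names of all Codec providers visible to ServiceLoader.
     * Class-path: providers come from META-INF/services/ServiceLookup$Codec.
     * Module-path (explicit modules): providers come from
     * "provides ServiceLookup.Codec with ..." in the provider's module-info.java,
     * and the calling module needs "uses ServiceLookup.Codec".
     */
    public static List<String> providerNames() {
        List<String> names = new ArrayList<>();
        for (Codec codec : ServiceLoader.load(Codec.class)) {
            names.add(codec.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // With no provider registered in either form, the list is empty.
        System.out.println("Providers found: " + providerNames());
    }
}
```

A test that only ever runs on the class-path exercises the META-INF/services registration and says nothing about whether the module-info declarations are correct, which is why both environments need a test run.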

To complicate this further, JPMS has no explicit support for testing. In order to test a module on the module-path (which is a tightly locked down set of packages) you have to use the --patch-module command line flag. This flag effectively extends the module, adding the testing packages into the same module as the classes under test. (If you only test the public API, you can do this without using patch-module, but in Maven you'd need a whole new project and pom.xml to achieve that approach, so it's likely to be rare.)

In the latest Maven surefire plugin (v2.21.0 and later) the patch-module flag is used, but if your module has optional dependencies, or you have additional testing dependencies, you may have to manually add them, see this issue and this issue.

Given all this, what should an open source library developer do?

Option 1, do nothing

In most cases, but not all, code that is compiled on Java 8 or earlier will run just fine on the class-path in Java 9+. So, you can do nothing and ignore JPMS.

The problem is that other projects will depend on your library. By not adopting JPMS at all, you block those projects from progressing in their modularization. (A project can only choose to fully modularize once all of its dependencies are modularized.)

Of course, if your code doesn't run on Java 9+ because you've used sun.misc.Unsafe or something else you shouldn't have, then you've got other things to fix.

And don't forget that a user could put your jar file on the class-path or the module-path. Have you tested both? The truth is that "do nothing" is not possible - at a minimum you have extra testing to do, even if it is just to document that your project doesn't work on the module-path.

Option 2, add a module name

Java 9+ recognises a new entry in the MANIFEST.MF file. The Automatic-Module-Name entry allows a jar file to declare what name it will use if/when it is turned into a proper modular jar file. Here is how you can use Maven to add it:

 <plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
   <archive>
    <manifestEntries>
     <Automatic-Module-Name>org.foo.bar</Automatic-Module-Name>
    </manifestEntries>
   </archive>
  </configuration>
 </plugin>

This is a nice simple way to move forward. It reserves the module name and allows other projects that depend on your jar file to fully modularize if they wish.

But because it's so simple, it's easy to forget the testing aspect. Again, your jar file might be placed on the class-path or on the module-path, and the two can behave quite differently. In fact, now that it has some module information, tools may treat it differently.

When Maven sees an Automatic-Module-Name it will normally place the classes on the module-path instead of the class-path. This may have no effect, or it may show up a bug where your code works on the class-path but not on the module-path. With Maven right now, you have to use surefire plugin v2.20.1 to test on the class-path (an old version that doesn't know about the module-path). To test on the module-path, use v2.21.0. Swapping between these two versions is of course a manual process, see this issue for a request to improve this.

While upgrading some of my projects I added Automatic-Module-Name without testing on the module-path. When I did eventually test on the module-path the tests failed, as the code simply didn't work on the module-path. Unfortunately, I now have some releases on Maven Central that have Automatic-Module-Name but don't work on the module-path, happy days...

To emphasise this, just adding something to the MANIFEST.MF file can have an effect on how the project is run and tested. You need to test on both the class-path and module-path.
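Because the build tool decides which path is used, it can help to have the test run report which environment it actually ended up in. The following is a minimal sketch, assuming a Java 9+ runtime; the PathMode class and its describe method are hypothetical names, not part of any library.

```java
public class PathMode {

    /**
     * Describes how the given class was loaded:
     * from the class-path (unnamed module), from a jar treated as an
     * automatic module, or from an explicit module with a module-info.
     */
    public static String describe(Class<?> type) {
        Module module = type.getModule();
        if (!module.isNamed()) {
            // Everything on the class-path lives in an unnamed module.
            return "unnamed module (class-path)";
        }
        if (module.getDescriptor().isAutomatic()) {
            // A jar without module-info.class placed on the module-path.
            return "automatic module " + module.getName();
        }
        return "explicit module " + module.getName();
    }

    public static void main(String[] args) {
        System.out.println("PathMode itself: " + describe(PathMode.class));
        System.out.println("java.lang.String: " + describe(String.class));
    }
}
```

Logging or asserting on this value from a test makes it obvious when a change to the manifest or to the plugin version has silently moved the tests from one environment to the other.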

Option 3, add module-info.java

This is the full modularization approach described in numerous web pages and tutorials on JPMS.

 module org.foo.bar {
   requires org.threeten.extra;
   exports org.foo.bar;
   exports org.foo.bar.util;
 }

So, what are the implications of doing this to the open source project?

Unlike option 2, your code now has a baseline of Java 9+. The Java 8 compiler won't understand the file. What we really want is a jar file that contains Java 8 class files, but with just the module-info.java file compiled under Java 9+. In theory, when running on Java 8 the module-info.class file will be ignored if it is not used.

Maven has a technique to achieve this. While the technique works OK, it turns out to be nowhere near sufficient to achieve the goal. To actually get a single jar file that works on both Java 8 and 9+ you need to:

  • use the release flag on Java 9+ to build for Java 8
  • add an OSGi require capability filter to inform OSGi that the jar is still Java 8 compatible
  • exclude module-info.java from maven-javadoc-plugin when building on Java 8
  • use maven-javadoc-plugin v3.0.0-M1 (not later), manually copy dependencies to a directory and refer to them using additional Javadoc command line arguments, see this issue
  • exclude module-info.java from maven-checkstyle-plugin
  • exclude module-info.java from maven-pmd-plugin
  • manually swap the version of maven-surefire-plugin to test both the module-path and the class-path

And probably some more I've forgotten about. Here is one pom.xml before integrating Java 9. Here it is after integrating Java 9. An increase from 650 to 862 lines, with lots of complexity, profiles and workarounds.

With a Java 11 baseline, the project would be simpler again, but that baseline isn't going to happen for a number of years. Note that my comments should not be interpreted as anti-Maven. A small team there is working hard to do the best they can - JPMS is complex.

And just for kicks, your project can no longer be used by Android (as the team there seems to be very slow in adding a simple "ignore module-info" rule). And many tools with older versions of bytecode libraries like ASM will fail too - I had a report that a particular version of Tomcat/TomEE could not load the modular jar file. I've ended up having to release a "classic" non-modular jar file to cope with these situations, something which is profoundly depressing.

While I've added module-info.java to some of my projects, I cannot recommend others to do so - it's a very painful and time-consuming process. The time to consider it would appear to be once Java 11 or beyond is widely adopted and is the baseline of your project.

Negative benefits

Now for the controversial part.

It is my conclusion that JPMS, as currently designed, has "negative benefits" for open source libraries.

As explained above, the cost of full modularization is high for library developers. The need to retain Java 8 compatibility makes JPMS really hard to use (module information should have been textual, not a class file). The tooling is still incomplete/buggy. Many older projects can't cope with the new jar files if you do go for it. Much of this will improve over time, but we're talking a number of years before Java 11 is widely adopted. But don't be lulled into believing that waiting will solve the key problem.

The split (bifurcation) of the module-path from the class-path is an absolute nightmare. At a stroke, there are now two different ways that your library can be run, and the two environments have quite different qualities. Code that compiles and runs on the class-path will often not compile or not run on the module-path. And vice versa. As a library author, you cannot control whether the class-path or module-path is used. You have no choice - you must test both, which you probably won't think to do. (And Maven currently provides no way to test both in one pom.xml.)

Given all this effort and extra complexity, we should be getting some great benefits, right? Well no.

JPMS is supposed to ensure reliable configuration (that all your dependencies are available at startup) and strong encapsulation (that other code can't see or use packages that you want to keep hidden). But since there is no way to stop your modular jar file being used on the class-path, you get none of these benefits.

Did you put lots of effort into choosing which packages to hide? Meaningless, as the user can just put the jar file on the class-path and call your internal packages. Did you believe that the JVM will check all your dependent modules are available before starting? Afraid not, no checks are performed when the user puts the jar file on the class-path.
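One way to see the missing checks is to ask the JVM which modules it actually resolved at startup. This is a minimal sketch, assuming a Java 9+ runtime and reusing the placeholder module name org.foo.bar from the earlier examples; the ResolvedModules class is a hypothetical name.

```java
public class ResolvedModules {

    /**
     * Returns true if a module with the given name was resolved into the
     * boot layer, i.e. the JVM processed its module-info at startup.
     */
    public static boolean isResolved(String moduleName) {
        return ModuleLayer.boot().findModule(moduleName).isPresent();
    }

    public static void main(String[] args) {
        // java.base is always resolved.
        System.out.println("java.base resolved: " + isResolved("java.base"));
        // A modular jar on the class-path is not resolved as a module:
        // its module-info.class is ignored, so its "requires" and
        // "exports" clauses were never checked or enforced.
        System.out.println("org.foo.bar resolved: " + isResolved("org.foo.bar"));
    }
}
```

If the modular jar sits on the class-path, the second line reports false even though every class in the jar is loadable, which is exactly the situation where neither reliable configuration nor strong encapsulation applies.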

Since we get none of the claimed benefits of JPMS, but get lots of extra work in testing and complexity in the build tools, I feel "negative benefits" is a pretty accurate summary.

Summary

As of today, JPMS is a pain for library authors. The split of the module-path from the class-path is at the heart of the problem. You really can't afford to release a library without testing on both module-path and class-path - there are subtle corner cases where the environments differ.

What is desperately needed is a small change to JPMS. There needs to be a way for a library author to insist that the modular jar file they are producing can only be run on the module-path (with any attempt to use it on the class-path preventing application startup). This would eliminate the need for testing both class-path and module-path. Together with the passage of time, JPMS might yet achieve its goals and go from negative to positive benefits.

11 comments:

  1. The easiest way to add module-info to the library is to put compiled module-info.class to src/main/resources. Since module-info changes rarely, this is the most appropriate option for me.

    Replies
    1. If you do this and then only compile on Java 8, you won't be running the compiler in module mode. As such, you will not spot any situations where your code would not compile on Java 9. This is why the Maven technique for compilation compiles the whole project with module-info using Java 9+, then recompiles the whole project excluding module-info. It's a pain, but it works.

  2. What about just making use of multi-release jars?

    Replies
    1. I've seen a number of reports that multi-release jars cause problems to some tools (not necessarily the same ones that module-info.class causes problems with).

  3. Important post, thanks for it. Oracle did reach out to the Maven community, but it looks like it was not yet enough to produce pleasant tooling. (Not to mention OSGi).

  4. As others say, Multi-Release JARs are a good means to overcome some of the issues outlined. By having a module descriptor not in the root of the JAR, but under META-INF/versions/9/module-info.class, it will be out of the way in most cases when being used on an older Java version. Yes, tooling still needs to catch up on that one, but I don't see this as a general problem.

    As far as adding an Automatic-Module-Name without testing is concerned, I think that's clearly wrong-doing by the library author. Adding this header IMHO implies that at least some basic vetting has been done to make sure that this library actually works on the module path.

    Regarding usage of non-exported types when run on the classpath, I don't buy into this argument. If a library can clearly describe its public API as a module, it usually does so via some very clear package naming pattern. E.g. everything not under com.example.internal is exported. This is a clear indication to not make use of the internal packages also on the classpath (a pattern lots of libraries employed for many years). Sure, a user can ignore this pattern. But then they could also call --add-exports=com.example.internal when using this JAR on the module path. There are always ways to shoot yourself in the foot, but the important thing for library authors is to express the intent of what's public and what's not. And that they can do much better now.

    I think that compiling module descriptors "out-of-bands" can be a useful way to prevent some of the issues described here. I work on a tool which lets you do this: https://github.com/moditect/moditect. As a nice feature, it also lets you add module-info.class descriptors when building with Java 8. Agreed that this requires good testing of the assembled JAR to make sure everything works as expected.

    Overall, I am - as a library author - much more positive on JPMS. I think it's an improvement all in all. Sure, some things need further improvement (tooling, esp. testing and MR JAR support), but this will happen over time. I appreciate the clearly expressed APIs and dependences and support for modular runtime images. That these things are not available when using my libraries on earlier Java versions or with the classpath, doesn't speak against JPMS IMHO. In fact, had the classpath been removed, complaints would have been much bigger and I don't think the module system could have been introduced overall. So I think it's a sensible approach.

    That's not to say that the JPMS is perfect, for sure I'd like to see some additions. Most notably, the issue of concealed package conflicts should be addressed, that's just not reasonable for a module system. And I'd like to see multi-version support with the default resolver :)

  5. You mentioned „the code simply didn’t work on the module path“. What exactly was the problem there?

  6. Replies
    1. I was wondering why the module-info.class including keywords was necessary at all. Especially considering how conservative the Java engineers are regarding compatibility.
      It would have been enough to parse the MANIFEST.MF for all the relevant module information (name, packages) and let the tools fill this information in. Another option would be compile-time annotations that create some text files (or fill in the manifest).
      Then I could compile a modular library which also runs on Java 8 classpath (considering I didn't use any Java 9 APIs).

  7. Hi,

    I have a query regarding OpenJDK.
    Please let me know whether OpenJDK, rather than Oracle JDK, is officially supported on Windows.

    I tried to download OpenJDK for Windows from the following link: http://jdk.java.net/8/ , but the download link there leads to the Oracle JDK.
    Please let me know whether there is any official OpenJDK support for Windows.


    Br,
    Ankit Saxena

  8. i totally agree. plus, what oracle did to google (recently) in the so called justice system is unacceptable and thousands of developers will do whatever it takes to hurt oracle to the bone. it's not only that oracle violated gpl v2, but they generally derived revenue from somebody else's work, this is acceptable in american under that tramp, but will not fly anywhere else, you will see. hope oracle goes to hell

