Monday, 17 April 2017

Java 9 modules - JPMS basics

The Java Platform Module System (JPMS) is the major new feature of Java SE 9. In this article, I will introduce it, leaving most of my opinions to a follow-up article. This is based on these slides.

Java Platform Module System (JPMS)

The new module system, developed as Project Jigsaw, is intended to raise the abstraction level of coding in Java as follows:

The primary goals of this Project are to:
* Make the Java SE Platform, and the JDK, more easily scalable down to small computing devices;
* Improve the security and maintainability of Java SE Platform Implementations in general, and the JDK in particular;
* Enable improved application performance; and
* Make it easier for developers to construct and maintain libraries and large applications, for both the Java SE and EE Platforms.
To achieve these goals we propose to design and implement a standard module system for the Java SE Platform and to apply that system to the Platform itself, and to the JDK. The module system should be powerful enough to modularize the JDK and other large legacy code bases, yet still be approachable by all developers.

However, as we shall see, project goals are not always met.

What is a JPMS Module?

JPMS is a change to the Java libraries, language and runtime. This means that it affects the whole stack that developers code with day-to-day, and as such JPMS could have a big impact. For compatibility reasons, most existing code can ignore JPMS in Java SE 9, something that may prove to be very useful.

The key conceptual point to grasp is that JPMS adds a new concept to the JVM - modules. Where previously code was organized into fields, methods, classes, interfaces and packages, with Java SE 9 there is a new structural element - modules.

  • a class is a container of fields and methods
  • a package is a container of classes and interfaces
  • a module is a container of packages

Because this is a new JVM element, the runtime can apply strong access control. With Java 8, a developer can express that the methods of a class cannot be seen by other classes by declaring them private. With Java 9, a developer can express that a package cannot be seen by other modules - i.e. a package can be hidden within a module.

Being able to hide packages should in theory be a great benefit for application design. No longer should there be a need for a package to be named "impl" or "internal" with Javadoc declaring "please don't use types from this package". Unfortunately, life won't be quite that simple.

Creating a module is relatively simple, however. A module is typically just a jar file that has a module-info.class file at the root - known as a modular jar file. That file is compiled from a module-info.java file in your source base (see below for more details).

Using a modular jar file involves adding the jar file to the modulepath instead of the classpath. If a modular jar file is on the classpath, it will not act as a module at all, and the module-info.class will be ignored. As such, while a modular jar file on the modulepath will have hidden packages enforced by the JVM, a modular jar file on the classpath will not have hidden packages at all.
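One way to observe the classpath/modulepath difference at runtime is to ask a class which module it was loaded into. The following is a minimal sketch, assuming Java 9 or later; the class name ModuleCheck is hypothetical, while Class.getModule(), Module.isNamed() and Module.getName() are standard Java 9 APIs.

```java
public class ModuleCheck {

    // Describes the module that a class was loaded into.
    // Classes loaded from the classpath belong to an unnamed module.
    static String describe(Class<?> type) {
        Module module = type.getModule();
        return module.isNamed()
                ? "named module " + module.getName()
                : "unnamed module";
    }

    public static void main(String[] args) {
        // JDK classes always live in named modules
        System.out.println(describe(String.class));
        // The answer here depends on whether this class was launched
        // from the classpath or from the modulepath
        System.out.println(describe(ModuleCheck.class));
    }
}
```

Running the same class once from the classpath and once from a modular jar on the modulepath should show the second line change, which is the behaviour described above.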

Other module systems

Java has historically had other module systems, most notably OSGi and JBoss Modules. It is important to understand that JPMS has little resemblance to those systems.

Both OSGi and JBoss Modules have to exist without direct support from the JVM, yet still provide some additional support for modules. This is achieved by launching each module in its own class loader, a technique that gets the job done, yet is not without its own issues.

Unsurprisingly, given these are existing module systems, experts from those groups have been included in the formal Expert Group developing JPMS. However, this relationship has not been harmonious. Fundamentally, the JPMS authors (Oracle) have set out to build a JVM extension that can be used for something that can be described as modules, whereas the existing module systems derive experience and value from real use cases and tricky edge cases in big applications that exist today.

When reading about modules, it is important to consider whether the authors of the article you are reading are from the OSGi/JBoss Modules design camp. (I have never actively used OSGi or JBoss Modules, although I have used Eclipse and other tools that use OSGi internally.)

module-info.java

The module-info.java file contains the instructions that define a module (the most important ones are covered here, but there are more). This is a .java file; however, the syntax is nothing like any .java file you've seen before.

There are two key questions that you have to answer to create the file - what does this module depend on, and what does it export:

module com.opengamma.util {
  requires org.joda.beans;  // this is a module name, not a package name
  requires com.google.guava;

  exports com.opengamma.util;  // this is a package name, not a module name
}

(The names to use for modules needed a whole separate article; for this one I'll use package-name style.)

This module declaration says that com.opengamma.util depends on (requires) org.joda.beans and com.google.guava. It exports one package, com.opengamma.util. All other packages are hidden when using the modulepath (enforced by the JVM).

There is an implicit dependency on java.base, the core module of the JDK. Note that the JDK itself is also modularized, so if you want to depend on Swing, XML or Logging, that dependency needs to be expressed.
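To see the modularized JDK for yourself, you can list the modules that were resolved at startup. This is a small sketch, assuming Java 9 or later; the class name BootModules and its helper methods are hypothetical, while ModuleLayer.boot(), ModuleLayer.modules() and ModuleLayer.findModule() are standard APIs. Which modules appear, other than java.base, depends on the runtime image and on what your application requires.

```java
import java.util.Set;
import java.util.TreeSet;

public class BootModules {

    // Names of all modules in the boot layer, sorted for stable output.
    static Set<String> bootModuleNames() {
        Set<String> names = new TreeSet<>();
        for (Module module : ModuleLayer.boot().modules()) {
            names.add(module.getName());
        }
        return names;
    }

    // True if the named module was resolved into the boot layer at startup.
    static boolean isPresent(String moduleName) {
        return ModuleLayer.boot().findModule(moduleName).isPresent();
    }

    public static void main(String[] args) {
        System.out.println("java.base present: " + isPresent("java.base"));
        System.out.println("java.logging present: " + isPresent("java.logging"));
        bootModuleNames().forEach(System.out::println);
    }
}
```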

module org.joda.beans {
  requires transitive org.joda.convert;

  exports org.joda.beans;
  exports org.joda.beans.ser;
}

This module declaration says that org.joda.beans depends on (requires) org.joda.convert. The "requires transitive", as opposed to a simple "requires", means that any module that requires org.joda.beans can also see and use the packages from org.joda.convert. This is used here as Joda-Beans has methods where the return type is from Joda-Convert. This is shown by a dashed line.

module org.joda.convert {
  requires static com.google.guava;

  exports org.joda.convert;
}

This module declaration says that org.joda.convert depends on (requires) com.google.guava, but only at compile time, "requires static", as opposed to a simple "requires". This is an optional dependency. If Guava is on the modulepath, then Joda-Convert will be able to see and use it, and no error will occur if Guava is not present. This is shown by a dotted line.
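The two modifiers are also visible through the java.lang.module API, which can help when writing tools that inspect modules. The sketch below rebuilds the org.joda.beans and org.joda.convert declarations programmatically using ModuleDescriptor.newModule(); this is an illustration only, not how the real libraries are built, and the class name DescriptorSketch and the helper modifiersOf are hypothetical.

```java
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleDescriptor.Requires;
import java.util.Set;

public class DescriptorSketch {

    // Programmatic equivalent of the org.joda.beans declaration.
    static ModuleDescriptor jodaBeans() {
        return ModuleDescriptor.newModule("org.joda.beans")
                .requires(Set.of(Requires.Modifier.TRANSITIVE), "org.joda.convert")
                .exports("org.joda.beans")
                .exports("org.joda.beans.ser")
                .build();
    }

    // Programmatic equivalent of the org.joda.convert declaration.
    static ModuleDescriptor jodaConvert() {
        return ModuleDescriptor.newModule("org.joda.convert")
                .requires(Set.of(Requires.Modifier.STATIC), "com.google.guava")
                .exports("org.joda.convert")
                .build();
    }

    // Looks up the modifiers declared on one dependency of a module.
    static Set<Requires.Modifier> modifiersOf(ModuleDescriptor descriptor, String dependency) {
        return descriptor.requires().stream()
                .filter(r -> r.name().equals(dependency))
                .findFirst()
                .map(Requires::modifiers)
                .orElseThrow(() -> new IllegalArgumentException("no such dependency: " + dependency));
    }

    public static void main(String[] args) {
        System.out.println(modifiersOf(jodaBeans(), "org.joda.convert"));
        System.out.println(modifiersOf(jodaConvert(), "com.google.guava"));
    }
}
```

For a real modular jar, the same information is available from getClass().getModule().getDescriptor() rather than from a hand-built descriptor.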

Access rules

When running a modular jar on the modulepath with JVM access rules applied, code in package A can see a type in package B if:

  • the type is public
  • package B is exported from its module
  • there is a dependency from the module containing package A to the module containing package B

Thus, in the example above, code in module com.opengamma.util can see packages org.joda.beans, org.joda.beans.ser, org.joda.convert and any package exported by Guava. However, it cannot see package org.joda.convert.internal (as it is not exported). In addition, code in module com.google.guava cannot see code in package org.joda.beans or org.joda.convert as there is no modular dependency.
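The three rules can be expressed as a toy model of the example modules. This is a simplified sketch for illustration, not how the JVM implements access checks: the class AccessRulesSketch and its maps are hypothetical, "requires static" is treated like a normal dependency, access within a single module is not modelled, and com.google.common.base stands in for the packages Guava exports.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class AccessRulesSketch {

    // module -> modules it requires (plain, transitive and static alike)
    static final Map<String, Set<String>> REQUIRES = Map.of(
            "com.opengamma.util", Set.of("org.joda.beans", "com.google.guava"),
            "org.joda.beans", Set.of("org.joda.convert"),
            "org.joda.convert", Set.of("com.google.guava"),
            "com.google.guava", Set.<String>of());

    // module -> dependencies it passes on with "requires transitive"
    static final Map<String, Set<String>> TRANSITIVE = Map.of(
            "org.joda.beans", Set.of("org.joda.convert"));

    // package -> module that exports it; hidden packages are simply absent
    static final Map<String, String> EXPORTED_BY = Map.of(
            "com.opengamma.util", "com.opengamma.util",
            "org.joda.beans", "org.joda.beans",
            "org.joda.beans.ser", "org.joda.beans",
            "org.joda.convert", "org.joda.convert",
            "com.google.common.base", "com.google.guava");

    // Modules readable from 'module': its direct dependencies, plus anything
    // those dependencies pass on via "requires transitive".
    static Set<String> readableFrom(String module) {
        Set<String> readable = new HashSet<>();
        for (String dependency : REQUIRES.getOrDefault(module, Set.<String>of())) {
            addWithTransitives(dependency, readable);
        }
        return readable;
    }

    private static void addWithTransitives(String module, Set<String> readable) {
        if (readable.add(module)) {
            for (String passedOn : TRANSITIVE.getOrDefault(module, Set.<String>of())) {
                addWithTransitives(passedOn, readable);
            }
        }
    }

    // Applies the three rules: public type, exported package, module dependency.
    static boolean canSee(String fromModule, String targetPackage, boolean typeIsPublic) {
        String owner = EXPORTED_BY.get(targetPackage);
        return typeIsPublic && owner != null && readableFrom(fromModule).contains(owner);
    }

    public static void main(String[] args) {
        System.out.println(canSee("com.opengamma.util", "org.joda.convert", true));
        System.out.println(canSee("com.opengamma.util", "org.joda.convert.internal", true));
        System.out.println(canSee("com.google.guava", "org.joda.beans", true));
    }
}
```

Note that com.opengamma.util only reaches org.joda.convert because org.joda.beans declares it with "requires transitive"; removing that entry from the TRANSITIVE map makes the first check fail.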

What can go wrong?

The basics described above are simple enough. It is initially quite easy to imagine how you might build an application from these foundations and benefit from hiding packages. Unfortunately, quite a few things can go wrong.

1) All use of module-info files only applies if using modular jars on the modulepath. For compatibility, all code on the classpath is packaged up as a special unnamed module, with no hidden packages and full access to the whole JDK. Thus, the security benefits of hiding packages are marginal at best. However, the modules of the JDK itself are always run in modular mode, thus are always guaranteed the security benefits.

2) Versions of modules are not handled. You cannot have the same module name loaded twice - that is, you cannot have two versions of the same module loaded at the same time. It is left entirely to you, and thus to your build tool, to create a coherent set of modules that can actually be run. Thus, the classpath hell situation caused by clashing versions is not solved. Note that putting the version number in the module name is a Bad Idea that does not solve this problem and creates others.

3) Two modules may not contain the same package. This seems eminently sensible, until you consider that it also applies to hidden packages. Since hidden packages are not listed in module-info.class, a tool like Maven must unpack the jar file to discover what hidden packages there are in order to warn of clashes. As a user of the library, such a clash will be completely surprising, as you won't have any indication of the hidden packages in the Javadoc. This is a more general indication that JPMS does not provide sufficient isolation between modules, for reasons that are far from clear at this point.

4) There must be no cycles between modules, at compile time and at runtime. Again, this seems sensible - who wants to have module A depend on B, which depends on C, which depends on A? But the reality of existing projects is that this happens, and on the classpath it is not a problem. For example, consider what would happen if Guava decided to depend on Joda-Convert in the example above. This restriction will make some existing open source projects hard to migrate.

5) Reflection is changing, such that non-public fields and methods will no longer be accessible via reflection. Since almost every framework uses reflection in this way, there will be significant work needed to migrate existing code. In particular, the JDK will be very locked down against reflection, which may prove painful (command line flags can escape the trap for now). This article hasn't had a chance to explore how the module declaration can influence reflection - see "opens" in the slides for more details.

6) Are your dependencies modularized? In theory, you can only turn your code into a module once all your dependencies are also modules. For any large application with hundreds of jar file dependencies, this will be a problem. The "solution" is automatic modules, where a normal jar file placed on the modulepath is automatically turned into a module. This process is controversial, with naming a big issue. Library authors should not publish modules that depend on automatic modules to public repositories like Maven Central unless they have the Automatic-Module-Name manifest entry. Again, automatic modules deserve their own article!

7) Module naming is not yet set in stone. I've come to believe that naming your module after the highest package it contains, causing that module to "take ownership" of the subpackages, is the only sane strategy.

8) Conflicts with build systems - who is in charge? A Maven pom.xml also contains information about a project. Should it be extended to allow module information to be added? I would suggest not, because the module-info.java file contains a binding part of your API, and that is best expressed in .java code, not metadata like pom.xml.
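Returning to point 5, the effect on reflection can be probed with a few lines of code. This is a sketch only, assuming Java 9 or later; the class name ReflectionProbe and the helper canOpen are hypothetical, while AccessibleObject.trySetAccessible() is a standard Java 9 method that reports failure as a boolean instead of throwing. The result for JDK internals depends on the JDK release and on command line flags such as --add-opens, so no particular output is claimed for the second call.

```java
import java.lang.reflect.Field;

public class ReflectionProbe {

    // A private field in our own class, used as the "allowed" case
    private int secret = 42;

    // Returns true if the named declared field can be made accessible
    // via reflection when called from this class.
    static boolean canOpen(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            return field.trySetAccessible();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(
                    "no field " + fieldName + " on " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        // Same module as the caller, so access can be enabled
        System.out.println("own field: " + canOpen(ReflectionProbe.class, "secret"));
        // A JDK internal: depends on the JDK version and command line flags
        System.out.println("String.value: " + canOpen(String.class, "value"));
    }
}
```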

For those wanting a book to read in much more depth, try this one from Nicolai.

Summary

Do not get too excited about JPMS - modules in Java 9. The above is only a summary of what is possible with the module-info.java file and the restrictions of the JPMS. If you are thinking of modularizing your library or application, please wait a little longer until everything becomes a little clearer.

Feedback welcome, but bear in mind that I am planning more articles.

16 comments:

  1. When you say "This module declaration says that com.foo.myapp", do you mean "com.opengamma.util" instead?

  2. I think the most important point of OSGi is its highly dynamic nature. Bundles are used as a more structured way (compared to classic raw classloader shenanigans) of managing unattended upgrades of long-running embedded systems (from set-top boxes to cars), dynamic plugin systems that don't need global restarts (eclipse), etc. It's complex, but not for no reason. OSGi painstakingly provides stop and uninstall, not just start and install. The JPMS seems near-orthogonal to most of what OSGi does, the new modules just another static layer like packages. OSGi will now just have to hammer out correct dynamic management of java modules too, sigh.

    http://eclipsesource.com/blogs/2013/01/23/how-to-track-lifecycle-changes-of-osgi-bundles/

    This is something quite different to what JPMS seems to be. I'm not saying JPMS should have been OSGi either, just that they seem to do very different things.

    Replies
    1. Yes, OSGi is very different from JPMS. The two will continue to be completely separate IMO.

    2. The problem as I see it is that, despite the initial spec explicitly referencing OSGi as a standard that jigsaw should play nice with, the technical realities of a design that allowed for jdk modularization to happen this century heavily influenced the actual spec. This is a difficult temptation to resist, especially under time pressure.

      It's a shame that the spec lead for this JSR was completely fine with OSGi, a module-focused, widely adopted industry standard alive since 1998, having to hack together an interoperability solution that bastardizes the actual concepts (layers, specifically) delivered in the "reference" implementation.

      Reading through the spec evolutions and the published EG discussions, the real takeaway for me was that you must not ever sign off on functional specs unless they are perfectly clear. The final reference implementation relies on an "easy" word-for-word interpretation of the document that, while being technically true,

  3. This comment has been removed by the author.

  4. Great post.

    As for JPMS not resembling OSGi, the latter system has had over a decade to gain popularity and has failed to do so. JBoss Modules, although more recent, seems to have had little success as well.

    On the other hand, JPMS is much simpler than OSGi and as far as I can tell addresses a different use case. With JPMS modules, 1) searching for classes is likely to be faster and 2) self-contained applications will become feasible at last.

    While it is difficult to see how JPMS will facilitate the creation of "big" applications composed of many components, JPMS will allow applications to be assembled as stand-alone programs rather than WAR/EAR files to be deployed in application servers.

  5. Nice post on the JPMS basics. How do Layers work in JPMS?

  6. @Stephen Colebourne - is "requires transitive static com.google.guava;" possible? The Java Specification says something as below. I am not sure what is "one of". Does it mean the combination is not allowed?
    RequiresModifier:
    (one of)
    transitive static

    Replies
    1. I don't think you can have both together. With things like this, asking Stack Overflow or trying it is a better approach.

  7. The "lack" of popularity of the existing module systems (OSGi, JBoss) is related to the fact that people have been misled by the JDK to stop thinking in modules.
    Modula-2, Turbo Pascal and Ada had their foundations in modules and there were large applications that gained from this.
    All topics related to modular development will just pop up again as people will start using JPMS extensively.
    JPMS is a good thing (btw, I am using OSGi extensively ;-)). This is the beginning, not the end.

  8. Thanks Stephen for this concise overview.
    Note that OSGi is more than the module layer.

  9. 4) There must be no cycles between modules, at compile time and at runtime.

    How do you compile modules with cycle dependencies?
    Simplest example is com.a requires com.b, com.b requires com.a.

    Replies
    1. You have to break the cycle, such as by using a separation between interface and implementation, or ServiceLoader.

    2. If com.b requires com.a we compile like this:
      javac -d mods/com.a $(find com.a/src/ -name "*.java")
      javac -d mods/com.b -p mods/com.a $(find com.b/src/ -name "*.java")

      If com.b requires com.a and com.a requires com.b:
      no way to compile.

      I mean that introducing cyclic dependencies requires dependency resolution at runtime, not at compile time.

    3. In a web application you will often compile against an API only, but the implementation has its own dependencies, potentially creating cyclical relationships.

