
I am the founder and lead developer of Hibernate Envers, a Hibernate core module, which provides entity versioning/auditing capabilities. I am also one of the co-founders of SoftwareMill, a company specializing in delivering customized software solutions (http://softwaremill.com, "Extraordinary software as a standard"), based on Java and JBoss technologies. After work, apart from being involved in development of Envers, I work on several small open source projects, like ElasticMQ (simple message queue written in Scala with an SQS interface), projects around static analysis (using JSR 308 - Typestate Annotations/ Checkers Framework and FindBugs), and some CDI/Weld (not always portable) extensions, like autofactories or stackable security interceptors. I am also interested in new JVM-based languages, especially with functional elements (like Scala, JRuby) and frameworks built using them (like Lift), as well as improving the ways we use Dependency Injection. Adam is a DZone MVB and is not an employee of DZone and has posted 52 posts at DZone.

Let's Turn Packages into a Module System!

11.16.2012

Many projects are divided into modules/subprojects using the build system (Maven, Gradle, SBT …); and writing modular code is generally a Good Thing. Dividing the code into build modules is mainly used for:

  • isolating parts of code (decreasing coupling)
  • api/impl split
  • adding a third-party dependency only to a specific part of code
  • grouping code with similar functionality
  • statically checking that code in one module only uses code from its dependent modules (inter-module dependencies)

While some may say that it is also useful for separate compilation, I don’t think that matters a lot (when considering one project). Build tools are smart enough nowadays to figure out what needs to be recompiled.

Problems with build modules

I think there are several problems with this approach. First of all, it is pretty hard to decide when a piece of functionality is “big enough” to turn it into a build module. Is a handful of classes enough? Or do you need more? Should it strictly be one functionality per module? But that would cause a module explosion; and so on. At least in the projects I took part in, how coarse-grained the build modules should be was a recurring topic of discussion.

Secondly, build modules are pretty “heavy”. Maven is probably the worst: you need a large piece of XML to create a module, with lots of boilerplate (for example a repeated group id, version number and parent definition). SBT and Gradle are much better, but it is still a significant effort. A separate directory needs to be created, along with the whole directory structure (src/main/..., src/test/...), the build config needs to be updated, etc. Overall it is quite a hassle.

And then quite often when we have our beautiful modules separated, it turns out that in order for two of them to cooperate, we need a “common” part. Then we either end up with a bloated foo-common module, which contains loads of unrelated classes, or multiple small foo-foomodule-common modules; the second solution is fine of course, except for the time wasted setting it up.

Finally, a build module is an additional thing you have to name. Most probably the package name and the class name already reflect what the code is doing; now that also needs to be repeated in the build module name (a DRY violation).

All in all, I think creating build modules is much too hard and time-consuming. Programmers are lazy (which, of course, is a good thing), and this leads to designs which are not as clean as they could be. Time to change that :).

(See also my earlier blog on modules.)

Packages

Java, Scala and Groovy already have a system for grouping code: packages. However, currently a package is just a string identifier. Except for some very limited visibility options (package-private in Java, package-scoping in Scala) packages have no semantic meaning. So we have several levels of grouping code:

  1. Project
  2. Build module
  3. Package
  4. Class

What if we merged 2. and 3. together; why shouldn’t packages be used for creating modules?

Packages as modules?

Let’s see what it would take to extend packages to be modules. Obviously the first thing that we’d need is to associate some meta-data with each module. There are already some mechanisms for this (e.g. via annotations on package-info.java), or this could be an extension of package objects in Scala: some traits to mix in, or vals to override.

What kind of meta-data? Of course we don’t want to move the whole build definition to the packages. But let’s separate concerns – the build definition should define how to build the project, not what the module dependencies are. Then the first thing to define in a module’s meta-data would be dependencies on third-party libraries. Such definitions could be only symbols, which would be bound to concrete versions in the build definition.

For example, we would specify that package “foo.bar.dao” depends on the “jpa” libraries. The build definition would then contain a mapping from “jpa” to a list of Maven artifacts (e.g. hibernate-core, hibernate-entitymanager etc.). Moreover, it would probably make most sense if such dependencies were transitive to sub-packages. So defining a global library would mean adding a dependency on the root package.
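A minimal sketch of how such a lookup could work, written in present-day Scala with plain maps standing in for the package meta-data and the build definition. Everything here is hypothetical: the object and method names, the "logging" symbol and its "slf4j-api" artifact are made up for illustration; only the "jpa" mapping follows the example above.

```scala
object LibraryResolution {
  // Build-definition side: library symbol -> concrete artifacts.
  // "jpa" follows the example in the text; "logging" is a made-up symbol.
  val artifactsBySymbol: Map[String, List[String]] = Map(
    "jpa"     -> List("hibernate-core", "hibernate-entitymanager"),
    "logging" -> List("slf4j-api")
  )

  // Package meta-data side: package name -> symbols declared directly on it.
  val declaredLibraries: Map[String, Set[String]] = Map(
    "foo"         -> Set("logging"), // declared on the root package, so "global"
    "foo.bar.dao" -> Set("jpa")
  )

  // "foo.bar.dao" -> List("foo.bar.dao", "foo.bar", "foo")
  def selfAndAncestors(pkg: String): List[String] = {
    val parts = pkg.split('.').toList
    (parts.length to 1 by -1).map(n => parts.take(n).mkString(".")).toList
  }

  // Symbols visible in a package: its own plus those of all parent packages.
  def librariesFor(pkg: String): Set[String] =
    selfAndAncestors(pkg)
      .flatMap(p => declaredLibraries.getOrElse(p, Set.empty[String]))
      .toSet

  // Concrete artifacts the build system would put on the package's classpath.
  def artifactsFor(pkg: String): Set[String] =
    librariesFor(pkg).flatMap(symbol => artifactsBySymbol.getOrElse(symbol, Nil))
}
```

With these definitions, "foo.bar.dao" sees both "jpa" and "logging", while "foo.bar" only sees the "logging" symbol inherited from the root package.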

As a side note, with an extension of Scala’s package objects, this could even be made type-safe. The package objects could implement a trait, where one of the values to override could be the list of third-party dependencies symbols. The symbols themselves could be e.g. contained in an Enumeration, defined in the root package; which could make things like “find all modules dependent on jpa” a simple usage-search in the IDE.
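The type-safe variant can be approximated today with ordinary objects instead of real package objects. The sketch below is only an emulation under that assumption: PackageModule, FooLibs, moduleName and the two example modules are hypothetical names, and the module list has to be maintained by hand because no compiler support exists.

```scala
object TypeSafeLibraries {
  // Library symbols as an Enumeration, as it might live in the root package.
  object FooLibs extends Enumeration {
    val JPA, Logging = Value
  }

  // Stand-in for the trait every package object would implement.
  trait PackageModule {
    def moduleName: String
    def moduleLibraries: List[FooLibs.Value] = Nil
  }

  object UserDao extends PackageModule {
    val moduleName = "foo.user.dao"
    override val moduleLibraries: List[FooLibs.Value] = List(FooLibs.JPA)
  }

  object UserModel extends PackageModule {
    val moduleName = "foo.user.model"
  }

  // Registered by hand here; a real implementation would discover modules.
  val allModules: List[PackageModule] = List(UserDao, UserModel)

  // Programmatic version of "find all modules dependent on jpa".
  def dependentOn(lib: FooLibs.Value): List[String] =
    allModules.filter(_.moduleLibraries.contains(lib)).map(_.moduleName)
}
```

Because FooLibs.JPA is a regular value, a misspelled library symbol fails at compile time, and an IDE usage search on it lists the dependent modules.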

The second step is to define inter-module dependencies using this mechanism as well. It would be possible, in the package’s meta-data, to define a list of other packages from which code is visible. This follows how build modules are currently used: each contains a list of project modules which can be accessed. (Another Scala side-note: as the package objects would implement a trait, this would mean defining a list of objects with a given type.)

Taking this further, we could specify api and impl type-packages. Api-type ones would by default be accessible from other packages. Impl-type packages, on the other hand, couldn’t be accessed without explicitly specifying them as a dependency.

What could it look like in practice? A very rough sketch in Scala:

package foo.user
 
// Even without definition, each package has an implicit package object
// implementing a PackageModule trait ...
package object dao {
  // ... which is used here. The type of the val below is
  // List[PackageModule].
  override val moduleDependsOn = List(foo.security, foo.user.model)
  override val moduleType = ModuleType.API
  // FooLibs enum is defined in a top-level package or the build system
  override val moduleLibraries = List(FooLibs.JPA)
}
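The sketch above relies on compiler support that does not exist. The access rule it implies can still be illustrated with plain data: API-type packages are visible to everyone, impl-type packages only to packages that list them as a dependency. The following is a minimal model under that reading; the case class, the canAccess function and the "foo.web" package are hypothetical, and treating foo.security as an impl-type package is an assumption made for the example.

```scala
object ModuleVisibility {
  sealed trait ModuleType
  case object Api extends ModuleType
  case object Impl extends ModuleType

  // Plain-data stand-in for the meta-data of one package.
  final case class PackageModule(
      name: String,
      moduleType: ModuleType,
      dependsOn: Set[String] = Set.empty
  )

  // A package may use its own code, any API-type package, and any
  // package it explicitly lists as a dependency.
  def canAccess(from: PackageModule, to: PackageModule): Boolean =
    from.name == to.name ||
      to.moduleType == Api ||
      from.dependsOn.contains(to.name)

  val userModel = PackageModule("foo.user.model", Api)
  val security  = PackageModule("foo.security", Impl) // assumed impl-type
  val userDao   = PackageModule("foo.user.dao", Api, Set("foo.security", "foo.user.model"))
  val web       = PackageModule("foo.web", Impl) // hypothetical, no dependencies
}
```

Here foo.user.dao may use foo.security because it declares the dependency, while foo.web may use foo.user.model (API-type) but not foo.security.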

Refactoring

Refactoring is an everyday activity; however, refactoring modules is usually a huge task, approached only once in a while. Should it be so? If packages were extended to modules, refactoring modules would be the same as moving around and renaming packages, with the additional need to update the meta-data. It would be much easier than it is today, which I think would lead to better overall designs.

Build system

The above would obviously mean more work for the build system: it would have a harder time figuring out the list of modules, the build order, the list of artifacts to create etc. (by the way, whether a separate jar should be created for a package could also be part of the meta-data). Some validations would also be needed, for example for circular dependencies, or for attempts to constrain the visibility in an invalid way.
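Both the build order and the circular-dependency check come down to a depth-first topological sort over the package dependency graph. The following is a minimal sketch, assuming the meta-data has already been read into a map from package name to the names of the packages it depends on; BuildOrder and buildOrder are hypothetical names, not part of any existing build tool.

```scala
import scala.collection.mutable

object BuildOrder {
  // Right(order) lists every package after all of its dependencies.
  // Left(path) describes a dependency path that ends in a cycle.
  def buildOrder(deps: Map[String, Set[String]]): Either[String, List[String]] = {
    val inProgress = mutable.Set.empty[String]
    val done       = mutable.Set.empty[String]
    val order      = mutable.ListBuffer.empty[String]

    // Depth-first visit; `path` holds the packages currently being visited,
    // most recent first. Returns a description of the first cycle found.
    def visit(pkg: String, path: List[String]): Option[String] =
      if (done.contains(pkg)) None
      else if (inProgress.contains(pkg)) Some((pkg :: path).reverse.mkString(" -> "))
      else {
        inProgress += pkg
        val cycle = deps.getOrElse(pkg, Set.empty[String]).toList.sorted
          .foldLeft(Option.empty[String]) { (found, dep) =>
            found.orElse(visit(dep, pkg :: path))
          }
        if (cycle.isEmpty) {
          inProgress -= pkg
          done += pkg
          order += pkg
        }
        cycle
      }

    // Sorting makes the result deterministic for a given input.
    val allPackages = (deps.keySet ++ deps.values.flatten).toList.sorted
    allPackages.foldLeft(Option.empty[String]) { (found, pkg) =>
      found.orElse(visit(pkg, Nil))
    } match {
      case Some(cycle) => Left(cycle)
      case None        => Right(order.toList)
    }
  }
}
```

For the dependencies of the earlier sketch (foo.user.dao depending on foo.security and foo.user.model), this yields an order in which foo.user.dao comes last; two packages that depend on each other produce a Left instead.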

But then, people have done more complicated software than that.

Jigsaw?

You would probably say that this overlaps with project Jigsaw, which will come in Java 9 (or not). However, I think Jigsaw aims at a different scale: project-level modules. So one Jigsaw module would be your whole project, while you would have multiple (tens of) package-modules.

The name “module” is overloaded here, maybe the name “mini-modules” would be better, or very modestly “packages done right”.

Bottom line

I think that currently the way to define build modules is way too hard and constraining. On the other hand, lifting packages to modules would be very lightweight. Defining a new module would be the same as creating a new package – couldn’t get much simpler. Third-party libraries could easily be added only where they are needed. There would be one less thing to name. And there would be one source tree per project.

Also such an approach would be scalable and adjustable to the project’s needs. It would be possible to define fine-grained modules or coarse-grained ones without much effort. Or even better, why not create both – modules could be nested and built one on top of the other.

Now … the only problem is implementing, and adding IDE support ;)


Published at DZone with permission of Adam Warski, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Andreas Schilling replied on Sat, 2012/11/17 - 11:01am

 Hmm, how do you come to the conclusion that Jigsaw aims at project-scale modules? To me it definitely aims at the same level of granularity that you describe.

While I agree with the points made in your article, we already have that sort of stuff since ages. It's called OSGi.

Adam Warski replied on Sat, 2012/11/17 - 3:35pm in response to: Andreas Schilling

As for Jigsaw, that's the impression I got when reading http://openjdk.java.net/projects/jigsaw/doc/quickstart.html, for example: "The module-info.java is placed in the root of the source tree of the module's classes". So it looks pretty build-module global, not per-package.

As for OSGi, I think it tries to solve a bit different problem. First of all, unless I'm mistaken, it only checks the dependencies at run-time - not at compile-time. Secondly, one of the main features is the ability to replace a module, which is a completely unrelated thing. 

Finally, at least currently, turning each package into a bundle (a bundle is a jar) with a manifest would be very impractical and a burden, not much help. So while the meta-data defined in OSGi is similar, and the idea is in some places similar, the current implementation won't help here.
