Friday, 23 November 2012

Javadoc coding standards

Javadoc is a key part of coding in Java, yet there is relatively little discussion of what makes good Javadoc style - a coding standard.

Javadoc coding standard

These are the standards I tend to use when writing Javadoc. Since personal tastes differ, I've tried to explain some of the rationale for some of my choices. Bear in mind that this is more about the formatting of Javadoc than the content of Javadoc.

There is an Oracle guide which is longer and more detailed than this one. The two agree in most places; however, these guidelines are more explicit about HTML tags, two spaces in @param and null-specification, and differ in line lengths and sentence layout.

Each of the guidelines below consists of a short description of the rule and an explanation, which may include an example:

Write Javadoc to be read as source code

When we think of "Javadoc" we often think of the online Javadoc HTML pages. However, this is not the only way that Javadoc is consumed. A key way of absorbing Javadoc is reading source code, either of code you or your team wrote, or third party libraries. Making Javadoc readable as source code is critical, and these standards are guided by this principle.

Public and protected

All public and protected methods should be fully defined with Javadoc. Package and private methods do not have to be, but may benefit from it.

If a method is overridden in a subclass, Javadoc should only be present if it says something distinct to the original definition of the method. The @Override annotation should be used to indicate to source code readers that the Javadoc is inherited in addition to its normal meaning.
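As a minimal sketch of this rule, assuming hypothetical Shape and Square classes invented purely for illustration, the subclass below leaves the overriding method without Javadoc because it has nothing distinct to add:

```java
/**
 * A hypothetical base type whose Javadoc fully defines the contract.
 */
abstract class Shape {

    /**
     * Gets the area of this shape.
     *
     * @return the area, not negative
     */
    abstract double area();
}

/**
 * A hypothetical square, defined by the length of its side.
 */
final class Square extends Shape {

    private final double side;

    /**
     * Creates an instance.
     *
     * @param side  the length of the side, not negative
     */
    Square(double side) {
        this.side = side;
    }

    // no Javadoc here, as there is nothing distinct to say
    // the @Override tells source code readers that the Javadoc is inherited
    @Override
    double area() {
        return side * side;
    }
}
```

If Square needed to say something the base definition does not, such as a different rounding rule, that would be the point at which to add Javadoc to the overriding method.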

Use the standard style for the Javadoc comment

Javadoc only requires a '/**' at the start and a '*/' at the end. In addition to this, use a single star on each additional line:

  /**
   * Standard comment.
   */
  public ...

  /** Compressed comment. */
  public ...

Do not use '**/' at the end of the Javadoc.

Use simple HTML tags, not valid XHTML

Javadoc uses HTML tags to identify paragraphs and other elements. Many developers get drawn to the thought that XHTML is necessarily best, ensuring that all tags open and close correctly. This is a mistake. XHTML adds many extra tags that make the Javadoc harder to read as source code. The Javadoc parser will interpret the incomplete HTML tag soup just fine.

Use a single <p> tag between paragraphs

Longer Javadoc always needs multiple paragraphs. This naturally results in a question of how and where to add the paragraph tags. Place a single <p> tag on the blank line between paragraphs:

  /**
   * First paragraph.
   * <p>
   * Second paragraph.
   * May be on multiple lines.
   * <p>
   * Third paragraph.
   */
  public ...

Use a single <li> tag for items in a list

Lists are useful in Javadoc when explaining a set of options, choices or issues. These standards place a single <li> tag at the start of the line and no closing tag. In order to get correct paragraph formatting, extra paragraph tags are required:

  /**
   * First paragraph.
   * <p><ul>
   * <li>the first item
   * <li>the second item
   * <li>the third item
   * </ul><p>
   * Second paragraph.
   */
  public ...

Define a punchy first sentence

The first sentence, typically ended by a dot, is used in the next higher level of Javadoc. As such, it has the responsibility of summing up the method or class to readers scanning the class or package. To achieve this, the first sentence should be clear and punchy, and generally short.

While not required, it is recommended that the first sentence is a paragraph to itself. This helps retain the punchiness for readers of the source code.

It is recommended to use the third person form at the start. For example, "Gets the foo", "Sets the bar" or "Consumes the baz". Avoid the second person form, such as "Get the foo".

Use "this" to refer to an instance of the class

When referring to an instance of the class being documented, use "this" to reference it. For example, "Returns a copy of this foo with the bar value updated".
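The wording rules for the first sentence and for "this" might look as follows in practice. This is only a sketch: the Money class and its methods are hypothetical names chosen for the example.

```java
/**
 * A hypothetical immutable amount of money, used only to illustrate the wording.
 */
final class Money {

    private final long cents;

    /**
     * Creates an instance.
     *
     * @param cents  the amount in cents
     */
    Money(long cents) {
        this.cents = cents;
    }

    /**
     * Gets the amount in cents.
     *
     * @return the amount in cents
     */
    long getCents() {
        return cents;
    }

    /**
     * Returns a copy of this money with the specified amount added.
     * <p>
     * This instance is immutable and unaffected by this method call.
     *
     * @param centsToAdd  the amount to add, in cents
     * @return a money based on this one with the amount added, not null
     */
    Money plusCents(long centsToAdd) {
        return new Money(cents + centsToAdd);
    }
}
```

Each first sentence is short, stands as its own paragraph and starts with a third person verb, while "this money" refers to the instance being documented.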

Aim for short single line sentences

Wherever possible, make Javadoc sentences fit on a single line. Allow flexibility in the line length, favouring between 80 and 120 characters to make this work.

In most cases, each new sentence should start on a new line. This aids readability as source code, and simplifies refactoring re-writes of complex Javadoc.

  /**
   * This is the first paragraph, on one line.
   * <p>
   * This is the first sentence of the second paragraph, on one line.
   * This is the second sentence of the second paragraph, on one line.
   * This is the third sentence of the second paragraph which is a bit longer so has been
   * split onto a second line, as that makes sense.
   * This is the fourth sentence, which starts a new line, even though there is space above.
   */
  public ...

Use @link and @code wisely

Many Javadoc descriptions reference other methods and classes. This can be achieved most effectively using the @link and @code features.

The @link feature creates a visible hyperlink in generated Javadoc to the target. The @link target is one of the following forms:

  /**
   * First paragraph.
   * <p>
   * Link to a class named 'Foo': {@link Foo}.
   * Link to a method 'bar' on a class named 'Foo': {@link Foo#bar}.
   * Link to a method 'baz' on this class: {@link #baz}.
   * Link specifying text of the hyperlink after a space: {@link Foo the Foo class}.
   * Link to a method handling method overload {@link Foo#bar(String,int)}.
   */
  public ...

The @code feature provides a section of fixed-width font, ideal for references to methods and class names. While @link references are checked by the Javadoc compiler, @code references are not.

Only use @link on the first reference to a specific class or method. Use @code for subsequent references. This avoids excessive hyperlinks cluttering up the Javadoc.

Never use @link in the first sentence

The first sentence is used in the higher level Javadoc. Adding a hyperlink in that first sentence makes the higher level documentation more confusing. Always use @code in the first sentence if necessary. @link can be used from the second sentence/paragraph onwards.
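To illustrate how @link and @code can be balanced, here is a small sketch. The Names class and its sortedCopy method are hypothetical, while java.util.List and Collections.sort(List) are the real JDK types being referenced.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical utilities for lists of names, used only to illustrate the tags.
 */
final class Names {

    private Names() {
    }

    /**
     * Returns a sorted copy of the specified {@code List}.
     * <p>
     * The copy is sorted using {@link Collections#sort(List)}.
     * The input list is not altered, as {@code Collections.sort} is only applied to the copy.
     *
     * @param names  the names to sort, not null
     * @return the sorted copy, not null
     */
    static List<String> sortedCopy(List<String> names) {
        // copy first so that the caller's list is left untouched
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }
}
```

The first sentence uses only @code, the first real reference after it uses @link, and the repeat reference falls back to @code.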

Do not use @code for null, true or false

The concepts of null, true and false are very common in Javadoc. Adding @code for every occurrence is a burden to both the reader and writer of the Javadoc and adds no real value.

Use @param, @return and @throws

Almost all methods take in a parameter, return a result or both. The @param and @return features specify those inputs and outputs. The @throws feature specifies the thrown exceptions.

The @param entries should be specified in the same order as the parameters. The @return should be after the @param entries, followed by @throws.

Use @param for generics

If a class or method has generic type parameters, then these should be documented. The correct approach is an @param tag with the parameter name of <T> where T is the type parameter name.
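A short sketch of documenting type parameters at both class and method level, assuming a hypothetical Holder class invented for this example:

```java
/**
 * A hypothetical holder of a single value.
 *
 * @param <T>  the type of the held value
 */
final class Holder<T> {

    private final T value;

    /**
     * Creates an instance.
     *
     * @param value  the value to hold, may be null
     */
    Holder(T value) {
        this.value = value;
    }

    /**
     * Obtains a holder of the specified value.
     *
     * @param <R>  the type of the value
     * @param value  the value to hold, may be null
     * @return the holder, not null
     */
    static <R> Holder<R> of(R value) {
        return new Holder<>(value);
    }

    /**
     * Gets the held value.
     *
     * @return the held value, may be null
     */
    T get() {
        return value;
    }
}
```

The type parameter entry is written like any other @param, with the name wrapped in angle brackets, and is listed before the ordinary parameters.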

Use one blank line before @param

There should be one blank line between the Javadoc text and the first @param or @return. This aids readability in source code.

Treat @param and @return as a phrase

The @param and @return should be treated as phrases rather than complete sentences. They should start with a lower case letter, typically using the word "the". They should not end with a dot. This aids readability in source code and when generated.

Treat @throws as an if clause

The @throws feature should normally be followed by "if" and the rest of the phrase describing the condition. For example, "@throws IllegalArgumentException if the file could not be found". This aids readability in source code and when generated.
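Putting the ordering, phrase and "if" clause rules together might look like the sketch below. The Ratios class and its percentage method are hypothetical names used only for illustration.

```java
/**
 * Hypothetical ratio calculations, used only to illustrate the tags.
 */
final class Ratios {

    private Ratios() {
    }

    /**
     * Calculates the percentage that one value forms of another.
     *
     * @param part  the part value, not negative
     * @param total  the total value, greater than zero
     * @return the percentage, not negative
     * @throws IllegalArgumentException if the part is negative or the total is not positive
     */
    static double percentage(double part, double total) {
        if (part < 0 || total <= 0) {
            throw new IllegalArgumentException("Invalid input: " + part + ", " + total);
        }
        return part * 100 / total;
    }
}
```

The @param entries follow the order of the parameters, each entry is a lower case phrase with no trailing dot, and the @throws entry reads as an "if" clause.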

@param should have two spaces after the parameter name

When reading the Javadoc as source code, a single space after the parameter name is a lot harder to read than two spaces. Avoid aligning the parameters in a column, as it is prone to difficulty in refactoring where parameter names are changed or added.

  /**
   * Javadoc text.
   * 
   * @param foo  the foo parameter
   * @param bar  the bar parameter
   * @return the baz content
   */
  public String process(String foo, String bar) {...}

Define null-handling for all parameters and return types

Whether a method accepts null on input, or can return null, is critical information for building large systems. All methods with non-primitive parameters or return types should define their null-tolerance in the @param or @return. Some standard forms expressing this should be used wherever possible:

  • "not null" means that null is not accepted and passing in null will probably throw an exception, typically NullPointerException
  • "may be null" means that null may be passed in. In general the behaviour of the passed in null should be defined
  • "null treated as xxx" means that a null input is equivalent to the specified value
  • "null returns xxx" means that a null input always returns the specified value

When defined in this way, there should not be an @throws for NullPointerException.

  /**
   * Javadoc text.
   * 
   * @param foo  the foo parameter, not null
   * @param bar  the bar parameter, null returns null
   * @return the baz content, null if not processed
   */
  public String process(String foo, String bar) {...}

While it may be tempting to define null-handling behaviour in a single central location, such as the class or package Javadoc, this is far less useful for developers. The Javadoc at the method level appears in IDEs during normal coding, whereas class or package level Javadoc requires a separate "search and learn" step.

Other simple constraints may be added as well if applicable, for example "not empty, not null". Primitive values might specify their bounds, for example "from 1 to 5", or "not negative".
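As a sketch of how these constraints can line up with the code, the hypothetical Ratings class below documents "not empty, not null", a primitive range and a "null treated as" parameter. The class, method and parameter names are assumptions made for the example.

```java
import java.util.Objects;

/**
 * Hypothetical star rating utilities, used only to illustrate the constraints.
 */
final class Ratings {

    private Ratings() {
    }

    /**
     * Formats a star rating for display.
     *
     * @param title  the title to display, not empty, not null
     * @param stars  the number of stars, from 1 to 5
     * @param suffix  the text to append, null treated as empty string
     * @return the formatted rating, not null
     * @throws IllegalArgumentException if the title is empty or the stars are out of range
     */
    static String format(String title, int stars, String suffix) {
        // "not null" is stated on the parameter, so there is no @throws for NullPointerException
        Objects.requireNonNull(title, "title");
        if (title.isEmpty() || stars < 1 || stars > 5) {
            throw new IllegalArgumentException("Invalid title or stars");
        }
        return title + " " + stars + "/5" + (suffix == null ? "" : suffix);
    }
}
```

The explicit null check is one possible way to honour "not null"; simply dereferencing the parameter would also satisfy the documented contract.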

Specifications require implementation notes

If you are writing a more formal specification that will be implemented by third parties, consider adding an "implementation notes" section. This is an additional section, typically at the class level, that specifies any behaviours required by implementations that are not otherwise specified, or not of general interest. See this example.
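One possible layout for such a section is sketched below. The TimeSource interface, the FixedTimeSource class, the heading tag and the specific requirements listed are all assumptions made for illustration, not a prescribed format.

```java
/**
 * A hypothetical source of the current time, used to illustrate the layout.
 * <p>
 * Callers use this to obtain the current time without depending on the system clock.
 *
 * <h4>Implementation notes</h4>
 * This interface must be implemented with care to ensure other classes operate correctly.
 * All implementations must be thread-safe.
 * Implementations must never return a value lower than one returned previously.
 */
interface TimeSource {

    /**
     * Gets the current time in milliseconds.
     *
     * @return the current millisecond value, measured from the epoch
     */
    long millis();
}

/**
 * A hypothetical implementation that always returns the same value.
 * <p>
 * This follows the notes above, as it is immutable and its value never decreases.
 */
final class FixedTimeSource implements TimeSource {

    private final long millis;

    /**
     * Creates an instance.
     *
     * @param millis  the fixed millisecond value to return
     */
    FixedTimeSource(long millis) {
        this.millis = millis;
    }

    @Override
    public long millis() {
        return millis;
    }
}
```

The notes sit after the general description, so casual readers can stop before them while implementors can find the extra requirements in one place.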

Avoid @author

The @author feature can be used to record the authors of the class. This should be avoided, as it is usually out of date, and it can promote code ownership by an individual. The source control system is in a much better position to record authors.

Examples

The ThreeTen repository has some more complete examples.

Summary

Hopefully these suggestions will help you to write better Javadoc. Feel free to disagree or point to some alternative standards.

28 comments:

  1. This is a thoughtful guide. I disagree with one rule, however. Requiring "all public and protected methods should be fully defined with Javadoc" encourages useless documentation, like this:

    /**
    * Gets the foo.
    *
    * @return the foo
    */
    public Foo getFoo() {
    ...
    }

    This wastes everyone's time and decreases the overall value of the documentation. When you have nothing useful to say, say nothing!

    Replies
    1. There's no reason to think that every get method simply returns some member of the class; they can very well be large methods, possibly even with side effects. Those using the method, perhaps from a jar without actually seeing the code, should be aware of the complexity, or simplicity, of the method.

  2. I agree with the above comment.
    We should not be trying to do something efficiently which should not be done at all.

    There are excellent opportunities to document well through JavaDocs for public APIs.
    Here is a sample from the Lucene code that explains a fairly complex feature very well.
    http://lucene.apache.org/core/old_versioned_docs/versions/2_9_0/api/all/org/apache/lucene/search/Similarity.html


    I just took an example from your examples. Sorry, but I can't see the necessity of having this, especially for private variables.

    /**
    * The year.
    */
    private final int year;
    /**
    * The month-of-year, not null.
    */
    private final short month;
    /**
    * The day-of-month.
    */
    private final short day;

    Replies
    1. Specifically on the private variable point, this is used by Javadoc if the class is serializable, for example http://docs.oracle.com/javase/7/docs/api/serialized-form.html#java.awt.BorderLayout

  3. I agree too with previous comments (that "all public and protected methods should be fully defined with Javadoc" is a bad idea).
    I add that compressed comments (on a single line) should be greatly encouraged #myScreenSpaceIsValuable

    Replies
    1. Just to all of those that think that their screen space is valuable: Eclipse has a nice little feature called "Folding" that will automatically fold any comments. #eclipseValuesYourScreenSpace

  4. I agree with most of the advice here. In particular, I like the idea of sentence-line documentation. I thought I was the only one who encouraged that.

    I would add that, whatever your documentation guidelines, they should be described in the Javadoc of package-info.java at the highest appropriate level in the project, or in overview.html. I shouldn't have to explicitly identify every parameter as being non-null if that is the explicit understanding. When something deviates from the guidelines, I try to use consistent phrasing, both for readability and in order to create an expectation that facilitates understanding. For example, "This argument may be null, in which case a default value of xxx is assumed."

  5. Personally, I always choose to document getters and setters. I understand the "doesn't add value" argument, but disagree on two counts. Firstly, it's inconsistent - why should those methods have a special status? Secondly, the Javadoc is useful to me. Not all methods starting with "get" or "set" are simple getters/setters. This can be discovered by doing ctrl+space in the IDE and seeing the Javadoc. If it is simply 4 or 5 words of a standard pattern, then it is reasonable to assume the getter is simple. If there is no Javadoc I can't tell anything.

    And since I often generate my Javadoc (in a better way than IDE generation), there is no effort in writing.

    I tend to document private fields, as I have a memory like a sieve. I believe most of us do. Any piece of information is helpful when looking at a piece of code 6 months later.

    @Fraaargh, I'm not a huge fan of compressed comments. Buy yourself a bigger screen or use comment folding in your IDE. Sometimes they make sense, such as in inner classes.

    @Nathan, as I indicated, I strongly believe that documenting null-handling centrally is useless to developers casually using your API. Most developers do not read the Javadoc as a whole, just the small parts after ctrl+space. Thus, that is where the null-handling needs to be defined. Adding ", not null" to the end of each parameter is not a burden.

    Replies
    1. Do you consider null handling to be of special interest to documentation because of its ubiquity with regard to reference types? I ask because I tend to centrally document a number of such constraints, not just whether a value can be null. For example, floats/doubles are not NaN unless specified, arrays may be zero-length unless specified, collections/maps may be empty unless specified, collections/maps do not contain null unless specified, etc. Documenting each variation of only the null constraint feels inconsistent, but I think documenting each variation of each constraint for each documented value is tedious and more prone to cause documentation errors. Worse, it has been my experience (anecdotal, granted) that documenting every non-exceptional case has the effect of washing out the exceptional cases, making the users of my APIs more likely to miss the exceptional cases.

      Personally, I consider overview and package-level documentation to be essential, not intended to be ignored by even the casual developer. I can't help but feel there's a bit of a RTFM problem here, and I'm not convinced that the "locally document everything" approach minimizes the probability of misuse.

    2. I find null to be the most common error, resulting in NPE. By defining the expected behaviour on the parameter, I am forced to think about that case and what the code should do. I also indicate to others that I have thought about it.

      The impact in code maintenance is very positive. If a NPE occurs, it is clear who is at fault. If the NPE occurs in a method that declares it accepts null, then that method is at fault. If the method declares it does not accept null, the caller is at fault. This can be baked into teams as a very simple rule to follow.

      But it relies on those writing code in the first place to actually think about null inputs/outputs for every single case. And adding documentation inline is the best proof that they did.

  6. Thanks for your post, Stephen. I agree with much of what you said but have two thoughts.

    1) What is your opinion with regards to using annotations to document "nullness"? Maybe we need to wait until JSR308 and JSR305 are done?

    2) I noticed that the threeten code documents thread safety, should this be a part of your post as well? Also, do you think we should have a standard set of annotations to help define these levels? (Effective Java 2nd, Item #70 mentions the idea of thread safety levels)

    Tx

    Replies
    1. I dislike the null annotations, as they are very verbose within the source code. (I'm aware methods can "inherit" from classes, but find that to be unhelpful).

      It's also the case that a method with parameters marked @NotNull can still accept nulls. It's only if the annotation checker is actually plugged in that the extra safety arrives. That means that the tool provides false safety, something I strongly object to.

      The "right" answer is something like Fantom (or other languages) where the type system can recognise and manage the difference between a nullable and non-nullable.

      You are right that thread-safety should be documented. I think standard annotations only tend to work if they are in the JDK. This may happen to some degree with John Rose's value types.

  7. These comments demonstrate exactly what is wrong with most developers today.

    All public/protected methods *should* be documented. As should private variables. You're taking too much for granted when you think this is obvious. It's *not* obvious whether a method return value may be null or not. It's *not* obvious what a private variable means a few months after you wrote the code. All this stuff *must* be documented.

    You're not the target audience of your own code. Other people are. Anyone who thinks this is obvious should be made to maintain someone else's code for a couple of years. Trust me, it's hell.

    I make it a policy to dump employees who don't document their code properly. I give them many warnings but if they persist I show them the door. You can't build a team without a team-oriented attitude.

  8. Stephen, what's your stance on runtime exceptions declared in the method signature (throws..)? Such as IllegalArgumentException, IllegalStateException or a library specific runtime exception type.

    Replies
    1. I don't document NPE, as I use the "not null" or "may be null" Javadoc for that. I would document IAE, ISE and similar wherever possible, as they are real outcomes of using the method.

  9. Note that you can't have a list (<ul> or <ol>) inside a paragraph. When an HTML parser sees <p><ul> it will close the paragraph automatically before creating the <ul> element.

    You can check it in Hixie's live DOM viewer to see how your browser interprets it:
    http://software.hixie.ch/utilities/js/live-dom-viewer/?%3Cp%3E%3Cul%3E

    Replies
    1. My comment on p/ul is based on the tag soup interpretation I have observed. I don't doubt that ul shouldn't be in a paragraph (although that's exactly the kind of official rule that makes XHTML so hard for humans to write).

  10. There is a long discussion about commenting public API. It is important to give a context before arguing. For general purpose libraries and frameworks, public API should be documented.

    But for regular projects when the code is not reused outside, this strict rule does not make sense to me. I would prefer to have good tests for the method; tests are living documentation.

    If a setter has a side effect, choose a correct name for it; if your getter has a side effect, it is a serious problem. Use a more descriptive variable name instead of an additional comment.

    Regarding null handling: have you considered a concept like Optional from the Guava library? Then you don't need to comment, as null is then defined explicitly.

    Replies
    1. I consider code without any Javadoc to be sloppy, even if it is not public. That's because it is still an API, and still "public" to your colleagues.

      I'm not a fan of Optional. Null-handling is a language level problem in Java, and the language level solutions are far nicer than annotations or Optional.

    2. @Marcin
      A further point: it can't be assumed that the tester can see into an implementation. Many "regular project" teams hand off 'finished' specification (documentation) to other teams within an organization who exclusively write the tests for that API and may never see source code.

      Not adhering to thought-out or 'stricter' documentation standards in the present limits the ease for organizational scaling in a future, when labor resources are increased by multitude of man-hours for porting - if porting is even an option by that time (legacy support).

  11. I agree with Marcin but would extend his argument to include all general purpose libraries and frameworks, including those developed for internal use only. Other classes should be self-documenting by sensible use of variable, method and class names, and unit tests. My colleagues at work all disagree with me and advocate comprehensive Javadoc, and without exception they all write totally useless comments that add no value whatsoever, just code-bloat.

  12. The Checkstyle project is considering adding support for Comments and JavaDoc validation; please support the proposals at https://groups.google.com/forum/#!topic/checkstyle/VEVFDsZKLzg

  13. In code section #3, the second "ul" should instead be "/ul", no?

  14. I prefer to not add comments for getters and setters, by default. I find this adds noise. If a getXYZ or setXYZ method has additional side-effect or does something other than just returning or setting a value, then yes, a comment in this case should be present to document the unconventional behavior. But an even better solution, if possible, would be to rename these methods or rewrite the code somehow so that the methods are indeed the expected simple getters/setters.

    Replies
    1. Another situation where documenting getters and setters could be useful is if the field isn't obvious. For example, getMonth/setMonth: is the month 0-based or 1-based?

      I guess you can know if a getter/setter comment is useful by what it says. If the comment for getFoo only says "return the foo", and if there's nothing more you can really add to that, then I would prefer to leave that comment out.

