checkstyle – Writing Javadoc Checks

Content

What is Javadoc comment
Limitations
Overview
Difference between Java Grammar and Javadoc comments Grammar
Tools to see Javadoc tree structure
Access Java AST from Javadoc Check
HTML Code In Javadoc Comments
Checkstyle SDK GUI
Integrating new Javadoc Check
Examples of Javadoc Checks

What is Javadoc comment

A Javadoc comment is a multiline (block) comment whose body starts with a * character, i.e. the comment opens with /**, and that is placed directly above a class, interface, enum, method or field definition.

For example, here is a Java file:

/**
 * My <b>class</b>.
 * @see AbstractClass
 */
public class MyClass {

}

The Javadoc content is:

 * My <b>class</b>.
 * @see AbstractClass

Note that a Java block comment starts with /*, followed by the identifier of the comment type. The Javadoc identifier is *. All symbols after the Javadoc identifier up to */ are part of the Javadoc comment. On the internet you can find other documentation generation tools similar to Javadoc. Such tools rely on different identifiers: "!", "#", "$". Their comments look like "/*! some comment */", "/*# some comment */", "/*$ some comment */". Such multiline comments are not Javadoc.
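
The identifier rule above can be sketched as a small predicate. This is only an illustration of the rule, not code from Checkstyle; the class name CommentKind and the method isJavadoc are made up for this example, and the input is assumed to be the full text of one block comment.

```java
/** Illustration only: classifies a raw block comment by its identifier character. */
final class CommentKind {

    private CommentKind() {
    }

    /**
     * Returns true when the text is a block comment whose identifier is '*',
     * that is, when it opens with slash, star, star.
     */
    static boolean isJavadoc(String comment) {
        // "/**/" is an empty block comment, not a Javadoc comment,
        // so at least five characters are required.
        return comment.startsWith("/**")
                && comment.endsWith("*/")
                && comment.length() >= 5;
    }
}
```

With this sketch, "/** My class. */" is classified as Javadoc, while "/*! some comment */" and "/*# some comment */" are not.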

Limitations

By specification, a Javadoc comment may contain any HTML tags, so that users can generate whatever content they need. Checkstyle cannot parse arbitrary text that merely looks like HTML, which leads to a limitation: the comment should be written in XHTML to be processed correctly by Checkstyle. This means that every HTML tag must either have a matching closing tag or be a self-closed (singleton) tag. The only exceptions are <p>, <li>, <tr>, <td>, <th>, <body>, <colgroup>, <dd>, <dt>, <head>, <html>, <option>, <tbody>, <thead>, <tfoot>: Checkstyle does not report an error about a missing closing tag for them. However, leaving them unclosed still breaks the XHTML structure and therefore produces an incorrect Abstract Syntax Tree for the Javadoc comment. See the examples in the "HTML Code In Javadoc Comments" chapter.

The Javadoc parser requires XHTML in Javadoc comments, i.e. if there is an opening tag (for example <div>), there has to be a matching closing tag </div>. If a Javadoc comment has an incorrect XHTML structure, the Javadoc parser fails to process the comment, so your new Check cannot get its parse tree and cannot process anything from this Javadoc comment. For more details and examples, go to the "HTML Code In Javadoc Comments" section.

The Javadoc grammar requires XHTML, but it can also parse some HTML constructs (such as some unclosed tags). However, the resulting tree is unpredictable. This is done only so that the parser does not fail on every Javadoc comment, because unclosed tags are very common in existing code.
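
To get a feeling for the tag-pairing rule, here is a rough pre-screening sketch. It is not the Checkstyle parser and does not reproduce its behavior: it is a regex-based approximation, the class name XhtmlTagBalance is hypothetical, and the list of tolerated tags is simply copied from the text above. As simplifications, tags written without a trailing slash (for example <br>) are treated as ordinary opening tags, and Javadoc inline tags such as {@code ...} are not handled.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only: checks that HTML tags in a comment are paired. */
final class XhtmlTagBalance {

    /** Tags for which a missing closing tag is tolerated (list taken from the text). */
    private static final Set<String> OPTIONAL_CLOSE = new HashSet<>(Arrays.asList(
            "p", "li", "tr", "td", "th", "body", "colgroup", "dd", "dt",
            "head", "html", "option", "tbody", "thead", "tfoot"));

    /** Group 1: "/" of a closing tag, group 2: tag name, group 3: "/" of a self-closed tag. */
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>");

    private XhtmlTagBalance() {
    }

    static boolean isBalanced(String javadocContent) {
        Deque<String> open = new ArrayDeque<>();
        Matcher tag = TAG.matcher(javadocContent);
        while (tag.find()) {
            String name = tag.group(2).toLowerCase(Locale.ROOT);
            boolean closing = !tag.group(1).isEmpty();
            boolean selfClosed = !tag.group(3).isEmpty();
            if (closing) {
                // Tolerated tags may stay open, so skip them while looking for the match.
                while (!open.isEmpty() && !open.peek().equals(name)
                        && OPTIONAL_CLOSE.contains(open.peek())) {
                    open.pop();
                }
                if (open.isEmpty() || !open.pop().equals(name)) {
                    return false;
                }
            } else if (!selfClosed) {
                open.push(name);
            }
        }
        // Anything still open must be one of the tolerated tags.
        return OPTIONAL_CLOSE.containsAll(open);
    }
}
```

A result of true only means that no mandatory closing tag is missing under the rule above. A comment such as "<p> First" followed by "<p> Second" passes this screen even though, as explained above, its parse tree will not have the nesting you might expect.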

Overview

To start implementing a new Check, create a new class that extends AbstractJavadocCheck. It has two abstract methods that you should implement:

getDefaultJavadocTokens() - returns an int array of the Javadoc token types your Check is going to process. The array should contain int constants from the JavadocTokenTypes class. Checkstyle also has a TokenTypes class; make sure you use JavadocTokenTypes in your Check, because TokenTypes describes the token types of the standard Java DetailAST.
visitJavadocToken(DetailNode) - the place to put the processing of tree nodes. The argument is a Javadoc tree node of one of the types you returned from getDefaultJavadocTokens().
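
The contract between these two methods can be illustrated with a small self-contained model. Everything in it (MiniNode, MiniJavadocCheck, TagNameCollector and the int constants) is a hypothetical stand-in written for this illustration, not Checkstyle code. It only assumes, as described above, that the framework walks the Javadoc tree and calls visitJavadocToken for nodes whose type appears in the array returned by getDefaultJavadocTokens().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in for a Javadoc tree node; NOT Checkstyle's DetailNode. */
final class MiniNode {
    final int type;
    final String text;
    final List<MiniNode> children;

    MiniNode(int type, String text, MiniNode... children) {
        this.type = type;
        this.text = text;
        this.children = Arrays.asList(children);
    }
}

/** Stand-in for the base class; NOT Checkstyle's AbstractJavadocCheck. */
abstract class MiniJavadocCheck {

    abstract int[] getDefaultJavadocTokens();

    abstract void visitJavadocToken(MiniNode node);

    /** Walks the tree depth-first and visits only nodes of subscribed types. */
    final void walk(MiniNode node) {
        for (int subscribed : getDefaultJavadocTokens()) {
            if (subscribed == node.type) {
                visitJavadocToken(node);
                break;
            }
        }
        for (MiniNode child : node.children) {
            walk(child);
        }
    }
}

/** Example check that collects the text of every tag-name node it is given. */
final class TagNameCollector extends MiniJavadocCheck {
    // Hypothetical token type constants for this model only.
    static final int JAVADOC = 1;
    static final int TEXT = 2;
    static final int HTML_TAG_NAME = 3;

    final List<String> seen = new ArrayList<>();

    @Override
    int[] getDefaultJavadocTokens() {
        return new int[] {HTML_TAG_NAME};
    }

    @Override
    void visitJavadocToken(MiniNode node) {
        seen.add(node.text);
    }
}
```

In this model the collector never sees TEXT or JAVADOC nodes, because it did not list those types. A real Check has the same shape, with the constants taken from JavadocTokenTypes.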

In a Javadoc comment every whitespace character matters, so the parse tree contains whitespace nodes (the WS Javadoc token type). The same goes for the CHAR Javadoc token, which represents a single character. The only redundancy this causes in the Javadoc tree is that a TEXT node consists of CHAR and WS nodes, which is of little use, but that is an implementation nuance. (We will try to resolve this in the future.)
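
The way a TEXT node breaks down into WS and CHAR leaves can be mimicked with a few lines of plain Java. This is a sketch for intuition only, not how the Checkstyle lexer is implemented; the class TextLeaves is hypothetical. It treats every blank or tab as its own WS leaf, which agrees with the single-space cases in the printed trees in this document; how runs of several blanks are grouped is not covered by the text, so that part is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: lists the leaves a TEXT node would be split into. */
final class TextLeaves {

    private TextLeaves() {
    }

    /** Returns one label per character: "WS" for a blank or tab, "CHAR -> x" otherwise. */
    static List<String> split(String text) {
        List<String> leaves = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean blank = c == ' ' || c == '\t';
            leaves.add(blank ? "WS" : "CHAR -> " + c);
        }
        return leaves;
    }
}
```

For the text " My " this yields WS, CHAR -> M, CHAR -> y, WS: four leaves for a four-character string, which is why even short comments produce large trees.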

Difference between Java Grammar and Javadoc comments Grammar

The Java grammar parses a Java file according to the Java language specification, so it knows single-line comments and multiline (block) comments. The Java compiler does not know about Javadoc, because to it a Javadoc comment is just a multiline comment. To parse a multiline comment as a Javadoc comment, Checkstyle has a special parser based on an ANTLR Javadoc grammar. It processes block comments that start with the Javadoc identifier and parses them into an Abstract Syntax Tree (AST).

The problem is that the Java grammar is old and uses ANTLR v2, while the Javadoc grammar uses ANTLR v4. Because of that, the two grammars and their trees are not compatible: the Java AST consists of DetailAST objects, while the Javadoc AST consists of DetailNode objects.

Tools to see Javadoc tree structure

Checkstyle can print the Abstract Syntax Tree for both Java and Javadoc trees. Run the Checkstyle jar file with the -J argument and pass it a Java file.

For example, here is the MyClass.java file:

/**
 * My <b>class</b>.
 * @see AbstractClass
 */
public class MyClass {

}

Command:

java -jar checkstyle-6.18-all.jar -J MyClass.java

Output:

CLASS_DEF -> CLASS_DEF [5:0]
|--MODIFIERS -> MODIFIERS [5:0]
|   |--JAVADOC -> \r\n * My <b>class</b>.\r\n * @see AbstractClass\r\n <EOF> [1:0]
|   |   |--NEWLINE -> \r\n [1:0]
|   |   |--LEADING_ASTERISK ->  * [2:0]
|   |   |--TEXT ->  My  [2:2]
|   |   |   |--WS ->   [2:2]
|   |   |   |--CHAR -> M [2:3]
|   |   |   |--CHAR -> y [2:4]
|   |   |   `--WS ->   [2:5]
|   |   |--HTML_ELEMENT -> <b>class</b> [2:6]
|   |   |   `--HTML_TAG -> <b>class</b> [2:6]
|   |   |       |--HTML_ELEMENT_OPEN -> <b> [2:6]
|   |   |       |   |--OPEN -> < [2:6]
|   |   |       |   |--HTML_TAG_NAME -> b [2:7]
|   |   |       |   `--CLOSE -> > [2:8]
|   |   |       |--TEXT -> class [2:9]
|   |   |       |   |--CHAR -> c [2:9]
|   |   |       |   |--CHAR -> l [2:10]
|   |   |       |   |--CHAR -> a [2:11]
|   |   |       |   |--CHAR -> s [2:12]
|   |   |       |   `--CHAR -> s [2:13]
|   |   |       `--HTML_ELEMENT_CLOSE -> </b> [2:14]
|   |   |           |--OPEN -> < [2:14]
|   |   |           |--SLASH -> / [2:15]
|   |   |           |--HTML_TAG_NAME -> b [2:16]
|   |   |           `--CLOSE -> > [2:17]
|   |   |--TEXT -> . [2:18]
|   |   |   `--CHAR -> . [2:18]
|   |   |--NEWLINE -> \r\n [2:19]
|   |   |--LEADING_ASTERISK ->  * [3:0]
|   |   |--WS ->   [3:2]
|   |   |--JAVADOC_TAG -> @see AbstractClass\r\n  [3:3]
|   |   |   |--SEE_LITERAL -> @see [3:3]
|   |   |   |--WS ->   [3:7]
|   |   |   |--REFERENCE -> AbstractClass [3:8]
|   |   |   |   `--CLASS -> AbstractClass [3:8]
|   |   |   |--NEWLINE -> \r\n [3:21]
|   |   |   `--WS ->   [4:0]
|   |   `--EOF -> <EOF> [4:1]
|   `--LITERAL_PUBLIC -> public [5:0]
|--LITERAL_CLASS -> class [5:7]
|--IDENT -> MyClass [5:13]
`--OBJBLOCK -> OBJBLOCK [5:21]
    |--LCURLY -> { [5:21]
    `--RCURLY -> } [7:0]

As you can see, a very small Java file is transformed into a huge Abstract Syntax Tree, because this is the most detailed tree and it includes all components of the Java file: classes, methods, comments, etc. In most cases, however, when developing a Javadoc Check you need only the parse tree of one particular Javadoc comment. To get it, copy the Javadoc comment to a separate file and remove /** at the beginning and */ at the end. After that, run Checkstyle with the -j argument.

MyJavadocComment.javadoc file:

 * My <b>class</b>.
 * @see AbstractClass

Command:

java -jar checkstyle-6.18-SNAPSHOT-all.jar -j MyJavadocComment.javadoc

Output:

JAVADOC ->  * My <b>class</b>.\r\n * @see AbstractClass<EOF> [0:0]
|--LEADING_ASTERISK ->  * [0:0]
|--TEXT ->  My  [0:2]
|   |--WS ->   [0:2]
|   |--CHAR -> M [0:3]
|   |--CHAR -> y [0:4]
|   `--WS ->   [0:5]
|--HTML_ELEMENT -> <b>class</b> [0:6]
|   `--HTML_TAG -> <b>class</b> [0:6]
|       |--HTML_ELEMENT_OPEN -> <b> [0:6]
|       |   |--OPEN -> < [0:6]
|       |   |--HTML_TAG_NAME -> b [0:7]
|       |   `--CLOSE -> > [0:8]
|       |--TEXT -> class [0:9]
|       |   |--CHAR -> c [0:9]
|       |   |--CHAR -> l [0:10]
|       |   |--CHAR -> a [0:11]
|       |   |--CHAR -> s [0:12]
|       |   `--CHAR -> s [0:13]
|       `--HTML_ELEMENT_CLOSE -> </b> [0:14]
|           |--OPEN -> < [0:14]
|           |--SLASH -> / [0:15]
|           |--HTML_TAG_NAME -> b [0:16]
|           `--CLOSE -> > [0:17]
|--TEXT -> . [0:18]
|   `--CHAR -> . [0:18]
|--NEWLINE -> \r\n [0:19]
|--LEADING_ASTERISK ->  * [1:0]
|--WS ->   [1:2]
|--JAVADOC_TAG -> @see AbstractClass [1:3]
|   |--SEE_LITERAL -> @see [1:3]
|   |--WS ->   [1:7]
|   `--REFERENCE -> AbstractClass [1:8]
|       `--CLASS -> AbstractClass [1:8]
`--EOF -> <EOF> [1:21]
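
The manual step described above, removing /** at the beginning and */ at the end, can also be done with a small helper when you want to prepare many comments. This is a convenience sketch and not part of Checkstyle; the class JavadocContent and its strip method are hypothetical names. It removes only the two delimiters and keeps everything in between untouched, including the line break right after /** and the blanks before */, so trim those yourself if you want a file exactly like the one shown above.

```java
/** Illustration only: extracts the content of a Javadoc comment. */
final class JavadocContent {

    private JavadocContent() {
    }

    /** Returns the text between the opening slash-star-star and the closing star-slash. */
    static String strip(String comment) {
        boolean javadoc = comment.startsWith("/**")
                && comment.endsWith("*/")
                && comment.length() >= 5;
        if (!javadoc) {
            throw new IllegalArgumentException("not a Javadoc comment");
        }
        // Drop the three opening characters and the two closing characters.
        return comment.substring(3, comment.length() - 2);
    }
}
```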

Access Java AST from Javadoc Check

As you already know, the Javadoc parse tree is the result of parsing a block comment. A Javadoc Check can get the original block comment through a dedicated method. You may need this block comment to check its position, or anything else, in the main DetailAST tree.

For example, to write a Javadoc Check that verifies the @param tags in the Javadoc comment of a method definition, you also need the names of all the method's parameters. To get the method definition AST, access the main DetailAST tree through the block comment AST. For this purpose, use the getBlockCommentAst() method, which returns a DetailAST node.

Example:

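import com.puppycrawl.tools.checkstyle.api.DetailAST;
import com.puppycrawl.tools.checkstyle.api.DetailNode;
import com.puppycrawl.tools.checkstyle.api.JavadocTokenTypes;
import com.puppycrawl.tools.checkstyle.api.TokenTypes;
import com.puppycrawl.tools.checkstyle.checks.javadoc.AbstractJavadocCheck;
import com.puppycrawl.tools.checkstyle.utils.BlockCommentPosition;

class MyCheck extends AbstractJavadocCheck {

    @Override
    public int[] getDefaultJavadocTokens() {
        // Visit only the parameter names of @param tags.
        return new int[] {JavadocTokenTypes.PARAMETER_NAME};
    }

    @Override
    public void visitJavadocToken(DetailNode paramNameNode) {
        String javadocParamName = paramNameNode.getText();
        // The block comment node in the main Java tree that holds this Javadoc.
        DetailAST blockCommentAst = getBlockCommentAst();

        if (BlockCommentPosition.isOnMethod(blockCommentAst)) {
            // The comment is nested below the method definition (for example
            // under MODIFIERS), so walk up until METHOD_DEF is reached.
            DetailAST methodDef = blockCommentAst.getParent();
            while (methodDef.getType() != TokenTypes.METHOD_DEF) {
                methodDef = methodDef.getParent();
            }

            // findMethodParameter() is not part of Checkstyle; it stands for a
            // helper of your own that returns the node of the parameter name.
            DetailAST methodParam = findMethodParameter(methodDef);
            String methodParamName = methodParam.getText();

            if (!javadocParamName.equals(methodParamName)) {
                log(methodParam, "params.dont.match");
            }
        }
    }
}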

HTML Code In Javadoc Comments

Examples:

1) Unclosed paragraph HTML tag. As the first tree below shows, the content of each paragraph is not nested inside its <p> tag. That is because the tags are not closed with a matching </p>, and Checkstyle requires XHTML to parse Javadoc comments predictably.

<p> First
<p> Second

2) The correct version, with opening and closing HTML tags. In the second tree below, the text is nested inside each PARAGRAPH node.

<p> First </p>
<p> Second </p>

The parse trees of the two comments follow, in the same order.

JAVADOC -> <p> First\r\n<p> Second<EOF> [0:0]
|--HTML_ELEMENT -> <p> [0:0]
|   `--P_TAG_OPEN -> <p> [0:0]
|       |--OPEN -> < [0:0]
|       |--P_HTML_TAG_NAME -> p [0:1]
|       `--CLOSE -> > [0:2]
|--TEXT ->  First [0:3]
|   |--WS ->   [0:3]
|   |--CHAR -> F [0:4]
|   |--CHAR -> i [0:5]
|   |--CHAR -> r [0:6]
|   |--CHAR -> s [0:7]
|   `--CHAR -> t [0:8]
|--NEWLINE -> \r\n [0:9]
|--HTML_ELEMENT -> <p> [1:0]
|   `--P_TAG_OPEN -> <p> [1:0]
|       |--OPEN -> < [1:0]
|       |--P_HTML_TAG_NAME -> p [1:1]
|       `--CLOSE -> > [1:2]
|--TEXT ->  Second [1:3]
|   |--WS ->   [1:3]
|   |--CHAR -> S [1:4]
|   |--CHAR -> e [1:5]
|   |--CHAR -> c [1:6]
|   |--CHAR -> o [1:7]
|   |--CHAR -> n [1:8]
|   `--CHAR -> d [1:9]
`--EOF -> <EOF> [1:10]

JAVADOC -> <p> First </p>\r\n<p> Second </p><EOF> [0:0]
|--HTML_ELEMENT -> <p> First </p> [0:0]
|   `--PARAGRAPH -> <p> First </p> [0:0]
|       |--P_TAG_OPEN -> <p> [0:0]
|       |   |--OPEN -> < [0:0]
|       |   |--P_HTML_TAG_NAME -> p [0:1]
|       |   `--CLOSE -> > [0:2]
|       |--TEXT ->  First  [0:3]
|       |   |--WS ->   [0:3]
|       |   |--CHAR -> F [0:4]
|       |   |--CHAR -> i [0:5]
|       |   |--CHAR -> r [0:6]
|       |   |--CHAR -> s [0:7]
|       |   |--CHAR -> t [0:8]
|       |   `--WS ->   [0:9]
|       `--P_TAG_CLOSE -> </p> [0:10]
|           |--OPEN -> < [0:10]
|           |--SLASH -> / [0:11]
|           |--P_HTML_TAG_NAME -> p [0:12]
|           `--CLOSE -> > [0:13]
|--NEWLINE -> \r\n [0:14]
|--HTML_ELEMENT -> <p> Second </p> [1:0]
|   `--PARAGRAPH -> <p> Second </p> [1:0]
|       |--P_TAG_OPEN -> <p> [1:0]
|       |   |--OPEN -> < [1:0]
|       |   |--P_HTML_TAG_NAME -> p [1:1]
|       |   `--CLOSE -> > [1:2]
|       |--TEXT ->  Second  [1:3]
|       |   |--WS ->   [1:3]
|       |   |--CHAR -> S [1:4]
|       |   |--CHAR -> e [1:5]
|       |   |--CHAR -> c [1:6]
|       |   |--CHAR -> o [1:7]
|       |   |--CHAR -> n [1:8]
|       |   |--CHAR -> d [1:9]
|       |   `--WS ->   [1:10]
|       `--P_TAG_CLOSE -> </p> [1:11]
|           |--OPEN -> < [1:11]
|           |--SLASH -> / [1:12]
|           |--P_HTML_TAG_NAME -> p [1:13]
|           `--CLOSE -> > [1:14]
`--EOF -> <EOF> [1:15]

Checkstyle SDK GUI

Not implemented yet. See GitHub issue #408.

Integrating new Javadoc Check

Javadoc Checks, like regular Checks, extend the AbstractCheck class, so integrating a new Javadoc Check is similar to integrating any other Check.

Examples of Javadoc Checks

The best source of knowledge about how to write Javadoc Checks is the existing Checks.
