1. Introduction

Abstract

JML is a behavioral interface specification language tailored to Java(TM). Besides pre- and postconditions, it also allows assertions to be intermixed with Java code; these aid verification and debugging. JML is designed to be used by working software engineers; to do this it follows Eiffel in using Java expressions in assertions. JML combines this idea from Eiffel with the model-based approach to specifications, typified by VDM and Larch, which results in greater expressiveness. Other expressiveness advantages over Eiffel include quantifiers, specification-only variables, and frame conditions.
This paper discusses the goals of JML, the overall approach, and describes the basic features of the language through examples. It is intended for readers who have some familiarity with both Java and behavioral specification using pre- and postconditions.

JML stands for "Java Modeling Language" [Leavens-Baker-Ruby99]. JML is a behavioral interface specification language (BISL) [Wing87] designed to specify Java [Arnold-Gosling-Holmes00] [Gosling-etal00] modules. Java modules are classes and interfaces.

The main goal of our research on JML is to better understand how to make BISLs (and BISL tools) that are practical and effective for production software environments. In order to understand this goal, and the more detailed discussion of our goals for JML, it helps to define more precisely what a behavioral interface specification is. After doing this, we return to describing the goals of JML, and then give a brief overview of the tool support for JML and an outline of the rest of the paper.

1.1 Behavioral Interface Specification

1.2 Lightweight Specifications

1.3 Goals

1.4 Tool Support

1.5 Outline

1.1 Behavioral Interface Specification

As a BISL, JML describes two important aspects of a Java module:

its interface, which consists of the names and static information found in Java declarations, and
its behavior, which tells how the module acts when used.

BISLs are inherently language-specific [Wing87], because they describe interface details for clients written in a specific programming language, For example, a BISL tailored to C++, such as Larch/C++ [Leavens97c], describes how to use a module in a C++ program. A Larch/C++ specification cannot be implemented correctly in Java, and a JML specification cannot be correctly implemented in C++ (except for methods that are specified as native code).

JML specifications can either be written in separate files or as annotations in Java program files. To a Java compiler such annotations are comments that are ignored [Luckham-vonHenke85] [Luckham-etal87] [Rosenblum95] [Tan94] [Tan95]. This allows JML specifications, such as the specification below, to be embedded in Java program files. Consider the following simple example of a behavioral interface specification in JML, written as annotations in a Java program file, `IntMathOps.java'.

public class IntMathOps { // 1 // 2 /*@ public normal_behavior // 3 @ requires y >= 0; // 4 @ assignable \nothing; // 5 @ ensures 0 <= \result // 6 @ && \result * \result <= y // 7 @ && ((0 <= (\result + 1) * (\result + 1)) // 8 @ ==> y < (\result + 1) * (\result + 1)); // 9 @*/ //10 public static int isqrt(int y) //11 { //12 return (int) Math.sqrt(y); //13 } //14 } //15

The specification above describes a Java class, IntMathOps that contains one static method (function member) named isqrt. The single-line comments to the far right (which start with //) give the line numbers in this specification; they are ignored by both Java and JML. Comments with an immediately following at-sign, //@, or, as on lines 3--10, C-style comments starting with /*@, are annotations. Annotations are treated as comments by a Java compiler, but JML processes the text of an annotation. The text of an annotation is either the remainder of a line following //@ or the characters between the annotation markers /*@ and @*/. In the second form, at-signs (@) at the beginning of lines are ignored; they can be used to help the reader see the extent of an annotation.

In the above specification, interface information is declared in lines 1 and 11. Line 1 declares a class named IntMathOps, and line 11 declares a method named isqrt. Note that all of Java's declaration syntax is allowed in JML, including, on lines 1 and 11, that the names declared are public, that the method is static (line 11), that its return type is int (line 11), and that it takes one int argument.

Such interface declarations must be found in a Java module that correctly implements this specification. This is automatically the case in the file `IntMathOps.java' shown above, since that file also contains the implementation. In fact, when Java annotations are embedded in `.java' files, the interface specification is the actual Java source code.

To be correct, an implementation must have both the specified interface and the specified behavior. In the above specification, the behavioral information is specified in the annotation text on lines 3--10.(1) The keywords public normal_behavior are used to say that the specification is intended for callers (hence "public"), and that when the precondition is satisfied a call must return normally, without throwing an exception (hence "normal"). In such a public specification, only names with public visibility may be used.(2) On line 4 is a precondition, which follows the keyword requires.(3) On line 5 is frame condition, which says that this method, when called, does not assign to any locations. On lines 6--9 is a postcondition, which follows the keyword ensures.(4) The precondition says what must be true about the arguments (and other parts of the state); if the precondition is true, then the method must terminate normally in a state that satisfies the postcondition. This is a contract between the caller of the method and the implementor [Hoare69] [Jones90] [Jonkers91] [Guttag-Horning93] [Meyer92a] [Meyer97] [Morgan94]. The caller is obligated to make the precondition true, and gets the benefit of having the postcondition then be satisfied. The implementor gets the benefit of being able to assume the precondition, and is obligated to make the postcondition true in that case.

In general, pre- and postconditions in JML are written using an extended form of Java expressions. In this case, the only extension visible is the keyword \result, which is used in the postcondition to denote the value returned by the method. The type of \result is the return type of the method; for example, the type of \result in isqrt is int. The postcondition says that the result is an integer approximation to the square root of y. The first conjunct on line 6, 0 <= \result say that the result is non-negative. The second and third conjuncts state that the result is an integer approximation to the square root of the argument y. The second conjunct, on line 7, says that the result squared is no larger than the argument, y. The third conjunct, on lines 8--9, is an implication; it has two expressions connected by ==>, which means implication in JML. This implication says that if the result plus one squared is non-negative, then the result plus one squared is strictly larger than y.(5) Note that the behavioral specification does not give an algorithm for finding the square root.

Method specifications may also be written in Java's documentation comments. The following is an example. The part that JML sees is enclosed within the HTML "tags" <jml> and </jml>.(6) As in this example, one can use surrounding tags <pre> and </pre> to tell javadoc to ignore what JML sees, and to leave the formatting of it alone. The <pre> and </pre> tags are not required by JML tools (including jmldoc, which does a better job of formatting specifications than does javadoc).

public class IntMathOps4 { /** Integer square root function. * @param y the number to take the root of * @return an integer approximating * the positive square root of y * <pre><jml> * public normal_behavior * requires y >= 0; * assignable \nothing; * ensures 0 <= \result * && \result * \result <= y * && ((0 <= (\result + 1) * (\result + 1)) * ==> y < (\result + 1) * (\result + 1)); * </jml></pre> **/ public static int isqrt(int y) { return (int) Math.sqrt(y); } }

Because we expect most of our users to write specifications in Java code files, most of our examples will be given as annotations in `.java' files as in the specifications above. However, it is possible to use JML to write documentation in separate, non-Java files, such as the file `IntMathOps2.jml-refined' below. Since these files are not Java program files, JML requires the user to omit the code for concrete methods in such a file (except that code for "model" methods can be present, see section 2.3.1 Purity). The specification below shows how this is done, using a semicolon (;), as in a Java abstract method declaration.

//@ model import org.jmlspecs.models.*; public /*+@ spec_bigint_math @+*/ class IntMathOps2 { /*@ public normal_behavior @ requires y >= 0; @ assignable \nothing; @ ensures -y <= \result && \result <= y; @ ensures \result * \result <= y; @ ensures y < (Math.abs(\result) + 1) @ * (Math.abs(\result) + 1); @*/ public static int isqrt(int y); }

Besides files with suffixes of `.jml-refined' or `.jml', JML also works with files with the suffixes `.spec' and `.spec-refined'. All these files use Java's syntax, and one must use annotation markers just as in a `.java' file. However, since these kinds of files files are not Java files, in such a file one must also omit the code for concrete, non-model methods.

The specification of IntMathOps2 below is written in spec_bigint_math mode [Chalin04]. This means that integer mathematics inside the specifications in the class IntMathOps2 are done in infinite precision arithmetic, instead of the usual Java arithmetic. This leads to a simpler specification, especially in the ensures clause.(7)

The above specification also demonstrates that ensures clauses can be repeated in a specification. In IntMathOps2's specification of isqrt, there are three ensures clauses; all of them must be satisfied. Thus the meaning is the same as the conjunction of all of the postconditions specified in the individual ensures clauses. This specification is also more underspecified than the specifications given previously, as it allows negative numbers to be returned as results.

The above specification would be implemented in the file `IntMathOps2.java', which is shown below. This file contains a refine clause, which tells the reader of the `.java' file what is being refined and the file in which to find its specification.

//@ refine "IntMathOps2.jml-refined"; //@ model import org.jmlspecs.models.*; public class IntMathOps2 { public static int isqrt(int y) { return (int) Math.sqrt(y); } }

To summarize, a behavioral interface specification describes both the interface details of a module, and its behavior. The interface details are written in the syntax of the programming language; thus JML uses the Java declaration syntax. The behavioral specification uses pre- and postconditions.

1.2 Lightweight Specifications

Although we find it best to illustrate JML's features in this paper using specifications that are detailed and complete, one can use JML to write less detailed specifications. In particular, one can use JML to write "lightweight" specifications (as in ESC/Java). The syntax of JML allows one to write specifications that consist of individual clauses, so that one can say just what is desired. More precisely, a lightweight specification is one that does not use a behavior keyword (like normal_behavior). By way of contrast, we call a specification a heavyweight specification if it uses one of the behavior keywords.

For example, one might wish to specify just that isqrt should be called only on positive arguments, but not want to be bothered with saying anything formal about the locations that can be assigned to by the method or about the result. This could be done as shown below. Notice that the only specification given below is a single requires clause. Since the specification of isqrt has no behavior keyword, it is a lightweight specification.

public class IntMathOps3 { //@ requires y >= 0; public static int isqrt(int y) { return (int) Math.sqrt(y); } }

What is the access restriction, or privacy level, of such a lightweight specification? The syntax for lightweight specifications does not have a place to specify the privacy level, so JML assumes that such a lightweight specification has the same level of visibility as the method itself. (Thus, the specification below is implicitly public.) What about the omitted parts of the specification, such as the ensures clause? JML assumes nothing about these. In the example below when the precondition is met, an implementation might either signal an exception or terminate normally, so this specification technically allows exceptions to be thrown. However, the gain in brevity often outweighs the need for this level of precision.

JML has a semantics that allows most clauses to be sensibly omitted from a specification. When the requires clause is omitted, for example, it means that no requirements are placed on the caller. When the assignable clause is omitted, it means that nothing is promised about what locations may not be assigned to by the method; that is, the method may assign to all locations that it can otherwise legally assign to. When the ensures clause is omitted, it means that nothing is promised about the state resulting from a method call. See section A. Specification Case Defaults, for the default meanings of various other clauses.

1.3 Goals

As mentioned above, the main goal of our research is to better understand how to develop BISLs (and BISL tools) that are practical and effective. We are concerned with both technical requirements and with other factors such as training and documentation, although in the rest of this paper we will only be concerned with technical requirements for the BISL itself. The practicality and effectiveness of JML will be judged by how well it can document reusable class libraries, frameworks, and Application Programming Interfaces (APIs).

We believe that to meet the overall goal of practical and effective behavioral interface specification, JML must meet the following subsidiary goals.

JML must be able to document the interfaces and behavior of existing software, regardless of the analysis and design methods used to create it.
If JML were limited to only handling certain Java features, certain kinds of software, or software designed according to certain analysis and design methods, then some APIs would not be amenable to documentation using JML. This would mean that some existing software could not be documented using JML. Since the effort put into writing such documentation will have a proportionally larger payoff for software that is more widely reused, it is important to be able to document existing software components.
(However, it should be noted that we make some exceptions to this goal. One is that JML requires that all subtypes be behavioral subtypes [Dhara-Leavens96] [Leavens97c] [Wing87] of their supertypes. This is done because otherwise one cannot reason modularly about programs that use subtyping and dynamic dispatch. Another is that we specify Object's method equals as a pure method, which prohibits even benevolent side effects in any equals method that takes an Object as an argument. This is done to permit purity checking for collection classes that contain objects as members and use equals to compare them, as in the collection types found in java.util.)
The notation used in JML should be readily understandable by Java programmers, including those with only standard mathematical training.
A preliminary study by Finney [Finney96] indicates that graphic mathematical notations, such as those found in Z [Hayes93] [Spivey92] [Woodcock-Davies96] may make such specifications hard to read, even for programmers trained in the notation. This accords with our experience in teaching formal specification notations to programmers. Hence, our strategy for meeting this goal has been to shun most special-purpose mathematical notations in favor of Java's own expression syntax.
The language must be capable of being given a rigorous, formal semantics, and must also be amenable to tool support.
This goal also helps ensure that the specification language does not suffer from logical problems, which would make it less useful for static analysis, prototyping, and testing tools.

We also have in mind a long range goal of a specification compiler, that would produce prototypes from specifications that happen to be constructive [Wahls-Leavens-Baker00].

Our partners at Compaq SRC and the University of Nijmegen have other goals in mind. At Compaq SRC, the goal is to make static analysis tools for Java programs that can help detect bugs. At the University of Nijmegen, the goal is to be able to do full program verification on Java programs.

As a general strategy for achieving these goals, we have tried to blend the Eiffel [Meyer92a] [Meyer92b] [Meyer97], Larch [Wing87] [Wing90a] [Guttag-Horning93] [LeavensLarchFAQ], and refinement calculus [Back88] [Back-vonWright98] [Morgan-Vickers94] [Morgan94] approaches to specification. From Eiffel we have taken the idea that assertions can be written in a language that is based on Java expressions. We also adapt the "old" notation from Eiffel, which appears in JML as \old, instead of the Larch-style annotation of names with state functions. However, Eiffel specifications, as written by Meyer, are typically not as detailed as model-based specifications written, for example, in Larch BISLs or in VDM-SL [Fitzgerald-Larsen98] [ISO96] [Jones90]. Hence, we have combined these approaches, by using syntactic ideas from Eiffel and semantic ideas from model-based specification languages.

JML also has some other differences from Eiffel (and its cousins Sather and Sather-K). The most important is the concept of specification-only declarations. These declarations allow more abstract and exact specifications of behavior than is typically done in Eiffel; they allow one to write specifications that are similar to the spirit of VDM or Larch BISLs. A major difference is that we have extended the syntax of Java expressions with quantifiers and other constructs that are needed for logical expressiveness, but which are not always executable. Finally, we ban side-effects and other problematic features of code in assertions.

On the other hand, our experience with Larch/C++ has taught us to adapt the model-based approach in two ways, with the aim of making it more practical and easy to learn. The first adaptation is again the use of specification-only model variables. An object will thus have (in general) several such model fields, which are used only for the purpose of describing, abstractly, the values of objects. This simplifies the use of JML, as compared with most Larch BISLs, since specifiers (and their readers) hardly ever need to know about algebraic-style specification. It also makes designing a model for a Java class or interface similar, in some respects, to designing an implementation data structure in Java. We hope that this similarity will make the specification language easier to understand. (This kind of model also has some technical advantages that will be described below.)

The second adaptation is hiding the details of mathematical modeling behind a facade of Java classes. In the Larch approach to behavioral interface specification [Wing87], the mathematical notation used in assertions is presented directly to the specifier. This allows the same mathematical notation to be used in many different specification languages. However, it also means that the user of such a specification language has to learn a notation for assertions that is different than their programming language's notation for expressions. In JML we use a compromise approach, hiding these details behind Java classes. These classes have objects with many "pure" methods, in the sense that they do not use side-effects (at least not in any observable way). Such classes are intended to present the underlying mathematical concepts using Java syntax. Besides insulating the user of JML from the details of the mathematical notation, this compromise approach also insulates the design of JML from the details of the mathematical logic used for theorem proving.

We have generally taken features wholesale from the refinement calculus [Back88] [Back-vonWright98] [Morgan-Vickers94] [Morgan94]. Our adaptation of it consists in blending it with the idea of interface specification and adding features for object-oriented programming. We are using the adaptation of the refinement calculus by Büchi and Weck [Buechi-Weck00], which helps in specifying callbacks. However, since the refinement calculus is mostly needed for advanced specifications, in the remainder of this paper we do not discuss the JML features related to refinement, such as model programs.

1.4 Tool Support

Our partners at Compaq SRC have built a tool, ESC/Java, that does static analysis for Java programs [Leino-etal00]. ESC/Java uses a subset of the JML specification syntax, to help detect bugs in Java code. At the University of Nijmegen the LOOP tool [Huisman01] [Jacobs-etal98] is being adapted to use JML as its input language. This tool would generate verification conditions that could be checked using a theorem prover such as PVS or Isabelle/HOL. At the Massachusetts Institute of Technology (MIT), the Daikon invariant detector project [Ernst-etal01] is using a subset of JML to record invariants detected by runs of a program. Recent work uses ESC/Java to validate the invariants that are found.

In the rest of the section we concentrate on the tool support found in the JML release from Iowa State. Iowa State's JML release has tool support for: static type checking of specifications, run-time assertion checking, generation of HTML pages, and generation of unit testing harnesses. Use a web browser on the `JML.html' file in the Iowa State JML release to access more detailed documentation on these tools.

1.4.1 Type Checking Specifications

1.4.2 Generating HTML Documentation

1.4.3 Run Time Assertion Checking

1.4.4 Unit Testing with JML

1.4.1 Type Checking Specifications

Details on how to run the JML checker can be found in its manual page, which is part of the JML release. Here we only indicate the most basic uses of the checker. Running the checker with filenames as arguments will perform type checking on all the specifications contained in the given files. For example, one could check the specifications in the file `UnboundedStack.java' by executing the following command.

jml UnboundedStack.java

One can also pass several files to the checker. For example, the following shows a handy pattern to catch all of the JML files in the current directory.

jml *.*j* *.*spec*

One can also pass directories to the JML checker, for example the following will check all the specifications in the current directory.

jml .

By default, the checker does not recurse into subdirectories, but this can be changed by using the -R option. For example, the following checks specifications in the current directory and all subdirectories.

jml -R .

To allow specifications to be split into several files and to allow documentation of code without changing existing files, the checker recognizes several filename suffixes. The checker recognizes several filename suffixes. The following are considered to be "active" suffixes: `.refines-java', `.refines-spec', `.refines-jml', `.java', `.spec', and `.jml'; There are also three "passive" suffixes: `.java-refined', `.spec-refined', and `.jml-refined'. Files with passive suffixes can be used in refinements (see section 1.1 Behavioral Interface Specification) but should not normally be passed explicitly to the checker on its command line. Graphical user interface tools for JML should, by default, only present the active suffixes for selection. Among files in a directory with the same prefix, but with different active suffixes, the one whose suffix appears first in the list of active suffixes above should be considered primary by such a tool.

Files with different suffixes should be connected to each other using refines clauses. We give several examples in the remainder of this paper.

1.4.2 Generating HTML Documentation

To generate HTML documentation that can be browsed on the web, one uses the jmldoc tool.(8) This tool is a replacement for javadoc that understands JML specifications. In addition to generating web pages for JML files and for JML annotated Java files, jmldoc also generates the indexes and other HTML files that surround these and provide access, in the same way that javadoc does.

For example, here is how we use jmldoc to generate the HTML pages for the MultiJava project.

rm -fr $HOME/MJ/javadocs jmldoc -Q -private -d $HOME/MJ/javadocs \ -link file:/cygwin/usr/local/jdk1.4/docs/api \ -link file:/cygwin/usr/local/antlr/javadocs \ --sourcepath $HOME/MJ \ org.multijava.dis org.multijava.javadoc org.multijava.mjc \ org.multijava.mjdoc org.multijava.util org.multijava.util.backend \ org.multijava.util.classfile org.multijava.util.compiler \ org.multijava.util.jperf org.multijava.util.lexgen \ org.multijava.util.msggen org.multijava.util.optgen \ org.multijava.util.optimize org.multijava.util.testing

The options used in the above invocation of jmldoc make jmldoc be quiet (-Q), document all members (including private ones) of classes and interfaces (-private), write the HTML files relative to `$HOME/MJ/javadocs' (-d), link to existing HTML files for the JDK and for ANTLR (-link), and find listed packages relative to `$HOME/MJ' (--sourcepath). More details on running jmldoc are available from its manual page, which is part of the JML release.

1.4.3 Run Time Assertion Checking

The JML runtime assertion checking compiler is called jmlc. It type checks assertions (so there is no need to run jml separately), and then generates a class file with the executable parts of the specified assertions, invariants, preconditions, and postconditions (and other JML constructs) checked at run-time. Its basic usage is similar to a Java compiler, as shown in the following example.

jmlc TestUnboundedStack.java UnboundedStack.java

This will produce output telling what the compiler is doing, as well as class files `TestUnboundedStack.class' and `UnboundedStack.class'.

To run or test a program compiled with jmlc, you should use the script jmlrac. The jmlrac script runs the resulting code with a CLASSPATH that includes various JAR files containing code needed for run-time assertion checking. The following is an example.

jmlrac org.jmlspecs.samples.stacks.TestUnboundedStack

Using the jmlrac script is necessary. If you do not use jmlrac to run the program, you will get errors, since the code that jmlc compiles expects various runtime library classes to be available.

More details on invoking jmlc and jmlrac are available from their manual pages, which are available in the JML release. See also the `README.html' file in the JML release for more details and troubleshooting tips. Details on the implementation of jmlc are found in a paper by Cheon and Leavens [Cheon-Leavens02b].

1.4.4 Unit Testing with JML

The run time assertion checker is also integrated with a tool, jmlunit that can write out a JUnit [Beck-Gamma98] test oracle class for given Java files. For example, to generate the classes UnboundedStack_JML_Test and UnboundedStack_JML_TestData from UnboundedStack, one would execute the following.

jmlunit UnboundedStack.java

The file `UnboundedStack_JML_Test.java' will then contain code for an abstract class to drive the tests. This class uses the runtime assertion checker to decide test success or failure. (Tests are only as good as the quality of the specifications; hence the specifications must be reasonably complete to permit reasonably complete testing.)

The file `UnboundedStack_JML_TestData.java' will contain code for the superclass of UnboundedStack_JML_Test that can be used to fill in test data for such testing. You need to fill in the test data in the code for this subclass, as described in the comments. The file `UnboundedStack_JML_TestData.java' is produced automatically the first time you run jmlunit as described above. However, subsequent runs of jmlunit never overwrite or change an `_JML_TestData.java' file such as this if it exists. Hence it is safe to edit the file to add test data (and even additional test methods if you wish).

To run the test use the script jml-junit, as in the following example.

jml-junit org.jmlspecs.samples.stacks.UnboundedStack_JML_TestData

More details on invoking these tools can be found in their manual pages which ship with the JML release. More discussion on this integration of JML and JUnit are explained in the ECOOP 2002 paper by Cheon and Leavens [Cheon-Leavens02].

JML also provides a tool, jtest, that combines both jmlc and jmlunit. The jtest tool both compiles a class with run-time assertion checks enabled using jmlc, and also generates the test oracle and test data classes, using jmlunit.

1.5 Outline

In the next sections we describe more about JML and its semantics. See section 2. Class and Interface Specifications, for examples that show how Java classes and interfaces are specified; this section also briefly describes the semantics of subtyping and refinement. See section 3. Extensions to Java Expressions, for a description of the expressions that can be used in specifications. See section 4. Conclusions, for conclusions from our preliminary design effort. See the JML Reference Manual [Leavens-etal-JMLRef] for details on the syntax of JML.

[ << ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

This document was generated by Gary Leavens on March, 16 2009 using texi2html