[ << ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
JML is a behavioral interface specification language tailored to Java(TM). Besides pre- and postconditions, it also allows assertions to be intermixed with Java code; these aid verification and debugging. JML is designed to be used by working software engineers; to do this it follows Eiffel in using Java expressions in assertions. JML combines this idea from Eiffel with the model-based approach to specifications, typified by VDM and Larch, which results in greater expressiveness. Other expressiveness advantages over Eiffel include quantifiers, specification-only variables, and frame conditions.This paper discusses the goals of JML, the overall approach, and describes the basic features of the language through examples. It is intended for readers who have some familiarity with both Java and behavioral specification using pre- and postconditions.
JML stands for "Java Modeling Language" [Leavens-Baker-Ruby99]. JML is a behavioral interface specification language (BISL) [Wing87] designed to specify Java [Arnold-Gosling-Holmes00] [Gosling-etal00] modules. Java modules are classes and interfaces.
The main goal of our research on JML is to better understand how to make BISLs (and BISL tools) that are practical and effective for production software environments. In order to understand this goal, and the more detailed discussion of our goals for JML, it helps to define more precisely what a behavioral interface specification is. After doing this, we return to describing the goals of JML, and then give a brief overview of the tool support for JML and an outline of the rest of the paper.
1.1 Behavioral Interface Specification 1.2 Lightweight Specifications 1.3 Goals 1.4 Tool Support 1.5 Outline
As a BISL, JML describes two important aspects of a Java module:
BISLs are inherently language-specific [Wing87], because they describe interface details for clients written in a specific programming language, For example, a BISL tailored to C++, such as Larch/C++ [Leavens97c], describes how to use a module in a C++ program. A Larch/C++ specification cannot be implemented correctly in Java, and a JML specification cannot be correctly implemented in C++ (except for methods that are specified as native code).
JML specifications can either be written in separate files or as annotations in Java program files. To a Java compiler such annotations are comments that are ignored [Luckham-vonHenke85] [Luckham-etal87] [Rosenblum95] [Tan94] [Tan95]. This allows JML specifications, such as the specification below, to be embedded in Java program files. Consider the following simple example of a behavioral interface specification in JML, written as annotations in a Java program file, `IntMathOps.java'.
public class IntMathOps { // 1 // 2 /*@ public normal_behavior // 3 @ requires y >= 0; // 4 @ assignable \nothing; // 5 @ ensures 0 <= \result // 6 @ && \result * \result <= y // 7 @ && ((0 <= (\result + 1) * (\result + 1)) // 8 @ ==> y < (\result + 1) * (\result + 1)); // 9 @*/ //10 public static int isqrt(int y) //11 { //12 return (int) Math.sqrt(y); //13 } //14 } //15 |
The specification above describes a Java class, IntMathOps
that contains one static method (function member) named isqrt
.
The single-line comments to the far right (which start with //
)
give the line numbers in this specification;
they are ignored by both Java and JML.
Comments with an immediately following at-sign, //@
,
or, as on lines 3--10, C-style comments
starting with /*@
, are annotations.
Annotations are treated as comments by a Java compiler,
but JML processes the text of an annotation.
The text of an annotation is either the remainder of a line
following //@
or the characters between the annotation markers /*@
and @*/
.
In the second form, at-signs (@
) at the beginning of lines
are ignored;
they can be used to help the reader see the extent of an annotation.
In the above specification,
interface information is declared in lines 1 and 11.
Line 1 declares a class named IntMathOps
,
and line 11 declares a method named isqrt
.
Note that all of Java's declaration syntax is allowed in JML,
including, on lines 1 and 11, that the names declared are public
,
that the method is static
(line 11),
that its return type is int
(line 11),
and that it takes one int
argument.
Such interface declarations must be found in a Java module that correctly implements this specification. This is automatically the case in the file `IntMathOps.java' shown above, since that file also contains the implementation. In fact, when Java annotations are embedded in `.java' files, the interface specification is the actual Java source code.
To be correct, an implementation must have both the specified interface
and the specified behavior.
In the above specification,
the behavioral information is specified in the annotation text
on lines 3--10.(1)
The keywords public normal_behavior
are used to say that
the specification is intended for callers (hence "public"),
and that when the precondition is satisfied
a call must return normally, without throwing an exception (hence "normal").
In such a public specification, only names with public visibility
may be used.(2)
On line 4 is a precondition, which follows the keyword
requires
.(3)
On line 5 is frame condition, which says that this method, when called,
does not assign to any locations.
On lines 6--9 is a postcondition,
which follows the keyword ensures
.(4)
The precondition says what must be true about the arguments
(and other parts of the state);
if the precondition is true, then the method must terminate normally
in a state that satisfies the postcondition.
This is a contract between the caller of the method and the implementor
[Hoare69] [Jones90] [Jonkers91] [Guttag-Horning93] [Meyer92a] [Meyer97] [Morgan94].
The caller is obligated to make the precondition true,
and gets the benefit of having the postcondition then be satisfied.
The implementor gets the benefit of being able to assume the precondition,
and is obligated to make the postcondition true in that case.
In general, pre- and postconditions in JML are written using an
extended form of Java expressions.
In this case, the only extension visible is the keyword \result
,
which is used in the postcondition to denote the value returned by the method.
The type of \result
is the return type of the method;
for example, the type of \result
in isqrt
is
int
.
The postcondition says that the result is an integer approximation to the
square root of y
. The first conjunct on line 6,
0 <= \result
say that the result is non-negative.
The second and third conjuncts state that the result is an integer
approximation to the square root of the argument y
.
The second conjunct, on line 7, says that the result squared is no
larger than the argument, y
.
The third conjunct, on lines
8--9, is an implication; it has two expressions connected by
==>
, which means implication in JML.
This implication says that if the result plus one squared is
non-negative, then the result plus one squared is strictly larger than
y
.(5)
Note that the behavioral specification does not give an algorithm for finding
the square root.
Method specifications may also be written in Java's documentation comments.
The following is an example. The part that JML sees is enclosed within
the HTML "tags" <jml>
and </jml>
.(6)
As in this example, one can use surrounding
tags <pre>
and </pre>
to tell javadoc to ignore what JML sees,
and to leave the formatting of it alone. The <pre>
and </pre>
tags are not
required by JML tools (including jmldoc, which does a better job of formatting
specifications than does javadoc).
public class IntMathOps4 { /** Integer square root function. * @param y the number to take the root of * @return an integer approximating * the positive square root of y * <pre><jml> * public normal_behavior * requires y >= 0; * assignable \nothing; * ensures 0 <= \result * && \result * \result <= y * && ((0 <= (\result + 1) * (\result + 1)) * ==> y < (\result + 1) * (\result + 1)); * </jml></pre> **/ public static int isqrt(int y) { return (int) Math.sqrt(y); } } |
Because we expect most of our users to write specifications in Java
code files, most of our examples will be given as annotations
in `.java' files as in the specifications above.
However, it is possible to use JML to write documentation in separate,
non-Java files,
such as the file `IntMathOps2.jml-refined' below.
Since these files are not Java program files,
JML requires the user to omit the code for concrete methods in such a
file (except that code for "model" methods can be present,
see section 2.3.1 Purity).
The specification below shows how this is done, using a semicolon (;
),
as in a Java abstract method declaration.
//@ model import org.jmlspecs.models.*; public /*+@ spec_bigint_math @+*/ class IntMathOps2 { /*@ public normal_behavior @ requires y >= 0; @ assignable \nothing; @ ensures -y <= \result && \result <= y; @ ensures \result * \result <= y; @ ensures y < (Math.abs(\result) + 1) @ * (Math.abs(\result) + 1); @*/ public static int isqrt(int y); } |
Besides files with suffixes of `.jml-refined' or `.jml', JML also works with files with the suffixes `.spec' and `.spec-refined'. All these files use Java's syntax, and one must use annotation markers just as in a `.java' file. However, since these kinds of files files are not Java files, in such a file one must also omit the code for concrete, non-model methods.
The specification of IntMathOps2
below
is written in spec_bigint_math
mode [Chalin04].
This means that integer mathematics inside the specifications in the
class IntMathOps2
are done in infinite precision arithmetic,
instead of the usual Java arithmetic. This leads to a simpler
specification, especially in the ensures clause.(7)
The above specification also demonstrates that ensures clauses can be
repeated in a specification. In IntMathOps2
's specification of
isqrt
, there are three ensures clauses; all of them must be
satisfied. Thus the meaning is the same as the conjunction of all of
the postconditions specified in the individual ensures clauses.
This specification is also more underspecified than the specifications
given previously, as it allows negative numbers to be returned as results.
The above specification would be implemented
in the file `IntMathOps2.java',
which is shown below.
This file contains a refine
clause,
which tells the reader of the `.java'
file what is being refined
and the file in which to find its specification.
//@ refine "IntMathOps2.jml-refined"; //@ model import org.jmlspecs.models.*; public class IntMathOps2 { public static int isqrt(int y) { return (int) Math.sqrt(y); } } |
To summarize, a behavioral interface specification describes both the interface details of a module, and its behavior. The interface details are written in the syntax of the programming language; thus JML uses the Java declaration syntax. The behavioral specification uses pre- and postconditions.
Although we find it best to illustrate JML's features in this paper using
specifications that are detailed and complete, one
can use JML to write less detailed specifications.
In particular, one can use JML to write "lightweight"
specifications (as in ESC/Java). The syntax of JML allows one to write
specifications that consist of individual clauses, so that one can say
just what is desired. More precisely, a lightweight specification
is one that does not use a behavior keyword (like
normal_behavior
).
By way of contrast, we call a specification a heavyweight
specification if it uses one of the behavior keywords.
For example, one might wish to specify just that isqrt
should be called only on positive arguments, but not want to be
bothered with saying anything formal about the locations
that can be assigned to by the method or about the result.
This could be done as shown below.
Notice that the only specification given below is
a single requires
clause.
Since the specification of isqrt
has no behavior keyword,
it is a lightweight specification.
public class IntMathOps3 { //@ requires y >= 0; public static int isqrt(int y) { return (int) Math.sqrt(y); } } |
What is the access restriction, or privacy level,
of such a lightweight specification?
The syntax for lightweight specifications does not
have a place to specify the privacy level,
so JML assumes that such a lightweight specification
has the same level of visibility as the method itself.
(Thus, the specification below is implicitly public
.)
What about the omitted parts of the specification, such as the ensures clause?
JML assumes nothing about these.
In the example below when the precondition is met,
an implementation might either signal an
exception or terminate normally, so this specification technically
allows exceptions to be thrown. However, the gain in brevity often
outweighs the need for this level of precision.
JML has a semantics that allows most clauses to be sensibly omitted
from a specification. When the requires
clause is omitted, for
example, it means that no requirements are placed on the caller. When
the assignable
clause is omitted, it means that nothing is
promised about what locations may not be assigned to by the method;
that is, the method may assign to all locations that it can otherwise
legally assign to. When the ensures
clause is omitted, it
means that nothing is promised about the state resulting from a method
call. See section A. Specification Case Defaults, for the default meanings of
various other clauses.
As mentioned above, the main goal of our research is to better understand how to develop BISLs (and BISL tools) that are practical and effective. We are concerned with both technical requirements and with other factors such as training and documentation, although in the rest of this paper we will only be concerned with technical requirements for the BISL itself. The practicality and effectiveness of JML will be judged by how well it can document reusable class libraries, frameworks, and Application Programming Interfaces (APIs).
We believe that to meet the overall goal of practical and effective behavioral interface specification, JML must meet the following subsidiary goals.
If JML were limited to only handling certain Java features, certain kinds of software, or software designed according to certain analysis and design methods, then some APIs would not be amenable to documentation using JML. This would mean that some existing software could not be documented using JML. Since the effort put into writing such documentation will have a proportionally larger payoff for software that is more widely reused, it is important to be able to document existing software components.
(However, it should be noted that we make some exceptions to this goal.
One is that JML requires that all subtypes be behavioral subtypes
[Dhara-Leavens96] [Leavens97c] [Wing87] of their supertypes.
This is done because otherwise one cannot reason modularly about
programs that use subtyping and dynamic dispatch.
Another is that we specify Object
's method equals
as a
pure method, which prohibits even benevolent side effects in any
equals
method that takes an Object
as an argument. This
is done to permit purity checking for collection classes that contain
objects as members and use equals
to compare them, as in the
collection types found in java.util
.)
A preliminary study by Finney [Finney96] indicates that graphic mathematical notations, such as those found in Z [Hayes93] [Spivey92] [Woodcock-Davies96] may make such specifications hard to read, even for programmers trained in the notation. This accords with our experience in teaching formal specification notations to programmers. Hence, our strategy for meeting this goal has been to shun most special-purpose mathematical notations in favor of Java's own expression syntax.
This goal also helps ensure that the specification language does not suffer from logical problems, which would make it less useful for static analysis, prototyping, and testing tools.
We also have in mind a long range goal of a specification compiler, that would produce prototypes from specifications that happen to be constructive [Wahls-Leavens-Baker00].
Our partners at Compaq SRC and the University of Nijmegen have other goals in mind. At Compaq SRC, the goal is to make static analysis tools for Java programs that can help detect bugs. At the University of Nijmegen, the goal is to be able to do full program verification on Java programs.
As a general strategy for achieving these goals,
we have tried to blend the Eiffel [Meyer92a] [Meyer92b] [Meyer97],
Larch [Wing87] [Wing90a] [Guttag-Horning93] [LeavensLarchFAQ],
and refinement calculus
[Back88] [Back-vonWright98] [Morgan-Vickers94] [Morgan94]
approaches to specification.
From Eiffel we have taken the idea that assertions can be written in
a language that is based on Java expressions.
We also adapt the "old
" notation from Eiffel,
which appears in JML as \old
,
instead of the Larch-style annotation of names with state functions.
However, Eiffel specifications, as written by Meyer,
are typically not as detailed as model-based specifications
written, for example, in Larch BISLs or in
VDM-SL [Fitzgerald-Larsen98] [ISO96] [Jones90].
Hence, we have combined these approaches, by using syntactic ideas
from Eiffel and semantic ideas from model-based specification languages.
JML also has some other differences from Eiffel (and its cousins Sather and Sather-K). The most important is the concept of specification-only declarations. These declarations allow more abstract and exact specifications of behavior than is typically done in Eiffel; they allow one to write specifications that are similar to the spirit of VDM or Larch BISLs. A major difference is that we have extended the syntax of Java expressions with quantifiers and other constructs that are needed for logical expressiveness, but which are not always executable. Finally, we ban side-effects and other problematic features of code in assertions.
On the other hand, our experience with Larch/C++ has taught us to adapt the model-based approach in two ways, with the aim of making it more practical and easy to learn. The first adaptation is again the use of specification-only model variables. An object will thus have (in general) several such model fields, which are used only for the purpose of describing, abstractly, the values of objects. This simplifies the use of JML, as compared with most Larch BISLs, since specifiers (and their readers) hardly ever need to know about algebraic-style specification. It also makes designing a model for a Java class or interface similar, in some respects, to designing an implementation data structure in Java. We hope that this similarity will make the specification language easier to understand. (This kind of model also has some technical advantages that will be described below.)
The second adaptation is hiding the details of mathematical modeling behind a facade of Java classes. In the Larch approach to behavioral interface specification [Wing87], the mathematical notation used in assertions is presented directly to the specifier. This allows the same mathematical notation to be used in many different specification languages. However, it also means that the user of such a specification language has to learn a notation for assertions that is different than their programming language's notation for expressions. In JML we use a compromise approach, hiding these details behind Java classes. These classes have objects with many "pure" methods, in the sense that they do not use side-effects (at least not in any observable way). Such classes are intended to present the underlying mathematical concepts using Java syntax. Besides insulating the user of JML from the details of the mathematical notation, this compromise approach also insulates the design of JML from the details of the mathematical logic used for theorem proving.
We have generally taken features wholesale from the refinement calculus [Back88] [Back-vonWright98] [Morgan-Vickers94] [Morgan94]. Our adaptation of it consists in blending it with the idea of interface specification and adding features for object-oriented programming. We are using the adaptation of the refinement calculus by Büchi and Weck [Buechi-Weck00], which helps in specifying callbacks. However, since the refinement calculus is mostly needed for advanced specifications, in the remainder of this paper we do not discuss the JML features related to refinement, such as model programs.
Our partners at Compaq SRC have built a tool, ESC/Java, that does static analysis for Java programs [Leino-etal00]. ESC/Java uses a subset of the JML specification syntax, to help detect bugs in Java code. At the University of Nijmegen the LOOP tool [Huisman01] [Jacobs-etal98] is being adapted to use JML as its input language. This tool would generate verification conditions that could be checked using a theorem prover such as PVS or Isabelle/HOL. At the Massachusetts Institute of Technology (MIT), the Daikon invariant detector project [Ernst-etal01] is using a subset of JML to record invariants detected by runs of a program. Recent work uses ESC/Java to validate the invariants that are found.
In the rest of the section we concentrate on the tool support found in the JML release from Iowa State. Iowa State's JML release has tool support for: static type checking of specifications, run-time assertion checking, generation of HTML pages, and generation of unit testing harnesses. Use a web browser on the `JML.html' file in the Iowa State JML release to access more detailed documentation on these tools.
1.4.1 Type Checking Specifications 1.4.2 Generating HTML Documentation 1.4.3 Run Time Assertion Checking 1.4.4 Unit Testing with JML
Details on how to run the JML checker can be found in its manual page, which is part of the JML release. Here we only indicate the most basic uses of the checker. Running the checker with filenames as arguments will perform type checking on all the specifications contained in the given files. For example, one could check the specifications in the file `UnboundedStack.java' by executing the following command.
jml UnboundedStack.java |
One can also pass several files to the checker. For example, the following shows a handy pattern to catch all of the JML files in the current directory.
jml *.*j* *.*spec* |
One can also pass directories to the JML checker, for example the following will check all the specifications in the current directory.
jml . |
By default, the checker does not recurse into subdirectories, but this can be changed by using the -R option. For example, the following checks specifications in the current directory and all subdirectories.
jml -R . |
To allow specifications to be split into several files and to allow documentation of code without changing existing files, the checker recognizes several filename suffixes. The checker recognizes several filename suffixes. The following are considered to be "active" suffixes: `.refines-java', `.refines-spec', `.refines-jml', `.java', `.spec', and `.jml'; There are also three "passive" suffixes: `.java-refined', `.spec-refined', and `.jml-refined'. Files with passive suffixes can be used in refinements (see section 1.1 Behavioral Interface Specification) but should not normally be passed explicitly to the checker on its command line. Graphical user interface tools for JML should, by default, only present the active suffixes for selection. Among files in a directory with the same prefix, but with different active suffixes, the one whose suffix appears first in the list of active suffixes above should be considered primary by such a tool.
Files with different suffixes should be connected to each other using
refines
clauses. We give several examples in the remainder
of this paper.
To generate HTML documentation that can be browsed on the web,
one uses the jmldoc
tool.(8)
This tool is a replacement for javadoc
that understands JML
specifications.
In addition to generating web pages for JML files and for JML
annotated Java files,
jmldoc
also generates the indexes and other HTML files that
surround these and provide access, in the same way that javadoc
does.
For example, here is how we use jmldoc
to generate the HTML pages for
the MultiJava project.
rm -fr $HOME/MJ/javadocs jmldoc -Q -private -d $HOME/MJ/javadocs \ -link file:/cygwin/usr/local/jdk1.4/docs/api \ -link file:/cygwin/usr/local/antlr/javadocs \ --sourcepath $HOME/MJ \ org.multijava.dis org.multijava.javadoc org.multijava.mjc \ org.multijava.mjdoc org.multijava.util org.multijava.util.backend \ org.multijava.util.classfile org.multijava.util.compiler \ org.multijava.util.jperf org.multijava.util.lexgen \ org.multijava.util.msggen org.multijava.util.optgen \ org.multijava.util.optimize org.multijava.util.testing |
The options used in the above invocation of jmldoc
make jmldoc be
quiet (-Q
), document all members (including private ones) of classes and
interfaces (-private
), write the HTML files relative to
`$HOME/MJ/javadocs' (-d
), link
to existing HTML files for the JDK and for ANTLR (-link
),
and find listed packages relative to `$HOME/MJ'
(--sourcepath
).
More details on running jmldoc
are available from its manual
page, which is part of the JML release.
The JML runtime assertion checking compiler is called jmlc
.
It type checks assertions (so there is no need to run jml
separately), and then generates a class file with the executable parts
of the specified assertions, invariants, preconditions, and postconditions
(and other JML constructs)
checked at run-time.
Its basic usage is similar to a Java compiler, as shown in
the following example.
jmlc TestUnboundedStack.java UnboundedStack.java |
This will produce output telling what the compiler is doing, as well as class files `TestUnboundedStack.class' and `UnboundedStack.class'.
To run or test a program compiled with jmlc
, you should use the
script jmlrac
. The jmlrac
script
runs the resulting code with a CLASSPATH
that includes various JAR files containing code needed for run-time assertion
checking. The following is an example.
jmlrac org.jmlspecs.samples.stacks.TestUnboundedStack |
Using the jmlrac
script is necessary.
If you do not use jmlrac
to run the program, you will
get errors, since the code that jmlc
compiles expects various
runtime library classes to be available.
More details on invoking jmlc
and jmlrac
are available
from their manual pages, which are available in the JML release.
See also the `README.html' file in the JML release for more
details and troubleshooting tips.
Details on the implementation of jmlc
are found in a paper by
Cheon and Leavens [Cheon-Leavens02b].
The run time assertion checker is also integrated with a tool, jmlunit
that can write out a JUnit [Beck-Gamma98]
test oracle class for given Java files.
For example, to generate the classes
UnboundedStack_JML_Test
and
UnboundedStack_JML_TestData
from UnboundedStack
,
one would execute the following.
jmlunit UnboundedStack.java |
The file `UnboundedStack_JML_Test.java' will then contain code for an abstract class to drive the tests. This class uses the runtime assertion checker to decide test success or failure. (Tests are only as good as the quality of the specifications; hence the specifications must be reasonably complete to permit reasonably complete testing.)
The file `UnboundedStack_JML_TestData.java' will contain code for
the superclass of UnboundedStack_JML_Test
that can be used to fill in test data for such testing. You need to fill in
the test data in the code for this subclass, as described in the
comments. The file `UnboundedStack_JML_TestData.java' is
produced automatically the first time you run jmlunit
as
described above. However, subsequent runs of jmlunit
never
overwrite or change an `_JML_TestData.java' file such as this if
it exists. Hence it is safe to edit the file to add test data (and
even additional test methods if you wish).
To run the test use the script jml-junit
, as in the following example.
jml-junit org.jmlspecs.samples.stacks.UnboundedStack_JML_TestData |
More details on invoking these tools can be found in their manual pages which ship with the JML release. More discussion on this integration of JML and JUnit are explained in the ECOOP 2002 paper by Cheon and Leavens [Cheon-Leavens02].
JML also provides a tool, jtest
, that combines both jmlc
and jmlunit
.
The jtest
tool both compiles a class
with run-time assertion checks enabled using jmlc
,
and also generates the test oracle and test data classes, using
jmlunit
.
In the next sections we describe more about JML and its semantics. See section 2. Class and Interface Specifications, for examples that show how Java classes and interfaces are specified; this section also briefly describes the semantics of subtyping and refinement. See section 3. Extensions to Java Expressions, for a description of the expressions that can be used in specifications. See section 4. Conclusions, for conclusions from our preliminary design effort. See the JML Reference Manual [Leavens-etal-JMLRef] for details on the syntax of JML.
[ << ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |