Program verification using characteristic formulae
In my thesis, I have developed a new approach to program verification based on the notion of characteristic formula. This approach is implemented in a tool, called CFML, which can be used to establish in Coq the full functional correctness of arbitrarily complex Caml programs. I have used CFML to verify more than half of the content of Okasaki's book Purely Functional Data Structures. I have also used CFML to verify a collection of nontrivial imperative programs, including Dijkstra's shortest path algorithm, the Union-Find data structure, iterators on mutable lists, functions with local state, functions in CPS form, and recursion through the store.
The characteristic formula of a piece of code is a logical formula that describes the semantics of this code (it is a form of strongest post-condition). This characteristic formula satisfies the following property:
The concept of characteristic formula
The general concept of characteristic formula is not new: it originates in work on process calculi from the 80's (Park, 1981). In this setting, every syntactic process definition is mapped to a formula of Hennessy-Milner's logic. This mapping is such that two syntactic processes are behaviorally equivalent if and only if their associated formulae are logically equivalent. More recently, these results on process calculi were adapted to lambda-calculi by Honda, Berger and Yoshida, who derived a sound and complete Hoare logic for PCF (2006). I have turned the concept of characteristic formula into an effective approach to program verification my building formulae of linear size, expressed in a standard higher-order logic, associated with a pretty-printer, and including support for local reasoning.
It is useful to understand what the characteristic formula approach is not:
Software and formal developments
CFML consists of two parts: a program (implemented in Caml) that parses Caml code and produces characteristic formulae in the form of Coq axioms, and a Coq library that provides all the definitions, lemmas and tactics needed for manipulating characteristic formulae. The source code of CFML and a collection of example developments (1000+ lines of Caml proved correct) can be found on the CFML webpage.
Characteristic formulae for purely-functional programs are described in the ICFP'10 paper. The generalization to imperative programs is the matter of the ICFP'11 paper. My thesis covers both functional and imperative programs, and moreover includes detailed proofs of soundness and completeness.
Program Verification Through Characteristic Formulae
International Conference on Functional Programming (ICFP), September 2010
Characteristic Formulae for the Verification of Imperative Programs
International Conference on Functional Programming (ICFP), September 2011
Characteristic Formulae for Mechanized Program Verification
PhD thesis, December 2010
Earlier work: use of a deep embedding
Before developing characteristic formulae, I worked on a deep embedding approach to reasoning about Caml programs using Coq. I showed that, by introducing the right definition, notation and tactics, it is actually feasible to scale up the deep embedding approach to a realistic programming language, more precisely to the pure fragment of Caml.
Interactive Verification of Call-by-Value Functional Programs Through a Deep Embedding
Report, March 2009
Nevertheless, specifications were polluted with the need to relate embedded Caml values with the corresponding Coq values, and proofs involved heavy custom tactics whose use resulted in big proof terms. Characteristic formulae appeared as a way to avoid all of the issues of the deep embedding, while retaining all of the expressivity of this approach.