Formal representation of variable binding

Proving theorems about programming language semantics or types systems usually involves long and tedious proofs, with many uninteresting cases. However, the slightest mistake in a proof can easily compromise the soundness of the entire system. Machine-checked proofs appear as an ideal solution: they eliminate the possibility for errors in proofs, and easy proof cases can be automatically handled by brute-force proof search. Yet, the formalization of programming languages requires a formal representation of variable bindings. In paper proofs, alpha-conversion is easily dealt with by adopting Barendregt's convention (1992). Reproducing a similar style of reasoning in formal proofs has been a long-standing challenge.

In 2005, based on the observation that no existing technique was practical and general enough for formally reasoning about binders, a number of researchers proposed the POPLMark challenge as a way to stimulate new research on the topic (Aydemir et al 2005). This challenge, which received a lot of attention in the community, led to two the development of two new approaches: the nominal approach, mainly developed by Urban (2005), and another approach called locally nameless with cofinite quantification. I have developed the latter during an internship at UPenn in 2006, with Benjamin Pierce and Stephanie Weirich as advisors. Brian Aydemir and Randy Pollack then joined us in 2007 and contributed to the formal developments and to the proof of adequacy, respectively.

Locally nameless with cofinite quantification

The first ingredient is the locally nameless representation, in which bound variables are represented using de Bruijn indices, whereas free variables are represented using names. This representation, whose possibility has been mentioned by de Bruijn himself (1972), is a variant of the locally named approach of McKinna and Pollack (1993). Gordon (1994) first employed the locally nameless representation, and Leroy (2005) experimented it on the POPLMark challenge .

The second ingredient is the cofinite quantification, which plays a key role in making it possible to directly obtain strong induction principles. Without it, a large number of renaming lemmas are required, as experienced by Leroy in his POPLMark solution. The cofinite quantification plays a similar role as in Gabbay and Pitts' nominal logic (1999), with the difference that it is here used in a standard logic and that is is introduced directly in inductive definitions.

The particular way in which we combine these two ingredients leads to concise developments that remain quite faithful to the presentation of conventional paper proofs. We have illustrated the applicability of this technique on four major developments: the formalization of ML with datatypes, references and exceptions, the formalization of System F with subtyping, the formalization of the Calculus of Constructions, and the formalization of the CPS translation. The locally nameless approach has already been adopted by dozens of researchers. A non-exhaustive list can be found at the bottom of this page.

Publications

Engineering Formal Metatheory
with B. Aydemir, B. C. Pierce, R. Pollack and S. Weirich
Symposium on Principles of Programming Languages (POPL), January 2008
The Locally Nameless Representation
Journal of Automated Reasoning (JAR), May 2011
Pretty-Big-Step Semantics
To appear at the European Symposium on Programming (ESOP), March 2013

Formal developments

All developments are available from the page of the LN library.

Example

An excerpt from the formalization of simply-typed lambda-calculus.

(* Definition of the grammar of terms *)

Inductive trm : Type :=
  | trm_bvar : nat -> trm (* bound variable in de Bruijn style *)
  | trm_fvar : var -> trm (* free variable in named style *)
  | trm_abs  : trm -> trm
  | trm_app  : trm -> trm -> trm.

(* Definition of the typing relation *)

Inductive typing : ctx -> trm -> typ -> Prop :=
  | typing_var : forall E x T,
      binds x T E -> ok E ->
      E |= (trm_fvar x) ~: T
  | typing_abs : forall L E U T t1, 
      (forall x, x \notin L ->  (* <-- cofinite quantification *)
        (E & x ~ U) |= (t1 ^ x) ~: T) ->
      E |= (trm_abs t1) ~: (typ_arrow U T)
  | typing_app : forall S T E t1 t2,
      E |= t1 ~: (typ_arrow S T) -> 
      E |= t2 ~: S ->
      E |= (trm_app t1 t2) ~: T

where "E |= t ~: T" := (typing E t T).

(* Proof of the substitution lemma *)

Lemma typing_subst : forall F E t T x v U,
  (E & (x ~ U) & F) |= t ~: T ->
  E |= v ~: U ->
  (E & F) |= ([x ~> v]t) ~: T. 
Proof.
  introv Typt Typu. gen_eq G: (E & x ~ U & F). gen F.
  induction Typt; intros G Equ; subst; simpl subst.
  (* var case: testing equality of a free variable with x *)
  case_var.
    binds_get H0. apply_empty* typing_weaken.
    binds_cases H0; apply* typing_var.
  (* abs case: has a cofinitely-quantified induction hypothesis *)
  apply_fresh typing_abs as y. 
   rewrite* subst_open_var. apply_ih_bind* H0.
  (* app case: trivial *)
  apply* typing_app.
Qed.

Users (non-exhaustive list)

  • Vries et al proved type soundness for uniqueness typing.
  • Jia et al proved soundness and decidability of type-checking of AURA, a programming language for access control.
  • Benton and Koutavas formalized a bisimulation for the nu-calculus, a simply-typed lambda calculus with fresh name generation.
  • Swamy and Hicks prove type soundness of lambda-AIR, a language that combines dependent, affine and singleton types to enforce information release policies.
  • Pratikakis et al formalized type soundness of a ``contextual effects'' system.
  • Yyakobowski formalized type soundness for a preliminary version of xML^F, which is a type system that aims at integrating ML-style type inference in System F.
  • Russo and Vytiniotis formalized QML, a type system where explicit System F types do coexist with ordinary ML types.
  • Rendel et al formalized F-omega-star, an extension of F-omega that allows typed self-representations (representations of programs inside the programming language).
  • Garrigue formalized a type-checker and an interpreter for the core ML language extended with structural polymorphism and recursion.
  • Rossberg et al formalized the soundness of an elaboration from ML with modules towards F-omega.
  • Henrio et al formalized type soundness and Church-Rosser property for the sigma-calculus, a theory of objects.
  • Papakyriakou et al formalized a lambda calculus with impredicative polymorphism and mutable references.
  • Effinger-Dean and Grossman formalized a shared-memory, multi-threaded programming languages with relaxed memory consistency models
  • Montagu formalized type soundness of Core F-zip, a foundation for a module system, based on a variant of System F where existential types have an open scope.
  • Krebbers formalized Gamma-infinity, a presentation of type theory without explicit contexts, establishing that PTS derivations can be translated into Gamma-infinity derivations.
  • Zhao et al formalized System F-circle, an extension of System F that uses kinds to distinguish linear from intuitionistic terms. They established soundness and completeness of logical equivalence with respect to contextual equivalence.
  • Chen and Huang formalized a concurrent lambda-calculus equipped with a type and effect system for modeling service behavior.
  • Bousdira et al formalized Bulk Synchronous Parallel ML (BSML), a high-level language based on ML and dedicated to parallel computation.
  • Renaud and Zimmermann formalized and proved a full composition result for lambda-j, an untyped structural lambda-calculus whose reduction rules involve action at a distance.
  • Sanchez-Gil et al formalized Launchbury's natural semantics for lazy evaluation.
  • Backes et al formalized RCF-forall-conj-disj, a language featuring union, intersection and refinement types.
  • Dogguy and Glondu formalized the soundness of Tapis, a timed variant of the pi-calculus.