Idea of the solution

A solution using De Bruijn indices and implicit environments to the POPLmark Challenge - Part 1A

Arthur Charguéraud - May 2006

Notations

Definition of well-formation

We need two well-formation statements, to capture the well-formation of types and environment. In this document, we will use a predicate to capture that a term is well-formed in an environment (which does not itself have to be well-formed) and another one to capture that an environment has been constructed as iterated "push" of well-form types.

There are several ways of defining those predicates. Here is the one we used:

   wf E T   is defined inductively as

   | wf E top
   | wf E T1 /\ wf E T2 -> wf E (T1->T2)
   | wf E T1 /\ (forall U, wf (E,X<:U) T2) -> wf E ($X<:T1.T2)
   | X in domain of E -> wf E X

   
   wf_env E   is defined inductively as

   | wf_env (empty environment)
   | wf_env E /\ wf E U -> wf_env (E,X<:U)

Looking for a good definition of subtyping

From informal to formal definition

Here is the definition of subtyping given in the paper, once rewritten is ASCII:

   E |- S <: T  is "informally" defined as

   |                                          -> (E |- S <: top)
   | (X<:_ in E)                              -> (E |- X <: X)
   | (X<:U in E) /\ (E |- U <: T)             -> (E |- X <: T)
   | (E |- T1 <: S1) /\ (E |- S2 <: T2)       -> (E |- S1->S2 <: T1->T2)
   | (E |- T1 <: S1) /\ (E,X<:T1 |- S2 <: T2) -> (E |- $X<:S1.S2 <: $X<:T1.T2)

A formal definition needs to take into account the well-formation of types and environments. We now give two equivalent ways to implement this.

Formal definition of subtyping [1]. This first definition introduces well-formation hypotheses in each cases (we shall see in the next section how to do simpler).

   E |- S <: T  is defined inductively as

   | wf_env E /\ wf E S 
     -> (E |- S <: top)
   | wf_env E  
     /\ (X<:_ in E) 
     -> (E |- X <: X)
   | wf_env E /\ wf E T 
     /\ (X<:U in E) /\ (E |- U <: T)
     -> (E |- X <: T)
   | wf_env E /\ wf E (S1->S2) /\ wf E (T1->T2)
     /\ (E |- T1 <: S1) /\ (E |- S2 <: T2)
     -> (E |- S1->S2 <: T1->T2)
   | wf_env E /\ wf E ($X<:S1.S2) /\ wf E ($X<:T1.T2)
     /\ (E |- T1 <: S1) /\ (E,X<:T1 |- S2 <: T2)
     -> (E |- $X<:S1.S2 <: $X<:T1.T2)

Formal definition of subtyping [2]. This second definition factorizes all those well-formation statements. To do so, it introduces an auxiliary relation (notation is "").

   E |- S <: T   is defined as   wf_env E  /\  wf E S  /\  wf E T  /\  E |- S <¤ T 
    
   where E |- S <¤  T is defined as:
 
   |                                          -> (E |- S <¤ top)
   | (X<:_ in E)                              -> (E |- X <¤ X)
   | (X<:U in E) /\(E |- U <: T)              -> (E |- X <¤ T)
   | (E |- T1 <: S1) /\ (E |- S2 <: T2)       -> (E |- S1->S2 <¤ T1->T2)
   | (E |- T1 <: S1) /\ (E,X<:T1 |- S2 <: T2) -> (E |- $X<:S1.S2 <¤ $X<:T1.T2)

Note that the two relations are mutually recursive, since the premisses of relation "" refer to the initial subtyping relation "<:".

Standard transformation: pushing well-formation to the leaves

From formal definition of subtyping [1], we may reduce the number of well-formation hypotheses. Indeed, we can show that it is sufficient to have well-formation hypotheses on the leaves of the derivation tree. The intuition is that all well-formation hypotheses needed on the nodes can then be deduced.

Equivalent definition of subtyping [1bis]:

   E |- S <: T  is equivalent to

   | wf_env E /\ wf E S                       -> (E |- S <: top)
   | wf_env E /\ (X<:_ in E)                  -> (E |- X <: X)
   | (X<:U in E) /\ (E |- U <: T)             -> (E |- X <: T)
   | (E |- T1 <: S1) /\ (E |- S2 <: T2)       -> (E |- S1->S2 <: T1->T2)
   | (E |- T1 <: S1) /\ (E,X<:T1 |- S2 <: T2) -> (E |- $X<:S1.S2 <: $X<:T1.T2)

This is the idea used in all the solutions previously proposed to the POPLmark challenge which don't use a higher-order representation of types. Namely, these are solutions by: Berghofer, Leroy, Sallinens, Stump, Urban et al, and Vouillon.

The equivalence between formal definition [1] and equivalent definition [1bis] is usually considered as a standard and straight-forward induction, and so none of the solutions mentioned above do formally prove this equivalence.

Alternative transformation: generalizing to ill-formed types

From formal definition of subtyping [2], we may simplify the definition of the auxiliary relation "" so that it is no longer mutually recursive with the initial relation "<:". To do that we take the definition of "" and replace all all occurences of "<:" by "". The key result is that this transformation preserves the meaning of subtyping.

Equivalent definition of subtyping [2bis]:

   E |- S <: T   is equivalent to  wf_env E  /\  wf E S  /\  wf E T  /\  E |- S <¤ T 

   where relation "E |- S <¤ T" is defined inductively as
   
   |                                       -> (E |- S <¤ top)
   | (X <¤ _ in E)                         -> (E |- X <¤ X)
   | (X <¤ U in E) /\ (E |- U <¤ T)        -> (E |- X <¤ T)
   | (E |- T1<¤S1) /\ (E |- S2<¤T2)        -> (E |- S1->S2 <¤ T1->T2)
   | (E |- T1<¤S1) /\ (E, X<¤T1 |- S2<¤T2) -> (E |- $X<¤S1.S2 <¤ $X<¤T1.T2)

The intuition of this transformation is to extand the definition of subtyping to ill-formed values, and then restrict the relation to well-formed values. The proof of this equivalence is a straight-forward induction on the subtyping derivation.

Comparison between the two transformations

Intuitively, the second encoding should be simpler to work with. The goal of our solution is to verify this.

Effective implementation using the alternative transformation

Labelling variables

Now we are going to use a little trick here: we label all free variables with the type they are bound to in the environment. The motivation for doing so is to make the environment E implicit in "E |- S <¤ T". Note that we don't need to label bound variables.

We will write X^U to mean that variable X is labelled with U, and we do that when X is bound to U in the environement. Now we can rewrite the definition of our extended subtyping relation as:

   relation "E |- S <¤ T" is equivalent to (modulo the fix for SA-all described just below)

   |                                          -> (E |- S <¤ top)
   |                                          -> (E |- X^U <¤ X^U)
   | (E |- U <¤ T)                            -> (E |- X^U <¤ T)
   | (E |- T1 <¤ S1) /\ (E |- S2 <¤ T2)       -> (E |- S1->S2 <¤ T1->T2)
   | (E |- T1 <¤ S1) /\ (E,X<¤T1 |- S2 <¤ T2) -> (E |- $X<¤S1.S2 <¤ $X<¤T1.T2)

In the SA-All rule, we have two types $X<¤S1.S2 and $X<¤T1.T2, and we want to explore them by pushing variable X into the environment. At this step, the variable X which was previously bound becomes free for S2 and T2, and so we have to label occurences of X. So we define a function called "update" which takes three arguments X, U, and T, and will decorate all occurences of X in T with the label U. Then we fix the SA-All thus as:

   | (E |- T1 <¤ S1) /\ (E, X<¤T1 |- (update X T1 S2) <¤ (update X T1 T2)) 
                                              -> (E |- $X<¤S1.S2 <¤ $X<¤T1.T2)

Note: the reason we call this function "udpate" and not "label" is because we will later put some dummy labels on bound variables in order to get a uniform representation of all variables.

Also, we shall not forget to extend the definition of well-formation to support labels, saying that X^U is well-formed in an environment E if and only if E contains a binding mapping X to U.

Making the environment implicit

In the above formulation of "E |- S <¤ T", the environment is not used at any place: it is just accumulating bindings while going through the forall nodes. Since we don't use the environment, it should be fine to just remove it. Here is the result:

   "E |- S <¤ T"  implies  "S <¤ T"
   for any E,  "S <¤ T"  implies  "E |- S <¤ T"  

   where relation "S <¤ T" is defined as
   |                                                  -> S <¤ top
   |                                                  -> X^U <¤ X^U
   | U <¤ T                                           -> X^U <¤ T
   | T1 <¤ S1 /\ S2 <¤ T2                             -> S1->S2 <¤ T1->T2
   | T1 <¤ S1 /\ (update X T1 S2) <¤ (update X T1 T2) -> $X<¤S1.S2 <¤ $X<¤T1.T2

Now we can give an equivalent definition of subtyping using labels and implicit environments.

   E |- S <: T   is equivalent to  wf_env E  /\  wf E S  /\  wf E T  /\  S <¤ T

   where relation "S <¤ T" is defined as
   |                                                  -> S <¤ top
   |                                                  -> X^U <¤ X^U
   | U <¤ T                                           -> X^U <¤ T
   | T1 <¤ S1 /\ S2 <¤ T2                             -> S1->S2 <¤ T1->T2
   | T1 <¤ S1 /\ (update X T1 S2) <¤ (update X T1 T2) -> $X<¤S1.S2 <¤ $X<¤T1.T2

Specifying a representation: using De Bruijn indices

Forgetting about the environment make things more difficult to deal with freshness conditions for variables. This is the main motivation to go for a De Bruijn representation of free variables.

Then in order to have a uniform representation for both free and bound variables (and to avoid capture issues), we'll have indices also for bound variables. So we end up with a full De Bruijn representation of variables.

Again, to have a uniform representation of free and bound variables, we will have bound variables to also have a label. But those labels will be completely ignored. This allows us to define types in a compact way:

   Inductive typ :=
   | top   : typ
   | arrow : typ -> typ -> typ
   | all   : typ -> typ -> typ
   | var   : nat -> typ -> typ

Notation 1: in our system, "$X<¤T1.T2" stands for "$(var 0 _)<¤T1.T2". We shall write simply "$T1.T2" to stand for that.

Notation 2: when we pass through the SA-All case, what was written as "label X T1 S2" now becomes "label 0 T1 S2", and so we will define a function "push" to be an alias for "label 0".

With those two new notations, the SA-All case now looks like this:

   | T1 <¤ S1 /\ (push T1 S2) <¤ (push T1 T2) -> $S1.S2 <¤ $T1.T2

Summary

This is the result as it stands in our solution:

   E |- S <: T   is equivalent to  wf_env E  /\  wf E S  /\  wf E T  /\  S <¤ T

   where relation "S <¤ T" is defined inductively as

   |                                          -> S <¤ top
   |                                          -> X^U <¤ X^U
   | U <¤ T                                   -> X^U <¤ T
   | T1 <¤ S1 /\ S2 <¤ T2                     -> S1->S2 <¤ T1->T2
   | T1 <¤ S1 /\ (push T1 S2) <¤ (push T1 T2) -> $S1.S2 <¤ $T1.T2

This relation is very small, and it turns out to be nice to prove properties about it.

In the solution we formally prove the equivalence above, using for E |- S <: T the formal definition of subtyping [1]. The proof goes easily by induction.

Proving results about subtyping using the alternative transformation

This idea is to prove properties about subtyping (<:) by proving equivalent properties on unsafe subtyping ().

Reflexivity of subtyping

   subtyping_reflexivity  :  wf_env E  /\  wf E T  ->  E |- T <: T

   using our equivalence result, we shall prove that:

      wf_env E  /\  wf E T  ->  wf_env E  /\  wf E T  /\  T <¤ T
   
   to do this, it suffise to prove reflexivity of relation <¤, ie:
      
      forall T,  T <¤ T

Narrowing preserves subtyping

To prove transitivity of relation , we use a version of narrowing on this relation which looks like:

        (update X Q S) <¤ (update X Q T) /\ P <¤ Q
     -> (update X P S) <¤ (update X P T).

Reminder: "update X U T" does relabel all occurences of X in T with the type U.

What about narrowing for the true subtyping relation (<:) ? We could do it, but we would need to define the concatenation of environments, and the well-formation of this concatenation, and this is quite a pain. Since the only reason to prove narrowing that we are aware of is to prove transitivity, then there is probably no need to bother defining and proving narrowing for the <: subtyping relation.

Transitivity of subtyping

   subtyping_reflexivity  :  E |- S <: Q  /\  E |- Q <: T  ->  E |- S <: T 

   using our equivalence result, we shall prove that:

      wf_env E  /\  wf E S  /\  wf E Q  /\  wf E T  /\  S <¤ Q  /\  Q <¤ T
      -> wf_env E  /\  wf E S  /\  wf E T  /\  S <¤ T

   to do this, it suffise to prove transitivity of relation <¤, ie:
      
      forall S Q T,  S <¤ Q  /\  Q <¤ T  ->  S <¤ T

Conclusion

We reduced proofs about subtyping to proofs about unsafe subtyping. This latter relation is much easier to work with, since it does not contain any environment nor any well-formation judgment.