
Computational calculi primer Part 1: Lambda calculus & BNF

Overview

Greg Meredith has been teaching a weekly course called Introduction to the Design of Computational Calculi (DoCC) to explain the foundations of mathematical models of grammar, how computational calculi are used in the RChain project, and how to design computational calculi for specific applications. As this is a highly technical topic, it may seem like the ideas are exotic or esoteric and difficult to wrap your head around. However, I am here to help you understand these ideas. I will make them as down-to-earth and comprehensible as possible while maintaining rigor and keeping in mind the higher purpose.
The first few posts here will serve as background material required for following along with the DoCC course. We first need to cover some of the basics of computational calculi, like the syntax and semantics of λ-calculus, π-calculus, and ρ-calculus, the model RChain is built on. Once these basics are covered, there will be weekly posts explaining the DoCC material in more depth.
First, we all need to get comfortable with BNF and λ-calculus. So, we will start our discussion with grammar and operational semantics.

Backus-Naur form (BNF)

The grammars we discuss will always be presented in what is known as Backus-Naur form (BNF), which is typical for programming languages. BNF is just a concise way of describing the syntactic rules of a grammar, i.e. what “admissible” terms look like in the language. The BNF presentation of a grammar displays those syntactic rules explicitly. To keep the discussion concrete, let’s look at the example of λ-calculus.

λ-calculus

Grammar

λ-calculus is a universal model of computation, discovered by the mathematician Alonzo Church in the 1930s. Here is the BNF presentation of a grammar for λ-calculus:

                      M,N ::= x | λx.M | (M N)                                (1.1)

You don’t actually need to know what these things mean yet. You just need to know that there are three basic forms for terms, corresponding to the three different rules for constructing terms. This presentation gives rise to a language, the collection of λ-terms, i.e. terms of the form described on the right-hand side of the BNF presentation above. Our language consists of terms we call M and N of the form x, λx.M, or (M N). Notice that these terms are built recursively, so we can have quite elaborate terms in our language. Keep in mind that we also get all admissible combinations of these forms in the language.
It’s a game. In this game, there are three rules describing how you may construct terms and a hat to draw names from. Each round, you have a λ-term M to work from and you are given three choices: (i) draw a name x from the hat, (ii) draw a name x and form the abstraction λx.M, or (iii) choose another λ-term N and form the application (M N). That’s it; now you can build any λ-term by playing sufficiently many rounds of this game.
This raises the question of where we start. Well, we can always choose a variable x from our variable set X, so let’s start there. Let’s make our first λ-term x. Then from this term, we can form other λ-terms, notably, λx.x and (x x). Now we are off to the races! What other λ-terms can be formed?
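If it helps to see the grammar as data, here is a minimal sketch in Python (the names Var, Lam, App, and show are just my own choices for this post, not anything from the course): each class corresponds to one of the three productions in (1.1), and building a value is one play-through of the game above.

from dataclasses import dataclass

@dataclass(frozen=True)
class Var:            # x      -- a name drawn from the variable set X
    name: str

@dataclass(frozen=True)
class Lam:            # λx.M   -- abstraction
    var: str
    body: "Term"

@dataclass(frozen=True)
class App:            # (M N)  -- application
    fun: "Term"
    arg: "Term"

# A λ-term is exactly one of the three forms above, just as the BNF says.
Term = Var | Lam | App

def show(t: Term) -> str:
    """Print a term back in the notation used in this post."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Lam):
        return f"λ{t.var}.{show(t.body)}"
    return f"({show(t.fun)} {show(t.arg)})"

# The first few rounds of the game:
x = Var("x")                # draw the name x from the hat
print(show(Lam("x", x)))    # λx.x
print(show(App(x, x)))      # (x x)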
Returning to the BNF presentation of λ-calculus (1.1), one noteworthy thing is the appearance of x on the right-hand side. Where did that come from (there are only M and N on the left-hand side)? Well, it turns out that x is just a name, or variable, that we choose from a set X. This set X is something we have control over, and changing it changes the admissible expressions in our language. Therefore, we say that this grammar is parametric in the set of variables X. This leads some people to write the BNF presentation of λ-calculus in a slightly odd-looking format with the variable set X displayed explicitly

M[X] ::= X | λX.M[X] | (M[X] M[X])

to emphasize the dependence of the λ-terms M[X] on X. But I digress. The important thing here is that the λ-terms depend on the given variables.
Another noteworthy aspect of (1.1) is the appearance of Ms and Ns on both sides of the BNF presentation. Basically, what this means is that if we have two λ-terms M and N, we have a few options for creating new λ-terms, recursively:

  • x (mention; pick a variable from the set X, admittedly this has little to do with M and N);
  • λx.M (abstraction; binds the variable in M, i.e. declares x a variable in the function M);
  • (M N) (application; “applies” the function M to the argument N)

That’s basically it. Pretty simple, huh? We always have the option of forming these admissible terms given some admissible term(s). Let’s look at a few examples.
Examples:

  1. λx.λy.x is admissible (this is an important example because it is the encoding of the Boolean function True in the λ-calculus)

Proof:  x is admissible ⇒ M = λy.x admissible ⇒ λx.M = λx.λy.x admissible

  2. (λx.x y) is admissible (this is also an important example because λx.x acts as the identity function in λ-calculus)

Proof:  λx.x admissible ⇒ (λx.x y) admissible (application of λx.x to y)

  3. x λx. is not proper syntax ⇒ not an admissible λ-term
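Reusing the Var/Lam/App sketch from earlier, the first two examples are just nested constructor calls, while the third cannot even be assembled:

# Example 1: λx.λy.x, built exactly as in the proof (the encoding of True)
print(show(Lam("x", Lam("y", Var("x")))))        # λx.λy.x

# Example 2: (λx.x y) -- the identity function applied to y
print(show(App(Lam("x", Var("x")), Var("y"))))   # (λx.x y)

# Example 3: "x λx." -- no combination of Var, Lam, and App produces this,
# which is the data-type way of saying it is not admissible syntax.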

This is all we need to know about constructing λ-terms. However, this doesn’t tell us anything about how these terms interact with one another. For that, we will need to discuss the operational semantics.
 

Operational semantics

λ-calculus captures the notion of a function: abstraction declares a variable in a function, and application evaluates the function at an argument. This thinking guides the interactions between these constructions, known as the operational semantics. There are two main relations in λ-calculus, α-conversion and β-reduction:

  • α-conversion:

λx.M{x} → λy.M{y}

This rule deems that terms differing only by the names of their bound variables are structurally equivalent. By structurally equivalent, we mean that these terms are indistinguishable from the viewpoint of computation. By bound variable, we mean something like x in the expression λx.M{x}. The variable x appears in the term M (that’s why it’s written as M{x}) and λ binds x in this expression, i.e. λ declares x a variable in M. Another way to think about this is that a function f can have x as a variable, which we write f(x), or it can have y as a variable, which we write f(y), and regardless of what we call the variable, this function denotes/does the same thing. This is the essence of α-conversion: an equivalence of terms of the form λx.M and λy.M{y/x}. We will sketch both rules in code after introducing β-reduction below.

  • β-reduction:

(λx.M N) → M{N/x}

This rule reduces an application of the form (λx.M N), where the first term is a λ-abstraction λx.M, to a substitution of the term N for the variable x in the term M. This is simply function evaluation. We are declaring x a variable in the function M and evaluating it at N, i.e. substituting N for x. This can also be thought of as function composition: if we apply the function f(x) to the function g(y), we get the composition f(g(y)), which is exactly the same as substituting x = g(y).
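Here is a rough sketch of both rules in code, again reusing the classes from the earlier sketch. The substitution below is the naive one: it is enough for the examples in this post, but a complete treatment would also rename bound variables to avoid accidental capture.

def substitute(t: Term, x: str, n: Term) -> Term:
    """M{N/x}: replace the free occurrences of x in t with n.
    Naive version -- it does not rename bound variables to avoid capture."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, Lam):
        # An inner λ that rebinds x shadows it, so leave that body alone.
        return t if t.var == x else Lam(t.var, substitute(t.body, x, n))
    return App(substitute(t.fun, x, n), substitute(t.arg, x, n))

def alpha_convert(t: Lam, new: str) -> Lam:
    """λx.M{x} → λy.M{y}: rename the bound variable (new must be fresh)."""
    return Lam(new, substitute(t.body, t.var, Var(new)))

def beta_reduce(t: Term) -> Term:
    """(λx.M N) → M{N/x}: reduce a top-level redex, if there is one."""
    if isinstance(t, App) and isinstance(t.fun, Lam):
        return substitute(t.fun.body, t.fun.var, t.arg)
    return t

# A quick check: λz.(z w) and λq.(q w) name the same function.
print(show(alpha_convert(Lam("z", App(Var("z"), Var("w"))), "q")))   # λq.(q w)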
Examples:

  1. λx.(x z) → λy.(y z)

Let M{x} = (x z) and M{y} = (y z), so we have λx.(x z) = λx.M{x} → λy.M{y} = λy.(y z) by direct α-conversion. Even though there is abstraction and application in the term λx.(x z), it does not β-reduce. Why not?

  2. (λx.(x x) y) → (y y)

This is a β-reduction because we are applying the abstraction λx.(x x) to the term y. Let M{x} = (x x). Then (λx.(x x) y) = (λx.M y) → M{y/x} = (x x){y/x} = (y y) by direct β-reduction.

  3. (λx.(λy.(x y) u) v) → (v u)

There are two rounds of β-reduction here. For the first round, let M{x} = (λy.(x y) u). Then (λx.(λy.(x y) u) v) = (λx.M v) → M{v/x} = (λy.(x y) u){v/x} = (λy.(v y) u). Now we use a direct β-reduction on the remaining expression to get (λy.(v y) u) → (v u).
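As a sanity check, the two rounds of example 3 can be replayed with the beta_reduce sketch from above:

# (λx.(λy.(x y) u) v)
inner = App(Lam("y", App(Var("x"), Var("y"))), Var("u"))   # (λy.(x y) u)
term = App(Lam("x", inner), Var("v"))

step1 = beta_reduce(term)     # substitute v for x
step2 = beta_reduce(step1)    # substitute u for y
print(show(step1))            # (λy.(v y) u)
print(show(step2))            # (v u)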
To summarize, we have covered the grammar and operational semantics for λ-calculus. If this topic is new to you, you should be feeling a mixture of confusion and comfort. Admittedly, this doesn’t directly tell us how λ-calculus is related to computation, but we will get there soon enough. One step at a time.
 
Next, we will discuss the grammar and operational semantics for π-calculus.
 
You can also watch the videos for the Introduction to the Design of Computational Calculi course.