August 2020 – optimix

The Untyped λ-Calculus

The untyped λ-calculus is a formal system in mathematical logic that expresses computations. Each computation is presented as a term, and terms may themselves be composed of additional terms. Every term is either a variable, an abstraction or an application. Evaluating a program is equivalent to starting with a term and performing a series of substitutions, based on a set of formal rules.

Grammar

The ABNF grammar for the untyped λ-calculus is shown below. For simplicity, we assume that whitespace is not significant except as a delimiter between terms (where needed).

term := var | abs | app
abs  := "λ" var "." term
app  := term term

Example

Given below is an example of a well-formed term, where $x$ , $y$ and $z$ are variables, $\lambda x.x$ and $\lambda xy.xyz$ are abstractions, $xyz$ and the term as a whole are applications. Here, we take the liberty of assuming that $xy$ actually denotes two separate variables (the application of $x$ to $y$ ).

(\lambda x.x)(\lambda xy.xyz)

Note that there are actually four variables in this expression – the $x$ variables in the two parenthesized terms are completely unrelated as they are bound to different abstractions. When a variable is not bound to any abstraction, it is said to be free.

\left(\overbrace{\lambda x.\underbrace{x}_{\text{\footnotesize Bound Variable}}}^{\text{\footnotesize Abstraction}}\right)\left(\overbrace{\lambda xy.xy\underbrace{z}_{\text{\footnotesize Free Variable}}}^{\text{\footnotesize Abstraction}}\right)

Substitution Rules

α-conversion

Within the scope of an abstraction, a bound variable may be freely substituted with any symbol that isn’t already in use (which means it is neither free nor bound within the current context). This is known as an α-conversion. For instance, the example above is equivalent to the following expressions.

\left(\lambda x.x\right)\left(\lambda wk.wkz\right)

\left(\lambda y.y\right)\left(\lambda xy.xyz\right)

However, it is not equivalent the following expression, as it inadvertently converts the free $z$ into a bound one.

\left(\lambda x.x\right)\left(\lambda zy.zyz\right)

β-reduction

When an abstraction is applied to a term, the former can be reduced by substituting every occurrence of the first bound variable of the abstraction with the term it is applied to. This is known as a β-reduction. In traditional programming languages, this corresponds to function application, converting a complex expression into something simpler. Our original example above reduces to the following.

\left(\lambda x.x\right)\left(\lambda wk.wkz\right)\xrightarrow{β}\lambda wk.wkz

The first term, $(\lambda x.x)$ , is equivalent to a function that returns its argument as-is – the identity function. Applying the identity function to the second term simply returns the second term.

Note that β-reduction doesn’t always simplify the term. In some cases, further reduction may yield the same term ad infinitum, in which case the term is said to be in a ‘minimal form’. In other cases, the term may actually become bigger with each reduction, in which case it is said to diverge. In all other cases, when no further β–reduction is possible, the term is said to be in β-normal form.

\left(\lambda x.x x\right)\left(\lambda x.x x\right) \tag{\text{Minimal}}

\left(\lambda x.xxy\right)\left(\lambda x.xxy\right) \tag{\text{Divergent}}

Writing an interpreter for the untyped λ-calculus is relatively straightforward in Haskell.

Preparation

Make sure the stack is installed on your system.
Clone the package from GitHub to run the code.

$ git clone https://github.com/rri/untype.git
$ cd untype
$ stack build
$ stack test
$ stack exec -- untype

A Quick Walkthrough

Link to GitHub

The core data structure to represent terms mimics the ABNF grammar described earlier. This recursive type declaration is easy to define in Haskell, as it uses a non-strict evaluation strategy.

data Term
    = Var Sym       -- ^ Variable
    | Abs Sym Term  -- ^ Abstraction
    | App Term Term -- ^ Application
    deriving (Eq)

For contrast, a similar data structure in Rust would have looked like this (notice the Box type that adds a level of indirection).

pub enum Term {
    Var(Sym),
    Abs(Sym, Box<Term>),
    App(Box<Term>, Box<Term>),
}

The general strategy here is to accept an expression as newline-terminated text, apply a parser to the input to derive an abstract syntax tree, apply α-conversion and β-reduction strategies on the term until it converges, and then finally print it to the screen.

We use attoparsec, a nifty parser-combinator library, to write simple parsers and combine them into larger ones. As we apply a few basic parsers recursively, we need to look out for a few gotchas, called out in the code, that might cause the parser to loop indefinitely or give us incorrect results.

Finally, determining when to generate fresh variables and how to name them is surprisingly challenging. Here is an example when variable names must be substituted prior to reduction.

\overbrace{\left(\lambda x y.x y y\right)}^{\text{\footnotesize Term 1}}\overbrace{\left(\lambda u.u y x\right)}^{\text{\footnotesize Term 2}}

Here, Term 2 needs to replace the $x$ within Term 1 as part of the β-reduction step. However, Term 2 already has a $y$ , conflicting with the $y$ – a distinct bound variable – in Term 1. We therefore need to replace the $y$ in Term 1 with a fresh variable, and only then proceed with the substitution.

We leverage a simple strategy for generating fresh variables. First, we collect free variables across the whole term. Then, as we traverse the term, we keep track of all bound variables in the current context. Whenever we need a fresh variable, we take the original and append an apostrophe (') at the end. We then check this new variable against the list of free variables as well as the list of bound variables in the current context. If there are no collisions, we’re done; if not, we repeat the process.

That’s all for today, folks! 🖖

The Josephus Problem

The final problem presented in the first chapter of Concrete Mathematics: A Foundation for Computer Science is The Josephus Problem. While the fables of the origin of this problem are entertaining in themselves, the premise itself is simple. We have $n$ objects that are given sequential labels from $1$ to $n$ , and arranged in a circle in the same sequence. Starting from the object labeled $1$ , we are to eliminate every second object until only one object survives. The objective is to determine the label associated with the surviving object.

The Josephus Problem for n = 12 (link to LaTeX source)

As an example, consider the Josephus number $J(n)$ for $n = 12$ , shown in the diagram above. Labels get eliminated in the following sequence:

2, 4, 6, 8, 10, 12, 3, 7, 11, 5, 1

…leaving $9$ at the very end.

J(12)=9

In looking for a general solution to this problem, we first observe that the pattern of elimination is slightly different depending on whether $n$ is even or odd. When $n$ is even, the first round eliminates all even numbers. The second round begins again with $1$ , but with exactly half the original number of objects, but with the labels themselves being larger. We can establish the following relation:

J(2n) = 2J(n)\;-\;1 \tag{1}

When $n$ is odd, the first round again eliminates all even numbers, but the second round begins by eliminating $1$ . Similar to equation (1), we can establish the following relation:

J(2n + 1) = 2J(n) + 1 \tag{2}

With these recurrence relations established, we can calculate a few sample values of the function and thereby derive a general proposition.

\begin{aligned} J(1) &= 1 \\ \\ J(2) &= 1 \\ J(3) &= 3 \\ \\ J(4) &= 1 \\ J(5) &= 3 \\ J(6) &= 5 \\ J(7) &= 7 \\ \\ J(8) &= 1 \\ J(9) &= 3 \\ J(10) &= 5 \\ J(11) &= 7 \\ J(12) &= 9 \\ J(13) &= 11 \\ J(14) &= 13 \\ J(15) &= 15 \\ \\ J(16) &= 1 \end{aligned}

\begin{aligned} J(2^m + l) &= 2l + 1 & 0 \le m, 0 \le l \lt 2^m \tag{3} \end{aligned}

Proof (by induction on m)

We already know the value of the function when $n=1$ , which we obtain when $m = l = 0$ .

\begin{aligned} J(1) = J(2^0 + 0) = 1 \tag{4} \end{aligned}

Suppose that the proposition is true for some $m \ge 0$ . We want to show that this implies that the proposition is also true for $m + 1$ . To do so, we need to examine two cases separately: (a) when $l$ is even, and (b) when $l$ is odd.

\begin{aligned} \text{(when $l$ is even)} \\ \\ J \big(2^{m+1}+l\big) &= J \big(2 \cdot [2^m+l/2]\big) \\ &= 2 \cdot J (2^m+l/2)-1 \\ &= 2 \cdot (2 \cdot l/2 + 1)-1 \\ &= 2l + 1 \\ \\ \text{(when $l$ is odd)} \\ \\ J \big(2^{m+1}+l\big) &= J \big(2 \cdot [2^m+(l-1)/2] + 1\big) \\ &= 2 \cdot J \big(2^m+(l-1)/2\big) + 1 \\ &= 2 \cdot \big(2 \cdot (l-1)/2 + 1\big) + 1 \\ &= 2l + 1 \\ \end{aligned}

These results follow from the earlier recurrence equations (1) and (2) and the assertion that the proposition is true for a specific $m$ . Together with the base case in (4), they show that the proposition is indeed true for all $m$ . QED.

There are a number of ways to implement this program in Rust, and in this case, we take the most straightforward approach of evaluating the formula (3) that we just proved.

Preparation

Make sure cargo is installed on your system via rustup.
Create a new package called hanoi to run your code.

$ cargo new josephus
$ cd josephus

P0: Basic Solution

Link to GitHub branch.

//
// src/lib.rs
//

/// Calculate the Josephus Number J(n) for any n > 0.
///
/// We use a simple formula here: express the number `n` as
/// 2^m + l for some m >= 0, 0 <= l < 2^m, then calculate the
/// value of 2 * l + 1, which is the result.
///
/// * `n` - The number to calculate J for
pub fn josephus(n: u32) -> u32 {
    let base: u32 = 2;

    let mut index: u32 = 0;
    let mut hbits: u32 = n;

    loop {
        hbits = hbits >> 1;
        if hbits == 0 {
            break;
        }
        index += 1;
    }

    let l = n - base.pow(index);

    2 * l + 1
}

#[cfg(test)]
mod tests {

    use super::*;

    #[test]
    fn test_one() {
        assert_eq!(josephus(1), 1);
    }

    #[test]
    fn test_three() {
        assert_eq!(josephus(3), 3);
    }

    #[test]
    fn test_twelve() {
        assert_eq!(josephus(12), 9);
    }

    #[test]
    fn test_fifty_one() {
        assert_eq!(josephus(51), 39);
    }

    #[test]
    fn test_sixty_four() {
        assert_eq!(josephus(64), 1);
    }
}

That’s all for today, folks! 🖖

The Tower of Hanoi

The first problem presented in Concrete Mathematics: A Foundation for Computer Science is The Tower of Hanoi. Although a relatively simple problem, it offers a useful foray into introductory programming with Rust. We are given a tower of $n$ disks, initially stacked in decreasing size on one of three pegs. The objective is to transfer the entire tower to one of the other pegs, moving only one disk at a time and never moving a larger one onto a smaller. Once we satisfy ourselves that a solution to this problem exists, we can choose to determine the total number of moves $T_n$ required to solve the problem, or an enumerated list of moves constituting this number.

In creating a solution for this, note that we are not pursuing a closed form solution; in essence, the computer has to ‘do the work’ of making each move, recording it in the process. We begin with the simplest possible version of the solution.

Preparation

Make sure cargo is installed on your system via rustup.
Create a new package called hanoi to run your code.

$ cargo new hanoi
$ cd hanoi

//
// src/main.rs
//
use std::io;
use std::io::Write;

/// Solve the Tower of Hanoi problem
fn main() {
    let mut num_disks = 0;
    let mut num_moves = 0;

    get_num_disks(&mut num_disks);

    if num_disks >= 0 {
        println!("Number of disks = {}", num_disks);
        move_disk_set(num_disks, "A", "B", "C", &mut num_moves);
        println!("Number of moves = {}", &num_moves);
    } else {
        println!("Invalid number ({}) of disks!", &num_disks);
    }
}

/// Get the number of disks to solve the problem for
///
/// * `k` - Buffer to populate the result in
fn get_num_disks(k: &mut i64) {
    let mut buf = String::new();
    print!("Enter the number of disks: ");
    io::stdout().flush().unwrap();
    io::stdin()
        .read_line(&mut buf)
        .expect("Failed to read input!");
    *k = buf
        .trim()
        .parse::<i64>()
        .expect("Failed to parse input as an integer!");
}

/// Move a tower of the given size from source to destination.
///
/// * `num_disks` - Number of disks in the tower
/// * `src`       - Label for the source peg
/// * `dst`       - Label for the destination peg
/// * `buf`       - Label for the peg acting as a buffer
/// * `num_moves` - Number of moves executed so far
fn move_disk_set(
    num_disks: i64,
    src: &str,
    dst: &str,
    buf: &str,
    num_moves: &mut i64,
) {
    match num_disks {
        0 => return,
        n => {
            move_disk_set(n - 1, src, buf, dst, num_moves);
            move_disk(num_moves);
            move_disk_set(n - 1, buf, dst, src, num_moves);
        }
    }
}

/// Move a single disk from source to destination.
///
/// * `num_moves` - Number of moves executed so far
fn move_disk(num_moves: &mut i64) {
    *num_moves = *num_moves + 1;
}

P0: Basic Solution

Link to GitHub branch.

The output of this program is quite straightforward.

$ cargo run
   Compiling hanoi v0.1.0 (/Users/rri/Code/play/hanoi)
    Finished dev [unoptimized + debuginfo] target(s) in 0.36s
     Running `target/debug/hanoi`
Enter the number of disks: 3
Number of disks = 3
Number of moves = 7

P1: Printed Moves

Link to GitHub branch.

The solution gets a bit more interesting when you print each move on the console, thus giving you a way to see the code in action and verify its output. (For brevity, only the updated functions are shown below.)

Update move_disk to print out the move when it is called.

fn move_disk(num_moves: &mut i64, src: &str, dst: &str) {
    *num_moves = *num_moves + 1;
    println!("{:>10}: {} -> {}", num_moves, src, dst);
}

Update move_disk_set to pass more parameters to move_disk.

fn move_disk_set(
    num_disks: i64,
    src: &str,
    dst: &str,
    buf: &str,
    num_moves: &mut i64,
) {
    match num_disks {
        0 => return,
        n => {
            move_disk_set(n - 1, src, buf, dst, num_moves);
            move_disk(num_moves, src, dst);
            move_disk_set(n - 1, buf, dst, src, num_moves);
        }
    }
}

The output of the modified program is slightly more involved.

$ cargo run
   Compiling hanoi v0.1.0 (/Users/rri/Code/play/hanoi)
    Finished dev [unoptimized + debuginfo] target(s) in 0.28s
     Running `target/debug/hanoi`
Enter the number of disks: 4
Number of disks = 4
         1: A -> C
         2: A -> B
         3: C -> B
         4: A -> C
         5: B -> A
         6: B -> C
         7: A -> C
         8: A -> B
         9: C -> B
        10: C -> A
        11: B -> A
        12: C -> B
        13: A -> C
        14: A -> B
        15: C -> B
Number of moves = 15

We would like to write automated tests for this solution. The need to test software usually forces us to change the structure of the program in certain ways, but this is a good thing as it drives the design of the application to be much more modular. The problem with the solution above is that we can’t really test what output is printed on the console (at least, not easily). We need to modify our program to return a list of moves that we are free to inspect, rather than execute these moves directly. Modular software has this useful property that all effects on the environment external to system under consideration are reified (given a concrete representation), as you see in the next section.

P2: Tested Moves

Link to GitHub branch.

The basic idea behind reification of effects is to take code that does some work, and represent it in a structure or object that represents the work (instead of actually doing the work). In this case, instead of printing a move on the console, we return a structure that represents a move (complete with source and destination), and leave it up to the consumer of the structure to determine what to do with it. Second, we no longer need to maintain a count of moves, as the count is implied by the size of our move list. Finally, we add a few tests that validate our assumptions. (For brevity, only the updated functions and structures are shown below.)

Define a structure to represent a move.

/// Representation of a 'move' of a disk from one peg to another
struct Move<'a> {
    /// Label for the source peg
    src: &'a str,
    /// Label for the destination peg
    dst: &'a str,
}

Update the main function to consume and process (print) the moves.

fn main() {
    let mut num_disks = 0;
    let mut lst_moves: Vec<Move> = Vec::new();

    get_num_disks(&mut num_disks);

    if num_disks >= 0 {
        println!("Number of disks = {}", num_disks);
        move_disk_set(num_disks, "A", "B", "C", &mut lst_moves);
        println!("Number of moves = {}", lst_moves.len());
        for (pos, m) in lst_moves.iter().enumerate() {
            println!("{:>10}: {} -> {}", pos + 1, m.src, m.dst);
        }
    } else {
        println!("Invalid number ({}) of disks!", &num_disks);
    }
}

Update the move_disk function to add to a vector instead of printing the move.

fn move_disk<'a>(
    lst_moves: &mut Vec<Move<'a>>,
    src: &'a str,
    dst: &'a str,
) {
    lst_moves.push(Move { src: src, dst: dst });
}

Add some tests to satisfy ourselves that the solution works. More interesting and useful tests may be added, of course.

#[cfg(test)]
mod tests {

    use super::*;

    #[test]
    fn test_nil() {
        test_disks(0, 0);
    }

    #[test]
    fn test_one() {
        test_disks(1, 1);
    }

    #[test]
    fn test_two() {
        test_disks(2, 3);
    }

    #[test]
    fn test_ten() {
        test_disks(10, 1023);
    }

    fn test_disks(num_disks: i64, exp_num_moves: i64) {
        let mut lst_moves: Vec<Move> = Vec::new();
        move_disk_set(num_disks, "A", "B", "C", &mut lst_moves);
        assert_eq!(lst_moves.len() as i64, exp_num_moves);
    }
}

One of the most interesting things about this code that has insidiously made its way into an otherwise simple example is the notion of lifetimes in Rust. The parameter 'a in the example represents a lifetime, denoting the scope for which the reference parameterized by it is available. The easiest way to think of this is as an extra parameter that gets threaded through whenever the reference is passed along to a function. For instance, the move_disk_set function now needs to be parameterized with 'a, with no other changes. In this case, the actual value of the parameter is inferred to be 'static as we are ultimately starting with string literals ("A", "B", "C").

fn move_disk_set<'a>(
    num_disks: i64,
    src: &'a str,
    dst: &'a str,
    buf: &'a str,
    lst_moves: &mut Vec<Move<'a>>,
) {
    match num_disks {
        0 => return,
        n => {
            move_disk_set(n - 1, src, buf, dst, lst_moves);
            move_disk(lst_moves, src, dst);
            move_disk_set(n - 1, buf, dst, src, lst_moves);
        }
    }
}

The output of the running the tests are exactly as expected.

$ cargo test
    Finished test [unoptimized + debuginfo] target(s) in 0.00s
     Running target/debug/deps/hanoi-40e0f0eba95aed0c

running 4 tests
test tests::test_nil ... ok
test tests::test_one ... ok
test tests::test_two ... ok
test tests::test_ten ... ok

test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

That’s all for today, folks! 🖖