The Command Pattern

Modularity is a desirable property of any reasonably complex software. Modularity allows the programmer to examine, understand and change parts — modules — of the software while temporarily ignoring the rest of it. When the software becomes too large for a single programmer, modularity allows a team of individuals to work on it in parallel. Ideally, modularity is recursive — modules should themselves consist of modules, and so on, until each module is small enough for an individual to grasp quickly, no matter how large the team.

From one perspective, modularity is less about breaking down software into smaller modules, and more about creating small modules that can be easily combined with other modules to create large systems. Combining modules is called ‘composition’, and composability is the holy grail of software design.

Functions are the ultimate tool in the toolbox of composition. In mathematics, a function has inputs and outputs, and its definition represents the mapping of inputs to outputs. In the computational world, these mappings may be viewed as transformations of inputs into outputs. Functions are inherently composable, as under the right conditions, the outputs of one function may be connected to the input of another function (even itself). Unfortunately, functions in mainstream programming languages are impure in the sense that they may do other things, such as write bytes to disk or send data over a network. These so-called ‘effects’ hinder our ability to compose functions using their mathematical representations, unless the effects themselves are modeled as first-class inputs or outputs.
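
As a concrete illustration of this composability, here is a pair of small pure functions in Haskell (the language we turn to next), joined with the composition operator (.):

-- Two small pure functions.
double :: Int -> Int
double x = x * 2

increment :: Int -> Int
increment x = x + 1

-- Their composition is itself a function: double first, then increment.
doubleThenIncrement :: Int -> Int
doubleThenIncrement = increment . double

main :: IO ()
main = print (doubleThenIncrement 5)  -- prints 11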

Only a few languages, like Haskell, support pure functions — functions that are guaranteed to be free of effects. In Haskell, effects are possible only if they are modeled as first-class inputs or outputs. In these cases, effects are encoded into runtime instances of a special type called IO, and it is the responsibility of the language runtime to execute these effects on behalf of the programmer. For example, in the program below, the main function returns an IO instance, and the language runtime executes the effects encapsulated by the instance. In fact, without resorting to backdoor (aka unsafe) techniques, it is not possible to perform effects within a function that doesn’t return an IO instance. With the IO type, Haskell programs declare precisely which functions are effectful, and the compiler can distinguish them from those that are not.

import System.IO (BufferMode (NoBuffering), hSetBuffering, stdout)

-- Main program.
-- This function returns an IO instance.
main :: IO ()
main = do
    hSetBuffering stdout NoBuffering          -- :: IO ()
    putStr $ "Enter a number x: "             -- :: IO ()
    x <- getLine                              -- :: IO String
    putStr $ "Enter a number y: "             -- :: IO ()
    y <- getLine                              -- :: IO String
    putStrLn $ show $ mult (read x) (read y)  -- :: IO ()

-- Multiply two numbers.
-- This function cannot write to disk or send data over the network.
mult :: Int -> Int -> Int
mult x y = x * y

Effects represented by IO instances can themselves be combined, but only sequentially. In the example above, each line within the main function returns some kind of IO instance. These IO instances are strung together to create a single combined expression…which is itself an IO instance. Furthermore, the computational results of an IO instance can never be extracted into a pure function: once you enter the real world, you can never come back.
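
For the curious, the do notation above is just syntactic sugar for this stringing-together: each IO instance is chained to the next with the (>>=) operator (pronounced ‘bind’). A rough desugaring of the program above, omitting the buffering setup, looks like this:

main :: IO ()
main =
    putStr "Enter a number x: " >>= \_ ->
    getLine >>= \x ->
    putStr "Enter a number y: " >>= \_ ->
    getLine >>= \y ->
    putStrLn (show (mult (read x) (read y)))

mult :: Int -> Int -> Int
mult x y = x * y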

Object-Oriented Languages

In object-oriented languages like Java, composability remains the holy grail of software development, and lessons from the functional world still apply. The essence of the ideas described above can be boiled down into a simple rule of thumb —

When you need to perform an action that deals with the external world, like writing to disk or sending data over the network, encapsulate the action within a ‘command’, and separate the decision of performing the action from actually performing the action.

This separation allows you as the programmer to inspect, re-arrange and re-compose your effectful code easily. Given adequately precise shapes for commands, the compiler will even aid you in making these changes safe. The command interface is analogous to the IO type in Haskell. And just like IO, you can string together commands to construct more sophisticated composite ones.

Once you are speaking the language of commands, you can perform computations on the commands themselves. For instance, suppose that instead of printing a message to the screen, you create a ‘log statement’ object that is capable of printing a message to the screen, and then invoke it. You can now enrich all log statements with timestamps, apply filtering based on various criteria and perform other actions before or after you print messages. As another example, suppose that instead of calling a remote web service, you create a ‘service invoker’ object that is given all of the information it needs to call the remote web service, and then invoke it. You can now apply throttling and caching mechanisms that control the flow of how these effects are performed.
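
To make the first example concrete, here is a minimal sketch, staying with Haskell for continuity (in Java, the same idea would typically be an interface with an execute method). The names are illustrative, and getCurrentTime comes from the time package; the point is that the log statement is a plain value, so enriching it with a timestamp is ordinary computation performed before the effect is executed:

import Data.Time.Clock (getCurrentTime)

-- A command that knows how to print a message, but hasn't done so yet.
newtype LogStatement = LogStatement String

-- Enrich the command with a timestamp, then execute it.
runWithTimestamp :: LogStatement -> IO ()
runWithTimestamp (LogStatement msg) = do
    now <- getCurrentTime
    putStrLn (show now ++ " " ++ msg)

main :: IO ()
main = do
    -- Creating the command and executing it are now separate steps.
    let greeting = LogStatement "Hello, world!"
    runWithTimestamp greeting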

The real world is messy, uncertain and error-prone. If we can limit the interactions of our systems with the external world to a few points at the edges, our software is that much more robust, and easier to develop, operate and maintain.

That’s all for today, folks!

The Untyped λ-Calculus

The untyped λ-calculus is a formal system in mathematical logic that expresses computations. Each computation is presented as a term, and terms may themselves be composed of additional terms. Every term is either a variable, an abstraction or an application. Evaluating a program is equivalent to starting with a term and performing a series of substitutions, based on a set of formal rules.

Grammar

The ABNF grammar for the untyped λ-calculus is shown below. For simplicity, we assume that whitespace is not significant except as a delimiter between terms (where needed), and that parentheses may be used freely to group terms.

term := var | abs | app
abs  := "位" var "." term
app  := term term

Example

Given below is an example of a well-formed term, where x, y and z are variables, \lambda x.x and \lambda xy.xyz are abstractions, and xyz and the term as a whole are applications. Here, we take the liberty of assuming that xy actually denotes two separate variables (the application of x to y).

(\lambda x.x)(\lambda xy.xyz)

Note that there are actually four variables in this expression – the x variables in the two parenthesized terms are completely unrelated as they are bound to different abstractions. When a variable is not bound to any abstraction, it is said to be free.

\left(\overbrace{\lambda x.\underbrace{x}_{\text{\footnotesize Bound Variable}}}^{\text{\footnotesize Abstraction}}\right)\left(\overbrace{\lambda xy.xy\underbrace{z}_{\text{\footnotesize Free Variable}}}^{\text{\footnotesize Abstraction}}\right)

Substitution Rules

α-conversion

Within the scope of an abstraction, a bound variable may be freely substituted with any symbol that isn’t already in use (which means it is neither free nor bound within the current context). This is known as an α-conversion. For instance, the example above is equivalent to the following expressions.

\left(\lambda x.x\right)\left(\lambda wk.wkz\right)
\left(\lambda y.y\right)\left(\lambda xy.xyz\right)

However, it is not equivalent to the following expression, as the renaming inadvertently converts the free z into a bound one.

\left(\lambda x.x\right)\left(\lambda zy.zyz\right)

β-reduction

When an abstraction is applied to a term, the former can be reduced by substituting every occurrence of the first bound variable of the abstraction with the term it is applied to. This is known as a β-reduction. In traditional programming languages, this corresponds to function application, converting a complex expression into something simpler. Our original example above reduces to the following.

\left(\lambda x.x\right)\left(\lambda wk.wkz\right)\xrightarrow{\beta}\lambda wk.wkz

The first term, (\lambda x.x), is equivalent to a function that returns its argument as-is – the identity function. Applying the identity function to the second term simply returns the second term.

Note that β-reduction doesn’t always simplify the term. In some cases, further reduction may yield the same term ad infinitum, in which case the term is said to be in a ‘minimal form’. In other cases, the term may actually become bigger with each reduction, in which case it is said to diverge. In all other cases, when no further β-reduction is possible, the term is said to be in β-normal form.

\left(\lambda x.x x\right)\left(\lambda x.x x\right) \tag{\text{Minimal}}
\left(\lambda x.xxy\right)\left(\lambda x.xxy\right) \tag{\text{Divergent}}

Writing an interpreter for the untyped λ-calculus is relatively straightforward in Haskell.

Preparation

  • Make sure stack, the Haskell build tool, is installed on your system.
  • Clone the package from GitHub to run the code.
$ git clone https://github.com/rri/untype.git
$ cd untype
$ stack build
$ stack test
$ stack exec -- untype

A Quick Walkthrough

Link to GitHub

The core data structure to represent terms mimics the ABNF grammar described earlier. This recursive type declaration is easy to express in Haskell, as the language uses a non-strict evaluation strategy (values are boxed by default, so no explicit indirection is needed).

data Term
    = Var Sym       -- ^ Variable
    | Abs Sym Term  -- ^ Abstraction
    | App Term Term -- ^ Application
    deriving (Eq)
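
For instance, assuming Sym is a synonym for String, the earlier example (λx.x)(λxy.xyz) would be represented as follows, with the λxy shorthand expanded into two nested abstractions and application associating to the left:

App (Abs "x" (Var "x"))
    (Abs "x" (Abs "y" (App (App (Var "x") (Var "y")) (Var "z"))))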

For contrast, a similar data structure in Rust would look like this (notice the Box type, which adds a level of indirection so that the recursive type has a known size).

pub enum Term {
    Var(Sym),
    Abs(Sym, Box<Term>),
    App(Box<Term>, Box<Term>),
}

The general strategy here is to accept an expression as newline-terminated text, apply a parser to the input to derive an abstract syntax tree, apply α-conversion and β-reduction strategies on the term until it converges, and then finally print it to the screen.

We use attoparsec, a nifty parser-combinator library, to write simple parsers and combine them into larger ones. As we apply a few basic parsers recursively, we need to look out for a few gotchas, called out in the code, that might cause the parser to loop indefinitely or give us incorrect results.
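
As an illustration of the approach, here is a standalone sketch (not the package’s actual code; names like pTerm and pAtom are made up here, and the multi-variable shorthand like λxy is not handled). Note how the left recursion hidden in the rule app := term term is sidestepped by parsing a sequence of atomic terms and folding them into left-nested applications:

import Control.Applicative ((<|>))
import Data.Attoparsec.Text (Parser, char, letter, many1, parseOnly, skipSpace)
import qualified Data.Text as T

type Sym = String

data Term = Var Sym | Abs Sym Term | App Term Term deriving (Eq, Show)

-- A term is one or more atomic terms, folded left-to-right into
-- applications; this avoids looping on the left-recursive app rule.
pTerm :: Parser Term
pTerm = foldl1 App <$> many1 (skipSpace *> pAtom)

-- An atomic term: an abstraction, a variable, or a parenthesized term.
pAtom :: Parser Term
pAtom = pAbs <|> pVar <|> (char '(' *> pTerm <* skipSpace <* char ')')

-- Single-letter variables, per the convention in this post.
pVar :: Parser Term
pVar = (\c -> Var [c]) <$> letter

-- Accept either "λ" or "\" to introduce an abstraction.
pAbs :: Parser Term
pAbs = do
    _ <- char 'λ' <|> char '\\'
    v <- skipSpace *> letter
    _ <- skipSpace *> char '.'
    Abs [v] <$> pTerm

main :: IO ()
main = print (parseOnly pTerm (T.pack "(λx.x)(λy.yz)"))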

Finally, determining when to generate fresh variables and how to name them is surprisingly challenging. Here is an example where variable names must be substituted prior to reduction.

\overbrace{\left(\lambda x y.x y y\right)}^{\text{\footnotesize Term 1}}\overbrace{\left(\lambda u.u y x\right)}^{\text{\footnotesize Term 2}}

Here, Term 2 needs to replace the x within Term 1 as part of the β-reduction step. However, Term 2 already contains a free y, which conflicts with the y – a distinct bound variable – in Term 1. We therefore need to replace the y in Term 1 with a fresh variable, and only then proceed with the substitution.

We leverage a simple strategy for generating fresh variables. First, we collect free variables across the whole term. Then, as we traverse the term, we keep track of all bound variables in the current context. Whenever we need a fresh variable, we take the original and append an apostrophe (') at the end. We then check this new variable against the list of free variables as well as the list of bound variables in the current context. If there are no collisions, we’re done; if not, we repeat the process.
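
Here is a small sketch of that strategy (freshVar is an illustrative name, not necessarily the one used in the package), assuming the two variable sets are tracked with Data.Set:

import qualified Data.Set as Set

type Sym = String

-- Append apostrophes to the original symbol until the candidate no
-- longer collides with the free variables of the whole term or the
-- variables bound in the current context.
freshVar :: Set.Set Sym -> Set.Set Sym -> Sym -> Sym
freshVar free bound sym
    | sym `Set.member` free || sym `Set.member` bound =
        freshVar free bound (sym ++ "'")
    | otherwise = sym

In the example above, the bound y in Term 1 would become y', after which the substitution proceeds safely.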

That’s all for today, folks!

The Josephus Problem

The final problem presented in the first chapter of Concrete Mathematics: A Foundation for Computer Science is The Josephus Problem. While the fables about the origin of this problem are entertaining in their own right, the premise is simple. We have n objects, given sequential labels from 1 to n and arranged in a circle in that sequence. Starting from the object labeled 1, we eliminate every second object until only one survives. The objective is to determine the label of the surviving object.

The Josephus Problem for n = 12 (link to LaTeX source)

As an example, consider the Josephus number J(n) for n = 12, shown in the diagram above. Labels get eliminated in the following sequence:

2, 4, 6, 8, 10, 12, 3, 7, 11, 5, 1

…leaving 9 at the very end.

J(12)=9

In looking for a general solution to this problem, we first observe that the pattern of elimination is slightly different depending on whether the number of objects is even or odd. When there are 2n objects, the first round eliminates all even labels. The second round then begins again with 1, with exactly half the original number of objects remaining; the object in the i-th position of this smaller circle now bears the larger label 2i − 1. Since the survivor of the smaller circle sits at position J(n), we can establish the following relation:

J(2n) = 2J(n)\;-\;1 \tag{1}

When there are 2n + 1 objects, the first round again eliminates all even labels, and the second round begins by eliminating 1. This leaves n objects labeled 3, 5, …, 2n + 1, with the object in the i-th position bearing the label 2i + 1. Similar to equation (1), we can establish the following relation:

J(2n + 1) = 2J(n) + 1 \tag{2}
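
Before moving on, we can sanity-check these relations against the n = 12 example, starting from the trivial base value J(1) = 1 (a lone object survives by default):

\begin{aligned}
J(3) &= 2J(1) + 1 = 3 \\
J(6) &= 2J(3) - 1 = 5 \\
J(12) &= 2J(6) - 1 = 9
\end{aligned}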

With these recurrence relations established, we can calculate a few sample values of the function and thereby derive a general proposition.

\begin{aligned}
J(1) &= 1 \\ \\
J(2) &= 1 \\
J(3) &= 3 \\ \\
J(4) &= 1 \\
J(5) &= 3 \\
J(6) &= 5 \\
J(7) &= 7 \\ \\
J(8) &= 1 \\
J(9) &= 3 \\
J(10) &= 5 \\
J(11) &= 7 \\
J(12) &= 9 \\
J(13) &= 11 \\
J(14) &= 13 \\
J(15) &= 15 \\ \\
J(16) &= 1
\end{aligned}
\begin{aligned}
J(2^m + l) &= 2l + 1 & 0 \le m, 0 \le l \lt 2^m \tag{3}
\end{aligned}

Proof (by induction on m)

We already know the value of the function when n=1, which we obtain when m = l = 0.

\begin{aligned}
J(1) = J(2^0 + 0) = 1 \tag{4}
\end{aligned}

Suppose that the proposition is true for some m \ge 0. We want to show that this implies that the proposition is also true for m + 1. To do so, we need to examine two cases separately: (a) when l is even, and (b) when l is odd.

\begin{aligned}
\text{(when $l$ is even)} \\ \\
J \big(2^{m+1}+l\big) &= J \big(2 \cdot [2^m+l/2]\big) \\
                        &= 2 \cdot J (2^m+l/2)-1 \\
                        &= 2 \cdot (2 \cdot l/2 + 1)-1 \\
                        &= 2l + 1 \\
\\
\text{(when $l$ is odd)} \\ \\
J \big(2^{m+1}+l\big) &= J \big(2 \cdot [2^m+(l-1)/2] + 1\big) \\
                        &= 2 \cdot J \big(2^m+(l-1)/2\big) + 1 \\
                        &= 2 \cdot \big(2 \cdot (l-1)/2 + 1\big) + 1 \\
                        &= 2l + 1 \\
\end{aligned}

These results follow from the earlier recurrence equations (1) and (2), together with the induction hypothesis that the proposition holds for m. Combined with the base case in (4), they show that the proposition is indeed true for all m. QED.
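
As a quick application of formula (3), consider n = 51, a value that reappears as a test case below:

J(51) = J(2^5 + 19) = 2 \cdot 19 + 1 = 39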

There are a number of ways to implement this program in Rust, and in this case, we take the most straightforward approach of evaluating the formula (3) that we just proved.

Preparation

  • Make sure cargo is installed on your system via rustup.
  • Create a new package called josephus to run your code.
$ cargo new josephus
$ cd josephus

P0: Basic Solution

Link to GitHub branch.

//
// src/lib.rs
//

/// Calculate the Josephus Number J(n) for any n > 0.
///
/// We use a simple formula here: express the number `n` as
/// 2^m + l for some m >= 0, 0 <= l < 2^m, then calculate the
/// value of 2 * l + 1, which is the result.
///
/// * `n` - The number to calculate J for
pub fn josephus(n: u32) -> u32 {
    let base: u32 = 2;

    let mut index: u32 = 0;
    let mut hbits: u32 = n;

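    // Locate the highest set bit of n: after the loop, index is
    // floor(log2(n)), i.e., the exponent m in n = 2^m + l.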
    loop {
        hbits >>= 1;
        if hbits == 0 {
            break;
        }
        index += 1;
    }

    let l = n - base.pow(index);

    2 * l + 1
}

#[cfg(test)]
mod tests {

    use super::*;

    #[test]
    fn test_one() {
        assert_eq!(josephus(1), 1);
    }

    #[test]
    fn test_three() {
        assert_eq!(josephus(3), 3);
    }

    #[test]
    fn test_twelve() {
        assert_eq!(josephus(12), 9);
    }

    #[test]
    fn test_fifty_one() {
        assert_eq!(josephus(51), 39);
    }

    #[test]
    fn test_sixty_four() {
        assert_eq!(josephus(64), 1);
    }
}

That’s all for today, folks!