Error Correction

A general technique in teaching an “artificial intelligence” is to feed it ground truth. This may come in many different forms, but the essential idea is this: you help the machine learn from its mistakes by giving it the “correct” answer, and some time to reflect on it.

“I want you to think about what you’ve done.”

This technique works equally well for us humans, as pithily explained in Daniel Coyle’s The Little Book of Talent:

Tip #22: Pay attention immediately after you make a mistake.

Coyle elaborates: “People who pay deeper attention to an error learn significantly more than those who ignore it. […] Develop the habit of attending to your errors right away. Don’t wince, don’t close your eyes; look straight at them and see what really happened, and ask yourself what you can do next to improve. Take mistakes seriously, but never personally.”

In the workplace, most technology companies have some form of retrospective that they codify into a process. At Amazon, this is called a ‘Correction Of Errors’, or ‘COE’ for short. When someone writes up a COE, they write down the nature and import of the error to the business, lessons learned as part of the incident and recovery, and actions to prevent future recurrence of the same class of problems. COEs leverage the 5 Whys process popularized by Toyota for going deeper into the problem and establishing root causes.

I’ve always been a proponent of writing COEs and learning from mistakes. It’s important to see the COE as an “engineering chisel” rather than a “managerial hammer”. As an engineer, writing COEs is a discipline you impose upon yourself to hone your craft.

“Now Look What You’ve Done!”

Some years ago, I proposed the idea of a ‘2 Hour COE’ that was well-received by engineers. It went like this: don’t hesitate to write COEs or think of them as ‘work’; dive right in and do it; don’t spend more than 2 hours to write up a single COE (keep it ugly); focus on root causes and learnings more than anything else; don’t create more than 1-3 action items (avoid creating new work for the team).

Last month, I accidentally missed a meeting with colleagues because my iOS Calendar app wasn’t accurately synced with the Microsoft Exchange server. Someone mentioned “5 Whys” jokingly, and I thought to myself: this isn’t the first time something like this has happened, and I sure keep complaining about it — what can I do to resolve root causes myself? Perhaps I should write up a COE!

Now, I love writing, and I spent an hour in the morning with good coffee in hand injecting some subtle humor into the write-up. It was primarily a joke aimed at the guy who mentioned “5 Whys” (and gloriously accomplished the job, if I may say so myself), but last week it became popular on Blind, and more than one person asked me: was I joking or serious? But can’t it be both? It turns out I did get something really useful out of it after all: I discovered an app called VMware Boxer that is fully supported for corporate email and works flawlessly on iOS.

If you’re an Amazonian and have questions, join #ama-riyer on the corporate Slack workspace.

Just Slack

Whenever you’re about to send an email to someone to discuss something, or to ask to schedule time on their calendar, stop – send them a Slack message (or whatever is the equivalent of Slack in your context) instead. Whether the other person responds immediately or not, this is a great time-saver, as it streamlines the initial conversation, keeping it quick and informal.

A Category Primer

Category theory formalizes mathematical structure. The term ‘theory’ is a misnomer, as there isn’t any hypothesis that’s subject to disconfirmation. Rather, you could think of category theory as a formalism or ‘language’ for expressing the abstract qualities of structures and relationships. It introduces a number of terms and definitions, which I like to call the ‘category zoo’ (which I will explore in a future post).

Wikipedia says, “Category theory has practical applications in programming language theory, for example the usage of monads in functional programming.” I find that statement to be a stretch: while one might find it convenient to express ideas in terms of concepts introduced in category theory, I believe these ideas were perfectly well-articulated even without it. My goal, then, isn’t to evangelize any practical benefit of understanding category theory, but simply to explain what it is.

The Basics

A category is a collection of objects and arrows (aka morphisms) that obey certain rules. Multiple collections may include the same objects and arrows; as long as these rules are obeyed within a collection, the collection is said to form a category.

  1. Every collection has zero or more objects and arrows.
  2. Every arrow has a source object and a target object.
  3. Every collection is closed under composition.
  4. Composition is associative.
  5. Every object has a unique identity arrow.

An object is a simple beast, serving simply as the start or end of arrows. It has no further structure to it, and cannot be ‘opened up’. Instead, an object can be defined only in terms of its relationship with other objects, represented by arrows. Multiple arrows may exist between any two objects (or from an object to itself). Each of these arrows is distinct. Arrows are given different names to distinguish them from each other.

A category is closed under composition, which means that if you have an arrow f from A to B, and an arrow g from B to C, there must exist, within the collection, another arrow g \circ{} f (pronounced g after f) from A to C which is the composition of the two. Note that there could be many arrows from A to C, but only one of them is g \circ{} f.

Composition is associative, which means that if you have an arrow f from A to B, an arrow g from B to C, and an arrow h from C to D, then we have:

h \circ{} (g \circ{} f) = (h \circ{} g) \circ{} f

In the diagram below, when you take a path from A to D, it doesn’t matter if you take the blue path or the red one, as they’re both equivalent.

An identity arrow of an object X is a unique arrow id_X whose source and target are the object itself, with the additional property that for any arrow f from A to B, we have:

id_B \circ{} f = f = f \circ{} id_A

Notice that these conditions must hold true for any arrow f, not just a particular one. In the diagram below, the blue and red paths are equivalent for any arrow f.
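As a rough illustration of composition, associativity and identity, here is a minimal sketch that models arrows as total functions between small finite sets, stored as Python dictionaries. The sets `A` to `D`, the arrows `f`, `g`, `h`, and the helpers `compose` and `identity` are all made up for this example; it spot-checks the laws for one choice of arrows rather than proving them.

```python
# Objects are small finite sets; arrows are total functions stored as dicts.
A = {1, 2}
B = {"x", "y", "z"}
C = {True, False}
D = {0}

f = {1: "x", 2: "z"}                      # f: A -> B
g = {"x": True, "y": False, "z": True}    # g: B -> C
h = {True: 0, False: 0}                   # h: C -> D


def compose(second, first):
    """Return the arrow 'second after first' as a new dict."""
    return {a: second[first[a]] for a in first}


def identity(obj):
    """Return the identity arrow on a finite set."""
    return {x: x for x in obj}


# Associativity: h after (g after f) equals (h after g) after f.
assoc_holds = compose(h, compose(g, f)) == compose(compose(h, g), f)

# Identity laws for f: A -> B. Composing with an identity changes nothing.
left_identity_holds = compose(identity(B), f) == f
right_identity_holds = compose(f, identity(A)) == f

print(assoc_holds, left_identity_holds, right_identity_holds)
```

Note that `compose(g, f)` only makes sense when the target of `f` is the source of `g`; the dictionary lookup would fail otherwise, which mirrors the rule that only matching arrows compose.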

An isomorphism is an invertible morphism (arrow). Given arrows f and f' from A to B and B to A respectively, they form an isomorphism if the following statements hold true.

\begin{aligned}
f' \circ f &= id_A \\
f \circ f' &= id_B \\
\end{aligned}

f and f' are inverses of each other. The inverse of an arrow, if it exists, is unique. Two objects A and B are isomorphic if there exists an isomorphism between them, noting in passing that there may be more than one such isomorphism.
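To make the point about multiple isomorphisms concrete, the sketch below enumerates every invertible function between two hypothetical two-element sets and checks both equations for each one. The names `A`, `B`, `compose`, `identity` and `invert` are assumptions made for this illustration, with arrows again modelled as dictionaries.

```python
from itertools import permutations

A = [1, 2]
B = ["x", "y"]


def compose(second, first):
    """Return the arrow 'second after first' as a new dict."""
    return {a: second[first[a]] for a in first}


def identity(obj):
    """Return the identity arrow on a finite set."""
    return {x: x for x in obj}


def invert(arrow):
    """Swap keys and values; this is a true inverse only for a bijection."""
    return {b: a for a, b in arrow.items()}


# Every bijection from A to B corresponds to one ordering of B's elements.
isomorphisms = [dict(zip(A, image)) for image in permutations(B)]

# Each candidate f, paired with its inverse, satisfies both equations.
all_invertible = all(
    compose(invert(f), f) == identity(A) and compose(f, invert(f)) == identity(B)
    for f in isomorphisms
)

print(len(isomorphisms), all_invertible)
```

Here `A` and `B` are isomorphic in two different ways, yet each individual isomorphism still has exactly one inverse.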

An initial object has exactly one arrow to every object in the category (including itself), whereas a terminal object has exactly one arrow from every object. If the source and target are the object itself, then the arrow must be the object’s identity arrow. A category may have any number of initial or terminal objects, or it may have none at all. Initial and terminal objects are duals of each other, identical concepts but with the direction of arrows reversed. In the diagram below, A and F are initial objects, whereas C is a terminal object.

The initial and terminal properties are called universal properties, and an object that satisfies either of these properties is called a universal object. If there are multiple initial objects in a category such as A and F above, then these objects must be isomorphic. Similarly, if there are multiple terminal objects in a category, then they must be isomorphic as well. We say that initial and terminal objects are ‘unique up to isomorphism’ to express this idea that while the objects may not be identical, there is an isomorphism between them.
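One way to experiment with these definitions is a small category where arrows are easy to count. The sketch below uses an example of my own choosing, not one from the diagrams: the objects are a few integers, and there is a single arrow from a to b exactly when a divides b. The function names are hypothetical.

```python
def arrows(a, b):
    """Arrows from a to b in the divisibility order: one if a divides b, else none."""
    return [(a, b)] if b % a == 0 else []


def initial_objects(objects):
    """Objects with exactly one arrow to every object, themselves included."""
    return [x for x in objects if all(len(arrows(x, y)) == 1 for y in objects)]


def terminal_objects(objects):
    """Objects with exactly one arrow from every object, themselves included."""
    return [x for x in objects if all(len(arrows(y, x)) == 1 for y in objects)]


divisors_of_six = [1, 2, 3, 6]
print(initial_objects(divisors_of_six), terminal_objects(divisors_of_six))

# Dropping 1 leaves a category with a terminal object but no initial object.
without_one = [2, 3, 6]
print(initial_objects(without_one), terminal_objects(without_one))
```

Divisibility is reflexive and transitive, which supplies the identity arrows and composition, so this collection does form a category. Because it has at most one arrow between any two objects, it cannot show two distinct isomorphic initial objects; for that you need a richer category such as the one in the text.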

Interlude: Category Set

The category of sets, denoted by Set, is one whose objects are sets, and morphisms are total functions from source to target. In this category, each set is nothing more than an object serving as the source or target of one or more morphisms; a categorical description abstracts over the elements contained within any set, leaving behind only how sets relate to each other.

In the category of sets, the empty set is an initial object, as there is exactly one total function \mathcal{F_{\varnothing \rightarrow S}} from the empty set to every set \mathcal{S} (including the empty set itself). The intuition for this is as follows: since the empty set \varnothing has no elements, it is vacuously true that if one of its elements were provided as input, it would be mapped to exactly one element of the target set, whichever set that happens to be. Alas, the empty set is incapable of providing any elements! Notice the parallel with propositional logic, in which the following is true for any B.

\lnot A \rightarrow (A \rightarrow B)
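If you want to convince yourself that the formula holds for every truth assignment, a brute-force check is short. This is only a sketch; `implies` is a helper defined here for the purpose.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


# Evaluate (not A) -> (A -> B) for all four combinations of A and B.
is_tautology = all(
    implies(not a, implies(a, b))
    for a, b in product([False, True], repeat=2)
)

print(is_tautology)
```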

Further, there is exactly one total function \mathcal{G_{S \rightarrow U}} from every set \mathcal{S} to every singleton set \mathcal{U} (a set with exactly one element), as there is exactly one way to construct such a function: every element of \mathcal{S} must map to the only element of \mathcal{U}. Every singleton set is therefore a terminal object. A function from any set to a given singleton set is what is conventionally called a constant function: no matter what input you offer, you get the same output, as there is no other alternative.
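Both counting claims can be checked by enumerating functions between small finite sets. In the sketch below, `total_functions` is a hypothetical helper that lists every total function as a dictionary; the example sets are arbitrary.

```python
from itertools import product


def total_functions(source, target):
    """List every total function from source to target, each as a dict."""
    source = list(source)
    return [
        dict(zip(source, images))
        for images in product(list(target), repeat=len(source))
    ]


empty = []
singleton = ["*"]
letters = ["a", "b", "c"]

# Exactly one function leaves the empty set, whatever the target is.
from_empty = [len(total_functions(empty, t)) for t in (empty, singleton, letters)]

# Exactly one function arrives at a singleton, whatever the source is.
to_singleton = [len(total_functions(s, singleton)) for s in (empty, singleton, letters)]

# No function maps a non-empty set into the empty set.
into_empty = len(total_functions(letters, empty))

print(from_empty, to_singleton, into_empty)
```

The single function out of the empty set is the empty dictionary, and the single function into `singleton` is the constant function sending everything to `"*"`.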

In the category of sets, as is true for any category, both initial and terminal objects are unique up to isomorphism. Notice that there is exactly one empty set (initial object) and infinitely many singleton sets (terminal objects). All singleton sets are isomorphic, but the empty set is special because it is also unique (and therefore trivially isomorphic to itself through id_\varnothing).

In the diagram below showing a part of the category of sets, \varnothing is the empty set, whereas U_1, U_2 and U_3 are singletons.

Universal Construction

Universal construction is a method of leveraging universal properties to construct new objects. We can observe this method in action in the context of defining products and coproducts.

Product

In the context of sets, a product (or more formally, the Cartesian product) of two sets A and B is A \times B, where:

A \times B = \{(a, b) \mid a \in A, b \in B\}
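For small finite sets, the set-builder definition translates directly into code. A quick sketch with two arbitrary example sets:

```python
from itertools import product

A = {1, 2}
B = {"x", "y", "z"}

# Every ordered pair (a, b) with a drawn from A and b drawn from B.
A_times_B = set(product(A, B))

print(len(A_times_B))
```

The product has as many elements as the sizes of `A` and `B` multiplied together, and each pair can be projected back to its left or right component.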

In category theory, we broaden this definition to apply to any category, and define product in terms of universal properties. We start by observing that the essential property of a product \mathcal{P} is that it can be mapped to two independent projections, a left projection \mathcal{L} and a right projection \mathcal{R}. With this, we define a candidate \mathcal{P'} for the product as an object that can yield \mathcal{L} and \mathcal{R}.

But there may be many such candidates — which one do we pick? For instance, the tuple (X, Y) can be mapped to X and Y, but so can (X, Y, Z) — simply by ignoring Z. We can thus think of each of the remaining candidates as containing some degree of additional noise that should be ignored. In other words, (X, Y, Z) can be uniquely mapped to (X, Y). This is true of any such candidate, and we can represent this idea in the following diagram.

l = l' \circ{} m\\
r = r' \circ{} m

There is a catch, however. You could still have many candidates that look like \mathcal{P}. For instance, both (X, Y) and (Y, X) are equivalent in terms of their information content, and there is no reason to consider one to be a better candidate than the other for \mathcal{P} — they are isomorphic. Or expressed more formally, \mathcal{P} is not unique, but “unique up to isomorphism”.
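The argument can be tried out on small finite sets. The sketch below is an illustration under my own naming, not the diagram's: `P` is the set of pairs, `candidate` is the set of triples, `fst` and `snd` project out of `P`, `cand_left` and `cand_right` project out of the candidate, and `mediate` is the arrow from the candidate to `P`. It brute-forces every function from the candidate to `P` to confirm that only one is compatible with the projections, then checks that swapping components is invertible.

```python
from itertools import product

X, Y, Z = [1, 2], ["a", "b"], [0, 1]

P = list(product(X, Y))             # the pairs (x, y)
candidate = list(product(X, Y, Z))  # the noisier triples (x, y, z)


def fst(p): return p[0]
def snd(p): return p[1]
def cand_left(c): return c[0]
def cand_right(c): return c[1]


def mediate(c):
    """Map a triple to the pair carrying the same left and right parts."""
    return (cand_left(c), cand_right(c))


# Try every function candidate -> P (one output per triple) and keep those
# whose projections agree with the candidate's own projections.
commuting = [
    images
    for images in product(P, repeat=len(candidate))
    if all(
        fst(m) == cand_left(c) and snd(m) == cand_right(c)
        for c, m in zip(candidate, images)
    )
]


def swap(p):
    """Exchange the two components of a pair."""
    return (p[1], p[0])


# Swapping twice returns the original pair, so (X, Y) and (Y, X) are isomorphic.
swap_is_invertible = all(swap(swap(p)) == p for p in P)

print(len(commuting), swap_is_invertible)
```

Out of the 65,536 functions from the triples to the pairs, exactly one survives the filter, and it is `mediate`.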

Coproduct

A coproduct is the dual of a product, which you can construct by reversing all the arrows from the previous example.

In the context of sets, the coproduct \mathcal{C} represents a disjoint union (also known as tagged union) of \mathcal{L} and \mathcal{R}.

p = m \circ{} p' \\
q = m \circ{} q'
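A tagged union is straightforward to model in code. In the sketch below, the names `inl`, `inr` and `copair` are my own (hypothetical) choices: the first two inject values into the union with a tag, and `copair` builds the arrow out of the union from one function per side.

```python
def inl(x):
    """Inject a value from the left set, tagging it so its origin is kept."""
    return ("L", x)


def inr(y):
    """Inject a value from the right set, tagging it so its origin is kept."""
    return ("R", y)


def copair(f, g):
    """Build the arrow out of the tagged union that uses f on the left and g on the right."""
    def m(tagged):
        tag, value = tagged
        return f(value) if tag == "L" else g(value)
    return m


left, right = {1, 2}, {2, 3}

# The plain union merges the shared element; the tagged union keeps both copies.
plain_union = left | right
tagged_union = {inl(x) for x in left} | {inr(y) for y in right}

double = lambda n: n * 2
negate = lambda n: -n
m = copair(double, negate)

# Composing m with each injection recovers the original function on that side.
left_agrees = all(m(inl(x)) == double(x) for x in left)
right_agrees = all(m(inr(y)) == negate(y) for y in right)

print(len(plain_union), len(tagged_union), left_agrees, right_agrees)
```

The element 2 appears once in the plain union but twice in the tagged union, once per side, which is what makes the union disjoint.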

More on the “category zoo” — next time. That’s all for now, folks! 🖖