Dinky

Today, I want to talk about a small side-project that I’ve been working on (source code on GitHub with an Apache 2.0 license). The project is a minimalistic web application available at dinky.dev that’s open for anyone to use.

dinky.dev has the rather ambitious goal of helping you organize your life. Of course, my original goal was to help me organize my life, and I’m at a stage of the project where I can see it start to do that. I think it would now benefit from others using it, and providing feedback or suggestions.

Daily Rhythms

Something I’ve come to realize over time is that my “planned” daily activities can be boiled down to three simple abstractions: note-taking, task-execution, and synthesis.

Note-taking. During the course of the day, I consume a lot of information. My recall and learning are maximized when I write down my thoughts and conclusions, and to this end, I need to take notes somewhere. Of the multiple tools available for this, I typically grab the closest or easiest one, such as Apple Notes (or a physical notebook), and write stuff down. And because I’m actively listening or thinking at the time, I don’t have the luxury of organizing my notes in any meaningful way.

Task-execution. When I’m ready to get stuff done, it’s most helpful to know exactly what I need to do next. A common pitfall is that the thing I need to do next is too big or ambiguous, and it’s easier to move on to something else rather than think about it. My aspirational rule of thumb is that a task should be completable in 20 minutes or less.

Synthesis. Synthesis, a form of deep thinking and planning, bridges the gap between information consumption and action determination. This takes a few forms: (a) given notes captured during the day, what new tasks ought I create? (b) given a large or ambiguous task, what smaller tasks should I break it down into? (c) does the current set of tasks continue to help me make progress towards my larger objectives?

Web Application

dinky.dev is a responsive client-side (React / JavaScript-based) web application that attempts to encode the daily rhythms described above into a single tool that has an optimal user flow. Basically, it lets you take notes and track tasks, and bridges these capabilities with some others that I’ll describe shortly.

The first rule of the application is minimalism. I’ve tried to keep bells and whistles to a minimum, both in terms of style and substance. That’s one reason why the site is themed almost entirely in grayscale. (The second reason, of course, is that I have a bias towards pure functionalism.)

Both notes and tasks support GitHub-flavored Markdown for basic formatting. Markdown is both simple and incredibly valuable for prettifying text and making it navigable, especially when dealing with long hyperlinks. The primary mode of sifting through the data is to perform a search, which is accessible using the forward-slash (/) keyboard shortcut. Search is performed entirely on the client side, and regular expressions work just fine. All of your data is stored within Local Storage in the browser and accessible only to you (caveat: or anyone who has access to your browser and can pose as you). Besides search, several other functions have keyboard shortcuts, such as creating new items (n) and navigating between pages.

When you take notes, you eventually need to review and organize them to figure out if there are any further actions that need to be taken. While dinky.dev can’t help you figure this out, it does make it easy for you to add new tasks to your backlog. Furthermore, at the start of the day (or the previous evening, if you’re so inclined) you can identify the tasks that need to get done on that day and add them to your agenda.

Adding tasks to your agenda is a way of separating the planning of your day from the doing of tasks, thus helping you focus on making single-threaded progress. Also, new tasks always go first to your backlog, which is a way of discouraging changes to today’s agenda once you’ve figured it all out.

Topics provide a light-weight way of capturing your core top-level objectives and associating either tasks or notes with those objectives. For instance, as I worked on this project, I tagged related notes and tasks (typically features to implement or bugs to fix) with “#dinky”, which then made it easy for me to pull together everything related to the project. Any word or phrase (without spaces, typically hyphenated) prefixed with a hash (#) is automatically considered a topic.

Limitations

Before using dinky.dev, you should be aware of its limitations. Apart from minor ones that you might discover, there are a few glaring omissions:

  1. No synchronization across devices.
  2. Not a lot of storage (<5MB depending on your browser).
  3. Browser’s “Private Mode” not useful (you can’t save your data).

All in all, it’s probably best to treat dinky.dev as an experimental web application and not use it for mission-critical work. That said, it is pretty stable, and I expect future changes to be backwards-compatible.

Learnings

Finally, a segue to what I learned from it all.

Synchronizing data is complicated on multiple levels.

…which is why I never got around to implementing it.

First, my preference was to let users manage their data entirely, instead of managing it on their behalf. But this would have required picking a specific vendor that users might be persuaded to use (such as AWS), and requiring users to set up a complex set of policies for authentication, authorization, and storage. And if I did set it up on behalf of users, I would need to figure out how users authenticate themselves to dinky.dev, and how they get billed for their usage. I believe now that some form of account setup on behalf of users, combined with cross-account billing, might be the best option, if possible in practice.

Second, it’s impractical to guarantee correct behavior in the face of concurrent writes when using simple and cheap storage like S3. Moreover, things get further complicated when you try to avoid writing an entire blob in favor of incremental updates. The effort (before I abandoned it) started taking the shape of a transaction log (aka “redo” log) combined with a subscription mechanism (to keep track of when the log could be merged and optimized).

TypeScript is a lot of fun to use.

This was the first time I’d gotten to use TypeScript seriously, and it certainly hit a sweet spot between compile-time safety and developer-friendliness. I particularly enjoyed its structural typing paradigm, which I think works quite well in a dynamic front-end environment where “everything is data”, and you’re constantly trying to determine if the shapes of objects are fitting together correctly. Also, using TypeScript (+ Visual Studio Code hints) made it easy to develop code without errors; in hindsight, development would likely have been 10x slower without it.

React has come a long way

I initially started writing code using React class components, but eventually rewrote everything using its functional components and hooks instead. React’s architecture made it incredibly easy to decompose and organize code with minimal boilerplate. I stuck to basic features of React, but they were enough to get the job done.

IndexedDB is…complicated

I may yet switch to IndexedDB in the future, but it’s just so complicated! Local Storage, on the other hand, is probably as simple as it can get (with getItem, setItem, and clear as the main primitives), but it has severe limitations (lack of indexing, limited space). I’ve continued to trudge down the path of Local Storage for now.

Closing Thoughts

This project is not “done” by any means, but it gave me an opportunity to build something that I desperately wanted to use myself. I think that’s a great situation to be in, because it forces you to prioritize what you really need, and you know exactly what it is that you want. I can’t say for sure if that’s a formula for success, but it certainly is a formula for personal satisfaction. Tools that work with you when you want to rewrite large swathes of code (as I did several times in this case) are an essential ingredient in this recipe.

That’s all for today, folks! 🖖

EDIT 2022-05-11: I recently changed the term “tags” to “topics”, as I felt like the latter better reflected the intent of tracking high-level objectives and interests.

Versioning

I’ve been thinking about versioning as a concept and arrived at a sufficiently concise explanation that seemed worth sharing. Let’s start with some simple definitions:

Definition 1

Version tags are compressed references to objects within a given semantic category, where the category is often (but not necessarily) scoped to a name or identifier.

As an example, "Big Sur" and "Monterey" are version tags in the category "macOS", whereas 95 and 98 are version tags in the category "Windows". You can compare, say, ("macOS", "Big Sur") to ("Windows", 98), but comparing "Monterey" to 95 without reference to the respective operating systems would be like comparing apples and oranges.

System designers may assert additional semantics around version tags. For instance, they may assert that version tags are always numeric, and that a larger number references a newer object within the semantic category. (As a side note, “newer according to whom?” is a good question to ask in any distributed system.)

Version tags are compressed references in the sense that the uncompressed alternative would be to use the referenced object in its entirety. For instance, if you had the luxury of always being able to perform bit-by-bit comparisons of the objects referenced by two version tags, you would no longer need version tags.

Version tags may be assigned through a process that conforms to either (or both) of the following forms:

  1. The version tag is derived from the referenced object.
  2. The version tag is derived from contextual state.

In the first scheme (ex: SHA256 checksums as version tags), the system may enforce either one (or both) of the following invariants.

  1A. Distinct objects likely result in distinct version tags. (Why only “likely”? Because there are no collision-free lossy hashing schemes.)
  1B. Distinct version tags reference distinct objects.

In the second scheme (ex: monotonically increasing version numbers), the system may enforce the following invariant.

  2A. There is a total ordering relationship (representing a notion of newness) across the set of version tags associated with objects in the given semantic category.
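
To make the two schemes concrete, here is a minimal Rust sketch (purely illustrative, and not tied to any particular system; DefaultHasher stands in for a stronger hash like SHA-256). The first function derives a tag from the object’s content, while the second derives it from contextual state owned by the category.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Scheme 1: the version tag is derived from the referenced object itself.
// (A real system would use a cryptographic hash such as SHA-256; DefaultHasher
// keeps the sketch dependency-free.)
fn content_tag(object: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    object.hash(&mut hasher);
    hasher.finish()
}

// Scheme 2: the version tag is derived from contextual state, in this case a
// monotonically increasing counter owned by the semantic category.
struct Category {
    next: u64,
}

impl Category {
    fn new() -> Self {
        Category { next: 0 }
    }

    fn context_tag(&mut self) -> u64 {
        self.next += 1;
        self.next
    }
}

fn main() {
    // Scheme 1: identical content always maps to the same tag (1B), and
    // distinct content likely maps to distinct tags (1A).
    assert_eq!(content_tag("foo"), content_tag("foo"));

    // Scheme 2: tags are totally ordered by newness (2A), but say nothing
    // about whether the referenced objects differ.
    let mut cat = Category::new();
    let v1 = cat.context_tag();
    let v2 = cat.context_tag();
    assert!(v2 > v1);
}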

Interestingly, 1A and 2A may be entirely compatible. For instance, the version tag may be derived from the previous version tag supplied within the contents of the target object (scheme 1), but confirmed by checking that no new version tags have been concurrently created (scheme 2).

On the other hand, 1B and 2A are not compatible unless you’re willing to rewrite history in the process of creating new objects. To see an example of this incompatibility, imagine that you have an object of type Object that you’ve chosen to version using the first scheme. Initially, the object does not exist, and [Step 1] you create a new one with content "foo" (version tag = 1), which you later [Step 2] update to "bar" (version tag = 2). If you later decide to [Step 3] update the content to "foo" once again, the system would need to make a choice between the two invariants, either setting the version tag to 1 (maintaining 1B) or setting it to 3 (maintaining 2A).

On the other hand, if you are willing to rewrite history, you can drop the reference to "bar" altogether, and thus not violate either invariant. The problem with this approach is that these references may have been published to external systems, and it’s not easy to have everyone rewrite history the same way.

However, there is one other way of rewriting history that might be more “principled”. That is to retain exactly one previous version in your history. At Step 3 in the example above, you would drop "foo" with version tag 1, retain "bar" with version tag 2, and create "foo" with version tag 3. This scheme offers compatibility with 1A, 1B and 2A, but requires that you commit to tracking only one previous version!
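
Here is a hypothetical Rust sketch of that scheme, just to make the bookkeeping concrete: the cell retains only the current version and the immediately previous one, while version tags keep increasing.

struct VersionedCell {
    next_tag: u64,
    current: (u64, String),
    previous: Option<(u64, String)>,
}

impl VersionedCell {
    fn create(content: &str) -> Self {
        VersionedCell {
            next_tag: 2,
            current: (1, content.to_string()),
            previous: None,
        }
    }

    fn update(&mut self, content: &str) {
        if content == self.current.1 {
            return; // unchanged content keeps its existing version tag
        }
        let tag = self.next_tag;
        self.next_tag += 1;
        // Rewriting history: whatever was "previous" is dropped here, so only
        // one prior version is ever retained.
        self.previous = Some(std::mem::replace(
            &mut self.current,
            (tag, content.to_string()),
        ));
    }
}

fn main() {
    let mut cell = VersionedCell::create("foo"); // Step 1: ("foo", 1)
    cell.update("bar");                          // Step 2: ("bar", 2), keeps ("foo", 1)
    cell.update("foo");                          // Step 3: ("foo", 3), keeps ("bar", 2),
                                                 //         drops ("foo", 1)
    assert_eq!(cell.current, (3, "foo".to_string()));
    assert_eq!(cell.previous, Some((2, "bar".to_string())));
}

Within the retained window, every tag references a distinct object and newer objects carry larger tags, which is the compatibility the scheme buys at the cost of dropping older history.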

That’s all for today, folks! 🖖

Grid Algorithm

Over the past couple of weeks, I’ve been writing a small lexer in Rust, motivated by the need for something simpler and easier to use than the default lexer offered by LALRPOP. The built-in lexer allows for simple regexes, but managing these regexes becomes difficult for more sophisticated situations such as double-quoted literals.

The core idea behind a lexer is to convert a stream of characters into a stream of tokens. Each token constitutes a distinct building block that can be used to construct a grammar for a language parser. As an analogy, the vocabulary and punctuation supporting the English language constitute a set of tokens. Of course, you could avoid the lexer altogether and deal directly with the underlying characters, but that makes the language grammar exponentially more complex to specify. Imagine parsing English sentences based on rules associated with each character of the alphabet!

How It Works

To recap, a lexer’s job is to convert a stream of characters into a stream of tokens. Tokens can be anything that’s useful to feed into your parser. The parser is generated from a set of productions declared in a grammar file. LALRPOP provides a bottom-up LR(1) parser (“LR(1)” expands to a left-to-right, rightmost-derivation parser with a single token of lookahead). A bottom-up parser uses recognized tokens to construct leaf nodes of the parse tree. In contrast, a top-down parser (of which recursive-descent parsers are the most common) generates productions eagerly as matching symbols are encountered. Top-down parsers are often very convenient and easy to use, but they’re less flexible compared to bottom-up parsers, and require backtracking when they encounter production dead-ends.

One way to structure the lexer is to break it down into a number of processors, one per type of token that you could potentially produce. For example, a Sym processor processes symbols, which are arbitrarily long consecutive sequences of characters conforming to a class (such as letters, numbers and underscores). The processor abstraction is useful for encapsulating the logic associated with each token type into its own implementation. The idea is to iterate through each character in the stream, and feed it to each of the processors until one of them accepts the character. If none of the other processors accepts the character, the Err processor is sure to do so (which means the character stream is ill-formed).

This abstraction is represented by the following trait:

pub trait Processor {
    fn accept(
        &self,
        idx: usize,
        chr: Option<&char>) -> Result<LexRes, LexErr>;
}

To complete this definition, the associated types LexRes and LexErr are provided below. The lexer — when fully or partially successful — returns an ordered set of tokens, where each token declares the starting index of the token (relative to the original character stream), the length of the stream consumed, and the type of token (represented by the Cat enum). Notice that some of these enum values such as Sym (for ‘symbol’) and Lit (for ‘literal’) capture a value, whereas others do not.

#[derive(Debug, Eq, PartialEq)]
pub struct LexRes {
    pub tokens: Vec<Token>,
    pub accepted: usize,
    pub rejected: bool,
}

#[derive(Debug, Eq, PartialEq)]
pub struct LexErr {
    pub idx: usize,
    pub msg: String,
}

#[derive(Debug, Eq, PartialEq)]
pub struct Token {
    pub idx: usize,
    pub len: usize,
    pub cat: Cat,
}

#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Cat {
    Sym(String),
    Lit(String),
    Assgn,
    Eql,
    SemiColn,
    LParen,
    RParen,
    Spc,
}

What the simple definition of LexRes above belies is its underlying flexibility. When a processor receives a character, it can choose from 3 options:

  1. Accept k outstanding characters (accepted = k; k ≤ n)
  2. Reject all characters not accepted (rejected = true; k < n)
  3. Wait for more characters (rejected = false; k < n)

The number n represents the total number of outstanding characters, which increments on every iteration, but decrements on acceptance or rejection of characters. Notice that these options are not mutually exclusive; in fact, the first option can be combined with either the second or the third, but the second and third cannot be combined with each other.

When the processor chooses the third option, this acts as a signal to the lexer that the next character should be fed directly to this processor. In essence, the processor holds an exclusive lock over the remainder of the stream as long as characters remain pending (neither accepted nor rejected). This process can continue ad infinitum, or at least until the end of the character stream. In each iteration, the processor has the choice of accepting up to the total number of outstanding characters (in stream order), and either rejecting the rest (which ends the cycle and releases the exclusive lock) or continuing to wait for additional characters.
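
To make these choices concrete, here is a hypothetical sketch of what a Sym-style processor could look like on top of the trait and types above. This is not the actual implementation: in particular, it uses interior mutability to buffer pending characters under the &self signature, and it assumes idx is the stream position of the character being offered.

use std::cell::RefCell;

pub struct SymProcessor {
    // Characters offered so far that are neither accepted nor rejected yet.
    pending: RefCell<String>,
}

impl SymProcessor {
    pub fn new() -> Self {
        SymProcessor { pending: RefCell::new(String::new()) }
    }

    fn is_sym_char(c: char) -> bool {
        c.is_alphanumeric() || c == '_'
    }
}

impl Processor for SymProcessor {
    fn accept(&self, idx: usize, chr: Option<&char>) -> Result<LexRes, LexErr> {
        let mut pending = self.pending.borrow_mut();
        match chr {
            // Option 3: the character extends the symbol, so buffer it and wait.
            Some(&c) if Self::is_sym_char(c) => {
                pending.push(c);
                Ok(LexRes { tokens: vec![], accepted: 0, rejected: false })
            }
            // Anything else (or the end of the stream) terminates the symbol:
            // accept the buffered characters (option 1) and reject the
            // terminator (option 2) so it restarts at the first processor.
            _ => {
                let len = pending.chars().count();
                if len == 0 {
                    // Nothing buffered and the character isn't ours: plain reject.
                    return Ok(LexRes { tokens: vec![], accepted: 0, rejected: true });
                }
                let sym = std::mem::take(&mut *pending);
                Ok(LexRes {
                    tokens: vec![Token { idx: idx - len, len, cat: Cat::Sym(sym) }],
                    accepted: len,
                    rejected: true,
                })
            }
        }
    }
}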

The declared ordering of processors has functional significance in certain situations. For instance, if you have a choice of detecting one of <, - and <- as different token types, you’d likely want <- to take precedence. Also, as a trivial example, the Err processor is always used as the last resort.
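
And here is a hedged sketch of what the outer loop could look like under these rules. Again, this is not the actual code: it assumes an Err-style processor sits at the end of the list and always responds, and it glosses over some end-of-stream details.

pub fn lex(input: &str, processors: &[Box<dyn Processor>]) -> Result<Vec<Token>, LexErr> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut start = 0;                    // index of the first outstanding character
    let mut cursor = 0;                   // index of the next character to feed
    let mut holder: Option<usize> = None; // processor currently holding the lock

    while start < chars.len() {
        match holder {
            // A waiting processor gets the rest of the stream exclusively.
            Some(p) => {
                let res = processors[p].accept(cursor, chars.get(cursor))?;
                tokens.extend(res.tokens);
                start += res.accepted;
                if res.rejected || cursor >= chars.len() {
                    holder = None;  // lock released; any rejected characters
                    cursor = start; // are re-fed starting at the first processor
                } else {
                    cursor += 1;    // still waiting: feed the next character
                }
            }
            // No lock: offer the character to each processor in declared order.
            None => {
                for (p, proc) in processors.iter().enumerate() {
                    let res = proc.accept(cursor, chars.get(cursor))?;
                    if res.accepted == 0 && res.rejected {
                        continue; // not interested; try the next processor
                    }
                    tokens.extend(res.tokens);
                    start += res.accepted;
                    if res.accepted == 0 && !res.rejected {
                        holder = Some(p); // the processor chose to wait
                        cursor += 1;
                    } else {
                        cursor = start;
                    }
                    break;
                }
            }
        }
    }
    Ok(tokens)
}

Paired with the Sym sketch above, this loop reproduces the restart behavior discussed in the Complexity section below: the character that terminates a symbol is rejected and then processed again from the top of the processor list.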

Play the short animation below for a concrete visual of what the lexer’s processing looks like for this simple character sequence:

let quote <- say("\"Hello world!\" --RRI");

In the example above, let, quote and say are interpreted as symbols using the Sym processor. The Seq processor looks for a specific sequence of characters (here, it is <-, mapping to Cat::Assgn). The Lit processor is distinctive because it internally tracks state associated with the escape character \, which can escape double-quotation marks inside a double-quoted literal.

Complexity

If we assume that (a) we have a small constant number of processors and (b) each processor performs its task in constant time, the overall algorithm is clearly O(n) in time-complexity, with respect to a character stream of length n. However, we can also evaluate how efficient the algorithm is with respect to a baseline. In theory, the only way for us to be certain that a character is part of a particular token is for its processor to accept it, which means that every character must be processed at least once, giving us a total of n processing calls as the ideal minimum to baseline our algorithm against.

From the visual above, it’s clear that the checkmarks (✓) represent the ideal path, whereas the crosses (☒) represent ‘misses’, or distance traveled before the character is accepted. Further, when a processor like Sym rejects the character immediately succeeding its final character (such as whitespace or punctuation, which is not part of the symbol), processing is restarted at the rejected character from the first processor, which, in the particular example, is again Sym. That means we have an extra character to be processed for every symbol recognized (note that ‘symbol’ in this context means the output token of the Sym processor such as let, quote and say). The efficiency of the algorithm with respect to the baseline can therefore be defined as:

\frac{count(\texttt{accept})}{count(\texttt{accept}) + count(\texttt{reject}) + count(\texttt{symbol})}

In essence, this is a number in (0, 1] that is the inverse of the average number of steps until a character is accepted by a processor. It varies based on the string processed by the lexer, so we’d likely only care about average statistics. Applying it to the example above gives us an efficiency of approximately 0.59: on average it takes 1.7 steps for each character to be accepted by a processor. Turning this into a useful metric would require us to normalize it, which we won’t attempt to do today.

Finally, there are further opportunities for refinement. First, each processor requires a different number of character comparisons to complete its task. Although we’ve assumed that each operates in constant time, we really ought to weight each processor, or otherwise measure the total number of comparisons directly. This would give us a more accurate metric than the one presented above. Second, we can order processors to maximize efficiency, basing the order on predictions that we derive from the statistics of the input being processed. We’re still bound by the need to process each character at least once, but we might be able to get really good at predicting which processor to try next, giving us an efficiency of close to 1.

That’s all for today, folks! 🖖