Dinky

Today, I want to talk about a small side-project that I’ve been working on (source code on GitHub with an Apache 2.0 license). The project is a minimalistic web application available at dinky.dev that’s open for anyone to use.

dinky.dev has the rather ambitious goal of helping you organize your life. Of course, my original goal was to help me organize my life, and I’m at a stage of the project where I can see it start to do that. I think it would now benefit from others using it, and providing feedback or suggestions.

Daily Rhythms

Something I’ve come to realize over time is that my “planned” daily activities can be boiled down into three simple abstractions, which are note-taking, task-execution, and synthesis.

Note-taking. During the course of the day, I consume a lot of information. My recall and learning are maximized when I write down my thoughts and conclusions, and to this end, I need to take notes somewhere. When there are multiple tools available to do this, I typically grab the closest or easiest one, such as Apple Notes (or a physical notebook), and write stuff down. And because I’m actively listening or thinking at the time, I don’t have the luxury of organizing my notes in any meaningful way.

Task-execution. When I’m ready to get stuff done, it’s most helpful to know exactly what I need to do next. A common pitfall is that the thing I need to do next is too big or ambiguous, and it’s easier to move on to something else rather than think about it. My aspirational rule of thumb is that every task should be completable in 20 minutes or less.

Synthesis. Synthesis, a form of deep thinking and planning, bridges the gap between information consumption and action determination. This takes a few forms: (a) given notes captured during the day, what new tasks ought I create? (b) given a large or ambiguous task, what smaller tasks should I break it down into? (c) does the current set of tasks continue to help me make progress towards my larger objectives?

Web Application

dinky.dev is a responsive client-side (React / JavaScript-based) web application that attempts to encode the daily rhythms described above into a single tool that has an optimal user flow. Basically, it lets you take notes and track tasks, and bridges these capabilities with some others that I’ll describe shortly.

The first rule of the application is minimalism. I’ve tried to keep bells and whistles to a minimum, both in terms of style and substance. That’s one reason why the site is themed almost entirely in grayscale. (The second reason, of course, is that I have a bias towards pure functionalism.)

Both notes and tasks support GitHub-flavored Markdown for basic formatting. Markdown is both simple and incredibly valuable for prettifying text and making it navigable, especially when dealing with long hyperlinks. The primary mode of sifting through the data is to perform a search, which is accessible using the forward-slash (/) keyboard shortcut. Search is performed entirely on the client-side, and regular expressions work just fine. All of your data is stored within Local Storage in the browser and accessible only to you (caveat: or anyone who has access to your browser and can pose as you). Besides search, several other functions support keyboard shortcuts, such as the ability to create new items (n) and page navigation.
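
To make the search behavior concrete, here is a minimal TypeScript sketch of client-side filtering with regular expressions. It is not dinky.dev's actual code: the `Item` shape, the `searchItems` and `compileQuery` functions, the case-insensitive flag, and the fallback to a literal match for invalid patterns are all assumptions made for illustration.

```typescript
// Hypothetical item shape; dinky.dev's real data model may differ.
interface Item {
  id: number;
  text: string;
}

// Escape regex metacharacters so a query can be matched literally.
function escapeRegExp(query: string): string {
  return query.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Compile the query as a regular expression.
// Assumption: an invalid pattern falls back to a literal, case-insensitive match.
function compileQuery(query: string): RegExp {
  try {
    return new RegExp(query, "i");
  } catch {
    return new RegExp(escapeRegExp(query), "i");
  }
}

// Return the items whose text matches the query. Everything runs in memory,
// so no request ever leaves the browser.
function searchItems(items: Item[], query: string): Item[] {
  const pattern = compileQuery(query);
  return items.filter((item) => pattern.test(item.text));
}
```

With a few thousand short notes, a linear scan like this is usually fast enough that no index is needed, which fits well with keeping everything in Local Storage.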

When you take notes, you eventually need to review and organize them to figure out if there are any further actions that need to be taken. While dinky.dev can’t help you figure this out, it does make it easy for you to add new tasks to your backlog. Furthermore, at the start of the day (or the previous evening, if you’re so inclined) you can identify the tasks that need to get done on that day and add them to your agenda.

Adding tasks to your agenda is a way of separating the planning of your day from the doing of tasks, thus helping you focus on making single-threaded progress. Also, new tasks always go first to your backlog, which is a way of discouraging changes to today’s agenda once you’ve figured it all out.

Topics provide a light-weight way of capturing your core top-level objectives and associating either tasks or notes with those objectives. For instance, as I worked on this project, I tagged related notes and tasks (typically features to implement or bugs to fix) with “#dinky”, which then made it easy for me to pull together everything related to the project. Any word or phrase (without spaces, typically hyphenated) prefixed with a hash (#) is automatically considered a topic.
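
As an illustration of the topic rule, the following TypeScript sketch pulls topics out of a piece of text. It is an assumption about how such a rule could be implemented, not dinky.dev's actual parser: the `extractTopics` name, the exact character set allowed in a topic, and the requirement that the hash follow whitespace (so that Markdown headings such as `# Title` are not picked up) are all hypothetical.

```typescript
// Extract topics such as "#dinky" or "#side-project" from note or task text.
// Assumptions: a topic starts at the beginning of the text or after whitespace,
// and consists of letters, digits, underscores and hyphens.
function extractTopics(text: string): string[] {
  const pattern = /(^|\s)#([A-Za-z0-9][A-Za-z0-9_-]*)/g;
  const topics: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const topic = match[2];
    // Keep each topic once, in order of first appearance.
    if (topics.indexOf(topic) === -1) {
      topics.push(topic);
    }
  }
  return topics;
}
```

For example, `extractTopics("Fix the search bug #dinky and plan #side-project")` returns `["dinky", "side-project"]`.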

Limitations

Before using dinky.dev, you should be aware of its limitations. Apart from minor ones that you might discover, there are a few glaring omissions:

  1. No synchronization across devices.
  2. Not a lot of storage (<5MB depending on your browser).
  3. Browser’s “Private Mode” not useful (you can’t save your data).

All-in-all, it’s probably for the best if you treat dinky.dev as an experimental web application and not use it for mission-critical work. That said, it is pretty stable and I expect future changes to be backwards-compatible.

Learnings

Finally, a segue to what I learned from it all.

Synchronizing data is complicated on multiple levels.

…which is why I never got around to implementing it.

First, my preference was to let users manage their data entirely, instead of managing it on their behalf. But this required me to pick a specific vendor that users might be persuaded to use (such as AWS), and require users to set up a complex set of policies for authentication, authorization and storage. And if I did set it up on behalf of users, I would need to figure out how users authenticate themselves to dinky.dev, and how they get billed for their usage. I believe now that some form of account setup on behalf of users combined with cross-account billing might be the best option, if possible in practice.

Second, it’s impractical to use simple and cheap storage like S3 to guarantee correct behavior in the face of concurrent writes. Moreover, things get further complicated when you try to avoid writing an entire blob in favor of incremental updates. The effort (before I abandoned it) started taking the shape of a transaction log (aka “redo” log) to be combined with a subscription mechanism (to keep track of when the log could be merged and optimized).
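
For readers unfamiliar with the idea, here is a rough TypeScript sketch of a redo log that is replayed into state and periodically compacted. It only illustrates the general technique and is not the abandoned design itself: the `LogEntry` shape, the `replay` and `compact` functions, and the use of a single increasing `seq` number are assumptions. It also sidesteps the hard part described above, namely who assigns sequence numbers when writers are concurrent.

```typescript
// Hypothetical log entry: either store a value under a key or delete the key.
type LogEntry =
  | { seq: number; op: "put"; key: string; value: string }
  | { seq: number; op: "delete"; key: string };

// Return a copy of the entries sorted by sequence number.
function inOrder(entries: LogEntry[]): LogEntry[] {
  return entries.slice().sort((a, b) => a.seq - b.seq);
}

// Rebuild the current state by applying every entry in sequence order.
function replay(entries: LogEntry[]): { [key: string]: string } {
  const state: { [key: string]: string } = {};
  for (const entry of inOrder(entries)) {
    if (entry.op === "put") {
      state[entry.key] = entry.value;
    } else {
      delete state[entry.key];
    }
  }
  return state;
}

// Compact the log: keep only the latest "put" for each key that still exists.
// Replaying the compacted log yields the same state as replaying the original.
function compact(entries: LogEntry[]): LogEntry[] {
  const latest: { [key: string]: LogEntry } = {};
  for (const entry of inOrder(entries)) {
    if (entry.op === "put") {
      latest[entry.key] = entry;
    } else {
      delete latest[entry.key];
    }
  }
  return inOrder(Object.keys(latest).map((key) => latest[key]));
}
```

Appending small entries avoids rewriting the whole blob on every change, at the cost of needing a safe moment to compact, which is where a subscription mechanism would come in.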

TypeScript is a lot of fun to use.

This was the first time I’d gotten to use TypeScript seriously, and it certainly hit a sweet spot between compile-time safety and developer-friendliness. I particularly enjoyed its structural typing paradigm, which I think works quite well in a dynamic front-end environment where “everything is data”, and you’re constantly trying to determine if the shapes of objects are fitting together correctly. Also, using TypeScript (+ Visual Studio Code hints) made it easy to develop code without errors; in hindsight, development would likely have been 10x slower without it.
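
A small example of what structural typing means in practice. The `Note`, `Task` and `HasText` types below are hypothetical and are not taken from the dinky.dev source; the point is only that a function accepts any value whose shape fits, without the types having to declare a relationship.

```typescript
// Hypothetical shapes, for illustration only.
interface Note {
  id: number;
  text: string;
}

interface Task {
  id: number;
  text: string;
  done: boolean;
}

// Anything with a string `text` property fits this shape.
interface HasText {
  text: string;
}

// Count whitespace-separated words in any object that carries text.
function wordCount(item: HasText): number {
  return item.text.split(/\s+/).filter((word) => word.length > 0).length;
}

const sampleNote: Note = { id: 1, text: "Call the dentist" };
const sampleTask: Task = { id: 2, text: "Book flights to Lisbon", done: false };

// Both calls type-check: Note and Task never mention HasText,
// but their shapes include everything HasText requires.
console.log(wordCount(sampleNote), wordCount(sampleTask));
```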

React has come a long way

I initially started writing code using React class components, but eventually rewrote everything using its functional components and hooks instead. React’s architecture made it incredibly easy to decompose and organize code with minimal boilerplate. I stuck to basic features of React, but they were enough to get the job done.

IndexedDB is…complicated

I may yet switch to IndexedDB in the future, but it’s just so complicated! Local Storage on the other hand is probably as simple as it can get (with getItem, setItem, removeItem and clear as the main primitives), but has severe limitations (lack of indexing, limited space). I’ve continued to trudge down the path of Local Storage for now.
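
To show how little code Local Storage demands, here is a hedged TypeScript sketch of saving and loading a list of items as JSON. The `KeyValueStore` interface, the `MemoryStore` stand-in, the `SavedItem` shape and the function names are assumptions for illustration. In a browser, `window.localStorage` has `getItem` and `setItem` methods with these signatures, so it could be passed wherever a `KeyValueStore` is expected.

```typescript
// The two Local Storage methods this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so the example also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data: { [key: string]: string } = {};

  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key) ? this.data[key] : null;
  }

  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}

// Hypothetical item shape.
interface SavedItem {
  id: number;
  text: string;
}

// The whole list is serialized into a single string value; there is no
// indexing, which is one of the limitations mentioned above.
function saveItems(store: KeyValueStore, key: string, items: SavedItem[]): void {
  store.setItem(key, JSON.stringify(items));
}

// A missing key is treated as an empty list.
function loadItems(store: KeyValueStore, key: string): SavedItem[] {
  const raw = store.getItem(key);
  return raw === null ? [] : (JSON.parse(raw) as SavedItem[]);
}
```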

Closing Thoughts

This project is not “done” by any means, but it gave me an opportunity to build something that I desperately wanted to use myself. I think that’s a great situation to be in, because it forces you to prioritize what you really need, and you know exactly what it is that you want. I can’t say for sure if that’s a formula for success, but it certainly is a formula for personal satisfaction. Tools that work with you when you want to rewrite large swathes of code (as I did several times in this case) are an essential ingredient to this recipe.

That’s all for today, folks! 🖖

EDIT 2022-05-11: I recently changed the term “tags” to “topics”, as I felt like the latter better reflected the intent of tracking high-level objectives and interests.

Versioning

I’ve been thinking about versioning as a concept and arrived at a sufficiently concise explanation that seemed worth sharing. Let’s start with some simple definitions:

Definition 1

Version tags are compressed references to objects within a given semantic category, where the category is often (but not necessarily) scoped to a name or identifier.

As an example, "Big Sur" and "Monterey" are version tags in the category "macOS", whereas 95 and 98 are version tags in the category "Windows". You can compare, say, ("macOS", "Big Sur") to ("Windows", 98), but comparing "Monterey" to 95 without reference to the respective operating systems would be like comparing apples and oranges.

System designers may assert additional semantics around version tags. For instance, they may assert that version tags are always numeric, and that a larger number references a newer object within the semantic category. (As a side note, “newer according to whom?” is a good question to ask in any distributed system.)

Version tags are compressed references in the sense that the uncompressed alternative would be to use the referenced object in its entirety. For instance, if you had the luxury of always being able to perform bit-by-bit comparisons of the objects referenced by two version tags, you would no longer need version tags.

Version tags may be assigned through a process that conforms to either (or both) of the following forms:

  • 1. The version tag is derived from the referenced object.
  • 2. The version tag is derived from contextual state.

In the first scheme (ex: SHA256 checksums as version tags), the system may enforce either one (or both) of the following invariants.

  • 1A. Distinct objects likely result in distinct version tags. (Why only “likely”? Because there are no collision-free lossy hashing schemes.)
  • 1B. Distinct version tags reference distinct objects.

In the second scheme (ex: monotonically increasing version numbers), the system may enforce the following invariant.

  • 2A. There is a total ordering relationship (representing a notion of newness) across the set of version tags associated with objects in the given semantic category.

Interestingly, 1A and 2A may be entirely compatible. For instance, the version tag may be derived from the previous version tag supplied within the contents of the target object (scheme 1), but confirmed by checking that no new version tags have been concurrently created (scheme 2).

On the other hand, 1B and 2A are not compatible unless you’re willing to rewrite history in the process of creating new objects. To see an example of this incompatibility, imagine that you have an object of type Object that you’ve chosen to version using the first scheme. Initially, the object does not exist, and [Step 1] you create a new one with content "foo" (version tag = 1), which you later [Step 2] update to "bar" (version tag = 2). If you later decide to [Step 3] update the content to "foo" once again, the system would need to make a choice between the two invariants, either setting the version tag to 1 (maintaining 1B) or setting it to 3 (maintaining 2A).
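
The three steps can be replayed in a short TypeScript sketch. The `Write` type and all function names are hypothetical; the list of writes is kept in the order the writes happened, and the two `nextTag...` functions represent the two choices available at Step 3.

```typescript
// One entry per write, in the order the writes happened.
interface Write {
  tag: number;
  content: string;
}

function maxTag(writes: Write[]): number {
  let max = 0;
  for (const w of writes) {
    max = Math.max(max, w.tag);
  }
  return max;
}

// Choice that preserves 1B: identical content gets back the tag it already has.
function nextTagKeeping1B(writes: Write[], content: string): number {
  for (const w of writes) {
    if (w.content === content) return w.tag;
  }
  return maxTag(writes) + 1;
}

// Choice that preserves 2A: every write gets a fresh, strictly larger tag.
function nextTagKeeping2A(writes: Write[]): number {
  return maxTag(writes) + 1;
}

// 1B: distinct version tags reference distinct objects.
function holds1B(writes: Write[]): boolean {
  for (let i = 0; i < writes.length; i++) {
    for (let j = i + 1; j < writes.length; j++) {
      if (writes[i].tag !== writes[j].tag && writes[i].content === writes[j].content) {
        return false;
      }
    }
  }
  return true;
}

// 2A: tags strictly increase in write order, so a larger tag always means newer.
function holds2A(writes: Write[]): boolean {
  for (let i = 1; i < writes.length; i++) {
    if (writes[i].tag <= writes[i - 1].tag) return false;
  }
  return true;
}

// Steps 1 and 2, followed by the two possible outcomes of Step 3.
const firstTwo: Write[] = [
  { tag: 1, content: "foo" },
  { tag: 2, content: "bar" },
];
const keep1B = firstTwo.concat([{ tag: nextTagKeeping1B(firstTwo, "foo"), content: "foo" }]);
const keep2A = firstTwo.concat([{ tag: nextTagKeeping2A(firstTwo), content: "foo" }]);

console.log(holds1B(keep1B), holds2A(keep1B)); // true false
console.log(holds1B(keep2A), holds2A(keep2A)); // false true
```

Whichever tag is chosen at Step 3, exactly one of the two invariants breaks as long as all three writes stay in the history.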

On the other hand, if you are willing to rewrite history, you can drop the reference to "bar" altogether, and thus not violate either invariant. The problem with this approach is that these references may have been published to external systems, and it’s not easy to have everyone rewrite history the same way.

However, there is one other way of rewriting history that might be more “principled”. That is to retain exactly one previous version in your history. At Step 3 in the example above, you would drop "foo" with version tag 1, retain "bar" with version tag 2, and create "foo" with version tag 3. This scheme offers compatibility with 1A, 1B and 2A, but requires that you commit to tracking only one previous version!
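
A sketch of this “retain exactly one previous version” scheme in TypeScript follows. The `TaggedVersion` type and the `applyUpdate` function are hypothetical, and the sketch adds one assumption the text does not state: an update whose content equals the current content is treated as a no-op, since otherwise two adjacent tags would reference the same object.

```typescript
interface TaggedVersion {
  tag: number;
  content: string;
}

// The history holds at most two entries: [previous, current].
// Each update issues a strictly larger tag and forgets everything older
// than the version that was current before the update.
function applyUpdate(versions: TaggedVersion[], content: string): TaggedVersion[] {
  const current = versions.length > 0 ? versions[versions.length - 1] : undefined;
  if (current === undefined) {
    return [{ tag: 1, content: content }];
  }
  // Assumption: writing identical content changes nothing.
  if (current.content === content) {
    return versions;
  }
  return [current, { tag: current.tag + 1, content: content }];
}

// Steps 1 to 3 from the example above.
let retained: TaggedVersion[] = [];
for (const content of ["foo", "bar", "foo"]) {
  retained = applyUpdate(retained, content);
}
console.log(JSON.stringify(retained));
// [{"tag":2,"content":"bar"},{"tag":3,"content":"foo"}]
```

Because the retained pair always differs in content and the tags only grow, the history that survives never has two tags pointing at the same object.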

That’s all for today, folks! 🖖

Monty Hall Problem

You’re on a game show, standing in front of three closed doors. The host urges you to pick one door, explaining that if you picked the right one, you would find behind it a coupon for a lifetime supply of coffee that’s yours to keep. (If coffee isn’t your cup of tea, feel free to creatively replace this coupon with a more desirable prize.) Open the wrong door and you get nothing.

You consider your options carefully, wondering which door makes the most sense to pick. But the truth is, you have nothing to go by, so a random guess is all you can cough up: you pick Door A. The host tells you to go ahead and open the door…but as you reach for the handle, he yells, “Hold on there! I’m going to spice this game up a bit.”

Unlike the hapless guests of the show, the host knows perfectly well which door has the prize behind it. So here’s what he does: he artfully walks around to one of the doors that you hadn’t chosen — Door C — opens it up, and shows you that there’s nothing behind it. Good thing you hadn’t chosen that one, eh? Phew!

“Now here’s the coffilicious question,” continues the host, “would you like to go ahead with opening Door A, or would you like to switch to Door B?” It’s a conundrum indeed…what would you do?

Bits of Mathematics

The answer to the Monty Hall problem may be somewhat counterintuitive for some, but it’s not difficult to calculate. Let p be the probability that you picked the right door in the beginning (let’s call this event X). It’s easy to see that, given three doors to pick from, the probability of Door A being the right door is:

p = P(X) = 1/3

Let Y represent the complement of X, the event that one of the other doors has the prize behind it. With q as the probability of this event, it’s again easy to see that:

q = P(Y) = 1 - p = 2/3

Later on, the host reveals that there’s nothing behind Door C. This door isn’t chosen at random — the host knows perfectly well what’s behind each door, and he picks one that is guaranteed to be a dud. If Door C had the prize behind it, the host would have chosen to eliminate Door B instead. If Door A had the prize behind it, the host would be able to freely pick either Door B or Door C to eliminate. Of course, you have no idea how the host is making these choices, so from your perspective once Door C is eliminated, there’s some non-zero chance of the prize being behind either Door A or Door B.

A Red Herring

If you were to now pick one of these two doors at random (let’s call this event W), the probability w of the new pick being the right door would be:

w = P(W) = 1/2

At this point, you might think it makes no difference whether you switch or not, as both doors are equally likely to have the prize behind them, but you would be wrong. Arguably this is where many people get tripped up.

The reason is that we don’t really care about w. Why? Because you’re not randomly picking one of Door A or Door B. Rather, you’re trying to decide whether or not your original choice still makes sense to go with. Or to phrase it differently, we want to determine how the likelihood of your original choice being right stacks up against the new information available at your disposal that Door C doesn’t have anything behind it.

Here’s a quick and simple way to understand this. The likelihood p of your original choice being right still remains unchanged at 1/3, but whereas event Y originally represented either of Door B or Door C having the prize (with an equal likelihood of 1/3 each), the same event now represents just Door B having the prize, with the same total probability of 2/3.

In other words, if you stick to your original choice, you have a 1/3 chance of being right. If you decide to switch, you have a 2/3 chance of being right. And that’s why it makes complete mathematical sense to switch!
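
The same numbers fall out of Bayes’ theorem. The notation below is introduced here and is not part of the argument above: A, B and C stand for “the prize is behind Door A/B/C”, and H_C stands for “the host opens Door C” after you picked Door A. The derivation also assumes that the host chooses between Door B and Door C with equal probability when the prize is behind Door A.

```latex
% Assumption: P(H_C | A) = 1/2, i.e. the host picks B or C at random
% when your original choice (Door A) hides the prize.
\begin{align*}
P(H_C \mid A) &= \tfrac{1}{2}, \qquad P(H_C \mid B) = 1, \qquad P(H_C \mid C) = 0 \\
P(H_C) &= \tfrac{1}{3}\cdot\tfrac{1}{2} + \tfrac{1}{3}\cdot 1 + \tfrac{1}{3}\cdot 0 = \tfrac{1}{2} \\
P(A \mid H_C) &= \frac{P(H_C \mid A)\,P(A)}{P(H_C)}
               = \frac{\tfrac{1}{2}\cdot\tfrac{1}{3}}{\tfrac{1}{2}} = \tfrac{1}{3} \\
P(B \mid H_C) &= \frac{P(H_C \mid B)\,P(B)}{P(H_C)}
               = \frac{1\cdot\tfrac{1}{3}}{\tfrac{1}{2}} = \tfrac{2}{3}
\end{align*}
```

The two posterior probabilities match p = 1/3 for sticking and q = 2/3 for switching.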

A Broader Perspective

If the mathematics makes sense but the solution does not quite “feel right”, here’s a handy tip. Try extending the puzzle to many more doors. For instance, suppose we were dealing with a hundred doors, of which you picked one, just as you did with the three doors earlier. Of the remaining ninety-nine, the host then jumps up and eliminates all but one. Would you have faith that you’d picked the right door out of a hundred? Or would you be more likely to believe that the one door that the host cunningly failed to eliminate ultimately held the prize?

If you’re interested in an empirical answer to this question, try the simulator I’ve made available on GitHub.

--------------------------------------------------
Monty Hall Simulator
--------------------------------------------------
Monty Hall Problem
--------------------------------------------------
Simulation with 3 doors, 1000000 iterations...
Winning likelihood:
  → Original choice = 0.333568
  → If you switched = 0.666432
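
For readers who would rather experiment inline, here is a minimal TypeScript sketch of such a simulation. It is not the simulator from the GitHub repository mentioned above; the `simulateMontyHall` function, its parameters and the `SimulationResult` shape are hypothetical. It supports any number of doors, so the hundred-door variant can be tried as well.

```typescript
interface SimulationResult {
  stayWinRate: number;
  switchWinRate: number;
}

// Assumption: `random` returns a uniform number in [0, 1), like Math.random.
function simulateMontyHall(
  doors: number,
  iterations: number,
  random: () => number = Math.random
): SimulationResult {
  let stayWins = 0;
  let switchWins = 0;
  for (let i = 0; i < iterations; i++) {
    const prize = Math.floor(random() * doors);
    const pick = Math.floor(random() * doors);
    // The host opens every door except your pick and one other door.
    let remaining: number;
    if (pick === prize) {
      // You hold the prize, so the door left closed is a random dud:
      // choose uniformly among the doors other than your pick.
      remaining = Math.floor(random() * (doors - 1));
      if (remaining >= pick) remaining += 1;
    } else {
      // The host cannot open the prize door, so it is the one left closed.
      remaining = prize;
    }
    if (pick === prize) stayWins += 1;
    if (remaining === prize) switchWins += 1;
  }
  return {
    stayWinRate: stayWins / iterations,
    switchWinRate: switchWins / iterations,
  };
}

const threeDoors = simulateMontyHall(3, 100000);
console.log(threeDoors.stayWinRate.toFixed(3), threeDoors.switchWinRate.toFixed(3));
```

With three doors the two rates should land close to 1/3 and 2/3; with `simulateMontyHall(100, 100000)` the switching rate should approach 0.99.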

That’s all for today, folks! 🖖