Coffee Is for Coders

No one cares if you're in pain;
They only want results.
Everywhere this law's the same,
In startups, schools, and cults.
A child can pull the heartstrings
Of assorted moms and voters,
But your dumb cries are all in vain,
And coffee is for coders.

No one cares how hard you tried
(Though I bet it wasn't much),
But work that can on be relied,
If not relied as such.
A kitten is forgiven
As are a broken gear or rotors,
But your dumb crimes are full of shame,
And coffee is for coders.

The Parable of the Scorpion and the Fox

In the days of auld lang syne on Earth-that-was, a scorpion was creepy-crawling along a riverbank, wondering how to get to the other side. It came across an animal that could swim: some versions of the tale say it was a fox, others report a quokka. I'm going to assume it was a fox.

So the scorpion asks the fox to take it on her back and swim across the river. What does the fox say? She says, "No." The scorpion says, "If this is because you're afraid I'll sting you with my near-instantly-fatal toxins, don't worry—if I did that, then we'd likely both drown. By backwards induction, you're safe." What does the fox say? After pondering for a few moments, she says, "Okay."

So the scorpion gets on the fox's back, and the fox begins to swim across the river. When the pair is halfway across the river, the scorpion stings the fox.

The fox howls in pain while continuing to paddle. "Why?!" she cries. "Why did you do that?! As you said before, now we're likely to both drown."

The scorpion says, "I can't help it. It's my nature."

As the fox continues to paddle, the scorpion continues. "Interestingly, there's a very famous parable about this exact scenario. There was even an episode of Star Trek: Voyager titled after it. As a fox who knows many things, you must have heard it before. Why did you believe me?"

"I can't help it," gasped the fox, who might after all have been a quokka, as the poison filled her veins and her vision began to blur and her paddling began to slow. "It's my nature."

Blogging on Less Wrong 2020 (Upper Half)

Relationship Outcomes Are Not Particularly Sensitive to Small Variations in Verbal Ability

After a friendship-ending fight, you feel an impulse to push through the pain to do an exhaustive postmortem of everything you did wrong in that last, fatal argument—you could have phrased that more eloquently, could have anticipated that objection, could have not left so much "surface area" open to that class of rhetorical counterattack, could have been more empathetic on that one point, could have chosen a more-fitting epigraph, could have taken more time to compose your reply and squeeze in another pass's worth of optimizations—as if searching for some combination of variables that would have changed the outcome, some nearby possible world where the two of you are still together.

No solution exists. (Or is findable in polynomial time.) The causal forces that brought you to this juncture are multitudinous and complex. A small change in the initial conditions only corresponds to a small change in the outcome; you can't lift a two-ton weight with ten pounds of force.

Not all friendship problems are like this. Happy endings do exist—to someone else's story in someone else's not-particularly-nearby possible world. Not for you, not here, not now.

Feature Reduction

(looking at baby/toddler photos a year apart) "How does he look so different and yet so the same at the same time?"

"Just in case that was non-rhetorical, the answer is that your brain evolved to be good at factorizing overall appearance into orthogonal 'personal appearance' and 'age appearance' dimensions that can be tracked separately, just as [x, y] = [1, 2] and [4, 2] are so different with respect to x, and yet so the same with respect to y, at the same time."

Lock Contention

"We really need another bookcase."

"I'm not thinking about that right now. But like, if you got another bookcase, I wouldn't object."

"Where would we put it?"

"I'm also not thinking about that right now, but I've already started speaking a sentence in response to your question, so I might as well finish it. Oh. I guess I just did."

Inconsiderate

"The sink is full and it's your turn to do the dishes! Ugh, why are you so inconsiderate of others?!"

"Not true! Note that the dishes pile up just as badly when you're away."

"So?"

"So, it's not that I'm inconsiderate of others; I'm inconsiderate towards people in the future, independently of whether they happen to be me."

Minimax Search and the Structure of Cognition!

(This is a blog post adaptation of a talk I gave at !!Con West 2019!)

It all started at my old dayjob, where some of my coworkers had an office chess game going. I wanted to participate and be part of the team, but I didn't want to invest the effort in actually learning how to play chess well. So, I did what any programmer would do and wrote a chess engine to do it for me.

(Actually, I felt like writing a chess engine was too much of a cliché, so I decided that my program was an AI for a game that happens to be exactly like chess, except that everything has different names.)

My program wasn't actually terribly good, but I learned a lot about how to think, for the same reason that building a submarine in your garage is a great way to learn how to swim.

Consider a two-player board game like chess—or tic-tac-toe, Reversi, or indeed, any two-player, zero-sum, perfect information game. Suppose we know how to calculate how "good" a particular board position is for a player—in chess, this is traditionally done by assigning a point value to each type of piece and totaling up the point values of remaining pieces for each player.
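As a rough sketch of that kind of evaluation, here is a minimal material counter in Python. The specific point values (pawn 1, knight and bishop 3, rook 5, queen 9) are the commonly quoted convention rather than anything stated in this post, and the `(owner, kind)` board representation, the function names, and the choice to score a position as "my total minus my opponent's total" are all assumptions made for illustration.

```python
# Conventional point values; an assumption, since the post does not list them.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}


def material_total(pieces, player):
    """Total up the point values of the pieces that `player` still has.

    `pieces` is an iterable of (owner, kind) pairs, e.g. ("white", "Q").
    """
    return sum(PIECE_VALUES[kind] for owner, kind in pieces if owner == player)


def material_score(pieces, player, opponent):
    """How "good" the position is for `player`: own material minus opponent's."""
    return material_total(pieces, player) - material_total(pieces, opponent)


# A tiny hypothetical position: white has a queen and a pawn, black has a rook.
position = [("white", "Q"), ("white", "P"), ("black", "R")]
print(material_score(position, "white", "black"))
```

Because the game is zero-sum, the same position scored from the other player's side is just the negation.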

Group Theory for Wellness I

(Part of Math and Wellness Month.)

Groups! A group is a set with an associative binary operation such that there exists an identity element and inverse elements! And my favorite thing about groups is that all the time that you spend thinking about groups, is time that you're not thinking about pain, betrayal, politics, or moral uncertainty!

Groups have subgroups, which you can totally guess just from the name are subsets of the group that themselves satisfy the group axioms!

The order of a finite group is its number of elements, but this is not to be confused with the order of an element of a group, which is the smallest positive integer such that the element raised to that power equals the identity! Both senses of "order" are indicated with vertical bars like an absolute value (|G|, |a|).

Lagrange proved that the order of a subgroup divides the order of the group of which it is a subgroup! History remains ignorant of how often Lagrange cried.

To show that a nonempty subset H of a group is in fact a subgroup, it suffices to show that if x, y ∈ H, then xy⁻¹ ∈ H.

Exercise #6 in §2.1 of Dummit and Foote Abstract Algebra (3rd ed'n) asks us to prove that if G is a commutative ("abelian") group, then the torsion subgroup {g ∈ G | |g| < ∞} is in fact a subgroup. I argue as follows: we need to show that if x and y have finite order, then so does xy⁻¹, that is, that (xy⁻¹)^n equals the identity. But (xy⁻¹)^n equals (xy⁻¹)(xy⁻¹)...(xy⁻¹), "n times"—that is, pretend n ≥ 3, and pretend that instead of "..." I wrote zero or more extra copies of "(xy⁻¹)" so that the expression has n factors. (I usually dislike it when authors use ellipsis notation, which feels so icky and informal compared to a nice Π or Σ, but let me have this one.) Because group operations are associative, we can drop the parens to get xy⁻¹ xy⁻¹ ... xy⁻¹. And because we said the group was commutative, we can reörder the factors to get xxx...y⁻¹y⁻¹y⁻¹, and then we can consolidate into powers to get x^n y^(−n)—but that's the identity if n is the least common multiple of |x| and |y|, which means that xy⁻¹ has finite order, which is what I've been trying to tell you this entire time.
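If you want a numerical sanity check of the key step, here is a small Python sketch. It uses the multiplicative group of units modulo m as a stand-in abelian group; that choice, the modulus 35, and all the helper names are assumptions for illustration, not part of the exercise. Since this group is finite, every element has finite order automatically, so the check is not a proof of anything; it only confirms that n = lcm(|x|, |y|) really does send xy⁻¹ to the identity.

```python
from math import gcd


def units(m):
    """Elements of the multiplicative group of integers mod m."""
    return [a for a in range(1, m) if gcd(a, m) == 1]


def order(a, m):
    """Smallest positive n with a**n == 1 (mod m); a must be a unit mod m."""
    n, power = 1, a % m
    while power != 1:
        power = (power * a) % m
        n += 1
    return n


def inverse(a, m):
    """Multiplicative inverse of the unit a mod m, found by brute force."""
    return next(b for b in units(m) if (a * b) % m == 1)


def lcm(a, b):
    return a * b // gcd(a, b)


def check_lcm_exponent(m):
    """Check (x y^-1)^n == 1 for n = lcm(|x|, |y|), over all units x, y mod m."""
    for x in units(m):
        for y in units(m):
            n = lcm(order(x, m), order(y, m))
            z = (x * inverse(y, m)) % m
            if pow(z, n, m) != 1:
                return False
            # So the order of x y^-1 must divide n.
            if n % order(z, m) != 0:
                return False
    return True


print(check_lcm_exponent(35))
```

The same helpers also let you spot-check Lagrange on cyclic subgroups: `order(a, m)` should always divide `len(units(m))`.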

Forgive or Forget ("Or", Not "And"): A Trade-Off in Wellness Engineering

Forgiveness is an important input into Wellness, but contrary to popular belief, Forgiveness is incompatible with Forgetting. You can't just Forgive in general, you have to Forgive some specific sin in particular—but a vague description of a particular sin still corresponds to a vast space of possible sins matching that vague description.

A toy example for illustration: if you try to Forgive a three-digit integer with a 2 in the tens place, the moral force of your Forgiveness needs to spread out to cover all 9·10 = 90 possibilities (120, 121, ... 928, 929), which dilutes the amount of Forgiveness received by each integer—except the actual situation is far more extreme, because real-world sins are vastly more complicated than integers.
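For the skeptical, the count in the toy example is easy to confirm by brute force. This is a minimal sketch; the variable name is just an illustrative choice.

```python
# Every three-digit integer whose tens digit is 2.
candidates = [n for n in range(100, 1000) if (n // 10) % 10 == 2]

# Nine choices of hundreds digit times ten choices of units digit.
print(len(candidates), candidates[0], candidates[-1])
```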

To truly Forgive a sin, you need to know exactly what the sin was and exactly why it happened. In order to withhold punishment, you need to compute what the optimal punishment would have been, had you been less merciful.

Thus, bounded agents can only approximate true Forgiveness, and even a poor approximation (far below the theoretical limits imposed by quantum uncertainty, which are themselves far below Absolute Forgiveness under the moral law) can be extremely computationally expensive. What we cannot afford to Forgive—where it would be impractical to mourn for weeks and months, analyzing the darkness in pain—we instead Forget.

This is how I will stop being trash, after five months of being trash. The program that sings, I was wrong; I was wrong—even if my cause was just, I was wrong, does not terminate. Even as the moral law requires that it finishes its work, the economic law does not permit it: it must be killed, its resources reallocated to something else that helps pay the rent: something like math, or whatever Wellness can exist in the presence of sin.

The Typical Set

(Part of Math and Wellness Month.)

Say you have a biased coin that comes up Heads 80% of the time. (I like to imagine that the Heads side has a portrait of Bernoulli.) Flip it 100 times. The naïve way to report the outcome—just report the sequences of Headses and Tailses—costs 100 bits. But maybe you don't have 100 bits. What to do?

One thing to notice is that because it was a biased coin, some bit sequences are vastly more probable than others: "all Tails" has probability 0.2^100 ≈ 1.268 · 10^(−70), whereas "all Heads" has probability 0.8^100 ≈ 2.037 · 10^(−10), differing by a factor of sixty orders of magnitude!!

Even though "all Heads" is the uniquely most probable sequence, you'd still be pretty surprised to see it—there's only one such possible outcome, and it only happens a 2.037 · 10^(−10)th of the time. You probably expect to get a sequence with about twenty Tails in it, and there are lots of those, even though each individual one is less probable than "all Heads."
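Here is a small Python sketch of that comparison, assuming independent flips of the 80/20 coin; the names are illustrative. It computes the probability of the two extreme sequences, the probability of one particular sequence with exactly twenty Tails, and the total probability of the whole class of twenty-Tails sequences.

```python
from math import comb

N = 100
P_HEADS = 0.8
P_TAILS = 1 - P_HEADS


def sequence_probability(num_tails):
    """Probability of one specific length-N sequence containing num_tails Tails."""
    return P_HEADS ** (N - num_tails) * P_TAILS ** num_tails


p_all_tails = sequence_probability(N)
p_all_heads = sequence_probability(0)

# One particular sequence with exactly twenty Tails is less likely than "all Heads"...
p_one_with_twenty_tails = sequence_probability(20)

# ...but there are comb(100, 20) such sequences, so the class as a whole is far more likely.
count_twenty_tails = comb(N, 20)
p_twenty_tails_total = count_twenty_tails * p_one_with_twenty_tails

print(p_all_tails, p_all_heads)
print(p_one_with_twenty_tails, count_twenty_tails, p_twenty_tails_total)
```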

Call the number of times we flip our Bernoulli coin N, and call the entropy of the coinflip H. (For the 80/20 biased coin, H is ⅕ lg 5 + ⅘ lg (5/4) ≈ 0.7219.)
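As a quick check of that figure, here is a minimal Python sketch of the entropy of a single biased coinflip, measured in bits. The function name is an illustrative choice.

```python
from math import log2


def binary_entropy(p_heads):
    """Entropy in bits of one flip of a coin that lands Heads with probability p_heads."""
    if p_heads in (0.0, 1.0):
        return 0.0  # A certain outcome carries no information.
    p_tails = 1 - p_heads
    return -(p_heads * log2(p_heads) + p_tails * log2(p_tails))


H = binary_entropy(0.8)
print(round(H, 4))
```

A fair coin gives `binary_entropy(0.5) == 1.0`, which is the "100 flips cost 100 bits" baseline.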

It turns out for sufficiently large N (I know, one of those theorems, right?), almost all of the probability mass is going to live in a subset of 2^(NH) outcomes, each of which has a probability close to 2^(−NH) (and you'll notice that 2^(NH) · 2^(−NH) = 1).
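To get a feel for this numerically, here is a rough Python sketch. It assumes one common way of making "close to 2^(−NH)" precise: call a sequence typical if its per-flip surprisal, −(1/N) lg P(sequence), is within some tolerance eps of H. The tolerance value, the function name, and the return format are all assumptions for illustration. Since a sequence's probability depends only on how many Tails it has, the sketch loops over Tails counts instead of enumerating sequences.

```python
from math import comb, exp, lgamma, log, log2


def typical_set_summary(n, p_heads=0.8, eps=0.1):
    """Return (entropy, number of typical sequences, their total probability)."""
    p_tails = 1 - p_heads
    entropy = -(p_heads * log2(p_heads) + p_tails * log2(p_tails))
    count = 0
    mass = 0.0
    for k in range(n + 1):  # k = number of Tails in the sequence
        surprisal = -((n - k) * log2(p_heads) + k * log2(p_tails)) / n
        if abs(surprisal - entropy) <= eps:
            count += comb(n, k)  # exact integer count of sequences with k Tails
            # Probability of "exactly k Tails", computed in log space to avoid underflow.
            log_prob = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                        + (n - k) * log(p_heads) + k * log(p_tails))
            mass += exp(log_prob)
    return entropy, count, mass


for n in (100, 1000):
    entropy, count, mass = typical_set_summary(n)
    # Compare lg(count) with N*H and with N (the cost of the naive encoding).
    print(n, round(n * entropy, 1), round(log2(count), 1), round(mass, 4))
```

With a fixed eps, the typical set's share of the probability mass should climb toward 1 as N grows, while lg of its size stays below N(H + eps), well short of the N bits needed to name an arbitrary sequence.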