Minimax Search and the Structure of Cognition!

(This is a blog post adaptation of a talk I gave at !!Con West 2019!)

It all started at my old dayjob, where some of my coworkers had an office chess game going. I wanted to participate and be part of the team, but I didn't want to invest the effort in actually learning how to play chess well. So, I did what any programmer would do and wrote a chess engine to do it for me.

(Actually, I felt like writing a chess engine was too much of a cliché, so I decided that my program was an AI for a game that happens to be exactly like chess, except that everything has different names.)

My program wasn't actually terribly good, but I learned a lot about how to think, for the same reason that building a submarine in your garage in a great way to learn how to swim.

Consider a two-player board game like chess—or tic-tac-toe, Reversi, or indeed, any two-player, zero-sum, perfect information game. Suppose we know how to calculate how "good" a particular board position is for a player—in chess, this is traditionally done by assigning a point value to each type of piece and totaling up the point values of remaining pieces for each player. Continue reading

Group Theory for Wellness I

(Part of Math and Wellness Month.)

Groups! A group is a set with an associative binary operation such that there exists an identity element and inverse elements! And my favorite thing about groups is that all the time that you spend thinking about groups, is time that you're not thinking about pain, betrayal, politics, or moral uncertainty!

Groups have subgroups, which you can totally guess just from the name are subsets of the group that themselves satisfy the group axioms!

The order of a finite group is its number of elements, but this is not to be confused with the order of an element of a group, which is the smallest integer such that the element raised to that power equals the identity! Both senses of "order" are indicated with vertical bars like an absolute value (|G|, |a|).

Lagrange proved that the order of a subgroup divides the order of the group of which it is a subgroup! History remains ignorant of how often Lagrange cried.

To show that a nonempty subset H of a group is in fact a subgroup, it suffices to show that if x, yH, then xy⁻¹ ∈ H.

Exercise #6 in §2.1 of Dummit and Foote Abstract Algebra (3rd ed'n) asks us to prove that if G is a commutative ("abelian") group, then the torsion subgroup {gG | |g| < ∞} is in fact a subgroup. I argue as follows: we need to show that if x and y have finite order, then so does xy⁻¹, that is, that (xy⁻¹)^n equals the identity. But (xy⁻¹)^n equals (xy⁻¹)(xy⁻¹)...(xy⁻¹), "n times"—that is, pretend n ≥ 3, and pretend that instead of "..." I wrote zero or more extra copies of "(xy⁻¹)" so that the expression has n factors. (I usually dislike it when authors use ellipsis notation, which feels so icky and informal compared to a nice Π or Σ, but let me have this one.) Because group operations are associative, we can drop the parens to get xy⁻¹ xy⁻¹ ... xy⁻¹. And because we said the group was commutative, we can reörder the factors to get xxx...y⁻¹y⁻¹y⁻¹, and then we can consolidate into powers to get x^n y^(−n)—but that's the identity if n is the least common multiple of |x| and |y|, which means that xy⁻¹ has finite order, which is what I've been trying to tell you this entire time.

Forgive or Forget ("Or", Not "And"): A Trade-Off in Wellness Engineering

Forgiveness is an important input into Wellness, but contrary to popular belief, Forgiveness is incompatible with Forgetting. You can't just Forgive in general, you have to Forgive some specific sin in particular—but a vague description of a particular sin still corresponds to a vast space of possible sins matching that vague description.

A toy example for illustration: if you try to Forgive a three-digit integer with a 2 in the tens place, the moral force of your Forgiveness needs to spread out to cover all 9·10 = 90 possibilities (120, 121, ... 928, 929), which dilutes the amount of Forgiveness received by each integer—except the actual situation is far more extreme, because real-world sins are vastly more complicated than integers.

To truly Forgive a sin, You need to know exactly what the sin was and exactly why it happened. In order to withhold punishment, you need to compute what the optimal punishment would have been, had you been less merciful.

Thus, bounded agents can only approximate true Forgiveness, and even a poor approximation (far below the theoretical limits imposed by quantum uncertainty, which are themselves far below Absolute Forgiveness under the moral law) can be extremely computationally expensive. What we cannot afford to Forgive—where it would be impractical to mourn for weeks and months, analyzing the darkness in pain—we instead Forget.

This is how I will stop being trash, after five months of being trash. The program that sings, I was wrong; I was wrong—even if my cause was just, I was wrong, does not terminate. Even as the moral law requires that it finishes its work, the economic law does not permit it: it must be killed, its resources reallocated to something else that helps pay the rent: something like math, or whatever Wellness can exist in the presence of sin.

The Typical Set

(Part of Math and Wellness Month.)

Say you have a biased coin that comes up Heads 80% of the time. (I like to imagine that the Heads side has a portrait of Bernoulli.) Flip it 100 times. The naïve way to report the outcome—just report the sequences of Headses and Tailses—costs 100 bits. But maybe you don't have 100 bits. What to do?

One thing to notice is that because it was a biased coin, some bit sequences are vastly more probable than others: "all Tails" has probability 0.2100 ≈ 1.268 · 10−70, whereas "all Heads" has probability 0.8100 ≈ 2.037 · 10−10, differing by a factor of sixty orders of magnitude!!

Even though "all Heads" is the uniquely most probable sequence, you'd still be pretty surprised to see it—there's only one such possible outcome, and it only happens a 2.037 · 10−10th of the time. You probably expect to get a sequence with about twenty Tails in it, and there are lots of those, even though each individual one is less probable than "all Heads."

Call the number of times we flip our Bernoulli coin N, and call the entropy of the coinflip H. (For the 80/20 biased coin, H is ⅕ lg 5 + 4/5 lg 5/4 ≈ 0.7219.)

It turns out for sufficiently large N (I know, one of those theorems, right?), almost all of the probability mass is going to live in a subset of 2NH outcomes, each of which have a probability close to 2−NH (and you'll notice that 2NH · 2−NH = 1).

May Is Math and Wellness Month

(Previously, previously.)

Do you ever spend five months in constant emotional pain waging a desperate and ultimately unsuccessful behind-the-scenes email campaign with the aim of securing a public clarification of a trivial philosophy-of-language issue because you're terrified that your robot cult's inability to correct politically-motivated philosophy errors implies that you've lost the Mandate of Heaven and are therefore unfit to prevent the coming robot apocalypse?

Yeah, me neither.

Did you know that May is Math and Wellness Month (source: me)?? Math and Wellness month is traditionally celebrated by performing super-well at one's dayjob, going to the gym a lot, and studying math in the evenings!