$

I used to think of $ in regular expressions as matching the end of the string. I was wrong! It actually might do something more subtle than that, depending on what regex engine you're using. In my native Python's re module, $

[m]atches the end of the string or just before the newline at the end of the string, and in MULTILINE mode also matches before a newline.

Note! The end of the string, or just before the newline at the end of the string.

In [1]: import re

In [2]: my_regex = re.compile("foo$")

In [3]: my_regex.match("foo")
Out[3]: <_sre.SRE_Match object; span=(0, 3), match='foo'>

In [4]: my_regex.match("foo\n")
Out[4]: <_sre.SRE_Match object; span=(0, 3), match='foo'>

I guess I can see the motivation—we often want to use the newline character as a terminator of lines (by definition) or files (by sacred tradition), without wanting to think of \n as really part of the content of interest—but the disjunctive behavior of $ can be a source of treacherous bugs in the fingers of misinformed programmers!
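For comparison, here is a minimal sketch of the stricter alternatives in Python's re module: `\Z`, which matches only at the very end of the string, and `re.fullmatch`, which requires the pattern to consume the whole string. The names `dollar` and `strict` are made up for this illustration.

```python
import re

# `$` matches at the very end, or just before a single trailing newline.
dollar = re.compile(r"foo$")
# `\Z` matches only at the very end of the string.
strict = re.compile(r"foo\Z")

print(bool(dollar.match("foo")))    # True
print(bool(dollar.match("foo\n")))  # True: `$` tolerates the trailing newline
print(bool(strict.match("foo")))    # True
print(bool(strict.match("foo\n")))  # False: the newline is still in the way

# fullmatch must consume the entire string, newline included.
print(bool(re.fullmatch(r"foo", "foo\n")))  # False
```

Whether the lenient or the strict behavior is wanted depends on whether a trailing newline counts as content in your input; for validating untrusted strings, the strict forms are probably the safer default.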

The Second R

I want to code all of the things, but I also want to write at least some of the things, but sometimes putting things in words—simple things, things I know—can be hard. Every other day I dream of getting in some writing in the night after I return from the code mines across the bay, but the box where the writing tool lives is the same as the box where you can read everything that anyone else has ever written, and you can guess what I really do then, when it's easier to read than to farm, to eat than to write.

But writing is important, because we can imagine nearby possible worlds in which the distribution of verbal skills is incompetenceward of our own, and the people in those worlds are sadder and poorer than us, the clumsiness of their attempts at communication leaving them less effective at coordinating their activities to dominate nature: colleagues maneuver against each other, ineffectually; television is less interesting; lovers stare into each other's eyes having less idea than you of what they're really looking at.

And in our own world, where people can say more, but not enough—I can read, but I'm missing something ... I can reckon with 'rithmetic, which serves a purpose, but cannot in human terms express the richness of vision that courses through ... something. And it cannot be a part of inner peace and glory until paired with something that does, high though the price may be for that something!

The second R, which is yet not an R. I want this more than I can say.

__pycache__/shibboleth.cpython-34.pyc

Sometimes I worry that people with power in Society will look down on me for my pronunciation of the .pyc extension for Python bytecode files. I always want to say pike-cee, even though many would argue that the c should either be hard (pike) or said as the name of the letter (py-cee), but certainly not both in sequence!

Smalltalk

(8:5x a.m., an office on the someteenth floor of the twenty-somethingth tallest building in San Francisco)

"Good morning!"

"'Morning."

"How was your weekend? Did you do anything exciting? Maybe you went to a movie, or to the beach—"

"No—"

"Or embarked on some heroic endeavor of engineering, the likes of which threaten to upend our understanding of the nature of computation itself?"

"No, nothing like that," (sighing, resignedly) "how was your weekend?"

(pretending to inspect his or her fingernails) "My weekend? Oh, nothing special—hung out, did some grocery shopping—"

"Uh huh."

"—wrote a compiler—"

The Foundations of Erasure Codes

(cross-posted from the SwiftStack Blog)

In enabling mechanism to combine together general symbols, in successions of unlimited variety and extent, a uniting link is established between the operations of matter and the abstract mental processes of the most abstract branch of mathematical science. A new, a vast, and a powerful language is developed for the future use of analysis, in which to wield its truths so that these may become of more speedy and accurate practical application for the purposes of mankind [sic] than the means hitherto in our possession have rendered possible.

Ada Lovelace on Charles Babbage's Analytical Engine, 1842

Dear reader, if you're reading [the SwiftStack Blog], you may have already heard that erasure codes have been added to OpenStack Swift (in beta for the 2.3.0 Kilo release, with continuing improvements thereafter) and that this is a really great thing that will make the world a better place.

All of this is entirely true. But what is perhaps less widely heard is exactly what erasure codes are and exactly why their arrival in Swift is a really great thing that will make the world a better place. That is what I aim to show you in this post—and I do mean show, not merely tell, for while integrating erasure codes into a production-grade storage system is (was!) an immense effort requiring months of work by some of the finest programmers the human race has to offer, the core idea is actually simple enough to fit in a (longish) blog post. Indeed, by the end of this post, we will have written a complete working implementation of a simple variant of Reed–Solomon coding, not entirely unlike what is used in Swift itself. No prior knowledge will be assumed except a working knowledge of high-school algebra and the Python programming language.

T.O.P.

"Don't worry, we've got our T.O.P. engineer working on it," said the support man on the phone with our most important customer, glancing meaningfully across the open-plan office in my direction; I winced briefly, then spasmed back towards my screen and fumbled with the keyboard, intending to return my attention to the definition of the DeviceAssignmentRuleComponentManagerFactory, but somehow fat-fingering C-x C-c along the way, every awkward, ungainly movement bearing testimony to the most casual of onlookers that I was Totally Observably Pathetic.

Epistolary

(Previously.)

[19:26:50] <bob>    alice: you still around?
[19:27:08] <alice>  bob, sort of
[19:27:20] <bob>    alice: ok. never mind.
[19:27:41] <alice>  bob, what were you going to ask? I am at the office, 
                    trying to finish up an email but I'm really slow at 
                    choosing words
[19:28:21] <bob>    alice: i was just wondering if you happened to know a 
                    way to manually foo the bar-quuxing device
[19:28:22] <alice>  perhaps because of my overly-ornate and wordy writing 
                    style, which, for not-well-understood psychological 
                    reasons, I nevertheless continue to use despite its 
                    obvious disadvantages in business communication

XXX III

const PSEUDO_DIGITS: [char; 7] = ['M', 'D', 'C', 'L', 'X', 'V', 'I'];
const PSEUDO_PLACE_VALUES: [usize; 7] = [1000, 500, 100, 50, 10, 5, 1];

#[allow(unused_parens)]
fn integer_to_roman(integer: usize) -> String {
    let mut remaining = integer;
    let mut bildungsroman = String::new();
    // get it?? It sounds like _building Roman_ (numerals), but it's
    // also part of the story about me coming into my own as a
    // programmer by learning a grown-up language
    //
    // XXX http://tvtropes.org/pmwiki/pmwiki.php/Main/DontExplainTheJoke
    for ((index, value), &figure) in PSEUDO_PLACE_VALUES.iter()
        .enumerate().zip(PSEUDO_DIGITS.iter())
    {
        let factor = remaining / value;
        remaining = remaining % value;

        if figure == 'M' || factor < 4 {
            for _ in 0..factor {
                bildungsroman.push(figure);
            }
        }

        // IV, IX, XL, &c.
        let smaller_unit_index = index + 2 - (index % 2);
        if smaller_unit_index < PSEUDO_PLACE_VALUES.len() {
            let smaller_unit_value = PSEUDO_PLACE_VALUES[smaller_unit_index];
            let smaller_unit_figure = PSEUDO_DIGITS[smaller_unit_index];

            if value - remaining <= smaller_unit_value {
                bildungsroman.push(smaller_unit_figure);
                bildungsroman.push(figure);
                remaining -= (value - smaller_unit_value);
            }
        }
    }
    bildungsroman
}

XXX II

// XXX: old_io is probably facing deprecation if names mean anything
#![feature(old_io)]
use std::old_io;
use std::collections::HashMap;

fn main() {
    let things_to_ask_about = ["name", "age", "username"];
    let mut collected_information = HashMap::new();
    for askable in things_to_ask_about.iter() {
        println!("What is your {}?", askable);
        let input = old_io::stdin()
            .read_line()
            .ok().expect("failure message here");
        // XXX EVIDENCE OF MY IMPENDING DEATH in these moments when I
        // want to scream with the righteous fury of a person who has
        // been genuinely wronged, on the topic of what the fuck is wrong
        // with this bullshit language where you can't even trim a string
        // because "`input` does not live long enough" this and "borrowed
        // value is only valid for the block suffix following statement 1
        // at 21:48" that
        //
        // But what the fuck is wrong with this bullshit language is in
        // the map, not the territory
        //
        // on the balance of available evidence, doesn't it seem more
        // likely that the borrow checker is smarter than you, or that
        // the persons who wrote the borrow checker are smarter than you?
        //
        // and if you can't even follow their work even after several
        // scattered hours of dutifully trying to RTFM, will an
        // increasingly competitive global Economy remain interested in
        // keeping you alive and happy in the decades to come?
        //
        // I am not a person who has been genuinely wronged, just a man
        // not smart enough to know any better
        collected_information.insert(askable, input.trim());
    }

    for (askable, response) in collected_information.iter() {
        println!("You claimed that your {} is {}.", askable, response);
    }
}

"Pi Day" Is an Unholy Festival of Sin That Is Corrupting Our Children

Dear reader, it's the fourteenth day of the third month of the year, and if you're reading this blog, some charlatans or overenthusiastic youth (the subject of whose enthusiasm is not what they think it is) have probably tried to convince you to celebrate it as "Pi Day." You see (these quacks implored you) π is around 3.14, and March fourteenth is 3/14. And furthermore (they may have put to you) furthermore this year's Pi Day is special, because it's 3/14/15, which is like 3.1415! Why (an especially impudent few might have continued to venture), we should plan some grand spectacle at 9:26 a.m. on the day, which is like 3.1415926! With (and this is the part that is most inevitable and offensive) pie! Get it, because it sounds like pi and is shaped like a circle?

Dear reader, it is lies or it is worse than lies; it is blasphemy, treason, superstitious superficiality, degenerate folderol, and frivolous depravity! Do not mistake me; of course I can see as clearly as any other ape can that the numeric subsequence of the string "3.14" is the same as that of the string "3/14". The former string represents an occasionally useful approximation of the circle constant which is ubiquitous in mathematics (give or take a factor of two); the latter is how people in my country abbreviate today's date. Perhaps to those who don't have anything really interesting to think about, this trivial coincidence might be worth a passing mention; apes love anything for an innocent distraction, and why begrudge that?

Permalink or It Didn't Happen

As far as I can tell, I don't have any kind of synesthesia. You can't be too sure (which means, you can easily be entirely too sure), what with our na(t)ive theories of psychology being so inadequate that everything we believe about other minds is but a filament of noise and conjecture, but your probability distribution about the mapping of sensory inputs to perceptions for me is probably not so different from mine of the same for you (dear reader of whom I know nothing)—roses seem red, violets would seem blue if we spoke a language that didn't already have a word for violet—which means that when I tell you that there's a musty, stale odor around a blog that hasn't been updated in a month and change, it's only a trite metaphor and not a perceptual reality of any sort. Still, even if you can't smell it (if your senses are like mine; if your fox, like mine, still hasn't bothered to implement the HTML5 <aroma> element), it's an ominous thing, to see a blog hovering near the boundary between life and death, a corpus perhaps on the way to being a corpse. The internet is littered with the latter, monuments to people who reliably had something to say, month after month ... until they missed a month, and then it wasn't long before they missed another.

[Image: my_block_of_squares]

Now I can assure you that that will never happen to this place while I'm still breathing—this blog lives exactly as long as I do—only that's not a precise way of speaking; what I can do is offer you my assurance, which is a different thing from you actually feeling assured, which is a different thing still from that which was assured against actually never coming to pass. But I think these differences—between feeling and reality, between saying and reality—I think these enormous differences are much greater than the tiny, barely-perceptible gap between seeing so many gloriously intricate things to say, and making the time and words to express them on your blog when you are so busy with your trade in the manufacture of useful machinery (and the green tiles which are its highly-coveted industrial byproduct). But if all I can observe is that the gap is barely perceptible, then by the enormity of the earlier differences, I am not licensed to infer that the gap is tiny, not when the only reason I am telling you this is that I would die of shame if my monthly archives sidebar skipped a month for the first time since May of 'aught-twelve, not during this second year of my life in which I am supposed to write a compiler and a bad novelette even though it is for all intents and tens of intensive purposes practically March.

2014 Year in Reverse

(Previously.)

If I had any readers who still believe in the A-theory of time, I might say: 2014 is dead! Gone! Over! But since I probably don't have any readers like that (since I probably don't have any readers, full stop?), it's better to face the truth: 2014 is an immutable part of our universe; just because we don't—get to?—have to?—experience it "now", doesn't mean it has "stopped" existing, any more than 2016 doesn't exist "yet" just because we don't remember it.

Anyway. In that two-thousand-and-fourteenth year of our Common Era, the first year of my life (that I feel comfortable admitting to), and (unfortunately) not actually the Year of the Em Dash, this blog saw 45 posts and 40 comments. Among these—

The weariness of being monolingual was confessed to. We saw how to convert Markdown to HTML within Emacs (a technique which is proving itself to be of some convenience to your author in preparing blog posts for publication). We considered one weird trick for what to write when you can't infer the correct spelling of someone's name from what you heard. It turned out that the word apology can mean different things, and that characters in popular 1990s science-fiction television programs aren't always completely honest in interpreting the moral law. We were prompted to prove why we will never write anything. We had a wild Halloween party, noted a baffling error message from Git (hint: commit hooks and virtualenv), and drowned our sorrows in tower defense. The American coffee hegemon started serving pumpkin spice again. There were feelings of inadequacy, at least one contrived distraction from writing that ineffectually pretended to not be a distraction, and the occasional obscure pun. We examined where I stand and were enlightened by some standard advice. There were more feelings of inadequacy. Even conditional on the hypothesis that all's well that ends well, I think it's important to consider the condition of people for whom all is not looking to end well. We heard a poem for OpenStack object storage, and a lament against git push --force. I argued that Twilight Sparkle is a disaster waiting to happen and confessed that perhaps too many of my life decisions are determined by what things GitHub happens to provide graphs for. I ate too much ice-cream once and explained how consistent hashing works.

And as for that other nearby immutable span of reality, the one called 2015? Well, that would be telling (and I can't know that from here).

The Year of the Em Dash, Not

"2014 is the Unicodepoint for the em dash! Isn't that the greatest thing ever? How did I not know this before December of this glorious year?"

"That's two zero one four in hex, dummy. It's not the same number."

"But, but—that means the year of the em dash isn't until—four, plus sixteen, plus two-to-the-thirteenth ... the year eighty-two twelve! I'll probably be dead by then!"

"Well, you can still celebrate the year of the N'ko letter Ka."

"That is small consolation, my friend!"