August 31, 2006

i hate this blog

This blog sucks. I don't mean the content sucks, I mean blogging this blog sucks. Well the content may suck too, I leave that up to the (almost nonexistent) reader.

I'm basically sick of Movable Type. Can't take typing in a weird font into a weird form with none of my weird (in a good way this time) vim key bindings available anymore. Can't take typing a long entry in, only to accidentally click the mouse and navigate away, losing all my work.

Also I really don't like the format and the fact that it makes me feel so formal. I don't want to write to an audience when I blog technical stuff. I just want to jot it down. If someone comes along and can decipher the scribble later on and make use of it, great. If not, big deal, it still means something to me. Right now I feel like everything I write here needs "presentation" and needs to mean something to someone else. I want to get away from that idea. It means I haven't blogged a lot of small useful discoveries that I would have otherwise, simply because I didn't want to spend the time getting an entry into shape.

So my idea for a replacement is something pretty simple, almost a blend between a wiki, a blog, and a version control system. I just want a directory of files, not forcibly organized in any particular fashion. I shouldn't ever have to worry about presentation, just pop into a new file in the editor and get out a thought and leave. Ability to cross-reference these tidbits is a must. Easy inlining of code is a must. The ability to source in other files is a must. I should be able to type make or cons and render presentation formats (html for one obviously) for anything out-of-date, then rsync the rendered files off to somewhere. External links should just be heuristically recognized and linked in the output. Every last bit of functionality should be command-line driven, with other interfaces added as needed (e.g. for commenting).
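The "render anything out-of-date" step boils down to the same timestamp comparison that make performs. Here is a minimal sketch in Python, assuming a flat source directory and one rendered .html file per source file. The names is_stale and stale_sources and the directory layout are made up for illustration and are not part of any existing tool:

```python
import os


def is_stale(source, target):
    """True when the rendered target is missing or older than its source."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)


def stale_sources(src_dir, out_dir, out_ext=".html"):
    """List the source files in src_dir whose rendered copy needs rebuilding."""
    stale = []
    for name in sorted(os.listdir(src_dir)):
        source = os.path.join(src_dir, name)
        if not os.path.isfile(source):
            continue  # assumption: a flat directory, so subdirectories are skipped
        target = os.path.join(out_dir, os.path.splitext(name)[0] + out_ext)
        if is_stale(source, target):
            stale.append(source)
    return stale
```

Whatever renderer ends up being used would then only be run on the files this returns, and the output directory can be rsynced afterwards.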

Then instead of relying on some funky method of tracking temporal change (à la the state of the art in wikis today) just track change via diffs of the files themselves, which should be easy to drill into from the web interface.
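Tracking change as diffs needs very little machinery. As a rough sketch using Python's standard difflib module (the function name entry_diff and the file name notes.txt are hypothetical), two versions of an entry can be compared like this:

```python
import difflib


def entry_diff(old_text, new_text, name="notes.txt"):
    """Return a unified diff between two versions of one entry, as a string."""
    lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=name + " (old)",
        tofile=name + " (new)",
    )
    # An empty string means the two versions are identical.
    return "".join(lines)
```

In practice a version control system would store the history and produce these diffs itself. The sketch only shows that the output a web interface would need to display is plain text.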

In other words, building this "blog" or whatever I'm going to call it should be like building a software project, except that in this case the source is not C or Perl but English.

Posted by Alan at 10:28 PM | Comments (4)

xml zombies

Somehow, just a few years ago, I felt that XML was magic. If only I put all my data between opening and closing tags in a tree-structure, it stands to reason that it will all become immediately useful to me. And that the use will be immediate. No more troublesome parsing of oddball syntaxes, no more nasty query language required to wring every drop of meaning out of the data. The data will extract meaning from itself...the stones will rise up and speak.

After some point--I don't know when exactly the powerful brainwashing wore off, but I woke up in a field somewhere with grass in my hair--it became the case that anyone who says the word XML around me gets hit with a stick. Something happened. Something traumatic happened that day and my mind is blocking out the memory, even now.

This was like waking up to find yourself in a zombie movie where the zombies chase you not to eat your warm flesh, but to make you store all your data in XML. But since they were zombies they were of course not satisfied with this. No, if there is one lesson in life it is that the undead are never satisfied. Once they've got you storing all your data in XML, they require that you only manipulate your data via XML. Was that a blood-curdling scream I just heard?

This is your cue to get the stick and start beating them off.

See also Quotes about XML.

Posted by Alan at 10:14 PM | Comments (0)

fanaticism as applied to aesthetics: or, this could get ugly

Here's a Python programmer whining for pages and pages about tiny flecks of syntax and the aesthetic properties thereof. The inequality operator is now officially "!=" instead of "<>". Big deal you say.

Dude. The guy wrote a freaking song about it.

Posted by Alan at 02:01 PM | Comments (0)

August 21, 2006

perl anagram listers

Here's my new favorite Perl interview question: write a program that finds all anagrams of an English word given a dictionary file. The word will be the lone command line argument. The dictionary file is on stdin.
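For reference, here is a rough sketch of the straightforward approach, written in Python rather than Perl, and not one of the golfed listers described further down. It assumes that two words are anagrams when their sorted letters match, that the word itself should not be listed, and that the dictionary has one word per line. The names signature and find_anagrams are made up for illustration:

```python
import sys


def signature(word):
    """Letters of the word in sorted order; anagrams share a signature."""
    return "".join(sorted(word))


def find_anagrams(target, lines):
    """Yield each dictionary word that is an anagram of target."""
    wanted = signature(target)
    for line in lines:
        word = line.strip()
        # Assumption: the target word itself is not reported.
        if word != target and signature(word) == wanted:
            yield word


# Word as the lone command line argument, dictionary on stdin.
if __name__ == "__main__" and len(sys.argv) == 2:
    for match in find_anagrams(sys.argv[1], sys.stdin):
        print(match)
```

Saved under a hypothetical name such as anagrams.py, it would be run as `python anagrams.py listen < words.txt`, where words.txt stands in for whatever dictionary file is at hand.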

I like this question because there are many dimensions to it. Obviously you're testing the candidate's general problem solving skills. Regardless of their ability in Perl they should at least be able to produce an algorithm that works. But you can also expand in other directions, or sit back and observe which dimensions they worry about first.

For instance, you can stretch things in the direction of completeness / robustness: what exactly is an anagram? does case matter? what about spaces? what characters are legal or does anything go? what about DOS / Unix line endings? are the strings ASCII, Unicode, UTF-8? how do you handle end-of-file? etc.
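One possible set of answers to those questions, sketched in Python. Every policy here is an assumption made for illustration: case is ignored, anything that is not a letter (spaces, punctuation, line endings) is dropped, and composed and decomposed Unicode forms of a letter count as the same letter. The name canonical_key is hypothetical:

```python
import unicodedata


def canonical_key(raw):
    """Reduce a word or phrase to a key that all of its anagrams share."""
    # NFC makes composed and decomposed forms of a letter compare equal;
    # casefold() is a more thorough lower() for non-ASCII text.
    text = unicodedata.normalize("NFC", raw).casefold()
    # Keep letters only, so spaces, digits, "\r" and "\n" all vanish.
    return "".join(sorted(ch for ch in text if ch.isalpha()))
```

Under these rules "Dormitory" and "dirty room" get the same key, with or without a trailing DOS line ending. A candidate who picks different rules is not wrong. The interesting part is whether they notice a choice is being made.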

You can also quiz them on performance issues: are they reading in the entire dictionary file or going line-by-line? how efficient is their check for anagram-ness?
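Both performance questions can be illustrated in a few lines of Python. This is a sketch under stated assumptions, not a benchmarked answer: it streams the dictionary one line at a time instead of loading it, rejects words of the wrong length before doing any real work, and compares letter counts rather than sorting each word. The name scan is made up for illustration:

```python
from collections import Counter


def scan(target, lines):
    """Yield anagrams of target from an iterable of dictionary lines."""
    target_counts = Counter(target)  # computed once, not once per line
    target_len = len(target)
    for line in lines:  # one line in memory at a time
        word = line.rstrip("\r\n")
        if len(word) != target_len:
            continue  # cheap rejection for most of the dictionary
        # Assumption: the target word itself is not reported.
        if word != target and Counter(word) == target_counts:
            yield word
```

Passing sys.stdin as lines keeps memory use flat regardless of dictionary size. Whether counting actually beats sorting for short English words is something to measure rather than assume.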

Then it's fun to see how compact they can make their program, which is a good way to plumb the depths of their Perl knowledge. My first anagram lister was over a hundred lines long. (I guess my first cut is usually optimized for clarity / readability.) Tonight I whittled that down to a 65-byte whole-word anagram lister and a 79-byte general anagram lister.

Of course, these one-liners sacrifice a lot of completeness / robustness for compactness, but I think that's the point of the whole exercise. You have to pick which directions to optimize in. That choice tells you an awful lot about a programmer.

Posted by Alan at 10:42 PM | Comments (1)