Taciturn


I am now maintaining a patch that fixes the Australian timezone names on Unix-like systems.

What am I fixing?

The Australian timezone names on many Unix-like systems are missing the 'A' at the start. It's AEST, not EST; AWST, not WST. The existing timezone data also makes no distinction between standard time and daylight saving time (e.g. AEST vs. AEDT): it calls them both EST.
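
One way to see what a given system reports is to ask the C library for the abbreviation at a summer and a winter instant. This is only a rough sketch, assuming a POSIX system with the Australia/Sydney zone installed; zone_abbreviation is a hypothetical helper name, and note that it changes TZ for the whole process.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/*
 * Write the abbreviation that `zone` uses at the instant `when` into buf.
 * Returns the number of characters written, or 0 on failure.
 * Side effect: overwrites the TZ environment variable of this process.
 */
size_t zone_abbreviation(const char *zone, time_t when, char *buf, size_t buflen)
{
    struct tm local;

    if (setenv("TZ", zone, 1) != 0)
        return 0;
    tzset();                              /* re-read TZ */
    if (localtime_r(&when, &local) == NULL)
        return 0;
    return strftime(buf, buflen, "%Z", &local);
}
```

Calling it with 1200355200 (15 Jan 2008 UTC, daylight saving in Sydney) and 1216080000 (15 Jul 2008 UTC, standard time) shows the two names side by side. With the unpatched data described above, both calls would be expected to give EST.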

There's more info on my tzdata-au page. Naturally I'd be happy to hear from people who use this.

Thu 6 Mar 2008

David and I were discussing regexes and he wondered whether it was possible to write two regexes so that each one matches the other but not itself. In programmer- and shell-readable terms, given the variables a and b:

  echo "$a"|(! egrep "$a") && echo "$b"|(! egrep "$b") && echo "$a" |egrep "$b" && echo "$b"|egrep "$a" && echo yeah

(Thanks D.) If it says "yeah", you win.

To give it a more concrete scope, I settled on two rules. First, it should use extended regexes, because they have a good balance of power, usefulness, and sanity. Second, to avoid the easy solution of ^a and a$, both regexes must be anchored at both beginning and end.

To give readers a chance to solve this themselves, I'll disguise the spoilers below in white-on-white text. Highlight the text to read it.

The key is to use negated character classes. As a result I came up with ^[^\\]+$ and ^[^[:alpha:]]+$.

I then posed the question to #humbug, and Clinton Roy got a solution pretty quickly: ^[^a]*$ and ^[^b]*$.
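
Candidate pairs can also be checked from C rather than the shell. The following is a minimal sketch using POSIX regcomp() with REG_EXTENDED, which selects the same extended syntax that egrep uses. The function names ere_matches and wins are made up for illustration, and an invalid pattern is simply treated as "no match".

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the POSIX extended regex `pattern` matches `text`. */
bool ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    bool matched;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;                 /* invalid pattern: count as no match */
    matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}

/* True when a and b each match the other but neither matches itself. */
bool wins(const char *a, const char *b)
{
    return !ere_matches(a, a) && !ere_matches(b, b)
        && ere_matches(b, a) && ere_matches(a, b);
}
```

Remember that C string literals need their own escaping, so the pattern ^[^\\]+$ has to be written as "^[^\\\\]+$" in source code. With that in mind, wins("^[^\\\\]+$", "^[^[:alpha:]]+$") and wins("^[^a]*$", "^[^b]*$") should both return true.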

Doing the same when anchors are prohibited seems to be more difficult, although I haven't thought about it long enough to be satisfied that it's impossible.

By the way, setting GREP_OPTIONS='--color=auto' and optionally also --exclude-dir=.svn is totally awesome.

I was hacking together a GTK-based program that had to be capable of displaying a long list of items. It was running quite slowly — taking 174 seconds to load 50 000 items. That seemed like way too long for what should be one malloc per item and adjusting a few pointers.

The structure I used was a GtkTreeStore. It turns out it's ridiculously inefficient at appending items. Prepending takes about as long as you'd expect — about 0.83 seconds.

I tried another test with 280 000 lines. That took 4.25 seconds using gtk_tree_store_prepend; with gtk_tree_store_append, I killed the process after nearly 30 minutes.

The slowness looks like it is in the GtkTreeStore's GNode backend where appending an item at a particular level involves walking the list of every item on that level:

while (sibling->next)
  sibling = sibling->next;

So, if you're building a GtkTreeStore with sequential data, reverse your data and use gtk_tree_store_prepend rather than gtk_tree_store_append — it's much faster.
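
To see why the walk hurts, here is a small stand-alone model in plain C. It is not GTK or GLib code: struct node and the functions append_walking, prepend, total_append_hops and build_by_prepend are hypothetical stand-ins, and only the inner loop is taken from the snippet above. It counts pointer hops instead of measuring time.

```c
#include <stddef.h>
#include <stdlib.h>

struct node {
    struct node *next;
    int value;
};

/* Append by walking from the head to the tail; returns the hops taken. */
size_t append_walking(struct node **head, struct node *item)
{
    size_t hops = 0;
    struct node *sibling;

    item->next = NULL;
    if (*head == NULL) {
        *head = item;
        return 0;
    }
    sibling = *head;
    while (sibling->next) {           /* the loop quoted above */
        sibling = sibling->next;
        hops++;
    }
    sibling->next = item;
    return hops;
}

/* Prepend in constant time: no walking at all. */
void prepend(struct node **head, struct node *item)
{
    item->next = *head;
    *head = item;
}

/* Total hops needed to build an n-item list purely by appending. */
size_t total_append_hops(size_t n)
{
    struct node *nodes = calloc(n, sizeof *nodes);
    struct node *head = NULL;
    size_t total = 0;

    if (nodes == NULL)
        return 0;
    for (size_t i = 0; i < n; i++) {
        nodes[i].value = (int)i;
        total += append_walking(&head, &nodes[i]);
    }
    free(nodes);
    return total;
}

/* Keep source order without walking: feed items last-to-first to prepend. */
struct node *build_by_prepend(struct node *nodes, size_t n)
{
    struct node *head = NULL;

    for (size_t i = n; i-- > 0; )
        prepend(&head, &nodes[i]);
    return head;
}
```

In this model, appending n items one by one costs (n-1)(n-2)/2 hops in total, which for 50 000 items is about 1.25 billion, while prepending the reversed data costs none and still leaves the list in its original order.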

I also tested GtkListStore and appending is as fast as prepending.

Tested with GTK 2.12.1 and GLib 2.14.1 (Debian). Source.

Sun 30 Sep 2007

Tonight I released Stallone 0.1.0. Here is a copy of the announcement I sent to the Avahi mailing list.

From: Ted Percival <ted midg3t.net>
To: avahi lists.freedesktop.org
Subject: [Announce] Stallone 0.1.0 - NAT-PMP Gateway

Greetings, Avahi users.

It is with great trepidation that I announce the first release of the
"Stallone" NAT-PMP gateway.

Stallone is a daemon that implements the NAT Port-Mapping Protocol
(NAT-PMP) allowing machines on a private network (behind NAT) to get
publicly-accessible TCP and UDP ports in order to accept connections
from other internet-connected machines. It runs on the NAT machine and
services incoming requests by adding and removing iptables rules that
provide the packet redirection.

The current project home page is
  http://tedp.id.au/stallone/
and you can download this release directly from
  http://tedp.id.au/stallone/releases/stallone-0.1.0.tar.gz

This is the "Works for Me" release, version 0.1.0.

Sun 23 Sep 2007

I was looking for equivalents in git for Subversion's svn revert <file> and for Mercurial's hg outgoing. It looks like they are git checkout <file> and git log origin.. (including the two trailing dots).

Transmitting passwords over HTTPS is safe, but serving the login form over HTTP is not. The attack vector is that an active attacker can send a custom login form with a different form submission address, compromising users' passwords.

I noticed this when using the Debian mentors login. Fortunately the login page is also available over HTTPS if you adjust the URI yourself, but ideally it would be the default.

As part of an assessment of benevolent dictatorship as a governance model for free software projects, Ted Ts'o suggests:

If Debian had either explicitly stated an FSF-centric position [free software only], and only accepted members that supported point of view, or explicitly subscribed to a position that users should be able to use whatever software they feel meets their needs, as Ubuntu has, it would avoided many arguments that have threatened to tear apart Debian.

Perhaps some arguments would have been avoided, but having someone declare definitively that Debian is a pure free software distribution or that it isn't would cause it to lose a large part of its developer community. Despite the arguments, the ambiguity allows both purists and pragmatists to co-exist more-or-less in harmony, working together for a more-or-less common goal.

I think the point that really shines through (and Ted mentioned it) is that in free software projects, if the existing governance model is not working then anyone is free to fork it and try another method. Indeed that is the exact reason that Ubuntu was founded — to try to run a Debian-like project under the benevolent dictator model. Both Ubuntu and Debian continue to be popular (perhaps Ubuntu more with users and Debian more with developers?), and both are healthy projects well on their way to global domination.

(You should read Josh Berkus's article, The Myth of the Benevolent Dictator, from which all this spawned.)

Mon 27 Aug 2007

I was under the impression that descriptive URIs were better than short URIs until I made mine descriptive. Now they look too long:

http://web.midg3t.net/blog/170/
  vs.
http://web.midg3t.net/blog/170/uris-short-or-descriptive/

Update 2007-08-30: Changed the word separator to hyphen rather than underscore. It's certainly easier to type and seems more common amongst blog URIs. You can still use the short, numeric-only URIs if that's what you prefer.

Distributions sprang from the need for an easy way to install a GNU system. Over the years they've come to be much more, perhaps most recognisably a tool for simplifying system administration. But I think a much overlooked function of a distro is that it provides a stable branch for thousands of projects that otherwise would not have one: a branch that receives backported bug fixes and goes through a period of feature freeze before being considered "released" to end users. This way small projects with only one or two core developers can concentrate on development of any kind and allow the distribution to cherry-pick bug fixes for its stable release.

The difficult thing about maintaining a stable branch is knowing what to put in and what to leave out. It is an especially difficult job for a developer-maintainer to do. They are likely to be more keen on getting new code into users' hands. On the other hand, a distribution maintainer has the job of providing a solid piece of software and can more easily make decisions about what changes to pull and what to leave for a future release in the context of that distribution.

Many of the well-known projects, such as Apache httpd, Linux, KDE, GNOME, GCC and glibc, have internally maintained stable branches. I think that attention to solid releases shows in the projects' reputations. In particular, the person in charge of stable releases is often not the person in charge of development releases.

Even with the availability of distributions as a stable branch, I'd like to see more projects maintaining their own stable branches or feature freezes. Make use of that three- or four-digit version number you've got! Having a separate stable maintainer is not necessary, only an advantage. There are always people interested in running the latest code for the latest features (and bugs). When things don't work, they'll let you know, and from that you can determine what to put in a stable release.

Of course I've just made a number of assertions about what good software engineering practice is when my experience in the field could aptly be described as "short". I'll be trying out some ideas in the future with "Stallone". First I have to make a release, though.

Mon 20 Aug 2007

I'm having trouble deciding on a project name for my standalone NAT-PMP gateway. I've decided that acronyms are too immemorable — in particular the obvious choice of "NATPMD" brings back memories of PCMCIA.

Another consideration is that it might have two parts in the near future. So far there is the gateway (router, port-forwarder) daemon, but I also intend to write an agent daemon to allow apps to easily make use of NAT-PMP to get their ports forwarded.

I've come up with a few potential names:

I like the idea of making reference to the opening of holes in firewalls which is where Boring, Witchetty and TBM come from. Napalm is a Samba-esque "similar word" idea. I also thought of Nymph with the client portion being called Satyr, but they leave me without a name for the combination of both.

At the moment Boring is my favourite, but calling the daemon boring-gateway seems a bit dull. I'm certainly open to suggestions.

You can find it now (it works) under the codename "Stallone" in various forms (links subject to change):


Updated 2007-10-08: I ended up just calling it stallone. You can find it at http://tedp.id.au/stallone/.

