Saturday, July 7, 2007

Hello, Python!

Last week I talked about Python with Harry and with Lars, independently of each other. Conclusion: Python seems to be "the" scripting language to learn if you want something versatile and hate Perl.

Today I needed a small script: find me all the four-letter family names in the US Census data. Usually I'd simply do that with awk. But what better opportunity to actually start using Python?

I have the family name list in ASCII form. Nicely formatted in columns with whitespace separators. All I need is a small Python script that will loop over stdin and match as required. No huhu:

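import sys
import re

while True:
    # readline() returns "" only at end of input; blank lines come back as "\n"
    line = sys.stdin.readline()
    if line == "":
        break
    # exactly four capital letters at the start of the line, then a space
    m = re.match('^[A-Z]{4} ', line)
    if m is not None:
        print(m.string[m.start():m.end()])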

That wasn't so difficult.
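
For comparison, here is a more compact way the same filter might be written. Treat it as a sketch rather than the definitive version: the function name four_letter_names and the sample lines are made up for illustration, and it assumes the name is the first column, in capitals, followed by whitespace.

```python
import re

# Exactly four capital letters at the start of the line, followed by whitespace.
FOUR_LETTERS = re.compile(r'^([A-Z]{4})\s')

def four_letter_names(lines):
    """Yield the leading name of each line if it is exactly four letters long."""
    for line in lines:
        m = FOUR_LETTERS.match(line)
        if m is not None:
            yield m.group(1)  # the name only, without the trailing whitespace

# Made-up sample lines in the assumed layout (name first, then other columns).
sample = [
    "SMITH   1.0  1.0  1\n",
    "KING    0.2  9.9  2\n",
    "LEE     0.1  9.9  3\n",
    "WOOD    0.1  9.9  4\n",
]
print(list(four_letter_names(sample)))  # ['KING', 'WOOD']
```

Since a file object can be iterated line by line, passing sys.stdin (after import sys) instead of the sample list should give the same behaviour as the loop above, without the explicit end-of-input check.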