Saturday, October 15, 2005

A face only a programmer could love

I need to get back in the swing of updating this regularly, and my threatened article on generators in Python seems like as good a topic as any. I use a generator maybe every other month, and each time I have to do it wrong once and then remember how to do it correctly. So I figured I would document my experience here, so the next time I have to do this, I can refer back and hopefully save myself some time. Oh, and you guys can read it, as well.

Generators in Python are odd ducks. They feel like a giant hack to me. I want them to be functors, but they're not. In fact, they are syntactic sugar for creating an iterator object. Consider the following functor (a callable object; despite its name, it doesn't implement the iterator protocol, so you can't loop over it directly):
RED = "FF0000"
WHITE = "FFFFFF"

class Iterable:
    # Start on RED so that the first call flips to WHITE
    __color = RED

    def __call__(self):
        # Toggle the stored color, then return the new value
        if self.__color == RED:
            self.__color = WHITE
        else:
            self.__color = RED
        return self.__color

I = Iterable()
print I()
print I()
print I()
The output I get from this is:

FFFFFF
FF0000
FFFFFF

This is something I might do anytime I need to generate a pattern or a sequence. Obviously this is a very simple example, but there's plenty of room for complexity here.

Generators let you do this (arguably) in a simpler, cleaner fashion:
RED = "FF0000"
WHITE = "FFFFFF"

def generator():
    color = RED
    while True:
        if color == RED:
            color = WHITE
        else:
            color = RED
        yield color

I = generator()
print I.next()
print I.next()
print I.next()
This gives the same output.

What I always want to do is call the generator function repeatedly, as if it were the functor above. That's the part I get wrong: calling the function doesn't run its body, it creates the generator. The body of the function gets grafted onto the new generator object, where it acts kind of like a coroutine: execution is suspended at the yield statement, and resumes just after the yield the next time next() is called.
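Here's a minimal sketch of that gotcha, mostly for my own future reference. It assumes Python 3 syntax, so it uses the built-in next(gen) where the code above uses gen.next(), and the names wrong, right, and gen are just ones I made up for illustration.

```python
RED = "FF0000"
WHITE = "FFFFFF"

def generator():
    color = RED
    while True:
        color = WHITE if color == RED else RED
        yield color

# The mistake: every call to generator() builds a brand-new generator,
# so each next() only ever sees the first value.
wrong = [next(generator()) for _ in range(3)]

# The fix: create the generator once, then keep advancing that one object.
gen = generator()
right = [next(gen) for _ in range(3)]

print(wrong)
print(right)
```

The first list repeats the starting color three times, while the second alternates, because only gen keeps its suspended state between calls.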

Of course, this example doesn't really demonstrate the real value of a generator over a hand-crafted iterator class. Consider this:
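I = generator()  # start over with a fresh generator
for x in range(3):
    print '<tr>'
    # zip() stops after three cells, leaving I ready for the next row
    for y, color in zip(range(3), I):
        print ' <td bgcolor=%s>&nbsp;</td>' % color
    print '</tr>'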
This results in the following chunk of HTML:
<tr>
<td bgcolor=FFFFFF>&nbsp;</td>
<td bgcolor=FF0000>&nbsp;</td>
<td bgcolor=FFFFFF>&nbsp;</td>
</tr>
<tr>
<td bgcolor=FF0000>&nbsp;</td>
<td bgcolor=FFFFFF>&nbsp;</td>
<td bgcolor=FF0000>&nbsp;</td>
</tr>
<tr>
<td bgcolor=FFFFFF>&nbsp;</td>
<td bgcolor=FF0000>&nbsp;</td>
<td bgcolor=FFFFFF>&nbsp;</td>
</tr>
That gives us our lovely checkerboard pattern. What the generator gets you is automatic conformity to the iterator interface (namely, the __iter__() and next() methods, with next() raising StopIteration once it runs out of values to return). When you write a custom iterator class, you have to do these things yourself; otherwise, your iterator may not conform to the expected interface, and may break when you reuse it down the road.
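To make the comparison concrete, here is a rough sketch of what "doing these things yourself" might look like next to the generator version. It assumes Python 3, where the method is spelled __next__() rather than next(), and both ColorIterator and color_generator (along with the count parameter, added so the sequence actually ends) are hypothetical names of mine, not anything from the examples above.

```python
RED = "FF0000"
WHITE = "FFFFFF"

class ColorIterator:
    """Hand-written iterator that alternates colors for a fixed number of items."""

    def __init__(self, count):
        self.remaining = count
        self.color = RED

    def __iter__(self):
        # An iterator must return itself from __iter__()
        return self

    def __next__(self):  # spelled next() in Python 2
        if self.remaining <= 0:
            # Signal exhaustion the way for loops and zip() expect
            raise StopIteration
        self.remaining -= 1
        self.color = WHITE if self.color == RED else RED
        return self.color

def color_generator(count):
    """Generator equivalent: the protocol methods come for free."""
    color = RED
    for _ in range(count):
        color = WHITE if color == RED else RED
        yield color

by_hand = list(ColorIterator(4))
by_generator = list(color_generator(4))
print(by_hand)
print(by_generator)
```

Both produce the same four colors. The class needs three methods and explicit bookkeeping to get there, while the generator lets the function body carry the state.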
