Usefulness of itertools.cycle & re.sub

Tim Golden

2009-11-16 11:11

(… or at least the concept). I wanted to process a piece of plain text containing conventional double-quote marks so that they became HTML smart-quote entities (&ldquo; &rdquo;). I was prepared to adopt a naive algorithm which assumed that alternate quotes would always pair up, something which obviously wouldn't work for single quotes, since they double as apostrophes. I toyed with various ways of splitting the text up and joining it back together until I came across the slick combination of itertools.cycle and re.sub:

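import itertools
import re

# Endless iterator alternating between the opening and closing entity
quotes = itertools.cycle(['&ldquo;', '&rdquo;'])

def sub(match):
    # re.sub calls this once per match; the match object itself is ignored
    return next(quotes)

text = 'The "quick" brown "fox" jumps over the "lazy" dog.'
print(re.sub('"', sub, text))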

Obviously my itertools.cycle could trivially be written as a generator: while 1: yield '&ldquo;'; yield '&rdquo;', but why reinvent the wheel?
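
For the curious, a minimal sketch of that hand-rolled generator, using the standard &ldquo; and &rdquo; entity names. The name alternate is made up for illustration; it is not a library function.

```python
import re

def alternate(first, second):
    # Hypothetical hand-rolled stand-in for itertools.cycle([first, second])
    while True:
        yield first
        yield second

quotes = alternate('&ldquo;', '&rdquo;')
text = 'The "quick" brown "fox"'
# Each match pulls the next value from the generator
result = re.sub('"', lambda match: next(quotes), text)
print(result)
```

It behaves the same as the itertools.cycle version for two values, which is rather the point: the library call says the same thing in one line.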

Update: Tom Lynn points out that this can be done with a straightforward regex:

text = re.sub(r'"([^"]*)"', r'&ldquo;\1&rdquo;', text)
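
The two approaches agree on text whose quotes pair up, but differ when a quote is left unmatched. The sketch below illustrates this; the helper names with_cycle and with_regex are assumptions for the example only, and the cycle is created inside the function so that each call starts on an opening quote.

```python
import itertools
import re

def with_cycle(text):
    # Fresh cycle per call, so state never leaks between texts
    quotes = itertools.cycle(['&ldquo;', '&rdquo;'])
    return re.sub('"', lambda match: next(quotes), text)

def with_regex(text):
    # Only rewrites quotes that come in pairs
    return re.sub(r'"([^"]*)"', r'&ldquo;\1&rdquo;', text)

balanced = 'The "quick" brown "fox"'
unbalanced = 'The "quick" brown "fox'

print(with_cycle(balanced) == with_regex(balanced))
print(with_cycle(unbalanced))
print(with_regex(unbalanced))
```

With an odd number of quotes, the cycle version turns the stray quote into an opening entity, while the regex version leaves it as a plain double quote.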