This is an archived version of the course and is no longer updated. Please find the latest version of the course on the main webpage.

re module

Python provides the re module for working with regular expressions.

As mentioned in my live lecture, here are the four re functions that you will most likely use (plus a bonus one!):

  1. re.match()
  2. re.search()
  3. re.findall()
  4. re.finditer()
  5. re.fullmatch()

Make sure you understand the differences between these. When in doubt, consult the official documentation!
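To make the difference between the three "single match" functions concrete, here is a minimal sketch. The sample string is made up for illustration; try your own strings too.

```python
import re

text = "good morning"  # hypothetical sample string

# match() only succeeds if the pattern matches at the very start of the string.
at_start = re.match("morning", text)        # None: the string starts with "good"

# search() scans the string and returns the first match found anywhere.
anywhere = re.search("morning", text)       # a Match object

# fullmatch() only succeeds if the pattern matches the entire string.
whole = re.fullmatch("morning", text)       # None: "good " is left over
whole_ok = re.fullmatch("good morning", text)  # a Match object
```

findall() and finditer() differ from these three in that they report every match rather than just one.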

re.match(), re.search() and re.fullmatch() will each return a Match object if the pattern matches, or None if it does not. Again, consult the documentation to find out what properties and methods a Match object has to offer. [Yes, I’m deliberately holding back on spoon-feeding now and trying to teach you how to fish instead!]
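As a starting point for your own exploration, here is a small sketch of a few Match methods. The sample string and pattern are assumptions chosen for illustration, and the documentation lists more than is shown here.

```python
import re

text = "good morning world"  # hypothetical sample string
m = re.search(r"(\w+)ing", text)

# Always check for None before using the result: there may be no match.
if m is not None:
    whole = m.group()    # the entire matched text: "morning"
    first = m.group(1)   # text captured by the first (...) group: "morn"
    span = m.span()      # (start, end) indices of the match: (5, 12)
    start, end = m.start(), m.end()
```

Since text[start:end] gives back the matched text, the indices are handy when you need to know where a match sits in the original string.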

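re.findall() will return a list of all non-overlapping matches as strings. If the pattern contains capture groups, the list holds the group contents instead of the whole matches.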
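The effect of capture groups on findall() is easy to miss, so here is a short sketch. The sample string is invented for illustration.

```python
import re

text = "cat=3, dog=12, owl=7"  # hypothetical sample string

# No groups: each list item is the whole matched string.
no_groups = re.findall(r"\w+=\d+", text)       # ['cat=3', 'dog=12', 'owl=7']

# One group: each list item is just that group's text.
one_group = re.findall(r"\w+=(\d+)", text)     # ['3', '12', '7']

# Several groups: each list item is a tuple of the groups.
two_groups = re.findall(r"(\w+)=(\d+)", text)  # [('cat', '3'), ('dog', '12'), ('owl', '7')]
```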

re.finditer() will return an iterator that you can use to extract multiple Match objects with a for loop.
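A minimal sketch of the for-loop usage, assuming a made-up sample string. Unlike findall(), each item is a full Match object, so positions are available as well as the text.

```python
import re

text = "one fish two fish"  # hypothetical sample string

positions = []
for m in re.finditer("fish", text):
    # Each m is a Match object, so we can ask where the match was found.
    positions.append((m.group(), m.start(), m.end()))

# positions is now [('fish', 4, 8), ('fish', 13, 17)]
```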

Explore for yourself – what do these return?

>>> import re
>>> string = "morning morning morning world!"
>>> pattern = "morning"
>>> re.match(pattern, string)
>>> re.search(pattern, string)
>>> re.findall(pattern, string)
>>> [match for match in re.finditer(pattern, string)]

OOP way

You can also pre-compile the pattern string into a Pattern object, and perform all of the actions above as methods of Pattern instead of functions. As usual, look at the documentation to see what methods/attributes Pattern has on offer.

>>> import re
>>> string = "morning morning morning world!"
>>> pattern = re.compile("morning")
>>> type(pattern)
>>> pattern.pattern
>>> pattern.match(string)
>>> pattern.search(string)
>>> pattern.findall(string)
>>> [match for match in pattern.finditer(string)]
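One practical reason to compile is that the pattern and its flags are set once and then reused. The sketch below assumes a made-up list of lines and uses re.IGNORECASE purely as an example flag.

```python
import re

# Flags are passed once, at compile time, and apply to every later call.
word = re.compile(r"morning", re.IGNORECASE)

lines = ["Morning all", "good MORNING", "evening"]  # hypothetical data

# Reuse the same Pattern object for each line.
hits = [line for line in lines if word.search(line)]
# hits is ['Morning all', 'good MORNING']

# Pattern methods also accept an optional start position,
# which the module-level functions do not.
from_pos = word.match("good MORNING", 5)  # matches, starting at index 5
```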