Chapter 6: Implementing regular expressions

Searching patterns

Josiah Wang

Now that you have mastered the basics of regular expressions, let’s focus on the more advanced details, especially those that matter when implementing regular expressions in Python.

Hopefully you are now familiar with both the match() and search() functions.

Each of these functions returns at most a SINGLE match: a Match object for the first match found, or None if there is no match.

What if we need to return all possible matches? Python provides two more functions for this, called findall() and finditer().

re.findall() will return a list of matched strings.

>>> import re
>>> greeting = "morning morning josiah"
>>> pattern = "morning"
>>> matches = re.findall(pattern, greeting)
>>> print(matches)
['morning', 'morning']
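
To make the contrast with search() concrete, here is a minimal sketch that reuses the greeting string from the example above. The variable names and the "evening" pattern are only for illustration.

```python
import re

greeting = "morning morning josiah"

# search() stops at the first match and returns one Match object.
first = re.search("morning", greeting)
first_span = first.span()

# findall() keeps scanning and returns every non-overlapping match as a string.
all_matches = re.findall("morning", greeting)

# When nothing matches, findall() returns an empty list rather than None.
no_matches = re.findall("evening", greeting)

print(first_span)
print(all_matches)
print(no_matches)
```

Because findall() always returns a list, you can take len() of the result or loop over it without first checking for None.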

If you need more information such as the position of the match, use re.finditer(). It will return an iterator, so use it with a for-loop to get a Match instance at each iteration.

>>> greeting = "morning morning josiah"
>>> pattern = "morning"
>>> iterator = re.finditer(pattern, greeting)
>>> for match in iterator:
...     print(match)
...
<re.Match object; span=(0, 7), match='morning'>
<re.Match object; span=(8, 15), match='morning'>
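
Printing the Match object is useful for a quick look, but in practice you will probably want the individual pieces. The sketch below, with illustrative variable names, pulls the start position, end position and matched text out of each Match using its start(), end() and group() methods.

```python
import re

greeting = "morning morning josiah"
pattern = "morning"

# Collect (start, end, matched text) for every match in the string.
positions = []
for match in re.finditer(pattern, greeting):
    positions.append((match.start(), match.end(), match.group()))

print(positions)
```

The end position is exclusive, so greeting[match.start():match.end()] gives back the matched text.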

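There is also a fullmatch() function that is like match(), but succeeds only if the pattern matches the whole string. So, re.fullmatch("ba{4}", string) behaves much like re.search("^ba{4}$", string) and re.match("ba{4}$", string). There is one small difference: $ also matches just before a newline at the very end of the string, so the $ versions accept "baaaa\n" while fullmatch() does not. Use \Z instead of $ if you need exactly the same behaviour. Saves you from having to add start-of-string and end-of-string anchors to your regular expressions!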
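
As a quick check of how fullmatch() compares with anchored patterns, the sketch below tries the same pattern on a string with and without a trailing newline. The test strings and variable names are made up for illustration.

```python
import re

pattern = "ba{4}"
exact = "baaaa"
with_newline = "baaaa\n"

# fullmatch() requires the pattern to cover the entire string.
full_exact = re.fullmatch(pattern, exact) is not None
full_newline = re.fullmatch(pattern, with_newline) is not None

# $ also matches just before a trailing newline, so this still succeeds.
dollar_newline = re.search("^ba{4}$", with_newline) is not None

# \Z matches only at the very end of the string, like fullmatch().
z_newline = re.match(r"ba{4}\Z", with_newline) is not None

print(full_exact, full_newline, dollar_newline, z_newline)
```

This matters mostly when validating input that may carry a trailing newline, such as lines read from a file.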