This is an archived version of the course. Please find the latest version of the course on the main webpage.

Chapter 2: Regular expression basics

Match set of characters

Josiah Wang

Now, instead of only the lowercase letter b, we want to allow our words to start with an uppercase B as well.

So you want to be able to match the following strings: bop, bap, bmp, bBp, b2p, b#p, b&p, b!p, b@p, Bop, Bap, Bmp, Bbp, B2p, B#p, B&p, B!p, and B@p.

How would you write a regular expression for this?

You can use square brackets to specify a set of allowed characters. In this case the regular expression is [Bb].p, where the first character can be any one of the characters listed inside the square brackets (here, B or b).

>>> import re
>>> pattern = "[Bb].p"
>>> re.match(pattern, "bop")
<re.Match object; span=(0, 3), match='bop'>
>>> re.match(pattern, "B2p")
<re.Match object; span=(0, 3), match='B2p'>
>>> re.match(pattern, "cap")  # returns None, so nothing is printed
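
To check the pattern against the whole list of strings above rather than one at a time, a short script along the following lines may help. It is only a sketch: the names should_match, should_not_match, matched and rejected are made up for this illustration, and the strings "top" and "Dop" are extra negative examples that are not part of the original exercise.

```python
import re

pattern = "[Bb].p"

# The eighteen strings from the exercise; all of them should match.
should_match = [
    "bop", "bap", "bmp", "bBp", "b2p", "b#p", "b&p", "b!p", "b@p",
    "Bop", "Bap", "Bmp", "Bbp", "B2p", "B#p", "B&p", "B!p", "B@p",
]

# Illustrative negative examples (assumed): none starts with B or b.
should_not_match = ["cap", "top", "Dop"]

# re.match returns a Match object on success and None on failure.
matched = [s for s in should_match if re.match(pattern, s)]
rejected = [s for s in should_not_match if re.match(pattern, s) is None]

print(len(matched), "of", len(should_match), "expected strings matched")
print("Rejected:", rejected)
```

A Match object is truthy and None is falsy, which is why the result of re.match can be used directly as a filter condition.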

Quick task

Write a regular expression that matches only these strings: can, fan, and man.

>>> pattern = "[cmf]an"
>>> re.match(pattern, "can")
<re.Match object; span=(0, 3), match='can'>
>>> re.match(pattern, "man") 
<re.Match object; span=(0, 3), match='man'>
>>> re.match(pattern, "pan")  # returns None, so nothing is printed
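
One caveat on "matches only these strings": re.match checks only that the pattern matches at the start of the string, so a longer string with a matching prefix is still accepted. If the whole string has to match, re.fullmatch is one option. The sketch below illustrates the difference; the test word "candy" and the variable names are assumptions added for this example.

```python
import re

pattern = "[cmf]an"

# re.match succeeds if the pattern matches at the START of the string,
# so "candy" is accepted because it begins with "can".
prefix_hit = re.match(pattern, "candy")

# re.fullmatch succeeds only if the ENTIRE string matches the pattern.
full_hit = re.fullmatch(pattern, "candy")

# Keep only the words that match the pattern in full.
words = ["can", "fan", "man", "pan", "candy"]
exact = [w for w in words if re.fullmatch(pattern, w)]

print(prefix_hit.group())
print(full_hit)
print(exact)
```

For the three-letter words in the task, re.match and re.fullmatch give the same results; they differ only on longer inputs such as "candy".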