Chapter 2: Regular expression basics

Pizza or pasta?

Josiah Wang

How do you match both "I love pizza" and "I love pasta"?

You can say "I love (pizza|pasta)", where the | means "or".

Note the parentheses. The | operator has the lowest precedence of all the regular expression operators, so it splits everything on either side of it into two alternatives. If you do not have the parentheses, then the regular expression "I love pizza|pasta" will be interpreted as "(I love pizza)|(pasta)", and will match "I love pizza" or just "pasta" (hmmm… it seems I no longer love pasta!)
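To check how the parentheses change the meaning, here is a small sketch using Python's re module. The helper name fully_matches is just an illustration, and re.fullmatch is used (rather than re.match) so that the whole string has to match the pattern.

```python
import re

grouped = r"I love (pizza|pasta)"   # | is limited to the words inside the group
ungrouped = r"I love pizza|pasta"   # | splits the entire pattern in two

def fully_matches(pattern, text):
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

for text in ["I love pizza", "I love pasta", "pasta"]:
    print(text, "| grouped:", fully_matches(grouped, text),
          "| ungrouped:", fully_matches(ungrouped, text))
```

Only the grouped pattern accepts both sentences. The ungrouped one accepts "I love pizza" and the bare word "pasta", but not "I love pasta".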

The advantage of using | over sets [] is that each alternative can be a sequence of characters rather than a single character, e.g. gr(ee|a)n matches both "green" and "gran".
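As a quick illustration of the difference, the sketch below compares an alternation with a set on a few test words. The variable names and the word list are made up for this example.

```python
import re

alternation = r"gr(ee|a)n"  # alternatives may be several characters long
char_set = r"gr[ea]n"       # a set always stands for exactly one character

words = ["green", "gran", "gren"]

# Keep only the words that match each pattern in full.
alt_hits = [w for w in words if re.fullmatch(alternation, w)]
set_hits = [w for w in words if re.fullmatch(char_set, w)]

print(alt_hits)  # ['green', 'gran']
print(set_hits)  # ['gran', 'gren']
```

The set cannot express "ee", so it never matches "green".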

Quick task

Write a regular expression that matches both spellings: gray and grey.

>>> import re
>>> pattern = "gray|grey"
>>> re.match(pattern, "gray")
<re.Match object; span=(0, 4), match='gray'>
>>> re.match(pattern, "grey")
<re.Match object; span=(0, 4), match='grey'>
>>> re.match(pattern, "groy")  # None

Other possible solutions: gr(a|e)y, (gray)|(grey), gr(ay|ey), gr[ae]y.
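If you want to convince yourself that these alternatives behave the same way, a small check like the one below may help. The test words and the accepts helper are assumptions chosen for this sketch, and re.fullmatch is used so that the whole word must match.

```python
import re

solutions = ["gr(a|e)y", "(gray)|(grey)", "gr(ay|ey)", "gr[ae]y"]
should_match = ["gray", "grey"]
should_not_match = ["groy", "gry", "graey"]

def accepts(pattern, word):
    """Return True if the whole word matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, word) is not None

# A solution passes if it accepts both spellings and rejects the rest.
results = {
    p: all(accepts(p, w) for w in should_match)
       and not any(accepts(p, w) for w in should_not_match)
    for p in solutions
}

for pattern, ok in results.items():
    print(pattern, "passes" if ok else "fails")
```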