Advanced Lesson 1
Regular Expressions
Chapter 2: Regular expression basics
Pizza or pasta?
How do you match both "I love pizza" and "I love pasta"?
You can say "I love (pizza|pasta)", where the | represents an or.
Note the parentheses. The | has the lowest precedence of all regular expression operators, so everything to its left and everything to its right become separate alternatives. Without the parentheses, the regular expression "I love pizza|pasta" is interpreted as "(I love pizza)|(pasta)", which matches "I love pizza" or just "pasta", but not "I love pasta".
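A quick way to see what the parentheses change is to run both forms against the same strings. The sketch below uses Python's re.fullmatch so that the whole string has to match. The helper name matches and the test strings are illustrative choices, not part of the lesson.

```python
import re

GROUPED = r"I love (pizza|pasta)"   # | only chooses between the two dishes
UNGROUPED = r"I love pizza|pasta"   # | splits the whole pattern in two

def matches(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

for text in ("I love pizza", "I love pasta", "pasta"):
    print(f"{text!r}: grouped={matches(GROUPED, text)}, "
          f"ungrouped={matches(UNGROUPED, text)}")
```

The grouped pattern accepts both sentences and rejects a bare "pasta", while the ungrouped one accepts "I love pizza" and "pasta" but rejects "I love pasta".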
The advantage of using | over sets [] is that each alternative can be a sequence of characters rather than a single character, e.g. gr(ee|a)n matches both green and gran.
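As a small illustration of that difference, the sketch below compares gr(ee|a)n with a set-based pattern, gr[ea]n. The set version and the word list are assumptions made up for this comparison.

```python
import re

words = ["green", "gran", "gren", "graan"]

# Alternation: each branch may be several characters long.
alternation = re.compile(r"gr(ee|a)n")
# Set: stands for exactly one character, either e or a.
char_set = re.compile(r"gr[ea]n")

alternation_hits = [w for w in words if alternation.fullmatch(w)]
set_hits = [w for w in words if char_set.fullmatch(w)]

print(alternation_hits)  # ['green', 'gran']
print(set_hits)          # ['gran', 'gren']
```

The set cannot match green, because [ea] consumes only one character.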
Quick task
Write a regular expression that matches both spellings, gray and grey.
>>> import re
>>> pattern = "gray|grey"
>>> re.match(pattern, "gray")
<re.Match object; span=(0, 4), match='gray'>
>>> re.match(pattern, "grey")
<re.Match object; span=(0, 4), match='grey'>
>>> re.match(pattern, "groy") # None
>>>
Other possible solutions: gr(a|e)y, (gray)|(grey), gr(ay|ey), gr[ae]y.
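The alternatives listed above can be checked against one another with a short script. This is only a sketch: the candidate words are made up for the test, and re.fullmatch is used so that partial matches do not count.

```python
import re

solutions = [r"gr(a|e)y", r"(gray)|(grey)", r"gr(ay|ey)", r"gr[ae]y"]
candidates = ["gray", "grey", "gry", "graey", "groy"]

# For each pattern, collect the candidate words it matches in full.
results = {
    pattern: [word for word in candidates if re.fullmatch(pattern, word)]
    for pattern in solutions
}

for pattern, accepted in results.items():
    print(f"{pattern}: {accepted}")
```

Each pattern accepts exactly gray and grey from this list.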