Advanced Lesson 1
Regular Expressions
Chapter 3: Quantifiers
Groupings
The *, +, ?, and {M,N} quantifiers apply only to the single previous character. So hello* only repeats the o.
To repeat groups of characters, use parentheses to group the characters together. For example, (baa)+ matches baa, baabaa, baabaabaa, etc.
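As a quick check of the difference, here is a minimal sketch comparing a quantifier on a single character with a quantifier on a group. It uses re.fullmatch so the whole string has to fit the pattern; the sample strings are illustrative and not part of the lesson's exercises.

```python
import re

# Without parentheses, * applies only to the final "o".
print(bool(re.fullmatch(r"hello*", "hellooo")))     # True
print(bool(re.fullmatch(r"hello*", "hell")))        # True: zero "o"s is allowed
print(bool(re.fullmatch(r"hello*", "hellohello")))  # False

# With parentheses, + applies to the whole group "baa".
print(bool(re.fullmatch(r"(baa)+", "baabaabaa")))   # True
print(bool(re.fullmatch(r"(baa)+", "baaa")))        # False
```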
Quick task
Write a regular expression that matches a string that:
- starts with "oh" (one or more times)
- is followed by "yeah" (zero or more times)
- optionally ends with the word "baby" (at most once)
So the following strings are valid: "oh", "oh oh", "oh oh yeah yeah yeah", "oh yeah yeah baby", "oh baby", "oh oh baby", and "oh oh yeah yeah yeah baby".
Beware of how you include your spaces between the words!
>>> import re
>>> pattern = "(oh)( oh)*( yeah)*( baby)?"
>>> re.match(pattern, "oh")
<re.Match object; span=(0, 2), match='oh'>
>>> re.match(pattern, "oh oh yeah yeah yeah")
<re.Match object; span=(0, 20), match='oh oh yeah yeah yeah'>
>>> re.match(pattern, "oh baby")
<re.Match object; span=(0, 7), match='oh baby'>
>>> re.match(pattern, "baby baby yeah") # None
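Note that re.match only anchors the pattern at the start of the string, so a string with invalid trailing text can still produce a match on its valid prefix. If the task is read as requiring the whole string to fit the pattern (an assumption, since the lesson does not say so explicitly), re.fullmatch is one way to enforce that. A minimal sketch, with "oh yeah oh" as an illustrative invalid string:

```python
import re

pattern = "(oh)( oh)*( yeah)*( baby)?"

# re.match succeeds on a valid prefix even when the rest is invalid.
print(re.match(pattern, "oh yeah oh").group())   # oh yeah

# re.fullmatch requires the entire string to fit the pattern.
print(re.fullmatch(pattern, "oh yeah oh"))       # None
print(bool(re.fullmatch(pattern, "oh oh yeah yeah yeah baby")))  # True
```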