Advanced Lesson 1
Regular Expressions
Chapter 3: Quantifiers
Kleene star
So far we have only written regular expressions for a fixed number of characters.
What if we want a regular expression that accepts all of the following: b, ba, baa, baaa, baaaa, baaaaa, baaaaaa, baaaaaaa, baaaaaaaa, baaaaaaaaa, baaaaaaaaaa, etc.?
This is easy. Just say "ba*", where * means “zero or more repetitions of the preceding element” (here, the letter a).
>>> import re
>>> pattern = "ba*"
>>> re.match(pattern, "b")
<re.Match object; span=(0, 1), match='b'>
>>> re.match(pattern, "ba")
<re.Match object; span=(0, 2), match='ba'>
>>> re.match(pattern, "baaaaaaaaaaaaa")
<re.Match object; span=(0, 14), match='baaaaaaaaaaaaa'>
The star * is called the Kleene star, named after the mathematician Stephen Kleene.
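To make "zero or more" concrete, here is a small sketch that checks several strings against "ba*". The helper name accepts and the list of candidates are made up for illustration. It uses re.fullmatch instead of re.match so that the whole string has to fit the pattern, not just its beginning.

```python
import re

# "ba*" = one "b" followed by zero or more "a"s.
# fullmatch (unlike match) requires the entire string to fit the pattern.
pattern = re.compile(r"ba*")

def accepts(text):
    """Return True if the whole string is matched by ba* (illustrative helper)."""
    return pattern.fullmatch(text) is not None

candidates = ["b", "ba", "baaaa", "a", "bab", ""]
results = {text: accepts(text) for text in candidates}

for text, ok in results.items():
    print(repr(text), ok)
```

Only "b", "ba" and "baaaa" are accepted. Note that re.match("ba*", "bab") would still return a match (for the prefix "ba"), because re.match only requires the pattern to fit at the start of the string.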
Quick task
Write a regular expression where the first letter is a "b", followed by zero or more characters (any character except "!"), and followed by a "!".
Example valid strings: b!, ba!, bdsf^123!
Example invalid strings: daaad!, ba1a?2@cd
>>> pattern = "b.*!"
>>> re.match(pattern, "b!")
<re.Match object; span=(0, 2), match='b!'>
>>> re.match(pattern, "b7$!")
<re.Match object; span=(0, 4), match='b7$!'>
>>> re.match(pattern, "ba1a?2@cd") # None
>>> re.match(pattern, "daaad!") # None
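Be careful: the “any character” wildcard . also matches "!", so "b.*!" accepts strings such as "ba!a!" that contain an extra "!". To follow the task exactly, use "b[^!]*!", where [^!] means “any character except !”. Also note that re.match only anchors the pattern at the start of the string, so anything after the matched "!" is not checked.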
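The two patterns can be compared side by side. The sketch below is only an illustration: the names loose, strict and check and the sample strings are assumptions, and re.fullmatch is used so that each string must fit the pattern from start to end.

```python
import re

loose = re.compile(r"b.*!")      # "." matches any character, including "!"
strict = re.compile(r"b[^!]*!")  # [^!] matches any character except "!"

def check(text):
    """Return (loose result, strict result) for a whole-string match."""
    return (loose.fullmatch(text) is not None,
            strict.fullmatch(text) is not None)

samples = ["b!", "ba!", "bdsf^123!", "b!!", "ba!a!", "daaad!", "ba1a?2@cd"]
for text in samples:
    print(text, check(text))
```

Both patterns agree on the example strings from the task. They differ only on strings like "b!!" and "ba!a!", which "b.*!" accepts and "b[^!]*!" rejects.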