This is an archived version of the course. Please find the latest version of the course on the main webpage.

Chapter 3: Quantifiers

Kleene star

Josiah Wang

So far we have only written regular expressions for a fixed number of characters.

What if I want a regular expression that accepts all of the following: b, ba, baa, baaa, baaaa, baaaaa, baaaaaa, baaaaaaa, baaaaaaaa, baaaaaaaaa, baaaaaaaaaa, etc.?

This is easy. Just write "ba*", where * means “zero or more occurrences of the preceding character” (here, the "a").

>>> import re
>>> pattern = "ba*"
>>> re.match(pattern, "b")
<re.Match object; span=(0, 1), match='b'>
>>> re.match(pattern, "ba")
<re.Match object; span=(0, 2), match='ba'>
>>> re.match(pattern, "baaaaaaaaaaaaa")
<re.Match object; span=(0, 14), match='baaaaaaaaaaaaa'>

The star * is called the Kleene star, named after the mathematician Stephen Kleene.
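As a rough sketch of how the star behaves, the snippet below runs "ba*" against a handful of strings. The test strings and the `results` dictionary are illustrative choices, not part of the course material. Note that re.match only needs the pattern to fit a prefix of the string, whereas re.fullmatch requires the whole string to fit.

```python
import re

pattern = "ba*"  # a "b" followed by zero or more "a"s

# Illustrative test strings (not from the course material).
candidates = ["b", "ba", "baaaa", "a", "bab", "bbb"]

results = {}
for text in candidates:
    # fullmatch: does the entire string fit the pattern?
    whole = re.fullmatch(pattern, text) is not None
    # match: which prefix of the string (if any) fits the pattern?
    prefix = re.match(pattern, text)
    results[text] = (whole, prefix.group() if prefix else None)
    print(text, results[text])
```

Strings such as "bab" and "bbb" still produce a match object with re.match, because their prefixes "ba" and "b" fit the pattern. Use re.fullmatch when the whole string must conform.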

Quick task

Write a regular expression that matches a "b", followed by zero or more characters (any character except "!"), followed by a "!".

Example valid strings: b!, ba!, bdsf^123!

Example invalid strings: daaad!, ba1a?2@cd

>>> pattern = "b.*!"
>>> re.match(pattern, "b!")
<re.Match object; span=(0, 2), match='b!'>
>>> re.match(pattern, "b7$!")
<re.Match object; span=(0, 4), match='b7$!'>
>>> re.match(pattern, "ba1a?2@cd")  # None: the string contains no "!"
>>> re.match(pattern, "daaad!")  # None: the string does not start with "b"

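You can also use "b[^!]*!", which states the “any character except "!"” condition explicitly. The simpler "b.*!" gives the same results with re.match on the examples above, but because "." also matches "!", its match can run past the first "!".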
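To make the difference between the two patterns concrete, here is a small comparison. The variable names and the extra test strings "b!!" and "ba!x!" are assumptions added for illustration. re.fullmatch is used so that the whole string has to satisfy the pattern.

```python
import re

dot_pattern = "b.*!"        # "." matches any character, including "!"
strict_pattern = "b[^!]*!"  # "[^!]" matches any character except "!"

# Illustrative strings; the last two contain more than one "!".
for text in ["b!", "bdsf^123!", "b!!", "ba!x!"]:
    dot_ok = re.fullmatch(dot_pattern, text) is not None
    strict_ok = re.fullmatch(strict_pattern, text) is not None
    print(f"{text!r}: dot={dot_ok}, strict={strict_ok}")

# With re.match, both patterns find a match in "ba!x!",
# but the greedy ".*" runs on to the last "!".
dot_span = re.match(dot_pattern, "ba!x!").group()
strict_span = re.match(strict_pattern, "ba!x!").group()
print(dot_span, strict_span)
```

Only the strict pattern rejects "b!!" and "ba!x!" as whole strings, which is the behaviour the task statement asks for.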