This is an archived version of the course. Please find the latest version of the course on the main webpage.

Chapter 3: Quantifiers

Kleene plus

Josiah Wang

We previously accepted "b" (with no "a" after the "b"), in addition to ba, baa, baaa, etc.

What if we want at least one "a" after the "b"? Because a "b" without an "a" is not really a sound a sheep would make!

The solution: just add an "a" to force the first "a"!

>>> import re
>>> pattern = "baa*"
>>> re.match(pattern, "b") # None
>>> re.match(pattern, "ba")
<re.Match object; span=(0, 2), match='ba'>

You can also use + (the Kleene plus), which means "one or more times". It is shorthand for the pattern above: a+ is the same as aa*.

>>> pattern = "ba+"
>>> re.match(pattern, "b") # None
>>> re.match(pattern, "ba")
<re.Match object; span=(0, 2), match='ba'>
>>> re.match(pattern, "baaaaa")
<re.Match object; span=(0, 6), match='baaaaa'>
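
To see that "ba+" and "baa*" really behave the same way, here is a minimal sketch that runs both patterns over a handful of strings and compares what each one matches. The helper name matched_text and the list of sample strings are made up for this illustration; they are not part of the course material.

```python
import re

def matched_text(pattern, text):
    """Return the text matched at the start of `text`, or None if there is no match."""
    m = re.match(pattern, text)
    return m.group() if m else None

# Hypothetical sample strings: some sheep sounds, some not.
samples = ["b", "ba", "baa", "baaaaa", "a", "", "bab"]

for s in samples:
    plus_result = matched_text("ba+", s)    # Kleene plus version
    star_result = matched_text("baa*", s)   # explicit "a" followed by Kleene star
    # The two patterns should always agree with each other.
    print(repr(s), plus_result, star_result, plus_result == star_result)
```

The last column should be True for every sample, since a+ is just a shorter way of writing aa*.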

Quick task

Write a regular expression that matches one or more x, y or z.

Example valid strings: x, y, z, xx, xy, zxxyx, yxyz, xyzxyz

Example invalid strings: <blank>, aaa, ax

One possible solution:

>>> pattern = "[xyz]+"
>>> re.match(pattern, "xy")
<re.Match object; span=(0, 2), match='xy'>
>>> re.match(pattern, "zxxyx")
<re.Match object; span=(0, 5), match='zxxyx'>
>>> re.match(pattern, "") # None
>>> re.match(pattern, "aaa") # None
>>> re.match(pattern, "ax") # None

Other possible solutions: "[x-z]+", "[x-z][x-z]*", "[xyz][xyz]*"
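
As a quick sanity check, the sketch below runs all four candidate patterns against the example valid and invalid strings from the task. It uses re.fullmatch rather than re.match so that the whole string has to match; that choice, and the helper name accepts, are assumptions made for this illustration rather than something the course prescribes.

```python
import re

# The four candidate solutions from the task.
patterns = ["[xyz]+", "[x-z]+", "[x-z][x-z]*", "[xyz][xyz]*"]

# Example strings from the task ("" stands for the blank string).
valid = ["x", "y", "z", "xx", "xy", "zxxyx", "yxyz", "xyzxyz"]
invalid = ["", "aaa", "ax"]

def accepts(pattern, text):
    """Return True if the entire string `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

for p in patterns:
    ok_valid = all(accepts(p, s) for s in valid)          # every valid string accepted
    ok_invalid = not any(accepts(p, s) for s in invalid)  # every invalid string rejected
    print(p, ok_valid, ok_invalid)
```

Note that re.match only anchors at the start of the string, so re.match("[xyz]+", "xa") would still match the leading "x". Using re.fullmatch is one way to reject strings like that.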