Chapter 3: Quantifiers

Kleene plus

Josiah Wang

We previously accepted "b" (with no "a" after the "b"), in addition to ba, baa, baaa, etc.

What if we want at least one "a" after the "b"? After all, a "b" without an "a" is not really a sound a sheep would make!

The solution: just add an "a" to force the first "a"!

>>> import re
>>> pattern = "baa*"
>>> re.match(pattern, "b") # None
>>> re.match(pattern, "ba")
<re.Match object; span=(0, 2), match='ba'>

You can also use + (the Kleene plus), which means “one or more times”. It is shorthand for the pattern above: a+ is the same as aa*.

>>> pattern = "ba+"
>>> re.match(pattern, "b") # None
>>> re.match(pattern, "ba")
<re.Match object; span=(0, 2), match='ba'>
>>> re.match(pattern, "baaaaa")
<re.Match object; span=(0, 6), match='baaaaa'>
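
The claim that a+ is shorthand for aa* can be checked directly. The following is a minimal sketch rather than part of the lesson itself: the helper name same_result and the sample strings are made up for illustration.

```python
import re

def same_result(pattern_a, pattern_b, text):
    """Return True if both patterns match the same substring (or both fail)."""
    m_a = re.match(pattern_a, text)
    m_b = re.match(pattern_b, text)
    # Compare the matched text, treating "no match" as None.
    return (m_a.group() if m_a else None) == (m_b.group() if m_b else None)

# Hypothetical sample strings, chosen to cover matches and non-matches.
samples = ["b", "ba", "baaaaa", "bab", "ab", ""]
results = {text: same_result("ba+", "baa*", text) for text in samples}
print(results)
```

If the two patterns really are equivalent, every value in results should be True.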

Quick task

Write a regular expression that matches one or more x, y or z.

Example valid strings: x, y, z, xx, xy, zxxyx, yxyz, xyzxyz

Example invalid strings: <blank>, aaa, ax

>>> pattern = "[xyz]+" 
>>> re.match(pattern, "y")
<re.Match object; span=(0, 1), match='y'>
>>> re.match(pattern, "zyxy")
<re.Match object; span=(0, 4), match='zyxy'>
>>> re.match(pattern, "") # None
>>> re.match(pattern, "ax") # None
>>> re.match(pattern, "baaaaa") # None
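
One caveat: re.match only anchors the pattern at the start of the string, so a string such as "xa" (not one of the task's examples, used here purely as an illustration) still produces a match on its leading "x". If the whole string must consist of x, y or z, re.fullmatch is one option. A minimal sketch:

```python
import re

pattern = "[xyz]+"

# re.match succeeds as soon as a prefix matches, so the trailing "a" is ignored.
prefix_match = re.match(pattern, "xa")
print(prefix_match.group())  # x

# re.fullmatch requires the entire string to match the pattern.
print(re.fullmatch(pattern, "xa"))      # None
print(re.fullmatch(pattern, "xyzxyz"))  # a Match object covering the whole string
```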

Other possible solutions: "[x-z]+", "[x-z][x-z]*", "[xyz][xyz]*"
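
As a quick sanity check, all of these solutions can be run against the example strings from the task. This is only a sketch: the variable names are chosen for illustration, and the empty string stands in for <blank>.

```python
import re

patterns = ["[xyz]+", "[x-z]+", "[x-z][x-z]*", "[xyz][xyz]*"]
valid = ["x", "y", "z", "xx", "xy", "zxxyx", "yxyz", "xyzxyz"]
invalid = ["", "aaa", "ax"]  # "" represents <blank>

# For each pattern, record whether it accepts every valid example
# and rejects every invalid one.
report = {
    pattern: (
        all(re.match(pattern, s) is not None for s in valid)
        and all(re.match(pattern, s) is None for s in invalid)
    )
    for pattern in patterns
}
print(report)
```

Each pattern should map to True if it behaves the same way on these examples.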