Advanced Lesson 1
Regular Expressions
Chapter 3: Quantifiers
Kleene plus
We previously accepted "b" (with no "a" after the "b"), in addition to ba, baa, baaa, etc.
What if we want at least one "a" after the "b"? Because a "b" without an "a" is not really a sound a sheep would make!
The solution: put a literal "a" in front of the "a*", so that the first "a" is required!
>>> pattern = "baa*"
>>> re.match(pattern, "b") # None
>>> re.match(pattern, "ba")
<re.Match object; span=(0, 2), match='ba'>
You can also use a + (Kleene Plus), which represents “one or more times”. This is a shorthand notation for the above! So a+ is the same as aa*.
>>> pattern = "ba+"
>>> re.match(pattern, "b") # None
>>> re.match(pattern, "ba")
<re.Match object; span=(0, 2), match='ba'>
>>> re.match(pattern, "baaaaa")
<re.Match object; span=(0, 6), match='baaaaa'>
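To see that a+ really behaves like aa*, here is a minimal sketch that runs both patterns over the same inputs and compares the results. The sample strings and the helper name matches are made up for illustration; they are not part of the lesson.

```python
import re

# Hypothetical sample inputs, chosen only for illustration.
samples = ["b", "ba", "baa", "baaaaa", "a", "ab", ""]

def matches(pattern, text):
    """Return True if the pattern matches at the start of text."""
    return re.match(pattern, text) is not None

# Run the long form and the Kleene plus form over the same inputs.
results_star = [matches("baa*", s) for s in samples]
results_plus = [matches("ba+", s) for s in samples]

for s, star, plus in zip(samples, results_star, results_plus):
    print(repr(s), star, plus)
```

Both lists should come out identical: only the strings that start with "b" followed by at least one "a" are accepted.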
Quick task
Write a regular expression that matches one or more x, y or z.
Example valid strings: x, y, z, xx, xy, zxxyx, yxyz, xyzxyz
Example invalid strings: <blank>, aaa, ax
>>> pattern = "[xyz]+"
>>> re.match(pattern, "x")
<re.Match object; span=(0, 1), match='x'>
>>> re.match(pattern, "zxxyx")
<re.Match object; span=(0, 5), match='zxxyx'>
>>> re.match(pattern, "") # None
>>> re.match(pattern, "aaa") # None
>>> re.match(pattern, "ax") # None
Other possible solutions: "[x-z]+", "[x-z][x-z]*", "[xyz][xyz]*"
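The candidate solutions can be checked against the example strings from the task with a short script. This is a sketch, and it makes one choice that goes beyond the lesson: it uses re.fullmatch instead of re.match. re.match only requires a match at the start of the string, so a string such as "xa" (a hypothetical extra test case) would be accepted because its first character matches.

```python
import re

solutions = ["[xyz]+", "[x-z]+", "[x-z][x-z]*", "[xyz][xyz]*"]
valid = ["x", "y", "z", "xx", "xy", "zxxyx", "yxyz", "xyzxyz"]
invalid = ["", "aaa", "ax"]

def accepts(pattern, text):
    """Return True only if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# Every solution should accept all valid strings and reject all invalid ones.
all_agree = all(
    all(accepts(p, s) for s in valid) and not any(accepts(p, s) for s in invalid)
    for p in solutions
)
print(all_agree)

# re.match stops after the leading "x", while re.fullmatch rejects "xa".
prefix_only = re.match("[xyz]+", "xa")
whole_string = re.fullmatch("[xyz]+", "xa")
print(prefix_only.group(), whole_string)
```

If the patterns are meant to validate entire strings, re.fullmatch (or anchoring the pattern with $) is the safer choice.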