Chapter 3: Quantifiers

Bounded repetition

Josiah Wang

What if you want to specify that a sheep must "baa" with exactly 10 "a"s? You could write out "baaaaaaaaaa", but the more convenient notation "ba{10}" specifies that "a" must be repeated exactly 10 times.

>>> import re
>>> pattern = "ba{10}"
>>> re.match(pattern, "baaaaaaaaaa")
<re.Match object; span=(0, 11), match='baaaaaaaaaa'>
>>> re.match(pattern, "baaaaaa") # None (only six 'a's)

You can also specify a range: "a{2,4}" matches "aa", "aaa", and "aaaa".

>>> pattern = "ba{2,4}"
>>> re.match(pattern, "baa")
<re.Match object; span=(0, 3), match='baa'>
>>> re.match(pattern, "baaaa")
<re.Match object; span=(0, 5), match='baaaa'>
>>> re.match(pattern, "baaaaaa") # Only the first four 'a's are matched
<re.Match object; span=(0, 5), match='baaaa'>
>>> re.match(pattern, "ba") # None (only one 'a')

Don’t have an upper limit? Just specify the minimum: "a{2,}" matches at least two "a"s. Similarly, "a{,4}" matches at most four "a"s (including none at all).
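
The open-ended forms can be tried out in the same way as the earlier examples. The following is a minimal sketch; the variable names (at_least_two, at_most_four, m1 to m4) are only illustrative and do not come from the original text.

```python
import re

# Illustrative names, not part of the original tutorial.
at_least_two = re.compile(r"ba{2,}")   # 'b' followed by two or more 'a's
at_most_four = re.compile(r"ba{,4}")   # 'b' followed by zero to four 'a's

m1 = at_least_two.match("baaaaaaa")    # all seven 'a's are consumed
m2 = at_least_two.match("ba")          # only one 'a', so no match
m3 = at_most_four.match("b")           # zero 'a's still counts as a match
m4 = at_most_four.match("baaaaaa")     # matching stops after the fourth 'a'

print(m1.group())  # baaaaaaa
print(m2)          # None
print(m3.group())  # b
print(m4.group())  # baaaa
```

Note that the lower bound of "a{,4}" is zero, so the pattern "ba{,4}" also matches a lone "b".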

Quick task

Write a regular expression that matches 3 to 7 digits, so that "100", "0234", and "5394212" are all valid strings.

>>> pattern = "[0-9]{3,7}"
>>> re.match(pattern, "5394212")
<re.Match object; span=(0, 7), match='5394212'>
>>> re.match(pattern, "12345678") # Note only up to 7 digits are matched
<re.Match object; span=(0, 7), match='1234567'>
>>> re.match(pattern, "12") # None
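
As the "12345678" case shows, re.match only requires the pattern to match at the start of the string, so an eight-digit string still yields a seven-digit match. If the whole string should be rejected when it has too many digits, one option is re.fullmatch, which succeeds only when the pattern covers the entire string. The sketch below assumes that stricter behaviour is what you want; the helper name is_valid is made up for illustration.

```python
import re

pattern = r"[0-9]{3,7}"

def is_valid(text):
    """Hypothetical helper: True only if the entire string is 3 to 7 digits."""
    return re.fullmatch(pattern, text) is not None

print(is_valid("100"))       # True
print(is_valid("5394212"))   # True
print(is_valid("12345678"))  # False (eight digits is one too many)
print(is_valid("12"))        # False (too few digits)
```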