Advanced Lesson 1
Regular Expressions
Chapter 2: Regular expression basics
Range
Now let’s say you want to match these six strings: lecture1, lecture2, lecture3, lecture4, lecture5, and lecture6.
You can define your set of valid numbers as we discussed earlier: lecture[123456].
An easier way is to define a range of valid values: lecture[1-6].
You can also define multiple ranges in one set. For example, [A-Za-z0-9_] matches any single uppercase letter, lowercase letter, digit, or underscore.
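The two ideas above can be checked with a short sketch. It assumes Python's standard re module and uses re.fullmatch so that the whole string has to fit the pattern; the variable names and the sample characters are illustrative only.

```python
import re

# A range is shorthand for listing every character explicitly.
explicit = re.compile(r"lecture[123456]")
ranged = re.compile(r"lecture[1-6]")

names = [f"lecture{n}" for n in range(10)]  # lecture0 .. lecture9
explicit_hits = [n for n in names if explicit.fullmatch(n)]
ranged_hits = [n for n in names if ranged.fullmatch(n)]
print(explicit_hits == ranged_hits)  # both select lecture1 .. lecture6

# Several ranges and a literal underscore combined in one set.
word_char = re.compile(r"[A-Za-z0-9_]")
samples = ["G", "g", "7", "_", "-", " "]
results = {s: bool(word_char.fullmatch(s)) for s in samples}
print(results)
```

The hyphen and the space are rejected because neither falls inside any of the three ranges and neither is the underscore.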
As before, you can match any character except those in a defined range by placing a caret ^ at the start of the set. For example, [^n-p]ot will not match not, oot, or pot.
You can combine ranges with individual characters too. For example, [^n-pbd]ot will not match not, oot, pot, bot, or dot.
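As a quick check of the negated set, here is a minimal sketch that assumes Python's standard re module. The words hot, lot and Not are extra samples added for illustration; they are not part of the lesson's examples.

```python
import re

# Negated set: any character except n, o, p, b or d, followed by "ot".
negated = re.compile(r"[^n-pbd]ot")

words = ["not", "oot", "pot", "bot", "dot", "hot", "lot", "Not"]
matched = [w for w in words if negated.match(w)]
print(matched)  # ['hot', 'lot', 'Not']
```

Not is accepted because sets are case-sensitive: the range n-p covers only the lowercase letters n, o and p.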
>>> import re
>>> pattern = "[A-Z][a-z]n"
>>> re.match(pattern, "Can")
<re.Match object; span=(0, 3), match='Can'>
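>>> re.match(pattern, "can") # None: the first character is not uppercase
>>> re.match(pattern, "Cap") # None: the third character is not n
>>> re.match(pattern, "CAn") # None: the second character is not lowercase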
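Note that re.match only requires the pattern to match at the start of the string, so a longer string that begins with a valid three-character prefix also matches. The sketch below assumes you want the entire string to fit the pattern and uses re.fullmatch for that; the sample string Canada is an illustrative assumption, not one of the lesson's examples.

```python
import re

pattern = "[A-Z][a-z]n"

# re.match anchors only at the beginning, so the prefix "Can" is enough.
prefix_match = re.match(pattern, "Canada")
print(prefix_match.group())  # Can

# re.fullmatch requires the whole string to fit the pattern.
whole_match = re.fullmatch(pattern, "Canada")
print(whole_match)  # None

exact_match = re.fullmatch(pattern, "Can")
print(exact_match.group())  # Can
```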
Quick task
Write a regular expression where the first character must be a lowercase letter, the second character must be a lowercase vowel, and the third character can be anything except a digit.
Example valid strings: boy, hi!, caT
Example invalid strings: HEY, art, co2
>>> pattern = "[a-z][aeiou][^0-9]"
>>> re.match(pattern, "hi!")
<re.Match object; span=(0, 3), match='hi!'>
>>> re.match(pattern, "co2") # None: the third character is a digit
>>> re.match(pattern, "art") # None: the second character is not a vowel
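To check the quick-task pattern against every example string listed in the task at once, one possible sketch is the following. It assumes Python's standard re module; the variable names are illustrative.

```python
import re

# Lowercase letter, then lowercase vowel, then any character that is not a digit.
pattern = "[a-z][aeiou][^0-9]"

valid = ["boy", "hi!", "caT"]
invalid = ["HEY", "art", "co2"]

valid_results = [re.match(pattern, s) is not None for s in valid]
invalid_results = [re.match(pattern, s) is not None for s in invalid]

print(valid_results)    # [True, True, True]
print(invalid_results)  # [False, False, False]
```

HEY fails on its first character, art on its second, and co2 on its third, so each invalid example exercises a different part of the pattern.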