Advanced Lesson 1
Regular Expressions
Chapter 2: Regular expression basics
Do not match a set of characters
We will continue from the quick task on the previous page (you did try it, didn't you?).
We now want to match any three-letter word ending with "an" (can, fan, man, nan) but NOT dan, ran, and pan. How would you do this?
You do it like this: [^drp]an. When a caret ^ is the first character after the opening square bracket [, the set is negated: it matches any single character EXCEPT the ones defined inside the set.
>>> import re
>>> pattern = "[^drp]an"
>>> re.match(pattern, "can")
<re.Match object; span=(0, 3), match='can'>
>>> re.match(pattern, "pan") # None
>>> re.match(pattern, "ran") # None
>>> re.match(pattern, "dan") # None
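Two details of the session above are worth a quick sketch. A negated set excludes only the listed characters, so [^drp] also accepts digits and punctuation, and re.match anchors only at the start of the string, so it would accept a longer word such as "canal". The following is a minimal sketch, assuming you want the whole string to be exactly three characters; the helper name matches_whole_word and the extra test words are illustrative and not part of the lesson.

```python
import re

# Compile once; the raw string avoids surprises with backslashes in later patterns.
PATTERN = re.compile(r"[^drp]an")


def matches_whole_word(word):
    """Return True only if the ENTIRE string matches the pattern (hypothetical helper)."""
    return PATTERN.fullmatch(word) is not None


# "canal" and "1an" are extra examples added for this sketch.
words = ["can", "fan", "man", "nan", "dan", "ran", "pan", "canal", "1an"]
accepted = [w for w in words if matches_whole_word(w)]
print(accepted)  # ['can', 'fan', 'man', 'nan', '1an']
```

Note that "1an" is accepted because the digit 1 is not one of d, r, or p, while "canal" is rejected only because re.fullmatch requires the pattern to cover the whole string.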
Quick task
Write a regular expression that matches any three-letter word that does NOT start with a lowercase vowel.
Example valid strings: hey, cod, boy, you
Example invalid strings: ant, ear, ice, oop, use
>>> pattern = "[^aeiou].."
>>> re.match(pattern, "boy")
<re.Match object; span=(0, 3), match='boy'>
>>> re.match(pattern, "ear") # None
>>> re.match(pattern, "ice") # None
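To check the answer against every example string at once, something like the sketch below could be used. It assumes re.fullmatch is the right call for "exactly three characters"; the variable names are illustrative, and the capitalised word "Ant" is an extra case added here to show that only lowercase vowels are excluded.

```python
import re

pattern = re.compile(r"[^aeiou]..")

valid = ["hey", "cod", "boy", "you"]
invalid = ["ant", "ear", "ice", "oop", "use"]

# fullmatch returns a Match object or None; bool() turns that into True/False.
valid_results = {word: bool(pattern.fullmatch(word)) for word in valid}
invalid_results = {word: bool(pattern.fullmatch(word)) for word in invalid}

print(valid_results)    # every value is True
print(invalid_results)  # every value is False

# Uppercase vowels are not in the set, so a capitalised word still matches.
capitalised_matches = bool(pattern.fullmatch("Ant"))
print(capitalised_matches)  # True
```

If uppercase vowels should be rejected too, one option is to list them in the set as well, for example [^aeiouAEIOU].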