Metacharacters and special characters
As a reminder, here are the special metacharacters for regular expressions in Python:
.: Match any character except a newline
^: Match the start of a string, or “not” if used as the first character inside [ ]
$: Match the end of a string
?: Match zero or one repetitions
*: Match zero or more repetitions
+: Match one or more repetitions
[]: Match a set of characters
(): Grouping
|: Or
{m}: Match exactly m repetitions
{m,n}: Match m to n repetitions. Omitting n (that is, {m,}) removes the upper bound.
\: Escape character. Use this to represent any of the metacharacters above (e.g. “\?” if you want to match a question mark)
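As a quick illustration of the metacharacters listed above, here is a minimal sketch using Python's standard re module. The sample strings and variable names are made up for demonstration, and the patterns are written as raw strings (r"...") so that backslashes reach the regex engine unchanged.

```python
import re

# "." matches any single character except a newline
dot = re.findall(r"c.t", "cat cot c\nt cut")

# "^" and "$" anchor the pattern to the start and end of the string
anchored = re.search(r"^abc$", "abc") is not None
unanchored = re.search(r"^abc$", "xabc") is not None

# "?", "*" and "+" repeat the element just before them
optional = re.findall(r"colou?r", "color colour")
zero_or_more = re.findall(r"ab*", "a ab abbb")
one_or_more = re.findall(r"ab+", "a ab abbb")

# "[]" matches one character from a set; a leading "^" negates the set
vowels = re.findall(r"[aeiou]", "regex")
non_vowels = re.findall(r"[^aeiou]", "regex")

# "|" chooses between alternatives; "()" groups, so "+" repeats the whole "ab"
either = re.findall(r"cat|dog", "cat dog bird")
grouped = re.search(r"(ab)+", "ababab!").group(0)

# "{m}" and "{m,n}" bound the repetition count; "{m,}" has no upper bound
exactly_two = re.fullmatch(r"a{2}", "aa") is not None
two_to_three = re.fullmatch(r"a{2,3}", "aaaa") is not None
at_least_two = re.fullmatch(r"a{2,}", "aaaaaa") is not None

# "\" escapes a metacharacter so that it matches literally
question_marks = re.findall(r"\?", "what? why?")

print(dot, optional, zero_or_more, one_or_more)
```

Note that re.fullmatch requires the whole string to match, which is why "aaaa" fails against a{2,3} while "aaaaaa" succeeds against a{2,}.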
Here are also some special characters that can be used as shorthand for the corresponding regular expressions:
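\d == [0-9] (“digits”)
\D == [^0-9] (“non-digits”)
\s == [ \t\n\r\f\v] (“whitespace”)
\S == [^ \t\n\r\f\v] (“non-whitespace”)
\w == [a-zA-Z0-9_] (“word”)
\W == [^a-zA-Z0-9_] (“non-word”)
These equivalences are exact for ASCII matching (for example with the re.ASCII flag). By default, Python 3 string patterns also match the corresponding Unicode characters, such as non-Latin digits and letters.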
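The shorthand classes above can be tried out with a short sketch like the following. The sample text and variable names are invented for illustration; the last two lines check the equivalence with explicit character sets under the re.ASCII flag.

```python
import re

# Hypothetical sample text containing digits, an underscore, and a tab
text = "Room 42, floor_3\tis open"

digits = re.findall(r"\d+", text)        # runs of digits
non_digits = re.findall(r"\D+", text)    # runs of anything but digits
spaces = re.findall(r"\s", text)         # individual whitespace characters
non_spaces = re.findall(r"\S+", text)    # runs of non-whitespace
words = re.findall(r"\w+", text)         # runs of letters, digits, underscore
non_words = re.findall(r"\W+", text)     # runs of everything else

# With re.ASCII, each shorthand behaves exactly like its explicit set
same_digits = re.findall(r"\d", text, re.ASCII) == re.findall(r"[0-9]", text)
same_words = re.findall(r"\w+", text, re.ASCII) == re.findall(r"[a-zA-Z0-9_]+", text)

print(digits, words)
```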
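Also remember the curse of the backslash \. Both Python string literals and regular expressions use the backslash as an escape character, so a backslash meant for the regex engine must first survive Python's own string parsing. To put a single backslash into an ordinary string, you need to escape it with another backslash: "\\". For example, to pass the regex escape \n (a backslash followed by n, which matches a newline) to the re module, write it as "\\n". To match one literal backslash, the regex itself must contain two backslashes \\, so in an ordinary string you need four: "\\\\". Alternatively, you can use a Python raw string literal: r"\n" is equivalent to "\\n", and r"\\" is equivalent to "\\\\".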
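The backslash rules above can be checked directly in Python. This is a small sketch; the variable names and the sample path string are illustrative only.

```python
import re

# An ordinary string "\\n" and a raw string r"\n" are the same two characters
same_literal = "\\n" == r"\n"
escaped_length = len("\\n")   # backslash + "n"
newline_length = len("\n")    # a single newline character

# Both spellings give the regex engine \n, which matches a newline
newline_ordinary = re.findall("\\n", "a\nb")
newline_raw = re.findall(r"\n", "a\nb")

# Matching one literal backslash: the regex is \\, written as "\\\\" or r"\\"
path = "C:\\temp"             # the string C:\temp, containing one backslash
backslash_ordinary = re.findall("\\\\", path)
backslash_raw = re.findall(r"\\", path)

print(same_literal, escaped_length, newline_length)
```

Raw strings are usually the more readable choice for patterns, since what you type is what the regex engine receives.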