Advanced Lesson 1
Regular Expressions
Chapter 7: Groups
Substitutions
The true power of backreferences can be seen when you need to find and replace a string.
Let’s say you want to turn the section headers in your LaTeX document into chapters (perhaps you are converting your paper into a book). You can replace every instance of section with chapter while keeping the original header titles by using backreferences. (We’re omitting the backslashes from LaTeX for simplicity.)
The function re.sub(pattern, replacement, string) or the method pattern.sub(replacement, string) of a Pattern object can be used for this. It is similar to the str.replace() method, except that you can also search for substrings using regular expressions.
>>> import re
>>> pattern = r"section{([^}]*)}"
>>> replacement = r"chapter{\1}"
>>> string = "section{Introduction} section{Literature review}"
>>> re.sub(pattern, replacement, string)
'chapter{Introduction} chapter{Literature review}'
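The two call styles, and the difference from str.replace(), can be compared side by side. This is a minimal sketch: the sample string with "Cross-section analysis" and the variable names (section_re, via_function, via_method, via_replace) are illustrative assumptions, not part of the original example.

```python
import re

pattern = r"section{([^}]*)}"
replacement = r"chapter{\1}"
# Hypothetical input: the second title itself contains the word "section".
string = "section{Introduction} section{Cross-section analysis}"

# Module-level function: the pattern is passed as a string.
via_function = re.sub(pattern, replacement, string)

# Method of a compiled Pattern object: compile once, then reuse it.
section_re = re.compile(pattern)
via_method = section_re.sub(replacement, string)

# str.replace() swaps every literal occurrence, including the one inside a title.
via_replace = string.replace("section", "chapter")

print(via_function)  # chapter{Introduction} chapter{Cross-section analysis}
print(via_method == via_function)  # True
print(via_replace)   # chapter{Introduction} chapter{Cross-chapter analysis}
```

Because the pattern matches a whole header, up to and including its closing brace, the word "section" inside a title is carried over untouched through the backreference, whereas str.replace() rewrites it as well.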
As expected, you can also use named groups for this. You use \g<name> to refer to the named group in the original pattern. \g<1> works too (and is equivalent to \1).
>>> pattern = r"section{(?P<title>[^}]*)}"
>>> string = "section{Introduction} section{Literature review}"
>>> re.sub(pattern, r"chapter{\1}", string)
'chapter{Introduction} chapter{Literature review}'
>>> re.sub(pattern, r"chapter{\g<1>}", string)
'chapter{Introduction} chapter{Literature review}'
>>> re.sub(pattern, r"chapter{\g<title>}", string)
'chapter{Introduction} chapter{Literature review}'
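One case where \g<1> is more than a synonym for \1 is when the group reference is immediately followed by a digit. The sketch below uses a made-up pattern and input (v1, v2) purely for illustration; it is not taken from the LaTeX example above.

```python
import re

# Hypothetical input and pattern, chosen only to show the ambiguity.
text = "v1 v2"
pattern = r"v(\d)"

# \g<1>0 means: the contents of group 1, followed by a literal "0".
appended = re.sub(pattern, r"v\g<1>0", text)
print(appended)  # v10 v20

# \10 is read as a reference to group 10, which this pattern does not define,
# so the replacement template is rejected.
try:
    re.sub(pattern, r"v\10", text)
    ambiguous_failed = False
except re.error:
    ambiguous_failed = True
print(ambiguous_failed)  # True
```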
There is also a re.subn() function, which works like re.sub() but returns a tuple containing the new string and the number of substitutions made.
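As a short sketch, here is re.subn() applied to the same pattern and string as above; the names new_string and count are illustrative.

```python
import re

pattern = r"section{(?P<title>[^}]*)}"
string = "section{Introduction} section{Literature review}"

# subn() returns a (new_string, number_of_substitutions) tuple.
new_string, count = re.subn(pattern, r"chapter{\g<title>}", string)

print(new_string)  # chapter{Introduction} chapter{Literature review}
print(count)       # 2
```

The count is handy for checking whether anything was replaced at all: a count of 0 means the pattern never matched and the string was returned unchanged.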