Advanced Lesson 1
Regular Expressions
Chapter 5: Exercises
Task 3: Building parsers
One practical application of regular expressions is checking for syntax errors in a programming language. For example, you can check whether a Python variable name is valid.
Let’s try this for a bit!
Write regular expressions for each of the following. Test your regular expressions using re.match() (or an online regular expression tester). You can make any further assumptions if something is not explicitly specified as allowed or disallowed.
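One thing to watch for when testing: re.match() only anchors the pattern at the start of the string, so a pattern can "pass" on an invalid input by matching just a prefix of it. The sketch below shows the difference and wraps re.fullmatch() in a small test helper. The helper name check and the lowercase-only placeholder pattern are assumptions made for illustration; they are not part of the exercise.

```python
import re

def check(pattern, valid, invalid):
    """Return the test strings that the pattern classifies wrongly.

    Hypothetical helper: a string counts as accepted only if the
    whole string matches (re.fullmatch), not just a prefix of it.
    """
    compiled = re.compile(pattern)
    wrong = []
    for text in valid:
        if compiled.fullmatch(text) is None:
            wrong.append(text)  # should have matched, but did not
    for text in invalid:
        if compiled.fullmatch(text) is not None:
            wrong.append(text)  # should have been rejected, but matched
    return wrong

# Placeholder pattern (lowercase words only), not an answer to any question.
placeholder = r"[a-z]+"

# re.match() accepts "haha!" because the prefix "haha" matches.
prefix_only = re.match(placeholder, "haha!") is not None

# re.fullmatch() rejects it because "!" is left over.
whole_string = re.fullmatch(placeholder, "haha!") is not None

mistakes = check(placeholder, valid=["abc", "for"], invalid=["haha!", "", "123"])
print(prefix_only, whole_string, mistakes)
```

If you prefer to keep using re.match(), ending the pattern with an end-of-string anchor has a similar effect.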
Question 1: Python identifiers
Write a regular expression representing valid Python identifiers. Remember that Python identifiers can consist of uppercase and lowercase letters, digits, or underscores (_), and must start with a letter or an underscore. For simplicity, Python keywords like if are allowed.
- Valid: my_variable123, _12is_thisvalid, for, if
- Invalid: 123abc, 1, haha!, and the empty string
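If you want to compare your answer against something, here is one possible sketch, not the only correct one. It assumes that "letters" means ASCII letters only (real Python also accepts many Unicode letters), and the names IDENTIFIER and is_identifier are made up for this example.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, or underscores (zero or more).
# Assumption: ASCII letters only, as the exercise describes.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_identifier(text):
    """Return True if the whole string is a valid identifier."""
    return IDENTIFIER.fullmatch(text) is not None

valid = ["my_variable123", "_12is_thisvalid", "for", "if"]
invalid = ["123abc", "1", "haha!", ""]

for text in valid + invalid:
    print(repr(text), is_identifier(text))
```

Because the first character class is required, the empty string is rejected without any special handling.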
Question 2: Python assignment statements
Write a regular expression representing valid Python assignment statements, i.e. LHS = RHS. For simplicity, assume that the LHS can be any valid identifier or keyword (as above). The RHS can be either an integer or a float (positive or negative). The spaces before and after = are optional.
- Valid: my_variable123 = 5, if= 1.23, x=2., y =0.678, abc = -2.1
- Invalid: 1 = 4, my_var = your_var, var, x + 5
Hint: It can get a bit messy and hacky! It’s fine - that’s how regular expressions are!
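One way to keep the mess manageable is to build the pattern from smaller named pieces. The sketch below is one possible approach under a few assumptions the exercise leaves open: an optional leading + is accepted as well as -, a float may omit the digits before the point (such as .5), and "spaces" means literal space characters. The names IDENT, NUMBER, ASSIGNMENT, and is_assignment are hypothetical.

```python
import re

# Identifier as in Question 1 (ASCII letters assumed).
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"

# Number: optional sign, then either digits with an optional fractional
# part ("5", "2.", "1.23") or a leading point followed by digits (".5").
# Accepting "+" and ".5" is an assumption, not stated in the exercise.
NUMBER = r"[+-]?(?:\d+\.?\d*|\.\d+)"

# Optional spaces on either side of the equals sign.
ASSIGNMENT = re.compile(rf"{IDENT} *= *{NUMBER}")

def is_assignment(text):
    """Return True if the whole string is a valid assignment statement."""
    return ASSIGNMENT.fullmatch(text) is not None

valid = ["my_variable123 = 5", "if= 1.23", "x=2.", "y =0.678", "abc = -2.1"]
invalid = ["1 = 4", "my_var = your_var", "var", "x + 5"]

for text in valid + invalid:
    print(repr(text), is_assignment(text))
```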
Question 3: Python comparison expressions
Write a regular expression representing valid Python comparison expressions, e.g. LHS == RHS. For simplicity, assume that the valid operators are ==, <=, >=, <, >, and !=. Also assume that both LHS and RHS can be either a Python identifier (keywords included) or a positive integer. Again, the spaces before and after the operator are optional.
- Valid: 12 == 14, x >=2, for< if, 15!=age
- Invalid: x = 2, av<= -5, 123abc > abc123, hello, 5+1
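As with the assignment statements, composing the pattern from named parts keeps it readable. The following is a sketch of one possible answer, assuming that a "positive integer" is any run of digits (so 0 and leading zeros are accepted, which you may want to tighten). The names IDENT, OPERAND, OPERATOR, COMPARISON, and is_comparison are made up for this example.

```python
import re

# Identifier as in Question 1 (ASCII letters assumed).
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"

# An operand is an identifier or a run of digits.
# Assumption: any digit run counts as a positive integer.
OPERAND = rf"(?:{IDENT}|\d+)"

# The six comparison operators; a lone "=" is not among them.
OPERATOR = r"(?:==|<=|>=|!=|<|>)"

# Optional spaces on either side of the operator.
COMPARISON = re.compile(rf"{OPERAND} *{OPERATOR} *{OPERAND}")

def is_comparison(text):
    """Return True if the whole string is a valid comparison expression."""
    return COMPARISON.fullmatch(text) is not None

valid = ["12 == 14", "x >=2", "for< if", "15!=age"]
invalid = ["x = 2", "av<= -5", "123abc > abc123", "hello", "5+1"]

for text in valid + invalid:
    print(repr(text), is_comparison(text))
```

Note that 123abc > abc123 is rejected because neither alternative of the operand can consume all of 123abc: the identifier branch fails on the leading digit, and the digit branch stops before abc, where no operator follows.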