This is an archived version of the course. Please find the latest version of the course on the main webpage.

Chapter 5: Application of dictionaries

Translation lookup tables

Josiah Wang

I think you have had enough of employee databases to last you a lifetime! 😈

Hopefully you are not tired of dictionaries yet! Let us try one more fun task before we move on to do something else.

Since we are dealing with dictionaries, we might as well work on a translation task!

Download french.txt, and save it to the same directory as where you plan to write your script.

Each line of the file starts with an English word, followed by one or more possible French translations of the English word. All words are separated by a tab (represented by \t in Python).

a   un  une
and et
ball    balle   ballon  boule
bank    banque  bancaire    rive
boy garçon  fils
...
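To make the file format concrete, here is a minimal sketch of how a single line could be taken apart. The variable names (line, words, english, translations) are only illustrative, and the line is typed in by hand rather than read from french.txt.

```python
# One line as it might be read from the file: tab-separated, with a trailing newline
line = "ball\tballe\tballon\tboule\n"

# Remove the trailing newline, then split the line on tab characters
words = line.strip().split("\t")

english = words[0]        # the first item is the English word
translations = words[1:]  # everything after it is a French translation

print(english)       # ball
print(translations)  # ['balle', 'ballon', 'boule']
```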

Task 1: Load the translations

Write a function load_translations() that takes as input a str that specifies the name of the file from which to read. It should return a dict where the keys are the English words, and each value is a list of the possible French translations of that English word. The following is an example of the expected return value:

{"a": ["un", "une"],
 "and" : ["et"],
 "ball" : ["balle", "ballon", "boule"],
 "bank" : ["banque", "bancaire", "rive"],
 "boy" : ["garçon", "fils"],
 ...
}
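If you get stuck, the core of the task can be sketched as below. This is only one possible approach, not a model answer: build_translations() is a hypothetical helper that works on a list of lines instead of a filename, and skipping lines with no translation is an assumption that the task does not mention.

```python
def build_translations(lines):
    """Turn tab-separated lines into a dict of English word -> list of French words.

    Hypothetical helper for illustration; it is not part of the task specification.
    """
    translations = {}
    for line in lines:
        words = line.strip().split("\t")
        if len(words) < 2:
            # Assumption: ignore blank lines or lines without any translation
            continue
        translations[words[0]] = words[1:]
    return translations


# A few hand-written lines in the same format as the file
sample_lines = ["a\tun\tune\n", "and\tet\n", "boy\tgarçon\tfils\n", "\n"]
result = build_translations(sample_lines)
print(result["boy"])  # ['garçon', 'fils']
```

Your load_translations() would still need to open the file and obtain the lines itself.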

I will also use this opportunity to present yet another way to read files. Instead of reading a file line by line, you can get Python to read all lines into a list in one go with the file object's .readlines() method.

with open("french.txt") as textfile:
    all_lines = textfile.readlines()
    for line in all_lines:
        # Do something with line
        stripped_line = line.strip()

Only do this if your file is not too big - otherwise it will take up quite a lot of memory to 'remember' the content of the whole file! Sometimes it may be better to just process a file line by line without storing unnecessary data (Python then only needs to remember and process one line at a time). It is a matter of balancing speed against memory efficiency, depending on your goal. It also depends on whether you actually need the information from the whole file before you can do something with each line!
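For comparison, here is a minimal sketch of the line-by-line style, where the file object itself is looped over and no list of all lines is kept. To keep the example runnable on its own it writes a tiny stand-in file first. The contents of that file and the names sample, filename and english_words are made up for illustration and are not the real french.txt.

```python
import os
import tempfile

# Create a small stand-in file so that the example runs on its own
sample = "a\tun\tune\nand\tet\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write(sample)
    filename = tmp.name

english_words = []
with open(filename, encoding="utf-8") as textfile:
    # Looping over the file object gives one line at a time,
    # so the whole file is never stored in a list
    for line in textfile:
        stripped_line = line.strip()
        english_words.append(stripped_line.split("\t")[0])

os.remove(filename)   # tidy up the stand-in file
print(english_words)  # ['a', 'and']
```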

Task 2: Translate

Using the translation dictionary from load_translations(), now write a function translate() that takes two input arguments: (i) an English sentence (str); (ii) the translation dictionary from task 1. It should return the translation of the given sentence based on the given translation dictionary.

For simplicity, you can assume that all words in the input sentence can be found in the dictionary. Just implement a simple algorithm that goes through each word and picks the first translation from the possible translations of each word. Crude, but we are not aiming to write the best translator here!
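The algorithm described above could be sketched roughly as follows. Treat it as one possible approach and not the official solution: it assumes that words are separated by whitespace and, as stated above, that every word is in the dictionary (a missing word would raise a KeyError). The small_dict below is hand-written from the example entries shown earlier.

```python
def translate(sentence, translation_dict):
    """Translate word by word, always taking the first listed translation."""
    translated_words = []
    for word in sentence.split():
        # translation_dict[word] is a list; index 0 is its first translation
        translated_words.append(translation_dict[word][0])
    return " ".join(translated_words)


# A tiny hand-written dictionary, just for trying the function out
small_dict = {
    "a": ["un", "une"],
    "and": ["et"],
    "ball": ["balle", "ballon", "boule"],
    "boy": ["garçon", "fils"],
}
print(translate("a boy and a ball", small_dict))  # un garçon et un balle
```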

Sample usage

>>> translation_dict = load_translations("french.txt")
>>> translation = translate("a boy with a ball in the park", translation_dict)
>>> print(translation)
un garçon avec un balle dans le parc