This is an archived version of the course and is no longer updated. Please find the latest version of the course on the main webpage.

Collections

Now, let us look at a package from the Python Standard Library.

Here is a useful package: collections. It provides specialized container types that extend what you can do with built-in collections like list and dict.

If you look near the top of the documentation page, you may notice that the source code link points to Lib/collections/__init__.py. A directory that contains an __init__.py file is a package, so we know collections is a package.
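You can also check this from Python itself. The following is a small sketch, assuming a standard CPython installation: a package is a module that carries a __path__ attribute, and its __file__ normally points to the __init__.py file. The variable name is_package is only for illustration.

```python
import collections

# A package is a module that has a __path__ attribute.
is_package = hasattr(collections, "__path__")
print(is_package)

# On a standard CPython install, this is the __init__.py file
# that the documentation links to (the exact path varies by system).
print(collections.__file__)
```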

collections.Counter

Let us take a look at a potentially useful class in the package called Counter.

Counter is a subclass of dict, and provides some convenient features for you to count objects.

Here is a piece of code that counts the occurrences of characters in a list.

>>> from collections import Counter
>>> counter = Counter(["a","b","c","b","b","d","a","b","e","a","e","b"])
>>> print(counter)
Counter({'b': 5, 'a': 3, 'e': 2, 'c': 1, 'd': 1})
>>> counter["b"]
5
>>> counter["x"]
0

We will leave you to work out all the other handy features of this class by reading the documentation. This includes addition, subtraction, union and intersection operations between two Counters.
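As a starting point, here is a short sketch of those four operations. The two example Counters and the variable names are made up for illustration; the documentation remains the reference for the full behaviour.

```python
from collections import Counter

first = Counter(["a", "b", "b", "c"])    # a: 1, b: 2, c: 1
second = Counter(["b", "c", "c", "d"])   # b: 1, c: 2, d: 1

added = first + second          # counts are summed per key
subtracted = first - second     # counts are subtracted; results of 0 or less are dropped
union = first | second          # the larger count is kept per key
intersection = first & second   # the smaller count is kept per key

print(added)
print(subtracted)
print(union)
print(intersection)
```

Note that subtraction does not keep negative counts: "c" and "d" do not appear in the result of first - second.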

collections.defaultdict

Here is another useful class called defaultdict.

Let’s say that you want a dictionary that indexes people by the first letter of their name. So we want something like:

{"A": ["Abraham", "Alice"],
 "B": ["Bob", "Babe", "Betsy"],
 "C": [],
 "D": ["Daisy"],
 ...
}

Let us assume that we start with an empty dictionary and need to add people one by one from a database. For each person, we have to check whether the key for their first letter already exists. If it does, we append the name to the existing list. Otherwise, we have to create a new list first and then add the name to it.

names = ["Bob", "Daisy", "Babe", "Abraham", "Alice", "Betsy"]

member_dict = {}
for name in names:
    first_letter = name[0]
    if first_letter not in member_dict:
        member_dict[first_letter] = []
    member_dict[first_letter].append(name)

print(member_dict)

With defaultdict, you can define what the default value should be when a key does not yet exist. You do this by passing a function that creates the default value, such as list. This saves us from having to check whether the key exists. In our example, the default value is an empty list.

from collections import defaultdict

names = ["Bob", "Daisy", "Babe", "Abraham", "Alice", "Betsy"]

member_dict = defaultdict(list)
for name in names:
    first_letter = name[0]
    member_dict[first_letter].append(name)

print(member_dict)
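One detail worth knowing is that a defaultdict creates the default value as soon as a missing key is read, not only when something is appended. The sketch below illustrates this with a couple of made-up keys ("Z" and "Q"); the variable names are only for illustration.

```python
from collections import defaultdict

member_dict = defaultdict(list)
member_dict["B"].append("Bob")

# Converting to a plain dict gives the familiar representation.
as_plain_dict = dict(member_dict)
print(as_plain_dict)

# Merely reading a missing key calls list() and stores the result.
before = "Z" in member_dict
value = member_dict["Z"]
after = "Z" in member_dict
print(before, value, after)

# Use .get() or the `in` operator to look up a key without inserting it.
looked_up = member_dict.get("Q")
print(looked_up, "Q" in member_dict)
```

If you only want to test whether a letter has any members, prefer `in` or .get() so that you do not fill the dictionary with empty lists by accident.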