This is an archived version of the course and is no longer updated. Please find the latest version of the course on the main webpage.

Series

You can create a Series with the constructor pd.Series(data, index, dtype):

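  • data can be array-like (a np.ndarray or a list), a dict, or a scalar value.
  • index holds the row labels. For array-like data it must have the same length as data (defaults to the integers 0 to len(data) - 1, i.e. np.arange(len(data)))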
  • dtype represents the data type. This is inferred from data if not specified.

So all you need to do is to provide Series with your data and some axis/row labels (index).

Interestingly, index labels do not have to be unique. Duplicates are best avoided, though: pandas will raise an error as soon as it attempts an operation that does not support duplicate index values (reindexing, for example). Apparently duplicates are allowed purely for performance reasons.
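As a small illustration of duplicate labels, the sketch below builds a Series in which two rows share a label. The variable names (dup, reindex_failed) are made up for this example, and reindexing is used only as one instance of an operation that rejects duplicates.

```python
import pandas as pd

# Two rows share the label "a".
dup = pd.Series([1, 2, 3], index=["a", "a", "b"])

# The index reports that its labels are not unique.
print(dup.index.is_unique)

# Selecting a duplicated label returns a Series with both rows, not a scalar.
print(dup["a"])

# Reindexing requires unique labels, so pandas raises a ValueError here.
try:
    dup.reindex(["a", "b"])
    reindex_failed = False
except ValueError as err:
    reindex_failed = True
    print("reindex failed:", err)
```

Checking index.is_unique right after building a Series is a cheap way to catch accidental duplicates before they cause an error later on.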

import numpy as np
import pandas as pd

s1 = pd.Series(np.array(["a", "b", "c", "d"]))
print(s1)
## 0    a
## 1    b
## 2    c
## 3    d
## dtype: object


s2 = pd.Series(np.array(["a", "b", "c", "d"]), index=[10, 11, 12, 13])
print(s2)
## 10    a
## 11    b
## 12    c
## 13    d
## dtype: object
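The two remaining constructor arguments can be sketched in the same way. The example below is only an illustration with made-up variable names (ints, floats, const); it shows the default index, an explicit dtype, and scalar data repeated over an index.

```python
import pandas as pd

# No index given: pandas generates the labels 0, 1, 2.
ints = pd.Series([1, 2, 3])
print(ints.index.tolist())

# dtype given explicitly: the integers are stored as floats.
floats = pd.Series([1, 2, 3], dtype="float64")
print(floats.dtype)

# A scalar is repeated once for every label in the index.
const = pd.Series(5, index=["x", "y", "z"])
print(const.tolist())
```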

You can also create your Series from a dictionary.

If you do not provide an index, then the dictionary keys will be used as axis labels.

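If you provide an index, then pandas will attempt to match the index labels to the keys of your dictionary and use the corresponding values as the data. Labels with no matching key get the value NaN, and keys that do not appear in the index are dropped.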

data = {"a" : 0., "b" : 1., "c" : 2.}

s1 = pd.Series(data)
print(s1)
## a    0.0
## b    1.0
## c    2.0
## dtype: float64

s2 = pd.Series(data, index=["b", "c", "d", "a"])
print(s2)
## b    1.0
## c    2.0
## d    NaN
## a    0.0
## dtype: float64
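The NaN for label "d" marks a label that had no matching dictionary key. A minimal sketch of how such entries could be found or dropped, using isna() and dropna() (the names missing and complete are made up for this example):

```python
import pandas as pd

data = {"a": 0.0, "b": 1.0, "c": 2.0}
s2 = pd.Series(data, index=["b", "c", "d", "a"])

# Labels whose value is missing because the dictionary had no such key.
missing = s2[s2.isna()].index.tolist()
print(missing)

# The same Series without the missing entries; the index order is kept.
complete = s2.dropna()
print(complete.index.tolist())
```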

Accessing elements in a Series

Accessing elements in a Series is straightforward.

With the default index, the labels 0, 1, 2, ... coincide with the row positions, so you can access elements by their row index like in a NumPy array. Slicing also works.

s = pd.Series(np.array(["a", "b", "c", "d", "e"]))
print(s)
## 0    a
## 1    b
## 2    c
## 3    d
## 4    e
## dtype: object

print(s[0])
## a

print(s[:3])
## 0    a
## 1    b
## 2    c
## dtype: object
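When the index is not the default 0, 1, 2, ..., labels and positions no longer coincide. The sketch below, which reuses the Series with labels 10 to 13 from earlier, assumes you want access strictly by position and uses the .iloc accessor for that; the variable names are made up for this example.

```python
import numpy as np
import pandas as pd

s2 = pd.Series(np.array(["a", "b", "c", "d"]), index=[10, 11, 12, 13])

# .iloc always counts positions, whatever the labels are.
first_by_position = s2.iloc[0]
print(first_by_position)

# Plain brackets with an integer look up the label on an integer index.
first_by_label = s2[10]
print(first_by_label)

# Positional slicing keeps the original labels of the selected rows.
first_three = s2.iloc[:3]
print(first_three.index.tolist())
```

Using .iloc makes the intent explicit, so the code keeps working if the index labels change.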

You can also access elements by their index labels.

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])
print(s)
## a    1
## b    2
## c    3
## d    4
## e    5
## dtype: int64

print(s["a"])
## 1

print(s[["a", "d", "c"]])
## a    1
## d    4
## c    3
## dtype: int64
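Label-based access can also be written with the .loc accessor, which supports slicing by label as well. The following is a minimal sketch with made-up variable names; note that, unlike positional slices, a label slice includes its end label.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# Single label and list of labels, equivalent to s["a"] and s[["a", "d", "c"]].
one = s.loc["a"]
several = s.loc[["a", "d", "c"]]

# Label slice: both "b" and "d" are included.
middle = s.loc["b":"d"]
print(middle.tolist())

# Membership tests check the index labels, and get() returns None by default
# for a label that does not exist instead of raising a KeyError.
has_z = "z" in s
fallback = s.get("z")
print(has_z, fallback)
```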