CSV - Reading
A Comma Separated Value (CSV) file is a type of plain text file that uses specific structuring to arrange tabular data. Think spreadsheets.
Generally, CSV files use a comma (,) to separate each data value (hence the name), but other delimiters can be used, such as tab (\t), colon (:), and semicolon (;).
Here is an example of what a CSV file might look like.
column 1 name,column 2 name,column 3 name
first row data 1,first row data 2,first row data 3
second row data 1,second row data 2,second row data 3
The first row usually contains the names of the columns. Think of headers in tables.
Let’s say I have a CSV file called students.csv with the content below (just copy and paste the content below into a text editor and save it as students.csv).
name,faculty,department
Alice Smith,Science,Chemistry
Ben Williams,Eng,EEE
Bob Jones,Science,Physics
Andrew Taylor,Eng,Computing
Reading from a CSV file
Let’s use the csv module to read this file.
import csv

with open("students.csv") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=",")
    column_data = next(csv_reader)
    print(f"Column names are {', '.join(column_data)}")
    for row in csv_reader:
        print(f"Student {row[0]} is from faculty of {row[1]}, {row[2]} dept.")
The expected output is:
Column names are name, faculty, department
Student Alice Smith is from faculty of Science, Chemistry dept.
Student Ben Williams is from faculty of Eng, EEE dept.
Student Bob Jones is from faculty of Science, Physics dept.
Student Andrew Taylor is from faculty of Eng, Computing dept.
Going back to the code:
with open("students.csv") as csv_file:
    opens the CSV file as a text file, returning a file object.
csv_reader = csv.reader(csv_file, delimiter=",")
    constructs a csv.reader object by passing the file object to its constructor, and specifies that the separator is a comma.
column_data = next(csv_reader)
    gets the column headers from the first line using the next() function.
for row in csv_reader:
    loops over the remaining rows; each row is a list of str items obtained by splitting the line at the delimiter.
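The delimiter argument is the part that changes when a file uses one of the other separators mentioned earlier. As a minimal sketch, the example below reads semicolon-separated data. The sample text is made up for illustration and is held in an io.StringIO object (an in-memory stand-in for a real file) so the snippet runs without anything on disk.

```python
import csv
import io

# Hypothetical semicolon-separated data, kept in memory for illustration.
sample = io.StringIO(
    "name;faculty;department\n"
    "Alice Smith;Science;Chemistry\n"
)

# Tell the reader which character separates the values.
rows = list(csv.reader(sample, delimiter=";"))

for row in rows:
    print(row)
# ['name', 'faculty', 'department']
# ['Alice Smith', 'Science', 'Chemistry']
```

With a real file, the same delimiter=";" argument would be passed to csv.reader() together with the file object returned by open().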
Dealing with spaces
Let’s say our CSV file has spaces after the delimiter. We will call this file students_space.csv:
name, faculty, department
Ben Williams, Eng, EEE
Bob Jones,Science,Physics
Andrew Taylor,Eng,Computing
Running our code will preserve these spaces.
with open("students_space.csv") as csv_file:
    csv_reader = csv.reader(csv_file)
    for row in csv_reader:
        print(row)

# ['name', ' faculty', ' department']
# ['Ben Williams', ' Eng', ' EEE']
# ['Bob Jones', 'Science', 'Physics']
# ['Andrew Taylor', 'Eng', 'Computing']
We can register a dialect (a class in the csv module that defines the parameters for reading and writing a CSV file) and set its skipinitialspace parameter to True, so that whitespace immediately after each delimiter is ignored.
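A minimal sketch of this approach is shown below. The dialect name "skip_space" is our own choice, not something defined by the csv module, and the data is an in-memory copy of part of students_space.csv so that the snippet is self-contained.

```python
import csv
import io

# Register a dialect under a name of our choosing ("skip_space" is an
# arbitrary label) that ignores whitespace right after each delimiter.
csv.register_dialect("skip_space", skipinitialspace=True)

# In-memory stand-in for students_space.csv.
sample = io.StringIO(
    "name, faculty, department\n"
    "Ben Williams, Eng, EEE\n"
    "Bob Jones,Science,Physics\n"
)

# Select the registered dialect by name when constructing the reader.
rows = list(csv.reader(sample, dialect="skip_space"))

for row in rows:
    print(row)
# ['name', 'faculty', 'department']
# ['Ben Williams', 'Eng', 'EEE']
# ['Bob Jones', 'Science', 'Physics']
```

For a one-off read, passing skipinitialspace=True directly to csv.reader() should have the same effect without registering a dialect.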
Reading CSV files into a dictionary
You can also read each row of a CSV file into a dictionary using csv.DictReader. You can then access the elements using the column names (taken from the first row) as keys.
with open("students.csv") as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        print(f"Student {row['name']} is from faculty of {row['faculty']}, "
              f"{row['department']} dept. ")
If the CSV file does not contain the column names, you will need to specify your own keys. You can do this by setting the fieldnames parameter to a list containing the keys.
fieldnames = ['name', 'faculty', 'department']
csv_reader = csv.DictReader(csv_file, fieldnames=fieldnames)
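To see the fieldnames parameter in a complete, runnable form, here is a small sketch. It assumes a file with no header row; the data is made up for illustration and held in an io.StringIO object in place of a real file.

```python
import csv
import io

# Hypothetical header-less data: the first line is already a student.
sample = io.StringIO(
    "Alice Smith,Science,Chemistry\n"
    "Ben Williams,Eng,EEE\n"
)

# Supply the keys ourselves, since the file has no header row.
fieldnames = ["name", "faculty", "department"]
records = list(csv.DictReader(sample, fieldnames=fieldnames))

for record in records:
    print(f"{record['name']}: {record['department']}")
# Alice Smith: Chemistry
# Ben Williams: EEE
```

Note that if fieldnames is given for a file that does have a header row, the header line is returned as an ordinary data row, so it would need to be skipped separately.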