Import the CSV file containing weather data for Sitka, AK. The file name, csv_sitka_weather_2014.csv, is stored in ‘filename’, and the file object returned by open() is stored in ‘a’. Calling csv.reader() with ‘a’ as its argument creates a reader object, which we store as ‘reader’.
To show the names in the header row as a quality check, we call the next() function once on the reader object. next() returns the next available line in the file, which at this point is the first line. We store the result as ‘header_row’ to signify that it holds the names of the headers:
import csv
filename = 'csv_sitka_weather_2014.csv'
with open(filename) as a:
    reader = csv.reader(a)
    header_row = next(reader)
    print(header_row)
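To see what csv.reader() and next() do without needing the data file, here is a minimal sketch that reads from an in-memory string instead. The sample rows and column names are invented for illustration and are not taken from the Sitka file.

```python
import csv
import io

# Invented sample data standing in for a real CSV file.
sample = "Date,Max Temp,Min Temp\n2014-07-01,64,50\n2014-07-02,71,55\n"

reader = csv.reader(io.StringIO(sample))

# The first call to next() consumes the header line.
header_row = next(reader)
print(header_row)

# The reader now continues from the first data row.
first_data_row = next(reader)
print(first_data_row)
```

Note that every value comes back as a string, including the temperatures, which is why a conversion step is needed before plotting.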
While printing ‘header_row’ shows us the names of the headers, it does not give us the index of each header. For our visualization, we need to know each header's position within the row so that we can isolate exactly the columns we want. To do this, we use a for loop to print each header (stored as ‘header’) and its position via the enumerate() function. enumerate() returns the index of each item in a list along with its value:
for index, header in enumerate(header_row):
    print(index, header)
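As a small sketch of the same idea, the index and value pairs from enumerate() can also be collected into a dictionary, so a column can be looked up by name rather than by a remembered number. The header names below are hypothetical; the real file's headers may differ.

```python
# Hypothetical header names; the real file's headers may differ.
header_row = ["Date", "Max Temp", "Mean Temp", "Min Temp"]

# enumerate() yields (index, value) pairs, starting at index 0.
for index, header in enumerate(header_row):
    print(index, header)

# The same pairs can be turned into a name-to-index lookup.
column_index = {header: index for index, header in enumerate(header_row)}
print(column_index["Min Temp"])
```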
Now that we know the indices of the columns containing the maximum (high) temperatures [1] and the corresponding dates [0], we can extract and read this data, starting with a for loop that reads in the high temperature for each day in the file. To prepare these maximum temperatures for plotting with Matplotlib, we convert the string values to integers using the int() function:
import csv
filename = 'csv_sitka_weather_2014.csv'
with open(filename) as a:
    reader = csv.reader(a)
    header_row = next(reader)
    max_temps = []
    for row in reader:
        highs = int(row[1])
        max_temps.append(highs)
    print(max_temps)
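One thing to watch for with int() is that it raises a ValueError on an empty string, which is how a missing reading typically appears in a CSV row. The sketch below shows one possible way to skip such rows. The rows are invented for illustration, and whether the Sitka file contains any gaps is not something this walkthrough establishes.

```python
# Invented rows: date, high temperature. The second row has no reading.
rows = [
    ["2014-07-01", "64"],
    ["2014-07-02", ""],
    ["2014-07-03", "71"],
]

max_temps = []
for row in rows:
    try:
        high = int(row[1])
    except ValueError:
        # int('') raises ValueError, so report the date and skip the row.
        print(row[0], "missing data")
    else:
        max_temps.append(high)

print(max_temps)
```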
Using Matplotlib, we will plot the high temperatures by date in a line graph, which will be the first data series in the visualization. We first import matplotlib.pyplot under the alias “plt” for simplicity, and import the datetime class from the datetime module to convert the date strings in the extracted data into proper date objects for our visualization:
import csv
from datetime import datetime
import matplotlib.pyplot as plt
filename = 'csv_sitka_weather_2014.csv'
with open(filename) as a:
    reader = csv.reader(a)
    header_row = next(reader)
    dates, max_temps = [], []
    for row in reader:
        # Dates are read as YYYY-MM-DD strings and parsed into datetime objects.
        day = datetime.strptime(row[0], "%Y-%m-%d")
        dates.append(day)
        # High temperatures are extracted and converted to integers.
        highs = int(row[1])
        max_temps.append(highs)
# Plot the dates and high temperatures as a line graph in a purple data series.
x = plt.figure(dpi=128, figsize=(10, 6))
plt.plot(dates, max_temps, c='purple')
# Set attributes of the visualization, including titles, font sizes, and axis details.
plt.title("Sitka, AK: Daily High Temperatures, July 2014", fontsize=20)
plt.xlabel('Date', fontsize=14)
x.autofmt_xdate()
plt.ylabel("Temperature (F)", fontsize=14)
plt.tick_params(axis='both', which='both', labelsize=8)
# Display the visualization of the line graph.
plt.show()
Figure 1
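The date conversion used in the listing above can be tried on its own. This is a minimal sketch with a single hard-coded date string chosen for illustration: strptime() parses text into a datetime object according to a format string, and strftime() turns a datetime back into text.

```python
from datetime import datetime

# Parse a date string whose layout matches the format "%Y-%m-%d".
day = datetime.strptime("2014-07-01", "%Y-%m-%d")
print(day.year, day.month, day.day)

# strftime() goes the other way, from a datetime back to text.
formatted = day.strftime("%m/%d/%Y")
print(formatted)
```

If the string does not match the format, strptime() raises a ValueError, so the format string has to mirror the layout of the dates in the file.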
In the same application, we will now plot the low temperatures by date as a second data series. The low temperatures are stored in the column at index [3]. The comments in the code describe each step:
import csv
from datetime import datetime
import matplotlib.pyplot as plt
filename = 'csv_sitka_weather_2014.csv'
with open(filename) as a:
    reader = csv.reader(a)
    header_row = next(reader)
    dates, max_temps, min_temps = [], [], []
    for row in reader:
        # Dates are read as YYYY-MM-DD strings and parsed into datetime objects.
        day = datetime.strptime(row[0], "%Y-%m-%d")
        dates.append(day)
        # High temperatures are extracted and converted to integers.
        highs = int(row[1])
        max_temps.append(highs)
        # Low temperatures are extracted and converted to integers.
        lows = int(row[3])
        min_temps.append(lows)
x = plt.figure(dpi=128, figsize=(10, 6))
# Plot the high temperatures by date (in orange) on the line graph.
plt.plot(dates, max_temps, c='orange')
# Plot the low temperatures by date (in blue) on the line graph; fill the space between the two series.
plt.plot(dates, min_temps, c='blue')
plt.fill_between(dates, max_temps, min_temps, facecolor='grey', alpha=0.2)
# Set attributes of the visualization, including titles, font sizes, and axis details.
plt.title("Sitka, AK: Daily High and Low Temperatures, 2014", fontsize=20)
plt.xlabel('Date', fontsize=14)
x.autofmt_xdate()
plt.ylabel("Temperature (F)", fontsize=14)
plt.tick_params(axis='both', which='both', labelsize=8)
# Display the visualization of the line graph.
plt.show()
Figure 2
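The shaded band in the graph spans the gap between each day's high and low. As a rough sketch of how to inspect those gaps numerically, the two lists can be paired up with zip(). The values below are invented for illustration and are not taken from the Sitka file.

```python
# Invented sample values, not taken from the Sitka file.
max_temps = [64, 71, 66]
min_temps = [50, 55, 53]

# The two series should line up one-to-one, a low for every high.
same_length = len(max_temps) == len(min_temps)
print(same_length)

# zip() pairs each day's high with its low.
daily_ranges = [high - low for high, low in zip(max_temps, min_temps)]
print(daily_ranges)
```

A length check like this is worth running before plotting, since both series are drawn against the same list of dates.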