Susan Li April 2, 2017
I have been exploring Toronto's Open Data Catalogue, but I have to admit that it is not always easy to find an interesting dataset. Today, the water consumption by ward dataset, covering 2000 to 2015, got my attention. To keep things in perspective, I will only look at the most recent five years, 2011 to 2015.
As usual, I downloaded the dataset, checked for missing values and did some wrangling. There are five columns that I do not need, so I decided to remove them.
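The wrangling step might look roughly like the sketch below. It uses a small made-up data frame in place of the downloaded file, and every column name (`year`, `ward`, `residential`, `commercial`, `notes`) is an assumption for illustration, not the dataset's real schema.

```r
# Toy stand-in for the downloaded table; all column names and values are hypothetical.
water <- data.frame(
  year        = c(2009, 2011, 2011, 2015, 2015),
  ward        = c(1, 1, 2, 1, 2),
  residential = c(900, 850, NA, 800, 760),
  commercial  = c(400, 420, 390, 410, 405),
  notes       = c("a", "b", "c", "d", "e"),  # stands in for a column to drop
  stringsAsFactors = FALSE
)

# Count missing values per column before deciding how to handle them.
missing_per_column <- colSums(is.na(water))

# Keep only 2011 to 2015, then drop the columns that are not needed.
recent <- water[water$year >= 2011 & water$year <= 2015, ]
recent <- recent[, setdiff(names(recent), "notes")]

# Dropping incomplete rows is one possible choice among several.
clean <- recent[complete.cases(recent), ]
```

Using `setdiff()` on the column names keeps the drop step readable when several columns go at once.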
It is clear that Ward 42 (Scarborough-Rouge River) had the most residential usage and Ward 27 (Toronto Centre-Rosedale) had the most commercial usage in total from 2011 to 2015.
Again, Ward 27 (Toronto Centre-Rosedale) had the most residential usage on average from 2011 to 2015, and Ward 11 (York South-Weston) consumed the most water commercially on average from 2011 to 2015.
Ward 27 consistently consumed the most water commercially every year from 2011 to 2015.
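Rankings like these can be computed with base R's `aggregate()`. The sketch below is only an illustration: the numbers are made up, the ward ids 1 to 3 are placeholders, and the column names are assumptions.

```r
# Made-up usage figures for three placeholder wards over two years.
usage <- data.frame(
  year        = rep(2011:2012, each = 3),
  ward        = rep(c(1, 2, 3), times = 2),
  residential = c(60, 30, 100, 58, 29, 98),
  commercial  = c(50, 90, 40, 55, 95, 45)
)

# Total and mean usage per ward across all years.
totals <- aggregate(cbind(residential, commercial) ~ ward, data = usage, FUN = sum)
means  <- aggregate(cbind(residential, commercial) ~ ward, data = usage, FUN = mean)

# Ward with the largest total in each category.
top_residential <- totals$ward[which.max(totals$residential)]
top_commercial  <- totals$ward[which.max(totals$commercial)]

# Ward with the largest commercial usage within each year.
top_by_year <- sapply(split(usage, usage$year),
                      function(d) d$ward[which.max(d$commercial)])
```

`top_by_year` is a named vector with one entry per year, which makes it easy to check whether the same ward leads every year.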
Average residential usage has been decreasing over the years; we now use less water than we did five years ago.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 240 288 313 335 353 706
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 212 263 274 299 323 534
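Five-number summaries like the ones above come from `summary()`. A minimal sketch of a year-by-year comparison follows, using made-up values and a hypothetical column name `residential_avg`; it does not reproduce the figures printed above.

```r
# Made-up average residential usage per ward for two years.
avg_use <- data.frame(
  year            = rep(c(2011, 2015), each = 4),
  residential_avg = c(300, 320, 340, 400, 260, 270, 290, 350)
)

# One summary() per year.
summary_by_year <- tapply(avg_use$residential_avg, avg_use$year, summary)

# Compare yearly means to see whether usage fell over time.
mean_by_year <- tapply(avg_use$residential_avg, avg_use$year, mean)
decreasing   <- all(diff(as.numeric(mean_by_year)) < 0)
```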
Now what? It's map time! I downloaded the ESRI Shapefile from the City of Toronto Open Data portal. This dataset includes the boundaries for the City of Toronto's 44 municipal wards.
## Source: "C:/Users/Susan/Documents/gcc/Projects/Open Data/Files/Data Upload - May 2010/May2010_WGS84", layer: "icitw_wgs84"
## Driver: ESRI Shapefile; number of rows: 44
## Feature type: wkbPolygon with 2 dimensions
## Extent: (-79.64 43.58) - (-79.12 43.86)
## CRS: +proj=longlat +datum=WGS84 +no_defs
## LDID: 87
## Number of fields: 10
##          name type length typeName
## 1      GEO_ID    0      9  Integer
## 2   CREATE_ID    0      9  Integer
## 3        NAME    4     40   String
## 4  SCODE_NAME    4     10   String
## 5  LCODE_NAME    4     20   String
## 6   TYPE_DESC    4     25   String
## 7   TYPE_CODE    4      4   String
## 8    OBJECTID    0      9  Integer
## 9  SHAPE_AREA    2     19     Real
## 10  SHAPE_LEN    2     19     Real
It took me some time to figure out how to read it.
## OGR data source with driver: ESRI Shapefile
## Source: "C:/Users/Susan/Documents/gcc/Projects/Open Data/Files/Data Upload - May 2010/May2010_WGS84", layer: "icitw_wgs84"
## with 44 features
## It has 10 fields
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
Here we have a simple map of Toronto to show all three layers.
Use the 'fortify' function to turn the map into a data frame so that it can easily be plotted with 'ggplot2'. This produces the following data frame:
##     long   lat order  hole piece id group
## 1 -79.63 43.74     1 FALSE     1  1  01.1
## 2 -79.63 43.74     2 FALSE     1  1  01.1
## 3 -79.63 43.74     3 FALSE     1  1  01.1
## 4 -79.64 43.74     4 FALSE     1  1  01.1
## 5 -79.64 43.75     5 FALSE     1  1  01.1
## 6 -79.64 43.75     6 FALSE     1  1  01.1
Merge our water dataframe with the map data. This time, I will be looking at 2015 usage only.
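A merge of this kind can be sketched with base R's `merge()`. The fortified map is replaced here by a tiny hand-made data frame with the same `long`, `lat`, `order` and `id` columns. The water table, its column names and the assumption that `id` matches the ward number are all hypothetical.

```r
# Hand-made stand-in for the fortified map (two tiny "wards").
map_df <- data.frame(
  long  = c(-79.63, -79.63, -79.64, -79.50, -79.50, -79.51),
  lat   = c(43.74, 43.75, 43.75, 43.70, 43.71, 43.71),
  order = 1:6,
  id    = c("1", "1", "1", "2", "2", "2"),
  stringsAsFactors = FALSE
)

# Made-up 2015 usage per ward; the key must have the same type as map_df$id.
water_2015 <- data.frame(ward = c(1, 2), commercial_total = c(410, 405))
water_2015$id <- as.character(water_2015$ward)

# Left join so that every polygon vertex is kept.
merged <- merge(map_df, water_2015, by = "id", all.x = TRUE)

# merge() may reorder rows; restore the drawing order of the vertices.
merged <- merged[order(merged$order), ]
```

Restoring the `order` column matters because polygons are drawn by connecting vertices in row order, so a shuffled data frame produces scrambled shapes.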
Plot 2015 total commercial usage!
2015 Total Residential Usage
Plot 2015 Average Commercial Water Usage.
2015 Average Residential Usage
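Maps like the four above can be drawn with `geom_polygon()` once the usage values are attached to the polygon vertices. The sketch below is a toy example: two unit squares stand in for wards, and the `usage` values and colour choices are made up.

```r
library(ggplot2)

# Two adjacent unit squares standing in for ward polygons (made-up values).
toy <- data.frame(
  long  = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c("1.1", "2.1"), each = 4),
  usage = rep(c(410, 405), each = 4)
)

# One filled polygon per group, coloured by usage.
p <- ggplot(toy, aes(x = long, y = lat, group = group, fill = usage)) +
  geom_polygon(colour = "white") +
  coord_equal() +
  scale_fill_gradient(low = "lightblue", high = "darkblue", name = "Usage") +
  labs(title = "Toy choropleth (made-up values)") +
  theme_void()

# Call print(p) to draw the map.
```

The `group` aesthetic keeps each polygon separate, so vertices from different wards are never connected to one another.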
Commercial users consume the most water in Toronto, and they are spread across many wards of the city. Heavy residential users, by contrast, are concentrated in only a few wards.
Water use is a major policy, environmental and social issue. Understanding water usage is important everywhere. The cost of water went up by nine per cent in January 2014. We are lucky that Toronto is built on a lake that contains about 1640 cubic kilometers of water. But a new study calls average water use by Canadians 'alarming'.
I have really enjoyed working on this project. I hope the City of Toronto will release the data by month and by type of user, such as residence, park, business, school, hotel, restaurant, etc., so that I will be able to perform a more in-depth analysis.