August 9, 2017

Introduction

This simple project is a scatterplot of CO2 emissions versus property floor area in NYC in 2012.

The data used for this project comes from NYC Open Data, at the endpoint https://data.cityofnewyork.us/resource/xwwh-wcee.json.

Getting and cleaning the data

The data is in JSON format. After loading it with jsonlite's fromJSON function, we use complete.cases to keep only complete observations, then rename the floor-area and CO2 columns and convert them to numeric.

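library(jsonlite)  # provides fromJSON

# Download the data set and flatten nested fields into columns
my_data <- fromJSON("https://data.cityofnewyork.us/resource/xwwh-wcee.json",
                    flatten = TRUE)
# Keep only rows without missing values
my_data <- my_data[complete.cases(my_data), ]
# Rename the floor-area and CO2 columns (selected by position)
colnames(my_data)[12] <- "area"
colnames(my_data)[16] <- "co2"
# Convert both columns to numeric for plotting
my_data$area <- as.numeric(my_data$area)
my_data$co2 <- as.numeric(my_data$co2)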
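
To illustrate what complete.cases does, here is a minimal sketch on a made-up data frame. The values and the names toy, keep, and toy_clean are illustrative assumptions and are not part of the NYC data set.

```r
# Toy data frame with hypothetical values; NA marks a missing entry.
toy <- data.frame(
  area = c(1000, NA, 2500, 4000),
  co2  = c(50, 75, NA, 210)
)

# complete.cases() returns TRUE for rows that contain no NA values.
keep <- complete.cases(toy)

# Subset to the complete rows only (here, the first and last rows).
toy_clean <- toy[keep, ]
```

Only rows in which every column has a value survive the filter, so a single missing field is enough to drop an observation.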

Slide with Plot
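
A minimal sketch of how such a scatterplot could be drawn with base R graphics, assuming a data frame with numeric area and co2 columns as prepared in the cleaning step. The helper name plot_emissions, the axis labels, and the choice of base graphics are assumptions; the original slide may have used a different plotting library.

```r
# Hypothetical helper: scatterplot of CO2 emissions against floor area.
# Expects a data frame with numeric columns `area` and `co2`.
plot_emissions <- function(df) {
  plot(df$area, df$co2,
       xlab = "Property floor area",
       ylab = "CO2 emissions",
       main = "CO2 emissions vs. property floor area, NYC 2012",
       pch = 19, col = "steelblue")
  invisible(df)  # return the input invisibly so calls can be chained
}
```

Calling plot_emissions(my_data) after the cleaning step would draw the plot on the active graphics device.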

Slide with Plot and Trendline
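
A minimal sketch of the same scatterplot with a trendline added. It assumes a straight least-squares line fitted with lm; the source does not say which kind of trendline was used, and the helper name plot_with_trend is hypothetical.

```r
# Hypothetical helper: scatterplot plus a linear least-squares trendline.
# Expects a data frame with numeric columns `area` and `co2`.
plot_with_trend <- function(df) {
  plot(df$area, df$co2,
       xlab = "Property floor area",
       ylab = "CO2 emissions",
       pch = 19, col = "steelblue")
  # Fit co2 as a linear function of area (assumed trend model).
  fit <- lm(co2 ~ area, data = df)
  # Draw the fitted line on top of the points.
  abline(fit, col = "red", lwd = 2)
  invisible(fit)  # return the model so its coefficients can be inspected
}
```

For example, fit <- plot_with_trend(my_data) followed by coef(fit) would show the intercept and slope of the fitted line.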