As a New Yorker, I am always fascinated by the beauty of Manhattan. When I have a day off, I like to spend it in the city. However, transportation is always a problem for me since I live on Long Island. I have tried different ways to get to Manhattan: the Long Island Rail Road, which is expensive and ties me to its schedule, and driving, which means a nightmare search for parking and traffic that is jammed 24/7. Recently, Gov. Andrew M. Cuomo proposed a congestion pricing plan for Manhattan, which would make New York the first American city to charge such fees. The fees are expected to raise money to fix the city’s subway system and, of course, thin out streets that have become strangled by traffic.
For my project, I would like to see whether the congestion pricing plan is necessary and what kind of impact it may have on us New Yorkers. I found three data sets to help with my project. Two of them are from NYC Open Data, and one is from epa.org, the environmental open data website.
The first thing I want to find out is the quality of the environment in Manhattan between 2015 and 2018. The data set from epa.org includes the annual Air Quality Index (AQI) for each county in each U.S. state for 2015, 2016, 2017, and 2018. For my project, I will focus on three AQI categories, “number of good days”, “number of Ozone days”, and “number of PM2.5 days”, since these categories are related to car emissions. Ozone is closely related to global warming, and PM2.5 negatively affects human health.
| State | County | Year | Days.with.AQI | Good.Days | Moderate.Days | Unhealthy.for.Sensitive.Groups.Days | Unhealthy.Days | Very.Unhealthy.Days | Hazardous.Days | Max.AQI | X90th.Percentile.AQI | Median.AQI | Days.CO | Days.NO2 | Days.Ozone | Days.SO2 | Days.PM2.5 | Days.PM10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Alabama | Baldwin | 2015 | 264 | 230 | 33 | 1 | 0 | 0 | 0 | 129 | 53 | 38 | 0 | 0 | 189 | 0 | 75 | 0 |
| Alabama | Clay | 2015 | 112 | 101 | 11 | 0 | 0 | 0 | 0 | 91 | 50 | 32 | 0 | 0 | 0 | 0 | 112 | 0 |
| Alabama | Colbert | 2015 | 280 | 251 | 29 | 0 | 0 | 0 | 0 | 73 | 51 | 36 | 0 | 0 | 195 | 0 | 85 | 0 |
| Alabama | DeKalb | 2015 | 363 | 319 | 43 | 1 | 0 | 0 | 0 | 101 | 52 | 37 | 0 | 0 | 307 | 0 | 56 | 0 |
| Alabama | Elmore | 2015 | 233 | 223 | 9 | 1 | 0 | 0 | 0 | 115 | 47 | 35 | 0 | 0 | 233 | 0 | 0 | 0 |
| Alabama | Etowah | 2015 | 365 | 221 | 137 | 4 | 3 | 0 | 0 | 170 | 64 | 46 | 0 | 0 | 119 | 0 | 246 | 0 |
## [1] 1061
| State | County | Year | Days.with.AQI | Good.Days | Moderate.Days | Unhealthy.for.Sensitive.Groups.Days | Unhealthy.Days | Very.Unhealthy.Days | Hazardous.Days | Max.AQI | X90th.Percentile.AQI | Median.AQI | Days.CO | Days.NO2 | Days.Ozone | Days.SO2 | Days.PM2.5 | Days.PM10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Alabama | Baldwin | 2016 | 279 | 247 | 32 | 0 | 0 | 0 | 0 | 87 | 51 | 37 | 0 | 0 | 221 | 0 | 58 | 0 |
| Alabama | Clay | 2016 | 116 | 109 | 7 | 0 | 0 | 0 | 0 | 56 | 45 | 30 | 0 | 0 | 0 | 0 | 116 | 0 |
| Alabama | Colbert | 2016 | 282 | 258 | 23 | 1 | 0 | 0 | 0 | 115 | 50 | 38 | 0 | 0 | 219 | 0 | 63 | 0 |
| Alabama | DeKalb | 2016 | 348 | 304 | 43 | 1 | 0 | 0 | 0 | 119 | 54 | 40 | 0 | 0 | 321 | 0 | 27 | 0 |
| Alabama | Elmore | 2016 | 117 | 107 | 10 | 0 | 0 | 0 | 0 | 77 | 48 | 40 | 0 | 0 | 117 | 0 | 0 | 0 |
| Alabama | Etowah | 2016 | 352 | 162 | 184 | 3 | 3 | 0 | 0 | 179 | 67 | 52 | 0 | 0 | 104 | 0 | 248 | 0 |
## [1] 1054
| State | County | Year | Days.with.AQI | Good.Days | Moderate.Days | Unhealthy.for.Sensitive.Groups.Days | Unhealthy.Days | Very.Unhealthy.Days | Hazardous.Days | Max.AQI | X90th.Percentile.AQI | Median.AQI | Days.CO | Days.NO2 | Days.Ozone | Days.SO2 | Days.PM2.5 | Days.PM10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Alabama | Baldwin | 2017 | 270 | 241 | 28 | 1 | 0 | 0 | 0 | 108 | 51 | 36 | 0 | 0 | 206 | 0 | 64 | 0 |
| Alabama | Clay | 2017 | 118 | 104 | 14 | 0 | 0 | 0 | 0 | 66 | 52 | 30 | 0 | 0 | 0 | 0 | 118 | 0 |
| Alabama | Colbert | 2017 | 283 | 265 | 18 | 0 | 0 | 0 | 0 | 63 | 48 | 37 | 0 | 0 | 218 | 0 | 65 | 0 |
| Alabama | DeKalb | 2017 | 359 | 329 | 30 | 0 | 0 | 0 | 0 | 80 | 50 | 39 | 0 | 0 | 315 | 0 | 44 | 0 |
| Alabama | Elmore | 2017 | 226 | 221 | 5 | 0 | 0 | 0 | 0 | 58 | 45 | 35 | 0 | 0 | 226 | 0 | 0 | 0 |
| Alabama | Etowah | 2017 | 360 | 233 | 125 | 1 | 1 | 0 | 0 | 163 | 62 | 45 | 0 | 0 | 133 | 0 | 227 | 0 |
## [1] 1061
| State | County | Year | Days.with.AQI | Good.Days | Moderate.Days | Unhealthy.for.Sensitive.Groups.Days | Unhealthy.Days | Very.Unhealthy.Days | Hazardous.Days | Max.AQI | X90th.Percentile.AQI | Median.AQI | Days.CO | Days.NO2 | Days.Ozone | Days.SO2 | Days.PM2.5 | Days.PM10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Alabama | Baldwin | 2018 | 205 | 181 | 24 | 0 | 0 | 0 | 0 | 97 | 54 | 38 | 0 | 0 | 161 | 0 | 44 | 0 |
| Alabama | Clay | 2018 | 86 | 79 | 7 | 0 | 0 | 0 | 0 | 64 | 47 | 29 | 0 | 0 | 0 | 0 | 86 | 0 |
| Alabama | Colbert | 2018 | 205 | 181 | 24 | 0 | 0 | 0 | 0 | 93 | 51 | 37 | 0 | 0 | 154 | 0 | 51 | 0 |
| Alabama | DeKalb | 2018 | 238 | 205 | 33 | 0 | 0 | 0 | 0 | 84 | 54 | 38 | 0 | 0 | 204 | 0 | 34 | 0 |
| Alabama | Elmore | 2018 | 161 | 142 | 19 | 0 | 0 | 0 | 0 | 71 | 51 | 36 | 0 | 0 | 161 | 0 | 0 | 0 |
| Alabama | Etowah | 2018 | 252 | 167 | 84 | 0 | 1 | 0 | 0 | 153 | 62 | 44 | 0 | 0 | 131 | 0 | 121 | 0 |
## [1] 1038
## Mean_GoodDays Mean_MaxAQI Mean_MedianAQI Mean_DaysOzone Mean_DaysPM25
## 1 247.7757 118.4288 35.95193 166.9906 113.8794
## Mean_GoodDays Mean_MaxAQI Mean_MedianAQI Mean_DaysOzone Mean_DaysPM25
## 1 258.2799 118.7723 34.94213 177.2581 106.7277
## Mean_GoodDays Mean_MaxAQI Mean_MedianAQI Mean_DaysOzone Mean_DaysPM25
## 1 260.5994 122.4543 35.55325 176.5683 111.5212
## Mean_GoodDays Mean_MaxAQI Mean_MedianAQI Mean_DaysOzone Mean_DaysPM25
## 1 161.2418 106.6118 36.34586 120.2293 62.22254
## Year AvgAQ_GoodDays AvgAQ_MaxAQI AvgAQ_MedAQI AvgAQ_DaysOzone
## 1 2015 247.7757 118.4288 35.95193 166.9906
## 2 2016 258.2799 118.7723 34.94213 177.2581
## 3 2017 260.5994 122.4543 35.55325 176.5683
## 4 2018 161.2418 106.6118 36.34586 120.2293
## AvgAQ_DaysPM25
## 1 113.87940
## 2 106.72770
## 3 111.52120
## 4 62.22254
| State | County | Year | Days.with.AQI | Good.Days | Moderate.Days | Unhealthy.for.Sensitive.Groups.Days | Unhealthy.Days | Very.Unhealthy.Days | Hazardous.Days | Max.AQI | X90th.Percentile.AQI | Median.AQI | Days.CO | Days.NO2 | Days.Ozone | Days.SO2 | Days.PM2.5 | Days.PM10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| New York | New York | 2015 | 365 | 215 | 146 | 4 | 0 | 0 | 0 | 122 | 68 | 46 | 0 | 0 | 142 | 0 | 223 | 0 |
| New York | New York | 2016 | 366 | 247 | 115 | 4 | 0 | 0 | 0 | 126 | 67 | 43 | 0 | 0 | 161 | 0 | 205 | 0 |
| New York | New York | 2017 | 365 | 265 | 98 | 2 | 0 | 0 | 0 | 122 | 61 | 40 | 0 | 0 | 147 | 0 | 218 | 0 |
| New York | New York | 2018 | 274 | 179 | 84 | 10 | 1 | 0 | 0 | 151 | 71 | 44 | 1 | 0 | 126 | 0 | 147 | 0 |
From the graph, we can tell that NY’s good days fall below the U.S. average in 2015 and 2016 and above it in 2017 and 2018 (the 2018 counts cover only part of the year). Ozone days are generally lower than the national average. Ozone Action Days are days when high temperatures and air pollution combine to form high levels of ground-level ozone. I think weather is the main reason NY has fewer Ozone days than average: NY is cold in general, and summer may only last three months or so. However, the number of PM2.5 days in NY is far higher than the national average. PM2.5 primarily comes from cars, trucks, buses, and off-road vehicles. The high population density and the number of vehicles in NY may explain such a high number. PM2.5 is harmful to health, so the data may support the idea of the Congestion Pricing Plan from an environmental perspective.
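The comparison can be checked directly from the numbers printed above. The following is a minimal Python sketch, not the R code behind this report; the values are copied from the national yearly means and the New York County rows, and the function name `gaps` is made up for illustration. The 2017 national ozone mean uses the value from the per-year summary (176.5683).

```python
# National yearly means and New York County values, copied from the
# tables printed above. Only the comparison logic is new.
national = {
    2015: {"good": 247.7757, "ozone": 166.9906, "pm25": 113.8794},
    2016: {"good": 258.2799, "ozone": 177.2581, "pm25": 106.7277},
    2017: {"good": 260.5994, "ozone": 176.5683, "pm25": 111.5212},
    2018: {"good": 161.2418, "ozone": 120.2293, "pm25": 62.22254},
}
new_york = {
    2015: {"good": 215, "ozone": 142, "pm25": 223},
    2016: {"good": 247, "ozone": 161, "pm25": 205},
    2017: {"good": 265, "ozone": 147, "pm25": 218},
    2018: {"good": 179, "ozone": 126, "pm25": 147},
}


def gaps(local, reference):
    """Return local minus reference for every year and metric."""
    return {
        year: {metric: local[year][metric] - reference[year][metric]
               for metric in local[year]}
        for year in local
    }


diff = gaps(new_york, national)
for year in sorted(diff):
    # A positive number means New York County is above the national mean.
    row = ", ".join(f"{metric}: {value:+.1f}"
                    for metric, value in diff[year].items())
    print(year, row)
```

A positive PM2.5 gap in every year is what supports the environmental argument. Note that the 2018 rows cover fewer days, so that year's gaps are less comparable with the others.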
Secondly, I use NYC Open Data’s Traffic Volume Counts from 2014 to 2018 to find the total traffic for each hour in October of 2015, 2016, and 2017. The main reason I use only October of 2015, 2016, and 2017 is that the data for 2014 and 2018 is incomplete: after I cleaned up the data, some months in those two years were NA. I can only use the month common to the main years, but I still got valuable information and learned many different ways to visualize the data. I searched online for the best way to visualize 24-hour data and found the clock graph, which is very interesting and presents my data beautifully.
On the technical side of the visualizations, I keep “Less is more” in mind. At the beginning, I used a very colorful histogram to show the values, but the colors were too much and distracted my attention from the data itself. So I changed it into a lollipop graph, which is simple but gets the point across directly. From these three clock graphs, I can see the hourly traffic pattern. From 0 to 5 AM, the traffic count is small. From 5 AM, traffic starts to pick up and keeps increasing until 7 PM. Considering that people come to work in the morning and leave work in the evening, the data matches our expectation. Basically, October traffic follows the same pattern in all three years.
Then I compared the three years’ data in one chart. This was challenging, since it is not easy to fit all three into one bar chart, and looking at 72 bars at the same time is overwhelming. So I searched again for a better way to show the data and found the gganimate package. It turns my static graph into an animation that shows each hour of the three years’ data one by one, which makes it much easier to see the differences, increases, and decreases. This became an interesting point for me. In this class, I always keep “Less is more” in mind, and this animation opened my mind: it still keeps the minimalism, but in a different way. Instead of showing everything at once, showing one thing at a time is clearer.
Back to the graphs: I found that the 2015 traffic count is actually much higher than in the other two years. The first possible reason is data collection; there may have been errors when the counts were collected. The second is that many people may have become more aware of the environment, the expense of driving into the city, and so on, and found alternatives for getting into the city for work or for fun, so the traffic counts dropped dramatically.
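The hourly totals for October could be computed roughly as follows. This is a Python sketch rather than the R code used for this report. The hourly column names match the data set's structure, but the `Date` format, the function name `october_hourly_totals`, and the sample rows are assumptions made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Hourly count columns, named as in the traffic data set.
HOUR_COLUMNS = [
    "AM01", "AM12", "AM23", "AM34", "AM45", "AM56", "AM67", "AM78",
    "AM89", "AM910", "AM1011", "PM1112", "PM1213", "PM1314", "PM1415",
    "PM1516", "PM1617", "PM1718", "PM1819", "PM1920", "PM2021",
    "PM2122", "PM2223", "PM2324",
]


def october_hourly_totals(rows, years=(2015, 2016, 2017),
                          date_format="%m/%d/%Y"):
    """Sum each hourly column over October rows, grouped by year.

    rows: iterable of dicts with a "Date" string and hourly counts.
    date_format is an assumption about how Date is written.
    Missing (None) counts are skipped, similar to dropping NA values.
    """
    totals = {year: Counter() for year in years}
    for row in rows:
        day = datetime.strptime(row["Date"], date_format)
        if day.month != 10 or day.year not in totals:
            continue  # keep only October of the selected years
        for column in HOUR_COLUMNS:
            count = row.get(column)
            if count is not None:
                totals[day.year][column] += count
    return totals


# Made-up rows for illustration only (not real traffic counts).
sample = [
    {"Date": "10/05/2015", "AM01": 10, "AM78": 120, "PM1718": 150},
    {"Date": "10/06/2015", "AM01": 12, "AM78": None, "PM1718": 160},
    {"Date": "09/30/2015", "AM01": 99, "AM78": 99, "PM1718": 99},
    {"Date": "10/03/2016", "AM01": 8, "AM78": 110, "PM1718": 140},
]
totals = october_hourly_totals(sample)
print(totals[2015]["AM01"], totals[2015]["AM78"], totals[2015]["PM1718"])
```

Each year ends up with one total per hour column, which is the shape a 24-hour clock graph needs.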
## ID Segment_ID Roadway From To Direction
## "character" "character" "character" "character" "character" "character"
## Date AM01 AM12 AM23 AM34 AM45
## "character" "integer" "integer" "integer" "integer" "integer"
## AM56 AM67 AM78 AM89 AM910 AM1011
## "integer" "integer" "integer" "integer" "integer" "integer"
## PM1112 PM1213 PM1314 PM1415 PM1516 PM1617
## "integer" "integer" "integer" "integer" "integer" "integer"
## PM1718 PM1819 PM1920 PM2021 PM2122 PM2223
## "integer" "integer" "integer" "integer" "integer" "integer"
## PM2324
## "integer"
## [1] 18406766
## Using Time1 as id variables
Last but not least, the Congestion Pricing Plan suggests that cars will be charged $11 and trucks $25, so I use another data set with the counts of different vehicle types in the city by hour in October from 2015 to 2017. I hoped to estimate the amount of money the plan would raise. As we can see from the graphs, autos still make up the largest proportion of all vehicles. My rough estimate is around 0.2 billion dollars per year. The plan suggests that it will raise 1 billion dollars per year, which is different from my calculation. Many factors may play a role in this difference. First of all, it may be exaggeration by the press, which may encourage people to agree with the plan since the 1 billion dollars would be used for the public. Secondly, the data collection may have errors; when I cleaned up the data, I found a lot of NAs in the dates. Thirdly, the month I chose may not be the busiest month of the year in NY. I assume that there is more travel in the summer than in other months, so the vehicle counts may be higher then.

From the technical point of view, I found an interesting package called waffle. It would be cooler if I could use fontAwesome in my graphs to replace the squares with cars. However, I tried everything and could not make it work. I added the last animation because the static stacked histogram could not present each category clearly, since there are seven categories; showing each category one by one is much clearer.
## ID Segment_ID Roadway From To Direction
## "character" "character" "character" "character" "character" "character"
## Date Type AM01 AM12 AM23 AM34
## "character" "character" "integer" "integer" "integer" "integer"
## AM45 AM56 AM67 AM78 AM89 AM910
## "integer" "integer" "integer" "integer" "integer" "integer"
## AM1011 PM1112 PM1213 PM1314 PM1415 PM1516
## "integer" "integer" "integer" "integer" "integer" "integer"
## PM1617 PM1718 PM1819 PM1920 PM2021 PM2122
## "integer" "integer" "integer" "integer" "integer" "integer"
## PM2223 PM2324
## "integer" "integer"
## [1] 1799009
## [1] 242320.4
## [1] 128831.4
## [1] 61569
## [1] 13693.4
## [1] 21649.4
## [1] 37158.6
## Type_of_cars Number_of_cars type_percent
## 1 auto 1799008.8 76
## 2 Taxi 242320.4 11
## 3 Commercial 128831.4 6
## 4 Medium Truck 61569.0 3
## 5 Heavy Truck 13693.4 1
## 6 School Bus 21649.4 1
## 7 Other Bus 37158.6 2
## [1] 706063.8
## [1] 172197.4
## [1] 70681.6
## [1] 36562.8
## [1] 7603
## [1] 7073
## [1] 21344.8
## Type_of_cars Number_of_cars type_percent
## 1 auto 706063.8 68
## 2 Taxi 172197.4 17
## 3 Commercial 70681.6 7
## 4 Medium Truck 36562.8 4
## 5 Heavy Truck 7603.0 1
## 6 School Bus 7073.0 1
## 7 Other Bus 21344.8 2
## [1] 744805.5
## [1] 65678.38
## [1] 48443.8
## [1] 25504.12
## [1] 8150.26
## [1] 15191.72
## [1] 17852.62
## Type_of_cars Number_of_cars type_percent
## 1 auto 744805.46 80
## 2 Taxi 65678.38 7
## 3 Commercial 48443.80 5
## 4 Medium Truck 25504.12 3
## 5 Heavy Truck 8150.26 1
## 6 School Bus 15191.72 2
## 7 Other Bus 17852.62 2
## [1] 218280756
## Type_of_cars estimation_fee_charge
## 1 auto 12999512
## 2 Taxi 1920785
## 3 Commercial 991827
## 4 Medium Truck 1030299
## 5 Heavy Truck 245389
## 6 School Bus 365951
## 7 Other Bus 636300
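The arithmetic behind the estimate can be retraced from the printed tables. This is a Python sketch, not the R code used for this report. The printed per-type figures are reproduced when the three October counts are averaged and multiplied by $12 for autos, taxis, and commercial vehicles and by $25 for trucks and buses, and the annual total is the monthly sum times 12. These rates and groupings are inferred from the printed numbers (the text mentions $11 for cars), so treat them as assumptions.

```python
# October counts per vehicle type for 2015, 2016, and 2017,
# copied from the three tables printed above.
counts = {
    "auto": [1799008.8, 706063.8, 744805.46],
    "Taxi": [242320.4, 172197.4, 65678.38],
    "Commercial": [128831.4, 70681.6, 48443.8],
    "Medium Truck": [61569.0, 36562.8, 25504.12],
    "Heavy Truck": [13693.4, 7603.0, 8150.26],
    "School Bus": [21649.4, 7073.0, 15191.72],
    "Other Bus": [37158.6, 21344.8, 17852.62],
}

# Assumed fees in dollars, inferred from the printed estimates.
CAR_FEE = 12
TRUCK_FEE = 25
fees = {
    "auto": CAR_FEE, "Taxi": CAR_FEE, "Commercial": CAR_FEE,
    "Medium Truck": TRUCK_FEE, "Heavy Truck": TRUCK_FEE,
    "School Bus": TRUCK_FEE, "Other Bus": TRUCK_FEE,
}


def monthly_fee(type_name):
    """Average the three October counts, then apply the assumed fee."""
    average = sum(counts[type_name]) / len(counts[type_name])
    return round(average * fees[type_name])


monthly = {type_name: monthly_fee(type_name) for type_name in counts}
annual = sum(monthly.values()) * 12  # scale one month to a full year

for type_name, fee in monthly.items():
    print(f"{type_name:13s} {fee:>10d}")
print("annual estimate:", annual)
```

Because the annual figure scales a single month by 12, it inherits whatever is unusual about October and about the locations where counts were collected, which is one more reason it can differ from the 1 billion dollar projection.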
## nframes and fps adjusted to match transition
In conclusion, as a New Yorker, whether I like the Congestion Pricing Plan or not, I think it has more advantages than disadvantages. Traffic is always bad in the city. With the plan, people may find alternative ways to get into the city, such as carpooling or public transportation, to avoid the fees, and ultimately it will be better for the environment and give people a better place to enjoy and appreciate.