I verified that the data is open by checking the website I got it from. The dataset, https://data.london.gov.uk/dataset/household-waste-recycling-rates-borough, is part of the London Datastore, which describes itself as “a free and open data-sharing portal.”
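library(readxl)
library(dplyr)  # provides %>%, filter(), select() and top_n()
household_recycling_borough <- read_excel("Desktop/household-recycling-borough.xls",
                                          sheet = "Household Recycling Rates")
# Drop rows with no area name
recycling_nona <- household_recycling_borough %>%
  filter(!is.na(Area))
# Keep the last five years, then the five highest rows. With no weighting column,
# top_n() ranks by the last column (2017/18). Area is not limited to boroughs:
# rows such as East and South West can also appear in the result.
recycling_columns <- recycling_nona %>%
  select(Area, "2013/14", "2014/15", "2015/16", "2016/17", "2017/18") %>%
  top_n(5)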
## Selecting by 2017/18
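The message above appears because top_n() was called without a ranking column, so it falls back to the last one. As a minimal sketch of a more explicit alternative (not the code used in this report), slice_max() from dplyr names the ranking column directly. The small data frame `rates` is only an illustration; its values are copied from the 2017/18 column of the table in this report.

```r
library(dplyr)

# 2017/18 values copied from the table in this report (illustrative subset)
rates <- data.frame(
  Area = c("Bexley", "Bromley", "Sutton", "East", "South West"),
  `2017/18` = c(52.06, 49.99, 50.05, 48.97, 49.04),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Name the ranking column explicitly instead of relying on the last column
top3 <- rates %>%
  slice_max(order_by = `2017/18`, n = 3)

print(top3$Area)
```

Because the column is named in the call, the result does not change if another column is later added to the right of 2017/18.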
library(knitr)
kable(recycling_columns, digits = 2, caption = "Current Top 5 London Boroughs Recycling Rates Over Last 5 Years")
| Area | 2013/14 | 2014/15 | 2015/16 | 2016/17 | 2017/18 |
|---|---|---|---|---|---|
| Bexley | 55.2 | 54.0 | 52.0 | 52.7 | 52.06 |
| Bromley | 49.6 | 48.0 | 46.3 | 46.9 | 49.99 |
| Sutton | 37.1 | 37.6 | 34.7 | 36.5 | 50.05 |
| East | 49.3 | 49.3 | 49.3 | 49.4 | 48.97 |
| South West | 46.7 | 47.6 | 47.6 | 48.3 | 49.04 |
library(tidyr)  # provides gather()
# Shorten the year labels, then reshape from wide (one column per year) to long
recycling_long <- recycling_columns %>%
  rename("2013" = "2013/14",
         "2014" = "2014/15",
         "2015" = "2015/16",
         "2016" = "2016/17",
         "2017" = "2017/18") %>%
  gather(key = "Year", value = "Recycling", 2:6)
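A line chart needs one row per area and year, which is why the wide table is reshaped before plotting. As a minimal sketch of the same kind of reshape, the snippet below uses pivot_longer(), the newer tidyr function that supersedes gather(). It is an alternative, not the code used in this report, and the two-row data frame `wide` is just a small copy of values from the table above.

```r
library(tidyr)

# Two rows copied from the table in this report (wide: one column per year)
wide <- data.frame(
  Area = c("Bexley", "Sutton"),
  `2016/17` = c(52.7, 36.5),
  `2017/18` = c(52.06, 50.05),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Long format: one row per Area and Year
long <- pivot_longer(
  wide,
  cols = -Area,          # reshape every column except Area
  names_to = "Year",     # old column names go here
  values_to = "Recycling" # old cell values go here
)

print(long)
```

Selecting columns with `-Area` rather than by position (2:6) keeps the reshape working if the number of year columns changes.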
library(ggplot2)
ggplot(recycling_long, aes(x = Year, y = Recycling, color = Area)) +
  geom_point() + geom_line(aes(group = Area)) +
  ggtitle("Current Top 5 London Boroughs Recycling Rates Over Last 5 Years")
# Year is text after gather(), so compare with "2017"; fill (not color) shades the whole bar
recycling_2017 <- subset(recycling_long, Year == "2017")
ggplot(recycling_2017, aes(x = Area, y = Recycling)) + geom_col(aes(fill = Area)) +
  ggtitle("Top 5 Recycling Rates of London Boroughs in 2017")
I found it interesting that some of the recycling rates have gone down over the years (Bexley fell from 55.2 in 2013/14 to 52.06 in 2017/18), while Sutton increased its rate dramatically, from 36.5 in 2016/17 to 50.05 in 2017/18. I also thought the distinction between open data and public data was interesting, because I had previously assumed that public data was open data. I ran into two issues while creating this report. First, I struggled to import the Excel data into RStudio. Second, I had a hard time figuring out how to make a line graph from the table data.
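For the Excel import problem, one approach that may help is to list the sheet names before reading, so the sheet argument can be copied exactly. The sketch below is only an illustration: it uses datasets.xlsx, an example workbook bundled with readxl, as a stand-in for the recycling file, and the variable names are hypothetical.

```r
library(readxl)

# Example workbook shipped with readxl; a stand-in for the recycling file
path <- readxl_example("datasets.xlsx")

# List the sheet names first, so the sheet argument can be copied exactly
sheets <- excel_sheets(path)
print(sheets)

# Read one sheet by name and check the column names before any filtering
mtcars_sheet <- read_excel(path, sheet = "mtcars")
print(names(mtcars_sheet))
```

The same two calls, excel_sheets() and then read_excel(), should work on the downloaded .xls file once `path` points to it.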