Inspired by Colin, I browsed around the San Mateo Open Data site.
I noticed that all the data was offered up through something called Socrata and wondered if there is an “R” package for that. Well there is….. and the code below will load it.
install.packages("RSocrata")
library(RSocrata)
library(tidyverse)
To see if it worked, I went back to the San Mateo site, and somewhat randomly chose the Public Wifi Poll data and grabbed its Socrata URL, which is:
https://data.smcgov.org/resource/ffix-cxrc.json
Let’s load that data in to the variable “poll”
poll <- as.tibble(read.socrata("https://data.smcgov.org/resource/ffix-cxrc.json"))
wow! That was easy.
It has five interesting columns:
- time
- easy_access_to_county_services
- educational_opportunities
- good_for_local_business
- service_for_those_with_limited_internet_access
Let’s make the column names shorter:
clean_poll <- setNames(poll,
c("id",
"time",
"easy_access",
"ed_opportunity",
"good_for_biz",
"service_for_all"))
Each of the poll response columns is a “1” if the poll taker checked that box, or otherwise n/a. I want the number 1 or zero – that is I first want to make the columns into numeric type instead of character type, and then change all the NA values to 0 – that’s what the function coalesce does
answer_columns <- 3:6
clean_poll <- clean_poll %>%
mutate_at(answer_columns,as.numeric) %>%
mutate_at(answer_columns,function(x) coalesce(x,0))
So now we have a zero or one for each poll answer. So which is the most popular poll options?
(sum_poll <- summarize(clean_poll,
easy_access =sum(easy_access),
ed_opportunities = sum(ed_opportunity),
good4biz = sum(good_for_biz),
service4underserved = sum(service_for_all),
total=n()))
Looks like people think the best reason is service for those who otherwise don’t have access.
Let’s graph that. to do so, we make a new table called bar_poll, that has one type response per row using the gather function. Then we plot it
bar_poll <- gather(sum_poll,
easy_access,ed_opportunities,good4biz,service4underserved,
key="response", value="count")
ggplot(data = bar_poll) +
geom_col(mapping=aes(x=response,y=count),fill="blue")

ggplot(data = bar_poll,aes(response)) + geom_col(aes(y=count),fill="blue")

LS0tCnRpdGxlOiAiU2FuIE1hdGVvIFB1YmxpYyBXaWZpIE5vdGVib29rIgpvdXRwdXQ6IGh0bWxfbm90ZWJvb2sKLS0tCgpJbnNwaXJlZCBieSBDb2xpbiwgSSBicm93c2VkIGFyb3VuZCB0aGUgW1NhbiBNYXRlbyBPcGVuIERhdGFdKGh0dHBzOi8vZGF0YS5zbWNnb3Yub3JnLykgc2l0ZS4KCkkgbm90aWNlZCB0aGF0IGFsbCB0aGUgZGF0YSB3YXMgb2ZmZXJlZCB1cCB0aHJvdWdoIHNvbWV0aGluZyBjYWxsZWQgW1NvY3JhdGFdKGh0dHBzOi8vc29jcmF0YS5jb20vKSBhbmQgd29uZGVyZWQgaWYgdGhlcmUgaXMgYW4gIlIiIHBhY2thZ2UgZm9yIHRoYXQuIFdlbGwgdGhlcmUgaXMuLi4uLiBhbmQgdGhlIGNvZGUgYmVsb3cgd2lsbCBsb2FkIGl0LgoKYGBge3J9Cmluc3RhbGwucGFja2FnZXMoIlJTb2NyYXRhIikKbGlicmFyeShSU29jcmF0YSkKbGlicmFyeSh0aWR5dmVyc2UpCmBgYAoKVG8gc2VlIGlmIGl0IHdvcmtlZCwgSSB3ZW50IGJhY2sgdG8gdGhlIFNhbiBNYXRlbyBzaXRlLCBhbmQgc29tZXdoYXQgcmFuZG9tbHkgY2hvc2UgdGhlIFtQdWJsaWMgV2lmaSBQb2xsXShodHRwczovL2RhdGEuc21jZ292Lm9yZy9Hb3Zlcm5tZW50L1B1YmxpYy1XaWZpLVBvbGwtUmVzdWx0cy9yNjk4LWY3N2IpIGRhdGEgYW5kIGdyYWJiZWQgaXRzIFNvY3JhdGEgVVJMLCB3aGljaCBpczoKCmh0dHBzOi8vZGF0YS5zbWNnb3Yub3JnL3Jlc291cmNlL2ZmaXgtY3hyYy5qc29uCgpMZXQncyBsb2FkIHRoYXQgZGF0YSBpbiB0byB0aGUgdmFyaWFibGUgInBvbGwiCgpgYGB7cn0KcG9sbCA8LSBhcy50aWJibGUocmVhZC5zb2NyYXRhKCJodHRwczovL2RhdGEuc21jZ292Lm9yZy9yZXNvdXJjZS9mZml4LWN4cmMuanNvbiIpKQpgYGAKCndvdyEgVGhhdCB3YXMgZWFzeS4KCkl0IGhhcyBmaXZlIGludGVyZXN0aW5nIGNvbHVtbnM6CgoqIHRpbWUKKiBlYXN5X2FjY2Vzc190b19jb3VudHlfc2VydmljZXMKKiBlZHVjYXRpb25hbF9vcHBvcnR1bml0aWVzCiogZ29vZF9mb3JfbG9jYWxfYnVzaW5lc3MKKiBzZXJ2aWNlX2Zvcl90aG9zZV93aXRoX2xpbWl0ZWRfaW50ZXJuZXRfYWNjZXNzCgpMZXQncyBtYWtlIHRoZSBjb2x1bW4gbmFtZXMgc2hvcnRlcjoKCmBgYHtyfQpjbGVhbl9wb2xsIDwtIHNldE5hbWVzKHBvbGwsCiAgICBjKCJpZCIsCiAgICAgICJ0aW1lIiwKICAgICAgImVhc3lfYWNjZXNzIiwKICAgICAgImVkX29wcG9ydHVuaXR5IiwKICAgICAgImdvb2RfZm9yX2JpeiIsCiAgICAgICJzZXJ2aWNlX2Zvcl9hbGwiKSkKYGBgCgoKRWFjaCBvZiB0aGUgcG9sbCByZXNwb25zZSBjb2x1bW5zIGlzIGEgIjEiIGlmIHRoZSBwb2xsIHRha2VyIGNoZWNrZWQgdGhhdCBib3gsIG9yIG90aGVyd2lzZSBuL2EuIEkgd2FudCB0aGUgbnVtYmVyIDEgb3IgemVybyAtLSB0aGF0IGlzIEkgZmlyc3Qgd2FudCB0byBtYWtlIHRoZSBjb2x1bW5zIGludG8gbnVtZXJpYyB0eXBlIGluc3RlYWQgb2YgY2hhcmFjdGVyIHR5cGUsIGFuZCB0aGVuIGNoYW5nZSBhbGwgdGhlIE5BIHZhbHVlcyB0byAwIC0tIHRoYXQncyB3aGF0IHRoZSBmdW5jdGlvbiBjb2FsZXNjZSBkb2VzCgpgYGB7cn0KYW5zd2VyX2NvbHVtbnMgPC0gMzo2CmNsZWFuX3BvbGwgPC0gY2xlYW5fcG9sbCAlPiUKICBtdXRhdGVfYXQoYW5zd2VyX2NvbHVtbnMsYXMubnVtZXJpYykgJT4lIAogIG11dGF0ZV9hdChhbnN3ZXJfY29sdW1ucyxmdW5jdGlvbih4KSBjb2FsZXNjZSh4LDApKQpgYGAKCgpTbyBub3cgd2UgaGF2ZSBhIHplcm8gb3Igb25lIGZvciBlYWNoIHBvbGwgYW5zd2VyLiBTbyB3aGljaCBpcyB0aGUgbW9zdCBwb3B1bGFyIHBvbGwgb3B0aW9ucz8KCmBgYHtyfQooc3VtX3BvbGwgPC0gc3VtbWFyaXplKGNsZWFuX3BvbGwsCiAgZWFzeV9hY2Nlc3MgPXN1bShlYXN5X2FjY2VzcyksIAogIGVkX29wcG9ydHVuaXRpZXMgPSBzdW0oZWRfb3Bwb3J0dW5pdHkpLCAKICBnb29kNGJpeiA9IHN1bShnb29kX2Zvcl9iaXopLCAKICBzZXJ2aWNlNHVuZGVyc2VydmVkID0gc3VtKHNlcnZpY2VfZm9yX2FsbCksCiAgdG90YWw9bigpKSkKYGBgCgpMb29rcyBsaWtlIHBlb3BsZSB0aGluayB0aGUgYmVzdCByZWFzb24gaXMgc2VydmljZSBmb3IgdGhvc2Ugd2hvIG90aGVyd2lzZSBkb24ndCBoYXZlIGFjY2Vzcy4gCgpMZXQncyBncmFwaCB0aGF0LiB0byBkbyBzbywgd2UgbWFrZSBhIG5ldyB0YWJsZSBjYWxsZWQgYmFyX3BvbGwsIHRoYXQgaGFzIG9uZSB0eXBlIHJlc3BvbnNlIHBlciByb3cgdXNpbmcgdGhlIGdhdGhlciBmdW5jdGlvbi4gVGhlbiB3ZSBwbG90IGl0IAoKYGBge3J9CmJhcl9wb2xsIDwtIGdhdGhlcihzdW1fcG9sbCwKICAgIGVhc3lfYWNjZXNzLGVkX29wcG9ydHVuaXRpZXMsZ29vZDRiaXosc2VydmljZTR1bmRlcnNlcnZlZCwKICAgIGtleT0icmVzcG9uc2UiLCB2YWx1ZT0iY291bnQiKQoKZ2dwbG90KGRhdGEgPSBiYXJfcG9sbCkgKyAKICBnZW9tX2NvbChtYXBwaW5nPWFlcyh4PXJlc3BvbnNlLHk9Y291bnQpLGZpbGw9ImJsdWUiKQpgYGAKCmBgYHtyfQpnZ3Bsb3QoZGF0YSA9IGJhcl9wb2xsLGFlcyhyZXNwb25zZSkpICsgZ2VvbV9jb2woYWVzKHk9Y291bnQpLGZpbGw9ImJsdWUiKQpgYGAKCg==