Albert Y. Kim
Monday 2015/04/06
h/t to @armalarm42
What is the most important thing to keep in mind when observing such maps?
Strong selection biases at every step:
That is, the sample is non-random, so it is not representative of the population as a whole, and hence results based on the sample do not generalize to the entire US population.
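To see why a non-random sample misleads, here is a small simulation sketch in R. Everything in it is hypothetical and only for illustration: the population of ages, the assumption that younger people are more likely to end up in the sample, and the names age, inclusion_weight, biased_sample and random_sample are made up here and are not taken from the maps.

```r
set.seed(76)

# Hypothetical population: 10,000 people with ages from 18 to 90.
age <- sample(18:90, size = 10000, replace = TRUE)

# Assumption for illustration: the chance of being observed falls with age,
# so an 18-year-old is far more likely to be sampled than a 90-year-old.
inclusion_weight <- 91 - age

# A sample drawn with those unequal chances (selection bias) ...
biased_sample <- sample(age, size = 500, prob = inclusion_weight)
# ... versus a simple random sample of the same size.
random_sample <- sample(age, size = 500)

pop_mean    <- mean(age)
biased_mean <- mean(biased_sample)
random_mean <- mean(random_sample)

# Compare the three means side by side.
c(population = pop_mean, biased = biased_mean, random = random_mean)
```

Under these assumptions the mean age in the biased sample should land well below the population mean, while the mean of the simple random sample should land close to it.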
George Box, one of the most famous statisticians, said: "All models are wrong, but some are useful."
We are getting digital permission to open the Twitter pipeline for our use, just like any other developer.
twitteR package from RStudio (homepage is here)
Text in computer programming is stored as (character) strings. Like handling dates, handling text is a non-sexy, but necessary, task. The stringr package tries to make this task less painful.
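As a rough sketch of what working with strings in stringr looks like, the snippet below runs a few of its functions on a made-up character vector. The vector tweets and its contents are hypothetical and only stand in for real tweet text; the functions themselves (str_length, str_to_lower, str_detect, str_extract) are part of stringr.

```r
library(stringr)

# Hypothetical tweet text, made up for illustration.
tweets <- c("I love #rstats", "Stats is FUN!", "no hashtag here")

# Number of characters in each string.
n_chars <- str_length(tweets)

# Convert everything to lower case so comparisons ignore capitalization.
lower <- str_to_lower(tweets)

# Which strings contain a "#" character?
has_hashtag <- str_detect(tweets, "#")

# Pull out the first hashtag in each string (NA where there is none).
first_hashtag <- str_extract(tweets, "#\\w+")
```

All stringr functions start with str_ and take the character vector as their first argument, which makes them easy to find and to chain together.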
stringr_vignette.Rmd
twitter.R code.
stringr_vignette.Rmd file.