Working from home and pivoting to teaching online has made me realise that my wifi connection is really bad, particularly when it rains. I have been teaching new honours students R and needed a little dataset to demo how to get data into R, so I made a Google Form and put it out on Twitter to confirm to myself that my connection really is worse than most other people's. You can contribute to the data here.
library(tidyverse) # includes readr for .csv files
library(readxl) #for excel files
library(googlesheets4) #read straight from google sheets
library(datapasta) # for pasting data into R
The standard way to get data into R is to read a file that you have downloaded.
speed1 <- read_csv("crappy_internet.csv")
speed2 <- read_excel("crappy_internet.xlsx")
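After reading a file it is worth checking that the columns arrived with the types you expect. The sketch below is a self-contained round trip: it writes a tiny made-up data frame to a temporary .csv and reads it back. The column names mirror the ones used later in this post, but the values and the `toy` and `speed_check` names are placeholders for illustration only.

```r
library(readr)

# Made-up placeholder rows, NOT real survey responses
toy <- data.frame(
  live = c("city", "country"),
  raining = c("no", "yes"),
  home_download = c(45.2, 8.1),
  home_upload = c(17.9, 0.9)
)

# Write to a temporary file so nothing in the working directory is touched
csv_path <- tempfile(fileext = ".csv")
write_csv(toy, csv_path)

# Read it back; show_col_types = FALSE silences the column specification message
speed_check <- read_csv(csv_path, show_col_types = FALSE)

# str() shows one line per column with its type
str(speed_check)
```

If a speed column comes back as character rather than numeric, that usually means there is stray text somewhere in that column of the file.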
But the googlesheets4 package allows you to authenticate your Google account and read data straight from Google Sheets using only the URL. More info here: https://googlesheets4.tidyverse.org/
speed3 <- read_sheet("https://docs.google.com/spreadsheets/d/1yyl4fNMErNQ5mQaYgc2ELF7zF6cEPcfRNUtWr56nkg8/edit#gid=552570759")
## Using an auto-discovered, cached token.
## To suppress this message, modify your code or options to clearly consent to the use of a cached token.
## See gargle's "Non-interactive auth" vignette for more details:
## https://gargle.r-lib.org/articles/non-interactive-auth.html
## The googlesheets4 package is using a cached token for jennyrichmond@gmail.com.
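The message above is about which cached login gets used. A minimal sketch of two ways to be explicit about it follows. Option 1 assumes the sheet is shared so that anyone with the link can view it, which may not be true of your own sheets. The email address in option 2 is a placeholder.

```r
library(googlesheets4)

# Option 1: skip authentication altogether.
# Only works for sheets that are readable by anyone with the link.
gs4_deauth()

# Option 2: keep using your account, but say up front which cached token
# to use, so the package does not have to guess.
# Replace the placeholder address with your own before uncommenting.
# options(gargle_oauth_email = "you@example.com")
```

Either line would go before the call to read_sheet().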
Alternatively, you can copy and “paste” the data into R using the datapasta package. Find the vignette here
speed4 <- # select your data and press Ctrl-C, put your cursor here, choose Addins > Paste as data.frame, then run the chunk
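For a sense of what ends up in the chunk, here is roughly the kind of code the datapasta addin writes at the cursor when you paste as a data frame. The values are made-up placeholders, not real responses, and `speed4_example` is a name chosen only for this sketch.

```r
# Roughly what the addin inserts: a plain data.frame() call with the
# copied cells typed out column by column.
# All values below are hypothetical placeholders.
speed4_example <- data.frame(
  stringsAsFactors = FALSE,
  live = c("city", "country"),
  raining = c("no", "yes"),
  home_download = c(45.2, 8.1),
  home_upload = c(17.9, 0.9)
)

str(speed4_example)
```

Because the data is now written into the script itself, the chunk runs without any file or internet connection.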
library(ggbeeswarm) # add noise to point plots
library(ggeasy) # easy wrappers for difficult to remember ggplot things
library(papaja) # this is mostly a package for writing APA formatted manuscripts, but it also includes a ggplot theme that is nice
speedlong <- speed3 %>%
  select(live, raining, starts_with("home")) %>%
  pivot_longer(cols = home_download:home_upload, names_to = "updown", values_to = "speed")
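If the reshaping step is hard to picture, this small sketch shows what pivot_longer() does to a two-row table. The `wide` and `long` names, the `id` column and the numbers are all invented for the example.

```r
library(tidyr)
library(tibble)

# Two made-up respondents, one row each, with a column per speed measure
wide <- tibble(
  id = 1:2,
  home_download = c(50, 12),
  home_upload = c(20, 1)
)

# Stack the two speed columns: the old column names go into "updown",
# the old cell values go into "speed"
long <- pivot_longer(
  wide,
  cols = home_download:home_upload,
  names_to = "updown",
  values_to = "speed"
)

long
```

Each respondent now has two rows, one for download and one for upload, which is the shape ggplot wants when one variable goes on the x axis and the other on the y axis.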
speedlong %>%
  ggplot(aes(x = updown, y = speed)) +
  geom_point()
This plot is ok, but all the points are on top of each other. Use the ggbeeswarm package to add a little noise.
speedlong %>%
  ggplot(aes(x = updown, y = speed)) +
  geom_beeswarm()
Beeswarm is better but I’d like more noise.
speedlong %>%
  ggplot(aes(x = updown, y = speed)) +
  geom_quasirandom(width = 0.2)
Now I want to know which of these points were collected when it was raining.
speedlong %>%
  ggplot(aes(x = updown, y = speed, colour = raining)) +
  geom_quasirandom(width = 0.2)
speedlong %>%
  ggplot(aes(x = updown, y = speed, colour = raining)) +
  geom_quasirandom(width = 0.2) +
  facet_wrap(~ raining)
Now this version has lots of duplicated information. We probably don’t need the legend. How to remove the legend is something I have to google EVERY TIME. The ggplot solution is + theme(legend.position = "none"), which is hard to remember. The ggeasy package has easy_remove_legend() instead.
speedlong %>%
  ggplot(aes(x = updown, y = speed, colour = raining)) +
  geom_quasirandom(width = 0.2) +
  facet_wrap(~ raining) +
  easy_remove_legend()
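For comparison, here is a sketch of the plain ggplot2 way to drop a legend, using the built-in mtcars data so it runs without the survey sheet. Note that theme(legend.position = "none") removes the whole legend, whereas theme(legend.title = element_blank()) only blanks the legend title.

```r
library(ggplot2)

# Colour is mapped to a variable, so ggplot would normally draw a legend
p <- ggplot(mtcars, aes(x = factor(am), y = mpg, colour = factor(am))) +
  geom_point() +
  theme(legend.position = "none") # removes the whole legend

p
```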
I really dislike the grey default of ggplot. Use theme_apa() from the papaja package to get nice formatting.
speedlong %>%
  ggplot(aes(x = updown, y = speed, colour = raining)) +
  geom_quasirandom(width = 0.2) +
  facet_wrap(~ raining) +
  theme_apa() +
  easy_remove_legend()
Put ggsave("nameofplot.png") at the end of each chunk and it will export the most recent plot.
ggsave("testplot.png")
## Saving 7 x 5 in image
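The "Saving 7 x 5 in image" message appears because no size was given, so ggsave() fell back to the size of the current graphics device. A minimal sketch of being explicit about the plot, size and resolution is below. It uses the built-in mtcars data and a temporary folder, and the `p` and `out_file` names are just for this example.

```r
library(ggplot2)

# Store the plot in an object so ggsave() does not rely on "most recent plot"
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point()

# Save into a temporary folder; swap in your own path as needed
out_file <- file.path(tempdir(), "testplot.png")

ggsave(out_file, plot = p, width = 7, height = 5, units = "in", dpi = 300)
```

Naming the plot object also means the right figure is saved even if another plot has been drawn in between.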
Use fig.path in your setup chunk to export all your plots to a folder. This is where chunk labels are important. If your chunks are not labelled, the exported files will be called “unnamed-chunk-somenumber.png”, BUT if you label the chunk, the file name of the exported plot will be meaningful.
Check out the RMarkdown reference guide for details.
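As a sketch of what that setup chunk line could look like, assuming you want the plots in a folder called figures (the folder name and the chunk label mentioned in the comments are made up for the example):

```r
library(knitr)

# In the setup chunk: send every figure to the figures/ folder.
# The trailing slash matters, because knitr pastes the chunk label
# straight onto the end of fig.path.
opts_chunk$set(fig.path = "figures/")

# With this setting, the first plot from a chunk labelled speed-by-rain
# is written to figures/speed-by-rain-1.png when the default png device
# is in use.
```

Without the trailing slash, "figures" would become a prefix on the file name rather than a folder.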