First, we’ll load the three packages that we will use. Each adds particular functionality to R.
library(tidyverse)
library(janitor)
library(lubridate)
We’ll read data from a URL. Note that the data is anonymized and was “published” via Google Sheets:
form_data <- read_csv("https://docs.google.com/spreadsheets/d/e/2PACX-1vQWUfMMUboX4YMwz97zz_Z-lh2NrBbMbxVOf56LQooDHtdepLh-_-yXKOLOGHNdRmZI2Xn2kUVtstHH/pub?output=csv")
Let’s rename the variables so they are easier to use.
Before that, we’ll clean the names with clean_names(), which converts them to a consistent snake_case format:
form_data <- clean_names(form_data)
names(form_data)
## [1] "timestamp"
## [2] "what_is_the_extent_of_your_prior_experience_using_r_markdown"
## [3] "why_in_brief_are_you_interested_in_learning_r_markdown_or_r"
## [4] "are_you_affiliated_with_the_university_of_tennessee_knoxville_as_a_student_or_faculty_staff_member"
## [5] "do_you_have_any_other_questions_or_comments"
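To see what clean_names() does in isolation, here is a minimal sketch on a made-up one-row table. The `toy` tibble and its header text are assumptions for illustration, modeled on the kind of question-style headers a form export produces; they are not the actual raw headers of the sheet.

```r
library(tibble)
library(janitor)

# Hypothetical table with form-style column headers (illustrative only)
toy <- tibble(
  `Timestamp` = "5/10/2020 9:55:28",
  `Do you have any other questions or comments?` = NA_character_
)

# clean_names() lower-cases the headers, replaces spaces with underscores,
# and drops punctuation such as the trailing question mark
cleaned <- clean_names(toy)
names(cleaned)
```

Only the column names change; the values in each column are left untouched.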
The cleaned names are still very long, but they can now be typed without backticks, which makes them easier to rename:
form_data <- form_data %>%
  rename(prior_exp = what_is_the_extent_of_your_prior_experience_using_r_markdown,
         why_interest = why_in_brief_are_you_interested_in_learning_r_markdown_or_r,
         utk_affiliated = are_you_affiliated_with_the_university_of_tennessee_knoxville_as_a_student_or_faculty_staff_member,
         comments = do_you_have_any_other_questions_or_comments)
One variable, timestamp, needs some additional processing to be more useful to us.
It needs to be recognized as a date-time:
form_data <- form_data %>%
  mutate(timestamp = mdy_hms(timestamp))
Let’s check that it worked:
form_data
## # A tibble: 34 x 5
##    timestamp           prior_exp why_interest       utk_affiliated comments
##    <dttm>                  <dbl> <chr>              <chr>          <chr>
##  1 2020-05-10 09:55:28         5 Educational stati… No             Affiliation:…
##  2 2020-05-10 19:21:37         1 I'm a grad studen… No             Thanks for o…
##  3 2020-05-10 19:46:19         2 I currently use R… No             <NA>
##  4 2020-05-10 19:51:28         3 Producing formatt… No             <NA>
##  5 2020-05-10 20:21:42         1 Never used R mark… No             Thank you ve…
##  6 2020-05-10 21:54:18         1 <NA>               No             <NA>
##  7 2020-05-10 23:29:06         2 Teach Data Scienc… No             <NA>
##  8 2020-05-11 00:30:11         1 <NA>               <NA>           <NA>
##  9 2020-05-11 04:49:11         2 I want to use it … No             <NA>
## 10 2020-05-11 05:18:46         1 This is Ben Gibbo… No             <NA>
## # … with 24 more rows
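The <dttm> type under timestamp confirms the conversion. As a minimal sketch of what mdy_hms() does to a single value, the example below parses one string. The `raw` value is an assumption about how the sheet stores timestamps (month/day/year followed by the time), chosen to match the first row above.

```r
library(lubridate)

# Hypothetical raw timestamp in month/day/year hour:minute:second order
raw <- "5/10/2020 9:55:28"

# mdy_hms() reads the string in that order and returns a date-time (POSIXct);
# with no tz argument, the result is in UTC
parsed <- mdy_hms(raw)

class(parsed)
parsed
```

Once the column is a date-time rather than text, helpers such as hour() or wday() can be applied to it directly.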