Using the given code, answer the questions below.

library(tidyverse) 

class_roster <- read.csv("~/R/busStat/Data/classRoster01.csv") %>%
  as_tibble()
class_roster
## # A tibble: 21 x 6
##        X Student  Class           Major               income fav_holiday   
##    <int> <fct>    <fct>           <fct>                <int> <fct>         
##  1     1 Abigail  Junior          Management            1010 N/A           
##  2     2 Anthony  Sophomore       Sports Management      920 "Christmas "  
##  3     3 Lauren   Senior          Business Administr~   1031 "Thanksgiving~
##  4     4 Jonathan Junior          Finance               1064 N/A           
##  5     5 Zachary  Sophomore       Sports Management     1021 N/A           
##  6     6 Tayla    Senior          Business Administr~   1053 "Thanksgiving~
##  7     7 James    First Year Stu~ Undeclared            1001 "Easter "     
##  8     8 Jillian  First Year Stu~ Undeclared            1156 Mothers Day   
##  9     9 Luis     First Year Stu~ Business Administr~   1019 N/A           
## 10    10 Nicholas First Year Stu~ Marketing              848 "New Years Ev~
## # ... with 11 more rows

Add a variable of your own to answer a research question you might have, e.g., What is the favorite number of PSU students? Do favorite numbers vary by major or class?

Favorite holiday (fav_holiday) was added as the new variable for the research question.

Q1. What does the row represent?

Each row represents one student.

Q2. What characteristics of students (variables) does the data describe?

The data describe each student's name (Student), class year (Class), major (Major), and income, plus the added fav_holiday. The X column appears to be only a row index.

Q3. What type of data is the new variable you added (i.e., numeric, character, logical)?

The new variable I added, fav_holiday, holds character (text) data. The <fct> tag in the printout shows that R stored it as a factor.
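One way to check a column's type is with class(). The sketch below is only an illustration: `toy_holiday` is a hypothetical stand-in for class_roster$fav_holiday, built from three values that are fully visible in the printout above.

```r
# Hypothetical stand-in for class_roster$fav_holiday (not the real data file).
toy_holiday <- factor(c("N/A", "Christmas ", "Easter "))

# A factor stores text labels as categories.
holiday_class <- class(toy_holiday)

# Converting to character gives plain text values.
holiday_text <- as.character(toy_holiday)
text_class <- class(holiday_text)

# The variable is not numeric, so numeric summaries such as mean() do not apply.
holiday_is_numeric <- is.numeric(toy_holiday)

print(holiday_class)
print(text_class)
print(holiday_is_numeric)
```

On the real data, class(class_roster$fav_holiday) would be the equivalent check.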

Q4. What type of R object is class_roster (i.e., vector, matrix, data frame, list)? And why?

class_roster is a data frame (more precisely a tibble, which is a kind of data frame). It looks like a table: each row is a student, and each column is a variable such as income, Major, or fav_holiday, and the columns hold different types of data.
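The object type can also be checked in code rather than by eye. This is a minimal sketch, assuming the tibble package is installed; `toy_roster` is a hypothetical two-row stand-in built from values in the printout above, not the real file.

```r
library(tibble)

# Hypothetical two-row stand-in for class_roster.
toy_roster <- tibble(
  Student = c("Abigail", "Anthony"),
  income  = c(1010L, 920L)
)

# A tibble carries the data.frame class along with its own classes.
roster_class <- class(toy_roster)

# It is a data frame, and (like every data frame) a list of equal-length columns,
# but it is not a matrix, because the columns have different types.
roster_is_df     <- is.data.frame(toy_roster)
roster_is_matrix <- is.matrix(toy_roster)
column_types     <- sapply(toy_roster, class)

print(roster_class)
print(column_types)
```

Mixed column types (text next to integers) are what rule out a vector or a matrix here.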

Q5. Describe the first student (first row) using all variables.

Hint: Use View().

The first student (X = 1) is Abigail, a junior majoring in Management with an income of $1,010. Her fav_holiday is N/A because she was not here to vote.

Q6. Count the number of values in your new variable.

Hint: Use count().

class_roster %>% count(fav_holiday, sort = TRUE)
## # A tibble: 7 x 2
##   fav_holiday          n
##   <fct>            <int>
## 1 N/A                 11
## 2 "Thanksgiving "      3
## 3 "Christmas "         2
## 4 "Easter "            2
## 5 "4th of July "       1
## 6 Mothers Day          1
## 7 "New Years Eve "     1
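The quotation marks in the output above show that several values carry a trailing space (for example "Christmas "), and the missing answers are stored as the text "N/A" rather than as a true missing value. A possible cleanup is sketched below, assuming dplyr, stringr, and tibble are available; `toy` is a hypothetical stand-in with a few values copied from the output, and this step is an optional addition, not part of the assignment.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical stand-in for the fav_holiday column.
toy <- tibble(
  fav_holiday = c("N/A", "Christmas ", "Easter ", "N/A", "Easter ")
)

cleaned <- toy %>%
  mutate(
    fav_holiday = str_trim(fav_holiday),     # drop leading/trailing spaces
    fav_holiday = na_if(fav_holiday, "N/A")  # turn the text "N/A" into a real NA
  )

# Count again on the cleaned values.
cleaned_counts <- cleaned %>% count(fav_holiday, sort = TRUE)
print(cleaned_counts)
```

With real NA values, the non-responses can later be dropped with filter(!is.na(fav_holiday)) if only actual answers should be counted.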

Q7. Plot your new variable.

Hint: Refer to the ggplot2 cheatsheet. Google it. See the section for One Variable. Note that there are two different cases: 1) Continuous and 2) Discrete. The type of chart you can use depends on what type of data your variable is.

class_roster %>%
  ggplot(aes(fav_holiday)) +
  geom_bar()
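Because fav_holiday is discrete, a bar chart is the matching choice. The bars can be made easier to read by ordering them by frequency and flipping the axes so the labels do not overlap. The sketch below assumes ggplot2 and forcats are installed; `toy` is a hypothetical stand-in for class_roster.

```r
library(ggplot2)
library(forcats)

# Hypothetical stand-in for class_roster with only the new variable.
toy <- data.frame(
  fav_holiday = c("N/A", "Christmas ", "N/A", "Easter ", "N/A", "Easter ")
)

# fct_infreq() reorders the categories from most to least frequent.
ordered_levels <- levels(fct_infreq(toy$fav_holiday))

p <- ggplot(toy, aes(fct_infreq(fav_holiday))) +
  geom_bar() +
  labs(x = "fav_holiday") +
  coord_flip()  # horizontal bars keep long holiday names readable

print(ordered_levels)
p
```

The same two changes (fct_infreq() inside aes() and coord_flip()) could be applied to the class_roster pipeline above.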

Q8. Does your answer in Q6 vary by major?

Hint: Use dplyr::group_by in addition to count().

class_roster %>% 
  group_by(Major) %>% 
  count(fav_holiday, sort = TRUE) %>%
  arrange(Major, n)
## # A tibble: 16 x 3
## # Groups:   Major [9]
##    Major                        fav_holiday          n
##    <fct>                        <fct>            <int>
##  1 Accounting                   N/A                  2
##  2 Business Administration      "Easter "            1
##  3 Business Administration      "Thanksgiving "      2
##  4 Business Administration      N/A                  3
##  5 Finance                      N/A                  1
##  6 Health Education & Promotion N/A                  1
##  7 Management                   N/A                  1
##  8 Marketing                    "Christmas "         1
##  9 Marketing                    "New Years Eve "     1
## 10 Nursing                      "Thanksgiving "      1
## 11 Sports Management            "Christmas "         1
## 12 Sports Management            N/A                  2
## 13 Undeclared                   "4th of July "       1
## 14 Undeclared                   "Easter "            1
## 15 Undeclared                   Mothers Day          1
## 16 Undeclared                   N/A                  1
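Raw counts are hard to compare across majors of different sizes, so one option is to add the share of each answer within its major. The sketch below assumes dplyr and tibble are available; `toy` is a hypothetical four-row stand-in using rows visible in the first printout, and the `share` column is an addition for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for class_roster (four students, two majors).
toy <- tibble(
  Major       = c("Sports Management", "Sports Management", "Undeclared", "Undeclared"),
  fav_holiday = c("Christmas ", "N/A", "Easter ", "Mothers Day")
)

shares <- toy %>%
  group_by(Major) %>%
  count(fav_holiday) %>%         # result is still grouped by Major
  mutate(share = n / sum(n)) %>% # so sum(n) is the total within each major
  ungroup() %>%
  arrange(Major, desc(share))

print(shares)
```

If the shares look similar from one major to the next, the answer does not vary much by major. With only 21 students, any difference should be read cautiously.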

Q9. To answer Q8, add the second variable to your chart in Q7.

Hint: Use ggplot2::facet_wrap. Refer to the ggplot2 cheatsheet. See the section for Faceting.

class_roster %>%
  ggplot(aes(fav_holiday)) +
  geom_bar() +
  facet_wrap(~Major)