Using the given code, answer the questions below.

library(tidyverse) 

class_roster <- read.csv("~/R/business sat/DATA/classRoster02.csv") %>%
  as_tibble()
class_roster
## # A tibble: 30 x 6
##        X Student Class     Major                   income AreyoufromNh
##    <int> <fct>   <fct>     <fct>                    <int> <fct>       
##  1     1 Scott   Sophomore Marketing                 1010 Y           
##  2     2 Colette Sophomore Business Administration    920 Y           
##  3     3 Niti    Senior    Business Administration   1031 N           
##  4     4 Tyler   Sophomore Management                1064 N           
##  5     5 Ryan    Sophomore Undeclared                1021 N           
##  6     6 Jack    Sophomore Business Administration   1053 N           
##  7     7 Michael Sophomore Business Administration   1001 N           
##  8     8 Brianna Sophomore Marketing                 1156 Y           
##  9     9 Trevor  Sophomore Sports Management         1019 Y           
## 10    10 Connor  Sophomore Sports Management          848 Y           
## # ... with 20 more rows

Add a variable of your own to answer a research question you might have, e.g., What is the favorite number of PSU students? Do favorite numbers vary by major or class?
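One way to add such a variable is dplyr::mutate(). The sketch below is only an illustration: it re-types the first three rows printed above so that it runs on its own, the column name fav_number is hypothetical, and the values are placeholders rather than real survey answers.

```r
library(dplyr)
library(tibble)

# First three rows of the roster as printed above, re-typed so the sketch is self-contained
roster_head <- tribble(
  ~Student,  ~Class,      ~Major,
  "Scott",   "Sophomore", "Marketing",
  "Colette", "Sophomore", "Business Administration",
  "Niti",    "Senior",    "Business Administration"
)

# fav_number is a hypothetical column name; the values are placeholders, not collected data
roster_head <- roster_head %>%
  mutate(fav_number = c(7, 3, 11))

roster_head
```

With the full roster, the vector passed to mutate() would need one value per student (30 in total).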

Q1. What does each row represent?

Each row represents one student.

Q2. What characteristics of students (variables) does the data describe?

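The variables are X (row number), Student (name), Class (class year), Major, income, and AreyoufromNh (whether the student is from NH).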

Q3. What type of data is the new variable you added (e.g., numeric, character, logical)?

The new variable is favorite number, which is numeric.

Q4. What type of R object is class_roster (e.g., vector, matrix, data frame, list)? Why?

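class_roster is a data frame (more precisely a tibble, since it is passed through as_tibble()). A data frame is a table, or two-dimensional array-like structure, in which each column contains the values of one variable and each row contains one set of values from each column. Unlike a matrix, a data frame can hold more than one data type, one per column, which is why it suits this roster of integer and factor columns.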
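The answer can be checked directly in R. The sketch below assumes a small stand-in object, roster_sketch (a hypothetical name), built from two of the columns printed above, so it runs without the CSV file; the same calls work on class_roster.

```r
library(tibble)

# Stand-in for class_roster: two columns from the first two printed rows
roster_sketch <- as_tibble(data.frame(
  Student = c("Scott", "Colette"),
  income  = c(1010L, 920L)
))

class(roster_sketch)          # class vector includes "tbl_df" and "data.frame"
is.data.frame(roster_sketch)  # TRUE: a tibble is still a data frame
sapply(roster_sketch, class)  # each column has its own type
```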

Q5. Describe the first student (first row) using all variables.

Hint: Use View().

Scott, Sophomore, Marketing, 11

Q6. Count the number of values in your new variable.

Hint: Use count().

class_roster %>% count(AreyoufromNh, sort = TRUE)
## # A tibble: 2 x 2
##   AreyoufromNh     n
##   <fct>        <int>
## 1 Y               17
## 2 N               13

Q7. Plot your new variable.

Hint: Refer to the ggplot2 cheatsheet. Google it. See the section for One Variable. Note that there are two different cases: 1) Continuous and 2) Discrete. The type of chart you can use depends on what type of data your variable is.

# AreyoufromNh is discrete (a factor), so use geom_bar();
# geom_histogram() expects a continuous variable.
# The layer must be joined to ggplot() with `+`.
class_roster %>% 
  ggplot(aes(AreyoufromNh)) +
  geom_bar()

Q8. Does your answer in Q6 vary by major?

Hint: Use dplyr::group_by in addition to count().
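A minimal sketch of the hinted approach follows. It assumes a stand-in tibble, roster10 (a hypothetical name), re-typed from the first ten rows printed above so that it runs without the CSV file. The counts therefore cover 10 students, not all 30, and will not match the totals in Q6.

```r
library(dplyr)
library(tibble)

# Major and AreyoufromNh for the first ten students shown in the printed roster
roster10 <- tribble(
  ~Major,                    ~AreyoufromNh,
  "Marketing",               "Y",
  "Business Administration", "Y",
  "Business Administration", "N",
  "Management",              "N",
  "Undeclared",              "N",
  "Business Administration", "N",
  "Business Administration", "N",
  "Marketing",               "Y",
  "Sports Management",       "Y",
  "Sports Management",       "Y"
)

# Count Y/N answers separately within each major
by_major <- roster10 %>%
  group_by(Major) %>%
  count(AreyoufromNh, sort = TRUE)

by_major
```

On the full data, replacing roster10 with class_roster gives one row per combination of Major and AreyoufromNh that actually occurs.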

Q9. To answer Q8, add the second variable to your chart in Q7.

Hint: Use ggplot2::facet_wrap. Refer to the ggplot2 cheatsheet. See the section for Faceting.
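The hint can be sketched as below, again under the assumption of a stand-in tibble, roster10 (a hypothetical name), re-typed from the first ten printed rows. facet_wrap(~ Major) draws one bar chart of AreyoufromNh per major.

```r
library(ggplot2)
library(tibble)

# Major and AreyoufromNh for the first ten students shown in the printed roster
roster10 <- tribble(
  ~Major,                    ~AreyoufromNh,
  "Marketing",               "Y",
  "Business Administration", "Y",
  "Business Administration", "N",
  "Management",              "N",
  "Undeclared",              "N",
  "Business Administration", "N",
  "Business Administration", "N",
  "Marketing",               "Y",
  "Sports Management",       "Y",
  "Sports Management",       "Y"
)

# Bar chart of the discrete variable, split into one panel per major
p <- ggplot(roster10, aes(AreyoufromNh)) +
  geom_bar() +
  facet_wrap(~ Major)

p
```

With class_roster in place of roster10, the same three layers produce one panel for every major in the full roster.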