Using the given code, answer the questions below.

library(tidyverse) 

class_roster <- read.csv("~/R/business sat/DATA/classRoster02.csv") %>%
  as_tibble()
class_roster
## # A tibble: 30 x 5
##        X Student Class     Major                   income
##    <int> <fct>   <fct>     <fct>                    <int>
##  1     1 Scott   Sophomore Marketing                 1010
##  2     2 Colette Sophomore Business Administration    920
##  3     3 Niti    Senior    Business Administration   1031
##  4     4 Tyler   Sophomore Management                1064
##  5     5 Ryan    Sophomore Undeclared                1021
##  6     6 Jack    Sophomore Business Administration   1053
##  7     7 Michael Sophomore Business Administration   1001
##  8     8 Brianna Sophomore Marketing                 1156
##  9     9 Trevor  Sophomore Sports Management         1019
## 10    10 Connor  Sophomore Sports Management          848
## # ... with 20 more rows

Q1. What does the row represent?

Each row represents one student in the class.

Q2. What characteristics of students (variables) does the data describe?

The student's name (Student), class standing (Class), major (Major), and income (income). The X column is only a row number.

Q3. What type of data are the variables (i.e., numeric, character, logical)?

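Not all of them are numeric. X and income are integers (numeric). Student, Class, and Major are factors, that is, categorical data that read.csv() stored as factors here rather than as plain character strings.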
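
One way to check the column types directly is to ask each column for its class. The sketch below is only an illustration: `roster_sample` is a hypothetical stand-in built from the first three rows printed above, since the CSV file itself is not available here.

```r
# Hypothetical stand-in for class_roster, using the first three printed rows.
# Columns are typed to match the printed output (int and fct).
roster_sample <- data.frame(
  X       = 1:3,
  Student = factor(c("Scott", "Colette", "Niti")),
  Class   = factor(c("Sophomore", "Sophomore", "Senior")),
  Major   = factor(c("Marketing", "Business Administration",
                     "Business Administration")),
  income  = c(1010L, 920L, 1031L)
)

# class() of every column, returned as a named character vector
col_types <- sapply(roster_sample, class)
print(col_types)
```

With the real data, `sapply(class_roster, class)` should give the same kind of summary.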

Q4. What type of R object is class_roster (i.e., vector, matrix, data frame, list)? And why?

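Data frame (more precisely a tibble, which is a kind of data frame, because of as_tibble()). The columns hold different types of data (integers and factors) but all have the same length, which a vector or a matrix cannot represent. Data frames can also have additional attributes such as row names, and they are usually created by read.csv() and read.table().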
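
The reasoning above can be checked with a few base R predicates. This is a minimal sketch using a hypothetical two-column stand-in (`roster_sample`) rather than the real `class_roster`. It shows that a data frame is a list of columns, and that forcing mixed columns into a matrix converts everything to one type.

```r
# Hypothetical stand-in with one character column and one integer column
roster_sample <- data.frame(
  Student = c("Scott", "Colette", "Niti"),
  income  = c(1010L, 920L, 1031L)
)

is_df     <- is.data.frame(roster_sample)  # TRUE
is_a_list <- is.list(roster_sample)        # TRUE: a data frame is a list of columns
is_mat    <- is.matrix(roster_sample)      # FALSE

# A matrix can hold only one type, so the integers are coerced to character
roster_matrix <- as.matrix(roster_sample)
matrix_type   <- typeof(roster_matrix)     # "character"
```

For the real object, `class(class_roster)` would also report `tbl_df` and `tbl`, as the str() output for this data shows.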

Q5. How many students are in class?

30

str(class_roster)
## Classes 'tbl_df', 'tbl' and 'data.frame':    30 obs. of  5 variables:
##  $ X      : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Student: Factor w/ 28 levels "Amy","Andrew",..: 26 8 24 28 25 13 22 4 27 9 ...
##  $ Class  : Factor w/ 5 levels "Fifth Year (Senior +)",..: 5 5 4 5 5 5 5 5 5 5 ...
##  $ Major  : Factor w/ 7 levels "Accounting","Business Administration",..: 5 2 2 4 7 2 2 5 6 6 ...
##  $ income : int  1010 920 1031 1064 1021 1053 1001 1156 1019 848 ...
class_roster %>% View()

Q6. How many sophomores are in class?

22

class_roster %>%
  count(Class, sort = TRUE)
## # A tibble: 5 x 2
##   Class                     n
##   <fct>                 <int>
## 1 Sophomore                22
## 2 Fifth Year (Senior +)     2
## 3 First Year Student        2
## 4 Junior                    2
## 5 Senior                    2

Q7. Create a column chart for the data you created in Q6.

Hint: Use ggplot2 package

class_roster %>%
  count(Class, sort = TRUE) %>%
  ggplot(aes(Class, n)) +
  geom_col()

Q8. Repeat Q6 and Q7 for Major.

class_roster %>%
  count(Major, sort = TRUE)
## # A tibble: 7 x 2
##   Major                         n
##   <fct>                     <int>
## 1 Business Administration       9
## 2 Marketing                     8
## 3 Sports Management             6
## 4 Management                    3
## 5 Undeclared                    2
## 6 Accounting                    1
## 7 Interdisciplinary Studies     1
class_roster %>%
  count(Major, sort = TRUE) %>%
  ggplot(aes(Major, n)) +
  geom_col()
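
Note that `sort = TRUE` orders the rows of the count table, but ggplot2 still places the bars in factor-level order (alphabetical here). If bars ordered by count are wanted, one possible approach, shown as a sketch and not as part of the assignment, is to reorder the factor levels with base R's `reorder()`. The data frame `major_counts` is a hypothetical stand-in typed in from the count output above.

```r
library(ggplot2)

# Counts copied from the count(Major, sort = TRUE) output above
major_counts <- data.frame(
  Major = c("Business Administration", "Marketing", "Sports Management",
            "Management", "Undeclared", "Accounting",
            "Interdisciplinary Studies"),
  n = c(9, 8, 6, 3, 2, 1, 1)
)

# reorder() sorts the levels by -n, so the most common major comes first
major_counts$Major <- reorder(major_counts$Major, -major_counts$n)

major_plot <- ggplot(major_counts, aes(Major, n)) +
  geom_col()

print(major_plot)
```

The same idea applies to the Class chart in Q7.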

Q9. How is students’ income (numeric data) distributed?

class_roster %>%
  ggplot(aes(income)) +
  geom_histogram()

Q10. Plot students’ income distribution by class.

Hint: Add facet_wrap to the code for Q9.
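
Following the hint, a minimal sketch of one possible answer adds `facet_wrap(~ Class)` to the Q9 histogram so that each class gets its own panel. The data frame `roster_sample` is a hypothetical stand-in containing only the ten rows printed at the top, and `bins = 10` is an arbitrary choice; neither comes from the original assignment.

```r
library(ggplot2)

# Hypothetical stand-in for class_roster: only the ten rows printed above
roster_sample <- data.frame(
  Student = c("Scott", "Colette", "Niti", "Tyler", "Ryan",
              "Jack", "Michael", "Brianna", "Trevor", "Connor"),
  Class   = c("Sophomore", "Sophomore", "Senior", "Sophomore", "Sophomore",
              "Sophomore", "Sophomore", "Sophomore", "Sophomore", "Sophomore"),
  income  = c(1010, 920, 1031, 1064, 1021, 1053, 1001, 1156, 1019, 848)
)

# Same histogram as Q9, split into one panel per Class
income_by_class <- ggplot(roster_sample, aes(income)) +
  geom_histogram(bins = 10) +   # assumed bin count, adjust as needed
  facet_wrap(~ Class)

print(income_by_class)
```

With the full data, the same two plotting layers can be piped from `class_roster` exactly as in Q9, with `facet_wrap(~ Class)` added as the last line.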