Using the given code, answer the questions below.
library(tidyverse)
class_roster <- read.csv("~/R/busStat/Data/classRoster01.csv") %>%
as_tibble()
class_roster
## # A tibble: 21 x 6
## X Student Class Major income fav_holiday
## <int> <fct> <fct> <fct> <int> <fct>
## 1 1 Abigail Junior Management 1010 N/A
## 2 2 Anthony Sophomore Sports Management 920 "Christmas "
## 3 3 Lauren Senior Business Administr~ 1031 "Thanksgiving~
## 4 4 Jonathan Junior Finance 1064 N/A
## 5 5 Zachary Sophomore Sports Management 1021 N/A
## 6 6 Tayla Senior Business Administr~ 1053 "Thanksgiving~
## 7 7 James First Year Stu~ Undeclared 1001 "Easter "
## 8 8 Jillian First Year Stu~ Undeclared 1156 Mothers Day
## 9 9 Luis First Year Stu~ Business Administr~ 1019 N/A
## 10 10 Nicholas First Year Stu~ Marketing 848 "New Years Ev~
## # ... with 11 more rows
Each row represents one student.
The variables that describe each student are Student, Class, Major, income, and fav_holiday; X is only a row index.
The variables hold different types of data. Student, Class, Major, and fav_holiday are character (text) data, which R stores here as factors. income is numeric (integer).
class_roster is a tibble, that is, a data frame that carries the classes tbl_df, tbl, and data.frame:
str(class_roster)
## Classes 'tbl_df', 'tbl' and 'data.frame': 21 obs. of 6 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Student : Factor w/ 20 levels "Abigail","Anthony",..: 1 2 15 12 20 19 10 11 16 17 ...
## $ Class : Factor w/ 5 levels "Fifth Year (Senior +)",..: 3 5 4 3 5 4 2 2 2 2 ...
## $ Major : Factor w/ 9 levels "Accounting","Business Administration",..: 5 8 2 3 8 2 9 9 2 6 ...
## $ income : int 1010 920 1031 1064 1021 1053 1001 1156 1019 848 ...
## $ fav_holiday: Factor w/ 7 levels "4th of July ",..: 5 2 7 5 5 7 3 4 5 6 ...
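The text columns show up as factors in the str() output, presumably because read.csv() converted strings to factors by default in R versions before 4.0.0. The following is a minimal sketch of how to check and undo that conversion. It assumes a small stand-in data frame, `roster_sample` (a hypothetical name), built from the first three rows printed above rather than from the real CSV file.

```r
# First three rows copied from the printed tibble above
roster_sample <- data.frame(
  Student = c("Abigail", "Anthony", "Lauren"),
  Class   = c("Junior", "Sophomore", "Senior"),
  income  = c(1010L, 920L, 1031L),
  stringsAsFactors = TRUE  # mimic the pre-4.0 read.csv() default
)

# Class of every column before conversion
types_before <- sapply(roster_sample, class)
types_before

# Convert only the factor columns back to plain character vectors
is_fct <- sapply(roster_sample, is.factor)
roster_sample[is_fct] <- lapply(roster_sample[is_fct], as.character)

# Class of every column after conversion
types_after <- sapply(roster_sample, class)
types_after
```

Passing stringsAsFactors = FALSE to read.csv() would avoid the conversion in the first place.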
class_roster %>% View("class")
There are 21 students in the Business Statistics class.
class_roster %>%
count(Class, sort = TRUE)
## # A tibble: 5 x 2
## Class n
## <fct> <int>
## 1 First Year Student 8
## 2 Sophomore 6
## 3 Junior 3
## 4 Senior 3
## 5 Fifth Year (Senior +) 1
There are 6 sophomores in the class.
Hint: Use the ggplot2 package.
class_roster %>%
count(Class, sort = TRUE) %>%
ggplot(aes(Class, n)) +
geom_col()
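Note that sort = TRUE orders the rows of the count table, not the bars: ggplot2 draws a discrete axis in factor-level order, which here is alphabetical. If bars ordered by count are wanted, one option is to re-level the factor first. The sketch below assumes a stand-in data frame, `class_counts` (a hypothetical name), typed in from the count(Class, sort = TRUE) output above, and uses base R's reorder().

```r
library(ggplot2)

# Counts copied from the count(Class, sort = TRUE) output above
class_counts <- data.frame(
  Class = factor(c("First Year Student", "Sophomore", "Junior",
                   "Senior", "Fifth Year (Senior +)")),
  n = c(8L, 6L, 3L, 3L, 1L)
)

# Re-level Class by descending n so the tallest bar is drawn first
class_counts$Class <- reorder(class_counts$Class, -class_counts$n)

p_class <- ggplot(class_counts, aes(Class, n)) +
  geom_col()
p_class
```

The same idea works inside the original pipe by writing aes(reorder(Class, -n), n).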
class_roster %>%
count(Major, sort = TRUE)
## # A tibble: 9 x 2
## Major n
## <fct> <int>
## 1 Business Administration 6
## 2 Undeclared 4
## 3 Sports Management 3
## 4 Accounting 2
## 5 Marketing 2
## 6 Finance 1
## 7 Health Education & Promotion 1
## 8 Management 1
## 9 Nursing 1
```
class_roster %>%
count(Major, sort = TRUE) %>%
ggplot(aes(x=Major, y=n)) +
geom_col()
```
class_roster %>%
ggplot(aes(income)) +
geom_histogram()
Students' income (numeric data) is distributed
## Q10. Plot students’ income distribution by class. Hint: Add facet_wrap to the code for Q9.
class_roster %>%
ggplot(aes(income)) +
geom_histogram() + facet_wrap(~ Class)
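With only 21 students, the default of 30 bins in geom_histogram() leaves most bins empty, and splitting by class thins the panels further. Setting binwidth explicitly may give a more readable picture. The sketch below is an illustration only: it assumes a stand-in data frame, `income_sample` (a hypothetical name), holding the ten income values visible in the printed tibble, and the width of 50 is an arbitrary choice.

```r
library(ggplot2)

# The ten income values visible in the printed tibble above
income_sample <- data.frame(
  income = c(1010, 920, 1031, 1064, 1021, 1053, 1001, 1156, 1019, 848)
)

# binwidth = 50 is an illustrative choice, not a recommendation
p_income <- ggplot(income_sample, aes(income)) +
  geom_histogram(binwidth = 50)
p_income
```

On the full roster, adding facet_wrap(~ Class) to this plot would split it by class as in Q10.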