Please replace “Your Name” in the header above. Insert code chunks after each sub-question and write your R code inside the chunks to answer the questions. When you are finished, click the Knit button in RStudio to generate a report in PDF or Word. Submit BOTH the knitted document and the R markdown file.
Suppose that we survey a classroom of students about the last concert they attended, along with other information, and collect the responses in a CSV file called “concert.csv”. Answer the following questions based on this dataset. (If it isn’t obvious how the R output answers a question, write a short explanation in the markdown part of your report.)
1.1. Import data from the concert.csv file to a data frame called concertdata. (1pt)
concertdata <- read.csv("concert.csv")
1.2. How many students are in this dataset? How many variables are collected? (1pt)
str(concertdata)
## 'data.frame': 20 obs. of 5 variables:
## $ studentID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ genre : chr "Rock" "Jazz" "EDM" "EDM" ...
## $ rating : int 4 5 1 4 5 6 1 4 2 8 ...
## $ major : chr "Cognitive Science" "Psychology" "Psychology" "Cognitive Science" ...
## $ schoolyear: chr "Senior" "Sophomore" "First-year" "Senior" ...
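In the str() output above, "20 obs. of 5 variables" means the data frame has 20 rows (students) and 5 columns (variables). As a minimal sketch, nrow(), ncol(), and dim() return these counts directly. The data frame `toy` below is a stand-in that holds only the first three rows visible in the str() output, so it reports 3 rows rather than 20; the name `toy` is an assumption made for illustration.

```r
# Stand-in for concertdata: only the first three rows shown in the str() output
toy <- data.frame(
  studentID  = 1:3,
  genre      = c("Rock", "Jazz", "EDM"),
  rating     = c(4L, 5L, 1L),
  major      = c("Cognitive Science", "Psychology", "Psychology"),
  schoolyear = c("Senior", "Sophomore", "First-year")
)

n_students  <- nrow(toy)  # number of rows = number of students
n_variables <- ncol(toy)  # number of columns = number of variables

dim(toy)  # both counts at once: rows first, then columns
```

Running the same three calls on concertdata should agree with the first line of the str() output.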
1.3. What are the names of the variables in this dataset? (1pt)
names(concertdata)
## [1] "studentID" "genre" "rating" "major" "schoolyear"
2.1. Use $ to extract the genre variable from the dataset and make a one-way frequency table of this variable. (2pt)
table(concertdata$genre)
##
## EDM Hip-Hop Jazz Rock
## 4 4 7 5
2.2. Based on this table, how many people reported attending a jazz concert? (1pt)
7 people reported attending a jazz concert.
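The count can also be pulled out of the table by name instead of being read off the printout. The sketch below rebuilds a genre vector from the counts printed in the one-way table (EDM 4, Hip-Hop 4, Jazz 7, Rock 5), since concert.csv itself is not reproduced here; the names `genre_counts` and `jazz_count` are assumptions made for illustration.

```r
# Genre vector rebuilt from the counts printed in the one-way table
genre <- rep(c("EDM", "Hip-Hop", "Jazz", "Rock"), times = c(4, 4, 7, 5))

genre_counts <- table(genre)

# Index the table by level name to extract a single count
jazz_count <- genre_counts[["Jazz"]]
jazz_count
```

With the real data, the equivalent call would be table(concertdata$genre)[["Jazz"]].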
2.3. Use indexing to extract the major and genre variables, and make a two-way frequency table for these two variables. (2pt)
(table2way <- table(concertdata[,c("major","genre")]))
##                    genre
## major               EDM Hip-Hop Jazz Rock
##   Cognitive Science   1       0    1    1
##   Music               1       0    0    1
##   Neuroscience        0       2    0    1
##   Psychology          2       2    6    2
2.4. Add marginal frequencies to this two-way frequency table. (1pt)
addmargins(table2way)
##                    genre
## major               EDM Hip-Hop Jazz Rock Sum
##   Cognitive Science   1       0    1    1   3
##   Music               1       0    0    1   2
##   Neuroscience        0       2    0    1   3
##   Psychology          2       2    6    2  12
##   Sum                 4       4    7    5  20
2.5. Based on this table, how many psychology majors are in this sample? How many of the psychology majors reported attending a jazz concert? (2pt)
There are 12 psychology majors in this sample. Out of the 12, 6 reported attending a jazz concert.
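Both numbers can be checked by indexing the table with row and column names. This is a sketch only: the table is rebuilt by hand from the counts printed above rather than from concert.csv, and the names `counts`, `with_margins`, `psych_total`, and `psych_jazz` are assumptions made for illustration.

```r
# Two-way table rebuilt from the counts printed above (rows: major, columns: genre)
counts <- matrix(
  c(1, 0, 1, 1,
    1, 0, 0, 1,
    0, 2, 0, 1,
    2, 2, 6, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    major = c("Cognitive Science", "Music", "Neuroscience", "Psychology"),
    genre = c("EDM", "Hip-Hop", "Jazz", "Rock")
  )
)
table2way <- as.table(counts)

# addmargins() appends a "Sum" row and a "Sum" column
with_margins <- addmargins(table2way)

psych_total <- with_margins["Psychology", "Sum"]   # all psychology majors
psych_jazz  <- with_margins["Psychology", "Jazz"]  # psychology majors at a jazz concert
```

The "Sum" row and column names come from addmargins() itself, so the same indexing works on the table built from concertdata.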
3.1. Make a histogram that showcases the frequencies of the variable rating. (1pt)
hist(concertdata$rating,main="Histogram of Rating", xlab="Rating")
3.2. Make a pie chart that showcases the relative frequencies of the variable genre. (1pt)
pie(table(concertdata$genre),main="Pie Chart of Genre")
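pie() scales the slices to relative frequencies on its own, but the proportions are not printed on the chart. One possible way to show them, sketched here with a genre vector rebuilt from the counts in the one-way table (not read from concert.csv), is to compute them with prop.table() and pass them as slice labels. The names `genre_props` and `slice_labels` are assumptions made for illustration.

```r
# Genre vector rebuilt from the counts printed in the one-way table
genre <- rep(c("EDM", "Hip-Hop", "Jazz", "Rock"), times = c(4, 4, 7, 5))
genre_counts <- table(genre)

# Convert counts to proportions that sum to 1
genre_props <- prop.table(genre_counts)

# Build labels such as "Jazz (35%)" for each slice
slice_labels <- paste0(names(genre_props), " (", round(100 * genre_props), "%)")

pie(genre_counts, labels = slice_labels, main = "Pie Chart of Genre")
```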
3.3. Make a bar plot that describes the frequencies of the variable schoolyear. (1pt)
barplot(table(concertdata$schoolyear),main="Bar Plot of Schoolyear")
3.4. Make schoolyear an ordered factor and remake the bar plot so that the x-axis goes from first-year to senior. (2pt)
concertdata$schoolyear_f <- factor(concertdata$schoolyear,
                                   ordered = TRUE,
                                   levels = c("First-year", "Sophomore",
                                              "Junior", "Senior"))
barplot(table(concertdata$schoolyear_f),main="Bar Plot of Schoolyear")
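The reordering works because table() and barplot() follow the factor's level order, and factor() sorts levels alphabetically unless levels is given. A minimal sketch of the difference, using a made-up vector with one value per school year (not the concert.csv data; the names `years`, `default_levels`, and `years_f` are assumptions made for illustration):

```r
# Hypothetical school-year values, one per level (not taken from concert.csv)
years <- c("Senior", "Sophomore", "First-year", "Junior")

# Without a levels argument, the levels are sorted alphabetically
default_levels <- levels(factor(years))

# With explicit levels, the stated order is kept
years_f <- factor(years, ordered = TRUE,
                  levels = c("First-year", "Sophomore", "Junior", "Senior"))
ordered_levels <- levels(years_f)

# Ordered factors also support comparisons between levels
sophomore_below_senior <- years_f[2] < years_f[1]
```

Printing default_levels next to ordered_levels shows why the first bar plot in 3.3 does not run from first-year to senior.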
4.1. Load the package RColorBrewer. (You will need to install it outside of this script if you haven’t already.)
library(RColorBrewer)
4.2. Pick a palette you like from the options it offers, and replace “Dark2” with the name of the palette in the two lines below.
display.brewer.pal(n=4,name="Dark2")
mycolors <- brewer.pal(n=4,name="Dark2")
4.3. Remake the pie chart of the variable genre, with an additional col argument in the function with mycolors being the input for that argument. You should observe the same pie chart but with different colors in the slices.
pie(table(concertdata$genre),main="Pie Chart of Genre",col=mycolors)