Please replace “Your Name” in the header above. Insert a code chunk after each sub-question and write your R code inside the chunk to answer the question. When you are finished, click the Knit button in RStudio to generate a report in PDF or Word. Submit BOTH the knitted document and the R Markdown file.

Due date/time: Sept. 16 @ 10 am (Tuesday lab group), Sept. 19 @ 4 pm (Friday lab group)


Suppose that we survey a classroom of students about the last concert they attended, along with other information, and collect the responses in a CSV file called “concert.csv”. Answer the following questions based on this dataset. (If it isn’t obvious how the R output answers a question, write a short explanation in the markdown part of your report.)

Q1. Importing data and inspecting data frames.

1.1. Import data from the concert.csv file to a data frame called concertdata. (1pt)

concertdata <- read.csv("concert.csv")

1.2. How many students are in this dataset? How many variables are collected? (1pt)

str(concertdata)
## 'data.frame':    20 obs. of  5 variables:
##  $ studentID : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ genre     : chr  "Rock" "Jazz" "EDM" "EDM" ...
##  $ rating    : int  4 5 1 4 5 6 1 4 2 8 ...
##  $ major     : chr  "Cognitive Science" "Psychology" "Psychology" "Cognitive Science" ...
##  $ schoolyear: chr  "Senior" "Sophomore" "First-year" "Senior" ...
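
If you would rather get the two counts as numbers than read them off the first line of the `str()` output, `nrow()`, `ncol()`, and `dim()` return them directly. The sketch below uses a small made-up data frame (`toy` is hypothetical, not the real concert.csv) so that it runs on its own:

```r
# Hypothetical stand-in for concertdata; these values are invented for illustration.
toy <- data.frame(
  studentID = 1:3,
  genre     = c("Rock", "Jazz", "EDM"),
  rating    = c(4L, 5L, 1L)
)

n_students  <- nrow(toy)  # one row per student
n_variables <- ncol(toy)  # one column per variable
dim(toy)                  # both at once: rows first, then columns
```

Run on `concertdata`, the same calls should agree with the `20 obs. of  5 variables` line in the output above.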

1.3. What are the names of the variables in this dataset? (1pt)

names(concertdata)
## [1] "studentID"  "genre"      "rating"     "major"      "schoolyear"

Q2. Reporting frequencies

2.1. Use $ to extract the genre variable from the dataset and make a one-way frequency table of this variable. (2pt)

table(concertdata$genre)
## 
##     EDM Hip-Hop    Jazz    Rock 
##       4       4       7       5

2.2. Based on this table, how many people reported attending a jazz concert? (1pt)

7 people reported attending a jazz concert.
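
A single count can also be pulled out of a frequency table by category name instead of being read off the printout. This is a minimal sketch that rebuilds a genre vector from the counts printed in 2.1; the names `genre`, `genre_counts`, and `jazz_count` are made up for the example:

```r
# Rebuild a genre vector from the counts printed in 2.1
# (4 EDM, 4 Hip-Hop, 7 Jazz, 5 Rock); the real data come from concert.csv.
genre <- rep(c("EDM", "Hip-Hop", "Jazz", "Rock"), times = c(4, 4, 7, 5))

genre_counts <- table(genre)
jazz_count   <- genre_counts[["Jazz"]]  # index the table by category name
jazz_count
```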

2.3. Use indexing to extract the major and genre variables, and make a two-way frequency table for these two variables. (2pt)

(table2way <- table(concertdata[,c("major","genre")]))
##                    genre
## major               EDM Hip-Hop Jazz Rock
##   Cognitive Science   1       0    1    1
##   Music               1       0    0    1
##   Neuroscience        0       2    0    1
##   Psychology          2       2    6    2

2.4. Add marginal frequencies to this two-way frequency table. (1pt)

addmargins(table2way)
##                    genre
## major               EDM Hip-Hop Jazz Rock Sum
##   Cognitive Science   1       0    1    1   3
##   Music               1       0    0    1   2
##   Neuroscience        0       2    0    1   3
##   Psychology          2       2    6    2  12
##   Sum                 4       4    7    5  20

2.5. Based on this table, how many psychology majors are in this sample? How many of the psychology majors reported attending a jazz concert? (2pt)

There are 12 psychology majors in this sample. Out of the 12, 6 reported attending a jazz concert.
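
The same two numbers can be checked by indexing the two-way table by row and column name. The sketch below rebuilds the table from the counts printed in 2.3 so that it runs without the CSV file; `counts`, `tab`, `psych_total`, and `psych_jazz` are names made up for the example:

```r
# Rebuild the major-by-genre table from the counts printed in 2.3
counts <- matrix(
  c(1, 0, 1, 1,
    1, 0, 0, 1,
    0, 2, 0, 1,
    2, 2, 6, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    major = c("Cognitive Science", "Music", "Neuroscience", "Psychology"),
    genre = c("EDM", "Hip-Hop", "Jazz", "Rock")
  )
)
tab <- as.table(counts)

psych_total <- sum(tab["Psychology", ])   # row total: all psychology majors
psych_jazz  <- tab["Psychology", "Jazz"]  # one cell: psychology majors who chose jazz
```

With the real data, `table2way["Psychology", "Jazz"]` should work the same way.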

Q3. Data visualization

3.1. Make a histogram that showcases the frequencies of the variable rating. (1pt)

hist(concertdata$rating,main="Histogram of Rating", xlab="Rating")

3.2. Make a pie chart that showcases the relative frequencies of the variable genre. (1pt)

pie(table(concertdata$genre),main="Pie Chart of Genre")
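
The pie chart shows relative frequencies only as slice sizes. If you also want the proportions as numbers, `prop.table()` converts a table of counts into proportions. A minimal sketch, again rebuilding the genre counts from 2.1 rather than reading the CSV (`genre` and `genre_props` are made-up names):

```r
# Rebuild a genre vector from the counts printed in 2.1
genre <- rep(c("EDM", "Hip-Hop", "Jazz", "Rock"), times = c(4, 4, 7, 5))

genre_props <- prop.table(table(genre))  # each count divided by the total of 20
round(genre_props, 2)
```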

3.3. Make a bar plot that describes the frequencies of the variable schoolyear. (1pt)

barplot(table(concertdata$schoolyear),main="Bar Plot of Schoolyear")

3.4. Make schoolyear an ordered factor and remake the bar plot so that the x-axis goes from first-year to senior. (2pt)

concertdata$schoolyear_f <- factor(concertdata$schoolyear,
                                   ordered = TRUE,
                                   levels = c("First-year", "Sophomore",
                                              "Junior", "Senior"))
barplot(table(concertdata$schoolyear_f),main="Bar Plot of Schoolyear")
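
To see why the ordered factor changes the bar order, compare the category order that `table()` produces before and after the conversion. The sketch uses a short invented vector (`yr` is hypothetical, not the real schoolyear column):

```r
# Hypothetical school-year responses, invented for illustration
yr <- c("Senior", "Sophomore", "First-year", "Senior", "Junior")

# A character vector is tabulated in alphabetical order
order_before <- names(table(yr))

# An ordered factor is tabulated in the order given by levels
yr_f <- factor(yr,
               ordered = TRUE,
               levels = c("First-year", "Sophomore", "Junior", "Senior"))
order_after <- names(table(yr_f))

order_before
order_after
```

Because `barplot()` draws the bars in the order of the table it receives, the second ordering is the one that runs from first-year to senior.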

Q4. Using R packages

4.1. Load the package RColorBrewer. (You will need to install it outside of this script if you haven’t already.)

library(RColorBrewer)

4.2. Pick a palette you like from the options it offers, and replace “Dark2” with the name of the palette in the two lines below.

display.brewer.pal(n=4,name="Dark2")

mycolors <- brewer.pal(n=4,name="Dark2")

4.3. Remake the pie chart of the variable genre, with an additional col argument in the function with mycolors being the input for that argument. You should observe the same pie chart but with different colors in the slices.

pie(table(concertdata$genre),main="Pie Chart of Genre",col=mycolors)