library(ggplot2)
MxMH_Data <- read.csv("mxmh_survey_results.csv")
head(MxMH_Data, 15)
## Timestamp Age Primary.streaming.service Hours.per.day
## 1 8/27/2022 19:29:02 18 Spotify 3.0
## 2 8/27/2022 19:57:31 63 Pandora 1.5
## 3 8/27/2022 21:28:18 18 Spotify 4.0
## 4 8/27/2022 21:40:40 61 YouTube Music 2.5
## 5 8/27/2022 21:54:47 18 Spotify 4.0
## 6 8/27/2022 21:56:50 18 Spotify 5.0
## 7 8/27/2022 22:00:29 18 YouTube Music 3.0
## 8 8/27/2022 22:18:59 21 Spotify 1.0
## 9 8/27/2022 22:33:05 19 Spotify 6.0
## 10 8/27/2022 22:44:03 18 I do not use a streaming service. 1.0
## 11 8/27/2022 22:51:15 18 Spotify 3.0
## 12 8/27/2022 23:00:32 19 YouTube Music 8.0
## 13 8/27/2022 23:04:00 NA Spotify 3.0
## 14 8/27/2022 23:12:03 19 Spotify 2.0
## 15 8/27/2022 23:16:06 18 Spotify 4.0
## While.working Instrumentalist Composer Fav.genre Exploratory
## 1 Yes Yes Yes Latin Yes
## 2 Yes No No Rock Yes
## 3 No No No Video game music No
## 4 Yes No Yes Jazz Yes
## 5 Yes No No R&B Yes
## 6 Yes Yes Yes Jazz Yes
## 7 Yes Yes No Video game music Yes
## 8 Yes No No K pop Yes
## 9 Yes No No Rock No
## 10 Yes No No R&B Yes
## 11 Yes Yes No Country Yes
## 12 Yes No No EDM Yes
## 13 Yes No No Hip hop Yes
## 14 Yes No No Country Yes
## 15 Yes Yes No Jazz Yes
## Foreign.languages BPM Frequency..Classical. Frequency..Country.
## 1 Yes 156 Rarely Never
## 2 No 119 Sometimes Never
## 3 Yes 132 Never Never
## 4 Yes 84 Sometimes Never
## 5 No 107 Never Never
## 6 Yes 86 Rarely Sometimes
## 7 Yes 66 Sometimes Never
## 8 Yes 95 Never Never
## 9 No 94 Never Very frequently
## 10 Yes 155 Rarely Rarely
## 11 No NA Never Very frequently
## 12 No 125 Rarely Never
## 13 Yes NA Rarely Never
## 14 No 88 Never Very frequently
## 15 Yes 148 Very frequently Rarely
## Frequency..EDM. Frequency..Folk. Frequency..Gospel. Frequency..Hip.hop.
## 1 Rarely Never Never Sometimes
## 2 Never Rarely Sometimes Rarely
## 3 Very frequently Never Never Rarely
## 4 Never Rarely Sometimes Never
## 5 Rarely Never Rarely Very frequently
## 6 Never Never Never Sometimes
## 7 Rarely Sometimes Rarely Rarely
## 8 Rarely Never Never Very frequently
## 9 Never Sometimes Never Never
## 10 Rarely Rarely Sometimes Rarely
## 11 Never Never Never Never
## 12 Very frequently Never Never Sometimes
## 13 Rarely Never Never Very frequently
## 14 Rarely Sometimes Never Sometimes
## 15 Never Never Never Never
## Frequency..Jazz. Frequency..K.pop. Frequency..Latin. Frequency..Lofi.
## 1 Never Very frequently Very frequently Rarely
## 2 Very frequently Rarely Sometimes Rarely
## 3 Rarely Very frequently Never Sometimes
## 4 Very frequently Sometimes Very frequently Sometimes
## 5 Never Very frequently Sometimes Sometimes
## 6 Very frequently Very frequently Rarely Very frequently
## 7 Sometimes Never Rarely Rarely
## 8 Rarely Very frequently Never Sometimes
## 9 Never Never Never Never
## 10 Rarely Never Rarely Rarely
## 11 Never Never Never Never
## 12 Rarely Rarely Rarely Rarely
## 13 Never Sometimes Never Very frequently
## 14 Never Never Never Rarely
## 15 Very frequently Never Sometimes Rarely
## Frequency..Metal. Frequency..Pop. Frequency..R.B. Frequency..Rap.
## 1 Never Very frequently Sometimes Very frequently
## 2 Never Sometimes Sometimes Rarely
## 3 Sometimes Rarely Never Rarely
## 4 Never Sometimes Sometimes Never
## 5 Never Sometimes Very frequently Very frequently
## 6 Rarely Very frequently Very frequently Very frequently
## 7 Rarely Rarely Rarely Never
## 8 Never Sometimes Sometimes Rarely
## 9 Very frequently Never Never Never
## 10 Never Sometimes Sometimes Rarely
## 11 Never Rarely Rarely Never
## 12 Never Rarely Rarely Sometimes
## 13 Never Sometimes Sometimes Rarely
## 14 Never Rarely Never Very frequently
## 15 Sometimes Sometimes Never Never
## Frequency..Rock. Frequency..Video.game.music. Anxiety Depression Insomnia
## 1 Never Sometimes 3 0 1
## 2 Very frequently Rarely 7 2 2
## 3 Rarely Very frequently 7 7 10
## 4 Never Never 9 7 3
## 5 Never Rarely 7 2 5
## 6 Very frequently Never 8 8 7
## 7 Never Sometimes 4 8 6
## 8 Never Rarely 5 3 5
## 9 Very frequently Never 2 0 0
## 10 Sometimes Sometimes 2 2 5
## 11 Rarely Never 7 7 4
## 12 Rarely Rarely 1 0 0
## 13 Rarely Never 9 3 2
## 14 Never Never 2 1 2
## 15 Sometimes Rarely 6 4 7
## OCD Music.effects Permissions
## 1 0 I understand.
## 2 1 I understand.
## 3 2 No effect I understand.
## 4 3 Improve I understand.
## 5 9 Improve I understand.
## 6 7 Improve I understand.
## 7 0 Improve I understand.
## 8 3 Improve I understand.
## 9 0 Improve I understand.
## 10 1 Improve I understand.
## 11 7 No effect I understand.
## 12 1 Improve I understand.
## 13 7 Improve I understand.
## 14 0 Improve I understand.
## 15 0 Improve I understand.
I will use na.omit() to remove every row that contains an NA value and store the cleaned dataset under a new name.
Clean_MxMHData <- na.omit(MxMH_Data)
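Before dropping rows, it can help to check where the missing values are and how many rows are lost. The sketch below uses a small made-up data frame (toy_data and its values are hypothetical, not taken from the survey file) to show a per-column NA count and the number of rows na.omit() removes. It also illustrates that read.csv() by default keeps blank text fields as empty strings rather than NA, so blank Music.effects entries such as those in rows 1 and 2 of the head() output would not be removed by na.omit().

```r
# Toy data frame standing in for the survey (hypothetical values)
toy_data <- data.frame(
  Age = c(18, 63, NA, 21),
  BPM = c(156, NA, 132, 95),
  Music.effects = c("", "Improve", "No effect", "Improve")
)

# Count NA values in each column
colSums(is.na(toy_data))

# na.omit() drops every row that has an NA in any column
toy_clean <- na.omit(toy_data)
rows_removed <- nrow(toy_data) - nrow(toy_clean)
rows_removed

# Empty strings are not NA, so row 1 survives with a blank Music.effects
blank_effects <- sum(toy_clean$Music.effects == "")
blank_effects
```

The same two checks could be run on the real data by replacing toy_data with MxMH_Data.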
Next, I will create a histogram showing how the respondents' ages are distributed.
ggplot(Clean_MxMHData, aes(x = Age)) + geom_histogram(binwidth = 1, fill = "pink", color = "black") + labs(title = "Histogram for Age per person", x = "Age", y = "Frequency")
Next, I will create a scatter plot of Age against Hours.per.day to show the relationship between a respondent's age and how many hours per day they listen to music.
ggplot(Clean_MxMHData, aes(x = Age, y = `Hours.per.day`)) + geom_point(color = "blue") + labs(title = "Scatter Plot for Age and Hours per day listening to music", x = "Age", y = "Hours per day")
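A scatter plot shows the relationship visually; a correlation coefficient puts a number on it. The sketch below uses made-up values (toy_data is hypothetical) and the base R function cor.test() to estimate the Pearson correlation between Age and Hours.per.day and test whether it differs from zero.

```r
# Hypothetical values, used only to make the example runnable
toy_data <- data.frame(
  Age = c(18, 19, 21, 25, 34, 47, 61, 63),
  Hours.per.day = c(4, 6, 3, 5, 2, 3, 2.5, 1.5)
)

# Pearson correlation with a test of H0: correlation = 0
age_hours_cor <- cor.test(toy_data$Age, toy_data$Hours.per.day)
age_hours_cor$estimate  # correlation coefficient r
age_hours_cor$p.value   # p-value for the test
```

On the survey data, the equivalent call would be cor.test(Clean_MxMHData$Age, Clean_MxMHData$Hours.per.day).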
This bar graph counts the respondents by their favorite genre (Fav.genre).
ggplot(Clean_MxMHData, aes(x = `Fav.genre`)) + geom_bar(fill = "pink", color = "black") + labs(title = "Bar graph for Music genres listened to", x = "Fav genre", y = "Count") + theme_minimal() + theme(axis.text.x = element_text(angle = 45, hjust = 1))
My hypothesis is that people aged 25 or younger have a higher mean “Hours.per.day” than people older than 25.
I will split the data into two groups: one where Age <= 25 and one where Age > 25.
Group1 <- Clean_MxMHData$Hours.per.day[Clean_MxMHData$Age <= 25 ]
Group2 <- Clean_MxMHData$Hours.per.day[Clean_MxMHData$Age > 25 ]
I will calculate the mean for each group.
MU_Group1 <- mean(Group1)
MU_Group1
## [1] 3.866441
MU_Group2 <- mean(Group2)
MU_Group2
## [1] 3.337838
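The two means are easier to interpret alongside each group's size and spread. As a minimal sketch with made-up values (toy_data and the AgeGroup column are hypothetical names), tapply() can compute all three per group using the same Age cutoff of 25:

```r
# Hypothetical values, used only to make the example runnable
toy_data <- data.frame(
  Age = c(18, 19, 21, 25, 34, 47, 61, 63),
  Hours.per.day = c(4, 6, 3, 5, 2, 3, 2.5, 1.5)
)

# Label each respondent by age group (same cutoff as above)
toy_data$AgeGroup <- ifelse(toy_data$Age <= 25, "Age <= 25", "Age > 25")

# Group size, mean, and standard deviation of Hours.per.day per group
group_n    <- tapply(toy_data$Hours.per.day, toy_data$AgeGroup, length)
group_mean <- tapply(toy_data$Hours.per.day, toy_data$AgeGroup, mean)
group_sd   <- tapply(toy_data$Hours.per.day, toy_data$AgeGroup, sd)

group_summary <- data.frame(
  group = names(group_n),
  n     = as.vector(group_n),
  mean  = as.vector(group_mean),
  sd    = as.vector(group_sd)
)
group_summary
```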
I will conduct a t-test to compare the mean “Hours.per.day” of the two age groups (Age <= 25 and Age > 25). By default, t.test() runs a two-sided Welch two-sample test, which does not assume equal variances.
T_TestResults <- t.test(Group1,Group2)
T_TestResults
##
## Welch Two Sample t-test
##
## data: Group1 and Group2
## t = 1.9504, df = 340.23, p-value = 0.05195
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.004485226 1.061692433
## sample estimates:
## mean of x mean of y
## 3.866441 3.337838
The mean “Hours.per.day” is higher for people with Age <= 25 (3.87) than for those with Age > 25 (3.34), but the difference is not statistically significant at the 0.05 level under this two-sided test (p-value = 0.05195). The 95 percent confidence interval for the difference (-0.004 to 1.062) only just includes 0, so the result is borderline: the data lean in the direction of the hypothesis but do not give enough evidence to reject the null hypothesis of equal means.
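Because the hypothesis is directional (younger people listen more), a one-sided test is another option, provided the direction was fixed before looking at the data. The sketch below uses made-up vectors (toy_group1 and toy_group2 are hypothetical) and the alternative argument of t.test(). When the t statistic is positive, the one-sided p-value is half the two-sided one, so with t = 1.9504 and a two-sided p-value of 0.05195, the same call on Group1 and Group2 should give a p-value of about 0.026.

```r
# Hypothetical hours-per-day values for the two age groups
toy_group1 <- c(4, 6, 3, 5, 4, 8, 3, 2)      # stands in for Age <= 25
toy_group2 <- c(2, 3, 2.5, 1.5, 4, 3, 1, 5)  # stands in for Age > 25

# Two-sided Welch test (the default)
two_sided <- t.test(toy_group1, toy_group2)

# One-sided Welch test of H1: mean of group 1 > mean of group 2
one_sided <- t.test(toy_group1, toy_group2, alternative = "greater")

two_sided$p.value
one_sided$p.value  # half the two-sided p-value when t > 0
```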