Question 1

#Load dplyr, which provides %>%, group_by, summarise and arrange
library(dplyr)

#Load csv
SSMA <- read.csv('sets_status2.csv')

#Grouped the original dataset by academic level, counted the students who reported high levels of phone addiction, and sorted the counts in ascending order.
addicted <- SSMA %>%
  group_by(Academic_Level) %>%
  summarise(high = sum(Addicted_Score == "High Levels")) %>%
  arrange(high)

#Same count for students who reported medium levels of phone addiction
moderate <- SSMA %>%
  group_by(Academic_Level) %>%
  summarise(med = sum(Addicted_Score == "Medium Levels")) %>%
  arrange(med)

#Same count for students who reported low levels of phone addiction
low <- SSMA %>%
  group_by(Academic_Level) %>%
  summarise(low = sum(Addicted_Score == "Low Levels")) %>%
  arrange(low)
  

#Created a barplot showing the number of students reporting high levels of phone addiction by academic level
barplot(addicted$high, 
        names.arg = addicted$Academic_Level,
        main = "Number of Social Media Addicts by Academic Level",
        xlab = "Academic Level",
        ylab = "Count"
)

The graph shows the number of students who self-reported a phone addiction level of 7 or above (on a scale of 1-10), grouped by academic level.
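
The three per-level counts above could also be produced in a single cross-tabulation. This is only a minimal sketch: it assumes the same column names (Academic_Level, Addicted_Score) and uses a small invented data frame, here called toy, in place of the real CSV.

```r
# Hypothetical stand-in for SSMA; the values are invented for illustration only
toy <- data.frame(
  Academic_Level = c("Undergraduate", "Graduate", "Undergraduate",
                     "High School", "Graduate", "Undergraduate"),
  Addicted_Score = c("High Levels", "Low Levels", "High Levels",
                     "Medium Levels", "High Levels", "Medium Levels")
)

# Cross-tabulate academic level against addiction level in one call
level_counts <- table(toy$Academic_Level, toy$Addicted_Score)
print(level_counts)

# Pull out the "High Levels" column and sort ascending, as in the barplot above
high_sorted <- sort(level_counts[, "High Levels"])
print(high_sorted)
```

The sorted vector keeps the academic levels as names, so it can be passed straight to barplot() without a separate names.arg.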

Question 2

My hypothesis is that increased time spent on social media leads to a decline in mental health, sleep and relationships.

#Used the same function to group the dataset by the level of addiction and then found the average mental health score for those groups
avgMH <- SSMA %>%
  group_by(Addicted_Score) %>%
  summarise(MHscore = mean(Mental_Health_Score)) %>%
  arrange(MHscore)

#Created a barplot to show the average mental health score by self reported addiction levels
barplot(avgMH$MHscore,
        names.arg = avgMH$Addicted_Score,
        main = "Average Mental Health Score by Phone Addiction Level",
        xlab = "Level of Addiction",
        ylab = "Mental Health Score"
        )

The graph shows the average mental health score for students who reported different levels of phone addiction. High Levels indicates an addiction score of 7-10, Medium Levels a score of 4-6, and Low Levels a score of 1-2. This suggests that a higher level of addiction is associated with a lower mental health score.
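
The grouping of numeric scores into levels described above can be sketched as follows. The scores vector and the to_level helper are hypothetical and not part of the original analysis. A score of 3 is not covered by the ranges given in the text, so this sketch leaves it as NA rather than guessing a level.

```r
# Hypothetical numeric addiction scores on the 1-10 scale (invented values)
scores <- c(1, 2, 4, 5, 6, 7, 8, 10, 3)

# Map each score to the level described in the text; anything outside
# the stated ranges (such as 3) is returned as NA
to_level <- function(x) {
  ifelse(x >= 7 & x <= 10, "High Levels",
  ifelse(x >= 4 & x <= 6,  "Medium Levels",
  ifelse(x >= 1 & x <= 2,  "Low Levels", NA_character_)))
}

levels_out <- to_level(scores)
print(data.frame(score = scores, level = levels_out))
```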

Question 3

#Reads the original dataset containing the numerical scores
SSMAOG <- read.csv('SSMAddiction.csv')

#Test the correlation between average daily use and mental health score
correlation <- cor.test(SSMAOG$Mental_Health_Score, SSMAOG$Avg_Daily_Usage_Hours)
print(correlation)
## 
##  Pearson's product-moment correlation
## 
## data:  SSMAOG$Mental_Health_Score and SSMAOG$Avg_Daily_Usage_Hours
## t = -35.482, df = 703, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.8260373 -0.7729371
## sample estimates:
##        cor 
## -0.8010576

The previous bar graph showed that a high level of SM addiction is associated with lower mental health scores. Here are the results of a correlation test on the relationship between mental health score and average daily SM usage hours. The test shows a strong negative relationship between the two (r = -0.80, p < 2.2e-16).
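
As a quick check on the output above, the t statistic can be recomputed from the reported correlation and degrees of freedom using t = r * sqrt(df) / sqrt(1 - r^2). The sketch below only reuses the numbers printed by cor.test(). The variable names are illustrative.

```r
# Values copied from the cor.test() output above
r  <- -0.8010576   # sample correlation
df <- 703          # degrees of freedom (n - 2)

# t statistic for testing whether the true correlation is zero
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Squared correlation: share of variance in one variable that is
# linearly associated with the other
r_squared <- r^2

print(round(t_stat, 2))
print(round(r_squared, 3))
```

The recomputed value agrees with the reported t = -35.482 up to rounding, and the squared correlation is roughly 0.64.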

Question 4

#Histogram of SM addiction scores

hist(SSMAOG$Addicted_Score,
     main = "Distribution of SM Addiction",
     xlab= "Scores",
     ylab = "Frequency")

Question 5

#Create a subset based on the response to whether SM affects academic performance
SSMA_subset <- subset(SSMAOG, Affects_Academic_Performance %in% c("No", "Yes"))

#Welch two-sample t-test comparing average daily usage between the No and Yes groups
test <- t.test(Avg_Daily_Usage_Hours ~ Affects_Academic_Performance, data = SSMA_subset)

print(test)
## 
##  Welch Two Sample t-test
## 
## data:  Avg_Daily_Usage_Hours by Affects_Academic_Performance
## t = -25.394, df = 646.68, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
## 95 percent confidence interval:
##  -1.868371 -1.600161
## sample estimates:
##  mean in group No mean in group Yes 
##          3.804365          5.538631

This two-sample t-test shows that students who reported that SM did not affect their academic performance had a lower average daily usage (about 3.8 hours) than those who reported that it did (about 5.5 hours). The result is again consistent with the hypothesis that more time spent on SM leads to a decline in other areas of life.
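
The size of the gap between the two groups can be read directly from the output above. The sketch below reuses only the printed group means and confidence interval, with illustrative variable names, and checks that the interval is centred on the estimated difference.

```r
# Values copied from the t.test() output above
mean_no  <- 3.804365
mean_yes <- 5.538631
ci       <- c(-1.868371, -1.600161)

# Estimated difference in mean daily usage hours (No minus Yes)
diff_means <- mean_no - mean_yes

# The Welch interval is symmetric around the estimate,
# so its midpoint should match the difference in means
ci_mid <- mean(ci)

print(diff_means)
print(ci_mid)
```

Both values come out at about -1.73, meaning the Yes group averaged roughly 1.7 more hours of daily usage than the No group.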