In this analysis, I explore the relationship between mental health and technology usage among adult males and females in the United States. The data set is called “Mental Health & Technology Usage”. It focuses on how prolonged exposure to technology affects mental health. Smartphones have made the internet readily accessible, so a wide range of online content is easy to reach. Viewing excessive amounts of information in a very short time could lead to a variety of mental health problems. Experts report that 68% of adults in the United States have a smartphone (Anderson 2015). Around 65% of individuals living in developed countries have at least two portable devices (Rosen 2012), and about 3.2 billion people globally are using the internet (International Telecommunication Union 2015). Excessive technology usage could negatively affect a person in various areas, including cognition and behavior (Scott, Valley, and Simecka 2017).

# Load the data

df <- read.csv("C:/DATA 712/mental_health_and_technology_usage_2024.csv")
head(df)
##      User_ID Age Gender Technology_Usage_Hours Social_Media_Usage_Hours
## 1 USER-00001  23 Female                   6.57                     6.00
## 2 USER-00002  21   Male                   3.01                     2.57
## 3 USER-00003  51   Male                   3.04                     6.14
## 4 USER-00004  25 Female                   3.84                     4.48
## 5 USER-00005  53   Male                   1.20                     0.56
## 6 USER-00006  58   Male                   5.59                     5.74
##   Gaming_Hours Screen_Time_Hours Mental_Health_Status Stress_Level Sleep_Hours
## 1         0.68             12.36                 Good          Low        8.01
## 2         3.74              7.61                 Poor         High        7.28
## 3         1.26              3.16                 Fair         High        8.04
## 4         2.59             13.08            Excellent       Medium        5.62
## 5         0.29             12.63                 Good          Low        5.55
## 6         0.11              1.34                 Poor          Low        8.61
##   Physical_Activity_Hours Support_Systems_Access Work_Environment_Impact
## 1                    6.71                     No                Negative
## 2                    5.88                    Yes                Positive
## 3                    9.81                     No                Negative
## 4                    5.28                    Yes                Negative
## 5                    4.00                     No                Positive
## 6                    6.54                    Yes                 Neutral
##   Online_Support_Usage
## 1                  Yes
## 2                   No
## 3                   No
## 4                  Yes
## 5                  Yes
## 6                  Yes
# Boxplot of Age vs Gender
# Load necessary libraries
library(ggplot2)

ggplot(df, aes(x = Gender, y = Age)) +
  geom_boxplot() +
  labs(title = "Age Distribution by Gender", x = "Gender", y = "Age")

Age Distribution by Gender:

First, I analyzed the age distribution by each gender using a box plot. From this graph, we can observe that the age distribution across genders is fairly similar, with most individuals clustering around 30–40 years old. The median age for males and females is nearly identical, but females have a slightly wider spread in age range.
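To back the visual reading of the box plot with numbers, the median and spread of age can be computed per gender. The sketch below is a minimal illustration and not part of the original analysis: `toy` is a small made-up data frame standing in for `df`, and only the column names `Age` and `Gender` are taken from the data set.

```r
# Hypothetical stand-in for df; the values are invented for illustration only
toy <- data.frame(
  Gender = c("Female", "Female", "Female", "Male", "Male", "Male"),
  Age    = c(23, 25, 40, 21, 51, 36)
)

# Median and interquartile range of Age within each gender
age_median <- tapply(toy$Age, toy$Gender, median)
age_iqr    <- tapply(toy$Age, toy$Gender, IQR)

print(rbind(median = age_median, IQR = age_iqr))
```

Running the same two `tapply()` calls on `df` would give the figures behind the "nearly identical medians" and "slightly wider spread" statements.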

# Histogram for Technology Usage Hours
ggplot(df, aes(x = Technology_Usage_Hours)) +
  geom_histogram(bins = 20, fill = "skyblue", color = "black") +
  labs(title = "Distribution of Technology Usage Hours", x = "Technology Usage Hours", y = "Frequency")

# Histogram for Social Media Usage Hours
ggplot(df, aes(x = Social_Media_Usage_Hours)) +
  geom_histogram(bins = 20, fill = "green", color = "black") +
  labs(title = "Distribution of Social Media Usage Hours", x = "Social Media Usage Hours", y = "Frequency")

Distribution of Technology and Social Media Usage Hours:

These histograms show the distributions of technology usage hours and social media usage hours. The technology usage data is right-skewed, indicating that most individuals use technology for fewer hours, while a smaller number use it for more hours. The histogram for social media usage hours shows a similar right-skewed pattern. Most users spend between 2 and 6 hours on social media; these hours have the highest frequency in the dataset. There are fewer users at the extremes (0 and 8 hours), suggesting that constant or minimal usage is less common. Within that range the distribution appears relatively steady, meaning social media habits tend to stay within a predictable range.
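Skewness read off a histogram can be checked numerically. The following is a minimal sketch, assuming a simple moment-based skewness estimate is adequate here. `sample_skewness` is a hypothetical helper, not a base R function, and both example vectors are invented.

```r
# Moment-based sample skewness:
# values above 0 suggest a longer right tail, values near 0 suggest symmetry
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

right_tailed <- c(1, 1, 2, 2, 3, 10)  # invented values with a long right tail
symmetric    <- c(1, 2, 3, 4, 5)      # invented symmetric values

print(sample_skewness(right_tailed))
print(sample_skewness(symmetric))
```

Applied to `df$Technology_Usage_Hours` and `df$Social_Media_Usage_Hours`, a clearly positive value would support the right-skew reading, while a value near zero would point to a roughly symmetric distribution.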

# Bar plot for Mental Health Status
ggplot(df, aes(x = Mental_Health_Status)) +
  geom_bar(fill = "lightcoral") +
  labs(title = "Mental Health Status Distribution", x = "Mental Health Status", y = "Count")

Mental Health Status Distribution:

This bar plot shows the distribution of mental health status in the dataset. The majority of individuals fall into the “Good” category, while a smaller number are categorized as “Poor.” This suggests that most individuals in the dataset report having a good mental health status, with fewer individuals reporting poor mental health.
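The counts behind the bar plot can also be tabulated directly, and the categories can be ordered from best to worst rather than alphabetically. This is a hedged sketch on an invented `toy` data frame; the ordering "Excellent", "Good", "Fair", "Poor" is an assumption about how the categories should be ranked.

```r
# Hypothetical stand-in for df; the values are invented for illustration only
toy <- data.frame(
  Mental_Health_Status = c("Good", "Poor", "Fair", "Excellent", "Good", "Poor")
)

# Order the categories from best to worst instead of alphabetically
toy$Mental_Health_Status <- factor(
  toy$Mental_Health_Status,
  levels = c("Excellent", "Good", "Fair", "Poor")
)

status_counts <- table(toy$Mental_Health_Status)         # absolute counts
status_share  <- round(prop.table(status_counts), 3)     # share of all rows

print(status_counts)
print(status_share)
```

Because "Excellent" remains the first level, applying the same reordering to `df` would change the bar order in the plot without changing the reference category used by `lm()`.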

# Boxplot for Stress Level vs Age and Gender
ggplot(df, aes(x = Stress_Level, y = Age, fill = Gender)) +
  geom_boxplot() +
  labs(title = "Stress Level Distribution by Age and Gender", x = "Stress Level", y = "Age") +
  scale_fill_manual(values = c("blue", "pink", "gray"))  # Add a third color if there's a third category

Age and Gender Distribution:

This boxplot shows the age distribution within each stress level, split by gender. From this, we can see that:

Balanced gender representation across most age groups.

Peak age range: 30–40 years old, with fewer individuals in younger and older brackets.

Females slightly older on average than males.

Mental Health & Social Media Usage:

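The regression models fitted below show no significant overall link between social media usage and mental health for either gender (overall F-test p = 0.453 for males and p = 0.140 for females).

Males showed a slight negative trend for “Poor” mental health, but it was not statistically significant.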

Other factors may be more influential in predicting mental health status.

Stress Levels Across Genders:

Stress levels are evenly distributed between males and females.

No clear gender-based differences in high, medium, or low stress categories.

This suggests shared challenges rather than gender-specific stressors.
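The claim that stress levels do not differ by gender can be checked with a chi-squared test of independence on the gender by stress-level table. The sketch below is an illustration under stated assumptions, not a result from the original analysis: `toy` is an invented, perfectly balanced data frame, so the test trivially finds no association.

```r
# Hypothetical stand-in for df; 10 invented rows per gender and stress level
toy <- data.frame(
  Gender       = rep(c("Female", "Male"), each = 30),
  Stress_Level = rep(rep(c("Low", "Medium", "High"), each = 10), times = 2)
)

# Cross-tabulate gender against stress level
stress_by_gender <- table(toy$Gender, toy$Stress_Level)

# Chi-squared test of independence between the two categorical variables
test_result <- chisq.test(stress_by_gender)

print(stress_by_gender)
print(test_result$p.value)
```

On `df`, a large p-value from the same call would be consistent with the even distribution described above; a small one would indicate a gender difference that the box plot does not show.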

# Fit the model for males
male_data <- df[df$Gender == "Male", ]
model_male <- lm(Social_Media_Usage_Hours ~ Mental_Health_Status, data = male_data)

# Fit the model for females
female_data <- df[df$Gender == "Female", ]
model_female <- lm(Social_Media_Usage_Hours ~ Mental_Health_Status, data = female_data)

# Summary of models
summary(model_male)
## 
## Call:
## lm(formula = Social_Media_Usage_Hours ~ Mental_Health_Status, 
##     data = male_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.0723 -2.0114 -0.0282  2.0486  4.0948 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               4.02419    0.08010  50.240   <2e-16 ***
## Mental_Health_StatusFair -0.03280    0.11351  -0.289    0.773    
## Mental_Health_StatusGood  0.04807    0.11386   0.422    0.673    
## Mental_Health_StatusPoor -0.12904    0.11281  -1.144    0.253    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.321 on 3346 degrees of freedom
## Multiple R-squared:  0.0007843,  Adjusted R-squared:  -0.0001116 
## F-statistic: 0.8754 on 3 and 3346 DF,  p-value: 0.453
summary(model_female)
## 
## Call:
## lm(formula = Social_Media_Usage_Hours ~ Mental_Health_Status, 
##     data = female_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.1102 -1.9702  0.0098  1.9635  4.1047 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               3.84526    0.07923  48.534   <2e-16 ***
## Mental_Health_StatusFair  0.26498    0.11443   2.316   0.0206 *  
## Mental_Health_StatusGood  0.09487    0.11201   0.847   0.3971    
## Mental_Health_StatusPoor  0.10870    0.11314   0.961   0.3368    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.304 on 3282 degrees of freedom
## Multiple R-squared:  0.001666,   Adjusted R-squared:  0.0007535 
## F-statistic: 1.826 on 3 and 3282 DF,  p-value: 0.1402
# Visualize regression results for males
# Mental_Health_Status is categorical, so show jittered observations plus
# the group means that the lm() coefficients compare (large blue points)
ggplot(male_data, aes(x = Mental_Health_Status, y = Social_Media_Usage_Hours)) +
  geom_jitter(aes(color = Mental_Health_Status), width = 0.2, alpha = 0.3) +
  stat_summary(fun = mean, geom = "point", color = "blue", size = 4) +
  labs(title = "Regression of Social Media Usage on Mental Health Status (Males)",
       x = "Mental Health Status", y = "Social Media Usage Hours")

# Visualize regression results for females
# Mental_Health_Status is categorical, so show jittered observations plus
# the group means that the lm() coefficients compare (large pink points)
ggplot(female_data, aes(x = Mental_Health_Status, y = Social_Media_Usage_Hours)) +
  geom_jitter(aes(color = Mental_Health_Status), width = 0.2, alpha = 0.3) +
  stat_summary(fun = mean, geom = "point", color = "pink", size = 4) +
  labs(title = "Regression of Social Media Usage on Mental Health Status (Females)",
       x = "Mental Health Status", y = "Social Media Usage Hours")

Regression Analysis:

These linear regression models examine the relationship between Mental Health Status and Social Media Usage Hours, separately for males and females.
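With a categorical predictor, `lm()` uses the first factor level as the reference category: the intercept is that group's mean, and every other coefficient is a difference from it. The sketch below illustrates this on an invented `toy` data frame; only the column names come from the data set. Since the model output lists "Fair", "Good", and "Poor", the reference category in the models above is "Excellent".

```r
# Hypothetical stand-in for male_data / female_data; the values are invented
toy <- data.frame(
  Mental_Health_Status     = rep(c("Excellent", "Fair", "Good", "Poor"), each = 3),
  Social_Media_Usage_Hours = c(4, 5, 3,  2, 3, 4,  5, 6, 7,  1, 2, 3)
)

toy_model   <- lm(Social_Media_Usage_Hours ~ Mental_Health_Status, data = toy)
group_means <- tapply(toy$Social_Media_Usage_Hours,
                      toy$Mental_Health_Status, mean)

# Intercept = mean of the "Excellent" group;
# each remaining coefficient = that group's mean minus the "Excellent" mean
print(coef(toy_model))
print(group_means)
```

In this toy example the intercept equals the "Excellent" mean, and the "Fair" coefficient equals the "Fair" mean minus the "Excellent" mean.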

For Males (male_data)

Intercept: 4.02 (p < 0.001) → Males in the reference category (“Excellent” mental health) use social media for ~4 hours per day on average.

Mental Health Impact (each coefficient is the difference from the “Excellent” group):

“Fair” (-0.03, p = 0.773) → Not statistically significant.

“Good” (+0.048, p = 0.673) → Not statistically significant.

“Poor” (-0.129, p = 0.253) → Also not statistically significant.

Conclusion: Mental Health Status is not significantly associated with social media usage for males (overall F-test p-value = 0.453).

For Females (female_data)

Intercept: 3.85 (p < 0.001) → Females in the reference category (“Excellent” mental health) use social media for ~3.85 hours per day on average, slightly less than the corresponding male group.

Mental Health Impact (each coefficient is the difference from the “Excellent” group):

“Fair” (+0.26, p = 0.021) → Statistically significant increase in social media usage.

“Good” (+0.095, p = 0.397) → Not statistically significant.

“Poor” (+0.109, p = 0.337) → Not statistically significant.

Conclusion:

Females with “Fair” mental health status tend to use social media slightly more (about 0.26 hours) than females with “Excellent” mental health status (p = 0.021).

However, the overall model is weak: the R-squared value is very low (0.00167) and the overall F-test is not significant (p = 0.140).
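Because each model tests three category coefficients, a single p-value near 0.02 is worth checking against a multiple-comparison adjustment. The sketch below applies a Bonferroni correction to the three p-values reported by `summary(model_female)`. The adjustment is an addition for illustration and was not part of the original analysis.

```r
# p-values for the "Fair", "Good", and "Poor" coefficients,
# copied from the summary(model_female) output
p_raw <- c(Fair = 0.0206, Good = 0.3971, Poor = 0.3368)

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p_bonferroni <- p.adjust(p_raw, method = "bonferroni")

print(p_bonferroni)
```

After adjustment, the “Fair” p-value becomes 0.0618, which is above the 0.05 threshold, so the evidence for this difference is weaker than the unadjusted p-value suggests.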

Final Takeaways:

For males, mental health status has no meaningful association with social media usage. For females, those with “Fair” mental health use social media slightly more than those with “Excellent” mental health, but the effect is small. Social media usage is likely influenced by other factors beyond mental health alone.

Scott, David A., Bart Valley, and Brooke A. Simecka. 2017. “Mental Health Concerns in the Digital Age.” International Journal of Mental Health and Addiction 15 (3): 604–13. https://doi.org/10.1007/s11469-016-9684-0.