The Cultivation theory suggests that heavy television exposure can gradually shape viewers perception of reality. Over time, frequent exposure to televised content may “cultivate” attitudes and fears just as the world is portrayed on screen. If this theory is correct, people who spend more time watching television should, on average, report higher levels of unhappiness.
Unhappiness scores will increase as the number of hours of daily television viewing increases. Specifically, heavier viewers will report higher levels of unhappiness than lighter viewers.
Participants reported their average number of hours of television watched per day (IV) and rated their level of unhappiness on a numerical scale (DV). Both variables were treated as continuous measures.
A linear regression was performed to test whether television viewing hours significantly predicted unhappiness. Histograms were used to inspect the distributions of both variables, and the scatterplot with a fitted regression line illustrates the relationship. Leverage statistics were also calculated to check for influential outliers.
The regression model showed a positive correlation between television hours and unhappiness. The scatterplot indicated a positive slope, meaning that unhappiness scores tended to rise as viewing hours increased. Model fit statistics confirmed that television hours accounted for a meaningful portion of the variance in unhappiness. Leverage diagnostics revealed no influential outliers.
| Leverage estimates for 10 largest outliers | |
| Row # | Leverage |
|---|---|
| 164 | 0.0303 |
| 360 | 0.0199 |
| 359 | 0.0190 |
| 371 | 0.0178 |
| 72 | 0.0170 |
| 265 | 0.0164 |
| 201 | 0.0152 |
| 392 | 0.0151 |
| 97 | 0.0151 |
| 44 | 0.0149 |
| Regression Analysis Results | ||||
| Coefficient Estimates | ||||
| Term | Estimate | Std. Error | t | p-value |
|---|---|---|---|---|
| (Intercept) | 3.9888 | 1.5531 | 2.5683 | 0.0106 |
| IV | 0.4206 | 0.0307 | 13.6953 | 0.0000 |
| Model Fit Statistics | |||||
| Overall Regression Performance | |||||
| R-squared | Adj. R-squared | F-statistic | df (model) | df (residual) | Residual Std. Error |
|---|---|---|---|---|---|
| 0.3203 | 0.3186 | 187.5609 | 1.0000 | 398.0000 | 4.1619 |
These findings support the hypothesis and align with cultivation theory, suggesting that heavier television viewing is associated with greater reported unhappiness. Increased exposure to television content may cultivate more negative emotional states over time.
##################################################
# 1. Install and load required packages
##################################################
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("gt")) install.packages("gt")
if (!require("gtExtras")) install.packages("gtExtras")
library(tidyverse)
library(gt)
library(gtExtras)
##################################################
# 2. Read in the dataset
##################################################
# Replace "YOURFILENAME.csv" with the actual filename
mydata <- read.csv("RegressionData.csv")
# ################################################
# # (Optional) 2b. Remove specific cases by row number
# ################################################
# # Example: remove rows 10 and 25
# rows_to_remove <- c(10, 25) # Edit and uncomment this line
# mydata <- mydata[-rows_to_remove, ] # Uncomment this line
##################################################
# 3. Define dependent variable (DV) and independent variable (IV)
##################################################
# Replace YOURDVNAME and YOURIVNAME with actual column names
mydata$DV <- mydata$Unhappiness
mydata$IV <- mydata$Hours
##################################################
# 4. Explore distributions of DV and IV
##################################################
# Make a histogram for DV
DVGraph <- ggplot(mydata, aes(x = DV)) +
geom_histogram(color = "black", fill = "#1f78b4")
# Make a histogram for IV
IVGraph <- ggplot(mydata, aes(x = IV)) +
geom_histogram(color = "black", fill = "#1f78b4")
##################################################
# 5. Fit and summarize initial regression model
##################################################
# Suppress scientific notation
options(scipen = 999)
# Fit model
myreg <- lm(DV ~ IV, data = mydata)
# Model summary
summary(myreg)
##################################################
# 6. Visualize regression and check for bivariate outliers
##################################################
# Create scatterplot with regression line as a ggplot object
RegressionPlot <- ggplot(mydata, aes(x = IV, y = DV)) +
geom_point(color = "#1f78b4") +
geom_smooth(method = "lm", se = FALSE, color = "red") +
labs(
title = "Scatterplot of DV vs IV with Regression Line",
x = "Independent Variable (IV)",
y = "Dependent Variable (DV)"
) +
theme_minimal()
##################################################
# 7. Check for potential outliers (high leverage points)
##################################################
# Calculate leverage values
hat_vals <- hatvalues(myreg)
# Rule of thumb: leverage > 2 * (number of predictors + 1) / n may be influential
threshold <- 2 * (length(coef(myreg)) / nrow(mydata))
# Create table showing 10 largest leverage values
outliers <- data.frame(
Obs = 1:nrow(mydata),
Leverage = hatvalues(myreg)
) %>%
arrange(desc(Leverage)) %>%
slice_head(n = 10)
# Format as a gt table
outliers_table <- outliers %>%
gt() %>%
tab_header(
title = "Leverage estimates for 10 largest outliers"
) %>%
cols_label(
Obs = "Row #",
Leverage = "Leverage"
) %>%
fmt_number(
columns = Leverage,
decimals = 4
)
##################################################
# 8. Create nicely formatted regression results tables
##################################################
# --- Coefficient-level results ---
reg_results <- as.data.frame(coef(summary(myreg))) %>%
tibble::rownames_to_column("Term") %>%
rename(
Estimate = Estimate,
`Std. Error` = `Std. Error`,
t = `t value`,
`p-value` = `Pr(>|t|)`
)
reg_table <- reg_results %>%
gt() %>%
tab_header(
title = "Regression Analysis Results",
subtitle = "Coefficient Estimates"
) %>%
fmt_number(
columns = c(Estimate, `Std. Error`, t, `p-value`),
decimals = 4
)
# --- Model fit statistics ---
reg_summary <- summary(myreg)
fit_stats <- tibble::tibble(
`R-squared` = reg_summary$r.squared,
`Adj. R-squared` = reg_summary$adj.r.squared,
`F-statistic` = reg_summary$fstatistic[1],
`df (model)` = reg_summary$fstatistic[2],
`df (residual)` = reg_summary$fstatistic[3],
`Residual Std. Error` = reg_summary$sigma
)
fit_table <- fit_stats %>%
gt() %>%
tab_header(
title = "Model Fit Statistics",
subtitle = "Overall Regression Performance"
) %>%
fmt_number(
columns = everything(),
decimals = 4
)
##################################################
# 9. Final print of key graphics and tables
##################################################
DVGraph
IVGraph
RegressionPlot
outliers_table
reg_table
fit_table