Cultivation Analysis Resubmission

Rationale

In cultivation theory, prolonged exposure to media can shape a person’s perceptions of the real world. The theory does have subcategories. There’s mainstreaming, which refers to the heavy TV viewing’s homogenizing effect on people from different backgrounds. There’s also resonance. Resonance occurs when viewers’ real lives intersect with the scripted world of television.

Based on the theory, the number of weekly hours viewers spend watching television may determine the percentage of people they believe work in the following professions: law enforcement/criminal justice, medicine, and emergency response services. Therefore, the number of hours these television consumers watch could sway them into believing that many Americans work in those fields.

Hypothesis

Participants who spend more hours watching television in the six months will believe a greater number of Americans are working in law enforcement/criminal justice, medicine, and emergency response services, compared to those who have watched fewer hours of television.

Variables & method

400 volunteer study participants were recruited from a random sample of all U.S. adults and agreed to connect a monitoring device to their household television or other electronic devices, where they consume television content for 6 months. The study’s researchers recorded the precise number of hours per week each study participant spent watching television content. Time spent watching by other household members was not recorded in the total unless the study participant was also watching.

The dependent variable in the analysis was a continuous measure of participants’ estimates of the percentage of the U.S. population employed full-time in either law enforcement/criminal justice, medicine, or emergency response services. Participants were asked to report the percentages for each worker category, and these percentages were then summed. The independent variable was also continuous, indicating the average weekly hours each research participant spent watching television for the six months. A regression analysis test was conducted to test whether viewers’ viewing habits determined their perception of Americans’ occupations.

Results & discussion

The graph and scatterplot below summarize the association between the dependent and independent variables. The regression analysis results are shown as well.

The results supported the hypothesis. There was a strong positive relationship between the number of hours spent watching television and their estimates of the percentage of Americans employed in the fields mentioned above. The model explains 31% of the variance in participants’ estimates (R-squared=31%), suggesting that higher television viewing is associated with a person’s perception of employment rates in these fields.

Row #	Leverage
Leverage estimates for 10 largest outliers
359	0.0275
346	0.0236
108	0.0212
198	0.0168
236	0.0168
191	0.0158
388	0.0158
39	0.0130
333	0.0130
392	0.0130

Term	Estimate	Std. Error	t	p-value
Regression Analysis Results
Coefficient Estimates
(Intercept)	14.9525	1.4656	10.2026	0.0000
IV	0.3686	0.0275	13.4056	0.0000

R-squared	Adj. R-squared	F-statistic	df (model)	df (residual)	Residual Std. Error
Model Fit Statistics
Overall Regression Performance
0.3111	0.3093	179.7107	1.0000	398.0000	6.4347

Code:

##################################################
# 1. Install and load required packages
##################################################
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("gt")) install.packages("gt")
if (!require("gtExtras")) install.packages("gtExtras")

library(tidyverse)
library(gt)
library(gtExtras)

##################################################
# 2. Create dataset directly in R
##################################################
mydata <- read.csv(text = '
video,pct
30,47
32,39
46,56
35,52
35,60
48,47
38,52
24,51
29,42
30,51
44,65
37,57
37,61
35,52
30,44
48,38
38,54
18,43
40,62
30,43
25,63
32,53
26,46
28,60
29,41
21,36
41,75
35,53
25,61
44,46
37,53
32,54
41,61
41,48
41,58
40,46
38,63
34,62
32,28
31,62
28,35
32,55
24,50
51,65
44,54
25,46
31,54
30,58
40,37
33,35
36,68
34,62
34,56
45,68
32,60
46,36
22,53
39,51
35,55
36,51
37,63
30,45
31,55
26,41
25,34
36,66
38,62
34,70
41,59
50,75
30,69
16,34
42,46
28,45
28,45
42,60
32,67
24,40
35,57
33,49
34,52
37,58
31,63
39,58
32,58
37,62
43,69
37,49
31,66
43,56
42,58
38,70
36,67
29,37
45,52
29,34
51,69
46,64
32,54
26,51
28,41
36,44
32,61
31,57
26,30
34,51
28,38
21,20
31,51
41,57
29,47
39,58
21,50
34,54
38,49
36,47
35,51
29,51
27,36
26,43
35,63
26,44
30,42
32,47
49,75
29,53
36,66
35,54
26,50
33,46
46,68
38,50
34,38
31,50
18,58
43,68
22,54
40,61
49,59
22,40
40,54
32,46
21,48
22,30
21,50
30,57
22,30
40,63
51,75
24,38
40,66
40,49
37,65
26,48
33,68
32,35
39,55
31,44
42,57
31,43
42,51
26,51
24,33
60,75
31,37
36,55
39,62
30,55
38,52
37,55
32,60
35,38
34,44
51,70
28,43
25,58
34,59
36,55
37,40
30,49
25,42
44,60
31,38
27,51
32,40
32,48
43,63
35,45
40,63
30,35
36,25
31,54
35,61
27,43
24,48
50,54
39,55
24,24
29,60
25,63
52,75
44,61
32,50
38,40
31,57
30,47
28,40
29,34
47,60
34,43
35,49
36,41
44,75
30,48
26,56
47,37
30,44
28,40
24,31
24,59
29,34
39,59
43,68
40,59
31,41
34,62
28,49
28,36
41,44
26,66
50,58
33,33
36,59
28,50
29,35
23,24
33,50
37,66
37,61
28,42
28,38
30,51
46,64
25,51
33,47
49,74
33,43
23,39
29,55
38,65
31,30
30,49
31,56
35,38
47,68
33,43
43,70
39,62
33,59
22,43
30,57
30,63
34,72
44,61
52,45
46,63
33,53
20,39
31,55
35,63
41,53
42,56
39,61
23,37
41,59
30,29
35,42
35,39
37,46
34,45
21,45
40,67
37,47
32,40
35,42
35,49
36,51
47,68
32,47
35,32
43,59
42,71
43,72
29,40
50,50
35,75
49,63
23,42
34,56
44,44
28,40
28,31
26,39
26,46
31,35
37,60
18,41
36,44
44,63
50,73
44,71
40,55
20,39
29,47
31,64
40,68
33,60
24,41
47,67
41,62
36,43
44,43
23,49
39,41
30,48
39,59
34,57
39,58
45,54
34,62
42,75
24,53
28,28
46,75
37,54
18,43
23,49
32,50
41,57
33,62
39,63
42,68
47,75
34,59
34,54
20,18
35,75
29,43
26,69
33,55
42,74
18,37
31,54
35,55
27,44
37,53
37,65
34,50
14,15
55,67
32,56
39,63
36,60
42,42
41,61
32,60
37,67
26,43
41,55
30,64
53,52
21,36
30,53
41,41
38,58
29,67
26,45
35,50
34,57
20,37
34,60
36,62
35,53
26,33
38,42
45,55
38,63
25,25
31,30
37,48
29,52
17,28
41,67
27,50
29,46
46,70
28,51
41,53
24,46
31,56
')

##################################################
# 3. Define dependent variable (DV) and independent variable (IV)
##################################################
mydata$DV <- mydata$video
mydata$IV <- mydata$pct

##################################################
# 4. Explore distributions of DV and IV
##################################################
DVGraph <- ggplot(mydata, aes(x = DV)) + 
  geom_histogram(color = "black", fill = "#1f78b4")

IVGraph <- ggplot(mydata, aes(x = IV)) + 
  geom_histogram(color = "black", fill = "#1f78b4")

##################################################
# 5. Fit and summarize initial regression model
##################################################
options(scipen = 999)
myreg <- lm(DV ~ IV, data = mydata)
summary(myreg)

##################################################
# 6. Visualize regression and check for bivariate outliers
##################################################
RegressionPlot <- ggplot(mydata, aes(x = IV, y = DV)) +
  geom_point(color = "#1f78b4") +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(
    title = "Scatterplot of DV vs IV with Regression Line",
    x = "Independent Variable (IV)",
    y = "Dependent Variable (DV)"
  ) +
  theme_minimal()

##################################################
# 7. Check for potential outliers (high leverage points)
##################################################
hat_vals <- hatvalues(myreg)
threshold <- 2 * (length(coef(myreg)) / nrow(mydata))

outliers <- data.frame(
  Obs = 1:nrow(mydata),
  Leverage = hatvalues(myreg)
) %>%
  arrange(desc(Leverage)) %>%
  slice_head(n = 10)

outliers_table <- outliers %>%
  gt() %>%
  tab_header(title = "Leverage estimates for 10 largest outliers") %>%
  cols_label(Obs = "Row #", Leverage = "Leverage") %>%
  fmt_number(columns = Leverage, decimals = 4)

##################################################
# 8. Create nicely formatted regression results tables
##################################################
reg_results <- as.data.frame(coef(summary(myreg))) %>%
  tibble::rownames_to_column("Term") %>%
  rename(
    Estimate = Estimate,
    `Std. Error` = `Std. Error`,
    t = `t value`,
    `p-value` = `Pr(>|t|)`
  )

reg_table <- reg_results %>%
  gt() %>%
  tab_header(title = "Regression Analysis Results", subtitle = "Coefficient Estimates") %>%
  fmt_number(columns = c(Estimate, `Std. Error`, t, `p-value`), decimals = 4)

reg_summary <- summary(myreg)
fit_stats <- tibble::tibble(
  `R-squared` = reg_summary$r.squared,
  `Adj. R-squared` = reg_summary$adj.r.squared,
  `F-statistic` = reg_summary$fstatistic[1],
  `df (model)` = reg_summary$fstatistic[2],
  `df (residual)` = reg_summary$fstatistic[3],
  `Residual Std. Error` = reg_summary$sigma
)

fit_table <- fit_stats %>%
  gt() %>%
  tab_header(title = "Model Fit Statistics", subtitle = "Overall Regression Performance") %>%
  fmt_number(columns = everything(), decimals = 4)

##################################################
# 9. Final print of key graphics and tables
##################################################
DVGraph
IVGraph
RegressionPlot
outliers_table
reg_table
fit_table

Cultivation Analysis Resubmission

Aaron Cantrell

Rationale

Hypothesis

Variables & method

Results & discussion