Rationale

In cultivation theory, assumptions or perceived norms about matters such as crime risk, gender, and race depend on the absorption of commercially produced, TV-like stories, as indicated by attention to such content. Cultivation theory looks at the difference between the worldviews of people who watch a lot of TV and those of people who watch less. The earliest difference that cultivation research detected concerned perceptions of violence and risk.

In this experiment, the percentage of the U.S. population that each research participant estimated to be employed in a particular field is expected to depend on the average monthly hours that participant spent watching television over a six-month period. The analysis treats “pct” as the dependent variable and “video” as the independent variable because cultivation theory says heavy TV use warps perceptions of the real world.

Hypothesis

The estimated percentage (“pct”) will correlate positively with the hours of television watched (“video”).

Variables & Method

The continuous variable named “pct” measured the percentages participants gave when asked what share of the U.S. population works in each worker category (law enforcement/criminal justice, medicine or emergency response services); the percentages were summed. The continuous variable named “video” measured the average monthly hours each research participant spent watching television over the six-month period. The analysis treated “pct” as the dependent variable and “video” as the independent variable, and it used bivariate regression to evaluate the hypothesis.

Results

The results suggest that people who watched more hours of television tended to give higher percentage estimates on the survey. The analysis tested the hypothesis that the percentage would correlate positively with the hours. Regression supported the hypothesis (F(1, 524) = 2716, p < .05, R² = .84). There were no outliers in this analysis.

Code and output

# Read the data from the web
FetchedData <- read.csv("https://drkblake.com/wp-content/uploads/2023/09/Cultivation.csv")
# Save the data on your computer
write.csv(FetchedData, "Cultivation.csv", row.names=FALSE)
# Remove the data from the environment
rm(FetchedData)

# Installing required packages
if (!require("dplyr"))
  install.packages("dplyr")

## Loading required package: dplyr

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

if (!require("tidyverse"))
  install.packages("tidyverse")

## Loading required package: tidyverse

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ readr     2.1.4
## ✔ ggplot2   3.4.3     ✔ stringr   1.5.0
## ✔ lubridate 1.9.2     ✔ tibble    3.2.1
## ✔ purrr     1.0.2     ✔ tidyr     1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(dplyr)
library(ggplot2)

# Read the data
mydata <- read.csv("Cultivation.csv") #Edit YOURFILENAME.csv

# Specify the DV and IV
mydata$DV <- mydata$pct #Edit YOURDVNAME
mydata$IV <- mydata$video #Edit YOURIVNAME

# Look at the DV and IV
ggplot(mydata, aes(x = DV)) + geom_histogram(color = "black", fill = "#1f78b4")

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ggplot(mydata, aes(x = IV)) + geom_histogram(color = "black", fill = "#1f78b4")

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

# Creating and summarizing an initial regression model called myreg, and checking for bivariate outliers.
options(scipen = 999)
myreg <- lm(DV ~ IV,
            data = mydata)
summary(myreg)

## 
## Call:
## lm(formula = DV ~ IV, data = mydata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.8960 -2.4820 -0.0835  2.3286  8.9264 
## 
## Coefficients:
##             Estimate Std. Error t value            Pr(>|t|)    
## (Intercept) 0.846473   0.949889   0.891               0.373    
## IV          0.196273   0.003766  52.113 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.453 on 524 degrees of freedom
## Multiple R-squared:  0.8383, Adjusted R-squared:  0.838 
## F-statistic:  2716 on 1 and 524 DF,  p-value: < 0.00000000000000022
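The statistics reported in the Results section can also be pulled out of the model object instead of being read off the printed summary. The sketch below is not part of the original exercise. It uses a small simulated data frame (`toy`, with made-up `IV` and `DV` columns, and a model named `toyreg`) so that it runs on its own; the same calls should apply to `myreg`.

```r
# Simulated stand-in data (hypothetical; not the Cultivation.csv values)
set.seed(1)
toy <- data.frame(IV = runif(50, 100, 400))
toy$DV <- 1 + 0.2 * toy$IV + rnorm(50, sd = 3)

# Fit and summarize a bivariate regression, as in the exercise
toyreg <- lm(DV ~ IV, data = toy)
s <- summary(toyreg)

# F statistic with its numerator and denominator degrees of freedom
f_value <- unname(s$fstatistic["value"])
df_num  <- unname(s$fstatistic["numdf"])
df_den  <- unname(s$fstatistic["dendf"])

# p-value for the overall F test (upper tail of the F distribution)
p_value <- pf(f_value, df_num, df_den, lower.tail = FALSE)

# R-squared and the slope for IV
r_squared <- s$r.squared
slope <- unname(coef(toyreg)["IV"])

# Print the pieces in roughly the order they are reported
cat(sprintf("F(%d, %d) = %.2f, p %s, R-squared = %.2f, slope = %.3f\n",
            as.integer(df_num), as.integer(df_den), f_value,
            format.pval(p_value, eps = 0.001), r_squared, slope))
```

With a single predictor, the F statistic equals the square of the slope's t value, which gives a quick consistency check on a reported result.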

plot(mydata$IV, mydata$DV)
abline(lm(mydata$DV ~ mydata$IV))
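The scatterplot above supports a visual check for bivariate outliers. As a complement, the following sketch flags cases numerically using standardized residuals. This is a swapped-in technique, not the method used in the exercise: the cutoff of 3 is an assumed rule of thumb, and the data frame `toy` is simulated, with one point made extreme on purpose, so the example runs on its own.

```r
# Simulated stand-in data (hypothetical), with row 10 pushed far off the line
set.seed(2)
toy <- data.frame(IV = runif(60, 100, 400))
toy$DV <- 1 + 0.2 * toy$IV + rnorm(60, sd = 3)
toy$DV[10] <- toy$DV[10] + 40

toyreg <- lm(DV ~ IV, data = toy)

# Standardized residuals: each residual divided by its estimated standard deviation
z <- rstandard(toyreg)

# Flag rows whose standardized residual exceeds the assumed cutoff of 3
flagged <- which(abs(z) > 3)
print(toy[flagged, ])
```

Applied to `myreg`, an empty result from `which()` would be consistent with a report of no outliers under this particular rule.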

Week 5 Data Exercise

Roni Portzen

2023-09-26
