data <- read.csv ("C:\\Users\\91630\\OneDrive\\Desktop\\statistics\\age_gaps.CSV")
The “release_year” column gives each movie’s release year and serves as the response variable.
The categorical column “character_1_gender” records the gender of the first character in the film and serves as the explanatory variable.
The null hypothesis for the ANOVA test is that the mean release year is the same for every category of character_1_gender. The alternative is that at least one category mean differs.
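The preprocessing step in the code that follows folds rarely occurring gender labels into a single "Other" level before the ANOVA is fitted. A minimal sketch of that idea, using a hypothetical toy vector (the labels and counts here are made up for illustration and are not taken from age_gaps.CSV):

```r
# Hypothetical toy vector, NOT taken from age_gaps.CSV
gender <- c(rep("man", 12), rep("woman", 11), rep("unknown", 2))

counts <- table(gender)               # frequency of each category
rare   <- names(counts)[counts < 10]  # categories with fewer than 10 rows

# Replace every rare label with "Other"; keep common labels unchanged
collapsed <- ifelse(gender %in% rare, "Other", gender)
table(collapsed)
```

Collapsing sparse levels this way keeps a category with only a handful of rows from contributing an unstable group mean, at the cost of no longer distinguishing the rare labels from one another.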
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
age_gaps_df <- read.csv("age_gaps.CSV")
# Collapse gender categories with fewer than 10 rows into "Other"
age_gaps_df <- age_gaps_df %>%
  mutate(character_1_gender = ifelse(
    character_1_gender %in% names(table(character_1_gender))[table(character_1_gender) < 10],
    "Other", character_1_gender))
anova_model <- aov(release_year ~ character_1_gender, data = age_gaps_df)
summary(anova_model)
## Df Sum Sq Mean Sq F value Pr(>F)
## character_1_gender 1 4095 4095 15.48 8.83e-05 ***
## Residuals 1153 304992 265
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The ANOVA rejects the null hypothesis: the mean release year differs between the gender categories of character 1 (F = 15.48 on 1 and 1153 degrees of freedom, p = 8.83e-05).
The association is statistically significant but weak, because the sum of squares for character_1_gender (4095) is very small next to the residual sum of squares (304992). The data are also observational, so the result shows an association and does not establish that character gender influences when a film is released.
For filmmakers and movie enthusiasts, this is a hint that the gender listed as character 1 has shifted somewhat over time, not evidence of an effect on release timing.
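The strength of the association can be gauged from the sums of squares in the ANOVA table above. The sketch below computes eta-squared, a common effect-size measure that was not part of the original analysis, and recomputes the F statistic as a consistency check. All numbers are copied by hand from the printed table, so they carry its rounding.

```r
# Sums of squares copied from the summary(anova_model) output above
ss_gender   <- 4095
ss_residual <- 304992

# Eta-squared: share of total variation attributable to the grouping factor
eta_squared <- ss_gender / (ss_gender + ss_residual)
round(eta_squared, 4)

# Recompute F from the mean squares (df = 1 and 1153) as a consistency check
f_stat <- (ss_gender / 1) / (ss_residual / 1153)
round(f_stat, 2)
```

Eta-squared comes out at roughly 0.013, so character_1_gender accounts for a little over 1% of the variation in release year, and the recomputed F agrees with the 15.48 reported in the table.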
For the continuous explanatory variable I chose actor_1_age_col, the age of the first actor in the movie. The response variable is age_diff_column, the age difference between the couple portrayed in the film.
age_diff_column <- age_gaps_df$age_difference
actor_1_age_col <- age_gaps_df$actor_1_age
correlation_coefficient <- cor(age_diff_column, actor_1_age_col)
# Printing the correlation coefficient
print(correlation_coefficient)
## [1] 0.7039631
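The correlation coefficient of 0.7039631 between actor_1_age_col and age_diff_column indicates a strong positive linear relationship: films with an older first actor tend to have a larger age gap between the couple. Note that cor() reports only the strength of the relationship and does not by itself test its significance.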
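Whether a correlation of this size is statistically significant can be checked with the usual t test for a Pearson correlation. The sketch below works from the printed coefficient alone and assumes n = 1155 observations, inferred from the ANOVA table above (1 + 1153 degrees of freedom, plus one) on the assumption that no rows were dropped for missing values. With the data loaded, cor.test() would perform the same test directly.

```r
r <- 0.7039631  # correlation coefficient printed above
n <- 1155       # ASSUMPTION: inferred from the ANOVA degrees of freedom

# t statistic for H0: true correlation is zero, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

t_stat
p_value
```

The t statistic is about 33.7, far beyond any conventional critical value, so the positive correlation is very unlikely to be a sampling artifact.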
lm_model <- lm(actor_1_age_col ~ age_difference, data = age_gaps_df)
summary(lm_model)
##
## Call:
## lm(formula = actor_1_age_col ~ age_difference, data = age_gaps_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.372 -5.047 -0.959 3.766 36.628
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 31.64775 0.34469 91.81 <2e-16 ***
## age_difference 0.86220 0.02562 33.66 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.407 on 1153 degrees of freedom
## Multiple R-squared: 0.4956, Adjusted R-squared: 0.4951
## F-statistic: 1133 on 1 and 1153 DF, p-value: < 2.2e-16
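The model as fitted regresses actor 1’s age on age_difference, which reverses the roles described above, and release year does not appear in it. The age_difference coefficient of 0.862 means that each additional year of age difference is associated with actor 1 being about 0.86 years older on average. The intercept of 31.65 is the expected age of actor 1 when the couple has no age gap.
The model explains roughly half of the variation in actor 1’s age (R-squared = 0.4956), and the residual standard error of 7.407 years shows that individual films still scatter widely around the fitted line.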
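The fitted coefficients in the summary output above can be read as a prediction rule. The sketch below copies the intercept and slope by hand from that output and evaluates them for a hypothetical couple with a 10-year age gap (the value 10 is chosen only for illustration). It also checks that the squared correlation matches the reported R-squared, as it should in a regression with a single predictor.

```r
# Coefficients copied from the summary(lm_model) output above
intercept <- 31.64775
slope     <- 0.86220

# Expected age of actor 1 for a HYPOTHETICAL couple with a 10-year age gap
predicted_age <- intercept + slope * 10
predicted_age

# In simple linear regression, R-squared equals the squared correlation
r <- 0.7039631
round(r^2, 4)
```

The prediction is about 40.3 years, and the squared correlation rounds to 0.4956, matching the Multiple R-squared line of the summary.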
library(ggplot2)
ggplot(age_gaps_df, aes(x = actor_1_age, y = age_difference)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Relationship between Actor 1's Age and Age Difference Between Characters",
       x = "Actor 1's Age",
       y = "Age Difference Between Characters") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'
This scatter plot places actor 1’s age on the horizontal axis (x-axis) and the age difference between the characters on the vertical axis (y-axis). The blue line is the least-squares regression line of age difference on actor 1’s age (the 'y ~ x' formula reported by geom_smooth()), so it summarizes how the typical age gap changes with actor 1’s age.
The upward slope shows that films with an older first actor tend to feature larger age differences between characters, consistent with the correlation of about 0.70.
Taken together, the analysis shows that actor 1’s age and the age difference between characters are strongly related, while character 1’s gender has only a weak association with release year. Exploring further factors such as directorial style or genre could deepen the understanding of these patterns.