library(readr)
library(mosaic) # provides the formula interface to tally() used below
Data <- read_csv("~/final-project2 - Sheet1.csv")
head(Data, n=2)
The primary aim of this study is to understand the relationship between liking to watch television and academic performance (GPA) among students of the University of Arcadia. Specifically, I focus on the difference in the proportion of students with a GPA of 2.5 or above between those who like to watch television and those who do not. Investigating this population parameter will allow us to assess whether liking to watch television is associated with GPA.
My research question is: “Is there an association between liking to watch television and GPA?”
Before analyzing any data, I developed an initial conjecture based on my past experiences. My working hypothesis was that there is no difference in the proportion of students of the University of Arcadia with a GPA of 2.5 or above between those who like to watch television and those who do not. However, I suspected that the actual parameter value might show a difference between the two groups, since liking to watch television may reduce the time available for studying. This suspicion is rooted in societal norms and expectations that prioritize academic success and discourage leisure activities such as watching television.
The observational units in this study were students of the University of Arcadia.
I collected data through verbal inquiries and academic records to measure the variables of interest. Participants were asked whether they liked watching television, and their academic records were used to determine whether their GPA was 2.5 or above.
Initially, the study was intended to examine the impact of video games on GPA. However, most participants either declined the survey or gave the same answer, which made it difficult to analyze the data meaningfully. I then shifted my focus to watching television as an alternative. In addition, I made up to two repeat visits to reach the observational units initially selected.
mosaicplot(GPA ~ TV, data = Data)
tally(TV ~ GPA, data = Data, format = "count", margin = TRUE)
##        GPA
## TV      No Yes
##   No    10  12
##   Yes   11  12
##   Total 21  24
tally(TV ~ GPA, data = Data, format = "prop")
##      GPA
## TV           No       Yes
##   No  0.4761905 0.5000000
##   Yes 0.5238095 0.5000000
For students who do not like to watch TV:
GPA below 2.5: 10/22 = 0.4545 (45.45%)
GPA 2.5 or above: 12/22 = 0.5455 (54.55%)
For students who like to watch TV:
GPA below 2.5: 11/23 = 0.4783 (47.83%)
GPA 2.5 or above: 12/23 = 0.5217 (52.17%)
Based on these proportions, there appears to be at most a weak association between liking to watch television and the GPA category. The difference in the proportion of students with a GPA of 2.5 or above between those who like to watch television and those who do not is small (52.17% vs. 54.55%). Likewise, the difference in the proportion of students with a GPA below 2.5 between the two groups is small (47.83% vs. 45.45%).
Population: Students of the University of Arcadia.
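The conditional proportions quoted above can be reproduced directly from the four cell counts. The following is a minimal sketch, assuming the counts from the two-way table; the object names `counts` and `row_props` are illustrative and do not appear in the original analysis.

```r
# Cell counts copied from the two-way table (rows: TV, columns: GPA 2.5 or above)
counts <- matrix(c(10, 12,
                   11, 12),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(TV = c("No", "Yes"), GPA = c("No", "Yes")))

# margin = 1 divides each cell by its row total, so each row sums to 1
# and GPA is conditioned on TV preference
row_props <- prop.table(counts, margin = 1)
round(row_props, 4)
```

Conditioning on the rows (TV preference) rather than the columns is what makes the two groups directly comparable on the GPA outcome.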
Parameter: Difference in the proportion of students with a GPA of 2.5 or above between those who like to watch television and those who do not.
Null hypothesis: there is no difference in the proportion of students with a GPA of 2.5 or above between those who like to watch television (\(\pi_{1}\)) and those who do not (\(\pi_{2}\)).
Alternative hypothesis: there is a difference in the proportion of students with a GPA of 2.5 or above between those who like to watch television (\(\pi_{1}\)) and those who do not (\(\pi_{2}\)).
\(H_0\) : \(\pi_{1}\) = \(\pi_{2}\)
\(H_a\): \(\pi_{1}\) \(\neq\) \(\pi_{2}\)
Type I error: Concluding that there is a difference in the proportion of students with a GPA of 2.5 or above between those who like to watch television and those who do not when there is no difference.
Type II error: Failing to conclude that there is a difference in the proportion of students with a GPA of 2.5 or above between those who like to watch television and those who do not when, in reality, there is a difference.
The sample may not be representative of the population, since I used convenience sampling to select participants. This method does not ensure that each individual in the population has an equal chance of being selected.
#sample sizes
n.watch_TV<- 23
n.not_watch_TV<- 22
# counts
gpa_above_2_5_watch_TV <- 12
gpa_above_2_5_not_watch_TV <- 12
#sample proportions
p.hat.watch_TV <- gpa_above_2_5_watch_TV/n.watch_TV
p.hat.not_watch_TV <- gpa_above_2_5_not_watch_TV/n.not_watch_TV
# difference between the sample proportions of watch_TV vs not_watch_TV
p.hat.diff<-p.hat.watch_TV-p.hat.not_watch_TV
p.hat.diff
## [1] -0.02371542
# pooled proportion (total successes / combined sample size)
p.hat.pooled<- (gpa_above_2_5_watch_TV+gpa_above_2_5_not_watch_TV)/(n.watch_TV + n.not_watch_TV)
# Standard error of the theoretical sampling distribution based on a pooled proportion
SE.diff.null<-sqrt((p.hat.pooled*(1-p.hat.pooled))/n.watch_TV
+ (p.hat.pooled*(1-p.hat.pooled))/n.not_watch_TV)
# Standardized statistic, z
z <- p.hat.diff/SE.diff.null
round(z,2)
## [1] -0.16
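The standardized statistic (z) is -0.16. The validity conditions for the theory-based test are satisfied, since each group has at least 10 students in each GPA category (12 and 11 among those who like to watch TV; 12 and 10 among those who do not).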
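A common rule of thumb for the two-proportion z-test requires at least 10 successes and 10 failures in each group. The sketch below checks that rule against the observed counts; the threshold of 10 and the names `successes`, `failures`, and `conditions_met` are assumptions introduced for illustration.

```r
# Successes (GPA 2.5 or above) and failures (GPA below 2.5) in each group,
# taken from the two-way table
successes <- c(watch_TV = 12, not_watch_TV = 12)
failures  <- c(watch_TV = 11, not_watch_TV = 10)

# Assumed rule of thumb: every cell count must be at least 10
conditions_met <- all(successes >= 10, failures >= 10)
conditions_met
```

The smallest cell count is exactly 10, so the condition is met only narrowly, and a simulation-based test would be a reasonable cross-check.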
# Theory-based test p-value
left.tailed.p.value <- pnorm(z, 0, 1, lower.tail = TRUE)
# Doubling the tail beyond |z| gives the two-sided p-value regardless of the sign of z
two.sided.p.value <- 2 * pnorm(-abs(z), 0, 1)
two.sided.p.value
## [1] 0.8733512
The p-value of 0.8733512 means that, assuming the null hypothesis is true, there is about an 87.3% probability of observing a difference in sample proportions at least as extreme as the one found in the sample data.
Since the p-value is greater than the significance level of 0.05, we fail to reject the null hypothesis. There is not enough evidence to suggest an association between liking to watch TV and having a GPA of 2.5 or above.
Therefore, based on this sample of students at the University of Arcadia, we cannot conclude that liking to watch TV is associated with GPA.
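As a cross-check on the hand computation, base R's `prop.test()` can run the same two-sided test when its continuity correction is switched off. This is a sketch rather than part of the original analysis; the names `x`, `n`, `check`, and `z_from_chisq` are illustrative.

```r
# Successes and sample sizes, in the order (like TV, do not like TV)
x <- c(12, 12)
n <- c(23, 22)

# correct = FALSE disables the continuity correction, so the chi-squared
# statistic equals the square of the pooled z statistic
check <- prop.test(x = x, n = n, alternative = "two.sided", correct = FALSE)

# Recover z; the negative sign comes from the negative observed difference
z_from_chisq <- -sqrt(unname(check$statistic))
round(z_from_chisq, 2)
check$p.value
```

With the default `correct = TRUE`, the p-value would differ slightly from the hand-computed one because of the continuity correction.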
# Standard error of the sampling distribution based on individual proportions
SE.diff.CI<-sqrt((p.hat.watch_TV*(1-p.hat.watch_TV))/n.watch_TV
+ (p.hat.not_watch_TV*(1-p.hat.not_watch_TV))/n.not_watch_TV)
# margin of error for 95% CI
MoE <- 1.96 * SE.diff.CI
#MoE
LB<-p.hat.diff - MoE # lower limit of 95% CI
UB<-p.hat.diff + MoE # upper limit of 95% CI
round(cbind(LB,UB),2)
##         LB   UB
## [1,] -0.32 0.27
The 95% confidence interval is between -0.32 and 0.27.
So we are 95% confident that the difference in population proportions of students with a GPA of 2.5 or above between those who like to watch TV and those who do not like to watch TV is between -0.32 and 0.27.
Since the interval includes zero, we cannot conclude that there is a significant difference in the proportions of students with a GPA of 2.5 or above between those who like to watch TV and those who do not. This is consistent with the conclusion of the hypothesis test, where we failed to reject the null hypothesis based on the p-value. Neither the p-value nor the confidence interval provides strong evidence of an association between liking to watch TV and having a GPA of 2.5 or above.
In this study, I investigated the relationship between liking to watch TV and having a GPA of 2.5 or above among students of the University of Arcadia. I collected data from 45 students using a convenience sampling method.
My analysis showed no significant difference in the proportion of students with a GPA of 2.5 or above between those who like to watch TV and those who do not: the p-value was approximately 0.87, and the 95% confidence interval, from -0.32 to 0.27, included zero. This differs from what I expected.
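The interval can also be cross-checked with `prop.test()`, which, with the continuity correction off, reports an interval for the difference in proportions based on the unpooled standard error. This is a sketch under that assumption; `ci` and `contains_zero` are illustrative names, and `prop.test()` uses the exact normal quantile rather than the rounded 1.96.

```r
# Successes and sample sizes, in the order (like TV, do not like TV)
x <- c(12, 12)
n <- c(23, 22)

# conf.int holds the 95% interval for (proportion like TV) - (proportion do not like TV)
ci <- prop.test(x = x, n = n, conf.level = 0.95, correct = FALSE)$conf.int
round(as.numeric(ci), 2)

# An interval that contains zero is consistent with failing to reject the null
contains_zero <- ci[1] < 0 && ci[2] > 0
contains_zero
```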
Moreover, convenience sampling has limitations, including potential bias and lack of generalizability to the larger population. In future studies, a more representative sampling method, such as stratified random sampling or cluster sampling, could be used to obtain a more accurate representation of the population. Additionally, it would be beneficial to include a larger and more diverse sample, including students from different schools or regions, to increase the generalizability of the results.
Researchers might investigate the relationship between GPA and other forms of entertainment, such as social media or video gaming, or explore the influence of parental involvement or extracurricular activities on academic performance. Building on the results of this study can help us better understand the various factors that contribute to students' academic success.