install.packages("readxl")
install.packages("ggpubr")
library(readxl)
library(ggpubr)
## Loading required package: ggplot2
data1 <- read_excel("A4Q1.xlsx")
ggscatter(
data1,
x = "age",
y = "education",
add = "reg.line",
xlab = "Age",
ylab = "Education"
)
The scatterplot shows a linear, positive, moderate relationship between
age and education. There are no obvious outliers.
Statistics
mean(data1$age)
## [1] 35.32634
sd(data1$age)
## [1] 11.45344
median(data1$age)
## [1] 35.79811
mean(data1$education)
## [1] 13.82705
sd(data1$education)
## [1] 2.595901
median(data1$education)
## [1] 14.02915
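The six calls above can also be collected into one small table, which makes the values easier to copy into the write-up. This is only a sketch: `data_sim` is a simulated stand-in (an assumption, since A4Q1.xlsx is not available here), and in the real analysis `data1` would take its place.

```r
# Simulated stand-in for data1 (hypothetical; the real data come from A4Q1.xlsx)
set.seed(42)
data_sim <- data.frame(
  age       = rnorm(150, mean = 35, sd = 11),
  education = rnorm(150, mean = 14, sd = 2.6)
)

# Compute mean, sd, and median for each variable;
# t() puts one variable per row and one statistic per column
desc <- t(sapply(data_sim[c("age", "education")], function(v) {
  c(mean = mean(v), sd = sd(v), median = median(v))
}))

round(desc, 2)
```

Swapping `data_sim` for `data1` should give the same numbers as the individual `mean()`, `sd()`, and `median()` calls.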
Histogram
hist(data1$age,
main = "Age",
breaks = 20,
col = "lightblue",
border = "white")
hist(data1$education,
main = "Education",
breaks = 20,
col = "lightcoral",
border = "white")
Both age and education look approximately normally distributed.
Shapiro-Wilk test
shapiro.test(data1$age)
##
## Shapiro-Wilk normality test
##
## data: data1$age
## W = 0.99194, p-value = 0.5581
shapiro.test(data1$education)
##
## Shapiro-Wilk normality test
##
## data: data1$education
## W = 0.9908, p-value = 0.4385
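Neither Shapiro-Wilk test is significant (age: W = 0.99, p = .56; education: W = 0.99, p = .44), so there is no evidence that either variable departs from normality. Because the histograms and the Shapiro-Wilk tests both support normality, we use a Pearson correlation.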
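The decision rule used here could be sketched as a small helper. `choose_cor_method()` is a hypothetical function, not part of the original analysis. The fallback to Spearman when normality is rejected is a common convention assumed here rather than something stated in the assignment, and the data are simulated stand-ins for `data1$age` and `data1$education`.

```r
# Hypothetical helper: choose the correlation method from the
# Shapiro-Wilk p-values of both variables
choose_cor_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  # Pearson only if neither test rejects normality; otherwise Spearman (assumed fallback)
  if (p_x > alpha && p_y > alpha) "pearson" else "spearman"
}

# Simulated stand-in data (assumption; the real analysis would use data1)
set.seed(1)
age_sim <- rnorm(150, mean = 35, sd = 11)
edu_sim <- 14 + 0.1 * (age_sim - 35) + rnorm(150, sd = 2)

method <- choose_cor_method(age_sim, edu_sim)
result <- cor.test(age_sim, edu_sim, method = method)
```

With the real data, both p-values are above .05, so this rule would select `"pearson"`, matching the call that follows.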
cor.test(data1$age, data1$education, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: data1$age and data1$education
## t = 7.4066, df = 148, p-value = 9.113e-12
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3924728 0.6279534
## sample estimates:
## cor
## 0.5200256
A Pearson correlation was conducted to test the relationship between age (M = 35.33, SD = 11.45) and education (M = 13.83, SD = 2.60). There was a statistically significant relationship between the two variables, r(148) = .52, p < .001, 95% CI [.39, .63]. The relationship was positive.
As age increases, education increases.
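As a check on the reported values, the t statistic, p-value, and confidence interval can be recomputed from r and df alone. The sketch below uses the standard t formula for a Pearson correlation and Fisher's z transformation for the interval. The inputs are copied from the `cor.test()` output, and the results should agree with that output up to rounding.

```r
# Values copied from the cor.test() output
r  <- 0.5200256
df <- 148
n  <- df + 2   # Pearson's test uses df = n - 2

# t statistic for H0: true correlation = 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via Fisher's z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(t_stat, 4)
signif(p_value, 4)
round(ci, 4)
```

Because the p-value is far below .001, it is reported as p < .001 rather than in scientific notation.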