Loading libraries and importing Excel files.

library("readxl")
library("ggpubr")
## Loading required package: ggplot2

Dataset import

A4Q1 <- read_excel("A4Q1.xlsx")

Scatter plot

ggscatter(
  A4Q1,
  x = "age",
  y = "education",
  add = "reg.line",
  xlab = "age (years)",
  ylab = "education (years)",
  title = "Scatterplot of age vs education"
)

## Initial Observations of the Scatterplot

The relationship is linear. There is a positive relationship between the variables. The relationship is strong. There are no obvious outliers.

Descriptive Statistics

# Mean, standard deviation, and median of age
mean(A4Q1$age)
## [1] 35.32634
sd(A4Q1$age)
## [1] 11.45344
median(A4Q1$age)
## [1] 35.79811
# Mean, standard deviation, and median of education
mean(A4Q1$education)
## [1] 13.82705
sd(A4Q1$education)
## [1] 2.595901
median(A4Q1$education)
## [1] 14.02915
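
The six calls above could also be collected into one table. The following is a minimal sketch, not part of the original analysis: `describe_vars` is a hypothetical helper name, and the `na.rm = TRUE` argument is an added assumption (the original calls do not remove missing values).

```r
# Hypothetical helper: mean, SD, and median for each named numeric column.
# na.rm = TRUE is an assumption; the original calls do not drop missing values.
describe_vars <- function(data, vars) {
  stats <- sapply(vars, function(v) {
    x <- data[[v]]
    c(mean = mean(x, na.rm = TRUE),
      sd = sd(x, na.rm = TRUE),
      median = median(x, na.rm = TRUE))
  })
  # One row per variable, one column per statistic, rounded for reporting.
  round(t(stats), 2)
}

# Apply to the imported data only if it is present in the session.
if (exists("A4Q1")) {
  print(describe_vars(A4Q1, c("age", "education")))
}
```

Keeping both variables in one table makes it easier to copy the right M and SD into the final write-up.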

Histograms

hist(A4Q1$age,
     main = "age",
     breaks = 20,
     col = "gray",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

hist(A4Q1$education,
     main = "education",
     breaks = 20,
     col = "maroon",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

## Observations from Histograms

Variable 1: age. The first variable looks approximately normally distributed. The histogram is roughly symmetrical and bell-shaped.

Variable 2: education. The second variable looks approximately normally distributed. The histogram is roughly symmetrical and bell-shaped.

Shapiro Test

shapiro.test(A4Q1$age)
## 
##  Shapiro-Wilk normality test
## 
## data:  A4Q1$age
## W = 0.99194, p-value = 0.5581
shapiro.test(A4Q1$education) 
## 
##  Shapiro-Wilk normality test
## 
## data:  A4Q1$education
## W = 0.9908, p-value = 0.4385

Observations

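Variable 1: age. The Shapiro-Wilk test did not indicate a departure from normality (W = 0.992, p = .558), so age is treated as normally distributed.

Variable 2: education. The Shapiro-Wilk test did not indicate a departure from normality (W = 0.991, p = .439), so education is treated as normally distributed.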
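
The two normality tests could be run in a single pass and gathered into one table. This is only a sketch under assumptions: `shapiro_table` is a hypothetical helper name that is not in the original script, and it relies solely on base R's `shapiro.test()`.

```r
# Hypothetical helper: run shapiro.test() on each named column and
# collect the W statistic and p-value into one data frame.
shapiro_table <- function(data, vars) {
  out <- lapply(vars, function(v) {
    res <- shapiro.test(data[[v]])
    data.frame(variable = v,
               W = unname(res$statistic),
               p_value = res$p.value)
  })
  do.call(rbind, out)
}

# Apply to the imported data only if it is present in the session.
if (exists("A4Q1")) {
  print(shapiro_table(A4Q1, c("age", "education")))
}
```

A p-value above .05 means the test found no evidence against normality; it does not prove that the variable is normally distributed.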

Correlation Test

cor.test(A4Q1$age, A4Q1$education, method = "pearson") 
## 
##  Pearson's product-moment correlation
## 
## data:  A4Q1$age and A4Q1$education
## t = 7.4066, df = 148, p-value = 9.113e-12
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.3924728 0.6279534
## sample estimates:
##       cor 
## 0.5200256
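
As a sanity check, the reported t statistic can be recomputed from the correlation and its degrees of freedom using t = r * sqrt(df) / sqrt(1 - r^2). The sketch below copies r and df from the output above; the variable names are illustrative only.

```r
# Values copied from the cor.test() output above.
r  <- 0.5200256
df <- 148

# t statistic for a Pearson correlation: t = r * sqrt(df) / sqrt(1 - r^2)
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with df degrees of freedom.
p_val <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

print(round(t_stat, 4))
print(p_val)
```

The recomputed values should closely match the t and p-value that `cor.test()` printed, up to rounding of the copied r.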

Pearson Analysis

A Pearson correlation was conducted to test the relationship between a person’s age in years (M = 35.33, SD = 11.45) and years of education (M = 13.83, SD = 2.60). There was a statistically significant relationship between the two variables, r(148) = .520, p < .001. The relationship was positive and strong. As age increased, level of education tended to increase.
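
To reduce copy errors when writing up the result, the r(df) string could be built directly from the test object. This is a minimal sketch, assuming a result returned by `cor.test()` with `method = "pearson"`; `apa_r` is a hypothetical helper name, not an existing function.

```r
# Hypothetical helper: format a Pearson cor.test() result as "r(df) = .xxx, p ...".
apa_r <- function(ct) {
  # Three decimals, with the leading zero dropped (APA style for r and p).
  r_txt <- sub("^(-?)0\\.", "\\1.", sprintf("%.3f", unname(ct$estimate)))
  p <- ct$p.value
  if (p < 0.001) {
    p_txt <- "p < .001"
  } else {
    p_txt <- paste0("p = ", sub("^0\\.", ".", sprintf("%.3f", p)))
  }
  sprintf("r(%d) = %s, %s", as.integer(ct$parameter), r_txt, p_txt)
}

# Apply to the imported data only if it is present in the session.
if (exists("A4Q1")) {
  print(apa_r(cor.test(A4Q1$age, A4Q1$education, method = "pearson")))
}
```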