library(readr)
library(ggpubr)
## Loading required package: ggplot2
The dean of a business school undertakes a study to relate starting salary after graduation to grade point average (GPA) in major courses. The dean randomly selects the records of 10 students, shown in the accompanying table. Perform a correlation analysis using the Pearson correlation coefficient.
gpa<-read.csv("gpa.csv")
head(gpa)
## GPA Starting.salary
## 1 78 17
## 2 81 18
## 3 85 18
## 4 87 28
## 5 75 17
## 6 79 22
cor.test(gpa$GPA, gpa$Starting.salary, method="pearson")
##
## Pearson's product-moment correlation
##
## data: gpa$GPA and gpa$Starting.salary
## t = 2.1034, df = 8, p-value = 0.06858
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.05268605 0.89143246
## sample estimates:
## cor
## 0.5967355
ggscatter(gpa, x = "GPA", y = "Starting.salary", add = "reg.line", cor.coef = TRUE, cor.method = "pearson", xlab = "GPA", ylab = "Starting Salary in thousands of pesos")
## `geom_smooth()` using formula 'y ~ x'
There is a moderate positive correlation between grade point average (GPA) and starting salary after graduation, as indicated by the correlation coefficient of 0.5967. In this sample, students with a higher GPA tend to have a higher starting salary. However, at the 5% significance level the relationship is not statistically significant, because the p-value of 0.06858 exceeds 0.05 and the 95% confidence interval (-0.0527 to 0.8914) includes 0.
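The t statistic and p-value reported by cor.test() can be checked by hand from the correlation coefficient and the sample size. The following is a minimal sketch, assuming the standard test of a zero correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom; the variable names are illustrative and the inputs are copied from the output above.

```r
# Values copied from the cor.test() output for the GPA data
r <- 0.5967355   # sample Pearson correlation
n <- 10          # number of students in the sample

# Test statistic for H0: true correlation is equal to 0
df <- n - 2
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df)

print(round(t_stat, 4))
print(round(p_value, 5))
```

Up to rounding, these two values should agree with the t and p-value lines of the cor.test() output.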
The following are the numbers of sales contacts made by 9 salespersons during a week and the number of sales made. Perform a correlation analysis using the Pearson correlation coefficient and interpret.
sales<-read.csv("sales.csv")
head(sales)
## sales.contacts sales
## 1 71 25
## 2 64 14
## 3 100 37
## 4 105 40
## 5 75 18
## 6 79 10
cor.test(sales$sales.contacts, sales$sales, method="pearson")
##
## Pearson's product-moment correlation
##
## data: sales$sales.contacts and sales$sales
## t = 5.5545, df = 7, p-value = 0.0008559
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.5960951 0.9795925
## sample estimates:
## cor
## 0.9028119
ggscatter(sales, x = "sales.contacts", y = "sales", add = "reg.line", cor.coef = TRUE, cor.method = "pearson", xlab = "Sales Contacts", ylab = "Sales")
## `geom_smooth()` using formula 'y ~ x'
There is a very strong positive correlation between the number of sales contacts made by the salespersons and the number of sales made, as indicated by the correlation coefficient of 0.9028. As the number of sales contacts increases, the number of sales tends to increase. This relationship is statistically significant at the 5% significance level, since the p-value of 0.0008559 is below 0.05.
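The 95 percent confidence interval in the output can be reproduced with the Fisher z-transformation, which is the approach cor.test() documents for Pearson's correlation. This is a sketch under that assumption; the variable names are illustrative and the inputs are copied from the sales output above.

```r
# Values copied from the cor.test() output for the sales data
r <- 0.9028119   # sample Pearson correlation
n <- 9           # number of salespersons

# Fisher z-transformation: atanh(r) is approximately normal
# with standard error 1 / sqrt(n - 3)
z  <- atanh(r)
se <- 1 / sqrt(n - 3)

# 95% interval on the z scale, transformed back to the correlation scale
crit <- qnorm(0.975)
ci <- tanh(z + c(-1, 1) * crit * se)

print(round(ci, 4))
```

Because the interval is built on the z scale and then back-transformed, it is not symmetric around r, which is why the lower bound sits much farther from 0.9028 than the upper bound.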
The owner of a car wants to study the relationship between the age of a car and its selling price. Listed below is a random sample of 12 used cars sold at a dealership during the last year. Perform a correlation analysis using (a) the Pearson correlation coefficient and (b) the Spearman rank correlation coefficient. Interpret the results.
Pearson Correlation Coefficient
car<-read.csv("car.csv")
head(car)
## age selling.price
## 1 9 8.1
## 2 7 6.0
## 3 11 3.6
## 4 12 4.0
## 5 8 5.0
## 6 7 10.0
cor.test(car$age, car$selling.price, method="pearson")
##
## Pearson's product-moment correlation
##
## data: car$age and car$selling.price
## t = -2.0483, df = 10, p-value = 0.0677
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.85178991 0.04397567
## sample estimates:
## cor
## -0.5436463
ggscatter(car, x = "age", y = "selling.price", add = "reg.line", cor.coef = TRUE, cor.method = "pearson", xlab = "Age", ylab = "Selling Price in thousands of dollars")
## `geom_smooth()` using formula 'y ~ x'
There is a moderate negative correlation between the age of a car and its selling price, as indicated by the correlation coefficient of -0.5436. In this sample, older cars tend to sell for less. However, at the 5% significance level the relationship is not statistically significant, because the p-value of 0.0677 exceeds 0.05.
Spearman Rank Correlation Coefficient
car<-read.csv("car.csv")
head(car)
## age selling.price
## 1 9 8.1
## 2 7 6.0
## 3 11 3.6
## 4 12 4.0
## 5 8 5.0
## 6 7 10.0
cor.test(car$age, car$selling.price, method="spearman")
## Warning in cor.test.default(car$age, car$selling.price, method = "spearman"):
## Cannot compute exact p-value with ties
##
## Spearman's rank correlation rho
##
## data: car$age and car$selling.price
## S = 442.74, p-value = 0.06507
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.5480427
ggscatter(car, x = "age", y = "selling.price", add = "reg.line", cor.coef = TRUE, cor.method = "spearman", xlab = "Age", ylab = "Selling Price in thousands of dollars")
## `geom_smooth()` using formula 'y ~ x'
There is a moderate negative monotonic relationship between the age of a car and its selling price, as indicated by the Spearman rank correlation coefficient of -0.5480. In this sample, older cars tend to rank lower in selling price. However, at the 5% significance level the relationship is not statistically significant, because the p-value of 0.06507 exceeds 0.05. Since the data contain ties, this p-value is an approximation rather than an exact value, as the warning in the output notes.
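Spearman's rho is the Pearson correlation computed on the ranks of the data, with tied values given their average rank. The sketch below illustrates this using only the six rows printed by head(car), so its value of rho is for that subset and is not expected to match the full 12-car result; the vector names are illustrative.

```r
# First six rows of the car data, as printed by head(car)
age   <- c(9, 7, 11, 12, 8, 7)
price <- c(8.1, 6.0, 3.6, 4.0, 5.0, 10.0)

# Rank each variable; tied values (the two cars aged 7) share an average rank
rank_age   <- rank(age)
rank_price <- rank(price)

# Pearson correlation of the ranks
rho_manual <- cor(rank_age, rank_price)

# Built-in Spearman correlation for comparison
rho_builtin <- cor(age, price, method = "spearman")

print(all.equal(rho_manual, rho_builtin))
```

Working on ranks is what makes the coefficient a measure of monotonic rather than strictly linear association, and it is also why tied ages trigger the exact p-value warning in cor.test().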