library(pander)
library(ggpubr)

\(~\)

Problem 1: The dean of a business school undertakes a study to relate starting salary after graduation to grade point average (GPA) in major courses. He randomly selects the records of 10 students, shown in the accompanying table. Perform a correlation analysis using the Pearson correlation coefficient. (5 pts.)

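| Student | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| GPA | 78 | 81 | 85 | 87 | 75 | 79 | 83 | 88 | 85 | 77 |
| Starting salary | 17 | 18 | 18 | 28 | 17 | 22 | 30 | 34 | 30 | 28 |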

\(~\)

Solution:

# Enter data manually.
gpa <- c(78, 81, 85, 87, 75, 79, 83, 88, 85, 77)
ssalary <- c(17, 18, 18, 28, 17, 22, 30, 34, 30, 28)
cor.data1 <- data.frame(gpa, ssalary)

pander(cor.test(cor.data1$gpa, cor.data1$ssalary, method = "pearson"))
Pearson’s product-moment correlation: cor.data1$gpa and cor.data1$ssalary

| Test statistic | df | P value | Alternative hypothesis | cor |
|---|---|---|---|---|
| 2.103 | 8 | 0.06858 | two.sided | 0.5967 |

The analysis shows a moderately strong, positive correlation between GPA and starting salary (\(r = 0.5967\)). However, the hypothesis test on the correlation coefficient does not reject the null hypothesis of zero correlation at the 5% level (\(t = 2.103\), \(df = 8\), \(p = 0.0686 > 0.05\)), so the sample does not provide sufficient evidence of a linear relationship between GPA and starting salary.
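
As a check on the reported test statistic, the standard relation between the sample correlation and its \(t\) statistic can be worked out by hand. This is only a sketch using the rounded value \(r = 0.5967\) and \(n = 10\), so the result agrees with the table up to rounding.

\[
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{0.5967\sqrt{8}}{\sqrt{1-0.5967^{2}}}
  \approx \frac{1.688}{0.802}
  \approx 2.10
\]

The statistic is compared with a \(t\) distribution on \(n - 2 = 8\) degrees of freedom, which is where the two-sided \(p\)-value of 0.06858 comes from.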

\(~\)

The following is a scatterplot of the data.

ggscatter(cor.data1, x="gpa", y="ssalary", add = "reg.line", cor.coef = TRUE,
          cor.method = "pearson", xlab = "GPA", ylab = "Starting salary")

\(~\)

Problem 2: The following table gives the number of sales contacts made by 9 salespersons during a week and the number of sales each made. Perform a correlation analysis using the Pearson correlation coefficient and interpret the result. (5 pts.)

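| Salesperson | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Sales contacts | 71 | 64 | 100 | 105 | 75 | 79 | 82 | 68 | 110 |
| Sales | 25 | 14 | 37 | 40 | 18 | 10 | 22 | 12 | 42 |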

\(~\)

Solution:

salescon <- c(71, 64, 100, 105, 75, 79, 82, 68, 110)
sales <- c(25, 14, 37, 40, 18, 10, 22, 12, 42)
cor.data2 <- data.frame(salescon, sales)

pander(cor.test(cor.data2$salescon, cor.data2$sales, method = "pearson"))
Pearson’s product-moment correlation: cor.data2$salescon and cor.data2$sales

| Test statistic | df | P value | Alternative hypothesis | cor |
|---|---|---|---|---|
| 5.554 | 7 | 0.0008559 * * * | two.sided | 0.9028 |

The results indicate a very strong, positive correlation between the number of sales contacts made during a week and the number of sales made (\(r = 0.9028\)). The hypothesis test further shows that there is a significant linear relationship between the number of sales contacts and the number of sales (\(t = 5.554\), \(df = 7\), \(p = 0.0009 < 0.05\)).
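
As a cross-check on the table above, the correlation and its test statistic can be reproduced from their definitions. The sketch below uses only base R; the names `dx`, `dy`, `r_manual`, `t_stat`, and `p_value` are introduced here for illustration and are not part of the original solution.

```r
# Problem 2 data, re-entered so the block runs on its own.
salescon <- c(71, 64, 100, 105, 75, 79, 82, 68, 110)
sales <- c(25, 14, 37, 40, 18, 10, 22, 12, 42)
n <- length(salescon)

# Deviations from the sample means.
dx <- salescon - mean(salescon)
dy <- sales - mean(sales)

# Pearson r: sum of cross-products divided by the square root of the
# product of the two sums of squares.
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat <- r_manual * sqrt(n - 2) / sqrt(1 - r_manual^2)

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# Compare with the built-in estimate, then print rounded values.
all.equal(r_manual, cor(salescon, sales))
signif(c(r = r_manual, t = t_stat, p = p_value), 4)
```

If the data are entered as shown, the rounded values should agree with the `cor.test()` output reported above.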
\(~\)

Here is a scatterplot of the given data.

ggscatter(cor.data2, x="salescon", y="sales", add = "reg.line", cor.coef = TRUE,
          cor.method = "pearson", xlab = "Sales Contacts", ylab = "Sales")

\(~\)

Problem 3: The owner of a car wants to study the relationship between the age of a car and its selling price. Listed below is a random sample of 12 used cars sold at a dealership during the last year. Perform a correlation analysis using (a) Pearson correlation; (b) Spearman rank correlation. Interpret the results. (10 pts.)

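| Car | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age (years) | 9 | 7 | 11 | 12 | 8 | 7 | 8 | 11 | 10 | 12 | 6 | 6 |
| Selling Price (in $1000) | 8.1 | 6.0 | 3.6 | 4.0 | 5.0 | 10.0 | 7.6 | 8.0 | 8.0 | 6.0 | 8.6 | 8.0 |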

\(~\)

Solution:

age <- c(9, 7, 11, 12, 8, 7, 8, 11, 10, 12, 6, 6)
price <- c(8.1, 6.0, 3.6, 4.0, 5.0, 10.0, 7.6, 8.0, 8.0, 6.0, 8.6, 8.0)
cor.data3 <- data.frame(age, price)

# Using Pearson correlation:
pander(cor.test(cor.data3$age, cor.data3$price, method = "pearson"))
Pearson’s product-moment correlation: cor.data3$age and cor.data3$price

| Test statistic | df | P value | Alternative hypothesis | cor |
|---|---|---|---|---|
| -2.048 | 10 | 0.0677 | two.sided | -0.5436 |

# Using Spearman correlation:
pander(cor.test(cor.data3$age, cor.data3$price, method = "spearman"))
Spearman’s rank correlation rho: cor.data3$age and cor.data3$price

| Test statistic | P value | Alternative hypothesis | rho |
|---|---|---|---|
| 442.7 | 0.06507 | two.sided | -0.548 |

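(a) The Pearson correlation analysis shows a moderately strong, negative correlation between the age of a car and its selling price (\(r = -0.5436\)). However, the test does not reject the null hypothesis of zero correlation at the 5% level (\(t = -2.048\), \(df = 10\), \(p = 0.0677 > 0.05\)), so there is no significant linear relationship between the age and the selling price of a car.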

(b) The Spearman rank correlation analysis also shows a moderately strong, negative relationship between the age and the selling price of a car (\(\rho = -0.548\)). Because Spearman's coefficient is computed on ranks, its test concerns a monotonic rather than a strictly linear association. The test indicates that there is no significant monotonic relationship between the age and the selling price of a car (\(p = 0.0651 > 0.05\)).
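
To illustrate how the two coefficients in this problem are related, Spearman's rho can be obtained by applying the Pearson formula to the ranks of the data. The following is a minimal sketch in base R; the names `rank_age`, `rank_price`, `rho_manual`, and `s_stat` are introduced here for illustration, and the formula used for `s_stat` is an assumption about how `cor.test()` reports its statistic when ties are present.

```r
# Problem 3 data, re-entered so the block runs on its own.
age <- c(9, 7, 11, 12, 8, 7, 8, 11, 10, 12, 6, 6)
price <- c(8.1, 6.0, 3.6, 4.0, 5.0, 10.0, 7.6, 8.0, 8.0, 6.0, 8.6, 8.0)
n <- length(age)

# Average ranks: tied values share the mean of the positions they occupy.
rank_age <- rank(age)
rank_price <- rank(price)

# Spearman's rho computed as Pearson's r on the ranks.
rho_manual <- cor(rank_age, rank_price, method = "pearson")

# Statistic S derived from rho (assumed to match what cor.test() reports
# when the data contain ties, as both variables do here).
s_stat <- (n^3 - n) * (1 - rho_manual) / 6

# Compare with the built-in estimate, then print rounded values.
all.equal(rho_manual, cor(age, price, method = "spearman"))
round(c(rho = rho_manual, S = s_stat), 3)
```

Both `age` and `price` contain tied values, so `cor.test()` cannot compute an exact \(p\)-value for these data and issues a warning to that effect; the reported \(p\)-value is an approximation.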

\(~\)

The following is a scatterplot of the given data.

ggscatter(cor.data3, x="age", y="price", add = "reg.line", cor.coef = TRUE,
          cor.method = "pearson", xlab = "Age (in years)", 
          ylab = "Selling Price (in $1000)")