library(pander)

\(~\)

Problem 1: Suppose data were collected from a sample of 10 branches of a pizza restaurant chain located near college campuses, shown below.

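| Restaurant | Student Population (in 1000s) | Quarterly Sales (in $1000) |
|---|---|---|
| 1 | 2 | 58 |
| 2 | 6 | 105 |
| 3 | 8 | 88 |
| 4 | 8 | 118 |
| 5 | 12 | 117 |
| 6 | 16 | 137 |
| 7 | 20 | 157 |
| 8 | 20 | 169 |
| 9 | 22 | 149 |
| 10 | 26 | 202 |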

\(~\)

Create data in RStudio:

popn <- c(2, 6, 8, 8, 12, 16, 20, 20, 22, 26)
qsales <- c(58, 105, 88, 118, 117, 137, 157, 169, 149, 202)

a.) Plot the scatter diagram. (2 pts.)

plot(popn, qsales, main="Scatter Plot of Data", xlab = "Student Population", 
     ylab = "Quarterly Sales")

\(~\)

b.) What does the scatter diagram indicate about the relationship between student population and quarterly sales? (1 pt.)

The scatter diagram shows an upward, roughly linear trend: branches near larger student populations tend to report higher quarterly sales. This suggests a positive linear relationship between student population and quarterly sales.

\(~\)

c.) Develop the estimated linear regression equation that could be used to predict the quarterly sales from the student population. (4 pts.)

reg.model1 <- lm(qsales~popn)
pander(summary(reg.model1))
|   | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 60 | 9.226 | 6.503 | 0.0001874 |
| popn | 5 | 0.5803 | 8.617 | 2.549e-05 |

Fitting linear model: qsales ~ popn

| Observations | Residual Std. Error | \(R^2\) | Adjusted \(R^2\) |
|---|---|---|---|
| 10 | 13.83 | 0.9027 | 0.8906 |

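The estimated linear regression equation is given by: \(\widehat{\text{Quarterly Sales}} = 60 + 5 \times \text{Student Population}\). Since both variables are recorded in thousands, each additional 1,000 students is associated with an estimated increase of \(\$5{,}000\) in quarterly sales.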
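The coefficients above can also be reproduced by hand from the usual least-squares formulas, slope \(b_1 = S_{xy}/S_{xx}\) and intercept \(b_0 = \bar{y} - b_1\bar{x}\). The following is a minimal sketch of that check; the names `sxx`, `sxy`, `b0`, and `b1` are introduced here for illustration and are not part of the original solution.

```r
# Data from Problem 1
popn   <- c(2, 6, 8, 8, 12, 16, 20, 20, 22, 26)
qsales <- c(58, 105, 88, 118, 117, 137, 157, 169, 149, 202)

# Sum of squares of x and sum of cross-products, both about the means
sxx <- sum((popn - mean(popn))^2)
sxy <- sum((popn - mean(popn)) * (qsales - mean(qsales)))

# Least-squares slope and intercept
b1 <- sxy / sxx
b0 <- mean(qsales) - b1 * mean(popn)
c(intercept = b0, slope = b1)

# These should agree with the coefficients reported by lm()
coef(lm(qsales ~ popn))
```

For this data set the hand computation gives an intercept of 60 and a slope of 5, which agrees with the `lm()` output.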

\(~\)

d.) Compute the Pearson correlation coefficient and the coefficient of determination and interpret these. (4 pts.)

pander(cor.test(popn, qsales, method = "pearson"))
Pearson’s product-moment correlation: popn and qsales

| Test statistic | df | P value | Alternative hypothesis | cor |
|---|---|---|---|---|
| 8.617 | 8 | 2.549e-05 * * * | two.sided | 0.9501 |

The Pearson correlation coefficient is \(0.9501\), which indicates a very strong, positive linear relationship between Quarterly Sales and Student Population. The coefficient of determination, \(0.9027\) (taken from the simple linear regression results), indicates that \(90.27\%\) of the variation in the dependent variable, Quarterly Sales, is explained by its linear relationship with Student Population.
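The regression and correlation outputs above share the same test statistic (8.617) and the same p-value. In simple linear regression with an intercept, \(R^2\) equals the square of the Pearson correlation, and the t statistic for the slope equals \(r\sqrt{n-2}/\sqrt{1-r^2}\). A short sketch of that check follows; the object names `fit`, `r`, `model_r2`, and `t_stat` are illustrative only.

```r
# Data from Problem 1
popn   <- c(2, 6, 8, 8, 12, 16, 20, 20, 22, 26)
qsales <- c(58, 105, 88, 118, 117, 137, 157, 169, 149, 202)

fit <- lm(qsales ~ popn)
r   <- cor(popn, qsales)   # Pearson correlation coefficient
n   <- length(popn)

# R-squared as reported by the regression model
model_r2 <- summary(fit)$r.squared

# t statistic for testing zero correlation (df = n - 2)
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

round(c(r = r, r_squared = r^2, model_r2 = model_r2, t_stat = t_stat), 4)
```

The squared correlation and the model \(R^2\) should both round to 0.9027, and `t_stat` should match the t value reported for `popn`.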

\(~\)

Problem 2: A study was made by a retail merchant to determine the relation between weekly advertising expenditure and sales. The following data were recorded:

| Advertising Cost (in $) | Sales (in $) |
|---|---|
| 40 | 385 |
| 20 | 400 |
| 25 | 395 |
| 20 | 365 |
| 30 | 475 |
| 50 | 440 |
| 40 | 490 |
| 20 | 420 |
| 50 | 560 |
| 40 | 525 |
| 25 | 480 |
| 50 | 510 |

\(~\)

Enter data manually in RStudio:

adcost <- c(40, 20, 25, 20, 30, 50, 40, 20, 50, 40, 25, 50)
wksales <- c(385, 400, 395, 365, 475, 440, 490, 420, 560, 525, 480, 510)

\(~\)

a.) Plot and interpret the scatter diagram. (3 pts.)

plot(adcost, wksales, main = "Scatter Plot of Data", 
     xlab = "Advertising Cost", ylab = "Weekly Sales")

\(~\)

The scatter plot shows a moderate upward trend with noticeable scatter around it, which suggests a positive linear relationship between Weekly Sales and Advertising Cost.

\(~\)

b.) Find the estimated linear regression equation to predict weekly sales from weekly advertising expenditures. (4 pts.)

reg.model2 <- lm(wksales~adcost)
pander(summary(reg.model2))
|   | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 343.7 | 44.77 | 7.678 | 1.685e-05 |
| adcost | 3.221 | 1.24 | 2.598 | 0.02657 |

Fitting linear model: wksales ~ adcost

| Observations | Residual Std. Error | \(R^2\) | Adjusted \(R^2\) |
|---|---|---|---|
| 12 | 50.23 | 0.403 | 0.3433 |

The estimated linear regression equation is given by: \(\widehat{\text{Weekly Sales}} = 343.7 + 3.221 \times \text{Advertising Cost}\).

\(~\)

c.) Compute the coefficient of correlation. Interpret. (2 pts.)

pander(cor.test(adcost, wksales, method = "pearson"))
Pearson’s product-moment correlation: adcost and wksales

| Test statistic | df | P value | Alternative hypothesis | cor |
|---|---|---|---|---|
| 2.598 | 10 | 0.02657 * | two.sided | 0.6348 |

Assuming that both variables are approximately normally distributed, the Pearson correlation coefficient is \(0.6348\), indicating a strong, positive linear relationship between Advertising Cost and Weekly Sales.

\(~\)

d.) Compute the coefficient of determination. Interpret. (2 pts.)

The coefficient of determination, already obtained from the simple linear regression analysis, is \(0.403\), indicating that \(40.3\%\) of the variation in the dependent variable, Weekly Sales, is explained by its linear relationship with Advertising Cost.
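As a quick consistency check, in simple linear regression with an intercept the coefficient of determination equals the square of the Pearson correlation coefficient. Using the rounded value of \(r\) reported in part c.):

\[
r^2 = (0.6348)^2 \approx 0.403 = R^2
\]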

\(~\)

e.) Estimate the weekly sales when advertising costs are $35. (2 pts.)

Advertising_Cost <- 35
Weekly_Sales <- 343.7+(3.221*Advertising_Cost)
Weekly_Sales
## [1] 456.435

\(~\)

When the advertising cost is \(\$35\), the estimated weekly sales are \(\$456.44\).
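The estimate above plugs the rounded coefficients into the equation by hand. An alternative is to let R evaluate the fitted model directly with `predict()`, which uses the unrounded coefficients. This is a minimal sketch under one assumption: the data are placed in a data frame (here called `ads`, a name introduced for illustration) so that `newdata` can be matched by column name.

```r
# Data from Problem 2, stored in a data frame so predict() can match
# the predictor by name
ads <- data.frame(
  adcost  = c(40, 20, 25, 20, 30, 50, 40, 20, 50, 40, 25, 50),
  wksales = c(385, 400, 395, 365, 475, 440, 490, 420, 560, 525, 480, 510)
)

fit2 <- lm(wksales ~ adcost, data = ads)

# Estimated weekly sales at an advertising cost of $35
est <- predict(fit2, newdata = data.frame(adcost = 35))
est
```

The result should agree with the hand computation to within rounding of the coefficients.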

\(~\)

Problem 3: The paired data below consists of the costs of advertising (in thousands of dollars) and the number of products sold (in thousands).

| Cost | 9 | 2 | 3 | 4 | 2 | 5 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|
| Number | 85 | 52 | 55 | 68 | 67 | 86 | 83 | 73 |

\(~\)

Create the data manually in RStudio:

cost <- c(9, 2, 3, 4, 2, 5, 9, 10)
number <- c(85, 52, 55, 68, 67, 86, 83, 73)

\(~\)

a.) Plot and interpret the scatter diagram. (3 pts.)

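# Predictor (cost) goes on the x-axis and response (number) on the y-axis,
# matching the axis labels
plot(cost, number, main = "Scatter Plot of the Data", 
     xlab = "Cost of Advertising", ylab = "Number of Products Sold")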

\(~\)

The scatter plot shows considerable dispersion among the points, but an overall upward trend is still visible. This suggests a positive linear relationship between the Number of Products Sold and the Cost of Advertising.

\(~\)

b.) Find the estimated linear regression equation to predict number of products sold from advertising costs. (4 pts.)

reg.model3 <- lm(number~cost)
pander(summary(reg.model3))
|   | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 55.79 | 7.187 | 7.762 | 0.0002405 |
| cost | 2.788 | 1.136 | 2.454 | 0.04954 |

Fitting linear model: number ~ cost

| Observations | Residual Std. Error | \(R^2\) | Adjusted \(R^2\) |
|---|---|---|---|
| 8 | 10.04 | 0.5009 | 0.4177 |

The estimated linear regression model is given by: \(\widehat{\text{Number}} = 55.79 + 2.788 \times \text{Cost}\).

\(~\)

c.) Compute the coefficient of correlation. Interpret. (2 pts.)

pander(cor.test(cost, number, method = "pearson"))
Pearson’s product-moment correlation: cost and number

| Test statistic | df | P value | Alternative hypothesis | cor |
|---|---|---|---|---|
| 2.454 | 6 | 0.04954 * | two.sided | 0.7077 |

The obtained correlation coefficient of \(0.7077\) indicates a strong, positive linear relationship between the costs of advertising and the number of products sold.

\(~\)

d.) Compute the coefficient of determination. Interpret. (2 pts.)

The coefficient of determination, as obtained from the simple linear regression analysis, is \(0.5009\), which indicates that \(50.09\%\) of the variation in the dependent variable, Number of Products Sold, is explained by its linear relationship with the Cost of Advertising.

\(~\)

e.) Estimate the number of products sold when advertising costs are $4500. (2 pts.)

adcosts <- 4.5  # $4500 expressed in thousands of dollars, the unit of the data
numprod <- 55.79+(2.788*adcosts)
numprod
## [1] 68.336

\(~\)

When the advertising cost is \(\$4500\) (that is, \(4.5\) in thousands of dollars), the estimated number of products sold is \(68.336\) thousand, or about \(68,336\) units.
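Because both variables in this problem are recorded in thousands, the conversion is worth writing out. With \(\$4500 = 4.5\) thousand dollars and the rounded coefficients from part b.):

\[
\widehat{\text{Number}} = 55.79 + 2.788(4.5) = 55.79 + 12.546 = 68.336 \text{ thousand} = 68{,}336 \text{ units}
\]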

\(~\)

Problem 4: An article in Business Week listed the “Best Small Companies” with its sales and earnings. A random sample of 12 companies was selected and the sales and earnings, in millions of dollars, are reported below.

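| Small Company | Sales (in million $) | Earnings (in million $) |
|---|---|---|
| 1 | 89.2 | 4.9 |
| 2 | 18.6 | 4.4 |
| 3 | 18.2 | 1.3 |
| 4 | 71.7 | 8.0 |
| 5 | 58.6 | 6.6 |
| 6 | 46.8 | 4.1 |
| 7 | 17.5 | 2.6 |
| 8 | 11.9 | 1.7 |
| 9 | 19.6 | 3.5 |
| 10 | 51.2 | 8.2 |
| 11 | 28.6 | 6.0 |
| 12 | 69.2 | 12.8 |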

\(~\)

Create data manually in RStudio:

sales1 <- c(89.2, 18.6, 18.2, 71.7, 58.6, 46.8, 17.5, 11.9, 19.6, 51.2, 28.6, 69.2)
earnings <- c(4.9, 4.4, 1.3, 8.0, 6.6, 4.1, 2.6, 1.7, 3.5, 8.2, 6.0, 12.8)

\(~\)

a.) Plot and interpret the scatter diagram. (3 pts.)

plot(sales1, earnings, main = "Scatter Plot of Data", 
     xlab = "Sales (in million $)", ylab = "Earnings (in million $)")

\(~\)

The scatter plot shows an upward, roughly linear trend in the points, which suggests a positive linear relationship between Sales and Earnings.

\(~\)

b.) Find the estimated linear regression equation to predict earnings from sales. (4 pts.)

reg.model4 <- lm(earnings~sales1)
pander(summary(reg.model4))
|   | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 1.852 | 1.413 | 1.311 | 0.2192 |
| sales1 | 0.08357 | 0.02901 | 2.881 | 0.01635 |

Fitting linear model: earnings ~ sales1

| Observations | Residual Std. Error | \(R^2\) | Adjusted \(R^2\) |
|---|---|---|---|
| 12 | 2.518 | 0.4536 | 0.399 |

The estimated linear regression equation is given by: \(\widehat{\text{Earnings}} = 1.852 + 0.08357 \times \text{Sales}\).

\(~\)

c.) Compute the coefficient of correlation. Interpret. (2 pts.)

pander(cor.test(sales1, earnings, method = "pearson"))
Pearson’s product-moment correlation: sales1 and earnings

| Test statistic | df | P value | Alternative hypothesis | cor |
|---|---|---|---|---|
| 2.881 | 10 | 0.01635 * | two.sided | 0.6735 |

The obtained correlation coefficient of \(0.6735\) indicates a strong, positive linear relationship between Sales and Earnings.

\(~\)

d.) Compute the coefficient of determination. Interpret. (2 pts.)

The coefficient of determination of \(0.4536\), obtained from the simple linear regression analysis, indicates that \(45.36\%\) of the variation in the dependent variable, Earnings, is explained by its linear relationship with Sales.

\(~\)

e.) For a small company with $50 million in sales, estimate the earnings. (2 pts.)

comsales <- 50
comearnings <- 1.852+(0.08357*comsales)
comearnings
## [1] 6.0305

\(~\)

For a company with \(\$50\) million in sales, the estimated earnings are \(\$6.0305\) million, or about \(\$6.03\) million.