This is an example of simple linear regression. We check whether there is a relationship between the advertising cost and the sales of a product and, if there is one, how strongly advertising cost affects sales.
We load the data first.

product
##    AdvtCost Sales
## 1       128   489
## 2       158   550
## 3       170   500
## 4       200   670
## 5       250   670
## 6        72   350
## 7        90   360
## 8       180   410
## 9        82   110
## 10      170   275
## 11      178   300
## 12      200   520
summary(product)
##     AdvtCost       Sales    
##  Min.   : 72   Min.   :110  
##  1st Qu.:118   1st Qu.:338  
##  Median :170   Median :450  
##  Mean   :156   Mean   :434  
##  3rd Qu.:185   3rd Qu.:528  
##  Max.   :250   Max.   :670
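The loading step itself is not shown above. As a minimal sketch, assuming no access to the original file, the same data frame can be re-entered by hand from the printed table; the values below are copied from that output and nothing else is assumed about the source.

```r
# Re-create the data frame from the values printed above.
# (How the original data was loaded is not shown; this is a manual re-entry.)
product <- data.frame(
  AdvtCost = c(128, 158, 170, 200, 250, 72, 90, 180, 82, 170, 178, 200),
  Sales    = c(489, 550, 500, 670, 670, 350, 360, 410, 110, 275, 300, 520)
)

str(product)      # 12 observations of 2 numeric variables
summary(product)  # should agree with the summary shown above
```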

We also draw a scatter plot with a fitted line and run a correlation test.

library("ggplot2")
# Scatter plot with a least-squares line (qplot() is deprecated in recent ggplot2)
ggplot(product, aes(x = AdvtCost, y = Sales)) +
  geom_point() + geom_smooth(method = "lm")

Figure: scatter plot of Sales against AdvtCost with a fitted linear regression line.

cor.test(product$Sales, product$AdvtCost)
## 
##  Pearson's product-moment correlation
## 
## data:  product$Sales and product$AdvtCost
## t = 2.899, df = 10, p-value = 0.01587
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1663 0.9004
## sample estimates:
##    cor 
## 0.6757

There is a positive correlation between advertising cost and sales: r = 0.68 (p = 0.016). With only 12 observations the 95% confidence interval is wide, from 0.17 to 0.90.
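To make the test output less of a black box, the t statistic and p-value can be recomputed from the correlation itself using t = r * sqrt(n - 2) / sqrt(1 - r^2). The snippet below is a sketch that re-enters the data from the table printed earlier so that it runs on its own; the names `t_stat` and `p_value` are just illustrative.

```r
# Data re-entered from the printed table so the snippet is self-contained.
product <- data.frame(
  AdvtCost = c(128, 158, 170, 200, 250, 72, 90, 180, 82, 170, 178, 200),
  Sales    = c(489, 550, 500, 670, 670, 350, 360, 410, 110, 275, 300, 520)
)

n <- nrow(product)
r <- cor(product$Sales, product$AdvtCost)  # Pearson correlation

# t statistic for H0: true correlation = 0, on n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

round(c(r = r, t = t_stat, p = p_value), 4)
```

The three values should agree with the `cor`, `t` and `p-value` entries reported by `cor.test()`.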

We build a simple regression model.

product_reg <- lm(Sales ~ AdvtCost, data = product)
summary(product_reg)
## 
## Call:
## lm(formula = Sales ~ AdvtCost, data = product)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -186.7  -96.6   40.2   97.2  146.0 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  108.523    118.092    0.92    0.380  
## AdvtCost       2.078      0.717    2.90    0.016 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 128 on 10 degrees of freedom
## Multiple R-squared:  0.457,  Adjusted R-squared:  0.402 
## F-statistic:  8.4 on 1 and 10 DF,  p-value: 0.0159

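The model gives R-squared = 0.4566, adjusted R-squared = 0.4022, p-value = 0.01587 and a residual standard error of 128 on 10 degrees of freedom. The slope estimate of 2.078 means that each additional unit of advertising cost is associated with about 2.08 more units of sales on average. An R-squared of 46% also means that 54% of the variance in sales is not explained by advertising cost. Overall, the model is significant at the 5% level.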
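The quoted figures can also be pulled out of the fitted model instead of being read off the printout. The sketch below does that and recomputes adjusted R-squared by hand. It also shows that the 128 in the summary is the residual standard error, which divides the residual sum of squares by the residual degrees of freedom; RMSE in the strict sense divides by n and so comes out somewhat smaller. The data is re-entered from the printed table, and names such as `rse` and `rmse` are illustrative.

```r
# Data re-entered from the printed table so the snippet is self-contained.
product <- data.frame(
  AdvtCost = c(128, 158, 170, 200, 250, 72, 90, 180, 82, 170, 178, 200),
  Sales    = c(489, 550, 500, 670, 670, 350, 360, 410, 110, 275, 300, 520)
)
product_reg <- lm(Sales ~ AdvtCost, data = product)
s <- summary(product_reg)

n <- nrow(product)
p <- 1  # number of predictors in the model

r2     <- s$r.squared
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)  # same value as s$adj.r.squared
rse    <- s$sigma                               # sqrt(RSS / (n - p - 1))
rmse   <- sqrt(mean(residuals(product_reg)^2))  # sqrt(RSS / n)

# Overall p-value from the F statistic (value, numerator df, denominator df)
f      <- unname(s$fstatistic)
f_pval <- pf(f[1], f[2], f[3], lower.tail = FALSE)

round(c(r2 = r2, adj_r2 = adj_r2, rse = rse, rmse = rmse, p = f_pval), 4)
```

With a single predictor, the F-test p-value is the same as the p-value of the slope and of the correlation test.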

We also look at the residual (diagnostic) plots.

par(mfrow = c(2,2))
plot(product_reg)

Figure: the four default lm diagnostic plots (Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage).

We add the predicted values to the original table.

prediction <- round(predict(product_reg), 2)
product$prediction <- prediction
product
##    AdvtCost Sales prediction
## 1       128   489      374.5
## 2       158   550      436.8
## 3       170   500      461.7
## 4       200   670      524.0
## 5       250   670      627.9
## 6        72   350      258.1
## 7        90   360      295.5
## 8       180   410      482.5
## 9        82   110      278.9
## 10      170   275      461.7
## 11      178   300      478.3
## 12      200   520      524.0
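The fitted values above are simply intercept + slope * AdvtCost for each row. The same model can, as a sketch, be used for advertising costs that are not in the table by passing `newdata` to `predict()`. The budgets of 100 and 150 below are hypothetical values chosen for illustration, and the data is re-entered from the printed table so the snippet runs on its own.

```r
# Data re-entered from the printed table so the snippet is self-contained.
product <- data.frame(
  AdvtCost = c(128, 158, 170, 200, 250, 72, 90, 180, 82, 170, 178, 200),
  Sales    = c(489, 550, 500, 670, 670, 350, 360, 410, 110, 275, 300, 520)
)
product_reg <- lm(Sales ~ AdvtCost, data = product)

# Hypothetical advertising budgets (not part of the original data)
new_budget <- data.frame(AdvtCost = c(100, 150))

# Point predictions with 95% prediction intervals (columns: fit, lwr, upr)
pred <- predict(product_reg, newdata = new_budget,
                interval = "prediction", level = 0.95)

# The same point predictions computed by hand from the coefficients
b <- unname(coef(product_reg))
manual <- b[1] + b[2] * new_budget$AdvtCost

cbind(new_budget, round(pred, 1), manual = round(manual, 1))
```

With only 12 observations and a residual standard error of about 128, the prediction intervals are wide, so the point predictions should be read as rough estimates.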