ad = read.csv('Advertising.csv')  # load the Advertising data
attach(ad)                        # attach so the columns can be referenced directly
par(mfrow = c(1, 3))  # three scatterplots side by side
plot(TV, Sales, cex.lab = 2, cex.axis = 1.2)
plot(Radio, Sales, cex.lab = 2, cex.axis = 1.2)
title("Advertising data", cex.main = 2, font.main = 4, col.main = "blue")
plot(Newspaper, Sales, cex.lab = 2, cex.axis = 1.2)
# fit a simple linear regression of Sales on each predictor separately
lm.radio = lm(Sales ~ Radio)
lm.tv = lm(Sales ~ TV)
lm.newspaper = lm(Sales ~ Newspaper)
par(mfrow = c(1, 3))  # redraw the scatterplots, now with the fitted regression lines
plot(TV, Sales, cex.lab = 2, cex.axis = 1.2)
abline(lm.tv, col = "blue", lty = 1, lwd = 2)
plot(Radio, Sales, cex.lab = 2, cex.axis = 1.2)
abline(lm.radio, col = "blue", lty = 1, lwd = 2)
plot(Newspaper, Sales, cex.lab = 2, cex.axis = 1.2)
abline(lm.newspaper, col = "blue", lty = 1, lwd = 2)
Analyzing the regression model of Sales vs. TV
summary(lm.tv)
##
## Call:
## lm(formula = Sales ~ TV)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.3860 -1.9545 -0.1913 2.0671 7.2124
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.032594 0.457843 15.36 <2e-16 ***
## TV 0.047537 0.002691 17.67 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.259 on 198 degrees of freedom
## Multiple R-squared: 0.6119, Adjusted R-squared: 0.6099
## F-statistic: 312.1 on 1 and 198 DF, p-value: < 2.2e-16
\(\hat{\beta}_1 = 0.047537\), which says that a $1,000 increase in the TV advertising budget is associated with an increase in sales of about 47 units. Notice that \(\hat{\beta}_0\) and \(\hat{\beta}_1\) are very large relative to their standard errors, so the t-statistics are also very large. Given the p-values (< 2e-16), we can reject the null hypothesis that the coefficients are zero.
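As a quick sanity check, the estimates, standard errors, and t-statistics can be pulled straight out of the fitted object with base R accessors (a small sketch using the lm.tv fit above; est is just a convenience name here):

est = coef(summary(lm.tv))               # matrix with Estimate, Std. Error, t value, Pr(>|t|)
est[, "Estimate"] / est[, "Std. Error"]  # reproduces the t values: t = estimate / standard error
confint(lm.tv)                           # 95% confidence intervals for the intercept and the TV slope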
Once we reject the null hypothesis, the next step is to assess the extent to which the model fits the data. We start with the residual standard error (RSE): it is an estimate of the standard deviation of the error term \(\epsilon\). So even if the model were exactly correct, any prediction of Sales would still be off by about 3,260 units on average.
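The RSE reported by summary() can also be recomputed directly from the residuals (a minimal check on the same lm.tv fit; sigma() is the built-in shortcut):

sigma(lm.tv)                                         # residual standard error, about 3.259
sqrt(sum(residuals(lm.tv)^2) / df.residual(lm.tv))   # the same quantity: sqrt(RSS / (n - 2))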
mean(Sales)
## [1] 14.0225
Since the mean of Sales in the data set is about 14,022 units, the percentage error is roughly 3.259 / 14.0225, i.e. about 23%.
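That percentage error is just the RSE divided by the mean response (a one-line check with the values above):

sigma(lm.tv) / mean(Sales)   # roughly 0.23, i.e. about a 23% relative error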
The \(R^2\) of 0.6119 tells us that just under two-thirds of the variability in Sales is explained by the linear regression on TV.
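The same \(R^2\) can be read off the fitted object or recomputed as 1 - RSS/TSS (again just a quick check on lm.tv):

summary(lm.tv)$r.squared                                     # about 0.61
1 - sum(residuals(lm.tv)^2) / sum((Sales - mean(Sales))^2)   # the same: 1 - RSS / TSS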