Simple Linear Regression

Encode Data

The data are stored in the data1 object, which contains two variables, Annual Sales in thousand dollars (AnnualSales) and Store Area in feet (StoreArea), with seven observations.

data1 <- data.frame(
  StoreArea = c(1726,1542,2816,5555,1292,2208,1313),
  AnnualSales = c(3681,3395,6653,9543,3318,5563,3760)
)
str(data1)
## 'data.frame':    7 obs. of  2 variables:
##  $ StoreArea  : num  1726 1542 2816 5555 1292 ...
##  $ AnnualSales: num  3681 3395 6653 9543 3318 ...

Summary Table of the Regression Model

A simple linear regression model was fitted to predict the annual sales of stores from store area. The results below suggest that the model is highly significant, \(F(1,5) = 81.18\), \(p = 0.0003\), and that it explains \(94.20\%\) of the variation in annual sales. Furthermore, store area (in feet) significantly predicts annual sales (in thousand dollars), \(B_1 = 1.487\), \(p = 0.0003\). Because sales are recorded in thousand dollars, this means that for every one-foot increase in store area, annual sales increase by about $1487.00, on average.
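
To make the slope interpretation concrete, the estimates reported in the coefficient table can be written as a fitted equation. The store area of 2000 feet used in the worked line is a hypothetical value chosen for illustration only; it lies within the observed range of the data.

\[
\widehat{\text{AnnualSales}} = 1636.415 + 1.487 \times \text{StoreArea}
\]

\[
\widehat{\text{AnnualSales}} = 1636.415 + 1.487 \times 2000 = 4610.415 \text{ thousand dollars}
\]

Raising the store area by one foot, to 2001, raises the fitted value by exactly the slope, 1.487 thousand dollars.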

# Fit annual sales (thousand dollars) on store area by ordinary least squares
model1 <- lm(AnnualSales ~ StoreArea, data = data1)
summary(model1)
## 
## Call:
## lm(formula = AnnualSales ~ StoreArea, data = data1)
## 
## Residuals:
##      1      2      3      4      5      6      7 
## -521.3 -533.8  830.2 -351.7 -239.1  644.1  171.6 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1636.415    451.495   3.624 0.015149 *  
## StoreArea      1.487      0.165   9.010 0.000281 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 611.8 on 5 degrees of freedom
## Multiple R-squared:  0.942,  Adjusted R-squared:  0.9304 
## F-statistic: 81.18 on 1 and 5 DF,  p-value: 0.0002812
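
As a follow-up to the summary table, the sketch below shows one way to obtain interval estimates and a prediction from the fitted model. It rebuilds the data and the model so that it runs on its own. The store area of 4000 feet is a hypothetical value chosen for illustration, and the object names `ci1`, `new_store`, and `pred1` are not part of the original analysis.

```r
# Rebuild the data and refit the model so the example is self-contained
data1 <- data.frame(
  StoreArea = c(1726, 1542, 2816, 5555, 1292, 2208, 1313),
  AnnualSales = c(3681, 3395, 6653, 9543, 3318, 5563, 3760)
)
model1 <- lm(AnnualSales ~ StoreArea, data = data1)

# 95% confidence intervals for the intercept and the slope
ci1 <- confint(model1, level = 0.95)
print(ci1)

# Predicted annual sales (thousand dollars) with a 95% prediction interval
# for a hypothetical store area of 4000 feet
new_store <- data.frame(StoreArea = 4000)
pred1 <- predict(model1, newdata = new_store, interval = "prediction")
print(pred1)

# With a single predictor, the overall F statistic equals the squared
# t statistic of the slope
t_slope <- summary(model1)$coefficients["StoreArea", "t value"]
f_stat <- summary(model1)$fstatistic[["value"]]
print(c(t_squared = t_slope^2, F = f_stat))
```

With only seven observations the prediction interval is wide, so the point prediction is best read as a rough guide rather than a precise forecast.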

Multiple Linear Regression

Encode Data

The data are stored in the data2 object, which contains three variables, Heating Oil used in Gallons (Oil), Average Temperature in Fahrenheit (Temp), and Amount of Insulation in Inches (Insulation), with fifteen observations.

data2 <- data.frame(
  Oil = c(275.30, 363.80, 164.30, 40.80, 94.30, 230.90, 366.70, 300.60, 237.80, 121.40, 31.40, 203.50, 441.10, 323.00, 52.50),
  Temp = c(40, 27, 40, 73, 64, 34, 9, 8, 23, 63, 65, 41, 21, 38, 58),
  Insulation = c(3, 3, 10, 6, 6, 6, 6, 10, 10, 3, 10, 6, 3, 3, 10)
)
str(data2)
## 'data.frame':    15 obs. of  3 variables:
##  $ Oil       : num  275.3 363.8 164.3 40.8 94.3 ...
##  $ Temp      : num  40 27 40 73 64 34 9 8 23 63 ...
##  $ Insulation: num  3 3 10 6 6 6 6 10 10 3 ...

Summary Table of the Regression Model

A multiple linear regression model was fitted to predict the heating oil used (in gallons) by a single family in the month of January from the average temperature in Fahrenheit and the amount of insulation in inches. The results below show that the model is highly significant, \(F(2,12) = 168.5\), \(p < 0.001\), and that it explains \(96.56\%\) of the variation in heating oil use.

Moreover, both average temperature in Fahrenheit, \(B_1 = -5.4366\), \(p < 0.001\), and the amount of insulation in inches, \(B_2 = -20.0123\), \(p < 0.001\), significantly predict the amount of heating oil used in gallons per family. These values suggest that for every 1\(^{\circ}\) Fahrenheit increase in the average temperature, the amount of oil used by a single family decreases by about 5.44 gallons, on average, holding the amount of insulation constant. In addition, for every 1 inch increase in insulation, the amount of oil used by a single family decreases by about 20.01 gallons, on average, holding the average temperature constant. Finally, if the average temperature is \(0^{\circ}\) F and no insulation is used, the amount of heating oil used by a single family is about 562.15 gallons, on average. This last value is an extrapolation, since the observed temperatures start at \(8^{\circ}\) F and the observed insulation starts at 3 inches.
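
The coefficient interpretations can be checked against the fitted equation, written here from the estimates in the coefficient table. The temperature of \(30^{\circ}\) F and the 6 inches of insulation in the worked line are hypothetical values chosen for illustration; both lie within the observed ranges.

\[
\widehat{\text{Oil}} = 562.1510 - 5.4366 \times \text{Temp} - 20.0123 \times \text{Insulation}
\]

\[
\widehat{\text{Oil}} = 562.1510 - 5.4366 \times 30 - 20.0123 \times 6 = 562.1510 - 163.0980 - 120.0738 = 278.9792 \text{ gallons}
\]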

# Fit heating oil use (gallons) on average temperature and insulation
model2 <- lm(Oil ~ Temp + Insulation, data = data2)
summary(model2)
## 
## Call:
## lm(formula = Oil ~ Temp + Insulation, data = data2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -38.209 -16.806   0.164  14.105  53.154 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 562.1510    21.0931  26.651 4.78e-12 ***
## Temp         -5.4366     0.3362 -16.170 1.64e-09 ***
## Insulation  -20.0123     2.3425  -8.543 1.91e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 26.01 on 12 degrees of freedom
## Multiple R-squared:  0.9656, Adjusted R-squared:  0.9599 
## F-statistic: 168.5 on 2 and 12 DF,  p-value: 1.654e-09
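
As with the simple regression, the fitted multiple regression model can be used for interval estimates and predictions. The sketch below rebuilds the data and the model so that it runs on its own. The home with an average temperature of 30 degrees Fahrenheit and 6 inches of insulation is a hypothetical case chosen for illustration, and the object names `ci2`, `new_home`, and `pred2` are not part of the original analysis.

```r
# Rebuild the data and refit the model so the example is self-contained
data2 <- data.frame(
  Oil = c(275.30, 363.80, 164.30, 40.80, 94.30, 230.90, 366.70, 300.60,
          237.80, 121.40, 31.40, 203.50, 441.10, 323.00, 52.50),
  Temp = c(40, 27, 40, 73, 64, 34, 9, 8, 23, 63, 65, 41, 21, 38, 58),
  Insulation = c(3, 3, 10, 6, 6, 6, 6, 10, 10, 3, 10, 6, 3, 3, 10)
)
model2 <- lm(Oil ~ Temp + Insulation, data = data2)

# 95% confidence intervals for the intercept and both slopes
ci2 <- confint(model2, level = 0.95)
print(ci2)

# Estimated mean oil use (gallons) with a 95% confidence interval
# for a hypothetical home: 30 degrees Fahrenheit, 6 inches of insulation
new_home <- data.frame(Temp = 30, Insulation = 6)
pred2 <- predict(model2, newdata = new_home, interval = "confidence")
print(pred2)
```

Setting `interval = "confidence"` gives an interval for the mean oil use of all such homes; `interval = "prediction"` would instead give the wider interval for a single home.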