Fuel is an essential good for most Americans who shop at Walmart stores. At an average price of $2.15 per gallon last year, the average American spent about 1,400 dollars to fill up their tank. Walmart is not a high-end store; its typical customer belongs to the lower or middle social class. For these two classes, a marginal change in the fuel price should have a larger impact on the household budget, and therefore on consumer behaviour, than it has for wealthier households. This assumption motivates our research question. Whether the result is positive or negative, it will be useful to know: if fuel price variation does affect Walmart consumer behaviour, the fuel price could be added to the procurement forecasting models. In growth strategy planning, this kind of information can be relevant for decision makers.
Does a variation in the fuel price have an impact on consumer behaviour? If there is a cause-and-effect relationship, what is the magnitude of the effect and how do consumers react? Do they spend more or less at Walmart stores?
In the first part, we will build the most accurate model we can to explain the impact on Walmart consumer behaviour.
In the second part, we will try to identify whether there are stores for which fuel price variation can be used as an indicator of future Walmart sales.
In the third part, we will check whether the stores for which fuel price variation has a significant impact share common characteristics, such as store size or geographical location.
1. Prices are fixed over time; only quantities vary.
train <- read.csv("train.csv", header = TRUE)
features <- read.csv("features.csv", header = TRUE)
stores <- read.csv("stores.csv", header = TRUE)
fuelsales <- merge(train, features, by=c("Store","Date"))
sts.ex.sat <- subset(fuelsales, select = c("Fuel_Price", "Weekly_Sales"))
summary(sts.ex.sat)
##    Fuel_Price     Weekly_Sales
##  Min.   :2.472   Min.   : -4989
##  1st Qu.:2.933   1st Qu.:  2080
##  Median :3.452   Median :  7612
##  Mean   :3.361   Mean   : 15981
##  3rd Qu.:3.738   3rd Qu.: 20206
##  Max.   :4.468   Max.   :693099
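Fuel price data observation: the values range from a minimum of 2.472 to a maximum of 4.468, with a median of 3.452 and a mean of 3.361. The mean and median are close, so the distribution shows no strong skew, although a summary alone cannot confirm that it is normal.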
Weekly sales data observation: the minimum is negative (-4989), which is worth investigating to understand what can produce negative sales. The maximum of 693099 is far above the median of 7612, and the mean of 15981 is more than twice the median, which suggests a long right tail created by a few extremely high weekly sales values.
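Both observations can be followed up with a few lines of R. The sketch below uses a small hypothetical data frame (`toy`) in place of `fuelsales` so that it runs on its own; the idea that negative values come from returns is only an assumption to check against the data.

```r
# Hypothetical stand-in for `fuelsales`, used only to keep the sketch self-contained
toy <- data.frame(
  Store = c(1, 1, 2, 2, 3, 3),
  Weekly_Sales = c(-50, 2000, 7600, 20000, 15000, 690000)
)

# Rows with negative sales (possibly returns exceeding sales; an assumption to verify)
negative_rows <- toy[toy$Weekly_Sales < 0, ]

# A mean far above the median signals a long right tail
mean_to_median <- mean(toy$Weekly_Sales) / median(toy$Weekly_Sales)
```

On the real data, the same two expressions applied to `fuelsales` would list the negative records and quantify how far the mean sits above the median.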
cor(sts.ex.sat)
##                 Fuel_Price  Weekly_Sales
## Fuel_Price    1.0000000000 -0.0001202955
## Weekly_Sales -0.0001202955  1.0000000000
The correlation between weekly sales and fuel price is essentially zero (-0.00012), which does not support our initial hypothesis. We will nevertheless continue, to investigate whether a relationship exists at the store level.
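A pooled correlation near zero can hide store-level relationships. As a minimal sketch of how the correlation could be computed per store, the example below uses simulated data (the data frame `toy` and its coefficients are hypothetical) in which only store 2 reacts to the fuel price.

```r
set.seed(1)

# Simulated stand-in for `fuelsales`: three stores, only store 2 reacts to fuel price
toy <- data.frame(
  Store = rep(1:3, each = 50),
  Fuel_Price = runif(150, 2.5, 4.5)
)
toy$Weekly_Sales <- 15000 +
  ifelse(toy$Store == 2, -3000, 0) * toy$Fuel_Price +
  rnorm(150, sd = 500)

# One correlation coefficient per store
cor_by_store <- sapply(
  split(toy, toy$Store),
  function(d) cor(d$Fuel_Price, d$Weekly_Sales)
)
```

Applied to `fuelsales`, the same `split()` and `sapply()` pattern would return one coefficient per store, which can then be sorted to find the stores worth a closer look.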
# plotting functions
library(ggplot2)
ggplot(data = sts.ex.sat, aes(x = Fuel_Price, y = Weekly_Sales)) +
  geom_point(alpha = 0.1, aes(color = Weekly_Sales))
Unfortunately, this scatter plot does not reveal any clear relationship between weekly sales and fuel price.
This first analysis gives a weak signal, with no clear correlation between weekly sales and fuel price. However, we will continue our study to investigate whether a cause-and-effect relationship exists for some specific stores. If so, the procurement forecasting team could take this variable into account in their forecasting models.
hist(fuelsales$Weekly_Sales)
As observed earlier in this analysis, this histogram has a long right tail. These extreme values can be treated as outliers. We exclude weekly sales above 200,000 to bring the distribution closer to normal.
IDout=which(fuelsales$Weekly_Sales>200000)
After the exclusion, the extreme tail is gone, although the distribution can be expected to remain right-skewed rather than strictly normal. Note that the rows must be dropped by negative indexing; subtracting IDout from the sales vector would change the values instead of removing them.
hist(fuelsales$Weekly_Sales[-IDout])
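Negative indexing drops elements, whereas subtraction changes their values. A small sketch of the indexing pattern on a hypothetical vector `sales`, including a guard for the case where no value exceeds the threshold:

```r
# Hypothetical sales vector with two extreme values
sales <- c(100, 250000, 300, 400000, 500)

idx_out <- which(sales > 200000)   # positions of the extreme values
kept <- sales[-idx_out]            # drops those positions

# Guard: sales[-integer(0)] returns an empty vector, so only drop when idx_out is non-empty
kept_safe <- if (length(idx_out) > 0) sales[-idx_out] else sales
```

The guard matters when the threshold is changed: if no row exceeds it, `x[-which(...)]` silently returns nothing.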
hist(fuelsales$Fuel_Price)
The histogram suggests the overlap of two processes: the distribution is bimodal, with two most frequent values. We will not stratify the data, because we consider the difference between the minimum and the maximum to be reasonable. We will keep the fuel price data as is.
To build the model, we start with the regression of weekly sales on fuel price and gradually add control variables.
First, we regress weekly sales on fuel price and compare the model fitted on all weekly sales values with the model fitted without the extreme sales values (IDout).
model1 <- lm(Weekly_Sales ~ Fuel_Price, data=fuelsales)
summary(model1)
##
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price, data = fuelsales)
##
## Residuals:
## Min 1Q Median 3Q Max
## -20972 -13901 -8368 4225 677117
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 16001.285 258.779 61.834 <2e-16 ***
## Fuel_Price -5.958 76.287 -0.078 0.938
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 22710 on 421568 degrees of freedom
## Multiple R-squared: 1.447e-08, Adjusted R-squared: -2.358e-06
## F-statistic: 0.006101 on 1 and 421568 DF, p-value: 0.9377
# Cut the extreme values from the tail, probably due to holiday sales
IDout=which(fuelsales$Weekly_Sales>200000)
model1less <- lm(Weekly_Sales ~ Fuel_Price, data=fuelsales[-IDout,])
# Summarize and print the results
summary(model1less) # show regression coefficients table
##
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price, data = fuelsales[-IDout,
## ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -20853 -13806 -8276 4297 183802
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15686.68 252.07 62.232 <2e-16 ***
## Fuel_Price 59.03 74.31 0.794 0.427
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 22120 on 421405 degrees of freedom
## Multiple R-squared: 1.498e-06, Adjusted R-squared: -8.753e-07
## F-statistic: 0.6311 on 1 and 421405 DF, p-value: 0.4269
The model, where Y = weekly sales and X = fuel price: \[ Y = \beta_0 + \beta_1 X + e \]
The Pr(>|t|) column in the model output is the p-value: the probability, if the true coefficient were zero, of observing a t value at least as large in absolute value as the one obtained. A small p-value indicates that the observed relationship between the predictor (x) and the response (y) is unlikely to be due to chance alone.
Typically, a p-value of 5% or less is used as the cut-off. Here (y) is the response variable that we try to explain with (x), the explanatory variable.
Three stars (or asterisks) mark a highly significant p-value. A small p-value for the slope lets us reject the null hypothesis that the coefficient is zero and conclude that there is a relationship between (x) and (y); a small p-value for the intercept only indicates that the intercept differs from zero.
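As a worked check, the t value and the Pr(>|t|) entry for Fuel_Price in model1 can be recomputed by hand from the estimate, the standard error and the residual degrees of freedom printed in the output above. The variable names are illustrative only.

```r
# Values copied from the model1 coefficient table
estimate  <- -5.958
std_error <- 76.287
df_resid  <- 421568

# t value = estimate / standard error
t_value <- estimate / std_error

# Two-sided p-value: probability of a |t| at least this large if the true slope is zero
p_value <- 2 * pt(-abs(t_value), df = df_resid)
```

The result agrees with the printed table (t value -0.078, p-value 0.938) up to rounding.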
The p-value for fuel price decreases from model1 (0.9377) to model1less (0.427), but it remains far from significant. For all stores taken together, we therefore find no evidence of a linear relationship between fuel price and weekly sales. However, we will continue our investigation to look for a possible relationship between fuel price and weekly sales per store.
The residual standard error is a measure of the quality of a linear regression fit. It is roughly the typical amount by which the response (y) deviates from the fitted regression line.
The residual standard error decreases slightly from model1 (22710) to model1less (22120), which is an improvement, but overall the model is not good. The intercept (15686.68) is not the sample mean, but taking it as a rough reference level for weekly sales, a residual standard error of 22120 corresponds to a relative error of about 141% (22120 / 15686.68), which is not good.
The R-squared statistic provides a measure of how well the model fits the actual data. It takes the form of a proportion of variance. R-squared is a measure of the linear relationship between our predictor variable (fuel price) and our response variable (weekly sales). It always lies between 0 and 1: a value near 0 means the regression explains almost none of the variance in the response variable, whereas a value close to 1 means it explains most of the observed variance.
The adjusted R-squared also indicates how much of the variance the model explains, but unlike R-squared, which always increases as more variables are included in the model, it adjusts for the number of variables considered. That is why the adjusted R-squared is the preferred measure when comparing models.
For model1less, both the R-squared and the adjusted R-squared are close to 0, which tells us that the model explains essentially none of the variance.
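The adjustment can be reproduced by hand with the standard formula, adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The sketch below plugs in the values printed for model1less (n is the residual degrees of freedom plus the two estimated coefficients); the variable names are illustrative.

```r
# Values taken from the model1less output
r_squared <- 1.498e-06
df_resid  <- 421405
p         <- 1                  # one predictor: Fuel_Price
n         <- df_resid + p + 1   # observations used in the fit

# Standard adjusted R-squared formula
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
```

The result is slightly negative, about -8.75e-07, matching the printed adjusted R-squared up to rounding. A negative value is possible when the model explains less variance than its number of parameters would be expected to capture by chance.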
The F-statistic is a good indicator of whether there is a relationship between our predictor and the response variable. The further the F-statistic is from 1, the better. However, how much larger than 1 it needs to be depends on both the number of data points and the number of predictors. Generally, when the number of data points is large, an F-statistic only a little larger than 1 can already be sufficient to reject the null hypothesis (H0: there is no relationship between fuel prices and weekly sales). Conversely, if the number of data points is small, a large F-statistic is required to conclude that there may be a relationship between the predictor and response variables.
In model1less, the F-statistic is 0.6311, which is smaller than 1 (p-value 0.4269), so we cannot confirm a relationship between fuel prices and weekly sales. With a single predictor and this many observations, an F-statistic above about 3.84 would have been enough for significance at the 5% level and would have allowed us to confirm a relationship between our response (Y) and our explanatory (X) variables.
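The p-value attached to the F-statistic can be recomputed from the F distribution. The sketch below uses the values printed for model1less; the variable names are illustrative. With a single predictor, the F-statistic is also the square of the slope's t value.

```r
# Values taken from the model1less output
f_stat <- 0.6311
df1    <- 1        # number of predictors
df2    <- 421405   # residual degrees of freedom

# Upper-tail probability of the F distribution
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)

# With one predictor, F equals t^2 (t value of Fuel_Price, rounded in the output)
t_value   <- 0.794
t_squared <- t_value^2

# 5% critical value for comparison
f_critical <- qf(0.95, df1, df2)
```

The computed p-value agrees with the printed 0.4269 up to rounding, and the critical value shows how far the statistic is from the 5% threshold.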
In this second model, we use the function as.factor, which has the same effect as creating dummy variables for each store. We also add holiday and temperature as new control variables. We continue the analysis without the extreme weekly sales values above 200,000. We create interaction variables in order to observe the impact of fuel price variation on weekly sales per store.
\[ \text{WeeklySales} = \beta_0 + \beta_1 \text{FuelPrice} + \beta_2 \text{Store} + \beta_3 (\text{FuelPrice} \times \text{Store}) + \beta_4 \text{IsHoliday} + \beta_5 \text{Temperature} + e \]
model2 = lm(Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price * as.factor(Store) +
              as.factor(IsHoliday.x) + Temperature,
            data = TM1[-IDout, ])
summary(model2)
##
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price *
## as.factor(Store) + as.factor(IsHoliday.x) + Temperature,
## data = TM1[-IDout, ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -30668 -11877 -5752 4509 187819
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19483.058 1588.635 12.264 < 2e-16 ***
## Fuel_Price 813.409 488.552 1.665 0.095926 .
## as.factor(Store)2 11285.143 2243.457 5.030 4.90e-07 ***
## as.factor(Store)3 -13772.629 2316.174 -5.946 2.75e-09 ***
## as.factor(Store)4 2193.559 2266.997 0.968 0.333242
## as.factor(Store)5 -14923.212 2320.532 -6.431 1.27e-10 ***
## as.factor(Store)6 4832.642 2244.919 2.153 0.031343 *
## as.factor(Store)7 -12745.884 2270.260 -5.614 1.98e-08 ***
## as.factor(Store)8 -6691.779 2261.443 -2.959 0.003086 **
## as.factor(Store)9 -11654.137 2333.519 -4.994 5.91e-07 ***
## as.factor(Store)10 9178.313 2304.268 3.983 6.80e-05 ***
## as.factor(Store)11 1159.684 2254.644 0.514 0.607005
## as.factor(Store)12 -5000.950 2353.011 -2.125 0.033559 *
## as.factor(Store)13 3271.822 2353.398 1.390 0.164452
## as.factor(Store)14 18818.517 2281.070 8.250 < 2e-16 ***
## as.factor(Store)15 -7845.485 2314.711 -3.389 0.000701 ***
## as.factor(Store)16 -10685.373 2295.659 -4.655 3.25e-06 ***
## as.factor(Store)17 -10319.046 2395.449 -4.308 1.65e-05 ***
## as.factor(Store)18 2030.647 2270.443 0.894 0.371117
## as.factor(Store)19 4472.674 2298.431 1.946 0.051659 .
## as.factor(Store)20 10682.398 2271.651 4.702 2.57e-06 ***
## as.factor(Store)21 -6524.121 2281.895 -2.859 0.004249 **
## as.factor(Store)22 -3374.377 2276.717 -1.482 0.138308
## as.factor(Store)23 103.702 2259.024 0.046 0.963385
## as.factor(Store)24 1833.237 2295.239 0.799 0.424457
## as.factor(Store)25 -7903.027 2295.060 -3.443 0.000574 ***
## as.factor(Store)26 -4632.145 2271.342 -2.039 0.041412 *
## as.factor(Store)27 11704.728 2294.580 5.101 3.38e-07 ***
## as.factor(Store)28 1953.490 2327.226 0.839 0.401242
## as.factor(Store)29 -10282.070 2293.255 -4.484 7.34e-06 ***
## as.factor(Store)30 -8118.094 2473.550 -3.282 0.001031 **
## as.factor(Store)31 -1597.669 2246.551 -0.711 0.476982
## as.factor(Store)32 -5337.582 2244.714 -2.378 0.017415 *
## as.factor(Store)33 -11865.111 2658.568 -4.463 8.09e-06 ***
## as.factor(Store)34 -6734.635 2268.247 -2.969 0.002987 **
## as.factor(Store)35 6343.112 2313.093 2.742 0.006102 **
## as.factor(Store)36 -2441.804 2508.652 -0.973 0.330378
## as.factor(Store)37 -7128.347 2479.518 -2.875 0.004042 **
## as.factor(Store)38 -14995.307 2556.443 -5.866 4.48e-09 ***
## as.factor(Store)39 -5234.694 2262.008 -2.314 0.020658 *
## as.factor(Store)40 -5590.610 2261.663 -2.472 0.013440 *
## as.factor(Store)41 -6942.112 2254.770 -3.079 0.002078 **
## as.factor(Store)42 -7425.218 2591.406 -2.865 0.004166 **
## as.factor(Store)43 -4046.955 2512.642 -1.611 0.107260
## as.factor(Store)44 -15032.768 2639.338 -5.696 1.23e-08 ***
## as.factor(Store)45 -6833.082 2306.097 -2.963 0.003046 **
## as.factor(IsHoliday.x)TRUE 461.668 129.119 3.576 0.000350 ***
## Temperature -6.464 2.137 -3.025 0.002488 **
## Fuel_Price:as.factor(Store)2 -1969.671 690.823 -2.851 0.004356 **
## Fuel_Price:as.factor(Store)3 -474.446 713.331 -0.665 0.505978
## Fuel_Price:as.factor(Store)4 1551.300 698.571 2.221 0.026373 *
## Fuel_Price:as.factor(Store)5 -531.474 714.472 -0.744 0.456956
## Fuel_Price:as.factor(Store)6 -1459.798 691.181 -2.112 0.034684 *
## Fuel_Price:as.factor(Store)7 -250.753 696.540 -0.360 0.718849
## Fuel_Price:as.factor(Store)8 -591.947 696.432 -0.850 0.395341
## Fuel_Price:as.factor(Store)9 -395.062 718.125 -0.550 0.582231
## Fuel_Price:as.factor(Store)10 -1479.943 673.860 -2.196 0.028077 *
## Fuel_Price:as.factor(Store)11 -1128.279 694.039 -1.626 0.104020
## Fuel_Price:as.factor(Store)12 -622.460 683.536 -0.911 0.362482
## Fuel_Price:as.factor(Store)13 610.961 717.267 0.852 0.394331
## Fuel_Price:as.factor(Store)14 -3743.876 681.821 -5.491 4.00e-08 ***
## Fuel_Price:as.factor(Store)15 -1480.348 676.032 -2.190 0.028542 *
## Fuel_Price:as.factor(Store)16 -1021.614 703.274 -1.453 0.146321
## Fuel_Price:as.factor(Store)17 420.448 729.188 0.577 0.564212
## Fuel_Price:as.factor(Store)18 -2425.003 676.257 -3.586 0.000336 ***
## Fuel_Price:as.factor(Store)19 -1754.299 671.879 -2.611 0.009027 **
## Fuel_Price:as.factor(Store)20 -978.387 679.109 -1.441 0.149672
## Fuel_Price:as.factor(Store)21 -1228.276 702.740 -1.748 0.080493 .
## Fuel_Price:as.factor(Store)22 -1016.799 678.287 -1.499 0.133857
## Fuel_Price:as.factor(Store)23 -712.846 673.036 -1.059 0.289532
## Fuel_Price:as.factor(Store)24 -1391.706 670.879 -2.074 0.038038 *
## Fuel_Price:as.factor(Store)25 -1096.472 685.757 -1.599 0.109839
## Fuel_Price:as.factor(Store)26 -827.202 676.421 -1.223 0.221364
## Fuel_Price:as.factor(Store)27 -2531.784 670.686 -3.775 0.000160 ***
## Fuel_Price:as.factor(Store)28 -1475.573 676.898 -2.180 0.029265 *
## Fuel_Price:as.factor(Store)29 -1038.112 682.817 -1.520 0.128427
## Fuel_Price:as.factor(Store)30 -1491.746 760.869 -1.961 0.049928 *
## Fuel_Price:as.factor(Store)31 -142.388 691.895 -0.206 0.836951
## Fuel_Price:as.factor(Store)32 -49.688 688.873 -0.072 0.942499
## Fuel_Price:as.factor(Store)33 -1210.532 765.928 -1.580 0.113998
## Fuel_Price:as.factor(Store)34 -465.254 699.257 -0.665 0.505824
## Fuel_Price:as.factor(Store)35 -4276.284 690.831 -6.190 6.02e-10 ***
## Fuel_Price:as.factor(Store)36 -3314.113 773.428 -4.285 1.83e-05 ***
## Fuel_Price:as.factor(Store)37 -1317.126 761.692 -1.729 0.083772 .
## Fuel_Price:as.factor(Store)38 134.157 735.659 0.182 0.855298
## Fuel_Price:as.factor(Store)39 1382.227 696.591 1.984 0.047226 *
## Fuel_Price:as.factor(Store)40 -770.984 673.857 -1.144 0.252568
## Fuel_Price:as.factor(Store)41 931.949 691.419 1.348 0.177698
## Fuel_Price:as.factor(Store)42 -863.422 747.904 -1.154 0.248314
## Fuel_Price:as.factor(Store)43 -1313.104 773.961 -1.697 0.089773 .
## Fuel_Price:as.factor(Store)44 -235.142 801.658 -0.293 0.769279
## Fuel_Price:as.factor(Store)45 -1009.367 688.798 -1.465 0.142811
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21030 on 421315 degrees of freedom
## Multiple R-squared: 0.09587, Adjusted R-squared: 0.09567
## F-statistic: 490.9 on 91 and 421315 DF, p-value: < 2.2e-16
The overall p-value decreases considerably from model1less (0.4269) to model2 (< 2.2e-16). The value 2.2e-16 is the machine epsilon of double-precision arithmetic, which R uses as a display threshold: the actual p-value is smaller than that. With a sample this large and 91 terms in the model, such a small overall p-value only tells us that at least one coefficient differs from zero; the store dummies alone could produce it. So we cannot draw a conclusion about fuel price from the overall p-value of model2.
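The role of 2.2e-16 can be checked directly in R. The short sketch below inspects the relevant machine constants and how R formats a very small p-value; the variable names are illustrative.

```r
# Machine epsilon: the smallest x for which 1 + x is distinguishable from 1
eps <- .Machine$double.eps

# Smallest normalised positive double, which is far smaller than eps
tiny <- .Machine$double.xmin

# R prints p-values below eps as "less than eps" rather than as a number
printed <- format.pval(1e-20)
```

So a printed "< 2.2e-16" is a formatting convention for "smaller than machine epsilon", not the smallest number the computer can store.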
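The p-value for the explanatory variable fuel price is 0.095926, which is significant at the 10% level only, so we can reject the null hypothesis at that level but not at 5%. Because the model includes store interactions, this coefficient applies to the reference store (Store 1): if the fuel price increases by 1, its weekly sales are estimated to increase by about 813.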
The two control variables we added, holiday and temperature, are both highly significant at the 1% level.
Temperature, p-value 0.0025: for an increase of one unit in temperature, weekly sales decrease by 6.464, which in practice is a negligible impact. It is nevertheless interesting to note that sales decrease when the temperature rises. It could be interesting to observe this impact per store.
Holiday (TRUE), p-value 0.000350: in a holiday week, weekly sales increase by 461.668, which is consistent with people shopping when they have free time.
The interaction variables (Fuel_Price*Store) let us observe, per store, how weekly sales vary with the fuel price. Each interaction coefficient measures the difference between that store's fuel price slope and the slope of the reference store (Store 1), so a store's own slope is the Fuel_Price coefficient plus its interaction coefficient. In model2, 18 of the 44 interaction terms are significant at the 10% level (15 at the 5% level), and 16 of those 18 are negative and larger in magnitude than the Fuel_Price coefficient: in these stores, weekly sales reacted negatively to an increase in the fuel price.
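To make the reading of interaction coefficients concrete, here is a minimal sketch on simulated data (the data frame `toy`, its three stores and their true slopes are hypothetical). It fits the same kind of interaction model and rebuilds each store's own fuel price slope from the coefficients.

```r
set.seed(42)

# Simulated data: three stores with true fuel price slopes of 800, -1200 and 0
toy <- data.frame(
  Store = factor(rep(1:3, each = 100)),
  Fuel_Price = runif(300, 2.5, 4.5)
)
true_slope <- c(800, -1200, 0)[as.integer(toy$Store)]
toy$Weekly_Sales <- 15000 + true_slope * toy$Fuel_Price + rnorm(300, sd = 200)

fit <- lm(Weekly_Sales ~ Fuel_Price * Store, data = toy)
b <- coef(fit)

# Reference store: main effect only; other stores: main effect + interaction term
slopes <- c(
  b[["Fuel_Price"]],
  b[["Fuel_Price"]] + b[["Fuel_Price:Store2"]],
  b[["Fuel_Price"]] + b[["Fuel_Price:Store3"]]
)
names(slopes) <- levels(toy$Store)
```

The recovered slopes are close to the simulated ones, which illustrates that an interaction coefficient on its own is a difference from the reference store, not the store's full response.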
The residual standard error decreases slightly from model1less (22120) to model2 (21030), which is an improvement but a modest one. Taking the intercept (19483.058) as a rough reference level for weekly sales, a residual standard error of 21030 corresponds to a relative error of about 108%, which again is not good.
For model2, the R-squared (0.09587) and the adjusted R-squared (0.09567) are still close to 0, which tells us that the model explains less than 10% of the variance, although this is better than model1less.
The F-statistic is 490.9, which is far larger than 1. It allows us to reject the hypothesis that none of the predictors is related to weekly sales, but it does not by itself isolate the effect of fuel prices.
AIC(model1less, model2)
The Akaike information criterion (AIC) is an estimator of the relative quality of statistical models, that is, of their goodness of fit penalised by the number of parameters. It is a useful criterion when comparing models that are built up step by step. The model with the lower AIC is preferred.
How can we use the AIC in our analysis? As a rule of thumb, a difference of about 2 in AIC is enough to start considering the other, more complex model, and a difference of 10 or more is considered substantial.
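Considering the results of this first AIC test: the output reports 3 estimated parameters (df) for model1less and 93 for model2. These df values count parameters and are not the AIC values themselves; the comparison must be read from the AIC column, where the lower value wins. The drop in residual standard error from 22120 to 21030 on the same observations is consistent with model2 being the preferred model.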
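The structure of the `AIC()` output is easy to misread, so here is a minimal sketch on simulated data (the variables `x`, `g`, `y` and both models are hypothetical). It shows that the result has a `df` column and an `AIC` column, and that model choice is based on the latter.

```r
set.seed(7)

# Simulated data with a group effect that the small model ignores
x <- rnorm(200)
g <- factor(rep(1:4, 50))
y <- 1 + 2 * x + as.integer(g) + rnorm(200)

small <- lm(y ~ x)
large <- lm(y ~ x * g)

# One row per model; df = number of estimated parameters, AIC = the criterion
cmp <- AIC(small, large)

# The preferred model is the one with the lowest AIC, not the lowest df
best <- rownames(cmp)[which.min(cmp$AIC)]
```

Here `df` is 3 for the small model (intercept, slope and residual variance) and 9 for the large one, the same kind of count as the 3 and 93 quoted above.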
Overall, this second model performs much better and gives more usable and accurate information, which we can consider using in practice.
We will create new control variables for year and month and add them to our previous model.
Year=as.numeric(substring(TM1$Date,1,4))
Month=as.numeric(substring(TM1$Date,6,7))
TM1=data.frame(TM1,Year,Month)
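The `substring()` calls above rely on the character positions of the date string. An alternative sketch, assuming the dates are stored as "YYYY-MM-DD" text (which the positions 1 to 4 and 6 to 7 suggest), parses them as dates first; the `dates` vector is hypothetical.

```r
# Hypothetical date strings in the assumed "YYYY-MM-DD" layout
dates <- c("2010-02-05", "2011-11-25", "2012-10-26")

# Parse once, then extract the components by name rather than by position
parsed <- as.Date(dates, format = "%Y-%m-%d")
Year   <- as.numeric(format(parsed, "%Y"))
Month  <- as.numeric(format(parsed, "%m"))
```

Parsing has the advantage that a malformed date becomes `NA` instead of silently producing a wrong year or month.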
\[ \text{WeeklySales} = \beta_0 + \beta_1 \text{FuelPrice} + \beta_2 \text{Store} + \beta_3 (\text{FuelPrice} \times \text{Store}) + \beta_4 \text{IsHoliday} + \beta_5 \text{Temperature} + \beta_6 \text{Year} + \beta_7 \text{Month} + \beta_8 (\text{Month} \times \text{FuelPrice}) + e \]
model3 = lm(Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price * as.factor(Store) +
              as.factor(IsHoliday.x) + Temperature +
              as.factor(Year) + as.factor(Month) + as.factor(Month) * Fuel_Price,
            data = TM1[-IDout, ])
summary(model3)
##
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price *
## as.factor(Store) + as.factor(IsHoliday.x) + Temperature +
## as.factor(Year) + as.factor(Month) + as.factor(Month) * Fuel_Price,
## data = TM1[-IDout, ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -33158 -11872 -5668 4528 187540
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23761.467 2595.273 9.156 < 2e-16 ***
## Fuel_Price -1282.400 809.462 -1.584 0.113135
## as.factor(Store)2 11278.065 2240.989 5.033 4.84e-07 ***
## as.factor(Store)3 -13820.523 2313.646 -5.973 2.32e-09 ***
## as.factor(Store)4 2400.001 2267.247 1.059 0.289804
## as.factor(Store)5 -14929.001 2317.996 -6.440 1.19e-10 ***
## as.factor(Store)6 4859.649 2242.481 2.167 0.030229 *
## as.factor(Store)7 -12128.787 2270.957 -5.341 9.26e-08 ***
## as.factor(Store)8 -6551.648 2259.745 -2.899 0.003740 **
## as.factor(Store)9 -11592.825 2331.147 -4.973 6.59e-07 ***
## as.factor(Store)10 9362.953 2306.674 4.059 4.93e-05 ***
## as.factor(Store)11 1101.385 2252.260 0.489 0.624833
## as.factor(Store)12 -4789.386 2355.870 -2.033 0.042057 *
## as.factor(Store)13 3742.250 2354.617 1.589 0.111988
## as.factor(Store)14 19168.518 2280.435 8.406 < 2e-16 ***
## as.factor(Store)15 -7317.209 2317.131 -3.158 0.001589 **
## as.factor(Store)16 -9995.114 2297.769 -4.350 1.36e-05 ***
## as.factor(Store)17 -9663.189 2399.525 -4.027 5.65e-05 ***
## as.factor(Store)18 2487.380 2270.140 1.096 0.273213
## as.factor(Store)19 4972.305 2300.458 2.161 0.030662 *
## as.factor(Store)20 11082.424 2271.627 4.879 1.07e-06 ***
## as.factor(Store)21 -6532.055 2279.388 -2.866 0.004161 **
## as.factor(Store)22 -2924.282 2276.248 -1.285 0.198900
## as.factor(Store)23 685.176 2260.209 0.303 0.761778
## as.factor(Store)24 2313.365 2296.928 1.007 0.313860
## as.factor(Store)25 -7455.289 2295.699 -3.248 0.001164 **
## as.factor(Store)26 -3974.470 2274.180 -1.748 0.080525 .
## as.factor(Store)27 12094.542 2295.612 5.269 1.38e-07 ***
## as.factor(Store)28 2148.588 2330.146 0.922 0.356486
## as.factor(Store)29 -9845.675 2292.730 -4.294 1.75e-05 ***
## as.factor(Store)30 -8147.052 2470.835 -3.297 0.000976 ***
## as.factor(Store)31 -1599.473 2244.083 -0.713 0.476000
## as.factor(Store)32 -4848.149 2243.930 -2.161 0.030730 *
## as.factor(Store)33 -11802.783 2660.525 -4.436 9.16e-06 ***
## as.factor(Store)34 -6705.556 2266.432 -2.959 0.003090 **
## as.factor(Store)35 6662.011 2312.285 2.881 0.003963 **
## as.factor(Store)36 -2488.866 2506.356 -0.993 0.320700
## as.factor(Store)37 -7217.576 2476.896 -2.914 0.003569 **
## as.factor(Store)38 -14806.921 2558.760 -5.787 7.18e-09 ***
## as.factor(Store)39 -5286.135 2259.548 -2.339 0.019312 *
## as.factor(Store)40 -5055.266 2262.225 -2.235 0.025441 *
## as.factor(Store)41 -6365.893 2254.713 -2.823 0.004752 **
## as.factor(Store)42 -7267.813 2593.135 -2.803 0.005068 **
## as.factor(Store)43 -4102.931 2509.892 -1.635 0.102112
## as.factor(Store)44 -14543.667 2639.852 -5.509 3.61e-08 ***
## as.factor(Store)45 -6501.471 2305.406 -2.820 0.004801 **
## as.factor(IsHoliday.x)TRUE -29.778 139.967 -0.213 0.831520
## Temperature 8.547 6.275 1.362 0.173172
## as.factor(Year)2011 -157.133 212.168 -0.741 0.458932
## as.factor(Year)2012 166.165 241.796 0.687 0.491949
## as.factor(Month)2 -5169.688 2245.160 -2.303 0.021302 *
## as.factor(Month)3 -6403.200 2205.832 -2.903 0.003698 **
## as.factor(Month)4 -6259.923 2201.361 -2.844 0.004460 **
## as.factor(Month)5 -5428.126 2247.790 -2.415 0.015741 *
## as.factor(Month)6 -5239.259 2241.864 -2.337 0.019439 *
## as.factor(Month)7 -5844.296 2249.132 -2.598 0.009364 **
## as.factor(Month)8 -5429.774 2254.089 -2.409 0.016003 *
## as.factor(Month)9 -7043.490 2212.157 -3.184 0.001453 **
## as.factor(Month)10 -8501.320 2203.714 -3.858 0.000114 ***
## as.factor(Month)11 -6553.449 2506.638 -2.614 0.008938 **
## as.factor(Month)12 -4564.815 2690.722 -1.697 0.089792 .
## Fuel_Price:as.factor(Store)2 -1966.896 690.063 -2.850 0.004368 **
## Fuel_Price:as.factor(Store)3 -475.509 712.552 -0.667 0.504560
## Fuel_Price:as.factor(Store)4 1515.699 698.208 2.171 0.029944 *
## Fuel_Price:as.factor(Store)5 -536.195 713.702 -0.751 0.452480
## Fuel_Price:as.factor(Store)6 -1474.613 690.449 -2.136 0.032702 *
## Fuel_Price:as.factor(Store)7 -318.598 696.475 -0.457 0.647353
## Fuel_Price:as.factor(Store)8 -609.122 695.711 -0.876 0.381281
## Fuel_Price:as.factor(Store)9 -412.554 717.384 -0.575 0.565237
## Fuel_Price:as.factor(Store)10 -1566.178 674.055 -2.324 0.020152 *
## Fuel_Price:as.factor(Store)11 -1129.725 693.276 -1.630 0.103198
## Fuel_Price:as.factor(Store)12 -708.808 683.752 -1.037 0.299902
## Fuel_Price:as.factor(Store)13 521.311 717.271 0.727 0.467350
## Fuel_Price:as.factor(Store)14 -3809.057 681.410 -5.590 2.27e-08 ***
## Fuel_Price:as.factor(Store)15 -1575.106 675.805 -2.331 0.019769 *
## Fuel_Price:as.factor(Store)16 -1137.535 702.873 -1.618 0.105575
## Fuel_Price:as.factor(Store)17 307.331 729.123 0.422 0.673385
## Fuel_Price:as.factor(Store)18 -2504.634 675.988 -3.705 0.000211 ***
## Fuel_Price:as.factor(Store)19 -1842.370 671.688 -2.743 0.006090 **
## Fuel_Price:as.factor(Store)20 -1048.599 678.662 -1.545 0.122323
## Fuel_Price:as.factor(Store)21 -1229.304 701.970 -1.751 0.079908 .
## Fuel_Price:as.factor(Store)22 -1100.223 678.016 -1.623 0.104652
## Fuel_Price:as.factor(Store)23 -807.712 672.715 -1.201 0.229878
## Fuel_Price:as.factor(Store)24 -1481.592 670.700 -2.209 0.027173 *
## Fuel_Price:as.factor(Store)25 -1166.111 685.294 -1.702 0.088827 .
## Fuel_Price:as.factor(Store)26 -922.178 676.082 -1.364 0.172567
## Fuel_Price:as.factor(Store)27 -2610.858 670.541 -3.894 9.88e-05 ***
## Fuel_Price:as.factor(Store)28 -1557.564 677.118 -2.300 0.021433 *
## Fuel_Price:as.factor(Store)29 -1117.610 682.533 -1.637 0.101538
## Fuel_Price:as.factor(Store)30 -1487.100 760.036 -1.957 0.050393 .
## Fuel_Price:as.factor(Store)31 -144.672 691.137 -0.209 0.834194
## Fuel_Price:as.factor(Store)32 -138.054 688.604 -0.200 0.841103
## Fuel_Price:as.factor(Store)33 -1285.543 765.933 -1.678 0.093270 .
## Fuel_Price:as.factor(Store)34 -429.079 698.515 -0.614 0.539035
## Fuel_Price:as.factor(Store)35 -4330.898 690.452 -6.273 3.56e-10 ***
## Fuel_Price:as.factor(Store)36 -3313.247 772.679 -4.288 1.80e-05 ***
## Fuel_Price:as.factor(Store)37 -1303.875 760.857 -1.714 0.086586 .
## Fuel_Price:as.factor(Store)38 52.994 735.798 0.072 0.942584
## Fuel_Price:as.factor(Store)39 1386.897 695.825 1.993 0.046244 *
## Fuel_Price:as.factor(Store)40 -847.837 673.624 -1.259 0.208168
## Fuel_Price:as.factor(Store)41 837.226 691.135 1.211 0.225751
## Fuel_Price:as.factor(Store)42 -945.193 748.007 -1.264 0.206369
## Fuel_Price:as.factor(Store)43 -1300.235 773.112 -1.682 0.092604 .
## Fuel_Price:as.factor(Store)44 -330.521 801.477 -0.412 0.680054
## Fuel_Price:as.factor(Store)45 -1070.102 688.369 -1.555 0.120055
## Fuel_Price:as.factor(Month)2 2140.842 691.190 3.097 0.001953 **
## Fuel_Price:as.factor(Month)3 2319.906 669.767 3.464 0.000533 ***
## Fuel_Price:as.factor(Month)4 2324.423 664.759 3.497 0.000471 ***
## Fuel_Price:as.factor(Month)5 2108.884 673.662 3.130 0.001745 **
## Fuel_Price:as.factor(Month)6 2178.415 675.116 3.227 0.001252 **
## Fuel_Price:as.factor(Month)7 2216.807 681.005 3.255 0.001133 **
## Fuel_Price:as.factor(Month)8 2147.445 677.462 3.170 0.001525 **
## Fuel_Price:as.factor(Month)9 2363.397 666.744 3.545 0.000393 ***
## Fuel_Price:as.factor(Month)10 2875.057 667.687 4.306 1.66e-05 ***
## Fuel_Price:as.factor(Month)11 2805.872 767.831 3.654 0.000258 ***
## Fuel_Price:as.factor(Month)12 2908.426 828.412 3.511 0.000447 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21010 on 421291 degrees of freedom
## Multiple R-squared: 0.09791, Adjusted R-squared: 0.09766
## F-statistic: 397.6 on 115 and 421291 DF, p-value: < 2.2e-16
The overall p-value is unchanged from model 2 (< 2.2e-16). For the reason explained earlier, we cannot draw a conclusion about fuel price from this overall p-value.
The p-value for the explanatory variable fuel price is now 0.113135, which is not significant, whereas in the previous model it was 0.095926, significant at the 10% level. The estimated coefficient also changes sign, from 813.409 to -1282.400; with the month interactions in the model, it now describes the reference store (Store 1) in the reference month (Month 1).
In model3, 20 of the 44 interaction terms (Fuel_Price*Store) are significant at the 10% level (14 at the 5% level), and 18 of those 20 are negative, meaning that weekly sales in these stores respond more negatively to a fuel price increase than those of the reference store (Store 1).
The residual standard error decreases slightly from model 2 (21030) to model 3 (21010), a negligible change. Taking the intercept (23761.467) as a rough reference level for weekly sales, a residual standard error of 21010 corresponds to a relative error of about 88%, which again is not good.
For model 3, the R-squared (0.09791) and the adjusted R-squared (0.09766) are again close to 0, which tells us that the model explains little of the variance and is very similar to model 2.
The F-statistic is 397.6, which is far larger than 1, as in the previous model. Again, it shows that the predictors taken together are related to weekly sales, without isolating the effect of fuel prices.
AIC(model2, model3)
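Considering the results of this second AIC test: the output reports 93 estimated parameters (df) for model 2 and 117 for model 3, a relatively small increase compared with the step from model 1 to model 2. These are df values; the models themselves must be compared on the AIC column.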
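Overall, this third model brings only a marginal improvement in fit over model 2 (adjusted R-squared of 0.09766 against 0.09567) at the cost of 24 additional parameters, and the fuel price coefficient loses its significance. We therefore do not consider it a practical improvement over model 2.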
We will now investigate the characteristics of the stores whose interaction term (Fuel_Price*Store) has a significant p-value.
We will create three categories for small, medium and large stores. The data already classifies stores as type A, B or C, but unfortunately we have no further information about what those types mean.
First, we look at the range of Walmart store sizes in order to define sensible size categories. We create these categories to see whether there is a relationship between the stores with a significant weekly sales/fuel price relationship and the store size category.
hist(stores$Size)
print(max(stores$Size))
## [1] 219622
print(min(stores$Size))
## [1] 34875
# Create categories for store size
stores$Sizestore[stores$Size > 175000] <- "Large"
stores$Sizestore[stores$Size > 75000 & stores$Size <= 175000] <- "Medium"
stores$Sizestore[stores$Size <= 75000] <- "Smal"
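The same three size categories could also be built in a single step with `cut()`. This is only a sketch on a hypothetical data frame `sizes`, with values chosen to sit on the category boundaries. With the default `right = TRUE`, each interval includes its upper bound, which matches the thresholds used above.

```r
# Hypothetical store sizes, including the boundary values 75000 and 175000.
sizes <- data.frame(
  Store = 1:6,
  Size  = c(34875, 75000, 75001, 175000, 175001, 219622)
)

# Intervals are (-Inf, 75000], (75000, 175000], (175000, Inf)
sizes$Sizestore <- cut(
  sizes$Size,
  breaks = c(-Inf, 75000, 175000, Inf),
  labels = c("Small", "Medium", "Large")
)

print(sizes)
print(table(sizes$Sizestore))
```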
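# Add the store attributes (Type, Size, Sizestore) to the sales data
newData <- merge(TM1, stores, by.x = "Store", by.y = "Store")
# Extract year and month from the date string (format YYYY-MM-DD)
Year <- as.numeric(substring(newData$Date, 1, 4))
Month <- as.numeric(substring(newData$Date, 6, 7))
newData <- data.frame(newData, Year, Month)
# Row indices of extreme observations, excluded when fitting the model
IDout <- which(newData$Weekly_Sales > 200000)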
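One caveat about excluding rows with a negative `which()` index is worth illustrating: if no observation exceeds the threshold, the index vector is empty and the negative subscript drops every row instead of none. The sketch below uses a hypothetical data frame `toy` and assumes there are no missing values in `Weekly_Sales`. It compares that pattern with logical indexing.

```r
# Hypothetical weekly sales, one of which exceeds the 200000 threshold.
toy <- data.frame(Weekly_Sales = c(1200, 54000, 250000, 8000))

# Pattern used in the report: negative indexing with which()
id_out <- which(toy$Weekly_Sales > 200000)
kept_a <- toy[-id_out, , drop = FALSE]

# Logical indexing keeps the same rows
kept_b <- toy[toy$Weekly_Sales <= 200000, , drop = FALSE]

# When nothing exceeds the threshold, which() returns integer(0) ...
none_out    <- which(toy$Weekly_Sales > 1e9)
dropped_all <- toy[-none_out, , drop = FALSE]                 # 0 rows
# ... whereas the logical form still keeps every row
kept_all    <- toy[toy$Weekly_Sales <= 1e9, , drop = FALSE]   # 4 rows

print(c(nrow(kept_a), nrow(kept_b), nrow(dropped_all), nrow(kept_all)))
```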
\[ Weekly Sales = \beta_0 + \beta_1 FuelPrice + \beta_2 Store + \beta_3 FuelPrice*Store + \beta_4 IsHoliday + \beta_5 Temperature + \beta_6 Year + \beta_7 Month + \beta_8 Month*FuelPrice + \beta_9 Type + \beta_{10} SizeStore + \beta_{11} Size + e \]
model4 <- lm(
  Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price * as.factor(Store) +
    as.factor(IsHoliday.x) + Temperature +
    as.factor(Year) + as.factor(Month) + as.factor(Month) * Fuel_Price +
    as.factor(as.numeric(Type)) + as.factor(Sizestore) + as.factor(Size),
  data = newData[-IDout, ], na.action = na.omit
)
summary(model4)
##
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price *
## as.factor(Store) + as.factor(IsHoliday.x) + Temperature +
## as.factor(Year) + as.factor(Month) + as.factor(Month) * Fuel_Price +
## as.factor(as.numeric(Type)) + as.factor(Sizestore) + as.factor(Size),
## data = newData[-IDout, ], na.action = na.omit)
##
## Residuals:
## Min 1Q Median 3Q Max
## -33158 -11872 -5668 4528 187540
##
## Coefficients: (43 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23761.467 2595.273 9.156 < 2e-16 ***
## Fuel_Price -1282.400 809.462 -1.584 0.113135
## as.factor(Store)2 11278.065 2240.989 5.033 4.84e-07 ***
## as.factor(Store)3 -13820.523 2313.646 -5.973 2.32e-09 ***
## as.factor(Store)4 2400.001 2267.247 1.059 0.289804
## as.factor(Store)5 -14929.001 2317.996 -6.440 1.19e-10 ***
## as.factor(Store)6 4859.649 2242.481 2.167 0.030229 *
## as.factor(Store)7 -12128.787 2270.957 -5.341 9.26e-08 ***
## as.factor(Store)8 -6551.648 2259.745 -2.899 0.003740 **
## as.factor(Store)9 -11592.825 2331.147 -4.973 6.59e-07 ***
## as.factor(Store)10 9362.953 2306.674 4.059 4.93e-05 ***
## as.factor(Store)11 1101.385 2252.260 0.489 0.624833
## as.factor(Store)12 -4789.386 2355.870 -2.033 0.042057 *
## as.factor(Store)13 3742.250 2354.617 1.589 0.111988
## as.factor(Store)14 19168.518 2280.435 8.406 < 2e-16 ***
## as.factor(Store)15 -7317.209 2317.131 -3.158 0.001589 **
## as.factor(Store)16 -9995.114 2297.769 -4.350 1.36e-05 ***
## as.factor(Store)17 -9663.189 2399.525 -4.027 5.65e-05 ***
## as.factor(Store)18 2487.380 2270.140 1.096 0.273213
## as.factor(Store)19 4972.305 2300.458 2.161 0.030662 *
## as.factor(Store)20 11082.424 2271.627 4.879 1.07e-06 ***
## as.factor(Store)21 -6532.055 2279.388 -2.866 0.004161 **
## as.factor(Store)22 -2924.282 2276.248 -1.285 0.198900
## as.factor(Store)23 685.176 2260.209 0.303 0.761778
## as.factor(Store)24 2313.365 2296.928 1.007 0.313860
## as.factor(Store)25 -7455.289 2295.699 -3.248 0.001164 **
## as.factor(Store)26 -3974.470 2274.180 -1.748 0.080525 .
## as.factor(Store)27 12094.542 2295.612 5.269 1.38e-07 ***
## as.factor(Store)28 2148.588 2330.146 0.922 0.356486
## as.factor(Store)29 -9845.675 2292.730 -4.294 1.75e-05 ***
## as.factor(Store)30 -8147.052 2470.835 -3.297 0.000976 ***
## as.factor(Store)31 -1599.473 2244.083 -0.713 0.476000
## as.factor(Store)32 -4848.149 2243.930 -2.161 0.030730 *
## as.factor(Store)33 -11802.783 2660.525 -4.436 9.16e-06 ***
## as.factor(Store)34 -6705.556 2266.432 -2.959 0.003090 **
## as.factor(Store)35 6662.011 2312.285 2.881 0.003963 **
## as.factor(Store)36 -2488.866 2506.356 -0.993 0.320700
## as.factor(Store)37 -7217.576 2476.896 -2.914 0.003569 **
## as.factor(Store)38 -14806.921 2558.760 -5.787 7.18e-09 ***
## as.factor(Store)39 -5286.135 2259.548 -2.339 0.019312 *
## as.factor(Store)40 -5055.266 2262.225 -2.235 0.025441 *
## as.factor(Store)41 -6365.893 2254.713 -2.823 0.004752 **
## as.factor(Store)42 -7267.813 2593.135 -2.803 0.005068 **
## as.factor(Store)43 -4102.931 2509.892 -1.635 0.102112
## as.factor(Store)44 -14543.667 2639.852 -5.509 3.61e-08 ***
## as.factor(Store)45 -6501.471 2305.406 -2.820 0.004801 **
## as.factor(IsHoliday.x)TRUE -29.778 139.967 -0.213 0.831520
## Temperature 8.547 6.275 1.362 0.173172
## as.factor(Year)2011 -157.133 212.168 -0.741 0.458932
## as.factor(Year)2012 166.165 241.796 0.687 0.491949
## as.factor(Month)2 -5169.688 2245.160 -2.303 0.021302 *
## as.factor(Month)3 -6403.200 2205.832 -2.903 0.003698 **
## as.factor(Month)4 -6259.923 2201.361 -2.844 0.004460 **
## as.factor(Month)5 -5428.126 2247.790 -2.415 0.015741 *
## as.factor(Month)6 -5239.259 2241.864 -2.337 0.019439 *
## as.factor(Month)7 -5844.296 2249.132 -2.598 0.009364 **
## as.factor(Month)8 -5429.774 2254.089 -2.409 0.016003 *
## as.factor(Month)9 -7043.490 2212.157 -3.184 0.001453 **
## as.factor(Month)10 -8501.320 2203.714 -3.858 0.000114 ***
## as.factor(Month)11 -6553.449 2506.638 -2.614 0.008938 **
## as.factor(Month)12 -4564.815 2690.722 -1.697 0.089792 .
## as.factor(as.numeric(Type))2 NA NA NA NA
## as.factor(as.numeric(Type))3 NA NA NA NA
## as.factor(Sizestore)Medium NA NA NA NA
## as.factor(Sizestore)Smal NA NA NA NA
## as.factor(Size)37392 NA NA NA NA
## as.factor(Size)39690 NA NA NA NA
## as.factor(Size)39910 NA NA NA NA
## as.factor(Size)41062 NA NA NA NA
## as.factor(Size)42988 NA NA NA NA
## as.factor(Size)57197 NA NA NA NA
## as.factor(Size)70713 NA NA NA NA
## as.factor(Size)93188 NA NA NA NA
## as.factor(Size)93638 NA NA NA NA
## as.factor(Size)103681 NA NA NA NA
## as.factor(Size)112238 NA NA NA NA
## as.factor(Size)114533 NA NA NA NA
## as.factor(Size)118221 NA NA NA NA
## as.factor(Size)119557 NA NA NA NA
## as.factor(Size)120653 NA NA NA NA
## as.factor(Size)123737 NA NA NA NA
## as.factor(Size)125833 NA NA NA NA
## as.factor(Size)126512 NA NA NA NA
## as.factor(Size)128107 NA NA NA NA
## as.factor(Size)140167 NA NA NA NA
## as.factor(Size)151315 NA NA NA NA
## as.factor(Size)152513 NA NA NA NA
## as.factor(Size)155078 NA NA NA NA
## as.factor(Size)155083 NA NA NA NA
## as.factor(Size)158114 NA NA NA NA
## as.factor(Size)184109 NA NA NA NA
## as.factor(Size)196321 NA NA NA NA
## as.factor(Size)200898 NA NA NA NA
## as.factor(Size)202307 NA NA NA NA
## as.factor(Size)202505 NA NA NA NA
## as.factor(Size)203007 NA NA NA NA
## as.factor(Size)203742 NA NA NA NA
## as.factor(Size)203750 NA NA NA NA
## as.factor(Size)203819 NA NA NA NA
## as.factor(Size)204184 NA NA NA NA
## as.factor(Size)205863 NA NA NA NA
## as.factor(Size)206302 NA NA NA NA
## as.factor(Size)207499 NA NA NA NA
## as.factor(Size)219622 NA NA NA NA
## Fuel_Price:as.factor(Store)2 -1966.896 690.063 -2.850 0.004368 **
## Fuel_Price:as.factor(Store)3 -475.509 712.552 -0.667 0.504560
## Fuel_Price:as.factor(Store)4 1515.699 698.208 2.171 0.029944 *
## Fuel_Price:as.factor(Store)5 -536.195 713.702 -0.751 0.452480
## Fuel_Price:as.factor(Store)6 -1474.613 690.449 -2.136 0.032702 *
## Fuel_Price:as.factor(Store)7 -318.598 696.475 -0.457 0.647353
## Fuel_Price:as.factor(Store)8 -609.122 695.711 -0.876 0.381281
## Fuel_Price:as.factor(Store)9 -412.554 717.384 -0.575 0.565237
## Fuel_Price:as.factor(Store)10 -1566.178 674.055 -2.324 0.020152 *
## Fuel_Price:as.factor(Store)11 -1129.725 693.276 -1.630 0.103198
## Fuel_Price:as.factor(Store)12 -708.808 683.752 -1.037 0.299902
## Fuel_Price:as.factor(Store)13 521.311 717.271 0.727 0.467350
## Fuel_Price:as.factor(Store)14 -3809.057 681.410 -5.590 2.27e-08 ***
## Fuel_Price:as.factor(Store)15 -1575.106 675.805 -2.331 0.019769 *
## Fuel_Price:as.factor(Store)16 -1137.535 702.873 -1.618 0.105575
## Fuel_Price:as.factor(Store)17 307.331 729.123 0.422 0.673385
## Fuel_Price:as.factor(Store)18 -2504.634 675.988 -3.705 0.000211 ***
## Fuel_Price:as.factor(Store)19 -1842.370 671.688 -2.743 0.006090 **
## Fuel_Price:as.factor(Store)20 -1048.599 678.662 -1.545 0.122323
## Fuel_Price:as.factor(Store)21 -1229.304 701.970 -1.751 0.079908 .
## Fuel_Price:as.factor(Store)22 -1100.223 678.016 -1.623 0.104652
## Fuel_Price:as.factor(Store)23 -807.712 672.715 -1.201 0.229878
## Fuel_Price:as.factor(Store)24 -1481.592 670.700 -2.209 0.027173 *
## Fuel_Price:as.factor(Store)25 -1166.111 685.294 -1.702 0.088827 .
## Fuel_Price:as.factor(Store)26 -922.178 676.082 -1.364 0.172567
## Fuel_Price:as.factor(Store)27 -2610.858 670.541 -3.894 9.88e-05 ***
## Fuel_Price:as.factor(Store)28 -1557.564 677.118 -2.300 0.021433 *
## Fuel_Price:as.factor(Store)29 -1117.610 682.533 -1.637 0.101538
## Fuel_Price:as.factor(Store)30 -1487.100 760.036 -1.957 0.050393 .
## Fuel_Price:as.factor(Store)31 -144.672 691.137 -0.209 0.834194
## Fuel_Price:as.factor(Store)32 -138.054 688.604 -0.200 0.841103
## Fuel_Price:as.factor(Store)33 -1285.543 765.933 -1.678 0.093270 .
## Fuel_Price:as.factor(Store)34 -429.079 698.515 -0.614 0.539035
## Fuel_Price:as.factor(Store)35 -4330.898 690.452 -6.273 3.56e-10 ***
## Fuel_Price:as.factor(Store)36 -3313.247 772.679 -4.288 1.80e-05 ***
## Fuel_Price:as.factor(Store)37 -1303.875 760.857 -1.714 0.086586 .
## Fuel_Price:as.factor(Store)38 52.994 735.798 0.072 0.942584
## Fuel_Price:as.factor(Store)39 1386.897 695.825 1.993 0.046244 *
## Fuel_Price:as.factor(Store)40 -847.837 673.624 -1.259 0.208168
## Fuel_Price:as.factor(Store)41 837.226 691.135 1.211 0.225751
## Fuel_Price:as.factor(Store)42 -945.193 748.007 -1.264 0.206369
## Fuel_Price:as.factor(Store)43 -1300.235 773.112 -1.682 0.092604 .
## Fuel_Price:as.factor(Store)44 -330.521 801.477 -0.412 0.680054
## Fuel_Price:as.factor(Store)45 -1070.102 688.369 -1.555 0.120055
## Fuel_Price:as.factor(Month)2 2140.842 691.190 3.097 0.001953 **
## Fuel_Price:as.factor(Month)3 2319.906 669.767 3.464 0.000533 ***
## Fuel_Price:as.factor(Month)4 2324.423 664.759 3.497 0.000471 ***
## Fuel_Price:as.factor(Month)5 2108.884 673.662 3.130 0.001745 **
## Fuel_Price:as.factor(Month)6 2178.415 675.116 3.227 0.001252 **
## Fuel_Price:as.factor(Month)7 2216.807 681.005 3.255 0.001133 **
## Fuel_Price:as.factor(Month)8 2147.445 677.462 3.170 0.001525 **
## Fuel_Price:as.factor(Month)9 2363.397 666.744 3.545 0.000393 ***
## Fuel_Price:as.factor(Month)10 2875.057 667.687 4.306 1.66e-05 ***
## Fuel_Price:as.factor(Month)11 2805.872 767.831 3.654 0.000258 ***
## Fuel_Price:as.factor(Month)12 2908.426 828.412 3.511 0.000447 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21010 on 421291 degrees of freedom
## Multiple R-squared: 0.09791, Adjusted R-squared: 0.09766
## F-statistic: 397.6 on 115 and 421291 DF, p-value: < 2.2e-16
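Unfortunately, we obtained NA for every coefficient of the newly introduced variables. After the collinearity check, we conclude that Size, Sizestore and Type are perfectly multicollinear with the store factor in model 4: each of them takes a single value per store, so it is an exact linear combination of the store dummy variables already in the model. As a result, model 4 cannot give us any additional information for our analysis.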
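The mechanism behind these NA coefficients can be reproduced on a small simulated example. This is only a sketch: the data frame `toy`, its three stores and their attributes are hypothetical. Because Type and Size are fixed per store, their dummy columns duplicate the store dummies, `lm()` reports them as NA, and `alias()` lists them as aliased terms.

```r
# Simulated data only: three stores, each with a fixed Type and Size.
set.seed(4)
toy <- data.frame(Store = factor(rep(1:3, each = 40)))
toy$Type <- c("A", "B", "A")[as.integer(toy$Store)]
toy$Size <- c(34875, 126512, 219622)[as.integer(toy$Store)]
toy$Fuel_Price <- runif(nrow(toy), 2.5, 4.5)
toy$Weekly_Sales <- 10000 + 0.05 * toy$Size - 800 * toy$Fuel_Price +
  rnorm(nrow(toy), sd = 1500)

fit <- lm(Weekly_Sales ~ Fuel_Price + Store + Type + factor(Size),
          data = toy)

# Coefficients of the store-level variables come back as NA
print(coef(fit))
na_terms <- names(coef(fit))[is.na(coef(fit))]

# alias() shows which terms are exact linear combinations of the others
aliased <- alias(fit)$Complete
print(rownames(aliased))
```

To study store characteristics in a regression of this kind, one option is to replace the store factor with those characteristics rather than adding them on top of it, since both cannot be identified at the same time.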
In this analysis, model 2 offers the best overall performance. This second model gives us specific information on stores 2, 4, 6, 10, 14, 15, 18, 19, 21, 24, 27, 28, 30, 35, 36, 37, 39 and 43, for which fuel price variation has a significant impact. For stores 4 and 39, weekly sales increase when the fuel price increases; for all the other stores, weekly sales decrease when the fuel price increases.
This analysis allowed us to reach our main goal, which was to obtain information on the relationship between weekly sales and fuel price. It opens the door to further research on the impact of fuel price variation on Walmart consumer behaviour.