Objective of this analysis

Fuel is an essential good for most Americans who shop at Walmart. At an average price of $2.15 per gallon last year, the average American spent about $1,400 to fill up their tank. Walmart is not a high-end store; its typical customer belongs to the lower or middle social class. For these two classes, a marginal change in fuel price should weigh more heavily on the household budget, and therefore on consumer behaviour, than it does for wealthier households. This assumption motivates our research question. Whether the result is positive or negative, it will be useful to know: if fuel price variation does affect Walmart consumer behaviour, fuel price could be added to the procurement forecast models. In growth strategy planning, this kind of information can be relevant to decision makers.

Does a variation in fuel price have an impact on consumer behaviour? If there is a cause-and-effect relationship, what is the magnitude of the effect, and how do consumers react? Do they spend more or less at Walmart stores?

In the first part, we will build the most accurate model we can to explain the impact on Walmart consumer behaviour.

In the second part, we will try to identify whether some stores can use fuel price variation as an indicator of future Walmart sales.

In the third part, we will try to identify whether the stores for which fuel price variation has a significant impact share common characteristics, such as store size or geographical location.

Hypothesis

1. Prices are fixed over time; only quantities vary.

Plan for the analysis

  1. Loading the data sets
  2. We will check whether there is any correlation in the data
  3. We will observe our two variables of interest, fuel price and weekly sales, to better understand their distributions.
  4. Regression analysis
  5. Global analysis
  6. Conclusion

1. Loading the datasets

train <- read.csv("train.csv", header = TRUE)
features <- read.csv("features.csv", header = TRUE)
stores <- read.csv("stores.csv", header = TRUE)

2. We will check whether there is any correlation in the data

fuelsales <- merge(train, features, by=c("Store","Date"))
sts.ex.sat <- subset(fuelsales, select = c("Fuel_Price", "Weekly_Sales"))
summary(sts.ex.sat)
##    Fuel_Price     Weekly_Sales   
##  Min.   :2.472   Min.   : -4989  
##  1st Qu.:2.933   1st Qu.:  2080  
##  Median :3.452   Median :  7612  
##  Mean   :3.361   Mean   : 15981  
##  3rd Qu.:3.738   3rd Qu.: 20206  
##  Max.   :4.468   Max.   :693099

Fuel price data observation: The values range from a minimum of 2.472 to a maximum of 4.468, with a median of 3.452 close to the mean of 3.361, so the summary gives no sign of strong skew.

Weekly sales data observation: The minimum is negative (-4989), and it would be interesting to investigate what can produce negative sales. Comparing the maximum of 693099 with the median of 7612 and a mean of 15981, roughly twice the median, we can suppose a distribution with a long right tail created by a few extremely high weekly sales values.

cor(sts.ex.sat) 
##                 Fuel_Price  Weekly_Sales
## Fuel_Price    1.0000000000 -0.0001202955
## Weekly_Sales -0.0001202955  1.0000000000

The correlation between weekly sales and fuel price is essentially zero (-0.00012), which is not an encouraging result for our initial question, but we will continue to investigate whether a relationship exists at the store level.
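The store-level check announced here could be sketched as follows. The toy data frame is invented for illustration (it is not the Walmart data); it only shows how a pooled correlation of zero can hide opposite relationships in individual stores.

```r
# Toy data (hypothetical): two stores with opposite fuel/sales relationships
toy <- data.frame(
  Store        = c(1, 1, 1, 2, 2, 2),
  Fuel_Price   = c(2, 3, 4, 2, 3, 4),
  Weekly_Sales = c(10, 20, 30, 30, 20, 10)
)

# Pooled correlation over all stores
pooled <- cor(toy$Fuel_Price, toy$Weekly_Sales)

# Correlation computed separately for each store
per_store <- sapply(split(toy, toy$Store),
                    function(d) cor(d$Fuel_Price, d$Weekly_Sales))

print(pooled)
print(per_store)
```

On the real data, the same split() call would be applied to fuelsales instead of toy.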

library(ggplot2)
# Scatter plot of weekly sales against fuel price
ggplot(data = sts.ex.sat, aes(x = Fuel_Price, y = Weekly_Sales)) +
  geom_point(alpha = 0.1, aes(color = Weekly_Sales))

Unfortunately, this graph does not give us any useful information about a potential relationship between weekly sales and fuel price.

This first analysis gives a weak signal, with no clear correlation between weekly sales and fuel price. However, we will continue our study to investigate whether a cause-and-effect relationship exists for some specific stores. If it does, the procurement forecasting team can consider including this variable in their forecast models.

3. We will observe our two variables of interest, fuel price and weekly sales, to better understand their distributions.

hist(fuelsales$Weekly_Sales)

As we observed earlier in this analysis, this histogram has a long right tail. These extreme values can be considered outliers. We will exclude them from the data set to bring the distribution closer to normal.

IDout <- which(fuelsales$Weekly_Sales > 200000)

The result is a distribution that is closer to normal.

# Histogram of weekly sales without the rows flagged in IDout
hist(fuelsales$Weekly_Sales[-IDout])
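One caveat with negative indexing, shown here on a small hypothetical vector rather than on the real data: `x[-which(cond)]` returns an empty vector when no element meets the condition, whereas a logical filter keeps every value. A minimal sketch:

```r
# Hypothetical sales vector with one extreme value
sales <- c(1500, 7600, 20200, 693099)

# Index-based exclusion, as in the analysis above
id_out <- which(sales > 200000)
kept_by_index <- sales[-id_out]

# Logical filter: same result here, and still safe if nothing exceeds the cut-off
kept_by_filter <- sales[sales <= 200000]

# With a cut-off that no value exceeds, which() is empty and -integer(0) drops everything
none_out <- which(sales > 1e9)
length(sales[-none_out])        # 0: every value is lost
length(sales[sales <= 1e9])     # 4: every value is kept
```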

hist(fuelsales$Fuel_Price)

The fuel price histogram looks like the overlap of two processes: it is bimodal, with two most frequent values. We will not stratify the data because we consider the difference between the minimum and the maximum to be reasonable. We will keep the fuel price data as is.

4. Regression analysis

We will build the model step by step, starting with the regression of weekly sales on fuel price and gradually adding control variables.

4.1.

First, we will regress weekly sales on fuel price and compare the model fitted on all weekly sales values with the model fitted without the extreme sales values (IDout).

model1 <- lm(Weekly_Sales ~ Fuel_Price, data=fuelsales) 
summary(model1)
## 
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price, data = fuelsales)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -20972 -13901  -8368   4225 677117 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 16001.285    258.779  61.834   <2e-16 ***
## Fuel_Price     -5.958     76.287  -0.078    0.938    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22710 on 421568 degrees of freedom
## Multiple R-squared:  1.447e-08,  Adjusted R-squared:  -2.358e-06 
## F-statistic: 0.006101 on 1 and 421568 DF,  p-value: 0.9377
# Cut the extreme values from the tail, probably due to holiday sales
IDout <- which(fuelsales$Weekly_Sales > 200000)
model1less <- lm(Weekly_Sales ~ Fuel_Price, data=fuelsales[-IDout,])
# Summarize and print the results
summary(model1less) # show regression coefficients table
## 
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price, data = fuelsales[-IDout, 
##     ])
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -20853 -13806  -8276   4297 183802 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 15686.68     252.07  62.232   <2e-16 ***
## Fuel_Price     59.03      74.31   0.794    0.427    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22120 on 421405 degrees of freedom
## Multiple R-squared:  1.498e-06,  Adjusted R-squared:  -8.753e-07 
## F-statistic: 0.6311 on 1 and 421405 DF,  p-value: 0.4269

Regression summary analysis Model 1 (including an explanation of each indicator evaluated)

The model, with Y = weekly sales and X = fuel price: \[ Y = \beta_0 + \beta_1 X + e \]

The p-value

The Pr(>|t|) column in the model output is the probability, under the null hypothesis that the coefficient is zero, of observing a t statistic at least as large in absolute value as the one computed. A small p-value indicates that the observed relationship between the predictor (x) and the response (y) is unlikely to be due to chance.

Typically, a p-value of 5% or less is a good cut-off point. Here (y) is the response variable that we try to explain with (x), the explanatory variable.

Three stars (or asterisks) mark a highly significant p-value. Consequently, a small p-value for the slope indicates that we can reject the null hypothesis, which allows us to conclude that there is a relationship between (x) and (y).
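As an illustration of how Pr(>|t|) is obtained, the two-sided p-value of the Fuel_Price slope in model1 can be recomputed from the estimate, standard error and residual degrees of freedom printed in the summary above. This is a sketch using the rounded values from that output, so the result only matches to a few decimals.

```r
# Values copied from the model1 summary
estimate  <- -5.958
std_error <- 76.287
df_resid  <- 421568

# t statistic and two-sided p-value under H0: slope = 0
t_value <- estimate / std_error
p_value <- 2 * pt(-abs(t_value), df = df_resid)

round(t_value, 3)
round(p_value, 3)
```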

P-value observed

We observed that the p-value decreases from model1 (0.9377) to model1less (0.4269), which is an improvement, but it remains far from significant. This tells us that, for all stores taken together, we find no evidence that fuel price influences weekly sales. However, we will continue our investigation to look for a possible relationship between fuel price and weekly sales per store.

The Residual Standard Error

Residual Standard Error is a measure of the quality of a linear regression fit. The Residual Standard Error is the average amount that the response (y) will deviate from the true regression line.

Residual Standard Error observed

We observed that the residual standard error decreases slightly from model1 (22710) to model1less (22120), which is an improvement, but overall the model is still poor. In other words, taking the model1less intercept of 15686.68 as the typical level of weekly sales and a Residual Standard Error of 22120, the percentage error (the amount by which any prediction would still be off) is about 141%, which is not good.
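The 141% figure is the ratio of the residual standard error to the reference level of sales. A short check with the numbers from the model1less summary (the variable names are chosen here for illustration):

```r
rse      <- 22120      # residual standard error of model1less
baseline <- 15686.68   # intercept of model1less, used above as the reference level

pct_error <- 100 * rse / baseline
round(pct_error)
```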

The R-squared (R2) and adjusted R-squared (Adj.R2)

The R-squared statistic provides a measure of how well the model fits the actual data. It takes the form of a proportion of variance explained. R-squared is a measure of the linear relationship between our predictor variable (fuel price) and our response variable (weekly sales). It always lies between 0 and 1: a number near 0 indicates a regression that does not explain the variance in the response variable, whereas a number close to 1 indicates one that explains most of the observed variance.

The adjusted R-squared also tells us how well the model explains the variance, but whereas R-squared always increases as more variables are included in the model, the adjusted R-squared corrects for the number of variables considered. That is why it is the preferred measure.
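The adjustment can be written as Adj.R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. As a sketch, the helper `adjusted_r2` below (a name introduced here for illustration) applies this formula to the rounded values printed for model1less.

```r
# Adjusted R-squared from R-squared, n observations and p predictors
# (adjusted_r2 is a helper name introduced for this sketch)
adjusted_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# model1less: R2 = 1.498e-06, 421405 residual df + 2 coefficients = 421407 rows, 1 predictor
adj <- adjusted_r2(r2 = 1.498e-06, n = 421407, p = 1)
adj   # slightly negative, in line with the summary output
```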

R2 and Adj.R2 observed

For model1less, both the R2 and the Adj. R2 are close to 0, which tells us that the model does not explain the variance.

The F-statistic

The F-statistic is a good indicator of whether there is a relationship between our predictor and the response variable. The further the F-statistic is above 1, the better. However, how much larger than 1 it needs to be depends on both the number of data points and the number of predictors. Generally, when the number of data points is large, an F-statistic that is only a little larger than 1 can already be sufficient to reject the null hypothesis (H0: there is no relationship between fuel prices and weekly sales). Conversely, if the number of data points is small, a large F-statistic is required to ascertain that there may be a relationship between the predictor and response variables.

F-statistic observed

In model1less, the F-statistic is 0.6311, which is smaller than 1 and does not allow us to confirm a relationship between fuel prices and weekly sales. Given our large number of observations, a value only moderately above 1 would have been enough to confirm a relationship between our response (Y) and explanatory (X) variables.
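To make the threshold concrete, the 5% critical value and the p-value of the observed F-statistic can be computed with the base R functions qf() and pf(), using the degrees of freedom printed for model1less (1 and 421405). A sketch:

```r
f_obs <- 0.6311
df1   <- 1
df2   <- 421405

# Critical value: F must exceed this to reject H0 at the 5% level
f_crit <- qf(0.95, df1, df2)

# p-value of the observed statistic
p_value <- pf(f_obs, df1, df2, lower.tail = FALSE)

round(f_crit, 2)
round(p_value, 3)
```

With a single predictor the critical value is close to 3.84, so how far above 1 the statistic must be should always be read with the number of predictors in mind.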


4.2. We will build the second model on top of the first to observe the impact per store, adding some control variables.

In this second model, we use the function as.factor, which has the same effect as creating a dummy variable for each store. We also add holiday and temperature as new control variables. We continue the analysis without the extreme weekly sales values above 200,000. We create interaction variables in order to observe the impact of fuel price variation on weekly sales per store.

\[ \text{WeeklySales} = \beta_0 + \beta_1 \text{FuelPrice} + \beta_2 \text{Store} + \beta_3 (\text{FuelPrice} \times \text{Store}) + \beta_4 \text{IsHoliday} + \beta_5 \text{Temperature} + e \]
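To see what as.factor and the interaction term do in such a specification, model.matrix() can display the columns R builds from the formula. The small data frame below is hypothetical and only serves to show the column layout.

```r
# Hypothetical data: three stores, two observations each
toy <- data.frame(
  Store      = c(1, 1, 2, 2, 3, 3),
  Fuel_Price = c(2.5, 3.0, 2.6, 3.1, 2.7, 3.2)
)

# Design matrix for fuel price, store dummies and their interaction
mm <- model.matrix(~ Fuel_Price * as.factor(Store), data = toy)
colnames(mm)
```

Store 1 has no column of its own: it is the reference level, so every store coefficient and every Fuel_Price:Store coefficient is measured relative to Store 1.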

model2 <- lm(Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price * as.factor(Store) + as.factor(IsHoliday.x) + Temperature, data = TM1[-IDout, ])
summary(model2)
## 
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price * 
##     as.factor(Store) + as.factor(IsHoliday.x) + Temperature, 
##     data = TM1[-IDout, ])
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -30668 -11877  -5752   4509 187819 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    19483.058   1588.635  12.264  < 2e-16 ***
## Fuel_Price                       813.409    488.552   1.665 0.095926 .  
## as.factor(Store)2              11285.143   2243.457   5.030 4.90e-07 ***
## as.factor(Store)3             -13772.629   2316.174  -5.946 2.75e-09 ***
## as.factor(Store)4               2193.559   2266.997   0.968 0.333242    
## as.factor(Store)5             -14923.212   2320.532  -6.431 1.27e-10 ***
## as.factor(Store)6               4832.642   2244.919   2.153 0.031343 *  
## as.factor(Store)7             -12745.884   2270.260  -5.614 1.98e-08 ***
## as.factor(Store)8              -6691.779   2261.443  -2.959 0.003086 ** 
## as.factor(Store)9             -11654.137   2333.519  -4.994 5.91e-07 ***
## as.factor(Store)10              9178.313   2304.268   3.983 6.80e-05 ***
## as.factor(Store)11              1159.684   2254.644   0.514 0.607005    
## as.factor(Store)12             -5000.950   2353.011  -2.125 0.033559 *  
## as.factor(Store)13              3271.822   2353.398   1.390 0.164452    
## as.factor(Store)14             18818.517   2281.070   8.250  < 2e-16 ***
## as.factor(Store)15             -7845.485   2314.711  -3.389 0.000701 ***
## as.factor(Store)16            -10685.373   2295.659  -4.655 3.25e-06 ***
## as.factor(Store)17            -10319.046   2395.449  -4.308 1.65e-05 ***
## as.factor(Store)18              2030.647   2270.443   0.894 0.371117    
## as.factor(Store)19              4472.674   2298.431   1.946 0.051659 .  
## as.factor(Store)20             10682.398   2271.651   4.702 2.57e-06 ***
## as.factor(Store)21             -6524.121   2281.895  -2.859 0.004249 ** 
## as.factor(Store)22             -3374.377   2276.717  -1.482 0.138308    
## as.factor(Store)23               103.702   2259.024   0.046 0.963385    
## as.factor(Store)24              1833.237   2295.239   0.799 0.424457    
## as.factor(Store)25             -7903.027   2295.060  -3.443 0.000574 ***
## as.factor(Store)26             -4632.145   2271.342  -2.039 0.041412 *  
## as.factor(Store)27             11704.728   2294.580   5.101 3.38e-07 ***
## as.factor(Store)28              1953.490   2327.226   0.839 0.401242    
## as.factor(Store)29            -10282.070   2293.255  -4.484 7.34e-06 ***
## as.factor(Store)30             -8118.094   2473.550  -3.282 0.001031 ** 
## as.factor(Store)31             -1597.669   2246.551  -0.711 0.476982    
## as.factor(Store)32             -5337.582   2244.714  -2.378 0.017415 *  
## as.factor(Store)33            -11865.111   2658.568  -4.463 8.09e-06 ***
## as.factor(Store)34             -6734.635   2268.247  -2.969 0.002987 ** 
## as.factor(Store)35              6343.112   2313.093   2.742 0.006102 ** 
## as.factor(Store)36             -2441.804   2508.652  -0.973 0.330378    
## as.factor(Store)37             -7128.347   2479.518  -2.875 0.004042 ** 
## as.factor(Store)38            -14995.307   2556.443  -5.866 4.48e-09 ***
## as.factor(Store)39             -5234.694   2262.008  -2.314 0.020658 *  
## as.factor(Store)40             -5590.610   2261.663  -2.472 0.013440 *  
## as.factor(Store)41             -6942.112   2254.770  -3.079 0.002078 ** 
## as.factor(Store)42             -7425.218   2591.406  -2.865 0.004166 ** 
## as.factor(Store)43             -4046.955   2512.642  -1.611 0.107260    
## as.factor(Store)44            -15032.768   2639.338  -5.696 1.23e-08 ***
## as.factor(Store)45             -6833.082   2306.097  -2.963 0.003046 ** 
## as.factor(IsHoliday.x)TRUE       461.668    129.119   3.576 0.000350 ***
## Temperature                       -6.464      2.137  -3.025 0.002488 ** 
## Fuel_Price:as.factor(Store)2   -1969.671    690.823  -2.851 0.004356 ** 
## Fuel_Price:as.factor(Store)3    -474.446    713.331  -0.665 0.505978    
## Fuel_Price:as.factor(Store)4    1551.300    698.571   2.221 0.026373 *  
## Fuel_Price:as.factor(Store)5    -531.474    714.472  -0.744 0.456956    
## Fuel_Price:as.factor(Store)6   -1459.798    691.181  -2.112 0.034684 *  
## Fuel_Price:as.factor(Store)7    -250.753    696.540  -0.360 0.718849    
## Fuel_Price:as.factor(Store)8    -591.947    696.432  -0.850 0.395341    
## Fuel_Price:as.factor(Store)9    -395.062    718.125  -0.550 0.582231    
## Fuel_Price:as.factor(Store)10  -1479.943    673.860  -2.196 0.028077 *  
## Fuel_Price:as.factor(Store)11  -1128.279    694.039  -1.626 0.104020    
## Fuel_Price:as.factor(Store)12   -622.460    683.536  -0.911 0.362482    
## Fuel_Price:as.factor(Store)13    610.961    717.267   0.852 0.394331    
## Fuel_Price:as.factor(Store)14  -3743.876    681.821  -5.491 4.00e-08 ***
## Fuel_Price:as.factor(Store)15  -1480.348    676.032  -2.190 0.028542 *  
## Fuel_Price:as.factor(Store)16  -1021.614    703.274  -1.453 0.146321    
## Fuel_Price:as.factor(Store)17    420.448    729.188   0.577 0.564212    
## Fuel_Price:as.factor(Store)18  -2425.003    676.257  -3.586 0.000336 ***
## Fuel_Price:as.factor(Store)19  -1754.299    671.879  -2.611 0.009027 ** 
## Fuel_Price:as.factor(Store)20   -978.387    679.109  -1.441 0.149672    
## Fuel_Price:as.factor(Store)21  -1228.276    702.740  -1.748 0.080493 .  
## Fuel_Price:as.factor(Store)22  -1016.799    678.287  -1.499 0.133857    
## Fuel_Price:as.factor(Store)23   -712.846    673.036  -1.059 0.289532    
## Fuel_Price:as.factor(Store)24  -1391.706    670.879  -2.074 0.038038 *  
## Fuel_Price:as.factor(Store)25  -1096.472    685.757  -1.599 0.109839    
## Fuel_Price:as.factor(Store)26   -827.202    676.421  -1.223 0.221364    
## Fuel_Price:as.factor(Store)27  -2531.784    670.686  -3.775 0.000160 ***
## Fuel_Price:as.factor(Store)28  -1475.573    676.898  -2.180 0.029265 *  
## Fuel_Price:as.factor(Store)29  -1038.112    682.817  -1.520 0.128427    
## Fuel_Price:as.factor(Store)30  -1491.746    760.869  -1.961 0.049928 *  
## Fuel_Price:as.factor(Store)31   -142.388    691.895  -0.206 0.836951    
## Fuel_Price:as.factor(Store)32    -49.688    688.873  -0.072 0.942499    
## Fuel_Price:as.factor(Store)33  -1210.532    765.928  -1.580 0.113998    
## Fuel_Price:as.factor(Store)34   -465.254    699.257  -0.665 0.505824    
## Fuel_Price:as.factor(Store)35  -4276.284    690.831  -6.190 6.02e-10 ***
## Fuel_Price:as.factor(Store)36  -3314.113    773.428  -4.285 1.83e-05 ***
## Fuel_Price:as.factor(Store)37  -1317.126    761.692  -1.729 0.083772 .  
## Fuel_Price:as.factor(Store)38    134.157    735.659   0.182 0.855298    
## Fuel_Price:as.factor(Store)39   1382.227    696.591   1.984 0.047226 *  
## Fuel_Price:as.factor(Store)40   -770.984    673.857  -1.144 0.252568    
## Fuel_Price:as.factor(Store)41    931.949    691.419   1.348 0.177698    
## Fuel_Price:as.factor(Store)42   -863.422    747.904  -1.154 0.248314    
## Fuel_Price:as.factor(Store)43  -1313.104    773.961  -1.697 0.089773 .  
## Fuel_Price:as.factor(Store)44   -235.142    801.658  -0.293 0.769279    
## Fuel_Price:as.factor(Store)45  -1009.367    688.798  -1.465 0.142811    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21030 on 421315 degrees of freedom
## Multiple R-squared:  0.09587,    Adjusted R-squared:  0.09567 
## F-statistic: 490.9 on 91 and 421315 DF,  p-value: < 2.2e-16

Regression summary analysis Model 2

P-value observed

We observed that the overall p-value decreases considerably from model1less (0.4269) to model2 (< 2.2e-16), which is good in theory. However, 2.2e-16 is the machine epsilon of double-precision floating point, the threshold below which R simply reports "< 2.2e-16". With a sample as large as ours, such a tiny p-value is to be expected, so we cannot draw conclusions about this global model from the model2 p-value alone.
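The value 2.2e-16 corresponds to `.Machine$double.eps` in R. The sketch below shows what this constant is: the smallest step that can be added to 1 and still change it, not the smallest positive number a double can hold.

```r
eps <- .Machine$double.eps
eps                 # about 2.22e-16

# eps is the gap between 1 and the next representable double
(1 + eps) > 1       # TRUE
(1 + eps / 2) == 1  # TRUE

# Much smaller positive numbers are still representable
1e-300 > 0          # TRUE
```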

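The p-value for the explanatory variable fuel price is 0.095926, which is significant at the 10% level, so at that level we can reject the null hypothesis. The estimate means that a one-unit increase in fuel price is associated with an increase of about 813 in weekly sales for the reference store (Store 1), the other variables being held constant.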

The two control variables that we added, holiday and temperature, are both highly significant at the 1% level.

Temperature, p-value 0.0025: For a one-unit increase in temperature, sales decrease by 6.464, which in practice is a negligible impact. However, it is interesting to know that sales decrease when the temperature increases. It could be worthwhile to observe this impact per store.

Holiday (TRUE), p-value 0.000350: When the week includes a holiday, sales increase by 461.668, which reflects the fact that people shop when they have free time.

The interaction variables (Fuel_Price*Store) allow us to observe, per store, the variation in weekly sales for a given fuel price variation. They give us more accurate information on the relationship between weekly sales and fuel price per store. In model2, the interaction terms of 18 of the 45 stores are significant at the 10% level, and 16 of those 18 are negative, meaning that weekly sales in these stores respond more negatively to a fuel price increase than in the reference store (Store 1).
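Because the interaction coefficients are expressed relative to the reference store, the fuel price slope of a given store is the sum of the Fuel_Price coefficient and that store's interaction coefficient. A sketch with three coefficients copied from the model 2 output (the vector and its names are chosen here for illustration):

```r
base_slope <- 813.409   # Fuel_Price (slope for the reference store, Store 1)

# Interaction estimates copied from the model2 summary
interaction <- c(Store4 = 1551.300, Store14 = -3743.876, Store35 = -4276.284)

# Total fuel price slope per store
store_slope <- base_slope + interaction
store_slope
```

The significance of such a total slope is not given by the coefficient table and would need its own test.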

Residual Standard Error observed

We observed that the residual standard error decreases slightly from model1less (22120) to model 2 (21030), which is an improvement but a modest one. In other words, taking the model 2 intercept of 19483.058 as the typical level of weekly sales and a Residual Standard Error of 21030, the percentage error (the amount by which any prediction would still be off) is about 108%, which again is not good.

R2 and Adj.R2 observed

For model 2, the R2 (0.09587) and the Adj. R2 (0.09567) are still close to 0, which tells us that the model explains little of the variance, but it does better than model1less.

F-statistic observed

The F-statistic is 490.9, which is far larger than 1 and allows us to reject the null hypothesis that none of the predictors in model 2 is related to weekly sales.

AIC test model 1 vs model 2

AIC(model1less, model2)

The AIC index

The Akaike information criterion (AIC) is an estimator of the relative quality of statistical models, or of their relative goodness of fit. It is a useful criterion when comparing models built up step by step.

How can we use the information from the AIC test in our analysis? The model with the lower AIC is preferred. Usually, a difference of 2 is enough to consider using the more complex model, and a difference of 10 is considered substantial.

AIC test observation

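In the output of this first AIC test, model1less is listed with 3 and model2 with 93; these are the df values, that is, the numbers of estimated parameters, not the AIC values themselves. The decision rests on the AIC column, where the lower value indicates the preferred model, and it is on this basis that we use model 2.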
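For reference, AIC() called on two models returns a table with a df column (number of estimated parameters) and an AIC column. The sketch below uses the built-in mtcars data set, not the Walmart data, purely to show the layout and how the difference is read.

```r
# Two nested models on the built-in mtcars data (illustration only)
small <- lm(mpg ~ wt, data = mtcars)
large <- lm(mpg ~ wt + hp, data = mtcars)

tab <- AIC(small, large)
tab

# Difference in AIC: positive means the larger model is preferred
delta <- tab["small", "AIC"] - tab["large", "AIC"]
delta
```

A linear model with one predictor has df = 3 (intercept, slope and residual variance), which corresponds to the 3 reported for model1less.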

Conclusion analysis model 2

Overall, this second model offers much better performance, with more usable and accurate information that we can consider using in practice.

4.3. We will add year and month as control variables

We will create and add new control variables for year and month to our previous model.

Year=as.numeric(substring(TM1$Date,1,4))
Month=as.numeric(substring(TM1$Date,6,7))
TM1=data.frame(TM1,Year,Month)
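An alternative sketch that does not depend on character positions parses the date first and then extracts the components. It assumes the Date column is stored as text in the form YYYY-MM-DD, which is what the substring positions above imply; the sample dates are made up.

```r
# Hypothetical dates in the assumed YYYY-MM-DD format
dates <- c("2010-02-05", "2011-11-25", "2012-10-26")

parsed <- as.Date(dates, format = "%Y-%m-%d")
Year   <- as.integer(format(parsed, "%Y"))
Month  <- as.integer(format(parsed, "%m"))

data.frame(Date = dates, Year, Month)
```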

\[ \text{WeeklySales} = \beta_0 + \beta_1 \text{FuelPrice} + \beta_2 \text{Store} + \beta_3 (\text{FuelPrice} \times \text{Store}) + \beta_4 \text{IsHoliday} + \beta_5 \text{Temperature} + \beta_6 \text{Year} + \beta_7 \text{Month} + \beta_8 (\text{Month} \times \text{FuelPrice}) + e \]

model3 <- lm(Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price * as.factor(Store) + as.factor(IsHoliday.x) + Temperature + as.factor(Year) + as.factor(Month) + as.factor(Month) * Fuel_Price, data = TM1[-IDout, ])
summary(model3)
## 
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price * 
##     as.factor(Store) + as.factor(IsHoliday.x) + Temperature + 
##     as.factor(Year) + as.factor(Month) + as.factor(Month) * Fuel_Price, 
##     data = TM1[-IDout, ])
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -33158 -11872  -5668   4528 187540 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    23761.467   2595.273   9.156  < 2e-16 ***
## Fuel_Price                     -1282.400    809.462  -1.584 0.113135    
## as.factor(Store)2              11278.065   2240.989   5.033 4.84e-07 ***
## as.factor(Store)3             -13820.523   2313.646  -5.973 2.32e-09 ***
## as.factor(Store)4               2400.001   2267.247   1.059 0.289804    
## as.factor(Store)5             -14929.001   2317.996  -6.440 1.19e-10 ***
## as.factor(Store)6               4859.649   2242.481   2.167 0.030229 *  
## as.factor(Store)7             -12128.787   2270.957  -5.341 9.26e-08 ***
## as.factor(Store)8              -6551.648   2259.745  -2.899 0.003740 ** 
## as.factor(Store)9             -11592.825   2331.147  -4.973 6.59e-07 ***
## as.factor(Store)10              9362.953   2306.674   4.059 4.93e-05 ***
## as.factor(Store)11              1101.385   2252.260   0.489 0.624833    
## as.factor(Store)12             -4789.386   2355.870  -2.033 0.042057 *  
## as.factor(Store)13              3742.250   2354.617   1.589 0.111988    
## as.factor(Store)14             19168.518   2280.435   8.406  < 2e-16 ***
## as.factor(Store)15             -7317.209   2317.131  -3.158 0.001589 ** 
## as.factor(Store)16             -9995.114   2297.769  -4.350 1.36e-05 ***
## as.factor(Store)17             -9663.189   2399.525  -4.027 5.65e-05 ***
## as.factor(Store)18              2487.380   2270.140   1.096 0.273213    
## as.factor(Store)19              4972.305   2300.458   2.161 0.030662 *  
## as.factor(Store)20             11082.424   2271.627   4.879 1.07e-06 ***
## as.factor(Store)21             -6532.055   2279.388  -2.866 0.004161 ** 
## as.factor(Store)22             -2924.282   2276.248  -1.285 0.198900    
## as.factor(Store)23               685.176   2260.209   0.303 0.761778    
## as.factor(Store)24              2313.365   2296.928   1.007 0.313860    
## as.factor(Store)25             -7455.289   2295.699  -3.248 0.001164 ** 
## as.factor(Store)26             -3974.470   2274.180  -1.748 0.080525 .  
## as.factor(Store)27             12094.542   2295.612   5.269 1.38e-07 ***
## as.factor(Store)28              2148.588   2330.146   0.922 0.356486    
## as.factor(Store)29             -9845.675   2292.730  -4.294 1.75e-05 ***
## as.factor(Store)30             -8147.052   2470.835  -3.297 0.000976 ***
## as.factor(Store)31             -1599.473   2244.083  -0.713 0.476000    
## as.factor(Store)32             -4848.149   2243.930  -2.161 0.030730 *  
## as.factor(Store)33            -11802.783   2660.525  -4.436 9.16e-06 ***
## as.factor(Store)34             -6705.556   2266.432  -2.959 0.003090 ** 
## as.factor(Store)35              6662.011   2312.285   2.881 0.003963 ** 
## as.factor(Store)36             -2488.866   2506.356  -0.993 0.320700    
## as.factor(Store)37             -7217.576   2476.896  -2.914 0.003569 ** 
## as.factor(Store)38            -14806.921   2558.760  -5.787 7.18e-09 ***
## as.factor(Store)39             -5286.135   2259.548  -2.339 0.019312 *  
## as.factor(Store)40             -5055.266   2262.225  -2.235 0.025441 *  
## as.factor(Store)41             -6365.893   2254.713  -2.823 0.004752 ** 
## as.factor(Store)42             -7267.813   2593.135  -2.803 0.005068 ** 
## as.factor(Store)43             -4102.931   2509.892  -1.635 0.102112    
## as.factor(Store)44            -14543.667   2639.852  -5.509 3.61e-08 ***
## as.factor(Store)45             -6501.471   2305.406  -2.820 0.004801 ** 
## as.factor(IsHoliday.x)TRUE       -29.778    139.967  -0.213 0.831520    
## Temperature                        8.547      6.275   1.362 0.173172    
## as.factor(Year)2011             -157.133    212.168  -0.741 0.458932    
## as.factor(Year)2012              166.165    241.796   0.687 0.491949    
## as.factor(Month)2              -5169.688   2245.160  -2.303 0.021302 *  
## as.factor(Month)3              -6403.200   2205.832  -2.903 0.003698 ** 
## as.factor(Month)4              -6259.923   2201.361  -2.844 0.004460 ** 
## as.factor(Month)5              -5428.126   2247.790  -2.415 0.015741 *  
## as.factor(Month)6              -5239.259   2241.864  -2.337 0.019439 *  
## as.factor(Month)7              -5844.296   2249.132  -2.598 0.009364 ** 
## as.factor(Month)8              -5429.774   2254.089  -2.409 0.016003 *  
## as.factor(Month)9              -7043.490   2212.157  -3.184 0.001453 ** 
## as.factor(Month)10             -8501.320   2203.714  -3.858 0.000114 ***
## as.factor(Month)11             -6553.449   2506.638  -2.614 0.008938 ** 
## as.factor(Month)12             -4564.815   2690.722  -1.697 0.089792 .  
## Fuel_Price:as.factor(Store)2   -1966.896    690.063  -2.850 0.004368 ** 
## Fuel_Price:as.factor(Store)3    -475.509    712.552  -0.667 0.504560    
## Fuel_Price:as.factor(Store)4    1515.699    698.208   2.171 0.029944 *  
## Fuel_Price:as.factor(Store)5    -536.195    713.702  -0.751 0.452480    
## Fuel_Price:as.factor(Store)6   -1474.613    690.449  -2.136 0.032702 *  
## Fuel_Price:as.factor(Store)7    -318.598    696.475  -0.457 0.647353    
## Fuel_Price:as.factor(Store)8    -609.122    695.711  -0.876 0.381281    
## Fuel_Price:as.factor(Store)9    -412.554    717.384  -0.575 0.565237    
## Fuel_Price:as.factor(Store)10  -1566.178    674.055  -2.324 0.020152 *  
## Fuel_Price:as.factor(Store)11  -1129.725    693.276  -1.630 0.103198    
## Fuel_Price:as.factor(Store)12   -708.808    683.752  -1.037 0.299902    
## Fuel_Price:as.factor(Store)13    521.311    717.271   0.727 0.467350    
## Fuel_Price:as.factor(Store)14  -3809.057    681.410  -5.590 2.27e-08 ***
## Fuel_Price:as.factor(Store)15  -1575.106    675.805  -2.331 0.019769 *  
## Fuel_Price:as.factor(Store)16  -1137.535    702.873  -1.618 0.105575    
## Fuel_Price:as.factor(Store)17    307.331    729.123   0.422 0.673385    
## Fuel_Price:as.factor(Store)18  -2504.634    675.988  -3.705 0.000211 ***
## Fuel_Price:as.factor(Store)19  -1842.370    671.688  -2.743 0.006090 ** 
## Fuel_Price:as.factor(Store)20  -1048.599    678.662  -1.545 0.122323    
## Fuel_Price:as.factor(Store)21  -1229.304    701.970  -1.751 0.079908 .  
## Fuel_Price:as.factor(Store)22  -1100.223    678.016  -1.623 0.104652    
## Fuel_Price:as.factor(Store)23   -807.712    672.715  -1.201 0.229878    
## Fuel_Price:as.factor(Store)24  -1481.592    670.700  -2.209 0.027173 *  
## Fuel_Price:as.factor(Store)25  -1166.111    685.294  -1.702 0.088827 .  
## Fuel_Price:as.factor(Store)26   -922.178    676.082  -1.364 0.172567    
## Fuel_Price:as.factor(Store)27  -2610.858    670.541  -3.894 9.88e-05 ***
## Fuel_Price:as.factor(Store)28  -1557.564    677.118  -2.300 0.021433 *  
## Fuel_Price:as.factor(Store)29  -1117.610    682.533  -1.637 0.101538    
## Fuel_Price:as.factor(Store)30  -1487.100    760.036  -1.957 0.050393 .  
## Fuel_Price:as.factor(Store)31   -144.672    691.137  -0.209 0.834194    
## Fuel_Price:as.factor(Store)32   -138.054    688.604  -0.200 0.841103    
## Fuel_Price:as.factor(Store)33  -1285.543    765.933  -1.678 0.093270 .  
## Fuel_Price:as.factor(Store)34   -429.079    698.515  -0.614 0.539035    
## Fuel_Price:as.factor(Store)35  -4330.898    690.452  -6.273 3.56e-10 ***
## Fuel_Price:as.factor(Store)36  -3313.247    772.679  -4.288 1.80e-05 ***
## Fuel_Price:as.factor(Store)37  -1303.875    760.857  -1.714 0.086586 .  
## Fuel_Price:as.factor(Store)38     52.994    735.798   0.072 0.942584    
## Fuel_Price:as.factor(Store)39   1386.897    695.825   1.993 0.046244 *  
## Fuel_Price:as.factor(Store)40   -847.837    673.624  -1.259 0.208168    
## Fuel_Price:as.factor(Store)41    837.226    691.135   1.211 0.225751    
## Fuel_Price:as.factor(Store)42   -945.193    748.007  -1.264 0.206369    
## Fuel_Price:as.factor(Store)43  -1300.235    773.112  -1.682 0.092604 .  
## Fuel_Price:as.factor(Store)44   -330.521    801.477  -0.412 0.680054    
## Fuel_Price:as.factor(Store)45  -1070.102    688.369  -1.555 0.120055    
## Fuel_Price:as.factor(Month)2    2140.842    691.190   3.097 0.001953 ** 
## Fuel_Price:as.factor(Month)3    2319.906    669.767   3.464 0.000533 ***
## Fuel_Price:as.factor(Month)4    2324.423    664.759   3.497 0.000471 ***
## Fuel_Price:as.factor(Month)5    2108.884    673.662   3.130 0.001745 ** 
## Fuel_Price:as.factor(Month)6    2178.415    675.116   3.227 0.001252 ** 
## Fuel_Price:as.factor(Month)7    2216.807    681.005   3.255 0.001133 ** 
## Fuel_Price:as.factor(Month)8    2147.445    677.462   3.170 0.001525 ** 
## Fuel_Price:as.factor(Month)9    2363.397    666.744   3.545 0.000393 ***
## Fuel_Price:as.factor(Month)10   2875.057    667.687   4.306 1.66e-05 ***
## Fuel_Price:as.factor(Month)11   2805.872    767.831   3.654 0.000258 ***
## Fuel_Price:as.factor(Month)12   2908.426    828.412   3.511 0.000447 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21010 on 421291 degrees of freedom
## Multiple R-squared:  0.09791,    Adjusted R-squared:  0.09766 
## F-statistic: 397.6 on 115 and 421291 DF,  p-value: < 2.2e-16
Regression summary analysis model 3
P-value observed

We observe that the overall p-value is unchanged from model 2 (< 2.2e-16). This is good in theory, but for the reason explained earlier we cannot draw a conclusion about the global model from this p-value alone.

For the explanatory variable fuel price, we obtain a p-value of 0.113135, which is not significant. This is a weaker result than in the previous model, where the p-value of 0.095926 was significant at the 10% level.

In model 3, the interaction terms (Fuel_Price*Store) are significant at the 10% level for 20 of the 45 stores, which suggests that fuel price variation has a significant impact on weekly sales in those stores. In 18 of these 20 stores, weekly sales react negatively to an increase in fuel price.
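
To read an interaction coefficient as a store-specific effect, it can help to add it to the Fuel_Price main effect. The short calculation below is only a sketch. It assumes that the main effect is the estimate of -1282.400 that goes with the p-value of 0.113135, and that we are in the reference month, so no Fuel_Price:Month term applies. Store 1 is the reference store, so its slope is the main effect itself. The label "slope" is introduced here for illustration.

\[ \text{slope}_{14} = -1282.400 + (-3809.057) = -5091.457 \]

\[ \text{slope}_{4} = -1282.400 + 1515.699 = 233.299 \]

Under these assumptions, a one dollar increase in fuel price goes with a drop of about 5091 in weekly sales for store 14 and a rise of about 233 for store 4. Note that the significance stars on an interaction term test whether that store's slope differs from the slope of store 1, not whether it differs from zero.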

Residual Standard Error observed

We observe that the residual standard error decreases slightly from model 2 (22030) to model 3 (21010, as reported in the summary above), which is not a meaningful improvement. In other words, given that the mean weekly sales for all stores are taken to be 23761.467 and that the residual standard error is 21010, the relative error is 21010 / 23761.467, or about 88%: a typical prediction would still be off by roughly 88% of the mean, which again is not good.

R2 and Adj.R2 observed

For model 3, both the R2 and the adjusted R2 are about 0.098, which is close to 0. The model therefore explains only about 10% of the variance in weekly sales, which is very similar to model 2.

F-statistic observed

The F-statistic is 397.6, which is far larger than 1 and similar to the previous model. This again indicates that the explanatory variables, taken jointly, are related to weekly sales; on its own it does not isolate the effect of fuel price.

AIC test model 2 vs model 3
AIC(model2, model3)

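Considering the results of this AIC comparison: the figure reported for model 2 is 93 and for model 3 is 117. These figures match the number of estimated parameters (the df column of the AIC output) rather than the AIC values themselves, so they show that model 3 is only moderately more complex than model 2, a relatively small increase compared with the step from model 1 to model 2.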
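
For reference, the figure 117 is consistent with the way R counts parameters for a linear model. The block below is a sketch of that count, with k the number of estimated parameters and \hat{L} the maximised likelihood (both symbols are introduced here for illustration). The 115 comes from the F-statistic line of the model 3 summary, "on 115 and 421291 DF".

\[ AIC = 2k - 2\ln(\hat{L}) \]

\[ k = 115 \ (\text{slopes}) + 1 \ (\text{intercept}) + 1 \ (\text{residual variance}) = 117 \]

A lower AIC indicates a better trade-off between fit and complexity. Since the AIC values are not shown in the output, only the parameter counts can be compared here.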

Conclusion analysis model 3

Overall, this third model performs less well than model 2.

4.4. We will add control variables in order to obtain information on the characteristics of the stores that show a significant relation between weekly sales and fuel price.

We will investigate the characteristics of the stores that show a significant p-value for the interaction variable (Fuel_Price*Store).

We will create new categories

We will create 3 categories for small, medium and large stores. The stores already carry a classification A, B, C, but unfortunately we have no further information about what those categories mean.

First, we will look at the range of Walmart store sizes in order to choose sensible size categories. We create these categories to see whether there is a relation between the stores with a significant (weekly sales/fuel price) relation and the store size category.

hist(stores$Size)

print(max(stores$Size))
## [1] 219622
print(min(stores$Size))
## [1] 34875
# Create categories for store size (column referenced explicitly instead of attach/detach)
stores$Sizestore[stores$Size > 175000] <- "Large"
stores$Sizestore[stores$Size > 75000 & stores$Size <= 175000] <- "Medium"
stores$Sizestore[stores$Size <= 75000] <- "Smal"
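
The same three size bands can also be built in one step with base R's cut(). The sketch below uses made-up store sizes and illustrative labels; only the thresholds 75000 and 175000 come from the analysis. With the default right = TRUE, each band includes its upper bound, which matches the conditions used above.

```r
# Hypothetical store sizes chosen to sit on and around the two thresholds.
sizes <- c(34875, 75000, 75001, 175000, 175001, 219622)

# Bands: (-Inf, 75000], (75000, 175000], (175000, Inf)
size_cat <- cut(sizes,
                breaks = c(-Inf, 75000, 175000, Inf),
                labels = c("Small", "Medium", "Large"))

# Count of stores per band
table(size_cat)
```

cut() returns a factor, so the result can go into lm() without a further as.factor() call.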

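# Add the store characteristics to the TM1 data, matching on Store
newData=merge(TM1, stores,by.x="Store",by.y="Store")

# Extract year and month from the Date column (characters 1-4 and 6-7)
Year=as.numeric(substring(newData$Date,1,4))
Month=as.numeric(substring(newData$Date,6,7))
newData=data.frame(newData,Year,Month)

# Row indices of outliers: weekly sales above 200000
IDout=which(newData$Weekly_Sales>200000)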
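
One point to keep in mind with index-based removal: newData[-IDout,] only behaves as intended when IDout is not empty. The sketch below uses a toy data frame with made-up values (toy, id_out and the other names are hypothetical) to compare it with a logical filter, which gives the same rows and also works when no value exceeds the cutoff.

```r
# Toy data frame standing in for newData (values are made up).
toy <- data.frame(Weekly_Sales = c(1500, 250000, 42000, 199999))

# Index-based removal, as in the analysis: fine when at least one row matches.
id_out <- which(toy$Weekly_Sales > 200000)
kept_by_index <- toy[-id_out, , drop = FALSE]

# Logical filter: keeps the same rows here.
kept_by_filter <- toy[toy$Weekly_Sales <= 200000, , drop = FALSE]

# With no outliers, which() returns integer(0) and the negative index
# selects zero rows instead of keeping everything.
no_out <- which(toy$Weekly_Sales > 1e9)
rows_left <- nrow(toy[-no_out, , drop = FALSE])
rows_left
```

In the analysis itself the residuals go up to 187540, so some rows above the cutoff were presumably removed and the index-based form did what was intended.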

\[ Weekly Sales = \beta_0 + \beta_1 FuelPrice + \beta_2 Store + \beta_3 FuelPrice*Store + \beta_4 IsHoliday + \beta_5 Temperature + \beta_6 Year + \beta_7 Month + \beta_8 Month*FuelPrice + \beta_9 Type + \beta_{10} SizeStore + \beta_{11} Size + e \]

model4 = (lm(Weekly_Sales~Fuel_Price+as.factor(Store)+Fuel_Price*as.factor(Store)+as.factor(IsHoliday.x)+Temperature+as.factor(Year)+as.factor(Month)+as.factor(Month)*Fuel_Price+as.factor(as.numeric(Type))+as.factor(Sizestore)+as.factor(Size), data=newData[-IDout,], na.action=na.omit))
summary(model4)
## 
## Call:
## lm(formula = Weekly_Sales ~ Fuel_Price + as.factor(Store) + Fuel_Price * 
##     as.factor(Store) + as.factor(IsHoliday.x) + Temperature + 
##     as.factor(Year) + as.factor(Month) + as.factor(Month) * Fuel_Price + 
##     as.factor(as.numeric(Type)) + as.factor(Sizestore) + as.factor(Size), 
##     data = newData[-IDout, ], na.action = na.omit)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -33158 -11872  -5668   4528 187540 
## 
## Coefficients: (43 not defined because of singularities)
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    23761.467   2595.273   9.156  < 2e-16 ***
## Fuel_Price                     -1282.400    809.462  -1.584 0.113135    
## as.factor(Store)2              11278.065   2240.989   5.033 4.84e-07 ***
## as.factor(Store)3             -13820.523   2313.646  -5.973 2.32e-09 ***
## as.factor(Store)4               2400.001   2267.247   1.059 0.289804    
## as.factor(Store)5             -14929.001   2317.996  -6.440 1.19e-10 ***
## as.factor(Store)6               4859.649   2242.481   2.167 0.030229 *  
## as.factor(Store)7             -12128.787   2270.957  -5.341 9.26e-08 ***
## as.factor(Store)8              -6551.648   2259.745  -2.899 0.003740 ** 
## as.factor(Store)9             -11592.825   2331.147  -4.973 6.59e-07 ***
## as.factor(Store)10              9362.953   2306.674   4.059 4.93e-05 ***
## as.factor(Store)11              1101.385   2252.260   0.489 0.624833    
## as.factor(Store)12             -4789.386   2355.870  -2.033 0.042057 *  
## as.factor(Store)13              3742.250   2354.617   1.589 0.111988    
## as.factor(Store)14             19168.518   2280.435   8.406  < 2e-16 ***
## as.factor(Store)15             -7317.209   2317.131  -3.158 0.001589 ** 
## as.factor(Store)16             -9995.114   2297.769  -4.350 1.36e-05 ***
## as.factor(Store)17             -9663.189   2399.525  -4.027 5.65e-05 ***
## as.factor(Store)18              2487.380   2270.140   1.096 0.273213    
## as.factor(Store)19              4972.305   2300.458   2.161 0.030662 *  
## as.factor(Store)20             11082.424   2271.627   4.879 1.07e-06 ***
## as.factor(Store)21             -6532.055   2279.388  -2.866 0.004161 ** 
## as.factor(Store)22             -2924.282   2276.248  -1.285 0.198900    
## as.factor(Store)23               685.176   2260.209   0.303 0.761778    
## as.factor(Store)24              2313.365   2296.928   1.007 0.313860    
## as.factor(Store)25             -7455.289   2295.699  -3.248 0.001164 ** 
## as.factor(Store)26             -3974.470   2274.180  -1.748 0.080525 .  
## as.factor(Store)27             12094.542   2295.612   5.269 1.38e-07 ***
## as.factor(Store)28              2148.588   2330.146   0.922 0.356486    
## as.factor(Store)29             -9845.675   2292.730  -4.294 1.75e-05 ***
## as.factor(Store)30             -8147.052   2470.835  -3.297 0.000976 ***
## as.factor(Store)31             -1599.473   2244.083  -0.713 0.476000    
## as.factor(Store)32             -4848.149   2243.930  -2.161 0.030730 *  
## as.factor(Store)33            -11802.783   2660.525  -4.436 9.16e-06 ***
## as.factor(Store)34             -6705.556   2266.432  -2.959 0.003090 ** 
## as.factor(Store)35              6662.011   2312.285   2.881 0.003963 ** 
## as.factor(Store)36             -2488.866   2506.356  -0.993 0.320700    
## as.factor(Store)37             -7217.576   2476.896  -2.914 0.003569 ** 
## as.factor(Store)38            -14806.921   2558.760  -5.787 7.18e-09 ***
## as.factor(Store)39             -5286.135   2259.548  -2.339 0.019312 *  
## as.factor(Store)40             -5055.266   2262.225  -2.235 0.025441 *  
## as.factor(Store)41             -6365.893   2254.713  -2.823 0.004752 ** 
## as.factor(Store)42             -7267.813   2593.135  -2.803 0.005068 ** 
## as.factor(Store)43             -4102.931   2509.892  -1.635 0.102112    
## as.factor(Store)44            -14543.667   2639.852  -5.509 3.61e-08 ***
## as.factor(Store)45             -6501.471   2305.406  -2.820 0.004801 ** 
## as.factor(IsHoliday.x)TRUE       -29.778    139.967  -0.213 0.831520    
## Temperature                        8.547      6.275   1.362 0.173172    
## as.factor(Year)2011             -157.133    212.168  -0.741 0.458932    
## as.factor(Year)2012              166.165    241.796   0.687 0.491949    
## as.factor(Month)2              -5169.688   2245.160  -2.303 0.021302 *  
## as.factor(Month)3              -6403.200   2205.832  -2.903 0.003698 ** 
## as.factor(Month)4              -6259.923   2201.361  -2.844 0.004460 ** 
## as.factor(Month)5              -5428.126   2247.790  -2.415 0.015741 *  
## as.factor(Month)6              -5239.259   2241.864  -2.337 0.019439 *  
## as.factor(Month)7              -5844.296   2249.132  -2.598 0.009364 ** 
## as.factor(Month)8              -5429.774   2254.089  -2.409 0.016003 *  
## as.factor(Month)9              -7043.490   2212.157  -3.184 0.001453 ** 
## as.factor(Month)10             -8501.320   2203.714  -3.858 0.000114 ***
## as.factor(Month)11             -6553.449   2506.638  -2.614 0.008938 ** 
## as.factor(Month)12             -4564.815   2690.722  -1.697 0.089792 .  
## as.factor(as.numeric(Type))2          NA         NA      NA       NA    
## as.factor(as.numeric(Type))3          NA         NA      NA       NA    
## as.factor(Sizestore)Medium            NA         NA      NA       NA    
## as.factor(Sizestore)Smal              NA         NA      NA       NA    
## as.factor(Size)37392                  NA         NA      NA       NA    
## as.factor(Size)39690                  NA         NA      NA       NA    
## as.factor(Size)39910                  NA         NA      NA       NA    
## as.factor(Size)41062                  NA         NA      NA       NA    
## as.factor(Size)42988                  NA         NA      NA       NA    
## as.factor(Size)57197                  NA         NA      NA       NA    
## as.factor(Size)70713                  NA         NA      NA       NA    
## as.factor(Size)93188                  NA         NA      NA       NA    
## as.factor(Size)93638                  NA         NA      NA       NA    
## as.factor(Size)103681                 NA         NA      NA       NA    
## as.factor(Size)112238                 NA         NA      NA       NA    
## as.factor(Size)114533                 NA         NA      NA       NA    
## as.factor(Size)118221                 NA         NA      NA       NA    
## as.factor(Size)119557                 NA         NA      NA       NA    
## as.factor(Size)120653                 NA         NA      NA       NA    
## as.factor(Size)123737                 NA         NA      NA       NA    
## as.factor(Size)125833                 NA         NA      NA       NA    
## as.factor(Size)126512                 NA         NA      NA       NA    
## as.factor(Size)128107                 NA         NA      NA       NA    
## as.factor(Size)140167                 NA         NA      NA       NA    
## as.factor(Size)151315                 NA         NA      NA       NA    
## as.factor(Size)152513                 NA         NA      NA       NA    
## as.factor(Size)155078                 NA         NA      NA       NA    
## as.factor(Size)155083                 NA         NA      NA       NA    
## as.factor(Size)158114                 NA         NA      NA       NA    
## as.factor(Size)184109                 NA         NA      NA       NA    
## as.factor(Size)196321                 NA         NA      NA       NA    
## as.factor(Size)200898                 NA         NA      NA       NA    
## as.factor(Size)202307                 NA         NA      NA       NA    
## as.factor(Size)202505                 NA         NA      NA       NA    
## as.factor(Size)203007                 NA         NA      NA       NA    
## as.factor(Size)203742                 NA         NA      NA       NA    
## as.factor(Size)203750                 NA         NA      NA       NA    
## as.factor(Size)203819                 NA         NA      NA       NA    
## as.factor(Size)204184                 NA         NA      NA       NA    
## as.factor(Size)205863                 NA         NA      NA       NA    
## as.factor(Size)206302                 NA         NA      NA       NA    
## as.factor(Size)207499                 NA         NA      NA       NA    
## as.factor(Size)219622                 NA         NA      NA       NA    
## Fuel_Price:as.factor(Store)2   -1966.896    690.063  -2.850 0.004368 ** 
## Fuel_Price:as.factor(Store)3    -475.509    712.552  -0.667 0.504560    
## Fuel_Price:as.factor(Store)4    1515.699    698.208   2.171 0.029944 *  
## Fuel_Price:as.factor(Store)5    -536.195    713.702  -0.751 0.452480    
## Fuel_Price:as.factor(Store)6   -1474.613    690.449  -2.136 0.032702 *  
## Fuel_Price:as.factor(Store)7    -318.598    696.475  -0.457 0.647353    
## Fuel_Price:as.factor(Store)8    -609.122    695.711  -0.876 0.381281    
## Fuel_Price:as.factor(Store)9    -412.554    717.384  -0.575 0.565237    
## Fuel_Price:as.factor(Store)10  -1566.178    674.055  -2.324 0.020152 *  
## Fuel_Price:as.factor(Store)11  -1129.725    693.276  -1.630 0.103198    
## Fuel_Price:as.factor(Store)12   -708.808    683.752  -1.037 0.299902    
## Fuel_Price:as.factor(Store)13    521.311    717.271   0.727 0.467350    
## Fuel_Price:as.factor(Store)14  -3809.057    681.410  -5.590 2.27e-08 ***
## Fuel_Price:as.factor(Store)15  -1575.106    675.805  -2.331 0.019769 *  
## Fuel_Price:as.factor(Store)16  -1137.535    702.873  -1.618 0.105575    
## Fuel_Price:as.factor(Store)17    307.331    729.123   0.422 0.673385    
## Fuel_Price:as.factor(Store)18  -2504.634    675.988  -3.705 0.000211 ***
## Fuel_Price:as.factor(Store)19  -1842.370    671.688  -2.743 0.006090 ** 
## Fuel_Price:as.factor(Store)20  -1048.599    678.662  -1.545 0.122323    
## Fuel_Price:as.factor(Store)21  -1229.304    701.970  -1.751 0.079908 .  
## Fuel_Price:as.factor(Store)22  -1100.223    678.016  -1.623 0.104652    
## Fuel_Price:as.factor(Store)23   -807.712    672.715  -1.201 0.229878    
## Fuel_Price:as.factor(Store)24  -1481.592    670.700  -2.209 0.027173 *  
## Fuel_Price:as.factor(Store)25  -1166.111    685.294  -1.702 0.088827 .  
## Fuel_Price:as.factor(Store)26   -922.178    676.082  -1.364 0.172567    
## Fuel_Price:as.factor(Store)27  -2610.858    670.541  -3.894 9.88e-05 ***
## Fuel_Price:as.factor(Store)28  -1557.564    677.118  -2.300 0.021433 *  
## Fuel_Price:as.factor(Store)29  -1117.610    682.533  -1.637 0.101538    
## Fuel_Price:as.factor(Store)30  -1487.100    760.036  -1.957 0.050393 .  
## Fuel_Price:as.factor(Store)31   -144.672    691.137  -0.209 0.834194    
## Fuel_Price:as.factor(Store)32   -138.054    688.604  -0.200 0.841103    
## Fuel_Price:as.factor(Store)33  -1285.543    765.933  -1.678 0.093270 .  
## Fuel_Price:as.factor(Store)34   -429.079    698.515  -0.614 0.539035    
## Fuel_Price:as.factor(Store)35  -4330.898    690.452  -6.273 3.56e-10 ***
## Fuel_Price:as.factor(Store)36  -3313.247    772.679  -4.288 1.80e-05 ***
## Fuel_Price:as.factor(Store)37  -1303.875    760.857  -1.714 0.086586 .  
## Fuel_Price:as.factor(Store)38     52.994    735.798   0.072 0.942584    
## Fuel_Price:as.factor(Store)39   1386.897    695.825   1.993 0.046244 *  
## Fuel_Price:as.factor(Store)40   -847.837    673.624  -1.259 0.208168    
## Fuel_Price:as.factor(Store)41    837.226    691.135   1.211 0.225751    
## Fuel_Price:as.factor(Store)42   -945.193    748.007  -1.264 0.206369    
## Fuel_Price:as.factor(Store)43  -1300.235    773.112  -1.682 0.092604 .  
## Fuel_Price:as.factor(Store)44   -330.521    801.477  -0.412 0.680054    
## Fuel_Price:as.factor(Store)45  -1070.102    688.369  -1.555 0.120055    
## Fuel_Price:as.factor(Month)2    2140.842    691.190   3.097 0.001953 ** 
## Fuel_Price:as.factor(Month)3    2319.906    669.767   3.464 0.000533 ***
## Fuel_Price:as.factor(Month)4    2324.423    664.759   3.497 0.000471 ***
## Fuel_Price:as.factor(Month)5    2108.884    673.662   3.130 0.001745 ** 
## Fuel_Price:as.factor(Month)6    2178.415    675.116   3.227 0.001252 ** 
## Fuel_Price:as.factor(Month)7    2216.807    681.005   3.255 0.001133 ** 
## Fuel_Price:as.factor(Month)8    2147.445    677.462   3.170 0.001525 ** 
## Fuel_Price:as.factor(Month)9    2363.397    666.744   3.545 0.000393 ***
## Fuel_Price:as.factor(Month)10   2875.057    667.687   4.306 1.66e-05 ***
## Fuel_Price:as.factor(Month)11   2805.872    767.831   3.654 0.000258 ***
## Fuel_Price:as.factor(Month)12   2908.426    828.412   3.511 0.000447 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21010 on 421291 degrees of freedom
## Multiple R-squared:  0.09791,    Adjusted R-squared:  0.09766 
## F-statistic: 397.6 on 115 and 421291 DF,  p-value: < 2.2e-16
Regression summary analysis Model 4

Unfortunately, we obtained NA for all the newly introduced variables. After the collinearity test, we conclude that Size, Sizestore and Type are perfectly multicollinear with the store indicators in model 4: each of them takes a single value per store, so it is an exact linear combination of the as.factor(Store) dummies already in the model. As a result, model 4 cannot give us any additional information for our analysis.
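
The mechanism behind these NA coefficients can be reproduced on a small scale. The sketch below is not the Walmart data: it builds a toy data set (all names and values are hypothetical) in which a store-level variable is constant within each store, fits a model with both the store factor and that variable, and uses alias() from base R to show the exact linear dependency.

```r
# Toy data: 3 stores, 4 weeks each; size never varies within a store.
set.seed(1)
toy <- data.frame(
  store = factor(rep(1:3, each = 4)),
  size  = rep(c(50000, 120000, 200000), each = 4)
)
toy$sales <- 1000 + 0.01 * toy$size + rnorm(nrow(toy), sd = 50)

# The store dummies already capture every store-level difference,
# so lm() cannot estimate a separate coefficient for size.
fit <- lm(sales ~ store + size, data = toy)
coef(fit)

# alias() lists the terms that are exact linear combinations of the others.
alias(fit)$Complete
```

One way to study store characteristics would be to replace the store dummies with Size or Type instead of adding them alongside, at the cost of losing the store-specific intercepts.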

6 Global analysis

In this analysis, model 2 offers the best overall performance. This second model gives us specific information on stores 2, 4, 6, 10, 14, 15, 18, 19, 21, 24, 27, 28, 30, 35, 36, 37, 39 and 43, for which fuel price variation has a significant impact. For stores 4 and 39, weekly sales increase when the fuel price increases; for all the other stores, weekly sales decrease when the fuel price increases.

7 Conclusion

This analysis allowed us to reach our main goal, which was to obtain information on the relation between weekly sales and fuel price. It opens the door to further research on the impact of fuel price variation on Walmart consumer behaviour.