Data Collected From: https://www.kaggle.com/rutuspatel/walmart-dataset-retail?select=Walmart_Store_sales.csv
LinkedIn: https://www.linkedin.com/in/brandon-ly-1676b821a/
GitHub: https://github.com/BrandonSchoolPF/WalmartSalesAnalysis
Introduction
The data set shows weekly sales for different stores, along with whether the week contained a holiday, the temperature, the fuel price, the consumer price index, and the unemployment rate.
Below we can observe the first 6 rows of the data.
data <- read.csv("Walmart_Store_sales.csv")
head(data)
What we want to analyze is whether fuel prices, unemployment, and temperature had a statistically significant effect on how weekly sales performed.
Given the data available, we can use ANOVA to test whether these variables show a statistically significant relationship with sales.
Cleaning the Data
To focus on what we are testing, we first reduce the data set to only the values we want to see.
The columns we need are:
- Weekly Sales
- Temperature
- Fuel Price
- Unemployment
Below I will use dplyr to show only the data we need:
library(dplyr)
data <- data %>%
  select(Weekly_Sales, Temperature, Fuel_Price, Unemployment)
head(data)
I have excluded the date and store columns because we are only trying to see whether these variables had a significant effect on sales, not when or where sales were affected.
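Before fitting a model, it can help to confirm that the selected columns are present, numeric, and complete. The following is a minimal sketch of such a check, assuming a small made-up data frame named `toy` in place of the real file; the values are hypothetical, and whether the actual data set contains missing values is not established here.

```r
# Hypothetical rows standing in for the selected Walmart columns (values are made up)
toy <- data.frame(
  Weekly_Sales = c(1500000, 1620000, NA, 1410000),
  Temperature  = c(42.3, 38.5, 39.9, 46.6),
  Fuel_Price   = c(2.57, 2.55, 2.51, 2.56),
  Unemployment = c(8.1, 8.1, 8.1, 8.1)
)

needed <- c("Weekly_Sales", "Temperature", "Fuel_Price", "Unemployment")

# Confirm that every required column is present
missing_cols <- setdiff(needed, names(toy))

# Count missing values per column; under R's default na.action setting,
# aov() drops incomplete rows without an error
na_counts <- colSums(is.na(toy[needed]))

# Confirm that every column is numeric, as the model formula expects
all_numeric <- all(vapply(toy[needed], is.numeric, logical(1)))

# Keep only complete rows so the number of observations used is explicit
toy_complete <- toy[complete.cases(toy[needed]), needed]

print(na_counts)
print(nrow(toy_complete))
```

Running the same checks on the real `data` object would show whether any rows are dropped when the model is fitted.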
Data Analysis
We first state the null and alternative hypotheses.
- H0: Temperature, Fuel Price, and Unemployment Rate have no effect on weekly sales
- Ha: Temperature, Fuel Price, and Unemployment Rate have an effect on weekly sales
To test for significance, we use an ANOVA to obtain an F value and a p-value for each variable. We use ANOVA because we are comparing more than two explanatory variables against a single outcome variable. ANOVA splits the variance in weekly sales into the part explained by each variable and the part left unexplained (the residuals); each F value is the ratio of a variable's mean square to the residual mean square.
R provides the aov() function for running an ANOVA:
data_aov <- aov(Weekly_Sales ~ Temperature + Fuel_Price + Unemployment, data = data)
data_aov
Call:
   aov(formula = Weekly_Sales ~ Temperature + Fuel_Price + Unemployment,
    data = data)

Terms:
                 Temperature   Fuel_Price Unemployment    Residuals
Sum of Squares  8.344136e+12 7.331840e+11 2.025267e+13 2.019961e+15
Deg. of Freedom            1            1            1         6431

Residual standard error: 560444.1
Estimated effects may be unbalanced
summary(data_aov)
               Df    Sum Sq   Mean Sq F value   Pr(>F)
Temperature     1 8.344e+12 8.344e+12  26.565 2.62e-07 ***
Fuel_Price      1 7.332e+11 7.332e+11   2.334    0.127
Unemployment    1 2.025e+13 2.025e+13  64.479 1.15e-15 ***
Residuals    6431 2.020e+15 3.141e+11
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
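The F values and p-values in this table can be reproduced by hand from the sums of squares and degrees of freedom printed by `data_aov`. The sketch below copies those printed numbers into vectors (the variable names are my own) and applies the usual definitions: mean square = sum of squares / degrees of freedom, F = mean square / residual mean square, and the p-value is the upper tail of the F distribution. The results should agree with the table up to rounding.

```r
# Sums of squares and degrees of freedom copied from the printed data_aov output
sum_sq <- c(Temperature  = 8.344136e+12,
            Fuel_Price   = 7.331840e+11,
            Unemployment = 2.025267e+13)
term_df  <- c(Temperature = 1, Fuel_Price = 1, Unemployment = 1)
resid_ss <- 2.019961e+15
resid_df <- 6431

# Mean square for each term and for the residuals
mean_sq  <- sum_sq / term_df
resid_ms <- resid_ss / resid_df

# F value: how large each term's mean square is relative to the residual mean square
f_value <- mean_sq / resid_ms

# p-value: probability of an F value at least this large if the term had no effect
p_value <- pf(f_value, term_df, resid_df, lower.tail = FALSE)

print(round(f_value, 3))
print(signif(p_value, 3))
```

This makes it visible why Fuel_Price is not significant: its mean square is only a little more than twice the residual mean square, whereas the other two terms are many times larger.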
Conclusions
As we can see from the results, Temperature and Unemployment had very small p-values (2.62e-07 and 1.15e-15) and large F values (26.565 and 64.479). This means that each of them explains an amount of variance in weekly sales that is large relative to the residual variance. Since these p-values are lower than any conventional significance level (for example, 0.05), we can reject the null hypothesis for these two variables in favor of the alternative hypothesis.
Therefore, we can infer that the unemployment rate and temperature had some effect on weekly sales, while Fuel Price (F = 2.334, p = 0.127) showed no statistically significant effect on weekly sales.
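One point to keep in mind when reading these results is that summary() on an aov() fit reports sequential F tests: each term is tested after the terms listed before it in the formula, so the order can change the F values when the variables are correlated. The sketch below illustrates this with R's built-in mtcars data as a stand-in (it is not the Walmart data, and the model there is chosen only for illustration), and uses drop1() to get tests that do not depend on order.

```r
# Built-in mtcars data used purely as a stand-in for the Walmart columns
fit_ab <- aov(mpg ~ wt + hp, data = mtcars)
fit_ba <- aov(mpg ~ hp + wt, data = mtcars)

# Sequential F values: each term is tested after the terms listed before it
f_ab <- summary(fit_ab)[[1]][["F value"]]  # order: wt, hp, residuals (NA)
f_ba <- summary(fit_ba)[[1]][["F value"]]  # order: hp, wt, residuals (NA)

# Marginal F tests: each term is tested after all the others,
# so the result is the same whichever order the formula uses
marg_ab <- drop1(fit_ab, test = "F")
marg_ba <- drop1(fit_ba, test = "F")

print(c(wt_first = f_ab[1], wt_second = f_ba[2]))
print(marg_ab)
```

Applying drop1(data_aov, test = "F") to the model above would be one way to check whether the conclusions for Temperature, Fuel_Price, and Unemployment hold regardless of the order in which they were entered.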