Data Collected From: https://www.kaggle.com/rutuspatel/walmart-dataset-retail?select=Walmart_Store_sales.csv
Linkedin: https://www.linkedin.com/in/brandon-ly-1676b821a/
GitHub: https://github.com/BrandonSchoolPF/WalmartSalesAnalysis

Introduction


The data set records weekly sales for different Walmart stores, along with whether the week contained a holiday, the temperature, the fuel price, the consumer price index (CPI), and the unemployment rate.

These variables are of interest because we can reasonably assume that they affect weekly sales. For example, a rise in fuel price may reduce weekly sales, since customers who see higher gas prices may use their cars more sparingly, or not at all, and so make fewer trips to the store.

With this data set we can take a statistical approach to determine whether those variables have a significant effect on weekly sales.

Below we can observe the first 6 rows of the data.

data <- read.csv("Walmart_Store_sales.csv")
head(data)

What we want to analyze is whether fuel price, unemployment, and temperature had a significant effect on how weekly sales performed.

Given the data, we can use ANOVA to test statistically whether these variables had a significant effect on sales.



Cleaning the Data

To focus on what we are testing, we must reduce the data set to only the columns we want to analyze.

The columns we need are Weekly_Sales, Temperature, Fuel_Price, and Unemployment.

With dplyr, I can pick out the columns of interest using the select() function.

Below I use dplyr to keep only the data we need:

library(dplyr)
data <- data %>%
        select(Weekly_Sales, Temperature, Fuel_Price, Unemployment)
head(data)

As shown, only the columns of interest remain. I have excluded the date and store number because we are only trying to see how these variables relate to sales, not when the sales were affected.

Removing the other columns eliminates redundancy and drops data that is not relevant to the question we are trying to answer.
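
Before modelling, it can also be worth confirming that the selected columns contain no missing values. The following is a minimal sketch only: `toy` is a hypothetical stand-in for the real data, and its values are made up for illustration.

```r
# Hypothetical stand-in for the selected columns; all values are made up.
toy <- data.frame(
  Weekly_Sales = c(1500000, 1600000, NA, 1400000),
  Temperature  = c(42.3, 38.5, 39.9, 46.6),
  Fuel_Price   = c(2.57, 2.55, 2.51, NA),
  Unemployment = c(8.1, 8.1, 8.1, 8.1)
)

# Count the missing values in each column.
na_counts <- colSums(is.na(toy))
na_counts

# Keep only the rows with no missing values in any column.
toy_complete <- toy[complete.cases(toy), ]
nrow(toy_complete)
```

The same two calls could be applied to `data` after the select() step.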


Data Analysis

To test for significance, we use an ANOVA test, which gives us an F value and a P value for each variable. ANOVA suits this analysis because we are assessing more than one explanatory variable against a single outcome variable. The test partitions the variance in weekly sales into a portion attributed to each variable and a residual portion, and compares the two.

First, I graph each variable against weekly sales using ggplot2 and add a regression line to see whether the trend is positive or negative.

library(ggplot2)
# Scatter plot with a fitted linear regression line
temp_graph <- ggplot(data, aes(x = Weekly_Sales, y = Temperature)) + geom_point() + geom_smooth(method='lm', formula= y~x)
temp_graph

fuel_graph <- ggplot(data, aes(x = Weekly_Sales, y = Fuel_Price)) + geom_point() + geom_smooth(method='lm', formula= y~x)
fuel_graph

Unemployment_graph <- ggplot(data, aes(x = Weekly_Sales, y = Unemployment)) + geom_point() + geom_smooth(method='lm', formula= y~x)
Unemployment_graph

Using ggplot2, I plotted each variable against weekly sales and added a line fitted with a linear model (method = 'lm').

What we see from the graphs is that Temperature and Unemployment had a negative slope in the regression line, while fuel price did not show the same clear downward trend. We could predict from this that temperature and unemployment matter, in that the higher the temperature and unemployment rate, the lower the weekly sales. However, a visual trend alone is not a statistical result, so we move on to an ANOVA test.
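
The direction of a trend can also be read off numerically rather than by eye. The sketch below uses simulated data, not the Walmart data: the variable names, the seed, and the built-in negative relationship are all assumptions made for illustration. It fits a linear model with weekly sales as the response and extracts the slope and the correlation.

```r
set.seed(1)

# Simulated data (not the Walmart data): sales fall as unemployment rises.
unemployment <- runif(200, min = 4, max = 14)
weekly_sales <- 2e6 - 5e4 * unemployment + rnorm(200, sd = 1e5)
sim <- data.frame(Weekly_Sales = weekly_sales, Unemployment = unemployment)

# Fit a simple linear regression and pull out the slope.
fit   <- lm(Weekly_Sales ~ Unemployment, data = sim)
slope <- coef(fit)[["Unemployment"]]
slope  # negative for this simulated data

# The correlation coefficient has the same sign as the slope.
r <- cor(sim$Weekly_Sales, sim$Unemployment)
r
```

A negative slope here only describes the direction of the fitted line; whether it is statistically significant is a separate question.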

R allows us to use an ANOVA test with the aov() function:

data_aov <- aov(Weekly_Sales ~ Temperature + Fuel_Price + Unemployment, data = data)
data_aov
Call:
   aov(formula = Weekly_Sales ~ Temperature + Fuel_Price + Unemployment, 
    data = data)

Terms:
                 Temperature   Fuel_Price Unemployment    Residuals
Sum of Squares  8.344136e+12 7.331840e+11 2.025267e+13 2.019961e+15
Deg. of Freedom            1            1            1         6431

Residual standard error: 560444.1
Estimated effects may be unbalanced
summary(data_aov)
               Df    Sum Sq   Mean Sq F value   Pr(>F)    
Temperature     1 8.344e+12 8.344e+12  26.565 2.62e-07 ***
Fuel_Price      1 7.332e+11 7.332e+11   2.334    0.127    
Unemployment    1 2.025e+13 2.025e+13  64.479 1.15e-15 ***
Residuals    6431 2.020e+15 3.141e+11                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

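To make the table easier to read, the F values and P-values can be reproduced by hand from the sums of squares. This is a sketch using only the numbers printed in the output above; the variable names (`ss`, `ss_res`, `df_res`) are my own. Each F value is the term's mean square divided by the residual mean square, and the P-value is the upper tail of the F distribution.

```r
# Sums of squares and degrees of freedom copied from the aov() output.
ss <- c(Temperature  = 8.344136e12,
        Fuel_Price   = 7.331840e11,
        Unemployment = 2.025267e13)
ss_res <- 2.019961e15
df_res <- 6431

# Residual mean square: residual sum of squares over its degrees of freedom.
ms_res <- ss_res / df_res

# Each term has 1 degree of freedom, so its mean square equals its sum of squares.
f_value <- (ss / 1) / ms_res

# P-value: probability of an F at least this large under the null hypothesis.
p_value <- pf(f_value, df1 = 1, df2 = df_res, lower.tail = FALSE)

round(f_value, 3)
signif(p_value, 3)
```
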
Conclusions

Like a t-test, ANOVA lets us judge significance by comparing a P-value to our significance level. However, whereas a standard t-test compares the means of two groups, ANOVA lets us assess two or more explanatory variables against the outcome in a single test.

For our analysis, we first state the null and alternative hypotheses. The test results then tell us whether to reject or fail to reject the null hypothesis.

H0: Temperature, fuel price, and unemployment rate cause no change in weekly sales.
Ha: Temperature, fuel price, and unemployment rate cause a change in weekly sales.

As the results show, Temperature and Unemployment each had a large F value and a correspondingly small P-value (2.62e-07 and 1.15e-15). This means that each accounts for far more of the variance in weekly sales than we would expect by chance. Since these P-values are below a significance level of 0.05 (assumed here as the conventional threshold), we can reject the null hypothesis for Temperature and Unemployment in favor of the alternative hypothesis. Fuel Price, with a P-value of 0.127, does not meet that threshold, so we fail to reject the null hypothesis for it.
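
The decision rule can be written out explicitly. In this short sketch the P-values are copied from the summary() output, while `alpha = 0.05` is an assumed significance level, since the analysis does not state one.

```r
# P-values copied from the summary(data_aov) output.
p_values <- c(Temperature = 2.62e-07, Fuel_Price = 0.127, Unemployment = 1.15e-15)

# Assumed significance level (a conventional choice, not stated in the analysis).
alpha <- 0.05

# TRUE means the null hypothesis is rejected for that variable.
reject_null <- p_values < alpha
reject_null
```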

In conclusion, our analysis suggests that unemployment rate and temperature were associated with weekly sales, while Fuel Price (F = 2.334, P = 0.127) showed no significant association. This agrees with the graphs at the beginning of our analysis, where the regression lines for temperature and unemployment had a negative slope, indicating that higher values of each tended to go with lower weekly sales.
