In this section, we will be using Tableau to learn concepts on data outliers, seasonality effect, and the relationships and impacts. There is no R coding in this lab session.
This worksheet will be used to capture your images from Tableau and to share your observations. Example of capturing and including an image is included at the end of this sheet for your reference. You will need to log onto Tableau and Connect/Import the file EuroStore.xls found in the ‘bsad_lab10’ folder.
Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read carefully the below and follow the instructions to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.
For your assignment you may be using different data sets than what is included here. Always read carefully the instructions on Sakai. Tasks/questions to be completed/answered are highlighted in larger bolded fonts and numbered according to their particular placement in the task section.
First get familiar with the data and what each columns represent. A description of the data is provided in a seperate sheet called ‘Desc’ in the same Excel file. Refer to Lab05 for early exercise using Tableau.
In a new Tableau sheet
This plot represents EuroStore sales data based on weeks. Sales are relatively consistent, ranging between 40,000 to 55,000 Euros. However, around the 23rd through 25th week, sales dropped significantly below the lower limits for the remaining weeks. From this plot alone, the data for the 23rd to 25th week is abnormal when compared to the rest of the weeks. By looking at the excel spreadsheet, filtering the data, and putting it in order by weeks, we find that weeks 23-25 only have one data point, where every other week has two. Since this plot looks at the sum of sales, by only having one data set, weeks 23-25 only have the sum of the one data point, instead of the two that every other week has. This is reason for the significant decrease in sales for the roughly three week period of time.
Now that the data looks at average sales, instead of the sum of sales, weeks 23-25 no longer show a significant drop in sales. Sales now show a generally increasing trend between weeks 1-30, and a decreasing trend between weeks 30-52. The highest sales are achieved around weeks 25-33 (with week 30 being the highest at around 26,500 Euros), and the lowest are around scattered between weeks 1-4 and 37-52 (with week 3 being the lowest at around 19,700 Euros). The data roughly resembles a bell curve. Weeks 23-24 are still shown as a drop in sales, when compared to immediately preceding and following data. However, the sales for weeks 23-24 are no longer the lowest sales out of the entire data set, which is what we inaccurately saw in the 1A). The average sales scale has also been adjusted to show sales between 19,000 and 28,000 Euros. This condensed scale shows the weekly ebb and flow better than if the scale began at 0 Euros.
The combined behavior of sales and temperatures shows us that, generally, higher sales occur during warmer weeks and lower sales occur during colder weeks. If we assume week 1 is the first week of January, it seems that warmer temperatures begin around week 20, which would be June, and continues throughout week 40, which would be around August. We would be seeing higher sales around late May-early September, which is consistent with late spring through early fall seasons. Sales begin to drop after the summer months, well into the next year. More people probably like to drive when it’s warmer, which fits with this data trend because less people would buy snacks/drinks/etc. at the store itself if they are not getting gas. ———-
In a seperate Tableau sheet
TV does not seem to be a largely important contirbuting factor to sales. Most of the weekly data points are found at 0 GRP units, meaning 0% of the target audience is reached (but this may be because there were no TV ads being shown). There were still sales being made without TV ads. When tv ads reached audienced, the points were clustered aound 40 GRP units, 90-95 GRP units, and roughly around 200-220 GRP units. More importantly, data points with higher GRP units (200-220 GRP), where the audience was reached repeatedly, did not result in proportionally higher sales. This means that even if customers see tv ads, they won’t necessarily buy more at the stores. This relationship between sales and tv is weak and inconsistent. It appears that Sales around 90-95 GRP units are a good middle ground, because Sales for those weeks are mostly higher than sales at GRP units of around 40 and more consistent than those at 200-220 GRP. Based on this, I would establish the upper limit amount to be invested in TV at the dollar amount corresponding to 90-95 GRP units, since GRP units are not the exact Euro value spent.
Radio is similar to TV, because there is no clear and distinct correlation between higher sales with increaed spending on radio ads. Many radio GRP data points above 100 units are found around the median of sales and some points are reached with less than 100 GRP radio units. However, more radio points at or above GRP units of 100 are found at the higher sales points, showing that radio may be slighlty better in correlating with sales than TV. Most of the sales weeks above the rough median of 24,000 Euros have radio GRP units of 100, meaning the audience was reached by radio ads and likely had some contribution to the higher sales.
In a separate Tableau sheet
There appears to be a positive correlation trend in fuel volume and sales, where higher sales are correlated to higher fuel volumes, or fuel sold. This would make sense because if people are buying gas, they are more likely to go into the store to buy other items (either because it is convenient or they have to go inside to pay for the gas anyway).
By adding temperature, we see that higher temperature data points are clustered around the data showing higher fuel volume and higher sales in the upper-right hand side of the plot. Temperature appears to have a higher correlation to sales, as well. The warmer temperatures correlate to higher sales and fuel sold. In the warmer months, more people are driving and buying fuel, which also means they are more likely to buy items from the store.
Given that 1 indicates a holiday during the week and 0 does not, we can see that the data points with higher sales, temperature, and fuel sold, also tend to be positively correlated to a holiday event. More people likely drive around, or have time to buy fuel and shop, during holidays when it is warm outside.
In a separate sheet
##### 2G) Write a small paragraph summarizing your final conclusions on what you think most affect Sales and under what conditions.
I believe that holidays, fuel volume, and temperature affect sales the most, and therefore have the highest correlation with sales. We see that the best conditions for high sales occur when there is a holiday during a warm week of the year and there is more fuel being sold. If it is a warm holiday, people are driving more and needing to gas. This is the best scenario for sales at the store. TV and Radio do not have a significant correlation to sales based on the analysis in Task 2. I think gas stations are more of a necessity, and ads are probably not as influential in getting people to want to come to the store if they don’t need gas, or if it is not convenient for them to travel to the store.