About
In this lab, we will use Tableau to explore data outliers, seasonality effects, and relationships and impacts among variables. There is no R coding in this lab session.
Setup
This worksheet is used to capture your images from Tableau and to share your observations. An example of capturing and including an image is provided at the end of this sheet for your reference. You will need to log on to Tableau and connect to (import) the file EuroStore.xls found in the ‘bsad_lab10’ folder.
Remember to always set your working directory to the source file location: go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Carefully read the instructions below and follow them to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.
Task 1: Data Outliers and Seasonality Effect
First, get familiar with the data and what each column represents. A description of the data is provided in a separate sheet called ‘Desc’ in the same Excel file. Refer to Lab05 for an earlier exercise using Tableau.
In a new Tableau sheet
1A) Plot Sales (Rows) versus Week (Columns). Include a snapshot here. Analyse the data source and explain in clear words the behavior you observe.
img1_path <- "imgs/plot_salesVSweek.png"
knitr::include_graphics(img1_path)
Answer: Sales drop sharply during weeks 23 to 25. The data source shows why: the sheet tracks two years of data, but for weeks 23 to 25 only one year of sales is recorded. Because the plot uses SUM(Sales), these weeks add up one year instead of two, so the graph shows lower values than it should and misrepresents the data.
1B) Switch from SUM(Sales) to Average AVG(Sales). Change the Sales scale to be more reflective of the data. Include a snapshot here. Explain the new behavior relative to 1A).
img2_path <- "imgs/plot_avgsalesVSweek.png"
knitr::include_graphics(img2_path)

Answer: Compared to 1A, this graph better represents the data across the two years. Sales are now averaged, and the Avg Sales axis has been rescaled, so the view shows more variation than the flatter plot in 1A. With averaging, the weeks that have only one recorded year fit the rest of the graph better. In the rescaled view, sales appear highest from weeks 20 to 32, with a slight dip in week 23.
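The difference between 1A and 1B can also be reproduced outside Tableau. The sketch below is optional and illustrative only: the data frame `toy`, its column names, and its sales values are made-up assumptions, not taken from EuroStore.xls. It shows why a week recorded in only one year looks like a drop under SUM but not under AVG.

```r
# Hypothetical toy data (NOT from EuroStore.xls):
# two years of weekly sales, with week 23 missing in year 2.
toy <- data.frame(
  year  = c(1, 1, 1, 2, 2),
  week  = c(22, 23, 24, 22, 24),
  sales = c(100, 110, 120, 104, 118)
)

# SUM per week: week 23 has a single observation,
# so its total looks like a sharp drop.
sum_by_week <- aggregate(sales ~ week, data = toy, FUN = sum)

# AVG per week: dividing by the number of observations
# removes the artificial drop.
avg_by_week <- aggregate(sales ~ week, data = toy, FUN = mean)

print(sum_by_week)
print(avg_by_week)
```

In this toy example the summed value for week 23 is roughly half that of its neighbors, while the averaged value sits between them, which mirrors the behavior seen when switching from SUM(Sales) to AVG(Sales).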
1C) Add Temp to the Color scale found in Marks. Change SUM(Temp) to AVG(Temp). Edit the color legend to be more reflective of hot and cold temperatures. Include a snapshot here. Explain the combined behavior of sales and temperature.
img3_path <- "imgs/temp_color_scale.png"
knitr::include_graphics(img3_path)

Answer: After adding temperature to the graph, I could see that sales usually increase when the temperature increases. The middle weeks of the year are the summer: they form the tallest part of the graph, and their red color indicates heat. The colder months have the lowest sales.
Task 2: Relationships and Impacts
In a separate Tableau sheet
2A) Plot Sales (Rows) versus TV (Columns). Switch both measures from SUM() to Dimension. The plot should look more like a scatter plot. Include a snapshot here. Explain the behavior of Sales versus TV. How much do you think is the upper limit that should be invested in TV ads?
img4_path <- "imgs/scatter_salesVStv.png"
knitr::include_graphics(img4_path)

Answer: This graph shows that sales rise as spending on TV advertisements increases, up to about 90,000 dollars. Beyond the 90,000 dollar mark, additional advertising produces little further increase in sales, while spending less than 90,000 dollars is associated with lower sales. The upper limit for TV investment therefore appears to be about 90,000 dollars.
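One way to check a plateau like this numerically is to compare the slope of Sales against TV spend below and above the suspected threshold. The sketch below is optional and uses made-up values: the vectors `tv` and `sales` and the variable `threshold` are assumptions for illustration, not figures from EuroStore.xls.

```r
# Hypothetical toy values (NOT from EuroStore.xls).
tv    <- c(30, 50, 70, 90, 110, 130, 150) * 1000
sales <- c(10, 14, 18, 22, 22.5, 23, 23.5)

# Assumed threshold, taken from the visual reading of the scatter plot.
threshold <- 90000
below <- tv <= threshold

# Fit a straight line on each side and keep only the slope coefficient.
slope_below <- unname(coef(lm(sales[below] ~ tv[below]))[2])
slope_above <- unname(coef(lm(sales[!below] ~ tv[!below]))[2])

# A much smaller slope above the threshold suggests diminishing returns.
print(c(below = slope_below, above = slope_above))
```

With real data the two slopes would be noisier, so this comparison is only a rough complement to the scatter plot, not a replacement for it.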
2B) Overlay Radio on the previous plot using the Size scale found in Marks. Include a snapshot here. Explain how adding Radio ads to TV ads impacts Sales.
img5_path <- "imgs/radio_overlay_previous.png"
knitr::include_graphics(img5_path)

Answer: With radio advertising added to the graph, the relationship between the two variables becomes visible. The trend is similar to that of TV and sales: when more money is spent on TV and radio ads, sales increase, but spending 90,000 or more has little additional impact.
2C) Plot Sales versus Fuel Volume. Explain behavior.
img6_path <- "imgs/salesVSfuelvolume.png"
knitr::include_graphics(img6_path)

Answer: This graph shows that fuel volume and sales are positively correlated: sales tend to be higher when fuel volume is higher. Correlation does not necessarily imply causation, although a causal link is possible.
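The strength of a relationship like this can be summarized with a Pearson correlation coefficient. The sketch below is optional and uses made-up numbers: the vectors `fuel_volume` and `sales` are hypothetical and do not come from EuroStore.xls.

```r
# Hypothetical toy values (NOT from EuroStore.xls).
fuel_volume <- c(50, 60, 65, 80, 90)
sales       <- c(20, 24, 25, 33, 36)

# Pearson correlation: values near +1 indicate a strong
# positive linear association, values near 0 indicate none.
r <- cor(fuel_volume, sales)
print(r)
```

A coefficient close to +1 only describes how tightly the points follow a line; it says nothing about which variable, if either, drives the other.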
2D) Overlay Temperature using the Color scale. Follow 1C) for temperature settings. Explain the new combined behavior and the impact of temperature.
img7_path <- "imgs/overlay_temp_scatter.png"
knitr::include_graphics(img7_path)

Answer: Sales and fuel volume both increase in the summer, when temperatures are warmer. When it is colder, sales and volume are lower.
2E) Overlay Holiday using the Label scale. Include a snapshot here. Explain the new combined behavior and the impact of Holiday.
img8_path <- "imgs/overlay_holiday_scatter.png"
knitr::include_graphics(img8_path)

Answer: In this graph, holiday versus non-holiday status appears to have no effect on sales or fuel volume. Points labeled 1 appear next to both warm and cold days, and the same is true of points labeled 0.
2F) Use a Tree Map to best show the combined effect of Sales, Fuel Volume, Temp, and Holiday. A sample view is shown below. Consider using the Quick Filter on Holiday and Temp to isolate and better view the impact of each. You can have more than one filter at a time. Include a snapshot here.
img9_path <- "imgs/treemap_1.png"
knitr::include_graphics(img9_path)

2G) Write a short paragraph summarizing your final conclusions on what you think most affects Sales and under what conditions.
Answer: After reviewing all of the graphs, sales appear to be most affected by temperature and by TV and radio advertising. The business should advertise heavily during the summer, when temperatures are usually warmer, because the data clearly show that this is the best period for sales. The tree map shows that the hotter weeks, in red, have higher sales. Advertising during the summer months therefore appears to be the best option, since holiday versus non-holiday status makes little difference. The business does not need to spend more money or time trying to increase sales in the other months.
