For this lab, we will be using Tableau to learn some basic concepts in visual analytics to identify data outliers, seasonality effects, relationships and impacts. There is no R coding in this lab session.
This worksheet will be used to capture your images from Tableau and to share your observations. To refresh your memory, an example of capturing and including an image is included at the end of this sheet for your reference. You will need to log onto Tableau and Connect/Import the Excel file EuroStore.xls found in the data folder under the ‘bsad-lab10’ folder.
Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read carefully the below and follow the instructions to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.
For your assignment you may be using different data sets than what is included here. Always read carefully the instructions on Sakai. For clarity, tasks/questions to be completed/answered are highlighted in red color (visible in preview) and numbered according to their particular placement in the task section. Quite often you will need to add your own code chunk.
Execute all code chunks, preview, publish, and submit link on Sakai.
First get familiar with the data and what each column represent. A description of the data can be found in the Excel file in a seperate sheet called ‘Desc’. Refer to early lab exercise on how to use Tableau. Also check the quick Tableau get started guide published on Sakai. Note that this time you will be reading an Excel file of type xls (unlike csv) into Tableau. Once the file is read into Tableau you will need to select the Data sheet for your work.
In a new Tableau sheet
Drag the measure Sales and drop into Rows. Similarly drag the dimension Week and drop into Columns.
##### 1A) Plot Sales (Rows) versus Week (Columns). Include a snapshot here. Analyse the data source and explain the behavior you observe.
It can be seen from the tableau plot that throughout the weeks in the data, the sales climbs from about $42000 to $51000 in week 21. However, from week 22 trough week 26, sales take a large plunge and climbs back up to about $53,000 in week 30. From here, sales is pretty volatile, and ends at about $42000 for the year at week 52.
Pull the drop-down menu by clicking the down arrow visible once you hover with the mouse over the variable name in the Rows or Columns field.
##### 1B) Switch from SUM(Sales) to Average AVG(Sales). Doubleclick on the Sales axis and change scale to be more representative of the data range. This should provide a better view of the data. Include a snapshot here. Explain, both qualitatively and quantitatively, how the switch to average impacted the previously observed behavior in 1A).
This switch to average sales from the sum of sales impacts both the numbers plotted on the graph and the graph visual itself. Quantitatively, it now takes the average sales amounts per week across the year. This can help the end user better see trends via average rather than a lot of ups and downs in a sum of sales view. Qualitatively, the plot is more smoothed out now when using the average sales. This is important becasue It allows us to see that the drop in sales in week 22 and increase back up to sales in week 26 (which was a sharp drop in sum of sales view) is not that bad as it looks when looking at week averages of sales. We can see that the Averages are actually lower at other weeks of the year, and not at the percieved drop as seen in the sum view. So, with this view, we are able to see that the average actually increased, for the most part, over the year, only to decrease towards the end. (Instead of thinking that sales dropped in the middle of the year and increase towards the end in the sum view)
##### 1C) Drag the measure Temp to Color found in Marks. Change SUM(Temp) to AVG(Temp). Edit the colors in the newly created color legend to be more representative of hot and cold temperatures. Include a snapshot here. Explain the combined behavior of Sales, Week, and Temp.
While looking at this new plot with average temerature highlighting average sales throughout the year, it can be seen that averge sales are higher in the summer months (weeks 20-35 roughly) and lower in the cold months (week 1-19, 36-52 roughly).
In a seperate Tableau sheet
##### 2A) Drop the measure Sales into Rows and the measure TV into Columns. Switch both measures from SUM() to Dimension using the drop-down menu. The plot should now look like a scatter plot. Include a snapshot here. Explain the overall behavior of Sales versus TV. Ignore the values at zero. Can you identify an upper threshold for the amount of investment in TV ads beyond which any purchase of additional TV ads would have little or no impact on Sales?
When looking at the graphic, and ignoring the values at zero, it can be seen that, generally, there is a positive correlation between total sales and the TV volume of target audience reached. There is, however, what seems to be an upper threshold amount of investment in TV ads beyond which any purchase of additional ads will have little to no impact on sales.
It seems that this threshold might begin around the 80-100 mark of TV. This is because Sales are at what seems to be a peak relative to other data points. In comparison, if you have the most amount of TV, 220, there are only little, if any, increases to sales. With this in mind, the threshold she be taken into consideration for cost minimizing process while still maximizing sales.
##### 2B) Drag the measure Radio to Size found in Marks. Include a snapshot of the plot here. Explain what impact the addition of Radio ads to TV ads is having on Sales.
By adding the measure Radio to the size of marks on the previous graph, we are able to gain further insight that Sales is indeed positively correlated to both tv and radio ads. We can also note that tv ads and radio ads also have a positive correlation. This is because higher sales points with tv tend to have larger circles of Radio ads, meaning there is more radio ads as well. It should be noted though that the highest tv and radio points have higher sales. However, going back to the conclusion above, it is only a marginal increase from the range around 80-100 Tv ads.
In a separate Tableau sheet
##### 2C) Create a scatter plot of Sales versus Fuel Volume. Explain behavior. No need for a snapshot here.
With this graphic, we can see a clear positive relationship between Sales and fuel volume. That is, as more fuel volume is sold, sales increase as well.
##### 2D) Drag the measure Temp to Color found in Marks. Follow 1C) to edit the color legend for Temp. Explain the added impact of temperature on Sales. No need for a snapshot here.
We gain more insight with the temperature addition in marks because it reveals that more sales, and corresponding fuel volume sold, occur when it is warmer than cold. This is because there are more of the red marks (hot temperature) higher up on the graphic as sales and fuel sold is high.
##### 2E) Drag the measure Holiday to Label found in Marks. Include a snapshot here. Explain the new combined impact of Temp, Holiday, and Fuel Volume on Sales.
With this new mark of a holiday occuring during the week, it is more clear that sales, and fuel volume sold, are highest when there is a holidy during the week. Correspondingly, there are more marks of not having a holidy during the week with low sales and fuel volume sold marks on the lower end of the graphic.
In a separate Tableau sheet
##### 2F) Create a Tree Map to show the combined effect of Sales, Temp, Holiday, and Fuel Volume. Use the Show Me menu to select a treemap. Make sure Sales is represented by Size, Temp by Color, Holiday by Label, and Fuel Volume by Label. A sample treemap view is shown below. You may want to view in Presentation mode for more clarity. Consider using a Quick Filter on Holiday and Temp to isolate and better assess the impact of each variable. To create a filter simply hover with the mouse over the measure and select Show Filter from the drop-down menu. Include a snapshot here.
##### 2G) Write a small coincise paragraph (data story) summarizing your findings on how you think the different measures work together or independently, and under what conditions, to affect Sales.
In using this treemap, we are able to see at a big picture level the assumptions that we had previously made about sales, fuel volume sold, temperature, and holidays. To begin our analysis, we can see that when the temperature is highest, and there is a holiday during the week, sales and fuel volumes sold are highest. Correspondingly, sales and fuel volume sold are lowest when the temperature is lower and there is no holiday during the week.
In summary, it is clear that temperature and holiday impact, although perhaps temperature does more, sales and fuel volume sold throughout the year. We saw this throughout the lab, but we saw it all come together with the treemap.
Example treemap for representation from the Professor :