For this lab, we will be using Tableau to learn some basic concepts in visual analytics to identify data outliers, seasonality effects, relationships and impacts. There is no R coding in this lab session.
This worksheet will be used to capture your images from Tableau and to share your observations. To refresh your memory, an example of capturing and including an image is included at the end of this sheet for your reference. You will need to log onto Tableau and Connect/Import the Excel file EuroStore.xls found in the data folder under the ‘bsad-lab10’ folder.
Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read carefully the below and follow the instructions to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.
For your assignment you may be using different data sets than what is included here. Always read carefully the instructions on Sakai. For clarity, tasks/questions to be completed/answered are highlighted in red color (visible in preview) and numbered according to their particular placement in the task section. Quite often you will need to add your own code chunk.
Execute all code chunks, preview, publish, and submit link on Sakai.
First get familiar with the data and what each column represent. A description of the data can be found in the Excel file in a seperate sheet called ‘Desc’. Refer to early lab exercise on how to use Tableau. Also check the quick Tableau get started guide published on Sakai. Note that this time you will be reading an Excel file of type xls (unlike csv) into Tableau. Once the file is read into Tableau you will need to select the Data sheet for your work.
In a new Tableau sheet
Drag the measure Sales and drop into Rows. Similarly drag the dimension Week and drop into Columns.
##### 1A) Plot Sales (Rows) versus Week (Columns). Include a snapshot here. Analyse the data source and explain the behavior you observe.
Sales vs. Week
The data source contains data about week, sales, TV, radio, fuel volume, temperature, and holiday. Within this graph, the data shows a relatively high and stable level of sales (as a sum of sales) except between weeks 20 and 25 where sales dip significantly.
Pull the drop-down menu by clicking the down arrow visible once you hover with the mouse over the variable name in the Rows or Columns field.
##### 1B) Switch from SUM(Sales) to Average AVG(Sales). Doubleclick on the Sales axis and change scale to be more representative of the data range. This should provide a better view of the data. Include a snapshot here. Explain, both qualitatively and quantitatively, how the switch to average impacted the previously observed behavior in 1A).
More Representative View of Sales Vs. Week
Quantitatively, the switch to average sales brought the sales numbers down to a range between 19k - 27k rather than the 20k - 55k range that was shown in the previous graph. It also caused the graph to invert. Qualitatively, we can see that average sales actually exhibited a general increase through week 30 and average sales increased and decreased regularly throughout the 52 weeks. Additionally, when the axis scale changed, the axis numbers changed to begin at 19k rather than 0, which allowed for a more accurate and detailed representation of the data.
##### 1C) Drag the measure Temp to Color found in Marks. Change SUM(Temp) to AVG(Temp). Edit the colors in the newly created color legend to be more representative of hot and cold temperatures. Include a snapshot here. Explain the combined behavior of Sales, Week, and Temp.
Sales vs. Week filtered by temperature
The graph above shows that sales peaked when temperature was at its highest during week 30 and sales were at their lowest during the coldest weeks. As temperature increased up to week 20, sales also increased. Then, when temperatures began to decrease after week 30, sales decreased as well.
In a seperate Tableau sheet
##### 2A) Drop the measure Sales into Rows and the measure TV into Columns. Switch both measures from SUM() to Dimension using the drop-down menu. The plot should now look like a scatter plot. Include a snapshot here. Explain the overall behavior of Sales versus TV. Ignore the values at zero. Can you identify an upper threshold for the amount of investment in TV ads beyond which any purchase of additional TV ads would have little or no impact on Sales?
Sales vs. TV
The scatter plot shows that as TV increased from 40 to 100 sales also increased. However, once TV surpassed the value of 100 sales no longer showed any significant increases. Therefore, investment in TV ads should not exceed 100k.
##### 2B) Drag the measure Radio to Size found in Marks. Include a snapshot of the plot here. Explain what impact the addition of Radio ads to TV ads is having on Sales.
Sales vs. TV and Radio
THe addition of radio ads to TV ads appears to have no significant impact on sales. Compared to the graph of Sales v. TV, this graph contains very similar patterns. Increased investment in Radio ads creates no additional sales.
In a separate Tableau sheet
##### 2C) Create a scatter plot of Sales versus Fuel Volume. Explain behavior. No need for a snapshot here.
In the Sales versus Fuel Volume scatter plot sales have an increasing trend as fuel volume increases up to 67k.
##### 2D) Drag the measure Temp to Color found in Marks. Follow 1C) to edit the color legend for Temp. Explain the added impact of temperature on Sales. No need for a snapshot here.
The added temperature measure shows that sales tended to increase as temperature increased. When temperature was at its highest fuel volume reached 64k-66k. So we can infer that sales, fuel volume, and temperature all increased simultaneously for the most part.
##### 2E) Drag the measure Holiday to Label found in Marks. Include a snapshot here. Explain the new combined impact of Temp, Holiday, and Fuel Volume on Sales.
Impact on Sales by Temp, Holiday, and Fuel Volume
Similar to the previous graph, the above graph shows that the peak of both sales and temperature occured on holidays. This leads to the conclusion that many holidays fall on the hottest days of the year, and that sales are highest on those days. It is shown that the highest sales on holidays with high temperatures and fuel volumes between 63k-67k.
In a separate Tableau sheet
##### 2F) Create a Tree Map to show the combined effect of Sales, Temp, Holiday, and Fuel Volume. Use the Show Me menu to select a treemap. Make sure Sales is represented by Size, Temp by Color, Holiday by Label, and Fuel Volume by Label. A sample treemap view is shown below. You may want to view in Presentation mode for more clarity. Consider using a Quick Filter on Holiday and Temp to isolate and better assess the impact of each variable. To create a filter simply hover with the mouse over the measure and select Show Filter from the drop-down menu. Include a snapshot here.
Tree Map of Sales, Temp, Holiday, and Fuel Volume
##### 2G) Write a small coincise paragraph (data story) summarizing your findings on how you think the different measures work together or independently, and under what conditions, to affect Sales.
Analyzing the EuroStore data shows how the independent variables, week, TV ads, radio ads, fuel volume, temperature, and holiday affect the dependent variable, sales. As shown in the graphs above, sales increase as TV ads, temperature, and fuel volume increase (all together and indepedently), up to a certain threshold. Sales also tended to increase when a holiday occured. The only independent varibale that had seemingly no impact on sales was radio ads.