For this lab, we will be using Tableau to learn some basic concepts in visual analytics to identify data outliers, seasonality effects, relationships and impacts. There is no R coding in this lab session.
This worksheet will be used to capture your images from Tableau and to share your observations. To refresh your memory, an example of capturing and including an image is included at the end of this sheet for your reference. You will need to log onto Tableau and Connect/Import the Excel file EuroStore.xls found in the data folder under the ‘bsad-lab10’ folder.
Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read carefully the below and follow the instructions to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.
For your assignment you may be using different data sets than what is included here. Always read carefully the instructions on Sakai. For clarity, tasks/questions to be completed/answered are highlighted in red color (visible in preview) and numbered according to their particular placement in the task section. Quite often you will need to add your own code chunk.
Execute all code chunks, preview, publish, and submit link on Sakai.
First get familiar with the data and what each column represent. A description of the data can be found in the Excel file in a seperate sheet called ‘Desc’. Refer to early lab exercise on how to use Tableau. Also check the quick Tableau get started guide published on Sakai. Note that this time you will be reading an Excel file of type xls (unlike csv) into Tableau. Once the file is read into Tableau you will need to select the Data sheet for your work.
In a new Tableau sheet
Drag the measure Sales and drop into Rows. Similarly drag the dimension Week and drop into Columns.
##### 1A) Plot Sales (Rows) versus Week (Columns). Include a snapshot here. Analyse the data source and explain the behavior you observe.
#Throughout the range of weeks within the graph the sales stayed consistenyl in the $40,000 to $50,000 range. However, there is a massive exception around week 22-26 where the sales dropped to under $25,000.
Pull the drop-down menu by clicking the down arrow visible once you hover with the mouse over the variable name in the Rows or Columns field.
##### 1B) Switch from SUM(Sales) to Average AVG(Sales). Doubleclick on the Sales axis and change scale to be more representative of the data range. This should provide a better view of the data. Include a snapshot here. Explain, both qualitatively and quantitatively, how the switch to average impacted the previously observed behavior in 1A).
#Quantitively the results of average sales versus the week seem to be different than the sum of sales. The numbers in the beginning weeks and last weeks are lower than the sales in the middle weeks which does not match up with the previously observed behavior. Qualitavely, the graph does not show the extreme outlier in the 22-26 week timeframe that was shown in the sum of sales graph. There is a small downward spike in that timeframe, however, it is not as extreme and is still actually above the averages in the first weeks and last weeks.
##### 1C) Drag the measure Temp to Color found in Marks. Change SUM(Temp) to AVG(Temp). Edit the colors in the newly created color legend to be more representative of hot and cold temperatures. Include a snapshot here. Explain the combined behavior of Sales, Week, and Temp.
#After looking at the graph it seems that average sales is the highest in the middle weeks. The red in the graph shows that this time period is also the warmest weeks. This would mean that average sales is highest when it is warm and will decline once the temperature starts getting colder.
In a seperate Tableau sheet
##### 2A) Drop the measure Sales into Rows and the measure TV into Columns. Switch both measures from SUM() to Dimension using the drop-down menu. The plot should now look like a scatter plot. Include a snapshot here. Explain the overall behavior of Sales versus TV. Ignore the values at zero. Can you identify an upper threshold for the amount of investment in TV ads beyond which any purchase of additional TV ads would have little or no impact on Sales?
#There is an upper threshold and that happens right around 70 tv ads. The sales, on average, get much higher when there are 70 tv ads. However, once the graph hits this point the sales do no increase anymore so the effect of adding more than 70 ads would be a waste.
##### 2B) Drag the measure Radio to Size found in Marks. Include a snapshot of the plot here. Explain what impact the addition of Radio ads to TV ads is having on Sales.
#It seems as though the bigger bubbles (size representing more radio ads) are mostly on the far right of the graph. However, it does not seem to be affecting sales at all other than the upper threshold that was created by the tv ads in the 2A graph. I would assume that they see a flat line of sales after a certain amount of tv ads so they try to increase radio ads as well but does not have a large effect on the sales.
In a separate Tableau sheet
##### 2C) Create a scatter plot of Sales versus Fuel Volume. Explain behavior. No need for a snapshot here.
#After making the scatter plot sales versus Fuel Volume instead of Sales versus tv it seems there is a stronger correlation. You are able to see a more consistent increase of sales as fuel volume increases. This stronger correlated graph represents a direct relationship between the two variables.
##### 2D) Drag the measure Temp to Color found in Marks. Follow 1C) to edit the color legend for Temp. Explain the added impact of temperature on Sales. No need for a snapshot here.
#Now that the temperature is added into the marks section you can see another correlation. There seems to be a lot more red circles in the upper right corner of the graph and more blue circles in the bottom left. This means that all three of these variables are directly related to each other. As one goes up the other two will also be trending upward.
##### 2E) Drag the measure Holiday to Label found in Marks. Include a snapshot here. Explain the new combined impact of Temp, Holiday, and Fuel Volume on Sales.
#Temp,Holiday and Fuel Volume all seem to have a positive relationship with sales. This is shown by the temperature, fuel volume and amount of data points being holidays increases as the sales are increasing. Any increase of these variables should increase the sales.
In a separate Tableau sheet
##### 2F) Create a Tree Map to show the combined effect of Sales, Temp, Holiday, and Fuel Volume. Use the Show Me menu to select a treemap. Make sure Sales is represented by Size, Temp by Color, Holiday by Label, and Fuel Volume by Label. A sample treemap view is shown below. You may want to view in Presentation mode for more clarity. Consider using a Quick Filter on Holiday and Temp to isolate and better assess the impact of each variable. To create a filter simply hover with the mouse over the measure and select Show Filter from the drop-down menu. Include a snapshot here.
##### 2G) Write a small coincise paragraph (data story) summarizing your findings on how you think the different measures work together or independently, and under what conditions, to affect Sales.
#After looking at the treemap there is no doubt that all these variables are increasing sales. All the larger boxes on the left side of the map all have the highest temperatures and fuel volumes with the most holidays. I would say all these variables are working together for the most part to all increase sales. There are a couple data points showing larges sales that is not a holiday or has a cooler temperature but the other two variables are still high so they pick up the slack. All these variables seem to be the key for increasing sales.