Data 624 - Advanced Exploration and Visualization in Health

Instructor: Zahra Shakeri - Winter 2021

Handout #1 Introduction to Visualization Tools

The lollipop chart is a composite chart with bars and circles. It is a variant of the bar chart with a circle at the end to highlight the data value. Similar to a bar chart, a lollipop chart is used to compare categorical data.

Because it combines two kinds of marks, this composite chart gives us more visual elements with which to convey information.

Implementation in Python

The following code snippet shows how to make a lollipop plot with Matplotlib. To customize the chart, you can modify its three main parts: the stem, the markers, and the baseline. For all three components, we first build the plot with the stem() function and then customize it with the plt.setp() function.

library(reticulate)
use_python("/Users/zahrashakeri/opt/anaconda3/bin/python/")
# library
import matplotlib.pyplot as plt
import numpy as np
 
# create data
values = np.random.uniform(size=30)
 
# plot with no marker
plt.stem(values, markerfmt=' ')
## <StemContainer object of 3 artists>
## 
## <string>:1: UserWarning: In Matplotlib 3.3 individual lines on a stem plot will be added as a LineCollection instead of individual lines. This significantly improves the performance of a stem plot. To remove this warning and switch to the new behaviour, set the "use_line_collection" keyword argument to True.
plt.show()
 
# change the colour, shape, size and edges of the markers

(markers, stemlines, baseline) = plt.stem(values)
plt.setp(markers, marker='D', markersize=5, markeredgecolor="orange", markeredgewidth=3)

# The stems can be customized as well. All the features that you can apply to a line plot can be applied to the stems: colour, width, type, transparency, etc.
#plt.setp(stemlines, linestyle="-", color="olive", linewidth=0.5 )
## [None, None, None, None]
plt.show()
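For categorical data, a horizontal lollipop is often easier to read than the index-based stem plot above. The following is a minimal sketch, assuming a small made-up dictionary of category totals (the names and numbers are placeholders, not the Alberta data). The helper names `sorted_items` and `plot_lollipop` are hypothetical. The stems are drawn with `hlines` and the heads with `plot` and a circle marker.

```python
def sorted_items(totals):
    """Return (category, value) pairs sorted by value, smallest first."""
    return sorted(totals.items(), key=lambda item: item[1])


def plot_lollipop(totals):
    """Draw a horizontal lollipop chart from a {category: value} mapping."""
    # Imported here so the sorting helper can be used without Matplotlib.
    import matplotlib.pyplot as plt

    items = sorted_items(totals)
    labels = [name for name, _ in items]
    values = [value for _, value in items]
    positions = list(range(len(items)))

    fig, ax = plt.subplots()
    # Stems: one thin horizontal line per category, starting at zero.
    ax.hlines(y=positions, xmin=0, xmax=values, color="olive", linewidth=1)
    # Heads: a circle at the end of each stem.
    ax.plot(values, positions, "o", color="orange")
    ax.set_yticks(positions)
    ax.set_yticklabels(labels)
    ax.set_xlabel("Total")
    return fig, ax


# Placeholder data for illustration only.
example_totals = {"Cause A": 120, "Cause B": 340, "Cause C": 215}
```

Calling `plot_lollipop(example_totals)` and then `plt.show()` should display the chart with the largest category at the top, because the smallest value is placed at position 0.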

Lollipop in R - ggpubr library: Publication-Ready Plots

About the dataset: A ranking of the ten most common causes of death each year in Alberta, by the total number of deaths [Last updated: August 26, 2016].

First, load and prepare the data in order to create ordered bar plots. Change the fill colour by the grouping variable “Year”. In the example below, sorting is done globally, not within groups. Any problems here?

library(ggpubr)
library(ggplot2)
setwd("/Users/zahrashakeri/Library/Mobile Documents/com~apple~CloudDocs/Teaching/2021/Data 624/Lectures/Lecture 01/Handout#1")


# load the data
df <- read.csv("deaths-leading-causes.csv")

# Convert the year variable to a factor
df$Year <- as.factor(df$Year)


ggbarplot(df, x ='Cause', y = 'Total',
          fill = "Year",              # change fill colour by 'Year'
          color = "white",            # Set bar border colours to white
          palette = c("#39CDAA", "#6D6FA4", "#AF2184"), # Custom colour palette
          sort.val = "desc",          # Sort the value in descending order
          sort.by.groups = FALSE,     # Don't sort inside each group
          x.text.angle = 90           # Rotate vertically x axis texts
)

Sort bars inside each group. Use the argument sort.by.groups = TRUE.

ggbarplot(df, x ='Cause', y = 'Total',
    fill = "Year",              # change fill colour by 'Year'
    color = "white",            # Set bar border colours to white
    palette = c("#39CDAA", "#6D6FA4", "#AF2184"), # Custom colour palette
    sort.val = "desc",          # Sort the value in descending order
    sort.by.groups = TRUE,      # Sort inside each group
    x.text.angle = 90           # Rotate vertically x axis texts
)
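The difference between the two orderings can be illustrated outside of R. The following Python sketch is only an approximation of the idea, not ggpubr's implementation. The rows are made-up placeholders, and `sort_globally` and `sort_within_groups` are hypothetical names.

```python
# Placeholder rows (cause, year, total); the values are made up for illustration.
rows = [
    ("Cause A", "2001", 50),
    ("Cause B", "2001", 90),
    ("Cause A", "2016", 70),
    ("Cause B", "2016", 30),
]


def sort_globally(rows):
    """Sort all rows by total, largest first, ignoring the year."""
    return sorted(rows, key=lambda r: r[2], reverse=True)


def sort_within_groups(rows):
    """Sort by year first, then by total (largest first) inside each year."""
    return sorted(rows, key=lambda r: (r[1], -r[2]))


global_order = [(cause, year) for cause, year, _ in sort_globally(rows)]
grouped_order = [(cause, year) for cause, year, _ in sort_within_groups(rows)]
print(global_order)
print(grouped_order)
```

With global sorting, bars from different years are interleaved, which makes it hard to compare causes within a single year. Sorting within groups keeps each year's bars together.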

A lollipop chart is an alternative to a bar plot when you have a large set of (close) values to visualize.

In the following example, we draw a lollipop chart coloured by the grouping variable Year. Depending on the number of categories (i.e. Causes) you are trying to display and the range of the axis, it can help to label the dots with their values so that the differences between them are easier to read.

ggdotchart(df, x ='Cause', y = 'Total',
           color = "Year",                                # Color by groups
           palette = c("#39CDAA", "#6D6FA4", "#AF2184"),  # Custom colour palette
           sorting = "descending",                        # Sort value in descending order
           add = "segments",                              # Add segments from y = 0 to dots
           #add.params = list(color="Year", size=1),
           rotate = FALSE,                                # Do not rotate the plot (set TRUE to rotate)
           group = "Year",                                # Order by groups
           dot.size = 10,                                 # Large dot size
           label = round(df$Total),                       # Add Total values as dot labels
           font.label = list(color = "white", size = 10, 
                             vjust = 0.5),                # Adjust label parameters
           ggtheme = theme_pubr()                         # ggplot2 theme
           )

  • Sort in descending order with sorting = "descending".
  • To rotate the plot, set rotate = TRUE (the example above uses rotate = FALSE).
  • Sort the Total values inside each group with group = "Year".
  • Set dot.size to 10 to make the circles bigger.
  • Add the Total values as labels with label = "Total" or label = round(df$Total).
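The options listed above can be approximated in Matplotlib. The sketch below is an illustration under stated assumptions, not an equivalent of ggdotchart. The helper names `prepare_dots` and `plot_dots` are hypothetical, and the input rows are expected as (cause, year, total) tuples. The palette is the one used in the R example.

```python
PALETTE = ["#39CDAA", "#6D6FA4", "#AF2184"]  # same colours as the ggpubr example


def prepare_dots(rows, palette=PALETTE):
    """Order rows by year, then by descending total, and attach a colour and label."""
    years = sorted({year for _, year, _ in rows})
    colour_of = {year: palette[i % len(palette)] for i, year in enumerate(years)}
    ordered = sorted(rows, key=lambda r: (r[1], -r[2]))
    return [
        {"x": i, "cause": cause, "year": year, "total": total,
         "label": str(round(total)), "colour": colour_of[year]}
        for i, (cause, year, total) in enumerate(ordered)
    ]


def plot_dots(rows):
    """Draw a vertical lollipop chart with the rounded total written inside each dot."""
    # Imported here so prepare_dots can be used without Matplotlib.
    import matplotlib.pyplot as plt

    dots = prepare_dots(rows)
    fig, ax = plt.subplots()
    for d in dots:
        ax.vlines(d["x"], 0, d["total"], color=d["colour"], linewidth=1)      # segment
        ax.plot(d["x"], d["total"], "o", markersize=20, color=d["colour"])   # large dot
        ax.text(d["x"], d["total"], d["label"], color="white",
                fontsize=8, ha="center", va="center")                        # label
    ax.set_xticks([d["x"] for d in dots])
    ax.set_xticklabels([f'{d["cause"]} ({d["year"]})' for d in dots], rotation=90)
    ax.set_ylabel("Total")
    return fig, ax
```

Separating the data preparation from the drawing makes it easy to check the ordering and labels before plotting anything.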

Tableau

Introduction

The first thing you see after you open Tableau Desktop is the start page. Here, you select the connector (how you will connect to your data) that you want to use.

  • Under Open (Number 1), you can open workbooks that you have already created.

  • Under Sample Workbooks (Number 2), view sample dashboards and worksheets that come with Tableau Desktop.

  • Under Connect (Number 3), you can:

    • Connect to data that is stored in a file, such as Microsoft Excel, PDF, Spatial files, and more.
    • Connect to data that is stored on a server, such as Tableau Server, Microsoft SQL Server, Google Analytics, and more.
    • Connect to a data source that you have connected to before.
  • Under Discover (Number 4), find additional resources like video tutorials, forums, or the “Viz of the week” to get ideas about what you can build.

  • Click the Tableau icon (Number 5) in the upper left-hand corner of any page to toggle between the start page and the authoring workspace.

What are dimensions and measures? Dimensions are categorical fields; in this case, fields such as Cause and Year. These are the fields by which we want to slice and dice our numerical data. Dimensions are often discrete. Discrete fields create labels in the chart and are colour-coded blue in the data pane and in the view. Measures, on the other hand, are our metrics: the numbers we want to analyze. Measures are often continuous. Continuous fields create axes in the chart, and their pills are colour-coded green.
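The distinction can be made concrete with a rough heuristic in Python. This is only a sketch for intuition and is not how Tableau assigns roles. The function `classify_fields` and its `force_dimensions` parameter are hypothetical, and the records are placeholders. Note that Year is numeric but is used as a category, just as the R example converts it to a factor.

```python
def classify_fields(rows, force_dimensions=()):
    """Split field names into (dimensions, measures).

    A field is treated as a measure when every value is numeric, unless it
    is listed in force_dimensions (e.g. a year used as a category).
    """
    dimensions, measures = [], []
    for field in rows[0]:
        values = [row[field] for row in rows]
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if numeric and field not in force_dimensions:
            measures.append(field)
        else:
            dimensions.append(field)
    return dimensions, measures


# Placeholder records for illustration only.
records = [
    {"Cause": "Cause A", "Year": 2001, "Total": 50},
    {"Cause": "Cause B", "Year": 2016, "Total": 30},
]
print(classify_fields(records, force_dimensions={"Year"}))
```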

Lollipop Chart

The lollipop chart in Tableau is a composite chart with bars and circles. Like a bar chart, a lollipop chart is used to compare categorical data.

  1. We start with a bar chart:

    • Drag “Cause” onto the Rows shelf.
    • Drag “Total” onto the Columns shelf.
    • Drag “Total” again to the top edge of the view until Tableau shows a dashed line, then release the mouse button. A dual axis is created automatically.

  2. Check Synchronize Axis to synchronize the two axes.

    • Since the two axes are synchronized, we can hide the secondary axis by unchecking Show Header.

  3. Tableau may automatically convert both marks to Circle, so we need to convert the first axis back to Bar:

    • Open the first Marks card and choose Bar as the mark type.
    • Expand the Size shelf and thin the bars so that the chart looks more like a lollipop.

  4. To better compare the main causes of death across years, we add diverging colours to the bars, the circles, or both.

    • Drag “Year” onto Colour on both Marks cards.

  5. Edit the colours by expanding the Colour pane and choosing “Edit Color”.

  6. Add well-formatted total labels on the lollipops:

    • Drag “Total” onto Label on the second Marks card.
    • Expand the Label option and edit the font. Change the label colour to white or black, set the font size to 8, and make it bold.
    • Expand the Alignment pane and choose Middle Center alignment.

Aided by this lollipop chart, we can easily see that “All other forms of chronic ischemic heart disease” had the highest number of deaths in 2001 and 2007, and that “Organic dementia” was the leading cause of death in Alberta in 2016.
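The same reading of the chart, the leading cause in each year, can be checked numerically. The following Python sketch assumes rows of (cause, year, total) tuples. The function name `top_cause_per_year` is hypothetical and the data are made-up placeholders, not the Alberta figures.

```python
def top_cause_per_year(rows):
    """Return {year: (cause, total)} for the largest total in each year."""
    best = {}
    for cause, year, total in rows:
        if year not in best or total > best[year][1]:
            best[year] = (cause, total)
    return best


# Placeholder rows for illustration only.
sample_rows = [
    ("Cause A", "2001", 50),
    ("Cause B", "2001", 90),
    ("Cause A", "2016", 70),
    ("Cause B", "2016", 30),
]
print(top_cause_per_year(sample_rows))
```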

Wrapping Up

In this guide, we learned about one of the composite charts available in Tableau, the lollipop chart. Lollipop charts are an excellent alternative to bar charts and dot plots: when visualizing data across many categories, they provide a clean, minimalistic view of the data. First, we introduced the concept and characteristics of a lollipop chart. Then we walked through the basic process of creating one in Python, R, and Tableau. Finally, we enhanced the chart with diverging colours, sorting, and labels.


Did you know?

"Infographics are often confused with data visualizations. It is commonly thought that these two are completely different - but they are not. Both are visual representations of data.

An important difference is that a data visualization is just one (i.e. a map, graph, chart or diagram), while an infographic often contains multiple data visualizations. A second key difference is that infographics contain additional elements like narrative and graphics. Besides that, more work tends to go into the design of infographics, to make them more impactful and aesthetically pleasing." Check here for a detailed comparison.

In this course, we will learn some infographics design techniques (in addition to the tools mentioned in the course outline), using Inkscape and Infographics Lab tools.