Data Visualization with Plotly: Histograms, Box Plots, Pie Charts, and Bar Charts

Illya Mowerman, Ph.D.

Introduction

This presentation explains how to create and interpret four types of charts using Plotly in R: histograms, box plots, pie charts, and bar charts.

We’ll use the mtcars dataset as our primary example.

The mtcars Dataset

The mtcars dataset contains fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Understanding Histograms

Histograms visualize the distribution of a continuous variable by grouping its values into bins and counting the observations in each bin.

Key features:

Histogram: Miles Per Gallon (mpg)

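library(plotly)  # provides plot_ly(), layout(), and the %>% pipe

# Histogram of mpg; Plotly chooses the bins automatically
plot_ly(mtcars, x = ~mpg, type = "histogram") %>%
  layout(title = "Distribution of Miles Per Gallon",
         xaxis = list(title = "Miles Per Gallon"),
         yaxis = list(title = "Count"))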

Interpreting the MPG Histogram
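
One way to read the histogram is to look at the counts behind the bars. The sketch below is a minimal base R illustration, not part of the original slides: it bins mpg with hist() and plot = FALSE, then compares the mean with the median. It assumes hist()'s default breaks, which may differ from the bins Plotly picks, and the names mpg_bins, bin_table, and mpg_center are made up for this example.

```r
# Bin the mpg values without drawing anything (base R, no extra packages).
# Assumption: hist()'s default breaks; Plotly may choose different bins.
mpg_bins <- hist(mtcars$mpg, plot = FALSE)

# One row per bin: lower edge, upper edge, and number of cars in the bin.
bin_table <- data.frame(
  from  = head(mpg_bins$breaks, -1),
  to    = tail(mpg_bins$breaks, -1),
  count = mpg_bins$counts
)
print(bin_table)

# A mean above the median hints at a longer right tail.
mpg_center <- c(mean = mean(mtcars$mpg), median = median(mtcars$mpg))
print(mpg_center)
```

The tallest rows of bin_table show where most cars sit, and the gap between mean and median gives a rough sense of skew.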

Understanding Box Plots

Box plots summarize the distribution of a variable through its quartiles: the box spans the first to third quartile, and a line inside the box marks the median.

Components:

Single-Variable Box Plot: MPG

plot_ly(mtcars, y = ~mpg, type = "box") %>%
  layout(title = "Distribution of Miles Per Gallon",
         yaxis = list(title = "Miles Per Gallon"))

Interpreting the MPG Box Plot
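
The values the box plot draws can be computed directly. The following sketch uses base R's quantile() with its default method (type 7) and the conventional 1.5 × IQR fences. It is an illustration under those assumptions: Plotly computes quartiles with its own method, so the hinges and any flagged points in the chart may differ slightly. The names mpg_q, mpg_iqr, fences, and outside are hypothetical.

```r
# Five-number summary of mpg using R's default quantile method (type 7).
# Assumption: Plotly's own quartile method may give slightly different hinges.
mpg_q <- quantile(mtcars$mpg, probs = c(0, 0.25, 0.5, 0.75, 1))
print(mpg_q)

# Interquartile range and the conventional 1.5 * IQR fences.
mpg_iqr <- unname(mpg_q["75%"] - mpg_q["25%"])
fences <- c(lower = unname(mpg_q["25%"]) - 1.5 * mpg_iqr,
            upper = unname(mpg_q["75%"]) + 1.5 * mpg_iqr)
print(fences)

# Cars outside the fences are the ones a box plot draws as separate points.
outside <- mtcars$mpg < fences[["lower"]] | mtcars$mpg > fences[["upper"]]
print(rownames(mtcars)[outside])
```

Whether a car counts as an outlier depends on the quartile method, so treat the result as a guide to what the chart shows rather than an exact match.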

Comparative Box Plots

Box plots are excellent for comparing distributions across groups.

Box Plot: MPG by Number of Cylinders

plot_ly(mtcars, x = ~as.factor(cyl), y = ~mpg, type = "box") %>%
  layout(title = "Miles Per Gallon by Number of Cylinders",
         xaxis = list(title = "Number of Cylinders"),
         yaxis = list(title = "Miles Per Gallon"))

Interpreting MPG by Cylinders Box Plot
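
To back up a visual comparison of the three boxes, the quartiles can be tabulated per group. This is a small base R sketch, again assuming quantile()'s default method rather than Plotly's; mpg_by_cyl and group_summary are names introduced only for this example.

```r
# Split mpg into one vector per cylinder count (4, 6, 8).
mpg_by_cyl <- split(mtcars$mpg, mtcars$cyl)

# Quartiles for each group; rows are cylinder counts, columns are quartiles.
# Assumption: R's default quantile method (type 7), not Plotly's.
group_summary <- t(sapply(mpg_by_cyl, quantile, probs = c(0.25, 0.5, 0.75)))
print(group_summary)

# Number of cars behind each box, since small groups give less stable boxes.
print(lengths(mpg_by_cyl))
```

Reading down the 50% column shows how the median mpg changes as the number of cylinders increases.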

Understanding Pie Charts

Pie charts are used to show the composition of a categorical variable.

Key features:

Pie Chart: Distribution of Cylinders

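library(dplyr)  # provides count()

# Count the cars in each cylinder group, then plot the counts as slices
cyl_counts <- mtcars %>% count(cyl)
plot_ly(cyl_counts, labels = ~cyl, values = ~n, type = 'pie') %>%
  layout(title = 'Distribution of Cars by Number of Cylinders')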

Interpreting the Cylinders Pie Chart
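
Each slice corresponds to a share of the 32 cars. A minimal base R sketch of those shares follows. The names cyl_table and cyl_share are hypothetical, and rounding to one decimal place is an assumption, so the percentages may not sum to exactly 100 or match Plotly's hover labels digit for digit.

```r
# Count cars per cylinder group with base R.
cyl_table <- table(cyl = mtcars$cyl)
print(cyl_table)

# Convert counts to percentage shares, rounded to one decimal place.
cyl_share <- round(100 * prop.table(cyl_table), 1)
print(cyl_share)

# The largest slice is the group with the highest count.
print(names(cyl_table)[which.max(cyl_table)])
```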

Understanding Bar Charts

Bar charts are used to compare quantities across different categories.

Key features:

Bar Chart: Average MPG by Number of Cylinders

avg_mpg <- mtcars %>% group_by(cyl) %>% summarise(avg_mpg = mean(mpg))
plot_ly(avg_mpg, x = ~factor(cyl), y = ~avg_mpg, type = 'bar') %>%
  layout(title = 'Average MPG by Number of Cylinders',
         xaxis = list(title = 'Number of Cylinders'),
         yaxis = list(title = 'Average MPG'))

Interpreting the MPG by Cylinders Bar Chart
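
A bar of averages hides how many cars each bar represents and how much they vary. The sketch below, using base R's aggregate(), puts the group mean next to the group size and standard deviation. It is an illustrative addition; mean_mpg and its columns n and sd are names chosen for this example.

```r
# Mean mpg per cylinder group (rows are sorted by cyl: 4, 6, 8).
mean_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Add group size and spread, which the bar heights alone do not show.
mean_mpg$n  <- as.vector(table(mtcars$cyl))
mean_mpg$sd <- as.vector(tapply(mtcars$mpg, mtcars$cyl, sd))
print(mean_mpg)
```

Comparing the mpg column against sd gives a sense of whether the differences between bars are large relative to the variation within each group.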

When to Use Each Chart Type

Conclusion