Visualization

Today’s class

  • A little bit about data visualization
  • A review of topics
  • Start with Generalized Linear Mixed Effects Models
  • Start thinking about advanced topics

Homework for Wednesday

Find a cool plot that you want to replicate.

On Friday, you will create the data and replicate the plot.

Data visualization

  • My favorite part of stats, data analysis, etc.

  • You can be creative, but be smart and ethical.

  • Plots are easy to “manipulate”

Misleading plots

Incorrect and misleading plots

  • Pie chart whose slices total more than 100%

  • Avoid pie charts. Grouped barplots are often the best solution for “parts of a whole”. In stacked barplots, the individual segments can be hard to distinguish.
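A quick sanity check before drawing any “parts of a whole” chart is to confirm that the shares really add up to 100%. The sketch below is Python using only the standard library; the category names and values are hypothetical, invented for illustration, and `check_parts_of_whole` is an assumed helper name. When the check fails, it prints a crude text barplot of the raw shares instead of forcing them into a pie.

```python
# Hypothetical shares in percent; invented for illustration only.
shares = {"A": 45.0, "B": 35.0, "C": 30.0}


def check_parts_of_whole(parts, tol=0.5):
    """Return (total, ok); ok is True when the parts sum to 100 within tol."""
    total = sum(parts.values())
    return total, abs(total - 100.0) <= tol


total, ok = check_parts_of_whole(shares)
if not ok:
    print(f"Warning: shares sum to {total:.1f}%, not 100%")

# Text barplot of the raw shares, largest first (one '#' per 2 percentage points).
for name, value in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:<3}{'#' * round(value / 2)} {value:.1f}%")
```

Shares that total more than 100% often come from questions where respondents could pick several answers. In that case the categories are not parts of one whole, and separate bars are the honest display.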

Barplot

Plotting

  • Point estimate and confidence interval (CI)

  • Similar to a line with a CI band
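As a minimal sketch of the numbers behind such a plot, the Python below computes a point estimate (the sample mean) and a 95% CI for the `mpg` values of the six `mtcars` rows printed later in these slides. The critical value 2.571 is the 97.5% quantile of Student's t with 5 degrees of freedom; it is hardcoded here as an assumption because the standard library has no t distribution. With only six cars, this is an illustration rather than an analysis.

```python
import math
import statistics

# mpg values of the six cars shown in the mtcars excerpt in these slides.
mpg = [21.0, 21.0, 22.8, 21.4, 18.7, 18.1]

n = len(mpg)
estimate = statistics.mean(mpg)            # point estimate
se = statistics.stdev(mpg) / math.sqrt(n)  # standard error of the mean

# Assumption: 97.5% quantile of Student's t with n - 1 = 5 degrees of freedom.
T_CRIT_DF5 = 2.571

lower = estimate - T_CRIT_DF5 * se
upper = estimate + T_CRIT_DF5 * se
print(f"mean = {estimate:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

In a plot, `estimate` would be drawn as the point and `lower` and `upper` as the ends of the error bar.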

Plots

  • Distribution

  • Correlation

  • Ranking

  • Part of a whole

  • Evolution (time series, line plot)

  • Map

  • Flows

Distribution plots

Correlation

  • Scatterplot

  • Heatmap

  • Correlogram

  • Bubble

Scatterplot

Combining scatterplot and distribution

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Heatmap

Correlogram

   species tars1 tars2 head aede1 aede2 aede3
1 Concinna   191   131   53   150    15   104
2 Concinna   185   134   50   147    13   105
3 Concinna   200   137   52   144    14   102
4 Concinna   173   127   50   144    16    97
5 Concinna   171   118   49   153    13   106
6 Concinna   160   118   47   140    15    99
      species       tars1           tars2            head           aede1      
 Concinna :21   Min.   :122.0   Min.   :107.0   Min.   :43.00   Min.   :116.0  
 Heikert. :31   1st Qu.:148.0   1st Qu.:118.2   1st Qu.:49.00   1st Qu.:125.5  
 Heptapot.:22   Median :185.5   Median :123.0   Median :50.50   Median :136.5  
                Mean   :177.3   Mean   :124.0   Mean   :50.35   Mean   :134.8  
                3rd Qu.:198.2   3rd Qu.:130.0   3rd Qu.:52.00   3rd Qu.:142.8  
                Max.   :242.0   Max.   :146.0   Max.   :58.00   Max.   :157.0  
     aede2           aede3       
 Min.   : 8.00   Min.   : 55.00  
 1st Qu.:11.00   1st Qu.: 85.25  
 Median :14.00   Median : 98.50  
 Mean   :12.99   Mean   : 95.38  
 3rd Qu.:15.00   3rd Qu.:106.00  
 Max.   :16.00   Max.   :123.00  
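A correlogram displays the matrix of pairwise correlations between the numeric variables. As a rough sketch of what sits behind it, the Python below computes Pearson correlations for the six flea rows printed above, using only the standard library. The helper name `pearson` is an assumption for this example, and six rows are far too few for reliable estimates, so the numbers are illustrative only.

```python
import math

# First six rows of the flea data as printed above (numeric columns only).
columns = ["tars1", "tars2", "head", "aede1", "aede2", "aede3"]
rows = [
    [191, 131, 53, 150, 15, 104],
    [185, 134, 50, 147, 13, 105],
    [200, 137, 52, 144, 14, 102],
    [173, 127, 50, 144, 16, 97],
    [171, 118, 49, 153, 13, 106],
    [160, 118, 47, 140, 15, 99],
]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


data = list(zip(*rows))  # transpose: one tuple per column
corr = [[pearson(a, b) for b in data] for a in data]

# Print the matrix; a correlogram colors or sizes each cell by this value.
print(" " * 6 + "".join(f"{name:>7}" for name in columns))
for name, line in zip(columns, corr):
    print(f"{name:<6}" + "".join(f"{value:7.2f}" for value in line))
```

The same matrix also feeds a heatmap: each cell is colored by its correlation value instead of being printed as a number.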

Friday’s lab

You will try to replicate a complex but cool plot.

You will generate the data for it.