Histos

Jason Van Pelt, Jeni Rainer
3/22/2019

Overview

  • Shared Vocabulary
  • Why use histograms
  • Abstract samples
  • Spike samples

Shared Vocabulary and Why

  • Histograms answer questions about frequency and data distribution
    1. How similar or diverse are the points in my data set?
  • Filter histograms:
    1. Small
    2. Low fidelity
    3. A cue that leads the user to narrow the result set and find specific data points
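
A minimal sketch of the filter-histogram idea above: a handful of coarse bins rendered as a one-line strip that hints at where the data sits. The function names, the bin count of 8, and the sample values are all assumptions made for illustration; none of them come from the original slides.

```python
def coarse_histogram(values, lo, hi, bins=8):
    """Count values into a small number of equal-width bins over [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi lands in the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    return counts


def sparkline(counts):
    """Render counts as a one-line bar strip, scaled to the tallest bin."""
    blocks = " ▁▂▃▄▅▆▇█"
    peak = max(counts) or 1  # avoid dividing by zero when every bin is empty
    return "".join(blocks[round(c / peak * (len(blocks) - 1))] for c in counts)


# Made-up sample values, for illustration only.
prices = [3, 4, 4, 5, 5, 5, 6, 6, 7, 12, 18, 19, 35]
counts = coarse_histogram(prices, lo=0, hi=40, bins=8)
print(sparkline(counts))
```

The strip is deliberately low fidelity: it shows that most values sit at the low end, which is enough to suggest where to narrow the filter.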

Shared Vocabulary

  • Bin size:
    1. Narrow enough to reveal interesting features about the distribution
    2. Wide enough to reduce noise
  • AREA is what matters, not height: when bin widths differ, a bar's area (width × height) shows how much data falls in the bin

plot of chunk histo1
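
To make the "area, not height" point concrete, here is a small sketch that bins made-up values with unequal bin widths and scales each bar so that its area is the fraction of data in the bin. The function name `density_histogram`, the bin edges, and the sample values are assumptions for illustration, not taken from the slides.

```python
def density_histogram(values, edges):
    """Return (counts, densities) for bins defined by ascending edges.

    Bins are half-open [left, right); the last bin also includes its right edge.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        for i in range(n_bins):
            is_last = i == n_bins - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    # Bar height = fraction of data in the bin divided by the bin width,
    # so that height * width (the area) equals that fraction.
    densities = [
        c / (total * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)
    ]
    return counts, densities


# Made-up sample values and unequal bin edges, for illustration only.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 9, 12, 15, 19]
edges = [0, 2, 4, 6, 10, 20]
counts, densities = density_histogram(values, edges)
areas = [d * (edges[i + 1] - edges[i]) for i, d in enumerate(densities)]

for i, (c, d, a) in enumerate(zip(counts, densities, areas)):
    print(f"[{edges[i]}, {edges[i + 1]}): count={c} height={d:.3f} area={a:.3f}")
```

In this sample the last three bins hold the same number of points, but the wider bins get shorter bars. Reading height alone would understate them; the areas are equal.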

Same Data, Different User Experience

plot of chunk rawData

This is test data, but it has not been massaged in any way.

Naturalize Histogram by putting all outliers in one bin

plot of chunk hist_data2
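
One way to put all outliers in one bin is to add a single overflow bin for every value at or above a cutoff. The sketch below assumes that approach. The cutoff of 50, the bin count, the function name, and the sample values are hypothetical and not taken from the slides.

```python
def histogram_with_overflow(values, lo, hi, bins):
    """Equal-width bins over [lo, hi), plus one trailing bin for values >= hi."""
    width = (hi - lo) / bins
    counts = [0] * (bins + 1)
    for v in values:
        if v < lo:
            continue  # values below the range are ignored in this sketch
        if v >= hi:
            counts[-1] += 1  # every outlier lands in the single overflow bin
        else:
            counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


# Made-up sample values with a few large outliers, for illustration only.
response_ms = [12, 15, 18, 21, 22, 25, 27, 31, 33, 38, 44, 250, 900, 4100]
counts = histogram_with_overflow(response_ms, lo=0, hi=50, bins=5)

for i, c in enumerate(counts[:-1]):
    print(f"{i * 10:>3}-{i * 10 + 9:<3} {'#' * c}")
print(f"50+     {'#' * counts[-1]}")
```

Without the overflow bin, an axis stretched out to 4100 would squeeze the bulk of the data into one or two bars. Grouping the outliers keeps the main distribution readable and still shows that the outliers exist.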

Same Data, Different User Experience

Change Bin Width

plot of chunk unnamed-chunk-2
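
The effect of bin width can be sketched by counting the same values at several widths. The helper `bin_counts`, the widths tried, and the sample values are assumptions for illustration. The sketch also assumes every value is at or above `start`.

```python
import math


def bin_counts(values, width, start=0):
    """Count values into equal-width bins [start + k*width, start + (k+1)*width).

    Assumes every value is >= start.
    """
    n_bins = math.floor((max(values) - start) / width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[math.floor((v - start) / width)] += 1
    return counts


# Made-up sample values, for illustration only.
data = [1, 2, 2, 3, 7, 8, 8, 9, 9, 10, 11, 12, 18, 19]

for width in (1, 5, 20):
    print(f"width={width:>2}: {bin_counts(data, width)}")
```

At the narrowest width the output is mostly noise, with many bins holding zero or one point. At the widest, everything collapses into a single bar and the shape disappears. The middle setting is narrow enough to show where the data clusters and wide enough to smooth out the gaps.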

Change Bin Width and Group Outliers

plot of chunk unnamed-chunk-3