class: center, middle, inverse, title-slide

# Descriptive Statistics
## What to do with lots of numbers
### Matthew Crump
### 2018/07/20 (updated: 2021-11-28)

---
class: pink, center, middle, clear

# What do lots of numbers look like?

---

# Lots of numbers look like this

Like this:

<div class=rtable> <table> <tbody> <tr> <td style="text-align:right;"> 79 </td> <td style="text-align:right;"> -30 </td> <td style="text-align:right;"> -31 </td> <td style="text-align:right;"> -46 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -91 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> 84 </td> <td style="text-align:right;"> 54 </td> <td style="text-align:right;"> 96 </td> <td style="text-align:right;"> 86 </td> <td style="text-align:right;"> -1 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> 66 </td> <td style="text-align:right;"> -86 </td> <td style="text-align:right;"> -38 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -17 </td> <td style="text-align:right;"> 80 </td> <td style="text-align:right;"> 30 </td> </tr> <tr> <td style="text-align:right;"> 97 </td> <td style="text-align:right;"> -100 </td> <td style="text-align:right;"> -59 </td> <td style="text-align:right;"> -9 </td> <td style="text-align:right;"> -50 </td> <td style="text-align:right;"> 52 </td> <td style="text-align:right;"> -40 </td> <td style="text-align:right;"> 17 </td> <td style="text-align:right;"> 32 </td> <td style="text-align:right;"> -38 </td> <td style="text-align:right;"> -57 </td> <td style="text-align:right;"> 76 </td> <td style="text-align:right;"> -89 </td> <td style="text-align:right;"> 43 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> -35 </td> <td style="text-align:right;"> -86 </td> <td style="text-align:right;"> 79 </td> <td style="text-align:right;"> -63 </td> <td style="text-align:right;"> -80 </td> </tr> <tr> <td style="text-align:right;"> 94
</td> <td style="text-align:right;"> -3 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> -32 </td> <td style="text-align:right;"> -46 </td> <td style="text-align:right;"> -84 </td> <td style="text-align:right;"> 45 </td> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> -9 </td> <td style="text-align:right;"> -6 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -50 </td> <td style="text-align:right;"> -65 </td> <td style="text-align:right;"> -73 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> -29 </td> <td style="text-align:right;"> -12 </td> </tr> <tr> <td style="text-align:right;"> -87 </td> <td style="text-align:right;"> 62 </td> <td style="text-align:right;"> -27 </td> <td style="text-align:right;"> -12 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> -19 </td> <td style="text-align:right;"> 14 </td> <td style="text-align:right;"> 18 </td> <td style="text-align:right;"> -68 </td> <td style="text-align:right;"> -78 </td> <td style="text-align:right;"> 59 </td> <td style="text-align:right;"> -42 </td> <td style="text-align:right;"> -54 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> -46 </td> <td style="text-align:right;"> 60 </td> <td style="text-align:right;"> -97 </td> <td style="text-align:right;"> 74 </td> <td style="text-align:right;"> -92 </td> </tr> <tr> <td style="text-align:right;"> -18 </td> <td style="text-align:right;"> -45 </td> <td style="text-align:right;"> -70 </td> <td style="text-align:right;"> 40 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -22 </td> <td style="text-align:right;"> -40 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> 69 </td> <td 
style="text-align:right;"> -10 </td> <td style="text-align:right;"> -98 </td> <td style="text-align:right;"> 59 </td> <td style="text-align:right;"> 27 </td> <td style="text-align:right;"> 30 </td> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> -56 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 76 </td> <td style="text-align:right;"> -70 </td> </tr> <tr> <td style="text-align:right;"> -45 </td> <td style="text-align:right;"> -37 </td> <td style="text-align:right;"> 72 </td> <td style="text-align:right;"> 18 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> -10 </td> <td style="text-align:right;"> 93 </td> <td style="text-align:right;"> 76 </td> <td style="text-align:right;"> -13 </td> <td style="text-align:right;"> 86 </td> <td style="text-align:right;"> 87 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> -66 </td> <td style="text-align:right;"> -70 </td> <td style="text-align:right;"> -18 </td> <td style="text-align:right;"> 64 </td> <td style="text-align:right;"> 48 </td> <td style="text-align:right;"> -62 </td> <td style="text-align:right;"> 62 </td> <td style="text-align:right;"> 76 </td> </tr> <tr> <td style="text-align:right;"> 95 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> 95 </td> <td style="text-align:right;"> -84 </td> <td style="text-align:right;"> 61 </td> <td style="text-align:right;"> 17 </td> <td style="text-align:right;"> -47 </td> <td style="text-align:right;"> 53 </td> <td style="text-align:right;"> -16 </td> <td style="text-align:right;"> -95 </td> <td style="text-align:right;"> -98 </td> <td style="text-align:right;"> 44 </td> <td style="text-align:right;"> 28 </td> <td style="text-align:right;"> -64 </td> <td style="text-align:right;"> -56 </td> <td style="text-align:right;"> -85 </td> <td style="text-align:right;"> 62 </td> <td style="text-align:right;"> 50 </td> 
<td style="text-align:right;"> 4 </td> <td style="text-align:right;"> -99 </td> </tr> <tr> <td style="text-align:right;"> -94 </td> <td style="text-align:right;"> 38 </td> <td style="text-align:right;"> 49 </td> <td style="text-align:right;"> -88 </td> <td style="text-align:right;"> -74 </td> <td style="text-align:right;"> 48 </td> <td style="text-align:right;"> 35 </td> <td style="text-align:right;"> 41 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 73 </td> <td style="text-align:right;"> -30 </td> <td style="text-align:right;"> -12 </td> <td style="text-align:right;"> -23 </td> <td style="text-align:right;"> -44 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -84 </td> <td style="text-align:right;"> -32 </td> <td style="text-align:right;"> 39 </td> <td style="text-align:right;"> 18 </td> </tr> <tr> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> -94 </td> <td style="text-align:right;"> -80 </td> <td style="text-align:right;"> -99 </td> <td style="text-align:right;"> 80 </td> <td style="text-align:right;"> -92 </td> <td style="text-align:right;"> -22 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> -18 </td> <td style="text-align:right;"> -47 </td> <td style="text-align:right;"> -33 </td> <td style="text-align:right;"> 61 </td> <td style="text-align:right;"> 67 </td> <td style="text-align:right;"> -99 </td> <td style="text-align:right;"> -78 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 35 </td> <td style="text-align:right;"> -22 </td> <td style="text-align:right;"> -43 </td> </tr> <tr> <td style="text-align:right;"> 72 </td> <td style="text-align:right;"> -3 </td> <td style="text-align:right;"> -35 </td> <td style="text-align:right;"> -73 </td> <td style="text-align:right;"> 59 </td> <td style="text-align:right;"> 33 </td> <td 
style="text-align:right;"> -47 </td> <td style="text-align:right;"> 66 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> 78 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> 51 </td> <td style="text-align:right;"> -15 </td> <td style="text-align:right;"> -70 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> -89 </td> <td style="text-align:right;"> 63 </td> <td style="text-align:right;"> 72 </td> </tr> </tbody> </table> </div>

---

# What can we say about them?

We can see they aren't all the same. Not much else really. Looking at a bunch of numbers is hard work.

<div class=rtable> <table> <tbody> <tr> <td style="text-align:right;"> 79 </td> <td style="text-align:right;"> -30 </td> <td style="text-align:right;"> -31 </td> <td style="text-align:right;"> -46 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -91 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> 84 </td> <td style="text-align:right;"> 54 </td> <td style="text-align:right;"> 96 </td> <td style="text-align:right;"> 86 </td> <td style="text-align:right;"> -1 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> 66 </td> <td style="text-align:right;"> -86 </td> <td style="text-align:right;"> -38 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -17 </td> <td style="text-align:right;"> 80 </td> <td style="text-align:right;"> 30 </td> </tr> <tr> <td style="text-align:right;"> 97 </td> <td style="text-align:right;"> -100 </td> <td style="text-align:right;"> -59 </td> <td style="text-align:right;"> -9 </td> <td style="text-align:right;"> -50 </td> <td style="text-align:right;"> 52 </td> <td style="text-align:right;"> -40 </td> <td style="text-align:right;"> 17 </td> <td style="text-align:right;"> 32 </td> <td style="text-align:right;"> -38 </td> <td
style="text-align:right;"> -57 </td> <td style="text-align:right;"> 76 </td> <td style="text-align:right;"> -89 </td> <td style="text-align:right;"> 43 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> -35 </td> <td style="text-align:right;"> -86 </td> <td style="text-align:right;"> 79 </td> <td style="text-align:right;"> -63 </td> <td style="text-align:right;"> -80 </td> </tr> <tr> <td style="text-align:right;"> 94 </td> <td style="text-align:right;"> -3 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> -32 </td> <td style="text-align:right;"> -46 </td> <td style="text-align:right;"> -84 </td> <td style="text-align:right;"> 45 </td> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> -9 </td> <td style="text-align:right;"> -6 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -50 </td> <td style="text-align:right;"> -65 </td> <td style="text-align:right;"> -73 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> -29 </td> <td style="text-align:right;"> -12 </td> </tr> <tr> <td style="text-align:right;"> -87 </td> <td style="text-align:right;"> 62 </td> <td style="text-align:right;"> -27 </td> <td style="text-align:right;"> -12 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> -19 </td> <td style="text-align:right;"> 14 </td> <td style="text-align:right;"> 18 </td> <td style="text-align:right;"> -68 </td> <td style="text-align:right;"> -78 </td> <td style="text-align:right;"> 59 </td> <td style="text-align:right;"> -42 </td> <td style="text-align:right;"> -54 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> -46 </td> <td style="text-align:right;"> 60 </td> <td style="text-align:right;"> -97 </td> <td style="text-align:right;"> 74 
</td> <td style="text-align:right;"> -92 </td> </tr> <tr> <td style="text-align:right;"> -18 </td> <td style="text-align:right;"> -45 </td> <td style="text-align:right;"> -70 </td> <td style="text-align:right;"> 40 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -22 </td> <td style="text-align:right;"> -40 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> -10 </td> <td style="text-align:right;"> -98 </td> <td style="text-align:right;"> 59 </td> <td style="text-align:right;"> 27 </td> <td style="text-align:right;"> 30 </td> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> -56 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 76 </td> <td style="text-align:right;"> -70 </td> </tr> <tr> <td style="text-align:right;"> -45 </td> <td style="text-align:right;"> -37 </td> <td style="text-align:right;"> 72 </td> <td style="text-align:right;"> 18 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> -10 </td> <td style="text-align:right;"> 93 </td> <td style="text-align:right;"> 76 </td> <td style="text-align:right;"> -13 </td> <td style="text-align:right;"> 86 </td> <td style="text-align:right;"> 87 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> -66 </td> <td style="text-align:right;"> -70 </td> <td style="text-align:right;"> -18 </td> <td style="text-align:right;"> 64 </td> <td style="text-align:right;"> 48 </td> <td style="text-align:right;"> -62 </td> <td style="text-align:right;"> 62 </td> <td style="text-align:right;"> 76 </td> </tr> <tr> <td style="text-align:right;"> 95 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> 95 </td> <td style="text-align:right;"> -84 </td> <td style="text-align:right;"> 61 </td> <td style="text-align:right;"> 17 </td> <td style="text-align:right;"> -47 </td> <td 
style="text-align:right;"> 53 </td> <td style="text-align:right;"> -16 </td> <td style="text-align:right;"> -95 </td> <td style="text-align:right;"> -98 </td> <td style="text-align:right;"> 44 </td> <td style="text-align:right;"> 28 </td> <td style="text-align:right;"> -64 </td> <td style="text-align:right;"> -56 </td> <td style="text-align:right;"> -85 </td> <td style="text-align:right;"> 62 </td> <td style="text-align:right;"> 50 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> -99 </td> </tr> <tr> <td style="text-align:right;"> -94 </td> <td style="text-align:right;"> 38 </td> <td style="text-align:right;"> 49 </td> <td style="text-align:right;"> -88 </td> <td style="text-align:right;"> -74 </td> <td style="text-align:right;"> 48 </td> <td style="text-align:right;"> 35 </td> <td style="text-align:right;"> 41 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 73 </td> <td style="text-align:right;"> -30 </td> <td style="text-align:right;"> -12 </td> <td style="text-align:right;"> -23 </td> <td style="text-align:right;"> -44 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -84 </td> <td style="text-align:right;"> -32 </td> <td style="text-align:right;"> 39 </td> <td style="text-align:right;"> 18 </td> </tr> <tr> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> -94 </td> <td style="text-align:right;"> -80 </td> <td style="text-align:right;"> -99 </td> <td style="text-align:right;"> 80 </td> <td style="text-align:right;"> -92 </td> <td style="text-align:right;"> -22 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> -18 </td> <td style="text-align:right;"> -47 </td> <td style="text-align:right;"> -33 </td> <td style="text-align:right;"> 61 </td> <td style="text-align:right;"> 67 </td> <td style="text-align:right;"> -99 </td> <td style="text-align:right;"> -78 </td> <td style="text-align:right;"> 5 
</td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 35 </td> <td style="text-align:right;"> -22 </td> <td style="text-align:right;"> -43 </td> </tr> <tr> <td style="text-align:right;"> 72 </td> <td style="text-align:right;"> -3 </td> <td style="text-align:right;"> -35 </td> <td style="text-align:right;"> -73 </td> <td style="text-align:right;"> 59 </td> <td style="text-align:right;"> 33 </td> <td style="text-align:right;"> -47 </td> <td style="text-align:right;"> 66 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> 78 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> 51 </td> <td style="text-align:right;"> -15 </td> <td style="text-align:right;"> -70 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> -89 </td> <td style="text-align:right;"> 63 </td> <td style="text-align:right;"> 72 </td> </tr> </tbody> </table> </div>

---

# Summary numbers

It would be nice to reduce the big set of numbers down to a few numbers that we can look at and make sense of.

## Sameness (Central Tendency)

- What are all the numbers close to?

## Differentness (Variance)

- How different are the numbers?

---

# Descriptive Statistics

- Give us summaries of big sets of numbers
- Useful single numbers to look at
- They tell us about patterns of sameness and differentness

---
class: pink, center, middle, clear

# Graph the numbers to get a better look

---

# Dot plot (unordered)

Graphing the numbers gives a quick and dirty sense of what they are like.
Here are 200 numbers presented as dots

<img src="2-Descriptives_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />

---

# Dot plot (ordered)

Sorting the numbers from smallest to largest

<img src="2-Descriptives_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />

---

# Histograms

Histograms count up the numbers inside specific ranges

<img src="2-Descriptives_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

---

# Histograms

Bars show you which bins have more or fewer numbers in the range

<img src="2-Descriptives_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---

# So what are these numbers like?

What single number would you say best describes most of these numbers?

<img src="2-Descriptives_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

---

# Question

Is the red or blue value a better summary of all the numbers?

<img src="2-Descriptives_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />

---
class: pink, center, middle, clear

# Measures of Central Tendency

---

# Central Tendency

1. **Central tendency** should describe what most of the data is like

--

2. We want our summary number to be most like the other numbers. We want it to be a **representative value**

--

3. There are **multiple measures** of central tendency

--

4. They have **different properties**

--

5. Some work better than others depending on the data

---

# Mode

The mode is the single most frequently occurring number

> 1 1 2 2 3 4 5 6 7 7 7 7 7

- The mode is 7 because 7 happens the most
- Find the mode by counting the occurrences of each number; the mode is the most frequently occurring number
- If there is a tie, then you have two or three or more modes (depends on how many different numbers tie)

---

# Finding the Mode in R

We make 25 numbers. How do we get R to find the mode?
```r
# make some numbers (no seed is set, so each run gives different values)
a <- round(rnorm(n=25, mean=24, sd=5))
```

|   |   |   |   |   |
|--:|--:|--:|--:|--:|
| 29| 30| 16| 28| 22|
| 17| 29| 26| 21| 29|
| 27| 31| 22| 19| 22|
| 27| 28| 30| 29| 24|
| 23| 33| 27| 26| 28|

---

# Finding the Mode in R

The `table` function automatically counts the occurrences of each number

```r
table(a)
```

```
## a
## 16 17 19 21 22 23 24 26 27 28 29 30 31 33
##  1  1  1  1  3  1  1  2  3  3  4  2  1  1
```

We can see that 29 occurs the most

---

# Custom function for the mode in R

You can always write your own function for the mode. This one is called `my_mode`

```r
my_mode <- function(x) {
  ux <- unique(x)                        # distinct values
  ux[which.max(tabulate(match(x, ux)))]  # value with the highest count (first one if tied)
}
```

```r
my_mode(a)
```

```
## [1] 29
```

---

# Thinking about the mode

1. Tells us the most frequent number(s)

--

2. Is it representative of all the numbers?

--

3. When would the mode be a good thing to know?

---
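
# Finding every mode in R

`my_mode` relies on `which.max`, which returns only the first position of the highest count, so a tie gets reported as a single mode. Here is a minimal sketch of one way to return all of the tied values instead. The name `all_modes` is made up for this example; it is not a built-in R function.

```r
# all_modes is a hypothetical helper, written only for illustration
all_modes <- function(x) {
  ux <- unique(x)                     # distinct values, in order of first appearance
  counts <- tabulate(match(x, ux))    # how many times each distinct value occurs
  ux[counts == max(counts)]           # keep every value that reaches the top count
}

all_modes(c(1, 1, 2, 2, 3))                             # 1 and 2 tie with two each
all_modes(c(1, 1, 2, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7))     # 7 is the only mode
```

```
## [1] 1 2
## [1] 7
```

When several values come back, the data has more than one mode, which is worth knowing before treating any single one as representative.

---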