class: middle
background-image: url(data:image/png;base64,#LTU_logo.jpg)
background-position: top left
background-size: 30%

# STM1001 [Topic 2](https://bookdown.org/a_shaker/STM1001_Topic_2/) Workshop

## Descriptive Statistics

### La Trobe University

This workshop complements the [Topic 2 readings](https://bookdown.org/a_shaker/STM1001_Topic_2/)

---

# Topic 1 recap (Menti)

Let's begin with a quick recap of [Topic 1: Introduction to statistics and presenting data](https://bookdown.org/a_shaker/STM1001_Topic_1/)

---
name: menti
class: middle
background-image: url(data:image/png;base64,#menti.jpg)
background-size: 115%

# Menti

## Go to [www.menti.com](https://www.menti.com) and use
## the code provided

---
name: stat
class: middle
background-image: url(data:image/png;base64,#slide_1.png)
background-size: 110%

---
name: stat
class: middle
background-image: url(data:image/png;base64,#slide_4.png)
background-size: 100%

---

# Topic 2: Descriptive Statistics

## In this week's readings:

<iframe src="https://bookdown.org/a_shaker/STM1001_Topic_2/" width="100%" height="400px" data-external="1"></iframe>

---

# Topic 2: Descriptive Statistics

In today's workshop, we will complete some activities to consolidate our understanding of some of this topic's concepts.

--

We will not have time to cover every concept, so please make sure you read this topic's readings thoroughly.

---

# Measures of location: Mean and median

* ***Measures of location***, or ***measures of central tendency***, are designed to tell us what is a 'typical' value in a given set of data

--

* Three common measures of location:
    * Mean
    * Median
    * Mode

* In the readings, we also consider some other useful summary statistics
    * Quantile
    * Percentile
    * Minimum
    * Maximum

---

# Mean

* The first measure we will look at today is the ***mean***, also known as the ***average***

--

* To calculate the mean, add up all of the given values, and then divide that sum by the number of values

--

# Median

* The ***median*** is simply the 'middle' value, meaning that 50% of the values are higher, and 50% lower, than the median

--

* To calculate the median:

    1. List the values in order from lowest to highest

--

    1. Then, if there is an odd number of values, the median will be the middle value. If there is an even number of values, the median will be the average of the middle two values.

---

# Group activity 1

1. As a group, answer the following questions and collate the answers together (if you don't know the exact numbers, an estimate will be fine):

    a. How many times have you visited a physical Kmart store in the past year?

    b. How many items have you purchased online in the past year?

    ***Note:*** When collating the answers together for a. and b., write them down in pairs (a, b) for each student. For example, if a student visited a physical Kmart store 5 times in the last year and purchased 10 items online in the last year, write it down as (5, 10).

2. Calculate the following for your group:

    a. Average number of items purchased online

    b. Average number of visits to Kmart

    c. Median items purchased online

    d. Median visits to Kmart

---

# Report back

For in-class discussion:

1. Compare means / medians between groups

2. Discuss variability between samples: we do not know what the true means are, but we hope these samples can give us an idea

---

# Scatter plots and correlation

* A scatter plot is a convenient way to visually compare two numeric variables

* For example, recall the example provided in this week's readings where we looked at the happiness index versus income per person for a number of countries:

<iframe src="https://bookdown.org/a_shaker/STM1001_Topic_2/5-1-scatter-plots.html" width="100%" height="400px" data-external="1"></iframe>

---

# Scatter plots and correlation

* As we can see, each axis represents one variable

--

* Each point represents one observation (country in this case), indicating its value for average income per person on the `\(x\)`-axis, and average happiness score on the `\(y\)`-axis

--

* By considering a scatter plot, we can observe how the two variables relate to each other

---

# Scatter plots and correlation

* ***Correlation*** is a measure between -1 and 1 (denoted `\(r\)`) which tells us about the relationship between two variables

--

* The sign of the number tells us about the **direction** of the relationship (positive or negative - we will see some examples on the next slide)

--

* The size of the number tells us about the **strength** of the relationship:
    * The closer `\(|r|\)` (the absolute value of `\(r\)`) is to 1, the stronger the linear relationship between the two variables

--

* Below is a guide to interpreting the strength of a correlation coefficient:

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> Range of |r| </th>
   <th style="text-align:left;"> Strength of correlation </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> 0 to 0.3 </td>
   <td style="text-align:left;"> None or very weak </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 0.3 to 0.5 </td>
   <td style="text-align:left;"> Weak </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 0.5 to 0.8 </td>
   <td style="text-align:left;"> Moderate </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 0.8 to 1 </td>
   <td style="text-align:left;"> Strong </td>
  </tr>
</tbody>
</table>

---

# Scatter plots and correlation

* Scatter plots are a helpful way for us to understand correlation. For example:

<img src="data:image/png;base64,#Topic_2_Workshop_files/figure-html/unnamed-chunk-4-1.svg" style="display: block; margin: auto;" />

---

# Group activity 2

1. Using your responses from the previous activity, create a scatter plot with Kmart visits on the `\(x\)` (horizontal) axis, and online purchases on the `\(y\)` (vertical) axis

1. Based on your scatter plot, do you think the variables are ***positively*** or ***negatively*** related?

1. Based on your scatter plot, do you think the ***correlation*** between the two variables is 'strong', 'moderate', 'weak', or 'none'?

---

# Report back

For in-class discussion:

1. Compare scatter plots between groups

1. Compare correlation guesses between groups. Do you agree with guesses other groups have made?

---
background-image: url(data:image/png;base64,#computerlab.jpg)
background-position: bottom
background-size: 75%
class: center

# See you in the computer labs!

Continue with this topic's readings: [Topic 2 Readings](https://bookdown.org/a_shaker/STM1001_Topic_2/)

---
class: middle

<font color = "grey">
These notes have been prepared by Amanda Shaker.
The copyright for the material in these notes resides with the authors named above, with the Department of Mathematics and Statistics and with La Trobe University.
Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming.
Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License <a href = "https://creativecommons.org/licenses/by-nc-nd/4.0/" target="_blank"> BY-NC-ND. </a>
</font>
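
---

# Optional: checking your answers in code

The mean, median and correlation steps from Group activities 1 and 2 can also be checked in code. The following is a minimal Python sketch and is not part of the workshop materials. The (Kmart visits, online purchases) pairs are made-up illustrative values, `pearson_r` and `strength_label` are hypothetical helper names, and the treatment of the table's shared boundaries (for example, exactly 0.3) is an assumption.

```python
from math import sqrt
from statistics import mean, median

# Hypothetical (Kmart visits, online purchases) pairs, one per student.
pairs = [(5, 10), (2, 30), (12, 4), (0, 25), (6, 11)]
visits = [a for a, _ in pairs]
purchases = [b for _, b in pairs]


def pearson_r(xs, ys):
    """Correlation coefficient r between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def strength_label(r):
    """Apply the strength guide from the correlation slide to |r|.

    Assumption: a value on a shared boundary (0.3, 0.5, 0.8) is placed
    in the stronger category, since the guide's ranges overlap there.
    """
    size = abs(r)
    if size < 0.3:
        return "none or very weak"
    if size < 0.5:
        return "weak"
    if size < 0.8:
        return "moderate"
    return "strong"


r = pearson_r(visits, purchases)
direction = "positive" if r > 0 else "negative" if r < 0 else "no direction"

print("Mean visits:", mean(visits), "| median visits:", median(visits))
print("Mean purchases:", mean(purchases), "| median purchases:", median(purchases))
print("r =", round(r, 2), "->", direction + ",", strength_label(r))
```

To try it with your own group's data, replace the contents of `pairs` with the (a, b) pairs you collected in Group activity 1.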