class: middle
background-image: url(data:image/png;base64,#LTU_logo.jpg)
background-position: top left
background-size: 30%

# STM1001 [Topic 1](https://bookdown.org/a_shaker/STM1001_Topic_1/) Workshop

## Introduction to statistics and presenting data

### La Trobe University

This workshop complements the [Topic 1 readings](https://bookdown.org/a_shaker/STM1001_Topic_1/)

---

# Topic 1: Introduction to statistics and presenting data

## In this week's readings:

We will not have time to cover every concept, so please make sure you read this topic's readings thoroughly.

<iframe src="https://bookdown.org/a_shaker/STM1001_Topic_1/" width="100%" height="400px" data-external="1"></iframe>

---

# Where are we headed in this subject?

* In this subject, we will be learning how to ***Make Sense of Data***

--

* One of the most important tools we can use to do so is Statistics

.content-box-blue[
.center[
**What is Statistics?**
]

Statistics allows us to make sense of data. It involves **collecting**, **describing**, and **analysing** data, and sometimes **drawing conclusions** from data.
]

--

* How should we **collect** the data (including study design)?
* Once we have the data, can we **describe** it?
* Can we draw **conclusions**, or **inferences**, about what we are seeing?

---

# Introduction to statistics and presenting data

.content-box-blue[
.center[
**Descriptive Statistics**
]

Descriptive statistics involves summarising and displaying data via graphical and numerical means.
]

--

* Let's look at some examples...
* We will consider an example from the `survey` data set from the R package `MASS` (Venables & Ripley, 2002)
* University students studying statistics were asked how often they smoke

---

# Frequency Table

* We display the **number** of students in each category

|Smoke        | Frequency|
|:------------|---------:|
|Never        |       189|
|Occasionally |        19|
|Regularly    |        17|
|Heavy        |        11|

---

# Relative Frequency Table

* We display the **percentage** of students in each category

|Smoke        | Relative Frequency (%)|
|:------------|----------------------:|
|Never        |                  80.08|
|Occasionally |                   8.05|
|Regularly    |                   7.20|
|Heavy        |                   4.66|

---

# Cumulative frequency and relative frequency tables

* As well as frequencies and relative frequencies, we can display the cumulative frequencies:

|Smoke        | Frequency| Cumulative Frequency| Relative Frequency (%)| Cumulative Relative Frequency (%)|
|:------------|---------:|--------------------:|----------------------:|---------------------------------:|
|Never        |       189|                  189|                  80.08|                             80.08|
|Occasionally |        19|                  208|                   8.05|                             88.14|
|Regularly    |        17|                  225|                   7.20|                             95.34|
|Heavy        |        11|                  236|                   4.66|                            100.00|

---

# Bar chart

* Same information presented visually:

<img src="data:image/png;base64,#Topic_1_Workshop_files/figure-html/unnamed-chunk-5-1.svg" style="display: block; margin: auto;" />

---

# Pie chart

<img src="data:image/png;base64,#Topic_1_Workshop_files/figure-html/unnamed-chunk-6-1.svg" style="display: block; margin: auto;" />

---

# Types of variables

How we present data often depends on what type of ***variable(s)*** we are looking at.

.content-box-blue[
.center[
**Categorical (qualitative) Variable**

A variable that is separated into groups. Categorical variables can be either:
]

* **Nominal**: Where the groups are characterised by names, labels or categories. For example, eye colour (blue, brown, green, etc.), car brand (Hyundai, Toyota, Holden, etc.), or state (VIC, NSW, SA, etc.).
* **Ordinal**: Where the groups can be arranged into a specific order.
For example, how much a person smokes (never, occasional, regular, heavy), or level of exercise (none, some, frequent).
]

---

# Types of variables

.content-box-blue[
.center[
**Numerical (quantitative) Variable**

A numerical variable is one that represents counts or measurements. The two types of numerical variables we will be looking at are:
]

* **Discrete**: Where the set of all possible values is countable. For example, the number of heartbeats per minute, or the number of heads observed when flipping a coin five times.
* **Continuous**: Where the variable can take any value within a certain range, so there are infinitely many possible values. For example, height, weight or age.
]

---

# Drawing conclusions (inferences)

* After observing data via **descriptive statistics**, we may wish to draw **conclusions**, or **inferences**, about what we are seeing
* This is called ***inferential statistics***

.content-box-blue[
.center[
**Inferential Statistics**
]

Inferential statistics involves drawing conclusions from data.
]

---

# Drawing conclusions (inferences)

* Normally, we use data available to us in a **sample** to make **inferences** about a **population**

---

# Drawing conclusions (inferences)

* When we take a **sample**, we hope it is **representative** of the **population**

--

* But realistically, each time we take a random **sample**, we could get a different **estimate**
* How close might our **sample estimates** be to the true **population parameters**?
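
The idea that different random samples give different estimates can be illustrated with a small R simulation. This is only a sketch: the population below is made up for illustration (10,000 hypothetical students, about 20% of whom smoke), and the object names `population`, `true_proportion` and `estimates` are assumptions, not part of the `survey` data.

```r
# Hypothetical population (made-up values): 1 = smokes, 0 = does not smoke
set.seed(1)
population <- rbinom(10000, size = 1, prob = 0.2)

# The population parameter: normally unknown, but known here because we simulated it
true_proportion <- mean(population)

# Take five random samples of 100 students and estimate the proportion from each
estimates <- replicate(5, mean(sample(population, size = 100)))

true_proportion
estimates
```

Each value in `estimates` is a sample estimate of the same population proportion, and the values will generally differ from each other and from `true_proportion`.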
--

* We will usually not know the answer, but ***statistics*** gives us tools to factor this uncertainty into our conclusions
* We will be covering ***inferential statistics*** later on in this subject

---
name: menti
class: middle
background-image: url(data:image/png;base64,#menti.jpg)
background-size: 115%

# Menti

## Go to [www.menti.com](https://www.menti.com) and use
## the code provided

---
background-image: url(data:image/png;base64,#computerlab.jpg)
background-position: bottom
background-size: 75%
class: center

# See you in the computer labs!

Continue with this topic's readings: [Topic 1 Readings](https://bookdown.org/a_shaker/STM1001_Topic_1/)

---

# References

Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth edition. New York: Springer. https://www.stats.ox.ac.uk/pub/MASS4/.

---
class: middle

<font color = "grey"> These notes have been prepared by Amanda Shaker. The copyright for the material in these notes resides with the authors named above, with the Department of Mathematics and Statistics and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License <a href = "https://creativecommons.org/licenses/by-nc-nd/4.0/" target="_blank"> BY-NC-ND. </a> </font>