
class: middle
background-image: url(data:image/png;base64,#LTU_logo.jpg)
background-position: top left
background-size: 30%

# STM1001 [Topic 3](https://bookdown.org/a_shaker/STM1001_Topic_3/) Workshop
## Probability and Distributions
### La Trobe University
This workshop complements the [Topic 3 readings](https://bookdown.org/a_shaker/STM1001_Topic_3/)

---

background-image: url(data:image/png;base64,#Cards.jpg)
background-position: 100% 50%
background-size: 50% 100%

.pull-left[
# Probability game (Menti)

* We have used a [playing card shuffler](https://www.random.org/playing-cards/?cards=1&decks=1&spades=on&hearts=on&diamonds=on&clubs=on&aces=on&twos=on&threes=on&fours=on&fives=on&sixes=on&sevens=on&eights=on&nines=on&tens=on&jacks=on&queens=on&kings=on&remaining=on) for the following game
* Your task is to see how many times you can guess the correct suit
]

---

name: menti
class: middle
background-image: url(data:image/png;base64,#menti.jpg)
background-size: 115%

# Menti

## Go to [www.menti.com](https://www.menti.com) and use

## the code provided

---

# Topic 3: Probability and Distributions

## In this week's readings:
We will not have time to cover every concept, so please make sure you read this topic's readings thoroughly.
<iframe src="https://bookdown.org/a_shaker/STM1001_Topic_3/" width="100%" height="400px" data-external="1"></iframe>

---
name: stat
class: middle
background-image: url(data:image/png;base64,#slide_1.png)
background-size: 110%

---

name: stat
class: middle
background-image: url(data:image/png;base64,#slide_5.png)
background-size: 100%

---

# Card game and probabilities

* Each time a card was randomly chosen in the game, what was your probability of guessing the correct suit?
--
`$\frac{1}{4}$` or `$0.25$`
--

* Focussing on just one 'trial' (one card selection), what are all of the possible outcomes?
--
[club, diamond, heart, spade]
--

* From [this topic's readings](https://bookdown.org/a_shaker/STM1001_Topic_3/1-1-some-probability-notation.html), this is called `$\Omega$`, the **sample space**: the set of all possible *outcomes* of an experiment
--

* For just one trial, we have `$\Omega = \{\text{club, diamond, heart, spade}\}$`
* A ***simple event*** is any event with *just one outcome*
* One card selection is an example of an experiment with ***equally likely events***
* Therefore, each time a card is drawn, the probability of guessing the correct suit is `$$\displaystyle\frac{1}{\text{number of simple events in }\Omega} = \frac{1}{4}.$$`
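
As a quick check of the calculation above, here is a minimal Python sketch that lists the sample space and applies the equally likely rule. The variable names (`sample_space`, `p_correct`) are illustrative assumptions and do not come from the readings.

```python
from fractions import Fraction

# Sample space for one card selection (suit only)
sample_space = ["club", "diamond", "heart", "spade"]

# With equally likely simple events, each one has
# probability 1 / (number of simple events in the sample space)
p_correct = Fraction(1, len(sample_space))

print(p_correct)         # 1/4
print(float(p_correct))  # 0.25
```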

---

# Some probability facts

* Probabilities are always between 0 and 1 (inclusive), so that, for some event `$A$`:
  * `$P(A) = 0$` means that event `$A$` will definitely not occur
  * `$P(A) = 0.5$` means that event `$A$` is equally likely to occur or not occur
  * `$P(A) = 1$` means that event `$A$` will definitely occur.
--

* Probabilities cannot be negative
--

* `$P(\Omega) = 1$`. For example, when we draw a card, we know that exactly one of the outcomes in the sample space will definitely occur (a club, a diamond, a heart or a spade)
--

* For one trial, we know the probability of getting a ‘heart’ is 0.25. What is the probability of NOT getting a heart?
--

* We can use the complement rule to answer this:

.content-box-blue[
.center[
**The complement rule**
]
`$$P(A^C) = 1 - P(A),$$`
where:

* `$A^C$` means "not `$A$`". That is, `$A^C$` is the complement of `$A$`.
]
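
As a short worked example, take `$A$` to be the event that the card drawn is a heart, so `$P(A) = 0.25$` as stated above. The complement rule then gives:

`$$P(A^C) = 1 - P(A) = 1 - 0.25 = 0.75.$$`

This agrees with counting directly: three of the four equally likely suits are not hearts, and `$\frac{3}{4} = 0.75$`.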

---

# Random variables

* Considering all 10 ‘trials’ now, we know that for each trial, you had a 0.25 chance of guessing the correct answer
--

* We can let `$X$` denote the number of times you guessed correctly
* Then the possible values for `$X$` are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
--

* We can use something called the binomial distribution to find your chances for each of these values of `$X$`.
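
The chances for each value of `$X$` can be sketched in Python as below. This is a minimal illustration rather than the method used in the readings: it uses the standard binomial probability formula, and it assumes the 10 guesses are independent, each with a 0.25 chance of being correct. The function name `binomial_pmf` is a hypothetical helper introduced here.

```python
from math import comb

n = 10    # number of trials (card selections)
p = 0.25  # assumed probability of a correct guess on each trial


def binomial_pmf(k, n, p):
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of each possible number of correct guesses, 0 to 10
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(probabilities):
    print(f"P(X = {k:2d}) = {prob:.4f}")

# The probabilities over all possible values of X add up to 1
print(f"Total = {sum(probabilities):.4f}")
```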

---
# Random variables

* In the previous example, `$X$` was a discrete random variable because we could write down the possible values of `$X$` sequentially with a recognisable pattern. In other words, they are countable

* We can also have continuous random variables such as height, weight or age

---

# The normal distribution

.pull-left[
* Bell-shaped

* Symmetric

* Total area under the curve is 1 (this includes all possible values for `$X$`)

* Can be specified by key parameters: the mean, `$\mu$`, and the standard deviation, `$\sigma$`, or alternatively the variance, `$\sigma^2$`

* We can express the distribution of an arbitrary normally distributed random variable `$X$` as `$X \sim N(\mu, \sigma^2)$`
]

.pull-right[
<img src="data:image/png;base64,#Topic_3_Workshop_files/figure-html/unnamed-chunk-3-1.svg" style="display: block; margin: auto;" />
]
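
The symmetry and total-area properties listed above can be checked numerically. The sketch below uses `NormalDist` from Python's standard library; the parameter values `mu = 0` and `sigma = 1`, the evaluation points and the number of steps are all arbitrary choices made for illustration only.

```python
from statistics import NormalDist

# Illustrative parameters (assumed for this sketch)
mu, sigma = 0.0, 1.0
dist = NormalDist(mu, sigma)

# Symmetry: the curve has the same height at equal distances
# on either side of the mean
left = dist.pdf(mu - 1.5 * sigma)
right = dist.pdf(mu + 1.5 * sigma)

# Total area: approximate the area under the curve with a midpoint
# sum over the interval from mu - 8*sigma to mu + 8*sigma
steps = 16000
width = 16 * sigma / steps
area = sum(
    dist.pdf(mu - 8 * sigma + (i + 0.5) * width) for i in range(steps)
) * width

print(left, right)
print(round(area, 6))
```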

---
# The normal distribution

.pull-left[
* Let's assume university students' heights follow a `$N(172.38, 9.85^2)$` distribution
* Then we would expect the following:

.content-box-blue[
.center[
**Some rules of thumb for an approximately normal data set**
]
1. About 68% of students' heights are within 1 standard deviation of the mean
1. About 95% of students' heights are within 2 standard deviations of the mean
1. About 99.7% of students' heights are within 3 standard deviations of the mean
]
]

.pull-right[
<img src="data:image/png;base64,#Topic_3_Workshop_files/figure-html/unnamed-chunk-4-1.svg" width="100%" style="display: block; margin: auto;" />
]
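
These rules of thumb can be compared with the exact proportions under the assumed `$N(172.38, 9.85^2)$` model. The following is a minimal sketch using `NormalDist` from Python's standard library; the dictionary name `within` is an illustrative assumption.

```python
from statistics import NormalDist

mu, sigma = 172.38, 9.85  # mean and standard deviation from the slide (cm)
heights = NormalDist(mu, sigma)

# Proportion of heights within k standard deviations of the mean
within = {}
for k in (1, 2, 3):
    lower, upper = mu - k * sigma, mu + k * sigma
    within[k] = heights.cdf(upper) - heights.cdf(lower)
    print(f"Within {k} SD ({lower:.2f} cm to {upper:.2f} cm): {within[k]:.4f}")
```

The printed proportions should be close to 0.68, 0.95 and 0.997 respectively, which is why the rules are stated with "about".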

---

# Activity 1 (individual)

* Considering your height in centimetres, how many standard deviations are you from the mean? [Recall we have `$X \sim N(172.38, 9.85^2)$`]

---
# Standardisation

* Very useful technique commonly used in statistics

* Puts everything on a ***standard*** scale that makes comparing things easier

* Recall in our height example, we had `$X \sim N(172.38, 9.85^2)$`

* If we standardise everything, we have a special case of the normal distribution: the ***standard*** normal distribution: `$Z \sim N(0, 1)$`

* The usual convention is to use `$Z$` instead of `$X$` when using the ***standard*** normal distribution

* We can ***standardise*** values with the following formula:
`$$ z = \frac{x - \mu}{\sigma} $$`

* This gives us a `$z$`-score
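
The standardisation formula can be sketched in Python as follows. The two heights used (180 cm and 160 cm) are hypothetical values chosen purely for illustration, and `z_score` is a helper name introduced here, not something defined in the readings.

```python
mu = 172.38   # mean height from the slide (cm)
sigma = 9.85  # standard deviation from the slide (cm)


def z_score(x, mu, sigma):
    """Standardise x: the number of standard deviations x lies from the mean."""
    return (x - mu) / sigma


# Hypothetical heights, for illustration only
z_tall = z_score(180, mu, sigma)
z_short = z_score(160, mu, sigma)

print(round(z_tall, 2))   # 0.77  (above the mean, so positive)
print(round(z_short, 2))  # -1.26 (below the mean, so negative)
```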

---

# Activity 2 (individual)

1. Use the standardisation formula to calculate your `$z$`-score:
`$$ z = \frac{x - \mu}{\sigma}, $$`
where:
  * Your own height in cm is `$x$`
  * The mean is `$\mu = 172.38$`
  * The standard deviation is `$\sigma = 9.85$`

1. Is your `$z$`-score above 0 or below 0?

1. Compare your `$z$`-score to the answer you got in the previous activity (how many standard deviations are you away from the mean?).

* [The size of your answer should be the same as in the previous activity; your `$z$`-score is negative if you are shorter than the mean and positive if you are taller]

---

background-image: url(data:image/png;base64,#computerlab.jpg)
background-position: bottom
background-size: 75%
class: center

# See you in the computer labs!

Continue with this topic's readings: [Topic 3 Readings](https://bookdown.org/a_shaker/STM1001_Topic_3/)

We will explore the normal distribution further in this week’s computer lab

---
class: middle

<font color = "grey">
These notes have been prepared by Amanda Shaker. The copyright for the material in these notes resides with the authors named above, with the Department of Mathematics and Statistics and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License 
<a href = "https://creativecommons.org/licenses/by-nc-nd/4.0/" target="_blank"> BY-NC-ND. </a>
</font>