When we collect a set of data, we often want to find a single value that represents the “center” or the “typical” value of the group. In statistics, these are called Measures of Central Tendency.
For Grade 10, we focus on ungrouped data, which is raw data listed individually (e.g., a list of test scores: 75, 82, 90…).
The Mean is the most common measure of central tendency. It is found by adding all the values together and dividing by the total number of values.
For a dataset \(x_1, x_2, \dots, x_n\): \[\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}\]
Where: * \(\bar{x}\) (read as “x-bar”) is the sample mean. * \(\sum\) is the summation symbol (add them all up). * \(n\) is the number of observations.
The Median is the middle value of a dataset when the numbers are arranged in ascending or descending order.
The Mode is the value that appears most often in the dataset.
Shoe sizes of 8 players: 7, 8, 8, 9, 10, 8, 11, 9. The number 8 appears three times. The mode is 8.
Using R: (Note: R doesn’t have a built-in
mode() function for statistics, so we use a table).
## shoes
## 7 8 9 10 11
## 1 3 2 1 1
One of the most important concepts in Grade 10 is understanding how outliers (extreme values) affect these measures.
Imagine a small company with 5 employees. Their monthly salaries are: $2,000, $2,200, $2,100, $2,300, and the CEO earns $15,000.
| Measure | Definition | Best Used When… |
|---|---|---|
| Mean | Arithmetic Average | Data is symmetric (no extreme outliers). |
| Median | Middle Value | Data has extreme outliers (like house prices). |
| Mode | Most Frequent | Dealing with categorical data (like favorite color). |
The number of goals scored by a soccer team in 7 matches are: 2, 0, 1, 3, 2, 1, 5. 1. Calculate the Mean. 2. Find the Median. 3. Identify the Mode.
The mean of four numbers is 10. Three of the numbers are 8, 12, and 7. What is the fourth number?
A class of 20 students took a test. 19 students scored between 70 and 80, but one student scored 0 because they were absent. * Which measure (Mean or Median) will give a better reflection of the class’s performance? Why?
Create a vector in R called temp with the following
temperatures: 22, 25, 22, 28, 30, 22, 24. Write the
code to find the mean and median.
```
plot() and
barplot() to avoid the rlang package issues
you encountered earlier.