1. Introduction

When we collect a set of data, we often want to find a single value that represents the “center” or the “typical” value of the group. In statistics, these are called Measures of Central Tendency.

For Grade 10, we focus on ungrouped data, which is raw data listed individually (e.g., a list of test scores: 75, 82, 90…).


2. The Mean (Arithmetic Average)

The Mean is the most common measure of central tendency. It is found by adding all the values together and dividing by the total number of values.

Mathematical Formula

For a dataset \(x_1, x_2, \dots, x_n\): \[\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}\]

Where:

  • \(\bar{x}\) (read as “x-bar”) is the sample mean.
  • \(\sum\) is the summation symbol (add them all up).
  • \(n\) is the number of observations.

Real-Life Example

A student’s grades in 5 subjects are: 85, 90, 78, 92, and 80.

Calculation: \[\bar{x} = \frac{85 + 90 + 78 + 92 + 80}{5} = \frac{425}{5} = 85\]

Using R:

grades <- c(85, 90, 78, 92, 80)
mean(grades)
## [1] 85
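
The formula can also be followed one step at a time, which makes the link between the \(\sum\) notation and the R code explicit. This is only a minimal sketch using the same grades; the names total, n, and x_bar are chosen here for illustration.

```r
# Grades from the example above
grades <- c(85, 90, 78, 92, 80)

total <- sum(grades)     # numerator: the sum of all values
n     <- length(grades)  # denominator: the number of observations
x_bar <- total / n       # the mean, following the formula

x_bar
## [1] 85
```

The result matches mean(grades), which does the same arithmetic in one call.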

3. The Median (The Middle Value)

The Median is the middle value of a dataset when the numbers are arranged in ascending or descending order.

How to find the Median:

  1. Sort the data from smallest to largest.
  2. If \(n\) is odd: The median is the middle number at position \(\frac{n+1}{2}\).
  3. If \(n\) is even: The median is the average of the two middle numbers, at positions \(\frac{n}{2}\) and \(\frac{n}{2} + 1\).

Real-Life Example (Even Number of Data Points)

Daily pocket money for 6 students: $10, $15, $12, $20, $15, $50.

  1. Sort: 10, 12, 15, 15, 20, 50
  2. Middle two: 15 and 15.
  3. Median: \((15 + 15) / 2 = 15\).

Using R:

money <- c(10, 15, 12, 20, 15, 50)
median(money)
## [1] 15
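
The sorting steps listed earlier can be sketched as a small function, to show what median() is doing behind the scenes. The helper name manual_median is hypothetical (it is not part of base R), and the sketch assumes a numeric vector with no missing values.

```r
# Hypothetical helper (not a base R function): the median, step by step
manual_median <- function(x) {
  sorted <- sort(x)        # Step 1: sort from smallest to largest
  n <- length(sorted)
  if (n %% 2 == 1) {
    # Odd n: the single middle value at position (n + 1) / 2
    sorted[(n + 1) / 2]
  } else {
    # Even n: the average of the two middle values
    (sorted[n / 2] + sorted[n / 2 + 1]) / 2
  }
}

money <- c(10, 15, 12, 20, 15, 50)
manual_median(money)                   # even case, n = 6
## [1] 15
manual_median(c(85, 90, 78, 92, 80))   # odd case, n = 5
## [1] 85
```

In practice the built-in median() is the better choice; the function above is only meant to mirror the manual steps.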

4. The Mode (The Most Frequent)

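The Mode is the value that appears most often in the dataset. A dataset can have more than one mode (when two or more values tie for the highest frequency) or no mode at all (when every value appears only once).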

Real-Life Example

Shoe sizes of 8 players: 7, 8, 8, 9, 10, 8, 11, 9. The number 8 appears three times. The mode is 8.

Using R: (Note: R’s built-in mode() function reports how an object is stored, not the statistical mode, so we use a frequency table instead).

shoes <- c(7, 8, 8, 9, 10, 8, 11, 9)
table(shoes) # Look for the highest frequency
## shoes
##  7  8  9 10 11 
##  1  3  2  1  1
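
Instead of reading the table by eye, the most frequent value can be picked out with code. The sketch below is one possible approach, not a standard R function: the helper name stat_mode is hypothetical, and it returns every value that ties for the highest count.

```r
shoes <- c(7, 8, 8, 9, 10, 8, 11, 9)

# Hypothetical helper (not part of base R): all values tied for the top count
stat_mode <- function(x) {
  counts <- table(x)                             # frequency of each distinct value
  modes <- names(counts)[counts == max(counts)]  # labels with the highest frequency
  # table() stores its labels as text, so convert back for numeric input
  if (is.numeric(x)) as.numeric(modes) else modes
}

stat_mode(shoes)
## [1] 8
```

Because ties are kept, a vector such as c(1, 1, 2, 2, 3) would give two modes, 1 and 2.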

5. Comparing the Measures (The Outlier Effect)

One of the most important concepts in Grade 10 is understanding how outliers (extreme values) affect these measures.

Visualizing the Difference

Imagine a small company with 5 employees. Four of them earn monthly salaries of $2,000, $2,200, $2,100, and $2,300, while the fifth, the CEO, earns $15,000.

  • The Mean ($4,720): It is pulled up by the CEO’s high salary, so it doesn’t represent the typical worker well.
  • The Median ($2,200): It remains at the center. It is “resistant” to outliers.
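
The salary figures can be checked in R, and removing the CEO shows how much a single outlier moves each measure. This is a minimal sketch; the vector names salaries and staff are chosen here for illustration.

```r
# Monthly salaries from the example; the last value is the CEO
salaries <- c(2000, 2200, 2100, 2300, 15000)

mean(salaries)
## [1] 4720
median(salaries)
## [1] 2200

# Drop the 5th value (the CEO) and recompute both measures
staff <- salaries[-5]
mean(staff)
## [1] 2150
median(staff)
## [1] 2150
```

Without the CEO the mean falls by $2,570, while the median moves by only $50.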

6. Summary Table

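| Measure | Definition         | Best Used When…                                     |
|---------|--------------------|-----------------------------------------------------|
| Mean    | Arithmetic Average | Data is symmetric (no extreme outliers).            |
| Median  | Middle Value       | Data has extreme outliers (like house prices).      |
| Mode    | Most Frequent      | Dealing with categorical data (like favorite color). |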
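
To see why the mode suits categorical data, consider a short sketch with favorite colors. The survey answers below are invented purely for illustration. Words cannot be added or sorted by size in a meaningful way, so the mean and median do not apply, but counting still works.

```r
# Hypothetical survey answers, invented for this illustration
colors <- c("blue", "red", "blue", "green", "blue", "red")

counts <- table(colors)            # frequency of each color
names(counts)[which.max(counts)]   # label with the highest count (the first one if tied)
## [1] "blue"
```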

7. Exercises

Exercise 1: Calculation

The numbers of goals scored by a soccer team in 7 matches are: 2, 0, 1, 3, 2, 1, 5.

  1. Calculate the Mean.
  2. Find the Median.
  3. Identify the Mode.

Exercise 2: The Missing Value

The mean of four numbers is 10. Three of the numbers are 8, 12, and 7. What is the fourth number?

Exercise 3: Critical Thinking

A class of 20 students took a test. 19 students scored between 70 and 80, but one student scored 0 because they were absent.

  • Which measure (Mean or Median) will give a better reflection of the class’s performance? Why?

Exercise 4: R Practice

Create a vector in R called temp with the following temperatures: 22, 25, 22, 28, 30, 22, 24. Write the code to find the mean and median.

# Write your code here
# temp <- c(...)
# mean(temp)
