1. Introduction

In statistics, a Measure of Central Tendency is a summary measure that attempts to describe a whole set of data with a single value that represents the middle or center of its distribution.

It is often referred to as a “location” measure because it tells us where the data are concentrated. The three most common measures are:

1. The Mean
2. The Median
3. The Mode

2. The Arithmetic Mean

The mean (or average) is the most popular and well-known measure of central tendency. It is calculated by summing all the values in a data set and dividing by the total number of values.

Mathematical Formula

For a sample of $n$ values, $x_1, x_2, \dots, x_n$, the sample mean $\bar{x}$ is:

\[\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}\]

Where:

- $\sum$: Summation symbol.
- $x_i$: Each individual value.
- $n$: Total number of observations.

Real-Life Example

Scenario: A teacher wants to find the average score of 5 students in a mini-quiz. Scores: 70, 85, 80, 90, 75.

Calculation: \[\bar{x} = \frac{70 + 85 + 80 + 90 + 75}{5} = \frac{400}{5} = 80\]

R Implementation

scores <- c(70, 85, 80, 90, 75)
mean_score <- mean(scores)
print(paste("The mean score is:", mean_score))

## [1] "The mean score is: 80"
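To tie the built-in `mean()` back to the formula, the following minimal sketch computes the same quantity as an explicit sum divided by the number of observations. The name `manual_mean` is introduced here purely for illustration.

```r
scores <- c(70, 85, 80, 90, 75)

# Sum of all values divided by the number of observations (n)
manual_mean <- sum(scores) / length(scores)
print(manual_mean)

# all.equal() compares with a small tolerance for floating-point rounding
print(isTRUE(all.equal(manual_mean, mean(scores))))
```

For these five scores, both approaches give 80.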

3. The Median

The median is the middle value in a data set that has been arranged in numerical order (ascending or descending). It splits the data into two equal halves.

Mathematical Formula

Sort the data from smallest to largest.
If $n$ is odd, the median is the value at position: \[\text{Median} = \left( \frac{n+1}{2} \right)^{th} \text{term}\]
If $n$ is even, the median is the average of the two middle terms: \[\text{Median} = \frac{(\frac{n}{2})^{th} \text{term} + (\frac{n}{2} + 1)^{th} \text{term}}{2}\]
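The positional rule above can be sketched directly in R. The helper name `manual_median` is hypothetical, and the sketch assumes a non-empty numeric vector with no missing values; in practice the built-in `median()` is the better choice.

```r
# Hypothetical helper: median via the positional rule
manual_median <- function(x) {
  s <- sort(x)   # step 1: order from smallest to largest
  n <- length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                 # odd n: the single middle term
  } else {
    (s[n / 2] + s[n / 2 + 1]) / 2  # even n: average of the two middle terms
  }
}

print(manual_median(c(45, 50, 52, 55, 700)))  # odd n
print(manual_median(c(4, 1, 3, 2)))           # even n
```

The first call returns 52 (the 3rd of 5 sorted values). The second returns 2.5, the average of 2 and 3.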

Real-Life Example

Scenario: Comparing household incomes in a neighborhood without letting one very high earner on the block distort the picture. Incomes (in thousands of dollars): $45, $50, $52, $55, $700.

Mean: $180.4 (This is misleading due to the $700 outlier).
Median: $52 (A much better representation of the “typical” neighbor).

R Implementation

incomes <- c(45, 50, 52, 55, 700)
median_income <- median(incomes)
print(paste("The median income is:", median_income))

## [1] "The median income is: 52"
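One way to see how differently the two measures react to the outlier is to compute both with and without the 700 value. This is an illustrative sketch; the variable `without_outlier` is introduced here and is not part of the original example.

```r
incomes <- c(45, 50, 52, 55, 700)

# Drop the 700 value to see how each measure reacts
without_outlier <- incomes[incomes != 700]

print(mean(incomes))            # 180.4
print(median(incomes))          # 52
print(mean(without_outlier))    # 50.5
print(median(without_outlier))  # 51
```

Removing a single observation moves the mean from 180.4 to 50.5, while the median shifts only from 52 to 51.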

4. The Mode

The mode is the value that appears most frequently in a data set. A data set can have one mode (unimodal), two modes (bimodal), or more than two modes (multimodal).

Real-Life Example

Scenario: A shoe store owner wants to know which shoe size to stock most heavily. Sizes sold: 7, 8, 8, 9, 10, 10, 10, 11.

Mode: 10 (Because it occurred 3 times).

R Implementation

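Note: R does not have a built-in function for the statistical mode (the built-in mode() returns an object's storage mode, such as "numeric"), so we use a small custom function that counts how often each distinct value occurs.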

shoe_sizes <- c(7, 8, 8, 9, 10, 10, 10, 11)

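get_mode <- function(v) {
  uniqv <- unique(v)  # distinct values, in order of first appearance
  # Count occurrences of each distinct value and return the most frequent;
  # on ties, which.max() picks the first one
  uniqv[which.max(tabulate(match(v, uniqv)))]
}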

print(paste("The mode shoe size is:", get_mode(shoe_sizes)))

## [1] "The mode shoe size is: 10"
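Because `which.max()` returns only the first maximum, `get_mode()` reports a single value even when the data are bimodal or multimodal. A possible variant that returns every tied value is sketched below. The name `get_modes` is hypothetical, and the numeric conversion goes through character labels, so it is best suited to integers or short decimals.

```r
# Hypothetical helper: return every value tied for the highest frequency
get_modes <- function(v) {
  counts <- table(v)  # frequency of each distinct value
  modes <- names(counts)[counts == max(counts)]
  # names() are character strings; convert back when the input was numeric
  if (is.numeric(v)) as.numeric(modes) else modes
}

print(get_modes(c(7, 8, 8, 9, 10, 10, 10, 11)))  # one mode: 10
print(get_modes(c(1, 1, 2, 3, 3)))               # two modes: 1 and 3
```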

5. Comparison: When to use which?

Measure	Best Used For…	Sensitivity to Outliers
Mean	Continuous data with a symmetrical distribution (e.g., Height).	High (Easily skewed)
Median	Skewed data or data with outliers (e.g., Salaries).	Low (Robust)
Mode	Categorical/Nominal data (e.g., Most popular car color).	Low
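For categorical data the mean and median are not defined, but the mode still is. The sketch below uses a small set of car colors invented for illustration and reads the most frequent category from a frequency table.

```r
# Illustrative data, invented for this sketch
car_colors <- c("white", "black", "white", "silver", "white", "black")

color_counts <- table(car_colors)  # frequency of each category
most_common <- names(color_counts)[which.max(color_counts)]

print(color_counts)
print(most_common)
```

Here "white" occurs three times, so it is the mode.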

6. Summary Exercise

Consider the following dataset representing the number of hours 7 students spent studying: 2, 3, 3, 4, 5, 8, 20.

Calculate the Mean: $(2+3+3+4+5+8+20)/7 = 45/7 \approx 6.43$ hours.
Find the Median: With $n = 7$ sorted values, the median is the 4th value, which is 4.
Find the Mode: The most frequent value is 3 (it appears twice).

Observation: The mean (6.43) is higher than the median (4) because it is being pulled up by the outlier (20 hours). In this case, the median provides a more accurate picture of “typical” study time.
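The exercise can be checked in R with a few lines. This is a sketch; the variable names are chosen here for illustration, and the mode is read from a frequency table rather than a dedicated function.

```r
hours <- c(2, 3, 3, 4, 5, 8, 20)

mean_hours <- round(mean(hours), 2)  # 45 / 7, rounded to two decimals
median_hours <- median(hours)        # 4th of the 7 sorted values

# Mode via a frequency table: the value with the highest count
hour_counts <- table(hours)
mode_hours <- as.numeric(names(hour_counts)[which.max(hour_counts)])

print(c(mean = mean_hours, median = median_hours, mode = mode_hours))
```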

Module 1: Measures of Central Tendency

Introduction to Statistical Descriptors

Abdikadir Abdilahi Aw Hussein

2025-12-28
