1. Introduction

A measure of central tendency is a single value that summarizes a data set by identifying its central position. In statistics, the three most common measures are the mean, the median, and the mode.

2. The Arithmetic Mean

The mean (or average) is the sum of all observations divided by the total number of observations.

Mathematical Formula

For a sample of size \(n\), the sample mean \(\bar{x}\) is: \[\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}\]

Where:

  * \(\sum\) is the summation symbol.
  * \(x_i\) represents each individual value.
  * \(n\) is the total number of values in the sample.

Real-Life Example

Scenario: A tech startup tracks the number of hours five employees worked in a day: 8, 9, 7, 10, and 6 hours.
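
As a quick check of the formula against this scenario, the arithmetic with \(n = 5\) works out as follows:

\[\bar{x} = \frac{8 + 9 + 7 + 10 + 6}{5} = \frac{40}{5} = 8\]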

R Implementation

# Data vector
work_hours <- c(8, 9, 7, 10, 6)

# Calculating Mean
mean_val <- mean(work_hours)
print(paste("The mean work hours are:", mean_val))
## [1] "The mean work hours are: 8"

3. The Median

The median is the middle value in a data set when the numbers are arranged in ascending or descending order. Because it depends only on the ordering of the values and not on their magnitudes, it is robust to outliers.

Mathematical Formula

  1. Arrange data in order from smallest to largest.
  2. If \(n\) is odd, the median is the value at position: \(\frac{n+1}{2}\)
  3. If \(n\) is even, the median is the average of the values at positions: \(\frac{n}{2}\) and \(\frac{n}{2} + 1\)
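
The three steps above can be sketched directly in R. This is a minimal illustration, not a replacement for the built-in median(): the name manual_median is hypothetical, and the sketch assumes a non-empty numeric vector.

```r
# Hypothetical helper (not part of base R): median via the position rule
manual_median <- function(x) {
  x_sorted <- sort(x)   # Step 1: arrange from smallest to largest
  n <- length(x_sorted)
  if (n %% 2 == 1) {
    # Step 2: odd n, take the value at position (n + 1) / 2
    x_sorted[(n + 1) / 2]
  } else {
    # Step 3: even n, average the values at positions n / 2 and n / 2 + 1
    mean(x_sorted[c(n / 2, n / 2 + 1)])
  }
}

manual_median(c(7, 3, 9))      # odd n: returns 7
manual_median(c(8, 9, 7, 10))  # even n: returns 8.5
```

For everyday use the built-in median() is preferable; the sketch only makes the odd/even rule explicit.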

Real-Life Example

Scenario: Monthly house rents in a neighborhood: $1200, $1250, $1300, $1400, and $5000 (an outlier).

R Implementation

rents <- c(1200, 1250, 1300, 1400, 5000)

# Mean vs Median comparison
print(paste("Mean Rent:", mean(rents)))
## [1] "Mean Rent: 2030"
print(paste("Median Rent:", median(rents)))
## [1] "Median Rent: 1300"

Observation: Notice how the mean is pulled upward by the $5000 rent, while the median stays representative of the “typical” house.
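
To put a number on that pull, one option is to drop the outlier and see how far each measure moves. The variable names rents_typical, mean_shift, and median_shift below are illustrative assumptions, not part of the original example.

```r
rents <- c(1200, 1250, 1300, 1400, 5000)

# Remove the single $5000 outlier (illustrative; assumes we know which value it is)
rents_typical <- rents[rents != 5000]

# How much does each measure change once the outlier is gone?
mean_shift   <- mean(rents) - mean(rents_typical)      # 2030 - 1287.5 = 742.5
median_shift <- median(rents) - median(rents_typical)  # 1300 - 1275   = 25

print(paste("Mean shift:", mean_shift))
print(paste("Median shift:", median_shift))
```

A single extreme value moves the mean by several hundred dollars, while the median moves by only $25.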


4. The Mode

The mode is the value that appears most frequently in a data set. A set can be unimodal (one mode), bimodal (two modes), or multimodal.

Real-Life Example

Scenario: A shoe store records the sizes of sneakers sold in an hour: 7, 8, 8, 9, 10, 8, 11.

R Implementation

Note: Base R has no built-in function for the statistical mode (the built-in mode() returns an object's storage mode instead), so we use the table function or a custom function.

shoe_sizes <- c(7, 8, 8, 9, 10, 8, 11)

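# Using table to find frequency
freq_table <- table(shoe_sizes)
# which.max() returns only the first maximum, so ties (e.g., bimodal data) are not reported;
# names() returns the value as a character string
mode_val <- names(freq_table)[which.max(freq_table)]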

print(freq_table)
## shoe_sizes
##  7  8  9 10 11 
##  1  3  1  1  1
print(paste("The Mode shoe size is:", mode_val))
## [1] "The Mode shoe size is: 8"
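
The note above mentions a custom function as an alternative. A minimal sketch of one is shown here, assuming we want every value tied for the highest frequency so that bimodal and multimodal data are reported in full. The name all_modes is hypothetical and not part of base R.

```r
# Hypothetical helper: return every value tied for the highest frequency
all_modes <- function(x) {
  freq <- table(x)
  modes <- names(freq)[freq == max(freq)]
  # table() stores the values as character names; convert back for numeric input
  if (is.numeric(x)) as.numeric(modes) else modes
}

all_modes(c(7, 8, 8, 9, 10, 8, 11))  # unimodal: returns 8
all_modes(c(1, 1, 2, 2, 3))          # bimodal: returns 1 and 2
```

Unlike the which.max approach, this returns all tied values rather than only the first one.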

5. Comparison: When to use what?

| Measure | Best Used For | Sensitivity to Outliers |
|---------|---------------|-------------------------|
| Mean | Continuous data with a symmetric distribution (e.g., height) | Highly sensitive |
| Median | Skewed data or data with outliers (e.g., income) | Resistant |
| Mode | Categorical/nominal data (e.g., favorite color) | Resistant |

6. Visualizing Central Tendency

Let’s visualize where these measures sit on a distribution using a generated dataset of exam scores.

# Generate random data
set.seed(123)
scores <- rgamma(100, shape = 2, scale = 10) # Right-skewed distribution

m_mean <- mean(scores)
m_median <- median(scores)

# Plotting
hist(scores, col="lightblue", main="Distribution of Exam Scores", xlab="Score")
abline(v = m_mean, col = "red", lwd = 2, lty = 1)
abline(v = m_median, col = "blue", lwd = 2, lty = 2)

legend("topright", legend=c("Mean", "Median"), 
       col=c("red", "blue"), lwd=2, lty=1:2)

7. Summary Checklist

Key Features of this Template:

  1. LaTeX Integration: Uses $...$ for inline math and $$...$$ for block equations.
  2. Code Chunks: Includes executable R code that generates results and plots.
  3. Visual Aids: Includes a histogram to demonstrate the difference between mean and median in a skewed distribution.
  4. Formatting: Uses Markdown tables and headers for readability.

How to use this:

  1. Install R and RStudio.
  2. Install the rmarkdown package: install.packages("rmarkdown").
  3. Create a new RMarkdown file and paste the code above.
  4. Click Knit at the top of the editor.