- Topic: Mean vs Median
- Goal: Understand two measures of central tendency
- Key Motivation
  - Used in exam scores, income, housing prices, sports analytics
  - Mean and median can lead to different interpretations
2025-11-14
Mean (Arithmetic Average)
- Add all values and divide by the number of observations.
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
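As a minimal sketch of the formula above, the mean can be computed by hand in R and compared with the built-in `mean()`. The vector `x` is a made-up example, not data from these slides.

```r
# Hypothetical example data (not from the slides)
x <- c(2, 4, 9)

# Apply the formula directly: sum the values, then divide by the count
n <- length(x)
manual_mean <- sum(x) / n

manual_mean                       # 5
all.equal(manual_mean, mean(x))   # TRUE
```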
Median
- The “middle” value after sorting the data.
- If \(n\) is odd: the median is the single middle value.
- If \(n\) is even: the median is the average of the two middle values.
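The odd/even rule can be sketched as a small function. `manual_median` is a hypothetical helper written only for illustration; it assumes a non-empty numeric vector with no missing values, and in practice base R's `median()` does the same job.

```r
# Hypothetical helper illustrating the odd/even rule
manual_median <- function(x) {
  s <- sort(x)          # the median is defined on the sorted data
  n <- length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                    # odd n: the single middle value
  } else {
    (s[n / 2] + s[n / 2 + 1]) / 2     # even n: average of the two middle values
  }
}

odd_x  <- c(7, 1, 5)       # sorted: 1 5 7   -> median 5
even_x <- c(7, 1, 5, 3)    # sorted: 1 3 5 7 -> median (3 + 5) / 2 = 4

manual_median(odd_x)
manual_median(even_x)
```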
Key Difference
- Mean uses all values and is sensitive to outliers.
- Median is more robust when the data are skewed or contain extreme values.
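A quick way to see this difference is to replace one value with an extreme one and compare both statistics. The numbers below are invented for illustration and are not the exam-score data used in these slides.

```r
# Made-up values for illustration only
scores         <- c(60, 70, 75, 80, 90)
scores_outlier <- c(60, 70, 75, 80, 900)   # last value replaced by an extreme one

mean(scores)             # 75
median(scores)           # 75

mean(scores_outlier)     # 237: pulled far upward by the single outlier
median(scores_outlier)   # 75:  unchanged, the middle value is the same
```

Changing a single observation more than triples the mean here, while the median does not move at all.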
values <- c(55, 60, 65, 67, 70, 72, 75, 77, 80, 85, 90, 100)
df <- data.frame(value = values)
mean_value <- mean(values)
median_value <- median(values)
mean_value
## [1] 74.66667
median_value
## [1] 73.5
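library(ggplot2)  # provides ggplot(), aes(), geom_boxplot(), labs()

# Boxplot: the thick line inside the box marks the median
plot_box <- ggplot(df, aes(x = "", y = value)) +
  geom_boxplot(fill = "steelblue") +
  labs(
    title = "Boxplot of Exam Scores",
    x = "",
    y = "Value"
  )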
plot_hist <- ggplot(df, aes(x = value)) +
  geom_histogram(binwidth = 5, fill = "lightblue", color = "black") +
  labs(
    title = "Histogram of Data",
    x = "Value",
    y = "Count"
  )
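library(plotly)  # provides plot_ly() and layout()

# Interactive histogram; the native pipe |> requires R 4.1 or later
plot_interactive <- plot_ly(df, x = ~value, type = "histogram") |>
  layout(
    title = "Interactive Histogram (Plotly)",
    xaxis = list(title = "Value"),
    yaxis = list(title = "Count")
  )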