---
title: "Week 6 Assignment – Height Analysis"
author: "Tyler Whittney"
date: "2026-02-25"
output: html_document
---
This assignment analyzes the height variable from
the ok_cupid_data_full.csv dataset. I calculate measures of
center and measures of spread, and I create two graphs.
library(readr)
library(ggplot2)
okcupid_data <- read_csv("ok_cupid_data_full.csv")
## Rows: 300 Columns: 19
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (16): body_type, diet, drinks, drugs, education, ethnicity, job, offspri...
## dbl (3): age, height, income
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
height <- na.omit(okcupid_data$height)
mean(height)
## [1] 67.98
median(height)
## [1] 68
get_mode <- function(x) {
  uniq_x <- unique(x)
  uniq_x[which.max(tabulate(match(x, uniq_x)))]
}
get_mode(height)
## [1] 67
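Explanation: The mean (67.98 inches) is the average height. The median (68 inches) is the middle height when the values are sorted. The mode (67 inches) is the most common height. All three are close together, which suggests the heights are roughly symmetric around 68 inches.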
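The get_mode() function above returns only one value, so if two heights are tied for most common, only the one that appears first in the data is reported. Here is a minimal sketch of one way to list every tied mode. The helper name get_all_modes() and the sample values are my own assumptions for illustration and are not part of the dataset.

```r
# Hypothetical heights (inches); illustrative values, not taken from the dataset.
sample_heights <- c(66, 67, 67, 68, 68, 70)

# Hypothetical helper: return every value that shares the highest frequency.
get_all_modes <- function(x) {
  counts <- table(x)                                # frequency of each distinct value
  as.numeric(names(counts)[counts == max(counts)])  # keep all values tied for the top count
}

all_modes <- get_all_modes(sample_heights)
all_modes
```

In this made-up sample, 67 and 68 each appear twice, so both are returned.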
range(height)
## [1] 59 80
var(height)
## [1] 15.42435
sd(height)
## [1] 3.927384
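Explanation: The range gives the smallest and largest heights, 59 and 80 inches. The variance (about 15.42 squared inches) and the standard deviation (about 3.93 inches) show how spread out the heights are around the mean. The standard deviation is the square root of the variance, so it is in the same units as the heights.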
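To show how these measures relate to each other, here is a small sketch that computes the width of the range and rebuilds the variance and standard deviation by hand. The sample values and variable names are assumptions made up for illustration; they are not from the dataset.

```r
# Hypothetical heights (inches); illustrative values, not taken from the dataset.
sample_heights <- c(64, 66, 67, 68, 70, 73)

# Width of the range: largest value minus smallest value.
range_width <- diff(range(sample_heights))

# Sample variance by hand: squared deviations from the mean, divided by n - 1,
# which is the same divisor var() uses.
n <- length(sample_heights)
manual_var <- sum((sample_heights - mean(sample_heights))^2) / (n - 1)

# The standard deviation is the square root of the variance.
manual_sd <- sqrt(manual_var)

# Compare the hand calculations with the built-in functions.
all.equal(manual_var, var(sample_heights))
all.equal(manual_sd, sd(sample_heights))
```

Both comparisons should agree, since var() and sd() use the same n - 1 formula.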
ggplot(okcupid_data, aes(x = height)) +
  geom_histogram(bins = 30) +
  labs(title = "Histogram of Height",
       x = "Height (inches)",
       y = "Frequency")
ggplot(okcupid_data, aes(y = height)) +
  geom_boxplot() +
  labs(title = "Boxplot of Height",
       y = "Height (inches)")
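Explanation: The histogram shows the shape of the height distribution, such as where the heights cluster and whether the distribution is skewed. The boxplot shows the median as the line inside the box, the interquartile range as the box itself, and possible outliers as individual points beyond the whiskers.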
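The points a boxplot draws as possible outliers can also be found numerically. Here is a rough sketch of the usual 1.5 × IQR rule, which is the default rule geom_boxplot() uses for its whiskers. The sample values and variable names are assumptions for illustration and are not taken from the dataset.

```r
# Hypothetical heights (inches); illustrative values, not taken from the dataset.
sample_heights <- c(59, 64, 66, 67, 68, 68, 69, 70, 72, 80)

# First and third quartiles and the interquartile range.
q <- quantile(sample_heights, c(0.25, 0.75))
iqr <- IQR(sample_heights)

# Fences: anything more than 1.5 * IQR beyond the quartiles is flagged.
lower_fence <- q[[1]] - 1.5 * iqr
upper_fence <- q[[2]] + 1.5 * iqr

# Values outside the fences are the possible outliers.
flagged <- sample_heights[sample_heights < lower_fence | sample_heights > upper_fence]
flagged
```

In this made-up sample the fences are 61 and 75 inches, so 59 and 80 are flagged.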
I imported a CSV file, analyzed the height variable, calculated measures of center and spread, and created two graphs to summarize the data.