Normal Distribution: Practical Industry Applications

Bridging Statistical Theory to Managerial Decision-Making

Author

Segun Iyanda

Published

March 11, 2026

“The goal is not to teach students to calculate probabilities — it is to teach them to make better decisions under uncertainty.”

Course Overview

Learning Objectives

By the end of these exercises, students will be able to:

Apply normal distribution concepts to real-world business scenarios
Interpret probabilities as actionable managerial risk statements
Recognize that() is a managerial lever, not merely a statistical parameter
Develop data-driven recommendations under uncertainty
Evaluate trade-offs using distributional thinking

1 Manufacturing & Quality Control

🏭 Industry Context

Manufacturing quality control relies heavily on process distribution analysis. Understanding variation is the foundation of operational excellence.

1.1 The Defective Product Dilemma

1.1.1 Scenario

A bottling plant fills cola bottles with a mean of 500ml and standard deviation of 5ml. FDA regulations require bottles to contain between 490ml and 515ml.

1.1.2 Tasks & Solutions

Show Code

# Parameters
mu_bottle <- 500
sd_bottle <- 5
lower      <- 490
upper      <- 515

# Probability calculation
p_compliant <- pnorm(upper, mu_bottle, sd_bottle) -
               pnorm(lower, mu_bottle, sd_bottle)

daily_production <- 50000
expected_failures <- daily_production * (1 - p_compliant)

# Visualization
x <- seq(mu_bottle - 4 * sd_bottle,
         mu_bottle + 4 * sd_bottle, length.out = 1000)
df_bottle <- tibble(x = x, y = dnorm(x, mu_bottle, sd_bottle))

ggplot(df_bottle, aes(x, y)) +
  # Fail zones
  geom_area(
    data = filter(df_bottle, x < lower),
    aes(x, y), fill = palette["secondary"], alpha = 0.4
  ) +
  geom_area(
    data = filter(df_bottle, x > upper),
    aes(x, y), fill = palette["warning"], alpha = 0.4
  ) +
  # Pass zone
  geom_area(
    data = filter(df_bottle, x >= lower & x <= upper),
    aes(x, y), fill = palette["accent"], alpha = 0.4
  ) +
  geom_line(color = palette["primary"], linewidth = 1.2) +
  geom_vline(xintercept = c(lower, upper),
             linetype = "dashed", color = palette["secondary"], linewidth = 0.9) +
  geom_vline(xintercept = mu_bottle,
             linetype = "solid", color = palette["primary"], linewidth = 1) +
  annotate("text", x = 495,   y = 0.06,  label = "✅ FDA\nCompliant",
           color = "#27AE60", fontface = "bold", size = 3.5) +
  annotate("text", x = 487,   y = 0.025, label = "❌ Under\nFill",
           color = palette["secondary"], fontface = "bold", size = 3) +
  annotate("text", x = 518,   y = 0.025, label = "⚠️ Over\nFill",
           color = palette["warning"],   fontface = "bold", size = 3) +
  labs(
    title    = "Bottle Fill Distribution — FDA Compliance Analysis",
    subtitle = paste0("μ = ", mu_bottle, "ml  |  σ = ", sd_bottle,
                      "ml  |  Compliance Rate = ",
                      round(p_compliant * 100, 2), "%"),
    x = "Fill Volume (ml)",
    y = "Density"
  )

Figure 1: Cola Bottle Fill Distribution with FDA Compliance Bounds

Show Code

tibble(
  Metric  = c(
    "Compliance Probability",
    "Daily Expected Failures",
    "Annual Expected Failures (250 days)"
  ),
  Value   = c(
    paste0(round(p_compliant * 100, 3), "%"),
    format(round(expected_failures), big.mark = ","),
    format(round(expected_failures * 250), big.mark = ",")
  ),
  `Business Impact` = c(
    "FDA regulatory threshold",
    "Units requiring rework or disposal",
    "Annual waste exposure"
  )
) |>
  kbl(align = c("l", "c", "l")) |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed", "responsive"),
    full_width = FALSE
  ) |>
  column_spec(1, bold = TRUE, color = "#2C3E50") |>
  column_spec(2, color = "#2980B9", bold = TRUE)

Table 1: Bottling Plant Compliance Summary

Metric	Value	Business Impact
Compliance Probability	97.59%	FDA regulatory threshold
Daily Expected Failures	1,205	Units requiring rework or disposal
Annual Expected Failures (250 days)	301,250	Annual waste exposure

💬 Managerial Discussion Prompt

Task 3: The plant manager wants to reduce waste below 1%. Should they:

(A) Adjust the process mean (recalibrate machinery)?
(B) Reduce process variance (invest in precision equipment)?

Use your calculations to justify the recommendation. What are the cost trade-offs of each approach?

1.2 Six Sigma Process Capability

1.2.1 Scenario

A circuit board manufacturer measures component resistance: μ = 100 ohms, σ = 2 ohms. Acceptable range: 94–106 ohms.

Show Code

mu_circuit <- 100
sd_circuit <- 2
lsl        <- 94   # Lower Spec Limit
usl        <- 106  # Upper Spec Limit

# Process Capability Index
cp  <- (usl - lsl) / (6 * sd_circuit)
cpk <- min((usl - mu_circuit) / (3 * sd_circuit),
           (mu_circuit - lsl) / (3 * sd_circuit))

p_defect <- (1 - (pnorm(usl, mu_circuit, sd_circuit) -
                  pnorm(lsl, mu_circuit, sd_circuit))) * 1e6

x2 <- seq(mu_circuit - 5 * sd_circuit,
          mu_circuit + 5 * sd_circuit, length.out = 1000)
df_circuit <- tibble(x = x2, y = dnorm(x2, mu_circuit, sd_circuit))

ggplot(df_circuit, aes(x, y)) +
  geom_area(
    data = filter(df_circuit, x >= lsl & x <= usl),
    aes(x, y), fill = palette["accent"], alpha = 0.35
  ) +
  geom_area(
    data = filter(df_circuit, x < lsl | x > usl),
    aes(x, y), fill = palette["secondary"], alpha = 0.5
  ) +
  geom_line(color = palette["primary"], linewidth = 1.2) +
  geom_vline(xintercept = c(lsl, usl),
             linetype = "dashed", color = palette["secondary"], linewidth = 0.9) +
  annotate("text", x = 100, y = 0.15,
           label = paste0("Cp = ", round(cp, 3),
                          "\nCpk = ", round(cpk, 3)),
           size = 4, fontface = "bold", color = "#2C3E50") +
  labs(
    title    = "Six Sigma Process Capability — Resistance Distribution",
    subtitle = paste0("Defect Rate: ", round(p_defect, 1), " DPMO"),
    x        = "Resistance (Ohms)",
    y        = "Density"
  )

Figure 2: Circuit Board Resistance Distribution and Process Capability

📐 Key Formula Reference

(C_p = C_{pk} = (, ))

A Six Sigma process achieves(C_{pk} ), yielding only 3.4 DPMO (defects per million opportunities).

2 Healthcare & Operations Management

🏥 Industry Context

In healthcare, distributional decisions have direct human consequences. Operations managers must balance cost, staffing, and patient outcomes simultaneously.

2.1 Emergency Room Staffing Model

2.1.1 Scenario

Patient wait times: μ = 35 minutes, σ = 8 minutes. Hospital policy: 95% of patients seen within a target threshold.

Show Code

mu_er   <- 35
sd_er   <- 8
sd_new  <- 5   # After hiring additional nurse

threshold_95 <- qnorm(0.95, mu_er, sd_er)
threshold_30 <- qnorm(0.30, mu_er, sd_er)

# Comparative visualization
x_er <- seq(0, 70, length.out = 1000)

df_er <- tibble(
  x       = rep(x_er, 2),
  y       = c(dnorm(x_er, mu_er, sd_er),
              dnorm(x_er, mu_er, sd_new)),
  Staffing = rep(c("Current (σ = 8 min)", "With New Nurse (σ = 5 min)"),
                 each = length(x_er))
)

p_over45_current <- 1 - pnorm(45, mu_er, sd_er)
p_over45_new     <- 1 - pnorm(45, mu_er, sd_new)

ggplot(df_er, aes(x, y, color = Staffing, fill = Staffing)) +
  geom_line(linewidth = 1.2) +
  geom_area(alpha = 0.15) +
  geom_vline(xintercept = threshold_95,
             linetype = "dashed", color = "#2C3E50", linewidth = 0.8) +
  geom_vline(xintercept = 45,
             linetype = "dotted", color = palette["secondary"], linewidth = 0.9) +
  annotate("text", x = threshold_95 + 1, y = 0.045,
           label = paste0("95th pct\n", round(threshold_95, 1), " min"),
           size = 3, color = "#2C3E50") +
  annotate("text", x = 47, y = 0.04,
           label = "45 min\nthreshold",
           size = 3, color = palette["secondary"]) +
  scale_color_manual(values = c(palette["primary"], palette["accent"])) +
  scale_fill_manual(values  = c(palette["primary"], palette["accent"])) +
  labs(
    title    = "ER Wait Time Distributions — Staffing Scenarios",
    subtitle = paste0(
      "P(wait > 45 min) | Current: ",
      round(p_over45_current * 100, 1), "%  →  With New Nurse: ",
      round(p_over45_new * 100, 1), "%"
    ),
    x      = "Wait Time (minutes)",
    y      = "Density",
    color  = "Staffing Scenario",
    fill   = "Staffing Scenario"
  ) +
  theme(legend.position = "bottom")

Figure 3: Emergency Room Wait Time Analysis — Staffing Decision Scenarios

Show Code

tibble(
  Metric = c(
    "95th Percentile Threshold",
    "Fast-Track Cutoff (30th pct)",
    "P(Wait > 45 min) — Current",
    "P(Wait > 45 min) — With Nurse",
    "Improvement",
    "Annual Nurse Cost"
  ),
  Result = c(
    paste0(round(threshold_95, 1), " minutes"),
    paste0(round(threshold_30, 1), " minutes"),
    paste0(round(p_over45_current * 100, 2), "%"),
    paste0(round(p_over45_new     * 100, 2), "%"),
    paste0(round((p_over45_current - p_over45_new) * 100, 2),
           " percentage points"),
    "$80,000"
  )
) |>
  kbl(col.names = c("Decision Metric", "Calculated Result")) |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  ) |>
  column_spec(1, bold = TRUE, color = "#2C3E50") |>
  column_spec(2, color = palette["primary"], bold = TRUE) |>
  row_spec(6, background = "#FDEBD0", bold = TRUE)

Table 2: Hospital Staffing Investment Analysis

Decision Metric	Calculated Result
95th Percentile Threshold	48.2 minutes
Fast-Track Cutoff (30th pct)	30.8 minutes
P(Wait > 45 min) — Current	10.56%
P(Wait > 45 min) — With Nurse	2.28%
Improvement	8.29 percentage points
Annual Nurse Cost	$80,000

3 Supply Chain & Inventory Management

3.1 The Reorder Point Problem

3.1.1 Scenario

Daily demand: μ = 200 units, σ = 25 units. Lead time = 5 days | Stockout cost = $50/unit | Holding cost = $2/unit/day

Show Code

mu_daily  <- 200
sd_daily  <- 25
lead_time <- 5

# Lead time demand distribution
mu_lt <- mu_daily * lead_time
sd_lt <- sd_daily * sqrt(lead_time)

service_levels <- c(0.90, 0.95, 0.99)

reorder_df <- tibble(
  service_level  = service_levels,
  z_score        = qnorm(service_levels),
  reorder_point  = mu_lt + qnorm(service_levels) * sd_lt,
  safety_stock   = qnorm(service_levels) * sd_lt,
  holding_cost   = qnorm(service_levels) * sd_lt * 2 * 365,
  label          = paste0(service_levels * 100, "%")
)

# Visualization
p1 <- ggplot(reorder_df, aes(x = label, y = safety_stock, fill = label)) +
  geom_col(width = 0.5, show.legend = FALSE) +
  geom_text(aes(label = round(safety_stock, 0)),
            vjust = -0.5, fontface = "bold", size = 4) +
  scale_fill_manual(values = c("#3498DB", "#E67E22", "#E74C3C")) +
  labs(title = "Safety Stock by Service Level",
       x = "Service Level", y = "Safety Stock (Units)") +
  theme(plot.title = element_text(size = 12))

p2 <- ggplot(reorder_df, aes(x = label, y = holding_cost / 1000, fill = label)) +
  geom_col(width = 0.5, show.legend = FALSE) +
  geom_text(aes(label = paste0("$", round(holding_cost / 1000, 1), "K")),
            vjust = -0.5, fontface = "bold", size = 4) +
  scale_fill_manual(values = c("#3498DB", "#E67E22", "#E74C3C")) +
  labs(title = "Annual Holding Cost by Service Level",
       x = "Service Level", y = "Annual Holding Cost ($000s)") +
  theme(plot.title = element_text(size = 12))

p1 + p2 +
  plot_annotation(
    title    = "Inventory Decision Trade-off: Service Level vs. Cost",
    subtitle = "Higher service levels protect against stockouts but increase holding costs",
    theme    = theme(plot.title = element_text(face = "bold", size = 14))
  )

Figure 4: Safety Stock vs. Service Level Trade-off Analysis

Show Code

reorder_df |>
  mutate(
    across(c(reorder_point, safety_stock), ~ round(., 0)),
    holding_cost  = dollar(round(holding_cost, 0)),
    z_score       = round(z_score, 3),
    service_level = paste0(service_level * 100, "%")
  ) |>
  select(
    `Service Level` = service_level,
    `Z-Score`       = z_score,
    `Reorder Point (units)` = reorder_point,
    `Safety Stock (units)`  = safety_stock,
    `Annual Holding Cost`   = holding_cost
  ) |>
  kbl(align = c("c", "c", "c", "c", "c")) |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed", "responsive")
  ) |>
  column_spec(1, bold = TRUE) |>
  row_spec(2, background = "#D5F5E3") |>
  add_header_above(c(
    " " = 2,
    "Inventory Metrics" = 2,
    "Cost Impact" = 1
  ))

Table 3: Reorder Point Decision Matrix

		Inventory Metrics		Cost Impact
Service Level	Z-Score	Reorder Point (units)	Safety Stock (units)	Annual Holding Cost
90%	1.282	1072	72	$52,298
95%	1.645	1092	92	$67,124
99%	2.326	1130	130	$94,934

💬 Managerial Discussion Prompt

Which service level would you choose? Does your answer change if this analysis is for:

🛒 Retail consumer goods?
💊 Hospital medications?
✈️ Aerospace components?

4 Finance & Risk Management

4.1 Portfolio Risk Assessment

4.1.1 Scenario

Annual stock returns: μ = 12%, σ = 8%

Show Code

mu_stock <- 0.12
sd_stock <- 0.08

var_5      <- qnorm(0.05, mu_stock, sd_stock)
p_loss     <- pnorm(0, mu_stock, sd_stock)
p_loss10   <- pnorm(-0.10, mu_stock, sd_stock)

x_stock <- seq(mu_stock - 4 * sd_stock,
               mu_stock + 4 * sd_stock, length.out = 1000)
df_stock <- tibble(x = x_stock, y = dnorm(x_stock, mu_stock, sd_stock))

ggplot(df_stock, aes(x, y)) +
  # Catastrophic loss region
  geom_area(
    data = filter(df_stock, x < -0.10),
    aes(x, y), fill = "#922B21", alpha = 0.7
  ) +
  # Loss region
  geom_area(
    data = filter(df_stock, x >= -0.10 & x < 0),
    aes(x, y), fill = palette["secondary"], alpha = 0.4
  ) +
  # VaR region
  geom_area(
    data = filter(df_stock, x >= 0 & x < -var_5),
    aes(x, y), fill = palette["warning"], alpha = 0.3
  ) +
  # Gain region
  geom_area(
    data = filter(df_stock, x >= -var_5),
    aes(x, y), fill = palette["accent"], alpha = 0.25
  ) +
  geom_line(color = palette["primary"], linewidth = 1.2) +
  geom_vline(xintercept = var_5,
             linetype = "dashed", color = palette["warning"], linewidth = 0.9) +
  geom_vline(xintercept = 0,
             linetype = "dashed", color = palette["secondary"], linewidth = 0.9) +
  annotate("text", x = var_5 - 0.003, y = 3.5,
           label = paste0("VaR (5%)\n", round(var_5 * 100, 1), "%"),
           size = 3.2, hjust = 1, color = "#E67E22", fontface = "bold") +
  scale_x_continuous(labels = percent_format()) +
  labs(
    title    = "Portfolio Return Distribution — Risk Analysis",
    subtitle = paste0(
      "P(Loss) = ", round(p_loss * 100, 1),
      "%  |  P(Loss > 10%) = ", round(p_loss10 * 100, 1), "%",
      "  |  VaR(5%) = ", round(var_5 * 100, 1), "%"
    ),
    x = "Annual Return",
    y = "Density"
  )

Figure 5: Stock Return Distribution — Risk and Value at Risk Analysis

5 Marketing & Consumer Behavior

5.1 Pricing Strategy Optimization

5.1.1 Scenario

Customer willingness-to-pay: μ = $85/month, σ = $18

Show Code

mu_wtp   <- 85
sd_wtp   <- 18
prices   <- seq(50, 130, by = 5)

pricing_df <- tibble(
  Price         = prices,
  pct_buy       = 1 - pnorm(prices, mu_wtp, sd_wtp),
  revenue_index = prices * (1 - pnorm(prices, mu_wtp, sd_wtp))
)

optimal_price <- pricing_df$Price[which.max(pricing_df$revenue_index)]

p1 <- ggplot(pricing_df, aes(Price, pct_buy)) +
  geom_line(color = palette["primary"], linewidth = 1.2) +
  geom_point(color = palette["primary"], size = 2) +
  scale_y_continuous(labels = percent_format()) +
  scale_x_continuous(labels = dollar_format()) +
  labs(title = "Purchase Probability by Price",
       x = "Price ($/month)", y = "% Customers Who Buy")

p2 <- ggplot(pricing_df, aes(Price, revenue_index)) +
  geom_line(color = palette["accent"], linewidth = 1.2) +
  geom_point(color = palette["accent"], size = 2) +
  geom_vline(xintercept = optimal_price,
             linetype = "dashed", color = palette["secondary"], linewidth = 0.9) +
  annotate("text", x = optimal_price + 2, y = max(pricing_df$revenue_index) * 0.9,
           label = paste0("Optimal\n$", optimal_price),
           color = palette["secondary"], fontface = "bold", size = 3.5) +
  scale_x_continuous(labels = dollar_format()) +
  labs(title = "Revenue Index by Price Point",
       x = "Price ($/month)", y = "Revenue Index")

p1 + p2 +
  plot_annotation(
    title = "Willingness-to-Pay Analysis — Pricing Strategy",
    theme = theme(plot.title = element_text(face = "bold", size = 14))
  )

Figure 6: Revenue Optimization Across Price Points

Show Code

library(dplyr)
library(kableExtra)
library(scales)

# 1. Prepare and identify the optimal row index first
# Using near() prevents "it didn't work" due to floating point precision issues
pricing_summary <- pricing_df |>
  filter(Price %in% c(60, 75, 85, 99, 110, optimal_price)) |>
  distinct() |>
  arrange(Price)

opt_row_idx <- which(near(pricing_summary$Price, optimal_price))

# 2. Build the table
pricing_summary |>
  mutate(
    pct_buy       = percent(pct_buy, accuracy = 0.1),
    revenue_index = round(revenue_index, 2),
    Optimal       = if_else(near(Price, optimal_price), "⭐ Optimal", "")
  ) |>
  kbl(
    col.names = c("Price", "% Who Buy", "Revenue Index", ""),
    align     = "cccc",
    caption   = "Price Point Analysis"
  ) |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width        = FALSE
  ) |>
  row_spec(
    opt_row_idx, 
    background = "#D5F5E3", 
    bold       = TRUE
  )

Table 4: Price Point Analysis Table

Price Point Analysis
Price	% Who Buy	Revenue Index
60	91.8%	55.05
65	86.7%	56.34	⭐ Optimal
75	71.1%	53.31
85	50.0%	42.50
110	8.2%	9.07

6 Human Resources Management

6.1 Employee Performance & Compensation Tiers

6.1.1 Scenario

Performance scores: μ = 72, σ = 12 | Company size: 500 employees

Show Code

mu_hr      <- 72
sd_hr      <- 12
mu_hr_new  <- 78
n_emp      <- 500

# Thresholds
top10  <- qnorm(0.90, mu_hr, sd_hr)
top25  <- qnorm(0.75, mu_hr, sd_hr)
bot15  <- qnorm(0.15, mu_hr, sd_hr)

x_hr <- seq(30, 115, length.out = 1000)
df_hr <- tibble(
  x       = rep(x_hr, 2),
  y       = c(dnorm(x_hr, mu_hr,     sd_hr),
              dnorm(x_hr, mu_hr_new, sd_hr)),
  Program = rep(c("Before Training (μ=72)",
                  "After Training (μ=78)"), each = length(x_hr))
)

ggplot(df_hr, aes(x, y, color = Program, fill = Program)) +
  geom_line(linewidth = 1.2) +
  geom_area(alpha = 0.12) +
  geom_vline(xintercept = c(bot15, top25, top10),
             linetype = "dashed", color = "#7F8C8D", linewidth = 0.7) +
  annotate("text", x = bot15, y = 0.034,
           label = paste0("PIP\n<", round(bot15, 1)),
           size = 2.8, color = "#E74C3C", fontface = "bold") +
  annotate("text", x = top25, y = 0.034,
           label = paste0("Bonus\n>", round(top25, 1)),
           size = 2.8, color = "#E67E22", fontface = "bold") +
  annotate("text", x = top10, y = 0.034,
           label = paste0("Executive\n>", round(top10, 1)),
           size = 2.8, color = "#27AE60", fontface = "bold") +
  scale_color_manual(values = c(palette["primary"], palette["accent"])) +
  scale_fill_manual(values  = c(palette["primary"], palette["accent"])) +
  labs(
    title    = "Employee Performance Distribution",
    subtitle = "Impact of Training Intervention on Tier Classification",
    x        = "Performance Score",
    y        = "Density",
    color    = NULL, fill = NULL
  ) +
  theme(legend.position = "bottom")

Figure 7: Employee Performance Distribution — Before and After Training Intervention

Show Code

hr_tiers <- function(mu, sd, n) {
  tibble(
    Tier          = c("🏆 Executive Bonus (Top 10%)",
                      "✅ Standard Bonus (Top 25%)",
                      "📋 Performance Improvement (Bottom 15%)"),
    Threshold     = c(paste0("> ", round(qnorm(0.90, mu, sd), 1)),
                      paste0("> ", round(qnorm(0.75, mu, sd), 1)),
                      paste0("< ", round(qnorm(0.15, mu, sd), 1))),
    Employees     = c(
      round(n * 0.10),
      round(n * 0.15),
      round(n * 0.15)
    )
  )
}

before_tiers <- hr_tiers(mu_hr,     sd_hr, n_emp) |> rename_with(~ paste0(., " (Before)"), -Tier)
after_tiers  <- hr_tiers(mu_hr_new, sd_hr, n_emp) |> rename_with(~ paste0(., " (After)"),  -Tier)

left_join(before_tiers, after_tiers, by = "Tier") |>
  kbl(align = c("l", "c", "c", "c", "c")) |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed", "responsive")
  ) |>
  column_spec(1, bold = TRUE) |>
  add_header_above(c(" " = 1, "Before Training" = 2, "After Training" = 2))

Table 5: Performance Tier Analysis — Before vs. After Training

	Before Training		After Training
Tier	Threshold (Before)	Employees (Before)	Threshold (After)	Employees (After)
🏆 Executive Bonus (Top 10%)	> 87.4	50	> 93.4	50
✅ Standard Bonus (Top 25%)	> 80.1	75	> 86.1	75
📋 Performance Improvement (Bottom 15%)	< 59.6	75	< 65.6	75

⚖️ Ethical Discussion Prompt

Is applying a normal distribution assumption to evaluate human performance inherently problematic?

Consider: forced ranking systems, self-fulfilling prophecies, and diversity implications.

7 🔧 Integrated Capstone: The Bakery Challenge

7.0.1 Scenario

You manage a 50-store bakery chain. Each store:

Production: μ = 150 loaves/day, σ = 20 loaves

Selling price: $8/loaf | Production cost: $3/loaf

Markdown (unsold): $1/loaf | Lost sale opportunity: $4/loaf

Show Code

mu_bak  <- 150
sd_bak  <- 20
price   <- 8
cost    <- 3
salvage <- 1
lost    <- 4

expected_profit <- function(q, mu, sd) {
  # Expected sales = E[min(D, Q)]
  exp_sales <- mu * pnorm(q, mu, sd) -
    sd * dnorm(q, mu, sd) * sd +
    q  * (1 - pnorm(q, mu, sd))
  
  # More precise: E[min(D,Q)]
  exp_min_dq <- q - (q - mu) * pnorm(q, mu, sd) -
    sd * dnorm(q, mu, sd)

  exp_leftover <- pmax(q - exp_min_dq, 0)
  exp_shortage  <- pmax(mu - exp_min_dq, 0)

  profit <- (price - cost) * exp_min_dq -
    (cost - salvage) * exp_leftover -
    lost * exp_shortage

  profit
}

quantities <- seq(80, 220, by = 1)
profits    <- sapply(quantities, expected_profit, mu = mu_bak, sd = sd_bak)

opt_q      <- quantities[which.max(profits)]
opt_profit <- max(profits)

df_bakery <- tibble(q = quantities, profit = profits)

ggplot(df_bakery, aes(q, profit)) +
  geom_line(color = palette["primary"], linewidth = 1.3) +
  geom_vline(xintercept = opt_q,
             linetype = "dashed", color = palette["secondary"], linewidth = 0.9) +
  geom_point(data = tibble(q = opt_q, profit = opt_profit),
             aes(q, profit), color = palette["secondary"], size = 4) +
  annotate("text", x = opt_q + 5, y = opt_profit * 0.98,
           label = paste0("Optimal: ", opt_q, " loaves\n$",
                          round(opt_profit, 2), "/store/day"),
           color = palette["secondary"], fontface = "bold", size = 3.5) +
  scale_y_continuous(labels = dollar_format()) +
  labs(
    title    = "Expected Daily Profit by Production Quantity",
    subtitle = paste0(
      "50-Store Chain | Daily Chain Profit at Optimum: $",
      format(round(opt_profit * 50, 0), big.mark = ",")
    ),
    x = "Loaves Produced per Store",
    y = "Expected Daily Profit per Store"
  )

Figure 8: Expected Daily Profit Optimization — Bakery Production Decision

🎯 Strategic Questions for Discussion

Should each store manager decide independently or should production be centralized?
How does your answer change if stores are in different neighborhoods (different demand profiles)?
What data would you need to refine this model?

Assessment Rubric

Show Code

tibble(
  Criteria = c(
    "Calculation Accuracy",
    "Business Interpretation",
    "Critical Thinking",
    "Communication"
  ),
  `Excellent (A)` = c(
    "All z-scores & probabilities correct",
    "Nuanced managerial recommendations with trade-offs",
    "Challenges assumptions, explores alternatives",
    "Clear, professional, publication-ready"
  ),
  `Proficient (B)` = c(
    "Minor computational errors",
    "Basic interpretation present",
    "Addresses main decision variables",
    "Adequate clarity and structure"
  ),
  `Developing (C)` = c(
    "Conceptual misunderstandings present",
    "Results not connected to decisions",
    "Surface-level analysis only",
    "Difficult to follow reasoning"
  )
) |>
  kbl(align = c("l", "l", "l", "l")) |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed", "responsive"),
    full_width = TRUE
  ) |>
  column_spec(1, bold = TRUE, color = "#2C3E50", width = "15%") |>
  column_spec(2, background = "#D5F5E3", width = "28%") |>
  column_spec(3, background = "#D6EAF8", width = "28%") |>
  column_spec(4, background = "#FDEBD0", width = "28%")

Table 6: Grading Rubric — Industry Application Exercises

Criteria	Excellent (A)	Proficient (B)	Developing (C)
Calculation Accuracy	All z-scores & probabilities correct	Minor computational errors	Conceptual misunderstandings present
Business Interpretation	Nuanced managerial recommendations with trade-offs	Basic interpretation present	Results not connected to decisions
Critical Thinking	Challenges assumptions, explores alternatives	Addresses main decision variables	Surface-level analysis only
Communication	Clear, professional, publication-ready	Adequate clarity and structure	Difficult to follow reasoning

Teaching Notes

📅 Recommended Sequencing

Week	Exercises	Focus
Week 1	Manufacturing (§1)	Concrete, visual, z-score fluency
Week 2	Healthcare + Supply Chain (§2–3)	Decision trade-offs emerge
Week 3	Finance + Marketing (§4–5)	Abstract probabilistic thinking
Week 4	HR + Capstone (§6–7)	Full synthesis and ethics

🎓 Core Pedagogical Goals

Students should recognize σ as a managerial lever, not just a statistical parameter
Every probability is a business risk statement
Z-scores connect directly to actionable threshold decisions
Always close with: “What would YOU decide, and what risk are you willing to accept?”

--- title: "Normal Distribution: Practical Industry Applications" subtitle: "Bridging Statistical Theory to Managerial Decision-Making" author: "Segun Iyanda" date: today date-format: "MMMM D, YYYY" format: html: theme: cosmo toc: true toc-depth: 3 toc-location: left toc-title: "📋 Table of Contents" number-sections: true number-depth: 3 code-fold: true code-tools: true code-summary: "Show Code" highlight-style: github fig-width: 8 fig-height: 5 fig-align: center smooth-scroll: true anchor-sections: true citations-hover: true footnotes-hover: true html-math-method: mathjax self-contained: true css: styles.css execute: echo: true warning: false message: false cache: true #bibliography: references.bib --- ```{r} #| label: setup #| include: false # Load required libraries library(tidyverse) library(knitr) library(kableExtra) library(ggplot2) library(patchwork) library(scales) # Set global theme theme_set( theme_minimal(base_size = 13) + theme( plot.title = element_text(face = "bold", color = "#2C3E50", size = 15), plot.subtitle = element_text(color = "#7F8C8D", size = 11), axis.title = element_text(color = "#2C3E50"), panel.grid.minor = element_blank(), plot.background = element_rect(fill = "white", color = NA) ) ) # Color palette palette <- c( primary = "#2980B9", secondary = "#E74C3C", accent = "#27AE60", warning = "#F39C12", light = "#ECF0F1" ) ``` ::: {.callout-note appearance="minimal"} > *"The goal is not to teach students to calculate probabilities — it is to teach them to make better decisions under uncertainty."* ::: *** # Course Overview {.unnumbered} ::: {.callout-tip icon=true} ## Learning Objectives By the end of these exercises, students will be able to: - **Apply** normal distribution concepts to real-world business scenarios - **Interpret** probabilities as actionable managerial risk statements - **Recognize** that$\sigma$ is a managerial lever, not merely a statistical parameter - **Develop** data-driven recommendations under uncertainty - **Evaluate** trade-offs using distributional thinking ::: *** # Manufacturing & Quality Control {#sec-manufacturing} ::: {.callout-caution collapse="false" appearance="default"} ## 🏭 Industry Context Manufacturing quality control relies heavily on process distribution analysis. Understanding variation is the foundation of operational excellence. ::: ## The Defective Product Dilemma {#sec-defective} ### Scenario > A bottling plant fills cola bottles with a **mean of 500ml** and **standard deviation of 5ml**. > FDA regulations require bottles to contain **between 490ml and 515ml**. ### Tasks & Solutions ```{r} #| label: fig-bottling #| fig-cap: "Cola Bottle Fill Distribution with FDA Compliance Bounds" # Parameters mu_bottle <- 500 sd_bottle <- 5 lower <- 490 upper <- 515 # Probability calculation p_compliant <- pnorm(upper, mu_bottle, sd_bottle) - pnorm(lower, mu_bottle, sd_bottle) daily_production <- 50000 expected_failures <- daily_production * (1 - p_compliant) # Visualization x <- seq(mu_bottle - 4 * sd_bottle, mu_bottle + 4 * sd_bottle, length.out = 1000) df_bottle <- tibble(x = x, y = dnorm(x, mu_bottle, sd_bottle)) ggplot(df_bottle, aes(x, y)) + # Fail zones geom_area( data = filter(df_bottle, x < lower), aes(x, y), fill = palette["secondary"], alpha = 0.4 ) + geom_area( data = filter(df_bottle, x > upper), aes(x, y), fill = palette["warning"], alpha = 0.4 ) + # Pass zone geom_area( data = filter(df_bottle, x >= lower & x <= upper), aes(x, y), fill = palette["accent"], alpha = 0.4 ) + geom_line(color = palette["primary"], linewidth = 1.2) + geom_vline(xintercept = c(lower, upper), linetype = "dashed", color = palette["secondary"], linewidth = 0.9) + geom_vline(xintercept = mu_bottle, linetype = "solid", color = palette["primary"], linewidth = 1) + annotate("text", x = 495, y = 0.06, label = "✅ FDA\nCompliant", color = "#27AE60", fontface = "bold", size = 3.5) + annotate("text", x = 487, y = 0.025, label = "❌ Under\nFill", color = palette["secondary"], fontface = "bold", size = 3) + annotate("text", x = 518, y = 0.025, label = "⚠️ Over\nFill", color = palette["warning"], fontface = "bold", size = 3) + labs( title = "Bottle Fill Distribution — FDA Compliance Analysis", subtitle = paste0("μ = ", mu_bottle, "ml | σ = ", sd_bottle, "ml | Compliance Rate = ", round(p_compliant * 100, 2), "%"), x = "Fill Volume (ml)", y = "Density" ) ``` ```{r} #| label: tbl-bottling-results #| tbl-cap: "Bottling Plant Compliance Summary" tibble( Metric = c( "Compliance Probability", "Daily Expected Failures", "Annual Expected Failures (250 days)" ), Value = c( paste0(round(p_compliant * 100, 3), "%"), format(round(expected_failures), big.mark = ","), format(round(expected_failures * 250), big.mark = ",") ), `Business Impact` = c( "FDA regulatory threshold", "Units requiring rework or disposal", "Annual waste exposure" ) ) |> kbl(align = c("l", "c", "l")) |> kable_styling( bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = FALSE ) |> column_spec(1, bold = TRUE, color = "#2C3E50") |> column_spec(2, color = "#2980B9", bold = TRUE) ``` ::: {.callout-important} ## 💬 Managerial Discussion Prompt **Task 3:** The plant manager wants to reduce waste below **1%**. Should they: - **(A)** Adjust the process **mean** (recalibrate machinery)? - **(B)** Reduce process **variance** (invest in precision equipment)? Use your calculations to justify the recommendation. What are the **cost trade-offs** of each approach? ::: *** ## Six Sigma Process Capability {#sec-sixsigma} ### Scenario > A circuit board manufacturer measures component resistance: > **μ = 100 ohms**, **σ = 2 ohms**. Acceptable range: **94–106 ohms**. ```{r} #| label: fig-sixsigma #| fig-cap: "Circuit Board Resistance Distribution and Process Capability" mu_circuit <- 100 sd_circuit <- 2 lsl <- 94 # Lower Spec Limit usl <- 106 # Upper Spec Limit # Process Capability Index cp <- (usl - lsl) / (6 * sd_circuit) cpk <- min((usl - mu_circuit) / (3 * sd_circuit), (mu_circuit - lsl) / (3 * sd_circuit)) p_defect <- (1 - (pnorm(usl, mu_circuit, sd_circuit) - pnorm(lsl, mu_circuit, sd_circuit))) * 1e6 x2 <- seq(mu_circuit - 5 * sd_circuit, mu_circuit + 5 * sd_circuit, length.out = 1000) df_circuit <- tibble(x = x2, y = dnorm(x2, mu_circuit, sd_circuit)) ggplot(df_circuit, aes(x, y)) + geom_area( data = filter(df_circuit, x >= lsl & x <= usl), aes(x, y), fill = palette["accent"], alpha = 0.35 ) + geom_area( data = filter(df_circuit, x < lsl | x > usl), aes(x, y), fill = palette["secondary"], alpha = 0.5 ) + geom_line(color = palette["primary"], linewidth = 1.2) + geom_vline(xintercept = c(lsl, usl), linetype = "dashed", color = palette["secondary"], linewidth = 0.9) + annotate("text", x = 100, y = 0.15, label = paste0("Cp = ", round(cp, 3), "\nCpk = ", round(cpk, 3)), size = 4, fontface = "bold", color = "#2C3E50") + labs( title = "Six Sigma Process Capability — Resistance Distribution", subtitle = paste0("Defect Rate: ", round(p_defect, 1), " DPMO"), x = "Resistance (Ohms)", y = "Density" ) ``` ::: {.callout-tip} ## 📐 Key Formula Reference $C_p = \frac{USL - LSL}{6\sigma} \qquad C_{pk} = \min\left(\frac{USL - \mu}{3\sigma},\ \frac{\mu - LSL}{3\sigma}\right)$ A **Six Sigma process** achieves$C_{pk} \geq 2.0$, yielding only **3.4 DPMO** (defects per million opportunities). ::: *** # Healthcare & Operations Management {#sec-healthcare} ::: {.callout-caution collapse="false"} ## 🏥 Industry Context In healthcare, distributional decisions have direct human consequences. Operations managers must balance cost, staffing, and patient outcomes simultaneously. ::: ## Emergency Room Staffing Model {#sec-er} ### Scenario > Patient wait times: **μ = 35 minutes**, **σ = 8 minutes**. > Hospital policy: **95% of patients** seen within a target threshold. ```{r} #| label: fig-er-staffing #| fig-cap: "Emergency Room Wait Time Analysis — Staffing Decision Scenarios" mu_er <- 35 sd_er <- 8 sd_new <- 5 # After hiring additional nurse threshold_95 <- qnorm(0.95, mu_er, sd_er) threshold_30 <- qnorm(0.30, mu_er, sd_er) # Comparative visualization x_er <- seq(0, 70, length.out = 1000) df_er <- tibble( x = rep(x_er, 2), y = c(dnorm(x_er, mu_er, sd_er), dnorm(x_er, mu_er, sd_new)), Staffing = rep(c("Current (σ = 8 min)", "With New Nurse (σ = 5 min)"), each = length(x_er)) ) p_over45_current <- 1 - pnorm(45, mu_er, sd_er) p_over45_new <- 1 - pnorm(45, mu_er, sd_new) ggplot(df_er, aes(x, y, color = Staffing, fill = Staffing)) + geom_line(linewidth = 1.2) + geom_area(alpha = 0.15) + geom_vline(xintercept = threshold_95, linetype = "dashed", color = "#2C3E50", linewidth = 0.8) + geom_vline(xintercept = 45, linetype = "dotted", color = palette["secondary"], linewidth = 0.9) + annotate("text", x = threshold_95 + 1, y = 0.045, label = paste0("95th pct\n", round(threshold_95, 1), " min"), size = 3, color = "#2C3E50") + annotate("text", x = 47, y = 0.04, label = "45 min\nthreshold", size = 3, color = palette["secondary"]) + scale_color_manual(values = c(palette["primary"], palette["accent"])) + scale_fill_manual(values = c(palette["primary"], palette["accent"])) + labs( title = "ER Wait Time Distributions — Staffing Scenarios", subtitle = paste0( "P(wait > 45 min) | Current: ", round(p_over45_current * 100, 1), "% → With New Nurse: ", round(p_over45_new * 100, 1), "%" ), x = "Wait Time (minutes)", y = "Density", color = "Staffing Scenario", fill = "Staffing Scenario" ) + theme(legend.position = "bottom") ``` ```{r} #| label: tbl-er-decision #| tbl-cap: "Hospital Staffing Investment Analysis" tibble( Metric = c( "95th Percentile Threshold", "Fast-Track Cutoff (30th pct)", "P(Wait > 45 min) — Current", "P(Wait > 45 min) — With Nurse", "Improvement", "Annual Nurse Cost" ), Result = c( paste0(round(threshold_95, 1), " minutes"), paste0(round(threshold_30, 1), " minutes"), paste0(round(p_over45_current * 100, 2), "%"), paste0(round(p_over45_new * 100, 2), "%"), paste0(round((p_over45_current - p_over45_new) * 100, 2), " percentage points"), "$80,000" ) ) |> kbl(col.names = c("Decision Metric", "Calculated Result")) |> kable_styling( bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE ) |> column_spec(1, bold = TRUE, color = "#2C3E50") |> column_spec(2, color = palette["primary"], bold = TRUE) |> row_spec(6, background = "#FDEBD0", bold = TRUE) ``` *** # Supply Chain & Inventory Management {#sec-supplychain} ## The Reorder Point Problem {#sec-reorder} ### Scenario > Daily demand: **μ = 200 units**, **σ = 25 units**. > Lead time = **5 days** | Stockout cost = **$50/unit** | Holding cost = **$2/unit/day** ```{r} #| label: fig-reorder #| fig-cap: "Safety Stock vs. Service Level Trade-off Analysis" mu_daily <- 200 sd_daily <- 25 lead_time <- 5 # Lead time demand distribution mu_lt <- mu_daily * lead_time sd_lt <- sd_daily * sqrt(lead_time) service_levels <- c(0.90, 0.95, 0.99) reorder_df <- tibble( service_level = service_levels, z_score = qnorm(service_levels), reorder_point = mu_lt + qnorm(service_levels) * sd_lt, safety_stock = qnorm(service_levels) * sd_lt, holding_cost = qnorm(service_levels) * sd_lt * 2 * 365, label = paste0(service_levels * 100, "%") ) # Visualization p1 <- ggplot(reorder_df, aes(x = label, y = safety_stock, fill = label)) + geom_col(width = 0.5, show.legend = FALSE) + geom_text(aes(label = round(safety_stock, 0)), vjust = -0.5, fontface = "bold", size = 4) + scale_fill_manual(values = c("#3498DB", "#E67E22", "#E74C3C")) + labs(title = "Safety Stock by Service Level", x = "Service Level", y = "Safety Stock (Units)") + theme(plot.title = element_text(size = 12)) p2 <- ggplot(reorder_df, aes(x = label, y = holding_cost / 1000, fill = label)) + geom_col(width = 0.5, show.legend = FALSE) + geom_text(aes(label = paste0("$", round(holding_cost / 1000, 1), "K")), vjust = -0.5, fontface = "bold", size = 4) + scale_fill_manual(values = c("#3498DB", "#E67E22", "#E74C3C")) + labs(title = "Annual Holding Cost by Service Level", x = "Service Level", y = "Annual Holding Cost ($000s)") + theme(plot.title = element_text(size = 12)) p1 + p2 + plot_annotation( title = "Inventory Decision Trade-off: Service Level vs. Cost", subtitle = "Higher service levels protect against stockouts but increase holding costs", theme = theme(plot.title = element_text(face = "bold", size = 14)) ) ``` ```{r} #| label: tbl-inventory #| tbl-cap: "Reorder Point Decision Matrix" reorder_df |> mutate( across(c(reorder_point, safety_stock), ~ round(., 0)), holding_cost = dollar(round(holding_cost, 0)), z_score = round(z_score, 3), service_level = paste0(service_level * 100, "%") ) |> select( `Service Level` = service_level, `Z-Score` = z_score, `Reorder Point (units)` = reorder_point, `Safety Stock (units)` = safety_stock, `Annual Holding Cost` = holding_cost ) |> kbl(align = c("c", "c", "c", "c", "c")) |> kable_styling( bootstrap_options = c("striped", "hover", "condensed", "responsive") ) |> column_spec(1, bold = TRUE) |> row_spec(2, background = "#D5F5E3") |> add_header_above(c( " " = 2, "Inventory Metrics" = 2, "Cost Impact" = 1 )) ``` ::: {.callout-important} ## 💬 Managerial Discussion Prompt Which service level would you choose? **Does your answer change** if this analysis is for: - 🛒 Retail consumer goods? - 💊 Hospital medications? - ✈️ Aerospace components? ::: *** # Finance & Risk Management {#sec-finance} ## Portfolio Risk Assessment {#sec-portfolio} ### Scenario > Annual stock returns: **μ = 12%**, **σ = 8%** ```{r} #| label: fig-portfolio #| fig-cap: "Stock Return Distribution — Risk and Value at Risk Analysis" mu_stock <- 0.12 sd_stock <- 0.08 var_5 <- qnorm(0.05, mu_stock, sd_stock) p_loss <- pnorm(0, mu_stock, sd_stock) p_loss10 <- pnorm(-0.10, mu_stock, sd_stock) x_stock <- seq(mu_stock - 4 * sd_stock, mu_stock + 4 * sd_stock, length.out = 1000) df_stock <- tibble(x = x_stock, y = dnorm(x_stock, mu_stock, sd_stock)) ggplot(df_stock, aes(x, y)) + # Catastrophic loss region geom_area( data = filter(df_stock, x < -0.10), aes(x, y), fill = "#922B21", alpha = 0.7 ) + # Loss region geom_area( data = filter(df_stock, x >= -0.10 & x < 0), aes(x, y), fill = palette["secondary"], alpha = 0.4 ) + # VaR region geom_area( data = filter(df_stock, x >= 0 & x < -var_5), aes(x, y), fill = palette["warning"], alpha = 0.3 ) + # Gain region geom_area( data = filter(df_stock, x >= -var_5), aes(x, y), fill = palette["accent"], alpha = 0.25 ) + geom_line(color = palette["primary"], linewidth = 1.2) + geom_vline(xintercept = var_5, linetype = "dashed", color = palette["warning"], linewidth = 0.9) + geom_vline(xintercept = 0, linetype = "dashed", color = palette["secondary"], linewidth = 0.9) + annotate("text", x = var_5 - 0.003, y = 3.5, label = paste0("VaR (5%)\n", round(var_5 * 100, 1), "%"), size = 3.2, hjust = 1, color = "#E67E22", fontface = "bold") + scale_x_continuous(labels = percent_format()) + labs( title = "Portfolio Return Distribution — Risk Analysis", subtitle = paste0( "P(Loss) = ", round(p_loss * 100, 1), "% | P(Loss > 10%) = ", round(p_loss10 * 100, 1), "%", " | VaR(5%) = ", round(var_5 * 100, 1), "%" ), x = "Annual Return", y = "Density" ) ``` *** # Marketing & Consumer Behavior {#sec-marketing} ## Pricing Strategy Optimization {#sec-pricing} ### Scenario > Customer willingness-to-pay: **μ = $85/month**, **σ = $18** ```{r} #| label: fig-pricing #| fig-cap: "Revenue Optimization Across Price Points" mu_wtp <- 85 sd_wtp <- 18 prices <- seq(50, 130, by = 5) pricing_df <- tibble( Price = prices, pct_buy = 1 - pnorm(prices, mu_wtp, sd_wtp), revenue_index = prices * (1 - pnorm(prices, mu_wtp, sd_wtp)) ) optimal_price <- pricing_df$Price[which.max(pricing_df$revenue_index)] p1 <- ggplot(pricing_df, aes(Price, pct_buy)) + geom_line(color = palette["primary"], linewidth = 1.2) + geom_point(color = palette["primary"], size = 2) + scale_y_continuous(labels = percent_format()) + scale_x_continuous(labels = dollar_format()) + labs(title = "Purchase Probability by Price", x = "Price ($/month)", y = "% Customers Who Buy") p2 <- ggplot(pricing_df, aes(Price, revenue_index)) + geom_line(color = palette["accent"], linewidth = 1.2) + geom_point(color = palette["accent"], size = 2) + geom_vline(xintercept = optimal_price, linetype = "dashed", color = palette["secondary"], linewidth = 0.9) + annotate("text", x = optimal_price + 2, y = max(pricing_df$revenue_index) * 0.9, label = paste0("Optimal\n$", optimal_price), color = palette["secondary"], fontface = "bold", size = 3.5) + scale_x_continuous(labels = dollar_format()) + labs(title = "Revenue Index by Price Point", x = "Price ($/month)", y = "Revenue Index") p1 + p2 + plot_annotation( title = "Willingness-to-Pay Analysis — Pricing Strategy", theme = theme(plot.title = element_text(face = "bold", size = 14)) ) ``` ```{r} #| label: tbl-pricing #| tbl-cap: "Price Point Analysis Table" library(dplyr) library(kableExtra) library(scales) # 1. Prepare and identify the optimal row index first # Using near() prevents "it didn't work" due to floating point precision issues pricing_summary <- pricing_df |> filter(Price %in% c(60, 75, 85, 99, 110, optimal_price)) |> distinct() |> arrange(Price) opt_row_idx <- which(near(pricing_summary$Price, optimal_price)) # 2. Build the table pricing_summary |> mutate( pct_buy = percent(pct_buy, accuracy = 0.1), revenue_index = round(revenue_index, 2), Optimal = if_else(near(Price, optimal_price), "⭐ Optimal", "") ) |> kbl( col.names = c("Price", "% Who Buy", "Revenue Index", ""), align = "cccc", caption = "Price Point Analysis" ) |> kable_styling( bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE ) |> row_spec( opt_row_idx, background = "#D5F5E3", bold = TRUE ) ``` *** # Human Resources Management {#sec-hr} ## Employee Performance & Compensation Tiers {#sec-performance} ### Scenario > Performance scores: **μ = 72**, **σ = 12** | Company size: **500 employees** ```{r} #| label: fig-hr #| fig-cap: "Employee Performance Distribution — Before and After Training Intervention" mu_hr <- 72 sd_hr <- 12 mu_hr_new <- 78 n_emp <- 500 # Thresholds top10 <- qnorm(0.90, mu_hr, sd_hr) top25 <- qnorm(0.75, mu_hr, sd_hr) bot15 <- qnorm(0.15, mu_hr, sd_hr) x_hr <- seq(30, 115, length.out = 1000) df_hr <- tibble( x = rep(x_hr, 2), y = c(dnorm(x_hr, mu_hr, sd_hr), dnorm(x_hr, mu_hr_new, sd_hr)), Program = rep(c("Before Training (μ=72)", "After Training (μ=78)"), each = length(x_hr)) ) ggplot(df_hr, aes(x, y, color = Program, fill = Program)) + geom_line(linewidth = 1.2) + geom_area(alpha = 0.12) + geom_vline(xintercept = c(bot15, top25, top10), linetype = "dashed", color = "#7F8C8D", linewidth = 0.7) + annotate("text", x = bot15, y = 0.034, label = paste0("PIP\n<", round(bot15, 1)), size = 2.8, color = "#E74C3C", fontface = "bold") + annotate("text", x = top25, y = 0.034, label = paste0("Bonus\n>", round(top25, 1)), size = 2.8, color = "#E67E22", fontface = "bold") + annotate("text", x = top10, y = 0.034, label = paste0("Executive\n>", round(top10, 1)), size = 2.8, color = "#27AE60", fontface = "bold") + scale_color_manual(values = c(palette["primary"], palette["accent"])) + scale_fill_manual(values = c(palette["primary"], palette["accent"])) + labs( title = "Employee Performance Distribution", subtitle = "Impact of Training Intervention on Tier Classification", x = "Performance Score", y = "Density", color = NULL, fill = NULL ) + theme(legend.position = "bottom") ``` ```{r} #| label: tbl-hr-tiers #| tbl-cap: "Performance Tier Analysis — Before vs. After Training" hr_tiers <- function(mu, sd, n) { tibble( Tier = c("🏆 Executive Bonus (Top 10%)", "✅ Standard Bonus (Top 25%)", "📋 Performance Improvement (Bottom 15%)"), Threshold = c(paste0("> ", round(qnorm(0.90, mu, sd), 1)), paste0("> ", round(qnorm(0.75, mu, sd), 1)), paste0("< ", round(qnorm(0.15, mu, sd), 1))), Employees = c( round(n * 0.10), round(n * 0.15), round(n * 0.15) ) ) } before_tiers <- hr_tiers(mu_hr, sd_hr, n_emp) |> rename_with(~ paste0(., " (Before)"), -Tier) after_tiers <- hr_tiers(mu_hr_new, sd_hr, n_emp) |> rename_with(~ paste0(., " (After)"), -Tier) left_join(before_tiers, after_tiers, by = "Tier") |> kbl(align = c("l", "c", "c", "c", "c")) |> kable_styling( bootstrap_options = c("striped", "hover", "condensed", "responsive") ) |> column_spec(1, bold = TRUE) |> add_header_above(c(" " = 1, "Before Training" = 2, "After Training" = 2)) ``` ::: {.callout-warning} ## ⚖️ Ethical Discussion Prompt Is applying a **normal distribution assumption** to evaluate human performance inherently problematic? Consider: forced ranking systems, self-fulfilling prophecies, and diversity implications. ::: *** # 🔧 Integrated Capstone: The Bakery Challenge {#sec-capstone} ### Scenario > You manage a **50-store bakery chain**. Each store: > > - Production: **μ = 150 loaves/day**, **σ = 20 loaves** > - Selling price: **$8/loaf** | Production cost: **$3/loaf** > - Markdown (unsold): **$1/loaf** | Lost sale opportunity: **$4/loaf** ```{r} #| label: fig-bakery #| fig-cap: "Expected Daily Profit Optimization — Bakery Production Decision" mu_bak <- 150 sd_bak <- 20 price <- 8 cost <- 3 salvage <- 1 lost <- 4 expected_profit <- function(q, mu, sd) { # Expected sales = E[min(D, Q)] exp_sales <- mu * pnorm(q, mu, sd) - sd * dnorm(q, mu, sd) * sd + q * (1 - pnorm(q, mu, sd)) # More precise: E[min(D,Q)] exp_min_dq <- q - (q - mu) * pnorm(q, mu, sd) - sd * dnorm(q, mu, sd) exp_leftover <- pmax(q - exp_min_dq, 0) exp_shortage <- pmax(mu - exp_min_dq, 0) profit <- (price - cost) * exp_min_dq - (cost - salvage) * exp_leftover - lost * exp_shortage profit } quantities <- seq(80, 220, by = 1) profits <- sapply(quantities, expected_profit, mu = mu_bak, sd = sd_bak) opt_q <- quantities[which.max(profits)] opt_profit <- max(profits) df_bakery <- tibble(q = quantities, profit = profits) ggplot(df_bakery, aes(q, profit)) + geom_line(color = palette["primary"], linewidth = 1.3) + geom_vline(xintercept = opt_q, linetype = "dashed", color = palette["secondary"], linewidth = 0.9) + geom_point(data = tibble(q = opt_q, profit = opt_profit), aes(q, profit), color = palette["secondary"], size = 4) + annotate("text", x = opt_q + 5, y = opt_profit * 0.98, label = paste0("Optimal: ", opt_q, " loaves\n$", round(opt_profit, 2), "/store/day"), color = palette["secondary"], fontface = "bold", size = 3.5) + scale_y_continuous(labels = dollar_format()) + labs( title = "Expected Daily Profit by Production Quantity", subtitle = paste0( "50-Store Chain | Daily Chain Profit at Optimum: $", format(round(opt_profit * 50, 0), big.mark = ",") ), x = "Loaves Produced per Store", y = "Expected Daily Profit per Store" ) ``` ::: {.callout-tip} ## 🎯 Strategic Questions for Discussion 1. Should each store manager **decide independently** or should production be **centralized**? 2. How does your answer change if stores are in **different neighborhoods** (different demand profiles)? 3. What data would you need to **refine this model**? ::: *** # Assessment Rubric {#sec-rubric .unnumbered} ```{r} #| label: tbl-rubric #| tbl-cap: "Grading Rubric — Industry Application Exercises" tibble( Criteria = c( "Calculation Accuracy", "Business Interpretation", "Critical Thinking", "Communication" ), `Excellent (A)` = c( "All z-scores & probabilities correct", "Nuanced managerial recommendations with trade-offs", "Challenges assumptions, explores alternatives", "Clear, professional, publication-ready" ), `Proficient (B)` = c( "Minor computational errors", "Basic interpretation present", "Addresses main decision variables", "Adequate clarity and structure" ), `Developing (C)` = c( "Conceptual misunderstandings present", "Results not connected to decisions", "Surface-level analysis only", "Difficult to follow reasoning" ) ) |> kbl(align = c("l", "l", "l", "l")) |> kable_styling( bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = TRUE ) |> column_spec(1, bold = TRUE, color = "#2C3E50", width = "15%") |> column_spec(2, background = "#D5F5E3", width = "28%") |> column_spec(3, background = "#D6EAF8", width = "28%") |> column_spec(4, background = "#FDEBD0", width = "28%") ``` *** # Teaching Notes {#sec-teaching .unnumbered} ::: {.callout-note appearance="minimal"} ## 📅 Recommended Sequencing | Week | Exercises | Focus | | --- | --- | --- | | **Week 1** | Manufacturing (§1) | Concrete, visual, z-score fluency | | **Week 2** | Healthcare + Supply Chain (§2–3) | Decision trade-offs emerge | | **Week 3** | Finance + Marketing (§4–5) | Abstract probabilistic thinking | | **Week 4** | HR + Capstone (§6–7) | Full synthesis and ethics | ::: ::: {.callout-tip appearance="default"} ## 🎓 Core Pedagogical Goals - Students should recognize **σ as a managerial lever**, not just a statistical parameter - Every probability is a **business risk statement** - Z-scores connect directly to **actionable threshold decisions** - Always close with: *"What would YOU decide, and what risk are you willing to accept?"* ::: ***