Multivariate Statistical Process Control (MSPC)

Hotelling T², MEWMA, and Multivariate Process Monitoring

Author

David Guy PhD: Quality Control Department

Published

December 20, 2025

Introduction to Multivariate SPC

Note: This analysis uses the qcr package as the quality control framework. The multivariate methods (Hotelling T² and MEWMA) are implemented using standard statistical formulas to ensure compatibility across different package versions and to provide educational transparency. All calculations follow established statistical theory and produce results equivalent to specialized multivariate SPC packages.

What is Multivariate SPC?

Multivariate Statistical Process Control (MSPC) monitors multiple related quality characteristics simultaneously, rather than tracking each variable independently.

Why Use Multivariate SPC?

Traditional (Univariate) Approach: - Monitor each variable with separate control charts - Example: Track temperature, pressure, flow rate separately - Problem: Misses interactions and correlations between variables

Multivariate Approach: - Monitor all variables together as a system - Detects patterns that univariate charts miss - Accounts for correlations between variables - More powerful for detecting process changes

Real-World Example

Chemical Reactor Process: - Temperature: 150°C +/- 5°C - Pressure: 200 psi +/- 10 psi
- pH: 7.0 +/- 0.5

Scenario 1 (Both in control individually): - Temperature: 152°C [OK] - within limits - Pressure: 195 psi [OK] - within limits - But: This combination is unusual and indicates a problem!

Multivariate SPC detects this because it knows the normal relationship between temperature and pressure.

Key Multivariate Methods

1. Hotelling T² Chart

  • Multivariate extension of X-bar chart
  • Monitors mean vector of multiple variables
  • One statistic for all variables combined
  • Most common multivariate control chart

2. MEWMA (Multivariate EWMA)

  • Multivariate exponentially weighted moving average
  • More sensitive to small sustained shifts
  • Better for detecting gradual process drift
  • Incorporates historical information

3. MCUSUM (Multivariate CUSUM)

  • Multivariate cumulative sum
  • Excellent for detecting small persistent shifts
  • Directional information for diagnosis

4. Principal Component Analysis (PCA)

  • Reduces dimensionality
  • Transforms correlated variables to uncorrelated components
  • T² and Q statistics for monitoring
  • Useful for high-dimensional data (10+ variables)

Setup and Data Import

Code
# Install packages if needed
packages <- c("qcr", "plotly", "gt", "dplyr", "tidyr", "ggplot2", 
              "readr", "lubridate", "DT", "knitr", "MASS", "psych")

# Load libraries
library(qcr)       # Unified quality control (replaces qcc + MSQC)
library(plotly)    # Interactive plots
library(gt)        # Tables
library(dplyr)     # Data manipulation
library(tidyr)     # Data tidying
library(ggplot2)   # Graphics
library(readr)     # Data import
library(lubridate) # Dates
library(DT)        # Interactive tables
library(knitr)     # Reports
library(MASS)      # Multivariate normal
library(psych)     # Correlation matrices

theme_set(theme_minimal())

cat("Package versions:\n")
Package versions:
Code
cat("qcr version:", as.character(packageVersion("qcr")), "\n\n")
qcr version: 1.4 

Import Multivariate Data

Code
# Import multivariate quality data
if (file.exists("multivariate_data.csv")) {
  mv_data <- read_csv("multivariate_data.csv", show_col_types = FALSE)
  cat("[OK] Multivariate data loaded from multivariate_data.csv\n\n")
} else {
  cat("[INFO] No multivariate_data.csv found. Generating example data.\n")
  cat("For your own data, create multivariate_data.csv with multiple quality variables.\n\n")
  
  # Generate realistic multivariate process data
  set.seed(42)
  n_samples <- 100
  
  # Phase I: In-control data (samples 1-80)
  # Create correlated variables (temperature, pressure, flow_rate)
  
  # Correlation structure (realistic for a process)
  correlation_matrix <- matrix(c(
    1.0,  0.7,  0.5,   # Temperature correlations
    0.7,  1.0,  0.6,   # Pressure correlations
    0.5,  0.6,  1.0    # Flow rate correlations
  ), nrow = 3, byrow = TRUE)
  
  # Mean vector
  mu_in_control <- c(150, 200, 50)  # Temp, Pressure, Flow
  
  # Standard deviations
  sigma_in_control <- c(2, 5, 1)
  
  # Create covariance matrix
  D <- diag(sigma_in_control)
  Sigma_in_control <- D %*% correlation_matrix %*% D
  
  # Generate in-control data (Phase I)
  phase1_data <- mvrnorm(n = 80, mu = mu_in_control, Sigma = Sigma_in_control)
  
  # Phase II: Out-of-control data (samples 81-100)
  # Introduce a mean shift
  mu_out_control <- c(152, 205, 51)  # Small shifts in all variables
  phase2_data <- mvrnorm(n = 20, mu = mu_out_control, Sigma = Sigma_in_control)
  
  # Combine data
  all_data <- rbind(phase1_data, phase2_data)
  
  # Create dataframe
  mv_data <- tibble(
    sample = 1:n_samples,
    date = seq.Date(from = Sys.Date() - n_samples, to = Sys.Date() - 1, by = "day"),
    temperature = all_data[, 1],
    pressure = all_data[, 2],
    flow_rate = all_data[, 3],
    shift = sample(c("Day", "Night"), n_samples, replace = TRUE),
    operator = sample(paste0("OP", 1:3), n_samples, replace = TRUE)
  )
}
[OK] Multivariate data loaded from multivariate_data.csv
Code
# Display data summary
cat("Multivariate Process Data:\n")
Multivariate Process Data:
Code
cat("Samples:", nrow(mv_data), "\n")
Samples: 100 
Code
cat("Variables:", sum(sapply(mv_data, is.numeric)), "numeric variables\n\n")
Variables: 4 numeric variables
Code
# Show first few observations
mv_data %>%
  head(10) %>%
  gt() %>%
  tab_header(
    title = "Multivariate Quality Data - Sample Preview",
    subtitle = "First 10 observations"
  ) %>%
  fmt_number(
    columns = where(is.numeric),
    decimals = 2
  ) %>%
  tab_style(
    style = cell_fill(color = "lightblue"),
    locations = cells_column_labels()
  )
Multivariate Quality Data - Sample Preview
First 10 observations
sample date temperature pressure flow_rate shift operator
1.00 2024-09-11 149.82 199.45 49.89 Day OP1
2.00 2024-09-12 150.24 201.32 50.15 Night OP2
3.00 2024-09-13 149.95 198.76 49.95 Day OP3
4.00 2024-09-14 150.18 200.88 50.08 Day OP1
5.00 2024-09-15 149.73 199.12 49.82 Night OP2
6.00 2024-09-16 150.42 202.15 50.28 Day OP1
7.00 2024-09-17 149.88 199.67 49.92 Day OP3
8.00 2024-09-18 150.15 200.45 50.12 Night OP2
9.00 2024-09-19 149.92 199.88 49.98 Day OP1
10.00 2024-09-20 150.05 200.12 50.02 Day OP3

Data Summary and Correlation Analysis

Code
# Select quality variables
quality_vars <- c("temperature", "pressure", "flow_rate")
X <- as.matrix(mv_data[, quality_vars])

# Summary statistics
summary_stats <- mv_data %>%
  select(all_of(quality_vars)) %>%
  summarise(across(everything(),
                   list(Mean = ~mean(., na.rm = TRUE),
                        SD = ~sd(., na.rm = TRUE),
                        Min = ~min(., na.rm = TRUE),
                        Max = ~max(., na.rm = TRUE)),
                   .names = "{.col}__{.fn}")) %>%
  pivot_longer(everything(),
               names_to = c("Variable", "Statistic"),
               names_sep = "__") %>%
  pivot_wider(names_from = Statistic, values_from = value)

summary_stats %>%
  gt() %>%
  tab_header(
    title = "Summary Statistics",
    subtitle = "Process variables descriptive statistics"
  ) %>%
  fmt_number(
    columns = c(Mean, SD, Min, Max),
    decimals = 3
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightyellow"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(columns = Variable)
  )
Summary Statistics
Process variables descriptive statistics
Variable Mean SD Min Max
temperature 150.444 0.857 149.670 152.380
pressure 201.321 2.219 198.760 206.220
flow_rate 50.250 0.485 49.750 51.350
Code
# Correlation matrix
cor_matrix <- cor(X)

cat("\nCorrelation Matrix:\n")

Correlation Matrix:
Code
cor_df <- as.data.frame(cor_matrix)
cor_df$Variable <- rownames(cor_matrix)
cor_df <- cor_df[, c("Variable", quality_vars)]

cor_df %>%
  gt() %>%
  tab_header(
    title = "Correlation Matrix",
    subtitle = "Relationships between process variables"
  ) %>%
  fmt_number(
    columns = all_of(quality_vars),
    decimals = 3
  )
Correlation Matrix
Relationships between process variables
Variable temperature pressure flow_rate
temperature 1.000 0.980 0.998
pressure 0.980 1.000 0.989
flow_rate 0.998 0.989 1.000
Code
# Note: Strong correlations (>0.7) are important for multivariate SPC
cat("\nStrong correlations (>0.7) found:\n")

Strong correlations (>0.7) found:
Code
for (i in 1:(length(quality_vars)-1)) {
  for (j in (i+1):length(quality_vars)) {
    if (abs(cor_matrix[i,j]) > 0.7) {
      cat(sprintf("%s <-> %s: %.3f\n", 
                  quality_vars[i], quality_vars[j], cor_matrix[i,j]))
    }
  }
}
temperature <-> pressure: 0.980
temperature <-> flow_rate: 0.998
pressure <-> flow_rate: 0.989
Code
cat("\n")
Code
# Correlation heatmap
cor_long <- cor_matrix %>%
  as.data.frame() %>%
  mutate(Var1 = rownames(.)) %>%
  pivot_longer(cols = -Var1, names_to = "Var2", values_to = "Correlation")

cor_plot <- plot_ly(
  data = cor_long,
  x = ~Var1,
  y = ~Var2,
  z = ~Correlation,
  type = "heatmap",
  colorscale = list(
    c(0, "blue"),
    c(0.5, "white"),
    c(1, "red")
  ),
  zmin = -1,
  zmax = 1,
  text = ~round(Correlation, 3),
  hovertemplate = "<b>%{x} vs %{y}</b><br>Correlation: %{z:.3f}<extra></extra>"
) %>%
  layout(
    title = list(
      text = "Correlation Heatmap<br><sub>Strong correlations indicate variables move together</sub>",
      font = list(size = 16)
    ),
    xaxis = list(title = ""),
    yaxis = list(title = "")
  )

cor_plot

1. Hotelling T² Control Chart

Theory

The Hotelling T² statistic is a multivariate generalization of the t-statistic:

Formula: \[T^2 = n(\bar{x} - \mu_0)' S^{-1} (\bar{x} - \mu_0)\]

Where: - \(\bar{x}\) = sample mean vector - \(\mu_0\) = target mean vector (from Phase I) - \(S\) = sample covariance matrix - \(n\) = sample size

Control Limit: For Phase II monitoring: \[UCL = \frac{p(m+1)(m-1)}{m^2-mp} F_{\alpha}(p, m-p)\]

Where: - \(p\) = number of variables - \(m\) = number of Phase I samples - \(F_{\alpha}\) = F-distribution critical value

Interpretation: - T² combines all variables into single statistic - Large T² indicates multivariate out-of-control - Accounts for correlations between variables

Code
# Phase I data (first 80 samples for establishing control limits)
phase1_indices <- 1:80
X_phase1 <- X[phase1_indices, ]

# Calculate Phase I statistics
mu_hat <- colMeans(X_phase1)
Sigma_hat <- cov(X_phase1)

cat("\nPhase I Statistics:\n")

Phase I Statistics:
Code
cat("Mean vector:\n")
Mean vector:
Code
print(round(mu_hat, 3))
temperature    pressure   flow_rate 
    150.029     200.309      50.019 
Code
cat("\nCovariance matrix:\n")

Covariance matrix:
Code
print(round(Sigma_hat, 3))
            temperature pressure flow_rate
temperature       0.040    0.190     0.029
pressure          0.190    0.942     0.139
flow_rate         0.029    0.139     0.021
Code
cat("\n")
Code
# Phase II data (monitoring new samples)
X_phase2 <- X

# Calculate T² statistics using standard formula
m <- nrow(X_phase1)  # Number of Phase I samples
p <- ncol(X)          # Number of variables
n <- 1                # Individual observations

# Calculate T² for each observation
T2_values <- numeric(nrow(X_phase2))
for (i in 1:nrow(X_phase2)) {
  x_i <- X_phase2[i, ]
  # Hotelling T² formula: T² = n(x - μ)'Σ⁻¹(x - μ)
  T2_values[i] <- n * t(x_i - mu_hat) %*% solve(Sigma_hat) %*% (x_i - mu_hat)
}

# Calculate Phase II control limit
# UCL = [(p(m+1)(m-1)) / (m²-mp)] * F(α, p, m-p)
alpha <- 0.0027  # For 3-sigma equivalent (0.27%)
F_critical <- qf(1 - alpha, p, m - p)
UCL_T2 <- (p * (m + 1) * (m - 1)) / (m^2 - m * p) * F_critical

cat("Hotelling T² Chart Parameters:\n")
Hotelling T² Chart Parameters:
Code
cat("Number of variables (p):", p, "\n")
Number of variables (p): 3 
Code
cat("Phase I samples (m):", m, "\n")
Phase I samples (m): 80 
Code
cat("Significance level (α):", alpha, "\n")
Significance level (α): 0.0027 
Code
cat("F critical value:", round(F_critical, 3), "\n")
F critical value: 5.145 
Code
cat("Control limit (UCL):", round(UCL_T2, 3), "\n")
Control limit (UCL): 16.032 
Code
cat("Number of out-of-control points:", sum(T2_values > UCL_T2), "\n\n")
Number of out-of-control points: 22 
Code
# Create results table
t2_summary <- tibble(
  Metric = c("Number of Variables", "Phase I Samples", "UCL", 
             "Mean T²", "Max T²", "Out-of-Control Points"),
  Value = c(p, m, UCL_T2, mean(T2_values), max(T2_values), 
            sum(T2_values > UCL_T2))
)

t2_summary %>%
  gt() %>%
  tab_header(
    title = "Hotelling T² Chart Summary",
    subtitle = "Multivariate control chart statistics"
  ) %>%
  fmt_number(
    columns = Value,
    rows = 3:5,
    decimals = 3
  ) %>%
  fmt_number(
    columns = Value,
    rows = c(1, 2, 6),
    decimals = 0
  ) %>%
  tab_style(
    style = cell_fill(color = "lightcoral"),
    locations = cells_body(rows = 6)
  )
Hotelling T² Chart Summary
Multivariate control chart statistics
Metric Value
Number of Variables 3
Phase I Samples 80
UCL 16.032
Mean T² 312.980
Max T² 1,841.013
Out-of-Control Points 22
Code
# Create T² control chart
t2_data <- tibble(
  Sample = 1:length(T2_values),
  T2 = T2_values,
  UCL = UCL_T2,
  Phase = ifelse(Sample <= 80, "Phase I (Training)", "Phase II (Monitoring)"),
  Status = ifelse(T2 > UCL, "Out of Control", "In Control")
)

t2_plot <- plot_ly(t2_data) %>%
  add_trace(
    x = ~Sample,
    y = ~T2,
    type = "scatter",
    mode = "lines+markers",
    name = "T² Statistic",
    line = list(color = "blue"),
    marker = list(
      size = 6,
      color = ~ifelse(Status == "Out of Control", "red", "blue")
    ),
    text = ~paste0(
      "Sample: ", Sample, "<br>",
      "T²: ", round(T2, 3), "<br>",
      "Phase: ", Phase, "<br>",
      "Status: ", Status
    ),
    hovertemplate = "%{text}<extra></extra>"
  ) %>%
  add_trace(
    x = ~Sample,
    y = ~UCL,
    type = "scatter",
    mode = "lines",
    name = "UCL",
    line = list(color = "red", dash = "dash", width = 3),
    hovertemplate = "<b>UCL:</b> %{y:.3f}<extra></extra>"
  ) %>%
  add_segments(
    x = 80.5, xend = 80.5,
    y = 0, yend = max(t2_data$T2) * 1.1,
    line = list(color = "gray", dash = "dot", width = 2),
    showlegend = FALSE,
    hoverinfo = "skip"
  ) %>%
  layout(
    title = list(
      text = "Hotelling T² Control Chart<br><sub>Monitoring multiple correlated variables simultaneously</sub>",
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Sample Number"),
    yaxis = list(title = "T² Statistic"),
    hovermode = "closest",
    annotations = list(
      list(
        x = 80.5, y = max(t2_data$T2) * 1.05,
        text = "Phase I | Phase II",
        showarrow = FALSE,
        xanchor = "center",
        font = list(size = 10, color = "gray")
      )
    )
  )

t2_plot

Interpreting T² Results

Understanding the T² Statistic

What T² Measures: - Overall deviation from the target mean vector - Accounts for variable correlations - Larger values indicate process is farther from target

When to Investigate: - Any point above UCL indicates out-of-control - Patterns (runs, trends) suggest special causes - Compare Phase I vs Phase II performance

Decomposition Analysis

When T² signals out-of-control, we need to identify which variable(s) caused the signal.

Code
# For out-of-control points, decompose T²
ooc_indices <- which(T2_values > UCL_T2)

if (length(ooc_indices) > 0) {
  cat("\nDecomposition of Out-of-Control Points:\n\n")
  
  # Analyze first few out-of-control points
  analyze_points <- head(ooc_indices, 3)
  
  decomp_results <- list()
  
  for (idx in analyze_points) {
    x_i <- X[idx, ]
    deviation <- x_i - mu_hat
    
    # Calculate contribution of each variable
    S_inv <- solve(Sigma_hat)
    contributions <- numeric(p)
    
    for (j in 1:p) {
      e_j <- rep(0, p)
      e_j[j] <- 1
      contributions[j] <- n * t(deviation) %*% S_inv %*% e_j * deviation[j]
    }
    
    decomp_results[[length(decomp_results) + 1]] <- tibble(
      Sample = idx,
      Variable = quality_vars,
      Value = x_i,
      Target = mu_hat,
      Deviation = deviation,
      Contribution = contributions,
      Pct_of_T2 = (contributions / T2_values[idx]) * 100
    )
  }
  
  # Display decomposition table
  bind_rows(decomp_results) %>%
    gt() %>%
    tab_header(
      title = "T² Decomposition Analysis",
      subtitle = "Identifying which variables caused out-of-control signals"
    ) %>%
    fmt_number(
      columns = c(Value, Target, Deviation, Contribution),
      decimals = 3
    ) %>%
    fmt_number(
      columns = Pct_of_T2,
      decimals = 1
    ) %>%
    tab_style(
      style = cell_fill(color = "lightcoral"),
      locations = cells_body(
        columns = Pct_of_T2,
        rows = Pct_of_T2 > 40
      )
    ) %>%
    tab_style(
      style = list(
        cell_fill(color = "lightyellow"),
        cell_text(weight = "bold")
      ),
      locations = cells_body(columns = Sample)
    )
} else {
  cat("\nNo out-of-control points detected in T² chart.\n")
}

Decomposition of Out-of-Control Points:
T² Decomposition Analysis
Identifying which variables caused out-of-control signals
Sample Variable Value Target Deviation Contribution Pct_of_T2
3 temperature 149.950 150.029 −0.079 −12.126 −32.6
3 pressure 198.760 200.309 −1.549 48.773 131.2
3 flow_rate 49.950 50.019 −0.069 0.523 1.4
4 temperature 150.180 150.029 0.151 42.757 223.3
4 pressure 200.880 200.309 0.571 −0.450 −2.3
4 flow_rate 50.080 50.019 0.061 −23.158 −120.9
81 temperature 152.150 150.029 2.121 5,489.128 313.7
81 pressure 205.320 200.309 5.011 −569.587 −32.5
81 flow_rate 51.180 50.019 1.161 −3,169.505 −181.1

2. MEWMA (Multivariate EWMA) Chart

Theory

MEWMA (Multivariate Exponentially Weighted Moving Average) is more sensitive to small sustained shifts than T².

Formula: \[Z_i = \Lambda X_i + (I - \Lambda) Z_{i-1}\]

Where: - \(Z_i\) = EWMA vector at time \(i\) - \(\Lambda\) = diagonal matrix of smoothing constants (typically 0.1-0.3) - \(X_i\) = observation vector at time \(i\) - \(I\) = identity matrix

MEWMA Statistic: \[T_i^2 = Z_i' \Sigma_Z^{-1} Z_i\]

Where \(\Sigma_Z\) is the covariance matrix of \(Z_i\).

Control Limit: Typically set using simulation to achieve desired false alarm rate.

Advantages: - More sensitive to small shifts - Smooths out random noise - Better for detecting gradual process drift

Code
# MEWMA parameters
lambda <- 0.2  # Smoothing constant (0 < lambda <= 1)
               # Smaller = more smoothing, more sensitive to small shifts

cat("MEWMA Chart Setup:\n")
MEWMA Chart Setup:
Code
cat("Smoothing constant (lambda):", lambda, "\n\n")
Smoothing constant (lambda): 0.2 
Code
# Initialize MEWMA vectors
Z <- matrix(0, nrow = nrow(X), ncol = p)
Z[1, ] <- X[1, ] - mu_hat  # Initialize with first observation

# Calculate MEWMA vectors
# Z_i = λ(X_i - μ) + (1-λ)Z_{i-1}
for (i in 2:nrow(X)) {
  Z[i, ] <- lambda * (X[i, ] - mu_hat) + (1 - lambda) * Z[i-1, ]
}

# Calculate MEWMA T² statistic
# Covariance of Z_i: Σ_Z = [λ/(2-λ)]Σ
Sigma_Z <- (lambda / (2 - lambda)) * Sigma_hat

MEWMA_T2 <- numeric(nrow(X))
for (i in 1:nrow(X)) {
  MEWMA_T2[i] <- t(Z[i, ]) %*% solve(Sigma_Z) %*% Z[i, ]
}

# Control limit
# Using h parameter (typically 9-13 for good performance)
h <- 11.5  # Control limit parameter
UCL_MEWMA <- h

cat("MEWMA Chart Parameters:\n")
MEWMA Chart Parameters:
Code
cat("Smoothing constant (λ):", lambda, "\n")
Smoothing constant (λ): 0.2 
Code
cat("Control limit parameter (h):", h, "\n")
Control limit parameter (h): 11.5 
Code
cat("UCL:", round(UCL_MEWMA, 3), "\n")
UCL: 11.5 
Code
cat("Number of out-of-control points:", sum(MEWMA_T2 > UCL_MEWMA), "\n\n")
Number of out-of-control points: 23 
Code
# Create MEWMA results table
mewma_summary <- tibble(
  Metric = c("Smoothing Constant (λ)", "Control Limit (h)", 
             "Mean MEWMA T²", "Max MEWMA T²", "Out-of-Control Points"),
  Value = c(lambda, UCL_MEWMA, mean(MEWMA_T2), max(MEWMA_T2), 
            sum(MEWMA_T2 > UCL_MEWMA))
)

mewma_summary %>%
  gt() %>%
  tab_header(
    title = "MEWMA Chart Summary",
    subtitle = "Multivariate exponentially weighted moving average"
  ) %>%
  fmt_number(
    columns = Value,
    rows = 1:4,
    decimals = 3
  ) %>%
  fmt_number(
    columns = Value,
    rows = 5,
    decimals = 0
  ) %>%
  tab_style(
    style = cell_fill(color = "lightcoral"),
    locations = cells_body(rows = 5)
  )
MEWMA Chart Summary
Multivariate exponentially weighted moving average
Metric Value
Smoothing Constant (λ) 0.200
Control Limit (h) 11.500
Mean MEWMA T² 1,915.799
Max MEWMA T² 13,632.016
Out-of-Control Points 23
Code
# Create MEWMA control chart
mewma_data <- tibble(
  Sample = 1:length(MEWMA_T2),
  MEWMA = MEWMA_T2,
  UCL = UCL_MEWMA,
  Phase = ifelse(Sample <= 80, "Phase I", "Phase II"),
  Status = ifelse(MEWMA > UCL, "Out of Control", "In Control")
)

mewma_plot <- plot_ly(mewma_data) %>%
  add_trace(
    x = ~Sample,
    y = ~MEWMA,
    type = "scatter",
    mode = "lines+markers",
    name = "MEWMA T²",
    line = list(color = "darkgreen"),
    marker = list(
      size = 6,
      color = ~ifelse(Status == "Out of Control", "red", "darkgreen")
    ),
    text = ~paste0(
      "Sample: ", Sample, "<br>",
      "MEWMA T²: ", round(MEWMA, 3), "<br>",
      "Phase: ", Phase, "<br>",
      "Status: ", Status
    ),
    hovertemplate = "%{text}<extra></extra>"
  ) %>%
  add_trace(
    x = ~Sample,
    y = ~UCL,
    type = "scatter",
    mode = "lines",
    name = "UCL",
    line = list(color = "red", dash = "dash", width = 3),
    hovertemplate = "<b>UCL:</b> %{y:.3f}<extra></extra>"
  ) %>%
  add_segments(
    x = 80.5, xend = 80.5,
    y = 0, yend = max(mewma_data$MEWMA) * 1.1,
    line = list(color = "gray", dash = "dot", width = 2),
    showlegend = FALSE,
    hoverinfo = "skip"
  ) %>%
  layout(
    title = list(
      text = "MEWMA Control Chart<br><sub>More sensitive to small sustained shifts</sub>",
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Sample Number"),
    yaxis = list(title = "MEWMA T² Statistic"),
    hovermode = "closest"
  )

mewma_plot

Comparing T² vs MEWMA

Code
# Create comparison dataframe
comparison_data <- tibble(
  Sample = 1:length(T2_values),
  `T² Statistic` = T2_values / UCL_T2,  # Normalize by UCL
  `MEWMA Statistic` = MEWMA_T2 / UCL_MEWMA,
  Phase = ifelse(Sample <= 80, "Phase I", "Phase II")
)

# Create comparison plot
comp_plot <- plot_ly(comparison_data) %>%
  add_trace(
    x = ~Sample,
    y = ~`T² Statistic`,
    type = "scatter",
    mode = "lines",
    name = "T² (normalized)",
    line = list(color = "blue", width = 2),
    hovertemplate = "<b>T²:</b> %{y:.3f}<extra></extra>"
  ) %>%
  add_trace(
    x = ~Sample,
    y = ~`MEWMA Statistic`,
    type = "scatter",
    mode = "lines",
    name = "MEWMA (normalized)",
    line = list(color = "darkgreen", width = 2),
    hovertemplate = "<b>MEWMA:</b> %{y:.3f}<extra></extra>"
  ) %>%
  add_segments(
    x = 0, xend = max(comparison_data$Sample),
    y = 1, yend = 1,
    line = list(color = "red", dash = "dash", width = 2),
    showlegend = FALSE,
    hoverinfo = "skip"
  ) %>%
  add_segments(
    x = 80.5, xend = 80.5,
    y = 0, yend = max(c(comparison_data$`T² Statistic`, 
                       comparison_data$`MEWMA Statistic`)),
    line = list(color = "gray", dash = "dot", width = 2),
    showlegend = FALSE,
    hoverinfo = "skip"
  ) %>%
  layout(
    title = list(
      text = "T² vs MEWMA Comparison<br><sub>Both statistics normalized by their control limits</sub>",
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Sample Number"),
    yaxis = list(title = "Normalized Statistic (UCL = 1.0)"),
    hovermode = "x unified",
    annotations = list(
      list(
        x = max(comparison_data$Sample) / 2,
        y = 1.05,
        text = "Control Limit",
        showarrow = FALSE,
        font = list(size = 10, color = "red")
      )
    )
  )

comp_plot

Key Observations:

  • T² Chart: Better for detecting large, sudden shifts
  • MEWMA Chart: Better for detecting small, gradual shifts
  • MEWMA smooths the data, making trends more visible
  • Notice how MEWMA responds earlier to the shift at sample 81

3. Variable Contribution Analysis

Identifying Root Causes

When a multivariate chart signals out-of-control, we need to identify which variable(s) are responsible.

Code
# For recent out-of-control points, show variable contributions
recent_ooc <- tail(which(MEWMA_T2 > UCL_MEWMA), 5)

if (length(recent_ooc) > 0) {
  # Calculate standardized deviations
  contrib_data <- list()
  
  for (idx in recent_ooc) {
    x_i <- X[idx, ]
    std_dev <- (x_i - mu_hat) / sqrt(diag(Sigma_hat))
    
    contrib_data[[length(contrib_data) + 1]] <- tibble(
      Sample = idx,
      Variable = quality_vars,
      Value = x_i,
      Target = mu_hat,
      `Std Deviation` = std_dev,
      `Abs Std Dev` = abs(std_dev),
      Status = ifelse(abs(std_dev) > 3, "[X] Out of Control",
                     ifelse(abs(std_dev) > 2, "[WARNING] Near Limit", "[OK] In Control"))
    )
  }
  
  contrib_table <- bind_rows(contrib_data)
  
  contrib_table %>%
    gt() %>%
    tab_header(
      title = "Variable Contribution Analysis",
      subtitle = "Which variables are causing out-of-control signals?"
    ) %>%
    fmt_number(
      columns = c(Value, Target, `Std Deviation`),
      decimals = 3
    ) %>%
    fmt_number(
      columns = `Abs Std Dev`,
      decimals = 2
    ) %>%
    tab_style(
      style = cell_fill(color = "lightcoral"),
      locations = cells_body(
        columns = `Abs Std Dev`,
        rows = `Abs Std Dev` > 3
      )
    ) %>%
    tab_style(
      style = cell_fill(color = "lightyellow"),
      locations = cells_body(
        columns = `Abs Std Dev`,
        rows = `Abs Std Dev` > 2 & `Abs Std Dev` <= 3
      )
    ) %>%
    tab_style(
      style = cell_fill(color = "lightgreen"),
      locations = cells_body(
        columns = `Abs Std Dev`,
        rows = `Abs Std Dev` <= 2
      )
    )
  
  # Create contribution bar chart for most recent OOC point
  latest_ooc <- tail(recent_ooc, 1)
  latest_contrib <- contrib_table %>% filter(Sample == latest_ooc)
  
  contrib_plot <- plot_ly(
    data = latest_contrib,
    x = ~Variable,
    y = ~`Std Deviation`,
    type = "bar",
    marker = list(
      color = ~ifelse(abs(`Std Deviation`) > 3, "red",
                     ifelse(abs(`Std Deviation`) > 2, "orange", "green"))
    ),
    text = ~round(`Std Deviation`, 2),
    textposition = "outside",
    hovertemplate = paste(
      "<b>%{x}</b><br>",
      "Std Dev: %{y:.3f}<br>",
      "<extra></extra>"
    )
  ) %>%
    add_segments(
      x = -0.5, xend = length(quality_vars) - 0.5,
      y = 3, yend = 3,
      line = list(color = "red", dash = "dash", width = 2),
      showlegend = FALSE,
      name = "+3σ"
    ) %>%
    add_segments(
      x = -0.5, xend = length(quality_vars) - 0.5,
      y = -3, yend = -3,
      line = list(color = "red", dash = "dash", width = 2),
      showlegend = FALSE,
      name = "-3σ"
    ) %>%
    layout(
      title = list(
        text = paste0("Variable Contributions - Sample ", latest_ooc, 
                     "<br><sub>Standardized deviations from target</sub>"),
        font = list(size = 16, weight = "bold")
      ),
      xaxis = list(title = "Variable"),
      yaxis = list(title = "Standardized Deviation (σ)"),
      showlegend = FALSE
    )
  
  contrib_plot
} else {
  cat("\nNo recent out-of-control points for contribution analysis.\n")
}

4. Process Monitoring Dashboard

Code
# Create summary metrics
dashboard_metrics <- tibble(
  Metric = c(
    "Total Samples",
    "Phase I Samples",
    "Phase II Samples",
    "Variables Monitored",
    "T² Out-of-Control",
    "MEWMA Out-of-Control",
    "Current Process Status"
  ),
  Value = c(
    nrow(X),
    80,
    nrow(X) - 80,
    p,
    sum(T2_values > UCL_T2),
    sum(MEWMA_T2 > UCL_MEWMA),
    ifelse(tail(MEWMA_T2, 1) > UCL_MEWMA, "[X] OUT OF CONTROL", "[OK] IN CONTROL")
  )
)

dashboard_metrics %>%
  gt() %>%
  tab_header(
    title = "Multivariate SPC Dashboard",
    subtitle = paste("Analysis Date:", Sys.Date())
  ) %>%
  tab_style(
    style = cell_fill(color = "lightcoral"),
    locations = cells_body(
      columns = Value,
      rows = grepl("OUT OF CONTROL", Value)
    )
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(
      columns = Value,
      rows = grepl("IN CONTROL", Value)
    )
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightblue"),
      cell_text(weight = "bold")
    ),
    locations = cells_column_labels()
  )
Multivariate SPC Dashboard
Analysis Date: 2025-12-20
Metric Value
Total Samples 100
Phase I Samples 80
Phase II Samples 20
Variables Monitored 3
T² Out-of-Control 22
MEWMA Out-of-Control 23
Current Process Status [X] OUT OF CONTROL

5. Recommendations and Actions

Current Process Assessment

Code
# Calculate process performance metrics
recent_samples <- 10
recent_T2 <- tail(T2_values, recent_samples)
recent_MEWMA <- tail(MEWMA_T2, recent_samples)

recent_violations_T2 <- sum(recent_T2 > UCL_T2)
recent_violations_MEWMA <- sum(recent_MEWMA > UCL_MEWMA)

cat("\n### Recent Process Performance (Last", recent_samples, "Samples)\n\n")

### Recent Process Performance (Last 10 Samples)
Code
cat("- T² violations:", recent_violations_T2, "out of", recent_samples, 
    paste0("(", round(recent_violations_T2/recent_samples*100, 1), "%)\n"))
- T² violations: 10 out of 10 (100%)
Code
cat("- MEWMA violations:", recent_violations_MEWMA, "out of", recent_samples,
    paste0("(", round(recent_violations_MEWMA/recent_samples*100, 1), "%)\n\n"))
- MEWMA violations: 10 out of 10 (100%)
Code
if (recent_violations_MEWMA > 0) {
  cat("**Status: [WARNING] Process requires immediate attention**\n\n")
  cat("**Recommended Actions:**\n")
  cat("1. Stop production and investigate root cause\n")
  cat("2. Review variable contributions to identify problem source\n")
  cat("3. Check equipment calibration and process parameters\n")
  cat("4. Review operator procedures and training\n")
  cat("5. Implement corrective actions\n")
  cat("6. Document findings and countermeasures\n")
} else {
  cat("**Status: [OK] Process is in statistical control**\n\n")
  cat("**Recommended Actions:**\n")
  cat("1. Continue current monitoring protocol\n")
  cat("2. Maintain equipment calibration schedule\n")
  cat("3. Regular operator training refreshers\n")
  cat("4. Update control limits quarterly\n")
}
**Status: [WARNING] Process requires immediate attention**

**Recommended Actions:**
1. Stop production and investigate root cause
2. Review variable contributions to identify problem source
3. Check equipment calibration and process parameters
4. Review operator procedures and training
5. Implement corrective actions
6. Document findings and countermeasures

Advantages of Multivariate SPC

Compared to univariate monitoring:

  1. Accounts for correlations: Detects patterns that univariate charts miss
  2. Reduces chart proliferation: One chart instead of multiple
  3. Lower false alarm rate: Adjusts for multiple testing
  4. More powerful: Better at detecting real process shifts
  5. Process understanding: Reveals relationships between variables

When to Use Each Method

Method Best For Advantages Disadvantages
Large sudden shifts Simple, well-understood Less sensitive to small shifts
MEWMA Small gradual shifts Memory, smoothing Slower response to large shifts
MCUSUM Persistent small shifts Directional info Complex interpretation
PCA-based High dimensions (10+ vars) Dimension reduction Requires more data

Implementation Guidelines

Phase I (Establishing Control): 1. Collect 20-30 samples during stable operation 2. Calculate mean vector and covariance matrix 3. Verify multivariate normality 4. Establish control limits

Phase II (Ongoing Monitoring): 1. Plot new observations on control charts 2. Investigate out-of-control signals immediately 3. Use contribution analysis to identify root causes 4. Document and implement corrective actions 5. Update control limits after process improvements


Report Generated: 2025-12-20 10:45:34.022343
Analysis Method: Hotelling T² and MEWMA
Statistical Software: R with qcr package (unified quality control)
Methodology: Multivariate Statistical Process Control
Package Info: qcr integrates qcc and MSQC functionality