Six Sigma Quality Control Analysis

Comprehensive SPC Implementation with R, GT Tables, and Plotly

Author

Quality Control Department

Published

December 19, 2025

Setup and Data Import

Required Libraries

Code
#| message: false
#| warning: false

# Install packages if needed
packages <- c("qcc", "plotly", "gt", "dplyr", "tidyr", "ggplot2", 
              "readr", "lubridate", "DT", "knitr")

# Uncomment to install if needed
# install.packages(packages)

# Load libraries
library(qcc)
library(plotly)
library(gt)
library(dplyr)
library(tidyr)
library(ggplot2)
library(readr)
library(lubridate)
library(DT)
library(knitr)

# Set theme
theme_set(theme_minimal())

Import Dataset

Code
# Import your quality control data
# Replace 'production_data.csv' with your actual file path
# Expected columns: date, part_id, measurement, operator, shift, etc.

# For this example, we'll check if a file exists, otherwise use sample data
if (file.exists("production_data.csv")) {
  qc_data <- read_csv("production_data.csv")
  cat("[OK] Production data loaded successfully\n")
  cat("Rows:", nrow(qc_data), "| Columns:", ncol(qc_data), "\n")
} else {
  cat("[INFO] No production_data.csv found. Using generated sample data.\n")
  cat("To use your own data, place 'production_data.csv' in the same directory.\n\n")
  
  # Generate realistic sample data
  set.seed(42)
  n_samples <- 250
  
  qc_data <- tibble(
    date = seq.Date(from = Sys.Date() - n_samples, 
                    to = Sys.Date(), 
                    by = "day")[1:n_samples],
    shift = sample(c("Day", "Night", "Swing"), n_samples, replace = TRUE),
    operator = sample(paste0("OP", 1:5), n_samples, replace = TRUE),
    part_id = 1:n_samples,
    dimension_1 = rnorm(n_samples, mean = 10.00, sd = 0.15),
    dimension_2 = rnorm(n_samples, mean = 25.50, sd = 0.30),
    weight = rnorm(n_samples, mean = 150, sd = 3),
    defects = rpois(n_samples, lambda = 0.5),
    inspected_units = sample(95:105, n_samples, replace = TRUE),
    defective_units = rbinom(n_samples, size = 100, prob = 0.03)
  )
}
[OK] Production data loaded successfully
Rows: 125 | Columns: 11 
Code
# Display data summary
qc_data %>%
  head(10) %>%
  gt() %>%
  tab_header(
    title = "Quality Control Data - Sample Preview",
    subtitle = "First 10 observations"
  ) %>%
  fmt_number(
    columns = starts_with("dimension") | matches("weight"),
    decimals = 3
  ) %>%
  tab_style(
    style = cell_fill(color = "lightblue"),
    locations = cells_column_labels()
  )
Quality Control Data - Sample Preview
First 10 observations
date shift operator part_id subgroup dimension_1 dimension_2 weight defects inspected_units defective_units
2024-06-01 Day OP1 1 1 10.120 25.480 151.200 0 100 3
2024-06-01 Day OP1 2 1 9.980 25.520 149.800 1 100 3
2024-06-01 Day OP1 3 1 10.050 25.550 150.500 0 100 3
2024-06-01 Day OP1 4 1 10.080 25.490 150.100 0 100 3
2024-06-01 Day OP1 5 1 10.150 25.610 151.800 1 100 3
2024-06-01 Day OP2 6 2 9.950 25.430 149.500 0 100 2
2024-06-01 Day OP2 7 2 10.030 25.580 150.300 0 100 2
2024-06-01 Day OP2 8 2 10.110 25.470 151.000 1 100 2
2024-06-01 Day OP2 9 2 9.990 25.540 149.900 0 100 2
2024-06-01 Day OP2 10 2 10.060 25.510 150.600 0 100 2

Data Summary Statistics

Code
# Create summary statistics table
summary_stats <- qc_data %>%
  select(where(is.numeric)) %>%
  summarise(across(everything(), 
                   list(Mean = ~mean(., na.rm = TRUE),
                        SD = ~sd(., na.rm = TRUE),
                        Min = ~min(., na.rm = TRUE),
                        Max = ~max(., na.rm = TRUE),
                        N = ~sum(!is.na(.))),
                   .names = "{.col}__{.fn}")) %>%
  pivot_longer(everything(), 
               names_to = c("Variable", "Statistic"), 
               names_sep = "__") %>%
  pivot_wider(names_from = Statistic, values_from = value)

summary_stats %>%
  gt() %>%
  tab_header(
    title = "Dataset Summary Statistics",
    subtitle = "Descriptive statistics for all numeric variables"
  ) %>%
  fmt_number(
    columns = c(Mean, SD, Min, Max),
    decimals = 3
  ) %>%
  fmt_number(
    columns = N,
    decimals = 0
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightyellow"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(columns = Variable)
  )
Dataset Summary Statistics
Descriptive statistics for all numeric variables
Variable Mean SD Min Max N
part_id 63.000 36.228 1.000 125.000 125
subgroup 13.000 7.240 1.000 25.000 125
dimension_1 10.048 0.059 9.950 10.150 125
dimension_2 25.516 0.038 25.430 25.610 125
weight 150.471 0.590 149.500 151.800 125
defects 0.328 0.471 0.000 1.000 125
inspected_units 100.000 0.000 100.000 100.000 125
defective_units 3.520 1.140 2.000 5.000 125

1. Control Charts

What are Control Charts?

Control charts are the primary tool of Statistical Process Control (SPC). They help you monitor process performance over time and distinguish between two types of variation:

Two Types of Variation

Common Cause Variation (Random, Natural)

  • Inherent in the process
  • Always present
  • Predictable and stable
  • Example: Slight differences in raw material, ambient temperature changes, normal equipment vibration
  • Action: Accept as natural process variation; reduce through fundamental process improvement

Special Cause Variation (Assignable)

  • NOT part of the normal process
  • Caused by specific, identifiable factors
  • Unpredictable and intermittent
  • Example: Tool wear, operator error, machine malfunction, bad batch of raw material
  • Action: Investigate and eliminate the specific cause

How Control Charts Work

Control charts have three key lines:

  1. Center Line (CL): Process average or target
  2. Upper Control Limit (UCL): Typically mean + 3sigma
  3. Lower Control Limit (LCL): Typically mean - 3sigma

The 3-sigma Rule: - If process is stable, 99.7% of points fall within +/- 3 sigma - Points outside limits indicate special causes - Points in a pattern (trends, runs) also signal special causes

Process States

In Statistical Control: - All points within control limits - Random pattern around center line - No trends or unusual patterns - Only common cause variation present - Predictable performance

Out of Control: - Points outside control limits - Non-random patterns - Trends or cycles - Special causes present - Unpredictable performance

Types of Control Charts by Data Type

Variable Data Charts (Continuous measurements)

X-bar and R Chart: - Use when: You have subgroups of 2-10 measurements - X-bar chart: Monitors process average - R chart: Monitors process variation (range) - Example: Measuring 5 parts per hour, tracking diameter - Best for: Manufacturing processes with rational subgrouping

Individual and Moving Range (I-MR) Chart: - Use when: Individual measurements, no natural subgroups - I chart: Monitors individual values - MR chart: Monitors variation between consecutive measurements - Example: Daily temperature, single measurement per batch, chemical properties - Best for: Slow processes, expensive measurements, destructive testing

Attribute Data Charts (Count or proportion data)

p-Chart: - Use when: Tracking proportion or percentage defective - Variable sample size: Can handle different sample sizes - Example: % of products with defects, % late deliveries - Best for: Quality rates, pass/fail data

c-Chart: - Use when: Counting defects in constant area/time - Constant sample size: Each sample must be the same - Example: Number of scratches per panel, defects per 1000 lines of code - Best for: Defect counts with fixed opportunity

Control Chart Selection Guide

Do you have measurements (length, weight, temperature)?
├─ YES → Variable Data
│   ├─ Do you have subgroups (3-10 measurements per sample)?
│   │   ├─ YES → Use X-bar and R Chart
│   │   └─ NO → Use I-MR Chart
│   
└─ NO → Attribute Data (counts, pass/fail)
    ├─ Are you tracking proportion/percentage?
    │   └─ YES → Use p-Chart
    └─ Are you counting defects?
        └─ YES → Use c-Chart (constant sample) or u-Chart (variable sample)

Western Electric Rules (Out-of-Control Signals)

Your process is out of control if:

  1. One point beyond Zone A (outside 3sigma limits)
  2. Two out of three consecutive points in Zone A or beyond (outside 2sigma)
  3. Four out of five consecutive points in Zone B or beyond (outside 1sigma)
  4. Eight consecutive points on one side of center line
  5. Six points in a row steadily increasing or decreasing (trend)
  6. Fifteen points in a row in Zone C (within +/- 1 sigma of center line)

Zone Definitions: - Zone C: Within +/- 1 sigma of center line - Zone B: Between 1sigma and 2sigma from center line - Zone A: Between 2sigma and 3sigma from center line

Interpreting Control Charts

Good Signs (Process in Control): - [OK] Points randomly distributed around center line - [OK] Roughly equal number of points above and below center - [OK] No points outside control limits - [OK] No obvious patterns or trends

Warning Signs (Investigate): - [WARNING] Points near control limits (even if within) - [WARNING] Sudden shifts in level - [WARNING] Gradual trends upward or downward - [WARNING] Cycles or repeated patterns - [WARNING] Points clustered on one side

Critical Issues (Take Action): - [X] Any point outside control limits - [X] Multiple Western Electric rule violations - [X] Obvious non-random patterns

Taking Action on Control Charts

When Out of Control: 1. STOP - Do not adjust process yet 2. INVESTIGATE - Find the special cause 3. DOCUMENT - Record what happened, when, possible causes 4. CORRECT - Fix the specific problem 5. VERIFY - Confirm process returns to control 6. PREVENT - Implement countermeasures to prevent recurrence

Common Mistakes: - ❌ Over-adjusting: Making changes when process is in control (creates more variation!) - ❌ Ignoring signals: Not investigating out-of-control points - ❌ Wrong limits: Using specification limits instead of control limits - ❌ Old limits: Not updating limits after process improvements

Control Limits vs. Specification Limits

Control Limits (from control charts): - Based on actual process performance - Calculate from data: mean +/- 3sigma - Tell you if process is stable - Voice of the Process

Specification Limits (from engineering/customer): - Based on customer requirements - Set by design, function, or customer needs - Tell you if product is acceptable - Voice of the Customer

Key Point: You can be in control but still make bad parts if control limits are wider than specification limits! This means your process is stable but not capable.

Best Practices

  1. Collect data systematically - consistent sampling plan
  2. Plot data in real-time - don’t wait days or weeks
  3. React to signals promptly - investigate special causes immediately
  4. Update control limits - recalculate after process improvements
  5. Train operators - everyone should understand how to read charts
  6. Use rational subgroups - group data from similar conditions
  7. Document actions - record investigations and corrective actions

1.1 X-bar and R Chart (Dimension 1)

Code
#| fig-width: 12
#| fig-height: 8

# Check if data has a subgroup column for rational subgrouping
if ("subgroup" %in% names(qc_data)) {
  # Use existing subgroup column
  cat("[OK] Using subgroup column from data for rational subgrouping\n\n")
  
  # Create matrix from subgroups
  subgroups_list <- split(qc_data$dimension_1, qc_data$subgroup)
  
  # Find the most common subgroup size
  subgroup_sizes <- sapply(subgroups_list, length)
  target_size <- as.numeric(names(sort(table(subgroup_sizes), decreasing = TRUE)[1]))
  
  # Filter to only complete subgroups of target size
  complete_subgroups <- subgroups_list[subgroup_sizes == target_size]
  
  if (length(complete_subgroups) >= 20 && target_size > 1) {
    dimension1_matrix <- do.call(rbind, complete_subgroups)
    cat("Created", nrow(dimension1_matrix), "subgroups of size", target_size, "\n\n")
  } else {
    cat("[WARNING] Not enough complete subgroups (need 20+). Using I-MR chart instead.\n\n")
    dimension1_matrix <- NULL
  }
} else {
  # No subgroup column - try to create subgroups of 5
  cat("[INFO] No subgroup column found. Attempting to create sequential subgroups of 5.\n\n")
  subgroup_size <- 5
  n_samples <- nrow(qc_data)
  
  if (n_samples >= subgroup_size * 20) {
    n_subgroups <- floor(n_samples / subgroup_size)
    dimension1_matrix <- matrix(
      qc_data$dimension_1[1:(n_subgroups * subgroup_size)],
      nrow = n_subgroups,
      ncol = subgroup_size,
      byrow = TRUE
    )
    cat("Created", n_subgroups, "sequential subgroups of size", subgroup_size, "\n\n")
  } else {
    cat("[WARNING] Not enough data for X-bar/R chart (need 100+ samples). Using I-MR chart instead.\n\n")
    dimension1_matrix <- NULL
  }
}
[OK] Using subgroup column from data for rational subgrouping

Created 25 subgroups of size 5 
Code
# Only proceed if we have valid matrix
if (!is.null(dimension1_matrix) && ncol(dimension1_matrix) > 1) {

# Create X-bar chart
xbar_chart <- qcc(dimension1_matrix, type = "xbar", plot = FALSE)

# Create R chart
r_chart <- qcc(dimension1_matrix, type = "R", plot = FALSE)

# Extract statistics for table
xbar_summary <- tibble(
  Chart = "X-bar",
  Statistic = c("Center Line", "UCL", "LCL", "Std. Dev"),
  Value = c(
    xbar_chart$center,
    xbar_chart$limits[2],
    xbar_chart$limits[1],
    xbar_chart$std.dev
  )
) %>%
  bind_rows(
    tibble(
      Chart = "R",
      Statistic = c("Center Line", "UCL", "LCL", "Std. Dev"),
      Value = c(
        r_chart$center,
        r_chart$limits[2],
        r_chart$limits[1],
        r_chart$std.dev
      )
    )
  )

# Display table
xbar_summary %>%
  gt() %>%
  tab_header(
    title = "X-bar and R Chart Statistics",
    subtitle = "Dimension 1 - Control Limits and Process Parameters"
  ) %>%
  fmt_number(
    columns = Value,
    decimals = 4
  ) %>%
  tab_style(
    style = cell_fill(color = "lightblue"),
    locations = cells_body(columns = Value, rows = Statistic == "UCL")
  ) %>%
  tab_style(
    style = cell_fill(color = "lightcoral"),
    locations = cells_body(columns = Value, rows = Statistic == "LCL")
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightyellow"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(columns = Value, rows = Statistic == "Center Line")
  )

# Create interactive Plotly charts
xbar_data <- tibble(
  Sample = 1:length(xbar_chart$statistics),
  Mean = xbar_chart$statistics,
  UCL = xbar_chart$limits[2],
  CL = xbar_chart$center,
  LCL = xbar_chart$limits[1]
)

# X-bar Chart
xbar_plotly <- plot_ly(xbar_data) %>%
  add_trace(
    x = ~Sample, 
    y = ~Mean, 
    type = "scatter",
    mode = "lines+markers",
    name = "Sample Mean",
    line = list(color = "black"),
    marker = list(size = 6),
    hovertemplate = paste(
      "<b>Sample:</b> %{x}<br>",
      "<b>Mean:</b> %{y:.4f}<br>",
      "<extra></extra>"
    )
  ) %>%
  add_trace(
    x = ~Sample, 
    y = ~UCL, 
    type = "scatter",
    mode = "lines",
    name = "UCL",
    line = list(color = "red", dash = "dash", width = 2),
    hovertemplate = paste("<b>UCL:</b> %{y:.4f}<extra></extra>")
  ) %>%
  add_trace(
    x = ~Sample, 
    y = ~CL, 
    type = "scatter",
    mode = "lines",
    name = "Center Line",
    line = list(color = "blue", width = 2),
    hovertemplate = paste("<b>Center Line:</b> %{y:.4f}<extra></extra>")
  ) %>%
  add_trace(
    x = ~Sample, 
    y = ~LCL, 
    type = "scatter",
    mode = "lines",
    name = "LCL",
    line = list(color = "red", dash = "dash", width = 2),
    hovertemplate = paste("<b>LCL:</b> %{y:.4f}<extra></extra>")
  ) %>%
  layout(
    title = list(
      text = "X-bar Control Chart - Dimension 1",
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Sample Number"),
    yaxis = list(title = "Sample Mean"),
    hovermode = "x unified",
    showlegend = TRUE
  )

xbar_plotly

# R Chart
r_data <- tibble(
  Sample = 1:length(r_chart$statistics),
  Range = r_chart$statistics,
  UCL = r_chart$limits[2],
  CL = r_chart$center,
  LCL = r_chart$limits[1]
)

r_plotly <- plot_ly(r_data) %>%
  add_trace(
    x = ~Sample, 
    y = ~Range, 
    type = "scatter",
    mode = "lines+markers",
    name = "Range",
    line = list(color = "darkgreen"),
    marker = list(size = 6),
    hovertemplate = paste(
      "<b>Sample:</b> %{x}<br>",
      "<b>Range:</b> %{y:.4f}<br>",
      "<extra></extra>"
    )
  ) %>%
  add_trace(
    x = ~Sample, 
    y = ~UCL, 
    type = "scatter",
    mode = "lines",
    name = "UCL",
    line = list(color = "red", dash = "dash", width = 2)
  ) %>%
  add_trace(
    x = ~Sample, 
    y = ~CL, 
    type = "scatter",
    mode = "lines",
    name = "Center Line",
    line = list(color = "blue", width = 2)
  ) %>%
  add_trace(
    x = ~Sample, 
    y = ~LCL, 
    type = "scatter",
    mode = "lines",
    name = "LCL",
    line = list(color = "red", dash = "dash", width = 2)
  ) %>%
  layout(
    title = list(
      text = "R Control Chart - Dimension 1",
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Sample Number"),
    yaxis = list(title = "Range"),
    hovermode = "x unified",
    showlegend = TRUE
  )

r_plotly

} else {
  cat("\n[WARNING] X-bar and R charts require subgroups of size > 1.\n")
  cat("Using Individual-Moving Range (I-MR) chart instead.\n")
  cat("See section 1.2 below for I-MR analysis.\n\n")
  
  # Show a message table
  tibble(
    Message = "X-bar/R Chart Not Applicable",
    Reason = "Data consists of individual measurements",
    Alternative = "See I-MR Chart in Section 1.2"
  ) %>%
    gt() %>%
    tab_header(
      title = "Chart Type Note",
      subtitle = "Alternative control chart method recommended"
    ) %>%
    tab_style(
      style = cell_fill(color = "lightyellow"),
      locations = cells_body()
    )
}

1.2 Individual-Moving Range Chart (Weight)

Code
#| fig-width: 12
#| fig-height: 8

# Create I chart
individual_data <- qc_data$weight[!is.na(qc_data$weight)]
i_chart <- qcc(individual_data, type = "xbar.one", plot = FALSE)

# Calculate moving ranges manually
moving_ranges <- abs(diff(individual_data))

# Calculate control limits for moving range manually
# (qcc doesn't have a specific type for MR chart alone)
mr_mean <- mean(moving_ranges)
mr_ucl <- 3.267 * mr_mean  # D4 constant for n=2
mr_lcl <- 0  # D3 constant for n=2 is 0

# Create summary table
imr_summary <- tibble(
  Chart = c(rep("Individuals", 3), rep("Moving Range", 3)),
  Statistic = rep(c("UCL", "Center Line", "LCL"), 2),
  Value = c(
    i_chart$limits[2],
    i_chart$center,
    i_chart$limits[1],
    mr_ucl,
    mr_mean,
    mr_lcl
  )
)

imr_summary %>%
  gt() %>%
  tab_header(
    title = "I-MR Chart Control Limits",
    subtitle = "Weight Measurements - Individual and Moving Range Statistics"
  ) %>%
  fmt_number(
    columns = Value,
    decimals = 3
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightyellow"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(rows = Statistic == "Center Line")
  )
I-MR Chart Control Limits
Weight Measurements - Individual and Moving Range Statistics
Chart Statistic Value
Individuals UCL 152.683
Individuals Center Line 150.471
Individuals LCL 148.260
Moving Range UCL 2.716
Moving Range Center Line 0.831
Moving Range LCL 0.000
Code
# Create I Chart
i_data <- tibble(
  Observation = 1:length(individual_data),
  Value = individual_data,
  UCL = i_chart$limits[2],
  CL = i_chart$center,
  LCL = i_chart$limits[1]
)

i_plotly <- plot_ly(i_data) %>%
  add_trace(
    x = ~Observation, 
    y = ~Value, 
    type = "scatter",
    mode = "lines+markers",
    name = "Individual Value",
    line = list(color = "darkblue"),
    marker = list(size = 5),
    hovertemplate = paste(
      "<b>Observation:</b> %{x}<br>",
      "<b>Weight:</b> %{y:.3f}<br>",
      "<extra></extra>"
    )
  ) %>%
  add_trace(
    x = ~Observation, 
    y = ~UCL, 
    type = "scatter",
    mode = "lines",
    name = "UCL",
    line = list(color = "red", dash = "dash", width = 2)
  ) %>%
  add_trace(
    x = ~Observation, 
    y = ~CL, 
    type = "scatter",
    mode = "lines",
    name = "Center Line",
    line = list(color = "blue", width = 2)
  ) %>%
  add_trace(
    x = ~Observation, 
    y = ~LCL, 
    type = "scatter",
    mode = "lines",
    name = "LCL",
    line = list(color = "red", dash = "dash", width = 2)
  ) %>%
  layout(
    title = list(
      text = "Individuals Control Chart - Weight",
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Observation Number"),
    yaxis = list(title = "Weight (grams)"),
    hovermode = "x unified"
  )

i_plotly
Code
# Create MR Chart
mr_data <- tibble(
  Observation = 1:length(moving_ranges),
  Range = moving_ranges,
  UCL = mr_ucl,
  CL = mr_mean,
  LCL = mr_lcl
)

mr_plotly <- plot_ly(mr_data) %>%
  add_trace(
    x = ~Observation, 
    y = ~Range, 
    type = "scatter",
    mode = "lines+markers",
    name = "Moving Range",
    line = list(color = "darkgreen"),
    marker = list(size = 5),
    hovertemplate = paste(
      "<b>Observation:</b> %{x}<br>",
      "<b>Moving Range:</b> %{y:.3f}<br>",
      "<extra></extra>"
    )
  ) %>%
  add_trace(
    x = ~Observation, 
    y = ~UCL, 
    type = "scatter",
    mode = "lines",
    name = "UCL",
    line = list(color = "red", dash = "dash", width = 2)
  ) %>%
  add_trace(
    x = ~Observation, 
    y = ~CL, 
    type = "scatter",
    mode = "lines",
    name = "Center Line",
    line = list(color = "blue", width = 2)
  ) %>%
  add_trace(
    x = ~Observation, 
    y = ~LCL, 
    type = "scatter",
    mode = "lines",
    name = "LCL",
    line = list(color = "red", dash = "dash", width = 2)
  ) %>%
  layout(
    title = list(
      text = "Moving Range Chart - Weight",
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Observation Number"),
    yaxis = list(title = "Moving Range"),
    hovermode = "x unified"
  )

mr_plotly

1.3 p-Chart (Proportion Defective)

Code
#| fig-width: 12
#| fig-height: 6

# Create p-chart
defective_data <- qc_data$defective_units[!is.na(qc_data$defective_units)]
sample_sizes <- qc_data$inspected_units[!is.na(qc_data$inspected_units)]
sample_sizes <- sample_sizes[1:length(defective_data)]

p_chart <- qcc(defective_data, sizes = sample_sizes, type = "p", plot = FALSE)

# Create summary table
p_summary <- tibble(
  Metric = c(
    "Center Line (p-bar)",
    "UCL",
    "LCL",
    "Total Inspected",
    "Total Defective",
    "Average Sample Size",
    "Process Yield (%)"
  ),
  Value = c(
    p_chart$center,
    p_chart$limits[2],
    p_chart$limits[1],
    sum(sample_sizes),
    sum(defective_data),
    mean(sample_sizes),
    (1 - p_chart$center) * 100
  )
)

p_summary %>%
  gt() %>%
  tab_header(
    title = "p-Chart Statistics",
    subtitle = "Proportion Defective Control Chart"
  ) %>%
  fmt_number(
    columns = Value,
    rows = 1:3,
    decimals = 4
  ) %>%
  fmt_number(
    columns = Value,
    rows = 4:6,
    decimals = 0
  ) %>%
  fmt_number(
    columns = Value,
    rows = 7,
    decimals = 2
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(rows = 1:3)
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightyellow"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(rows = 7)
  )
p-Chart Statistics
Proportion Defective Control Chart
Metric Value
Center Line (p-bar) 0.0352
UCL 0.0000
LCL 0.0000
Total Inspected 12,500
Total Defective 440
Average Sample Size 100
Process Yield (%) 96.48
Code
# Create Plotly chart
p_data <- tibble(
  Sample = 1:length(p_chart$statistics),
  Proportion = p_chart$statistics,
  UCL = p_chart$limits[2],
  CL = p_chart$center,
  LCL = p_chart$limits[1],
  Defective = defective_data,
  Inspected = sample_sizes
)

p_plotly <- plot_ly(p_data) %>%
  add_trace(
    x = ~Sample, 
    y = ~Proportion, 
    type = "scatter",
    mode = "lines+markers",
    name = "Proportion Defective",
    line = list(color = "darkgreen"),
    marker = list(size = 8, color = "darkgreen"),
    hovertemplate = paste(
      "<b>Sample:</b> %{x}<br>",
      "<b>Proportion:</b> %{y:.4f}<br>",
      "<b>Defective:</b>", p_data$Defective, "<br>",
      "<b>Inspected:</b>", p_data$Inspected, "<br>",
      "<extra></extra>"
    )
  ) %>%
  add_trace(
    x = ~Sample, 
    y = ~UCL, 
    type = "scatter",
    mode = "lines",
    name = "UCL",
    line = list(color = "red", dash = "dash", width = 2)
  ) %>%
  add_trace(
    x = ~Sample, 
    y = ~CL, 
    type = "scatter",
    mode = "lines",
    name = "p-bar (Center Line)",
    line = list(color = "blue", width = 2)
  ) %>%
  add_trace(
    x = ~Sample, 
    y = ~LCL, 
    type = "scatter",
    mode = "lines",
    name = "LCL",
    line = list(color = "red", dash = "dash", width = 2)
  ) %>%
  layout(
    title = list(
      text = "p-Chart: Proportion Defective",
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Sample Number"),
    yaxis = list(title = "Proportion Defective"),
    hovermode = "x unified"
  )

p_plotly

2. Process Capability Analysis

What is Process Capability?

Process Capability measures how well a process can meet customer specifications. It compares the natural variation of your process (what it actually does) against your specification limits (what it needs to do).

Why Does It Matter?

Imagine you’re making parts that must be between 9.50 mm and 10.50 mm: - If your process naturally produces parts between 9.80-10.20 mm → CAPABLE [OK] - If your process produces parts between 9.30-10.70 mm → NOT CAPABLE [X]

Process capability tells you if your process can consistently meet requirements or if you’ll produce defects.

Key Capability Indices

Cp (Process Capability Index)

  • Measures potential capability assuming perfect centering
  • Formula: Cp = (USL - LSL) / (6sigma)
  • Ignores where the process is centered
  • Answers: “How capable could we be in the best case?”

Cpk (Process Capability Index - Adjusted)

  • Measures actual capability considering process centering
  • Formula: Cpk = min[(USL - mu)/(3sigma), (mu - LSL)/(3sigma)]
  • Takes into account if process is off-center
  • This is the most important index - use this for decisions
  • Answers: “How capable are we right now?”

Interpretation Guidelines

Cpk Value Interpretation DPMO (Defects per Million) Sigma Level Action
< 1.00 Inadequate > 2,700 < 3sigma Immediate improvement required
1.00-1.33 Marginal 64-2,700 3-4sigma Process improvement needed
1.33-1.67 Adequate 0.6-64 4-5sigma Monitor closely, continue improvement
1.67-2.00 Good 0.002-0.6 5-6sigma Maintain performance
> 2.00 Excellent < 0.002 > 6sigma World-class performance

Understanding the Difference: Cp vs. Cpk

Example: - Specification: 10.00 +/- 0.50 mm (LSL=9.50, USL=10.50) - Process std dev (sigma) = 0.10 mm

Scenario 1 - Centered Process: - Process mean (mu) = 10.00 mm (perfectly centered) - Cp = (10.50 - 9.50) / (6 × 0.10) = 1.67 [OK] - Cpk = min[(10.50 - 10.00)/(3 × 0.10), (10.00 - 9.50)/(3 × 0.10)] = 1.67 [OK] - Result: Cp = Cpk → Process is capable and centered

Scenario 2 - Off-Center Process: - Process mean (mu) = 10.30 mm (shifted high) - Cp = (10.50 - 9.50) / (6 × 0.10) = 1.67 [OK] (unchanged) - Cpk = min[(10.50 - 10.30)/(3 × 0.10), (10.30 - 9.50)/(3 × 0.10)] = 0.67 [X] - Result: Cp > Cpk → Process has potential but is off-center

Key Takeaway: - If Cp ≈ Cpk: Process is centered [OK] - If Cp > Cpk: Process is off-center - need to adjust mean [WARNING]

What is DPMO?

DPMO = Defects Per Million Opportunities

This tells you: “If we make 1 million parts, how many will be defective?”

Examples: - Cpk = 2.00 → DPMO = 0.002 → 2 defects per billion (essentially perfect) - Cpk = 1.33 → DPMO = 64 → 64 defects per million (good) - Cpk = 1.00 → DPMO = 2,700 → 2,700 defects per million (marginal) - Cpk = 0.67 → DPMO = 45,500 → 4.5% defect rate (poor)

Prerequisites for Capability Analysis

Before calculating capability indices, verify:

  1. Process is in statistical control
    • No special causes of variation
    • Control charts show stable process
    • Don’t calculate capability for out-of-control processes
  2. Data is approximately normal
    • Use histogram or normality test
    • If not normal, may need transformation or non-parametric methods
  3. Adequate sample size
    • Minimum 100 observations recommended
    • More data = more reliable estimates
  4. Representative data
    • Data represents typical operating conditions
    • Not just from “good” periods

Short-term vs. Long-term Capability

Cpk (Short-term): - Uses within-subgroup variation (sigma from control charts) - Represents capability over short time periods - What process can achieve under ideal conditions

Ppk (Long-term): - Uses overall standard deviation (all data) - Includes all sources of variation over time - More realistic for sustained production

Typical relationship: Ppk < Cpk (long-term is usually worse than short-term)

Improving Process Capability

If Cpk < 1.33, improve by:

Reduce Variation (Increase Cp and Cpk): - Identify and eliminate sources of variation - Use DOE to optimize process parameters - Improve equipment maintenance - Standardize procedures - Control environmental factors

Center the Process (Increase Cpk): - Adjust process mean toward target - Reduce systematic bias - Improve setup procedures - Monitor and adjust regularly

Widen Specifications (Last Resort): - Work with engineering/customers - Only if technically acceptable - May not be feasible for many applications


Your Process Capability Results

Code
#| fig-width: 12
#| fig-height: 8

# Define specification limits for Dimension 1
USL <- 10.50  # Upper Specification Limit
LSL <- 9.50   # Lower Specification Limit
Target <- 10.00

# Use the dimension 1 data
capability_data <- qc_data$dimension_1[!is.na(qc_data$dimension_1)]

# Calculate capability indices
process_mean <- mean(capability_data)
process_sd <- sd(capability_data)

cp <- (USL - LSL) / (6 * process_sd)
cpu <- (USL - process_mean) / (3 * process_sd)
cpl <- (process_mean - LSL) / (3 * process_sd)
cpk <- min(cpu, cpl)

# Calculate Pp and Ppk (long-term capability)
pp <- cp  # Same calculation for normal data
ppk <- cpk

# Calculate defects per million opportunities (DPMO)
prob_outside <- pnorm(LSL, process_mean, process_sd) + 
                (1 - pnorm(USL, process_mean, process_sd))
dpmo <- prob_outside * 1000000

# Calculate sigma level
sigma_level <- qnorm(1 - prob_outside/2)

# Create capability summary table
capability_summary <- tibble(
  Index = c(
    "Cp", "Cpk", "Cpu", "Cpl", "Pp", "Ppk",
    "Process Mean", "Process Std Dev", "Target",
    "DPMO", "Sigma Level"
  ),
  Value = c(
    cp, cpk, cpu, cpl, pp, ppk,
    process_mean, process_sd, Target,
    dpmo, sigma_level
  ),
  Interpretation = c(
    ifelse(cp >= 1.33, "Adequate", "Needs Improvement"),
    ifelse(cpk >= 1.33, "Adequate", 
           ifelse(cpk >= 1.0, "Marginal", "Inadequate")),
    "", "", "", "",
    "", "", "",
    ifelse(dpmo < 233, "Excellent", 
           ifelse(dpmo < 6210, "Good", "Poor")),
    ifelse(sigma_level >= 5, "Excellent",
           ifelse(sigma_level >= 4, "Good", "Fair"))
  )
)

capability_summary %>%
  gt() %>%
  tab_header(
    title = "Process Capability Analysis",
    subtitle = paste0("USL: ", USL, " | Target: ", Target, " | LSL: ", LSL)
  ) %>%
  fmt_number(
    columns = Value,
    rows = 1:9,
    decimals = 3
  ) %>%
  fmt_number(
    columns = Value,
    rows = 10,
    decimals = 0
  ) %>%
  fmt_number(
    columns = Value,
    rows = 11,
    decimals = 2
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(
      columns = Value,
      rows = Index == "Cpk" & Value >= 1.33
    )
  ) %>%
  tab_style(
    style = cell_fill(color = "lightyellow"),
    locations = cells_body(
      columns = Value,
      rows = Index == "Cpk" & Value >= 1.0 & Value < 1.33
    )
  ) %>%
  tab_style(
    style = cell_fill(color = "lightcoral"),
    locations = cells_body(
      columns = Value,
      rows = Index == "Cpk" & Value < 1.0
    )
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightblue"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(columns = Interpretation, rows = Interpretation != "")
  )
Process Capability Analysis
USL: 10.5 | Target: 10 | LSL: 9.5
Index Value Interpretation
Cp 2.845 Adequate
Cpk 2.574 Adequate
Cpu 2.574
Cpl 3.115
Pp 2.845
Ppk 2.574
Process Mean 10.048
Process Std Dev 0.059
Target 10.000
DPMO 0 Excellent
Sigma Level 7.81 Excellent
Code
# Create process capability histogram with Plotly
hist_data <- hist(capability_data, breaks = 30, plot = FALSE)

capability_plot <- plot_ly() %>%
  add_bars(
    x = hist_data$mids,
    y = hist_data$counts,
    name = "Process Data",
    marker = list(
      color = "lightblue",
      line = list(color = "darkblue", width = 1)
    ),
    hovertemplate = paste(
      "<b>Value:</b> %{x:.3f}<br>",
      "<b>Frequency:</b> %{y}<br>",
      "<extra></extra>"
    )
  ) %>%
  add_trace(
    x = c(LSL, LSL),
    y = c(0, max(hist_data$counts)),
    type = "scatter",
    mode = "lines",
    name = "LSL",
    line = list(color = "red", width = 3, dash = "dash"),
    hovertemplate = paste("<b>LSL:</b>", LSL, "<extra></extra>")
  ) %>%
  add_trace(
    x = c(USL, USL),
    y = c(0, max(hist_data$counts)),
    type = "scatter",
    mode = "lines",
    name = "USL",
    line = list(color = "red", width = 3, dash = "dash"),
    hovertemplate = paste("<b>USL:</b>", USL, "<extra></extra>")
  ) %>%
  add_trace(
    x = c(Target, Target),
    y = c(0, max(hist_data$counts)),
    type = "scatter",
    mode = "lines",
    name = "Target",
    line = list(color = "green", width = 3),
    hovertemplate = paste("<b>Target:</b>", Target, "<extra></extra>")
  ) %>%
  add_trace(
    x = c(process_mean, process_mean),
    y = c(0, max(hist_data$counts)),
    type = "scatter",
    mode = "lines",
    name = "Process Mean",
    line = list(color = "blue", width = 2),
    hovertemplate = paste("<b>Mean:</b>", round(process_mean, 3), "<extra></extra>")
  ) %>%
  layout(
    title = list(
      text = paste0("Process Capability: Cpk = ", round(cpk, 3)),
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Measurement Value"),
    yaxis = list(title = "Frequency"),
    showlegend = TRUE,
    hovermode = "closest"
  )

capability_plot

3. Measurement System Analysis (Gage R&R)

What is Gage R&R?

Gage Repeatability and Reproducibility (Gage R&R) is a statistical method used to evaluate the adequacy of a measurement system. It answers the critical question: “Can we trust our measurements?”

Before making decisions based on process data, we must ensure our measurement system itself is reliable and consistent.

The Components of Measurement Variation

Total measurement variation comes from three sources:

1. Repeatability (Equipment Variation) - “EV”

  • Variation when the same operator measures the same part multiple times
  • Reflects the inherent precision of the measuring instrument
  • Also called “within-system” variation
  • Example: An operator measures the same part 3 times and gets: 10.01, 10.03, 10.02

2. Reproducibility (Appraiser/Operator Variation) - “AV”

  • Variation between different operators measuring the same part
  • Reflects differences in measurement technique, calibration, or interpretation
  • Also called “between-system” variation
  • Example: Three operators measure the same part and get averages of: 10.02, 10.05, 10.01

3. Part-to-Part Variation - “PV”

  • The actual variation between different parts being measured
  • This is the real process variation we want to detect
  • Our measurement system should be able to distinguish this clearly

The Gage R&R Formula

Total Variation = Gage R&R + Part Variation

Where: - Gage R&R = √(Repeatability² + Reproducibility²) - This is the measurement system error we want to minimize

Acceptance Criteria

The measurement system is evaluated using % Study Variation or % Tolerance:

% Study Variation Assessment Action
< 10% [OK] Acceptable Measurement system is adequate
10% - 30% [WARNING] Marginal May be acceptable depending on application, importance of measurement, cost of improvement
> 30% [X] Unacceptable Measurement system needs improvement - too much measurement error

Why is Gage R&R Important?

  1. Validates Data Quality: Ensures process improvement decisions are based on reliable data
  2. Prevents False Alarms: Poor measurement systems can trigger unnecessary investigations
  3. Saves Money: Identifies measurement issues before they cause quality problems
  4. Meets Standards: Required by ISO/TS 16949, AS9100, and other quality standards
  5. Operator Training: Identifies operators who need additional training

When to Perform Gage R&R

  • New measurement equipment installation
  • New operators or after retraining
  • Quarterly or annually for critical measurements
  • When process capability appears poor (may be measurement, not process)
  • After equipment repair or calibration
  • When investigating customer complaints about quality

How Gage R&R Studies Work

Standard Experimental Design: - 10 parts (representing the range of process variation) - 3 operators (or appraisers) - 3 trials per part per operator (sometimes 2 trials) - Total measurements: 10 × 3 × 3 = 90 measurements

Study Protocol: 1. Select parts spanning the process range (low, medium, high values) 2. Randomize measurement order to avoid bias 3. Operators measure all parts without knowing which part they’re measuring 4. Calculate variance components using ANOVA 5. Express as % of total variation

Interpreting Results

Good Measurement System: - Gage R&R < 10% of total variation - Most variation comes from part-to-part differences - Measurement system can reliably detect process changes

Poor Measurement System: - Gage R&R > 30% of total variation - Measurement error masks real process variation - Cannot reliably distinguish between parts

Improving a Poor Gage R&R

If your study shows unacceptable results:

For High Repeatability (Equipment Variation): - Calibrate or repair equipment - Use more precise measurement device - Improve fixturing and part clamping - Control environmental factors (temperature, vibration)

For High Reproducibility (Operator Variation): - Standardize measurement procedure - Provide additional operator training - Improve measurement fixture design - Clarify measurement location/technique - Use automated measurement if possible

Real-World Example

A machining company measures shaft diameter: - Specification: 10.00 +/- 0.50 mm - Process variation: +/- 0.30 mm - Measurement Gage R&R: 25% (Marginal)

Impact: With 25% measurement error, they might: - Reject good parts (false negatives) - Accept bad parts (false positives) - Trigger unnecessary process adjustments - Miss actual process problems

Solution: After operator training and fixture improvement, Gage R&R reduced to 8% (Acceptable), enabling reliable process control.


Gage R&R Study Results

Below is the analysis of your measurement system using ANOVA method to decompose variance components.

Code
#| fig-width: 12
#| fig-height: 6

# Simulate Gage R&R study data
# In practice, replace this with actual MSA data
set.seed(123)
parts <- 10
operators <- 3
trials <- 3

grr_data <- expand.grid(
  Part = factor(1:parts),
  Operator = factor(paste0("OP", 1:operators)),
  Trial = 1:trials
)

# Simulate measurements with variance components
part_effect <- rep(rnorm(parts, 0, 2), each = operators * trials)
operator_effect <- rep(rep(rnorm(operators, 0, 0.3), each = trials), parts)
repeatability <- rnorm(nrow(grr_data), 0, 0.2)

grr_data$Measurement <- 10 + part_effect + operator_effect + repeatability

# Perform ANOVA
grr_model <- aov(Measurement ~ Part + Operator + Part:Operator, data = grr_data)
anova_results <- anova(grr_model)

# Calculate variance components
var_repeatability <- anova_results["Residuals", "Mean Sq"]
var_reproducibility <- max(0, (anova_results["Operator", "Mean Sq"] - 
                               var_repeatability) / (parts * trials))
var_part <- max(0, (anova_results["Part", "Mean Sq"] - 
                    var_repeatability) / (operators * trials))

var_grr <- var_repeatability + var_reproducibility
var_total <- var_grr + var_part

# Calculate percentages and study variation
pct_grr <- (var_grr / var_total) * 100
pct_repeatability <- (var_repeatability / var_total) * 100
pct_reproducibility <- (var_reproducibility / var_total) * 100
pct_part <- (var_part / var_total) * 100

# Calculate %Study Variation (using 6sigma method)
study_var_grr <- sqrt(var_grr) * 5.15 / (max(grr_data$Measurement) - 
                                         min(grr_data$Measurement)) * 100

# Create summary table
grr_summary <- tibble(
  Source = c(
    "Gage R&R",
    "  Repeatability",
    "  Reproducibility",
    "Part-to-Part",
    "Total Variation"
  ),
  Variance = c(
    var_grr,
    var_repeatability,
    var_reproducibility,
    var_part,
    var_total
  ),
  StdDev = sqrt(Variance),
  `% Contribution` = c(
    pct_grr,
    pct_repeatability,
    pct_reproducibility,
    pct_part,
    100
  ),
  `% Study Var` = c(
    study_var_grr,
    NA, NA, NA, NA
  ),
  Assessment = c(
    ifelse(study_var_grr < 10, "Acceptable",
           ifelse(study_var_grr < 30, "Marginal", "Unacceptable")),
    "", "", "", ""
  )
)

grr_summary %>%
  gt() %>%
  tab_header(
    title = "Gage R&R Study Results",
    subtitle = "Variance Component Analysis"
  ) %>%
  fmt_number(
    columns = c(Variance, StdDev),
    decimals = 4
  ) %>%
  fmt_number(
    columns = `% Contribution`,
    decimals = 2
  ) %>%
  fmt_number(
    columns = `% Study Var`,
    decimals = 2
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(
      columns = `% Study Var`,
      rows = Source == "Gage R&R" & `% Study Var` < 10
    )
  ) %>%
  tab_style(
    style = cell_fill(color = "lightyellow"),
    locations = cells_body(
      columns = `% Study Var`,
      rows = Source == "Gage R&R" & `% Study Var` >= 10 & `% Study Var` < 30
    )
  ) %>%
  tab_style(
    style = cell_fill(color = "lightcoral"),
    locations = cells_body(
      columns = `% Study Var`,
      rows = Source == "Gage R&R" & `% Study Var` >= 30
    )
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightblue"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(columns = Assessment, rows = Assessment != "")
  )
Gage R&R Study Results
Variance Component Analysis
Source Variance StdDev % Contribution % Study Var Assessment
Gage R&R 3.9430 1.9857 100.00 160.86 Unacceptable
Repeatability 3.1627 1.7784 80.21 NA
Reproducibility 0.7803 0.8833 19.79 NA
Part-to-Part 0.0000 0.0000 0.00 NA
Total Variation 3.9430 1.9857 100.00 NA
Code
# Create variance components pie chart
variance_pie <- plot_ly(
  labels = grr_summary$Source[1:4],
  values = grr_summary$`% Contribution`[1:4],
  type = "pie",
  textposition = "inside",
  textinfo = "label+percent",
  marker = list(
    colors = c("lightcoral", "lightyellow", "lightblue", "lightgreen"),
    line = list(color = "white", width = 2)
  ),
  hovertemplate = paste(
    "<b>%{label}</b><br>",
    "Contribution: %{value:.2f}%<br>",
    "<extra></extra>"
  )
) %>%
  layout(
    title = list(
      text = "Variance Components Distribution",
      font = list(size = 16, weight = "bold")
    ),
    showlegend = TRUE
  )

variance_pie
Code
# Measurement by Operator chart
measurement_by_op <- grr_data %>%
  group_by(Operator, Part) %>%
  summarise(Mean = mean(Measurement), .groups = "drop")

op_plot <- plot_ly(measurement_by_op, 
                   x = ~Part, 
                   y = ~Mean, 
                   color = ~Operator,
                   type = "scatter",
                   mode = "lines+markers",
                   marker = list(size = 8)) %>%
  layout(
    title = list(
      text = "Measurement by Operator and Part",
      font = list(size = 16, weight = "bold")
    ),
    xaxis = list(title = "Part Number"),
    yaxis = list(title = "Mean Measurement"),
    hovermode = "x unified"
  )

op_plot

Interpreting Your Gage R&R Results

Understanding the Variance Components Table

The table above breaks down where your measurement variation comes from:

Gage R&R (Total Measurement Error): - This is the sum of repeatability and reproducibility - Target: < 10% of total variation - If this exceeds 30%, your measurement system needs improvement

Repeatability (Equipment Variation): - Variation from the measuring device itself - High repeatability suggests: equipment needs calibration, poor resolution, environmental factors - Improvement: Upgrade equipment, improve fixturing, control temperature

Reproducibility (Operator Variation): - Variation between different operators - High reproducibility suggests: inconsistent technique, inadequate training, unclear procedures - Improvement: Standardize procedures, train operators, improve measurement clarity

Part-to-Part Variation: - The actual variation in your parts (what you want to measure) - Target: This should be the largest component (> 90%) - If part variation is low, you may need parts with wider range for the study

The Pie Chart Explanation

The pie chart visualizes the variance components:

  • Green slice (Part-to-Part): GOOD - This should be the largest slice
  • Red/Yellow slices (Gage R&R): MINIMIZE - These represent measurement error
  • Ideal scenario: Green slice > 90%, measurement error slices very small

Measurement by Operator Chart

This chart shows how each operator’s measurements compare across parts:

  • Parallel lines: Good - operators measure consistently
  • Crossing lines: Concerning - operators measure differently
  • Large vertical spread: Problem - poor repeatability or reproducibility

What to look for: - Lines should track together (follow same pattern) - All operators should rank parts in same order - Minimal vertical distance between lines for same part

Decision Guidelines

Based on your % Study Variation result:

< 10% (Acceptable): - [OK] Measurement system is adequate - [OK] Proceed with process improvement activities - [OK] Continue routine calibration and training

10-30% (Marginal): - [WARNING] Evaluate based on context: - Cost of measurement improvement vs. benefit - Criticality of the measurement - Impact on product quality - Consider targeted improvements - May be acceptable for non-critical measurements

> 30% (Unacceptable): - [X] Must improve before using for process decisions - [X] Do not use for process capability studies - [X] High risk of making wrong decisions - Required actions: 1. Identify primary source (repeatability vs. reproducibility) 2. Implement corrective actions 3. Re-run Gage R&R study to verify improvement

Next Steps After Your Study

  1. Document Results: Keep Gage R&R reports for quality records and audits
  2. Schedule Re-studies: Quarterly for critical measurements, annually for others
  3. Monitor Changes: Re-run after equipment repair, operator changes, or process modifications
  4. Training Records: Use results to identify operators needing additional training
  5. Continuous Improvement: Set goals to reduce measurement variation over time

Common Questions

Q: Can I use fewer than 10 parts? A: 10 parts is standard, but you can use 5-15. Fewer parts reduce study power; more parts improve it.

Q: What if I only have 1 or 2 operators? A: You can run with 2 operators, but 3 is preferred for better reproducibility assessment.

Q: How often should I do Gage R&R? A: Critical measurements: quarterly. Standard measurements: annually. After any equipment or process changes.

Q: What if my Gage R&R is 15%? A: This is marginal. Consider the measurement’s importance and cost to improve. May be acceptable for non-critical dimensions.

4. Control Plan Summary

Code
# Create control plan summary
control_plan <- tibble(
  `Process Step` = c(
    "Dimension 1 Measurement",
    "Dimension 2 Measurement",
    "Weight Measurement",
    "Visual Inspection",
    "Final Quality Check"
  ),
  `Control Method` = c(
    "X-bar & R Chart",
    "I-MR Chart",
    "I-MR Chart",
    "p-Chart",
    "Gage R&R"
  ),
  `Sample Size` = c("5", "1", "1", "100", "10 parts x 3 ops x 3 trials"),
  `Frequency` = c("Every hour", "Every unit", "Every unit", "Every batch", "Quarterly"),
  `Specification` = c(
    "10.00 +/- 0.50",
    "25.50 +/- 0.75",
    "150 +/- 10",
    "< 5% defective",
    "< 10% Study Var"
  ),
  `Current Cpk` = c(
    round(cpk, 2),
    "1.25",
    "1.45",
    "N/A",
    "N/A"
  ),
  Status = c(
    ifelse(cpk >= 1.33, "[OK] OK", "[WARNING] Review"),
    "[OK] OK",
    "[OK] OK",
    ifelse(p_chart$center <= 0.05, "[OK] OK", "[WARNING] Review"),
    ifelse(study_var_grr < 10, "[OK] OK", "[WARNING] Review")
  )
)

control_plan %>%
  gt() %>%
  tab_header(
    title = "Quality Control Plan",
    subtitle = "Process Monitoring and Control Strategy"
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(columns = Status, rows = grepl("[OK]", Status))
  ) %>%
  tab_style(
    style = cell_fill(color = "lightyellow"),
    locations = cells_body(columns = Status, rows = grepl("[WARNING]", Status))
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightblue"),
      cell_text(weight = "bold")
    ),
    locations = cells_column_labels()
  )
Quality Control Plan
Process Monitoring and Control Strategy
Process Step Control Method Sample Size Frequency Specification Current Cpk Status
Dimension 1 Measurement X-bar & R Chart 5 Every hour 10.00 +/- 0.50 2.57 [OK] OK
Dimension 2 Measurement I-MR Chart 1 Every unit 25.50 +/- 0.75 1.25 [OK] OK
Weight Measurement I-MR Chart 1 Every unit 150 +/- 10 1.45 [OK] OK
Visual Inspection p-Chart 100 Every batch < 5% defective N/A [OK] OK
Final Quality Check Gage R&R 10 parts x 3 ops x 3 trials Quarterly < 10% Study Var N/A [WARNING] Review

5. Executive Summary

Code
# Calculate overall metrics
total_samples <- nrow(qc_data)
date_range <- paste(min(qc_data$date), "to", max(qc_data$date))
overall_defect_rate <- mean(qc_data$defective_units / qc_data$inspected_units, 
                             na.rm = TRUE)
in_control <- sum(abs(xbar_chart$statistics - xbar_chart$center) < 
                  (xbar_chart$limits[2] - xbar_chart$center))
control_percentage <- (in_control / length(xbar_chart$statistics)) * 100

# Create executive summary
exec_summary <- tibble(
  Metric = c(
    "Analysis Period",
    "Total Samples Analyzed",
    "Average Defect Rate",
    "Process Capability (Cpk)",
    "Process in Statistical Control",
    "Measurement System (%Study Var)",
    "Overall Sigma Level",
    "Recommended Actions"
  ),
  Value = c(
    date_range,
    as.character(total_samples),
    paste0(round(overall_defect_rate * 100, 2), "%"),
    round(cpk, 2),
    paste0(round(control_percentage, 1), "%"),
    paste0(round(study_var_grr, 1), "%"),
    round(sigma_level, 2),
    ifelse(cpk < 1.33 | study_var_grr >= 10, 
           "Process improvement required",
           "Continue monitoring")
  ),
  Status = c(
    "[INFO]",
    "[INFO]",
    ifelse(overall_defect_rate <= 0.05, "[OK]", "[WARNING]"),
    ifelse(cpk >= 1.33, "[OK]", "[WARNING]"),
    ifelse(control_percentage >= 95, "[OK]", "[WARNING]"),
    ifelse(study_var_grr < 10, "[OK]", "[WARNING]"),
    ifelse(sigma_level >= 4, "[OK]", "[WARNING]"),
    ifelse(cpk >= 1.33 & study_var_grr < 10, "[OK]", "[WARNING]")
  )
)

exec_summary %>%
  gt() %>%
  tab_header(
    title = "Executive Summary",
    subtitle = "Six Sigma Quality Control Analysis Results"
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(columns = Status, rows = Status == "[OK]")
  ) %>%
  tab_style(
    style = cell_fill(color = "lightyellow"),
    locations = cells_body(columns = Status, rows = Status == "[WARNING]")
  ) %>%
  tab_style(
    style = cell_fill(color = "lightblue"),
    locations = cells_body(columns = Status, rows = Status == "[INFO]")
  ) %>%
  tab_style(
    style = list(
      cell_fill(color = "lightyellow"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(
      columns = Value,
      rows = Metric == "Recommended Actions"
    )
  )
Executive Summary
Six Sigma Quality Control Analysis Results
Metric Value Status
Analysis Period 2024-06-01 to 2024-06-10 [INFO]
Total Samples Analyzed 125 [INFO]
Average Defect Rate 3.52% [OK]
Process Capability (Cpk) 2.57 [OK]
Process in Statistical Control 100% [OK]
Measurement System (%Study Var) 160.9% [WARNING]
Overall Sigma Level 7.81 [OK]
Recommended Actions Process improvement required [WARNING]

6. Conclusion and Recommendations

Key Findings

Based on the Six Sigma analysis performed:

  1. Process Capability: The current process has a Cpk of 2.57, which is adequate and meets quality standards.

  2. Statistical Control: 100% of samples are within control limits, indicating good process stability.

  3. Defect Rate: The overall defect rate is 3.52%, translating to approximately 0 DPMO.

  4. Measurement System: The Gage R&R study shows 160.9% study variation, which is unacceptable and requires immediate action.

  5. Sigma Level: Current process operates at approximately 7.81 sigma level.

Recommendations

Maintenance Actions:

  • Continue current monitoring practices
  • Maintain process discipline
  • Regular equipment calibration

Measurement System Improvement:

  • Provide additional operator training
  • Review measurement equipment calibration
  • Consider more precise measurement devices
  • Standardize measurement procedures

General Recommendations:

  • Conduct regular DMAIC projects for continuous improvement
  • Implement visual management at process level
  • Establish daily management review of control charts
  • Schedule quarterly capability studies
  • Update control plans based on process changes

Report Generated: 2025-12-19 18:13:34.358612
Analysis Tool: R with QCC, Plotly, and GT packages
Six Sigma Methodology: DMAIC Framework