© 2026 Dr. Debashis Chatterjee. All rights reserved.
These notes are prepared exclusively for educational use. Students must cite sources and must not submit copied work.

1 0. Software setup (run once)

We will use R with standard SQC/SPC packages.

pkgs <- c("qcc","ggplot2","dplyr","tidyr","readr","kableExtra","AcceptanceSampling")
to_install <- pkgs[!sapply(pkgs, requireNamespace, quietly = TRUE)]
if(length(to_install) > 0) install.packages(to_install, dependencies = TRUE)

library(qcc)
library(ggplot2)
library(dplyr)
library(tidyr)
library(readr)
library(kableExtra)
library(AcceptanceSampling)

set.seed(123)

1.1 0.1 Helper functions (used throughout)

# Capability indices (Normal assumption)
capability <- function(x, LSL, USL){
  mu <- mean(x, na.rm=TRUE)
  s  <- sd(x, na.rm=TRUE)
  Cp  <- (USL - LSL) / (6*s)
  Cpk <- min((USL - mu)/(3*s), (mu - LSL)/(3*s))
  data.frame(mu=mu, sd=s, Cp=Cp, Cpk=Cpk)
}

# "Nice" table
ktable <- function(df, caption=NULL, digits=4){
  # Ensure it's a data.frame
  df <- as.data.frame(df)

  # If empty, print a simple message (prevents kableExtra parser errors)
  if(nrow(df) == 0){
    cat("\n\n**No out-of-control points detected.**\n\n")
    return(invisible(NULL))
  }

  # For safety in knitting: avoid kableExtra parser issues for small tables
  knitr::kable(df, digits=digits, caption=caption)
}

# Extract OOC points from qcc object (if any)
ooc_points <- function(q){
  idx <- integer(0)

  # q$violations is often a LIST in qcc (e.g., $beyond.limits, $violating.runs, etc.)
  if(!is.null(q$violations)){
    v <- q$violations

    if(is.list(v)){
      # most common: beyond control limits
      if(!is.null(v$beyond.limits)){
        idx <- v$beyond.limits
      } else {
        # fallback: gather any numeric indices from the list
        idx <- unique(unlist(v, use.names = FALSE))
      }
    } else {
      idx <- v
    }
  }

  # clean indices
  idx <- idx[!is.na(idx)]
  idx <- idx[idx >= 1 & idx <= length(q$statistics)]

  if(length(idx) == 0) return(data.frame())

  data.frame(
    index = idx,
    statistic = as.numeric(q$statistics[idx])
  )
}

2 1. Practical 1 — Construction and interpretation of statistical control charts

2.1 1.1 Theory (minimal but rigorous)

A Shewhart chart monitors a statistic $Y_t$ sequentially. Under in-control conditions: \[ Y_t \sim (\mu_0,\sigma_0^2),\qquad \text{approximately independent across }t. \] Classical 3-sigma limits are: \[ \mathrm{UCL}=\mu_0+3\sigma_0,\qquad \mathrm{CL}=\mu_0,\qquad \mathrm{LCL}=\mu_0-3\sigma_0. \] The false-alarm probability per point under Normality is: \[ \alpha = \mathbb{P}(|Z|>3)=2\{1-\Phi(3)\}\approx 0.0027, \quad \Rightarrow \quad \mathrm{ARL}_0 \approx 1/\alpha \approx 370. \]

Important: Always interpret dispersion first (R or S chart), then mean chart (X̄ chart).

2.2 1.2 X̄–R chart (variables)

2.2.1 Dataset (real): `pistonrings` from `qcc`

This is a classical dataset used in SPC texts.

data(pistonrings, package="qcc")
str(pistonrings)

## 'data.frame':    200 obs. of  3 variables:
##  $ diameter: num  74 74 74 74 74 ...
##  $ sample  : int  1 1 1 1 1 2 2 2 2 2 ...
##  $ trial   : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...

head(pistonrings)

##   diameter sample trial
## 1   74.030      1  TRUE
## 2   74.002      1  TRUE
## 3   74.019      1  TRUE
## 4   73.992      1  TRUE
## 5   74.008      1  TRUE
## 6   73.995      2  TRUE

The dataset is already in subgroup (matrix) form: rows = subgroups, columns = observations within subgroup.

2.2.2 Step 1: R chart (dispersion)

qR <- qcc(pistonrings, type="R", plot=TRUE, title="R Chart (pistonrings)")

summary(qR)

## 
## Call:
## qcc(data = pistonrings, type = "R", plot = TRUE, title = "R Chart (pistonrings)")
## 
## R chart for pistonrings 
## 
## Summary of group statistics:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 72.96700 72.99875 73.00900 73.37860 74.00100 74.03600 
## 
## Group sample size:  3
## Number of groups:  200
## Center of group statistics:  73.3786
## Standard deviation:  43.34235 
## 
## Control limits:
##  LCL      UCL
##    0 188.8907

oocR <- ooc_points(qR)
ktable(oocR, caption="Out-of-control points on R chart (if any)")

## 
## 
## **No out-of-control points detected.**

2.2.3 Step 2: X̄ chart (location)

qX <- qcc(pistonrings, type="xbar", plot=TRUE, title="X-bar Chart (pistonrings)")

summary(qX)

## 
## Call:
## qcc(data = pistonrings, type = "xbar", plot = TRUE, title = "X-bar Chart (pistonrings)")
## 
## xbar chart for pistonrings 
## 
## Summary of group statistics:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 25.33067 28.58142 31.83467 31.70953 34.75025 38.00967 
## 
## Group sample size:  3
## Number of groups:  200
## Center of group statistics:  31.70954
## Standard deviation:  43.34235 
## 
## Control limits:
##        LCL      UCL
##  -43.36162 106.7807

oocX <- ooc_points(qX)
ktable(oocX, caption="Out-of-control points on X-bar chart (if any)")

## 
## 
## **No out-of-control points detected.**

2.2.4 Extra: A colorful ggplot view of subgroup means + limits

df_xbar <- data.frame(
  subgroup = 1:nrow(pistonrings),
  xbar = qX$statistics,
  UCL = qX$limits[2],
  CL  = qX$center,
  LCL = qX$limits[1]
)

ggplot(df_xbar, aes(subgroup, xbar)) +
  geom_line(linewidth=0.6) +
  geom_point(aes(color = (xbar>UCL | xbar<LCL)), size=2) +
  geom_hline(aes(yintercept=UCL), linetype="dashed") +
  geom_hline(aes(yintercept=CL),  linetype="solid") +
  geom_hline(aes(yintercept=LCL), linetype="dashed") +
  scale_color_manual(values=c("FALSE"="#2c7fb8","TRUE"="#d95f0e")) +
  labs(title="X-bar with Control Limits (highlighting signals)", color="Signal?",
       x="Subgroup", y="Subgroup mean") +
  theme_minimal(base_size = 12)

2.3 1.3 X̄–S chart (variables, larger subgroup size)

2.3.1 Dataset (real): `iris`

We create rational subgroups of size $n=10$ from Sepal.Length (real measurements).

data(iris)
x <- iris$Sepal.Length
n <- 10
m <- floor(length(x)/n)

Xmat <- matrix(x[1:(m*n)], nrow=m, ncol=n, byrow=TRUE)  # subgroups
dim(Xmat)

## [1] 15 10

2.3.2 Step 1: S chart

qS <- qcc(Xmat, type="S", plot=TRUE, title="S Chart (iris Sepal.Length grouped)")

summary(qS)

## 
## Call:
## qcc(data = Xmat, type = "S", plot = TRUE, title = "S Chart (iris Sepal.Length grouped)")
## 
## S chart for Xmat 
## 
## Summary of group statistics:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.2469818 0.3546382 0.4217688 0.4803885 0.6602783 0.8042250 
## 
## Group sample size:  10
## Number of groups:  15
## Center of group statistics:  0.4803885
## Standard deviation:  0.4938919 
## 
## Control limits:
##        LCL       UCL
##  0.1362889 0.8244881

ktable(ooc_points(qS), caption="Out-of-control points on S chart (if any)")

## 
## 
## **No out-of-control points detected.**

2.3.3 Step 2: X̄ chart (using S estimate)

qXS <- qcc(Xmat, type="xbar", plot=TRUE, title="X-bar Chart (iris Sepal.Length grouped)")

summary(qXS)

## 
## Call:
## qcc(data = Xmat, type = "xbar", plot = TRUE, title = "X-bar Chart (iris Sepal.Length grouped)")
## 
## xbar chart for Xmat 
## 
## Summary of group statistics:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 4.860000 5.140000 5.850000 5.843333 6.500000 6.740000 
## 
## Group sample size:  10
## Number of groups:  15
## Center of group statistics:  5.843333
## Standard deviation:  0.4873294 
## 
## Control limits:
##       LCL      UCL
##  5.381012 6.305655

ktable(ooc_points(qXS), caption="Out-of-control points on X-bar chart (if any)")

Out-of-control points on X-bar chart (if any)
index	statistic
11	6.57
12	6.55
13	6.63
14	6.74
15	6.45
1	4.86
2	5.21
3	5.01
4	5.07
5	4.88

2.4 1.4 Attribute charts: np, p, c, u (real datasets)

2.4.1 1.4.1 p / np charts (defectives)

We need binary classification (defective/nondefective).
We will use mtcars (real) and define defective based on an engineering rule.

CTQ definition (illustration): defective if mpg < 20.

data(mtcars)

# Create a "stream" of batches for SPC: resample cars into batches (teaching device)
set.seed(1)
T <- 25
n_const <- 12

batch <- tibble(
  t = 1:T,
  n = n_const
) |>
  rowwise() |>
  mutate(
    samp = list(sample(mtcars$mpg, n, replace=TRUE)),
    d = sum(unlist(samp) < 20),
    p = d/n
  ) |>
  ungroup()

ktable(batch |> select(t,n,d,p), caption="Batch-wise defectives (mpg<20 treated as defective)")

Batch-wise defectives (mpg<20 treated as defective)
t	n	d	p
1	12	6	0.5000
2	12	7	0.5833
3	12	9	0.7500
4	12	6	0.5000
5	12	10	0.8333
6	12	8	0.6667
7	12	7	0.5833
8	12	9	0.7500
9	12	8	0.6667
10	12	7	0.5833
11	12	6	0.5000
12	12	9	0.7500
13	12	4	0.3333
14	12	7	0.5833
15	12	7	0.5833
16	12	7	0.5833
17	12	4	0.3333
18	12	5	0.4167
19	12	6	0.5000
20	12	10	0.8333
21	12	5	0.4167
22	12	6	0.5000
23	12	5	0.4167
24	12	8	0.6667
25	12	5	0.4167

np chart (constant n):

q_np <- qcc(batch$d, type="np", sizes=batch$n, plot=TRUE, title="np Chart (defectives per batch)")

summary(q_np)

## 
## Call:
## qcc(data = batch$d, type = "np", sizes = batch$n, plot = TRUE,     title = "np Chart (defectives per batch)")
## 
## np chart for batch$d 
## 
## Summary of group statistics:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    4.00    6.00    7.00    6.84    8.00   10.00 
## 
## Group sample size:  12
## Number of groups:  25
## Center of group statistics:  6.84
## Standard deviation:  1.714993 
## 
## Control limits:
##       LCL      UCL
##  1.695022 11.98498

p chart (varying n): we now vary sample size per batch:

set.seed(2)
batch2 <- tibble(
  t = 1:T,
  n = sample(8:20, T, replace=TRUE)
) |>
  rowwise() |>
  mutate(
    samp = list(sample(mtcars$mpg, n, replace=TRUE)),
    d = sum(unlist(samp) < 20),
    p = d/n
  ) |>
  ungroup()

q_p <- qcc(batch2$d, type="p", sizes=batch2$n, plot=TRUE, title="p Chart (varying n)")

summary(q_p)

## 
## Call:
## qcc(data = batch2$d, type = "p", sizes = batch2$n, plot = TRUE,     title = "p Chart (varying n)")
## 
## p chart for batch2$d 
## 
## Summary of group statistics:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.3846154 0.4666667 0.5454545 0.5633878 0.6250000 0.9000000 
## 
## Summary of group sample sizes:                                           
##   sizes  8 9 10 11 12 13 14 15 16 18 19 20
##   counts 4 2  2  1  1  4  2  2  2  2  1  2
## 
## Number of groups:  25
## Center of group statistics:  0.5606061
## Standard deviation:  0.4963133 
## 
## Control limits:
##           LCL       UCL
##     0.1307861 0.9904260
##     0.1476484 0.9735637
## ...                    
##     0.2096596 0.9115526

Interpretation: a signal means the defective fraction is unusually high/low compared to baseline.
Always check whether the definition of “defective” (CTQ) is meaningful for the application.

2.4.2 1.4.2 c chart (defects per constant unit)

Dataset (real): warpbreaks contains counts of breaks (defects) under weaving conditions.

We treat each observation as one constant opportunity unit and monitor breaks.

data(warpbreaks)

cdata <- warpbreaks |> 
  mutate(t = row_number()) |> 
  select(t, breaks)

ktable(head(cdata, 12), caption="warpbreaks: breaks (defects) per observation")

warpbreaks: breaks (defects) per observation
t	breaks
1	26
2	30
3	54
4	25
5	70
6	52
7	51
8	26
9	67
10	18
11	21
12	29

q_c <- qcc(cdata$breaks, type="c", plot=TRUE, title="c Chart (warpbreaks: breaks)")

summary(q_c)

## 
## Call:
## qcc(data = cdata$breaks, type = "c", plot = TRUE, title = "c Chart (warpbreaks: breaks)")
## 
## c chart for cdata$breaks 
## 
## Summary of group statistics:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 10.00000 18.25000 26.00000 28.14815 34.00000 70.00000 
## 
## Group sample size:  1
## Number of groups:  54
## Center of group statistics:  28.14815
## Standard deviation:  5.305483 
## 
## Control limits:
##      LCL     UCL
##  12.2317 44.0646

2.4.3 1.4.3 u chart (defects per varying opportunity)

Dataset (real): esoph contains number of cases and population size (exposure) for groups.

We treat cases as defects and population size as opportunity $n_t$.

data(esoph)

# Aggregate to create a time-like sequence (teaching)
u_df <- esoph |> 
  group_by(agegp) |> 
  summarise(
    cases = sum(ncases),
    exposure = sum(ncontrols + ncases),
    .groups="drop"
  ) |>
  mutate(t = row_number(),
         u = cases / exposure)

ktable(u_df, caption="u chart data from esoph (cases per exposure)")

u chart data from esoph (cases per exposure)
agegp	cases	exposure	t	u
25-34	1	116	1	0.0086
35-44	9	199	2	0.0452
45-54	46	213	3	0.2160
55-64	76	242	4	0.3140
65-74	55	161	5	0.3416
75+	13	44	6	0.2955

q_u <- qcc(u_df$cases, type="u", sizes=u_df$exposure, plot=TRUE,
           title="u Chart (esoph: cases per exposure)")

summary(q_u)

## 
## Call:
## qcc(data = u_df$cases, type = "u", sizes = u_df$exposure, plot = TRUE,     title = "u Chart (esoph: cases per exposure)")
## 
## u chart for u_df$cases 
## 
## Summary of group statistics:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.0086207 0.0879102 0.2557085 0.2034881 0.3094008 0.3416149 
## 
## Summary of group sample sizes:                                
##   sizes  44 116 161 199 213 242
##   counts  1   1   1   1   1   1
## 
## Number of groups:  6
## Center of group statistics:  0.2051282
## Standard deviation:  0.4529108 
## 
## Control limits:
##              LCL       UCL
##     0.0789730651 0.3312833
##     0.1088102161 0.3014462
## ...                       
##     0.0002915825 0.4099648

3 2. Practical 2 — Single sampling plan (OC, AQL, LTPD, AOQ, ASN, ATI)

3.1 2.1 Theory

A single sampling plan $(n,c)$ inspects $n$ units and accepts the lot if the number of defectives $X\le c$.
Under Binomial approximation $X\sim \mathrm{Bin}(n,p)$, the OC curve is: \[ P_a(p)=\mathbb{P}(X\le c)=\sum_{x=0}^c \binom{n}{x}p^x(1-p)^{n-x}. \] Producer risk at AQL $p_A$: $\alpha=1-P_a(p_A)$.
Consumer risk at LTPD $p_L$: $\beta=P_a(p_L)$.

Under rectifying inspection (reject lots are 100% inspected), with lot size $N$: \[ \mathrm{AOQ}(p)=pP_a(p)\left(1-\frac{n}{N}\right),\quad \mathrm{ATI}(p)=N-(N-n)P_a(p). \]

3.2 2.2 OC curve in R (exact Binomial)

Pa <- function(p, n, c) pbinom(c, size=n, prob=p)

n <- 50; c <- 2
pgrid <- seq(0, 0.20, by=0.001)

oc <- data.frame(p=pgrid, Pa=Pa(pgrid, n, c))
ggplot(oc, aes(p, Pa)) +
  geom_line(linewidth=0.8, color="#2c7fb8") +
  labs(title="OC Curve: Single Sampling Plan (n=50, c=2)",
       x="Lot fraction defective p", y="P(accept)") +
  theme_minimal(base_size=12)

3.3 2.3 AQL, LTPD, and risks

AQL <- 0.01
LTPD <- 0.07

Pa_AQL  <- Pa(AQL,  n, c)
Pa_LTPD <- Pa(LTPD, n, c)
alpha <- 1 - Pa_AQL
beta  <- Pa_LTPD

risk_tbl <- data.frame(
  quantity=c("Pa(AQL)","Producer risk alpha","Pa(LTPD)","Consumer risk beta"),
  value=c(Pa_AQL, alpha, Pa_LTPD, beta)
)

ktable(risk_tbl, caption="AQL/LTPD operating probabilities and risks", digits=6)

AQL/LTPD operating probabilities and risks
quantity	value
Pa(AQL)	0.986183
Producer risk alpha	0.013817
Pa(LTPD)	0.310789
Consumer risk beta	0.310789

3.4 2.4 AOQ, AOQL, ATI (rectifying inspection)

Nlot <- 1000

AOQ <- function(p, n, c, N) p * Pa(p,n,c) * (1 - n/N)
ATI <- function(p, n, c, N) N - (N-n) * Pa(p,n,c)

aoq <- data.frame(p=pgrid, AOQ=AOQ(pgrid,n,c,Nlot))
ati <- data.frame(p=pgrid, ATI=ATI(pgrid,n,c,Nlot))

AOQL <- max(aoq$AOQ)
p_at_aoql <- aoq$p[which.max(aoq$AOQ)]

ktable(data.frame(AOQL=AOQL, p_at_AOQL=p_at_aoql), caption="AOQL and where it occurs", digits=6)

AOQL and where it occurs
AOQL	p_at_AOQL
0.025985	0.045

ggplot(aoq, aes(p, AOQ)) +
  geom_line(linewidth=0.8, color="#d95f0e") +
  labs(title="AOQ Curve (Rectifying Inspection)", x="p", y="AOQ(p)") +
  theme_minimal(base_size=12)

ggplot(ati, aes(p, ATI)) +
  geom_line(linewidth=0.8, color="#1b9e77") +
  labs(title="ATI Curve (Rectifying Inspection)", x="p", y="ATI(p)") +
  theme_minimal(base_size=12)

3.5 2.5 “Real lot” demonstration using a real dataset

We create a “lot” from real measurements, define specs, compute the lot defect fraction, and estimate acceptance probability by repeated sampling.

# Lot = all piston ring measurements (real), flattened to numeric vector
lot <- as.numeric(unlist(pistonrings, use.names = FALSE))

# Drop missing if any (safe)
lot <- lot[is.finite(lot)]

# Specs (teaching example): choose near the empirical range
LSL <- as.numeric(quantile(lot, 0.05, names = FALSE, type = 7))
USL <- as.numeric(quantile(lot, 0.95, names = FALSE, type = 7))

p_true <- mean(lot < LSL | lot > USL)

n_plan <- 50; c_plan <- 2
B <- 2000

set.seed(99)
acc <- replicate(B, {
  samp <- sample(lot, n_plan, replace=FALSE)
  x <- sum(samp < LSL | samp > USL)
  as.integer(x <= c_plan)
})

Pa_hat <- mean(acc)

ktable(data.frame(LSL=LSL, USL=USL, p_true=p_true, Pa_estimated=Pa_hat),
       caption="Real-lot acceptance simulation (using pistonrings as a lot)", digits=6)

Real-lot acceptance simulation (using pistonrings as a lot)
LSL	USL	p_true	Pa_estimated
0	74.015	0.043333	0.6285

4 3. Practical 3 — Process capability and comparison with 3-sigma limits

4.1 3.1 Theory

Given stable output $X\sim \Normal(\mu,\sigma^2)$ and specs $\mathrm{LSL},\mathrm{USL}$: \[ C_p=\frac{\mathrm{USL}-\mathrm{LSL}}{6\sigma},\qquad C_{pk}=\min\left(\frac{\mathrm{USL}-\mu}{3\sigma},\frac{\mu-\mathrm{LSL}}{3\sigma}\right). \] The natural tolerance interval is $\mu\pm 3\sigma$.
Capability is meaningful only if the process is in statistical control.

4.2 3.2 Capability on a real dataset (pistonrings)

x <- as.numeric(unlist(pistonrings, use.names = FALSE))
x <- x[is.finite(x)]

# Choose illustration specs around the distribution
LSL <- as.numeric(quantile(x, 0.01, names = FALSE, type = 7))
USL <- as.numeric(quantile(x, 0.99, names = FALSE, type = 7))

cap_tbl <- capability(x, LSL, USL)
ktable(cap_tbl, caption="Capability indices (pistonrings; illustrative specs)")

Capability indices (pistonrings; illustrative specs)
mu	sd	Cp	Cpk
31.7095	31.7238	0.3889	0.3332

mu <- cap_tbl$mu[1]; s <- cap_tbl$sd[1]
L_3s <- mu - 3*s
U_3s <- mu + 3*s

comp <- data.frame(LSL=LSL, L_3sigma=L_3s, mu=mu, U_3sigma=U_3s, USL=USL)
ktable(comp, caption="Specs vs 3-sigma natural tolerance", digits=6)

Specs vs 3-sigma natural tolerance
LSL	L_3sigma	mu	U_3sigma	USL
0	-63.462	31.70953	126.8811	74.02603

df <- data.frame(x=x)
ggplot(df, aes(x)) +
  geom_histogram(bins=30, fill="#80b1d3", color="white") +
  geom_vline(xintercept=c(LSL,USL), linetype="dashed", linewidth=0.9, color="#d95f0e") +
  geom_vline(xintercept=c(L_3s,U_3s), linetype="dotted", linewidth=1.0, color="#1b9e77") +
  labs(title="Histogram with Specs (dashed) and 3-sigma Natural Tolerance (dotted)",
       x="Measurement", y="Count") +
  theme_minimal(base_size=12)

4.3 3.3 Bootstrap confidence intervals for Cp and Cpk (optional but modern)

set.seed(202)
B <- 2000

# Each bootstrap returns a numeric vector (mu, sd, Cp, Cpk)
boot_mat <- replicate(B, {
  xb <- sample(x, length(x), replace = TRUE)
  as.numeric(capability(xb, LSL, USL)[1, c("mu","sd","Cp","Cpk")])
})

# Convert to data.frame with proper column names
boot_df <- as.data.frame(t(boot_mat))
names(boot_df) <- c("mu","sd","Cp","Cpk")

ci <- function(v) as.numeric(quantile(v, c(0.025, 0.975), na.rm = TRUE))

ci_tbl <- data.frame(
  index = c("Cp","Cpk"),
  lower = c(ci(boot_df$Cp)[1],  ci(boot_df$Cpk)[1]),
  upper = c(ci(boot_df$Cp)[2],  ci(boot_df$Cpk)[2])
)

ktable(ci_tbl, caption="Bootstrap 95% CI for capability indices", digits = 4)

Bootstrap 95% CI for capability indices
index	lower	upper
Cp	0.3797	0.3994
Cpk	0.3116	0.3556

5 4. Practical 4 — Six Sigma DMAIC case study (with real baseline data)

We illustrate DMAIC using a real baseline dataset and a transparent improvement demonstration.

5.1 4.1 Define

CTQ: piston ring diameter (continuous).
Defect definition: outside $[LSL,USL]$.
Goal: reduce defect rate and improve $C_{pk}$.

5.2 4.2 Measure (baseline)

We already built $\bar{X}$ and $R$ charts and computed capability in Sections 1.2 and 3.2.

We estimate baseline defect probability under the chosen specs: \[ \hat{p}_{\text{def}}=\frac{1}{N}\sum_{i=1}^N \mathbf{1}\{x_i<LSL \text{ or } x_i>USL\}. \]

p_def_baseline <- mean(x < LSL | x > USL)
ktable(data.frame(baseline_defect_fraction=p_def_baseline,
                  baseline_defect_ppm=1e6*p_def_baseline),
       caption="Baseline defect fraction and PPM (pistonrings; illustrative specs)", digits=6)

Baseline defect fraction and PPM (pistonrings; illustrative specs)
baseline_defect_fraction	baseline_defect_ppm
0.01	10000

5.3 4.3 Analyze

A quick diagnostic: does the subgroup mean drift? Is dispersion stable?
We compute subgroup means and ranges and examine correlation with time (a simple drift check).

sub_means <- rowMeans(pistonrings)
sub_ranges <- apply(pistonrings, 1, function(v) max(v)-min(v))
t <- 1:length(sub_means)

an_tbl <- data.frame(
  cor_mean_time = cor(t, sub_means),
  cor_range_time = cor(t, sub_ranges)
)
ktable(an_tbl, caption="Simple drift diagnostics: correlation with time (illustrative)")

Simple drift diagnostics: correlation with time (illustrative)
cor_mean_time	cor_range_time
0.9994	0.8398

ggplot(data.frame(t=t, xbar=sub_means), aes(t, xbar)) +
  geom_line(color="#2c7fb8") + geom_point(color="#2c7fb8") +
  labs(title="Subgroup means over time (diagnostic)", x="Subgroup", y="Mean") +
  theme_minimal(base_size=12)

ggplot(data.frame(t=t, R=sub_ranges), aes(t, R)) +
  geom_line(color="#d95f0e") + geom_point(color="#d95f0e") +
  labs(title="Subgroup ranges over time (diagnostic)", x="Subgroup", y="Range") +
  theme_minimal(base_size=12)

5.4 4.4 Improve (transparent demonstration)

In real projects, improvement comes from engineering action (calibration, material change, tool replacement). For teaching, we demonstrate a plausible post-improvement scenario by re-centering and reducing spread of the baseline data.

This “after” dataset is simulated from the baseline to show the statistical verification workflow; the baseline data remain real.

set.seed(777)
x_after <- (x - mean(x)) * 0.80 + mean(x)  # reduce spread by 20%
x_after <- x_after - (mean(x_after) - (LSL+USL)/2)  # recenter to midpoint

cap_before <- capability(x, LSL, USL)
cap_after  <- capability(x_after, LSL, USL)

cap_compare <- rbind(
  data.frame(stage="Before", cap_before),
  data.frame(stage="After (demo)", cap_after)
)

ktable(cap_compare, caption="Capability comparison (Before vs After demo)", digits=4)

Capability comparison (Before vs After demo)
stage	mu	sd	Cp	Cpk
Before	31.7095	31.7238	0.3889	0.3332
After (demo)	37.0130	25.3791	0.4861	0.4861

p_def_after <- mean(x_after < LSL | x_after > USL)
ktable(data.frame(stage=c("Before","After (demo)"),
                  defect_fraction=c(p_def_baseline, p_def_after),
                  defect_ppm=c(1e6*p_def_baseline, 1e6*p_def_after)),
       caption="Defect reduction (Before vs After demo)", digits=2)

Defect reduction (Before vs After demo)
stage	defect_fraction	defect_ppm
Before	0.01	10000
After (demo)	0.00	0

5.5 4.5 Control

Choose appropriate SPC charts for ongoing monitoring. Here we show an Individuals chart (I-MR) for the post-improvement stream (illustration):

qI <- qcc(x_after, type="xbar.one", plot=TRUE, title="Individuals Chart (After demo)")

# Build 2-observation subgroups to create moving ranges (size = 2)
pair_mat <- cbind(x_after[-1], x_after[-length(x_after)])  # (x_t, x_{t-1})
qMR <- qcc(pair_mat, type="R", plot=TRUE, title="Moving Range (MR) Chart (After demo)")

summary(qI)

## 
## Call:
## qcc(data = x_after, type = "xbar.one", plot = TRUE, title = "Individuals Chart (After demo)")
## 
## xbar.one chart for x_after 
## 
## Summary of group statistics:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 11.64539 12.44539 28.04539 37.01301 70.84139 70.87419 
## 
## Group sample size:  1
## Number of groups:  600
## Center of group statistics:  37.01301
## Standard deviation:  0.1826543 
## 
## Control limits:
##       LCL      UCL
##  36.46505 37.56098

summary(qMR)

## 
## Call:
## qcc(data = pair_mat, type = "R", plot = TRUE, title = "Moving Range (MR) Chart (After demo)")
## 
## R chart for pair_mat 
## 
## Summary of group statistics:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##  0.00000  0.00000  0.00000  0.20603  0.00720 58.41600 
## 
## Group sample size:  2
## Number of groups:  599
## Center of group statistics:  0.2060341
## Standard deviation:  0.1826543 
## 
## Control limits:
##  LCL       UCL
##    0 0.6731743

6 5. Student Exercises (with guidance)

6.1 Exercise A (Variables charts)

Using a new real dataset of your choice (e.g., airquality$Temp), form subgroups of size $n=5$, build X̄–R charts, and report: 1. limits, 2. any signals, 3. interpretation.

6.2 Exercise B (Attribute charts)

Create a binary CTQ using a real dataset (e.g., iris$Sepal.Length defective if outside a chosen interval), then build a p chart with varying $n_t$.

6.3 Exercise C (Sampling plan)

Design a plan $(n,c)$ so that $P_a(0.01)\ge 0.95$ and $P_a(0.07)\le 0.10$ (use numerical search in R).

Pa <- function(p, n, c) pbinom(c, size=n, prob=p)
target <- function(n,c){
  Pa(0.01,n,c) >= 0.95 && Pa(0.07,n,c) <= 0.10
}
sol <- list()
for(n in 20:200){
  for(c in 0:10){
    if(target(n,c)) { sol[[length(sol)+1]] <- c(n=n,c=c) }
  }
}
sol[1:10]

6.4 Exercise D (Capability)

Compute $C_p$ and $C_{pk}$ for a CTQ using a real dataset and interpret whether centering or variability reduction is needed.

7 Appendix: Notes for instructors

For real industrial labs, replace the illustrative spec limits here with true engineering specs.
Replace the teaching “after demo” with actual post-intervention data when available.
Encourage students to write interpretations, not only plots.

Statistical Quality Control Laboratory Manual (Set A)

Dr. Debashis Chatterjee

2026-02-26