Overview

This presentation applies statistics to a chemistry problem: calibration with the Beer-Lambert Law. Using simple linear regression, we estimate the linear relationship between concentration and absorbance from a set of standard solutions.

library(ggplot2)
library(dplyr)
library(plotly)
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)

Simulated calibration data

Suppose we have a set of standard solutions with known concentrations (in mol/L) and their measured absorbances at a given wavelength.

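set.seed(42)                                  # make the simulated noise reproducible
conc <- seq(0.1, 1.0, by = 0.1)               # ten standards, 0.1 to 1.0 mol/L
# Simulated absorbance: intercept 0.25, slope 1.2, Gaussian noise with sd 0.05
absorb <- 0.25 + 1.2 * conc + rnorm(length(conc), sd = 0.05)
calib <- data.frame(conc, absorb)
head(calib)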
##   conc    absorb
## 1  0.1 0.4385479
## 2  0.2 0.4617651
## 3  0.3 0.6281564
## 4  0.4 0.7616431
## 5  0.5 0.8702134
## 6  0.6 0.9646938

Beer-Lambert Law

The Beer-Lambert Law states:

\[ A = \varepsilon \, c \, l \]

where:

- \(A\) is absorbance
- \(\varepsilon\) is molar absorptivity
- \(c\) is concentration
- \(l\) is path length (cm)
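As a quick numerical illustration of the law, the sketch below computes an absorbance and then inverts the relationship to recover the concentration. The values of `epsilon` and `l` and the helper names `absorbance()` and `concentration()` are hypothetical, chosen only for this example; they are not taken from the calibration data.

```r
# Hypothetical values, for illustration only
epsilon <- 1.2   # molar absorptivity, L mol^-1 cm^-1 (assumed)
l <- 1.0         # path length, cm (assumed)

# Beer-Lambert Law: A = epsilon * c * l
absorbance <- function(conc, epsilon, l) epsilon * conc * l

# Inverse relationship: concentration from a measured absorbance
concentration <- function(A, epsilon, l) A / (epsilon * l)

A <- absorbance(0.5, epsilon, l)        # absorbance of a 0.5 mol/L solution
c_back <- concentration(A, epsilon, l)  # recovers 0.5 mol/L
```

Because \(A\) is proportional to \(c\) when \(\varepsilon\) and \(l\) are fixed, doubling the concentration doubles the absorbance.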

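Because absorbance is linear in concentration for a fixed \(\varepsilon\) and \(l\), calibration is usually done with simple linear regression of absorbance on concentration; the fitted slope then estimates \(\varepsilon l\).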

Linear regression model

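We can model the absorbance of standard \(i\) as:

\[ A_i = \beta_0 + \beta_1 c_i + e_i, \qquad e_i \sim N(0, \sigma^2) \]

where the error term is written \(e_i\) to keep it distinct from the molar absorptivity \(\varepsilon\). The coefficients are estimated by least squares:

\[ (\hat{\beta}_0, \hat{\beta}_1) = \arg\min_{\beta_0, \beta_1} \sum_{i=1}^{n} \left( A_i - (\beta_0 + \beta_1 c_i) \right)^2 \]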
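The minimization above has a closed-form solution for simple linear regression. The sketch below, which assumes the same simulated data as on the earlier slide, computes the estimates by hand and compares them with `lm()`. The names `b0`, `b1`, and `fit_check` are introduced here for illustration only.

```r
# Recreate the simulated calibration data from the earlier slide
set.seed(42)
conc <- seq(0.1, 1.0, by = 0.1)
absorb <- 0.25 + 1.2 * conc + rnorm(length(conc), sd = 0.05)

# Closed-form least-squares estimates:
# slope = S_xy / S_xx, intercept = mean(y) - slope * mean(x)
b1 <- sum((conc - mean(conc)) * (absorb - mean(absorb))) /
  sum((conc - mean(conc))^2)
b0 <- mean(absorb) - b1 * mean(conc)

# lm() minimizes the same sum of squares, so the coefficients should agree
fit_check <- lm(absorb ~ conc)
all.equal(unname(coef(fit_check)), c(b0, b1))
```

With only ten noisy standards, the estimates will be close to, but not exactly equal to, the intercept 0.25 and slope 1.2 used in the simulation.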

Calibration curve

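# Scatter plot of the standards with the fitted regression line
p1 <- ggplot(calib, aes(x = conc, y = absorb)) +
  geom_point(size = 2) +
  # se = TRUE draws the confidence band around the fitted line
  geom_smooth(method = "lm", se = TRUE, color = "blue") +
  labs(title = "Calibration Curve", x = "Concentration (mol/L)",
       y = "Absorbance")
print(p1)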

Residuals histogram

# Fit the calibration model and collect its residuals
fit <- lm(absorb ~ conc, data = calib)
resid_df <- data.frame(resid = residuals(fit))

# Histogram of residuals in bins of width 0.02, aligned at zero
p2 <- ggplot(resid_df, aes(x = resid)) +
  geom_histogram(binwidth = 0.02, boundary = 0, closed = "left") +
  labs(title = "Residuals of calibration model", x = "Residual", y = "Count")
print(p2)
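The spread of these residuals estimates \(\sigma\) in the regression model. As a rough check, assuming the same simulated data, the sketch below computes the residual standard error by hand and compares it with `sigma()` from base R's stats package. The names `fit_chk`, `res`, and `sigma_hat` are introduced here for illustration only.

```r
# Recreate the simulated calibration data and refit the model
set.seed(42)
conc <- seq(0.1, 1.0, by = 0.1)
absorb <- 0.25 + 1.2 * conc + rnorm(length(conc), sd = 0.05)
fit_chk <- lm(absorb ~ conc)

res <- residuals(fit_chk)
n <- length(res)

# Two estimated coefficients leave n - 2 residual degrees of freedom
sigma_hat <- sqrt(sum(res^2) / (n - 2))

# sigma() reports the same residual standard error
all.equal(sigma_hat, sigma(fit_chk))
```

The result can be compared with the noise level `sd = 0.05` used to simulate the data, keeping in mind that ten observations give only a rough estimate.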

3D interactive plot

Absorbance may depend on both concentration and wavelength. Below is a simulated 3D plot.