RS-24 Yars
The RS-24 is a three-stage solid fuel missile that reportedly carries a payload of three reentry vehicles (RV) and penetration aids. The RS-24 Yars is believed to have entered into service in February 2010. While details about the missiles specifications and capabilities are limited, it is reported to be designed similarly to Russia’s SS-27 (Topol M) ICBM and the Bulava (SS-NX-32) SLBM.
For more see: https://missilethreat.csis.org/missile/rs-24/, https://en.wikipedia.org/wiki/RS-24_Yars, https://ria.ru/20210129/rakety-1595155441.html
We don’t believe any speculations about this ICBM. We just try estimate RS-24 operational range, applying multiple linear regression both in frequentest and Bayesian paradigm on SLBM data.
We use aggregated and transformed SLBM data to explore the relationship between M - Mass (kg), R - Range (km), P - Payload (kg), S - Stage (1,2,3), D - Diameter (m), L - Length (m) and W - Type of Warhead (Single - 1, MIRV - 2) of submarine launched ballistic missile.
load("slbm.dat")
library(DT)
library(tidyverse)
library(knitr)
#datatable(slbm)
slbm %>% kable(caption = "SLBM data",digits = 4)
| S | M | L | D | R | P | W |
|---|---|---|---|---|---|---|
| 1 | 5400 | 10.40 | 0.58 | 150 | 975 | 1 |
| 1 | 13700 | 11.80 | 1.30 | 650 | 1597 | 1 |
| 1 | 19650 | 14.20 | 1.30 | 1420 | 1179 | 1 |
| 1 | 14200 | 8.89 | 1.50 | 2400 | 650 | 1 |
| 1 | 14200 | 8.89 | 1.50 | 3000 | 650 | 1 |
| 1 | 14200 | 9.65 | 1.50 | 3000 | 650 | 2 |
| 2 | 33300 | 13.00 | 1.80 | 7800 | 1100 | 1 |
| 2 | 33300 | 13.00 | 1.80 | 9100 | 1100 | 1 |
| 2 | 26900 | 10.60 | 1.54 | 3900 | 450 | 1 |
| 2 | 35300 | 14.10 | 1.80 | 6500 | 1600 | 2 |
| 2 | 35300 | 14.10 | 1.80 | 8000 | 1600 | 1 |
| 2 | 35300 | 14.10 | 1.80 | 6500 | 1600 | 2 |
| 3 | 90000 | 16.00 | 2.40 | 8300 | 2550 | 2 |
| 3 | 40300 | 14.80 | 1.90 | 8300 | 2300 | 2 |
| 3 | 40800 | 14.80 | 1.90 | 8300 | 2800 | 2 |
| 3 | 36800 | 11.50 | 2.00 | 9300 | 1150 | 2 |
| 2 | 13600 | 9.45 | 1.37 | 2800 | 500 | 1 |
| 2 | 16200 | 9.86 | 1.37 | 4630 | 760 | 2 |
| 2 | 29485 | 10.36 | 1.88 | 5600 | 2000 | 2 |
| 3 | 32000 | 10.36 | 1.88 | 7400 | 1360 | 2 |
| 3 | 57500 | 13.42 | 2.11 | 11000 | 2880 | 2 |
| 2 | 20000 | 10.70 | 1.50 | 3000 | 1360 | 1 |
| 2 | 19500 | 10.70 | 1.50 | 3200 | 1360 | 1 |
| 2 | 19950 | 10.40 | 1.50 | 3200 | 1000 | 1 |
| 2 | 14700 | 10.70 | 1.40 | 2500 | 600 | 1 |
| 2 | 23000 | 13.00 | 2.00 | 8000 | 700 | 2 |
Here we produce correlations matrix in graphical form.
library(ggplot2)
library(GGally)
#Here we use function from https://www.r-bloggers.com/multiple-regression-lines-in-ggpairs/
my_fn <- function(data, mapping, ...){
p <- ggplot(data = data, mapping = mapping) +
geom_point() +
geom_smooth(method=loess, fill="red", color="red", ...) +
geom_smooth(method=lm, fill="blue", color="blue", ...)
p
}
g = ggpairs(slbm,columns = 2:6, lower = list(continuous = my_fn))
g
Here we produce classical multiple linear regression model in reduced form (two independent variables) and pass some routine tests.
fit.lm <- lm(log(M)~log(P)+sqrt(R),data = slbm)
summary(fit.lm)
##
## Call:
## lm(formula = log(M) ~ log(P) + sqrt(R), data = slbm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.43582 -0.09245 -0.05523 0.06307 0.65331
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.443759 0.653801 9.856 1.00e-09 ***
## log(P) 0.344329 0.100484 3.427 0.0023 **
## sqrt(R) 0.017668 0.002243 7.877 5.59e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.236 on 23 degrees of freedom
## Multiple R-squared: 0.8453, Adjusted R-squared: 0.8318
## F-statistic: 62.82 on 2 and 23 DF, p-value: 4.788e-10
hist(fit.lm$residuals,col = "blue",main = "Histogram of residuals",
xlab = "Residuals")
par(mfrow=c(2,2))
plot(fit.lm)
Linear regression model formula looks like
\[{Mass} = e^{{B0 + B1\sqrt(Range)} + B2\ln(Payload)}\]
or
\[{Mass} = e^{{6.443759 + 0.01767\sqrt(Range)} + 0.34433\ln(Payload)}\]
library(rvest)
library(stringi)
RS24 <- read_html("https://missilethreat.csis.org/missile/rs-24/") %>%
html_node(".ataglance")%>% html_text()
str_RS24_00 <- stri_replace_all(RS24,replacement = " Originated in Russia: Name Yars, SS-27 Mod 2, SS-29, RS-24, ",regex = "\nYars At a Glance\nOriginated From: RussiaPossessed By: RussiaAlternate Name: SS-27 Mod 2, SS-29, RS-24, Yars/Yantz/Yahres")
str_RS24_01 <- stri_replace_all(str_RS24_00,replacement =": Class",regex = "Class:")
str_RS24_02 <- stri_replace_all(str_RS24_01,replacement =": Basing",regex = "Basing:")
str_RS24_0 <- stri_replace_all(str_RS24_02,replacement = "Silo-based: Length",regex = "Silo-basedLength:")
str_RS24_1 <- stri_replace_all(str_RS24_0,replacement = "m : Diameter",regex = "mDiameter:")
str_RS24_2 <- stri_replace_all(str_RS24_1,replacement = ": Launch Weight",regex = "Launch Weight:")
str_RS24_3 <- stri_replace_all(str_RS24_2,replacement = "kg: Payload",regex = "kgPayload:")
str_RS24_4 <- stri_replace_all(str_RS24_3,replacement = "kg : Warhead",regex = "kgWarhead:")
str_RS24_5 <- stri_replace_all(str_RS24_4,replacement = "kT: Propulsion",regex = "kTPropulsion:")
str_RS24_6 <- stri_replace_all(str_RS24_5,replacement = "propellant: Range",regex = "propellantRange:")
str_RS24_7 <- stri_replace_all(str_RS24_6,replacement = "km: Status",regex = "kmStatus:")
str_RS24_8 <- stri_replace_all(str_RS24_7,replacement = " Operational: In Service 2010",regex = "OperationalIn Service: 2010\n")
stri_split(str_RS24_8,regex = ":")
## [[1]]
## [1] " Originated in Russia"
## [2] " Name Yars, SS-27 Mod 2, SS-29, RS-24, "
## [3] " Class Intercontinental Ballistic Missile (ICBM)"
## [4] " Basing Road-mobile, Silo-based"
## [5] " Length 22.5 m "
## [6] " Diameter 2.0 m (first stage)"
## [7] " Launch Weight 49,600 kg"
## [8] " Payload Three MIRV warheads, 1,200 kg "
## [9] " Warhead Nuclear, 150-200 kT"
## [10] " Propulsion Three-stage solid propellant"
## [11] " Range 10,500 km"
## [12] " Status Operational"
## [13] " In Service 2010"
Here we apply previous linear model formula for Range given Mass and Payload.
\[Range={(\frac{6.443759+0.34433ln(Payload)-ln(Mass)}{0.01767})}^2\]
B0 <- as.numeric(fit.lm$coefficients[1])
B1 <- as.numeric(fit.lm$coefficients[2])
B2 <- as.numeric(fit.lm$coefficients[3])
Mass <- 49600
Payload <- 1200
Range <- ((B0 + B1*log(Payload) - log(Mass))/B2)^2
cat("Range =",Range)
## Range = 11891.62
fit.slbm.bs.4<-stan_glm(sqrt(R)~S+D+L+W+log(M)+log(P),data=slbm,chains=4,iter=10000,seed=12345)
fit.slbm.bs.4
## stan_glm
## family: gaussian [identity]
## formula: sqrt(R) ~ S + D + L + W + log(M) + log(P)
## observations: 26
## predictors: 7
## ------
## Median MAD_SD
## (Intercept) -40.6 106.7
## S 9.4 5.1
## D 37.4 19.8
## L 0.9 1.7
## W 3.3 5.6
## log(M) 6.5 15.0
## log(P) -7.3 6.0
##
## Auxiliary parameter(s):
## Median MAD_SD
## sigma 10.2 1.7
##
## Sample avg. posterior predictive distribution of y:
## Median MAD_SD
## mean_PPD 69.0 2.8
##
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg
par(mfrow=c(1,1))
plot(fit.slbm.bs.4)
posterior_vs_prior(fit.slbm.bs.4)
##
## Drawing from prior...
fit.slbm.bs42<-as.array(fit.slbm.bs.4)
mcmc_hist(fit.slbm.bs42)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
mcmc_trace(fit.slbm.bs42)
summary(fit.slbm.bs.4)
##
## Model Info:
##
## function: stan_glm
## family: gaussian [identity]
## formula: sqrt(R) ~ S + D + L + W + log(M) + log(P)
## algorithm: sampling
## priors: see help('prior_summary')
## sample: 20000 (posterior sample size)
## observations: 26
## predictors: 7
##
## Estimates:
## mean sd 2.5% 25% 50% 75% 97.5%
## (Intercept) -40.0 109.1 -254.9 -112.3 -40.6 31.7 175.7
## S 9.4 5.4 -1.2 5.9 9.4 12.9 20.2
## D 37.4 20.3 -2.6 24.1 37.4 50.9 77.4
## L 0.9 1.8 -2.7 -0.3 0.9 2.0 4.4
## W 3.2 5.8 -8.5 -0.5 3.3 7.1 14.8
## log(M) 6.4 15.6 -24.3 -3.6 6.5 16.6 37.0
## log(P) -7.3 6.2 -19.5 -11.3 -7.3 -3.3 5.0
## sigma 10.4 1.8 7.6 9.2 10.2 11.4 14.6
## mean_PPD 69.0 2.9 63.2 67.1 69.0 70.9 74.7
## log-posterior -110.5 2.4 -116.3 -111.8 -110.1 -108.8 -107.1
##
## Diagnostics:
## mcse Rhat n_eff
## (Intercept) 1.2 1.0 8838
## S 0.0 1.0 12972
## D 0.2 1.0 9062
## L 0.0 1.0 12215
## W 0.1 1.0 11906
## log(M) 0.2 1.0 8349
## log(P) 0.1 1.0 13073
## sigma 0.0 1.0 10097
## mean_PPD 0.0 1.0 19001
## log-posterior 0.0 1.0 6237
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
set.seed(12345)
slbm.pp <- rstanarm::posterior_predict(fit.slbm.bs.4,
newdata = data.frame(L=22.5,D=2,S=3,W=2,
M=Mass,P=Payload),seed=12345)
Range.km <- slbm.pp^2
Range.km <- Range.km[1:10000,]
quantile(Range.km,probs = c(0.1,0.5,0.9))
## 10% 50% 90%
## 6663.413 11402.103 17464.954
ggplot(data=as.data.frame(Range.km), aes(Range.km)) +
geom_histogram(bins = 30,col="black",fill="green") +
geom_vline(xintercept = mean(Range.km), color = "red") +
geom_errorbarh(aes(y=-5, xmin=quantile(Range.km,0.1),
xmax=quantile(Range.km,0.9)),
data=as.data.frame(Range.km), col="#0094EA", size=3) +
ggtitle(label="Probability density of RS-24 Range")
mean(Range.km>9000)
## [1] 0.7339
Given historical SLBM data (M,P,D,L,S,W) and linear regression model we can produce point estimate for RS-24 ICBM Range as \({Range} = {{(\frac{6.443759+0.34433ln(Payload)-ln(Mass)}{0.01767})}^2}=11891\)
Applying Bayesian linear regression model for SLBM data we can produce posterior distribution sample given the same historical data with 80% credible interval for RS-24 ICBM Range as \(P(6663\le Range\le17465)=0.8\) and the mean \(Range=11402\).
We can be 73% assured that RS-24 ICBM \(Range\ge9000\).
Both estimates are very close to the official ones.