Optimized Letter-Value Based Estimates for Tukey’s g-&-h Distribution

Author

Tingting Zhan

Published

July 26, 2025

Introduction

This vignette of package TukeyGH77 (CRAN, Github) documents

Tukey (1977)’s g-&-h transformation and its inverse
Tukey (1977)’s g-&-h distribution density, cumulative probability, quantile, and random generator
an improvement to Hoaglin (1985)’s letter-value based estimates.

R terminology might be different from that of mathematics and statistics. Please refer to Appendix Section 5.1 for explanation and reference of the terms and abbreviations used in this vignette.

Package TukeyGH77 Imports packages

ggplot2 (Wickham 2016, version 3.5.2), geomtextpath (Cameron and van den Brand 2025, version 0.2.0), latex2exp (Meschiari 2022, version 0.9.6), patchwork (Pedersen 2025, version 1.3.1), rlang (Henry and Wickham 2025, version 1.1.6), scales (Wickham, Pedersen, and Seidel 2025, version 1.4.0), for visualization
🗝 rstpm2 (X.-R. Liu, Y. Pawitan, and M. Clements 2018, version 1.6.7), key dependency, for function rstpm2::vuniroot()

Prerequisite

New features are first implemented on Github.

remotes::install_github('tingtingzhan/TukeyGH77')

And eventually make their way to CRAN.

utils::install.packages('TukeyGH77') # Developers: do NOT use!

Note to Users

Examples in this vignette require that the search path has

library(TukeyGH77)

Tukey’s g-&-h Transformation

Tukey’s Transformation

Tukey (1977)’s g-&-h random variable t_{gh} is a monotone transformation of the standard normal variable z , t_{gh} = \begin{cases} z\cdot G_g(z) & g\neq 0,\ g\text{-distribution} \\ z\cdot H_h(z) & h>0,\ h\text{-distribution} \\ z\cdot G_g(z)\cdot H_h(z) & g\neq 0,h>0,\ gh\text{-distribution} \\ \end{cases} with skewness

G_g(z)= \begin{cases} (e^{gz}-1)/gz & g\neq 0\\\ \displaystyle\lim_{g\rightarrow 0} G_{g \ne 0}(z)=1 & g=0 \end{cases} and kurtosis

H_h(z)= \begin{cases} e^{hz^2/2} & h>0\\ 1 & h =0 \end{cases} Tukey’s transformation function z2gh() transforms the standard normal variable z to Tukey’s g-&-h random variable t_{gh}.

set.seed(15); rnorm(1e3L) |>
  z2gh(g = .5, h = .1) |>
  hist(breaks = 30L, xlab = NULL, main = c('Tukey(g=.5, h=.1)'))

Inverse Tukey’s Transformation

Inverse Tukey’s transformation \zeta_{gh}

\zeta_{gh}(t) = \begin{cases} g^{-1}\ln(gt+1), & g\neq0 \\ z:\ z\cdot H_h(z) = t, & h>0 \\ z:\ z\cdot G_g(z)\cdot H_h(z) = t, & g\neq 0,\ h>0 \end{cases} does not have a closed analytical form when h>0, but the numerical solution could be found by using Brent (1973)’s root-finding algorithm implemented in function rstpm2::vuniroot().

Inverse Tukey’s transformation function gh2z() transforms the Tukey’s g-&-h random variable t_{gh} to standard normal variable z. A closed analytical form of \zeta_{gh} exists when h=0, thus the discrepancy is merely a floating-point error.

set.seed(24); x = rnorm(1e3L)
x |> 
  z2gh(g = .3, h = 0) |> 
  gh2z(g = .3, h = 0) |> 
  all.equal.numeric(target = x, tolerance = .Machine$double.eps)
#> [1] "Mean relative difference: 2.871014e-16"

We have to rely on a root-finding algorithm when h>0. To find a numerical solution for standard normal quantile z, we need only to search the interval of (-8.3, 8.3), as

stopifnot(identical(pnorm(8.3), 1))

Nonetheless, the root-finding algorithm would have much larger discrepancy (which can be controlled by parameter tol inside function gh2z()).

set.seed(91); x = rnorm(1e3L)
x |> 
  z2gh(g = 0, h = .1) |> 
  gh2z(g = 0, h = .1) |> 
  all.equal.numeric(target = x, tolerance = .Machine$double.eps)
#> [1] "Mean relative difference: 5.191943e-06"

set.seed(56); x = rnorm(1e3L)
x |> 
  z2gh(g = .3, h = .1) |> 
  gh2z(g = .3, h = .1) |> 
  all.equal.numeric(target = x, tolerance = .Machine$double.eps)
#> [1] "Mean relative difference: 5.138869e-06"

Tukey g-&-h Distribution

Probability

The distribution function is F_{gh}(t)=\text{Pr}(T_{gh} \le t)=\text{Pr}\left(Z \le \zeta_{gh}(t)\right).

curve(pGH(x, g = .3, h = .1), from = -2.5, to = 3.5, n = 501L, ylab = 'Probability')

Density

Density function f_{gh} has a closed analytical form in terms of \zeta_{gh},

f_{gh}(t) = \dfrac{e^{-z^2/2}}{\sqrt{2\pi}\cdot\partial t_{gh}/\partial z}\Bigg|_{z=\zeta_{gh}(t)}

where

\dfrac{\partial t_{gh}}{\partial z}= \begin{cases} e^{gz}, & g\neq0\\ e^{hz^2/2}(1+hz^2), & h>0 \\ e^{hz^2/2}\left(e^{gz}+g^{-1}hz(e^{gz}-1)\right), & g\neq 0,\ h>0 \end{cases}

Note that when h=0, domain has a bound.

curve(dGH(x, g = 1, h = 0), from = -1.2, to = 3.5, n = 501L, ylab = 'Density')

When h>0, domain is (-\infty, \infty).

curve(dGH(x, g = .3, h = .1), from = -2.5, to = 3.5, n = 501L, ylab = 'Density')

Quantile

curve(qGH(x, g = .3, h = .1), from = 0, to = 1, n = 501L, ylab = 'Quantile')

Simulation

set.seed(17); rGH(n = 1e3L, g = .3, h = .1) |> 
  hist(breaks = 30L, xlab = NULL, main = 'TukeyGH(g = .3, h = .1)')

Optimized Letter-Value Based Estimates

Consider a random sample T from Tukey’s g-&-h distribution with parameters (A,B,g,h).

Let \hat{t}_p be the p-th sample quantile, for 0<p<.5.

We refer A-t_p as the lower half-spread (LHS) and t_{1-p}-A as the upper half-spread (UHS).

Tukey’s h-Model

From Tukey’s h-transformation (i.e., g=0) on LHS and UHS, as well as their summation (Hoaglin (1985), equations (26a), (26b), (27) and (28)),

\begin{cases} A - t_p = -Bz_pe^{hz^2_p/2} & \Rightarrow\quad \ln\left(\dfrac{\hat{A}-\hat{t}_p}{-z_p}\right) = {\color{blue}{\ln B_\text{L}}}+{\color{blue}{h_\text{L}}}z^2_p/2 + \varepsilon\\ t_{1-p} - A = Bz_{1-p}e^{hz^2_{1-p}/2} & \Rightarrow\quad \ln\left(\dfrac{\hat{t}_{1-p}-\hat{A}}{z_{1-p}}\right) = {\color{blue}{\ln B_\text{U}}}+{\color{blue}{h_\text{U}}}z^2_p/2 + \varepsilon\\ t_{1-p}-t_p = B(z_{1-p}-z_p)e^{hz^2_p/2} & \Rightarrow\quad \ln\left(\dfrac{\hat{t}_{1-p} - \hat{t}_p}{-2z_p}\right) = {\color{blue}{\ln B}} + {\color{blue}{h}}z^2_p/2 + \varepsilon \end{cases}

Function letterValue_h() estimates Tukey’s h-model. We explore some choices of \hat{A} between the 45th to 55th sample percentile, with parameter A_select,

A_select = 'median' (default) of random sample T, the naïve choice in Hoaglin (1985)
A_select = 'B.optim', to minimize \text{err}_B =(\hat{B}_\text{U}-\hat{B}_\text{L})^2
A_select = 'h.optim', to minimize \text{err}_h =(\hat{h}_\text{U}-\hat{h}_\text{L})^2

Note that the estimated intercept and slope parameters (\ln\hat{B}, \hat{h}) on both half spreads do not depend on \hat{A}.

set.seed(28); rGH(1e3L, h = .2) |>
  letterValue_h()
#>           A           B           g           h 
#> -0.07240777  1.01353750  0.00000000  0.14859619

Figure 1 shows different choices of \hat{A} affect (\hat{B}_\text{L}, \hat{h}_\text{L}) and (\hat{B}_\text{U}, \hat{h}_\text{U}), but do not affect (\hat{B}, \hat{h}).

[R] Tukey h-model: choices of A

set.seed(28); rGH(1e3L, h = .2) |>
  letterValue_h(A_select = 'demo')

Tukey’s g-Model

From Tukey’s g-transformation (i.e., h=0) on LHS and UHS, as well as their ratio (Hoaglin (1985), equations (8a), (8b), (10)),

\begin{cases} t_p - A = B(e^{gz_p}-1)/g & \Rightarrow\quad \hat{t}_p - \hat{A} = {\color{blue}{0}} + {\color{blue}{B_\text{L}}}(e^{\hat{g}z_p}-1)/\hat{g} + \varepsilon\\ t_{1-p} - A = B(e^{gz_{1-p}}-1)/g & \Rightarrow\quad \hat{t}_{1-p} - \hat{A} = {\color{blue}{0}} + {\color{blue}{B_\text{U}}}(e^{\hat{g}z_{1-p}}-1)/\hat{g} + \varepsilon\\ \dfrac{A-t_p}{t_{1-p}-A} = \dfrac{1-e^{gz_p}}{e^{-gz_p}-1} = e^{gz_p} & \Rightarrow\quad \hat{g}_p = \dfrac{1}{z_{p}}\cdot \ln\left(\dfrac{\hat{A}-\hat{t}_{p}}{\hat{t}_{1-p}-\hat{A}}\right) \end{cases}

Function letterValue_g() estimates Tukey’s g-model. We first explore some choices of \hat{A} between the 45th to 55th sample percentile, with parameter A_select,

A_select = 'median' of random sample T, the naïve choice in Hoaglin (1985)
A_select = 'optim' (default), to minimize the standard deviation stats::sd() of \hat{g}_p

We then explore some choices of \hat{g} between the 40th to 60th percentile of \hat{g}_p, with parameter g_select,

g_select = 'median' of \hat{g}_p, the naïve choice in Hoaglin (1985)
g_select = 'optim' (default), to minimize \text{err}_B =(\hat{B}_\text{U}-\hat{B}_\text{L})^2

set.seed(43); rGH(1e3L, g = .3) |>
  letterValue_g()
#>            A            B            g            h 
#> 0.0001778168 0.9962584286 0.3360145770 0.0000000000

Figure 2 shows different choices of \hat{A} affect the estimated \hat{g}_p.

Figure 3 shows different choices of \hat{g} typically do not affect \hat{B}_\text{L}, \hat{B}_\text{U} nor \hat{B} too much.

[R] Tukey g-model: choices of A

set.seed(43); rGH(1e3L, g = .3) |>
  letterValue_g(A_select = 'demo')

[R] Tukey g-model: choices of g

set.seed(43); rGH(1e3L, g = .3) |>
  letterValue_g(g_select = 'demo')

Tukey’s g-&-h Model

From Tukey’s g-&-h transformation on LHS and UHS, as well as their difference and ratio (Hoaglin (1985), equation (33)),

\begin{cases} t_p - A= Bg^{-1}(e^{gz_p}-1)e^{hz^2_p/2} & \Rightarrow\quad \ln\left(\dfrac{\hat{g}(\hat{t}_p-\hat{A})}{e^{\hat{g}z_p}-1}\right) = {\color{blue}{\ln B_\text{L}}}+{\color{blue}{h_\text{L}}}z^2_p/2 + \varepsilon\\ t_{1-p} - A = Bg^{-1}(e^{gz_{1-p}}-1)e^{hz^2_{1-p}/2} & \Rightarrow\quad \ln\left(\dfrac{\hat{g}(\hat{t}_{1-p}-\hat{A})}{e^{-\hat{g}z_p}-1}\right) = {\color{blue}{\ln B_\text{U}}}+{\color{blue}{h_\text{U}}}z^2_p/2 + \varepsilon\\ t_{1-p}-t_p = Bg^{-1}(e^{-gz_p}-e^{gz_p})e^{hz^2_p/2} & \Rightarrow\quad \ln\left(\dfrac{\hat{g}(\hat{t}_{1-p}-\hat{t}_p)}{e^{-\hat{g}z_p}-e^{\hat{g}z_p}}\right) = {\color{blue}{\ln B}} + {\color{blue}{h}}z^2_p/2 + \varepsilon\\ \dfrac{A-t_p}{t_{1-p}-A} = \dfrac{1-e^{gz_p}}{e^{-gz_p}-1} = e^{gz_p} & \Rightarrow\quad \hat{g}_p\quad \text{same as}\ g\text{-model} \end{cases}

The estimates (\ln\hat{B}_\text{L}, \hat{h}_\text{L}), (\ln\hat{B}_\text{U}, \hat{h}_\text{U}) and (\ln\hat{B}, \hat{h}) could all be very sensitive to the choice of \hat{g}.

Function letterValue_gh() estimates Tukey’s g-&-h model. We first explore some choices of \hat{A} between the 45th to 55th sample percentile, with parameter A_select,

A_select = 'median' of random sample T, the naïve choice in Hoaglin (1985)
A_select = 'optim' (default), to minimize the standard deviation stats::sd() of \hat{g}_p.

We then explore some choices of \hat{g} between the 40th to 60th percentile of \hat{g}_p, with parameter g_select,

g_select = 'median' of \hat{g}_p, the naïve choice in Hoaglin (1985)
g_select = 'B.optim', to minimize \text{err}_B =(\hat{B}_\text{U}-\hat{B}_\text{L})^2
g_select = 'h.optim' (default), to minimize \text{err}_h =(\hat{h}_\text{U}-\hat{h}_\text{L})^2

Perfect example

set.seed(54); rGH(n = 1e3L, g = -.3, h = .1) |> 
  letterValue_gh()
#>            A            B            g            h 
#> -0.004519068  0.986487241 -0.294966233  0.158689118

This is a ‘perfect’ data set that

all choices of \hat{A} are pretty close, in terms of the dispersion and/or pattern of \hat{g}_p, Figure 4;
all choices of \hat{g} have very small \text{err}_h and \text{err}_B, Figure 5.

[R] Tukey gh-model (Perfect): choices of A

set.seed(54); rGH(n = 1e3L, g = -.3, h = .1) |> 
  letterValue_gh(A_select = 'demo')

Figure 4: Tukey gh-model (Perfect): choices of A

[R] Tukey gh-model (Perfect): choices of g

set.seed(54); rGH(n = 1e3L, g = -.3, h = .1) |> 
  letterValue_gh(g_select = 'demo')

Figure 5: Tukey gh-model (Perfect): choices of g

Salvageable example

set.seed(335); rGH(n = 1e3L, g = -.3, h = .1) |>
  letterValue_gh()
#>             A             B             g             h 
#>  0.0004409318  1.0404701657 -0.3128586259  0.0172237338

This is a ‘salvageable’ data set that

\hat{A}_\text{optim} not only reduces the standard deviations of \hat{g}_p, but also removes the clear pattern in \hat{g}_p, Figure 6;
\hat{g}_\text{h.optim} provides both \hat{h}_\text{L}>0 and \hat{h}_\text{U}>0, and surely \hat{h}>0, Figure 7.

Figure 6: Tukey gh-model (Salvageable): choices of A

[R] Tukey gh-model (Salvageable): choices of g

set.seed(335); rGH(n = 1e3L, g = -.3, h = .1) |>
  letterValue_gh(g_select = 'demo')

Figure 7: Tukey gh-model (Salvageable): choices of g

Hopeless example

set.seed(39); rGH(n = 1e3L, g = -.3, h = .1) |>
  letterValue_gh()
#>           A           B           g           h 
#> -0.03884931  1.09634946 -0.25245660  0.00000000

This is a ‘hopeless’ data set that

\hat{A}_\text{optim} neither reduces much the standard deviations of \hat{g}_p, nor removes the clear pattern in \hat{g}_p, Figure 8;
estimated slope \hat{h}<0 for all choices of \hat{g}, Figure 9.

[R] Tukey gh-model (Hopeless): choices of A

set.seed(39); rGH(n = 1e3L, g = -.3, h = .1) |>
  letterValue_gh(A_select = 'demo')

Figure 8: Tukey gh-model (Hopeless): choices of A

[R] Tukey gh-model (Hopeless): choices of g

set.seed(39); rGH(n = 1e3L, g = -.3, h = .1) |>
  letterValue_gh(g_select = 'demo')

Figure 9: Tukey gh-model (Hopeless): choices of g

Good new is that we do not see ‘hopeless’ data sets often in simulations (\approx 6.3\% in random seeds 1:1e3L). In this case, we recommend reporting \hat{h}=0.

Appendix

Terms & Abbreviations

Term / Abbreviation	Description
`\|>`	Forward pipe operator introduced in `R` 4.1.0
`all.equal.numeric`	Test of near equality
`CRAN`, `R`	The Comprehensive R Archive Network
`curve`	Function plots
`.lm.fit`	Least squares regression, the internal workhorse
`mad`	Median Absolute Deviation
`median`	Median
`dnorm`, `pnorm`, `qnorm`, `rnorm`	Normal Distribution
`optim`, `optimize`	Optimization
`sd`	Standard Deviation
`search`	Search path
`seed`	Random number generation seed
`vuniroot`	Vectorised one-dimensional root-finding, in package `rstpm2`

References

Brent, Richard P. 1973. Algorithms for Minimization Without Derivatives. Englewood Cliffs, New Jersey: Prentice-Hall.

Cameron, Allan, and Teun van den Brand. 2025. geomtextpath: Curved Text in ’Ggplot2’. https://doi.org/10.32614/CRAN.package.geomtextpath.

Henry, Lionel, and Hadley Wickham. 2025. rlang: Functions for Base Types and Core r and ’Tidyverse’ Features. https://doi.org/10.32614/CRAN.package.rlang.

Hoaglin, David C. 1985. “Exploring Data Tables, Trends, and Shapes.” In, 461–513. New York: John Wiley & Sons, Ltd. https://doi.org/10.1002/9781118150702.ch11.

Meschiari, Stefano. 2022. latex2exp: Use LaTeX Expressions in Plots. https://doi.org/10.32614/CRAN.package.latex2exp.

Pedersen, Thomas Lin. 2025. patchwork: The Composer of Plots. https://doi.org/10.32614/CRAN.package.patchwork.

Tukey, John W. 1977. “Modern Techniques in Data Analysis.”

Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.

Wickham, Hadley, Thomas Lin Pedersen, and Dana Seidel. 2025. scales: Scale Functions for Visualization. https://doi.org/10.32614/CRAN.package.scales.

X.-R. Liu, Y. Pawitan, and M. Clements. 2018. “Parametric and Penalized Generalized Survival Models.” Statistical Methods in Medical Research 27 (5): 1531–46.