Given a probability distribution F and a point x, Mahalanobis Depth (MD) is defined as \(MD = [1 + (x - \mu)'\Sigma^{-1}(x - \mu)]^{-1}\), where \(\mu\) is the mean of distribution F, \(\Sigma\) the variance-covariance matrix.
First, we load the packages MASS and stats.
library(MASS) # For the use of function mvrnorm
library(stats) # For the use of function mahalanobis
Generate a sample of size \(n = 10\) from a bivariate normal distribution with mean \(\mu = (5,10)\), unit variances and correlation 0.5.
mu <- c(5,10)
sigma <- rbind(c(1,0.5), c(0.5,1))
set.seed(23333)
my_sample <- mvrnorm(10, mu, sigma)
my_sample
## [,1] [,2]
## [1,] 4.843196 8.778597
## [2,] 5.463109 8.498375
## [3,] 6.812016 10.340742
## [4,] 5.237737 11.806749
## [5,] 4.957676 10.077101
## [6,] 5.006152 10.351695
## [7,] 6.517056 11.862513
## [8,] 4.906093 8.700336
## [9,] 6.080445 10.831436
## [10,] 6.630384 11.263912
Report the depth of each data point.
m_distance <- mahalanobis(my_sample, mu, sigma)
m_depth <- 1/(1 + m_distance)
tbl <- data.frame(my_sample, m_depth)
tbl
## X1 X2 m_depth
## 1 4.843196 8.778597 0.3614649
## 2 5.463109 8.498375 0.1915824
## 3 6.812016 10.340742 0.2123396
## 4 5.237737 11.806749 0.2059687
## 5 4.957676 10.077101 0.9855466
## 6 5.006152 10.351695 0.8605222
## 7 6.517056 11.862513 0.2029837
## 8 4.906093 8.700336 0.3224561
## 9 6.080445 10.831436 0.4385127
## 10 6.630384 11.263912 0.2546720
Find the bivariate vector that maximizes the depth function.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:MASS':
##
## select
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
tbl1 <- tbl %>% arrange(desc(m_depth))
tbl1[1,]
## X1 X2 m_depth
## 1 4.957676 10.0771 0.9855466