In this article, a simple multivariate Gaussian distribution is fitted to the pixel colors of an image and used to find the outliers in that image.
## [1] "MLE estimate for mean"
##         r         g         b
## 0.5693976 0.4987922 0.1681461
## [1] "MLE estimate for covariance matrix"
##            [,1]       [,2]       [,3]
## [1,] 0.05288263 0.00000000 0.00000000
## [2,] 0.00000000 0.04169364 0.00000000
## [3,] 0.00000000 0.00000000 0.02009812
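The estimates above can be reproduced along the following lines. This is a minimal Python sketch, not the code that produced the output (which appears to come from R). It assumes the pixels are available as a list of (r, g, b) tuples scaled to [0, 1] and that, as the zero off-diagonal entries in the printed matrix suggest, only per-channel variances are estimated. The names `fit_diagonal_gaussian` and `density` and the toy pixels are hypothetical.

```python
import math

def fit_diagonal_gaussian(pixels):
    """Return the MLE mean and per-channel variance (divides by n, not n - 1)."""
    n = len(pixels)
    dims = len(pixels[0])
    mean = [sum(p[j] for p in pixels) / n for j in range(dims)]
    var = [sum((p[j] - mean[j]) ** 2 for p in pixels) / n for j in range(dims)]
    return mean, var

def density(x, mean, var):
    """Gaussian density with a diagonal covariance matrix.

    With zero off-diagonal terms the joint density is the product of
    one-dimensional Gaussian densities, one per color channel.
    """
    p = 1.0
    for xj, mj, vj in zip(x, mean, var):
        p *= math.exp(-((xj - mj) ** 2) / (2.0 * vj)) / math.sqrt(2.0 * math.pi * vj)
    return p

# Toy pixels (hypothetical values, not taken from the image in the article).
pixels = [(0.5, 0.5, 0.1), (0.6, 0.4, 0.2), (0.7, 0.6, 0.3), (0.4, 0.5, 0.2)]
mean, var = fit_diagonal_gaussian(pixels)
```

A pixel is then a candidate outlier when `density(pixel, mean, var)` is small. Because the variances are well below 1, the density values themselves can exceed 1, which is consistent with a threshold that is not a tiny number.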
## [1] "Visualizing Gaussian fit"
Finally, the image dataset is divided into training and validation datasets.
The two portions cut out of the image (shown in white) are used as the validation dataset: the first one (the orange points) with label 1, since we want orange to be detected as an outlier, and the second one with label 0, as shown below. The rest of the image is used as the training dataset, from which the parameters of the multivariate Gaussian fit are estimated.
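The split described above can be sketched as follows. This is a hypothetical Python illustration, not the article's own code: it assumes the image is a 2-D grid (list of rows) of (r, g, b) tuples and that both cut-out regions are axis-aligned rectangles, which the article does not state. The function `split_pixels` and the box coordinates are made up for the example.

```python
def split_pixels(image, outlier_box, normal_box):
    """Split a 2-D grid of pixels into a training set and a labelled validation set.

    Each box is (row_start, row_end, col_start, col_end), end-exclusive.
    Pixels in outlier_box get label 1, pixels in normal_box get label 0,
    and every remaining pixel goes to the training set.
    """
    def inside(r, c, box):
        r0, r1, c0, c1 = box
        return r0 <= r < r1 and c0 <= c < c1

    train, val_x, val_y = [], [], []
    for r, row in enumerate(image):
        for c, px in enumerate(row):
            if inside(r, c, outlier_box):
                val_x.append(px)
                val_y.append(1)
            elif inside(r, c, normal_box):
                val_x.append(px)
                val_y.append(0)
            else:
                train.append(px)
    return train, val_x, val_y

# Toy 4x4 "image" with hypothetical pixel values and box positions.
image = [[(r / 10, c / 10, 0.0) for c in range(4)] for r in range(4)]
train, val_x, val_y = split_pixels(image, (0, 2, 0, 2), (2, 4, 2, 4))
```

Only `train` would be passed to the parameter estimation, so the labelled pixels do not influence the fitted mean and variances.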
## [1] "MLE estimate for mean from the training dataset"
##         r         g         b
## 0.5409833 0.4860903 0.1600407
## [1] "MLE estimate for covariance matrix from the training dataset"
##            [,1]       [,2]       [,3]
## [1,] 0.04874885 0.00000000 0.00000000
## [2,] 0.00000000 0.04163396 0.00000000
## [3,] 0.00000000 0.00000000 0.01906682
## [1] "Best epsilon found using cross-validation: 2.030149e-01"
## [1] "Best F1 on Cross Validation Set: 0.774317"
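One common way to arrive at such a threshold is to scan candidate values of epsilon between the smallest and largest validation density and keep the one with the highest F1 score. The Python sketch below illustrates that idea; it is an assumption about the procedure, not the code behind the output above. The names `f1_score` and `select_epsilon`, the number of scan steps, and the toy densities are all hypothetical.

```python
def f1_score(labels, predictions):
    """F1 score for binary labels, where 1 marks an outlier."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_epsilon(densities, labels, steps=1000):
    """Scan thresholds between min and max density; keep the best F1."""
    lo, hi = min(densities), max(densities)
    best_eps, best_f1 = lo, 0.0
    for i in range(1, steps + 1):
        eps = lo + (hi - lo) * i / steps
        # A point is flagged as an outlier when its density falls below eps.
        preds = [1 if d < eps else 0 for d in densities]
        score = f1_score(labels, preds)
        if score > best_f1:
            best_eps, best_f1 = eps, score
    return best_eps, best_f1

# Toy validation densities and labels (hypothetical values).
val_density = [0.05, 0.10, 0.15, 0.90, 1.20, 1.50]
val_label = [1, 1, 1, 0, 0, 0]
best_eps, best_f1 = select_epsilon(val_density, val_label)
```

F1 is used rather than accuracy because outliers are usually a small fraction of the pixels, so a detector that flags nothing could still score high accuracy.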