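Outliers
Create the data: 100 random draws from a chi-squared distribution with 10 degrees of freedom, using a fixed seed so the results are reproducible.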
set.seed(321123)
data <- rchisq(100, df=10)
head(data)
## [1] 8.077467 6.208754 8.679944 12.300834 5.539333 14.603254
View the data with a histogram.
hist(data,
col="orangered")
Calculate a summary and the IQR of the data.
summary(data)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.533 6.855 8.646 9.837 12.380 35.490
IQR(data)
## [1] 5.527023
Make a boxplot.
boxplot(data,
horizontal=TRUE, col="yellow")
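The points that boxplot() draws individually beyond the whiskers can also be listed numerically. A minimal sketch, assuming the same seed and data as above; `bs` is a name introduced here only for illustration. Note that boxplot.stats() works from hinges rather than from quantile(), so its cutoffs may differ slightly from boundaries computed from quartiles.

```r
# Regenerate the data so this block runs on its own.
set.seed(321123)
data <- rchisq(100, df = 10)

# boxplot.stats() returns the numbers behind the boxplot (default coef = 1.5).
bs <- boxplot.stats(data)

# Lower whisker end, lower hinge, median, upper hinge, upper whisker end.
bs$stats

# Values lying beyond the whiskers (plotted as separate points).
bs$out
```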
Assign the first quartile, the median, the third quartile, and the interquartile range to the symbols \(q1\), \(m\), \(q3\), and \(iqr\).
q1 <- as.numeric(quantile(data, 0.25))
m <- as.numeric(quantile(data, 0.50))
q3 <- as.numeric(quantile(data, 0.75))
iqr <- q3 - q1
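As a quick sanity check, the hand-computed interquartile range should agree with R's built-in IQR(), because both rely on the default quantile method. A minimal sketch that regenerates the same data so it runs on its own:

```r
# Regenerate the data so this block runs on its own.
set.seed(321123)
data <- rchisq(100, df = 10)

# Quartiles and the IQR computed by hand.
q1 <- as.numeric(quantile(data, 0.25))
q3 <- as.numeric(quantile(data, 0.75))
iqr <- q3 - q1

# all.equal() compares numbers with a small tolerance for rounding error.
all.equal(iqr, IQR(data))
```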
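Calculate the lower and upper boundaries for outliers: \(q1 - 1.5 \cdot iqr\) and \(q3 + 1.5 \cdot iqr\). Any value below the lower boundary or above the upper boundary is considered an outlier.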
lower.boundary <- q1 - 1.5 * iqr
lower.boundary
## [1] -1.435212
upper.boundary <- q3 + 1.5 * iqr
upper.boundary
## [1] 20.67288
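The two boundary formulas can be wrapped in a small reusable function. This is only a sketch: `outlier.boundaries` and its argument `k` are hypothetical names introduced here, not part of base R, and the multiplier 1.5 is the one used in the formulas above.

```r
# Hypothetical helper: returns the lower and upper outlier boundaries of x.
# k is the multiplier applied to the interquartile range (1.5 by default).
outlier.boundaries <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)  # first and third quartiles
  iqr <- q[2] - q[1]
  c(lower = q[1] - k * iqr, upper = q[2] + k * iqr)
}

# Small example: quartiles are 2 and 4, so the IQR is 2.
b <- outlier.boundaries(c(1, 2, 3, 4, 100))
b
```

For the toy vector, the boundaries are -1 and 7, so the value 100 would be flagged.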
Here are the 10 smallest and the 10 largest values of the data. Are any of these numbers outliers?
head(sort(data), 10)
## [1] 1.532913 3.298883 4.005185 4.577083 4.718302 4.784650 4.849587
## [8] 5.167080 5.395535 5.539333
tail(sort(data), 10)
## [1] 15.38270 15.41619 15.85357 15.85960 16.02138 16.21278 16.68438
## [8] 16.85810 20.15049 35.48660
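Instead of comparing the sorted values by eye, the outliers can be selected with a logical condition. A minimal sketch, assuming the same seed and boundary formulas as above; `is.outlier` and `outliers` are names introduced here for illustration.

```r
# Regenerate the data and the boundaries so this block runs on its own.
set.seed(321123)
data <- rchisq(100, df = 10)
q1 <- as.numeric(quantile(data, 0.25))
q3 <- as.numeric(quantile(data, 0.75))
iqr <- q3 - q1
lower.boundary <- q1 - 1.5 * iqr
upper.boundary <- q3 + 1.5 * iqr

# TRUE for every value outside the two boundaries.
is.outlier <- data < lower.boundary | data > upper.boundary

# The outlying values themselves, and how many there are.
outliers <- data[is.outlier]
outliers
sum(is.outlier)
```

Judging from the sorted values printed above, only 35.48660 lies above the upper boundary of 20.67288, while 20.15049 falls just inside it. No value lies below the lower boundary of -1.435212.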