library(pander)
library(outliers)
library(EnvStats)
Using RStudio, find \(P_{16}\), \(D_{6}\), \(Q_{1}\), \(Q_{3}\) and the interquartile range of the following data set on the total weight, in kilograms, of ready-to-cook chicken inasal leg quarters sold by a frozen foods retail store during selected days of June and July. Does the sample have outliers?
Data: 35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9, 14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0, 19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5, 35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5.
inasal <- c(35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9, 14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0, 19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5, 35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5)
\(~\)
For the \(16^{th}\) percentile, \(P_{16}\):
x1 <- quantile(inasal, 0.16)
pander(round(x1, digits = 2))
16% |
---|
12 |
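To make the value above easier to check by hand, here is a minimal sketch of the linear interpolation that `quantile()` performs with its default setting (`type = 7`). The helper name `q7` is hypothetical and only for illustration; it is not part of any package.

```r
# Sample from the exercise (weights in kg).
inasal <- c(35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9,
            14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0,
            19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5,
            35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5)

# Hypothetical helper: interpolate between neighbouring order statistics,
# mirroring the default (type = 7) rule of quantile().
q7 <- function(x, p) {
  x <- sort(x)
  h <- (length(x) - 1) * p + 1   # fractional position in the sorted data
  lo <- floor(h)
  hi <- ceiling(h)
  x[lo] + (h - lo) * (x[hi] - x[lo])
}

# For p = 0.16 and n = 42 the position is 7.56; the 7th and 8th sorted
# values are both 12, so the result is 12.
p16_manual <- q7(inasal, 0.16)
p16_builtin <- unname(quantile(inasal, 0.16))  # unname() drops the "16%" label
all.equal(p16_manual, p16_builtin)             # TRUE
```

The same helper reproduces the other quantiles in this problem, for example `q7(inasal, 0.60)` for \(D_{6}\).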
\(~\)
For the \(6^{th}\) decile, \(D_{6}\):
x2 <- quantile(inasal, 0.60)
pander(round(x2, digits = 2))
60% |
---|
26.5 |
\(~\)
For the \(1^{st}\) quartile, \(Q_{1}\):
x3 <- quantile(inasal, 0.25)
pander(round(x3, digits = 2))
25% |
---|
13.9 |
\(~\)
For the \(3^{rd}\) quartile, \(Q_{3}\):
x4 <- quantile(inasal, 0.75)
pander(round(x4, digits = 2))
75% |
---|
34.65 |
\(~\)
For the Interquartile Range (IQR):
x5 <- quantile(inasal, 0.75) - quantile(inasal, 0.25)
pander(round(x5, digits = 2))
75% |
---|
20.75 |
\(~\)
To check for outliers:
boxplot(inasal, outcol = "red", cex = 1.5)
The boxplot shows no outliers.
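The visual reading can be cross-checked numerically with the usual 1.5 × IQR rule. The sketch below uses the quartiles computed earlier; note that `boxplot()` itself works from hinges rather than `quantile()` quartiles, so its fences can differ slightly, although the conclusion here is the same. The variable names are illustrative only.

```r
inasal <- c(35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9,
            14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0,
            19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5,
            35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5)

q1 <- unname(quantile(inasal, 0.25))   # 13.9
q3 <- unname(quantile(inasal, 0.75))   # 34.65
iqr <- IQR(inasal)                     # same as q3 - q1, i.e. 20.75

# Fences of the 1.5 * IQR rule.
lower_fence <- q1 - 1.5 * iqr          # -17.225
upper_fence <- q3 + 1.5 * iqr          # 65.775

# Observations outside the fences; none for this sample.
flagged <- inasal[inasal < lower_fence | inasal > upper_fence]
length(flagged)                        # 0
```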
rosnerTest(inasal, k=3, warn=TRUE)
##
## Results of Outlier Test
## -------------------------
##
## Test Method: Rosner's Test for Outliers
##
## Hypothesized Distribution: Normal
##
## Data: inasal
##
## Sample Size: 42
##
## Test Statistics: R.1 = 2.605014
## R.2 = 2.345154
## R.3 = 2.023399
##
## Test Statistic Parameter: k = 3
##
## Alternative Hypothesis: Up to 3 observations are not
## from the same Distribution.
##
## Type I Error: 5%
##
## Number of Outliers Detected: 0
##
##   i   Mean.i     SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 24.87857 12.67611  57.9      19 2.605014   3.056723   FALSE
## 2 1 24.07317 11.69511  51.5      24 2.345154   3.046571   FALSE
## 3 2 23.38750 10.97781  45.6      10 2.023399   3.036097   FALSE
Rosner’s test also detects no outliers: every test statistic R.i+1 is below its critical value lambda.i+1.
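To show where the R.i+1 and lambda.i+1 columns come from, the following is a rough base-R sketch of the generalized ESD procedure behind Rosner's test. It assumes the standard formulation: at each step the most extreme remaining observation is standardized and compared with a critical value based on a t quantile. The helper `rosner_steps` is hypothetical and is not the `EnvStats` implementation.

```r
inasal <- c(35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9,
            14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0,
            19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5,
            35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5)

# Hypothetical helper: test statistic and critical value for steps 1..k.
rosner_steps <- function(x, k, alpha = 0.05) {
  n <- length(x)
  out <- data.frame(step = seq_len(k), value = NA_real_,
                    R = NA_real_, lambda = NA_real_)
  for (i in seq_len(k)) {
    dev <- abs(x - mean(x))
    j <- which.max(dev)               # most extreme remaining observation
    out$value[i] <- x[j]
    out$R[i] <- dev[j] / sd(x)        # standardized distance from the mean
    p <- 1 - alpha / (2 * (n - i + 1))
    t_crit <- qt(p, df = n - i - 1)
    out$lambda[i] <- (n - i) * t_crit /
      sqrt((n - i - 1 + t_crit^2) * (n - i + 1))
    x <- x[-j]                        # remove it before the next step
  }
  out
}

steps <- rosner_steps(inasal, k = 3)
steps

# Number of outliers: the largest step at which R exceeds lambda (0 if none).
exceed <- which(steps$R > steps$lambda)
n_outliers <- if (length(exceed) > 0) max(exceed) else 0
n_outliers
```

The `R` and `lambda` columns should line up with the R.i+1 and lambda.i+1 columns printed by `rosnerTest()`.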
\(~\)
Consider the following data set on the monthly rent ($) for a sample of one-bedroom apartments:
Data: 425, 430, 430, 435, 435, 435, 435, 435, 440, 440, 440, 440, 440, 445, 445, 445, 445, 445, 450, 450, 450, 450, 450, 450, 450, 460, 460, 460, 465, 465, 465, 470, 470, 472, 475, 475, 475, 480, 480, 480, 480, 485, 490, 490, 490, 500, 500, 500, 500, 510, 510, 515, 525, 525, 525, 535, 549, 550, 570, 570, 575, 575, 580, 590, 600, 600, 600, 600, 615, 615.
Compute \(P_{13}\), \(D_{6}\), \(Q_{1}\), \(Q_{3}\) and the interquartile range. Also, find the outliers, if any, for the given data set. (10 points)
rent <- c(425, 430, 430, 435, 435, 435, 435, 435, 440, 440, 440, 440, 440, 445, 445, 445, 445, 445, 450, 450, 450, 450, 450, 450, 450, 460, 460, 460, 465, 465, 465, 470, 470, 472, 475, 475, 475, 480, 480, 480, 480, 485, 490, 490, 490, 500, 500, 500, 500, 510, 510, 515, 525, 525, 525, 535, 549, 550, 570, 570, 575, 575, 580, 590, 600, 600, 600, 600, 615, 615)
\(~\)
For the \(13^{th}\) percentile, \(P_{13}\):
x1 <- quantile(rent, 0.13)
pander(round(x1, digits = 2))
13% |
---|
440 |
\(~\)
For the \(6^{th}\) decile, \(D_{6}\):
x2 <- quantile(rent, 0.60)
pander(round(x2, digits = 2))
60% |
---|
487 |
\(~\)
For the \(1^{st}\) quartile, \(Q_{1}\):
x3 <- quantile(rent, 0.25)
pander(round(x3, digits = 2))
25% |
---|
446.2 |
\(~\)
For the \(3^{rd}\) quartile, \(Q_{3}\):
x4 <- quantile(rent, 0.75)
pander(round(x4, digits = 2))
75% |
---|
522.5 |
\(~\)
For the Interquartile Range (IQR):
x5 <- quantile(rent, 0.75) - quantile(rent, 0.25)
pander(round(x5, digits = 2))
75% |
---|
76.25 |
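As a compact cross-check, all four quantiles can be requested in a single `quantile()` call. This sketch also prints the unrounded first quartile, 446.25; the table above shows 446.2, most likely because `pander()` prints four significant digits by default. The labels `P13`, `D6`, `Q1` and `Q3` are illustrative names, not output of any function.

```r
rent <- c(425, 430, 430, 435, 435, 435, 435, 435, 440, 440, 440, 440, 440,
          445, 445, 445, 445, 445, 450, 450, 450, 450, 450, 450, 450, 460,
          460, 460, 465, 465, 465, 470, 470, 472, 475, 475, 475, 480, 480,
          480, 480, 485, 490, 490, 490, 500, 500, 500, 500, 510, 510, 515,
          525, 525, 525, 535, 549, 550, 570, 570, 575, 575, 580, 590, 600,
          600, 600, 600, 615, 615)

# Probabilities for the requested percentile, decile, and quartiles.
probs <- c(P13 = 0.13, D6 = 0.60, Q1 = 0.25, Q3 = 0.75)

q <- quantile(rent, probs, names = FALSE)  # default type = 7 interpolation
names(q) <- names(probs)
q                                          # 440, 487, 446.25, 522.5

iqr <- q[["Q3"]] - q[["Q1"]]               # 76.25
iqr
```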
\(~\)
To check for outliers:
boxplot(rent, outcol = "red", cex = 1.5)
The boxplot shows no outliers.
\(~\)
rosnerTest(rent, k=3, warn=TRUE)
##
## Results of Outlier Test
## -------------------------
##
## Test Method: Rosner's Test for Outliers
##
## Hypothesized Distribution: Normal
##
## Data: rent
##
## Sample Size: 70
##
## Test Statistics: R.1 = 2.269023
## R.2 = 2.376898
## R.3 = 2.208312
##
## Test Statistic Parameter: k = 3
##
## Alternative Hypothesis: Up to 3 observations are not
## from the same Distribution.
##
## Type I Error: 5%
##
## Number of Outliers Detected: 0
##
##   i   Mean.i     SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 490.8000 54.73721   615      69 2.269023   3.257596   FALSE
## 2 1 489.0000 53.01027   615      70 2.376898   3.252277   FALSE
## 3 2 487.1471 51.10372   600      65 2.208312   3.246863   FALSE
Rosner’s test also detects no outliers: every test statistic R.i+1 is below its critical value lambda.i+1.
\(~\)
Find \(P_{62}\), \(D_{4}\), \(Q_{1}\), \(Q_{3}\) and the interquartile range of the following data set representing the amount, in pesos, saved by a student from her allowance each day. Find the outliers, if any, for the given data set. (10 points)
Data: 16, 23, 37, 42, 48, 55, 56, 59, 62, 63, 65, 66, 66, 69, 69, 70, 71, 71, 73, 73, 74, 75, 75, 75, 77, 78, 81, 82, 82, 83, 86, 89.
savings <- c(16, 23, 37, 42, 48, 55, 56, 59, 62, 63, 65, 66, 66, 69, 69, 70, 71, 71, 73, 73, 74, 75, 75, 75, 77, 78, 81, 82, 82, 83, 86, 89)
\(~\)
For the \(62^{nd}\) percentile, \(P_{62}\):
x1 <- quantile(savings, 0.62)
pander(round(x1, digits = 2))
62% |
---|
73.22 |
\(~\)
For the \(4^{th}\) decile, \(D_{4}\):
x2 <- quantile(savings, 0.40)
pander(round(x2, digits = 2))
40% |
---|
67.2 |
\(~\)
For the \(1^{st}\) quartile, \(Q_{1}\):
x3 <- quantile(savings, 0.25)
pander(round(x3, digits = 2))
25% |
---|
61.25 |
\(~\)
For the \(3^{rd}\) quartile, \(Q_{3}\):
x4 <- quantile(savings, 0.75)
pander(round(x4, digits = 2))
75% |
---|
75.5 |
\(~\)
For the Interquartile Range (IQR):
x5 <- quantile(savings, 0.75) - quantile(savings, 0.25)
pander(round(x5, digits = 2))
75% |
---|
14.25 |
\(~\)
To check for outliers:
boxplot(savings, outcol = "red", cex = 1.5)
The boxplot shows three outliers (16, 23, and 37), all below the lower whisker.
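The points drawn beyond the whiskers can also be listed directly with `boxplot.stats()`, the base-R function that computes the numbers behind a boxplot. A short sketch, with `bs` as an arbitrary variable name:

```r
savings <- c(16, 23, 37, 42, 48, 55, 56, 59, 62, 63, 65, 66, 66, 69, 69, 70,
             71, 71, 73, 73, 74, 75, 75, 75, 77, 78, 81, 82, 82, 83, 86, 89)

# Default coef = 1.5: points more than 1.5 * (hinge spread) beyond a hinge
# are reported in $out.
bs <- boxplot.stats(savings)

bs$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
bs$out    # 16 23 37
```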
\(~\)
We now run Rosner’s test to check whether these outliers are significant:
rosnerTest(savings, k=4, warn=TRUE)
##
## Results of Outlier Test
## -------------------------
##
## Test Method: Rosner's Test for Outliers
##
## Hypothesized Distribution: Normal
##
## Data: savings
##
## Sample Size: 32
##
## Test Statistics: R.1 = 2.917284
## R.2 = 3.024578
## R.3 = 2.584623
## R.4 = 2.556553
##
## Test Statistic Parameter: k = 4
##
## Alternative Hypothesis: Up to 4 observations are not
## from the same Distribution.
##
## Type I Error: 5%
##
## Number of Outliers Detected: 2
##
##   i   Mean.i     SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 65.96875 17.12852    16       1 2.917284   2.938048    TRUE
## 2 1 67.58065 14.73946    23       2 3.024578   2.923571    TRUE
## 3 2 69.06667 12.40671    37       3 2.584623   2.908473   FALSE
## 4 3 70.17241 11.01969    42       4 2.556553   2.892705   FALSE
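Rosner’s test shows that the values 16 and 23 are outliers. Although R.1 (2.917284) falls just below lambda.1 (2.938048), R.2 (3.024578) exceeds lambda.2 (2.923571). The procedure takes the largest step at which the statistic exceeds its critical value, so both of the first two observations are declared outliers.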
\(~\)
We trim the data by removing these two observations, then check whether any outliers remain:
savings1 <- savings[c(-1, -2)]
boxplot(savings1, outcol = "red", cex = 1.5)
The boxplot of the trimmed data still shows one outlier (37).
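The trimming step above drops positions 1 and 2, which works because 16 and 23 are the first two elements of `savings` (Obs.Num 1 and 2 in the Rosner output). As a sketch of a slightly more defensive alternative, the same observations can be removed by value; the names `flagged`, `savings1_by_value` and `savings1_by_index` are illustrative only.

```r
savings <- c(16, 23, 37, 42, 48, 55, 56, 59, 62, 63, 65, 66, 66, 69, 69, 70,
             71, 71, 73, 73, 74, 75, 75, 75, 77, 78, 81, 82, 82, 83, 86, 89)

flagged <- c(16, 23)                           # values declared outliers

# Remove by value: does not depend on where the values sit in the vector.
savings1_by_value <- savings[!savings %in% flagged]

# Remove by position, as in the solution above.
savings1_by_index <- savings[c(-1, -2)]

identical(savings1_by_value, savings1_by_index)  # TRUE
length(savings1_by_value)                        # 30
```

Removing by value would also drop any repeated occurrences of a flagged value, which may or may not be what is wanted in other data sets.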
\(~\)
We perform Rosner’s test again to check whether this outlier is significant:
rosnerTest(savings1, k=3, warn=TRUE)
##
## Results of Outlier Test
## -------------------------
##
## Test Method: Rosner's Test for Outliers
##
## Hypothesized Distribution: Normal
##
## Data: savings1
##
## Sample Size: 30
##
## Test Statistics: R.1 = 2.584623
## R.2 = 2.556553
## R.3 = 2.372018
##
## Test Statistic Parameter: k = 3
##
## Alternative Hypothesis: Up to 3 observations are not
## from the same Distribution.
##
## Type I Error: 5%
##
## Number of Outliers Detected: 0
##
##   i   Mean.i      SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 69.06667 12.406709    37       1 2.584623   2.908473   FALSE
## 2 1 70.17241 11.019687    42       2 2.556553   2.892705   FALSE
## 3 2 71.17857  9.771666    48       3 2.372018   2.876209   FALSE
No further outliers are detected in the trimmed data, so the data set has two significant outliers: 16 and 23.
\(~\)
The following data show the time (in minutes) that customers wait in line before being served at a fast food restaurant. Assuming the data to be a sample, find \(P_{37}\), \(D_{9}\), \(Q_{1}\), \(Q_{3}\) and the interquartile range. Does the sample have outliers? (10 points)
Data: 3.2, 3.3, 3.5, 3.9, 4.1, 4.4, 4.7, 4.8, 5.2, 5.6, 5.6, 5.7, 5.8, 6.0, 6.2, 6.3, 6.4, 6.5, 6.7, 6.7, 6.9, 7.0, 7.2, 7.5, 8.0, 8.8, 8.9, 9.4, 9.7, 9.9, 10.0, 11.3, 12.4, 12.5, 14.8, 15.0, 16.5, 16.8, 17.2, 19.3.
wait <- c(3.2, 3.3, 3.5, 3.9, 4.1, 4.4, 4.7, 4.8, 5.2, 5.6, 5.6, 5.7, 5.8, 6.0, 6.2, 6.3, 6.4, 6.5, 6.7, 6.7, 6.9, 7.0, 7.2, 7.5, 8.0, 8.8, 8.9, 9.4, 9.7, 9.9, 10.0, 11.3, 12.4, 12.5, 14.8, 15.0, 16.5, 16.8, 17.2, 19.3)
\(~\)
For the \(37^{th}\) percentile, \(P_{37}\):
x1 <- quantile(wait, 0.37)
pander(round(x1, digits = 2))
37% |
---|
6.24 |
\(~\)
For the \(9^{th}\) decile, \(D_{9}\):
x2 <- quantile(wait, 0.90)
pander(round(x2, digits = 2))
90% |
---|
15.15 |
\(~\)
For the \(1^{st}\) quartile, \(Q_{1}\):
x3 <- quantile(wait, 0.25)
pander(round(x3, digits = 2))
25% |
---|
5.6 |
\(~\)
For the \(3^{rd}\) quartile, \(Q_{3}\):
x4 <- quantile(wait, 0.75)
pander(round(x4, digits = 2))
75% |
---|
9.93 |
\(~\)
For the Interquartile Range (IQR):
x5 <- quantile(wait, 0.75) - quantile(wait, 0.25)
pander(round(x5, digits = 2))
75% |
---|
4.33 |
\(~\)
To check for outliers:
boxplot(wait, outcol = "red", cex = 1.5)
The boxplot shows four outliers (16.5, 16.8, 17.2, and 19.3), all above the upper whisker.
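To see which points the boxplot flags and why, the fences can be rebuilt by hand. `boxplot()` uses the hinges returned by `fivenum()` rather than the `quantile()` quartiles, so the upper hinge below (9.95) differs slightly from the \(Q_{3}\) reported above (9.93 after rounding). The variable names in this sketch are illustrative only.

```r
wait <- c(3.2, 3.3, 3.5, 3.9, 4.1, 4.4, 4.7, 4.8, 5.2, 5.6, 5.6, 5.7, 5.8,
          6.0, 6.2, 6.3, 6.4, 6.5, 6.7, 6.7, 6.9, 7.0, 7.2, 7.5, 8.0, 8.8,
          8.9, 9.4, 9.7, 9.9, 10.0, 11.3, 12.4, 12.5, 14.8, 15.0, 16.5, 16.8,
          17.2, 19.3)

five <- fivenum(wait)      # min, lower hinge, median, upper hinge, max
lower_hinge <- five[2]     # 5.6
upper_hinge <- five[4]     # 9.95
spread <- upper_hinge - lower_hinge

# Fences at 1.5 times the hinge spread, the default used by boxplot().
lower_fence <- lower_hinge - 1.5 * spread
upper_fence <- upper_hinge + 1.5 * spread   # about 16.475

# Observations outside the fences.
flagged <- wait[wait < lower_fence | wait > upper_fence]
flagged                                     # 16.5 16.8 17.2 19.3
```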
\(~\)
rosnerTest(wait, k=5, warn=TRUE)
##
## Results of Outlier Test
## -------------------------
##
## Test Method: Rosner's Test for Outliers
##
## Hypothesized Distribution: Normal
##
## Data: wait
##
## Sample Size: 40
##
## Test Statistics: R.1 = 2.587545
## R.2 = 2.346746
## R.3 = 2.466020
## R.4 = 2.651018
## R.5 = 2.513303
##
## Test Statistic Parameter: k = 5
##
## Alternative Hypothesis: Up to 5 observations are not
## from the same Distribution.
##
## Type I Error: 5%
##
## Number of Outliers Detected: 0
##
##   i   Mean.i     SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 8.342500 4.234709  19.3      40 2.587545   3.036097   FALSE
## 2 1 8.061538 3.894099  17.2      39 2.346746   3.025284   FALSE
## 3 2 7.821053 3.641069  16.8      38 2.466020   3.014109   FALSE
## 4 3 7.578378 3.365357  16.5      37 2.651018   3.002552   FALSE
## 5 4 7.330556 3.051540  15.0      36 2.513303   2.990585   FALSE
Based on Rosner’s test, none of the points flagged by the boxplot is a significant outlier at the 5% level, so we conclude that the data set has no outliers.