library(pander)
library(outliers)
library(EnvStats)

Practice Exercise 6-2:

Using RStudio, find \(P_{16}\), \(D_{6}\), \(Q_{1}\), \(Q_{3}\) and the interquartile range of the following data set on the total weight, in kilograms, of ready-to-cook chicken inasal leg quarters sold by a frozen foods retail store during selected days of June and July. Does the sample have outliers?

Data: 35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9, 14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0, 19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5, 35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5.

inasal <- c(35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9, 14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0, 19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5, 35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5)

\(~\)

For the \(16^{th}\) percentile, \(P_{16}\):

x1 <- quantile(inasal, 0.16)
pander(round(x1, digits = 2))
16%
12
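As a rough check on what `quantile()` is doing here, the sketch below reproduces R's default (type 7) rule by hand: it locates the fractional position \((n - 1)p + 1\) in the sorted data and interpolates linearly between the two neighbouring values. The helper name `quantile_type7` is hypothetical and only for illustration; it assumes `p` lies between 0 and 1 and the data contain no missing values.

```r
# Data from the exercise (total weight in kilograms).
inasal <- c(35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9,
            14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0,
            19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5,
            35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5)

# Hypothetical helper: the default (type 7) quantile rule written out by hand.
quantile_type7 <- function(x, p) {
  x <- sort(x)
  n <- length(x)
  h <- (n - 1) * p + 1        # fractional position in the sorted data
  lo <- floor(h)              # index of the value just below the position
  hi <- ceiling(h)            # index of the value just above the position
  x[lo] + (h - lo) * (x[hi] - x[lo])   # linear interpolation
}

p16_manual  <- quantile_type7(inasal, 0.16)
p16_builtin <- unname(quantile(inasal, 0.16))
print(c(manual = p16_manual, builtin = p16_builtin))
```

For \(P_{16}\) the position is \(41 \times 0.16 + 1 = 7.56\), which falls between the 7th and 8th sorted values. Both are 12, so the interpolation returns 12.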

\(~\)

For the \(6^{th}\) decile, \(D_{6}\):

x2 <- quantile(inasal, 0.60)
pander(round(x2, digits = 2))
60%
26.5

\(~\)

For the \(1^{st}\) quartile, \(Q_{1}\):

x3 <- quantile(inasal, 0.25)
pander(round(x3, digits = 2))
25%
13.9

\(~\)

For the \(3^{rd}\) quartile, \(Q_{3}\):

x4 <- quantile(inasal, 0.75)
pander(round(x4, digits = 2))
75%
34.65

\(~\)

For the Interquartile Range (IQR):

x5 <- quantile(inasal, 0.75) - quantile(inasal, 0.25)
pander(round(x5, digits = 2))
75%
20.75
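The `75%` label printed above the result is not part of the answer: subtracting two named quantiles keeps the name of the first operand. A possible alternative, sketched below, is the base R function `IQR()`, which returns the same value without a name. The variable names `iqr_builtin` and `iqr_manual` are illustrative only.

```r
# Data from the exercise (total weight in kilograms).
inasal <- c(35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9,
            14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0,
            19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5,
            35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5)

# IQR() uses the same default quantile rule and returns an unnamed number.
iqr_builtin <- IQR(inasal)

# The subtraction used in the text, with the inherited "75%" name dropped.
iqr_manual <- unname(quantile(inasal, 0.75) - quantile(inasal, 0.25))

print(iqr_builtin)
print(iqr_manual)
```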

\(~\)

To check for outliers:

boxplot(inasal, outcol = "red", cex = 1.5)

The boxplot shows no outliers.
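The boxplot's verdict can also be checked numerically. The sketch below assumes the default `boxplot()` settings, under which a point is drawn as an outlier when it lies more than 1.5 times the box length beyond the box edges. The edges are Tukey's hinges from `fivenum()`, which may differ slightly from the quartiles returned by `quantile()` (here 13.7 and 35.0 rather than 13.9 and 34.65). The names `hinges`, `fences`, and `flagged` are illustrative.

```r
# Data from the exercise (total weight in kilograms).
inasal <- c(35.2, 7.0, 24.0, 42.4, 33.0, 27.5, 24.0, 21.0, 8.0, 45.6, 25.9,
            14.8, 29.8, 21.0, 17.5, 9.7, 40.0, 18.8, 57.9, 21.0, 12.0, 12.0,
            19.6, 51.5, 12.0, 36.8, 13.7, 32.8, 12.0, 10.5, 22.5, 19.5, 37.5,
            35.0, 10.5, 33.6, 14.5, 36.5, 17.9, 26.9, 12.0, 41.5)

# fivenum() returns minimum, lower hinge, median, upper hinge, maximum.
hinges <- fivenum(inasal)[c(2, 4)]
box_length <- diff(hinges)

# Fences: 1.5 box lengths below the lower hinge and above the upper hinge.
fences <- c(lower = hinges[1] - 1.5 * box_length,
            upper = hinges[2] + 1.5 * box_length)

# Points outside the fences are the ones the boxplot would draw separately.
flagged <- inasal[inasal < fences["lower"] | inasal > fences["upper"]]

print(fences)
print(flagged)
```

The largest observation, 57.9, lies inside the upper fence, which agrees with the plot.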

rosnerTest(inasal, k=3, warn=TRUE)
## 
## Results of Outlier Test
## -------------------------
## 
## Test Method:                     Rosner's Test for Outliers
## 
## Hypothesized Distribution:       Normal
## 
## Data:                            inasal
## 
## Sample Size:                     42
## 
## Test Statistics:                 R.1 = 2.605014
##                                  R.2 = 2.345154
##                                  R.3 = 2.023399
## 
## Test Statistic Parameter:        k = 3
## 
## Alternative Hypothesis:          Up to 3 observations are not
##                                  from the same Distribution.
## 
## Type I Error:                    5%
## 
## Number of Outliers Detected:     0
## 
##   i   Mean.i     SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 24.87857 12.67611  57.9      19 2.605014   3.056723   FALSE
## 2 1 24.07317 11.69511  51.5      24 2.345154   3.046571   FALSE
## 3 2 23.38750 10.97781  45.6      10 2.023399   3.036097   FALSE

Rosner’s test likewise detects no outliers.

\(~\)

Problem 1:

Consider the following data set on the monthly rent ($) for a sample of one-bedroom apartments:

Data: 425, 430, 430, 435, 435, 435, 435, 435, 440, 440, 440, 440, 440, 445, 445, 445, 445, 445, 450, 450, 450, 450, 450, 450, 450, 460, 460, 460, 465, 465, 465, 470, 470, 472, 475, 475, 475, 480, 480, 480, 480, 485, 490, 490, 490, 500, 500, 500, 500, 510, 510, 515, 525, 525, 525, 535, 549, 550, 570, 570, 575, 575, 580, 590, 600, 600, 600, 600, 615, 615.

Compute \(P_{13}\), \(D_{6}\), \(Q_{1}\), \(Q_{3}\) and the interquartile range. Also, find the outliers, if any, for the given data set. (10 points)

rent <- c(425, 430, 430, 435, 435, 435, 435, 435, 440, 440, 440, 440, 440, 445, 445, 445, 445, 445, 450, 450, 450, 450, 450, 450, 450, 460, 460, 460, 465, 465, 465, 470, 470, 472, 475, 475, 475, 480, 480, 480, 480, 485, 490, 490, 490, 500, 500, 500, 500, 510, 510, 515, 525, 525, 525, 535, 549, 550, 570, 570, 575, 575, 580, 590, 600, 600, 600, 600, 615, 615)

\(~\)

For the \(13^{th}\) percentile, \(P_{13}\):

x1 <- quantile(rent, 0.13)
pander(round(x1, digits = 2))
13%
440

\(~\)

For the \(6^{th}\) decile, \(D_{6}\):

x2 <- quantile(rent, 0.60)
pander(round(x2, digits = 2))
60%
487

\(~\)

For the \(1^{st}\) quartile, \(Q_{1}\):

x3 <- quantile(rent, 0.25)
pander(round(x3, digits = 2))
25%
446.2

Note that pander() displays four significant digits by default, so the exact value 446.25 appears here as 446.2.

\(~\)

For the \(3^{rd}\) quartile, \(Q_{3}\):

x4 <- quantile(rent, 0.75)
pander(round(x4, digits = 2))
75%
522.5

\(~\)

For the Interquartile Range (IQR):

x5 <- quantile(rent, 0.75) - quantile(rent, 0.25)
pander(round(x5, digits = 2))
75%
76.25
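Since `quantile()` accepts a vector of probabilities, the four positional measures for this problem could also be obtained in a single call. The sketch below is one way to do that; the object name `rent_measures` and the labels assigned to it are illustrative, not part of the original solution.

```r
# Monthly rent ($) for the sample of one-bedroom apartments.
rent <- c(425, 430, 430, 435, 435, 435, 435, 435, 440, 440, 440, 440, 440,
          445, 445, 445, 445, 445, 450, 450, 450, 450, 450, 450, 450, 460,
          460, 460, 465, 465, 465, 470, 470, 472, 475, 475, 475, 480, 480,
          480, 480, 485, 490, 490, 490, 500, 500, 500, 500, 510, 510, 515,
          525, 525, 525, 535, 549, 550, 570, 570, 575, 575, 580, 590, 600,
          600, 600, 600, 615, 615)

# One call returns all requested quantiles; relabel them for readability.
rent_measures <- quantile(rent, probs = c(0.13, 0.60, 0.25, 0.75))
names(rent_measures) <- c("P13", "D6", "Q1", "Q3")

# The interquartile range follows from the two quartiles.
rent_iqr <- unname(rent_measures["Q3"] - rent_measures["Q1"])

print(rent_measures)
print(rent_iqr)
```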

\(~\)

To check for outliers:

boxplot(rent, outcol = "red", cex = 1.5)

The boxplot shows no outliers.

\(~\)

rosnerTest(rent, k=3, warn=TRUE)
## 
## Results of Outlier Test
## -------------------------
## 
## Test Method:                     Rosner's Test for Outliers
## 
## Hypothesized Distribution:       Normal
## 
## Data:                            rent
## 
## Sample Size:                     70
## 
## Test Statistics:                 R.1 = 2.269023
##                                  R.2 = 2.376898
##                                  R.3 = 2.208312
## 
## Test Statistic Parameter:        k = 3
## 
## Alternative Hypothesis:          Up to 3 observations are not
##                                  from the same Distribution.
## 
## Type I Error:                    5%
## 
## Number of Outliers Detected:     0
## 
##   i   Mean.i     SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 490.8000 54.73721   615      69 2.269023   3.257596   FALSE
## 2 1 489.0000 53.01027   615      70 2.376898   3.252277   FALSE
## 3 2 487.1471 51.10372   600      65 2.208312   3.246863   FALSE

Rosner’s test likewise detects no outliers.

\(~\)

Problem 2:

Find \(P_{62}\), \(D_{4}\), \(Q_{1}\), \(Q_{3}\) and the interquartile range of the following data set representing the amount, in pesos, saved by a student from her allowance each day. Find the outliers, if any, for the given data set. (10 points)

Data: 16, 23, 37, 42, 48, 55, 56, 59, 62, 63, 65, 66, 66, 69, 69, 70, 71, 71, 73, 73, 74, 75, 75, 75, 77, 78, 81, 82, 82, 83, 86, 89.

savings <- c(16, 23, 37, 42, 48, 55, 56, 59, 62, 63, 65, 66, 66, 69, 69, 70, 71, 71, 73, 73, 74, 75, 75, 75, 77, 78, 81, 82, 82, 83, 86, 89)

\(~\)

For the \(62^{nd}\) percentile, \(P_{62}\):

x1 <- quantile(savings, 0.62)
pander(round(x1, digits = 2))
62%
73.22

\(~\)

For the \(4^{th}\) decile, \(D_{4}\):

x2 <- quantile(savings, 0.40)
pander(round(x2, digits = 2))
40%
67.2

\(~\)

For the \(1^{st}\) quartile, \(Q_{1}\):

x3 <- quantile(savings, 0.25)
pander(round(x3, digits = 2))
25%
61.25

\(~\)

For the \(3^{rd}\) quartile, \(Q_{3}\):

x4 <- quantile(savings, 0.75)
pander(round(x4, digits = 2))
75%
75.5

\(~\)

For the Interquartile Range (IQR):

x5 <- quantile(savings, 0.75) - quantile(savings, 0.25)
pander(round(x5, digits = 2))
75%
14.25

\(~\)

To check for outliers:

boxplot(savings, outcol = "red", cex = 1.5)

The boxplot shows three outliers: 16, 23, and 37.

\(~\)

We now perform Rosner’s test to check whether these outliers are significant:

rosnerTest(savings, k=4, warn=TRUE)
## 
## Results of Outlier Test
## -------------------------
## 
## Test Method:                     Rosner's Test for Outliers
## 
## Hypothesized Distribution:       Normal
## 
## Data:                            savings
## 
## Sample Size:                     32
## 
## Test Statistics:                 R.1 = 2.917284
##                                  R.2 = 3.024578
##                                  R.3 = 2.584623
##                                  R.4 = 2.556553
## 
## Test Statistic Parameter:        k = 4
## 
## Alternative Hypothesis:          Up to 4 observations are not
##                                  from the same Distribution.
## 
## Type I Error:                    5%
## 
## Number of Outliers Detected:     2
## 
##   i   Mean.i     SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 65.96875 17.12852    16       1 2.917284   2.938048    TRUE
## 2 1 67.58065 14.73946    23       2 3.024578   2.923571    TRUE
## 3 2 69.06667 12.40671    37       3 2.584623   2.908473   FALSE
## 4 3 70.17241 11.01969    42       4 2.556553   2.892705   FALSE

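Rosner’s test shows that the values 16 and 23 are outliers. Although R.1 (2.917) falls just below its critical value (2.938), R.2 (3.025) exceeds its critical value (2.924), and the procedure declares every observation up to the last such exceedance an outlier.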
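To make the decision rule behind this output concrete, here is a minimal base R sketch of the generalized ESD procedure on which Rosner's test is based. It is not the EnvStats implementation; the helper `rosner_sketch` is hypothetical, and it assumes the standard critical-value formula based on the t distribution with a 5% type I error. At each step it removes the observation farthest from the current mean, and the number of outliers is the largest step at which the statistic exceeds its critical value.

```r
# Daily savings (pesos) from Problem 2.
savings <- c(16, 23, 37, 42, 48, 55, 56, 59, 62, 63, 65, 66, 66, 69, 69, 70,
             71, 71, 73, 73, 74, 75, 75, 75, 77, 78, 81, 82, 82, 83, 86, 89)

# Hypothetical helper: generalized ESD (Rosner) procedure in base R.
rosner_sketch <- function(x, k, alpha = 0.05) {
  R <- lambda <- value <- numeric(k)
  for (i in seq_len(k)) {
    m <- length(x)                       # observations still in the sample
    dev <- abs(x - mean(x))
    j <- which.max(dev)                  # most extreme remaining observation
    R[i] <- dev[j] / sd(x)               # test statistic for this step
    value[i] <- x[j]
    p <- 1 - alpha / (2 * m)
    t_crit <- qt(p, df = m - 2)
    lambda[i] <- (m - 1) * t_crit / sqrt((m - 2 + t_crit^2) * m)
    x <- x[-j]                           # drop it before the next step
  }
  exceed <- which(R > lambda)
  n_out <- if (length(exceed) > 0) max(exceed) else 0
  list(R = R, lambda = lambda, value = value, n_outliers = n_out)
}

res <- rosner_sketch(savings, k = 4)
print(res)
```

Under these assumptions the sketch should flag the same two values as the output above, because the second step is the last one whose statistic exceeds its critical value.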

\(~\)

We trim the data by removing these two observations and then check whether any outliers remain:

savings1 <- savings[c(-1, -2)]
boxplot(savings1, outcol = "red", cex = 1.5)

The boxplot of the trimmed data shows one outlier, 37.
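Dropping positions 1 and 2 works here because 16 and 23 are the first two entries of `savings` (Obs.Num 1 and 2 in the Rosner output). When the positions are not known in advance, filtering by value is a possible alternative, sketched below. The names `flagged_values` and `savings1_by_value` are illustrative.

```r
# Daily savings (pesos) from Problem 2.
savings <- c(16, 23, 37, 42, 48, 55, 56, 59, 62, 63, 65, 66, 66, 69, 69, 70,
             71, 71, 73, 73, 74, 75, 75, 75, 77, 78, 81, 82, 82, 83, 86, 89)

# Values that Rosner's test declared outliers.
flagged_values <- c(16, 23)

# Keep every observation whose value is not among the flagged ones.
savings1_by_value <- savings[!savings %in% flagged_values]

print(length(savings1_by_value))
print(min(savings1_by_value))
```

Note that a value-based filter removes every occurrence of a flagged value, so it suits data like these in which the flagged values appear only once.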

\(~\)

We perform Rosner’s test again to check whether this outlier is significant:

rosnerTest(savings1, k=3, warn=TRUE)
## 
## Results of Outlier Test
## -------------------------
## 
## Test Method:                     Rosner's Test for Outliers
## 
## Hypothesized Distribution:       Normal
## 
## Data:                            savings1
## 
## Sample Size:                     30
## 
## Test Statistics:                 R.1 = 2.584623
##                                  R.2 = 2.556553
##                                  R.3 = 2.372018
## 
## Test Statistic Parameter:        k = 3
## 
## Alternative Hypothesis:          Up to 3 observations are not
##                                  from the same Distribution.
## 
## Type I Error:                    5%
## 
## Number of Outliers Detected:     0
## 
##   i   Mean.i      SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 69.06667 12.406709    37       1 2.584623   2.908473   FALSE
## 2 1 70.17241 11.019687    42       2 2.556553   2.892705   FALSE
## 3 2 71.17857  9.771666    48       3 2.372018   2.876209   FALSE

These results confirm that the data set has two significant outliers, 16 and 23; the remaining point flagged by the boxplot, 37, is not significant.

\(~\)

Problem 3:

The data that follow show the time (in minutes) that customers waited in line before being served at a fast food restaurant. Assuming the data to be a sample, find \(P_{37}\), \(D_{9}\), \(Q_{1}\), \(Q_{3}\) and the interquartile range. Does the sample have outliers? (10 points)

Data: 3.2, 3.3, 3.5, 3.9, 4.1, 4.4, 4.7, 4.8, 5.2, 5.6, 5.6, 5.7, 5.8, 6.0, 6.2, 6.3, 6.4, 6.5, 6.7, 6.7, 6.9, 7.0, 7.2, 7.5, 8.0, 8.8, 8.9, 9.4, 9.7, 9.9, 10.0, 11.3, 12.4, 12.5, 14.8, 15.0, 16.5, 16.8, 17.2, 19.3.

wait <- c(3.2, 3.3, 3.5, 3.9, 4.1, 4.4, 4.7, 4.8, 5.2, 5.6, 5.6, 5.7, 5.8, 6.0, 6.2, 6.3, 6.4, 6.5, 6.7, 6.7, 6.9, 7.0, 7.2, 7.5, 8.0, 8.8, 8.9, 9.4, 9.7, 9.9, 10.0, 11.3, 12.4, 12.5, 14.8, 15.0, 16.5, 16.8, 17.2, 19.3)

\(~\)

For the \(37^{th}\) percentile, \(P_{37}\):

x1 <- quantile(wait, 0.37)
pander(round(x1, digits = 2))
37%
6.24

\(~\)

For the \(9^{th}\) decile, \(D_{9}\):

x2 <- quantile(wait, 0.90)
pander(round(x2, digits = 2))
90%
15.15

\(~\)

For the \(1^{st}\) quartile, \(Q_{1}\):

x3 <- quantile(wait, 0.25)
pander(round(x3, digits = 2))
25%
5.6

\(~\)

For the \(3^{rd}\) quartile, \(Q_{3}\):

x4 <- quantile(wait, 0.75)
pander(round(x4, digits = 2))
75%
9.93

\(~\)

For the Interquartile Range (IQR):

x5 <- quantile(wait, 0.75) - quantile(wait, 0.25)
pander(round(x5, digits = 2))
75%
4.33

\(~\)

To check for outliers:

boxplot(wait, outcol = "red", cex = 1.5)

The boxplot shows four outliers: 16.5, 16.8, 17.2, and 19.3.
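Reading outliers off a plot can be imprecise, so it may help to list them directly. The sketch below uses `boxplot.stats()` from base R (package grDevices), which applies the same rule that `boxplot()` uses by default and returns the flagged points in its `out` component. The name `wait_outliers` is illustrative.

```r
# Waiting times (minutes) from Problem 3.
wait <- c(3.2, 3.3, 3.5, 3.9, 4.1, 4.4, 4.7, 4.8, 5.2, 5.6, 5.6, 5.7, 5.8,
          6.0, 6.2, 6.3, 6.4, 6.5, 6.7, 6.7, 6.9, 7.0, 7.2, 7.5, 8.0, 8.8,
          8.9, 9.4, 9.7, 9.9, 10.0, 11.3, 12.4, 12.5, 14.8, 15.0, 16.5,
          16.8, 17.2, 19.3)

# Points lying more than 1.5 box lengths beyond the hinges.
wait_outliers <- boxplot.stats(wait)$out

print(wait_outliers)
```

Because four points are flagged, choosing `k` of at least 4 in `rosnerTest()` lets the test consider all of them.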

\(~\)

rosnerTest(wait, k=5, warn=TRUE)
## 
## Results of Outlier Test
## -------------------------
## 
## Test Method:                     Rosner's Test for Outliers
## 
## Hypothesized Distribution:       Normal
## 
## Data:                            wait
## 
## Sample Size:                     40
## 
## Test Statistics:                 R.1 = 2.587545
##                                  R.2 = 2.346746
##                                  R.3 = 2.466020
##                                  R.4 = 2.651018
##                                  R.5 = 2.513303
## 
## Test Statistic Parameter:        k = 5
## 
## Alternative Hypothesis:          Up to 5 observations are not
##                                  from the same Distribution.
## 
## Type I Error:                    5%
## 
## Number of Outliers Detected:     0
## 
##   i   Mean.i     SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
## 1 0 8.342500 4.234709  19.3      40 2.587545   3.036097   FALSE
## 2 1 8.061538 3.894099  17.2      39 2.346746   3.025284   FALSE
## 3 2 7.821053 3.641069  16.8      38 2.466020   3.014109   FALSE
## 4 3 7.578378 3.365357  16.5      37 2.651018   3.002552   FALSE
## 5 4 7.330556 3.051540  15.0      36 2.513303   2.990585   FALSE

Based on Rosner’s test, the data set has no significant outliers, so the points flagged by the boxplot are not treated as outliers.