Chapter 2 Question 36

The accompanying data set contains three variables, x1, x2, and x3.

library(readxl)
myData <- read_excel("Ch2_Q36_Data_File.xlsx")
myData
## # A tibble: 31 × 3
##       x1    x2     x3
##    <dbl> <dbl>  <dbl>
##  1   119    23 174000
##  2   148    22   1000
##  3   140    26 123000
##  4   128    39  17000
##  5   108    14      0
##  6   155    20  95000
##  7   166    24 133000
##  8   213    37  48000
##  9   156    10  55000
## 10   118    17 162000
## # … with 21 more rows

a. Bin the values of x1 into 3 equal size groups. Label the groups with numbers 1 (lowest values) to 3 (highest values). How many observations are assigned to group 1?

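# Tercile cut points for x1: minimum, 1/3 and 2/3 quantiles, maximum
benx1 = quantile(myData$x1, probs = seq(0, 1, by = 1/3))

# right = FALSE makes each interval [a, b); include.lowest = TRUE closes the
# last interval so the maximum of x1 is not dropped as NA
myData['Binnedx1'] = cut(myData$x1, breaks = benx1, labels = c("1", "2", "3"), include.lowest=TRUE, right=FALSE)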

table(myData$Binnedx1)
## 
##  1  2  3 
## 10 10 11
n = length(which(myData$Binnedx1 == '1'))
sprintf("There are %s x1 observations with the lowest bin value of 1.",n)
## [1] "There are 10 x1 observations with the lowest bin value of 1."
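
The equal-size binning in part a can be illustrated on a small made-up vector. This is only a sketch: `toy`, `toyBreaks`, and `toyBins` are hypothetical names and values, not part of the data file. It uses the same `quantile()` and `cut()` arguments as above to show that a value sitting exactly on a cut point goes to the higher group when `right = FALSE`, which is one reason the group counts need not be exactly equal.

```r
# Hypothetical toy vector (not from Ch2_Q36_Data_File.xlsx)
toy <- c(5, 12, 18, 23, 31, 40, 47)

# Tercile cut points: minimum, 1/3 and 2/3 quantiles, maximum
toyBreaks <- quantile(toy, probs = seq(0, 1, by = 1/3))

# Same arguments as in part a: intervals are [a, b), and
# include.lowest = TRUE closes the last interval so the maximum is kept
toyBins <- cut(toy, breaks = toyBreaks, labels = c("1", "2", "3"),
               include.lowest = TRUE, right = FALSE)

toyBreaks
table(toyBins)
```

Printing `toyBreaks` next to `table(toyBins)` makes it easy to check which observations fall on a boundary.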

b. Bin the values of x2 into 3 equal interval groups. Label the groups with numbers 1 (lowest values) to 3 (highest values). How many observations are assigned to group 2?

# breaks = 3 asks cut() to split the range of x2 into three equal-width intervals
myData['Binnedx2'] = cut(myData$x2, breaks=3, labels = c("1", "2", "3"), include.lowest = TRUE, right = FALSE)
n1 = length(which(myData$Binnedx2 == "2"))
sprintf("There are %s x2 observations with the bin value of 2.", n1)
## [1] "There are 13 x2 observations with the bin value of 2."
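
When `breaks` is a single number, `cut()` divides the range of the data into that many equal-width pieces and then moves the outer limits out by 0.1% of the range. The sketch below, on a made-up vector (`toy`, `manualBreaks`, `manualBins`, and `autoBins` are hypothetical names), builds the equal-width cut points by hand with `seq()` and compares the result with `breaks = 3`.

```r
# Hypothetical toy vector (not from the data file)
toy <- c(10, 14, 17, 20, 22, 24, 26, 37, 39)

# Equal-width cut points across the observed range: 4 points give 3 intervals
rng <- range(toy)
manualBreaks <- seq(rng[1], rng[2], length.out = 4)

# Binning with the hand-built cut points
manualBins <- cut(toy, breaks = manualBreaks, labels = c("1", "2", "3"),
                  include.lowest = TRUE, right = FALSE)

# Binning with breaks = 3, as in part b
autoBins <- cut(toy, breaks = 3, labels = c("1", "2", "3"),
                include.lowest = TRUE, right = FALSE)

manualBreaks
table(manualBins)
all(manualBins == autoBins)
```

Unlike equal-size binning, the interval widths are the same here, so the counts per group depend entirely on how the values are spread over the range.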

c. Bin the values of x3 into the following three groups: <50,000, between 50,000 and 100,000, and > 100,000. Label the groups with numbers 1 (lowest values) to 3 (highest values). How many observations are assigned to group 1?

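# The lower bound of -1 keeps x3 values of 0 in group 1. With the default
# right = TRUE, each interval includes its upper cut point: (a, b]
myData['Binnedx3'] = cut(myData$x3, breaks = c(-1, 50000, 100000, Inf), labels = c("1", "2", "3"))
n3 = length(which(myData$Binnedx3 =="1"))
sprintf("There are %s x3 observations with the bin value of 1.", n3)
## [1] "There are 8 x3 observations with the bin value of 1."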
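
With explicit cut points, the `right` argument decides which group a value exactly equal to a cut point lands in. The sketch below uses made-up values placed on and around the two cut points (`toy`, `cutPoints`, `rightClosed`, and `leftClosed` are hypothetical names) and uses `-Inf` as the lower bound, an assumption chosen so that no value can fall outside the first interval.

```r
# Hypothetical values chosen to sit on and around the cut points
toy <- c(0, 49999, 50000, 50001, 100000, 100001)

cutPoints <- c(-Inf, 50000, 100000, Inf)

# Default right = TRUE: intervals are (a, b], so 50000 falls in group 1
rightClosed <- cut(toy, breaks = cutPoints, labels = c("1", "2", "3"))

# right = FALSE: intervals are [a, b), so 50000 falls in group 2
leftClosed <- cut(toy, breaks = cutPoints, labels = c("1", "2", "3"),
                  right = FALSE)

data.frame(toy, rightClosed, leftClosed)
```

Whether 50,000 and 100,000 belong to the lower or the upper group is not spelled out in the question, so it is worth checking whether any x3 value sits exactly on a cut point before relying on the count.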