Chapter 2 Question 36
The accompanying data set contains three variables, x1, x2, and x3.
library(readxl)
myData <- read_excel("Ch2_Q36_Data_File.xlsx")
myData
## # A tibble: 31 × 3
## x1 x2 x3
## <dbl> <dbl> <dbl>
## 1 119 23 174000
## 2 148 22 1000
## 3 140 26 123000
## 4 128 39 17000
## 5 108 14 0
## 6 155 20 95000
## 7 166 24 133000
## 8 213 37 48000
## 9 156 10 55000
## 10 118 17 162000
## # … with 21 more rows
a. Bin the values of x1 into 3 equal size groups. Label the groups with numbers 1 (lowest values) to 3 (highest values). How many observations are assigned to group 1?
# Tercile cut points: the minimum, the 1/3 and 2/3 quantiles, and the maximum of x1
benx1 = quantile(myData$x1, probs = seq(0, 1, by = 1/3))
myData['Binnedx1'] = cut(myData$x1, breaks = benx1, labels = c("1", "2", "3"), include.lowest=TRUE, right=FALSE)
table(myData$Binnedx1)
##
## 1 2 3
## 10 10 11
n = length(which(myData$Binnedx1 == '1'))
sprintf("There are %s x1 observations with the lowest bin value of 1.",n)
## [1] "There are 10 x1 observations with the lowest bin value of 1."
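To see why quantile-based breaks give groups of roughly equal size, here is a minimal sketch on a hypothetical nine-value vector. The names `toy`, `toy_breaks`, and `toy_bins` are made up for illustration and are not part of the data file; the `cut()` arguments mirror the ones used above.

```r
# Hypothetical toy vector (NOT the chapter's data), used only to illustrate equal-size binning
toy <- c(5, 12, 7, 30, 18, 22, 9, 15, 27)

# Cut points at the minimum, the 1/3 and 2/3 quantiles, and the maximum
toy_breaks <- quantile(toy, probs = seq(0, 1, by = 1/3))

# Same arguments as in part a: left-closed intervals, with the maximum kept in the last bin
toy_bins <- cut(toy, breaks = toy_breaks, labels = c("1", "2", "3"),
                include.lowest = TRUE, right = FALSE)

# Each of the three bins receives three of the nine values
table(toy_bins)
```

With 31 observations in the actual data, the groups cannot all be the same size, which is why the table above shows 10, 10, and 11.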
b. Bin the values of x2 into 3 equal interval groups. Label the groups with numbers 1 (lowest values) to 3 (highest values). How many observations are assigned to group 2?
myData['Binnedx2'] = cut(myData$x2, breaks=3, labels = c("1", "2", "3"), include.lowest = TRUE, right = FALSE)
n1 = length(which(myData$Binnedx2 == "2"))
sprintf("There are %s x2 observations with the bin value of 2.", n1)
## [1] "There are 13 x2 observations with the bin value of 2."
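Passing a single number to `breaks` asks `cut()` for intervals of equal width, not groups of equal size. The sketch below uses a hypothetical vector (`toy2` and `toy2_bins` are illustrative names, not part of the data file) to show that equal-width bins can hold very different counts.

```r
# Hypothetical toy vector (NOT the chapter's data), clustered at both ends of its range
toy2 <- c(1, 2, 3, 4, 10, 20, 28, 29, 30, 31)

# breaks = 3 splits the range of toy2 into three intervals of equal width
toy2_bins <- cut(toy2, breaks = 3, labels = c("1", "2", "3"),
                 include.lowest = TRUE, right = FALSE)

# Counts differ across bins even though the interval widths are equal
table(toy2_bins)

# Omitting labels shows the interval boundaries that cut() chose
levels(cut(toy2, breaks = 3, right = FALSE))
```

Here the counts are 5, 1, and 4, since most of the values sit near the two ends of the range.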
c. Bin the values of x3 into the following three groups: < 50,000, between 50,000 and 100,000, and > 100,000. Label the groups with numbers 1 (lowest values) to 3 (highest values). How many observations are assigned to group 1?
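# The lowest break is -1 so that x3 values of 0 fall in group 1 (intervals are right-closed by default)
myData['Binnedx3'] = cut(myData$x3, breaks = c(-1, 50000, 100000, Inf), labels = c("1", "2", "3"))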
n3 = length(which(myData$Binnedx3 =="1"))
sprintf("There are %s x3 observations with the bin value of 1.", n3)
## [1] "There are 8 x3 observations with the bin value of 1."
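The question does not say which group a value of exactly 50,000 or 100,000 belongs to. With the default `right = TRUE`, `cut()` places 50,000 in group 1 and 100,000 in group 2. The sketch below uses hypothetical edge values (`edge`, `right_closed`, and `left_closed` are illustrative names, not part of the data file) to compare this with `right = FALSE`.

```r
# Hypothetical values on and around the two cut points (NOT the chapter's data)
edge <- c(0, 49999, 50000, 50001, 100000, 100001)
edge_breaks <- c(-1, 50000, 100000, Inf)

# Default: intervals are right-closed, (-1, 50000], (50000, 100000], (100000, Inf]
right_closed <- cut(edge, breaks = edge_breaks, labels = c("1", "2", "3"))

# Alternative: intervals are left-closed, [-1, 50000), [50000, 100000), [100000, Inf)
left_closed <- cut(edge, breaks = edge_breaks, labels = c("1", "2", "3"),
                   right = FALSE)

# Side-by-side view: only the rows for 50000 and 100000 differ
data.frame(edge, right_closed, left_closed)
```

If the data contain no values exactly equal to 50,000 or 100,000, both choices give the same counts.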