Displaying and Summarizing 1 Variable Quantitative Data

Problem 42

Marijuana<-c(21,32,21,22,4,44,23,23,9,11,22,27,6,27,16,13,39,39,27,16,13,10,28,9,18,15,3,22,27,28,7,40,4,21,38)
hist(Marijuana, col = "lightgreen", main = "Percent of 16 year olds who tried weed", xlab = "% of 16 Y/Os who tried")

stem(Marijuana)
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##   0 | 344
##   0 | 6799
##   1 | 0133
##   1 | 5668
##   2 | 11122233
##   2 | 777788
##   3 | 2
##   3 | 899
##   4 | 04
boxplot(Marijuana, main ="Weed enthusiasts", col = "magenta")

summary(Marijuana)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3.00   12.00   21.00   20.71   27.00   44.00

The data is unimodal with a peak around 21% of the teen population. The values range from a low of 3% to a high of 44%; neither extreme is an outlier by the 1.5 x IQR rule (fences at -10.5% and 49.5%).
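To check whether the extremes count as outliers, one common convention is the 1.5 x IQR rule. The sketch below is only one way to do that check; the names lower_fence, upper_fence and flagged are made up for illustration, and the data vector is repeated so the snippet runs on its own.

```r
# Same data as above, repeated so this snippet is self-contained
Marijuana <- c(21,32,21,22,4,44,23,23,9,11,22,27,6,27,16,13,39,39,27,16,13,
               10,28,9,18,15,3,22,27,28,7,40,4,21,38)

# Quartiles using R's default quantile type (the same one summary() reports)
q <- quantile(Marijuana, c(0.25, 0.75))
iqr <- IQR(Marijuana)

# 1.5 x IQR fences (a convention, not the only possible outlier rule)
lower_fence <- q[[1]] - 1.5 * iqr   # 12 - 22.5 = -10.5
upper_fence <- q[[2]] + 1.5 * iqr   # 27 + 22.5 = 49.5

# Any value outside the fences would be flagged
flagged <- Marijuana[Marijuana < lower_fence | Marijuana > upper_fence]
flagged
```

With these data nothing falls outside the fences, so 3% and 44% are the minimum and maximum rather than outliers under this rule.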

median(Marijuana)
## [1] 21
sd(Marijuana)
## [1] 11.18447
mean(Marijuana)
## [1] 20.71429
range(Marijuana)
## [1]  3 44
IQR(Marijuana)
## [1] 15

Problem 47

Math2005<-c(225,236,230,236,230,236,230,239,242,240,239,234,230,242,233,240,240,246,231,230,241,238,247,238,246,227,235,241,238,230,246,244,224,238,241,243,242,234,238,241,233,238,242,232,242,239,244,240,242,231,241,243)
hist(Math2005, col = "lightgreen", main = "2005 State Test Scores Distribution 8th Grade", xlab = "Avg test score")

stem(Math2005)
## 
##   The decimal point is at the |
## 
##   224 | 00
##   226 | 0
##   228 | 
##   230 | 00000000
##   232 | 000
##   234 | 000
##   236 | 000
##   238 | 000000000
##   240 | 000000000
##   242 | 00000000
##   244 | 00
##   246 | 0000
boxplot(Math2005, main ="Test Scores", col = "magenta")

summary(Math2005)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   224.0   233.0   238.5   237.5   242.0   247.0

a) Median = 238.5 / IQR = 9 / Mean = 237.5 / SD = 5.7

b) You would use the median and IQR, since the data is skewed left and not symmetrical.

c) Unimodal with a skew to the left. A few states, such as New Mexico and Alabama, scored far below the rest, pulling the mean (237.5) below the median (238.5).
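A quick numeric check on the direction of skew is to compare the mean with the median: a mean below the median hints at a longer left tail. This is a rough heuristic, not a formal skewness measure, and the names center_gap and skew_hint are made up for this sketch.

```r
# Same data as above, repeated so this snippet is self-contained
Math2005 <- c(225,236,230,236,230,236,230,239,242,240,239,234,230,242,233,
              240,240,246,231,230,241,238,247,238,246,227,235,241,238,230,
              246,244,224,238,241,243,242,234,238,241,233,238,242,232,242,
              239,244,240,242,231,241,243)

# Mean minus median: negative means the mean sits below the median
center_gap <- mean(Math2005) - median(Math2005)

# Heuristic label only; a histogram should confirm it
skew_hint <- if (center_gap < 0) {
  "left"
} else if (center_gap > 0) {
  "right"
} else {
  "roughly symmetric"
}
skew_hint
```

Here the mean (237.46) is below the median (238.5), which agrees with a tail on the low-score side.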

median(Math2005)
## [1] 238.5
sd(Math2005)
## [1] 5.682659
mean(Math2005)
## [1] 237.4615
range(Math2005)
## [1] 224 247
IQR(Math2005)
## [1] 9

Problem 48

Boomtowns<-c(7.5,4.2,4.5,3.4,1.9,4.4,3.1,3.2,2.6,2.6,2.6,3.3,2.8,2.9,3.3,2.3,1.7,2.2,1.5,1.4)
hist(Boomtowns, col = "lightgreen", main = "Boomtowns Economic Growth Percentages", xlab = "1 Year Job Growth")

stem(Boomtowns)
## 
##   The decimal point is at the |
## 
##   0 | 4579
##   2 | 236668912334
##   4 | 245
##   6 | 5
boxplot(Boomtowns, main ="Boomtowns Growth", col = "magenta")

summary(Boomtowns)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.400   2.275   2.850   3.070   3.325   7.500

b) Median = 2.85 / Mean = 3.07. The mean is higher because Las Vegas had a growth rate much higher than any other city, and the mean is pulled toward it.

c) The median is better since it is not affected by the Las Vegas outlier.

d) IQR = 1.05 and SD = 1.368

e) The IQR does a better job of summarizing the spread of job growth because it is not affected by the outlier or the skew to the right.

f) Subtracting 1.2% from every value would simply move the data to the left. The mean and median would each drop by 1.2, but the shape and spread (SD, IQR) of the distribution would not change.
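The effect of a constant shift can be checked directly. This is a minimal sketch, assuming the 1.2 is subtracted from every city's growth rate; the names us_average and shifted are made up for illustration.

```r
# Same data as above, repeated so this snippet is self-contained
Boomtowns <- c(7.5,4.2,4.5,3.4,1.9,4.4,3.1,3.2,2.6,2.6,2.6,3.3,2.8,2.9,3.3,
               2.3,1.7,2.2,1.5,1.4)

us_average <- 1.2                  # the 1.2% being subtracted
shifted <- Boomtowns - us_average  # every value moves left by the same amount

# Measures of center move by exactly the shift
c(mean = mean(shifted), median = median(shifted))

# Measures of spread stay the same as for the original data
c(sd = sd(shifted), IQR = IQR(shifted))
```

Comparing these with the values for Boomtowns shows the center moving by 1.2 while the SD and IQR stay put.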

g) If we omitted Las Vegas, the median and the IQR would stay about the same (the median would move only from 2.85 to 2.80), while the mean and standard deviation would change noticeably. Both would decrease, with the mean falling from 3.07 to about 2.84.
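One way to see the effect of dropping Las Vegas is to recompute the summaries without the largest value. The sketch assumes the 7.5 entry is Las Vegas (the data vector carries no city labels); the helper stats and the names no_vegas and comparison are made up for illustration.

```r
# Same data as above, repeated so this snippet is self-contained
Boomtowns <- c(7.5,4.2,4.5,3.4,1.9,4.4,3.1,3.2,2.6,2.6,2.6,3.3,2.8,2.9,3.3,
               2.3,1.7,2.2,1.5,1.4)

# Assumption: the maximum (7.5) is the Las Vegas value
no_vegas <- Boomtowns[Boomtowns != max(Boomtowns)]

# Hypothetical helper: the four summaries being compared
stats <- function(x) {
  c(median = median(x), IQR = IQR(x), mean = mean(x), sd = sd(x))
}

# One row per version of the data
comparison <- rbind(all_cities = stats(Boomtowns),
                    without_max = stats(no_vegas))
round(comparison, 3)
```

The median and IQR rows should be nearly identical, while the mean and SD are visibly smaller without the maximum, which is what makes the median and IQR the resistant choices here.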

h) The growth rate was unimodal with a peak around 2.8% growth, which is well above the US average of only 1.2% growth. The data is skewed to the right, primarily due to Las Vegas. The highest growth was 7.5% while the lowest was 1.4%. The first quartile was 2.275% while the third quartile was 3.325%.

median(Boomtowns)
## [1] 2.85
sd(Boomtowns)
## [1] 1.368095
mean(Boomtowns)
## [1] 3.07
range(Boomtowns)
## [1] 1.4 7.5
IQR(Boomtowns)
## [1] 1.05