Section 2.1

7

  1. OF.

  2. 15 more MVPs.

  3. This graph is potentially misleading because OF could be broken into 3 categories but isn’t, while 1B, 2B, and 3B could be rolled into 1 category but aren’t. This Pareto chart lists OF first because it’s the category with the highest frequency of MVPs. If OF were broken into 3 categories, however, it’s possible one of the bases would be listed first; for example, if the OF total comprised 10 left field, 10 center field, and 10 right field MVPs, 1B would actually be listed first. Alternatively, rolling the bases into one category would yield 30 MVPs, as many as OF.
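The effect of regrouping on a Pareto chart's ordering can be sketched in R. The OF total of 30 and the 10/10/10 split come from the answer above; the individual base counts (12, 8, 10) are hypothetical values chosen only so that they sum to 30, and `pareto_order` is an illustrative helper, not part of the exercise.

```r
# OF total (30) and the 10/10/10 split are taken from the answer text.
# The per-base counts are HYPOTHETICAL, chosen only to sum to 30.
grouped  <- c(OF = 30, `1B` = 12, `2B` = 8, `3B` = 10)
split_of <- c(LF = 10, CF = 10, RF = 10, `1B` = 12, `2B` = 8, `3B` = 10)

# A Pareto chart orders categories from highest to lowest frequency.
pareto_order <- function(counts) names(sort(counts, decreasing = TRUE))

pareto_order(grouped)[1]   # "OF": the grouped outfield leads the chart
pareto_order(split_of)[1]  # "1B": once OF is split, a base leads instead

# Rolling the bases into one category ties it with OF.
merged_bases <- c(OF = 30, Bases = sum(grouped[c("1B", "2B", "3B")]))
merged_bases
```

The same data therefore support three different "leading" categories, depending only on how the positions are grouped.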

9

  1. Approximately 69%

  2. Approximately 5,520,000

  3. That assertion is inferential, because it’s a conclusion extrapolated from the sample of adult Americans surveyed and applied to the population of all adult Americans.

11

  1. Approximately 44%. Approximately 61%.

  2. 55+ year-olds.

  3. 18-34 year-olds.

  4. Age and likelihood to buy when made in America are positively associated. That is, the older a person is, the more likely they are to buy something advertised as being made in America.

13

  1. Never = 0.026172529, Rarely = 0.067839196, Sometimes = 0.115577889, Most of the Time = 0.263190955, Always = 0.52721943
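# Observed counts for each response, in the same order as categoriess
datt <- c(125, 324, 552, 1257, 2518)

# Relative frequency = category count / total count
rel.freqq <- datt/sum(datt)

categoriess <- c("Never", "Rarely", "Sometimes", "Most of time", "Always")

# Pair each category with its relative frequency
answerr <- data.frame(categoriess, rel.freqq)

answerr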
##    categoriess  rel.freqq
## 1        Never 0.02617253
## 2       Rarely 0.06783920
## 3    Sometimes 0.11557789
## 4 Most of time 0.26319095
## 5       Always 0.52721943
  1. About 52.72%

  2. About 9.4%

barplot(datt, main="Seat Belt Usage", names.arg=categoriess, col=c("red","blue","green","yellow","orange"))

barplot(rel.freqq, main="Seat Belt Usage (Relative Freq)", names.arg=categoriess, col=c("red","blue","green","yellow","orange"))

pie(datt, main="Seat Belt Usage", labels=categoriess, col=c("red","blue","green","yellow","orange"))

  1. This is a descriptive statement, because it only describes the sample surveyed, rather than using an extrapolation from that sample to infer something about the full population of college students.
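As a quick arithmetic check, the two percentages reported for this problem can be recomputed from the raw counts. This is a minimal sketch; the variable names are illustrative, and it assumes the 9.4% figure refers to the combined "Never" and "Rarely" responses, which is the combination that reproduces it.

```r
# Counts from the seat belt survey, named by response category
counts <- c(Never = 125, Rarely = 324, Sometimes = 552,
            `Most of the Time` = 1257, Always = 2518)

total <- sum(counts)  # 4776 respondents in all

# Share answering "Always", as a percentage
pct_always <- round(100 * counts[["Always"]] / total, 2)
pct_always  # 52.72

# Share answering "Never" or "Rarely" (assumed to be what the 9.4% refers to)
pct_never_rarely <- round(100 * sum(counts[c("Never", "Rarely")]) / total, 1)
pct_never_rarely  # 9.4
```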

15

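# Observed counts for each response, in the same order as categories
dat <- c(377,192,132,81,243)

# Relative frequency = category count / total count
rel.freq <- dat/sum(dat)

categories <- c("More 1", "Up to 1", "Few a week", "Few a month", "Never")

# Pair each category with its relative frequency
answer <- data.frame(categories, rel.freq)

answer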
##    categories   rel.freq
## 1      More 1 0.36780488
## 2     Up to 1 0.18731707
## 3  Few a week 0.12878049
## 4 Few a month 0.07902439
## 5       Never 0.23707317
  1. About 23.7%

barplot(dat, main="Internet Usage", names.arg=categories, col=c("red","blue","green","yellow","orange"))

barplot(rel.freq, main="Internet Usage (Relative Freq)", names.arg=categories, col=c("red","blue","green","yellow","orange"))

pie(dat, main="Internet Usage", labels=categories, col=c("red","blue","green","yellow","orange"))

  1. It’s an inferential statement phrased as a descriptive one. It would be (more) accurate to say that approximately 37% of 1025 randomly sampled adult Americans spend more than an hour a day on the Internet.
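The sample size and percentage quoted in the restated claim follow directly from the counts used above. A minimal sketch, with illustrative variable names:

```r
# Internet usage counts; the first entry is "more than 1 hour a day"
usage <- c(377, 192, 132, 81, 243)

n <- sum(usage)  # sample size
n                # 1025

# Percentage of the sample spending more than an hour a day online
pct_more_than_hour <- round(100 * usage[1] / n, 1)
pct_more_than_hour  # 36.8, i.e. approximately 37%
```

The figure describes only these 1025 respondents; extending it to all adult Americans is the inferential step.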

Section 2.2

9

  1. Rolling a value of 8.

  2. Rolling a value of 2.

  3. 15 times.

  4. 5 more.

  5. 15%.

  6. The distribution is skewed left.

10

  1. 4 per week.

  2. 9 weeks.

  3. About 17.3% of the time.

  4. Slightly skewed right.
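The 17.3% figure is consistent with 9 weeks out of a 52-week year. The total of 52 weeks is an assumption here, since the answers above do not state the number of weeks observed.

```r
weeks_with_outcome <- 9   # from part 2
weeks_total <- 52         # ASSUMPTION: one year of weekly observations

pct_of_time <- round(100 * weeks_with_outcome / weeks_total, 1)
pct_of_time  # 17.3
```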

11

  1. 200 students.

  2. Class width = 10.

  3. 60-69 = 2; 70-79 = 3; 80-89 = 13; 90-99 = 42; 100-109 = 58; 110-119 = 40; 120-129 = 31; 130-139 = 8; 140-149 = 2; 150-159 = 1

  4. 100-109

  5. 150-159

  6. 5.5%

  7. No.
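Several of the answers for this problem can be recomputed from the frequency table in part 3. The sketch below uses illustrative variable names and assumes the 5.5% in part 6 refers to the three highest classes (130 and above), which is the grouping that reproduces it.

```r
# Frequency table from part 3, named by class
iq_freq <- c("60-69" = 2, "70-79" = 3, "80-89" = 13, "90-99" = 42,
             "100-109" = 58, "110-119" = 40, "120-129" = 31,
             "130-139" = 8, "140-149" = 2, "150-159" = 1)

n_students <- sum(iq_freq)  # 200, matching part 1

# Classes with the highest and lowest frequency (parts 4 and 5)
most_frequent  <- names(iq_freq)[which.max(iq_freq)]  # "100-109"
least_frequent <- names(iq_freq)[which.min(iq_freq)]  # "150-159"

# Share of students in the classes at or above 130 (assumed meaning of part 6)
high_classes <- c("130-139", "140-149", "150-159")
pct_high <- 100 * sum(iq_freq[high_classes]) / n_students
pct_high  # 5.5
```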

12

  1. 200

  2. 0-199, 200-399, 400-599, 600-799, 800-999, 1000-1199, 1200-1399, 1400-1599

  3. 0-199

  4. Skewed right.

  5. That statement compares the frequency in VT to the frequency in TX, which doesn’t take the number of residents of each state into account. Texas has more people, so one would expect it to have more alcohol-related deaths than Vermont. It would be fairer to examine the different categories of causes of traffic fatalities in each state, and compare the relative frequency of the category “alcohol-related” in each.

13

  1. Skewed right. By definition, half of all households make below-median income and half make above-median income, so the shape depends on how far incomes spread on each side of the median. Income has a “hard” lower limit near 0, and there are many forces against making a far-above-median income (education, competition, specialized skills, experience, etc.), so most households sit fairly close to the median while a small number make many times the median. Thus, the difference between the median and the upper class limit is likely much larger than the difference between the median and the lower class limit, which gives a long right tail.

  2. Bell-shaped; scores should be roughly symmetric about the median, with the highest frequency in the middle and progressively fewer scores toward either extreme.

  3. Skewed right. There are more forces acting against having a below-median number of people in a home (more expensive per person, living with dependents, population density/urban areas with large populations competing for limited space) than there are forces acting for having an above-median number of people in a home. There’s a “hard” lower class limit of 1, while there is no such hard upper class limit.

  4. Skewed left; there are more forces acting against being diagnosed at an above-median age (namely, life expectancy) than there are acting for being diagnosed at a below-median age.
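One common rule of thumb for checking a skew direction on actual data is to compare the mean with the median: a long right tail tends to pull the mean above the median, and a long left tail tends to pull it below. The sketch below uses simulated data, not data from these exercises, and `skew_direction` is an illustrative helper; the rule is a heuristic rather than a definition of skewness.

```r
set.seed(1)  # arbitrary seed so the simulated sample is reproducible

# SIMULATED right-skewed sample (exponential), for illustration only
right_skewed <- rexp(10000, rate = 1)

# Mirroring the sample flips the long tail to the left
left_skewed <- -right_skewed

# Heuristic: compare the mean with the median
skew_direction <- function(x) {
  if (mean(x) > median(x)) {
    "right"
  } else if (mean(x) < median(x)) {
    "left"
  } else {
    "none"
  }
}

skew_direction(right_skewed)  # "right"
skew_direction(left_skewed)   # "left"
```

A histogram (`hist(right_skewed)`) shows the same thing visually: the tail extends toward the side named by the skew.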

14

  1. Skewed right. The median is probably fairly low, with a lower class limit of 0 drinks per week; however, there is no such “hard” upper limit (indeed there are people who drink too much).

  2. Uniform distribution. Students usually don’t enter public schools until they’ve reached a minimum age, and they typically graduate or leave at the age of 18. Public schools also often have caps on the number of students that can be enrolled, which would keep class sizes mostly consistent from grade to grade.

  3. Skewed left. Since hearing tends to degrade with age, the median age for needing a hearing-aid is likely high. There are also more forces acting against getting a hearing-aid at an above-median age (namely, life expectancy) than there are acting for getting one at a younger age.

  4. Bell-shaped. Heights tend to cluster around a central value, with roughly as many people above the median height as below it and progressively fewer at either extreme; human physiology limits very tall and very short heights alike.