Section 2.1

2 A frequency distribution lists the number of occurrences of each category of data, while a relative frequency distribution lists the proportion or percent of occurrences of each category of data.

3 Relative frequencies should add up to 1.
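
As a quick check of this rule, here is a minimal R sketch. The counts and variable names are hypothetical, chosen only for illustration; they do not come from the exercises.

```r
# Hypothetical category counts (illustration only)
counts <- c(12, 30, 8)

# Relative frequency = category count / total count
rel_freq <- counts / sum(counts)

# The proportions should sum to 1, up to floating-point rounding
total <- sum(rel_freq)
isTRUE(all.equal(total, 1))
```

Comparing with `all.equal()` rather than `==` avoids a false mismatch caused by floating-point rounding.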

5

  1. The most common approach is washing your hands. 61% of the population chooses this method.

  2. The least used approach is drinking orange juice. 2% of the population uses this method.

  3. 25% of the population thinks flu shots are the best way to beat the flu.

13

datt <- c(125, 324, 552, 1257, 2518)

rel.freqq <- datt/sum(datt)

categoriess <- c("Never", "Rarely", "Sometimes", "Most of time", "Always")


answerr <- data.frame(categoriess,rel.freqq)

answerr
##    categoriess  rel.freqq
## 1        Never 0.02617253
## 2       Rarely 0.06783920
## 3    Sometimes 0.11557789
## 4 Most of time 0.26319095
## 5       Always 0.52721943
  1. 52.7%

  2. 9.4%
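
The two percentages can be recomputed from the counts. This sketch assumes part 1 refers to the "Always" category and part 2 to "Never" and "Rarely" combined, which is what the reported values match. The variable names are illustrative.

```r
# Seat belt counts from the exercise, named by category
counts <- c("Never" = 125, "Rarely" = 324, "Sometimes" = 552,
            "Most of time" = 1257, "Always" = 2518)

rel_freq <- counts / sum(counts)

# Assumed part 1: percent answering "Always"
pct_always <- round(100 * rel_freq[["Always"]], 1)

# Assumed part 2: percent answering "Never" or "Rarely"
pct_never_or_rarely <- round(100 * sum(rel_freq[c("Never", "Rarely")]), 1)

pct_always
pct_never_or_rarely
```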

barplot(datt, main="Seat Belt Usage", names.arg=categoriess, col=c("red","blue","green","yellow","orange"))

barplot(rel.freqq, main="Seat Belt Usage", names.arg=categoriess, col=c("red","blue","green","yellow","orange"))

pie(datt,main="Seat Belt Usage",labels=categoriess, col =c("red","blue","green","yellow","orange"))

  1. Descriptive

15

dat <- c(377,192,132,81,243)

rel.freq <- dat/sum(dat)

categories <- c("More 1", "Up to 1", "Few a week", "Few a month", "Never")


answer <- data.frame(categories,rel.freq)

answer
##    categories   rel.freq
## 1      More 1 0.36780488
## 2     Up to 1 0.18731707
## 3  Few a week 0.12878049
## 4 Few a month 0.07902439
## 5       Never 0.23707317
  1. 243/1025

barplot(dat, main="Internet Usage", names.arg=categories, col=c("red","blue","green","yellow","orange"))

barplot(rel.freq, main="Internet Usage (Relative Freq)", names.arg=categories, col=c("red","blue","green","yellow","orange"))

pie(dat,main="Internet Usage",labels=categories, col =c("red","blue","green","yellow","orange"))

Section 2.2

7 False

8 False

9

  1. 8

  2. 2

  3. 15

  4. 4

  5. 7%

  6. Fairly symmetric

10

  1. 4

  2. 9

  3. 17.3%

  4. The distribution is fairly bell-shaped (symmetrical)

13

  1. Bell shaped because the majority of people in the US make a medium amount of money (are middle class).

  2. Bell shaped because the majority of people score somewhere in the middle of 0-2400.

  3. Bell shaped because the number of people living in a household is typically not super high or super low.

  4. Skewed left because as people age, their chances of being diagnosed with Alzheimer’s disease increase.

14

  1. Skewed right because most of the population does not consume alcohol.
  2. Skewed right because most people above the age of 18 are not in public schools.
  3. Skewed left because older people are more likely to use a hearing aid.

  4. Skewed left because full-grown men are taller than the male population as a whole.
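
To make the terms "skewed right" and "skewed left" concrete, here is a minimal sketch using simulated data. It is not data from the exercises; the exponential sample, the seed, and the variable names are assumptions chosen for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simulated right-skewed sample: many small values, a few large ones
right_skewed <- rexp(1000, rate = 1)

# Mirroring the sample gives a left-skewed one
left_skewed <- -right_skewed

# The long tail pulls the mean toward it, away from the median
mean(right_skewed) > median(right_skewed)
mean(left_skewed) < median(left_skewed)

# A histogram shows the long tail on the right
hist(right_skewed, main = "Right-skewed sample (simulated)")
```

Calling `hist(left_skewed)` draws the mirrored shape, with the long tail on the left.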

15

dattt <- c(16, 18, 12, 3, 1)

rel.freqqq <- dattt/sum(dattt)

categoriesss <- c("Zero", "One", "Two", "Three", "Four")

answerrr <- data.frame(categoriesss,rel.freqqq)

answerrr
##   categoriesss rel.freqqq
## 1         Zero       0.32
## 2          One       0.36
## 3          Two       0.24
## 4        Three       0.06
## 5         Four       0.02
  1. 24%

  2. 60%

16

free_throws <- c(16, 11, 9, 7, 2,3,0,1,0,1)

rel.freqqq <- free_throws/sum(free_throws)

categoriesss <- c("1", "2", "3", "4","5","6","7","8","9","10")

answerrr <- data.frame(categoriesss,rel.freqqq)

answerrr
##    categoriesss rel.freqqq
## 1             1       0.32
## 2             2       0.22
## 3             3       0.18
## 4             4       0.14
## 5             5       0.04
## 6             6       0.06
## 7             7       0.00
## 8             8       0.02
## 9             9       0.00
## 10           10       0.02
  1. 14%

  2. 2%

  3. 14%

25

  1. The data are discrete because the values can only be whole numbers. For example, a household cannot have between 1 and 2 televisions.

tv <- c(1, 1, 1, 2, 1,
        1, 2, 2, 3, 2,
        4, 2, 2, 2, 2,
        2, 4, 1, 2, 2,
        3, 1, 3, 1, 2,
        3, 1, 1, 2, 1,
        5, 0 ,1, 3, 3,
        1, 3, 3, 2, 1)

# table(tv) gives the counts below for 0 through 5 televisions

tv <- c(1,14,14,8,2,1)

tv.freq <- tv/sum(tv)

tv.cat <- c("0", "1", "2", "3","4","5")

freq.tab <- data.frame(tv.cat,tv)
rfreq.tab <- data.frame(tv.cat,tv.freq)


freq.tab
##   tv.cat tv
## 1      0  1
## 2      1 14
## 3      2 14
## 4      3  8
## 5      4  2
## 6      5  1
rfreq.tab
##   tv.cat tv.freq
## 1      0   0.025
## 2      1   0.350
## 3      2   0.350
## 4      3   0.200
## 5      4   0.050
## 6      5   0.025
  1. 20%

  2. 7.5%
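
The hard-coded counts above can also be derived directly from the raw data with `table()` and `prop.table()`. This is a sketch of an alternative approach, not the method used in the answer. The names `tv_raw`, `tv_counts`, and `tv_rel` are illustrative, and it assumes the two percentages refer to "exactly 3 televisions" and "4 or more televisions", which is what the reported values match.

```r
# Raw data from the exercise: televisions per household
tv_raw <- c(1, 1, 1, 2, 1,
            1, 2, 2, 3, 2,
            4, 2, 2, 2, 2,
            2, 4, 1, 2, 2,
            3, 1, 3, 1, 2,
            3, 1, 1, 2, 1,
            5, 0, 1, 3, 3,
            1, 3, 3, 2, 1)

# Frequency table; fixing the levels keeps every value from 0 to 5 in the table
tv_counts <- table(factor(tv_raw, levels = 0:5))

# Relative frequencies: each count divided by the total
tv_rel <- prop.table(tv_counts)

# Assumed part 1: percent of households with exactly 3 televisions
pct_three <- 100 * as.numeric(tv_rel["3"])

# Assumed part 2: percent of households with 4 or more televisions
pct_four_plus <- 100 * sum(tv_rel[c("4", "5")])

tv_counts
pct_three
pct_four_plus
```

Working from the raw vector avoids transcription errors when typing the counts by hand.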