Section 2.1
2 A frequency distribution lists the number of occurrences of each category of data, while a relative frequency distribution lists the proportion or percent of occurrences of each category of data.
3 Relative frequencies should add up to 1.
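A quick check of this property in R, as a minimal sketch. The vector `counts` is made up for illustration and does not come from any exercise.

```r
# Hypothetical counts, chosen only for illustration
counts <- c(10, 25, 15)

# Relative frequency = category count / total count
rel.freq <- counts / sum(counts)
rel.freq

# The relative frequencies should add up to 1.
# all.equal() compares with a tolerance, which avoids floating-point surprises.
total <- sum(rel.freq)
isTRUE(all.equal(total, 1))
```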
5
The most common approach is washing your hands. 61% of the population chooses this method.
The least used approach is drinking orange juice. 2% of the population uses this method.
25% of the population thinks flu shots are the best way to beat the flu.
13
datt <- c(125, 324, 552, 1257, 2518)
rel.freqq <- datt/sum(datt)
categoriess <- c("Never", "Rarely", "Sometimes", "Most of time", "Always")
answerr <- data.frame(categoriess,rel.freqq)
answerr
##    categoriess  rel.freqq
## 1        Never 0.02617253
## 2       Rarely 0.06783920
## 3    Sometimes 0.11557789
## 4 Most of time 0.26319095
## 5       Always 0.52721943
52.7%
9.4%
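The two percentages above can be reproduced from the table. This sketch assumes 52.7% refers to "Always" and 9.4% to "Never" and "Rarely" combined, since those are the categories whose relative frequencies match. The names `counts`, `rel`, `pct.always`, and `pct.never.rarely` are introduced here for illustration.

```r
# Seat-belt counts copied from the answer above, stored as a named vector
counts <- c(Never = 125, Rarely = 324, Sometimes = 552,
            "Most of time" = 1257, Always = 2518)
rel <- counts / sum(counts)

# Percent in a single category (assumed to be "Always")
pct.always <- round(100 * rel[["Always"]], 1)

# Percent in two categories combined (assumed to be "Never" or "Rarely")
pct.never.rarely <- round(100 * sum(rel[c("Never", "Rarely")]), 1)

pct.always
pct.never.rarely
```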
barplot(datt, main = "Seat Belt Usage", names.arg = categoriess, col = c("red", "blue", "green", "yellow", "orange"))
barplot(rel.freqq, main = "Seat Belt Usage", names.arg = categoriess, col = c("red", "blue", "green", "yellow", "orange"))
pie(datt, main = "Seat Belt Usage", labels = categoriess, col = c("red", "blue", "green", "yellow", "orange"))
15
dat <- c(377,192,132,81,243)
rel.freq <- dat/sum(dat)
categories <- c("More 1", "Up to 1", "Few a week", "Few a month", "Never")
answer <- data.frame(categories,rel.freq)
answer
##    categories   rel.freq
## 1      More 1 0.36780488
## 2     Up to 1 0.18731707
## 3  Few a week 0.12878049
## 4 Few a month 0.07902439
## 5       Never 0.23707317
243/1025
barplot(dat, main = "Internet Usage", names.arg = categories, col = c("red", "blue", "green", "yellow", "orange"))
barplot(rel.freq, main = "Internet Usage (Relative Freq)", names.arg = categories, col = c("red", "blue", "green", "yellow", "orange"))
pie(dat, main = "Internet Usage", labels = categories, col = c("red", "blue", "green", "yellow", "orange"))
Section 2.2
7 False
8 False
9
8
2
15
4
7%
Fairly symmetric
10
4
9
17.3%
The distribution is fairly bell-shaped (symmetrical)
13
Skewed right because most households in the US earn low to moderate incomes, while a small number of very high incomes stretch the right tail.
Bell shaped because most people score near the middle of the 0-2400 range, with few scores at either extreme.
Skewed right because most households have only a few people, while a small number of large households stretch the right tail.
Skewed left because the chance of being diagnosed with Alzheimer’s disease increases with age.
14
Skewed left because older people are more likely to use a hearing aid.
Bell shaped because the heights of full-grown men cluster around an average, with very short and very tall men about equally rare.
15
dattt <- c(16, 18, 12, 3, 1)
rel.freqqq <- dattt/sum(dattt)
categoriesss <- c("Zero", "One", "Two", "Three", "Four")
answerrr <- data.frame(categoriesss,rel.freqqq)
answerrr
##   categoriesss rel.freqqq
## 1         Zero       0.32
## 2          One       0.36
## 3          Two       0.24
## 4        Three       0.06
## 5         Four       0.02
24%
60%
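Both percentages can be read off the relative frequency table by filtering on the category label. This sketch assumes 24% refers to "Two" and 60% to "One" and "Two" combined, which is what the numbers match. The data frame `tab` and its column names are introduced here for illustration.

```r
# Counts copied from the answer above, kept together with their labels
tab <- data.frame(
  category = c("Zero", "One", "Two", "Three", "Four"),
  count = c(16, 18, 12, 3, 1)
)
tab$rel.freq <- tab$count / sum(tab$count)

# Percent in a single category (assumed to be "Two")
pct.two <- 100 * tab$rel.freq[tab$category == "Two"]

# Percent in several categories combined (assumed to be "One" or "Two")
pct.one.or.two <- 100 * sum(tab$rel.freq[tab$category %in% c("One", "Two")])

pct.two
pct.one.or.two
```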
16
free_throws <- c(16, 11, 9, 7, 2,3,0,1,0,1)
rel.freqqq <- free_throws/sum(free_throws)
categoriesss <- c("1", "2", "3", "4","5","6","7","8","9","10")
answerrr <- data.frame(categoriesss,rel.freqqq)
answerrr
##    categoriesss rel.freqqq
## 1             1       0.32
## 2             2       0.22
## 3             3       0.18
## 4             4       0.14
## 5             5       0.04
## 6             6       0.06
## 7             7       0.00
## 8             8       0.02
## 9             9       0.00
## 10           10       0.02
14%
2%
14%
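An "at least" percentage can also be found with cumulative relative frequencies instead of adding several rows by hand. This sketch assumes the three answers refer to category 4 (14%), category 10 (2%), and categories 5 through 10 combined (14%), which is what the numbers match. The names `rel`, `cum`, and the `pct.*` variables are introduced here for illustration.

```r
# Counts copied from the answer above, for categories 1 through 10
free_throws <- c(16, 11, 9, 7, 2, 3, 0, 1, 0, 1)
rel <- free_throws / sum(free_throws)
names(rel) <- 1:10

# Cumulative relative frequency: proportion at or below each category
cum <- cumsum(rel)

pct.four <- 100 * rel[["4"]]    # single category (assumed: 4)
pct.ten <- 100 * rel[["10"]]    # single category (assumed: 10)

# "5 or more" is the complement of "4 or fewer"
pct.five.plus <- 100 * (1 - cum[["4"]])

c(pct.four, pct.ten, pct.five.plus)
```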
25
The data are discrete because the values can only be whole numbers. For example, you cannot have between 1 and 2 televisions.
tv <- c(1, 1, 1, 2, 1,
1, 2, 2, 3, 2,
4, 2, 2, 2, 2,
2, 4, 1, 2, 2,
3, 1, 3, 1, 2,
3, 1, 1, 2, 1,
5, 0 ,1, 3, 3,
1, 3, 3, 2, 1)
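# table(tv) gives the counts for 0 through 5 televisions, entered by hand below.
# Note: this overwrites the raw data in tv with those counts.
tv <- c(1,14,14,8,2,1)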
tv.freq <- tv/sum(tv)
tv.cat <- c("0", "1", "2", "3","4","5")
freq.tab <- data.frame(tv.cat,tv)
rfreq.tab <- data.frame(tv.cat,tv.freq)
freq.tab
##   tv.cat tv
## 1      0  1
## 2      1 14
## 3      2 14
## 4      3  8
## 5      4  2
## 6      5  1
rfreq.tab
##   tv.cat tv.freq
## 1      0   0.025
## 2      1   0.350
## 3      2   0.350
## 4      3   0.200
## 5      4   0.050
## 6      5   0.025
20%
7.5%
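The counts and both percentages can also be computed directly from the raw data rather than typed in by hand. This is a sketch only: it assumes 20% refers to exactly 3 televisions and 7.5% to 4 or more, which is what the numbers match. The names `counts`, `rel`, `pct.three`, and `pct.four.plus` are introduced here for illustration.

```r
# Raw data copied from the answer above
tv <- c(1, 1, 1, 2, 1,
        1, 2, 2, 3, 2,
        4, 2, 2, 2, 2,
        2, 4, 1, 2, 2,
        3, 1, 3, 1, 2,
        3, 1, 1, 2, 1,
        5, 0, 1, 3, 3,
        1, 3, 3, 2, 1)

# Frequency and relative frequency distributions
counts <- table(tv)
rel <- prop.table(counts)
counts
rel

# The mean of a logical vector is the proportion of TRUE values
pct.three <- 100 * mean(tv == 3)      # assumed: exactly 3 televisions
pct.four.plus <- 100 * mean(tv >= 4)  # assumed: 4 or more televisions

c(pct.three, pct.four.plus)
```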