DATA 606 Chapter 5 Assignment
Chapter 5 - Inference for Numerical Data
Graded: 5.6, 5.14, 5.20, 5.32, 5.48
5.6
The sample mean is 71, the margin of error is 6, and the sample standard deviation is 17.53.
x1 <- 65
x2 <- 77
samp_mean <- (x1+x2)/2
samp_mean
## [1] 71
marg_error <- (x2 -x1)/2
marg_error
## [1] 6
n <- 25
df <- n -1
df
## [1] 24
t <- qt(.95,df)
t
## [1] 1.710882
SE <- marg_error/t
SE
## [1] 3.506963
sample_sd <- SE*sqrt(n)
sample_sd
## [1] 17.53481
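The steps above can be collected into one small helper. This is only a sketch, assuming the interval was built as mean ± t* × s / sqrt(n); the function name `ci_to_summary` and its arguments are hypothetical and not part of the exercise.

```r
# Hypothetical helper: recover the summary statistics behind a t-interval.
# Assumes the interval has the form mean +/- t* x s / sqrt(n).
ci_to_summary <- function(lower, upper, n, conf = 0.90) {
  samp_mean  <- (lower + upper) / 2                  # midpoint of the interval
  marg_error <- (upper - lower) / 2                  # half-width of the interval
  t_star     <- qt(1 - (1 - conf) / 2, df = n - 1)   # two-sided critical t value
  se         <- marg_error / t_star                  # standard error of the mean
  c(mean = samp_mean, me = marg_error, sd = se * sqrt(n))
}

res <- ci_to_summary(65, 77, n = 25, conf = 0.90)
round(res, 2)
```

Called with the interval (65, 77) and n = 25, it should reproduce the mean, margin of error, and standard deviation reported above.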
5.14
(a) The minimum sample size is 271.
(b) The sample size will need to be larger: a higher confidence level means a larger critical value, so more observations are needed to keep the same margin of error.
(c) For a 99% confidence level, the sample size would have to be at least 664.
sd <- 250
ME <- 25
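z <- qnorm(0.95)  # critical z value for a 90% confidence level
n <- (z * (sd / ME))^2
n
## [1] 270.5543
z2 <- qnorm(0.995)  # critical z value for a 99% confidence level
n <- (z2 * (sd / ME))^2
n
## [1] 663.4897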
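Both sample sizes come from the same formula, so the calculation can be sketched as one function. The name `min_sample_size` is hypothetical; the sketch assumes a normal (z) critical value and rounds up, since a sample size must be a whole number.

```r
# Hypothetical helper: smallest n such that z* x sd / sqrt(n) <= me.
min_sample_size <- function(sd, me, conf) {
  z_star <- qnorm(1 - (1 - conf) / 2)  # two-sided critical z value
  ceiling((z_star * sd / me)^2)        # round up to a whole observation
}

n90 <- min_sample_size(sd = 250, me = 25, conf = 0.90)
n99 <- min_sample_size(sd = 250, me = 25, conf = 0.99)
c(n90, n99)
## [1] 271 664
```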
5.20
(a) There is no clear difference between the writing and reading scores; the distribution is fairly normal.
(b) No. The writing and reading scores are most likely dependent on each other: a student who is better at writing is most likely better at reading too.
(c) Ho: The average scores for reading and writing are the same, so their difference is equal to zero. Ha: The average scores for reading and writing are not the same, so their difference is not equal to zero.
(e) Since the p-value is above 0.1, we do not have enough evidence to reject the null hypothesis.
(f) Type II error. We failed to reject the null hypothesis, so if the null is actually false, we have failed to detect a real difference between the two scores.
(g) Yes, we would expect the confidence interval to include 0, because we were not able to reject the null hypothesis that the difference between reading and writing scores is zero.
diff_mean <- -0.545
diff_sd <- 8.887
n <- 200
df <- n - 1
# we use a 90% confidence level
SE <- diff_sd / sqrt(n)
t <- (diff_mean - 0) / SE
t
## [1] -0.867274
p<-2*pt(t,df)
p
## [1] 0.3868365
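The answer to (g) can be checked directly by building the 90% confidence interval for the mean difference. This is a sketch using the same summary statistics as above; the variable names `se`, `t_star`, `ci`, and `contains_zero` are chosen here for illustration.

```r
# Summary statistics for the paired differences, as given above
diff_mean <- -0.545
diff_sd   <- 8.887
n         <- 200

se     <- diff_sd / sqrt(n)                      # standard error of the mean difference
t_star <- qt(0.95, df = n - 1)                   # critical t value for a 90% interval
ci     <- diff_mean + c(-1, 1) * t_star * se     # lower and upper bounds
ci

# TRUE when the interval straddles zero
contains_zero <- ci[1] < 0 && ci[2] > 0
contains_zero
```

The interval runs from roughly -1.58 to 0.49, so it contains 0, which agrees with failing to reject the null hypothesis.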
5.32
Yes. Since the p-value is less than 0.1, there is strong evidence of a difference in average fuel efficiency between automatic and manual cars.
mean_auto<-16.12
mean_man<-19.85
sd_auto<-3.58
sd_man<-4.51
n<-26
df<-n-1
diff_mean<-mean_auto-mean_man
#we use a 90% confidence level
SE<-sqrt((sd_auto^2/n)+(sd_man^2/n))
t<-(diff_mean-0)/SE
t
## [1] -3.30302
p<-2*pt(t,df)
p
## [1] 0.002883615
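Written out as arithmetic, the test statistic computed above is the difference in sample means divided by its standard error. The subscripts below simply mirror the variable names in the code.

```latex
\[
SE = \sqrt{\frac{s_{\text{auto}}^2}{n} + \frac{s_{\text{man}}^2}{n}}
   = \sqrt{\frac{3.58^2}{26} + \frac{4.51^2}{26}} \approx 1.129,
\qquad
T = \frac{\bar{x}_{\text{auto}} - \bar{x}_{\text{man}}}{SE}
  = \frac{16.12 - 19.85}{1.129} \approx -3.30
\]
```

With df = 26 - 1 = 25, a statistic this far from zero gives the small two-sided p-value shown above.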
5.48
- Ho: The average number of hours worked is the same in every group. Ha: At least one group's average number of hours worked is different.
- The assumptions that need to be made are: the observations are independent within and across groups. Within groups, the number of observations in each group is less than 10% of the population, so we consider them independent. Across groups, we assume independence because each person belongs to only one educational group. The data distribution within each group seems relatively normal.
(d) The p-value is greater than 0.05, so we do not reject the null hypothesis: the data do not provide convincing evidence of a difference in average hours worked between the 5 groups.
library(knitr)
work_hours <- data.frame(
  mean = c(38.67, 39.6, 41.39, 42.55, 40.85),
  sd = c(15.81, 14.97, 18.1, 13.62, 15.51),
  n = c(121, 546, 97, 253, 155)
)
n <- sum(work_hours$n)
k <- length(work_hours$sd)
k
## [1] 5
df_deg <- k - 1
df_deg
## [1] 4
df_res <- n - k
df_res
## [1] 1167
prf <- 0.0682
total_mean <- 40.45
# F value whose upper-tail probability equals the reported Pr(>F)
f <- qf(1 - prf, df_deg, df_res)
f
## [1] 2.188931
# sum of squares between groups
SSG <- sum(work_hours$n * (work_hours$mean - total_mean)^2)
SSG
## [1] 2004.101
# MSG is the mean square for degree; since F = MSG / MSE, MSE = MSG / f
MSG <- 501.54
MSE <- MSG / f
round(MSE, 2)
## [1] 229.13
colnames(work_hours) <- c("mean","sd","n")
knitr::kable(work_hours)
mean | sd | n |
---|---|---|
38.67 | 15.81 | 121 |
39.60 | 14.97 | 546 |
41.39 | 18.10 | 97 |
42.55 | 13.62 | 253 |
40.85 | 15.51 | 155 |
anova548 <- data.frame(
  names = c("degree", "Residuals", "Total"),
  Df = c("4", "1167", "1171"),
  SumSq = c("2004.1", "267382", "269386.1"),
  MeanSq = c("501.54", "229.13", ""),
  Fvalue = c("2.19", "", ""),
  prf = c("0.0682", "", "")
)
colnames(anova548) <- c("names", "Df", "Sum Sq", "Mean Sq", "F value", "Pr(>F)")
knitr::kable(anova548)
names | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
---|---|---|---|---|---|
degree | 4 | 2004.1 | 501.54 | 2.19 | 0.0682 |
Residuals | 1167 | 267382 | 229.13 | | |
Total | 1171 | 269386.1 | | | |
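As a cross-check, the table entries can be approximated from the group summaries alone. This is a sketch, not the exercise's official solution: it assumes the residual sum of squares 267382 from the table and the rounded overall mean 40.45, and the names `grp_mean`, `grp_n`, `F_stat`, and `p_val` are introduced here for illustration.

```r
# Group summaries and the residual sum of squares, as used above
grp_mean   <- c(38.67, 39.6, 41.39, 42.55, 40.85)
grp_n      <- c(121, 546, 97, 253, 155)
total_mean <- 40.45    # rounded overall mean
SSE        <- 267382   # residual sum of squares from the table

k <- length(grp_mean)  # number of groups
N <- sum(grp_n)        # total number of observations

SSG    <- sum(grp_n * (grp_mean - total_mean)^2)  # between-group sum of squares
MSG    <- SSG / (k - 1)                           # mean square between groups
MSE    <- SSE / (N - k)                           # mean square error
F_stat <- MSG / MSE                               # ANOVA F statistic
p_val  <- pf(F_stat, k - 1, N - k, lower.tail = FALSE)

round(c(SSG = SSG, MSG = MSG, MSE = MSE, F = F_stat, p = p_val), 4)
```

Because the group means and the overall mean are rounded, MSG comes out near 501.03 rather than 501.54, and F and the p-value match the table only approximately. The conclusion is unchanged: the p-value stays between 0.05 and 0.1.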