A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.
upper_limit <- 77
lower_limit <- 65
xbar <- ((upper_limit+lower_limit)/2)
xbar
## [1] 71
ME <- upper_limit - xbar
ME
## [1] 6
n <- 25
df <- n-1
t <- qt(.95, df)
se <- (ME/t)
sd <- se * sqrt(n)
sd
## [1] 17.53481
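As a sanity check, the recovered values can be plugged back into the t-interval formula xbar ± t* × s / sqrt(n). This is a minimal sketch; the names `t_star`, `s_hat`, and `ci` are illustrative and not part of the original solution.

```r
# Rebuild the 90% interval from the recovered mean and standard deviation.
n <- 25
xbar <- 71
ME <- 6
t_star <- qt(0.95, df = n - 1)   # critical value for a two-sided 90% interval
s_hat <- ME / t_star * sqrt(n)   # sample standard deviation implied by ME
ci <- xbar + c(-1, 1) * t_star * s_hat / sqrt(n)
ci                               # should reproduce the bounds 65 and 77
```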
SAT scores of students at an Ivy League college are distributed with a standard deviation of 250 points. Two statistics students, Raina and Luke, want to estimate the average SAT score of students at this college as part of a class project. They want their margin of error to be no more than 25 points.
The margin of error is ME = z * (standard deviation / sqrt(n)), so solving for n gives n = (z * standard deviation / ME)^2. For a 90% confidence level, z = qnorm(.95).
sd <- 250
ME <- 25
z <- qnorm(.95)
n <- ((z*sd)/ME)^2
n
## [1] 270.5543
For a 99% confidence level the sample size will be larger, because the critical z value is larger.
sd <- 250
ME <- 25
z <- qnorm(.995)
n <- ((z*sd)/ME)^2
n
## [1] 663.4897
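Both results are fractional, and a sample size has to be a whole number. Rounding down would leave the margin of error slightly above 25 points, so the value is rounded up. A small sketch of that step follows; `min_sample_size` is a hypothetical helper, not a function from the original or from any package.

```r
# Smallest whole sample size whose margin of error does not exceed `me`.
min_sample_size <- function(sd, me, conf_level) {
  z <- qnorm(1 - (1 - conf_level) / 2)  # two-sided critical value
  ceiling((z * sd / me)^2)              # round up, never down
}

n_90 <- min_sample_size(sd = 250, me = 25, conf_level = 0.90)
n_99 <- min_sample_size(sd = 250, me = 25, conf_level = 0.99)
c(n_90, n_99)  # 271 and 664
```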
5.20 High School and Beyond, Part I. The National Center of Education Statistics conducted a survey of high school seniors, collecting test data on reading, writing, and several other subjects. Here we examine a simple random sample of 200 students from this survey. Side-by-side box plots of reading and writing scores as well as a histogram of the differences in scores are shown below.
library(knitr)
include_graphics('/Users/Michele/Desktop/520.png')
There is no clear difference between reading and writing scores; the two distributions look similar. The only slight difference is that the range of the writing scores is smaller.
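No. The reading and writing scores of each student are paired, since both come from the same student, so they are not independent of each other. Scores of different students are independent because the students were randomly sampled.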
H0: μdiff = 0, where μdiff is the average of (reading score - writing score) across students.
HA: μdiff != 0
The sample is large (n = 200), the students are independent of one another because they were randomly sampled, and the distribution of the differences appears roughly normal.
We start by finding the standard error, then the test statistic and the p-value.
n <- 200
sd <- 8.8817
xdiff <- -.545
sediff <- sd/(sqrt(n))
t <- (xdiff-0)/sediff
df <- n-1
# HA is two-sided, so double the tail area beyond |t|
p <- 2 * pt(-abs(t), df = df)
round(p, 4)
## [1] 0.3866
Since the p-value is greater than .05, we fail to reject H0: the data do not provide convincing evidence of a difference between students’ average reading and writing scores.
Type I Error = rejecting the null hypothesis when it is actually true.
Type II Error = failing to reject the null hypothesis when the alternative is actually true.
We may have made a Type II error: concluding that there is no difference between reading and writing scores when there actually is one.
I would expect it to contain 0: we failed to reject the null hypothesis at the .05 level, so 0 is a plausible value for the average difference between reading and writing scores.
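This expectation can be checked by computing the 95% confidence interval for the average difference from the same summary statistics. This is a sketch only; `sd_diff`, `se_diff`, `t_star`, and `ci_diff` are illustrative names.

```r
# 95% confidence interval for the mean of (reading - writing) differences.
n <- 200
xdiff <- -0.545       # sample mean of the differences
sd_diff <- 8.8817     # sample standard deviation of the differences
se_diff <- sd_diff / sqrt(n)
t_star <- qt(0.975, df = n - 1)
ci_diff <- xdiff + c(-1, 1) * t_star * se_diff
ci_diff               # lower bound is negative, upper bound is positive
```

Because the interval straddles 0, it agrees with the decision not to reject H0 at the .05 level.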
Each year the US Environmental Protection Agency (EPA) releases fuel economy data on cars manufactured in that year. Below are summary statistics on fuel efficiency (in miles/gallon) from random samples of cars with manual and automatic transmissions manufactured in 2012. Do these data provide strong evidence of a difference between the average fuel efficiency of cars with manual and automatic transmissions in terms of their average city mileage? Assume that conditions for inference are satisfied.
include_graphics('/Users/Michele/Desktop/532.png')
H0: μauto - μmanual = 0, where μ is the average city MPG for each transmission type.
HA: μauto - μmanual != 0
n <- 26
mean_auto <- 16.12
sd_auto <- 3.58
mean_man <- 19.85
sd_man <- 4.51
meandiff <- mean_auto - mean_man
SEdiff <- (((sd_auto^2)/n) + ((sd_man^2)/n))^0.5
t <- (meandiff-0)/SEdiff
df <- n-1
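# HA is two-sided, so double the tail area beyond |t|
p <- 2 * pt(-abs(t), df = df)
round(p, 4)
## [1] 0.0029
Since the p-value is less than .05, we reject H0: the data provide strong evidence of a difference between the average city mileage of cars with automatic and manual transmissions.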
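The test above uses df = n - 1 = 25, the conservative choice for hand calculation. R's `t.test()` uses the Welch-Satterthwaite approximation by default, which can be sketched from the same summary statistics. The names `v_auto`, `v_man`, `df_welch`, and `p_welch` are illustrative, not from the original.

```r
# Welch-Satterthwaite degrees of freedom from summary statistics.
n_auto <- 26; n_man <- 26
mean_auto <- 16.12; sd_auto <- 3.58
mean_man <- 19.85;  sd_man <- 4.51

v_auto <- sd_auto^2 / n_auto   # variance of the automatic sample mean
v_man <- sd_man^2 / n_man      # variance of the manual sample mean
t_stat <- (mean_auto - mean_man) / sqrt(v_auto + v_man)

df_welch <- (v_auto + v_man)^2 /
  (v_auto^2 / (n_auto - 1) + v_man^2 / (n_man - 1))
p_welch <- 2 * pt(-abs(t_stat), df = df_welch)  # two-sided p-value
c(df = df_welch, p = p_welch)
```

The approximate degrees of freedom fall between 25 and 50, so the conservative df = 25 gives a somewhat larger p-value, and the conclusion is the same under either choice.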
The General Social Survey collects data on demographics, education, and work, among many other characteristics of US residents. Using ANOVA, we can consider educational attainment levels for all 1,172 respondents at once. Below are the distributions of hours worked by educational attainment and relevant summary statistics that will be helpful in carrying out this analysis.
include_graphics('/Users/Michele/Desktop/548a.png')
H0: The mean outcome is the same across all groups. μ1 = μ2 = · · · = μk
HA: At least one mean is different.
The observations are independent within and across groups, the data within each group are nearly normal, and the variability across the groups is about equal.
include_graphics('/Users/Michele/Desktop/548b.png')
n <- 1172
k <- 5
dfG <- k-1
dfR <- n-k
totaldf <- dfG + dfR
# MSR is the residual sum of squares divided by its degrees of freedom
SSR <- 267382
MSR <- SSR / dfR
# F is the ratio of the two mean squares; the p-value is given in the table
MSG <- 501.54
F <- MSG / MSR
p <- .0682
# SSG follows from MSG and its degrees of freedom
SSG <- dfG * MSG
SST <- SSG + SSR
ANOVA <- c("degree", "residuals", "Total")
Df <- c(dfG, dfR, totaldf)
Sum_Sq <- c(SSG, SSR, SST)
Mean_Sq <- c(MSG, round(MSR, 2), "")
F_value <- c(round(F, 2), "", "")
P <- c(p, "", "")
ANOVA_df <- data.frame(ANOVA, Df, Sum_Sq, Mean_Sq, F_value, P)
ANOVA_df
## ANOVA Df Sum_Sq Mean_Sq F_value P
## 1 degree 4 2006.16 501.54 2.19 0.0682
## 2 residuals 1167 267382.00 229.12
## 3 Total 1171 269388.16
Since the p-value is greater than .05, we fail to reject H0: the data do not provide convincing evidence that the average hours worked differ across educational attainment levels.
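The table can be checked for internal consistency by recomputing the F statistic from the two given quantities (MSG and the residual sum of squares) and comparing the upper tail of the F distribution with the reported p-value. This is a sketch; `df_group`, `df_resid`, `F_stat`, and `p_check` are illustrative names.

```r
# Recompute F and its p-value from the values given in the ANOVA table.
n <- 1172
k <- 5
MSG <- 501.54
SSR <- 267382
df_group <- k - 1
df_resid <- n - k
MSR <- SSR / df_resid
F_stat <- MSG / MSR
# Upper-tail area; numerator degrees of freedom come first in pf()
p_check <- pf(F_stat, df_group, df_resid, lower.tail = FALSE)
c(F = F_stat, p = p_check)  # p should be close to the reported 0.0682
```

Note the argument order: `pf()` and `qf()` take the group (numerator) degrees of freedom before the residual (denominator) degrees of freedom.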