OpenIntro Data
price <- ames$SalePrice
# Draw one random sample of 50 sale prices; its mean is a point estimate.
samp1 <- sample(price, 50)
mean(samp1)
## [1] 173831.1
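For living area, I would guess the mean to be about 1490, but the mean of the sampling distribution simulated below is 1498.356. Note that samp1 above was drawn from price, so its mean of 173831.1 is a point estimate of the mean sale price rather than of living area.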
area <- ames$Gr.Liv.Area
# Build the sampling distribution of the mean: 5000 samples of size 50.
sample_means50 <- rep(NA, 5000)
for (i in 1:5000) {
  samp <- sample(area, 50)
  sample_means50[i] <- mean(samp)
}
hist(sample_means50, breaks = 25)
mean(sample_means50)
## [1] 1498.356
Based on this sampling distribution, I would guess the mean living area (Gr.Liv.Area) of homes in Ames to be about 1500 square feet.
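The reasoning behind this guess is that the center of the sampling distribution sits very close to the population mean. Below is a minimal sketch of that idea. It does not use the ames data: `population` is a hypothetical, simulated stand-in (a right-skewed gamma draw with a mean near 1500), and the names `pop_mean` and `est_mean` are my own.

```r
# Hypothetical stand-in population (NOT the ames data): right-skewed, mean near 1500.
set.seed(1)
population <- rgamma(3000, shape = 9, scale = 166)

# Draw 5000 samples of size 50 and record the mean of each one.
sample_means <- replicate(5000, mean(sample(population, 50)))

pop_mean <- mean(population)    # the "true" value we are trying to estimate
est_mean <- mean(sample_means)  # center of the simulated sampling distribution

# The gap between the two should be tiny relative to the population mean.
abs(est_mean - pop_mean)
```

Here `replicate()` does the same job as the for loop above, just in one line.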
samp2 <- sample(price, 150)
# Build the sampling distribution of the mean: 5000 samples of size 150.
sample_means150 <- rep(NA, 5000)
for (i in 1:5000) {
  samp <- sample(area, 150)
  sample_means150[i] <- mean(samp)
}
hist(sample_means150, breaks = 25)
mean(sample_means150)
## [1] 1499.824
The sampling distribution for samples of size 150 has the smaller spread. If we want an estimate closer to the true value, we’d want a distribution with a small spread, so the larger sample size is preferable.
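The drop in spread can be checked against the usual standard error formula, $\sigma/\sqrt{n}$. The sketch below is only an illustration: `population` is a hypothetical, simulated stand-in rather than the ames data, and `sim_means` is a helper name I made up.

```r
# Hypothetical stand-in population (NOT the ames data).
set.seed(2)
population <- rgamma(3000, shape = 9, scale = 166)

# Helper (hypothetical name): simulate `reps` sample means for samples of size n.
sim_means <- function(n, reps = 5000) {
  replicate(reps, mean(sample(population, n)))
}

se50  <- sd(sim_means(50))    # spread of the sampling distribution, n = 50
se150 <- sd(sim_means(150))   # spread of the sampling distribution, n = 150
sigma <- sd(population)

# Compare each simulated spread with sigma / sqrt(n).
# Sampling without replacement from a finite population makes the
# simulated values slightly smaller than the formula.
c(simulated = se50,  theory = sigma / sqrt(50))
c(simulated = se150, theory = sigma / sqrt(150))
```

Tripling the sample size from 50 to 150 should shrink the spread by roughly a factor of $\sqrt{3}$.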
MileStone Data
area <- PokemonStat$Attack
# Sampling distribution of the mean Attack for samples of size 5.
sample_means5 <- rep(NA, 5000)
for (i in 1:5000) {
  samp <- sample(area, 5)
  sample_means5[i] <- mean(samp)
}
hist(sample_means5, breaks = 25)
# Same simulation for samples of size 25.
sample_means25 <- rep(NA, 5000)
for (i in 1:5000) {
  samp <- sample(area, 25)
  sample_means25[i] <- mean(samp)
}
hist(sample_means25, breaks = 25)
# Same simulation for samples of size 100.
sample_means100 <- rep(NA, 5000)
for (i in 1:5000) {
  samp <- sample(area, 100)
  sample_means100[i] <- mean(samp)
}
hist(sample_means100, breaks = 25)
All three sampling distributions are similar in that they're all unimodal and close to normal, with a mean around 80. Of the three, the one built from samples of size 100 appears to be the closest to a normal distribution.
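One way to back up the "closest to normal" judgment with a number instead of just looking at the histograms is to compute the skewness of each sampling distribution, since a normal distribution has skewness 0. This is a rough sketch under assumptions: `population` is a hypothetical, simulated right-skewed stand-in (not PokemonStat$Attack), and `skewness` is a small helper I defined here, not a base R function.

```r
# Hypothetical right-skewed stand-in population (NOT PokemonStat$Attack).
set.seed(3)
population <- rexp(1000, rate = 1 / 80)

# Helper (defined here, not part of base R): moment-based sample skewness.
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Skewness of 5000 simulated sample means for each sample size.
sizes <- c(5, 25, 100)
skews <- vapply(sizes, function(n) {
  skewness(replicate(5000, mean(sample(population, n))))
}, numeric(1))
names(skews) <- paste0("n=", sizes)

# Values nearer 0 mean a more symmetric, more normal-looking distribution.
skews
```

With a skewed population like this one, the skewness should move toward 0 as the sample size goes up, which matches what the histograms show.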
area <- PokemonStat$Attack
# Sampling distribution of the standard deviation of Attack, samples of size 5.
sample_std5 <- rep(NA, 5000)
for (i in 1:5000) {
  samp <- sample(area, 5)
  sample_std5[i] <- sd(samp)
}
hist(sample_std5, breaks = 25)
# Same simulation for samples of size 25.
sample_std25 <- rep(NA, 5000)
for (i in 1:5000) {
  samp <- sample(area, 25)
  sample_std25[i] <- sd(samp)
}
hist(sample_std25, breaks = 25)
# Same simulation for samples of size 100.
sample_std100 <- rep(NA, 5000)
for (i in 1:5000) {
  samp <- sample(area, 100)
  sample_std100[i] <- sd(samp)
}
hist(sample_std100, breaks = 25)
All three sampling distributions of the sample standard deviation are similar in that they're all unimodal and close to normal, with a mean around 33 or 34. Of the three, the one built from samples of size 100 appears to be the closest to a normal distribution.
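To compare the three sampling distributions of the standard deviation side by side, the center and spread of each can be put in one small table. The sketch below assumes a hypothetical, simulated population (normal with mean 80 and standard deviation 33, chosen only to resemble the numbers above, not the real PokemonStat data), and `summary_tbl` is a name I made up.

```r
# Hypothetical stand-in population (NOT PokemonStat$Attack).
set.seed(4)
population <- rnorm(1000, mean = 80, sd = 33)
pop_sd <- sd(population)

# For each sample size, simulate 5000 sample standard deviations.
sizes <- c(5, 25, 100)
sds <- lapply(sizes, function(n) {
  replicate(5000, sd(sample(population, n)))
})

# Center and spread of each sampling distribution of the sample SD.
summary_tbl <- data.frame(
  n       = sizes,
  mean_sd = sapply(sds, mean),  # average sample SD
  spread  = sapply(sds, sd)     # how much the sample SD varies
)
print(summary_tbl)
pop_sd
```

On average the sample standard deviation slightly underestimates the population value for very small samples, and its spread should shrink as the sample size grows.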