Data 606 HW 4

4.4

Point estimate mean: 171.1. Point estimate Median: 170.3
Point estimate std. dev.: 9.4. Point estimate IQR: 14
180 cm is not unusually tall: it lies about 0.95 standard deviations above the mean, well within 2 SD, which would be considered the normal range. While short, 155 cm lies about 1.71 standard deviations below the mean, so it also falls within 2 SD.
I would not expect the mean and standard deviation of a new sample to match these values exactly, since point estimates are based on a limited sample and vary from sample to sample.
The standard error quantifies the variability of a point estimate. The standard error of this sample mean is 9.4/sqrt(507) = 0.417.
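The arithmetic behind the last three answers can be checked with a short sketch. The variable names are illustrative; the inputs are the point estimates reported above.

```r
# Summary statistics reported above for the sample of heights (cm)
height_mean <- 171.1
height_sd <- 9.4
n_heights <- 507

# z-scores: how many standard deviations each height lies from the mean
z_180 <- (180 - height_mean) / height_sd
z_155 <- (155 - height_mean) / height_sd
round(c(z_180, z_155), 2)

# Standard error of the sample mean
height_se <- height_sd / sqrt(n_heights)
round(height_se, 3)
```

Both z-scores have an absolute value below 2, which is the rule of thumb used above for "not unusual".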

4.14

False, the average spending of this sample is the sample mean, $84.71.
False, the sample size n=436 is large enough to allow use of the normal model.
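False, the confidence interval estimates the population mean; it does not describe where 95% of sample means fall.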
True
True
False, the margin of error shrinks with the square root of the sample size, so cutting it to a third requires sampling 9 times the number of people.
True, the margin of error is half the width of the 95% confidence interval given in the exercise: (89.11 - 80.31) / 2 = 4.4.
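The square-root relationship between sample size and margin of error can be illustrated with a small sketch. The function `margin_of_error` and the standard deviation `s_assumed` are hypothetical and chosen only for illustration; the sample standard deviation is not stated above.

```r
# Margin of error for a mean: z* times the standard error s / sqrt(n)
margin_of_error <- function(s, n, conf_level = 0.95) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)
  z_star * s / sqrt(n)
}

s_assumed <- 50  # hypothetical standard deviation, for illustration only
n <- 436

me_n <- margin_of_error(s_assumed, n)
me_3n <- margin_of_error(s_assumed, 3 * n)
me_9n <- margin_of_error(s_assumed, 9 * n)

# Ratios of the original margin of error to the reduced ones
c(three_times = me_n / me_3n, nine_times = me_n / me_9n)
```

Tripling the sample only divides the margin of error by sqrt(3); a sample nine times larger is needed to divide it by 3, whatever standard deviation is assumed.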

4.24

Yes, the sample size is greater than 30, the observations are independent, and the distribution is not strongly skewed.
Defining the hypotheses:
H0: The average age at which gifted children count to 10 = 32 months
HA: The average age at which gifted children count to 10 < 32 months
Here we calculate the standard error of the sampling distribution, the z-score of the observed mean, and the probability of a z-score at least that low.

n <- 36
sample_mean <- 30.69
pop_mean <- 32
sd <- 4.31
(se <- sd/sqrt(n))

## [1] 0.7183333

(z <- (sample_mean-pop_mean)/se)

## [1] -1.823666

pnorm(z)

## [1] 0.0341013

As shown above, the p-value of 0.034 is less than the significance level of 0.10, thus we reject the null hypothesis.

If the true average age were 32 months, a sample mean of 30.69 months or lower would occur only about 3.4% of the time.
Calculate the 90% confidence interval:

(confidence_interval <- c((sample_mean-(1.645*se)),(sample_mean + (1.645*se))))

## [1] 29.50834 31.87166

The results of the two approaches agree: the hypothesis test rejects the null hypothesis, and the 90% confidence interval (29.51, 31.87) does not contain the null value of 32 months.
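The steps of this exercise can be collected into one function, sketched below. The helper `z_test_lower` is hypothetical and not part of any package. It uses `qnorm` for the critical value instead of the rounded 1.645, so the interval can differ from the one above in the last decimals.

```r
# Hypothetical helper: lower-tail one-sample z-test plus a two-sided confidence interval
z_test_lower <- function(sample_mean, null_mean, s, n, conf_level = 0.90) {
  se <- s / sqrt(n)                          # standard error of the mean
  z <- (sample_mean - null_mean) / se        # z-score of the observed mean
  z_star <- qnorm(1 - (1 - conf_level) / 2)  # critical value for the interval
  list(
    z = z,
    p_value = pnorm(z),                      # lower-tail probability
    conf_int = sample_mean + c(-1, 1) * z_star * se
  )
}

result <- z_test_lower(30.69, 32, 4.31, 36)
result$p_value < 0.10   # does the test reject H0 at the 0.10 level?
result$conf_int[2] < 32 # does the interval lie entirely below the null value?
```

A two-sided 90% interval corresponds to a one-sided test at the 0.05 level; the p-value here is below both 0.05 and 0.10, so the test and the interval lead to the same conclusion.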

4.26

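H0: The average IQ of mothers of gifted children is the same as the population average of 100. HA: The average IQ of mothers of gifted children is not 100. As shown below, if the null hypothesis were true, the probability of observing a sample mean this far from 100 would be near zero. Thus we reject the null hypothesis.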

pop_mean <- 100
smp_mean <- 118.2
(se <- 6.5/sqrt(36))

## [1] 1.083333

(z <- (smp_mean-pop_mean)/se)

## [1] 16.8

2 * (1 - pnorm(z)) # two-sided p-value, since HA is "not the same"

## [1] 0

Calculate the 90% confidence interval:

(confidence_interval <- c((smp_mean-(1.645*se)),(smp_mean + (1.645*se))))

## [1] 116.4179 119.9821

The hypothesis test and the confidence interval agree: the 90% confidence interval (116.42, 119.98) does not contain 100, so it is not plausible that the average IQ of mothers of gifted children equals the average IQ of the population at large.
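The p-value printed as 0 for this test is a floating-point artifact, not an exact zero. A minimal sketch of the difference, using the `lower.tail` argument of `pnorm` (the names `p_naive` and `p_tail` are illustrative):

```r
z <- (118.2 - 100) / (6.5 / sqrt(36))

# Subtracting from 1 rounds to exactly 0, because pnorm(z) is
# indistinguishable from 1 in double precision
p_naive <- 2 * (1 - pnorm(z))

# Computing the upper tail directly keeps the tiny probability representable
p_tail <- 2 * pnorm(abs(z), lower.tail = FALSE)

c(naive = p_naive, tail = p_tail)
```

Either way the conclusion is unchanged: the p-value is far below any conventional significance level.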

4.34
The sampling distribution of the mean is the distribution of the means of all possible samples of size n drawn from the population. As the sample size increases, the center stays at the population mean, the shape becomes closer to normal, and the spread becomes smaller.
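A small simulation can illustrate these properties. This is a sketch under stated assumptions: the population is a hypothetical right-skewed exponential distribution with mean 1 and standard deviation 1, and the seed, sample sizes, and number of repetitions are arbitrary choices.

```r
set.seed(606)  # arbitrary seed so the simulation is reproducible

# Draw many samples of size n from the assumed population and keep each sample mean
simulate_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

sample_sizes <- c(5, 30, 100)
sim <- lapply(sample_sizes, simulate_means)

summary_table <- data.frame(
  n = sample_sizes,
  center = sapply(sim, mean),              # average of the simulated sample means
  spread = sapply(sim, stats::sd),         # standard deviation of the sample means
  theoretical_se = 1 / sqrt(sample_sizes)  # population sd divided by sqrt(n)
)
summary_table
```

The `center` column should stay close to 1 for every n, while the `spread` column should shrink roughly in line with `theoretical_se`. Calling `hist(sim[[1]])` and `hist(sim[[3]])` would show the shape moving from skewed toward normal.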

4.40

What is the probability that a randomly chosen penny weighs less than 2.4 grams?

mean <- 2.5
sd <- .03
pnorm(2.4, mean, sd)

## [1] 0.0004290603

The sampling distribution of the mean weight of 10 pennies would be normal, with a mean of 2.5 grams and a standard error of about 0.0095 grams.

(se <- sd/sqrt(10))

## [1] 0.009486833
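The standard error can be cross-checked by simulation. This is a minimal sketch, assuming penny weights are normal with mean 2.5 grams and standard deviation 0.03 grams as in the code above; the seed and the number of repetitions are arbitrary.

```r
set.seed(606)  # arbitrary seed for reproducibility

penny_mean <- 2.5   # population mean weight in grams
penny_sd <- 0.03    # population standard deviation in grams

# Mean weight of 10 randomly drawn pennies, repeated 10,000 times
sample_means <- replicate(10000, mean(rnorm(10, penny_mean, penny_sd)))

c(simulated_se = stats::sd(sample_means),
  theoretical_se = penny_sd / sqrt(10))
```

The two values should agree closely, which supports using sd/sqrt(10) as the spread of the sampling distribution.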

What is the probability that the mean weight of 10 pennies is less than 2.4 grams?
There is a near zero probability that the mean weight of 10 pennies is less than 2.4 grams.

pnorm(2.4,2.5,se)

## [1] 2.797279e-26

Sketch the two distributions:

library(DATA606)

par(mfrow=c(1,2))
normalPlot(2.5, .03, bounds=c(-5, 2.4)) # Randomly chosen penny weighs less than 2.4 grams
normalPlot(2.5, se, bounds=c(-5, 2.4)) # Mean of a sample of n=10 pennies is less than 2.4 grams

4.48

The p-value is derived from the z-score, and the z-score is calculated using the standard error, which in turn depends on the sample size: zscore = (sample_mean - pop_mean) / (sample_std_dev / sqrt(sample_size)). As shown below, when the sample size increases and everything else stays fixed, the standard error shrinks, so the absolute value of the z-score increases (a negative z-score becomes more negative, a positive one becomes larger). As a result, the associated p-value decreases.

sample_size <- 30
sample_mean <- 10
pop_mean <- 11
sample_std_dev <- 1
(zscore <- (sample_mean - pop_mean) / (sample_std_dev / sqrt(sample_size)))

## [1] -5.477226

sample_size <- 60
(zscore <- (sample_mean - pop_mean) / (sample_std_dev / sqrt(sample_size)))

## [1] -7.745967
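The same comparison can be carried through to the p-values themselves. This sketch reuses the illustrative numbers above and assumes a two-sided test; `p_value_for_n` is a hypothetical helper.

```r
sample_mean <- 10
pop_mean <- 11
sample_std_dev <- 1

# Two-sided p-value as a function of sample size, holding everything else fixed
p_value_for_n <- function(sample_size) {
  zscore <- (sample_mean - pop_mean) / (sample_std_dev / sqrt(sample_size))
  2 * pnorm(-abs(zscore))
}

p_values <- sapply(c(30, 60), p_value_for_n)
p_values
```

The second p-value is smaller than the first, matching the argument that a larger sample size decreases the p-value when the observed difference and standard deviation are unchanged.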

John Perez

3/10/2019