Homework 4

Q1) Modified from Q15. Ch4

a.

Assuming sampling is random, what is the standard error of the mean in men, and what is it in women?

Standard errors were calculated using sd/sqrt(n).

SE for men: 0.0986

SE for women: 0.0583
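As a check on the arithmetic, the calculation can be sketched in R. The standard deviations and sample sizes are not reproduced in this write-up, so the values below (SD 6.7 with n = 4620 for men, SD 4.6 with n = 6228 for women) are assumptions meant to match the question's summary table; they do reproduce the standard errors reported here.

```r
# Assumed summary values (NOT stated in this write-up)
sd_men <- 6.7
n_men <- 4620
sd_women <- 4.6
n_women <- 6228

# Standard error of the mean: sd / sqrt(n)
se_men <- sd_men / sqrt(n_men)
se_women <- sd_women / sqrt(n_women)

round(c(men = se_men, women = se_women), 4)
```

Under these assumed inputs the result rounds to 0.0986 for men and 0.0583 for women.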

b.

Using the 2SE rule of thumb, what is the 95% confidence interval for men? What is it for women?

CI for men: 3.6 < mu < 4.0

CI for women: 2.3 < mu < 2.5
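The 2SE rule of thumb is simply the mean plus or minus two standard errors. A minimal sketch, using the means from the question and the standard errors from part (a); the variable names are illustrative:

```r
# Sample means and the standard errors from part (a)
means <- c(men = 3.8, women = 2.4)
se <- c(men = 0.0986, women = 0.0583)

# 2SE rule of thumb: mean +/- 2 * SE
ci <- rbind(lower = means - 2 * se, upper = means + 2 * se)
round(ci, 1)
```

Rounded to one decimal place, this gives 3.6 to 4.0 for men and 2.3 to 2.5 for women.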

c.

Plot these data. Indicate your confidence in your estimate of the mean by noting +/- 2 standard errors of the mean. Describe how you would plot these data if you had raw data rather than these summaries.

I made a bar plot and added line segments spanning +/- 2 standard errors around each mean.

# Sample means and the standard errors from part (a)
means <- c(men = 3.8, women = 2.4)
se <- c(men = 0.0986, women = 0.0583)
# barplot() returns the x-coordinates of the bar midpoints
mids <- barplot(height = means, ylim = c(0, 4.5), main = "Number of Sexual Partners in the Last Five Years", xlab = "Gender", ylab = "Number of Heterosexual Partners", col = c("blue", "red"))
# Error bars: mean +/- 2 standard errors
segments(x0 = mids, x1 = mids, y0 = means - 2 * se, y1 = means + 2 * se)

With the raw data, a strip chart or boxplot would have been a better representation, because it shows the full spread of the observations in each group as well as their center.

d.

What proportion of men had more than 3.8 heterosexual partners - half? more than half? less than half? cannot say?

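Less than half of men had more than 3.8 heterosexual partners. The number of partners cannot be negative, and a few men report very large numbers, so the distribution is right-skewed and the mean (3.8) lies above the median.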
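One way to see why fewer than half can exceed the mean: in a right-skewed distribution the mean sits above the median. The sketch below uses simulated counts from a negative binomial distribution with a mean near 3.8; both the distribution and its `size` parameter are assumptions chosen for illustration, not the survey data.

```r
set.seed(1)  # make the simulation reproducible

# Hypothetical right-skewed counts with mean near 3.8 (simulated, NOT the survey data)
partners <- rnbinom(10000, size = 0.5, mu = 3.8)

mean(partners)                    # close to 3.8
median(partners)                  # below the mean
mean(partners > mean(partners))   # proportion above the mean, under 0.5
```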

e.

Which is a better description of the variation among men in the number of sexual partners, the standard deviation or the standard error? Why?

Standard deviation is the better description of the variation among men, because it measures the spread of the individual observations. Standard error instead measures the precision of the estimate, that is, how closely the sample mean is expected to approximate the population mean.

f.

Which is a better description of the uncertainty in the estimated mean number of sexual partners of women in this population, the standard deviation or the standard error? Why?

Standard error is the better description of the uncertainty in the estimated mean, because that is exactly what it measures: the precision with which the sample mean estimates the population mean.

g.

A mysterious result of the study is the discrepancy between the mean number of partners of heterosexual men and women. If each sex obtains its partner from the other sex (and there are an equal number of heterosexual men and women in the population) then the true mean number of heterosexual partners should be identical. Considering aspects of the study design, suggest an explanation for the discrepancy.

It was a survey, which introduces a few potential sources of error. One is that responses are self-reported and people may lie (maybe women “under-report” and men “over-report” their number of sexual partners…). The second is that the respondents might not be representative of the population, because people who are willing to respond might be those who are more proud of, or simply more open about, their number of sexual partners.

Q2) Modified from Q19. Ch4

a.

Identify which graph goes with which distribution.

A = distribution of sample means for samples of size 10

B = random sample

C = distribution of sample means for samples of size 100

b.

What features of these distributions allowed you to distinguish which was which?

The amount of spread was the main feature that allowed me to distinguish them. The distribution of the original data (the random sample) has the most spread, and a distribution of sample means becomes narrower (the estimate more precise) as the sample size increases.
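The narrowing with sample size can be illustrated by simulation. This is only a sketch: the population below is simulated from a normal distribution with mean 7.5 and an assumed standard deviation of 1.2, not the actual sleep data, and `sample_means` is a hypothetical helper.

```r
set.seed(42)  # make the simulation reproducible

# Hypothetical population of nightly sleep hours (simulated, NOT the study data)
population <- rnorm(100000, mean = 7.5, sd = 1.2)

# Draw many samples of size n and record each sample mean
sample_means <- function(n, reps = 2000) {
  replicate(reps, mean(sample(population, n)))
}

sd_raw <- sd(population)          # spread of individual values
sd_10 <- sd(sample_means(10))     # roughly sd_raw / sqrt(10)
sd_100 <- sd(sample_means(100))   # roughly sd_raw / sqrt(100)

c(raw = sd_raw, n10 = sd_10, n100 = sd_100)
```

The three values should decrease from left to right, mirroring the order in which the graphs were matched to distributions.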

c.

Estimate, by eye, the mean number of hours of sleep using the distribution of the data.

7.5

e.

Estimate, by eye, the range of hours of sleep per night Europeans get.

4.5-11.5

f.

Which figure best conveys the variability in this population?

Figure B best presents the variability of the population.

Q3) Modified from Q22 Ch4

a.

a1 According to the values in the table, which relationship group gets the longest hugs, on average, and which gets the briefest hugs?

On average, the coaches get the longest hugs and the competitors get the briefest hugs.

a2 Do the values shown represent parameters or sample estimates? Explain.

These values are sample estimates, because the information is not known for the entire population of Olympians, only for this specific sample of that population.

b.

b1 Using the numbers in the table, calculate the standard error of the mean hug duration for each relationship group.

SE for coach: 0.45

SE for supporter: 0.32

SE for competitor: 0.20

b2 What do these values measure?

These values measure the precision of the estimates. Each standard error represents the amount of uncertainty in the sample mean as an estimate of the mean hug duration for the entire population of Olympic athletes.

c.

What assumptions of the samples are you making in (b)?

We are assuming these are random samples of the entire population of Olympic athletes.

d.

Using the information in this table, calculate an approximate 95% confidence interval for the mean hug duration when athletes embrace

d1 coach

2.87 < mu < 4.67

d2 supporter

2.52 < mu < 3.80

d3 competitor

1.42 < mu < 2.20
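These intervals follow the same mean +/- 2 SE rule. A minimal sketch using the means and the rounded standard errors from part (b1); the variable names are illustrative:

```r
# Means and rounded standard errors from the answers above
means <- c(coach = 3.77, supporter = 3.16, competitor = 1.81)
se <- c(coach = 0.45, supporter = 0.32, competitor = 0.20)

# Approximate 95% CI: mean +/- 2 * SE
ci <- rbind(lower = means - 2 * se, upper = means + 2 * se)
ci
```

With the rounded SE of 0.20, the competitor interval comes out as 1.41 to 2.21. The slightly narrower 1.42 to 2.20 reported above presumably reflects an unrounded standard error.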

e.

Plot these data noting the uncertainty in your values with +/- two standard errors.

This is my bar plot showing means and standard errors for each group:

# Sample means and the standard errors from part (b1)
means <- c(coach = 3.77, supporter = 3.16, competitor = 1.81)
se <- c(coach = 0.45, supporter = 0.32, competitor = 0.20)
# barplot() returns the x-coordinates of the bar midpoints
mids <- barplot(height = means, ylim = c(0, 5), main = "Duration of Hugs Given by Olympic Athletes", xlab = "Person Receiving Hug", ylab = "Duration of Hug (in seconds)", col = c("red", "blue", "green"))
# Error bars: mean +/- 2 standard errors
segments(x0 = mids, x1 = mids, y0 = means - 2 * se, y1 = means + 2 * se)

f.

From your (approximate 95%) confidence intervals, how confident are you that athletes hug (f1) coaches and supporters for different amounts of time? (f2) coaches and competitors for different amounts of time? (f3) competitors and supporters for different amounts of time?

I am confident that (f2) coaches and competitors and (f3) supporters and competitors are hugged for different amounts of time, because the confidence intervals for these means do not overlap. I am less confident that (f1) coaches and supporters are hugged for different amounts of time, because these confidence intervals overlap substantially, although there is still some reason to believe there is a difference.
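The overlap comparison can be written out explicitly. The sketch below uses the intervals reported in part (d); `overlaps` is a hypothetical helper, not a built-in R function.

```r
# Approximate 95% confidence intervals reported in part (d)
ci <- list(
  coach      = c(2.87, 4.67),
  supporter  = c(2.52, 3.80),
  competitor = c(1.42, 2.20)
)

# Two intervals overlap when each one starts before the other ends
overlaps <- function(a, b) a[1] <= b[2] && b[1] <= a[2]

overlaps(ci$coach, ci$supporter)       # TRUE
overlaps(ci$coach, ci$competitor)      # FALSE
overlaps(ci$supporter, ci$competitor)  # FALSE
```

Overlapping intervals do not by themselves show that two means are equal; they only mean this rule of thumb cannot separate them.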

Q4) Modified from Q23 Ch4

a.

Does this result imply that individual plants receive between 57% and 100% of their leaf nitrogen from tree shrews? Explain.

Not necessarily. The confidence interval says that we can be 95% confident that the population mean fraction lies between 0.57 and 1.0. It describes uncertainty about the mean, not the spread of values among individual plants, so it does not tell us what range individual plants fall in.

b.

Is the confidence interval meant to bracket the sample mean or the population mean?

The confidence interval is meant to give an interval for which we can be 95% confident that it contains the true population mean.
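What "95% confident" means can be illustrated by simulation: if the study were repeated many times, about 95% of the intervals built this way would contain the true mean. This is a sketch only. The true mean of 0.78, the standard deviation of 0.2, the sample size of 30, and the normal distribution are all assumptions chosen for illustration, and `covers` is a hypothetical helper.

```r
set.seed(7)  # make the simulation reproducible

true_mean <- 0.78  # hypothetical population mean (an assumption)

# For one simulated sample, does mean +/- 2 SE contain the true mean?
covers <- function(n = 30) {
  x <- rnorm(n, mean = true_mean, sd = 0.2)
  se <- sd(x) / sqrt(n)
  abs(mean(x) - true_mean) <= 2 * se
}

# Fraction of 5000 simulated intervals that contain the true mean
coverage <- mean(replicate(5000, covers()))
coverage  # close to 0.95
```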

c.

Identify two values for the mean shrew fraction of total leaf nitrogen that the analysis suggests are among the most plausible.

0.78 and 0.79 are probably the most plausible population means for this value.

d.

Identify two values for the mean shrew fraction of total leaf nitrogen that the analysis suggests are less plausible.

Values at the edges of the interval, such as 0.57 and 1.0, are less plausible as the true population mean.

Homework 4

Angel Davis

February 2, 2016
