The sample is unlikely to be random. It is probably biased toward male birds, since males are much easier to see and hear than females.
A. By leaving out people who have cellphones (probably a large portion of the population these days), you limit your sample to only those people who have retained house phones and who regularly answer them. This group may not be representative of the diversity of the whole population - perhaps it is an older demographic with a particular set of values and beliefs.
B. The equal-chance requirement will be violated: not everyone has an equal chance of being selected for the study. Only people with home phones who answer them and who are willing to take the survey will be represented.
C. You might be somewhat precise, depending on how alike people who are willing to answer their house phone and who are willing to answer such questions are, but you may not receive an accurate estimate of the population’s opinion. Since accuracy is related to the idea of being unbiased, this approach does not seem to be very accurate.
A. The pinon pine trees of the coastal ranges of California.
B. The researchers might have been trying to correct for outliers when finding the average age. If they find the mean of the average ages of each of the individual plots, their overall measurement can be more accurate by examining carefully each of the plots to make sure their data is not being skewed. For instance, perhaps they might take the median of one plot because of high outliers whereas the mean would work better in another.
A. Is the distribution map representative of the population of pinon pine trees in California? If the data in the map is not skewed or in error, are the areas identified by the computer accurate? In other words, did the researchers correctly estimate the location of the 10-hectare plots in the field? If so, there does not seem to be much room for sampling error. The plots were randomly selected and, assuming the random sampling method was truly random, the number of 10-hectare plots seems pretty large. Their correction for outliers seems pretty solid. I would say, all these things considered, the estimate has little error, but then I could be totally wrong.
B. Sampling error becomes smaller as n increases and thus precision increases. In this case, if n were to decrease, the sampling error would become larger and precision would decrease.
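To make the link between n and precision concrete, here is a minimal sketch using the standard error of the mean, s / sqrt(n). The standard deviation and the sample sizes are made-up values for illustration, not numbers from the pine study.

```r
# Hypothetical sample standard deviation and sample sizes (assumptions for illustration)
s <- 10
n <- c(10, 40, 160)

# Standard error of the mean for each sample size
se <- s / sqrt(n)
se
```

Each fourfold increase in n halves the standard error, so shrinking n works the other way and widens the uncertainty.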
A. Answer: The warfarin treatment is the explanatory variable and the survival of the mice is the response variable.
B. Answer: Both the naturopathic care and the psychotherapy are explanatory variables. The response of the patients’ anxiety is the response variable.
C. Answer: The questions immediately followed by pictures of food are the explanatory variables, and the response of the patients’ frontostriatal-amygdala-midbrain is the response variable.
D. Answer: The doses of endostatin were the explanatory variables; the responses of the tumors were the response variables.
A and C are observational whereas B and D are experimental. This is because in B and D, experimenters assigned the treatment to random individuals, whereas in A and C, researchers simply observed the responses of the subjects. A is hard to determine because it seems like it could be either, but I will stick to my guns and say it’s observational.
A. Answer: The explanatory variable is whether or not a patient has MS (patients were chosen at random out of two pools: those who have MS and those who do not). The response variable is whether or not the patients from either pool have CCSVI.
B. Answer: This is an observational study because the researchers are not trying to prove causation by providing/testing treatment and response as much as they are trying to draw a correlation between CCSVI and MS.
C. Answer: Hypothesis testing may have been used in the comparison of the healthy group to the MS group; I assume this is how they drew their association - by comparing the percentages of those with CCSVI against those without to try to see if CCSVI was correlated in a statistically significant way to MS.
Chapter2Problem19<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter02/chap02q19FireflySpermatophoreMass.csv"))
hist(Chapter2Problem19$spermatophoreMass, xlab = "Spermatophore Mass", ylab = "Frequency", main = "Spermatophore Mass")
B. I chose the histogram because I only needed to display one variable type and thought the histogram would provide a clear picture of the distribution.
C. The distribution looks fairly normal with a right skew. There is a clear outlier on the right tail.
D. The largest term is an outlier. It is also the maximum value.
Chapter2Problem22<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter02/chap02q22CriminalConvictions.csv"))
hist(Chapter2Problem22$numberConvictions, xlab = "Number of Convictions", ylab = "Frequency", main = "Conviction")
A. Frequency table
B. One (convictions)
C. Twenty-one
D. 265/395 = 53/79, or 67%
E. The histogram, because we are only dealing with one variable (number of convictions).
F. Right-skewed and unimodal, with mode = 0. About 67% of the values are 0 and the third quartile is 1, so the IQR = 3rd quartile - 1st quartile = 1 - 0 = 1. Yes, there are outliers: anything more than 1.5 x IQR above the third quartile is an outlier, so any value over 1 + 1.5 = 2.5 is an outlier.
G. They do not represent a random sample because they were chosen from schools located near the research institute. They do not represent a good random sample of the entirety of Britain, but rather this small region in north Britain. The researchers are a bit lazy.
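The outlier cutoff in F can be checked with a few lines of arithmetic. This is only a sketch of the usual 1.5 x IQR rule, using the quartiles stated in the answer (first quartile 0, third quartile 1) rather than recomputing them from the data.

```r
# Quartiles as stated in answer F (not recomputed from the data file)
q1 <- 0
q3 <- 1

# Interquartile range and the upper fence of the 1.5 x IQR rule
iqr <- q3 - q1
upper_fence <- q3 + 1.5 * iqr
upper_fence
```

Under that rule, a count of convictions is flagged as an outlier only when it lies above the upper fence.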
A. This is a line graph. B. The steepness of the line indicates the speed of change. C. Because there is a gradual rise in the overall data, the graph shows a positive correlation between the addition of species to the endangered species list and time. In 1993, there was a burst of addition to the list.
Chapter2Problem26<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter02/chap02q26NeotropicalTreePhotosynthesis.csv"))
plot(Chapter2Problem26$previousFruits, Chapter2Problem26$photosyntheticCapacity, xlab = "Previous Fruits", ylab = "Photosynthetic Capacity")
A. I used a scatter plot.
B. The explanatory variable is the number of previously produced fruits, because it is the variable this study hypothesizes affects the dependent variable - the photosynthetic capacity.
C. There is a negative correlation between the previous reproductive effort and the photosynthetic capacity.
Chapter2Problem28<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter02/chap02q28SneakerCannibalism.csv"))
counts<- table(Chapter2Problem28$typeOfMales, Chapter2Problem28$cannibalism)
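# Bars are grouped by the columns of counts (cannibalism); the legend shows the rows (type of male)
barplot(counts, main = "Rate of Cannibalism in Male Telmatherina sarasinorum in Presence of Sneakers vs. Non-Sneakers", xlab = "Cannibalism", ylab = "Frequency", col = c("green", "yellow", "blue"), legend.text = rownames(counts), beside=TRUE)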
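# Read the text columns as factors so that plot() can draw the two categorical variables
Chapter2Problem28<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter02/chap02q28SneakerCannibalism.csv"), stringsAsFactors = TRUE)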
plot(Chapter2Problem28, main = "Does the Instance of Sneakers Affect Cannibalism in Father Fish?", xlab = "Types of Males", ylab = "Cannibalism")
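# Read the text columns as factors so that plot() can draw the two categorical variables
Chapter2Problem32<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter02/chap02q32ToxoplasmaAccidents.csv"), stringsAsFactors = TRUE)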
plot(Chapter2Problem32, xlab = "Driver Type", ylab = "Infection Status")
Chapter2Problem35<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter02/chap02q35FoodReductionLifespan.csv"))
Chapter2Problem35$combined <- paste0(Chapter2Problem35$sex, ", ", Chapter2Problem35$foodTreatment)
stripchart(lifespan ~ sex*foodTreatment, data = Chapter2Problem35)
stripchart(lifespan ~ sex*foodTreatment, data = Chapter2Problem35, vertical = TRUE)
Chapter3Problem14<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter03/chap03q14VasopressinVoles.csv"))
# One box per treatment group, split by the treatment column
boxplot(percent ~ treatment, data = Chapter3Problem14)
stripchart(percent ~ treatment, data = Chapter3Problem14, vertical = TRUE)
B.
# Mean percent for each treatment group
tapply(Chapter3Problem14$percent, Chapter3Problem14$treatment, mean)
So the control males have a higher mean percentage.
C.
# Sample standard deviation of percent for each treatment group
tapply(Chapter3Problem14$percent, Chapter3Problem14$treatment, sd)
The experimental group has a higher sample standard deviation.
Chapter3Problem19<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter03/chap03q19SparrowReproductiveSuccess.csv"))
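# Assumes rows 1 to 84 and rows 85 to 165 hold the two sexes.
# Note that 1:84 selects rows 1 to 84, whereas 1-84 evaluates to -83 and only drops row 83.
var(Chapter3Problem19$lifetimeRS[1:84])
var(Chapter3Problem19$lifetimeRS[85:165])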
It appears the females have a higher variance in their reproductive success. Hm.
0
Chapter3Problem21<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter03/chap03q21YeastMutantGrowth.csv"))
mean(Chapter3Problem21$mutantGrowthRate)
## [1] 0.9709091
median(Chapter3Problem21$mutantGrowthRate)
## [1] 1.01
var(Chapter3Problem21$mutantGrowthRate)
## [1] 0.004889091
sd(Chapter3Problem21$mutantGrowthRate)
## [1] 0.06992203
Chapter3Problem28<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter03/chap03q28SeaUrchinBindin.csv"))
stripchart(percentAAfertilization ~ populationOfFemale, data = Chapter3Problem28, vertical = TRUE)
AA sperm definitely had a higher percentage of fertilization than BB sperm.
I guess if we’re comparing the spread of the frequency distributions, wouldn’t we just compare the standard deviations? The higher standard deviation should indicate the group with the most spread, because standard deviation measures the spread/deviation from the mean.
# Standard deviation of percent fertilization for each group
tapply(Chapter3Problem28$percentAAfertilization, Chapter3Problem28$populationOfFemale, sd)
So it looks like the spread of the data for BB sperm is wider than AA.
Chapter4Problem7<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter04/chap04q07FireflyFlash.csv"))
S<- c(Chapter4Problem7$flash)
ybar<- mean(S)
mean(S)
## [1] 95.94286
ybar
## [1] 95.94286
96.6
## [1] 96.6
N<- length(S)
df<- N-1
alpha<- 0.05
tcrit<- qt(1-alpha/2, df)
tcrit
## [1] 2.032245
2.262157
## [1] 2.262157
SE<- sd(S)/sqrt(N)
CI<- c(ybar - tcrit*SE, ybar + tcrit*SE)
t.test(S)
##
## One Sample t-test
##
## data: S
## t = 51.626, df = 34, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 92.16611 99.71960
## sample estimates:
## mean of x
## 95.94286
FALSE TRUE TRUE TRUE
Chapter4Problem18<- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter04/chap04q18Corpseflowers.csv"))
mean(Chapter4Problem18$numberOfBeetles)
## [1] 70.1
sd(Chapter4Problem18$numberOfBeetles)
## [1] 48.50074
S<- c(Chapter4Problem18$numberOfBeetles)
ybar<- mean(S)
N<- length(S)
df<- N-1
alpha<- 0.05
tcrit<- qt(1-alpha/2, df)
# Compute the standard error from the beetle data before printing it
SE<- sd(S)/sqrt(N)
SE
CI<- c(ybar - tcrit*SE, ybar + tcrit*SE)
CI
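As a sanity check on the by-hand confidence interval, here is a small sketch showing that the same steps reproduce the interval reported by t.test(). The data vector x is made up for illustration and is not the beetle or firefly data.

```r
# Hypothetical data (an assumption for illustration only)
x <- c(12, 15, 9, 20, 14, 11, 17, 13)

n <- length(x)
ybar <- mean(x)
se <- sd(x) / sqrt(n)              # standard error of the mean
tcrit <- qt(1 - 0.05 / 2, df = n - 1)

# 95% confidence interval by hand and from t.test()
ci_manual <- c(ybar - tcrit * se, ybar + tcrit * se)
ci_ttest <- as.numeric(t.test(x)$conf.int)

all.equal(ci_manual, ci_ttest)
```

If the two intervals agree, the manual calculation is using the right standard error and critical value.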
A. The coaches get the longest hugs on average, while the competitors get the shortest hugs on average. I determined this by the observation that the mean hug time for coaches is 3.77 seconds, whereas the mean hug time for competitors is 1.81 seconds.
These values measure the standard error, which means “the difference between an estimate and the target parameter” (Whitlock and Schluter 101). Basically, this means there is much more room for error (uncertainty) associated with the coaches than with the competitors.
I’m assuming n is relatively comparable between them, when in actuality it is about half in the competitors compared to the supporters and coaches.
To calculate an approximate 95% confidence interval, I can use the rule of thumb: subtract and add 2 times the standard error to the sample mean. For the competitors this gives 1.416584498 < μ < 2.203415502.
Well, two seconds is within the 95% confidence interval, so yes, but I suspect the hug duration will be closer to the sample mean: 1.81.
The supporters and coaches could plausibly have 3-second hugs, since their approximate 95% confidence intervals (the mean plus or minus 2 x SE) both include 3 seconds: 3.16 +/- 2(.3186983486) and 3.77 +/- 2(.4512838828).
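The 2 x SE rule of thumb is easy to script. This is a rough sketch using the means and standard errors quoted above; approx_ci is a helper name I made up for illustration.

```r
# Approximate 95% confidence interval: estimate plus or minus 2 standard errors
# (approx_ci is a hypothetical helper, not a built-in function)
approx_ci <- function(estimate, se) {
  c(lower = estimate - 2 * se, upper = estimate + 2 * se)
}

supporters <- approx_ci(3.16, 0.3186983486)
coaches <- approx_ci(3.77, 0.4512838828)

supporters
coaches
```

A 3-second hug is plausible for a group whenever 3 falls between that group's lower and upper bounds.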
The probability of choosing a male out of this population of adults is 48% or p=.48. I know this because the problem states that 52% of the population are women or p=.52.
The probability that a randomly chosen adult is a male who says Brussels sprouts are somewhat delicious or especially delicious can be calculated by: Pr[male]Pr[somewhat delicious | male] + Pr[male]Pr[especially delicious | male]. More simply: (.48)(.08) + (.48)(.01) = .0384+.0048 = .0432 or 4.3%. What’s wrong with Brussels sprouts?!?!
Pr[somewhat delicious or especially delicious | male], where the straight, vertical line stands for ‘given that’.
Pr[female and (somewhat delicious or especially delicious)] = (.52)(.06) + (.52)(.01) = .0364 = 3.6%.
See Paint File.
Pr[somewhat delicious or especially delicious] = Pr[male]Pr[somewhat or especially delicious | male] + Pr[female]Pr[somewhat or especially delicious | female].
Basically, I just need to add the two previous probabilities: 0.0432 + 0.0364 = .0796 or 7.96%
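The same total can be laid out as a short calculation. This is a sketch that takes the numbers used in the answers above at face value: .48 and .52 for the sexes, .08 and .01 for males, .06 and .01 for females.

```r
# Probabilities as used in the answers above
p_male <- 0.48
p_female <- 0.52
p_like_given_male <- 0.08 + 0.01     # somewhat or especially delicious, males
p_like_given_female <- 0.06 + 0.01   # somewhat or especially delicious, females

# Joint probabilities for each sex, then the overall total
p_male_and_like <- p_male * p_like_given_male
p_female_and_like <- p_female * p_like_given_female
p_like <- p_male_and_like + p_female_and_like
p_like
```

The two joint probabilities can be added because a person cannot be counted in both groups at once.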
Bayes’ Theorem: Pr[Canadian man smoked given that he has been diagnosed with lung cancer] = Pr[lung cancer given that he smoked]Pr[smoked]/Pr[lung cancer].
Probability of Cancer = (.52)(.172) + (.48)(.013) = .09568
(.172)(.52)/(.09568) = .9347826087 = 93%
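Here is the Bayes' theorem calculation as a small sketch. I am assuming, from the way the numbers are used above, that .52 is Pr[smoked], .172 is Pr[lung cancer given smoked], and .013 is Pr[lung cancer given never smoked].

```r
# Inputs as they appear in the calculation above (their roles are my assumption)
p_smoked <- 0.52
p_cancer_given_smoked <- 0.172
p_cancer_given_not <- 0.013

# Overall probability of lung cancer (law of total probability)
p_cancer <- p_smoked * p_cancer_given_smoked +
  (1 - p_smoked) * p_cancer_given_not

# Bayes' theorem: Pr[smoked given lung cancer]
p_smoked_given_cancer <- p_cancer_given_smoked * p_smoked / p_cancer
p_smoked_given_cancer
```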
The probability of catching a bug on any given try is .2 or 20%. The probability of not catching a bug on any given try is 0.8 or 80%. If this is the bird’s fourth try, I’m going to assume it missed the three other tries. Therefore: (0.8) x (0.8) x (0.8) x (0.2) = .1024 or 10.24%.
The probability of four failures in a row (not catching a bug on any of four tries) is: (0.8) x (0.8) x (0.8) x (0.8) = .4096 = 40.96%
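Both products can be checked in R. This sketch assumes independent tries with a 0.2 chance of success each, and uses dgeom(), which gives the probability of a given number of failures before the first success.

```r
p_catch <- 0.2

# First catch on the fourth try: three misses, then a success
by_hand <- (1 - p_catch)^3 * p_catch
with_dgeom <- dgeom(3, prob = p_catch)

# Four misses in a row
four_misses <- (1 - p_catch)^4

c(by_hand = by_hand, with_dgeom = with_dgeom, four_misses = four_misses)
```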
Yes they are mutually exclusive, because you cannot randomly choose a colony of bacteria that is both between 4-6 and 8-12mm.
Because the problem uses the word “or”, we must add the probabilities of .48 and .14 = .62 or 62%.
Just add the probabilities of the diameters being equal or greater than 10. The 10-12 range is 0.14 and the 12-14 range is 0.02. Add the two together and you get: 0.14 + 0.02 = 0.16 or 16%!
In order to get the interval between 8-10, I would subtract the probability of 10-12 from the larger probability of 8-12 to isolate 8-10. This would equal: 0.48 - 0.14 = 0.34 or 34%.
Just add the last two probabilities together: 0.16 + 0.34 = 0.50 or 50%
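The interval arithmetic above can be written out as a quick sketch, using only the three probabilities quoted in these answers (8-12 mm, 10-12 mm, and 12-14 mm). The variable names are mine.

```r
# Probabilities quoted in the answers above
p_8_to_12 <- 0.48
p_10_to_12 <- 0.14
p_12_to_14 <- 0.02

# Mutually exclusive ranges can be added; a sub-range can be subtracted out
p_10_or_more <- p_10_to_12 + p_12_to_14
p_8_to_10 <- p_8_to_12 - p_10_to_12
p_8_or_more <- p_10_or_more + p_8_to_10

c(p_10_or_more, p_8_to_10, p_8_or_more)
```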
The probability of C is 0.13, and since each allele is an independent probability, the probability of an individual having CC should be 0.13^2 = .0169
(.83)^2 + (.13)^2 + (.04)^2 = .7074 = 70.74%
The probability of AS = 2((0.83)(.04)) = .0664 or 6.64%
The probability of either AS or AC = the previous answer + 2((.83)(.13)) = .2822 or 28.22%!
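All of the genotype probabilities above come from multiplying allele frequencies, so they can be tabulated at once. This sketch assumes the two alleles are drawn independently, with the frequencies used above (A = .83, C = .13, S = .04).

```r
# Allele frequencies as used in the answers above
p <- c(A = 0.83, C = 0.13, S = 0.04)

# Table of ordered allele pairs: entry [i, j] = p[i] * p[j]
geno <- outer(p, p)

p_CC <- geno["C", "C"]
p_homozygous <- sum(diag(geno))
p_AS <- geno["A", "S"] + geno["S", "A"]   # either order gives an AS individual
p_AC <- geno["A", "C"] + geno["C", "A"]
p_AS_or_AC <- p_AS + p_AC

c(CC = p_CC, homozygous = p_homozygous, AS = p_AS, AS_or_AC = p_AS_or_AC)
```

The factor of 2 in the heterozygote answers corresponds to the two off-diagonal cells of the table.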
“A” is true because the biased data will show a significant skew from the null and will thus increase the chance of rejecting it. If, as in this case, the null is true, then this is a type 1 error: the rejection of a true null.
The answer is true, because the data will be more skewed with a higher sampling error; it would lead to a situation very much like the one in problem 1.
A. False. The p-value assumes the null.
False - this would be a type I error as the null is true
False - if we rejected the true null only then would we be making a Type I error. Not rejecting a true null is not an error.
A. Alternative Hypothesis - it would be interesting to prove, not to reject.
B. Alternative Hypothesis
C. Null - please reject!
D. Alternative Hypothesis
E. Null
A. Decreasing the significance level from 0.05 to 0.01 would decrease the potential for Type 1 error because it would be harder to reject the null if it were true.
B. It would increase the potential for Type II errors because the null hypothesis (if false) would be more difficult to reject.
C. It would decrease the power of the test, because power is the probability of rejecting a false null - in other words, of avoiding a Type II error. Since a stricter significance level makes a false null harder to reject (more Type II errors), the test has less power.
D. The larger the sample size, the more power. This is also true if the true discrepancy from the null hypothesis is large, or if the variability in the population is low.
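To see how the significance level and sample size move power, here is a sketch using power.t.test() from the stats package. The group size, effect size, and standard deviation are made-up values for a two-sample t-test, not numbers from the textbook problem.

```r
# Hypothetical two-sample t-test settings (assumptions for illustration)
power_05 <- power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.05)$power
power_01 <- power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.01)$power

# Same settings as power_05 but with a larger sample in each group
power_big_n <- power.t.test(n = 40, delta = 1, sd = 1, sig.level = 0.05)$power

c(alpha_0.05 = power_05, alpha_0.01 = power_01, larger_n = power_big_n)
```

With everything else held fixed, the stricter significance level gives lower power, and the larger sample gives higher power.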
A. A larger sample size should not change the Type I error rate, because that rate is set by the significance level the experimenter chooses. What a larger sample (as long as it is truly random) does give is a more accurate representation of the population parameters and thus the ‘truth.’
B. A larger sample size should decrease the number of Type II errors as well.
C. A larger sample size should raise the power/ability of a test to reject a false null.
D. I don’t think it would have much effect on the significance level. The significance level is a value the experimenter chooses based on their judgement of the situation. With a 0.05 significance level, a true null will be wrongly rejected in about 1 of every 20 tests. A 0.01 significance level will decrease Type I error, but will make it harder to reject false nulls, and will thus increase Type II error.
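A small simulation can illustrate how sample size relates to the Type I error rate. This is a sketch under made-up conditions: normal data with a true mean of 0, a one-sample t-test of mean = 0, and 2000 repeats per sample size. reject_rate is a hypothetical helper name.

```r
set.seed(1)

# Fraction of tests that reject a TRUE null (mean = 0) at the given alpha
# (reject_rate is a hypothetical helper for this illustration)
reject_rate <- function(n, reps = 2000, alpha = 0.05) {
  p_values <- replicate(reps, t.test(rnorm(n), mu = 0)$p.value)
  mean(p_values < alpha)
}

rate_small_n <- reject_rate(10)
rate_large_n <- reject_rate(100)

c(n_10 = rate_small_n, n_100 = rate_large_n)
```

Both rejection rates should land near the chosen 0.05, whichever sample size is used.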
A. .007548
B. .096348
C. .814548
D. 1.7621
A. This is hard to say. 0.04 is so close to the significance level of 0.05 that it could have only had a small effect. If the sample size was large enough, making this a more accurate measurement of significance, the effect might have been small. However, if the sample size was quite small, this data might not be representative enough of the population and could lead to error. Since 0.04 is so close to the significance level, this might not be enough to reject the null.
B. Again, the treatment might have had some effect; it depends on the sample size and significance level. If the significance level is 0.01, this doesn’t say much about rejecting the null. If the sample is not indicative of the population, then this might not reject the null.
C. False. The P-value is not the probability of committing an error, the P-value is the “probability of obtaining a result as extreme or more extreme than that observed.” In other words, the P-value tries to test the truth of something measured against the null. If the difference between the test value and the null is significant, the null may be rejected. If not, the null may be true … but never accepted.
False - for reasons stated in C. A type II error would simply mean not rejecting a false null. The p-value could only show you the significance level of the difference between the sample result and the null. If the significance is great, the false null may be rejected.
True. If the significance level had been 0.01, then 0.04 would have indeed been above and therefore the null would have been kept. However, this may have led to a type II error.
A. If the null hypothesis is true, a larger sample size should support the null rather than support a p-value that would reject it. Thus, I will say false.
B. True.
C. False.
A. True. If the null hypothesis is false, a larger sample size should support the rejection of the null.
B. False.
C. False.
The Crested Guan
The Agouti, Coatimundi, Howler Monkey, and the Ocellated Turkey.
The Collared Peccary, the Deppe’s Squirrel, the Spider Monkey, the Great Curassow, and the Tinamou.
The Crested Guan is certainly the most significant. The Coatimundi and the Ocellated Turkey are pretty significant as well. The Howler Monkey is in the running.