The population of interest in this study is all children ages 5-15, while the sample is the 160 children on whom the researchers conducted the experiment.
The results of this study should not be generalized to all children ages 5-15. The study experimented on children with certain characteristics, and a very simple method was used to perform the experiment. I would need to see further replication of this study, with different experimental methods, to determine whether it can be generalized to the population. I also do not believe the findings of this study are enough to establish a causal relationship. The researchers only took into account age, sex, and whether or not each child was an only child. This means there could have been confounding variables, which are variables related to both the explanatory and response variables that are unaccounted for in the experiment.
1.20 Stressed Out, Part I
This type of study is considered an observational study because the data are collected in a way that does not interfere with how the data arise. In other words, the researchers simply monitored what occurred with the students, whereas if this study were an experiment, the researchers would have had to assign the explanatory variable to the students.
Again, I would caution against using this study to conclude a causal relationship between stress and muscle cramps. There could be confounding variables, which were not accounted for, that are affecting the students’ muscles.
One confounding variable that I can think of is whether or not the students play a sport. Playing a sport on top of schoolwork could contribute to stress as well as to muscle cramps. Another confounding variable could be lack of sleep. A student who is not well rested could experience more stress and could also be more prone to muscle cramps.
1.30 Stressed Out, Part II
This type of study is considered an experiment because the researchers are assigning the primary explanatory variable (stress) to half of the subjects and leaving the other half with no or baseline stress.
I would still be cautious about using this study to conclude a causal relationship between increased stress and muscle cramps unless the subjects were randomly assigned to the two groups. Without random assignment, there could still be confounding variables that are unaccounted for in this study.
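The assignment step described above can be sketched in R. This is only a minimal illustration, assuming the assignment is done at random; the subject IDs, the sample size of 20, and the group labels are hypothetical and do not come from the exercise. Random assignment is the usual choice because it tends to balance unmeasured variables across the two groups.

```r
set.seed(1)

# Hypothetical subject IDs; the study's actual sample size is not given here
subjects <- 1:20

# Randomly assign half of the subjects to the stress condition and half to baseline
group <- sample(rep(c("stress", "baseline"), each = length(subjects) / 2))
assignment <- data.frame(subject = subjects, group = group)

# Check that the two groups are the same size
table(assignment$group)
```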
1.40 Office Productivity
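# Made-up values whose counts rise to a peak at 4 and then fall
H <- c(1,2,2,3,3,3,4,4,4,4,5,5,5,6,6,7)
# Tabulate how many times each value occurs
counts <- table(H)
counts
## H
## 1 2 3 4 5 6 7
## 1 2 3 4 3 2 1
# Plot the counts with a smoothed curve to show the rise-and-fall shape
scatter.smooth(counts)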
1.50 Mix and Match
The first distribution, in histogram (a), is unimodal and is not skewed right or left, so it is symmetric. The mode appears to be at 60. This histogram matches box plot (2), which also displays a symmetric distribution, with the median appearing near 60, the same as the mode.
The second distribution, in histogram (b), is multimodal, as there are multiple peaks in the distribution, and it is symmetric. This histogram matches box plot (3). The IQR is very large because of the multimodal distribution of the data, and the median falls near 50.
The third distribution, in histogram (c), is unimodal as well. However, this distribution is skewed right because the data tail off to the right. This histogram matches box plot (1), with the median lying at the same point as the mode and the data skewed to the right. You can tell the data are skewed right in the box plot because of the multiple outliers lying above the box plot’s upper whisker.
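The matching logic above can be tried out on simulated data. The sketch below is only an illustration: the three samples are generated here and are not the data behind histograms (a) to (c), and the distribution parameters are arbitrary assumptions. It draws a histogram and a box plot for each shape, then checks numerically for a wide IQR in the multimodal sample and for high outliers in the right-skewed sample.

```r
set.seed(7)

# Simulated data for illustration only; these are not the data behind histograms (a) to (c)
symmetric    <- rnorm(1000, mean = 60, sd = 3)
multimodal   <- c(rnorm(500, mean = 30, sd = 5), rnorm(500, mean = 70, sd = 5))
right_skewed <- rgamma(1000, shape = 2, rate = 1)
shapes <- list(symmetric = symmetric, multimodal = multimodal, right_skewed = right_skewed)

# Top row: histograms; bottom row: the corresponding box plots
op <- par(mfrow = c(2, 3))
for (name in names(shapes)) hist(shapes[[name]], main = name, xlab = "value")
for (name in names(shapes)) boxplot(shapes[[name]], horizontal = TRUE, main = name)
par(op)

# Numeric versions of the features read off the plots
sapply(shapes, IQR)                                        # spread of the middle 50%
sapply(shapes, function(x) length(boxplot.stats(x)$out))   # points beyond the whiskers
```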
1.60 A New Statistic
The shape of distribution (a) is symmetric because the mean and the median are the same. The mean and the median coincide when the data are symmetric because the distribution is balanced on both sides, which puts the mean in the middle, where the median is.
The shape of the second distribution would be skewed to the left because in order for (b) to be true, the mean must be smaller than (or to the left of) the median. The mean is more affected by outliers, and since the data is skewed to the left here, the mean is pulled more to the left than the median, which is simply the 50th percentile.
The shape of the third distribution would be skewed to the right because in order for (c) to be true, the mean must be larger than (or to the right of) the median. The mean is more affected by outliers, and since the data is skewed to the right here, the mean is pulled more to the right than the median, which is simply the 50th percentile.
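The relationship between skew and the ordering of the mean and median can be checked with a short simulation. This is a sketch on simulated samples, not on data from the exercise; the distributions and their parameters are assumptions chosen only to produce one symmetric, one right-skewed, and one left-skewed sample.

```r
set.seed(42)

# Simulated examples, not data from the exercise
symmetric    <- rnorm(10000, mean = 50, sd = 10)
right_skewed <- rexp(10000, rate = 1 / 10)
left_skewed  <- 100 - right_skewed   # mirror image of the right-skewed sample

# Report the mean and median side by side for one sample
compare <- function(x) c(mean = mean(x), median = median(x))

rbind(
  symmetric    = compare(symmetric),
  right_skewed = compare(right_skewed),
  left_skewed  = compare(left_skewed)
)
```

In the resulting table, the mean and median should be nearly equal for the symmetric sample, the mean should exceed the median for the right-skewed sample, and the mean should fall below the median for the left-skewed sample.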
1.70 Heart Transplants
The mosaic plot suggests that survival is not independent of whether or not the patient got the transplant. Some patients who did not receive the transplant still survived, but a larger share of the patients who received the transplant survived. In other words, the chance of surviving appears to depend on receiving the transplant.
The box plots display the effectiveness of the transplant by showing that 50% of the patients who received the transplant survived longer than almost all of those who didn’t (besides a few outliers).
The proportion of people who died in the treatment group was 45/69. The proportion of people who died in the control group was 30/34.
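The two proportions, and the difference between them, can be computed directly from the counts given above. This is a small sketch; the variable names are my own.

```r
# Counts taken from the answer above
died  <- c(treatment = 45, control = 30)
total <- c(treatment = 69, control = 34)

# Proportion who died in each group
prop_died <- died / total
round(prop_died, 3)
## treatment   control
##     0.652     0.882

# Difference in proportions (treatment minus control)
diff_obs <- unname(prop_died["treatment"] - prop_died["control"])
round(diff_obs, 2)
## [1] -0.23
```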
The claim being tested is whether or not the heart transplant increases lifespan for patients with heart disease.
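The blanks, in order: 28 (alive), 75 (dead), 69 (treatment group), 34 (control group), 0 (center of the simulated distribution), and (45/69) - (30/34) = -0.23 (observed difference in proportions).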
Since only a small fraction of the simulated differences were as low as the observed difference (-0.23), the simulation results show that we should reject the null hypothesis in favor of the alternative, that the transplant increases life expectancy.
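The card-shuffling simulation can be sketched in R as follows. This is an illustration under stated assumptions, not the simulation actually run for the exercise: the number of repetitions (1000), the seed, and all variable names are my own choices, and the exact fraction will vary from run to run.

```r
set.seed(123)

# Card counts and group sizes taken from the answers above
cards <- c(rep("alive", 28), rep("dead", 75))
n_treatment <- 69
n_control <- 34
observed_diff <- 45 / 69 - 30 / 34

# One shuffle: deal the cards into two groups and compute the
# difference in the proportion of dead cards (treatment minus control)
simulate_diff <- function() {
  shuffled <- sample(cards)
  treatment <- shuffled[1:n_treatment]
  control <- shuffled[(n_treatment + 1):(n_treatment + n_control)]
  mean(treatment == "dead") - mean(control == "dead")
}

n_sims <- 1000  # assumed number of repetitions
sim_diffs <- replicate(n_sims, simulate_diff())

# Fraction of simulated differences at least as low as the observed one;
# the small tolerance guards against floating-point rounding in ties
p_value <- mean(sim_diffs <= observed_diff + 1e-9)
p_value
```

The simulated differences should be centered near 0, and a small value of `p_value` is what supports rejecting the null hypothesis.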