A survey was conducted to study the smoking habits of UK residents. Below is a data matrix displaying a portion of the data collected in this survey. Note that “£” stands for British Pounds Sterling, “cig” stands for cigarettes, and “N/A” refers to a missing component of the data.
What does each row of the data matrix represent?
Each row represents one survey participant and that participant's smoking habits.
How many participants were included in the survey?
A total of 1,691 participants were included.
Indicate whether each variable in the study is numerical or categorical. If numerical, identify as continuous or discrete. If categorical, indicate if the variable is ordinal.
index: a row identifier rather than a measured variable (if treated as a number, it is discrete, not continuous)
sex: categorical nominal
age: numerical discrete
marital: categorical nominal
grossIncome: categorical ordinal
smoke: categorical nominal
amtWeekends: numerical discrete
amtWeekdays: numerical discrete
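The classification above can be written down as a small lookup table. This is only a sketch: it covers the seven survey variables (the row index is left out), and the income bracket labels are hypothetical placeholders, since the actual brackets are not listed here. The point it illustrates is that an ordinal variable has a meaningful order even though it is not a number.

```python
# Classification of the seven survey variables, following the answers above.
VARIABLE_TYPES = {
    "sex": ("categorical", "nominal"),
    "age": ("numerical", "discrete"),
    "marital": ("categorical", "nominal"),
    "grossIncome": ("categorical", "ordinal"),
    "smoke": ("categorical", "nominal"),
    "amtWeekends": ("numerical", "discrete"),
    "amtWeekdays": ("numerical", "discrete"),
}

# Hypothetical income brackets (assumption), listed from lowest to highest.
INCOME_LEVELS = ["low", "middle", "high"]


def income_rank(level):
    """Return the position of an income bracket in the ordered list."""
    return INCOME_LEVELS.index(level)


def variables_of_kind(kind):
    """Return the sorted names of all variables of the given kind."""
    return sorted(name for name, (k, _) in VARIABLE_TYPES.items() if k == kind)
```

Brackets can be compared by rank (for example, `income_rank("low") < income_rank("high")`), but subtracting ranks would not give a meaningful income difference, which is why grossIncome is ordinal rather than numerical.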
Exercise 1.5 introduces a study where researchers studying the relationship between honesty, age, and self-control conducted an experiment on 160 children between the ages of 5 and 15. The researchers asked each child to toss a fair coin in private and to record the outcome (white or black) on a paper sheet, and said they would only reward children who report white. Half the students were explicitly told not to cheat and the others were not given any explicit instructions. Differences were observed in the cheating rates in the instruction and no instruction groups, as well as some differences across children’s characteristics within each group.
Identify the population of interest and the sample in this study.
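Population of interest: all children between the ages of 5 and 15. Sample: the 160 children between the ages of 5 and 15 who took part in the study.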
Comment on whether or not the results of the study can be generalized to the population, and if the findings of the study can be used to establish causal relationships.
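If the 160 children were randomly sampled from the population, the results can be generalized to it; the description does not say how they were chosen, so generalization is uncertain. The study is described as an experiment, so if the children were randomly assigned to the instruction and no-instruction groups, the difference in cheating rates between those groups can be interpreted causally. Differences across children’s characteristics, such as age, are observational, because those characteristics were not assigned by the researchers.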
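Because each coin toss is private, cheating in this study can only be detected in aggregate: with a fair coin, about half of honest reports should be white. The sketch below turns that idea into arithmetic. It rests on a simplifying assumption that is not stated in the exercise: children who cheat always report white, and everyone else reports truthfully. Under that assumption the expected share of white reports is 0.5 + 0.5c, where c is the cheating rate, so c = 2p - 1 for an observed share p. The function name and the example counts are hypothetical.

```python
def estimated_cheating_rate(reported_white, n_children):
    """Estimate the cheating rate from the share of 'white' reports.

    Assumes a fair coin and that cheaters always report white (assumption).
    The estimate is floored at 0, since chance alone can push the observed
    share of white reports below one half.
    """
    if n_children <= 0:
        raise ValueError("n_children must be positive")
    share_white = reported_white / n_children
    return max(0.0, 2 * share_white - 1)


# Hypothetical counts for one group of 80 children (not data from the study).
rate = estimated_cheating_rate(reported_white=60, n_children=80)
```

Comparing this estimate between the instruction and no-instruction groups is one way the difference in cheating rates described in the exercise could be quantified.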
Below are excerpts from two articles published in the NY Times:
No. We cannot draw this conclusion. The 23,123 participants were drawn from a health plan rather than sampled at random from the general population, and the study is observational, not an experiment, so other factors could explain the dementia.
This statement is not justified. This is an observational study, not an experiment. We don’t know whether the children studied were selected because they had sleep disorders or were randomly chosen from all children. In addition, other factors, such as lack of attention, might cause the behavioral issues. Unless a properly designed experiment rules out those other explanations, we cannot conclude that sleep disorders cause the behavioral problems.
A researcher is interested in the effects of exercise on mental health and he proposes the following study: Use stratified random sampling to ensure representative proportions of 18-30, 31-40 and 41-55 year olds from the population. Next, randomly assign half the subjects from each age group to exercise twice a week, and instruct the rest not to exercise. Conduct a mental health exam at the beginning and at the end of the study, and compare the results.
Randomized experiment.
Treatment group: the subjects who exercise twice a week. Control group: the subjects who do not exercise at all.
Yes. Subjects are grouped (blocked) by age.
Based on the information provided, the experiment does not use blinding. One group knows it will exercise twice a week, and the other group is instructed not to exercise.
Yes. Because subjects are randomly assigned to the exercise and no-exercise groups, the study can be used to establish a causal relationship, and because the sample is a stratified random sample, the conclusions can be generalized to the population, provided the sample is large enough. One concern is that people who were already exercising regularly may be assigned to the no-exercise group.
As noted above, my main reservation about the study is the ethics of requiring people who exercise regularly to stop. In addition, the length of the study is not stated, and over a long study, changes in participants’ mood that are unrelated to exercise could affect the results.
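The random assignment within age groups described in the exercise can be sketched as follows. This is a minimal illustration, not the researcher's procedure: the input format, the function names, and the twelve example subjects are all hypothetical, and the stratified sampling step that would produce the subject list is not shown.

```python
import random
from collections import defaultdict


def age_group(age):
    """Map an age to one of the three age groups named in the exercise."""
    if 18 <= age <= 30:
        return "18-30"
    if 31 <= age <= 40:
        return "31-40"
    if 41 <= age <= 55:
        return "41-55"
    return None  # outside the study's age range


def blocked_assignment(subjects, seed=0):
    """Randomly assign half of each age group to exercise, the rest to no exercise.

    subjects: list of (subject_id, age) pairs (hypothetical format).
    Returns a dict mapping subject_id -> (age group, treatment).
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject_id, age in subjects:
        group = age_group(age)
        if group is not None:
            blocks[group].append(subject_id)

    assignment = {}
    for group, ids in blocks.items():
        rng.shuffle(ids)  # random order within the block
        half = len(ids) // 2
        for i, subject_id in enumerate(ids):
            treatment = "exercise" if i < half else "no exercise"
            assignment[subject_id] = (group, treatment)
    return assignment


# Hypothetical sample: 12 subjects, 4 per age group.
ages = [19, 22, 25, 30, 31, 35, 38, 40, 41, 45, 50, 55]
sample = list(enumerate(ages))
result = blocked_assignment(sample)
```

Shuffling within each age group separately is what makes this a blocked design: every age group contributes equally to the treatment and control groups, so age cannot be confounded with exercise.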