DATA 621 Week #1 Textbook Exercises
LMR 1.1
The dataset teengamb concerns a study of teenage gambling in Britain. Make a numerical and graphical summary of the data, commenting on any features that you find interesting. Limit the output you present to a quantity that a busy reader would find sufficient to get a basic understanding of the data.
Estimates show teens in Britain gamble between 0 and 156 pounds annually. Males gamble more than females: on average, nearly 8 times as much.
sex | Min. | 25% | 50% | Mean | 75% | Max. | Count |
---|---|---|---|---|---|---|---|
Female | 0 | 0.100 | 1.70 | 3.865789 | 6.000 | 19.6 | 19 |
Male | 0 | 2.775 | 14.25 | 29.775000 | 42.175 | 156.0 | 28 |
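The "nearly 8 times" figure can be checked from the table alone. The following is a minimal Python sketch that hardcodes the group means and counts shown above rather than loading the dataset; the variable names are illustrative only.

```python
# Group means and counts copied from the summary table above
# (hardcoded for illustration; the dataset itself is not loaded).
female_mean, female_n = 3.865789, 19
male_mean, male_n = 29.775, 28

# Ratio of male to female mean gambling expenditure.
ratio = male_mean / female_mean

# Pooled mean across all 47 teens, weighting each group mean by its count.
pooled_mean = (female_mean * female_n + male_mean * male_n) / (female_n + male_n)

print(f"male/female mean ratio: {ratio:.2f}")
print(f"pooled mean: {pooled_mean:.2f}")
```

The ratio comes out at about 7.7, which is consistent with "nearly 8 times."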
Teens earned between 31 and 780 pounds annually. Those who earned more tended to gamble more.
LMR 1.3
The dataset prostate is from a study on 97 men with prostate cancer who were due to receive a radical prostatectomy. Make a numerical and graphical summary of the data as in the first question.
These men are mostly older. Their ages range from 41 to 79 years, with a median of 65.
Min. | 25% | 50% | Mean | 75% | Max. |
---|---|---|---|---|---|
41 | 60 | 65 | 63.86598 | 68 | 79 |
There is a positive relationship between the prostate cancer volume and the capsular penetration. The summary below is for capsular penetration on the log scale.
Min. | 25% | 50% | Mean | 75% | Max. |
---|---|---|---|---|---|
-1.38629 | -1.38629 | -0.79851 | -0.1793637 | 1.17865 | 2.90417 |
The volume of the cancer is also positively correlated with the PSA. The summary below is for PSA on the log scale.
Min. | 25% | 50% | Mean | 75% | Max. |
---|---|---|---|---|---|
-0.43078 | 1.73166 | 2.59152 | 2.478387 | 3.05636 | 5.58293 |
LMR 1.4
The dataset sat comes from a study entitled “Getting What You Pay For: The Debate Over Equity in Public School Expenditures.” Make a numerical and graphical summary of the data as in the first question.
SAT scores range from 844 to 1107 with an average around 966. These data have considerable variability.
Min. | 25% | 50% | Mean | 75% | Max. | Count |
---|---|---|---|---|---|---|
844 | 897.25 | 945.5 | 965.92 | 1032 | 1107 | 50 |
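Two rough spread measures, the range and the interquartile range, can be computed directly from the table. This is a small Python sketch using only the values shown above; it is not the method used to produce the summary, and the variable names are illustrative.

```python
# Five-number summary copied from the SAT table above (hardcoded for illustration).
sat_min, sat_q1, sat_median, sat_q3, sat_max = 844, 897.25, 945.5, 1032, 1107

# Full range of the scores.
score_range = sat_max - sat_min

# Interquartile range: the spread of the middle half of the scores.
iqr = sat_q3 - sat_q1

print(f"range: {score_range}")
print(f"IQR: {iqr}")
```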
At first blush it appears that states that spend less per pupil do better on the SAT than those that spend more. Strange!?!
But when you examine the popularity of the SAT, a different pattern emerges. In certain states the ACT is the college entrance test of choice. Thus the SAT scores in these states are only for the students who want to apply to a school back East.
LMR 1.5
The dataset divusa contains data on divorces in the United States from 1920 to 1996. Make a numerical and graphical summary of the data as in the first question.
The US divorce rate has generally increased over time. There was a spike in the 1940s, and the rate has been trending down since the 1980s.
Decade | Min. | 25% | 50% | Mean | 75% | Max. | Count |
---|---|---|---|---|---|---|---|
1920 | 6.6 | 7.200 | 7.35 | 7.44000 | 7.800 | 8.0 | 10 |
1930 | 6.1 | 7.200 | 7.65 | 7.60000 | 8.375 | 8.7 | 10 |
1940 | 8.8 | 10.225 | 11.10 | 11.90000 | 13.200 | 17.9 | 10 |
1950 | 8.9 | 9.300 | 9.45 | 9.58000 | 9.900 | 10.3 | 10 |
1960 | 9.2 | 9.600 | 10.30 | 10.63000 | 11.125 | 13.4 | 10 |
1970 | 14.9 | 17.300 | 19.80 | 19.24000 | 21.100 | 22.8 | 10 |
1980 | 20.4 | 20.900 | 21.40 | 21.45000 | 21.700 | 22.6 | 10 |
1990 | 19.5 | 20.150 | 20.50 | 20.47143 | 20.900 | 21.2 | 7 |
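The spike and the later decline can be read off the decade means. The sketch below, in Python, uses only the Mean column from the table above (hardcoded, not loaded from the dataset) to compute the change from each decade to the next. Note that the 1990s row covers only 7 years.

```python
# Decade means of the divorce rate, copied from the Mean column above.
decade_means = {
    1920: 7.44, 1930: 7.60, 1940: 11.90, 1950: 9.58,
    1960: 10.63, 1970: 19.24, 1980: 21.45, 1990: 20.47143,
}

decades = sorted(decade_means)

# Change in the mean from each decade to the next, keyed by the later decade.
changes = {
    later: decade_means[later] - decade_means[earlier]
    for earlier, later in zip(decades, decades[1:])
}

# Decade with the highest mean divorce rate.
peak_decade = max(decade_means, key=decade_means.get)

for decade, delta in changes.items():
    print(f"{decade}s: {delta:+.2f} vs. previous decade")
print(f"peak decade: {peak_decade}s")
```

The only negative changes are for the 1950s and the 1990s, matching the drop after the 1940s spike and the decline after the 1980s peak.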
A few other variables show a similar spike. For example, military personnel per 1000…
…the marriage rate…
…and most strikingly, the female labor force participation rate.