The dataset teengamb concerns a study of teenage gambling in Britain. Make a numerical and graphical summary of the data, commenting on any features that you find interesting. Limit the output you present to a quantity that a busy reader would find sufficient to get a basic understanding of the data.

First, we load the data:
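The loading code is not shown here. A minimal sketch, assuming the dataset is taken from the `faraway` R package and that the package is already installed:

```r
# Assumption: teengamb comes from the faraway package
# (install once with install.packages("faraway"))
library(faraway)

data(teengamb)   # attach the data frame to the workspace
dim(teengamb)    # number of rows and columns
head(teengamb)   # first few observations
```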

Next, we take a look at a summary of the data:

```
##       sex             status          income           verbal
##  Min.   :0.0000   Min.   :18.00   Min.   : 0.600   Min.   : 1.00
##  1st Qu.:0.0000   1st Qu.:28.00   1st Qu.: 2.000   1st Qu.: 6.00
##  Median :0.0000   Median :43.00   Median : 3.250   Median : 7.00
##  Mean   :0.4043   Mean   :45.23   Mean   : 4.642   Mean   : 6.66
##  3rd Qu.:1.0000   3rd Qu.:61.50   3rd Qu.: 6.210   3rd Qu.: 8.00
##  Max.   :1.0000   Max.   :75.00   Max.   :15.000   Max.   :10.00
##      gamble
##  Min.   :  0.0
##  1st Qu.:  1.1
##  Median :  6.0
##  Mean   : 19.3
##  3rd Qu.: 19.4
##  Max.   :156.0
```

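The summary shows that gamble is strongly right-skewed: its mean (19.3) is more than three times its median (6.0), and the maximum is 156.0. Let’s visualize the relationship between gamble and each of the other variables in the dataset. It looks like: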

- Income has a strong positive correlation with gamble
- Verbal has a weak negative correlation with gamble
- Status has a weak negative correlation with gamble
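- Sex has a moderate negative correlation with gamble; since sex is coded 0 for males and 1 for females, this means the females in the sample tend to gamble less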
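The plots behind these observations are not reproduced here. One way to recreate them and put numbers on the correlations, as a sketch that assumes `teengamb` comes from the `faraway` package (the name `gamble_cor` is introduced only for illustration):

```r
# Assumption: teengamb comes from the faraway package
library(faraway)
data(teengamb)

predictors <- c("sex", "status", "income", "verbal")

# One scatterplot of gamble against each predictor, in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
for (v in predictors) {
  plot(teengamb[[v]], teengamb$gamble, xlab = v, ylab = "gamble")
}
par(op)  # restore the previous plotting layout

# Pearson correlation of each predictor with gamble
gamble_cor <- sapply(teengamb[predictors], cor, y = teengamb$gamble)
round(gamble_cor, 2)
```

Because sex takes only the values 0 and 1, a pair of boxplots (`boxplot(gamble ~ sex, data = teengamb)`) may be easier to read than a scatterplot for that variable.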

The dataset prostate is from a study on 97 men with prostate cancer who were due to receive a radical prostatectomy. Make a numerical and graphical summary of the data as in the first question.

First, we load the data:
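As before, the loading code is not shown. A sketch under the same assumption (data from the `faraway` package), with a quick check against the 97 men mentioned in the exercise:

```r
# Assumption: prostate comes from the faraway package
library(faraway)
data(prostate)

# The exercise describes 97 men, so the data frame should have 97 rows
nrow(prostate) == 97
names(prostate)   # variable names
```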

Next, we take a look at a summary of the data:

```
##      lcavol           lweight           age             lbph
##  Min.   :-1.3471   Min.   :2.375   Min.   :41.00   Min.   :-1.3863
##  1st Qu.: 0.5128   1st Qu.:3.376   1st Qu.:60.00   1st Qu.:-1.3863
##  Median : 1.4469   Median :3.623   Median :65.00   Median : 0.3001
##  Mean   : 1.3500   Mean   :3.653   Mean   :63.87   Mean   : 0.1004
##  3rd Qu.: 2.1270   3rd Qu.:3.878   3rd Qu.:68.00   3rd Qu.: 1.5581
##  Max.   : 3.8210   Max.   :6.108   Max.   :79.00   Max.   : 2.3263
##       svi              lcp             gleason          pgg45
##  Min.   :0.0000   Min.   :-1.3863   Min.   :6.000   Min.   :  0.00
##  1st Qu.:0.0000   1st Qu.:-1.3863   1st Qu.:6.000   1st Qu.:  0.00
##  Median :0.0000   Median :-0.7985   Median :7.000   Median : 15.00
##  Mean   :0.2165   Mean   :-0.1794   Mean   :6.753   Mean   : 24.38
##  3rd Qu.:0.0000   3rd Qu.: 1.1786   3rd Qu.:7.000   3rd Qu.: 40.00
##  Max.   :1.0000   Max.   : 2.9042   Max.   :9.000   Max.   :100.00
##       lpsa
##  Min.   :-0.4308
##  1st Qu.: 1.7317
##  Median : 2.5915
##  Mean   : 2.4784
##  3rd Qu.: 3.0564
##  Max.   : 5.5829
```

Let’s take a look at the distributions of each of these variables:

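- lcavol, lweight, age, and lpsa have roughly normal distributions
- lbph, svi, lcp, gleason, and pgg45 are clearly non-normal: in the summary each has its first quartile equal to its minimum, so at least a quarter of the observations sit at the lowest value and the rest spread out to the right. For lcp and pgg45 the mean is also well above the median, which indicates right skew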
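The histograms themselves are not reproduced here. A sketch that draws them and adds a numerical check of asymmetry, assuming `prostate` comes from the `faraway` package. The `skewness` helper is defined here for illustration (a simple moment-based estimate), not taken from the original analysis:

```r
# Assumption: prostate comes from the faraway package
library(faraway)
data(prostate)

# Histogram of every variable in a 3 x 3 grid
op <- par(mfrow = c(3, 3))
for (v in names(prostate)) {
  hist(prostate[[v]], main = v, xlab = v)
}
par(op)  # restore the previous plotting layout

# Illustrative helper: moment-based sample skewness.
# Positive values indicate a longer right tail, negative a longer left tail.
skewness <- function(x) {
  z <- x - mean(x)
  mean(z^3) / mean(z^2)^1.5
}

skew <- sapply(prostate, skewness)
round(skew, 2)
```

Values near zero are consistent with a symmetric shape, although a single number cannot reveal features such as a spike at the minimum, so the histograms remain the primary evidence.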

The dataset sat comes from a study entitled “Getting What You Pay For: The Debate Over Equity in Public School Expenditures.” Make a numerical and graphical summary of the data as in the first question.

Next, we take a look at a summary of the data:
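The code for this step is not shown. A minimal sketch, assuming `sat` comes from the `faraway` package and that the output was produced by a plain `summary()` call:

```r
# Assumption: sat comes from the faraway package
library(faraway)
data(sat)

summary(sat)   # five-number summary plus mean for every column

# Sanity check: does total equal verbal + math in every row?
all(sat$total == sat$verbal + sat$math)
```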

```
##      expend          ratio           salary          takers
##  Min.   :3.656   Min.   :13.80   Min.   :25.99   Min.   : 4.00
##  1st Qu.:4.882   1st Qu.:15.22   1st Qu.:30.98   1st Qu.: 9.00
##  Median :5.768   Median :16.60   Median :33.29   Median :28.00
##  Mean   :5.905   Mean   :16.86   Mean   :34.83   Mean   :35.24
##  3rd Qu.:6.434   3rd Qu.:17.57   3rd Qu.:38.55   3rd Qu.:63.00
##  Max.   :9.774   Max.   :24.30   Max.   :50.05   Max.   :81.00
##      verbal           math           total
##  Min.   :401.0   Min.   :443.0   Min.   : 844.0
##  1st Qu.:427.2   1st Qu.:474.8   1st Qu.: 897.2
##  Median :448.0   Median :497.5   Median : 945.5
##  Mean   :457.1   Mean   :508.8   Mean   : 965.9
##  3rd Qu.:490.2   3rd Qu.:539.5   3rd Qu.:1032.0
##  Max.   :516.0   Max.   :592.0   Max.   :1107.0
```

Let’s visualize how the total SAT score relates to expend, salary, and takers.

- The relationship between expend and total score is weak and negative. The spread of total does not appear to be constant across values of expend.
- The relationship between salary and total score is weak and negative.
- The relationship between takers and total score is strong and negative, and does not appear to be linear.
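The scatterplots are not reproduced here. A sketch that recreates them with a smooth trend line, which makes curvature easier to see, and reports the correlations. It assumes `sat` comes from the `faraway` package; `total_cor` is a name introduced for illustration:

```r
# Assumption: sat comes from the faraway package
library(faraway)
data(sat)

predictors <- c("expend", "salary", "takers")

# Scatterplot of total against each predictor, side by side
op <- par(mfrow = c(1, 3))
for (v in predictors) {
  plot(sat[[v]], sat$total, xlab = v, ylab = "total")
  # lowess() adds a locally weighted smooth to show any non-linearity
  lines(lowess(sat[[v]], sat$total), col = "red")
}
par(op)  # restore the previous plotting layout

# Pearson correlation of each predictor with total
total_cor <- sapply(sat[predictors], cor, y = sat$total)
round(total_cor, 2)
```

Pearson correlation measures only linear association, so for a curved relationship such as the one with takers the smooth line is the more informative summary.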

The dataset divusa contains data on divorces in the United States from 1920 to 1996. Make a numerical and graphical summary of the data as in the first question.

Next, we take a look at a summary of the data:

```
##       year         divorce        unemployed         femlab
##  Min.   :1920   Min.   : 6.10   Min.   : 1.200   Min.   :22.70
##  1st Qu.:1939   1st Qu.: 8.70   1st Qu.: 4.200   1st Qu.:27.47
##  Median :1958   Median :10.60   Median : 5.600   Median :37.10
##  Mean   :1958   Mean   :13.27   Mean   : 7.173   Mean   :38.58
##  3rd Qu.:1977   3rd Qu.:20.30   3rd Qu.: 7.500   3rd Qu.:47.80
##  Max.   :1996   Max.   :22.80   Max.   :24.900   Max.   :59.30
##     marriage          birth           military
##  Min.   : 49.70   Min.   : 65.30   Min.   : 1.940
##  1st Qu.: 61.90   1st Qu.: 68.90   1st Qu.: 3.469
##  Median : 74.10   Median : 85.90   Median : 9.102
##  Mean   : 72.97   Mean   : 88.89   Mean   :12.365
##  3rd Qu.: 80.00   3rd Qu.:107.30   3rd Qu.:14.266
##  Max.   :118.10   Max.   :122.90   Max.   :86.641
```

Let’s look at a time series plot of each variable. Reading approximate years off the plots, we notice that:

- Divorce spikes around 1945 and 1980
- Unemployed spikes around 1935
- Femlab spikes around 1945
- Marriage spikes around 1950
- Birth spikes around 1955
- Military spikes around 1945
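The time series plots are not reproduced here. A sketch that draws them and looks up the year in which each series reaches its maximum, assuming `divusa` comes from the `faraway` package (`series` and `peak_year` are names introduced for illustration):

```r
# Assumption: divusa comes from the faraway package
library(faraway)
data(divusa)

# Every column except year is plotted as its own series
series <- setdiff(names(divusa), "year")

op <- par(mfrow = c(3, 2))
for (v in series) {
  plot(divusa$year, divusa[[v]], type = "l", xlab = "year", ylab = v)
}
par(op)  # restore the previous plotting layout

# Year in which each series reaches its overall maximum
peak_year <- sapply(divusa[series], function(x) divusa$year[which.max(x)])
peak_year
```

The overall maximum is only one notion of a spike: it finds a single year per series, so a second peak is missed, and for a series with a strong trend the maximum may simply fall at the end of the period. The plots are still needed to judge local spikes.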