Intro to R

Mienie Roberts

1/19/2018

Probability and Statistics

Unit 1

R and RStudio

What is R?


Why use R?

Check out the following opinions: Opinion 1. Opinion 2. Opinion 3. Opinion 4.


Capabilities of R:

R’s capabilities are simply amazing. Check out:


What is RStudio?

RStudio is a graphical user interface for R which includes a set of integrated tools designed to help you be more productive with R. It includes:


Downloading and installing R and RStudio (Windows)


Exercises:

  1. Install R (either from USB or from the Internet)
  2. Install RStudio (either from USB or from the Internet)

Note: Once R and RStudio are installed, it is not necessary to start R separately, because RStudio will start it.


Next: “RStudio Interface: Panes, Tabs, and Windows”


1.3 Measures of Central Tendency

1.3.1 Introduction

A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data. As such, measures of central tendency are sometimes called measures of central location. They are also classed as summary statistics. The mean (often called the average) is most likely the measure of central tendency that you are most familiar with, but there are others, such as the median and the mode.

The mean, median and mode are all valid measures of central tendency, but under different conditions, some measures of central tendency become more appropriate to use than others. In the following sections, we will look at the mean, mode and median, and learn how to calculate them and under what conditions they are most appropriate to be used.

1.3.2 Mean:

The arithmetic mean (or average) is the most popular and well-known measure of central tendency. It can be used with both discrete and continuous data, although it is most often used with continuous data. The mean is equal to the sum of all the values in the data set divided by the number of values in the data set.

The mean is essentially a model of your data set: a single value that stands in for all the observations. It is not necessarily the most common value (that is the mode), and you will notice that the mean is often not one of the actual values that you have observed in your data set. However, one of its important properties is that it minimises error in the prediction of any one value in your data set. That is, it is the value that produces the lowest total squared error when used in place of every value in the data set.

An important property of the mean is that it includes every value in your data set as part of the calculation. In addition, the mean is the only measure of central tendency where the sum of the deviations of each value from the mean is always zero.
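The properties above can be checked directly in R. The following is a minimal sketch on a small vector `x` invented purely for illustration (it is not course data); the helper `sse` is likewise a name introduced here.

```r
# Illustrative values only (not from the course data)
x <- c(2, 4, 4, 5, 10)

# The mean is the sum of the values divided by how many there are
m <- sum(x) / length(x)
m == mean(x)        # TRUE: same result as the built-in mean()

# The deviations from the mean always sum to zero
sum(x - m)          # 0

# Sum of squared errors when one number 'guess' stands in for every value
sse <- function(guess) sum((x - guess)^2)
sse(m)              # 36
sse(median(x))      # 41: any value other than the mean gives a larger total
```

Here the mean is 5, which happens to be one of the observed values, but that is a coincidence of this vector.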

1.3.3 Median:

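The median is the middle score for a set of data that has been arranged in order of magnitude. When the number of values is even, the median is the average of the two middle values. The median is less affected by outliers and skewed data than the mean is.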
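To see how an outlier affects the mean but not the median, consider the sketch below. The vectors `x` and `y` are invented for illustration; they differ only in their largest value.

```r
# Illustrative values only: y replaces the largest value of x with an outlier
x <- c(2, 4, 4, 5, 10)
y <- c(2, 4, 4, 5, 100)

median(x)   # 4
median(y)   # 4: the outlier does not move the median

mean(x)     # 5
mean(y)     # 23: the outlier pulls the mean far from most of the data

# With an even number of values, the two middle values are averaged
median(c(2, 4, 5, 10))   # 4.5
```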

Week 2

Import files into R

Consider the following data file on the body temperatures of ten US males.

https://www.amazon.com/clouddrive/share/RJLhFeGmPR8j4b4dQDUzjuxbnhDLhIKqabQvJCKDnER

Watch the following video on how to import a .csv file into R. https://www.amazon.com/clouddrive/share/bcK8ZluX3i45PvJaQ5Omwc0ii53iVzRJx1jcrYIAbp9
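For reference, importing a .csv file from the console usually comes down to one call to `read.csv`. The sketch below is self-contained: because the course file is not reproduced here, it first writes a tiny placeholder file with made-up values. The column names `id` and `temperature` are assumptions, not the layout of the real file.

```r
# Placeholder file with made-up values, written to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,temperature",
             "1,98.6",
             "2,97.9",
             "3,98.2"),
           csv_path)

# Read the file into a data frame; header = TRUE uses the first row as names
temps <- read.csv(csv_path, header = TRUE)

str(temps)                # shows the column names and types
mean(temps$temperature)   # a column is accessed with $
```

With a real file, replace `csv_path` with the path to the downloaded file, or use `file.choose()` to pick it interactively.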

This week we cover the following topics:


Histograms and numerical summaries

A histogram is a visual representation of the distribution of a dataset. The shape of a histogram allows you to easily see where most of the data are situated. In particular, you can see where the middle of the distribution is located, how closely the data lie around the middle, and where possible outliers are to be found. As shown in the figures below, a histogram consists of an x-axis, a y-axis and bars of different heights. The x-axis is divided into intervals (called “bins”), and on each bin a vertical bar is constructed whose height represents the number of data values within that bin. Note that histograms (unlike bar charts) don’t have gaps between the bars (if it looks like there’s a gap, that’s because that particular bin has no data in it).


Example: Suppose you are interested in the distribution of ages for employees working in a certain office. The following data is available: 36, 25, 38, 46, 55, 68, 72, 55, 36, 38, 67, 45, 22, 48, 91, 46, 52, 61, 58, 55. We use R to construct a histogram to represent the distribution of the data.

age<-c(36, 25, 38, 46, 55, 68, 72, 55, 36, 38, 67, 45, 22, 48, 91, 46, 52, 61, 58, 55)
hist(age)

The output appears under the ‘Plots’ tab, and looks like this:
[Histogram of age]

The ‘hist’ command has many options that let the user change the display. For example, the ‘breaks’ option controls the number of bins, the ‘main’ option sets the title of the histogram, and the ‘xlab’ and ‘ylab’ options set the x- and y-axis labels.
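To see exactly what a histogram counts, you can pass explicit bin edges to ‘breaks’ and ask for the counts instead of a drawing. The sketch below assumes bin edges at 20, 30, ..., 100; that choice is made here for illustration and is not part of the original example.

```r
age <- c(36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
         67, 45, 22, 48, 91, 46, 52, 61, 58, 55)

# A vector of break points fixes the bin edges exactly
edges <- seq(20, 100, by = 10)

# plot = FALSE returns the histogram object without drawing it
h <- hist(age, breaks = edges, plot = FALSE)

h$breaks   # the bin edges: 20 30 40 ... 100
h$counts   # values per bin: 2 4 4 5 3 1 0 1

# Every observation falls in exactly one bin
sum(h$counts) == length(age)   # TRUE
```

The bin from 80 to 90 is empty, which is why a drawn histogram of these data shows a gap before the last bar.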


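Example: The following command requests about 7 bins (R treats a single number given to ‘breaks’ as a suggestion and chooses round bin edges; for this data the result has 7 nonempty bins), with title “Age of Employees” and x label “Employee ages”: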

hist(age,breaks=7,main="Age of Employees",xlab="Employee ages")

The output appears under the ‘Plots’ tab, and looks like this:
[Histogram of age]

XY plots

The command ‘xyplot’ can be used to plot one variable against another. The command uses the ‘lattice’ package, so before using it you must load the package.


Example: Load a new package called ‘lattice’.

library(lattice)

If you get an error message, it probably means you haven’t installed ‘lattice’. In this case, go back to “R_RStudioWindows” and follow the instructions found in the section ‘Packages window’.

To demonstrate ‘xyplot’ we will be using data from the ‘mosaicData’ package, so you must load this package as well.
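As a sketch of what such a plot might look like, the code below plots one variable of the HELPrct data set against another. It assumes ‘mosaicData’ is already installed (installation is covered under Week 4). The choice of the variables cesd and mcs, and the axis labels, are made here for illustration only.

```r
library(lattice)      # provides xyplot()
library(mosaicData)   # provides the HELPrct data set

# Formula interface: y ~ x, with both variables looked up in 'data'
p <- xyplot(cesd ~ mcs, data = HELPrct,
            xlab = "Mental component score (mcs)",
            ylab = "Depression score (cesd)")

# In a script the plot object must be printed to be drawn;
# at the console, typing p has the same effect
print(p)
```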


Week 4

Install the package ‘mosaic’:

install.packages('mosaic')

Install the package ‘mosaicData’:

install.packages('mosaicData')

Load the package ‘mosaic’:

require(mosaic)

Load the package ‘mosaicData’:

require(mosaicData)

We set the default number of digits to 2:

options(digits =2)
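Note that the ‘digits’ option changes only how numbers are displayed, not how they are stored. A small sketch, using 32.85 (the mean of cesd reported by the ‘summary’ output later in this unit) as the example value:

```r
x <- 32.85

# options() returns the previous settings, so they can be restored later
old <- options(digits = 2)
print(x)        # displays 33
x == 32.85      # TRUE: the stored value is unchanged

# Restore the previous display setting
options(old)

# format() applies the same rounding to a single value without
# changing the global option
format(x, digits = 2)   # "33"
```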

Consider the HELPrct (Health Evaluation and Linkage to Primary Care) data set that can be found under the “mosaicData” package. The HELP study was a clinical trial for adult inpatients recruited from a detoxification unit. Patients with no primary care physician were randomized to receive a multidisciplinary assessment and a brief motivational intervention or usual care, with the goal of linking them to primary medical care.

This is a data frame with 453 observations on the following variables.

age: subject age at baseline (in years)
anysub: use of any substance post-detox (a factor with levels no, yes)
cesd: Center for Epidemiologic Studies Depression measure at baseline (high scores indicate more depressive symptoms)
d1: lifetime number of hospitalizations for medical problems (measured at baseline)
daysanysub: time (in days) to first use of any substance post-detox
dayslink: time (in days) to linkage to primary care
drugrisk: Risk Assessment Battery drug risk scale at baseline
e2b: number of times in past 6 months entered a detox program (measured at baseline)
female: 0 for male, 1 for female
sex: a factor with levels male, female
g1b: experienced serious thoughts of suicide in last 30 days (measured at baseline; a factor with levels no, yes)
homeless: housing status (a factor with levels housed, homeless)
i1: average number of drinks (standard units) consumed per day, in the past 30 days (measured at baseline)
i2: maximum number of drinks (standard units) consumed per day, in the past 30 days (measured at baseline)
id: subject identifier
indtot: Inventory of Drug Use Consequences (InDUC) total score (measured at baseline)
linkstatus: post-detox linkage to primary care (0 = no, 1 = yes)
link: post-detox linkage to primary care (no, yes)
mcs: SF-36 Mental Component Score (measured at baseline, lower scores indicate worse status)
pcs: SF-36 Physical Component Score (measured at baseline, lower scores indicate worse status)
pss_fr: perceived social support by friends (measured at baseline, higher scores indicate more support)
racegrp: race/ethnicity (levels black, hispanic, other, white)
satreat: any BSAS substance abuse treatment at baseline (no, yes)
sexrisk: Risk Assessment Battery sex risk score (measured at baseline)
substance: primary substance of abuse (alcohol, cocaine, heroin)
treat: randomized to HELP clinic (no, yes)


We find the mean of the cesd variable (the Center for Epidemiologic Studies Depression measure at baseline, where high scores indicate more depressive symptoms):

mean(HELPrct$cesd)

The result is 33 (rounded to two significant digits).

The standard deviation is:


sd(HELPrct$cesd)

The result is 13.

The variance is:


var(HELPrct$cesd)

The result is 157.
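The variance is the square of the standard deviation. The printed values above do not satisfy this exactly (13 squared is 169, not 157) only because the standard deviation was rounded for display; the square root of 157 is about 12.5. The sketch below checks the relationship on the employee ages from the Week 2 histogram example, and also computes the sample variance by hand. The name `manual_var` is introduced here for illustration.

```r
age <- c(36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
         67, 45, 22, 48, 91, 46, 52, 61, 58, 55)

s <- sd(age)    # sample standard deviation
v <- var(age)   # sample variance

# The variance equals the squared standard deviation (up to rounding error)
all.equal(s^2, v)          # TRUE

# Sample variance by hand: squared deviations from the mean,
# summed and divided by n - 1
n <- length(age)
manual_var <- sum((age - mean(age))^2) / (n - 1)
all.equal(manual_var, v)   # TRUE
```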

We can also calculate the median:


median(HELPrct$cesd)

The median is 34.

We can use the “summary” command to print out the minimum, maximum, mean, median, and quartiles:

library(mosaic)
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Loading required package: ggformula
## Loading required package: ggplot2
## Loading required package: ggstance
## 
## Attaching package: 'ggstance'
## The following objects are masked from 'package:ggplot2':
## 
##     geom_errorbarh, GeomErrorbarh
## 
## New to ggformula?  Try the tutorials: 
##  learnr::run_tutorial("introduction", package = "ggformula")
##  learnr::run_tutorial("refining", package = "ggformula")
## Loading required package: mosaicData
## Loading required package: Matrix
## 
## The 'mosaic' package masks several functions from core packages in order to add 
## additional features.  The original behavior of these functions should not be affected by this.
## 
## Note: If you use the Matrix package, be sure to load it BEFORE loading mosaic.
## 
## Attaching package: 'mosaic'
## The following object is masked from 'package:Matrix':
## 
##     mean
## The following object is masked from 'package:ggplot2':
## 
##     stat
## The following objects are masked from 'package:dplyr':
## 
##     count, do, tally
## The following objects are masked from 'package:stats':
## 
##     binom.test, cor, cor.test, cov, fivenum, IQR, median,
##     prop.test, quantile, sd, t.test, var
## The following objects are masked from 'package:base':
## 
##     max, mean, min, prod, range, sample, sum
summary(HELPrct$cesd)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00   25.00   34.00   32.85   41.00   60.00

Rounded to two significant digits: Min. 1, 1st Qu. 25, Median 34, Mean 33, 3rd Qu. 41, Max. 60.
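Each number reported by ‘summary’ can also be computed on its own. A brief sketch, using the employee ages from the Week 2 histogram example so that it runs without any extra packages:

```r
age <- c(36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
         67, 45, 22, 48, 91, 46, 52, 61, 58, 55)

summary(age)    # min, quartiles, median, mean and max in one call

min(age)        # 22
max(age)        # 91
median(age)     # 50
mean(age)       # 50.7

# quantile() returns the quartiles (or any other quantiles) directly
quantile(age, probs = c(0.25, 0.75))   # 38 and 58.75
```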

Graphical summaries

hist(HELPrct$cesd)

How many females are in the dataset?

tally(~sex, data=HELPrct)
## sex
## female   male 
##    107    346
tally(~sex, format="percent", data=HELPrct)
## sex
##   female     male 
## 23.62031 76.37969
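So 107 of the 453 subjects, or about 24%, are female. For comparison, the same counts and percentages can be obtained in base R with ‘table’ and ‘prop.table’. To keep the sketch below self-contained, it rebuilds a `sex` vector from the counts printed above rather than loading HELPrct.

```r
# Rebuild a sex variable from the counts reported by tally()
sex <- factor(rep(c("female", "male"), times = c(107, 346)))

counts <- table(sex)          # number of observations per level
counts

# prop.table() converts counts to proportions; multiply by 100 for percent
percent <- 100 * prop.table(counts)
percent                       # female about 23.6, male about 76.4
```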

Let’s restrict our attention to the female subjects. We use the filter() function in the dplyr package to generate a new data frame containing only females.
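Row filtering keeps the rows for which a condition is TRUE. As a small illustration of the idea in base R, the sketch below uses a made-up data frame `toy` (hypothetical values, not the HELPrct data). It also shows ‘nrow’ and ‘head’, which are convenient for checking a filtered data frame without printing every row.

```r
# Hypothetical toy data, for illustration only
toy <- data.frame(
  id   = 1:5,
  sex  = factor(c("female", "male", "male", "female", "male")),
  cesd = c(15, 49, 30, 52, 6)
)

# subset() keeps the rows where the condition holds
females_only <- subset(toy, sex == "female")

# Logical indexing selects the same rows
same_rows <- toy[toy$sex == "female", ]

nrow(females_only)   # 2
head(females_only)   # prints at most the first six rows
```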

female<-filter(HELPrct, sex=='female')
female
##     age anysubstatus anysub cesd d1 daysanysub dayslink drugrisk e2b
## 1    39            1    yes   15  2        189      343        0   1
## 2    47            1    yes    6  1         31      365        0  NA
## 3    49           NA   <NA>   52 14         NA      334        0   1
## 4    50            1    yes   50 14         31      365       18   7
## 5    34           NA   <NA>   46  0         NA      365        8  NA
## 6    58            0     no   49  3        192      365        0  NA
## 7    28            1    yes   35  6         27       41        0   2
## 8    27            0     no   52  0        198       49       10   4
## 9    48            1    yes   19  4         67      365        0  NA
## 10   34            1    yes    5  2         23       14        0  NA
## 11   35            1    yes   46  3         17      365        0  NA
## 12   41            0     no   29  3        181       19        0   2
## 13   29            0     no   33  3        180      365        1   4
## 14   40            0     no   57  5        181       34        0  NA
## 15   26           NA   <NA>   30  4         NA       NA        0  NA
## 16   41            1    yes   43  0          2       NA       10  NA
## 17   32            1    yes   37  2        175      365        0  NA
## 18   33           NA   <NA>   47  9         NA       38        0   3
## 19   40           NA   <NA>   36  1         NA      217        0   1
## 20   35           NA   <NA>   30  2         NA       16        0  NA
## 21   30            0     no   39  0        201       18        0   1
## 22   32           NA   <NA>   53 15         NA       41        0  NA
## 23   42            0     no   26 10        183      358        0   2
## 24   30           NA   <NA>   51  9         NA       NA        9   1
## 25   35           NA   <NA>   58  5         NA       17        0   2
## 26   30            1    yes   15  1         15      365        0  NA
## 27   50            0     no   35  6        178       49        0  NA
## 28   38           NA   <NA>   26  4         NA       28        0  NA
## 29   24            1    yes   45  0         68      365        0   1
## 30   49           NA   <NA>   28 13         NA      193        0   1
## 31   28            1    yes   48  4         12      413        0  NA
## 32   37           NA   <NA>   35  1         NA      106        0  NA
## 33   31            1    yes   15  1         31      365        0  NA
## 34   30            1    yes   29  2         12      365        0  NA
## 35   57            1    yes   39  4         28      380        0   1
## 36   29           NA   <NA>   46  6         NA      365        5   3
## 37   33           NA   <NA>   44  4         NA      427        0  NA
## 38   28            1    yes   38  3        117      218        0  NA
## 39   31           NA   <NA>   38 10         NA      405       20   1
## 40   36           NA   <NA>   53  3         NA       45        0   3
## 41   38           NA   <NA>   57  4         NA      370        0  NA
## 42   39           NA   <NA>   43  1         NA      365       13   1
## 43   33            1    yes   19 40          3      146        0   1
## 44   38            1    yes   34  1          0      348       14   1
## 45   43           NA   <NA>   36  1         NA       18        0  NA
## 46   33            1    yes   24  6          2      365        1  NA
## 47   29           NA   <NA>   54  0         NA      407        4  NA
## 48   47            0     no   41  1        190       78        0  NA
## 49   31           NA   <NA>   18  3         NA       NA        8   1
## 50   40           NA   <NA>   60  7         NA      406        0  NA
## 51   32            0     no   34  3        184      365        0  NA
## 52   38            0     no   38  3        247      365        0   1
## 53   32            1    yes   37  1         82      348        0  NA
## 54   35           NA   <NA>   24  1         NA      365        0  NA
## 55   35            0     no   34  1        172      136        0  NA
## 56   45            1    yes   40  5          7      365        0   1
## 57   47           NA   <NA>   39  2         NA      365        1   3
## 58   39            1    yes   42  4        215      428        0  NA
## 59   44           NA   <NA>   13  0         NA      365        0  NA
## 60   55            1    yes   30  2         11       40        0   2
## 61   34           NA   <NA>   19  1         NA      329        0  NA
## 62   34           NA   <NA>   36  1         NA      326        0  NA
## 63   31           NA   <NA>   22  0         NA      359        0  NA
## 64   27            1    yes   33  0          4      365        0   2
## 65   33            1    yes   51  1          5      365        1   6
## 66   30           NA   <NA>   30  6         NA       83        0  NA
## 67   34           NA   <NA>   38  2         NA      365        8  NA
## 68   37            0     no   37  2        179       41        0  NA
## 69   26           NA   <NA>   56  2         NA      365        0  NA
## 70   45            1    yes   41  0         33      365        4   1
## 71   23            1    yes   48  1          2      365        0   2
## 72   35            1    yes   45  3          1       26        0   1
## 73   42           NA   <NA>   52  3         NA       63        0  NA
## 74   32            1    yes   45  4          1      427        0   2
## 75   36            1    yes   39  1        136      324        0   2
## 76   22            1    yes   51  2          2      374        9   1
## 77   37           NA   <NA>   58  8         NA      365        0   2
## 78   33            1    yes   19  0         64       33        0  NA
## 79   43            0     no    7  0        187       41        0  NA
## 80   47            1    yes   54  1          4      349        8  NA
## 81   48            1    yes   53  4          0      302        0   3
## 82   35            1    yes   54  1          5      365       13  NA
## 83   38           NA   <NA>   42  4         NA      337        0  NA
## 84   35            0     no   36  0        178      361        0  NA
## 85   47           NA   <NA>   52  8         NA      365        0   2
## 86   33           NA   <NA>   40  4         NA       21        0  NA
## 87   26            1    yes   33  0         35      296        0   1
## 88   34            1    yes   29  0         12      356        0  NA
## 89   47            0     no   32  3        158       74        0  NA
## 90   39            0     no   52  2        268      449        0  NA
## 91   37            1    yes   41 10          1      393        0  NA
## 92   31            1    yes   42  1         15      365        0  NA
## 93   42            1    yes   42  5         33       98        0  NA
## 94   33           NA   <NA>   15  0         NA      365        0  NA
## 95   38           NA   <NA>   33  1         NA      286        1  NA
## 96   43           NA   <NA>   23  4         NA      365        0   2
## 97   27           NA   <NA>    3  0         NA      365        0  NA
## 98   21           NA   <NA>   39  0         NA       NA        6  NA
## 99   29           NA   <NA>   47  2         NA      365        0  NA
## 100  45           NA   <NA>   41  2         NA      365        0   1
## 101  24           NA   <NA>   34  2         NA      365       14   8
## 102  35           NA   <NA>   23  2         NA       28        0  NA
## 103  33           NA   <NA>   21  8         NA       NA        0  NA
## 104  36           NA   <NA>   29  4         NA      365        0  NA
## 105  33           NA   <NA>   40  2         NA      365        0   1
## 106  31           NA   <NA>   47  1         NA      365        0  NA
## 107  39           NA   <NA>   28  0         NA      365        1  NA
##     female    sex g1b homeless i1  i2  id indtot linkstatus link       mcs
## 1        1 female  no   housed  5   5   4     28          0   no 43.967880
## 2        1 female  no   housed  4   4   6     29          0   no 55.508991
## 3        1 female yes   housed 13  20   7     38          0   no 21.793024
## 4        1 female  no homeless 71 129   9     44          0   no 22.029678
## 5        1 female  no   housed  0   0  11     34          0   no 43.974678
## 6        1 female  no   housed 13  13  12     11          0   no 13.382205
## 7        1 female yes homeless  0   0  17     26          1  yes 29.799828
## 8        1 female yes   housed  9  24  20     37          1  yes 15.458271
## 9        1 female  no   housed  6   8  27     40          0   no 21.668474
## 10       1 female  no   housed  6  13  50      8          1  yes 59.454094
## 11       1 female  no   housed 13  20  57     32          0   no 24.000315
## 12       1 female yes   housed  3   6  65     20          1  yes 33.374172
## 13       1 female yes homeless  0   0  66     29          0   no 27.575460
## 14       1 female yes homeless 59 164  71     43          1  yes 17.705963
## 15       1 female yes   housed 12  18  74     37         NA <NA> 26.697262
## 16       1 female  no   housed  0   0  75     40         NA <NA> 15.447794
## 17       1 female yes   housed  2   2  90     40          0   no 28.858498
## 18       1 female yes   housed 64  64 100     44          1  yes 19.595461
## 19       1 female yes homeless 33  38 104     42          1  yes 27.993336
## 20       1 female  no   housed  9  15 108     33          1  yes 23.299021
## 21       1 female  no   housed  0   0 118     19          1  yes 24.747171
## 22       1 female yes homeless 34  34 120     33          1  yes 27.136280
## 23       1 female  no homeless 39  95 121     31          0   no 41.321629
## 24       1 female yes   housed  0   0 125     43         NA <NA> 19.156574
## 25       1 female yes   housed  1   1 127     37          1  yes 18.465418
## 26       1 female  no   housed 26  26 131     25          0   no 37.438934
## 27       1 female  no   housed 13  13 134     28          1  yes 20.310446
## 28       1 female  no   housed  0   0 138     39          1  yes 22.787546
## 29       1 female  no homeless  7   7 141     39          0   no 28.505577
## 30       1 female  no homeless 15  15 143     36          1  yes 40.156929
## 31       1 female  no   housed  2   2 150     33          0   no 22.017500
## 32       1 female  no homeless  1   3 153     25          1  yes 33.366123
## 33       1 female  no   housed  0   0 166     38          0   no 50.030434
## 34       1 female  no homeless 29  29 179     31          0   no 52.197483
## 35       1 female  no   housed 12  12 181     36          0   no 36.651463
## 36       1 female  no   housed  0   0 187     39          0   no 20.119982
## 37       1 female yes homeless 59  59 188     38          0   no 25.257971
## 38       1 female yes   housed 16  20 191     35          1  yes 18.324743
## 39       1 female yes homeless 26  33 193     44          0   no 22.442661
## 40       1 female yes homeless 50  50 194     41          1  yes 27.171751
## 41       1 female yes   housed 13  32 200     39          0   no 20.356680
## 42       1 female yes   housed 20  20 203     37          0   no 22.815102
## 43       1 female  no homeless 19  26 204     32          1  yes 40.032974
## 44       1 female  no homeless  0   0 213     32          0   no 43.353584
## 45       1 female yes   housed 58  58 219     40          1  yes 36.100307
## 46       1 female yes   housed 32  38 220     23          0   no 33.259956
## 47       1 female  no   housed  0   0 221     33          0   no 12.323594
## 48       1 female yes homeless  0   0 224     21          1  yes 37.953403
## 49       1 female yes   housed  0   0 226     32         NA <NA> 27.641029
## 50       1 female yes homeless 38  38 228     43          0   no 16.786348
## 51       1 female  no   housed 13  13 229     31          0   no 54.768539
## 52       1 female yes   housed 16  26 236     34          0   no 14.919310
## 53       1 female  no   housed  1   6 237     28          0   no 40.462433
## 54       1 female  no   housed  0   0 241     34          0   no 44.351089
## 55       1 female  no homeless  4   4 242     36          1  yes 16.469986
## 56       1 female yes   housed 10  14 247     34          0   no 26.311474
## 57       1 female  no   housed 42  48 249     33          0   no 27.471394
## 58       1 female yes   housed  0   0 254     20          0   no 13.968738
## 59       1 female  no   housed 13  13 255     26          0   no 41.867615
## 60       1 female  no   housed  1   2 264     41          1  yes 23.547628
## 61       1 female  no   housed  4   4 269     27          0   no 34.048084
## 62       1 female  no   housed  1   1 272     38          0   no 32.384045
## 63       1 female  no   housed 10  20 275     23          0   no 47.442879
## 64       1 female  no homeless  8   8 284     38          0   no 31.781149
## 65       1 female yes   housed  8  13 304     28          0   no 20.911337
## 66       1 female yes homeless 27  33 306     25          1  yes 44.446507
## 67       1 female  no   housed  0   0 308     33          0   no 21.543468
## 68       1 female  no homeless  1   1 313     33          1  yes 27.601431
## 69       1 female  no   housed  1   1 316     36          0   no 14.415197
## 70       1 female  no   housed  2   2 320     22          0   no 34.747746
## 71       1 female yes homeless 29  58 324     27          0   no 16.718819
## 72       1 female  no   housed  0   0 325     32          1  yes 20.220354
## 73       1 female yes homeless  0   0 327     32          1  yes 28.447634
## 74       1 female yes homeless 67  67 333     40          0   no 17.926985
## 75       1 female yes homeless 53  53 339     36          0   no 22.237560
## 76       1 female  no   housed  0   0 342     40          0   no  7.035307
## 77       1 female yes homeless 67  80 351     41          0   no 16.922634
## 78       1 female  no homeless  6   6 354     22          1  yes 24.923189
## 79       1 female  no homeless 26  26 364     15          1  yes 60.542084
## 80       1 female yes   housed 13  13 367     35          0   no 13.852996
## 81       1 female yes homeless  0   0 370     32          0   no 19.808329
## 82       1 female  no   housed  0   0 372     44          0   no  9.406377
## 83       1 female yes   housed  3   3 374     40          0   no 27.495565
## 84       1 female  no homeless 58  58 379     13          0   no 44.767254
## 85       1 female  no   housed  6   6 391     34          0   no  7.226597
## 86       1 female  no   housed 13  26 402     38          1  yes 19.819555
## 87       1 female  no   housed  0   0 403     41          0   no 29.213017
## 88       1 female  no   housed  0   0 421     37          0   no 31.077631
## 89       1 female  no   housed 21  21 431     13          1  yes 51.922516
## 90       1 female  no   housed  0   0 442     37          0   no 24.930353
## 91       1 female  no homeless 24  51 445     44          0   no 25.710777
## 92       1 female yes homeless  6  13 461     34          0   no 16.863588
## 93       1 female yes   housed 26  41 465     35          1  yes 30.701563
## 94       1 female  no   housed  0   0 466      6          0   no 41.624706
## 95       1 female yes   housed  3  16 470     33          0   no 22.337873
## 96       1 female  no homeless 19  19  55     31          0   no 27.717655
## 97       1 female  no   housed  1   1 139     21          0   no 57.834595
## 98       1 female yes   housed  0   0 155     35         NA <NA> 47.773228
## 99       1 female  no homeless 11  14 157     35          0   no  9.732559
## 100      1 female  no homeless 19  26 162     25          0   no 55.479382
## 101      1 female  no   housed 13  26 171     38          0   no 28.590870
## 102      1 female  no   housed  4   4 303     20          1  yes 45.425110
## 103      1 female  no homeless 26  26 345     28         NA <NA> 18.594315
## 104      1 female  no   housed  7   8 349     27          0   no 25.676130
## 105      1 female yes homeless 26  32 427     37          0   no 34.152245
## 106      1 female yes homeless 56  61 451     41          0   no 17.050970
## 107      1 female  no homeless  1  24 460     28          0   no 33.434536
##          pcs pss_fr  racegrp satreat sexrisk substance treat avg_drinks
## 1   61.93168     11    white     yes       4    heroin    no          5
## 2   46.47521      5    black      no       5   cocaine   yes          4
## 3   24.51504      1    black     yes       8   cocaine    no         13
## 4   38.27088      5    white      no       8   alcohol    no         71
## 5   60.07915      0    white      no       2    heroin   yes          0
## 6   41.93376     13    black     yes       0   alcohol    no         13
## 7   44.77651      7 hispanic     yes       3    heroin   yes          0
## 8   37.45214     13    white      no       3    heroin   yes          9
## 9   36.01007      6    black      no       7   cocaine    no          6
## 10  52.69898     12    black      no       4   cocaine   yes          6
## 11  46.75086      1    black      no       7   cocaine   yes         13
## 12  55.23372     13    white     yes       4   alcohol   yes          3
## 13  35.12470      4 hispanic     yes       4    heroin    no          0
## 14  36.04016      1    black      no       4   alcohol   yes         59
## 15  54.38272      6    white      no       9   cocaine    no         12
## 16  55.32189     14    white      no       3    heroin    no          0
## 17  43.94296     11    black      no       3   cocaine    no          2
## 18  40.48884      1    other      no       7   alcohol   yes         64
## 19  44.53589      7    white     yes       3   alcohol    no         33
## 20  51.81045     12    black     yes       5   alcohol   yes          9
## 21  54.10854     14 hispanic      no       4   cocaine   yes          0
## 22  54.79462      7    black      no       5   alcohol   yes         34
## 23  36.68874      4    black      no      10   cocaine    no         39
## 24  34.33698     10    white      no       6    heroin    no          0
## 25  39.33260     13    black     yes       6   cocaine   yes          1
## 26  49.29042     11    black     yes       3   cocaine   yes         26
## 27  33.48925      2    white      no       0   alcohol    no         13
## 28  28.74085      9    other      no       7   cocaine   yes          0
## 29  37.79718      7    black     yes       7   cocaine   yes          7
## 30  40.96234      7 hispanic     yes       9   alcohol    no         15
## 31  40.24271      1    white      no       5   cocaine   yes          2
## 32  45.16520      8    black      no       9   cocaine   yes          1
## 33  57.38777      9    black     yes       2   cocaine    no          0
## 34  55.73845     13    black     yes       7   cocaine   yes         29
## 35  30.50811      6    white     yes       0   alcohol    no         12
## 36  32.96189      3    white      no       4    heroin   yes          0
## 37  42.12069      7 hispanic      no       5   alcohol    no         59
## 38  43.24062     14    black      no      11   cocaine    no         16
## 39  35.90619      8    white      no      11   alcohol    no         26
## 40  37.75567      3    white      no       9   alcohol   yes         50
## 41  35.97361      0    black      no      14   cocaine    no         13
## 42  35.22702     10    white      no       4    heroin   yes         20
## 43  38.10227      2    black     yes       7   cocaine    no         19
## 44  21.91906      9    black      no       8    heroin    no          0
## 45  37.03778     11    black     yes       2   alcohol   yes         58
## 46  41.66993      8    other      no       3    heroin    no         32
## 47  48.21926     11    white      no       6    heroin    no          0
## 48  57.64361     11    black      no       0   cocaine    no          0
## 49  48.37090     12    white      no       4    heroin    no          0
## 50  38.51597      3    white     yes      11   cocaine   yes         38
## 51  23.48208     12    black     yes       0   cocaine    no         13
## 52  57.83691      3    white      no       5   alcohol   yes         16
## 53  56.90286      3    black     yes       4   cocaine   yes          1
## 54  46.79942      4    black      no       2   cocaine    no          0
## 55  58.49455      2    black      no       8   cocaine    no          4
## 56  43.25021      8    white      no       5   alcohol    no         10
## 57  52.42204     10    black      no       5    heroin    no         42
## 58  48.97176     11    black      no       4   cocaine   yes          0
## 59  46.36879      7 hispanic      no       4    heroin    no         13
## 60  37.35865      7    black     yes       2    heroin   yes          1
## 61  57.24648     12    black      no       2   cocaine    no          4
## 62  44.85584     10    black      no       4   cocaine    no          1
## 63  52.85658     11    black      no       7   alcohol   yes         10
## 64  51.49556      7    black     yes       8   cocaine   yes          8
## 65  33.07642      6 hispanic     yes       4    heroin   yes          8
## 66  45.79400     12    black      no       4   alcohol   yes         27
## 67  52.35651     10    white      no       4    heroin    no          0
## 68  37.83872     11    black      no       6   cocaine    no          1
## 69  46.74971      2    black      no      11    heroin   yes          1
## 70  64.35030      3    white      no       1    heroin   yes          2
## 71  35.70664      3    black      no      11   alcohol   yes         29
## 72  32.44772      2    black      no       9   alcohol   yes          0
## 73  39.93384      2    other      no       0    heroin   yes          0
## 74  39.09279      7    black      no       6   alcohol    no         67
## 75  36.52407      3    black     yes       5   alcohol    no         53
## 76  52.51404      8    other      no       7    heroin   yes          0
## 77  34.09209      0    other      no       2   alcohol    no         67
## 78  63.77832      8    black      no       4   cocaine   yes          6
## 79  55.44015     13    white      no       1    heroin   yes         26
## 80  31.11147      9    black      no       0   cocaine   yes         13
## 81  27.09086     13    white     yes       3   alcohol    no          0
## 82  41.95401     13    white      no       4    heroin    no          0
## 83  51.27790      3    black      no       9   cocaine    no          3
## 84  53.42212     14    black      no       4   cocaine    no         58
## 85  47.60948      9    white      no       4   alcohol   yes          6
## 86  32.99675      0    black      no       4   alcohol   yes         13
## 87  56.69189      3    black     yes       3    heroin    no          0
## 88  64.91865     14    black      no      12   cocaine   yes          0
## 89  54.52398     12 hispanic      no       0   alcohol    no         21
## 90  33.53111      7    black      no       2    heroin   yes          0
## 91  49.18084      9    other      no       9   alcohol    no         24
## 92  46.69877      0    black      no      10   cocaine   yes          6
## 93  38.40187      5    white      no       6   alcohol   yes         26
## 94  62.08943     11    black     yes       6   cocaine   yes          0
## 95  42.31495      8    black      no       1    heroin    no          3
## 96  41.10135      3    black      no       6   alcohol    no         19
## 97  58.21511      4    black     yes       1   cocaine    no          1
## 98  41.09781     14    white      no       1    heroin    no          0
## 99  69.17161      4    black      no       7   cocaine    no         11
## 100 54.09069      4    white      no       4   alcohol    no         19
## 101 57.76270      9    white     yes      14    heroin   yes         13
## 102 58.75759      1    black      no       2   cocaine   yes          4
## 103 38.86502      3    white      no       4   alcohol    no         26
## 104 54.98139     13    white      no       4   alcohol   yes          7
## 105 45.27036      2 hispanic      no       3   alcohol   yes         26
## 106 34.51623      8 hispanic     yes      14   alcohol    no         56
## 107 40.04572      1    white      no       2    heroin    no          1
##     max_drinks
## 1            5
## 2            4
## 3           20
## 4          129
## 5            0
## 6           13
## 7            0
## 8           24
## 9            8
## 10          13
## 11          20
## 12           6
## 13           0
## 14         164
## 15          18
## 16           0
## 17           2
## 18          64
## 19          38
## 20          15
## 21           0
## 22          34
## 23          95
## 24           0
## 25           1
## 26          26
## 27          13
## 28           0
## 29           7
## 30          15
## 31           2
## 32           3
## 33           0
## 34          29
## 35          12
## 36           0
## 37          59
## 38          20
## 39          33
## 40          50
## 41          32
## 42          20
## 43          26
## 44           0
## 45          58
## 46          38
## 47           0
## 48           0
## 49           0
## 50          38
## 51          13
## 52          26
## 53           6
## 54           0
## 55           4
## 56          14
## 57          48
## 58           0
## 59          13
## 60           2
## 61           4
## 62           1
## 63          20
## 64           8
## 65          13
## 66          33
## 67           0
## 68           1
## 69           1
## 70           2
## 71          58
## 72           0
## 73           0
## 74          67
## 75          53
## 76           0
## 77          80
## 78           6
## 79          26
## 80          13
## 81           0
## 82           0
## 83           3
## 84          58
## 85           6
## 86          26
## 87           0
## 88           0
## 89          21
## 90           0
## 91          51
## 92          13
## 93          41
## 94           0
## 95          16
## 96          19
## 97           1
## 98           0
## 99          14
## 100         26
## 101         26
## 102          4
## 103         26
## 104          8
## 105         32
## 106         61
## 107         24
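The listing above prints all 107 rows. Because the console is too narrow to hold every column, R wraps `max_drinks` into a separate block after the last row. For a quick look at a data frame, a shorter preview is often easier to read. A minimal sketch, using a small made-up data frame (`demo` and its values are hypothetical stand-ins, not the `female` data):

```r
# Hypothetical stand-in for a larger data frame such as `female`
demo <- data.frame(
  substance  = c("alcohol", "cocaine", "heroin", "alcohol", "cocaine", "heroin"),
  avg_drinks = c(59, 16, 20, 26, 13, 0),
  max_drinks = c(59, 20, 20, 33, 32, 0)
)

# First three rows only, instead of the full listing
preview <- head(demo, n = 3)
print(preview)

# Compact overview: one line per column with its type and first values
str(demo)

# Number of rows and columns
dim(demo)
```

The same calls should work on any data frame, so `head(female)` or `str(female)` would be the natural substitutes here, assuming `female` is an ordinary data frame.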
# Stem-and-leaf display of the cesd values in the female subset
with(female, stem(cesd))
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##   0 | 3
##   0 | 567
##   1 | 3
##   1 | 555589999
##   2 | 123344
##   2 | 66889999
##   3 | 0000233334444
##   3 | 5556666777888899999
##   4 | 00011112222334
##   4 | 555666777889
##   5 | 011122222333444
##   5 | 67788
##   6 | 0
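To read the display above: the note "The decimal point is 1 digit(s) to the right of the |" means each stem counts tens and each leaf is a units digit, so the row `6 | 0` is a single value of 60. Each stem appears on two rows, which is consistent with leaves 0 to 4 on the first row and 5 to 9 on the second. A minimal sketch on a small made-up vector (`scores` is hypothetical, not the `cesd` data) produces the same kind of display alongside the mean and median:

```r
# Hypothetical scores, chosen only for illustration
scores <- c(3, 15, 19, 21, 24, 30, 33, 36, 38, 42, 45, 51, 60)

# Stem-and-leaf display with the default settings
stem(scores)

# Summary values to compare against the shape of the display
center_mean   <- mean(scores)
center_median <- median(scores)
center_mean
center_median
```

Comparing the two summary values with where the leaves pile up is a quick way to judge whether the mean or the median better describes the center of the data.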