Exercise 1: Summarize the backpain data

Load the data file.

## Loading required package: tools
##   ID  status driver suburban
## 1  1    case    yes      yes
## 2  1 control    yes       no
## 3  2    case    yes      yes
## 4  2 control    yes      yes
## 5  3    case    yes       no
## 6  3 control    yes      yes

Split the status variable into two columns (case and control), then count each by driver and suburban and add a total.

## ─ Attaching packages ────────────────────────── tidyverse 1.3.0 ─
## ✓ ggplot2 3.2.1     ✓ purrr   0.3.3
## ✓ tibble  2.1.3     ✓ dplyr   0.8.4
## ✓ tidyr   1.0.2     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.4.0
## ─ Conflicts ─────────────────────────── tidyverse_conflicts() ─
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
##   driver suburban case control total
## 1     no       no   26      47    73
## 2     no      yes    6       7    13
## 3    yes       no   64      63   127
## 4    yes      yes  121     100   221
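The summary above can be sketched in base R as follows. This is a minimal illustration, not the code that produced the table: `backpain_head` is a hypothetical data frame holding only the six rows printed earlier, and base R's `aggregate` stands in for the tidyverse steps.

```r
# Toy data: only the six rows shown in the printed head of the backpain data.
backpain_head <- data.frame(
  ID       = c(1, 1, 2, 2, 3, 3),
  status   = c("case", "control", "case", "control", "case", "control"),
  driver   = rep("yes", 6),
  suburban = c("yes", "no", "yes", "yes", "no", "yes")
)

# Turn status into two indicator columns (1 if the row is a case / a control).
backpain_head$case    <- as.integer(backpain_head$status == "case")
backpain_head$control <- as.integer(backpain_head$status == "control")

# Sum the indicators within each driver x suburban combination.
summary_tbl <- aggregate(cbind(case, control) ~ driver + suburban,
                         data = backpain_head, FUN = sum)

# Row total of cases and controls.
summary_tbl$total <- summary_tbl$case + summary_tbl$control
summary_tbl
```

Run on the full dataset instead of the six-row excerpt, the same steps should give one row per driver and suburban combination, as in the table above.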

Exercise 2: City Murder

Load the state.x77 data.

##            Population Income Illiteracy Life Exp Murder HS Grad Frost   Area
## Alabama          3615   3624        2.1    69.05   15.1    41.3    20  50708
## Alaska            365   6315        1.5    69.31   11.3    66.7   152 566432
## Arizona          2212   4530        1.8    70.55    7.8    58.1    15 113417
## Arkansas         2110   3378        1.9    70.66   10.1    39.9    65  51945
## California      21198   5114        1.1    71.71   10.3    62.6    20 156361
## Colorado         2541   4884        0.7    72.06    6.8    63.9   166 103766

Load the USArrests data.

##            Murder Assault UrbanPop Rape
## Alabama      13.2     236       58 21.2
## Alaska       10.0     263       48 44.5
## Arizona       8.1     294       80 31.0
## Arkansas      8.8     190       50 19.5
## California    9.0     276       91 40.6
## Colorado      7.9     204       78 38.7

Both datasets cover the same states and share a variable (Murder), but the murder values in the two datasets differ.

Merge the two datasets by state name.

## 'data.frame':    50 obs. of  4 variables:
##  $ Murder  : num  13.2 10 8.1 8.8 9 7.9 3.3 5.9 15.4 17.4 ...
##  $ Assault : int  236 263 294 190 276 204 110 238 335 211 ...
##  $ UrbanPop: int  58 48 80 50 91 78 77 72 80 60 ...
##  $ Rape    : num  21.2 44.5 31 19.5 40.6 38.7 11.1 15.8 31.9 25.8 ...
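One way to do the merge, sketched under the assumption that the state names stored as row names are the join key. Both datasets ship with base R; the object names `states` and `merged` are illustrative.

```r
# state.x77 is a matrix, so convert it to a data frame first.
states <- as.data.frame(state.x77)

# Join on the row names (state names). The shared Murder column is
# disambiguated by merge() as Murder.x (state.x77) and Murder.y (USArrests).
merged <- merge(states, USArrests, by = "row.names")

str(merged)
```

The key ends up in a column called `Row.names`, which can be renamed or moved back into the row names if needed.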

Compute the pairwise correlations.

##             Population      Income  Illiteracy    Life Exp    Murder.x
## Population  1.00000000  0.20822756  0.10762237 -0.06805195  0.34364275
## Income      0.20822756  1.00000000 -0.43707519  0.34025534 -0.23007761
## Illiteracy  0.10762237 -0.43707519  1.00000000 -0.58847793  0.70297520
## Life Exp   -0.06805195  0.34025534 -0.58847793  1.00000000 -0.78084575
## Murder.x    0.34364275 -0.23007761  0.70297520 -0.78084575  1.00000000
## HS Grad    -0.09848975  0.61993232 -0.65718861  0.58221620 -0.48797102
## Frost      -0.33215245  0.22628218 -0.67194697  0.26206801 -0.53888344
## Area        0.02254384  0.36331544  0.07726113 -0.10733194  0.22839021
## Murder.y    0.32024487 -0.21520501  0.70677564 -0.77849850  0.93369089
## Assault     0.31702281  0.04093255  0.51101299 -0.62665800  0.73976479
## UrbanPop    0.51260491  0.48053302 -0.06219936  0.27146824  0.01638255
## Rape        0.30523361  0.35738678  0.15459686 -0.26956828  0.57996132
##                HS Grad      Frost        Area    Murder.y     Assault
## Population -0.09848975 -0.3321525  0.02254384  0.32024487  0.31702281
## Income      0.61993232  0.2262822  0.36331544 -0.21520501  0.04093255
## Illiteracy -0.65718861 -0.6719470  0.07726113  0.70677564  0.51101299
## Life Exp    0.58221620  0.2620680 -0.10733194 -0.77849850 -0.62665800
## Murder.x   -0.48797102 -0.5388834  0.22839021  0.93369089  0.73976479
## HS Grad     1.00000000  0.3667797  0.33354187 -0.52159126 -0.23030510
## Frost       0.36677970  1.0000000  0.05922910 -0.54139702 -0.46823989
## Area        0.33354187  0.0592291  1.00000000  0.14808597  0.23120879
## Murder.y   -0.52159126 -0.5413970  0.14808597  1.00000000  0.80187331
## Assault    -0.23030510 -0.4682399  0.23120879  0.80187331  1.00000000
## UrbanPop    0.35868123 -0.2461862 -0.06154747  0.06957262  0.25887170
## Rape        0.27072504 -0.2792054  0.52495510  0.56357883  0.66524123
##               UrbanPop       Rape
## Population  0.51260491  0.3052336
## Income      0.48053302  0.3573868
## Illiteracy -0.06219936  0.1545969
## Life Exp    0.27146824 -0.2695683
## Murder.x    0.01638255  0.5799613
## HS Grad     0.35868123  0.2707250
## Frost      -0.24618618 -0.2792054
## Area       -0.06154747  0.5249551
## Murder.y    0.06957262  0.5635788
## Assault     0.25887170  0.6652412
## UrbanPop    1.00000000  0.4113412
## Rape        0.41134124  1.0000000
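A correlation matrix like the one above can be obtained by passing the numeric columns of the merged data to `cor`. The sketch below repeats the merge so that it runs on its own; `merged` and `cor_mat` are illustrative names.

```r
# Rebuild the merged data (state.x77 joined to USArrests by state name).
merged <- merge(as.data.frame(state.x77), USArrests, by = "row.names")

# Keep only numeric columns; the state-name key column is dropped.
numeric_cols <- merged[sapply(merged, is.numeric)]

# Pearson correlations between every pair of variables.
cor_mat <- cor(numeric_cols)

# The two murder measures, which come from different sources.
cor_mat["Murder.x", "Murder.y"]
```

Although the two murder variables do not agree exactly, their correlation is high, which matches the `Murder.x` and `Murder.y` entry in the matrix above.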

Exercise 3: Survey rmarkdown file

Exercise 4: Education and Vocabulary

Load the data file.

## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## The following object is masked from 'package:purrr':
## 
##     some
##          year    sex education vocabulary
## 19740001 1974   Male        14          9
## 19740002 1974   Male        16          9
## 19740003 1974 Female        10          9
## 19740004 1974 Female        10          5
## 19740005 1974 Female        12          8
## 19740006 1974   Male        16          8

Summarize the relationship between education and vocabulary over the years by gender.

## function (x, data, ...) 
## UseMethod("xyplot")
## <bytecode: 0x7f84a1b16008>
## <environment: namespace:lattice>
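Typing `xyplot` on its own prints the function definition, as in the output above; a plot needs a formula and a data argument. The sketch below is one possible call, assuming the lattice package is available. `vocab_head` is a hypothetical data frame containing only the six rows printed earlier, so the fitted lines are purely illustrative.

```r
library(lattice)

# Toy data: the six rows shown in the printed head of the data.
vocab_head <- data.frame(
  year       = rep(1974, 6),
  sex        = factor(c("Male", "Male", "Female", "Female", "Female", "Male")),
  education  = c(14, 16, 10, 10, 12, 16),
  vocabulary = c(9, 9, 9, 5, 8, 8)
)

# One panel per sex, one colour per year, points plus a regression line.
p <- xyplot(vocabulary ~ education | sex, groups = year, data = vocab_head,
            type = c("p", "r"), auto.key = TRUE,
            xlab = "Education (years)", ylab = "Vocabulary score")

# print(p) draws the plot when run interactively or inside a knitted chunk.
```

With the full data, each panel would show one line per survey year, which makes it possible to compare the education and vocabulary relationship across years within each sex.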

Exercise 5: Body and Brain in Animals

Load the data file.

## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
##                     body brain
## Mountain beaver     1.35   8.1
## Cow               465.00 423.0
## Grey wolf          36.33 119.5
## Goat               27.66 115.0
## Guinea pig          1.04   5.5
## Dipliodocus     11700.00  50.0
##                    body brain
## Arctic fox        3.385  44.5
## Owl monkey        0.480  15.5
## Mountain beaver   1.350   8.1
## Cow             465.000 423.0
## Grey wolf        36.330 119.5
## Goat             27.660 115.0

Merge the two datasets and remove duplicate rows.

##                       body  brain
## Mountain beaver1     1.350    8.1
## Cow1               465.000  423.0
## Grey wolf1          36.330  119.5
## Goat1               27.660  115.0
## Guinea pig1          1.040    5.5
## Asian elephant1   2547.000 4603.0
## Donkey1            187.100  419.0
## Horse1             521.000  655.0
## Patas monkey        10.000  115.0
## Cat1                 3.300   25.6
## Giraffe1           529.000  680.0
## Gorilla1           207.000  406.0
## Human1              62.000 1320.0
## African elephant1 6654.000 5712.0
## Rhesus monkey1       6.800  179.0
## Kangaroo1           35.000   56.0
## Golden hamster1      0.120    1.0
## Mouse1               0.023    0.4
## Rabbit1              2.500   12.1
## Sheep1              55.500  175.0
## Jaguar1            100.000  157.0
## Chimpanzee1         52.160  440.0
## Rat1                 0.280    1.9
## Mole rat             0.122    3.0
## Pig1               192.000  180.0
## 'data.frame':    90 obs. of  2 variables:
##  $ body : num  1.35 465 36.33 27.66 1.04 ...
##  $ brain: num  8.1 423 119.5 115 5.5 ...
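A minimal sketch of stacking two tables and dropping repeated animals, using only the rows printed in the two heads above. `animals_head` and `mammals_head` are hypothetical stand-ins for the full datasets, and the animal name is kept in an ordinary column because `rbind` alters duplicated row names to keep them unique.

```r
# Toy data: the six rows shown from each dataset, with names as a column.
animals_head <- data.frame(
  name  = c("Mountain beaver", "Cow", "Grey wolf", "Goat",
            "Guinea pig", "Dipliodocus"),
  body  = c(1.35, 465, 36.33, 27.66, 1.04, 11700),
  brain = c(8.1, 423, 119.5, 115, 5.5, 50)
)
mammals_head <- data.frame(
  name  = c("Arctic fox", "Owl monkey", "Mountain beaver", "Cow",
            "Grey wolf", "Goat"),
  body  = c(3.385, 0.48, 1.35, 465, 36.33, 27.66),
  brain = c(44.5, 15.5, 8.1, 423, 119.5, 115)
)

# Stack the two tables, then keep the first occurrence of each animal.
combined <- rbind(animals_head, mammals_head)
deduped  <- combined[!duplicated(combined$name), ]
rownames(deduped) <- NULL

nrow(combined)  # rows before removing duplicates
nrow(deduped)   # rows after removing duplicates
```

Deduplicating on the name rather than on all columns also catches animals whose measurements differ slightly between the two sources; whether that is desirable depends on the exercise.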

Exercise 6: Body and Brain in Animals