Graded: 1.8, 1.10, 1.28, 1.36, 1.48, 1.50, 1.56, 1.70

Problem 1.8

library(openintro)
## Please visit openintro.org for free statistics materials
## 
## Attaching package: 'openintro'
## The following objects are masked from 'package:datasets':
## 
##     cars, trees
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
data(package = 'openintro')
data("heartTr")

str(smoking)
## 'data.frame':    1691 obs. of  12 variables:
##  $ gender              : Factor w/ 2 levels "Female","Male": 2 1 2 1 1 1 2 2 2 1 ...
##  $ age                 : int  38 42 40 40 39 37 53 44 40 41 ...
##  $ maritalStatus       : Factor w/ 5 levels "Divorced","Married",..: 1 4 2 2 2 2 2 4 4 2 ...
##  $ highestQualification: Factor w/ 8 levels "A Levels","Degree",..: 6 6 2 2 4 4 2 2 3 6 ...
##  $ nationality         : Factor w/ 8 levels "British","English",..: 1 1 2 2 1 1 1 2 2 2 ...
##  $ ethnicity           : Factor w/ 7 levels "Asian","Black",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ grossIncome         : Factor w/ 10 levels "10,400 to 15,600",..: 3 9 5 1 3 2 7 1 3 6 ...
##  $ region              : Factor w/ 7 levels "London","Midlands & East Anglia",..: 6 6 6 6 6 6 6 6 6 6 ...
##  $ smoke               : Factor w/ 2 levels "No","Yes": 1 2 1 1 1 1 2 1 2 2 ...
##  $ amtWeekends         : int  NA 12 NA NA NA NA 6 NA 8 15 ...
##  $ amtWeekdays         : int  NA 12 NA NA NA NA 6 NA 8 12 ...
##  $ type                : Factor w/ 5 levels "","Both/Mainly Hand-Rolled",..: 1 5 1 1 1 1 5 1 4 5 ...
  1. Each row represents a UK resident who was interviewed in the survey.

  2. A total of 1691 UK residents participated in this survey.

  3. Categorical variables: gender, marital status, highest qualification (ordinal), nationality, ethnicity, gross income (ordinal), region, smoke.
     Numerical variables: age (continuous), amount on weekends (discrete), amount on weekdays (discrete).
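As a rough check on the classification above, the sketch below rebuilds the first five rows of four columns from the `str(smoking)` output and asks R which columns are stored as numbers. The small data frame `smoking_head` is a hand-made stand-in, not the full data set. Storage type is only a hint: whether a variable is ordinal, continuous, or discrete remains a judgment call.

```r
# Hand-made stand-in for the first five rows of four smoking columns,
# decoded from the str() output (assumption: not the full data set).
smoking_head <- data.frame(
  gender      = factor(c("Male", "Female", "Male", "Female", "Female")),
  age         = c(38L, 42L, 40L, 40L, 39L),
  smoke       = factor(c("No", "Yes", "No", "No", "No")),
  amtWeekends = c(NA, 12L, NA, NA, NA)
)

# TRUE for columns stored as numbers, FALSE for factors
is_numeric <- sapply(smoking_head, is.numeric)

numerical_vars   <- names(smoking_head)[is_numeric]
categorical_vars <- names(smoking_head)[!is_numeric]

print(numerical_vars)
print(categorical_vars)
```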

Problem 1.10

  1. The population of interest is children aged 5 to 15.

  2. The population is very large (all children aged 5 to 15), while the sample of 160 children is small by comparison. Whether it is appropriate to generalize depends on the target: all children aged 5 to 15 in the US, in the state, or in the school. Also, a 10-year age range spans large differences in maturity. The study combines first graders with high school freshmen and sophomores, so I doubt the results generalize accurately to the whole population.

Problem 1.28

  1. Based on the study and its results, I conclude that smoking and dementia are associated. I would not conclude that smoking causes dementia: the study is observational, not experimental, so it cannot establish causation.

  2. The statement is not justified. According to the researchers' conclusion, children who had behavioral issues and those who were identified as bullies were twice as likely to have shown symptoms of sleep disorders. This means that children who have behavioral issues or who are bullies are more likely to have sleep disorder symptoms; it does not establish which one causes the other.

Problem 1.36

  1. Randomized experiment

  2. Treatment: the exercise group. Control: the non-exercise group.

  3. This experiment uses blocking, with age as the blocking variable.

  4. No, because participants know who is receiving the treatment (exercise) and who is not.

  5. A causal relationship is possible given the nature of the experiment. Depending on the size of the sample, one would be able to generalize the results to the population at large.

  6. I would suggest choosing a reasonable sample size, increasing exercise to 3-5 days per week, categorizing the types of workouts (weight lifting, cardio, yoga, etc.), and tracking average heart rate throughout each workout.
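To make the blocking idea concrete, here is a minimal sketch of random assignment within age blocks. Everything in it is hypothetical: the 12 participants, the three age-group labels, and the seed are invented for illustration and are not taken from the study.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Hypothetical participants: 4 people in each of 3 age blocks
participants <- data.frame(
  id        = 1:12,
  age_group = rep(c("younger", "middle", "older"), each = 4)
)
participants$group <- NA_character_

# Within each block, randomly assign half to each group
for (block in unique(participants$age_group)) {
  rows <- which(participants$age_group == block)
  labels <- rep(c("exercise", "no exercise"), length.out = length(rows))
  participants$group[rows] <- sample(labels)
}

# Each age block should contribute equally to both groups
assignment_table <- table(participants$age_group, participants$group)
print(assignment_table)
```

Because assignment is randomized separately inside each block, age is balanced across the two groups by construction.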

Problem 1.48

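# Final exam scores for 20 intro statistics students
scores <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78,
            79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

# Box plot of the scores; a plain numeric vector is enough here
boxplot(scores, ylab = "Final Exam Scores for Intro Statistics")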
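The numbers behind the box plot can be checked directly. The sketch below uses `fivenum()`, which returns Tukey's five-number summary, and then applies the 1.5 × IQR rule to flag potential outliers. The variable names are my own, and `fivenum()` hinges can differ slightly from the quartiles that `quantile()` reports.

```r
# Final exam scores from Problem 1.48
scores <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78,
            79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max
five_num <- fivenum(scores)
names(five_num) <- c("min", "Q1", "median", "Q3", "max")
print(five_num)

# 1.5 * IQR rule for flagging potential outliers
iqr <- five_num[["Q3"]] - five_num[["Q1"]]
lower_fence <- five_num[["Q1"]] - 1.5 * iqr
upper_fence <- five_num[["Q3"]] + 1.5 * iqr

outliers <- scores[scores < lower_fence | scores > upper_fence]
print(outliers)
```

With these scores the lower fence is 57.5, so the lowest score of 57 falls just outside it and is drawn as a separate point on the box plot.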

Problem 1.50

  1. Symmetric, unimodal; matches boxplot (2)

  2. Roughly uniform; matches boxplot (3)

  3. Skewed right, unimodal; matches boxplot (1)

Problem 1.56

  1. Right skewed, since there is a meaningful number of houses over $6,000,000. Based on the percentages provided, most home prices lie between $350k and $1mil. The mean might not be a good representation of a typical observation because of the houses over $6mil, so I would suggest using the median. Likewise, the standard deviation wouldn’t be appropriate to represent the variability of the observations because of the skew. The IQR would be a better representation.

  2. The distribution seems fairly symmetric, with a majority of houses priced around $600k. The few houses over $1.2mil might skew the distribution slightly to the right, however. I would use the mean to represent a typical price and the standard deviation to measure the variability, since the distribution is close to symmetric.

  3. The distribution would be heavily skewed right. Most students cannot legally drink until their junior year, and in some cases their senior year, which suggests that about 75% have 0 drinks per week. The upper 25% drink a moderate amount, say 4 drinks per week, with a few excessive drinkers at, say, 8 or more. The median seems to be the best measure of a typical value since most students are underage and unable to drink, and the IQR is the best measure of variability since the few heavy drinkers would not inflate it.

  4. Right-skewed distribution: most salaries cluster at the lower end, with a few executives on the right making much higher salaries. Because of the skew, the median best represents the typical salary, and the IQR is the better measure of variability since it is not influenced by the high executive salaries.
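The reasoning about skew can be illustrated with a small made-up example. The salary figures below (in thousands of dollars) are entirely hypothetical: 95 regular employees plus 5 executives. The sketch compares how the mean and standard deviation react to the executives versus how the median and IQR react.

```r
# Hypothetical salaries in thousands of dollars (invented for illustration)
regular    <- seq(41, 135, by = 1)             # 95 regular employees
executives <- c(900, 1200, 1500, 2000, 5000)   # 5 highly paid executives
salaries   <- c(regular, executives)

# Center: the mean is pulled toward the long right tail, the median barely moves
summary_stats <- c(
  mean_all       = mean(salaries),
  median_all     = median(salaries),
  mean_regular   = mean(regular),
  median_regular = median(regular)
)
print(round(summary_stats, 1))

# Spread: the SD is inflated by the executives, the IQR is not
spread_stats <- c(sd_all = sd(salaries), iqr_all = IQR(salaries))
print(round(spread_stats, 1))
```

In a right-skewed sample like this one the mean sits above the median, which is one quick numeric sign of the skew direction.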

Problem 1.70

  1. It looks like more people received the treatment than did not. The mosaic plot suggests the survival rate improves with treatment, but we cannot draw a conclusion about independence solely from the mosaic plot.

  2. The box plot suggests that the treatment is only moderately effective: the median survival time is around 250 days, but that is longer than without treatment. The treatment extends survival compared to the control group, where almost all patients die within 250 days.

heartTr
##      id acceptyear age survived survtime prior transplant wait
## 1    15         68  53     dead        1    no    control   NA
## 2    43         70  43     dead        2    no    control   NA
## 3    61         71  52     dead        2    no    control   NA
## 4    75         72  52     dead        2    no    control   NA
## 5     6         68  54     dead        3    no    control   NA
## 6    42         70  36     dead        3    no    control   NA
## 7    54         71  47     dead        3    no    control   NA
## 8    38         70  41     dead        5    no  treatment    5
## 9    85         73  47     dead        5    no    control   NA
## 10    2         68  51     dead        6    no    control   NA
## 11  103         67  39     dead        6    no    control   NA
## 12   12         68  53     dead        8    no    control   NA
## 13   48         71  56     dead        9    no    control   NA
## 14  102         74  40    alive       11    no    control   NA
## 15   35         70  43     dead       12    no    control   NA
## 16   95         73  40     dead       16    no  treatment    2
## 17   31         69  54     dead       16    no    control   NA
## 18    3         68  54     dead       16    no  treatment    1
## 19   74         72  29     dead       17    no  treatment    5
## 20    5         68  20     dead       18    no    control   NA
## 21   77         72  41     dead       21    no    control   NA
## 22   99         73  49     dead       21    no    control   NA
## 23   20         69  55     dead       28    no  treatment    1
## 24   70         72  52     dead       30    no  treatment    5
## 25  101         74  49    alive       31    no    control   NA
## 26   66         72  53     dead       32    no    control   NA
## 27   29         69  50     dead       35    no    control   NA
## 28   17         68  20     dead       36    no    control   NA
## 29   19         68  59     dead       37    no    control   NA
## 30    4         68  40     dead       39    no  treatment   36
## 31  100         74  35    alive       39   yes  treatment   38
## 32    8         68  45     dead       40    no    control   NA
## 33   44         70  42     dead       40    no    control   NA
## 34   16         68  56     dead       43    no  treatment   20
## 35   45         71  36     dead       45    no  treatment    1
## 36    1         67  30     dead       50    no    control   NA
## 37   22         69  42     dead       51    no  treatment   12
## 38   39         70  50     dead       53    no  treatment    2
## 39   10         68  42     dead       58    no  treatment   12
## 40   35         71  52     dead       61    no  treatment   10
## 41   37         70  61     dead       66    no  treatment   19
## 42   68         72  45     dead       68    no  treatment    3
## 43   60         71  49     dead       68    no  treatment    3
## 44   62         71  39     dead       69    no    control   NA
## 45   28         69  53     dead       72    no  treatment   71
## 46   47         71  47     dead       72    no  treatment   21
## 47   32         69  64     dead       77    no  treatment   17
## 48   65         72  51     dead       78    no  treatment   12
## 49   83         73  53     dead       80    no  treatment   32
## 50   13         68  54     dead       81    no  treatment   17
## 51    9         68  47     dead       85    no    control   NA
## 52   73         72  56     dead       90    no  treatment   27
## 53   79         72  53     dead       96    no  treatment   67
## 54   36         70  48     dead      100    no  treatment   46
## 55   32         71  41     dead      102    no    control   NA
## 56   98         73  28    alive      109    no  treatment   96
## 57   87         73  46     dead      110    no  treatment   60
## 58   97         73  23    alive      131    no  treatment   21
## 59   37         71  41     dead      149    no    control   NA
## 60   11         68  47     dead      153    no  treatment   26
## 61   94         73  43     dead      165   yes  treatment    4
## 62   96         73  26    alive      180    no  treatment   13
## 63   90         73  52     dead      186   yes  treatment  160
## 64   53         71  47     dead      188    no  treatment   41
## 65   89         73  51     dead      207    no  treatment  139
## 66   24         69  51     dead      219    no  treatment   83
## 67   27         69   8     dead      263    no    control   NA
## 68   93         73  47    alive      265    no  treatment   28
## 69   51         71  48     dead      285    no  treatment   32
## 70   67         73  19     dead      285    no  treatment   57
## 71   16         68  49     dead      308    no  treatment   28
## 72   84         73  42     dead      334    no  treatment   37
## 73   91         73  47     dead      340    no    control   NA
## 74   92         73  44    alive      340    no  treatment  310
## 75   58         71  47     dead      342   yes  treatment   21
## 76   88         73  54    alive      370    no  treatment   31
## 77   86         73  48    alive      397    no  treatment    8
## 78   82         71  29    alive      427    no    control   NA
## 79   81         73  52    alive      445    no  treatment    6
## 80   80         72  46    alive      482   yes  treatment   26
## 81   78         72  48    alive      515    no  treatment  210
## 82   76         72  52    alive      545   yes  treatment   46
## 83   64         72  48     dead      583   yes  treatment   32
## 84   72         72  26    alive      596    no  treatment    4
## 85   71         72  47    alive      630    no  treatment   31
## 86   69         72  47    alive      670    no  treatment   10
## 87    7         68  50     dead      675    no  treatment   51
## 88   23         69  58     dead      733    no  treatment    3
## 89   63         71  32    alive      841    no  treatment   27
## 90   30         69  44     dead      852    no  treatment   16
## 91   59         71  41    alive      915    no  treatment   78
## 92   56         71  38    alive      941    no  treatment   67
## 93   50         71  45     dead      979   yes  treatment   83
## 94   46         71  48     dead      995   yes  treatment    2
## 95   21         69  43     dead     1032    no  treatment    8
## 96   49         71  36    alive     1141   yes  treatment   36
## 97   41         70  45    alive     1321   yes  treatment   58
## 98   14         68  53     dead     1386    no  treatment   37
## 99   26         69  30    alive     1400    no    control   NA
## 100  40         70  48    alive     1407   yes  treatment   41
## 101  34         69  40    alive     1571    no  treatment   23
## 102  33         69  48    alive     1586    no  treatment   51
## 103  25         69  33    alive     1799    no  treatment   25
treatmentgroup <- filter(heartTr,transplant == "treatment")
treatmentgroup
##     id acceptyear age survived survtime prior transplant wait
## 1   38         70  41     dead        5    no  treatment    5
## 2   95         73  40     dead       16    no  treatment    2
## 3    3         68  54     dead       16    no  treatment    1
## 4   74         72  29     dead       17    no  treatment    5
## 5   20         69  55     dead       28    no  treatment    1
## 6   70         72  52     dead       30    no  treatment    5
## 7    4         68  40     dead       39    no  treatment   36
## 8  100         74  35    alive       39   yes  treatment   38
## 9   16         68  56     dead       43    no  treatment   20
## 10  45         71  36     dead       45    no  treatment    1
## 11  22         69  42     dead       51    no  treatment   12
## 12  39         70  50     dead       53    no  treatment    2
## 13  10         68  42     dead       58    no  treatment   12
## 14  35         71  52     dead       61    no  treatment   10
## 15  37         70  61     dead       66    no  treatment   19
## 16  68         72  45     dead       68    no  treatment    3
## 17  60         71  49     dead       68    no  treatment    3
## 18  28         69  53     dead       72    no  treatment   71
## 19  47         71  47     dead       72    no  treatment   21
## 20  32         69  64     dead       77    no  treatment   17
## 21  65         72  51     dead       78    no  treatment   12
## 22  83         73  53     dead       80    no  treatment   32
## 23  13         68  54     dead       81    no  treatment   17
## 24  73         72  56     dead       90    no  treatment   27
## 25  79         72  53     dead       96    no  treatment   67
## 26  36         70  48     dead      100    no  treatment   46
## 27  98         73  28    alive      109    no  treatment   96
## 28  87         73  46     dead      110    no  treatment   60
## 29  97         73  23    alive      131    no  treatment   21
## 30  11         68  47     dead      153    no  treatment   26
## 31  94         73  43     dead      165   yes  treatment    4
## 32  96         73  26    alive      180    no  treatment   13
## 33  90         73  52     dead      186   yes  treatment  160
## 34  53         71  47     dead      188    no  treatment   41
## 35  89         73  51     dead      207    no  treatment  139
## 36  24         69  51     dead      219    no  treatment   83
## 37  93         73  47    alive      265    no  treatment   28
## 38  51         71  48     dead      285    no  treatment   32
## 39  67         73  19     dead      285    no  treatment   57
## 40  16         68  49     dead      308    no  treatment   28
## 41  84         73  42     dead      334    no  treatment   37
## 42  92         73  44    alive      340    no  treatment  310
## 43  58         71  47     dead      342   yes  treatment   21
## 44  88         73  54    alive      370    no  treatment   31
## 45  86         73  48    alive      397    no  treatment    8
## 46  81         73  52    alive      445    no  treatment    6
## 47  80         72  46    alive      482   yes  treatment   26
## 48  78         72  48    alive      515    no  treatment  210
## 49  76         72  52    alive      545   yes  treatment   46
## 50  64         72  48     dead      583   yes  treatment   32
## 51  72         72  26    alive      596    no  treatment    4
## 52  71         72  47    alive      630    no  treatment   31
## 53  69         72  47    alive      670    no  treatment   10
## 54   7         68  50     dead      675    no  treatment   51
## 55  23         69  58     dead      733    no  treatment    3
## 56  63         71  32    alive      841    no  treatment   27
## 57  30         69  44     dead      852    no  treatment   16
## 58  59         71  41    alive      915    no  treatment   78
## 59  56         71  38    alive      941    no  treatment   67
## 60  50         71  45     dead      979   yes  treatment   83
## 61  46         71  48     dead      995   yes  treatment    2
## 62  21         69  43     dead     1032    no  treatment    8
## 63  49         71  36    alive     1141   yes  treatment   36
## 64  41         70  45    alive     1321   yes  treatment   58
## 65  14         68  53     dead     1386    no  treatment   37
## 66  40         70  48    alive     1407   yes  treatment   41
## 67  34         69  40    alive     1571    no  treatment   23
## 68  33         69  48    alive     1586    no  treatment   51
## 69  25         69  33    alive     1799    no  treatment   25
filter(treatmentgroup, survived == 'dead')
##    id acceptyear age survived survtime prior transplant wait
## 1  38         70  41     dead        5    no  treatment    5
## 2  95         73  40     dead       16    no  treatment    2
## 3   3         68  54     dead       16    no  treatment    1
## 4  74         72  29     dead       17    no  treatment    5
## 5  20         69  55     dead       28    no  treatment    1
## 6  70         72  52     dead       30    no  treatment    5
## 7   4         68  40     dead       39    no  treatment   36
## 8  16         68  56     dead       43    no  treatment   20
## 9  45         71  36     dead       45    no  treatment    1
## 10 22         69  42     dead       51    no  treatment   12
## 11 39         70  50     dead       53    no  treatment    2
## 12 10         68  42     dead       58    no  treatment   12
## 13 35         71  52     dead       61    no  treatment   10
## 14 37         70  61     dead       66    no  treatment   19
## 15 68         72  45     dead       68    no  treatment    3
## 16 60         71  49     dead       68    no  treatment    3
## 17 28         69  53     dead       72    no  treatment   71
## 18 47         71  47     dead       72    no  treatment   21
## 19 32         69  64     dead       77    no  treatment   17
## 20 65         72  51     dead       78    no  treatment   12
## 21 83         73  53     dead       80    no  treatment   32
## 22 13         68  54     dead       81    no  treatment   17
## 23 73         72  56     dead       90    no  treatment   27
## 24 79         72  53     dead       96    no  treatment   67
## 25 36         70  48     dead      100    no  treatment   46
## 26 87         73  46     dead      110    no  treatment   60
## 27 11         68  47     dead      153    no  treatment   26
## 28 94         73  43     dead      165   yes  treatment    4
## 29 90         73  52     dead      186   yes  treatment  160
## 30 53         71  47     dead      188    no  treatment   41
## 31 89         73  51     dead      207    no  treatment  139
## 32 24         69  51     dead      219    no  treatment   83
## 33 51         71  48     dead      285    no  treatment   32
## 34 67         73  19     dead      285    no  treatment   57
## 35 16         68  49     dead      308    no  treatment   28
## 36 84         73  42     dead      334    no  treatment   37
## 37 58         71  47     dead      342   yes  treatment   21
## 38 64         72  48     dead      583   yes  treatment   32
## 39  7         68  50     dead      675    no  treatment   51
## 40 23         69  58     dead      733    no  treatment    3
## 41 30         69  44     dead      852    no  treatment   16
## 42 50         71  45     dead      979   yes  treatment   83
## 43 46         71  48     dead      995   yes  treatment    2
## 44 21         69  43     dead     1032    no  treatment    8
## 45 14         68  53     dead     1386    no  treatment   37
controlgroup <- filter(heartTr,transplant == "control")
controlgroup
##     id acceptyear age survived survtime prior transplant wait
## 1   15         68  53     dead        1    no    control   NA
## 2   43         70  43     dead        2    no    control   NA
## 3   61         71  52     dead        2    no    control   NA
## 4   75         72  52     dead        2    no    control   NA
## 5    6         68  54     dead        3    no    control   NA
## 6   42         70  36     dead        3    no    control   NA
## 7   54         71  47     dead        3    no    control   NA
## 8   85         73  47     dead        5    no    control   NA
## 9    2         68  51     dead        6    no    control   NA
## 10 103         67  39     dead        6    no    control   NA
## 11  12         68  53     dead        8    no    control   NA
## 12  48         71  56     dead        9    no    control   NA
## 13 102         74  40    alive       11    no    control   NA
## 14  35         70  43     dead       12    no    control   NA
## 15  31         69  54     dead       16    no    control   NA
## 16   5         68  20     dead       18    no    control   NA
## 17  77         72  41     dead       21    no    control   NA
## 18  99         73  49     dead       21    no    control   NA
## 19 101         74  49    alive       31    no    control   NA
## 20  66         72  53     dead       32    no    control   NA
## 21  29         69  50     dead       35    no    control   NA
## 22  17         68  20     dead       36    no    control   NA
## 23  19         68  59     dead       37    no    control   NA
## 24   8         68  45     dead       40    no    control   NA
## 25  44         70  42     dead       40    no    control   NA
## 26   1         67  30     dead       50    no    control   NA
## 27  62         71  39     dead       69    no    control   NA
## 28   9         68  47     dead       85    no    control   NA
## 29  32         71  41     dead      102    no    control   NA
## 30  37         71  41     dead      149    no    control   NA
## 31  27         69   8     dead      263    no    control   NA
## 32  91         73  47     dead      340    no    control   NA
## 33  82         71  29    alive      427    no    control   NA
## 34  26         69  30    alive     1400    no    control   NA
filter(controlgroup, survived == 'dead')
##     id acceptyear age survived survtime prior transplant wait
## 1   15         68  53     dead        1    no    control   NA
## 2   43         70  43     dead        2    no    control   NA
## 3   61         71  52     dead        2    no    control   NA
## 4   75         72  52     dead        2    no    control   NA
## 5    6         68  54     dead        3    no    control   NA
## 6   42         70  36     dead        3    no    control   NA
## 7   54         71  47     dead        3    no    control   NA
## 8   85         73  47     dead        5    no    control   NA
## 9    2         68  51     dead        6    no    control   NA
## 10 103         67  39     dead        6    no    control   NA
## 11  12         68  53     dead        8    no    control   NA
## 12  48         71  56     dead        9    no    control   NA
## 13  35         70  43     dead       12    no    control   NA
## 14  31         69  54     dead       16    no    control   NA
## 15   5         68  20     dead       18    no    control   NA
## 16  77         72  41     dead       21    no    control   NA
## 17  99         73  49     dead       21    no    control   NA
## 18  66         72  53     dead       32    no    control   NA
## 19  29         69  50     dead       35    no    control   NA
## 20  17         68  20     dead       36    no    control   NA
## 21  19         68  59     dead       37    no    control   NA
## 22   8         68  45     dead       40    no    control   NA
## 23  44         70  42     dead       40    no    control   NA
## 24   1         67  30     dead       50    no    control   NA
## 25  62         71  39     dead       69    no    control   NA
## 26   9         68  47     dead       85    no    control   NA
## 27  32         71  41     dead      102    no    control   NA
## 28  37         71  41     dead      149    no    control   NA
## 29  27         69   8     dead      263    no    control   NA
## 30  91         73  47     dead      340    no    control   NA
  1. 69 people received the treatment and 45 of them died (45/69 ≈ 0.652). 34 did not receive the treatment and 30 of them died (30/34 ≈ 0.882).

  2. The claim being tested is that a heart transplant increases lifespan.
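The counts above translate into death proportions with a couple of lines of arithmetic. This sketch types the counts in by hand rather than recomputing them from `heartTr`, and the variable names are my own.

```r
# Counts taken from the filtered tables above
deaths      <- c(treatment = 45, control = 30)
group_sizes <- c(treatment = 69, control = 34)

# Proportion of patients who died in each group
death_props <- deaths / group_sizes
print(round(death_props, 3))

# Observed difference in proportions (treatment - control)
observed_diff <- death_props[["treatment"]] - death_props[["control"]]
print(round(observed_diff, 3))
```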

  1. We write alive on 28 cards representing patients who were alive at the end of the study, and dead on 75 cards representing patients who were not. Then, we shuffle these cards and split them into two groups: one group of size 69 representing treatment, and another group of size 34 representing control. We calculate the difference between the proportion of dead cards in the treatment and control groups (treatment - control) and record this value. We repeat this 100 times to build a distribution centered at 0. Lastly, we calculate the fraction of simulations where the simulated differences in proportions are at least as extreme as the observed difference, 45/69 - 30/34 ≈ -0.230. If this fraction is low, we conclude that it is unlikely to have observed such an outcome by chance and that the null hypothesis should be rejected in favor of the alternative.

  2. Based on the results, it seems that the transplant is effective. The more negative the simulated difference in proportions, the smaller the proportion of deaths in the treatment group relative to the control group. The simulated distribution is roughly symmetric and unimodal, with its mode at a negative difference in proportions.
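The card-shuffling procedure can be sketched in a few lines of R. This is a minimal illustration rather than the textbook's own simulation: the seed is arbitrary, the helper `simulate_diff()` is a name I made up, and the outcome labels are rebuilt from the counts (28 alive, 75 dead) instead of being read from `heartTr`.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

n_treatment <- 69
n_control   <- 34

# One "card" per patient: 28 alive and 75 dead
outcomes <- c(rep("alive", 28), rep("dead", 75))

# Observed difference in death proportions (treatment - control)
observed_diff <- 45 / 69 - 30 / 34

# Shuffle the cards, deal them into two groups, and return the
# difference in the proportion of dead cards (treatment - control)
simulate_diff <- function() {
  shuffled  <- sample(outcomes)
  treatment <- shuffled[1:n_treatment]
  control   <- shuffled[(n_treatment + 1):(n_treatment + n_control)]
  mean(treatment == "dead") - mean(control == "dead")
}

# Repeat the shuffle 100 times, as described above
sim_diffs <- replicate(100, simulate_diff())

# Fraction of simulated differences at least as extreme as the observed one
p_value <- mean(sim_diffs <= observed_diff)
print(p_value)

hist(sim_diffs, xlab = "Simulated difference in death proportions", main = "")
```

A small `p_value` would mean that a difference as negative as the observed one rarely arises from shuffling alone.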