Math and statistics homework

R Markdown

HW #1.8)

1. each row is an observation
2. 1691 observations
3. Sex - categorical; Age - numerical, discrete; Marital - categorical; GrossIncome - categorical, ordinal; Smoke - categorical; AmtWeekends - numerical, discrete; AmtWeekdays - numerical, discrete

1.10)
1. population of interest - children between the ages of 5 and 15; sample - 160 children
2. if the sample in this field experiment is truly random, the results can be generalized to the population; that is, if we have good reason to believe that the 160 children were randomly selected from the population, the findings can be generalized. The same should apply to a causal relationship, provided the children were randomly assigned to the groups
1.28)
1. it is an observational study, so we cannot draw causal conclusions from it
2. the statement is not justified - it is an observational study, and we cannot make this type of categorical conclusion. Our conclusion should be that a randomized experiment is needed to confirm or reject our hypothesis
1.36)
1. randomized experiment
2. treatment - exercise group; control - no exercise group
3. blocking variable - age groups
4. no blinding
5. there was no blinding, so individuals knew whether they received the treatment (exercise) or not. Nevertheless, we can cautiously use the results to establish a relationship between exercise and mental health and apply these conclusions to the general population between the ages of 18 and 55
6. I think it is the best we can do to establish a relationship between exercise and mental health. I am not sure how we could blind exercise, so I would fund the study
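
To illustrate what blocking by age group means in practice, here is a minimal sketch of randomizing within blocks. The participant table, the three age-group labels, and the number of participants are made up for illustration and are not taken from the study.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical participants: 12 people in three made-up age blocks
participants <- data.frame(
  id = 1:12,
  age_group = rep(c("18-30", "31-40", "41-55"), each = 4)
)

# Within each age block, shuffle an equal number of 1s and 2s
arm_code <- ave(
  participants$id, participants$age_group,
  FUN = function(ids) sample(rep(c(1, 2), length.out = length(ids)))
)
participants$arm <- c("exercise", "control")[arm_code]

# Each block contributes the same number of people to each arm
table(participants$age_group, participants$arm)
```

Randomizing inside each block keeps the age distribution balanced between the exercise and control groups, so age cannot explain a difference in outcomes.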
1.48)

e48 <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

boxplot(e48)
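
The box plot can be cross-checked numerically. This is a minimal sketch using base R's `fivenum()` and `boxplot.stats()`; the variable names `five` and `stats` are my own.

```r
e48 <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78,
         79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

# Minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(e48)
five  # 57.0 72.5 78.5 82.5 94.0

# boxplot() draws the numbers returned by boxplot.stats():
# points more than 1.5 hinge-spreads beyond a hinge are flagged as outliers
stats <- boxplot.stats(e48)
stats$stats  # whisker ends and box edges: 66.0 72.5 78.5 82.5 94.0
stats$out    # 57 is below 72.5 - 1.5 * 10 = 57.5, so it is drawn as an outlier
```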

1.50)

      a) symmetrical; (2)
      b) symmetrical/uniform; (3)
      c) right skewed; (1)

1.56)

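    a) right skewed; the median better represents a typical observation; the IQR better describes the variability (skewed distribution)
    b) symmetrical; the mean and median should be similar; either the IQR or the standard deviation describes the variability (symmetrical distribution)
    c) right skewed; the median better represents a typical observation; the IQR better describes the variability (skewed distribution)
    d) right skewed; the median better represents a typical observation; the IQR better describes the variability (skewed distribution)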
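
A small sketch of why the median and IQR are preferred for right-skewed data. The sample `x` is made up for illustration: most values are small and one is very large.

```r
# Made-up right-skewed sample
x <- c(1, 2, 2, 3, 3, 4, 50)

x_mean   <- mean(x)    # pulled upward by the single large value
x_median <- median(x)  # 3, unaffected by how large the extreme value is
x_sd     <- sd(x)      # inflated by the extreme value
x_iqr    <- IQR(x)     # 1.5, based only on the middle half of the data

c(mean = x_mean, median = x_median, sd = x_sd, IQR = x_iqr)
```

The mean and standard deviation change a lot if the value 50 is replaced with something even larger, while the median and IQR stay the same.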

1.70)

    a) survival is dependent on the transplant; the boxes for alive patients are at different levels in the two groups
    b) the treatment group had a much better outcome

load(file="heartTr.RData")

summary(heartTr)

##        id          acceptyear         age         survived 
##  Min.   :  1.0   Min.   :67.00   Min.   : 8.00   alive:28  
##  1st Qu.: 26.5   1st Qu.:69.00   1st Qu.:41.00   dead :75  
##  Median : 49.0   Median :71.00   Median :47.00             
##  Mean   : 51.4   Mean   :70.62   Mean   :44.64             
##  3rd Qu.: 77.5   3rd Qu.:72.00   3rd Qu.:52.00             
##  Max.   :103.0   Max.   :74.00   Max.   :64.00             
##                                                            
##     survtime      prior        transplant      wait       
##  Min.   :   1.0   no :91   control  :34   Min.   :  1.00  
##  1st Qu.:  33.5   yes:12   treatment:69   1st Qu.: 10.00  
##  Median :  90.0                           Median : 26.00  
##  Mean   : 310.2                           Mean   : 38.42  
##  3rd Qu.: 412.0                           3rd Qu.: 46.00  
##  Max.   :1799.0                           Max.   :310.00  
##                                           NA's   :34

library(plyr)

# Store the result under a new name so it does not mask plyr::count()
counts <- count(heartTr, c('transplant', 'survived'))

counts

##   transplant survived freq
## 1    control    alive    4
## 2    control     dead   30
## 3  treatment    alive   24
## 4  treatment     dead   45

In the control group, 4 out of 34 patients survived (12%), while in the treatment group 24 out of 69 survived (35%).
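
The percentages can be reproduced from the frequency table. This is a minimal sketch that retypes the printed counts by hand so it runs without `heartTr.RData`; the names `counts`, `tab`, and `surv_prop` are my own.

```r
# Frequencies copied from the count table above
counts <- data.frame(
  transplant = c("control", "control", "treatment", "treatment"),
  survived   = c("alive", "dead", "alive", "dead"),
  freq       = c(4, 30, 24, 45)
)

# Two-way table: rows are groups, columns are outcomes
tab <- xtabs(freq ~ transplant + survived, data = counts)

# Row proportions: share of alive and dead patients within each group
surv_prop <- prop.table(tab, margin = 1)
round(surv_prop, 2)
```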

1. H0 - heart transplant and survival are independent events vs H1 - the events are dependent; we compare heart patients with and without a transplant to see whether their survival differs
2. 28, 75, 69, 34, 0, -0.23 or lower (0.65 - 0.88 = -0.23)
3. the transplant appears to be effective
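
The randomization test can be sketched as a simulation. This is only an illustration under my own assumptions: the seed, the number of repetitions (1000), and all variable names are arbitrary choices, and the outcome counts are retyped from the tables above.

```r
set.seed(2018)  # arbitrary seed, only for reproducibility

# One "card" per patient: 28 alive and 75 dead
outcome <- c(rep("alive", 28), rep("dead", 75))
n_treatment <- 69  # the remaining 34 cards act as the control group

# Observed difference in death proportions (treatment - control)
obs_diff <- 45 / 69 - 30 / 34

# Shuffle the cards, deal the first 69 to "treatment", recompute the difference
sim_diff <- replicate(1000, {
  shuffled <- sample(outcome)
  mean(shuffled[1:n_treatment] == "dead") -
    mean(shuffled[-(1:n_treatment)] == "dead")
})

# Fraction of shuffles with a difference at least as extreme as observed
# (the small tolerance guards against floating-point ties)
p_value <- mean(sim_diff <= obs_diff + 1e-9)
p_value
```

If the transplant had no effect, the simulated differences would be centered near 0, so a small `p_value` is evidence against independence.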

Mikhail Groysman

September 15, 2018
