Practice: 1.7 (available in R using the data(iris) command), 1.9, 1.23, 1.33, 1.55, 1.69

Graded: 1.8, 1.10, 1.28, 1.36, 1.48, 1.50, 1.56, 1.70 (use the library(openintro); data(heartTr) to load the data)

#1.7 Fisher's irises:
#a) 150 cases (flowers).
#b) Four continuous numerical variables: sepal length, sepal width, petal length and petal width.
#c) One nominal categorical variable, species, with three levels: setosa, versicolor and virginica.
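#A quick way to check the 1.7 answers in R, as a minimal sketch using the built-in iris data set (the column name Species comes from that data set, not from the exercise text):

```r
data(iris)

nrow(iris)               # number of cases (rows)
sapply(iris, class)      # four numeric columns and one factor
levels(iris$Species)     # the three species levels
is.ordered(iris$Species) # FALSE: the factor carries no natural ordering
```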

#1.8 Smoking habits of UK residents:
#a) Each row represents one case: a UK resident who took part in the survey.
#b) 1691
#c) age: Discrete Numerical
#   grossIncome: Categorical Ordinal (reported as income ranges)
#   amtWeekend: Discrete Numerical
#   amtWeekdays: Discrete Numerical
#   Sex: Categorical Nominal
#   Marital: Categorical Nominal
#   smoke: Categorical Nominal
#1.9 Air pollution and birth outcomes, scope of inference:
#a) Population of interest: births. Sample: 143,196 births during the period 1989 to 1993.
#b) This is an observational study, so no causal relationship can be determined.

#1.10 Cheaters, scope of inference:
#a) Population of interest: children between the ages of 5 and 15.
#   Sample: 160 children between the ages of 5 and 15.
#b) No. The results cannot be generalized, and no causal relationship can be established, as this is an observational study.

#1.23 Haters are gonna hate, study confirms
#a) Cases: 200 randomly selected men and women.
#b) Response variable: reaction towards the imaginary oven.
#c) Explanatory variable: attitude.
#d) Yes, the sample was randomly selected.
#e) This is an observational study, as it only observes people's behavior based on some criteria and assigns no treatment.
#f) No causal relationship can be inferred, as this is an observational study.
#g) Yes, as the sample was taken at random.

#1.28 Reading the paper.
#a) Looking at the article, I think there is a clear relationship between smoking and dementia: the more people smoke, the higher their risk of dementia, Alzheimer's disease or vascular dementia. However, this is an observational study, which puts a question mark on any causal relationship between smoking and dementia. Still, there is some association between the two.
#b) This is also an observational study, hence no causal relationship between the variables (bullying and sleep disorders) can be concluded. Looking at the study, however, one can infer that there seems to be an association between bullying and sleep disorders.

#1.33 Light, noise, and exam performance
#a) Experimental Study
#b) Noise (no noise, construction noise, and human chatter noise), Light (fluorescent overhead lighting, yellow overhead lighting, no overhead lighting (only desk lamps))
#c) The researchers expected noise and light levels to affect males and females differently, so they ensured equal representation of both sexes (sex acts as a blocking variable).

#1.36 Exercise and mental health.

#a) Experimental Study
#b) Treatment group: the subjects who exercise twice a week.
#   Control group: the subjects who carry on as they do now.
#c) Exercise
#d) No
#e) Yes, the study can establish a causal relationship between exercise and mental health. Yes, the results can be generalized, as random samples were taken from each stratum.
#f) Maybe the study could also be blocked by sex, and the subjects divided into a) exercise 5 times a week, b) exercise 2-3 times a week and c) no exercise, to make the outcomes clearer and more generalizable.
statScores  <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94)
statScores
##  [1] 57 66 69 71 72 73 74 77 78 78 79 79 81 81 82 83 83 88 89 94
boxplot(statScores)

summary(statScores)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   57.00   72.75   78.50   77.70   82.25   94.00
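#The box plot marks points beyond 1.5 x IQR from the quartiles as potential outliers. A minimal sketch of that check for the scores above (the names q, iqr, lowerFence and upperFence are introduced here for illustration only):

```r
statScores <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78,
                79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

# Quartiles as reported by summary() (R's default quantile type 7).
q <- quantile(statScores, c(0.25, 0.75))
iqr <- IQR(statScores)

# 1.5 * IQR fences; scores outside them are flagged as potential outliers.
lowerFence <- q[[1]] - 1.5 * iqr
upperFence <- q[[2]] + 1.5 * iqr
statScores[statScores < lowerFence | statScores > upperFence]

# boxplot() uses the hinges from fivenum(), which can differ slightly from
# the quantile() quartiles; boxplot.stats() reports the points it flags.
boxplot.stats(statScores)$out
```

#Both checks flag only the lowest score, 57.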
#install.packages("devtools") 
#library(devtools) 
#install_github("OpenIntroOrg/openintro-r-package", subdir = "openintro") 


library(openintro) 
## Please visit openintro.org for free statistics materials
## 
## Attaching package: 'openintro'
## The following objects are masked from 'package:datasets':
## 
##     cars, chickwts, trees
data(heartTr)

NROW(heartTr)
## [1] 103
#patients in the control group who died
controlDied <- subset(heartTr, transplant == "control" & survived == "dead")
dim(controlDied)
## [1] 30  8
NROW(controlDied)
## [1] 30
#patients in the treatment group who died
treatmentDied <- subset(heartTr, transplant == "treatment" & survived == "dead")
dim(treatmentDied)
## [1] 45  8
nrow(treatmentDied)
## [1] 45
#proportion of all patients who were in the control group and died
nrow(controlDied) / nrow(heartTr)
## [1] 0.2912621
#proportion of all patients who were in the treatment group and died
nrow(treatmentDied) / nrow(heartTr)
## [1] 0.4368932
#all patients in the treatment group
treatmentgrp <- subset(heartTr, transplant == "treatment")
nrow(treatmentgrp)
## [1] 69
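#The two proportions computed earlier divide by all 103 patients. To compare the groups, the share who died within each group may be more useful. A minimal sketch built from the counts printed above, so it runs without the openintro package; the control-group size of 34 is an assumption derived as 103 - 69, which holds only if transplant has just the two levels control and treatment:

```r
# Counts taken from the output above: 30 control deaths, 45 treatment deaths,
# 69 treatment patients, and an assumed 103 - 69 = 34 control patients.
counts <- matrix(c(30, 34 - 30,
                   45, 69 - 45),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(transplant = c("control", "treatment"),
                                 survived = c("dead", "alive")))

# Row-wise proportions: the share of each group that died or survived.
groupProps <- prop.table(counts, margin = 1)
round(groupProps, 3)
```

#Under these counts about 88% of the control group and 65% of the treatment group died.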