  1. What does each row of the data matrix represent?
  2. How many participants were included in the survey?
  3. Indicate whether each variable in the study is numerical or categorical. If numerical, identify as continuous or discrete. If categorical, indicate if the variable is ordinal.
library(RCurl)
df <- read.csv("https://raw.githubusercontent.com/jbryer/DATA606Fall2020/master/course_data/os3_data/Ch%201%20Exercise%20Data/smoking.csv")
head(df)
##   gender age maritalStatus highestQualification nationality ethnicity
## 1   Male  38      Divorced     No Qualification     British     White
## 2 Female  42        Single     No Qualification     British     White
## 3   Male  40       Married               Degree     English     White
## 4 Female  40       Married               Degree     English     White
## 5 Female  39       Married         GCSE/O Level     British     White
## 6 Female  37       Married         GCSE/O Level     British     White
##        grossIncome    region smoke amtWeekends amtWeekdays    type
## 1   2,600 to 5,200 The North    No          NA          NA        
## 2      Under 2,600 The North   Yes          12          12 Packets
## 3 28,600 to 36,400 The North    No          NA          NA        
## 4 10,400 to 15,600 The North    No          NA          NA        
## 5   2,600 to 5,200 The North    No          NA          NA        
## 6 15,600 to 20,800 The North    No          NA          NA

1.10

(a) What does each row of the data matrix represent? Each row represents one respondent’s answers to the survey.

(b) How many participants were included in the survey? 1691, the number of rows in the data set.
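The participant count in (b) can be read directly from the number of rows. The sketch below is a minimal illustration, assuming a small stand-in data frame (the hypothetical `sample_rows`, built from a few columns of the six rows printed by `head(df)`), so that it runs without downloading the file.

```r
# Stand-in for df: the six rows shown by head(df), three columns only.
# `sample_rows` is a hypothetical name used for illustration.
sample_rows <- data.frame(
  gender = c("Male", "Female", "Male", "Female", "Female", "Female"),
  age    = c(38, 42, 40, 40, 39, 37),
  smoke  = c("No", "Yes", "No", "No", "No", "No"),
  stringsAsFactors = FALSE
)

# Each row is one respondent, so the row count is the participant count.
n_participants <- nrow(sample_rows)
n_participants

# dim() reports rows and columns together.
dim(sample_rows)
```

Applied to the full data set, `nrow(df)` should agree with the `Length:1691` reported by `summary(df)`.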

summary(df)
##     gender               age        maritalStatus      highestQualification
##  Length:1691        Min.   :16.00   Length:1691        Length:1691         
##  Class :character   1st Qu.:34.00   Class :character   Class :character    
##  Mode  :character   Median :48.00   Mode  :character   Mode  :character    
##                     Mean   :49.84                                          
##                     3rd Qu.:65.50                                          
##                     Max.   :97.00                                          
##                                                                            
##  nationality         ethnicity         grossIncome           region         
##  Length:1691        Length:1691        Length:1691        Length:1691       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     smoke            amtWeekends     amtWeekdays        type          
##  Length:1691        Min.   : 0.00   Min.   : 0.00   Length:1691       
##  Class :character   1st Qu.:10.00   1st Qu.: 7.00   Class :character  
##  Mode  :character   Median :15.00   Median :12.00   Mode  :character  
##                     Mean   :16.41   Mean   :13.75                     
##                     3rd Qu.:20.00   3rd Qu.:20.00                     
##                     Max.   :60.00   Max.   :55.00                     
##                     NA's   :1270    NA's   :1270

(c) Indicate whether each variable in the study is numerical or categorical. If numerical, identify as continuous or discrete. If categorical, indicate if the variable is ordinal.

  gender - categorical (not ordinal)
  age - numerical (discrete as recorded, in whole years)
  maritalStatus - categorical (not ordinal)
  grossIncome - categorical (ordinal)
  smoke - categorical (not ordinal)
  amtWeekends - numerical (discrete)
  amtWeekdays - numerical (discrete)
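The variable types listed above can be cross-checked against how R stores each column, and an ordinal variable such as grossIncome can be encoded as an ordered factor. This is a minimal sketch under stated assumptions: `sample_rows` is a hypothetical stand-in built from the six rows printed by `head(df)`, and `income_levels` contains only the income bands visible in those rows, so it is incomplete for the full data set (unlisted bands would become `NA`).

```r
# Stand-in for df: values copied from the head(df) output.
sample_rows <- data.frame(
  gender      = c("Male", "Female", "Male", "Female", "Female", "Female"),
  age         = c(38, 42, 40, 40, 39, 37),
  grossIncome = c("2,600 to 5,200", "Under 2,600", "28,600 to 36,400",
                  "10,400 to 15,600", "2,600 to 5,200", "15,600 to 20,800"),
  amtWeekends = c(NA, 12, NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Storage class of each column: character columns are categorical,
# numeric columns are numerical.
var_classes <- sapply(sample_rows, class)
var_classes

# Assumption: only the bands seen in head(df), listed lowest to highest.
income_levels <- c("Under 2,600", "2,600 to 5,200", "10,400 to 15,600",
                   "15,600 to 20,800", "28,600 to 36,400")

# An ordered factor records that the categories have a natural ranking.
sample_rows$grossIncome <- factor(sample_rows$grossIncome,
                                  levels = income_levels, ordered = TRUE)

# Ordered factors support comparisons between categories.
sample_rows$grossIncome[2] < sample_rows$grossIncome[1]
```

The storage class alone does not separate discrete from continuous, or ordinal from nominal; those distinctions still come from knowing what each variable measures.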

1.34

  1. What type of study is this? An experiment.
  2. What are the treatment and control groups in this study? The treatment is exercise: the treatment group exercises and the control group does not.
  3. Does this study make use of blocking? If so, what is the blocking variable? Yes. The blocking variable is age.
  4. Does this study make use of blinding? No. The participants know which group they belong to, and so does the researcher.
  5. Comment on whether or not the results of the study can be used to establish a causal relationship between exercise and mental health, and indicate whether or not the conclusions can be generalized to the population at large.

The study can be used to establish a causal relationship between exercise and mental health, because it is an experiment with a treatment group and a control group. The conclusions can be generalized to the population at large because the participants were picked randomly.

  6. Suppose you are given the task of determining if this proposed study should get funding. Would you have any reservations about the study proposal? I would fund the study, because it would provide meaningful observations, applicable to the larger population, about exercising versus not exercising in relation to age.
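
The blocked design described in the answers above (age as the blocking variable, exercise as the treatment) can be sketched as a random assignment carried out separately within each age block. Everything here is illustrative: the `subjects` data frame, the age bands, the group labels, the seed, and the helper `assign_within_block` are assumptions made for the example, not details taken from the study.

```r
set.seed(606)  # arbitrary seed so the sketch is reproducible

# Hypothetical subjects: four per age block (made-up IDs, not study data).
subjects <- data.frame(
  id        = 1:12,
  age_group = rep(c("18-30", "31-40", "41-55"), each = 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: a shuffled, evenly split set of group labels
# for a block of n subjects.
assign_within_block <- function(n) {
  sample(rep(c("exercise", "no exercise"), length.out = n))
}

# Assign treatment and control separately inside each age block.
subjects$group <- NA_character_
for (block in unique(subjects$age_group)) {
  in_block <- subjects$age_group == block
  subjects$group[in_block] <- assign_within_block(sum(in_block))
}

# Every block should contain two subjects in each group.
table(subjects$age_group, subjects$group)
```

Randomizing within blocks keeps the age distribution balanced between the exercise and no-exercise groups, so age cannot be confused with the treatment effect.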