library(tidyverse)
library(openintro)
download.file("http://www.openintro.org/stat/data/nc.RData", destfile = "nc.RData")
load("nc.RData")

Exercise 1

Each case in our dataset is a single birth, stored as one row of nc. The categorical variables recorded for each birth include: 1) Is the baby premature or full term? 2) Is the mother married or not married? 3) Is the baby low birthweight or not low birthweight? 4) Is the baby male or female? 5) Is the mother a smoker or nonsmoker? 6) Is the mother white or not white?

There are 1000 cases in our sample: the summary below accounts for 1000 observations (including missing values) on each of the 13 variables.

The numerical variables clearly contain outliers. For example, the boxplot of visits shows points beyond the whiskers.

summary(nc)
##       fage            mage            mature        weeks             premie   
##  Min.   :14.00   Min.   :13   mature mom :133   Min.   :20.00   full term:846  
##  1st Qu.:25.00   1st Qu.:22   younger mom:867   1st Qu.:37.00   premie   :152  
##  Median :30.00   Median :27                     Median :39.00   NA's     :  2  
##  Mean   :30.26   Mean   :27                     Mean   :38.33                  
##  3rd Qu.:35.00   3rd Qu.:32                     3rd Qu.:40.00                  
##  Max.   :55.00   Max.   :50                     Max.   :45.00                  
##  NA's   :171                                    NA's   :2                      
##      visits            marital        gained          weight      
##  Min.   : 0.0   married    :386   Min.   : 0.00   Min.   : 1.000  
##  1st Qu.:10.0   not married:613   1st Qu.:20.00   1st Qu.: 6.380  
##  Median :12.0   NA's       :  1   Median :30.00   Median : 7.310  
##  Mean   :12.1                     Mean   :30.33   Mean   : 7.101  
##  3rd Qu.:15.0                     3rd Qu.:38.00   3rd Qu.: 8.060  
##  Max.   :30.0                     Max.   :85.00   Max.   :11.750  
##  NA's   :9                        NA's   :27                      
##  lowbirthweight    gender          habit          whitemom  
##  low    :111    female:503   nonsmoker:873   not white:284  
##  not low:889    male  :497   smoker   :126   white    :714  
##                              NA's     :  1   NA's     :  2  
##                                                             
##                                                             
##                                                             
## 
boxplot(nc$visits)
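
The outlier claim can be checked by hand with the usual 1.5 × IQR rule, which is also what boxplot() uses by default to decide how far the whiskers extend. The sketch below is only an illustration: it hard-codes the quartiles, minimum, and maximum of visits from the summary(nc) output above rather than reading the data, and it treats the printed quartiles as the box edges (boxplot() uses hinges, which can differ slightly).

```r
# Values for `visits` copied from the summary(nc) output (not recomputed from the data).
q1 <- 10          # 1st quartile
q3 <- 15          # 3rd quartile
visits_min <- 0   # smallest observed value
visits_max <- 30  # largest observed value

iqr <- q3 - q1                   # interquartile range
lower_fence <- q1 - 1.5 * iqr    # observations below this are flagged as outliers
upper_fence <- q3 + 1.5 * iqr    # observations above this are flagged as outliers

# If the extremes fall outside the fences, at least one outlier exists on that side.
has_low_outliers  <- visits_min < lower_fence
has_high_outliers <- visits_max > upper_fence

cat("Fences:", lower_fence, "to", upper_fence, "\n")
cat("Outliers below:", has_low_outliers, " Outliers above:", has_high_outliers, "\n")
```

Under these assumptions both the minimum and the maximum of visits lie outside the fences, so the boxplot should show outliers on both ends.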

Exercise 2

The side-by-side boxplots show that the median birthweight for babies of smokers is slightly lower than for babies of nonsmokers, and that the upper range of birthweights extends higher for nonsmokers than for smokers. The group means point the same way: about 7.14 lbs for nonsmokers versus 6.83 lbs for smokers.

boxplot(nc$weight ~ nc$habit,
        col = 'steelblue',
        main = 'Smoking habit vs. Birthweight')

by(nc$weight, nc$habit, mean)
## nc$habit: nonsmoker
## [1] 7.144273
## ------------------------------------------------------------ 
## nc$habit: smoker
## [1] 6.82873

Exercise 3

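Because no information about the sampling method is given here, we will assume that the observations are independent, both within each group and between the two groups. Both groups are large (873 nonsmokers and 126 smokers, each well above 30), so the sampling distribution of the difference in means should be nearly normal even if the birthweights are skewed. The conditions for inference are therefore met.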

by(nc$weight, nc$habit, length)
## nc$habit: nonsmoker
## [1] 873
## ------------------------------------------------------------ 
## nc$habit: smoker
## [1] 126

Exercise 4

H_0: There is no difference in mean birthweight for smoker and non-smoker mothers. (mu’s are equal)

H_a: There is a difference in mean birthweight for smoker and non-smoker mothers. (mu’s are not equal)

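The inference function produces a test statistic of Z = 2.359 and a two-sided p-value of 0.0184. Because the p-value is below 0.05 (though above 0.01), we reject H_0 at the 5% significance level: there is moderate evidence that a mother’s smoking habit is associated with the birthweight of her baby. These are observational data, so the test does not by itself show that smoking causes the difference.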

inference(y = nc$weight, x = nc$habit, est = "mean", type = "ht", null = 0, 
          alternative = "twosided", method = "theoretical")
## Response variable: numerical, Explanatory variable: categorical
## Difference between two means
## Summary statistics:
## n_nonsmoker = 873, mean_nonsmoker = 7.1443, sd_nonsmoker = 1.5187
## n_smoker = 126, mean_smoker = 6.8287, sd_smoker = 1.3862
## Observed difference between means (nonsmoker-smoker) = 0.3155
## 
## H0: mu_nonsmoker - mu_smoker = 0 
## HA: mu_nonsmoker - mu_smoker != 0 
## Standard error = 0.134 
## Test statistic: Z =  2.359 
## p-value =  0.0184
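
The reported standard error, test statistic, and p-value can be reproduced from the summary statistics alone. The following is a minimal sketch, assuming the theoretical method is a two-sample Z test with an unpooled standard error (the output labels the statistic Z). The variable names are illustrative, and the inputs are the rounded values printed above, so the last digit may differ slightly from the output of inference().

```r
# Summary statistics copied from the inference() output above (rounded values).
n_non <- 873; mean_non <- 7.1443; sd_non <- 1.5187   # nonsmokers
n_smk <- 126; mean_smk <- 6.8287; sd_smk <- 1.3862   # smokers

# Observed difference in sample means (nonsmoker - smoker).
diff_means <- mean_non - mean_smk

# Unpooled standard error of the difference between two independent means.
se <- sqrt(sd_non^2 / n_non + sd_smk^2 / n_smk)

# Test statistic under H0: mu_nonsmoker - mu_smoker = 0.
z <- (diff_means - 0) / se

# Two-sided p-value from the standard normal distribution.
p_value <- 2 * pnorm(-abs(z))

cat("SE =", round(se, 4), " Z =", round(z, 3), " p-value =", round(p_value, 4), "\n")
```

The results should land close to the reported SE of 0.134, Z of 2.359, and p-value of 0.0184.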

Exercise 5

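We can be 95% confident that the true difference in mean birthweight, taken as smoker minus nonsmoker, is between -0.578 lbs and -0.053 lbs. In other words, babies born to smokers weigh on average between 0.053 and 0.578 lbs less than babies born to nonsmokers. The interval does not contain 0, which agrees with the two-sided test in Exercise 4.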

inference(y = nc$weight, x = nc$habit, est = "mean", type = "ci", null = 0, 
          alternative = "twosided", method = "theoretical", 
          order = c("smoker","nonsmoker"))
## Response variable: numerical, Explanatory variable: categorical
## Difference between two means
## Summary statistics:
## n_smoker = 126, mean_smoker = 6.8287, sd_smoker = 1.3862
## n_nonsmoker = 873, mean_nonsmoker = 7.1443, sd_nonsmoker = 1.5187

## Observed difference between means (smoker-nonsmoker) = -0.3155
## 
## Standard error = 0.1338 
## 95 % Confidence interval = ( -0.5777 , -0.0534 )
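
As a check on the interval, it can be rebuilt from the summary statistics as point estimate ± z* × SE. This is a minimal sketch, assuming the theoretical method uses the normal critical value for 95% confidence. The variable names are illustrative, and the inputs are the rounded values printed above, so the endpoints may differ from the output in the last digit.

```r
# Summary statistics copied from the inference() output above (rounded values).
n_smk <- 126; mean_smk <- 6.8287; sd_smk <- 1.3862   # smokers
n_non <- 873; mean_non <- 7.1443; sd_non <- 1.5187   # nonsmokers

# Point estimate in the same order as the output: smoker - nonsmoker.
diff_means <- mean_smk - mean_non

# Unpooled standard error of the difference between two independent means.
se <- sqrt(sd_smk^2 / n_smk + sd_non^2 / n_non)

# Critical value for a 95% confidence level (about 1.96).
z_star <- qnorm(0.975)

# Confidence interval: estimate minus and plus the margin of error.
ci <- diff_means + c(-1, 1) * z_star * se

cat("95% CI: (", round(ci[1], 4), ",", round(ci[2], 4), ")\n")
```

Both endpoints are negative because the difference is taken as smoker minus nonsmoker. Reversing the order flips the signs but not the conclusion.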