HW #1

Exercise 0.15

Hint: this is chapter 0. Don’t feel like you need to do anything too complicated for your “model.” Comparing plots or doing a simple linear regression is fine.

0.15 Statistics students survey. An instructor at a small liberal arts college distributed a data collection card similar to the one shown below on the first day of class. The data for two different sections of the course are shown in the file Day1Survey. Note that the names have not been entered into the dataset.

Data Collection Card

Directions: Please answer each question and return to me.

Your name (as you prefer): _______________
What is your current class standing? _______________
Sex: Male _______________ Female _______________
How many miles (approximately) did you travel to get to campus? _______________
Height (estimated) in inches: _______________
Handedness (Left, Right, Ambidextrous): _______________
How much money, in coins (not bills), do you have with you? $ _______________
Estimate the length of the white string (in inches): _______________
Estimate the length of the black string (in inches): _______________
How much do you expect to read this semester (in pages/week)? _______________
How many hours do you watch TV in a typical week? _______________
What is your resting pulse? _______________
How many text messages have you sent and received in the last 24 hours? _______________

The data for this survey are stored in Day1Survey.

Apply the four-step process to the survey data to address the question: “Is there evidence that the mean resting pulse rate for women is different from the mean resting pulse rate for men?”

Pick another question that interests you from the survey and compare the responses of men and women.

data(Day1Survey)
head(Day1Survey)

##   Section    Class Sex Distance Height Handedness Coins WhiteString
## 1       1   Senior   F      400     62      Right  1.12          42
## 2       1        *   F      450     61       Left 29.00          45
## 3       1 Freshman   F     3000     61      Right  1.50          22
## 4       1 Freshman   M      100     72      Right  0.07          40
## 5       1      N/A   F     2000     69      Right  0.12          48
## 6       1   Senior   M      500     73      Right  8.00          30
##   BlackString Reading TV Pulse Texting
## 1           6      80  3    71       3
## 2           5     100 10    78     100
## 3           4     100  4    80       2
## 4           4      50 25    63     200
## 5           7     200  5    63     100
## 6           8     100  0    56       1

attach(Day1Survey)

Part a

The four-step process:

1. Choose: “Is there evidence that the mean resting pulse rate for women is different from the mean resting pulse rate for men?”

2. Fit: $\hat{y}$ is the fitted (predicted) value, and $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$. For a 1-unit increase in the predictor $x$, we expect a change of $\hat{\beta}_1$ units in the response. Here $x$ is an indicator that equals 1 when Sex is "F" and 0 otherwise.

femalelm <- lm((Pulse~(Sex=="F")))  # Linear model of Pulse on an indicator for female
femalelm

## 
## Call:
## lm(formula = (Pulse ~ (Sex == "F")))
## 
## Coefficients:
##    (Intercept)  Sex == "F"TRUE  
##          66.65            1.17

mean(Pulse~(Sex=="F"))#True is the females' mean pulse and False is the males' mean pulse

##    FALSE     TRUE 
## 66.65385 67.82353

malelm<-lm((Pulse~(Sex=="M"))) # Same model with an indicator for male, to check that it agrees with the female-indicator model.
malelm

## 
## Call:
## lm(formula = (Pulse ~ (Sex == "M")))
## 
## Coefficients:
##    (Intercept)  Sex == "M"TRUE  
##          67.82           -1.17

In the female-indicator model, the intercept $\hat{\beta}_0$ is 66.65 beats per minute and the slope $\hat{\beta}_1$ is 1.17. The intercept is the fitted resting pulse when the indicator is FALSE, that is, for males. If the person is female, the fitted resting pulse is 1.17 beats per minute higher: 66.65 + 1.17 = 67.82 beats per minute.

This makes sense, since the mean pulse for females is 67.82 beats per minute and the mean pulse for males is 66.65 beats per minute.
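To see why the coefficients line up with the group means, here is a minimal sketch that uses only the six rows printed by head(Day1Survey) above. It is not the full data set, so the numbers differ from the fitted models. The names `first6` and `fit6` are made up for this illustration.

```r
# Sex and Pulse for the six rows printed by head(Day1Survey) above.
# `first6` is a hypothetical name used only in this sketch.
first6 <- data.frame(
  Sex   = c("F", "F", "F", "M", "F", "M"),
  Pulse = c(71, 78, 80, 63, 63, 56)
)

# Same indicator coding as femalelm: the predictor is TRUE for females.
fit6 <- lm(Pulse ~ (Sex == "F"), data = first6)

# Group means computed directly, for comparison.
group_means <- tapply(first6$Pulse, first6$Sex, mean)

# Intercept = mean of the FALSE group (males);
# intercept + slope = mean of the TRUE group (females).
b <- unname(coef(fit6))
print(c(male = b[1], female = b[1] + b[2]))
print(group_means)
```

With a single two-level indicator as the predictor, least squares always reproduces the two group means in this way.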

library(skimr)

## 
## Attaching package: 'skimr'

## The following object is masked from 'package:mosaic':
## 
##     n_missing

Day1Survey %>%
  group_by(Sex)%>%
  skim(Pulse)

## Skim summary statistics
##  n obs: 43 
##  n variables: 13 
##  group variables: Sex 
## 
## ── Variable type:integer ──────────────────────────────────────────────────────────────────────────────────────
##  Sex variable missing complete  n  mean    sd p0 p25 p50 p75 p100     hist
##    F    Pulse       0       17 17 67.82 11.38 51  60  72  75   90 ▃▃▁▁▇▂▁▁
##    M    Pulse       0       26 26 66.65 11.27 48  57  66  72   96 ▅▇▇▇▂▃▁▁

ggplot(data=Day1Survey) + geom_density(aes(x=Pulse, color=Sex))

These density curves show the resting pulse of the females and of the males. The plot is consistent with what we found above: females have a slightly higher mean pulse than males.

3. Assess: The residuals should roughly follow a normal distribution with mean zero. As a basic check, the residuals from each model should average to 0.

 mean(residuals(femalelm)) #checks to see if the residuals are close to 0

## [1] 1.136042e-16

mean(residuals(malelm)) #checks to see if the residuals are close to 0

## [1] 3.083168e-17

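Both are 0 up to floating-point rounding error. Note that a least-squares fit with an intercept always has residuals that average to zero, so this confirms the fit but does not by itself check normality.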
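As a sketch of what this check can and cannot show, the example below fits the same kind of two-group model to made-up, deliberately skewed values (they are not from Day1Survey). The names `toy`, `toy_fit`, and `toy_res` are hypothetical.

```r
# Made-up, deliberately skewed values (NOT from Day1Survey).
toy <- data.frame(
  group = rep(c("A", "B"), each = 4),
  y     = c(1, 1, 1, 50, 2, 2, 2, 100)
)

toy_fit <- lm(y ~ group, data = toy)
toy_res <- residuals(toy_fit)

# The mean is zero up to rounding, even though these residuals
# are clearly not normally distributed.
print(mean(toy_res))

# Looking at the residuals themselves says more about their shape.
print(round(unname(toy_res), 2))

# A normal Q-Q plot is one common visual check (uncomment to draw it):
# qqnorm(toy_res); qqline(toy_res)
```

For the survey models, the same Q-Q plot could be drawn from residuals(femalelm) to look at the shape of the residuals rather than only their mean.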

4. Use: We use t.test to test whether the difference between the mean pulse of the two sexes is 0, meaning that the mean resting pulse is the same for men and women. The t.test below gives a p-value of 0.7428, so we do not have enough evidence to reject the null hypothesis. In other words, with the amount of data we have, we cannot say that the mean resting pulse is different for the two sexes.

Does resting pulse differ by gender?

We do not have evidence that it does.

t.test(Pulse~Sex)

## 
##  Welch Two Sample t-test
## 
## data:  Pulse by Sex
## t = 0.33077, df = 34.12, p-value = 0.7428
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.015993  8.355360
## sample estimates:
## mean in group F mean in group M 
##        67.82353        66.65385
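The Welch statistic above can be reproduced by hand from the summary statistics already reported. The sketch below takes the group sizes and standard deviations from the skim output and the means from the t.test output. Because the standard deviations are rounded to two decimals, the results should agree with the output only approximately. All variable names are made up for this illustration.

```r
# Summary statistics reported above (n and sd from skim, means from t.test).
n_f <- 17; mean_f <- 67.82353; sd_f <- 11.38
n_m <- 26; mean_m <- 66.65385; sd_m <- 11.27

# Squared standard error contributed by each group.
v_f <- sd_f^2 / n_f
v_m <- sd_m^2 / n_m

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t_stat <- (mean_f - mean_m) / sqrt(v_f + v_m)
df     <- (v_f + v_m)^2 / (v_f^2 / (n_f - 1) + v_m^2 / (n_m - 1))

# Two-sided p-value from the t distribution.
p_val <- 2 * pt(-abs(t_stat), df)

print(round(c(t = t_stat, df = df, p = p_val), 3))
```

The difference in means (about 1.17 beats per minute) is small compared with its standard error (about 3.5), which is why the p-value is large.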

Part b

Think of another question and answer it here!

1. Choose: The survey asks, “How many hours do you watch TV in a typical week?” Our question is: “Is there evidence that the mean amount of TV time for women is different from the mean amount of TV time for men?”

2. Fit: We fit the relationship (TV~Sex) to see if there is a difference between the two sexes in the amount of TV they watch.

tvfe<-lm(TV~(Sex=="F"))
tvfe

## 
## Call:
## lm(formula = TV ~ (Sex == "F"))
## 
## Coefficients:
##    (Intercept)  Sex == "F"TRUE  
##          5.558          -1.881

ggplot(data=Day1Survey) + geom_density(aes(x=TV, color=Sex))

This means that females watch an average of 1.881 fewer hours of TV per week than males, which is also shown by the graph.

3. Assess: The residuals from each model should average to 0, which is checked below.

 mean(residuals(tvfe)) #checks to see if the residuals are close to 0

## [1] 7.47217e-17

tvme<-lm(TV~(Sex=="M"))
mean(residuals(tvme)) #checks to see if the residuals are close to 0

## [1] -2.631056e-16

They are close to 0.

We then use t.test to see whether we can reject the null hypothesis that men and women watch the same mean number of hours of TV per week.

t.test(TV~Sex)

## 
##  Welch Two Sample t-test
## 
## data:  TV by Sex
## t = -1.3256, df = 35.307, p-value = 0.1935
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -4.7613613  0.9989179
## sample estimates:
## mean in group F mean in group M 
##        3.676471        5.557692
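The lm fit and the t.test address the same comparison of two group means. As a minimal sketch, again using only the six rows printed by head(Day1Survey) (so the numbers are not those of the full data set), the slope test from lm matches the pooled-variance t-test (var.equal = TRUE). The default Welch test does not assume equal variances, so its p-value can differ. The names `first6_tv` and `fit_tv` are hypothetical.

```r
# Sex and TV for the six rows printed by head(Day1Survey) above.
# `first6_tv` is a hypothetical name used only in this sketch.
first6_tv <- data.frame(
  Sex = c("F", "F", "F", "M", "F", "M"),
  TV  = c(3, 10, 4, 25, 5, 0)
)

# Same indicator coding as tvfe.
fit_tv <- lm(TV ~ (Sex == "F"), data = first6_tv)

# p-value for the slope in the linear model.
p_lm <- summary(fit_tv)$coefficients[2, "Pr(>|t|)"]

# Pooled-variance (equal variance) two-sample t-test.
p_pooled <- t.test(TV ~ Sex, data = first6_tv, var.equal = TRUE)$p.value

# Welch test, the default used by t.test.
p_welch <- t.test(TV ~ Sex, data = first6_tv)$p.value

print(c(lm = p_lm, pooled = p_pooled, welch = p_welch))
```

With only six rows this is purely an illustration of the equivalence, not an analysis of the survey.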

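4. Use: As seen in the t.test output, the p-value is 0.1935. This is above the usual significance level of α = 0.05 (and above any significance level below 0.1935), so we fail to reject the null hypothesis for lack of evidence. The 95 percent confidence interval for the difference in means also contains 0.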

This means that we do not have evidence that males and females watch different mean amounts of TV in a typical week.
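The decision rule can be written out explicitly. The sketch below copies the p-value and confidence interval from the t.test output above and assumes the conventional significance level of 0.05, which the exercise itself does not specify. The variable names are made up for this illustration.

```r
# Values copied from the t.test(TV~Sex) output above.
p_value <- 0.1935
ci      <- c(-4.7613613, 0.9989179)

# Assumed significance level (a common convention, not stated in the exercise).
alpha <- 0.05

# Reject the null hypothesis only when the p-value is below alpha.
reject_null <- p_value < alpha

# A 95 percent interval that contains 0 leads to the same decision at alpha = 0.05.
ci_contains_zero <- ci[1] < 0 && 0 < ci[2]

print(c(reject_null = reject_null, ci_contains_zero = ci_contains_zero))
```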


Rebecca Lewis

Due Monday, September 17, 2018 by 11:55 pm
