Make sure to include the unit of the values whenever appropriate.

Q1 Build a regression model to predict life expectancy using GDP per capita.

Hint: The variables are available in the gapminder data set from the gapminder package. Note that the data set and package both have the same name, gapminder.

library(tidyverse)
options(scipen=999)

data(gapminder, package="gapminder")
houses_lm <- lm(lifeExp ~ gdpPercap, 
                data = gapminder)

# View summary of model 1
summary(houses_lm)
## 
## Call:
## lm(formula = lifeExp ~ gdpPercap, data = gapminder)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -82.754  -7.758   2.176   8.225  18.426 
## 
## Coefficients:
##                Estimate  Std. Error t value            Pr(>|t|)    
## (Intercept) 53.95556088  0.31499494  171.29 <0.0000000000000002 ***
## gdpPercap    0.00076488  0.00002579   29.66 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.49 on 1702 degrees of freedom
## Multiple R-squared:  0.3407, Adjusted R-squared:  0.3403 
## F-statistic: 879.6 on 1 and 1702 DF,  p-value: < 0.00000000000000022

Q2 Is the coefficient of gdpPercap statistically significant at 5%?

Hint: Your answer must include a discussion on the p-value.

The coefficient is statistically significant at the 5% level: its p-value (< 0.0000000000000002) is far below 0.05.
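As a rough sketch, the p-value could also be pulled out of the fitted model rather than read off the printout. This assumes the gapminder package is installed; the names `model1`, `coef_table` and `p_gdp` are illustrative and not part of the assignment.

```r
# Refit the Q1 model (model1 is a hypothetical name used only in this sketch).
data(gapminder, package = "gapminder")
model1 <- lm(lifeExp ~ gdpPercap, data = gapminder)

# The coefficient table is a matrix; pick the p-value column for gdpPercap.
coef_table <- summary(model1)$coefficients
p_gdp <- coef_table["gdpPercap", "Pr(>|t|)"]

# TRUE means the coefficient is significant at the 5% level.
p_gdp < 0.05
```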

Q3 Interpret the coefficient of gdpPercap.

Hint: Discuss both its sign and magnitude.

The coefficient of gdpPercap is 0.00076488. It is positive: for each $1 increase in GDP per capita, predicted life expectancy increases by 0.00076488 years, or about 0.76 years per $1,000.
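Because the per-dollar effect is tiny, it can help to rescale it. The following sketch multiplies the slope printed in the Q1 summary by a few hypothetical increases in GDP per capita; the variable names and the chosen increases are illustrative assumptions.

```r
# Slope as printed in the Q1 summary (years of life expectancy per $1).
slope_gdp <- 0.00076488

# Hypothetical increases in GDP per capita, in dollars.
increase <- c(1, 1000, 10000)

# Predicted change in life expectancy (years) for each increase.
change_years <- slope_gdp * increase
change_years
```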

Q4 Interpret the Intercept.

Hint: Provide a technical interpretation.

The intercept is 53.956 years: the model predicts a life expectancy at birth of about 53.96 years for a country with a GDP per capita of $0.
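The intercept is simply the model's prediction at gdpPercap = 0. Whether $0 lies inside the observed data can be checked directly; if the smallest observed value is well above $0, the intercept is an extrapolation. A minimal sketch, assuming the gapminder package is installed (`model1` and `pred_zero` are illustrative names):

```r
data(gapminder, package = "gapminder")
model1 <- lm(lifeExp ~ gdpPercap, data = gapminder)

# Smallest and largest GDP per capita actually observed.
range(gapminder$gdpPercap)

# The prediction at gdpPercap = 0 reproduces the intercept.
pred_zero <- predict(model1, newdata = data.frame(gdpPercap = 0))
pred_zero
```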

Q5 Build another model that predicts life expectancy using gdpPercap, but also controls for another important variable, year.

Hint: This is a model with two explanatory variables.

data(gapminder, package="gapminder")
houses_lm <- lm(lifeExp ~ year, gdpPercap,
                data = gapminder)

# View summary of model 2
summary(houses_lm)
## 
## Call:
## lm(formula = lifeExp ~ year, data = gapminder, subset = gdpPercap)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.221  -9.436   1.517  11.201  21.581 
## 
## Coefficients:
##               Estimate Std. Error t value            Pr(>|t|)    
## (Intercept) -573.69800   56.15343  -10.22 <0.0000000000000002 ***
## year           0.31998    0.02837   11.28 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.86 on 580 degrees of freedom
##   (1122 observations deleted due to missingness)
## Multiple R-squared:  0.1799, Adjusted R-squared:  0.1784 
## F-statistic: 127.2 on 1 and 580 DF,  p-value: < 0.00000000000000022
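
The Call line in the output above shows `subset = gdpPercap`. Because `gdpPercap` followed a comma rather than a `+`, it was matched to the `subset` argument of `lm()`, so the fitted model contains only `year`. A minimal sketch of the two-predictor specification the question describes, assuming the gapminder package is installed (`model2` is an illustrative name):

```r
data(gapminder, package = "gapminder")

# Both predictors go on the right-hand side of the formula, joined by `+`.
model2 <- lm(lifeExp ~ gdpPercap + year, data = gapminder)

# The coefficient table should now list (Intercept), gdpPercap and year.
summary(model2)
```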

Q6 Which of the two models is better?

Hint: Discuss in terms of both residual standard error and reported adjusted R squared.

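Based on the two summaries above, the first model fits better: its residual standard error is lower (10.49 vs. 11.86 years) and its adjusted R-squared is higher (0.3403 vs. 0.1784). The second fit used only 582 observations (1122 were dropped), so the two models are not compared on the same data.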
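
A lower residual standard error and a higher adjusted R-squared both point to a better fit, provided the two models are fitted to the same rows. The sketch below gathers both statistics side by side. It assumes the gapminder package is installed and that the second model is specified as `lifeExp ~ gdpPercap + year`; the names `model1`, `model2` and `comparison` are illustrative.

```r
data(gapminder, package = "gapminder")
model1 <- lm(lifeExp ~ gdpPercap, data = gapminder)
model2 <- lm(lifeExp ~ gdpPercap + year, data = gapminder)

# Collect the two fit statistics the hint mentions.
comparison <- data.frame(
  model         = c("gdpPercap", "gdpPercap + year"),
  resid_std_err = c(sigma(model1), sigma(model2)),
  adj_r_squared = c(summary(model1)$adj.r.squared,
                    summary(model2)$adj.r.squared)
)
comparison
```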

Q7 Interpret the coefficient of year.

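Hint: Discuss both its sign and magnitude.

The coefficient of year is 0.31998. It is positive: each additional calendar year is associated with an increase of about 0.32 years in predicted life expectancy.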

Q7.a Based on the second model, what is the predicted life expectancy for a country with gdpPercap of $40,000 in 1997?

Hint: We had this discussion in class while watching the video at DataCamp, Correlation and Regression in R. The video is titled “Interpretation of Regression” in Chapter 4: Interpreting Regression Models.

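The coefficient of year is positive. Plugging 1997 into the coefficients printed for the second model gives -573.698 + 0.31998 × 1997 ≈ 65.3 years. The printed model contains no gdpPercap term, so the $40,000 figure does not change this prediction.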
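
A sketch of how such a prediction could be computed with `predict()` for a model that includes both gdpPercap and year. The formula `lifeExp ~ gdpPercap + year` and the names `model2`, `new_country` and `pred_1997` are assumptions for illustration, not taken from the output above; the gapminder package is assumed to be installed.

```r
data(gapminder, package = "gapminder")
model2 <- lm(lifeExp ~ gdpPercap + year, data = gapminder)

# One hypothetical country-year: $40,000 GDP per capita in 1997.
new_country <- data.frame(gdpPercap = 40000, year = 1997)

# Predicted life expectancy in years.
pred_1997 <- predict(model2, newdata = new_country)
pred_1997
```

The column names in `new_country` must match the predictor names used in the formula, otherwise `predict()` cannot find them.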

Q8 Hide the messages, but display the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
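
One way these options could be set for every chunk at once, sketched under the assumption that the page is knitted with knitr. The same three options can instead be written in an individual chunk header.

```r
# Placed in a setup chunk near the top of the .Rmd file.
knitr::opts_chunk$set(
  message = FALSE,     # hide package start-up and other messages
  echo    = TRUE,      # show the source code
  results = "markup"   # show printed results
)
```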

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.