Make sure to include the unit of the values whenever appropriate.

Q1 Build a regression model to predict life expectancy using GDP per capita.

library(tidyverse)
options(scipen=999)

data(gapminder, package="gapminder")
gdp_lm <- lm(lifeExp ~ gdpPercap, 
                data = gapminder)

# View summary of model 1
summary(gdp_lm)
## 
## Call:
## lm(formula = lifeExp ~ gdpPercap, data = gapminder)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -82.754  -7.758   2.176   8.225  18.426 
## 
## Coefficients:
##                Estimate  Std. Error t value            Pr(>|t|)    
## (Intercept) 53.95556088  0.31499494  171.29 <0.0000000000000002 ***
## gdpPercap    0.00076488  0.00002579   29.66 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.49 on 1702 degrees of freedom
## Multiple R-squared:  0.3407, Adjusted R-squared:  0.3403 
## F-statistic: 879.6 on 1 and 1702 DF,  p-value: < 0.00000000000000022

Q2 Is the coefficient of gdpPercap statistically significant at 5%?

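Yes. The p-value for gdpPercap is reported as less than 0.0000000000000002, which is far below 0.05, so the coefficient is statistically significant at the 5% level.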
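
The same check can be done in code instead of reading it off the printed summary. This is a minimal sketch that assumes the gapminder package is installed; the model is refit so the snippet runs on its own, and the names coef_table and p_value are illustrative.

```r
# Refit the Q1 model so this snippet is self-contained
data(gapminder, package = "gapminder")
gdp_lm <- lm(lifeExp ~ gdpPercap, data = gapminder)

# The coefficient table has one row per term and the columns
# Estimate, Std. Error, t value and Pr(>|t|)
coef_table <- summary(gdp_lm)$coefficients
p_value <- coef_table["gdpPercap", "Pr(>|t|)"]

# TRUE means the coefficient is significant at the 5% level
p_value < 0.05

# A 95% confidence interval that excludes 0 leads to the same conclusion
confint(gdp_lm, "gdpPercap", level = 0.95)
```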

Q3 Interpret the coefficient of gdpPercap.

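The coefficient of 0.00076488 is the change in the response variable, life expectancy in years, associated with a one-unit ($1) increase in GDP per capita. On average, each additional $1 of GDP per capita is associated with about 0.00076 more years of life expectancy, or about 0.76 years per additional $1,000. The effect per dollar is very small, but it is positive.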
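
Because a one-dollar change is tiny, it can help to rescale the coefficient to a more meaningful change in GDP per capita. The sketch below uses the estimate exactly as printed in the summary; the variable name b_gdp is illustrative.

```r
# Slope for gdpPercap as printed in the model summary (years per $1)
b_gdp <- 0.00076488

# Expected change in life expectancy (years) per $1,000 of GDP per capita
b_gdp * 1000

# Expected change in life expectancy (years) per $10,000 of GDP per capita
b_gdp * 10000
```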

Q4 Interpret the Intercept.

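The intercept means that when GDP per capita is $0, the model predicts a life expectancy of about 54 years (53.96). No country in the data has a GDP per capita of $0, so this value is an extrapolation beyond the observed range.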
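
To judge how far a GDP per capita of $0 lies from the observed data, the range of the predictor can be inspected. This is a small sketch that assumes the gapminder package is installed; gdp_range is an illustrative name.

```r
data(gapminder, package = "gapminder")

# Smallest and largest observed GDP per capita in the data set
gdp_range <- range(gapminder$gdpPercap)
gdp_range

# TRUE if every observation has a GDP per capita above 0,
# in which case the intercept describes a point outside the data
gdp_range[1] > 0
```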

Q5 Build another model that predicts life expectancy using gdpPercap, but also controls for another important variable, year.

library(tidyverse)
options(scipen=999)

data(gapminder, package="gapminder")
gdp2_lm <- lm(lifeExp ~ gdpPercap + year, 
                data = gapminder)

# View summary of model 2
summary(gdp2_lm)
## 
## Call:
## lm(formula = lifeExp ~ gdpPercap + year, data = gapminder)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -67.262  -6.954   1.219   7.759  19.553 
## 
## Coefficients:
##                  Estimate    Std. Error t value            Pr(>|t|)    
## (Intercept) -418.42425945   27.61713769  -15.15 <0.0000000000000002 ***
## gdpPercap      0.00066973    0.00002447   27.37 <0.0000000000000002 ***
## year           0.23898275    0.01397107   17.11 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.694 on 1701 degrees of freedom
## Multiple R-squared:  0.4375, Adjusted R-squared:  0.4368 
## F-statistic: 661.4 on 2 and 1701 DF,  p-value: < 0.00000000000000022

Q6 Which of the two models is better?

The second model is better because its residual standard error is lower (9.694 vs. 10.49 years) and its adjusted R-squared is higher (0.4368 vs. 0.3403). The higher adjusted R-squared means the second model explains more of the variation in life expectancy, about 44% compared with about 34%.
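
The two fit statistics can also be pulled out of the models and placed side by side. This is a sketch that assumes the gapminder package is installed; both models are refit so the snippet runs on its own, and the nested-model F-test with anova() is an additional check that the assignment does not ask for.

```r
data(gapminder, package = "gapminder")
gdp_lm  <- lm(lifeExp ~ gdpPercap, data = gapminder)
gdp2_lm <- lm(lifeExp ~ gdpPercap + year, data = gapminder)

# Adjusted R-squared (higher is better) and
# residual standard error in years (lower is better)
comparison <- data.frame(
  model         = c("gdp_lm", "gdp2_lm"),
  adj_r_squared = c(summary(gdp_lm)$adj.r.squared,
                    summary(gdp2_lm)$adj.r.squared),
  residual_se   = c(sigma(gdp_lm), sigma(gdp2_lm))
)
comparison

# F-test of whether adding year significantly improves the fit
anova(gdp_lm, gdp2_lm)
```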

Q7 Interpret the coefficient of year.

The coefficient of year is 0.23898275, meaning that, holding GDP per capita constant, life expectancy increases by about 0.24 years on average for every calendar year. The change is positive but gradual.

Q7.a Based on the second model, what is the predicted life expectancy for a country with gdpPercap of $40,000 a year in 1997?

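Life expectancy = -418.42425945 + 0.00066973 * 40,000 + 0.23898275 * 1997

Which is equal to 85.61, or about 86 years. The intercept must be included, and year enters the model as the calendar year itself (1997), not as the number of years since 1952.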
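
The prediction can be checked in R by evaluating the full fitted equation, intercept included. The first part of this sketch uses the coefficients exactly as printed in the summary of the second model; the second part assumes the gapminder package is installed and lets predict() do the same calculation. The names b0, b_gdp, b_year and pred_manual are illustrative.

```r
# Coefficients as printed in the summary of the second model
b0     <- -418.42425945   # intercept
b_gdp  <- 0.00066973      # years per $1 of GDP per capita
b_year <- 0.23898275      # years per calendar year

# Predicted life expectancy for gdpPercap = 40000 in 1997
pred_manual <- b0 + b_gdp * 40000 + b_year * 1997
round(pred_manual, 2)  # 85.61

# The same prediction from the fitted model object
data(gapminder, package = "gapminder")
gdp2_lm <- lm(lifeExp ~ gdpPercap + year, data = gapminder)
predict(gdp2_lm, newdata = data.frame(gdpPercap = 40000, year = 1997))
```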

Q8 Hide the messages, but display the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
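
One way to apply these options to every chunk at once is to set them as knitr defaults in a setup chunk at the top of the document. This is a sketch of one possible approach, not necessarily the one the assignment expects; the same options can be set in individual chunk headers instead.

```r
# Place in the first chunk of the .Rmd file.
# message = FALSE hides package start-up and other messages,
# echo = TRUE displays the code,
# results = "markup" displays the printed results.
knitr::opts_chunk$set(message = FALSE, echo = TRUE, results = "markup")
```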

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.