Final Exam BZAN 6351 Fall 2023

Get Started: Run the code below without changing anything
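A minimal sketch of such a setup, assuming `Singapore` is the Singapore subset of the `gapminder` data frame from the gapminder package. This is an assumption rather than the original setup code, although it is consistent with the 12 observations and the lifeExp, gdpPercap, pop, and year columns used in the questions:

```r
# Assumed setup: the package choice and the filter are illustrative guesses
library(gapminder)  # provides the gapminder data frame
library(dplyr)      # provides %>%, filter(), select(), mutate()
library(ggplot2)    # provides ggplot() for the plotting questions

# Keep only the rows for Singapore (one row per recorded year)
Singapore <- gapminder %>%
  filter(country == "Singapore")
```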

The questions below use the Singapore dataset. Choose 5 questions to answer.

Q1: plot a line chart with lifeExp on the y axis and gdpPercap on the x axis
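One possible answer, sketched with ggplot2. It assumes `Singapore` is the Singapore subset of the gapminder package's data frame; the object name `p_gdp` is illustrative only:

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Assumption: Singapore is the gapminder subset for that country
Singapore <- gapminder %>% filter(country == "Singapore")

# Line chart of life expectancy (y) against GDP per capita (x)
p_gdp <- ggplot(Singapore, aes(x = gdpPercap, y = lifeExp)) +
  geom_line() +
  labs(x = "gdpPercap", y = "lifeExp")

p_gdp
```

Note that geom_line() connects the points in order of the x variable, so the line follows increasing gdpPercap rather than time.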

Q2: plot a line chart of lifeExp throughout the years.
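One possible answer, again a sketch that assumes `Singapore` is the Singapore subset of the gapminder package's data frame; `p_year` is an illustrative name:

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Assumption: Singapore is the gapminder subset for that country
Singapore <- gapminder %>% filter(country == "Singapore")

# Line chart of life expectancy over time
p_year <- ggplot(Singapore, aes(x = year, y = lifeExp)) +
  geom_line() +
  labs(x = "year", y = "lifeExp")

p_year
```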

Q3: build a linear model of lifeExp as a function of gdpPercap

mod_life <- lm(lifeExp ~ gdpPercap, data = Singapore)
summary(mod_life)

## 
## Call:
## lm(formula = lifeExp ~ gdpPercap, data = Singapore)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.9944 -0.8075  0.8923  1.7135  1.9727 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 6.450e+01  1.074e+00  60.035 4.00e-14 ***
## gdpPercap   3.858e-04  4.767e-05   8.093 1.06e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.36 on 10 degrees of freedom
## Multiple R-squared:  0.8675, Adjusted R-squared:  0.8543 
## F-statistic:  65.5 on 1 and 10 DF,  p-value: 1.064e-05

Q4: what is the equation? (it’s okay to use scientific notation in your formula)

The fitted equation is lifeExp = 6.450e+01 + 3.858e-04 * gdpPercap, that is, lifeExp = 64.50 + 0.0003858 * gdpPercap.
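As a quick check of how the equation is used, the sketch below plugs the reported coefficients into a small helper. The function name `predict_life_exp` and the input value of 10000 are illustrative assumptions, not part of the exam:

```r
# Coefficients copied from the summary(mod_life) output
intercept <- 6.450e+01
slope     <- 3.858e-04

# Hypothetical helper: predicted lifeExp for a given gdpPercap
predict_life_exp <- function(gdp_per_cap) {
  intercept + slope * gdp_per_cap
}

predict_life_exp(10000)  # 64.50 + 3.858 = 68.358
```

With the fitted model object available, predict(mod_life, newdata = data.frame(gdpPercap = 10000)) gives the same kind of prediction without rounding the coefficients.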

Q5: what does the p-value mean for this model?

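The model's p-value of 1.064e-05 is far below the usual 0.05 threshold, so we reject the null hypothesis that gdpPercap has no linear relationship with lifeExp. In other words, gdpPercap is a statistically significant predictor of lifeExp in this model.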
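To see where that p-value comes from, the sketch below recomputes it from the rounded t value and the residual degrees of freedom in the summary output. The variable names are illustrative, and because the inputs are rounded the results only approximately match the reported 1.064e-05:

```r
# Values copied from the summary(mod_life) output
t_value  <- 8.093  # t value for gdpPercap
df_resid <- 10     # residual degrees of freedom

# Two-sided p-value for the slope from the t distribution
p_slope <- 2 * pt(-abs(t_value), df = df_resid)

# With a single predictor the F statistic is the squared t value,
# so the overall F test gives the same p-value as the slope's t test
f_stat  <- t_value^2
p_model <- pf(f_stat, df1 = 1, df2 = df_resid, lower.tail = FALSE)

p_slope
p_model
```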

Q6: please interpret the coefficient

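The coefficient for gdpPercap is 3.858e-04: each one-unit increase in gdpPercap is associated with an average increase of 3.858e-04 years in lifeExp, or about 0.39 years per 1,000-unit increase in gdpPercap. The intercept, 6.450e+01, is the predicted lifeExp when gdpPercap is 0.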

Q7: create a new column in the gapminder dataset called gdpMillions, which is the GDP per capita (gdpPercap) multiplied by the population (pop) divided by 1,000,000.
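One possible answer, sketched with dplyr's mutate(). It assumes the gapminder data frame comes from the gapminder package; the result is stored in a new object, `gapminder_m` (an illustrative name), so the original data frame is left unchanged:

```r
library(gapminder)
library(dplyr)

# gdpMillions = GDP per capita * population / 1,000,000
gapminder_m <- gapminder %>%
  mutate(gdpMillions = gdpPercap * pop / 1000000)

head(gapminder_m)
```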

Q8: display the year when Singapore had the highest population (pop). Write the R code to achieve this. (Hint: filter the max(pop))

max_pop_Singapore <- Singapore %>%
  filter(pop == max(pop)) %>%
  select(year)

max_pop_Singapore

## # A tibble: 1 × 1
##    year
##   <int>
## 1  2007