Q1 Import data.

Hint: The data file is posted on Moodle under Module 5. It is named “gapminder.csv”.

data <- read.csv("~//Business Stats/data/Minitab.csv")
head(data)
##       country continent year lifeExp      pop gdpPercap
## 1 Afghanistan      Asia 1952  28.801  8425333  779.4453
## 2 Afghanistan      Asia 1957  30.332  9240934  820.8530
## 3 Afghanistan      Asia 1962  31.997 10267083  853.1007
## 4 Afghanistan      Asia 1967  34.020 11537966  836.1971
## 5 Afghanistan      Asia 1972  36.088 13079460  739.9811
## 6 Afghanistan      Asia 1977  38.438 14880372  786.1134
summary(data)
##    country           continent              year         lifeExp     
##  Length:1704        Length:1704        Min.   :1952   Min.   :23.60  
##  Class :character   Class :character   1st Qu.:1966   1st Qu.:48.20  
##  Mode  :character   Mode  :character   Median :1980   Median :60.71  
##                                        Mean   :1980   Mean   :59.47  
##                                        3rd Qu.:1993   3rd Qu.:70.85  
##                                        Max.   :2007   Max.   :82.60  
##       pop              gdpPercap       
##  Min.   :6.001e+04   Min.   :   241.2  
##  1st Qu.:2.794e+06   1st Qu.:  1202.1  
##  Median :7.024e+06   Median :  3531.8  
##  Mean   :2.960e+07   Mean   :  7215.3  
##  3rd Qu.:1.959e+07   3rd Qu.:  9325.5  
##  Max.   :1.319e+09   Max.   :113523.1
str(data)
## 'data.frame':    1704 obs. of  6 variables:
##  $ country  : chr  "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
##  $ continent: chr  "Asia" "Asia" "Asia" "Asia" ...
##  $ year     : int  1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##  $ lifeExp  : num  28.8 30.3 32 34 36.1 ...
##  $ pop      : int  8425333 9240934 10267083 11537966 13079460 14880372 12881816 13867957 16317921 22227415 ...
##  $ gdpPercap: num  779 821 853 836 740 ...

Q2 Create a scatter plot to visualize the relationship between life expectancy and GDP per capita.

Hint: For the code, refer to one of our textbooks, Data Visualization with R: Chapter 4.2. Map lifeExp to the y-axis and gdpPercap to the x-axis.

library(tidyverse)
ggplot(data, 
       aes(x = gdpPercap, 
           y = lifeExp)) +
  geom_point()
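
The summary in Q1 shows that gdpPercap is strongly right-skewed (median 3531.8, maximum 113523.1), so on a linear axis most points crowd toward the left edge. One possible variation, offered as a sketch rather than as the textbook's method, is to put the x-axis on a log10 scale. So that the block runs without the CSV file, it rebuilds only the six rows printed by head(data); the name `afg` is made up for this illustration, and with the full data set `data` would take its place.

```r
library(ggplot2)

# The six rows printed by head(data) in Q1, rebuilt by hand so the
# sketch runs without the CSV file. `afg` is an illustrative name.
afg <- data.frame(
  lifeExp   = c(28.801, 30.332, 31.997, 34.020, 36.088, 38.438),
  gdpPercap = c(779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134)
)

# Same mapping as in Q2, with the x-axis on a log10 scale.
p <- ggplot(afg,
            aes(x = gdpPercap,
                y = lifeExp)) +
  geom_point() +
  scale_x_log10()
p
```

Six rows from one country say nothing about the overall pattern; the point is only the extra `scale_x_log10()` layer.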

Q3 Calculate and interpret the Pearson correlation coefficient.

Hint: Interpret both the direction and the strength of the correlation.

cor(data$lifeExp, data$gdpPercap)
## [1] 0.5837062

The Pearson correlation coefficient is 0.58. The positive sign means that higher GDP per capita tends to go with higher life expectancy, and a value of 0.58 indicates a moderate linear association.
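
To make clear what cor() computes, the sketch below works out the Pearson coefficient by hand as the covariance divided by the product of the two standard deviations, then compares it with the built-in result. It uses only the six rows shown by head(data) in Q1, so its value will differ from the 0.58 obtained on all 1704 rows; the variable names are chosen for this illustration.

```r
# gdpPercap and lifeExp from the six rows printed by head(data) in Q1.
gdp  <- c(779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134)
life <- c(28.801, 30.332, 31.997, 34.020, 36.088, 38.438)

# Pearson's r: covariance scaled by both standard deviations.
r_manual  <- cov(gdp, life) / (sd(gdp) * sd(life))
r_builtin <- cor(gdp, life)

# The two values should agree up to floating-point error.
all.equal(r_manual, r_builtin)
```

Because of the scaling, r is unit-free and always lies between -1 and 1, which is what allows its strength to be interpreted.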

Q4 Based on your analysis in Q2 and Q3, can you conclude that the standard of living (measured by GDP per capita) causes life expectancy to rise? Why or why not?

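No, we cannot conclude that. The scatter plot in Q2 and the correlation of 0.58 in Q3 show that GDP per capita and life expectancy are positively associated, but correlation alone does not establish causation. The association could be driven by other variables that influence both, or the causation could run in the other direction.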
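
When weighing a causal claim like this one, it helps to see that a third variable can produce a correlation on its own. The simulation below is a minimal sketch with made-up data, not an analysis of the gapminder file: `z` is a hidden common cause, and `x` and `y` each depend on `z` but never on each other. The seed and sample size are arbitrary choices.

```r
set.seed(42)            # arbitrary seed, for reproducibility only
n <- 1000

z <- rnorm(n)           # hidden common cause
x <- z + rnorm(n)       # depends on z, not on y
y <- z + rnorm(n)       # depends on z, not on x

# Raw correlation between x and y.
r_xy <- cor(x, y)

# Correlation left over once the linear effect of z is removed
# from both variables (a partial correlation).
r_given_z <- cor(resid(lm(x ~ z)), resid(lm(y ~ z)))

round(c(raw = r_xy, given_z = r_given_z), 2)
```

The raw correlation is clearly positive even though neither variable affects the other, and it largely disappears once `z` is accounted for.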

Q5 You suspect that there may be other variables that are associated with life expectancy. Create a correlation plot.

Hint: For the code, refer to one of our textbooks, Data Visualization with R: Chapter 8.1.

df <- dplyr::select_if(data, is.numeric)
r <- cor(df, use="complete.obs")
round(r,2)
##           year lifeExp   pop gdpPercap
## year      1.00    0.44  0.08      0.23
## lifeExp   0.44    1.00  0.06      0.58
## pop       0.08    0.06  1.00     -0.03
## gdpPercap 0.23    0.58 -0.03      1.00
library(ggcorrplot)
## Warning: package 'ggcorrplot' was built under R version 4.0.3
ggcorrplot(r, 
           hc.order = TRUE, 
           type = "lower",
           lab = TRUE)
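
The first step above keeps only the numeric columns, because cor() cannot handle character columns such as country and continent. The same step can be sketched in base R, without dplyr, as follows. The data frame `afg` is a hand-built copy of the six rows printed by head(data) in Q1, and the names `afg`, `num_only`, and `r_small` are made up for this illustration.

```r
# Six rows from head(data) in Q1, including the two character columns.
afg <- data.frame(
  country   = rep("Afghanistan", 6),
  continent = rep("Asia", 6),
  year      = c(1952L, 1957L, 1962L, 1967L, 1972L, 1977L),
  lifeExp   = c(28.801, 30.332, 31.997, 34.020, 36.088, 38.438),
  pop       = c(8425333L, 9240934L, 10267083L, 11537966L, 13079460L, 14880372L),
  gdpPercap = c(779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134)
)

# Base-R equivalent of dplyr::select_if(afg, is.numeric):
# keep the columns for which is.numeric() returns TRUE.
num_only <- afg[sapply(afg, is.numeric)]
names(num_only)

# Correlation matrix of the remaining columns.
r_small <- cor(num_only, use = "complete.obs")
round(r_small, 2)
```

In dplyr 1.0.0 and later, `select_if()` is marked as superseded; `dplyr::select(data, where(is.numeric))` is the currently recommended spelling, and both give the same columns.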

Q6 List any variable with a strong or moderate positive association with life expectancy, if any.

GDP per capita has a moderate positive association with life expectancy (r = 0.58), and year has a somewhat weaker positive association with it (r = 0.44). Population is essentially uncorrelated with life expectancy (r = 0.06).
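
The same reading can be taken from the matrix programmatically. The sketch below retypes the rounded matrix printed in Q5, pulls out the lifeExp row, and lists the variables above a cutoff. The cutoff of 0.3 for "at least moderate" is an assumption made for this illustration; textbooks differ on where to draw that line.

```r
# Rounded correlation matrix as printed in Q5.
vars <- c("year", "lifeExp", "pop", "gdpPercap")
r <- matrix(c(1.00, 0.44,  0.08,  0.23,
              0.44, 1.00,  0.06,  0.58,
              0.08, 0.06,  1.00, -0.03,
              0.23, 0.58, -0.03,  1.00),
            nrow = 4, byrow = TRUE,
            dimnames = list(vars, vars))

# Correlations of every other variable with lifeExp.
with_life <- r["lifeExp", colnames(r) != "lifeExp"]
sort(with_life, decreasing = TRUE)

# Assumed cutoff for "moderate or stronger"; adjust to the course's convention.
cutoff <- 0.3
moderate_or_strong <- names(with_life[with_life >= cutoff])
moderate_or_strong
```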

Q7 Your classmate argues that the world has gotten better in the recent past and people tend to live longer each year. Would you agree? Argue your case based on the correlation coefficient between life expectancy and year.

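I would agree, with a caveat. The correlation between life expectancy and year is 0.44, a moderate positive linear association, so life expectancy has tended to be higher in later years. A coefficient of 0.44 describes an overall tendency across countries, though; it does not show that life expectancy rose every year or in every country.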
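
A correlation gives direction and strength but not the size of the change per year. One way to put a number on "people tend to live longer each year" is the slope of a simple linear regression, which equals r multiplied by the ratio of the two standard deviations. The sketch below uses only the six Afghanistan rows from head(data) in Q1, so it illustrates the calculation and says nothing about the world as a whole; the variable names are chosen for this illustration.

```r
# year and lifeExp from the six rows printed by head(data) in Q1.
year    <- c(1952, 1957, 1962, 1967, 1972, 1977)
lifeExp <- c(28.801, 30.332, 31.997, 34.020, 36.088, 38.438)

# Slope of lifeExp on year: years of life expectancy gained per calendar year.
fit   <- lm(lifeExp ~ year)
slope <- unname(coef(fit)["year"])
slope

# The slope is the correlation rescaled by the two standard deviations.
slope_from_r <- cor(year, lifeExp) * sd(lifeExp) / sd(year)
all.equal(slope, slope_from_r)
```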

Q8 Hide the messages, but display the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
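
One possible way to meet this requirement, sketched here as an assumption about the setup rather than as the expected answer, is to set the three options once for the whole document in a setup chunk using knitr. The same options can instead be written in an individual chunk header, for example on the chunk that loads tidyverse.

```r
# Typically placed in a setup chunk at the top of the .Rmd file.
# message = FALSE    hides messages such as package start-up notes
# echo = TRUE        displays the code
# results = "markup" displays the results in the usual marked-up form
knitr::opts_chunk$set(message = FALSE,
                      echo    = TRUE,
                      results = "markup")
```

Warnings are controlled by a separate option, `warning`, so a notice such as "package was built under R version ..." is not affected by `message = FALSE`.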

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.