Q1 Import data.

Hint: The data file is posted in Moodle. See Module 5. It is named “gapminder.csv”.

data <- read.csv("~//BusStats/Data/gapminder.csv")
head(data)
##       country continent year lifeExp      pop gdpPercap
## 1 Afghanistan      Asia 1952  28.801  8425333  779.4453
## 2 Afghanistan      Asia 1957  30.332  9240934  820.8530
## 3 Afghanistan      Asia 1962  31.997 10267083  853.1007
## 4 Afghanistan      Asia 1967  34.020 11537966  836.1971
## 5 Afghanistan      Asia 1972  36.088 13079460  739.9811
## 6 Afghanistan      Asia 1977  38.438 14880372  786.1134
summary(data)
##    country           continent              year         lifeExp     
##  Length:1704        Length:1704        Min.   :1952   Min.   :23.60  
##  Class :character   Class :character   1st Qu.:1966   1st Qu.:48.20  
##  Mode  :character   Mode  :character   Median :1980   Median :60.71  
##                                        Mean   :1980   Mean   :59.47  
##                                        3rd Qu.:1993   3rd Qu.:70.85  
##                                        Max.   :2007   Max.   :82.60  
##       pop              gdpPercap       
##  Min.   :6.001e+04   Min.   :   241.2  
##  1st Qu.:2.794e+06   1st Qu.:  1202.1  
##  Median :7.024e+06   Median :  3531.8  
##  Mean   :2.960e+07   Mean   :  7215.3  
##  3rd Qu.:1.959e+07   3rd Qu.:  9325.5  
##  Max.   :1.319e+09   Max.   :113523.1

Q2 Create a scatter plot to visualize the relationship between life expectancy and GDP per capita.

Hint: For the code, refer to one of our textbooks, Data Visualization with R: Chapter 4.2. Map lifeExp to the y-axis and gdpPercap to the x-axis.

library(ggplot2)
ggplot(data, 
       aes(x = gdpPercap, 
           y = lifeExp)) +
  geom_point()

Q3 Calculate and interpret the Pearson correlation coefficient.

Hint: Interpret both the direction and the strength of the correlation.

cor(data$lifeExp, data$gdpPercap)
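## [1] 0.5837062

The coefficient is about 0.58, which indicates a positive, moderate linear association: observations with higher GDP per capita tend to have higher life expectancy.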
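As a check on how cor() arrives at its value, Pearson's r can be reproduced from its definition: the covariance of the two variables divided by the product of their standard deviations. The sketch below uses only the six rows printed by head(data) as a stand-in (the vector names `life_exp` and `gdp` are illustrative, not part of the assignment), so its result will differ from the full-data coefficient.

```r
# Toy stand-in: the six Afghanistan rows shown by head(data).
life_exp <- c(28.801, 30.332, 31.997, 34.020, 36.088, 38.438)
gdp      <- c(779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134)

# Pearson's r from the definition: cov(x, y) / (sd(x) * sd(y)).
r_manual <- cov(life_exp, gdp) / (sd(life_exp) * sd(gdp))

# The built-in function should agree with the manual calculation.
r_builtin <- cor(life_exp, gdp, method = "pearson")

print(c(manual = r_manual, builtin = r_builtin))
```

Running the same two lines on data$lifeExp and data$gdpPercap should reproduce the coefficient reported by cor() for the full data set.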

Q4 Based on your analysis in Q2 and Q3, can you conclude that the standard of living (measured by GDP per capita) causes life expectancy to rise? Why or why not?

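No. In the graph, GDP per capita is heavily right-skewed, and the correlation of 0.58 shows that higher life expectancy is associated with higher GDP per capita. An association alone, however, does not establish that one variable causes the other.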

Q5 You suspect that there may be other variables that are associated with life expectancy. Create a correlation plot.

Hint: For the code, refer to one of our textbooks, Data Visualization with R: Chapter 8.1.

df <- dplyr::select_if(data, is.numeric)

# calculate the correlations
r <- cor(df, use="complete.obs")
round(r,2)
##           year lifeExp   pop gdpPercap
## year      1.00    0.44  0.08      0.23
## lifeExp   0.44    1.00  0.06      0.58
## pop       0.08    0.06  1.00     -0.03
## gdpPercap 0.23    0.58 -0.03      1.00
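Q5 asks for a plot, whereas the output here is only the correlation matrix. One possible way to draw it is sketched below, assuming the ggcorrplot package is installed (it is not loaded anywhere in the original code). To stay self-contained, the sketch rebuilds a small data frame from the six rows printed by head(data); in the assignment itself, `r` would come from the full data set.

```r
# Toy stand-in for the numeric columns: the six rows shown by head(data).
df <- data.frame(
  year      = c(1952, 1957, 1962, 1967, 1972, 1977),
  lifeExp   = c(28.801, 30.332, 31.997, 34.020, 36.088, 38.438),
  pop       = c(8425333, 9240934, 10267083, 11537966, 13079460, 14880372),
  gdpPercap = c(779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134)
)

# Correlation matrix, computed the same way as in the answer.
r <- cor(df, use = "complete.obs")

# Draw the matrix as a heat map, but only if ggcorrplot is available
# (assumption: the package is installed).
if (requireNamespace("ggcorrplot", quietly = TRUE)) {
  p <- ggcorrplot::ggcorrplot(r, type = "lower", lab = TRUE)
  print(p)
}
```

Here `type = "lower"` shows only the lower triangle and `lab = TRUE` prints each coefficient in its cell, which makes the row for lifeExp easy to read off.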

Q6 List any variable with a strong or moderate positive association with life expectancy, if any.

gdpPercap (GDP per capita) and year.

Q7 Your classmate argues that the world has gotten better in the recent past and people tend to live longer each year. Would you agree? Argue your case based on the correlation coefficient between life expectancy and year.

Hint: A correct answer must include all of the following: 1) direction and strength of the correlation coefficient, and 2) linear versus non-linear relationship.

I would agree. Across the full data set, the correlation coefficient between life expectancy and year is 0.44, a moderate positive association. For example, in the first six rows of the data (Afghanistan), life expectancy rose from 28.8 in 1952 to 38.4 in 1977, increasing at every five-year observation.

Q8 Hide the messages, but display the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
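For Q8, the three options named in the hint can be set on each chunk or once for the whole document. A minimal sketch of the document-wide form follows, assuming it is placed in a setup chunk near the top of the .Rmd file and that knitr is installed:

```r
# Document-wide chunk defaults (assumed to sit in a setup chunk).
knitr::opts_chunk$set(
  message = FALSE,     # hide messages such as package start-up notes
  echo    = TRUE,      # display the code
  results = "markup"   # display the printed results (knitr's default)
)
```

The same three options can instead be written in an individual chunk header when only some chunks need them.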

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.