library(tidyverse)
library(skimr)
library(ggthemes)
library(gapminder)
data("gapminder")
glimpse(gapminder)
## Rows: 1,704
## Columns: 6
## $ country <fct> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", …
## $ continent <fct> Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, …
## $ year <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, …
## $ lifeExp <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 40.8…
## $ pop <int> 8425333, 9240934, 10267083, 11537966, 13079460, 14880372, 12…
## $ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134, …
country is a factor variable giving the name of the country. continent is a factor variable giving the continent the country is on. year is an integer variable that ranges from 1952 to 2007 in increments of five. lifeExp is a double variable giving life expectancy at birth, in years. pop is an integer variable giving the population of the country. gdpPercap is a double variable giving GDP per capita in inflation-adjusted US dollars.
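The range and spacing of year can be checked directly against the data. This is a minimal sketch, assuming the gapminder package is installed; the name `years` is introduced here only for illustration.

```r
library(gapminder)

# Distinct years in the data, in ascending order (hypothetical helper name)
years <- sort(unique(gapminder$year))

# First and last year in the data
range(years)

# Gaps between consecutive years; a single value means the spacing is constant
unique(diff(years))
```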
c.)
skim(gapminder)
| Name | gapminder |
| Number of rows | 1704 |
| Number of columns | 6 |
| _______________________ | |
| Column type frequency: | |
| factor | 2 |
| numeric | 4 |
| ________________________ | |
| Group variables | None |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| country | 0 | 1 | FALSE | 142 | Afg: 12, Alb: 12, Alg: 12, Ang: 12 |
| continent | 0 | 1 | FALSE | 5 | Afr: 624, Asi: 396, Eur: 360, Ame: 300 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| year | 0 | 1 | 1979.50 | 17.27 | 1952.00 | 1965.75 | 1979.50 | 1993.25 | 2007.0 | ▇▅▅▅▇ |
| lifeExp | 0 | 1 | 59.47 | 12.92 | 23.60 | 48.20 | 60.71 | 70.85 | 82.6 | ▁▆▇▇▇ |
| pop | 0 | 1 | 29601212.32 | 106157896.74 | 60011.00 | 2793664.00 | 7023595.50 | 19585221.75 | 1318683096.0 | ▇▁▁▁▁ |
| gdpPercap | 0 | 1 | 7215.33 | 9857.45 | 241.17 | 1202.06 | 3531.85 | 9325.46 | 113523.1 | ▇▁▁▁▁ |
There are no missing values.
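The n_missing column of the skim output can be cross-checked with base R. A short sketch, assuming gapminder is loaded as above; `na_counts` is a name chosen here for illustration.

```r
library(gapminder)

# Number of missing values in each column (hypothetical helper name)
na_counts <- colSums(is.na(gapminder))
na_counts

# TRUE when no column contains a missing value
all(na_counts == 0)
```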
d.)
gapminder %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_point() +
  labs(title = "Life expectancy across time",
       x = "year",
       y = "Life expectancy (years)")
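Life expectancy appears to increase over time, although the points within each year are widely spread, so the size of the trend is hard to judge from the scatterplot alone.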
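One way to put numbers on the trend is to summarise life expectancy by year. This is only a sketch of one possible check, not part of the assigned exercise; the name `life_by_year` and the choice of mean and median as summaries are assumptions made here.

```r
library(dplyr)
library(gapminder)

# Mean and median life expectancy across all countries, one row per year
life_by_year <- gapminder %>%
  group_by(year) %>%
  summarise(mean_lifeExp = mean(lifeExp),
            median_lifeExp = median(lifeExp))

life_by_year
```

Comparing the first and last rows of `life_by_year` shows how much the typical country changed between 1952 and 2007.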
e.)
gapminder %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_point() +
  labs(title = "Life expectancy across time",
       x = "year",
       y = "Life expectancy (years)") +
  geom_smooth(se = FALSE)
f.)
gapminder %>%
  ggplot(aes(x = year, y = lifeExp, color = continent)) +
  geom_point() +
  labs(title = "Life expectancy across time",
       x = "year",
       y = "Life expectancy (years)") +
  geom_smooth(se = FALSE, aes(color = continent))
Judging by the smoothed lines, Oceania has the highest life expectancy of the five continents.
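The ranking read off the plot can be checked numerically. The sketch below compares continents in 2007 only, which is an assumption made here for simplicity; `life_by_continent` is an illustrative name.

```r
library(dplyr)
library(gapminder)

# Mean life expectancy per continent in 2007, highest first
life_by_continent <- gapminder %>%
  filter(year == 2007) %>%
  group_by(continent) %>%
  summarise(mean_lifeExp = mean(lifeExp)) %>%
  arrange(desc(mean_lifeExp))

life_by_continent
```

Per the skim output, the four most frequent continents account for 1,680 of the 1,704 rows, so Oceania contributes only 24 rows and its average rests on very few countries.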
g.)
gapminder %>%
  ggplot(aes(x = year, y = lifeExp, color = continent)) +
  geom_point() +
  labs(title = "Life expectancy across time",
       x = "year",
       y = "Life expectancy (years)") +
  geom_smooth(se = FALSE, aes(color = continent)) +
  facet_grid(. ~ continent)
h.)
gapminder %>%
  ggplot(aes(x = year, y = lifeExp, color = continent)) +
  geom_point() +
  labs(title = "Life expectancy across time",
       x = "year",
       y = "Life expectancy (years)") +
  geom_smooth(se = FALSE, aes(color = continent)) +
  facet_grid(. ~ continent) +
  scale_color_colorblind() +
  theme_bw()
i and j.)
gapminder %>%
  ggplot(aes(x = year, y = lifeExp, color = continent)) +
  geom_point() +
  labs(title = "Life expectancy across time",
       x = "year",
       y = "Life expectancy (years)") +
  geom_smooth(se = FALSE, aes(color = continent)) +
  facet_grid(. ~ continent) +
  scale_color_colorblind() +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 45),
        legend.position = "none")
Create a new data set that holds the 20 most populous countries in 2007.
gapminder2007 <- gapminder %>% filter(year == 2007) %>% slice_max(pop, n = 20)
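By default `slice_max()` keeps ties, so it can return more than `n` rows when values are tied at the cutoff. The sketch below checks the row count and shows the `with_ties = FALSE` variant; the name `gapminder2007_strict` is hypothetical and not part of the original exercise.

```r
library(dplyr)
library(gapminder)

# Same definition as above
gapminder2007 <- gapminder %>%
  filter(year == 2007) %>%
  slice_max(pop, n = 20)

# Expect 20 unless two countries share the population at the cutoff
nrow(gapminder2007)

# with_ties = FALSE caps the result at n rows even if ties exist
gapminder2007_strict <- gapminder %>%
  filter(year == 2007) %>%
  slice_max(pop, n = 20, with_ties = FALSE)

nrow(gapminder2007_strict)
```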
a.)
gapminder2007 %>%
  ggplot(aes(x = pop, y = country)) +
  geom_col()
b.)
gapminder2007 %>%
  ggplot(aes(x = fct_reorder(country, pop), y = pop)) +
  geom_col()
c and d.)
gapminder2007 %>%
  ggplot(aes(x = fct_reorder(country, pop), y = pop, fill = continent)) +
  coord_flip() +
  geom_col(color = "black")
e and f.)
gapminder2007 %>%
  ggplot(aes(x = fct_reorder(country, pop), y = pop, fill = continent)) +
  coord_flip() +
  geom_col(color = "black") +
  labs(x = "Country",
       y = "Population",
       title = "country by population") +
  theme_bw() +
  theme(legend.position = "bottom",
        legend.title = element_blank())