For each exercise below, show code. Once you’ve completed things, don’t forget to input everything into the quiz on Canvas and to upload this document (knitted version please!) at the end of the quiz. A few tips:
Install packages with install.packages() and load them using library().
gapminder dataset?
library(tidyverse)
library(gapminder)
data(gapminder)
class() of each variable in the gapminder dataset. Describe the difference between "numeric" and "integer". What’s the class of year?
head(gapminder)
## # A tibble: 6 × 6
## country continent year lifeExp pop gdpPercap
## <fct> <fct> <int> <dbl> <int> <dbl>
## 1 Afghanistan Asia 1952 28.8 8425333 779.
## 2 Afghanistan Asia 1957 30.3 9240934 821.
## 3 Afghanistan Asia 1962 32.0 10267083 853.
## 4 Afghanistan Asia 1967 34.0 11537966 836.
## 5 Afghanistan Asia 1972 36.1 13079460 740.
## 6 Afghanistan Asia 1977 38.4 14880372 786.
sapply(gapminder, class)
## country continent year lifeExp pop gdpPercap
## "factor" "factor" "integer" "numeric" "integer" "numeric"
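The difference between "numeric" and "integer" that the exercise asks about can be seen directly in base R. The following is a minimal sketch using made-up values (not taken from gapminder); the names x_int and x_dbl are introduced here purely for illustration.

```r
# Illustrative values only (not taken from gapminder).
x_int <- 1952L   # the L suffix creates an integer
x_dbl <- 1952    # a bare number literal is a double, which class() reports as "numeric"

class(x_int)
class(x_dbl)

# Integers hold whole numbers only; doubles can also hold fractional values.
is.integer(x_int)   # TRUE
is.integer(x_dbl)   # FALSE

# Integers still count as numeric for arithmetic purposes.
is.numeric(x_int)   # TRUE

# Division with / returns a double even when both inputs are integers.
class(x_int / 2L)
```

This matches the sapply() output above: year and pop are stored as integers, while lifeExp and gdpPercap are doubles and so report "numeric".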
length() function.
length(gapminder$country)
## [1] 1704
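filter().
# Keep Oman's rows and the three columns of interest
oman_pop_2007 <- gapminder %>%
  filter(country == "Oman") %>%
  select(country, year, pop)
# year is an integer column, so compare it to a number rather than a string
oman_pop_2007 %>%
  filter(year == 2007)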
## # A tibble: 1 × 3
## country year pop
## <fct> <int> <int>
## 1 Oman 2007 3204897
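filter() accepts several conditions separated by commas and keeps only the rows that satisfy all of them, so the two filtering steps can also be written as one. A minimal sketch, assuming the same gapminder data; the object name oman_2007 is hypothetical and introduced only for this example.

```r
library(dplyr)
library(gapminder)

# Both conditions must hold for a row to be kept.
oman_2007 <- gapminder %>%
  filter(country == "Oman", year == 2007) %>%
  select(country, year, pop)

oman_2007
```

This should return the same single row as the two-step version above.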
filter() and arrange().
# Rank countries by GDP per capita in 2007, highest first
gapminder %>%
  filter(year == 2007) %>%
  select(country, gdpPercap) %>%
  arrange(desc(gdpPercap))
## # A tibble: 142 × 2
## country gdpPercap
## <fct> <dbl>
## 1 Norway 49357.
## 2 Kuwait 47307.
## 3 Singapore 47143.
## 4 United States 42952.
## 5 Ireland 40676.
## 6 Hong Kong, China 39725.
## 7 Switzerland 37506.
## 8 Netherlands 36798.
## 9 Canada 36319.
## 10 Iceland 36181.
## # … with 132 more rows
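When only the top row of a ranking is needed, slice_max() is one alternative to sorting the whole table with arrange(). A minimal sketch, assuming dplyr 1.0.0 or later (where slice_max() was introduced); the object name richest_2007 is hypothetical.

```r
library(dplyr)
library(gapminder)

# Keep only the row with the largest gdpPercap among the 2007 observations.
richest_2007 <- gapminder %>%
  filter(year == 2007) %>%
  slice_max(gdpPercap, n = 1) %>%
  select(country, gdpPercap)

richest_2007
```

The result should agree with the first row of the sorted table above.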
group_by() and summarize()!)
gapminder %>%
  group_by(country) %>%
  summarize(mean_le = mean(lifeExp)) %>%
  arrange(mean_le)
## # A tibble: 142 × 2
## country mean_le
## <fct> <dbl>
## 1 Sierra Leone 36.8
## 2 Afghanistan 37.5
## 3 Angola 37.9
## 4 Guinea-Bissau 39.2
## 5 Mozambique 40.4
## 6 Somalia 41.0
## 7 Rwanda 41.5
## 8 Liberia 42.5
## 9 Equatorial Guinea 43.0
## 10 Guinea 43.2
## # … with 132 more rows
# Rank countries by population in 2007, largest first
gapminder %>%
  filter(year == 2007) %>%
  select(country, pop, year) %>%
  arrange(desc(pop))
## # A tibble: 142 × 3
## country pop year
## <fct> <int> <int>
## 1 China 1318683096 2007
## 2 India 1110396331 2007
## 3 United States 301139947 2007
## 4 Indonesia 223547000 2007
## 5 Brazil 190010647 2007
## 6 Pakistan 169270617 2007
## 7 Bangladesh 150448339 2007
## 8 Nigeria 135031164 2007
## 9 Japan 127467972 2007
## 10 Mexico 108700891 2007
## # … with 132 more rows
africa where observations located in the continent of Africa are coded as “Africa” and those not located in Africa as “Not Africa.” Use dplyr to compute the average life expectancy and GDP per capita in countries located within Africa and outside of Africa in 2007. (2 points)
africa <- gapminder %>%
  mutate(africa = if_else(continent == "Africa", "Africa", "Not Africa"))
head(africa)
## # A tibble: 6 × 7
## country continent year lifeExp pop gdpPercap africa
## <fct> <fct> <int> <dbl> <int> <dbl> <chr>
## 1 Afghanistan Asia 1952 28.8 8425333 779. Not Africa
## 2 Afghanistan Asia 1957 30.3 9240934 821. Not Africa
## 3 Afghanistan Asia 1962 32.0 10267083 853. Not Africa
## 4 Afghanistan Asia 1967 34.0 11537966 836. Not Africa
## 5 Afghanistan Asia 1972 36.1 13079460 740. Not Africa
## 6 Afghanistan Asia 1977 38.4 14880372 786. Not Africa
# Averages for African countries in 2007
Africa_le_gdp <- africa %>%
  filter(year == 2007, africa == "Africa") %>%
  group_by(africa, year) %>%
  summarize(avg_le = mean(lifeExp),
            avg_gdp = mean(gdpPercap))
## `summarise()` has grouped output by 'africa'. You can override using the
## `.groups` argument.
Africa_le_gdp
## # A tibble: 1 × 4
## # Groups: africa [1]
## africa year avg_le avg_gdp
## <chr> <int> <dbl> <dbl>
## 1 Africa 2007 54.8 3089.
# Averages for countries outside Africa in 2007
Not_Africa_le_gdp <- africa %>%
  filter(year == 2007, africa == "Not Africa") %>%
  group_by(africa, year) %>%
  summarize(avg_le = mean(lifeExp),
            avg_gdp = mean(gdpPercap))
## `summarise()` has grouped output by 'africa'. You can override using the
## `.groups` argument.
Not_Africa_le_gdp
## # A tibble: 1 × 4
## # Groups: africa [1]
## africa year avg_le avg_gdp
## <chr> <int> <dbl> <dbl>
## 1 Not Africa 2007 74.1 16644.
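Because the africa variable already labels every row, both averages can also be computed in a single pipeline by grouping on it rather than filtering twice. A minimal sketch under that assumption; the object name le_gdp_2007 is hypothetical and not part of the original solution.

```r
library(dplyr)
library(gapminder)

le_gdp_2007 <- gapminder %>%
  filter(year == 2007) %>%
  # Recode continent into the two-level africa indicator
  mutate(africa = if_else(continent == "Africa", "Africa", "Not Africa")) %>%
  # One summary row per level of africa
  group_by(africa) %>%
  summarize(avg_le = mean(lifeExp),
            avg_gdp = mean(gdpPercap))

le_gdp_2007
```

Grouping by a single variable leaves the summarized result ungrouped, so the message about grouped output should not appear. The two rows should match the separately computed averages above.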