This is an average life expectancy dataset sourced from Gapminder. I have used three R packages today: dplyr for data manipulation, gapminder for the data source, and highcharter for visualization.
Average Life Expectancy Per Continent Subset:
## # A tibble: 5 × 2
## continent AvgLifeExp
## <fct> <dbl>
## 1 Oceania 74.3
## 2 Europe 71.9
## 3 Americas 64.7
## 4 Asia 60.1
## 5 Africa 48.9
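The code behind this summary is not shown, so the following is a minimal sketch of one way to reproduce it with dplyr. The object name `avg_le` is an assumption, and the descending sort is inferred from the row order in the output above.

```r
library(dplyr)
library(gapminder)

# Assumed name: avg_le. Mean life expectancy per continent across all years,
# sorted from highest to lowest to match the printed order.
avg_le <- gapminder %>%
  group_by(continent) %>%
  summarise(AvgLifeExp = mean(lifeExp)) %>%
  arrange(desc(AvgLifeExp))

avg_le
```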
The subset we’ve worked with so far won’t make the cut here. We need fewer data points.
The avg_le_europe data frame holds the average life expectancy in Europe, grouped by year:
## # A tibble: 12 × 2
## year AvgLifeExp
## <int> <dbl>
## 1 1952 64.4
## 2 1957 66.7
## 3 1962 68.5
## 4 1967 69.7
## 5 1972 70.8
## 6 1977 71.9
## 7 1982 72.8
## 8 1987 73.6
## 9 1992 74.4
## 10 1997 75.5
## 11 2002 76.7
## 12 2007 77.6
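A data frame like this can likely be built by filtering to Europe before grouping by year. This is a sketch under that assumption. Only the name `avg_le_europe` and the column `AvgLifeExp` come from the text above.

```r
library(dplyr)
library(gapminder)

# Keep European countries only, then average life expectancy for each year.
avg_le_europe <- gapminder %>%
  filter(continent == "Europe") %>%
  group_by(year) %>%
  summarise(AvgLifeExp = mean(lifeExp))

avg_le_europe
```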
In plain English, a multiple scatter plot allows you to plot multiple categories of the same variable. That requires a bigger dataset, so I have added Asia to the mix:
## `summarise()` has grouped output by 'continent'. You can override using the
## `.groups` argument.
## # A tibble: 24 × 3
## # Groups: continent [2]
## continent year AvgLifeExp
## <fct> <int> <dbl>
## 1 Asia 1952 46.3
## 2 Asia 1957 49.3
## 3 Asia 1962 51.6
## 4 Asia 1967 54.7
## 5 Asia 1972 57.3
## 6 Asia 1977 59.6
## 7 Asia 1982 62.6
## 8 Asia 1987 64.9
## 9 Asia 1992 66.5
## 10 Asia 1997 68.0
## # ℹ 14 more rows
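The two-continent subset can be sketched as below. The object name `avg_le_europe_asia` is hypothetical. Grouping by both `continent` and `year` is what triggers the `summarise()` message and leaves the result grouped by `continent`, as in the output above.

```r
library(dplyr)
library(gapminder)

# Hypothetical name: avg_le_europe_asia.
# Two grouping columns, so summarise() drops only the last one (year)
# and prints the informational message about `.groups`.
avg_le_europe_asia <- gapminder %>%
  filter(continent %in% c("Europe", "Asia")) %>%
  group_by(continent, year) %>%
  summarise(AvgLifeExp = mean(lifeExp))

avg_le_europe_asia
```

Passing `.groups = "drop"` to `summarise()` would silence the message and return an ungrouped tibble.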
In essence, a line chart conveys the same information as a scatter plot, but individual data points aren’t separated – they’re connected by a line instead.
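As a rough illustration, a line chart of the Europe and Asia averages could be drawn with highcharter's `hchart()` and `hcaes()`. The object names `le_by_year` and `line_chart` are assumptions, and the chart the author actually produced may differ.

```r
library(dplyr)
library(gapminder)
library(highcharter)

# Hypothetical name: le_by_year. Ungrouped so it can be passed straight to hchart().
le_by_year <- gapminder %>%
  filter(continent %in% c("Europe", "Asia")) %>%
  group_by(continent, year) %>%
  summarise(AvgLifeExp = mean(lifeExp), .groups = "drop")

# One line per continent; swapping "line" for "scatter" gives the
# multiple scatter plot instead.
line_chart <- hchart(
  le_by_year,
  "line",
  hcaes(x = year, y = AvgLifeExp, group = continent)
)

line_chart
```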
The next subset includes the population of three countries, taken from the year 2007:
## # A tibble: 3 × 6
## country continent year lifeExp pop gdpPercap
## <fct> <fct> <int> <dbl> <int> <dbl>
## 1 France Europe 2007 80.7 61083916 30470.
## 2 Germany Europe 2007 79.4 82400996 32170.
## 3 Spain Europe 2007 80.9 40448191 28821.
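These rows look like a plain filter on the full gapminder table. Here is a sketch assuming that is how they were selected. The name `pop_2007` is hypothetical.

```r
library(dplyr)
library(gapminder)

# Hypothetical name: pop_2007. All six gapminder columns are kept;
# only the rows are restricted to three countries in 2007.
pop_2007 <- gapminder %>%
  filter(year == 2007, country %in% c("France", "Germany", "Spain"))

pop_2007
```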