All graphs should be created with ggplot2.
Load a data frame from the file demography.csv. It contains 2016 population data for Belgorodskaya and Kaluzhskaya oblasts (regions), taken from the Russian Federal State Statistics Service.
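A minimal sketch of the loading step, assuming the file is a comma-separated CSV with a header row. Because the real file is not available here, the sketch writes a tiny stand-in file with made-up numbers; the object name demo and the temporary path are hypothetical. For the assignment itself, point path at demography.csv.

```r
# Hypothetical stand-in for demography.csv: three columns, made-up numbers,
# written to a temporary file only so that the sketch runs on its own.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "region_code,popul_total,wa_total",
  "1,1000,550",
  "2,2000,1100"
), path)

# For the assignment, replace `path` with "demography.csv".
demo <- read.csv(path, stringsAsFactors = FALSE)

# Check column names and types before doing anything else.
str(demo)
```

If the Russian-language columns come out garbled, the fileEncoding argument of read.csv() may need to be set to match the encoding of the file.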
Variables:
region_ru: region (in Russian; Belgorodskaya or Kaluzhskaya oblast);
district_ru: district (in Russian);
region_code: region code (1 - Belgorodskaya oblast, 2 - Kaluzhskaya oblast);
empl_total: total number of employed people;
popul_total: total population;
urban_total: urban population;
rural_total: rural population;
wa_total: total number of people of working age (from 18 to 55 for females, from 18 to 60 for males);
ret_total: total number of people older than working age;
young_total: total number of people younger than working age.
Create the following columns in the data set:
work_share: share of people of working age (in %);
old_share: share of people older than working age (in %);
young_share: share of people younger than working age (in %).
Hint: you can add all three columns at once via mutate() from dplyr.
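One possible way to add the three columns is sketched below. It assumes that each share is computed relative to popul_total, which the task does not state explicitly. The toy rows use made-up numbers and the object name demo is hypothetical.

```r
library(dplyr)

# Toy rows with made-up numbers; column names follow the variable list.
demo <- data.frame(
  popul_total = c(1000, 2000),
  wa_total    = c(550, 1100),
  ret_total   = c(250, 600),
  young_total = c(200, 300)
)

# Assumption: every share is a percentage of the total population.
demo <- demo %>%
  mutate(
    work_share  = 100 * wa_total / popul_total,
    old_share   = 100 * ret_total / popul_total,
    young_share = 100 * young_total / popul_total
  )

demo
```

In this toy data the three groups add up to the total population, so the three shares in each row sum to 100; the same check may be worth running on the real data.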
Plot a histogram of work_share. Change its fill color and add a rug. Add a vertical line at the median value of work_share.
Hint: add one more layer to the graph, geom_vline(xintercept = ), and put the median value after the = sign.
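A sketch of how the histogram layers could be combined. The work_share values are made up, and the colors, number of bins, and object names are arbitrary choices rather than requirements of the task.

```r
library(ggplot2)

# Made-up work_share values (in %), only so that the sketch runs on its own.
demo <- data.frame(
  work_share = c(52.1, 54.3, 55.0, 55.8, 56.4, 57.2, 58.9, 60.1)
)

p_hist <- ggplot(demo, aes(x = work_share)) +
  # Histogram with a custom fill; `color` sets the bar outline.
  geom_histogram(bins = 5, fill = "steelblue", color = "white") +
  # Rug: one tick per observation along the x axis.
  geom_rug() +
  # Vertical line at the median of work_share.
  geom_vline(xintercept = median(demo$work_share),
             linetype = "dashed", color = "red") +
  labs(x = "Share of working-age population (%)", y = "Count")

p_hist
```

Computing the median inside geom_vline() keeps the line in sync with the data if the data frame changes.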
Create two smoothed density plots of work_share, one per region, in the same graph (in one window). Change the fill colors from the default blue and pink to any colors you like. Adjust the transparency of the plots if needed.
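A sketch of overlaid densities by region, assuming the variable to plot is work_share and that region_code is converted to a factor so that ggplot2 treats it as a grouping variable. The data values and the two colors are made up for illustration.

```r
library(ggplot2)

# Made-up values: five districts per region.
demo <- data.frame(
  region_code = rep(c(1, 2), each = 5),
  work_share  = c(54.2, 55.1, 56.3, 57.0, 58.4,
                  51.9, 53.0, 53.8, 55.2, 56.1)
)

p_dens <- ggplot(demo, aes(x = work_share, fill = factor(region_code))) +
  # alpha < 1 makes the overlapping densities see-through.
  geom_density(alpha = 0.5) +
  # Replace the default fills and label the legend by region.
  scale_fill_manual(
    values = c("1" = "darkorange", "2" = "seagreen"),
    labels = c("1" = "Belgorodskaya oblast", "2" = "Kaluzhskaya oblast"),
    name   = "Region"
  )

p_dens
```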
Create a scatterplot of young_share against old_share. Interpret the graph you obtain. Change the color and the shape of the points.
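A sketch of the scatterplot with a custom point color and shape. Which variable goes on which axis is an assumption, and the values are made up, so the pattern in this toy plot says nothing about the real data.

```r
library(ggplot2)

# Made-up shares (in %), only so that the sketch runs on its own.
demo <- data.frame(
  young_share = c(15.2, 16.8, 17.5, 18.1, 19.4, 20.3),
  old_share   = c(30.1, 28.4, 27.9, 26.5, 25.0, 23.8)
)

p_scatter <- ggplot(demo, aes(x = young_share, y = old_share)) +
  # shape = 17 is a filled triangle; color and size are set outside aes()
  # because they are constants, not mapped to a variable.
  geom_point(color = "purple", shape = 17, size = 3) +
  labs(x = "Share younger than working age (%)",
       y = "Share older than working age (%)")

p_scatter
```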
Create a scatterplot of young_share against old_share by region, in two separate panels (facets).
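A sketch of the faceted version, assuming facet_wrap() on region_code is an acceptable way to get one panel per region (facet_grid() would also work). The data are made up.

```r
library(ggplot2)

# Made-up shares (in %): three districts per region.
demo <- data.frame(
  region_code = rep(c(1, 2), each = 3),
  young_share = c(15.2, 16.8, 17.5, 18.1, 19.4, 20.3),
  old_share   = c(30.1, 28.4, 27.9, 26.5, 25.0, 23.8)
)

p_facet <- ggplot(demo, aes(x = young_share, y = old_share)) +
  geom_point() +
  # One panel per value of region_code.
  facet_wrap(~ region_code)

p_facet
```

Faceting on region_ru instead would put the region names in the panel titles.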