data in

read in the data:

piaac <- read.csv("http://www.ut.ee/~iseppo/piaacexam.csv")
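
The tables further down print gender and studyarea as `<fct>` columns. Since R 4.0.0, `read.csv()` leaves text columns as character unless `stringsAsFactors = TRUE` is passed, so that argument may be needed to reproduce the output exactly. The sketch below shows the idea on a tiny inline CSV instead of the real file; the `Income` column name and all values are made up for illustration.

```r
# Inline stand-in for the real CSV (hypothetical columns and values)
csv_text <- "gender,studyarea,Income
Female,Services,800
Male,Health and welfare,1200"

# stringsAsFactors = TRUE converts text columns to factors on import
toy <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Quick structural overview: column types and first values
str(toy)
```

The same `str()` call (or `summary()`) on `piaac` gives a first look at which variables are available.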

getting a basic overview of the data:

  1. Using dplyr's functions, find the average income by gender and study area for the people who have higher education and whose study area is available (i.e. is not missing).
## # A tibble: 18 x 3
## # Groups:   gender [2]
##    gender studyarea                                   averageWage
##    <fct>  <fct>                                             <dbl>
##  1 Female Agriculture and veterinary                         702.
##  2 Female Engineering, manufacturing and construction        725.
##  3 Female General programmes                                 576.
##  4 Female Health and welfare                                1041.
##  5 Female Humanities, languages and arts                     813.
##  6 Female Science, mathematics and computing                 851.
##  7 Female Services                                           862.
##  8 Female Social sciences, business and law                  936.
##  9 Female Teacher training and education science             781.
## 10 Male   Agriculture and veterinary                        1032.
## 11 Male   Engineering, manufacturing and construction       1336.
## 12 Male   General programmes                                 869.
## 13 Male   Health and welfare                                1468.
## 14 Male   Humanities, languages and arts                     936.
## 15 Male   Science, mathematics and computing                1458.
## 16 Male   Services                                          1185.
## 17 Male   Social sciences, business and law                 1526.
## 18 Male   Teacher training and education science             810.
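
One possible shape for this pipeline is sketched below on a toy data frame, not the real data. The column names `edlevel3` and `Income`, the level `"High"`, and the use of `NA` for an unavailable study area are all assumptions; check `names(piaac)` and the factor levels before adapting it.

```r
library(dplyr)

# Toy stand-in for piaac; `edlevel3` and `Income` are assumed names
toy <- data.frame(
  gender    = c("Female", "Female", "Male", "Male", "Male", "Female"),
  studyarea = c("Services", "Services", "Services", NA, "Services", "Services"),
  edlevel3  = c("High", "High", "High", "High", "Low", "High"),
  Income    = c(800, 1000, 1200, 5000, 300, NA),
  stringsAsFactors = TRUE
)

averages <- toy %>%
  # keep higher-educated people whose study area is known
  filter(edlevel3 == "High", !is.na(studyarea)) %>%
  group_by(gender, studyarea) %>%
  # na.rm = TRUE skips missing incomes instead of returning NA
  summarise(averageWage = mean(Income, na.rm = TRUE))

averages
```

With two grouping variables, `summarise()` drops only the last one by default, which is consistent with the `Groups: gender` line in the output above.
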
  2. Let's now reorganize this data a bit using pivot_wider from tidyr. Try to get the results in the following format:
## # A tibble: 9 x 3
##   studyarea                                   Female  Male
##   <fct>                                        <dbl> <dbl>
## 1 Agriculture and veterinary                    702. 1032.
## 2 Engineering, manufacturing and construction   725. 1336.
## 3 General programmes                            576.  869.
## 4 Health and welfare                           1041. 1468.
## 5 Humanities, languages and arts                813.  936.
## 6 Science, mathematics and computing            851. 1458.
## 7 Services                                      862. 1185.
## 8 Social sciences, business and law             936. 1526.
## 9 Teacher training and education science        781.  810.

Think about which variable the new column names come from and which variable the values come from, and it should be doable.
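
As a minimal sketch of that idea, the example below rebuilds four rows of the long table by hand (values copied from the rounded output above) and widens them. The object names `long` and `wide` are only illustrative.

```r
library(tidyr)

# Four rows of the long summary, typed in by hand
long <- data.frame(
  gender      = c("Female", "Male", "Female", "Male"),
  studyarea   = c("Services", "Services",
                  "Health and welfare", "Health and welfare"),
  averageWage = c(862, 1185, 1041, 1468)
)

# names_from: the column whose values become new column names
# values_from: the column whose values fill those new columns
wide <- pivot_wider(long, names_from = gender, values_from = averageWage)

wide
```

If the summary is still grouped by gender when it reaches this step, it may be worth calling `dplyr::ungroup()` first so the grouping is not carried into the reshaping.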