A guide to the most basic commands of the dplyr package
Practice on the 2014 Vietnam Household Living Standards Survey (VHLSS 2014) data
library(tidyverse)
## -- Attaching packages ---------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.0 v purrr 0.2.5
## v tibble 1.4.2 v dplyr 0.7.8
## v tidyr 0.8.2 v stringr 1.3.1
## v readr 1.2.1 v forcats 0.3.0
## -- Conflicts ------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(haven)
my_folder <- "D:/GoogleDrive/Data/VHLSS2014/"
muc1a <- read_dta(paste0(my_folder, "muc1a.dta"))
muc1a
## # A tibble: 36,080 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 1 "" 2 1 11 1958
## 2 1 1 4 8 13 2 "" 1 2 " 9" 1957
## 3 1 1 4 8 14 1 "" 2 1 " 4" 1953
## 4 1 1 4 8 14 2 "" 1 7 " 7" 1996
## 5 1 1 4 8 15 1 "" 1 1 " 6" 1979
## 6 1 1 4 8 15 2 "" 2 2 11 1981
## 7 1 1 4 8 15 3 "" 1 3 " 1" 2008
## 8 1 1 4 8 15 4 "" 2 3 " 7" 2010
## 9 1 1 4 8 15 5 "" 2 4 " 8" 1954
## 10 1 1 7 6 13 1 "" 1 1 " 5" 1953
## # ... with 36,070 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 36,080 x 1
## tinh
## <dbl+lbl>
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 1
## 7 1
## 8 1
## 9 1
## 10 1
## # ... with 36,070 more rows
## # A tibble: 36,080 x 3
## tinh huyen xa
## <dbl+lbl> <dbl> <dbl>
## 1 1 1 4
## 2 1 1 4
## 3 1 1 4
## 4 1 1 4
## 5 1 1 4
## 6 1 1 4
## 7 1 1 4
## 8 1 1 4
## 9 1 1 4
## 10 1 1 7
## # ... with 36,070 more rows
## # A tibble: 36,080 x 28
## huyen diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b m1ac5 m1ac6
## <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl> <dbl> <dbl>
## 1 1 8 13 1 "" 2 1 11 1958 56 NA
## 2 1 8 13 2 "" 1 2 " 9" 1957 57 NA
## 3 1 8 14 1 "" 2 1 " 4" 1953 61 NA
## 4 1 8 14 2 "" 1 7 " 7" 1996 18 NA
## 5 1 8 15 1 "" 1 1 " 6" 1979 35 NA
## 6 1 8 15 2 "" 2 2 11 1981 33 NA
## 7 1 8 15 3 "" 1 3 " 1" 2008 6 " 1"
## 8 1 8 15 4 "" 2 3 " 7" 2010 4 " 1"
## 9 1 8 15 5 "" 2 4 " 8" 1954 60 NA
## 10 1 6 13 1 "" 1 1 " 5" 1953 61 NA
## # ... with 36,070 more rows, and 17 more variables: m1ac7a <dbl+lbl>,
## # m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>, m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>,
## # m1ac10 <dbl+lbl>, m1ama1 <dbl>, m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>,
## # m1ac13 <dbl+lbl>, m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 36,080 x 2
## m1ac4a m1ac4b
## <dbl+lbl> <dbl>
## 1 11 1958
## 2 " 9" 1957
## 3 " 4" 1953
## 4 " 7" 1996
## 5 " 6" 1979
## 6 11 1981
## 7 " 1" 2008
## 8 " 7" 2010
## 9 " 8" 1954
## 10 " 5" 1953
## # ... with 36,070 more rows
## # A tibble: 36,080 x 5
## xa m1ac4a m1ac7a m1ac14a m1ac15a
## <dbl> <dbl+lbl> <dbl+lbl> <dbl+lbl> <dbl+lbl>
## 1 4 11 NA NA " 2"
## 2 4 " 9" NA NA " 2"
## 3 4 " 4" NA NA " 2"
## 4 4 " 7" NA NA " 2"
## 5 4 " 6" NA NA " 2"
## 6 4 11 NA NA " 2"
## 7 4 " 1" " 1" NA NA
## 8 4 " 7" " 1" NA NA
## 9 4 " 8" NA NA " 2"
## 10 7 " 5" NA NA " 2"
## # ... with 36,070 more rows
## # A tibble: 36,080 x 22
## m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b m1ac5 m1ac6 m1ac7a m1ac7b m1ac7c m1ac8
## <chr> <dbl> <dbl> <dbl+> <dbl> <dbl> <dbl> <dbl+> <dbl+> <dbl+> <dbl>
## 1 "" 2 1 11 1958 56 NA NA NA NA " 2"
## 2 "" 1 2 " 9" 1957 57 NA NA NA NA " 2"
## 3 "" 2 1 " 4" 1953 61 NA NA NA NA " 4"
## 4 "" 1 7 " 7" 1996 18 NA NA NA NA " 1"
## 5 "" 1 1 " 6" 1979 35 NA NA NA NA " 2"
## 6 "" 2 2 11 1981 33 NA NA NA NA " 2"
## 7 "" 1 3 " 1" 2008 6 " 1" " 1" " 2" NA NA
## 8 "" 2 3 " 7" 2010 4 " 1" " 1" " 2" NA NA
## 9 "" 2 4 " 8" 1954 60 NA NA NA NA " 2"
## 10 "" 1 1 " 5" 1953 61 NA NA NA NA " 2"
## # ... with 36,070 more rows, and 11 more variables: m1ac9 <dbl+lbl>,
## # m1ac10 <dbl+lbl>, m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>,
## # m1ac13 <dbl+lbl>, m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>
## # A tibble: 36,080 x 30
## xa huyen tinh diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 4 1 1 8 13 1 "" 2 1 11 1958
## 2 4 1 1 8 13 2 "" 1 2 " 9" 1957
## 3 4 1 1 8 14 1 "" 2 1 " 4" 1953
## 4 4 1 1 8 14 2 "" 1 7 " 7" 1996
## 5 4 1 1 8 15 1 "" 1 1 " 6" 1979
## 6 4 1 1 8 15 2 "" 2 2 11 1981
## 7 4 1 1 8 15 3 "" 1 3 " 1" 2008
## 8 4 1 1 8 15 4 "" 2 3 " 7" 2010
## 9 4 1 1 8 15 5 "" 2 4 " 8" 1954
## 10 7 1 1 6 13 1 "" 1 1 " 5" 1953
## # ... with 36,070 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 36,080 x 1
## gender
## <dbl+lbl>
## 1 2
## 2 1
## 3 2
## 4 1
## 5 1
## 6 2
## 7 1
## 8 2
## 9 2
## 10 1
## # ... with 36,070 more rows
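The tables above are the kind of result `select()` returns: a single column, a few columns, all columns except some, columns matched by name, a reordering, and a renamed column. As a minimal sketch, the calls below show these variants on a toy data frame. The column names are borrowed from `muc1a`, but every value is invented, and the exact calls behind the printed tables are an assumption.

```r
library(dplyr)

# Toy data: column names mirror muc1a, but every value is invented
toy <- data.frame(
  tinh   = c(1, 1, 2),
  huyen  = c(1, 2, 30),
  xa     = c(4, 40, 955),
  m1ac2  = c(2, 1, 1),
  m1ac4a = c(11, 9, 1),
  m1ac4b = c(1958, 1957, 2014)
)

one_col    <- toy %>% select(tinh)                       # a single column
three_cols <- toy %>% select(tinh, huyen, xa)            # several columns
dropped    <- toy %>% select(-tinh, -xa)                 # everything except tinh and xa
by_prefix  <- toy %>% select(starts_with("m1ac4"))       # names starting with "m1ac4"
by_suffix  <- toy %>% select(ends_with("a"))             # names ending in "a"
reordered  <- toy %>% select(xa, huyen, tinh, everything())  # move three columns to the front
renamed    <- toy %>% select(gender = m1ac2)             # select and rename in one step

names(by_suffix)
```

`everything()` is handy for reordering because it appends all columns not already named, so nothing is dropped.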
## # A tibble: 17,718 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 2 "" 1 2 9 1957
## 2 1 1 4 8 14 2 "" 1 7 7 1996
## 3 1 1 4 8 15 1 "" 1 1 6 1979
## 4 1 1 4 8 15 3 "" 1 3 1 2008
## 5 1 1 7 6 13 1 "" 1 1 5 1953
## 6 1 1 7 6 13 3 "" 1 6 4 2002
## 7 1 1 7 6 14 1 "" 1 1 1 1954
## 8 1 1 7 6 15 2 "" 1 2 4 1943
## 9 1 1 7 6 15 3 "" 1 3 5 1984
## 10 1 1 16 20 13 2 "" 1 2 3 1963
## # ... with 17,708 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 18,362 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 1 "" 2 1 11 1958
## 2 1 1 4 8 14 1 "" 2 1 " 4" 1953
## 3 1 1 4 8 15 2 "" 2 2 11 1981
## 4 1 1 4 8 15 4 "" 2 3 " 7" 2010
## 5 1 1 4 8 15 5 "" 2 4 " 8" 1954
## 6 1 1 7 6 13 2 "" 2 2 " 5" 1955
## 7 1 1 7 6 14 2 "" 2 2 " 7" 1961
## 8 1 1 7 6 15 1 "" 2 1 11 1947
## 9 1 1 7 6 15 4 "" 2 3 " 8" 1984
## 10 1 1 7 6 15 5 "" 2 6 11 2010
## # ... with 18,352 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 683 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 1 "" 2 1 11 1958
## 2 1 1 4 8 14 1 "" 2 1 " 4" 1953
## 3 1 1 7 6 13 1 "" 1 1 " 5" 1953
## 4 1 1 28 25 15 2 "" 2 2 10 1953
## 5 1 2 40 12 15 1 "" 1 1 " 1" 1958
## 6 1 4 124 36 13 1 "" 2 1 " 9" 1953
## 7 1 5 167 30 14 1 "" 2 1 10 1953
## 8 1 6 187 10 20 2 "" 2 2 " 8" 1958
## 9 1 6 190 60 15 2 "" 2 2 " 1" 1953
## 10 1 6 199 16 14 2 "" 1 2 " 8" 1953
## # ... with 673 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 35,397 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 2 "" 1 2 " 9" 1957
## 2 1 1 4 8 14 2 "" 1 7 " 7" 1996
## 3 1 1 4 8 15 1 "" 1 1 " 6" 1979
## 4 1 1 4 8 15 2 "" 2 2 11 1981
## 5 1 1 4 8 15 3 "" 1 3 " 1" 2008
## 6 1 1 4 8 15 4 "" 2 3 " 7" 2010
## 7 1 1 4 8 15 5 "" 2 4 " 8" 1954
## 8 1 1 7 6 13 2 "" 2 2 " 5" 1955
## 9 1 1 7 6 13 3 "" 1 6 " 4" 2002
## 10 1 1 7 6 14 1 "" 1 1 " 1" 1954
## # ... with 35,387 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
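The 683-row table shows only birth years 1953 and 1958, and the 35,397-row table looks like its complement (683 + 35,397 = 36,080). That pattern is consistent with a membership test inside `filter()` and its negation. A minimal sketch on invented data, assuming `%in%` was the operator used:

```r
library(dplyr)

# Toy data: matv and m1ac4b reuse names from muc1a, values are invented
toy <- data.frame(
  matv   = 1:5,
  m1ac4b = c(1953, 1996, 1958, 2008, 1953)
)

# Keep rows whose birth year is one of the listed values
born_53_58 <- toy %>% filter(m1ac4b %in% c(1953, 1958))

# Negate the test to keep everyone else
others <- toy %>% filter(!m1ac4b %in% c(1953, 1958))

# The two subsets partition the data
nrow(born_53_58) + nrow(others) == nrow(toy)
```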
## # A tibble: 26,816 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 1 "" 2 1 11 1958
## 2 1 1 4 8 13 2 "" 1 2 " 9" 1957
## 3 1 1 4 8 14 1 "" 2 1 " 4" 1953
## 4 1 1 4 8 14 2 "" 1 7 " 7" 1996
## 5 1 1 4 8 15 1 "" 1 1 " 6" 1979
## 6 1 1 4 8 15 2 "" 2 2 11 1981
## 7 1 1 4 8 15 5 "" 2 4 " 8" 1954
## 8 1 1 7 6 13 1 "" 1 1 " 5" 1953
## 9 1 1 7 6 13 2 "" 2 2 " 5" 1955
## 10 1 1 7 6 14 1 "" 1 1 " 1" 1954
## # ... with 26,806 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 160 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 2 79 10 15 5 "" 1 6 1 2014
## 2 1 6 190 60 15 5 "" 1 6 5 2014
## 3 1 8 322 15 15 4 "" 1 6 4 2014
## 4 1 9 367 33 14 5 "" 1 6 8 2014
## 5 1 21 607 32 14 6 "" 1 6 6 2014
## 6 1 271 9631 1 15 9 "" 1 6 1 2014
## 7 1 274 9847 17 13 4 "" 1 6 4 2014
## 8 1 278 10174 11 15 5 "" 1 6 8 2014
## 9 1 281 10426 9 13 5 "" 1 6 1 2014
## 10 2 30 955 4 14 4 "" 1 3 1 2014
## # ... with 150 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 160 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 2 79 10 15 5 "" 1 6 1 2014
## 2 1 6 190 60 15 5 "" 1 6 5 2014
## 3 1 8 322 15 15 4 "" 1 6 4 2014
## 4 1 9 367 33 14 5 "" 1 6 8 2014
## 5 1 21 607 32 14 6 "" 1 6 6 2014
## 6 1 271 9631 1 15 9 "" 1 6 1 2014
## 7 1 274 9847 17 13 4 "" 1 6 4 2014
## 8 1 278 10174 11 15 5 "" 1 6 8 2014
## 9 1 281 10426 9 13 5 "" 1 6 1 2014
## 10 2 30 955 4 14 4 "" 1 3 1 2014
## # ... with 150 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
An AND condition can be written with a comma or with `&`; an OR condition is written with `|`.
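The three forms can be compared on a toy data frame. This is only a sketch: the values are invented, and the choice of `m1ac2 == 1` and `m1ac4b == 2014` as the two conditions is an assumption made for illustration.

```r
library(dplyr)

# Toy data: m1ac2 (gender code) and m1ac4b (birth year) with invented values
toy <- data.frame(
  m1ac2  = c(1, 2, 1, 2, 1),
  m1ac4b = c(2014, 2014, 1996, 1958, 2014)
)

# AND written with a comma: both conditions must hold
and_comma <- toy %>% filter(m1ac2 == 1, m1ac4b == 2014)

# AND written with &: same rows as the comma form
and_amp <- toy %>% filter(m1ac2 == 1 & m1ac4b == 2014)

# OR written with |: a row is kept if at least one condition holds
either <- toy %>% filter(m1ac2 == 1 | m1ac4b == 2014)

c(nrow(and_comma), nrow(and_amp), nrow(either))
```

The OR form can never return fewer rows than the AND form, which gives a quick sanity check on row counts.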
## # A tibble: 21,848 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 2 "" 1 2 9 1957
## 2 1 1 4 8 14 2 "" 1 7 7 1996
## 3 1 1 4 8 15 1 "" 1 1 6 1979
## 4 1 1 4 8 15 3 "" 1 3 1 2008
## 5 1 1 4 8 15 4 "" 2 3 7 2010
## 6 1 1 7 6 13 1 "" 1 1 5 1953
## 7 1 1 7 6 13 3 "" 1 6 4 2002
## 8 1 1 7 6 14 1 "" 1 1 1 1954
## 9 1 1 7 6 15 2 "" 1 2 4 1943
## 10 1 1 7 6 15 3 "" 1 3 5 1984
## # ... with 21,838 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 36,080 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 1 "" 2 1 11 1958
## 2 1 1 4 8 13 2 "" 1 2 " 9" 1957
## 3 1 1 4 8 14 1 "" 2 1 " 4" 1953
## 4 1 1 4 8 14 2 "" 1 7 " 7" 1996
## 5 1 1 4 8 15 1 "" 1 1 " 6" 1979
## 6 1 1 4 8 15 2 "" 2 2 11 1981
## 7 1 1 4 8 15 3 "" 1 3 " 1" 2008
## 8 1 1 4 8 15 4 "" 2 3 " 7" 2010
## 9 1 1 4 8 15 5 "" 2 4 " 8" 1954
## 10 1 1 7 6 13 1 "" 1 1 " 5" 1953
## # ... with 36,070 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 3,130 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 1 "" 2 1 11 1958
## 2 1 1 7 6 13 1 "" 1 1 " 5" 1953
## 3 1 1 16 20 13 1 "" 2 1 12 1963
## 4 1 1 22 19 13 1 "" 2 1 " 2" 1941
## 5 1 1 28 25 13 1 "" 2 1 " 4" 1959
## 6 1 1 34 10 14 1 "" 2 1 " 7" 1959
## 7 1 2 40 12 13 1 "" 1 1 " 4" 1948
## 8 1 2 55 11 13 1 "" 1 1 11 1941
## 9 1 2 67 16 13 1 "" 1 1 " 5" 1961
## 10 1 2 79 10 14 1 "" 1 1 " 3" 1952
## # ... with 3,120 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 5 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 79 784 27592 10 14 1 "" 1 1 " 1" 1935
## 2 96 966 32062 13 20 3 "" 1 3 " 1" 2001
## 3 62 608 23305 6 15 2 "" 1 2 12 1965
## 4 45 462 19358 3 13 2 "" 2 2 " 4" 1968
## 5 36 356 13657 22 15 1 "" 2 1 " 4" 1953
## # ... with 19 more variables: m1ac5 <dbl>, m1ac6 <dbl+lbl>,
## # m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>, m1ac8 <dbl+lbl>,
## # m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>, m1ac11 <dbl+lbl>,
## # m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>, m1ac14a <dbl+lbl>, m1ac14b <dbl>,
## # m1ac15a <dbl+lbl>, m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>,
## # m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 361 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 44 454 19093 7 15 2 "" 2 2 " 9" 1969
## 2 12 106 3424 7 13 1 "" 1 1 " 5" 1995
## 3 75 732 26080 8 15 3 "" 1 3 " 3" 2004
## 4 70 690 25336 20 14 2 "" 2 3 " 4" 1998
## 5 56 572 22579 7 15 1 "" 1 1 " 1" 1963
## 6 17 157 5287 3 14 1 "" 1 1 " 8" 1979
## 7 94 947 31729 6 15 1 "" 1 1 11 1977
## 8 49 517 20971 12 13 1 "" 1 1 12 1971
## 9 58 587 22891 6 13 2 "" 2 2 " 7" 1952
## 10 45 464 19363 12 14 1 "" 1 1 " 1" 1957
## # ... with 351 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 5 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 1 "" 2 1 11 1958
## 2 1 1 4 8 13 2 "" 1 2 " 9" 1957
## 3 1 1 4 8 14 1 "" 2 1 " 4" 1953
## 4 1 1 4 8 14 2 "" 1 7 " 7" 1996
## 5 1 1 4 8 15 1 "" 1 1 " 6" 1979
## # ... with 19 more variables: m1ac5 <dbl>, m1ac6 <dbl+lbl>,
## # m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>, m1ac8 <dbl+lbl>,
## # m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>, m1ac11 <dbl+lbl>,
## # m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>, m1ac14a <dbl+lbl>, m1ac14b <dbl>,
## # m1ac15a <dbl+lbl>, m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>,
## # m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 2 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 2 "" 1 2 9 1957
## 2 1 1 4 8 14 2 "" 1 7 7 1996
## # ... with 19 more variables: m1ac5 <dbl>, m1ac6 <dbl+lbl>,
## # m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>, m1ac8 <dbl+lbl>,
## # m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>, m1ac11 <dbl+lbl>,
## # m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>, m1ac14a <dbl+lbl>, m1ac14b <dbl>,
## # m1ac15a <dbl+lbl>, m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>,
## # m1ac15d <dbl+lbl>, ky <dbl>
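The short tables above look like random samples and row slices: a 5-row table with scattered provinces, a 361-row table (about 1% of 36,080), and tables holding the first few rows. A hedged sketch of how such subsets can be taken with dplyr, on an invented data frame; which functions produced the printed tables is an assumption.

```r
library(dplyr)

# Toy data: 200 invented rows
toy <- data.frame(id = 1:200)

set.seed(1)  # make the random draws reproducible

five_random <- toy %>% sample_n(5)       # a fixed number of random rows
half_random <- toy %>% sample_frac(0.5)  # a fraction of the rows, drawn at random
first_five  <- toy %>% slice(1:5)        # rows by position
first_two   <- toy %>% head(2)           # base R head() also works in a pipe

c(nrow(five_random), nrow(half_random), nrow(first_five), nrow(first_two))
```

Without `set.seed()` the sampled rows change on every run, so fix the seed when the result needs to be reproducible.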
## # A tibble: 36,080 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 1 1 4 8 13 1 "" 2 1 11 1958
## 2 1 1 4 8 13 2 "" 1 2 " 9" 1957
## 3 1 1 4 8 14 1 "" 2 1 " 4" 1953
## 4 1 1 4 8 14 2 "" 1 7 " 7" 1996
## 5 1 1 4 8 15 1 "" 1 1 " 6" 1979
## 6 1 1 4 8 15 2 "" 2 2 11 1981
## 7 1 1 4 8 15 3 "" 1 3 " 1" 2008
## 8 1 1 4 8 15 4 "" 2 3 " 7" 2010
## 9 1 1 4 8 15 5 "" 2 4 " 8" 1954
## 10 1 1 7 6 13 1 "" 1 1 " 5" 1953
## # ... with 36,070 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 36,080 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 96 964 31999 2 13 1 "" 1 1 " 9" 1961
## 2 96 964 31999 2 13 2 "" 2 2 10 1965
## 3 96 964 31999 2 15 1 "" 2 1 -2 1924
## 4 96 964 31999 2 15 2 "" 1 3 " 3" 1952
## 5 96 964 31999 2 15 3 "" 2 3 12 1969
## 6 96 964 31999 2 15 4 "" 1 6 " 3" 1997
## 7 96 964 31999 2 15 5 "" 1 6 " 4" 1998
## 8 96 964 31999 2 20 1 "" 2 1 -2 1958
## 9 96 964 31999 2 20 2 "" 1 2 -2 1958
## 10 96 964 31999 2 20 3 "" 2 3 " 3" 1983
## # ... with 36,070 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
## # A tibble: 36,080 x 30
## tinh huyen xa diaban hoso matv m1ac1 m1ac2 m1ac3 m1ac4a m1ac4b
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl+> <dbl>
## 1 96 973 32233 6 13 1 "" 1 1 " 5" 1949
## 2 96 973 32233 6 13 2 "" 2 2 " 1" 1948
## 3 96 973 32233 6 14 1 "" 1 1 " 1" 1987
## 4 96 973 32233 6 14 2 "" 2 2 " 6" 1988
## 5 96 973 32233 6 14 3 "" 1 3 " 9" 2008
## 6 96 973 32233 6 14 4 "" 2 3 " 9" 2010
## 7 96 973 32233 6 15 1 "" 1 1 " 4" 1970
## 8 96 973 32233 6 15 2 "" 2 7 " 6" 1982
## 9 96 973 32233 6 15 3 "" 1 7 " 5" 2006
## 10 96 973 32233 6 15 4 "" 2 7 11 2012
## # ... with 36,070 more rows, and 19 more variables: m1ac5 <dbl>,
## # m1ac6 <dbl+lbl>, m1ac7a <dbl+lbl>, m1ac7b <dbl+lbl>, m1ac7c <dbl+lbl>,
## # m1ac8 <dbl+lbl>, m1ac9 <dbl+lbl>, m1ac10 <dbl+lbl>, m1ama1 <dbl>,
## # m1ac11 <dbl+lbl>, m1ac12 <dbl+lbl>, m1ac13 <dbl+lbl>,
## # m1ac14a <dbl+lbl>, m1ac14b <dbl>, m1ac15a <dbl+lbl>,
## # m1ac15b <dbl+lbl>, m1ac15c <dbl+lbl>, m1ac15d <dbl+lbl>, ky <dbl>
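The last three tables differ only in row order: ascending `tinh`, then `tinh` descending, then both `tinh` and `huyen` descending. This is what `arrange()` with and without `desc()` produces. A minimal sketch on invented values:

```r
library(dplyr)

# Toy data: province (tinh) and district (huyen) codes, values invented
toy <- data.frame(
  tinh  = c(2, 96, 1, 96),
  huyen = c(30, 964, 1, 973)
)

asc       <- toy %>% arrange(tinh)                     # ascending by tinh
desc_tinh <- toy %>% arrange(desc(tinh))               # descending by tinh
desc_both <- toy %>% arrange(desc(tinh), desc(huyen))  # ties in tinh broken by huyen, descending

desc_both
```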
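# Keep only the identifier columns plus gender (m1ac2) and birth year (m1ac4b)
muc1a_subset <- muc1a %>%
  select(tinh, huyen, xa, diaban, hoso, matv, m1ac2, m1ac4b)
# Build a member id by pasting the identifier columns together
muc1a_subset %>% mutate(id = paste0(tinh, huyen, xa, diaban, hoso, matv))
## # A tibble: 36,080 x 9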
## tinh huyen xa diaban hoso matv m1ac2 m1ac4b id
## <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lbl> <dbl> <chr>
## 1 1 1 4 8 13 1 2 1958 1148131
## 2 1 1 4 8 13 2 1 1957 1148132
## 3 1 1 4 8 14 1 2 1953 1148141
## 4 1 1 4 8 14 2 1 1996 1148142
## 5 1 1 4 8 15 1 1 1979 1148151
## 6 1 1 4 8 15 2 2 1981 1148152
## 7 1 1 4 8 15 3 1 2008 1148153
## 8 1 1 4 8 15 4 2 2010 1148154
## 9 1 1 4 8 15 5 2 1954 1148155
## 10 1 1 7 6 13 1 1 1953 1176131
## # ... with 36,070 more rows
## # A tibble: 36,080 x 9
## tinh huyen xa diaban hoso matv m1ac2 m1ac4b age
## <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lbl> <dbl> <dbl>
## 1 1 1 4 8 13 1 2 1958 60
## 2 1 1 4 8 13 2 1 1957 61
## 3 1 1 4 8 14 1 2 1953 65
## 4 1 1 4 8 14 2 1 1996 22
## 5 1 1 4 8 15 1 1 1979 39
## 6 1 1 4 8 15 2 2 1981 37
## 7 1 1 4 8 15 3 1 2008 10
## 8 1 1 4 8 15 4 2 2010 8
## 9 1 1 4 8 15 5 2 1954 64
## 10 1 1 7 6 13 1 1 1953 65
## # ... with 36,070 more rows
## # A tibble: 36,080 x 9
## tinh huyen xa diaban hoso matv m1ac2 m1ac4b age_lg
## <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lbl> <dbl> <lgl>
## 1 1 1 4 8 13 1 2 1958 FALSE
## 2 1 1 4 8 13 2 1 1957 FALSE
## 3 1 1 4 8 14 1 2 1953 FALSE
## 4 1 1 4 8 14 2 1 1996 FALSE
## 5 1 1 4 8 15 1 1 1979 FALSE
## 6 1 1 4 8 15 2 2 1981 FALSE
## 7 1 1 4 8 15 3 1 2008 TRUE
## 8 1 1 4 8 15 4 2 2010 TRUE
## 9 1 1 4 8 15 5 2 1954 FALSE
## 10 1 1 7 6 13 1 1 1953 FALSE
## # ... with 36,070 more rows
## # A tibble: 36,080 x 9
## tinh huyen xa diaban hoso matv m1ac2 m1ac4b ha_noi
## <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lbl> <dbl> <lgl>
## 1 1 1 4 8 13 1 2 1958 TRUE
## 2 1 1 4 8 13 2 1 1957 TRUE
## 3 1 1 4 8 14 1 2 1953 TRUE
## 4 1 1 4 8 14 2 1 1996 TRUE
## 5 1 1 4 8 15 1 1 1979 TRUE
## 6 1 1 4 8 15 2 2 1981 TRUE
## 7 1 1 4 8 15 3 1 2008 TRUE
## 8 1 1 4 8 15 4 2 2010 TRUE
## 9 1 1 4 8 15 5 2 1954 TRUE
## 10 1 1 7 6 13 1 1 1953 TRUE
## # ... with 36,070 more rows
## # A tibble: 36,080 x 9
## tinh huyen xa diaban hoso matv m1ac2 m1ac4b ha_noi_ha_giang
## <dbl+lb> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lb> <dbl> <lgl>
## 1 1 1 4 8 13 1 2 1958 TRUE
## 2 1 1 4 8 13 2 1 1957 TRUE
## 3 1 1 4 8 14 1 2 1953 TRUE
## 4 1 1 4 8 14 2 1 1996 TRUE
## 5 1 1 4 8 15 1 1 1979 TRUE
## 6 1 1 4 8 15 2 2 1981 TRUE
## 7 1 1 4 8 15 3 1 2008 TRUE
## 8 1 1 4 8 15 4 2 2010 TRUE
## 9 1 1 4 8 15 5 2 1954 TRUE
## 10 1 1 7 6 13 1 1 1953 TRUE
## # ... with 36,070 more rows
## # A tibble: 36,080 x 9
## tinh huyen xa diaban hoso matv m1ac2 m1ac4b gender
## <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lbl> <dbl> <chr>
## 1 1 1 4 8 13 1 2 1958 Nu
## 2 1 1 4 8 13 2 1 1957 Nam
## 3 1 1 4 8 14 1 2 1953 Nu
## 4 1 1 4 8 14 2 1 1996 Nam
## 5 1 1 4 8 15 1 1 1979 Nam
## 6 1 1 4 8 15 2 2 1981 Nu
## 7 1 1 4 8 15 3 1 2008 Nam
## 8 1 1 4 8 15 4 2 2010 Nu
## 9 1 1 4 8 15 5 2 1954 Nu
## 10 1 1 7 6 13 1 1 1953 Nam
## # ... with 36,070 more rows
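The new columns in the tables above (`age`, `age_lg`, `ha_noi`, `ha_noi_ha_giang`, `gender`) are typical `mutate()` results: arithmetic, logical tests, and a two-way recode. A sketch on invented data follows. The reference year 2018 and the `if_else()` recode match code that appears later in this document; the threshold used for `age_lg` and the province codes 1 and 2 are assumptions.

```r
library(dplyr)

# Toy data: values are invented
toy <- data.frame(
  tinh   = c(1, 2, 79),
  m1ac2  = c(2, 1, 1),
  m1ac4b = c(1958, 2010, 1996)
)

out <- toy %>%
  mutate(
    age             = 2018 - m1ac4b,                   # numeric column from arithmetic
    age_lg          = age < 18,                        # logical flag (threshold is an assumption)
    ha_noi          = tinh == 1,                       # TRUE for province code 1
    ha_noi_ha_giang = tinh %in% c(1, 2),               # TRUE for codes 1 or 2
    gender          = if_else(m1ac2 == 1, "Nam", "Nu") # "Nam" = male, "Nu" = female
  )

out
```

Columns created earlier in the same `mutate()` call, such as `age`, can be used by later expressions in that call.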
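# Caution: m1ac4b holds the birth year (1958, 1957, ...), so with this data
# neither condition is TRUE and every row gets ">= 29";
# compare an age column instead to obtain real age groups
muc1a_subset %>% mutate(nhom_tuoi = case_when(m1ac4b < 18 ~ "Duoi 18",
                                              m1ac4b < 29 ~ "[18, 29)",
                                              TRUE ~ ">= 29"))
## # A tibble: 36,080 x 9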
## tinh huyen xa diaban hoso matv m1ac2 m1ac4b nhom_tuoi
## <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lbl> <dbl> <chr>
## 1 1 1 4 8 13 1 2 1958 >= 29
## 2 1 1 4 8 13 2 1 1957 >= 29
## 3 1 1 4 8 14 1 2 1953 >= 29
## 4 1 1 4 8 14 2 1 1996 >= 29
## 5 1 1 4 8 15 1 1 1979 >= 29
## 6 1 1 4 8 15 2 2 1981 >= 29
## 7 1 1 4 8 15 3 1 2008 >= 29
## 8 1 1 4 8 15 4 2 2010 >= 29
## 9 1 1 4 8 15 5 2 1954 >= 29
## 10 1 1 7 6 13 1 1 1953 >= 29
## # ... with 36,070 more rows
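Every row in the table above lands in ">= 29" because the conditions compare `m1ac4b`, which holds birth years, against 18 and 29. A hedged sketch of the grouping computed on age instead, using invented birth years and the same 2018 reference year that appears elsewhere in this document; the label "Duoi 18" (under 18) is chosen here to match the first cutoff.

```r
library(dplyr)

# Toy data: three invented birth years
toy <- data.frame(m1ac4b = c(1958, 1996, 2008))

out <- toy %>%
  mutate(
    age = 2018 - m1ac4b,
    # case_when() checks conditions in order and uses the first one that is TRUE
    nhom_tuoi = case_when(
      age < 18 ~ "Duoi 18",
      age < 29 ~ "[18, 29)",
      TRUE     ~ ">= 29"     # default for everything else
    )
  )

out
```

Because the conditions are evaluated in order, the second branch only sees ages of 18 and above, so it does not need an explicit lower bound.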
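muc1a %>%
  select(tinh, huyen, xa, diaban, hoso, matv, m1ac2, m1ac4b) %>%
  mutate(age = 2018 - m1ac4b,                        # age in 2018 from the birth year
         gender = if_else(m1ac2 == 1, "Nam", "Nu"),  # "Nam" = male, "Nu" = female
         # compares the birth year, so every row falls into ">= 29"
         nhom_tuoi = case_when(m1ac4b < 18 ~ "Duoi 18",
                               m1ac4b < 29 ~ "[18, 29)",
                               TRUE ~ ">= 29")) %>%
  filter(gender == "Nu") %>%       # keep women only
  arrange(tinh, desc(huyen))       # province ascending, district descending
## # A tibble: 18,362 x 11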
## tinh huyen xa diaban hoso matv m1a~ m1ac4b age gender nhom_tuoi
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <db> <dbl> <dbl> <chr> <chr>
## 1 1 282 10441 14 13 2 2 1971 47 Nu >= 29
## 2 1 282 10441 14 13 3 2 1995 23 Nu >= 29
## 3 1 282 10441 14 13 4 2 2006 12 Nu >= 29
## 4 1 282 10441 14 14 2 2 1954 64 Nu >= 29
## 5 1 282 10441 14 14 4 2 1983 35 Nu >= 29
## 6 1 282 10441 14 14 5 2 2008 10 Nu >= 29
## 7 1 282 10441 14 15 2 2 1968 50 Nu >= 29
## 8 1 282 10441 14 15 3 2 1994 24 Nu >= 29
## 9 1 282 10450 11 13 2 2 1979 39 Nu >= 29
## 10 1 282 10450 11 13 4 2 2000 18 Nu >= 29
## # ... with 18,352 more rows
muc1a %>%
  select(tinh, huyen, xa, diaban, hoso, matv, m1ac2, m1ac4b) %>%
  mutate(age = 2018 - m1ac4b,
         gender = if_else(m1ac2 == 1, "Nam", "Nu"),
         # compares the birth year, so every row falls into ">= 29"
         nhom_tuoi = case_when(m1ac4b < 18 ~ "Duoi 18",
                               m1ac4b < 29 ~ "[18, 29)",
                               TRUE ~ ">= 29")) %>%
  group_by(gender) %>%
  filter(!is.na(age)) %>%          # drop rows with missing age before summarising
  # unnamed summaries: the output columns are named after the expressions
  summarise(mean(age),
            max(age),
            min(age),
            median(age),
            sd(age),
            n()
  )
## # A tibble: 2 x 7
## gender `mean(age)` `max(age)` `min(age)` `median(age)` `sd(age)` `n()`
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 Nam 36.0 109 4 34 20.6 17718
## 2 Nu 38.6 106 4 37 21.9 18362
muc1a %>%
  select(tinh, huyen, xa, diaban, hoso, matv, m1ac2, m1ac4b) %>%
  mutate(age = 2018 - m1ac4b,
         gender = if_else(m1ac2 == 1, "Nam", "Nu"),
         # compares the birth year, so every row falls into ">= 29"
         nhom_tuoi = case_when(m1ac4b < 18 ~ "Duoi 18",
                               m1ac4b < 29 ~ "[18, 29)",
                               TRUE ~ ">= 29")) %>%
  group_by(gender) %>%
  # named summaries with na.rm = TRUE instead of filtering out missing ages
  summarise(mean_age = mean(age, na.rm = TRUE),
            max_age = max(age, na.rm = TRUE),
            min_age = min(age, na.rm = TRUE),
            median_age = median(age, na.rm = TRUE),
            sd_age = sd(age, na.rm = TRUE),
            cnt = n()
  )
## # A tibble: 2 x 7
## gender mean_age max_age min_age median_age sd_age cnt
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 Nam 36.0 109 4 34 20.6 17718
## 2 Nu 38.6 106 4 37 21.9 18362
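The two summaries above return the same numbers, which suggests `age` has no missing values in this data. When missing values are present, the two approaches differ in one respect: filtering removes the rows before `n()` counts them, while `na.rm = TRUE` only skips them inside each statistic. A minimal sketch on invented data:

```r
library(dplyr)

# Toy data: one missing age in the "Nam" group, values invented
toy <- data.frame(
  gender = c("Nam", "Nam", "Nu", "Nu"),
  age    = c(10, NA, 20, 40)
)

# Approach 1: drop rows with missing age, then summarise
by_filter <- toy %>%
  filter(!is.na(age)) %>%
  group_by(gender) %>%
  summarise(mean_age = mean(age), cnt = n())

# Approach 2: keep all rows and skip NA inside the statistic
by_na_rm <- toy %>%
  group_by(gender) %>%
  summarise(mean_age = mean(age, na.rm = TRUE), cnt = n())

by_filter$cnt  # counts only rows with a known age
by_na_rm$cnt   # counts every row in the group
```

The means agree in both approaches; only the row counts differ.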