---
title: "Data Analysis Class N1"
author: "vule"
date: "2024-09-08"
output: html_document
---

## Task 3. Read the data into R

    > # t = file.choose()
    > ob <- read_csv("C:/Users/ADMIN/Desktop/Khóa học NCKH 07.09.2024/obesity data.csv")
    Rows: 1217 Columns: 13

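The absolute path in the call above only resolves on the author's machine. As a hedged sketch of a more portable pattern, the block below writes a tiny stand-in file to a temporary folder and reads it back. It assumes base R's `read.csv()` in place of `read_csv()`, and the name `ob_demo` and the three rows (copied from the `head(ob)` output in this document) are illustrative, not the course data.

```r
# Illustrative stand-in for "obesity data.csv": three rows and four columns
# copied from the head(ob) output shown in this document.
demo <- data.frame(
  id     = c(4, 5, 6),
  gender = c("F", "M", "F"),
  height = c(156, 160, 153),
  weight = c(53, 51, 47)
)

# Build the path from parts; the file name deliberately contains a space.
csv_path <- file.path(tempdir(), "obesity data.csv")
write.csv(demo, csv_path, row.names = FALSE)

# Stop early if the file is not where we expect it.
stopifnot(file.exists(csv_path))

# read.csv() is base R; readr::read_csv(csv_path) is the tidyverse analogue.
# Forcing gender to character guards against a column of only "F" values
# being converted to logical FALSE by base R's type guessing.
ob_demo <- read.csv(csv_path, colClasses = c(gender = "character"))
dim(ob_demo)
```

The commented-out `file.choose()` line in the original serves a similar purpose interactively: it opens a file dialog and returns the chosen path as a string.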
## Task 4. Information about the ob data

### How many variables and observations are there?

    > dim(ob)
    [1] 1217   13

### List the first 6 observations

    > head(ob)
    # A tibble: 6 × 13
         id gender height weight   bmi   age WBBMC wbbmd   fat  lean pcfat
    4     4 F         156     53  21.8    56  1171   0.8 17472 33094  33.8
    5     5 M         160     51  19.9    54  1681  0.98  7336 40621  14.8
    6     6 F         153     47  20.1    52  1358  0.91 14904 30068  32.2
    # ℹ 2 more variables: hypertension

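To make the inspection step concrete, the sketch below rebuilds the three rows visible in the `head(ob)` output as a small data frame (a stand-in named `ob_demo`, not the full data set) and recomputes `bmi` from the usual definition, weight in kilograms divided by height in metres squared. That `height` is recorded in centimetres and `weight` in kilograms is an assumption, although the three rows are consistent with it.

```r
# Stand-in data: rows 4 to 6 of head(ob), restricted to five columns.
ob_demo <- data.frame(
  id     = c(4, 5, 6),
  gender = c("F", "M", "F"),
  height = c(156, 160, 153),   # assumed to be centimetres
  weight = c(53, 51, 47),      # assumed to be kilograms
  bmi    = c(21.8, 19.9, 20.1)
)

nrow(ob_demo)      # number of observations
ncol(ob_demo)      # number of variables
str(ob_demo)       # storage type of each column
head(ob_demo, 2)   # first n rows; n defaults to 6 when omitted

# Recompute BMI, round to one decimal, and compare with the stored column.
bmi_check <- round(ob_demo$weight / (ob_demo$height / 100)^2, 1)
bmi_matches <- isTRUE(all.equal(bmi_check, ob_demo$bmi))
bmi_matches
```

A check like this is a cheap way to confirm units before any modelling, since a height column in metres or inches would make the recomputed values disagree with `bmi`.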
    > summary(ob)
           id            gender              height          weight
     Min.   :   1.0   Length:1217        Min.   :136.0   Min.   :34.00
     1st Qu.: 309.0   Class :character   1st Qu.:151.0   1st Qu.:49.00
     Median : 615.0   Mode  :character   Median :155.0   Median :54.00
     Mean   : 614.5                      Mean   :156.7   Mean   :55.14
     3rd Qu.: 921.0                      3rd Qu.:162.0   3rd Qu.:61.00
     Max.   :1227.0                      Max.   :185.0   Max.   :95.00
          bmi             age            WBBMC           wbbmd            fat
     Min.   :14.5   Min.   :13.00   Min.   : 695   Min.   :0.650   Min.   : 4277
     1st Qu.:20.2   1st Qu.:35.00   1st Qu.:1481   1st Qu.:0.930   1st Qu.:13768
     Median :22.2   Median :48.00   Median :1707   Median :1.010   Median :16955
     Mean   :22.4   Mean   :47.15   Mean   :1725   Mean   :1.009   Mean   :17288
     3rd Qu.:24.3   3rd Qu.:58.00   3rd Qu.:1945   3rd Qu.:1.090   3rd Qu.:20325
     Max.   :37.1   Max.   :88.00   Max.   :3040   Max.   :1.350   Max.   :40825
          lean           pcfat       hypertension       diabetes
     Min.   :19136   Min.   : 9.2   Min.   :0.000   Min.   :0.0000
     1st Qu.:30325   1st Qu.:27.0   1st Qu.:0.000   1st Qu.:0.0000
     Median :33577   Median :32.4   Median :1.000   Median :0.0000
     Mean   :35463   Mean   :31.6   Mean   :0.507   Mean   :0.1109
     3rd Qu.:39761   3rd Qu.:36.8   3rd Qu.:1.000   3rd Qu.:0.0000
     Max.   :63059   Max.   :48.4   Max.   :1.000   Max.   :1.0000

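Because `hypertension` and `diabetes` take only the values 0 and 1, their `Mean` entries read directly as proportions: 0.507 corresponds to about 50.7% of participants coded 1, and 0.1109 to about 11.1%. A minimal sketch of why this holds, using a made-up 0/1 vector (`flag` is a hypothetical name, not a column of `ob`):

```r
# Hypothetical indicator variable: 1 = condition present, 0 = absent.
flag <- c(1, 0, 0, 1, 1, 0, 1, 1)

# The mean of a 0/1 vector is the share of 1s.
share_mean <- mean(flag)

# The same share obtained from a frequency table of the values.
freq <- table(flag)
share_table <- as.numeric(prop.table(freq)["1"])

share_mean
share_table
```

For such columns a frequency table is usually more informative than the quartiles that `summary()` reports, which can only ever be 0 or 1.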
## Task 5. Edit the data with the "tidyverse" package

### 5.1. Recode the gender variable

    > ob$sex[ob$gender == "F"] = 1
    > ob$sex[ob$gender == "M"] = 0
    >
    > ob$sex.b = ifelse(ob$gender == "F", 1, 0)
    > table(ob$sex, ob$sex.b)

          0   1
      0 355   0
      1   0 862
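
The heading names the tidyverse, while the commands above use base R indexing and `ifelse()`. For comparison, here is a sketch of the same recoding with `dplyr::mutate()` and `case_when()`. It assumes the dplyr package is installed and uses a small made-up tibble, `ob_demo`, rather than the course data.

```r
library(dplyr)

# Hypothetical stand-in for the gender column of ob.
ob_demo <- tibble(gender = c("F", "M", "F", "F", "M"))

ob_demo <- ob_demo %>%
  mutate(
    # 1 for female, 0 for male; any other value is left as NA
    sex = case_when(
      gender == "F" ~ 1,
      gender == "M" ~ 0
    )
  )

# Cross-tabulate original and recoded values to confirm the mapping.
count(ob_demo, gender, sex)
```

One difference in behaviour: `ifelse(gender == "F", 1, 0)` maps every non-missing value other than "F" to 0, including typos, whereas `case_when()` leaves unmatched values as `NA`, so miscoded entries stay visible.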