This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
# Đọc dữ liệu từ ổ D
fa <- read.csv("D:/fage.csv")
# Xem qua dữ liệu
str(fa)
## 'data.frame': 100 obs. of 26 variables:
## $ id : int 1 2 3 4 5 6 7 8 9 10 ...
## $ gender : int 2 2 2 2 2 1 2 2 2 1 ...
## $ age : int 24 19 24 24 24 21 23 23 24 23 ...
## $ level : int 2 2 3 2 2 2 3 2 2 2 ...
## $ job : int 2 1 3 4 4 1 3 3 3 2 ...
## $ mar : int 1 1 1 1 3 1 1 1 1 1 ...
## $ inc : int 72 60 240 72 120 36 72 72 84 60 ...
## $ spending : int 6 3 10 5 10 4 10 8 2 5 ...
## $ weight : int 57 47 48 43 42 55 75 49 55 74 ...
## $ height : int 160 155 163 150 162 175 164 155 160 180 ...
## $ skin : int 2 5 1 3 4 2 1 3 3 4 ...
## $ hair : int 4 1 2 2 2 1 1 2 1 1 ...
## $ style : int 1 2 1 7 3 1 1 3 7 3 ...
## $ life : int 2 6 1 7 6 6 1 7 6 3 ...
## $ fashion.age: int 25 15 24 20 15 30 21 18 24 18 ...
## $ DG1 : int 26 20 27 26 27 23 30 24 27 25 ...
## $ DG2 : int 25 22 22 22 22 23 25 23 24 23 ...
## $ DG3 : int 26 22 25 26 26 21 30 25 23 24 ...
## $ DG4 : int 24 22 26 23 30 27 29 30 33 23 ...
## $ DG5 : int 23 21 23 22 19 22 24 22 24 22 ...
## $ DG6 : int 24 24 24 25 25 27 25 26 26 21 ...
## $ DG7 : int 28 26 25 30 23 30 28 28 27 22 ...
## $ DG8 : int 24 23 27 30 20 24 27 25 35 25 ...
## $ DG9 : int 30 27 23 24 31 26 25 32 34 22 ...
## $ DG10 : int 22 21 25 24 20 23 27 25 24 21 ...
## $ fage : num 25.2 22.8 24.7 25.2 24.3 24.6 27 26 27.7 22.8 ...
summary(fa)
## id gender age level job
## Min. : 1.00 Min. :1.00 Min. :18.00 Min. :1.00 Min. :1.00
## 1st Qu.: 25.75 1st Qu.:1.00 1st Qu.:24.00 1st Qu.:1.00 1st Qu.:2.00
## Median : 50.50 Median :2.00 Median :31.00 Median :2.00 Median :2.00
## Mean : 50.50 Mean :1.59 Mean :38.45 Mean :1.84 Mean :2.32
## 3rd Qu.: 75.25 3rd Qu.:2.00 3rd Qu.:53.50 3rd Qu.:2.00 3rd Qu.:3.00
## Max. :100.00 Max. :2.00 Max. :75.00 Max. :3.00 Max. :5.00
## mar inc spending weight height
## Min. :1 Min. : 36.00 Min. : 2.0 Min. :40.00 Min. :150.0
## 1st Qu.:1 1st Qu.: 65.75 1st Qu.: 3.0 1st Qu.:49.75 1st Qu.:157.0
## Median :2 Median :120.00 Median : 4.0 Median :55.50 Median :160.0
## Mean :2 Mean :122.85 Mean : 5.1 Mean :55.56 Mean :162.5
## 3rd Qu.:3 3rd Qu.:170.00 3rd Qu.: 6.0 3rd Qu.:60.00 3rd Qu.:168.0
## Max. :3 Max. :300.00 Max. :25.0 Max. :75.00 Max. :180.0
## skin hair style life fashion.age
## Min. :1.00 Min. :1.00 Min. :1.00 Min. :1.00 Min. :15.00
## 1st Qu.:3.00 1st Qu.:1.00 1st Qu.:2.00 1st Qu.:3.75 1st Qu.:23.00
## Median :3.00 Median :1.00 Median :3.00 Median :6.00 Median :31.00
## Mean :3.04 Mean :1.51 Mean :4.25 Mean :5.21 Mean :38.22
## 3rd Qu.:3.00 3rd Qu.:2.00 3rd Qu.:7.00 3rd Qu.:7.00 3rd Qu.:55.00
## Max. :5.00 Max. :4.00 Max. :7.00 Max. :8.00 Max. :75.00
## DG1 DG2 DG3 DG4 DG5
## Min. :19.00 Min. :20.00 Min. :17.0 Min. :19.00 Min. :19.00
## 1st Qu.:24.00 1st Qu.:26.50 1st Qu.:25.0 1st Qu.:24.75 1st Qu.:23.00
## Median :33.00 Median :34.50 Median :31.0 Median :32.00 Median :30.00
## Mean :39.39 Mean :39.68 Mean :38.8 Mean :38.19 Mean :37.80
## 3rd Qu.:53.00 3rd Qu.:53.50 3rd Qu.:52.0 3rd Qu.:50.25 3rd Qu.:55.25
## Max. :77.00 Max. :75.00 Max. :81.0 Max. :80.00 Max. :72.00
## DG6 DG7 DG8 DG9
## Min. :17.00 Min. :18.00 Min. :18.00 Min. :17.00
## 1st Qu.:22.75 1st Qu.:27.00 1st Qu.:25.00 1st Qu.:24.00
## Median :29.00 Median :36.00 Median :35.50 Median :35.00
## Mean :37.29 Mean :42.82 Mean :41.52 Mean :40.87
## 3rd Qu.:50.50 3rd Qu.:56.25 3rd Qu.:56.00 3rd Qu.:55.50
## Max. :80.00 Max. :82.00 Max. :90.00 Max. :80.00
## DG10 fage
## Min. :20.00 Min. :19.70
## 1st Qu.:25.00 1st Qu.:24.57
## Median :30.00 Median :34.20
## Mean :38.57 Mean :39.49
## 3rd Qu.:55.00 3rd Qu.:52.25
## Max. :80.00 Max. :76.70
#cài đặt và nạp gói BMA
# Cài đặt nếu chưa có
if(!require(BMA)) install.packages("BMA")
library(BMA)
#kiểm tra dữ liệu thiếu
# Loại bỏ dòng có giá trị NA
fa <- na.omit(fa)
#thiết lập mô hình BMA
# tách biến đầu ra và đầu vào
y <- fa$fage
X <- fa[, setdiff(names(fa), "fage")]
# chạy BMA
bma_model <- bicreg(x = X, y = y, strict = FALSE)
# xem kết quả
summary(bma_model)
##
## Call:
## bicreg(x = X, y = y, strict = FALSE)
##
##
## 164 models were selected
## Best 5 models (cumulative posterior probability = 0.1911 ):
##
## p!=0 EV SD model 1 model 2 model 3
## Intercept 100.0 4.779e-03 0.5975676 -0.34055 0.62383 -0.15024
## id 3.4 -3.941e-05 0.0006168 . . .
## gender 3.4 -3.186e-04 0.0356225 . . .
## age 18.8 1.984e-02 0.0448326 0.09394 0.13941 .
## level 3.4 5.544e-04 0.0448642 . . .
## job 3.1 -2.705e-04 0.0232001 . . .
## mar 3.8 -1.262e-04 0.0369848 . . .
## inc 3.4 -3.722e-06 0.0002834 . . .
## spending 3.4 1.552e-04 0.0048059 . . .
## weight 3.8 8.839e-05 0.0032569 . . .
## height 3.8 1.027e-04 0.0020970 . . .
## skin 3.1 1.037e-04 0.0157705 . . .
## hair 3.4 2.128e-03 0.0251212 . . .
## style 3.4 -1.721e-04 0.0056126 . . .
## life 3.1 4.149e-04 0.0084369 . . .
## fashion.age 4.6 2.152e-03 0.0124676 . . .
## DG1 53.0 7.810e-02 0.0776411 . . 0.12870
## DG2 60.2 1.020e-01 0.0874263 0.18576 . 0.14088
## DG3 73.2 9.563e-02 0.0599329 . 0.14050 0.13455
## DG4 61.9 7.632e-02 0.0616520 0.13320 . .
## DG5 64.6 8.296e-02 0.0642319 . 0.12772 .
## DG6 90.1 1.037e-01 0.0366146 0.12305 0.11599 0.11065
## DG7 98.9 1.066e-01 0.0182055 0.10997 0.10314 0.09531
## DG8 99.6 1.109e-01 0.0152951 0.11137 0.11650 0.12659
## DG9 100.0 1.105e-01 0.0144053 0.11591 0.11821 0.12259
## DG10 87.9 1.083e-01 0.0424931 0.13236 0.12497 0.13715
##
## nVar 8 8 8
## r2 0.999 0.999 0.999
## BIC -653.93417 -653.93417 -653.93417
## post prob 0.038 0.038 0.038
## model 4 model 5
## Intercept -0.46101 0.55391
## id . .
## gender . .
## age . .
## level . .
## job . .
## mar . .
## inc . .
## spending . .
## weight . .
## height . .
## skin . .
## hair . .
## style . .
## life . .
## fashion.age . .
## DG1 0.10662 0.15935
## DG2 0.19340 .
## DG3 . 0.13734
## DG4 0.12856 0.12802
## DG5 . 0.15554
## DG6 0.11698 0.11443
## DG7 0.10969 0.09973
## DG8 0.10754 0.10399
## DG9 0.11249 0.09078
## DG10 0.13089 .
##
## nVar 8 8
## r2 0.999 0.999
## BIC -653.93417 -653.93417
## post prob 0.038 0.038
#5.diễn giải kết quả
# In bảng xác suất mô hình
imageplot.bma(bma_model)
# Biến quan trọng nhất
coef(bma_model)
## NULL
#6.dự báo fage và đánh giá
# Dự báo trên tập dữ liệu gốc
fage_pred <- predict(bma_model, newdata = X)
length(y)
## [1] 100
length(fage_pred)
## [1] 3
fage_pred <- predict(bma_model, newdata = X)$mean
# So sánh dự báo và thực tế
plot(y, fage_pred, main = "Thực tế vs Dự báo", xlab = "fage thực tế", ylab = "fage dự báo")
abline(0, 1, col = "red")
# Tính sai số RMSE
rmse <- sqrt(mean((y - fage_pred)^2))
cat("RMSE:", rmse)
## RMSE: 0.1413567
#kết luận
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the
code chunk to prevent printing of the R code that generated the
plot.