R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

# Đọc dữ liệu từ ổ D
fa <- read.csv("D:/fage.csv")

# Xem qua dữ liệu
str(fa)
## 'data.frame':    100 obs. of  26 variables:
##  $ id         : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ gender     : int  2 2 2 2 2 1 2 2 2 1 ...
##  $ age        : int  24 19 24 24 24 21 23 23 24 23 ...
##  $ level      : int  2 2 3 2 2 2 3 2 2 2 ...
##  $ job        : int  2 1 3 4 4 1 3 3 3 2 ...
##  $ mar        : int  1 1 1 1 3 1 1 1 1 1 ...
##  $ inc        : int  72 60 240 72 120 36 72 72 84 60 ...
##  $ spending   : int  6 3 10 5 10 4 10 8 2 5 ...
##  $ weight     : int  57 47 48 43 42 55 75 49 55 74 ...
##  $ height     : int  160 155 163 150 162 175 164 155 160 180 ...
##  $ skin       : int  2 5 1 3 4 2 1 3 3 4 ...
##  $ hair       : int  4 1 2 2 2 1 1 2 1 1 ...
##  $ style      : int  1 2 1 7 3 1 1 3 7 3 ...
##  $ life       : int  2 6 1 7 6 6 1 7 6 3 ...
##  $ fashion.age: int  25 15 24 20 15 30 21 18 24 18 ...
##  $ DG1        : int  26 20 27 26 27 23 30 24 27 25 ...
##  $ DG2        : int  25 22 22 22 22 23 25 23 24 23 ...
##  $ DG3        : int  26 22 25 26 26 21 30 25 23 24 ...
##  $ DG4        : int  24 22 26 23 30 27 29 30 33 23 ...
##  $ DG5        : int  23 21 23 22 19 22 24 22 24 22 ...
##  $ DG6        : int  24 24 24 25 25 27 25 26 26 21 ...
##  $ DG7        : int  28 26 25 30 23 30 28 28 27 22 ...
##  $ DG8        : int  24 23 27 30 20 24 27 25 35 25 ...
##  $ DG9        : int  30 27 23 24 31 26 25 32 34 22 ...
##  $ DG10       : int  22 21 25 24 20 23 27 25 24 21 ...
##  $ fage       : num  25.2 22.8 24.7 25.2 24.3 24.6 27 26 27.7 22.8 ...
summary(fa)
##        id             gender          age            level           job      
##  Min.   :  1.00   Min.   :1.00   Min.   :18.00   Min.   :1.00   Min.   :1.00  
##  1st Qu.: 25.75   1st Qu.:1.00   1st Qu.:24.00   1st Qu.:1.00   1st Qu.:2.00  
##  Median : 50.50   Median :2.00   Median :31.00   Median :2.00   Median :2.00  
##  Mean   : 50.50   Mean   :1.59   Mean   :38.45   Mean   :1.84   Mean   :2.32  
##  3rd Qu.: 75.25   3rd Qu.:2.00   3rd Qu.:53.50   3rd Qu.:2.00   3rd Qu.:3.00  
##  Max.   :100.00   Max.   :2.00   Max.   :75.00   Max.   :3.00   Max.   :5.00  
##       mar         inc            spending        weight          height     
##  Min.   :1   Min.   : 36.00   Min.   : 2.0   Min.   :40.00   Min.   :150.0  
##  1st Qu.:1   1st Qu.: 65.75   1st Qu.: 3.0   1st Qu.:49.75   1st Qu.:157.0  
##  Median :2   Median :120.00   Median : 4.0   Median :55.50   Median :160.0  
##  Mean   :2   Mean   :122.85   Mean   : 5.1   Mean   :55.56   Mean   :162.5  
##  3rd Qu.:3   3rd Qu.:170.00   3rd Qu.: 6.0   3rd Qu.:60.00   3rd Qu.:168.0  
##  Max.   :3   Max.   :300.00   Max.   :25.0   Max.   :75.00   Max.   :180.0  
##       skin           hair          style           life       fashion.age   
##  Min.   :1.00   Min.   :1.00   Min.   :1.00   Min.   :1.00   Min.   :15.00  
##  1st Qu.:3.00   1st Qu.:1.00   1st Qu.:2.00   1st Qu.:3.75   1st Qu.:23.00  
##  Median :3.00   Median :1.00   Median :3.00   Median :6.00   Median :31.00  
##  Mean   :3.04   Mean   :1.51   Mean   :4.25   Mean   :5.21   Mean   :38.22  
##  3rd Qu.:3.00   3rd Qu.:2.00   3rd Qu.:7.00   3rd Qu.:7.00   3rd Qu.:55.00  
##  Max.   :5.00   Max.   :4.00   Max.   :7.00   Max.   :8.00   Max.   :75.00  
##       DG1             DG2             DG3            DG4             DG5       
##  Min.   :19.00   Min.   :20.00   Min.   :17.0   Min.   :19.00   Min.   :19.00  
##  1st Qu.:24.00   1st Qu.:26.50   1st Qu.:25.0   1st Qu.:24.75   1st Qu.:23.00  
##  Median :33.00   Median :34.50   Median :31.0   Median :32.00   Median :30.00  
##  Mean   :39.39   Mean   :39.68   Mean   :38.8   Mean   :38.19   Mean   :37.80  
##  3rd Qu.:53.00   3rd Qu.:53.50   3rd Qu.:52.0   3rd Qu.:50.25   3rd Qu.:55.25  
##  Max.   :77.00   Max.   :75.00   Max.   :81.0   Max.   :80.00   Max.   :72.00  
##       DG6             DG7             DG8             DG9       
##  Min.   :17.00   Min.   :18.00   Min.   :18.00   Min.   :17.00  
##  1st Qu.:22.75   1st Qu.:27.00   1st Qu.:25.00   1st Qu.:24.00  
##  Median :29.00   Median :36.00   Median :35.50   Median :35.00  
##  Mean   :37.29   Mean   :42.82   Mean   :41.52   Mean   :40.87  
##  3rd Qu.:50.50   3rd Qu.:56.25   3rd Qu.:56.00   3rd Qu.:55.50  
##  Max.   :80.00   Max.   :82.00   Max.   :90.00   Max.   :80.00  
##       DG10            fage      
##  Min.   :20.00   Min.   :19.70  
##  1st Qu.:25.00   1st Qu.:24.57  
##  Median :30.00   Median :34.20  
##  Mean   :38.57   Mean   :39.49  
##  3rd Qu.:55.00   3rd Qu.:52.25  
##  Max.   :80.00   Max.   :76.70
#cài đặt và nạp gói BMA
# Cài đặt nếu chưa có
if(!require(BMA)) install.packages("BMA")
library(BMA)
#kiểm tra dữ liệu thiếu
# Loại bỏ dòng có giá trị NA
fa <- na.omit(fa)
#thiết lập mô hình BMA
# tách biến đầu ra và đầu vào
y <- fa$fage
X <- fa[, setdiff(names(fa), "fage")]

# chạy BMA
bma_model <- bicreg(x = X, y = y, strict = FALSE)

# xem kết quả
summary(bma_model)
## 
## Call:
## bicreg(x = X, y = y, strict = FALSE)
## 
## 
##   164  models were selected
##  Best  5  models (cumulative posterior probability =  0.1911 ): 
## 
##              p!=0    EV         SD         model 1     model 2     model 3   
## Intercept    100.0   4.779e-03  0.5975676    -0.34055     0.62383    -0.15024
## id             3.4  -3.941e-05  0.0006168       .           .           .    
## gender         3.4  -3.186e-04  0.0356225       .           .           .    
## age           18.8   1.984e-02  0.0448326     0.09394     0.13941       .    
## level          3.4   5.544e-04  0.0448642       .           .           .    
## job            3.1  -2.705e-04  0.0232001       .           .           .    
## mar            3.8  -1.262e-04  0.0369848       .           .           .    
## inc            3.4  -3.722e-06  0.0002834       .           .           .    
## spending       3.4   1.552e-04  0.0048059       .           .           .    
## weight         3.8   8.839e-05  0.0032569       .           .           .    
## height         3.8   1.027e-04  0.0020970       .           .           .    
## skin           3.1   1.037e-04  0.0157705       .           .           .    
## hair           3.4   2.128e-03  0.0251212       .           .           .    
## style          3.4  -1.721e-04  0.0056126       .           .           .    
## life           3.1   4.149e-04  0.0084369       .           .           .    
## fashion.age    4.6   2.152e-03  0.0124676       .           .           .    
## DG1           53.0   7.810e-02  0.0776411       .           .         0.12870
## DG2           60.2   1.020e-01  0.0874263     0.18576       .         0.14088
## DG3           73.2   9.563e-02  0.0599329       .         0.14050     0.13455
## DG4           61.9   7.632e-02  0.0616520     0.13320       .           .    
## DG5           64.6   8.296e-02  0.0642319       .         0.12772       .    
## DG6           90.1   1.037e-01  0.0366146     0.12305     0.11599     0.11065
## DG7           98.9   1.066e-01  0.0182055     0.10997     0.10314     0.09531
## DG8           99.6   1.109e-01  0.0152951     0.11137     0.11650     0.12659
## DG9          100.0   1.105e-01  0.0144053     0.11591     0.11821     0.12259
## DG10          87.9   1.083e-01  0.0424931     0.13236     0.12497     0.13715
##                                                                              
## nVar                                            8           8           8    
## r2                                            0.999       0.999       0.999  
## BIC                                        -653.93417  -653.93417  -653.93417
## post prob                                     0.038       0.038       0.038  
##              model 4     model 5   
## Intercept      -0.46101     0.55391
## id                .           .    
## gender            .           .    
## age               .           .    
## level             .           .    
## job               .           .    
## mar               .           .    
## inc               .           .    
## spending          .           .    
## weight            .           .    
## height            .           .    
## skin              .           .    
## hair              .           .    
## style             .           .    
## life              .           .    
## fashion.age       .           .    
## DG1             0.10662     0.15935
## DG2             0.19340       .    
## DG3               .         0.13734
## DG4             0.12856     0.12802
## DG5               .         0.15554
## DG6             0.11698     0.11443
## DG7             0.10969     0.09973
## DG8             0.10754     0.10399
## DG9             0.11249     0.09078
## DG10            0.13089       .    
##                                    
## nVar              8           8    
## r2              0.999       0.999  
## BIC          -653.93417  -653.93417
## post prob       0.038       0.038
#5.diễn giải kết quả
# In bảng xác suất mô hình
imageplot.bma(bma_model)

# Biến quan trọng nhất
coef(bma_model)
## NULL
#6.dự báo fage và đánh giá
# Dự báo trên tập dữ liệu gốc
fage_pred <- predict(bma_model, newdata = X)
length(y)
## [1] 100
length(fage_pred)
## [1] 3
fage_pred <- predict(bma_model, newdata = X)$mean


# So sánh dự báo và thực tế
plot(y, fage_pred, main = "Thực tế vs Dự báo", xlab = "fage thực tế", ylab = "fage dự báo")
abline(0, 1, col = "red")

# Tính sai số RMSE
rmse <- sqrt(mean((y - fage_pred)^2))
cat("RMSE:", rmse)
## RMSE: 0.1413567
#kết luận

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.