Materials for the practical session of 8 September 2021: Handling missing data with principal component analysis.
(Continued: Practice in R)
Speaker: Dr. Hoàng Văn Hà, University of Science (ĐHKHTN), Ho Chi Minh City.
Details at: https://sites.google.com/view/tkud/home?authuser=1
Practice materials at: https://drive.google.com/drive/folders/1x_HkoByzdMqmcqrGnRFq-cbydpw1Wh1W?usp=sharing
Install the packages needed for missing data imputation. We will install the following packages: VIM, naniar, visdat, Amelia, mice, mvtnorm, ggplot2, missMDA, FactoMineR.
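One possible way to install only what is missing, sketched in base R. The helper name `pkgs_to_install` is hypothetical, the list assumes that the multivariate normal package meant here is `mvtnorm`, and the install call itself is left commented out so that the snippet has no side effects.

```r
# Packages used in this session (assumption: "mvtnorm" is the intended spelling).
pkgs <- c("VIM", "naniar", "visdat", "Amelia", "mice",
          "mvtnorm", "ggplot2", "missMDA", "FactoMineR")

# Hypothetical helper: which of `wanted` are not yet in `installed`?
pkgs_to_install <- function(wanted, installed = rownames(installed.packages())) {
  setdiff(wanted, installed)
}

to_install <- pkgs_to_install(pkgs)
# Uncomment to install from CRAN:
# if (length(to_install) > 0) install.packages(to_install)
```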
Load the data
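The chunk that loads the data is not shown in this handout. A minimal sketch, assuming `dat_iris` is simply a copy of R's built-in `iris` data frame:

```r
# Assumption: dat_iris is a copy of the built-in iris data (150 rows, 5 columns).
dat_iris <- iris
head(dat_iris)  # first six rows
```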
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
Iris is a complete dataset, so we introduce missing values artificially.
n <- dim(dat_iris)[1] ## get the number of observations
iris_miss <- dat_iris
p_miss <- 0.30 ## We make 30% of missing values under MCAR (Missing Completely At Random)
miss_inds <- replicate(4, runif(n) < p_miss) ## Make missing values only for the first 4 columns
miss_inds <- cbind(miss_inds, rep(FALSE, n))
iris_miss[miss_inds] <- NA
iris_miss[1:20,] ## Print the first 20 rows of the Iris dataset with 30% missing values
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 NA 1.4 0.2 setosa
## 2 4.9 NA 1.4 0.2 setosa
## 3 4.7 3.2 1.3 NA setosa
## 4 4.6 3.1 1.5 NA setosa
## 5 5.0 3.6 1.4 NA setosa
## 6 NA 3.9 1.7 0.4 setosa
## 7 4.6 3.4 NA 0.3 setosa
## 8 5.0 3.4 1.5 0.2 setosa
## 9 4.4 NA 1.4 NA setosa
## 10 4.9 NA 1.5 NA setosa
## 11 NA 3.7 1.5 0.2 setosa
## 12 NA 3.4 NA 0.2 setosa
## 13 4.8 3.0 1.4 NA setosa
## 14 4.3 NA 1.1 0.1 setosa
## 15 5.8 4.0 NA 0.2 setosa
## 16 5.7 4.4 1.5 0.4 setosa
## 17 5.4 3.9 1.3 NA setosa
## 18 5.1 3.5 NA 0.3 setosa
## 19 5.7 3.8 1.7 0.3 setosa
## 20 5.1 3.8 NA NA setosa
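Because each cell is masked independently with probability `p_miss`, the realised share of missing values in each column only fluctuates around 30%. A reproducible sketch of the same masking step follows; the seed and the name `iris_demo` are assumptions added for illustration.

```r
set.seed(2021)                         # assumed seed, only for reproducibility
iris_demo <- iris
n <- nrow(iris_demo)
p_miss <- 0.30

# One logical column per numeric variable; Species is never masked.
mask <- cbind(replicate(4, runif(n) < p_miss), rep(FALSE, n))
iris_demo[mask] <- NA

round(colMeans(is.na(iris_demo)), 3)   # realised proportion of NA per column
mean(is.na(iris_demo[, 1:4]))          # overall proportion in the 4 numeric columns
```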
a. USE PACKAGE NANIAR
References: http://naniar.njtierney.com/articles/getting-started-w-naniar.html
vis_dat visualises the whole dataframe at once, and provides information about the class of the data input into R, as well as whether the data is missing or not.
The function vis_miss provides a summary of whether the data is missing or not. It also provides the percentage of missing values in each column.
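A short sketch of how these two functions might be called. It assumes the `visdat` package is installed (the plotting calls are skipped otherwise) and uses a small masked copy of `iris`, named `iris_demo` here for illustration, rather than the session's own object.

```r
set.seed(2021)                                   # assumed seed
iris_demo <- iris
iris_demo[sample(nrow(iris_demo), 30), "Petal.Width"] <- NA   # 30 missing cells

if (requireNamespace("visdat", quietly = TRUE)) {
  print(visdat::vis_dat(iris_demo))    # column classes plus missing cells
  print(visdat::vis_miss(iris_demo))   # missing/present map with percentages
}
```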
## Warning: It is deprecated to specify `guide = FALSE` to remove a guide. Please
## use `guide = "none"` instead.
## [1] 24.13333
## [1] 181
## [1] 43
## [1] 569
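The four numbers above are consistent with the overall percentage of missing cells, the total number of missing cells, the number of missing values in Sepal.Length, and the number of observed cells (181 + 569 = 750 = 150 × 5). The calls that produced them are not shown. naniar offers `pct_miss()`, `n_miss()` and `n_complete()` for this kind of summary; the base R sketch below computes the same kind of quantities on a small made-up data frame (`toy` is hypothetical).

```r
toy <- data.frame(a = c(1, NA, 3, NA), b = c("x", "y", NA, "z"))

pct_missing <- 100 * mean(is.na(toy))   # percentage of missing cells
n_missing   <- sum(is.na(toy))          # number of missing cells
n_missing_a <- sum(is.na(toy$a))        # missing values in one column
n_observed  <- sum(!is.na(toy))         # number of observed cells

c(pct_missing, n_missing, n_missing_a, n_observed)
```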
Using geom_miss_point() with ggplot
With facets:
ggplot(iris_miss, aes(x = Petal.Length, y = Petal.Width)) + geom_miss_point() + facet_wrap(~ Species)
b. USE PACKAGE VIM
The function aggr (package VIM) calculates and represents the number of missing entries in each variable and for certain combinations of variables which tend to be missing simultaneously.
##
## Variables sorted by number of missings:
## Variable Count
## Petal.Length 0.3466667
## Petal.Width 0.3133333
## Sepal.Length 0.2866667
## Sepal.Width 0.2600000
## Species 0.0000000
## miss_plot <- aggr(iris_miss, col=c('navyblue','yellow'), numbers=TRUE, sortVars=TRUE,
## labels=names(iris_miss), cex.axis=.7,
## gap=3, ylab=c("Missing data","Pattern"))
We can also show a matrix plot.
The VIM function marginplot creates a scatterplot with additional information on the missing values. The points for which x (resp. y) is missing are represented in red along the y (resp. x) axis.
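A hedged sketch of the two VIM plots mentioned here, `matrixplot()` and `marginplot()`. It assumes the VIM package is installed (otherwise the plotting calls are skipped) and rebuilds a small masked copy of `iris` under the illustrative name `iris_demo`.

```r
set.seed(2021)                                   # assumed seed
iris_demo <- iris
iris_demo[sample(nrow(iris_demo), 40), "Petal.Length"] <- NA
iris_demo[sample(nrow(iris_demo), 40), "Petal.Width"]  <- NA

if (requireNamespace("VIM", quietly = TRUE)) {
  # Matrix plot: one row per observation, missing cells highlighted.
  VIM::matrixplot(iris_demo[, 1:4])
  # Margin plot for two variables, with missing values shown in the margins.
  VIM::marginplot(iris_demo[, c("Petal.Length", "Petal.Width")])
}
```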
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
We use two variables, Petal.Length and Petal.Width.
Create missing values in the variable Petal.Width:
iris_petal <- iris[, c("Petal.Length", "Petal.Width")] # keep only the two petal variables
n <- dim(iris_petal)[1]
p_miss <- 0.5
index_NA <- sample(1:n, p_miss*n)
iris_petal[index_NA, 2] <- NA # Make 50% of the values in Petal.Width missing
iris_petal
## Petal.Length Petal.Width
## 1 1.4 0.2
## 2 1.4 0.2
## 3 1.3 NA
## 4 1.5 0.2
## 5 1.4 0.2
## 6 1.7 NA
## 7 1.4 NA
## 8 1.5 0.2
## 9 1.4 NA
## 10 1.5 NA
## 11 1.5 NA
## 12 1.6 NA
## 13 1.4 NA
## 14 1.1 0.1
## 15 1.2 NA
## 16 1.5 0.4
## 17 1.3 NA
## 18 1.4 0.3
## 19 1.7 0.3
## 20 1.5 0.3
## 21 1.7 0.2
## 22 1.5 0.4
## 23 1.0 NA
## 24 1.7 0.5
## 25 1.9 0.2
## 26 1.6 NA
## 27 1.6 0.4
## 28 1.5 NA
## 29 1.4 NA
## 30 1.6 NA
## 31 1.6 NA
## 32 1.5 NA
## 33 1.5 NA
## 34 1.4 0.2
## 35 1.5 NA
## 36 1.2 NA
## 37 1.3 NA
## 38 1.4 NA
## 39 1.3 0.2
## 40 1.5 0.2
## 41 1.3 NA
## 42 1.3 0.3
## 43 1.3 NA
## 44 1.6 0.6
## 45 1.9 0.4
## 46 1.4 NA
## 47 1.6 0.2
## 48 1.4 NA
## 49 1.5 NA
## 50 1.4 0.2
## 51 4.7 NA
## 52 4.5 1.5
## 53 4.9 1.5
## 54 4.0 1.3
## 55 4.6 1.5
## 56 4.5 NA
## 57 4.7 1.6
## 58 3.3 1.0
## 59 4.6 NA
## 60 3.9 1.4
## 61 3.5 NA
## 62 4.2 NA
## 63 4.0 NA
## 64 4.7 NA
## 65 3.6 NA
## 66 4.4 1.4
## 67 4.5 1.5
## 68 4.1 NA
## 69 4.5 1.5
## 70 3.9 1.1
## 71 4.8 1.8
## 72 4.0 1.3
## 73 4.9 1.5
## 74 4.7 NA
## 75 4.3 NA
## 76 4.4 1.4
## 77 4.8 1.4
## 78 5.0 NA
## 79 4.5 NA
## 80 3.5 NA
## 81 3.8 1.1
## 82 3.7 1.0
## 83 3.9 NA
## 84 5.1 NA
## 85 4.5 NA
## 86 4.5 1.6
## 87 4.7 1.5
## 88 4.4 NA
## 89 4.1 1.3
## 90 4.0 1.3
## 91 4.4 1.2
## 92 4.6 NA
## 93 4.0 NA
## 94 3.3 NA
## 95 4.2 1.3
## 96 4.2 1.2
## 97 4.2 NA
## 98 4.3 NA
## 99 3.0 1.1
## 100 4.1 1.3
## 101 6.0 2.5
## 102 5.1 NA
## 103 5.9 2.1
## 104 5.6 1.8
## 105 5.8 NA
## 106 6.6 2.1
## 107 4.5 1.7
## 108 6.3 NA
## 109 5.8 1.8
## 110 6.1 2.5
## 111 5.1 2.0
## 112 5.3 1.9
## 113 5.5 NA
## 114 5.0 2.0
## 115 5.1 2.4
## 116 5.3 NA
## 117 5.5 1.8
## 118 6.7 NA
## 119 6.9 2.3
## 120 5.0 NA
## 121 5.7 2.3
## 122 4.9 NA
## 123 6.7 NA
## 124 4.9 1.8
## 125 5.7 2.1
## 126 6.0 1.8
## 127 4.8 NA
## 128 4.9 NA
## 129 5.6 2.1
## 130 5.8 NA
## 131 6.1 1.9
## 132 6.4 NA
## 133 5.6 2.2
## 134 5.1 NA
## 135 5.6 NA
## 136 6.1 NA
## 137 5.6 NA
## 138 5.5 NA
## 139 4.8 NA
## 140 5.4 2.1
## 141 5.6 2.4
## 142 5.1 NA
## 143 5.1 1.9
## 144 5.9 2.3
## 145 5.7 NA
## 146 5.2 NA
## 147 5.0 NA
## 148 5.2 NA
## 149 5.4 2.3
## 150 5.1 NA
iris_petal_CC <- iris_petal[complete.cases(iris_petal),]
# Or: iris_petal_CC <- na.omit(iris_petal)
dim(iris_petal_CC)
## [1] 75 2
## Petal.Length Petal.Width
## 1 1.4 0.2
## 2 1.4 0.2
## 4 1.5 0.2
## 5 1.4 0.2
## 8 1.5 0.2
## 14 1.1 0.1
## 16 1.5 0.4
## 18 1.4 0.3
## 19 1.7 0.3
## 20 1.5 0.3
## 21 1.7 0.2
## 22 1.5 0.4
## 24 1.7 0.5
## 25 1.9 0.2
## 27 1.6 0.4
## 34 1.4 0.2
## 39 1.3 0.2
## 40 1.5 0.2
## 42 1.3 0.3
## 44 1.6 0.6
## 45 1.9 0.4
## 47 1.6 0.2
## 50 1.4 0.2
## 52 4.5 1.5
## 53 4.9 1.5
## 54 4.0 1.3
## 55 4.6 1.5
## 57 4.7 1.6
## 58 3.3 1.0
## 60 3.9 1.4
## 66 4.4 1.4
## 67 4.5 1.5
## 69 4.5 1.5
## 70 3.9 1.1
## 71 4.8 1.8
## 72 4.0 1.3
## 73 4.9 1.5
## 76 4.4 1.4
## 77 4.8 1.4
## 81 3.8 1.1
## 82 3.7 1.0
## 86 4.5 1.6
## 87 4.7 1.5
## 89 4.1 1.3
## 90 4.0 1.3
## 91 4.4 1.2
## 95 4.2 1.3
## 96 4.2 1.2
## 99 3.0 1.1
## 100 4.1 1.3
## 101 6.0 2.5
## 103 5.9 2.1
## 104 5.6 1.8
## 106 6.6 2.1
## 107 4.5 1.7
## 109 5.8 1.8
## 110 6.1 2.5
## 111 5.1 2.0
## 112 5.3 1.9
## 114 5.0 2.0
## 115 5.1 2.4
## 117 5.5 1.8
## 119 6.9 2.3
## 121 5.7 2.3
## 124 4.9 1.8
## 125 5.7 2.1
## 126 6.0 1.8
## 129 5.6 2.1
## 131 6.1 1.9
## 133 5.6 2.2
## 140 5.4 2.1
## 141 5.6 2.4
## 143 5.1 1.9
## 144 5.9 2.3
## 149 5.4 2.3
iris_petal_Mean <- iris_petal
iris_petal_Mean[index_NA, 2] <- mean(iris_petal[,2], na.rm = TRUE)
imputed <- ((1:n) %in% index_NA)
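ggplot(iris_petal_Mean) + ggtitle("Mean imputation") +
aes(x=Petal.Length, y=Petal.Width, colour = imputed) + geom_point()
iris_reg <- lm(Petal.Width ~ Petal.Length, data = iris_petal) # fitted on the observed cases only
summary(iris_reg)
##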
## Call:
## lm(formula = Petal.Width ~ Petal.Length, data = iris_petal)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.37008 -0.10925 -0.02497 0.11761 0.61049
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.36701 0.05326 -6.891 1.66e-09 ***
## Petal.Length 0.42285 0.01259 33.591 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1872 on 73 degrees of freedom
## (75 observations deleted due to missingness)
## Multiple R-squared: 0.9392, Adjusted R-squared: 0.9384
## F-statistic: 1128 on 1 and 73 DF, p-value: < 2.2e-16
iris_petal_Reg <- iris_petal
iris_petal_Reg[index_NA, 2] <- predict(iris_reg, iris_petal[index_NA, 1, drop = FALSE])
ggplot(iris_petal_Reg) + ggtitle("Regression imputation") +
aes(x=Petal.Length, y=Petal.Width, colour = imputed) + geom_point()
iris_petal_StochReg <- iris_petal
sig <- summary(iris_reg)$sigma ## residual standard error of the regression
iris_petal_StochReg[index_NA, 2] <- iris_petal_Reg[index_NA, 2] + rnorm(length(index_NA), 0, sig)
ggplot(iris_petal_StochReg) + ggtitle("Stochastic regression imputation") +
aes(x=Petal.Length, y=Petal.Width, colour = imputed) + geom_point()
For each dataset completed by one of the imputation methods, compute the sample mean, the sample standard deviation, the correlation with Petal.Length, and the 95% confidence interval for the mean.
data_all <- cbind.data.frame(iris$Petal.Width, iris_petal_Mean[,2], iris_petal_Reg[, 2], iris_petal_StochReg[,2])
mean_vec <- apply(data_all, 2, mean)
sd_vec <- apply(data_all, 2, sd)
cor_vec <- apply(data_all, 2, cor, iris_petal[,1])
lower <- mean_vec - qt(.975, n-1) * sd_vec/sqrt(n)
upper <- mean_vec + qt(.975, n-1) * sd_vec/sqrt(n)
width <- upper - lower
result <- rbind.data.frame(mean_vec, sd_vec, cor_vec, lower, upper, width)
result <- round(result, 4)
colnames(result) <- c("ORIGINAL", "MEAN","REG", "STOCH")
rownames(result) <- c("Mean", "STD", "Correlation", "Lower bound", "Upper bound", "Width CI")
print(result)
## ORIGINAL MEAN REG STOCH
## Mean 1.1993 1.2680 1.2221 1.2273
## STD 0.7622 0.5316 0.7579 0.7612
## Correlation 0.9629 0.6689 0.9849 0.9683
## Lower bound 1.0764 1.1822 1.0998 1.1044
## Upper bound 1.3223 1.3538 1.3443 1.3501
## Width CI 0.2460 0.1715 0.2446 0.2456
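The table shows a typical pattern: mean imputation shrinks the standard deviation and the correlation, deterministic regression imputation inflates the correlation, and stochastic regression imputation stays closest to the original data. Below is a self-contained base R sketch of the three methods with a fixed seed so that it can be rerun. The seed and the object names are assumptions, so the numbers will differ from the table.

```r
set.seed(2021)                                   # assumed seed
petal <- iris[, c("Petal.Length", "Petal.Width")]
n <- nrow(petal)
idx <- sample(n, n / 2)                          # rows that lose Petal.Width
obs <- petal
obs[idx, 2] <- NA

# 1. Mean imputation
imp_mean <- obs
imp_mean[idx, 2] <- mean(obs[, 2], na.rm = TRUE)

# 2. Regression imputation (fit on observed rows, predict the rest)
fit <- lm(Petal.Width ~ Petal.Length, data = obs)
imp_reg <- obs
imp_reg[idx, 2] <- predict(fit, newdata = obs[idx, 1, drop = FALSE])

# 3. Stochastic regression imputation (add residual noise to the predictions)
imp_stoch <- imp_reg
imp_stoch[idx, 2] <- imp_reg[idx, 2] + rnorm(length(idx), 0, summary(fit)$sigma)

# Standard deviation of Petal.Width and its correlation with Petal.Length
sapply(list(original = petal, mean = imp_mean, reg = imp_reg, stoch = imp_stoch),
       function(d) c(sd = sd(d[, 2]), cor = cor(d[, 1], d[, 2])))
```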
Data House_price:
* price: sale price of the house.
* sqft_living15: average living area of the 15 nearest houses in the neighbourhood.
* floors: number of floors of the house, ranging from 1 to 3.5.
* condition: architectural condition of the house, from 1 to 5 (1: very poor, 5: very good).
* sqft_above: area of the house.
* sqft_living: area of the house grounds.
setwd("D:/Tap huan VIASM/Chuoi Seminar/T9_2021/Hoang HA")
house <- read.csv("house_price.csv", header = TRUE)
## [1] "X.2" "X.1" "X" "id"
## [5] "date" "price" "bedrooms" "bathrooms"
## [9] "sqft_living" "sqft_lot" "floors" "waterfront"
## [13] "view" "condition" "grade" "sqft_above"
## [17] "sqft_basement" "yr_built" "yr_renovated" "zipcode"
## [21] "lat" "long" "sqft_living15" "sqft_lot15"
n <- 1:100 # or use a different number: 1000, 10000
houseKNN <- house[, c("price", "sqft_living15", "floors", "condition", "sqft_above", "sqft_living")]
houseKNN$floors <- as.factor(houseKNN$floors)
houseKNN$condition <- as.factor(houseKNN$condition)
head(houseKNN)
## price sqft_living15 floors condition sqft_above sqft_living
## 1 221900 1340 1 3 1180 1180
## 2 538000 1690 2 3 2170 2570
## 3 180000 2720 1 3 770 770
## 4 604000 1360 1 5 1050 1960
## 5 510000 1800 1 3 1680 1680
## 6 1225000 4760 1 3 3890 5420
M1 <- lm(price ~ condition + floors + sqft_living15 + sqft_above + sqft_living, data = houseKNN)
summary(M1)
##
## Call:
## lm(formula = price ~ condition + floors + sqft_living15 + sqft_above +
## sqft_living, data = houseKNN)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1399648 -141411 -22768 104156 4503098
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.233e+05 4.669e+04 -2.642 0.008256 **
## condition2 -2.803e+04 5.029e+04 -0.557 0.577259
## condition3 -2.780e+04 4.650e+04 -0.598 0.550056
## condition4 5.198e+03 4.655e+04 0.112 0.911096
## condition5 7.484e+04 4.686e+04 1.597 0.110224
## floors1.5 7.390e+04 6.459e+03 11.441 < 2e-16 ***
## floors2 -1.702e+04 4.967e+03 -3.426 0.000615 ***
## floors2.5 2.435e+05 2.060e+04 11.824 < 2e-16 ***
## floors3 1.717e+05 1.090e+04 15.755 < 2e-16 ***
## floors3.5 3.146e+05 8.994e+04 3.498 0.000470 ***
## sqft_living15 9.132e+01 4.005e+00 22.802 < 2e-16 ***
## sqft_above -2.301e+01 5.248e+00 -4.384 1.17e-05 ***
## sqft_living 2.535e+02 4.340e+00 58.413 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 254100 on 21580 degrees of freedom
## (20 observations deleted due to missingness)
## Multiple R-squared: 0.5211, Adjusted R-squared: 0.5209
## F-statistic: 1957 on 12 and 21580 DF, p-value: < 2.2e-16
## Warning: It is deprecated to specify `guide = FALSE` to remove a guide. Please
## use `guide = "none"` instead.
##
## Variables sorted by number of missings:
## Variable Count
## condition 0.2699301
## sqft_above 0.2599824
## price 0.2307870
## sqft_living15 0.0000000
## floors 0.0000000
## sqft_living 0.0000000
The imputed value is an aggregation of the k nearest neighbors. The kind of aggregation and the distance measure depend on the type of the variable.
Main arguments of the ‘kNN’ function:
* dist_var: vector of variable names to be used for calculating the distances
* weights: numeric vector containing a weight for each distance variable
* numFun: function for aggregating the k nearest neighbors for numerical variables
* catFun: function for aggregating the k nearest neighbors for categorical variables
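To make the idea of aggregating the k nearest neighbors concrete, here is a toy base R sketch for a single numeric variable. It is a simplified illustration, not VIM's implementation: the function name `knn_impute_numeric` is hypothetical, the distance is plain Euclidean distance on standardised variables, and the median is chosen as the aggregation function as an assumption.

```r
# Toy kNN imputation for one numeric variable (simplified illustration).
knn_impute_numeric <- function(data, target, dist_vars, k = 5) {
  x <- scale(as.matrix(data[, dist_vars, drop = FALSE]))  # standardise distance variables
  miss   <- which(is.na(data[[target]]))                  # rows to impute
  donors <- which(!is.na(data[[target]]))                 # rows with an observed target
  for (i in miss) {
    # Euclidean distance from row i to every donor row
    d <- sqrt(rowSums(sweep(x[donors, , drop = FALSE], 2, x[i, ])^2))
    nearest <- donors[order(d)[seq_len(min(k, length(donors)))]]
    data[[target]][i] <- median(data[[target]][nearest])  # aggregate the neighbours
  }
  data
}

toy <- data.frame(size  = c(1, 2, 3, 10, 11, 12),
                  price = c(10, 20, NA, 100, 110, 120))
toy_imputed <- knn_impute_numeric(toy, target = "price", dist_vars = "size", k = 2)
toy_imputed
```

In the toy data the two nearest donors of the third row are the first two rows, so the imputed price is the median of 10 and 20.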
Note: this will take several minutes.
start <- Sys.time()
houseKNN_imputed <- kNN(houseKNN, dist_var = c("sqft_living15", "floors", "sqft_living" ), k = 5, imp_var = FALSE)
end <- Sys.time()
print(runtime <- end - start)
## Time difference of 3.982042 mins
## [1] 0
M2 <- lm(price ~ condition + floors + sqft_living15 + sqft_above + sqft_living, data = houseKNN_imputed)
summary(M2)
##
## Call:
## lm(formula = price ~ condition + floors + sqft_living15 + sqft_above +
## sqft_living, data = houseKNN_imputed)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1359047 -128003 -19910 93436 4498011
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -52531.147 52097.632 -1.008 0.313312
## condition2 -70813.609 55937.486 -1.266 0.205548
## condition3 -80680.312 51914.844 -1.554 0.120178
## condition4 -68210.925 51962.966 -1.313 0.189304
## condition5 -21432.235 52257.450 -0.410 0.681716
## floors1.5 81859.427 6055.455 13.518 < 2e-16 ***
## floors2 -27904.914 4763.329 -5.858 4.74e-09 ***
## floors2.5 240156.109 19280.544 12.456 < 2e-16 ***
## floors3 154330.217 10203.544 15.125 < 2e-16 ***
## floors3.5 310966.028 84071.158 3.699 0.000217 ***
## sqft_living15 87.085 3.766 23.126 < 2e-16 ***
## sqft_above -17.783 5.094 -3.491 0.000482 ***
## sqft_living 248.368 4.068 61.057 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 237500 on 21600 degrees of freedom
## Multiple R-squared: 0.5448, Adjusted R-squared: 0.5445
## F-statistic: 2154 on 12 and 21600 DF, p-value: < 2.2e-16
We will use the dataset Ozone:
The ozone data concern air pollution: 112 observations collected during the summer of 2001 in Rennes, France.
The variables available are:
* maxO3 (maximum daily ozone)
* maxO3v (maximum daily ozone the previous day)
* T9 (temperature at 9am)
* T12 (temperature at midday)
* T15 (temperature at 3pm)
* Vx12 (projection of the wind speed vector on the east-west axis at midday)
* Vx9 and Vx15, as well as the nebulosity (cloud cover) Ne9, Ne12, Ne15
Aim: analyse the relationship between the maximum daily ozone level (maxO3) and the other meteorological variables.
## maxO3 T9 T12 T15 Ne9 Ne12 Ne15 Vx9 Vx12 Vx15 maxO3v
## 20010601 87 15.6 18.5 NA 4 4 8 0.6946 -1.7101 -0.6946 84
## 20010602 82 NA NA NA 5 5 7 -4.3301 -4.0000 -3.0000 87
## 20010603 92 15.3 17.6 19.5 2 NA NA 2.9544 NA 0.5209 82
## 20010604 114 16.2 19.7 NA 1 1 0 NA 0.3473 -0.1736 92
## 20010605 94 NA 20.5 20.4 NA NA NA -0.5000 -2.9544 -4.3301 114
## 20010606 80 17.7 19.8 18.3 6 NA 7 -5.6382 -5.0000 -6.0000 94
## 20010607 79 16.8 15.6 14.9 7 8 NA -4.3301 -1.8794 -3.7588 80
## 20010610 79 14.9 17.5 18.9 5 5 NA 0.0000 -1.0419 -1.3892 99
## 20010611 101 16.1 19.6 21.4 2 NA 4 -0.7660 -1.0261 -2.2981 79
## 20010612 106 18.3 NA 22.9 5 NA NA 1.2856 -2.2981 -3.9392 101
## 20010613 101 17.3 19.3 20.2 7 7 3 -1.5000 -1.5000 -0.8682 106
## 20010614 90 17.6 20.3 17.4 NA 6 8 NA -1.0419 -0.6946 101
## 20010615 72 NA NA NA 7 5 6 -0.8682 -2.7362 -6.8944 90
## 20010616 70 17.1 18.2 18.0 NA 7 NA NA -7.8785 -5.1962 72
## 20010617 83 15.4 NA 16.6 8 7 NA -4.3301 -2.0521 -3.0000 70
## 20010618 88 NA 19.1 NA 6 5 4 0.5209 -2.9544 -1.0261 83
## 20010620 NA 21.0 24.6 26.9 NA NA 1 -0.3420 NA -0.6840 121
## 20010621 NA NA NA NA NA NA NA 0.0000 0.3473 -2.5712 NA
## 20010622 121 19.7 24.2 26.9 2 1 0 NA NA 2.0000 81
## 20010623 146 23.6 28.6 28.4 NA NA NA 1.0000 -1.9284 -1.2155 121
## maxO3 T9 T12 T15
## Min. : 42.00 Min. :11.30 Min. :14.30 Min. :14.90
## 1st Qu.: 71.00 1st Qu.:16.00 1st Qu.:18.60 1st Qu.:18.90
## Median : 81.50 Median :17.70 Median :20.40 Median :21.40
## Mean : 91.24 Mean :18.22 Mean :21.46 Mean :22.41
## 3rd Qu.:108.25 3rd Qu.:19.90 3rd Qu.:23.60 3rd Qu.:25.65
## Max. :166.00 Max. :25.30 Max. :33.50 Max. :35.50
## NA's :16 NA's :37 NA's :33 NA's :37
## Ne9 Ne12 Ne15 Vx9
## Min. :0.000 Min. :0.000 Min. :0.00 Min. :-7.8785
## 1st Qu.:3.000 1st Qu.:4.000 1st Qu.:3.00 1st Qu.:-3.0000
## Median :5.000 Median :5.000 Median :5.00 Median :-0.8671
## Mean :4.987 Mean :4.986 Mean :4.60 Mean :-1.0958
## 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:6.25 3rd Qu.: 0.6919
## Max. :8.000 Max. :8.000 Max. :8.00 Max. : 5.1962
## NA's :34 NA's :42 NA's :32 NA's :18
## Vx12 Vx15 maxO3v
## Min. :-7.8785 Min. :-9.000 Min. : 42.00
## 1st Qu.:-3.6941 1st Qu.:-3.759 1st Qu.: 70.00
## Median :-1.9284 Median :-1.710 Median : 82.50
## Mean :-1.6853 Mean :-1.830 Mean : 89.39
## 3rd Qu.:-0.1302 3rd Qu.: 0.000 3rd Qu.:101.00
## Max. : 6.5778 Max. : 3.830 Max. :166.00
## NA's :10 NA's :21 NA's :12
##
## Variables sorted by number of missings:
## Variable Count
## Ne12 0.37500000
## T9 0.33035714
## T15 0.33035714
## Ne9 0.30357143
## T12 0.29464286
## Ne15 0.28571429
## Vx15 0.18750000
## Vx9 0.16071429
## maxO3 0.14285714
## maxO3v 0.10714286
## Vx12 0.08928571
We suppose that the data are drawn from a multivariate normal distribution with parameter theta = (mu, Sigma), where mu is the mean vector and Sigma is the covariance matrix.
a. Step 1: Estimate mu and Sigma from the incomplete dataset with EM
Get the estimated parameters:
pre_param <- prelim.norm(as.matrix(miss_ozone))
thetahat <- em.norm(pre_param) # run EM algorithm, compute MLE
## Iterations of EM:
## 1...2...3...4...5...6...7...8...9...10...11...12...13...14...15...16...17...18...19...20...21...22...23...24...25...26...27...28...29...
## [1] 90.449970 18.153505 21.231393 22.468786 4.887813 4.902927 4.756645
## [8] -1.209441 -1.610990 -1.658337 89.076575
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 798.42882 56.8626446 88.165795 98.855977 -43.361980 -44.112574 -31.343664
## [2,] 56.86264 9.4620486 10.778438 11.528216 -2.392914 -2.410661 -1.832665
## [3,] 88.16579 10.7784377 16.044202 16.847247 -4.517125 -5.313627 -3.327555
## [4,] 98.85598 11.5282160 16.847247 20.409659 -5.571630 -6.268737 -5.386797
## [5,] -43.36198 -2.3929140 -4.517125 -5.571630 6.163044 4.080450 2.953342
## [6,] -44.11257 -2.4106612 -5.313627 -6.268737 4.080450 5.036928 3.620388
## [7,] -31.34366 -1.8326650 -3.327555 -5.386797 2.953342 3.620388 5.311804
## [8,] 36.93442 1.2394590 4.002302 5.048559 -2.834880 -2.774599 -2.296114
## [9,] 36.01630 0.9686541 2.786801 4.177639 -3.779082 -3.122105 -2.581832
## [10,] 27.73091 0.1581604 1.905013 2.917806 -2.906145 -2.459598 -2.296046
## [11,] 526.88457 54.6242014 66.455595 77.871268 -20.532679 -27.757591 -24.810815
## [,8] [,9] [,10] [,11]
## [1,] 36.934424 36.0163033 27.7309074 526.88457
## [2,] 1.239459 0.9686541 0.1581604 54.62420
## [3,] 4.002302 2.7868012 1.9050134 66.45559
## [4,] 5.048559 4.1776389 2.9178059 77.87127
## [5,] -2.834880 -3.7790819 -2.9061451 -20.53268
## [6,] -2.774599 -3.1221049 -2.4595976 -27.75759
## [7,] -2.296114 -2.5818324 -2.2960457 -24.81081
## [8,] 6.618449 5.5077633 4.5303973 23.85331
## [9,] 5.507763 7.9475560 6.3324129 17.55904
## [10,] 4.530397 6.3324129 7.4065652 13.97904
## [11,] 23.853313 17.5590360 13.9790362 740.55441
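To illustrate what an EM iteration for the multivariate normal model does, here is a compact base R sketch. It is not the implementation behind `em.norm`: the function name `em_mvnorm`, the starting values, and the fixed number of iterations are assumptions made for illustration.

```r
# EM for a multivariate normal with missing cells (illustrative sketch).
em_mvnorm <- function(X, n_iter = 50) {
  X <- as.matrix(X); n <- nrow(X); p <- ncol(X)
  mu <- colMeans(X, na.rm = TRUE)                   # start: observed means
  Sigma <- diag(apply(X, 2, var, na.rm = TRUE), p)  # start: diagonal covariance
  for (it in seq_len(n_iter)) {
    Xhat <- X
    C <- matrix(0, p, p)      # summed conditional covariances of the missing parts
    for (i in seq_len(n)) {   # E-step, row by row
      m <- is.na(X[i, ])
      if (!any(m)) next
      if (all(m)) { Xhat[i, ] <- mu; C <- C + Sigma; next }
      o <- !m
      B <- Sigma[m, o, drop = FALSE] %*% solve(Sigma[o, o, drop = FALSE])
      Xhat[i, m] <- mu[m] + B %*% (X[i, o] - mu[o])   # conditional mean
      C[m, m] <- C[m, m] + Sigma[m, m, drop = FALSE] - B %*% Sigma[o, m, drop = FALSE]
    }
    mu <- colMeans(Xhat)                              # M-step
    Xc <- sweep(Xhat, 2, mu)
    Sigma <- (crossprod(Xc) + C) / n
  }
  list(mu = mu, Sigma = Sigma, completed = Xhat)
}

set.seed(1)                             # assumed seed
Z <- matrix(rnorm(400), ncol = 2)
Z[, 2] <- Z[, 1] + 0.5 * Z[, 2]         # correlated second column
Z[sample(200, 60), 2] <- NA             # 30% missing in the second column
fit <- em_mvnorm(Z)
fit$mu
fit$Sigma
```

The `completed` element holds the data with each missing cell replaced by its conditional mean from the last E-step, which is a deterministic fill-in and not a random draw.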
## maxO3 T9 T12 T15 Ne9 Ne12 Ne15
## 20010601 87.00000 15.60000 18.50000 18.80523 4.000000 4.0000000 8.0000000
## 20010602 82.00000 19.53650 24.16694 24.12489 5.000000 5.0000000 7.0000000
## 20010603 92.00000 15.30000 17.60000 19.50000 2.000000 4.9368627 4.1142550
## 20010604 114.00000 16.20000 19.70000 23.20843 1.000000 1.0000000 0.0000000
## 20010605 94.00000 19.88926 20.50000 20.40000 6.087397 5.1968643 5.7749611
## 20010606 80.00000 17.70000 19.80000 18.30000 6.000000 6.5433590 7.0000000
## 20010607 79.00000 16.80000 15.60000 14.90000 7.000000 8.0000000 7.0907445
## 20010610 79.00000 14.90000 17.50000 18.90000 5.000000 5.0000000 4.5173461
## 20010611 101.00000 16.10000 19.60000 21.40000 2.000000 3.5699347 4.0000000
## 20010612 106.00000 18.30000 24.16708 22.90000 5.000000 4.2774765 4.8439340
## 20010613 101.00000 17.30000 19.30000 20.20000 7.000000 7.0000000 3.0000000
## 20010614 90.00000 17.60000 20.30000 17.40000 2.984046 6.0000000 8.0000000
## 20010615 72.00000 17.96965 21.43565 21.62728 7.000000 5.0000000 6.0000000
## 20010616 70.00000 17.10000 18.20000 18.00000 7.827730 7.0000000 4.8448180
## 20010617 83.00000 15.40000 17.87892 16.60000 8.000000 7.0000000 8.2608807
## 20010618 88.00000 15.17082 19.10000 20.31545 6.000000 5.0000000 4.0000000
## 20010620 100.81083 21.00000 24.60000 26.90000 4.295852 3.0803031 1.0000000
## 20010621 86.32418 17.28943 18.08785 18.48448 3.522778 3.8013441 3.7390321
## 20010622 121.00000 19.70000 24.20000 26.90000 2.000000 1.0000000 0.0000000
## 20010623 146.00000 23.60000 28.60000 28.40000 2.313533 1.8129584 4.3019487
## 20010624 121.00000 20.40000 25.20000 27.70000 1.000000 0.0000000 0.0000000
## 20010625 146.00000 17.39221 20.31701 24.06672 -3.486243 0.0000000 0.0000000
## 20010626 108.00000 24.00000 23.50000 26.07150 4.000000 4.0000000 0.0000000
## 20010627 83.00000 19.70000 22.90000 24.80000 5.128027 4.4135061 0.1170471
## 20010628 72.45459 18.90692 18.74505 20.44892 1.291228 5.7117877 3.7073877
## 20010629 81.00000 15.29255 18.85546 20.65737 3.000000 4.0000000 4.0000000
## 20010630 67.00000 19.46240 23.40000 23.70000 7.529388 6.7604317 5.1571235
## 20010701 70.00000 18.05270 23.89321 26.43789 5.000000 2.0000000 1.0000000
## 20010702 106.00000 17.41271 22.32422 21.20086 0.278685 0.0000000 1.0000000
## 20010703 139.00000 21.75042 30.10000 31.90000 2.479183 1.0000000 4.0000000
## 20010704 79.00000 17.55282 19.62157 22.58062 3.067557 3.1065910 2.1639315
## 20010705 42.89067 16.80000 18.20000 22.00000 8.000000 8.0000000 6.0000000
## 20010706 100.79255 20.80000 25.82278 29.50565 4.334573 3.0000000 4.0000000
## 20010707 113.00000 16.71687 18.20000 22.70000 6.132511 4.8854014 1.9240069
## 20010708 72.00000 19.66297 21.20000 23.90000 7.000000 6.4077575 4.0000000
## 20010709 88.00000 19.20000 22.00000 23.15411 3.904259 5.9440686 4.1415467
## 20010710 77.00000 19.40000 20.70000 22.50000 7.000000 8.0000000 7.9787992
## 20010711 71.00000 19.20000 21.00000 22.40000 6.000000 4.0000000 6.0000000
## 20010712 56.00000 13.80000 16.32374 18.50000 8.000000 8.0000000 6.0000000
## 20010713 45.00000 12.12499 14.50000 15.20000 8.000000 7.9060489 8.0000000
## 20010714 67.00000 15.60000 18.60000 19.93285 5.000000 4.6162121 5.0000000
## 20010715 73.44735 16.90000 19.10000 21.54200 5.000000 6.4441634 6.0000000
## 20010716 84.00000 17.40000 20.40000 21.22985 3.000000 4.1984733 6.0000000
## 20010717 63.00000 16.09482 20.50000 20.60000 8.000000 6.0000000 6.0000000
## 20010718 58.62152 16.16375 15.60000 17.36262 8.185023 8.0000000 6.0073736
## 20010719 92.00000 16.70000 19.10000 19.30000 7.000000 6.0000000 4.0000000
## 20010720 88.00000 18.54952 20.30000 23.71556 6.003025 7.5189820 4.1784982
## 20010721 66.00000 18.00000 19.32197 22.35593 8.000000 6.0000000 5.0000000
## 20010722 72.00000 18.60000 21.90000 23.60000 4.000000 7.0000000 6.0000000
## 20010723 81.00000 18.80000 22.50000 23.90000 6.000000 3.0000000 2.0000000
## 20010724 87.88004 19.00000 22.50000 24.10000 4.106192 2.4314014 2.3961476
## 20010725 149.00000 19.90000 26.90000 29.00000 3.000000 4.0000000 2.2469571
## 20010726 153.00000 23.37849 26.24567 28.39418 1.000000 1.6599109 4.0000000
## 20010727 159.00000 24.00000 28.30000 26.50000 2.000000 1.9550015 7.0000000
## 20010728 149.00000 23.30000 27.60000 28.80000 4.000000 2.7453163 3.0000000
## 20010729 160.00000 19.78321 25.71437 25.23432 3.091227 0.5729389 2.8732227
## 20010730 156.00000 24.90000 30.50000 32.20000 0.000000 1.0000000 4.0000000
## 20010731 84.00000 21.98897 26.30000 27.80000 5.552872 2.8280423 2.0000000
## 20010801 126.00000 25.30000 29.50000 31.20000 3.293608 4.0000000 4.0000000
## 20010802 116.00000 21.30000 23.80000 22.10000 7.000000 7.0000000 8.0000000
## 20010803 77.00000 20.00000 18.20000 23.60000 5.000000 7.0000000 6.0000000
## 20010804 63.00000 18.70000 20.60000 20.30000 6.000000 4.1019861 7.0000000
## 20010805 67.57088 18.60000 18.70000 17.80000 8.000000 8.0000000 8.0000000
## 20010806 65.00000 19.20000 23.00000 22.70000 8.000000 7.0000000 7.0000000
## 20010807 72.00000 19.90000 24.20664 20.40000 7.000000 7.0000000 8.0000000
## 20010808 60.00000 18.70000 21.40000 21.70000 7.000000 7.0000000 7.0000000
## 20010809 70.00000 18.40000 17.10000 20.23380 3.000000 6.0000000 3.0000000
## 20010810 77.00000 16.60758 21.02899 23.69575 4.000000 5.0000000 3.1876648
## 20010811 98.00000 17.72105 23.43337 27.95671 1.000000 1.0000000 0.0000000
## 20010812 111.00000 23.15200 25.61011 27.31717 1.000000 5.0000000 2.0000000
## 20010813 75.00000 18.75488 20.15008 22.69952 8.000000 7.0000000 1.0000000
## 20010814 116.00000 23.50000 29.80000 31.70000 1.000000 3.0000000 5.0000000
## 20010815 109.00000 20.80000 23.70000 26.60000 8.000000 5.0000000 4.0000000
## 20010819 67.00000 18.80000 20.31022 18.90000 6.010757 5.2052648 6.5903960
## 20010820 76.00000 15.84810 19.55467 24.00000 6.161880 5.0000000 5.0000000
## 20010821 113.00000 20.60000 24.80000 27.00000 2.310979 2.9511850 4.3448644
## 20010822 117.00000 21.60000 26.90000 28.60000 6.000000 3.5938154 4.0000000
## 20010823 131.00000 20.29807 28.40000 30.10000 5.000000 3.0000000 3.0000000
## 20010824 166.00000 19.80000 27.20000 30.80000 4.000000 0.0000000 1.0000000
## 20010825 159.00000 25.00000 33.50000 35.50000 1.000000 -1.5158041 1.0000000
## 20010826 91.71810 20.10000 22.90000 27.60000 8.000000 8.0000000 6.0000000
## 20010827 114.00000 20.48188 26.58337 26.72147 7.000000 4.0000000 5.0000000
## 20010828 107.27221 21.00000 24.40000 26.57361 1.000000 6.0000000 3.0000000
## 20010829 66.58182 16.90000 17.80000 20.60000 7.787363 8.9938564 7.0000000
## 20010830 76.00000 17.71420 18.60000 18.70000 7.000000 7.0000000 7.0000000
## 20010831 59.00000 16.50000 20.30000 20.30000 5.000000 7.0000000 6.0000000
## 20010901 78.00000 17.70000 20.20000 21.50000 4.275804 5.0159247 4.2307628
## 20010902 76.00000 17.30000 22.70000 24.60000 4.000000 4.1773388 6.0000000
## 20010903 55.00000 15.30000 16.80000 19.20000 8.000000 7.0000000 5.0000000
## 20010904 71.00000 15.90000 19.20000 19.50000 7.000000 5.0000000 3.0000000
## 20010905 94.65995 16.20000 18.90000 19.30000 2.000000 5.0000000 6.0000000
## 20010906 59.00000 13.08482 14.59119 15.71161 7.000000 7.0000000 7.0000000
## 20010907 63.49682 17.87334 22.83667 23.03242 6.000000 5.0000000 6.5913788
## 20010908 63.00000 17.30000 19.80000 19.40000 5.843165 5.7561851 6.6620648
## 20010912 70.81305 14.20000 22.20000 19.83752 5.000000 4.0877775 6.0000000
## 20010913 74.00000 15.80000 18.70000 19.10000 7.468648 7.0000000 7.0000000
## 20010914 71.00000 15.20000 17.90000 18.60000 7.502740 6.3439714 4.2376973
## 20010915 69.00000 17.10000 17.70000 17.50000 6.000000 7.0000000 8.0000000
## 20010916 71.00000 15.40000 16.38294 16.60000 4.000000 5.0000000 5.0000000
## 20010917 60.00000 13.41068 14.05908 17.94003 4.000000 5.0000000 4.0000000
## 20010918 42.00000 12.25438 14.30000 14.90000 8.000000 7.0000000 7.0000000
## 20010919 65.00000 14.80000 13.74786 15.90000 7.000000 8.3104476 7.0000000
## 20010920 71.00000 15.50000 18.00000 17.40000 7.000000 7.0000000 6.0000000
## 20010921 96.00000 11.30000 17.49923 20.20000 3.000000 3.0000000 3.0000000
## 20010922 98.00000 15.20000 19.70000 20.30000 2.000000 2.0000000 2.0000000
## 20010923 92.00000 13.04676 17.60000 18.20000 1.000000 4.0000000 6.0000000
## 20010924 76.00000 13.30000 17.70000 17.70000 6.027148 5.5973732 7.7434040
## 20010925 72.36693 13.30000 16.32845 17.80000 3.000000 5.0000000 4.5296621
## 20010927 77.00000 16.20000 20.80000 20.89644 6.722992 7.2467261 7.1971120
## 20010928 99.00000 16.83962 21.05942 23.30569 3.458714 6.1277035 6.9607010
## 20010929 83.00000 19.24375 21.36490 23.47164 3.778340 5.0000000 3.0000000
## 20010930 70.00000 15.70000 18.60000 20.70000 7.000000 4.9782737 7.0000000
## Vx9 Vx12 Vx15 maxO3v
## 20010601 0.6946000 -1.7101000 -0.69460000 84.00000
## 20010602 -4.3301000 -4.0000000 -3.00000000 87.00000
## 20010603 2.9544000 0.8976546 0.52090000 82.00000
## 20010604 -0.2688894 0.3473000 -0.17360000 92.00000
## 20010605 -0.5000000 -2.9544000 -4.33010000 114.00000
## 20010606 -5.6382000 -5.0000000 -6.00000000 94.00000
## 20010607 -4.3301000 -1.8794000 -3.75880000 80.00000
## 20010610 0.0000000 -1.0419000 -1.38920000 99.00000
## 20010611 -0.7660000 -1.0261000 -2.29810000 79.00000
## 20010612 1.2856000 -2.2981000 -3.93920000 101.00000
## 20010613 -1.5000000 -1.5000000 -0.86820000 106.00000
## 20010614 0.4411571 -1.0419000 -0.69460000 101.00000
## 20010615 -0.8682000 -2.7362000 -6.89440000 90.00000
## 20010616 -4.7784876 -7.8785000 -5.19620000 72.00000
## 20010617 -4.3301000 -2.0521000 -3.00000000 70.00000
## 20010618 0.5209000 -2.9544000 -1.02610000 83.00000
## 20010620 -0.3420000 -1.6125710 -0.68400000 121.00000
## 20010621 0.0000000 0.3473000 -2.57120000 78.38886
## 20010622 4.0750114 3.7618992 2.00000000 81.00000
## 20010623 1.0000000 -1.9284000 -1.21550000 121.00000
## 20010624 -0.2068360 -0.5209000 1.02610000 146.00000
## 20010625 2.9544000 6.5778000 4.44405207 121.00000
## 20010626 -2.5712000 -3.8567000 -4.69850000 146.00000
## 20010627 -2.5981000 -3.9145720 -2.79472331 80.71020
## 20010628 -5.6382000 -3.8302000 -4.59630000 83.00000
## 20010629 -1.9284000 -2.5712000 -4.33010000 57.00000
## 20010630 -1.5321000 -3.0642000 -0.86820000 81.00000
## 20010701 0.6840000 0.0000000 1.36810000 67.00000
## 20010702 2.8191000 3.9392000 3.46410000 70.00000
## 20010703 1.8794000 2.0000000 1.36810000 106.00000
## 20010704 0.6946000 -0.8660000 -1.02610000 139.00000
## 20010705 0.0000000 0.0000000 1.28560000 79.00000
## 20010706 0.0000000 1.7101000 0.08428691 93.00000
## 20010707 -3.7588000 -3.9392000 -4.69850000 97.00000
## 20010708 -2.5981000 -3.9392000 -3.75880000 113.00000
## 20010709 -1.9696000 -3.0642000 -4.00000000 72.00000
## 20010710 -2.0549754 -5.6382000 -9.00000000 88.00000
## 20010711 -7.8785000 -6.8937000 -6.89370000 77.00000
## 20010712 1.5000000 -3.8302000 -2.05210000 71.00000
## 20010713 0.6840000 4.0000000 5.32711200 31.17708
## 20010714 -3.2139000 -3.2614443 -1.42964366 45.00000
## 20010715 -2.2981000 -3.7588000 0.00000000 67.00000
## 20010716 0.0000000 -0.4791944 -2.59810000 67.00000
## 20010717 2.0000000 -5.3623000 -6.12840000 84.00000
## 20010718 -4.0521162 -3.8302000 -4.33010000 63.00000
## 20010719 -2.0521000 -4.4995000 -2.73620000 69.00000
## 20010720 -1.6964161 -3.4641000 -5.03086388 92.00000
## 20010721 -3.0000000 -3.5000000 -3.71171123 88.00000
## 20010722 -0.3237145 -1.9696000 -2.17583160 66.00000
## 20010723 0.5209000 -1.0000000 -2.00000000 97.19249
## 20010724 -0.3482894 -1.0261000 0.52090000 81.00000
## 20010725 2.5283422 -0.9397000 -0.64280000 83.00000
## 20010726 0.9397000 1.5000000 -0.29283429 149.00000
## 20010727 -0.3420000 1.2856000 -2.00000000 125.87289
## 20010728 0.8660000 -1.5321000 -0.17360000 159.00000
## 20010729 1.5321000 0.4357703 -0.76272537 149.00000
## 20010730 -0.5000000 -1.8794000 -1.28560000 160.00000
## 20010731 -1.3681000 -4.2848608 0.00000000 156.00000
## 20010801 3.0000000 3.7588000 2.77992396 84.00000
## 20010802 0.0000000 -2.3941000 -1.38920000 126.00000
## 20010803 -3.4641000 -2.5981000 -3.75880000 116.00000
## 20010804 -5.0000000 -4.9240000 -5.63820000 86.87946
## 20010805 -4.6985000 -2.5000000 -0.86820000 63.00000
## 20010806 -3.8302000 -4.9240000 -5.63820000 54.00000
## 20010807 -3.0000000 -4.5963000 -1.82815973 65.00000
## 20010808 -5.6382000 -6.0622000 -6.89370000 72.00000
## 20010809 -5.9088000 -3.2139000 -4.49950000 60.00000
## 20010810 -1.9284000 -1.0261000 0.52090000 70.00000
## 20010811 3.3930906 -1.5321000 -1.00000000 85.84282
## 20010812 -1.0261000 -3.0000000 -2.29810000 98.00000
## 20010813 -0.8660000 0.0000000 0.00000000 74.11551
## 20010814 1.8794000 1.3681000 0.69460000 75.00000
## 20010815 -1.0261000 -1.7101000 -3.21390000 116.00000
## 20010819 -5.9608243 -5.3623000 -2.50000000 86.00000
## 20010820 -3.0642000 -2.2981000 -2.21493417 67.00000
## 20010821 1.3681000 0.8682000 -2.29810000 76.00000
## 20010822 1.5321000 1.9284000 1.92840000 113.00000
## 20010823 0.1736000 -1.9696000 -1.92840000 117.00000
## 20010824 0.6428000 -0.8660000 0.68400000 131.00000
## 20010825 1.0000000 0.6946000 -1.71010000 166.00000
## 20010826 1.2856000 -1.7321000 -0.68400000 119.34756
## 20010827 3.0642000 2.8191000 1.36810000 100.00000
## 20010828 4.0000000 4.0000000 3.75880000 114.00000
## 20010829 -2.0000000 -0.5209000 1.87940000 112.00000
## 20010830 -3.4641000 -4.0000000 -1.73210000 101.00000
## 20010831 -4.3301000 -5.3623000 -3.92730996 76.00000
## 20010901 0.7955965 0.5209000 0.00000000 59.00000
## 20010902 -2.9544000 -2.9544000 -2.00000000 109.36762
## 20010903 -1.8794000 -1.6415116 -2.39410000 76.00000
## 20010904 -0.5315167 -1.3063690 -1.38920000 55.00000
## 20010905 -1.3681000 -0.8682000 -1.60526148 71.00000
## 20010906 -2.4545132 -1.9284000 -1.71010000 66.00000
## 20010907 -1.5000000 -3.4641000 -3.06420000 59.00000
## 20010908 -4.5963000 -6.0622000 -4.33010000 68.00000
## 20010912 -0.8660000 -5.0000000 -2.67185584 62.00000
## 20010913 -4.5963000 -6.8937000 -7.14757735 78.00000
## 20010914 -1.0419000 -1.3681000 -0.01695107 74.00000
## 20010915 -5.1962000 -2.7362000 -1.04190000 71.00000
## 20010916 -3.8302000 0.0000000 1.38920000 69.00000
## 20010917 0.0000000 3.2139000 0.00000000 71.00000
## 20010918 -2.5000000 -3.2139000 -2.50000000 60.00000
## 20010919 -6.3084014 -6.0622000 -5.19620000 42.00000
## 20010920 -3.9392000 -3.0642000 0.00000000 65.00000
## 20010921 -0.1736000 3.7588000 3.83020000 71.00000
## 20010922 4.0000000 5.0000000 3.72927005 96.00000
## 20010923 5.1962000 5.1423000 5.86105076 98.00000
## 20010924 -0.9397000 -0.7660000 -0.50000000 73.50625
## 20010925 0.0000000 -1.0000000 -1.28560000 76.00000
## 20010927 -0.6946000 -2.0000000 -2.57377611 71.00000
## 20010928 1.5000000 0.8682000 0.86820000 77.42839
## 20010929 -4.0000000 -3.7588000 -4.00000000 99.00000
## 20010930 -2.2879663 -1.0419000 -4.00000000 83.00000
#pairs(ozone_imputed)
model_ozone <- lm(maxO3 ~ T9+T12+T15+Ne9+Ne12+Ne15+Vx9+Vx12+Vx15+maxO3v, data = ozone_imputed)
summary(model_ozone)
##
## Call:
## lm(formula = maxO3 ~ T9 + T12 + T15 + Ne9 + Ne12 + Ne15 + Vx9 +
## Vx12 + Vx15 + maxO3v, data = ozone_imputed)
##
## Residuals:
## Min 1Q Median 3Q Max
## -32.942 -8.760 0.298 7.623 40.737
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 24.63947 12.65663 1.947 0.05434 .
## T9 -0.27722 1.31484 -0.211 0.83344
## T12 3.88926 1.43688 2.707 0.00798 **
## T15 -0.98922 0.98539 -1.004 0.31783
## Ne9 -2.19388 0.87467 -2.508 0.01373 *
## Ne12 -1.03719 1.44623 -0.717 0.47493
## Ne15 -0.44998 0.99415 -0.453 0.65179
## Vx9 0.03844 0.95232 0.040 0.96788
## Vx12 2.81121 1.15746 2.429 0.01692 *
## Vx15 -1.47269 0.92925 -1.585 0.11613
## maxO3v 0.33157 0.07101 4.670 9.31e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.65 on 101 degrees of freedom
## Multiple R-squared: 0.7823, Adjusted R-squared: 0.7607
## F-statistic: 36.29 on 10 and 101 DF, p-value: < 2.2e-16
We use the Ozone dataset again.
We keep only the continuous variables (the first 11 columns).
## maxO3 T9 T12 T15 Ne9 Ne12 Ne15 Vx9 Vx12 Vx15 maxO3v
## 20010601 87 15.6 18.5 NA 4 4 8 0.6946 -1.7101 -0.6946 84
## 20010602 82 NA NA NA 5 5 7 -4.3301 -4.0000 -3.0000 87
## 20010603 92 15.3 17.6 19.5 2 NA NA 2.9544 NA 0.5209 82
## 20010604 114 16.2 19.7 NA 1 1 0 NA 0.3473 -0.1736 92
## 20010605 94 NA 20.5 20.4 NA NA NA -0.5000 -2.9544 -4.3301 114
## 20010606 80 17.7 19.8 18.3 6 NA 7 -5.6382 -5.0000 -6.0000 94
## 20010607 79 16.8 15.6 14.9 7 8 NA -4.3301 -1.8794 -3.7588 80
## 20010610 79 14.9 17.5 18.9 5 5 NA 0.0000 -1.0419 -1.3892 99
## 20010611 101 16.1 19.6 21.4 2 NA 4 -0.7660 -1.0261 -2.2981 79
## 20010612 106 18.3 NA 22.9 5 NA NA 1.2856 -2.2981 -3.9392 101
## 20010613 101 17.3 19.3 20.2 7 7 3 -1.5000 -1.5000 -0.8682 106
## 20010614 90 17.6 20.3 17.4 NA 6 8 NA -1.0419 -0.6946 101
## 20010615 72 NA NA NA 7 5 6 -0.8682 -2.7362 -6.8944 90
## 20010616 70 17.1 18.2 18.0 NA 7 NA NA -7.8785 -5.1962 72
## 20010617 83 15.4 NA 16.6 8 7 NA -4.3301 -2.0521 -3.0000 70
## 20010618 88 NA 19.1 NA 6 5 4 0.5209 -2.9544 -1.0261 83
## 20010620 NA 21.0 24.6 26.9 NA NA 1 -0.3420 NA -0.6840 121
## 20010621 NA NA NA NA NA NA NA 0.0000 0.3473 -2.5712 NA
## 20010622 121 19.7 24.2 26.9 2 1 0 NA NA 2.0000 81
## 20010623 146 23.6 28.6 28.4 NA NA NA 1.0000 -1.9284 -1.2155 121
## maxO3 T9 T12 T15
## Min. : 42.00 Min. :11.30 Min. :14.30 Min. :14.90
## 1st Qu.: 71.00 1st Qu.:16.00 1st Qu.:18.60 1st Qu.:18.90
## Median : 81.50 Median :17.70 Median :20.40 Median :21.40
## Mean : 91.24 Mean :18.22 Mean :21.46 Mean :22.41
## 3rd Qu.:108.25 3rd Qu.:19.90 3rd Qu.:23.60 3rd Qu.:25.65
## Max. :166.00 Max. :25.30 Max. :33.50 Max. :35.50
## NA's :16 NA's :37 NA's :33 NA's :37
## Ne9 Ne12 Ne15 Vx9
## Min. :0.000 Min. :0.000 Min. :0.00 Min. :-7.8785
## 1st Qu.:3.000 1st Qu.:4.000 1st Qu.:3.00 1st Qu.:-3.0000
## Median :5.000 Median :5.000 Median :5.00 Median :-0.8671
## Mean :4.987 Mean :4.986 Mean :4.60 Mean :-1.0958
## 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:6.25 3rd Qu.: 0.6919
## Max. :8.000 Max. :8.000 Max. :8.00 Max. : 5.1962
## NA's :34 NA's :42 NA's :32 NA's :18
## Vx12 Vx15 maxO3v
## Min. :-7.8785 Min. :-9.000 Min. : 42.00
## 1st Qu.:-3.6941 1st Qu.:-3.759 1st Qu.: 70.00
## Median :-1.9284 Median :-1.710 Median : 82.50
## Mean :-1.6853 Mean :-1.830 Mean : 89.39
## 3rd Qu.:-0.1302 3rd Qu.: 0.000 3rd Qu.:101.00
## Max. : 6.5778 Max. : 3.830 Max. :166.00
## NA's :10 NA's :21 NA's :12
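The subsetting and missing-value summary above can be sketched in base R as follows. The loading code is not shown in this section, so the sketch uses a small made-up data frame (`toy`, with hypothetical values) in place of the ozone data; only the column names echo the real ones.

```r
# Toy stand-in for the ozone data: numeric columns first, a factor last.
# All values are simulated for illustration only.
set.seed(42)
toy <- data.frame(
  maxO3 = round(rnorm(8, mean = 90, sd = 20)),
  T9    = round(rnorm(8, mean = 18, sd = 3), 1),
  Ne9   = sample(0:8, 8, replace = TRUE),
  WindDirection = factor(sample(c("East", "West"), 8, replace = TRUE))
)
toy$maxO3[3]    <- NA
toy$T9[c(1, 6)] <- NA
toy$Ne9[5]      <- NA

# Keep only the continuous variables (here the first 3 columns;
# in the handout, the first 11 columns)
num_only <- toy[, 1:3]

# Number and percentage of missing values per column
na_count   <- colSums(is.na(num_only))
na_percent <- round(100 * colMeans(is.na(num_only)), 1)
print(rbind(na_count, na_percent))

# summary() reports the NA count of each column in its last row
print(summary(num_only))
```

Counting missing values per column before imputing helps to judge how much of each variable will be reconstructed rather than observed.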
We use the package missMDA to perform a PCA with missing values
## Warning: package 'missMDA' was built under R version 4.0.5
## Warning: package 'FactoMineR' was built under R version 4.0.5
## $ncp
## [1] 2
##
## $criterion
## 0 1 2 3 4 5
## 170.77775 88.60110 84.97487 104.60004 92.98638 115.43620
plot(0:(length(nb_dim$criterion)-1), nb_dim$criterion, xlab = "Number of dimensions",
     ylab = "MSEP", col = 'red', type = 'l')
The criterion (MSEP) is smallest at 2 dimensions, so we choose ncp = 2.
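The call that produced `nb_dim` is not shown in this section. The sketch below shows one way to obtain such an object with `estim_ncpPCA()` from missMDA. It assumes the package is installed and uses the ozone data bundled with missMDA rather than the file loaded in this handout. The range 0 to 5 matches the criterion labels above; `method.cv = "gcv"` is an assumption, and a different cross-validation method would give different criterion values.

```r
# Assumption: missMDA is installed; the block is skipped otherwise
if (requireNamespace("missMDA", quietly = TRUE)) {
  data(ozone, package = "missMDA")   # bundled ozone data with missing values
  X <- ozone[, 1:11]                 # continuous variables only

  # Cross-validation criterion for 0 to 5 dimensions
  # (method.cv = "gcv" is an assumption, not taken from the handout)
  nb <- missMDA::estim_ncpPCA(X, ncp.min = 0, ncp.max = 5, method.cv = "gcv")

  print(nb$ncp)        # selected number of dimensions
  print(nb$criterion)  # one criterion value per candidate dimension
}
```

The selected `nb$ncp` is then passed to `imputePCA()` as the number of dimensions.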
miss_ozne_PCA <- imputePCA(miss_ozone, ncp = nb_dim$ncp) # iterative PCA algorithm
str(miss_ozne_PCA)
## List of 2
## $ completeObs: num [1:112, 1:11] 87 82 92 114 94 80 79 79 101 106 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:112] "20010601" "20010602" "20010603" "20010604" ...
## .. ..$ : chr [1:11] "maxO3" "T9" "T12" "T15" ...
## $ fittedX : num [1:112, 1:11] 80.7 82 89.1 109.6 89.9 ...
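The comment above calls `imputePCA()` an iterative PCA algorithm. A minimal base R sketch of that idea follows, assuming the plain EM variant on standardized columns. The default of `imputePCA()` is a regularized version, so this is not the package's implementation. The function `iterative_pca_impute()` and the simulated data are made up for illustration.

```r
# Sketch of iterative (EM) PCA imputation; NOT the missMDA implementation
iterative_pca_impute <- function(X, ncp = 2, max_iter = 500, tol = 1e-6) {
  X <- as.matrix(X)
  miss <- is.na(X)
  # Step 0: start from the column means of the observed values
  col_means <- colMeans(X, na.rm = TRUE)
  X[miss] <- col_means[col(X)[miss]]
  for (iter in seq_len(max_iter)) {
    # Step 1: standardize the current completed data
    mu    <- colMeans(X)
    sigma <- apply(X, 2, sd)
    Z <- sweep(sweep(X, 2, mu, "-"), 2, sigma, "/")
    # Step 2: rank-ncp reconstruction from the SVD
    s <- svd(Z, nu = ncp, nv = ncp)
    Z_hat <- s$u %*% diag(s$d[seq_len(ncp)], ncp, ncp) %*% t(s$v)
    X_hat <- sweep(sweep(Z_hat, 2, sigma, "*"), 2, mu, "+")
    # Step 3: overwrite only the missing cells with the fitted values
    change <- sum((X[miss] - X_hat[miss])^2)
    X[miss] <- X_hat[miss]
    if (change < tol) break
  }
  list(completeObs = X, fittedX = X_hat, iterations = iter)
}

# Simulated data with a 2-dimensional structure plus noise
set.seed(1)
n <- 100
scores_true   <- matrix(rnorm(n * 2), n, 2)
loadings_true <- matrix(rnorm(2 * 5), 2, 5)
full <- scores_true %*% loadings_true + matrix(rnorm(n * 5, sd = 0.1), n, 5)
colnames(full) <- paste0("V", 1:5)

# Remove one randomly chosen cell in each of 40 randomly chosen rows
holes <- full
holes[cbind(sample(n, 40), sample(5, 40, replace = TRUE))] <- NA

res <- iterative_pca_impute(holes, ncp = 2)

# Compare with mean imputation on the cells that were removed
mean_fill <- holes
for (j in seq_len(ncol(holes))) {
  mean_fill[is.na(holes[, j]), j] <- mean(holes[, j], na.rm = TRUE)
}
rmse_pca  <- sqrt(mean((res$completeObs - full)[is.na(holes)]^2))
rmse_mean <- sqrt(mean((mean_fill - full)[is.na(holes)]^2))
print(c(rmse_pca = rmse_pca, rmse_mean = rmse_mean))
```

As in the `str()` output above, the result holds the completed data (`completeObs`, observed cells unchanged) and the low-rank fit of every cell (`fittedX`).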
## maxO3 T9 T12 T15 Ne9 Ne12 Ne15
## 20010601 87.00000 15.60000 18.50000 20.47146 4.0000000 4.0000000 8.000000
## 20010602 82.00000 18.50470 20.86986 21.79932 5.0000000 5.0000000 7.000000
## 20010603 92.00000 15.30000 17.60000 19.50000 2.0000000 3.9840665 3.812104
## 20010604 114.00000 16.20000 19.70000 24.69265 1.0000000 1.0000000 0.000000
## 20010605 94.00000 18.96840 20.50000 20.40000 5.2941294 5.2716572 5.055926
## 20010606 80.00000 17.70000 19.80000 18.30000 6.0000000 7.0196398 7.000000
## 20010607 79.00000 16.80000 15.60000 14.90000 7.0000000 8.0000000 6.556354
## 20010610 79.00000 14.90000 17.50000 18.90000 5.0000000 5.0000000 5.015579
## 20010611 101.00000 16.10000 19.60000 21.40000 2.0000000 4.6909015 4.000000
## 20010612 106.00000 18.30000 22.49421 22.90000 5.0000000 4.6268550 4.495404
## 20010613 101.00000 17.30000 19.30000 20.20000 7.0000000 7.0000000 3.000000
## 20010614 90.00000 17.60000 20.30000 17.40000 5.5159048 6.0000000 8.000000
## 20010615 72.00000 18.66685 20.98567 21.91201 7.0000000 5.0000000 6.000000
## 20010616 70.00000 17.10000 18.20000 18.00000 7.5872379 7.0000000 7.072220
## 20010617 83.00000 15.40000 17.67644 16.60000 8.0000000 7.0000000 6.395235
## 20010618 88.00000 17.36484 19.10000 21.42266 6.0000000 5.0000000 4.000000
## 20010620 119.58078 21.00000 24.60000 26.90000 2.9567254 2.8481608 1.000000
## 20010621 100.45399 19.11688 22.77170 24.18467 4.1731211 4.1654596 4.086786
## 20010622 121.00000 19.70000 24.20000 26.90000 2.0000000 1.0000000 0.000000
## 20010623 146.00000 23.60000 28.60000 28.40000 2.6571327 2.4240083 2.683541
## 20010624 121.00000 20.40000 25.20000 27.70000 1.0000000 0.0000000 0.000000
## 20010625 146.00000 20.93913 27.31696 29.79175 0.4231594 0.0000000 0.000000
## 20010626 108.00000 24.00000 23.50000 28.72416 4.0000000 4.0000000 0.000000
## 20010627 83.00000 19.70000 22.90000 24.80000 5.2552836 5.1968311 5.008572
## 20010628 71.66914 17.88874 19.62904 20.29955 6.8111057 6.8299652 6.390548
## 20010629 81.00000 18.14849 21.04328 22.12432 3.0000000 4.0000000 4.000000
## 20010630 67.00000 18.20149 23.40000 23.70000 5.1512162 5.1855177 4.953085
## 20010701 70.00000 16.67781 20.50254 21.78439 5.0000000 2.0000000 1.000000
## 20010702 106.00000 17.50079 22.69045 24.49755 1.9890118 0.0000000 1.000000
## 20010703 139.00000 21.93161 30.10000 31.90000 1.7545547 1.0000000 4.000000
## 20010704 79.00000 20.14346 24.14224 25.75090 3.7206677 3.6514699 3.670153
## 20010705 73.74963 16.80000 18.20000 22.00000 8.0000000 8.0000000 6.000000
## 20010706 108.76862 20.80000 23.66573 25.29410 3.3998698 3.0000000 4.000000
## 20010707 113.00000 19.18635 18.20000 22.70000 5.8917695 5.8406507 5.564770
## 20010708 72.00000 19.35848 21.20000 23.90000 7.0000000 5.5936099 4.000000
## 20010709 88.00000 19.20000 22.00000 22.19128 5.6444901 5.6406281 5.367590
## 20010710 77.00000 19.40000 20.70000 22.50000 7.0000000 8.0000000 6.833130
## 20010711 71.00000 19.20000 21.00000 22.40000 6.0000000 4.0000000 6.000000
## 20010712 56.00000 13.80000 16.83463 18.50000 8.0000000 8.0000000 6.000000
## 20010713 45.00000 11.93983 14.50000 15.20000 8.0000000 6.4901827 8.000000
## 20010714 67.00000 15.60000 18.60000 18.56771 5.0000000 6.2761410 5.000000
## 20010715 74.77227 16.90000 19.10000 19.84864 5.0000000 5.8966989 6.000000
## 20010716 84.00000 17.40000 20.40000 21.33420 3.0000000 5.1484124 6.000000
## 20010717 63.00000 17.88233 20.50000 20.60000 8.0000000 6.0000000 6.000000
## 20010718 54.72441 15.42821 15.60000 16.97300 7.4499932 8.0000000 7.005500
## 20010719 92.00000 16.70000 19.10000 19.30000 7.0000000 6.0000000 4.000000
## 20010720 88.00000 18.42964 20.30000 22.07409 5.5831385 5.5918034 5.318972
## 20010721 66.00000 18.00000 19.85188 20.64123 8.0000000 6.0000000 5.000000
## 20010722 72.00000 18.60000 21.90000 23.60000 4.0000000 7.0000000 6.000000
## 20010723 81.00000 18.80000 22.50000 23.90000 6.0000000 3.0000000 2.000000
## 20010724 97.88748 19.00000 22.50000 24.10000 4.0938663 4.1373853 4.037186
## 20010725 149.00000 19.90000 26.90000 29.00000 3.0000000 4.0000000 3.163109
## 20010726 153.00000 23.30026 28.72704 31.06274 1.0000000 1.5659263 4.000000
## 20010727 159.00000 24.00000 28.30000 26.50000 2.0000000 2.7172560 7.000000
## 20010728 149.00000 23.30000 27.60000 28.80000 4.0000000 2.2552537 3.000000
## 20010729 160.00000 24.05070 29.81088 32.31737 1.3676494 1.0784140 1.541148
## 20010730 156.00000 24.90000 30.50000 32.20000 0.0000000 1.0000000 4.000000
## 20010731 84.00000 21.53860 26.30000 27.80000 3.1963484 3.0414876 2.000000
## 20010801 126.00000 25.30000 29.50000 31.20000 2.1398121 4.0000000 4.000000
## 20010802 116.00000 21.30000 23.80000 22.10000 7.0000000 7.0000000 8.000000
## 20010803 77.00000 20.00000 18.20000 23.60000 5.0000000 7.0000000 6.000000
## 20010804 63.00000 18.70000 20.60000 20.30000 6.0000000 6.9421933 7.000000
## 20010805 62.81727 18.60000 18.70000 17.80000 8.0000000 8.0000000 8.000000
## 20010806 65.00000 19.20000 23.00000 22.70000 8.0000000 7.0000000 7.000000
## 20010807 72.00000 19.90000 19.53917 20.40000 7.0000000 7.0000000 8.000000
## 20010808 60.00000 18.70000 21.40000 21.70000 7.0000000 7.0000000 7.000000
## 20010809 70.00000 18.40000 17.10000 20.65442 3.0000000 6.0000000 3.000000
## 20010810 77.00000 16.46104 19.45886 20.44428 4.0000000 5.0000000 4.884991
## 20010811 98.00000 21.76724 26.71189 28.76601 1.0000000 1.0000000 0.000000
## 20010812 111.00000 20.75629 24.67737 26.30714 1.0000000 5.0000000 2.000000
## 20010813 75.00000 15.99421 18.86287 19.76852 8.0000000 7.0000000 1.000000
## 20010814 116.00000 23.50000 29.80000 31.70000 1.0000000 3.0000000 5.000000
## 20010815 109.00000 20.80000 23.70000 26.60000 8.0000000 5.0000000 4.000000
## 20010819 67.00000 18.80000 19.73029 18.90000 6.4111997 6.4517505 6.051011
## 20010820 76.00000 17.85731 20.50783 24.00000 5.5405328 5.0000000 5.000000
## 20010821 113.00000 20.60000 24.80000 27.00000 3.5121485 3.4711458 3.499483
## 20010822 117.00000 21.60000 26.90000 28.60000 6.0000000 2.7243531 4.000000
## 20010823 131.00000 22.44351 28.40000 30.10000 5.0000000 3.0000000 3.000000
## 20010824 166.00000 19.80000 27.20000 30.80000 4.0000000 0.0000000 1.000000
## 20010825 159.00000 25.00000 33.50000 35.50000 1.0000000 0.6539349 1.000000
## 20010826 91.66020 20.10000 22.90000 27.60000 8.0000000 8.0000000 6.000000
## 20010827 114.00000 17.39384 21.41087 22.81311 7.0000000 4.0000000 5.000000
## 20010828 118.19522 21.00000 24.40000 26.08930 1.0000000 6.0000000 3.000000
## 20010829 82.63378 16.90000 17.80000 20.60000 4.9754080 5.1131422 7.000000
## 20010830 76.00000 17.18287 18.60000 18.70000 7.0000000 7.0000000 7.000000
## 20010831 59.00000 16.50000 20.30000 20.30000 5.0000000 7.0000000 6.000000
## 20010901 78.00000 17.70000 20.20000 21.50000 4.7158403 4.8724115 4.620659
## 20010902 76.00000 17.30000 22.70000 24.60000 4.0000000 5.3307655 6.000000
## 20010903 55.00000 15.30000 16.80000 19.20000 8.0000000 7.0000000 5.000000
## 20010904 71.00000 15.90000 19.20000 19.50000 7.0000000 5.0000000 3.000000
## 20010905 80.57400 16.20000 18.90000 19.30000 2.0000000 5.0000000 6.000000
## 20010906 59.00000 15.19454 17.06647 17.57319 7.0000000 7.0000000 7.000000
## 20010907 74.04784 16.81261 19.13269 19.91609 6.0000000 5.0000000 5.697830
## 20010908 63.00000 17.30000 19.80000 19.40000 7.2022695 7.2510887 6.741975
## 20010912 72.91166 14.20000 22.20000 19.66105 5.0000000 6.1161170 6.000000
## 20010913 74.00000 15.80000 18.70000 19.10000 7.3092925 7.0000000 7.000000
## 20010914 71.00000 15.20000 17.90000 18.60000 5.6805752 5.8725486 5.472865
## 20010915 69.00000 17.10000 17.70000 17.50000 6.0000000 7.0000000 8.000000
## 20010916 71.00000 15.40000 18.09115 16.60000 4.0000000 5.0000000 5.000000
## 20010917 60.00000 15.28343 18.56520 19.55561 4.0000000 5.0000000 4.000000
## 20010918 42.00000 14.09084 14.30000 14.90000 8.0000000 7.0000000 7.000000
## 20010919 65.00000 14.80000 16.42518 15.90000 7.0000000 7.9819722 7.000000
## 20010920 71.00000 15.50000 18.00000 17.40000 7.0000000 7.0000000 6.000000
## 20010921 96.00000 11.30000 18.62334 20.20000 3.0000000 3.0000000 3.000000
## 20010922 98.00000 15.20000 19.70000 20.30000 2.0000000 2.0000000 2.000000
## 20010923 92.00000 14.90403 17.60000 18.20000 1.0000000 4.0000000 6.000000
## 20010924 76.00000 13.30000 17.70000 17.70000 5.6310567 5.8834844 5.452620
## 20010925 75.57301 13.30000 18.43401 17.80000 3.0000000 5.0000000 5.000903
## 20010927 77.00000 16.20000 20.80000 20.49870 5.3682787 5.4945885 5.176542
## 20010928 99.00000 18.07392 22.16934 23.65068 3.5307173 3.6103283 3.561142
## 20010929 83.00000 19.85538 22.66328 23.84691 5.3742296 5.0000000 3.000000
## 20010930 70.00000 15.70000 18.60000 20.70000 7.0000000 6.4053442 7.000000
## Vx9 Vx12 Vx15 maxO3v
## 20010601 0.69460000 -1.71010000 -0.6946000 84.00000
## 20010602 -4.33010000 -4.00000000 -3.0000000 87.00000
## 20010603 2.95440000 1.95056272 0.5209000 82.00000
## 20010604 2.04389376 0.34730000 -0.1736000 92.00000
## 20010605 -0.50000000 -2.95440000 -4.3301000 114.00000
## 20010606 -5.63820000 -5.00000000 -6.0000000 94.00000
## 20010607 -4.33010000 -1.87940000 -3.7588000 80.00000
## 20010610 0.00000000 -1.04190000 -1.3892000 99.00000
## 20010611 -0.76600000 -1.02610000 -2.2981000 79.00000
## 20010612 1.28560000 -2.29810000 -3.9392000 101.00000
## 20010613 -1.50000000 -1.50000000 -0.8682000 106.00000
## 20010614 -1.60667079 -1.04190000 -0.6946000 101.00000
## 20010615 -0.86820000 -2.73620000 -6.8944000 90.00000
## 20010616 -4.76961237 -7.87850000 -5.1962000 72.00000
## 20010617 -4.33010000 -2.05210000 -3.0000000 70.00000
## 20010618 0.52090000 -2.95440000 -1.0261000 83.00000
## 20010620 -0.34200000 0.01153915 -0.6840000 121.00000
## 20010621 0.00000000 0.34730000 -2.5712000 97.47651
## 20010622 2.93502635 3.08500014 2.0000000 81.00000
## 20010623 1.00000000 -1.92840000 -1.2155000 121.00000
## 20010624 2.16787364 -0.52090000 1.0261000 146.00000
## 20010625 2.95440000 6.57780000 3.3597884 121.00000
## 20010626 -2.57120000 -3.85670000 -4.6985000 146.00000
## 20010627 -2.59810000 -3.03992270 -3.1250488 95.92854
## 20010628 -5.63820000 -3.83020000 -4.5963000 83.00000
## 20010629 -1.92840000 -2.57120000 -4.3301000 57.00000
## 20010630 -1.53210000 -3.06420000 -0.8682000 81.00000
## 20010701 0.68400000 0.00000000 1.3681000 67.00000
## 20010702 2.81910000 3.93920000 3.4641000 70.00000
## 20010703 1.87940000 2.00000000 1.3681000 106.00000
## 20010704 0.69460000 -0.86600000 -1.0261000 139.00000
## 20010705 0.00000000 0.00000000 1.2856000 79.00000
## 20010706 0.00000000 1.71010000 -0.2263833 93.00000
## 20010707 -3.75880000 -3.93920000 -4.6985000 97.00000
## 20010708 -2.59810000 -3.93920000 -3.7588000 113.00000
## 20010709 -1.96960000 -3.06420000 -4.0000000 72.00000
## 20010710 -5.30160893 -5.63820000 -9.0000000 88.00000
## 20010711 -7.87850000 -6.89370000 -6.8937000 77.00000
## 20010712 1.50000000 -3.83020000 -2.0521000 71.00000
## 20010713 0.68400000 4.00000000 0.8115513 44.44483
## 20010714 -3.21390000 -2.00658880 -1.8177682 45.00000
## 20010715 -2.29810000 -3.75880000 0.0000000 67.00000
## 20010716 0.00000000 -1.24408861 -2.5981000 67.00000
## 20010717 2.00000000 -5.36230000 -6.1284000 84.00000
## 20010718 -3.79515001 -3.83020000 -4.3301000 63.00000
## 20010719 -2.05210000 -4.49950000 -2.7362000 69.00000
## 20010720 -2.24328281 -3.46410000 -2.9053296 92.00000
## 20010721 -3.00000000 -3.50000000 -3.4745919 88.00000
## 20010722 -1.96195172 -1.96960000 -2.4577492 66.00000
## 20010723 0.52090000 -1.00000000 -2.0000000 95.35561
## 20010724 -0.04928014 -1.02610000 0.5209000 81.00000
## 20010725 0.37577764 -0.93970000 -0.6428000 83.00000
## 20010726 0.93970000 1.50000000 -0.3960468 149.00000
## 20010727 -0.34200000 1.28560000 -2.0000000 123.91428
## 20010728 0.86600000 -1.53210000 -0.1736000 159.00000
## 20010729 1.53210000 0.75819048 -0.2193392 149.00000
## 20010730 -0.50000000 -1.87940000 -1.2856000 160.00000
## 20010731 -1.36810000 -0.78743724 0.0000000 156.00000
## 20010801 3.00000000 3.75880000 0.2496129 84.00000
## 20010802 0.00000000 -2.39410000 -1.3892000 126.00000
## 20010803 -3.46410000 -2.59810000 -3.7588000 116.00000
## 20010804 -5.00000000 -4.92400000 -5.6382000 81.96823
## 20010805 -4.69850000 -2.50000000 -0.8682000 63.00000
## 20010806 -3.83020000 -4.92400000 -5.6382000 54.00000
## 20010807 -3.00000000 -4.59630000 -4.3950612 65.00000
## 20010808 -5.63820000 -6.06220000 -6.8937000 72.00000
## 20010809 -5.90880000 -3.21390000 -4.4995000 60.00000
## 20010810 -1.92840000 -1.02610000 0.5209000 70.00000
## 20010811 1.04235230 -1.53210000 -1.0000000 120.89095
## 20010812 -1.02610000 -3.00000000 -2.2981000 98.00000
## 20010813 -0.86600000 0.00000000 0.0000000 73.65922
## 20010814 1.87940000 1.36810000 0.6946000 75.00000
## 20010815 -1.02610000 -1.71010000 -3.2139000 116.00000
## 20010819 -3.17255050 -5.36230000 -2.5000000 86.00000
## 20010820 -3.06420000 -2.29810000 -2.4496377 67.00000
## 20010821 1.36810000 0.86820000 -2.2981000 76.00000
## 20010822 1.53210000 1.92840000 1.9284000 113.00000
## 20010823 0.17360000 -1.96960000 -1.9284000 117.00000
## 20010824 0.64280000 -0.86600000 0.6840000 131.00000
## 20010825 1.00000000 0.69460000 -1.7101000 166.00000
## 20010826 1.28560000 -1.73210000 -0.6840000 92.50948
## 20010827 3.06420000 2.81910000 1.3681000 100.00000
## 20010828 4.00000000 4.00000000 3.7588000 114.00000
## 20010829 -2.00000000 -0.52090000 1.8794000 112.00000
## 20010830 -3.46410000 -4.00000000 -1.7321000 101.00000
## 20010831 -4.33010000 -5.36230000 -3.9252674 76.00000
## 20010901 -0.24213097 0.52090000 0.0000000 59.00000
## 20010902 -2.95440000 -2.95440000 -2.0000000 89.30764
## 20010903 -1.87940000 -2.70011852 -2.3941000 76.00000
## 20010904 -1.07063419 -1.08541127 -1.3892000 55.00000
## 20010905 -1.36810000 -0.86820000 -0.6409125 71.00000
## 20010906 -2.38993455 -1.92840000 -1.7101000 66.00000
## 20010907 -1.50000000 -3.46410000 -3.0642000 59.00000
## 20010908 -4.59630000 -6.06220000 -4.3301000 68.00000
## 20010912 -0.86600000 -5.00000000 -2.2813228 62.00000
## 20010913 -4.59630000 -6.89370000 -4.7945164 78.00000
## 20010914 -1.04190000 -1.36810000 -1.2468043 74.00000
## 20010915 -5.19620000 -2.73620000 -1.0419000 71.00000
## 20010916 -3.83020000 0.00000000 1.3892000 69.00000
## 20010917 0.00000000 3.21390000 0.0000000 71.00000
## 20010918 -2.50000000 -3.21390000 -2.5000000 60.00000
## 20010919 -4.34076593 -6.06220000 -5.1962000 42.00000
## 20010920 -3.93920000 -3.06420000 0.0000000 65.00000
## 20010921 -0.17360000 3.75880000 3.8302000 71.00000
## 20010922 4.00000000 5.00000000 3.7578333 96.00000
## 20010923 5.19620000 5.14230000 3.4749942 98.00000
## 20010924 -0.93970000 -0.76600000 -0.5000000 65.13904
## 20010925 0.00000000 -1.00000000 -1.2856000 76.00000
## 20010927 -0.69460000 -2.00000000 -1.4729818 71.00000
## 20010928 1.50000000 0.86820000 0.8682000 93.13532
## 20010929 -4.00000000 -3.75880000 -4.0000000 99.00000
## 20010930 -2.58353624 -1.04190000 -4.0000000 83.00000
imputed_ozone <- cbind.data.frame(miss_ozne_PCA$completeObs, ozone$WindDirection)
imputed_ozone_pca <- PCA(imputed_ozone, quanti.sup = 1, quali.sup = 12, ncp = nb_dim$ncp, graph = FALSE)
Variances of the principal components
## eigenvalue percentage of variance cumulative percentage of variance
## comp 1 5.74712571 57.4712571 57.47126
## comp 2 2.13079398 21.3079398 78.77920
## comp 3 0.59538380 5.9538380 84.73303
## comp 4 0.44581756 4.4581756 89.19121
## comp 5 0.37097047 3.7097047 92.90092
## comp 6 0.25916804 2.5916804 95.49260
## comp 7 0.17834459 1.7834459 97.27604
## comp 8 0.15217800 1.5217800 98.79782
## comp 9 0.07427126 0.7427126 99.54053
## comp 10 0.04594660 0.4594660 100.00000
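In the table above the eigenvalues sum to 10, the number of active variables (maxO3 is supplementary), so each percentage is the eigenvalue divided by 10 and multiplied by 100. FactoMineR stores this table in the `$eig` element of the PCA result. The `eigenvalues` object used in the barplot code that follows is not defined in this section and presumably holds that table. A base R sketch of how the three columns are computed, using the built-in `USArrests` data as a stand-in:

```r
# Eigenvalues of the correlation matrix = variances of the components
# of a PCA on standardized variables (USArrests is only a stand-in)
eig <- eigen(cor(USArrests))$values

eigenvalues <- cbind(
  "eigenvalue"                        = eig,
  "percentage of variance"            = 100 * eig / sum(eig),
  "cumulative percentage of variance" = cumsum(100 * eig / sum(eig))
)
rownames(eigenvalues) <- paste("comp", seq_along(eig))
print(round(eigenvalues, 3))
```

With standardized variables the eigenvalues always sum to the number of variables, which is why the second column is a simple rescaling of the first.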
barplot(eigenvalues[, 2], names.arg=1:nrow(eigenvalues), main = "Variances",
xlab = "Principal Components", ylab = "Percentage of variances", col ="steelblue")
lines(x = 1:nrow(eigenvalues), eigenvalues[, 2], type="b", pch=19, col = "red")
Graph of individuals
Graph of variables
Scores (principal components)
## Dim.1 Dim.2
## 20010601 -0.6604580 -1.2048271
## 20010602 -1.2317545 1.0465411
## 20010603 0.7984643 -2.7299508
## 20010604 2.5423205 -1.7435774
## 20010605 -0.4047517 0.8406578
## 20010606 -2.6701824 1.6934864
The dimdesc() function describes the dimensions: for each one, it lists the variables that are most strongly correlated with it.
## correlation p.value
## T15 0.8743245 2.527176e-36
## maxO3 0.8556999 2.979307e-33
## T12 0.8419942 2.959647e-31
## Vx9 0.7275225 1.038760e-19
## maxO3v 0.7072031 2.909125e-18
## T9 0.6750888 3.277300e-16
## Vx12 0.6615171 2.028392e-15
## Vx15 0.5720457 4.442479e-11
## Ne15 -0.7608957 2.156801e-22
## Ne9 -0.7997429 3.972429e-26
## Ne12 -0.8973103 7.210553e-41
## correlation p.value
## T9 0.6588857 2.857127e-15
## T12 0.4358243 1.563461e-06
## maxO3v 0.4049461 9.479876e-06
## T15 0.4009789 1.179954e-05
## maxO3 0.1896743 4.517440e-02
## Vx9 -0.4794252 8.893422e-08
## Vx12 -0.6428198 2.153160e-14
## Vx15 -0.7073874 2.826092e-18
## R2 p.value
## ozone$WindDirection 0.1662621 0.0001933068
## R2 p.value
## ozone$WindDirection 0.4262344 5.168436e-13
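The quantities reported above can be reproduced by hand. The following base R sketch assumes that the correlation tables are Pearson correlations between each variable and the coordinates on a dimension, and that the R2 tables come from a one-way analysis of variance of those coordinates on the categorical variable. It uses the built-in `iris` data and `prcomp()` as stand-ins, not the ozone PCA object. In FactoMineR, `dimdesc()` keeps only variables whose p-value is below a threshold (argument `proba`, 0.05 by default).

```r
num <- iris[, 1:4]
pca <- prcomp(num, center = TRUE, scale. = TRUE)
scores1 <- pca$x[, 1]              # coordinates on the first dimension

# Quantitative variables: correlation with the dimension and its p-value
quanti <- t(sapply(num, function(v) {
  ct <- cor.test(v, scores1)
  c(correlation = unname(ct$estimate), p.value = ct$p.value)
}))
quanti <- quanti[order(quanti[, "correlation"], decreasing = TRUE), ]
print(quanti)

# Categorical variable: R2 and p-value of a one-way ANOVA of the scores
aov_fit <- lm(scores1 ~ iris$Species)
r2 <- summary(aov_fit)$r.squared
f  <- summary(aov_fit)$fstatistic
p_quali <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
print(c(R2 = r2, p.value = p_quali))
```

Variables with large positive or negative correlations are the ones that give a dimension its interpretation.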
ozone_MIPCA <- MIPCA(miss_ozone, ncp = 2, nboot = 100) # MI with PCA using 2 dimensions
# Show the first 20 rows of the first imputed dataset
round(ozone_MIPCA$res.MI[[1]][1:20,], 3)
## maxO3 T9 T12 T15 Ne9 Ne12 Ne15 Vx9 Vx12 Vx15
## 20010601 87.000 15.600 18.500 16.090 4.000 4.000 8.000 0.695 -1.710 -0.695
## 20010602 82.000 20.762 19.727 26.465 5.000 5.000 7.000 -4.330 -4.000 -3.000
## 20010603 92.000 15.300 17.600 19.500 2.000 4.340 1.604 2.954 3.771 0.521
## 20010604 114.000 16.200 19.700 19.977 1.000 1.000 0.000 -0.176 0.347 -0.174
## 20010605 94.000 20.861 20.500 20.400 4.039 5.007 3.813 -0.500 -2.954 -4.330
## 20010606 80.000 17.700 19.800 18.300 6.000 8.984 7.000 -5.638 -5.000 -6.000
## 20010607 79.000 16.800 15.600 14.900 7.000 8.000 6.608 -4.330 -1.879 -3.759
## 20010610 79.000 14.900 17.500 18.900 5.000 5.000 3.529 0.000 -1.042 -1.389
## 20010611 101.000 16.100 19.600 21.400 2.000 6.532 4.000 -0.766 -1.026 -2.298
## 20010612 106.000 18.300 21.437 22.900 5.000 4.411 3.611 1.286 -2.298 -3.939
## 20010613 101.000 17.300 19.300 20.200 7.000 7.000 3.000 -1.500 -1.500 -0.868
## 20010614 90.000 17.600 20.300 17.400 4.485 6.000 8.000 -0.489 -1.042 -0.695
## 20010615 72.000 17.504 22.071 22.157 7.000 5.000 6.000 -0.868 -2.736 -6.894
## 20010616 70.000 17.100 18.200 18.000 7.424 7.000 4.793 -2.332 -7.878 -5.196
## 20010617 83.000 15.400 16.011 16.600 8.000 7.000 8.805 -4.330 -2.052 -3.000
## 20010618 88.000 17.104 19.100 19.622 6.000 5.000 4.000 0.521 -2.954 -1.026
## 20010620 117.131 21.000 24.600 26.900 4.225 1.302 1.000 -0.342 -0.107 -0.684
## 20010621 108.721 20.423 26.499 24.729 3.910 5.503 4.607 0.000 0.347 -2.571
## 20010622 121.000 19.700 24.200 26.900 2.000 1.000 0.000 0.020 3.498 2.000
## 20010623 146.000 23.600 28.600 28.400 1.880 3.540 5.706 1.000 -1.928 -1.216
## maxO3v
## 20010601 84.000
## 20010602 87.000
## 20010603 82.000
## 20010604 92.000
## 20010605 114.000
## 20010606 94.000
## 20010607 80.000
## 20010610 99.000
## 20010611 79.000
## 20010612 101.000
## 20010613 106.000
## 20010614 101.000
## 20010615 90.000
## 20010616 72.000
## 20010617 70.000
## 20010618 83.000
## 20010620 121.000
## 20010621 151.907
## 20010622 81.000
## 20010623 121.000
## list()
## $PlotVar
These plots show that the variability across the different imputations is small, so we can interpret the PCA results with confidence.
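A numeric complement to the plots is the spread of each imputed cell across the imputed datasets. The sketch below assumes missMDA is installed and uses its bundled ozone data rather than the file loaded in this handout; it also uses 20 imputed datasets instead of 100 to keep the run short. The names `res_mi`, `imputed` and `cell_sd` are made up for illustration.

```r
# Assumption: missMDA is installed; the block is skipped otherwise
if (requireNamespace("missMDA", quietly = TRUE)) {
  data(ozone, package = "missMDA")
  X <- ozone[, 1:11]

  set.seed(2021)
  res_mi <- missMDA::MIPCA(X, ncp = 2, nboot = 20)  # 20 imputed datasets

  # Stack the imputed datasets into a rows x columns x datasets array
  imputed <- simplify2array(lapply(res_mi$res.MI, as.matrix))

  # Standard deviation of every cell across the imputed datasets
  cell_sd <- apply(imputed, c(1, 2), sd)
  miss <- is.na(as.matrix(X))

  # Observed cells are identical in all datasets, so their spread is zero
  print(max(cell_sd[!miss]))

  # Average spread of the imputed cells, per variable
  print(round(colMeans(replace(cell_sd, !miss, NA), na.rm = TRUE), 2))
}
```

Comparing the per-variable spread with the scale of each variable shows which variables are imputed with the most uncertainty.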
## Warning: package 'mice' was built under R version 4.0.5
##
## Attaching package: 'mice'
## The following object is masked from 'package:stats':
##
## filter
## The following objects are masked from 'package:base':
##
## cbind, rbind
## term estimate std.error statistic df p.value
## 1 (Intercept) 14.1734126 18.00572598 0.7871614 65.90233 0.434009202
## 2 T9 1.0975413 1.27579325 0.8602815 47.94048 0.393915874
## 3 T12 1.5412750 1.00733068 1.5300586 54.45720 0.131790866
## 4 T15 0.7488106 0.92570641 0.8089073 50.20750 0.422385241
## 5 Ne9 -1.1958245 1.12774035 -1.0603722 61.63674 0.293112856
## 6 Ne12 -1.8955085 1.50143825 -1.2624618 55.65074 0.212047463
## 7 Ne15 0.4484842 1.14819838 0.3905982 63.60316 0.697399468
## 8 Vx9 0.7952494 1.07420938 0.7403113 67.75881 0.461668689
## 9 Vx12 0.8453210 1.16486203 0.7256834 65.74568 0.470608253
## 10 Vx15 0.3654349 1.13218348 0.3227700 64.26183 0.747917293
## 11 maxO3v 0.2475597 0.09094285 2.7221458 68.84881 0.008208972
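The table above has the shape of a pooled regression: one model is fitted per imputed dataset and the results are combined with Rubin's rules. The code that produced it is not shown in this section, so the following is only a base R sketch of the pooling step. It uses the built-in `airquality` data and a deliberately crude imputation (random draws from the observed values) as a stand-in for PCA-based imputation. It also uses the classical degrees-of-freedom formula, whereas mice, to my knowledge, applies a small-sample adjustment, so the `df` values would differ.

```r
set.seed(1)
dat <- airquality[, c("Ozone", "Wind", "Temp")]
m <- 20                                   # number of imputed datasets

# Fit the same model on m crudely imputed datasets
fits <- lapply(seq_len(m), function(i) {
  d <- dat
  miss <- is.na(d$Ozone)
  # Stand-in imputation only: draw from the observed values
  d$Ozone[miss] <- sample(d$Ozone[!miss], sum(miss), replace = TRUE)
  lm(Temp ~ Ozone + Wind, data = d)
})

est <- sapply(fits, coef)                          # coefficients, one column per dataset
u   <- sapply(fits, function(f) diag(vcov(f)))     # squared standard errors

q_bar <- rowMeans(est)                    # pooled estimate
w_bar <- rowMeans(u)                      # within-imputation variance
b     <- apply(est, 1, var)               # between-imputation variance
t_var <- w_bar + (1 + 1 / m) * b          # total variance
df_old <- (m - 1) * (1 + w_bar / ((1 + 1 / m) * b))^2

pooled <- data.frame(
  term      = names(q_bar),
  estimate  = q_bar,
  std.error = sqrt(t_var),
  statistic = q_bar / sqrt(t_var),
  df        = df_old,
  row.names = NULL
)
pooled$p.value <- 2 * pt(-abs(pooled$statistic), df = pooled$df)
print(pooled)
```

The between-imputation term is what makes the pooled standard errors larger than those of a regression fitted on a single imputed dataset.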
mice stands for Multivariate Imputation by Chained Equations.
The following input arguments are used in mice for multiple imputation:
* m: number of imputed datasets (default is m = 5)
* seed: random seed for reproducible results
* method: method used to impute missing values (the default for numeric variables is predictive mean matching, PMM)
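A minimal sketch of these arguments in use, assuming the mice package is installed. It runs on the built-in `airquality` data rather than the ozone data of this handout; the seed value and the regression formula are arbitrary choices for illustration.

```r
# Assumption: mice is installed; the block is skipped otherwise
if (requireNamespace("mice", quietly = TRUE)) {
  dat <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]

  # m imputed datasets, fixed seed, predictive mean matching
  imp <- mice::mice(dat, m = 5, seed = 2021, method = "pmm", printFlag = FALSE)

  # Fit one regression per imputed dataset, then pool the results
  fit <- with(imp, lm(Ozone ~ Solar.R + Wind + Temp))
  pooled <- summary(mice::pool(fit))
  print(pooled)
}
```

Setting `seed` makes the imputations, and therefore the pooled estimates, repeatable from one run to the next.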
To view the multiply imputed datasets, use the complete() function.
## maxO3 T9 T12 T15 Ne9 Ne12 Ne15
## 1 87.00000 15.60000 18.50000 20.14668 4.0000000 4.000000 8.000000
## 2 82.00000 19.72583 20.34955 21.55971 5.0000000 5.000000 7.000000
## 3 92.00000 15.30000 17.60000 19.50000 2.0000000 3.040415 1.521419
## 4 114.00000 16.20000 19.70000 23.06551 1.0000000 1.000000 0.000000
## 5 94.00000 19.68478 20.50000 20.40000 6.4494396 8.768195 8.110756
## 6 80.00000 17.70000 19.80000 18.30000 6.0000000 4.838704 7.000000
## 7 79.00000 16.80000 15.60000 14.90000 7.0000000 8.000000 7.902305
## 8 79.00000 14.90000 17.50000 18.90000 5.0000000 5.000000 5.988172
## 9 101.00000 16.10000 19.60000 21.40000 2.0000000 4.591211 4.000000
## 10 106.00000 18.30000 21.19053 22.90000 5.0000000 5.413644 5.144680
## 11 101.00000 17.30000 19.30000 20.20000 7.0000000 7.000000 3.000000
## 12 90.00000 17.60000 20.30000 17.40000 5.8120967 6.000000 8.000000
## 13 72.00000 17.94896 23.67660 21.97368 7.0000000 5.000000 6.000000
## 14 70.00000 17.10000 18.20000 18.00000 7.2350097 7.000000 7.535272
## 15 83.00000 15.40000 16.41371 16.60000 8.0000000 7.000000 5.746205
## 16 88.00000 15.56505 19.10000 19.09532 6.0000000 5.000000 4.000000
## 17 115.47542 21.00000 24.60000 26.90000 1.6005083 2.939884 1.000000
## 18 54.35270 14.35704 19.61282 18.49255 9.3796972 6.657951 7.673472
## 19 121.00000 19.70000 24.20000 26.90000 2.0000000 1.000000 0.000000
## 20 146.00000 23.60000 28.60000 28.40000 4.6855667 1.822556 4.233582
## 21 121.00000 20.40000 25.20000 27.70000 1.0000000 0.000000 0.000000
## 22 146.00000 21.52747 26.61720 30.06146 0.4147200 0.000000 0.000000
## 23 108.00000 24.00000 23.50000 30.34552 4.0000000 4.000000 0.000000
## 24 83.00000 19.70000 22.90000 24.80000 6.8602163 6.768870 5.327603
## 25 29.00405 13.12272 17.22870 18.83816 10.2122272 7.163449 7.242783
## 26 81.00000 16.39354 19.12840 20.66616 3.0000000 4.000000 4.000000
## 27 67.00000 19.56759 23.40000 23.70000 4.7112108 5.325877 5.948896
## 28 70.00000 12.81413 17.22494 16.38531 5.0000000 2.000000 1.000000
## 29 106.00000 14.24173 21.65536 22.18701 -1.0945387 0.000000 1.000000
## 30 139.00000 22.63619 30.10000 31.90000 0.8930150 1.000000 4.000000
## 31 79.00000 17.34847 20.35951 23.16946 1.5981569 4.472444 2.538369
## 32 80.65809 16.80000 18.20000 22.00000 8.0000000 8.000000 6.000000
## 33 122.04138 20.80000 24.63714 25.09747 2.1270286 3.000000 4.000000
## 34 113.00000 18.27837 18.20000 22.70000 5.4712623 4.687106 1.990298
## 35 72.00000 19.33828 21.20000 23.90000 7.0000000 6.156596 4.000000
## 36 88.00000 19.20000 22.00000 21.80093 4.6511361 5.844295 6.792110
## 37 77.00000 19.40000 20.70000 22.50000 7.0000000 8.000000 7.024336
## 38 71.00000 19.20000 21.00000 22.40000 6.0000000 4.000000 6.000000
## 39 56.00000 13.80000 18.36558 18.50000 8.0000000 8.000000 6.000000
## 40 45.00000 11.06987 14.50000 15.20000 8.0000000 6.230832 8.000000
## 41 67.00000 15.60000 18.60000 17.28396 5.0000000 4.091384 5.000000
## 42 65.73226 16.90000 19.10000 19.39972 5.0000000 6.141933 6.000000
## 43 84.00000 17.40000 20.40000 23.59468 3.0000000 4.361795 6.000000
## 44 63.00000 16.35005 20.50000 20.60000 8.0000000 6.000000 6.000000
## 45 36.39931 12.24654 15.60000 16.47336 4.5424677 8.000000 7.755888
## 46 92.00000 16.70000 19.10000 19.30000 7.0000000 6.000000 4.000000
## 47 88.00000 18.71214 20.30000 21.79607 3.9850924 7.817232 7.126552
## 48 66.00000 18.00000 21.03602 23.47777 8.0000000 6.000000 5.000000
## 49 72.00000 18.60000 21.90000 23.60000 4.0000000 7.000000 6.000000
## 50 81.00000 18.80000 22.50000 23.90000 6.0000000 3.000000 2.000000
## 51 90.93144 19.00000 22.50000 24.10000 3.7994636 4.129036 3.672012
## 52 149.00000 19.90000 26.90000 29.00000 3.0000000 4.000000 5.885773
## 53 153.00000 23.74597 28.44225 29.59417 1.0000000 1.537358 4.000000
## 54 159.00000 24.00000 28.30000 26.50000 2.0000000 3.998203 7.000000
## 55 149.00000 23.30000 27.60000 28.80000 4.0000000 3.407688 3.000000
## 56 160.00000 17.71289 21.39854 24.81035 4.5204709 3.050541 4.139845
## 57 156.00000 24.90000 30.50000 32.20000 0.0000000 1.000000 4.000000
## 58 84.00000 21.80848 26.30000 27.80000 4.5871132 1.975970 2.000000
## 59 126.00000 25.30000 29.50000 31.20000 2.1807272 4.000000 4.000000
## 60 116.00000 21.30000 23.80000 22.10000 7.0000000 7.000000 8.000000
## 61 77.00000 20.00000 18.20000 23.60000 5.0000000 7.000000 6.000000
## 62 63.00000 18.70000 20.60000 20.30000 6.0000000 6.747241 7.000000
## 63 59.85957 18.60000 18.70000 17.80000 8.0000000 8.000000 8.000000
## 64 65.00000 19.20000 23.00000 22.70000 8.0000000 7.000000 7.000000
## 65 72.00000 19.90000 25.11540 20.40000 7.0000000 7.000000 8.000000
## 66 60.00000 18.70000 21.40000 21.70000 7.0000000 7.000000 7.000000
## 67 70.00000 18.40000 17.10000 18.47899 3.0000000 6.000000 3.000000
## 68 77.00000 17.80319 21.13320 21.73258 4.0000000 5.000000 6.443368
## 69 98.00000 16.52625 21.43704 22.90716 1.0000000 1.000000 0.000000
## 70 111.00000 20.21915 20.59792 23.99659 1.0000000 5.000000 2.000000
## 71 75.00000 15.04095 15.50086 20.48200 8.0000000 7.000000 1.000000
## 72 116.00000 23.50000 29.80000 31.70000 1.0000000 3.000000 5.000000
## 73 109.00000 20.80000 23.70000 26.60000 8.0000000 5.000000 4.000000
## 74 67.00000 18.80000 17.87540 18.90000 7.0503119 9.156412 6.634677
## 75 76.00000 20.38620 20.47599 24.00000 2.3974692 5.000000 5.000000
## 76 113.00000 20.60000 24.80000 27.00000 2.3259934 3.372839 2.032614
## 77 117.00000 21.60000 26.90000 28.60000 6.0000000 4.039133 4.000000
## 78 131.00000 21.74272 28.40000 30.10000 5.0000000 3.000000 3.000000
## 79 166.00000 19.80000 27.20000 30.80000 4.0000000 0.000000 1.000000
## 80 159.00000 25.00000 33.50000 35.50000 1.0000000 -1.405370 1.000000
## 81 105.73487 20.10000 22.90000 27.60000 8.0000000 8.000000 6.000000
## 82 114.00000 20.15346 25.82646 29.90577 7.0000000 4.000000 5.000000
## 83 122.93095 21.00000 24.40000 25.91279 1.0000000 6.000000 3.000000
## 84 92.75057 16.90000 17.80000 20.60000 4.0314740 5.074435 7.000000
## 85 76.00000 17.53586 18.60000 18.70000 7.0000000 7.000000 7.000000
## 86 59.00000 16.50000 20.30000 20.30000 5.0000000 7.000000 6.000000
## 87 78.00000 17.70000 20.20000 21.50000 0.9514572 4.703013 4.713841
## 88 76.00000 17.30000 22.70000 24.60000 4.0000000 4.929299 6.000000
## 89 55.00000 15.30000 16.80000 19.20000 8.0000000 7.000000 5.000000
## 90 71.00000 15.90000 19.20000 19.50000 7.0000000 5.000000 3.000000
## 91 78.96938 16.20000 18.90000 19.30000 2.0000000 5.000000 6.000000
## 92 59.00000 15.05765 18.59908 19.24915 7.0000000 7.000000 7.000000
## 93 67.85011 11.86667 18.79684 20.47776 6.0000000 5.000000 3.315118
## 94 63.00000 17.30000 19.80000 19.40000 6.2082934 5.580849 6.211284
## 95 94.79117 14.20000 22.20000 21.64405 5.0000000 4.298297 6.000000
## 96 74.00000 15.80000 18.70000 19.10000 9.4684786 7.000000 7.000000
## 97 71.00000 15.20000 17.90000 18.60000 1.7281800 3.043040 4.959485
## 98 69.00000 17.10000 17.70000 17.50000 6.0000000 7.000000 8.000000
## 99 71.00000 15.40000 15.38057 16.60000 4.0000000 5.000000 5.000000
## 100 60.00000 15.59461 17.16213 19.59295 4.0000000 5.000000 4.000000
## 101 42.00000 11.82012 14.30000 14.90000 8.0000000 7.000000 7.000000
## 102 65.00000 14.80000 15.47218 15.90000 7.0000000 6.280765 7.000000
## 103 71.00000 15.50000 18.00000 17.40000 7.0000000 7.000000 6.000000
## 104 96.00000 11.30000 16.11873 20.20000 3.0000000 3.000000 3.000000
## 105 98.00000 15.20000 19.70000 20.30000 2.0000000 2.000000 2.000000
## 106 92.00000 14.12637 17.60000 18.20000 1.0000000 4.000000 6.000000
## 107 76.00000 13.30000 17.70000 17.70000 3.7349357 2.959975 3.961621
## 108 67.64008 13.30000 15.07542 17.80000 3.0000000 5.000000 5.425920
## 109 77.00000 16.20000 20.80000 18.79756 5.7855078 6.535993 6.411554
## 110 99.00000 19.72729 24.86657 25.72966 1.4768082 2.531460 0.743947
## 111 83.00000 19.28514 19.22643 19.51209 4.1739576 5.000000 3.000000
## 112 70.00000 15.70000 18.60000 20.70000 7.0000000 6.058601 7.000000
## Vx9 Vx12 Vx15 maxO3v
## 1 0.6946000 -1.710100 -0.6946000 84.00000
## 2 -4.3301000 -4.000000 -3.0000000 87.00000
## 3 2.9544000 3.097391 0.5209000 82.00000
## 4 -0.6772824 0.347300 -0.1736000 92.00000
## 5 -0.5000000 -2.954400 -4.3301000 114.00000
## 6 -5.6382000 -5.000000 -6.0000000 94.00000
## 7 -4.3301000 -1.879400 -3.7588000 80.00000
## 8 0.0000000 -1.041900 -1.3892000 99.00000
## 9 -0.7660000 -1.026100 -2.2981000 79.00000
## 10 1.2856000 -2.298100 -3.9392000 101.00000
## 11 -1.5000000 -1.500000 -0.8682000 106.00000
## 12 -2.1302383 -1.041900 -0.6946000 101.00000
## 13 -0.8682000 -2.736200 -6.8944000 90.00000
## 14 -3.3257415 -7.878500 -5.1962000 72.00000
## 15 -4.3301000 -2.052100 -3.0000000 70.00000
## 16 0.5209000 -2.954400 -1.0261000 83.00000
## 17 -0.3420000 -1.599493 -0.6840000 121.00000
## 18 0.0000000 0.347300 -2.5712000 58.46837
## 19 2.6315844 1.694030 2.0000000 81.00000
## 20 1.0000000 -1.928400 -1.2155000 121.00000
## 21 2.5233934 -0.520900 1.0261000 146.00000
## 22 2.9544000 6.577800 5.2985788 121.00000
## 23 -2.5712000 -3.856700 -4.6985000 146.00000
## 24 -2.5981000 -2.875676 -4.1608364 135.27338
## 25 -5.6382000 -3.830200 -4.5963000 83.00000
## 26 -1.9284000 -2.571200 -4.3301000 57.00000
## 27 -1.5321000 -3.064200 -0.8682000 81.00000
## 28 0.6840000 0.000000 1.3681000 67.00000
## 29 2.8191000 3.939200 3.4641000 70.00000
## 30 1.8794000 2.000000 1.3681000 106.00000
## 31 0.6946000 -0.866000 -1.0261000 139.00000
## 32 0.0000000 0.000000 1.2856000 79.00000
## 33 0.0000000 1.710100 3.5515501 93.00000
## 34 -3.7588000 -3.939200 -4.6985000 97.00000
## 35 -2.5981000 -3.939200 -3.7588000 113.00000
## 36 -1.9696000 -3.064200 -4.0000000 72.00000
## 37 -3.5768194 -5.638200 -9.0000000 88.00000
## 38 -7.8785000 -6.893700 -6.8937000 77.00000
## 39 1.5000000 -3.830200 -2.0521000 71.00000
## 40 0.6840000 4.000000 0.3827663 37.62852
## 41 -3.2139000 -1.751409 -0.6867567 45.00000
## 42 -2.2981000 -3.758800 0.0000000 67.00000
## 43 0.0000000 -1.834547 -2.5981000 67.00000
## 44 2.0000000 -5.362300 -6.1284000 84.00000
## 45 0.5269970 -3.830200 -4.3301000 63.00000
## 46 -2.0521000 -4.499500 -2.7362000 69.00000
## 47 -2.3993809 -3.464100 -3.9063781 92.00000
## 48 -3.0000000 -3.500000 -3.4798462 88.00000
## 49 -1.8061818 -1.969600 -3.2514637 66.00000
## 50 0.5209000 -1.000000 -2.0000000 87.98933
## 51 1.2895353 -1.026100 0.5209000 81.00000
## 52 0.8113663 -0.939700 -0.6428000 83.00000
## 53 0.9397000 1.500000 2.0978029 149.00000
## 54 -0.3420000 1.285600 -2.0000000 157.97889
## 55 0.8660000 -1.532100 -0.1736000 159.00000
## 56 1.5321000 2.704697 1.0425683 149.00000
## 57 -0.5000000 -1.879400 -1.2856000 160.00000
## 58 -1.3681000 -4.040603 0.0000000 156.00000
## 59 3.0000000 3.758800 3.7967458 84.00000
## 60 0.0000000 -2.394100 -1.3892000 126.00000
## 61 -3.4641000 -2.598100 -3.7588000 116.00000
## 62 -5.0000000 -4.924000 -5.6382000 46.70976
## 63 -4.6985000 -2.500000 -0.8682000 63.00000
## 64 -3.8302000 -4.924000 -5.6382000 54.00000
## 65 -3.0000000 -4.596300 -0.6217046 65.00000
## 66 -5.6382000 -6.062200 -6.8937000 72.00000
## 67 -5.9088000 -3.213900 -4.4995000 60.00000
## 68 -1.9284000 -1.026100 0.5209000 70.00000
## 69 1.4672065 -1.532100 -1.0000000 49.21974
## 70 -1.0261000 -3.000000 -2.2981000 98.00000
## 71 -0.8660000 0.000000 0.0000000 107.44772
## 72 1.8794000 1.368100 0.6946000 75.00000
## 73 -1.0261000 -1.710100 -3.2139000 116.00000
## 74 -6.1917893 -5.362300 -2.5000000 86.00000
## 75 -3.0642000 -2.298100 -1.6300968 67.00000
## 76 1.3681000 0.868200 -2.2981000 76.00000
## 77 1.5321000 1.928400 1.9284000 113.00000
## 78 0.1736000 -1.969600 -1.9284000 117.00000
## 79 0.6428000 -0.866000 0.6840000 131.00000
## 80 1.0000000 0.694600 -1.7101000 166.00000
## 81 1.2856000 -1.732100 -0.6840000 127.24566
## 82 3.0642000 2.819100 1.3681000 100.00000
## 83 4.0000000 4.000000 3.7588000 114.00000
## 84 -2.0000000 -0.520900 1.8794000 112.00000
## 85 -3.4641000 -4.000000 -1.7321000 101.00000
## 86 -4.3301000 -5.362300 -4.6702153 76.00000
## 87 -1.6321312 0.520900 0.0000000 59.00000
## 88 -2.9544000 -2.954400 -2.0000000 97.42912
## 89 -1.8794000 -1.540118 -2.3941000 76.00000
## 90 -0.4041515 -5.471334 -1.3892000 55.00000
## 91 -1.3681000 -0.868200 -0.1284621 71.00000
## 92 -2.2204277 -1.928400 -1.7101000 66.00000
## 93 -1.5000000 -3.464100 -3.0642000 59.00000
## 94 -4.5963000 -6.062200 -4.3301000 68.00000
## 95 -0.8660000 -5.000000 -2.9563453 62.00000
## 96 -4.5963000 -6.893700 -4.8217680 78.00000
## 97 -1.0419000 -1.368100 -2.9226819 74.00000
## 98 -5.1962000 -2.736200 -1.0419000 71.00000
## 99 -3.8302000 0.000000 1.3892000 69.00000
## 100 0.0000000 3.213900 0.0000000 71.00000
## 101 -2.5000000 -3.213900 -2.5000000 60.00000
## 102 -4.7163352 -6.062200 -5.1962000 42.00000
## 103 -3.9392000 -3.064200 0.0000000 65.00000
## 104 -0.1736000 3.758800 3.8302000 71.00000
## 105 4.0000000 5.000000 3.9891642 96.00000
## 106 5.1962000 5.142300 1.7367193 98.00000
## 107 -0.9397000 -0.766000 -0.5000000 70.35605
## 108 0.0000000 -1.000000 -1.2856000 76.00000
## 109 -0.6946000 -2.000000 -1.6577494 71.00000
## 110 1.5000000 0.868200 0.8682000 95.45026
## 111 -4.0000000 -3.758800 -4.0000000 99.00000
## 112 0.8605592 -1.041900 -4.0000000 83.00000
Pool the regression results from the multiply imputed datasets using Rubin’s rules
## term estimate std.error statistic df p.value
## 1 (Intercept) 22.28084973 17.0445314 1.30721398 42.67616 0.198139666
## 2 T9 -0.77723606 2.4654532 -0.31525079 22.30264 0.755502041
## 3 T12 3.82153383 2.3976294 1.59388012 27.68958 0.122315438
## 4 T15 -0.47609460 1.5728462 -0.30269622 31.20367 0.764130116
## 5 Ne9 -2.29754168 1.5897020 -1.44526561 21.00125 0.163140734
## 6 Ne12 -1.03175021 2.7710314 -0.37233437 19.86689 0.713586171
## 7 Ne15 -0.20008458 1.7214323 -0.11623145 25.61139 0.908375716
## 8 Vx9 0.21162181 1.3071297 0.16189810 47.22522 0.872076799
## 9 Vx12 1.16152887 1.5541476 0.74737361 42.68311 0.458935790
## 10 Vx15 -0.01637706 1.2529580 -0.01307072 40.85974 0.989634987
## 11 maxO3v 0.35394119 0.1024292 3.45547207 40.03831 0.001313845
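To make the pooling step more concrete, here is a minimal base-R sketch of Rubin's rules for a single coefficient. The estimates and standard errors are hypothetical numbers chosen for illustration, not values from the ozone analysis, and `pool_rubin` is an illustrative helper, not a function from 'mice'. The sketch covers only the point estimate, the total variance and the test statistic; it does not reproduce the degrees-of-freedom calculation behind the `df` and `p.value` columns above.

```r
# Hypothetical results for ONE regression coefficient,
# estimated separately on m = 5 imputed datasets (made-up numbers).
est <- c(0.36, 0.34, 0.35, 0.37, 0.33)  # point estimates
se  <- c(0.10, 0.09, 0.10, 0.11, 0.10)  # standard errors

# Illustrative helper (not part of any package): Rubin's rules.
pool_rubin <- function(est, se) {
  m     <- length(est)
  q_bar <- mean(est)                   # pooled point estimate
  u_bar <- mean(se^2)                  # within-imputation variance
  b     <- var(est)                    # between-imputation variance
  t_var <- u_bar + (1 + 1 / m) * b     # total variance
  c(estimate  = q_bar,
    std.error = sqrt(t_var),
    statistic = q_bar / sqrt(t_var))
}

pooled <- pool_rubin(est, se)
print(pooled)
```

The pooled standard error is never smaller than the square root of the average within-imputation variance, because the between-imputation term adds the uncertainty caused by the missing values.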
Load the dataset ‘nhanes’ included in the package ‘mice’. The nhanes dataset contains 25 observations on the following 4 variables:
* age: Age group (1 = 20-39, 2 = 40-59, 3 = 60+)
* bmi: Body mass index (kg/m^2)
* hyp: Hypertensive (1 = no, 2 = yes)
* chl: Total serum cholesterol (mg/dL)
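A short sketch of how the data could be loaded and inspected, assuming the 'mice' package is already installed. It uses only base R functions on top of `library(mice)`, and the choice of checks is illustrative.

```r
library(mice)            # the nhanes data frame ships with the mice package

data(nhanes)             # load the dataset into the workspace
dim(nhanes)              # number of observations and variables
head(nhanes)             # first rows; NA marks a missing value
colSums(is.na(nhanes))   # number of missing values per variable
```

Counting the missing values per variable first gives a quick idea of how much imputation each variable will need.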