This document presents an exploratory analysis of the Boston dataset. The goal is to examine the structure of the data, compute descriptive statistics, and visualize relationships between variables using several types of plots.
library(MASS)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.2.0
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ dplyr::select() masks MASS::select()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(GGally)
library(skimr)
options(scipen=999)
data("Boston")
head(Boston)
## crim zn indus chas nox rm age dis rad tax ptratio black lstat
## 1 0.00632 18 2.31 0 0.538 6.575 65.2 4.0900 1 296 15.3 396.90 4.98
## 2 0.02731 0 7.07 0 0.469 6.421 78.9 4.9671 2 242 17.8 396.90 9.14
## 3 0.02729 0 7.07 0 0.469 7.185 61.1 4.9671 2 242 17.8 392.83 4.03
## 4 0.03237 0 2.18 0 0.458 6.998 45.8 6.0622 3 222 18.7 394.63 2.94
## 5 0.06905 0 2.18 0 0.458 7.147 54.2 6.0622 3 222 18.7 396.90 5.33
## 6 0.02985 0 2.18 0 0.458 6.430 58.7 6.0622 3 222 18.7 394.12 5.21
## medv
## 1 24.0
## 2 21.6
## 3 34.7
## 4 33.4
## 5 36.2
## 6 28.7
The Boston dataset, shipped with the MASS package, contains housing characteristics and socioeconomic variables for the Boston area.
dim(Boston)
## [1] 506 14
str(Boston)
## 'data.frame': 506 obs. of 14 variables:
## $ crim : num 0.00632 0.02731 0.02729 0.03237 0.06905 ...
## $ zn : num 18 0 0 0 0 0 12.5 12.5 12.5 12.5 ...
## $ indus : num 2.31 7.07 7.07 2.18 2.18 2.18 7.87 7.87 7.87 7.87 ...
## $ chas : int 0 0 0 0 0 0 0 0 0 0 ...
## $ nox : num 0.538 0.469 0.469 0.458 0.458 0.458 0.524 0.524 0.524 0.524 ...
## $ rm : num 6.58 6.42 7.18 7 7.15 ...
## $ age : num 65.2 78.9 61.1 45.8 54.2 58.7 66.6 96.1 100 85.9 ...
## $ dis : num 4.09 4.97 4.97 6.06 6.06 ...
## $ rad : int 1 2 2 3 3 3 5 5 5 5 ...
## $ tax : num 296 242 242 222 222 222 311 311 311 311 ...
## $ ptratio: num 15.3 17.8 17.8 18.7 18.7 18.7 15.2 15.2 15.2 15.2 ...
## $ black : num 397 397 393 395 397 ...
## $ lstat : num 4.98 9.14 4.03 2.94 5.33 ...
## $ medv : num 24 21.6 34.7 33.4 36.2 28.7 22.9 27.1 16.5 18.9 ...
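The `str()` output lists `chas` as an integer, although its values are only 0 and 1, so it behaves as an indicator rather than a quantity. The following is a minimal sketch, assuming one wants to treat it as categorical in later summaries; the object name `Boston_cat` and the labels `"no"`/`"yes"` are illustrative choices, not part of the original analysis.

```r
library(MASS)  # provides the Boston data frame

# Hypothetical copy with `chas` recoded as a labelled factor;
# the original Boston object is left untouched.
Boston_cat <- Boston
Boston_cat$chas <- factor(Boston_cat$chas, levels = c(0, 1),
                          labels = c("no", "yes"))

# Confirm the new type and count the rows in each group.
str(Boston_cat$chas)
table(Boston_cat$chas)
```

Working on a copy keeps the rest of the report, which expects all 14 columns to be numeric, unaffected.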
summary(Boston)
## crim zn indus chas
## Min. : 0.00632 Min. : 0.00 Min. : 0.46 Min. :0.00000
## 1st Qu.: 0.08205 1st Qu.: 0.00 1st Qu.: 5.19 1st Qu.:0.00000
## Median : 0.25651 Median : 0.00 Median : 9.69 Median :0.00000
## Mean : 3.61352 Mean : 11.36 Mean :11.14 Mean :0.06917
## 3rd Qu.: 3.67708 3rd Qu.: 12.50 3rd Qu.:18.10 3rd Qu.:0.00000
## Max. :88.97620 Max. :100.00 Max. :27.74 Max. :1.00000
## nox rm age dis
## Min. :0.3850 Min. :3.561 Min. : 2.90 Min. : 1.130
## 1st Qu.:0.4490 1st Qu.:5.886 1st Qu.: 45.02 1st Qu.: 2.100
## Median :0.5380 Median :6.208 Median : 77.50 Median : 3.207
## Mean :0.5547 Mean :6.285 Mean : 68.57 Mean : 3.795
## 3rd Qu.:0.6240 3rd Qu.:6.623 3rd Qu.: 94.08 3rd Qu.: 5.188
## Max. :0.8710 Max. :8.780 Max. :100.00 Max. :12.127
## rad tax ptratio black
## Min. : 1.000 Min. :187.0 Min. :12.60 Min. : 0.32
## 1st Qu.: 4.000 1st Qu.:279.0 1st Qu.:17.40 1st Qu.:375.38
## Median : 5.000 Median :330.0 Median :19.05 Median :391.44
## Mean : 9.549 Mean :408.2 Mean :18.46 Mean :356.67
## 3rd Qu.:24.000 3rd Qu.:666.0 3rd Qu.:20.20 3rd Qu.:396.23
## Max. :24.000 Max. :711.0 Max. :22.00 Max. :396.90
## lstat medv
## Min. : 1.73 Min. : 5.00
## 1st Qu.: 6.95 1st Qu.:17.02
## Median :11.36 Median :21.20
## Mean :12.65 Mean :22.53
## 3rd Qu.:16.95 3rd Qu.:25.00
## Max. :37.97 Max. :50.00
skim(Boston)
| Name | Boston |
| Number of rows | 506 |
| Number of columns | 14 |
| _______________________ | |
| Column type frequency: | |
| numeric | 14 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| crim | 0 | 1 | 3.61 | 8.60 | 0.01 | 0.08 | 0.26 | 3.68 | 88.98 | ▇▁▁▁▁ |
| zn | 0 | 1 | 11.36 | 23.32 | 0.00 | 0.00 | 0.00 | 12.50 | 100.00 | ▇▁▁▁▁ |
| indus | 0 | 1 | 11.14 | 6.86 | 0.46 | 5.19 | 9.69 | 18.10 | 27.74 | ▇▆▁▇▁ |
| chas | 0 | 1 | 0.07 | 0.25 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▁ |
| nox | 0 | 1 | 0.55 | 0.12 | 0.38 | 0.45 | 0.54 | 0.62 | 0.87 | ▇▇▆▅▁ |
| rm | 0 | 1 | 6.28 | 0.70 | 3.56 | 5.89 | 6.21 | 6.62 | 8.78 | ▁▂▇▂▁ |
| age | 0 | 1 | 68.57 | 28.15 | 2.90 | 45.02 | 77.50 | 94.07 | 100.00 | ▂▂▂▃▇ |
| dis | 0 | 1 | 3.80 | 2.11 | 1.13 | 2.10 | 3.21 | 5.19 | 12.13 | ▇▅▂▁▁ |
| rad | 0 | 1 | 9.55 | 8.71 | 1.00 | 4.00 | 5.00 | 24.00 | 24.00 | ▇▂▁▁▃ |
| tax | 0 | 1 | 408.24 | 168.54 | 187.00 | 279.00 | 330.00 | 666.00 | 711.00 | ▇▇▃▁▇ |
| ptratio | 0 | 1 | 18.46 | 2.16 | 12.60 | 17.40 | 19.05 | 20.20 | 22.00 | ▁▃▅▅▇ |
| black | 0 | 1 | 356.67 | 91.29 | 0.32 | 375.38 | 391.44 | 396.22 | 396.90 | ▁▁▁▁▇ |
| lstat | 0 | 1 | 12.65 | 7.14 | 1.73 | 6.95 | 11.36 | 16.96 | 37.97 | ▇▇▅▂▁ |
| medv | 0 | 1 | 22.53 | 9.20 | 5.00 | 17.02 | 21.20 | 25.00 | 50.00 | ▂▇▅▁▁ |
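Observations:
The dataset contains 506 rows and 14 numeric variables, with no missing values, covering housing, crime, taxes, and socioeconomic conditions.
The summaries expose ranges, means, and likely extreme values: for example, crim has a median of 0.26 but a maximum of 88.98, and medv tops out at exactly 50.00.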
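One quick way to put a number on how lopsided each variable is, using only the mean, median, and standard deviation already reported, is Pearson's second skewness coefficient, 3 × (mean − median) / sd. The sketch below is one possible check rather than a step from the original analysis; `pearson_skew` is an illustrative helper name.

```r
library(MASS)  # provides the Boston data frame

# Pearson's second skewness coefficient: 3 * (mean - median) / sd.
# Positive values suggest a long right tail, negative a long left tail.
pearson_skew <- function(x) 3 * (mean(x) - median(x)) / sd(x)

skew <- sapply(Boston, pearson_skew)
round(sort(skew), 2)
```

Variables at either end of the sorted vector are the first candidates for a closer look at extreme values.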
variables <- c("medv","rm","lstat","crim","tax")
Boston_sel <- Boston %>%
select(all_of(variables))
head(Boston_sel)
## medv rm lstat crim tax
## 1 24.0 6.575 4.98 0.00632 296
## 2 21.6 6.421 9.14 0.02731 242
## 3 34.7 7.185 4.03 0.02729 242
## 4 33.4 6.998 2.94 0.03237 222
## 5 36.2 7.147 5.33 0.06905 222
## 6 28.7 6.430 5.21 0.02985 222
These variables represent:
medv: median value of owner-occupied homes, in thousands of dollars
rm: average number of rooms per dwelling
lstat: percentage of the population of lower status
crim: per capita crime rate
tax: full-value property-tax rate per $10,000
sapply(Boston_sel, mean)
## medv rm lstat crim tax
## 22.532806 6.284634 12.653063 3.613524 408.237154
sapply(Boston_sel, sd)
## medv rm lstat crim tax
## 9.1971041 0.7026171 7.1410615 8.6015451 168.5371161
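Observations:
The mean gives the average value of each variable.
The standard deviation measures its spread: crim has a standard deviation (8.60) more than twice its mean (3.61), which points to a heavily skewed distribution, whereas rm varies little around its mean of 6.28 (standard deviation 0.70).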
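Because the five variables are measured in different units, their standard deviations are not directly comparable. A minimal sketch of one way around this, assuming a unit-free measure is wanted, is the coefficient of variation (standard deviation divided by mean); the names `vars` and `cv` are illustrative.

```r
library(MASS)  # provides the Boston data frame

vars  <- c("medv", "rm", "lstat", "crim", "tax")
means <- sapply(Boston[vars], mean)
sds   <- sapply(Boston[vars], sd)

# Coefficient of variation: spread relative to the mean, unit-free.
cv <- sds / means
round(sort(cv, decreasing = TRUE), 2)
```

The ratio is only meaningful for variables that are strictly positive, which holds for all five columns selected here.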
# Reshape to long format so each variable gets its own histogram panel
Boston %>%
  select(medv, rm, lstat) %>%
  pivot_longer(cols = everything(),
               names_to = "variable",
               values_to = "valor") %>%
  ggplot(aes(x = valor, fill = variable)) +
  geom_histogram(color = "white", bins = 15) +
  facet_wrap(~variable, scales = "free_x") +
  theme_minimal()
The histograms show the distribution of each variable.
They make skewness and concentrations of values easy to spot.
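The shape of a histogram depends on the number of bins, which is fixed at 15 in the plot above. As a rough sanity check, and not as a recommendation from the original analysis, the sketch below compares that choice with two standard rules available in base R's grDevices package, Sturges and Freedman-Diaconis; `bin_rules` is an illustrative name.

```r
library(MASS)  # provides the Boston data frame

vars <- c("medv", "rm", "lstat")

# Suggested number of bins per variable under two common rules.
bin_rules <- sapply(Boston[vars], function(x) {
  c(sturges = grDevices::nclass.Sturges(x),
    fd      = grDevices::nclass.FD(x))
})
bin_rules
```

Sturges depends only on the sample size, so it gives the same count for every column, while Freedman-Diaconis adapts to each variable's interquartile range.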
# Side-by-side boxplots of medv and lstat on a shared y-axis
Boston %>%
  select(medv, lstat) %>%
  pivot_longer(cols = everything(),
               names_to = "variable",
               values_to = "valor") %>%
  ggplot(aes(x = variable, y = valor, fill = variable)) +
  geom_boxplot() +
  theme_minimal()
Boxplots help identify outliers.
They also show the spread of the data.
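The points a boxplot draws beyond its whiskers can also be listed numerically. The sketch below uses base R's `boxplot.stats()`, which applies a 1.5 × IQR rule similar to the default in `geom_boxplot()`; it is based on hinges, so borderline points may differ slightly from the ggplot2 figure. The names `out_medv` and `out_lstat` are illustrative.

```r
library(MASS)  # provides the Boston data frame

# Values falling beyond 1.5 * IQR from the hinges.
out_medv  <- grDevices::boxplot.stats(Boston$medv)$out
out_lstat <- grDevices::boxplot.stats(Boston$lstat)$out

# How many points are flagged, and over what range?
length(out_medv)
range(out_medv)
length(out_lstat)
range(out_lstat)
```

Flagged values are not necessarily errors; for medv, many of them sit at the 50.00 maximum seen in the summary.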
# Scatterplots of medv against each predictor, one panel per predictor
Boston %>%
  select(medv, rm, lstat, crim, tax) %>%
  pivot_longer(cols = -medv,
               names_to = "variable",
               values_to = "valor") %>%
  ggplot(aes(x = valor, y = medv, color = variable)) +
  geom_point() +
  facet_wrap(~variable, scales = "free_x") +
  theme_minimal()
The scatterplots show how each variable relates to home value (medv).
Relationships may be positive or negative depending on the variable.
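The direction and strength of each relationship can be summarized with correlation coefficients. The following sketch, offered as one possible complement to the scatterplots, computes both Pearson and Spearman correlations with medv; Spearman is included on the assumption that a rank-based measure is more robust for skewed variables such as crim. The names `predictors` and `assoc` are illustrative.

```r
library(MASS)  # provides the Boston data frame

predictors <- c("rm", "lstat", "crim", "tax")

# Correlation of each predictor with medv under two methods.
assoc <- sapply(predictors, function(v) {
  c(pearson  = cor(Boston$medv, Boston[[v]]),
    spearman = cor(Boston$medv, Boston[[v]], method = "spearman"))
})
round(assoc, 2)
```

A large gap between the two rows for a given variable hints that the relationship is monotonic but not linear.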
ggpairs(Boston[, c("medv","rm","lstat")])
The pairs plot shows the pairwise relationships between variables, together with their correlations.
For example, the average number of rooms (rm) appears to be positively related to home value (medv).
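To check whether the apparent positive relation between rm and medv could plausibly be noise, one option is a correlation test with a confidence interval. This is a minimal sketch using base R's `cor.test()` with its default Pearson method; it assumes the usual conditions for that test are acceptable here, and `ct` is an illustrative name.

```r
library(MASS)  # provides the Boston data frame

# Pearson correlation between rooms and home value,
# with a 95% confidence interval and p-value.
ct <- cor.test(Boston$medv, Boston$rm)

ct$estimate
ct$conf.int
ct$p.value
```

A confidence interval that stays well above zero supports reading the relationship as positive, though it says nothing about causation.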