Introduction

This document presents an exploratory analysis of the Boston dataset. The goal is to examine the structure of the data, compute descriptive statistics, and visualize relationships between variables with several kinds of plots.

1. Loading packages

library(MASS)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.2.0
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ✖ dplyr::select() masks MASS::select()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(GGally)
library(skimr)

# Avoid scientific notation when printing numbers
options(scipen = 999)
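
The conflict report above notes that dplyr::select() masks MASS::select(). Later chunks rely on the dplyr version being the one R finds, which holds only because tidyverse is attached after MASS. The following is a minimal sketch of two ways to make a column selection independent of load order, assuming the five column names used later in this report; the object names sel_base and sel_dplyr are illustrative only.

```r
library(MASS)  # provides the Boston data and its own select()

variables <- c("medv", "rm", "lstat", "crim", "tax")

# Base R subsetting is not affected by function masking.
sel_base <- Boston[, variables]

# With dplyr, the namespace prefix states explicitly which select() runs.
# The guard skips this step if dplyr is not installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  sel_dplyr <- dplyr::select(Boston, dplyr::all_of(variables))
  print(all.equal(sel_base, sel_dplyr))
}
```

Either form should keep working if the library() calls are reordered.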

2. Loading the dataset

data("Boston")

head(Boston)
##      crim zn indus chas   nox    rm  age    dis rad tax ptratio  black lstat
## 1 0.00632 18  2.31    0 0.538 6.575 65.2 4.0900   1 296    15.3 396.90  4.98
## 2 0.02731  0  7.07    0 0.469 6.421 78.9 4.9671   2 242    17.8 396.90  9.14
## 3 0.02729  0  7.07    0 0.469 7.185 61.1 4.9671   2 242    17.8 392.83  4.03
## 4 0.03237  0  2.18    0 0.458 6.998 45.8 6.0622   3 222    18.7 394.63  2.94
## 5 0.06905  0  2.18    0 0.458 7.147 54.2 6.0622   3 222    18.7 396.90  5.33
## 6 0.02985  0  2.18    0 0.458 6.430 58.7 6.0622   3 222    18.7 394.12  5.21
##   medv
## 1 24.0
## 2 21.6
## 3 34.7
## 4 33.4
## 5 36.2
## 6 28.7

The Boston dataset, provided by the MASS package, contains housing characteristics and socioeconomic variables for suburbs of Boston.

3. Exploring the dataset

dim(Boston)
## [1] 506  14
str(Boston)
## 'data.frame':    506 obs. of  14 variables:
##  $ crim   : num  0.00632 0.02731 0.02729 0.03237 0.06905 ...
##  $ zn     : num  18 0 0 0 0 0 12.5 12.5 12.5 12.5 ...
##  $ indus  : num  2.31 7.07 7.07 2.18 2.18 2.18 7.87 7.87 7.87 7.87 ...
##  $ chas   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ nox    : num  0.538 0.469 0.469 0.458 0.458 0.458 0.524 0.524 0.524 0.524 ...
##  $ rm     : num  6.58 6.42 7.18 7 7.15 ...
##  $ age    : num  65.2 78.9 61.1 45.8 54.2 58.7 66.6 96.1 100 85.9 ...
##  $ dis    : num  4.09 4.97 4.97 6.06 6.06 ...
##  $ rad    : int  1 2 2 3 3 3 5 5 5 5 ...
##  $ tax    : num  296 242 242 222 222 222 311 311 311 311 ...
##  $ ptratio: num  15.3 17.8 17.8 18.7 18.7 18.7 15.2 15.2 15.2 15.2 ...
##  $ black  : num  397 397 393 395 397 ...
##  $ lstat  : num  4.98 9.14 4.03 2.94 5.33 ...
##  $ medv   : num  24 21.6 34.7 33.4 36.2 28.7 22.9 27.1 16.5 18.9 ...
summary(Boston)
##       crim                zn             indus            chas        
##  Min.   : 0.00632   Min.   :  0.00   Min.   : 0.46   Min.   :0.00000  
##  1st Qu.: 0.08205   1st Qu.:  0.00   1st Qu.: 5.19   1st Qu.:0.00000  
##  Median : 0.25651   Median :  0.00   Median : 9.69   Median :0.00000  
##  Mean   : 3.61352   Mean   : 11.36   Mean   :11.14   Mean   :0.06917  
##  3rd Qu.: 3.67708   3rd Qu.: 12.50   3rd Qu.:18.10   3rd Qu.:0.00000  
##  Max.   :88.97620   Max.   :100.00   Max.   :27.74   Max.   :1.00000  
##       nox               rm             age              dis        
##  Min.   :0.3850   Min.   :3.561   Min.   :  2.90   Min.   : 1.130  
##  1st Qu.:0.4490   1st Qu.:5.886   1st Qu.: 45.02   1st Qu.: 2.100  
##  Median :0.5380   Median :6.208   Median : 77.50   Median : 3.207  
##  Mean   :0.5547   Mean   :6.285   Mean   : 68.57   Mean   : 3.795  
##  3rd Qu.:0.6240   3rd Qu.:6.623   3rd Qu.: 94.08   3rd Qu.: 5.188  
##  Max.   :0.8710   Max.   :8.780   Max.   :100.00   Max.   :12.127  
##       rad              tax           ptratio          black       
##  Min.   : 1.000   Min.   :187.0   Min.   :12.60   Min.   :  0.32  
##  1st Qu.: 4.000   1st Qu.:279.0   1st Qu.:17.40   1st Qu.:375.38  
##  Median : 5.000   Median :330.0   Median :19.05   Median :391.44  
##  Mean   : 9.549   Mean   :408.2   Mean   :18.46   Mean   :356.67  
##  3rd Qu.:24.000   3rd Qu.:666.0   3rd Qu.:20.20   3rd Qu.:396.23  
##  Max.   :24.000   Max.   :711.0   Max.   :22.00   Max.   :396.90  
##      lstat            medv      
##  Min.   : 1.73   Min.   : 5.00  
##  1st Qu.: 6.95   1st Qu.:17.02  
##  Median :11.36   Median :21.20  
##  Mean   :12.65   Mean   :22.53  
##  3rd Qu.:16.95   3rd Qu.:25.00  
##  Max.   :37.97   Max.   :50.00
skim(Boston)
Data summary

Name                Boston
Number of rows      506
Number of columns   14
Column type frequency:
  numeric           14
Group variables     None

Variable type: numeric

skim_variable  n_missing  complete_rate    mean      sd      p0     p25     p50     p75    p100  hist
crim                   0              1    3.61    8.60    0.01    0.08    0.26    3.68   88.98  ▇▁▁▁▁
zn                     0              1   11.36   23.32    0.00    0.00    0.00   12.50  100.00  ▇▁▁▁▁
indus                  0              1   11.14    6.86    0.46    5.19    9.69   18.10   27.74  ▇▆▁▇▁
chas                   0              1    0.07    0.25    0.00    0.00    0.00    0.00    1.00  ▇▁▁▁▁
nox                    0              1    0.55    0.12    0.38    0.45    0.54    0.62    0.87  ▇▇▆▅▁
rm                     0              1    6.28    0.70    3.56    5.89    6.21    6.62    8.78  ▁▂▇▂▁
age                    0              1   68.57   28.15    2.90   45.02   77.50   94.07  100.00  ▂▂▂▃▇
dis                    0              1    3.80    2.11    1.13    2.10    3.21    5.19   12.13  ▇▅▂▁▁
rad                    0              1    9.55    8.71    1.00    4.00    5.00   24.00   24.00  ▇▂▁▁▃
tax                    0              1  408.24  168.54  187.00  279.00  330.00  666.00  711.00  ▇▇▃▁▇
ptratio                0              1   18.46    2.16   12.60   17.40   19.05   20.20   22.00  ▁▃▅▅▇
black                  0              1  356.67   91.29    0.32  375.38  391.44  396.22  396.90  ▁▁▁▁▇
lstat                  0              1   12.65    7.14    1.73    6.95   11.36   16.96   37.97  ▇▇▅▂▁
medv                   0              1   22.53    9.20    5.00   17.02   21.20   25.00   50.00  ▂▇▅▁▁

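Observations:

All 14 variables are numeric and none has missing values (n_missing is 0 and complete_rate is 1 for every column). They cover housing, crime, taxation, and socioeconomic conditions.

The summaries show the range and center of each variable and point to possible extreme values. For example, crim has a mean of 3.61 but a median of 0.26 and a maximum of 88.98, which indicates a strongly right-skewed distribution, and medv tops out at exactly 50.00.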
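
The two checks described above can also be done programmatically. This is a rough sketch in base R, assuming that a scaled mean-minus-median gap is an acceptable quick indicator of skew; the column name gap and the object names are illustrative, and the gap is not a formal skewness statistic.

```r
library(MASS)

# Count missing values per column; skim() reported none.
n_missing <- colSums(is.na(Boston))

# Compare mean and median for every variable.
center <- data.frame(
  mean      = sapply(Boston, mean),
  median    = sapply(Boston, median),
  sd        = sapply(Boston, sd),
  row.names = names(Boston)
)

# Positive gap: mean pulled above the median (right skew).
# Negative gap: mean pulled below the median (left skew).
center$gap <- (center$mean - center$median) / center$sd

n_missing
round(center[order(-center$gap), ], 2)
```

Variables at the top of the sorted table are the first candidates for a closer look with histograms or boxplots.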

4. Selecting variables for analysis

variables <- c("medv","rm","lstat","crim","tax")

Boston_sel <- Boston %>%
  select(all_of(variables))

head(Boston_sel)
##   medv    rm lstat    crim tax
## 1 24.0 6.575  4.98 0.00632 296
## 2 21.6 6.421  9.14 0.02731 242
## 3 34.7 7.185  4.03 0.02729 242
## 4 33.4 6.998  2.94 0.03237 222
## 5 36.2 7.147  5.33 0.06905 222
## 6 28.7 6.430  5.21 0.02985 222

These variables represent:

medv: median value of owner-occupied homes, in thousands of dollars

rm: average number of rooms per dwelling

lstat: percentage of the population classified as lower status

crim: per capita crime rate by town

tax: full-value property tax rate per $10,000
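
Because medv is the variable every later plot is compared against, it is worth checking how many observations sit at its maximum. A small sketch follows, assuming that a pile-up at the top value suggests the variable was capped when the data were recorded; the names cap and n_capped are illustrative.

```r
library(MASS)

# medv is expressed in thousands of dollars.
cap <- max(Boston$medv)

# Number of observations recorded exactly at the maximum.
n_capped <- sum(Boston$medv == cap)

cat("Maximum medv:", cap, "- observations at the maximum:", n_capped, "\n")
```

If many rows share the maximum, the right tail of medv in the histograms and boxplots should be read with that in mind.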

5. Descriptive statistics

sapply(Boston_sel, mean)
##       medv         rm      lstat       crim        tax 
##  22.532806   6.284634  12.653063   3.613524 408.237154
sapply(Boston_sel, sd)
##        medv          rm       lstat        crim         tax 
##   9.1971041   0.7026171   7.1410615   8.6015451 168.5371161

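Observations:

The mean gives the average value of each variable: for example, medv averages about 22.5 and rm about 6.28.

The standard deviation measures how spread out the values are. For crim it (8.60) is more than twice the mean (3.61), which points to a highly dispersed, skewed variable, whereas rm varies little (0.70 around a mean of 6.28).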
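
Means and standard deviations are on different scales for each variable, so they are hard to compare directly. One option, sketched here in base R, is to collect them in a single table together with the coefficient of variation (sd divided by mean). The table layout and the name stats are illustrative choices, not part of the original analysis.

```r
library(MASS)

variables  <- c("medv", "rm", "lstat", "crim", "tax")
Boston_sel <- Boston[, variables]

# One row per variable, one column per statistic.
stats <- t(sapply(Boston_sel, function(x) {
  c(mean   = mean(x),
    median = median(x),
    sd     = sd(x),
    cv     = sd(x) / mean(x))  # unitless measure of relative spread
}))

round(stats, 3)
```

A coefficient of variation above 1 means the standard deviation exceeds the mean, which is usually a sign of strong skew or extreme values.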

6. Histograms

Boston %>% 
  select(medv, rm, lstat) %>% 
  pivot_longer(cols = everything(),
               names_to = "variable",
               values_to = "valor") %>% 
  ggplot(aes(x = valor, fill = variable)) +
  geom_histogram(color = "white", bins = 15) +
  facet_wrap(~variable, scales = "free_x") +
  theme_minimal()

The histograms show the distribution of each variable. Because of scales = "free_x", each panel has its own value axis while all panels share the same count axis.

They make it possible to spot skewness or concentrations of values.
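
The shape of a histogram depends on the number of bins, and the plot above uses a fixed bins = 15 for all three variables. As a hedged cross-check, the sketch below computes the bin counts suggested by two standard rules available in base R (Sturges and Freedman-Diaconis). Whether either rule is preferable here is a judgment call; the name bins is illustrative.

```r
library(MASS)

# Suggested number of bins per variable under two common rules.
bins <- sapply(Boston[, c("medv", "rm", "lstat")], function(x) {
  c(sturges           = nclass.Sturges(x),
    freedman_diaconis = nclass.FD(x))
})

bins
```

If the suggested counts differ a lot from 15, redrawing the histograms with a different bins value shows whether the visible skew or peaks are stable.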

7. Boxplots

Boston %>% 
  select(medv, lstat) %>% 
  pivot_longer(cols = everything(),
               names_to = "variable",
               values_to = "valor") %>% 
  ggplot(aes(x = variable, y = valor, fill = variable)) +
  geom_boxplot() +
  theme_minimal()

Boxplots help identify outliers, which are drawn as individual points beyond the whiskers.

They also show the spread of the data through the box (the interquartile range) and the length of the whiskers.
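
To list the outliers instead of only seeing them as points, base R's boxplot.stats() applies a 1.5 × IQR rule similar to the one geom_boxplot() uses by default. The two functions compute the box edges slightly differently, so the flagged points may not match the plot exactly. A minimal sketch for medv, with bs as an illustrative name:

```r
library(MASS)

bs <- boxplot.stats(Boston$medv)

# Lower whisker, lower hinge, median, upper hinge, upper whisker.
bs$stats

# Values beyond the whiskers, i.e. the points a boxplot draws individually.
length(bs$out)
range(bs$out)
```

The same call on Boston$lstat gives the corresponding values for the second boxplot.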

8. Scatter plots

Boston %>% 
  select(medv, rm, lstat, crim, tax) %>% 
  pivot_longer(cols = -medv,
               names_to = "variable",
               values_to = "valor") %>% 
  ggplot(aes(x = valor, y = medv, color = variable)) +
  geom_point() +
  facet_wrap(~variable, scales = "free_x") +
  theme_minimal()

Each panel shows how one variable relates to home value (medv).

A relationship can be positive (medv rises as the variable rises) or negative (medv falls as the variable rises).
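
The direction of each relationship can be summarized with a correlation coefficient. The sketch below computes the Pearson correlation between medv and each of the four plotted variables. Pearson's coefficient only measures linear association, so it is a rough complement to the scatter plots and not a replacement for them; the names predictors and r are illustrative.

```r
library(MASS)

predictors <- c("rm", "lstat", "crim", "tax")

# Pearson correlation of each predictor with medv.
r <- sapply(Boston[, predictors], function(x) cor(x, Boston$medv))

# Sorted from most negative to most positive.
round(sort(r), 2)
```

A positive value corresponds to an upward trend in the matching panel and a negative value to a downward trend.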

9. Correlation matrix

ggpairs(Boston[, c("medv","rm","lstat")])

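ggpairs() draws a matrix of plots for the selected variables: pairwise scatter plots, the distribution of each variable on the diagonal, and the correlation coefficient for each pair.

For example, the number of rooms (rm) appears to be positively related to home value (medv).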
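
The coefficients shown in the plot matrix can also be obtained as a plain numeric matrix with base R. This sketch computes both Pearson and Spearman correlations for the same three variables. Adding Spearman is an assumption on my part: it is rank-based and therefore less sensitive to curved relationships and extreme values.

```r
library(MASS)

vars <- Boston[, c("medv", "rm", "lstat")]

# Linear (Pearson) and rank-based (Spearman) correlation matrices.
pearson  <- cor(vars)
spearman <- cor(vars, method = "spearman")

round(pearson, 2)
round(spearman, 2)
```

A large difference between the two coefficients for a pair suggests that the relationship is monotonic but not linear.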