Installing and loading libraries
library(foreign) # Read Data Stored by 'Minitab', 'S', 'SAS', 'SPSS', 'Stata', 'Systat', 'Weka', 'dBase'
library(ggplot2) # It is a system for creating graphics
library(dplyr) # A fast, consistent tool for working with data frame like objects
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(mapview) # Quickly and conveniently create interactive visualizations of spatial data with or without background maps
library(naniar) # Provides data structures and functions that facilitate the plotting of missing values and examination of imputations.
library(tmaptools) # A set of tools for reading and processing spatial data, intended to support the tmap workflow
library(tmap) # For drawing thematic maps
## Breaking News: tmap 3.x is retiring. Please test v4, e.g. with
## remotes::install_github('r-tmap/tmap')
library(RColorBrewer) # It offers several color palettes
library(dlookr) # A collection of tools that support data diagnosis, exploration, and transformation
## Registered S3 methods overwritten by 'dlookr':
## method from
## plot.transform scales
## print.transform scales
##
## Attaching package: 'dlookr'
## The following object is masked from 'package:base':
##
## transform
# Predictive modeling
library(regclass) # Contains basic tools for visualizing, interpreting, and building regression models
## Loading required package: bestglm
## Loading required package: leaps
## Loading required package: VGAM
## Loading required package: stats4
## Loading required package: splines
## Loading required package: rpart
## Loading required package: randomForest
## randomForest 4.7-1.1
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:dplyr':
##
## combine
## The following object is masked from 'package:ggplot2':
##
## margin
## Important regclass change from 1.3:
## All functions that had a . in the name now have an _
## all.correlations -> all_correlations, cor.demo -> cor_demo, etc.
library(mctest) # Multicollinearity diagnostics
library(lmtest) # Testing linear regression models
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Attaching package: 'lmtest'
## The following object is masked from 'package:VGAM':
##
## lrtest
library(spdep) # A collection of functions to create spatial weights matrix objects from polygon 'contiguities', for summarizing these objects, and for permitting their use in spatial data analysis
## Loading required package: spData
## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source')`
## Loading required package: sf
## Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE
library(sf) # A standardized way to encode spatial vector data
library(spData) # Diverse spatial datasets for demonstrating, benchmarking and teaching spatial data analysis
library(spatialreg) # A collection of all the estimation functions for spatial cross-sectional models
## Loading required package: Matrix
##
## Attaching package: 'spatialreg'
## The following objects are masked from 'package:spdep':
##
## get.ClusterOption, get.coresOption, get.mcOption,
## get.VerboseOption, get.ZeroPolicyOption, set.ClusterOption,
## set.coresOption, set.mcOption, set.VerboseOption,
## set.ZeroPolicyOption
library(caret) # The caret package (short for Classification And REgression Training) contains functions to streamline the model training process for complex regression and classification problems.
## Loading required package: lattice
##
## Attaching package: 'lattice'
## The following object is masked from 'package:regclass':
##
## qq
##
## Attaching package: 'caret'
## The following object is masked from 'package:VGAM':
##
## predictors
library(e1071) # Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, generalized k-nearest neighbor.
##
## Attaching package: 'e1071'
## The following objects are masked from 'package:dlookr':
##
## kurtosis, skewness
library(SparseM) # Provides some basic R functionality for linear algebra with sparse matrices
##
## Attaching package: 'SparseM'
## The following object is masked from 'package:base':
##
## backsolve
library(Metrics) # An implementation of evaluation metrics in R that are commonly used in supervised machine learning
##
## Attaching package: 'Metrics'
## The following objects are masked from 'package:caret':
##
## precision, recall
library(randomForest) # Classification and regression based on a forest of trees using random inputs
library(jtools) # This is a collection of tools for more efficiently understanding and sharing the results of (primarily) regression analyses
library(xgboost) # The package includes efficient linear model solver and tree learning algorithms
##
## Attaching package: 'xgboost'
## The following object is masked from 'package:dplyr':
##
## slice
library(DiagrammeR) # Build graph/network structures using functions for stepwise addition and deletion of nodes and edges
library(effects) # Graphical and tabular effect displays, e.g., of interactions, for various statistical models with linear predictors
## Loading required package: carData
## Registered S3 method overwritten by 'survey':
## method from
## summary.pps dlookr
## Use the command
## lattice::trellis.par.set(effectsTheme())
## to customize lattice options for effects plots.
## See ?effectsTheme for details.
library(shinyjs)
##
## Attaching package: 'shinyjs'
## The following object is masked from 'package:Matrix':
##
## show
## The following object is masked from 'package:lmtest':
##
## reset
## The following object is masked from 'package:VGAM':
##
## show
## The following object is masked from 'package:stats4':
##
## show
## The following objects are masked from 'package:methods':
##
## removeClass, show
library(sp)
library(geoR)
## --------------------------------------------------------------
## Analysis of Geostatistical Data
## For an Introduction to geoR go to http://www.leg.ufpr.br/geoR
## geoR version 1.9-4 (built on 2024-02-14) is now loaded
## --------------------------------------------------------------
library(gstat)
library(caret)
#install.packages("datasets") # Use the "iris" data set (datasets ships with base R)
library(datasets)
#install.packages("ggplot2") # Better-designed plots
library(ggplot2)
#install.packages("lattice") # Create plots
library(lattice)
#install.packages("DataExplorer") # Automated exploratory data analysis
library(DataExplorer)
#install.packages("mlbench") # Benchmark data sets for machine learning
library(mlbench)
#install.packages("tidyverse")
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ lubridate 1.9.3 ✔ tibble 3.2.1
## ✔ purrr 1.0.2 ✔ tidyr 1.3.1
## ✔ readr 2.1.5
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ randomForest::combine() masks dplyr::combine()
## ✖ tidyr::expand() masks Matrix::expand()
## ✖ tidyr::extract() masks dlookr::extract()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ purrr::lift() masks caret::lift()
## ✖ randomForest::margin() masks ggplot2::margin()
## ✖ tidyr::pack() masks Matrix::pack()
## ✖ lubridate::show() masks sp::show(), shinyjs::show(), Matrix::show(), VGAM::show(), stats4::show(), methods::show()
## ✖ xgboost::slice() masks dplyr::slice()
## ✖ tidyr::unpack() masks Matrix::unpack()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
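The conflict report above shows that several function names (for example `filter()` and `slice()`) are exported by more than one attached package, so the version that runs depends on load order. One way to avoid surprises is to call the intended function through its namespace. The following is a minimal sketch on a made-up data frame (`toy` and its values are illustrative assumptions, not part of the insurance data):

```r
# Toy data frame used only for illustration (hypothetical values).
toy <- data.frame(
  id     = 1:6,
  smoker = c("yes", "no", "no", "yes", "no", "no")
)

# Calling through the namespace makes the intended function unambiguous,
# regardless of which package was attached last.
non_smokers <- dplyr::filter(toy, smoker == "no")  # not stats::filter()
first_two   <- dplyr::slice(non_smokers, 1:2)      # not xgboost::slice()

print(first_two)
```

The same `package::function()` form works for any of the masked names listed in the report.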
Loading the “health_insurance.csv” data set
df <- read.csv("C:/Users/AVRIL/Documents/health_insurance.csv")
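The absolute path above only resolves on the author's machine. As a hedged alternative, the path can be built relative to the project and checked before reading. In this sketch the folder name `data` is an assumption made for illustration; only the file name comes from the report:

```r
# Hypothetical project-relative folder; adjust to wherever the CSV actually lives.
data_dir <- "data"
csv_path <- file.path(data_dir, "health_insurance.csv")

if (file.exists(csv_path)) {
  df <- read.csv(csv_path, stringsAsFactors = FALSE)
  str(df)  # quick check of column names and types
} else {
  message("File not found: ", csv_path)
}
```

Checking `file.exists()` first gives a clearer message than the error `read.csv()` raises for a missing file.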
Removing duplicates
df <- unique(df)
dim(df) # 1 duplicate record was removed
## [1] 1337 7
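To confirm how many rows `unique()` drops, the duplicates can be counted first with `duplicated()`. This is a small sketch on a made-up three-row data frame (`toy` is an illustrative assumption); on the real data the same call would be `sum(duplicated(df))`, run before deduplicating:

```r
# Toy data frame with one exact duplicate row (rows 1 and 3 are identical).
toy <- data.frame(
  age      = c(19, 18, 19),
  sex      = c("female", "male", "female"),
  expenses = c(16884.92, 1725.55, 16884.92)
)

# duplicated() flags every repeat of an earlier row, so the sum is the
# number of rows that unique() will remove.
n_dup <- sum(duplicated(toy))

toy_unique <- unique(toy)

cat("Duplicate rows:", n_dup, "\n")
cat("Rows before:", nrow(toy), "after:", nrow(toy_unique), "\n")
```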
# Check whether there are missing values
dfn <- is.na(df)
# There are no missing values; had any been detected, they would have been replaced with the median
# Drop rows with missing values (no rows are removed here, since none were found)
df <- na.omit(df)
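The comment above mentions median replacement as the fallback, while the code only drops incomplete rows. As a hedged illustration of that fallback, the sketch below counts missing values per column and fills numeric columns with their own column median. The data frame `toy` and the per-column choice are assumptions, since the report does not show this step:

```r
# Toy data frame with one missing value in each numeric column (illustrative).
toy <- data.frame(
  age = c(19, NA, 28, 33),
  bmi = c(27.9, 33.8, NA, 22.7)
)

# Number of missing values per column.
na_counts <- colSums(is.na(toy))
print(na_counts)

# Replace missing numeric values with the median of the observed values
# in the same column.
toy_imputed <- toy
for (col in names(toy_imputed)) {
  if (is.numeric(toy_imputed[[col]])) {
    med <- median(toy_imputed[[col]], na.rm = TRUE)
    toy_imputed[[col]][is.na(toy_imputed[[col]])] <- med
  }
}

print(toy_imputed)
```

Unlike `na.omit()`, this keeps every row, at the cost of pulling imputed values toward the centre of each column.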
df
## age sex bmi children smoker region expenses
## 1 19 female 27.9 0 yes southwest 16884.92
## 2 18 male 33.8 1 no southeast 1725.55
## 3 28 male 33.0 3 no southeast 4449.46
## 4 33 male 22.7 0 no northwest 21984.47
## 5 32 male 28.9 0 no northwest 3866.86
## 6 31 female 25.7 0 no southeast 3756.62
## 7 46 female 33.4 1 no southeast 8240.59
## 8 37 female 27.7 3 no northwest 7281.51
## 9 37 male 29.8 2 no northeast 6406.41
## 10 60 female 25.8 0 no northwest 28923.14
## 11 25 male 26.2 0 no northeast 2721.32
## 12 62 female 26.3 0 yes southeast 27808.73
## 13 23 male 34.4 0 no southwest 1826.84
## 14 56 female 39.8 0 no southeast 11090.72
## 15 27 male 42.1 0 yes southeast 39611.76
## 16 19 male 24.6 1 no southwest 1837.24
## 17 52 female 30.8 1 no northeast 10797.34
## 18 23 male 23.8 0 no northeast 2395.17
## 19 56 male 40.3 0 no southwest 10602.39
## 20 30 male 35.3 0 yes southwest 36837.47
## 21 60 female 36.0 0 no northeast 13228.85
## 22 30 female 32.4 1 no southwest 4149.74
## 23 18 male 34.1 0 no southeast 1137.01
## 24 34 female 31.9 1 yes northeast 37701.88
## 25 37 male 28.0 2 no northwest 6203.90
## 26 59 female 27.7 3 no southeast 14001.13
## 27 63 female 23.1 0 no northeast 14451.84
## 28 55 female 32.8 2 no northwest 12268.63
## 29 23 male 17.4 1 no northwest 2775.19
## 30 31 male 36.3 2 yes southwest 38711.00
## 31 22 male 35.6 0 yes southwest 35585.58
## 32 18 female 26.3 0 no northeast 2198.19
## 33 19 female 28.6 5 no southwest 4687.80
## 34 63 male 28.3 0 no northwest 13770.10
## 35 28 male 36.4 1 yes southwest 51194.56
## 36 19 male 20.4 0 no northwest 1625.43
## 37 62 female 33.0 3 no northwest 15612.19
## 38 26 male 20.8 0 no southwest 2302.30
## 39 35 male 36.7 1 yes northeast 39774.28
## 40 60 male 39.9 0 yes southwest 48173.36
## 41 24 female 26.6 0 no northeast 3046.06
## 42 31 female 36.6 2 no southeast 4949.76
## 43 41 male 21.8 1 no southeast 6272.48
## 44 37 female 30.8 2 no southeast 6313.76
## 45 38 male 37.1 1 no northeast 6079.67
## 46 55 male 37.3 0 no southwest 20630.28
## 47 18 female 38.7 2 no northeast 3393.36
## 48 28 female 34.8 0 no northwest 3556.92
## 49 60 female 24.5 0 no southeast 12629.90
## 50 36 male 35.2 1 yes southeast 38709.18
## 51 18 female 35.6 0 no northeast 2211.13
## 52 21 female 33.6 2 no northwest 3579.83
## 53 48 male 28.0 1 yes southwest 23568.27
## 54 36 male 34.4 0 yes southeast 37742.58
## 55 40 female 28.7 3 no northwest 8059.68
## 56 58 male 37.0 2 yes northwest 47496.49
## 57 58 female 31.8 2 no northeast 13607.37
## 58 18 male 31.7 2 yes southeast 34303.17
## 59 53 female 22.9 1 yes southeast 23244.79
## 60 34 female 37.3 2 no northwest 5989.52
## 61 43 male 27.4 3 no northeast 8606.22
## 62 25 male 33.7 4 no southeast 4504.66
## 63 64 male 24.7 1 no northwest 30166.62
## 64 28 female 25.9 1 no northwest 4133.64
## 65 20 female 22.4 0 yes northwest 14711.74
## 66 19 female 28.9 0 no southwest 1743.21
## 67 61 female 39.1 2 no southwest 14235.07
## 68 40 male 26.3 1 no northwest 6389.38
## 69 40 female 36.2 0 no southeast 5920.10
## 70 28 male 24.0 3 yes southeast 17663.14
## 71 27 female 24.8 0 yes southeast 16577.78
## 72 31 male 28.5 5 no northeast 6799.46
## 73 53 female 28.1 3 no southwest 11741.73
## 74 58 male 32.0 1 no southeast 11946.63
## 75 44 male 27.4 2 no southwest 7726.85
## 76 57 male 34.0 0 no northwest 11356.66
## 77 29 female 29.6 1 no southeast 3947.41
## 78 21 male 35.5 0 no southeast 1532.47
## 79 22 female 39.8 0 no northeast 2755.02
## 80 41 female 33.0 0 no northwest 6571.02
## 81 31 male 26.9 1 no northeast 4441.21
## 82 45 female 38.3 0 no northeast 7935.29
## 83 22 male 37.6 1 yes southeast 37165.16
## 84 48 female 41.2 4 no northwest 11033.66
## 85 37 female 34.8 2 yes southwest 39836.52
## 86 45 male 22.9 2 yes northwest 21098.55
## 87 57 female 31.2 0 yes northwest 43578.94
## 88 56 female 27.2 0 no southwest 11073.18
## 89 46 female 27.7 0 no northwest 8026.67
## 90 55 female 27.0 0 no northwest 11082.58
## 91 21 female 39.5 0 no southeast 2026.97
## 92 53 female 24.8 1 no northwest 10942.13
## 93 59 male 29.8 3 yes northeast 30184.94
## 94 35 male 34.8 2 no northwest 5729.01
## 95 64 female 31.3 2 yes southwest 47291.06
## 96 28 female 37.6 1 no southeast 3766.88
## 97 54 female 30.8 3 no southwest 12105.32
## 98 55 male 38.3 0 no southeast 10226.28
## 99 56 male 20.0 0 yes northeast 22412.65
## 100 38 male 19.3 0 yes southwest 15820.70
## 101 41 female 31.6 0 no southwest 6186.13
## 102 30 male 25.5 0 no northeast 3645.09
## 103 18 female 30.1 0 no northeast 21344.85
## 104 61 female 29.9 3 yes southeast 30942.19
## 105 34 female 27.5 1 no southwest 5003.85
## 106 20 male 28.0 1 yes northwest 17560.38
## 107 19 female 28.4 1 no southwest 2331.52
## 108 26 male 30.9 2 no northwest 3877.30
## 109 29 male 27.9 0 no southeast 2867.12
## 110 63 male 35.1 0 yes southeast 47055.53
## 111 54 male 33.6 1 no northwest 10825.25
## 112 55 female 29.7 2 no southwest 11881.36
## 113 37 male 30.8 0 no southwest 4646.76
## 114 21 female 35.7 0 no northwest 2404.73
## 115 52 male 32.2 3 no northeast 11488.32
## 116 60 male 28.6 0 no northeast 30260.00
## 117 58 male 49.1 0 no southeast 11381.33
## 118 29 female 27.9 1 yes southeast 19107.78
## 119 49 female 27.2 0 no southeast 8601.33
## 120 37 female 23.4 2 no northwest 6686.43
## 121 44 male 37.1 2 no southwest 7740.34
## 122 18 male 23.8 0 no northeast 1705.62
## 123 20 female 29.0 0 no northwest 2257.48
## 124 44 male 31.4 1 yes northeast 39556.49
## 125 47 female 33.9 3 no northwest 10115.01
## 126 26 female 28.8 0 no northeast 3385.40
## 127 19 female 28.3 0 yes southwest 17081.08
## 128 52 female 37.4 0 no southwest 9634.54
## 129 32 female 17.8 2 yes northwest 32734.19
## 130 38 male 34.7 2 no southwest 6082.41
## 131 59 female 26.5 0 no northeast 12815.44
## 132 61 female 22.0 0 no northeast 13616.36
## 133 53 female 35.9 2 no southwest 11163.57
## 134 19 male 25.6 0 no northwest 1632.56
## 135 20 female 28.8 0 no northeast 2457.21
## 136 22 female 28.1 0 no southeast 2155.68
## 137 19 male 34.1 0 no southwest 1261.44
## 138 22 male 25.2 0 no northwest 2045.69
## 139 54 female 31.9 3 no southeast 27322.73
## 140 22 female 36.0 0 no southwest 2166.73
## 141 34 male 22.4 2 no northeast 27375.90
## 142 26 male 32.5 1 no northeast 3490.55
## 143 34 male 25.3 2 yes southeast 18972.50
## 144 29 male 29.7 2 no northwest 18157.88
## 145 30 male 28.7 3 yes northwest 20745.99
## 146 29 female 38.8 3 no southeast 5138.26
## 147 46 male 30.5 3 yes northwest 40720.55
## 148 51 female 37.7 1 no southeast 9877.61
## 149 53 female 37.4 1 no northwest 10959.69
## 150 19 male 28.4 1 no southwest 1842.52
## 151 35 male 24.1 1 no northwest 5125.22
## 152 48 male 29.7 0 no southeast 7789.64
## 153 32 female 37.1 3 no northeast 6334.34
## 154 42 female 23.4 0 yes northeast 19964.75
## 155 40 female 25.5 1 no northeast 7077.19
## 156 44 male 39.5 0 no northwest 6948.70
## 157 48 male 24.4 0 yes southeast 21223.68
## 158 18 male 25.2 0 yes northeast 15518.18
## 159 30 male 35.5 0 yes southeast 36950.26
## 160 50 female 27.8 3 no southeast 19749.38
## 161 42 female 26.6 0 yes northwest 21348.71
## 162 18 female 36.9 0 yes southeast 36149.48
## 163 54 male 39.6 1 no southwest 10450.55
## 164 32 female 29.8 2 no southwest 5152.13
## 165 37 male 29.6 0 no northwest 5028.15
## 166 47 male 28.2 4 no northeast 10407.09
## 167 20 female 37.0 5 no southwest 4830.63
## 168 32 female 33.2 3 no northwest 6128.80
## 169 19 female 31.8 1 no northwest 2719.28
## 170 27 male 18.9 3 no northeast 4827.90
## 171 63 male 41.5 0 no southeast 13405.39
## 172 49 male 30.3 0 no southwest 8116.68
## 173 18 male 16.0 0 no northeast 1694.80
## 174 35 female 34.8 1 no southwest 5246.05
## 175 24 female 33.3 0 no northwest 2855.44
## 176 63 female 37.7 0 yes southwest 48824.45
## 177 38 male 27.8 2 no northwest 6455.86
## 178 54 male 29.2 1 no southwest 10436.10
## 179 46 female 28.9 2 no southwest 8823.28
## 180 41 female 33.2 3 no northeast 8538.29
## 181 58 male 28.6 0 no northwest 11735.88
## 182 18 female 38.3 0 no southeast 1631.82
## 183 22 male 20.0 3 no northeast 4005.42
## 184 44 female 26.4 0 no northwest 7419.48
## 185 44 male 30.7 2 no southeast 7731.43
## 186 36 male 41.9 3 yes northeast 43753.34
## 187 26 female 29.9 2 no southeast 3981.98
## 188 30 female 30.9 3 no southwest 5325.65
## 189 41 female 32.2 1 no southwest 6775.96
## 190 29 female 32.1 2 no northwest 4922.92
## 191 61 male 31.6 0 no southeast 12557.61
## 192 36 female 26.2 0 no southwest 4883.87
## 193 25 male 25.7 0 no southeast 2137.65
## 194 56 female 26.6 1 no northwest 12044.34
## 195 18 male 34.4 0 no southeast 1137.47
## 196 19 male 30.6 0 no northwest 1639.56
## 197 39 female 32.8 0 no southwest 5649.72
## 198 45 female 28.6 2 no southeast 8516.83
## 199 51 female 18.1 0 no northwest 9644.25
## 200 64 female 39.3 0 no northeast 14901.52
## 201 19 female 32.1 0 no northwest 2130.68
## 202 48 female 32.2 1 no southeast 8871.15
## 203 60 female 24.0 0 no northwest 13012.21
## 204 27 female 36.1 0 yes southeast 37133.90
## 205 46 male 22.3 0 no southwest 7147.11
## 206 28 female 28.9 1 no northeast 4337.74
## 207 59 male 26.4 0 no southeast 11743.30
## 208 35 male 27.7 2 yes northeast 20984.09
## 209 63 female 31.8 0 no southwest 13880.95
## 210 40 male 41.2 1 no northeast 6610.11
## 211 20 male 33.0 1 no southwest 1980.07
## 212 40 male 30.9 4 no northwest 8162.72
## 213 24 male 28.5 2 no northwest 3537.70
## 214 34 female 26.7 1 no southeast 5002.78
## 215 45 female 30.9 2 no southwest 8520.03
## 216 41 female 37.1 2 no southwest 7371.77
## 217 53 female 26.6 0 no northwest 10355.64
## 218 27 male 23.1 0 no southeast 2483.74
## 219 26 female 29.9 1 no southeast 3392.98
## 220 24 female 23.2 0 no southeast 25081.77
## 221 34 female 33.7 1 no southwest 5012.47
## 222 53 female 33.3 0 no northeast 10564.88
## 223 32 male 30.8 3 no southwest 5253.52
## 224 19 male 34.8 0 yes southwest 34779.62
## 225 42 male 24.6 0 yes southeast 19515.54
## 226 55 male 33.9 3 no southeast 11987.17
## 227 28 male 38.1 0 no southeast 2689.50
## 228 58 female 41.9 0 no southeast 24227.34
## 229 41 female 31.6 1 no northeast 7358.18
## 230 47 male 25.5 2 no northeast 9225.26
## 231 42 female 36.2 1 no northwest 7443.64
## 232 59 female 27.8 3 no southeast 14001.29
## 233 19 female 17.8 0 no southwest 1727.79
## 234 59 male 27.5 1 no southwest 12333.83
## 235 39 male 24.5 2 no northwest 6710.19
## 236 40 female 22.2 2 yes southeast 19444.27
## 237 18 female 26.7 0 no southeast 1615.77
## 238 31 male 38.4 2 no southeast 4463.21
## 239 19 male 29.1 0 yes northwest 17352.68
## 240 44 male 38.1 1 no southeast 7152.67
## 241 23 female 36.7 2 yes northeast 38511.63
## 242 33 female 22.1 1 no northeast 5354.07
## 243 55 female 26.8 1 no southwest 35160.13
## 244 40 male 35.3 3 no southwest 7196.87
## 245 63 female 27.7 0 yes northeast 29523.17
## 246 54 male 30.0 0 no northwest 24476.48
## 247 60 female 38.1 0 no southeast 12648.70
## 248 24 male 35.9 0 no southeast 1986.93
## 249 19 male 20.9 1 no southwest 1832.09
## 250 29 male 29.0 1 no northeast 4040.56
## 251 18 male 17.3 2 yes northeast 12829.46
## 252 63 female 32.2 2 yes southwest 47305.31
## 253 54 male 34.2 2 yes southeast 44260.75
## 254 27 male 30.3 3 no southwest 4260.74
## 255 50 male 31.8 0 yes northeast 41097.16
## 256 55 female 25.4 3 no northeast 13047.33
## 257 56 male 33.6 0 yes northwest 43921.18
## 258 38 female 40.2 0 no southeast 5400.98
## 259 51 male 24.4 4 no northwest 11520.10
## 260 19 male 31.9 0 yes northwest 33750.29
## 261 58 female 25.2 0 no southwest 11837.16
## 262 20 female 26.8 1 yes southeast 17085.27
## 263 52 male 24.3 3 yes northeast 24869.84
## 264 19 male 37.0 0 yes northwest 36219.41
## 265 53 female 38.1 3 no southeast 20463.00
## 266 46 male 42.4 3 yes southeast 46151.12
## 267 40 male 19.8 1 yes southeast 17179.52
## 268 59 female 32.4 3 no northeast 14590.63
## 269 45 male 30.2 1 no southwest 7441.05
## 270 49 male 25.8 1 no northeast 9282.48
## 271 18 male 29.4 1 no southeast 1719.44
## 272 50 male 34.2 2 yes southwest 42856.84
## 273 41 male 37.1 2 no northwest 7265.70
## 274 50 male 27.5 1 no northeast 9617.66
## 275 25 male 27.6 0 no northwest 2523.17
## 276 47 female 26.6 2 no northeast 9715.84
## 277 19 male 20.6 2 no northwest 2803.70
## 278 22 female 24.3 0 no southwest 2150.47
## 279 59 male 31.8 2 no southeast 12928.79
## 280 51 female 21.6 1 no southeast 9855.13
## 281 40 female 28.1 1 yes northeast 22331.57
## 282 54 male 40.6 3 yes northeast 48549.18
## 283 30 male 27.6 1 no northeast 4237.13
## 284 55 female 32.4 1 no northeast 11879.10
## 285 52 female 31.2 0 no southwest 9625.92
## 286 46 male 26.6 1 no southeast 7742.11
## 287 46 female 48.1 2 no northeast 9432.93
## 288 63 female 26.2 0 no northwest 14256.19
## 289 59 female 36.8 1 yes northeast 47896.79
## 290 52 male 26.4 3 no southeast 25992.82
## 291 28 female 33.4 0 no southwest 3172.02
## 292 29 male 29.6 1 no northeast 20277.81
## 293 25 male 45.5 2 yes southeast 42112.24
## 294 22 female 28.8 0 no southeast 2156.75
## 295 25 male 26.8 3 no southwest 3906.13
## 296 18 male 23.0 0 no northeast 1704.57
## 297 19 male 27.7 0 yes southwest 16297.85
## 298 47 male 25.4 1 yes southeast 21978.68
## 299 31 male 34.4 3 yes northwest 38746.36
## 300 48 female 28.9 1 no northwest 9249.50
## 301 36 male 27.6 3 no northeast 6746.74
## 302 53 female 22.6 3 yes northeast 24873.38
## 303 56 female 37.5 2 no southeast 12265.51
## 304 28 female 33.0 2 no southeast 4349.46
## 305 57 female 38.0 2 no southwest 12646.21
## 306 29 male 33.3 2 no northwest 19442.35
## 307 28 female 27.5 2 no southwest 20177.67
## 308 30 female 33.3 1 no southeast 4151.03
## 309 58 male 34.9 0 no northeast 11944.59
## 310 41 female 33.1 2 no northwest 7749.16
## 311 50 male 26.6 0 no southwest 8444.47
## 312 19 female 24.7 0 no southwest 1737.38
## 313 43 male 36.0 3 yes southeast 42124.52
## 314 49 male 35.9 0 no southeast 8124.41
## 315 27 female 31.4 0 yes southwest 34838.87
## 316 52 male 33.3 0 no northeast 9722.77
## 317 50 male 32.2 0 no northwest 8835.26
## 318 54 male 32.8 0 no northeast 10435.07
## 319 44 female 27.6 0 no northwest 7421.19
## 320 32 male 37.3 1 no northeast 4667.61
## 321 34 male 25.3 1 no northwest 4894.75
## 322 26 female 29.6 4 no northeast 24671.66
## 323 34 male 30.8 0 yes southwest 35491.64
## 324 57 male 40.9 0 no northeast 11566.30
## 325 29 male 27.2 0 no southwest 2866.09
## 326 40 male 34.1 1 no northeast 6600.21
## 327 27 female 23.2 1 no southeast 3561.89
## 328 45 male 36.5 2 yes northwest 42760.50
## 329 64 female 33.8 1 yes southwest 47928.03
## 330 52 male 36.7 0 no southwest 9144.57
## 331 61 female 36.4 1 yes northeast 48517.56
## 332 52 male 27.4 0 yes northwest 24393.62
## 333 61 female 31.2 0 no northwest 13429.04
## 334 56 female 28.8 0 no northeast 11658.38
## 335 43 female 35.7 2 no northeast 19144.58
## 336 64 male 34.5 0 no southwest 13822.80
## 337 60 male 25.7 0 no southeast 12142.58
## 338 62 male 27.6 1 no northwest 13937.67
## 339 50 male 32.3 1 yes northeast 41919.10
## 340 46 female 27.7 1 no southeast 8232.64
## 341 24 female 27.6 0 no southwest 18955.22
## 342 62 male 30.0 0 no northwest 13352.10
## 343 60 female 27.6 0 no northeast 13217.09
## 344 63 male 36.8 0 no northeast 13981.85
## 345 49 female 41.5 4 no southeast 10977.21
## 346 34 female 29.3 3 no southeast 6184.30
## 347 33 male 35.8 2 no southeast 4890.00
## 348 46 male 33.3 1 no northeast 8334.46
## 349 36 female 29.9 1 no southeast 5478.04
## 350 19 male 27.8 0 no northwest 1635.73
## 351 57 female 23.2 0 no northwest 11830.61
## 352 50 female 25.6 0 no southwest 8932.08
## 353 30 female 27.7 0 no southwest 3554.20
## 354 33 male 35.2 0 no northeast 12404.88
## 355 18 female 38.3 0 no southeast 14133.04
## 356 46 male 27.6 0 no southwest 24603.05
## 357 46 male 43.9 3 no southeast 8944.12
## 358 47 male 29.8 3 no northwest 9620.33
## 359 23 male 41.9 0 no southeast 1837.28
## 360 18 female 20.8 0 no southeast 1607.51
## 361 48 female 32.3 2 no northeast 10043.25
## 362 35 male 30.5 1 no southwest 4751.07
## 363 19 female 21.7 0 yes southwest 13844.51
## 364 21 female 26.4 1 no southwest 2597.78
## 365 21 female 21.9 2 no southeast 3180.51
## 366 49 female 30.8 1 no northeast 9778.35
## 367 56 female 32.3 3 no northeast 13430.27
## 368 42 female 25.0 2 no northwest 8017.06
## 369 44 male 32.0 2 no northwest 8116.27
## 370 18 male 30.4 3 no northeast 3481.87
## 371 61 female 21.1 0 no northwest 13415.04
## 372 57 female 22.2 0 no northeast 12029.29
## 373 42 female 33.2 1 no northeast 7639.42
## 374 26 male 32.9 2 yes southwest 36085.22
## 375 20 male 33.3 0 no southeast 1391.53
## 376 23 female 28.3 0 yes northwest 18033.97
## 377 39 female 24.9 3 yes northeast 21659.93
## 378 24 male 40.2 0 yes southeast 38126.25
## 379 64 female 30.1 3 no northwest 16455.71
## 380 62 male 31.5 1 no southeast 27000.98
## 381 27 female 18.0 2 yes northeast 15006.58
## 382 55 male 30.7 0 yes northeast 42303.69
## 383 55 male 33.0 0 no southeast 20781.49
## 384 35 female 43.3 2 no southeast 5846.92
## 385 44 male 22.1 2 no northeast 8302.54
## 386 19 male 34.4 0 no southwest 1261.86
## 387 58 female 39.1 0 no southeast 11856.41
## 388 50 male 25.4 2 no northwest 30284.64
## 389 26 female 22.6 0 no northwest 3176.82
## 390 24 female 30.2 3 no northwest 4618.08
## 391 48 male 35.6 4 no northeast 10736.87
## 392 19 female 37.4 0 no northwest 2138.07
## 393 48 male 31.4 1 no northeast 8964.06
## 394 49 male 31.4 1 no northeast 9290.14
## 395 46 female 32.3 2 no northeast 9411.01
## 396 46 male 19.9 0 no northwest 7526.71
## 397 43 female 34.4 3 no southwest 8522.00
## 398 21 male 31.0 0 no southeast 16586.50
## 399 64 male 25.6 2 no southwest 14988.43
## 400 18 female 38.2 0 no southeast 1631.67
## 401 51 female 20.6 0 no southwest 9264.80
## 402 47 male 47.5 1 no southeast 8083.92
## 403 64 female 33.0 0 no northwest 14692.67
## 404 49 male 32.3 3 no northwest 10269.46
## 405 31 male 20.4 0 no southwest 3260.20
## 406 52 female 38.4 2 no northeast 11396.90
## 407 33 female 24.3 0 no southeast 4185.10
## 408 47 female 23.6 1 no southwest 8539.67
## 409 38 male 21.1 3 no southeast 6652.53
## 410 32 male 30.0 1 no southeast 4074.45
## 411 19 male 17.5 0 no northwest 1621.34
## 412 44 female 20.2 1 yes northeast 19594.81
## 413 26 female 17.2 2 yes northeast 14455.64
## 414 25 male 23.9 5 no southwest 5080.10
## 415 19 female 35.2 0 no northwest 2134.90
## 416 43 female 35.6 1 no southeast 7345.73
## 417 52 male 34.1 0 no southeast 9140.95
## 418 36 female 22.6 2 yes southwest 18608.26
## 419 64 male 39.2 1 no southeast 14418.28
## 420 63 female 27.0 0 yes northwest 28950.47
## 421 64 male 33.9 0 yes southeast 46889.26
## 422 61 male 35.9 0 yes southeast 46599.11
## 423 40 male 32.8 1 yes northeast 39125.33
## 424 25 male 30.6 0 no northeast 2727.40
## 425 48 male 30.2 2 no southwest 8968.33
## 426 45 male 24.3 5 no southeast 9788.87
## 427 38 female 27.3 1 no northeast 6555.07
## 428 18 female 29.2 0 no northeast 7323.73
## 429 21 female 16.8 1 no northeast 3167.46
## 430 27 female 30.4 3 no northwest 18804.75
## 431 19 male 33.1 0 no southwest 23082.96
## 432 29 female 20.2 2 no northwest 4906.41
## 433 42 male 26.9 0 no southwest 5969.72
## 434 60 female 30.5 0 no southwest 12638.20
## 435 31 male 28.6 1 no northwest 4243.59
## 436 60 male 33.1 3 no southeast 13919.82
## 437 22 male 31.7 0 no northeast 2254.80
## 438 35 male 28.9 3 no southwest 5926.85
## 439 52 female 46.8 5 no southeast 12592.53
## 440 26 male 29.5 0 no northeast 2897.32
## 441 31 female 32.7 1 no northwest 4738.27
## 442 33 female 33.5 0 yes southwest 37079.37
## 443 18 male 43.0 0 no southeast 1149.40
## 444 59 female 36.5 1 no southeast 28287.90
## 445 56 male 26.7 1 yes northwest 26109.33
## 446 45 female 33.1 0 no southwest 7345.08
## 447 60 male 29.6 0 no northeast 12731.00
## 448 56 female 25.7 0 no northwest 11454.02
## 449 40 female 29.6 0 no southwest 5910.94
## 450 35 male 38.6 1 no southwest 4762.33
## 451 39 male 29.6 4 no southwest 7512.27
## 452 30 male 24.1 1 no northwest 4032.24
## 453 24 male 23.4 0 no southwest 1969.61
## 454 20 male 29.7 0 no northwest 1769.53
## 455 32 male 46.5 2 no southeast 4686.39
## 456 59 male 37.4 0 no southwest 21797.00
## 457 55 female 30.1 2 no southeast 11881.97
## 458 57 female 30.5 0 no northwest 11840.78
## 459 56 male 39.6 0 no southwest 10601.41
## 460 40 female 33.0 3 no southeast 7682.67
## 461 49 female 36.6 3 no southeast 10381.48
## 462 42 male 30.0 0 yes southwest 22144.03
## 463 62 female 38.1 2 no northeast 15230.32
## 464 56 male 25.9 0 no northeast 11165.42
## 465 19 male 25.2 0 no northwest 1632.04
## 466 30 female 28.4 1 yes southeast 19521.97
## 467 60 female 28.7 1 no southwest 13224.69
## 468 56 female 33.8 2 no northwest 12643.38
## 469 28 female 24.3 1 no northeast 23288.93
## 470 18 female 24.1 1 no southeast 2201.10
## 471 27 male 32.7 0 no southeast 2497.04
## 472 18 female 30.1 0 no northeast 2203.47
## 473 19 female 29.8 0 no southwest 1744.47
## 474 47 female 33.3 0 no northeast 20878.78
## 475 54 male 25.1 3 yes southwest 25382.30
## 476 61 male 28.3 1 yes northwest 28868.66
## 477 24 male 28.5 0 yes northeast 35147.53
## 478 25 male 35.6 0 no northwest 2534.39
## 479 21 male 36.9 0 no southeast 1534.30
## 480 23 male 32.6 0 no southeast 1824.29
## 481 63 male 41.3 3 no northwest 15555.19
## 482 49 male 37.5 2 no southeast 9304.70
## 483 18 female 31.4 0 no southeast 1622.19
## 484 51 female 39.5 1 no southwest 9880.07
## 485 48 male 34.3 3 no southwest 9563.03
## 486 31 female 31.1 0 no northeast 4347.02
## 487 54 female 21.5 3 no northwest 12475.35
## 488 19 male 28.7 0 no southwest 1253.94
## 489 44 female 38.1 0 yes southeast 48885.14
## 490 53 male 31.2 1 no northwest 10461.98
## 491 19 female 32.9 0 no southwest 1748.77
## 492 61 female 25.1 0 no southeast 24513.09
## 493 18 female 25.1 0 no northeast 2196.47
## 494 61 male 43.4 0 no southwest 12574.05
## 495 21 male 25.7 4 yes southwest 17942.11
## 496 20 male 27.9 0 no northeast 1967.02
## 497 31 female 23.6 2 no southwest 4931.65
## 498 45 male 28.7 2 no southwest 8027.97
## 499 44 female 24.0 2 no southeast 8211.10
## 500 62 female 39.2 0 no southwest 13470.86
## 501 29 male 34.4 0 yes southwest 36197.70
## 502 43 male 26.0 0 no northeast 6837.37
## 503 51 male 23.2 1 yes southeast 22218.11
## 504 19 male 30.3 0 yes southeast 32548.34
## 505 38 female 28.9 1 no southeast 5974.38
## 506 37 male 30.9 3 no northwest 6796.86
## 507 22 male 31.4 1 no northwest 2643.27
## 508 21 male 23.8 2 no northwest 3077.10
## 509 24 female 25.3 0 no northeast 3044.21
## 510 57 female 28.7 0 no southwest 11455.28
## 511 56 male 32.1 1 no northeast 11763.00
## 512 27 male 33.7 0 no southeast 2498.41
## 513 51 male 22.4 0 no northeast 9361.33
## 514 19 male 30.4 0 no southwest 1256.30
## 515 39 male 28.3 1 yes southwest 21082.16
## 516 58 male 35.7 0 no southwest 11362.76
## 517 20 male 35.3 1 no southeast 27724.29
## 518 45 male 30.5 2 no northwest 8413.46
## 519 35 female 31.0 1 no southwest 5240.77
## 520 31 male 30.9 0 no northeast 3857.76
## 521 50 female 27.4 0 no northeast 25656.58
## 522 32 female 44.2 0 no southeast 3994.18
## 523 51 female 33.9 0 no northeast 9866.30
## 524 38 female 37.7 0 no southeast 5397.62
## 525 42 male 26.1 1 yes southeast 38245.59
## 526 18 female 33.9 0 no southeast 11482.63
## 527 19 female 30.6 2 no northwest 24059.68
## 528 51 female 25.8 1 no southwest 9861.03
## 529 46 male 39.4 1 no northeast 8342.91
## 530 18 male 25.5 0 no northeast 1708.00
## 531 57 male 42.1 1 yes southeast 48675.52
## 532 62 female 31.7 0 no northeast 14043.48
## 533 59 male 29.7 2 no southeast 12925.89
## 534 37 male 36.2 0 no southeast 19214.71
## 535 64 male 40.5 0 no southeast 13831.12
## 536 38 male 28.0 1 no northeast 6067.13
## 537 33 female 38.9 3 no southwest 5972.38
## 538 46 female 30.2 2 no southwest 8825.09
## 539 46 female 28.1 1 no southeast 8233.10
## 540 53 male 31.4 0 no southeast 27346.04
## 541 34 female 38.0 3 no southwest 6196.45
## 542 20 female 31.8 2 no southeast 3056.39
## 543 63 female 36.3 0 no southeast 13887.20
## 544 54 female 47.4 0 yes southeast 63770.43
## 545 54 male 30.2 0 no northwest 10231.50
## 546 49 male 25.8 2 yes northwest 23807.24
## 547 28 male 35.4 0 no northeast 3268.85
## 548 54 female 46.7 2 no southwest 11538.42
## 549 25 female 28.6 0 no northeast 3213.62
## 550 43 female 46.2 0 yes southeast 45863.21
## 551 63 male 30.8 0 no southwest 13390.56
## 552 32 female 28.9 0 no southeast 3972.92
## 553 62 male 21.4 0 no southwest 12957.12
## 554 52 female 31.7 2 no northwest 11187.66
## 555 25 female 41.3 0 no northeast 17878.90
## 556 28 male 23.8 2 no southwest 3847.67
## 557 46 male 33.4 1 no northeast 8334.59
## 558 34 male 34.2 0 no southeast 3935.18
## 559 35 female 34.1 3 yes northwest 39983.43
## 560 19 male 35.5 0 no northwest 1646.43
## 561 46 female 20.0 2 no northwest 9193.84
## 562 54 female 32.7 0 no northeast 10923.93
## 563 27 male 30.5 0 no southwest 2494.02
## 564 50 male 44.8 1 no southeast 9058.73
## 565 18 female 32.1 2 no southeast 2801.26
## 566 19 female 30.5 0 no northwest 2128.43
## 567 38 female 40.6 1 no northwest 6373.56
## 568 41 male 30.6 2 no northwest 7256.72
## 569 49 female 31.9 5 no southwest 11552.90
## 570 48 male 40.6 2 yes northwest 45702.02
## 571 31 female 29.1 0 no southwest 3761.29
## 572 18 female 37.3 1 no southeast 2219.45
## 573 30 female 43.1 2 no southeast 4753.64
## 574 62 female 36.9 1 no northeast 31620.00
## 575 57 female 34.3 2 no northeast 13224.06
## 576 58 female 27.2 0 no northwest 12222.90
## 577 22 male 26.8 0 no southeast 1665.00
## 578 31 female 38.1 1 yes northeast 58571.07
## 579 52 male 30.2 1 no southwest 9724.53
## 580 25 female 23.5 0 no northeast 3206.49
## 581 59 male 25.5 1 no northeast 12913.99
## 583 39 male 45.4 2 no southeast 6356.27
## 584 32 female 23.7 1 no southeast 17626.24
## 585 19 male 20.7 0 no southwest 1242.82
## 586 33 female 28.3 1 no southeast 4779.60
## 587 21 male 20.2 3 no northeast 3861.21
## 588 34 female 30.2 1 yes northwest 43943.88
## 589 61 female 35.9 0 no northeast 13635.64
## 590 38 female 30.7 1 no southeast 5976.83
## 591 58 female 29.0 0 no southwest 11842.44
## 592 47 male 19.6 1 no northwest 8428.07
## 593 20 male 31.1 2 no southeast 2566.47
## 594 21 female 21.9 1 yes northeast 15359.10
## 595 41 male 40.3 0 no southeast 5709.16
## 596 46 female 33.7 1 no northeast 8823.99
## 597 42 female 29.5 2 no southeast 7640.31
## 598 34 female 33.3 1 no northeast 5594.85
## 599 43 male 32.6 2 no southwest 7441.50
## 600 52 female 37.5 2 no northwest 33471.97
## 601 18 female 39.2 0 no southeast 1633.04
## 602 51 male 31.6 0 no northwest 9174.14
## 603 56 female 25.3 0 no southwest 11070.54
## 604 64 female 39.1 3 no southeast 16085.13
## 605 19 female 28.3 0 yes northwest 17468.98
## 606 51 female 34.1 0 no southeast 9283.56
## 607 27 female 25.2 0 no northeast 3558.62
## 608 59 female 23.7 0 yes northwest 25678.78
## 609 28 male 27.0 2 no northeast 4435.09
## 610 30 male 37.8 2 yes southwest 39241.44
## 611 47 female 29.4 1 no southeast 8547.69
## 612 38 female 34.8 2 no southwest 6571.54
## 613 18 female 33.2 0 no northeast 2207.70
## 614 34 female 19.0 3 no northeast 6753.04
## 615 20 female 33.0 0 no southeast 1880.07
## 616 47 female 36.6 1 yes southeast 42969.85
## 617 56 female 28.6 0 no northeast 11658.12
## 618 49 male 25.6 2 yes southwest 23306.55
## 619 19 female 33.1 0 yes southeast 34439.86
## 620 55 female 37.1 0 no southwest 10713.64
## 621 30 male 31.4 1 no southwest 3659.35
## 622 37 male 34.1 4 yes southwest 40182.25
## 623 49 female 21.3 1 no southwest 9182.17
## 624 18 male 33.5 0 yes northeast 34617.84
## 625 59 male 28.8 0 no northwest 12129.61
## 626 29 female 26.0 0 no northwest 3736.46
## 627 36 male 28.9 3 no northeast 6748.59
## 628 33 male 42.5 1 no southeast 11326.71
## 629 58 male 38.0 0 no southwest 11365.95
## 630 44 female 39.0 0 yes northwest 42983.46
## 631 53 male 36.1 1 no southwest 10085.85
## 632 24 male 29.3 0 no southwest 1977.82
## 633 29 female 35.5 0 no southeast 3366.67
## 634 40 male 22.7 2 no northeast 7173.36
## 635 51 male 39.7 1 no southwest 9391.35
## 636 64 male 38.2 0 no northeast 14410.93
## 637 19 female 24.5 1 no northwest 2709.11
## 638 35 female 38.1 2 no northeast 24915.05
## 639 39 male 26.4 0 yes northeast 20149.32
## 640 56 male 33.7 4 no southeast 12949.16
## 641 33 male 42.4 5 no southwest 6666.24
## 642 42 male 28.3 3 yes northwest 32787.46
## 643 61 male 33.9 0 no northeast 13143.86
## 644 23 female 35.0 3 no northwest 4466.62
## 645 43 male 35.3 2 no southeast 18806.15
## 646 48 male 30.8 3 no northeast 10141.14
## 647 39 male 26.2 1 no northwest 6123.57
## 648 40 female 23.4 3 no northeast 8252.28
## 649 18 male 28.5 0 no northeast 1712.23
## 650 58 female 33.0 0 no northeast 12430.95
## 651 49 female 42.7 2 no southeast 9800.89
## 652 53 female 39.6 1 no southeast 10579.71
## 653 48 female 31.1 0 no southeast 8280.62
## 654 45 female 36.3 2 no southeast 8527.53
## 655 59 female 35.2 0 no southeast 12244.53
## 656 52 female 25.3 2 yes southeast 24667.42
## 657 26 female 42.4 1 no southwest 3410.32
## 658 27 male 33.2 2 no northwest 4058.71
## 659 48 female 35.9 1 no northeast 26392.26
## 660 57 female 28.8 4 no northeast 14394.40
## 661 37 male 46.5 3 no southeast 6435.62
## 662 57 female 24.0 1 no southeast 22192.44
## 663 32 female 31.5 1 no northeast 5148.55
## 664 18 male 33.7 0 no southeast 1136.40
## 665 64 female 23.0 0 yes southeast 27037.91
## 666 43 male 38.1 2 yes southeast 42560.43
## 667 49 male 28.7 1 no southwest 8703.46
## 668 40 female 32.8 2 yes northwest 40003.33
## 669 62 male 32.0 0 yes northeast 45710.21
## 670 40 female 29.8 1 no southeast 6500.24
## 671 30 male 31.6 3 no southeast 4837.58
## 672 29 female 31.2 0 no northeast 3943.60
## 673 36 male 29.7 0 no southeast 4399.73
## 674 41 female 31.0 0 no southeast 6185.32
## 675 44 female 43.9 2 yes southeast 46200.99
## 676 45 male 21.4 0 no northwest 7222.79
## 677 55 female 40.8 3 no southeast 12485.80
## 678 60 male 31.4 3 yes northwest 46130.53
## 679 56 male 36.1 3 no southwest 12363.55
## 680 49 female 23.2 2 no northwest 10156.78
## 681 21 female 17.4 1 no southwest 2585.27
## 682 19 male 20.3 0 no southwest 1242.26
## 683 39 male 35.3 2 yes southwest 40103.89
## 684 53 male 24.3 0 no northwest 9863.47
## 685 33 female 18.5 1 no southwest 4766.02
## 686 53 male 26.4 2 no northeast 11244.38
## 687 42 male 26.1 2 no northeast 7729.65
## 688 40 male 41.7 0 no southeast 5438.75
## 689 47 female 24.1 1 no southwest 26236.58
## 690 27 male 31.1 1 yes southeast 34806.47
## 691 21 male 27.4 0 no northeast 2104.11
## 692 47 male 36.2 1 no southwest 8068.19
## 693 20 male 32.4 1 no northwest 2362.23
## 694 24 male 23.7 0 no northwest 2352.97
## 695 27 female 34.8 1 no southwest 3578.00
## 696 26 female 40.2 0 no northwest 3201.25
## 697 53 female 32.3 2 no northeast 29186.48
## 698 41 male 35.8 1 yes southeast 40273.65
## 699 56 male 33.7 0 no northwest 10976.25
## 700 23 female 39.3 2 no southeast 3500.61
## 701 21 female 34.9 0 no southeast 2020.55
## 702 50 female 44.7 0 no northeast 9541.70
## 703 53 male 41.5 0 no southeast 9504.31
## 704 34 female 26.4 1 no northwest 5385.34
## 705 47 female 29.5 1 no northwest 8930.93
## 706 33 female 32.9 2 no southwest 5375.04
## 707 51 female 38.1 0 yes southeast 44400.41
## 708 49 male 28.7 3 no northwest 10264.44
## 709 31 female 30.5 3 no northeast 6113.23
## 710 36 female 27.7 0 no northeast 5469.01
## 711 18 male 35.2 1 no southeast 1727.54
## 712 50 female 23.5 2 no southeast 10107.22
## 713 43 female 30.7 2 no northwest 8310.84
## 714 20 male 40.5 0 no northeast 1984.45
## 715 24 female 22.6 0 no southwest 2457.50
## 716 60 male 28.9 0 no southwest 12146.97
## 717 49 female 22.6 1 no northwest 9566.99
## 718 60 male 24.3 1 no northwest 13112.60
## 719 51 female 36.7 2 no northwest 10848.13
## 720 58 female 33.4 0 no northwest 12231.61
## 721 51 female 40.7 0 no northeast 9875.68
## 722 53 male 36.6 3 no southwest 11264.54
## 723 62 male 37.4 0 no southwest 12979.36
## 724 19 male 35.4 0 no southwest 1263.25
## 725 50 female 27.1 1 no northeast 10106.13
## 726 30 female 39.1 3 yes southeast 40932.43
## 727 41 male 28.4 1 no northwest 6664.69
## 728 29 female 21.8 1 yes northeast 16657.72
## 729 18 female 40.3 0 no northeast 2217.60
## 730 41 female 36.1 1 no southeast 6781.35
## 731 35 male 24.4 3 yes southeast 19362.00
## 732 53 male 21.4 1 no southwest 10065.41
## 733 24 female 30.1 3 no southwest 4234.93
## 734 48 female 27.3 1 no northeast 9447.25
## 735 59 female 32.1 3 no southwest 14007.22
## 736 49 female 34.8 1 no northwest 9583.89
## 737 37 female 38.4 0 yes southeast 40419.02
## 738 26 male 23.7 2 no southwest 3484.33
## 739 23 male 31.7 3 yes northeast 36189.10
## 740 29 male 35.5 2 yes southwest 44585.46
## 741 45 male 24.0 2 no northeast 8604.48
## 742 27 male 29.2 0 yes southeast 18246.50
## 743 53 male 34.1 0 yes northeast 43254.42
## 744 31 female 26.6 0 no southeast 3757.84
## 745 50 male 26.4 0 no northwest 8827.21
## 746 50 female 30.1 1 no northwest 9910.36
## 747 34 male 27.0 2 no southwest 11737.85
## 748 19 male 21.8 0 no northwest 1627.28
## 749 47 female 36.0 1 no southwest 8556.91
## 750 28 male 30.9 0 no northwest 3062.51
## 751 37 female 26.4 0 yes southeast 19539.24
## 752 21 male 29.0 0 no northwest 1906.36
## 753 64 male 37.9 0 no northwest 14210.54
## 754 58 female 22.8 0 no southeast 11833.78
## 755 24 male 33.6 4 no northeast 17128.43
## 756 31 male 27.6 2 no northeast 5031.27
## 757 39 female 22.8 3 no northeast 7985.82
## 758 47 female 27.8 0 yes southeast 23065.42
## 759 30 male 37.4 3 no northeast 5428.73
## 760 18 male 38.2 0 yes southeast 36307.80
## 761 22 female 34.6 2 no northeast 3925.76
## 762 23 male 35.2 1 no southwest 2416.96
## 763 33 male 27.1 1 yes southwest 19040.88
## 764 27 male 26.0 0 no northeast 3070.81
## 765 45 female 25.2 2 no northeast 9095.07
## 766 57 female 31.8 0 no northwest 11842.62
## 767 47 male 32.3 1 no southwest 8062.76
## 768 42 female 29.0 1 no southwest 7050.64
## 769 64 female 39.7 0 no southwest 14319.03
## 770 38 female 19.5 2 no northwest 6933.24
## 771 61 male 36.1 3 no southwest 27941.29
## 772 53 female 26.7 2 no southwest 11150.78
## 773 44 female 36.5 0 no northeast 12797.21
## 774 19 female 28.9 0 yes northwest 17748.51
## 775 41 male 34.2 2 no northwest 7261.74
## 776 51 male 33.3 3 no southeast 10560.49
## 777 40 male 32.3 2 no northwest 6986.70
## 778 45 male 39.8 0 no northeast 7448.40
## 779 35 male 34.3 3 no southeast 5934.38
## 780 53 male 28.9 0 no northwest 9869.81
## 781 30 male 24.4 3 yes southwest 18259.22
## 782 18 male 41.1 0 no southeast 1146.80
## 783 51 male 36.0 1 no southeast 9386.16
## 784 50 female 27.6 1 yes southwest 24520.26
## 785 31 female 29.3 1 no southeast 4350.51
## 786 35 female 27.7 3 no southwest 6414.18
## 787 60 male 37.0 0 no northeast 12741.17
## 788 21 male 36.9 0 no northwest 1917.32
## 789 29 male 22.5 3 no northeast 5209.58
## 790 62 female 29.9 0 no southeast 13457.96
## 791 39 female 41.8 0 no southeast 5662.23
## 792 19 male 27.6 0 no southwest 1252.41
## 793 22 female 23.2 0 no northeast 2731.91
## 794 53 male 20.9 0 yes southeast 21195.82
## 795 39 female 31.9 2 no northwest 7209.49
## 796 27 male 28.5 0 yes northwest 18310.74
## 797 30 male 44.2 2 no southeast 4266.17
## 798 30 female 22.9 1 no northeast 4719.52
## 799 58 female 33.1 0 no southwest 11848.14
## 800 33 male 24.8 0 yes northeast 17904.53
## 801 42 female 26.2 1 no southeast 7046.72
## 802 64 female 36.0 0 no southeast 14313.85
## 803 21 male 22.3 1 no southwest 2103.08
## 804 18 female 42.2 0 yes southeast 38792.69
## 805 23 male 26.5 0 no southeast 1815.88
## 806 45 female 35.8 0 no northwest 7731.86
## 807 40 female 41.4 1 no northwest 28476.73
## 808 19 female 36.6 0 no northwest 2136.88
## 809 18 male 30.1 0 no southeast 1131.51
## 810 25 male 25.8 1 no northeast 3309.79
## 811 46 female 30.8 3 no southwest 9414.92
## 812 33 female 42.9 3 no northwest 6360.99
## 813 54 male 21.0 2 no southeast 11013.71
## 814 28 male 22.5 2 no northeast 4428.89
## 815 36 male 34.4 2 no southeast 5584.31
## 816 20 female 31.5 0 no southeast 1877.93
## 817 24 female 24.2 0 no northwest 2842.76
## 818 23 male 37.1 3 no southwest 3597.60
## 819 47 female 26.1 1 yes northeast 23401.31
## 820 33 female 35.5 0 yes northwest 55135.40
## 821 45 male 33.7 1 no southwest 7445.92
## 822 26 male 17.7 0 no northwest 2680.95
## 823 18 female 31.1 0 no southeast 1621.88
## 824 44 female 29.8 2 no southeast 8219.20
## 825 60 male 24.3 0 no northwest 12523.60
## 826 64 female 31.8 2 no northeast 16069.08
## 827 56 male 31.8 2 yes southeast 43813.87
## 828 36 male 28.0 1 yes northeast 20773.63
## 829 41 male 30.8 3 yes northeast 39597.41
## 830 39 male 21.9 1 no northwest 6117.49
## 831 63 male 33.1 0 no southwest 13393.76
## 832 36 female 25.8 0 no northwest 5266.37
## 833 28 female 23.8 2 no northwest 4719.74
## 834 58 male 34.4 0 no northwest 11743.93
## 835 36 male 33.8 1 no northwest 5377.46
## 836 42 male 36.0 2 no southeast 7160.33
## 837 36 male 31.5 0 no southwest 4402.23
## 838 56 female 28.3 0 no northeast 11657.72
## 839 35 female 23.5 2 no northeast 6402.29
## 840 59 female 31.4 0 no northwest 12622.18
## 841 21 male 31.1 0 no southwest 1526.31
## 842 59 male 24.7 0 no northeast 12323.94
## 843 23 female 32.8 2 yes southeast 36021.01
## 844 57 female 29.8 0 yes southeast 27533.91
## 845 53 male 30.5 0 no northeast 10072.06
## 846 60 female 32.5 0 yes southeast 45008.96
## 847 51 female 34.2 1 no southwest 9872.70
## 848 23 male 50.4 1 no southeast 2438.06
## 849 27 female 24.1 0 no southwest 2974.13
## 850 55 male 32.8 0 no northwest 10601.63
## 851 37 female 30.8 0 yes northeast 37270.15
## 852 61 male 32.3 2 no northwest 14119.62
## 853 46 female 35.5 0 yes northeast 42111.66
## 854 53 female 23.8 2 no northeast 11729.68
## 855 49 female 23.8 3 yes northeast 24106.91
## 856 20 female 29.6 0 no southwest 1875.34
## 857 48 female 33.1 0 yes southeast 40974.16
## 858 25 male 24.1 0 yes northwest 15817.99
## 859 25 female 32.2 1 no southeast 18218.16
## 860 57 male 28.1 0 no southwest 10965.45
## 861 37 female 47.6 2 yes southwest 46113.51
## 862 38 female 28.0 3 no southwest 7151.09
## 863 55 female 33.5 2 no northwest 12269.69
## 864 36 female 19.9 0 no northeast 5458.05
## 865 51 male 25.4 0 no southwest 8782.47
## 866 40 male 29.9 2 no southwest 6600.36
## 867 18 male 37.3 0 no southeast 1141.45
## 868 57 male 43.7 1 no southwest 11576.13
## 869 61 male 23.7 0 no northeast 13129.60
## 870 25 female 24.3 3 no southwest 4391.65
## 871 50 male 36.2 0 no southwest 8457.82
## 872 26 female 29.5 1 no southeast 3392.37
## 873 42 male 24.9 0 no southeast 5966.89
## 874 43 male 30.1 1 no southwest 6849.03
## 875 44 male 21.9 3 no northeast 8891.14
## 876 23 female 28.1 0 no northwest 2690.11
## 877 49 female 27.1 1 no southwest 26140.36
## 878 33 male 33.4 5 no southeast 6653.79
## 879 41 male 28.8 1 no southwest 6282.24
## 880 37 female 29.5 2 no southwest 6311.95
## 881 22 male 34.8 3 no southwest 3443.06
## 882 23 male 27.4 1 no northwest 2789.06
## 883 21 female 22.1 0 no northeast 2585.85
## 884 51 female 37.1 3 yes northeast 46255.11
## 885 25 male 26.7 4 no northwest 4877.98
## 886 32 male 28.9 1 yes southeast 19719.69
## 887 57 male 29.0 0 yes northeast 27218.44
## 888 36 female 30.0 0 no northwest 5272.18
## 889 22 male 39.5 0 no southwest 1682.60
## 890 57 male 33.6 1 no northwest 11945.13
## 891 64 female 26.9 0 yes northwest 29330.98
## 892 36 female 29.0 4 no southeast 7243.81
## 893 54 male 24.0 0 no northeast 10422.92
## 894 47 male 38.9 2 yes southeast 44202.65
## 895 62 male 32.1 0 no northeast 13555.00
## 896 61 female 44.0 0 no southwest 13063.88
## 897 43 female 20.0 2 yes northeast 19798.05
## 898 19 male 25.6 1 no northwest 2221.56
## 899 18 female 40.3 0 no southeast 1634.57
## 900 19 female 22.5 0 no northwest 2117.34
## 901 49 male 22.5 0 no northeast 8688.86
## 902 60 male 40.9 0 yes southeast 48673.56
## 903 26 male 27.3 3 no northeast 4661.29
## 904 49 male 36.9 0 no southeast 8125.78
## 905 60 female 35.1 0 no southwest 12644.59
## 906 26 female 29.4 2 no northeast 4564.19
## 907 27 male 32.6 3 no northeast 4846.92
## 908 44 female 32.3 1 no southeast 7633.72
## 909 63 male 39.8 3 no southwest 15170.07
## 910 32 female 24.6 0 yes southwest 17496.31
## 911 22 male 28.3 1 no northwest 2639.04
## 912 18 male 31.7 0 yes northeast 33732.69
## 913 59 female 26.7 3 no northwest 14382.71
## 914 44 female 27.5 1 no southwest 7626.99
## 915 33 male 24.6 2 no northwest 5257.51
## 916 24 female 34.0 0 no southeast 2473.33
## 917 43 female 26.9 0 yes northwest 21774.32
## 918 45 male 22.9 0 yes northeast 35069.37
## 919 61 female 28.2 0 no southwest 13041.92
## 920 35 female 34.2 1 no southeast 5245.23
## 921 62 female 25.0 0 no southwest 13451.12
## 922 62 female 33.2 0 no southwest 13462.52
## 923 38 male 31.0 1 no southwest 5488.26
## 924 34 male 35.8 0 no northwest 4320.41
## 925 43 male 23.2 0 no southwest 6250.44
## 926 50 male 32.1 2 no northeast 25333.33
## 927 19 female 23.4 2 no southwest 2913.57
## 928 57 female 20.1 1 no southwest 12032.33
## 929 62 female 39.2 0 no southeast 13470.80
## 930 41 male 34.2 1 no southeast 6289.75
## 931 26 male 46.5 1 no southeast 2927.06
## 932 39 female 32.5 1 no southwest 6238.30
## 933 46 male 25.8 5 no southwest 10096.97
## 934 45 female 35.3 0 no southwest 7348.14
## 935 32 male 37.2 2 no southeast 4673.39
## 936 59 female 27.5 0 no southwest 12233.83
## 937 44 male 29.7 2 no northeast 32108.66
## 938 39 female 24.2 5 no northwest 8965.80
## 939 18 male 26.2 2 no southeast 2304.00
## 940 53 male 29.5 0 no southeast 9487.64
## 941 18 male 23.2 0 no southeast 1121.87
## 942 50 female 46.1 1 no southeast 9549.57
## 943 18 female 40.2 0 no northeast 2217.47
## 944 19 male 22.6 0 no northwest 1628.47
## 945 62 male 39.9 0 no southeast 12982.87
## 946 56 female 35.8 1 no southwest 11674.13
## 947 42 male 35.8 2 no southwest 7160.09
## 948 37 male 34.2 1 yes northeast 39047.29
## 949 42 male 31.3 0 no northwest 6358.78
## 950 25 male 29.7 3 yes southwest 19933.46
## 951 57 male 18.3 0 no northeast 11534.87
## 952 51 male 42.9 2 yes southeast 47462.89
## 953 30 female 28.4 1 no northwest 4527.18
## 954 44 male 30.2 2 yes southwest 38998.55
## 955 34 male 27.8 1 yes northwest 20009.63
## 956 31 male 39.5 1 no southeast 3875.73
## 957 54 male 30.8 1 yes southeast 41999.52
## 958 24 male 26.8 1 no northwest 12609.89
## 959 43 male 35.0 1 yes northeast 41034.22
## 960 48 male 36.7 1 no northwest 28468.92
## 961 19 female 39.6 1 no northwest 2730.11
## 962 29 female 25.9 0 no southwest 3353.28
## 963 63 female 35.2 1 no southeast 14474.68
## 964 46 male 24.8 3 no northeast 9500.57
## 965 52 male 36.8 2 no northwest 26467.10
## 966 35 male 27.1 1 no southwest 4746.34
## 967 51 male 24.8 2 yes northwest 23967.38
## 968 44 male 25.4 1 no northwest 7518.03
## 969 21 male 25.7 2 no northeast 3279.87
## 970 39 female 34.3 5 no southeast 8596.83
## 971 50 female 28.2 3 no southeast 10702.64
## 972 34 female 23.6 0 no northeast 4992.38
## 973 22 female 20.2 0 no northwest 2527.82
## 974 19 female 40.5 0 no southwest 1759.34
## 975 26 male 35.4 0 no southeast 2322.62
## 976 29 male 22.9 0 yes northeast 16138.76
## 977 48 male 40.2 0 no southeast 7804.16
## 978 26 male 29.2 1 no southeast 2902.91
## 979 45 female 40.0 3 no northeast 9704.67
## 980 36 female 29.9 0 no southeast 4889.04
## 981 54 male 25.5 1 no northeast 25517.11
## 982 34 male 21.4 0 no northeast 4500.34
## 983 31 male 25.9 3 yes southwest 19199.94
## 984 27 female 30.6 1 no northeast 16796.41
## 985 20 male 30.1 5 no northeast 4915.06
## 986 44 female 25.8 1 no southwest 7624.63
## 987 43 male 30.1 3 no northwest 8410.05
## 988 45 female 27.6 1 no northwest 28340.19
## 989 34 male 34.7 0 no northeast 4518.83
## 990 24 female 20.5 0 yes northeast 14571.89
## 991 26 female 19.8 1 no southwest 3378.91
## 992 38 female 27.8 2 no northeast 7144.86
## 993 50 female 31.6 2 no southwest 10118.42
## 994 38 male 28.3 1 no southeast 5484.47
## 995 27 female 20.0 3 yes northwest 16420.49
## 996 39 female 23.3 3 no northeast 7986.48
## 997 39 female 34.1 3 no southwest 7418.52
## 998 63 female 36.9 0 no southeast 13887.97
## 999 33 female 36.3 3 no northeast 6551.75
## 1000 36 female 26.9 0 no northwest 5267.82
## 1001 30 male 23.0 2 yes northwest 17361.77
## 1002 24 male 32.7 0 yes southwest 34472.84
## 1003 24 male 25.8 0 no southwest 1972.95
## 1004 48 male 29.6 0 no southwest 21232.18
## 1005 47 male 19.2 1 no northeast 8627.54
## 1006 29 male 31.7 2 no northwest 4433.39
## 1007 28 male 29.3 2 no northeast 4438.26
## 1008 47 male 28.2 3 yes northwest 24915.22
## 1009 25 male 25.0 2 no northeast 23241.47
## 1010 51 male 27.7 1 no northeast 9957.72
## 1011 48 female 22.8 0 no southwest 8269.04
## 1012 43 male 20.1 2 yes southeast 18767.74
## 1013 61 female 33.3 4 no southeast 36580.28
## 1014 48 male 32.3 1 no northwest 8765.25
## 1015 38 female 27.6 0 no southwest 5383.54
## 1016 59 male 25.5 0 no northwest 12124.99
## 1017 19 female 24.6 1 no northwest 2709.24
## 1018 26 female 34.2 2 no southwest 3987.93
## 1019 54 female 35.8 3 no northwest 12495.29
## 1020 21 female 32.7 2 no northwest 26018.95
## 1021 51 male 37.0 0 no southwest 8798.59
## 1022 22 female 31.0 3 yes southeast 35595.59
## 1023 47 male 36.1 1 yes southeast 42211.14
## 1024 18 male 23.3 1 no southeast 1711.03
## 1025 47 female 45.3 1 no southeast 8569.86
## 1026 21 female 34.6 0 no southwest 2020.18
## 1027 19 male 26.0 1 yes northwest 16450.89
## 1028 23 male 18.7 0 no northwest 21595.38
## 1029 54 male 31.6 0 no southwest 9850.43
## 1030 37 female 17.3 2 no northeast 6877.98
## 1031 46 female 23.7 1 yes northwest 21677.28
## 1032 55 female 35.2 0 yes southeast 44423.80
## 1033 30 female 27.9 0 no northeast 4137.52
## 1034 18 male 21.6 0 yes northeast 13747.87
## 1035 61 male 38.4 0 no northwest 12950.07
## 1036 54 female 23.0 3 no southwest 12094.48
## 1037 22 male 37.1 2 yes southeast 37484.45
## 1038 45 female 30.5 1 yes northwest 39725.52
## 1039 22 male 28.9 0 no northeast 2250.84
## 1040 19 male 27.3 2 no northwest 22493.66
## 1041 35 female 28.0 0 yes northwest 20234.85
## 1042 18 male 23.1 0 no northeast 1704.70
## 1043 20 male 30.7 0 yes northeast 33475.82
## 1044 28 female 25.8 0 no southwest 3161.45
## 1045 55 male 35.2 1 no northeast 11394.07
## 1046 43 female 24.7 2 yes northwest 21880.82
## 1047 43 female 25.1 0 no northeast 7325.05
## 1048 22 male 52.6 1 yes southeast 44501.40
## 1049 25 female 22.5 1 no northwest 3594.17
## 1050 49 male 30.9 0 yes southwest 39727.61
## 1051 44 female 37.0 1 no northwest 8023.14
## 1052 64 male 26.4 0 no northeast 14394.56
## 1053 49 male 29.8 1 no northeast 9288.03
## 1054 47 male 29.8 3 yes southwest 25309.49
## 1055 27 female 21.5 0 no northwest 3353.47
## 1056 55 male 27.6 0 no northwest 10594.50
## 1057 48 female 28.9 0 no southwest 8277.52
## 1058 45 female 31.8 0 no southeast 17929.30
## 1059 24 female 39.5 0 no southeast 2480.98
## 1060 32 male 33.8 1 no northwest 4462.72
## 1061 24 male 32.0 0 no southeast 1981.58
## 1062 57 male 27.9 1 no southeast 11554.22
## 1063 59 male 41.1 1 yes southeast 48970.25
## 1064 36 male 28.6 3 no northwest 6548.20
## 1065 29 female 25.6 4 no southwest 5708.87
## 1066 42 female 25.3 1 no southwest 7045.50
## 1067 48 male 37.3 2 no southeast 8978.19
## 1068 39 male 42.7 0 no northeast 5757.41
## 1069 63 male 21.7 1 no northwest 14349.85
## 1070 54 female 31.9 1 no southeast 10928.85
## 1071 37 male 37.1 1 yes southeast 39871.70
## 1072 63 male 31.4 0 no northeast 13974.46
## 1073 21 male 31.3 0 no northwest 1909.53
## 1074 54 female 28.9 2 no northeast 12096.65
## 1075 60 female 18.3 0 no northeast 13204.29
## 1076 32 female 29.6 1 no southeast 4562.84
## 1077 47 female 32.0 1 no southwest 8551.35
## 1078 21 male 26.0 0 no northeast 2102.26
## 1079 28 male 31.7 0 yes southeast 34672.15
## 1080 63 male 33.7 3 no southeast 15161.53
## 1081 18 male 21.8 2 no southeast 11884.05
## 1082 32 male 27.8 1 no northwest 4454.40
## 1083 38 male 20.0 1 no northwest 5855.90
## 1084 32 male 31.5 1 no southwest 4076.50
## 1085 62 female 30.5 2 no northwest 15019.76
## 1086 39 female 18.3 5 yes southwest 19023.26
## 1087 55 male 29.0 0 no northeast 10796.35
## 1088 57 male 31.5 0 no northwest 11353.23
## 1089 52 male 47.7 1 no southeast 9748.91
## 1090 56 male 22.1 0 no southwest 10577.09
## 1091 47 male 36.2 0 yes southeast 41676.08
## 1092 55 female 29.8 0 no northeast 11286.54
## 1093 23 male 32.7 3 no southwest 3591.48
## 1094 22 female 30.4 0 yes northwest 33907.55
## 1095 50 female 33.7 4 no southwest 11299.34
## 1096 18 female 31.4 4 no northeast 4561.19
## 1097 51 female 35.0 2 yes northeast 44641.20
## 1098 22 male 33.8 0 no southeast 1674.63
## 1099 52 female 30.9 0 no northeast 23045.57
## 1100 25 female 34.0 1 no southeast 3227.12
## 1101 33 female 19.1 2 yes northeast 16776.30
## 1102 53 male 28.6 3 no southwest 11253.42
## 1103 29 male 38.9 1 no southeast 3471.41
## 1104 58 male 36.1 0 no southeast 11363.28
## 1105 37 male 29.8 0 no southwest 20420.60
## 1106 54 female 31.2 0 no southeast 10338.93
## 1107 49 female 29.9 0 no northwest 8988.16
## 1108 50 female 26.2 2 no northwest 10493.95
## 1109 26 male 30.0 1 no southwest 2904.09
## 1110 45 male 20.4 3 no southeast 8605.36
## 1111 54 female 32.3 1 no northeast 11512.41
## 1112 38 male 38.4 3 yes southeast 41949.24
## 1113 48 female 25.9 3 yes southeast 24180.93
## 1114 28 female 26.3 3 no northwest 5312.17
## 1115 23 male 24.5 0 no northeast 2396.10
## 1116 55 male 32.7 1 no southeast 10807.49
## 1117 41 male 29.6 5 no northeast 9222.40
## 1118 25 male 33.3 2 yes southeast 36124.57
## 1119 33 male 35.8 1 yes southeast 38282.75
## 1120 30 female 20.0 3 no northwest 5693.43
## 1121 23 female 31.4 0 yes southwest 34166.27
## 1122 46 male 38.2 2 no southeast 8347.16
## 1123 53 female 36.9 3 yes northwest 46661.44
## 1124 27 female 32.4 1 no northeast 18903.49
## 1125 23 female 42.8 1 yes northeast 40904.20
## 1126 63 female 25.1 0 no northwest 14254.61
## 1127 55 male 29.9 0 no southwest 10214.64
## 1128 35 female 35.9 2 no southeast 5836.52
## 1129 34 male 32.8 1 no southwest 14358.36
## 1130 19 female 18.6 0 no southwest 1728.90
## 1131 39 female 23.9 5 no southeast 8582.30
## 1132 27 male 45.9 2 no southwest 3693.43
## 1133 57 male 40.3 0 no northeast 20709.02
## 1134 52 female 18.3 0 no northwest 9991.04
## 1135 28 male 33.8 0 no northwest 19673.34
## 1136 50 female 28.1 3 no northwest 11085.59
## 1137 44 female 25.0 1 no southwest 7623.52
## 1138 26 female 22.2 0 no northwest 3176.29
## 1139 33 male 30.3 0 no southeast 3704.35
## 1140 19 female 32.5 0 yes northwest 36898.73
## 1141 50 male 37.1 1 no southeast 9048.03
## 1142 41 female 32.6 3 no southwest 7954.52
## 1143 52 female 24.9 0 no southeast 27117.99
## 1144 39 male 32.3 2 no southeast 6338.08
## 1145 50 male 32.3 2 no southwest 9630.40
## 1146 52 male 32.8 3 no northwest 11289.11
## 1147 60 male 32.8 0 yes southwest 52590.83
## 1148 20 female 31.9 0 no northwest 2261.57
## 1149 55 male 21.5 1 no southwest 10791.96
## 1150 42 male 34.1 0 no southwest 5979.73
## 1151 18 female 30.3 0 no northeast 2203.74
## 1152 58 female 36.5 0 no northwest 12235.84
## 1153 43 female 32.6 3 yes southeast 40941.29
## 1154 35 female 35.8 1 no northwest 5630.46
## 1155 48 female 27.9 4 no northwest 11015.17
## 1156 36 female 22.1 3 no northeast 7228.22
## 1157 19 male 44.9 0 yes southeast 39722.75
## 1158 23 female 23.2 2 no northwest 14426.07
## 1159 20 female 30.6 0 no northeast 2459.72
## 1160 32 female 41.1 0 no southwest 3989.84
## 1161 43 female 34.6 1 no northwest 7727.25
## 1162 34 male 42.1 2 no southeast 5124.19
## 1163 30 male 38.8 1 no southeast 18963.17
## 1164 18 female 28.2 0 no northeast 2200.83
## 1165 41 female 28.3 1 no northwest 7153.55
## 1166 35 female 26.1 0 no northeast 5227.99
## 1167 57 male 40.4 0 no southeast 10982.50
## 1168 29 female 24.6 2 no southwest 4529.48
## 1169 32 male 35.2 2 no southwest 4670.64
## 1170 37 female 34.1 1 no northwest 6112.35
## 1171 18 male 27.4 1 yes northeast 17178.68
## 1172 43 female 26.7 2 yes southwest 22478.60
## 1173 56 female 41.9 0 no southeast 11093.62
## 1174 38 male 29.3 2 no northwest 6457.84
## 1175 29 male 32.1 2 no northwest 4433.92
## 1176 22 female 27.1 0 no southwest 2154.36
## 1177 52 female 24.1 1 yes northwest 23887.66
## 1178 40 female 27.4 1 no southwest 6496.89
## 1179 23 female 34.9 0 no northeast 2899.49
## 1180 31 male 29.8 0 yes southeast 19350.37
## 1181 42 female 41.3 1 no northeast 7650.77
## 1182 24 female 29.9 0 no northwest 2850.68
## 1183 25 female 30.3 0 no southwest 2632.99
## 1184 48 female 27.4 1 no northeast 9447.38
## 1185 23 female 28.5 1 yes southeast 18328.24
## 1186 45 male 23.6 2 no northeast 8603.82
## 1187 20 male 35.6 3 yes northwest 37465.34
## 1188 62 female 32.7 0 no northwest 13844.80
## 1189 43 female 25.3 1 yes northeast 21771.34
## 1190 23 female 28.0 0 no southwest 13126.68
## 1191 31 female 32.8 2 no northwest 5327.40
## 1192 41 female 21.8 1 no northeast 13725.47
## 1193 58 female 32.4 1 no northeast 13019.16
## 1194 48 female 36.6 0 no northwest 8671.19
## 1195 31 female 21.8 0 no northwest 4134.08
## 1196 19 female 27.9 3 no northwest 18838.70
## 1197 19 female 30.0 0 yes northwest 33307.55
## 1198 41 male 33.6 0 no southeast 5699.84
## 1199 40 male 29.4 1 no northwest 6393.60
## 1200 31 female 25.8 2 no southwest 4934.71
## 1201 37 male 24.3 2 no northwest 6198.75
## 1202 46 male 40.4 2 no northwest 8733.23
## 1203 22 male 32.1 0 no northwest 2055.32
## 1204 51 male 32.3 1 no northeast 9964.06
## 1205 18 female 27.3 3 yes southeast 18223.45
## 1206 35 male 17.9 1 no northwest 5116.50
## 1207 59 female 34.8 2 no southwest 36910.61
## 1208 36 male 33.4 2 yes southwest 38415.47
## 1209 37 female 25.6 1 yes northeast 20296.86
## 1210 59 male 37.1 1 no southwest 12347.17
## 1211 36 male 30.9 1 no northwest 5373.36
## 1212 39 male 34.1 2 no southeast 23563.02
## 1213 18 male 21.5 0 no northeast 1702.46
## 1214 52 female 33.3 2 no southwest 10806.84
## 1215 27 female 31.3 1 no northwest 3956.07
## 1216 18 male 39.1 0 no northeast 12890.06
## 1217 40 male 25.1 0 no southeast 5415.66
## 1218 29 male 37.3 2 no southeast 4058.12
## 1219 46 female 34.6 1 yes southwest 41661.60
## 1220 38 female 30.2 3 no northwest 7537.16
## 1221 30 female 21.9 1 no northeast 4718.20
## 1222 40 male 25.0 2 no southeast 6593.51
## 1223 50 male 25.3 0 no southeast 8442.67
## 1224 20 female 24.4 0 yes southeast 26125.67
## 1225 41 male 23.9 1 no northeast 6858.48
## 1226 33 female 39.8 1 no southeast 4795.66
## 1227 38 male 16.8 2 no northeast 6640.54
## 1228 42 male 37.2 2 no southeast 7162.01
## 1229 56 male 34.4 0 no southeast 10594.23
## 1230 58 male 30.3 0 no northeast 11938.26
## 1231 52 male 34.5 3 yes northwest 60021.40
## 1232 20 female 21.8 0 yes southwest 20167.34
## 1233 54 female 24.6 3 no northwest 12479.71
## 1234 58 male 23.3 0 no southwest 11345.52
## 1235 45 female 27.8 2 no southeast 8515.76
## 1236 26 male 31.1 0 no northwest 2699.57
## 1237 63 female 21.7 0 no northeast 14449.85
## 1238 58 female 28.2 0 no northwest 12224.35
## 1239 37 male 22.7 3 no northeast 6985.51
## 1240 25 female 42.1 1 no southeast 3238.44
## 1241 52 male 41.8 2 yes southeast 47269.85
## 1242 64 male 37.0 2 yes southeast 49577.66
## 1243 22 female 21.3 3 no northwest 4296.27
## 1244 28 female 33.1 0 no southeast 3171.61
## 1245 18 male 33.3 0 no southeast 1135.94
## 1246 28 male 24.3 5 no southwest 5615.37
## 1247 45 female 25.7 3 no southwest 9101.80
## 1248 33 male 29.4 4 no southwest 6059.17
## 1249 18 female 39.8 0 no southeast 1633.96
## 1250 32 male 33.6 1 yes northeast 37607.53
## 1251 24 male 29.8 0 yes northeast 18648.42
## 1252 19 male 19.8 0 no southwest 1241.57
## 1253 20 male 27.3 0 yes southwest 16232.85
## 1254 40 female 29.3 4 no southwest 15828.82
## 1255 34 female 27.7 0 no southeast 4415.16
## 1256 42 female 37.9 0 no southwest 6474.01
## 1257 51 female 36.4 3 no northwest 11436.74
## 1258 54 female 27.6 1 no northwest 11305.93
## 1259 55 male 37.7 3 no northwest 30063.58
## 1260 52 female 23.2 0 no northeast 10197.77
## 1261 32 female 20.5 0 no northeast 4544.23
## 1262 28 male 37.1 1 no southwest 3277.16
## 1263 41 female 28.1 1 no southeast 6770.19
## 1264 43 female 29.9 1 no southwest 7337.75
## 1265 49 female 33.3 2 no northeast 10370.91
## 1266 64 male 23.8 0 yes southeast 26926.51
## 1267 55 female 30.5 0 no southwest 10704.47
## 1268 24 male 31.1 0 yes northeast 34254.05
## 1269 20 female 33.3 0 no southwest 1880.49
## 1270 45 male 27.5 3 no southwest 8615.30
## 1271 26 male 33.9 1 no northwest 3292.53
## 1272 25 female 34.5 0 no northwest 3021.81
## 1273 43 male 25.5 5 no southeast 14478.33
## 1274 35 male 27.6 1 no southeast 4747.05
## 1275 26 male 27.1 0 yes southeast 17043.34
## 1276 57 male 23.7 0 no southwest 10959.33
## 1277 22 female 30.4 0 no northeast 2741.95
## 1278 32 female 29.7 0 no northwest 4357.04
## 1279 39 male 29.9 1 yes northeast 22462.04
## 1280 25 female 26.8 2 no northwest 4189.11
## 1281 48 female 33.3 0 no southeast 8283.68
## 1282 47 female 27.6 2 yes northwest 24535.70
## 1283 18 female 21.7 0 yes northeast 14283.46
## 1284 18 male 30.0 1 no southeast 1720.35
## 1285 61 male 36.3 1 yes southwest 47403.88
## 1286 47 female 24.3 0 no northeast 8534.67
## 1287 28 female 17.3 0 no northeast 3732.63
## 1288 36 female 25.9 1 no southwest 5472.45
## 1289 20 male 39.4 2 yes southwest 38344.57
## 1290 44 male 34.3 1 no southeast 7147.47
## 1291 38 female 20.0 2 no northeast 7133.90
## 1292 19 male 34.9 0 yes southwest 34828.65
## 1293 21 male 23.2 0 no southeast 1515.34
## 1294 46 male 25.7 3 no northwest 9301.89
## 1295 58 male 25.2 0 no northeast 11931.13
## 1296 20 male 22.0 1 no southwest 1964.78
## 1297 18 male 26.1 0 no northeast 1708.93
## 1298 28 female 26.5 2 no southeast 4340.44
## 1299 33 male 27.5 2 no northwest 5261.47
## 1300 19 female 25.7 1 no northwest 2710.83
## 1301 45 male 30.4 0 yes southeast 62592.87
## 1302 62 male 30.9 3 yes northwest 46718.16
## 1303 25 female 20.8 1 no southwest 3208.79
## 1304 43 male 27.8 0 yes southwest 37829.72
## 1305 42 male 24.6 2 yes northeast 21259.38
## 1306 24 female 27.7 0 no southeast 2464.62
## 1307 29 female 21.9 0 yes northeast 16115.30
## 1308 32 male 28.1 4 yes northwest 21472.48
## 1309 25 female 30.2 0 yes southwest 33900.65
## 1310 41 male 32.2 2 no southwest 6875.96
## 1311 42 male 26.3 1 no northwest 6940.91
## 1312 33 female 26.7 0 no northwest 4571.41
## 1313 34 male 42.9 1 no southwest 4536.26
## 1314 19 female 34.7 2 yes southwest 36397.58
## 1315 30 female 23.7 3 yes northwest 18765.88
## 1316 18 male 28.3 1 no northeast 11272.33
## 1317 19 female 20.6 0 no southwest 1731.68
## 1318 18 male 53.1 0 no southeast 1163.46
## 1319 35 male 39.7 4 no northeast 19496.72
## 1320 39 female 26.3 2 no northwest 7201.70
## 1321 31 male 31.1 3 no northwest 5425.02
## 1322 62 male 26.7 0 yes northeast 28101.33
## 1323 62 male 38.8 0 no southeast 12981.35
## 1324 42 female 40.4 2 yes southeast 43896.38
## 1325 31 male 25.9 1 no northwest 4239.89
## 1326 61 male 33.5 0 no northeast 13143.34
## 1327 42 female 32.9 0 no northeast 7050.02
## 1328 51 male 30.0 1 no southeast 9377.90
## 1329 23 female 24.2 2 no northeast 22395.74
## 1330 52 male 38.6 2 no southwest 10325.21
## 1331 57 female 25.7 2 no southeast 12629.17
## 1332 23 female 33.4 0 no southwest 10795.94
## 1333 52 female 44.7 3 no southwest 11411.69
## 1334 50 male 31.0 3 no northwest 10600.55
## 1335 18 female 31.9 0 no northeast 2205.98
## 1336 18 female 36.9 0 no southeast 1629.83
## 1337 21 female 25.8 0 no southwest 2007.95
## 1338 61 female 29.1 0 yes northwest 29141.36
According to the data:
* Men have a slightly higher BMI than women.
* Women have a slightly higher average number of children than men.
* Smoking is more prevalent among men than among women.
* Men have slightly higher average expenses than women.
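Group comparisons like these can be reproduced with `aggregate()` and `tapply()`. The sketch below is illustrative only: it rebuilds rows 1325 to 1330 of the listing above into a small data frame called `sample_df` (a hypothetical name), so its output does not represent the full dataset.

```r
# Six rows copied from the listing above (rows 1325-1330).
# `sample_df` is a hypothetical name used only for this illustration.
sample_df <- data.frame(
  sex      = c("male", "male", "female", "male", "female", "male"),
  bmi      = c(25.9, 33.5, 32.9, 30.0, 24.2, 38.6),
  children = c(1, 0, 0, 1, 2, 2),
  smoker   = c("no", "no", "no", "no", "no", "no"),
  expenses = c(4239.89, 13143.34, 7050.02, 9377.90, 22395.74, 10325.21)
)

# Mean of each numeric variable by sex
group_means <- aggregate(cbind(bmi, children, expenses) ~ sex,
                         data = sample_df, FUN = mean)

# Share of smokers by sex
smoker_share <- tapply(sample_df$smoker == "yes", sample_df$sex, mean)

print(group_means)
print(smoker_share)
```

Replacing `FUN = mean` with `FUN = sd` gives the per-group spread in the same layout.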
df$age <- as.numeric(df$age)
df$children <- as.numeric(df$children)
df$region <- as.numeric(factor(df$region))
# Check whether the categorical columns contain missing values
factor_with_na <- lapply(df[, c("sex", "smoker")], function(x) any(is.na(x)))
# Convert the categorical variables to numeric codes (levels are coded in alphabetical order)
df$sex <- as.numeric(factor(df$sex, ordered = TRUE))
df$smoker <- as.numeric(factor(df$smoker, ordered = TRUE))
# Descriptive statistics
summary(df)
## age sex bmi children
## Min. :18.00 Min. :1.000 Min. :16.00 Min. :0.000
## 1st Qu.:27.00 1st Qu.:1.000 1st Qu.:26.30 1st Qu.:0.000
## Median :39.00 Median :2.000 Median :30.40 Median :1.000
## Mean :39.22 Mean :1.505 Mean :30.67 Mean :1.096
## 3rd Qu.:51.00 3rd Qu.:2.000 3rd Qu.:34.70 3rd Qu.:2.000
## Max. :64.00 Max. :2.000 Max. :53.10 Max. :5.000
## smoker region expenses
## Min. :1.000 Min. :1.000 Min. : 1122
## 1st Qu.:1.000 1st Qu.:2.000 1st Qu.: 4746
## Median :1.000 Median :3.000 Median : 9386
## Mean :1.205 Mean :2.516 Mean :13279
## 3rd Qu.:1.000 3rd Qu.:3.000 3rd Qu.:16658
## Max. :2.000 Max. :4.000 Max. :63770
According to the data, there are no notable differences in variability
between men and women:
* BMI: the spread is similar for men and women; the overall standard
deviation is about 6.1 kg/m².
* Number of children: the spread is similar for men and women; the
overall standard deviation is about 1.2 children.
* Smoking: the proportion of smokers is slightly higher among men than
among women, with a standard deviation of 0.42 for men and 0.39 for
women.
* Expenses: the spread is similar for men and women; the overall
standard deviation is about $12,110.
# Standard deviation of each column
dispersion_sd <- sapply(df, sd)
print(dispersion_sd)
## age sex bmi children smoker region
## 1.404433e+01 5.001634e-01 6.100664e+00 1.205571e+00 4.038062e-01 1.105208e+00
## expenses
## 1.211036e+04
# Range of each column
dispersion_range <- sapply(df, range)
print(dispersion_range)
## age sex bmi children smoker region expenses
## [1,] 18 1 16.0 0 1 1 1121.87
## [2,] 64 2 53.1 5 2 4 63770.43
# Interquartile range of each column
dispersion_iqr <- sapply(df, IQR)
print(dispersion_iqr)
## age sex bmi children smoker region expenses
## 24.00 1.00 8.40 2.00 0.00 1.00 11911.38
# Ratio of standard deviation to IQR, used here as a rough indicator of outliers
outlier_ratio <- dispersion_sd / dispersion_iqr
# Column with the highest ratio
max_outlier_column <- names(outlier_ratio)[which.max(outlier_ratio)]
print(outlier_ratio)
## age sex bmi children smoker region expenses
## 0.5851805 0.5001634 0.7262695 0.6027857 Inf 1.1052082 1.0167050
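# Note: smoker has an IQR of 0, so its ratio is Inf and it is selected regardless of its actual outliers
print(paste("The variable with the most outliers is:", max_outlier_column))
## [1] "The variable with the most outliers is: smoker"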
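Because `smoker` has an interquartile range of 0, its sd/IQR ratio is `Inf`, so `which.max()` picks it no matter how many extreme values it has. A more direct check, sketched here under the assumption that Tukey's 1.5 × IQR rule is an acceptable definition of an outlier, counts the values outside the whiskers in each column. The helper `count_outliers` and the data frame `toy` are hypothetical.

```r
# Count values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR] (Tukey's rule)
count_outliers <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
  fence <- 1.5 * (q[2] - q[1])
  sum(x < q[1] - fence | x > q[2] + fence, na.rm = TRUE)
}

# Toy data (hypothetical): one clear outlier in `expenses`, none in `age`
toy <- data.frame(
  age      = c(18, 25, 31, 40, 47, 52, 58, 64),
  expenses = c(1200, 2100, 3300, 4100, 5200, 6400, 7100, 60000)
)

outlier_counts <- sapply(toy, count_outliers)
print(outlier_counts)
```

The rule is not meaningful for binary columns such as `smoker` either, so those are better left out of an outlier comparison.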
The main patterns and trends observed are:
* Differences by sex: men have a slightly higher BMI and higher expenses than women, while women tend to have more children.
* Age and BMI: BMI increases with age.
* Age and expenses: expenses increase with age.
# Introductory overview of df
introduce(df)
## rows columns discrete_columns continuous_columns all_missing_columns
## 1 1337 7 0 7 0
## total_missing_values complete_rows total_observations memory_usage
## 1 0 1337 9359 77184
plot_intro(df)
# Boxplots of each variable, grouped by expenses
plot_boxplot(df, by="expenses")
# Number of missing values (NAs) in each column
plot_missing(df)
# Histogram of each numeric variable in df
plot_histogram(df)
# Bar chart of each categorical variable
plot_bar(df)
# Scatterplot matrix
plot(df)
Normality visualization
* Expenses: right-skewed distribution (positive skew; the mean of
13,279 exceeds the median of 9,386), with most values between 0 and 50,000.
* Men have higher expenses than women.
* Expenses increase with age.
To bring the variables closer to a normal distribution, they should be log-transformed with log(), which reduces skewness and tends to improve model fit.
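The effect of a log transformation on skewness can be checked numerically. The sketch below uses simulated log-normal values, not the insurance data, and a hypothetical helper `skewness` that computes the third standardized moment.

```r
# Sample skewness: third standardized moment
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

set.seed(123)
# Simulated right-skewed values (log-normal); NOT the insurance data
sim_expenses <- rlnorm(1000, meanlog = 9, sdlog = 0.9)

skew_raw <- skewness(sim_expenses)       # clearly positive
skew_log <- skewness(log(sim_expenses))  # close to zero

print(c(raw = skew_raw, log = skew_log))
```

Applying the same helper to `df$expenses` and `log(df$expenses)` would show how much of the asymmetry the transformation removes in this dataset.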
par(mfrow = c(1, 2)) # Split the plotting area into 1 row and 2 columns
for(i in 1:ncol(df)) {
qqnorm(df[, i], main = names(df)[i])
qqline(df[, i], col = 2)
}
plot_normality(df, expenses, bmi, age)
Correlation between variables
There is a positive association between expenses and smoker.
dev.new(width = 10, height = 8)
correlate(df) %>% plot()
plot_correlation(df)
The expected effect of each explanatory variable on the main variable
of interest, “expenses”, is as follows:
* Sex: being female could have a negative effect on expenses, that is,
women are expected to have lower expenses than men.
* BMI: a higher BMI could have a positive effect on expenses, that is,
people with a higher BMI are expected to spend more on health, food,
etc.
* Children: more children could have a positive effect on expenses,
that is, families with more children are expected to spend more on
education, food, etc.
* Smoker: smoking could have a positive effect on expenses, that is,
smokers are expected to spend more on tobacco, treatment of
smoking-related illnesses, etc.
In addition, “smoker” is expected to be the explanatory variable with the largest effect on y: a value of 2 (the person smokes) implies higher “expenses”. The variable “region”, by contrast, is expected to have the smallest effect on y.
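The numeric codes come from `as.numeric(factor(x))`, which numbers the levels in alphabetical order. The minimal sketch below, using made-up vectors rather than the real columns, shows the resulting mapping for `smoker` and `sex`.

```r
# as.numeric(factor(x)) assigns integer codes in alphabetical level order
smoker_raw <- c("no", "yes", "no", "yes")
sex_raw    <- c("female", "male", "male", "female")

smoker_code <- as.numeric(factor(smoker_raw))  # "no" -> 1, "yes" -> 2
sex_code    <- as.numeric(factor(sex_raw))     # "female" -> 1, "male" -> 2

# Mapping between labels and codes
print(data.frame(smoker_raw, smoker_code, sex_raw, sex_code))
```

Coding `region` the same way treats its four levels as equally spaced numbers; keeping it as a factor would let `lm()` estimate a separate effect for each region.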
par(mfrow = c(3, 2)) # Split the plotting area into a grid of 3 rows and 2 columns
# Age
boxplot(expenses ~ age, data = df, main = "Expenses vs. age", xlab = "age", ylab = "Expenses")
# Sex
boxplot(expenses ~ sex, data = df, main = "Expenses vs. sex", xlab = "sex", ylab = "Expenses")
# BMI
boxplot(expenses ~ bmi, data = df, main = "Expenses vs. bmi", xlab = "bmi", ylab = "Expenses")
# Children
boxplot(expenses ~ children, data = df, main = "Expenses vs. children", xlab = "children", ylab = "Expenses")
# Smoker
boxplot(expenses ~ smoker, data = df, main = "Expenses vs. smoker", xlab = "smoker", ylab = "Expenses")
# Region
boxplot(expenses ~ region, data = df, main = "Expenses vs. region", xlab = "region", ylab = "Expenses")
set.seed(123) # Fix the random seed so the train/test partition is identical on every run of the script.
partition <- createDataPartition(y = df$expenses, p=0.7, list=F)
train = df[partition, ]
test = df[-partition, ]
lm_model <- lm(log(expenses) ~ log(age) + sex + children + log(bmi) + smoker + region, data = train)
summary(lm_model)
##
## Call:
## lm(formula = log(expenses) ~ log(age) + sex + children + log(bmi) +
## smoker + region, data = train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.80943 -0.20736 -0.06954 0.07419 2.24993
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.70251 0.25988 6.551 9.44e-11 ***
## log(age) 1.26435 0.03665 34.502 < 2e-16 ***
## sex -0.07753 0.02829 -2.741 0.00624 **
## children 0.08130 0.01150 7.068 3.08e-12 ***
## log(bmi) 0.33420 0.07080 4.720 2.72e-06 ***
## smoker 1.51880 0.03467 43.814 < 2e-16 ***
## region -0.04042 0.01279 -3.160 0.00163 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4292 on 930 degrees of freedom
## Multiple R-squared: 0.7837, Adjusted R-squared: 0.7823
## F-statistic: 561.7 on 6 and 930 DF, p-value: < 2.2e-16
Partial effect of smoker on the dependent variable:
plot(effect("smoker",lm_model))
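Because the response is `log(expenses)`, the coefficients are read multiplicatively. The arithmetic below uses the `smoker` and `log(age)` estimates from the model summary above; the variable names are hypothetical, and the result describes the back-transformed prediction with the other predictors held fixed.

```r
# Estimates copied from the summary(lm_model) output above
beta_smoker  <- 1.51880   # smoker is coded 1 = no, 2 = yes
beta_log_age <- 1.26435   # elasticity: both sides are in logs

# Moving from non-smoker to smoker multiplies predicted expenses by exp(beta)
smoker_multiplier <- exp(beta_smoker)

# A 10% increase in age multiplies predicted expenses by 1.10^beta
age_multiplier <- 1.10^beta_log_age

print(round(c(smoker = smoker_multiplier, age_10pct = age_multiplier), 3))
```

Under this model a smoker's predicted expenses are roughly 4.6 times those of an otherwise identical non-smoker, and a 10% increase in age raises them by about 13%.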
Predicted versus observed values of the dependent variable:
ggplot(train, aes(x = exp(lm_model$fitted.values), y = train$expenses)) +
geom_point() +
stat_smooth() +
labs(x='Predicted Values', y='Actual Values', title='OLS Predicted vs. Actual Values')
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
ols_model <- lm(expenses ~ age + sex + bmi + children + smoker + region, data = train)
summary(ols_model)
##
## Call:
## lm(formula = expenses ~ age + sex + bmi + children + smoker +
## region, data = train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11331.3 -2643.9 -922.3 1343.2 29891.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -34753.85 1375.65 -25.264 < 2e-16 ***
## age 258.98 14.19 18.254 < 2e-16 ***
## sex -149.20 395.47 -0.377 0.70606
## bmi 318.71 33.34 9.560 < 2e-16 ***
## children 482.92 160.42 3.010 0.00268 **
## smoker 23655.65 484.64 48.811 < 2e-16 ***
## region -300.30 178.81 -1.679 0.09339 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6000 on 930 degrees of freedom
## Multiple R-squared: 0.7587, Adjusted R-squared: 0.7572
## F-statistic: 487.4 on 6 and 930 DF, p-value: < 2.2e-16
AIC(ols_model) # AIC = 18971.02
## [1] 18971.02
prediction_ols_model <- predict(ols_model,test)
RMSE_ols_model <- rmse(prediction_ols_model,test$expenses)
RMSE_ols_model
## [1] 6206.795
log_ols_model <- lm(log(expenses) ~ log(age) + sex + children + log(bmi) + smoker + region, data = train)
summary(log_ols_model)
##
## Call:
## lm(formula = log(expenses) ~ log(age) + sex + children + log(bmi) +
## smoker + region, data = train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.80943 -0.20736 -0.06954 0.07419 2.24993
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.70251 0.25988 6.551 9.44e-11 ***
## log(age) 1.26435 0.03665 34.502 < 2e-16 ***
## sex -0.07753 0.02829 -2.741 0.00624 **
## children 0.08130 0.01150 7.068 3.08e-12 ***
## log(bmi) 0.33420 0.07080 4.720 2.72e-06 ***
## smoker 1.51880 0.03467 43.814 < 2e-16 ***
## region -0.04042 0.01279 -3.160 0.00163 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4292 on 930 degrees of freedom
## Multiple R-squared: 0.7837, Adjusted R-squared: 0.7823
## F-statistic: 561.7 on 6 and 930 DF, p-value: < 2.2e-16
AIC(log_ols_model) # AIC = 1082.83
## [1] 1082.827
# Another way to compute the RMSE
#log_errors <- train$expenses - exp(log_ols_model$fitted.values) # Errors between actual and back-transformed fitted expenses (training set)
#RMSE_log_ols_model <- sqrt(mean(log_errors^2))
#RMSE_log_ols_model
prediction_log_model <- exp(predict(log_ols_model, newdata = test)) # Predict on the original scale
RMSE_log_model <- sqrt(mean((test$expenses - prediction_log_model)^2)) # Compute the RMSE
RMSE_log_model
## [1] 8158.644
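Exponentiating predictions from a log-response model targets the conditional median rather than the mean, which can inflate the RMSE on the original scale. One common correction, which is not part of this analysis, is Duan's smearing factor. The sketch below applies it to simulated data; the data-generating process and all names are assumptions.

```r
set.seed(123)

# Simulated data (hypothetical): log-linear relationship with noise
n   <- 500
age <- runif(n, 18, 64)
sim <- data.frame(age = age,
                  y   = exp(6 + 1.2 * log(age) + rnorm(n, sd = 0.6)))

train_idx <- sample(n, size = round(0.7 * n))
sim_train <- sim[train_idx, ]
sim_test  <- sim[-train_idx, ]

fit <- lm(log(y) ~ log(age), data = sim_train)

# Naive back-transformation
pred_naive <- exp(predict(fit, newdata = sim_test))

# Duan's smearing factor: mean of the exponentiated training residuals
smear      <- mean(exp(residuals(fit)))
pred_smear <- pred_naive * smear

rmse_fn <- function(pred, obs) sqrt(mean((obs - pred)^2))
print(c(smearing_factor = smear,
        rmse_naive = rmse_fn(pred_naive, sim_test$y),
        rmse_smear = rmse_fn(pred_smear, sim_test$y)))
```

The same two lines (`smear` and the multiplication) could be applied to `log_ols_model` to see whether the correction narrows the gap with `ols_model`.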
SAR and SEM models cannot be estimated here because they require spatial data, such as the location of each observation, to estimate spatial effects. The dataset contains no coordinates, and its region names are too coarse to build the neighbour structure that nb2listw needs. Without that information, the spatial weights matrices that both models depend on cannot be computed.
Nevertheless, the code that could be used is shown below.
SAR
#library(spdep)
#map_centroid <- coordinates(df)
#col.gal.nb <- read.gal(system.file("df", package="spdep"))
#map.linkW <- nb2listw(col.gal.nb, style="W") #########NEIGHT
#sar_model <- lagsarlm(log(expenses) ~ log(age) + sex + log(children +0.01) +
# log(bmi) + smoker + region, data = train, method="Matrix", listw = map.linkW)
#summary(sar_model)
#RMSE_SAR <- sqrt(mean((train$expenses - exp(sar_model$fitted.values))^2)) # RMSE on the training data, original scale
#RMSE_SAR
SEM
#sem_model <- errorsarlm(log(expenses) ~ log(age) + sex + children + log(bmi) + smoker + region, data = train, map.linkW, method="Matrix")
#summary(sem_model)
#RMSE_SEM <- sqrt(mean((train$expenses - exp(sem_model$fitted.values))^2)) # RMSE on the training data, original scale
#RMSE_SEM
fit.svm = svm(formula = log(expenses) ~ log(age) + sex + children + log(bmi) + smoker + region, data = train, type = 'eps-regression', kernel = 'radial')
summary(fit.svm)
##
## Call:
## svm(formula = log(expenses) ~ log(age) + sex + children + log(bmi) +
## smoker + region, data = train, type = "eps-regression", kernel = "radial")
##
##
## Parameters:
## SVM-Type: eps-regression
## SVM-Kernel: radial
## cost: 1
## gamma: 0.1666667
## epsilon: 0.1
##
##
## Number of Support Vectors: 296
plot(fit.svm$fitted, fit.svm$residuals, main="SVM Residual vs. Fitted Values", xlab="Fitted Values", ylab="Residuals")
abline(0,0)
predictions <- exp(predict(fit.svm,newdata = test))
# Root mean squared error (RMSE)
RMSE_svm <- rmse(predictions,test$expenses)
RMSE_svm
## [1] 5549.75
dv_svm<-data.frame(exp(fit.svm$fitted),train$expenses)
ggplot(dv_svm, aes(x = exp.fit.svm.fitted., y = train.expenses)) +
geom_point() +
stat_smooth() +
labs(x='Predicted Values', y='Actual Values', title='SVM Predicted vs. Actual Values')
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
decision_tree_regression <- rpart(log(expenses) ~ log(age) + sex + children + log(bmi) + smoker + region, data = train)
# summary(decision_tree_regression)
plot(decision_tree_regression, compress = TRUE)
text(decision_tree_regression, use.n = TRUE)
#install.packages("rpart.plot") # Install the rpart.plot package
library(rpart.plot)
rpart.plot(decision_tree_regression)
dt_prediction_test_data <-exp(predict(decision_tree_regression,newdata = test))
RMSE_Tree <- rmse(dt_prediction_test_data, test$expenses)
RMSE_Tree
## [1] 5375.011
library(dplyr)
transform_df <- df
transform_df$age <- log(transform_df$age)
transform_df$bmi <- log(transform_df$bmi)
transform_df$children <- log(transform_df$children +0.01)
transform_df$expenses <- log(transform_df$expenses)
summary(transform_df)
## age sex bmi children
## Min. :2.890 Min. :1.000 Min. :2.773 Min. :-4.60517
## 1st Qu.:3.296 1st Qu.:1.000 1st Qu.:3.270 1st Qu.:-4.60517
## Median :3.664 Median :2.000 Median :3.414 Median : 0.00995
## Mean :3.598 Mean :1.505 Mean :3.403 Mean :-1.66885
## 3rd Qu.:3.932 3rd Qu.:2.000 3rd Qu.:3.547 3rd Qu.: 0.69813
## Max. :4.159 Max. :2.000 Max. :3.972 Max. : 1.61144
## smoker region expenses
## Min. :1.000 Min. :1.000 Min. : 7.023
## 1st Qu.:1.000 1st Qu.:2.000 1st Qu.: 8.465
## Median :1.000 Median :3.000 Median : 9.147
## Mean :1.205 Mean :2.516 Mean : 9.100
## 3rd Qu.:1.000 3rd Qu.:3.000 3rd Qu.: 9.721
## Max. :2.000 Max. :4.000 Max. :11.063
set.seed(123)
partition_alt <- createDataPartition(y = transform_df$expenses, p=0.7, list=F)
train_alt = transform_df[partition_alt, ]
test_alt <- transform_df[-partition_alt, ]
random_forest<-randomForest(expenses ~ age + sex + children + bmi + smoker + region, data=train_alt, proximity=TRUE)
print(random_forest)
##
## Call:
## randomForest(formula = expenses ~ age + sex + children + bmi + smoker + region, data = train_alt, proximity = TRUE)
## Type of random forest: regression
## Number of trees: 500
## No. of variables tried at each split: 2
##
## Mean of squared residuals: 0.1414657
## % Var explained: 83.26
rf_prediction_test_data <-exp(predict(random_forest,newdata = test_alt))
RMSE_RF <- rmse(rf_prediction_test_data, test$expenses)
RMSE_RF
## [1] 5417.839
Variable importance evaluation
The variable “smoker” has the greatest importance for predicting y
(“expenses”), followed by “age”.
# The forest was fitted without importance = TRUE, so importance() reports only IncNodePurity: the total decrease in node impurity (residual sum of squares) from splits on each variable, averaged over all trees. The larger the value, the more important the variable.
varImpPlot(random_forest, n.var = 5, main = "Top 5 variables")
importance(random_forest)
## IncNodePurity
## age 256.902644
## sex 8.449236
## children 31.091885
## bmi 63.095680
## smoker 334.298033
## region 15.067092
# Define the explanatory variables (X) and the dependent variable (Y) for the training set
train_x = data.matrix(train_alt[, -7])
train_y = train_alt[,7]
# Define the explanatory variables (X) and the dependent variable (Y) for the test set
test_x = data.matrix(test_alt[, -7])
test_y = test_alt[, 7]
# Build the final train and test sets
xgb_train = xgb.DMatrix(data = train_x, label = train_y)
xgb_test = xgb.DMatrix(data = test_x, label = test_y)
# Fit the XGBoost regression model and show the train and test RMSE at each round.
watchlist = list(train=xgb_train, test=xgb_test)
model_xgb = xgb.train(data=xgb_train, max.depth=3, watchlist=watchlist, nrounds=70)
## [1] train-rmse:6.072449 test-rmse:6.085466
## [2] train-rmse:4.269285 test-rmse:4.280048
## [3] train-rmse:3.008771 test-rmse:3.018650
## [4] train-rmse:2.128019 test-rmse:2.143468
## [5] train-rmse:1.515914 test-rmse:1.538863
## [6] train-rmse:1.094982 test-rmse:1.125079
## [7] train-rmse:0.808904 test-rmse:0.847757
## [8] train-rmse:0.619527 test-rmse:0.667509
## [9] train-rmse:0.499389 test-rmse:0.554829
## [10] train-rmse:0.427170 test-rmse:0.487947
## [11] train-rmse:0.384923 test-rmse:0.449014
## [12] train-rmse:0.361299 test-rmse:0.425338
## [13] train-rmse:0.347701 test-rmse:0.415393
## [14] train-rmse:0.339893 test-rmse:0.409008
## [15] train-rmse:0.334655 test-rmse:0.406328
## [16] train-rmse:0.331662 test-rmse:0.404075
## [17] train-rmse:0.329707 test-rmse:0.402880
## [18] train-rmse:0.328199 test-rmse:0.403266
## [19] train-rmse:0.325943 test-rmse:0.401362
## [20] train-rmse:0.323954 test-rmse:0.399620
## [21] train-rmse:0.321749 test-rmse:0.399034
## [22] train-rmse:0.319239 test-rmse:0.400724
## [23] train-rmse:0.318239 test-rmse:0.401305
## [24] train-rmse:0.316830 test-rmse:0.401674
## [25] train-rmse:0.314678 test-rmse:0.401438
## [26] train-rmse:0.313099 test-rmse:0.401975
## [27] train-rmse:0.311669 test-rmse:0.402005
## [28] train-rmse:0.311260 test-rmse:0.401829
## [29] train-rmse:0.309970 test-rmse:0.400966
## [30] train-rmse:0.308875 test-rmse:0.402224
## [31] train-rmse:0.307366 test-rmse:0.402095
## [32] train-rmse:0.306115 test-rmse:0.401730
## [33] train-rmse:0.302633 test-rmse:0.404479
## [34] train-rmse:0.301667 test-rmse:0.405409
## [35] train-rmse:0.300698 test-rmse:0.405319
## [36] train-rmse:0.299389 test-rmse:0.404883
## [37] train-rmse:0.297227 test-rmse:0.406926
## [38] train-rmse:0.295584 test-rmse:0.407227
## [39] train-rmse:0.294832 test-rmse:0.407158
## [40] train-rmse:0.294244 test-rmse:0.406956
## [41] train-rmse:0.293035 test-rmse:0.406692
## [42] train-rmse:0.290426 test-rmse:0.407427
## [43] train-rmse:0.289592 test-rmse:0.407390
## [44] train-rmse:0.288926 test-rmse:0.408196
## [45] train-rmse:0.288667 test-rmse:0.407954
## [46] train-rmse:0.287841 test-rmse:0.408023
## [47] train-rmse:0.286714 test-rmse:0.407258
## [48] train-rmse:0.285400 test-rmse:0.407675
## [49] train-rmse:0.284463 test-rmse:0.408266
## [50] train-rmse:0.283906 test-rmse:0.408685
## [51] train-rmse:0.283566 test-rmse:0.408982
## [52] train-rmse:0.282759 test-rmse:0.408243
## [53] train-rmse:0.281684 test-rmse:0.408680
## [54] train-rmse:0.281330 test-rmse:0.408672
## [55] train-rmse:0.280540 test-rmse:0.408453
## [56] train-rmse:0.279895 test-rmse:0.409364
## [57] train-rmse:0.278899 test-rmse:0.408177
## [58] train-rmse:0.277285 test-rmse:0.409075
## [59] train-rmse:0.275960 test-rmse:0.409806
## [60] train-rmse:0.275713 test-rmse:0.409791
## [61] train-rmse:0.275451 test-rmse:0.409951
## [62] train-rmse:0.274734 test-rmse:0.409582
## [63] train-rmse:0.274608 test-rmse:0.409787
## [64] train-rmse:0.274399 test-rmse:0.409752
## [65] train-rmse:0.273609 test-rmse:0.409676
## [66] train-rmse:0.270243 test-rmse:0.410433
## [67] train-rmse:0.269124 test-rmse:0.410239
## [68] train-rmse:0.268054 test-rmse:0.411109
## [69] train-rmse:0.266979 test-rmse:0.411710
## [70] train-rmse:0.266224 test-rmse:0.411668
reg_xgb = xgboost(data = xgb_train, max.depth = 3, nrounds = 59, verbose = 0)
prediction_xgb_test<-exp(predict(reg_xgb, newdata = xgb_test))
RMSE_XGB <- rmse(prediction_xgb_test, test$expenses)
RMSE_XGB
## [1] 5295.267
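The number of boosting rounds can be read off the evaluation log rather than chosen by eye. The sketch below copies the test RMSE for rounds 15 to 30 from the log above into a hypothetical vector `test_rmse` and locates the minimum. Choosing `nrounds` on the test set leaks information into model selection, so a separate validation split or cross-validation would be a safer basis.

```r
# Test RMSE (log scale) for rounds 15-30, copied from the log above
test_rmse <- c(
  `15` = 0.406328, `16` = 0.404075, `17` = 0.402880, `18` = 0.403266,
  `19` = 0.401362, `20` = 0.399620, `21` = 0.399034, `22` = 0.400724,
  `23` = 0.401305, `24` = 0.401674, `25` = 0.401438, `26` = 0.401975,
  `27` = 0.402005, `28` = 0.401829, `29` = 0.400966, `30` = 0.402224
)

# Round with the lowest test RMSE in this excerpt
best_round <- as.integer(names(test_rmse)[which.min(test_rmse)])
print(best_round)
```

In the log, the test RMSE bottoms out at round 21 and drifts upward afterwards while the training RMSE keeps falling, which is a typical sign of overfitting in the later rounds.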
xgb_reg_residuals<-test$expenses - prediction_xgb_test
plot(xgb_reg_residuals, xlab= "Dependent Variable", ylab = "Residuals", main = 'XGBoost Regression Residuals')
abline(0,0)
# Plot the first 3 trees of the model
xgb.plot.tree(model=reg_xgb, trees=0:2)
importance_matrix <- xgb.importance(model = reg_xgb)
xgb.plot.importance(importance_matrix, xlab = "Explanatory Variables X's Importance")
#install.packages("neuralnet")
library(neuralnet)
## Warning: package 'neuralnet' was built under R version 4.3.3
##
## Attaching package: 'neuralnet'
## The following object is masked from 'package:dplyr':
##
## compute
nn_model <- neuralnet(expenses ~ age + sex + bmi + children + smoker + region, data = train, hidden = c(5, 3), linear.output = TRUE)
Neural network plot
plot(nn_model)
prediction_net<-predict(nn_model, test)
RMSE_net <- rmse(prediction_net, test$expenses)
RMSE_net
## [1] 11954.46
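The network above is trained on unscaled inputs and an unscaled response, which often hampers convergence and may contribute to its high RMSE. A common remedy, sketched here as an assumption and not as part of the original analysis, is min-max scaling with ranges taken from the training set, followed by mapping predictions back to the original units. The helper names and the toy values are hypothetical.

```r
# Min-max scaling with parameters taken from the training data only
fit_scaler <- function(x) list(min = min(x), max = max(x))
scale01    <- function(x, s) (x - s$min) / (s$max - s$min)
unscale01  <- function(z, s) z * (s$max - s$min) + s$min

# Toy values (hypothetical) standing in for train/test expenses
train_expenses <- c(1121.87, 4746, 9386, 16658, 63770.43)
test_expenses  <- c(2007.95, 29141.36)

s <- fit_scaler(train_expenses)
train_scaled <- scale01(train_expenses, s)
test_scaled  <- scale01(test_expenses, s)

# Predictions made on the scaled response are mapped back like this
test_restored <- unscale01(test_scaled, s)
print(range(train_scaled))
print(test_restored)
```

Each predictor would be scaled with its own training-set range before calling `neuralnet()`, and the RMSE computed after `unscale01()` so that it stays comparable with the other models.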
Estimating a spatial regression requires building a spatial weights matrix that connects the geographic units. Here, df contains no spatial or geographic variables, so a spatial regression model, and with it Moran's test for spatial autocorrelation, does not apply. The following code could be used if the data frame did contain such information:
#library(spdep)
#vecinos <- poly2nb(df)
# Next, create a spatial weights object from the neighbour list
#matriz_pesos <- nb2listw(vecinos)
# Finally, run Moran's test for spatial autocorrelation in 'expenses'
#moran.test(df$expenses, listw = matriz_pesos)
There is no serial autocorrelation to assess because the data frame is not a time series and has no time variable. If the data were ordered in time, an ACF analysis such as the following would apply:
acf_result <- acf(df$expenses, lag.max = 30, main = "ACF of expenses", ylim = c(-1, 1))
plot(acf_result, main="Serial Autocorrelation")
The VIF values are close to 1, which indicates that there is no multicollinearity in the model.
VIF(lm_model)
## log(age) sex children log(bmi) smoker region
## 1.030732 1.017558 1.010255 1.053302 1.015063 1.034176
VIF(ols_model)
## age sex bmi children smoker region
## 1.024782 1.017573 1.052172 1.005218 1.014980 1.033881
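The VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The sketch below computes it by hand on simulated predictors; the data and the names `X` and `vif_manual` are hypothetical.

```r
set.seed(123)

# Simulated predictors (hypothetical): x2 is correlated with x1, x3 is not
n  <- 300
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing x_j on the other predictors
vif_manual <- sapply(names(X), function(v) {
  aux_formula <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(aux_formula, data = X))$r.squared
  1 / (1 - r2)
})
print(round(vif_manual, 3))
```

Values near 1, like those reported above, mean each predictor is almost unexplained by the others; the correlated pair in this simulation produces visibly larger values.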
The Breusch-Pagan test takes homoscedasticity as its null hypothesis, and a p-value below 0.05 is usually considered statistically significant. Here the p-values are extremely small, so we reject the null hypothesis and conclude that there is significant evidence of heteroscedasticity in the models.
bptest(log_ols_model)
##
## studentized Breusch-Pagan test
##
## data: log_ols_model
## BP = 57.75, df = 6, p-value = 1.288e-10
bptest(lm_model)
##
## studentized Breusch-Pagan test
##
## data: lm_model
## BP = 57.75, df = 6, p-value = 1.288e-10
bptest(ols_model)
##
## studentized Breusch-Pagan test
##
## data: ols_model
## BP = 109.21, df = 6, p-value < 2.2e-16
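The studentized Breusch-Pagan statistic can be reproduced by hand: regress the squared residuals on the predictors and multiply the R² of that auxiliary regression by the sample size. The sketch below does this on simulated data whose error spread grows with `x`; the data and all names are assumptions.

```r
set.seed(123)

# Simulated data (hypothetical) whose error spread grows with x
n <- 500
x <- runif(n, 1, 10)
y <- 2 + 3 * x + rnorm(n, sd = x)

fit <- lm(y ~ x)

# Auxiliary regression of the squared residuals on the predictors
e2  <- residuals(fit)^2
aux <- lm(e2 ~ x)

# Studentized (Koenker) statistic: n * R^2, chi-squared with k degrees of freedom
bp_stat <- n * summary(aux)$r.squared
bp_df   <- length(coef(aux)) - 1
p_value <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

print(c(BP = bp_stat, df = bp_df, p_value = p_value))
```

A small p-value here has the same reading as in the `bptest()` output above: the residual variance depends on the predictors.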
plot(fitted(ols_model), resid(ols_model), main="Linear Regression Residual vs. Fitted Values", xlab="Fitted Values", ylab="Residuals")
abline(0,0)
plot(fitted(lm_model), resid(lm_model), main="Linear Regression Residual vs. Fitted Values", xlab="Fitted Values", ylab="Residuals")
abline(0,0)
plot(fitted(ols_model), resid(ols_model), main="Residual vs. Fitted Values", xlab="Fitted Values", ylab="Residuals")
abline(0,0)
plot(fitted(lm_model), resid(lm_model), main="Residual vs. Fitted Values", xlab="Fitted Values", ylab="Residuals")
abline(0,0)
A linear regression model is expected to produce normally distributed errors, so most errors should cluster around zero.
hist(lm_model$residuals, xlab="Estimated Regression Residuals", main='Distribution of LM Estimated Regression Residuals', col='lightblue', border="white")
hist(ols_model$residuals, xlab="Estimated Regression Residuals", main='Distribution of OLS Estimated Regression Residuals', col='lightblue', border="white")
# Predict on the test data to evaluate the performance of the regression model.
prediction_ols_model1 <- ols_model %>% predict(test)
RMSE_ols1 <- RMSE(prediction_ols_model1, test$expenses)
RMSE_ols1
## [1] 6206.795
wt1 <- 1 / lm(abs(ols_model$residuals) ~ ols_model$fitted.values)$fitted.values^2
wls_model1 <-lm(log(expenses) ~ log(age) + sex + children + log(bmi) + smoker + region, data = train, weights=wt1)
summary(wls_model1)
##
## Call:
## lm(formula = log(expenses) ~ log(age) + sex + children + log(bmi) +
## smoker + region, data = train, weights = wt1)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -3.539e-04 -7.018e-05 -3.420e-05 1.288e-05 1.731e-03
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.94875 0.26379 7.388 3.32e-13 ***
## log(age) 1.44113 0.04299 33.520 < 2e-16 ***
## sex -0.16420 0.03138 -5.233 2.07e-07 ***
## children 0.14524 0.01418 10.240 < 2e-16 ***
## log(bmi) 0.09818 0.07238 1.356 0.175
## smoker 1.58671 0.09989 15.885 < 2e-16 ***
## region -0.07874 0.01396 -5.640 2.26e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.000173 on 930 degrees of freedom
## Multiple R-squared: 0.6856, Adjusted R-squared: 0.6836
## F-statistic: 338 on 6 and 930 DF, p-value: < 2.2e-16
The Breusch-Pagan test no longer detects heteroscedasticity in the model.
bptest(wls_model1)
##
## studentized Breusch-Pagan test
##
## data: wls_model1
## BP = 2.1435e-05, df = 6, p-value = 1
plot(fitted(wls_model1), resid(wls_model1), main="Residual vs. Fitted Values", xlab="Fitted Values", ylab="Residuals")
abline(0,0)
# Predict on the test data to evaluate the performance of the WLS regression model.
prediction_wls_model1 <- wls_model1 %>% predict(test)
RMSE(prediction_wls_model1, test$expenses) # Caution: predictions are on the log scale while test$expenses is in original units, so this RMSE is not comparable with the OLS RMSE
## [1] 17845.99
wls_errors <- train$expenses - exp(wls_model1$fitted.values)
# RMSE of the WLS model on the training set, back-transformed to the original scale
RMSE_wls_model1 <- sqrt(mean(wls_errors^2))
RMSE_wls_model1
## [1] 9820.253
wt2 <- 1 / lm(abs(log_ols_model$residuals) ~ log_ols_model$fitted.values)$fitted.values^2
wls_model2 <-lm(log(expenses) ~ log(age) + sex + children + log(bmi) + smoker + region, data = train, weights=wt2)
prediction_wls_model2 <- wls_model2 %>% predict(test)
RMSE(prediction_wls_model2, test$expenses) # Caution: log-scale predictions are compared with original-scale expenses
## [1] 17846.01
wls_errors2 <- train$expenses - exp(wls_model2$fitted.values)
# RMSE of the WLS model on the training set, back-transformed to the original scale
RMSE_wls_model2 <- sqrt(mean(wls_errors2^2))
RMSE_wls_model2
## [1] 7036.752
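The first RMSE reported for each WLS model compares log-scale predictions with expenses in original units, and the second is computed on the training set, so neither is directly comparable with the test-set RMSE of the other models. A comparable figure would back-transform the test predictions before scoring. The sketch below shows that pattern on simulated data; the data and all names are assumptions.

```r
set.seed(123)

# Simulated data (hypothetical) with a log-linear signal and uneven noise
n   <- 400
age <- runif(n, 18, 64)
sim <- data.frame(
  age      = age,
  expenses = exp(5 + 1.1 * log(age) + rnorm(n, sd = 0.2 + age / 100))
)
idx       <- sample(n, size = round(0.7 * n))
sim_train <- sim[idx, ]
sim_test  <- sim[-idx, ]

# Step 1: unweighted log-log fit
ols_fit <- lm(log(expenses) ~ log(age), data = sim_train)

# Step 2: model |residual| as a function of the fitted values; weights = 1 / fitted^2
abs_res <- abs(residuals(ols_fit))
fit_val <- fitted(ols_fit)
w       <- 1 / fitted(lm(abs_res ~ fit_val))^2

# Step 3: weighted fit, then score on the test set in original units
wls_fit   <- lm(log(expenses) ~ log(age), data = sim_train, weights = w)
pred_test <- exp(predict(wls_fit, newdata = sim_test))
rmse_test <- sqrt(mean((sim_test$expenses - pred_test)^2))
print(rmse_test)
```

Applied to `wls_model1` and `wls_model2`, the last three lines would give test-set values that can sit in the same table as the other models.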
Comparing the RMSE values in the sorted table yields the following findings:
1. The mean RMSE across the nine models is 7201.641.
2. The difference between the largest and smallest RMSE is 6659.191.
3. The sample standard deviation of the RMSE values is about 2352.1.
4. Three of the nine models (33.33%) have an RMSE above the mean:
log_ols_model, wls_model1, and model_nnet.
5. Despite heteroscedasticity, the decision tree model has a lower RMSE
than most of the others.
6. Heteroscedasticity does not appear to significantly affect the
performance of certain models, although it is observed in all models other
than WLS, meaning that the variance of their errors is not constant.
7. The WLS models have a high RMSE compared with the first five models in
the ranking.
Therefore, the selected model is model_xgb. It has the lowest RMSE of all
the models, which indicates better predictive performance on the test data.
It also shows no heteroscedasticity or multicollinearity, so its prediction
errors have constant variance, which improves the reliability of the model.
## Dataframe
resultados <- data.frame(
"log_ols_model" = c(RMSE_log_model),
"rf_model" = c(RMSE_RF),
"decision_tree_model" = c(RMSE_Tree),
"ols_model" = c(RMSE_ols_model),
"svm_model" = c(RMSE_svm),
"model_xgb" = c(RMSE_XGB),
"model_nnet" = c(RMSE_net),
"wls_model1" = c(RMSE_wls_model1),
"wls_model2" = c(RMSE_wls_model2)
)
RMSE_values <- unlist(resultados)
RMSE_sorted <- sort(RMSE_values)
# Get the model names ordered by their RMSE values
model_names <- names(resultados)
model_names_sorted <- model_names[order(RMSE_values)]
# Create a new data frame with the sorted RMSE values (the model names become the row names)
RMSE_df <- data.frame(RMSE = RMSE_sorted)
RMSE_df
## RMSE
## model_xgb 5295.267
## decision_tree_model 5375.011
## rf_model 5417.839
## svm_model 5549.750
## ols_model 6206.795
## wls_model2 7036.752
## log_ols_model 8158.644
## wls_model1 9820.253
## model_nnet 11954.458
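The summary figures quoted in the findings can be checked against this table. The sketch below types in the nine values by hand (the vector name `rmse_tbl` is hypothetical) and recomputes the mean, the range, the sample standard deviation, and the share of models above the mean.

```r
# RMSE values copied by hand from the sorted table above
rmse_tbl <- c(
  model_xgb = 5295.267, decision_tree_model = 5375.011, rf_model = 5417.839,
  svm_model = 5549.750, ols_model = 6206.795, wls_model2 = 7036.752,
  log_ols_model = 8158.644, wls_model1 = 9820.253, model_nnet = 11954.458
)

rmse_summary <- c(
  mean           = mean(rmse_tbl),
  range          = diff(range(rmse_tbl)),                 # largest minus smallest
  sd             = sd(rmse_tbl),                          # sample standard deviation (n - 1)
  pct_above_mean = 100 * mean(rmse_tbl > mean(rmse_tbl))  # share of models above the mean
)
round(rmse_summary, 3)

# Names of the models whose RMSE exceeds the mean
names(rmse_tbl)[rmse_tbl > mean(rmse_tbl)]
```

Working from the named vector rather than retyped summary numbers keeps the statistics in step with the table if any model is refit.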
ggplot(RMSE_df, aes(x = reorder(rownames(RMSE_df), -RMSE), y = RMSE)) +
geom_bar(stat="identity", fill="#f68060", alpha=.6, width=.4) +
coord_flip() +
xlab("") +
theme_bw()
wls_model2 shows a good balance between bias and variance compared with the other models. However, model_xgb has the lowest RMSE of all the models, and because it works with log-transformed values, its predictions are expected to stay in balance with the observed values.
ggplot(train, aes(x = exp(wls_model2$fitted.values), y = expenses)) +
geom_point() +
stat_smooth() +
labs(x='Predicted Values', y='Actual Values', title='WLS2 Predicted vs. Actual Values')
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
ggplot(train, aes(x = exp(wls_model1$fitted.values), y = expenses)) +
geom_point() +
stat_smooth() +
labs(x='Predicted Values', y='Actual Values', title='WLS1 Predicted vs. Actual Values')
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
The exploratory analysis indicates that the dependent variable is the one with the greatest variability, in this case "expenses", so the model sought must explain the changes in it. It also shows that "smoker" tends to have the largest impact on "expenses". In addition, the following is expected:
* The coefficient for sex is positive if men have higher expenses than women, and negative if women have higher expenses.
* A higher BMI is associated with higher expenses.
* A larger number of children is associated with higher expenses.
* Smokers have higher expenses than non-smokers.
The variables that contribute most to explaining changes in the main variable of interest are age and smoker: people who smoke have a higher value of the dependent variable than people who do not. The data are then split by age, where each additional unit of age raises the dependent variable by a specific amount. The remaining explanatory variables affect the dependent variable as follows: women have a lower value of the dependent variable than men, which may be attributable to several factors; as BMI increases, the dependent variable increases by a specific amount; people with children have relatively lower "expenses" than people without children; and region is among the variables with the least impact, although people in the southeast have higher "expenses" on average.
The estimated results of the selected model, XGBoost, are broadly similar to those of the other estimated models: all agree that smoker is one of the variables with the greatest impact on the dependent variable. They differ, however, in how they rank the variables that tend to have little correlation with it. This model was chosen because it is more accurate than the rest. Because it works with log-transformed values, it is likely free of heteroscedasticity. wls_model2 could also be chosen, since it meets the same condition, but it has a higher RMSE.