Case 1: A priori analysis

This study analyzes the European housing market at the request of a real estate company that wants to understand what determines property prices. The company operates in several cities and has provided data that include housing characteristics (such as median area and median number of rooms) and economic variables (such as GDP per capita and the unemployment rate).

First, the economic theory of real estate holds that a dwelling's physical characteristics (size, number of rooms, location) directly determine its market value, because they raise the utility perceived by the consumer (Rosen, 1974; Lancaster, 1966). Median area (Area_Median) and the median number of rooms (Room_Median) are therefore expected to have a positive and significant relationship with property prices.

For the demographic variables, population density (Density) and total population (Population) may have ambiguous effects. According to the theory of urban housing demand (Alonso, 1964; Muth, 1969), higher density can raise prices because it reflects stronger demand pressure; excessive density, however, can bring congestion and negative externalities, which would lower the willingness to pay for housing in those areas. Total population is expected to have a positive effect, since more inhabitants mean greater potential demand for property.

For the macroeconomic variables, GDP per capita reflects the level of disposable income in each region and is expected to be positively related to housing prices, because higher incomes let households devote more resources to housing. The unemployment rate, in contrast, would tend to lower prices, since it limits purchasing power and weakens housing demand. Similarly, high mortgage interest rates make credit more expensive and restrict the ability to buy a home, which usually translates into less upward pressure on prices.
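The expectations above can be summarized as a single regression equation with a sign hypothesis for each coefficient. This is only a sketch: the linear functional form is an assumption (the text does not specify one), the names GDP_PC and URate are assumed to be the dataset columns for GDP per capita and the unemployment rate, and MIR is a placeholder for whichever mortgage interest rate series is eventually used.

```latex
\begin{aligned}
\text{Price}_i ={}& \beta_0
  + \beta_1\,\text{Area\_Median}_i
  + \beta_2\,\text{Room\_Median}_i
  + \beta_3\,\text{Density}_i
  + \beta_4\,\text{Population}_i \\
 &+ \beta_5\,\text{GDP\_PC}_i
  + \beta_6\,\text{URate}_i
  + \beta_7\,\text{MIR}_i
  + \varepsilon_i
\end{aligned}
\qquad
\begin{array}{l}
\beta_1 > 0,\ \beta_2 > 0,\ \beta_4 > 0,\ \beta_5 > 0,\\
\beta_6 < 0,\ \beta_7 < 0,\ \beta_3 \text{ ambiguous}
\end{array}
```

Here $i$ indexes cities and $\varepsilon_i$ is the error term; only the sign of $\beta_3$ is left open, matching the ambiguous effect of density discussed above.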

2. Exploratory analysis and descriptive statistics

We use our dataset, which is stored as an Excel file.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)   # plotting; GGally builds on it
library(GGally)
library(Hmisc)
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:dplyr':
## 
##     src, summarize
## The following objects are masked from 'package:base':
## 
##     format.pval, units
library(corrplot)
## corrplot 0.95 loaded
library(readxl)

# Load the Excel file
base_taller <- read_excel("C:/Users/carab/Downloads/Rosi_files/dp2015-13_Dataset.xls")

# Open the table in a new viewer tab
View(base_taller)
tail(base_taller)
## # A tibble: 6 × 184
##   City  City_Eng City_Short   NAds Price_Median Price_Mean Area_Median Area_Mean
##   <chr> <chr>    <chr>       <dbl>        <dbl>      <dbl>       <dbl>     <dbl>
## 1 Tall… Tallinn  TAL          8876        1062.      1059.        52.6      54.1
## 2 Tori… Turin    TUR          8525        2225       2321.        75        80.5
## 3 Vale… Valencia VAL         14537        1741.      1849.        93        96.5
## 4 Viln… Vilnius  VIL          2794        1251.      1289.        58        59.1
## 5 Wars… Warsaw   WAW        155154        1934.      1976.        54        56.7
## 6 Wien  Vienna   VIE         10370        3571.      3657.        87        93.0
## # ℹ 176 more variables: Room_Median <dbl>, Room_Mean <dbl>, Euro_area <dbl>,
## #   EU <dbl>, Population <dbl>, City_Area <dbl>, Density <dbl>, GDP_PC <dbl>,
## #   GDP_PC_PPS <dbl>, GDP_PC2008 <dbl>, GDP_PC2009 <dbl>, GDP_PC2010 <dbl>,
## #   Gini <dbl>, HOR <dbl>, Kearny_GCI2010 <dbl>, LRIR <dbl>,
## #   Inflation2010 <dbl>, Inflation2011 <dbl>, URate <dbl>, MIR2009 <dbl>,
## #   MIR2010 <dbl>, Mortgage_PC2010 <dbl>, Tppl1989_1993 <dbl>,
## #   Tppl1994_1998 <dbl>, Tppl1999_2002 <dbl>, Tppl2003_2006 <dbl>, …
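With 184 columns, a full summary() is hard to read, so it can help to describe only the variables named in the a priori analysis. The sketch below is not part of the original script: describe_vars is a hypothetical helper written in base R, and the toy data frame simply reuses the rounded values printed by tail(base_taller) so that the example runs on its own.

```r
# Hypothetical helper (an assumption, not from the original analysis):
# compact descriptive statistics for a chosen set of numeric columns.
describe_vars <- function(df, vars) {
  stopifnot(all(vars %in% names(df)))
  stats <- sapply(df[vars], function(x) {
    c(n       = sum(!is.na(x)),
      missing = sum(is.na(x)),
      mean    = mean(x, na.rm = TRUE),
      sd      = sd(x, na.rm = TRUE),
      median  = median(x, na.rm = TRUE),
      min     = min(x, na.rm = TRUE),
      max     = max(x, na.rm = TRUE))
  })
  # One row per variable, one column per statistic
  round(t(stats), 2)
}

# Toy data: the six cities shown by tail(base_taller), values rounded as printed
toy <- data.frame(
  City_Eng     = c("Tallinn", "Turin", "Valencia", "Vilnius", "Warsaw", "Vienna"),
  Price_Median = c(1062, 2225, 1741, 1251, 1934, 3571),
  Area_Median  = c(52.6, 75, 93, 58, 54, 87)
)

describe_vars(toy, c("Price_Median", "Area_Median"))

# On the full data, assuming these columns are the ones of interest
# (treating MIR2010 as the mortgage interest rate is an assumption):
# key_vars <- c("Price_Median", "Area_Median", "Room_Median", "Population",
#               "Density", "GDP_PC", "URate", "MIR2010")
# describe_vars(base_taller, key_vars)
```

Reporting the count of missing values next to each statistic matters here because several columns in this dataset contain NA's, and the helper avoids the summarize name that Hmisc masks from dplyr.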
summary(base_taller)
##      City             City_Eng          City_Short             NAds       
##  Length:50          Length:50          Length:50          Min.   :   576  
##  Class :character   Class :character   Class :character   1st Qu.:  4932  
##  Mode  :character   Mode  :character   Mode  :character   Median :  9694  
##                                                           Mean   : 17924  
##                                                           3rd Qu.: 15981  
##                                                           Max.   :155154  
##                                                                           
##   Price_Median      Price_Mean      Area_Median       Area_Mean     
##  Min.   : 503.1   Min.   : 534.7   Min.   : 47.00   Min.   : 51.12  
##  1st Qu.:1305.3   1st Qu.:1322.1   1st Qu.: 55.03   1st Qu.: 59.10  
##  Median :2107.2   Median :2200.7   Median : 65.50   Median : 68.07  
##  Mean   :2436.7   Mean   :2535.0   Mean   : 68.56   Mean   : 72.14  
##  3rd Qu.:3092.2   3rd Qu.:3176.5   3rd Qu.: 78.50   3rd Qu.: 82.63  
##  Max.   :8590.9   Max.   :8865.7   Max.   :100.00   Max.   :109.49  
##                                                                     
##   Room_Median     Room_Mean       Euro_area         EU        Population      
##  Min.   :2.00   Min.   :1.805   Min.   :0.0   Min.   :0.0   Min.   :  401389  
##  1st Qu.:2.00   1st Qu.:2.172   1st Qu.:0.0   1st Qu.:0.0   1st Qu.:  799132  
##  Median :2.00   Median :2.476   Median :0.5   Median :1.0   Median : 1148799  
##  Mean   :2.47   Mean   :2.515   Mean   :0.5   Mean   :0.7   Mean   : 1872288  
##  3rd Qu.:3.00   3rd Qu.:2.832   3rd Qu.:1.0   3rd Qu.:1.0   3rd Qu.: 1707910  
##  Max.   :3.00   Max.   :3.394   Max.   :1.0   Max.   :1.0   Max.   :12915158  
##                                                                               
##    City_Area         Density          GDP_PC        GDP_PC_PPS   
##  Min.   :  84.8   Min.   : 1315   Min.   : 2535   Min.   : 4696  
##  1st Qu.: 169.9   1st Qu.: 2571   1st Qu.:11862   1st Qu.:15047  
##  Median : 375.2   Median : 3297   Median :29550   Median :30400  
##  Mean   : 523.4   Mean   : 4552   Mean   :31026   Mean   :31335  
##  3rd Qu.: 500.0   3rd Qu.: 5551   3rd Qu.:42750   3rd Qu.:42750  
##  Max.   :5512.0   Max.   :20618   Max.   :86464   Max.   :76200  
##                                                                  
##    GDP_PC2008      GDP_PC2009      GDP_PC2010         Gini      
##  Min.   : 2535   Min.   : 1872   Min.   : 2301   Min.   :23.60  
##  1st Qu.:10537   1st Qu.: 9384   1st Qu.:10685   1st Qu.:28.35  
##  Median :25848   Median :24680   Median :25899   Median :31.85  
##  Mean   :31226   Mean   :28888   Mean   :30422   Mean   :32.89  
##  3rd Qu.:47699   3rd Qu.:43128   3rd Qu.:48444   3rd Qu.:36.40  
##  Max.   :88300   Max.   :84421   Max.   :85861   Max.   :52.10  
##                                                                 
##       HOR        Kearny_GCI2010        LRIR        Inflation2010   
##  Min.   :12.80   Min.   :0.0000   Min.   : 0.690   Min.   :-1.600  
##  1st Qu.:30.02   1st Qu.:0.0000   1st Qu.: 3.770   1st Qu.: 1.450  
##  Median :66.15   Median :0.0000   Median : 4.515   Median : 2.000  
##  Mean   :58.74   Mean   :0.9976   Mean   : 6.525   Mean   : 3.885  
##  3rd Qu.:83.25   3rd Qu.:2.3050   3rd Qu.:10.706   3rd Qu.: 8.400  
##  Max.   :97.42   Max.   :5.8600   Max.   :15.970   Max.   :10.500  
##                                                                    
##  Inflation2011        URate           MIR2009          MIR2010      
##  Min.   : 1.200   Min.   : 0.300   Min.   : 1.260   Min.   : 0.920  
##  1st Qu.: 2.500   1st Qu.: 5.802   1st Qu.: 3.277   1st Qu.: 2.993  
##  Median : 3.100   Median : 7.990   Median : 4.360   Median : 3.815  
##  Mean   : 4.816   Mean   : 9.446   Mean   : 7.457   Mean   : 7.026  
##  3rd Qu.: 5.275   3rd Qu.:11.908   3rd Qu.: 9.352   3rd Qu.: 8.615  
##  Max.   :53.200   Max.   :23.124   Max.   :26.000   Max.   :26.200  
##  NA's   :6                         NA's   :2        NA's   :2       
##  Mortgage_PC2010  Tppl1989_1993     Tppl1994_1998     Tppl1999_2002    
##  Min.   : 0.190   Min.   : 464773   Min.   : 421249   Min.   : 399685  
##  1st Qu.: 0.395   1st Qu.: 702444   1st Qu.: 704303   1st Qu.: 722104  
##  Median : 6.475   Median :1067365   Median : 964346   Median : 993134  
##  Mean   :10.195   Mean   :1450095   Mean   :1393183   Mean   :1570878  
##  3rd Qu.:14.090   3rd Qu.:1655272   3rd Qu.:1611954   3rd Qu.:1698320  
##  Max.   :45.160   Max.   :6829300   Max.   :6901300   Max.   :8803468  
##  NA's   :2        NA's   :17        NA's   :19        NA's   :14       
##  Tppl2003_2006     Tppl2007_2009     GDP_PC_PPS1989_1993 GDP_PC_PPS1994_1998
##  Min.   : 392306   Min.   : 401389   Mode:logical        Min.   : 6900      
##  1st Qu.: 689874   1st Qu.: 704162   NA's:50             1st Qu.:13675      
##  Median :1000174   Median :1021956                       Median :20700      
##  Mean   :1363199   Mean   :1467902                       Mean   :23215      
##  3rd Qu.:1622183   3rd Qu.:1695450                       3rd Qu.:29500      
##  Max.   :7413100   Max.   :7668300                       Max.   :54300      
##  NA's   :14        NA's   :20                            NA's   :16         
##  GDP_PC_PPS1999_2002 GDP_PC_PPS2003_2006 GDP_PC_PPS2007_2009    CITIES         
##  Min.   :10600       Min.   :13900       Min.   :16000       Length:50         
##  1st Qu.:21175       1st Qu.:22700       1st Qu.:26550       Class :character  
##  Median :27800       Median :31100       Median :34300       Mode  :character  
##  Mean   :30556       Mean   :33851       Mean   :39020                         
##  3rd Qu.:37575       3rd Qu.:43300       3rd Qu.:48450                         
##  Max.   :65800       Max.   :70100       Max.   :76200                         
##  NA's   :16          NA's   :15          NA's   :15                            
##  DemoDepend1989_1993 DemoDepend1994_1998 DemoDepend1999_2002
##  Min.   :48.10       Min.   :49.40       Min.   :48.40      
##  1st Qu.:53.83       1st Qu.:53.40       1st Qu.:54.40      
##  Median :61.30       Median :61.00       Median :57.00      
##  Mean   :60.58       Mean   :59.36       Mean   :57.36      
##  3rd Qu.:66.72       3rd Qu.:63.80       3rd Qu.:59.30      
##  Max.   :72.40       Max.   :68.10       Max.   :71.50      
##  NA's   :22          NA's   :25          NA's   :19         
##  DemoDepend2003_2006 DemoDepend2007_2009 DemoODepend1989_1993
##  Min.   :48.20       Min.   :45.10       Min.   :11.60       
##  1st Qu.:52.30       1st Qu.:51.85       1st Qu.:21.00       
##  Median :55.50       Median :55.40       Median :24.30       
##  Mean   :57.08       Mean   :56.45       Mean   :24.02       
##  3rd Qu.:59.90       3rd Qu.:59.80       3rd Qu.:26.30       
##  Max.   :73.10       Max.   :73.90       Max.   :35.20       
##  NA's   :17          NA's   :23          NA's   :17          
##  DemoODepend1994_1998 DemoODepend1999_2002 DemoODepend2003_2006
##  Min.   :19.90        Min.   :17.80        Min.   : 8.00       
##  1st Qu.:22.10        1st Qu.:23.05        1st Qu.:22.30       
##  Median :24.90        Median :25.30        Median :25.60       
##  Mean   :24.97        Mean   :25.87        Mean   :25.63       
##  3rd Qu.:26.50        3rd Qu.:27.60        3rd Qu.:27.90       
##  Max.   :33.80        Max.   :39.70        Max.   :41.20       
##  NA's   :21           NA's   :14           NA's   :13          
##  DemoODepend2007_2009  Thh1989_1993      Thh1994_1998      Thh1999_2002    
##  Min.   :16.50        Min.   : 181700   Min.   : 172189   Min.   : 173215  
##  1st Qu.:23.60        1st Qu.: 321757   1st Qu.: 300098   1st Qu.: 301566  
##  Median :27.25        Median : 468546   Median : 501000   Median : 459053  
##  Mean   :27.29        Mean   : 640428   Mean   : 677489   Mean   : 650780  
##  3rd Qu.:29.25        3rd Qu.: 745727   3rd Qu.: 784475   3rd Qu.: 757578  
##  Max.   :42.10        Max.   :2841000   Max.   :3002000   Max.   :3015997  
##  NA's   :20           NA's   :20        NA's   :27        NA's   :13       
##   Thh2003_2006      Thh2007_2009     Ndwe1989_1993     Ndwe1994_1998    
##  Min.   : 169913   Min.   : 179823   Min.   : 245810   Min.   : 172218  
##  1st Qu.: 315000   1st Qu.: 311600   1st Qu.: 336712   1st Qu.: 318326  
##  Median : 445413   Median : 441678   Median : 640641   Median : 483800  
##  Mean   : 740042   Mean   : 709285   Mean   : 675929   Mean   : 656970  
##  3rd Qu.: 861892   3rd Qu.: 778750   3rd Qu.: 826938   3rd Qu.: 756840  
##  Max.   :3112000   Max.   :3243000   Max.   :1734320   Max.   :1792443  
##  NA's   :23        NA's   :27        NA's   :35        NA's   :43       
##  Ndwe1999_2002     Ndwe2003_2006     Ndwe2007_2009     Napart1989_1993  
##  Min.   : 172753   Min.   : 175200   Min.   : 180500   Min.   : 216902  
##  1st Qu.: 318722   1st Qu.: 309745   1st Qu.: 300026   1st Qu.: 293286  
##  Median : 459753   Median : 473332   Median : 398317   Median : 493569  
##  Mean   : 714348   Mean   : 761303   Mean   : 575072   Mean   : 583210  
##  3rd Qu.: 779094   3rd Qu.: 827429   3rd Qu.: 743266   3rd Qu.: 783494  
##  Max.   :3401080   Max.   :3414094   Max.   :1890837   Max.   :1128801  
##  NA's   :13        NA's   :19        NA's   :29        NA's   :46       
##  Napart1994_1998 Napart1999_2002   Napart2003_2006   Napart2007_2009  
##  Mode:logical    Min.   :  50230   Min.   :  52280   Min.   : 165200  
##  NA's:50         1st Qu.: 278208   1st Qu.: 287361   1st Qu.: 270472  
##                  Median : 364509   Median : 371303   Median : 310492  
##                  Mean   : 533246   Mean   : 542447   Mean   : 579920  
##                  3rd Qu.: 651279   3rd Qu.: 660014   3rd Qu.: 663682  
##                  Max.   :1692262   Max.   :1694180   Max.   :1721929  
##                  NA's   :16        NA's   :27        NA's   :38       
##  Nhouse1989_1993  Nhouse1994_1998 Nhouse1999_2002   Nhouse2003_2006 
##  Min.   : 10383   Mode:logical    Min.   :   5354   Min.   : 10162  
##  1st Qu.: 19111   NA's:50         1st Qu.:  21275   1st Qu.: 27275  
##  Median : 28908                   Median :  43705   Median : 45387  
##  Mean   : 50611                   Mean   : 103348   Mean   : 74495  
##  3rd Qu.: 40959                   3rd Qu.:  92317   3rd Qu.: 98770  
##  Max.   :153693                   Max.   :1553888   Max.   :307987  
##  NA's   :45                       NA's   :16        NA's   :27      
##  Nhouse2007_2009   Aphouse1989_1993 Aphouse1994_1998 Aphouse1999_2002
##  Min.   :   1779   Min.   : 477.3   Min.   : 585.6   Min.   : 160.0  
##  1st Qu.:  35839   1st Qu.:1116.4   1st Qu.: 976.0   1st Qu.: 951.3  
##  Median :  70374   Median :1800.0   Median :1629.8   Median :1759.0  
##  Mean   : 203085   Mean   :1830.0   Mean   :1797.1   Mean   :1727.5  
##  3rd Qu.: 144459   3rd Qu.:2350.0   3rd Qu.:2550.0   3rd Qu.:2476.8  
##  Max.   :1570149   Max.   :3700.0   Max.   :3500.0   Max.   :3784.0  
##  NA's   :38        NA's   :39       NA's   :38       NA's   :31      
##  Aphouse2003_2006 Aphouse2007_2009 ApapartMincome1989_1993
##  Min.   : 408.6   Min.   : 238.7   Min.   :0.1080         
##  1st Qu.:1097.0   1st Qu.:1466.5   1st Qu.:0.1108         
##  Median :2200.0   Median :2800.0   Median :0.1190         
##  Mean   :2187.3   Mean   :2714.4   Mean   :0.1270         
##  3rd Qu.:2838.5   3rd Qu.:3833.0   3rd Qu.:0.1452         
##  Max.   :4530.0   Max.   :5399.9   Max.   :0.1540         
##  NA's   :27       NA's   :31       NA's   :44             
##  ApapartMincome1994_1998 ApapartMincome1999_2002 ApapartMincome2003_2006
##  Min.   :0.1070          Min.   :0.0440          Min.   :0.0800         
##  1st Qu.:0.1138          1st Qu.:0.0775          1st Qu.:0.0915         
##  Median :0.1255          Median :0.0955          Median :0.1145         
##  Mean   :0.1230          Mean   :0.1095          Mean   :0.1335         
##  3rd Qu.:0.1305          3rd Qu.:0.1080          3rd Qu.:0.1610         
##  Max.   :0.1380          Max.   :0.3050          Max.   :0.2660         
##  NA's   :44              NA's   :38              NA's   :34             
##  ApapartMincome2007_2009 Arent-housing1989_1993 Arent-housing1994_1998
##  Min.   :0.0650          Mode:logical           Mode:logical          
##  1st Qu.:0.0950          NA's:50                NA's:50               
##  Median :0.1100                                                       
##  Mean   :0.1334                                                       
##  3rd Qu.:0.1340                                                       
##  Max.   :0.2870                                                       
##  NA's   :37                                                           
##  Arent-housing1999_2002 Arent-housing2003_2006 Arent-housing2007_2009
##  Min.   : 3.00          Min.   :  5.00         Min.   :  8.00        
##  1st Qu.:12.25          1st Qu.: 13.00         1st Qu.: 18.00        
##  Median :70.50          Median : 78.00         Median : 88.00        
##  Mean   :55.00          Mean   : 58.59         Mean   : 76.06        
##  3rd Qu.:80.50          3rd Qu.: 85.00         3rd Qu.: 99.00        
##  Max.   :99.00          Max.   :105.00         Max.   :167.00        
##  NA's   :34             NA's   :33             NA's   :33            
##  Alarea1989_1993 Alarea1994_1998 Alarea1999_2002 Alarea2003_2006
##  Min.   :14.91   Min.   :16.70   Min.   :13.20   Min.   :14.80  
##  1st Qu.:19.28   1st Qu.:21.06   1st Qu.:24.27   1st Qu.:32.71  
##  Median :32.75   Median :33.35   Median :34.90   Median :38.20  
##  Mean   :28.58   Mean   :30.16   Mean   :31.65   Mean   :34.29  
##  3rd Qu.:35.12   3rd Qu.:38.12   3rd Qu.:38.00   3rd Qu.:40.35  
##  Max.   :37.10   Max.   :39.30   Max.   :47.70   Max.   :45.86  
##  NA's   :32      NA's   :38      NA's   :19      NA's   :30     
##  Alarea2007_2009 Phh-owndwe1989_1993 Phh-owndwe1994_1998 Phh-owndwe1999_2002
##  Min.   :15.85   Min.   : 9.30       Min.   :10.10       Min.   :11.40      
##  1st Qu.:27.03   1st Qu.:21.20       1st Qu.:16.45       1st Qu.:22.20      
##  Median :38.76   Median :39.90       Median :20.10       Median :50.00      
##  Mean   :33.50   Mean   :42.26       Mean   :36.05       Mean   :47.05      
##  3rd Qu.:39.75   3rd Qu.:57.20       3rd Qu.:54.90       3rd Qu.:64.20      
##  Max.   :46.44   Max.   :86.50       Max.   :78.10       Max.   :88.80      
##  NA's   :35      NA's   :23          NA's   :39          NA's   :13         
##  Phh-owndwe2003_2006 Phh-owndwe2007_2009 Urate1989_1993   Urate1994_1998 
##  Min.   :11.50       Min.   :12.80       Min.   : 1.300   Min.   : 2.00  
##  1st Qu.:19.27       1st Qu.:19.93       1st Qu.: 4.025   1st Qu.: 8.60  
##  Median :23.50       Median :21.25       Median : 7.150   Median : 9.30  
##  Mean   :36.59       Mean   :36.03       Mean   :10.575   Mean   :12.24  
##  3rd Qu.:50.80       3rd Qu.:41.15       3rd Qu.:14.250   3rd Qu.:16.60  
##  Max.   :84.70       Max.   :85.20       Max.   :42.700   Max.   :27.80  
##  NA's   :32          NA's   :38          NA's   :26       NA's   :29     
##  Urate1999_2002   Urate2003_2006   Urate2007_2009   Ncom-head1989_1993
##  Min.   : 2.600   Min.   : 3.300   Min.   : 1.100   Mode:logical      
##  1st Qu.: 5.250   1st Qu.: 6.700   1st Qu.: 5.225   NA's:50           
##  Median : 8.000   Median : 8.900   Median : 6.550                     
##  Mean   : 9.842   Mean   : 9.134   Mean   : 6.906                     
##  3rd Qu.:12.475   3rd Qu.:11.300   3rd Qu.: 8.550                     
##  Max.   :31.800   Max.   :19.100   Max.   :15.300                     
##  NA's   :14       NA's   :15       NA's   :32                         
##  Ncom-head1994_1998 Ncom-head1999_2002 Ncom-head2003_2006 Ncom-head2007_2009
##  Mode:logical       Min.   :  8.00     Min.   :  2.00     Min.   :  9.00    
##  NA's:50            1st Qu.: 23.25     1st Qu.: 12.50     1st Qu.: 21.75    
##                     Median : 46.00     Median : 21.00     Median : 38.00    
##                     Mean   :171.55     Mean   : 42.74     Mean   : 64.42    
##                     3rd Qu.:117.00     3rd Qu.: 37.50     3rd Qu.: 68.75    
##                     Max.   :985.00     Max.   :331.00     Max.   :210.00    
##                     NA's   :28         NA's   :27         NA's   :38        
##  Mhhincome1989_1993 Mhhincome1994_1998 Mhhincome1999_2002 Mhhincome2003_2006
##  Min.   :11913      Min.   : 1091      Min.   : 1641      Min.   : 2877     
##  1st Qu.:14350      1st Qu.: 8118      1st Qu.:12148      1st Qu.:12475     
##  Median :14988      Median :15500      Median :17476      Median :17400     
##  Mean   :14511      Mean   :11525      Mean   :15866      Mean   :15766     
##  3rd Qu.:15225      3rd Qu.:16000      3rd Qu.:21700      3rd Qu.:20600     
##  Max.   :15700      Max.   :17400      Max.   :26490      Max.   :26544     
##  NA's   :42         NA's   :35         NA's   :25         NA's   :33        
##  Mhhincome2007_2009 Ahhincome1989_1993 Ahhincome1994_1998 Ahhincome1999_2002
##  Min.   : 3437      Mode:logical       Mode:logical       Min.   : 2873     
##  1st Qu.:13587      NA's:50            NA's:50            1st Qu.:21302     
##  Median :21650                                            Median :24900     
##  Mean   :18643                                            Mean   :22507     
##  3rd Qu.:23275                                            3rd Qu.:26022     
##  Max.   :32210                                            Max.   :38516     
##  NA's   :36                                               NA's   :35        
##  Ahhincome2003_2006 Ahhincome2007_2009 RQ1-Q4earn1989_1993 RQ1-Q4earn1994_1998
##  Min.   : 3592      Min.   : 4278      Mode:logical        Mode:logical       
##  1st Qu.:21926      1st Qu.:23952      NA's:50             NA's:50            
##  Median :25150      Median :28200                                             
##  Mean   :22890      Mean   :24842                                             
##  3rd Qu.:27525      3rd Qu.:30437                                             
##  Max.   :40250      Max.   :35917                                             
##  NA's   :32         NA's   :31                                                
##  RQ1-Q4earn1999_2002 RQ1-Q4earn2003_2006 RQ1-Q4earn2007_2009
##  Min.   :0.20        Min.   :0.3000      Min.   :0.3000     
##  1st Qu.:0.30        1st Qu.:0.3000      1st Qu.:0.3000     
##  Median :0.40        Median :0.3000      Median :0.3000     
##  Mean   :0.35        Mean   :0.3429      Mean   :0.3286     
##  3rd Qu.:0.40        3rd Qu.:0.4000      3rd Qu.:0.3750     
##  Max.   :0.50        Max.   :0.4000      Max.   :0.4000     
##  NA's   :32          NA's   :36          NA's   :36         
##  HhincomeQ21989_1993 HhincomeQ21994_1998 HhincomeQ21999_2002
##  Mode:logical        Mode:logical        Min.   : 1374      
##  NA's:50             NA's:50             1st Qu.:11138      
##                                          Median :16500      
##                                          Mean   :14535      
##                                          3rd Qu.:18350      
##                                          Max.   :28444      
##                                          NA's   :32         
##  HhincomeQ22003_2006 HhincomeQ22007_2009 HhincomeQ31989_1993
##  Min.   : 2498       Min.   : 2838       Mode:logical       
##  1st Qu.:12900       1st Qu.:11096       NA's:50            
##  Median :16900       Median :18350                          
##  Mean   :14058       Mean   :15866                          
##  3rd Qu.:17900       3rd Qu.:19975                          
##  Max.   :23502       Max.   :27899                          
##  NA's   :37          NA's   :36                             
##  HhincomeQ31994_1998 HhincomeQ31999_2002 HhincomeQ32003_2006
##  Mode:logical        Min.   : 2009       Min.   : 3338      
##  NA's:50             1st Qu.:15869       1st Qu.:19509      
##                      Median :22907       Median :23400      
##                      Mean   :20194       Mean   :19306      
##                      3rd Qu.:25450       3rd Qu.:24500      
##                      Max.   :43628       Max.   :30372      
##                      NA's   :32          NA's   :37         
##  HhincomeQ32007_2009 Tlandarea1989_1993 Tlandarea1994_1998 Tlandarea1999_2002
##  Min.   : 3989       Min.   :  39.0     Min.   :  83.8     Min.   :  38.9    
##  1st Qu.:15830       1st Qu.: 139.5     1st Qu.: 144.1     1st Qu.: 158.3    
##  Median :25450       Median : 267.1     Median : 287.2     Median : 248.4    
##  Mean   :21855       Mean   : 363.2     Mean   : 364.2     Mean   : 382.6    
##  3rd Qu.:27325       3rd Qu.: 487.8     3rd Qu.: 495.0     3rd Qu.: 494.0    
##  Max.   :36527       Max.   :1498.7     Max.   :1285.3     Max.   :1572.0    
##  NA's   :36          NA's   :26         NA's   :23         NA's   :17        
##  Tlandarea2003_2006 Tlandarea2007_2009 Larea-leisure1989_1993
##  Min.   :  38.9     Min.   :  84.7     Min.   :1.50          
##  1st Qu.: 141.3     1st Qu.: 148.2     1st Qu.:3.05          
##  Median : 217.0     Median : 248.3     Median :4.60          
##  Mean   : 327.9     Mean   : 359.3     Mean   :4.60          
##  3rd Qu.: 426.2     3rd Qu.: 496.0     3rd Qu.:6.15          
##  Max.   :1285.3     Max.   :1307.7     Max.   :7.70          
##  NA's   :21         NA's   :25         NA's   :48            
##  Larea-leisure1994_1998 Larea-leisure1999_2002 Larea-leisure2003_2006
##  Min.   :2.000          Min.   : 0.00          Min.   : 2.40         
##  1st Qu.:2.600          1st Qu.: 4.35          1st Qu.:10.40         
##  Median :3.200          Median : 9.45          Median :22.00         
##  Mean   :3.667          Mean   :15.34          Mean   :21.76         
##  3rd Qu.:4.500          3rd Qu.:24.60          3rd Qu.:34.20         
##  Max.   :5.800          Max.   :38.90          Max.   :42.80         
##  NA's   :47             NA's   :32             NA's   :33            
##  Larea-leisure2007_2009 Parea-housing1989_1993 Parea-housing1994_1998
##  Min.   : 5.80          Min.   :34             Min.   :16.50         
##  1st Qu.:18.70          1st Qu.:34             1st Qu.:29.25         
##  Median :26.70          Median :34             Median :37.10         
##  Mean   :25.09          Mean   :34             Mean   :38.79         
##  3rd Qu.:31.18          3rd Qu.:34             3rd Qu.:44.05         
##  Max.   :42.00          Max.   :34             Max.   :71.30         
##  NA's   :38             NA's   :49             NA's   :43            
##  Parea-housing1999_2002 Parea-housing2003_2006 Parea-housing2007_2009
##  Min.   : 4.30          Min.   :10.70          Min.   :13.10         
##  1st Qu.:14.75          1st Qu.:14.25          1st Qu.:15.12         
##  Median :20.20          Median :19.60          Median :18.00         
##  Mean   :23.02          Mean   :24.40          Mean   :19.12         
##  3rd Qu.:24.45          3rd Qu.:27.98          3rd Qu.:22.70         
##  Max.   :72.00          Max.   :72.10          Max.   :28.60         
##  NA's   :31             NA's   :34             NA's   :38            
##  Ppldens1989_1993 Ppldens1994_1998 Ppldens1999_2002 Ppldens2003_2006
##  Min.   : 1852    Min.   : 2014    Min.   : 1384    Min.   : 1243   
##  1st Qu.: 2674    1st Qu.: 2627    1st Qu.: 2510    1st Qu.: 2608   
##  Median : 3775    Median : 3816    Median : 3768    Median : 4030   
##  Mean   : 5357    Mean   : 4622    Mean   : 5271    Mean   : 5103   
##  3rd Qu.: 6031    3rd Qu.: 5617    3rd Qu.: 5633    3rd Qu.: 6196   
##  Max.   :19797    Max.   :15240    Max.   :20287    Max.   :20467   
##  NA's   :26       NA's   :26       NA's   :18       NA's   :21      
##  Ppldens2007_2009 Netresidens-housingarea1989_1993
##  Min.   : 1313    Min.   :48871                   
##  1st Qu.: 2486    1st Qu.:48871                   
##  Median : 3306    Median :48871                   
##  Mean   : 4477    Mean   :48871                   
##  3rd Qu.: 5778    3rd Qu.:48871                   
##  Max.   :16454    Max.   :48871                   
##  NA's   :25       NA's   :49                      
##  Netresidens-housingarea1994_1998 Netresidens-housingarea1999_2002
##  Min.   : 7075                    Min.   :  6422                  
##  1st Qu.: 7362                    1st Qu.: 13804                  
##  Median :11080                    Median : 18127                  
##  Mean   :18732                    Mean   : 42694                  
##  3rd Qu.:22451                    3rd Qu.: 25980                  
##  Max.   :45694                    Max.   :465043                  
##  NA's   :46                       NA's   :31                      
##  Netresidens-housingarea2003_2006 Netresidens-housingarea2007_2009
##  Min.   : 8265                    Min.   : 8537                   
##  1st Qu.:12476                    1st Qu.:13269                   
##  Median :16750                    Median :16869                   
##  Mean   :17101                    Mean   :16338                   
##  3rd Qu.:20177                    3rd Qu.:18911                   
##  Max.   :28404                    Max.   :22743                   
##  NA's   :33                       NA's   :38                      
##  APApartment1989_1993 APApartment1994_1998 APApartment1999_2002
##  Min.   :1600         Min.   : 848         Min.   : 217.7      
##  1st Qu.:1725         1st Qu.:1700         1st Qu.:1072.7      
##  Median :1800         Median :2000         Median :1342.5      
##  Mean   :1883         Mean   :1821         Mean   :1425.4      
##  3rd Qu.:1950         3rd Qu.:2150         3rd Qu.:1879.2      
##  Max.   :2400         Max.   :2200         Max.   :2666.8      
##  NA's   :44           NA's   :43           NA's   :34          
##  APApartment2003_2006 APApartment2007_2009 Temp_Jul1989_1993 Temp_Jul1994_1998
##  Min.   : 341         Min.   : 962         Min.   :17.40     Min.   :14.80    
##  1st Qu.:1352         1st Qu.:1660         1st Qu.:18.20     1st Qu.:20.85    
##  Median :2007         Median :2150         Median :19.00     Median :26.00    
##  Mean   :2017         Mean   :2468         Mean   :20.77     Mean   :25.33    
##  3rd Qu.:2681         3rd Qu.:3232         3rd Qu.:22.45     3rd Qu.:29.55    
##  Max.   :4486         Max.   :5269         Max.   :25.90     Max.   :36.00    
##  NA's   :23           NA's   :29           NA's   :47        NA's   :39       
##  Temp_Jul1999_2002 Temp_Jul2003_2006 Temp_Jul2007_2009 Temp_Jan1989_1993
##  Min.   :18.50     Min.   :16.00     Min.   :16.70     Min.   :-1.800   
##  1st Qu.:20.00     1st Qu.:19.00     1st Qu.:18.50     1st Qu.: 0.300   
##  Median :21.00     Median :20.45     Median :20.30     Median : 2.400   
##  Mean   :22.40     Mean   :21.44     Mean   :22.06     Mean   : 3.333   
##  3rd Qu.:25.12     3rd Qu.:24.52     3rd Qu.:25.12     3rd Qu.: 5.900   
##  Max.   :31.50     Max.   :29.20     Max.   :32.00     Max.   : 9.400   
##  NA's   :20        NA's   :16        NA's   :22        NA's   :47       
##  Temp_Jan1994_1998 Temp_Jan1999_2002 Temp_Jan2003_2006 Temp_Jan2007_2009
##  Min.   :-8.500    Min.   :-7.200    Min.   :-7.700    Min.   :-3.000   
##  1st Qu.: 1.800    1st Qu.:-0.625    1st Qu.:-1.475    1st Qu.: 0.375   
##  Median : 4.500    Median : 1.700    Median : 1.600    Median : 1.950   
##  Mean   : 4.122    Mean   : 2.480    Mean   : 2.006    Mean   : 2.754   
##  3rd Qu.: 7.600    3rd Qu.: 6.150    3rd Qu.: 5.525    3rd Qu.: 3.650   
##  Max.   :13.400    Max.   :13.200    Max.   :11.900    Max.   :11.700   
##  NA's   :41        NA's   :20        NA's   :16        NA's   :22       
##   Latitude_deg    Latitude_min    Latitude_sec   Longitude_deg  
##  Min.   :37.00   Min.   : 0.00   Min.   : 0.00   Min.   :-9.00  
##  1st Qu.:44.25   1st Qu.:17.50   1st Qu.: 0.00   1st Qu.: 6.00  
##  Median :50.00   Median :28.00   Median : 0.00   Median :14.00  
##  Mean   :48.72   Mean   :29.98   Mean   :14.07   Mean   :17.44  
##  3rd Qu.:53.00   3rd Qu.:45.75   3rd Qu.:32.75   3rd Qu.:26.75  
##  Max.   :59.00   Max.   :56.00   Max.   :59.76   Max.   :60.00  
##                                                                 
##  Longitude_min    Longitude_sec          Lat             Lon        
##  Min.   :-59.00   Min.   :-57.000   Min.   :37.38   Min.   :-9.185  
##  1st Qu.:  6.00   1st Qu.:  0.000   1st Qu.:44.88   1st Qu.: 6.829  
##  Median : 21.00   Median :  0.000   Median :50.04   Median :14.336  
##  Mean   : 20.18   Mean   :  8.731   Mean   :49.22   Mean   :17.779  
##  3rd Qu.: 39.25   3rd Qu.: 22.250   3rd Qu.:53.50   3rd Qu.:27.143  
##  Max.   : 59.00   Max.   : 59.000   Max.   :59.93   Max.   :60.583  
##                                                                     
##  Liveability2010 Mercer_Qual_Liv2011 Mercer_Per_Safe2011    ECM2010      
##  Min.   :61.00   Min.   : 1.0        Min.   :  5.00      Min.   :0.0200  
##  1st Qu.:80.00   1st Qu.:16.0        1st Qu.: 11.00      1st Qu.:0.0550  
##  Median :90.00   Median :30.0        Median : 20.50      Median :0.1000  
##  Mean   :86.23   Mean   :33.2        Mean   : 34.94      Mean   :0.1663  
##  3rd Qu.:93.00   3rd Qu.:42.0        3rd Qu.: 40.50      3rd Qu.:0.2300  
##  Max.   :98.00   Max.   :84.0        Max.   :199.00      Max.   :0.8500  
##  NA's   :19      NA's   :25          NA's   :34          NA's   :23      
##   ECM_Cost2010   
##  Min.   :0.0200  
##  1st Qu.:0.1600  
##  Median :0.2700  
##  Mean   :0.4637  
##  3rd Qu.:0.6350  
##  Max.   :1.4200  
##  NA's   :23
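
The summary lists each city's coordinates twice: as degree, minute and second components and as decimal Lat/Lon. A minimal sketch of the usual conversion between the two is given here. The function name dms_to_decimal and the sample values are hypothetical, and the sketch assumes every component carries the sign of the coordinate, which the negative minima of Longitude_min and Longitude_sec suggest but the report does not confirm.

```r
# Convert degree/minute/second components to decimal degrees.
# Assumption: each component already carries the sign of the coordinate.
dms_to_decimal <- function(deg, min, sec) {
  deg + min / 60 + sec / 3600
}

# Made-up sample values, not rows from the dataset
lat <- dms_to_decimal(50, 30, 0)
lon <- dms_to_decimal(-9, -6, -36)
print(c(lat = lat, lon = lon))
```

If only the degrees were signed in the source data, the minutes and seconds would have to be multiplied by sign(deg) before summing.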
library(dplyr)

Review of missing values

faltantes <- data.frame(
  Variable = names(base_taller),
  NAs = colSums(is.na(base_taller))
)
faltantes
##                                                          Variable NAs
## City                                                         City   0
## City_Eng                                                 City_Eng   0
## City_Short                                             City_Short   0
## NAds                                                         NAds   0
## Price_Median                                         Price_Median   0
## Price_Mean                                             Price_Mean   0
## Area_Median                                           Area_Median   0
## Area_Mean                                               Area_Mean   0
## Room_Median                                           Room_Median   0
## Room_Mean                                               Room_Mean   0
## Euro_area                                               Euro_area   0
## EU                                                             EU   0
## Population                                             Population   0
## City_Area                                               City_Area   0
## Density                                                   Density   0
## GDP_PC                                                     GDP_PC   0
## GDP_PC_PPS                                             GDP_PC_PPS   0
## GDP_PC2008                                             GDP_PC2008   0
## GDP_PC2009                                             GDP_PC2009   0
## GDP_PC2010                                             GDP_PC2010   0
## Gini                                                         Gini   0
## HOR                                                           HOR   0
## Kearny_GCI2010                                     Kearny_GCI2010   0
## LRIR                                                         LRIR   0
## Inflation2010                                       Inflation2010   0
## Inflation2011                                       Inflation2011   6
## URate                                                       URate   0
## MIR2009                                                   MIR2009   2
## MIR2010                                                   MIR2010   2
## Mortgage_PC2010                                   Mortgage_PC2010   2
## Tppl1989_1993                                       Tppl1989_1993  17
## Tppl1994_1998                                       Tppl1994_1998  19
## Tppl1999_2002                                       Tppl1999_2002  14
## Tppl2003_2006                                       Tppl2003_2006  14
## Tppl2007_2009                                       Tppl2007_2009  20
## GDP_PC_PPS1989_1993                           GDP_PC_PPS1989_1993  50
## GDP_PC_PPS1994_1998                           GDP_PC_PPS1994_1998  16
## GDP_PC_PPS1999_2002                           GDP_PC_PPS1999_2002  16
## GDP_PC_PPS2003_2006                           GDP_PC_PPS2003_2006  15
## GDP_PC_PPS2007_2009                           GDP_PC_PPS2007_2009  15
## CITIES                                                     CITIES  13
## DemoDepend1989_1993                           DemoDepend1989_1993  22
## DemoDepend1994_1998                           DemoDepend1994_1998  25
## DemoDepend1999_2002                           DemoDepend1999_2002  19
## DemoDepend2003_2006                           DemoDepend2003_2006  17
## DemoDepend2007_2009                           DemoDepend2007_2009  23
## DemoODepend1989_1993                         DemoODepend1989_1993  17
## DemoODepend1994_1998                         DemoODepend1994_1998  21
## DemoODepend1999_2002                         DemoODepend1999_2002  14
## DemoODepend2003_2006                         DemoODepend2003_2006  13
## DemoODepend2007_2009                         DemoODepend2007_2009  20
## Thh1989_1993                                         Thh1989_1993  20
## Thh1994_1998                                         Thh1994_1998  27
## Thh1999_2002                                         Thh1999_2002  13
## Thh2003_2006                                         Thh2003_2006  23
## Thh2007_2009                                         Thh2007_2009  27
## Ndwe1989_1993                                       Ndwe1989_1993  35
## Ndwe1994_1998                                       Ndwe1994_1998  43
## Ndwe1999_2002                                       Ndwe1999_2002  13
## Ndwe2003_2006                                       Ndwe2003_2006  19
## Ndwe2007_2009                                       Ndwe2007_2009  29
## Napart1989_1993                                   Napart1989_1993  46
## Napart1994_1998                                   Napart1994_1998  50
## Napart1999_2002                                   Napart1999_2002  16
## Napart2003_2006                                   Napart2003_2006  27
## Napart2007_2009                                   Napart2007_2009  38
## Nhouse1989_1993                                   Nhouse1989_1993  45
## Nhouse1994_1998                                   Nhouse1994_1998  50
## Nhouse1999_2002                                   Nhouse1999_2002  16
## Nhouse2003_2006                                   Nhouse2003_2006  27
## Nhouse2007_2009                                   Nhouse2007_2009  38
## Aphouse1989_1993                                 Aphouse1989_1993  39
## Aphouse1994_1998                                 Aphouse1994_1998  38
## Aphouse1999_2002                                 Aphouse1999_2002  31
## Aphouse2003_2006                                 Aphouse2003_2006  27
## Aphouse2007_2009                                 Aphouse2007_2009  31
## ApapartMincome1989_1993                   ApapartMincome1989_1993  44
## ApapartMincome1994_1998                   ApapartMincome1994_1998  44
## ApapartMincome1999_2002                   ApapartMincome1999_2002  38
## ApapartMincome2003_2006                   ApapartMincome2003_2006  34
## ApapartMincome2007_2009                   ApapartMincome2007_2009  37
## Arent-housing1989_1993                     Arent-housing1989_1993  50
## Arent-housing1994_1998                     Arent-housing1994_1998  50
## Arent-housing1999_2002                     Arent-housing1999_2002  34
## Arent-housing2003_2006                     Arent-housing2003_2006  33
## Arent-housing2007_2009                     Arent-housing2007_2009  33
## Alarea1989_1993                                   Alarea1989_1993  32
## Alarea1994_1998                                   Alarea1994_1998  38
## Alarea1999_2002                                   Alarea1999_2002  19
## Alarea2003_2006                                   Alarea2003_2006  30
## Alarea2007_2009                                   Alarea2007_2009  35
## Phh-owndwe1989_1993                           Phh-owndwe1989_1993  23
## Phh-owndwe1994_1998                           Phh-owndwe1994_1998  39
## Phh-owndwe1999_2002                           Phh-owndwe1999_2002  13
## Phh-owndwe2003_2006                           Phh-owndwe2003_2006  32
## Phh-owndwe2007_2009                           Phh-owndwe2007_2009  38
## Urate1989_1993                                     Urate1989_1993  26
## Urate1994_1998                                     Urate1994_1998  29
## Urate1999_2002                                     Urate1999_2002  14
## Urate2003_2006                                     Urate2003_2006  15
## Urate2007_2009                                     Urate2007_2009  32
## Ncom-head1989_1993                             Ncom-head1989_1993  50
## Ncom-head1994_1998                             Ncom-head1994_1998  50
## Ncom-head1999_2002                             Ncom-head1999_2002  28
## Ncom-head2003_2006                             Ncom-head2003_2006  27
## Ncom-head2007_2009                             Ncom-head2007_2009  38
## Mhhincome1989_1993                             Mhhincome1989_1993  42
## Mhhincome1994_1998                             Mhhincome1994_1998  35
## Mhhincome1999_2002                             Mhhincome1999_2002  25
## Mhhincome2003_2006                             Mhhincome2003_2006  33
## Mhhincome2007_2009                             Mhhincome2007_2009  36
## Ahhincome1989_1993                             Ahhincome1989_1993  50
## Ahhincome1994_1998                             Ahhincome1994_1998  50
## Ahhincome1999_2002                             Ahhincome1999_2002  35
## Ahhincome2003_2006                             Ahhincome2003_2006  32
## Ahhincome2007_2009                             Ahhincome2007_2009  31
## RQ1-Q4earn1989_1993                           RQ1-Q4earn1989_1993  50
## RQ1-Q4earn1994_1998                           RQ1-Q4earn1994_1998  50
## RQ1-Q4earn1999_2002                           RQ1-Q4earn1999_2002  32
## RQ1-Q4earn2003_2006                           RQ1-Q4earn2003_2006  36
## RQ1-Q4earn2007_2009                           RQ1-Q4earn2007_2009  36
## HhincomeQ21989_1993                           HhincomeQ21989_1993  50
## HhincomeQ21994_1998                           HhincomeQ21994_1998  50
## HhincomeQ21999_2002                           HhincomeQ21999_2002  32
## HhincomeQ22003_2006                           HhincomeQ22003_2006  37
## HhincomeQ22007_2009                           HhincomeQ22007_2009  36
## HhincomeQ31989_1993                           HhincomeQ31989_1993  50
## HhincomeQ31994_1998                           HhincomeQ31994_1998  50
## HhincomeQ31999_2002                           HhincomeQ31999_2002  32
## HhincomeQ32003_2006                           HhincomeQ32003_2006  37
## HhincomeQ32007_2009                           HhincomeQ32007_2009  36
## Tlandarea1989_1993                             Tlandarea1989_1993  26
## Tlandarea1994_1998                             Tlandarea1994_1998  23
## Tlandarea1999_2002                             Tlandarea1999_2002  17
## Tlandarea2003_2006                             Tlandarea2003_2006  21
## Tlandarea2007_2009                             Tlandarea2007_2009  25
## Larea-leisure1989_1993                     Larea-leisure1989_1993  48
## Larea-leisure1994_1998                     Larea-leisure1994_1998  47
## Larea-leisure1999_2002                     Larea-leisure1999_2002  32
## Larea-leisure2003_2006                     Larea-leisure2003_2006  33
## Larea-leisure2007_2009                     Larea-leisure2007_2009  38
## Parea-housing1989_1993                     Parea-housing1989_1993  49
## Parea-housing1994_1998                     Parea-housing1994_1998  43
## Parea-housing1999_2002                     Parea-housing1999_2002  31
## Parea-housing2003_2006                     Parea-housing2003_2006  34
## Parea-housing2007_2009                     Parea-housing2007_2009  38
## Ppldens1989_1993                                 Ppldens1989_1993  26
## Ppldens1994_1998                                 Ppldens1994_1998  26
## Ppldens1999_2002                                 Ppldens1999_2002  18
## Ppldens2003_2006                                 Ppldens2003_2006  21
## Ppldens2007_2009                                 Ppldens2007_2009  25
## Netresidens-housingarea1989_1993 Netresidens-housingarea1989_1993  49
## Netresidens-housingarea1994_1998 Netresidens-housingarea1994_1998  46
## Netresidens-housingarea1999_2002 Netresidens-housingarea1999_2002  31
## Netresidens-housingarea2003_2006 Netresidens-housingarea2003_2006  33
## Netresidens-housingarea2007_2009 Netresidens-housingarea2007_2009  38
## APApartment1989_1993                         APApartment1989_1993  44
## APApartment1994_1998                         APApartment1994_1998  43
## APApartment1999_2002                         APApartment1999_2002  34
## APApartment2003_2006                         APApartment2003_2006  23
## APApartment2007_2009                         APApartment2007_2009  29
## Temp_Jul1989_1993                               Temp_Jul1989_1993  47
## Temp_Jul1994_1998                               Temp_Jul1994_1998  39
## Temp_Jul1999_2002                               Temp_Jul1999_2002  20
## Temp_Jul2003_2006                               Temp_Jul2003_2006  16
## Temp_Jul2007_2009                               Temp_Jul2007_2009  22
## Temp_Jan1989_1993                               Temp_Jan1989_1993  47
## Temp_Jan1994_1998                               Temp_Jan1994_1998  41
## Temp_Jan1999_2002                               Temp_Jan1999_2002  20
## Temp_Jan2003_2006                               Temp_Jan2003_2006  16
## Temp_Jan2007_2009                               Temp_Jan2007_2009  22
## Latitude_deg                                         Latitude_deg   0
## Latitude_min                                         Latitude_min   0
## Latitude_sec                                         Latitude_sec   0
## Longitude_deg                                       Longitude_deg   0
## Longitude_min                                       Longitude_min   0
## Longitude_sec                                       Longitude_sec   0
## Lat                                                           Lat   0
## Lon                                                           Lon   0
## Liveability2010                                   Liveability2010  19
## Mercer_Qual_Liv2011                           Mercer_Qual_Liv2011  25
## Mercer_Per_Safe2011                           Mercer_Per_Safe2011  34
## ECM2010                                                   ECM2010  23
## ECM_Cost2010                                         ECM_Cost2010  23
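
Raw counts are easier to act on when expressed as a share of rows. The sketch below shows one possible way to do that and to flag columns that are too incomplete to use. It runs on a made-up toy data frame rather than on base_taller, and the object names (toy, na_share, max_na_share, kept_cols) and the 0.5 cut-off are assumptions for illustration, not choices made in the report.

```r
# Toy data frame standing in for base_taller (values are made up)
toy <- data.frame(
  Price_Median = c(1200, 2500, 3100, 1800),
  MIR2010      = c(3.5, NA, 4.1, 2.9),
  Ahhincome    = c(NA, NA, NA, 21000)
)

# Share of missing values per column
na_share <- colMeans(is.na(toy))

# Keep only columns whose missing share is at or below a chosen threshold
max_na_share <- 0.5   # hypothetical cut-off, not taken from the report
kept_cols <- names(na_share)[na_share <= max_na_share]

# Summary table sorted from most to least incomplete
na_table <- data.frame(
  Variable = names(na_share),
  NA_share = round(as.numeric(na_share), 2),
  row.names = NULL
)
na_table <- na_table[order(-na_table$NA_share), ]
print(na_table)
print(kept_cols)
```

Passing row.names = NULL also avoids repeating the variable name in both the row label and the Variable column.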

2.1. Descriptive statistics

# Load required libraries
library(psych)
## 
## Attaching package: 'psych'
## The following object is masked from 'package:Hmisc':
## 
##     describe
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha
library(dplyr)
library(ggplot2)

# Select the variables of interest
vars_analisis <- c("Price_Median", "Area_Median", "Room_Median", "Density",
                   "Population", "GDP_PC", "URate", "MIR2010")

# Subset the data and convert to numeric
datos_analisis <- base_taller[, vars_analisis]
datos_analisis <- data.frame(lapply(datos_analisis, as.numeric))
# Drop rows with missing values (MIR2010 has 2 NAs)
datos_analisis <- na.omit(datos_analisis)

# Basic descriptive statistics with summary()
cat("=== BASIC DESCRIPTIVE STATISTICS ===\n")
## === BASIC DESCRIPTIVE STATISTICS ===
summary_stats <- summary(datos_analisis)
print(summary_stats)
##   Price_Median     Area_Median      Room_Median      Density     
##  Min.   : 503.1   Min.   : 47.00   Min.   :2.00   Min.   : 1315  
##  1st Qu.:1324.7   1st Qu.: 55.10   1st Qu.:2.00   1st Qu.: 2551  
##  Median :2167.8   Median : 68.50   Median :2.25   Median : 3297  
##  Mean   :2488.7   Mean   : 69.17   Mean   :2.49   Mean   : 4549  
##  3rd Qu.:3127.2   3rd Qu.: 79.25   3rd Qu.:3.00   3rd Qu.: 5394  
##  Max.   :8590.9   Max.   :100.00   Max.   :3.00   Max.   :20618  
##    Population           GDP_PC          URate           MIR2010      
##  Min.   :  401389   Min.   : 2535   Min.   : 1.700   Min.   : 0.920  
##  1st Qu.:  784105   1st Qu.:19825   1st Qu.: 5.805   1st Qu.: 2.993  
##  Median : 1132700   Median :30700   Median : 7.990   Median : 3.815  
##  Mean   : 1885327   Mean   :32034   Mean   : 9.545   Mean   : 7.026  
##  3rd Qu.: 1704168   3rd Qu.:43000   3rd Qu.:11.908   3rd Qu.: 8.615  
##  Max.   :12915158   Max.   :86464   Max.   :23.124   Max.   :26.200
# More detailed statistics with psych::describe()
cat("\n=== DETAILED DESCRIPTIVE STATISTICS ===\n")
## 
## === DETAILED DESCRIPTIVE STATISTICS ===
detallado_stats <- describe(datos_analisis)
print(round(detallado_stats, 3))
##              vars  n       mean         sd     median    trimmed       mad
## Price_Median    1 48    2488.72    1620.91    2167.76    2272.81   1316.81
## Area_Median     2 48      69.17      14.90      68.50      68.48     17.79
## Room_Median     3 48       2.49       0.50       2.25       2.49      0.37
## Density         4 48    4548.62    3520.04    3296.74    3928.58   1479.41
## Population      5 48 1885327.48 2431882.99 1132700.00 1322120.12 667765.26
## GDP_PC          6 48   32033.68   20843.98   30700.00   30494.03  18532.50
## URate           7 48       9.54       5.26       7.99       9.13      4.76
## MIR2010         8 48       7.03       6.84       3.82       5.60      1.26
##                    min         max       range skew kurtosis        se
## Price_Median    503.07     8590.87     8087.80 1.54     2.98    233.96
## Area_Median      47.00      100.00       53.00 0.28    -0.99      2.15
## Room_Median       2.00        3.00        1.00 0.04    -2.02      0.07
## Density        1315.07    20617.90    19302.83 2.74     8.74    508.07
## Population   401389.00 12915158.00 12513769.00 3.15     9.93 351012.07
## GDP_PC         2534.91    86463.74    83928.83 0.53    -0.26   3008.57
## URate             1.70       23.12       21.42 0.75    -0.08      0.76
## MIR2010           0.92       26.20       25.28 1.82     2.32      0.99
# Correlation matrix
cat("\n=== CORRELATION MATRIX ===\n")
## 
## === CORRELATION MATRIX ===
cor_matrix <- cor(datos_analisis, use = "complete.obs")
print(round(cor_matrix, 3))
##              Price_Median Area_Median Room_Median Density Population GDP_PC
## Price_Median        1.000       0.121       0.276   0.498      0.132  0.630
## Area_Median         0.121       1.000       0.748   0.061      0.060  0.304
## Room_Median         0.276       0.748       1.000   0.041      0.020  0.502
## Density             0.498       0.061       0.041   1.000      0.105  0.157
## Population          0.132       0.060       0.020   0.105      1.000 -0.172
## GDP_PC              0.630       0.304       0.502   0.157     -0.172  1.000
## URate              -0.140       0.272       0.197   0.058     -0.097 -0.034
## MIR2010            -0.383      -0.545      -0.546  -0.090      0.017 -0.591
##               URate MIR2010
## Price_Median -0.140  -0.383
## Area_Median   0.272  -0.545
## Room_Median   0.197  -0.546
## Density       0.058  -0.090
## Population   -0.097   0.017
## GDP_PC       -0.034  -0.591
## URate         1.000  -0.346
## MIR2010      -0.346   1.000

2.2. Exploratory plots

# Set the default theme for plots
theme_set(theme_minimal())

# A) DISTRIBUTION HISTOGRAMS

library(readxl)
library(ggplot2)

# Load the dataset
datos <- read_excel("C:/Users/carab/Downloads/Rosi_files/dp2015-13_Dataset.xls")

# Main variables
vars <- c("Price_Median", "Area_Median", "Room_Median", 
          "Population", "GDP_PC", "URate", "MIR2010")

# Draw one histogram per variable, with a legend
for (v in vars) {
  # .data[[v]] replaces the deprecated aes_string(); fill = v maps a constant label
  p <- ggplot(datos, aes(x = .data[[v]], fill = v)) +
    geom_histogram(color = "black", bins = 20, alpha = 0.7) +
    labs(title = paste("Distribution of", v),
         x = v,
         y = "Frequency",
         fill = "Variable") +
    theme_minimal(base_size = 14) +
    theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 16),
          axis.title = element_text(face = "bold"),
          legend.position = "right",
          panel.grid.major = element_line(color = "purple"))
  
  print(p)  # Show each histogram
}

## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_bin()`).

# B) VIOLIN PLOTS FOR THE MAIN VARIABLES
cat("\nGenerating violin plots...\n")
## 
## Generating violin plots...
library(vioplot)
## Loading required package: sm
## Package 'sm', version 2.2-6.0: type help(sm) for summary information
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
par(mfrow = c(2, 2))
vioplot(datos_analisis$Price_Median, main = "Price_Median distribution", col = "gold")
vioplot(datos_analisis$Area_Median, main = "Area_Median distribution", col = "lightblue")
vioplot(datos_analisis$GDP_PC, main = "GDP_PC distribution", col = "lightgreen")
vioplot(datos_analisis$URate, main = "URate distribution", col = "pink")

Map

# LOAD LIBRARIES
library(leaflet)
library(readxl)

# LOAD AND PREPARE DATA
datos <- read_excel("C:/Users/carab/Downloads/Rosi_files/dp2015-13_Dataset.xls")

# Check the actual column names
cat("Column names in the dataset:\n")
## Column names in the dataset:
print(names(datos))
##   [1] "City"                             "City_Eng"                        
##   [3] "City_Short"                       "NAds"                            
##   [5] "Price_Median"                     "Price_Mean"                      
##   [7] "Area_Median"                      "Area_Mean"                       
##   [9] "Room_Median"                      "Room_Mean"                       
##  [11] "Euro_area"                        "EU"                              
##  [13] "Population"                       "City_Area"                       
##  [15] "Density"                          "GDP_PC"                          
##  [17] "GDP_PC_PPS"                       "GDP_PC2008"                      
##  [19] "GDP_PC2009"                       "GDP_PC2010"                      
##  [21] "Gini"                             "HOR"                             
##  [23] "Kearny_GCI2010"                   "LRIR"                            
##  [25] "Inflation2010"                    "Inflation2011"                   
##  [27] "URate"                            "MIR2009"                         
##  [29] "MIR2010"                          "Mortgage_PC2010"                 
##  [31] "Tppl1989_1993"                    "Tppl1994_1998"                   
##  [33] "Tppl1999_2002"                    "Tppl2003_2006"                   
##  [35] "Tppl2007_2009"                    "GDP_PC_PPS1989_1993"             
##  [37] "GDP_PC_PPS1994_1998"              "GDP_PC_PPS1999_2002"             
##  [39] "GDP_PC_PPS2003_2006"              "GDP_PC_PPS2007_2009"             
##  [41] "CITIES"                           "DemoDepend1989_1993"             
##  [43] "DemoDepend1994_1998"              "DemoDepend1999_2002"             
##  [45] "DemoDepend2003_2006"              "DemoDepend2007_2009"             
##  [47] "DemoODepend1989_1993"             "DemoODepend1994_1998"            
##  [49] "DemoODepend1999_2002"             "DemoODepend2003_2006"            
##  [51] "DemoODepend2007_2009"             "Thh1989_1993"                    
##  [53] "Thh1994_1998"                     "Thh1999_2002"                    
##  [55] "Thh2003_2006"                     "Thh2007_2009"                    
##  [57] "Ndwe1989_1993"                    "Ndwe1994_1998"                   
##  [59] "Ndwe1999_2002"                    "Ndwe2003_2006"                   
##  [61] "Ndwe2007_2009"                    "Napart1989_1993"                 
##  [63] "Napart1994_1998"                  "Napart1999_2002"                 
##  [65] "Napart2003_2006"                  "Napart2007_2009"                 
##  [67] "Nhouse1989_1993"                  "Nhouse1994_1998"                 
##  [69] "Nhouse1999_2002"                  "Nhouse2003_2006"                 
##  [71] "Nhouse2007_2009"                  "Aphouse1989_1993"                
##  [73] "Aphouse1994_1998"                 "Aphouse1999_2002"                
##  [75] "Aphouse2003_2006"                 "Aphouse2007_2009"                
##  [77] "ApapartMincome1989_1993"          "ApapartMincome1994_1998"         
##  [79] "ApapartMincome1999_2002"          "ApapartMincome2003_2006"         
##  [81] "ApapartMincome2007_2009"          "Arent-housing1989_1993"          
##  [83] "Arent-housing1994_1998"           "Arent-housing1999_2002"          
##  [85] "Arent-housing2003_2006"           "Arent-housing2007_2009"          
##  [87] "Alarea1989_1993"                  "Alarea1994_1998"                 
##  [89] "Alarea1999_2002"                  "Alarea2003_2006"                 
##  [91] "Alarea2007_2009"                  "Phh-owndwe1989_1993"             
##  [93] "Phh-owndwe1994_1998"              "Phh-owndwe1999_2002"             
##  [95] "Phh-owndwe2003_2006"              "Phh-owndwe2007_2009"             
##  [97] "Urate1989_1993"                   "Urate1994_1998"                  
##  [99] "Urate1999_2002"                   "Urate2003_2006"                  
## [101] "Urate2007_2009"                   "Ncom-head1989_1993"              
## [103] "Ncom-head1994_1998"               "Ncom-head1999_2002"              
## [105] "Ncom-head2003_2006"               "Ncom-head2007_2009"              
## [107] "Mhhincome1989_1993"               "Mhhincome1994_1998"              
## [109] "Mhhincome1999_2002"               "Mhhincome2003_2006"              
## [111] "Mhhincome2007_2009"               "Ahhincome1989_1993"              
## [113] "Ahhincome1994_1998"               "Ahhincome1999_2002"              
## [115] "Ahhincome2003_2006"               "Ahhincome2007_2009"              
## [117] "RQ1-Q4earn1989_1993"              "RQ1-Q4earn1994_1998"             
## [119] "RQ1-Q4earn1999_2002"              "RQ1-Q4earn2003_2006"             
## [121] "RQ1-Q4earn2007_2009"              "HhincomeQ21989_1993"             
## [123] "HhincomeQ21994_1998"              "HhincomeQ21999_2002"             
## [125] "HhincomeQ22003_2006"              "HhincomeQ22007_2009"             
## [127] "HhincomeQ31989_1993"              "HhincomeQ31994_1998"             
## [129] "HhincomeQ31999_2002"              "HhincomeQ32003_2006"             
## [131] "HhincomeQ32007_2009"              "Tlandarea1989_1993"              
## [133] "Tlandarea1994_1998"               "Tlandarea1999_2002"              
## [135] "Tlandarea2003_2006"               "Tlandarea2007_2009"              
## [137] "Larea-leisure1989_1993"           "Larea-leisure1994_1998"          
## [139] "Larea-leisure1999_2002"           "Larea-leisure2003_2006"          
## [141] "Larea-leisure2007_2009"           "Parea-housing1989_1993"          
## [143] "Parea-housing1994_1998"           "Parea-housing1999_2002"          
## [145] "Parea-housing2003_2006"           "Parea-housing2007_2009"          
## [147] "Ppldens1989_1993"                 "Ppldens1994_1998"                
## [149] "Ppldens1999_2002"                 "Ppldens2003_2006"                
## [151] "Ppldens2007_2009"                 "Netresidens-housingarea1989_1993"
## [153] "Netresidens-housingarea1994_1998" "Netresidens-housingarea1999_2002"
## [155] "Netresidens-housingarea2003_2006" "Netresidens-housingarea2007_2009"
## [157] "APApartment1989_1993"             "APApartment1994_1998"            
## [159] "APApartment1999_2002"             "APApartment2003_2006"            
## [161] "APApartment2007_2009"             "Temp_Jul1989_1993"               
## [163] "Temp_Jul1994_1998"                "Temp_Jul1999_2002"               
## [165] "Temp_Jul2003_2006"                "Temp_Jul2007_2009"               
## [167] "Temp_Jan1989_1993"                "Temp_Jan1994_1998"               
## [169] "Temp_Jan1999_2002"                "Temp_Jan2003_2006"               
## [171] "Temp_Jan2007_2009"                "Latitude_deg"                    
## [173] "Latitude_min"                     "Latitude_sec"                    
## [175] "Longitude_deg"                    "Longitude_min"                   
## [177] "Longitude_sec"                    "Lat"                             
## [179] "Lon"                              "Liveability2010"                 
## [181] "Mercer_Qual_Liv2011"              "Mercer_Per_Safe2011"             
## [183] "ECM2010"                          "ECM_Cost2010"
# Clean the data: keep only rows with complete coordinates
# (the coordinate columns are expected to be named "Lon" and "Lat")
if ("Lon" %in% names(datos) && "Lat" %in% names(datos)) {
  datos <- datos[complete.cases(datos$Lon, datos$Lat), ]
} else {
  # Otherwise, look for columns that could hold coordinates
  coords_cols <- grep("lon|lat|longitud|latitud", names(datos), ignore.case = TRUE, value = TRUE)
  cat("Columns that could be coordinates:", paste(coords_cols, collapse = ", "), "\n")
  stop("Adjust the coordinate column names in the code")
}

# FIND THE GDP VARIABLE (first candidate name present in the dataset)
variable_pib <- NULL
posibles_nombres_pib <- c("GDP_PC", "PIB", "PIB_PC", "GDP", "PIB_per_capita", "GDP_per_capita")

for (nombre in posibles_nombres_pib) {
  if (nombre %in% names(datos)) {
    variable_pib <- nombre
    break
  }
}

if (is.null(variable_pib)) {
  cat("No GDP variable found. Available numeric variables:\n")
  numeric_vars <- names(datos)[sapply(datos, is.numeric)]
  print(numeric_vars)
  stop("Pick a numeric variable and set 'variable_pib' in the code")
}

cat("Using variable:", variable_pib, "\n")
## Using variable: GDP_PC
# MAP SETTINGS
centro_europa <- c(50, 10)  # Latitude, longitude (center of Europe)
nivel_zoom <- 4

# BUILD THE MAP
paleta_pib <- colorNumeric(
  palette = c("#f7fbff", "#08306b"),  # Light to dark blue
  domain = datos[[variable_pib]]
)

mapa <- leaflet(datos) %>%
  setView(lng = centro_europa[2], lat = centro_europa[1], zoom = nivel_zoom) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addCircleMarkers(
    lng = ~Lon,
    lat = ~Lat,
    radius = 6,
    fillColor = ~paleta_pib(datos[[variable_pib]]),
    fillOpacity = 0.8,
    stroke = TRUE,
    color = "white",
    weight = 1,
    popup = ~paste(
      "<b>GDP per capita:</b> €", 
      format(round(datos[[variable_pib]], 0), big.mark = ",")
    )
  ) %>%
  addLegend(
    position = "bottomright",
    pal = paleta_pib,
    values = datos[[variable_pib]],
    title = "GDP per capita (€)",
    opacity = 0.9
  ) %>%
  addScaleBar(
    position = "bottomleft",
    options = scaleBarOptions(metric = TRUE, imperial = FALSE)
  )

# MOSTRAR MAPA
print(mapa)

cat("Mapa creado exitosamente!\n")
## Mapa creado exitosamente!

2.3 Correlation matrix between the explanatory and dependent variables

library(Hmisc)
library(corrplot)

# Keep only the numeric variables of interest
vars <- c("Price_Median", "Area_Median", "Room_Median", "GDP_PC", "URate", "MIR2010")

# Coerce to numeric and drop rows with missing values
datos_cor <- base_taller[, vars]
datos_cor <- data.frame(lapply(datos_cor, as.numeric))
datos_cor <- na.omit(datos_cor)

# Compute the correlations and their p-values
cor_test <- rcorr(as.matrix(datos_cor))

# Correlation matrix (rounded to 2 decimals) and p-value matrix
matrizcor <- round(cor_test$r, 2)
pval <- cor_test$P

# rcorr() returns NA on the diagonal of P; replace it so corrplot() gets a complete matrix
pval[is.na(pval)] <- 1

# Plot the full correlation matrix
corrplot(
  matrizcor,
  method = "color",
  type = "full",         # whole matrix, not just one triangle
  tl.col = "black",
  tl.srt = 45,
  addCoef.col = "black", # print the coefficient in each cell
  number.cex = 0.8,
  p.mat = pval,          # p-value matrix
  sig.level = 0.05,      # significance threshold
  insig = "blank",       # hide non-significant cells
  title = "Heat map of significant correlations",
  mar = c(0, 0, 2, 0)
)

Scatter plots:

## Scatter-plot matrix for all pairs of variables

# Select the variables
vars <- c("Price_Median", "Area_Median", "Room_Median", "GDP_PC", "URate", "MIR2010")

# Scatter-plot matrix
pairs(base_taller[vars],
      main = "Scatter-plot matrix",
      pch = 19, col = "blue")

Individual scatter plots

## Scatter plot of Price_Median vs Area_Median
ggplot(base_taller, aes(x = Area_Median, y = Price_Median)) +
  geom_point(aes(color = "Points"), alpha = 0.6, size = 2) +
  geom_smooth(aes(color = "Trend line"), method = "lm", se = FALSE, linewidth = 1) +
  scale_color_manual(name = "Elements",
                     values = c("Points" = "steelblue", "Trend line" = "red")) +
  labs(title = "Scatter plot of Price_Median vs Area_Median",
       x = "Area_Median", y = "Price_Median") +
  theme_minimal() +
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5, face = "bold"),
        axis.title = element_text(face = "bold"))
## `geom_smooth()` using formula = 'y ~ x'

## Plot 2: Price_Median vs Room_Median

# Scatter plot of Price_Median vs Room_Median

ggplot(base_taller, aes(x = Room_Median, y = Price_Median)) +
  geom_point(aes(color = "Points"), alpha = 0.6, size = 2) +
  geom_smooth(aes(color = "Trend line"), method = "lm", se = FALSE, linewidth = 1) +
  scale_color_manual(name = "Elements",
                     values = c("Points" = "steelblue", "Trend line" = "red")) +
  labs(title = "Scatter plot of Price_Median vs Room_Median",
       x = "Room_Median", y = "Price_Median") +
  theme_minimal() +
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5, face = "bold"),
        axis.title = element_text(face = "bold"))
## `geom_smooth()` using formula = 'y ~ x'

## Plot 3

## Scatter plot of Price_Median vs GDP_PC
ggplot(base_taller, aes(x = GDP_PC, y = Price_Median)) +
  geom_point(aes(color = "Points"), alpha = 0.6, size = 2) +
  geom_smooth(aes(color = "Trend line"), method = "lm", se = FALSE, linewidth = 1) +
  scale_color_manual(name = "Elements",
                     values = c("Points" = "steelblue", "Trend line" = "red")) +
  labs(title = "Scatter plot of Price_Median vs GDP_PC",
       x = "GDP_PC", y = "Price_Median") +
  theme_minimal() +
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5, face = "bold"),
        axis.title = element_text(face = "bold"))
## `geom_smooth()` using formula = 'y ~ x'

## Plot 4

## Scatter plot of Price_Median vs URate

ggplot(base_taller, aes(x = URate, y = Price_Median)) +
  geom_point(aes(color = "Points"), alpha = 0.6, size = 2) +
  geom_smooth(aes(color = "Trend line"), method = "lm", se = FALSE, linewidth = 1) +
  scale_color_manual(name = "Elements",
                     values = c("Points" = "steelblue", "Trend line" = "red")) +
  labs(title = "Scatter plot of Price_Median vs URate",
       x = "URate", y = "Price_Median") +
  theme_minimal() +
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5, face = "bold"),
        axis.title = element_text(face = "bold"))
## `geom_smooth()` using formula = 'y ~ x'

## Plot 5

## Scatter plot of Price_Median vs MIR2010

ggplot(base_taller, aes(x = MIR2010, y = Price_Median)) +
  geom_point(aes(color = "Points"), alpha = 0.6, size = 2) +
  geom_smooth(aes(color = "Trend line"), method = "lm", se = FALSE, linewidth = 1) +
  scale_color_manual(name = "Elements",
                     values = c("Points" = "steelblue", "Trend line" = "red")) +
  labs(title = "Scatter plot of Price_Median vs MIR2010",
       x = "MIR2010", y = "Price_Median") +
  theme_minimal() +
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5, face = "bold"),
        axis.title = element_text(face = "bold"))
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).

3: Multiple linear regression model

# 1. LOAD LIBRARIES
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
# 2. PREPARE THE DATA
cat("=== DATA PREPARATION ===\n")
## === DATA PREPARATION ===
variables_modelo <- c("Price_Median", "Area_Median", "Room_Median", "Density", 
                     "Population", "GDP_PC", "URate", "MIR2010")

# Keep only the variables present in the dataset
vars_disponibles <- variables_modelo[variables_modelo %in% names(base_taller)]
cat("Available variables:", paste(vars_disponibles, collapse = ", "), "\n")
## Available variables: Price_Median, Area_Median, Room_Median, Density, Population, GDP_PC, URate, MIR2010
datos_completos <- base_taller[, vars_disponibles]
datos_completos <- data.frame(lapply(datos_completos, as.numeric))
datos_completos <- na.omit(datos_completos)

cat("Final observations:", nrow(datos_completos), "\n\n")
## Final observations: 48
# 3. DESCRIPTIVE STATISTICS
cat("=== DESCRIPTIVE STATISTICS ===\n")
## === DESCRIPTIVE STATISTICS ===
print(summary(datos_completos))
##   Price_Median     Area_Median      Room_Median      Density     
##  Min.   : 503.1   Min.   : 47.00   Min.   :2.00   Min.   : 1315  
##  1st Qu.:1324.7   1st Qu.: 55.10   1st Qu.:2.00   1st Qu.: 2551  
##  Median :2167.8   Median : 68.50   Median :2.25   Median : 3297  
##  Mean   :2488.7   Mean   : 69.17   Mean   :2.49   Mean   : 4549  
##  3rd Qu.:3127.2   3rd Qu.: 79.25   3rd Qu.:3.00   3rd Qu.: 5394  
##  Max.   :8590.9   Max.   :100.00   Max.   :3.00   Max.   :20618  
##    Population           GDP_PC          URate           MIR2010      
##  Min.   :  401389   Min.   : 2535   Min.   : 1.700   Min.   : 0.920  
##  1st Qu.:  784105   1st Qu.:19825   1st Qu.: 5.805   1st Qu.: 2.993  
##  Median : 1132700   Median :30700   Median : 7.990   Median : 3.815  
##  Mean   : 1885327   Mean   :32034   Mean   : 9.545   Mean   : 7.026  
##  3rd Qu.: 1704168   3rd Qu.:43000   3rd Qu.:11.908   3rd Qu.: 8.615  
##  Max.   :12915158   Max.   :86464   Max.   :23.124   Max.   :26.200
# 4. EXPLORATORY PLOTS
cat("\n=== EXPLORATORY PLOTS ===\n")
## 
## === EXPLORATORY PLOTS ===
# Histograms, one per variable
plot_list <- list()
for (i in seq_len(ncol(datos_completos))) {
  var_name <- colnames(datos_completos)[i]
  p <- ggplot(datos_completos, aes(x = .data[[var_name]])) +
    geom_histogram(fill = "lightblue", color = "black", bins = 10) +
    labs(title = paste("Distribution of", var_name), x = var_name) +
    theme_minimal()
  plot_list[[i]] <- p
}
grid.arrange(grobs = plot_list, ncol = 3)

# Scatter plots of each regressor against Price_Median
scatter_list <- list()
vars_ind <- setdiff(vars_disponibles, "Price_Median")
for (var in vars_ind) {
  p <- ggplot(datos_completos, aes(x = .data[[var]], y = Price_Median)) +
    geom_point(alpha = 0.6, color = "blue") +
    geom_smooth(method = "lm", color = "red") +
    labs(title = paste("Price_Median vs", var), x = var, y = "Price_Median") +
    theme_minimal()
  scatter_list[[var]] <- p
}
grid.arrange(grobs = scatter_list, ncol = 3)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'

# 5. REGRESSION MODEL
cat("\n=== MULTIPLE LINEAR REGRESSION MODEL ===\n")
## 
## === MULTIPLE LINEAR REGRESSION MODEL ===
# Build the model formula from the list of regressors
formula_modelo <- as.formula(paste("Price_Median ~", paste(vars_ind, collapse = " + ")))
modelo <- lm(formula_modelo, data = datos_completos)

# Detailed results
summary_modelo <- summary(modelo)
print(summary_modelo)
## 
## Call:
## lm(formula = formula_modelo, data = datos_completos)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1839.24  -694.31    52.88   490.60  2596.36 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.487e+03  1.250e+03   1.190 0.241245    
## Area_Median -1.699e+01  1.702e+01  -0.998 0.324133    
## Room_Median  1.940e+02  5.286e+02   0.367 0.715582    
## Density      1.844e-01  4.591e-02   4.016 0.000253 ***
## Population   1.173e-04  6.807e-05   1.724 0.092485 .  
## GDP_PC       4.082e-02  1.112e-02   3.671 0.000706 ***
## URate       -4.559e+01  3.445e+01  -1.323 0.193250    
## MIR2010     -3.396e+01  3.510e+01  -0.967 0.339108    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1074 on 40 degrees of freedom
## Multiple R-squared:  0.6265, Adjusted R-squared:  0.5612 
## F-statistic: 9.586 on 7 and 40 DF,  p-value: 6.164e-07
# 6. RESULTS SUMMARY TABLE
cat("\n=== RESULTS SUMMARY TABLE ===\n")
## 
## === RESULTS SUMMARY TABLE ===
resultados_tabla <- data.frame(
  Variable = c("Intercept", vars_ind),
  Coeficiente = round(coef(modelo), 4),
  Error_Std = round(summary_modelo$coefficients[, 2], 4),
  t_value = round(summary_modelo$coefficients[, 3], 3),
  p_value = round(summary_modelo$coefficients[, 4], 4)
)
print(resultados_tabla)
##                Variable Coeficiente Error_Std t_value p_value
## (Intercept)   Intercept   1487.1538 1250.2043   1.190  0.2412
## Area_Median Area_Median    -16.9894   17.0183  -0.998  0.3241
## Room_Median Room_Median    193.9879  528.6325   0.367  0.7156
## Density         Density      0.1844    0.0459   4.016  0.0003
## Population   Population      0.0001    0.0001   1.724  0.0925
## GDP_PC           GDP_PC      0.0408    0.0111   3.671  0.0007
## URate             URate    -45.5866   34.4492  -1.323  0.1933
## MIR2010         MIR2010    -33.9564   35.0971  -0.967  0.3391
# 7. BASIC INTERPRETATION
cat("\n=== INTERPRETATION OF RESULTS ===\n")
## 
## === INTERPRETATION OF RESULTS ===
# Goodness of fit
cat("GOODNESS OF FIT:\n")
## GOODNESS OF FIT:
cat("- R-squared:", round(summary_modelo$r.squared, 4), "(", round(summary_modelo$r.squared * 100, 1), "%)\n")
## - R-squared: 0.6265 ( 62.7 %)
cat("- Adjusted R-squared:", round(summary_modelo$adj.r.squared, 4), "\n")
## - Adjusted R-squared: 0.5612
cat("- F statistic:", round(summary_modelo$fstatistic[1], 2), "\n")
## - F statistic: 9.59
# Significant variables
vars_significativas <- rownames(summary_modelo$coefficients)[summary_modelo$coefficients[, 4] < 0.05]
vars_significativas <- vars_significativas[vars_significativas != "(Intercept)"]
cat("\nSIGNIFICANT VARIABLES (p < 0.05):", 
    if(length(vars_significativas) > 0) paste(vars_significativas, collapse = ", ") else "None", "\n")
## 
## SIGNIFICANT VARIABLES (p < 0.05): Density, GDP_PC
# 8. MODEL EQUATION
cat("\n=== MODEL EQUATION ===\n")
## 
## === MODEL EQUATION ===
cat("Price_Median =", round(coef(modelo)[1], 2))
## Price_Median = 1487.15
for(i in 2:length(coef(modelo))) {
  cat(" +", round(coef(modelo)[i], 2), "*", names(coef(modelo))[i])
}
##  + -16.99 * Area_Median + 193.99 * Room_Median + 0.18 * Density + 0 * Population + 0.04 * GDP_PC + -45.59 * URate + -33.96 * MIR2010
cat("\n")
cat("\n=== ANALYSIS COMPLETE ===\n")
## 
## === ANALYSIS COMPLETE ===
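
The printed equation above shows "+ -16.99" for negative coefficients and rounds the Population coefficient to 0. A minimal sketch of a sign-aware formatter follows. The helper name `equation_text` is hypothetical, and it is demonstrated on the built-in `mtcars` data rather than on this report's dataset.

```r
# Hypothetical helper: render a fitted lm() as an equation with explicit signs.
# Uses significant digits (format = "g") so small coefficients do not round to 0.
equation_text <- function(model, digits = 4) {
  b <- coef(model)
  lhs <- all.vars(formula(model))[1]          # name of the dependent variable
  slopes <- b[-1]
  pieces <- sprintf("%s %s * %s",
                    ifelse(slopes < 0, "-", "+"),
                    formatC(abs(slopes), digits = digits, format = "g"),
                    names(slopes))
  paste(lhs, "=",
        formatC(b[1], digits = digits, format = "g"),
        paste(pieces, collapse = " "))
}

# Demonstration on built-in data (not the report's dataset)
fit <- lm(mpg ~ wt + hp, data = mtcars)
eq <- equation_text(fit)
cat(eq, "\n")
```

Applied to the model in this section, the call would be `equation_text(modelo)`.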

4: Ordinary Least Squares (OLS) estimation

# 1. LOAD LIBRARIES
library(knitr)
library(broom)

# 2. OLS ESTIMATION
cat("=== ORDINARY LEAST SQUARES (OLS) ESTIMATION ===\n")
## === ORDINARY LEAST SQUARES (OLS) ESTIMATION ===
# Independent variables (everything except the dependent variable, Price_Median)
variables_independientes <- setdiff(vars_disponibles, "Price_Median")

# Build the model formula (same specification as `modelo` in section 3)
formula_mco <- as.formula(paste("Price_Median ~", paste(variables_independientes, collapse = " + ")))
cat("Model formula:", deparse(formula_mco), "\n\n")
## Model formula: Price_Median ~ Area_Median + Room_Median + Density + Population +      GDP_PC + URate + MIR2010
# OLS estimation
modelo_mco <- lm(formula_mco, data = datos_completos)

# 3. OLS ESTIMATION RESULTS
cat("=== OLS MODEL RESULTS ===\n")
## === OLS MODEL RESULTS ===
# Full summary
resumen_mco <- summary(modelo_mco)
print(resumen_mco)
## 
## Call:
## lm(formula = formula_mco, data = datos_completos)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1839.24  -694.31    52.88   490.60  2596.36 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.487e+03  1.250e+03   1.190 0.241245    
## Area_Median -1.699e+01  1.702e+01  -0.998 0.324133    
## Room_Median  1.940e+02  5.286e+02   0.367 0.715582    
## Density      1.844e-01  4.591e-02   4.016 0.000253 ***
## Population   1.173e-04  6.807e-05   1.724 0.092485 .  
## GDP_PC       4.082e-02  1.112e-02   3.671 0.000706 ***
## URate       -4.559e+01  3.445e+01  -1.323 0.193250    
## MIR2010     -3.396e+01  3.510e+01  -0.967 0.339108    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1074 on 40 degrees of freedom
## Multiple R-squared:  0.6265, Adjusted R-squared:  0.5612 
## F-statistic: 9.586 on 7 and 40 DF,  p-value: 6.164e-07
# 4. FORMATTED OLS COEFFICIENT TABLE
cat("\n=== OLS COEFFICIENT TABLE ===\n")
## 
## === OLS COEFFICIENT TABLE ===
tabla_coeficientes <- tidy(modelo_mco)
# Significance stars, following the same thresholds as summary.lm()
tabla_coeficientes$significancia <- ifelse(tabla_coeficientes$p.value < 0.001, "***",
                                          ifelse(tabla_coeficientes$p.value < 0.01, "**",
                                                ifelse(tabla_coeficientes$p.value < 0.05, "*",
                                                      ifelse(tabla_coeficientes$p.value < 0.1, ".", ""))))

print(tabla_coeficientes)
## # A tibble: 8 × 6
##   term           estimate    std.error statistic  p.value significancia
##   <chr>             <dbl>        <dbl>     <dbl>    <dbl> <chr>        
## 1 (Intercept) 1487.       1250.            1.19  0.241    ""           
## 2 Area_Median  -17.0        17.0          -0.998 0.324    ""           
## 3 Room_Median  194.        529.            0.367 0.716    ""           
## 4 Density        0.184       0.0459        4.02  0.000253 "***"        
## 5 Population     0.000117    0.0000681     1.72  0.0925   "."          
## 6 GDP_PC         0.0408      0.0111        3.67  0.000706 "***"        
## 7 URate        -45.6        34.4          -1.32  0.193    ""           
## 8 MIR2010      -34.0        35.1          -0.967 0.339    ""
# Show the formatted table
kable(tabla_coeficientes, 
      col.names = c("Variable", "Coefficient", "Std. Error", "t statistic", "p-value", "Significance"),
      digits = 4,
      caption = "OLS estimation results")
OLS estimation results

Variable     Coefficient  Std. Error  t statistic  p-value  Significance
(Intercept)    1487.1538   1250.2043       1.1895   0.2412
Area_Median     -16.9894     17.0183      -0.9983   0.3241
Room_Median     193.9879    528.6325       0.3670   0.7156
Density           0.1844      0.0459       4.0160   0.0003  ***
Population        0.0001      0.0001       1.7237   0.0925  .
GDP_PC            0.0408      0.0111       3.6711   0.0007  ***
URate           -45.5866     34.4492      -1.3233   0.1933
MIR2010         -33.9564     35.0971      -0.9675   0.3391
# 5. GOODNESS-OF-FIT STATISTICS
cat("\n=== GOODNESS OF FIT ===\n")
## 
## === GOODNESS OF FIT ===
bondad_ajuste <- glance(modelo_mco)
cat("R-squared:", round(bondad_ajuste$r.squared, 4), "(", round(bondad_ajuste$r.squared * 100, 1), "%)\n")
## R-squared: 0.6265 ( 62.7 %)
cat("Adjusted R-squared:", round(bondad_ajuste$adj.r.squared, 4), "\n")
## Adjusted R-squared: 0.5612
cat("F statistic:", round(bondad_ajuste$statistic, 2), "\n")
## F statistic: 9.59
cat("Model p-value:", bondad_ajuste$p.value, "\n")
## Model p-value: 6.16443e-07
cat("Number of observations:", bondad_ajuste$nobs, "\n")
## Number of observations: 48
# 6. OLS MODEL EQUATION
cat("\n=== EQUATION ESTIMATED BY OLS ===\n")
## 
## === EQUATION ESTIMATED BY OLS ===
coeficientes <- coef(modelo_mco)
cat("Price_Median =", round(coeficientes[1], 4))
## Price_Median = 1487.154
for(i in 2:length(coeficientes)) {
  signo <- ifelse(coeficientes[i] >= 0, "+", "")
  cat(" ", signo, round(coeficientes[i], 4), "*", names(coeficientes)[i])
}
##    -16.9894 * Area_Median  + 193.9879 * Room_Median  + 0.1844 * Density  + 1e-04 * Population  + 0.0408 * GDP_PC   -45.5866 * URate   -33.9564 * MIR2010
cat("\n")
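
For reference, the same fitted equation can be written with explicit signs, using the coefficients reported in the tables above. The hat marks the fitted value; only Density and GDP_PC are significant at the 5% level.

```latex
\begin{aligned}
\widehat{\text{Price\_Median}} ={}& 1487.15
  - 16.99\,\text{Area\_Median}
  + 193.99\,\text{Room\_Median}
  + 0.1844\,\text{Density} \\
 & + 0.000117\,\text{Population}
  + 0.0408\,\text{GDP\_PC}
  - 45.59\,\text{URate}
  - 33.96\,\text{MIR2010}
\end{aligned}
```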
# 7. INTERPRETATION OF COEFFICIENTS
cat("\n=== INTERPRETATION OF COEFFICIENTS ===\n")
## 
## === INTERPRETATION OF COEFFICIENTS ===
for(i in 2:length(coeficientes)) {
  var_name <- names(coeficientes)[i]
  coef_value <- coeficientes[i]
  p_value <- tabla_coeficientes$p.value[tabla_coeficientes$term == var_name]
  
  cat("•", var_name, ":")
  if(p_value < 0.05) {
    cat(" SIGNIFICANT")
  } else {
    cat(" Not significant")
  }
  cat(" - Coefficient:", round(coef_value, 4))
  # Report the sign of the effect only for significant coefficients
  if(p_value < 0.05) {
    if(coef_value > 0) {
      cat(" (positive effect)")
    } else {
      cat(" (negative effect)")
    }
  }
  cat(" | p-value:", round(p_value, 4), "\n")
}
## • Area_Median : Not significant - Coefficient: -16.9894 | p-value: 0.3241 
## • Room_Median : Not significant - Coefficient: 193.9879 | p-value: 0.7156 
## • Density : SIGNIFICANT - Coefficient: 0.1844 (positive effect) | p-value: 3e-04 
## • Population : Not significant - Coefficient: 1e-04 | p-value: 0.0925 
## • GDP_PC : SIGNIFICANT - Coefficient: 0.0408 (positive effect) | p-value: 7e-04 
## • URate : Not significant - Coefficient: -45.5866 | p-value: 0.1933 
## • MIR2010 : Not significant - Coefficient: -33.9564 | p-value: 0.3391
# 8. OLS PREDICTIONS (in-sample fitted values)
cat("\n=== OLS PREDICTIONS ===\n")
## 
## === OLS PREDICTIONS ===
predicciones <- predict(modelo_mco)
residuos <- residuals(modelo_mco)

# Show the first 5 fitted values next to the observed values
cat("First 5 observations (actual vs predicted):\n")
## First 5 observations (actual vs predicted):
ejemplo_pred <- data.frame(
  Real = datos_completos$Price_Median[1:5],
  Predicho = round(predicciones[1:5], 2),
  Residuo = round(residuos[1:5], 2)
)
print(ejemplo_pred)
##       Real Predicho  Residuo
## 1 3419.973  3397.91    22.07
## 2 2064.004  1955.39   108.62
## 3 3140.000  4570.87 -1430.87
## 5 2150.338  2020.00   130.33
## 6 2357.143  3138.36  -781.22
# 9. PREDICTED VS ACTUAL PLOT WITH LEGEND
# Data frame for the plot
df_grafico <- data.frame(Real = datos_completos$Price_Median, Predicho = predicciones)

# One-row data frame so the identity line is drawn once and mapped into the legend
# (passing slope/intercept as arguments makes geom_abline() ignore aes())
linea_identidad <- data.frame(pendiente = 1, ordenada = 0)

# Build the plot
ggplot(df_grafico, aes(x = Real, y = Predicho)) +
  geom_point(aes(color = "Predictions"), alpha = 0.6) +
  geom_abline(data = linea_identidad,
              aes(slope = pendiente, intercept = ordenada, color = "Identity line (y = x)"),
              linetype = "dashed") +
  scale_color_manual(name = "Elements",
                     values = c("Predictions" = "blue", "Identity line (y = x)" = "red")) +
  labs(title = "OLS predictions vs actual values",
       x = "Actual Price_Median",
       y = "OLS-predicted value") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"),
        legend.position = "bottom")

CASE 2: A PRIORI ANALYSIS

Income inequality is one of the main economic and social challenges in many countries. High inequality not only limits opportunities for human development, it also generates social tension and undermines sustainable economic growth. It is therefore essential to identify the factors that explain differences in the Gini coefficient across European cities. This study empirically analyzes the relationship between inequality and several key macroeconomic variables, using data for 50 cities. Specifically, it evaluates the effect of GDP per capita (GDP_PC), the unemployment rate (URate), inflation in 2010 (Inflation2010), and the long-term interest rate (LRIR) on the Gini coefficient.

Before estimating the model, it is useful to state the relationships that theory leads us to expect. According to economic development theory, a higher GDP per capita is usually associated with improvements in well-being and in the state's capacity to implement redistributive policies, which would tend to reduce inequality (Todaro & Smith, 2015). However, the Kuznets curve hypothesis (Kuznets, 1955) holds that this relationship may be nonlinear: inequality rises in the early stages of development and tends to fall in more advanced stages, tracing an inverted U. As for unemployment, economic theory holds that it widens social gaps by reducing the income of the most vulnerable households, which translates into higher inequality (Keynes, 1936; Stiglitz, 2012). Likewise, inflation has a regressive effect on the income distribution, because it erodes the purchasing power of low-income households more severely (Friedman, 1976).
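
The Kuznets hypothesis can be made concrete with a quadratic income term. The specification below is only an illustrative sketch: it is not the model estimated in this report, and the coefficient symbols $\beta_0,\dots,\beta_5$ and the error term $\varepsilon_i$ are notation assumed here.

```latex
\text{Gini}_i = \beta_0 + \beta_1\,\text{GDP\_PC}_i + \beta_2\,\text{GDP\_PC}_i^{2}
              + \beta_3\,\text{URate}_i + \beta_4\,\text{Inflation2010}_i
              + \beta_5\,\text{LRIR}_i + \varepsilon_i
```

Under this specification an inverted U corresponds to $\beta_1 > 0$ and $\beta_2 < 0$, with the turning point at $\text{GDP\_PC}^{*} = -\beta_1 / (2\beta_2)$. The other expectations stated above correspond to $\beta_3 > 0$ and $\beta_4 > 0$.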

2: Loading the dataset

library(readxl)
base_taller <- read_excel("C:/Users/carab/Downloads/Rosi_files/dp2015-13_Dataset.xls")
View(base_taller)

3: Missing-value analysis

library(dplyr)
faltantes <- colSums(is.na(base_taller)) %>%
                 as.data.frame()

faltantes
##                                   .
## City                              0
## City_Eng                          0
## City_Short                        0
## NAds                              0
## Price_Median                      0
## Price_Mean                        0
## Area_Median                       0
## Area_Mean                         0
## Room_Median                       0
## Room_Mean                         0
## Euro_area                         0
## EU                                0
## Population                        0
## City_Area                         0
## Density                           0
## GDP_PC                            0
## GDP_PC_PPS                        0
## GDP_PC2008                        0
## GDP_PC2009                        0
## GDP_PC2010                        0
## Gini                              0
## HOR                               0
## Kearny_GCI2010                    0
## LRIR                              0
## Inflation2010                     0
## Inflation2011                     6
## URate                             0
## MIR2009                           2
## MIR2010                           2
## Mortgage_PC2010                   2
## Tppl1989_1993                    17
## Tppl1994_1998                    19
## Tppl1999_2002                    14
## Tppl2003_2006                    14
## Tppl2007_2009                    20
## GDP_PC_PPS1989_1993              50
## GDP_PC_PPS1994_1998              16
## GDP_PC_PPS1999_2002              16
## GDP_PC_PPS2003_2006              15
## GDP_PC_PPS2007_2009              15
## CITIES                           13
## DemoDepend1989_1993              22
## DemoDepend1994_1998              25
## DemoDepend1999_2002              19
## DemoDepend2003_2006              17
## DemoDepend2007_2009              23
## DemoODepend1989_1993             17
## DemoODepend1994_1998             21
## DemoODepend1999_2002             14
## DemoODepend2003_2006             13
## DemoODepend2007_2009             20
## Thh1989_1993                     20
## Thh1994_1998                     27
## Thh1999_2002                     13
## Thh2003_2006                     23
## Thh2007_2009                     27
## Ndwe1989_1993                    35
## Ndwe1994_1998                    43
## Ndwe1999_2002                    13
## Ndwe2003_2006                    19
## Ndwe2007_2009                    29
## Napart1989_1993                  46
## Napart1994_1998                  50
## Napart1999_2002                  16
## Napart2003_2006                  27
## Napart2007_2009                  38
## Nhouse1989_1993                  45
## Nhouse1994_1998                  50
## Nhouse1999_2002                  16
## Nhouse2003_2006                  27
## Nhouse2007_2009                  38
## Aphouse1989_1993                 39
## Aphouse1994_1998                 38
## Aphouse1999_2002                 31
## Aphouse2003_2006                 27
## Aphouse2007_2009                 31
## ApapartMincome1989_1993          44
## ApapartMincome1994_1998          44
## ApapartMincome1999_2002          38
## ApapartMincome2003_2006          34
## ApapartMincome2007_2009          37
## Arent-housing1989_1993           50
## Arent-housing1994_1998           50
## Arent-housing1999_2002           34
## Arent-housing2003_2006           33
## Arent-housing2007_2009           33
## Alarea1989_1993                  32
## Alarea1994_1998                  38
## Alarea1999_2002                  19
## Alarea2003_2006                  30
## Alarea2007_2009                  35
## Phh-owndwe1989_1993              23
## Phh-owndwe1994_1998              39
## Phh-owndwe1999_2002              13
## Phh-owndwe2003_2006              32
## Phh-owndwe2007_2009              38
## Urate1989_1993                   26
## Urate1994_1998                   29
## Urate1999_2002                   14
## Urate2003_2006                   15
## Urate2007_2009                   32
## Ncom-head1989_1993               50
## Ncom-head1994_1998               50
## Ncom-head1999_2002               28
## Ncom-head2003_2006               27
## Ncom-head2007_2009               38
## Mhhincome1989_1993               42
## Mhhincome1994_1998               35
## Mhhincome1999_2002               25
## Mhhincome2003_2006               33
## Mhhincome2007_2009               36
## Ahhincome1989_1993               50
## Ahhincome1994_1998               50
## Ahhincome1999_2002               35
## Ahhincome2003_2006               32
## Ahhincome2007_2009               31
## RQ1-Q4earn1989_1993              50
## RQ1-Q4earn1994_1998              50
## RQ1-Q4earn1999_2002              32
## RQ1-Q4earn2003_2006              36
## RQ1-Q4earn2007_2009              36
## HhincomeQ21989_1993              50
## HhincomeQ21994_1998              50
## HhincomeQ21999_2002              32
## HhincomeQ22003_2006              37
## HhincomeQ22007_2009              36
## HhincomeQ31989_1993              50
## HhincomeQ31994_1998              50
## HhincomeQ31999_2002              32
## HhincomeQ32003_2006              37
## HhincomeQ32007_2009              36
## Tlandarea1989_1993               26
## Tlandarea1994_1998               23
## Tlandarea1999_2002               17
## Tlandarea2003_2006               21
## Tlandarea2007_2009               25
## Larea-leisure1989_1993           48
## Larea-leisure1994_1998           47
## Larea-leisure1999_2002           32
## Larea-leisure2003_2006           33
## Larea-leisure2007_2009           38
## Parea-housing1989_1993           49
## Parea-housing1994_1998           43
## Parea-housing1999_2002           31
## Parea-housing2003_2006           34
## Parea-housing2007_2009           38
## Ppldens1989_1993                 26
## Ppldens1994_1998                 26
## Ppldens1999_2002                 18
## Ppldens2003_2006                 21
## Ppldens2007_2009                 25
## Netresidens-housingarea1989_1993 49
## Netresidens-housingarea1994_1998 46
## Netresidens-housingarea1999_2002 31
## Netresidens-housingarea2003_2006 33
## Netresidens-housingarea2007_2009 38
## APApartment1989_1993             44
## APApartment1994_1998             43
## APApartment1999_2002             34
## APApartment2003_2006             23
## APApartment2007_2009             29
## Temp_Jul1989_1993                47
## Temp_Jul1994_1998                39
## Temp_Jul1999_2002                20
## Temp_Jul2003_2006                16
## Temp_Jul2007_2009                22
## Temp_Jan1989_1993                47
## Temp_Jan1994_1998                41
## Temp_Jan1999_2002                20
## Temp_Jan2003_2006                16
## Temp_Jan2007_2009                22
## Latitude_deg                      0
## Latitude_min                      0
## Latitude_sec                      0
## Longitude_deg                     0
## Longitude_min                     0
## Longitude_sec                     0
## Lat                               0
## Lon                               0
## Liveability2010                  19
## Mercer_Qual_Liv2011              25
## Mercer_Per_Safe2011              34
## ECM2010                          23
## ECM_Cost2010                     23
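
With this many columns, the raw count listing above is hard to scan. A minimal sketch of a more compact summary follows: it keeps only columns that have missing values, adds the share of rows affected, and sorts from most to least incomplete. The helper name `missing_summary` is hypothetical, and it is demonstrated on the built-in `airquality` data rather than on `base_taller`.

```r
# Hypothetical helper: one row per column with at least one NA,
# with the count and percentage of missing rows, sorted in decreasing order.
missing_summary <- function(df) {
  n_na <- colSums(is.na(df))
  out <- data.frame(
    variable    = names(n_na),
    n_missing   = as.integer(n_na),
    pct_missing = round(100 * as.numeric(n_na) / nrow(df), 1),
    row.names   = NULL
  )
  out <- out[out$n_missing > 0, ]
  out[order(-out$n_missing), ]
}

# Demonstration on built-in data (not the report's dataset)
res <- missing_summary(airquality)
print(res)
```

For this report the call would be `missing_summary(base_taller)`. Since the variables used in the Case 2 model (Gini, GDP_PC, URate, Inflation2010, LRIR) show zero missing values in the listing above, they would not appear in that summary.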

4: Correlation matrix

matrizcorrelacionc2<-cor(base_taller[,c("Gini", "GDP_PC", "URate", "Inflation2010", "LRIR")])
matrizcorrelacionc2
##                      Gini       GDP_PC        URate Inflation2010       LRIR
## Gini           1.00000000 -0.319858630  0.033550827     0.3429263  0.2610788
## GDP_PC        -0.31985863  1.000000000 -0.005821582    -0.7171235 -0.6958277
## URate          0.03355083 -0.005821582  1.000000000    -0.3245602 -0.3644162
## Inflation2010  0.34292631 -0.717123490 -0.324560182     1.0000000  0.8707909
## LRIR           0.26107879 -0.695827720 -0.364416158     0.8707909  1.0000000
library(corrplot)
corrplot(matrizcorrelacionc2, method = "color", addCoef.col = "purple", number.cex = 0.7, tl.cex = 0.7)

5: Multiple linear regression model

options(scipen = 999)
modeloc2 <- lm(formula = Gini ~ GDP_PC+URate+Inflation2010+LRIR, data = base_taller)
summary(modeloc2)
## 
## Call:
## lm(formula = Gini ~ GDP_PC + URate + Inflation2010 + LRIR, data = base_taller)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.8153  -3.6291  -0.6696   4.1949  17.1762 
## 
## Coefficients:
##                  Estimate  Std. Error t value    Pr(>|t|)    
## (Intercept)   31.13242294  5.22374614   5.960 0.000000358 ***
## GDP_PC        -0.00003527  0.00006692  -0.527       0.601    
## URate          0.13884875  0.19251006   0.721       0.474    
## Inflation2010  0.78030534  0.54424155   1.434       0.159    
## LRIR          -0.22850224  0.45269340  -0.505       0.616    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.109 on 45 degrees of freedom
## Multiple R-squared:  0.1488, Adjusted R-squared:  0.07316 
## F-statistic: 1.967 on 4 and 45 DF,  p-value: 0.1158

6: Scatter plots with trend line

# Scatter plot: Gini vs GDP per capita (2010)
plot(base_taller$GDP_PC, base_taller$Gini,
     main = "Scatter plot: Gini vs GDP per capita",
     xlab = "GDP per capita",
     ylab = "Gini",
     pch = 19, col = "yellow")

# Simple regression model, stored under its own name so that the
# multiple regression held in modeloc2 is not overwritten
modelo_gdp <- lm(Gini ~ GDP_PC, data = base_taller)

# Add the trend line to the plot
abline(modelo_gdp, col = "blue", lwd = 2)

# Build the fitted equation for the legend; %+.5f prints the sign of the slope
eq <- sprintf("Gini = %.3f %+.5f * GDP_PC",
              coef(modelo_gdp)[1], coef(modelo_gdp)[2])

# Add the legend
legend("topright", legend = eq, col = "blue", lwd = 2, bty = "n")

# Scatter plot: Gini vs inflation (2010)
plot(base_taller$Inflation2010, base_taller$Gini,
     main = "Scatter plot: Gini vs inflation (2010)",
     xlab = "Inflation (2010)",
     ylab = "Gini",
     pch = 19, col = "blue")

# Simple regression model
modelo_inf <- lm(Gini ~ Inflation2010, data = base_taller)

# Add the trend line
abline(modelo_inf, col = "red", lwd = 2)

# Add the legend
legend("topleft",
       legend = c("Data", "Linear trend"),
       col = c("blue", "red"),
       pch = c(19, NA),   # filled circle for the points, no symbol for the line
       lty = c(NA, 1),    # no line for the points, solid line for the regression
       bty = "n")

# Scatter plot: Gini vs long-term interest rate
plot(base_taller$LRIR, base_taller$Gini,
     main = "Scatter plot: Gini vs long-term interest rate",
     xlab = "Long-term interest rate",
     ylab = "Gini",
     pch = 19, col = "pink")

# Simple regression model
modelo_lr <- lm(Gini ~ LRIR, data = base_taller)

# Add the trend line
abline(modelo_lr, col = "red", lwd = 2)

# Add the legend (point color matches the pink used in plot())
legend("topleft",
       legend = c("Data", "Linear trend"),
       col = c("pink", "red"),
       pch = c(19, NA),
       lty = c(NA, 1),
       bty = "n")

7: Boxplot of the unemployment rate

boxplot(base_taller$URate,
        main = "Boxplot of unemployment rate",
        ylab = "Unemployment rate",
        col = "lightgreen",
        border = "purple")
# Add the legend
legend("topright",
       legend = c("Distribution of the unemployment rate"),
       fill = "lightgreen",
       border = "purple",
       bty = "n")

8: Maps

library(sf)
## Linking to GEOS 3.13.0, GDAL 3.10.1, PROJ 9.5.1; sf_use_s2() is TRUE
library(rnaturalearth)
library(rnaturalearthdata)
## 
## Adjuntando el paquete: 'rnaturalearthdata'
## The following object is masked from 'package:rnaturalearth':
## 
##     countries110
library(ggplot2)
library(dplyr)

# Map of inflation (2010)

# Build the base map of Europe
europa_mapa <- ne_countries(scale = "medium", returnclass = "sf") %>%
  filter(region_un == "Europe")

# Convert the data to an sf object (requires Lon and Lat columns in base_taller)
ciudades_sf <- st_as_sf(base_taller, coords = c("Lon", "Lat"), crs = 4326)

# Plot
ggplot() +
  geom_sf(data = europa_mapa, fill = "grey90", color = "pink") +
  geom_sf(data = ciudades_sf, aes(color = Inflation2010, size = Inflation2010)) +
  scale_color_gradient(low = "lightblue", high = "purple") +
  theme_minimal() +
  labs(title = "Inflation in European cities") +
  coord_sf(xlim = c(-10, 45), ylim = c(37, 65), expand = FALSE)

# Map of inequality (Gini)

ggplot() +
  geom_sf(data = europa_mapa, fill = "grey90", color = "white") +
  geom_sf(data = ciudades_sf, aes(color = Gini, size = Gini)) +
  scale_color_gradient(low = "lightblue", high = "darkblue") +
  theme_minimal() +
  labs(title = "Inequality (Gini) in European cities") +
  coord_sf(xlim = c(-10, 45), ylim = c(37, 65), expand = FALSE)

```
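
The correlation matrix in section 4 reports a correlation of 0.87 between Inflation2010 and LRIR, and none of the coefficients in the section 5 regression is individually significant. One possible follow-up, not part of the original analysis, is to check for multicollinearity with variance inflation factors (VIF). The sketch below computes them with base R only, through auxiliary regressions of each predictor on the others. The helper name `vif_manual` is hypothetical, and the demonstration uses R's built-in `mtcars` data so that the snippet runs on its own.

```r
# Hypothetical helper (not from the original report): variance inflation
# factors computed from auxiliary regressions, base R only.
vif_manual <- function(model) {
  # Design matrix of the fitted model, without the intercept column
  X <- model.matrix(model)[, -1, drop = FALSE]
  sapply(colnames(X), function(v) {
    # Regress predictor v on all remaining predictors and take R-squared
    others <- X[, colnames(X) != v, drop = FALSE]
    r2 <- summary(lm(X[, v] ~ others))$r.squared
    # VIF_j = 1 / (1 - R_j^2)
    1 / (1 - r2)
  })
}

# Demonstration on the built-in mtcars data (an assumption made so the
# example is self-contained; it is unrelated to base_taller)
demo_model <- lm(mpg ~ disp + hp + wt, data = mtcars)
vif_demo <- vif_manual(demo_model)
print(round(vif_demo, 2))

# Applied to this report, assuming base_taller is loaded as in section 5:
# vif_manual(lm(Gini ~ GDP_PC + URate + Inflation2010 + LRIR, data = base_taller))
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign that a coefficient's standard error is being inflated by collinearity. If that turned out to be the case for Inflation2010 and LRIR, dropping one of the two would be one option to consider.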