Este ejercicio consiste en realizar un análisis exploratorio sobre un dataset de vehiculos Toyota Corolla con 1436 instancias y 37 atributos.

El objetivo es conseguir un modelo de regresión lineal con un resultado aceptable interpretando cada paso del razonamiento necesario para llegar al objetivo.

  • El atributo objetivo es Price.

1: Carga de Librerias

library(fastDummies) 
library(car)
library(corrplot) 
library(mctest) 
library(tidyverse) 
library(leaps)
library(glmnet)
library(MASS)
library(reshape)
library(caret)
library(ggrepel)

2: Lectura del DataSet

raw_data = read.csv("ToyotaCorolla.csv") 
raw_data$Id = NULL
raw_data$Model = NULL
  • El atributo ID no es representativo de cada instancia, decido no considerarlo en el modelo.
  • El atributo Model no es representativo de cada instancia, decido no considerarlo en el modelo.

3: DataSet

raw_data

4: Estructura del DataSet

str(raw_data)
'data.frame':   1436 obs. of  35 variables:
 $ Price           : int  13500 13750 13950 14950 13750 12950 16900 18600 21500 12950 ...
 $ Age_08_04       : int  23 23 24 26 30 32 27 30 27 23 ...
 $ Mfg_Month       : int  10 10 9 7 3 1 6 3 6 10 ...
 $ Mfg_Year        : int  2002 2002 2002 2002 2002 2002 2002 2002 2002 2002 ...
 $ KM              : int  46986 72937 41711 48000 38500 61000 94612 75889 19700 71138 ...
 $ Fuel_Type       : Factor w/ 3 levels "CNG","Diesel",..: 2 2 2 2 2 2 2 2 3 2 ...
 $ HP              : int  90 90 90 90 90 90 90 90 192 69 ...
 $ Met_Color       : int  1 1 1 0 0 0 1 1 0 0 ...
 $ Automatic       : int  0 0 0 0 0 0 0 0 0 0 ...
 $ cc              : int  2000 2000 2000 2000 2000 2000 2000 2000 1800 1900 ...
 $ Doors           : int  3 3 3 3 3 3 3 3 3 3 ...
 $ Cylinders       : int  4 4 4 4 4 4 4 4 4 4 ...
 $ Gears           : int  5 5 5 5 5 5 5 5 5 5 ...
 $ Quarterly_Tax   : int  210 210 210 210 210 210 210 210 100 185 ...
 $ Weight          : int  1165 1165 1165 1165 1170 1170 1245 1245 1185 1105 ...
 $ Mfr_Guarantee   : int  0 0 1 1 1 0 0 1 0 0 ...
 $ BOVAG_Guarantee : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Guarantee_Period: int  3 3 3 3 3 3 3 3 3 3 ...
 $ ABS             : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Airbag_1        : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Airbag_2        : int  1 1 1 1 1 1 1 1 0 1 ...
 $ Airco           : int  0 1 0 0 1 1 1 1 1 1 ...
 $ Automatic_airco : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Boardcomputer   : int  1 1 1 1 1 1 1 1 0 1 ...
 $ CD_Player       : int  0 1 0 0 0 0 0 1 0 0 ...
 $ Central_Lock    : int  1 1 0 0 1 1 1 1 1 0 ...
 $ Powered_Windows : int  1 0 0 0 1 1 1 1 1 0 ...
 $ Power_Steering  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Radio           : int  0 0 0 0 0 0 0 0 1 0 ...
 $ Mistlamps       : int  0 0 0 0 1 1 0 0 0 0 ...
 $ Sport_Model     : int  0 0 0 0 0 0 1 0 0 0 ...
 $ Backseat_Divider: int  1 1 1 1 1 1 1 1 0 1 ...
 $ Metallic_Rim    : int  0 0 0 0 0 0 0 0 1 0 ...
 $ Radio_cassette  : int  0 0 0 0 0 0 0 0 1 0 ...
 $ Tow_Bar         : int  0 0 0 0 0 0 0 0 0 0 ...

5: Resumen del DataSet

summary(raw_data)
     Price         Age_08_04       Mfg_Month         Mfg_Year          KM        
 Min.   : 4350   Min.   : 1.00   Min.   : 1.000   Min.   :1998   Min.   :     1  
 1st Qu.: 8450   1st Qu.:44.00   1st Qu.: 3.000   1st Qu.:1998   1st Qu.: 43000  
 Median : 9900   Median :61.00   Median : 5.000   Median :1999   Median : 63390  
 Mean   :10731   Mean   :55.95   Mean   : 5.549   Mean   :2000   Mean   : 68533  
 3rd Qu.:11950   3rd Qu.:70.00   3rd Qu.: 8.000   3rd Qu.:2001   3rd Qu.: 87021  
 Max.   :32500   Max.   :80.00   Max.   :12.000   Max.   :2004   Max.   :243000  
  Fuel_Type          HP          Met_Color        Automatic             cc       
 CNG   :  17   Min.   : 69.0   Min.   :0.0000   Min.   :0.00000   Min.   : 1300  
 Diesel: 155   1st Qu.: 90.0   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.: 1400  
 Petrol:1264   Median :110.0   Median :1.0000   Median :0.00000   Median : 1600  
               Mean   :101.5   Mean   :0.6748   Mean   :0.05571   Mean   : 1577  
               3rd Qu.:110.0   3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.: 1600  
               Max.   :192.0   Max.   :1.0000   Max.   :1.00000   Max.   :16000  
     Doors         Cylinders     Gears       Quarterly_Tax        Weight    
 Min.   :2.000   Min.   :4   Min.   :3.000   Min.   : 19.00   Min.   :1000  
 1st Qu.:3.000   1st Qu.:4   1st Qu.:5.000   1st Qu.: 69.00   1st Qu.:1040  
 Median :4.000   Median :4   Median :5.000   Median : 85.00   Median :1070  
 Mean   :4.033   Mean   :4   Mean   :5.026   Mean   : 87.12   Mean   :1072  
 3rd Qu.:5.000   3rd Qu.:4   3rd Qu.:5.000   3rd Qu.: 85.00   3rd Qu.:1085  
 Max.   :5.000   Max.   :4   Max.   :6.000   Max.   :283.00   Max.   :1615  
 Mfr_Guarantee    BOVAG_Guarantee  Guarantee_Period      ABS        
 Min.   :0.0000   Min.   :0.0000   Min.   : 3.000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.: 3.000   1st Qu.:1.0000  
 Median :0.0000   Median :1.0000   Median : 3.000   Median :1.0000  
 Mean   :0.4095   Mean   :0.8955   Mean   : 3.815   Mean   :0.8134  
 3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.: 3.000   3rd Qu.:1.0000  
 Max.   :1.0000   Max.   :1.0000   Max.   :36.000   Max.   :1.0000  
    Airbag_1         Airbag_2          Airco        Automatic_airco  
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000  
 1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000  
 Median :1.0000   Median :1.0000   Median :1.0000   Median :0.00000  
 Mean   :0.9708   Mean   :0.7228   Mean   :0.5084   Mean   :0.05641  
 3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.00000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000  
 Boardcomputer      CD_Player       Central_Lock    Powered_Windows
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.000  
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000  
 Median :0.0000   Median :0.0000   Median :1.0000   Median :1.000  
 Mean   :0.2946   Mean   :0.2187   Mean   :0.5801   Mean   :0.562  
 3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.000  
 Power_Steering       Radio          Mistlamps      Sport_Model    
 Min.   :0.0000   Min.   :0.0000   Min.   :0.000   Min.   :0.0000  
 1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000  
 Median :1.0000   Median :0.0000   Median :0.000   Median :0.0000  
 Mean   :0.9777   Mean   :0.1462   Mean   :0.257   Mean   :0.3001  
 3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.000   3rd Qu.:1.0000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.000   Max.   :1.0000  
 Backseat_Divider  Metallic_Rim    Radio_cassette      Tow_Bar      
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :1.0000   Median :0.0000   Median :0.0000   Median :0.0000  
 Mean   :0.7702   Mean   :0.2047   Mean   :0.1455   Mean   :0.2779  
 3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:1.0000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  

Observaciones
* El valor maximo de cc es de 16000, demasiado alto considerando la media.
* El atributo Fuel_Type es de tipo char, requerira un proceso de encoding.
* El valor de Cylinder es constante.

6: Análisis exploratorio

6a: Price

par(mfrow=c(1,2))
boxplot(raw_data$Price,main = "Precio Vehiculos Toyota Corolla",
        ylab = "Precio ($)", notch = TRUE)

hist(raw_data$Price,main = "Precio Vehiculos Toyota Corolla")

  • Se observa que la mediana del precio de los vehiculos, ronda los $10000 aproximadamente.
  • Se presentan valores atípicos con valores superiores a las $20000 y valores menores a $7000 aproximadamente.

6b: Mfg_Year

par(mfrow=c(1,2))

boxplot(raw_data$Age_08_04,main = "Age_08_04")
barplot(table(as.factor(raw_data$Age_08_04)),main = "Age_08_04")

  • El atributo “Age_08_04” presenta valores outliers correspondientes a vehiculos nuevos cuya antiguedad es 0.

6c: Mfg_Year

par(mfrow=c(1,2))

boxplot(raw_data$Mfg_Year,main = "Año de Mfg_Year")
barplot(table(as.factor(raw_data$Mfg_Year)),main = "Mfg_Year")

6d: KM

par(mfrow=c(1,2))

boxplot(raw_data$KM,main = "KM",
        ylab = "KM", notch = TRUE)

hist(raw_data$KM,main = "KM")

  • El atributo “KM” presenta valores outliers. Destaco sobretodo un conjunto de valores superiores a los 200000.

6e: HP

par(mfrow=c(1,2))

boxplot(raw_data$HP,main = "HP",
        ylab = "HP", notch = FALSE)

barplot(table(as.factor(raw_data$HP)),main = "HP")

  • El atributo “HP” presenta un valor outlier superior a 180. Según una investigacion realizada en medios externos al dataset, el valor si corresponde a un modelo de Toyota Corolla.

6f: CC

par(mfrow=c(1,2))

boxplot(raw_data$cc,main = "Cilindrada",
        ylab = "CC", notch = FALSE)

barplot(table(as.factor(raw_data$cc)),main = "Cilindrada")

  • El atributo “CC” presenta un outlier notorio superior a 16000, este valor esta fuera del contexto de un vehiculo toyota corolla, donde los valores promedio rondan el 100.

6g: Quarterly_Tax

par(mfrow=c(1,2))

boxplot(raw_data$Quarterly_Tax, main="Quarterly_Tax")

hist(raw_data$Quarterly_Tax)

  • El atributo “Quarterly_Tax” presenta outliers para valores superiores a 150 y valores menores a 50, sobre una mediana de 70 aproximadamente.

6h: Weight

par(mfrow=c(1,2))

boxplot(raw_data$Weight, main="Peso(KG)")
hist(raw_data$Weight)

  • El atributo “Weight” presenta outliers para valores superiores a 1150 sobre una mediana de 1050 aproximadamente.

6i: Fueltype y Radio Cassete

lbls <- c("0: No tiene", "1: Tiene")

par(mfrow=c(1,2)) 

barplot(table(as.factor(raw_data$Fuel_Type)), main="Fuel_Type")
pie(x = table(raw_data$Radio_cassette), labels = lbls, main="Radio Cassete")

6j: Metallic Rim y Backseat Divider


par(mfrow=c(1,2)) 

pie(x = table(raw_data$Metallic_Rim), labels = lbls,  main="Metallic Rim")
pie(x = table(raw_data$Backseat_Divider) , labels = lbls,  main="Backseat_Divider")

6k: Mistlamp, Radio y Sport Model

par(mfrow=c(1,3))

pie(x = table(raw_data$Mistlamps) , labels = lbls,  main="Mistlamps")
pie(x = table(raw_data$Radio), labels = lbls,  main="Radio")
pie(x = table(raw_data$Sport_Model), labels = lbls,  main="Sport_Model")

6l: Central Lock, CD Player y BoardComputer


par(mfrow=c(1,3))

pie(x = table(raw_data$Central_Lock), labels = lbls, main="Central_Lock")
pie(x = table(raw_data$CD_Player), labels = lbls, main="CD_Player")
pie(x = table(raw_data$Boardcomputer), labels = lbls, main="Boardcomputer")

6m: Airco, Airbag_2 y Airbag_1

par(mfrow=c(1,3))

pie(x = table(raw_data$Airco), labels = lbls,  main="Airco")
pie(x = table(raw_data$Airbag_2), labels = lbls,  main="Airbag_2")
pie(x = table(raw_data$Airbag_1), labels = lbls,main="Airbag_1")

6n: Guarantee Period y Automatic Airco


par(mfrow=c(1,2))

barplot(table(as.factor(raw_data$Guarantee_Period)), main="Guarantee_Period") 
pie(x = table(raw_data$Automatic_airco), labels = lbls,  main="Automatic_airco")

6o: MFR Guarantee, Gears y BOVAG Guarantee


par(mfrow=c(1,3))

pie(x = table(raw_data$Mfr_Guarantee), labels = lbls, main="Mfr_Guarantee")
barplot(table(as.factor(raw_data$Gears)), main="Gears")
pie(x = table(raw_data$BOVAG_Guarantee), labels = lbls, main="BOVAG_Guarantee")

6p: Doors, Automatic y ABS

par(mfrow=c(1,3))

barplot(table(as.factor(raw_data$Doors)), main="Doors")
pie(x = table(raw_data$Automatic),labels = lbls, main="Automatic")
pie(x = table(raw_data$ABS),labels = lbls, main="ABS")

7: Estudio de Variable Objetivo “Price”

7a: Distribucion de Price

hist(raw_data$Price, col="blue", breaks = 60, freq = F)
lines(density(raw_data$Price), col = "red", lwd=2)
rug(raw_data$Price)

7b: Relacion Price vs Resto de Predictores

plot(Price~., data=raw_data,col="blue")

8: Seleccion y modificacion de Variables

8a: Estudio de correlacion

corrplot(cor(dplyr::select(raw_data, -c("Fuel_Type"))), type="upper", method="pie")
the standard deviation is zero

8b: Indicadores de Colinealidad

imcdiag(dplyr::select(raw_data, -c("Price", "Fuel_Type")), raw_data$Price)

Call:
imcdiag(x = dplyr::select(raw_data, -c("Price", "Fuel_Type")), 
    y = raw_data$Price)


All Individual Multicollinearity Diagnostics Result

                     VIF    TOL        Wi        Fi Leamer    CVIF Klein
Age_08_04            Inf 0.0000       Inf       Inf 0.0000    -Inf     1
Mfg_Month            Inf 0.0000       Inf       Inf 0.0000    -Inf     1
Mfg_Year             Inf 0.0000       Inf       Inf 0.0000    -Inf     1
KM                1.8647 0.5363   37.9120   39.1629 0.7323 -0.0556     0
HP                1.6023 0.6241   26.4092   27.2805 0.7900 -0.0478     0
Met_Color         1.1398 0.8773    6.1308    6.3331 0.9367 -0.0340     0
Automatic         1.0805 0.9255    3.5309    3.6474 0.9620 -0.0322     0
cc                1.2170 0.8217    9.5136    9.8275 0.9065 -0.0363     0
Doors             1.2554 0.7966   11.1979   11.5674 0.8925 -0.0374     0
Cylinders         2.0001 0.5000   43.8472   45.2939     NA -0.0596     0
Gears             1.2599 0.7937   11.3958   11.7718 0.8909 -0.0376     0
Quarterly_Tax     2.7801 0.3597   78.0447   80.6197 0.5998 -0.0829     0
Weight            3.2137 0.3112   97.0581  100.2604 0.5578 -0.0958     0
Mfr_Guarantee     1.1983 0.8345    8.6960    8.9830 0.9135 -0.0357     0
BOVAG_Guarantee   1.3712 0.7293   16.2736   16.8105 0.8540 -0.0409     0
Guarantee_Period  1.5381 0.6502   23.5907   24.3691 0.8063 -0.0458     0
ABS               2.2232 0.4498   53.6282   55.3976 0.6707 -0.0663     0
Airbag_1          1.5989 0.6254   26.2590   27.1253 0.7908 -0.0477     0
Airbag_2          3.0894 0.3237   91.6074   94.6299 0.5689 -0.0921     0
Airco             1.8361 0.5446   36.6558   37.8652 0.7380 -0.0547     0
Automatic_airco   1.7419 0.5741   32.5257   33.5988 0.7577 -0.0519     0
Boardcomputer     2.6305 0.3802   71.4869   73.8455 0.6166 -0.0784     0
CD_Player         1.5503 0.6450   24.1291   24.9253 0.8031 -0.0462     0
Central_Lock      4.5886 0.2179  157.3372  162.5283 0.4668 -0.1368     0
Powered_Windows   4.6078 0.2170  158.1800  163.3990 0.4659 -0.1373     0
Power_Steering    1.5557 0.6428   24.3626   25.1664 0.8018 -0.0464     0
Radio            62.3090 0.0160 2688.0184 2776.7064 0.1267 -1.8572     1
Mistlamps         2.0750 0.4819   47.1335   48.6886 0.6942 -0.0618     0
Sport_Model       1.4606 0.6846   20.1946   20.8609 0.8274 -0.0435     0
Backseat_Divider  2.5379 0.3940   67.4257   69.6504 0.6277 -0.0756     0
Metallic_Rim      1.3400 0.7463   14.9077   15.3995 0.8639 -0.0399     0
Radio_cassette   62.1291 0.0161 2680.1284 2768.5561 0.1269 -1.8518     1
Tow_Bar           1.1445 0.8738    6.3348    6.5438 0.9347 -0.0341     0

1 --> COLLINEARITY is detected by the test 
0 --> COLLINEARITY is not detected by the test

HP , Automatic , Doors , Guarantee_Period , ABS , Boardcomputer , Central_Lock , Powered_Windows , Power_Steering , Mistlamps , Backseat_Divider , coefficient(s) are non-significant may be due to multicollinearity

R-square of y on all x: 0.9058 

* use method argument to check which regressors may be the reason of collinearity
===================================
  • Mediante el calculo de VIF y haciendo principal hincapíe en los atributos cuyo valor de VIF es muy superior a 5, es posible que exista colinealidad vinculado con los atributos Age_08_04,Mfg_Month, Mfg_Year, Radio y Radio_cassette

9: Limpieza de Datos

9a: Tratamiento de Outliers

  • El atributo CC presenta un outlier(valor atípico) de CC = 16000. No es un valor coherente con el contexto de un vehiculo Toyota Corolla. Considero que probablemente fue un error y supongo que se agrego un cero de más, siendo el valor correcto 1600.
  • El atributo Guarantee_Period presenta un outlier de Guarantee_Period = 13.Considero que probablemente fue un error y decido imputar el valor 12.
  • El atributo KM presenta outliers para valores superiores a 150000 y valores menos a 10000. Si bien son valores coherentes dentro del contexto de vehiculos, al estar la mayor concentracion de los vehiculos dentro del intervalo (10000,120000), decido recortar el dataSet, reduciendo su tamaño un 12%.
  • El atributo Age_08_04 presenta outliers para instancias cuyo Age_08_04 es menor a 25.
  • El atributo HP presenta outliers para valores superiores a 120 y valores menores a 80. Si bien son valores coherentes para algunos modelos de Toyota Corrolla, son casos particulares no representativos del conjunto de vehiculos.
  • Tras realizar estas operaciones, el dataset restante posee un 80% de las instancias del dataset original.
  • El atributo Price posee outliers para valores menores a 6500 y valores superiores a 15000. Por este motivo considero el intervalo (6500,14500), dado que los valores excluidos no son representativos de la mayoria del conjunto.

clean_data <- raw_data

clean_data = clean_data %>%  mutate(cc = ifelse(cc == 16000, 1600, cc))
clean_data = clean_data %>%  mutate(Guarantee_Period = ifelse(Guarantee_Period == 13, 12, Guarantee_Period))

clean_data = clean_data %>% filter((KM > 20000 & KM < 130000))
clean_data = clean_data %>% filter(Weight < 1075 & Weight > 1010)

clean_data$Cylinders = NULL
  • El dataset restante posee un 72% de las instancias del dataset original.

10: Selección de Variables

10a: Best SubSet Selection

bss.model <- regsubsets(Price~., data = clean_data, nvmax = dim(clean_data)[2])
2  linear dependencies found
Reordering variables and trying again:
summary(bss.model)
Subset selection object
Call: regsubsets.formula(Price ~ ., data = clean_data, nvmax = dim(clean_data)[2])
34 Variables  (and intercept)
                 Forced in Forced out
Age_08_04            FALSE      FALSE
Mfg_Month            FALSE      FALSE
KM                   FALSE      FALSE
Fuel_TypePetrol      FALSE      FALSE
HP                   FALSE      FALSE
Met_Color            FALSE      FALSE
Automatic            FALSE      FALSE
cc                   FALSE      FALSE
Doors                FALSE      FALSE
Gears                FALSE      FALSE
Quarterly_Tax        FALSE      FALSE
Weight               FALSE      FALSE
Mfr_Guarantee        FALSE      FALSE
BOVAG_Guarantee      FALSE      FALSE
Guarantee_Period     FALSE      FALSE
ABS                  FALSE      FALSE
Airbag_1             FALSE      FALSE
Airbag_2             FALSE      FALSE
Airco                FALSE      FALSE
Automatic_airco      FALSE      FALSE
Boardcomputer        FALSE      FALSE
CD_Player            FALSE      FALSE
Central_Lock         FALSE      FALSE
Powered_Windows      FALSE      FALSE
Power_Steering       FALSE      FALSE
Radio                FALSE      FALSE
Mistlamps            FALSE      FALSE
Sport_Model          FALSE      FALSE
Backseat_Divider     FALSE      FALSE
Metallic_Rim         FALSE      FALSE
Radio_cassette       FALSE      FALSE
Tow_Bar              FALSE      FALSE
Mfg_Year             FALSE      FALSE
Fuel_TypeDiesel      FALSE      FALSE
1 subsets of each size up to 32
Selection Algorithm: exhaustive
          Age_08_04 Mfg_Month Mfg_Year KM  Fuel_TypeDiesel Fuel_TypePetrol HP 
1  ( 1 )  " "       " "       "*"      " " " "             " "             " "
2  ( 1 )  " "       " "       "*"      " " " "             " "             " "
3  ( 1 )  " "       " "       "*"      "*" " "             " "             " "
4  ( 1 )  " "       " "       "*"      "*" " "             " "             " "
5  ( 1 )  " "       " "       "*"      "*" " "             " "             " "
6  ( 1 )  " "       " "       "*"      "*" " "             " "             " "
7  ( 1 )  " "       " "       "*"      "*" " "             " "             " "
8  ( 1 )  " "       " "       "*"      "*" " "             " "             " "
9  ( 1 )  " "       " "       "*"      "*" " "             " "             " "
10  ( 1 ) " "       " "       "*"      "*" " "             "*"             " "
11  ( 1 ) " "       " "       "*"      "*" " "             "*"             " "
12  ( 1 ) " "       " "       "*"      "*" " "             "*"             " "
13  ( 1 ) " "       " "       "*"      "*" " "             "*"             " "
14  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
15  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
16  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
17  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
18  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
19  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
20  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
21  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
22  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
23  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
24  ( 1 ) " "       " "       "*"      "*" " "             "*"             "*"
25  ( 1 ) " "       "*"       "*"      "*" " "             "*"             "*"
26  ( 1 ) " "       "*"       "*"      "*" " "             "*"             "*"
27  ( 1 ) " "       "*"       "*"      "*" " "             "*"             "*"
28  ( 1 ) " "       "*"       "*"      "*" " "             "*"             "*"
29  ( 1 ) "*"       " "       "*"      "*" " "             "*"             "*"
          Met_Color Automatic cc  Doors Gears Quarterly_Tax Weight Mfr_Guarantee
1  ( 1 )  " "       " "       " " " "   " "   " "           " "    " "          
2  ( 1 )  " "       " "       " " " "   " "   " "           "*"    " "          
3  ( 1 )  " "       " "       " " " "   " "   " "           "*"    " "          
4  ( 1 )  " "       " "       " " " "   " "   " "           "*"    " "          
5  ( 1 )  " "       " "       " " " "   " "   " "           "*"    " "          
6  ( 1 )  " "       " "       " " " "   " "   " "           "*"    " "          
7  ( 1 )  " "       " "       " " " "   " "   " "           "*"    "*"          
8  ( 1 )  " "       "*"       " " " "   " "   " "           "*"    "*"          
9  ( 1 )  " "       "*"       " " " "   " "   " "           "*"    "*"          
10  ( 1 ) " "       "*"       " " " "   " "   " "           "*"    "*"          
11  ( 1 ) " "       "*"       " " " "   " "   " "           "*"    "*"          
12  ( 1 ) " "       "*"       " " " "   " "   " "           "*"    "*"          
13  ( 1 ) " "       "*"       " " " "   " "   " "           "*"    "*"          
14  ( 1 ) " "       " "       "*" " "   "*"   " "           "*"    "*"          
15  ( 1 ) " "       " "       "*" " "   "*"   " "           "*"    "*"          
16  ( 1 ) " "       " "       "*" " "   "*"   " "           "*"    "*"          
17  ( 1 ) " "       " "       "*" " "   "*"   " "           "*"    "*"          
18  ( 1 ) " "       "*"       "*" " "   "*"   " "           "*"    "*"          
19  ( 1 ) " "       "*"       "*" " "   "*"   "*"           "*"    "*"          
20  ( 1 ) " "       "*"       "*" " "   "*"   "*"           "*"    "*"          
21  ( 1 ) "*"       "*"       "*" " "   "*"   "*"           "*"    "*"          
22  ( 1 ) "*"       "*"       "*" " "   "*"   "*"           "*"    "*"          
23  ( 1 ) "*"       "*"       "*" " "   "*"   "*"           "*"    "*"          
24  ( 1 ) "*"       "*"       "*" " "   "*"   "*"           "*"    "*"          
25  ( 1 ) "*"       "*"       "*" " "   "*"   "*"           "*"    "*"          
26  ( 1 ) "*"       "*"       "*" " "   "*"   "*"           "*"    "*"          
27  ( 1 ) "*"       "*"       "*" " "   "*"   "*"           "*"    "*"          
28  ( 1 ) "*"       "*"       "*" " "   "*"   "*"           "*"    "*"          
29  ( 1 ) "*"       "*"       "*" " "   "*"   "*"           "*"    "*"          
          BOVAG_Guarantee Guarantee_Period ABS Airbag_1 Airbag_2 Airco
1  ( 1 )  " "             " "              " " " "      " "      " "  
2  ( 1 )  " "             " "              " " " "      " "      " "  
3  ( 1 )  " "             " "              " " " "      " "      " "  
4  ( 1 )  " "             " "              " " " "      " "      "*"  
5  ( 1 )  " "             " "              " " " "      " "      "*"  
6  ( 1 )  " "             "*"              " " " "      " "      "*"  
7  ( 1 )  " "             "*"              " " " "      " "      "*"  
8  ( 1 )  " "             "*"              " " " "      " "      "*"  
9  ( 1 )  " "             "*"              " " " "      " "      "*"  
10  ( 1 ) " "             "*"              " " " "      " "      "*"  
11  ( 1 ) " "             "*"              " " " "      " "      "*"  
12  ( 1 ) "*"             "*"              " " " "      " "      "*"  
13  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
14  ( 1 ) "*"             "*"              " " " "      " "      "*"  
15  ( 1 ) "*"             "*"              " " " "      " "      "*"  
16  ( 1 ) "*"             "*"              " " " "      " "      "*"  
17  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
18  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
19  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
20  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
21  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
22  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
23  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
24  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
25  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
26  ( 1 ) "*"             "*"              " " " "      "*"      "*"  
27  ( 1 ) "*"             "*"              "*" " "      "*"      "*"  
28  ( 1 ) "*"             "*"              "*" "*"      "*"      "*"  
29  ( 1 ) "*"             "*"              "*" "*"      "*"      "*"  
          Automatic_airco Boardcomputer CD_Player Central_Lock Powered_Windows
1  ( 1 )  " "             " "           " "       " "          " "            
2  ( 1 )  " "             " "           " "       " "          " "            
3  ( 1 )  " "             " "           " "       " "          " "            
4  ( 1 )  " "             " "           " "       " "          " "            
5  ( 1 )  " "             " "           " "       "*"          " "            
6  ( 1 )  " "             " "           " "       " "          " "            
7  ( 1 )  " "             " "           " "       " "          " "            
8  ( 1 )  " "             " "           " "       " "          " "            
9  ( 1 )  " "             " "           " "       "*"          " "            
10  ( 1 ) " "             " "           " "       "*"          " "            
11  ( 1 ) " "             " "           " "       "*"          " "            
12  ( 1 ) " "             " "           " "       "*"          " "            
13  ( 1 ) " "             " "           " "       " "          "*"            
14  ( 1 ) " "             " "           " "       "*"          " "            
15  ( 1 ) "*"             " "           " "       "*"          " "            
16  ( 1 ) "*"             " "           " "       "*"          " "            
17  ( 1 ) "*"             " "           " "       "*"          " "            
18  ( 1 ) "*"             " "           " "       "*"          " "            
19  ( 1 ) "*"             " "           " "       "*"          " "            
20  ( 1 ) "*"             "*"           " "       "*"          " "            
21  ( 1 ) "*"             " "           "*"       "*"          " "            
22  ( 1 ) "*"             "*"           "*"       "*"          " "            
23  ( 1 ) "*"             "*"           "*"       "*"          " "            
24  ( 1 ) "*"             "*"           "*"       "*"          " "            
25  ( 1 ) "*"             "*"           "*"       "*"          " "            
26  ( 1 ) "*"             "*"           "*"       "*"          "*"            
27  ( 1 ) "*"             "*"           "*"       "*"          "*"            
28  ( 1 ) "*"             "*"           "*"       "*"          "*"            
29  ( 1 ) "*"             "*"           "*"       "*"          "*"            
          Power_Steering Radio Mistlamps Sport_Model Backseat_Divider Metallic_Rim
1  ( 1 )  " "            " "   " "       " "         " "              " "         
2  ( 1 )  " "            " "   " "       " "         " "              " "         
3  ( 1 )  " "            " "   " "       " "         " "              " "         
4  ( 1 )  " "            " "   " "       " "         " "              " "         
5  ( 1 )  " "            " "   " "       " "         " "              " "         
6  ( 1 )  " "            " "   "*"       " "         " "              " "         
7  ( 1 )  " "            " "   "*"       " "         " "              " "         
8  ( 1 )  " "            " "   "*"       " "         " "              " "         
9  ( 1 )  " "            " "   "*"       " "         " "              " "         
10  ( 1 ) " "            " "   "*"       " "         " "              " "         
11  ( 1 ) " "            " "   "*"       " "         " "              " "         
12  ( 1 ) " "            " "   "*"       " "         " "              " "         
13  ( 1 ) " "            " "   "*"       " "         " "              " "         
14  ( 1 ) " "            " "   " "       "*"         " "              " "         
15  ( 1 ) " "            " "   " "       "*"         " "              " "         
16  ( 1 ) " "            " "   " "       "*"         " "              " "         
17  ( 1 ) " "            " "   " "       "*"         " "              " "         
18  ( 1 ) " "            " "   " "       "*"         " "              " "         
19  ( 1 ) " "            " "   " "       "*"         " "              " "         
20  ( 1 ) " "            " "   " "       "*"         " "              " "         
21  ( 1 ) " "            " "   " "       "*"         " "              " "         
22  ( 1 ) " "            " "   " "       "*"         " "              " "         
23  ( 1 ) "*"            " "   " "       "*"         " "              " "         
24  ( 1 ) "*"            " "   "*"       "*"         " "              " "         
25  ( 1 ) "*"            " "   "*"       "*"         " "              " "         
26  ( 1 ) "*"            " "   "*"       "*"         " "              " "         
27  ( 1 ) "*"            " "   "*"       "*"         " "              " "         
28  ( 1 ) "*"            " "   "*"       "*"         " "              " "         
29  ( 1 ) "*"            "*"   "*"       "*"         " "              " "         
          Radio_cassette Tow_Bar
1  ( 1 )  " "            " "    
2  ( 1 )  " "            " "    
3  ( 1 )  " "            " "    
4  ( 1 )  " "            " "    
5  ( 1 )  " "            " "    
6  ( 1 )  " "            " "    
7  ( 1 )  " "            " "    
8  ( 1 )  " "            " "    
9  ( 1 )  " "            " "    
10  ( 1 ) " "            " "    
11  ( 1 ) " "            "*"    
12  ( 1 ) " "            "*"    
13  ( 1 ) " "            "*"    
14  ( 1 ) " "            "*"    
15  ( 1 ) " "            "*"    
16  ( 1 ) "*"            "*"    
17  ( 1 ) "*"            "*"    
18  ( 1 ) "*"            "*"    
19  ( 1 ) "*"            "*"    
20  ( 1 ) "*"            "*"    
21  ( 1 ) "*"            "*"    
22  ( 1 ) "*"            "*"    
23  ( 1 ) "*"            "*"    
24  ( 1 ) "*"            "*"    
25  ( 1 ) "*"            "*"    
26  ( 1 ) "*"            "*"    
27  ( 1 ) "*"            "*"    
28  ( 1 ) "*"            "*"    
29  ( 1 ) "*"            "*"    
 [ reached getOption("max.print") -- omitted 3 rows ]
  • Mediante Best Subset se definen todos los modelos posibles para los 34 predictores del modelo mediante combinaciones de los mismos y comparando su valor de RSS obtenido.

10a.1: Cantidad de Variables propuestas por el modelo

which.max(summary(bss.model)$adjr2)
[1] 24
  • El modelo con un mejor valor de R ajustado utiliza 24 predictores.

10a.2: R Ajustado vs Cant. Predictores

p <- ggplot(data = data.frame(n_predictores = 1:32,
                              R_ajustado = summary(bss.model)$adjr2),
            aes(x = n_predictores, y = R_ajustado)) +
    geom_line() +
    geom_point()

p <- p + geom_point(aes(
                    x = n_predictores[which.max(summary(bss.model)$adjr2)],
                    y = R_ajustado[which.max(summary(bss.model)$adjr2)]),
                    colour = "red", size = 3)
p <- p +  scale_x_continuous(breaks = c(0:34)) + 
          theme_bw() +
          labs(title = 'R2_ajustado vs número de predictores', 
               x =  'número predictores')
p

10a.3: Coeficientes: Predictores propuestos por el modelo

coef(object = bss.model, id = which.max(summary(bss.model)$adjr2))
     (Intercept)               KM  Fuel_TypePetrol        Met_Color 
   -2.529947e+06    -1.179339e-02    -2.636556e+03    -1.288967e+02 
       Automatic               cc            Doors            Gears 
    5.137662e+02     2.755593e-01     7.045670e+01     4.497871e+02 
          Weight    Mfr_Guarantee  BOVAG_Guarantee Guarantee_Period 
    1.073410e+01     2.288838e+02     3.509444e+02     5.796841e+01 
             ABS         Airbag_1  Automatic_airco    Boardcomputer 
   -1.306206e+02    -1.155433e+02     9.962195e+02     1.253951e+00 
       CD_Player     Central_Lock  Powered_Windows   Power_Steering 
    1.386529e+02     1.808711e+02     1.862831e+02    -2.503139e+02 
       Mistlamps Backseat_Divider     Metallic_Rim         Mfg_Year 
    3.676494e+02    -3.742877e+01     5.032722e+01     1.264728e+03 
 Fuel_TypeDiesel 
    0.000000e+00 
  • El modelo obtenido mediante el uso del metodo Best Subset propone usar los predictores expuestos en el recuadro superior.

10a.4: R Ajustado: Predictores propuestos por el modelo

summary(bss.model)$adjr2[24]
[1] 0.774939
  • El valor de R Ajustado obtenido para el uso de 24 predictores es de 0.77. Pero, el uso de demasiados predictores añade complejidad al modelo.

10a.5: R Ajustado: Cant. de Predictores Alternativas

summary(bss.model)$adjr2[9]
[1] 0.7579834
summary(bss.model)$adjr2[6]
[1] 0.7469467
  • Reduciendo la cantidad predictores, disminuye el valor de R ajustado, pero disminuye la complejidad del modelo lo que permite una respuesta más general frente a nuevos datos de entrada.

10a.6: CP, AIC Y BIC

bss.model.sum = summary(bss.model)

par(mfrow = c(2, 2))
plot(bss.model.sum$rss, xlab = "Numero de Predictores", ylab = "RSS", type = "b")

plot(bss.model.sum$adjr2, xlab = "Numero de Predictores", ylab = "R Ajustada", type = "b")
best_adj_r2 = which.max(bss.model.sum$adjr2)
points(best_adj_r2, bss.model.sum$adjr2[best_adj_r2],
       col = "red",cex = 2, pch = 20)

plot(bss.model.sum$cp, xlab = "Numero de Predictores", ylab = "Cp", type = 'b')
best_cp = which.min(bss.model.sum$cp)
points(best_cp, bss.model.sum$cp[best_cp], 
       col = "red", cex = 2, pch = 20)

plot(bss.model.sum$bic, xlab = "Numero de Predictores", ylab = "BIC", type = 'b')
best_bic = which.min(bss.model.sum$bic)
points(best_bic, bss.model.sum$bic[best_bic], 
       col = "red", cex = 2, pch = 20)

  • En las gráficas anteriores BIC, CP, R ajustada se observan los puntos cuyo valores son mínimos y que no concordancia entre ellos para seleccionar ubivocamente la cantidad de predictores a emplear en un modelo. Sin embargo, puede observarse puntualmente en cada grafico, que existen sutiles mejoras (casi imperceptibles) entre algunas cantidades de predictores.

  • Se destaca que no tiene importancia destacar el punto minimo de RSS, dado que al por la naturaleza del modelo, a mayor cantidad de variable, menor será su valor.

10a.7: Coeficientes: Modelo Seleccionado

coef(object = bss.model, id = 7)
     (Intercept)               KM  Fuel_TypePetrol  BOVAG_Guarantee 
    1.309274e+04    -2.326143e-02    -4.063483e+03     4.683116e+02 
Guarantee_Period         Airbag_1    Boardcomputer Backseat_Divider 
    1.697532e+02     2.091082e+02     1.959851e+03     3.807397e+02 
  • Mediante la comparación de las graficas, se opta por un modelo con 7 predictores.

10a.8: Validacion Cruzada

set.seed(10)

index <- createDataPartition(clean_data$Price, p = 0.7, list = FALSE)

data.train <- clean_data[index, ]
data.test <- clean_data[-index, ]
set.seed(10)
model.fwd <- regsubsets(Price ~., data = data.train, nvmax = 7)
3  linear dependencies found
Reordering variables and trying again:
summary(model.fwd)
Subset selection object
Call: regsubsets.formula(Price ~ ., data = data.train, nvmax = 7)
34 Variables  (and intercept)
                 Forced in Forced out
Age_08_04            FALSE      FALSE
Mfg_Month            FALSE      FALSE
KM                   FALSE      FALSE
Fuel_TypePetrol      FALSE      FALSE
HP                   FALSE      FALSE
Met_Color            FALSE      FALSE
Automatic            FALSE      FALSE
cc                   FALSE      FALSE
Doors                FALSE      FALSE
Gears                FALSE      FALSE
Quarterly_Tax        FALSE      FALSE
Weight               FALSE      FALSE
Mfr_Guarantee        FALSE      FALSE
BOVAG_Guarantee      FALSE      FALSE
Guarantee_Period     FALSE      FALSE
ABS                  FALSE      FALSE
Airbag_1             FALSE      FALSE
Airbag_2             FALSE      FALSE
Airco                FALSE      FALSE
Automatic_airco      FALSE      FALSE
Boardcomputer        FALSE      FALSE
CD_Player            FALSE      FALSE
Central_Lock         FALSE      FALSE
Powered_Windows      FALSE      FALSE
Power_Steering       FALSE      FALSE
Radio                FALSE      FALSE
Mistlamps            FALSE      FALSE
Sport_Model          FALSE      FALSE
Backseat_Divider     FALSE      FALSE
Metallic_Rim         FALSE      FALSE
Tow_Bar              FALSE      FALSE
Mfg_Year             FALSE      FALSE
Fuel_TypeDiesel      FALSE      FALSE
Radio_cassette       FALSE      FALSE
1 subsets of each size up to 8
Selection Algorithm: exhaustive
         Age_08_04 Mfg_Month Mfg_Year KM  Fuel_TypeDiesel Fuel_TypePetrol HP 
1  ( 1 ) " "       " "       "*"      " " " "             " "             " "
2  ( 1 ) " "       " "       "*"      " " " "             " "             " "
3  ( 1 ) " "       " "       "*"      "*" " "             " "             " "
4  ( 1 ) " "       " "       "*"      "*" " "             " "             " "
5  ( 1 ) " "       " "       "*"      "*" " "             " "             " "
6  ( 1 ) " "       " "       "*"      "*" " "             " "             " "
7  ( 1 ) " "       " "       "*"      "*" " "             " "             " "
8  ( 1 ) " "       " "       "*"      "*" " "             " "             " "
         Met_Color Automatic cc  Doors Gears Quarterly_Tax Weight Mfr_Guarantee
1  ( 1 ) " "       " "       " " " "   " "   " "           " "    " "          
2  ( 1 ) " "       " "       " " " "   " "   " "           "*"    " "          
3  ( 1 ) " "       " "       " " " "   " "   " "           "*"    " "          
4  ( 1 ) " "       " "       " " " "   " "   " "           "*"    " "          
5  ( 1 ) " "       " "       " " " "   " "   " "           "*"    " "          
6  ( 1 ) " "       " "       " " " "   " "   " "           "*"    "*"          
7  ( 1 ) " "       " "       " " " "   " "   " "           "*"    "*"          
8  ( 1 ) " "       " "       " " " "   " "   " "           "*"    "*"          
         BOVAG_Guarantee Guarantee_Period ABS Airbag_1 Airbag_2 Airco
1  ( 1 ) " "             " "              " " " "      " "      " "  
2  ( 1 ) " "             " "              " " " "      " "      " "  
3  ( 1 ) " "             " "              " " " "      " "      " "  
4  ( 1 ) " "             " "              " " " "      " "      " "  
5  ( 1 ) " "             " "              " " " "      " "      "*"  
6  ( 1 ) " "             " "              " " " "      " "      "*"  
7  ( 1 ) " "             "*"              " " " "      " "      "*"  
8  ( 1 ) " "             "*"              " " " "      " "      "*"  
         Automatic_airco Boardcomputer CD_Player Central_Lock Powered_Windows
1  ( 1 ) " "             " "           " "       " "          " "            
2  ( 1 ) " "             " "           " "       " "          " "            
3  ( 1 ) " "             " "           " "       " "          " "            
4  ( 1 ) " "             " "           " "       " "          "*"            
5  ( 1 ) " "             " "           " "       " "          "*"            
6  ( 1 ) " "             " "           " "       " "          "*"            
7  ( 1 ) " "             " "           " "       " "          "*"            
8  ( 1 ) " "             " "           " "       " "          "*"            
         Power_Steering Radio Mistlamps Sport_Model Backseat_Divider Metallic_Rim
1  ( 1 ) " "            " "   " "       " "         " "              " "         
2  ( 1 ) " "            " "   " "       " "         " "              " "         
3  ( 1 ) " "            " "   " "       " "         " "              " "         
4  ( 1 ) " "            " "   " "       " "         " "              " "         
5  ( 1 ) " "            " "   " "       " "         " "              " "         
6  ( 1 ) " "            " "   " "       " "         " "              " "         
7  ( 1 ) " "            " "   " "       " "         " "              " "         
8  ( 1 ) " "            " "   " "       " "         " "              " "         
         Radio_cassette Tow_Bar
1  ( 1 ) " "            " "    
2  ( 1 ) " "            " "    
3  ( 1 ) " "            " "    
4  ( 1 ) " "            " "    
5  ( 1 ) " "            " "    
6  ( 1 ) " "            " "    
7  ( 1 ) " "            " "    
8  ( 1 ) " "            "*"    

10a.9: RMSE

val.errors = rep(NA,7)
x.test <- model.matrix(Price ~., data = data.test)

for(i in 1:7) 
{
  coeficientes <- coef(model.fwd, id = i)
  predictions <- x.test[,names(coeficientes)] %*% coeficientes
  val.errors[i] <- mean((data.test$Price - predictions)^2)
}

rmse <- sqrt(mean(val.errors))
rmse
[1] 1407.842
  • Tras una validación cruzada se obtiene un valor de RMSE de 1408.

10b: Ridge

  • Mediante Ridge se obtiene un modelo a partir de los 34 predictores iniciales, ponderando la influencia de cada predictor pero sin eliminar ninguno de los mismos del modelo.
set.seed(10)

index <- createDataPartition(clean_data$Price, p = 0.7, list = FALSE)

data.train <- clean_data[index, ]
data.test <- clean_data[-index, ]
  • Se produce en conjunto de entrenamiento y de prueba.

10b.1: Modelo


x = model.matrix(Price ~ . , data.train)[, -1]
y = as.matrix(data.train$Price)

ridge.model = glmnet(x, y, alpha = 0)
beta=coef(ridge.model)

tmp <- as.data.frame(as.matrix(beta))
tmp$coef <- row.names(tmp)
tmp <- reshape::melt(tmp, id = "coef")
tmp$variable <- as.numeric(gsub("s", "", tmp$variable))
tmp$lambda <- ridge.model$lambda[tmp$variable+1]
tmp$norm <- apply(abs(beta[-1,]), 2, sum)[tmp$variable+1]

ggplot(tmp[tmp$coef != "(Intercept)",], aes(lambda, value, color = coef, group = coef, )) + 
  geom_line() + 
    scale_x_log10() + 
    xlab("Lambda (log scale)") + 
    guides(color = guide_legend(title = ""), 
           linetype = guide_legend(title = "")) +
    theme_bw() + 
    theme(legend.key.width = unit(3,"lines")) 

plot(ridge.model, xvar = "lambda", label = TRUE)

  • Mediante los dos graficos anteriores se puede observar que los predictores mas significativos son aquellos que, durante el procesamiento del modelo, demoran mas en acercarse a valores cercanos a cero.

indices <- sample(c(TRUE,FALSE), nrow(data.train), replace = TRUE)

cv.out <- cv.glmnet(x[indices,], y[indices], alpha = 0)

plot(cv.out)

coef(cv.out)
35 x 1 sparse Matrix of class "dgCMatrix"
                             1
(Intercept)      -9.276178e+05
Age_08_04        -3.403398e+01
Mfg_Month        -1.675292e+01
Mfg_Year          4.666735e+02
KM               -8.205862e-03
Fuel_TypeDiesel   .           
Fuel_TypePetrol   .           
HP                4.510978e+00
Met_Color        -4.077869e+01
Automatic         5.819668e+02
cc                3.425118e-01
Doors             3.215659e+01
Gears             1.800406e+02
Quarterly_Tax     8.492749e+00
Weight            4.057818e+00
Mfr_Guarantee     2.013134e+02
BOVAG_Guarantee   3.643001e+02
Guarantee_Period  4.918492e+01
ABS               1.136664e+02
Airbag_1         -1.506483e+02
Airbag_2         -4.032492e-01
Airco             2.930682e+02
Automatic_airco   7.676041e+02
Boardcomputer     2.717277e+02
CD_Player         1.082930e+02
Central_Lock      2.059698e+02
Powered_Windows   6.839280e+01
Power_Steering   -7.009734e+02
Radio            -3.066674e+01
Mistlamps         5.970154e+01
Sport_Model      -2.479769e+02
Backseat_Divider  3.448104e+00
Metallic_Rim      5.100967e+01
Radio_cassette   -3.071270e+01
Tow_Bar          -2.591923e+02
  • La grafica anterior representa como el error incrementa a mayores valores de log(lambda).
  • Las lineas punteadas indican, el valor minimo de mse obtenido en la validacion cruzada y el valor de la derecha representa el error estandar.

bestlam = cv.out$lambda.min
bestlam
[1] 137.6734
  • Se emplea el menor lambda obtenido de la validacion cruzada propia del modelo.

ridge.pred <- predict(ridge.model, s = bestlam, newx = x[-indices,])
sqrt(mean((ridge.pred - y[-indices])^2))
[1] 807.2642
  • Se obtiene un valor de rmse de 807 obtenido mediante los datos de entrenamiento.

10b.2: Validacion Cruzada


x = model.matrix(Price ~ . , data.test)[, -1]
y = as.matrix(data.test$Price)

ridge.model = glmnet(x, y, alpha = 0)

indices <- sample(c(TRUE,FALSE), nrow(data.test), replace = TRUE)

cv.out <- cv.glmnet(x[indices,], y[indices], alpha = 0)

plot(cv.out)

coef(cv.out)
35 x 1 sparse Matrix of class "dgCMatrix"
                             1
(Intercept)      -5.107223e+05
Age_08_04        -1.990590e+01
Mfg_Month        -1.156697e+01
Mfg_Year          2.572843e+02
KM               -9.166624e-03
Fuel_TypeDiesel   .           
Fuel_TypePetrol   .           
HP                1.018606e-01
Met_Color        -3.050419e+01
Automatic        -3.058052e+02
cc               -7.618709e-02
Doors             8.384777e+01
Gears             4.417662e+02
Quarterly_Tax     4.467208e+00
Weight            4.256372e+00
Mfr_Guarantee     2.180984e+02
BOVAG_Guarantee   5.903780e+01
Guarantee_Period  2.742444e+01
ABS               1.886358e+02
Airbag_1         -2.687474e+01
Airbag_2          6.792561e+01
Airco             1.990458e+02
Automatic_airco   .           
Boardcomputer     4.884786e+02
CD_Player         3.181346e+02
Central_Lock      8.770576e+01
Powered_Windows   2.707806e+01
Power_Steering    9.958717e-01
Radio            -7.240374e+01
Mistlamps         2.673791e+02
Sport_Model      -3.403362e+02
Backseat_Divider  6.955809e+01
Metallic_Rim      3.906290e+00
Radio_cassette   -7.238457e+01
Tow_Bar          -8.097683e+01
  • El valor minimo de MSE y el error estandar, son similares a los obtenidos mediante datos de entrenamiento.

bestlam = cv.out$lambda.min
bestlam
[1] 489.783
  • Se emplea el menor lambda obtenido de la validacion cruzada propia del modelo.

ridge.pred <- predict(ridge.model, s = bestlam, newx = x[-indices,])
sqrt(mean((ridge.pred - y[-indices])^2))
[1] 775.3981
  • Se obtiene un valor de rmse de 775 datos de prueba.

10c: Lasso

  • Mediante Lasso se obtiene un modelo a partir de los 34 predictores iniciales, ponderando la influencia de cada predictor. A diferencia de Ridge, usando Lasso si se realiza una seleccion de variables.
set.seed(10)

index <- createDataPartition(clean_data$Price, p = 0.7, list = FALSE)

data.train <- clean_data[index, ]
data.test <- clean_data[-index, ]
  • Se produce en conjunto de entrenamiento y de prueba.

10c.1: Modelo


x = model.matrix(Price ~ . , data.train)[, -1]
y = as.matrix(data.train$Price)

lasso.model = glmnet(x, y, alpha = 1)
beta=coef(lasso.model)

tmp <- as.data.frame(as.matrix(beta))
tmp$coef <- row.names(tmp)
tmp <- reshape::melt(tmp, id = "coef")
tmp$variable <- as.numeric(gsub("s", "", tmp$variable))
tmp$lambda <- lasso.model$lambda[tmp$variable+1]
tmp$norm <- apply(abs(beta[-1,]), 2, sum)[tmp$variable+1]

ggplot(tmp[tmp$coef != "(Intercept)",], aes(lambda, value, color = coef, group = coef, )) + 
  geom_line() + 
    scale_x_log10() + 
    xlab("Lambda (log scale)") + 
    guides(color = guide_legend(title = ""), 
           linetype = guide_legend(title = "")) +
    theme_bw() + 
    theme(legend.key.width = unit(3,"lines")) 

plot(lasso.model, xvar = "lambda", label = TRUE)

  • Mediante los dos graficos anteriores se puede observar que los predictores mas significativos son aquellos que, durante el procesamiento del modelo, demoran mas en acercarse a valores cercanos a cero.

indices <- sample(c(TRUE,FALSE), nrow(data.train), replace = TRUE)

cv.out <- cv.glmnet(x[indices,], y[indices], alpha = 1)

plot(cv.out)

coef(cv.out)
35 x 1 sparse Matrix of class "dgCMatrix"
                             1
(Intercept)      -2.152194e+06
Age_08_04        -9.617712e+00
Mfg_Month         .           
Mfg_Year          1.077728e+03
KM               -6.654090e-03
Fuel_TypeDiesel   .           
Fuel_TypePetrol   .           
HP                .           
Met_Color         .           
Automatic         3.858173e+02
cc                2.490596e-01
Doors             .           
Gears             .           
Quarterly_Tax     3.041571e+00
Weight            7.261598e+00
Mfr_Guarantee     1.032277e+02
BOVAG_Guarantee   1.922574e+02
Guarantee_Period  9.119164e+00
ABS               .           
Airbag_1          .           
Airbag_2          .           
Airco             2.308151e+02
Automatic_airco   .           
Boardcomputer     .           
CD_Player         .           
Central_Lock      2.433077e+02
Powered_Windows   .           
Power_Steering   -4.338181e+02
Radio             .           
Mistlamps         .           
Sport_Model      -6.242317e+01
Backseat_Divider  .           
Metallic_Rim      .           
Radio_cassette    .           
Tow_Bar          -1.144862e+02
  • El valor minimo de MSE y el error estandar, son similares a los obtenidos mediante datos de entrenamiento.

bestlam = cv.out$lambda.min
bestlam
[1] 27.66179
  • Se emplea el menor lambda obtenido de la validacion cruzada propia del modelo.

lasso.pred <- predict(lasso.model, s = bestlam, newx = x[-indices,])
sqrt(mean((lasso.pred - y[-indices])^2))
[1] 812.298
  • Se obtiene un valor de rmse de 813 datos de entrenamiento.

10c.2: Validacion Cruzada


x = model.matrix(Price ~ . , data.test)[, -1]
y = as.matrix(data.test$Price)

lasso.model = glmnet(x, y, alpha = 1)

indices <- sample(c(TRUE,FALSE), nrow(data.test), replace = TRUE)

cv.out <- cv.glmnet(x[indices,], y[indices], alpha = 1)

plot(cv.out)

coef(cv.out)
35 x 1 sparse Matrix of class "dgCMatrix"
                             1
(Intercept)      -2.083003e+06
Age_08_04         .           
Mfg_Month         .           
Mfg_Year          1.045326e+03
KM               -6.128349e-03
Fuel_TypeDiesel   .           
Fuel_TypePetrol   .           
HP                .           
Met_Color         .           
Automatic         .           
cc                .           
Doors             6.359957e+01
Gears             2.868344e+01
Quarterly_Tax     .           
Weight            2.598188e+00
Mfr_Guarantee     .           
BOVAG_Guarantee   .           
Guarantee_Period  .           
ABS               .           
Airbag_1          .           
Airbag_2          .           
Airco             2.077925e+02
Automatic_airco   .           
Boardcomputer     6.818690e+01
CD_Player         .           
Central_Lock      .           
Powered_Windows   .           
Power_Steering    .           
Radio             .           
Mistlamps         2.173762e+02
Sport_Model      -5.946130e+01
Backseat_Divider  .           
Metallic_Rim      .           
Radio_cassette    .           
Tow_Bar           .           

bestlam = cv.out$lambda.min
bestlam
[1] 81.70072

lasso.pred <- predict(lasso.model, s = bestlam, newx = x[-indices,])
sqrt(mean((lasso.pred - y[-indices])^2))
[1] 813.2023
  • Se obtiene un valor de rmse de 813 datos de prueba.

11: Conclusión

Dado que es un dataset con una cantidad baja de instancias (1500 aprox), y que mediante el uso de Lasso se obtuvo un resultado aceptable comparado a los rmse calculados con otros métodos (Best Subset y Ridge), se opta por usar ese metodo para seleccion de variables.

LS0tDQp0aXRsZTogIlNlbGVjY2lvbiBkZSBWYXJpYWJsZXM6IERhdGFzZXQgJ1RveW90YSBDb3JvbGxhJyINCmF1dGhvcjogIkFsdmFyZXogSWduYWNpbyBOaWNvbGFzIg0Kb3V0cHV0Og0KICBodG1sX25vdGVib29rOg0KICAgIGRmX3ByaW50OiBwYWdlZA0KICAgIGZpZzpoZWlnaHQ6IDQNCiAgICBmaWc6d2lkdGg6IDYNCiAgICB0aGVtZTogcmVhZGFibGUNCiAgICB0b2M6IHllcw0KICAgIHRvY19mbG9hdDogeWVzDQogIHBkZl9kb2N1bWVudDoNCiAgICB0b2M6IHllcw0KZmlnOmhlaWdodDogNA0KLS0tDQpgYGB7ciBzZXR1cCwgaW5jbHVkZT1GQUxTRX0NCmtuaXRyOjpvcHRzX2NodW5rJHNldChlY2hvID0gVFJVRSkNCmBgYA0KDQpFc3RlIGVqZXJjaWNpbyBjb25zaXN0ZSBlbiByZWFsaXphciB1biBhbsOhbGlzaXMgZXhwbG9yYXRvcmlvIHNvYnJlIHVuIGRhdGFzZXQgZGUgdmVoaWN1bG9zIFRveW90YSBDb3JvbGxhIGNvbiAxNDM2IGluc3RhbmNpYXMgeSAzNyBhdHJpYnV0b3MuIA0KDQpFbCBvYmpldGl2byBlcyBjb25zZWd1aXIgdW4gbW9kZWxvIGRlIHJlZ3Jlc2nDs24gbGluZWFsIGNvbiB1biByZXN1bHRhZG8gYWNlcHRhYmxlIGludGVycHJldGFuZG8gY2FkYSBwYXNvIGRlbCByYXpvbmFtaWVudG8gbmVjZXNhcmlvIHBhcmEgbGxlZ2FyIGFsIG9iamV0aXZvLg0KDQoqIEVsIGF0cmlidXRvIG9iamV0aXZvIGVzIFByaWNlLiAgIA0KDQojIDE6IENhcmdhIGRlIExpYnJlcmlhcw0KYGBge3IgQ2FyZ2EgZGUgTGlicmVyaWFzLCBldmFsPUZBTFNFfQ0KbGlicmFyeShmYXN0RHVtbWllcykgDQpsaWJyYXJ5KGNhcikNCmxpYnJhcnkoY29ycnBsb3QpIA0KbGlicmFyeShtY3Rlc3QpIA0KbGlicmFyeSh0aWR5dmVyc2UpIA0KbGlicmFyeShsZWFwcykNCmxpYnJhcnkoZ2xtbmV0KQ0KbGlicmFyeShNQVNTKQ0KbGlicmFyeShyZXNoYXBlKQ0KbGlicmFyeShjYXJldCkNCmxpYnJhcnkoZ2dyZXBlbCkNCmBgYA0KDQojIDI6IExlY3R1cmEgZGVsIERhdGFTZXQNCg0KYGBge3IgTGVjdHVyYSBkZWwgRGF0YVNldH0NCnJhd19kYXRhID0gcmVhZC5jc3YoIlRveW90YUNvcm9sbGEuY3N2IikgDQpyYXdfZGF0YSRJZCA9IE5VTEwNCnJhd19kYXRhJE1vZGVsID0gTlVMTA0KDQpgYGANCg0KKiBFbCBhdHJpYnV0byAqKklEKiogbm8gZXMgcmVwcmVzZW50YXRpdm8gZGUgY2FkYSBpbnN0YW5jaWEsIGRlY2lkbyBubyBjb25zaWRlcmFybG8gZW4gZWwgbW9kZWxvLiAgICAgDQoqIEVsIGF0cmlidXRvICoqTW9kZWwqKiBubyBlcyByZXByZXNlbnRhdGl2byBkZSBjYWRhIGluc3RhbmNpYSwgZGVjaWRvIG5vIGNvbnNpZGVyYXJsbyBlbiBlbCBtb2RlbG8uIA0KDQoNCiMgMzogRGF0YVNldA0KDQpgYGB7ciBWaXN1YWxpemFjaW9uIGRlbCBEYXRhU2V0fQ0KcmF3X2RhdGENCmBgYA0KDQojIDQ6IEVzdHJ1Y3R1cmEgZGVsIERhdGFTZXQNCg0KYGBge3IgRXN0cnVjdHVyYSBkZWwgRGF0YVNldH0NCnN0cihyYXdfZGF0YSkNCmBgYA0KDQojIDU6IFJlc3VtZW4gZGVsIERhdGFTZXQNCg0KYGBge3IgUmVzdW1lbiBkZWwgRGF0YVNldH0NCnN1bW1hcnkocmF3X2RhdGEpDQpgYGANCg0KKipPYnNlcnZhY2lvbmVzKiogICANCiogRWwgdmFsb3IgbWF4aW1vIGRlIGNjIGVzIGRlIDE2MDAwLCBkZW1hc2lhZG8gYWx0byBjb25zaWRlcmFuZG8gbGEgbWVkaWEuICAgIA0KKiBFbCBhdHJpYnV0byBGdWVsX1R5cGUgZXMgZGUgdGlwbyBjaGFyLCByZXF1ZXJpcmEgdW4gcHJvY2VzbyBkZSBlbmNvZGluZy4gICAgDQoqIEVsIHZhbG9yIGRlIEN5bGluZGVyIGVzIGNvbnN0YW50ZS4gDQoNCiMgNjogQW7DoWxpc2lzIGV4cGxvcmF0b3JpbyAgICANCg0KIyMgNmE6IFByaWNlIA0KDQpgYGB7ciBQcmljZX0NCnBhcihtZnJvdz1jKDEsMikpDQpib3hwbG90KHJhd19kYXRhJFByaWNlLG1haW4gPSAiUHJlY2lvIFZlaGljdWxvcyBUb3lvdGEgQ29yb2xsYSIsDQogICAgICAgIHlsYWIgPSAiUHJlY2lvICgkKSIsIG5vdGNoID0gVFJVRSkNCg0KaGlzdChyYXdfZGF0YSRQcmljZSxtYWluID0gIlByZWNpbyBWZWhpY3Vsb3MgVG95b3RhIENvcm9sbGEiKQ0KYGBgDQoqIFNlIG9ic2VydmEgcXVlIGxhIG1lZGlhbmEgZGVsIHByZWNpbyBkZSBsb3MgdmVoaWN1bG9zLCByb25kYSBsb3MgJDEwMDAwIGFwcm94aW1hZGFtZW50ZS4gDQoqIFNlIHByZXNlbnRhbiB2YWxvcmVzIGF0w61waWNvcyBjb24gdmFsb3JlcyBzdXBlcmlvcmVzIGEgbGFzICQyMDAwMCB5IHZhbG9yZXMgbWVub3JlcyBhICQ3MDAwIGFwcm94aW1hZGFtZW50ZS4gIA0KDQojIyA2YjogTWZnX1llYXINCiAgICAgDQpgYGB7ciBBZ2VfMDhfMDR9DQpwYXIobWZyb3c9YygxLDIpKQ0KDQpib3hwbG90KHJhd19kYXRhJEFnZV8wOF8wNCxtYWluID0gIkFnZV8wOF8wNCIpDQpiYXJwbG90KHRhYmxlKGFzLmZhY3RvcihyYXdfZGF0YSRBZ2VfMDhfMDQpKSxtYWluID0gIkFnZV8wOF8wNCIpDQoNCmBgYA0KKiBFbCBhdHJpYnV0byAiQWdlXzA4XzA0IiBwcmVzZW50YSB2YWxvcmVzIG91dGxpZXJzIGNvcnJlc3BvbmRpZW50ZXMgYSB2ZWhpY3Vsb3MgbnVldm9zIGN1eWEgYW50aWd1ZWRhZCBlcyAwLiAgDQoNCiMjIDZjOiBNZmdfWWVhcg0KDQpgYGB7ciBNZmdfWWVhcn0NCnBhcihtZnJvdz1jKDEsMikpDQoNCmJveHBsb3QocmF3X2RhdGEkTWZnX1llYXIsbWFpbiA9ICJBw7FvIGRlIE1mZ19ZZWFyIikNCmJhcnBsb3QodGFibGUoYXMuZmFjdG9yKHJhd19kYXRhJE1mZ19ZZWFyKSksbWFpbiA9ICJNZmdfWWVhciIpDQpgYGANCg0KIyMgNmQ6IEtNDQoNCmBgYHtyIEJveHBsb3QgS019DQpwYXIobWZyb3c9YygxLDIpKQ0KDQpib3hwbG90KHJhd19kYXRhJEtNLG1haW4gPSAiS00iLA0KICAgICAgICB5bGFiID0gIktNIiwgbm90Y2ggPSBUUlVFKQ0KDQpoaXN0KHJhd19kYXRhJEtNLG1haW4gPSAiS00iKQ0KDQpgYGANCiogRWwgYXRyaWJ1dG8gIktNIiBwcmVzZW50YSB2YWxvcmVzIG91dGxpZXJzLiBEZXN0YWNvIHNvYnJldG9kbyB1biBjb25qdW50byBkZSB2YWxvcmVzIHN1cGVyaW9yZXMgYSBsb3MgMjAwMDAwLiANCg0KDQojIyA2ZTogSFANCmBgYHtyIEhQfQ0KcGFyKG1mcm93PWMoMSwyKSkNCg0KYm94cGxvdChyYXdfZGF0YSRIUCxtYWluID0gIkhQIiwNCiAgICAgICAgeWxhYiA9ICJIUCIsIG5vdGNoID0gRkFMU0UpDQoNCmJhcnBsb3QodGFibGUoYXMuZmFjdG9yKHJhd19kYXRhJEhQKSksbWFpbiA9ICJIUCIpDQpgYGANCg0KKiBFbCBhdHJpYnV0byAiSFAiIHByZXNlbnRhIHVuIHZhbG9yIG91dGxpZXIgc3VwZXJpb3IgYSAxODAuIFNlZ8O6biB1bmEgaW52ZXN0aWdhY2lvbiByZWFsaXphZGEgZW4gbWVkaW9zIGV4dGVybm9zIGFsIGRhdGFzZXQsIGVsIHZhbG9yIHNpIGNvcnJlc3BvbmRlIGEgdW4gbW9kZWxvIGRlIFRveW90YSBDb3JvbGxhLiAgICANCg0KDQojIyA2ZjogQ0MNCiAgDQpgYGB7ciBCb3hwbG90IENDfQ0KcGFyKG1mcm93PWMoMSwyKSkNCg0KYm94cGxvdChyYXdfZGF0YSRjYyxtYWluID0gIkNpbGluZHJhZGEiLA0KICAgICAgICB5bGFiID0gIkNDIiwgbm90Y2ggPSBGQUxTRSkNCg0KYmFycGxvdCh0YWJsZShhcy5mYWN0b3IocmF3X2RhdGEkY2MpKSxtYWluID0gIkNpbGluZHJhZGEiKQ0KYGBgDQoNCiogRWwgYXRyaWJ1dG8gIkNDIiBwcmVzZW50YSB1biBvdXRsaWVyIG5vdG9yaW8gc3VwZXJpb3IgYSAxNjAwMCwgZXN0ZSB2YWxvciBlc3RhIGZ1ZXJhIGRlbCBjb250ZXh0byBkZSB1biB2ZWhpY3VsbyB0b3lvdGEgY29yb2xsYSwgZG9uZGUgbG9zIHZhbG9yZXMgcHJvbWVkaW8gcm9uZGFuIGVsIDEwMC4gICAgDQoNCiMjIDZnOiBRdWFydGVybHlfVGF4DQoNCmBgYHtyIEJveHBsb3QgUXVhcnRlcmx5X1RheH0NCnBhcihtZnJvdz1jKDEsMikpDQoNCmJveHBsb3QocmF3X2RhdGEkUXVhcnRlcmx5X1RheCwgbWFpbj0iUXVhcnRlcmx5X1RheCIpDQoNCmhpc3QocmF3X2RhdGEkUXVhcnRlcmx5X1RheCkNCmBgYA0KDQoqIEVsIGF0cmlidXRvICJRdWFydGVybHlfVGF4IiBwcmVzZW50YSBvdXRsaWVycyBwYXJhIHZhbG9yZXMgc3VwZXJpb3JlcyBhIDE1MCB5IHZhbG9yZXMgbWVub3JlcyBhIDUwLCBzb2JyZSB1bmEgbWVkaWFuYSBkZSA3MCBhcHJveGltYWRhbWVudGUuIA0KDQojIyA2aDogV2VpZ2h0DQoNCmBgYHtyIFdlaWdodH0NCnBhcihtZnJvdz1jKDEsMikpDQoNCmJveHBsb3QocmF3X2RhdGEkV2VpZ2h0LCBtYWluPSJQZXNvKEtHKSIpDQpoaXN0KHJhd19kYXRhJFdlaWdodCkNCmBgYA0KDQoqIEVsIGF0cmlidXRvICJXZWlnaHQiIHByZXNlbnRhIG91dGxpZXJzIHBhcmEgdmFsb3JlcyBzdXBlcmlvcmVzIGEgMTE1MCBzb2JyZSB1bmEgbWVkaWFuYSBkZSAxMDUwIGFwcm94aW1hZGFtZW50ZS4NCg0KIyMgNmk6IEZ1ZWx0eXBlIHkgUmFkaW8gQ2Fzc2V0ZQ0KDQpgYGB7ciBQbG90IEZ1ZWx0eXBlIHkgUmFkaW8gQ2Fzc2V0ZX0NCmxibHMgPC0gYygiMDogTm8gdGllbmUiLCAiMTogVGllbmUiKQ0KDQpwYXIobWZyb3c9YygxLDIpKSANCg0KYmFycGxvdCh0YWJsZShhcy5mYWN0b3IocmF3X2RhdGEkRnVlbF9UeXBlKSksIG1haW49IkZ1ZWxfVHlwZSIpDQpwaWUoeCA9IHRhYmxlKHJhd19kYXRhJFJhZGlvX2Nhc3NldHRlKSwgbGFiZWxzID0gbGJscywgbWFpbj0iUmFkaW8gQ2Fzc2V0ZSIpDQpgYGANCg0KIyMgNmo6IE1ldGFsbGljIFJpbSB5IEJhY2tzZWF0IERpdmlkZXINCg0KYGBge3IgUGxvdCBNZXRhbGxpYyBSaW0geSBCYWNrc2VhdCBEaXZpZGVyfQ0KDQpwYXIobWZyb3c9YygxLDIpKSANCg0KcGllKHggPSB0YWJsZShyYXdfZGF0YSRNZXRhbGxpY19SaW0pLCBsYWJlbHMgPSBsYmxzLCAgbWFpbj0iTWV0YWxsaWMgUmltIikNCnBpZSh4ID0gdGFibGUocmF3X2RhdGEkQmFja3NlYXRfRGl2aWRlcikgLCBsYWJlbHMgPSBsYmxzLCAgbWFpbj0iQmFja3NlYXRfRGl2aWRlciIpDQpgYGANCg0KIyMgNms6IE1pc3RsYW1wLCBSYWRpbyB5IFNwb3J0IE1vZGVsDQoNCmBgYHtyIFBsb3QgTWlzdGxhbXAsIFJhZGlvIHkgU3BvcnQgTW9kZWx9DQpwYXIobWZyb3c9YygxLDMpKQ0KDQpwaWUoeCA9IHRhYmxlKHJhd19kYXRhJE1pc3RsYW1wcykgLCBsYWJlbHMgPSBsYmxzLCAgbWFpbj0iTWlzdGxhbXBzIikNCnBpZSh4ID0gdGFibGUocmF3X2RhdGEkUmFkaW8pLCBsYWJlbHMgPSBsYmxzLCAgbWFpbj0iUmFkaW8iKQ0KcGllKHggPSB0YWJsZShyYXdfZGF0YSRTcG9ydF9Nb2RlbCksIGxhYmVscyA9IGxibHMsICBtYWluPSJTcG9ydF9Nb2RlbCIpDQpgYGANCg0KIyMgNmw6IENlbnRyYWwgTG9jaywgQ0QgUGxheWVyIHkgQm9hcmRDb21wdXRlcg0KDQpgYGB7ciBQbG90IENlbnRyYWwgTG9jaywgQ0QgUGxheWVyIHkgQm9hcmRDb21wdXRlcn0NCg0KcGFyKG1mcm93PWMoMSwzKSkNCg0KcGllKHggPSB0YWJsZShyYXdfZGF0YSRDZW50cmFsX0xvY2spLCBsYWJlbHMgPSBsYmxzLCBtYWluPSJDZW50cmFsX0xvY2siKQ0KcGllKHggPSB0YWJsZShyYXdfZGF0YSRDRF9QbGF5ZXIpLCBsYWJlbHMgPSBsYmxzLCBtYWluPSJDRF9QbGF5ZXIiKQ0KcGllKHggPSB0YWJsZShyYXdfZGF0YSRCb2FyZGNvbXB1dGVyKSwgbGFiZWxzID0gbGJscywgbWFpbj0iQm9hcmRjb21wdXRlciIpDQoNCmBgYA0KDQojIyA2bTogQWlyY28sIEFpcmJhZ18yIHkgQWlyYmFnXzENCg0KYGBge3IgUGxvdCBBaXJjbywgQWlyYmFnXzIgeSBBaXJiYWdfMX0NCnBhcihtZnJvdz1jKDEsMykpDQoNCnBpZSh4ID0gdGFibGUocmF3X2RhdGEkQWlyY28pLCBsYWJlbHMgPSBsYmxzLCAgbWFpbj0iQWlyY28iKQ0KcGllKHggPSB0YWJsZShyYXdfZGF0YSRBaXJiYWdfMiksIGxhYmVscyA9IGxibHMsICBtYWluPSJBaXJiYWdfMiIpDQpwaWUoeCA9IHRhYmxlKHJhd19kYXRhJEFpcmJhZ18xKSwgbGFiZWxzID0gbGJscyxtYWluPSJBaXJiYWdfMSIpDQoNCmBgYA0KDQojIyA2bjogR3VhcmFudGVlIFBlcmlvZCB5IEF1dG9tYXRpYyBBaXJjbw0KDQpgYGB7ciBQbG90IEd1YXJhbnRlZSBQZXJpb2QgeSBBdXRvbWF0aWMgQWlyY299DQoNCnBhcihtZnJvdz1jKDEsMikpDQoNCmJhcnBsb3QodGFibGUoYXMuZmFjdG9yKHJhd19kYXRhJEd1YXJhbnRlZV9QZXJpb2QpKSwgbWFpbj0iR3VhcmFudGVlX1BlcmlvZCIpIA0KcGllKHggPSB0YWJsZShyYXdfZGF0YSRBdXRvbWF0aWNfYWlyY28pLCBsYWJlbHMgPSBsYmxzLCAgbWFpbj0iQXV0b21hdGljX2FpcmNvIikNCg0KYGBgDQoNCiMjIDZvOiBNRlIgR3VhcmFudGVlLCBHZWFycyB5IEJPVkFHIEd1YXJhbnRlZQ0KDQpgYGB7ciBQbG90IE1GUiBHdWFyYW50ZWUsIEdlYXJzIHkgQk9WQUcgR3VhcmFudGVlfQ0KDQpwYXIobWZyb3c9YygxLDMpKQ0KDQpwaWUoeCA9IHRhYmxlKHJhd19kYXRhJE1mcl9HdWFyYW50ZWUpLCBsYWJlbHMgPSBsYmxzLCBtYWluPSJNZnJfR3VhcmFudGVlIikNCmJhcnBsb3QodGFibGUoYXMuZmFjdG9yKHJhd19kYXRhJEdlYXJzKSksIG1haW49IkdlYXJzIikNCnBpZSh4ID0gdGFibGUocmF3X2RhdGEkQk9WQUdfR3VhcmFudGVlKSwgbGFiZWxzID0gbGJscywgbWFpbj0iQk9WQUdfR3VhcmFudGVlIikNCmBgYA0KDQojIyA2cDogRG9vcnMsIEF1dG9tYXRpYyB5IEFCUw0KDQpgYGB7ciBQbG90IERvb3JzLCBBdXRvbWF0aWMgeSBBQlN9DQpwYXIobWZyb3c9YygxLDMpKQ0KDQpiYXJwbG90KHRhYmxlKGFzLmZhY3RvcihyYXdfZGF0YSREb29ycykpLCBtYWluPSJEb29ycyIpDQpwaWUoeCA9IHRhYmxlKHJhd19kYXRhJEF1dG9tYXRpYyksbGFiZWxzID0gbGJscywgbWFpbj0iQXV0b21hdGljIikNCnBpZSh4ID0gdGFibGUocmF3X2RhdGEkQUJTKSxsYWJlbHMgPSBsYmxzLCBtYWluPSJBQlMiKQ0KYGBgDQoNCiMgNzogRXN0dWRpbyBkZSBWYXJpYWJsZSBPYmpldGl2byAiUHJpY2UiDQoNCiMjIDdhOiBEaXN0cmlidWNpb24gZGUgUHJpY2UNCmBgYHtyIEhpc3RvZ3JhbWEgZGUgUHJpY2V9DQpoaXN0KHJhd19kYXRhJFByaWNlLCBjb2w9ImJsdWUiLCBicmVha3MgPSA2MCwgZnJlcSA9IEYpDQpsaW5lcyhkZW5zaXR5KHJhd19kYXRhJFByaWNlKSwgY29sID0gInJlZCIsIGx3ZD0yKQ0KcnVnKHJhd19kYXRhJFByaWNlKQ0KYGBgDQoNCiMjIDdiOiBSZWxhY2lvbiBQcmljZSB2cyBSZXN0byBkZSBQcmVkaWN0b3Jlcw0KYGBge3IgR3JhZmljb3MgZGUgRGlzcGVyc2lvbjogUHJpY2UgdnMgVG9kb3N9DQpwbG90KFByaWNlfi4sIGRhdGE9cmF3X2RhdGEsY29sPSJibHVlIikNCmBgYA0KDQojIDg6IFNlbGVjY2lvbiB5IG1vZGlmaWNhY2lvbiBkZSBWYXJpYWJsZXMgICAgDQoNCiMjIDhhOiBFc3R1ZGlvIGRlIGNvcnJlbGFjaW9uDQpgYGB7ciBFc3R1ZGlvIGRlIGNvcnJlbGFjaW9ufQ0KY29ycnBsb3QoY29yKGRwbHlyOjpzZWxlY3QocmF3X2RhdGEsIC1jKCJGdWVsX1R5cGUiKSkpLCB0eXBlPSJ1cHBlciIsIG1ldGhvZD0icGllIikNCmBgYA0KDQoNCiMjIDhiOiBJbmRpY2Fkb3JlcyBkZSBDb2xpbmVhbGlkYWQNCg0KYGBge3IgQ2FsY3VsbyBkZSBWSUYgeSBUT0wgc29icmUgRGF0YVNldH0NCmltY2RpYWcoZHBseXI6OnNlbGVjdChyYXdfZGF0YSwgLWMoIlByaWNlIiwgIkZ1ZWxfVHlwZSIpKSwgcmF3X2RhdGEkUHJpY2UpDQpgYGANCiogTWVkaWFudGUgZWwgY2FsY3VsbyBkZSBWSUYgeSBoYWNpZW5kbyBwcmluY2lwYWwgaGluY2Fww61lIGVuIGxvcyBhdHJpYnV0b3MgY3V5byB2YWxvciBkZSBWSUYgZXMgbXV5IHN1cGVyaW9yIGEgNSwgZXMgcG9zaWJsZSBxdWUgZXhpc3RhIGNvbGluZWFsaWRhZCB2aW5jdWxhZG8gY29uIGxvcyBhdHJpYnV0b3MgKipBZ2VfMDhfMDQsTWZnX01vbnRoLCBNZmdfWWVhciwgUmFkaW8geSAgUmFkaW9fY2Fzc2V0dGUqKg0KDQojIDk6IExpbXBpZXphIGRlIERhdG9zICAgDQoNCiMjIDlhOiBUcmF0YW1pZW50byBkZSBPdXRsaWVycyANCg0KKiBFbCBhdHJpYnV0byBDQyBwcmVzZW50YSB1biBvdXRsaWVyKHZhbG9yIGF0w61waWNvKSBkZSBDQyA9IDE2MDAwLiBObyBlcyB1biB2YWxvciBjb2hlcmVudGUgY29uIGVsIGNvbnRleHRvIGRlIHVuIHZlaGljdWxvIFRveW90YSBDb3JvbGxhLiBDb25zaWRlcm8gcXVlIHByb2JhYmxlbWVudGUgZnVlIHVuIGVycm9yIHkgc3Vwb25nbyBxdWUgc2UgYWdyZWdvIHVuIGNlcm8gZGUgbcOhcywgc2llbmRvIGVsIHZhbG9yIGNvcnJlY3RvIDE2MDAuICAgDQoqIEVsIGF0cmlidXRvIEd1YXJhbnRlZV9QZXJpb2QgcHJlc2VudGEgdW4gb3V0bGllciBkZSBHdWFyYW50ZWVfUGVyaW9kID0gMTMuQ29uc2lkZXJvIHF1ZSBwcm9iYWJsZW1lbnRlIGZ1ZSB1biBlcnJvciB5IGRlY2lkbyBpbXB1dGFyIGVsIHZhbG9yIDEyLiAgIA0KKiBFbCBhdHJpYnV0byBLTSBwcmVzZW50YSBvdXRsaWVycyBwYXJhIHZhbG9yZXMgc3VwZXJpb3JlcyBhIDE1MDAwMCB5IHZhbG9yZXMgbWVub3MgYSAxMDAwMC4gU2kgYmllbiBzb24gdmFsb3JlcyBjb2hlcmVudGVzIGRlbnRybyBkZWwgY29udGV4dG8gZGUgdmVoaWN1bG9zLCBhbCBlc3RhciBsYSBtYXlvciBjb25jZW50cmFjaW9uIGRlIGxvcyB2ZWhpY3Vsb3MgZGVudHJvIGRlbCAqKmludGVydmFsbyAoMTAwMDAsMTIwMDAwKSoqLCBkZWNpZG8gcmVjb3J0YXIgZWwgZGF0YVNldCwgcmVkdWNpZW5kbyBzdSB0YW1hw7FvIHVuIDEyJS4gICAgIA0KKiBFbCBhdHJpYnV0byBBZ2VfMDhfMDQgcHJlc2VudGEgb3V0bGllcnMgcGFyYSBpbnN0YW5jaWFzIGN1eW8gQWdlXzA4XzA0IGVzIG1lbm9yIGEgMjUuICAgDQoqIEVsIGF0cmlidXRvIEhQIHByZXNlbnRhIG91dGxpZXJzIHBhcmEgdmFsb3JlcyBzdXBlcmlvcmVzIGEgMTIwIHkgdmFsb3JlcyBtZW5vcmVzIGEgODAuIFNpIGJpZW4gc29uIHZhbG9yZXMgY29oZXJlbnRlcyBwYXJhIGFsZ3Vub3MgbW9kZWxvcyBkZSBUb3lvdGEgQ29ycm9sbGEsIHNvbiBjYXNvcyBwYXJ0aWN1bGFyZXMgbm8gcmVwcmVzZW50YXRpdm9zIGRlbCBjb25qdW50byBkZSB2ZWhpY3Vsb3MuICAgIA0KKiBUcmFzIHJlYWxpemFyIGVzdGFzIG9wZXJhY2lvbmVzLCBlbCBkYXRhc2V0IHJlc3RhbnRlIHBvc2VlIHVuIDgwJSBkZSBsYXMgaW5zdGFuY2lhcyBkZWwgZGF0YXNldCBvcmlnaW5hbC4gICAgIA0KKiBFbCBhdHJpYnV0byBQcmljZSBwb3NlZSBvdXRsaWVycyBwYXJhIHZhbG9yZXMgbWVub3JlcyBhIDY1MDAgeSB2YWxvcmVzIHN1cGVyaW9yZXMgYSAxNTAwMC4gUG9yIGVzdGUgbW90aXZvIGNvbnNpZGVybyBlbCAqKmludGVydmFsbyAoNjUwMCwxNDUwMCkqKiwgZGFkbyBxdWUgbG9zIHZhbG9yZXMgZXhjbHVpZG9zIG5vIHNvbiByZXByZXNlbnRhdGl2b3MgZGUgbGEgbWF5b3JpYSBkZWwgY29uanVudG8uICAgIA0KDQoNCmBgYHtyIExpbXBpZXphIGRlIERhdG9zfQ0KDQpjbGVhbl9kYXRhIDwtIHJhd19kYXRhDQoNCmNsZWFuX2RhdGEgPSBjbGVhbl9kYXRhICU+JSAgbXV0YXRlKGNjID0gaWZlbHNlKGNjID09IDE2MDAwLCAxNjAwLCBjYykpDQpjbGVhbl9kYXRhID0gY2xlYW5fZGF0YSAlPiUgIG11dGF0ZShHdWFyYW50ZWVfUGVyaW9kID0gaWZlbHNlKEd1YXJhbnRlZV9QZXJpb2QgPT0gMTMsIDEyLCBHdWFyYW50ZWVfUGVyaW9kKSkNCg0KY2xlYW5fZGF0YSA9IGNsZWFuX2RhdGEgJT4lIGZpbHRlcigoS00gPiAyMDAwMCAmIEtNIDwgMTMwMDAwKSkNCmNsZWFuX2RhdGEgPSBjbGVhbl9kYXRhICU+JSBmaWx0ZXIoV2VpZ2h0IDwgMTA3NSAmIFdlaWdodCA+IDEwMTApDQoNCmNsZWFuX2RhdGEkQ3lsaW5kZXJzID0gTlVMTA0KDQpgYGANCg0KKiBFbCBkYXRhc2V0IHJlc3RhbnRlIHBvc2VlIHVuIDcyJSBkZSBsYXMgaW5zdGFuY2lhcyBkZWwgZGF0YXNldCBvcmlnaW5hbC4gICANCg0KDQojIDEwOiBTZWxlY2Npw7NuIGRlIFZhcmlhYmxlcyAgDQoNCiMjIDEwYTogQmVzdCBTdWJTZXQgU2VsZWN0aW9uDQoNCg0KYGBge3IgTW9kZWxvIEJlc3QgU3ViU2V0IFNlbGVjdGlvbn0NCmJzcy5tb2RlbCA8LSByZWdzdWJzZXRzKFByaWNlfi4sIGRhdGEgPSBjbGVhbl9kYXRhLCBudm1heCA9IGRpbShjbGVhbl9kYXRhKVsyXSkNCnN1bW1hcnkoYnNzLm1vZGVsKQ0KYGBgDQoNCiogTWVkaWFudGUgQmVzdCBTdWJzZXQgc2UgZGVmaW5lbiB0b2RvcyBsb3MgbW9kZWxvcyBwb3NpYmxlcyBwYXJhIGxvcyAzNCBwcmVkaWN0b3JlcyBkZWwgbW9kZWxvIG1lZGlhbnRlIGNvbWJpbmFjaW9uZXMgZGUgbG9zIG1pc21vcyB5IGNvbXBhcmFuZG8gc3UgdmFsb3IgZGUgUlNTIG9idGVuaWRvLg0KDQojIyMgMTBhLjE6IENhbnRpZGFkIGRlIFZhcmlhYmxlcyBwcm9wdWVzdGFzIHBvciBlbCBtb2RlbG8NCg0KYGBge3IgTW9kZWxvIEJlc3QgU3ViU2V0IFNlbGVjdGlvbiAyIH0NCndoaWNoLm1heChzdW1tYXJ5KGJzcy5tb2RlbCkkYWRqcjIpDQpgYGANCg0KKiBFbCBtb2RlbG8gY29uIHVuIG1lam9yIHZhbG9yIGRlIFIgYWp1c3RhZG8gdXRpbGl6YSAyNCBwcmVkaWN0b3Jlcy4gDQoNCiMjIyAxMGEuMjogUiBBanVzdGFkbyB2cyBDYW50LiBQcmVkaWN0b3Jlcw0KDQpgYGB7ciBCU1M6IFJeMiBhanVzdGFkbyB2cyBOdW1lcm8gZGUgUHJlZGljdG9yZXN9DQpwIDwtIGdncGxvdChkYXRhID0gZGF0YS5mcmFtZShuX3ByZWRpY3RvcmVzID0gMTozMiwNCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIFJfYWp1c3RhZG8gPSBzdW1tYXJ5KGJzcy5tb2RlbCkkYWRqcjIpLA0KICAgICAgICAgICAgYWVzKHggPSBuX3ByZWRpY3RvcmVzLCB5ID0gUl9hanVzdGFkbykpICsNCiAgICBnZW9tX2xpbmUoKSArDQogICAgZ2VvbV9wb2ludCgpDQoNCnAgPC0gcCArIGdlb21fcG9pbnQoYWVzKA0KICAgICAgICAgICAgICAgICAgICB4ID0gbl9wcmVkaWN0b3Jlc1t3aGljaC5tYXgoc3VtbWFyeShic3MubW9kZWwpJGFkanIyKV0sDQogICAgICAgICAgICAgICAgICAgIHkgPSBSX2FqdXN0YWRvW3doaWNoLm1heChzdW1tYXJ5KGJzcy5tb2RlbCkkYWRqcjIpXSksDQogICAgICAgICAgICAgICAgICAgIGNvbG91ciA9ICJyZWQiLCBzaXplID0gMykNCnAgPC0gcCArICBzY2FsZV94X2NvbnRpbnVvdXMoYnJlYWtzID0gYygwOjM0KSkgKyANCiAgICAgICAgICB0aGVtZV9idygpICsNCiAgICAgICAgICBsYWJzKHRpdGxlID0gJ1IyX2FqdXN0YWRvIHZzIG7Dum1lcm8gZGUgcHJlZGljdG9yZXMnLCANCiAgICAgICAgICAgICAgIHggPSAgJ27Dum1lcm8gcHJlZGljdG9yZXMnKQ0KcA0KYGBgDQoNCiMjIyAxMGEuMzogQ29lZmljaWVudGVzOiBQcmVkaWN0b3JlcyBwcm9wdWVzdG9zIHBvciBlbCBtb2RlbG8NCg0KYGBge3IgQ29lZmljaWVudGVzIE1heH0NCmNvZWYob2JqZWN0ID0gYnNzLm1vZGVsLCBpZCA9IHdoaWNoLm1heChzdW1tYXJ5KGJzcy5tb2RlbCkkYWRqcjIpKQ0KYGBgDQoNCiogRWwgbW9kZWxvIG9idGVuaWRvIG1lZGlhbnRlIGVsIHVzbyBkZWwgbWV0b2RvIEJlc3QgU3Vic2V0IHByb3BvbmUgdXNhciBsb3MgcHJlZGljdG9yZXMgZXhwdWVzdG9zIGVuIGVsIHJlY3VhZHJvIHN1cGVyaW9yLiAgICAgDQoNCiMjIyAxMGEuNDogUiBBanVzdGFkbzogUHJlZGljdG9yZXMgcHJvcHVlc3RvcyBwb3IgZWwgbW9kZWxvDQoNCmBgYHtyIFZhbG9yIFJeMiBBanVzdGFkbzogTWV0b2RvfQ0Kc3VtbWFyeShic3MubW9kZWwpJGFkanIyWzI0XQ0KYGBgDQoNCiogRWwgdmFsb3IgZGUgUiBBanVzdGFkbyBvYnRlbmlkbyBwYXJhIGVsIHVzbyBkZSAyNCBwcmVkaWN0b3JlcyBlcyBkZSAwLjc3LiBQZXJvLCBlbCB1c28gZGUgZGVtYXNpYWRvcyBwcmVkaWN0b3JlcyBhw7FhZGUgY29tcGxlamlkYWQgYWwgbW9kZWxvLiAgICAgDQoNCiMjIyAxMGEuNTogUiBBanVzdGFkbzogQ2FudC4gZGUgUHJlZGljdG9yZXMgQWx0ZXJuYXRpdmFzDQoNCmBgYHtyIFZhbG9yIFJeMiBBanVzdGFkbzogQWx0ZXJuYXRpdmFzfQ0Kc3VtbWFyeShic3MubW9kZWwpJGFkanIyWzldDQpgYGANCg0KYGBge3IgVmFsb3IgUl4yIEFqdXN0YWRvOiBBbHRlcm5hdGl2YXMgMn0NCnN1bW1hcnkoYnNzLm1vZGVsKSRhZGpyMls2XQ0KYGBgDQoNCiogUmVkdWNpZW5kbyBsYSBjYW50aWRhZCBwcmVkaWN0b3JlcywgZGlzbWludXllIGVsIHZhbG9yIGRlIFIgYWp1c3RhZG8sIHBlcm8gZGlzbWludXllIGxhIGNvbXBsZWppZGFkIGRlbCBtb2RlbG8gbG8gcXVlIHBlcm1pdGUgdW5hIHJlc3B1ZXN0YSBtw6FzIGdlbmVyYWwgZnJlbnRlIGEgbnVldm9zIGRhdG9zIGRlIGVudHJhZGEuDQoNCiMjIyAxMGEuNjogQ1AsIEFJQyBZIEJJQw0KDQpgYGB7ciBCU1M6IEFJQyB5IEJJQ30NCmJzcy5tb2RlbC5zdW0gPSBzdW1tYXJ5KGJzcy5tb2RlbCkNCg0KcGFyKG1mcm93ID0gYygyLCAyKSkNCnBsb3QoYnNzLm1vZGVsLnN1bSRyc3MsIHhsYWIgPSAiTnVtZXJvIGRlIFByZWRpY3RvcmVzIiwgeWxhYiA9ICJSU1MiLCB0eXBlID0gImIiKQ0KDQpwbG90KGJzcy5tb2RlbC5zdW0kYWRqcjIsIHhsYWIgPSAiTnVtZXJvIGRlIFByZWRpY3RvcmVzIiwgeWxhYiA9ICJSIEFqdXN0YWRhIiwgdHlwZSA9ICJiIikNCmJlc3RfYWRqX3IyID0gd2hpY2gubWF4KGJzcy5tb2RlbC5zdW0kYWRqcjIpDQpwb2ludHMoYmVzdF9hZGpfcjIsIGJzcy5tb2RlbC5zdW0kYWRqcjJbYmVzdF9hZGpfcjJdLA0KICAgICAgIGNvbCA9ICJyZWQiLGNleCA9IDIsIHBjaCA9IDIwKQ0KDQpwbG90KGJzcy5tb2RlbC5zdW0kY3AsIHhsYWIgPSAiTnVtZXJvIGRlIFByZWRpY3RvcmVzIiwgeWxhYiA9ICJDcCIsIHR5cGUgPSAnYicpDQpiZXN0X2NwID0gd2hpY2gubWluKGJzcy5tb2RlbC5zdW0kY3ApDQpwb2ludHMoYmVzdF9jcCwgYnNzLm1vZGVsLnN1bSRjcFtiZXN0X2NwXSwgDQogICAgICAgY29sID0gInJlZCIsIGNleCA9IDIsIHBjaCA9IDIwKQ0KDQpwbG90KGJzcy5tb2RlbC5zdW0kYmljLCB4bGFiID0gIk51bWVybyBkZSBQcmVkaWN0b3JlcyIsIHlsYWIgPSAiQklDIiwgdHlwZSA9ICdiJykNCmJlc3RfYmljID0gd2hpY2gubWluKGJzcy5tb2RlbC5zdW0kYmljKQ0KcG9pbnRzKGJlc3RfYmljLCBic3MubW9kZWwuc3VtJGJpY1tiZXN0X2JpY10sIA0KICAgICAgIGNvbCA9ICJyZWQiLCBjZXggPSAyLCBwY2ggPSAyMCkNCmBgYA0KDQoqIEVuIGxhcyBncsOhZmljYXMgYW50ZXJpb3JlcyAqQklDLCBDUCwgUiBhanVzdGFkYSogc2Ugb2JzZXJ2YW4gbG9zIHB1bnRvcyBjdXlvIHZhbG9yZXMgc29uIG3DrW5pbW9zIHkgcXVlIG5vIGNvbmNvcmRhbmNpYSBlbnRyZSBlbGxvcyBwYXJhIHNlbGVjY2lvbmFyIHViaXZvY2FtZW50ZSBsYSBjYW50aWRhZCBkZSBwcmVkaWN0b3JlcyBhIGVtcGxlYXIgZW4gdW4gbW9kZWxvLiBTaW4gZW1iYXJnbywgcHVlZGUgb2JzZXJ2YXJzZSBwdW50dWFsbWVudGUgZW4gY2FkYSBncmFmaWNvLCBxdWUgZXhpc3RlbiBzdXRpbGVzIG1lam9yYXMgKGNhc2kgaW1wZXJjZXB0aWJsZXMpIGVudHJlIGFsZ3VuYXMgY2FudGlkYWRlcyBkZSBwcmVkaWN0b3Jlcy4gICAgIA0KDQoqIFNlIGRlc3RhY2EgcXVlIG5vIHRpZW5lIGltcG9ydGFuY2lhIGRlc3RhY2FyIGVsIHB1bnRvIG1pbmltbyBkZSBSU1MsIGRhZG8gcXVlIGFsIHBvciBsYSBuYXR1cmFsZXphIGRlbCBtb2RlbG8sIGEgbWF5b3IgY2FudGlkYWQgZGUgdmFyaWFibGUsIG1lbm9yIHNlcsOhIHN1IHZhbG9yLiAgIA0KDQojIyMgMTBhLjc6IENvZWZpY2llbnRlczogTW9kZWxvIFNlbGVjY2lvbmFkbw0KDQpgYGB7ciBNb2RlbG8gNyBwcmVkaWN0b3Jlc30NCmNvZWYob2JqZWN0ID0gYnNzLm1vZGVsLCBpZCA9IDcpDQpgYGANCg0KKiBNZWRpYW50ZSBsYSBjb21wYXJhY2nDs24gZGUgbGFzIGdyYWZpY2FzLCBzZSBvcHRhIHBvciB1biBtb2RlbG8gY29uIDcgcHJlZGljdG9yZXMuICAgIA0KDQojIyMgMTBhLjg6IFZhbGlkYWNpb24gQ3J1emFkYQ0KDQpgYGB7ciBDViBNb2RlbG8gQlNTfQ0Kc2V0LnNlZWQoMTApDQoNCmluZGV4IDwtIGNyZWF0ZURhdGFQYXJ0aXRpb24oY2xlYW5fZGF0YSRQcmljZSwgcCA9IDAuNywgbGlzdCA9IEZBTFNFKQ0KDQpkYXRhLnRyYWluIDwtIGNsZWFuX2RhdGFbaW5kZXgsIF0NCmRhdGEudGVzdCA8LSBjbGVhbl9kYXRhWy1pbmRleCwgXQ0KYGBgDQoNCg0KYGBge3IgQ1YgTW9kZWxvIEJTUyAyfQ0Kc2V0LnNlZWQoMTApDQptb2RlbC5md2QgPC0gcmVnc3Vic2V0cyhQcmljZSB+LiwgZGF0YSA9IGRhdGEudHJhaW4sIG52bWF4ID0gNykNCnN1bW1hcnkobW9kZWwuZndkKQ0KYGBgDQoNCiMjIyAxMGEuOTogUk1TRQ0KDQpgYGB7ciBDYWxjdWxvIGRlIFJNU0UgQlNTfQ0KdmFsLmVycm9ycyA9IHJlcChOQSw3KQ0KeC50ZXN0IDwtIG1vZGVsLm1hdHJpeChQcmljZSB+LiwgZGF0YSA9IGRhdGEudGVzdCkNCg0KZm9yKGkgaW4gMTo3KSANCnsNCiAgY29lZmljaWVudGVzIDwtIGNvZWYobW9kZWwuZndkLCBpZCA9IGkpDQogIHByZWRpY3Rpb25zIDwtIHgudGVzdFssbmFtZXMoY29lZmljaWVudGVzKV0gJSolIGNvZWZpY2llbnRlcw0KICB2YWwuZXJyb3JzW2ldIDwtIG1lYW4oKGRhdGEudGVzdCRQcmljZSAtIHByZWRpY3Rpb25zKV4yKQ0KfQ0KDQpybXNlIDwtIHNxcnQobWVhbih2YWwuZXJyb3JzKSkNCnJtc2UNCmBgYA0KDQoqIFRyYXMgdW5hIHZhbGlkYWNpw7NuIGNydXphZGEgc2Ugb2J0aWVuZSB1biB2YWxvciBkZSBSTVNFIGRlIDE0MDguICAgIA0KDQoNCiMjIDEwYjogUmlkZ2UNCg0KKiBNZWRpYW50ZSBSaWRnZSBzZSBvYnRpZW5lIHVuIG1vZGVsbyBhIHBhcnRpciBkZSBsb3MgMzQgcHJlZGljdG9yZXMgaW5pY2lhbGVzLCBwb25kZXJhbmRvIGxhIGluZmx1ZW5jaWEgZGUgY2FkYSBwcmVkaWN0b3IgcGVybyBzaW4gZWxpbWluYXIgbmluZ3VubyBkZSBsb3MgbWlzbW9zIGRlbCBtb2RlbG8uDQoNCmBgYHtyfQ0Kc2V0LnNlZWQoMTApDQoNCmluZGV4IDwtIGNyZWF0ZURhdGFQYXJ0aXRpb24oY2xlYW5fZGF0YSRQcmljZSwgcCA9IDAuNywgbGlzdCA9IEZBTFNFKQ0KDQpkYXRhLnRyYWluIDwtIGNsZWFuX2RhdGFbaW5kZXgsIF0NCmRhdGEudGVzdCA8LSBjbGVhbl9kYXRhWy1pbmRleCwgXQ0KYGBgDQoNCiogU2UgcHJvZHVjZSBlbiBjb25qdW50byBkZSBlbnRyZW5hbWllbnRvIHkgZGUgcHJ1ZWJhLiAgICANCg0KIyMgMTBiLjE6IE1vZGVsbw0KDQpgYGB7cn0NCg0KeCA9IG1vZGVsLm1hdHJpeChQcmljZSB+IC4gLCBkYXRhLnRyYWluKVssIC0xXQ0KeSA9IGFzLm1hdHJpeChkYXRhLnRyYWluJFByaWNlKQ0KDQpyaWRnZS5tb2RlbCA9IGdsbW5ldCh4LCB5LCBhbHBoYSA9IDApDQoNCmBgYA0KDQoNCmBgYHtyfQ0KYmV0YT1jb2VmKHJpZGdlLm1vZGVsKQ0KDQp0bXAgPC0gYXMuZGF0YS5mcmFtZShhcy5tYXRyaXgoYmV0YSkpDQp0bXAkY29lZiA8LSByb3cubmFtZXModG1wKQ0KdG1wIDwtIHJlc2hhcGU6Om1lbHQodG1wLCBpZCA9ICJjb2VmIikNCnRtcCR2YXJpYWJsZSA8LSBhcy5udW1lcmljKGdzdWIoInMiLCAiIiwgdG1wJHZhcmlhYmxlKSkNCnRtcCRsYW1iZGEgPC0gcmlkZ2UubW9kZWwkbGFtYmRhW3RtcCR2YXJpYWJsZSsxXQ0KdG1wJG5vcm0gPC0gYXBwbHkoYWJzKGJldGFbLTEsXSksIDIsIHN1bSlbdG1wJHZhcmlhYmxlKzFdDQoNCmdncGxvdCh0bXBbdG1wJGNvZWYgIT0gIihJbnRlcmNlcHQpIixdLCBhZXMobGFtYmRhLCB2YWx1ZSwgY29sb3IgPSBjb2VmLCBncm91cCA9IGNvZWYsICkpICsgDQogIGdlb21fbGluZSgpICsgDQogICAgc2NhbGVfeF9sb2cxMCgpICsgDQogICAgeGxhYigiTGFtYmRhIChsb2cgc2NhbGUpIikgKyANCiAgICBndWlkZXMoY29sb3IgPSBndWlkZV9sZWdlbmQodGl0bGUgPSAiIiksIA0KICAgICAgICAgICBsaW5ldHlwZSA9IGd1aWRlX2xlZ2VuZCh0aXRsZSA9ICIiKSkgKw0KICAgIHRoZW1lX2J3KCkgKyANCiAgICB0aGVtZShsZWdlbmQua2V5LndpZHRoID0gdW5pdCgzLCJsaW5lcyIpKSANCmBgYA0KDQoNCmBgYHtyfQ0KcGxvdChyaWRnZS5tb2RlbCwgeHZhciA9ICJsYW1iZGEiLCBsYWJlbCA9IFRSVUUpDQpgYGANCg0KKiBNZWRpYW50ZSBsb3MgZG9zIGdyYWZpY29zIGFudGVyaW9yZXMgc2UgcHVlZGUgb2JzZXJ2YXIgcXVlIGxvcyBwcmVkaWN0b3JlcyBtYXMgc2lnbmlmaWNhdGl2b3Mgc29uIGFxdWVsbG9zIHF1ZSwgZHVyYW50ZSBlbCBwcm9jZXNhbWllbnRvIGRlbCBtb2RlbG8sIGRlbW9yYW4gbWFzIGVuIGFjZXJjYXJzZSBhIHZhbG9yZXMgY2VyY2Fub3MgYSBjZXJvLiAgICANCg0KYGBge3J9DQoNCmluZGljZXMgPC0gc2FtcGxlKGMoVFJVRSxGQUxTRSksIG5yb3coZGF0YS50cmFpbiksIHJlcGxhY2UgPSBUUlVFKQ0KDQpjdi5vdXQgPC0gY3YuZ2xtbmV0KHhbaW5kaWNlcyxdLCB5W2luZGljZXNdLCBhbHBoYSA9IDApDQoNCnBsb3QoY3Yub3V0KQ0KY29lZihjdi5vdXQpDQpgYGANCg0KDQoqIExhIGdyYWZpY2EgYW50ZXJpb3IgcmVwcmVzZW50YSBjb21vIGVsIGVycm9yIGluY3JlbWVudGEgYSBtYXlvcmVzIHZhbG9yZXMgZGUgbG9nKGxhbWJkYSkuICAgIA0KKiBMYXMgbGluZWFzIHB1bnRlYWRhcyBpbmRpY2FuLCBlbCB2YWxvciBtaW5pbW8gZGUgbXNlIG9idGVuaWRvIGVuIGxhIHZhbGlkYWNpb24gY3J1emFkYSB5IGVsIHZhbG9yIGRlIGxhIGRlcmVjaGEgcmVwcmVzZW50YSBlbCBlcnJvciBlc3RhbmRhci4gICAgIA0KDQoNCmBgYHtyfQ0KDQpiZXN0bGFtID0gY3Yub3V0JGxhbWJkYS5taW4NCmJlc3RsYW0NCg0KYGBgDQoNCiogU2UgZW1wbGVhIGVsIG1lbm9yIGxhbWJkYSBvYnRlbmlkbyBkZSBsYSB2YWxpZGFjaW9uIGNydXphZGEgcHJvcGlhIGRlbCBtb2RlbG8uICAgIA0KDQpgYGB7cn0NCg0KcmlkZ2UucHJlZCA8LSBwcmVkaWN0KHJpZGdlLm1vZGVsLCBzID0gYmVzdGxhbSwgbmV3eCA9IHhbLWluZGljZXMsXSkNCnNxcnQobWVhbigocmlkZ2UucHJlZCAtIHlbLWluZGljZXNdKV4yKSkNCg0KYGBgDQoNCiogU2Ugb2J0aWVuZSB1biB2YWxvciBkZSBybXNlIGRlIDgwNyBvYnRlbmlkbyBtZWRpYW50ZSBsb3MgZGF0b3MgZGUgZW50cmVuYW1pZW50by4gICAgDQoNCiMjIDEwYi4yOiBWYWxpZGFjaW9uIENydXphZGEgICANCg0KYGBge3J9DQoNCnggPSBtb2RlbC5tYXRyaXgoUHJpY2UgfiAuICwgZGF0YS50ZXN0KVssIC0xXQ0KeSA9IGFzLm1hdHJpeChkYXRhLnRlc3QkUHJpY2UpDQoNCnJpZGdlLm1vZGVsID0gZ2xtbmV0KHgsIHksIGFscGhhID0gMCkNCg0KYGBgDQoNCg0KYGBge3J9DQoNCmluZGljZXMgPC0gc2FtcGxlKGMoVFJVRSxGQUxTRSksIG5yb3coZGF0YS50ZXN0KSwgcmVwbGFjZSA9IFRSVUUpDQoNCmN2Lm91dCA8LSBjdi5nbG1uZXQoeFtpbmRpY2VzLF0sIHlbaW5kaWNlc10sIGFscGhhID0gMCkNCg0KcGxvdChjdi5vdXQpDQpjb2VmKGN2Lm91dCkNCmBgYA0KDQoqIEVsIHZhbG9yIG1pbmltbyBkZSBNU0UgeSBlbCBlcnJvciBlc3RhbmRhciwgc29uIHNpbWlsYXJlcyBhIGxvcyBvYnRlbmlkb3MgbWVkaWFudGUgZGF0b3MgZGUgZW50cmVuYW1pZW50by4gICANCg0KDQpgYGB7cn0NCg0KYmVzdGxhbSA9IGN2Lm91dCRsYW1iZGEubWluDQpiZXN0bGFtDQoNCmBgYA0KDQoqIFNlIGVtcGxlYSBlbCBtZW5vciBsYW1iZGEgb2J0ZW5pZG8gZGUgbGEgdmFsaWRhY2lvbiBjcnV6YWRhIHByb3BpYSBkZWwgbW9kZWxvLiAgICANCg0KYGBge3J9DQoNCnJpZGdlLnByZWQgPC0gcHJlZGljdChyaWRnZS5tb2RlbCwgcyA9IGJlc3RsYW0sIG5ld3ggPSB4Wy1pbmRpY2VzLF0pDQpzcXJ0KG1lYW4oKHJpZGdlLnByZWQgLSB5Wy1pbmRpY2VzXSleMikpDQoNCmBgYA0KDQoqIFNlIG9idGllbmUgdW4gdmFsb3IgZGUgcm1zZSBkZSA3NzUgZGF0b3MgZGUgcHJ1ZWJhLg0KDQoNCg0KIyMgMTBjOiBMYXNzbyAgICAgIA0KDQoqIE1lZGlhbnRlIExhc3NvIHNlIG9idGllbmUgdW4gbW9kZWxvIGEgcGFydGlyIGRlIGxvcyAzNCBwcmVkaWN0b3JlcyBpbmljaWFsZXMsIHBvbmRlcmFuZG8gbGEgaW5mbHVlbmNpYSBkZSBjYWRhIHByZWRpY3Rvci4gQSBkaWZlcmVuY2lhIGRlIFJpZGdlLCB1c2FuZG8gTGFzc28gc2kgc2UgcmVhbGl6YSB1bmEgc2VsZWNjaW9uIGRlIHZhcmlhYmxlcy4gICAgIA0KDQpgYGB7cn0NCnNldC5zZWVkKDEwKQ0KDQppbmRleCA8LSBjcmVhdGVEYXRhUGFydGl0aW9uKGNsZWFuX2RhdGEkUHJpY2UsIHAgPSAwLjcsIGxpc3QgPSBGQUxTRSkNCg0KZGF0YS50cmFpbiA8LSBjbGVhbl9kYXRhW2luZGV4LCBdDQpkYXRhLnRlc3QgPC0gY2xlYW5fZGF0YVstaW5kZXgsIF0NCmBgYA0KDQoqIFNlIHByb2R1Y2UgZW4gY29uanVudG8gZGUgZW50cmVuYW1pZW50byB5IGRlIHBydWViYS4gICAgDQoNCiMjIDEwYy4xOiBNb2RlbG8NCg0KYGBge3J9DQoNCnggPSBtb2RlbC5tYXRyaXgoUHJpY2UgfiAuICwgZGF0YS50cmFpbilbLCAtMV0NCnkgPSBhcy5tYXRyaXgoZGF0YS50cmFpbiRQcmljZSkNCg0KbGFzc28ubW9kZWwgPSBnbG1uZXQoeCwgeSwgYWxwaGEgPSAxKQ0KDQpgYGANCg0KDQpgYGB7cn0NCmJldGE9Y29lZihsYXNzby5tb2RlbCkNCg0KdG1wIDwtIGFzLmRhdGEuZnJhbWUoYXMubWF0cml4KGJldGEpKQ0KdG1wJGNvZWYgPC0gcm93Lm5hbWVzKHRtcCkNCnRtcCA8LSByZXNoYXBlOjptZWx0KHRtcCwgaWQgPSAiY29lZiIpDQp0bXAkdmFyaWFibGUgPC0gYXMubnVtZXJpYyhnc3ViKCJzIiwgIiIsIHRtcCR2YXJpYWJsZSkpDQp0bXAkbGFtYmRhIDwtIGxhc3NvLm1vZGVsJGxhbWJkYVt0bXAkdmFyaWFibGUrMV0NCnRtcCRub3JtIDwtIGFwcGx5KGFicyhiZXRhWy0xLF0pLCAyLCBzdW0pW3RtcCR2YXJpYWJsZSsxXQ0KDQpnZ3Bsb3QodG1wW3RtcCRjb2VmICE9ICIoSW50ZXJjZXB0KSIsXSwgYWVzKGxhbWJkYSwgdmFsdWUsIGNvbG9yID0gY29lZiwgZ3JvdXAgPSBjb2VmLCApKSArIA0KICBnZW9tX2xpbmUoKSArIA0KICAgIHNjYWxlX3hfbG9nMTAoKSArIA0KICAgIHhsYWIoIkxhbWJkYSAobG9nIHNjYWxlKSIpICsgDQogICAgZ3VpZGVzKGNvbG9yID0gZ3VpZGVfbGVnZW5kKHRpdGxlID0gIiIpLCANCiAgICAgICAgICAgbGluZXR5cGUgPSBndWlkZV9sZWdlbmQodGl0bGUgPSAiIikpICsNCiAgICB0aGVtZV9idygpICsgDQogICAgdGhlbWUobGVnZW5kLmtleS53aWR0aCA9IHVuaXQoMywibGluZXMiKSkgDQpgYGANCg0KDQpgYGB7cn0NCnBsb3QobGFzc28ubW9kZWwsIHh2YXIgPSAibGFtYmRhIiwgbGFiZWwgPSBUUlVFKQ0KYGBgDQoNCg0KKiBNZWRpYW50ZSBsb3MgZG9zIGdyYWZpY29zIGFudGVyaW9yZXMgc2UgcHVlZGUgb2JzZXJ2YXIgcXVlIGxvcyBwcmVkaWN0b3JlcyBtYXMgc2lnbmlmaWNhdGl2b3Mgc29uIGFxdWVsbG9zIHF1ZSwgZHVyYW50ZSBlbCBwcm9jZXNhbWllbnRvIGRlbCBtb2RlbG8sIGRlbW9yYW4gbWFzIGVuIGFjZXJjYXJzZSBhIHZhbG9yZXMgY2VyY2Fub3MgYSBjZXJvLiAgICANCg0KDQpgYGB7cn0NCg0KaW5kaWNlcyA8LSBzYW1wbGUoYyhUUlVFLEZBTFNFKSwgbnJvdyhkYXRhLnRyYWluKSwgcmVwbGFjZSA9IFRSVUUpDQoNCmN2Lm91dCA8LSBjdi5nbG1uZXQoeFtpbmRpY2VzLF0sIHlbaW5kaWNlc10sIGFscGhhID0gMSkNCg0KcGxvdChjdi5vdXQpDQpjb2VmKGN2Lm91dCkNCmBgYA0KDQoqIEVsIHZhbG9yIG1pbmltbyBkZSBNU0UgeSBlbCBlcnJvciBlc3RhbmRhciwgc29uIHNpbWlsYXJlcyBhIGxvcyBvYnRlbmlkb3MgbWVkaWFudGUgZGF0b3MgZGUgZW50cmVuYW1pZW50by4gICANCg0KDQoNCmBgYHtyfQ0KDQpiZXN0bGFtID0gY3Yub3V0JGxhbWJkYS5taW4NCmJlc3RsYW0NCg0KYGBgDQoNCiogU2UgZW1wbGVhIGVsIG1lbm9yIGxhbWJkYSBvYnRlbmlkbyBkZSBsYSB2YWxpZGFjaW9uIGNydXphZGEgcHJvcGlhIGRlbCBtb2RlbG8uICAgIA0KDQoNCmBgYHtyfQ0KDQpsYXNzby5wcmVkIDwtIHByZWRpY3QobGFzc28ubW9kZWwsIHMgPSBiZXN0bGFtLCBuZXd4ID0geFstaW5kaWNlcyxdKQ0Kc3FydChtZWFuKChsYXNzby5wcmVkIC0geVstaW5kaWNlc10pXjIpKQ0KDQpgYGANCg0KKiBTZSBvYnRpZW5lIHVuIHZhbG9yIGRlIHJtc2UgZGUgODEzIGRhdG9zIGRlIGVudHJlbmFtaWVudG8uDQoNCiMjIDEwYy4yOiBWYWxpZGFjaW9uIENydXphZGEgDQoNCmBgYHtyfQ0KDQp4ID0gbW9kZWwubWF0cml4KFByaWNlIH4gLiAsIGRhdGEudGVzdClbLCAtMV0NCnkgPSBhcy5tYXRyaXgoZGF0YS50ZXN0JFByaWNlKQ0KDQpsYXNzby5tb2RlbCA9IGdsbW5ldCh4LCB5LCBhbHBoYSA9IDEpDQoNCmBgYA0KDQpgYGB7cn0NCg0KaW5kaWNlcyA8LSBzYW1wbGUoYyhUUlVFLEZBTFNFKSwgbnJvdyhkYXRhLnRlc3QpLCByZXBsYWNlID0gVFJVRSkNCg0KY3Yub3V0IDwtIGN2LmdsbW5ldCh4W2luZGljZXMsXSwgeVtpbmRpY2VzXSwgYWxwaGEgPSAxKQ0KDQpwbG90KGN2Lm91dCkNCmNvZWYoY3Yub3V0KQ0KYGBgDQoNCmBgYHtyfQ0KDQpiZXN0bGFtID0gY3Yub3V0JGxhbWJkYS5taW4NCmJlc3RsYW0NCg0KYGBgDQoNCmBgYHtyfQ0KDQpsYXNzby5wcmVkIDwtIHByZWRpY3QobGFzc28ubW9kZWwsIHMgPSBiZXN0bGFtLCBuZXd4ID0geFstaW5kaWNlcyxdKQ0Kc3FydChtZWFuKChsYXNzby5wcmVkIC0geVstaW5kaWNlc10pXjIpKQ0KDQpgYGANCg0KKiBTZSBvYnRpZW5lIHVuIHZhbG9yIGRlIHJtc2UgZGUgODEzIGRhdG9zIGRlIHBydWViYS4gICANCg0KDQojIDExOiBDb25jbHVzacOzbg0KDQpEYWRvIHF1ZSBlcyB1biBkYXRhc2V0IGNvbiB1bmEgY2FudGlkYWQgYmFqYSBkZSBpbnN0YW5jaWFzICgxNTAwIGFwcm94KSwgeSBxdWUgbWVkaWFudGUgZWwgdXNvIGRlIExhc3NvIHNlIG9idHV2byB1biByZXN1bHRhZG8gYWNlcHRhYmxlIGNvbXBhcmFkbyBhIGxvcyBybXNlIGNhbGN1bGFkb3MgY29uIG90cm9zIG3DqXRvZG9zIChCZXN0IFN1YnNldCB5IFJpZGdlKSwgc2Ugb3B0YSBwb3IgdXNhciBlc2UgbWV0b2RvIHBhcmEgc2VsZWNjaW9uIGRlIHZhcmlhYmxlcy4gICAgDQoNCg0K