Theory

The caret package (Classification And REgression Training) is a comprehensive package that gives access to a wide variety of machine learning algorithms through a single training interface.

Install packages and load libraries

# install.packages("dplyr") # Data manipulation
library(dplyr)
# install.packages("ggplot2") # Plots
library(ggplot2)
# install.packages("lattice") # Trellis graphics
library(lattice)
# install.packages("caret") # Machine learning algorithms
library(caret)
# datasets ships with base R, so it needs no installation (this tutorial reads its own CSV file)
library(datasets)
# install.packages("DataExplorer") # Exploratory data analysis
library(DataExplorer)
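
As an optional convenience, the commented-out install.packages() calls above could be replaced by a check that reports which packages are missing. The following is a minimal sketch in base R; the helper name `paquetes_faltantes` is hypothetical and not part of the original tutorial.

```r
# Hypothetical helper (not in the original tutorial): returns the packages
# from `paquetes` that are not installed in the current R library.
paquetes_faltantes <- function(paquetes) {
  instalado <- vapply(paquetes, requireNamespace, logical(1), quietly = TRUE)
  paquetes[!instalado]
}

necesarios <- c("dplyr", "ggplot2", "lattice", "caret", "DataExplorer")
faltantes <- paquetes_faltantes(necesarios)

# Only report here; run install.packages(faltantes) manually if desired.
if (length(faltantes) > 0) {
  message("Missing packages: ", paste(faltantes, collapse = ", "))
}
```

Reporting instead of installing keeps the script free of side effects when it is re-run.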

Load the dataset

df <- read.csv("C:\\Users\\eveyu\\Downloads\\concentración\\m2_R\\M1_data.csv")
head(df)
##   trust_apple interest_computers age_computer user_pcmac appleproducts_count
## 1          No                  4            8         PC                   0
## 2         Yes                  2            4         PC                   1
## 3         Yes                  5            6         PC                   0
## 4         Yes                  2            6      Apple                   4
## 5         Yes                  4            4      Apple                   7
## 6         Yes                  3            1      Apple                   2
##   familiarity_m1 f_batterylife f_price f_size f_multitasking f_noise
## 1             No             5       4      3              4       4
## 2             No             5       5      5              3       4
## 3             No             3       4      2              4       1
## 4             No             4       3      3              4       4
## 5            Yes             5       3      3              4       4
## 6             No             5       5      4              4       5
##   f_performance f_neural f_synergy f_performanceloss m1_consideration
## 1             2        2         1                 1                1
## 2             5        2         2                 4                2
## 3             4        2         2                 2                4
## 4             4        4         4                 3                2
## 5             5        3         4                 4                4
## 6             5        5         4                 2                2
##   m1_purchase gender age_group income_group   status          domain
## 1         Yes   Male         2            2  Student         Science
## 2          No   Male         2            3 Employed         Finance
## 3         Yes   Male         2            2  Student IT & Technology
## 4          No Female         2            2  Student  Arts & Culture
## 5         Yes   Male         5            7 Employed     Hospitality
## 6          No Female         2            2  Student        Politics

Understand the dataset

summary(df)
##  trust_apple        interest_computers  age_computer    user_pcmac       
##  Length:133         Min.   :2.000      Min.   :0.000   Length:133        
##  Class :character   1st Qu.:3.000      1st Qu.:1.000   Class :character  
##  Mode  :character   Median :4.000      Median :3.000   Mode  :character  
##                     Mean   :3.812      Mean   :2.827                     
##                     3rd Qu.:5.000      3rd Qu.:5.000                     
##                     Max.   :5.000      Max.   :9.000                     
##  appleproducts_count familiarity_m1     f_batterylife      f_price     
##  Min.   :0.000       Length:133         Min.   :1.000   Min.   :1.000  
##  1st Qu.:1.000       Class :character   1st Qu.:4.000   1st Qu.:3.000  
##  Median :3.000       Mode  :character   Median :5.000   Median :4.000  
##  Mean   :2.609                          Mean   :4.526   Mean   :3.872  
##  3rd Qu.:4.000                          3rd Qu.:5.000   3rd Qu.:5.000  
##  Max.   :8.000                          Max.   :5.000   Max.   :5.000  
##      f_size      f_multitasking    f_noise      f_performance      f_neural    
##  Min.   :1.000   Min.   :2.00   Min.   :1.000   Min.   :2.000   Min.   :1.000  
##  1st Qu.:2.000   1st Qu.:4.00   1st Qu.:3.000   1st Qu.:4.000   1st Qu.:2.000  
##  Median :3.000   Median :4.00   Median :4.000   Median :5.000   Median :3.000  
##  Mean   :3.158   Mean   :4.12   Mean   :3.729   Mean   :4.398   Mean   :3.165  
##  3rd Qu.:4.000   3rd Qu.:5.00   3rd Qu.:5.000   3rd Qu.:5.000   3rd Qu.:4.000  
##  Max.   :5.000   Max.   :5.00   Max.   :5.000   Max.   :5.000   Max.   :5.000  
##    f_synergy     f_performanceloss m1_consideration m1_purchase       
##  Min.   :1.000   Min.   :1.000     Min.   :1.000    Length:133        
##  1st Qu.:3.000   1st Qu.:3.000     1st Qu.:3.000    Class :character  
##  Median :4.000   Median :4.000     Median :4.000    Mode  :character  
##  Mean   :3.466   Mean   :3.376     Mean   :3.609                      
##  3rd Qu.:4.000   3rd Qu.:4.000     3rd Qu.:5.000                      
##  Max.   :5.000   Max.   :5.000     Max.   :5.000                      
##     gender            age_group      income_group     status         
##  Length:133         Min.   : 1.00   Min.   :1.00   Length:133        
##  Class :character   1st Qu.: 2.00   1st Qu.:1.00   Class :character  
##  Mode  :character   Median : 2.00   Median :2.00   Mode  :character  
##                     Mean   : 2.97   Mean   :2.97                     
##                     3rd Qu.: 3.00   3rd Qu.:4.00                     
##                     Max.   :10.00   Max.   :7.00                     
##     domain         
##  Length:133        
##  Class :character  
##  Mode  :character  
##                    
##                    
## 
str(df)
## 'data.frame':    133 obs. of  22 variables:
##  $ trust_apple        : chr  "No" "Yes" "Yes" "Yes" ...
##  $ interest_computers : int  4 2 5 2 4 3 3 3 4 5 ...
##  $ age_computer       : int  8 4 6 6 4 1 2 0 2 0 ...
##  $ user_pcmac         : chr  "PC" "PC" "PC" "Apple" ...
##  $ appleproducts_count: int  0 1 0 4 7 2 7 0 6 7 ...
##  $ familiarity_m1     : chr  "No" "No" "No" "No" ...
##  $ f_batterylife      : int  5 5 3 4 5 5 4 5 4 5 ...
##  $ f_price            : int  4 5 4 3 3 5 3 5 4 3 ...
##  $ f_size             : int  3 5 2 3 3 4 4 4 3 5 ...
##  $ f_multitasking     : int  4 3 4 4 4 4 5 4 4 5 ...
##  $ f_noise            : int  4 4 1 4 4 5 5 3 4 5 ...
##  $ f_performance      : int  2 5 4 4 5 5 5 3 4 5 ...
##  $ f_neural           : int  2 2 2 4 3 5 3 2 3 3 ...
##  $ f_synergy          : int  1 2 2 4 4 4 3 2 3 5 ...
##  $ f_performanceloss  : int  1 4 2 3 4 2 2 3 4 5 ...
##  $ m1_consideration   : int  1 2 4 2 4 2 3 1 5 5 ...
##  $ m1_purchase        : chr  "Yes" "No" "Yes" "No" ...
##  $ gender             : chr  "Male" "Male" "Male" "Female" ...
##  $ age_group          : int  2 2 2 2 5 2 6 2 8 4 ...
##  $ income_group       : int  2 3 2 2 7 2 7 2 7 6 ...
##  $ status             : chr  "Student" "Employed" "Student" "Student" ...
##  $ domain             : chr  "Science" "Finance" "IT & Technology" "Arts & Culture" ...
# create_report(df)
plot_missing(df)
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation idioms with `aes()`.
## ℹ See also `vignette("ggplot2-in-packages")` for more information.
## ℹ The deprecated feature was likely used in the DataExplorer package.
##   Please report the issue at
##   <https://github.com/boxuancui/DataExplorer/issues>.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

plot_histogram(df)

df_numeric <- df[, sapply(df, is.numeric)]
str(df_numeric)
## 'data.frame':    133 obs. of  15 variables:
##  $ interest_computers : int  4 2 5 2 4 3 3 3 4 5 ...
##  $ age_computer       : int  8 4 6 6 4 1 2 0 2 0 ...
##  $ appleproducts_count: int  0 1 0 4 7 2 7 0 6 7 ...
##  $ f_batterylife      : int  5 5 3 4 5 5 4 5 4 5 ...
##  $ f_price            : int  4 5 4 3 3 5 3 5 4 3 ...
##  $ f_size             : int  3 5 2 3 3 4 4 4 3 5 ...
##  $ f_multitasking     : int  4 3 4 4 4 4 5 4 4 5 ...
##  $ f_noise            : int  4 4 1 4 4 5 5 3 4 5 ...
##  $ f_performance      : int  2 5 4 4 5 5 5 3 4 5 ...
##  $ f_neural           : int  2 2 2 4 3 5 3 2 3 3 ...
##  $ f_synergy          : int  1 2 2 4 4 4 3 2 3 5 ...
##  $ f_performanceloss  : int  1 4 2 3 4 2 2 3 4 5 ...
##  $ m1_consideration   : int  1 2 4 2 4 2 3 1 5 5 ...
##  $ age_group          : int  2 2 2 2 5 2 6 2 8 4 ...
##  $ income_group       : int  2 3 2 2 7 2 7 2 7 6 ...
plot_correlation(df_numeric)
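
The correlations shown in the heatmap can also be inspected numerically with base R's cor(). A small sketch, assuming only three of the numeric columns and the six rows printed by head(df) above, so the values are illustrative and not the full-sample correlations:

```r
# First six rows of three numeric columns, copied from the head(df) output.
muestra <- data.frame(
  f_batterylife = c(5, 5, 3, 4, 5, 5),
  f_price       = c(4, 5, 4, 3, 3, 5),
  f_performance = c(2, 5, 4, 4, 5, 5)
)

# Keep numeric columns only, as done for df_numeric above.
numericas <- muestra[, sapply(muestra, is.numeric), drop = FALSE]

# Pairwise Pearson correlations, rounded for readability.
correlaciones <- round(cor(numericas), 2)
print(correlaciones)
```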

# Convert categorical variables to factors

df$trust_apple <- as.factor(df$trust_apple)
df$user_pcmac <- as.factor(df$user_pcmac)
df$familiarity_m1 <- as.factor(df$familiarity_m1)
df$m1_purchase <- as.factor(df$m1_purchase)
df$gender <- as.factor(df$gender)
df$status <- as.factor(df$status)
df$domain <- as.factor(df$domain)

str(df)
## 'data.frame':    133 obs. of  22 variables:
##  $ trust_apple        : Factor w/ 2 levels "No","Yes": 1 2 2 2 2 2 2 1 2 2 ...
##  $ interest_computers : int  4 2 5 2 4 3 3 3 4 5 ...
##  $ age_computer       : int  8 4 6 6 4 1 2 0 2 0 ...
##  $ user_pcmac         : Factor w/ 4 levels "Apple","Hp","Other",..: 4 4 4 1 1 1 1 4 1 1 ...
##  $ appleproducts_count: int  0 1 0 4 7 2 7 0 6 7 ...
##  $ familiarity_m1     : Factor w/ 2 levels "No","Yes": 1 1 1 1 2 1 1 1 2 2 ...
##  $ f_batterylife      : int  5 5 3 4 5 5 4 5 4 5 ...
##  $ f_price            : int  4 5 4 3 3 5 3 5 4 3 ...
##  $ f_size             : int  3 5 2 3 3 4 4 4 3 5 ...
##  $ f_multitasking     : int  4 3 4 4 4 4 5 4 4 5 ...
##  $ f_noise            : int  4 4 1 4 4 5 5 3 4 5 ...
##  $ f_performance      : int  2 5 4 4 5 5 5 3 4 5 ...
##  $ f_neural           : int  2 2 2 4 3 5 3 2 3 3 ...
##  $ f_synergy          : int  1 2 2 4 4 4 3 2 3 5 ...
##  $ f_performanceloss  : int  1 4 2 3 4 2 2 3 4 5 ...
##  $ m1_consideration   : int  1 2 4 2 4 2 3 1 5 5 ...
##  $ m1_purchase        : Factor w/ 2 levels "No","Yes": 2 1 2 1 2 1 2 1 2 2 ...
##  $ gender             : Factor w/ 2 levels "Female","Male": 2 2 2 1 2 1 2 2 2 2 ...
##  $ age_group          : int  2 2 2 2 5 2 6 2 8 4 ...
##  $ income_group       : int  2 3 2 2 7 2 7 2 7 6 ...
##  $ status             : Factor w/ 6 levels "Employed","Retired",..: 4 1 4 4 1 4 1 4 1 1 ...
##  $ domain             : Factor w/ 22 levels "Administration & Public Services",..: 21 10 13 3 12 17 13 22 13 12 ...
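
The column-by-column conversion above could also be written as a single step that converts every character column. A minimal sketch on a toy data frame (the name `ejemplo` is hypothetical; its values are copied from the first rows of head(df)):

```r
# Toy data frame standing in for df.
ejemplo <- data.frame(
  trust_apple = c("No", "Yes", "Yes"),
  m1_purchase = c("Yes", "No", "Yes"),
  f_price     = c(4L, 5L, 4L),
  stringsAsFactors = FALSE
)

# Find the character columns and convert all of them to factors at once.
columnas_texto <- sapply(ejemplo, is.character)
ejemplo[columnas_texto] <- lapply(ejemplo[columnas_texto], as.factor)

str(ejemplo)
```

Numeric columns are left untouched, so the same two lines work as new categorical columns are added.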

NOTE: The variable we want to predict must be stored as a factor so that caret treats the task as classification.

Split the dataset

# Typically 80/20 (training/test)
set.seed(123)
renglones_entrenamiento <- createDataPartition(df$m1_purchase, p=0.8, list=FALSE)
entrenamiento <- df[renglones_entrenamiento, ]
prueba <- df[-renglones_entrenamiento, ]
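
When the outcome is a factor, createDataPartition() samples within each class so that both subsets keep roughly the same class proportions. The idea can be sketched in base R as follows. This is not caret's implementation: the helper `particion_estratificada` is hypothetical, the use of ceiling() is an assumption, and the labels are synthetic, using the class counts implied by the confusion matrices in this document (45 "No", 88 "Yes").

```r
# Hypothetical helper (not caret's code): samples a proportion p of the row
# indices within each class, so class ratios are roughly preserved.
particion_estratificada <- function(y, p = 0.8) {
  indices_por_clase <- split(seq_along(y), y)
  seleccion <- lapply(indices_por_clase, function(idx) {
    # Index through sample.int() to avoid sample()'s length-1 special case.
    idx[sample.int(length(idx), ceiling(p * length(idx)))]
  })
  sort(unlist(seleccion, use.names = FALSE))
}

# Synthetic outcome with 45 "No" and 88 "Yes" labels.
set.seed(123)
y <- factor(rep(c("No", "Yes"), times = c(45, 88)))
filas <- particion_estratificada(y, p = 0.8)

table(y[filas])   # class counts in the training subset
table(y[-filas])  # class counts in the test subset
```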

Different Modeling Methods

Commonly used machine learning methods available through caret include:

  • SVM: Support Vector Machine. There are several kernel variants: linear (svmLinear), radial (svmRadial), polynomial (svmPoly), etc.
  • Decision tree: rpart
  • Neural networks: nnet
  • Random forest: rf
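
Because every method goes through the same train() call, switching methods mostly means changing the method name and its tuning grid. The sketch below stores one grid per method. The three SVM grids are the ones used in this tutorial, while the rpart, nnet and rf values are illustrative assumptions. The loop uses the built-in iris data as a stand-in (not the tutorial's dataset) and only runs if caret and kernlab are installed.

```r
# Tuning grids keyed by caret method name. The SVM grids match this tutorial;
# the rpart, nnet and rf values are illustrative assumptions.
rejillas <- list(
  svmLinear = data.frame(C = 1),
  svmRadial = data.frame(sigma = 1, C = 1),
  svmPoly   = data.frame(degree = 1, scale = 1, C = 1),
  rpart     = data.frame(cp = 0.01),
  nnet      = data.frame(size = 3, decay = 0.1),
  rf        = data.frame(mtry = 2)
)

# Run only if caret and kernlab (the SVM backend) are installed.
paquetes_ok <- all(vapply(c("caret", "kernlab"), requireNamespace,
                          logical(1), quietly = TRUE))
if (paquetes_ok) {
  library(caret)
  set.seed(123)
  for (metodo in c("svmLinear", "svmRadial", "svmPoly")) {
    # Same call for every method; only `method` and `tuneGrid` change.
    ajuste <- train(Species ~ ., data = iris,
                    method = metodo,
                    preProcess = c("center", "scale"),
                    trControl = trainControl(method = "cv", number = 5),
                    tuneGrid = rejillas[[metodo]])
    cat(metodo, "CV accuracy:", round(ajuste$results$Accuracy, 3), "\n")
  }
}
```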

Model 1. Linear SVM

modelo1 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "svmLinear", # Change to try another method
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10),
                 tuneGrid = data.frame(C=1) # Change to match the method's tuning parameters
                 )

resultado_entrenamiento1 <- predict(modelo1, entrenamiento)
resultado_prueba1 <- predict(modelo1, prueba)

# Confusion matrix
# An evaluation table that breaks down the classification model's performance by predicted versus actual class.

# Confusion matrix for the training results
mcre1 <- confusionMatrix(resultado_entrenamiento1, entrenamiento$m1_purchase)
mcre1
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No  30   4
##        Yes  6  67
##                                           
##                Accuracy : 0.9065          
##                  95% CI : (0.8348, 0.9543)
##     No Information Rate : 0.6636          
##     P-Value [Acc > NIR] : 4.281e-09       
##                                           
##                   Kappa : 0.7878          
##                                           
##  Mcnemar's Test P-Value : 0.7518          
##                                           
##             Sensitivity : 0.8333          
##             Specificity : 0.9437          
##          Pos Pred Value : 0.8824          
##          Neg Pred Value : 0.9178          
##              Prevalence : 0.3364          
##          Detection Rate : 0.2804          
##    Detection Prevalence : 0.3178          
##       Balanced Accuracy : 0.8885          
##                                           
##        'Positive' Class : No              
## 
# Confusion matrix for the test results
mcrp1 <- confusionMatrix(resultado_prueba1, prueba$m1_purchase)
mcrp1
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No   3   6
##        Yes  6  11
##                                           
##                Accuracy : 0.5385          
##                  95% CI : (0.3337, 0.7341)
##     No Information Rate : 0.6538          
##     P-Value [Acc > NIR] : 0.9231          
##                                           
##                   Kappa : -0.0196         
##                                           
##  Mcnemar's Test P-Value : 1.0000          
##                                           
##             Sensitivity : 0.3333          
##             Specificity : 0.6471          
##          Pos Pred Value : 0.3333          
##          Neg Pred Value : 0.6471          
##              Prevalence : 0.3462          
##          Detection Rate : 0.1154          
##    Detection Prevalence : 0.3462          
##       Balanced Accuracy : 0.4902          
##                                           
##        'Positive' Class : No              
## 
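
The main statistics reported by confusionMatrix() can be reproduced by hand from the four cell counts. A short sketch using the Model 1 test matrix above (the variable names are hypothetical); with "No" as the positive class, the results agree with the reported Accuracy, Sensitivity, Specificity and Balanced Accuracy:

```r
# Model 1 test confusion matrix, copied from the output above
# (rows = prediction, columns = reference; matrix() fills column by column).
mc <- matrix(c(3, 6, 6, 11), nrow = 2,
             dimnames = list(Prediction = c("No", "Yes"),
                             Reference  = c("No", "Yes")))

exactitud     <- sum(diag(mc)) / sum(mc)             # correct / total
sensibilidad  <- mc["No", "No"] / sum(mc[, "No"])     # positive class is "No"
especificidad <- mc["Yes", "Yes"] / sum(mc[, "Yes"])
exactitud_bal <- (sensibilidad + especificidad) / 2

round(c(Accuracy = exactitud, Sensitivity = sensibilidad,
        Specificity = especificidad, BalancedAccuracy = exactitud_bal), 4)
```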

Model 2. Radial SVM

modelo2 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "svmRadial", # Change to try another method
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10),
                 tuneGrid = data.frame(sigma=1, C=1) # Change to match the method's tuning parameters
                 )

resultado_entrenamiento2 <- predict(modelo2, entrenamiento)
resultado_prueba2 <- predict(modelo2, prueba)

# Confusion matrix
# An evaluation table that breaks down the classification model's performance by predicted versus actual class.

# Confusion matrix for the training results
mcre2 <- confusionMatrix(resultado_entrenamiento2, entrenamiento$m1_purchase)
mcre2
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No  34   0
##        Yes  2  71
##                                           
##                Accuracy : 0.9813          
##                  95% CI : (0.9341, 0.9977)
##     No Information Rate : 0.6636          
##     P-Value [Acc > NIR] : <2e-16          
##                                           
##                   Kappa : 0.9576          
##                                           
##  Mcnemar's Test P-Value : 0.4795          
##                                           
##             Sensitivity : 0.9444          
##             Specificity : 1.0000          
##          Pos Pred Value : 1.0000          
##          Neg Pred Value : 0.9726          
##              Prevalence : 0.3364          
##          Detection Rate : 0.3178          
##    Detection Prevalence : 0.3178          
##       Balanced Accuracy : 0.9722          
##                                           
##        'Positive' Class : No              
## 
# Confusion matrix for the test results
mcrp2 <- confusionMatrix(resultado_prueba2, prueba$m1_purchase)
mcrp2
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No   0   0
##        Yes  9  17
##                                           
##                Accuracy : 0.6538          
##                  95% CI : (0.4433, 0.8279)
##     No Information Rate : 0.6538          
##     P-Value [Acc > NIR] : 0.589398        
##                                           
##                   Kappa : 0               
##                                           
##  Mcnemar's Test P-Value : 0.007661        
##                                           
##             Sensitivity : 0.0000          
##             Specificity : 1.0000          
##          Pos Pred Value :    NaN          
##          Neg Pred Value : 0.6538          
##              Prevalence : 0.3462          
##          Detection Rate : 0.0000          
##    Detection Prevalence : 0.0000          
##       Balanced Accuracy : 0.5000          
##                                           
##        'Positive' Class : No              
## 
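
In the test matrix above the model predicts "Yes" for every row, which is why Accuracy equals the No Information Rate and Kappa is 0. A short sketch of both calculations from the cell counts, using Cohen's kappa, (po - pe) / (1 - pe); the variable names are hypothetical:

```r
# Model 2 test confusion matrix, copied from the output above
# (rows = prediction, columns = reference).
mc2 <- matrix(c(0, 9, 0, 17), nrow = 2,
              dimnames = list(Prediction = c("No", "Yes"),
                              Reference  = c("No", "Yes")))
n <- sum(mc2)

# No Information Rate: accuracy of always predicting the majority class.
nir <- max(colSums(mc2)) / n

# Cohen's kappa: observed agreement (po) versus chance agreement (pe).
po <- sum(diag(mc2)) / n
pe <- sum(rowSums(mc2) * colSums(mc2)) / n^2
kappa <- (po - pe) / (1 - pe)

round(c(NIR = nir, Accuracy = po, Kappa = kappa), 4)
```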

Model 3. Polynomial SVM

modelo3 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "svmPoly", # Change to try another method
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10),
                 tuneGrid = data.frame(degree=1, scale=1, C=1) # Change; note that degree=1 makes the polynomial kernel effectively linear
                 )

resultado_entrenamiento3 <- predict(modelo3, entrenamiento)
resultado_prueba3 <- predict(modelo3, prueba)

# Confusion matrix
# An evaluation table that breaks down the classification model's performance by predicted versus actual class.

# Confusion matrix for the training results
mcre3 <- confusionMatrix(resultado_entrenamiento3, entrenamiento$m1_purchase)
mcre3
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No  30   4
##        Yes  6  67
##                                           
##                Accuracy : 0.9065          
##                  95% CI : (0.8348, 0.9543)
##     No Information Rate : 0.6636          
##     P-Value [Acc > NIR] : 4.281e-09       
##                                           
##                   Kappa : 0.7878          
##                                           
##  Mcnemar's Test P-Value : 0.7518          
##                                           
##             Sensitivity : 0.8333          
##             Specificity : 0.9437          
##          Pos Pred Value : 0.8824          
##          Neg Pred Value : 0.9178          
##              Prevalence : 0.3364          
##          Detection Rate : 0.2804          
##    Detection Prevalence : 0.3178          
##       Balanced Accuracy : 0.8885          
##                                           
##        'Positive' Class : No              
## 
# Confusion matrix for the test results
mcrp3 <- confusionMatrix(resultado_prueba3, prueba$m1_purchase)
mcrp3
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No   3   6
##        Yes  6  11
##                                           
##                Accuracy : 0.5385          
##                  95% CI : (0.3337, 0.7341)
##     No Information Rate : 0.6538          
##     P-Value [Acc > NIR] : 0.9231          
##                                           
##                   Kappa : -0.0196         
##                                           
##  Mcnemar's Test P-Value : 1.0000          
##                                           
##             Sensitivity : 0.3333          
##             Specificity : 0.6471          
##          Pos Pred Value : 0.3333          
##          Neg Pred Value : 0.6471          
##              Prevalence : 0.3462          
##          Detection Rate : 0.1154          
##    Detection Prevalence : 0.3462          
##       Balanced Accuracy : 0.4902          
##                                           
##        'Positive' Class : No              
## 

Model 4. Decision Tree

modelo4 <- train(m1_purchase ~ ., data=entrenamiento,
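                 method = "rpart", # Change to try another method
                 #preProcess = c("scale", "center"), # Scaling is not needed for decision trees
                 trControl = trainControl(method="cv", number=10),
                 tuneLength = 10 # Change to try more or fewer candidate values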
                 )

resultado_entrenamiento4 <- predict(modelo4, entrenamiento)
resultado_prueba4 <- predict(modelo4, prueba)

# Confusion matrix
# An evaluation table that breaks down the classification model's performance by predicted versus actual class.

# Confusion matrix for the training results
mcre4 <- confusionMatrix(resultado_entrenamiento4, entrenamiento$m1_purchase)
mcre4
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No  17   2
##        Yes 19  69
##                                           
##                Accuracy : 0.8037          
##                  95% CI : (0.7158, 0.8742)
##     No Information Rate : 0.6636          
##     P-Value [Acc > NIR] : 0.0010139       
##                                           
##                   Kappa : 0.5025          
##                                           
##  Mcnemar's Test P-Value : 0.0004803       
##                                           
##             Sensitivity : 0.4722          
##             Specificity : 0.9718          
##          Pos Pred Value : 0.8947          
##          Neg Pred Value : 0.7841          
##              Prevalence : 0.3364          
##          Detection Rate : 0.1589          
##    Detection Prevalence : 0.1776          
##       Balanced Accuracy : 0.7220          
##                                           
##        'Positive' Class : No              
## 
# Confusion matrix for the test results
mcrp4 <- confusionMatrix(resultado_prueba4, prueba$m1_purchase)
mcrp4
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No   4   6
##        Yes  5  11
##                                           
##                Accuracy : 0.5769          
##                  95% CI : (0.3692, 0.7665)
##     No Information Rate : 0.6538          
##     P-Value [Acc > NIR] : 0.8485          
##                                           
##                   Kappa : 0.0892          
##                                           
##  Mcnemar's Test P-Value : 1.0000          
##                                           
##             Sensitivity : 0.4444          
##             Specificity : 0.6471          
##          Pos Pred Value : 0.4000          
##          Neg Pred Value : 0.6875          
##              Prevalence : 0.3462          
##          Detection Rate : 0.1538          
##    Detection Prevalence : 0.3846          
##       Balanced Accuracy : 0.5458          
##                                           
##        'Positive' Class : No              
## 
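
Placing the training and test accuracies of the first four models side by side makes the gap between them easier to see. The sketch below copies the Accuracy values and the test No Information Rate (0.6538) from the outputs above; the column names are hypothetical. A large gap is usually a sign of overfitting, although with only 26 test rows these estimates are noisy.

```r
# Accuracy values copied from the confusion matrix outputs above.
comparacion <- data.frame(
  modelo                  = c("svmLinear", "svmRadial", "svmPoly", "rpart"),
  exactitud_entrenamiento = c(0.9065, 0.9813, 0.9065, 0.8037),
  exactitud_prueba        = c(0.5385, 0.6538, 0.5385, 0.5769)
)

# Gap between training and test accuracy.
comparacion$brecha <- comparacion$exactitud_entrenamiento -
  comparacion$exactitud_prueba

# Does the model beat always predicting the majority class on the test set?
nir_prueba <- 0.6538
comparacion$supera_nir <- comparacion$exactitud_prueba > nir_prueba

# Sort from best to worst test accuracy.
comparacion[order(-comparacion$exactitud_prueba), ]
```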

Model 5. Neural Networks

modelo5 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "nnet", # Change to try another method
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10)
                 # Change
                 )
## # weights:  50
## initial  value 82.794609 
## iter  10 value 32.523223
## iter  20 value 30.173940
## iter  30 value 28.539522
## iter  40 value 28.506990
## iter  50 value 27.074299
## iter  60 value 26.908957
## iter  70 value 25.867242
## iter  80 value 25.188107
## iter  90 value 25.180191
## iter 100 value 25.175239
## final  value 25.175239 
## stopped after 100 iterations
## # weights:  148
## initial  value 97.677847 
## iter  10 value 16.219975
## iter  20 value 10.653356
## iter  30 value 7.796083
## iter  40 value 7.291744
## iter  50 value 7.256140
## iter  60 value 6.798688
## iter  70 value 6.604047
## iter  80 value 6.596572
## iter  90 value 6.594610
## iter 100 value 6.592594
## final  value 6.592594 
## stopped after 100 iterations
## # weights:  246
## initial  value 72.744037 
## iter  10 value 19.372157
## iter  20 value 9.636929
## iter  30 value 7.136274
## iter  40 value 3.528353
## iter  50 value 2.915231
## iter  60 value 2.797221
## iter  70 value 2.776372
## iter  80 value 2.774198
## iter  90 value 2.773608
## iter 100 value 2.772933
## final  value 2.772933 
## stopped after 100 iterations
## # weights:  50
## initial  value 70.289855 
## iter  10 value 35.933317
## iter  20 value 24.558417
## iter  30 value 24.244017
## iter  40 value 24.242078
## final  value 24.242074 
## converged
## # weights:  148
## initial  value 77.281580 
## iter  10 value 30.010986
## iter  20 value 22.050017
## iter  30 value 18.424384
## iter  40 value 17.604812
## iter  50 value 16.857607
## iter  60 value 16.355402
## iter  70 value 16.332377
## iter  80 value 16.331379
## final  value 16.331356 
## converged
## # weights:  246
## initial  value 69.945990 
## iter  10 value 24.243650
## iter  20 value 17.522007
## iter  30 value 16.322947
## iter  40 value 15.878325
## iter  50 value 15.597082
## iter  60 value 15.488776
## iter  70 value 15.446262
## iter  80 value 15.410025
## iter  90 value 15.402042
## iter 100 value 15.393000
## final  value 15.393000 
## stopped after 100 iterations
## # weights:  50
## initial  value 62.360472 
## iter  10 value 28.658205
## iter  20 value 19.201108
## iter  30 value 19.083328
## iter  40 value 19.079268
## iter  50 value 19.078384
## iter  60 value 19.077813
## iter  70 value 19.077231
## iter  80 value 19.076897
## iter  90 value 19.076728
## iter 100 value 19.076605
## final  value 19.076605 
## stopped after 100 iterations
## # weights:  148
## initial  value 77.117099 
## iter  10 value 25.472374
## iter  20 value 9.149204
## iter  30 value 7.753878
## iter  40 value 7.655239
## iter  50 value 7.636452
## iter  60 value 7.621506
## iter  70 value 7.451727
## iter  80 value 6.311720
## iter  90 value 6.251173
## iter 100 value 6.243870
## final  value 6.243870 
## stopped after 100 iterations
## # weights:  246
## initial  value 68.029470 
## iter  10 value 12.172986
## iter  20 value 3.473130
## iter  30 value 3.100124
## iter  40 value 3.028963
## iter  50 value 2.993067
## iter  60 value 2.963699
## iter  70 value 2.949382
## iter  80 value 2.930823
## iter  90 value 2.923014
## iter 100 value 2.917919
## final  value 2.917919 
## stopped after 100 iterations
## # weights:  50
## initial  value 59.368741 
## iter  10 value 26.718303
## iter  20 value 20.584750
## iter  30 value 19.306762
## iter  40 value 19.190102
## iter  50 value 19.183632
## iter  60 value 19.181910
## iter  70 value 14.260203
## iter  80 value 14.224927
## iter  90 value 14.224845
## final  value 14.224722 
## converged
## # weights:  148
## initial  value 67.993695 
## iter  10 value 16.383947
## iter  20 value 7.276646
## iter  30 value 6.923741
## iter  40 value 4.943834
## iter  50 value 4.932038
## iter  60 value 4.927091
## iter  70 value 2.518276
## iter  80 value 1.932994
## iter  90 value 1.921476
## iter 100 value 1.917878
## final  value 1.917878 
## stopped after 100 iterations
## # weights:  246
## initial  value 98.387219 
## iter  10 value 11.476411
## iter  20 value 5.547852
## iter  30 value 3.232565
## iter  40 value 2.781108
## iter  50 value 2.774195
## iter  60 value 2.773354
## iter  70 value 2.772740
## iter  80 value 2.772324
## iter  90 value 1.909664
## iter 100 value 1.909560
## final  value 1.909560 
## stopped after 100 iterations
## # weights:  50
## initial  value 68.791180 
## iter  10 value 35.028759
## iter  20 value 27.907264
## iter  30 value 22.762805
## iter  40 value 22.112099
## iter  50 value 22.098946
## iter  60 value 22.098422
## final  value 22.098418 
## converged
## # weights:  148
## initial  value 83.712837 
## iter  10 value 26.716450
## iter  20 value 17.587663
## iter  30 value 15.233041
## iter  40 value 14.310126
## iter  50 value 14.241002
## iter  60 value 14.236339
## iter  70 value 14.236237
## iter  70 value 14.236237
## iter  70 value 14.236237
## final  value 14.236237 
## converged
## # weights:  246
## initial  value 69.979147 
## iter  10 value 24.376136
## iter  20 value 15.363840
## iter  30 value 14.323596
## iter  40 value 14.214524
## iter  50 value 14.089252
## iter  60 value 14.051703
## iter  70 value 14.047668
## iter  80 value 14.047613
## iter  80 value 14.047613
## iter  80 value 14.047613
## final  value 14.047613 
## converged
## # weights:  50
## initial  value 60.434429 
## iter  10 value 33.749684
## iter  20 value 30.752206
## iter  30 value 25.402654
## iter  40 value 24.769739
## iter  50 value 24.760702
## iter  60 value 20.670778
## iter  70 value 18.213444
## iter  80 value 18.179513
## iter  90 value 18.176381
## iter 100 value 18.174123
## final  value 18.174123 
## stopped after 100 iterations
## # weights:  148
## initial  value 73.425392 
## iter  10 value 20.617829
## iter  20 value 15.789569
## iter  30 value 14.489948
## iter  40 value 10.895604
## iter  50 value 10.618544
## iter  60 value 10.370854
## iter  70 value 10.289755
## iter  80 value 10.281710
## iter  90 value 10.110416
## iter 100 value 9.918664
## final  value 9.918664 
## stopped after 100 iterations
## # weights:  246
## initial  value 87.891678 
## iter  10 value 19.390635
## iter  20 value 8.607673
## iter  30 value 5.944232
## iter  40 value 4.918315
## iter  50 value 3.933743
## iter  60 value 2.246445
## iter  70 value 2.145542
## iter  80 value 2.132707
## iter  90 value 2.124702
## iter 100 value 2.112569
## final  value 2.112569 
## stopped after 100 iterations
## # weights:  50
## initial  value 71.976216 
## iter  10 value 33.342980
## iter  20 value 21.662463
## iter  30 value 14.639662
## iter  40 value 12.946586
## iter  50 value 12.922052
## iter  60 value 12.920325
## iter  70 value 12.919702
## iter  80 value 12.919511
## iter  90 value 12.919486
## final  value 12.919483 
## converged
## # weights:  148
## initial  value 78.216282 
## iter  10 value 17.900698
## iter  20 value 3.843636
## iter  30 value 2.784637
## iter  40 value 2.773070
## iter  50 value 2.772667
## iter  60 value 2.772608
## final  value 2.772589 
## converged
## # weights:  246
## initial  value 70.573297 
## iter  10 value 11.056969
## iter  20 value 6.188884
## iter  30 value 5.745323
## iter  40 value 4.681470
## iter  50 value 2.795438
## iter  60 value 2.780735
## iter  70 value 2.777014
## iter  80 value 2.776401
## iter  90 value 2.773795
## iter 100 value 2.773227
## final  value 2.773227 
## stopped after 100 iterations
## # weights:  50
## initial  value 83.300123 
## iter  10 value 30.687234
## iter  20 value 25.559378
## iter  30 value 24.516171
## iter  40 value 24.089590
## iter  50 value 23.950815
## iter  60 value 23.702450
## iter  70 value 23.700113
## final  value 23.700110 
## converged
## # weights:  148
## initial  value 94.372604 
## iter  10 value 34.330543
## iter  20 value 20.242339
## iter  30 value 16.575234
## iter  40 value 16.080500
## iter  50 value 15.835419
## iter  60 value 15.737704
## iter  70 value 15.726623
## iter  80 value 15.726062
## iter  90 value 15.725993
## final  value 15.725993 
## converged
## # weights:  246
## initial  value 68.477806 
## iter  10 value 23.228993
## iter  20 value 16.404908
## iter  30 value 15.539001
## iter  40 value 15.026944
## iter  50 value 14.868771
## iter  60 value 14.848713
## iter  70 value 14.845767
## iter  80 value 14.844733
## iter  90 value 14.840800
## iter 100 value 14.837916
## final  value 14.837916 
## stopped after 100 iterations
## # weights:  50
## initial  value 62.649672 
## iter  10 value 31.665932
## iter  20 value 21.798895
## iter  30 value 13.205426
## iter  40 value 12.985333
## iter  50 value 12.982972
## iter  60 value 12.982058
## iter  70 value 12.981439
## iter  80 value 12.981136
## iter  90 value 12.980976
## iter 100 value 12.980773
## final  value 12.980773 
## stopped after 100 iterations
## # weights:  148
## initial  value 61.516738 
## iter  10 value 12.573174
## iter  20 value 4.389903
## iter  30 value 4.319975
## iter  40 value 4.294086
## iter  50 value 4.278291
## iter  60 value 3.225050
## iter  70 value 2.972311
## iter  80 value 2.920447
## iter  90 value 2.910318
## iter 100 value 2.905371
## final  value 2.905371 
## stopped after 100 iterations
## # weights:  246
## initial  value 100.181376 
## iter  10 value 22.784907
## iter  20 value 13.965855
## iter  30 value 11.145884
## iter  40 value 9.989035
## iter  50 value 9.004827
## iter  60 value 8.324247
## iter  70 value 8.310037
## iter  80 value 8.294750
## iter  90 value 8.191976
## iter 100 value 5.229373
## final  value 5.229373 
## stopped after 100 iterations
## # weights:  50
## initial  value 79.954652 
## iter  10 value 39.234535
## iter  20 value 33.514689
## iter  30 value 33.140681
## iter  40 value 31.229409
## iter  50 value 25.219234
## iter  60 value 22.877642
## iter  70 value 20.591738
## iter  80 value 20.484456
## iter  90 value 20.479297
## iter 100 value 20.477962
## final  value 20.477962 
## stopped after 100 iterations
## # weights:  148
## initial  value 61.873646 
## iter  10 value 10.930614
## iter  20 value 8.159923
## iter  30 value 7.642354
## iter  40 value 7.553842
## iter  50 value 6.785688
## iter  60 value 6.164015
## iter  70 value 6.145143
## iter  80 value 6.140889
## iter  90 value 6.139390
## iter 100 value 6.139081
## final  value 6.139081 
## stopped after 100 iterations
## # weights:  246
## initial  value 64.815571 
## iter  10 value 10.981524
## iter  20 value 4.404612
## iter  30 value 2.800178
## iter  40 value 2.774528
## iter  50 value 2.773007
## iter  60 value 2.772744
## iter  70 value 2.772651
## iter  80 value 2.772596
## iter  90 value 2.772592
## iter  90 value 2.772592
## iter  90 value 2.772592
## final  value 2.772592 
## converged
## # weights:  50
## initial  value 65.118354 
## iter  10 value 32.386986
## iter  20 value 25.716508
## iter  30 value 24.256204
## iter  40 value 23.892339
## iter  50 value 23.878284
## final  value 23.878274 
## converged
## # weights:  148
## initial  value 87.342192 
## iter  10 value 26.955697
## iter  20 value 18.385422
## iter  30 value 17.587733
## iter  40 value 17.315309
## iter  50 value 17.199786
## iter  60 value 17.190483
## final  value 17.190432 
## converged
## # weights:  246
## initial  value 68.367557 
## iter  10 value 23.902082
## iter  20 value 16.361040
## iter  30 value 15.301320
## iter  40 value 15.083982
## iter  50 value 15.059599
## iter  60 value 15.015962
## iter  70 value 14.980847
## iter  80 value 14.976009
## iter  90 value 14.974998
## iter 100 value 14.974702
## final  value 14.974702 
## stopped after 100 iterations
## # weights:  50
## initial  value 74.913060 
## iter  10 value 29.342125
## iter  20 value 25.213536
## iter  30 value 23.271096
## iter  40 value 23.219913
## iter  50 value 23.212860
## iter  60 value 23.211414
## iter  70 value 23.210637
## iter  80 value 23.210225
## iter  90 value 23.209769
## iter 100 value 23.209394
## final  value 23.209394 
## stopped after 100 iterations
## # weights:  148
## initial  value 66.808724 
## iter  10 value 15.854533
## iter  20 value 8.820257
## iter  30 value 3.112124
## iter  40 value 2.999179
## iter  50 value 2.958810
## iter  60 value 2.934274
## iter  70 value 2.907263
## iter  80 value 2.898741
## iter  90 value 2.892244
## iter 100 value 2.887944
## final  value 2.887944 
## stopped after 100 iterations
## # weights:  246
## initial  value 72.989823 
## iter  10 value 5.924902
## iter  20 value 3.024461
## iter  30 value 2.989294
## iter  40 value 2.956873
## iter  50 value 2.925959
## iter  60 value 2.908988
## iter  70 value 2.895211
## iter  80 value 2.888331
## iter  90 value 2.881497
## iter 100 value 2.873979
## final  value 2.873979 
## stopped after 100 iterations
## # weights:  50
## initial  value 73.194708 
## iter  10 value 29.257803
## iter  20 value 15.121911
## iter  30 value 10.831618
## iter  40 value 10.266672
## iter  50 value 10.254020
## iter  60 value 10.253940
## iter  70 value 10.253927
## final  value 10.253925 
## converged
## # weights:  148
## initial  value 65.546884 
## iter  10 value 19.968681
## iter  20 value 9.875934
## iter  30 value 9.435251
## iter  40 value 9.244243
## iter  50 value 8.796685
## iter  60 value 7.372868
## iter  70 value 7.316372
## iter  80 value 4.754440
## iter  90 value 4.693311
## iter 100 value 4.170914
## final  value 4.170914 
## stopped after 100 iterations
## # weights:  246
## initial  value 75.182816 
## iter  10 value 12.219245
## iter  20 value 3.138752
## iter  30 value 2.799542
## iter  40 value 2.776662
## iter  50 value 2.773339
## iter  60 value 2.772814
## iter  70 value 2.772618
## iter  80 value 2.772596
## iter  90 value 2.772591
## final  value 2.772590 
## converged
## # weights:  50
## initial  value 80.664198 
## iter  10 value 31.769665
## iter  20 value 22.930277
## iter  30 value 20.995536
## iter  40 value 20.389782
## iter  50 value 20.382084
## final  value 20.382049 
## converged
## # weights:  148
## initial  value 72.140767 
## iter  10 value 22.597729
## iter  20 value 15.754747
## iter  30 value 15.506772
## iter  40 value 15.443794
## iter  50 value 15.380771
## iter  60 value 15.379478
## iter  70 value 15.379430
## iter  70 value 15.379429
## iter  70 value 15.379429
## final  value 15.379429 
## converged
## # weights:  246
## initial  value 71.957486 
## iter  10 value 22.531119
## iter  20 value 15.276009
## iter  30 value 14.859896
## iter  40 value 14.764417
## iter  50 value 14.720292
## iter  60 value 14.662859
## iter  70 value 14.657139
## iter  80 value 14.653565
## iter  90 value 14.653481
## final  value 14.653480 
## converged
## # weights:  50
## initial  value 69.279274 
## iter  10 value 31.048343
## iter  20 value 27.271239
## iter  30 value 24.725026
## iter  40 value 21.862000
## iter  50 value 21.454322
## iter  60 value 21.448891
## iter  70 value 21.445487
## iter  80 value 18.558477
## iter  90 value 16.061342
## iter 100 value 16.010145
## final  value 16.010145 
## stopped after 100 iterations
## # weights:  148
## initial  value 80.018636 
## iter  10 value 18.522263
## iter  20 value 9.026920
## iter  30 value 8.280354
## iter  40 value 7.883689
## iter  50 value 5.662657
## iter  60 value 5.426435
## iter  70 value 5.404630
## iter  80 value 5.384877
## iter  90 value 5.378448
## iter 100 value 5.191033
## final  value 5.191033 
## stopped after 100 iterations
## # weights:  246
## initial  value 85.974871 
## iter  10 value 16.707814
## iter  20 value 7.228730
## iter  30 value 6.636602
## iter  40 value 6.293116
## iter  50 value 6.031013
## iter  60 value 5.943857
## iter  70 value 5.644049
## iter  80 value 5.525917
## iter  90 value 5.399625
## iter 100 value 4.851789
## final  value 4.851789 
## stopped after 100 iterations
## # weights:  50
## initial  value 61.553814 
## iter  10 value 35.277615
## iter  20 value 28.063575
## iter  30 value 25.104741
## iter  40 value 21.311828
## iter  50 value 21.197852
## iter  60 value 21.162542
## iter  70 value 21.153516
## iter  80 value 21.150601
## iter  90 value 21.150536
## final  value 21.150523 
## converged
## # weights:  148
## initial  value 63.384780 
## iter  10 value 14.283585
## iter  20 value 8.256076
## iter  30 value 6.371471
## iter  40 value 4.589857
## iter  50 value 3.921047
## iter  60 value 3.357869
## iter  70 value 3.319939
## iter  80 value 3.303442
## iter  90 value 3.299578
## iter 100 value 3.296512
## final  value 3.296512 
## stopped after 100 iterations
## # weights:  246
## initial  value 64.043275 
## iter  10 value 6.237517
## iter  20 value 2.002794
## iter  30 value 1.910590
## iter  40 value 1.909592
## iter  50 value 1.909550
## iter  60 value 1.909544
## iter  70 value 1.909543
## final  value 1.909543 
## converged
## # weights:  50
## initial  value 62.316059 
## iter  10 value 38.944154
## iter  20 value 26.380743
## iter  30 value 19.315276
## iter  40 value 18.989410
## iter  50 value 18.975903
## final  value 18.975881 
## converged
## # weights:  148
## initial  value 66.640341 
## iter  10 value 32.172447
## iter  20 value 19.778969
## iter  30 value 16.687003
## iter  40 value 14.737146
## iter  50 value 14.302814
## iter  60 value 14.208910
## iter  70 value 14.156938
## iter  80 value 14.133909
## iter  90 value 14.133432
## iter 100 value 14.133425
## final  value 14.133425 
## stopped after 100 iterations
## # weights:  246
## initial  value 67.239292 
## iter  10 value 22.220849
## iter  20 value 16.128021
## iter  30 value 15.037465
## iter  40 value 14.273030
## iter  50 value 14.035413
## iter  60 value 13.792054
## iter  70 value 13.522224
## iter  80 value 13.430113
## iter  90 value 13.396012
## iter 100 value 13.392577
## final  value 13.392577 
## stopped after 100 iterations
## # weights:  50
## initial  value 62.466453 
## iter  10 value 26.924936
## iter  20 value 25.103263
## iter  30 value 23.251517
## iter  40 value 23.209114
## iter  50 value 21.231767
## iter  60 value 21.217169
## iter  70 value 21.214984
## iter  80 value 21.212140
## iter  90 value 21.209688
## iter 100 value 21.207537
## final  value 21.207537 
## stopped after 100 iterations
## # weights:  148
## initial  value 74.410303 
## iter  10 value 20.148540
## iter  20 value 10.450807
## iter  30 value 5.722416
## iter  40 value 4.708645
## iter  50 value 4.664552
## iter  60 value 4.653850
## iter  70 value 4.642256
## iter  80 value 4.199105
## iter  90 value 2.427519
## iter 100 value 2.414477
## final  value 2.414477 
## stopped after 100 iterations
## # weights:  246
## initial  value 81.626665 
## iter  10 value 8.208893
## iter  20 value 2.100247
## iter  30 value 2.046130
## iter  40 value 2.035333
## iter  50 value 2.026906
## iter  60 value 2.013949
## iter  70 value 2.002988
## iter  80 value 1.996243
## iter  90 value 1.991121
## iter 100 value 1.985333
## final  value 1.985333 
## stopped after 100 iterations
## # weights:  50
## initial  value 66.957991 
## iter  10 value 33.795560
## iter  20 value 30.764744
## iter  30 value 30.726799
## iter  40 value 29.432519
## iter  50 value 29.411648
## iter  60 value 28.055067
## iter  70 value 28.050877
## iter  80 value 26.606075
## iter  90 value 26.596372
## iter 100 value 26.594993
## final  value 26.594993 
## stopped after 100 iterations
## # weights:  148
## initial  value 66.380510 
## iter  10 value 23.583858
## iter  20 value 9.287468
## iter  30 value 6.828115
## iter  40 value 4.956410
## iter  50 value 4.805013
## iter  60 value 4.788865
## iter  70 value 4.783917
## iter  80 value 4.782095
## iter  90 value 4.781036
## iter 100 value 4.764955
## final  value 4.764955 
## stopped after 100 iterations
## # weights:  246
## initial  value 72.009673 
## iter  10 value 13.199890
## iter  20 value 3.188530
## iter  30 value 1.947583
## iter  40 value 1.912963
## iter  50 value 1.909636
## final  value 1.909543 
## converged
## # weights:  50
## initial  value 67.284119 
## iter  10 value 36.036373
## iter  20 value 26.175441
## iter  30 value 22.693643
## iter  40 value 22.145809
## final  value 22.138071 
## converged
## # weights:  148
## initial  value 68.078681 
## iter  10 value 27.024005
## iter  20 value 18.966043
## iter  30 value 16.901176
## iter  40 value 16.127766
## iter  50 value 15.740388
## iter  60 value 15.650303
## iter  70 value 15.641672
## iter  80 value 15.639938
## iter  90 value 15.638782
## final  value 15.638780 
## converged
## # weights:  246
## initial  value 69.285061 
## iter  10 value 23.914361
## iter  20 value 15.333893
## iter  30 value 14.895605
## iter  40 value 14.827519
## iter  50 value 14.803767
## iter  60 value 14.773113
## iter  70 value 14.749670
## iter  80 value 14.748566
## iter  90 value 14.746799
## iter 100 value 14.746031
## final  value 14.746031 
## stopped after 100 iterations
## # weights:  50
## initial  value 64.336021 
## iter  10 value 31.507096
## iter  20 value 29.993624
## iter  30 value 28.245755
## iter  40 value 28.239712
## iter  50 value 28.231201
## iter  60 value 28.223967
## iter  70 value 25.213225
## iter  80 value 25.210617
## iter  90 value 25.208764
## iter 100 value 25.206148
## final  value 25.206148 
## stopped after 100 iterations
## # weights:  148
## initial  value 66.814703 
## iter  10 value 22.546948
## iter  20 value 15.381218
## iter  30 value 14.501346
## iter  40 value 14.379715
## iter  50 value 14.082980
## iter  60 value 14.055293
## iter  70 value 13.962366
## iter  80 value 13.943572
## iter  90 value 13.622021
## iter 100 value 12.187155
## final  value 12.187155 
## stopped after 100 iterations
## # weights:  246
## initial  value 84.919110 
## iter  10 value 28.033163
## iter  20 value 10.210805
## iter  30 value 8.570518
## iter  40 value 7.763227
## iter  50 value 7.295027
## iter  60 value 7.031426
## iter  70 value 7.000249
## iter  80 value 6.955851
## iter  90 value 6.644369
## iter 100 value 6.600102
## final  value 6.600102 
## stopped after 100 iterations
## # weights:  50
## initial  value 86.324573 
## iter  10 value 26.232086
## iter  20 value 25.069007
## iter  30 value 25.051006
## iter  40 value 24.034646
## iter  50 value 23.407801
## iter  60 value 23.406463
## final  value 23.405936 
## converged
## # weights:  148
## initial  value 63.872883 
## iter  10 value 23.261756
## iter  20 value 16.996061
## iter  30 value 16.630222
## iter  40 value 16.249935
## iter  50 value 15.287087
## iter  60 value 15.233224
## iter  70 value 14.997925
## iter  80 value 14.988528
## iter  90 value 14.984635
## iter 100 value 14.980263
## final  value 14.980263 
## stopped after 100 iterations
## # weights:  246
## initial  value 74.742942 
## iter  10 value 20.371087
## iter  20 value 7.286435
## iter  30 value 6.102180
## iter  40 value 5.549045
## iter  50 value 4.714455
## iter  60 value 4.685760
## iter  70 value 4.683824
## iter  80 value 4.683360
## iter  90 value 4.683130
## iter 100 value 4.682918
## final  value 4.682918 
## stopped after 100 iterations
## # weights:  50
## initial  value 69.378173 
## iter  10 value 32.177127
## iter  20 value 24.233473
## iter  30 value 22.246803
## iter  40 value 22.071438
## iter  50 value 21.751972
## iter  60 value 21.618247
## iter  70 value 21.617007
## iter  70 value 21.617007
## iter  70 value 21.617007
## final  value 21.617007 
## converged
## # weights:  148
## initial  value 73.071981 
## iter  10 value 25.773608
## iter  20 value 17.298459
## iter  30 value 15.742513
## iter  40 value 15.155733
## iter  50 value 15.001713
## iter  60 value 14.961105
## iter  70 value 14.949065
## iter  80 value 14.948699
## final  value 14.948682 
## converged
## # weights:  246
## initial  value 80.762229 
## iter  10 value 24.417215
## iter  20 value 15.634746
## iter  30 value 14.536364
## iter  40 value 14.274673
## iter  50 value 14.238511
## iter  60 value 14.235964
## iter  70 value 14.235910
## iter  80 value 14.235907
## final  value 14.235907 
## converged
## # weights:  50
## initial  value 73.433304 
## iter  10 value 37.555125
## iter  20 value 20.700075
## iter  30 value 19.804623
## iter  40 value 19.795099
## iter  50 value 19.791943
## iter  60 value 19.788699
## iter  70 value 17.771500
## iter  80 value 17.715991
## iter  90 value 17.709412
## iter 100 value 15.475401
## final  value 15.475401 
## stopped after 100 iterations
## # weights:  148
## initial  value 73.677626 
## iter  10 value 35.760074
## iter  20 value 22.990582
## iter  30 value 19.533515
## iter  40 value 17.992623
## iter  50 value 17.340637
## iter  60 value 16.571211
## iter  70 value 15.818469
## iter  80 value 15.806080
## iter  90 value 13.259448
## iter 100 value 12.401802
## final  value 12.401802 
## stopped after 100 iterations
## # weights:  246
## initial  value 78.297731 
## iter  10 value 15.539093
## iter  20 value 11.334613
## iter  30 value 11.055371
## iter  40 value 10.022467
## iter  50 value 9.500383
## iter  60 value 9.291604
## iter  70 value 9.179569
## iter  80 value 8.796615
## iter  90 value 8.760201
## iter 100 value 8.739276
## final  value 8.739276 
## stopped after 100 iterations
## # weights:  50
## initial  value 64.529412 
## iter  10 value 25.651713
## iter  20 value 19.715909
## iter  30 value 16.664665
## iter  40 value 16.429905
## iter  50 value 16.398188
## iter  60 value 16.387949
## iter  70 value 16.382131
## iter  80 value 16.381226
## final  value 16.381224 
## converged
## # weights:  148
## initial  value 65.266712 
## iter  10 value 23.215554
## iter  20 value 11.933435
## iter  30 value 4.208241
## iter  40 value 3.827260
## iter  50 value 2.838205
## iter  60 value 2.793879
## iter  70 value 2.780570
## iter  80 value 2.775860
## iter  90 value 2.774085
## iter 100 value 2.773028
## final  value 2.773028 
## stopped after 100 iterations
## # weights:  246
## initial  value 66.556822 
## iter  10 value 11.482455
## iter  20 value 4.917838
## iter  30 value 4.702307
## iter  40 value 4.682600
## iter  50 value 4.682265
## iter  60 value 4.682131
## iter  60 value 4.682131
## iter  60 value 4.682131
## final  value 4.682131 
## converged
## # weights:  50
## initial  value 71.696143 
## iter  10 value 27.939852
## iter  20 value 20.937513
## iter  30 value 20.541111
## iter  40 value 20.530253
## iter  40 value 20.530253
## iter  40 value 20.530253
## final  value 20.530253 
## converged
## # weights:  148
## initial  value 67.447831 
## iter  10 value 22.730246
## iter  20 value 16.069012
## iter  30 value 14.975225
## iter  40 value 14.297662
## iter  50 value 14.254544
## iter  60 value 14.248011
## iter  70 value 14.247734
## iter  70 value 14.247734
## iter  70 value 14.247734
## final  value 14.247734 
## converged
## # weights:  246
## initial  value 65.335582 
## iter  10 value 21.447596
## iter  20 value 15.454211
## iter  30 value 14.477365
## iter  40 value 13.873263
## iter  50 value 13.708738
## iter  60 value 13.665483
## iter  70 value 13.646620
## iter  80 value 13.643768
## iter  90 value 13.643436
## final  value 13.643432 
## converged
## # weights:  50
## initial  value 73.072488 
## iter  10 value 29.805494
## iter  20 value 23.816631
## iter  30 value 21.213655
## iter  40 value 21.196828
## iter  50 value 21.193898
## iter  60 value 21.192865
## iter  70 value 21.056069
## iter  80 value 14.713420
## iter  90 value 14.705609
## iter 100 value 14.702258
## final  value 14.702258 
## stopped after 100 iterations
## # weights:  148
## initial  value 59.887673 
## iter  10 value 25.606206
## iter  20 value 11.845724
## iter  30 value 10.400596
## iter  40 value 10.375751
## iter  50 value 10.312334
## iter  60 value 10.237429
## iter  70 value 10.196505
## iter  80 value 10.012968
## iter  90 value 9.808608
## iter 100 value 9.657355
## final  value 9.657355 
## stopped after 100 iterations
## # weights:  246
## initial  value 74.319562 
## iter  10 value 10.168484
## iter  20 value 5.511297
## iter  30 value 5.402116
## iter  40 value 4.905955
## iter  50 value 4.340351
## iter  60 value 4.298331
## iter  70 value 4.291601
## iter  80 value 4.280951
## iter  90 value 4.260944
## iter 100 value 4.193815
## final  value 4.193815 
## stopped after 100 iterations
## # weights:  50
## initial  value 66.015095 
## iter  10 value 31.097786
## iter  20 value 29.604179
## iter  30 value 27.080306
## iter  40 value 27.036491
## iter  50 value 24.287027
## iter  60 value 21.336392
## iter  70 value 21.156247
## iter  80 value 21.154844
## iter  90 value 21.153388
## iter 100 value 21.151713
## final  value 21.151713 
## stopped after 100 iterations
## # weights:  148
## initial  value 63.610273 
## iter  10 value 24.104075
## iter  20 value 8.640125
## iter  30 value 2.981083
## iter  40 value 2.775571
## iter  50 value 2.772885
## iter  60 value 2.772731
## iter  70 value 2.772644
## iter  80 value 2.772633
## iter  90 value 2.772597
## iter 100 value 2.772590
## final  value 2.772590 
## stopped after 100 iterations
## # weights:  246
## initial  value 63.137518 
## iter  10 value 21.063098
## iter  20 value 10.576914
## iter  30 value 9.067025
## iter  40 value 6.438300
## iter  50 value 5.072279
## iter  60 value 4.555081
## iter  70 value 2.797958
## iter  80 value 2.779677
## iter  90 value 2.777176
## iter 100 value 2.775329
## final  value 2.775329 
## stopped after 100 iterations
## # weights:  50
## initial  value 61.787783 
## iter  10 value 28.573552
## iter  20 value 24.511573
## iter  30 value 24.033795
## iter  40 value 24.004832
## iter  50 value 24.004664
## final  value 24.004663 
## converged
## # weights:  148
## initial  value 73.072023 
## iter  10 value 34.321151
## iter  20 value 19.607188
## iter  30 value 16.057632
## iter  40 value 15.221674
## iter  50 value 15.122504
## iter  60 value 15.094168
## iter  70 value 15.091944
## iter  80 value 15.091887
## final  value 15.091884 
## converged
## # weights:  246
## initial  value 76.151706 
## iter  10 value 21.580490
## iter  20 value 15.677001
## iter  30 value 14.810781
## iter  40 value 14.661626
## iter  50 value 14.645167
## iter  60 value 14.643259
## iter  70 value 14.642952
## iter  80 value 14.642876
## iter  80 value 14.642876
## iter  80 value 14.642876
## final  value 14.642876 
## converged
## # weights:  50
## initial  value 91.328835 
## iter  10 value 35.412042
## iter  20 value 24.634171
## iter  30 value 15.730665
## iter  40 value 15.335339
## iter  50 value 15.329432
## iter  60 value 15.324485
## iter  70 value 15.321931
## iter  80 value 15.319288
## iter  90 value 15.316559
## iter 100 value 15.315422
## final  value 15.315422 
## stopped after 100 iterations
## # weights:  148
## initial  value 60.051624 
## iter  10 value 20.361561
## iter  20 value 11.136947
## iter  30 value 9.101344
## iter  40 value 8.572187
## iter  50 value 8.094703
## iter  60 value 8.075152
## iter  70 value 7.382035
## iter  80 value 7.352125
## iter  90 value 6.236481
## iter 100 value 3.028709
## final  value 3.028709 
## stopped after 100 iterations
## # weights:  246
## initial  value 73.261698 
## iter  10 value 13.314776
## iter  20 value 7.553811
## iter  30 value 7.178838
## iter  40 value 7.133394
## iter  50 value 7.093329
## iter  60 value 7.061353
## iter  70 value 7.039266
## iter  80 value 7.029383
## iter  90 value 6.462540
## iter 100 value 5.126840
## final  value 5.126840 
## stopped after 100 iterations
## # weights:  148
## initial  value 95.467947 
## iter  10 value 29.352737
## iter  20 value 18.658725
## iter  30 value 17.792349
## iter  40 value 17.742214
## iter  50 value 17.738211
## iter  60 value 17.737670
## final  value 17.737641 
## converged
resultado_entrenamiento5 <- predict(modelo5, entrenamiento)
resultado_prueba5 <- predict(modelo5, prueba)

# Confusion matrix
# An evaluation table that breaks down the performance of the classification model.

# Confusion matrix for the training results
mcre5 <- confusionMatrix(resultado_entrenamiento5, entrenamiento$m1_purchase)
mcre5
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No  32   2
##        Yes  4  69
##                                           
##                Accuracy : 0.9439          
##                  95% CI : (0.8819, 0.9791)
##     No Information Rate : 0.6636          
##     P-Value [Acc > NIR] : 3.021e-12       
##                                           
##                   Kappa : 0.8727          
##                                           
##  Mcnemar's Test P-Value : 0.6831          
##                                           
##             Sensitivity : 0.8889          
##             Specificity : 0.9718          
##          Pos Pred Value : 0.9412          
##          Neg Pred Value : 0.9452          
##              Prevalence : 0.3364          
##          Detection Rate : 0.2991          
##    Detection Prevalence : 0.3178          
##       Balanced Accuracy : 0.9304          
##                                           
##        'Positive' Class : No              
## 
# Confusion matrix for the test results
mcrp5 <- confusionMatrix(resultado_prueba5, prueba$m1_purchase)
mcrp5
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No   4   6
##        Yes  5  11
##                                           
##                Accuracy : 0.5769          
##                  95% CI : (0.3692, 0.7665)
##     No Information Rate : 0.6538          
##     P-Value [Acc > NIR] : 0.8485          
##                                           
##                   Kappa : 0.0892          
##                                           
##  Mcnemar's Test P-Value : 1.0000          
##                                           
##             Sensitivity : 0.4444          
##             Specificity : 0.6471          
##          Pos Pred Value : 0.4000          
##          Neg Pred Value : 0.6875          
##              Prevalence : 0.3462          
##          Detection Rate : 0.1538          
##    Detection Prevalence : 0.3846          
##       Balanced Accuracy : 0.5458          
##                                           
##        'Positive' Class : No              
## 

Model 6. Random Forest

modelo6 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "rf", # Change per model
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10),
                 tuneGrid = expand.grid(mtry = c(2,4,6)) # Change per model
                 )

resultado_entrenamiento6 <- predict(modelo6, entrenamiento)
resultado_prueba6 <- predict(modelo6, prueba)

# Confusion matrix
# An evaluation table that breaks down the performance of the classification model.

# Confusion matrix for the training results
mcre6 <- confusionMatrix(resultado_entrenamiento6, entrenamiento$m1_purchase)
mcre6
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No  34   0
##        Yes  2  71
##                                           
##                Accuracy : 0.9813          
##                  95% CI : (0.9341, 0.9977)
##     No Information Rate : 0.6636          
##     P-Value [Acc > NIR] : <2e-16          
##                                           
##                   Kappa : 0.9576          
##                                           
##  Mcnemar's Test P-Value : 0.4795          
##                                           
##             Sensitivity : 0.9444          
##             Specificity : 1.0000          
##          Pos Pred Value : 1.0000          
##          Neg Pred Value : 0.9726          
##              Prevalence : 0.3364          
##          Detection Rate : 0.3178          
##    Detection Prevalence : 0.3178          
##       Balanced Accuracy : 0.9722          
##                                           
##        'Positive' Class : No              
## 
# Confusion matrix for the test results
mcrp6 <- confusionMatrix(resultado_prueba6, prueba$m1_purchase)
mcrp6
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction No Yes
##        No   4   3
##        Yes  5  14
##                                           
##                Accuracy : 0.6923          
##                  95% CI : (0.4821, 0.8567)
##     No Information Rate : 0.6538          
##     P-Value [Acc > NIR] : 0.4267          
##                                           
##                   Kappa : 0.2828          
##                                           
##  Mcnemar's Test P-Value : 0.7237          
##                                           
##             Sensitivity : 0.4444          
##             Specificity : 0.8235          
##          Pos Pred Value : 0.5714          
##          Neg Pred Value : 0.7368          
##              Prevalence : 0.3462          
##          Detection Rate : 0.1538          
##    Detection Prevalence : 0.2692          
##       Balanced Accuracy : 0.6340          
##                                           
##        'Positive' Class : No              
## 
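
The statistics in the output above can be recomputed by hand from the 2x2 table. The following is a minimal base R sketch that copies the test-set counts for the random forest model and derives the main figures; the variable names are illustrative and not part of the original analysis. Note that `No` is treated as the positive class, as the output states.

```r
# Test-set confusion matrix for the random forest, copied from the output above.
# Rows are predictions, columns are the reference (true) classes.
cm <- matrix(c(4, 5, 3, 14), nrow = 2,
             dimnames = list(Prediction = c("No", "Yes"),
                             Reference  = c("No", "Yes")))

# Accuracy: share of all cases on the diagonal (correct predictions)
accuracy <- sum(diag(cm)) / sum(cm)

# Sensitivity: share of true "No" cases predicted as "No" (positive class = "No")
sensitivity <- cm["No", "No"] / sum(cm[, "No"])

# Specificity: share of true "Yes" cases predicted as "Yes"
specificity <- cm["Yes", "Yes"] / sum(cm[, "Yes"])

# Balanced accuracy: mean of sensitivity and specificity
balanced_accuracy <- (sensitivity + specificity) / 2

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, balanced = balanced_accuracy), 4)
```

The rounded values agree with the Accuracy (0.6923), Sensitivity (0.4444), Specificity (0.8235) and Balanced Accuracy (0.6340) lines reported by `confusionMatrix()`.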

Results Table

# Collect training accuracy (first value) and test accuracy (second value) per model
resultados <- data.frame(
  "svmLinear" = c(mcre1$overall["Accuracy"], mcrp1$overall["Accuracy"]),
  "svmRadial" = c(mcre2$overall["Accuracy"], mcrp2$overall["Accuracy"]),
  "svmPoly" = c(mcre3$overall["Accuracy"], mcrp3$overall["Accuracy"]),
  "rpart" = c(mcre4$overall["Accuracy"], mcrp4$overall["Accuracy"]),
  "nnet" = c(mcre5$overall["Accuracy"], mcrp5$overall["Accuracy"]),
  "rf" = c(mcre6$overall["Accuracy"], mcrp6$overall["Accuracy"])
)

# Row labels (kept in Spanish): training accuracy and test accuracy
rownames(resultados) <- c("Precisión de entrenamiento", "Precisión de prueba")
resultados
##                            svmLinear svmRadial   svmPoly     rpart      nnet
## Precisión de entrenamiento 0.9065421 0.9813084 0.9065421 0.8037383 0.9439252
## Precisión de prueba        0.5384615 0.6538462 0.5384615 0.5769231 0.5769231
##                                   rf
## Precisión de entrenamiento 0.9813084
## Precisión de prueba        0.6923077

Conclusions

In terms of performance, the model with the highest accuracy on the test data was Random Forest (0.6923), ahead of the other models evaluated. Several models scored high on the training set but dropped sharply on the test set (Random Forest fell from 0.9813 to 0.6923), which suggests overfitting. The test set has only 26 observations, and 0.6923 is not significantly above the no-information rate of 0.6538 (p = 0.4267), so the ranking should be read with caution.

Considering both predictive ability and the investment in computational resources, I consider the Random Forest model the best option, since it offers the best balance between performance and cost.
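
To put the test accuracy in context, it can be compared with the no-information rate (always predicting the majority class). The sketch below is a base R approximation of the "P-Value [Acc > NIR]" line, assuming it comes from a one-sided exact binomial test; the counts are taken from the random forest test matrix, and the variable names are illustrative.

```r
# Counts taken from the random forest test-set confusion matrix
correct <- 4 + 14          # correctly classified test cases
n_test  <- 4 + 3 + 5 + 14  # total test cases (26)
nir     <- (3 + 14) / n_test  # no-information rate: share of the majority class ("Yes")

# One-sided exact binomial test: is the accuracy greater than the NIR?
p_value <- binom.test(correct, n_test, p = nir, alternative = "greater")$p.value

# Gap between training and test accuracy for the same model (values from the results table)
gap <- 0.9813084 - 0.6923077

c(accuracy = correct / n_test, nir = nir, p_value = p_value, gap = gap)
```

A p-value well above 0.05 together with a gap of almost 0.29 between training and test accuracy is what one would expect from a model that overfits a small sample.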

---
title: "SVM - Apple Purchase Prediction"
author: "Evelyn Rodríguez Yudiche"
date: "2026-02-27"
output: 
  html_document:
    toc: TRUE
    toc_float: TRUE
    code_download: TRUE
    theme: yeti
---

<center>
![](https://i.gifer.com/CYQx.gif)
</center>

# <span style="color: blue"> Theory </span>
The **CARET (Classification And REgression Training)** package is a comprehensive package with a wide variety of machine learning algorithms.  

# <span style="color: blue"> Install packages and load libraries </span>
```{r message=FALSE, warning=FALSE}
library(dplyr)
# install.packages("ggplot2") # Plots
library(ggplot2)
# install.packages("lattice") # Lattice graphics
library(lattice)
# install.packages("caret") # Machine learning algorithms
library(caret)
# datasets ships with base R, so no installation is needed; it provides built-in datasets such as iris
library(datasets)
# install.packages("DataExplorer") # Exploratory data analysis
library(DataExplorer)
```

# <span style="color: blue"> Load the dataset </span>
```{r}
df <- read.csv("C:\\Users\\eveyu\\Downloads\\concentración\\m2_R\\M1_data.csv")
head(df)
```

# <span style="color: blue"> Understand the dataset </span>
```{r}
summary(df)
```

```{r}
str(df)
```

```{r}
# create_report(df)
plot_missing(df)
```

```{r}
plot_histogram(df)
```

```{r}
df_numeric <- df[, sapply(df, is.numeric)]
str(df_numeric)

plot_correlation(df_numeric)
```

# <span style="color: blue"> Convert categorical variables to factors </span>
```{r}
df$trust_apple <- as.factor(df$trust_apple)
df$user_pcmac <- as.factor(df$user_pcmac)
df$familiarity_m1 <- as.factor(df$familiarity_m1)
df$m1_purchase <- as.factor(df$m1_purchase)
df$gender <- as.factor(df$gender)
df$status <- as.factor(df$status)
df$domain <- as.factor(df$domain)

str(df)

```

**NOTE: The variable we want to predict must be a FACTOR.**
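
The order of the factor levels also matters when reading the evaluation output: `confusionMatrix()` treats the first level as the positive class unless its `positive` argument says otherwise. The sketch below uses a small made-up vector (not the real data) to show the default alphabetical level order and one way to change it.

```r
# Made-up labels for illustration only; the real column is df$m1_purchase
purchase <- factor(c("Yes", "No", "Yes", "Yes"))

# Levels are sorted alphabetically by default, so "No" comes first
levels(purchase)  # "No" "Yes"

# relevel() moves the chosen level to the front without changing the data
purchase_yes_first <- relevel(purchase, ref = "Yes")
levels(purchase_yes_first)  # "Yes" "No"
```

An alternative that leaves the factor untouched is to pass `positive = "Yes"` when calling `confusionMatrix()`, so that sensitivity refers to buyers.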

# <span style="color: blue"> Split the dataset </span>
```{r}
# Typically an 80-20 split (training-test)
set.seed(123)
renglones_entrenamiento <- createDataPartition(df$m1_purchase, p=0.8, list=FALSE)
entrenamiento <- df[renglones_entrenamiento, ]
prueba <- df[-renglones_entrenamiento, ]
```
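
`createDataPartition()` draws the sample within each class of the outcome, so both subsets keep roughly the same class proportions. The following is a simplified base R sketch of that idea on made-up labels; it is not caret's actual implementation, and all names in it are illustrative.

```r
set.seed(123)

# Made-up outcome with 40 "No" and 60 "Yes" cases
y <- factor(rep(c("No", "Yes"), times = c(40, 60)))

# Group the row indices by class, then sample 80% within each class
idx_by_class <- split(seq_along(y), y)
train_idx <- sort(unlist(lapply(idx_by_class, function(idx) {
  sample(idx, size = round(0.8 * length(idx)))
}), use.names = FALSE))
test_idx <- setdiff(seq_along(y), train_idx)

# Both subsets keep the original 40/60 class balance
prop.table(table(y[train_idx]))
prop.table(table(y[test_idx]))
```

A plain `sample()` over all rows would not guarantee this balance, which matters more when the dataset is small.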

# <span style="color: blue"> Types of Modeling Methods </span>
Commonly used machine learning methods, with the names caret uses for them in the `method` argument, include:

* **SVM**: *Support Vector Machine*. Several kernels are available: linear (`svmLinear`), radial (`svmRadial`), polynomial (`svmPoly`), etc.
* **Decision Tree**: `rpart`
* **Neural Networks**: `nnet`
* **Random Forest**: `rf`
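
Each method has its own tuning parameters, and the column names of `tuneGrid` must match them. One way to look them up is caret's `modelLookup()`; the sketch below collects the parameter names for the six methods used in this document (`methods_demo` and `params_demo` are illustrative names).

```r
library(caret)

# Tuning parameters caret expects for each method used in this document
methods_demo <- c("svmLinear", "svmRadial", "svmPoly", "rpart", "nnet", "rf")
params_demo  <- lapply(methods_demo,
                       function(m) as.character(modelLookup(m)$parameter))
names(params_demo) <- methods_demo
params_demo
```

For example, `svmRadial` lists `sigma` and `C`, which is why its grid is built with exactly those two columns.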

# <span style="color: blue"> Model 1. Linear SVM </span>
```{r message=FALSE, warning=FALSE}
modelo1 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "svmLinear", # Change per model
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10),
                 tuneGrid = data.frame(C=1) # Change per model
                 )

resultado_entrenamiento1 <- predict(modelo1, entrenamiento)
resultado_prueba1 <- predict(modelo1, prueba)

# Confusion matrix
# A table that breaks down the classification model's performance by predicted vs. actual class.

# Confusion matrix for the training predictions
mcre1 <- confusionMatrix(resultado_entrenamiento1, entrenamiento$m1_purchase)
mcre1
# Confusion matrix for the test predictions
mcrp1 <- confusionMatrix(resultado_prueba1, prueba$m1_purchase)
mcrp1
```
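
The accuracy reported by `confusionMatrix()` can be reproduced by hand: it is the share of cases on the diagonal of the table. The sketch below uses made-up labels (not taken from the survey data) and base R only; `actual`, `predicted`, and `cm_demo` are illustrative names.

```r
# Made-up true labels and predictions (not taken from the survey data)
actual    <- factor(c("Yes", "Yes", "No", "Yes", "No", "No",  "Yes", "No"))
predicted <- factor(c("Yes", "No",  "No", "Yes", "No", "Yes", "Yes", "No"))

# Rows are predictions, columns are the true classes
cm_demo <- table(Prediction = predicted, Reference = actual)
cm_demo

# Accuracy = correctly classified cases (the diagonal) / all cases
accuracy_demo <- sum(diag(cm_demo)) / sum(cm_demo)
accuracy_demo   # 6 of 8 correct = 0.75
```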

# <span style="color: blue"> Model 2. Radial SVM </span>
```{r message=FALSE, warning=FALSE}
modelo2 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "svmRadial", # Change per model
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10),
                 tuneGrid = data.frame(sigma=1, C=1) # Change per model
                 )

resultado_entrenamiento2 <- predict(modelo2, entrenamiento)
resultado_prueba2 <- predict(modelo2, prueba)

# Confusion matrix
# A table that breaks down the classification model's performance by predicted vs. actual class.

# Confusion matrix for the training predictions
mcre2 <- confusionMatrix(resultado_entrenamiento2, entrenamiento$m1_purchase)
mcre2
# Confusion matrix for the test predictions
mcrp2 <- confusionMatrix(resultado_prueba2, prueba$m1_purchase)
mcrp2
```
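
The `preProcess = c("scale", "center")` argument standardizes each predictor. This matters for SVMs because kernels work with distances and inner products, so a predictor with a large numeric range would otherwise dominate. The sketch below reproduces the transformation in base R on a made-up data frame (`x_demo` and its columns are hypothetical). Inside `train()`, caret estimates the means and standard deviations on the training data and reuses them when predicting.

```r
# Hypothetical ratings on a 1-5 scale and ages in years
x_demo <- data.frame(rating = c(1, 3, 5, 4, 2), age = c(20, 35, 50, 41, 29))

# Center: subtract the column mean; scale: divide by the column standard deviation
z_demo <- as.data.frame(scale(x_demo, center = TRUE, scale = TRUE))

round(colMeans(z_demo), 10)   # both columns now have mean 0
apply(z_demo, 2, sd)          # and standard deviation 1
```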

# <span style="color: blue"> Model 3. Polynomial SVM </span>
```{r message=FALSE, warning=FALSE}
modelo3 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "svmPoly", # Change per model
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10),
                 tuneGrid = data.frame(degree=1, scale=1, C=1) # Change per model
                 )

resultado_entrenamiento3 <- predict(modelo3, entrenamiento)
resultado_prueba3 <- predict(modelo3, prueba)

# Confusion matrix
# A table that breaks down the classification model's performance by predicted vs. actual class.

# Confusion matrix for the training predictions
mcre3 <- confusionMatrix(resultado_entrenamiento3, entrenamiento$m1_purchase)
mcre3
# Confusion matrix for the test predictions
mcrp3 <- confusionMatrix(resultado_prueba3, prueba$m1_purchase)
mcrp3
```
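
A `tuneGrid` with a single row means cross-validation only estimates the performance of that one configuration; note also that with `degree = 1` the polynomial kernel reduces to a linear one. To let caret compare several configurations, the grid can hold one row per combination. The sketch below builds such a grid with base R's `expand.grid()`; the candidate values are illustrative assumptions, not recommendations, and sensible ranges depend on the data.

```r
# Illustrative candidate values; reasonable ranges depend on the data
grid_poly <- expand.grid(degree = 1:3,
                         scale  = c(0.01, 0.1),
                         C      = c(0.1, 1, 10))

nrow(grid_poly)   # 3 * 2 * 3 = 18 combinations
head(grid_poly)
```

Passing `tuneGrid = grid_poly` to `train()` would fit every combination under the same 10-fold cross-validation, at a correspondingly higher computational cost.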

# <span style="color: blue"> Model 4. Decision Tree </span>
```{r message=FALSE, warning=FALSE}
modelo4 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "rpart", # Change per model
                 # preProcess = c("scale", "center"), # Not needed: trees are insensitive to centering/scaling
                 trControl = trainControl(method="cv", number=10),
                 tuneLength = 10 # Change per model
                 )

resultado_entrenamiento4 <- predict(modelo4, entrenamiento)
resultado_prueba4 <- predict(modelo4, prueba)

# Confusion matrix
# A table that breaks down the classification model's performance by predicted vs. actual class.

# Confusion matrix for the training predictions
mcre4 <- confusionMatrix(resultado_entrenamiento4, entrenamiento$m1_purchase)
mcre4
# Confusion matrix for the test predictions
mcrp4 <- confusionMatrix(resultado_prueba4, prueba$m1_purchase)
mcrp4
```

# <span style="color: blue"> Model 5. Neural Networks </span>
```{r message=FALSE, warning=FALSE}
modelo5 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "nnet", # Change per model
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10),
                 trace = FALSE # Passed on to nnet to silence its iteration log; change per model
                 )

resultado_entrenamiento5 <- predict(modelo5, entrenamiento)
resultado_prueba5 <- predict(modelo5, prueba)

# Confusion matrix
# A table that breaks down the classification model's performance by predicted vs. actual class.

# Confusion matrix for the training predictions
mcre5 <- confusionMatrix(resultado_entrenamiento5, entrenamiento$m1_purchase)
mcre5
# Confusion matrix for the test predictions
mcrp5 <- confusionMatrix(resultado_prueba5, prueba$m1_purchase)
mcrp5
```

# <span style="color: blue"> Model 6. Random Forest </span>
```{r message=FALSE, warning=FALSE}
modelo6 <- train(m1_purchase ~ ., data=entrenamiento,
                 method = "rf", # Change per model
                 preProcess = c("scale", "center"),
                 trControl = trainControl(method="cv", number=10),
                 tuneGrid = expand.grid(mtry = c(2,4,6)) # Change per model
                 )

resultado_entrenamiento6 <- predict(modelo6, entrenamiento)
resultado_prueba6 <- predict(modelo6, prueba)

# Confusion matrix
# A table that breaks down the classification model's performance by predicted vs. actual class.

# Confusion matrix for the training predictions
mcre6 <- confusionMatrix(resultado_entrenamiento6, entrenamiento$m1_purchase)
mcre6
# Confusion matrix for the test predictions
mcrp6 <- confusionMatrix(resultado_prueba6, prueba$m1_purchase)
mcrp6
```

# <span style="color: blue"> Results Table </span>
```{r}
# One column per model: training accuracy (row 1) and test accuracy (row 2)
resultados <- data.frame(
  "svmLinear" = c(mcre1$overall["Accuracy"], mcrp1$overall["Accuracy"]),
  "svmRadial" = c(mcre2$overall["Accuracy"], mcrp2$overall["Accuracy"]),
  "svmPoly" = c(mcre3$overall["Accuracy"], mcrp3$overall["Accuracy"]),
  "rpart" = c(mcre4$overall["Accuracy"], mcrp4$overall["Accuracy"]),
  "nnet" = c(mcre5$overall["Accuracy"], mcrp5$overall["Accuracy"]),
  "rf" = c(mcre6$overall["Accuracy"], mcrp6$overall["Accuracy"])
)

rownames(resultados) <- c("Training accuracy", "Test accuracy")
resultados
```
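
One way to read a table like this is to look at the gap between training and test accuracy for each model: a large gap is a sign of overfitting. The sketch below defines a small helper, `accuracy_gap()` (a hypothetical name, not part of caret), and applies it to placeholder numbers that are invented for illustration and are not the results of this analysis.

```r
# Train-minus-test accuracy gap for each model (one column per model,
# row 1 = training accuracy, row 2 = test accuracy)
accuracy_gap <- function(tbl) {
  unlist(tbl[1, ]) - unlist(tbl[2, ])
}

# Placeholder values for illustration only (NOT the results of this analysis)
demo_results <- data.frame(model_a = c(0.95, 0.70), model_b = c(0.80, 0.78))
rownames(demo_results) <- c("Training accuracy", "Test accuracy")

gap_demo <- accuracy_gap(demo_results)
gap_demo                      # larger gap = stronger sign of overfitting
names(which.min(gap_demo))    # model whose accuracy generalizes best here
```

Applied to the table built above, the call would be `accuracy_gap(resultados)`.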

# <span style="color: blue"> Conclusions </span>
In terms of performance, Random Forest achieved the highest accuracy on the test data (0.6923), outperforming the other models evaluated. Although several models performed well on the training set, their accuracy dropped on the test set, which suggests overfitting.

Considering both predictive ability and computational cost, I consider Random Forest the best option, since it offers the best balance between performance and cost.
