“Year of the Bicentennial, of the consolidation of our Independence, and of the commemoration of the heroic battles of Junín and Ayacucho”

Participant:
  • Duran Durand Diego Alonso

    Course: Applied Statistics for Computing

    Instructor: Guevara Ponce Victor Manuel

    Degree program: Information Systems Management

    Campus: Piura

    Topic: Data analysis for the floriculture company “Verde Esencia” - Final Report

    Contents

    I. General aspects

    1.1 Name of the organization

    1.2 Description of the case to be analyzed

    II. Basic foundations of statistics

    2.1 Objective of the study

    2.2 Study population

    2.3 Sample

    2.4 Unit of analysis

    III. Variables and variable types

    3.1 Import into the working environment

    3.2 Variables and description of each variable

    IV. Database handling

    4.1 Loading and viewing the data

    V. Frequency tables (for each variable)

    5.1 Frequency tables of the variables

    VI. Graphical representation of data

    6.1 Charts of species by sepal width

    VII. Measures of central tendency

    7.1 Arithmetic mean

    7.2 Median

    7.3 Mode

    VIII. Measures of position

    8.1 Dividing sepal width into 4 groups

    8.2 Calculating and interpreting the quartiles of sepal length

    8.3 Dividing into 10 groups

    8.4 Dividing into 100 groups

    8.5 Skewness

    8.6 Kurtosis

    IX. Handling missing data

    9.1 Checking for missing values

    9.2 Correcting missing values

    9.2.1 Removing rows or columns with missing values

    9.2.2 Applying imputation techniques

    9.2.3 Using another library to impute data

    9.2.4 Imputation using nearest neighbors

    X. Handling outliers

    10.1 Univariate outlier detection - graphical

    10.1.1 Box plot

    10.2 Correction

    10.2.1 Removing the outliers

    XI. Variable transformation

    11.1 Square root transformation

    11.2 Exponential transformation

    11.3 Logarithmic transformation

    11.4 Comparison of transformations

    XII. Standardization and normalization of variables

    12.1 Standardization

    12.1.1 Method 1: Step by step

    12.1.2 Method 2: Direct

    12.1.3 Method 3: Relying on R functions

    12.2 Normalization

    12.2.1 Method 1

    12.2.2 Method 2: Function

    12.2.3 Applying to the whole case

    XIII. Predictive modeling

    13.1 Linear regression

    13.1.1 Scatter plot

    13.1.2 Correlation coefficient

    13.1.3 Simple linear regression

    13.2 Angular regression

    13.2.1 Representation of the observations

    13.2.2 Generating the logistic regression model

    13.2.3 Plot of the model

    13.2.4 Frequencies of the variables sepal length by sepal width

    13.2.5 Comparing models

    I. General aspects

    1.1. Name of the organization

    Floricultura “Verde Esencia”

    1.2. Description of the case to be analyzed

    The floriculture company “Verde Esencia” specializes in growing and selling various flower species, including the popular iris. The company has recently had difficulty classifying and telling apart the types of iris it grows, because they are similar in the length and width of their sepals and petals. To address this problem, the company has collected detailed data on the sepal and petal characteristics of a number of iris flowers.

    The data collected include measurements of sepal and petal length and width for different iris species, together with the species of each flower (setosa, versicolor or virginica). The goal is to use these data to develop a model that automatically classifies iris flowers into one of the three species, which would simplify inventory management and streamline the company's operations.

    II. Basic foundations of statistics

    2.1. Objective of the study

    The main objective of this study is to develop an efficient and accurate predictive model that automatically classifies iris species (setosa, versicolor or virginica) from measurements of sepal and petal length and width. The model will give “Verde Esencia” a tool for inventory management, making it easier to identify and classify iris flowers by their morphological characteristics.

    2.2. Study population

    The study population comprises all iris flowers grown by “Verde Esencia”. It includes flowers of the three species: setosa, versicolor and virginica. Each iris flower is an individual unit within the population.

    2.3. Sample

    The sample is drawn from the total population of iris flowers grown by the company. Detailed measurements of sepal and petal length and width are taken for a representative number of flowers of each species. The exact sample size depends on practical and statistical considerations needed to ensure the representativeness and validity of the model.

    2.4. Unit of analysis

    The unit of analysis in this study is each individual iris flower. The measurements of sepal and petal length and width, together with the species label, make up the information used in the analysis. The model is applied to each unit of analysis to classify the flower automatically into one of the three species.

    III. Variables and variable types

    3.1. Import into the working environment

    # Install (if missing) and load the required packages
    if (!require(ggplot2)) {
      install.packages("ggplot2")
      library(ggplot2)
    }
    ## Loading required package: ggplot2
    ## Warning: package 'ggplot2' was built under R version 4.3.2

    3.2. Variables and description of each variable

    • Sepal.Length: length of the flower's sepal, in centimeters.
    • Sepal.Width: width of the flower's sepal, in centimeters.
    • Petal.Length: length of the flower's petal, in centimeters.
    • Petal.Width: width of the flower's petal, in centimeters.
    • Species: species of the iris flower. This is a categorical attribute indicating the species to which the flower belongs (setosa, versicolor, virginica).

    IV. Database handling

    4.1. Loading and viewing the data

    # Load the data
    iris <- read.csv("iris.csv", sep = ";", stringsAsFactors = TRUE, encoding = "latin1")
    head(iris)
    ##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    ## 1          5.1         3.5          1.4         0.2  setosa
    ## 2          4.9         3.0          1.4         0.2  setosa
    ## 3          4.7         3.2          1.3         0.2  setosa
    ## 4          4.6         3.1          1.5         0.2  setosa
    ## 5          5.0         3.6          1.4         0.2  setosa
    ## 6          5.4         3.9          1.7         0.4  setosa

    V. Frequency tables

    5.1. Frequency tables of the variables

    # Frequency table for the Sepal.Length variable
    library(agricolae)
    ## Warning: package 'agricolae' was built under R version 4.3.2
    tabla_frecuencia_Sepal.Length <- table.freq(hist(iris$Sepal.Length,breaks = "Sturges", 
                                              plot = FALSE))
    tabla_frecuencia_Sepal.Length
    ##   Lower Upper Main Frequency Percentage  CF   CPF
    ## 1   4.0   4.5 4.25         5        3.3   5   3.3
    ## 2   4.5   5.0 4.75        27       18.0  32  21.3
    ## 3   5.0   5.5 5.25        27       18.0  59  39.3
    ## 4   5.5   6.0 5.75        30       20.0  89  59.3
    ## 5   6.0   6.5 6.25        31       20.7 120  80.0
    ## 6   6.5   7.0 6.75        18       12.0 138  92.0
    ## 7   7.0   7.5 7.25         6        4.0 144  96.0
    ## 8   7.5   8.0 7.75         6        4.0 150 100.0
    # Frequency table for the Sepal.Width variable
    library(agricolae)
    tabla_frecuencia_Sepal.Width <- table.freq(hist(iris$Sepal.Width,breaks = "Sturges", 
                                              plot = FALSE))
    tabla_frecuencia_Sepal.Width
    ##    Lower Upper Main Frequency Percentage  CF   CPF
    ## 1    2.0   2.2  2.1         4        2.7   4   2.7
    ## 2    2.2   2.4  2.3         7        4.7  11   7.3
    ## 3    2.4   2.6  2.5        13        8.7  24  16.0
    ## 4    2.6   2.8  2.7        23       15.3  47  31.3
    ## 5    2.8   3.0  2.9        36       24.0  83  55.3
    ## 6    3.0   3.2  3.1        24       16.0 107  71.3
    ## 7    3.2   3.4  3.3        18       12.0 125  83.3
    ## 8    3.4   3.6  3.5        10        6.7 135  90.0
    ## 9    3.6   3.8  3.7         9        6.0 144  96.0
    ## 10   3.8   4.0  3.9         3        2.0 147  98.0
    ## 11   4.0   4.2  4.1         2        1.3 149  99.3
    ## 12   4.2   4.4  4.3         1        0.7 150 100.0
    # Frequency table for the Petal.Length variable
    library(agricolae)
    tabla_frecuencia_Petal.Length <- table.freq(hist(iris$Petal.Length,breaks = "Sturges", 
                                              plot = FALSE))
    tabla_frecuencia_Petal.Length
    ##    Lower Upper Main Frequency Percentage  CF   CPF
    ## 1    1.0   1.5 1.25        37       24.7  37  24.7
    ## 2    1.5   2.0 1.75        13        8.7  50  33.3
    ## 3    2.0   2.5 2.25         0        0.0  50  33.3
    ## 4    2.5   3.0 2.75         1        0.7  51  34.0
    ## 5    3.0   3.5 3.25         4        2.7  55  36.7
    ## 6    3.5   4.0 3.75        11        7.3  66  44.0
    ## 7    4.0   4.5 4.25        21       14.0  87  58.0
    ## 8    4.5   5.0 4.75        21       14.0 108  72.0
    ## 9    5.0   5.5 5.25        17       11.3 125  83.3
    ## 10   5.5   6.0 5.75        16       10.7 141  94.0
    ## 11   6.0   6.5 6.25         5        3.3 146  97.3
    ## 12   6.5   7.0 6.75         4        2.7 150 100.0
    # Frequency table for the Petal.Width variable
    library(agricolae)
    tabla_Petal.Width <- table.freq(hist(iris$Petal.Width,breaks = "Sturges", 
                                              plot = FALSE))
    tabla_Petal.Width
    ##    Lower Upper Main Frequency Percentage  CF   CPF
    ## 1    0.0   0.2  0.1        34       22.7  34  22.7
    ## 2    0.2   0.4  0.3        14        9.3  48  32.0
    ## 3    0.4   0.6  0.5         2        1.3  50  33.3
    ## 4    0.6   0.8  0.7         0        0.0  50  33.3
    ## 5    0.8   1.0  0.9         7        4.7  57  38.0
    ## 6    1.0   1.2  1.1         8        5.3  65  43.3
    ## 7    1.2   1.4  1.3        21       14.0  86  57.3
    ## 8    1.4   1.6  1.5        16       10.7 102  68.0
    ## 9    1.6   1.8  1.7        14        9.3 116  77.3
    ## 10   1.8   2.0  1.9        11        7.3 127  84.7
    ## 11   2.0   2.2  2.1         9        6.0 136  90.7
    ## 12   2.2   2.4  2.3        11        7.3 147  98.0
    ## 13   2.4   2.6  2.5         3        2.0 150 100.0
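
    As a cross-check that does not depend on agricolae, a similar table can be sketched in base R. This is only a minimal sketch: the helper name freq_table is hypothetical, and it assumes that the built-in datasets::iris holds the same values as the iris.csv file used in this report. The class limits come from hist(), so they follow the same Sturges rule as above.

```r
# Hypothetical helper: frequency table built from the classes chosen by hist()
freq_table <- function(x, breaks = "Sturges") {
  h <- hist(x, breaks = breaks, plot = FALSE)  # compute the classes without drawing
  n <- sum(h$counts)
  data.frame(
    Lower      = head(h$breaks, -1),               # lower class limits
    Upper      = tail(h$breaks, -1),               # upper class limits
    Main       = h$mids,                           # class midpoints
    Frequency  = h$counts,                         # absolute frequency
    Percentage = round(100 * h$counts / n, 1),     # relative frequency (%)
    CF         = cumsum(h$counts),                 # cumulative frequency
    CPF        = round(100 * cumsum(h$counts) / n, 1)  # cumulative percentage
  )
}

# Assumption: the built-in iris data stands in for iris.csv
sepal_length_table <- freq_table(datasets::iris$Sepal.Length)
print(sepal_length_table)
```

    The same helper can be called on the other three numeric columns by changing the column passed to it.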

    VI. Graphical representation of data

    6.1. Charts of species by sepal width

    # Load the ggplot2 library for data visualization
    library(ggplot2)
    
    # Bar chart of sepal width by species, colored by species.
    # Note: with stat = "identity" the bars of each species are drawn on top of
    # one another, so the visible height is the largest Sepal.Width of that species.
    grafico_barras_Species_Sepal.Width <- ggplot(iris, aes(x = Species, y = Sepal.Width, fill = Species)) +
      geom_bar(stat = "identity", position = "dodge") +
      labs(title = "Distribution of species by sepal width",
           x = "Species",
           y = "Sepal.Width",
           fill = "Species") +
      scale_fill_manual(values = c(
        "setosa" = "blue",
        "versicolor" = "green",
        "virginica" = "orange"
      )) +
      theme(axis.text.x = element_text(angle = 45, hjust = 1))
    # Display the bar chart
    print(grafico_barras_Species_Sepal.Width)

    # Count chart of each observed sepal width, by species (used here as a histogram)
    histograma_Sepal.Width <- ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
      geom_bar(position = "identity", alpha = 0.7) +  # geom_bar counts rows per value, so no binwidth is needed
      labs(title = "Histogram of sepal width by species",
           x = "Sepal.Width",
           y = "Count",  # geom_bar uses stat = "count" by default
           fill = "Species") +
      theme_minimal()
    
    # Display the histogram
    print(histograma_Sepal.Width)

    # Box plot of sepal width by species
    boxplot_Sepal.Width <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
      geom_boxplot() +
      labs(title = "Distribution of sepal width by species",
           x = "Species",
           y = "Sepal.Width") +
      theme_minimal() +
      theme(axis.text.x = element_text(angle = 45, hjust = 1))
    
    # Display the box plot
    print(boxplot_Sepal.Width)

    # Density plot of sepal width by species
    density_plot <- ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
      geom_density(alpha = 0.5) +
      labs(title = "Density of sepal width by species",
           x = "Sepal.Width",
           y = "Density") +
      theme_minimal()
    
    # Display the density plot
    print(density_plot)

    # Pie chart: each slice is the sum of the sepal widths of one species
    pie_chart_Species_Sepal.Width <- ggplot(iris, aes(x = "Species", y = Sepal.Width, fill = Species)) +
      geom_bar(stat = "identity", width = 1, color = "black") +
      coord_polar("y") +
      labs(title = "Distribution of species by sepal width",
           fill = "Species") +
      theme_minimal() +
      theme(legend.position = "bottom")
    
    # Display the pie chart
    print(pie_chart_Species_Sepal.Width)

    VII. Measures of central tendency

    # Illustrative vector of sepal widths (these values are not taken from the iris data)
    Sepal.Width <- c(1,9,7,6,3,4,6,3,4,9,4,8,8,3,8,7,4)

    7.1. Arithmetic mean

    # Option 1
    promedio <- sum(Sepal.Width)/length(Sepal.Width)
    promedio
    ## [1] 5.529412
    # Option 2
    mean(Sepal.Width)
    ## [1] 5.529412

    7.2. Median

    median(Sepal.Width)
    ## [1] 6

    7.3. Mode

    # Option 1 (table)
    table(Sepal.Width)
    ## Sepal.Width
    ## 1 3 4 6 7 8 9 
    ## 1 3 4 2 2 3 2
    # Option 2
    library(modeest)
    ## Warning: package 'modeest' was built under R version 4.3.2
    ## 
    ## Attaching package: 'modeest'
    ## The following object is masked from 'package:agricolae':
    ## 
    ##     skewness
    mfv(Sepal.Width)
    ## [1] 4
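
    The mode can also be obtained without the modeest package. The following is a minimal base R sketch; the helper name stat_mode is hypothetical, and it returns every value that reaches the highest frequency, so a multimodal vector would give more than one result.

```r
# Illustrative vector used in this section (not taken from the iris data)
Sepal.Width <- c(1, 9, 7, 6, 3, 4, 6, 3, 4, 9, 4, 8, 8, 3, 8, 7, 4)

# Hypothetical helper: return all values that share the highest frequency
stat_mode <- function(x) {
  counts <- table(x)                               # frequency of each distinct value
  as.numeric(names(counts)[counts == max(counts)]) # keep the most frequent value(s)
}

moda <- stat_mode(Sepal.Width)
moda
```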

    VIII. Measures of position

    8.1. Dividing sepal width into 4 groups

    quantile(iris$Sepal.Width)
    ##   0%  25%  50%  75% 100% 
    ##  2.0  2.8  3.0  3.3  4.4
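
    The quartiles above only give the cut points. To actually assign each flower to one of the four groups, the cut points can be passed to cut(). This is a minimal sketch, assuming the built-in datasets::iris matches the iris.csv file; the names breaks_q and width_group and the labels Q1 to Q4 are illustrative choices.

```r
# Assumption: the built-in iris data stands in for iris.csv
sepal_width <- datasets::iris$Sepal.Width

# Quartile cut points (0%, 25%, 50%, 75%, 100%)
breaks_q <- quantile(sepal_width)

# Assign each flower to one of four groups;
# include.lowest = TRUE keeps the minimum value inside the first group
width_group <- cut(sepal_width, breaks = breaks_q, include.lowest = TRUE,
                   labels = c("Q1", "Q2", "Q3", "Q4"))

# Number of flowers in each group
table(width_group)
```

    Because many flowers share the same sepal width, the four groups need not have exactly the same size.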

    8.2. Calculating and interpreting the quartiles of sepal length

    quantile(iris$Sepal.Length)
    ##   0%  25%  50%  75% 100% 
    ##  4.3  5.1  5.8  6.4  7.9

    Interpretation: 25% of the flowers have a sepal length of at most 5.1 cm, 50% of at most 5.8 cm (the median), and 75% of at most 6.4 cm. The observed values range from 4.3 cm to 7.9 cm.

    8.3. Dividing into 10 groups

    quantile(iris$Sepal.Length, probs = seq(0, 1, 0.1))
    ##   0%  10%  20%  30%  40%  50%  60%  70%  80%  90% 100% 
    ## 4.30 4.80 5.00 5.27 5.60 5.80 6.10 6.30 6.52 6.90 7.90

    8.4. Dividing into 100 groups

    quantile(iris$Sepal.Length, probs = seq(0, 1, 0.01))
    ##    0%    1%    2%    3%    4%    5%    6%    7%    8%    9%   10%   11%   12% 
    ## 4.300 4.400 4.400 4.547 4.600 4.600 4.694 4.743 4.800 4.800 4.800 4.900 4.900 
    ##   13%   14%   15%   16%   17%   18%   19%   20%   21%   22%   23%   24%   25% 
    ## 4.900 4.900 5.000 5.000 5.000 5.000 5.000 5.000 5.029 5.100 5.100 5.100 5.100 
    ##   26%   27%   28%   29%   30%   31%   32%   33%   34%   35%   36%   37%   38% 
    ## 5.100 5.123 5.200 5.200 5.270 5.400 5.400 5.400 5.400 5.500 5.500 5.500 5.500 
    ##   39%   40%   41%   42%   43%   44%   45%   46%   47%   48%   49%   50%   51% 
    ## 5.511 5.600 5.600 5.600 5.607 5.700 5.700 5.700 5.700 5.700 5.800 5.800 5.800 
    ##   52%   53%   54%   55%   56%   57%   58%   59%   60%   61%   62%   63%   64% 
    ## 5.800 5.800 5.900 5.900 6.000 6.000 6.000 6.000 6.100 6.100 6.100 6.100 6.200 
    ##   65%   66%   67%   68%   69%   70%   71%   72%   73%   74%   75%   76%   77% 
    ## 6.200 6.234 6.300 6.300 6.300 6.300 6.300 6.328 6.400 6.400 6.400 6.400 6.473 
    ##   78%   79%   80%   81%   82%   83%   84%   85%   86%   87%   88%   89%   90% 
    ## 6.500 6.500 6.520 6.600 6.700 6.700 6.700 6.700 6.700 6.763 6.800 6.861 6.900 
    ##   91%   92%   93%   94%   95%   96%   97%   98%   99%  100% 
    ## 6.900 7.008 7.157 7.200 7.255 7.408 7.653 7.700 7.700 7.900

    8.5. Skewness

    library(fBasics)
    ## Warning: package 'fBasics' was built under R version 4.3.2
    ## 
    ## Attaching package: 'fBasics'
    ## The following objects are masked from 'package:modeest':
    ## 
    ##     ghMode, ghtMode, gldMode, hypMode, nigMode, skewness
    ## The following objects are masked from 'package:agricolae':
    ## 
    ##     kurtosis, skewness
    skewness(iris$Sepal.Length)
    ## [1] 0.3086407
    ## attr(,"method")
    ## [1] "moment"
    hist(iris$Sepal.Length)

    8.6. Kurtosis

    kurtosis(iris$Sepal.Length)
    ## [1] -0.6058125
    ## attr(,"method")
    ## [1] "excess"
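
    To see what these two numbers measure, they can be computed by hand from standardized values. This is a sketch under two assumptions: that the built-in datasets::iris matches iris.csv, and that the "moment" and "excess" methods average the third and fourth powers of values standardized with the sample standard deviation. Under those assumptions the results should agree with the values printed above.

```r
# Assumption: the built-in iris data stands in for iris.csv
x <- datasets::iris$Sepal.Length
n <- length(x)

# Standardize with the sample standard deviation (sd() divides by n - 1)
z <- (x - mean(x)) / sd(x)

# Skewness: average of the cubed standardized values
skew_manual <- sum(z^3) / n

# Excess kurtosis: average of the fourth powers, minus 3 (the normal reference)
kurt_manual <- sum(z^4) / n - 3

c(skewness = skew_manual, excess_kurtosis = kurt_manual)
```

    A positive skewness indicates a longer right tail, and a negative excess kurtosis indicates a distribution flatter than the normal.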

    IX. Handling missing data

    # Show the data
    head(iris)
    ##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    ## 1          5.1         3.5          1.4         0.2  setosa
    ## 2          4.9         3.0          1.4         0.2  setosa
    ## 3          4.7         3.2          1.3         0.2  setosa
    ## 4          4.6         3.1          1.5         0.2  setosa
    ## 5          5.0         3.6          1.4         0.2  setosa
    ## 6          5.4         3.9          1.7         0.4  setosa
    str(iris)
    ## 'data.frame':    150 obs. of  5 variables:
    ##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
    ##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
    ##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
    ##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
    ##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

    9.1. Checking for missing values

    # Check for columns with missing values
    which(colSums(is.na(iris))!= 0)
    ## named integer(0)

    Analysis using libraries

    library(VIM)
    library(mice)
    
    resumen_missing <- aggr(iris, numbers = TRUE)

    summary(resumen_missing)
    ## 
    ##  Missings per variable: 
    ##      Variable Count
    ##  Sepal.Length     0
    ##   Sepal.Width     0
    ##  Petal.Length     0
    ##   Petal.Width     0
    ##       Species     0
    ## 
    ##  Missings in combinations of variables: 
    ##  Combinations Count Percent
    ##     0:0:0:0:0   150     100

    To examine the patterns of missing values more closely, the following function can be used:

    library(VIM)
    matrixplot(iris)

    Another representation

    # With the mice library
    library(mice)
    md.pattern(iris, rotate.names = TRUE)
    ##  /\     /\
    ## {  `---'  }
    ## {  O   O  }
    ## ==>  V <==  No need for mice. This data set is completely observed.
    ##  \  \|/  /
    ##   `-----'

    ##     Sepal.Length Sepal.Width Petal.Length Petal.Width Species  
    ## 150            1           1            1           1       1 0
    ##                0           0            0           0       0 0

    The visdat library also visualizes missing values, but groups the columns by data type

    library(visdat)
    vis_dat(iris)

    To obtain the percentage of missing values per column

    vis_miss(iris)

    9.2. Correcting missing values

    9.2.1. Removing rows or columns with missing values

    In this case, rows are removed:

    iris_corregido1 <- na.omit(iris)
    str(iris_corregido1)
    ## 'data.frame':    150 obs. of  5 variables:
    ##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
    ##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
    ##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
    ##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
    ##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
    # Check for columns with missing values
    which(colSums(is.na(iris_corregido1))!= 0)
    ## named integer(0)

    9.2.2. Applying imputation techniques

    Imputation by measures of central tendency

    library(DMwR2)
    iris_corregido2 <- centralImputation(iris) # DMwR2: median (numeric), mode (non-numeric)
    str(iris_corregido2)
    ## 'data.frame':    150 obs. of  5 variables:
    ##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
    ##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
    ##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
    ##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
    ##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
    # Check for columns with missing values
    which(colSums(is.na(iris_corregido2))!= 0)
    ## named integer(0)

    9.2.3. Using another library to impute data

    library(VIM)
    iris_corregido3 <- initialise(iris, method = "median") # mean (continuous), median (discrete), mode (non-numeric)
    str(iris_corregido3)
    ## 'data.frame':    150 obs. of  5 variables:
    ##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
    ##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
    ##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
    ##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
    ##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
    # Check for columns with missing values
    which(colSums(is.na(iris_corregido3))!= 0)
    ## named integer(0)

    9.2.4. Imputation using nearest neighbors

    library(DMwR2)
    iris_corregido4<-knnImputation(iris, k=10)
    str(iris_corregido4)
    ## 'data.frame':    150 obs. of  5 variables:
    ##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
    ##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
    ##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
    ##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
    ##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
    # Check for columns with missing values
    which(colSums(is.na(iris_corregido4))!= 0)
    ## named integer(0)
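
    Because the data set has no missing values, the four corrections above leave it unchanged. To see an imputation actually take effect, missing values can be introduced artificially. The following is a minimal base R sketch of median imputation on one column. It assumes the built-in datasets::iris matches iris.csv, and the seed, the number of removed values (10) and the object names are arbitrary choices made for illustration.

```r
set.seed(123)                          # arbitrary seed, only for reproducibility

# Assumption: the built-in iris data stands in for iris.csv
iris_na <- datasets::iris
idx <- sample(nrow(iris_na), 10)       # 10 distinct rows chosen at random
iris_na$Sepal.Width[idx] <- NA         # artificially remove their sepal widths

# Number of missing values before imputation
sum(is.na(iris_na$Sepal.Width))

# Median imputation for a numeric column
median_width <- median(iris_na$Sepal.Width, na.rm = TRUE)
iris_imp <- iris_na
iris_imp$Sepal.Width[is.na(iris_imp$Sepal.Width)] <- median_width

# Check for columns with missing values after imputation
which(colSums(is.na(iris_imp)) != 0)
```

    The same artificial data frame could be passed to the package functions used above to compare how each method fills the gaps.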

    X. Handling outliers

    10.1. Univariate outlier detection - graphical

    The analysis is carried out only for quantitative variables

    10.1.1. Box plot

    Box-and-whisker plot

    # Box-and-whisker plot
    boxplot(iris$Sepal.Length)
    boxplot(iris$Sepal.Length)

    According to the results, sepal length has no outliers

    Checking for outliers in the Sepal.Width variable

    boxplot(iris$Sepal.Width)

    For all variables

    boxplot(iris)

    For Petal.Length

    boxplot(iris$Petal.Length)

    To check for outliers numerically, the interquartile range rule is applied to Petal.Length below. Any values it flags would then be candidates for correction.

    # Compute the interquartile range (RIC = Q3 - Q1)
    q1 <- quantile(iris$Petal.Length, 0.25)
    q3 <- quantile(iris$Petal.Length, 0.75)
    RIC <- q3-q1
    RIC
    ## 75% 
    ## 3.5
    # Limits, or whiskers (upper and lower)
    bigote_inferior <- q1-1.5*RIC
    bigote_inferior
    ##   25% 
    ## -3.65
    bigote_superior <- q3+1.5*RIC
    bigote_superior
    ##   75% 
    ## 10.35
    # Identify the outliers
    outliers_det <- iris$Petal.Length[iris$Petal.Length < bigote_inferior | iris$Petal.Length > bigote_superior]
    outliers_det
    ## numeric(0)
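
    The same rule can be wrapped in a small function and applied to every quantitative variable at once. This is a minimal sketch, assuming the built-in datasets::iris matches iris.csv; the helper name iqr_outliers and the object names are hypothetical, and k = 1.5 is the conventional multiplier used above.

```r
# Hypothetical helper: values outside [Q1 - k * IQR, Q3 + k * IQR]
iqr_outliers <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)  # first and third quartiles
  iqr <- q[2] - q[1]                              # interquartile range
  x[x < q[1] - k * iqr | x > q[2] + k * iqr]      # keep only the flagged values
}

# Assumption: the built-in iris data stands in for iris.csv
flowers <- datasets::iris
numeric_cols <- sapply(flowers, is.numeric)       # skip the Species factor

# One vector of flagged values per numeric variable
outliers_by_var <- lapply(flowers[numeric_cols], iqr_outliers)
outliers_by_var
```

    An empty vector for a variable means that the rule flags no value in that column.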

    10.2. Correction

    10.2.1. Removing the outliers

    iris_sin_atipicos <- iris[!iris$Petal.Length %in% outliers_det,]
    iris_sin_atipicos
    ##     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
    ## 1            5.1         3.5          1.4         0.2     setosa
    ## 2            4.9         3.0          1.4         0.2     setosa
    ## 3            4.7         3.2          1.3         0.2     setosa
    ## 4            4.6         3.1          1.5         0.2     setosa
    ## 5            5.0         3.6          1.4         0.2     setosa
    ## 6            5.4         3.9          1.7         0.4     setosa
    ## 7            4.6         3.4          1.4         0.3     setosa
    ## 8            5.0         3.4          1.5         0.2     setosa
    ## 9            4.4         2.9          1.4         0.2     setosa
    ## 10           4.9         3.1          1.5         0.1     setosa
    ## 11           5.4         3.7          1.5         0.2     setosa
    ## 12           4.8         3.4          1.6         0.2     setosa
    ## 13           4.8         3.0          1.4         0.1     setosa
    ## 14           4.3         3.0          1.1         0.1     setosa
    ## 15           5.8         4.0          1.2         0.2     setosa
    ## 16           5.7         4.4          1.5         0.4     setosa
    ## 17           5.4         3.9          1.3         0.4     setosa
    ## 18           5.1         3.5          1.4         0.3     setosa
    ## 19           5.7         3.8          1.7         0.3     setosa
    ## 20           5.1         3.8          1.5         0.3     setosa
    ## 21           5.4         3.4          1.7         0.2     setosa
    ## 22           5.1         3.7          1.5         0.4     setosa
    ## 23           4.6         3.6          1.0         0.2     setosa
    ## 24           5.1         3.3          1.7         0.5     setosa
    ## 25           4.8         3.4          1.9         0.2     setosa
    ## 26           5.0         3.0          1.6         0.2     setosa
    ## 27           5.0         3.4          1.6         0.4     setosa
    ## 28           5.2         3.5          1.5         0.2     setosa
    ## 29           5.2         3.4          1.4         0.2     setosa
    ## 30           4.7         3.2          1.6         0.2     setosa
    ## 31           4.8         3.1          1.6         0.2     setosa
    ## 32           5.4         3.4          1.5         0.4     setosa
    ## 33           5.2         4.1          1.5         0.1     setosa
    ## 34           5.5         4.2          1.4         0.2     setosa
    ## 35           4.9         3.1          1.5         0.2     setosa
    ## 36           5.0         3.2          1.2         0.2     setosa
    ## 37           5.5         3.5          1.3         0.2     setosa
    ## 38           4.9         3.6          1.4         0.1     setosa
    ## 39           4.4         3.0          1.3         0.2     setosa
    ## 40           5.1         3.4          1.5         0.2     setosa
    ## 41           5.0         3.5          1.3         0.3     setosa
    ## 42           4.5         2.3          1.3         0.3     setosa
    ## 43           4.4         3.2          1.3         0.2     setosa
    ## 44           5.0         3.5          1.6         0.6     setosa
    ## 45           5.1         3.8          1.9         0.4     setosa
    ## 46           4.8         3.0          1.4         0.3     setosa
    ## 47           5.1         3.8          1.6         0.2     setosa
    ## 48           4.6         3.2          1.4         0.2     setosa
    ## 49           5.3         3.7          1.5         0.2     setosa
    ## 50           5.0         3.3          1.4         0.2     setosa
    ## 51           7.0         3.2          4.7         1.4 versicolor
    ## 52           6.4         3.2          4.5         1.5 versicolor
    ## 53           6.9         3.1          4.9         1.5 versicolor
    ## 54           5.5         2.3          4.0         1.3 versicolor
    ## 55           6.5         2.8          4.6         1.5 versicolor
    ## 56           5.7         2.8          4.5         1.3 versicolor
    ## 57           6.3         3.3          4.7         1.6 versicolor
    ## 58           4.9         2.4          3.3         1.0 versicolor
    ## 59           6.6         2.9          4.6         1.3 versicolor
    ## 60           5.2         2.7          3.9         1.4 versicolor
    ## 61           5.0         2.0          3.5         1.0 versicolor
    ## 62           5.9         3.0          4.2         1.5 versicolor
    ## 63           6.0         2.2          4.0         1.0 versicolor
    ## 64           6.1         2.9          4.7         1.4 versicolor
    ## 65           5.6         2.9          3.6         1.3 versicolor
    ## 66           6.7         3.1          4.4         1.4 versicolor
    ## 67           5.6         3.0          4.5         1.5 versicolor
    ## 68           5.8         2.7          4.1         1.0 versicolor
    ## 69           6.2         2.2          4.5         1.5 versicolor
    ## 70           5.6         2.5          3.9         1.1 versicolor
    ## 71           5.9         3.2          4.8         1.8 versicolor
    ## 72           6.1         2.8          4.0         1.3 versicolor
    ## 73           6.3         2.5          4.9         1.5 versicolor
    ## 74           6.1         2.8          4.7         1.2 versicolor
    ## 75           6.4         2.9          4.3         1.3 versicolor
    ## 76           6.6         3.0          4.4         1.4 versicolor
    ## 77           6.8         2.8          4.8         1.4 versicolor
    ## 78           6.7         3.0          5.0         1.7 versicolor
    ## 79           6.0         2.9          4.5         1.5 versicolor
    ## 80           5.7         2.6          3.5         1.0 versicolor
    ## 81           5.5         2.4          3.8         1.1 versicolor
    ## 82           5.5         2.4          3.7         1.0 versicolor
    ## 83           5.8         2.7          3.9         1.2 versicolor
    ## 84           6.0         2.7          5.1         1.6 versicolor
    ## 85           5.4         3.0          4.5         1.5 versicolor
    ## 86           6.0         3.4          4.5         1.6 versicolor
    ## 87           6.7         3.1          4.7         1.5 versicolor
    ## 88           6.3         2.3          4.4         1.3 versicolor
    ## 89           5.6         3.0          4.1         1.3 versicolor
    ## 90           5.5         2.5          4.0         1.3 versicolor
    ## 91           5.5         2.6          4.4         1.2 versicolor
    ## 92           6.1         3.0          4.6         1.4 versicolor
    ## 93           5.8         2.6          4.0         1.2 versicolor
    ## 94           5.0         2.3          3.3         1.0 versicolor
    ## 95           5.6         2.7          4.2         1.3 versicolor
    ## 96           5.7         3.0          4.2         1.2 versicolor
    ## 97           5.7         2.9          4.2         1.3 versicolor
    ## 98           6.2         2.9          4.3         1.3 versicolor
    ## 99           5.1         2.5          3.0         1.1 versicolor
    ## 100          5.7         2.8          4.1         1.3 versicolor
    ## 101          6.3         3.3          6.0         2.5  virginica
    ## 102          5.8         2.7          5.1         1.9  virginica
    ## 103          7.1         3.0          5.9         2.1  virginica
    ## 104          6.3         2.9          5.6         1.8  virginica
    ## 105          6.5         3.0          5.8         2.2  virginica
    ## 106          7.6         3.0          6.6         2.1  virginica
    ## 107          4.9         2.5          4.5         1.7  virginica
    ## 108          7.3         2.9          6.3         1.8  virginica
    ## 109          6.7         2.5          5.8         1.8  virginica
    ## 110          7.2         3.6          6.1         2.5  virginica
    ## 111          6.5         3.2          5.1         2.0  virginica
    ## 112          6.4         2.7          5.3         1.9  virginica
    ## 113          6.8         3.0          5.5         2.1  virginica
    ## 114          5.7         2.5          5.0         2.0  virginica
    ## 115          5.8         2.8          5.1         2.4  virginica
    ## 116          6.4         3.2          5.3         2.3  virginica
    ## 117          6.5         3.0          5.5         1.8  virginica
    ## 118          7.7         3.8          6.7         2.2  virginica
    ## 119          7.7         2.6          6.9         2.3  virginica
    ## 120          6.0         2.2          5.0         1.5  virginica
    ## 121          6.9         3.2          5.7         2.3  virginica
    ## 122          5.6         2.8          4.9         2.0  virginica
    ## 123          7.7         2.8          6.7         2.0  virginica
    ## 124          6.3         2.7          4.9         1.8  virginica
    ## 125          6.7         3.3          5.7         2.1  virginica
    ## 126          7.2         3.2          6.0         1.8  virginica
    ## 127          6.2         2.8          4.8         1.8  virginica
    ## 128          6.1         3.0          4.9         1.8  virginica
    ## 129          6.4         2.8          5.6         2.1  virginica
    ## 130          7.2         3.0          5.8         1.6  virginica
    ## 131          7.4         2.8          6.1         1.9  virginica
    ## 132          7.9         3.8          6.4         2.0  virginica
    ## 133          6.4         2.8          5.6         2.2  virginica
    ## 134          6.3         2.8          5.1         1.5  virginica
    ## 135          6.1         2.6          5.6         1.4  virginica
    ## 136          7.7         3.0          6.1         2.3  virginica
    ## 137          6.3         3.4          5.6         2.4  virginica
    ## 138          6.4         3.1          5.5         1.8  virginica
    ## 139          6.0         3.0          4.8         1.8  virginica
    ## 140          6.9         3.1          5.4         2.1  virginica
    ## 141          6.7         3.1          5.6         2.4  virginica
    ## 142          6.9         3.1          5.1         2.3  virginica
    ## 143          5.8         2.7          5.1         1.9  virginica
    ## 144          6.8         3.2          5.9         2.3  virginica
    ## 145          6.7         3.3          5.7         2.5  virginica
    ## 146          6.7         3.0          5.2         2.3  virginica
    ## 147          6.3         2.5          5.0         1.9  virginica
    ## 148          6.5         3.0          5.2         2.0  virginica
    ## 149          6.2         3.4          5.4         2.3  virginica
    ## 150          5.9         3.0          5.1         1.8  virginica

    To confirm, we draw a box plot of the cleaned data:

    boxplot(iris_sin_atipicos$Petal.Length)
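
    The visual check can be complemented with a numeric one. The following is a minimal sketch, assuming outliers are defined by the box plot's 1.5 x IQR rule. It uses Sepal.Width from the built-in copy of iris purely as an illustration, and the names x, atipicos and x_limpio are hypothetical, not taken from the report.

    ```r
    # Illustration only: Sepal.Width from the pristine built-in data set.
    x <- datasets::iris$Sepal.Width

    # boxplot.stats() returns, in $out, the points that lie beyond the whiskers
    # (more than 1.5 * IQR away from the hinges).
    atipicos <- boxplot.stats(x)$out

    # Drop those points and count how many observations were removed.
    x_limpio <- x[!(x %in% atipicos)]
    n_eliminados <- length(x) - length(x_limpio)
    print(sort(atipicos))
    print(n_eliminados)

    # Re-run the rule on the cleaned vector. The quartiles are recomputed, so
    # new points can occasionally be flagged; an empty result agrees with a
    # box plot that shows no isolated points.
    print(boxplot.stats(x_limpio)$out)
    ```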

    XI. Variable transformation

    11.1. Square-root transformation

    # Original
    hist(iris$Petal.Length, 12)

    To take the square root, simply use the sqrt function:

    sqrt(iris$Petal.Length)
    ##   [1] 1.183216 1.183216 1.140175 1.224745 1.183216 1.303840 1.183216 1.224745
    ##   [9] 1.183216 1.224745 1.224745 1.264911 1.183216 1.048809 1.095445 1.224745
    ##  [17] 1.140175 1.183216 1.303840 1.224745 1.303840 1.224745 1.000000 1.303840
    ##  [25] 1.378405 1.264911 1.264911 1.224745 1.183216 1.264911 1.264911 1.224745
    ##  [33] 1.224745 1.183216 1.224745 1.095445 1.140175 1.183216 1.140175 1.224745
    ##  [41] 1.140175 1.140175 1.140175 1.264911 1.378405 1.183216 1.264911 1.183216
    ##  [49] 1.224745 1.183216 2.167948 2.121320 2.213594 2.000000 2.144761 2.121320
    ##  [57] 2.167948 1.816590 2.144761 1.974842 1.870829 2.049390 2.000000 2.167948
    ##  [65] 1.897367 2.097618 2.121320 2.024846 2.121320 1.974842 2.190890 2.000000
    ##  [73] 2.213594 2.167948 2.073644 2.097618 2.190890 2.236068 2.121320 1.870829
    ##  [81] 1.949359 1.923538 1.974842 2.258318 2.121320 2.121320 2.167948 2.097618
    ##  [89] 2.024846 2.000000 2.097618 2.144761 2.000000 1.816590 2.049390 2.049390
    ##  [97] 2.049390 2.073644 1.732051 2.024846 2.449490 2.258318 2.428992 2.366432
    ## [105] 2.408319 2.569047 2.121320 2.509980 2.408319 2.469818 2.258318 2.302173
    ## [113] 2.345208 2.236068 2.258318 2.302173 2.345208 2.588436 2.626785 2.236068
    ## [121] 2.387467 2.213594 2.588436 2.213594 2.387467 2.449490 2.190890 2.213594
    ## [129] 2.366432 2.408319 2.469818 2.529822 2.366432 2.258318 2.366432 2.469818
    ## [137] 2.366432 2.345208 2.190890 2.323790 2.366432 2.258318 2.258318 2.428992
    ## [145] 2.387467 2.280351 2.236068 2.280351 2.323790 2.258318

    Graphically:

    hist(sqrt(iris$Petal.Length))

    11.2. Exponential transformation

    exp(iris$Petal.Length)
    ##   [1]   4.055200   4.055200   3.669297   4.481689   4.055200   5.473947
    ##   [7]   4.055200   4.481689   4.055200   4.481689   4.481689   4.953032
    ##  [13]   4.055200   3.004166   3.320117   4.481689   3.669297   4.055200
    ##  [19]   5.473947   4.481689   5.473947   4.481689   2.718282   5.473947
    ##  [25]   6.685894   4.953032   4.953032   4.481689   4.055200   4.953032
    ##  [31]   4.953032   4.481689   4.481689   4.055200   4.481689   3.320117
    ##  [37]   3.669297   4.055200   3.669297   4.481689   3.669297   3.669297
    ##  [43]   3.669297   4.953032   6.685894   4.055200   4.953032   4.055200
    ##  [49]   4.481689   4.055200 109.947172  90.017131 134.289780  54.598150
    ##  [55]  99.484316  90.017131 109.947172  27.112639  99.484316  49.402449
    ##  [61]  33.115452  66.686331  54.598150 109.947172  36.598234  81.450869
    ##  [67]  90.017131  60.340288  90.017131  49.402449 121.510418  54.598150
    ##  [73] 134.289780 109.947172  73.699794  81.450869 121.510418 148.413159
    ##  [79]  90.017131  33.115452  44.701184  40.447304  49.402449 164.021907
    ##  [85]  90.017131  90.017131 109.947172  81.450869  60.340288  54.598150
    ##  [91]  81.450869  99.484316  54.598150  27.112639  66.686331  66.686331
    ##  [97]  66.686331  73.699794  20.085537  60.340288 403.428793 164.021907
    ## [103] 365.037468 270.426407 330.299560 735.095189  90.017131 544.571910
    ## [109] 330.299560 445.857770 164.021907 200.336810 244.691932 148.413159
    ## [115] 164.021907 200.336810 244.691932 812.405825 992.274716 148.413159
    ## [121] 298.867401 134.289780 812.405825 134.289780 298.867401 403.428793
    ## [127] 121.510418 134.289780 270.426407 330.299560 445.857770 601.845038
    ## [133] 270.426407 164.021907 270.426407 445.857770 270.426407 244.691932
    ## [139] 121.510418 221.406416 270.426407 164.021907 164.021907 365.037468
    ## [145] 298.867401 181.272242 148.413159 181.272242 221.406416 164.021907

    To view it graphically:

    hist(exp(iris$Petal.Length))

    Option 2: store the transformed values in a variable first

    Petal.Length_exp<- exp(iris$Petal.Length)
    hist(Petal.Length_exp)

    11.3. Logarithmic transformation

    log(iris$Petal.Length)
    ##   [1] 0.33647224 0.33647224 0.26236426 0.40546511 0.33647224 0.53062825
    ##   [7] 0.33647224 0.40546511 0.33647224 0.40546511 0.40546511 0.47000363
    ##  [13] 0.33647224 0.09531018 0.18232156 0.40546511 0.26236426 0.33647224
    ##  [19] 0.53062825 0.40546511 0.53062825 0.40546511 0.00000000 0.53062825
    ##  [25] 0.64185389 0.47000363 0.47000363 0.40546511 0.33647224 0.47000363
    ##  [31] 0.47000363 0.40546511 0.40546511 0.33647224 0.40546511 0.18232156
    ##  [37] 0.26236426 0.33647224 0.26236426 0.40546511 0.26236426 0.26236426
    ##  [43] 0.26236426 0.47000363 0.64185389 0.33647224 0.47000363 0.33647224
    ##  [49] 0.40546511 0.33647224 1.54756251 1.50407740 1.58923521 1.38629436
    ##  [55] 1.52605630 1.50407740 1.54756251 1.19392247 1.52605630 1.36097655
    ##  [61] 1.25276297 1.43508453 1.38629436 1.54756251 1.28093385 1.48160454
    ##  [67] 1.50407740 1.41098697 1.50407740 1.36097655 1.56861592 1.38629436
    ##  [73] 1.58923521 1.54756251 1.45861502 1.48160454 1.56861592 1.60943791
    ##  [79] 1.50407740 1.25276297 1.33500107 1.30833282 1.36097655 1.62924054
    ##  [85] 1.50407740 1.50407740 1.54756251 1.48160454 1.41098697 1.38629436
    ##  [91] 1.48160454 1.52605630 1.38629436 1.19392247 1.43508453 1.43508453
    ##  [97] 1.43508453 1.45861502 1.09861229 1.41098697 1.79175947 1.62924054
    ## [103] 1.77495235 1.72276660 1.75785792 1.88706965 1.50407740 1.84054963
    ## [109] 1.75785792 1.80828877 1.62924054 1.66770682 1.70474809 1.60943791
    ## [115] 1.62924054 1.66770682 1.70474809 1.90210753 1.93152141 1.60943791
    ## [121] 1.74046617 1.58923521 1.90210753 1.58923521 1.74046617 1.79175947
    ## [127] 1.56861592 1.58923521 1.72276660 1.75785792 1.80828877 1.85629799
    ## [133] 1.72276660 1.62924054 1.72276660 1.80828877 1.72276660 1.70474809
    ## [139] 1.56861592 1.68639895 1.72276660 1.62924054 1.62924054 1.77495235
    ## [145] 1.74046617 1.64865863 1.60943791 1.64865863 1.68639895 1.62924054

    Graphically:

    hist(log(iris$Petal.Length))

    Changing to base 2:

    log(iris$Petal.Length, base=2)
    ##   [1] 0.4854268 0.4854268 0.3785116 0.5849625 0.4854268 0.7655347 0.4854268
    ##   [8] 0.5849625 0.4854268 0.5849625 0.5849625 0.6780719 0.4854268 0.1375035
    ##  [15] 0.2630344 0.5849625 0.3785116 0.4854268 0.7655347 0.5849625 0.7655347
    ##  [22] 0.5849625 0.0000000 0.7655347 0.9259994 0.6780719 0.6780719 0.5849625
    ##  [29] 0.4854268 0.6780719 0.6780719 0.5849625 0.5849625 0.4854268 0.5849625
    ##  [36] 0.2630344 0.3785116 0.4854268 0.3785116 0.5849625 0.3785116 0.3785116
    ##  [43] 0.3785116 0.6780719 0.9259994 0.4854268 0.6780719 0.4854268 0.5849625
    ##  [50] 0.4854268 2.2326608 2.1699250 2.2927817 2.0000000 2.2016339 2.1699250
    ##  [57] 2.2326608 1.7224660 2.2016339 1.9634741 1.8073549 2.0703893 2.0000000
    ##  [64] 2.2326608 1.8479969 2.1375035 2.1699250 2.0356239 2.1699250 1.9634741
    ##  [71] 2.2630344 2.0000000 2.2927817 2.2326608 2.1043367 2.1375035 2.2630344
    ##  [78] 2.3219281 2.1699250 1.8073549 1.9259994 1.8875253 1.9634741 2.3504972
    ##  [85] 2.1699250 2.1699250 2.2326608 2.1375035 2.0356239 2.0000000 2.1375035
    ##  [92] 2.2016339 2.0000000 1.7224660 2.0703893 2.0703893 2.0703893 2.1043367
    ##  [99] 1.5849625 2.0356239 2.5849625 2.3504972 2.5607150 2.4854268 2.5360529
    ## [106] 2.7224660 2.1699250 2.6553518 2.5360529 2.6088092 2.3504972 2.4059924
    ## [113] 2.4594316 2.3219281 2.3504972 2.4059924 2.4594316 2.7441611 2.7865964
    ## [120] 2.3219281 2.5109619 2.2927817 2.7441611 2.2927817 2.5109619 2.5849625
    ## [127] 2.2630344 2.2927817 2.4854268 2.5360529 2.6088092 2.6780719 2.4854268
    ## [134] 2.3504972 2.4854268 2.6088092 2.4854268 2.4594316 2.2630344 2.4329594
    ## [141] 2.4854268 2.3504972 2.3504972 2.5607150 2.5109619 2.3785116 2.3219281
    ## [148] 2.3785116 2.4329594 2.3504972

    Graphically:

    hist(log(iris$Petal.Length, base=2))
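
    Changing the base only rescales the logarithm by a constant, since log_b(x) = ln(x) / ln(b), so the shape of the histogram is the same in every base. A short sketch to check this, using the built-in copy of iris (the variable names are illustrative):

    ```r
    x <- datasets::iris$Petal.Length

    log2_directo <- log(x, base = 2)   # the call used above
    log2_formula <- log(x) / log(2)    # change-of-base identity
    log2_atajo   <- log2(x)            # base R shortcut for base 2

    # All three routes should agree up to floating-point rounding
    print(all.equal(log2_directo, log2_formula))
    print(all.equal(log2_directo, log2_atajo))

    # A constant rescaling keeps the correlation with ln(x) at 1 (up to rounding)
    print(cor(log(x), log2_directo))
    ```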

    11.4. Comparison of transformations

    # Store each transformation in its own vector
    Petal.Length_sqrt <- sqrt(iris$Petal.Length)
    Petal.Length_exp <- exp(iris$Petal.Length)
    Petal.Length_ln <- log(iris$Petal.Length)
    Petal.Length_log2 <- log(iris$Petal.Length, base=2)
    Petal.Length_log5 <- log(iris$Petal.Length, base=5)

    Plot each one:

    par(mfrow=c(3,2))
    hist(iris$Petal.Length)
    hist(Petal.Length_sqrt)
    hist(Petal.Length_exp)
    hist(Petal.Length_ln)
    hist(Petal.Length_log2)
    hist(Petal.Length_log5)

    par(mfrow=c(1,1))

    A density plot gives a clearer view of each distribution:

    par(mfrow=c(3,2))
    plot(density(iris$Petal.Length), main = "Distribution of Petal.Length - original")
    plot(density(Petal.Length_sqrt), main = "Distribution of Petal.Length - sqrt")
    plot(density(Petal.Length_exp), main = "Distribution of Petal.Length - exp")
    plot(density(Petal.Length_ln), main = "Distribution of Petal.Length - ln")
    plot(density(Petal.Length_log2), main = "Distribution of Petal.Length - log2")
    plot(density(Petal.Length_log5), main = "Distribution of Petal.Length - log5")

    par(mfrow=c(1,1))    
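
    The plots can be complemented with a number. As a rough sketch, the moment coefficient of skewness (third central moment divided by the second central moment raised to 3/2) can be computed for each version of the variable; values closer to 0 indicate a more symmetric distribution. The helper asimetria() is written here for illustration and is not part of the original analysis.

    ```r
    # Moment coefficient of skewness: m3 / m2^(3/2)
    asimetria <- function(x) {
      d <- x - mean(x)
      mean(d^3) / mean(d^2)^(3 / 2)
    }

    x <- datasets::iris$Petal.Length

    comparacion <- c(
      original = asimetria(x),
      sqrt     = asimetria(sqrt(x)),
      exp      = asimetria(exp(x)),
      ln       = asimetria(log(x)),
      log2     = asimetria(log(x, base = 2)),
      log5     = asimetria(log(x, base = 5))
    )
    print(round(comparacion, 3))
    ```

    Because logarithms in different bases differ only by a positive constant factor, ln, log2 and log5 give the same skewness, so comparing one of them against sqrt and exp is enough.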

    Overall chart

    # Convert the selected columns to numeric. Note that Species is a factor,
    # so as.numeric() replaces its labels with the integer codes 1, 2, 3, and
    # this assignment overwrites the working copy of iris used from here on.
    iris[, 1:5] <- sapply(iris[, 1:5], as.numeric)
    
    # Check the class of each column after the conversion
    print(sapply(iris[, 1:5], class))
    ## Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species 
    ##    "numeric"    "numeric"    "numeric"    "numeric"    "numeric"
    # chart.Correlation() computes the correlations itself, so it receives the
    # data columns rather than the output of cor()
    library(PerformanceAnalytics)
    chart.Correlation(iris[, 1:5], histogram = TRUE)
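
    Because Species is a categorical label, its integer codes are arbitrary and a Pearson correlation with them is hard to interpret. A minimal alternative sketch, using only base R and the pristine built-in copy of iris, restricts the correlation to the four measurements and colours the scatter-plot matrix by species. This swaps in cor() plus pairs(); it is not the PerformanceAnalytics chart used above, and the names datos and matriz_cor are illustrative.

    ```r
    datos <- datasets::iris   # untouched copy; Species is still a factor

    # Pearson correlation matrix of the four numeric measurements
    matriz_cor <- cor(datos[, 1:4])
    print(round(matriz_cor, 2))

    # Scatter-plot matrix, one colour per species
    pairs(datos[, 1:4], col = as.integer(datos$Species), pch = 19)
    ```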

    XII. Standardization and normalization of variables

    12.1. Standardization

    head(iris)
    ##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    ## 1          5.1         3.5          1.4         0.2       1
    ## 2          4.9         3.0          1.4         0.2       1
    ## 3          4.7         3.2          1.3         0.2       1
    ## 4          4.6         3.1          1.5         0.2       1
    ## 5          5.0         3.6          1.4         0.2       1
    ## 6          5.4         3.9          1.7         0.4       1

    We apply Z-score standardization to the petal length variable (Petal.Length) manually.

    12.1.1. Method 1: step by step

    iris$Petal.Length
    ##   [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4
    ##  [19] 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2
    ##  [37] 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0
    ##  [55] 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0
    ##  [73] 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0
    ##  [91] 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3
    ## [109] 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0
    ## [127] 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9
    ## [145] 5.7 5.2 5.0 5.2 5.4 5.1
    media_Petal.Length <- mean(iris$Petal.Length)
    media_Petal.Length
    ## [1] 3.758
    desv_est <- sd(iris$Petal.Length)
    desv_est
    ## [1] 1.765298
    Petal.Length_estandar <- (iris$Petal.Length-media_Petal.Length)/desv_est
    Petal.Length_estandar
    ##   [1] -1.33575163 -1.33575163 -1.39239929 -1.27910398 -1.33575163 -1.16580868
    ##   [7] -1.33575163 -1.27910398 -1.33575163 -1.27910398 -1.27910398 -1.22245633
    ##  [13] -1.33575163 -1.50569459 -1.44904694 -1.27910398 -1.39239929 -1.33575163
    ##  [19] -1.16580868 -1.27910398 -1.16580868 -1.27910398 -1.56234224 -1.16580868
    ##  [25] -1.05251337 -1.22245633 -1.22245633 -1.27910398 -1.33575163 -1.22245633
    ##  [31] -1.22245633 -1.27910398 -1.27910398 -1.33575163 -1.27910398 -1.44904694
    ##  [37] -1.39239929 -1.33575163 -1.39239929 -1.27910398 -1.39239929 -1.39239929
    ##  [43] -1.39239929 -1.22245633 -1.05251337 -1.33575163 -1.22245633 -1.33575163
    ##  [49] -1.27910398 -1.33575163  0.53362088  0.42032558  0.64691619  0.13708732
    ##  [55]  0.47697323  0.42032558  0.53362088 -0.25944625  0.47697323  0.08043967
    ##  [61] -0.14615094  0.25038262  0.13708732  0.53362088 -0.08950329  0.36367793
    ##  [67]  0.42032558  0.19373497  0.42032558  0.08043967  0.59026853  0.13708732
    ##  [73]  0.64691619  0.53362088  0.30703027  0.36367793  0.59026853  0.70356384
    ##  [79]  0.42032558 -0.14615094  0.02379201 -0.03285564  0.08043967  0.76021149
    ##  [85]  0.42032558  0.42032558  0.53362088  0.36367793  0.19373497  0.13708732
    ##  [91]  0.36367793  0.47697323  0.13708732 -0.25944625  0.25038262  0.25038262
    ##  [97]  0.25038262  0.30703027 -0.42938920  0.19373497  1.27004036  0.76021149
    ## [103]  1.21339271  1.04344975  1.15674505  1.60992627  0.42032558  1.43998331
    ## [109]  1.15674505  1.32668801  0.76021149  0.87350679  0.98680210  0.70356384
    ## [115]  0.76021149  0.87350679  0.98680210  1.66657392  1.77986923  0.70356384
    ## [121]  1.10009740  0.64691619  1.66657392  0.64691619  1.10009740  1.27004036
    ## [127]  0.59026853  0.64691619  1.04344975  1.15674505  1.32668801  1.49663097
    ## [133]  1.04344975  0.76021149  1.04344975  1.32668801  1.04344975  0.98680210
    ## [139]  0.59026853  0.93015445  1.04344975  0.76021149  0.76021149  1.21339271
    ## [145]  1.10009740  0.81685914  0.70356384  0.81685914  0.93015445  0.76021149
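
    A quick way to confirm that the standardization worked is to check that the result has mean 0 and standard deviation 1. A minimal sketch using the built-in copy of iris (variable names are illustrative):

    ```r
    x <- datasets::iris$Petal.Length

    # z = (x - mean) / sd, using the sample standard deviation (n - 1), as sd() does
    z <- (x - mean(x)) / sd(x)

    print(round(mean(z), 10))   # 0 up to floating-point rounding
    print(sd(z))                # 1 up to floating-point rounding

    # The first observation (1.4) lies about 1.34 standard deviations below the mean
    print(round(z[1], 4))
    ```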

    12.1.2. Method 2: direct

    Petal.Length_estandar2 <- (iris$Petal.Length-mean(iris$Petal.Length))/sd(iris$Petal.Length)
    Petal.Length_estandar2
    ##   [1] -1.33575163 -1.33575163 -1.39239929 -1.27910398 -1.33575163 -1.16580868
    ##   [7] -1.33575163 -1.27910398 -1.33575163 -1.27910398 -1.27910398 -1.22245633
    ##  [13] -1.33575163 -1.50569459 -1.44904694 -1.27910398 -1.39239929 -1.33575163
    ##  [19] -1.16580868 -1.27910398 -1.16580868 -1.27910398 -1.56234224 -1.16580868
    ##  [25] -1.05251337 -1.22245633 -1.22245633 -1.27910398 -1.33575163 -1.22245633
    ##  [31] -1.22245633 -1.27910398 -1.27910398 -1.33575163 -1.27910398 -1.44904694
    ##  [37] -1.39239929 -1.33575163 -1.39239929 -1.27910398 -1.39239929 -1.39239929
    ##  [43] -1.39239929 -1.22245633 -1.05251337 -1.33575163 -1.22245633 -1.33575163
    ##  [49] -1.27910398 -1.33575163  0.53362088  0.42032558  0.64691619  0.13708732
    ##  [55]  0.47697323  0.42032558  0.53362088 -0.25944625  0.47697323  0.08043967
    ##  [61] -0.14615094  0.25038262  0.13708732  0.53362088 -0.08950329  0.36367793
    ##  [67]  0.42032558  0.19373497  0.42032558  0.08043967  0.59026853  0.13708732
    ##  [73]  0.64691619  0.53362088  0.30703027  0.36367793  0.59026853  0.70356384
    ##  [79]  0.42032558 -0.14615094  0.02379201 -0.03285564  0.08043967  0.76021149
    ##  [85]  0.42032558  0.42032558  0.53362088  0.36367793  0.19373497  0.13708732
    ##  [91]  0.36367793  0.47697323  0.13708732 -0.25944625  0.25038262  0.25038262
    ##  [97]  0.25038262  0.30703027 -0.42938920  0.19373497  1.27004036  0.76021149
    ## [103]  1.21339271  1.04344975  1.15674505  1.60992627  0.42032558  1.43998331
    ## [109]  1.15674505  1.32668801  0.76021149  0.87350679  0.98680210  0.70356384
    ## [115]  0.76021149  0.87350679  0.98680210  1.66657392  1.77986923  0.70356384
    ## [121]  1.10009740  0.64691619  1.66657392  0.64691619  1.10009740  1.27004036
    ## [127]  0.59026853  0.64691619  1.04344975  1.15674505  1.32668801  1.49663097
    ## [133]  1.04344975  0.76021149  1.04344975  1.32668801  1.04344975  0.98680210
    ## [139]  0.59026853  0.93015445  1.04344975  0.76021149  0.76021149  1.21339271
    ## [145]  1.10009740  0.81685914  0.70356384  0.81685914  0.93015445  0.76021149

    12.1.3. Method 3: using R's built-in functions

    R has several functions for standardizing; the classic one is scale.

    # scale function
    Petal.Length_estandar3 <- scale(iris$Petal.Length)
    Petal.Length_estandar3
    ##               [,1]
    ##   [1,] -1.33575163
    ##   [2,] -1.33575163
    ##   [3,] -1.39239929
    ##   [4,] -1.27910398
    ##   [5,] -1.33575163
    ##   [6,] -1.16580868
    ##   [7,] -1.33575163
    ##   [8,] -1.27910398
    ##   [9,] -1.33575163
    ##  [10,] -1.27910398
    ##  [11,] -1.27910398
    ##  [12,] -1.22245633
    ##  [13,] -1.33575163
    ##  [14,] -1.50569459
    ##  [15,] -1.44904694
    ##  [16,] -1.27910398
    ##  [17,] -1.39239929
    ##  [18,] -1.33575163
    ##  [19,] -1.16580868
    ##  [20,] -1.27910398
    ##  [21,] -1.16580868
    ##  [22,] -1.27910398
    ##  [23,] -1.56234224
    ##  [24,] -1.16580868
    ##  [25,] -1.05251337
    ##  [26,] -1.22245633
    ##  [27,] -1.22245633
    ##  [28,] -1.27910398
    ##  [29,] -1.33575163
    ##  [30,] -1.22245633
    ##  [31,] -1.22245633
    ##  [32,] -1.27910398
    ##  [33,] -1.27910398
    ##  [34,] -1.33575163
    ##  [35,] -1.27910398
    ##  [36,] -1.44904694
    ##  [37,] -1.39239929
    ##  [38,] -1.33575163
    ##  [39,] -1.39239929
    ##  [40,] -1.27910398
    ##  [41,] -1.39239929
    ##  [42,] -1.39239929
    ##  [43,] -1.39239929
    ##  [44,] -1.22245633
    ##  [45,] -1.05251337
    ##  [46,] -1.33575163
    ##  [47,] -1.22245633
    ##  [48,] -1.33575163
    ##  [49,] -1.27910398
    ##  [50,] -1.33575163
    ##  [51,]  0.53362088
    ##  [52,]  0.42032558
    ##  [53,]  0.64691619
    ##  [54,]  0.13708732
    ##  [55,]  0.47697323
    ##  [56,]  0.42032558
    ##  [57,]  0.53362088
    ##  [58,] -0.25944625
    ##  [59,]  0.47697323
    ##  [60,]  0.08043967
    ##  [61,] -0.14615094
    ##  [62,]  0.25038262
    ##  [63,]  0.13708732
    ##  [64,]  0.53362088
    ##  [65,] -0.08950329
    ##  [66,]  0.36367793
    ##  [67,]  0.42032558
    ##  [68,]  0.19373497
    ##  [69,]  0.42032558
    ##  [70,]  0.08043967
    ##  [71,]  0.59026853
    ##  [72,]  0.13708732
    ##  [73,]  0.64691619
    ##  [74,]  0.53362088
    ##  [75,]  0.30703027
    ##  [76,]  0.36367793
    ##  [77,]  0.59026853
    ##  [78,]  0.70356384
    ##  [79,]  0.42032558
    ##  [80,] -0.14615094
    ##  [81,]  0.02379201
    ##  [82,] -0.03285564
    ##  [83,]  0.08043967
    ##  [84,]  0.76021149
    ##  [85,]  0.42032558
    ##  [86,]  0.42032558
    ##  [87,]  0.53362088
    ##  [88,]  0.36367793
    ##  [89,]  0.19373497
    ##  [90,]  0.13708732
    ##  [91,]  0.36367793
    ##  [92,]  0.47697323
    ##  [93,]  0.13708732
    ##  [94,] -0.25944625
    ##  [95,]  0.25038262
    ##  [96,]  0.25038262
    ##  [97,]  0.25038262
    ##  [98,]  0.30703027
    ##  [99,] -0.42938920
    ## [100,]  0.19373497
    ## [101,]  1.27004036
    ## [102,]  0.76021149
    ## [103,]  1.21339271
    ## [104,]  1.04344975
    ## [105,]  1.15674505
    ## [106,]  1.60992627
    ## [107,]  0.42032558
    ## [108,]  1.43998331
    ## [109,]  1.15674505
    ## [110,]  1.32668801
    ## [111,]  0.76021149
    ## [112,]  0.87350679
    ## [113,]  0.98680210
    ## [114,]  0.70356384
    ## [115,]  0.76021149
    ## [116,]  0.87350679
    ## [117,]  0.98680210
    ## [118,]  1.66657392
    ## [119,]  1.77986923
    ## [120,]  0.70356384
    ## [121,]  1.10009740
    ## [122,]  0.64691619
    ## [123,]  1.66657392
    ## [124,]  0.64691619
    ## [125,]  1.10009740
    ## [126,]  1.27004036
    ## [127,]  0.59026853
    ## [128,]  0.64691619
    ## [129,]  1.04344975
    ## [130,]  1.15674505
    ## [131,]  1.32668801
    ## [132,]  1.49663097
    ## [133,]  1.04344975
    ## [134,]  0.76021149
    ## [135,]  1.04344975
    ## [136,]  1.32668801
    ## [137,]  1.04344975
    ## [138,]  0.98680210
    ## [139,]  0.59026853
    ## [140,]  0.93015445
    ## [141,]  1.04344975
    ## [142,]  0.76021149
    ## [143,]  0.76021149
    ## [144,]  1.21339271
    ## [145,]  1.10009740
    ## [146,]  0.81685914
    ## [147,]  0.70356384
    ## [148,]  0.81685914
    ## [149,]  0.93015445
    ## [150,]  0.76021149
    ## attr(,"scaled:center")
    ## [1] 3.758
    ## attr(,"scaled:scale")
    ## [1] 1.765298
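
    Note that scale() returns a one-column matrix that carries the centre and scale it used as attributes, not a plain vector. A short sketch of how to recover an ordinary vector and reuse those attributes to undo the transformation (names such as z_mat and x_recuperado are illustrative):

    ```r
    x <- datasets::iris$Petal.Length
    z_mat <- scale(x)

    print(dim(z_mat))   # 150 rows, 1 column

    # Drop the matrix structure to get an ordinary numeric vector
    z <- as.numeric(z_mat)

    # The stored attributes allow mapping z back to the original units
    centro <- attr(z_mat, "scaled:center")
    escala <- attr(z_mat, "scaled:scale")
    x_recuperado <- z * escala + centro

    print(all.equal(x_recuperado, x))
    ```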

    The advantage of the R function is that it can be applied to the whole data set at once:

    iris_cuanti_scale <- scale(iris[ ,1:5])
    head(iris_cuanti_scale)
    ##      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
    ## [1,]   -0.8976739  1.01560199    -1.335752   -1.311052 -1.220656
    ## [2,]   -1.1392005 -0.13153881    -1.335752   -1.311052 -1.220656
    ## [3,]   -1.3807271  0.32731751    -1.392399   -1.311052 -1.220656
    ## [4,]   -1.5014904  0.09788935    -1.279104   -1.311052 -1.220656
    ## [5,]   -1.0184372  1.24503015    -1.335752   -1.311052 -1.220656
    ## [6,]   -0.5353840  1.93331463    -1.165809   -1.048667 -1.220656
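
    In the output above Species is standardized as if it were a measurement, which only happens because it was converted to the codes 1, 2, 3 earlier. A hedged alternative, assuming the species label should stay categorical, is to scale only the numeric columns and leave Species as it is. The sketch starts from the pristine built-in copy of iris; iris_z and es_numerica are hypothetical names.

    ```r
    datos <- datasets::iris

    # Pick the numeric columns programmatically rather than by position
    es_numerica <- sapply(datos, is.numeric)

    # Standardize only those columns; Species stays a factor
    iris_z <- datos
    iris_z[es_numerica] <- as.data.frame(scale(datos[es_numerica]))

    print(head(iris_z, 3))
    print(round(colMeans(iris_z[es_numerica]), 10))   # approximately 0 for every column
    print(sapply(iris_z[es_numerica], sd))            # 1 for every column
    ```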

    12.2. Normalization

    12.2.1. Method 1: manual formula

    Petal.Length_normal <- (iris$Petal.Length-min(iris$Petal.Length))/(max(iris$Petal.Length)-min(iris$Petal.Length))
    Petal.Length_normal
    ##   [1] 0.06779661 0.06779661 0.05084746 0.08474576 0.06779661 0.11864407
    ##   [7] 0.06779661 0.08474576 0.06779661 0.08474576 0.08474576 0.10169492
    ##  [13] 0.06779661 0.01694915 0.03389831 0.08474576 0.05084746 0.06779661
    ##  [19] 0.11864407 0.08474576 0.11864407 0.08474576 0.00000000 0.11864407
    ##  [25] 0.15254237 0.10169492 0.10169492 0.08474576 0.06779661 0.10169492
    ##  [31] 0.10169492 0.08474576 0.08474576 0.06779661 0.08474576 0.03389831
    ##  [37] 0.05084746 0.06779661 0.05084746 0.08474576 0.05084746 0.05084746
    ##  [43] 0.05084746 0.10169492 0.15254237 0.06779661 0.10169492 0.06779661
    ##  [49] 0.08474576 0.06779661 0.62711864 0.59322034 0.66101695 0.50847458
    ##  [55] 0.61016949 0.59322034 0.62711864 0.38983051 0.61016949 0.49152542
    ##  [61] 0.42372881 0.54237288 0.50847458 0.62711864 0.44067797 0.57627119
    ##  [67] 0.59322034 0.52542373 0.59322034 0.49152542 0.64406780 0.50847458
    ##  [73] 0.66101695 0.62711864 0.55932203 0.57627119 0.64406780 0.67796610
    ##  [79] 0.59322034 0.42372881 0.47457627 0.45762712 0.49152542 0.69491525
    ##  [85] 0.59322034 0.59322034 0.62711864 0.57627119 0.52542373 0.50847458
    ##  [91] 0.57627119 0.61016949 0.50847458 0.38983051 0.54237288 0.54237288
    ##  [97] 0.54237288 0.55932203 0.33898305 0.52542373 0.84745763 0.69491525
    ## [103] 0.83050847 0.77966102 0.81355932 0.94915254 0.59322034 0.89830508
    ## [109] 0.81355932 0.86440678 0.69491525 0.72881356 0.76271186 0.67796610
    ## [115] 0.69491525 0.72881356 0.76271186 0.96610169 1.00000000 0.67796610
    ## [121] 0.79661017 0.66101695 0.96610169 0.66101695 0.79661017 0.84745763
    ## [127] 0.64406780 0.66101695 0.77966102 0.81355932 0.86440678 0.91525424
    ## [133] 0.77966102 0.69491525 0.77966102 0.86440678 0.77966102 0.76271186
    ## [139] 0.64406780 0.74576271 0.77966102 0.69491525 0.69491525 0.83050847
    ## [145] 0.79661017 0.71186441 0.67796610 0.71186441 0.74576271 0.69491525
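
    As a worked check of the formula x' = (x - min) / (max - min): Petal.Length ranges from 1.0 to 6.9, so the first observation, 1.4, maps to (1.4 - 1.0) / (6.9 - 1.0) = 0.4 / 5.9 ≈ 0.0678, which matches the first value printed above. The same check in R, as a small sketch on the built-in copy of iris (names are illustrative):

    ```r
    x <- datasets::iris$Petal.Length

    rango <- range(x)                        # minimum and maximum
    x_norm <- (x - rango[1]) / diff(rango)   # min-max normalization

    print(rango)
    print(round(x_norm[1], 4))
    print(range(x_norm))                     # normalized values span 0 to 1
    ```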

    12.2.2. Method 2: function

    library(scales)
    ## Warning: package 'scales' was built under R version 4.3.2
    rescale(iris$Petal.Length)
    ##   [1] 0.06779661 0.06779661 0.05084746 0.08474576 0.06779661 0.11864407
    ##   [7] 0.06779661 0.08474576 0.06779661 0.08474576 0.08474576 0.10169492
    ##  [13] 0.06779661 0.01694915 0.03389831 0.08474576 0.05084746 0.06779661
    ##  [19] 0.11864407 0.08474576 0.11864407 0.08474576 0.00000000 0.11864407
    ##  [25] 0.15254237 0.10169492 0.10169492 0.08474576 0.06779661 0.10169492
    ##  [31] 0.10169492 0.08474576 0.08474576 0.06779661 0.08474576 0.03389831
    ##  [37] 0.05084746 0.06779661 0.05084746 0.08474576 0.05084746 0.05084746
    ##  [43] 0.05084746 0.10169492 0.15254237 0.06779661 0.10169492 0.06779661
    ##  [49] 0.08474576 0.06779661 0.62711864 0.59322034 0.66101695 0.50847458
    ##  [55] 0.61016949 0.59322034 0.62711864 0.38983051 0.61016949 0.49152542
    ##  [61] 0.42372881 0.54237288 0.50847458 0.62711864 0.44067797 0.57627119
    ##  [67] 0.59322034 0.52542373 0.59322034 0.49152542 0.64406780 0.50847458
    ##  [73] 0.66101695 0.62711864 0.55932203 0.57627119 0.64406780 0.67796610
    ##  [79] 0.59322034 0.42372881 0.47457627 0.45762712 0.49152542 0.69491525
    ##  [85] 0.59322034 0.59322034 0.62711864 0.57627119 0.52542373 0.50847458
    ##  [91] 0.57627119 0.61016949 0.50847458 0.38983051 0.54237288 0.54237288
    ##  [97] 0.54237288 0.55932203 0.33898305 0.52542373 0.84745763 0.69491525
    ## [103] 0.83050847 0.77966102 0.81355932 0.94915254 0.59322034 0.89830508
    ## [109] 0.81355932 0.86440678 0.69491525 0.72881356 0.76271186 0.67796610
    ## [115] 0.69491525 0.72881356 0.76271186 0.96610169 1.00000000 0.67796610
    ## [121] 0.79661017 0.66101695 0.96610169 0.66101695 0.79661017 0.84745763
    ## [127] 0.64406780 0.66101695 0.77966102 0.81355932 0.86440678 0.91525424
    ## [133] 0.77966102 0.69491525 0.77966102 0.86440678 0.77966102 0.76271186
    ## [139] 0.64406780 0.74576271 0.77966102 0.69491525 0.69491525 0.83050847
    ## [145] 0.79661017 0.71186441 0.67796610 0.71186441 0.74576271 0.69491525

    12.2.3. Applying to the whole data set

    The rescale function only works on vectors; it cannot be applied directly to a data frame.
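
    One way around this limitation, sketched here in base R without extra packages, is to apply a min-max helper column by column with lapply(). The helper min_max() and the name iris_norm are hypothetical, and the sketch assumes every numeric column has at least two distinct values (otherwise the denominator would be zero).

    ```r
    # Min-max normalization of a single numeric vector
    min_max <- function(x) {
      (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
    }

    datos <- datasets::iris
    es_numerica <- sapply(datos, is.numeric)

    # Normalize each numeric column; Species is left untouched
    iris_norm <- datos
    iris_norm[es_numerica] <- lapply(datos[es_numerica], min_max)

    print(head(iris_norm, 3))
    print(sapply(iris_norm[es_numerica], range))   # each column now runs from 0 to 1
    ```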

    library(caret)
    ## Warning: package 'caret' was built under R version 4.3.2
    ## Loading required package: lattice
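    # With no method argument, preProcess() centers and scales (Z-score standardization);
    # min-max normalization would need method = "range"
    pre_procesamiento <- preProcess(iris[, 1:5])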
    predict(pre_procesamiento, iris[,1:5]) 
    ##     Sepal.Length Sepal.Width Petal.Length   Petal.Width   Species
    ## 1    -0.89767388  1.01560199  -1.33575163 -1.3110521482 -1.220656
    ## 2    -1.13920048 -0.13153881  -1.33575163 -1.3110521482 -1.220656
    ## 3    -1.38072709  0.32731751  -1.39239929 -1.3110521482 -1.220656
    ## 4    -1.50149039  0.09788935  -1.27910398 -1.3110521482 -1.220656
    ## 5    -1.01843718  1.24503015  -1.33575163 -1.3110521482 -1.220656
    ## 6    -0.53538397  1.93331463  -1.16580868 -1.0486667950 -1.220656
    ## 7    -1.50149039  0.78617383  -1.33575163 -1.1798594716 -1.220656
    ## 8    -1.01843718  0.78617383  -1.27910398 -1.3110521482 -1.220656
    ## 9    -1.74301699 -0.36096697  -1.33575163 -1.3110521482 -1.220656
    ## 10   -1.13920048  0.09788935  -1.27910398 -1.4422448248 -1.220656
    ## 11   -0.53538397  1.47445831  -1.27910398 -1.3110521482 -1.220656
    ## 12   -1.25996379  0.78617383  -1.22245633 -1.3110521482 -1.220656
    ## 13   -1.25996379 -0.13153881  -1.33575163 -1.4422448248 -1.220656
    ## 14   -1.86378030 -0.13153881  -1.50569459 -1.4422448248 -1.220656
    ## 15   -0.05233076  2.16274279  -1.44904694 -1.3110521482 -1.220656
    ## 16   -0.17309407  3.08045544  -1.27910398 -1.0486667950 -1.220656
    ## 17   -0.53538397  1.93331463  -1.39239929 -1.0486667950 -1.220656
    ## 18   -0.89767388  1.01560199  -1.33575163 -1.1798594716 -1.220656
    ## 19   -0.17309407  1.70388647  -1.16580868 -1.1798594716 -1.220656
    ## 20   -0.89767388  1.70388647  -1.27910398 -1.1798594716 -1.220656
    ## 21   -0.53538397  0.78617383  -1.16580868 -1.3110521482 -1.220656
    ## 22   -0.89767388  1.47445831  -1.27910398 -1.0486667950 -1.220656
    ## 23   -1.50149039  1.24503015  -1.56234224 -1.3110521482 -1.220656
    ## 24   -0.89767388  0.55674567  -1.16580868 -0.9174741184 -1.220656
    ## 25   -1.25996379  0.78617383  -1.05251337 -1.3110521482 -1.220656
    ## 26   -1.01843718 -0.13153881  -1.22245633 -1.3110521482 -1.220656
    ## 27   -1.01843718  0.78617383  -1.22245633 -1.0486667950 -1.220656
    ## 28   -0.77691058  1.01560199  -1.27910398 -1.3110521482 -1.220656
    ## 29   -0.77691058  0.78617383  -1.33575163 -1.3110521482 -1.220656
    ## 30   -1.38072709  0.32731751  -1.22245633 -1.3110521482 -1.220656
    ## 31   -1.25996379  0.09788935  -1.22245633 -1.3110521482 -1.220656
    ## 32   -0.53538397  0.78617383  -1.27910398 -1.0486667950 -1.220656
    ## 33   -0.77691058  2.39217095  -1.27910398 -1.4422448248 -1.220656
    ## 34   -0.41462067  2.62159911  -1.33575163 -1.3110521482 -1.220656
    ## 35   -1.13920048  0.09788935  -1.27910398 -1.3110521482 -1.220656
    ## 36   -1.01843718  0.32731751  -1.44904694 -1.3110521482 -1.220656
    ## 37   -0.41462067  1.01560199  -1.39239929 -1.3110521482 -1.220656
    ## 38   -1.13920048  1.24503015  -1.33575163 -1.4422448248 -1.220656
    ## 39   -1.74301699 -0.13153881  -1.39239929 -1.3110521482 -1.220656
    ## 40   -0.89767388  0.78617383  -1.27910398 -1.3110521482 -1.220656
    ## 41   -1.01843718  1.01560199  -1.39239929 -1.1798594716 -1.220656
    ## 42   -1.62225369 -1.73753594  -1.39239929 -1.1798594716 -1.220656
    ## 43   -1.74301699  0.32731751  -1.39239929 -1.3110521482 -1.220656
    ## 44   -1.01843718  1.01560199  -1.22245633 -0.7862814418 -1.220656
    ## 45   -0.89767388  1.70388647  -1.05251337 -1.0486667950 -1.220656
    ## 46   -1.25996379 -0.13153881  -1.33575163 -1.1798594716 -1.220656
    ## 47   -0.89767388  1.70388647  -1.22245633 -1.3110521482 -1.220656
    ## 48   -1.50149039  0.32731751  -1.33575163 -1.3110521482 -1.220656
    ## 49   -0.65614727  1.47445831  -1.27910398 -1.3110521482 -1.220656
    ## 50   -1.01843718  0.55674567  -1.33575163 -1.3110521482 -1.220656
    ## 51    1.39682886  0.32731751   0.53362088  0.2632599711  0.000000
    ## 52    0.67224905  0.32731751   0.42032558  0.3944526477  0.000000
    ## 53    1.27606556  0.09788935   0.64691619  0.3944526477  0.000000
    ## 54   -0.41462067 -1.73753594   0.13708732  0.1320672944  0.000000
    ## 55    0.79301235 -0.59039513   0.47697323  0.3944526477  0.000000
    ## 56   -0.17309407 -0.59039513   0.42032558  0.1320672944  0.000000
    ## 57    0.55148575  0.55674567   0.53362088  0.5256453243  0.000000
    ## 58   -1.13920048 -1.50810778  -0.25944625 -0.2615107354  0.000000
    ## 59    0.91377565 -0.36096697   0.47697323  0.1320672944  0.000000
    ## 60   -0.77691058 -0.81982329   0.08043967  0.2632599711  0.000000
    ## 61   -1.01843718 -2.42582042  -0.14615094 -0.2615107354  0.000000
    ## 62    0.06843254 -0.13153881   0.25038262  0.3944526477  0.000000
    ## 63    0.18919584 -1.96696410   0.13708732 -0.2615107354  0.000000
    ## 64    0.30995914 -0.36096697   0.53362088  0.2632599711  0.000000
    ## 65   -0.29385737 -0.36096697  -0.08950329  0.1320672944  0.000000
    ## 66    1.03453895  0.09788935   0.36367793  0.2632599711  0.000000
    ## 67   -0.29385737 -0.13153881   0.42032558  0.3944526477  0.000000
    ## 68   -0.05233076 -0.81982329   0.19373497 -0.2615107354  0.000000
    ## 69    0.43072244 -1.96696410   0.42032558  0.3944526477  0.000000
    ## 70   -0.29385737 -1.27867961   0.08043967 -0.1303180588  0.000000
    ## 71    0.06843254  0.32731751   0.59026853  0.7880306775  0.000000
    ## 72    0.30995914 -0.59039513   0.13708732  0.1320672944  0.000000
    ## 73    0.55148575 -1.27867961   0.64691619  0.3944526477  0.000000
    ## 74    0.30995914 -0.59039513   0.53362088  0.0008746178  0.000000
    ## 75    0.67224905 -0.36096697   0.30703027  0.1320672944  0.000000
    ## 76    0.91377565 -0.13153881   0.36367793  0.2632599711  0.000000
    ## 77    1.15530226 -0.59039513   0.59026853  0.2632599711  0.000000
    ## 78    1.03453895 -0.13153881   0.70356384  0.6568380009  0.000000
    ## 79    0.18919584 -0.36096697   0.42032558  0.3944526477  0.000000
    ## 80   -0.17309407 -1.04925145  -0.14615094 -0.2615107354  0.000000
    ## 81   -0.41462067 -1.50810778   0.02379201 -0.1303180588  0.000000
    ## 82   -0.41462067 -1.50810778  -0.03285564 -0.2615107354  0.000000
    ## 83   -0.05233076 -0.81982329   0.08043967  0.0008746178  0.000000
    ## 84    0.18919584 -0.81982329   0.76021149  0.5256453243  0.000000
    ## 85   -0.53538397 -0.13153881   0.42032558  0.3944526477  0.000000
    ## 86    0.18919584  0.78617383   0.42032558  0.5256453243  0.000000
    ## 87    1.03453895  0.09788935   0.53362088  0.3944526477  0.000000
    ## 88    0.55148575 -1.73753594   0.36367793  0.1320672944  0.000000
    ## 89   -0.29385737 -0.13153881   0.19373497  0.1320672944  0.000000
    ## 90   -0.41462067 -1.27867961   0.13708732  0.1320672944  0.000000
    ## 91   -0.41462067 -1.04925145   0.36367793  0.0008746178  0.000000
    ## 92    0.30995914 -0.13153881   0.47697323  0.2632599711  0.000000
    ## 93   -0.05233076 -1.04925145   0.13708732  0.0008746178  0.000000
    ## 94   -1.01843718 -1.73753594  -0.25944625 -0.2615107354  0.000000
    ## 95   -0.29385737 -0.81982329   0.25038262  0.1320672944  0.000000
    ## 96   -0.17309407 -0.13153881   0.25038262  0.0008746178  0.000000
    ## 97   -0.17309407 -0.36096697   0.25038262  0.1320672944  0.000000
    ## 98    0.43072244 -0.36096697   0.30703027  0.1320672944  0.000000
    ## 99   -0.89767388 -1.27867961  -0.42938920 -0.1303180588  0.000000
    ## 100  -0.17309407 -0.59039513   0.19373497  0.1320672944  0.000000
    ## 101   0.55148575  0.55674567   1.27004036  1.7063794137  1.220656
    ## 102  -0.05233076 -0.81982329   0.76021149  0.9192233541  1.220656
    ## 103   1.51759216 -0.13153881   1.21339271  1.1816087073  1.220656
    ## 104   0.55148575 -0.36096697   1.04344975  0.7880306775  1.220656
    ## 105   0.79301235 -0.13153881   1.15674505  1.3128013839  1.220656
    ## 106   2.12140867 -0.13153881   1.60992627  1.1816087073  1.220656
    ## 107  -1.13920048 -1.27867961   0.42032558  0.6568380009  1.220656
    ## 108   1.75911877 -0.36096697   1.43998331  0.7880306775  1.220656
    ## 109   1.03453895 -1.27867961   1.15674505  0.7880306775  1.220656
    ## 110   1.63835547  1.24503015   1.32668801  1.7063794137  1.220656
    ## 111   0.79301235  0.32731751   0.76021149  1.0504160307  1.220656
    ## 112   0.67224905 -0.81982329   0.87350679  0.9192233541  1.220656
    ## 113   1.15530226 -0.13153881   0.98680210  1.1816087073  1.220656
    ## 114  -0.17309407 -1.27867961   0.70356384  1.0504160307  1.220656
    ## 115  -0.05233076 -0.59039513   0.76021149  1.5751867371  1.220656
    ## 116   0.67224905  0.32731751   0.87350679  1.4439940605  1.220656
    ## 117   0.79301235 -0.13153881   0.98680210  0.7880306775  1.220656
    ## 118   2.24217198  1.70388647   1.66657392  1.3128013839  1.220656
    ## 119   2.24217198 -1.04925145   1.77986923  1.4439940605  1.220656
    ## 120   0.18919584 -1.96696410   0.70356384  0.3944526477  1.220656
    ## 121   1.27606556  0.32731751   1.10009740  1.4439940605  1.220656
    ## 122  -0.29385737 -0.59039513   0.64691619  1.0504160307  1.220656
    ## 123   2.24217198 -0.59039513   1.66657392  1.0504160307  1.220656
    ## 124   0.55148575 -0.81982329   0.64691619  0.7880306775  1.220656
    ## 125   1.03453895  0.55674567   1.10009740  1.1816087073  1.220656
    ## 126   1.63835547  0.32731751   1.27004036  0.7880306775  1.220656
    ## 127   0.43072244 -0.59039513   0.59026853  0.7880306775  1.220656
    ## 128   0.30995914 -0.13153881   0.64691619  0.7880306775  1.220656
    ## 129   0.67224905 -0.59039513   1.04344975  1.1816087073  1.220656
    ## 130   1.63835547 -0.13153881   1.15674505  0.5256453243  1.220656
    ## 131   1.87988207 -0.59039513   1.32668801  0.9192233541  1.220656
    ## 132   2.48369858  1.70388647   1.49663097  1.0504160307  1.220656
    ## 133   0.67224905 -0.59039513   1.04344975  1.3128013839  1.220656
    ## 134   0.55148575 -0.59039513   0.76021149  0.3944526477  1.220656
    ## 135   0.30995914 -1.04925145   1.04344975  0.2632599711  1.220656
    ## 136   2.24217198 -0.13153881   1.32668801  1.4439940605  1.220656
    ## 137   0.55148575  0.78617383   1.04344975  1.5751867371  1.220656
    ## 138   0.67224905  0.09788935   0.98680210  0.7880306775  1.220656
    ## 139   0.18919584 -0.13153881   0.59026853  0.7880306775  1.220656
    ## 140   1.27606556  0.09788935   0.93015445  1.1816087073  1.220656
    ## 141   1.03453895  0.09788935   1.04344975  1.5751867371  1.220656
    ## 142   1.27606556  0.09788935   0.76021149  1.4439940605  1.220656
    ## 143  -0.05233076 -0.81982329   0.76021149  0.9192233541  1.220656
    ## 144   1.15530226  0.32731751   1.21339271  1.4439940605  1.220656
    ## 145   1.03453895  0.55674567   1.10009740  1.7063794137  1.220656
    ## 146   1.03453895 -0.13153881   0.81685914  1.4439940605  1.220656
    ## 147   0.55148575 -1.27867961   0.70356384  0.9192233541  1.220656
    ## 148   0.79301235 -0.13153881   0.81685914  1.0504160307  1.220656
    ## 149   0.43072244  0.78617383   0.93015445  1.4439940605  1.220656
    ## 150   0.06843254 -0.13153881   0.76021149  0.7880306775  1.220656
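The standardized table above is consistent with z-scores, where each column has its mean subtracted and is then divided by its standard deviation. A minimal sketch with base R's `scale()`, assuming the unmodified built-in dataset and only the four measurement columns (the name `z_scores` is illustrative and is not taken from the report):

```r
# z-score standardization: (value - column mean) / column standard deviation
z_scores <- scale(datasets::iris[, 1:4])

# Rows 17 and 51 can be compared with the Sepal.Length column printed above
z_scores[c(17, 51), "Sepal.Length"]

# After standardization every column has mean 0 and standard deviation 1
round(colMeans(z_scores), 10)
apply(z_scores, 2, sd)
```

Because the result is unit-free, columns measured on different scales become directly comparable.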
    # Min-max normalization with caret: rescales each column to the [0, 1] interval
    library(caret)
    pre_procesamiento <- preProcess(iris[, 1:5], method = "range")
    predict(pre_procesamiento, iris[, 1:5])
    ##     Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    ## 1     0.22222222  0.62500000   0.06779661  0.04166667     0.0
    ## 2     0.16666667  0.41666667   0.06779661  0.04166667     0.0
    ## 3     0.11111111  0.50000000   0.05084746  0.04166667     0.0
    ## 4     0.08333333  0.45833333   0.08474576  0.04166667     0.0
    ## 5     0.19444444  0.66666667   0.06779661  0.04166667     0.0
    ## 6     0.30555556  0.79166667   0.11864407  0.12500000     0.0
    ## 7     0.08333333  0.58333333   0.06779661  0.08333333     0.0
    ## 8     0.19444444  0.58333333   0.08474576  0.04166667     0.0
    ## 9     0.02777778  0.37500000   0.06779661  0.04166667     0.0
    ## 10    0.16666667  0.45833333   0.08474576  0.00000000     0.0
    ## 11    0.30555556  0.70833333   0.08474576  0.04166667     0.0
    ## 12    0.13888889  0.58333333   0.10169492  0.04166667     0.0
    ## 13    0.13888889  0.41666667   0.06779661  0.00000000     0.0
    ## 14    0.00000000  0.41666667   0.01694915  0.00000000     0.0
    ## 15    0.41666667  0.83333333   0.03389831  0.04166667     0.0
    ## 16    0.38888889  1.00000000   0.08474576  0.12500000     0.0
    ## 17    0.30555556  0.79166667   0.05084746  0.12500000     0.0
    ## 18    0.22222222  0.62500000   0.06779661  0.08333333     0.0
    ## 19    0.38888889  0.75000000   0.11864407  0.08333333     0.0
    ## 20    0.22222222  0.75000000   0.08474576  0.08333333     0.0
    ## 21    0.30555556  0.58333333   0.11864407  0.04166667     0.0
    ## 22    0.22222222  0.70833333   0.08474576  0.12500000     0.0
    ## 23    0.08333333  0.66666667   0.00000000  0.04166667     0.0
    ## 24    0.22222222  0.54166667   0.11864407  0.16666667     0.0
    ## 25    0.13888889  0.58333333   0.15254237  0.04166667     0.0
    ## 26    0.19444444  0.41666667   0.10169492  0.04166667     0.0
    ## 27    0.19444444  0.58333333   0.10169492  0.12500000     0.0
    ## 28    0.25000000  0.62500000   0.08474576  0.04166667     0.0
    ## 29    0.25000000  0.58333333   0.06779661  0.04166667     0.0
    ## 30    0.11111111  0.50000000   0.10169492  0.04166667     0.0
    ## 31    0.13888889  0.45833333   0.10169492  0.04166667     0.0
    ## 32    0.30555556  0.58333333   0.08474576  0.12500000     0.0
    ## 33    0.25000000  0.87500000   0.08474576  0.00000000     0.0
    ## 34    0.33333333  0.91666667   0.06779661  0.04166667     0.0
    ## 35    0.16666667  0.45833333   0.08474576  0.04166667     0.0
    ## 36    0.19444444  0.50000000   0.03389831  0.04166667     0.0
    ## 37    0.33333333  0.62500000   0.05084746  0.04166667     0.0
    ## 38    0.16666667  0.66666667   0.06779661  0.00000000     0.0
    ## 39    0.02777778  0.41666667   0.05084746  0.04166667     0.0
    ## 40    0.22222222  0.58333333   0.08474576  0.04166667     0.0
    ## 41    0.19444444  0.62500000   0.05084746  0.08333333     0.0
    ## 42    0.05555556  0.12500000   0.05084746  0.08333333     0.0
    ## 43    0.02777778  0.50000000   0.05084746  0.04166667     0.0
    ## 44    0.19444444  0.62500000   0.10169492  0.20833333     0.0
    ## 45    0.22222222  0.75000000   0.15254237  0.12500000     0.0
    ## 46    0.13888889  0.41666667   0.06779661  0.08333333     0.0
    ## 47    0.22222222  0.75000000   0.10169492  0.04166667     0.0
    ## 48    0.08333333  0.50000000   0.06779661  0.04166667     0.0
    ## 49    0.27777778  0.70833333   0.08474576  0.04166667     0.0
    ## 50    0.19444444  0.54166667   0.06779661  0.04166667     0.0
    ## 51    0.75000000  0.50000000   0.62711864  0.54166667     0.5
    ## 52    0.58333333  0.50000000   0.59322034  0.58333333     0.5
    ## 53    0.72222222  0.45833333   0.66101695  0.58333333     0.5
    ## 54    0.33333333  0.12500000   0.50847458  0.50000000     0.5
    ## 55    0.61111111  0.33333333   0.61016949  0.58333333     0.5
    ## 56    0.38888889  0.33333333   0.59322034  0.50000000     0.5
    ## 57    0.55555556  0.54166667   0.62711864  0.62500000     0.5
    ## 58    0.16666667  0.16666667   0.38983051  0.37500000     0.5
    ## 59    0.63888889  0.37500000   0.61016949  0.50000000     0.5
    ## 60    0.25000000  0.29166667   0.49152542  0.54166667     0.5
    ## 61    0.19444444  0.00000000   0.42372881  0.37500000     0.5
    ## 62    0.44444444  0.41666667   0.54237288  0.58333333     0.5
    ## 63    0.47222222  0.08333333   0.50847458  0.37500000     0.5
    ## 64    0.50000000  0.37500000   0.62711864  0.54166667     0.5
    ## 65    0.36111111  0.37500000   0.44067797  0.50000000     0.5
    ## 66    0.66666667  0.45833333   0.57627119  0.54166667     0.5
    ## 67    0.36111111  0.41666667   0.59322034  0.58333333     0.5
    ## 68    0.41666667  0.29166667   0.52542373  0.37500000     0.5
    ## 69    0.52777778  0.08333333   0.59322034  0.58333333     0.5
    ## 70    0.36111111  0.20833333   0.49152542  0.41666667     0.5
    ## 71    0.44444444  0.50000000   0.64406780  0.70833333     0.5
    ## 72    0.50000000  0.33333333   0.50847458  0.50000000     0.5
    ## 73    0.55555556  0.20833333   0.66101695  0.58333333     0.5
    ## 74    0.50000000  0.33333333   0.62711864  0.45833333     0.5
    ## 75    0.58333333  0.37500000   0.55932203  0.50000000     0.5
    ## 76    0.63888889  0.41666667   0.57627119  0.54166667     0.5
    ## 77    0.69444444  0.33333333   0.64406780  0.54166667     0.5
    ## 78    0.66666667  0.41666667   0.67796610  0.66666667     0.5
    ## 79    0.47222222  0.37500000   0.59322034  0.58333333     0.5
    ## 80    0.38888889  0.25000000   0.42372881  0.37500000     0.5
    ## 81    0.33333333  0.16666667   0.47457627  0.41666667     0.5
    ## 82    0.33333333  0.16666667   0.45762712  0.37500000     0.5
    ## 83    0.41666667  0.29166667   0.49152542  0.45833333     0.5
    ## 84    0.47222222  0.29166667   0.69491525  0.62500000     0.5
    ## 85    0.30555556  0.41666667   0.59322034  0.58333333     0.5
    ## 86    0.47222222  0.58333333   0.59322034  0.62500000     0.5
    ## 87    0.66666667  0.45833333   0.62711864  0.58333333     0.5
    ## 88    0.55555556  0.12500000   0.57627119  0.50000000     0.5
    ## 89    0.36111111  0.41666667   0.52542373  0.50000000     0.5
    ## 90    0.33333333  0.20833333   0.50847458  0.50000000     0.5
    ## 91    0.33333333  0.25000000   0.57627119  0.45833333     0.5
    ## 92    0.50000000  0.41666667   0.61016949  0.54166667     0.5
    ## 93    0.41666667  0.25000000   0.50847458  0.45833333     0.5
    ## 94    0.19444444  0.12500000   0.38983051  0.37500000     0.5
    ## 95    0.36111111  0.29166667   0.54237288  0.50000000     0.5
    ## 96    0.38888889  0.41666667   0.54237288  0.45833333     0.5
    ## 97    0.38888889  0.37500000   0.54237288  0.50000000     0.5
    ## 98    0.52777778  0.37500000   0.55932203  0.50000000     0.5
    ## 99    0.22222222  0.20833333   0.33898305  0.41666667     0.5
    ## 100   0.38888889  0.33333333   0.52542373  0.50000000     0.5
    ## 101   0.55555556  0.54166667   0.84745763  1.00000000     1.0
    ## 102   0.41666667  0.29166667   0.69491525  0.75000000     1.0
    ## 103   0.77777778  0.41666667   0.83050847  0.83333333     1.0
    ## 104   0.55555556  0.37500000   0.77966102  0.70833333     1.0
    ## 105   0.61111111  0.41666667   0.81355932  0.87500000     1.0
    ## 106   0.91666667  0.41666667   0.94915254  0.83333333     1.0
    ## 107   0.16666667  0.20833333   0.59322034  0.66666667     1.0
    ## 108   0.83333333  0.37500000   0.89830508  0.70833333     1.0
    ## 109   0.66666667  0.20833333   0.81355932  0.70833333     1.0
    ## 110   0.80555556  0.66666667   0.86440678  1.00000000     1.0
    ## 111   0.61111111  0.50000000   0.69491525  0.79166667     1.0
    ## 112   0.58333333  0.29166667   0.72881356  0.75000000     1.0
    ## 113   0.69444444  0.41666667   0.76271186  0.83333333     1.0
    ## 114   0.38888889  0.20833333   0.67796610  0.79166667     1.0
    ## 115   0.41666667  0.33333333   0.69491525  0.95833333     1.0
    ## 116   0.58333333  0.50000000   0.72881356  0.91666667     1.0
    ## 117   0.61111111  0.41666667   0.76271186  0.70833333     1.0
    ## 118   0.94444444  0.75000000   0.96610169  0.87500000     1.0
    ## 119   0.94444444  0.25000000   1.00000000  0.91666667     1.0
    ## 120   0.47222222  0.08333333   0.67796610  0.58333333     1.0
    ## 121   0.72222222  0.50000000   0.79661017  0.91666667     1.0
    ## 122   0.36111111  0.33333333   0.66101695  0.79166667     1.0
    ## 123   0.94444444  0.33333333   0.96610169  0.79166667     1.0
    ## 124   0.55555556  0.29166667   0.66101695  0.70833333     1.0
    ## 125   0.66666667  0.54166667   0.79661017  0.83333333     1.0
    ## 126   0.80555556  0.50000000   0.84745763  0.70833333     1.0
    ## 127   0.52777778  0.33333333   0.64406780  0.70833333     1.0
    ## 128   0.50000000  0.41666667   0.66101695  0.70833333     1.0
    ## 129   0.58333333  0.33333333   0.77966102  0.83333333     1.0
    ## 130   0.80555556  0.41666667   0.81355932  0.62500000     1.0
    ## 131   0.86111111  0.33333333   0.86440678  0.75000000     1.0
    ## 132   1.00000000  0.75000000   0.91525424  0.79166667     1.0
    ## 133   0.58333333  0.33333333   0.77966102  0.87500000     1.0
    ## 134   0.55555556  0.33333333   0.69491525  0.58333333     1.0
    ## 135   0.50000000  0.25000000   0.77966102  0.54166667     1.0
    ## 136   0.94444444  0.41666667   0.86440678  0.91666667     1.0
    ## 137   0.55555556  0.58333333   0.77966102  0.95833333     1.0
    ## 138   0.58333333  0.45833333   0.76271186  0.70833333     1.0
    ## 139   0.47222222  0.41666667   0.64406780  0.70833333     1.0
    ## 140   0.72222222  0.45833333   0.74576271  0.83333333     1.0
    ## 141   0.66666667  0.45833333   0.77966102  0.95833333     1.0
    ## 142   0.72222222  0.45833333   0.69491525  0.91666667     1.0
    ## 143   0.41666667  0.29166667   0.69491525  0.75000000     1.0
    ## 144   0.69444444  0.50000000   0.83050847  0.91666667     1.0
    ## 145   0.66666667  0.54166667   0.79661017  1.00000000     1.0
    ## 146   0.66666667  0.41666667   0.71186441  0.91666667     1.0
    ## 147   0.55555556  0.20833333   0.67796610  0.75000000     1.0
    ## 148   0.61111111  0.41666667   0.71186441  0.79166667     1.0
    ## 149   0.52777778  0.58333333   0.74576271  0.91666667     1.0
    ## 150   0.44444444  0.41666667   0.69491525  0.70833333     1.0
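The "range" method maps the smallest value of each column to 0 and the largest to 1. A minimal sketch of the same arithmetic in base R, assuming the unmodified built-in dataset; `rescale_01` and `measures_01` are hypothetical names introduced only for this illustration and are not part of caret:

```r
# Min-max rescaling: (x - min) / (max - min)
rescale_01 <- function(x) (x - min(x)) / (max(x) - min(x))

measures    <- datasets::iris[, 1:4]   # the four numeric measurement columns
measures_01 <- as.data.frame(lapply(measures, rescale_01))

head(measures_01, 3)
sapply(measures_01, range)             # each column should span 0 to 1
```

Unlike z-scores, this transformation keeps the shape of each distribution but is sensitive to extreme values, since a single outlier defines the end of the scale.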

    XIII. Predictive modeling

    13.1. Linear regression

    13.1.1. Scatter plot

    The following data were extracted from iris.csv:

    nuevos_datos <- data.frame(
      Sepal.Length= c(5,4,4,4,5,5,4,5,4,4,5,4,4,4,5,5,5),
      Sepal.Width= c(1,9,7,6,3,4,6,3,4,9,4,8,8,3,8,7,4)
    )
    
    nuevos_datos
    ##    Sepal.Length Sepal.Width
    ## 1             5           1
    ## 2             4           9
    ## 3             4           7
    ## 4             4           6
    ## 5             5           3
    ## 6             5           4
    ## 7             4           6
    ## 8             5           3
    ## 9             4           4
    ## 10            4           9
    ## 11            5           4
    ## 12            4           8
    ## 13            4           8
    ## 14            4           3
    ## 15            5           8
    ## 16            5           7
    ## 17            5           4
    # Scatter plot with plot()
    plot(nuevos_datos)

    # Scatter plot matrix with pairs()
    pairs(nuevos_datos)

    # Enhanced chart: scatter plots, histograms, and correlations in one figure
    library(PerformanceAnalytics)
    chart.Correlation(nuevos_datos)
    ## Warning in par(usr): argument 1 does not name a graphical parameter

    # Plot of the correlation matrix with corrplot
    library(corrplot)
    ## Warning: package 'corrplot' was built under R version 4.3.2
    ## corrplot 0.92 loaded
    corrplot(cor(nuevos_datos))

    13.1.2. Correlation coefficient

    # Using the cor() function
    cor(nuevos_datos) # Correlation matrix
    ##              Sepal.Length Sepal.Width
    ## Sepal.Length    1.0000000  -0.5069806
    ## Sepal.Width    -0.5069806   1.0000000

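    Correlation coefficient:

    r = -0.5069806

    The negative sign indicates a moderate inverse linear association between Sepal.Length and Sepal.Width in nuevos_datos.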
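The coefficient reported by `cor()` can be reproduced from its definition, r = cov(x, y) / (sd(x) · sd(y)). A short sketch that re-creates `nuevos_datos` so the block runs on its own; the call to `cor.test()` is an addition for illustration and is not part of the original analysis:

```r
nuevos_datos <- data.frame(
  Sepal.Length = c(5,4,4,4,5,5,4,5,4,4,5,4,4,4,5,5,5),
  Sepal.Width  = c(1,9,7,6,3,4,6,3,4,9,4,8,8,3,8,7,4)
)

x <- nuevos_datos$Sepal.Length
y <- nuevos_datos$Sepal.Width

# Pearson correlation written out from its definition
r_manual <- cov(x, y) / (sd(x) * sd(y))
r_manual

# Built-in equivalent, plus a significance test for the correlation
cor(x, y)
cor.test(x, y)
```

Keeping the sign matters: it tells the direction of the relationship, while the absolute value tells its strength.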

    13.1.3. Simple linear regression

    # lm() uses the formula notation Y ~ X, with data = <data frame>
    modelo_iris <- lm(Sepal.Length ~ Sepal.Width, data = iris)
    
    # Summary of the results
    summary(modelo_iris)
    ## 
    ## Call:
    ## lm(formula = Sepal.Length ~ Sepal.Width, data = iris)
    ## 
    ## Residuals:
    ##     Min      1Q  Median      3Q     Max 
    ## -1.5561 -0.6333 -0.1120  0.5579  2.2226 
    ## 
    ## Coefficients:
    ##             Estimate Std. Error t value Pr(>|t|)    
    ## (Intercept)   6.5262     0.4789   13.63   <2e-16 ***
    ## Sepal.Width  -0.2234     0.1551   -1.44    0.152    
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ## 
    ## Residual standard error: 0.8251 on 148 degrees of freedom
    ## Multiple R-squared:  0.01382,    Adjusted R-squared:  0.007159 
    ## F-statistic: 2.074 on 1 and 148 DF,  p-value: 0.1519
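The quantities printed by `summary()` can also be pulled out of the fitted object and used for prediction. A minimal sketch, assuming the unmodified built-in dataset; the object name `fit` and the new widths 2.5, 3.0 and 3.5 are arbitrary values chosen for illustration:

```r
fit <- lm(Sepal.Length ~ Sepal.Width, data = datasets::iris)

# Intercept and slope of the fitted line: Sepal.Length = b0 + b1 * Sepal.Width
coef(fit)

# Share of the variance in Sepal.Length explained by Sepal.Width
summary(fit)$r.squared

# Predicted sepal length, with 95% confidence intervals, for three new widths
new_widths <- data.frame(Sepal.Width = c(2.5, 3.0, 3.5))
predict(fit, newdata = new_widths, interval = "confidence")
```

With an R-squared this close to zero and a slope that is not significant at the 5% level, the pooled model says sepal width alone is a weak predictor of sepal length.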

    13.2. Generalized linear model

    13.2.1. Plotting the observations

    # Improved plot: one box per observed Sepal.Length value, with jittered points
    ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Sepal.Length)) +
      geom_boxplot(aes(group = Sepal.Length), outlier.shape = NA) +
      geom_jitter(width = 0.1) +
      theme_bw() +
      theme(legend.position = "none")

    13.2.2. Fitting the generalized linear model (Gaussian family)

    modelgest<-glm(Sepal.Length~Sepal.Width, data= iris, family = gaussian())
    
    summary(modelgest)
    ## 
    ## Call:
    ## glm(formula = Sepal.Length ~ Sepal.Width, family = gaussian(), 
    ##     data = iris)
    ## 
    ## Coefficients:
    ##             Estimate Std. Error t value Pr(>|t|)    
    ## (Intercept)   6.5262     0.4789   13.63   <2e-16 ***
    ## Sepal.Width  -0.2234     0.1551   -1.44    0.152    
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ## 
    ## (Dispersion parameter for gaussian family taken to be 0.6807844)
    ## 
    ##     Null deviance: 102.17  on 149  degrees of freedom
    ## Residual deviance: 100.76  on 148  degrees of freedom
    ## AIC: 371.99
    ## 
    ## Number of Fisher Scoring iterations: 2
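A GLM with `family = gaussian()` and the default identity link is the same model as ordinary least squares, which is why the coefficients above match those of `lm()`. A quick check, assuming the unmodified built-in dataset; `fit_lm` and `fit_glm` are illustrative names:

```r
fit_lm  <- lm(Sepal.Length ~ Sepal.Width, data = datasets::iris)
fit_glm <- glm(Sepal.Length ~ Sepal.Width, data = datasets::iris,
               family = gaussian())

# Same intercept and slope from both fitting functions
all.equal(coef(fit_lm), coef(fit_glm))

# The GLM dispersion parameter is the squared residual standard error of lm()
summary(fit_glm)$dispersion
summary(fit_lm)$sigma^2
```

A logistic regression would instead need `family = binomial()` and a response coded as 0/1 or as a proportion.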

    13.2.3. Plot of the model

    # Make sure Sepal.Width is stored as a numeric vector
    iris$Sepal.Width <- as.numeric(as.character(iris$Sepal.Width))
    
    # Scatter plot with the predictor on the x axis, matching Sepal.Length ~ Sepal.Width
    plot(Sepal.Length ~ Sepal.Width, iris, col = "darkblue",
         main = "Generalized linear model (Gaussian family)",
         ylab = "Sepal.Length",
         xlab = "Sepal.Width", pch = 16)
    
    # Add the regression line fitted by modelgest
    abline(coef(modelgest), col = "firebrick", lwd = 2.5)

    13.2.4. Frequency distributions of sepal length and sepal width

    ggplot(iris, aes(Sepal.Length)) +
      geom_histogram(binwidth = 0.25, fill = "red", colour = "black") +
      labs(x = "Sepal length", y = "Frequency") +
      ggtitle("Frequency vs. sepal length")

    # A bin width of 0.25 suits Sepal.Width, which only spans about 2.0 to 4.4
    ggplot(iris, aes(Sepal.Width)) +
      geom_histogram(binwidth = 0.25, fill = "red", colour = "black") +
      labs(x = "Sepal width", y = "Frequency") +
      ggtitle("Frequency vs. sepal width")
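The bin width decides how many bars a histogram shows, so it should be chosen relative to the spread of the variable. A rough sketch for Sepal.Width, assuming the unmodified built-in dataset; the variable names are illustrative, and the Freedman-Diaconis rule is included only as one common rule of thumb, not as the report's method:

```r
sepal_width <- datasets::iris$Sepal.Width
value_span  <- diff(range(sepal_width))   # distance between smallest and largest value

# Approximate number of bars for two candidate bin widths
bins_needed <- ceiling(value_span / c(wide = 4, narrow = 0.25))
bins_needed

# Freedman-Diaconis rule of thumb for a data-driven bin width
fd_width <- 2 * IQR(sepal_width) / length(sepal_width)^(1/3)
fd_width
```

A width of 4 is larger than the whole spread of the variable, so every observation would fall into a single bar, whereas a width of 0.25 gives roughly ten bars (ggplot2 may draw one more or fewer depending on how it aligns the bins).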

    13.2.5. Comparing models

    ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
      geom_jitter(height = 0.10) +
      # GLM fit; the Gaussian family is used because Sepal.Width is continuous
      # (a binomial family requires a response between 0 and 1)
      stat_smooth(method = "glm", method.args = list(family = "gaussian")) +
      geom_smooth(color = "yellow") +               # loess smoother (default method)
      geom_smooth(method = lm, color = "purple") +  # ordinary least squares
      labs(x = "Sepal length", y = "Sepal width") +
      ggtitle("Fitted models of sepal width as a function of sepal length")
    ## `geom_smooth()` using formula = 'y ~ x'
    ## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
    ## `geom_smooth()` using formula = 'y ~ x'
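A binomial (logistic) fit expects a response between 0 and 1, which a measurement such as Sepal.Width is not. A minimal sketch of what a logistic model could look like on these data, assuming a hypothetical binary indicator `is_virginica` that is not part of the original analysis:

```r
flowers <- datasets::iris

# Hypothetical 0/1 response: 1 if the flower is virginica, 0 otherwise
flowers$is_virginica <- as.integer(flowers$Species == "virginica")

# Logistic regression: log-odds of being virginica as a linear function of sepal length
fit_logit <- glm(is_virginica ~ Sepal.Length, data = flowers, family = binomial())
coef(fit_logit)

# Predicted probabilities for three illustrative sepal lengths
probs <- predict(fit_logit,
                 newdata = data.frame(Sepal.Length = c(5, 6, 7)),
                 type = "response")
probs
```

With a 0/1 response like this one, the same `method.args = list(family = "binomial")` smoother can be drawn in ggplot2 without the range error.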