2.3 Sample
8.1 Categorizing sepal width into 4 groups
8.2 Calculating and interpreting the quartiles of sepal length
8.5 Skewness
8.6 Kurtosis
9.1 Checking for missing values
9.2.1 Removing rows or columns with missing values
9.2.2 Applying imputation techniques
10.1 Univariate outlier detection - graphical
10.1.1 Box plot
10.2 Correction
10.2.1 Removing the outliers
11.1 Square root transformation
11.2 Exponential transformation
12.1 Standardization
12.1.1 Method 1: Step by step
12.1.2 Method 2: Direct
12.1.3 Method 3: Relying on R functions
12.2 Normalization
12.2.1 Method 1
12.2.2 Method 2: Function
12.2.3 Applying to the whole case
13.1 Linear regression
13.1.1 Scatter plot
13.1.2 Correlation coefficient
13.1.3 Simple linear regression
13.2 Angular regression
13.2.1 Representation of the observations
13.2.2 Generating the logistic regression model
13.2.3 Plot of the model
13.2.4 Frequencies of sepal length by sepal width
13.2.5 Comparing models
Floricultura “Verde Esencia”
The “Verde Esencia” flower nursery specializes in growing and selling various species of flowers, including the popular iris plants. Recently, the company has had difficulty classifying and telling apart the types of iris it grows, because they are similar in the length and width of their sepals and petals. To address this problem, the company has collected detailed data on the sepal and petal characteristics of a number of iris flowers.
The collected data include measurements of sepal and petal length and width for different iris species, together with the species of each flower (setosa, versicolor or virginica). The aim is to use these data to develop a model that automatically assigns each iris flower to one of the three species, which simplifies inventory management and streamlines the nursery's operations.
The main objective of this study is to develop an efficient and accurate predictive model that automatically classifies iris species (setosa, versicolor or virginica) from measurements of sepal and petal length and width. The model will give “Verde Esencia” a tool for inventory management by making it easier to identify and classify iris flowers from their morphological characteristics.
The study population comprises all iris flowers grown by “Verde Esencia”. It includes flowers of the three species: setosa, versicolor and virginica. Each iris flower is an individual unit within the population.
The sample is drawn from the total population of iris flowers grown by the nursery. Detailed measurements of sepal and petal length and width are taken on a representative number of flowers of each species. The exact sample size depends on practical and statistical considerations that ensure the representativeness and validity of the model.
The unit of analysis in this study is each individual iris flower. The sepal and petal length and width measurements, together with the species label, are the information used in the analysis. The model is applied to each unit of analysis to classify the flower automatically into one of the three species.
# Install (if missing) and load the required packages
if (!require(ggplot2)) {
install.packages("ggplot2")
library(ggplot2)
}
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.3.2
Sepal.Length: sepal length of the flower, in centimeters.
Sepal.Width: sepal width of the flower, in centimeters.
Petal.Length: petal length of the flower, in centimeters.
Petal.Width: petal width of the flower, in centimeters.
Species: species of the iris flower. This is a categorical attribute indicating which species the flower belongs to (setosa, versicolor, virginica).
# Load the data
iris <- read.csv("iris.csv", sep = ";", stringsAsFactors = TRUE, encoding = "latin1");
head(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
# Frequency table for the variable Sepal.Length
library(agricolae)
## Warning: package 'agricolae' was built under R version 4.3.2
tabla_frecuencia_Sepal.Length <- table.freq(hist(iris$Sepal.Length,breaks = "Sturges",
plot = FALSE))
tabla_frecuencia_Sepal.Length
## Lower Upper Main Frequency Percentage CF CPF
## 1 4.0 4.5 4.25 5 3.3 5 3.3
## 2 4.5 5.0 4.75 27 18.0 32 21.3
## 3 5.0 5.5 5.25 27 18.0 59 39.3
## 4 5.5 6.0 5.75 30 20.0 89 59.3
## 5 6.0 6.5 6.25 31 20.7 120 80.0
## 6 6.5 7.0 6.75 18 12.0 138 92.0
## 7 7.0 7.5 7.25 6 4.0 144 96.0
## 8 7.5 8.0 7.75 6 4.0 150 100.0
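As a cross-check on the table above, the same class limits and counts can be reproduced in base R without agricolae. This is a minimal sketch, assuming that R's built-in `iris` dataset holds the same values as the `iris.csv` file loaded earlier; the name `tabla_base` is illustrative and does not appear in the original analysis.

```r
# Assumption: the built-in iris data matches the CSV used in this report
x <- datasets::iris$Sepal.Length

# hist() with Sturges' rule returns class limits and counts without plotting
h <- hist(x, breaks = "Sturges", plot = FALSE)

# Rebuild the columns of the frequency table from the histogram object
tabla_base <- data.frame(
  Lower      = head(h$breaks, -1),                          # lower class limit
  Upper      = tail(h$breaks, -1),                          # upper class limit
  Main       = h$mids,                                      # class midpoint
  Frequency  = h$counts,                                    # absolute frequency
  Percentage = round(100 * h$counts / length(x), 1),        # relative frequency (%)
  CF         = cumsum(h$counts),                            # cumulative frequency
  CPF        = round(100 * cumsum(h$counts) / length(x), 1) # cumulative percentage
)
tabla_base
```

The cumulative columns are the ones most useful for reading the table: the last CF value must equal the number of observations.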
# Frequency table for the variable Sepal.Width
library(agricolae)
tabla_frecuencia_Sepal.Width <- table.freq(hist(iris$Sepal.Width,breaks = "Sturges",
plot = FALSE))
tabla_frecuencia_Sepal.Width
## Lower Upper Main Frequency Percentage CF CPF
## 1 2.0 2.2 2.1 4 2.7 4 2.7
## 2 2.2 2.4 2.3 7 4.7 11 7.3
## 3 2.4 2.6 2.5 13 8.7 24 16.0
## 4 2.6 2.8 2.7 23 15.3 47 31.3
## 5 2.8 3.0 2.9 36 24.0 83 55.3
## 6 3.0 3.2 3.1 24 16.0 107 71.3
## 7 3.2 3.4 3.3 18 12.0 125 83.3
## 8 3.4 3.6 3.5 10 6.7 135 90.0
## 9 3.6 3.8 3.7 9 6.0 144 96.0
## 10 3.8 4.0 3.9 3 2.0 147 98.0
## 11 4.0 4.2 4.1 2 1.3 149 99.3
## 12 4.2 4.4 4.3 1 0.7 150 100.0
# Frequency table for the variable Petal.Length
library(agricolae)
tabla_frecuencia_Petal.Length <- table.freq(hist(iris$Petal.Length,breaks = "Sturges",
plot = FALSE))
tabla_frecuencia_Petal.Length
## Lower Upper Main Frequency Percentage CF CPF
## 1 1.0 1.5 1.25 37 24.7 37 24.7
## 2 1.5 2.0 1.75 13 8.7 50 33.3
## 3 2.0 2.5 2.25 0 0.0 50 33.3
## 4 2.5 3.0 2.75 1 0.7 51 34.0
## 5 3.0 3.5 3.25 4 2.7 55 36.7
## 6 3.5 4.0 3.75 11 7.3 66 44.0
## 7 4.0 4.5 4.25 21 14.0 87 58.0
## 8 4.5 5.0 4.75 21 14.0 108 72.0
## 9 5.0 5.5 5.25 17 11.3 125 83.3
## 10 5.5 6.0 5.75 16 10.7 141 94.0
## 11 6.0 6.5 6.25 5 3.3 146 97.3
## 12 6.5 7.0 6.75 4 2.7 150 100.0
# Frequency table for the variable Petal.Width
library(agricolae)
tabla_Petal.Width <- table.freq(hist(iris$Petal.Width,breaks = "Sturges",
plot = FALSE))
tabla_Petal.Width
## Lower Upper Main Frequency Percentage CF CPF
## 1 0.0 0.2 0.1 34 22.7 34 22.7
## 2 0.2 0.4 0.3 14 9.3 48 32.0
## 3 0.4 0.6 0.5 2 1.3 50 33.3
## 4 0.6 0.8 0.7 0 0.0 50 33.3
## 5 0.8 1.0 0.9 7 4.7 57 38.0
## 6 1.0 1.2 1.1 8 5.3 65 43.3
## 7 1.2 1.4 1.3 21 14.0 86 57.3
## 8 1.4 1.6 1.5 16 10.7 102 68.0
## 9 1.6 1.8 1.7 14 9.3 116 77.3
## 10 1.8 2.0 1.9 11 7.3 127 84.7
## 11 2.0 2.2 2.1 9 6.0 136 90.7
## 12 2.2 2.4 2.3 11 7.3 147 98.0
## 13 2.4 2.6 2.5 3 2.0 150 100.0
# Load ggplot2 for data visualization
library(ggplot2)
# Bar chart of mean sepal width by species, one color per species
grafico_barras_Species_Sepal.Width <- ggplot(iris, aes(x = Species, y = Sepal.Width, fill = Species)) +
  # With stat = "identity" the bars of one species are drawn on top of each other,
  # so each species is summarized by its mean instead
  geom_bar(stat = "summary", fun = "mean") +
  labs(title = "Mean sepal width by species",
       x = "Species",
       y = "Sepal.Width",
       fill = "Species") +
  scale_fill_manual(values = c(
    "setosa" = "blue",
    "versicolor" = "green",
    "virginica" = "orange"
  )) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Display the bar chart
print(grafico_barras_Species_Sepal.Width)
# Histogram of sepal width, colored by species
histograma_Sepal.Width <- ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
  geom_bar(position = "identity", alpha = 0.7) + # geom_bar counts each distinct value, so no binwidth is needed
  labs(title = "Histogram of sepal width by species",
       x = "Sepal.Width",
       y = "Count", # geom_bar uses stat = "count" by default
       fill = "Species") +
  theme_minimal()
# Display the histogram
print(histograma_Sepal.Width)
# Box plot of sepal width by species
boxplot_Sepal.Width <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_boxplot() +
  labs(title = "Distribution of sepal width by species",
       x = "Species",
       y = "Sepal.Width") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Display the box plot
print(boxplot_Sepal.Width)
# Density plot of sepal width by species
density_plot <- ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
  geom_density(alpha = 0.5) +
  labs(title = "Density of sepal width by species",
       x = "Sepal.Width",
       y = "Density") +
  theme_minimal()
# Display the density plot
print(density_plot)
# Pie chart of species by sepal width (each slice is the sum of sepal widths of a species)
pie_chart_Species_Sepal.Width <- ggplot(iris, aes(x = "Species", y = Sepal.Width, fill = Species)) +
  geom_bar(stat = "identity", width = 1, color = "black") +
  coord_polar("y") +
  labs(title = "Distribution of species by sepal width",
       fill = "Species") +
  theme_minimal() +
  theme(legend.position = "bottom")
# Display the pie chart
print(pie_chart_Species_Sepal.Width)
# Example vector of sepal widths (illustrative values, not the iris column)
Sepal.Width <- c(1,9,7,6,3,4,6,3,4,9,4,8,8,3,8,7,4)
# Option 1: mean computed by hand
promedio <- sum(Sepal.Width)/length(Sepal.Width)
promedio
## [1] 5.529412
# Option 2: built-in function
mean(Sepal.Width)
## [1] 5.529412
median(Sepal.Width)
## [1] 6
# Option 1 (table): the mode is the value with the highest count
table(Sepal.Width)
## Sepal.Width
## 1 3 4 6 7 8 9
## 1 3 4 2 2 3 2
# Option 2: mfv() (most frequent value) from the modeest package
library(modeest)
## Warning: package 'modeest' was built under R version 4.3.2
##
## Attaching package: 'modeest'
## The following object is masked from 'package:agricolae':
##
## skewness
mfv(Sepal.Width)
## [1] 4
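The two options above can be combined into a small base R helper, which avoids loading modeest just for the mode. This is a minimal sketch: the function name `moda` is hypothetical, and it assumes a numeric input vector.

```r
# Illustrative helper (not part of any package): most frequent value(s) of a vector
moda <- function(x) {
  conteos <- table(x)                                 # frequency of each distinct value
  valores <- names(conteos)[conteos == max(conteos)]  # keep every value tied for the maximum
  as.numeric(valores)                                 # assumes x is numeric
}

# Same example vector as above
Sepal.Width <- c(1,9,7,6,3,4,6,3,4,9,4,8,8,3,8,7,4)
moda(Sepal.Width)
```

When several values are tied for the highest count, the helper returns all of them, so a bimodal vector yields two values.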
quantile(iris$Sepal.Width)
## 0% 25% 50% 75% 100%
## 2.0 2.8 3.0 3.3 4.4
quantile(iris$Sepal.Length)
## 0% 25% 50% 75% 100%
## 4.3 5.1 5.8 6.4 7.9
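The quartiles split the ordered data into four parts: roughly 25% of the flowers have a sepal length of at most 5.1 cm, half of them at most 5.8 cm, and 75% at most 6.4 cm. The distance between the first and third quartiles, the interquartile range, measures the spread of the central half of the data. The sketch below computes it, assuming the built-in `iris` dataset matches the CSV file; the name `ric` is illustrative.

```r
# Assumption: the built-in iris data matches the CSV used in this report
x <- datasets::iris$Sepal.Length

# First, second and third quartiles
q <- quantile(x, probs = c(0.25, 0.50, 0.75))
q

# Interquartile range: width of the interval holding the central 50% of the data
ric <- unname(q["75%"] - q["25%"])
ric

# Base R offers the same quantity directly
IQR(x)
```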
quantile(iris$Sepal.Length, probs = seq(0, 1, 0.1))
## 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
## 4.30 4.80 5.00 5.27 5.60 5.80 6.10 6.30 6.52 6.90 7.90
quantile(iris$Sepal.Length, probs = seq(0, 1, 0.01))
## 0% 1% 2% 3% 4% 5% 6% 7% 8% 9% 10% 11% 12%
## 4.300 4.400 4.400 4.547 4.600 4.600 4.694 4.743 4.800 4.800 4.800 4.900 4.900
## 13% 14% 15% 16% 17% 18% 19% 20% 21% 22% 23% 24% 25%
## 4.900 4.900 5.000 5.000 5.000 5.000 5.000 5.000 5.029 5.100 5.100 5.100 5.100
## 26% 27% 28% 29% 30% 31% 32% 33% 34% 35% 36% 37% 38%
## 5.100 5.123 5.200 5.200 5.270 5.400 5.400 5.400 5.400 5.500 5.500 5.500 5.500
## 39% 40% 41% 42% 43% 44% 45% 46% 47% 48% 49% 50% 51%
## 5.511 5.600 5.600 5.600 5.607 5.700 5.700 5.700 5.700 5.700 5.800 5.800 5.800
## 52% 53% 54% 55% 56% 57% 58% 59% 60% 61% 62% 63% 64%
## 5.800 5.800 5.900 5.900 6.000 6.000 6.000 6.000 6.100 6.100 6.100 6.100 6.200
## 65% 66% 67% 68% 69% 70% 71% 72% 73% 74% 75% 76% 77%
## 6.200 6.234 6.300 6.300 6.300 6.300 6.300 6.328 6.400 6.400 6.400 6.400 6.473
## 78% 79% 80% 81% 82% 83% 84% 85% 86% 87% 88% 89% 90%
## 6.500 6.500 6.520 6.600 6.700 6.700 6.700 6.700 6.700 6.763 6.800 6.861 6.900
## 91% 92% 93% 94% 95% 96% 97% 98% 99% 100%
## 6.900 7.008 7.157 7.200 7.255 7.408 7.653 7.700 7.700 7.900
library(fBasics)
## Warning: package 'fBasics' was built under R version 4.3.2
##
## Attaching package: 'fBasics'
## The following objects are masked from 'package:modeest':
##
## ghMode, ghtMode, gldMode, hypMode, nigMode, skewness
## The following objects are masked from 'package:agricolae':
##
## kurtosis, skewness
skewness(iris$Sepal.Length)
## [1] 0.3086407
## attr(,"method")
## [1] "moment"
hist(iris$Sepal.Length)
kurtosis(iris$Sepal.Length)
## [1] -0.6058125
## attr(,"method")
## [1] "excess"
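Because several loaded packages export a function called `skewness` (see the masking messages above), it can help to write the formulas out once. The sketch below computes moment-based skewness and excess kurtosis in base R; it assumes the built-in `iris` dataset matches the CSV file, and the variable names are illustrative. A positive skewness indicates a longer right tail, and a negative excess kurtosis indicates a flatter shape than the normal distribution.

```r
# Assumption: the built-in iris data matches the CSV used in this report
x <- datasets::iris$Sepal.Length
n <- length(x)
desvios <- x - mean(x)   # deviations from the mean

# Moment-based skewness: mean cubed deviation divided by sd^3
asimetria <- sum(desvios^3) / (n * sd(x)^3)

# Excess kurtosis: mean fourth-power deviation divided by var^2, minus 3
# (3 is the kurtosis of the normal distribution)
curtosis <- sum(desvios^4) / (n * var(x)^2) - 3

c(asimetria = asimetria, curtosis = curtosis)
```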
# Show the first rows and the structure of the data
head(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
str(iris)
## 'data.frame': 150 obs. of 5 variables:
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# Check for columns with missing values (an empty result means there are none)
which(colSums(is.na(iris))!= 0)
## named integer(0)
The same check can be carried out with dedicated libraries.
library(VIM)
library(mice)
resumen_missing <- aggr(iris, numbers = TRUE)
summary(resumen_missing)
##
## Missings per variable:
## Variable Count
## Sepal.Length 0
## Sepal.Width 0
## Petal.Length 0
## Petal.Width 0
## Species 0
##
## Missings in combinations of variables:
## Combinations Count Percent
## 0:0:0:0:0 150 100
To better identify patterns in the missing values, the following function can be used:
library(VIM)
matrixplot(iris)
Another representation:
# With the mice library
library(mice)
md.pattern(iris, rotate.names = TRUE)
## /\ /\
## { `---' }
## { O O }
## ==> V <== No need for mice. This data set is completely observed.
## \ \|/ /
## `-----'
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 150 1 1 1 1 1 0
## 0 0 0 0 0 0
The visdat library also visualizes missing values, ordering the columns by data type:
library(visdat)
vis_dat(iris)
To obtain the percentage of missing values per column:
vis_miss(iris)
In this case, rows with missing values are removed. Because iris has no missing values, all 150 rows are kept:
iris_corregido1 <- na.omit(iris)
str(iris_corregido1)
## 'data.frame': 150 obs. of 5 variables:
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# Check for columns with missing values
which(colSums(is.na(iris_corregido1))!= 0)
## named integer(0)
Imputation using measures of central tendency
library(DMwR2)
iris_corregido2 <- centralImputation(iris) # DMwR2: median (numeric), mode (non-numeric)
str(iris_corregido2)
## 'data.frame': 150 obs. of 5 variables:
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# Check for columns with missing values
which(colSums(is.na(iris_corregido2))!= 0)
## named integer(0)
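Since iris has no missing values, the imputation above returns the data unchanged. To see central imputation actually fill something in, the sketch below blanks out a few values in a copy of the data and imputes them in base R: the median for numeric columns and the mode for the rest. It assumes the built-in `iris` dataset matches the CSV file; `iris_na`, `imputar_central` and `iris_imputado` are hypothetical names, and the blanked positions are chosen at random purely for illustration.

```r
# Assumption: built-in iris, with a few values blanked out for illustration only
set.seed(123)
iris_na <- datasets::iris
iris_na$Sepal.Width[sample(nrow(iris_na), 10)] <- NA
iris_na$Species[sample(nrow(iris_na), 5)] <- NA

# Illustrative helper: median for numeric columns, mode for all others
imputar_central <- function(df) {
  for (col in names(df)) {
    faltantes <- is.na(df[[col]])
    if (!any(faltantes)) next                 # nothing to impute in this column
    if (is.numeric(df[[col]])) {
      df[[col]][faltantes] <- median(df[[col]], na.rm = TRUE)
    } else {
      conteos <- table(df[[col]])             # table() ignores NA by default
      df[[col]][faltantes] <- names(conteos)[which.max(conteos)]
    }
  }
  df
}

iris_imputado <- imputar_central(iris_na)
colSums(is.na(iris_na))        # missing values per column before imputation
colSums(is.na(iris_imputado))  # missing values per column after imputation
```

Central imputation is simple but shrinks the variability of the imputed column, which is one reason to compare it against neighbor-based methods such as kNN.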
library(VIM)
iris_corregido3 <- initialise(iris, method = "median") # mean (continuous), median (discrete), mode (non-numeric)
str(iris_corregido3)
## 'data.frame': 150 obs. of 5 variables:
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# Check for columns with missing values
which(colSums(is.na(iris_corregido3))!= 0)
## named integer(0)
library(DMwR2)
iris_corregido4<-knnImputation(iris, k=10)
str(iris_corregido4)
## 'data.frame': 150 obs. of 5 variables:
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# Check for columns with missing values
which(colSums(is.na(iris_corregido4))!= 0)
## named integer(0)
The analysis is performed only for quantitative variables.
# Box-and-whisker plot
boxplot(iris$Sepal.Length)
According to the results, sepal length has no outliers.
Checking for outliers in the variable Sepal.Width:
boxplot(iris$Sepal.Width)
For all variables:
boxplot(iris)
For Petal.Length:
boxplot(iris$Petal.Length)
The box plots above show outliers for Sepal.Width. The steps below identify outliers numerically with the 1.5 × IQR rule, applied here to Petal.Length, and then apply the correction strategy.
# Compute the interquartile range (RIC = Q3 - Q1)
q1 <- quantile(iris$Petal.Length, 0.25)
q3 <- quantile(iris$Petal.Length, 0.75)
RIC <- q3-q1
RIC
## 75%
## 3.5
# Limits or whiskers (upper and lower)
bigote_inferior <- q1-1.5*RIC
bigote_inferior
## 25%
## -3.65
bigote_superior <- q3+1.5*RIC
bigote_superior
## 75%
## 10.35
# Identify the outliers
outliers_det <- iris$Petal.Length[iris$Petal.Length < bigote_inferior | iris$Petal.Length > bigote_superior]
outliers_det
## numeric(0)
iris_sin_atipicos <- iris[!iris$Petal.Length %in% outliers_det,]
iris_sin_atipicos
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## 7 4.6 3.4 1.4 0.3 setosa
## 8 5.0 3.4 1.5 0.2 setosa
## 9 4.4 2.9 1.4 0.2 setosa
## 10 4.9 3.1 1.5 0.1 setosa
## 11 5.4 3.7 1.5 0.2 setosa
## 12 4.8 3.4 1.6 0.2 setosa
## 13 4.8 3.0 1.4 0.1 setosa
## 14 4.3 3.0 1.1 0.1 setosa
## 15 5.8 4.0 1.2 0.2 setosa
## 16 5.7 4.4 1.5 0.4 setosa
## 17 5.4 3.9 1.3 0.4 setosa
## 18 5.1 3.5 1.4 0.3 setosa
## 19 5.7 3.8 1.7 0.3 setosa
## 20 5.1 3.8 1.5 0.3 setosa
## 21 5.4 3.4 1.7 0.2 setosa
## 22 5.1 3.7 1.5 0.4 setosa
## 23 4.6 3.6 1.0 0.2 setosa
## 24 5.1 3.3 1.7 0.5 setosa
## 25 4.8 3.4 1.9 0.2 setosa
## 26 5.0 3.0 1.6 0.2 setosa
## 27 5.0 3.4 1.6 0.4 setosa
## 28 5.2 3.5 1.5 0.2 setosa
## 29 5.2 3.4 1.4 0.2 setosa
## 30 4.7 3.2 1.6 0.2 setosa
## 31 4.8 3.1 1.6 0.2 setosa
## 32 5.4 3.4 1.5 0.4 setosa
## 33 5.2 4.1 1.5 0.1 setosa
## 34 5.5 4.2 1.4 0.2 setosa
## 35 4.9 3.1 1.5 0.2 setosa
## 36 5.0 3.2 1.2 0.2 setosa
## 37 5.5 3.5 1.3 0.2 setosa
## 38 4.9 3.6 1.4 0.1 setosa
## 39 4.4 3.0 1.3 0.2 setosa
## 40 5.1 3.4 1.5 0.2 setosa
## 41 5.0 3.5 1.3 0.3 setosa
## 42 4.5 2.3 1.3 0.3 setosa
## 43 4.4 3.2 1.3 0.2 setosa
## 44 5.0 3.5 1.6 0.6 setosa
## 45 5.1 3.8 1.9 0.4 setosa
## 46 4.8 3.0 1.4 0.3 setosa
## 47 5.1 3.8 1.6 0.2 setosa
## 48 4.6 3.2 1.4 0.2 setosa
## 49 5.3 3.7 1.5 0.2 setosa
## 50 5.0 3.3 1.4 0.2 setosa
## 51 7.0 3.2 4.7 1.4 versicolor
## 52 6.4 3.2 4.5 1.5 versicolor
## 53 6.9 3.1 4.9 1.5 versicolor
## 54 5.5 2.3 4.0 1.3 versicolor
## 55 6.5 2.8 4.6 1.5 versicolor
## 56 5.7 2.8 4.5 1.3 versicolor
## 57 6.3 3.3 4.7 1.6 versicolor
## 58 4.9 2.4 3.3 1.0 versicolor
## 59 6.6 2.9 4.6 1.3 versicolor
## 60 5.2 2.7 3.9 1.4 versicolor
## 61 5.0 2.0 3.5 1.0 versicolor
## 62 5.9 3.0 4.2 1.5 versicolor
## 63 6.0 2.2 4.0 1.0 versicolor
## 64 6.1 2.9 4.7 1.4 versicolor
## 65 5.6 2.9 3.6 1.3 versicolor
## 66 6.7 3.1 4.4 1.4 versicolor
## 67 5.6 3.0 4.5 1.5 versicolor
## 68 5.8 2.7 4.1 1.0 versicolor
## 69 6.2 2.2 4.5 1.5 versicolor
## 70 5.6 2.5 3.9 1.1 versicolor
## 71 5.9 3.2 4.8 1.8 versicolor
## 72 6.1 2.8 4.0 1.3 versicolor
## 73 6.3 2.5 4.9 1.5 versicolor
## 74 6.1 2.8 4.7 1.2 versicolor
## 75 6.4 2.9 4.3 1.3 versicolor
## 76 6.6 3.0 4.4 1.4 versicolor
## 77 6.8 2.8 4.8 1.4 versicolor
## 78 6.7 3.0 5.0 1.7 versicolor
## 79 6.0 2.9 4.5 1.5 versicolor
## 80 5.7 2.6 3.5 1.0 versicolor
## 81 5.5 2.4 3.8 1.1 versicolor
## 82 5.5 2.4 3.7 1.0 versicolor
## 83 5.8 2.7 3.9 1.2 versicolor
## 84 6.0 2.7 5.1 1.6 versicolor
## 85 5.4 3.0 4.5 1.5 versicolor
## 86 6.0 3.4 4.5 1.6 versicolor
## 87 6.7 3.1 4.7 1.5 versicolor
## 88 6.3 2.3 4.4 1.3 versicolor
## 89 5.6 3.0 4.1 1.3 versicolor
## 90 5.5 2.5 4.0 1.3 versicolor
## 91 5.5 2.6 4.4 1.2 versicolor
## 92 6.1 3.0 4.6 1.4 versicolor
## 93 5.8 2.6 4.0 1.2 versicolor
## 94 5.0 2.3 3.3 1.0 versicolor
## 95 5.6 2.7 4.2 1.3 versicolor
## 96 5.7 3.0 4.2 1.2 versicolor
## 97 5.7 2.9 4.2 1.3 versicolor
## 98 6.2 2.9 4.3 1.3 versicolor
## 99 5.1 2.5 3.0 1.1 versicolor
## 100 5.7 2.8 4.1 1.3 versicolor
## 101 6.3 3.3 6.0 2.5 virginica
## 102 5.8 2.7 5.1 1.9 virginica
## 103 7.1 3.0 5.9 2.1 virginica
## 104 6.3 2.9 5.6 1.8 virginica
## 105 6.5 3.0 5.8 2.2 virginica
## 106 7.6 3.0 6.6 2.1 virginica
## 107 4.9 2.5 4.5 1.7 virginica
## 108 7.3 2.9 6.3 1.8 virginica
## 109 6.7 2.5 5.8 1.8 virginica
## 110 7.2 3.6 6.1 2.5 virginica
## 111 6.5 3.2 5.1 2.0 virginica
## 112 6.4 2.7 5.3 1.9 virginica
## 113 6.8 3.0 5.5 2.1 virginica
## 114 5.7 2.5 5.0 2.0 virginica
## 115 5.8 2.8 5.1 2.4 virginica
## 116 6.4 3.2 5.3 2.3 virginica
## 117 6.5 3.0 5.5 1.8 virginica
## 118 7.7 3.8 6.7 2.2 virginica
## 119 7.7 2.6 6.9 2.3 virginica
## 120 6.0 2.2 5.0 1.5 virginica
## 121 6.9 3.2 5.7 2.3 virginica
## 122 5.6 2.8 4.9 2.0 virginica
## 123 7.7 2.8 6.7 2.0 virginica
## 124 6.3 2.7 4.9 1.8 virginica
## 125 6.7 3.3 5.7 2.1 virginica
## 126 7.2 3.2 6.0 1.8 virginica
## 127 6.2 2.8 4.8 1.8 virginica
## 128 6.1 3.0 4.9 1.8 virginica
## 129 6.4 2.8 5.6 2.1 virginica
## 130 7.2 3.0 5.8 1.6 virginica
## 131 7.4 2.8 6.1 1.9 virginica
## 132 7.9 3.8 6.4 2.0 virginica
## 133 6.4 2.8 5.6 2.2 virginica
## 134 6.3 2.8 5.1 1.5 virginica
## 135 6.1 2.6 5.6 1.4 virginica
## 136 7.7 3.0 6.1 2.3 virginica
## 137 6.3 3.4 5.6 2.4 virginica
## 138 6.4 3.1 5.5 1.8 virginica
## 139 6.0 3.0 4.8 1.8 virginica
## 140 6.9 3.1 5.4 2.1 virginica
## 141 6.7 3.1 5.6 2.4 virginica
## 142 6.9 3.1 5.1 2.3 virginica
## 143 5.8 2.7 5.1 1.9 virginica
## 144 6.8 3.2 5.9 2.3 virginica
## 145 6.7 3.3 5.7 2.5 virginica
## 146 6.7 3.0 5.2 2.3 virginica
## 147 6.3 2.5 5.0 1.9 virginica
## 148 6.5 3.0 5.2 2.0 virginica
## 149 6.2 3.4 5.4 2.3 virginica
## 150 5.9 3.0 5.1 1.8 virginica
To confirm, we draw a box plot with the new data:
boxplot(iris_sin_atipicos$Petal.Length)
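Petal.Length has no values outside the whiskers, so the filtered data above is identical to the original. The same steps can be applied to Sepal.Width, the variable whose box plot does show outliers. This is a minimal sketch, assuming the built-in `iris` dataset matches the CSV file; the names `limite_inferior`, `limite_superior`, `atipicos` and `iris_filtrado` are illustrative.

```r
# Assumption: the built-in iris data matches the CSV used in this report
x <- datasets::iris$Sepal.Width

# Interquartile range and the 1.5 * IQR fences
q1 <- quantile(x, 0.25)
q3 <- quantile(x, 0.75)
ric <- q3 - q1
limite_inferior <- unname(q1 - 1.5 * ric)
limite_superior <- unname(q3 + 1.5 * ric)

# Values outside the fences are flagged as outliers
atipicos <- x[x < limite_inferior | x > limite_superior]
atipicos

# Keep only the rows whose Sepal.Width lies within the fences
iris_filtrado <- datasets::iris[x >= limite_inferior & x <= limite_superior, ]
nrow(iris_filtrado)
```

Filtering on the fences rather than on the list of outlier values keeps the rule explicit and works the same way when no outliers are found.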
# Original
hist(iris$Petal.Length, 12)
To take the square root, simply use the sqrt function:
sqrt(iris$Petal.Length)
## [1] 1.183216 1.183216 1.140175 1.224745 1.183216 1.303840 1.183216 1.224745
## [9] 1.183216 1.224745 1.224745 1.264911 1.183216 1.048809 1.095445 1.224745
## [17] 1.140175 1.183216 1.303840 1.224745 1.303840 1.224745 1.000000 1.303840
## [25] 1.378405 1.264911 1.264911 1.224745 1.183216 1.264911 1.264911 1.224745
## [33] 1.224745 1.183216 1.224745 1.095445 1.140175 1.183216 1.140175 1.224745
## [41] 1.140175 1.140175 1.140175 1.264911 1.378405 1.183216 1.264911 1.183216
## [49] 1.224745 1.183216 2.167948 2.121320 2.213594 2.000000 2.144761 2.121320
## [57] 2.167948 1.816590 2.144761 1.974842 1.870829 2.049390 2.000000 2.167948
## [65] 1.897367 2.097618 2.121320 2.024846 2.121320 1.974842 2.190890 2.000000
## [73] 2.213594 2.167948 2.073644 2.097618 2.190890 2.236068 2.121320 1.870829
## [81] 1.949359 1.923538 1.974842 2.258318 2.121320 2.121320 2.167948 2.097618
## [89] 2.024846 2.000000 2.097618 2.144761 2.000000 1.816590 2.049390 2.049390
## [97] 2.049390 2.073644 1.732051 2.024846 2.449490 2.258318 2.428992 2.366432
## [105] 2.408319 2.569047 2.121320 2.509980 2.408319 2.469818 2.258318 2.302173
## [113] 2.345208 2.236068 2.258318 2.302173 2.345208 2.588436 2.626785 2.236068
## [121] 2.387467 2.213594 2.588436 2.213594 2.387467 2.449490 2.190890 2.213594
## [129] 2.366432 2.408319 2.469818 2.529822 2.366432 2.258318 2.366432 2.469818
## [137] 2.366432 2.345208 2.190890 2.323790 2.366432 2.258318 2.258318 2.428992
## [145] 2.387467 2.280351 2.236068 2.280351 2.323790 2.258318
Graphically:
hist(sqrt(iris$Petal.Length))
exp(iris$Petal.Length)
## [1] 4.055200 4.055200 3.669297 4.481689 4.055200 5.473947
## [7] 4.055200 4.481689 4.055200 4.481689 4.481689 4.953032
## [13] 4.055200 3.004166 3.320117 4.481689 3.669297 4.055200
## [19] 5.473947 4.481689 5.473947 4.481689 2.718282 5.473947
## [25] 6.685894 4.953032 4.953032 4.481689 4.055200 4.953032
## [31] 4.953032 4.481689 4.481689 4.055200 4.481689 3.320117
## [37] 3.669297 4.055200 3.669297 4.481689 3.669297 3.669297
## [43] 3.669297 4.953032 6.685894 4.055200 4.953032 4.055200
## [49] 4.481689 4.055200 109.947172 90.017131 134.289780 54.598150
## [55] 99.484316 90.017131 109.947172 27.112639 99.484316 49.402449
## [61] 33.115452 66.686331 54.598150 109.947172 36.598234 81.450869
## [67] 90.017131 60.340288 90.017131 49.402449 121.510418 54.598150
## [73] 134.289780 109.947172 73.699794 81.450869 121.510418 148.413159
## [79] 90.017131 33.115452 44.701184 40.447304 49.402449 164.021907
## [85] 90.017131 90.017131 109.947172 81.450869 60.340288 54.598150
## [91] 81.450869 99.484316 54.598150 27.112639 66.686331 66.686331
## [97] 66.686331 73.699794 20.085537 60.340288 403.428793 164.021907
## [103] 365.037468 270.426407 330.299560 735.095189 90.017131 544.571910
## [109] 330.299560 445.857770 164.021907 200.336810 244.691932 148.413159
## [115] 164.021907 200.336810 244.691932 812.405825 992.274716 148.413159
## [121] 298.867401 134.289780 812.405825 134.289780 298.867401 403.428793
## [127] 121.510418 134.289780 270.426407 330.299560 445.857770 601.845038
## [133] 270.426407 164.021907 270.426407 445.857770 270.426407 244.691932
## [139] 121.510418 221.406416 270.426407 164.021907 164.021907 365.037468
## [145] 298.867401 181.272242 148.413159 181.272242 221.406416 164.021907
To view it graphically:
hist(exp(iris$Petal.Length))
Alternative form: store the transformed values first
Petal.Length_exp<- exp(iris$Petal.Length)
hist(Petal.Length_exp)
log(iris$Petal.Length)
## [1] 0.33647224 0.33647224 0.26236426 0.40546511 0.33647224 0.53062825
## [7] 0.33647224 0.40546511 0.33647224 0.40546511 0.40546511 0.47000363
## [13] 0.33647224 0.09531018 0.18232156 0.40546511 0.26236426 0.33647224
## [19] 0.53062825 0.40546511 0.53062825 0.40546511 0.00000000 0.53062825
## [25] 0.64185389 0.47000363 0.47000363 0.40546511 0.33647224 0.47000363
## [31] 0.47000363 0.40546511 0.40546511 0.33647224 0.40546511 0.18232156
## [37] 0.26236426 0.33647224 0.26236426 0.40546511 0.26236426 0.26236426
## [43] 0.26236426 0.47000363 0.64185389 0.33647224 0.47000363 0.33647224
## [49] 0.40546511 0.33647224 1.54756251 1.50407740 1.58923521 1.38629436
## [55] 1.52605630 1.50407740 1.54756251 1.19392247 1.52605630 1.36097655
## [61] 1.25276297 1.43508453 1.38629436 1.54756251 1.28093385 1.48160454
## [67] 1.50407740 1.41098697 1.50407740 1.36097655 1.56861592 1.38629436
## [73] 1.58923521 1.54756251 1.45861502 1.48160454 1.56861592 1.60943791
## [79] 1.50407740 1.25276297 1.33500107 1.30833282 1.36097655 1.62924054
## [85] 1.50407740 1.50407740 1.54756251 1.48160454 1.41098697 1.38629436
## [91] 1.48160454 1.52605630 1.38629436 1.19392247 1.43508453 1.43508453
## [97] 1.43508453 1.45861502 1.09861229 1.41098697 1.79175947 1.62924054
## [103] 1.77495235 1.72276660 1.75785792 1.88706965 1.50407740 1.84054963
## [109] 1.75785792 1.80828877 1.62924054 1.66770682 1.70474809 1.60943791
## [115] 1.62924054 1.66770682 1.70474809 1.90210753 1.93152141 1.60943791
## [121] 1.74046617 1.58923521 1.90210753 1.58923521 1.74046617 1.79175947
## [127] 1.56861592 1.58923521 1.72276660 1.75785792 1.80828877 1.85629799
## [133] 1.72276660 1.62924054 1.72276660 1.80828877 1.72276660 1.70474809
## [139] 1.56861592 1.68639895 1.72276660 1.62924054 1.62924054 1.77495235
## [145] 1.74046617 1.64865863 1.60943791 1.64865863 1.68639895 1.62924054
Graphically:
hist(log(iris$Petal.Length))
Changing to base 2:
log(iris$Petal.Length, base=2)
## [1] 0.4854268 0.4854268 0.3785116 0.5849625 0.4854268 0.7655347 0.4854268
## [8] 0.5849625 0.4854268 0.5849625 0.5849625 0.6780719 0.4854268 0.1375035
## [15] 0.2630344 0.5849625 0.3785116 0.4854268 0.7655347 0.5849625 0.7655347
## [22] 0.5849625 0.0000000 0.7655347 0.9259994 0.6780719 0.6780719 0.5849625
## [29] 0.4854268 0.6780719 0.6780719 0.5849625 0.5849625 0.4854268 0.5849625
## [36] 0.2630344 0.3785116 0.4854268 0.3785116 0.5849625 0.3785116 0.3785116
## [43] 0.3785116 0.6780719 0.9259994 0.4854268 0.6780719 0.4854268 0.5849625
## [50] 0.4854268 2.2326608 2.1699250 2.2927817 2.0000000 2.2016339 2.1699250
## [57] 2.2326608 1.7224660 2.2016339 1.9634741 1.8073549 2.0703893 2.0000000
## [64] 2.2326608 1.8479969 2.1375035 2.1699250 2.0356239 2.1699250 1.9634741
## [71] 2.2630344 2.0000000 2.2927817 2.2326608 2.1043367 2.1375035 2.2630344
## [78] 2.3219281 2.1699250 1.8073549 1.9259994 1.8875253 1.9634741 2.3504972
## [85] 2.1699250 2.1699250 2.2326608 2.1375035 2.0356239 2.0000000 2.1375035
## [92] 2.2016339 2.0000000 1.7224660 2.0703893 2.0703893 2.0703893 2.1043367
## [99] 1.5849625 2.0356239 2.5849625 2.3504972 2.5607150 2.4854268 2.5360529
## [106] 2.7224660 2.1699250 2.6553518 2.5360529 2.6088092 2.3504972 2.4059924
## [113] 2.4594316 2.3219281 2.3504972 2.4059924 2.4594316 2.7441611 2.7865964
## [120] 2.3219281 2.5109619 2.2927817 2.7441611 2.2927817 2.5109619 2.5849625
## [127] 2.2630344 2.2927817 2.4854268 2.5360529 2.6088092 2.6780719 2.4854268
## [134] 2.3504972 2.4854268 2.6088092 2.4854268 2.4594316 2.2630344 2.4329594
## [141] 2.4854268 2.3504972 2.3504972 2.5607150 2.5109619 2.3785116 2.3219281
## [148] 2.3785116 2.4329594 2.3504972
Graphically:
hist(log(iris$Petal.Length, base=2))
# Store each transformation in its own variable
Petal.Length_sqrt <- sqrt(iris$Petal.Length)
Petal.Length_exp <- exp(iris$Petal.Length)
Petal.Length_ln <- log(iris$Petal.Length)
Petal.Length_log2 <- log(iris$Petal.Length, base=2)
Petal.Length_log5 <- log(iris$Petal.Length, base=5)
View each one graphically:
par(mfrow=c(3,2))
hist(iris$Petal.Length)
hist(Petal.Length_sqrt)
hist(Petal.Length_exp)
hist(Petal.Length_ln)
hist(Petal.Length_log2)
hist(Petal.Length_log5)
par(mfrow=c(1,1))
The view of the distribution can be improved with a density plot:
par(mfrow=c(3,2))
plot(density(iris$Petal.Length), main = "Distribution of Petal.Length - original")
plot(density(Petal.Length_sqrt), main = "Distribution of Petal.Length - sqrt transform")
plot(density(Petal.Length_exp), main = "Distribution of Petal.Length - exp transform")
plot(density(Petal.Length_ln), main = "Distribution of Petal.Length - ln transform")
plot(density(Petal.Length_log2), main = "Distribution of Petal.Length - log2 transform")
plot(density(Petal.Length_log5), main = "Distribution of Petal.Length - log5 transform")
par(mfrow=c(1,1))
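The plots can be complemented with a single number per transformation. The sketch below compares the moment-based skewness of Petal.Length before and after each transformation; it assumes the built-in `iris` dataset matches the CSV file, and the helper `asimetria` and the list `transformaciones` are illustrative names.

```r
# Assumption: the built-in iris data matches the CSV used in this report
x <- datasets::iris$Petal.Length

# Moment-based skewness, written out so that no extra package is needed
asimetria <- function(v) sum((v - mean(v))^3) / (length(v) * sd(v)^3)

# The same transformations that are plotted above
transformaciones <- list(
  original = x,
  sqrt     = sqrt(x),
  exp      = exp(x),
  ln       = log(x),
  log2     = log(x, base = 2),
  log5     = log(x, base = 5)
)

# One skewness value per transformation
resultado <- sapply(transformaciones, asimetria)
round(resultado, 3)
```

The three logarithms differ only by a constant factor, so they produce the same shape and the same skewness; changing the base rescales the axis without changing the distribution.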
Overall plot
# Convert the selected columns to numeric; note that Species (a factor) becomes the integer codes 1, 2, 3
iris[, 1:5] <- sapply(iris[, 1:5], as.numeric)
# Check the result of the conversion
print(sapply(iris[, 1:5], class))
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## "numeric" "numeric" "numeric" "numeric" "numeric"
# Compute and plot the pairwise correlations
library(PerformanceAnalytics)
# chart.Correlation() expects the raw data, not a correlation matrix
chart.Correlation(iris[, 1:5], histogram = TRUE)
head(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 1
## 2 4.9 3.0 1.4 0.2 1
## 3 4.7 3.2 1.3 0.2 1
## 4 4.6 3.1 1.5 0.2 1
## 5 5.0 3.6 1.4 0.2 1
## 6 5.4 3.9 1.7 0.4 1
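After the conversion, Species holds the codes 1, 2 and 3, which impose an order on the species that the data do not have, so its correlations should be read with care. A correlation matrix restricted to the four measurements avoids the issue. This is a minimal sketch, assuming the built-in `iris` dataset matches the CSV file; `datos` and `matriz_cor` are illustrative names.

```r
# Assumption: the built-in iris data matches the CSV used in this report
datos <- datasets::iris[, 1:4]   # the four numeric measurements only

# Pearson correlation matrix (the default method of cor())
matriz_cor <- cor(datos)
round(matriz_cor, 2)

# Strongest pairwise relationship among the measurements
matriz_cor["Petal.Length", "Petal.Width"]
```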
We apply Z standardization to the Petal.Length variable manually: subtract the mean and divide by the standard deviation.
iris$Petal.Length
## [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4
## [19] 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2
## [37] 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0
## [55] 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0
## [73] 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0
## [91] 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3
## [109] 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0
## [127] 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9
## [145] 5.7 5.2 5.0 5.2 5.4 5.1
media_Petal.Length <- mean(iris$Petal.Length)
media_Petal.Length
## [1] 3.758
desv_est <- sd(iris$Petal.Length)
desv_est
## [1] 1.765298
Petal.Length_estandar <- (iris$Petal.Length-media_Petal.Length)/desv_est
Petal.Length_estandar
## [1] -1.33575163 -1.33575163 -1.39239929 -1.27910398 -1.33575163 -1.16580868
## [7] -1.33575163 -1.27910398 -1.33575163 -1.27910398 -1.27910398 -1.22245633
## [13] -1.33575163 -1.50569459 -1.44904694 -1.27910398 -1.39239929 -1.33575163
## [19] -1.16580868 -1.27910398 -1.16580868 -1.27910398 -1.56234224 -1.16580868
## [25] -1.05251337 -1.22245633 -1.22245633 -1.27910398 -1.33575163 -1.22245633
## [31] -1.22245633 -1.27910398 -1.27910398 -1.33575163 -1.27910398 -1.44904694
## [37] -1.39239929 -1.33575163 -1.39239929 -1.27910398 -1.39239929 -1.39239929
## [43] -1.39239929 -1.22245633 -1.05251337 -1.33575163 -1.22245633 -1.33575163
## [49] -1.27910398 -1.33575163 0.53362088 0.42032558 0.64691619 0.13708732
## [55] 0.47697323 0.42032558 0.53362088 -0.25944625 0.47697323 0.08043967
## [61] -0.14615094 0.25038262 0.13708732 0.53362088 -0.08950329 0.36367793
## [67] 0.42032558 0.19373497 0.42032558 0.08043967 0.59026853 0.13708732
## [73] 0.64691619 0.53362088 0.30703027 0.36367793 0.59026853 0.70356384
## [79] 0.42032558 -0.14615094 0.02379201 -0.03285564 0.08043967 0.76021149
## [85] 0.42032558 0.42032558 0.53362088 0.36367793 0.19373497 0.13708732
## [91] 0.36367793 0.47697323 0.13708732 -0.25944625 0.25038262 0.25038262
## [97] 0.25038262 0.30703027 -0.42938920 0.19373497 1.27004036 0.76021149
## [103] 1.21339271 1.04344975 1.15674505 1.60992627 0.42032558 1.43998331
## [109] 1.15674505 1.32668801 0.76021149 0.87350679 0.98680210 0.70356384
## [115] 0.76021149 0.87350679 0.98680210 1.66657392 1.77986923 0.70356384
## [121] 1.10009740 0.64691619 1.66657392 0.64691619 1.10009740 1.27004036
## [127] 0.59026853 0.64691619 1.04344975 1.15674505 1.32668801 1.49663097
## [133] 1.04344975 0.76021149 1.04344975 1.32668801 1.04344975 0.98680210
## [139] 0.59026853 0.93015445 1.04344975 0.76021149 0.76021149 1.21339271
## [145] 1.10009740 0.81685914 0.70356384 0.81685914 0.93015445 0.76021149
Petal.Length_estandar2 <- (iris$Petal.Length-mean(iris$Petal.Length))/sd(iris$Petal.Length)
Petal.Length_estandar2
## [1] -1.33575163 -1.33575163 -1.39239929 -1.27910398 -1.33575163 -1.16580868
## [7] -1.33575163 -1.27910398 -1.33575163 -1.27910398 -1.27910398 -1.22245633
## [13] -1.33575163 -1.50569459 -1.44904694 -1.27910398 -1.39239929 -1.33575163
## [19] -1.16580868 -1.27910398 -1.16580868 -1.27910398 -1.56234224 -1.16580868
## [25] -1.05251337 -1.22245633 -1.22245633 -1.27910398 -1.33575163 -1.22245633
## [31] -1.22245633 -1.27910398 -1.27910398 -1.33575163 -1.27910398 -1.44904694
## [37] -1.39239929 -1.33575163 -1.39239929 -1.27910398 -1.39239929 -1.39239929
## [43] -1.39239929 -1.22245633 -1.05251337 -1.33575163 -1.22245633 -1.33575163
## [49] -1.27910398 -1.33575163 0.53362088 0.42032558 0.64691619 0.13708732
## [55] 0.47697323 0.42032558 0.53362088 -0.25944625 0.47697323 0.08043967
## [61] -0.14615094 0.25038262 0.13708732 0.53362088 -0.08950329 0.36367793
## [67] 0.42032558 0.19373497 0.42032558 0.08043967 0.59026853 0.13708732
## [73] 0.64691619 0.53362088 0.30703027 0.36367793 0.59026853 0.70356384
## [79] 0.42032558 -0.14615094 0.02379201 -0.03285564 0.08043967 0.76021149
## [85] 0.42032558 0.42032558 0.53362088 0.36367793 0.19373497 0.13708732
## [91] 0.36367793 0.47697323 0.13708732 -0.25944625 0.25038262 0.25038262
## [97] 0.25038262 0.30703027 -0.42938920 0.19373497 1.27004036 0.76021149
## [103] 1.21339271 1.04344975 1.15674505 1.60992627 0.42032558 1.43998331
## [109] 1.15674505 1.32668801 0.76021149 0.87350679 0.98680210 0.70356384
## [115] 0.76021149 0.87350679 0.98680210 1.66657392 1.77986923 0.70356384
## [121] 1.10009740 0.64691619 1.66657392 0.64691619 1.10009740 1.27004036
## [127] 0.59026853 0.64691619 1.04344975 1.15674505 1.32668801 1.49663097
## [133] 1.04344975 0.76021149 1.04344975 1.32668801 1.04344975 0.98680210
## [139] 0.59026853 0.93015445 1.04344975 0.76021149 0.76021149 1.21339271
## [145] 1.10009740 0.81685914 0.70356384 0.81685914 0.93015445 0.76021149
R has several functions for standardizing; the classic one is scale().
# scale() function
Petal.Length_estandar3 <- scale(iris$Petal.Length)
Petal.Length_estandar3
## [,1]
## [1,] -1.33575163
## [2,] -1.33575163
## [3,] -1.39239929
## [4,] -1.27910398
## [5,] -1.33575163
## [6,] -1.16580868
## [7,] -1.33575163
## [8,] -1.27910398
## [9,] -1.33575163
## [10,] -1.27910398
## [11,] -1.27910398
## [12,] -1.22245633
## [13,] -1.33575163
## [14,] -1.50569459
## [15,] -1.44904694
## [16,] -1.27910398
## [17,] -1.39239929
## [18,] -1.33575163
## [19,] -1.16580868
## [20,] -1.27910398
## [21,] -1.16580868
## [22,] -1.27910398
## [23,] -1.56234224
## [24,] -1.16580868
## [25,] -1.05251337
## [26,] -1.22245633
## [27,] -1.22245633
## [28,] -1.27910398
## [29,] -1.33575163
## [30,] -1.22245633
## [31,] -1.22245633
## [32,] -1.27910398
## [33,] -1.27910398
## [34,] -1.33575163
## [35,] -1.27910398
## [36,] -1.44904694
## [37,] -1.39239929
## [38,] -1.33575163
## [39,] -1.39239929
## [40,] -1.27910398
## [41,] -1.39239929
## [42,] -1.39239929
## [43,] -1.39239929
## [44,] -1.22245633
## [45,] -1.05251337
## [46,] -1.33575163
## [47,] -1.22245633
## [48,] -1.33575163
## [49,] -1.27910398
## [50,] -1.33575163
## [51,] 0.53362088
## [52,] 0.42032558
## [53,] 0.64691619
## [54,] 0.13708732
## [55,] 0.47697323
## [56,] 0.42032558
## [57,] 0.53362088
## [58,] -0.25944625
## [59,] 0.47697323
## [60,] 0.08043967
## [61,] -0.14615094
## [62,] 0.25038262
## [63,] 0.13708732
## [64,] 0.53362088
## [65,] -0.08950329
## [66,] 0.36367793
## [67,] 0.42032558
## [68,] 0.19373497
## [69,] 0.42032558
## [70,] 0.08043967
## [71,] 0.59026853
## [72,] 0.13708732
## [73,] 0.64691619
## [74,] 0.53362088
## [75,] 0.30703027
## [76,] 0.36367793
## [77,] 0.59026853
## [78,] 0.70356384
## [79,] 0.42032558
## [80,] -0.14615094
## [81,] 0.02379201
## [82,] -0.03285564
## [83,] 0.08043967
## [84,] 0.76021149
## [85,] 0.42032558
## [86,] 0.42032558
## [87,] 0.53362088
## [88,] 0.36367793
## [89,] 0.19373497
## [90,] 0.13708732
## [91,] 0.36367793
## [92,] 0.47697323
## [93,] 0.13708732
## [94,] -0.25944625
## [95,] 0.25038262
## [96,] 0.25038262
## [97,] 0.25038262
## [98,] 0.30703027
## [99,] -0.42938920
## [100,] 0.19373497
## [101,] 1.27004036
## [102,] 0.76021149
## [103,] 1.21339271
## [104,] 1.04344975
## [105,] 1.15674505
## [106,] 1.60992627
## [107,] 0.42032558
## [108,] 1.43998331
## [109,] 1.15674505
## [110,] 1.32668801
## [111,] 0.76021149
## [112,] 0.87350679
## [113,] 0.98680210
## [114,] 0.70356384
## [115,] 0.76021149
## [116,] 0.87350679
## [117,] 0.98680210
## [118,] 1.66657392
## [119,] 1.77986923
## [120,] 0.70356384
## [121,] 1.10009740
## [122,] 0.64691619
## [123,] 1.66657392
## [124,] 0.64691619
## [125,] 1.10009740
## [126,] 1.27004036
## [127,] 0.59026853
## [128,] 0.64691619
## [129,] 1.04344975
## [130,] 1.15674505
## [131,] 1.32668801
## [132,] 1.49663097
## [133,] 1.04344975
## [134,] 0.76021149
## [135,] 1.04344975
## [136,] 1.32668801
## [137,] 1.04344975
## [138,] 0.98680210
## [139,] 0.59026853
## [140,] 0.93015445
## [141,] 1.04344975
## [142,] 0.76021149
## [143,] 0.76021149
## [144,] 1.21339271
## [145,] 1.10009740
## [146,] 0.81685914
## [147,] 0.70356384
## [148,] 0.81685914
## [149,] 0.93015445
## [150,] 0.76021149
## attr(,"scaled:center")
## [1] 3.758
## attr(,"scaled:scale")
## [1] 1.765298
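The manual calculation and scale() should give the same values, and a standardized variable should have mean 0 and standard deviation 1. The following sketch checks both points on the built-in iris data; the names x, z_manual, z_scale and mismos_valores are illustrative assumptions, not objects from the analysis above:

```r
# Petal length from the built-in iris data
x <- datasets::iris$Petal.Length

# Z standardization by hand: subtract the mean, divide by the sample sd
z_manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix with attributes; keep only the numbers
z_scale <- as.numeric(scale(x))

# TRUE when both approaches agree up to floating-point tolerance
mismos_valores <- isTRUE(all.equal(z_manual, z_scale))
mismos_valores

# Mean should be (numerically) 0 and standard deviation 1
c(media = mean(z_manual), desv_est = sd(z_manual))
```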
The advantage of R's scale() function is that it can be applied to the whole data set at once.
iris_cuanti_scale <- scale(iris[ ,1:5])
head(iris_cuanti_scale)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## [1,] -0.8976739 1.01560199 -1.335752 -1.311052 -1.220656
## [2,] -1.1392005 -0.13153881 -1.335752 -1.311052 -1.220656
## [3,] -1.3807271 0.32731751 -1.392399 -1.311052 -1.220656
## [4,] -1.5014904 0.09788935 -1.279104 -1.311052 -1.220656
## [5,] -1.0184372 1.24503015 -1.335752 -1.311052 -1.220656
## [6,] -0.5353840 1.93331463 -1.165809 -1.048667 -1.220656
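In the output above the Species codes are standardized along with the measurements. If the species label should stay as a category, one option is to standardize only the four measurement columns. This is a sketch under that assumption, using a fresh copy of the data; datos, medidas and datos_z are illustrative names:

```r
# Fresh copy of the built-in data, with Species still stored as a factor
datos   <- datasets::iris
medidas <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Standardize the four measurements and leave the species label untouched
datos_z <- datos
datos_z[medidas] <- as.data.frame(scale(datos[medidas]))

head(datos_z)

# Each standardized column should have mean 0 and standard deviation 1
round(colMeans(datos_z[medidas]), 10)
sapply(datos_z[medidas], sd)
```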
Petal.Length_normal <- (iris$Petal.Length-min(iris$Petal.Length))/(max(iris$Petal.Length)-min(iris$Petal.Length))
Petal.Length_normal
## [1] 0.06779661 0.06779661 0.05084746 0.08474576 0.06779661 0.11864407
## [7] 0.06779661 0.08474576 0.06779661 0.08474576 0.08474576 0.10169492
## [13] 0.06779661 0.01694915 0.03389831 0.08474576 0.05084746 0.06779661
## [19] 0.11864407 0.08474576 0.11864407 0.08474576 0.00000000 0.11864407
## [25] 0.15254237 0.10169492 0.10169492 0.08474576 0.06779661 0.10169492
## [31] 0.10169492 0.08474576 0.08474576 0.06779661 0.08474576 0.03389831
## [37] 0.05084746 0.06779661 0.05084746 0.08474576 0.05084746 0.05084746
## [43] 0.05084746 0.10169492 0.15254237 0.06779661 0.10169492 0.06779661
## [49] 0.08474576 0.06779661 0.62711864 0.59322034 0.66101695 0.50847458
## [55] 0.61016949 0.59322034 0.62711864 0.38983051 0.61016949 0.49152542
## [61] 0.42372881 0.54237288 0.50847458 0.62711864 0.44067797 0.57627119
## [67] 0.59322034 0.52542373 0.59322034 0.49152542 0.64406780 0.50847458
## [73] 0.66101695 0.62711864 0.55932203 0.57627119 0.64406780 0.67796610
## [79] 0.59322034 0.42372881 0.47457627 0.45762712 0.49152542 0.69491525
## [85] 0.59322034 0.59322034 0.62711864 0.57627119 0.52542373 0.50847458
## [91] 0.57627119 0.61016949 0.50847458 0.38983051 0.54237288 0.54237288
## [97] 0.54237288 0.55932203 0.33898305 0.52542373 0.84745763 0.69491525
## [103] 0.83050847 0.77966102 0.81355932 0.94915254 0.59322034 0.89830508
## [109] 0.81355932 0.86440678 0.69491525 0.72881356 0.76271186 0.67796610
## [115] 0.69491525 0.72881356 0.76271186 0.96610169 1.00000000 0.67796610
## [121] 0.79661017 0.66101695 0.96610169 0.66101695 0.79661017 0.84745763
## [127] 0.64406780 0.66101695 0.77966102 0.81355932 0.86440678 0.91525424
## [133] 0.77966102 0.69491525 0.77966102 0.86440678 0.77966102 0.76271186
## [139] 0.64406780 0.74576271 0.77966102 0.69491525 0.69491525 0.83050847
## [145] 0.79661017 0.71186441 0.67796610 0.71186441 0.74576271 0.69491525
library(scales)
## Warning: package 'scales' was built under R version 4.3.2
rescale(iris$Petal.Length)
## [1] 0.06779661 0.06779661 0.05084746 0.08474576 0.06779661 0.11864407
## [7] 0.06779661 0.08474576 0.06779661 0.08474576 0.08474576 0.10169492
## [13] 0.06779661 0.01694915 0.03389831 0.08474576 0.05084746 0.06779661
## [19] 0.11864407 0.08474576 0.11864407 0.08474576 0.00000000 0.11864407
## [25] 0.15254237 0.10169492 0.10169492 0.08474576 0.06779661 0.10169492
## [31] 0.10169492 0.08474576 0.08474576 0.06779661 0.08474576 0.03389831
## [37] 0.05084746 0.06779661 0.05084746 0.08474576 0.05084746 0.05084746
## [43] 0.05084746 0.10169492 0.15254237 0.06779661 0.10169492 0.06779661
## [49] 0.08474576 0.06779661 0.62711864 0.59322034 0.66101695 0.50847458
## [55] 0.61016949 0.59322034 0.62711864 0.38983051 0.61016949 0.49152542
## [61] 0.42372881 0.54237288 0.50847458 0.62711864 0.44067797 0.57627119
## [67] 0.59322034 0.52542373 0.59322034 0.49152542 0.64406780 0.50847458
## [73] 0.66101695 0.62711864 0.55932203 0.57627119 0.64406780 0.67796610
## [79] 0.59322034 0.42372881 0.47457627 0.45762712 0.49152542 0.69491525
## [85] 0.59322034 0.59322034 0.62711864 0.57627119 0.52542373 0.50847458
## [91] 0.57627119 0.61016949 0.50847458 0.38983051 0.54237288 0.54237288
## [97] 0.54237288 0.55932203 0.33898305 0.52542373 0.84745763 0.69491525
## [103] 0.83050847 0.77966102 0.81355932 0.94915254 0.59322034 0.89830508
## [109] 0.81355932 0.86440678 0.69491525 0.72881356 0.76271186 0.67796610
## [115] 0.69491525 0.72881356 0.76271186 0.96610169 1.00000000 0.67796610
## [121] 0.79661017 0.66101695 0.96610169 0.66101695 0.79661017 0.84745763
## [127] 0.64406780 0.66101695 0.77966102 0.81355932 0.86440678 0.91525424
## [133] 0.77966102 0.69491525 0.77966102 0.86440678 0.77966102 0.76271186
## [139] 0.64406780 0.74576271 0.77966102 0.69491525 0.69491525 0.83050847
## [145] 0.79661017 0.71186441 0.67796610 0.71186441 0.74576271 0.69491525
The rescale() function works only on vectors; it cannot be applied directly to a data frame.
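One way around this limitation is to apply a vector function to each column in turn. The sketch below uses a small base-R helper, min_max (a hypothetical name, written here so the example runs without extra packages), and lapply() to normalize the four measurement columns of a fresh copy of iris:

```r
# Hypothetical helper: min-max scaling of one numeric vector to [0, 1]
min_max <- function(x) (x - min(x)) / (max(x) - min(x))

datos   <- datasets::iris
medidas <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# lapply() visits each column as a vector; assigning back keeps the data frame shape
datos_norm <- datos
datos_norm[medidas] <- lapply(datos[medidas], min_max)

head(datos_norm)

# Every normalized column should now run from 0 to 1
sapply(datos_norm[medidas], range)
```

With the scales package loaded, replacing min_max by rescale in the lapply() call should give the same values, since rescale() maps to the interval [0, 1] by default.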
library(caret)
## Warning: package 'caret' was built under R version 4.3.2
## Loading required package: lattice
pre_procesamiento<-preProcess(iris[,1:5]) # By default this applies Z standardization (center and scale)
predict(pre_procesamiento, iris[,1:5])
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 -0.89767388 1.01560199 -1.33575163 -1.3110521482 -1.220656
## 2 -1.13920048 -0.13153881 -1.33575163 -1.3110521482 -1.220656
## 3 -1.38072709 0.32731751 -1.39239929 -1.3110521482 -1.220656
## 4 -1.50149039 0.09788935 -1.27910398 -1.3110521482 -1.220656
## 5 -1.01843718 1.24503015 -1.33575163 -1.3110521482 -1.220656
## 6 -0.53538397 1.93331463 -1.16580868 -1.0486667950 -1.220656
## 7 -1.50149039 0.78617383 -1.33575163 -1.1798594716 -1.220656
## 8 -1.01843718 0.78617383 -1.27910398 -1.3110521482 -1.220656
## 9 -1.74301699 -0.36096697 -1.33575163 -1.3110521482 -1.220656
## 10 -1.13920048 0.09788935 -1.27910398 -1.4422448248 -1.220656
## 11 -0.53538397 1.47445831 -1.27910398 -1.3110521482 -1.220656
## 12 -1.25996379 0.78617383 -1.22245633 -1.3110521482 -1.220656
## 13 -1.25996379 -0.13153881 -1.33575163 -1.4422448248 -1.220656
## 14 -1.86378030 -0.13153881 -1.50569459 -1.4422448248 -1.220656
## 15 -0.05233076 2.16274279 -1.44904694 -1.3110521482 -1.220656
## 16 -0.17309407 3.08045544 -1.27910398 -1.0486667950 -1.220656
## 17 -0.53538397 1.93331463 -1.39239929 -1.0486667950 -1.220656
## 18 -0.89767388 1.01560199 -1.33575163 -1.1798594716 -1.220656
## 19 -0.17309407 1.70388647 -1.16580868 -1.1798594716 -1.220656
## 20 -0.89767388 1.70388647 -1.27910398 -1.1798594716 -1.220656
## 21 -0.53538397 0.78617383 -1.16580868 -1.3110521482 -1.220656
## 22 -0.89767388 1.47445831 -1.27910398 -1.0486667950 -1.220656
## 23 -1.50149039 1.24503015 -1.56234224 -1.3110521482 -1.220656
## 24 -0.89767388 0.55674567 -1.16580868 -0.9174741184 -1.220656
## 25 -1.25996379 0.78617383 -1.05251337 -1.3110521482 -1.220656
## 26 -1.01843718 -0.13153881 -1.22245633 -1.3110521482 -1.220656
## 27 -1.01843718 0.78617383 -1.22245633 -1.0486667950 -1.220656
## 28 -0.77691058 1.01560199 -1.27910398 -1.3110521482 -1.220656
## 29 -0.77691058 0.78617383 -1.33575163 -1.3110521482 -1.220656
## 30 -1.38072709 0.32731751 -1.22245633 -1.3110521482 -1.220656
## 31 -1.25996379 0.09788935 -1.22245633 -1.3110521482 -1.220656
## 32 -0.53538397 0.78617383 -1.27910398 -1.0486667950 -1.220656
## 33 -0.77691058 2.39217095 -1.27910398 -1.4422448248 -1.220656
## 34 -0.41462067 2.62159911 -1.33575163 -1.3110521482 -1.220656
## 35 -1.13920048 0.09788935 -1.27910398 -1.3110521482 -1.220656
## 36 -1.01843718 0.32731751 -1.44904694 -1.3110521482 -1.220656
## 37 -0.41462067 1.01560199 -1.39239929 -1.3110521482 -1.220656
## 38 -1.13920048 1.24503015 -1.33575163 -1.4422448248 -1.220656
## 39 -1.74301699 -0.13153881 -1.39239929 -1.3110521482 -1.220656
## 40 -0.89767388 0.78617383 -1.27910398 -1.3110521482 -1.220656
## 41 -1.01843718 1.01560199 -1.39239929 -1.1798594716 -1.220656
## 42 -1.62225369 -1.73753594 -1.39239929 -1.1798594716 -1.220656
## 43 -1.74301699 0.32731751 -1.39239929 -1.3110521482 -1.220656
## 44 -1.01843718 1.01560199 -1.22245633 -0.7862814418 -1.220656
## 45 -0.89767388 1.70388647 -1.05251337 -1.0486667950 -1.220656
## 46 -1.25996379 -0.13153881 -1.33575163 -1.1798594716 -1.220656
## 47 -0.89767388 1.70388647 -1.22245633 -1.3110521482 -1.220656
## 48 -1.50149039 0.32731751 -1.33575163 -1.3110521482 -1.220656
## 49 -0.65614727 1.47445831 -1.27910398 -1.3110521482 -1.220656
## 50 -1.01843718 0.55674567 -1.33575163 -1.3110521482 -1.220656
## 51 1.39682886 0.32731751 0.53362088 0.2632599711 0.000000
## 52 0.67224905 0.32731751 0.42032558 0.3944526477 0.000000
## 53 1.27606556 0.09788935 0.64691619 0.3944526477 0.000000
## 54 -0.41462067 -1.73753594 0.13708732 0.1320672944 0.000000
## 55 0.79301235 -0.59039513 0.47697323 0.3944526477 0.000000
## 56 -0.17309407 -0.59039513 0.42032558 0.1320672944 0.000000
## 57 0.55148575 0.55674567 0.53362088 0.5256453243 0.000000
## 58 -1.13920048 -1.50810778 -0.25944625 -0.2615107354 0.000000
## 59 0.91377565 -0.36096697 0.47697323 0.1320672944 0.000000
## 60 -0.77691058 -0.81982329 0.08043967 0.2632599711 0.000000
## 61 -1.01843718 -2.42582042 -0.14615094 -0.2615107354 0.000000
## 62 0.06843254 -0.13153881 0.25038262 0.3944526477 0.000000
## 63 0.18919584 -1.96696410 0.13708732 -0.2615107354 0.000000
## 64 0.30995914 -0.36096697 0.53362088 0.2632599711 0.000000
## 65 -0.29385737 -0.36096697 -0.08950329 0.1320672944 0.000000
## 66 1.03453895 0.09788935 0.36367793 0.2632599711 0.000000
## 67 -0.29385737 -0.13153881 0.42032558 0.3944526477 0.000000
## 68 -0.05233076 -0.81982329 0.19373497 -0.2615107354 0.000000
## 69 0.43072244 -1.96696410 0.42032558 0.3944526477 0.000000
## 70 -0.29385737 -1.27867961 0.08043967 -0.1303180588 0.000000
## 71 0.06843254 0.32731751 0.59026853 0.7880306775 0.000000
## 72 0.30995914 -0.59039513 0.13708732 0.1320672944 0.000000
## 73 0.55148575 -1.27867961 0.64691619 0.3944526477 0.000000
## 74 0.30995914 -0.59039513 0.53362088 0.0008746178 0.000000
## 75 0.67224905 -0.36096697 0.30703027 0.1320672944 0.000000
## 76 0.91377565 -0.13153881 0.36367793 0.2632599711 0.000000
## 77 1.15530226 -0.59039513 0.59026853 0.2632599711 0.000000
## 78 1.03453895 -0.13153881 0.70356384 0.6568380009 0.000000
## 79 0.18919584 -0.36096697 0.42032558 0.3944526477 0.000000
## 80 -0.17309407 -1.04925145 -0.14615094 -0.2615107354 0.000000
## 81 -0.41462067 -1.50810778 0.02379201 -0.1303180588 0.000000
## 82 -0.41462067 -1.50810778 -0.03285564 -0.2615107354 0.000000
## 83 -0.05233076 -0.81982329 0.08043967 0.0008746178 0.000000
## 84 0.18919584 -0.81982329 0.76021149 0.5256453243 0.000000
## 85 -0.53538397 -0.13153881 0.42032558 0.3944526477 0.000000
## 86 0.18919584 0.78617383 0.42032558 0.5256453243 0.000000
## 87 1.03453895 0.09788935 0.53362088 0.3944526477 0.000000
## 88 0.55148575 -1.73753594 0.36367793 0.1320672944 0.000000
## 89 -0.29385737 -0.13153881 0.19373497 0.1320672944 0.000000
## 90 -0.41462067 -1.27867961 0.13708732 0.1320672944 0.000000
## 91 -0.41462067 -1.04925145 0.36367793 0.0008746178 0.000000
## 92 0.30995914 -0.13153881 0.47697323 0.2632599711 0.000000
## 93 -0.05233076 -1.04925145 0.13708732 0.0008746178 0.000000
## 94 -1.01843718 -1.73753594 -0.25944625 -0.2615107354 0.000000
## 95 -0.29385737 -0.81982329 0.25038262 0.1320672944 0.000000
## 96 -0.17309407 -0.13153881 0.25038262 0.0008746178 0.000000
## 97 -0.17309407 -0.36096697 0.25038262 0.1320672944 0.000000
## 98 0.43072244 -0.36096697 0.30703027 0.1320672944 0.000000
## 99 -0.89767388 -1.27867961 -0.42938920 -0.1303180588 0.000000
## 100 -0.17309407 -0.59039513 0.19373497 0.1320672944 0.000000
## 101 0.55148575 0.55674567 1.27004036 1.7063794137 1.220656
## 102 -0.05233076 -0.81982329 0.76021149 0.9192233541 1.220656
## 103 1.51759216 -0.13153881 1.21339271 1.1816087073 1.220656
## 104 0.55148575 -0.36096697 1.04344975 0.7880306775 1.220656
## 105 0.79301235 -0.13153881 1.15674505 1.3128013839 1.220656
## 106 2.12140867 -0.13153881 1.60992627 1.1816087073 1.220656
## 107 -1.13920048 -1.27867961 0.42032558 0.6568380009 1.220656
## 108 1.75911877 -0.36096697 1.43998331 0.7880306775 1.220656
## 109 1.03453895 -1.27867961 1.15674505 0.7880306775 1.220656
## 110 1.63835547 1.24503015 1.32668801 1.7063794137 1.220656
## 111 0.79301235 0.32731751 0.76021149 1.0504160307 1.220656
## 112 0.67224905 -0.81982329 0.87350679 0.9192233541 1.220656
## 113 1.15530226 -0.13153881 0.98680210 1.1816087073 1.220656
## 114 -0.17309407 -1.27867961 0.70356384 1.0504160307 1.220656
## 115 -0.05233076 -0.59039513 0.76021149 1.5751867371 1.220656
## 116 0.67224905 0.32731751 0.87350679 1.4439940605 1.220656
## 117 0.79301235 -0.13153881 0.98680210 0.7880306775 1.220656
## 118 2.24217198 1.70388647 1.66657392 1.3128013839 1.220656
## 119 2.24217198 -1.04925145 1.77986923 1.4439940605 1.220656
## 120 0.18919584 -1.96696410 0.70356384 0.3944526477 1.220656
## 121 1.27606556 0.32731751 1.10009740 1.4439940605 1.220656
## 122 -0.29385737 -0.59039513 0.64691619 1.0504160307 1.220656
## 123 2.24217198 -0.59039513 1.66657392 1.0504160307 1.220656
## 124 0.55148575 -0.81982329 0.64691619 0.7880306775 1.220656
## 125 1.03453895 0.55674567 1.10009740 1.1816087073 1.220656
## 126 1.63835547 0.32731751 1.27004036 0.7880306775 1.220656
## 127 0.43072244 -0.59039513 0.59026853 0.7880306775 1.220656
## 128 0.30995914 -0.13153881 0.64691619 0.7880306775 1.220656
## 129 0.67224905 -0.59039513 1.04344975 1.1816087073 1.220656
## 130 1.63835547 -0.13153881 1.15674505 0.5256453243 1.220656
## 131 1.87988207 -0.59039513 1.32668801 0.9192233541 1.220656
## 132 2.48369858 1.70388647 1.49663097 1.0504160307 1.220656
## 133 0.67224905 -0.59039513 1.04344975 1.3128013839 1.220656
## 134 0.55148575 -0.59039513 0.76021149 0.3944526477 1.220656
## 135 0.30995914 -1.04925145 1.04344975 0.2632599711 1.220656
## 136 2.24217198 -0.13153881 1.32668801 1.4439940605 1.220656
## 137 0.55148575 0.78617383 1.04344975 1.5751867371 1.220656
## 138 0.67224905 0.09788935 0.98680210 0.7880306775 1.220656
## 139 0.18919584 -0.13153881 0.59026853 0.7880306775 1.220656
## 140 1.27606556 0.09788935 0.93015445 1.1816087073 1.220656
## 141 1.03453895 0.09788935 1.04344975 1.5751867371 1.220656
## 142 1.27606556 0.09788935 0.76021149 1.4439940605 1.220656
## 143 -0.05233076 -0.81982329 0.76021149 0.9192233541 1.220656
## 144 1.15530226 0.32731751 1.21339271 1.4439940605 1.220656
## 145 1.03453895 0.55674567 1.10009740 1.7063794137 1.220656
## 146 1.03453895 -0.13153881 0.81685914 1.4439940605 1.220656
## 147 0.55148575 -1.27867961 0.70356384 0.9192233541 1.220656
## 148 0.79301235 -0.13153881 0.81685914 1.0504160307 1.220656
## 149 0.43072244 0.78617383 0.93015445 1.4439940605 1.220656
## 150 0.06843254 -0.13153881 0.76021149 0.7880306775 1.220656
library(caret)
pre_procesamiento<-preProcess(iris[,1:5], method = "range")
predict(pre_procesamiento, iris[,1:5])
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 0.22222222 0.62500000 0.06779661 0.04166667 0.0
## 2 0.16666667 0.41666667 0.06779661 0.04166667 0.0
## 3 0.11111111 0.50000000 0.05084746 0.04166667 0.0
## 4 0.08333333 0.45833333 0.08474576 0.04166667 0.0
## 5 0.19444444 0.66666667 0.06779661 0.04166667 0.0
## 6 0.30555556 0.79166667 0.11864407 0.12500000 0.0
## 7 0.08333333 0.58333333 0.06779661 0.08333333 0.0
## 8 0.19444444 0.58333333 0.08474576 0.04166667 0.0
## 9 0.02777778 0.37500000 0.06779661 0.04166667 0.0
## 10 0.16666667 0.45833333 0.08474576 0.00000000 0.0
## 11 0.30555556 0.70833333 0.08474576 0.04166667 0.0
## 12 0.13888889 0.58333333 0.10169492 0.04166667 0.0
## 13 0.13888889 0.41666667 0.06779661 0.00000000 0.0
## 14 0.00000000 0.41666667 0.01694915 0.00000000 0.0
## 15 0.41666667 0.83333333 0.03389831 0.04166667 0.0
## 16 0.38888889 1.00000000 0.08474576 0.12500000 0.0
## 17 0.30555556 0.79166667 0.05084746 0.12500000 0.0
## 18 0.22222222 0.62500000 0.06779661 0.08333333 0.0
## 19 0.38888889 0.75000000 0.11864407 0.08333333 0.0
## 20 0.22222222 0.75000000 0.08474576 0.08333333 0.0
## 21 0.30555556 0.58333333 0.11864407 0.04166667 0.0
## 22 0.22222222 0.70833333 0.08474576 0.12500000 0.0
## 23 0.08333333 0.66666667 0.00000000 0.04166667 0.0
## 24 0.22222222 0.54166667 0.11864407 0.16666667 0.0
## 25 0.13888889 0.58333333 0.15254237 0.04166667 0.0
## 26 0.19444444 0.41666667 0.10169492 0.04166667 0.0
## 27 0.19444444 0.58333333 0.10169492 0.12500000 0.0
## 28 0.25000000 0.62500000 0.08474576 0.04166667 0.0
## 29 0.25000000 0.58333333 0.06779661 0.04166667 0.0
## 30 0.11111111 0.50000000 0.10169492 0.04166667 0.0
## 31 0.13888889 0.45833333 0.10169492 0.04166667 0.0
## 32 0.30555556 0.58333333 0.08474576 0.12500000 0.0
## 33 0.25000000 0.87500000 0.08474576 0.00000000 0.0
## 34 0.33333333 0.91666667 0.06779661 0.04166667 0.0
## 35 0.16666667 0.45833333 0.08474576 0.04166667 0.0
## 36 0.19444444 0.50000000 0.03389831 0.04166667 0.0
## 37 0.33333333 0.62500000 0.05084746 0.04166667 0.0
## 38 0.16666667 0.66666667 0.06779661 0.00000000 0.0
## 39 0.02777778 0.41666667 0.05084746 0.04166667 0.0
## 40 0.22222222 0.58333333 0.08474576 0.04166667 0.0
## 41 0.19444444 0.62500000 0.05084746 0.08333333 0.0
## 42 0.05555556 0.12500000 0.05084746 0.08333333 0.0
## 43 0.02777778 0.50000000 0.05084746 0.04166667 0.0
## 44 0.19444444 0.62500000 0.10169492 0.20833333 0.0
## 45 0.22222222 0.75000000 0.15254237 0.12500000 0.0
## 46 0.13888889 0.41666667 0.06779661 0.08333333 0.0
## 47 0.22222222 0.75000000 0.10169492 0.04166667 0.0
## 48 0.08333333 0.50000000 0.06779661 0.04166667 0.0
## 49 0.27777778 0.70833333 0.08474576 0.04166667 0.0
## 50 0.19444444 0.54166667 0.06779661 0.04166667 0.0
## 51 0.75000000 0.50000000 0.62711864 0.54166667 0.5
## 52 0.58333333 0.50000000 0.59322034 0.58333333 0.5
## 53 0.72222222 0.45833333 0.66101695 0.58333333 0.5
## 54 0.33333333 0.12500000 0.50847458 0.50000000 0.5
## 55 0.61111111 0.33333333 0.61016949 0.58333333 0.5
## 56 0.38888889 0.33333333 0.59322034 0.50000000 0.5
## 57 0.55555556 0.54166667 0.62711864 0.62500000 0.5
## 58 0.16666667 0.16666667 0.38983051 0.37500000 0.5
## 59 0.63888889 0.37500000 0.61016949 0.50000000 0.5
## 60 0.25000000 0.29166667 0.49152542 0.54166667 0.5
## 61 0.19444444 0.00000000 0.42372881 0.37500000 0.5
## 62 0.44444444 0.41666667 0.54237288 0.58333333 0.5
## 63 0.47222222 0.08333333 0.50847458 0.37500000 0.5
## 64 0.50000000 0.37500000 0.62711864 0.54166667 0.5
## 65 0.36111111 0.37500000 0.44067797 0.50000000 0.5
## 66 0.66666667 0.45833333 0.57627119 0.54166667 0.5
## 67 0.36111111 0.41666667 0.59322034 0.58333333 0.5
## 68 0.41666667 0.29166667 0.52542373 0.37500000 0.5
## 69 0.52777778 0.08333333 0.59322034 0.58333333 0.5
## 70 0.36111111 0.20833333 0.49152542 0.41666667 0.5
## 71 0.44444444 0.50000000 0.64406780 0.70833333 0.5
## 72 0.50000000 0.33333333 0.50847458 0.50000000 0.5
## 73 0.55555556 0.20833333 0.66101695 0.58333333 0.5
## 74 0.50000000 0.33333333 0.62711864 0.45833333 0.5
## 75 0.58333333 0.37500000 0.55932203 0.50000000 0.5
## 76 0.63888889 0.41666667 0.57627119 0.54166667 0.5
## 77 0.69444444 0.33333333 0.64406780 0.54166667 0.5
## 78 0.66666667 0.41666667 0.67796610 0.66666667 0.5
## 79 0.47222222 0.37500000 0.59322034 0.58333333 0.5
## 80 0.38888889 0.25000000 0.42372881 0.37500000 0.5
## 81 0.33333333 0.16666667 0.47457627 0.41666667 0.5
## 82 0.33333333 0.16666667 0.45762712 0.37500000 0.5
## 83 0.41666667 0.29166667 0.49152542 0.45833333 0.5
## 84 0.47222222 0.29166667 0.69491525 0.62500000 0.5
## 85 0.30555556 0.41666667 0.59322034 0.58333333 0.5
## 86 0.47222222 0.58333333 0.59322034 0.62500000 0.5
## 87 0.66666667 0.45833333 0.62711864 0.58333333 0.5
## 88 0.55555556 0.12500000 0.57627119 0.50000000 0.5
## 89 0.36111111 0.41666667 0.52542373 0.50000000 0.5
## 90 0.33333333 0.20833333 0.50847458 0.50000000 0.5
## 91 0.33333333 0.25000000 0.57627119 0.45833333 0.5
## 92 0.50000000 0.41666667 0.61016949 0.54166667 0.5
## 93 0.41666667 0.25000000 0.50847458 0.45833333 0.5
## 94 0.19444444 0.12500000 0.38983051 0.37500000 0.5
## 95 0.36111111 0.29166667 0.54237288 0.50000000 0.5
## 96 0.38888889 0.41666667 0.54237288 0.45833333 0.5
## 97 0.38888889 0.37500000 0.54237288 0.50000000 0.5
## 98 0.52777778 0.37500000 0.55932203 0.50000000 0.5
## 99 0.22222222 0.20833333 0.33898305 0.41666667 0.5
## 100 0.38888889 0.33333333 0.52542373 0.50000000 0.5
## 101 0.55555556 0.54166667 0.84745763 1.00000000 1.0
## 102 0.41666667 0.29166667 0.69491525 0.75000000 1.0
## 103 0.77777778 0.41666667 0.83050847 0.83333333 1.0
## 104 0.55555556 0.37500000 0.77966102 0.70833333 1.0
## 105 0.61111111 0.41666667 0.81355932 0.87500000 1.0
## 106 0.91666667 0.41666667 0.94915254 0.83333333 1.0
## 107 0.16666667 0.20833333 0.59322034 0.66666667 1.0
## 108 0.83333333 0.37500000 0.89830508 0.70833333 1.0
## 109 0.66666667 0.20833333 0.81355932 0.70833333 1.0
## 110 0.80555556 0.66666667 0.86440678 1.00000000 1.0
## 111 0.61111111 0.50000000 0.69491525 0.79166667 1.0
## 112 0.58333333 0.29166667 0.72881356 0.75000000 1.0
## 113 0.69444444 0.41666667 0.76271186 0.83333333 1.0
## 114 0.38888889 0.20833333 0.67796610 0.79166667 1.0
## 115 0.41666667 0.33333333 0.69491525 0.95833333 1.0
## 116 0.58333333 0.50000000 0.72881356 0.91666667 1.0
## 117 0.61111111 0.41666667 0.76271186 0.70833333 1.0
## 118 0.94444444 0.75000000 0.96610169 0.87500000 1.0
## 119 0.94444444 0.25000000 1.00000000 0.91666667 1.0
## 120 0.47222222 0.08333333 0.67796610 0.58333333 1.0
## 121 0.72222222 0.50000000 0.79661017 0.91666667 1.0
## 122 0.36111111 0.33333333 0.66101695 0.79166667 1.0
## 123 0.94444444 0.33333333 0.96610169 0.79166667 1.0
## 124 0.55555556 0.29166667 0.66101695 0.70833333 1.0
## 125 0.66666667 0.54166667 0.79661017 0.83333333 1.0
## 126 0.80555556 0.50000000 0.84745763 0.70833333 1.0
## 127 0.52777778 0.33333333 0.64406780 0.70833333 1.0
## 128 0.50000000 0.41666667 0.66101695 0.70833333 1.0
## 129 0.58333333 0.33333333 0.77966102 0.83333333 1.0
## 130 0.80555556 0.41666667 0.81355932 0.62500000 1.0
## 131 0.86111111 0.33333333 0.86440678 0.75000000 1.0
## 132 1.00000000 0.75000000 0.91525424 0.79166667 1.0
## 133 0.58333333 0.33333333 0.77966102 0.87500000 1.0
## 134 0.55555556 0.33333333 0.69491525 0.58333333 1.0
## 135 0.50000000 0.25000000 0.77966102 0.54166667 1.0
## 136 0.94444444 0.41666667 0.86440678 0.91666667 1.0
## 137 0.55555556 0.58333333 0.77966102 0.95833333 1.0
## 138 0.58333333 0.45833333 0.76271186 0.70833333 1.0
## 139 0.47222222 0.41666667 0.64406780 0.70833333 1.0
## 140 0.72222222 0.45833333 0.74576271 0.83333333 1.0
## 141 0.66666667 0.45833333 0.77966102 0.95833333 1.0
## 142 0.72222222 0.45833333 0.69491525 0.91666667 1.0
## 143 0.41666667 0.29166667 0.69491525 0.75000000 1.0
## 144 0.69444444 0.50000000 0.83050847 0.91666667 1.0
## 145 0.66666667 0.54166667 0.79661017 1.00000000 1.0
## 146 0.66666667 0.41666667 0.71186441 0.91666667 1.0
## 147 0.55555556 0.20833333 0.67796610 0.75000000 1.0
## 148 0.61111111 0.41666667 0.71186441 0.79166667 1.0
## 149 0.52777778 0.58333333 0.74576271 0.91666667 1.0
## 150 0.44444444 0.41666667 0.69491525 0.70833333 1.0
The following data are taken from iris.csv.
nuevos_datos <- data.frame(
Sepal.Length= c(5,4,4,4,5,5,4,5,4,4,5,4,4,4,5,5,5),
Sepal.Width= c(1,9,7,6,3,4,6,3,4,9,4,8,8,3,8,7,4)
)
nuevos_datos
## Sepal.Length Sepal.Width
## 1 5 1
## 2 4 9
## 3 4 7
## 4 4 6
## 5 5 3
## 6 5 4
## 7 4 6
## 8 5 3
## 9 4 4
## 10 4 9
## 11 5 4
## 12 4 8
## 13 4 8
## 14 4 3
## 15 5 8
## 16 5 7
## 17 5 4
# Scatter plot with plot()
plot(nuevos_datos)
# Scatter plot matrix with pairs()
pairs(nuevos_datos)
# Draw an enhanced plot
library(PerformanceAnalytics)
chart.Correlation(nuevos_datos)
## Warning in par(usr): argument 1 does not name a graphical parameter
# Draw an enhanced plot with corrplot
library(corrplot)
## Warning: package 'corrplot' was built under R version 4.3.2
## corrplot 0.92 loaded
corrplot(cor(nuevos_datos))
# Using the cor() function
cor(nuevos_datos) # Correlation matrix
## Sepal.Length Sepal.Width
## Sepal.Length 1.0000000 -0.5069806
## Sepal.Width -0.5069806 1.0000000
Correlation coefficient:
r = -0.5069806
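The value in the correlation matrix can be reproduced from the definition of Pearson's r, the covariance divided by the product of the two standard deviations. A minimal sketch with the same 17 observations; x, y, r_manual and r_cor are illustrative names:

```r
# Same 17 observations as nuevos_datos
x <- c(5,4,4,4,5,5,4,5,4,4,5,4,4,4,5,5,5)   # Sepal.Length
y <- c(1,9,7,6,3,4,6,3,4,9,4,8,8,3,8,7,4)   # Sepal.Width

# Pearson's r: covariance divided by the product of the standard deviations
r_manual <- cov(x, y) / (sd(x) * sd(y))
r_manual

# cor() returns the same number, including its sign
r_cor <- cor(x, y)
r_cor
```

The sign carries the direction of the association: here larger widths tend to go with smaller lengths in this small sample.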
# lm, notation: Y ~ X, data = ...
modelo_iris <- lm(Sepal.Length ~ Sepal.Width, data=iris)
# Summary of the results
summary(modelo_iris)
##
## Call:
## lm(formula = Sepal.Length ~ Sepal.Width, data = iris)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.5561 -0.6333 -0.1120 0.5579 2.2226
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.5262 0.4789 13.63 <2e-16 ***
## Sepal.Width -0.2234 0.1551 -1.44 0.152
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8251 on 148 degrees of freedom
## Multiple R-squared: 0.01382, Adjusted R-squared: 0.007159
## F-statistic: 2.074 on 1 and 148 DF, p-value: 0.1519
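In a simple linear regression the Multiple R-squared equals the squared Pearson correlation between the two variables, so the low R-squared in this summary reflects a weak linear correlation in the full iris data. A short sketch of that check, assuming a fresh copy of the data; ajuste, pendiente, r and r2 are illustrative names:

```r
datos  <- datasets::iris
ajuste <- lm(Sepal.Length ~ Sepal.Width, data = datos)

pendiente <- unname(coef(ajuste)["Sepal.Width"])   # estimated slope
r  <- cor(datos$Sepal.Length, datos$Sepal.Width)   # Pearson correlation
r2 <- summary(ajuste)$r.squared                    # Multiple R-squared

# r squared and R2 should coincide
c(pendiente = pendiente, r = r, r_al_cuadrado = r^2, R2 = r2)
```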
# Improved plot
ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Sepal.Length)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.1) +
  theme_bw() +
  theme(legend.position = "none")  # "none" is the documented value for hiding the legend
## Warning: Continuous x aesthetic
## ℹ did you forget `aes(group = ...)`?
## Warning: The following aesthetics were dropped during statistical transformation: colour
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
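The warnings above arise because a box plot needs a discrete grouping and `Sepal.Length` is continuous. One possible workaround, sketched here in base R, is to bin the variable into a factor first. The 1 cm bins and the names `datos` and `grupo_largo` are assumptions made for this example, not part of the original analysis.

```r
# Work on a copy of the built-in data set
datos <- datasets::iris

# Bin the continuous variable into intervals (4,5], (5,6], (6,7], (7,8]
datos$grupo_largo <- cut(datos$Sepal.Length, breaks = seq(4, 8, by = 1))
table(datos$grupo_largo)

# One box per bin; outliers are hidden because the raw points are overlaid
boxplot(Sepal.Width ~ grupo_largo, data = datos, outline = FALSE,
        ylim = range(datos$Sepal.Width),
        xlab = "Sepal.Length (binned)", ylab = "Sepal.Width")

# Overlay the individual observations with horizontal jitter
stripchart(Sepal.Width ~ grupo_largo, data = datos, vertical = TRUE,
           method = "jitter", jitter = 0.1, pch = 16, add = TRUE)
```

The same factor could be mapped to `x` in the ggplot2 call, which should give `geom_boxplot()` the grouping that the warning asks for.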
# Same model fitted as a generalized linear model with a Gaussian family
modelgest <- glm(Sepal.Length ~ Sepal.Width, data = iris, family = gaussian())
summary(modelgest)
##
## Call:
## glm(formula = Sepal.Length ~ Sepal.Width, family = gaussian(),
## data = iris)
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.5262 0.4789 13.63 <2e-16 ***
## Sepal.Width -0.2234 0.1551 -1.44 0.152
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 0.6807844)
##
## Null deviance: 102.17 on 149 degrees of freedom
## Residual deviance: 100.76 on 148 degrees of freedom
## AIC: 371.99
##
## Number of Fisher Scoring iterations: 2
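The coefficient table of the Gaussian GLM matches the `lm()` summary shown earlier. The following sketch checks that equivalence numerically. The names `datos`, `modelo_lm` and `modelo_glm` are illustrative, and the block uses a fresh copy of the built-in data set so that it runs on its own.

```r
# Fresh copy of the built-in data set
datos <- datasets::iris

# Ordinary least squares and the equivalent Gaussian GLM
modelo_lm  <- lm(Sepal.Length ~ Sepal.Width, data = datos)
modelo_glm <- glm(Sepal.Length ~ Sepal.Width, data = datos, family = gaussian())

# The estimated coefficients agree up to numerical tolerance
all.equal(coef(modelo_lm), coef(modelo_glm))

# The GLM dispersion parameter equals the squared residual standard error
all.equal(summary(modelo_glm)$dispersion, summary(modelo_lm)$sigma^2)

# The residual deviance of the Gaussian GLM is the residual sum of squares
all.equal(deviance(modelo_glm), sum(residuals(modelo_lm)^2))
```

This is consistent with the output above: 0.8251 squared is about 0.6808, the reported dispersion parameter.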
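# Round trip of Sepal.Width through character and back to numeric.
# The values are unchanged, so this does not produce a 0/1 coding of the response.
iris$Sepal.Width <- as.character(iris$Sepal.Width)
iris$Sepal.Width <- as.numeric(iris$Sepal.Width)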
# Scatter plot with the response of modelgest (Sepal.Length) on the y axis,
# so that the fitted line lies on the same axes as the data
plot(Sepal.Length ~ Sepal.Width, iris, col = "darkblue",
  main = "Gaussian GLM: Sepal.Length ~ Sepal.Width",
  ylab = "Sepal.Length",
  xlab = "Sepal.Width", pch = 16)
# Add the fitted regression line
abline(coef(modelgest), col = "firebrick", lwd = 2.5)
ggplot(iris, aes(Sepal.Length)) +
  geom_histogram(binwidth = 0.25, fill = "red", colour = "black") +
  labs(x = "Sepal length", y = "Frequency") +
  ggtitle("Frequency vs sepal length")
# binwidth = 4 would be wider than the whole range of Sepal.Width (2.0 to 4.4),
# so 0.25 is used here to match the previous histogram
ggplot(iris, aes(Sepal.Width)) +
  geom_histogram(binwidth = 0.25, fill = "red", colour = "black") +
  labs(x = "Sepal width", y = "Frequency") +
  ggtitle("Frequency vs sepal width")
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_jitter(height = 0.10) +
  # Binomial fit: fails with the warning shown below because Sepal.Width is not a 0/1 response
  stat_smooth(method = "glm", method.args = list(family = "binomial")) +
  # LOESS smoother (default method)
  geom_smooth(color = "yellow") +
  # Linear fit
  geom_smooth(method = lm, color = "purple") +
  labs(x = "Sepal length", y = "Sepal width") +
  ggtitle("Fitted models of sepal width as a function of sepal length")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Computation failed in `stat_smooth()`
## Caused by error:
## ! y values must be 0 <= y <= 1
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
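The binomial smoother fails because a logistic model needs a response coded as 0 or 1, and `Sepal.Width` is a continuous measurement. A minimal sketch of how such a response could be built and modelled is given here. The median threshold and the names `datos`, `ancho_alto`, `modelo_log` and `rejilla` are assumptions made for illustration. The original analysis does not define a binary outcome.

```r
# Fresh copy of the built-in data set
datos <- datasets::iris

# Hypothetical 0/1 response: 1 when the sepal is wider than the median width
datos$ancho_alto <- as.integer(datos$Sepal.Width > median(datos$Sepal.Width))
table(datos$ancho_alto)

# Logistic regression of the binary response on sepal length
modelo_log <- glm(ancho_alto ~ Sepal.Length, data = datos, family = binomial())
summary(modelo_log)

# Predicted probabilities over the observed range of sepal length
rejilla <- data.frame(
  Sepal.Length = seq(min(datos$Sepal.Length), max(datos$Sepal.Length), by = 0.1)
)
rejilla$prob <- predict(modelo_log, newdata = rejilla, type = "response")
head(rejilla)
```

With a 0/1 column such as `ancho_alto` mapped to `y`, the `stat_smooth(method = "glm", method.args = list(family = "binomial"))` layer should no longer raise the "y values must be 0 <= y <= 1" error.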