1 Methods

  1. Z-scores
  2. Min-max scaling
  3. Standard deviation method
  4. Range method

1.1 When to standardize?

Standardize before:

  1. Cluster Analysis
  2. Principal Component Analysis
  3. k-nearest neighbors with Euclidean distance measures
  4. Support Vector Machine (SVM)
  5. Lasso and Ridge Regression
  6. Regression with an interaction between two variables. Centering them first makes the actual interaction with the dependent variable easier to see.
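
As a rough illustration of why scale matters for the methods listed above, the sketch below runs a principal component analysis on two variables with very different ranges, with and without standardization. The data frame `X` and the helper `var_share` are assumptions made for this example (two integer columns, `k1` in 100..1000 and `k2` in 10..100); exact numbers depend on the R version's sampling algorithm.

```r
# Hypothetical data: two independent variables on very different scales.
set.seed(123)
X <- data.frame(k1 = sample(100:1000, 1000, replace = TRUE),
                k2 = sample(10:100, 1000, replace = TRUE))

# PCA on the raw (centered but unscaled) data versus standardized data.
pca_raw <- prcomp(X)
pca_std <- prcomp(X, scale. = TRUE)

# Share of total variance captured by each component.
var_share <- function(p) p$sdev^2 / sum(p$sdev^2)
share_raw <- var_share(pca_raw)
share_std <- var_share(pca_std)

print(round(share_raw, 3))
print(round(share_std, 3))
```

Without scaling, the first component is driven almost entirely by `k1`, simply because its variance is much larger. After standardization both variables contribute on an equal footing.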

2 Z-scores

summary(X)
       k1               k2        
 Min.   : 100.0   Min.   : 10.00  
 1st Qu.: 328.0   1st Qu.: 32.75  
 Median : 541.0   Median : 55.00  
 Mean   : 547.5   Mean   : 54.84  
 3rd Qu.: 772.2   3rd Qu.: 77.00  
 Max.   :1000.0   Max.   :100.00  
X.scaled <-  scale(X, center= TRUE, scale=TRUE)
summary(X.scaled)
       k1                 k2           
 Min.   :-1.72778   Min.   :-1.724898  
 1st Qu.:-0.84756   1st Qu.:-0.849755  
 Median :-0.02524   Median : 0.006155  
 Mean   : 0.00000   Mean   : 0.000000  
 3rd Qu.: 0.86753   3rd Qu.: 0.852448  
 Max.   : 1.74679   Max.   : 1.737208  
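
The call to `scale(X, center = TRUE, scale = TRUE)` above subtracts each column's mean and divides by its sample standard deviation. A minimal sketch that checks this by hand follows; it assumes `X` is a data frame with two integer columns like the one summarized above (the generating code here is an assumption, so the printed numbers may not match exactly), and `z_manual` is a name introduced only for this example.

```r
# Assumed data, similar in range to the summary shown above.
set.seed(123)
X <- data.frame(k1 = sample(100:1000, 1000, replace = TRUE),
                k2 = sample(10:100, 1000, replace = TRUE))

# Z-scores via scale(): (x - mean(x)) / sd(x), column by column.
X.scaled <- scale(X, center = TRUE, scale = TRUE)

# The same computation written out explicitly.
z_manual <- sapply(X, function(x) (x - mean(x)) / sd(x))

# Both versions agree; each column now has mean 0 and standard deviation 1.
print(all.equal(z_manual, X.scaled, check.attributes = FALSE))
print(round(colMeans(X.scaled), 10))
print(apply(X.scaled, 2, sd))
```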

3 Min-max scaling

library(dplyr)
package 'dplyr' was built under R version 3.2.5
Note: the specification for S3 class 'AsIs' in package 'DBI' seems equivalent to one from package 'jsonlite': not turning on duplicate class definitions for this class.

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
summary(mins)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   10.0    32.5    55.0    55.0    77.5   100.0 
summary(rng)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   90.0   292.5   495.0   495.0   697.5   900.0 
summary(X.scaled)
       k1               k2        
 Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.2533   1st Qu.:0.2528  
 Median :0.4900   Median :0.5000  
 Mean   :0.4973   Mean   :0.4982  
 3rd Qu.:0.7469   3rd Qu.:0.7444  
 Max.   :1.0000   Max.   :1.0000  
boxplot(X.scaled)
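
The output above is consistent with subtracting each column's minimum (`mins`) and dividing by its range (`rng`), which maps every column onto [0, 1]. The sketch below shows one way to do this; it uses base R `sapply` rather than dplyr to stay self-contained, and the data frame `X` is an assumption that mimics the ranges shown earlier.

```r
# Assumed data, similar in range to the summaries shown above.
set.seed(123)
X <- data.frame(k1 = sample(100:1000, 1000, replace = TRUE),
                k2 = sample(10:100, 1000, replace = TRUE))

# Per-column minimum and range (max - min).
mins <- sapply(X, min)
rng  <- sapply(X, function(x) diff(range(x)))

# Min-max scaling: (x - min(x)) / (max(x) - min(x)).
X.scaled <- data.frame(scale(X, center = mins, scale = rng))

summary(X.scaled)
```

Passing numeric vectors to `center` and `scale` makes `scale()` subtract and divide column by column, so no explicit loop is needed.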

4 Standard deviation method

summary(X.scaled)
       k1               k2        
 Min.   :0.3861   Min.   :0.3847  
 1st Qu.:1.2663   1st Qu.:1.2598  
 Median :2.0886   Median :2.1157  
 Mean   :2.1138   Mean   :2.1096  
 3rd Qu.:2.9814   3rd Qu.:2.9620  
 Max.   :3.8606   Max.   :3.8468  
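
The summary above matches dividing each column by its standard deviation without centering, so the values stay positive but have unit spread. A minimal sketch, assuming the same kind of data frame `X` as before (the generating code and the name `sds` are assumptions for this example):

```r
# Assumed data, similar in range to the summaries shown above.
set.seed(123)
X <- data.frame(k1 = sample(100:1000, 1000, replace = TRUE),
                k2 = sample(10:100, 1000, replace = TRUE))

# Per-column sample standard deviations.
sds <- sapply(X, sd, na.rm = TRUE)

# Divide by the standard deviation only; the mean is not subtracted.
X.scaled <- data.frame(scale(X, center = FALSE, scale = sds))

summary(X.scaled)
sapply(X.scaled, sd)  # each column now has standard deviation 1
```

The standard deviations are passed explicitly because `scale(X, center = FALSE, scale = TRUE)` would divide by the root mean square of each column instead.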

5 Range method

summary(X.scaled)
       k1               k2        
 Min.   :0.1111   Min.   :0.1111  
 1st Qu.:0.3644   1st Qu.:0.3639  
 Median :0.6011   Median :0.6111  
 Mean   :0.6084   Mean   :0.6093  
 3rd Qu.:0.8581   3rd Qu.:0.8556  
 Max.   :1.1111   Max.   :1.1111  
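
The summary above is consistent with dividing each column by its range (max minus min) without subtracting the minimum. The result spans exactly one unit but does not start at zero. A minimal sketch under the same assumptions about `X` as in the earlier examples (the generating code is hypothetical):

```r
# Assumed data, similar in range to the summaries shown above.
set.seed(123)
X <- data.frame(k1 = sample(100:1000, 1000, replace = TRUE),
                k2 = sample(10:100, 1000, replace = TRUE))

# Per-column range (max - min).
rng <- sapply(X, function(x) diff(range(x)))

# Divide by the range only; no centering.
X.scaled <- data.frame(scale(X, center = FALSE, scale = rng))

summary(X.scaled)
sapply(X.scaled, function(x) diff(range(x)))  # each column now has range 1
```

Compared with min-max scaling, this keeps the ratio between values intact (a value twice as large stays twice as large), at the cost of not bounding the result to [0, 1].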
