Correlation

1. Find the names of the variables in the data set.

dat <- iris[1:4]
names(dat)

## [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"

2. Plot the data

plot(dat)

3. Find the correlation between the variables using the function cor(). What is this correlation method? By default, cor() uses Pearson's product-moment correlation.

cor(dat)

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000  -0.1175698    0.8717538   0.8179411
## Sepal.Width    -0.1175698   1.0000000   -0.4284401  -0.3661259
## Petal.Length    0.8717538  -0.4284401    1.0000000   0.9628654
## Petal.Width     0.8179411  -0.3661259    0.9628654   1.0000000
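To make the default method concrete, here is a minimal sketch that recomputes one Pearson coefficient by hand as the covariance divided by the product of the standard deviations. The variable names x, y and r_manual are illustrative and are not part of the original exercise.

```r
# Pearson correlation by hand for one pair of iris columns
x <- iris$Sepal.Length
y <- iris$Petal.Length

# r = cov(x, y) / (sd(x) * sd(y))
r_manual <- cov(x, y) / (sd(x) * sd(y))

# Compare with the built-in default (method = "pearson")
all.equal(r_manual, cor(x, y))
```

The comparison should return TRUE, which agrees with the Sepal.Length / Petal.Length entry in the matrix above.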

4. Use “Kendall’s” and “Spearman’s” correlation methods.
Kendall’s

cor(dat, use = "everything", method = "kendall")

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length   1.00000000 -0.07699679    0.7185159   0.6553086
## Sepal.Width   -0.07699679  1.00000000   -0.1859944  -0.1571257
## Petal.Length   0.71851593 -0.18599442    1.0000000   0.8068907
## Petal.Width    0.65530856 -0.15712566    0.8068907   1.0000000

Spearman’s

cor(dat, use = "everything", method = "spearman")

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000  -0.1667777    0.8818981   0.8342888
## Sepal.Width    -0.1667777   1.0000000   -0.3096351  -0.2890317
## Petal.Length    0.8818981  -0.3096351    1.0000000   0.9376668
## Petal.Width     0.8342888  -0.2890317    0.9376668   1.0000000
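Spearman's coefficient can be read as Pearson's coefficient applied to the ranks of the data. The short sketch below checks this for one pair of columns; the names x, y and rho_ranks are illustrative only.

```r
# Spearman correlation as Pearson correlation of ranks
x <- iris$Sepal.Length
y <- iris$Sepal.Width

# rank() assigns average ranks to ties by default
rho_ranks <- cor(rank(x), rank(y))

# Compare with the built-in Spearman method
all.equal(rho_ranks, cor(x, y, method = "spearman"))
```

Because only ranks are used, Spearman's method measures monotone association rather than strictly linear association.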

5. Conduct the correlation test between Sepal.Length and Sepal.Width.

cor.test(dat$Sepal.Length, dat$Sepal.Width)

## 
##  Pearson's product-moment correlation
## 
## data:  dat$Sepal.Length and dat$Sepal.Width
## t = -1.4403, df = 148, p-value = 0.1519
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.27269325  0.04351158
## sample estimates:
##        cor 
## -0.1175698
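The printed result can also be inspected piece by piece. The sketch below stores the test in a variable (ct, an illustrative name) and recomputes the t statistic from the estimate, using t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom.

```r
# Store the test result and pull out individual components
ct <- cor.test(iris$Sepal.Length, iris$Sepal.Width)

ct$estimate   # sample correlation
ct$p.value    # two-sided p-value
ct$conf.int   # 95 percent confidence interval

# Recompute the t statistic from r and the sample size
r <- unname(ct$estimate)
n <- nrow(iris)
t_manual <- r * sqrt(n - 2) / sqrt(1 - r^2)

all.equal(t_manual, unname(ct$statistic))
```

With a p-value of about 0.15, the test does not reject the hypothesis of zero correlation between these two variables at the usual 5 percent level.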

6. Store the correlation matrix cor(dat) in a variable named “cr”. Any valid name would work.

cr <- cor(dat)

7. Install and load the package “corrplot” to visualize the correlation matrix with the function corrplot().

#install.packages("corrplot")
library(corrplot)

8. Visualize the same data set by different methods, for example “pie”, “color”, and “number”.

corrplot(cr)

corrplot(cr, method = "pie")

corrplot(cr, method = "color")

corrplot(cr, method = "number")

9. What do you observe from all these plots? Are they repetitive? Can you make the graphs upper triangular and lower triangular?

All of these graphs use the same color scale to convey the same information: dark blue marks a correlation of 1, lighter blue a weaker positive correlation, and light orange a negative correlation. We can also see that the correlation matrix is symmetric, so the upper and lower triangles repeat each other and plotting one of them is enough.
Lower Triangular:

corrplot(cr, type = "lower")

Upper Triangular:

corrplot(cr, type = "upper")
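The symmetry that makes a triangular plot sufficient can be checked directly in base R. This is a small sketch; it recomputes cr so that it runs on its own.

```r
# Correlation matrix of the four numeric iris columns
cr <- cor(iris[1:4])

# A correlation matrix equals its own transpose
isSymmetric(cr)

# The distinct off-diagonal correlations (upper triangle only)
cr[upper.tri(cr)]
```

For four variables there are six distinct pairs, so the upper triangle holds six values.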

10. Use a different function, pairs(), to draw a scatter plot for every pair of variables in columns 1 to 4 of “iris”.

pairs(iris[1:4])
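pairs() also accepts custom panel functions. As a rough sketch, the helper below (panel_cor, a hypothetical name that is not part of base R) prints the Pearson coefficient in each upper panel while the lower panels keep the scatter plots.

```r
# Hypothetical helper: show the correlation coefficient in a panel
panel_cor <- function(x, y, ...) {
  # Temporarily switch to unit coordinates so the text can be centered
  old <- par(usr = c(0, 1, 0, 1))
  on.exit(par(old))
  text(0.5, 0.5, format(cor(x, y), digits = 2))
}

# Scatter plots below the diagonal, coefficients above it
pairs(iris[1:4], upper.panel = panel_cor)
```

This gives a base-R view that combines scatter plots and coefficients in one figure without extra packages.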

11. Visualize histograms, scatter plots with fitted curves, and correlation coefficients in the same matrix plot, using pairs.panels() from the “psych” package.

#install.packages("psych")
library(psych)
pairs.panels(iris[1:4])

12. Change the color of the histograms:

pairs.panels(iris[1:4], hist.col = "#FF8000")

Kevin Torres

March 1, 2019