Use the built-in data set “iris” for the following experiment.

  1. Find the names of the variables in the data set.
names(iris)
## [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
## [5] "Species"
  1. Plot the data set.
plot(iris)

  1. Find the correlation between the variables using the function cor(). What is this correlation method?
cor(iris[1:4])  # with no method argument, cor() uses Pearson's correlation
##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000  -0.1175698    0.8717538   0.8179411
## Sepal.Width    -0.1175698   1.0000000   -0.4284401  -0.3661259
## Petal.Length    0.8717538  -0.4284401    1.0000000   0.9628654
## Petal.Width     0.8179411  -0.3661259    0.9628654   1.0000000
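
When no method is given, cor() computes Pearson's product-moment correlation, which is the covariance of two variables divided by the product of their standard deviations. A minimal sketch that checks this for one pair of columns (the names x, y, r_manual and r_builtin are only illustrative):

```r
# Pearson's r by hand for one pair of iris columns.
x <- iris$Sepal.Length
y <- iris$Petal.Length

r_manual  <- cov(x, y) / (sd(x) * sd(y))  # definition of Pearson's r
r_builtin <- cor(x, y)                    # default method is "pearson"

all.equal(r_manual, r_builtin)            # the two values agree
```

The same value appears in the Sepal.Length / Petal.Length cell of the matrix above.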
  1. Use the “Kendall” and “Spearman” correlation methods.
cor(iris[1:4], method="kendall")
##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length   1.00000000 -0.07699679    0.7185159   0.6553086
## Sepal.Width   -0.07699679  1.00000000   -0.1859944  -0.1571257
## Petal.Length   0.71851593 -0.18599442    1.0000000   0.8068907
## Petal.Width    0.65530856 -0.15712566    0.8068907   1.0000000
cor(iris[1:4], method="spearman")
##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000  -0.1667777    0.8818981   0.8342888
## Sepal.Width    -0.1667777   1.0000000   -0.3096351  -0.2890317
## Petal.Length    0.8818981  -0.3096351    1.0000000   0.9376668
## Petal.Width     0.8342888  -0.2890317    0.9376668   1.0000000
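
Spearman's coefficient can be read as Pearson's correlation applied to the ranks of the data instead of the raw values. A short sketch of that relationship for one pair of columns (variable names are illustrative, not part of the exercise):

```r
# Spearman's rho equals Pearson's r computed on the ranks.
x <- iris$Sepal.Length
y <- iris$Petal.Length

rho_builtin <- cor(x, y, method = "spearman")
rho_ranks   <- cor(rank(x), rank(y))  # rank() averages tied values by default

all.equal(rho_builtin, rho_ranks)     # the two values agree
```

Because only the ordering of the observations matters, rank-based methods such as Spearman's and Kendall's are less sensitive to outliers than Pearson's.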
  1. Conduct the correlation test between Sepal.Length and Sepal.Width
cor(iris[1:2])
##              Sepal.Length Sepal.Width
## Sepal.Length    1.0000000  -0.1175698
## Sepal.Width    -0.1175698   1.0000000
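
Note that cor() only returns the coefficient. A hypothesis test of whether the correlation differs from zero is available through cor.test() in base R. A minimal sketch, assuming the default Pearson test is what the task intends (the name ct is illustrative):

```r
# Correlation test between the two sepal measurements.
ct <- cor.test(iris$Sepal.Length, iris$Sepal.Width)

ct$estimate   # sample correlation, the same value cor() reports
ct$p.value    # p-value for the null hypothesis of zero correlation
ct$conf.int   # 95% confidence interval for the correlation
```

Printing ct on its own shows the full test summary; the method argument accepts "kendall" and "spearman" here as well.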
  1. Let us denote cor(data name) as “cr”. You can use any name though!
cr <- cor(iris[1:4])
  1. Install and load the package “corrplot”, then visualize the correlation matrix with the function corrplot().
library(corrplot)
## corrplot 0.84 loaded
corrplot(cr)

  1. Visualize the same data set by different methods, for example “pie”, “color” and “number”.
corrplot(cr, method="pie")

corrplot(cr, method="color")

corrplot(cr, method="number")

  1. What do you observe from all these plots? Are they repetitive? Can you make the graphs upper triangular or lower triangular?

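Yes, the plots are repetitive. The correlation matrix is symmetric about its diagonal, so the upper and lower triangles show the same values. The graph can be restricted to one triangle with the type argument of corrplot(), which accepts "upper" or "lower".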
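
The symmetry can be checked directly, and a single triangle can be drawn with the type argument. This is a sketch that assumes corrplot is installed; the requireNamespace() guard is added here only so the snippet also runs without the package:

```r
cr <- cor(iris[1:4])

# A correlation matrix equals its own transpose.
isSymmetric(cr)

# Draw only one triangle of the matrix (skipped if corrplot is missing).
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(cr, type = "upper")
  corrplot::corrplot(cr, type = "lower")
}
```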

  1. Try a different function to draw scatter plots for the variables in columns 1 to 4 of “iris”. Use pairs().
pairs(iris[1:4])
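
As an optional variation, the points can be colored by species so that group structure is visible in every panel. The three color names and the variable species_cols below are arbitrary choices for illustration:

```r
# Map each species (a factor with three levels) to one color.
species_cols <- c("red", "green3", "blue")[as.integer(iris$Species)]

# Scatter plot matrix of the four measurements, one color per species.
pairs(iris[1:4], col = species_cols, pch = 19)
```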

  1. Visualize histograms, scatter plots with fitted curves, and correlation coefficients in the same matrix plot. You may need to install the package psych and then use pairs.panels().
library(psych)
pairs.panels(iris[1:4], hist.col="purple")

  1. Change the color of the histogram using a color code from the website: www.rapidtables.com/web/color/RGB_Color.html
pairs.panels(iris[1:4], hist.col="#00CED1")
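
If a color is given as red, green and blue values rather than a hex code, base R's rgb() can build the hex string. A small sketch using the RGB triple that corresponds to the code above (hist_col is an illustrative name, and the requireNamespace() guard is added only so the snippet runs without psych):

```r
# Convert an RGB triple on the 0-255 scale to a hex color string.
hist_col <- rgb(0, 206, 209, maxColorValue = 255)
hist_col            # "#00CED1"

# col2rgb() goes the other way, from hex code back to RGB values.
col2rgb(hist_col)

if (requireNamespace("psych", quietly = TRUE)) {
  psych::pairs.panels(iris[1:4], hist.col = hist_col)
}
```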