Use the built-in data set “iris” for the following exercises.

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
  1. Find the names of the variables in the data set.
names(iris)
## [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
## [5] "Species"
  1. Plot the data set.
plot(iris)

  1. Find the correlation between the variables using the function cor(). Which correlation method does cor() use by default?
cor(iris[1:4])
##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000  -0.1175698    0.8717538   0.8179411
## Sepal.Width    -0.1175698   1.0000000   -0.4284401  -0.3661259
## Petal.Length    0.8717538  -0.4284401    1.0000000   0.9628654
## Petal.Width     0.8179411  -0.3661259    0.9628654   1.0000000

cor() computes the Pearson correlation here, which is its default method (method = "pearson").
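As a quick check of what the default computes, the Pearson coefficient can be reproduced from its definition: the covariance divided by the product of the two standard deviations. This is only a minimal sketch; the names x, y, r_manual, and r_builtin are chosen here for illustration and are not part of the exercise.

```r
# Pearson correlation from its definition: cov(x, y) / (sd(x) * sd(y))
x <- iris$Sepal.Length
y <- iris$Petal.Length

r_manual  <- cov(x, y) / (sd(x) * sd(y))
r_builtin <- cor(x, y)            # method = "pearson" is the default

# Both values should agree up to floating-point error
all.equal(r_manual, r_builtin)
```

The value should match the Sepal.Length / Petal.Length entry of the matrix printed above.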

  1. Repeat the calculation with Kendall’s and Spearman’s correlation methods.
cor(iris[1:4], method = "kendall")
##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length   1.00000000 -0.07699679    0.7185159   0.6553086
## Sepal.Width   -0.07699679  1.00000000   -0.1859944  -0.1571257
## Petal.Length   0.71851593 -0.18599442    1.0000000   0.8068907
## Petal.Width    0.65530856 -0.15712566    0.8068907   1.0000000
cor(iris[1:4], method = "spearman")
##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000  -0.1667777    0.8818981   0.8342888
## Sepal.Width    -0.1667777   1.0000000   -0.3096351  -0.2890317
## Petal.Length    0.8818981  -0.3096351    1.0000000   0.9376668
## Petal.Width     0.8342888  -0.2890317    0.9376668   1.0000000
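To see how Spearman's method relates to Pearson's, note that Spearman's coefficient is the Pearson correlation of the ranked values. The sketch below checks this on one pair of columns; the variable names are illustrative only.

```r
x <- iris$Sepal.Length
y <- iris$Petal.Length

rho_builtin <- cor(x, y, method = "spearman")
rho_ranks   <- cor(rank(x), rank(y))   # rank() averages tied values by default

# The two computations should give the same coefficient
all.equal(rho_builtin, rho_ranks)
```

Because both rank-based methods depend only on the ordering of the values, they are less sensitive to outliers than Pearson's coefficient.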
  1. Conduct the correlation test between Sepal.Length and Sepal.Width.
cor(iris[1:2])
##              Sepal.Length Sepal.Width
## Sepal.Length    1.0000000  -0.1175698
## Sepal.Width    -0.1175698   1.0000000
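The cor() call above returns only the coefficient. A significance test is available through cor.test() in the stats package; the sketch below shows one way to run it, storing the result under the illustrative name ct.

```r
# Test H0: the true correlation between the two variables is zero
ct <- cor.test(iris$Sepal.Length, iris$Sepal.Width)   # Pearson by default

ct$estimate    # sample correlation, the same value cor() reports
ct$p.value     # p-value of the test
ct$conf.int    # 95% confidence interval for the correlation
```

Printing ct itself shows the full test summary, including the test statistic and degrees of freedom.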
  1. Store the result of cor() in an object named “cr”. You can use any name, though.
cr <- cor(iris[1:4])
  1. Install the package “corrplot” and visualize the correlation matrix with the function corrplot().
library(corrplot)
## corrplot 0.84 loaded
corrplot(cr)

  1. Visualize the same data by different methods, for example “pie”, “color”, and “number”.
corrplot(cr, method = "pie")

corrplot(cr, method = "color")

corrplot(cr, method = "number")

  1. What do you observe from all these plots? Are they repetitive? Can you draw only the upper triangle or the lower triangle?
corrplot(cr, type = "upper")

corrplot(cr, type = "lower")

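These plots are all repetitive, since they represent the same correlation matrix in different forms. The matrix is also symmetric, so the upper or lower triangle alone carries all the information.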
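The redundancy can be checked directly in base R. The following sketch confirms that the matrix is symmetric and lists each pair of variables once; the names idx and unique_pairs are illustrative.

```r
cr <- cor(iris[1:4])

# A correlation matrix equals its own transpose
isSymmetric(cr)

# Keep each pair of variables once by taking the strict upper triangle
idx <- which(upper.tri(cr), arr.ind = TRUE)
unique_pairs <- data.frame(
  var1 = rownames(cr)[idx[, "row"]],
  var2 = colnames(cr)[idx[, "col"]],
  r    = cr[upper.tri(cr)]
)
unique_pairs
```

With four variables there are six distinct pairs, which is exactly what the triangular plots display.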

  1. Try a different function to draw a scatter plot for any two variables from columns 1 to 4 in “iris”. Use pairs().
pairs(iris[3:4])

  1. Visualize the histograms, the scatterplots with fitted curves, and the correlations in the same matrix plot. You may need to install the package psych and then use pairs.panels().
library(psych)
pairs.panels(iris[1:4])
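A similar layout can be approximated with base graphics alone by passing custom panel functions to pairs(). The sketch below is not how pairs.panels() is implemented; panel_hist and panel_cor are hypothetical helpers written for this illustration, loosely following the pattern in the pairs() help page.

```r
# Diagonal panel: histogram of one variable, rescaled to the panel height
panel_hist <- function(x, ...) {
  usr <- par("usr")
  on.exit(par(usr = usr))
  par(usr = c(usr[1:2], 0, 1.5))
  h <- hist(x, plot = FALSE)
  nb <- length(h$breaks)
  rect(h$breaks[-nb], 0, h$breaks[-1], h$counts / max(h$counts), col = "grey")
}

# Upper panel: print the Pearson correlation of the two variables
panel_cor <- function(x, y, ...) {
  usr <- par("usr")
  on.exit(par(usr = usr))
  par(usr = c(0, 1, 0, 1))
  text(0.5, 0.5, format(cor(x, y), digits = 2))
}

pairs(iris[1:4],
      lower.panel = panel.smooth,   # scatterplot with a lowess curve
      upper.panel = panel_cor,
      diag.panel  = panel_hist)
```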

  1. Change the color of the histograms. You can pick a color from this website: http://www.rapidtables.com/web/color/RGB_Color.html
pairs.panels(iris[1:4], hist.col = "orange")
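Colors picked from an RGB table can also be given as hex strings rather than names. The sketch below builds one with rgb(); the name orange_hex is illustrative, and the final call is guarded because it assumes the psych package is installed.

```r
# Build a hex color string from RGB components on a 0-255 scale
orange_hex <- rgb(255, 165, 0, maxColorValue = 255)
orange_hex                      # "#FFA500"

# Look up the RGB components of a named R color
col2rgb("orange")

# Hex strings can be passed wherever a color name is accepted
if (requireNamespace("psych", quietly = TRUE)) {
  psych::pairs.panels(iris[1:4], hist.col = orange_hex)
}
```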