Data Visualization

Introduction

Data visualization is one of the most important parts of data analysis because it lets analysts and data scientists communicate their findings to others. As a powerful statistical environment, R supports several visualization systems, including ggplot2, grid graphics, and base R graphics. ggplot2 and grid graphics are more advanced and take more effort to learn, whereas base R graphics is easier and well suited to beginners. Mastering base R graphics is therefore an important first step and a solid foundation for learning the more advanced graphics systems in R later on.

The examples below use a few packages, mainly for the datasets they provide. You can install a package by calling install.packages and passing the name of the package you want, like this: install.packages("MASS")
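As a small convenience, the packages can be checked first so that only missing ones are installed. The sketch below is one possible way to do this. The helper name missing_packages is hypothetical (it is not part of R or of the original notebook), and the install call is left commented out.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages loaded in this notebook
needed <- c("MASS", "caret", "tidyverse", "corrplot")
to_install <- missing_packages(needed)

# Install only what is missing (commented out so the sketch has no side effects)
# if (length(to_install) > 0) install.packages(to_install)
```

Note that requireNamespace loads a package's namespace when it is found, which is harmless here because the same packages are attached right afterwards.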

Import library

# Import libraries
library(MASS)       # provides the Cars93 dataset
library(caret)      # attaches lattice, which provides dotplot()
library(tidyverse)
1. Visualize one continuous variable

Several functions help visualize a single numeric variable, such as hist, dotplot, and even boxplot.

# Dataset 
dataset<-mtcars
head(dataset)
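# Histogram and Density
old_par<-par(mfrow=c(1,2)) # two panels side by side; keep the previous settings
hist(dataset$mpg,main="Histogram",col=4,xlab="MPG")
plot(density(dataset$mpg),col=2,main="Density plot")
par(old_par) # restore the single-panel layout for the plots that follow

# xlim and ylim can be added to either call if necessary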
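To illustrate the comment about xlim and ylim, here is a minimal sketch that fixes the axis ranges of the histogram. The limits and the number of breaks are illustrative assumptions, not values from the original notebook.

```r
# Histogram of mpg with explicit axis limits (illustrative values)
mpg <- mtcars$mpg
h <- hist(mpg,
          breaks = 10,        # suggested number of bins; R may adjust it
          xlim = c(5, 40),    # x-axis range, wider than range(mpg)
          ylim = c(0, 10),    # y-axis range for the counts
          main = "Histogram with fixed axes",
          col = 4, xlab = "MPG")

# hist() invisibly returns the bin breaks and counts it used
h$breaks
h$counts
```

Fixing the limits is mainly useful when several histograms need to be compared on the same scale.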
# Boxplot
boxplot(dataset$mpg,main="Boxplot",col=4,ylab="mpg")

# Dotplot
# A dot plot is not very informative here; it is shown for demonstration only
dotplot(dataset$mpg,main="Dot plot",xlab="mpg")

2. Visualize one categorical variable

A bar plot is a common choice for a categorical variable because it shows the count of each category at a glance.

# Bar plot
# barplot() needs counts, so build a frequency table first (Cars93 comes from MASS)
my_table<-table(Cars93$Cylinders)
barplot(my_table,main="Bar plot",xlab="Cylinders",ylab="Frequency or Count",col=3)
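To make the counting step concrete, the sketch below builds a frequency table from the built-in mtcars data (used here instead of Cars93 so that no extra package is needed) and passes it to barplot. The conversion to proportions with prop.table is an optional extra, included as an assumption about what a reader might want next.

```r
# Frequency table of the number of cylinders in mtcars
cyl_counts <- table(mtcars$cyl)
cyl_counts

# Optional: convert counts to proportions that sum to 1
cyl_props <- prop.table(cyl_counts)

# barplot() draws one bar per table entry and returns the bar midpoints
mids <- barplot(cyl_counts, main = "Cars by cylinders",
                xlab = "Cylinders", ylab = "Count", col = 3)
```

The returned midpoints can be passed to text if the bars need value labels.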

3. Visualize two variables

Two continuous variables

# Scatter plot
plot(dataset$mpg,dataset$hp,type="p",pch=16,col=4,xlab="mpg",ylab="hp")

Sunflower plot

# Sunflower plot 
sunflowerplot(dataset$mpg~dataset$hp,main="Sunflower plot",col=2)

# This plot is useful when several data points fall on the same coordinates
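A small made-up example shows what happens with repeated points. The toy vectors below are illustrative only: locations with several observations get one petal per observation, and the value returned by sunflowerplot records how many observations share each location.

```r
# Toy data with repeated (x, y) pairs: (1, 1) appears 3 times, (3, 3) twice
x <- c(1, 1, 1, 2, 3, 3)
y <- c(1, 1, 1, 2, 3, 3)

# Locations with several observations are drawn with one petal per observation
sf <- sunflowerplot(x, y, main = "Sunflower plot of repeated points")

# The invisible return value lists each distinct point and its multiplicity
data.frame(x = sf$x, y = sf$y, number = sf$number)
```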

Boxplot

# A boxplot can compare a numeric variable (mpg) across the groups of another variable (gear)
boxplot(dataset$mpg~dataset$gear)
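The same grouped boxplot can be written with the data argument of the formula interface, which avoids repeating the data frame name. The sketch below uses mtcars directly, and the titles and labels are illustrative. It also looks at the statistics that boxplot returns.

```r
# Formula interface with a data argument: mpg grouped by gear
bp <- boxplot(mpg ~ gear, data = mtcars,
              main = "MPG by number of gears",
              xlab = "Gears", ylab = "MPG", col = 4)

# boxplot() invisibly returns the statistics it drew
bp$names   # group labels
bp$n       # number of observations per group
bp$stats   # per group: lower whisker, lower hinge, median, upper hinge, upper whisker
```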

Overlaid plot

# type="o" draws the points on top of the line that connects them
plot(1:10,20:11,type="o",col=4)
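For comparison, the sketch below draws the same data with four common values of the type argument. The 2 x 2 grid and the panel titles are illustrative choices.

```r
x <- 1:10
y <- 20:11

# Compare four common values of `type` in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
types <- c(p = "points only", l = "lines only",
           b = "both, with gaps", o = "both, overplotted")
for (t in names(types)) {
  plot(x, y, type = t, col = 4,
       main = paste0('type = "', t, '": ', types[[t]]))
}
par(old_par)  # restore the previous layout
```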

Adding points or lines to existing plot

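dataset$gear<-as.factor(dataset$gear)
# Index the colour and symbol vectors by gear so that each group matches the legend
plot(dataset$mpg~dataset$hp,col=c(3,4,5)[dataset$gear],main="Scatter Plot\n Subtitle",pch=c(3,4,5)[dataset$gear])
legend("topright",legend = c("3","4","5"),title="gear",col = c(3,4,5),bty="n",pch = c(3,4,5))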
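Since this section is about adding to an existing plot, here is a hedged sketch of a grouped scatter plot built up step by step: an empty frame first, then one points call per gear group. The style codes 3, 4, 5 follow the ones used above, and the variable names are illustrative.

```r
gear <- factor(mtcars$gear)   # groups: "3", "4", "5"
styles <- c(3, 4, 5)          # colour and symbol code per group

# type = "n" sets up axes and labels without drawing any points
plot(mtcars$hp, mtcars$mpg, type = "n",
     xlab = "hp", ylab = "mpg", main = "Scatter plot built up by group")

# Add each group to the existing plot with points()
for (i in seq_along(levels(gear))) {
  in_group <- gear == levels(gear)[i]
  points(mtcars$hp[in_group], mtcars$mpg[in_group],
         col = styles[i], pch = styles[i])
}

legend("topright", legend = levels(gear), title = "gear",
       col = styles, pch = styles, bty = "n")
```

Building the plot in layers keeps the mapping between group, colour, and symbol explicit, so the legend cannot drift out of sync with the points.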

plot(dataset$mpg,dataset$hp,col=3,main="Scatter Plot\n Subtitle")
# Fit Friedman's super smoother; larger bass values (up to 10) give a smoother curve
sup<-supsmu(dataset$mpg,dataset$hp,bass=5)
# Adding smooth line
lines(sup,lwd=3,col=5)
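Other kinds of lines can be added in the same way. The sketch below is an alternative to the super smoother rather than the notebook's own method: it overlays a straight regression line with abline and a lowess curve with lines. Colours and line types are illustrative.

```r
# Scatter plot of mpg against hp
plot(mtcars$hp, mtcars$mpg, pch = 16, col = 4, xlab = "hp", ylab = "mpg",
     main = "Scatter plot with fitted lines")

# Straight line from a simple linear regression
fit <- lm(mpg ~ hp, data = mtcars)
abline(fit, col = 2, lwd = 2)

# Locally weighted smoother as a flexible alternative
lines(lowess(mtcars$hp, mtcars$mpg), col = 3, lwd = 2, lty = 2)

legend("topright", legend = c("lm", "lowess"), col = c(2, 3),
       lwd = 2, lty = c(1, 2), bty = "n")
```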

Adding text to the existing plot

# text() needs the x and y coordinates at which the label is drawn
sunflowerplot(dataset$mpg~dataset$hp,main="Sunflower plot",col=2)
text(300,20,"Outlier-->",font=1,srt=-20,cex=0.9,col=4)
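Instead of typing coordinates by hand, the position of a label can be taken from the data. The sketch below is one possible approach: it finds the car with the highest horsepower in mtcars and labels that point with its row name.

```r
# Locate the car with the highest horsepower
i <- which.max(mtcars$hp)
label <- rownames(mtcars)[i]

plot(mtcars$hp, mtcars$mpg, pch = 16, col = 4, xlab = "hp", ylab = "mpg",
     main = "Labelling a point")

# pos = 2 places the label to the left of the point so it stays inside the plot
text(mtcars$hp[i], mtcars$mpg[i], labels = label, pos = 2, cex = 0.9, col = 2)
```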

Set layout

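row1<-c(1,1,1) # the whole first row belongs to plot 1
row2<-c(1,1,1) # the second row belongs to the same plot
row3<-c(2,0,3) # two small plots in the third row; 0 means no plot
my_layout<-matrix(c(row1,row2,row3),byrow=TRUE,ncol=3)
layout(my_layout)
plot(dataset$mpg~dataset$hp,main="scatter plot")
boxplot(dataset$mpg,main="Boxplot")
sunflowerplot(dataset$hp~dataset$wt)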
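Before drawing, it can help to check how a layout matrix maps to plot regions. The sketch below uses a smaller, illustrative 2 x 2 matrix and the heights argument, then outlines the regions with layout.show.

```r
# 2 x 2 grid: plot 1 spans the top row, plots 2 and 3 share the bottom row
m <- matrix(c(1, 1,
              2, 3), nrow = 2, byrow = TRUE)

# heights makes the top row twice as tall as the bottom row
n_fig <- layout(m, heights = c(2, 1))

# Outline the regions and their numbers to check the arrangement
layout.show(n_fig)

# Reset to a single plot region afterwards
layout(1)
```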

Other interesting plots

library(corrplot)
mt_co<-cor(mtcars)
corrplot(mt_co,method = "circle")
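The correlation matrix can also be inspected directly. The sketch below ranks the correlations with mpg and draws a heatmap with base graphics. The heatmap is a swapped-in base R alternative for readers without corrplot, not a replacement for the plot above.

```r
# Correlation matrix of all columns in mtcars
mt_co <- cor(mtcars)

# Correlations of every other variable with mpg, most negative first
mpg_cor <- sort(mt_co["mpg", colnames(mt_co) != "mpg"])
round(mpg_cor, 2)

# A base-graphics view of the same matrix (no extra package needed)
heatmap(mt_co, symm = TRUE, main = "Correlation heatmap")
```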

Conclusion

Base R provides a wide range of tools for data visualization. The next demonstration will use the ggplot2 package for more interesting graphics.
