More on correlation

To inspect several variables at once, draw a pairwise scatterplot matrix: pairs() plots every column of the input against every other column.

pairs(mtcars)
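With all eleven columns of mtcars the panels become small. The following is a minimal sketch that restricts the matrix to a few columns and adds a smoothed trend line to each panel. The choice of columns is only an illustrative assumption.

```r
# Subset of mtcars columns, chosen for illustration only
vars <- c("mpg", "disp", "hp", "wt")

# panel.smooth() draws the points plus a lowess curve in each panel
pairs(mtcars[, vars], panel = panel.smooth,
      main = "Selected mtcars variables")
```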

crabs <- read.csv("./crabs.csv")
cor(mtcars)
##             mpg        cyl       disp         hp        drat         wt
## mpg   1.0000000 -0.8521620 -0.8475514 -0.7761684  0.68117191 -0.8676594
## cyl  -0.8521620  1.0000000  0.9020329  0.8324475 -0.69993811  0.7824958
## disp -0.8475514  0.9020329  1.0000000  0.7909486 -0.71021393  0.8879799
## hp   -0.7761684  0.8324475  0.7909486  1.0000000 -0.44875912  0.6587479
## drat  0.6811719 -0.6999381 -0.7102139 -0.4487591  1.00000000 -0.7124406
## wt   -0.8676594  0.7824958  0.8879799  0.6587479 -0.71244065  1.0000000
## qsec  0.4186840 -0.5912421 -0.4336979 -0.7082234  0.09120476 -0.1747159
## vs    0.6640389 -0.8108118 -0.7104159 -0.7230967  0.44027846 -0.5549157
## am    0.5998324 -0.5226070 -0.5912270 -0.2432043  0.71271113 -0.6924953
## gear  0.4802848 -0.4926866 -0.5555692 -0.1257043  0.69961013 -0.5832870
## carb -0.5509251  0.5269883  0.3949769  0.7498125 -0.09078980  0.4276059
##             qsec         vs          am       gear        carb
## mpg   0.41868403  0.6640389  0.59983243  0.4802848 -0.55092507
## cyl  -0.59124207 -0.8108118 -0.52260705 -0.4926866  0.52698829
## disp -0.43369788 -0.7104159 -0.59122704 -0.5555692  0.39497686
## hp   -0.70822339 -0.7230967 -0.24320426 -0.1257043  0.74981247
## drat  0.09120476  0.4402785  0.71271113  0.6996101 -0.09078980
## wt   -0.17471588 -0.5549157 -0.69249526 -0.5832870  0.42760594
## qsec  1.00000000  0.7445354 -0.22986086 -0.2126822 -0.65624923
## vs    0.74453544  1.0000000  0.16834512  0.2060233 -0.56960714
## am   -0.22986086  0.1683451  1.00000000  0.7940588  0.05753435
## gear -0.21268223  0.2060233  0.79405876  1.0000000  0.27407284
## carb -0.65624923 -0.5696071  0.05753435  0.2740728  1.00000000
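The full matrix is hard to scan by eye. As a small sketch, one column can be pulled out and sorted to see which variables relate most strongly to mpg. The variable names cm, mpg_cor and others are introduced here for illustration.

```r
# Correlation matrix of all columns of mtcars
cm <- cor(mtcars)

# Correlations of every variable with mpg, from most negative to most positive
mpg_cor <- sort(cm[, "mpg"])
round(mpg_cor, 2)

# Variable with the strongest linear relationship to mpg (excluding mpg itself)
others <- cm["mpg", colnames(cm) != "mpg"]
names(others)[which.max(abs(others))]   # "wt"
```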

A conditioning plot (coplot) may be more enlightening when you have two continuous (numeric) variables and one or more categorical (factor) variables.

with(crabs, coplot(weight~width | color))
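If crabs.csv is not at hand, the same idea can be tried on the built-in mtcars data. This sketch assumes that the number of cylinders is a sensible conditioning variable. The data frame name cars is hypothetical.

```r
# mtcars ships with R; cyl is numeric, so convert it to a factor for conditioning
cars <- transform(mtcars, cyl = factor(cyl))

# One panel of mpg against wt for each number of cylinders
coplot(mpg ~ wt | cyl, data = cars,
       xlab = c("Weight (1000 lbs)", "Given: cylinders"))
```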

Introduction to Plotting in R

We’ve already touched upon bar plots, box plots, histograms, and density plots. Now, we’ll delve deeper into these and introduce new plotting techniques.

Graphics parameters

R maintains a list of a large number of graphics parameters which control things such as line style, colors, figure arrangement and text justification among many others. Every graphics parameter has a name and a value. The par() function is used to access and modify the list of graphics parameters for the current graphics device.

par()
## $xlog
## [1] FALSE
## 
## $ylog
## [1] FALSE
## 
## $adj
## [1] 0.5
## 
## $ann
## [1] TRUE
## 
## $ask
## [1] FALSE
## 
## $bg
## [1] "white"
## 
## $bty
## [1] "o"
## 
## $cex
## [1] 1
## 
## $cex.axis
## [1] 1
## 
## $cex.lab
## [1] 1
## 
## $cex.main
## [1] 1.2
## 
## $cex.sub
## [1] 1
## 
## $cin
## [1] 0.15 0.20
## 
## $col
## [1] "black"
## 
## $col.axis
## [1] "black"
## 
## $col.lab
## [1] "black"
## 
## $col.main
## [1] "black"
## 
## $col.sub
## [1] "black"
## 
## $cra
## [1] 28.8 38.4
## 
## $crt
## [1] 0
## 
## $csi
## [1] 0.2
## 
## $cxy
## [1] 0.02604167 0.06329115
## 
## $din
## [1] 6.999999 4.999999
## 
## $err
## [1] 0
## 
## $family
## [1] ""
## 
## $fg
## [1] "black"
## 
## $fig
## [1] 0 1 0 1
## 
## $fin
## [1] 6.999999 4.999999
## 
## $font
## [1] 1
## 
## $font.axis
## [1] 1
## 
## $font.lab
## [1] 1
## 
## $font.main
## [1] 2
## 
## $font.sub
## [1] 1
## 
## $lab
## [1] 5 5 7
## 
## $las
## [1] 0
## 
## $lend
## [1] "round"
## 
## $lheight
## [1] 1
## 
## $ljoin
## [1] "round"
## 
## $lmitre
## [1] 10
## 
## $lty
## [1] "solid"
## 
## $lwd
## [1] 1
## 
## $mai
## [1] 1.02 0.82 0.82 0.42
## 
## $mar
## [1] 5.1 4.1 4.1 2.1
## 
## $mex
## [1] 1
## 
## $mfcol
## [1] 1 1
## 
## $mfg
## [1] 1 1 1 1
## 
## $mfrow
## [1] 1 1
## 
## $mgp
## [1] 3 1 0
## 
## $mkh
## [1] 0.001
## 
## $new
## [1] FALSE
## 
## $oma
## [1] 0 0 0 0
## 
## $omd
## [1] 0 1 0 1
## 
## $omi
## [1] 0 0 0 0
## 
## $page
## [1] TRUE
## 
## $pch
## [1] 1
## 
## $pin
## [1] 5.759999 3.159999
## 
## $plt
## [1] 0.1171429 0.9400000 0.2040000 0.8360000
## 
## $ps
## [1] 12
## 
## $pty
## [1] "m"
## 
## $smo
## [1] 1
## 
## $srt
## [1] 0
## 
## $tck
## [1] NA
## 
## $tcl
## [1] -0.5
## 
## $usr
## [1] 0 1 0 1
## 
## $xaxp
## [1] 0 1 5
## 
## $xaxs
## [1] "r"
## 
## $xaxt
## [1] "s"
## 
## $xpd
## [1] FALSE
## 
## $yaxp
## [1] 0 1 5
## 
## $yaxs
## [1] "r"
## 
## $yaxt
## [1] "s"
## 
## $ylbias
## [1] 0.2

par("pch")
## [1] 1
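Because par() changes persist for the rest of the session on the current device, a common pattern is to save the old values and restore them afterwards. The sketch below relies on par() returning the previous values of whatever it changes. The chosen values (pch = 19, col = "steelblue") are arbitrary.

```r
# par() returns the previous values of the parameters it changes
old <- par(pch = 19, col = "steelblue")

par("pch")                       # the new value, 19
plot(mtcars$wt, mtcars$mpg)      # drawn with the modified defaults

# Restore the settings that were in force before the change
par(old)
```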

We use the basic plot() function to illustrate these parameters.

attach(crabs)
plot(weight,width)

plot(weight,width,pch=33) # pch accepts 0-25 (plotting symbols) or 32-127 (ASCII characters); 33 is "!"

plot(weight,width,col=31) 

plot(weight,width,col="#F12123") 

plot(weight,width,col="#F12123",col.axis="Blue")

plot(weight,width,col="#F12123",col.axis="Blue",col.lab="#F41234")

plot(weight,width,col="#F12123",col.axis="Blue",col.lab="#F41234",font=2)

plot(weight,width,lab=c(5,10,24),las=2)
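Parameters such as col and pch also accept one value per point, which makes it possible to distinguish groups. The sketch below uses mtcars rather than crabs so that it runs without the CSV file. The colours and symbols are arbitrary choices.

```r
# Group each car by its number of cylinders (levels 4, 6, 8)
cyl_f <- factor(mtcars$cyl)

# Indexing by a factor uses its integer codes, giving one value per point
grp_col <- c("#4E79A7", "#F28E2B", "#E15759")
grp_pch <- c(16, 17, 15)
pt_col <- grp_col[cyl_f]
pt_pch <- grp_pch[cyl_f]

plot(mtcars$wt, mtcars$mpg, col = pt_col, pch = pt_pch,
     xlab = "Weight", ylab = "Miles per gallon")
legend("topright", legend = levels(cyl_f), col = grp_col, pch = grp_pch,
       title = "Cylinders")
```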

Figure margins

A single plot in R is known as a figure and comprises a plot region surrounded by margins.

par(mar=c(0,0,0,0))
plot(weight,width,lab=c(5,10,24),las=2) 

par(mfrow=c(1,2))
plot(weight,width,lab=c(5,10,24),las=2)
plot(crab,sat,lab=c(5,10,24),las=2)

The default margins, mar = c(5.1, 4.1, 4.1, 2.1) in lines of text for the bottom, left, top and right sides, are often too large: the right-hand margin is rarely needed, and neither is the top margin if no title is being used. The bottom and left margins must be large enough to accommodate the axis and tick labels.
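As a sketch of a more economical setting, the top and right margins can be trimmed while the bottom and left ones keep room for the labels. The exact values below are an assumption and should be tuned to the plot at hand.

```r
# Shrink the top and right margins; order is bottom, left, top, right
old <- par(mar = c(4.1, 4.1, 1.1, 0.6))

plot(mtcars$wt, mtcars$mpg, xlab = "Weight", ylab = "Miles per gallon")

# Restore the previous margins
par(old)
```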

Loading Necessary Libraries

While base R functions are sufficient for basic plotting, sometimes we need additional packages for data manipulation.

# Install the packages first if needed: install.packages(c("dplyr", "reshape2"))
library(dplyr)
library(reshape2)

Frequency Charts (Bar Plots)

Bar plots are useful for visualizing the frequency or count of categorical data.

Example: Frequency of Categories

# Sample Data
categories <- data.frame(
  Category = c("Apples", "Bananas", "Cherries", "Dates"),
  Count = c(50, 30, 15, 5)
)

# Bar Plot
barplot(categories$Count, names.arg = categories$Category,
        col = "skyblue", main = "Frequency of Fruit Categories",
        xlab = "Fruit", ylab = "Count")
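In practice the counts are usually not given but have to be tallied from raw observations. A minimal sketch, assuming a hypothetical vector fruit with one entry per observation, uses table() to count and passes the result straight to barplot().

```r
# Hypothetical raw observations, one entry per fruit
fruit <- c("Apples", "Bananas", "Apples", "Cherries", "Apples", "Bananas")

# table() counts how often each distinct value occurs
fruit_counts <- table(fruit)

# A table carries its own names, so names.arg is not needed
barplot(fruit_counts, col = "skyblue",
        main = "Frequency of Fruit Categories",
        xlab = "Fruit", ylab = "Count")
```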

Customizing the Bar Plot

  • Color: Use the col argument to change bar colors.
  • Axis Labels: xlab and ylab for axis labels.
  • Title: main for the main title.

Tip:

If the category names are long or overlap, set the las parameter to change the orientation of the axis labels.

You can specify colors directly or use built-in color palettes.

cls <- c("#4E79A7", "#F28E2B", "#E15759", "#76B7B2")
barplot(categories$Count, names.arg = categories$Category,
        col = cls, main = "Frequency of Fruit Categories",
        xlab = "Fruit", ylab = "Count", las = 2)
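For a built-in palette, one option is hcl.colors(), available in R 3.6.0 and later. This sketch assumes the "Dark 3" qualitative palette; any name listed by hcl.pals() could be used instead.

```r
# Same fruit counts as a named vector, so the names label the bars
counts <- c(Apples = 50, Bananas = 30, Cherries = 15, Dates = 5)

# One colour per bar from a built-in qualitative palette
pal <- hcl.colors(length(counts), palette = "Dark 3")

barplot(counts, col = pal, las = 2,
        main = "Frequency of Fruit Categories", ylab = "Count")
```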

Adjusting Bar Width and Spacing

Control the width of bars and the spacing between them to improve readability.

Adjusting Bar Width

barplot(categories$Count, names.arg = categories$Category,
        col = cls, main = "Frequency of Fruit Categories",
        xlab = "Fruit", ylab = "Count", width = 1:4)

Adjusting Bar Spacing

Control spacing with the space argument, given as a fraction of the average bar width; supply either a single value or one value per bar.

barplot(categories$Count, names.arg = categories$Category,
        col = cls, main = "Frequency of Fruit Categories",
        xlab = "Fruit", ylab = "Count", space = c(0.5, 1, 1.5, 2))
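Width and spacing determine where each bar sits on the x-axis, and barplot() returns those bar midpoints. A short sketch of how the midpoints can be used to print each count above its bar follows. The ylim value is an assumption chosen to leave room for the labels.

```r
counts <- c(Apples = 50, Bananas = 30, Cherries = 15, Dates = 5)

# barplot() returns the x coordinates of the bar midpoints
mids <- barplot(counts, col = "skyblue", space = 0.5, ylim = c(0, 60),
                xlab = "Fruit", ylab = "Count")

# Write each count just above its bar (pos = 3 means "above")
text(x = mids, y = counts, labels = counts, pos = 3)
```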

Horizontal Bar Plots

Flip the orientation for better label readability. With horiz = TRUE the counts run along the x-axis, so the axis labels are swapped.

barplot(categories$Count, names.arg = categories$Category,
        col = cls, main = "Frequency of Fruit Categories",
        xlab = "Count", ylab = "Fruit", space = c(0.5, 1, 1.5, 2), horiz = TRUE)

Saving and Exporting Plots

Save your customized plots to files for reports and presentations.

png(filename = "barplot_custom.png", width = 800, height = 600)
barplot(categories$Count, names.arg = categories$Category,
        col = cls, main = "Frequency of Fruit Categories",
        xlab = "Count", ylab = "Fruit", space = c(0.5, 1, 1.5, 2), horiz = TRUE)
dev.off()  # close the device so the file is written
## png 
##   2

Likewise, you can save a plot as a PDF with pdf(); its width and height are given in inches rather than pixels.
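A minimal sketch of the PDF variant follows. The output path in tempdir() and the 8 by 6 inch page size are assumptions for illustration; in a real report you would write to your project folder.

```r
counts <- c(Apples = 50, Bananas = 30, Cherries = 15, Dates = 5)

# Hypothetical output location; pdf() sizes are in inches
pdf_file <- file.path(tempdir(), "barplot_custom.pdf")
pdf(file = pdf_file, width = 8, height = 6)

barplot(counts, horiz = TRUE, las = 1, col = "skyblue",
        main = "Frequency of Fruit Categories", xlab = "Count")

# Close the device so the file is written to disk
dev.off()

file.exists(pdf_file)
```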

Exercise

  • Using the crabs dataset, create a bar plot showing the frequency of each spine.
  • Customize the color of the bars to improve visual clarity.
  • Adjust the width of the bars to make them either narrower or wider.
  • Modify the spacing between the bars for better readability.
  • Display a horizontal version of the bar plot.
  • Rotate the x-axis labels so that they are perpendicular to the axis.
  • Adjust the size of the labels so they fit better on the plot.
  • Save your bar plot as a PDF file with the desired width and height.

# Count how many crabs fall into each spine category
counts <- table(crabs$spine)
barplot(counts, col = cls[-1], main = "Frequency of Spine Categories",
        xlab = "Spine", ylab = "Count")

# Narrower bars with wider gaps between them
barplot(counts, col = cls[-1], main = "Customized Bar Width and Spacing",
        xlab = "Spine", ylab = "Count", width = 0.5, space = 1.5)