“Last updated: 11:56:47 IST, 19 July, 2023”

library(datasets)
str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

plot() function to plot charts

A few arguments of the plot() function:

- data for the horizontal coordinate
- data for the vertical coordinate
- ‘main’ for the title
- ‘xlab’, ‘ylab’ for axis labels
- ‘xlim’, ‘ylim’ for axis limits
- ‘pch’ for the plotting character (plotting symbol)
- ‘lty’ for the line type
- ‘lwd’ for the line width
- ‘col’ for the colour
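
As a minimal sketch of how these arguments combine in a single call: the title, axis labels, limits, symbol and colours below are illustrative choices, not values taken from this notebook.

```r
# Illustrative values only: title, labels, limits and colours are assumptions.
plot(mtcars$disp, mtcars$mpg,
     main = "Mileage vs displacement",      # title
     xlab = "Displacement (cu. in.)",       # axis labels
     ylab = "Miles per gallon",
     xlim = c(50, 500), ylim = c(10, 35),   # axis limits
     pch  = 19,                             # solid circle as plotting symbol
     col  = "steelblue")                    # point colour

# lty and lwd apply to lines, e.g. a dashed horizontal line at the mean mpg.
abline(h = mean(mtcars$mpg), lty = 2, lwd = 2, col = "grey40")
```

The limits are chosen to enclose the full range of both variables, so no points are clipped.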

Plot mpg vs disp (three calls that draw the same scatterplot)

plot(mtcars$mpg ~ mtcars$disp)

plot(mtcars$disp, mtcars$mpg)

with(mtcars,plot(disp,mpg))
title(main="MT Cars Dataset Chart") # This adds on to the existing plot

Alternatively, include the title in the same call

with(mtcars,plot(disp,mpg,main="MT Cars Dataset Chart"))

Plot, add legend, regression line and equation

with(mtcars,plot(disp,mpg,main="MT Cars Dataset Chart"))
#Colour certain points in red
with(subset(mtcars,cyl==6),points(disp,mpg,col="red"))
#Add a legend
legend("topright",pch=1, col=c("red","black"),legend=c("6Cyl","Other"))
#Add a regression line. lm() fits a linear model.
model <- lm(mpg ~ disp, data = mtcars)
abline(model,lwd=2)
#Add equation.
cf <- round(coef(model), 2) 
eq <- paste0("mpg = ", cf[1],ifelse(sign(cf[2])==1, " + ", " - "), abs(cf[2]), " disp")
mtext(eq, 3, line=0)
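
The equation-building step above can be wrapped in a small function so it is reusable for other simple fits. This is only a sketch: `lm_equation` is a hypothetical helper name, not part of base R, and it assumes a model with exactly one predictor.

```r
# Hypothetical helper (not part of base R): format a one-predictor fit as text.
lm_equation <- function(model, digits = 2) {
  cf <- round(coef(model), digits)
  response  <- all.vars(formula(model))[1]   # name of the response variable
  predictor <- names(cf)[2]                  # name of the single predictor
  sign_txt  <- if (cf[2] < 0) " - " else " + "
  paste0(response, " = ", cf[1], sign_txt, abs(cf[2]), " ", predictor)
}

model <- lm(mpg ~ disp, data = mtcars)
eq <- lm_equation(model)
print(eq)
```

The resulting string can be passed to mtext() exactly as before.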

Boxplots

# Structure
str(iris)
## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# Boxplot: a graphical view of the five-point summary
boxplot(iris$Sepal.Length,ylab='Sepal Length')

# Introduce an error
actualvalue <- iris$Sepal.Length[1]
iris$Sepal.Length[1]=15
summary(iris$Sepal.Length)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.300   5.100   5.800   5.909   6.400  15.000
boxplot(iris$Sepal.Length,ylab = "Sepal Length")

summary(iris$Sepal.Length)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.300   5.100   5.800   5.909   6.400  15.000
# Removing the error introduced
iris$Sepal.Length[1]=actualvalue
summary(iris$Sepal.Length)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.300   5.100   5.800   5.843   6.400   7.900
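
The same experiment can be checked numerically rather than by eye. The sketch below works on a copy of the column (the name `sepal` is an illustrative choice) and uses boxplot.stats() to list the values a boxplot would draw as separate outlier points.

```r
# Work on a copy so the built-in iris data stays untouched.
sepal <- datasets::iris$Sepal.Length
sepal[1] <- 15                     # same artificial error as above

# boxplot.stats() returns, in $out, the points lying more than
# 1.5 * IQR beyond the hinges (the default coef).
flagged <- boxplot.stats(sepal)$out
print(flagged)

# For comparison: number of flagged points in the original measurements.
n_flagged_original <- length(boxplot.stats(datasets::iris$Sepal.Length)$out)
print(n_flagged_original)
```

Because the copy is modified instead of iris itself, there is nothing to restore afterwards.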

Multiple Plots

par(mfrow = c(1,2))
boxplot(iris$Sepal.Length,ylab='Sepal Length')
boxplot(subset(iris, Species == 'setosa')$Sepal.Length, ylab='Sepal Length for Setosa')
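
A layout set with par(mfrow = ...) stays in effect on the current graphics device until it is changed. One common pattern, sketched here with the illustrative variable name `old_par`, is to save the previous settings and restore them once the panels are drawn.

```r
old_par <- par(mfrow = c(1, 2))   # par() returns the previous settings
boxplot(iris$Sepal.Length, ylab = "Sepal Length")
boxplot(subset(iris, Species == "setosa")$Sepal.Length,
        ylab = "Sepal Length for Setosa")
par(old_par)                      # restore the earlier layout
```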

Comparison

boxplot(iris$Sepal.Length~iris$Species,ylab='Sepal Length')

Histograms

hist(iris$Sepal.Length)

Multiple Histograms

par(mfrow=c(2,1))
hist(subset(iris, Species == 'setosa')$Sepal.Length)
hist(subset(iris, Species == 'versicolor')$Sepal.Length)
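
Stacked histograms are easier to compare when every panel uses the same bins. The sketch below loops over all three species with shared break points; the bin width of 0.25 and the 4 to 8 range are assumptions chosen to cover the data, not settings from this notebook.

```r
old_par <- par(mfrow = c(3, 1))   # one row per species

# Common bin edges (an illustrative choice) so the panels share an x-axis.
breaks <- seq(4, 8, by = 0.25)

for (sp in levels(iris$Species)) {
  hist(iris$Sepal.Length[iris$Species == sp],
       breaks = breaks,
       main = paste("Sepal length:", sp),
       xlab = "Sepal Length")
}

par(old_par)                      # restore the earlier layout
```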

pairs() function

pairs(~mpg+disp+cyl,data=mtcars,main='Scatterplot Matrix sample',col='blue')
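
The col argument also accepts one colour per observation, which lets the matrix distinguish groups. A minimal sketch, assuming an arbitrary three-colour mapping for the cylinder counts (the names `palette_cyl` and `cyl_cols` are illustrative):

```r
# Map each cylinder count (4, 6, 8) to a colour; the palette is an arbitrary choice.
palette_cyl <- c("4" = "darkgreen", "6" = "red", "8" = "blue")
cyl_cols <- unname(palette_cyl[as.character(mtcars$cyl)])

pairs(~ mpg + disp + cyl, data = mtcars,
      main = "Scatterplot Matrix sample",
      col = cyl_cols)
```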