A well-crafted graph can help you make meaningful comparisons among thousands of pieces of information, extracting patterns not easily found through other methods. This is an area where R excels.
R offers various packages tailored for visualizing data. Among these are the graphics package and the grid package, which provide two distinct graphics systems. These two systems, while robust, operate independently, resulting in a fragmentation of graphical capabilities within R.
The graphics package offers a diverse array of functions dedicated to plotting data, which we will cover here. Packages built on grid, such as ggplot2, are discussed in another chapter.
Throughout this chapter, we will:
Consider the following example:
#Figure 1
a <- c(70, 78, 62, 87, 84, 79, 61, 74, 83, 85)
b <- c(90, 93, 79, 99, 93, 94, 83, 92, 96, 99)
plot(a,b)
abline(lm(b~a))
title("scatter diagram with regression line")
We created two vectors, a and b, and generated a scatter plot with a on the horizontal axis and b on the vertical axis. We then added a line of best fit and a title.
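To see what abline() is drawing, it can help to inspect the fitted model directly. The following is a minimal sketch using the same two vectors; the name fit is introduced here only for illustration.

```r
# Same data as in Figure 1
a <- c(70, 78, 62, 87, 84, 79, 61, 74, 83, 85)
b <- c(90, 93, 79, 99, 93, 94, 83, 92, 96, 99)

# Fit the simple linear regression of b on a
fit <- lm(b ~ a)

# Intercept and slope of the line of best fit
print(coef(fit))

# Draw the scatter plot, then add the fitted line from the model object
plot(a, b, xlab = "a", ylab = "b")
abline(fit, col = "red")
```

Passing the model object to abline() gives the same line as abline(lm(b~a)), and keeps the coefficients available for later use.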
We can save our graphs either with code or through the graphical user interface.
Using code:
pdf("mygraph.pdf")
a <- c(70, 78, 62, 87, 84, 79, 61, 74, 83, 85)
b <- c(90, 93, 79, 99, 93, 94, 83, 92, 96, 99)
plot(a,b)
abline(lm(b~a))
title("scatter diagram with regression line")
dev.off()
## png
## 2
In addition to pdf(), we can use the functions png(), jpeg(), bmp(), tiff(), xfig(), and postscript() to save graphs in other formats.
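As a rough sketch of one of these alternatives, the following writes a plot to a PNG file. The temporary file name, the image size, and the resolution are arbitrary choices made for illustration, and the capabilities() check is there only because PNG support is not available on every R installation.

```r
# Write the plot to a temporary PNG file (the file name is arbitrary)
out_file <- tempfile(fileext = ".png")

if (capabilities("png")) {
  # width and height are in inches because units = "in"; res is dots per inch
  png(out_file, width = 6, height = 4, units = "in", res = 150)
  plot(1:10, (1:10)^2, type = "b", main = "Saved as PNG")
  dev.off()  # close the device so the file is written to disk
}
```

As with pdf(), nothing is written until dev.off() closes the device.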
Alternatively, we can select Export > Save as… from the plots window and choose the desired format and location in the resulting dialog.
If we just want to copy the image, we can click Zoom, right-click on the zoomed plot window, and select Copy image. The image can then be pasted into a suitable document, such as a Word file.
Next, let's plot another pair of vectors and add some parameters.
#Figure 2
c <- c(20, 30, 40, 45, 60)
d <- c(16, 20, 27, 40, 60)
plot(c, d, type="b")
plot() is a generic function, which can draw many types of objects, including vectors, tables, and time series. In this case, plot(x, y, type="b") places x on the horizontal axis and y on the vertical axis, plots the (x, y) data points, and connects them with line segments. The option type="b" indicates that both points and lines should be plotted.
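To compare type="b" with some of the other values the type argument accepts, here is a small sketch that draws the same data six ways. The names x, y, and plot_types are illustrative, and the 2 x 3 panel layout is just a convenient way to show the results side by side.

```r
x <- c(20, 30, 40, 45, 60)
y <- c(16, 20, 27, 40, 60)

# Plot types to compare: points, lines, both, overplotted,
# histogram-like vertical lines, and stair steps
plot_types <- c("p", "l", "b", "o", "h", "s")

# Arrange six panels in a 2 x 3 grid and remember the previous layout
old_par <- par(mfrow = c(2, 3))
for (tp in plot_types) {
  plot(x, y, type = tp, main = paste0('type = "', tp, '"'))
}
par(old_par)  # restore the previous layout
```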
Graphical parameters can be used to specify fonts, colors, line styles, axes, reference lines, and annotations. They are set either with a call to the par() function or as arguments to a particular graphics function such as plot().
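The two routes can be sketched as follows. This is a minimal illustration rather than a recommended workflow; the name opar is a common convention, not a requirement.

```r
x <- c(20, 30, 40, 45, 60)
y <- c(16, 20, 27, 40, 60)

# Save the current settings that can be modified
opar <- par(no.readonly = TRUE)

# Option 1: set parameters globally; they affect every later plot
par(lty = 2, pch = 17)
plot(x, y, type = "b")

# Restore the saved settings
par(opar)

# Option 2: pass parameters to one call; they apply to this plot only
plot(x, y, type = "b", lty = 2, pch = 17)
```

Settings made with par() stay in effect until they are changed or the device is closed, which is why saving and restoring them is a useful habit.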
# Figure 3
plot(c, d, type="b", lty=2, pch=17, lwd=3, cex=2)
- This statement changes the line type to dashed (lty=2) and the point symbol to a solid triangle (pch=17), rather than the default solid line and open circle seen in Figure 2.
- lwd=3 makes the line three times wider than the default, and cex=2 makes the plotting symbols twice as large as the default. The parameters for symbols and lines are listed in Table 1.
Table 1
| Parameter | Description |
|---|---|
| type | Specifies what type of plot should be drawn |
| pch | Specifies the symbol to use when plotting points |
| cex | Specifies the symbol size. cex is a number indicating the amount by which plotting symbols should be scaled relative to the default. 1=default, 1.5 is 50% larger, 0.5 is 50% smaller, and so forth. |
| lty | Specifies the line type |
| lwd | Specifies the line width. lwd is expressed relative to the default (default=1). For example, lwd=2 generates a line twice as wide as the default. |
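A quick way to see what the pch and lty values look like is to draw them all at once. The sketch below assumes the commonly used ranges, symbol codes 0 to 25 and line types 1 to 6; the variable names are illustrative.

```r
# Commonly used symbol codes and line types
symbol_codes <- 0:25
line_types <- 1:6

old_par <- par(mfrow = c(1, 2))

# Left panel: one point per symbol code, labelled with its number
plot(symbol_codes, rep(1, length(symbol_codes)), pch = symbol_codes,
     ylim = c(0.5, 1.5), yaxt = "n", xlab = "pch value", ylab = "",
     main = "Plotting symbols")
text(symbol_codes, 1.2, labels = symbol_codes, cex = 0.6)

# Right panel: an empty plot, then one horizontal line per line type
plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0.5, 6.5),
     xaxt = "n", xlab = "", ylab = "lty value", main = "Line types")
abline(h = line_types, lty = line_types, lwd = 2)

par(old_par)  # restore the previous layout
```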
To change colors, use the following parameters:
Table 2
| Parameter | Description |
|---|---|
| col | Default plotting color. Some functions (such as lines() and pie()) accept a vector of values that are recycled. For example, if col=c("red", "blue") and three lines are plotted, the first line will be red, the second blue, and the third red. |
| col.axis | Color for axis text. |
| col.lab | Color for axis labels. |
| col.main | Color for titles. |
| col.sub | Color for subtitles. |
| fg | The plot’s foreground color. |
| bg | The plot’s background color. |
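The following sketch shows a few of these color parameters in use. Colors can be given by name, as a hexadecimal string, or built with rgb(); the specific colors and variable names here are arbitrary choices for illustration.

```r
x <- 1:10

# Three ways to specify a color
by_name <- "red"
by_hex  <- "#0000FF"      # blue
by_rgb  <- rgb(0, 1, 0)   # green, returned as a hexadecimal string

plot(x, x, type = "b", col = by_name,
     main = "Color parameters", col.main = "darkgreen",
     xlab = "x", ylab = "value", col.lab = "purple", col.axis = "gray40")
lines(x, x / 2, type = "b", col = by_hex)
lines(x, x / 4, type = "b", col = by_rgb)
```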
To specify text size, font, and style, use the following parameters:
Table 3
| Parameter | Description |
|---|---|
| cex | Number indicating the amount by which plotted text should be scaled relative to the default. 1=default, 1.5 is 50% larger, 0.5 is 50% smaller, etc. |
| cex.axis | Magnification of axis text relative to cex. |
| cex.lab | Magnification of axis labels relative to cex. |
| cex.main | Magnification of titles relative to cex. |
| cex.sub | Magnification of subtitles relative to cex. |
Table 4
| Parameter | Description |
|---|---|
| font | Integer specifying font to use for plotted text. 1=plain, 2=bold, 3=italic, 4=bold italic, 5=symbol (in Adobe symbol encoding). |
| font.axis | Font for axis text. |
| font.lab | Font for axis labels. |
| font.main | Font for titles. |
| font.sub | Font for subtitles. |
| ps | Font point size (roughly 1/72 inch). The text size = ps*cex. |
| family | Font family for drawing text. Standard values are serif, sans, and mono. |
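As a small sketch of how the size and font parameters combine, the example below sets a base point size and a magnification with par() and then styles the title and axis labels in the plot() call. The values 12 and 1.5 and all variable names are assumptions made for illustration.

```r
x <- 1:10
point_size <- 12   # base font size in points
scale <- 1.5       # magnification applied through cex

opar <- par(no.readonly = TRUE)   # save current settings
par(ps = point_size, cex = scale, family = "serif")

# Effective text size, following the rule ps * cex
effective_size <- point_size * scale

plot(x, x^2, main = "Bold italic title", font.main = 4, cex.main = 1.2,
     xlab = "Italic x label", ylab = "Italic y label", font.lab = 3,
     cex.axis = 0.8)

par(opar)   # restore settings
```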
To control the plot dimensions and margin sizes, use the following parameters:
| Parameter | Description |
|---|---|
| pin | Plot dimensions (width, height) in inches. |
| mai | Numerical vector indicating margin size where c(bottom, left, top, right) is expressed in inches. |
| mar | Numerical vector indicating margin size where c(bottom, left, top, right) is expressed in lines. The default is c(5, 4, 4, 2) + 0.1. |
The code par(pin=c(4,3), mai=c(1,.5, 1, .2)) produces graphs that are 4 inches wide by 3 inches tall, with a 1-inch margin on the bottom and top, a 0.5-inch margin on the left, and a 0.2-inch margin on the right.
Let's see one example.
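A minimal sketch of how the margin settings might be used, reusing the values from Figure 2 under the illustrative names x and y. Only mai is set here; the margin values are the ones quoted in the text.

```r
x <- c(20, 30, 40, 45, 60)
y <- c(16, 20, 27, 40, 60)

opar <- par(no.readonly = TRUE)   # save current settings

# Margins of 1 inch (bottom and top), 0.5 inch (left), and 0.2 inch (right)
par(mai = c(1, 0.5, 1, 0.2))
plot(x, y, type = "b", main = "Custom margins")

# Read the margins back to confirm what is in effect
margins_in_inches <- par("mai")

par(opar)   # restore settings
```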
Many high-level plotting functions accept options for titles and axes directly. However, not all functions allow you to add these options; see the help for the function of interest to find out which options are accepted. Where they are not, you can add further output to an existing plot using low-level graphics functions such as title(), axis(), abline(), and legend().
title(main="main title", sub="sub-title", xlab="x-axis label", ylab="y-axis label")
title(main="My Title", col.main="red",
sub="My Sub-title", col.sub="blue",
xlab="My X label", ylab="My Y label",
col.lab="green", cex.lab=0.75)
axis(side, at=, labels=, pos=, lty=, col=, las=, tck=, ...)
| Parameter | Description |
|---|---|
| side | An integer indicating the side of the graph to draw the axis (1=bottom, 2=left, 3=top, 4=right). |
| at | A numeric vector indicating where tick marks should be drawn. |
| labels | A character vector of labels to be placed at the tick marks (if NULL, the at values will be used). |
| pos | The coordinate at which the axis line is to be drawn (that is, the value on the other axis where it crosses). |
| lty | Line type. |
| col | The line and tick mark color. |
| las | Labels are parallel (=0) or perpendicular (=2) to the axis. |
| tck | Length of tick mark as a fraction of the plotting region (a negative number is outside the graph, a positive number is inside, 0 suppresses ticks, 1 creates gridlines); by default tck is NA and the tick length is set by tcl = -0.5 instead. |
| (…) | Other graphical parameters. |
abline(h=yvalues, v=xvalues)
legend(location, title, legend, ...)
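The abline() and legend() templates can be filled in as in the sketch below. The data, reference-line positions, and legend text are made up for illustration; the first argument "top" is one of the keyword positions legend() accepts.

```r
x <- 1:10
y1 <- x
y2 <- 10 / x

plot(x, y1, type = "b", col = "red", pch = 21, xlab = "x", ylab = "value")
lines(x, y2, type = "b", col = "blue", pch = 22, lty = 2)

# Horizontal reference lines at y = 2, 4, 6, 8 and a vertical one at x = 5
abline(h = seq(2, 8, by = 2), v = 5, lty = 3, col = "gray")

# legend() invisibly returns a list describing where the legend was drawn
leg <- legend("top", title = "Series", legend = c("y = x", "y = 10/x"),
              col = c("red", "blue"), pch = c(21, 22), lty = c(1, 2))
```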
# Figure 5
# Specify data
x <- c(1:10)
y <- x
z <- 10/x
# Increase the right margin to make room for the second axis
par(mar=c(5, 4, 4, 8) + 0.1)
# Plot x versus y, suppressing the default y axis and annotations
plot(x, y, type="b",
     pch=21, col="red",
     yaxt="n", lty=4, ann=FALSE)
# Add the x versus 10/x line
lines(x, z, type="b", pch=22, col="blue", lty=1)
# Draw your axes
axis(2, at=y, labels=y, col.axis="red", las=2)
axis(4, at=z, labels=round(z, digits=2),
     col.axis="blue", las=2, cex.axis=0.7, tck=-.02)
# Add titles and text
mtext("y=10/x", side=4, line=3, las=1, col="blue")
title("An Example of line graph",
      xlab="X values",
      ylab="Y=X")
For scatter plots, we will use data from GitHub, available at https://raw.githubusercontent.com/BijayLalPradhan/D4P/main/data2.csv
Let's read the CSV file and store it in the variable data1:
data1=read.csv("https://raw.githubusercontent.com/BijayLalPradhan/D4P/main/data2.csv")
To draw a scatter plot, we can use the plot() function. Let’s look at the relationship between height and weight in data1:
attach(data1)
plot(weight, height)
The text() function can be used to label each point, here with the value of gender:
attach(data1)
## The following objects are masked from data1 (pos = 3):
##
## age, gender, height, sex, weight
plot(weight, height,
xlab = "Weight in kg",
ylab = "Height in cm")
text(weight, height,
labels = gender,
cex = 0.5,
pos = 4)
If you have a data frame with n variables and would like a scatter plot for each pair of variables, you can use the pairs() function. Our data (data1) contains character variables, so we first keep only the numeric variables with the subset() function and then call pairs(), as shown below.
data2 <- subset(data1, select = c(age, height, weight))
pairs(data2)
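When the numeric columns are not known in advance, they can also be picked out programmatically. The sketch below uses a small made-up data frame, people, as a stand-in for data1; its values are hypothetical and are not taken from the chapter's data.

```r
# Hypothetical data frame standing in for data1 (values are made up)
people <- data.frame(
  age    = c(23, 35, 41, 29, 52),
  gender = c("Male", "Female", "Female", "Male", "Female"),
  height = c(172, 160, 158, 180, 165),
  weight = c(70, 58, 62, 82, 66)
)

# Keep only the columns that hold numbers
is_num <- sapply(people, is.numeric)
numeric_only <- people[, is_num]

# One scatter plot for every pair of numeric variables
pairs(numeric_only)
```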
To download an .rda file (here, time series data), we can use the rio package. Let's import the price data from https://github.com/BijayLalPradhan/D4P/raw/main/price.ts.rda
#install.packages("rio")
library(rio)
url <- "https://github.com/BijayLalPradhan/D4P/raw/main/price.ts.rda"
data5 <- import(url)
plot(data5)
Another way to see seasonal effects is with an autocorrelation plot, which can be drawn using the acf() function:
acf(data5)
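To illustrate how seasonality shows up in an autocorrelation plot without downloading any data, here is a sketch with an artificial monthly series. The series is a pure 12-month sine cycle invented for this example, so it is far cleaner than real price data would be.

```r
# Hypothetical monthly series: four years of a pure 12-month cycle
months_index <- 1:48
monthly <- ts(sin(2 * pi * months_index / 12),
              start = c(2020, 1), frequency = 12)

plot(monthly, ylab = "value")

# lag.max counts observations; plot = FALSE returns the numbers only
ac <- acf(monthly, lag.max = 24, plot = FALSE)

lag0  <- ac$acf[1]    # correlation of the series with itself
lag12 <- ac$acf[13]   # correlation with the series one year earlier

# Draw the autocorrelation plot itself
acf(monthly, lag.max = 24)
```

A strong positive autocorrelation at a lag of 12 observations is the signature of a yearly pattern in monthly data.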
To draw bar (or column) charts in R, we use the barplot function.
sales=c(46, 76, 37, 86, 46, 75, 50)
barplot(sales)
To label the seven sales figures with the names of the seven provinces:
sales = c(46, 76, 37, 86, 46, 75, 50)
aa = matrix(sales, nrow = 1, ncol=7, dimnames = list("province", c("Koshi", "Madhesh", "Bagmati", "Gandaki", "Lumbini", "Karnali", "sudur_P")))
barplot(aa)
Let's see the next example.
mark = matrix(c(34, 57, 54,76, 57, 87), nrow=2, ncol=3, dimnames = list(c("Rajan", "Hari"), c("Math", "Science", "Computer")))
barplot(mark)
barplot(mark, beside = TRUE, col=c("red","green"))
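A grouped bar chart is easier to read with a legend and value labels. The following sketch extends the same mark matrix; the ylim value and the name mids are choices made here for illustration.

```r
mark <- matrix(c(34, 57, 54, 76, 57, 87), nrow = 2, ncol = 3,
               dimnames = list(c("Rajan", "Hari"),
                               c("Math", "Science", "Computer")))

# legend.text = TRUE builds the legend from the row names of the matrix;
# barplot() returns the x positions of the bar midpoints
mids <- barplot(mark, beside = TRUE, col = c("red", "green"),
                legend.text = TRUE, ylim = c(0, 100), ylab = "Marks")

# Print each value just above its bar
text(mids, mark, labels = mark, pos = 3, cex = 0.8)
```

With beside = TRUE, the returned midpoints form a matrix with the same shape as the input, so they line up with the values element by element.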
Let's see the next example.
#First create a vector for three regions by
reg= c("East", "West", "North")
#Create a vector for month names by
months=c("Mar","Apr","May","Jun","Jul")
#Next, create matrix for revenue values in the 3 regions as
rev = matrix(c(2,9,3,11,9,4,8,7,3,12,5,2,8,10,11),nrow = 3,ncol = 5,byrow = TRUE)
#Create a vector for three different colors for three different regions by
colors=c("Green","Orange","Brown")
#Create the bar plot by
barplot(rev, main="Total Revenue", names.arg=months, xlab="Months", ylab="Revenue", col=colors)
#Next, to provide legends to different regions
legend(x="topleft", legend=reg, fill=colors)
Let's see an example that uses a data frame (our data is data1).
barplot(table(data1$gender),col=c("blue","green"))
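Or:
# table() lists the categories in sorted order, so check that the
# labels given in dimnames follow that same order
mm=matrix(table(data1$gender), nrow=1, ncol=2, dimnames=list("gender",c("Male","Female")))
barplot(mm,col="red")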
The pie() function is used to draw a pie chart.
For example, consider the following data:
| City | Length_of_road_in_km |
|---|---|
| Beijing | 1450 |
| New Delhi | 2500 |
| Moscow | 1255 |
| Kathmandu | 789 |
| Mumbai | 1435 |
Then let's use the pie() function:
city = c("Beijing","New Delhi","Moscow", "Kathmandu", "Mumbai")
road = c(1450, 2500, 1255, 789, 1435)
pie(road, city, col=rainbow(5))
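Because slice sizes are hard to judge by eye, it can help to put the percentages in the labels. This is a sketch using the road lengths from the table above; the names pct and slice_labels are illustrative.

```r
city <- c("Beijing", "New Delhi", "Moscow", "Kathmandu", "Mumbai")
road <- c(1450, 2500, 1255, 789, 1435)   # values from the table above

# Share of each city in the total, rounded to whole percent
pct <- round(100 * road / sum(road))

# Labels such as "New Delhi 34%"
slice_labels <- paste0(city, " ", pct, "%")

pie(road, labels = slice_labels, col = rainbow(length(road)),
    main = "Length of road by city")
```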
The boxplot() function is used to draw box plots.
boxplot(data1$height, ylab="height of person")
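# Groups are drawn in the sorted order of the gender values, so check
# that the labels given in names follow that same order
boxplot(height ~ gender, data = data1, ylab = "Height of person",
        col = c("blue", "pink"), names = c("Male", "Female"))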
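To see which numbers a box plot is built from, the statistics can be requested without drawing anything. The heights below are made-up values chosen so that one observation falls outside the whiskers; they are not taken from data1.

```r
# Hypothetical heights in cm (made-up values, one unusually tall person)
heights <- c(150, 155, 160, 165, 170, 175, 210)

# plot = FALSE returns the statistics without drawing
bp <- boxplot(heights, plot = FALSE)

bp$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
bp$out     # points drawn individually as outliers

boxplot(heights, ylab = "Height in cm", col = "lightblue")
```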
The hist() function is used to draw a histogram.
# Break points for the bins; they must cover the full range of the data
seq1=seq(60,75,5)
hist(data1$weight, breaks=seq1,
     main = "Histogram of Weight",
     xlab = "Weight",
     ylab = "Frequency",
     col = "skyblue",
     border = "black")
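The breaks argument only works if the break points span all of the data, which is worth checking before plotting. The sketch below uses made-up weights (not data1) and shows how to read the bin counts back from the object hist() returns.

```r
# Hypothetical weights in kg (made-up values)
weights <- c(61, 63, 66, 68, 69, 71, 72, 74)

# The breaks must cover the whole range of the data
bins <- seq(60, 75, by = 5)
stopifnot(min(weights) >= min(bins), max(weights) <= max(bins))

# hist() returns the bin counts invisibly; each bin includes its right edge
h <- hist(weights, breaks = bins, main = "Histogram of weight",
          xlab = "Weight in kg", col = "skyblue", border = "black")
h$counts
```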
The stem() function is used to draw a stem-and-leaf plot.
stem(data1$height)
##
## The decimal point is 1 digit(s) to the right of the |
##
## 10 | 045800023555889
## 12 | 000000558000055555555
## 14 | 000002555255
## 16 | 059955
## 18 | 00055
## 20 | 0
R makes it easy to combine several graphs into one overall graph, using the par() function. With the par() function, you can include the graphical parameter mfrow=c(nrows, ncols) to create a matrix of nrows x ncols plots that are filled in by row.
par(mfrow=c(2,2))
attach(data1)
## The following objects are masked from data1 (pos = 4):
##
## age, gender, height, sex, weight
## The following objects are masked from data1 (pos = 5):
##
## age, gender, height, sex, weight
boxplot(height, sub="boxplot of height")
boxplot(weight, sub="boxplot of weight")
plot(age,height, sub="scatter diagram between age & height")
barplot(table(gender), sub= "barplot of gender")
detach(data1)
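Because mfrow stays in effect for later plots, it is usually worth restoring the previous layout afterwards. The following sketch uses made-up height and weight values instead of data1 and relies on the fact that par() returns the old values of the settings it changes.

```r
# Hypothetical data (made-up values)
height <- c(172, 160, 158, 180, 165, 175)
weight <- c(70, 58, 62, 82, 66, 74)

# par() returns the previous values of the settings it changes
old_par <- par(mfrow = c(1, 2))

boxplot(height, main = "Height")                  # first cell
plot(weight, height, main = "Height vs weight")   # second cell

par(old_par)   # back to one plot per page
```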