Plots in R using ggplot

The ggplot function used throughout this tutorial comes from the ggplot2 package, which you must first install and then load:

install.packages("ggplot2")
library(ggplot2)

There are three basic kinds of plotting commands in R: high-level plot functions, low-level plot functions, and the layout command par. A high-level plot function creates a complete plot; a low-level plot function adds to an existing plot, that is, one created by a high-level plot command.

High-Level Plot Functions

Read in the example data set States03:

States03 <- read.csv("http://sites.google.com/site/chiharahesterberg/States03.csv")

ggplot(States03, aes(Region)) + geom_bar()

ggplot(States03, aes(Poverty)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

To create a scatter plot:

ggplot(States03, aes(x = Unemp, y = Poverty)) + geom_point() + xlab("Unemployment")

Base R's plot command accepts its variables in two ways. In the first approach, provide the plot command with the x-variable, then the y-variable. In the second approach, if the data are contained in a data frame, provide the variables as a formula, Y ~ X, along with the name of the data frame. In ggplot, the variables are instead named inside aes(), as in the scatter plot above. High-level functions may also take optional arguments that enhance the plot.
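The two approaches described above apply to base R's plot command. A minimal sketch, using a small made-up data frame (the name toy and its values are illustrative assumptions, not part of States03):

```r
# Hypothetical data standing in for two numeric columns of a data frame
toy <- data.frame(X = c(1, 2, 3, 4, 5),
                  Y = c(2.1, 3.9, 6.2, 8.1, 9.8))

# First approach: the x-variable, then the y-variable
plot(toy$X, toy$Y)

# Second approach: the formula Y ~ X plus the name of the data frame
plot(Y ~ X, data = toy)
```

Both calls draw the same scatter plot. The formula version labels the axes X and Y, while the first version labels them toy$X and toy$Y.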

ggplot(States03, aes(Poverty)) + geom_histogram() + xlab("percent") + xlim(c(0, 24)) + ylim(c(0, 20))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 2 rows containing missing values (geom_bar).
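The message and the warning in the output above can both be avoided. A hedged sketch, assuming a made-up vector of percentages in place of States03$Poverty: setting binwidth explicitly silences the stat_bin() message, and coord_cartesian zooms the axes without discarding anything, whereas xlim and ylim drop whatever falls outside the limits.

```r
library(ggplot2)

# Made-up percentages standing in for the Poverty column (an assumption)
toy <- data.frame(Poverty = c(7.5, 8.2, 9.1, 9.9, 10.4, 11.0, 11.3,
                              12.1, 12.8, 13.5, 14.2, 15.6, 17.0, 18.9))

p <- ggplot(toy, aes(Poverty)) +
  geom_histogram(binwidth = 2) +                      # explicit bin width
  xlab("percent") +
  coord_cartesian(xlim = c(0, 24), ylim = c(0, 20))   # zoom, keep all bars

print(p)
```

The bin width of 2 is only a starting point; a good value depends on the spread of the data.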

plot(1:19, 1:19, pch = 1:19, xlab = "x", ylab = "y")

pie(rep(1, 8), col = 1:8)

Option | Description
pch | point character (pch = 1, 2, ...)
lty | line type (lty = 1, 2, ...)
lwd | line thickness (lwd = 1, 2, ...)
col | color (col = "red", "blue", ...)
xlim | x-axis limits: xlim = c(min, max)
ylim | y-axis limits: ylim = c(min, max)
xlab | x-axis label: xlab = "my label"
ylab | y-axis label: ylab = "my label"
main | main title
sub | sub title
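These options belong to the base graphics functions such as plot. A short sketch that uses each of them once, on illustrative data (x and y are assumptions, not columns of States03):

```r
# Illustrative data (an assumption, not from States03)
x <- 1:10
y <- x^2

plot(x, y,
     pch  = 16,              # solid circle as the point character
     col  = "blue",          # point color
     xlim = c(0, 12),        # x-axis limits
     ylim = c(0, 110),       # y-axis limits
     xlab = "x",             # x-axis label
     ylab = "x squared",     # y-axis label
     main = "Main title",    # main title
     sub  = "Sub title")     # sub title

# lty and lwd apply to lines, for example when connecting the points
lines(x, y, lty = 2, lwd = 2, col = "red")
```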

To plot smooth curves, use the curve command. The first argument must be an expression in terms of x:

curve(x^2, from = 0, to = 2)
curve(cos(x), from = 0, to = pi)
curve(cos(x), from = 0, to = pi, lty = 4, col = "red")

Function | Description
lines | add a line plot
points | add points
text | add text
mtext | margin text
abline | add a straight line
qqline | add line to qqnorm
title | add a title
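The functions listed above add to a plot that already exists, so a high-level call must come first. A minimal base R sketch on illustrative data (x, y and the label positions are assumptions):

```r
# Illustrative data (an assumption)
x <- seq(0, 10, by = 1)
y <- 2 * x + 1

plot(x, y, type = "n")                 # high-level call: empty plot frame
points(x, y, pch = 16)                 # add points
lines(x, y, lty = 2)                   # add a line plot
abline(h = 11, col = "gray")           # add a straight (horizontal) line
text(2, 18, "y = 2x + 1")              # add text at the point (2, 18)
mtext("margin note", side = 4)         # text in the right-hand margin
title(main = "Low-level additions")    # add a title
```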

ggplot(States03, aes(x = Unemp, y = Poverty)) +
  geom_point() +
  xlab("Unemployment") +
  geom_vline(xintercept = mean(States03$Unemp)) +   # vertical line at the mean
  ggtitle("Data from 2003") +
  # annotate() draws the label once; geom_text() would redraw it for every row
  annotate("text", x = 30, y = 18, label = "mean unemployment rate")

The abline function has several options:

abline(3, 5) adds the straight line y = 3 + 5x
abline(v = 2) adds the vertical line x = 2
abline(h = 0) adds the horizontal line y = 0
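In ggplot2, the corresponding layers are geom_abline, geom_vline and geom_hline. A minimal sketch on made-up data (the data frame toy is an assumption for illustration):

```r
library(ggplot2)

# Made-up points that lie on the line y = 3 + 5x
toy <- data.frame(x = c(-2, -1, 0, 1, 2, 3),
                  y = c(-7, -2, 3, 8, 13, 18))

p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  geom_abline(intercept = 3, slope = 5) +   # like abline(3, 5)
  geom_vline(xintercept = 2) +              # like abline(v = 2)
  geom_hline(yintercept = 0)                # like abline(h = 0)

print(p)
```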

ggplot(States03, aes(x = ColGrad)) +
  geom_point(aes(y = Poverty), color = "blue") +
  xlab("College grad (%)") +
  scale_y_continuous(name = "Poverty (%)",
                     sec.axis = sec_axis(trans = ~ . * 1, name = "Percent Uninsured")) +
  geom_point(aes(y = Uninsured), color = "red")

You can also use different colors (or, by mapping shape instead of color, different plotting symbols) for different levels of a factor variable:

ggplot(States03, aes(x = ColGrad, y = Poverty, color = Region)) + geom_point() 

With base graphics, a second curve can be overlaid on an existing plot by passing add = TRUE:

curve(cos(x), from = 0, to = 2*pi)
curve(sin(x), add = TRUE, col = "blue", lty = 2)

The par Command

The par command controls the layout of the graphics device. The option you will use most often will probably be mfrow (multi-figure, by row) or mfcol (multi-figure, by column). For example, to have a 3x2 layout where the plots are added by row, set par(mfrow = c(3, 2)). This setting persists for the life of the graphics device unless you change it back to the default, mfrow = c(1, 1). You can also change the default color, plot character, etc. for the graphs created on the graphics device.

par(mfrow = c(2, 2)) #2x2 layout
curve(3*x^2)
curve(cos(x))

ggplot(States03, aes(Population)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ggplot(States03, aes(sample=Population)) + stat_qq() + stat_qq_line()

par(mfrow = c(1, 1)) #reset to default layout
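Note that par affects only base graphics. ggplot2 draws with the grid system, so ggplot2 plots ignore mfrow and each one fills the whole device. A sketch of the same 2x2 layout done entirely in base graphics, using simulated values in place of States03$Population (the vector pop is an assumption):

```r
set.seed(1)
pop <- rexp(50, rate = 1 / 5)    # simulated, right-skewed values

op <- par(mfrow = c(2, 2))       # 2x2 layout; op stores the previous setting
curve(3 * x^2)
curve(cos(x))
hist(pop, main = "Histogram")
qqnorm(pop)
qqline(pop)                      # adds a line to the qqnorm panel
par(op)                          # restore the previous layout
```

Saving the value returned by par and passing it back afterwards restores whatever layout was active before, rather than assuming it was c(1, 1).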

Misc.

- Type colors() at the command line to see the list of colors available to the plotting commands.

- You can export graphs to several common file formats. With the graph in focus, go to the menu: in Windows, choose File > Save As... and save to jpg, pdf, ps, png or bmp; on the Macintosh, File > Save As exports to pdf only.

Or, at the command line, for instance:

postscript(file = "MyPlot.eps") #open graphics device

ggplot(States03, aes(Births)) + geom_histogram() + ggtitle("Number of births") #create graph
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
dev.off() #close graphics device
## png 
##   2

The file MyPlot.eps will be located in your working directory. See the help file for postscript, jpeg, png, tiff or pdf.
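For ggplot2 plots, ggsave is an alternative to opening and closing a graphics device by hand. A minimal sketch, assuming made-up data and writing to a temporary directory so that nothing in the working directory is touched (toy and out_file are illustrative names):

```r
library(ggplot2)

# Made-up counts standing in for the Births column (an assumption)
toy <- data.frame(Births = c(12, 15, 18, 22, 25, 31, 40, 44, 52, 60))

p <- ggplot(toy, aes(Births)) +
  geom_histogram(binwidth = 10) +
  ggtitle("Number of births")

out_file <- file.path(tempdir(), "MyPlot.pdf")
ggsave(out_file, plot = p, width = 6, height = 4)   # size in inches by default
```

ggsave picks the output format from the file extension, so changing the name to end in .png or .eps selects a different device.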