In the beginning

All of the following examples use Mike Marin’s “LungCapData” data set. To copy and run the code, download the data and attach it with attach(LCD), so that columns such as LungCap can be referred to by name. I renamed the data set “LCD” so I had less to type. A summary of the data set follows.

The data set may be found here: http://www.statslectures.com/index.php/r-stats-datasets

summary(LCD)
##     LungCap            Age            Height      Smoke        Gender   
##  Min.   : 0.507   Min.   : 3.00   Min.   :45.30   no :648   female:358  
##  1st Qu.: 6.150   1st Qu.: 9.00   1st Qu.:59.90   yes: 77   male  :367  
##  Median : 8.000   Median :13.00   Median :65.40                         
##  Mean   : 7.863   Mean   :12.33   Mean   :64.84                         
##  3rd Qu.: 9.800   3rd Qu.:15.00   3rd Qu.:70.30                         
##  Max.   :14.675   Max.   :19.00   Max.   :81.80                         
##  Caesarean
##  no :561  
##  yes:164  
##           
##           
##           
## 
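
The code in this post uses bare column names such as LungCap, which only works once the data frame has been attached. As a rough sketch of the alternatives, the example below builds a tiny stand-in data frame (the values are made up, and it is not the real LungCapData) and reaches its columns three ways: with $, with with(), and with attach().

```r
# Stand-in data: a tiny data frame with some of the same column names as
# LungCapData. The values are made up for illustration only.
LCD <- data.frame(
  LungCap = c(6.5, 10.1, 9.6, 11.1, 4.8, 6.2),
  Age     = c(6, 18, 16, 14, 5, 11),
  Gender  = factor(c("male", "female", "female", "male", "male", "female"))
)

# Option 1: refer to a column explicitly with $
mean_cap <- mean(LCD$LungCap)

# Option 2: evaluate an expression inside the data frame with with()
tab <- with(LCD, table(Gender))

# Option 3: attach() puts the columns on the search path, which is what
# the bare names (LungCap, Age, ...) in this post rely on
attach(LCD)
mean_cap_attached <- mean(LungCap)
detach(LCD)  # detach when finished so the columns do not linger
```

All three routes read the same columns, so the two means agree. Options 1 and 2 avoid leaving attached columns on the search path, at the cost of a little more typing.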

Some basic graphing and plotting commands

The basic command for plotting is “plot”. To plot LungCap and Age, all I need is the following code:

plot(LungCap, Age)

This produces a relatively uninteresting scatterplot of lung capacity and age for the 725 subjects, with LungCap on the x axis and Age on the y axis, because “plot” puts its first argument on the x axis. Each dot on the plot is a different subject.
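
The same title and label arguments used for the other plots in this post also work with “plot”. The sketch below is one way to dress up the scatterplot, with age moved to the x axis. The vectors age_sim and cap_sim are simulated stand-ins with hypothetical names, not the real Age and LungCap columns.

```r
set.seed(1)  # make the simulated values reproducible

# Simulated stand-in data (assumption: not LungCapData itself)
age_sim <- sample(3:19, 50, replace = TRUE)
cap_sim <- 1 + 0.5 * age_sim + rnorm(50, sd = 1)

# plot(x, y): the first argument goes on the x axis, the second on the y axis
plot(age_sim, cap_sim,
     main = "Lung Capacity by Age",
     xlab = "Age",
     ylab = "Lung Capacity",
     las = 1)
```

With the real data attached, the equivalent call would be plot(Age, LungCap, ...) with the same labels.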

To produce a barplot of subjects by gender we first create a table:

tab <- table(Gender)
tab
## Gender
## female   male 
##    358    367

Next, we use the “barplot” command to create the plot:

barplot(tab)

Additional commands may be used to add a title, x and y axis labels, and to turn the y axis numbers upright.

barplot(tab, main = "Barplot of Subjects by Gender", 
        xlab = "Gender",
        ylab = "Count",
        las = 1)

“main” is the main title, “xlab” and “ylab” are the labels for the x and y axes, and “las = 1” draws all axis tick labels horizontally, which is what turns the y axis numbers upright.

Adding “horiz = TRUE” to the code will produce a graph with the bars running horizontally. (“T” also works as shorthand, but “TRUE” is safer because “T” is an ordinary variable that can be overwritten.) This can be useful if you have several categories you want to graph, as there is more room for the labels on the y axis. In our case it doesn’t make much difference. Note that I also swapped the axis labels to reflect the change in orientation.

barplot(tab, main = "Barplot of Subjects by Gender", 
        xlab = "Count",
        ylab = "Gender",
        las = 1,
        horiz = TRUE)
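
To show where horizontal bars earn their keep, here is a minimal sketch with several long category names. The counts and the category names are hypothetical and are not taken from the data set. The left margin is widened with par(mar = ...) so the labels fit, then restored.

```r
# Hypothetical counts for categories with long names (made-up data)
counts <- c("Never smoked" = 40, "Former smoker" = 25,
            "Occasional smoker" = 12, "Daily smoker" = 8)

# Widen the left margin (second value) so the long labels are not cut off;
# par() returns the previous settings so they can be restored afterwards
old_par <- par(mar = c(5, 9, 4, 2))

# barplot() returns the bar midpoints, one per category
mids <- barplot(counts,
                main = "Barplot of Hypothetical Categories",
                xlab = "Count",
                las = 1,
                horiz = TRUE)

par(old_par)  # restore the previous margins
```

The y axis label is left out here because the category names already say what each bar is.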

“boxplot” is another useful function. (R is case-sensitive, so the name must be typed in lower case.)

boxplot(LungCap)

We can break our boxplot down by Gender:

boxplot(LungCap ~ Gender)
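
The formula LungCap ~ Gender reads as “LungCap broken down by Gender”: the variable on the left is the one being plotted, and the one on the right defines the groups. A small sketch on made-up stand-in data (the data frame name demo is hypothetical) shows this, along with the data argument, which lets “boxplot” find the columns without attach().

```r
# Made-up stand-in data with the same column names as the post's data set
demo <- data.frame(
  LungCap = c(6.5, 10.1, 9.6, 11.1, 4.8, 6.2, 7.9, 8.4),
  Gender  = factor(c("male", "female", "female", "male",
                     "male", "female", "female", "male"))
)

# response ~ grouping: one box of LungCap per level of Gender.
# data = demo tells boxplot() where to find the columns.
b <- boxplot(LungCap ~ Gender, data = demo)

b$names  # group labels, in the order the boxes are drawn
b$n      # number of observations in each group
```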

And we can add labels, rotate y axis values, and add a title:

boxplot(LungCap ~ Gender,
        main = "Boxplot of Lung Capacity by Gender",
        xlab =  "Gender",
        ylab = "Lung Capacity",
        las = 1)

You can refer to our text for information on reading and interpreting boxplots.
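
As a complement to the text, it can help to see the numbers behind the picture. The sketch below, on simulated stand-in values (cap_sim is a hypothetical name, not the real LungCap column), asks “boxplot” to return what it would draw instead of plotting it.

```r
set.seed(2)  # make the simulated values reproducible

# Simulated stand-in values (assumption: not the real LungCap column)
cap_sim <- rnorm(100, mean = 8, sd = 2.5)

# plot = FALSE returns the numbers boxplot() would draw, without plotting
b <- boxplot(cap_sim, plot = FALSE)

b$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
b$out    # points beyond the whiskers, drawn individually as outliers
```

The hinges are approximately the first and third quartiles, so the box itself spans roughly the middle half of the data.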