Tables into Plots Tutorial

by Kelly Jedd and Alyssa Sinner

This tutorial explains how to turn a table from an empirical article into a plot showing the same data, using ggplot2 in RStudio.

The original table is Table 2. Focused and Casual Attention, Mean (Standard Deviation) Duration in Seconds per Minute of Free Play in:

Lawson, K.R., & Ruff, H.A. (2004). Early focused attention predicts outcome for children born prematurely. Developmental and Behavioral Pediatrics, 25(6), 399-406.

Part One: Create an Excel spreadsheet based on the data in the original table.

  1. In the original table, two types of attention were each measured at three different time points. The table included means and standard deviations. To capture these variables, create an Excel spreadsheet with four columns and six rows of data, plus a header row for the column labels.

  2. Label the first column “Age”, and enter each time point in months (in this case, 7, 24, and 36). Because there are two types of attention, enter these values again for a total of six rows.

  3. Label the second column “Seconds” and enter the mean values listed under focused attention and then casual attention.

  4. The third column will include a new variable that is dummy coded for type of attention. Label the column “Type” and enter “0”s for focused attention values and “1”s for casual attention values.

  5. The final column will include standard deviation values for each mean. Label the column “SD” and enter the standard deviation values.

  6. Save the Excel sheet in .csv format.

  7. The Excel table should look like this:

    Age| Seconds | Type | SD
    ---|---------|------|----
    7  |   4.2   |   0  | 3.8
    24 |   9.1   |   0  | 9.1
    36 |  13.5   |   0  | 10.2
    7  |  25.3   |   1  | 8.7
    24 |  34.5   |   1  | 9.4
    36 |  38.6   |   1  | 8.6
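
As an alternative to typing the values into Excel, the same table can be built directly in R and written to a .csv file. The following is a minimal sketch under that assumption; the file name “attention.csv” is a hypothetical choice and not part of the original steps.

```r
# Values transcribed from the table above.
att <- data.frame(
  Age     = c(7, 24, 36, 7, 24, 36),               # age in months
  Seconds = c(4.2, 9.1, 13.5, 25.3, 34.5, 38.6),   # mean seconds per minute
  Type    = c(0, 0, 0, 1, 1, 1),                   # 0 = focused, 1 = casual
  SD      = c(3.8, 9.1, 10.2, 8.7, 9.4, 8.6)       # standard deviations
)

# Write the table to the working directory without row names, so the file
# has exactly the four columns shown above.
# "attention.csv" is a hypothetical file name.
write.csv(att, "attention.csv", row.names = FALSE)
```

Setting row.names = FALSE keeps R from adding an extra unnamed column of row numbers to the file.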
    

Part Two: Import data into RStudio and create plot

  1. In RStudio, click on “Import Dataset” and choose “From Text File” (labeled “From Text (base)” in newer versions of RStudio). Choose the .csv file in which you saved the table. In the window that previews your data, name your data frame “att”.

  2. Check that the data has been imported properly by using the head function, which prints the first six rows by default (here, the entire table):

    head(att)
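
The import can also be scripted instead of done through the RStudio menus. The sketch below assumes the table was saved as “attention.csv” in the working directory; that file name is hypothetical, so substitute your own path. To keep the example runnable on its own, it writes the file first if it does not already exist.

```r
# "attention.csv" is a hypothetical file name; replace it with your own path.
csv_path <- "attention.csv"

# For demonstration only: create the file if it is not already there.
if (!file.exists(csv_path)) {
  writeLines(c(
    "Age,Seconds,Type,SD",
    "7,4.2,0,3.8",
    "24,9.1,0,9.1",
    "36,13.5,0,10.2",
    "7,25.3,1,8.7",
    "24,34.5,1,9.4",
    "36,38.6,1,8.6"
  ), csv_path)
}

# Read the file into a data frame named att; the first line is the header.
att <- read.csv(csv_path)

# Check column names and types, then print the rows.
str(att)
head(att)
```

The str output is a quick way to confirm that all four columns were read as numbers and that the column names match the labels used in the spreadsheet.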
    
  3. Load ggplot2:

    library(ggplot2)
    
  4. To begin making the plot, choose the data frame and map its variables to the aesthetics of the plot. Use the “att” data frame, put “Age” on the x axis, and put “Seconds” on the y axis. Because two lines are needed to represent the two types of attention, color by “Type”, which should be read as a factor so that each level gets its own discrete color. The plus sign at the end of the line tells R that the plot command continues on the next line.

    ggplot(data=att, aes(x=Age, y=Seconds, color=factor(Type)))+
    
  5. Now add the plot. In this case, because means are being compared, add a scatterplot with different shapes based on the factor. To do this use the geom_point function and specify the size you want and the kind of shapes. As with the colors, shapes should be based on “Type” while treating this variable as a factor.

    geom_point(size=6, aes(shape=factor(Type)))+
    
  6. To choose colors for your plot, use the scale_color_brewer function, which provides palettes designed for contrasting data. For more information on ColorBrewer, see http://colorbrewer2.org/. Choose a palette (in this case “Set1”), name the legend “Type of Attention”, and label the two levels of attention in the order of their codes: “Focused” for 0 and “Casual” for 1.

    scale_color_brewer(palette="Set1", name= "Type of Attention", labels=c("Focused", "Casual"))+
    
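  7. Next, do the same for the shape scale so that the two types of attention are represented by different shapes. Giving it the same name and labels as the color scale lets ggplot2 combine the two into a single legend: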

    scale_shape_discrete(name= "Type of Attention", labels=c("Focused", "Casual"))+
    
  8. To make the background white and therefore make the plot easier to read, use the theme function:

    theme_bw()+
    
  9. Add lines and specify the size:

    geom_line(size=1) +
    
  10. Label the y axis:

    ylab("Seconds per Minute of Free Play") +
    
  11. Label the x axis:

    xlab("Age") +
    
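  12. To make the x axis have ticks at the specific points represented by the data, create a new data frame named “ticks” which will include a vector “t” of the time points you want. This is a separate statement and not part of the chain of plus signs that builds the plot, so run it on its own before starting the ggplot call in step 4.

    ticks <- data.frame(t = c(7, 24, 36))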
    
  13. After creating the new data frame, specify that the x axis scale is continuous with breaks at the time points in vector “t”, and give a label for each time point.

    scale_x_continuous(breaks = ticks$t, labels=c("7mo","2yr","3yr"))+
    
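  14. To add error bars, use the geom_errorbar function and specify the minimum and maximum values of the bars and the width. Here the bars extend one standard deviation below and above each mean, using the “SD” column. R is case sensitive, so the name must match the column label exactly.

    geom_errorbar(aes(ymin=Seconds-SD, ymax=Seconds+SD), width=.5) +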
    
  15. Add a title to your plot:

    labs(title="Mean Duration and Standard Deviations of Focused and Casual Attention")
    
  16. The final plot should look like this:

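[Figure: line plot of mean seconds per minute of free play at 7 months, 2 years, and 3 years, with separate colors and point shapes for focused and casual attention and error bars showing standard deviations.]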
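
The steps in Part Two can be assembled into a single script, sketched below. It assumes ggplot2 is installed, and it enters the table values directly so that it runs without the .csv file. The object name “p” is an illustrative choice and not part of the steps above.

```r
library(ggplot2)

# Data from the table in Part One, entered directly for a self-contained run.
att <- data.frame(
  Age     = c(7, 24, 36, 7, 24, 36),               # age in months
  Seconds = c(4.2, 9.1, 13.5, 25.3, 34.5, 38.6),   # mean seconds per minute
  Type    = c(0, 0, 0, 1, 1, 1),                   # 0 = focused, 1 = casual
  SD      = c(3.8, 9.1, 10.2, 8.7, 9.4, 8.6)       # standard deviations
)

# Tick positions for the x axis, defined before the plot call.
ticks <- data.frame(t = c(7, 24, 36))

# Build the plot as one chained command; "p" is an illustrative name.
p <- ggplot(data = att, aes(x = Age, y = Seconds, color = factor(Type))) +
  geom_point(size = 6, aes(shape = factor(Type))) +
  scale_color_brewer(palette = "Set1", name = "Type of Attention",
                     labels = c("Focused", "Casual")) +
  scale_shape_discrete(name = "Type of Attention",
                       labels = c("Focused", "Casual")) +
  theme_bw() +
  # Newer ggplot2 versions prefer linewidth = 1 here and may warn about size.
  geom_line(size = 1) +
  ylab("Seconds per Minute of Free Play") +
  xlab("Age") +
  scale_x_continuous(breaks = ticks$t, labels = c("7mo", "2yr", "3yr")) +
  # Error bars span one standard deviation below and above each mean.
  geom_errorbar(aes(ymin = Seconds - SD, ymax = Seconds + SD), width = 0.5) +
  labs(title = "Mean Duration and Standard Deviations of Focused and Casual Attention")

# Draw the plot.
print(p)
```

Storing the plot in an object before printing makes it easy to add further layers later or to save the result with ggsave.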