Introduction

Hi

I am Duong, a SAS programmer with many years of experience in the pharmaceutical industry, and I have been teaching myself R over the past year. I have produced some graphs with the ggplot2 package and would now like to share them, along with my thoughts and experiences. If you are about to set off on a similar journey, these results and thoughts may be helpful. And if you are already an experienced R user, you might be intrigued by some of these results too.

While I found the R language quite weird and very frustrating at first, it has become more and more natural to me. In fact, I am now an R enthusiast! I guess that, just like learning anything new, the first stage is often the most difficult hurdle to overcome, so remember to be easy on yourself when you struggle at this first hurdle. You are not the only one!

R is often touted as great for graphs and data visualisation (this is not to say that it is rubbish in other areas, by the way!), so one of my questions at the start of my journey was: how do I create the graphs that typically appear in a clinical study report? I have spent many hours on this and still feel that I have only scratched the surface of the topic. Nevertheless, I am very pleased with the results so far (see below) and I hope to add more types of graphs in time. By the way, this blog was created with R Markdown, another very popular R package for creating web pages, dashboards, blogs, reproducible reports and so on.

Clinical Graphs with ggplot2

These are some of the graphs that are typical in a clinical study report. They were all created with the R ggplot2 package and associated packages such as grid, cowplot and patchwork, to name a few. The output PNG images were then wrapped inside SAS PROC REPORT and output again via the Output Delivery System (ODS) in RTF or PDF format, with the typical header and footer labels that you normally see in a clinical report output. The final layout, with headers and footers, looks just like that of table and listing outputs (Figure 1). Note that the numbers on these graphs are nonsensical. They are for illustration only.
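As a rough sketch of the R side of this workflow, the snippet below draws a simple line plot and saves it as a PNG file with fixed dimensions, ready to be picked up by a reporting step. The data values, variable names, labels and file name are all made up for illustration; they are not taken from the graphs shown here.

```r
library(ggplot2)

# Made-up data for illustration only: mean change from baseline by visit and arm
dat <- data.frame(
  visit = rep(c(0, 4, 8, 12), times = 2),
  arm   = rep(c("Placebo", "Active"), each = 4),
  mean  = c(0, -0.5, -0.8, -1.0, 0, -1.2, -2.1, -2.9)
)

# Build the plot: one line per treatment arm
p <- ggplot(dat, aes(x = visit, y = mean, colour = arm)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = c(0, 4, 8, 12)) +
  labs(x = "Week", y = "Mean change from baseline", colour = "Treatment") +
  theme_bw()

# Save as PNG with an explicit size and resolution
# (hypothetical file name; written to a temporary folder here)
out_file <- file.path(tempdir(), "figure_1.png")
ggsave(out_file, plot = p, width = 9, height = 5, units = "in", dpi = 300)
```

Fixing the width, height and resolution in ggsave() means the image has the same size every time it is regenerated, which helps when it has to fit inside a page layout with headers and footers.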

The whole process is one smooth sequence of steps expressed in a SAS program that calls a 'SAS to R' interface. This interface can be SAS PROC IML or a creation of your own, such as those described by Holland [1] and Wei [2]. So, the structure of a SAS program that creates a graph with this method is roughly: call R through the interface to draw the graph and save it as a PNG image, then wrap the image in PROC REPORT and write it out through ODS.

Figure 1

Where to Start

If you are wondering where to start, I would suggest working through these points in order.

1. Install R, then RStudio
2. Familiarise yourself with the RStudio IDE
3. Install and load packages
4. Create data frames and start playing with the data with the dplyr package
5. Understand the pipe symbol (%>%)
6. Start creating plots with ggplot2
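To give a flavour of steps 3 to 6, here is a minimal sketch. The data frame and its variable names are invented for illustration. The key point is step 5: the pipe passes the result on its left as the first argument of the function on its right, so the nested and piped versions below give the same result.

```r
# Step 3: install once (commented out), then load in every session
# install.packages(c("dplyr", "ggplot2"))
library(dplyr)
library(ggplot2)

# Step 4: a small made-up data frame
vitals <- data.frame(
  subject = c("001", "002", "003", "004", "005", "006"),
  arm     = c("Placebo", "Placebo", "Placebo", "Active", "Active", "Active"),
  sbp     = c(128, 135, 122, 118, 125, 121)
)

# Step 5: the same summary written twice
# Nested calls are read from the inside out
nested <- summarise(group_by(vitals, arm), mean_sbp = mean(sbp))

# Piped calls are read from top to bottom
piped <- vitals %>%
  group_by(arm) %>%
  summarise(mean_sbp = mean(sbp))

# Step 6: a first plot of the summary
p <- ggplot(piped, aes(x = arm, y = mean_sbp)) +
  geom_col() +
  labs(x = "Treatment", y = "Mean systolic BP (mmHg)")
```

Typing p at the console (or calling print(p)) displays the plot in the RStudio Plots pane.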

You can search Google and YouTube for these topics. If you are familiar with SAS, you should be able to transfer your SAS knowledge to some of dplyr's functionality. This package is roughly the 'equivalent' of the SAS data step plus some of the basic procedures, such as PROC SUMMARY and PROC SORT, combined. In SAS, functionality tends to be separated by topic into procedures. In R, a single package can cover several different kinds of task. For example, the dplyr package lets you mutate, arrange (sort), summarise and select variables, among other things.
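The sketch below shows how this might look in practice. The data and variable names are made up, and the SAS equivalents in the comments are only loose analogies, not exact translations.

```r
library(dplyr)

# Made-up lab data for illustration only
lab_data <- data.frame(
  subject = c("001", "002", "003", "004"),
  arm     = c("Active", "Placebo", "Active", "Placebo"),
  base    = c(5.0, 6.0, 7.0, 8.0),
  week12  = c(4.0, 6.5, 5.0, 8.5)
)

changes <- lab_data %>%
  mutate(chg = week12 - base) %>%   # like an assignment in a data step
  select(subject, arm, chg) %>%     # like a KEEP statement
  arrange(arm, subject)             # like PROC SORT with a BY statement

by_arm <- changes %>%
  group_by(arm) %>%                 # like a CLASS statement
  summarise(n = n(), mean_chg = mean(chg))  # like PROC SUMMARY
```

Each verb does one job, and the pipe chains them together, so the whole block reads much like a data step followed by a sort and a summary.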

References

[1] Holland P (2008). SAS to R to SAS. https://www.lexjansen.com/phuse/2005/cc/cc03.pdf

[2] Wei X (2012). %PROC_R: A SAS Macro that Enables Native R Programming in the Base SAS Environment. https://www.jstatsoft.org/article/view/v046c02

Contact

Email: