The purpose of this tutorial (which is written entirely in R!) is to get you started with R and to show you how to import a probabilistic sensitivity analysis dataset into the R environment. In the short course we will analyze this dataset and produce various sensitivity analysis tables and figures.

Install R and RStudio

We will use two programs: R and RStudio.

  1. R is a free programming language that you can download and install from the Comprehensive R Archive Network (CRAN).
  2. RStudio is a user-friendly interface that interacts with R. It is also free and can be downloaded from RStudio Download

Once you have installed R and RStudio, open RStudio. The first screen you see should look like this:

RStudio Environment

RStudio is an integrated environment that gives you access to the various functions of R from a single interface. Here we describe the four window panes shown above:

  1. Console window pane
  2. Editor or script window pane
  3. Environment and history window pane
  4. Utility window pane that includes the Files, Plots, Packages, and Help tabs

1. Console Window

In this window, you can execute single commands and see the output directly.
For example, type 2 + 2 and press Enter.

2 + 2
## [1] 4

You can also tell R to print a string:

print("Hello World!")
## [1] "Hello World!"

Next, we define a new variable by using the operator <- and the combine c() function:

a <- c(1, 2, 3)
a
## [1] 1 2 3

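Note that you can also use the operator = for assignment, but <- is the more common convention in R and is unambiguous everywhere: inside a function call, = names an argument rather than creating a variable.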

a=c(1,2,3)
a
## [1] 1 2 3
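
The difference between the two operators only shows up inside a function call. The following is a minimal sketch using the base function median(); the names x and y are used here purely for illustration.

```r
# Inside a function call, = names an argument and creates no variable
median(x = 1:10)
"x" %in% ls()      # checks whether a variable called x now exists

# Inside a function call, <- creates the variable y and passes its value on
median(y <- 1:10)
y
```

At the top level of a script or at the console, the two operators behave the same way.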

No matter which operator you choose to use, object a is created as a vector (or an array) that contains 3 elements.
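
As a small illustration of working with the vector a, the sketch below uses only base R functions (length, square-bracket indexing, sum, and element-wise arithmetic). The name b is introduced just for this example.

```r
# Recreate the vector from the text
a <- c(1, 2, 3)

length(a)   # number of elements in a
a[2]        # second element (R counts from 1)
sum(a)      # sum of all elements

# Arithmetic is applied element by element
b <- a * 2
b
```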

2. Editor or Script Window

This is where you can write a new R script or edit a previously saved one.

In this window you can write, modify, and save code. To run code, select a portion of it (or place the cursor on a single line) and either click the Run button at the top of the pane or press Ctrl + Enter (Cmd + Enter on Mac OS X). To run the entire script file, select all of it first or use the Source button.

Note that lines that begin with the symbol # are skipped by the R interpreter. This is useful for documenting your R code.
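
A short sketch of how comments look in a script; the variable name n_samples is hypothetical and chosen only for this example.

```r
# This whole line is a comment and is skipped by R
n_samples <- 10   # text after a # on the same line is skipped as well
n_samples
```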

3. Environment and History Window

The Environment tab shows the objects (e.g., variables) that R stores in memory. In this case, R has the object a in its memory.

The History tab shows the commands that the user has executed.
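
What the Environment tab displays can also be inspected from the console. The sketch below uses the base functions ls() and rm(), and recreates the vector a so that it runs on its own.

```r
a <- c(1, 2, 3)

ls()             # names of the objects currently in memory
"a" %in% ls()    # is a among them?

rm(a)            # remove a from memory
"a" %in% ls()    # a is no longer listed
```

After rm(a), the object also disappears from the Environment tab.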

4. Utility window pane (Files/Plots/Packages/Help tabs)

The most important tabs in this window are Plots, Packages, and Help. The Plots tab shows the plots the user produces, whether from the script file or from the console window. It keeps a history of the plots, which you can browse with the navigation arrows at the top. The Plots tab looks like:

The Packages tab shows a list of R packages already installed on the machine. To load any of these packages, you can check the checkbox next to the desired package. You can also install new packages as detailed below. Once new packages are installed, they become available in the Packages tab.
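
Checking a box in the Packages tab has the same effect as calling library() at the console. A minimal sketch, using the tools package only because it ships with R and needs no installation (it is not one of the course packages):

```r
# Equivalent to checking the box next to 'tools' in the Packages tab
library(tools)

# Loaded packages appear on R's search path
"package:tools" %in% search()
```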

Finally, the Help tab displays the documentation for any function the user looks up. In this case, the Help tab shows information about the c() function used to create vector a.
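
Help pages can be opened from the console as well. The sketch below lists the two standard ways of opening a page as comments and then runs two related base functions, args() and apropos(); read.csv is used here only as an example function.

```r
# Typing either of these at the console opens the page for c() in the Help tab:
#   help("c")
#   ?c

# args() prints a function's arguments without leaving the console
args(read.csv)

# apropos() lists the functions whose names match a pattern
apropos("^read\\.csv")
```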

Install and load required packages

We will start by installing the packages that you will need for the short course from the Comprehensive R Archive Network (CRAN). You only need to install these packages once. There are at least two ways to install a package: (1) use the Packages tab, click the Install button, and enter the package names listed below (stargazer, xtable, etc.) individually; or (2) instruct R to install the package by typing install.packages('packagename') at the console. You need internet access and may need administrator privileges on your computer to install packages.

install.packages("stargazer")
install.packages("xtable")
install.packages("gdata")
install.packages("xlsx")
install.packages("ggplot2")
install.packages("reshape2")
install.packages("reshape")
install.packages("VGAM")
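
If you prefer a single command, install.packages() also accepts a vector of package names. The following is a sketch under that approach; the variable names pkgs and missing_pkgs are introduced here for illustration, and the install line is left commented out so that nothing is downloaded by accident.

```r
# Packages used in the short course (names taken from the list above)
pkgs <- c("stargazer", "xtable", "gdata", "xlsx",
          "ggplot2", "reshape2", "reshape", "VGAM")

# Names of the course packages that are not yet installed on this machine
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
missing_pkgs

# Uncomment to install only the missing ones (requires internet access)
# install.packages(missing_pkgs)
```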

Example loading PSA data from Excel to R

In this example, we load a probabilistic sensitivity analysis (PSA) dataset saved in the Excel file PSA.xls into R. The dataset consists of 10,000 rows (PSA samples) and 15 columns (an index column, 8 input parameter columns, and 6 output columns). The outputs are the costs and effectiveness of the 3 treatment alternatives in a Markov model that will be used in the course.

First, it is a good idea to create a new folder (e.g., c:\\Test in Windows or /Users/Test in Mac OS X), and copy and paste the .xls file into this folder. Next, set the working directory with the command setwd() to tell R where to find PSA.xls.

setwd("c:\\Test") # Windows
setwd("/Users/Test") # Mac OS X

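In Windows, it is important to use two backslashes \\ instead of a single backslash \, because R treats a single backslash inside a string as the start of an escape sequence (for example, \n means a new line). Forward slashes, as in c:/Test, also work in Windows.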
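
The sketch below illustrates how R stores such paths, using the base functions nchar(), file.path(), and getwd(). The variable name xls_path is hypothetical, and the folder does not need to exist for the code to run.

```r
# The doubled backslash typed in the code is stored as one character
nchar("c:\\Test")   # counts c, :, \, T, e, s, t

# file.path() builds a path from its parts using forward slashes
xls_path <- file.path("c:", "Test", "PSA.xls")
xls_path

# getwd() reports the current working directory
getwd()
```

Running getwd() right after setwd() is a quick way to confirm that the working directory was changed as intended.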

Finally, load the xlsx package and import the PSA dataset:

#Package to read files from Excel
library(xlsx)
Sim <- read.xlsx("PSA.xls",1,endRow=101) #endRow determines the last row to be imported from the .xls file

The first line loads the xlsx package, which you installed above and which provides the read.xlsx function needed to load the PSA dataset from Excel. The second line reads the PSA data from the first sheet of the PSA.xls file; change the second argument to read data from other sheets. We also use the argument endRow=101 to speed up this demo by stopping the import at row 101 of the sheet.

The first six samples from our simulation file look like:

head(Sim)
##   Iteration Chemo_Cost Chemo_Eff Radio_Cost Radio_Eff Surg_Cost Surg_Eff
## 1         1      33683    12.728       8942    12.131     29706   11.263
## 2         2      30131     9.853      19705     9.855     28565   10.194
## 3         3      26839    10.053      13250    10.012     22430    9.832
## 4         4      28671    11.678      21505    11.014     28498   10.857
## 5         5      31540    10.759      13163    10.754     12835   10.575
## 6         6      32618    12.310      17060    11.816     42954   11.984
##   pFailChemo pFailRadio pFailSurg pDieSurg muDieCancer cChemo cRadio cSurg
## 1     0.3910     0.5008   0.12638  0.09894      0.1128  22569   5307 29706
## 2     0.4250     0.4176   0.03604  0.08764      0.2854  21018  13925 28565
## 3     0.4576     0.4546   0.04334  0.13891      0.2573  17874   8927 22430
## 4     0.4742     0.5574   0.04865  0.14257      0.1505  17687  12215 28498
## 5     0.4835     0.4705   0.05308  0.11785      0.2009  19810   8492 12835
## 6     0.4034     0.4864   0.09268  0.04084      0.1335  21747  10426 42954
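
Once a dataset is loaded, a few base functions give a quick overview of it. The sketch below does not read PSA.xls; it uses a stand-in data frame, here called Sim_demo (a hypothetical name), typed in from the first three rows and four of the columns printed above.

```r
# Stand-in for Sim: three rows and four columns copied from the output above
Sim_demo <- data.frame(
  Iteration  = 1:3,
  Chemo_Cost = c(33683, 30131, 26839),
  Chemo_Eff  = c(12.728, 9.853, 10.053),
  pFailChemo = c(0.3910, 0.4250, 0.4576)
)

dim(Sim_demo)     # number of rows and columns
names(Sim_demo)   # column names
str(Sim_demo)     # type and first values of each column

# Mean of every column except the index column
colMeans(Sim_demo[, -1])

# head() returns six rows by default; n changes that
head(Sim_demo, n = 2)
```

The same calls can be applied to Sim itself after the import.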

As we will discuss further in the course, R is very flexible in loading various flat file formats. For example, you can load comma-separated values (CSV) data from a .csv file using the read.csv function.

Sim <- read.csv("PSA.csv",header=TRUE)

You can also use read.table for .txt files.
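
To try both functions without the course files, the sketch below writes a small made-up table to temporary files and reads it back. The data frame demo and its values are hypothetical and are not part of the PSA dataset.

```r
# A small made-up table written to a temporary .csv file
demo <- data.frame(Iteration = 1:3, Cost = c(100.5, 200, 150.25))
csv_file <- tempfile(fileext = ".csv")
write.csv(demo, csv_file, row.names = FALSE)

# header = TRUE takes the column names from the first line of the file
from_csv <- read.csv(csv_file, header = TRUE)

# The same table as a tab-delimited .txt file, read with read.table
txt_file <- tempfile(fileext = ".txt")
write.table(demo, txt_file, sep = "\t", row.names = FALSE, quote = FALSE)
from_txt <- read.table(txt_file, header = TRUE, sep = "\t")

# Both round trips should reproduce the original table
all.equal(demo, from_csv)
all.equal(demo, from_txt)
```

Unlike read.csv, read.table does not assume a header row or a particular separator, so both are stated explicitly here.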

Additional R tutorials

R has an increasingly large community of developers and users, and a large number of tutorials are available. A simple and useful tutorial to get you started is YaRI, written by Andreas Handel. If you are already familiar with another programming language, such as MATLAB, David Hiebeler has a great MATLAB®/R Reference.