To begin, recall an article whose tables gave you difficulty while reading. The article I chose, and use in this tutorial, was of personal interest to me because I like to think about the interface between medical research findings, physician understanding and implementation of those findings, and patient acceptance (or not) and understanding. I know a few of the authors of the paper, and the lead author is a mentor to me. I hear Jim has a strong preference for tables of numbers over displaying information in plots. Needless to say, my preference is not the same!
This tutorial will walk you through both the thought process of deciding on a good way to display the information in the tables and the R code needed to build that display.
The original article is quite old, but luckily available through the interwebs. Be sure to log into the U of MN library system to gain access to the U of MN journal holdings. Neaton J, Wentworth D, Rhame F, Hogan C, Abrams D, Deyton, and CPCRA (1994) Methods of Studying Interventions: Considerations in Choice of a Clinical Endpoint for AIDS Clinical Trials. Statistics in Medicine, 13: 2107-2125.
The overriding goal of the paper by Neaton and colleagues is not completely contained in Tables I and II, but these were the tables I had difficulty following, and thus they are good fodder for this tutorial.

The primary purpose of Table II was to express the subjective assessments of the severity of disease progression by both physicians and patients. This is valuable because estimates of risk of death alone (Table I) ignore the intrinsic differences in quality of life associated with the different opportunistic events. Additionally, the table shows the degree of agreement/disagreement between physician and patient ratings. Finally, the table indicates rank correlations with RR and p-values associated with the comparison of BLANK to BLANK. What? I am still not sure!
The tables were engaging; however, it is difficult to flip between Table I and Table II. Also, whether or not the patients and physicians agreed on severity rankings is not completely transparent. Most importantly, the relationship between the severity rankings and the relative risk of death was lost between the tables, even though the authors listed the opportunistic infections in order of risk of death. The magnitude of the differences in relative risk between diseases shown in Table I was not carried over to Table II.
Our goal is to develop a plot to visualize the agreement between the severity rankings given by physicians and patients, while indicating the magnitude of the relative risk of death for those with the event compared to those without. Additionally, by comparing the color scale and the position of the disease on the scatterplot, a viewer may be alerted to a discrepancy between the severity ranking and the estimated risk of death. This may indicate an opportunistic disease that has serious quality of life implications, or a potential disconnect between the risk of death for the event and the physicians' or patients' perception of that risk.
Recording Data
Raw data for medical journal articles are rarely available, but the data displayed in the tables are enough to get started. Using a spreadsheet program, enter the column headings for the variables of interest. For this tutorial, these are Event, RR, text.RR, CI, Phys.score, Patient.score.
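If you would rather start the spreadsheet from R, the following is a minimal sketch of an empty template with the six headings. The column types are my assumptions (for example, treating CI and text.RR as text), and the file name `composite.endpoints.template.csv` is made up for illustration. No values from the article are included; you would still type those in yourself.

```r
# Empty data frame with the six column headings used in this tutorial.
# The column types are assumptions, not something the article specifies.
template <- data.frame(
  Event         = character(),  # name of the opportunistic event
  RR            = numeric(),    # relative risk as a number
  text.RR       = character(),  # relative risk as printed text (assumed)
  CI            = character(),  # confidence interval as printed text (assumed)
  Phys.score    = numeric(),    # physician severity score
  Patient.score = numeric(),    # patient severity score
  stringsAsFactors = FALSE
)

# Write only the header row to a temporary folder (hypothetical file name).
out_file <- file.path(tempdir(), "composite.endpoints.template.csv")
write.csv(template, out_file, row.names = FALSE)

# Look at what was written: a single line holding the headings.
header_line <- readLines(out_file)
print(header_line)
```

Opening that file in a spreadsheet program gives you the headings already in place, which helps avoid typos in the column names later on.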

Importing the Data
Importing data into R can be a little tricky. If you are using Excel as a spreadsheet, you may want to save the file as a .csv file. “.csv” stands for comma-separated values. Another option is to copy and paste the data into a very basic text editor. Either way, you will need to know what separates the columns in your data file. Common separators are commas, white space, or tabs. *More info:* How to read in a file with a different delimiter
#Reading data into a dataframe in R.
aids <- read.csv("C:/Users/telke/Desktop/PhD/Epsy8252/composite.endpoints.aids.csv")
Notice a few things in the code above.
| Code Fragment | Meaning |
|---|---|
| aids<- | assigns the name aids to the working data set in R |
| read.csv | tells R to read in a comma separated value file |
The remaining R code is the path to where the data is stored.
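If your file uses a separator other than a comma, here is a rough sketch of how the same import might look. The toy data frame and the temporary file names are invented purely for illustration and have nothing to do with the article; `read.delim` and the `sep` argument of `read.table` are part of base R.

```r
# Toy data, invented for illustration only (not from the article).
toy <- data.frame(id = c("a", "b", "c"),
                  value = c(1.5, 2.0, 3.25),
                  stringsAsFactors = FALSE)

# Tab-separated file: read.delim() assumes tabs and a header row.
tab_file <- file.path(tempdir(), "toy_tab.txt")
write.table(toy, tab_file, sep = "\t", row.names = FALSE, quote = FALSE)
from_tab <- read.delim(tab_file, stringsAsFactors = FALSE)

# Semicolon-separated file: name the separator with sep = ";".
semi_file <- file.path(tempdir(), "toy_semi.txt")
write.table(toy, semi_file, sep = ";", row.names = FALSE, quote = FALSE)
from_semi <- read.table(semi_file, header = TRUE, sep = ";",
                        stringsAsFactors = FALSE)

# White-space-separated file: read.table()'s default sep = "" splits on
# any run of spaces or tabs, so only header = TRUE is needed.
ws_file <- file.path(tempdir(), "toy_ws.txt")
write.table(toy, ws_file, sep = " ", row.names = FALSE, quote = FALSE)
from_ws <- read.table(ws_file, header = TRUE, stringsAsFactors = FALSE)
```

All three reads should give back the same data frame as `toy`. If a read comes back with a single wide column, the separator is probably not what you told R it was.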
Once you have read the data into R, it does not automatically display. There are a few options to see the data.
#Viewing your data
View(aids)
head(aids)
tail(aids)
View() will show you the entire data frame (note the capital V). head() will show you the first six rows of the data by default, and tail() will show you the last six. Be sure to take a look at the data, so you know R has read the data as you thought it would.
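Beyond eyeballing the rows, it can help to check the column names and types directly. The sketch below uses a stand-in data frame called `aids_demo` (a hypothetical name) with the tutorial's six headings and no rows, since the real file lives on my machine; the column types are assumptions. With your own data you would run the same checks on `aids`.

```r
# Column headings the tutorial asks you to enter in the spreadsheet.
expected <- c("Event", "RR", "text.RR", "CI", "Phys.score", "Patient.score")

# Stand-in for the imported data: right headings, no rows.
# The column types here are assumptions for illustration.
aids_demo <- data.frame(
  Event = character(), RR = numeric(), text.RR = character(),
  CI = character(), Phys.score = numeric(), Patient.score = numeric(),
  stringsAsFactors = FALSE
)

# str() prints one line per column with its name and type.
str(aids_demo)

# Headings that were expected but did not arrive, and vice versa.
missing_cols <- setdiff(expected, names(aids_demo))
extra_cols   <- setdiff(names(aids_demo), expected)

# The columns we plan to plot as numbers should actually be numeric.
num_cols   <- c("RR", "Phys.score", "Patient.score")
is_numeric <- sapply(aids_demo[num_cols], is.numeric)
print(is_numeric)
```

A column that should be numeric but shows up as character usually means a stray piece of text ended up in one of its cells in the spreadsheet.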