Commands used



R Objects



Differences between objects

Different objects are used and show up in differnt contexts.



The Data

We’ll use the following data to explore R objects.

Motulsky 2nd Ed, Chapter 30, page 220, Table 30.1. Maximal relaxaction of muscle strips of old and young rat bladders stimualted w/ high concentrations of nonrepinephrine (Frazier et al 2006). Response variable is %E.max





The assignment operator “<-” makes object

The code above has made two objects. We can use several commands to learn about these objects.

  • is(): what an object is, ie, vector, matrix, list, dataframe
  • length():how long an object is; only works with vectors and lists, not dataframes!
  • dim(): how long AND how wide and object is; doesn’t work with vectors, only dataframes and matrices :(

is()

What is our “old.E.max” object?

is(old.E.max)
## [1] "numeric" "vector"


Its a vector, containing numeric data.

What if we made a vector like this?

cat.variables <- c("old","old","old","old","old","old","old","old","old")


And used is()

is(cat.variables)
## [1] "character"           "vector"              "data.frameRowLabels"
## [4] "SuperClassMethod"

It tells us we have a vector, containing chracter data. Not sure why it feels the need to tell us all the other stuff…

length()

Our vector has 9 elements, or is 9 elements long.

length(old.E.max)
## [1] 9


Note that dim(), for dimension, doens’t work with vector

dim(old.E.max)
## NULL

It would be nice if it said somethign like “1 x 9” for 1 row tall and 9 elements long. So it goes.


str()

str() stands for “structure”.

It summarizes info about an object; I find it most useful for looking at lists.
* If our vector here was really really long, str() would only show the first part of the vector

str(old.E.max)
##  num [1:9] 20.8 2.8 50 33.3 29.4 38.9 29.4 52.6 14.3


c()

  • We typically use c() to gather together things like numbers, as we did to make our objects above.
  • note: this is lower case “c”!
  • Uppercase is something else
  • For me, R’s font makes it hard sometiems to tell the difference between “c” and “C”
  • If code isn’t working, one problem might be a “C” instead of a “c”

Use c() to combine two objects

old.plus.new <- c(old.E.max, young.E.max)


Look at the legnth

length(old.plus.new)
## [1] 17


Note that str() just shows us the first few digits, not all 17

str(old.plus.new)
##  num [1:17] 20.8 2.8 50 33.3 29.4 38.9 29.4 52.6 14.3 45.5 ...




You can save statistical output as an object

We can run a t.test like this, and the output goes to R’s “console”

t.test(old.E.max, young.E.max)
## 
##  Welch Two Sample t-test
## 
## data:  old.E.max and young.E.max
## t = -3.6242, df = 13.778, p-value = 0.002828
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -37.501081  -9.590586
## sample estimates:
## mean of x mean of y 
##  30.16667  53.71250


We can save the output to an object using the assignment operator

rat.t.test <- t.test(old.E.max, young.E.max)


Then call up the output like this

rat.t.test
## 
##  Welch Two Sample t-test
## 
## data:  old.E.max and young.E.max
## t = -3.6242, df = 13.778, p-value = 0.002828
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -37.501081  -9.590586
## sample estimates:
## mean of x mean of y 
##  30.16667  53.71250



How does R save statistical output

Let’s use is() and str() to see what this rat.t.test object really is

is(rat.t.test)
## [1] "htest"


  • Its an “htest” class of object. Probably means “hypothesis test”.
  • Many R functions make their own special classes of objects
  • For example, the lm() function for linear regresssion (aka “linear model”) makes an “lm” class of object
  • Many classes of objects behave like lists, as we’ll explain below

Let’s use str()

str(rat.t.test)
## List of 9
##  $ statistic  : Named num -3.62
##   ..- attr(*, "names")= chr "t"
##  $ parameter  : Named num 13.8
##   ..- attr(*, "names")= chr "df"
##  $ p.value    : num 0.00283
##  $ conf.int   : atomic [1:2] -37.5 -9.59
##   ..- attr(*, "conf.level")= num 0.95
##  $ estimate   : Named num [1:2] 30.2 53.7
##   ..- attr(*, "names")= chr [1:2] "mean of x" "mean of y"
##  $ null.value : Named num 0
##   ..- attr(*, "names")= chr "difference in means"
##  $ alternative: chr "two.sided"
##  $ method     : chr "Welch Two Sample t-test"
##  $ data.name  : chr "old.E.max and young.E.max"
##  - attr(*, "class")= chr "htest"


  • Interesting; lots of stuff.
  • THe first thing it says is “List of 9”. So its a list
  • Note that each one of these things corresponds to what is printed out as the output of t.test
  • (A peak a deepR: The difference between what str() is showing us and the organized table we get from running t.test has to do with what is known as the “print method” and how it interacts with objects of class htest.)





Lists in R

  • I used R for years without ever making a list
  • But lists are everywhere in R
  • Learning how to access lists can greatly aid in working with the output of statistical models and test
  • This is especially true when plotting
  • Components in lists can be accessed two different ways: using dollar signs $ and using square brackets [ ]
  • (the same is true for dataframes)
  • We’ll focus on using dollar signs $

If I run just the name of the object, I get all of the output of the list

rat.t.test
## 
##  Welch Two Sample t-test
## 
## data:  old.E.max and young.E.max
## t = -3.6242, df = 13.778, p-value = 0.002828
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -37.501081  -9.590586
## sample estimates:
## mean of x mean of y 
##  30.16667  53.71250


Say I want to just see the p-value. I can use the dollar sign like this

rat.t.test$p.value
## [1] 0.002828427


What about the “test statisic?”

rat.t.test$statistic
##         t 
## -3.624246


The means of each group (old vs yound)

rat.t.test$estimate
## mean of x mean of y 
##  30.16667  53.71250


Try to get the confidence intervals (“confint”; answer below) …

#Type your attempt here:

























Confidence interval from t.test object:

rat.t.test$conf.int
## [1] -37.501081  -9.590586
## attr(,"conf.level")
## [1] 0.95

Do you know what this confidence interval means in this context?




An aside about t-tests



Working with square brackets

30.16667 - 53.71250 
## [1] -23.54583


Or like this

mean(old.E.max) - mean(young.E.max)
## [1] -23.54583


Challenge: can you save the output from the mean() commands to objects, then do math on those objects? Answer below…

#Type your attempt here:

























Make objects holding the means

#old
old.mean   <- mean(old.E.max) 

#young
young.mean <- mean(young.E.max)

Subtract them

old.mean - young.mean
## [1] -23.54583



Instead of calcuating the mean and their difference by hands let’s get the info from the t.test output.

First, get the means. Recall that for some reason they are called the “estimate” Answer below…

#Type your attempt here:

























rat.t.test$estimate
## mean of x mean of y 
##  30.16667  53.71250


Save this to an object

rat.means <- rat.t.test$estimate


Take a look with length(), is() and str()

#what is this object?
is(rat.means)
## [1] "numeric" "vector"
#how long is the vector?
length(rat.means)
## [1] 2
#take a peek at it
str(rat.means)
##  Named num [1:2] 30.2 53.7
##  - attr(*, "names")= chr [1:2] "mean of x" "mean of y"


Aside: This vector also happens to have labels attached to it, which we can access with the names() command

names(rat.means)
## [1] "mean of x" "mean of y"



Square brackets for real

Now let’s use this object to calcualte the difference between the means.

  • length() told is that the vector object has length of 2.
  • Each number in the vector is an “element”
  • we can select elements using square brackets

The first element

rat.means[1]
## mean of x 
##  30.16667

The 2nd element

rat.means[2]
## mean of y 
##   53.7125

Note that it gets angry if we try to access the 5th elemetn

rat.means[5]
## <NA> 
##   NA

Difference between means using square brackets

Can you figure out how to calcualte the difference using our rat.means objects and square brackets?

#Type your answer here:

























rat.means[1] - rat.means[2]
## mean of x 
## -23.54583

Let’s save that to an object

diff.means <- rat.means[1] - rat.means[2]





Upshot: plotting output of an R statistical test

  • The best way to make a plot of the results of a statistical test or model is to plot the “Effect size” and its confidence interval
  • For a t-test, the effect size is the difference between the means
  • This is important b/c the effect size is the most direct representation of what the test is doing
  • In contrast, people often plot group means with the confidence intervals.
  • THis is fine, but isn’t representing what the test is doing
  • (We’ll talk a lot in this class about effect sizes…)

How would we take the results of this t.test and plot it?

  • Old school: Pen and paper (if it were the 1980s…)
  • What many people do: Type numbers into Excel and make boxplots :(
  • Not bad: Type numbers into new data objects and make plots in R
  • Best: Access data from the lists and plot in R :)



The “not bad way”: typing directly into R

“NOt bad way” with a bar plot

We already have our difference between means

diff.means
## mean of x 
## -23.54583


Let’s make an object with the confidence interval

ci <- c(-37.501081,-9.590586)


We can load the barplot2() function from gplots() and plot the data.
* Use library(gplots) to load the package * you can get so called “help” for barplot2 using ?barplot2 * we want to use the height =.., plot.ci = TRUE, ci.l=, and ci.u= “arguements” of the function

A very very ugly plot of just the mean

library(gplots)
## 
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
## 
##     lowess
barplot2(height = diff.means)

Try to add the rest of the info. Hint: to get the lower ci access the 1st elemtn of our confidence itnerval object, ci, using square brackets

#Your attempt:




























barplot2(height = diff.means,
         plot.ci = TRUE,
         ci.l=ci[1],
         ci.u = ci[2])

Ok, this is pretty ugly but its technically correct.

“NOt bad way” with using a single dot

  • The gplots package has a function plotCI() which plots means as dots, instead of bars, and puts a CI around them
  • NOt that the instead of ci.l= it uses li = for the lower limit
  • … and instead of ci.u = it uses ui
plotCI(x = diff.means,
       li=ci[1],       #use "li", not "ci.l"!
        ui = ci[2])

Making the plot better part 1

  • Since our hypothesis test is with respect to zero (no difference between means) we should include it as reference point.
  • Most plotting functions can take on the arguement xlim = and ylim =
  • You can play with its values to it suits your tastes
plotCI(x = diff.means,
       li=ci[1],         #use "li", not "ci.l"!!!!
        ui = ci[2],
       ylim = c(-40,5))  #y limits


Note that we can use a vector here with the ylim = arguement instead of nubmers.

#make a vector containing the plotting limits
ylimits <- c(-40,0)

#plot
plotCI(x = diff.means,
       li=ci[1],
        ui = ci[2],
       ylim = ylimits)  #use vector of values for limits



Making the plot better part 2

  • To be really specific we could plot a line at 0.0 that shows the hypothesis being tested.
  • The abline() function can plot lines at different angles.
  • the “ab” part relates to the common geometric formula for a line, y = m*x + b
  • or rather y = a*x+b
  • for just a horizontal line we can use abline(h = 0)

Make a plot with line

#make the plot
plotCI(x = diff.means,
       li=ci[1],
        ui = ci[2],
       ylim = ylimits) 

#add the line
abline(h = 0)



To change the colors of thigns we can use the col = arguement within abline. col = 2 makes a red line Try it

# Your attempt

























Dot plot with red line at zero

#make the plot
plotCI(x = diff.means,
       li=ci[1],
        ui = ci[2],
       ylim = ylimits) 

#add the line
abline(h = 0, col = 2)


  • We can change the type of line from solid to dashed using the arguement “lty” for “line type”
  • lty = 1 is a solid line,
  • lty = 2 is a dashed line


#Try it

























Answer:

#make the plot
plotCI(x = diff.means,
       li=ci[1],
        ui = ci[2],
       ylim = ylimits) 

#add the line
abline(h = 0, col = 2, lty = 2)

The best way

Recall all of this stuff is in a list from the t.test

rat.t.test
## 
##  Welch Two Sample t-test
## 
## data:  old.E.max and young.E.max
## t = -3.6242, df = 13.778, p-value = 0.002828
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -37.501081  -9.590586
## sample estimates:
## mean of x mean of y 
##  30.16667  53.71250

We can acceess components of lists using dollar signs $

rat.t.test$conf.int
## [1] -37.501081  -9.590586
## attr(,"conf.level")
## [1] 0.95

What if we want jsut the first confidence limt? We can combine using a dollar sign and brackets

rat.t.test$conf.int[1]
## [1] -37.50108

<br.

We can therefore access everything directly from out t.test output when we plot. First, let’s use our diff.means for the mean difference and access the confidence intervals directly

#make the plot
plotCI(x = diff.means,
       li= rat.t.test$conf.int[1],
        ui = rat.t.test$conf.int[2],
       ylim = ylimits) 

#add the line
abline(h = 0, col = 2, lty = 2)


This will get a big ugly, but let’s even get our mean difference from the t.test object. We’ll do this by using rat.t.test\(estimate[1] - rat.t.test\)estimate[2]

#make the plot
plotCI(x = rat.t.test$estimate[1] -
           rat.t.test$estimate[2],
       li= rat.t.test$conf.int[1],
        ui = rat.t.test$conf.int[2],
        ylim = ylimits) 

#add the line
abline(h = 0, col = 2, lty = 2)



A final fancy touch

  • Say we want to add the p-value for the test to this effect size plot.
  • We can do this with the text() command like this
#make the plot
plotCI(x = rat.t.test$estimate[1] -
           rat.t.test$estimate[2],
       li= rat.t.test$conf.int[1],
        ui = rat.t.test$conf.int[2],
        ylim = ylimits) 

#add the line
abline(h = 0, col = 2, lty = 2)

# ADD THE TEXT
text(x = 0.627,   #x coordiante
     y = -2,      #y coord
     "p = 0.002") #text: the p values



Since we know how to access stuff from R objects we can do this also be

#make the plot
plotCI(x = rat.t.test$estimate[1] -
           rat.t.test$estimate[2],
       li= rat.t.test$conf.int[1],
        ui = rat.t.test$conf.int[2],
        ylim = ylimits) 

#add the line
abline(h = 0, col = 2, lty = 2)

# ADD THE TEXT
text(x = 0.627,   #x coordiante
     y = -2,      #y coord
     rat.t.test$p.value) #text: the p values



Note that its plotted off center.
* This is annoying deafult of text(). * this is fixed using pos = 4 to plot to the right of the coordinates you define

#make the plot
plotCI(x = rat.t.test$estimate[1] -
           rat.t.test$estimate[2],
       li= rat.t.test$conf.int[1],
        ui = rat.t.test$conf.int[2],
        ylim = ylimits) 

#add the line
abline(h = 0, col = 2, lty = 2)

# ADD THE TEXT
text(x = 0.627,   #x coordiante
     y = -2,      #y coord
     pos = 4,
     rat.t.test$p.value) #text: the p values



Now maybe we should round this using the round() command * round(…, 3) will rond to 3 decimal places

#make the plot
plotCI(x = rat.t.test$estimate[1] -
           rat.t.test$estimate[2],
       li= rat.t.test$conf.int[1],
        ui = rat.t.test$conf.int[2],
        ylim = ylimits) 

#add the line
abline(h = 0, col = 2, lty = 2)

# ADD THE TEXT
text(x = 0.627,   #x coordiante
     y = -2,      #y coord
     pos = 4,
     round(rat.t.test$p.value,3)) #text: the p values



Debrief

We can…

  • learn about objects using length(), is(), str()
  • access parts of list using $ (and also brackets)
  • access parts of vetors using square brackets [ ]
  • save the output of a model / test to an object
  • access part of lists for plotting instead of copying stuff