Here are some example R commands.

I created a data file for R with all of the labels and groups predefined for participants' demographics. Wrapping the "load" command in parentheses tells R to print the name of the data table it loaded (R calls a data table a data frame).

(load("~/Desktop/w00Demographics.rdata"))

[1] "w00Dem"
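The effect of the parentheses can be tried without the real file. The sketch below is only an illustration: the data frame "toy" and its values are made up, and the file is written to a temporary location rather than the Desktop.

```r
# Toy data frame standing in for the real demographics file (values are made up)
toy <- data.frame(id = 1:3, age = c(31, 47, 62))

# Save it to a temporary .rdata file, then remove it from the workspace
tmp <- tempfile(fileext = ".rdata")
save(toy, file = tmp)
rm(toy)

# load() returns the names of the restored objects invisibly;
# the outer parentheses force that value to be printed
(loaded <- load(tmp))
```

Without the outer parentheses the same command restores the data frame silently.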

Let's see what's in the data. The summary command prints the minimum, maximum, mean, median, and quartiles for each numeric variable, and the number of participants in each group for categorical variables.

summary(w00Dem)

     HNDid                 DOB                  Age0       Age0grp     Age0med        Sex          Race       PovStat    
 Min.   :8031042901   Min.   :1939-07-28   Min.   :30.0   30-34:392   Below:1860   Women:2035   White:1522   Above:2185  
 1st Qu.:8133082301   1st Qu.:1950-11-27   1st Qu.:40.0   35-39:461   Above:1860   Men  :1685   AfrAm:2198   Below:1535  
 Median :8162502101   Median :1958-02-09   Median :48.0   40-44:509                                                      
 Mean   :8160956892   Mean   :1958-07-25   Mean   :47.7   45-39:692                                                      
 3rd Qu.:8192566076   3rd Qu.:1966-01-17   3rd Qu.:55.0   50-54:631                                                      
 Max.   :8224521902   Max.   :1978-07-21   Max.   :64.0   55-59:577                                                      
                                                          60-64:458
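To see how summary treats the two kinds of variables, here is a small sketch on simulated data. The data frame "toy" and its columns are hypothetical and are not part of the HANDLS file.

```r
# Simulated data: one numeric variable and one categorical variable (a factor)
set.seed(1)
toy <- data.frame(
  Age = sample(30:64, 20, replace = TRUE),
  Sex = factor(sample(c("Women", "Men"), 20, replace = TRUE),
               levels = c("Women", "Men"))
)

# Numeric variable: minimum, quartiles, median, mean, and maximum
summary(toy$Age)

# Factor: number of rows in each level
summary(toy$Sex)
```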

The summary command shows the distribution of values for each variable separately. What if we want to know how many women in our sample are African American?

with(w00Dem, table(Race, Sex))

       Sex
Race    Women  Men
  White   835  687
  AfrAm  1200  998
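Row and column totals, or percentages, are often easier to read than raw counts. The sketch below re-enters the four counts printed above by hand (so it runs without the data file) and is only one possible way to do this; with the real data, the same functions could be applied directly to the result of table(Race, Sex).

```r
# Re-enter the printed cross-tabulation by hand so the example is self-contained
tab <- as.table(matrix(
  c(835, 1200, 687, 998), nrow = 2,
  dimnames = list(Race = c("White", "AfrAm"), Sex = c("Women", "Men"))
))

# Add row and column totals; they should match the summary output
addmargins(tab)

# Proportion of women and men within each race group (each row sums to 1)
round(prop.table(tab, margin = 1), 3)
```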

Can you show how many men are in the below poverty status group?

You already know something about t-tests. In a t-test we ask whether the mean of a measure (e.g., age) differs between two groups. For example, do men and women have different mean ages in HANDLS?

In R there are two equivalent forms of the t-test command, but the simplest is:

with(w00Dem, t.test(Age0 ~ Sex))


    Welch Two Sample t-test

data:  Age0 by Sex
t = 0.7444, df = 3615, p-value = 0.4567
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3736  0.8310
sample estimates:
mean in group Women   mean in group Men 
              47.82               47.59

So what does this tell us? Are there statistically significant differences in the mean ages for men and women in HANDLS?

The notation with the tilde character (~) says that we are examining age (the outcome or dependent variable) as a function of sex (the predictor, the grouping factor, the independent variable).
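The other form of the command passes the two groups as separate vectors instead of a formula. The sketch below uses simulated ages (the data frame "toy" is made up, not the HANDLS data) to suggest that both forms give the same result.

```r
# Simulated ages for two groups of 50 (hypothetical values)
set.seed(42)
toy <- data.frame(
  Age = c(rnorm(50, mean = 48, sd = 9), rnorm(50, mean = 47, sd = 9)),
  Sex = factor(rep(c("Women", "Men"), each = 50), levels = c("Women", "Men"))
)

# Form 1: formula notation, outcome ~ grouping factor
t_formula <- t.test(Age ~ Sex, data = toy)

# Form 2: one vector of values per group
t_vectors <- with(toy, t.test(Age[Sex == "Women"], Age[Sex == "Men"]))

# Both forms run the same Welch test, so t and p agree
c(formula = unname(t_formula$statistic), vectors = unname(t_vectors$statistic))
c(formula = t_formula$p.value, vectors = t_vectors$p.value)
```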

What about age differences by poverty status?

with(w00Dem, t.test(Age0 ~ PovStat))


    Welch Two Sample t-test

data:  Age0 by PovStat
t = 2.491, df = 3363, p-value = 0.01277
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.1642 1.3773
sample estimates:
mean in group Above mean in group Below 
              48.03               47.26
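One way to read this output is to compare the confidence interval with the difference between the two group means. The sketch below simply retypes the numbers printed above, so it runs without the data file; the variable names are made up for illustration.

```r
# Numbers copied from the t-test output above
ci    <- c(lower = 0.1642, upper = 1.3773)
means <- c(Above = 48.03, Below = 47.26)

# Estimated difference in mean age (Above minus Below)
diff_means <- means[["Above"]] - means[["Below"]]
diff_means

# The interval is centered on the estimated difference (up to rounding)
mean(ci)

# An interval that excludes 0 goes with a p-value below 0.05
ci[["lower"]] > 0
```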

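A linear regression with a single two-level predictor is equivalent to a t-test that assumes equal variances in the two groups, so its results closely match (but need not exactly equal) the Welch test used above. Do you see some similarities between this analysis and the t.test of age by sex above?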

reg1 = lm(Age0 ~ Sex, data = w00Dem)
summary(reg1)


Call:
lm(formula = Age0 ~ Sex, data = w00Dem)

Residuals:
Initial age 
    Min      1Q  Median      3Q     Max 
-17.816  -7.588   0.412   7.412  16.412 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   47.816      0.207  230.81   <2e-16
SexMen        -0.229      0.308   -0.74     0.46

Residual standard error: 9.35 on 3718 degrees of freedom
Multiple R-squared:  0.000148,  Adjusted R-squared:  -0.00012 
F-statistic: 0.552 on 1 and 3718 DF,  p-value: 0.458
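The match between regression and the t-test becomes exact when the t-test is told to assume equal variances (var.equal = TRUE). The sketch below checks this on simulated data; the data frame "toy" and its values are hypothetical, not the HANDLS sample.

```r
# Simulated ages for two groups of unequal size (hypothetical values)
set.seed(7)
toy <- data.frame(
  Age = c(rnorm(40, mean = 48, sd = 9), rnorm(60, mean = 47, sd = 9)),
  Sex = factor(rep(c("Women", "Men"), times = c(40, 60)),
               levels = c("Women", "Men"))
)

# Classic (pooled-variance) t-test instead of the default Welch test
tt <- t.test(Age ~ Sex, data = toy, var.equal = TRUE)

# Regression of age on sex; "Women" is the reference level
fit   <- lm(Age ~ Sex, data = toy)
coefs <- summary(fit)$coefficients

# t.test reports Women minus Men, the SexMen coefficient is Men minus Women,
# so the two t values have the same size and opposite sign
c(t_test = unname(tt$statistic), regression = coefs["SexMen", "t value"])

# The p-values are identical
c(t_test = tt$p.value, regression = coefs["SexMen", "Pr(>|t|)"])
```

The intercept of the regression is the mean of the reference group (here, women), and the SexMen coefficient is the difference between the two group means.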