Global caching

If “knitting” starts to take a long time because of intensive computations or plots that are slow to generate, you can use knitr caching to improve performance. Once a cached chunk has been knit, it is skipped on subsequent knits unless the R code in the chunk changes (even adding a space counts) or its chunk options change.

Be careful when using caching: once a chunk is cached, it is not re-run, so its side effects are not available to subsequent chunks (e.g., a library loaded or options set in a cached chunk might be unavailable later in the document). To limit this risk, you can cache only the expensive chunks by setting the option cache=TRUE on each of them, rather than caching every chunk globally.
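As a minimal sketch of the global setting described above, assuming the document is knit with knitr and that this code sits in an early setup chunk, the chunk default can be changed with `opts_chunk$set()`. The variable name `cache_default` is only illustrative.

```r
# Setup code, assumed to run near the top of the document
library(knitr)

# Make caching the default for all chunks that follow
opts_chunk$set(cache = TRUE)

# Read the default back to confirm it is in effect
cache_default <- opts_chunk$get("cache")
print(cache_default)
```

An individual chunk can still override the default in its header, for example by setting cache=FALSE on a chunk that loads libraries or sets options, so that its side effects are re-created on every knit.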

Read in a dataset from CSV

behav_filepath = '~/Dropbox/Code/tutorial/objfam_groupcat_euc.csv'

# Load data
df = read.csv(behav_filepath)

# Examine data structure
summary(df)
##     Subject          Run            Trial      Prototype         Pair    
##  Min.   : 3.0   Min.   : 1.00   Min.   : 1   Min.   : 1.0   Min.   :1.0  
##  1st Qu.: 7.0   1st Qu.: 3.75   1st Qu.: 4   1st Qu.: 8.0   1st Qu.:2.0  
##  Median :12.0   Median : 6.50   Median : 8   Median :15.5   Median :3.5  
##  Mean   :13.1   Mean   : 6.50   Mean   : 8   Mean   :15.5   Mean   :3.5  
##  3rd Qu.:20.0   3rd Qu.: 9.25   3rd Qu.:12   3rd Qu.:23.0   3rd Qu.:5.0  
##  Max.   :25.0   Max.   :12.00   Max.   :15   Max.   :30.0   Max.   :6.0  
##                                                                          
##      Morph      Response         RT         EuclidDist    PercepDist  
##  Min.   :1   Min.   :1.0   Min.   :0.00   Min.   :  0   Min.   :2.55  
##  1st Qu.:1   1st Qu.:2.0   1st Qu.:1.15   1st Qu.:  0   1st Qu.:3.91  
##  Median :2   Median :4.0   Median :1.47   Median :179   Median :4.91  
##  Mean   :2   Mean   :3.5   Mean   :1.53   Mean   :183   Mean   :5.04  
##  3rd Qu.:3   3rd Qu.:5.0   3rd Qu.:1.83   3rd Qu.:307   3rd Qu.:6.36  
##  Max.   :3   Max.   :5.0   Max.   :4.00   Max.   :662   Max.   :6.91  
##              NA's   :27
str(df)
## 'data.frame':    3420 obs. of  10 variables:
##  $ Subject   : int  3 3 3 3 3 3 3 3 3 3 ...
##  $ Run       : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ Trial     : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Prototype : int  24 8 7 15 19 26 30 16 22 18 ...
##  $ Pair      : int  2 2 5 2 1 1 4 2 2 5 ...
##  $ Morph     : int  2 1 3 2 1 2 3 3 2 2 ...
##  $ Response  : num  5 2 5 2 4 3 2 2 5 5 ...
##  $ RT        : num  1.97 3.26 3.6 1.82 3.47 1.59 2.66 1.54 1.27 1.8 ...
##  $ EuclidDist: num  204 0 482 160 0 ...
##  $ PercepDist: num  4.27 6.73 3.55 5.09 6.27 4.82 3.82 4 4.82 4.36 ...

Plot data with ggplot2

## Warning: Removed 27 rows containing non-finite values (stat_boxplot).

plot of chunk plot_data

## Warning: Removed 27 rows containing non-finite values (stat_boxplot).

plot of chunk plot_data
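The plotting code for this section is not shown, so the following is only a hedged sketch of how a warning like the one above can arise. It assumes a boxplot of Response by Morph and uses a small synthetic data frame (`toy`, values invented for illustration) in place of `df`. Missing values in the plotted variable are dropped by `geom_boxplot()`, which reports them in the warning.

```r
library(ggplot2)

# Synthetic stand-in for df; values are illustrative only
toy <- data.frame(
  Morph    = rep(1:3, each = 4),
  Response = c(1, 2, 2, NA, 3, 3, 4, 4, 4, 5, 5, NA)
)

# Number of rows that a boxplot of Response would have to drop
n_missing <- sum(is.na(toy$Response))

# Assumed plot: Response by Morph level; printing p warns about removed rows
p <- ggplot(toy, aes(x = factor(Morph), y = Response)) +
  geom_boxplot() +
  labs(x = "Morph", y = "Response")

# Same plot, with missing values removed silently
p_quiet <- ggplot(toy, aes(x = factor(Morph), y = Response)) +
  geom_boxplot(na.rm = TRUE) +
  labs(x = "Morph", y = "Response")
```

In the dataset summarized earlier, Response has 27 missing values, which matches the 27 rows reported as removed.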

General linear model analysis

Does Euclidean distance vary as a function of morph level?

rs1 = lm(EuclidDist~scale(Morph, scale=FALSE), data=df)
summary(rs1)
## 
## Call:
## lm(formula = EuclidDist ~ scale(Morph, scale = FALSE), data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -213.26  -27.50   -1.95   10.56  299.29 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  182.595      0.993     184   <2e-16 ***
## scale(Morph, scale = FALSE)  180.648      1.216     149   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 58 on 3418 degrees of freedom
## Multiple R-squared:  0.866,  Adjusted R-squared:  0.866 
## F-statistic: 2.21e+04 on 1 and 3418 DF,  p-value: <2e-16
final_model = rs1
sm = summary(final_model)
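The model above centers the predictor with `scale(Morph, scale=FALSE)`. A small sketch on synthetic data (the data frame `toy` and its values are made up for illustration) shows what centering changes: the slope is unaffected, while the intercept becomes the mean of the outcome. This is consistent with the intercept of 182.595 reported above being close to the mean of EuclidDist.

```r
# Synthetic data, illustrative only (not the tutorial's dataset)
set.seed(1)
toy <- data.frame(x = rep(1:3, times = 20))
toy$y <- 10 + 5 * toy$x + rnorm(nrow(toy))

# Same model with a raw and a mean-centered predictor
raw_fit      <- lm(y ~ x, data = toy)
centered_fit <- lm(y ~ scale(x, scale = FALSE), data = toy)

# Slopes agree across the two fits
raw_slope      <- unname(coef(raw_fit)[2])
centered_slope <- unname(coef(centered_fit)[2])

# With a centered predictor, the intercept equals the mean of y
centered_intercept <- unname(coef(centered_fit)[1])
outcome_mean       <- mean(toy$y)
```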

Stats Summary:

Euclidean distance varies significantly as a function of morph level, \(R^2\) = 0.866, \(F(1, 3418) = 2.2083 \times 10^4\), t = 148.605, estimate = 180.6484.
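The values in a summary sentence like this one can be pulled from the object returned by `summary()` rather than typed by hand. The sketch below is one way to do that, using R's built-in `cars` dataset as a stand-in for `df`; the variable names are illustrative and not taken from the original document.

```r
# Fit a one-predictor model on a built-in dataset (stand-in for df)
fit <- lm(dist ~ scale(speed, scale = FALSE), data = cars)
sm  <- summary(fit)

# Pieces of the model summary used in a results sentence
r_squared <- sm$r.squared
f_value   <- unname(sm$fstatistic["value"])
df_num    <- unname(sm$fstatistic["numdf"])
df_den    <- unname(sm$fstatistic["dendf"])
t_value   <- sm$coefficients[2, "t value"]
estimate  <- sm$coefficients[2, "Estimate"]

# Assemble the sentence; the rounding choices here are arbitrary
stats_text <- sprintf(
  "R^2 = %.3f, F(%d, %d) = %.2f, t = %.3f, estimate = %.4f",
  r_squared, as.integer(df_num), as.integer(df_den),
  f_value, t_value, estimate
)
print(stats_text)
```

In an R Markdown document the same expressions can be placed in inline R code, so the reported numbers update whenever the model is refit. With a single predictor, the F statistic is the square of the slope's t value, which is a quick consistency check on the reported numbers.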

Plot

plot of chunk visualize_euclid_morph