Week 3 YAY!
1. What are some disadvantages of using experimental control (i.e. random assignment) that statistical control doesn’t have?
Answer: Experimental control is costlier and takes a lot more real-world planning and execution. Random assignment and other forms of control must be built into your experimental or data-collection mechanism from the start and cannot be applied retroactively to existing data. They may introduce delays, require additional participants to fill Latin-square or other power-related conditions, and cost money in terms of equipment, planning, RA salaries, and time.
2. What are some limitations of statistical control?
Answer: It is much less effective at ensuring the independence of your groups. While you can try to separate the effects of variables mathematically, it is usually not possible to fully disentangle them with such techniques. On the other hand, statistical control can be applied to factors that are essentially impossible to control in the lab. You also have to identify, and in many cases measure, these covariates, and human biases and lack of clarity may obscure many of them.
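In practice, statistical control usually just means adding a measured covariate to the regression model. A minimal simulated sketch (the variables here, like age, are hypothetical and not from this assignment):
set.seed(1)
age = rnorm(50, 40, 10)             #hypothetical confound
x = 0.5 * age + rnorm(50)           #predictor entangled with the confound
y = 0.3 * x + 0.4 * age + rnorm(50) #outcome driven by both
coef(lm(y ~ x))                     #x's slope absorbs some of age's effect
coef(lm(y ~ x + age))               #statistically "controlling for" age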
3. What advantages are there to random assignment?
Answer: Random assignment all but guarantees independence between your test groups and control groups. It ensures that there is no systematic relationship between participant characteristics and group membership. It may not fully erase the effects of certain undesired design features, however; e.g., it may not completely remove order-of-stimuli effects in experiments where participants see images and react to them.
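Random assignment itself is cheap to do in code; a toy sketch with base R's sample() (not the assignment's actual procedure):
set.seed(42)
#shuffle condition labels so membership is unrelated to participant order
condition = sample(rep(c("treatment", "control"), each = 6))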
4. What are the advantages of using experimental and statistical control together?
Answer: You get the advantage of more fully controlling for things within the researchers’ power to control, using random assignment and good design, plus the ability to examine and account for covariates that they cannot change. The combination does inherit a number of the disadvantages of both approaches, but it remains the best of both worlds in many cases.
5. What is cross-validation and why is it useful?
Answer: Cross-validation is the process of building your model in one sample, referred to in machine learning circles as the training data, and then testing it on an entirely different sample drawn from the same population. This has led some bright folks to simply divide a sample in half and use each half as a training or testing sample for the other. It also gave rise to k-fold cross-validation, which divides the sample into k parcels; each parcel takes a turn as the test set while the model is fit on the remaining parcels, and the results are averaged.
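A minimal k-fold sketch in base R, on simulated data since the assignment data isn't loaded yet (dat, x, and y are made up for illustration):
set.seed(123)
dat = data.frame(x = rnorm(100))
dat$y = 2 * dat$x + rnorm(100)
k = 5
folds = sample(rep(1:k, length.out = nrow(dat))) #randomly assign rows to folds
cv_error = numeric(k)
for (i in 1:k) {
  train = dat[folds != i, ] #fit on everything except fold i
  test = dat[folds == i, ]  #hold fold i out for testing
  fit = lm(y ~ x, data = train)
  cv_error[i] = mean((test$y - predict(fit, newdata = test))^2)
}
mean(cv_error) #average held-out squared error across folds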
6. Why is shrunken R useful?
Answer: These shrinkage techniques (e.g., ridge regression and the lasso) help keep residual error, runaway R2 inflation, and other issues related to adding many predictors to your model under control. They use a tuning term known as lambda to shrink the coefficients of many potentially valuable but problematic predictors, in some cases to zero or near zero, letting you get away with a simpler model that is more interpretable. However, to be frank, I didn’t fully grasp many of the finer points and use-cases, and they remain somewhat mysterious.
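A rough sketch of one such shrinkage method, the lasso, using the glmnet package on simulated data, just to see the lambda idea in action (my illustration, not part of the assignment):
#install.packages("glmnet")
library(glmnet)
set.seed(123)
x = matrix(rnorm(100 * 10), ncol = 10)     #10 candidate predictors
y = 2 * x[, 1] + 0.5 * x[, 2] + rnorm(100) #only two actually matter
cvfit = cv.glmnet(x, y, alpha = 1)         #alpha = 1 is the lasso; lambda chosen by CV
coef(cvfit, s = "lambda.min")              #many coefficients shrunk to exactly zero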
7. What are forward and backward stepwise regression?
Answer: Stepwise regression algorithms programmatically build an optimized predictor equation, either by adding terms to the model until some pre-determined stopping condition is reached, such as no remaining predictor improving the fit (forward), or by starting from a massive everything-bagel model and removing poorly performing predictors one by one (backward). These can produce powerful predictive models, at the cost of the theory-driven approach and sometimes of being able to understand the model at all.
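Base R's step() does both directions, using AIC as the pre-determined stopping condition; a quick illustration on the built-in mtcars data (not the assignment data, which hasn't been loaded yet):
full = lm(mpg ~ wt + hp + disp + cyl, data = mtcars) #everything-bagel model
back = step(full, direction = "backward", trace = 0) #prune weak predictors one by one
null = lm(mpg ~ 1, data = mtcars)                    #intercept-only starting point
fwd = step(null, scope = ~ wt + hp + disp + cyl, direction = "forward", trace = 0)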
8. Why is stepwise regression problematic?
Answer: As noted above, it does not respect the usual goal of regression in psychology, which is to figure out which predictor, based on some real-world or at least quasi-real measurement, best predicts a given outcome, usually a behavior. The algorithmic solutions can produce great equations for prediction that are nearly impossible to build a cogent theory around. They can also run deep into overfitting or underfitting, depending on the method used.
9. What are the advantages and disadvantages of using regressed change scores and gain/difference scores?
Answer: I honestly don’t understand this well. They are somehow related to regression to the mean, which is also confusing and weird. I do understand the recommended solution, which is to use more than two time points whenever possible, but the problems themselves remain unclear to me; a side-by-side sketch of the two approaches is below.
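For reference, the two approaches look like this in code (simulated pre/post data, names made up): the gain-score model regresses post minus pre on group, while the regressed-change model regresses post on group while adjusting for baseline.
set.seed(7)
pre = rnorm(30, 10, 2)             #baseline scores
group = rep(0:1, 15)               #two groups
post = 0.7 * pre + 3 + rnorm(30)   #follow-up, imperfectly related to baseline
summary(lm(I(post - pre) ~ group)) #gain/difference score approach
summary(lm(post ~ group + pre))    #regressed change: adjust for baseline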
Prompt: In this assignment we use taste ratings of French fries that may or may not have been treated. Our outcome is how potatoy the French fries taste. 6 people tasted French fries with the treatment and 6 people tasted French fries without the treatment. They rated how potatoy, buttery, grassy, rancid, and painty the fries tasted. Let’s predict potato taste from treatment condition.
Before we run any analyses, let’s load our packages and data and make sure it’s in good shape
Here are the packages we will use this week
library(QuantPsyc)
## Loading required package: boot
## Loading required package: MASS
##
## Attaching package: 'QuantPsyc'
## The following object is masked from 'package:base':
##
## norm
library(psych)
##
## Attaching package: 'psych'
## The following object is masked from 'package:boot':
##
## logit
library(pwr)
Here we load the data
ff = read.csv("hw3data.csv")
#let's see what we got here
str(ff)
## 'data.frame': 696 obs. of 9 variables:
## $ time : int 1 1 1 1 1 1 1 1 1 1 ...
## $ treatment: int 1 1 1 1 1 1 1 1 1 1 ...
## $ subject : int 3 3 10 10 15 15 16 16 19 19 ...
## $ rep : int 1 2 1 2 1 2 1 2 1 2 ...
## $ potatoy : num 2.9 14 11 9.9 1.2 8.8 9 8.2 7 13 ...
## $ buttery : num 0 0 6.4 5.9 0.1 3 2.6 4.4 3.2 0 ...
## $ grassy : num 0 0 0 2.9 0 3.6 0.4 0.3 0 3.1 ...
## $ rancid : num 0 1.1 0 2.2 1.1 1.5 0.1 1.4 4.9 4.3 ...
## $ painty : num 5.5 0 0 0 5.1 2.3 0.2 4 3.2 10.3 ...
head(ff) #show only a few rows
## time treatment subject rep potatoy buttery grassy rancid painty
## 1 1 1 3 1 2.9 0.0 0.0 0.0 5.5
## 2 1 1 3 2 14.0 0.0 0.0 1.1 0.0
## 3 1 1 10 1 11.0 6.4 0.0 0.0 0.0
## 4 1 1 10 2 9.9 5.9 2.9 2.2 0.0
## 5 1 1 15 1 1.2 0.1 0.0 1.1 5.1
## 6 1 1 15 2 8.8 3.0 3.6 1.5 2.3
#Next we have to do a few things to our data to make it work for the current research questions
#removing treatment 2 so we will compare 1 to 3
#also removing all but first time point
#and removing pre-test
ff = ff[ff$treatment != 2 & ff$time == 1 & ff$rep == 2,]
#"random" assignment - just assigning first 6 people to 1 condition and second 6 to people to other condition
#doing this by subestting since in reality each person got both treatments
length(unique(ff$subject))
## [1] 12
ff = ff[c(1:6, 19:24),]
#removing unnecessary columns
ff= ff[,c(2, 5:9)]
#dummy coding dichotomous predictor
ff$treatment[ff$treatment == 1] = 0
ff$treatment[ff$treatment == 3] = 1
#describe dataset and correlations
describe(ff)
## vars n mean sd median trimmed mad min max range skew kurtosis
## treatment 1 12 0.50 0.52 0.50 0.50 0.74 0 1.0 1.0 0.00 -2.16
## potatoy 2 12 9.75 2.55 9.35 9.80 1.63 5 14.0 9.0 0.08 -0.91
## buttery 3 12 2.69 2.62 2.75 2.53 4.00 0 7.0 7.0 0.18 -1.69
## grassy 4 12 1.51 1.59 1.15 1.44 1.70 0 3.7 3.7 0.18 -1.93
## rancid 5 12 2.16 2.08 1.45 1.99 1.56 0 6.0 6.0 0.72 -1.04
## painty 6 12 2.63 3.82 0.00 2.13 0.00 0 10.3 10.3 0.92 -0.86
## se
## treatment 0.15
## potatoy 0.74
## buttery 0.76
## grassy 0.46
## rancid 0.60
## painty 1.10
print(cor(ff)) #wrapping in print will do just that, print!
## treatment potatoy buttery grassy rancid painty
## treatment 1.00000000 -0.5598476 0.02322993 -0.09322804 -0.2640749 -0.03644808
## potatoy -0.55984762 1.0000000 -0.24299827 -0.40605343 0.2061370 -0.23152006
## buttery 0.02322993 -0.2429983 1.00000000 0.16992407 -0.0604959 -0.28119059
## grassy -0.09322804 -0.4060534 0.16992407 1.00000000 0.2127916 0.58280403
## rancid -0.26407493 0.2061370 -0.06049590 0.21279163 1.0000000 0.50401512
## painty -0.03644808 -0.2315201 -0.28119059 0.58280403 0.5040151 1.00000000
1. create a scatter plot of potato taste as a function of treatment. What does the scatterplot reveal?
#dichotomous predictor
plot(ff$potatoy ~ ff$treatment, xaxt = "n") #suppress the default axis
axis(1, at = c(0, 1)) #draw ticks at 0 and 1 only
Answer: The plot shows that people in the treatment = 1 group gave generally lower ratings of the fries’ potato-y flavor, though most ratings are fairly similar to the other group’s. There are some outliers in both groups: high for 0 and low for 1.
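A boxplot is arguably a more natural display for a dichotomous predictor; a one-line alternative:
boxplot(potatoy ~ treatment, data = ff) #distribution of ratings per condition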
2. Run a regression model predicting potatoy taste from treatment. What is the intercept and slope? What do these numbers represent?
#running model
treatmodel = lm(potatoy ~ treatment, ff)
#seeing output
summary(treatmodel)
##
## Call:
## lm(formula = potatoy ~ treatment, data = ff)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.3833 -1.4917 0.0167 1.8917 2.8833
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.1167 0.9046 12.289 2.33e-07 ***
## treatment -2.7333 1.2793 -2.137 0.0584 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.216 on 10 degrees of freedom
## Multiple R-squared: 0.3134, Adjusted R-squared: 0.2448
## F-statistic: 4.565 on 1 and 10 DF, p-value: 0.05837
Answer: The intercept in the model is 11.1167, and the b-value for treatment is -2.7333. This indicates that at treatment = 0, i.e., for the group we dummy coded 0 (originally treatment 1), the predicted potatoy rating is 11.1167, and that when treatment increases from 0 to 1, the potato-y rating decreases by 2.7333 points.
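A quick sanity check: with a single dummy-coded predictor, the intercept is just the mean of the group coded 0, and intercept plus slope is the mean of the group coded 1.
#should give 11.1167 for group 0 and 11.1167 - 2.7333 = 8.3833 for group 1
tapply(ff$potatoy, ff$treatment, mean)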
3. Calculate a standardized regression coefficient and the correlation between potatoy taste and treatment. Are they the same or different? Why?
#betas and rs
lm.beta(treatmodel)
## treatment
## -0.5598476
#some more info
cor.test(ff$potatoy, ff$treatment)
##
## Pearson's product-moment correlation
##
## data: ff$potatoy and ff$treatment
## t = -2.1366, df = 10, p-value = 0.05837
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.85805754 0.02070715
## sample estimates:
## cor
## -0.5598476
Answer: Since there is only one predictor in the model, the correlation and the standardized beta are identical. The math works out that way: the unstandardized slope is b = cov(x, y) / var(x), and standardizing multiplies it by sd(x) / sd(y), which gives beta = cov(x, y) / (sd(x) * sd(y)), exactly the Pearson correlation. With more than one predictor this equivalence breaks down.
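One way to see this is to standardize both variables by hand and refit; the slope should reproduce the correlation of -0.5598:
zpot = as.numeric(scale(ff$potatoy))   #z-score the outcome
ztrt = as.numeric(scale(ff$treatment)) #z-score the predictor
coef(lm(zpot ~ ztrt))[2]               #should equal cor(ff$potatoy, ff$treatment)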
5. If we convert the correlation to a t-value, that t-value will equal the t-value we saw where?
#can also get the r and store it in an object with:
treatr = cor(ff$treatment, ff$potatoy)
#can convert this r to a t and get the same t value in the regression model
sqrt(10*treatr^2/(1-treatr^2))
## [1] 2.13662
Answer: This is the same as the t-value computed for the treatment slope in the regression model, though the sign is lost because the formula squares r before taking the square root.
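The same conversion written to keep the sign, t = r * sqrt(df / (1 - r^2)):
treatr * sqrt(10 / (1 - treatr^2)) #gives -2.13662, matching the slope's t-value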
1. Given our sample size and an alpha level of .05, what effect size do we have 80% power to detect? Do we have low or high power?
#what effect size do we have 80% power to detect?
pwr.r.test(n = 12, sig.level = .05, power = .8)
##
## approximate correlation power calculation (arctangh transformation)
##
## n = 12
## r = 0.7183955
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
Answer: Here r is the effect size measure, the correlation itself. The function says that with n = 12 and alpha = .05, we have 80% power only for effects of r = 0.7184 or larger, which is a very strong correlation. Needing such a large effect just to reach 80% power means our power for more typical effect sizes is low (this kind of calculation is sometimes called a sensitivity analysis).
2. Given our sample size and an alpha level of .05, what power do we have for the effect size we found? This is called ____ power.
#what power do we have for the effect size we found?
pwr.r.test(n = 12, r = treatr, sig.level = .05)
##
## approximate correlation power calculation (arctangh transformation)
##
## n = 12
## r = 0.5598476
## sig.level = 0.05
## power = 0.501989
## alternative = two.sided
Answer: Feeding this function treatr, the correlation between our treatment and outcome, gives an effect size of .5598 and a power of .502. Power is a probability between 0 and 1, so this means only about a 50/50 chance of detecting an effect of this size, well below the conventional .80 target. This is called post hoc (or observed) power, since it plugs in the effect size we actually observed.
3. Given an alpha level of .05, how many cases would we need in a follow-up study to have 80% power to detect the same effect size we obtained. This is called ___ power.
#how many people would we need to have 80% power for our effect size
pwr.r.test(r = treatr, sig.level = .05, power = .8)
##
## approximate correlation power calculation (arctangh transformation)
##
## n = 21.88629
## r = 0.5598476
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
Answer: According to this function, you’d need at least 22 people. Technically 21.9 or so, but .9 of a person can’t take a taste test, so we round up. This is called a priori power, since it is used to plan the sample size of a study before running it.
1. Build a regression model in which potatoy taste is predicted by treatment and how buttery, grassy, rancid, and painty the French fries were. Rank these predictors in order of their importance (more accurately, effect consistency) based on the regression coefficients. Should you use bs or betas for this?
#adding a set of predictors
fullmodel = lm(potatoy ~ treatment + buttery + grassy + rancid + painty, ff)
summary(fullmodel)
##
## Call:
## lm(formula = potatoy ~ treatment + buttery + grassy + rancid +
## painty, data = ff)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.60479 -1.11475 0.00082 1.21511 2.82154
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.2108 1.5100 8.087 0.000192 ***
## treatment -2.5759 1.3476 -1.912 0.104488
## buttery -0.2268 0.2980 -0.761 0.475426
## grassy -0.5133 0.5869 -0.875 0.415396
## rancid 0.3066 0.3980 0.770 0.470277
## painty -0.1708 0.2851 -0.599 0.570968
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.204 on 6 degrees of freedom
## Multiple R-squared: 0.5926, Adjusted R-squared: 0.253
## F-statistic: 1.745 on 5 and 6 DF, p-value: 0.2581
lm.beta(fullmodel)
## treatment buttery grassy rancid painty
## -0.5276057 -0.2333181 -0.3195422 0.2497109 -0.2559847
Answer: The treatment itself seems to have had the greatest impact on the potato-y flavor detected, at -.528 standard deviations going from 0 to 1. If we focus purely on the flavor predictors, grassy produced the strongest swing in potato-y flavor, with a beta of -0.32. Because this model has a scale problem, comparing a binary treatment category with (presumably Likert- or ordinal-coded) flavor ratings, we have to standardize and use betas to compare the predictors directly. We could compare raw b’s only in a model containing flavor ratings all measured on the same scale. (Also, the idea of French fries tasting like paint is blowing my mind a little here.)
2. How much better is our model? What statistic do we use for this? Calculate a p-value for this statistic.
#change in R-squared
chgR2 = summary(fullmodel)$r.squared - summary(treatmodel)$r.squared
chgR2
## [1] 0.2791368
#is this significant?
anova(treatmodel, fullmodel)
## Analysis of Variance Table
##
## Model 1: potatoy ~ treatment
## Model 2: potatoy ~ treatment + buttery + grassy + rancid + painty
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 10 49.097
## 2 6 29.136 4 19.961 1.0277 0.4639
#to get exact p-value
#create object from anova test
a = anova(treatmodel, fullmodel)
#see what is in the object
str(a)
## Classes 'anova' and 'data.frame': 2 obs. of 6 variables:
## $ Res.Df : num 10 6
## $ RSS : num 49.1 29.1
## $ Df : num NA 4
## $ Sum of Sq: num NA 20
## $ F : num NA 1.03
## $ Pr(>F) : num NA 0.464
## - attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Model 1: potatoy ~ treatment\nModel 2: potatoy ~ treatment + buttery + grassy + rancid + painty"
#extract out the info
a$`Pr(>F)`[2]
## [1] 0.4638846
Answer: The change in R2 is .279, suggesting that about 28% more variance is predicted once the other flavor ratings are added. However, the F-test comparing the two models is only F(4, 6) = 1.0277, with a non-significant p-value of .464. This tells us that adding all these extra flavor variables is not likely a valuable change.
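The same F can be computed by hand from the R-squared change formula, F = (change in R2 / df of change) / ((1 - full model R2) / residual df):
(chgR2 / 4) / ((1 - summary(fullmodel)$r.squared) / 6) #about 1.0277, matching the anova table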
3. Calculate the semipartial correlation of the added set of predictors. How does this relate to the statistic in the previous question? What does the semipartial correlation represent?
#change in adjusted Rsquared
chgadjR2 = summary(fullmodel)$adj.r.squared - summary(treatmodel)$adj.r.squared
chgadjR2
## [1] 0.008265667
Answer: On reflection, subtracting adjusted R2’s does not give a semipartial correlation; this block computes the change in adjusted R2, which, unlike raw R2, is not inflated simply by piling on predictors. It shows an even more paltry change of 0.008, i.e., essentially no additional variance once the model is penalized for its extra terms. The actual semipartial correlation of the added set is the square root of the change in unadjusted R2, computed in the next block. Semipartial correlations show the unique contribution of a variable or set of variables in predicting the outcome, so here it tests the additional predictive power of the other flavor measures in predicting potato-y flavor, over and above treatment.
4. Calculate the partial correlation of the added set of predictors. What does this number represent?
#semipartial correlation of added set of predictors
sqrt(chgR2)
## [1] 0.528334
#partial correlation of added set of predictors
sqrt(chgR2/(1-summary(treatmodel)$r.squared))
## [1] 0.6376259
Answer: These set-level correlations really can be calculated this simply from the R2 values. To keep the two straight: the semipartial correlation, sqrt(change in R2), expresses the unique variance explained by the added set relative to the total variance in the outcome, while the partial correlation, sqrt(change in R2 / (1 - R2 of the reduced model)), expresses that same unique variance relative to the variance the reduced model left unexplained. Put differently, the partial removes the overlap with the other predictors from both the outcome and the added predictors, while the semipartial removes it from the added predictors only.
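A hand-rolled way to see the distinction for a single predictor (grassy), using residuals from lm. The partial version should match the potatoy/grassy entry (-0.336) in the pcor() output below; the semipartial should correspond to one of the spcor() entries, depending on which variable the package residualizes.
#residualize grassy, and potatoy, on all the other predictors
other_g = resid(lm(grassy ~ treatment + buttery + rancid + painty, data = ff))
other_y = resid(lm(potatoy ~ treatment + buttery + rancid + painty, data = ff))
cor(other_y, other_g)    #partial: overlap removed from both potatoy and grassy
cor(ff$potatoy, other_g) #semipartial: overlap removed from grassy only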
#attempting to verify this using a package for partials/semipartials
#install.packages("ppcor")
library(ppcor)
?pcor
## starting httpd help server ... done
#confused as to terminology... researching...
#error, help file too confusing, switching to google...
pcor(ff)
## $estimate
## treatment potatoy buttery grassy rancid
## treatment 1.000000000 -0.6152150 -0.09770881 -0.32873051 -0.05017893
## potatoy -0.615214955 1.0000000 -0.29673083 -0.33628074 0.30003965
## buttery -0.097708814 -0.2967308 1.00000000 0.30862549 0.26224681
## grassy -0.328730509 -0.3362807 0.30862549 1.00000000 -0.09236031
## rancid -0.050178931 0.3000396 0.26224681 -0.09236031 1.00000000
## painty 0.007242086 -0.2376086 -0.53258351 0.53153498 0.55367892
## painty
## treatment 0.007242086
## potatoy -0.237608601
## buttery -0.532583509
## grassy 0.531534981
## rancid 0.553678923
## painty 1.000000000
##
## $p.value
## treatment potatoy buttery grassy rancid painty
## treatment 0.0000000 0.1044879 0.8179587 0.4265955 0.9060723 0.9864216
## potatoy 0.1044879 0.0000000 0.4754257 0.4153962 0.4702772 0.5709684
## buttery 0.8179587 0.4754257 0.0000000 0.4570228 0.5303666 0.1741686
## grassy 0.4265955 0.4153962 0.4570228 0.0000000 0.8278067 0.1751790
## rancid 0.9060723 0.4702772 0.5303666 0.8278067 0.0000000 0.1545092
## painty 0.9864216 0.5709684 0.1741686 0.1751790 0.1545092 0.0000000
##
## $statistic
## treatment potatoy buttery grassy rancid painty
## treatment 0.00000000 -1.9115174 -0.2404875 -0.8526067 -0.1230678 0.01773988
## potatoy -1.91151739 0.0000000 -0.7611190 -0.8746545 0.7704408 -0.59917975
## buttery -0.24048746 -0.7611190 0.0000000 0.7947729 0.6656687 -1.54134284
## grassy -0.85260668 -0.8746545 0.7947729 0.0000000 -0.2272068 1.53711172
## rancid -0.12306781 0.7704408 0.6656687 -0.2272068 0.0000000 1.62865569
## painty 0.01773988 -0.5991798 -1.5413428 1.5371117 1.6286557 0.00000000
##
## $n
## [1] 12
##
## $gp
## [1] 4
##
## $method
## [1] "pearson"
pcor.test(ff$treatment,ff$potatoy,ff[,c(3:6)])
## estimate p.value statistic n gp Method
## 1 -0.615215 0.1044879 -1.911517 12 4 pearson
#note: this differs from the 0.6376 above because that was the partial for the added set of flavor predictors, while this is the partial for treatment alone, controlling for the flavor ratings
?spcor
spcor(ff)
## $estimate
## treatment potatoy buttery grassy rancid
## treatment 1.000000000 -0.5808294 -0.07307398 -0.25907115 -0.03739511
## potatoy -0.498116786 1.0000000 -0.19833781 -0.22792370 0.20076693
## buttery -0.079700221 -0.2522433 1.00000000 0.26339659 0.22061003
## grassy -0.233816679 -0.2398630 0.21795649 1.00000000 -0.06230861
## rancid -0.038533511 0.2412311 0.20842618 -0.07114026 1.00000000
## painty 0.004290933 -0.1449299 -0.37282090 0.37179747 0.39394018
## painty
## treatment 0.0053904
## potatoy -0.1561385
## buttery -0.5108182
## grassy 0.4215336
## rancid 0.5099451
## painty 1.0000000
##
## $p.value
## treatment potatoy buttery grassy rancid painty
## treatment 0.0000000 0.1310928 0.8634733 0.5355393 0.9299495 0.9898932
## potatoy 0.2090224 0.0000000 0.6377542 0.5872130 0.6335552 0.7119637
## buttery 0.8511937 0.5467226 0.0000000 0.5284983 0.5995813 0.1957864
## grassy 0.5773102 0.5672095 0.6040897 0.0000000 0.8834734 0.2982616
## rancid 0.9278212 0.5649326 0.6203713 0.8670614 0.0000000 0.1966817
## painty 0.9919546 0.7320376 0.3630353 0.3644591 0.3342233 0.0000000
##
## $statistic
## treatment potatoy buttery grassy rancid painty
## treatment 0.00000000 -1.7477780 -0.1794738 -0.6570241 -0.09166305 0.01320392
## potatoy -1.40712485 0.0000000 -0.4956736 -0.5733889 0.50199769 -0.38720875
## buttery -0.19584789 -0.6385145 0.0000000 0.6688043 0.55403219 -1.45546146
## grassy -0.58905988 -0.6052101 0.5470337 0.0000000 -0.15292144 1.13864982
## rancid -0.09445759 0.6088747 0.5220019 -0.1747000 0.00000000 1.45209842
## painty 0.01051069 -0.3587926 -0.9841768 0.9810411 1.04984719 0.00000000
##
## $n
## [1] 12
##
## $gp
## [1] 4
##
## $method
## [1] "pearson"
spcor.test(ff$treatment,ff$potatoy,ff[,c(3:6)])
## estimate p.value statistic n gp Method
## 1 -0.5808294 0.1310928 -1.747778 12 4 pearson
#wait... what am I testing... what am I holding constant...
#the two spcor estimates for treatment and potatoy (-0.58 vs -0.50) seem to differ by which variable gets residualized, so argument order matters here
#too tired, shutting down. leaving block in output for posterity.