PSY460: Advanced Quantitative Methods

Week #2: Statistical Models

Today, we’ll make sure that we understand the tremendous usefulness of general linear models. Then, you’ll work in teams to refine your hypotheses.

Quiz: Comprehension Questions

  1. In a sentence or formula, what is the general linear model? (You may want to consider what two basic elements of the plot on the right a general linear model would attempt to capture.)
  2. Why might you want to transform data (e.g., by creating Z-scores or standardized scores)?

Quiz: Application Question

  1. Which of the following is the best “model” of a bull? Explain in 3-4 sentences. (I’m more interested in your justification than the answer itself; there isn’t a clearly correct answer.)

What are Statistical Models?

  • Fundamentally, statistics attempts to understand a particular outcome as the sum of two parts: a simplified model, and the prediction error that results from reducing the messiness of reality to that model (outcome = model + error).
    • Data are messy and complex; statistics allows us to gain an understanding of the world by simplifying it and extracting the most important signals.
      • The best models have low prediction error and high generalizability to new datasets.

Simplifications are Always Incorrect

  • Models are mere representations of actual data; they always simplify and thus distort the truth.
  • Models deliberately ignore details considered non-essential, so that they can roughly predict the future and explain the past.

Linear Models

  • Assuming a linear relationship between predictors and an outcome variable, linear models predict an outcome of interest by fitting a line (or multiple lines) to the data. Each line can be characterized by a slope and a y-intercept.
    • outcome = intercept + slope*predictor + error
    • or, in mathematical notation, y = a + bx + e
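
To make y = a + bx + e concrete, here is a minimal sketch of fitting a single line by ordinary least squares using only the Python standard library. The data values and the helper name `fit_line` are made up for illustration; in practice you would probably use a statistics package rather than writing this by hand.

```python
from statistics import mean

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope: how much x and y vary together, relative to how much x varies.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    # Intercept: forces the line through the point (mean of x, mean of y).
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data, invented for this example.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)

# The "error" term e is whatever the line fails to capture for each point.
predictions = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predictions)]

print(f"intercept a = {intercept:.2f}, slope b = {slope:.2f}")
print("residuals e =", [round(r, 2) for r in residuals])
```

Each observed y is recovered exactly as prediction plus residual, which is the "outcome = model + error" idea in code form.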

Surprise: You’ve Actually Been Using Linear Models All Along!

Most Statistical Tests are Various Formulations of Linear Models

  • One-sample t-test
    • y = intercept + error
  • Independent-samples t-test
    • y = intercept + slope*group + error (where group is coded 0 or 1)
  • ANOVA
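    • y = intercept + slope1*group1 + slope2*group2 + error (a three-group example, where group1 and group2 are 0/1 indicators and the third group serves as the reference)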
  • More complex statistics often don’t use wholly new model types; they just add additional model components.
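
The equivalence in the list above can be checked numerically. The sketch below, using made-up scores and a hypothetical helper `fit_line`, fits y = intercept + slope*group + error with group coded 0/1 and compares the result with a hand-computed pooled-variance independent-samples t-test. It assumes equal variances in the two groups, which is the version of the t-test that matches this linear model.

```python
from math import sqrt
from statistics import mean, variance

def fit_line(x, y):
    """Return (intercept, slope, sxx) for the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope, sxx

# Hypothetical scores for two groups, invented for this example.
group_a = [4.0, 5.0, 6.0, 5.0]
group_b = [7.0, 8.0, 6.0, 9.0]

# Stack the data and dummy-code group membership (0 = group A, 1 = group B).
y = group_a + group_b
group = [0.0] * len(group_a) + [1.0] * len(group_b)

# Linear-model route: the intercept is group A's mean and the slope is the
# difference between the group means.
intercept, slope, sxx = fit_line(group, y)
residuals = [yi - (intercept + slope * gi) for gi, yi in zip(group, y)]
mse = sum(r ** 2 for r in residuals) / (len(y) - 2)  # residual variance
t_from_model = slope / sqrt(mse / sxx)               # t statistic for the slope

# Classic route: pooled-variance independent-samples t-test.
n_a, n_b = len(group_a), len(group_b)
pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
t_classic = (mean(group_b) - mean(group_a)) / sqrt(pooled * (1 / n_a + 1 / n_b))

print(f"intercept = {intercept:.2f} (mean of group A = {mean(group_a):.2f})")
print(f"slope = {slope:.2f} (difference in means)")
print(f"t from linear model = {t_from_model:.3f}, classic t = {t_classic:.3f}")
```

The one-sample case works the same way: with no predictors, the least-squares intercept is simply the sample mean, and testing that intercept against a reference value is the one-sample t-test.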

Moving Toward Multivariate Models

  • Fitting a single line is not sufficient for modeling certain relationships; in such cases, multiple variables need to be included in the model, while weighing the tradeoff between fit and simplicity/generalizability.
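
As a rough illustration of adding a model component, the sketch below fits the same outcome first with one predictor and then with two, using the least-squares normal equations solved in plain Python. The data are invented and deliberately noise-free (y = 1 + 2*x1 + 3*x2), and the helpers `solve`, `fit`, and `sse` are hypothetical names written for this example, not part of any library.

```python
def solve(A, b):
    """Solve the linear system A * coef = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    coef = [0.0] * n
    for i in reversed(range(n)):  # back-substitution
        known = sum(M[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (M[i][n] - known) / M[i][i]
    return coef

def fit(X, y):
    """Least-squares coefficients from the normal equations (X'X) coef = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def sse(X, y, coef):
    """Sum of squared prediction errors for a fitted model."""
    return sum((yi - sum(c * v for c, v in zip(coef, row))) ** 2
               for row, yi in zip(X, y))

# Hypothetical, noise-free data: the outcome truly depends on both predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

# The leading 1.0 in each row is the column that estimates the intercept.
X_one = [[1.0, a] for a in x1]
X_two = [[1.0, a, b] for a, b in zip(x1, x2)]

coef_one = fit(X_one, y)
coef_two = fit(X_two, y)
sse_one = sse(X_one, y, coef_one)
sse_two = sse(X_two, y, coef_two)

print("one predictor:  coefficients", [round(c, 3) for c in coef_one], "SSE", round(sse_one, 3))
print("two predictors: coefficients", [round(c, 3) for c in coef_two], "SSE", round(sse_two, 3))
```

Note that adding a predictor can never increase the error on the data used for fitting, so a better fit alone does not justify a more complex model; that is why the tradeoff with simplicity and generalizability has to be judged separately, for example on new data.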

Groupwork

  • Share your hypotheses and background research with your teammates. Then, work to converge upon a shared question that is (a) feasible, (b) novel, and (c) testable with an “advanced” statistical model. Call me over to help as needed!
  • Please prepare sufficiently for our group meetings to make them worthwhile. I encourage you to refer to the syllabus to see my expectations for these weekly meetings.
  • I’ve realized that it will be easiest to meet in my office (LSP 132D), since it will often be helpful to look at my monitor.