Advanced Quantitative Methods
Course Syllabus/Outline
Course Description
This course provides an advanced treatment of regression analysis for complex data structures, building on foundational knowledge of linear regression and causal inference. Many real-world datasets violate the assumption of independent observations: students are nested within schools, patients within hospitals, survey responses within individuals over time. Standard regression approaches fail in these contexts, producing incorrect standard errors and potentially misleading conclusions. This course equips students with the conceptual understanding and practical skills to analyze clustered, longitudinal, and time-series data appropriately.
Students will develop fluency in multilevel/hierarchical models, generalized linear mixed models for non-continuous outcomes, and growth curve models for longitudinal data. The course also provides foundational exposure to time-series methods. Throughout the course, we emphasize that these methods share a common theme: appropriately modeling observations that are not independent.
The course uses R as the primary computing environment, and students will apply these methods to real datasets through unit problem sets, discussion activities, and a culminating final project. By the end of the course, students will be able to identify when standard regression is inappropriate, select and implement appropriate methods for complex data structures, and communicate findings from these analyses to technical and non-technical audiences.
Prerequisites
This course assumes prior completion of coursework covering:
- Linear regression (estimation, interpretation, diagnostics)
- Generalized linear models (logistic and Poisson regression)
- Fundamentals of causal inference and research design
- Working knowledge of R (data manipulation, basic programming, regression modeling)
Recommended prior texts: Bueno de Mesquita & Fowler, Thinking Clearly with Data; Gelman, Hill, & Vehtari, Regression and Other Stories
Course Learning Objectives Reference
| # | Course Learning Objective |
|---|---|
| CLO1 | Identify data structures that violate independence assumptions and select appropriate modeling strategies |
| CLO2 | Estimate, interpret, and evaluate multilevel/hierarchical models with varying intercepts and slopes |
| CLO3 | Extend multilevel models to binary and count outcomes using generalized linear mixed models |
| CLO4 | Model change over time using growth curve and longitudinal data analysis techniques |
| CLO5 | Describe foundational time-series concepts including autocorrelation, stationarity, and ARIMA models |
| CLO6 | Implement advanced regression models in R using appropriate packages (lme4, brms, etc.) |
| CLO7 | Communicate findings from complex models to technical and non-technical audiences |
| CLO8 | Critically evaluate published research using multilevel, longitudinal, and time-series methods |
Required Textbooks
| Textbook | Author(s) | Access |
|---|---|---|
| Data Analysis Using Regression and Multilevel/Hierarchical Models | Andrew Gelman & Jennifer Hill | Purchase required |
| Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence | Judith D. Singer & John B. Willett | Purchase required or library access |
| Forecasting: Principles and Practice (3e) | Rob J. Hyndman & George Athanasopoulos | Free online: https://otexts.com/fpp3/ |
Supplementary Resources
| Resource | Author(s) | Access |
|---|---|---|
| Applied Longitudinal Data Analysis in brms and the tidyverse | A. Solomon Kurz | Free online: https://bookdown.org/content/4253/ |
| Regression and Other Stories | Gelman, Hill, & Vehtari | Reference from prerequisite course |
Grading Overview
| Assignment | Points |
|---|---|
| Problem Sets (4 × 10 points each) | 40 |
| Discussion Activities | 10 |
| Final Project | 50 |
| Total | 100 |
Assignment Details
Problem Sets (4 total)
Due Dates: Week 4 (September 25), Week 8 (October 23), Week 11 (November 13), Week 13 (December 4)
Points: 40 total (10 points each)
Each problem set covers the material from one unit and includes applied exercises that require students to:
- Analyze provided datasets using the unit’s methods
- Interpret model output and write clear explanations
- Compare models and justify modeling decisions
- Visualize results appropriately
- Reflect on assumptions and limitations
Problem sets should be submitted as rendered Quarto documents (HTML) with all code visible and reproducible.
Grading Rubric
A (9.4-10): All problems completed correctly with clear, insightful explanations. Code is clean, well-commented, and reproducible. Interpretations demonstrate deep understanding of the methods. Visualizations effectively communicate results.
A- (9-9.3): Most problems completed correctly with clear explanations. Minor errors in code or interpretation that don’t reflect fundamental misunderstanding. Code is functional and reproducible.
B+ (8.7-8.9): Problems completed with generally correct analysis. Some explanations could be clearer or more thorough. Code is functional with minor issues.
B (8.3-8.6): Problems completed but with some errors in analysis or interpretation. Explanations lack clarity or depth in places. Code may have minor reproducibility issues.
B- (8-8.2): Problems completed but with notable errors in analysis or interpretation. Explanations lack clarity or depth. Code may have reproducibility issues.
C (7): Significant errors in analysis or interpretation. Incomplete problems or missing explanations. Demonstrates limited understanding of the methods.
D (5.1-6.9): Major errors throughout. Substantial portions incomplete or incorrect. Demonstrates minimal understanding of the methods.
F (5): Submitted but demonstrates fundamental misunderstanding of the methods; however, a good-faith attempt was made.
Zero (0): Not submitted or no meaningful attempt.
Ungraded Skills Check
Due Date: Week 2 (September 11)
Points: 0 (ungraded)
At the end of Week 2, students will complete a short skills check covering the foundational concepts from the first two modules: clustered data structures, varying intercepts, varying slopes, and basic lme4 syntax. This self-assessment helps students identify gaps in their understanding before Problem Set 1 is due.
The skills check includes 3-4 short problems similar in style to Problem Set 1. An answer key will be released after the due date so students can check their work. Students who struggle with the skills check are encouraged to attend office hours or review the module materials before proceeding.
This activity is ungraded to encourage honest self-assessment rather than strategic performance.
Discussion Activities
Overview
Discussion activities foster community and keep students engaged throughout the course. There are two types:
Article Reviews (4 total): Read an assigned short article and post a 200-300 word response addressing the prompt. Then reply substantively to at least one classmate’s post (minimum 100 words).
Project Posts (2 total): Share your final project plans and progress with the class, and provide feedback to classmates.
Points: 10 total
| Activity | Due Date | Points |
|---|---|---|
| Article Review #1 | Week 2 (September 11) | 1 |
| Article Review #2 | Week 5 (October 2) | 1 |
| Project Idea Post | Week 9 (October 30) | 3 |
| Article Review #3 | Week 10 (November 6) | 1 |
| Article Review #4 | Week 12 (November 20) | 1 |
| Project Status Update | Week 12 (November 20) | 3 |
Article Reviews
For each article review, read the assigned article and post a response addressing:
- What is the main argument or contribution?
- How does this connect to what we’re learning in the course?
- What questions does this raise for you, or what do you disagree with?
Then read your classmates’ posts and reply substantively to at least one, extending the conversation or offering a different perspective.
Grading Rubric (Article Reviews)
Full Credit (1): Post demonstrates thoughtful engagement with the article, makes clear connections to course material, and raises genuine questions or critiques. Reply substantively extends the conversation.
Partial Credit (0.5): Post summarizes the article but lacks depth in analysis or connection to course material. Reply is superficial.
No Credit (0): Post is incomplete, off-topic, or fails to engage meaningfully with the article. Reply is missing or perfunctory.
Project Posts
Project Idea Post (Week 9):
Share your planned final project with the class:
- What dataset will you use? (brief description of structure and source)
- What is your research question?
- Which method from the course is most appropriate, and why?
- What challenges do you anticipate?
Read at least two classmates’ posts and offer constructive feedback or suggestions.
Project Status Update (Week 12):
Update the class on your progress:
- What have you accomplished so far?
- What preliminary findings or challenges have emerged?
- What remains to be done?
- What specific feedback would be helpful from classmates?
Read at least two classmates’ posts and offer constructive feedback, resources, or encouragement.
Grading Rubric (Project Posts)
Full Credit (3): Post provides clear, substantive information about the project. Demonstrates thoughtful planning (Idea Post) or genuine progress and reflection (Status Update). Feedback to classmates is constructive and specific.
Partial Credit (1-2): Post addresses required elements but lacks depth or specificity. Feedback to classmates is generic.
No Credit (0): Post is incomplete or superficial. Feedback to classmates is missing or unhelpful.
Final Project
Due Date: Week 15 (December 18)
Points: 50
Apply one of the methods from Units 1-3 to a dataset of your choosing. The project should demonstrate your ability to identify an appropriate method for a given data structure, implement the analysis correctly, and communicate findings effectively.
Methods Scope: Final projects must use multilevel models, generalized linear mixed models, or longitudinal methods (growth curve models). Time-series methods (Unit 4) are not eligible for final projects. The time-series unit provides foundational exposure to prepare students for future coursework or self-study, but two weeks is insufficient for project-level application.
Deliverables:
- Written report (2000-2500 words) including:
  - Introduction and research question
  - Description of data structure and why chosen method is appropriate
  - Model building process and justification of decisions
  - Results with appropriate visualizations
  - Discussion of limitations and assumptions
- Reproducible analysis code in a GitHub repository
- Brief presentation (10 minutes) to the class
Submission Requirements:
- Submit the written report as a self-contained HTML file (rendered from Quarto)
- Submit a link to your GitHub repository containing all code and data (or data access instructions)
- All deliverables must be submitted for the project to be graded
Grading Rubric
The final project is graded holistically by letter grade. The instructor assigns a letter grade based on the criteria below, then assigns points within the range for grades B- through A.
| Grade | Points | Description |
|---|---|---|
| A | 47-50 | Sophisticated understanding of the chosen method. Data structure and method choice are clearly and thoroughly justified. Model building is thoughtful with appropriate comparisons and diagnostics. Results are correctly interpreted with meaningful acknowledgment of limitations. Visualizations effectively communicate findings. Writing is clear, professional, and well-organized. Code is clean, well-documented, and fully reproducible. Presentation is clear, engaging, and demonstrates command of the material. |
| A- | 45-46 | Strong understanding of the method with clear justification for choices. Analysis is sound with minor opportunities for improvement in depth, presentation, or documentation. Code is clean and reproducible with minor documentation gaps. Presentation is clear and professional. |
| B+ | 44 | Competent application of the method with reasonable justification. Some aspects of model building, interpretation, or presentation could be strengthened. Code is functional with some organization or documentation issues. Presentation communicates main points effectively. |
| B | 42-43 | Meets requirements but the data story or methodological justification is underdeveloped. Interpretation may miss some nuances. Code is functional but may have reproducibility issues. Presentation is adequate but could be clearer or more polished. |
| B- | 40-41 | Method applied but with notable weaknesses in justification, interpretation, or presentation. May contain errors that don’t fundamentally undermine the analysis. Code has documentation or reproducibility issues. Presentation covers the basics but lacks depth or clarity. |
| C | 35 | Meets minimum expectations for graduate-level work but just barely. Significant issues with method application, interpretation, or analytical justification. Code may not be reproducible. Presentation is disorganized or unclear. |
| F | 25 | Submission demonstrates fundamental misunderstanding of the methods or fails to meet minimum expectations for graduate-level work; however, a good-faith attempt was made. |
| Zero | 0 | Not submitted or no meaningful attempt. |
Course Schedule Overview
| Week | Dates | Topic | Assignments Due |
|---|---|---|---|
| 1 | Aug 31 - Sep 4 | Introduction to Clustered Data and Varying Intercepts | |
| 2 | Sep 7 - Sep 11 | Varying Intercepts and Slopes | Article Review #1, Skills Check |
| 3 | Sep 14 - Sep 18 | Building and Evaluating Multilevel Models | |
| 4 | Sep 21 - Sep 25 | Unit 1 Work Week | Problem Set 1 |
| 5 | Sep 28 - Oct 2 | Multilevel Models for Binary Outcomes | Article Review #2 |
| 6 | Oct 5 - Oct 9 | Multilevel Models for Count Outcomes | |
| 7 | Oct 12 - Oct 16 | GLMM Applications and Best Practices | |
| 8 | Oct 19 - Oct 23 | Unit 2 Work Week | Problem Set 2 |
| 9 | Oct 26 - Oct 30 | Introduction to Longitudinal Data and Bayesian Primer | Project Idea Post |
| 10 | Nov 2 - Nov 6 | Growth Curve Models | Article Review #3 |
| 11 | Nov 9 - Nov 13 | Practical Issues in Longitudinal Analysis | Problem Set 3 |
| 12 | Nov 16 - Nov 20 | Foundations of Time-Series | Article Review #4, Project Status Update |
| 13 | Nov 30 - Dec 4 | Time-Series Regression and Looking Ahead | Problem Set 4 |
| 14 | Dec 7 - Dec 11 | Final Project Work Week | |
| 15 | Dec 14 - Dec 18 | Final Project Presentations | Final Project |
Note: November 23-27 is Thanksgiving Break—no class activities.
Unit 1: Multilevel/Hierarchical Models (Weeks 1–4)
Week 1: Introduction to Clustered Data and Varying Intercepts
Dates: August 31 - September 4
Module Description
This module introduces the fundamental problem that motivates multilevel modeling: observations that are not independent due to clustering or nesting. We examine why standard regression fails with clustered data, producing incorrect standard errors and potentially misleading inferences. Students will learn three approaches to clustered data—complete pooling, no pooling, and partial pooling—and understand why partial pooling through multilevel models often represents the best solution. The module introduces the varying intercept model, the simplest multilevel model, and key concepts including intraclass correlation and shrinkage. By the end of this module, students will recognize clustered data structures and implement basic varying intercept models in R.
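To make the three pooling strategies concrete, the sketch below computes each estimate by hand for three hypothetical schools. The course itself works in R with lme4; Python is used here only to show the arithmetic. The scores and both variance components are made-up assumptions (a fitted multilevel model would estimate the variances from the data), and the partial-pooling formula is the standard precision-weighted approximation, not the exact estimator a package would report.

```python
from statistics import mean

# Hypothetical test scores grouped by school (made-up numbers).
scores = {
    "school_a": [72.0, 75.0, 78.0, 74.0, 76.0, 73.0, 77.0, 75.0],
    "school_b": [90.0, 94.0],
    "school_c": [60.0, 64.0, 62.0, 66.0],
}

# Assumed variance components; a real analysis would estimate these.
sigma2_between = 100.0  # variance of school intercepts
sigma2_within = 25.0    # residual variance within schools


def icc(sigma2_between, sigma2_within):
    """Share of total variance that lies between groups."""
    return sigma2_between / (sigma2_between + sigma2_within)


def partial_pooling(group_values, grand_mean, sigma2_between, sigma2_within):
    """Precision-weighted average of the group mean and the grand mean."""
    w_group = len(group_values) / sigma2_within
    w_grand = 1.0 / sigma2_between
    return (w_group * mean(group_values) + w_grand * grand_mean) / (w_group + w_grand)


all_scores = [y for ys in scores.values() for y in ys]
complete_pooling = mean(all_scores)                      # one mean for everyone
no_pooling = {g: mean(ys) for g, ys in scores.items()}   # separate mean per school
partial = {
    g: partial_pooling(ys, complete_pooling, sigma2_between, sigma2_within)
    for g, ys in scores.items()
}

print("ICC:", icc(sigma2_between, sigma2_within))
for g in scores:
    print(g, len(scores[g]), no_pooling[g], round(partial[g], 2))
```

With these numbers the ICC is 0.8. The two-student school moves from 92 to 90, while the eight-student school barely moves (75 to about 74.97): smaller groups shrink more toward the overall mean.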
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 1.1 | Identify data structures that violate the independence assumption and explain why this matters for inference | CLO1 |
| 1.2 | Compare and contrast complete pooling, no pooling, and partial pooling approaches to clustered data | CLO1, CLO2 |
| 1.3 | Estimate varying intercept models using lme4 in R | CLO2, CLO6 |
| 1.4 | Calculate and interpret the intraclass correlation coefficient (ICC) | CLO2 |
| 1.5 | Explain the concept of shrinkage and why multilevel estimates differ from no-pooling estimates | CLO2 |
Required Readings
- Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
  - Chapter 11: Multilevel Structures
  - Chapter 12: Multilevel Linear Models: The Basics
Recommended Readings
- Gelman, Hill, & Vehtari, Regression and Other Stories
  - Chapter 21: Additional Topics in Causal Inference (review partial pooling concepts)
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 3-4 hours |
| Coding practice (varying intercept models) | 3-4 hours |
| Begin Problem Set 1 | 1-2 hours |
| Total | 9-11 hours |
To Do This Week
Week 2: Varying Intercepts and Slopes
Dates: September 7 - September 11
Module Description
This module extends the basic multilevel model to include varying slopes, allowing the relationship between predictors and outcomes to differ across groups. We examine when varying slopes are necessary, how to interpret the covariance between intercepts and slopes, and the tradeoffs involved in adding model complexity. Students will learn to think carefully about which effects should vary and how to make principled decisions about model structure. The module also introduces the concept of centering predictors and its importance for interpretation in multilevel models.
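The difference between grand-mean and group-mean centering is easiest to see on a tiny example. The sketch below is an illustration only (the course uses R; the school labels and study-hours values are made up).

```python
from statistics import mean

# Hypothetical (school, hours_studied) records; values are made up.
records = [
    ("a", 2.0), ("a", 4.0), ("a", 6.0),
    ("b", 8.0), ("b", 10.0), ("b", 12.0),
]

# Grand mean: one reference point for every observation.
grand_mean = mean(x for _, x in records)

# Group means: one reference point per school.
group_means = {}
for group in sorted({g for g, _ in records}):
    group_means[group] = mean(x for g, x in records if g == group)

grand_centered = [x - grand_mean for _, x in records]
group_centered = [x - group_means[g] for g, x in records]

print(grand_centered)  # [-5.0, -3.0, -1.0, 1.0, 3.0, 5.0]
print(group_centered)  # [-2.0, 0.0, 2.0, -2.0, 0.0, 2.0]
```

Grand-mean centering only shifts the scale, so school b still sits above school a. Group-mean centering removes the between-school difference entirely, leaving each student's standing relative to their own school, which is why the two choices answer different questions.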
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 2.1 | Extend varying intercept models to include varying slopes | CLO2, CLO6 |
| 2.2 | Interpret the covariance between random intercepts and slopes | CLO2 |
| 2.3 | Make principled decisions about which effects should vary across groups | CLO1, CLO2 |
| 2.4 | Apply group-mean and grand-mean centering and explain the interpretive implications | CLO2 |
| 2.5 | Visualize varying intercepts and slopes to communicate model results | CLO7 |
Required Readings
- Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
  - Chapter 13: Varying Intercepts and Slopes
Recommended Readings
- Enders, C.K. & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods, 12(2), 121-138.
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (varying slopes, centering) | 3-4 hours |
| Continue Problem Set 1 | 1-2 hours |
| Article Review #1 | 1 hour |
| Skills Check | 1 hour |
| Total | 10-12 hours |
To Do This Week
Week 3: Building and Evaluating Multilevel Models
Dates: September 14 - September 18
Module Description
This module focuses on the practical aspects of building, evaluating, and presenting multilevel models. Students will learn strategies for model building, including how to decide on model complexity and compare nested models. We cover diagnostic tools for checking model assumptions, including residual analysis and examination of random effects distributions. The module also addresses how to present multilevel model results clearly in tables and figures, and briefly introduces extensions such as cross-classified and three-level models. By the end of this module, students will have a complete workflow for multilevel analysis.
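As a rough illustration of nested-model comparison, the sketch below computes a likelihood ratio test and AIC from two log-likelihoods. The log-likelihood values and parameter counts are hypothetical, and the sketch assumes both models were fit by maximum likelihood to the same data and differ by exactly one fixed-effect parameter. When the extra parameter is a variance component, the chi-square reference distribution is conservative, so treat the p-value as approximate in that case. In R, this comparison is typically done by calling anova() on the two fitted models.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion; lower values are preferred."""
    return 2 * n_params - 2 * log_lik


def lrt_one_df(log_lik_simple, log_lik_complex):
    """Likelihood ratio test for nested models differing by one parameter."""
    stat = max(2 * (log_lik_complex - log_lik_simple), 0.0)
    # Upper tail of a chi-square with 1 df: P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical fits: the complex model adds one predictor.
ll_simple, k_simple = -1250.0, 4
ll_complex, k_complex = -1246.0, 5

stat, p_value = lrt_one_df(ll_simple, ll_complex)
print("LR statistic:", stat)
print("p-value:", round(p_value, 4))
print("AIC simple:", aic(ll_simple, k_simple))
print("AIC complex:", aic(ll_complex, k_complex))
```

Here both criteria favor the more complex model: the test statistic is 8 on one degree of freedom, and the AIC drops from 2508 to 2502.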
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 3.1 | Develop a systematic approach to building multilevel models of increasing complexity | CLO2 |
| 3.2 | Compare nested multilevel models using likelihood ratio tests and information criteria | CLO2, CLO6 |
| 3.3 | Conduct residual diagnostics and assess random effects assumptions | CLO2 |
| 3.4 | Present multilevel model results in clear tables and visualizations | CLO7 |
| 3.5 | Describe when cross-classified or three-level models might be necessary | CLO1 |
Required Readings
- Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
  - Chapter 21: Understanding and Summarizing the Fitted Models
  - Chapter 24: Model Checking and Comparison (selections)
Recommended Readings
- Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
  - Chapter 25: Missing Data (skim for awareness)
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (model comparison, diagnostics) | 3-4 hours |
| Continue Problem Set 1 | 2-3 hours |
| Total | 9-11 hours |
To Do This Week
Week 4: Unit 1 Work Week
Dates: September 21 - September 25
Module Description
No new lecture material this week. Students use this week to complete Problem Set 1, solidify their understanding of multilevel models through additional practice, and explore datasets for their final project. Office hours are available for troubleshooting and feedback on Problem Set 1.
Estimated Workload
| Activity | Time |
|---|---|
| Complete Problem Set 1 | 5-6 hours |
| Review and practice multilevel concepts | 2-3 hours |
| Explore Final Project datasets | 1-2 hours |
| Total | 8-11 hours |
To Do This Week
Unit 2: Generalized Linear Mixed Models (Weeks 5–8)
Week 5: Multilevel Models for Binary Outcomes
Dates: September 28 - October 2
Module Description
This module extends multilevel modeling to binary outcomes using multilevel logistic regression. We begin with a brief review of standard logistic regression before introducing the varying intercept logistic model. A key focus is on the interpretation challenges that arise when combining random effects with nonlinear link functions—random effects are on the log-odds scale, but we often want to communicate in terms of probabilities. Students will learn strategies for meaningful interpretation and visualization of multilevel logistic models.
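The interpretation problem can be shown with two lines of arithmetic. The sketch below is an illustration only (the course uses R; the baseline log-odds and the size of the group effect are made-up values). It applies the same group-level shift on the log-odds scale at two different baselines and converts each to a probability.

```python
import math


def inv_logit(x):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


group_shift = 1.0  # hypothetical random intercept, on the log-odds scale

for baseline in (0.0, 3.0):  # hypothetical fixed-effect predictions
    p_typical = inv_logit(baseline)
    p_shifted = inv_logit(baseline + group_shift)
    print(
        f"baseline log-odds {baseline}: "
        f"{p_typical:.3f} -> {p_shifted:.3f} "
        f"(change {p_shifted - p_typical:.3f})"
    )
```

The same one-unit shift raises the probability by about 0.23 when the baseline probability is 0.5, but by only about 0.03 when the baseline is already near 0.95. A random-effect variance therefore has no single translation to the probability scale.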
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 5.1 | Review and apply logistic regression for binary outcomes | CLO3 |
| 5.2 | Estimate multilevel logistic regression models with varying intercepts | CLO3, CLO6 |
| 5.3 | Interpret fixed effects and variance components in multilevel logistic models | CLO3 |
| 5.4 | Calculate and visualize predicted probabilities from multilevel logistic models | CLO3, CLO7 |
| 5.5 | Explain the challenges of interpreting random effects on the probability scale | CLO3 |
Required Readings
- Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
  - Chapter 5: Logistic Regression (review)
  - Chapter 6: Generalized Linear Models (sections 6.1-6.3)
  - Chapter 14: Multilevel Logistic Regression
Recommended Readings
- Sommet, N. & Morselli, D. (2017). Keep calm and learn multilevel logistic modeling: A simplified three-step procedure using Stata, R, Mplus, and SPSS. International Review of Social Psychology, 30(1), 203-218.
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 3-4 hours |
| Coding practice (multilevel logistic models) | 3-4 hours |
| Begin Problem Set 2 | 1-2 hours |
| Article Review #2 | 1 hour |
| Total | 10-12 hours |
To Do This Week
Week 6: Multilevel Models for Count Outcomes
Dates: October 5 - October 9
Module Description
This module covers multilevel models for count outcomes, extending Poisson and negative binomial regression to include random effects. We review standard count models and their assumptions before introducing varying intercept Poisson models. A key topic is overdispersion—when the variance exceeds what the Poisson distribution predicts—and how negative binomial models address this issue. The module briefly introduces zero-inflated models for data with excess zeros.
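A quick first look at overdispersion compares the sample variance of the counts to their mean, since a Poisson distribution implies the two are equal. The sketch below is an illustration only (the course uses R; the counts are made up). It is a crude marginal check: in a fitted regression the comparison is made on residuals after conditioning on predictors, so a large ratio here is a prompt to investigate, not a formal diagnosis.

```python
from statistics import mean, variance

# Hypothetical event counts (made-up numbers).
counts = [0, 0, 1, 1, 2, 3, 5, 8, 12, 18]

sample_mean = mean(counts)
sample_var = variance(counts)  # sample variance, n - 1 denominator

# Under a Poisson model this ratio should be close to 1.
dispersion_ratio = sample_var / sample_mean

print("mean:", sample_mean)
print("variance:", round(sample_var, 2))
print("variance / mean:", round(dispersion_ratio, 2))
```

Here the variance is roughly seven times the mean, the kind of pattern that motivates a negative binomial model or a group-level random effect.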
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 6.1 | Review and apply Poisson regression for count outcomes | CLO3 |
| 6.2 | Diagnose overdispersion and explain its consequences for inference | CLO3 |
| 6.3 | Estimate multilevel Poisson and negative binomial models | CLO3, CLO6 |
| 6.4 | Interpret rate ratios and expected counts from multilevel count models | CLO3, CLO7 |
| 6.5 | Describe when zero-inflated models might be appropriate | CLO1, CLO3 |
Required Readings
- Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
  - Chapter 6: Generalized Linear Models (sections 6.4-6.5)
Recommended Readings
- Hilbe, J.M. (2014). Modeling Count Data. Cambridge University Press. (Chapters 1-4, available through library)
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (multilevel count models) | 3-4 hours |
| Continue Problem Set 2 | 2-3 hours |
| Total | 9-11 hours |
To Do This Week
Week 7: GLMM Applications and Best Practices
Dates: October 12 - October 16
Module Description
This module consolidates learning about generalized linear mixed models through a complete applied example. We work through the full analysis pipeline—from data exploration through model building, diagnostics, and presentation of results—for a real-world dataset with a non-continuous outcome. The module also addresses practical issues including estimation challenges in GLMMs, convergence problems, and best practices for reporting results in academic and applied contexts.
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 7.1 | Apply a complete GLMM analysis workflow to a real dataset | CLO3, CLO6 |
| 7.2 | Diagnose and address common estimation problems in GLMMs | CLO3 |
| 7.3 | Compare maximum likelihood and restricted maximum likelihood estimation approaches | CLO3 |
| 7.4 | Create publication-ready tables and figures for GLMM results | CLO7 |
| 7.5 | Evaluate the appropriateness of GLMM applications in published research | CLO8 |
Required Readings
- Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
  - Chapter 24: Model Checking and Comparison (revisit relevant sections)
Recommended Readings
- Harrison, X.A. et al. (2018). A brief introduction to mixed effects modelling and multi-model inference in ecology. PeerJ, 6:e4794.
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 1-2 hours |
| Work through applied GLMM analysis example | 3-4 hours |
| Continue Problem Set 2 | 2-3 hours |
| Total | 8-10 hours |
To Do This Week
Week 8: Unit 2 Work Week
Dates: October 19 - October 23
Module Description
No new lecture material this week. Students use this week to complete Problem Set 2, practice GLMM applications, and continue developing their final project plans. Office hours are available for methodological questions and feedback on Problem Set 2.
Estimated Workload
| Activity | Time |
|---|---|
| Complete Problem Set 2 | 5-6 hours |
| Review and practice GLMM concepts | 2-3 hours |
| Develop Final Project plans | 1-2 hours |
| Total | 8-11 hours |
To Do This Week
Unit 3: Longitudinal Data Analysis (Weeks 9–11)
Week 9: Introduction to Longitudinal Data and Bayesian Primer
Dates: October 26 - October 30
Module Description
This module transitions from cross-sectional multilevel models to longitudinal data analysis. Students will learn to recognize longitudinal data as a special case of multilevel data—observations nested within individuals over time—while understanding what makes longitudinal analysis distinct. We introduce key concepts including time-varying vs. time-invariant predictors, the choice between wide and long data formats, and the critical importance of how time is measured and centered. This module also situates longitudinal analysis within the broader landscape of methods for non-independent data, distinguishing it from panel data approaches in econometrics and pure time-series analysis.
The module additionally includes a brief Bayesian primer, because the Kurz supplementary text uses brms, a Bayesian modeling package. This course does not teach Bayesian statistics, but students need enough background to use brms intelligently: what priors, posteriors, and credible intervals are, and how to read brms output. The primer provides just enough context to follow the Kurz materials without a full treatment of Bayesian inference.
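The wide and long formats mentioned above can be illustrated with a small restructuring sketch. It is an illustration only: the course uses R, where tidyr's pivot_longer() is the usual tool. The helper name wide_to_long, the column names, and the data are all hypothetical.

```python
def wide_to_long(rows, prefix, value_name):
    """Stack columns named '<prefix><wave>' into one row per person per wave."""
    long_rows = []
    for row in rows:
        # Columns without the prefix are time-invariant and repeat on every row.
        invariant = {k: v for k, v in row.items() if not k.startswith(prefix)}
        for key, value in row.items():
            if key.startswith(prefix):
                wave = int(key[len(prefix):])
                long_rows.append({**invariant, "wave": wave, value_name: value})
    return long_rows


# Wide format: one row per person, one column per measurement occasion.
wide = [
    {"id": 1, "female": 1, "score_0": 10, "score_1": 12, "score_2": 15},
    {"id": 2, "female": 0, "score_0": 8, "score_1": 9, "score_2": 9},
]

# Long format: one row per person per occasion, as multilevel software expects.
long = wide_to_long(wide, prefix="score_", value_name="score")
for row in long:
    print(row)
```

In the long result, score changes across rows within a person (time-varying) while female repeats unchanged (time-invariant), which mirrors the predictor distinction introduced in this module.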
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 9.1 | Recognize longitudinal data as a special case of multilevel structure | CLO1, CLO4 |
| 9.2 | Distinguish between time-varying and time-invariant predictors | CLO4 |
| 9.3 | Restructure data between wide and long formats | CLO4, CLO6 |
| 9.4 | Distinguish longitudinal, panel, and time-series data structures and their typical methods | CLO1 |
| 9.5 | Explain the multilevel vs. econometric traditions for panel data analysis | CLO1, CLO8 |
| 9.6 | Describe basic Bayesian concepts (prior, posterior, credible interval) sufficiently to interpret brms output | CLO6 |
Required Readings
- Singer & Willett, Applied Longitudinal Data Analysis
  - Chapter 1: A Framework for Investigating Change Over Time
  - Chapter 2: Exploring Longitudinal Data on Change
- Kurz, Applied Longitudinal Data Analysis in brms and the tidyverse
  - Chapters 1-2 (parallel R code)
- Instructor-provided Bayesian Primer handout
Recommended Readings
- Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
  - Chapter 15: Multilevel Models and Longitudinal Data
- McElreath, R. Statistical Rethinking
  - Chapter 1: The Golem of Prague (for accessible Bayesian motivation)
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures (including Bayesian primer) | 2 hours |
| Required readings | 3-4 hours |
| Data restructuring practice | 2-3 hours |
| Begin Problem Set 3 | 1-2 hours |
| Project Idea Post | 1-2 hours |
| Total | 9-13 hours |
To Do This Week
Week 10: Growth Curve Models
Dates: November 2 - November 6
Module Description
This module introduces growth curve models—multilevel models designed to describe and predict individual change over time. We begin with unconditional models that simply describe the average trajectory and individual variation around it, then add predictors to explain why some individuals change more (or differently) than others. Students will learn to model linear and nonlinear trajectories, interpret individual variation in change, and understand how centering time affects interpretation. The connection between growth curve models and the varying slopes models from Unit 1 is made explicit.
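For reference, an unconditional linear growth model can be written as a two-level system. The notation below follows the convention commonly used in the Singer & Willett text (person i, occasion j); it is a sketch of the general form, not a specification of any particular course dataset.

```latex
\begin{aligned}
\text{Level 1 (within person):}\quad & Y_{ij} = \pi_{0i} + \pi_{1i}\,\mathrm{TIME}_{ij} + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma_{\varepsilon}^{2}) \\
\text{Level 2 (between persons):}\quad & \pi_{0i} = \gamma_{00} + \zeta_{0i} \\
 & \pi_{1i} = \gamma_{10} + \zeta_{1i},
  \qquad
  \begin{pmatrix} \zeta_{0i} \\ \zeta_{1i} \end{pmatrix}
  \sim N\!\left(
    \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \begin{pmatrix} \sigma_{0}^{2} & \sigma_{01} \\ \sigma_{01} & \sigma_{1}^{2} \end{pmatrix}
  \right)
\end{aligned}
```

Here the person-specific intercept and rate of change play the roles of the varying intercept and varying slope from Unit 1, with time as the predictor. Replacing TIME with TIME minus a constant c changes the meaning of the intercept to the expected outcome at time c, which is why the centering choice matters for interpretation.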
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 10.1 | Estimate unconditional growth models to describe average and individual change | CLO4, CLO6 |
| 10.2 | Add predictors to explain between-person differences in change trajectories | CLO4 |
| 10.3 | Model nonlinear change using polynomial and other functional forms | CLO4 |
| 10.4 | Center time appropriately and explain the interpretive consequences | CLO4 |
| 10.5 | Visualize individual growth trajectories and model-implied predictions | CLO4, CLO7 |
Required Readings
- Singer & Willett, Applied Longitudinal Data Analysis
  - Chapter 3: Introducing the Multilevel Model for Change
  - Chapter 4: Doing Data Analysis with the Multilevel Model for Change
  - Chapter 5: Treating Time More Flexibly
- Kurz, Applied Longitudinal Data Analysis in brms and the tidyverse
  - Chapters 3-5 (parallel R code)
Recommended Readings
- Singer & Willett, Applied Longitudinal Data Analysis
  - Chapter 6: Modeling Discontinuous and Nonlinear Change
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 3-4 hours |
| Coding practice (growth curve models) | 3-4 hours |
| Continue Problem Set 3 | 1-2 hours |
| Article Review #3 | 1 hour |
| Total | 10-12 hours |
To Do This Week
Week 11: Practical Issues in Longitudinal Analysis
Dates: November 9 - November 13
Module Description
This module addresses the practical challenges that arise in longitudinal data analysis. A major focus is missing data—a nearly universal feature of longitudinal studies. Students will learn to distinguish between missing data mechanisms (MCAR, MAR, MNAR) and understand how multilevel models handle missingness under MAR assumptions. We also cover unbalanced data (individuals with different numbers of observations or irregular time points) and strategies for model selection in longitudinal contexts. By the end of this module, students will be prepared to handle the messiness of real longitudinal data.
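The three missing data mechanisms named above can be illustrated with a short base R simulation. This is a sketch under stated assumptions: the two-wave data, the dropout rules, and the helper `summarize_mechanism()` are all hypothetical, and a simple regression adjustment stands in for the likelihood-based estimation that a multilevel model performs. The point is only that conditioning on observed data removes the bias under MAR but not under MNAR.

```r
set.seed(11)
n  <- 5000
y1 <- rnorm(n)                                # baseline score, always observed
y2 <- 0.7 * y1 + rnorm(n, sd = sqrt(0.51))    # follow-up score, true mean 0

# Three hypothetical dropout rules for the follow-up score
miss <- list(
  MCAR = runif(n) < 0.3,                      # unrelated to any data
  MAR  = runif(n) < plogis(-1 + 1.5 * y1),    # depends on the observed baseline
  MNAR = runif(n) < plogis(-1 + 1.5 * y2)     # depends on the missing value itself
)

summarize_mechanism <- function(is_missing) {
  obs <- !is_missing
  # Naive estimate: mean of the follow-up scores that happen to be observed
  naive <- mean(y2[obs])
  # Adjusted estimate: model y2 given y1 on observed cases, predict for everyone
  fit <- lm(y2 ~ y1, subset = obs)
  adjusted <- mean(predict(fit, newdata = data.frame(y1 = y1)))
  c(naive = naive, adjusted = adjusted)
}

# Rows: naive and adjusted estimates of the follow-up mean; columns: mechanisms
results <- sapply(miss, summarize_mechanism)
round(results, 2)
```

Under MCAR both estimates should sit near the true mean of zero. Under MAR the naive mean is biased but the adjusted one recovers. Under MNAR neither does, which is why MNAR cannot be handled by conditioning on observed data alone.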
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 11.1 | Distinguish between MCAR, MAR, and MNAR missing data mechanisms | CLO4 |
| 11.2 | Explain how and why multilevel models handle MAR missingness | CLO4 |
| 11.3 | Analyze unbalanced longitudinal data with varying numbers of observations | CLO4, CLO6 |
| 11.4 | Apply model selection strategies in longitudinal contexts | CLO4 |
| 11.5 | Diagnose potential violations of assumptions in longitudinal models | CLO4, CLO8 |
Required Readings
- Singer & Willett, Applied Longitudinal Data Analysis
  - Chapter 7: Examining the Multilevel Model’s Error Covariance Structure
- Kurz, Applied Longitudinal Data Analysis in brms and the tidyverse
  - Chapter 7 (parallel R code)
Recommended Readings
- Enders, C.K. (2010). Applied Missing Data Analysis. Guilford Press. (Chapters 1-3, available through library)
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (missing data, unbalanced designs) | 2-3 hours |
| Complete Problem Set 3 | 3-4 hours |
| Total | 9-11 hours |
To Do This Week
Unit 4: Introduction to Time-Series (Weeks 12–13)
Note: This unit provides foundational exposure to time-series methods. Students will learn core concepts and vocabulary, but two weeks is insufficient for deep application. Time-series methods are not eligible for final projects. This unit prepares students for future coursework or self-study in time-series analysis.
Week 12: Foundations of Time-Series
Dates: November 16 - November 20
Module Description
This module introduces time-series analysis, completing our survey of methods for non-independent data. While longitudinal data typically involves many individuals measured at few time points, time-series data typically involves one (or few) units measured at many time points—think economic indicators, stock prices, or climate measurements. The key statistical concern shifts from clustering to autocorrelation: today’s value depends on yesterday’s. Students will learn to identify time-series data structures, diagnose autocorrelation, assess stationarity, and understand the basic building blocks of time-series models (AR and MA processes).
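The ideas of autocorrelation, stationarity, and AR and MA processes can be previewed with simulated series in base R. This is a minimal sketch: the coefficients, series lengths, and object names are arbitrary choices for illustration, and it uses functions from the `stats` package rather than the tooling in the assigned text.

```r
set.seed(12)
n <- 1000

# Simulated series with known structure (coefficients are arbitrary choices)
ar1 <- arima.sim(model = list(ar = 0.7), n = n)   # AR(1): depends on its own past value
ma1 <- arima.sim(model = list(ma = 0.7), n = n)   # MA(1): depends on the previous shock
rw  <- cumsum(rnorm(n))                           # random walk: nonstationary

# Sample autocorrelations; element 1 of acf()$acf is lag 0, so drop it.
# pacf()$acf already starts at lag 1.
acf_ar  <- acf(ar1, lag.max = 5, plot = FALSE)$acf[-1]
pacf_ar <- as.numeric(pacf(ar1, lag.max = 5, plot = FALSE)$acf)
acf_ma  <- acf(ma1, lag.max = 5, plot = FALSE)$acf[-1]
pacf_ma <- as.numeric(pacf(ma1, lag.max = 5, plot = FALSE)$acf)

round(rbind(acf_ar, pacf_ar, acf_ma, pacf_ma), 2)

# First differencing a random walk leaves only the white-noise increments
acf_rw      <- acf(rw, lag.max = 5, plot = FALSE)$acf[-1]
acf_rw_diff <- acf(diff(rw), lag.max = 5, plot = FALSE)$acf[-1]
round(rbind(acf_rw, acf_rw_diff), 2)
```

For the AR(1) series, the ACF should decay gradually while the PACF drops to near zero after lag 1. The MA(1) series shows the reverse pattern, with the ACF cutting off after lag 1.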
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 12.1 | Distinguish time-series data structures from longitudinal and panel data | CLO1, CLO5 |
| 12.2 | Calculate and interpret autocorrelation and partial autocorrelation functions | CLO5 |
| 12.3 | Assess and transform for stationarity | CLO5 |
| 12.4 | Explain autoregressive (AR) and moving average (MA) processes | CLO5 |
| 12.5 | Identify AR and MA signatures in ACF and PACF plots | CLO5 |
Required Readings
- Hyndman & Athanasopoulos, Forecasting: Principles and Practice (3e)
  - Chapter 1: Getting Started
  - Chapter 2: Time Series Graphics
  - Chapter 9: ARIMA Models (sections 9.1-9.5)
Recommended Readings
- Hyndman & Athanasopoulos, Forecasting: Principles and Practice (3e)
  - Chapter 3: Time Series Decomposition
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 3-4 hours |
| Coding practice (ACF, PACF, stationarity) | 2-3 hours |
| Begin Problem Set 4 | 1-2 hours |
| Article Review #4 | 1 hour |
| Project Status Update | 1-2 hours |
| Total | 10-13 hours |
To Do This Week
November 23 - November 27: Thanksgiving Break — No Class Activities
Week 13: Time-Series Regression and Looking Ahead
Dates: November 30 - December 4
Module Description
This module covers regression with time-series data and provides a bridge to further study. We examine how to incorporate autocorrelation into regression models, introduce ARIMA as a general framework, and briefly discuss how time-series concepts apply to panel data (when you have multiple units over many time points). The module concludes by surveying advanced topics and resources for continued learning, including forecasting, state-space models, and Bayesian approaches to time-series.
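One way to incorporate autocorrelation into a regression model, sketched here with base R, is to fit the regression with AR(1) errors through `arima()` and its `xreg` argument. The simulated data, the true coefficients, and the object names are assumptions made for illustration; the assigned text uses different tooling, so treat this as a parallel sketch rather than the course's method.

```r
set.seed(13)
n <- 300
x <- rnorm(n)                                              # hypothetical predictor
e <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))  # AR(1) errors
y <- 1 + 2 * x + e

# Ordinary regression ignores the serial dependence in the errors
fit_ols <- lm(y ~ x)
ols_resid_acf1 <- acf(resid(fit_ols), plot = FALSE)$acf[2]  # lag-1 autocorrelation
ols_resid_acf1

# Regression with AR(1) errors: arima() with an external regressor
fit_ar <- arima(y, order = c(1, 0, 0), xreg = cbind(x = x))
coef(fit_ar)   # ar1, intercept, and the coefficient on x

# Ljung-Box test on the remaining residuals; fitdf = 1 accounts for the AR term
Box.test(residuals(fit_ar), lag = 10, type = "Ljung-Box", fitdf = 1)
```

The OLS residuals carry the autocorrelation that the AR(1) error term is meant to absorb. A Ljung-Box test on the `arima()` residuals is one common check of whether any serial dependence remains.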
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 13.1 | Estimate regression models with autocorrelated errors | CLO5, CLO6 |
| 13.2 | Fit and interpret basic ARIMA models | CLO5, CLO6 |
| 13.3 | Diagnose model fit using residual diagnostics for time-series | CLO5 |
| 13.4 | Describe how time-series methods extend to panel data contexts | CLO1, CLO5 |
| 13.5 | Identify resources for continued learning in time-series analysis | CLO5, CLO8 |
Required Readings
- Hyndman & Athanasopoulos, Forecasting: Principles and Practice (3e)
  - Chapter 7: Time Series Regression Models (sections 7.1-7.4)
  - Chapter 9: ARIMA Models (sections 9.6-9.9)
Recommended Readings
- Hyndman & Athanasopoulos, Forecasting: Principles and Practice (3e)
  - Chapter 10: Dynamic Regression Models (sections 10.1-10.2)
Estimated Workload
| Activity | Time |
|---|---|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (ARIMA, dynamic regression) | 2-3 hours |
| Complete Problem Set 4 | 3-4 hours |
| Total | 9-11 hours |
To Do This Week
Week 14: Final Project Work Week
Dates: December 7 - December 11
Module Description
No new lecture material this week. Students use this week to finalize their analysis, complete their written report, prepare their GitHub repository, and rehearse their presentation. Office hours are available for troubleshooting and feedback.
Estimated Workload
| Activity | Time |
|---|---|
| Finalize Final Project analysis | 4-5 hours |
| Write Final Project report | 3-4 hours |
| Prepare GitHub repository | 1-2 hours |
| Total | 8-11 hours |
To Do This Week
Week 15: Final Project Presentations
Dates: December 14 - December 18
Module Description
This final module brings together everything students have learned throughout the course. Students will present their final projects to the class, explaining their data structure, methodological choices, findings, and limitations. Presentations provide an opportunity to practice communicating complex statistical analyses to an audience and to receive feedback from peers and the instructor. We will also reflect on the course’s central theme—appropriately modeling observations that are not independent—and discuss how these methods connect to further study in causal inference, Bayesian analysis, and machine learning.
Module Learning Objectives
By the end of this module, students will be able to:
| # | Module Learning Objective | Maps to CLO |
|---|---|---|
| 15.1 | Present a complete statistical analysis to a live audience, clearly explaining methodological choices | CLO7 |
| 15.2 | Provide constructive peer feedback on analytical approaches and presentation effectiveness | CLO8 |
| 15.3 | Synthesize course themes around modeling non-independent observations | CLO1-CLO5 |
| 15.4 | Identify connections between course methods and advanced topics | CLO8 |
| 15.5 | Describe strategies for continued learning in advanced quantitative methods | CLO8 |
Required Readings
- No new readings—focus on final project completion
Estimated Workload
| Activity | Time |
|---|---|
| Finalize and polish Final Project report | 3-4 hours |
| Prepare and rehearse presentation | 2-3 hours |
| Attend class presentations and provide peer feedback | 2-3 hours |
| Total | 7-10 hours |
To Do This Week
Conversion Plan: 15-Week to 12-Week Semester
This syllabus is designed for a 15-week fall or spring semester. When teaching the course in a compressed 12-week summer semester, the three work weeks must be eliminated and content consolidated.
Recommended 12-Week Structure
Remove the three work weeks (Weeks 4, 8, and 14 of the 15-week schedule) and adjust pacing:
| 12-Week | 15-Week Equivalent | Topic |
|---|---|---|
| Week 1 | Week 1 | Introduction to Clustered Data and Varying Intercepts |
| Week 2 | Week 2 | Varying Intercepts and Slopes |
| Week 3 | Week 3 | Building and Evaluating Multilevel Models |
| Week 4 | Week 5 | Multilevel Models for Binary Outcomes |
| Week 5 | Week 6 | Multilevel Models for Count Outcomes |
| Week 6 | Week 7 | GLMM Applications and Best Practices |
| Week 7 | Week 9 | Introduction to Longitudinal Data and Bayesian Primer |
| Week 8 | Week 10 | Growth Curve Models |
| Week 9 | Week 11 | Practical Issues in Longitudinal Analysis |
| Week 10 | Week 12 | Foundations of Time-Series |
| Week 11 | Week 13 | Time-Series Regression and Looking Ahead |
| Week 12 | Week 15 | Final Project Presentations |
Key Adjustments for 12-Week Format
Problem Sets: Without dedicated work weeks, each problem set will be due at the end of the final content week of its unit. This moves Problem Sets 1 and 2, which are due during work weeks in the 15-week format, into content weeks:
| Assignment | 15-Week Due Date | 12-Week Due Date |
|---|---|---|
| Skills Check | Week 2 | Week 2 |
| Problem Set 1 | Week 4 | Week 3 |
| Problem Set 2 | Week 8 | Week 6 |
| Problem Set 3 | Week 11 | Week 9 |
| Problem Set 4 | Week 13 | Week 11 |
| Final Project | Week 15 | Week 12 |
Discussion Activities: Keep each activity with its module, so due dates shift with the compressed schedule:
| Activity | 15-Week Due Date | 12-Week Due Date |
|---|---|---|
| Article Review #1 | Week 2 | Week 2 |
| Article Review #2 | Week 5 | Week 4 |
| Project Idea Post | Week 9 | Week 7 |
| Article Review #3 | Week 10 | Week 8 |
| Article Review #4 | Week 12 | Week 10 |
| Project Status Update | Week 12 | Week 10 |
Workload Considerations: Without work weeks, students will need to manage problem sets alongside new content. Consider:
- Reducing coding practice exercises slightly
- Providing more starter code for problem sets
- Being flexible with office hours during problem set weeks
- Acknowledging the increased intensity in course communications
Reading Adjustments: Consider moving some required readings to recommended status, particularly:
- Gelman & Hill Chapter 25 (Missing Data)
- Singer & Willett Chapter 6 (Discontinuous and Nonlinear Change)
- Hyndman & Athanasopoulos Chapter 3 (Time Series Decomposition)