Advanced Quantitative Methods

Course Syllabus/Outline

Course Description

This course provides an advanced treatment of regression analysis for complex data structures, building on foundational knowledge of linear regression and causal inference. Many real-world datasets violate the assumption of independent observations: students are nested within schools, patients within hospitals, survey responses within individuals over time. Standard regression approaches fail in these contexts, producing incorrect standard errors and potentially misleading conclusions. This course equips students with the conceptual understanding and practical skills to analyze clustered, longitudinal, and time-series data appropriately.

Students will develop fluency in multilevel/hierarchical models, generalized linear mixed models for non-continuous outcomes, and growth curve models for longitudinal data. The course also provides foundational exposure to time-series methods. Throughout the course, we emphasize that these methods share a common theme: appropriately modeling observations that are not independent.

The course uses R as the primary computing environment, and students will apply these methods to real datasets through unit problem sets, discussion activities, and a culminating final project. By the end of the course, students will be able to identify when standard regression is inappropriate, select and implement appropriate methods for complex data structures, and communicate findings from these analyses to technical and non-technical audiences.

Prerequisites

This course assumes prior completion of coursework covering:

Linear regression (estimation, interpretation, diagnostics)
Generalized linear models (logistic and Poisson regression)
Fundamentals of causal inference and research design
Working knowledge of R (data manipulation, basic programming, regression modeling)

Recommended prior texts: Bueno de Mesquita & Fowler, Thinking Clearly with Data; Gelman, Hill, & Vehtari, Regression and Other Stories

Course Learning Objectives Reference

#	Course Learning Objective
CLO1	Identify data structures that violate independence assumptions and select appropriate modeling strategies
CLO2	Estimate, interpret, and evaluate multilevel/hierarchical models with varying intercepts and slopes
CLO3	Extend multilevel models to binary and count outcomes using generalized linear mixed models
CLO4	Model change over time using growth curve and longitudinal data analysis techniques
CLO5	Describe foundational time-series concepts including autocorrelation, stationarity, and ARIMA models
CLO6	Implement advanced regression models in R using appropriate packages (lme4, brms, etc.)
CLO7	Communicate findings from complex models to technical and non-technical audiences
CLO8	Critically evaluate published research using multilevel, longitudinal, and time-series methods

Required Textbooks

Textbook	Author(s)	Access
Data Analysis Using Regression and Multilevel/Hierarchical Models	Andrew Gelman & Jennifer Hill	Purchase required
Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence	Judith D. Singer & John B. Willett	Purchase required or library access
Forecasting: Principles and Practice (3e)	Rob J. Hyndman & George Athanasopoulos	Free online: https://otexts.com/fpp3/

Supplementary Resources

Resource	Author(s)	Access
Applied Longitudinal Data Analysis in brms and the tidyverse	A. Solomon Kurz	Free online: https://bookdown.org/content/4253/
Regression and Other Stories	Gelman, Hill, & Vehtari	Reference from prerequisite course

Grading Overview

Assignment	Points
Problem Sets (4 × 10 points each)	40
Discussion Activities	10
Final Project	50
Total	100

Assignment Details

Problem Sets (4 total)

Due Dates: Week 4 (September 25), Week 8 (October 23), Week 11 (November 13), Week 13 (December 4)

Points: 40 total (10 points each)

Each problem set covers the material from one unit and includes applied exercises that require students to:

Analyze provided datasets using the unit’s methods
Interpret model output and write clear explanations
Compare models and justify modeling decisions
Visualize results appropriately
Reflect on assumptions and limitations

Problem sets should be submitted as rendered Quarto documents (HTML) with all code visible and reproducible.

Grading Rubric

A (9.4-10): All problems completed correctly with clear, insightful explanations. Code is clean, well-commented, and reproducible. Interpretations demonstrate deep understanding of the methods. Visualizations effectively communicate results.

A- (9-9.3): Most problems completed correctly with clear explanations. Minor errors in code or interpretation that don’t reflect fundamental misunderstanding. Code is functional and reproducible.

B+ (8.7-8.9): Problems completed with generally correct analysis. Some explanations could be clearer or more thorough. Code is functional with minor issues.

B (8.3-8.6): Problems completed but with some errors in analysis or interpretation. Explanations lack clarity or depth in places. Code may have minor reproducibility issues.

B- (8-8.2): Problems completed but with notable errors in analysis or interpretation. Explanations lack clarity or depth. Code may have reproducibility issues.

C (7): Significant errors in analysis or interpretation. Incomplete problems or missing explanations. Demonstrates limited understanding of the methods.

D (5.1-6.9): Major errors throughout. Substantial portions incomplete or incorrect. Demonstrates minimal understanding of the methods.

F (5): Submitted but demonstrates fundamental misunderstanding of the methods; however, a good-faith attempt was made.

Zero (0): Not submitted or no meaningful attempt.

Ungraded Skills Check

Due Date: Week 2 (September 11)

Points: 0 (ungraded)

At the end of Week 2, students will complete a short skills check covering the foundational concepts from the first two modules: clustered data structures, varying intercepts, varying slopes, and basic lme4 syntax. This self-assessment helps students identify gaps in their understanding before Problem Set 1 is due.

The skills check includes 3-4 short problems similar in style to Problem Set 1. An answer key will be released after the due date so students can check their work. Students who struggle with the skills check are encouraged to attend office hours or review the module materials before proceeding.

This activity is ungraded to encourage honest self-assessment rather than strategic performance.

Discussion Activities

Overview

Discussion activities foster community and keep students engaged throughout the course. There are two types:

Article Reviews (4 total): Read an assigned short article and post a 200-300 word response addressing the prompt. Then reply substantively to at least one classmate’s post (minimum 100 words).

Project Posts (2 total): Share your final project plans and progress with the class, and provide feedback to classmates.

Points: 10 total

Activity	Due Date	Points
Article Review #1	Week 2 (September 11)	1
Article Review #2	Week 5 (October 2)	1
Project Idea Post	Week 9 (October 30)	3
Article Review #3	Week 10 (November 6)	1
Article Review #4	Week 12 (November 20)	1
Project Status Update	Week 12 (November 20)	3

Article Reviews

For each article review, read the assigned article and post a response addressing:

What is the main argument or contribution?
How does this connect to what we’re learning in the course?
What questions does this raise for you, or what do you disagree with?

Then read your classmates’ posts and reply substantively to at least one, extending the conversation or offering a different perspective.

Grading Rubric (Article Reviews)

Full Credit (1): Post demonstrates thoughtful engagement with the article, makes clear connections to course material, and raises genuine questions or critiques. Reply substantively extends the conversation.

Partial Credit (0.5): Post summarizes the article but lacks depth in analysis or connection to course material. Reply is superficial.

No Credit (0): Post is incomplete, off-topic, or fails to engage meaningfully with the article. Reply is missing or perfunctory.

Project Posts

Project Idea Post (Week 9):

Share your planned final project with the class:

What dataset will you use? (brief description of structure and source)
What is your research question?
Which method from the course is most appropriate, and why?
What challenges do you anticipate?

Read at least two classmates’ posts and offer constructive feedback or suggestions.

Project Status Update (Week 12):

Update the class on your progress:

What have you accomplished so far?
What preliminary findings or challenges have emerged?
What remains to be done?
What specific feedback would be helpful from classmates?

Read at least two classmates’ posts and offer constructive feedback, resources, or encouragement.

Grading Rubric (Project Posts)

Full Credit (3): Post provides clear, substantive information about the project. Demonstrates thoughtful planning (Idea Post) or genuine progress and reflection (Status Update). Feedback to classmates is constructive and specific.

Partial Credit (1-2): Post addresses required elements but lacks depth or specificity. Feedback to classmates is generic.

No Credit (0): Post is incomplete or superficial. Feedback to classmates is missing or unhelpful.

Final Project

Due Date: Week 15 (December 18)

Points: 50

Apply one of the methods from Units 1-3 to a dataset of your choosing. The project should demonstrate your ability to identify an appropriate method for a given data structure, implement the analysis correctly, and communicate findings effectively.

Methods Scope: Final projects must use multilevel models, generalized linear mixed models, or longitudinal methods (growth curve models). Time-series methods (Unit 4) are not eligible for final projects. The time-series unit provides foundational exposure to prepare students for future coursework or self-study, but two weeks is insufficient for project-level application.

Deliverables:

Written report (2000-2500 words) including:
- Introduction and research question
- Description of the data structure and why the chosen method is appropriate
- Model building process and justification of decisions
- Results with appropriate visualizations
- Discussion of limitations and assumptions
Reproducible analysis code in a GitHub repository
Brief presentation (10 minutes) to the class

Submission Requirements:

Submit the written report as a self-contained HTML file (rendered from Quarto)
Submit a link to your GitHub repository containing all code and data (or data access instructions)
All deliverables must be submitted for the project to be graded

Grading Rubric

The final project is graded holistically. The instructor first assigns a letter grade based on the criteria below. For grades listed with a point range (A, A-, B, and B-), the instructor then assigns points within that range; all other grades carry the fixed point value shown.

Grade	Points	Description
A	47-50	Sophisticated understanding of the chosen method. Data structure and method choice are clearly and thoroughly justified. Model building is thoughtful with appropriate comparisons and diagnostics. Results are correctly interpreted with meaningful acknowledgment of limitations. Visualizations effectively communicate findings. Writing is clear, professional, and well-organized. Code is clean, well-documented, and fully reproducible. Presentation is clear, engaging, and demonstrates command of the material.
A-	45-46	Strong understanding of the method with clear justification for choices. Analysis is sound with minor opportunities for improvement in depth, presentation, or documentation. Code is clean and reproducible with minor documentation gaps. Presentation is clear and professional.
B+	44	Competent application of the method with reasonable justification. Some aspects of model building, interpretation, or presentation could be strengthened. Code is functional with some organization or documentation issues. Presentation communicates main points effectively.
B	42-43	Meets requirements but the data story or methodological justification is underdeveloped. Interpretation may miss some nuances. Code is functional but may have reproducibility issues. Presentation is adequate but could be clearer or more polished.
B-	40-41	Method applied but with notable weaknesses in justification, interpretation, or presentation. May contain errors that don’t fundamentally undermine the analysis. Code has documentation or reproducibility issues. Presentation covers the basics but lacks depth or clarity.
C	35	Barely meets minimum expectations for graduate-level work. Significant issues with method application, interpretation, or analytical justification. Code may not be reproducible. Presentation is disorganized or unclear.
F	25	Submission demonstrates fundamental misunderstanding of the methods or fails to meet minimum expectations for graduate-level work; however, a good-faith attempt was made.
Zero	0	Not submitted or no meaningful attempt.

Course Schedule Overview

Week	Dates	Topic	Assignments Due
1	Aug 31 - Sep 4	Introduction to Clustered Data and Varying Intercepts
2	Sep 7 - Sep 11	Varying Intercepts and Slopes	Article Review #1, Skills Check
3	Sep 14 - Sep 18	Building and Evaluating Multilevel Models
4	Sep 21 - Sep 25	Unit 1 Work Week	Problem Set 1
5	Sep 28 - Oct 2	Multilevel Models for Binary Outcomes	Article Review #2
6	Oct 5 - Oct 9	Multilevel Models for Count Outcomes
7	Oct 12 - Oct 16	GLMM Applications and Best Practices
8	Oct 19 - Oct 23	Unit 2 Work Week	Problem Set 2
9	Oct 26 - Oct 30	Introduction to Longitudinal Data and Bayesian Primer	Project Idea Post
10	Nov 2 - Nov 6	Growth Curve Models	Article Review #3
11	Nov 9 - Nov 13	Practical Issues in Longitudinal Analysis	Problem Set 3
12	Nov 16 - Nov 20	Foundations of Time-Series	Article Review #4, Project Status Update
13	Nov 30 - Dec 4	Time-Series Regression and Looking Ahead	Problem Set 4
14	Dec 7 - Dec 11	Final Project Work Week
15	Dec 14 - Dec 18	Final Project Presentations	Final Project

Note: November 23-27 is Thanksgiving Break—no class activities.

Unit 1: Multilevel/Hierarchical Models (Weeks 1–4)

Week 1: Introduction to Clustered Data and Varying Intercepts

Dates: August 31 - September 4

Module Description

This module introduces the fundamental problem that motivates multilevel modeling: observations that are not independent due to clustering or nesting. We examine why standard regression fails with clustered data, producing incorrect standard errors and potentially misleading inferences. Students will learn three approaches to clustered data—complete pooling, no pooling, and partial pooling—and understand why partial pooling through multilevel models often represents the best solution. The module introduces the varying intercept model, the simplest multilevel model, and key concepts including intraclass correlation and shrinkage. By the end of this module, students will recognize clustered data structures and implement basic varying intercept models in R.
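
As a rough illustration of partial pooling, shrinkage, and the ICC, the base-R sketch below applies the approximate precision-weighted formula for a group intercept discussed in Gelman & Hill, Chapter 12. All numbers (group sizes, means, and variance components) are hypothetical and treated as known; in practice they would be estimated from data, for example with lme4.

```r
# Hypothetical inputs (not course data): three groups of different sizes
n_j      <- c(2, 10, 100)  # observations per group
ybar_j   <- c(8, 8, 8)     # no-pooling estimates (raw group means)
ybar_all <- 5              # complete-pooling estimate (overall mean)
sigma2_y <- 4              # within-group variance, assumed known here
sigma2_a <- 1              # between-group variance, assumed known here

# Intraclass correlation: share of total variance that lies between groups
icc <- sigma2_a / (sigma2_a + sigma2_y)

# Partial-pooling estimate: precision-weighted average of the group mean
# and the overall mean
w_group <- n_j / sigma2_y
w_all   <- 1 / sigma2_a
alpha_j <- (w_group * ybar_j + w_all * ybar_all) / (w_group + w_all)

print(icc)
print(round(alpha_j, 2))
```

With these made-up values, the smallest group is pulled furthest from its raw mean toward the overall mean, while the largest group's estimate stays close to its own mean. That pattern is what "shrinkage" refers to.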

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
1.1	Identify data structures that violate the independence assumption and explain why this matters for inference	CLO1
1.2	Compare and contrast complete pooling, no pooling, and partial pooling approaches to clustered data	CLO1, CLO2
1.3	Estimate varying intercept models using lme4 in R	CLO2, CLO6
1.4	Calculate and interpret the intraclass correlation coefficient (ICC)	CLO2
1.5	Explain the concept of shrinkage and why multilevel estimates differ from no-pooling estimates	CLO2

Required Readings

Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
- Chapter 11: Multilevel Structures
- Chapter 12: Multilevel Linear Models: The Basics

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	3-4 hours
Coding practice (varying intercept models)	3-4 hours
Begin Problem Set 1	1-2 hours
Total	8-11 hours

To Do This Week

Complete required readings
Complete coding practice exercises on varying intercept models
Begin Problem Set 1
Begin exploring potential datasets for Final Project

Week 2: Varying Intercepts and Slopes

Dates: September 7 - September 11

Module Description

This module extends the basic multilevel model to include varying slopes, allowing the relationship between predictors and outcomes to differ across groups. We examine when varying slopes are necessary, how to interpret the covariance between intercepts and slopes, and the tradeoffs involved in adding model complexity. Students will learn to think carefully about which effects should vary and how to make principled decisions about model structure. The module also introduces the concept of centering predictors and its importance for interpretation in multilevel models.
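
The difference between grand-mean and group-mean centering can be sketched in a few lines of base R. The data frame, the `school` grouping variable, and the `ses` predictor below are invented for illustration only.

```r
# Hypothetical data: six students in two schools
d <- data.frame(
  school = c("A", "A", "A", "B", "B", "B"),
  ses    = c(1, 2, 3, 5, 6, 7)
)

# Grand-mean centering: subtract the overall mean from every observation
d$ses_grand <- d$ses - mean(d$ses)

# Group-mean centering: subtract each school's own mean
d$ses_school_mean <- ave(d$ses, d$school)       # between-school component
d$ses_within      <- d$ses - d$ses_school_mean  # within-school component

print(d)
```

Grand-mean centering only shifts the predictor's origin, whereas group-mean centering removes all between-school differences from the predictor, so a coefficient on `ses_within` describes within-school variation only.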

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
2.1	Extend varying intercept models to include varying slopes	CLO2, CLO6
2.2	Interpret the covariance between random intercepts and slopes	CLO2
2.3	Make principled decisions about which effects should vary across groups	CLO1, CLO2
2.4	Apply group-mean and grand-mean centering and explain the interpretive implications	CLO2
2.5	Visualize varying intercepts and slopes to communicate model results	CLO7

Required Readings

Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
- Chapter 13: Varying Intercepts and Slopes

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	2-3 hours
Coding practice (varying slopes, centering)	3-4 hours
Continue Problem Set 1	1-2 hours
Article Review #1	1 hour
Skills Check	1 hour
Total	10-12 hours

To Do This Week

Complete required readings
Complete coding practice exercises on varying slopes models
Continue working on Problem Set 1
Post Article Review #1 and reply to a classmate by September 11
Complete Skills Check by September 11 (ungraded; answer key released after due date)

Week 3: Building and Evaluating Multilevel Models

Dates: September 14 - September 18

Module Description

This module focuses on the practical aspects of building, evaluating, and presenting multilevel models. Students will learn strategies for model building, including how to decide on model complexity and compare nested models. We cover diagnostic tools for checking model assumptions, including residual analysis and examination of random effects distributions. The module also addresses how to present multilevel model results clearly in tables and figures, and briefly introduces extensions such as cross-classified and three-level models. By the end of this module, students will have a complete workflow for multilevel analysis.

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
3.1	Develop a systematic approach to building multilevel models of increasing complexity	CLO2
3.2	Compare nested multilevel models using likelihood ratio tests and information criteria	CLO2, CLO6
3.3	Conduct residual diagnostics and assess random effects assumptions	CLO2
3.4	Present multilevel model results in clear tables and visualizations	CLO7
3.5	Describe when cross-classified or three-level models might be necessary	CLO1

Required Readings

Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
- Chapter 21: Understanding and Summarizing the Fitted Models
- Chapter 24: Model Checking and Comparison (selections)

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	2-3 hours
Coding practice (model comparison, diagnostics)	3-4 hours
Continue Problem Set 1	2-3 hours
Total	9-11 hours

To Do This Week

Complete required readings
Complete coding practice exercises on model building and diagnostics
Continue working on Problem Set 1

Week 4: Unit 1 Work Week

Dates: September 21 - September 25

Module Description

No new lecture material this week. Students use this week to complete Problem Set 1, solidify their understanding of multilevel models through additional practice, and explore datasets for their final project. Office hours are available for troubleshooting and feedback on Problem Set 1.

Estimated Workload

Activity	Time
Complete Problem Set 1	5-6 hours
Review and practice multilevel concepts	2-3 hours
Explore Final Project datasets	1-2 hours
Total	8-11 hours

To Do This Week

Submit Problem Set 1 by September 25
Review any challenging concepts from Unit 1
Continue exploring potential datasets for Final Project

Unit 2: Generalized Linear Mixed Models (Weeks 5–8)

Week 5: Multilevel Models for Binary Outcomes

Dates: September 28 - October 2

Module Description

This module extends multilevel modeling to binary outcomes using multilevel logistic regression. We begin with a brief review of standard logistic regression before introducing the varying intercept logistic model. A key focus is on the interpretation challenges that arise when combining random effects with nonlinear link functions—random effects are on the log-odds scale, but we often want to communicate in terms of probabilities. Students will learn strategies for meaningful interpretation and visualization of multilevel logistic models.
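
The interpretation challenge can be seen with a small base-R sketch. The fixed intercept and the standard deviation of the group intercepts are hypothetical values chosen for illustration, not estimates from any course dataset.

```r
# Hypothetical values on the log-odds scale
beta0   <- -1  # fixed intercept
sigma_a <- 1   # standard deviation of the varying group intercepts

# Three groups: one SD below average, average, one SD above average
group_offset <- c(-1, 0, 1) * sigma_a

# Convert each group's log-odds to a probability with the inverse logit
p <- plogis(beta0 + group_offset)

print(round(p, 3))
print(round(diff(p), 3))  # probability gaps between neighbouring groups
```

The groups are equally spaced on the log-odds scale, but the gaps between their probabilities are not equal. This is why a single variance component cannot be translated into one fixed "effect" on the probability scale.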

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
5.1	Review and apply logistic regression for binary outcomes	CLO3
5.2	Estimate multilevel logistic regression models with varying intercepts	CLO3, CLO6
5.3	Interpret fixed effects and variance components in multilevel logistic models	CLO3
5.4	Calculate and visualize predicted probabilities from multilevel logistic models	CLO3, CLO7
5.5	Explain the challenges of interpreting random effects on the probability scale	CLO3

Required Readings

Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
- Chapter 5: Logistic Regression (review)
- Chapter 6: Generalized Linear Models (sections 6.1-6.3)
- Chapter 14: Multilevel Logistic Regression

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	3-4 hours
Coding practice (multilevel logistic models)	3-4 hours
Begin Problem Set 2	1-2 hours
Article Review #2	1 hour
Total	10-12 hours

To Do This Week

Complete required readings
Complete coding practice exercises on multilevel logistic models
Begin Problem Set 2
Post Article Review #2 and reply to a classmate by October 2

Week 6: Multilevel Models for Count Outcomes

Dates: October 5 - October 9

Module Description

This module covers multilevel models for count outcomes, extending Poisson and negative binomial regression to include random effects. We review standard count models and their assumptions before introducing varying intercept Poisson models. A key topic is overdispersion—when the variance exceeds what the Poisson distribution predicts—and how negative binomial models address this issue. The module briefly introduces zero-inflated models for data with excess zeros.
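
A simple, single-level way to check for overdispersion is the Pearson dispersion statistic from a Poisson fit. The sketch below uses a small made-up count vector and an intercept-only model, so it only illustrates the idea and is not a multilevel analysis.

```r
# Hypothetical counts (not course data)
counts <- c(0, 0, 1, 1, 2, 3, 5, 8, 13, 21)

# Intercept-only Poisson regression
fit <- glm(counts ~ 1, family = poisson)

# Pearson dispersion statistic: values well above 1 suggest overdispersion
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

print(mean(counts))
print(var(counts))
print(dispersion)
```

For an intercept-only model this statistic equals the sample variance divided by the sample mean. Under a Poisson distribution the variance equals the mean, so a value far above 1 signals that Poisson standard errors are likely too small.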

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
6.1	Review and apply Poisson regression for count outcomes	CLO3
6.2	Diagnose overdispersion and explain its consequences for inference	CLO3
6.3	Estimate multilevel Poisson and negative binomial models	CLO3, CLO6
6.4	Interpret rate ratios and expected counts from multilevel count models	CLO3, CLO7
6.5	Describe when zero-inflated models might be appropriate	CLO1, CLO3

Required Readings

Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
- Chapter 6: Generalized Linear Models (sections 6.4-6.5)

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	2-3 hours
Coding practice (multilevel count models)	3-4 hours
Continue Problem Set 2	2-3 hours
Total	9-11 hours

To Do This Week

Complete required readings
Complete coding practice exercises on multilevel count models
Continue working on Problem Set 2

Week 7: GLMM Applications and Best Practices

Dates: October 12 - October 16

Module Description

This module consolidates learning about generalized linear mixed models through a complete applied example. We work through the full analysis pipeline—from data exploration through model building, diagnostics, and presentation of results—for a real-world dataset with a non-continuous outcome. The module also addresses practical issues including estimation challenges in GLMMs, convergence problems, and best practices for reporting results in academic and applied contexts.

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
7.1	Apply a complete GLMM analysis workflow to a real dataset	CLO3, CLO6
7.2	Diagnose and address common estimation problems in GLMMs	CLO3
7.3	Compare maximum likelihood and restricted maximum likelihood estimation approaches	CLO3
7.4	Create publication-ready tables and figures for GLMM results	CLO7
7.5	Evaluate the appropriateness of GLMM applications in published research	CLO8

Required Readings

Gelman & Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
- Chapter 24: Model Checking and Comparison (revisit relevant sections)

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	1-2 hours
Work through applied GLMM analysis example	3-4 hours
Continue Problem Set 2	2-3 hours
Total	8-10 hours

To Do This Week

Complete required readings
Work through applied GLMM analysis example
Continue working on Problem Set 2

Week 8: Unit 2 Work Week

Dates: October 19 - October 23

Module Description

No new lecture material this week. Students use this week to complete Problem Set 2, practice GLMM applications, and continue developing their final project plans. Office hours are available for methodological questions and feedback on Problem Set 2.

Estimated Workload

Activity	Time
Complete Problem Set 2	5-6 hours
Review and practice GLMM concepts	2-3 hours
Develop Final Project plans	1-2 hours
Total	8-11 hours

To Do This Week

Submit Problem Set 2 by October 23
Review any challenging concepts from Unit 2
Finalize Final Project dataset and method selection

Unit 3: Longitudinal Data Analysis (Weeks 9–11)

Week 9: Introduction to Longitudinal Data and Bayesian Primer

Dates: October 26 - October 30

Module Description

This module transitions from cross-sectional multilevel models to longitudinal data analysis. Students will learn to recognize longitudinal data as a special case of multilevel data—observations nested within individuals over time—while understanding what makes longitudinal analysis distinct. We introduce key concepts including time-varying vs. time-invariant predictors, the choice between wide and long data formats, and the critical importance of how time is measured and centered. This module also situates longitudinal analysis within the broader landscape of methods for non-independent data, distinguishing it from panel data approaches in econometrics and pure time-series analysis.
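
As a small illustration of the wide and long formats, the sketch below reshapes an invented three-person dataset with base R's `reshape()`. The column names (`id`, `score_0` to `score_2`, `wave`) are hypothetical, and other tools such as the tidyr package can do the same job.

```r
# Hypothetical wide data: one row per person, one column per wave
wide <- data.frame(
  id      = 1:3,
  score_0 = c(10, 12, 9),
  score_1 = c(11, 15, 9),
  score_2 = c(13, 17, 10)
)

# Long format: one row per person-wave, as multilevel software expects
long <- reshape(
  wide,
  direction = "long",
  idvar     = "id",
  varying   = c("score_0", "score_1", "score_2"),
  v.names   = "score",
  timevar   = "wave",
  times     = 0:2
)

# Sort by person and wave for readability
long <- long[order(long$id, long$wave), ]
rownames(long) <- NULL

print(long)
```

In the long version, the repeated measurements are stacked in one `score` column and `wave` records when each was taken, which makes the nesting of observations within individuals explicit.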

The module also includes a brief Bayesian primer because the Kurz supplementary text uses brms, a Bayesian modeling package. While this course does not teach Bayesian statistics, students need enough background to use brms intelligently: understanding priors, posteriors, credible intervals, and how to read brms output. The primer provides just enough context to follow the Kurz materials without requiring a full treatment of Bayesian inference.

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
9.1	Recognize longitudinal data as a special case of multilevel structure	CLO1, CLO4
9.2	Distinguish between time-varying and time-invariant predictors	CLO4
9.3	Restructure data between wide and long formats	CLO4, CLO6
9.4	Distinguish longitudinal, panel, and time-series data structures and their typical methods	CLO1
9.5	Explain the multilevel vs. econometric traditions for panel data analysis	CLO1, CLO8
9.6	Describe basic Bayesian concepts (prior, posterior, credible interval) sufficiently to interpret brms output	CLO6

Required Readings

Singer & Willett, Applied Longitudinal Data Analysis
- Chapter 1: A Framework for Investigating Change Over Time
- Chapter 2: Exploring Longitudinal Data on Change
Kurz, Applied Longitudinal Data Analysis in brms and the tidyverse
- Chapters 1-2 (parallel R code)
Instructor-provided Bayesian Primer handout

Estimated Workload

Activity	Time
Video lectures (including Bayesian primer)	2 hours
Required readings	3-4 hours
Data restructuring practice	2-3 hours
Begin Problem Set 3	1-2 hours
Project Idea Post	1-2 hours
Total	9-13 hours

To Do This Week

Complete required readings (including Bayesian primer handout)
Complete data restructuring exercises
Begin Problem Set 3
Post Project Idea and reply to two classmates by October 30

Week 10: Growth Curve Models

Dates: November 2 - November 6

Module Description

This module introduces growth curve models—multilevel models designed to describe and predict individual change over time. We begin with unconditional models that simply describe the average trajectory and individual variation around it, then add predictors to explain why some individuals change more (or differently) than others. Students will learn to model linear and nonlinear trajectories, interpret individual variation in change, and understand how centering time affects interpretation. The connection between growth curve models and the varying slopes models from Unit 1 is made explicit.
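
The effect of centering time can be previewed with a single hypothetical individual's trajectory fit by ordinary least squares. The ages and scores below are invented, and a growth curve model would estimate such trajectories for all individuals jointly rather than one at a time.

```r
# Hypothetical trajectory for one person measured at ages 11 to 15
d <- data.frame(
  age = c(11, 12, 13, 14, 15),
  y   = c(2.1, 3.9, 6.2, 7.8, 10.0)
)

# Time in its raw metric: the intercept is the predicted value at age 0
fit_raw <- lm(y ~ age, data = d)

# Time centered at the first wave: the intercept is the predicted value at age 11
fit_centered <- lm(y ~ I(age - 11), data = d)

print(coef(fit_raw))
print(coef(fit_centered))
```

The slope (rate of change) is identical in both fits. Only the intercept changes, because it always refers to the point where the time variable equals zero.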

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
10.1	Estimate unconditional growth models to describe average and individual change	CLO4, CLO6
10.2	Add predictors to explain between-person differences in change trajectories	CLO4
10.3	Model nonlinear change using polynomial and other functional forms	CLO4
10.4	Center time appropriately and explain the interpretive consequences	CLO4
10.5	Visualize individual growth trajectories and model-implied predictions	CLO4, CLO7

Required Readings

Singer & Willett, Applied Longitudinal Data Analysis
- Chapter 3: Introducing the Multilevel Model for Change
- Chapter 4: Doing Data Analysis with the Multilevel Model for Change
- Chapter 5: Treating Time More Flexibly
Kurz, Applied Longitudinal Data Analysis in brms and the tidyverse
- Chapters 3-5 (parallel R code)

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	3-4 hours
Coding practice (growth curve models)	3-4 hours
Continue Problem Set 3	1-2 hours
Article Review #3	1 hour
Total	10-12 hours

To Do This Week

Complete required readings
Complete coding practice exercises on growth curve models
Continue working on Problem Set 3
Post Article Review #3 and reply to a classmate by November 6

Week 11: Practical Issues in Longitudinal Analysis

Dates: November 9 - November 13

Module Description

This module addresses the practical challenges that arise in longitudinal data analysis. A major focus is missing data—a nearly universal feature of longitudinal studies. Students will learn to distinguish between missing data mechanisms (MCAR, MAR, MNAR) and understand how multilevel models handle missingness under MAR assumptions. We also cover unbalanced data (individuals with different numbers of observations or irregular time points) and strategies for model selection in longitudinal contexts. By the end of this module, students will be prepared to handle the messiness of real longitudinal data.

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
11.1	Distinguish between MCAR, MAR, and MNAR missing data mechanisms	CLO4
11.2	Explain how and why multilevel models handle MAR missingness	CLO4
11.3	Analyze unbalanced longitudinal data with varying numbers of observations	CLO4, CLO6
11.4	Apply model selection strategies in longitudinal contexts	CLO4
11.5	Diagnose potential violations of assumptions in longitudinal models	CLO4, CLO8

Required Readings

Singer & Willett, Applied Longitudinal Data Analysis
- Chapter 7: Examining the Multilevel Model’s Error Covariance Structure
Kurz, Applied Longitudinal Data Analysis in brms and the tidyverse
- Chapter 7 (parallel R code)

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	2-3 hours
Coding practice (missing data, unbalanced designs)	2-3 hours
Complete Problem Set 3	3-4 hours
Total	9-11 hours

To Do This Week

Complete required readings
Complete coding practice exercises on practical longitudinal issues
Submit Problem Set 3 by November 13

Unit 4: Introduction to Time-Series (Weeks 12–13)

Note: This unit provides foundational exposure to time-series methods. Students will learn core concepts and vocabulary, but two weeks is insufficient for deep application. Time-series methods are not eligible for final projects. This unit prepares students for future coursework or self-study in time-series analysis.

Week 12: Foundations of Time-Series

Dates: November 16 - November 20

Module Description

This module introduces time-series analysis, completing our survey of methods for non-independent data. While longitudinal data typically involves many individuals measured at few time points, time-series data typically involves one (or few) units measured at many time points—think economic indicators, stock prices, or climate measurements. The key statistical concern shifts from clustering to autocorrelation: today’s value depends on yesterday’s. Students will learn to identify time-series data structures, diagnose autocorrelation, assess stationarity, and understand the basic building blocks of time-series models (AR and MA processes).
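To make the ideas of autocorrelation and stationarity concrete, here is a minimal sketch using only base R's `stats` package. The AR coefficient (0.7), series length, and seeds are arbitrary choices for illustration and do not come from the course materials.

```r
# Simulated illustration of ACF/PACF signatures and differencing.
set.seed(42)

# 500 observations from a stationary AR(1) process with coefficient 0.7.
y <- arima.sim(model = list(ar = 0.7), n = 500)

# Sample autocorrelations and partial autocorrelations (no plots drawn).
acf_vals  <- acf(y, lag.max = 10, plot = FALSE)$acf[-1]  # drop lag 0
pacf_vals <- pacf(y, lag.max = 10, plot = FALSE)$acf     # starts at lag 1

# For an AR(1) process the ACF decays gradually, while the PACF
# is large at lag 1 and small afterwards.
round(acf_vals[1:4], 2)
round(pacf_vals[1:4], 2)

# A random walk is non-stationary; first differencing removes the trend
# in level and leaves (approximately) white noise.
walk   <- cumsum(rnorm(500))
d_walk <- diff(walk)
walk_lag1   <- acf(walk,   plot = FALSE)$acf[2]  # lag-1 autocorrelation
d_walk_lag1 <- acf(d_walk, plot = FALSE)$acf[2]
c(walk = walk_lag1, differenced = d_walk_lag1)
```

Replacing `plot = FALSE` with the default draws the usual correlograms, which is the form in which these signatures are normally read.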

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
12.1	Distinguish time-series data structures from longitudinal and panel data	CLO1, CLO5
12.2	Calculate and interpret autocorrelation and partial autocorrelation functions	CLO5
12.3	Assess and transform for stationarity	CLO5
12.4	Explain autoregressive (AR) and moving average (MA) processes	CLO5
12.5	Identify AR and MA signatures in ACF and PACF plots	CLO5

Required Readings

Hyndman & Athanasopoulos, Forecasting: Principles and Practice (3e)
- Chapter 1: Getting Started
- Chapter 2: Time Series Graphics
- Chapter 9: ARIMA Models (sections 9.1-9.5)

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	3-4 hours
Coding practice (ACF, PACF, stationarity)	2-3 hours
Begin Problem Set 4	1-2 hours
Article Review #4	1 hour
Project Status Update	1-2 hours
Total	10-13 hours

To Do This Week

Complete required readings
Complete coding practice exercises on time-series foundations
Begin Problem Set 4
Post Article Review #4 and reply to a classmate by November 20
Post Project Status Update and reply to two classmates by November 20

November 23 - November 27: Thanksgiving Break — No Class Activities

Week 13: Time-Series Regression and Looking Ahead

Dates: November 30 - December 4

Module Description

This module covers regression with time-series data and provides a bridge to further study. We examine how to incorporate autocorrelation into regression models, introduce ARIMA as a general framework, and briefly discuss how time-series concepts apply to panel data (when you have multiple units over many time points). The module concludes by surveying advanced topics and resources for continued learning, including forecasting, state-space models, and Bayesian approaches to time-series.
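One way to incorporate autocorrelation into a regression is to give the errors an ARIMA structure. The sketch below uses base R's `arima()` with an `xreg` argument on simulated data. The coefficients, sample size, and the choice of AR(1) errors are assumptions for illustration only, and this is one possible approach rather than the course's prescribed workflow.

```r
# Simulated illustration: regression with AR(1) errors.
set.seed(7)
n <- 300
x <- rnorm(n)

# Errors follow an AR(1) process with coefficient 0.6.
e <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))
y <- 2 + 1.5 * x + e

# Ordinary least squares ignores the autocorrelation in the errors.
ols <- lm(y ~ x)

# Ljung-Box test on the OLS residuals: a small p-value signals
# leftover autocorrelation.
lb_ols <- Box.test(residuals(ols), lag = 10, type = "Ljung-Box")

# Regression with ARIMA(1,0,0) errors; the predictor enters through xreg.
fit <- arima(y, order = c(1, 0, 0), xreg = cbind(x = x))
coef(fit)

# Residual diagnostic for the time-series model
# (fitdf = 1 accounts for the one estimated AR parameter).
lb_fit <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)
c(ols_p = lb_ols$p.value, arima_p = lb_fit$p.value)
```

The slope estimates from the two fits are usually similar here; the main difference is that the second model accounts for serial dependence when computing uncertainty and leaves residuals that look closer to white noise.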

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
13.1	Estimate regression models with autocorrelated errors	CLO5, CLO6
13.2	Fit and interpret basic ARIMA models	CLO5, CLO6
13.3	Diagnose model fit using residual diagnostics for time-series	CLO5
13.4	Describe how time-series methods extend to panel data contexts	CLO1, CLO5
13.5	Identify resources for continued learning in time-series analysis	CLO5, CLO8

Required Readings

Hyndman & Athanasopoulos, Forecasting: Principles and Practice (3e)
- Chapter 7: Time Series Regression Models (sections 7.1-7.4)
- Chapter 9: ARIMA Models (sections 9.6-9.9)

Estimated Workload

Activity	Time
Video lectures	1.5 hours
Required readings	2-3 hours
Coding practice (ARIMA, dynamic regression)	2-3 hours
Complete Problem Set 4	3-4 hours
Total	9-11 hours

To Do This Week

Complete required readings
Complete coding practice exercises on time-series regression
Submit Problem Set 4 by December 4
Begin finalizing Final Project analysis

Week 14: Final Project Work Week

Dates: December 7 - December 11

Module Description

No new lecture material this week. Students use this week to finalize their analysis, complete their written report, prepare their GitHub repository, and rehearse their presentation. Office hours are available for troubleshooting and feedback.

Estimated Workload

Activity	Time
Finalize Final Project analysis	4-5 hours
Write Final Project report	3-4 hours
Prepare GitHub repository	1-2 hours
Total	8-11 hours

To Do This Week

Finalize Final Project analysis
Draft Final Project written report
Prepare GitHub repository with reproducible code
Begin preparing presentation

Week 15: Final Project Presentations

Dates: December 14 - December 18

Module Description

This final module brings together everything students have learned throughout the course. Students will present their final projects to the class, explaining their data structure, methodological choices, findings, and limitations. Presentations provide an opportunity to practice communicating complex statistical analyses to an audience and to receive feedback from peers and the instructor. We will also reflect on the course’s central theme—appropriately modeling observations that are not independent—and discuss how these methods connect to further study in causal inference, Bayesian analysis, and machine learning.

Module Learning Objectives

By the end of this module, students will be able to:

#	Module Learning Objective	Maps to CLO
15.1	Present a complete statistical analysis to a live audience, clearly explaining methodological choices	CLO7
15.2	Provide constructive peer feedback on analytical approaches and presentation effectiveness	CLO8
15.3	Synthesize course themes around modeling non-independent observations	CLO1-CLO5
15.4	Identify connections between course methods and advanced topics	CLO8
15.5	Describe strategies for continued learning in advanced quantitative methods	CLO8

Required Readings

No new readings—focus on final project completion

Estimated Workload

Activity	Time
Finalize and polish Final Project report	3-4 hours
Prepare and rehearse presentation	2-3 hours
Attend class presentations and provide peer feedback	2-3 hours
Total	7-10 hours

To Do This Week

Finalize and polish Final Project report
Prepare and rehearse 10-minute presentation
Submit Final Project by December 18 (include GitHub repository link and written report HTML)
Attend class presentations and provide peer feedback

Conversion Plan: 15-Week to 12-Week Semester

This syllabus is designed for a 15-week fall or spring semester. When teaching the course in a compressed 12-week summer semester, the three work weeks must be eliminated and content consolidated.

Recommended 12-Week Structure

Remove the three work weeks and adjust pacing:

12-Week	15-Week Equivalent	Topic
Week 1	Week 1	Introduction to Clustered Data and Varying Intercepts
Week 2	Week 2	Varying Intercepts and Slopes
Week 3	Week 3	Building and Evaluating Multilevel Models
Week 4	Week 5	Multilevel Models for Binary Outcomes
Week 5	Week 6	Multilevel Models for Count Outcomes
Week 6	Week 7	GLMM Applications and Best Practices
Week 7	Week 9	Introduction to Longitudinal Data and Bayesian Primer
Week 8	Week 10	Growth Curve Models
Week 9	Week 11	Practical Issues in Longitudinal Analysis
Week 10	Week 12	Foundations of Time-Series
Week 11	Week 13	Time-Series Regression and Looking Ahead
Week 12	Week 15	Final Project Presentations

Key Adjustments for 12-Week Format

Problem Sets: Without dedicated work weeks, problem sets will be due at the end of the final content week for each unit rather than during a work week:

Assignment	15-Week Due Date	12-Week Due Date
Skills Check	Week 2	Week 2
Problem Set 1	Week 4	Week 3
Problem Set 2	Week 8	Week 6
Problem Set 3	Week 11	Week 9
Problem Set 4	Week 13	Week 11
Final Project	Week 15	Week 12

Discussion Activities: Keep all six activities and shift their due dates to fit the compressed schedule:

Activity	15-Week Due Date	12-Week Due Date
Article Review #1	Week 2	Week 2
Article Review #2	Week 5	Week 4
Project Idea Post	Week 9	Week 7
Article Review #3	Week 10	Week 8
Article Review #4	Week 12	Week 10
Project Status Update	Week 12	Week 10

Workload Considerations: Without work weeks, students will need to manage problem sets alongside new content. Consider:

Reducing coding practice exercises slightly
Providing more starter code for problem sets
Being flexible with office hours during problem set weeks
Acknowledging the increased intensity in course communications

Reading Adjustments: Consider moving some required readings to recommended status, particularly:

Gelman & Hill Chapter 25 (Missing Data)
Singer & Willett Chapter 6 (Discontinuous and Nonlinear Change)
Hyndman & Athanasopoulos Chapter 3 (Time Series Decomposition)


---
title: "Advanced Quantitative Methods"
subtitle: "Course Syllabus/Outline"
format:
  html:
    theme: cosmo
    toc: true
    toc-depth: 3
    toc-location: left
    toc-title: "Contents"
    number-sections: false
    smooth-scroll: true
    link-external-newwindow: true
    code-fold: true
    code-tools: true
    highlight-style: github
    mainfont: "Open Sans"
    fontsize: "1.1em"
    linestretch: 1.5
    grid:
      sidebar-width: 300px
      body-width: 900px
      margin-width: 200px
    embed-resources: true
---

## Course Description

This course provides an advanced treatment of regression analysis for complex data structures, building on foundational knowledge of linear regression and causal inference. Many real-world datasets violate the assumption of independent observations: students are nested within schools, patients within hospitals, survey responses within individuals over time. Standard regression approaches fail in these contexts, producing incorrect standard errors and potentially misleading conclusions. This course equips students with the conceptual understanding and practical skills to analyze clustered, longitudinal, and time-series data appropriately. Students will develop fluency in multilevel/hierarchical models, generalized linear mixed models for non-continuous outcomes, and growth curve models for longitudinal data. The course also provides foundational exposure to time-series methods. Throughout the course, we emphasize that these methods share a common theme: appropriately modeling observations that are not independent. The course uses R as the primary computing environment, and students will apply these methods to real datasets through unit problem sets, discussion activities, and a culminating final project. By the end of the course, students will be able to identify when standard regression is inappropriate, select and implement appropriate methods for complex data structures, and communicate findings from these analyses to technical and non-technical audiences.

---

## Prerequisites

This course assumes prior completion of coursework covering:

- Linear regression (estimation, interpretation, diagnostics)
- Generalized linear models (logistic and Poisson regression)
- Fundamentals of causal inference and research design
- Working knowledge of R (data manipulation, basic programming, regression modeling)

Recommended prior texts: Bueno de Mesquita & Fowler, *Thinking Clearly with Data*; Gelman, Hill, & Vehtari, *Regression and Other Stories*

---

## Course Learning Objectives Reference

| # | Course Learning Objective |
|---|---------------------------|
| **CLO1** | Identify data structures that violate independence assumptions and select appropriate modeling strategies |
| **CLO2** | Estimate, interpret, and evaluate multilevel/hierarchical models with varying intercepts and slopes |
| **CLO3** | Extend multilevel models to binary and count outcomes using generalized linear mixed models |
| **CLO4** | Model change over time using growth curve and longitudinal data analysis techniques |
| **CLO5** | Describe foundational time-series concepts including autocorrelation, stationarity, and ARIMA models |
| **CLO6** | Implement advanced regression models in R using appropriate packages (lme4, brms, etc.) |
| **CLO7** | Communicate findings from complex models to technical and non-technical audiences |
| **CLO8** | Critically evaluate published research using multilevel, longitudinal, and time-series methods |

---

## Required Textbooks

| Textbook | Author(s) | Access |
|----------|-----------|--------|
| *Data Analysis Using Regression and Multilevel/Hierarchical Models* | Andrew Gelman & Jennifer Hill | Purchase required |
| *Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence* | Judith D. Singer & John B. Willett | Purchase required or library access |
| *Forecasting: Principles and Practice (3e)* | Rob J. Hyndman & George Athanasopoulos | Free online: https://otexts.com/fpp3/ |

## Supplementary Resources

| Resource | Author(s) | Access |
|----------|-----------|--------|
| *Applied Longitudinal Data Analysis in brms and the tidyverse* | A. Solomon Kurz | Free online: https://bookdown.org/content/4253/ |
| *Regression and Other Stories* | Gelman, Hill, & Vehtari | Reference from prerequisite course |

---

## Grading Overview

| Assignment | Points |
|------------|--------|
| Problem Sets (4 × 10 points each) | 40 |
| Discussion Activities | 10 |
| Final Project | 50 |
| **Total** | **100** |

---

## Assignment Details

### Problem Sets (4 total)

**Due Dates:** Week 4 (September 25), Week 8 (October 23), Week 11 (November 13), Week 13 (December 4)

**Points:** 40 total (10 points each)

Each problem set covers the material from one unit and includes applied exercises that require students to:

- Analyze provided datasets using the unit's methods
- Interpret model output and write clear explanations
- Compare models and justify modeling decisions
- Visualize results appropriately
- Reflect on assumptions and limitations

Problem sets should be submitted as rendered Quarto documents (HTML) with all code visible and reproducible.

#### Grading Rubric

**A (9.4-10):** All problems completed correctly with clear, insightful explanations. Code is clean, well-commented, and reproducible. Interpretations demonstrate deep understanding of the methods. Visualizations effectively communicate results.

**A- (9-9.3):** Most problems completed correctly with clear explanations. Minor errors in code or interpretation that don't reflect fundamental misunderstanding. Code is functional and reproducible.

**B+ (8.7-8.9):** Problems completed with generally correct analysis. Some explanations could be clearer or more thorough. Code is functional with minor issues.

**B (8.3-8.6):** Problems completed but with some errors in analysis or interpretation. Explanations lack clarity or depth in places. Code may have minor reproducibility issues.

**B- (8-8.2):** Problems completed but with notable errors in analysis or interpretation. Explanations lack clarity or depth. Code may have reproducibility issues.

**C (7):** Significant errors in analysis or interpretation. Incomplete problems or missing explanations. Demonstrates limited understanding of the methods.

**D (5.1-6.9):** Major errors throughout. Substantial portions incomplete or incorrect. Demonstrates minimal understanding of the methods.

**F (5):** Submitted but demonstrates fundamental misunderstanding of the methods; however, a good-faith attempt was made.

**Zero (0):** Not submitted or no meaningful attempt.

---

### Ungraded Skills Check

**Due Date:** Week 2 (September 11)

**Points:** 0 (ungraded)

At the end of Week 2, students will complete a short skills check covering the foundational concepts from the first two modules: clustered data structures, varying intercepts, varying slopes, and basic lme4 syntax. This self-assessment helps students identify gaps in their understanding before Problem Set 1 is due.

The skills check includes 3-4 short problems similar in style to Problem Set 1. An answer key will be released after the due date so students can check their work. Students who struggle with the skills check are encouraged to attend office hours or review the module materials before proceeding. This activity is ungraded to encourage honest self-assessment rather than strategic performance.

---

### Discussion Activities

#### Overview

Discussion activities foster community and keep students engaged throughout the course. There are two types:

**Article Reviews (4 total):** Read an assigned short article and post a 200-300 word response addressing the prompt. Then reply substantively to at least one classmate's post (minimum 100 words).

**Project Posts (2 total):** Share your final project plans and progress with the class, and provide feedback to classmates.

**Points:** 10 total

| Activity | Due Date | Points |
|----------|----------|--------|
| Article Review #1 | Week 2 (September 11) | 1 |
| Article Review #2 | Week 5 (October 2) | 1 |
| Project Idea Post | Week 9 (October 30) | 3 |
| Article Review #3 | Week 10 (November 6) | 1 |
| Article Review #4 | Week 12 (November 20) | 1 |
| Project Status Update | Week 12 (November 20) | 3 |

---

#### Article Reviews

For each article review, read the assigned article and post a response addressing:

- What is the main argument or contribution?
- How does this connect to what we're learning in the course?
- What questions does this raise for you, or what do you disagree with?

Then read your classmates' posts and reply substantively to at least one, extending the conversation or offering a different perspective.

##### Grading Rubric (Article Reviews)

**Full Credit (1):** Post demonstrates thoughtful engagement with the article, makes clear connections to course material, and raises genuine questions or critiques. Reply substantively extends the conversation.

**Partial Credit (0.5):** Post summarizes the article but lacks depth in analysis or connection to course material. Reply is superficial.

**No Credit (0):** Post is incomplete, off-topic, or fails to engage meaningfully with the article. Reply is missing or perfunctory.

---

#### Project Posts

**Project Idea Post (Week 9):** Share your planned final project with the class:

- What dataset will you use? (brief description of structure and source)
- What is your research question?
- Which method from the course is most appropriate, and why?
- What challenges do you anticipate?

Read at least two classmates' posts and offer constructive feedback or suggestions.

**Project Status Update (Week 12):** Update the class on your progress:

- What have you accomplished so far?
- What preliminary findings or challenges have emerged?
- What remains to be done?
- What specific feedback would be helpful from classmates?

Read at least two classmates' posts and offer constructive feedback, resources, or encouragement.

##### Grading Rubric (Project Posts)

**Full Credit (3):** Post provides clear, substantive information about the project. Demonstrates thoughtful planning (Idea Post) or genuine progress and reflection (Status Update). Feedback to classmates is constructive and specific.

**Partial Credit (1-2):** Post addresses required elements but lacks depth or specificity. Feedback to classmates is generic.

**No Credit (0):** Post is incomplete or superficial. Feedback to classmates is missing or unhelpful.

---

### Final Project

**Due Date:** Week 15 (December 18)

**Points:** 50

Apply one of the methods from Units 1-3 to a dataset of your choosing. The project should demonstrate your ability to identify an appropriate method for a given data structure, implement the analysis correctly, and communicate findings effectively.

**Methods Scope:** Final projects must use multilevel models, generalized linear mixed models, or longitudinal methods (growth curve models). Time-series methods (Unit 4) are not eligible for final projects. The time-series unit provides foundational exposure to prepare students for future coursework or self-study, but two weeks is insufficient for project-level application.

**Deliverables:**

- Written report (2000-2500 words) including:
  - Introduction and research question
  - Description of data structure and why chosen method is appropriate
  - Model building process and justification of decisions
  - Results with appropriate visualizations
  - Discussion of limitations and assumptions
- Reproducible analysis code in a GitHub repository
- Brief presentation (10 minutes) to the class

**Submission Requirements:**

- Submit the written report as a self-contained HTML file (rendered from Quarto)
- Submit a link to your GitHub repository containing all code and data (or data access instructions)
- All deliverables must be submitted for the project to be graded

#### Grading Rubric

The final project is graded holistically by letter grade. The instructor assigns a letter grade based on the criteria below, then assigns points within the range for grades B- through A.

| Grade | Points | Description |
|-------|--------|-------------|
| A | 47-50 | Sophisticated understanding of the chosen method. Data structure and method choice are clearly and thoroughly justified. Model building is thoughtful with appropriate comparisons and diagnostics. Results are correctly interpreted with meaningful acknowledgment of limitations. Visualizations effectively communicate findings. Writing is clear, professional, and well-organized. Code is clean, well-documented, and fully reproducible. Presentation is clear, engaging, and demonstrates command of the material. |
| A- | 45-46 | Strong understanding of the method with clear justification for choices. Analysis is sound with minor opportunities for improvement in depth, presentation, or documentation. Code is clean and reproducible with minor documentation gaps. Presentation is clear and professional. |
| B+ | 44 | Competent application of the method with reasonable justification. Some aspects of model building, interpretation, or presentation could be strengthened. Code is functional with some organization or documentation issues. Presentation communicates main points effectively. |
| B | 42-43 | Meets requirements but the data story or methodological justification is underdeveloped. Interpretation may miss some nuances. Code is functional but may have reproducibility issues. Presentation is adequate but could be clearer or more polished. |
| B- | 40-41 | Method applied but with notable weaknesses in justification, interpretation, or presentation. May contain errors that don't fundamentally undermine the analysis. Code has documentation or reproducibility issues. Presentation covers the basics but lacks depth or clarity. |
| C | 35 | Meets minimum expectations for graduate-level work but just barely. Significant issues with method application, interpretation, or analytical justification. Code may not be reproducible. Presentation is disorganized or unclear. |
| F | 25 | Submission demonstrates fundamental misunderstanding of the methods or fails to meet minimum expectations for graduate-level work; however, a good-faith attempt was made. |
| Zero | 0 | Not submitted or no meaningful attempt. |

---

## Course Schedule Overview

| Week | Dates | Topic | Assignments Due |
|------|-------|-------|-----------------|
| 1 | Aug 31 - Sep 4 | Introduction to Clustered Data and Varying Intercepts | |
| 2 | Sep 7 - Sep 11 | Varying Intercepts and Slopes | Article Review #1, Skills Check |
| 3 | Sep 14 - Sep 18 | Building and Evaluating Multilevel Models | |
| 4 | Sep 21 - Sep 25 | **Unit 1 Work Week** | Problem Set 1 |
| 5 | Sep 28 - Oct 2 | Multilevel Models for Binary Outcomes | Article Review #2 |
| 6 | Oct 5 - Oct 9 | Multilevel Models for Count Outcomes | |
| 7 | Oct 12 - Oct 16 | GLMM Applications and Best Practices | |
| 8 | Oct 19 - Oct 23 | **Unit 2 Work Week** | Problem Set 2 |
| 9 | Oct 26 - Oct 30 | Introduction to Longitudinal Data and Bayesian Primer | Project Idea Post |
| 10 | Nov 2 - Nov 6 | Growth Curve Models | Article Review #3 |
| 11 | Nov 9 - Nov 13 | Practical Issues in Longitudinal Analysis | Problem Set 3 |
| 12 | Nov 16 - Nov 20 | Foundations of Time-Series | Article Review #4, Project Status Update |
| 13 | Nov 30 - Dec 4 | Time-Series Regression and Looking Ahead | Problem Set 4 |
| 14 | Dec 7 - Dec 11 | **Final Project Work Week** | |
| 15 | Dec 14 - Dec 18 | Final Project Presentations | Final Project |

*Note: November 23-27 is Thanksgiving Break; no class activities.*

---

## Unit 1: Multilevel/Hierarchical Models (Weeks 1–4)

---

## Week 1: Introduction to Clustered Data and Varying Intercepts

**Dates:** August 31 - September 4

### Module Description

This module introduces the fundamental problem that motivates multilevel modeling: observations that are not independent due to clustering or nesting. We examine why standard regression fails with clustered data, producing incorrect standard errors and potentially misleading inferences. Students will learn three approaches to clustered data (complete pooling, no pooling, and partial pooling) and understand why partial pooling through multilevel models often represents the best solution.
The module introduces the varying intercept model, the simplest multilevel model, and key concepts including intraclass correlation and shrinkage. By the end of this module, students will recognize clustered data structures and implement basic varying intercept models in R.

### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 1.1 | Identify data structures that violate the independence assumption and explain why this matters for inference | CLO1 |
| 1.2 | Compare and contrast complete pooling, no pooling, and partial pooling approaches to clustered data | CLO1, CLO2 |
| 1.3 | Estimate varying intercept models using lme4 in R | CLO2, CLO6 |
| 1.4 | Calculate and interpret the intraclass correlation coefficient (ICC) | CLO2 |
| 1.5 | Explain the concept of shrinkage and why multilevel estimates differ from no-pooling estimates | CLO2 |

### Required Readings

- Gelman & Hill, *Data Analysis Using Regression and Multilevel/Hierarchical Models*
  - Chapter 11: Multilevel Structures
  - Chapter 12: Multilevel Linear Models: The Basics

### Recommended Readings

- Gelman, Hill, & Vehtari, *Regression and Other Stories*
  - Chapter 21: Additional Topics in Causal Inference (review partial pooling concepts)

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 3-4 hours |
| Coding practice (varying intercept models) | 3-4 hours |
| Begin Problem Set 1 | 1-2 hours |
| **Total** | **8-11 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Complete coding practice exercises on varying intercept models
- [ ] Begin Problem Set 1
- [ ] Begin exploring potential datasets for Final Project

---

## Week 2: Varying Intercepts and Slopes

**Dates:** September 7 - September 11

### Module Description

This module extends the basic multilevel model to include varying slopes, allowing the relationship between predictors and outcomes to differ across groups. We examine when varying slopes are necessary, how to interpret the covariance between intercepts and slopes, and the tradeoffs involved in adding model complexity. Students will learn to think carefully about which effects should vary and how to make principled decisions about model structure. The module also introduces the concept of centering predictors and its importance for interpretation in multilevel models.

### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 2.1 | Extend varying intercept models to include varying slopes | CLO2, CLO6 |
| 2.2 | Interpret the covariance between random intercepts and slopes | CLO2 |
| 2.3 | Make principled decisions about which effects should vary across groups | CLO1, CLO2 |
| 2.4 | Apply group-mean and grand-mean centering and explain the interpretive implications | CLO2 |
| 2.5 | Visualize varying intercepts and slopes to communicate model results | CLO7 |

### Required Readings

- Gelman & Hill, *Data Analysis Using Regression and Multilevel/Hierarchical Models*
  - Chapter 13: Varying Intercepts and Slopes

### Recommended Readings

- Enders, C.K. & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. *Psychological Methods*, 12(2), 121-138.

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (varying slopes, centering) | 3-4 hours |
| Continue Problem Set 1 | 1-2 hours |
| Article Review #1 | 1 hour |
| Skills Check | 1 hour |
| **Total** | **10-12 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Complete coding practice exercises on varying slopes models
- [ ] Continue working on Problem Set 1
- [ ] **Post Article Review #1 and reply to a classmate by September 11**
- [ ] **Complete Skills Check by September 11** (ungraded; answer key released after due date)

---

## Week 3: Building and Evaluating Multilevel Models

**Dates:** September 14 - September 18

### Module Description

This module focuses on the practical aspects of building, evaluating, and presenting multilevel models. Students will learn strategies for model building, including how to decide on model complexity and compare nested models. We cover diagnostic tools for checking model assumptions, including residual analysis and examination of random effects distributions. The module also addresses how to present multilevel model results clearly in tables and figures, and briefly introduces extensions such as cross-classified and three-level models. By the end of this module, students will have a complete workflow for multilevel analysis.

### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 3.1 | Develop a systematic approach to building multilevel models of increasing complexity | CLO2 |
| 3.2 | Compare nested multilevel models using likelihood ratio tests and information criteria | CLO2, CLO6 |
| 3.3 | Conduct residual diagnostics and assess random effects assumptions | CLO2 |
| 3.4 | Present multilevel model results in clear tables and visualizations | CLO7 |
| 3.5 | Describe when cross-classified or three-level models might be necessary | CLO1 |

### Required Readings

- Gelman & Hill, *Data Analysis Using Regression and Multilevel/Hierarchical Models*
  - Chapter 21: Understanding and Summarizing the Fitted Models
  - Chapter 24: Model Checking and Comparison (selections)

### Recommended Readings

- Gelman & Hill, *Data Analysis Using Regression and Multilevel/Hierarchical Models*
  - Chapter 25: Missing Data (skim for awareness)

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (model comparison, diagnostics) | 3-4 hours |
| Continue Problem Set 1 | 2-3 hours |
| **Total** | **9-11 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Complete coding practice exercises on model building and diagnostics
- [ ] Continue working on Problem Set 1

---

## Week 4: Unit 1 Work Week

**Dates:** September 21 - September 25

### Module Description

No new lecture material this week. Students use this week to complete Problem Set 1, solidify their understanding of multilevel models through additional practice, and explore datasets for their final project. Office hours are available for troubleshooting and feedback on Problem Set 1.
### Estimated Workload

| Activity | Time |
|----------|------|
| Complete Problem Set 1 | 5-6 hours |
| Review and practice multilevel concepts | 2-3 hours |
| Explore Final Project datasets | 1-2 hours |
| **Total** | **8-11 hours** |

### To Do This Week

- [ ] **Submit Problem Set 1 by September 25**
- [ ] Review any challenging concepts from Unit 1
- [ ] Continue exploring potential datasets for Final Project

---

## Unit 2: Generalized Linear Mixed Models (Weeks 5–8)

---

## Week 5: Multilevel Models for Binary Outcomes

**Dates:** September 28 - October 2

### Module Description

This module extends multilevel modeling to binary outcomes using multilevel logistic regression. We begin with a brief review of standard logistic regression before introducing the varying intercept logistic model. A key focus is on the interpretation challenges that arise when combining random effects with nonlinear link functions: random effects are on the log-odds scale, but we often want to communicate in terms of probabilities. Students will learn strategies for meaningful interpretation and visualization of multilevel logistic models.
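
The interpretation challenge described above can be seen with a few lines of arithmetic. The sketch below applies the same group-level shift on the log-odds scale at two different baselines and reports the resulting change in probability. The baseline values, the size of the shift, and the function names are hypothetical. The course itself works in R; plain Python is used here only to keep the example self-contained.

```python
import math


def inv_logit(eta: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def probability_shift(baseline_logodds: float, group_effect: float) -> float:
    """Change in probability when a group's intercept moves by group_effect."""
    return inv_logit(baseline_logodds + group_effect) - inv_logit(baseline_logodds)


# Hypothetical values: the same +1 log-odds group effect at two baselines.
for baseline in (-3.0, 0.0):
    shift = probability_shift(baseline, 1.0)
    print(f"baseline p = {inv_logit(baseline):.3f}, change in p = {shift:+.3f}")
```

A one-unit group effect moves the probability by roughly 0.07 when the baseline probability is near 0.05, but by roughly 0.23 when the baseline is 0.50. A single random-intercept standard deviation therefore has no single meaning on the probability scale, which is why predicted probabilities are usually reported at stated covariate values.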
### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 5.1 | Review and apply logistic regression for binary outcomes | CLO3 |
| 5.2 | Estimate multilevel logistic regression models with varying intercepts | CLO3, CLO6 |
| 5.3 | Interpret fixed effects and variance components in multilevel logistic models | CLO3 |
| 5.4 | Calculate and visualize predicted probabilities from multilevel logistic models | CLO3, CLO7 |
| 5.5 | Explain the challenges of interpreting random effects on the probability scale | CLO3 |

### Required Readings

- Gelman & Hill, *Data Analysis Using Regression and Multilevel/Hierarchical Models*
  - Chapter 5: Logistic Regression (review)
  - Chapter 6: Generalized Linear Models (sections 6.1-6.3)
  - Chapter 14: Multilevel Logistic Regression

### Recommended Readings

- Sommet, N. & Morselli, D. (2017). Keep calm and learn multilevel logistic modeling: A simplified three-step procedure using Stata, R, Mplus, and SPSS. *International Review of Social Psychology*, 30(1), 203-218.

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 3-4 hours |
| Coding practice (multilevel logistic models) | 3-4 hours |
| Begin Problem Set 2 | 1-2 hours |
| Article Review #2 | 1 hour |
| **Total** | **10-12 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Complete coding practice exercises on multilevel logistic models
- [ ] Begin Problem Set 2
- [ ] **Post Article Review #2 and reply to a classmate by October 2**

---

## Week 6: Multilevel Models for Count Outcomes

**Dates:** October 5 - October 9

### Module Description

This module covers multilevel models for count outcomes, extending Poisson and negative binomial regression to include random effects. We review standard count models and their assumptions before introducing varying intercept Poisson models.
A key topic is overdispersion (when the variance exceeds what the Poisson distribution predicts) and how negative binomial models address this issue. The module briefly introduces zero-inflated models for data with excess zeros.

### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 6.1 | Review and apply Poisson regression for count outcomes | CLO3 |
| 6.2 | Diagnose overdispersion and explain its consequences for inference | CLO3 |
| 6.3 | Estimate multilevel Poisson and negative binomial models | CLO3, CLO6 |
| 6.4 | Interpret rate ratios and expected counts from multilevel count models | CLO3, CLO7 |
| 6.5 | Describe when zero-inflated models might be appropriate | CLO1, CLO3 |

### Required Readings

- Gelman & Hill, *Data Analysis Using Regression and Multilevel/Hierarchical Models*
  - Chapter 6: Generalized Linear Models (sections 6.4-6.5)

### Recommended Readings

- Hilbe, J.M. (2014). *Modeling Count Data*. Cambridge University Press. (Chapters 1-4, available through library)

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (multilevel count models) | 3-4 hours |
| Continue Problem Set 2 | 2-3 hours |
| **Total** | **9-11 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Complete coding practice exercises on multilevel count models
- [ ] Continue working on Problem Set 2

---

## Week 7: GLMM Applications and Best Practices

**Dates:** October 12 - October 16

### Module Description

This module consolidates learning about generalized linear mixed models through a complete applied example. We work through the full analysis pipeline, from data exploration through model building, diagnostics, and presentation of results, for a real-world dataset with a non-continuous outcome.
The module also addresses practical issues including estimation challenges in GLMMs, convergence problems, and best practices for reporting results in academic and applied contexts.

### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 7.1 | Apply a complete GLMM analysis workflow to a real dataset | CLO3, CLO6 |
| 7.2 | Diagnose and address common estimation problems in GLMMs | CLO3 |
| 7.3 | Compare maximum likelihood and restricted maximum likelihood estimation approaches | CLO3 |
| 7.4 | Create publication-ready tables and figures for GLMM results | CLO7 |
| 7.5 | Evaluate the appropriateness of GLMM applications in published research | CLO8 |

### Required Readings

- Gelman & Hill, *Data Analysis Using Regression and Multilevel/Hierarchical Models*
  - Chapter 24: Model Checking and Comparison (revisit relevant sections)

### Recommended Readings

- Harrison, X.A. et al. (2018). A brief introduction to mixed effects modelling and multi-model inference in ecology. *PeerJ*, 6:e4794.

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 1-2 hours |
| Work through applied GLMM analysis example | 3-4 hours |
| Continue Problem Set 2 | 2-3 hours |
| **Total** | **8-10 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Work through applied GLMM analysis example
- [ ] Continue working on Problem Set 2

---

## Week 8: Unit 2 Work Week

**Dates:** October 19 - October 23

### Module Description

No new lecture material this week. Students use this week to complete Problem Set 2, practice GLMM applications, and continue developing their final project plans. Office hours are available for methodological questions and feedback on Problem Set 2.
### Estimated Workload

| Activity | Time |
|----------|------|
| Complete Problem Set 2 | 5-6 hours |
| Review and practice GLMM concepts | 2-3 hours |
| Develop Final Project plans | 1-2 hours |
| **Total** | **8-11 hours** |

### To Do This Week

- [ ] **Submit Problem Set 2 by October 23**
- [ ] Review any challenging concepts from Unit 2
- [ ] Finalize Final Project dataset and method selection

---

## Unit 3: Longitudinal Data Analysis (Weeks 9–11)

---

## Week 9: Introduction to Longitudinal Data and Bayesian Primer

**Dates:** October 26 - October 30

### Module Description

This module transitions from cross-sectional multilevel models to longitudinal data analysis. Students will learn to recognize longitudinal data as a special case of multilevel data (observations nested within individuals over time) while understanding what makes longitudinal analysis distinct. We introduce key concepts including time-varying vs. time-invariant predictors, the choice between wide and long data formats, and the critical importance of how time is measured and centered. This module also situates longitudinal analysis within the broader landscape of methods for non-independent data, distinguishing it from panel data approaches in econometrics and pure time-series analysis.

This module also includes a brief Bayesian primer. The Kurz supplementary text uses brms, a Bayesian modeling package. While this course does not teach Bayesian statistics, students need enough background to use brms intelligently: understanding priors, posteriors, credible intervals, and how to read brms output. This primer provides just enough context to follow along with the Kurz materials without requiring a full treatment of Bayesian inference.
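
The wide and long formats mentioned above differ only in layout, which a small restructuring sketch can make concrete. The data, column names, and the `wide_to_long` helper below are hypothetical. The course itself works in R; plain Python is used here only to keep the logic self-contained. Dropping missing measurements, as this sketch does, is one possible choice and produces unbalanced data.

```python
def wide_to_long(rows, wave_columns):
    """Turn one-row-per-person records into one-row-per-person-per-wave records.

    wave_columns maps each wide column name to the time value it represents.
    Measurements recorded as None are dropped.
    """
    long_rows = []
    for row in rows:
        # Columns that are not repeated measures are treated as time-invariant.
        invariant = {k: v for k, v in row.items() if k not in wave_columns}
        for column, time in wave_columns.items():
            if row.get(column) is None:
                continue
            long_rows.append({**invariant, "time": time, "score": row[column]})
    return long_rows


# Hypothetical data: scores at three waves, one row per student (wide format).
wide = [
    {"id": 1, "female": 1, "score_w1": 40, "score_w2": 45, "score_w3": 52},
    {"id": 2, "female": 0, "score_w1": 38, "score_w2": None, "score_w3": 47},
]
# Coding the first wave as time 0 makes the intercept refer to initial status.
waves = {"score_w1": 0, "score_w2": 1, "score_w3": 2}

long = wide_to_long(wide, waves)
for record in long:
    print(record)
```

In the long result, `female` repeats on every row for a student (a time-invariant predictor), while `time` and `score` change from row to row. Multilevel software generally expects this long layout, with the person identifier serving as the grouping variable.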
### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 9.1 | Recognize longitudinal data as a special case of multilevel structure | CLO1, CLO4 |
| 9.2 | Distinguish between time-varying and time-invariant predictors | CLO4 |
| 9.3 | Restructure data between wide and long formats | CLO4, CLO6 |
| 9.4 | Distinguish longitudinal, panel, and time-series data structures and their typical methods | CLO1 |
| 9.5 | Explain the multilevel vs. econometric traditions for panel data analysis | CLO1, CLO8 |
| 9.6 | Describe basic Bayesian concepts (prior, posterior, credible interval) sufficiently to interpret brms output | CLO6 |

### Required Readings

- Singer & Willett, *Applied Longitudinal Data Analysis*
  - Chapter 1: A Framework for Investigating Change Over Time
  - Chapter 2: Exploring Longitudinal Data on Change
- Kurz, *Applied Longitudinal Data Analysis in brms and the tidyverse*
  - Chapters 1-2 (parallel R code)
- Instructor-provided Bayesian Primer handout

### Recommended Readings

- Gelman & Hill, *Data Analysis Using Regression and Multilevel/Hierarchical Models*
  - Chapter 15: Multilevel Models and Longitudinal Data
- McElreath, R.
*Statistical Rethinking*
  - Chapter 1: The Golem of Prague (for accessible Bayesian motivation)

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures (including Bayesian primer) | 2 hours |
| Required readings | 3-4 hours |
| Data restructuring practice | 2-3 hours |
| Begin Problem Set 3 | 1-2 hours |
| Project Idea Post | 1-2 hours |
| **Total** | **9-13 hours** |

### To Do This Week

- [ ] Complete required readings (including Bayesian primer handout)
- [ ] Complete data restructuring exercises
- [ ] Begin Problem Set 3
- [ ] **Post Project Idea and reply to two classmates by October 30**

---

## Week 10: Growth Curve Models

**Dates:** November 2 - November 6

### Module Description

This module introduces growth curve models: multilevel models designed to describe and predict individual change over time. We begin with unconditional models that simply describe the average trajectory and individual variation around it, then add predictors to explain why some individuals change more (or differently) than others. Students will learn to model linear and nonlinear trajectories, interpret individual variation in change, and understand how centering time affects interpretation. The connection between growth curve models and the varying slopes models from Unit 1 is made explicit.
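
For orientation, an unconditional linear growth model of the kind described above can be written as a two-level model. The notation below follows the general style of Singer & Willett, with person $i$ measured at occasion $j$. The centering constant $c$ is introduced here for illustration and is an assumption of this sketch, not a quantity defined elsewhere in the syllabus.

```latex
\begin{aligned}
\text{Level 1 (within person):}\quad
  & Y_{ij} = \pi_{0i} + \pi_{1i}\,(\mathit{TIME}_{ij} - c) + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma_{\varepsilon}^{2}) \\
\text{Level 2 (between persons):}\quad
  & \pi_{0i} = \gamma_{00} + \zeta_{0i} \\
  & \pi_{1i} = \gamma_{10} + \zeta_{1i}
\end{aligned}
```

Here $\gamma_{00}$ and $\gamma_{10}$ describe the average trajectory, while $\zeta_{0i}$ and $\zeta_{1i}$ (typically assumed bivariate normal with mean zero) capture individual departures from it. This is the varying intercepts and slopes model from Unit 1 with time as the level-1 predictor. Because $\pi_{0i}$ is person $i$'s expected outcome when $\mathit{TIME}_{ij} = c$, changing $c$ changes what the intercept and its variance mean, but not the slope.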
### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 10.1 | Estimate unconditional growth models to describe average and individual change | CLO4, CLO6 |
| 10.2 | Add predictors to explain between-person differences in change trajectories | CLO4 |
| 10.3 | Model nonlinear change using polynomial and other functional forms | CLO4 |
| 10.4 | Center time appropriately and explain the interpretive consequences | CLO4 |
| 10.5 | Visualize individual growth trajectories and model-implied predictions | CLO4, CLO7 |

### Required Readings

- Singer & Willett, *Applied Longitudinal Data Analysis*
  - Chapter 3: Introducing the Multilevel Model for Change
  - Chapter 4: Doing Data Analysis with the Multilevel Model for Change
  - Chapter 5: Treating Time More Flexibly
- Kurz, *Applied Longitudinal Data Analysis in brms and the tidyverse*
  - Chapters 3-5 (parallel R code)

### Recommended Readings

- Singer & Willett, *Applied Longitudinal Data Analysis*
  - Chapter 6: Modeling Discontinuous and Nonlinear Change

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 3-4 hours |
| Coding practice (growth curve models) | 3-4 hours |
| Continue Problem Set 3 | 1-2 hours |
| Article Review #3 | 1 hour |
| **Total** | **10-12 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Complete coding practice exercises on growth curve models
- [ ] Continue working on Problem Set 3
- [ ] **Post Article Review #3 and reply to a classmate by November 6**

---

## Week 11: Practical Issues in Longitudinal Analysis

**Dates:** November 9 - November 13

### Module Description

This module addresses the practical challenges that arise in longitudinal data analysis. A major focus is missing data, a nearly universal feature of longitudinal studies.
Students will learn to distinguish between missing data mechanisms (MCAR, MAR, MNAR) and understand how multilevel models handle missingness under MAR assumptions. We also cover unbalanced data (individuals with different numbers of observations or irregular time points) and strategies for model selection in longitudinal contexts. By the end of this module, students will be prepared to handle the messiness of real longitudinal data.

### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 11.1 | Distinguish between MCAR, MAR, and MNAR missing data mechanisms | CLO4 |
| 11.2 | Explain how and why multilevel models handle MAR missingness | CLO4 |
| 11.3 | Analyze unbalanced longitudinal data with varying numbers of observations | CLO4, CLO6 |
| 11.4 | Apply model selection strategies in longitudinal contexts | CLO4 |
| 11.5 | Diagnose potential violations of assumptions in longitudinal models | CLO4, CLO8 |

### Required Readings

- Singer & Willett, *Applied Longitudinal Data Analysis*
  - Chapter 7: Examining the Multilevel Model's Error Covariance Structure
- Kurz, *Applied Longitudinal Data Analysis in brms and the tidyverse*
  - Chapter 7 (parallel R code)

### Recommended Readings

- Enders, C.K. (2010). *Applied Missing Data Analysis*. Guilford Press.
(Chapters 1-3, available through library)

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (missing data, unbalanced designs) | 2-3 hours |
| Complete Problem Set 3 | 3-4 hours |
| **Total** | **9-11 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Complete coding practice exercises on practical longitudinal issues
- [ ] **Submit Problem Set 3 by November 13**

---

## Unit 4: Introduction to Time-Series (Weeks 12–13)

**Note:** This unit provides foundational exposure to time-series methods. Students will learn core concepts and vocabulary, but two weeks is insufficient for deep application. Time-series methods are not eligible for final projects. This unit prepares students for future coursework or self-study in time-series analysis.

---

## Week 12: Foundations of Time-Series

**Dates:** November 16 - November 20

### Module Description

This module introduces time-series analysis, completing our survey of methods for non-independent data. While longitudinal data typically involves many individuals measured at few time points, time-series data typically involves one (or few) units measured at many time points (think economic indicators, stock prices, or climate measurements). The key statistical concern shifts from clustering to autocorrelation: today's value depends on yesterday's. Students will learn to identify time-series data structures, diagnose autocorrelation, assess stationarity, and understand the basic building blocks of time-series models (AR and MA processes).
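
The idea that today's value depends on yesterday's can be made concrete with a simulated first-order autoregressive process. The sketch below simulates an AR(1) series and compares its sample autocorrelations with the theoretical values, which for a stationary AR(1) equal the coefficient raised to the lag. The coefficient, series length, seed, and function names are hypothetical choices. The course itself works in R; plain Python is used here only to keep the example self-contained.

```python
import random


def simulate_ar1(phi, n, seed=0, burn_in=200):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    y, series = 0.0, []
    for t in range(n + burn_in):
        y = phi * y + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # discard early values that depend on the start at 0
            series.append(y)
    return series


def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    c0 = sum(v * v for v in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


phi = 0.7  # hypothetical AR coefficient; |phi| < 1 keeps the process stationary
acf = sample_acf(simulate_ar1(phi, n=5000), max_lag=5)
for lag, r in enumerate(acf):
    print(f"lag {lag}: sample {r:.3f}, theoretical {phi ** lag:.3f}")
```

The gradual geometric decay in the autocorrelations is the ACF signature of an AR process. An MA(1) process would instead show an autocorrelation that drops to roughly zero after lag 1, which is the contrast students learn to read from ACF and PACF plots.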
### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 12.1 | Distinguish time-series data structures from longitudinal and panel data | CLO1, CLO5 |
| 12.2 | Calculate and interpret autocorrelation and partial autocorrelation functions | CLO5 |
| 12.3 | Assess and transform for stationarity | CLO5 |
| 12.4 | Explain autoregressive (AR) and moving average (MA) processes | CLO5 |
| 12.5 | Identify AR and MA signatures in ACF and PACF plots | CLO5 |

### Required Readings

- Hyndman & Athanasopoulos, *Forecasting: Principles and Practice (3e)*
  - Chapter 1: Getting Started
  - Chapter 2: Time Series Graphics
  - Chapter 9: ARIMA Models (sections 9.1-9.5)

### Recommended Readings

- Hyndman & Athanasopoulos, *Forecasting: Principles and Practice (3e)*
  - Chapter 3: Time Series Decomposition

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 3-4 hours |
| Coding practice (ACF, PACF, stationarity) | 2-3 hours |
| Begin Problem Set 4 | 1-2 hours |
| Article Review #4 | 1 hour |
| Project Status Update | 1-2 hours |
| **Total** | **10-13 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Complete coding practice exercises on time-series foundations
- [ ] Begin Problem Set 4
- [ ] **Post Article Review #4 and reply to a classmate by November 20**
- [ ] **Post Project Status Update and reply to two classmates by November 20**

---

*November 23 - November 27: Thanksgiving Break (No Class Activities)*

---

## Week 13: Time-Series Regression and Looking Ahead

**Dates:** November 30 - December 4

### Module Description

This module covers regression with time-series data and provides a bridge to further study.
We examine how to incorporate autocorrelation into regression models, introduce ARIMA as a general framework, and briefly discuss how time-series concepts apply to panel data (when you have multiple units over many time points). The module concludes by surveying advanced topics and resources for continued learning, including forecasting, state-space models, and Bayesian approaches to time-series.

### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 13.1 | Estimate regression models with autocorrelated errors | CLO5, CLO6 |
| 13.2 | Fit and interpret basic ARIMA models | CLO5, CLO6 |
| 13.3 | Diagnose model fit using residual diagnostics for time-series | CLO5 |
| 13.4 | Describe how time-series methods extend to panel data contexts | CLO1, CLO5 |
| 13.5 | Identify resources for continued learning in time-series analysis | CLO5, CLO8 |

### Required Readings

- Hyndman & Athanasopoulos, *Forecasting: Principles and Practice (3e)*
  - Chapter 7: Time Series Regression Models (sections 7.1-7.4)
  - Chapter 9: ARIMA Models (sections 9.6-9.9)

### Recommended Readings

- Hyndman & Athanasopoulos, *Forecasting: Principles and Practice (3e)*
  - Chapter 10: Dynamic Regression Models (sections 10.1-10.2)

### Estimated Workload

| Activity | Time |
|----------|------|
| Video lectures | 1.5 hours |
| Required readings | 2-3 hours |
| Coding practice (ARIMA, dynamic regression) | 2-3 hours |
| Complete Problem Set 4 | 3-4 hours |
| **Total** | **9-11 hours** |

### To Do This Week

- [ ] Complete required readings
- [ ] Complete coding practice exercises on time-series regression
- [ ] **Submit Problem Set 4 by December 4**
- [ ] Begin finalizing Final Project analysis

---

## Week 14: Final Project Work Week

**Dates:** December 7 - December 11

### Module Description

No new lecture material this week.
Students use this week to finalize their analysis, complete their written report, prepare their GitHub repository, and rehearse their presentation. Office hours are available for troubleshooting and feedback.

### Estimated Workload

| Activity | Time |
|----------|------|
| Finalize Final Project analysis | 4-5 hours |
| Write Final Project report | 3-4 hours |
| Prepare GitHub repository | 1-2 hours |
| **Total** | **8-11 hours** |

### To Do This Week

- [ ] Finalize Final Project analysis
- [ ] Draft Final Project written report
- [ ] Prepare GitHub repository with reproducible code
- [ ] Begin preparing presentation

---

## Week 15: Final Project Presentations

**Dates:** December 14 - December 18

### Module Description

This final module brings together everything students have learned throughout the course. Students will present their final projects to the class, explaining their data structure, methodological choices, findings, and limitations. Presentations provide an opportunity to practice communicating complex statistical analyses to an audience and to receive feedback from peers and the instructor. We will also reflect on the course's central theme (appropriately modeling observations that are not independent) and discuss how these methods connect to further study in causal inference, Bayesian analysis, and machine learning.
### Module Learning Objectives

By the end of this module, students will be able to:

| # | Module Learning Objective | Maps to CLO |
|---|---------------------------|-------------|
| 15.1 | Present a complete statistical analysis to a live audience, clearly explaining methodological choices | CLO7 |
| 15.2 | Provide constructive peer feedback on analytical approaches and presentation effectiveness | CLO8 |
| 15.3 | Synthesize course themes around modeling non-independent observations | CLO1-CLO5 |
| 15.4 | Identify connections between course methods and advanced topics | CLO8 |
| 15.5 | Describe strategies for continued learning in advanced quantitative methods | CLO8 |

### Required Readings

- No new readings; focus on final project completion

### Estimated Workload

| Activity | Time |
|----------|------|
| Finalize and polish Final Project report | 3-4 hours |
| Prepare and rehearse presentation | 2-3 hours |
| Attend class presentations and provide peer feedback | 2-3 hours |
| **Total** | **7-10 hours** |

### To Do This Week

- [ ] Finalize and polish Final Project report
- [ ] Prepare and rehearse 10-minute presentation
- [ ] **Submit Final Project by December 18** (include GitHub repository link and written report HTML)
- [ ] Attend class presentations and provide peer feedback

---

## Conversion Plan: 15-Week to 12-Week Semester

This syllabus is designed for a 15-week fall or spring semester. When teaching the course in a compressed 12-week summer semester, the three work weeks must be eliminated and content consolidated.
### Recommended 12-Week Structure

Remove the three work weeks and adjust pacing:

| 12-Week | 15-Week Equivalent | Topic |
|---------|-------------------|-------|
| Week 1 | Week 1 | Introduction to Clustered Data and Varying Intercepts |
| Week 2 | Week 2 | Varying Intercepts and Slopes |
| Week 3 | Week 3 | Building and Evaluating Multilevel Models |
| Week 4 | Week 5 | Multilevel Models for Binary Outcomes |
| Week 5 | Week 6 | Multilevel Models for Count Outcomes |
| Week 6 | Week 7 | GLMM Applications and Best Practices |
| Week 7 | Week 9 | Introduction to Longitudinal Data and Bayesian Primer |
| Week 8 | Week 10 | Growth Curve Models |
| Week 9 | Week 11 | Practical Issues in Longitudinal Analysis |
| Week 10 | Week 12 | Foundations of Time-Series |
| Week 11 | Week 13 | Time-Series Regression and Looking Ahead |
| Week 12 | Week 15 | Final Project Presentations |

### Key Adjustments for 12-Week Format

**Problem Sets:** Without dedicated work weeks, problem sets will be due at the end of the final content week for each unit rather than during a work week:

| Assignment | 15-Week Due Date | 12-Week Due Date |
|------------|------------------|------------------|
| Skills Check | Week 2 | Week 2 |
| Problem Set 1 | Week 4 | Week 3 |
| Problem Set 2 | Week 8 | Week 6 |
| Problem Set 3 | Week 11 | Week 9 |
| Problem Set 4 | Week 13 | Week 11 |
| Final Project | Week 15 | Week 12 |

**Discussion Activities:** Consolidate to reduce load:

| Activity | 15-Week Due Date | 12-Week Due Date |
|----------|------------------|------------------|
| Article Review #1 | Week 2 | Week 2 |
| Article Review #2 | Week 5 | Week 4 |
| Project Idea Post | Week 9 | Week 7 |
| Article Review #3 | Week 10 | Week 8 |
| Article Review #4 | Week 12 | Week 10 |
| Project Status Update | Week 12 | Week 10 |

**Workload Considerations:** Without work weeks, students will need to manage problem sets alongside new content.
Consider:

- Reducing coding practice exercises slightly
- Providing more starter code for problem sets
- Being flexible with office hours during problem set weeks
- Acknowledging the increased intensity in course communications

**Reading Adjustments:** Consider lightening the reading load. The following readings are already listed as recommended rather than required and can be treated as fully optional:

- Gelman & Hill Chapter 25 (Missing Data)
- Singer & Willett Chapter 6 (Discontinuous and Nonlinear Change)
- Hyndman & Athanasopoulos Chapter 3 (Time Series Decomposition)