About us: Jens Roeser

  • associate professor in psycholinguistics @ psychology department (Nottingham Trent University)
  • module leader for PSYC30815
  • research
    • Language processing (production, acquisition, comprehension) often with focus on writing (Roeser, Torrance, and Baguley 2019) and understudied languages (Garcia, Roeser, and Kidd 2023).
    • Bayesian modelling (Roeser et al. 2024, 2025); keystroke logging; eye tracking
  • teaching:
    • \(>\) 10 years of experience teaching data science and statistics to UG, PG students, academics and professionals (psyntur, Andrews and Roeser 2021)
    • cognitive psychology and language acquisition (Roeser and Wood 2019)

About us: Thom Baguley

  • Professor of Experimental Psychology
  • Lectures on modeling discrete outcomes (e.g., binary and count data)
  • Modeling data from a wide range of areas including cognitive, forensic and occupational data
  • Serious Stats: A guide to advanced statistics for the behavioral sciences (Baguley 2012)

About us: Nikolas Pautz

  • Lecturer, Cognitive Psychology
  • Teaches on UG module Statistics II, ML for PG module Advanced Statistics 2
  • ACL MRes/MSc Psychological Research Methods
  • Uses experimental designs in an applied forensic & person perception context.
  • Focuses mainly on mixed effects and signal detection theory models (Pautz et al. 2023, 2024).

Who are you?

Say hi to everyone on your table and brainstorm …

  • Which statistical tests do you remember and when do you use them?

Module Aims

Wouldn’t it be nice to have one approach that can capture most (all?) of our data problems?

  • Provide a unified set of advanced statistical techniques that covers a wider range of data-analysis problems: multilevel generalised linear models
  • General framework encapsulating a range of widely used techniques including linear regression, ANOVA, ANCOVA, MANCOVA, logistic regression, Poisson regression, log-linear, random effects / mixed-effects models etc.
  • These models arise as special cases of a single underlying general principle.
  • Apply this general principle to a wider range of data analysis problems.

Module Aims

  • Multilevel generalised linear models will be introduced as a sequence of progressively more general models, all of which build on linear regression.
  • Linear regression: review of multiple linear regression and factorial ANOVA.
  • General linear models: certain problems, which are poorly handled by ANOVA, can be easily accommodated (e.g. varying slope models).
  • Generalised linear models: transformations applied to linear models allow modelling of data types including categorical, ordinal, and discrete frequency data.
  • Multi-level models: data with inherently hierarchical structure, which have often been inadequately dealt with.
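
A rough sketch of how this sequence of models might look in R. The data sets (the built-in mtcars, and sleepstudy from the lme4 package) and the use of lme4 for the multilevel step are assumptions made purely for illustration; they are not taken from the module materials.

```r
# Illustrative sketch only: data sets and the lme4 package are assumptions.

# Linear regression: continuous outcome, normally distributed errors.
m_lm <- lm(mpg ~ wt, data = mtcars)

# General linear model: the slope of wt may differ by transmission type
# (a varying slope model, expressed as an interaction).
m_vs <- lm(mpg ~ wt * factor(am), data = mtcars)

# Generalised linear model: binary outcome modelled through a logit link.
m_glm <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)

# Multilevel model: repeated measures nested within subjects.
# This step only runs if lme4 happens to be installed.
if (requireNamespace("lme4", quietly = TRUE)) {
  m_ml <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = lme4::sleepstudy)
  print(lme4::fixef(m_ml))
}

print(coef(m_lm))
print(coef(m_glm))
```

Each step keeps the same formula interface and changes only the assumed distribution of the outcome or the grouping structure of the data.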

Learning outcomes

By the end of this module, you should be able to:

  1. Apply advanced statistics to real-world data-analysis problems.
  2. Use rational and principled arguments to justify the choice of statistical methods.
  3. Engage in peer-to-peer debate about methods and practices of data-analysis.
  4. Demonstrate a range of transferable skills, including data-analysis, data-preparation, computer programming, graphical presentation of quantitative evidence, reporting of data-analysis results.

Why was it a good idea to choose this module?

  • Why did you choose Statistics III?
  • What do you expect to get out of this module?
  • How confident do you feel about RStudio; how about stats?
  • Is there anything you’re worried about with this module?

Why was it a good idea to choose this module?

  • Career opportunities! Stats and statistical software skills transfer to many different areas in academia and industry.
  • You will be able to show off your data analysis skills in your final year project, unless you are using qualitative methods.
  • Solid understanding of statistical methods is relevant for planning data collections and making sense of quantitative data.
  • Data analysis can be efficient, flexible, reliable, and fast once you have had enough R practice.
  • You have the opportunity to improve your data analysis and R skills with three experts.

Outline

Workshops run every week of Term 1 except PACE week, with two 2-hour workshops each week.
| Dates | Topic | Lecturer |
| --- | --- | --- |
| w/c 22/09/2025 | Introduction to Statistics III; Introduction to RMarkdown | Jens Roeser |
| w/c 29/09/2025 | Normal linear models | Jens Roeser |
| w/c 06/10/2025 | Model comparison | Jens Roeser |
| w/c 13/10/2025 | Logistic regression | Thom Baguley |
| w/c 20/10/2025 | Poisson regression | Thom Baguley |
| w/c 27/10/2025 | PACE week | |
| w/c 03/11/2025 | Negative binomial and zero-inflated count models | Thom Baguley |
| w/c 10/11/2025 | Multilevel linear models (I) | Nikolas Pautz |
| w/c 17/11/2025 | Multilevel linear models (II) | Nikolas Pautz |
| w/c 24/11/2025 | Revision (and loose ends) | Nikolas Pautz |

Engaging with this module

  • 2 \(\times\) 2-hour workshops each week.
  • Mixture of theory and practical exercises.
  • Together, we will work through statistics problems using R.
  • For each topic, there will be detailed accompanying lecture notes.
  • To engage successfully with this module, attend all workshops, and read the lecture notes.
  • For excellent results, you will need to demonstrate independent learning that goes beyond what we cover in class.

Assessment

Using data sets of your own choice, perform and report analyses using the tools in 1 – 4 to address research questions of relevance to psychology.

  1. Logistic regression for binomial data
  2. Regression models for count data
  3. Multilevel regression models
  4. A new method that was not explicitly covered in class

Assessment

Psychological relevance: address some non-trivial question of some psychological relevance using your chosen data.

  • Very wide range of topics and sub-fields, many of which overlap with many other disciplines.
  • Consider the data set published in:

Bertrand, M. and Mullainathan, S. (2004). Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review, 94, 991–1013.

  • Published in an economics journal but it addresses implicit racial bias, and is obviously related to psychology.

Assessment

  • Choose a different data set for each question.
  • 4 separate data analyses reported on no more than 4 pages each.
  • For all questions, provide full technical details, and graphical analysis.
  • Theoretical model (assumed data-generative process) needs to be motivated.

Assessment

  • Deadline for submission via Dropbox: Jan 12th 2026
  • Individual written feedback (by Feb 2nd 2026) to help you learn how best to carry out and report statistical analyses.
  • We are looking for evidence of
    • understanding and knowledge of the theoretical basis.
    • how and why to apply certain statistical models.
    • practical skills in data manipulation and analysis.
  • For high or exceptional grades, display learning that goes beyond the content explicitly taught in class.
  • See Grading Matrix on NOW for details.

Assessment

  • Work must be submitted as a single zip archive containing
    • One RMarkdown file
    • PDF document generated from RMarkdown
    • RStudio project file
    • Data and supplementary files (e.g. bibliography, custom R function)
    • For an example see demo on NOW.
  • Work must be reproducible using the RMarkdown file.
  • Your analyses, plots, and descriptions for each question will be contained in the RMarkdown file, which will reference all relevant data files.

Assessment

For all tasks, provide …

  • descriptive statistics and graphical visualizations of the data.
  • full details of the statistical model that you are using (multiple predictors incl. categorical predictors, interaction terms).
  • full statistical details of model fit and comparisons with alternative models.
  • full statistical details of the inferred model parameter value estimates.
  • analysis, including graphical analysis, of the predictions of the model (model checks).
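
As a minimal sketch of what these components could look like in R, the example below walks through them for a simple linear model. It uses the built-in mtcars data and base R functions only; the data set, variable choices, and object names are assumptions for illustration, not a template for the assessment.

```r
# Illustrative only: mtcars is a built-in R data set, not module data.
d <- mtcars
d$am <- factor(d$am, labels = c("automatic", "manual"))

# 1. Descriptive statistics: mean and SD of the outcome per group.
desc_mean <- aggregate(mpg ~ am, data = d, FUN = mean)
desc_sd   <- aggregate(mpg ~ am, data = d, FUN = sd)
print(desc_mean)
print(desc_sd)

# 2. Statistical models: a simpler model and one adding a categorical
#    predictor and its interaction with weight.
m0 <- lm(mpg ~ wt, data = d)
m1 <- lm(mpg ~ wt * am, data = d)

# 3. Model fit and comparison with the alternative model.
cmp <- anova(m0, m1)  # F-test for nested models
aic <- AIC(m0, m1)    # information criterion for both models
print(cmp)
print(aic)

# 4. Parameter estimates with 95% confidence intervals.
est <- cbind(estimate = coef(m1), confint(m1))
print(est)

# 5. Model check: residuals against fitted values.
plot(fitted(m1), resid(m1), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

The same five steps apply to the other model types; only the fitting function and the appropriate comparison and checking tools change.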

Question 4: A new topic

  • Y3 modules require you to go beyond the class contents.
  • Look for textbooks and online tutorials, e.g. my tutorial for mixture models.
  • Examples: time series, structural equation modelling (SEM), auto-regression models, non-linear regression, signal detection theory, mixture models, categorical models, multinomial models, multivariate models, diffusion models, Bayesian models …
  • For example you could apply a Bayesian ordinal regression as described in Bürkner and Vuorre (2019).
  • Tip: keep the analysis simple and focus on demonstrating your understanding of the method used.
  • Talk to your supervisor / us re advanced analyses relevant for your project.

Usage of GenAI

Generative Artificial Intelligence (GenAI) should be used to support and enhance your learning, not to replace independent thinking, critical analysis, or academic integrity.

GenAI use in this assessment is rated as AMBER, which means:

  • You are allowed to ask an AI tool to support you to deepen your understanding of a topic, to check that you have understood a concept correctly, or to check that your own writing makes sense.
  • You must not use an AI tool to create any of the text that you actually submit for the assignment; the work you submit must be clearly your own. You also must not use an AI tool as a reference.

Please submit the Generative Artificial Intelligence (GenAI) Usage Declaration Form along with your coursework. Please explain how you used GenAI to help prepare your work in the “Explanation of AI use” box.

Formative assessment

  • Practise writing the kind of report we expect from you.
  • Data-analysis report using linear regression to address a psychologically relevant question.
  • Support in workshops.
  • Just one data analysis, hence 4 pages max.
  • Submit a single zip archive via Dropbox by 24th October 2025, 2pm.
  • Feedback on what you did well and what you can improve.
  • It does not (directly) affect your grade.

Support

  • MS Teams channel for R, stats, assessment-related questions, and useful resources.
  • Answers posted there are of interest to everyone, and the channel makes code sharing and tagging easy.
  • You will also receive an answer faster.
  • We might refer emails to the forum so everyone can benefit from our answer.
  • Help each other!
  • Online support will stop on Dec 20 2025.
  • For confidential questions contact ML Jens Roeser: jens.roeser@ntu.ac.uk

Directed learning

Required:

  • Weekly lecture notes on NOW.
  • Andrews (2021) Doing Data Science in R: An Introduction for Social Scientists

You choose:

  • Dienes (2008) Understanding Psychology as a Science: An Introduction to Scientific and Statistical Inference
  • Wickham and Grolemund (2016) R for Data Science
  • Faraway (2015) Linear Models with R
  • Gelman, Hill, and Vehtari (2020) Regression and Other Stories

What questions have you got?

  • How will this module be taught?
  • What’s the assessment like?
  • How will you be supported?
  • What’s expected of you?

For the rest of today

Tech requirements

  • You need R and RStudio on your own machine: instructions on NOW
  • Don’t rely on RStudio Cloud or university machines.
  • Check if tidyverse is installed. If not run
install.packages("tidyverse")
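
One possible way to do the check from the R console is sketched below. It only reports whether the package is available and does not install anything itself; the object names are illustrative.

```r
# Check whether tidyverse can be found, without attaching it.
pkg <- "tidyverse"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  message(pkg, " is installed.")
} else {
  message(pkg, " is not installed. Run install.packages(\"", pkg, "\").")
}
```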




Explore available datasets

  • Explore available data sets.
  • Finding the right data sets is difficult!
  • Data need wrangling to fit your purpose.
  • Use the repositories mentioned in the slides and ChatGPT / Copilot.
  • Ideally, look for data sets in areas of psychology you’re interested in (maybe related to your third year project).
  • Share on Teams any resources you found helpful.

References

Andrews, Mark. 2021. Doing Data Science in R: An Introduction for Social Scientists. SAGE Publications Ltd.

Andrews, Mark, and Jens Roeser. 2021. psyntur: Helper Tools for Teaching Statistical Data Analysis. https://CRAN.R-project.org/package=psyntur.

Baguley, Thom. 2012. Serious Stats: A Guide to Advanced Statistics for the Behavioral Sciences. Macmillan International Higher Education.

Bürkner, Paul-Christian, and Matti Vuorre. 2019. “Ordinal Regression Models in Psychology: A Tutorial.” Advances in Methods and Practices in Psychological Science 2 (1): 77–101.

Dienes, Zoltan. 2008. Understanding Psychology as a Science: An Introduction to Scientific and Statistical Inference. Basingstoke, UK: Palgrave Macmillan.

Faraway, Julian J. 2015. Linear Models with R. 2nd ed. CRC Press.

Garcia, Rowena, Jens Roeser, and Evan Kidd. 2023. “Finding Your Voice: Voice-Specific Effects in Tagalog Reveal the Limits of Word Order Priming.” Cognition 236: 105424.

Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2020. Regression and Other Stories. Cambridge University Press.

Pautz, N., K. McDougall, K. Mueller-Johnson, F. Nolan, A. Paver, and H. M. Smith. 2023. “Identifying Unfamiliar Voices: Examining the System Variables of Sample Duration and Parade Size.” Quarterly Journal of Experimental Psychology 76 (12): 2804–22. https://doi.org/10.1177/17470218231204792.

———. 2024. “Time to Reflect on Voice Parades: The Influence of Reflection and Retention Interval Duration on Earwitness Performance.” Applied Cognitive Psychology 38 (1): e4162. https://doi.org/10.1002/acp.4162.

Roeser, Jens, Rianne Conijn, E. Chukharev, G. H. Ofstad, and Mark Torrance. 2025. “Typing in Tandem: Language Planning in Multisentence Text Production Is Fundamentally Parallel.” Journal of Experimental Psychology: General 154 (7): 1824–54. https://doi.org/10.1037/xge0001759.

Roeser, Jens, Sven De Maeyer, Mariëlle Leijten, and Luuk Van Waes. 2024. “Modelling Typing Disfluencies as Finite Mixture Process.” Reading and Writing 37 (2): 359–84. https://doi.org/10.1007/s11145-023-10489-4.

Roeser, Jens, Mark Torrance, and Thom Baguley. 2019. “Advance Planning in Written and Spoken Sentence Production.” Journal of Experimental Psychology: Learning, Memory, and Cognition 45 (11): 1983–2009.

Roeser, Jens, and Clare Wood. 2019. “Language and Literacy.” In Essential Psychology, edited by P. Banyard, C. Norman, G. Dillon, and B. Winder, 3:197–226. London: Sage.

Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.