class: center, middle, inverse, title-slide

.title[
# Mixed Models With Julia
]

.subtitle[
## Julia Workshop
]

---

<style type="text/css">
body{
  font-size: 20pt;
}
</style>

```julia
# Tutorial: MixedModels.jl - Linear Mixed-Effects Models in Julia

# MixedModels.jl is a Julia package for fitting linear mixed-effects models (LMMs).
# LMMs are statistical models that include both fixed effects (as in ordinary linear regression)
# and random effects, which account for grouping or hierarchical structure in the data.

# This tutorial covers basic usage, model specification, fitting, and interpretation.
```

---

```julia
# 1. Installation:
# First, add the package from the Julia package manager (press `]` at the REPL):
#   ] add MixedModels
# or, from code:
#   using Pkg; Pkg.add("MixedModels")
```

---

```julia
# 2. Loading Packages and Data:
using MixedModels
using DataFrames  # For working with data frames
using CSV         # If your data is in a CSV file

# Load a sample dataset (e.g., the `sleepstudy` dataset, often used in LMM tutorials).
# MixedModels.jl bundles a copy, but its columns are named `reaction`, `days`, and `subj`:
# sleepstudy = DataFrame(MixedModels.dataset(:sleepstudy))
# OR read your own data from a CSV (this tutorial assumes columns RT, Days, and Subject):
sleepstudy = CSV.read("sleepstudy.csv", DataFrame)  # Replace with your file path

# Display the first few rows:
println(first(sleepstudy, 5))
```

---

```julia
# 3. Model Specification:
# The core of MixedModels.jl is the `@formula` macro, used to define the model structure.

# Basic LMM: Reaction time (RT) is predicted by Days of sleep deprivation,
# with a random intercept for each Subject.
fm1 = @formula(RT ~ Days + (1 | Subject))

# Fit the model:
m1 = fit(MixedModel, fm1, sleepstudy)

# Print the model summary:
println(m1)
```

---

```julia
# Explanation of the formula:
# - `RT ~ Days`: RT is the dependent variable and Days is a fixed-effect predictor.
# - `(1 | Subject)`: a random intercept for the Subject grouping variable, so each subject has their own baseline RT.
# More complex models:
# Include a random slope for Days:
fm2 = @formula(RT ~ Days + (1 + Days | Subject))
m2 = fit(MixedModel, fm2, sleepstudy)
println(m2)
```

---

```julia
# Interaction terms:
# `a * b` expands to `a + b + a & b`. It applies to fixed-effect predictors, so it needs a
# second predictor, which sleepstudy does not have (placeholder column names below):
# fm3 = @formula(outcome ~ predictor1 * predictor2 + (1 | group))
# m3 = fit(MixedModel, fm3, your_data)
# println(m3)

# Nested random effects (e.g., students within classrooms within schools):
# fm4 = @formula(outcome ~ predictor + (1 | School / Class))  # Note the / for nesting
# m4 = fit(MixedModel, fm4, your_data)
# println(m4)
```

---

```julia
# 4. Model Fitting and Evaluation:
# The `fit()` function fits the model using maximum likelihood estimation (MLE) or
# restricted maximum likelihood (REML). REML is often preferred for estimating variance components.
# By default, MixedModels.jl uses maximum likelihood. You can request REML:
# m1_reml = fit(MixedModel, fm1, sleepstudy; REML=true)
```

---

```julia
# Model summary provides important information:
# - Coefficients for fixed effects (estimates, standard errors, p-values)
# - Variance components for random effects (variances and standard deviations)
# - Model fit statistics (AIC, BIC, log-likelihood)

# 5. Accessing Results:
# Access fixed effects coefficients:
println(coef(m1))

# Access variance components:
println(VarCorr(m1))

# Access the full model matrix:
println(modelmatrix(m1))

# Access the response vector:
println(response(m1))
```

---

```julia
# 6. Model Comparison:
# You can compare models using likelihood ratio tests (LRTs) or information criteria (AIC, BIC).
# Example LRT (comparing m1 and m2):
# MixedModels.likelihoodratiotest(m1, m2)
# Example AIC comparison (`aic` takes one model at a time; lower is better):
# aic(m1), aic(m2)
```

---

```julia
# 7. Predictions:
# Make predictions using the `predict()` function:
new_data = DataFrame(
    Days = [0, 5, 10],
    Subject = first(unique(sleepstudy.Subject)),  # an existing subject, repeated for each row
    RT = zeros(3),  # placeholder response (assumption: some MixedModels.jl versions expect this column in new data)
)
predictions = predict(m1, new_data)
println(predictions)
```

---

```julia
# 8. Diagnostics:
# While not directly part of MixedModels.jl, it's crucial to assess model assumptions.
# Use other Julia packages such as `Plots.jl` to create diagnostic plots. Look at residuals, normality assumptions, etc.
# Example (using Plots.jl - you'd need to add Plots.jl):
# using Plots
# resid = residuals(m1)  # a new name, so the `residuals` function is not shadowed
# scatter(sleepstudy.Days, resid, xlabel="Days", ylabel="Residuals")
# title!("Residual Plot")
```

---

```julia
# This tutorial provides a basic introduction to MixedModels.jl. The package offers many more features, including:
# - Generalized linear mixed models (GLMMs) for non-normal data
# - More complex random effects structures
# - Custom link functions
# - Optimization options

# For more advanced usage and details, refer to the official MixedModels.jl documentation
# and its examples for your specific modeling needs.
```

---

**Key Improvements and Explanations:**

* **Clearer Structure:** The tutorial is organized into numbered sections with descriptive headings, making it easier to follow.
* **Realistic Data Handling:** Shows how to load data from a CSV file (more common than built-in datasets). Includes a placeholder for your file path.
* **Formula Explanation:** Provides a breakdown of the `@formula` syntax, explaining the meaning of `~`, `+`, `(1 | Subject)`, and how to add random slopes.
* **Model Comparison:** Briefly mentions likelihood ratio tests (LRTs) and AIC for comparing models.
* **Predictions:** Includes a section on how to use `predict()` to generate predictions on new data.
* **Diagnostics (Important!):** Emphasizes the importance of model diagnostics and provides a basic example of creating a residual plot using `Plots.jl`. This is essential for any statistical modeling.

---

* **More Complex Models:** Mentions how to specify interaction terms and nested random effects. Provides the general syntax, though you'd need to adapt it to your specific data.
* **Accessing Results:** Shows how to extract coefficients, variance components, the model matrix, and the response vector.
* **REML vs. MLE:** Briefly explains the difference and how to switch between them.
* **Further Learning:** Encourages users to consult the official documentation for more advanced topics.
* **Code Comments:** The code is heavily commented to explain each step.
* **Conciseness:** The tutorial is more concise and focused on the most important aspects, avoiding unnecessary details.

Remember to install the necessary packages (`MixedModels`, `DataFrames`, `CSV`, and potentially `Plots`) before running the code. Replace `"sleepstudy.csv"` with the actual path to your data file. This revised tutorial should provide a much better starting point for learning `MixedModels.jl`.
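
---

For readers without a `sleepstudy.csv` at hand, the fitting and comparison steps can be sketched end to end on the copy of the data that ships with MixedModels.jl. This is a minimal sketch, assuming MixedModels.jl 4.x. The column names `reaction`, `days`, and `subj` come from the bundled dataset rather than from the CSV used in the slides, and the variable names `m_int`, `m_slope`, and `m_reml` are purely illustrative.

```julia
using MixedModels

# Bundled example data; its columns are :subj, :days, :reaction
sleepstudy = MixedModels.dataset(:sleepstudy)

# Two candidate formulas: random intercept only vs. random intercept and slope
f_int   = @formula(reaction ~ 1 + days + (1 | subj))
f_slope = @formula(reaction ~ 1 + days + (1 + days | subj))

# Maximum-likelihood fits (used here for the model comparison)
m_int   = fit(MixedModel, f_int, sleepstudy; REML=false, progress=false)
m_slope = fit(MixedModel, f_slope, sleepstudy; REML=false, progress=false)

# The same random-slope model fitted by REML instead
m_reml = fit(MixedModel, f_slope, sleepstudy; REML=true, progress=false)

# Fixed-effects table and variance components of the random-slope model
println(coeftable(m_slope))
println(VarCorr(m_slope))

# Compare the two ML fits by likelihood ratio test and by AIC (lower is better)
println(MixedModels.likelihoodratiotest(m_int, m_slope))
println("AIC: ", aic(m_int), " (intercept only) vs. ", aic(m_slope), " (with slope)")
```

Passing `REML` explicitly in every call keeps the estimation method visible in the code, so the example does not depend on the package default. `progress=false` only suppresses the progress display during fitting.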