SUMMARY

This report summarizes the key takeaways from Lessons 6-8 of the Regression Models course. These lessons go deeper into multivariable regression through three worked examples that are easy to visualize and understand.

LESSON 6: MULTIVAR EXAMPLES

In this lesson, we applied regression models with multiple variables to the swiss dataset from R’s datasets package. This dataset, collected in 1888, includes six variables from 47 French-speaking provinces of Switzerland. Scatterplot matrices let us examine the pairwise relationships among the variables. We focused on how individual variables, specifically agriculture and education, influence fertility rates. We observed that the estimated effect of agriculture on fertility was negative when the other variables were included in the model, but positive when agriculture was the only predictor. This sign change highlights the importance of considering multiple factors in regression analysis.
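The sign change described above can be reproduced with a short sketch. It assumes only the swiss data frame that ships with base R; the object names fit_all and fit_agri are illustrative and not taken from the lesson.

```r
# swiss is part of the datasets package that loads with base R.
data(swiss)

# Scatterplot matrix of every pair of variables.
pairs(swiss)

# Fertility regressed on all five other variables.
fit_all <- lm(Fertility ~ ., data = swiss)

# Fertility regressed on Agriculture alone.
fit_agri <- lm(Fertility ~ Agriculture, data = swiss)

# Compare the Agriculture coefficient across the two fits.
coef(fit_all)["Agriculture"]
coef(fit_agri)["Agriculture"]
```

The first coefficient is adjusted for the other predictors, while the second is not, which is why the two can disagree in sign.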

An important takeaway from this lesson is the need to understand how the independent variables in a multivariable regression model relate to one another. The correlation between variables such as agriculture and education showed how multicollinearity can affect the interpretation of results. Additionally, adding a redundant variable (ec) to the model showed that such additions provide no new information, which underlines the importance of selecting variables that contribute meaningfully to the model.
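A minimal sketch of both checks follows. The report does not say how ec was built, so defining it as the sum of Examination and Catholic is an assumption made here for illustration; any exact linear combination of existing predictors would behave the same way.

```r
data(swiss)

# Correlation between two of the predictors.
cor(swiss$Agriculture, swiss$Education)

# Assumed definition: ec is an exact linear combination of two columns
# that are already in the model, so it carries no new information.
ec <- swiss$Examination + swiss$Catholic

# Add the redundant variable to the full model.
fit_ec <- lm(Fertility ~ . + ec, data = swiss)

# lm cannot estimate a separate effect for ec and reports NA for it.
coef(fit_ec)
```

The NA is how lm signals that a predictor is perfectly explained by the others.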

LESSON 7: MULTIVAR EXAMPLES2

In this lesson, we explored regression models in which the predictor is a factor with several levels, using the InsectSprays dataset from R’s datasets package. The data records insect counts for each of six different sprays, letting us examine the effectiveness of each spray. We fitted linear models that compare the sprays, with spray A as the reference group. We used the R functions lm to fit the models and summary to inspect the coefficients, which give the difference between the mean count of each spray and the mean count of the reference spray.
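As a sketch of this interpretation, the coefficients can be checked against the group means directly. The names fit and group_means are illustrative.

```r
data(InsectSprays)

# spray is a factor with levels A to F; A comes first, so lm treats it
# as the reference group.
fit <- lm(count ~ spray, data = InsectSprays)
coef(summary(fit))

# Mean insect count for each spray.
group_means <- tapply(InsectSprays$count, InsectSprays$spray, mean)
group_means

# The intercept matches the mean of spray A, and each remaining
# coefficient matches that spray's mean minus the mean of spray A.
group_means["B"] - group_means["A"]
coef(fit)["sprayB"]
```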

A takeaway from this lesson is that the choice of reference group in a regression model determines how the results are interpreted. When we changed the reference group by reordering the levels with the relevel function, the model’s coefficients changed, and with them the t-tests, because each test now compares a spray against the new reference. Keeping this in mind is important for interpreting the output of regression models correctly, especially in multivariable contexts.
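The effect of changing the reference group can be sketched as follows. Choosing spray C as the new reference is an assumption for illustration, as are the names spray2 and fit_c.

```r
data(InsectSprays)

# Copy the factor with C moved to the front, making it the reference level.
spray2 <- relevel(InsectSprays$spray, "C")
levels(spray2)

# Refit the model using the releveled factor.
fit_c <- lm(count ~ spray2, data = InsectSprays)

# The intercept is now the mean of spray C, and every other row
# (estimate and t-test) compares a spray against C instead of A.
coef(summary(fit_c))
```

The fitted group means are the same under either reference; only the comparisons reported in the coefficient table differ.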

LESSON 8: MULTIVAR EXAMPLES3

In this lesson, we explored multivariable linear regression models using WHO data on child hunger rates. We started by analyzing the relationship between the rate of hunger among children and the year the data was collected, and found that hunger rates decreased over time. We then examined how hunger rates differed by gender by fitting separate models for male and female children. Both genders showed a decline in hunger rates, but the two fitted lines were not parallel, suggesting that the rates decreased at different speeds.

Next, we included both gender and year as predictors in a single model to see their combined effect on hunger rates. A model with these two terms alone assumes a common slope for both genders, so the fitted lines are parallel and differ only in their intercepts. Finally, we added an interaction between the predictors, which lets the slope vary by gender and shows how the relationship between year and hunger rates depends on gender.
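The three modeling steps can be sketched on a small synthetic data set. The WHO data is not reproduced here: every value below is made up, and the data frame toy and its column names Year, Sex, and rate are hypothetical stand-ins.

```r
# Synthetic stand-in data: all numbers are invented for illustration.
set.seed(1)
years <- rep(1990:2010, times = 2)
sex <- factor(rep(c("Female", "Male"), each = 21))
# Assumed trend: male rates start higher and fall slightly faster.
rate <- 30 - 0.30 * (years - 1990) +
  ifelse(sex == "Male", 2 - 0.05 * (years - 1990), 0) +
  rnorm(length(years), sd = 0.5)
toy <- data.frame(Year = years, Sex = sex, rate = rate)

# Separate fits: each gender gets its own intercept and slope.
fit_f <- lm(rate ~ Year, data = toy[toy$Sex == "Female", ])
fit_m <- lm(rate ~ Year, data = toy[toy$Sex == "Male", ])

# Additive model: one shared slope, with an intercept shift for Male.
fit_add <- lm(rate ~ Year + Sex, data = toy)

# Interaction model: the Year:SexMale term lets the slope differ by gender.
fit_int <- lm(rate ~ Year * Sex, data = toy)

coef(fit_add)
coef(fit_int)
```

In the interaction model the Year coefficient is the slope for the reference gender, and adding the Year:SexMale coefficient gives the slope for the other, so it recovers the slopes of the two separate fits.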