Introduction

This analysis focuses on secondary school absences. Is there a relationship between student absences and age, sex, and weekend alcohol consumption? The analysis is based on Homework 5.

# Packages used in this analysis (duplicate library() calls removed;
# ggplot2 and readr are already attached by tidyverse)
library(tidyverse)
library(Zelig)
library(texreg)
library(mvtnorm)
library(radiant.data)
library(sjmisc)
library(lattice)
library(stargazer)
library(ggthemes)
library(plotly)
library(devtools)

# Read the student data; adjust the path to where students.csv is stored
student <- read_csv("/Users/cruz/Desktop/students.csv", col_names = TRUE)

Linear Regression

The dependent variable in this analysis is “absences” in this particular secondary school. The independent variables, chosen to explain some of the variation in absences, are age, sex, and Walc (weekend student alcohol consumption).

lm0 <- lm(absences ~ age + sex + Walc, data = student)
summary(lm0)

Call:
lm(formula = absences ~ age + sex + Walc, data = student)

Residuals:
   Min     1Q Median     3Q    Max 
-9.372 -4.618 -1.837  2.405 68.418 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept) -11.8414     5.1921  -2.281  0.02311 * 
age           0.9730     0.3113   3.126  0.00190 **
sexM         -1.6428     0.8205  -2.002  0.04595 * 
Walc          0.9087     0.3206   2.835  0.00482 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.814 on 391 degrees of freedom
Multiple R-squared:  0.05399,   Adjusted R-squared:  0.04673 
F-statistic: 7.438 on 3 and 391 DF,  p-value: 0.0000743
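
To make the coefficients concrete, the following minimal sketch plugs the rounded estimates from the lm0 summary into the fitted equation. The helper name predict_absences and the example profiles (age 17, Walc rating 2) are assumptions chosen for illustration; they are not values taken from the analysis.

```r
# Rounded coefficients copied from the lm0 summary output
b <- c(intercept = -11.8414, age = 0.9730, sexM = -1.6428, Walc = 0.9087)

# Hypothetical helper: predicted absences for one student profile.
# male is 1 for a male student and 0 for a female student.
predict_absences <- function(age, male, walc) {
  unname(b["intercept"] + b["age"] * age + b["sexM"] * male + b["Walc"] * walc)
}

# Illustrative profiles (assumed, not taken from the data)
female_17 <- predict_absences(age = 17, male = 0, walc = 2)
male_17   <- predict_absences(age = 17, male = 1, walc = 2)

# With no interaction in lm0, the gap between two otherwise identical
# profiles is the sexM coefficient with its sign flipped
female_17 - male_17
```

With the fitted model object in the session, predict(lm0, newdata = ...) should give the same kind of result from the unrounded coefficients.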

Linear Regression X Interaction

As observed in the following model, the effect of age on absences is statistically significant: for each additional year of age, predicted absences increase by about 0.99, holding the other variables constant. The coefficient on sex indicates that males have about 0.92 fewer absences than females when Walc is zero, but this difference is not statistically significant (p = 0.588). Because the model includes an interaction, the coefficient on “Walc” (weekend alcohol consumption) is the slope for females, the reference group: each one-unit increase in the weekend alcohol consumption rating is associated with about 1.11 more absences (p = 0.032). Lastly, the interaction term (sex*Walc) is -0.325, meaning the Walc slope for males is about 0.33 smaller than the slope for females (1.107 - 0.325 = 0.782 absences per unit). This is a difference in slopes rather than a difference in likelihood, and it is not statistically significant (p = 0.623).

lm1 <- lm(absences ~ age + sex*Walc, data = student)
summary(lm1)

Call:
lm(formula = absences ~ age + sex * Walc, data = student)

Residuals:
   Min     1Q Median     3Q    Max 
-9.853 -4.481 -1.762  2.412 68.583 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept) -12.5606     5.3985  -2.327   0.0205 * 
age           0.9928     0.3142   3.160   0.0017 **
sexM         -0.9157     1.6899  -0.542   0.5882   
Walc          1.1071     0.5151   2.149   0.0322 * 
sexM:Walc    -0.3251     0.6604  -0.492   0.6227   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.821 on 390 degrees of freedom
Multiple R-squared:  0.05458,   Adjusted R-squared:  0.04488 
F-statistic: 5.628 on 4 and 390 DF,  p-value: 0.0002057
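
The interaction coefficients can be read as group-specific slopes. The sketch below works through that arithmetic using the rounded lm1 estimates from the summary output. The function name sex_gap and the Walc ratings 1 and 3 are assumptions picked only for illustration.

```r
# Rounded coefficients copied from the lm1 summary output
sexM_main         <- -0.9157  # male-minus-female gap when Walc is 0
walc_slope_female <-  1.1071  # Walc coefficient (slope for the reference group)
interaction_sexM  <- -0.3251  # sexM:Walc coefficient

# Slope of absences on Walc for males = reference slope + interaction
walc_slope_male <- walc_slope_female + interaction_sexM

# Hypothetical helper: male-minus-female gap in predicted absences
# at a given Walc rating, holding age fixed
sex_gap <- function(walc) sexM_main + interaction_sexM * walc

c(female = walc_slope_female, male = walc_slope_male)

# Gap at two assumed ratings, for illustration only
sapply(c(1, 3), sex_gap)
```

Because neither the sexM nor the sexM:Walc estimate is statistically significant in lm1, these derived quantities are best treated as descriptive.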

Group-wise summary of dependent variable

The group means of absences by sex and Walc provide a rough check on the regression results.

# Mean absences for each sex-by-Walc group
# (dplyr and the pipe are already attached by tidyverse)
abstudent <- student %>%
  select(absences, sex, Walc) %>%
  group_by(sex, Walc) %>%
  summarise(mean = mean(absences))
head(abstudent)
stargazer(lm0, lm1, type = "html")

                      Dependent variable: absences
                      (1)                       (2)
age                   0.973***                  0.993***
                      (0.311)                   (0.314)
sexM                  -1.643**                  -0.916
                      (0.820)                   (1.690)
Walc                  0.909***                  1.107**
                      (0.321)                   (0.515)
sexM:Walc                                       -0.325
                                                (0.660)
Constant              -11.841**                 -12.561**
                      (5.192)                   (5.399)
Observations          395                       395
R2                    0.054                     0.055
Adjusted R2           0.047                     0.045
Residual Std. Error   7.814 (df = 391)          7.821 (df = 390)
F Statistic           7.438*** (df = 3; 391)    5.628*** (df = 4; 390)
Note: *p<0.1; **p<0.05; ***p<0.01

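Model lm0 seems to be the better fit for these data: adding the interaction term in lm1 lowers the adjusted R-squared from 0.047 to 0.045, and the interaction itself is not statistically significant. AIC and BIC offer a further comparison, with lower values indicating the preferred model.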

AIC(lm0,lm1)
BIC(lm0,lm1)
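
As a rough illustration of what AIC() and BIC() compute, the sketch below rebuilds both criteria from the log-likelihood. It runs on simulated data with the same variable names, not on the student data set. The toy data, the model names m0 and m1, and the helper manual_ic are all assumptions made for this example.

```r
set.seed(1)  # simulated data only; not the student data set

# Hypothetical toy data with the same variable names as the analysis
n <- 200
toy <- data.frame(
  age  = sample(15:19, n, replace = TRUE),
  sex  = factor(sample(c("F", "M"), n, replace = TRUE)),
  Walc = sample(1:5, n, replace = TRUE)
)
toy$absences <- 1 + 0.5 * toy$age + 0.8 * toy$Walc + rnorm(n, sd = 5)

m0 <- lm(absences ~ age + sex + Walc, data = toy)  # additive model
m1 <- lm(absences ~ age + sex * Walc, data = toy)  # adds the interaction

# AIC = -2 * logLik + 2 * k; BIC uses log(n) in place of 2.
# k counts the regression coefficients plus the residual variance.
manual_ic <- function(fit) {
  ll <- logLik(fit)
  k  <- attr(ll, "df")
  c(AIC = -2 * as.numeric(ll) + 2 * k,
    BIC = -2 * as.numeric(ll) + log(nobs(fit)) * k)
}

rbind(m0 = manual_ic(m0), m1 = manual_ic(m1))

# Nested F test for the added interaction term
anova(m0, m1)
```

Both criteria penalize the extra interaction parameter, so the larger model is preferred only if it improves the likelihood enough to offset that penalty. The nested F test from anova() asks the same question in hypothesis-testing form.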

Purpose of plotting some of the variables

I plotted some of the variables to understand visually the relationships occurring in this analysis and to check that the regression output looked plausible.

student <- student%>%
  mutate(sex = as.factor(sex))
library(visreg)
abstudent2 <- lm(absences ~ age + sex + Walc, data=student)
visreg(abstudent2)
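
For readers without visreg, a conditional plot of this kind can be approximated in base R. The sketch below is a simplified stand-in under stated assumptions, not visreg's implementation: it uses simulated data, varies age over a grid, holds Walc at its median and sex at its most common level, and omits the partial residuals that visreg overlays. The object names toy, fit, grid, and pred are hypothetical.

```r
set.seed(2)  # simulated data only; not the student data set

n <- 150
toy <- data.frame(
  age  = sample(15:19, n, replace = TRUE),
  sex  = factor(sample(c("F", "M"), n, replace = TRUE)),
  Walc = sample(1:5, n, replace = TRUE)
)
toy$absences <- 2 + 0.6 * toy$age + 0.7 * toy$Walc + rnorm(n, sd = 4)

fit <- lm(absences ~ age + sex + Walc, data = toy)

# Grid over age; hold Walc at its median and sex at its most common level
grid <- data.frame(
  age  = seq(min(toy$age), max(toy$age), length.out = 50),
  sex  = factor(names(which.max(table(toy$sex))), levels = levels(toy$sex)),
  Walc = median(toy$Walc)
)
pred <- predict(fit, newdata = grid, interval = "confidence")

# Fitted line (solid) with a 95% confidence band (dashed)
plot(grid$age, pred[, "fit"], type = "l", ylim = range(pred),
     xlab = "age", ylab = "predicted absences")
lines(grid$age, pred[, "lwr"], lty = 2)
lines(grid$age, pred[, "upr"], lty = 2)
```

Repeating the same steps with a grid over Walc, or with one row per level of sex, gives the corresponding panels for the other predictors.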

Extra Plots X Data Visuals

As a very visually driven person, I use the extra plots simply to help me understand the data and the variables I chose for this analysis.

Weekend Alcohol Consumption X Age

In this graph we see that as the age of the students in this secondary school increases, so does the level of weekend alcohol consumption.

# Smoothed Walc (cyan) and Dalc (aquamarine) as functions of age
ggplot(student) +
  geom_smooth(aes(x = age, y = Walc), color = "cyan", fill = "blue") +
  geom_smooth(aes(x = age, y = Dalc), color = "aquamarine1", fill = "black") +
  theme_solarized()

ggplotly()

Weekend Alcohol Consumption X Absences

This graph suggests that absences and weekend alcohol consumption rise together up to a moderate range, after which the relationship tapers off.

# Smoothed Walc (cyan) and Dalc (aquamarine) as functions of absences
ggplot(student) +
  geom_smooth(aes(x = absences, y = Walc), color = "cyan", fill = "blue") +
  geom_smooth(aes(x = absences, y = Dalc), color = "aquamarine1", fill = "black") +
  theme_dark() +
  scale_colour_stata()

ggplotly()
plot(absences ~ Walc, data = student)

plot(absences ~ age, data = student)

plot(absences ~ sex*Walc, data = student)

studentlm <- lm(absences ~ Walc, data = student)
library(visreg)
visreg(studentlm)

Sex Vs Absences

In this particular secondary school, females tend to have more absences than males.

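# sex is categorical, so compare the distribution of absences by group
# instead of smoothing a factor on the y axis
ggplot(student) +
  geom_boxplot(aes(x = sex, y = absences), color = "cyan", fill = "blue") +
  theme_dark()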

ggplotly()

Age X Absences

In this secondary school, absences increase as age increases.

# Build the plot step by step, then convert it to an interactive plotly graph
g2 <- ggplot(student, mapping = aes(x = age, y = absences))
g2 <- g2 + geom_smooth(color = "aquamarine", fill = "cyan") + theme_dark()
ggplotly(g2)

*Note: I chose to use both static and interactive graphs solely to demonstrate skill. My last graph is built step by step as a separate object before being made interactive, to show that I can construct a graph that way as well.

Citations

(Gershenson, Jacknowitz, and Brannegan 2017) (Chang 2012) (Maindonald and Braun 2010)

Chang, Winston. 2012. R Graphics Cookbook. Sebastopol, CA: O’Reilly Media, Inc.

Gershenson, Seth, Alison Jacknowitz, and Andrew Brannegan. 2017. “Are Student Absences Worth the Worry in U.S. Primary Schools?” Education Finance and Policy. MIT Press.

Maindonald, John, and W John Braun. 2010. Data Analysis and Graphics Using R: An Example-Based Approach. Cambridge: Cambridge University Press.
