Interpret the intercept
The intercept of 37 is the expected literacy score for the control group,
that is, children who are not exposed to the educational shows.
Interpret the slope coefficient
The slope of 5.7 is the expected difference in literacy score between the
treatment group and the control group. Children who watch the educational
shows are therefore expected to score about 5.7 points higher than the
control group.
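The link between the two coefficients and the group means can be checked on made-up data. The sketch below is illustrative only: the data frame `toy`, its columns `treated` and `score`, and every value in it are hypothetical, chosen so that the group means echo the 37 and 5.7 reported above. With a 0/1 predictor, the intercept equals the control-group mean and the slope equals the difference between the two group means.

```r
# Hypothetical toy data (not the study data): 0 = control, 1 = watched the shows
toy <- data.frame(
  treated = c(0, 0, 0, 0, 1, 1, 1, 1),
  score   = c(35, 36, 38, 39, 40.7, 41.7, 43.7, 44.7)
)

fit <- lm(score ~ treated, data = toy)

control.mean   <- mean(toy$score[toy$treated == 0])
treatment.mean <- mean(toy$score[toy$treated == 1])

# Intercept = control-group mean; slope = treatment mean minus control mean
coef(fit)
c(control = control.mean, difference = treatment.mean - control.mean)
```

Both printed vectors should agree, which is why the intercept can be read as the control group's expected score and the slope as the treatment effect.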
Interpret the R^2 value
The \(r^2\) of 0.34 indicates that about 34% of the variation in literacy
scores is explained by whether children watched the educational shows.
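For reference, the proportion of variance explained follows the standard definition below, where \(y_i\) is a child's observed literacy score, \(\hat{y}_i\) is the score predicted by the model, and \(\bar{y}\) is the overall mean score. This notation is introduced here for illustration and does not appear in the original write-up.

\[
r^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
\]

With \(r^2 = 0.34\), the residual sum of squares is 66% of the total, so roughly two thirds of the variation in scores is left unexplained by the model.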
What can you conclude from this model in real-world terms?
This model suggests that exposure to educational TV has a positive and
fairly sizeable effect on children’s literacy scores: about 5.7 points on
top of a control-group average of 37.
In two or three sentences, discuss one concern you have with this analysis
A major concern with this analysis is the lack of baseline (starting)
literacy scores for the two groups. Without them, we cannot rule out that
one group, for example the control group, already had higher literacy
scores before the experiment, which would bias the estimated effect. An
appropriate solution would be to collect before-and-after scores and check
whether scores changed after watching the educational shows.
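A before-and-after design could be analyzed in at least two ways, sketched here under stated assumptions: the data frame `toy` and its columns `treated`, `pre`, and `post` are hypothetical and are not part of the original data. The first model compares average gains between groups; the second regresses the post score on treatment while holding the baseline score fixed.

```r
# Hypothetical data: literacy scores before (pre) and after (post) the study period
toy <- data.frame(
  treated = c(0, 0, 0, 0, 1, 1, 1, 1),
  pre     = c(30, 34, 36, 40, 31, 33, 37, 41),
  post    = c(32, 35, 38, 41, 38, 39, 44, 47)
)

# Option 1: compare the average gain (post - pre) between the groups
toy$gain <- toy$post - toy$pre
gain.fit <- lm(gain ~ treated, data = toy)

# Option 2: regress the post score on treatment, controlling for the baseline
adj.fit <- lm(post ~ treated + pre, data = toy)

coef(gain.fit)
coef(adj.fit)
```

Either approach separates the change that follows the shows from differences that already existed between the groups.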
## The regression function for whether women have children or not and their wage is:
## y ~ 17.61 - 3.14 * child.bin
## The mean difference in expected hourly wage between women who have children and women who don’t have children is 3.137.
## The expected hourly wage of women with and without children is as follows:
## data$child.bin: Children
## [1] 14.4697
## ------------------------------------------------------------
## data$child.bin: No Children
## [1] 17.60668
model1 = lm(wage.dollars ~ child.bin, data = data)
cat("The regression function for whether women have children or not and their wage is: ")
# Build a printable formula; "%+.2f" keeps the sign on every slope so the
# string still parses when a coefficient is positive
as.formula(
  paste0("y ~ ", round(coefficients(model1)[1], 2), " ",
         paste(sprintf("%+.2f * %s",
                       coefficients(model1)[-1],
                       names(coefficients(model1)[-1])),
               collapse = " ")
  )
)
# meanDiff() is not base R; it is assumed to come from a package loaded earlier
mean.diff = meanDiff(data$wage.dollars ~ data$child.bin)
cat("The mean difference in expected hourly wage between women who have children and women who don’t have children is ", round(mean.diff[["meanDiff"]], digits = 3), ".", sep = "")
# Relabel the 0/1 indicator for readable group names (done after fitting model1)
data$child.bin[data$child.bin == 0] = "No Children"
data$child.bin[data$child.bin == 1] = "Children"
mean.child = by(data = data$wage.dollars, INDICES = data$child.bin, FUN = mean)
cat("The expected hourly wage of women with and without children is as follows:")
mean.child
Level of education may be a confounding variable because it is plausibly related to both wages and having children. People with more education tend to hold higher-paying jobs, and women who do not have children may have had more time to pursue an advanced degree, so part of the wage gap attributed to children could really reflect differences in education.
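The confounding argument can be illustrated with a small simulation. Everything below is hypothetical: the variables `educ`, `child`, and `wage` and the numbers that generate them are assumptions made for this sketch, not estimates from the wage data. In the simulated population, children have no effect on wage at all, yet a model that omits education still reports a negative slope.

```r
set.seed(1)  # any seed works; fixed only for reproducibility
n <- 5000

# Simulated (hypothetical) population: education raises wages and lowers the
# chance of having children; children themselves have NO effect on wage here
educ  <- sample(1:4, n, replace = TRUE)
child <- rbinom(n, 1, prob = 0.8 - 0.15 * educ)
wage  <- 7 + 3.4 * educ + rnorm(n, sd = 2)

naive    <- lm(wage ~ child)          # education omitted
adjusted <- lm(wage ~ child + educ)   # education controlled

coef(naive)["child"]     # clearly negative, although the true effect is 0
coef(adjusted)["child"]  # close to 0 once education is held fixed
```

This is the pattern a confounder produces: the apparent effect shrinks once the confounder is added to the model.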
## The regression function for wage and number of children with a control variable of education level is:
## y ~ 7.17 - 0.55 * numChildren + 3.4 * educ.level
Interpret the coefficient of number of children
Controlling for education level, a woman’s expected hourly wage decreases
by about $0.55 (55 cents) for each additional child she has.
Interpret the coefficient of education level
Controlling for the number of children a woman has, her expected hourly
wage increases by about $3.40 for each one-level increase in education.
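Both interpretations can be read directly off the fitted equation `y ~ 7.17 - 0.55 * numChildren + 3.4 * educ.level`. The helper below is a hypothetical illustration (the function name `expected_wage` is not from the original code), and it assumes `educ.level` is coded as consecutive integers, which the write-up does not state.

```r
# Coefficients copied from the fitted equation reported above
b0         <- 7.17
b.children <- -0.55
b.educ     <- 3.40

# Expected hourly wage for a given number of children and education level
expected_wage <- function(num.children, educ.level) {
  b0 + b.children * num.children + b.educ * educ.level
}

expected_wage(0, 2)  # 7.17 + 0    + 6.80  = 13.97
expected_wage(1, 2)  # one more child, same education: 13.42 (0.55 lower)
expected_wage(1, 3)  # one more education level, same children: 16.82 (3.40 higher)
```

Changing one input while holding the other fixed is exactly what "controlling for" means in the interpretations above.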
Interpret the R^2
The \(r^2\) value of 0.16 means that the number of children a woman has and
her level of education together explain about 16% of the variance in
hourly wage.
Suggest an additional control variable
Marital status could be a useful control variable to add to this model. A
single woman with children is likely to have less flexibility than someone
who has spousal support or no children at all, which could lead her to
forgo higher education and/or work fewer hours.
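Adding marital status to the model might look like the sketch below. It is an illustration under assumptions: the data frame `toy` is a made-up stand-in for the wage data, and the 0/1 column `married` is hypothetical, since the original data set is not shown to contain a marital status variable.

```r
# Hypothetical stand-in for the wage data; `married` is an assumed 0/1 column
toy <- data.frame(
  wage.dollars = c(12.5, 18.0, 9.5, 22.0, 15.0, 11.0, 25.5, 14.0, 19.5, 10.0),
  numChildren  = c(2, 0, 3, 0, 1, 2, 0, 1, 1, 3),
  educ.level   = c(2, 3, 1, 4, 3, 2, 4, 2, 3, 1),
  married      = c(1, 0, 0, 1, 1, 0, 1, 0, 1, 1)
)

# Marital status as an additional control
model3 <- lm(wage.dollars ~ numChildren + educ.level + married, data = toy)
coef(model3)

# Optionally let the effect of children differ by marital status
model4 <- lm(wage.dollars ~ numChildren * married + educ.level, data = toy)
coef(model4)
```

The interaction in `model4` speaks to the argument above: it lets the wage penalty per child differ between women with and without spousal support.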
Conclusion of the model
The analysis suggests that having more children is associated with a lower
hourly wage for women, even after controlling for education level.
However, the model explains only about 16% of the variance in wages, so
there may be confounding variables that the current controls do not
account for. The model should be extended with further controls to better
understand the effect of children on women’s wages.
model2 = lm(wage.dollars ~ numChildren + educ.level, data = data)
cat("The regression function for wage and number of children with a control variable of education level is: ")
# "%+.2f" keeps the sign on every slope, so the string parses as a formula
# whether the first slope is negative or positive
as.formula(
  paste0("y ~ ", round(coefficients(model2)[1], 2), " ",
         paste(sprintf("%+.2f * %s",
                       coefficients(model2)[-1],
                       names(coefficients(model2)[-1])),
               collapse = " ")
  )
)
# R^2 of model2, rounded for reporting in the text
r.2 = round(summary(model2)$r.squared, digits = 2)