The data file Weeklylab7data.xlsx contains mock data on relationship satisfaction (measured on a scale from 0 to 100) for 200 individuals. Other variables include Sex (0 male, 1 female), age (continuous), shared house work (0 no, 1 yes), nights spent together on average per week (0 to 7), and financial security measured on a scale from 0 (heavily in debt) to 10 (very secure). Your goal is to build a model that predicts satisfaction using these variables. Summarize your findings and include and a graph.
library(readxl)
## Warning: package 'readxl' was built under R version 3.4.4
Relationship <- read_excel("C:/Users/Enrique/OneDrive/Documents/HU/ANLY510_Principles7Applicaitons02/Data/RelationshipSatisfaction.xlsx")
attach(Relationship)
str(Relationship)
## Classes 'tbl_df', 'tbl' and 'data.frame': 199 obs. of 6 variables:
## $ RelationshipSatisfaction: num 17 63 48 45 53 41 41 6 10 71 ...
## $ Sex : num 1 1 1 1 1 0 0 0 0 1 ...
## $ Age : num 19 48 49 46 18 41 23 22 25 30 ...
## $ ShareInHouseWork : num 0 1 0 1 1 0 1 0 0 1 ...
## $ NightsTogether : num 4 2 2 4 5 2 4 2 5 4 ...
## $ FinancialSecurity : num 1 1 3 10 5 9 3 0 4 10 ...
Plot displays a normal distribution.Let’s test for skewness and normality to make sure.
plot(density(Relationship$RelationshipSatisfaction))
Data passed for skewness and normality.
library(moments)
shapiro.test(Relationship$RelationshipSatisfaction)
##
## Shapiro-Wilk normality test
##
## data: Relationship$RelationshipSatisfaction
## W = 0.99146, p-value = 0.2923
agostino.test(Relationship$RelationshipSatisfaction)
##
## D'Agostino skewness test
##
## data: Relationship$RelationshipSatisfaction
## skew = -0.14113, z = -0.83724, p-value = 0.4025
## alternative hypothesis: data have a skewness
Lets plot some visuals to display the association between predictors and Satisfaction.
Females feel slightly more satisfied with their relationship than males
boxplot(Relationship$RelationshipSatisfaction~Relationship$Sex,
main="Satisfaction per Gender",ylab="Relationship Satisfaction",
xlab="Gender (1 = Female)", col=c("gray","pink"))
Older Couples seem to be slightly more satisfied than younger couples
plot(Relationship$Age,Relationship$RelationshipSatisfaction, main="Satisfaction and Age",
xlab="Age",ylab="Relationship Satisfaction", col="gray", pch=18)
abline(lm(RelationshipSatisfaction~Age), lwd=2, col="green", lty="dashed")
Couples that share the housework are more satisfied
boxplot(Relationship$RelationshipSatisfaction~Relationship$ShareInHouseWork,
main="Satisfaction by Shared Housework",
ylab="Relationship Satisfaction", xlab="Shared House work (1 = Yes)", col=c("gray","pink"))
Spending more nights together yields slightly higher satisfaction in a relationship
plot(Relationship$NightsTogether,Relationship$RelationshipSatisfaction, main="Satisfaction and Nights Together",
xlab="Nights together",ylab="Relationship Satisfaction", col="gray", pch=18)
abline(lm(RelationshipSatisfaction~NightsTogether), lwd=2, col="green", lty="dashed")
Greater financial secruty causes couples to feel more satisfied with their relationship.
plot(Relationship$FinancialSecurity,Relationship$RelationshipSatisfaction, main="Satisfaction and Financial Security",
xlab="Financial Security",ylab="Relationship Satisfaction", col="gray", pch=18)
abline(lm(RelationshipSatisfaction~FinancialSecurity), lwd=2, col="green", lty="dashed")
68% of the variation in the data is explained by our model.
Rel_Model=lm(RelationshipSatisfaction~Sex+Age+ShareInHouseWork+NightsTogether+FinancialSecurity,
data = Relationship)
summary(Rel_Model)
##
## Call:
## lm(formula = RelationshipSatisfaction ~ Sex + Age + ShareInHouseWork +
## NightsTogether + FinancialSecurity, data = Relationship)
##
## Residuals:
## Min 1Q Median 3Q Max
## -25.5757 -6.8926 0.6291 7.2475 23.0836
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.7333 3.1873 -0.858 0.392
## Sex 12.7091 1.4750 8.616 2.46e-15 ***
## Age 0.2761 0.0683 4.043 7.63e-05 ***
## ShareInHouseWork 23.6837 1.4814 15.987 < 2e-16 ***
## NightsTogether 1.6254 0.3204 5.073 9.15e-07 ***
## FinancialSecurity 1.7714 0.2199 8.056 7.99e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.33 on 193 degrees of freedom
## Multiple R-squared: 0.6954, Adjusted R-squared: 0.6875
## F-statistic: 88.1 on 5 and 193 DF, p-value: < 2.2e-16
52% of the variation is explained by testing strongest predictors (housework and sex).
Rel_model2=lm(RelationshipSatisfaction~ShareInHouseWork*Sex, Relationship)
summary(Rel_model2)
##
## Call:
## lm(formula = RelationshipSatisfaction ~ ShareInHouseWork * Sex,
## data = Relationship)
##
## Residuals:
## Min 1Q Median 3Q Max
## -33.130 -9.190 0.698 8.750 29.698
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 22.250 1.704 13.056 < 2e-16 ***
## ShareInHouseWork 23.052 2.586 8.915 3.50e-16 ***
## Sex 12.641 2.538 4.981 1.39e-06 ***
## ShareInHouseWork:Sex -0.814 3.638 -0.224 0.823
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.75 on 195 degrees of freedom
## Multiple R-squared: 0.5307, Adjusted R-squared: 0.5235
## F-statistic: 73.52 on 3 and 195 DF, p-value: < 2.2e-16
Let’s verify the results by ploting residuals.
qqnorm(Rel_Model$residuals)
qqline(Rel_Model$residuals)
Residuals follow a normal distribution.
__Conclusion:__The first model fits the data much better with R^2 of 68%. That tells us that the predictors reduce unexpected variation much better when tested together.
Let’s plug in the coefficients of model 1 to predict relationship satisfaction of three different profiles.
Prediction 1:
#Female #36 years old
#Shares housework
#3 nights together per week
#5 for financial security
Prediction 2:
#Male #36 years old
#Doesn't share housework
#3 nights together per week
#5 for financial security
Prediction 2:
#Female #45 years old
#shares housework
#5 nights together per week
#8 for financial security
Prediction1=((1*12)+(36*.27)+(1*23)+(3*1.62)+(5*1.77))
Prediction2=((0*12)+(36*.27)+(0*23)+(3*1.62)+(5*1.77))
Prediction3=((1*12)+(45*.27)+(1*23)+(5*1.62)+(8*1.77))
Satisfaction=c(Prediction1,Prediction2,Prediction3)
Age1=c(36,36,45)
ShareInHouseWork1=c(1,0,1)
NightsTogether1=c(3,3,5)
FinancialSecurity1=c(5,5,8)
(Predictions=data.frame(Satisfaction,Age1,ShareInHouseWork1,NightsTogether1,FinancialSecurity1))
## Satisfaction Age1 ShareInHouseWork1 NightsTogether1 FinancialSecurity1
## 1 58.43 36 1 3 5
## 2 23.43 36 0 3 5
## 3 69.41 45 1 5 8
Conclusion
The coefficients of model 1 yielded similar results to the association displayed initially:
*Females tend to feel more satisfied in their relationships than males
*Couples that share housework also feel more satisfied
*More nights together per week yields higher satisfaciton
*Older couples tend to feel more satisfied
*Financial stability yields more satisfaction as well