Among the 100 interesting data sets, we select Global health data, which is helpful for identifying high-impact ways to improve world health.
From the Global health data, we select ‘Mental health’, which includes Mental health governance (3 factors: legislation, plan, and policy), Human resources (1 factor: psychiatrists), and Suicide rates (1 response variable).
The data set was collected to examine the effects of mental health care on the suicide rate. It consists of 5 analysis variables (plus a country identifier) and 160 observations.
Mental <- read.table("C:/Users/bokjh3/Desktop/Global health data.csv", header=T, sep=",")
head(Mental)
## country X2012.suicide.rate X2011.legislation X2011.plan X2011.policy
## 1 Afghanistan 4.0 2 2 2
## 2 Albania 6.5 2 2 2
## 3 Algeria 1.8 2 2 2
## 4 Angola 10.6 1 2 1
## 5 Armenia 3.3 2 1 1
## 6 Australia 11.6 2 2 2
## X2011.Psychiatrists
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 2
The factors in the data set are shown below. Each factor has two levels, coded 1 and 2 (no or yes for the three governance factors; below average or above average for psychiatrists).
str(Mental)
## 'data.frame': 160 obs. of 6 variables:
## $ country : Factor w/ 160 levels "Afghanistan",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ X2012.suicide.rate : num 4 6.5 1.8 10.6 3.3 11.6 15.6 1.7 7.2 6.6 ...
## $ X2011.legislation : int 2 2 2 1 2 2 2 2 1 1 ...
## $ X2011.plan : int 2 2 2 2 1 2 2 2 2 2 ...
## $ X2011.policy : int 2 2 2 1 1 2 1 2 2 2 ...
## $ X2011.Psychiatrists: int 1 1 1 1 1 2 2 2 2 1 ...
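Because the four factors are stored as integers, it can help to convert them to labeled factors before plotting or modeling. The following is a minimal sketch, not part of the original analysis: `toy` is a hypothetical stand-in built from four rows of the `head(Mental)` output, and the labels follow the coding described in this report (1 = No or below average, 2 = Yes or above average).

```r
# Hypothetical stand-in for `Mental`: four rows copied from the head() output
# (Afghanistan, Albania, Angola, Australia), two of the four factors only.
toy <- data.frame(
  X2012.suicide.rate  = c(4.0, 6.5, 10.6, 11.6),
  X2011.legislation   = c(2L, 2L, 1L, 2L),
  X2011.Psychiatrists = c(1L, 1L, 1L, 2L)
)

# Convert the 1/2 integer codes into labeled two-level factors.
toy$X2011.legislation <- factor(toy$X2011.legislation,
                                levels = c(1, 2),
                                labels = c("No", "Yes"))
toy$X2011.Psychiatrists <- factor(toy$X2011.Psychiatrists,
                                  levels = c(1, 2),
                                  labels = c("Below average", "Above average"))

str(toy)
```

With two-level factors the degrees of freedom in the ANOVA tables stay the same; the labels mainly make plots and output easier to read.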
The response variable is the 2012 suicide rate, defined as the suicide mortality rate per 100 000 population in 2012. It is a continuous variable.
There is a one-year time lag between the factors (year 2011) and the response variable (year 2012) because of the constraints of the Global health data. Countries with missing values are left out of the analysis.
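Dropping countries with missing values can be sketched as follows. This is an illustration under assumptions, not the authors' actual cleaning step: the data frame `raw`, its column names, and its values are placeholders.

```r
# Hypothetical raw table with gaps; names and values are placeholders.
raw <- data.frame(
  country      = c("A", "B", "C", "D"),
  suicide.rate = c(4.0, NA, 10.6, 11.6),
  legislation  = c(2L, 2L, NA, 1L)
)

# Keep only the rows (countries) that have no missing value in any column.
complete <- raw[complete.cases(raw), ]

nrow(complete)
```

Here countries "B" and "C" are removed because each has one missing entry.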
We use 4 factors with two levels each and analyze their main and interaction effects on the suicide rate. In this experiment, the null hypothesis is that none of the factors has a statistically significant main or interaction effect on the suicide rate.
Some people might argue that suicide is a personal matter. However, we think that the suicide rate could be reduced through institutional arrangements such as government policy or aid from others. Therefore, we investigate the impact of 4 factors (government legislation, government plan, government policy, and accessibility of psychiatrists) on reducing the suicide rate.
Randomization is a technique used to balance the effect of extraneous or uncontrollable conditions that can impact the results of an experiment. In this experiment, we do not consider randomization because we include all countries in the world with available data (it is a population, not a sample).
Replicates are multiple experimental runs with the same factor settings (levels). Replicates are subject to the same sources of variability, independently of each other. We can replicate combinations of factor levels, groups of factor level combinations, or entire designs. In our experiment, the Global Health data were collected without repeated measurements: each country contributes a single observation.
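Although no country is measured twice, several countries may share the same combination of factor levels. One way to see how the observations are spread over the 16 combinations is a cross-tabulation. The sketch below uses simulated factor codes, so `toy` and its counts are hypothetical and say nothing about the real data.

```r
# Simulated 1/2 codes for four two-level factors (hypothetical data).
set.seed(1)
n <- 160
toy <- data.frame(
  legislation   = sample(1:2, n, replace = TRUE),
  plan          = sample(1:2, n, replace = TRUE),
  policy        = sample(1:2, n, replace = TRUE),
  psychiatrists = sample(1:2, n, replace = TRUE)
)

# Four-way table of counts: one cell per combination of factor levels.
cells <- table(toy)

# Flattened view that is easier to read than a four-dimensional array.
ftable(cells)

# Smallest and largest cell count.
range(cells)
```

For the real data the same idea would apply to the four factor columns, for example `table(Mental[, 3:6])`. Unequal cell counts mean the design is unbalanced.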
In experimental design, blocking is a technique used to deal with nuisance factors that may affect the results of the experiment. The experiment is organized into blocks, where the nuisance factor is maintained at a constant level in each block. Blocking is unnecessary in this experimental design because the factors in this experiment are just survey questions (not treatments).
First, we need to examine our data set.
summary(Mental)
## country X2012.suicide.rate X2011.legislation X2011.plan
## Afghanistan: 1 Min. : 0.300 Min. :1.0 Min. :1.000
## Albania : 1 1st Qu.: 3.775 1st Qu.:1.0 1st Qu.:2.000
## Algeria : 1 Median : 7.550 Median :2.0 Median :2.000
## Angola : 1 Mean : 9.401 Mean :1.6 Mean :1.775
## Armenia : 1 3rd Qu.:13.000 3rd Qu.:2.0 3rd Qu.:2.000
## Australia : 1 Max. :36.800 Max. :2.0 Max. :2.000
## (Other) :154
## X2011.policy X2011.Psychiatrists
## Min. :1.000 Min. :1.000
## 1st Qu.:1.000 1st Qu.:1.000
## Median :2.000 Median :1.000
## Mean :1.656 Mean :1.306
## 3rd Qu.:2.000 3rd Qu.:2.000
## Max. :2.000 Max. :2.000
##
In this section, the levels of each factor are shown in boxplots to examine the main effects of the factors on the response variable, suicide rate. Keep in mind that No is 1 and Yes is 2 (in the case of psychiatrists, ‘Below average’ is 1 and ‘Above average’ is 2).
boxplot(Mental$X2012.suicide.rate~Mental$X2011.legislation, xlab="Mental health legislation", ylab="Suicide rate")
title("Impact of legislation on Suicide")
boxplot(Mental$X2012.suicide.rate~Mental$X2011.plan, xlab="Mental health plan", ylab="Suicide rate")
title("Impact of plan on Suicide")
boxplot(Mental$X2012.suicide.rate~Mental$X2011.policy, xlab="Mental health policy", ylab="Suicide rate")
title("Impact of policy on Suicide")
boxplot(Mental$X2012.suicide.rate~Mental$X2011.Psychiatrists, xlab="Psychiatrists working in mental health", ylab="Suicide rate")
title("Impact of psychiatrists on Suicide")
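The boxplots can be complemented with group means per factor level. The following sketch uses only the six rows shown in the `head(Mental)` output, so `toy` is a hypothetical subset and its means are not representative of all 160 countries.

```r
# Six rows copied from the head() output: suicide rate and legislation code.
toy <- data.frame(
  suicide.rate = c(4.0, 6.5, 1.8, 10.6, 3.3, 11.6),
  legislation  = c(2, 2, 2, 1, 2, 2)
)

# Mean suicide rate within each level of the factor (1 = No, 2 = Yes).
means <- tapply(toy$suicide.rate, toy$legislation, mean)
means
# In this subset, level 1 has a single country (10.6) and level 2 averages 5.44.
```

On the full data, the same call with `Mental$X2012.suicide.rate` and any of the four factor columns would give the means behind each pair of boxes.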
In this section, we examine the main effects of all factors using ANOVA. Legislation and psychiatrists are statistically significant even at the 1% significance level, while plan and policy are not significant.
me <- aov(Mental$X2012.suicide.rate~Mental$X2011.legislation+Mental$X2011.plan+Mental$X2011.policy+Mental$X2011.Psychiatrists)
summary(me)
## Df Sum Sq Mean Sq F value Pr(>F)
## Mental$X2011.legislation 1 386 385.6 9.588 0.00233 **
## Mental$X2011.plan 1 45 45.5 1.131 0.28923
## Mental$X2011.policy 1 7 6.5 0.163 0.68714
## Mental$X2011.Psychiatrists 1 1474 1474.3 36.660 1.02e-08 ***
## Residuals 155 6233 40.2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
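To make the table concrete, the F value and p-value for legislation can be reproduced by hand from the printed sums of squares. This is a worked check using the rounded figures in the table above, so small rounding differences are expected.

```r
# Values read from the ANOVA table above (rounded as printed).
ms_legislation <- 385.6   # mean square for legislation (Df = 1)
ss_resid       <- 6233    # residual sum of squares
df_resid       <- 155     # residual degrees of freedom

# Residual mean square, then the F ratio.
ms_resid      <- ss_resid / df_resid
f_legislation <- ms_legislation / ms_resid

# Upper-tail probability of the F distribution with (1, 155) degrees of freedom.
p_legislation <- pf(f_legislation, df1 = 1, df2 = df_resid, lower.tail = FALSE)

c(F = f_legislation, p = p_legislation)
```

The result agrees with the reported F value of 9.588 and p-value of 0.00233 up to rounding.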
We examine the ANOVA results for the two-factor interaction effects below. None of the six pairwise interactions is statistically significant at the 5% level.
legislation x plan: p = 0.787 > 0.05, so the interaction effect is not significant.
ie_12 <- aov(Mental$X2012.suicide.rate~Mental$X2011.legislation*Mental$X2011.plan)
anova(ie_12)
## Analysis of Variance Table
##
## Response: Mental$X2012.suicide.rate
## Df Sum Sq Mean Sq F value
## Mental$X2011.legislation 1 385.6 385.57 7.8008
## Mental$X2011.plan 1 45.5 45.48 0.9202
## Mental$X2011.legislation:Mental$X2011.plan 1 3.6 3.62 0.0732
## Residuals 156 7710.7 49.43
## Pr(>F)
## Mental$X2011.legislation 0.005877 **
## Mental$X2011.plan 0.338914
## Mental$X2011.legislation:Mental$X2011.plan 0.787109
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
interaction.plot(Mental$X2011.plan,Mental$X2011.legislation,Mental$X2012.suicide.rate)
legislation x policy: p = 0.324 > 0.05, so the interaction effect is not significant.
ie_13 <- aov(Mental$X2012.suicide.rate~Mental$X2011.legislation*Mental$X2011.policy)
anova(ie_13)
## Analysis of Variance Table
##
## Response: Mental$X2012.suicide.rate
## Df Sum Sq Mean Sq F value
## Mental$X2011.legislation 1 385.6 385.57 7.8011
## Mental$X2011.policy 1 0.9 0.94 0.0190
## Mental$X2011.legislation:Mental$X2011.policy 1 48.5 48.47 0.9807
## Residuals 156 7710.4 49.43
## Pr(>F)
## Mental$X2011.legislation 0.005876 **
## Mental$X2011.policy 0.890530
## Mental$X2011.legislation:Mental$X2011.policy 0.323564
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
interaction.plot(Mental$X2011.policy,Mental$X2011.legislation,Mental$X2012.suicide.rate)
legislation x psychiatrists: p = 0.730 > 0.05, so the interaction effect is not significant.
ie_14 <- aov(Mental$X2012.suicide.rate~Mental$X2011.legislation*Mental$X2011.Psychiatrists)
anova(ie_14)
## Analysis of Variance Table
##
## Response: Mental$X2012.suicide.rate
## Df Sum Sq Mean Sq
## Mental$X2011.legislation 1 385.6 385.57
## Mental$X2011.Psychiatrists 1 1476.6 1476.60
## Mental$X2011.legislation:Mental$X2011.Psychiatrists 1 4.8 4.80
## Residuals 156 6278.4 40.25
## F value Pr(>F)
## Mental$X2011.legislation 9.5803 0.002332 **
## Mental$X2011.Psychiatrists 36.6890 9.971e-09 ***
## Mental$X2011.legislation:Mental$X2011.Psychiatrists 0.1192 0.730389
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
interaction.plot(Mental$X2011.Psychiatrists,Mental$X2011.legislation,Mental$X2012.suicide.rate)
plan x policy: p = 0.068 > 0.05, so the interaction effect is not significant at the 5% level (it would be significant only at the 10% level).
ie_23 <- aov(Mental$X2012.suicide.rate~Mental$X2011.plan*Mental$X2011.policy)
anova(ie_23)
## Analysis of Variance Table
##
## Response: Mental$X2012.suicide.rate
## Df Sum Sq Mean Sq F value Pr(>F)
## Mental$X2011.plan 1 89.1 89.051 1.7620 0.18631
## Mental$X2011.policy 1 1.4 1.430 0.0283 0.86662
## Mental$X2011.plan:Mental$X2011.policy 1 170.7 170.704 3.3776 0.06799 .
## Residuals 156 7884.2 50.540
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
interaction.plot(Mental$X2011.policy,Mental$X2011.plan,Mental$X2012.suicide.rate)
plan x psychiatrists: p = 0.884 > 0.05, so the interaction effect is not significant.
ie_24 <- aov(Mental$X2012.suicide.rate~Mental$X2011.plan*Mental$X2011.Psychiatrists)
anova(ie_24)
## Analysis of Variance Table
##
## Response: Mental$X2012.suicide.rate
## Df Sum Sq Mean Sq F value
## Mental$X2011.plan 1 89.1 89.05 2.2028
## Mental$X2011.Psychiatrists 1 1749.1 1749.08 43.2665
## Mental$X2011.plan:Mental$X2011.Psychiatrists 1 0.9 0.87 0.0214
## Residuals 156 6306.4 40.43
## Pr(>F)
## Mental$X2011.plan 0.1398
## Mental$X2011.Psychiatrists 6.835e-10 ***
## Mental$X2011.plan:Mental$X2011.Psychiatrists 0.8838
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
interaction.plot(Mental$X2011.Psychiatrists,Mental$X2011.plan,Mental$X2012.suicide.rate)
policy x psychiatrists: p = 0.572 > 0.05, so the interaction effect is not significant.
ie_34 <- aov(Mental$X2012.suicide.rate~Mental$X2011.policy*Mental$X2011.Psychiatrists)
anova(ie_34)
## Analysis of Variance Table
##
## Response: Mental$X2012.suicide.rate
## Df Sum Sq Mean Sq F value
## Mental$X2011.policy 1 32.3 32.35 0.8007
## Mental$X2011.Psychiatrists 1 1798.0 1797.98 44.5066
## Mental$X2011.policy:Mental$X2011.Psychiatrists 1 13.0 12.99 0.3216
## Residuals 156 6302.1 40.40
## Pr(>F)
## Mental$X2011.policy 0.3723
## Mental$X2011.Psychiatrists 4.169e-10 ***
## Mental$X2011.policy:Mental$X2011.Psychiatrists 0.5715
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
interaction.plot(Mental$X2011.Psychiatrists,Mental$X2011.policy,Mental$X2012.suicide.rate)
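The six pairwise models above could also be combined into a single model that contains all four main effects and all six two-factor interactions. The sketch below is an alternative formulation, not the analysis reported here, and it runs on simulated data: `toy`, its column names, and its values are hypothetical.

```r
# Simulated stand-in for the real data (hypothetical values).
set.seed(7)
n <- 160
toy <- data.frame(
  rate          = rexp(n, rate = 0.1),
  legislation   = factor(sample(1:2, n, replace = TRUE)),
  plan          = factor(sample(1:2, n, replace = TRUE)),
  policy        = factor(sample(1:2, n, replace = TRUE)),
  psychiatrists = factor(sample(1:2, n, replace = TRUE))
)

# (a + b + c + d)^2 expands to all main effects plus all two-way interactions.
fit_all <- aov(rate ~ (legislation + plan + policy + psychiatrists)^2,
               data = toy)
summary(fit_all)

# List the model terms: 4 main effects followed by 6 interactions.
term_labels <- attr(terms(fit_all), "term.labels")
term_labels
```

Because `aov` reports sequential sums of squares and the cell counts are unequal, the p-values from such a combined model would not necessarily match those in the six separate tables. The same dependence on term order is visible above, where the sum of squares for plan is 45.5 when it is entered after legislation and 89.1 when it is entered first.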
Quantile-Quantile (Q-Q) plots are graphs used to verify the distributional assumption for a set of data. An approximately linear pattern in the Q-Q plot of the residuals would justify the use of ANOVA to test for significant differences. However, when we check the main-effects model, the residuals are not normally distributed in our experiment, so the ANOVA results should be interpreted with caution.
qqnorm(residuals(me))
qqline(residuals(me))
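A numerical complement to the Q-Q plot is the Shapiro-Wilk test on the residuals. This test is not part of the original analysis; the sketch below only shows the call, using a simulated right-skewed response, so `toy` and its values are hypothetical.

```r
# Simulated data with a right-skewed response (hypothetical values).
set.seed(42)
n <- 160
toy <- data.frame(
  rate          = rexp(n, rate = 0.1),
  legislation   = factor(sample(1:2, n, replace = TRUE)),
  psychiatrists = factor(sample(1:2, n, replace = TRUE))
)

# Main-effects model, then a Shapiro-Wilk normality test on its residuals.
fit <- aov(rate ~ legislation + psychiatrists, data = toy)
sw  <- shapiro.test(residuals(fit))

# A small p-value is evidence against normally distributed residuals.
sw$p.value
```

For the model in this report the equivalent call would be `shapiro.test(residuals(me))`.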
A residuals vs. fits plot is a common graph used in residual analysis. It is a scatter plot of the residuals against the fitted values (the estimated responses). There are a few mild outliers in the ‘suicide rate’ response variable when we check the main-effects model.
plot(fitted(me),residuals(me))
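Outliers seen in the residual plot can also be listed numerically with standardized residuals. The sketch below is an illustration on simulated data with one planted outlier; the cutoff of 2 is a common rule of thumb assumed here, not a threshold taken from this analysis.

```r
# Simulated two-group data with one planted outlier at position 7 (hypothetical).
set.seed(3)
n     <- 40
group <- factor(rep(1:2, each = 20))
y     <- 5 + 2 * (group == "2") + rnorm(n)
y[7]  <- 30

fit <- aov(y ~ group)

# Standardized residuals: residuals scaled by their estimated standard deviation.
std_res <- rstandard(fit)

# Observations whose standardized residual exceeds 2 in absolute value.
flagged <- which(abs(std_res) > 2)
flagged
```

Applied to the main-effects model, `which(abs(rstandard(me)) > 2)` would return the row numbers of the countries that stand out.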
Montgomery, Douglas C. Design and Analysis of Experiments, 8th Edition.
Raw Data: http://apps.who.int/gho/data/node.main.MENTALHEALTH?lang=en