For this assignment, we will be working to understand the impact of different working models on the perceived satisfaction of employees with regard to their work/life balance. Corporate goals often incentivize working longer hours and in more demanding roles. However, alternative models exist. Some businesses and countries are experimenting with shorter working weeks or changes to their corporate cultures.
To further examine the issue, a team of researchers partnered with a number of companies to examine the impact of their corporate culture and the length of the working week on the overall satisfaction of employees with regard to their work/life balance. The research team had previously classified the working culture of these companies as either relaxed or demanding based upon validated research tools.
The researchers conducted an experiment with working weeks of different lengths. Prior to the experiment, all of these companies operated with conventional 5-day working weeks and standard hours. Each company was randomized to implement either a 3-day working week, a 4-day working week, or to maintain its conventional 5-day working week. The overall number of expected working hours was held in proportion to the working week (e.g. 8 hours per day for the number of days worked). However, the compensation of the participating employees remained fixed at their prior levels. Training was provided to the managers and the employees to set reasonable expectations for what should be accomplished in the shortened working weeks. The companies were monitored to ensure compliance with the schedule and the expectations. The study was conducted over a period of 6 weeks.
At the end of this period, consenting employees were given a survey that assessed their satisfaction with their balance of work and life. The answers were combined into an overall measure of satisfaction ranging from 0 to 100.
In this assignment, we will be working with the information provided to analyze the satisfaction scores and consider other possible implications of changes in the typical working conditions of companies.
The data are available in the file work and life balance.csv.
For each consenting employee, information on their years of experience and whether they are a manager was collected. Data about each employee’s company was recorded, including its identifier, industry, and the assessment of its working culture. The company’s randomly assigned workweek was included, and each employee’s overall satisfaction score was recorded.
Based upon the information above and the data provided, please answer the following questions. When numeric answers are requested, a few sentences of explanation should also be provided. Please show your code for any calculations performed.
This section of the report is reserved for any work you plan to do ahead of answering the questions – such as loading or exploring the data.
library(readr)
library(ggplot2)
library(ggpubr)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ tibble 3.0.6 ✓ dplyr 1.0.5
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ purrr 0.3.4 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(broom)
library(data.table)
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:dplyr':
##
## between, first, last
## The following object is masked from 'package:purrr':
##
## transpose
library(pwr)
data <- read_csv("work and life balance-1.csv")
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## years.experience = col_double(),
## manager = col_logical(),
## company = col_character(),
## industry = col_character(),
## culture = col_character(),
## workweek = col_character(),
## satisfaction = col_double()
## )
What are the primary research questions of the study? State them clearly in plain language. Then briefly explain the importance of this investigation.
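This investigation is about the impact of corporate culture and the length of the working week on the overall satisfaction of employees with regard to their work/life balance. Therefore the questions we will investigate are:
1. Does a company’s working culture (relaxed versus demanding) affect employees’ satisfaction with their work/life balance?
2. Does the length of the working week (3, 4, or 5 days) affect employees’ satisfaction with their work/life balance?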
This study tests a conventional 5-day working week as well as alternative 3-day and 4-day working weeks. This is an important step towards companies potentially changing the working week to be less demanding on employees. If models other than the traditional 5-day week prove to increase satisfaction in regards to work/life balance, it could be beneficial to both companies and employees to continue researching the implications of these models and determine if they are viable options beyond this study. Similarly, when shaping corporate culture, it is valuable to understand the impact that culture has on employee satisfaction.
For each research question you mentioned above, describe how well the study is designed to evaluate the question.
For the first research question, regarding corporate culture’s influence on employee work/life balance satisfaction, the study is not well designed to establish a causal effect. The research team classified the working culture of these companies before the experiment began, and culture was not randomly assigned, whereas the satisfaction scores were measured at the conclusion of the experiment. With the information we have from the study, we cannot confirm that each company’s culture would be classified the same way after the working weeks were altered. Because culture is a pre-existing characteristic rather than a treatment of the experiment, any association between culture and satisfaction could be confounded by other differences between companies, so we can’t prove causation between culture and satisfaction score.
For the second research question, regarding the length of the working week’s influence on employee work/life balance satisfaction, the study is well designed to evaluate the impact. The length of the working week is the treatment variable in the experiment. It is randomized across companies and other factors are controlled. This design should allow us to understand the direct impact of the length of the working week on the satisfaction score.
What kind of statistical method could be employed to analyze the data and evaluate the research questions?
hist(data$satisfaction)
A statistical method that would be appropriate to analyze the data and evaluate the two research questions is regression. Our dependent variable, satisfaction, is numerical on a scale of 0 to 100, and we can see from the histogram above that it is approximately normally distributed, so we can reasonably perform a linear regression with satisfaction as our response. This will allow us to evaluate our research questions by seeing if and how our independent variables, culture and work week, influence the satisfaction scores. We will also be able to explore if and how the other variables measured in the study relate to satisfaction scores.
Fit your intended model and show a summary of its results. While you may include other variables, we will specifically exclude the company from the analysis because these effects would not generalize as well to the broader industries.
model4 <- lm(satisfaction ~ . -company, data = data)
summary(model4)
##
## Call:
## lm(formula = satisfaction ~ . - company, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.9151 -3.5342 0.1197 3.7360 15.3489
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 74.66164 0.77144 96.782 < 2e-16 ***
## years.experience 0.10025 0.01936 5.177 3.29e-07 ***
## managerTRUE -5.04069 0.61409 -8.208 1.98e-15 ***
## industryEngineering -0.75220 0.78660 -0.956 0.339
## industryHealth Care -3.27432 0.64320 -5.091 5.09e-07 ***
## cultureRelaxed 7.44416 0.71554 10.404 < 2e-16 ***
## workweek4 Days -5.74359 0.77326 -7.428 4.92e-13 ***
## workweek5 Days -14.92846 0.90103 -16.568 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.545 on 492 degrees of freedom
## Multiple R-squared: 0.5298, Adjusted R-squared: 0.5231
## F-statistic: 79.2 on 7 and 492 DF, p-value: < 2.2e-16
Explain the results of your model. Describe how the estimates relate to your research questions and any other notable findings.
The results of the model indicate that culture and work week length both have a significant association with satisfaction score, holding the other variables in the model fixed. The variable cultureRelaxed has a positive coefficient of 7.44 and a p-value below 0.05, suggesting that participants who work at a company with a relaxed culture score about 7.4 points higher on average than participants who work at a company with a demanding culture. The variable workweek4 Days has a negative coefficient of -5.74 and a p-value below 0.05, suggesting that participants in the 4-day work week group score about 5.7 points lower on average than participants in the 3-day work week group, which is the reference level. The variable workweek5 Days has a negative coefficient of -14.93 and a p-value below 0.05, suggesting that participants in the 5-day work week group score about 14.9 points lower on average than participants in the 3-day work week group. It is also notable that years of experience has a significant positive association with satisfaction score (about 0.1 points per additional year), managers score significantly lower than non-managers (about 5.0 points), and employees in the health care industry score significantly lower than those in the education industry (about 3.3 points). The engineering industry does not differ significantly from education (p = 0.339). The R-squared value of this model is approximately 53%, meaning that the model explains about half of the variation in satisfaction scores.
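The summary reports each working week relative to the 3-day reference group, so the 4-day versus 5-day contrast does not appear directly. As a rough sketch, it can be read off as the difference between the two workweek coefficients. The code below uses only the rounded estimates printed above; the profile of 10 years of experience is a hypothetical choice for illustration, and the names `b`, `diff_5_vs_4`, and `pred_relaxed_3day` are not part of the original analysis.

```r
# Rounded estimates copied from the summary of model4 above
b <- c(intercept = 74.66164,
       years.experience = 0.10025,
       cultureRelaxed = 7.44416,
       workweek4 = -5.74359,
       workweek5 = -14.92846)

# Adjusted difference between the 5-day and 4-day groups
diff_5_vs_4 <- unname(b["workweek5"] - b["workweek4"])

# Predicted score for a hypothetical non-manager in the reference industry
# with 10 years of experience, a relaxed culture, and a 3-day week
pred_relaxed_3day <- unname(b["intercept"] +
                            10 * b["years.experience"] +
                            b["cultureRelaxed"])

round(diff_5_vs_4, 2)
round(pred_relaxed_3day, 2)
```

This gives only the point estimate of the contrast. A standard error for it would require the coefficient covariance matrix, for example from `vcov(model4)`, which is not shown in the output above.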
Would variable interactions also play a role? If your research question includes multiple independent variables, then include pairwise interactions with them. If you think there is only one independent variable in the study, then create an interaction between that variable and other measured factors that you might consider relevant. Show the numeric results and comment on the interactions.
model6 <- lm(satisfaction ~ . -company + culture*workweek, data = data)
summary(model6)
##
## Call:
## lm(formula = satisfaction ~ . - company + culture * workweek,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.8728 -3.6139 0.0598 3.6726 15.1611
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 74.6250 0.8875 84.084 < 2e-16 ***
## years.experience 0.1013 0.0194 5.222 2.62e-07 ***
## managerTRUE -5.0509 0.6133 -8.236 1.64e-15 ***
## industryEngineering -0.4599 0.8015 -0.574 0.566
## industryHealth Care -3.1538 0.6691 -4.714 3.18e-06 ***
## cultureRelaxed 7.3983 1.2066 6.132 1.79e-09 ***
## workweek4 Days -5.6713 0.8552 -6.632 8.78e-11 ***
## workweek5 Days -16.5866 1.4981 -11.072 < 2e-16 ***
## cultureRelaxed:workweek4 Days -1.4758 1.7716 -0.833 0.405
## cultureRelaxed:workweek5 Days 1.7432 1.7470 0.998 0.319
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.537 on 490 degrees of freedom
## Multiple R-squared: 0.533, Adjusted R-squared: 0.5245
## F-statistic: 62.15 on 9 and 490 DF, p-value: < 2.2e-16
Variable interactions could play a role in how our independent variables, culture and work week, affect our dependent variable, satisfaction. It is reasonable that changing the length of the work week could affect the satisfaction differently between companies that are demanding and companies that are relaxed. For instance, employees in relaxed companies may feel that they are still able to get all of their work done in a shorter amount of time and therefore their satisfaction improves with a shorter week, but employees at demanding companies may feel that the shorter schedule is impeding their ability to perform their duties and therefore their satisfaction becomes negative or less positive with a shorter week. Thus this model includes an interaction between culture and work week. However, the p-values in these interactions (0.405 and 0.319) are higher than our significance level of 0.05. These high p-values indicate that in this model, the interaction between culture and workweek does not have a significant association with employee satisfaction.
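The two interaction coefficients can also be assessed jointly with a nested-model F-test rather than one at a time. The sketch below approximates that test using only the residual standard errors and degrees of freedom printed in the two summaries, so the result is subject to rounding. The variable names are illustrative and not part of the original analysis.

```r
# Values copied from the two model summaries above (rounded as printed)
rse_main <- 5.545; df_main <- 492   # model4: no interaction
rse_int  <- 5.537; df_int  <- 490   # model6: culture x workweek interaction

# Residual sums of squares recovered from the residual standard errors
rss_main <- rse_main^2 * df_main
rss_int  <- rse_int^2 * df_int

# Nested-model F statistic for the two added interaction coefficients
extra_terms <- df_main - df_int
f_stat <- ((rss_main - rss_int) / extra_terms) / (rss_int / df_int)
p_value <- pf(f_stat, extra_terms, df_int, lower.tail = FALSE)

c(F = f_stat, p = p_value)
```

With the data loaded, `anova(model4, model6)` should give the exact version of this comparison. The approximate p-value here is well above 0.05, which is consistent with the individual interaction terms being non-significant.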
Are there other variables that would be helpful to measure? Is this even necessary? Explain your answer and reasoning.
It could be helpful to measure the hours of work in a day. While this has been kept proportional between the working week groups, it is not necessarily equal between each company. There could be some interaction between the work week length and the hours per day in impacting satisfaction with work/life balance. For instance, at a company where employees work 10 hours a day, they could see a bigger improvement in satisfaction switching to a shorter week than at a company where employees work 8 hours a day.
The same could be true for compensation. While it has been kept equal between working week groups, it is not equal for every role, every experience level, every company, or every industry. It could be valuable to see if compensation level has any effect on how the different work week lengths affect satisfaction.
Additionally, while it is outside the scope of examining the impact these variables can have on the satisfaction of employees with regard to their work/life balance, there are other things that should be considered in determining whether or not to actually implement a new work week. For example, it would be important to measure the work output and any meaningful business results, because these could be negatively correlated with employee satisfaction with work/life balance.
What if we wanted to compare all of the average satisfaction scores in the three groups of working weeks? For this analysis, you may ignore the other variables. Show the results of a statistical test to simultaneously evaluate the difference in satisfaction for all of the pairs of possible working weeks. Comment on the results.
model8 <- aov(satisfaction ~ workweek, data = data)
tukey <- as.data.table(x = TukeyHSD(x = model8)[[1]], keep.rownames = TRUE)
tukey
## rn diff lwr upr p adj
## 1: 4 Days-3 Days -9.185567 -11.160426 -7.210708 1.193158e-10
## 2: 5 Days-3 Days -11.880235 -13.831328 -9.929142 1.192965e-10
## 3: 5 Days-4 Days -2.694668 -4.277937 -1.111399 2.143526e-04
Tukey’s HSD method compares every pair of group means while adjusting the p-values and confidence intervals to account for multiple testing. So using this method, we are able to simultaneously evaluate the difference in satisfaction for all pairs of working weeks (4 Days-3 Days, 5 Days-3 Days, 5 Days-4 Days). All three adjusted p-values, shown in the far right column above, are below our significance level of 0.05, and none of the three confidence intervals contains zero. This indicates that there is evidence of a significant difference in mean satisfaction scores between all three pairs of possible working weeks, with the shorter week having the higher mean in every pair.
Now conduct separate tests of whether a shorter working schedule increases satisfaction for each pair of schedules. Which of these results would remain significant with a Bonferroni correction for multiple comparisons? Show the p-values for the t-tests, the corrected threshold for a 0.05 significance level, and whether the differences remain significant after the adjustment.
threes <- data %>%
filter(workweek == "3 Days")
fours <- data %>%
filter(workweek == "4 Days")
fives <- data %>%
filter(workweek == "5 Days")
significance_level = 0.05
confidence_level = 1 - significance_level
num_tests = 3
Bonferroni_level = 1 - significance_level/num_tests
Bonferroni_level
## [1] 0.9833333
three_four <- t.test(x = threes$satisfaction, y = fours$satisfaction, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = confidence_level)
three_four
##
## Welch Two Sample t-test
##
## data: threes$satisfaction and fours$satisfaction
## t = 10.394, df = 191.65, p-value < 2.2e-16
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 7.724848 Inf
## sample estimates:
## mean of x mean of y
## 77.59794 68.41237
four_five <- t.test(x = fours$satisfaction, y = fives$satisfaction, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = confidence_level)
four_five
##
## Welch Two Sample t-test
##
## data: fours$satisfaction and fives$satisfaction
## t = 4.0348, df = 385.46, p-value = 3.295e-05
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 1.593503 Inf
## sample estimates:
## mean of x mean of y
## 68.41237 65.71770
three_five <- t.test(x = threes$satisfaction, y = fives$satisfaction, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = confidence_level)
three_five
##
## Welch Two Sample t-test
##
## data: threes$satisfaction and fives$satisfaction
## t = 14.117, df = 167.1, p-value < 2.2e-16
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 10.48827 Inf
## sample estimates:
## mean of x mean of y
## 77.59794 65.71770
three_four_adjusted <- t.test(x = threes$satisfaction, y = fours$satisfaction, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = Bonferroni_level)
three_four_adjusted
##
## Welch Two Sample t-test
##
## data: threes$satisfaction and fours$satisfaction
## t = 10.394, df = 191.65, p-value < 2.2e-16
## alternative hypothesis: true difference in means is greater than 0
## 98.33333 percent confidence interval:
## 7.291224 Inf
## sample estimates:
## mean of x mean of y
## 77.59794 68.41237
four_five_adjusted <- t.test(x = fours$satisfaction, y = fives$satisfaction, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = Bonferroni_level)
four_five_adjusted
##
## Welch Two Sample t-test
##
## data: fours$satisfaction and fives$satisfaction
## t = 4.0348, df = 385.46, p-value = 3.295e-05
## alternative hypothesis: true difference in means is greater than 0
## 98.33333 percent confidence interval:
## 1.268336 Inf
## sample estimates:
## mean of x mean of y
## 68.41237 65.71770
three_five_adjusted <- t.test(x = threes$satisfaction, y = fives$satisfaction, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = Bonferroni_level)
three_five_adjusted
##
## Welch Two Sample t-test
##
## data: threes$satisfaction and fives$satisfaction
## t = 14.117, df = 167.1, p-value < 2.2e-16
## alternative hypothesis: true difference in means is greater than 0
## 98.33333 percent confidence interval:
## 10.07442 Inf
## sample estimates:
## mean of x mean of y
## 77.59794 65.71770
There are three pairs of tests: comparing the difference in means between the 3-day week and the 4-day week, between the 4-day week and the 5-day week, and between the 3-day week and the 5-day week. Each of these three tests provides evidence that a shorter working schedule increases satisfaction. The value 0.9833333 printed above is the Bonferroni-adjusted confidence level; the corresponding corrected significance threshold is 0.05/3, or approximately 0.0167. The largest of the three p-values is 3.295e-05 (4-day versus 5-day), and the other two are below 2.2e-16, so all three differences are significant both at our original significance level of 0.05 and at the Bonferroni-corrected threshold. This confirms the same conclusion found using the Tukey method in Question #8.
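The Bonferroni comparison can be made explicit in two equivalent ways: compare the raw p-values with the corrected threshold, or inflate the p-values and compare them with the original significance level. The sketch below uses the p-values from the t-test output above, with the two values reported as "< 2.2e-16" replaced by that upper bound as a conservative assumption.

```r
alpha <- 0.05
num_tests <- 3
bonferroni_threshold <- alpha / num_tests

# One-sided Welch t-test p-values from the output above; the two values
# printed as "< 2.2e-16" are replaced by that upper bound
p_values <- c(three_four = 2.2e-16, four_five = 3.295e-05, three_five = 2.2e-16)

# Option 1: compare the raw p-values with the corrected threshold
significant_raw <- p_values < bonferroni_threshold

# Option 2: inflate the p-values and compare with the original alpha
p_adjusted <- p.adjust(p_values, method = "bonferroni")
significant_adj <- p_adjusted < alpha

bonferroni_threshold
significant_raw
p_adjusted
```

Both options lead to the same decisions, since multiplying each p-value by the number of tests is equivalent to dividing the threshold by it.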
Do you think the 6 week time frame is an appropriate length to investigate the effect of changes in the working schedule on the satisfaction of work/life balance? Explain why or why not.
We can see from the results of our statistical tests that 6 weeks was enough time to see a meaningful difference in satisfaction scores regarding work/life balance among the different work week lengths. I do think that additional time is needed to determine the actual magnitude of the effects, since people can’t always adjust to major changes in 6 weeks. For instance, it took many organizations longer than 6 weeks to refine work from home schedules when offices closed due to COVID, and I personally did not feel fully adjusted to the schedule in that amount of time. However, it is important to remember that this experiment could take a major toll on companies’ productivity, and 6 weeks is already a lot of time to dedicate. I think that it was appropriate to make this initial experiment 6 weeks, and we are able to see some differences already, but it should be thought of more as a check-in point than the basis for any major long-term decisions like changing the length of the working week. The insights from this 6 week period can be used as a starting point to determine whether additional research for a longer period of time seems worthwhile or not.
Now the researchers would like to build upon the work of the first experiment they conducted. In the comments on the surveys, a sizable number of the employees in the first study noted that they did not get enough sleep with a 5-day working week. Anecdotally, those working the shorter weeks during the experiment frequently mentioned the benefit of getting enough rest.
With this in mind, the researchers would like your help in planning their next experiment. They would once again like to randomize companies to shorter working weeks. Based on the feedback of the previous experiment, a 3-day working week would not be very practical for the companies, while 4 days seemed more actionable. Comparing the amount of sleep of employees with 4-day schedules to the amount of sleep of those with 5-day schedules, how would you conduct the experiment to answer this question? State a research question, comment on the operational designs, and describe the type of data you would gather.
A research question for this experiment could be the following:
Does shortening the work week from 5 days to 4 days increase the amount of sleep that employees get?
After companies are randomized to 4-day and 5-day working weeks and those changes are implemented, we can start measuring the amount of sleep that employees are getting. This can be recorded manually by each employee every night, or if the resources are available, it can be recorded automatically using a monitoring device such as a FitBit or Apple Watch for more accuracy. These devices can measure things like the total amount of time in a sleep session, the amount of time in a full sleep versus partially or fully awake during the night, and a “score” indicating the restfulness or quality of the sleep. Whichever measure we choose to use as our dependent variable, it should be reasonably indicative of “amount of sleep” and should be measured consistently across all participants.
The employees measured must be randomly sampled in order to mitigate bias. It will be important to control for company, experience level, manager status, industry, and culture so that the only meaningful difference between participants in the two groups is the treatment of work week length. This will allow us to reasonably attribute any measured differences in sleep to the work week length.
One way to measure sleep by employee is the average total time slept in a week. This can also be an average time per night. Then we can perform statistical tests to determine if the mean amount of sleep (per week or per night) is significantly higher for employees with the 4-day work week than for employees with the 5-day work week.
What kind of statistical test would be appropriate for your research question? Provide sufficient details on all of the choices you would make.
It would be appropriate to perform a one-sided, two-sample t-test to evaluate this research question. The two samples are the 4-day work week group and the 5-day work week group. The response variable is time asleep. We are using a one-sided test because we are hypothesizing that the mean amount of sleep for participants in the 4-day work week is greater than the mean amount of sleep for participants in the 5-day work week. The results of this test should reveal if there is evidence of participants in the 4-day work week group getting significantly more sleep than participants in the 5-day work week group.
What is the smallest amount of additional average sleeping time that would constitute a meaningful improvement for the typical employee? Explain your reasoning.
Sleep experts recommend that the ideal time for a nap is 20 minutes, because that is, to put it very simplistically, the smallest amount of time for meaningful rest. Therefore, I think that 20 minutes of additional sleeping time per night would be the minimum amount to constitute a meaningful improvement for the typical employee. 20 minutes a night does not seem like much, but it adds up to 2 1/3 extra hours per week and over 10 extra hours per month.
The researchers are hoping to sample approximately 200 employees for the new study, evenly divided into two groups of 100. What would be the power of your proposed statistical test in this scenario? Use your suggested effect size from the previous question in units of hours and a significance level of 0.05. For now, assume that the standard deviation of sleeping times is 1 hour. Produce a numeric answer and then comment on the results.
effect_size = 1/3
standard_dev = 1
n1 = 100
n2 = 100
pwr.t2n.test(n1 = n1, n2 = n2, d = effect_size/standard_dev, sig.level = significance_level, alternative = "greater")
##
## t test power calculation
##
## n1 = 100
## n2 = 100
## d = 0.3333333
## sig.level = 0.05
## power = 0.7593159
## alternative = greater
The power of this two-sample t-test is 0.7593159. This means that given our sample size, effect size, and significance level, the likelihood of rejecting the null hypothesis when it is false is approximately 75.9%. This is slightly below the commonly used benchmark of 0.8, meaning that there is roughly a 1 in 4 chance of failing to detect a true 20-minute improvement with these sample sizes.
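To make the calculation less of a black box, the same power can be sketched in base R from the noncentral t distribution. The helper `power_one_sided` is a hypothetical name introduced here for illustration; it assumes equal variances in the two groups and a one-sided "greater" alternative, matching the settings used above.

```r
# Power of a one-sided, two-sample t-test from the noncentral t distribution
power_one_sided <- function(n1, n2, d, alpha = 0.05) {
  df     <- n1 + n2 - 2                  # pooled degrees of freedom
  ncp    <- d / sqrt(1 / n1 + 1 / n2)    # noncentrality parameter
  t_crit <- qt(1 - alpha, df)            # one-sided critical value
  # Probability that the test statistic exceeds the critical value
  pt(t_crit, df, ncp = ncp, lower.tail = FALSE)
}

power_20min <- power_one_sided(n1 = 100, n2 = 100, d = (1 / 3) / 1)
power_20min
```

The result should agree closely with the `pwr.t2n.test` output above. The noncentrality parameter grows with both the standardized effect size and the sample sizes, which is why either a larger effect or more participants raises the power.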
It may be difficult to convince companies to consider a 4-day working week and to convince employees to provide you with their records of sleep. How would these results change if you could only get 30 employees in the 4-day working week? Assume that the other inputs from the previous question will be used. Calculate the power and comment on the results, along with the differences from the previous question.
effect_size = 1/3
standard_dev = 1
n1 = 100
n2 = 30
pwr.t2n.test(n1 = n1, n2 = n2, d = effect_size/standard_dev, sig.level = significance_level, alternative = "greater")
##
## t test power calculation
##
## n1 = 100
## n2 = 30
## d = 0.3333333
## sig.level = 0.05
## power = 0.4792477
## alternative = greater
The power of this two-sample t-test is 0.4792477. This means that given our sample sizes, effect size, and significance level, the likelihood of rejecting the null hypothesis when it is false is approximately 47.9%. This is a much lower power than the 75.9% obtained with 100 employees in each group: with only 30 employees in the 4-day group, we would be more likely than not to miss a true 20-minute improvement.
Assuming that we hold the other inputs fixed from the previous 2 questions, what sample size would be needed in the 4-day working week group to achieve a power of 0.9? Make sure to round your answer up to a whole number.
effect_size = 1/3
standard_dev = 1
n1 = 100
power = 0.9
pwr.t2n.test(n1 = n1, d = effect_size/standard_dev, sig.level = significance_level, power = power, alternative = "greater")
##
## t test power calculation
##
## n1 = 100
## n2 = 340.7823
## d = 0.3333333
## sig.level = 0.05
## power = 0.9
## alternative = greater
Our result for n2 is 340.7823. This means that, assuming that we hold the inputs fixed from the previous 2 questions, we would need a sample size of at least 341 people in the 4-day working group to achieve a power of 0.9.
Describe the trade-offs between power and sample size in this setting. Include considerations of the statistical issues along with the practical aspects of running the experiment.
As sample size increases, standard error decreases, so effects become a larger number of standard errors away from the null value, and thus are easier to detect. Therefore, as sample size increases, power increases along with it. If we want to increase power, increasing the sample size is an effective way to do so. However, there are trade-offs we must consider, because increasing the sample size means that the experiment becomes more resource intensive and likely more expensive to run. Conversely, if we keep a small sample size, the experiment will be easier to conduct, but the power will be lower and thus it will be more difficult to detect the desired testing effects.
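One practical aspect of this trade-off is how participants are split between the groups. As a hedged illustration, the sketch below uses `power.t.test` from base R to solve for the per-group sample size that gives a power of 0.9 when the two groups are the same size, and compares the total with the unbalanced 100 + 341 split found earlier. It assumes the same inputs as above (a 20-minute effect, a 1-hour standard deviation, and a one-sided test at 0.05).

```r
# Balanced design: solve for the per-group sample size giving 90% power
balanced <- power.t.test(delta = 1 / 3, sd = 1, sig.level = 0.05,
                         power = 0.9, type = "two.sample",
                         alternative = "one.sided")
n_per_group <- ceiling(balanced$n)

# Total participants: balanced design vs. the unbalanced 100 + 341 split
total_balanced   <- 2 * n_per_group
total_unbalanced <- 100 + 341

c(n_per_group = n_per_group,
  total_balanced = total_balanced,
  total_unbalanced = total_unbalanced)
```

Under these assumptions the balanced design reaches the same power with noticeably fewer participants in total, so recruiting more employees into the 5-day group as well as the 4-day group may be cheaper than expanding only one group.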
In our earlier analyses, we had assumed that the standard deviation of sleeping times was 1 hour. What if this assumption is incorrect? For now, you may consider an experiment with 100 sampled employees in each treatment group and a significance level of 0.05. Describe how the power changes if our assumption is wrong in each direction.
effect_size = 1/3
standard_dev_lower = 3/4
n1 = 100
n2 = 100
pwr.t2n.test(n1 = n1, n2 = n2, d = effect_size/standard_dev_lower, sig.level = significance_level, alternative = "greater")
##
## t test power calculation
##
## n1 = 100
## n2 = 100
## d = 0.4444444
## sig.level = 0.05
## power = 0.9315029
## alternative = greater
standard_dev_higher = 5/4
pwr.t2n.test(n1 = n1, n2 = n2, d = effect_size/standard_dev_higher, sig.level = significance_level, alternative = "greater")
##
## t test power calculation
##
## n1 = 100
## n2 = 100
## d = 0.2666667
## sig.level = 0.05
## power = 0.5926304
## alternative = greater
If the assumption on standard deviation is incorrect in either direction, the power of our test changes. When the standard deviation decreases, our effect size is relatively higher and therefore easier to detect, so the power increases. For example, if we assume 100 sampled employees in each treatment group and a significance level of 0.05, lowering the standard deviation of sleep amount from 1 hour to 45 minutes increases our power from 0.759 to 0.932. Conversely, when the standard deviation increases, our effect size is relatively smaller and therefore harder to detect, so the power decreases. For example, if we again assume 100 sampled employees in each treatment group and a significance level of 0.05, raising the standard deviation from 1 hour to 75 minutes decreases our power from 0.759 to 0.593.
Suppose we had been able to add a third group to the planned study so that we could test the 3-day, 4-day, and 5-day working weeks. We would like to study the differences in mean nightly sleeping time across these groups using a one-way ANOVA model. The experiment would have 100 employees in each group while planning for a power of 0.8 using a significance level of 0.05. Under these circumstances, what effect size could be detected? Convert the calculated effect size into minutes under an assumption that the standard deviation is 1 hour.
groups = 3
n = 100
power = 0.8
pwr.anova.test(k = groups, n = n, sig.level = significance_level, power = power)
##
## Balanced one-way analysis of variance power calculation
##
## k = 3
## n = 100
## f = 0.1801187
## sig.level = 0.05
## power = 0.8
##
## NOTE: n is number in each group
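# From the results above
f = 0.1801187
# Cohen's f is the SD of the group means divided by the within-group SD,
# so multiply by the SD (in hours) and convert to minutes
minutes = (f*standard_dev)*60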
minutes
## [1] 10.80712
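To determine what effect size can be detected at a power of 0.8, we use a power calculation for a one-way ANOVA model with three groups. The results of the calculation indicate that the detectable effect size, Cohen’s f, is 0.1801187. Cohen’s f is the standard deviation of the group means divided by the within-group standard deviation, so assuming a standard deviation of 1 hour, this is equivalent to a spread of approximately 10.8 minutes among the three group means. Therefore, under these circumstances, we would be able to detect differences in mean nightly sleep across the three groups when the group means vary with a standard deviation of about 10.8 minutes. Note that this is a measure of overall spread rather than the difference between any one pair of groups.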
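Because Cohen’s f describes the spread of the group means rather than a pairwise difference, it can help to translate it into one concrete pattern of means. The sketch below assumes, purely for illustration, that the three group means are evenly spaced (one group below the middle group by some amount and one above it by the same amount). The names `sigma_means_min` and `delta_min` are hypothetical.

```r
f <- 0.1801187        # Cohen's f from the power calculation above
sd_hours <- 1         # assumed within-group standard deviation (hours)

# Standard deviation of the three group means, in minutes
sigma_means_min <- f * sd_hours * 60

# Assumption for illustration: means evenly spaced at -delta, 0, +delta.
# Their population SD is delta * sqrt(2/3), so solve for delta.
delta_min <- sigma_means_min / sqrt(2 / 3)
means <- c(-delta_min, 0, delta_min)

# Check: the spread of these means reproduces sigma_means_min
check_sd <- sqrt(mean((means - mean(means))^2))

c(sigma_means_min = sigma_means_min,
  delta_min = delta_min,
  check_sd = check_sd)
```

Under this evenly spaced assumption, adjacent groups would differ by a little over 13 minutes of sleep per night. Other patterns of means with the same f would imply different pairwise differences.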
Taking into account your analyses and statistical planning, what kind of recommendations would you make to the companies in order to help them to improve the satisfaction of their employees with regard to work/life balance?
The results of these analyses demonstrate that shortening the work week does indeed have a significant positive effect on employee satisfaction with regard to work/life balance. If shortening the work week is a viable potential long-term option for the companies, they should perform additional longer-term analyses to figure out how this would affect employee satisfaction as well as relevant business outcomes. But if shortening the work week is not a viable long-term option, there are still actionable insights related to this experiment that could be meaningful to the companies. For example, if the number of hours worked per week can be reduced even by a small amount, it’s possible that this would still increase employee satisfaction with regard to work/life balance. It is also worth further exploring the amount of sleep employees are getting based on how long they spend at work and other factors, and how sleep affects satisfaction with regard to work/life balance and thus, ultimately, performance in the workplace. Companies could also study the correlation between satisfaction scores and workforce metrics like employee retention and turnover.
While the satisfaction score survey was implemented as part of a short-term experiment, that practice does not have to go away just because the experiment has concluded. Companies can build on the work done in these analyses by continuously monitoring employee satisfaction and measuring the impact that other workplace decisions have on it.