Does increasing the daily programming hours make a worker complete more points in a week?
Hypothesis: increasing the programming workday by 1 hour does not make a worker complete more points in a week.
The hours have been rounded to the nearest whole hour. For example, 36.87 is rounded up to 37 and 36.40 is rounded down to 36.
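The rounding rule can be checked with a minimal R sketch. It assumes rounding to the nearest whole hour, and the vector name `hours_raw` is hypothetical.

```r
# Hypothetical raw weekly hours (illustrative, not read from clean.csv)
hours_raw <- c(36.87, 36.40)

# round() with the default digits = 0 rounds to the nearest whole number:
# 36.87 -> 37 and 36.40 -> 36
hours_rounded <- round(hours_raw)
print(hours_rounded)
```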
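library(readxl)
library(readr)
library(dplyr)    # provides %>%, group_by() and summarise() used below
library(ggplot2)  # provides ggplot() used below
AlleySprint <- read_csv("AlleySprint.csv")
# clean.csv is semicolon-delimited; parse the numeric columns explicitly
clean <- read_delim("clean.csv", ";", escape_double = FALSE,
                    col_types = cols(Hours_p_point = col_number(),
                                     Points = col_number(), Program_Hours = col_number(),
                                     regime = col_number()), trim_ws = TRUE)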
DT::datatable(clean[,1:7])
From week xxx to week xx, the analysis covers 35 weeks, about 1211 programming hours, and 315 points. On average, one point takes 3.44 hours.
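An average of hours per point can be computed in two ways that generally give different numbers: the mean of the weekly ratios, or total hours divided by total points. The sketch below uses made-up weekly values (the vectors `hours` and `points` are hypothetical, not the data of this report) to show the difference. For reference, the totals quoted above give 1211 / 315 ≈ 3.84, so the 3.44 figure is presumably a mean of weekly ratios; that reading is an assumption.

```r
# Hypothetical weekly data (illustrative only)
hours  <- c(30, 36, 40, 32)
points <- c(10, 8, 12, 6)

# Mean of the weekly ratios: every week has the same weight
mean_of_ratios <- mean(hours / points)

# Ratio of the totals: weeks with more points weigh more
ratio_of_totals <- sum(hours) / sum(points)

print(c(mean_of_ratios = mean_of_ratios, ratio_of_totals = ratio_of_totals))
```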
Comparation <- data_outliers %>%
  dplyr::select(Sprint_enddate, cat_regime, regime, Points, Program_Hours, Hours_p_point) %>%
  group_by(cat_regime) %>%
  summarise(Weeks_qnt = n_distinct(Sprint_enddate),
            M_Points = mean(Points),
            M_Program_Hours_W = mean(Program_Hours),
            M_Hours_p_point_W = mean(Hours_p_point),
            Total_points = sum(Points),
            Total_Hours = sum(Program_Hours))
DT::datatable(Comparation)
H0: X = Y
H1: X > Y, where
X = points per week with 8 h/day and Y = points per week with 7 h/day.
In other words, H1 states that weeks with 8 h/day of work have statistically more points than weeks with 7 h/day.
H0 is rejected in favor of H1 when the p-value is less than or equal to α (alpha = 0.05). If the p-value is greater than alpha, there is no statistically significant difference between points per week with 8 h/day and with 7 h/day (H0 is not rejected).
The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is true. For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference.
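The decision rule can be sketched in R with a one-sided Welch test. The two vectors below are made-up values, not the data of this report, and the names `points_7h`, `points_8h` and `reject_h0` are hypothetical.

```r
# Hypothetical weekly points per regime (illustrative only)
points_7h <- c(7, 8, 6, 9, 8, 7, 10, 8)
points_8h <- c(9, 8, 11, 7, 10, 9, 12, 8, 10)

alpha <- 0.05

# H1: mean(points_8h) > mean(points_7h); var.equal = FALSE gives the Welch test
res <- t.test(points_8h, points_7h, alternative = "greater", var.equal = FALSE)

# Reject H0 only when the p-value does not exceed alpha
reject_h0 <- res$p.value <= alpha
print(res$p.value)
print(reject_h0)
```

Passing `alternative = "greater"` matches a one-sided H1; the default, `"two.sided"`, tests for a difference in either direction.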
##
## Welch Two Sample t-test
##
## data: Points by cat_regime
## t = -1.4194, df = 22.865, p-value = 0.1693
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.3095889 0.6166065
## sample estimates:
## mean in group 7h_p_day mean in group 8h_p_day
## 7.916667 9.263158
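The results show that H0 could not be rejected (p-value = 0.1693): there is no statistically significant difference in points between weeks with 8 h/day and weeks with 7 h/day of work (programming hours). The reported test is two-sided; because the observed difference points in the direction of H1, the one-sided p-value is half of it (about 0.085), which is still greater than 0.05.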
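For reference, the Welch statistic reported above can be written out. The symbols are notation introduced here: $\bar{Y}$ and $\bar{X}$ are the mean points in 7 h/day and 8 h/day weeks, $s^2$ the sample variances, and $n$ the number of weeks in each group. R subtracts the groups in the order shown in the output (7h_p_day minus 8h_p_day), which is why t is negative.

```latex
t = \frac{\bar{Y} - \bar{X}}{\sqrt{\dfrac{s_Y^2}{n_Y} + \dfrac{s_X^2}{n_X}}},
\qquad
\bar{Y} - \bar{X} = 7.916667 - 9.263158 = -1.346491

\nu = \frac{\left(\dfrac{s_Y^2}{n_Y} + \dfrac{s_X^2}{n_X}\right)^2}
           {\dfrac{\left(s_Y^2 / n_Y\right)^2}{n_Y - 1} + \dfrac{\left(s_X^2 / n_X\right)^2}{n_X - 1}}
```

With t = -1.4194, the implied standard error of the difference is 1.346491 / 1.4194 ≈ 0.949 points. The second expression is the Welch-Satterthwaite approximation, which explains the non-integer degrees of freedom (df = 22.865).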
# Create a boxplot with geom_boxplot()
ggplot(data_outliers, aes(x = as.factor(cat_regime), y = Points)) +
  geom_boxplot() +
  geom_hline(yintercept = 7.3, color = "red", size = 1.5) +
  geom_hline(yintercept = 10.3, color = "red", size = 1.5)

boxplot(Weeks_7hs$Points, Weeks_8hs$Points,
        outline = FALSE,
        names = c("Point_W7", "Point_W8"),
        col = c("blue", "yellow"),
        main = "Points comparison")
abline(h = 7.3, col = "Red")
abline(h = 10.3, col = "Red")

boxplot(Weeks_7hs$Program_Hours, Weeks_8hs$Program_Hours,
        outline = FALSE,
        names = c("Hours_W7", "Hours_W8"),
        col = c("blue", "yellow"),
        main = "Hours comparison")
abline(h = 32, col = "Red")
abline(h = 35, col = "Red")

library("ggpubr")
library(rpart)
library(rpart.plot)
library(corrplot)
library(ggvis)
library(viridis)
library(hrbrthemes)
ggplot(data = data_outliers,
       aes(x = Program_Hours, y = Points, colour = as.factor(cat_regime))) +
  geom_point() +
  labs(x = "Programming hours", y = "Points") +
  geom_smooth(method = "lm") +
  theme(legend.position = "bottom") +
  ggtitle("Points and Hours")

data_outliers %>%
  ggplot(aes(x = Program_Hours, y = Points, size = Program_Hours, color = as.factor(cat_regime))) +
  geom_point(alpha = 0.5) +
  scale_size(range = c(.1, 20), name = "Programming hours")  # size is mapped to Program_Hours

qplot(Program_Hours, Points, data = data_outliers, facets = . ~ as.factor(cat_regime), color = Hours_p_point, geom = c("point", "smooth"), method = "lm")

# lm() has no `family` argument (it is disregarded with a warning), so it is dropped here
fit2 <- lm(formula = Points ~ Program_Hours + as.factor(cat_regime) - 1, data = data_outliers)
summary(fit2)
##
## Call:
## lm(formula = Points ~ Program_Hours + as.factor(cat_regime) -
## 1, data = data_outliers)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.5871 -1.7448 0.2036 1.6077 5.4129
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## Program_Hours 0.2980 0.1516 1.965 0.0594 .
## as.factor(cat_regime)7h_p_day -1.8415 5.0159 -0.367 0.7163
## as.factor(cat_regime)8h_p_day -1.8241 5.6704 -0.322 0.7501
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.433 on 28 degrees of freedom
## Multiple R-squared: 0.9355, Adjusted R-squared: 0.9286
## F-statistic: 135.4 on 3 and 28 DF, p-value: < 2.2e-16
# Same model without the regime factor; the invalid `family` argument is dropped
fit2 <- lm(formula = Points ~ Program_Hours - 1, data = data_outliers)
summary(fit2)
##
## Call:
## lm(formula = Points ~ Program_Hours - 1, data = data_outliers)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.642 -1.765 0.099 1.599 5.358
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## Program_Hours 0.24691 0.01188 20.79 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.358 on 30 degrees of freedom
## Multiple R-squared: 0.9351, Adjusted R-squared: 0.9329
## F-statistic: 432.2 on 1 and 30 DF, p-value: < 2.2e-16
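When reading the two fits, it may help to remember that for a model without an intercept (`- 1`), R's `summary.lm` measures R-squared against zero rather than against the mean of the response, so values such as 0.93 are not comparable with those of an intercept model. A minimal sketch with made-up data (the vectors `hours` and `points` are hypothetical, not the data of this report):

```r
# Hypothetical weekly data (illustrative only)
hours  <- c(30, 32, 35, 33, 36, 31, 34, 37)
points <- c(8, 7, 10, 9, 8, 9, 10, 11)

fit_no_intercept <- lm(points ~ hours - 1)  # line forced through the origin
fit_intercept    <- lm(points ~ hours)      # usual model with an intercept

r2_no_intercept <- summary(fit_no_intercept)$r.squared
r2_intercept    <- summary(fit_intercept)$r.squared

print(c(no_intercept = r2_no_intercept, intercept = r2_intercept))
```

On the same data, the no-intercept fit reports a much higher R-squared, even though its residuals cannot be smaller than those of the intercept model.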
Although the number of hours worked in a week on the 8 h/day schedule is statistically greater than in a week on the 7 h/day schedule, there is no statistically significant difference in the number of points produced between those weeks.