# Load the dataset (the path below is specific to the author's machine)
data <- read.csv("C:\\Users\\Krishna\\Downloads\\productivity+prediction+of+garment+employees\\garments_worker_productivity.csv")

# 1. Selecting Variables
# **Response Variable:** Actual Productivity
# **Explanatory Variable:** Department

# 2. ANOVA Test
# Null hypothesis: Mean actual productivity is the same across all departments.
# Alternative hypothesis: At least one department's mean actual productivity differs.

# Perform ANOVA
anova_result <- aov(actual_productivity ~ department, data = data)
summary(anova_result)

##               Df Sum Sq Mean Sq F value   Pr(>F)    
## department     2   0.72  0.3615   12.09 6.31e-06 ***
## Residuals   1194  35.69  0.0299                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
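
The Df of 2 for department means R found three distinct department labels. If fewer departments were expected, stray whitespace or inconsistent spelling in the labels is one possible cause; that is an assumption, not something the output above confirms. A minimal sketch of how to check and normalize the labels, using a small hypothetical vector in place of `data$department`:

```r
# Hypothetical labels standing in for data$department; the real values may differ.
dept <- c("sewing", "finishing", "finishing ", "sewing", "finishing ")

# Count each distinct label; a trailing space makes "finishing " a separate level.
table(dept)

# Strip leading and trailing whitespace so equivalent labels collapse into one level.
dept_clean <- trimws(dept)
table(dept_clean)

# Number of distinct levels before and after cleaning
n_before <- length(unique(dept))
n_after  <- length(unique(dept_clean))
print(c(before = n_before, after = n_after))
```

On the real data, the equivalent step would be `data$department <- trimws(data$department)` before calling `aov()`.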

Results and interpretation

The p-value from the ANOVA test is 6.31e-06 (F = 12.09 on 2 and 1194 degrees of freedom). Since this value is less than 0.05, we reject the null hypothesis.
There is statistically significant evidence that mean actual productivity differs among departments.
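
A significant ANOVA only says that at least one department mean differs; it does not say which ones. A common follow-up is Tukey's HSD. The sketch below runs it on simulated data, since the group names and means here are made up for illustration and are not taken from the dataset:

```r
set.seed(1)

# Simulated stand-in for the real data: three hypothetical groups of 50 rows each
toy <- data.frame(
  department = factor(rep(c("A", "B", "C"), each = 50)),
  actual_productivity = c(rnorm(50, mean = 0.70, sd = 0.1),
                          rnorm(50, mean = 0.75, sd = 0.1),
                          rnorm(50, mean = 0.80, sd = 0.1))
)

# One-way ANOVA, same form as the model above
toy_aov <- aov(actual_productivity ~ department, data = toy)

# Pairwise comparisons of group means with family-wise error control
tukey <- TukeyHSD(toy_aov)
print(tukey)
```

With the real data, the same call would be `TukeyHSD(anova_result)`, assuming `department` has first been converted to a factor with `as.factor()`.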

# Identifying the predictor variable
# In this dataset, SMV (standard minute value) is used as the predictor variable.
# Build a linear regression model of actual productivity on SMV
model <- lm(actual_productivity ~ smv, data = data)

# Summary of the model
summary(model)

## 
## Call:
## lm(formula = actual_productivity ~ smv, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.52095 -0.08825  0.04300  0.11334  0.37991 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.7644124  0.0085220  89.699  < 2e-16 ***
## smv         -0.0019467  0.0004578  -4.252 2.28e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1733 on 1195 degrees of freedom
## Multiple R-squared:  0.01491,    Adjusted R-squared:  0.01408 
## F-statistic: 18.08 on 1 and 1195 DF,  p-value: 2.281e-05

Model evaluation

The R-squared value of the model is 0.01491, so SMV explains only about 1.5% of the variability in actual productivity. The relationship is statistically significant (p = 2.28e-05) but weak.
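
R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below checks that identity on simulated data (the variables `x` and `y` are hypothetical), to show how the figure reported by `summary()` can be reproduced by hand:

```r
set.seed(42)

# Simulated data: a weak negative linear relationship plus noise
x <- runif(200, min = 2, max = 55)
y <- 0.76 - 0.002 * x + rnorm(200, sd = 0.17)

fit <- lm(y ~ x)

# R-squared by definition: 1 - RSS / TSS
rss <- sum(residuals(fit)^2)
tss <- sum((y - mean(y))^2)
r2_manual <- 1 - rss / tss

# R-squared as reported by summary()
r2_summary <- summary(fit)$r.squared

# The two values should agree up to floating-point tolerance
print(all.equal(r2_manual, r2_summary))
```

For the fitted model in this report, `summary(model)$r.squared` is the value printed as "Multiple R-squared" in the output.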

Interpretation and Recommendations

Interpretation of Coefficients: The coefficient of SMV is -0.0019467. This suggests that for each one-unit increase in SMV, actual productivity is expected to decrease by about 0.0019 units on average.
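
To make the slope concrete, the fitted line can be evaluated by hand using the coefficients from the model summary. The SMV values 10 and 30 are arbitrary illustration points, not values singled out by the analysis:

```r
# Coefficients copied from the model summary above
intercept <- 0.7644124
slope_smv <- -0.0019467

# Predicted actual productivity at a given SMV, using the fitted line
predict_productivity <- function(smv) intercept + slope_smv * smv

p10 <- predict_productivity(10)  # 0.7449454
p30 <- predict_productivity(30)  # 0.7060114

# Expected change in productivity for a 20-unit increase in SMV
delta <- p30 - p10               # -0.038934
print(c(p10 = p10, p30 = p30, delta = delta))
```

With the fitted object available, `predict(model, newdata = data.frame(smv = c(10, 30)))` should give the same values up to rounding of the coefficients.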

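Recommendations: Operations with a higher SMV tend to show slightly lower actual productivity, so optimizing SMV may help improve productivity in the manufacturing process. Because SMV explains only about 1.5% of the variation in productivity, it should not be the sole focus of improvement efforts.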

# Load required packages
library(ggplot2)

# Plot SMV vs. Actual Productivity
ggplot(data, aes(x = smv, y = actual_productivity)) +
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "SMV vs. Actual Productivity",
       x = "SMV (Standard Minute Value)",
       y = "Actual Productivity")

## `geom_smooth()` using formula = 'y ~ x'

Insights

1) ANOVA test: The ANOVA test evaluates whether there are statistically significant differences in actual productivity across the categories of a categorical variable (e.g., department).

2) Linear regression model: The linear regression model describes the relationship between a continuous predictor variable (SMV) and the response variable (actual productivity).

WEEK8 DD

2024-03-04
