data <- read.csv("C:\\Users\\Krishna\\Downloads\\productivity+prediction+of+garment+employees\\garments_worker_productivity.csv")
# 1. Selecting Variables
# Response variable: actual_productivity (actual productivity)
# Explanatory variable: department

# 2. ANOVA Test
# Null hypothesis (H0): Mean actual productivity is the same across all departments.
# Alternative hypothesis (H1): At least one department's mean actual productivity differs.
# Perform ANOVA
anova_result <- aov(actual_productivity ~ department, data = data)
summary(anova_result)
##               Df Sum Sq Mean Sq F value   Pr(>F)    
## department     2   0.72  0.3615   12.09 6.31e-06 ***
## Residuals   1194  35.69  0.0299                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
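
The Df of 2 for department in the table above means the column holds three distinct labels. If some of those labels differ only by stray whitespace or spelling (an assumption, not something the output confirms), they would be treated as separate groups. The sketch below shows one way to check, using a small invented data frame `toy` in place of `data`; its labels and values are hypothetical.

```r
# Hypothetical toy data standing in for `data`; labels and values are invented
toy <- data.frame(
  department = c("sewing", "finishing ", "finishing", "sewing", "finishing "),
  actual_productivity = c(0.80, 0.75, 0.70, 0.85, 0.65),
  stringsAsFactors = FALSE
)

# Quote each label so that trailing spaces become visible in the counts
table(paste0("'", toy$department, "'"))

# Distinct raw labels before cleaning
raw_levels <- unique(toy$department)
length(raw_levels)

# Strip leading/trailing whitespace, then convert to a factor
toy$department <- factor(trimws(toy$department))
nlevels(toy$department)
```

If the real `department` column turns out to need the same cleaning, rerunning `aov()` afterwards would give a department Df equal to the number of cleaned levels minus one.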

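Results and interpretation

The F-statistic is 12.09 on 2 and 1194 degrees of freedom, with p = 6.31e-06. Because p is far below 0.05, the null hypothesis is rejected: mean actual productivity differs between at least two department groups. The effect is small, however. Department accounts for roughly 2% of the total variation (0.72 / (0.72 + 35.69) ≈ 0.02).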
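
A significant ANOVA F-test indicates that at least one group mean differs, but not which ones. One common follow-up is Tukey's HSD, available in base R as `TukeyHSD()`. This is a swapped-in illustration, not part of the original analysis: the data frame `toy`, its group labels, and the simulated values are all hypothetical.

```r
# Hypothetical simulated data: three groups with slightly different means
set.seed(1)
toy <- data.frame(
  department = factor(rep(c("dept_a", "dept_b", "dept_c"), each = 20)),
  actual_productivity = c(rnorm(20, mean = 0.70, sd = 0.1),
                          rnorm(20, mean = 0.75, sd = 0.1),
                          rnorm(20, mean = 0.80, sd = 0.1))
)

# One-way ANOVA, same formula shape as in the analysis above
fit <- aov(actual_productivity ~ department, data = toy)

# Pairwise comparisons with family-wise error control
tukey <- TukeyHSD(fit)

# Matrix with one row per pair: difference, confidence bounds, adjusted p-value
pairwise <- tukey$department
pairwise
```

On the real data, `TukeyHSD(anova_result)` would be the equivalent call, assuming `department` is stored as a factor or character column.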

# Identifying the predictor variable
# SMV (standard minute value) is used as the predictor variable.
# Build a linear regression model of actual productivity on SMV
model <- lm(actual_productivity ~ smv, data = data)

# Summary of the model
summary(model)
## 
## Call:
## lm(formula = actual_productivity ~ smv, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.52095 -0.08825  0.04300  0.11334  0.37991 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.7644124  0.0085220  89.699  < 2e-16 ***
## smv         -0.0019467  0.0004578  -4.252 2.28e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1733 on 1195 degrees of freedom
## Multiple R-squared:  0.01491,    Adjusted R-squared:  0.01408 
## F-statistic: 18.08 on 1 and 1195 DF,  p-value: 2.281e-05

Model evaluation

The R-squared value of the model is 0.01491, so SMV explains only about 1.49% of the variability in actual productivity. The relationship is statistically significant (p = 2.28e-05) but weak, which means the model has little predictive power on its own.
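
As a consistency check, R-squared can be recovered from the F-statistic and its degrees of freedom, since for a linear model R² = F·df1 / (F·df1 + df2). The sketch below uses only numbers copied from the `summary(model)` output; the variable names are chosen for illustration.

```r
# Values copied from the summary(model) output above
f_stat   <- 18.08   # F-statistic
df_model <- 1       # numerator degrees of freedom
df_resid <- 1195    # residual degrees of freedom

# R-squared implied by the F-statistic
r_squared <- (f_stat * df_model) / (f_stat * df_model + df_resid)
round(r_squared, 4)   # 0.0149, matching the reported 0.01491 up to rounding

# With a single predictor, the correlation is the signed square root of R-squared;
# the sign is negative because the estimated slope for smv is negative
r <- -sqrt(r_squared)
round(r, 2)           # -0.12
```

A correlation of about -0.12 is another way to see that the linear association between SMV and actual productivity is weak.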

Interpretation and Recommendations

  • Interpretation of Coefficients: The coefficient of SMV is -0.0019467 (about -0.0019). For each one-unit increase in SMV, actual productivity is expected to decrease by about 0.0019 units on average. The intercept, 0.7644, is the predicted actual productivity when SMV is 0.

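  • Recommendations: The negative slope suggests that a lower SMV is associated with slightly higher actual productivity, so optimizing SMV is worth considering. Because SMV explains only about 1.5% of the variability, it should not be the sole focus; most of the variation in actual productivity is left to other factors in the manufacturing process.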
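
To make the size of the slope concrete, the fitted line can be evaluated at a couple of SMV values. This is a minimal sketch: the coefficients are copied from the output above, while the helper `predict_productivity` and the SMV values 10 and 30 are hypothetical choices for illustration, not values taken from the data.

```r
# Coefficients copied from the summary(model) output above
intercept <- 0.7644124
slope     <- -0.0019467

# Hypothetical helper: fitted actual productivity for a given SMV
predict_productivity <- function(smv) intercept + slope * smv

# Illustrative SMV values (assumed, not taken from the dataset)
smv_values <- c(10, 30)
predicted  <- predict_productivity(smv_values)
predicted         # 0.7449454 0.7060114

# Change in predicted productivity for a 20-unit increase in SMV
diff(predicted)   # -0.038934, i.e. 20 * slope
```

With the fitted `model` object available, `predict(model, newdata = data.frame(smv = c(10, 30)))` should give the same numbers up to rounding. A 20-unit change in SMV shifts predicted productivity by under 0.04, which is small next to the residual standard error of 0.1733.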

    # Load required packages
    library(ggplot2)
    
    # Plot SMV vs. Actual Productivity
    ggplot(data, aes(x = smv, y = actual_productivity)) +
      geom_point() + 
      geom_smooth(method = "lm", se = FALSE) +
      labs(title = "SMV vs. Actual Productivity",
           x = "SMV (Standard Minute Value)",
           y = "Actual Productivity")
    ## `geom_smooth()` using formula = 'y ~ x'

Insights

1) ANOVA test: The ANOVA test evaluates whether mean actual productivity differs significantly across the categories of a categorical variable (here, department).

2) Linear regression model: The linear regression model describes the relationship between a continuous predictor variable (SMV) and the response variable (actual productivity).