# Read the CSV file
garment_prod <- read.csv("/Users/lakshmimounikab/Desktop/Stats with R/R practice/garment_prod.csv")

summary(garment_prod)
##      date             quarter           department            day           
##  Length:1197        Length:1197        Length:1197        Length:1197       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##       team        targeted_productivity      smv             wip         
##  Min.   : 1.000   Min.   :0.0700        Min.   : 2.90   Min.   :    7.0  
##  1st Qu.: 3.000   1st Qu.:0.7000        1st Qu.: 3.94   1st Qu.:  774.5  
##  Median : 6.000   Median :0.7500        Median :15.26   Median : 1039.0  
##  Mean   : 6.427   Mean   :0.7296        Mean   :15.06   Mean   : 1190.5  
##  3rd Qu.: 9.000   3rd Qu.:0.8000        3rd Qu.:24.26   3rd Qu.: 1252.5  
##  Max.   :12.000   Max.   :0.8000        Max.   :54.56   Max.   :23122.0  
##                                                         NA's   :506      
##    over_time       incentive         idle_time           idle_men      
##  Min.   :    0   Min.   :   0.00   Min.   :  0.0000   Min.   : 0.0000  
##  1st Qu.: 1440   1st Qu.:   0.00   1st Qu.:  0.0000   1st Qu.: 0.0000  
##  Median : 3960   Median :   0.00   Median :  0.0000   Median : 0.0000  
##  Mean   : 4567   Mean   :  38.21   Mean   :  0.7302   Mean   : 0.3693  
##  3rd Qu.: 6960   3rd Qu.:  50.00   3rd Qu.:  0.0000   3rd Qu.: 0.0000  
##  Max.   :25920   Max.   :3600.00   Max.   :300.0000   Max.   :45.0000  
##                                                                        
##  no_of_style_change no_of_workers   actual_productivity
##  Min.   :0.0000     Min.   : 2.00   Min.   :0.2337     
##  1st Qu.:0.0000     1st Qu.: 9.00   1st Qu.:0.6503     
##  Median :0.0000     Median :34.00   Median :0.7733     
##  Mean   :0.1504     Mean   :34.61   Mean   :0.7351     
##  3rd Qu.:0.0000     3rd Qu.:57.00   3rd Qu.:0.8503     
##  Max.   :2.0000     Max.   :89.00   Max.   :1.1204     
## 
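The summary above reports 506 missing values in wip (about 42% of the 1197 rows) and none in the other numeric columns. A minimal sketch of how such a check can be made explicit is shown below. The data frame `toy` and its values are hypothetical stand-ins, not rows from the real file; with the real data the same calls would be applied to `garment_prod`.

```r
# Toy stand-in for garment_prod (hypothetical values, not the real data)
toy <- data.frame(
  team = c(1, 2, 3, 4),
  wip  = c(1108, NA, 968, NA),
  smv  = c(26.16, 3.94, 11.41, 3.94)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Share of rows in which wip is missing
na_share_wip <- mean(is.na(toy$wip))
print(na_share_wip)
```

Knowing the share of missing values up front helps decide whether a column such as wip should be imputed, dropped, or modelled separately.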

Goal/Purpose

The goal of this project is to predict the productivity of the employees working at a garment factory.

Data documentation

The Garment Industry serves as a prominent example of contemporary industrial globalization. Meeting the substantial global demand for garment products largely hinges on the performance and efficiency of employees within garment manufacturing firms. Consequently, decision-makers in the garment industry have a strong desire to monitor, analyze, and forecast the productivity of their factory teams. This dataset can serve the purpose of regression, allowing for the prediction of productivity within a range of 0 to 1, or classification, where the productivity range (0-1) can be categorized into different classes.
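For the classification framing mentioned above, one possible way to turn a continuous productivity score into classes is base R's `cut()`. This is only a sketch: the vector `productivity`, the class boundaries (0.5 and 0.75), and the labels are illustrative assumptions and are not part of the dataset documentation.

```r
# Hypothetical productivity values standing in for actual_productivity
productivity <- c(0.23, 0.55, 0.65, 0.77, 0.85, 1.00)

# Class boundaries are an illustrative assumption.
# The top class is open-ended so that no observation is dropped as NA.
prod_class <- cut(
  productivity,
  breaks = c(0, 0.5, 0.75, Inf),
  labels = c("low", "medium", "high"),
  include.lowest = TRUE
)

# Number of observations per class
class_counts <- table(prod_class)
print(class_counts)
```

In practice the boundaries would be chosen from the distribution of the data or from business targets rather than fixed in advance.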

This dataset contains significant characteristics related to the garment production procedure and the efficiency of the workforce. The data was collected through manual means and subsequently verified by industry professionals.

The details regarding the attributes here are:

01) date : Date in MM-DD-YYYY
02) day : Day of the Week
03) quarter : A portion of the month. A month was divided into four quarters
04) department : Associated department with the instance
05) team_no : Associated team number with the instance
06) no_of_workers : Number of workers in each team
07) no_of_style_change : Number of changes in the style of a particular product
08) targeted_productivity : Targeted productivity set by the Authority for each team for each day.
09) smv : Standard Minute Value, it is the allocated time for a task
10) wip : Work in progress. Includes the number of unfinished items for products
11) over_time : Represents the amount of overtime by each team in minutes
12) incentive : Represents the amount of financial incentive (in BDT) that enables or motivates a particular course of action.
13) idle_time : The amount of time when the production was interrupted due to several reasons
14) idle_men : The number of workers who were idle due to production interruption
15) actual_productivity : The actual % of productivity that was delivered by the workers. It ranges from 0-1.
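The documentation states that actual_productivity ranges from 0 to 1, while the summary output earlier in this report shows a maximum of 1.1204. A short sketch of how the documented range can be checked against the data follows. The vector below reuses the five quantile values printed by `summary()` as a stand-in; with the real data the same expressions would be applied to `garment_prod$actual_productivity`.

```r
# Stand-in vector built from the summary() quantiles shown earlier
actual_productivity <- c(0.2337, 0.6503, 0.7733, 0.8503, 1.1204)

# Observed minimum and maximum
observed_range <- range(actual_productivity, na.rm = TRUE)
print(observed_range)

# Number of observations above the documented upper bound of 1
n_above_one <- sum(actual_productivity > 1, na.rm = TRUE)
print(n_above_one)
```

Values above 1 would need a decision before modelling, for example keeping them as over-achievement or capping them at 1.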

Possible Questions

  1. What are the key factors influencing employee productivity?
  2. Can we identify any trends or patterns in employee productivity over time or across different departments/teams?
  3. Are there any significant differences in productivity between different shifts or work schedules?
  4. Can we predict future productivity based on historical data or trends?
  5. What are the most effective strategies or interventions for improving employee productivity based on the data analysis?
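As a first screen for the question about key factors, pairwise correlations between the numeric columns and actual_productivity can be computed. The sketch below uses a hypothetical data frame `toy` with made-up values chosen to be perfectly linear, so it only illustrates the mechanics. Correlation is a rough starting point and does not by itself identify which factors drive productivity.

```r
# Hypothetical data (made-up values, deliberately perfectly linear)
toy <- data.frame(
  incentive = c(0, 20, 40, 60, 80),
  idle_time = c(8, 6, 4, 2, 0),
  actual_productivity = c(0.50, 0.60, 0.70, 0.80, 0.90)
)

# All columns except the response
predictors <- setdiff(names(toy), "actual_productivity")

# Pearson correlation of each predictor with the response
cors <- sapply(
  toy[predictors], cor,
  y = toy$actual_productivity,
  use = "complete.obs"
)

# Order by absolute strength of association
cors_sorted <- cors[order(abs(cors), decreasing = TRUE)]
print(cors_sorted)
```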

Aggregate functions

# standard deviation of targeted productivity
std_dev_tp <- sd(garment_prod$targeted_productivity, na.rm = TRUE)
print(std_dev_tp)
## [1] 0.09789096
# variance of targeted productivity
var_tp <- var(garment_prod$targeted_productivity, na.rm = TRUE)
print(var_tp)
## [1] 0.009582641
# sum of total incentive provided
incent_sum <- sum(garment_prod$incentive)
print(incent_sum)
## [1] 45738
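The aggregates above are computed over the whole dataset. The same functions can also be applied per group, for instance per department, using `tapply()`. The following is a minimal sketch on a hypothetical data frame `toy`; the department labels and values are made up for illustration.

```r
# Hypothetical data (labels and values are made up)
toy <- data.frame(
  department = c("sewing", "sewing", "finishing", "finishing"),
  incentive = c(98, 50, 0, 0),
  targeted_productivity = c(0.80, 0.75, 0.80, 0.70)
)

# Total incentive per department
incent_by_dept <- tapply(toy$incentive, toy$department, sum)
print(incent_by_dept)

# Mean targeted productivity per department
mean_tp_by_dept <- tapply(toy$targeted_productivity, toy$department, mean)
print(mean_tp_by_dept)
```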

Visual summary

# Scatter plot of team vs targeted productivity
# ggplot2 is loaded here, but the plots in this section use base R graphics
library(ggplot2)
plot(garment_prod$team, garment_prod$targeted_productivity,
     main = "Scatter plot for Team vs Targeted productivity",
     xlab = "Team number", ylab = "Targeted productivity")

# Histogram for overtime
hist(garment_prod$over_time, breaks = 20, col = "blue", main = "Histogram of Overtime", xlab = "Over time")

# Pie chart for Quarter
pie(table(garment_prod$quarter), main="Pie chart for Quarter")

# Quarter vs Overtime bar plot
data_frame_ot <- garment_prod
nrow(data_frame_ot)
## [1] 1197
result <- aggregate(data_frame_ot$over_time,by=list(data_frame_ot$quarter), mean)
result
##    Group.1        x
## 1 Quarter1 4480.917
## 2 Quarter2 4355.015
## 3 Quarter3 4896.000
## 4 Quarter4 4851.250
## 5 Quarter5 3725.455
# One colour per bar; the y-axis shows mean overtime in minutes
barplot(result$x, names.arg = result$Group.1, xlab = "Quarter",
        ylab = "Average overtime (minutes)", col = rainbow(nrow(result)),
        main = "Quarter vs Overtime", border = "black")