# Read the CSV file
garment_prod <- read.csv("/Users/lakshmimounikab/Desktop/Stats with R/R practice/garment_prod.csv")
summary(garment_prod)
## date quarter department day
## Length:1197 Length:1197 Length:1197 Length:1197
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## team targeted_productivity smv wip
## Min. : 1.000 Min. :0.0700 Min. : 2.90 Min. : 7.0
## 1st Qu.: 3.000 1st Qu.:0.7000 1st Qu.: 3.94 1st Qu.: 774.5
## Median : 6.000 Median :0.7500 Median :15.26 Median : 1039.0
## Mean : 6.427 Mean :0.7296 Mean :15.06 Mean : 1190.5
## 3rd Qu.: 9.000 3rd Qu.:0.8000 3rd Qu.:24.26 3rd Qu.: 1252.5
## Max. :12.000 Max. :0.8000 Max. :54.56 Max. :23122.0
## NA's :506
## over_time incentive idle_time idle_men
## Min. : 0 Min. : 0.00 Min. : 0.0000 Min. : 0.0000
## 1st Qu.: 1440 1st Qu.: 0.00 1st Qu.: 0.0000 1st Qu.: 0.0000
## Median : 3960 Median : 0.00 Median : 0.0000 Median : 0.0000
## Mean : 4567 Mean : 38.21 Mean : 0.7302 Mean : 0.3693
## 3rd Qu.: 6960 3rd Qu.: 50.00 3rd Qu.: 0.0000 3rd Qu.: 0.0000
## Max. :25920 Max. :3600.00 Max. :300.0000 Max. :45.0000
##
## no_of_style_change no_of_workers actual_productivity
## Min. :0.0000 Min. : 2.00 Min. :0.2337
## 1st Qu.:0.0000 1st Qu.: 9.00 1st Qu.:0.6503
## Median :0.0000 Median :34.00 Median :0.7733
## Mean :0.1504 Mean :34.61 Mean :0.7351
## 3rd Qu.:0.0000 3rd Qu.:57.00 3rd Qu.:0.8503
## Max. :2.0000 Max. :89.00 Max. :1.1204
##
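The summary reports 506 missing values (`NA's`) for one numeric column, which appears to be `wip`. A minimal sketch of how missing values could be counted per column is shown below. The `toy` data frame and its values are made up for illustration and only stand in for `garment_prod`.

```r
# Toy data frame standing in for garment_prod (all values are made up)
toy <- data.frame(
  team = c(1, 2, 3, 4),
  wip  = c(1108, NA, 968, NA),
  smv  = c(26.16, 3.94, 11.41, 3.94)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Share of rows that contain at least one missing value
na_share <- mean(!complete.cases(toy))
print(na_share)
```

Running the same two calls on `garment_prod` would show which columns need `na.rm = TRUE` or other handling before computing statistics.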
The goal of this project is to predict the productivity of the employees working at a garment factory.
The garment industry is a prominent example of modern industrial globalization. Meeting the large global demand for garments depends heavily on the performance and efficiency of employees in garment manufacturing firms, so decision-makers want to monitor, analyze, and forecast the productivity of their factory teams. The dataset can be used for regression, predicting productivity as a value between 0 and 1, or for classification, by grouping the 0-1 productivity range into classes.
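For the classification framing, one possible approach is to bin productivity with `cut()`. The sketch below is illustrative only: the vector `prod`, the cut points 0.5 and 0.75, and the labels are assumptions, not part of the dataset description. The summary output shows `actual_productivity` reaching 1.1204, so the top break is left open (`Inf`) to avoid turning values above 1 into `NA`.

```r
# Hypothetical productivity values; the last one exceeds 1 on purpose
prod <- c(0.23, 0.65, 0.77, 0.85, 1.12)

# Assumed cut points: [0, 0.5], (0.5, 0.75], (0.75, Inf)
prod_class <- cut(
  prod,
  breaks = c(0, 0.5, 0.75, Inf),
  labels = c("low", "medium", "high"),
  include.lowest = TRUE
)

# Frequency of each class
print(table(prod_class))
```

The choice of cut points determines how balanced the classes are, so they would normally be chosen after looking at the distribution of `actual_productivity`.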
The dataset contains key attributes of the garment production process and of workforce productivity. It was collected manually and then verified by industry professionals.
The attributes are:
01) date : Date in MM-DD-YYYY
02) day : Day of the week
03) quarter : A portion of the month. A month was divided into four quarters
04) department : Department associated with the instance
05) team_no : Team number associated with the instance
06) no_of_workers : Number of workers in each team
07) no_of_style_change : Number of changes in the style of a particular product
08) targeted_productivity : Targeted productivity set by the Authority for each team for each day
09) smv : Standard Minute Value, the time allocated for a task
10) wip : Work in progress. Includes the number of unfinished items for products
11) over_time : Amount of overtime by each team, in minutes
12) incentive : Amount of financial incentive (in BDT) that enables or motivates a particular course of action
13) idle_time : Amount of time during which production was interrupted, for any of several reasons
14) idle_men : Number of workers who were idle due to production interruption
15) actual_productivity : The actual % of productivity delivered by the workers. It ranges from 0-1.
# standard deviation of targeted productivity
std_dev_tp <- sd(garment_prod$targeted_productivity, na.rm = TRUE)
print(std_dev_tp)
## [1] 0.09789096
# Variance of targeted productivity
var_tp <- var(garment_prod$targeted_productivity, na.rm = TRUE)
print(var_tp)
## [1] 0.009582641
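The two results above are consistent with each other: the variance is the square of the standard deviation, and 0.09789096 squared is approximately 0.0095826. A small sketch of that check follows, using a made-up vector `x` rather than the real column.

```r
# Made-up sample values, including one NA to mirror the na.rm = TRUE usage
x <- c(0.80, 0.75, 0.70, 0.80, 0.65, NA)

s <- sd(x, na.rm = TRUE)   # sample standard deviation
v <- var(x, na.rm = TRUE)  # sample variance

# all.equal() compares with a numeric tolerance, which is safer than ==
# for floating-point values
print(all.equal(s^2, v))
```

Both `sd()` and `var()` use the sample (n - 1) denominator, so the identity holds exactly up to floating-point rounding.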
# Total incentive paid (sum over all rows)
incent_sum <- sum(garment_prod$incentive)
print(incent_sum)
## [1] 45738
# Scatter plot of team vs targeted productivity
# ggplot2 is loaded here, although the base plot() call below does not need it
library(ggplot2)
plot(garment_prod$team, garment_prod$targeted_productivity, main = "Scatter plot for Team vs Targeted productivity", xlab = "Team number", ylab = "Targeted productivity")
# Histogram for overtime
hist(garment_prod$over_time, breaks = 20, col = "blue", main = "Histogram of Overtime", xlab = "Overtime (minutes)")
# Pie chart for Quarter
pie(table(garment_prod$quarter), main="Pie chart for Quarter")
# Quarter vs Overtime bar plot
data_frame_ot <- garment_prod
nrow(data_frame_ot)
## [1] 1197
result <- aggregate(data_frame_ot$over_time,by=list(data_frame_ot$quarter), mean)
result
## Group.1 x
## 1 Quarter1 4480.917
## 2 Quarter2 4355.015
## 3 Quarter3 4896.000
## 4 Quarter4 4851.250
## 5 Quarter5 3725.455
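# Bar heights are the mean overtime per quarter, so the y axis is labeled accordingly
barplot(result$x, names.arg = result$Group.1, xlab = "Quarter", ylab = "Average overtime (minutes)", col = rainbow(6),
main = "Quarter vs Overtime", border = "black")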
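The `aggregate()` call above returns generic column names (`Group.1` and `x`). As an alternative sketch, the formula interface keeps the original column names, which can make the plotting code easier to read. The `toy` data frame below is made up for illustration and is not the real data.

```r
# Toy data standing in for garment_prod (values are made up)
toy <- data.frame(
  quarter   = c("Quarter1", "Quarter1", "Quarter2", "Quarter2"),
  over_time = c(1000, 3000, 2000, 6000)
)

# Mean overtime per quarter; result columns are named quarter and over_time
avg_ot <- aggregate(over_time ~ quarter, data = toy, FUN = mean)
print(avg_ot)
```

Note that the formula interface drops rows with missing values by default (`na.action = na.omit`), which can differ from the `by =` form when the data contain `NA`s.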