Assignment 2
Evaluation of Risks
Based on the safety data set as provided in WIL Week 2, publish an R-Markdown document providing the following graphical analysis.
To ensure the code is reproducible, we clear the workspace and load the necessary libraries.
We then load safety_data.csv and 3mma.csv as the data sources.
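A minimal sketch of this setup step, using base R only. The helper name `read_events` is hypothetical, and the column layout of the two files is not shown in this document, so nothing below assumes particular columns.

```r
# Clear the workspace so the document does not depend on earlier session state
rm(list = ls())

# Hypothetical helper: read one CSV, stopping early with a clear message
# if the file is missing
read_events <- function(path) {
  if (!file.exists(path)) stop("File not found: ", path)
  read.csv(path, stringsAsFactors = FALSE)
}

# Usage, assuming both files sit in the working directory:
# safety <- read_events("safety_data.csv")
# mma3   <- read_events("3mma.csv")
```

Checking for the file before reading makes a knit failure easier to diagnose than the default `read.csv` error.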
Overall Trend: The overall cumulative number of injuries (black) shows a steady increase over time, indicating that injuries have been accumulating consistently. The average number of events per day is approximately 3.56 (7742 events over 2177 days).
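The cumulative curves and the daily average can be sketched as follows. The toy data frame and its column names (`date`, `type`) are assumptions for illustration only; the real column names in safety_data.csv may differ.

```r
# Hypothetical toy data: one row per injury event (not the real data)
events <- data.frame(
  date = as.Date(c("2009-01-01", "2009-01-01", "2009-01-03",
                   "2009-01-04", "2009-01-04")),
  type = c("RWI", "Other", "RWI", "LTI", "RWI")
)

# Daily counts per injury type (rows = dates with events, columns = types)
daily <- table(events$date, events$type)

# Running totals per type, plus the overall running total
cum_by_type <- apply(daily, 2, cumsum)
cum_total <- cumsum(rowSums(daily))

# Average events per day over the observed span, counting both end dates
n_days <- as.numeric(diff(range(events$date))) + 1
avg_per_day <- nrow(events) / n_days

# The same arithmetic applied to the totals quoted in the text
overall_rate <- 7742 / 2177
```

Each column of `cum_by_type` corresponds to one coloured curve in the cumulative plot, and `cum_total` to the overall curve.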
The dominant injury types appear to be Restricted Work Injury “RWI” (red) and “LTI” (green). RWI counts began rising faster from mid-2010, while “Other” injuries slowed from the same point. This most likely indicates a change in how injuries were classified, with some injuries previously recorded as “Other” being recorded as “RWI” from mid-2010.
“Fatal” (cyan) injuries, as expected and fortunately, have the lowest frequency among all injury types. The curve for fatal injuries is relatively flat, indicating rare occurrences.
Above is the time series plot (3 Month Moving Average) of events. It confirms what the cumulative events plot showed: “RWI” has been the main injury type since mid-2010, with a moving average of roughly 1.8 to 3.1 events per day. Across all injury types, the moving average ranged from roughly 2.4 events per day (in Q1 2009) to 4.8 events per day (in Q2 2012).
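One common way to compute a three-month moving average is a trailing mean over the current month and the two before it. The sketch below assumes that definition and uses made-up monthly values; how 3mma.csv was actually produced is not stated here.

```r
# Hypothetical monthly series: average injury events per day in each month
monthly_rate <- c(2.0, 3.0, 4.0, 5.0, 3.0)

# Trailing 3-month moving average: each value is the mean of the current
# month and the two preceding months, so the first two entries are NA
ma3 <- as.numeric(stats::filter(monthly_rate, rep(1 / 3, 3), sides = 1))
```

With `sides = 1` the filter only looks backwards, so each point uses no future data; `sides = 2` would give a centred average instead.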
It seems clear that the middle age groups, specifically 25-29, 30-34 and 35-39, have the highest numbers of reported injuries. This could indicate that these age groups are more susceptible to risks or engage in more hazardous tasks. Alternatively, these age groups may simply make up a larger share of the workforce.
The same histogram can be rearranged into the Pareto chart above. It shows that the majority of injuries happened to workers aged 25 to 54. Workers aged 20-24, however, registered fewer injuries than those age groups. We noticed that the 20-24 and 55-59 age groups have similar counts, possibly because of differences in attitude to safety, or overconfidence at work in the other age groups. For workers older than 59 or younger than 20, the reported injuries are much lower than for any other age group, probably because there are few workers in these age groups.
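The rearrangement from histogram to Pareto chart amounts to sorting the counts in descending order and adding a cumulative percentage. A base-R sketch follows; the function names `pareto_table` and `plot_pareto` and the counts are hypothetical, not taken from the data set.

```r
# Hypothetical counts per age group (not the real data)
counts <- c("20-24" = 40, "25-29" = 90, "30-34" = 70)

# Sort groups by count (largest first) and add the cumulative percentage
pareto_table <- function(counts) {
  sorted <- sort(counts, decreasing = TRUE)
  data.frame(
    group = names(sorted),
    count = as.numeric(sorted),
    cum_pct = 100 * cumsum(as.numeric(sorted)) / sum(sorted),
    stringsAsFactors = FALSE
  )
}

# Draw the bars and overlay the cumulative line, rescaled to the count axis
plot_pareto <- function(pt) {
  total <- sum(pt$count)
  mids <- barplot(pt$count, names.arg = pt$group,
                  ylab = "Injuries", ylim = c(0, total))
  lines(mids, pt$cum_pct / 100 * total, type = "b", col = "red")
  axis(4, at = seq(0, 1, 0.2) * total,
       labels = paste0(seq(0, 100, 20), "%"))
}

pt <- pareto_table(counts)
```

Calling `plot_pareto(pt)` draws the chart. Keeping the table separate from the plot makes it easy to read off where the cumulative share crosses a threshold such as 80%.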
The days with the most reported injuries are Tuesday and Wednesday. This suggests that Tuesday might be more prone to incidents, possibly because the workload is heavier on that day. The red line represents the cumulative percentage of injuries. By the time we reach Friday, almost 80% of the injuries have already occurred. Saturday and Sunday have the fewest reported injuries, which could be due to fewer operational activities during the weekend or fewer workers being present. After the initial spike in injuries on Monday and Tuesday, there’s a decline from Wednesday to Friday.
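Weekday counts like these can be derived directly from the event dates. The sketch below uses made-up dates and assumes the events carry a date that R can parse; it uses `as.POSIXlt()$wday` rather than `weekdays()` so the result does not depend on the system locale.

```r
# Hypothetical event dates (not the real data)
dates <- as.Date(c("2012-05-07", "2012-05-08", "2012-05-08",
                   "2012-05-09", "2012-05-12"))

# $wday is 0 for Sunday through 6 for Saturday, independent of locale
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
wday <- factor(day_names[as.POSIXlt(dates)$wday + 1],
               levels = day_names[c(2:7, 1)])  # Monday-first ordering

# Counts per weekday, including days with zero events
by_day <- table(wday)

# Pareto ordering: largest count first, with cumulative share in percent
ranked <- sort(by_day, decreasing = TRUE)
cum_pct <- 100 * cumsum(ranked) / sum(ranked)
```

Comparing `by_day` (calendar order) with `ranked` (Pareto order) helps separate statements about the sequence of the week from statements about which days rank highest.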
The data set and the plots give us good insights into the patterns and frequencies of injury events in relation to the Time of events, Age groups of workers, and Weekday.
Time (Trend Analysis): The cumulative plot over time allowed us to analyse the progression and accumulation of injuries. A steady increase in cumulative injuries over time signifies consistent reporting and the need for sustained safety measures. Observing the trend could also help identify periods of increased incidents, which might correspond to operational changes, seasonal activities, or other external factors.
An extra plot of monthly event numbers shows that there were more events from 2012 to 2014.
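Monthly and yearly event numbers of this kind can be tabulated by formatting each event date as a year-month or year key. This is a sketch on made-up dates, not the code behind the plot.

```r
# Hypothetical event dates (not the real data)
dates <- as.Date(c("2012-01-15", "2012-01-20", "2012-02-03", "2013-02-10"))

# Events per calendar month, keyed as "YYYY-MM"
monthly <- table(format(dates, "%Y-%m"))

# Events per calendar year, for comparing whole years
yearly <- table(format(dates, "%Y"))
```

Months with no events do not appear in `monthly`, so a gap-free series would need the missing months filled in with zeros before plotting.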
We also noticed a sudden change in the trend by injury type from mid-2010, which might be related to the re-classification of some injuries from ‘Other’ to ‘Restricted Work Injury (RWI)’.
Age group of workers (Histogram): The histogram and Pareto chart of injuries by age revealed that middle-aged groups, particularly those between 25 and 39, reported the highest frequency of injuries. This could be indicative of the roles, responsibilities, or sheer number of workers in these age brackets. By understanding age-related injury patterns, organisations can tailor safety training and interventions specific to age groups.
Weekday (Pareto Analysis): The Pareto chart of injuries by weekday shows us that a majority of the injuries occurred at the beginning of the workweek, with Tuesday being the most significant day. The weekends reported the fewest injuries, which might be attributed to decreased operational activities or fewer workers on-site.
In summary, this dataset and analysis serve as a valuable tool for understanding the temporal and demographic dimensions of injury events. Recognising these patterns and trends helps us in the formulation of targeted safety protocols, training modules, and preventive measures. By addressing the key areas of concern, such as the start of the workweek or specific age brackets, organisations can proactively work towards reducing the occurrence of future injuries.
It’s important to note that while this conclusion provides an overview based on the data analysis, a more in-depth examination, including other external factors and qualitative data, would provide a comprehensive understanding of the injury patterns and their underlying causes.