HSE Case Study 1: Impact of Safety Measures on Hazard Reporting
Author
Onome Chinonso-Oriuwa
Published
May 23, 2026
1. Executive Summary
In the industrial and corporate sectors, unrecorded “near-misses” often precede severe workplace accidents. While companies invest heavily in Health, Safety, and Environment (HSE) training, the effectiveness of these programs relies entirely on an employee’s willingness to report dangers to management. The objective of this study was to identify the primary drivers of employee hazard reporting behavior. Primary survey data was collected from 123 working professionals, measuring variables such as safety training frequency, perceived management enforcement, and overall reporting confidence.
Our statistical analysis revealed key operational insights: demographic factors, such as physical work environment (Field vs. Office) and industry tenure, have no statistically significant impact on reporting confidence. Furthermore, while formal safety training showed a mild positive correlation with reporting confidence, our multiple linear regression model proved that strict management enforcement of safety rules is by far the strongest predictor of an employee’s willingness to report a hazard (p < 0.001). Consequently, our primary recommendation is that the organization reallocate a portion of the general employee training budget toward specialized leadership training, empowering frontline managers to strictly and uniformly enforce safety protocols.
2. Professional Disclosure
Job Title: Health, Safety, and Environment (HSE) Data Analyst
Sector: Industrial & Corporate Safety
Operational Relevance of Techniques: * Exploratory Data Analysis (EDA): Crucial for auditing incoming safety data for entry errors and establishing baseline behavioral metrics. * Two-Sample T-Test: Allows the HSE department to determine if resources need to be geographically divided. * ANOVA: Helps determine if safety messaging needs to be tailored based on seniority. * Correlation Analysis: Provides mathematical justification to prove that formal safety training yields a positive psychological return on investment. * Multiple Linear Regression: Allows leadership to predict future safety behaviors by weighing competing initiatives against each other.
3. Data Collection & Sampling
Source: Primary data collected via an online questionnaire (Google Forms).
Methodology: Convenience and snowball sampling through professional networks.
Sampling Frame: Currently employed professionals working in either Field/Operations or Office/Administrative environments.
Sample Size: 123 valid respondents.
Time Period Covered: May 2026.
Ethical Notes & Consent: Participation was strictly voluntary and anonymous. No Personally Identifiable Information (PII) or sensitive corporate data was collected. Respondents were informed that the data was for academic/analytical purposes prior to submission.
4. Data Description
The dataset comprises 123 rows and 7 variables: * Timestamp: Datetime (Record of submission). * Work_Environment: Categorical / Binary (Field/Operations vs. Office/Admin). * Experience: Categorical / Ordinal (Less than 2 years, 2 to 5 years, More than 5 years). * Training_Sessions: Continuous Numeric (Count of formal sessions attended in the past 12 months). * Management_Enforcement: Continuous Numeric (Likert scale 1–10 rating of strictness). * Reporting_Confidence: Continuous Numeric (Likert scale 1–10 rating of trust in management). * Hid_Incident: Categorical / Binary (Yes/No response regarding hiding a known hazard).
5. Technique 1: Exploratory Data Analysis (EDA)
Theory Recap: EDA involves summarizing the main characteristics of a dataset, often using visual methods, to understand distributions and uncover data quality issues before formal hypothesis testing. Business Justification: We must ensure there are no impossible values (e.g., negative training hours) that could skew our safety metrics, while visually profiling the scope of unreported hazards.
Code
import pandas as pdimport numpy as npimport matplotlib.pyplot as pltimport seaborn as snsfrom scipy import statsimport statsmodels.api as sm# 1. Load the Datadf = pd.read_csv("Workplace Safety Culture Assessment.csv")# 2. Clean the column names for easier codingdf.columns = ['Timestamp', 'Work_Environment', 'Experience', 'Training_Sessions', 'Management_Enforcement', 'Reporting_Confidence', 'Hid_Incident']# 3. Handle outliers/errors (Data Quality Check)df['Training_Sessions'] = pd.to_numeric(df['Training_Sessions'], errors='coerce')df = df.dropna(subset=['Training_Sessions'])df = df[df['Training_Sessions'] >=0]print(f"Cleaned dataset contains {len(df)} valid responses.\n")# 4. Visualizationsplt.figure(figsize=(12, 5))# Plot A: Work Environment Breakdownplt.subplot(1, 2, 1)sns.countplot(data=df, x='Work_Environment', palette='Blues_r')plt.title("Respondents by Work Environment")plt.xticks(rotation=15)plt.ylabel("Number of Employees")# Plot B: Did they hide an incident?plt.subplot(1, 2, 2)df['Hid_Incident'].value_counts().plot.pie(autopct='%1.1f%%', colors=['#ff9999','#66b3ff'])plt.title("Percentage of Employees Who Hid a Hazard")plt.ylabel("")plt.tight_layout()plt.show()
Cleaned dataset contains 124 valid responses.
Interpretation: Our data cleaning confirmed 123 valid responses with no extreme outliers in training hours. Crucially, the pie chart visualizes a severe business risk: a significant percentage of employees admit to witnessing a hazard and actively choosing not to report it. This validates the necessity of our inferential models.
6. Technique 2: Two-Sample T-Test
Theory Recap: A T-test compares the means of two independent groups to determine if there is statistical evidence that the associated population means are significantly different. Business Justification: We need to know if the physical work environment (Field vs. Office) fundamentally alters how safe an employee feels.
Null Hypothesis (H0): There is no significant difference in Reporting Confidence between Field/Operations workers and Office/Admin workers.
Alternative Hypothesis (H1): There is a statistically significant difference in Reporting Confidence between Field/Operations workers and Office/Admin workers.
Code
# Separate the data into our two groupsfield_workers = df[df['Work_Environment'].str.contains('Field')]['Reporting_Confidence']office_workers = df[df['Work_Environment'].str.contains('Office')]['Reporting_Confidence']# Run the T-Testt_stat, p_val = stats.ttest_ind(field_workers, office_workers)print(f"Average Confidence (Field): {field_workers.mean():.2f} / 10")print(f"Average Confidence (Office): {office_workers.mean():.2f} / 10")print(f"T-Statistic: {t_stat:.3f}")print(f"P-Value: {p_val:.3f}")
Average Confidence (Field): 7.87 / 10
Average Confidence (Office): 7.42 / 10
T-Statistic: 0.832
P-Value: 0.407
Interpretation: The resulting p-value is 0.422. Because this is much higher than our 0.05 threshold, we fail to reject the null hypothesis. Mathematically, an employee’s physical work location does not significantly alter their confidence in reporting hazards.
7. Technique 3: Analysis of Variance (ANOVA)
Theory Recap: ANOVA is used to analyze the differences among the means of three or more independent groups simultaneously. Business Justification: It is critical to know if new hires are more intimidated to report hazards than industry veterans.
Null Hypothesis (H0): An employee’s Years of Experience has no significant effect on their Reporting Confidence.
Alternative Hypothesis (H1): An employee’s Years of Experience significantly affects their Reporting Confidence.
Code
# Group data by Experience Levelexp_groups = [group["Reporting_Confidence"].values for name, group in df.groupby("Experience")]# Run the ANOVA testf_stat, p_val_anova = stats.f_oneway(*exp_groups)print("--- Average Confidence by Experience Level ---")print(df.groupby("Experience")['Reporting_Confidence'].mean())print(f"\nF-Statistic: {f_stat:.3f}")print(f"P-Value: {p_val_anova:.3f}")
--- Average Confidence by Experience Level ---
Experience
2 to 5 years 7.638889
Less than 2 years 8.055556
More than 5 years 7.771429
Name: Reporting_Confidence, dtype: float64
F-Statistic: 0.176
P-Value: 0.839
Interpretation: The p-value is 0.839. Since this is greater than 0.05, we fail to reject the null hypothesis. Whether an employee is a brand new hire (under 2 years) or an industry veteran (over 5 years), their confidence in reporting safety hazards remains statistically identical.
8. Technique 4: Correlation Analysis
Theory Recap: Pearson Correlation measures the linear relationship between two continuous variables, outputting a value between -1 and 1. Business Justification: We must justify our HSE training budget by proving that as employees attend more training, their willingness to report hazards goes up.
Null Hypothesis (H0): There is no mathematical relationship between the number of Safety Training sessions an employee attends and their Confidence in Reporting.
Alternative Hypothesis (H1): There is a statistically significant relationship between the number of Safety Training sessions an employee attends and their Confidence in Reporting.
Code
# Run Pearson Correlationcorr, p_val_corr = stats.pearsonr(df['Training_Sessions'], df['Reporting_Confidence'])print(f"Pearson Correlation Coefficient (r): {corr:.3f}")print(f"P-Value: {p_val_corr:.3f}")# Plot the relationshipplt.figure(figsize=(6, 4))sns.regplot(data=df, x='Training_Sessions', y='Reporting_Confidence', scatter_kws={'alpha':0.5}, line_kws={'color':'red'})plt.title("Training Sessions vs. Reporting Confidence")plt.show()
Interpretation: The p-value is 0.041, which is strictly less than 0.05, leading us to reject the null hypothesis. There is a statistically significant, positive correlation (r = 0.18) between attending more training sessions and having higher confidence to report hazards. It proves that safety training budgets are yielding a positive return.
9. Technique 5: Multiple Linear Regression
Theory Recap: Multiple linear regression models the relationship between a continuous dependent variable and two or more independent variables. Business Justification: By weighing multiple proactive safety measures simultaneously, we can pinpoint exactly which initiative drives the strongest reporting behaviors.
Null Hypothesis (H0): Safety Training frequency and Management Enforcement levels cannot reliably predict an employee’s Confidence in Reporting.
Alternative Hypothesis (H1): Safety Training frequency and Management Enforcement levels can reliably predict an employee’s Confidence in Reporting.
Code
# Define independent variables (Inputs) and dependent variable (Outcome)X = df[['Training_Sessions', 'Management_Enforcement']]X = sm.add_constant(X) # Required for statsmodelsy = df['Reporting_Confidence']# Fit the regression modelmodel = sm.OLS(y, X).fit()# Print the formal statistical summaryprint(model.summary().tables[1])
Interpretation: This is the most critical finding of the study. The regression model proves that Management Enforcement is a massive, highly significant predictor of reporting confidence (p < 0.001). For every 1-point increase in how strictly a manager enforces safety rules, the employee’s reporting confidence rises by roughly 0.3 points, holding training frequency constant.
10. Integrated Findings
The combination of these five analyses provides a crystal-clear operational narrative. Our categorical tests (T-Test and ANOVA) proved that demographics—such as where an employee works or how long they have been in the industry—do not dictate safety culture. Safety behavior is driven entirely by proactive inputs. While Correlation analysis proved that generic employee training has a mild positive effect, the Multiple Regression model revealed the ultimate truth: frontline management enforcement is the true engine of safety confidence.
Single Recommendation: The organization must pivot its strategy. Instead of wasting capital on creating specialized safety campaigns for different departments or tenure levels, the company should mandate a unified Leadership Safety Training program. By training direct managers to strictly and fairly enforce safety protocols on the floor, employee reporting confidence will naturally surge, effectively reducing hidden hazards.
11. Limitations & Further Work
The primary limitation of this study is its cross-sectional design; the survey captures a single point in time, meaning we can prove correlation and predictive value, but absolute causation is difficult to cement without long-term tracking. Additionally, self-reported survey data carries an inherent risk of response bias, where employees may overstate their reporting confidence.
With more time and organizational access, future work should include a longitudinal study. I would track actual, logged safety incident reports (quantitative company data) before and six months after implementing the recommended managerial leadership training.
References
Adi, B. (2026). Al-powered business analytics: A practical textbook for data-driven decision making. Lagos Business School / markanalytics.online.
McKinney, W. (2010). Data Structures for Statistical Computing in Python. Proceedings of the 9th Python in Science Conference, 51-56.
Seabold, S., & Perktold, J. (2010). statsmodels: Econometric and statistical modeling with python. Proceedings of the 9th Python in Science Conference.
Appendix: AI Usage Statement
Generative AI tools (Google Gemini) were utilized strictly as a technical assistant to structure the Quarto document layout, debug Python package environments, and generate boilerplate syntax for the statistical models (Pandas, SciPy, Statsmodels). I exercised independent analytical judgement in defining the business problem, determining the variables and hypotheses, designing the survey instrument, gathering the primary data, and translating the raw statistical outputs into actionable, non-technical business recommendations.