[Interactive Dashboard: https://ammarey.shinyapps.io/PROYECTO_RStudio/]

1 REPORT

1.1 INTRODUCTION

Safety and well-being in the workplace are fundamental aspects to ensure healthy working conditions and protect the integrity of workers.

In Spain, like in other countries, occupational accidents represent a significant concern for both employees and organizations.This report will conduct a detailed analysis of the incidence rates of occupational accidents with sick leave in the construction sector in Spain, aiming to examine the current situation and propose measures to improve workplace safety in the country. The report will focus on occupational safety in agriculture, services, and industry in Spain, organized by autonomous communities, from 2009 to 2021. The Shiny App platform and R Studio will be used to visualize the accident rates during this period. Furthermore, trends over time will be assessed to identify patterns and evaluate the impact of safety measures implemented to date. The results of this analysis will provide valuable information for policy development, implementation of preventive measures, and promotion of a culture of workplace safety in Spain. With a better understanding of the underlying factors contributing to occupational accidents, it will be possible to develop more effective strategies to prevent accidents and protect the safety and well-being of workers. Additionally, this report presents the findings, discusses the implications, and provides evidence-based recommendations. The objective is to uncover differences among the autonomous communities and provide more efficient guidance and training in specific areas in the future, in order to reduce the overall incidence of occupational accidents in Spain.

1.2 BACKGROUND

1.2.1 WORK CONTEXT

Workplace safety and accident prevention have been constant concerns at both national and international levels. In Spain, various regulations and norms have been established to ensure a safe working environment and protect the health and well-being of workers.

The National Institute of Safety and Health at Work (INSST) is the organization responsible for promoting and developing occupational risk prevention measures in Spain. In addition, there are specific laws and regulations, such as the Law on Occupational Risk Prevention (Law 31/1995), which establish the obligations and responsibilities of employers and workers to guarantee a safe working environment. Previous research on workplace accident rates in Spain has addressed different aspects, such as the most affected sectors, the most common types of injuries, and associated risk factors. These studies have provided a solid foundation for the implementation of prevention measures and raising awareness about the importance of workplace safety. This report builds upon these efforts and aims to visualize the analysis of workplace accident rates in Spain. By reviewing and evaluating available data, the goal is to identify patterns, trends, and persistent challenges in this field. Based on this information, concrete recommendations and measures will be proposed to improve workplace safety and reduce the incidence of accidents in the country.

1.2.2 STATE OF THE ART

  1. CALOPA RUESTA, A., 2020. Analysis of the evolution of occupational accident rates in Spain between 2000 and 2018. S.l.: Universitat Politècnica de Catalunya. [Keywords: Incidence rate, Occupational accident rate, Economic cycle]

  2. Gallego Blasco, V. S. (2021). Analysis of the incidence of causal factors in the evolution of occupational accidents in Spain (Doctoral dissertation, Universitat Politècnica de València). [Keywords: Trend analysis, Breakpoints, Generalized regression models, Generalized linear models, Occupational risk prevention, Occupational health, Sociosanitary indicators, Risk indicators, Occupational accidents]

  3. Sievert, C. (2020). Interactive web-based data visualization with R, plotly, and shiny. CRC Press. I[Keywords: Shiny app, rmarkdown, web-based graphics, r package, ggplot, web technologies, graphical data analysis, data visualization]

1.3 METHODOLOGY

To carry out this work focused on the study of the evolution of the incidence rate of occupational accidents with sick leave in the construction sector of Spain in the period 2009-2021, the databases of the Ministry of Labor and Social Economy through the Secretary of State for Employment and Social Economy were consulted.

[source: https://www.mites.gob.es/estadisticas/eat/welcome.htm]

The data obtained for this work cover the period from 2009 to 2021, and all the information is disaggregated at the level of autonomous communities. Through the consultation of the Ministry of Labor and Social Economy, it was possible to obtain information for the three variables analyzed in this work: the number of occupational accidents with sick leave that occurred in companies operating in Spain, the incidence rate of occupational accidents with sick leave in the construction sector by autonomous community and sector of activity, which is defined as the total number of occupational accidents with sick leave per 100,000 workers. The analysis methodology in this work will involve using R Studio and Shiny to visualize the trend of accident rates in each autonomous community of Spain and compare them with the national average.

The methodology will consist of the following steps:

  1. Data collection: The necessary data on accident rates for each autonomous community in Spain, as well as the aggregated national data, will be obtained.

  2. Data preparation and cleaning: The collected data will be imported into R Studio, and necessary cleaning and preparation operations will be performed. This may include handling missing values, standardizing formats, and creating additional variables if needed.

  3. Development of the Shiny application: Using the Shiny package, an interactive web application will be created that allows for visualizing the trend of accident rates in each autonomous community and comparing them with the national average. The application will include interactive graphics, dynamic tables, and filters to facilitate data exploration by the user.

  4. Implementation and deployment: The Shiny application will be implemented and deployed in a web environment to be accessible through a browser. This can be done using hosting services or specific platforms for Shiny application deployment.

  5. Analysis and conclusions: Once the application is up and running, it will be used to analyze the trend of accident rates in each autonomous community and make comparisons with the national average. Relevant conclusions will be drawn, and significant findings will be highlighted.

The combination of R Studio and Shiny provides a powerful methodology for data analysis and interactive visualization. By leveraging these tools, you will be able to dynamically explore accident rates in Spain and gain a deeper understanding of the trends and differences between autonomous communities and the national average.

1.4 RESULTS

Through the analysis of visual data using R Studio and Shiny, we can clearly obtain the following results:

1. The evolution of the incidence rate

The evolution of the incidence rate of occupational accidents with sick leave in the construction sector of Spain from 2009 to 2021. This indicator, in the overall context of Spain, decreased significantly from 2009 to 2012, increased slowly from 2012 to 2019, and experienced a considerable increase from 2019 to 2020. From 2020 to 2021, it shows an upward trend. The magnitude of the increase and decrease varies across different autonomous communities, but the general trend is similar to the overall trend in Spain.

Image 1: The evolution of the incidence rate of work-related accidents with sick leave in the construction sector in Spain during the period 2006-2021.

Image 2: The evolution of the incidence rate of work-related accidents with sick leave in the construction sector of Cataluña during the period 2009-2021.

2. Comparison to the industrial, agricultural, and service sectors

From the perspective of different sectors, the construction sector has a higher incidence rate of occupational accidents with sick leave compared to the industrial, agricultural, and service sectors. The ranking of these sectors also changes accordingly. Judging by the increase or decrease during this period, the construction sector has the largest range, with the highest value reaching 8980.4 in 2009 and the lowest value at 5804.1 in 2020. The service sector has the smallest range, with the highest value at 3141.4 in 2009 and the lowest value at 1828.3 in 2020.

Image 3: Contrast among various sectors.

3. Differences in the incidence rates

Through the map of different autonomous communities depicting this index, we can quickly compare the differences in the incidence rates among various autonomous communities in the same period, taking the year 2009 as an example. In terms of the overall index, Castilla-La Mancha and Asturias have relatively high rates. Madrid and Zaragoza have low rates. In the agricultural sector, Valencia, the Basque Country, and Extremadura have high rates, while La Rioja has a low rate. In the construction sector, Castilla-La Mancha and the Basque Country have high rates, while Aragon has a low rate. Regarding the industrial sector, Asturias has a high rate, while Valencia, Madrid, and Aragon have lower rates. In the services sector, Castilla-La Mancha has a higher rate, while Aragon has a lower rate.

Image 4: Map showing the total incidence rate of work-related accidents for each autonomous community in the year 2009.
Click here to see the work-related accidents for each sector map

Image 5: Maps depicting the incidence rate of work-related accidents for each sector in each autonomous community in the year 2009.

1.5 CONCLUSIONS

By analyzing the evolution of the incidence rate of occupational accidents with sick leave in the construction sector in Spain during this period and comparing it with the GDP (Gross Domestic Product) index in Spain, it is evident that the changes in the incidence rate of accidents are consistent with the changes in the GDP index. This demonstrates that the incidence rate of accidents will fluctuate with changes in economic activities, policies, and volatility. The two significant drops correspond to the global financial crisis in 2009 and the COVID-19 pandemic in 2019.

Castilla-La Mancha and the Basque Country exhibit a high risk of occupational accidents in the construction sector, indicating the presence of precarious working conditions or a lack of a safety culture. To improve the situation in Castilla-La Mancha and the Basque Country, specific measures need to be implemented to reduce the incidence rate of occupational accidents in the construction sector.

The construction sector is particularly vulnerable in terms of occupational safety. This vulnerability may stem from the nature of activities performed in this sector, which often involve the use of heavy machinery, work at heights, and exposure to various risks such as falls, impacts, or entrapments. It is necessary to improve the situation in the construction sector and reduce its incidence rate of occupational accidents. Strengthening policies, enhancing training and awareness, adopting safe technologies, and promoting a safety culture can help reduce the incidence rate of occupational accidents in the construction sector and improve working conditions in this industry.

1.6 REFERENCES

Calopa Ruesta, A. (2020). Análisis de la evolución de la tasa de accidentabilidad laboral en España entre 2000 y 2018. S.l.: Universitat Politècnica de Catalunya. Recuperado de https://upcommons.upc.edu/handle/2117/331955

Gallego Blasco, V. S. (2021). Análisis de la incidencia de factores causales en la evolución de la siniestralidad laboral en España [Doctoral dissertation, Universitat Politècnica de València]. Recuperado de https://riunet.upv.es/handle/10251/168774

Sievert, C. (2020). Interactive web-based data visualization with R, plotly, and shiny. CRC Press. Recuperado de https://plotly-r.com/

Ministerio de Trabajo y Economía Social a través de la Secretaría de Estado de Empleo y Economía Social. (n.d.). Recuperado de https://www.mites.gob.es/estadisticas/eat/welcome.htm

Sallan, J. M. (2021). Paquete ESdata: PIB. Recuperado de https://rpubs.com/jmsallan/ESdata_PIB

RStudio. (2020). Shiny from RStudio: Share your apps. Recuperado de https://shiny.rstudio.com/tutorial/written-tutorial/lesson7/

2 REPORT DEVELOPMENT

The research was conducted using R Studio and various related packages, which facilitated the analysis and visualization of the data. The data utilized in the research was obtained from the mites.gob website, a comprehensive database specific to Spain that provided the necessary information for our study.

Those packages included:

library(tidyverse)
library(tidymodels)
library(xlsx)
library(knitr)
library(kableExtra)
library(tinytex)
library(AER)
library(sf)
library(ESdata)
library(rmarkdown)

The research process involved several steps to import and analyze the data using R Studio and related packages. The initial step was to download the necessary data file from the mites.gob website, which serves as a comprehensive database for Spain. Once the file was obtained, it was imported into R Studio.

To ensure seamless integration of the data, we carefully read and organized it. One crucial aspect was creating a list of communities, which acted as a reference for consistency across different datasets. This step facilitated effective comparisons and analysis between various regions. The data import process was carried out using the read.xlsx() function in R, enabling us to specify the relevant page and lines to be extracted from the Excel file provided by the Spanish database. This function was executed multiple times, each time focusing on different sections of the data related to specific sectors. By importing and structuring the data in this manner, we established a solid foundation for further analysis and exploration. The comprehensive datasets for each sector allowed for in-depth investigations and comparisons, ensuring a thorough examination of the incidence rates of work-related accidents in the construction sector across different autonomous communities in Spain.

list <- c("Andalucía",
          "Aragón",
          "PrincipadodeAsturias",
          "IslasBaleares",
          "IslasCanarias",
          "Cantabria",
          "Castilla-LaMancha",
          "CastillayLeón",
          "Cataluña",
          "ComunidadValenciana",
          "Extremadura",
          "Galicia",
          "ComunidaddeMadrid",
          "RegióndeMurcia",
          "ComunidadForaldeNavarra",
          "PaísVasco",
          "LaRioja",
          "CeutayMelilla")

1. AGRARIO

data_agrario2009 <- read.xlsx("ATR_2014_I.xls", sheetName = "ATR-I.1.3", startRow = 31, endRow = 49)

data_agrario2009 <- data_agrario2009 |> 
  select(NA., NA..1, NA..2) |>
  rename("Comunidades_Autonoma" = "NA.",
         "2009" = "NA..1",
         "2010" = "NA..2")

data_agrario2009$Comunidades_Autonoma <- list

data_agrario2011 <- read.xlsx("ATR_2021_I.xlsx", sheetName = "ATR-I.1.3", startRow = 31, endRow = 49)

data_agrario2011 <- data_agrario2011 |> 
  rename("Comunidades_Autonoma" = "NA.",
         "2011" = "NA..1",
         "2012" = "NA..2",
         "2013" = "NA..3",
         "2014" = "NA..4",
         "2015" = "NA..5",
         "2016" = "NA..6",
         "2017" = "NA..7",
         "2018" = "NA..8",
         "2019" = "NA..9",
         "2020" = "NA..10",
         "2021" = "NA..11")

data_agrario2011$Comunidades_Autonoma <- list

data_agrario <- full_join(data_agrario2009, data_agrario2011, by = "Comunidades_Autonoma")

data_agrario <-data_agrario |>
  pivot_longer(!Comunidades_Autonoma, names_to = "year", values_to = "value") |>
  mutate("Series" = "Agrario")
data_agrario$year <- as.numeric(data_agrario$year)
paged_table(data_agrario)

2. CONSTRUCTION

data_construccion2009 <- read.xlsx("ATR_2014_I.xls", sheetName = "ATR-I.1.3", startRow = 73, endRow = 91)

data_construccion2009 <- data_construccion2009 |> 
  select(NA., NA..1, NA..2) |>
  rename("Comunidades_Autonoma" = "NA.",
         "2009" = "NA..1",
         "2010" = "NA..2")

data_construccion2009$Comunidades_Autonoma <- list

data_construccion2011 <- read.xlsx("ATR_2021_I.xlsx", sheetName = "ATR-I.1.3", startRow = 73, endRow = 91)

data_construccion2011 <- data_construccion2011 |> 
  rename("Comunidades_Autonoma" = "NA.",
         "2011" = "NA..1",
         "2012" = "NA..2",
         "2013" = "NA..3",
         "2014" = "NA..4",
         "2015" = "NA..5",
         "2016" = "NA..6",
         "2017" = "NA..7",
         "2018" = "NA..8",
         "2019" = "NA..9",
         "2020" = "NA..10",
         "2021" = "NA..11")

data_construccion2011$Comunidades_Autonoma <- list

data_construccion <- full_join(data_construccion2009, data_construccion2011, by = "Comunidades_Autonoma")

data_construccion <-data_construccion |>
  pivot_longer(!Comunidades_Autonoma, names_to = "year", values_to = "value") |>
  mutate("Series" = "Construcción")
data_construccion$year <- as.numeric(data_construccion$year)
paged_table(data_construccion)

3. INDUSTRIAL

data_industrial2009 <- read.xlsx("ATR_2014_I.xls", sheetName = "ATR-I.1.3", startRow = 52, endRow = 70)

data_industrial2009 <- data_industrial2009 |> 
  select(NA., NA..1, NA..2) |>
  rename("Comunidades_Autonoma" = "NA.",
         "2009" = "NA..1",
         "2010" = "NA..2")

data_industrial2009$Comunidades_Autonoma <- list

data_industrial2011 <- read.xlsx("ATR_2021_I.xlsx", sheetName = "ATR-I.1.3", startRow = 52, endRow = 70)

data_industrial2011 <- data_industrial2011 |> 
  rename("Comunidades_Autonoma" = "NA.",
         "2011" = "NA..1",
         "2012" = "NA..2",
         "2013" = "NA..3",
         "2014" = "NA..4",
         "2015" = "NA..5",
         "2016" = "NA..6",
         "2017" = "NA..7",
         "2018" = "NA..8",
         "2019" = "NA..9",
         "2020" = "NA..10",
         "2021" = "NA..11")

data_industrial2011$Comunidades_Autonoma <- list

data_industrial <- full_join(data_industrial2009, data_industrial2011, by = "Comunidades_Autonoma")

data_industrial <-data_industrial |>
  pivot_longer(!Comunidades_Autonoma, names_to = "year", values_to = "value") |>
  mutate("Series" = "Industria")
data_industrial$year <- as.numeric(data_industrial$year)
paged_table(data_industrial)

4. SERVICSE

data_servicio2009 <- read.xlsx("ATR_2014_I.xls", sheetName = "ATR-I.1.3", startRow = 94, endRow = 112)

data_servicio2009 <- data_servicio2009 |> 
  select(NA., NA..1, NA..2) |>
  rename("Comunidades_Autonoma" = "NA.",
         "2009" = "NA..1",
         "2010" = "NA..2")

data_servicio2009$Comunidades_Autonoma <- list

data_servicio2011 <- read.xlsx("ATR_2021_I.xlsx", sheetName = "ATR-I.1.3", startRow = 94, endRow = 112)

data_servicio2011 <- data_servicio2011 |> 
  rename("Comunidades_Autonoma" = "NA.",
         "2011" = "NA..1",
         "2012" = "NA..2",
         "2013" = "NA..3",
         "2014" = "NA..4",
         "2015" = "NA..5",
         "2016" = "NA..6",
         "2017" = "NA..7",
         "2018" = "NA..8",
         "2019" = "NA..9",
         "2020" = "NA..10",
         "2021" = "NA..11")

data_servicio2011$Comunidades_Autonoma <- list

data_servicio <- full_join(data_servicio2009, data_servicio2011, by = "Comunidades_Autonoma")

data_servicio <-data_servicio |>
  pivot_longer(!Comunidades_Autonoma, names_to = "year", values_to = "value") |>
  mutate("Series" = "Servicios")
data_servicio$year <- as.numeric(data_servicio$year)
paged_table(data_servicio)

5. SPAIN, by sector

data_espana2009 <- read.xlsx("ATR_2014_I.xls", sheetName = "ATR-I.1.1", startRow = 10, endRow = 14)

data_espana2009 <- data_espana2009 |> 
  select(SECTOR, NA..1, NA..2, NA..3, NA..4, NA..5) |>
  rename("Series" = "SECTOR",
         "2006" = "NA..1",
         "2007" = "NA..2",
         "2008" = "NA..3",
         "2009" = "NA..4",
         "2010" = "NA..5")

data_espana2011 <- read.xlsx("ATR_2021_I.xlsx", sheetName = "ATR-I.1.1", startRow = 11, endRow = 15)

data_espana2011 <- data_espana2011 |> 
  select(-NA.) |>
  rename("Series" = "SECTOR",
         "2011" = "NA..1",
         "2012" = "NA..2",
         "2013" = "NA..3",
         "2014" = "NA..4",
         "2015" = "NA..5",
         "2016" = "NA..6",
         "2017" = "NA..7",
         "2018" = "NA..8",
         "2019" = "NA..9",
         "2020" = "NA..10",
         "2021" = "NA..11")

data_espana <- full_join(data_espana2009, data_espana2011, by = "Series")

data_espana <-data_espana |>
  pivot_longer(!Series, names_to = "year", values_to = "value") |>
  mutate("Comunidades_Autonoma" = "Tot_España")
data_espana$year <- as.numeric(data_espana$year)
paged_table(data_espana)

6. SPAIN, Total

data_espana2009_total <- read.xlsx("ATR_2014_I.xls", sheetName = "ATR-I.1.1", startRow = 7, endRow = 8)

data_espana2009_total <- data_espana2009_total |> 
  select(NA..1, X..2006, X..2007, X..2008, X..2009, X..2010) |>
  rename("Series" = "NA..1",
         "2006" = "X..2006",
         "2007" = "X..2007",
         "2008" = "X..2008",
         "2009" = "X..2009",
         "2010" = "X..2010")

data_espana2011_total <- read.xlsx("ATR_2021_I.xlsx", sheetName = "ATR-I.1.1", startRow = 8, endRow = 9)

data_espana2011_total <- data_espana2011_total |> 
  select(-NA., -NA..13, -NA..14) |>
  rename("Series" = "NA..1",
         "2011" = "NA..2",
         "2012" = "NA..3",
         "2013" = "NA..4",
         "2014" = "NA..5",
         "2015" = "NA..6",
         "2016" = "NA..7",
         "2017" = "NA..8",
         "2018" = "NA..9",
         "2019" = "NA..10",
         "2020" = "NA..11",
         "2021" = "NA..12")

data_espana_total <- full_join(data_espana2009_total, data_espana2011_total, by = "Series")

data_espana_total <-data_espana_total |>
  pivot_longer(!Series, names_to = "year", values_to = "value") |>
  mutate("Comunidades_Autonoma" = "Tot_España")
data_espana_total$year <- as.numeric(data_espana_total$year)
paged_table(data_espana_total)

7. COMMUNITIES, All sectors

data_comunidad2009_total <- read.xlsx("ATR_2014_I.xls", sheetName = "ATR-I.1.3", startRow = 10, endRow = 28)

data_comunidad2009_total <- data_comunidad2009_total |>
  select(NA., NA..1, NA..2) |>
  rename("Comunidades_Autonoma" = "NA.",
         "2009" = "NA..1",
         "2010" = "NA..2")

data_comunidad2009_total$Comunidades_Autonoma <- list

data_comunidad2011_total <- read.xlsx("ATR_2021_I.xlsx", sheetName = "ATR-I.1.3", startRow = 10, endRow = 28)

data_comunidad2011_total <- data_comunidad2011_total |>
  rename("Comunidades_Autonoma" = "NA.",
         "2011" = "NA..1",
         "2012" = "NA..2",
         "2013" = "NA..3",
         "2014" = "NA..4",
         "2015" = "NA..5",
         "2016" = "NA..6",
         "2017" = "NA..7",
         "2018" = "NA..8",
         "2019" = "NA..9",
         "2020" = "NA..10",
         "2021" = "NA..11")

data_comunidad2011_total$Comunidades_Autonoma <- list

data_comunidad_total <- full_join(data_comunidad2009_total, data_comunidad2011_total, by = "Comunidades_Autonoma")

data_comunidad_total <-data_comunidad_total |>
  pivot_longer(!Comunidades_Autonoma, names_to = "year", values_to = "value") |>
  mutate("Series" = "TOTAL")
data_comunidad_total$year <- as.numeric(data_comunidad_total$year)
paged_table(data_comunidad_total)

After importing and organizing the data for each sector from the respective files, the next step in the research process was to merge and combine all the datasets. This merging process aimed to create a comprehensive database that encompassed all the relevant information necessary for the analysis.

Having a unified database provided us with the flexibility to apply various filters and criteria aligned with the specific research needs. Whether it involved examining the incidence rates across different regions, comparing sector-specific trends, or analyzing the impact of external factors, the merged database allowed for targeted and efficient data retrieval.

data_frame <- bind_rows(data_agrario, data_construccion, data_industrial, data_servicio, data_espana, data_espana_total, data_comunidad_total)
paged_table(data_frame)

Once the data was prepared and organized in the desired format, it was exported from R Studio to create an archive that is readily accessible for future reference or to be shared with other team members or collaborators. Exporting the data in a suitable format ensures that it can be easily read and interpreted by others, regardless of whether they have access to R Studio or the related packages.

In this research process, we employed functions such as write.csv() to export the data to commonly used file formats such as CSV (Comma-Separated Values). These formats allow for easy integration with other software tools and facilitate data sharing and collaboration.

By exporting the prepared data, we not only preserve the integrity of the research findings but also enable the replication of the analysis in the future. This ensures that the research process can be validated and serves as a foundation for further investigations or improvements. Additionally, sharing the exported data with the team or other stakeholders enhances collaboration and enables collective decision-making based on the research outcomes. It promotes transparency, allows for discussions, and facilitates the dissemination of knowledge. Overall, exporting the prepared data in a suitable format provides a convenient and accessible means of archiving, sharing, and utilizing the research findings, both within the team and beyond.

data <- read_csv(file = "archivo_indice_de_incidencia", col_names = TRUE)