February 20, 2017

Overview

In Thailand, there are many car accidents during the New Year festival every year, because a lot of people travel back to their hometowns in the countryside. Sadly, most of these accidents occur because people drive while drunk.

This presentation uses the 'Plotly' R package to draw an interactive 3D scatter plot of the number of people who died or were injured during the New Year festival (Dec 27 to Jan 5) from 2008 to 2015.

Load Data

The data set can be downloaded from Thailand Open Data: https://data.go.th/DatasetDetail.aspx?id=7d61f508-d2e1-4f0c-8408-dfde29f111f5

After downloading, I manually converted the dataset file from xls to csv because I had problems with the R packages xlsx and rJava. The code below assumes that the csv file is in the current working directory.
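As a possible alternative to the manual conversion, the readxl package reads xls files without depending on Java. The sketch below is only an illustration: the helper name `read_accident_data` is hypothetical, the xls file name is assumed to match the csv name, and the column layout that readxl returns for this file has not been checked against the csv version.

```r
# Hypothetical helper (not part of the original slides): prefer the CSV,
# otherwise fall back to the .xls file if the readxl package is installed.
read_accident_data <- function(csv_path,
                               xls_path = sub("\\.csv$", ".xls", csv_path)) {
  if (file.exists(csv_path)) {
    # Same call as in the slides, with factors requested explicitly
    return(read.csv(csv_path, header = TRUE, stringsAsFactors = TRUE))
  }
  if (file.exists(xls_path) && requireNamespace("readxl", quietly = TRUE)) {
    # read_excel() returns a tibble; convert it to a plain data frame
    return(as.data.frame(readxl::read_excel(xls_path)))
  }
  stop("Could not read ", csv_path, " or ", xls_path)
}

# Usage (assumes one of the two files is in the working directory):
# rawData <- read_accident_data("51-58_CutName_NewYear_Edit.csv")
```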

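# Load plotly quietly (it prints several start-up messages otherwise)
suppressPackageStartupMessages(library(plotly))
# The csv file converted by hand from the downloaded xls file
filename <- "51-58_CutName_NewYear_Edit.csv"
rawData <- read.csv(filename, header = TRUE)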

Preprocessing Data

The data needs some cleaning, and the Thai labels need to be converted to English.

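# Keep only the year, date, and casualty-type columns
tidyData <- rawData[, c(1, 6, 18)]
names(tidyData) <- c("year", "new_year_date", "died_or_injured")
# Make all three columns factors so that their levels can be relabelled
# (since R 4.0, read.csv no longer creates factors by default)
tidyData$year <- as.factor(tidyData$year)
tidyData$new_year_date <- as.factor(tidyData$new_year_date)
tidyData$died_or_injured <- as.factor(tidyData$died_or_injured)
# Collapse the seven Thai casualty labels into "died" and "injured"
levels(tidyData$died_or_injured) <-
        c("died", "died", "died", "died", "died", "died", "injured")
# Relabel the eight year levels as 2008 to 2015
levels(tidyData$year) <-
        c("2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015")
# Indicator columns: 1 if the person died / was injured, 0 otherwise
tidyData$die <- 0
tidyData$die[tidyData$died_or_injured == "died"] <- 1
tidyData$injure <- 0
tidyData$injure[tidyData$died_or_injured == "injured"] <- 1
# Drop rows whose date is recorded as 0, then remove the unused level
tidyData <- tidyData[!(tidyData$new_year_date == '0'),]
tidyData$new_year_date <- droplevels(tidyData$new_year_date)
levels(tidyData$new_year_date) <- c("Jan_1","Jan_2","Jan_3","Jan_4","Jan_5","Dec_27","Dec_28","Dec_29","Dec_30","Dec_31")
# Count deaths and injuries for each date within each year
plotData <- aggregate(cbind(die, injure) ~ new_year_date + year, data = tidyData, sum)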
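To make the relabelling and aggregation steps easier to follow, here is a small sketch on made-up data. The data frame `toy` and all of its values are invented for illustration and are not taken from the real data set; the letters "a", "b", and "c" stand in for the Thai casualty labels.

```r
# Made-up stand-in for the three selected columns (not real data)
toy <- data.frame(
  year            = factor(c("2551", "2551", "2552", "2552", "2552")),
  new_year_date   = factor(c(31, 31, 1, 1, 1)),
  died_or_injured = factor(c("a", "b", "c", "c", "a"))
)

# Assigning a shorter set of names to levels() merges levels:
# "a" and "b" both become "died", "c" becomes "injured"
levels(toy$died_or_injured) <- c("died", "died", "injured")
levels(toy$year) <- c("2008", "2009")

# 0/1 indicator columns, built the same way as in the slides
toy$die <- 0
toy$die[toy$died_or_injured == "died"] <- 1
toy$injure <- 0
toy$injure[toy$died_or_injured == "injured"] <- 1

# Summing the indicators per date and year gives the casualty counts
toyCounts <- aggregate(cbind(die, injure) ~ new_year_date + year,
                       data = toy, sum)
print(toyCounts)
```

Each row of `toyCounts` is one date within one year, which is the same shape as `plotData` above.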

3D Scatter plot

# One trace per date: x = year, y = deaths, z = injuries
plot_ly(plotData, x = ~year, y = ~die, z = ~injure,
        type = "scatter3d", mode = "lines+markers+text",
        color = ~new_year_date) %>%
        layout(title = "Car Accident Deaths and Injuries During the New Year Festival in Thailand")

Conclusion

I know the plot might be easier to understand in 2D, but I just wanted to try 3D.
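For comparison, a 2D version could look roughly like the sketch below. It uses a small made-up data frame (`toyPlot`, invented values) in place of `plotData`, and shows only deaths; injuries would need a second plot or a second axis.

```r
suppressPackageStartupMessages(library(plotly))

# Made-up stand-in for plotData (not real figures)
toyPlot <- data.frame(
  year          = factor(rep(c("2008", "2009"), each = 2)),
  new_year_date = factor(rep(c("Dec_31", "Jan_1"), times = 2)),
  die           = c(10, 12, 8, 9)
)

# One line per date, with year on the x axis and deaths on the y axis
p2d <- plot_ly(toyPlot, x = ~year, y = ~die, color = ~new_year_date,
               type = "scatter", mode = "lines+markers")
p2d <- layout(p2d,
              title = "Deaths by year and date (illustrative data)",
              xaxis = list(title = "Year"),
              yaxis = list(title = "Deaths"))
p2d
```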

From the plot, Dec 31 and Jan 1 have the highest numbers of deaths and injuries in every year. The numbers appear to decrease from 2009 to 2014 but, unfortunately, rise again in 2015.
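A reading like this can also be checked numerically rather than by eye. The sketch below finds, for each year, the date with the largest combined count. It runs on made-up numbers (`toyTotals`, invented values); with the real data, `plotData` would take its place.

```r
# Made-up stand-in for plotData (not real figures)
toyTotals <- data.frame(
  year          = rep(c("2008", "2009"), each = 3),
  new_year_date = rep(c("Dec_30", "Dec_31", "Jan_1"), times = 2),
  die           = c(5, 9, 7, 4, 6, 8),
  injure        = c(50, 90, 70, 40, 60, 80)
)
toyTotals$total <- toyTotals$die + toyTotals$injure

# Split by year and keep the row with the largest total in each group
peak <- do.call(rbind, lapply(split(toyTotals, toyTotals$year),
                              function(d) d[which.max(d$total), ]))
print(peak[, c("year", "new_year_date", "total")])
```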