Survival Models
Final Semester Exam
Contact: naufal3433@gmail.com
Instagram: https://www.instagram.com/m_naufalardiansyah/
RPubs: https://rpubs.com/muhamad_naufal/
Introduction
This report aims to improve understanding of shark attacks in South Africa. It does so through visualizations, which are intended to encourage more informed discussion about why and where shark encounters occur.
#### Shark attack data ####
library(ggplot2)
library(DT)
## Load the dataset ##
attacks <- read.csv("attacks.csv")
## Get data for South Africa ##
SA <- subset(attacks, Country == "SOUTH AFRICA")
## Show the full dataset as an interactive table ##
datatable(attacks)
## Warning in instance$preRenderHook(instance): It seems your data is too big for
## client-side DataTables. You may consider server-side processing:
## https://rstudio.github.io/DT/server.html
First look at the data
To begin with, the data from 1900 until 2016 is examined to determine whether there are any trends over time. The graph below also shows whether the number of attacks differs between males and females.
##Attacks after 1900##
ggplot(SA[SA$Year > 1900,], aes(Year, fill = Sex)) + geom_bar() + ggtitle("Shark attacks in South Africa from 1900")
Interpretation of the graph above: The number of shark attacks does appear to have increased substantially since 1900. The majority of attacks involve males. There could be several reasons for this; for example, most surfers and spear fishermen may be male, which would increase their chances of encountering a shark.
The most recent shark attacks are examined next. Only the attacks after 2000 are visualized below.
ggplot(SA[SA$Year > 2000,], aes(Year, fill = Type)) + geom_bar() + ggtitle("Shark attacks in SA from 2000")
Interpretation of the graph above: The number of shark attacks varies considerably from year to year after 2000; there can be twelve attacks in one year and three in the next. Most shark attacks after 2000 are unprovoked, and only a small proportion are provoked.
Loess Smoothing
## loess smoothing ##
years = SA$Year
year.freq = table(years)
year.freq.matrix = as.matrix(year.freq) ## Create data set for loess. y values
year.freq.matrix = year.freq.matrix[-1,] # drop the first year entry
year.freq.matrix = as.matrix(year.freq.matrix)
years.unique = sort(unique(years)) # x values
years.unique = as.matrix(years.unique)
years.unique = years.unique[-1,] # drop the same first entry so x and y match
#create plot#
qplot(years.unique,year.freq.matrix,geom=c("point","smooth"))+ ylab("Number of Attacks") + xlab("Year")+ggtitle("Shark attacks in South Africa 1850-2016")
## Warning: `qplot()` was deprecated in ggplot2 3.4.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Interpretation of the graph above: The number of shark attacks appears to follow an increasing trend from 1900 until approximately 1990. Since the beginning of the 21st century, the trend in the number of shark attacks has been decreasing.
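The tabulate-then-smooth steps above can also be sketched in base R with `table()`, `as.data.frame()`, and `loess()`. This is only a minimal sketch on simulated years, not the report's data: `toy_years`, `freq`, and `fit` are hypothetical names, and the `span` value is an assumption rather than a setting taken from the plot above.

```r
# Simulated attack years (hypothetical data, not taken from attacks.csv)
set.seed(1)
toy_years <- sample(1950:2016, size = 400, replace = TRUE)

# Count attacks per year and turn the table into a data frame
freq <- as.data.frame(table(year = toy_years), stringsAsFactors = FALSE)
freq$year <- as.integer(freq$year)

# Fit a loess curve of the yearly counts against the year
fit <- loess(Freq ~ year, data = freq, span = 0.75)
freq$smoothed <- predict(fit)

head(freq)
```

Keeping the year and its count in one data frame avoids having to trim two separate objects in step, as the matrix-based code does.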
In which provinces are you most at risk?
ggplot(SA, aes(reorder(Area, table(SA$Area)[Area]),fill=Type)) + geom_bar() + coord_flip() + ylab("Count of Attacks") + xlab("Province")
Interpretation of the graph above: The majority of shark attacks occur in three of South Africa’s nine provinces. Just over 200 shark attacks occurred in KwaZulu-Natal, and the majority of them, slightly fewer than 150, are unprovoked. The number of unprovoked attacks is similar in KwaZulu-Natal, the Western Cape Province and the Eastern Cape Province.
Chance of surviving a shark attack
#### Calculations with the fatality rates ####
x = as.matrix(SA[,13]) # Create a matrix with the data (Y/N fatality flag)
### For loop to calculate non-fatal attacks ###
count.total.nonfatal = 0
for (i in x)
{if(i=="N") {count.total.nonfatal=count.total.nonfatal+1}
}
cat("Non-fatal attacks:",count.total.nonfatal)
## Non-fatal attacks: 422
### For loop to calculate fatal attacks ###
count.total.fatal = 0
for (i in x)
{if(i=="Y") {count.total.fatal=count.total.fatal+1}
}
cat("Fatal attacks:",count.total.fatal)
## Fatal attacks: 137
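The two counting loops can also be written without explicit iteration, because comparing a vector with a value gives a logical vector whose `TRUE` entries `sum()` counts. The following is a minimal sketch on a made-up vector; `fatal_flag` is a hypothetical stand-in for column 13 of `SA`, and the extra values in it are assumptions about what unknown entries might look like.

```r
# Hypothetical fatality flags: "N" = non-fatal, "Y" = fatal, anything else = unknown
fatal_flag <- c("N", "Y", "N", "N", "UNKNOWN", "Y", "", "N")

# sum() over a logical vector counts the TRUE entries;
# na.rm = TRUE skips missing values instead of returning NA
count.nonfatal <- sum(fatal_flag == "N", na.rm = TRUE)
count.fatal    <- sum(fatal_flag == "Y", na.rm = TRUE)

cat("Non-fatal attacks:", count.nonfatal, "\n")
cat("Fatal attacks:", count.fatal, "\n")
```

As in the loops, entries that are neither "N" nor "Y" are left out of both counts.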
Chances of survival when attacked
### Chances of survival if attacked ###
total.attacks = count.total.nonfatal+count.total.fatal
percentage.survived = count.total.nonfatal/total.attacks # proportion, between 0 and 1
percentage.died = count.total.fatal/total.attacks # proportion, between 0 and 1
cat("Percentage of non-fatal attacks:",percentage.survived)
## Percentage of non-fatal attacks: 0.7549195
cat("Percentage of fatal attacks:",percentage.died)
## Percentage of fatal attacks: 0.2450805
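The same shares can be obtained in one step with `prop.table()`, which divides each count by the total. This is a minimal sketch that starts from the two counts printed above (422 non-fatal, 137 fatal); the names `counts` and `shares` are hypothetical.

```r
# Counts reported above: 422 non-fatal and 137 fatal attacks
counts <- c("Non-fatal" = 422, "Fatal" = 137)

# prop.table() divides each count by the sum of all counts
shares <- prop.table(counts)

# Proportions (multiply by 100 for percentages)
print(round(shares, 7))
```

The values are proportions between 0 and 1, like the output above, so multiplying by 100 gives roughly 75% and 25%.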
# Create a pie chart with percentages #
df <- data.frame(
  group = c("Non-fatal", "Fatal"),
  value = c(0.7549195, 0.2450805)
)
# Stacked bar with a single category; converted to a pie below
bp <- ggplot(df, aes(x="", y=value, fill=group)) +
  geom_bar(width = 1, stat = "identity")
#bp
pie <- bp + coord_polar("y", start=0) + ggtitle("Chance of surviving an attack")
pie
Interpretation of the results above: Approximately a quarter of all shark attacks/encounters are fatal in South Africa.