In this project, we want to reveal the failure rate of Spar Nord ATMs by ATM ID, so that Spar Nord can adjust its strategy to improve operations.
Outliers are easy to identify in a funnel plot, so we will create one, with the number of records on the x-axis and the failure rate on the y-axis.
Spar Nord Failure Rate by Funnel Plot
packages <- c('tidyverse', 'plotly')
for (p in packages) {
  # Install the package only if it is not already available
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
This project uses the Spar Nord open data. The dataset contains information about each ATM's location, manufacturer, currency, card type, weather conditions, and so on. Please download the data here. The original dataset is in CSV format.
In the code chunk below, read.csv() from base R is used to import atm_all.csv into R as a data frame.
data = read.csv("data/atm_all.csv")
We summarise the records by ATM ID and month, and filter out inactive ATMs.
data1<- data %>%
filter(atm_status=="Active")%>%
select(atm_id,atm_location,message_text,weather_main,month)%>%
group_by(atm_id,month,atm_location)%>%
summarise(Total_Number = n())
data2<- data %>%
filter(atm_status=="Active")%>%
select(atm_id,atm_location,message_text,weather_main,month)%>%
filter(message_text != '')%>%
group_by(atm_id,month,atm_location)%>%
summarise(Failure_Number = n())
data3 <- merge(x=data1,y=data2,by = c("atm_id","month","atm_location"))
data3$Failure_Rate = (data3$Failure_Number/data3$Total_Number)
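The two-table approach above relies on merge(), which by default keeps only the keys present in both tables, so an ATM-month with no failure messages would drop out of data3. As a minimal sketch, assuming a non-empty message_text marks a failure (as the filter above implies), the same rate can be computed in a single pass that keeps those zero-failure groups. The toy data frame and the name rates are hypothetical and only stand in for the real data.

```r
library(dplyr)

# Hypothetical toy data standing in for the real ATM records
toy <- data.frame(
  atm_id       = c(1, 1, 1, 2, 2),
  month        = "January",
  atm_location = c("A", "A", "A", "B", "B"),
  atm_status   = "Active",
  message_text = c("", "Card jam", "", "", ""),
  stringsAsFactors = FALSE
)

rates <- toy %>%
  filter(atm_status == "Active") %>%
  group_by(atm_id, month, atm_location) %>%
  summarise(
    Total_Number   = n(),                      # all records in the group
    Failure_Number = sum(message_text != ""),  # records carrying a message
    .groups = "drop"
  ) %>%
  mutate(Failure_Rate = Failure_Number / Total_Number)

print(rates)
```

Whether zero-failure groups should be kept is a modelling choice. Note that a failure rate of exactly 0 gives a standard error of 0, which would need separate handling if inverse-variance weights are used later.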
First, we plot the points, with the total number of records as x and the failure rate as y.
x = data3$Total_Number
y = data3$Failure_Rate
ggplot(data3,aes(x=x,y=y))+
geom_point()
Next, we colour the points by ATM location and use the month as the animation frame, so that ggplotly() turns the scatter plot into an interactive, animated plot.
x = data3$Total_Number
y = data3$Failure_Rate
data3$month=factor(data3$month, levels = month.name)
p_plot <- ggplot(data3,aes(x=x,y=y,color=atm_location))+
geom_point(aes(frame=month,ids=atm_id))
ggplotly(p_plot)
Next, we calculate the standard error (SE) of each failure rate and the inverse-variance weighted mean of y.
y.se <- sqrt(y*(1-y)/x)
df <- data.frame(data3$atm_id,x, y, y.se)
names(df)[1]<-paste("Atm_ID")
y.fem <- weighted.mean(y, 1/y.se^2)
## lower and upper limits for 95% and 99.9% CI, based on FEM estimator
number.seq <- seq(0.001, max(x), 0.1)
number.ll95 <- y.fem - 1.96 * sqrt((y.fem*(1-y.fem)) / (number.seq))
number.ul95 <- y.fem + 1.96 * sqrt((y.fem*(1-y.fem)) / (number.seq))
number.ll999 <- y.fem - 3.29 * sqrt((y.fem*(1-y.fem)) / (number.seq))
number.ul999 <- y.fem + 3.29 * sqrt((y.fem*(1-y.fem)) / (number.seq))
dfCI <- data.frame(number.ll95, number.ul95, number.ll999, number.ul999, number.seq, y.fem)
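The limits above follow the binomial approximation: pooled rate plus or minus z times sqrt(p(1 - p) / n), with z = 1.96 for 95% and z = 3.29 for 99.9%. The sketch below wraps that formula in a small helper to make the dependence on n explicit. The function name funnel_limits and the pooled rate of 0.02 are hypothetical, and clamping the limits to [0, 1] is an added assumption that the code above does not make.

```r
# Hypothetical helper: binomial control limits around a pooled rate p0
funnel_limits <- function(n, p0, z = 1.96) {
  half_width <- z * sqrt(p0 * (1 - p0) / n)
  data.frame(
    n     = n,
    lower = pmax(p0 - half_width, 0),  # a rate cannot fall below 0 (assumption: clamp)
    upper = pmin(p0 + half_width, 1)   # a rate cannot exceed 1 (assumption: clamp)
  )
}

# Illustrative values only: the limits narrow as the number of records grows
lims <- funnel_limits(n = c(100, 400, 1600), p0 = 0.02)
print(lims)
```

Because the half-width shrinks with the square root of n, quadrupling the number of records halves the width of the funnel, which is what gives the plot its shape.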
Some points lie very close to zero on the y-axis, so the plot cannot show a full view. So far, I have not solved the problem of combining the points and the lines in an animated plot, because some errors occurred.
ggplot()+
geom_point(data=data3,aes(x=x,y=y,color=month,frame=month))+
geom_line(aes(x = number.seq, y = number.ll95), data = dfCI) +
geom_line(aes(x = number.seq, y = number.ul95), data = dfCI) +
geom_line(aes(x = number.seq, y = number.ll999), linetype = "dashed", data = dfCI)+
geom_line(aes(x = number.seq, y = number.ul999), linetype = "dashed", data = dfCI)+
geom_hline(aes(yintercept = y.fem), data = dfCI) +
scale_y_continuous(limits = c(0,0.07)) +
ggtitle("Funnel Plot in Failure Rate")+
xlab("Number of Records") + ylab("Failure rate") + theme_bw()+
theme(plot.title = element_text(hjust=0.5))
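One possible way to combine the points and the limit lines in an animated plot, offered only as an untested sketch against this dataset, is to build the figure with plot_ly() directly instead of ggplotly(). In plotly, traces added without a frame are drawn in the background of every frame, so only the markers are given frame. The data frames pts and limits and all values in them are hypothetical stand-ins for data3 and dfCI.

```r
library(plotly)

# Hypothetical toy points: two ATMs observed in two months
pts <- data.frame(
  atm_id = as.character(c(1, 2, 1, 2)),
  month  = factor(c("January", "January", "February", "February"),
                  levels = c("January", "February")),
  n      = c(50, 200, 60, 220),
  rate   = c(0.06, 0.02, 0.01, 0.03)
)

# Hypothetical 95% limits around an assumed pooled rate of 0.02
p0 <- 0.02
limits <- data.frame(n = seq(10, 250, by = 10))
limits$lower <- p0 - 1.96 * sqrt(p0 * (1 - p0) / limits$n)
limits$upper <- p0 + 1.96 * sqrt(p0 * (1 - p0) / limits$n)

fig <- plot_ly() %>%
  # Static layers: no frame, so they stay visible in every animation frame
  add_lines(data = limits, x = ~n, y = ~lower, name = "95% lower",
            line = list(color = "black")) %>%
  add_lines(data = limits, x = ~n, y = ~upper, name = "95% upper",
            line = list(color = "black")) %>%
  # Animated layer: one frame per month, with ids linking the same ATM across frames
  add_markers(data = pts, x = ~n, y = ~rate, frame = ~month, ids = ~atm_id,
              name = "ATMs") %>%
  layout(xaxis = list(title = "Number of Records"),
         yaxis = list(title = "Failure rate"))

fig
```

This sidesteps the ggplot2 frame aesthetic entirely. Whether it reproduces the styling of the static plot above closely enough is left open.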
First, ATM 39 performed well from January to November, but its failure rate then increased sharply. We suggest that Spar Nord inspect and repair it.
ATMs 83 and 79 performed very poorly from January to February, and no records exist for them from March to May. They had probably been repaired by then, because their failure rates did not rise again in the following months.
ATMs 45, 39 and 10 each have more than 4,000 records, and their record counts kept increasing throughout the year. Strangely, the number of records for ATM 41 increased sharply and ranked first.
An animated graph is more fun and engaging than a static graph.
An animated graph shows trends in the data clearly, which is especially useful in presentations.
Users can interact with an animated graph and swap between different metrics, such as the month and the failure rate of each ATM.
The scatter plot clearly indicates correlation in the data (positive, negative, strong or weak relationships), illustrates non-linear patterns, shows the spread of the data and its outliers, clearly demonstrates atypical relationships, and can be used for extrapolation and interpolation.