In this task, open data is visualized as a time series plot and then converted into an interactive chart using the ggplotly() function in R.
# Quietly load your packages
library(ggplot2)
library(reshape)
library(gridExtra)
library(grid)
library(lubridate)
library(tidyverse)
library(plotly)
The dataset was taken from the Eurostat website.
# Load your data and prepare for visualization
rm(list=ls())
df = read.csv('Australia_Unemployment.csv', header = TRUE)
head(df)
##      TIME Population
## 1 1995-01        4.8
## 2 1995-02        4.9
## 3 1995-03        4.8
## 4 1995-04        4.3
## 5 1995-05        3.6
## 6 1995-06        3.5
df.ts = mutate(df, Date = as.Date(paste(df$TIME,'01',sep = '-')))
head(df.ts)
##      TIME Population       Date
## 1 1995-01        4.8 1995-01-01
## 2 1995-02        4.9 1995-02-01
## 3 1995-03        4.8 1995-03-01
## 4 1995-04        4.3 1995-04-01
## 5 1995-05        3.6 1995-05-01
## 6 1995-06        3.5 1995-06-01
# Visualise Your Data
p <- ggplot(df.ts, aes(Date, Population)) +
  geom_line(color = "blue") +
  geom_hline(yintercept = 2, color = "black") +
  scale_x_date(breaks = scales::pretty_breaks(n = 20)) +
  # Shaded bands and labels mark the periods discussed in the text
  annotate("rect", xmin = as.Date("2015-07-01"), xmax = as.Date("2016-08-01"), ymax = 6.6, ymin = 5, alpha = .3) +
  annotate("text", x = as.Date("2015-04-01"), y = 6.8, label = "Higher Unemployment Rate", size = 3) +
  annotate("rect", xmin = as.Date("1999-05-01"), xmax = as.Date("2001-01-01"), ymax = 5.5, ymin = 2.9, alpha = .3) +
  annotate("text", x = as.Date("1999-12-01"), y = 2.8, label = "Lower Unemployment Rate", size = 3) +
  # Smoothed trend line without a confidence band
  geom_smooth(se = FALSE, color = "red") +
  xlab("Time") +
  ylab("Population (%)") +
  ggtitle("Unemployment Rate of Australia") +
  labs(subtitle = expression(italic("Monthly average from January 1995 to May 2017")),
       caption = "Data Source: Eurostat website") +
  theme(plot.title = element_text(hjust = 0.5, vjust = 0.1, size = 14, color = "black", face = "bold")) +
  theme(axis.title = element_text(size = 12)) +
  theme(axis.text.x = element_text(vjust = 1, hjust = 1, angle = 45, size = 10, color = "black"),
        axis.text.y = element_text(vjust = 0, hjust = 0, size = 10, color = "black")) +
  theme(plot.subtitle = element_text(hjust = 0.5, size = 11)) +
  # Assumed intent: light grey grid lines (a hex colour needs six digits after '#')
  theme(plot.caption = element_text(size = 8),
        panel.grid = element_line(color = "#AAAAAA"))
ggplotly(p)
The visualization shows a time series plot of the unemployment rate of Australia (as a percentage of the population) from 1995 to 2017. Several trends are visible in the series. The lowest rate is recorded around the middle of 2000, while the highest occurs in early 2016. Successive points follow one another in runs of increases or decreases, which suggests autoregressive behavior in the series. Fluctuation around the mean level suggests moving average behavior. Seasonality is not clearly visible over the whole time period, although some parts of the series show a slight seasonal effect. The dataset contains no information beyond the unemployment percentage, so the reasons behind the trends, peaks, and troughs cannot be analyzed.
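The remarks on extremes, autocorrelation, and seasonality are read off the plot by eye. A minimal sketch of how they could be checked numerically with base R follows. `summarise_series` is a hypothetical helper that is not part of the original analysis, and it assumes a complete monthly series of consecutive months with no missing values.

```r
# Hypothetical helper (not part of the original analysis).
# Assumes `dates` is a Date vector of consecutive months and `values` has no NAs.
summarise_series <- function(dates, values) {
  # stl() requires more than two full yearly cycles of monthly data
  stopifnot(length(dates) == length(values),
            length(values) > 24,
            !anyNA(values))

  # Build a monthly ts object; frequency = 12 means one seasonal cycle per year
  start <- c(as.integer(format(dates[1], "%Y")),
             as.integer(format(dates[1], "%m")))
  x <- ts(values, start = start, frequency = 12)

  # Sample autocorrelations; element 1 is lag 0, so element 2 is a one-month lag
  rho <- acf(x, lag.max = 12, plot = FALSE)$acf

  # STL splits the series into seasonal, trend and remainder components
  parts <- stl(x, s.window = "periodic")$time.series

  list(
    lowest         = dates[which.min(values)],   # date of the minimum value
    highest        = dates[which.max(values)],   # date of the maximum value
    lag1_acf       = rho[2],                     # month-to-month dependence
    lag12_acf      = rho[13],                    # same-month-next-year dependence
    seasonal_range = diff(range(parts[, "seasonal"]))  # size of the seasonal swing
  )
}

# Possible use with the data frame built earlier in this report:
# summarise_series(df.ts$Date, df.ts$Population)
```

A lag-1 autocorrelation close to 1 would be consistent with the autoregressive behavior described above. A seasonal range that is small relative to the overall range of the series would agree with the weak seasonality seen in the plot. The `lowest` and `highest` entries give the dates to compare against the shaded bands.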