Assignment 2 Part B
The first step is to create an aggregated data set grouped by date, industry and location, holding the mean of the monthly amount for each group. The code below uses the dplyr package to group by date, industry and location, and then summarises each group by the mean of monthly_amount.
### Code 1
library(readxl)
## Warning: package 'readxl' was built under R version 3.4.4
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.4.4
data <- read.csv("transactions.csv")
data$date <- as.Date(data$date, format = "%d/%m/%Y")
data <- data %>%
  group_by(date, industry, location) %>%
  summarise(Mean = mean(monthly_amount))
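To make the aggregation step concrete, here is a minimal sketch on a small made-up data frame. The values in `toy` are hypothetical and only the column names mirror those in transactions.csv; the `ungroup()` call is an addition, not part of the original code, and is used so the result no longer carries grouping.

```r
library(dplyr)

# Hypothetical toy transactions; only the column names match the real file.
toy <- data.frame(
  date = as.Date(c("01/01/2016", "01/01/2016", "01/02/2016", "01/02/2016"),
                 format = "%d/%m/%Y"),
  industry = c(1, 1, 1, 1),
  location = c(1, 1, 1, 1),
  monthly_amount = c(100, 200, 300, 500)
)

# One row per date/industry/location combination, holding the group mean.
toy_agg <- toy %>%
  group_by(date, industry, location) %>%
  summarise(Mean = mean(monthly_amount)) %>%
  ungroup()

print(toy_agg)
```

The four input rows collapse to two, one per month, because both rows in each month share the same industry and location.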
The code below uses the dplyr package to filter the aggregated data set to location 1 and industry 1. The line plot of this subset shows the monthly series, which makes any seasonality visible, and the fitted straight line shows the overall trend.
initialdatasubset <- data %>%
  filter(location == 1, industry == 1)
library(ggplot2)
ggplot(initialdatasubset, aes(x = date, y = Mean)) +
  geom_line(color = "red") +
  geom_smooth(method = "lm", se = FALSE)
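The trend and seasonality read from the plot can also be checked numerically. The sketch below is one possible approach, not part of the original analysis. It uses a made-up series (`toy_series`, with an assumed upward trend and a December bump) rather than the real subset: it fits the same kind of straight line that `geom_smooth(method = "lm")` draws, then averages the residuals by calendar month.

```r
# Hypothetical monthly series: linear growth plus an assumed December bump.
dates <- seq(as.Date("2014-01-01"), by = "month", length.out = 36)
toy_series <- data.frame(
  date = dates,
  Mean = 1000 + 10 * seq_along(dates) +
    ifelse(format(dates, "%m") == "12", 200, 0)
)

# Linear trend on the date, expressed as days since 1970-01-01.
toy_series$day <- as.numeric(toy_series$date)
trend_fit <- lm(Mean ~ day, data = toy_series)
slope_per_day <- coef(trend_fit)[["day"]]

# Seasonality: average departure from the trend line per calendar month.
toy_series$month <- format(toy_series$date, "%m")
seasonal <- tapply(resid(trend_fit), toy_series$month, mean)

print(slope_per_day)
print(round(seasonal, 1))
```

A positive slope indicates an upward trend, and months whose average residual sits well above or below zero point to a seasonal pattern. To apply the same idea to the real data, `toy_series` would be swapped for `initialdatasubset`.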