This is a prototype R Markdown Notebook which summarises the key facts and figures of Ontario.ca. It is a proof of concept that summarises a script which pulls directly from the Google Analytics API to produce a report.
Not contained in this notebook is a call to the Google Analytics API to download the following parameters from three ODS Google Analytics properties: Onesite, Onepage (both Ontario.ca) and Roll-up (gov.on.ca).
date_range = c("2010-01-01", "2018-08-19"),
metrics = c("sessions", "avgSessionDuration"),
dimensions = c("date", "deviceCategory", "country", "language"),
anti_sample = TRUE)
Instead of reproducing those calls, the results are merged into one flat file and stored on disk in a file named ga_data_raw.rds
.
ga_data_raw <- read_rds("ga_data_raw.rds")
head(ga_data_raw)
Unfortunately, for historical reasons, Google Analytics install on Ontario.ca is split between two GA properties: Onepage - Master and Onesite - Master. Those two properties have the following number of rows in the table:
table(ga_data_raw$viewName)
##
## Onepage - Master Onesite - Master Roll-up - Master
## 1092849 830028 3235370
We are only interested in the combined total of these two GA properties, so we make a new table named ga_data that sums all the traffic between the two properties and dumps all the other columns:
ga_data <- # this cleans up the raw inputs from the API by combining the Ontario.ca properties and renaming the properties
ga_data_raw %>%
select(-avgSessionDuration, -viewId) %>% # Combine `Onesite - Master` and `Onepage - Master`` into `Ontario.ca`
spread(viewName, sessions, fill = 0) %>%
mutate(`Ontario.ca` = `Onepage - Master` + `Onesite - Master`) %>%
rename(`gov.on.ca` = `Roll-up - Master`) %>%
select(-`Onepage - Master`, -`Onesite - Master`) %>% # Remove components
gather(viewName, sessions, c(`Ontario.ca`, `gov.on.ca`)) %>% # Clean up into tidy data
filter(sessions != 0) %>% # Remove 0 rows, so lines aren't drawn on the chart
mutate(yearmonth = as_date(parse_date_time(paste0(year(date), month(date)), "ym"))) %>% # create a year-month date column
group_by(yearmonth, viewName) %>%
summarise(sessions = sum(sessions)) %>%
write_rds("ga_data.rds")
head(ga_data)
Now that we have a tidy table, we can produce our report.
Traffic data for Ontario.ca begins on , and this Factsheet contains data up to -. The website had the goal of migrating traffic from the constilation of gov.on.ca websites to Ontario.ca. This factsheet summarises the performance towards that goal.
ga_data %>%
ggplot() +
geom_line(aes(x = yearmonth, y = sessions, color = viewName, group = viewName)) +
theme_minimal() + # formatting lines
labs(
title = "The long work of Ontario.ca transitions",
subtitle = "Ontario.ca (Onepage + Onesite) slowly taking traffic away from gov.on.ca",
caption = "Data from Google Analytics, monthly figures\nalex.lougheed@Ontario.ca",
x = "",
y = "Sessions",
color = "Property"
) +
theme(legend.position = "bottom") +
scale_x_date(date_labels = "'%y", date_breaks = "1 year") +
scale_y_continuous(labels = comma)
The key performance indicator for the ODS is the percent of traffic shifting from gov.on.ca to Ontario.ca.
ga_data %>%
spread(viewName, sessions) %>%
mutate(traffic_shift_kpi = `Ontario.ca` / (`Ontario.ca` + `gov.on.ca`)) %>%
filter(!is.na(traffic_shift_kpi)) %>% # Do not consider time when Ontario.ca did not exist and kpi = NA
ggplot() +
geom_line(aes(x = yearmonth, y = traffic_shift_kpi)) +
theme_minimal() + # formatting lines
labs(
title = "The long work of Ontario.ca transitions",
subtitle = "Ontario.ca's slowly taking up traffic formerly belonging to gov.on.ca",
caption = "Data from Google Analytics, monthly figures\nalex.lougheed@Ontario.ca",
x = "",
y = "Percent of traffic",
color = ""
) +
theme(legend.position = "bottom") +
scale_x_date(date_labels = "'%y", date_breaks = "1 year") +
scale_y_continuous(labels = percent)
We can also produce tables in these types of reports, such as the last 9 months of the KPI.
ga_data %>%
spread(viewName, sessions) %>%
mutate(traffic_shift_kpi = `Ontario.ca` / (`Ontario.ca` + `gov.on.ca`)) %>%
filter(!is.na(traffic_shift_kpi)) %>%
filter(yearmonth > today()-months(9)) %>%
kable(digits = 2)
yearmonth | gov.on.ca | Ontario.ca | traffic_shift_kpi |
---|---|---|---|
2017-12-01 | 4180776 | 4814117 | 0.54 |
2018-01-01 | 5773173 | 7054606 | 0.55 |
2018-02-01 | 4613114 | 6161148 | 0.57 |
2018-03-01 | 4732410 | 6710891 | 0.59 |
2018-04-01 | 4229004 | 6133631 | 0.59 |
2018-05-01 | 4065967 | 6651769 | 0.62 |
2018-06-01 | 3691552 | 6405524 | 0.63 |
2018-07-01 | 3655742 | 7477206 | 0.67 |
2018-08-01 | 1959498 | 4385741 | 0.69 |