1.0 Overview

This dataviz makeover builds calendar views, with suitable interactivity and animation, on our Google Analytics data to detect seasonality patterns in online purchases.

2.0 Installing and Launching R Packages

This code chunk installs the packages listed below, including jsonlite and tidyverse, if they are not yet installed, and then loads them into the RStudio environment.

packages <- c("jsonlite","tidyverse","sf","tmap","rnaturalearth","rnaturalearthdata","RColorBrewer")
for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  # Attach the package for this session
  library(p, character.only = TRUE)
}

3.0 Importing JSON data

We use the ‘fromJSON()’ function from the ‘jsonlite’ package to import a typical JSON file, as shown below.

result <- fromJSON("data/ga-20170729.json")

Error in parse_con(txt, bigint_as_char) : parse error: trailing garbage ype":"Not Socially Engaged"} {"visitNumber":"1","visitId":"1 (right here) ------^

3.1 Importing Newline Delimited JSON file

However, running the ‘fromJSON()’ command above produces the parse error shown. This is because the file is ‘NDJSON’ (newline-delimited JSON): it contains multiple JSON values, one per line, and each value is an independent object. In our case, each combination of “fullVisitorId” and “visitId” makes up one JSON value, so the file contains many JSON values. The ‘jsonlite’ package provides the ‘stream_in()’ function for this file type, so we use it instead, as shown below.

result <- stream_in(file("data/ga-20170729.json"))
## opening file input connection.
## 
 Found 500 records...
 Found 1000 records...
 Found 1500 records...
 Found 1597 records...
 Imported 1597 records. Simplifying...
## closing file input connection.
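
To make the difference between the two readers concrete, here is a minimal sketch on two made-up records (the values are hypothetical and are not taken from the Google Analytics file). It assumes only that ‘jsonlite’ is installed, and feeds the lines through a ‘textConnection()’ instead of a file.

```r
library(jsonlite)

# Two hypothetical visits, one JSON object per line (NDJSON).
ndjson <- c(
  '{"fullVisitorId":"111","visitId":"1","totals":{"pageviews":"3"}}',
  '{"fullVisitorId":"222","visitId":"2","totals":{"pageviews":"5"}}'
)

# fromJSON() expects exactly one JSON value, so the second object
# triggers a parse error; we capture the message instead of stopping.
single <- tryCatch(
  fromJSON(paste(ndjson, collapse = "\n")),
  error = function(e) conditionMessage(e)
)

# stream_in() parses each line as its own record and combines
# the records into one data frame (one row per line).
records <- stream_in(textConnection(ndjson), verbose = FALSE)
nrow(records)
```

Here ‘single’ ends up holding the error message as text, while ‘records’ has one row per input line.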
result <- jsonlite::flatten(result)

The original nested, hierarchical data structure is hard to analyse in R. Hence, we use the ‘flatten()’ function from the ‘jsonlite’ package to flatten the structure by turning each nested variable into its own column. Note that ‘flatten()’ only expands nested data frames; a variable that holds an array of records for each row still appears as a ‘list’ column. Therefore, we use ‘unnest()’ from ‘tidyr’ (loaded as part of ‘tidyverse’) to expand the values inside each list column into their own rows and columns.

result_flat <- result %>%
  unnest(customDimensions, names_sep=".") %>%
  unnest(hits, names_sep=".") %>%
  unnest(hits.product, names_sep=".")
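
The division of labour between ‘flatten()’ and ‘unnest()’ can be seen on a small example. This is only a sketch on hypothetical records: the field names ‘totals’, ‘pageviews’ and ‘hitNumber’ are chosen for illustration, and it assumes ‘jsonlite’ and ‘tidyr’ are installed.

```r
library(jsonlite)
library(tidyr)

# Hypothetical visits: `totals` is a nested object,
# `hits` is an array of objects (one per hit).
ndjson <- c(
  '{"visitId":"1","totals":{"pageviews":"2"},"hits":[{"hitNumber":"1"},{"hitNumber":"2"}]}',
  '{"visitId":"2","totals":{"pageviews":"1"},"hits":[{"hitNumber":"1"}]}'
)
visits <- stream_in(textConnection(ndjson), verbose = FALSE)

# flatten() turns the nested `totals` data frame into a
# `totals.pageviews` column, but `hits` stays a list column
# holding one small data frame per visit.
flat <- jsonlite::flatten(visits)
names(flat)
is.list(flat$hits)

# unnest() expands the list column: one output row per hit,
# with the new column named `hits.hitNumber`.
long <- unnest(flat, hits, names_sep = ".")
nrow(long)
```

By default ‘unnest()’ drops rows whose list entry is empty or NULL; passing ‘keep_empty = TRUE’ keeps them, which may matter if some visits have no hits or no products.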