Loading a JSON object

Often, when you’re interested in a complicated data presentation online and want to convert the underlying data into a nice table, there’s an elegant way to proceed lurking underneath the site.

JavaScript is how rich maps and graphs are built online.

When you imagine:

…these are all JavaScript!

(Quick aside: JavaScript was created at Netscape in 1995 to make websites more dynamic and elaborate. As Internet Explorer’s dominance receded from the mid-2000s, all the major browsers converged on standards-based JavaScript as the client-side scripting language. Most elaborate web designs use JavaScript.)

Of key relevance to us: underpinning these elaborate websites are often simple data hierarchies loaded invisibly in your browser. You can often locate these on a site and load them directly, ready for statistics.

Let’s work an example!

The Guardian published a pretty elaborate and elegant graph from last month’s general election. It can be accessed here: https://www.theguardian.com/politics/ng-interactive/2024/jul/04/uk-general-election-results-2024-live-in-full



Poke around. Zoom into London or Birmingham or Manchester and see how small UK constituencies are!

And whoa, these data are disaggregated!



Mmmmm, how to scrape this though? Are we going to do some goofy SelectorGadget’ing around?

No, we’re going to access the JSON object underpinning this map.

  1. Go to your Chrome window with the Guardian map open.

  2. Press Command + Option + I (for Mac) or CTRL + Shift + I (for Windows) to Inspect the page.
    It should return something like



    It may look foreboding, but it’s really just a structured listing of the objects that make up the page you see.

  3. Click on the Network tab in the tab header



    We’re going to exploit the JSON object’s size (anything with all these candidates and their vote totals must be sizable).

  4. Now refresh the web page. You’ll see all the objects reloaded. Sort these objects by their size. Also enter the search term “.json” in the filter text field.

  5. Inspect the output of each object. You might find this item particularly interesting



    Right-click thinresults.json and select Copy URL. We can load this JSON object in R.

library(tidyverse)
library(magrittr)
library(rvest)
library(jsonlite)


j1 <- "https://interactive.guim.co.uk/2024/07/elex-data/production/data/ge/thinresults.json" %>% 
  jsonlite::read_json()
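
Before flattening anything, it helps to look at how the parsed list is laid out, since the code that follows picks fields by position. The sketch below does this on a tiny stand-in list; the field names and values in `toy` are made up for illustration and are not taken from the Guardian feed.

```r
library(jsonlite)

# Toy stand-in for the parsed feed (hypothetical field names and values)
toy <- parse_json('[
  {"ons": "X1", "name": "Place A", "turnout": 100,
   "candidates": [{"party": "P1", "votes": 60}, {"party": "P2", "votes": 40}]},
  {"ons": "X2", "name": "Place B", "turnout": 80,
   "candidates": [{"party": "P1", "votes": 30}, {"party": "P2", "votes": 50}]}
]')

# How many top-level items are there?
n_items <- length(toy)

# Field names of the first item, in position order
field_names <- names(toy[[1]])

# Which fields hold a nested list rather than a single value?
is_nested <- vapply(toy[[1]], is.list, logical(1))

n_items
field_names
field_names[is_nested]

# One-level overview of the first item
str(toy[[1]], max.level = 1)
```

Running the same calls on `j1` instead of `toy` should show which positions hold single values and which hold a nested list.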

First, flatten the single-valued fields of each item in the j1 list (positions 1 to 4 and 6 to 12) into a table with one row per item

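t1 <- j1 %>% 
  map(
    \(i)
    # Keep the fields at positions 1:4 and 6:12, looked up by name from the first item
    i %>% 
      extract(
        j1[[1]] %>% 
          names %>% 
          extract(c(1:4, 6:12))
      ) %>% 
      # Named list -> two-column tibble (name, value)
      enframe %>% 
      mutate(
        # Flatten the list column into a plain vector
        value = value %>% 
          unlist
      ) %>% 
      # One row per item, one column per field
      spread(name, value)
  ) %>% 
  # Stack the one-row tibbles into a single table
  list_rbind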
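
To see what a single pass of that `map()` does, here is a minimal sketch of the enframe / unlist / spread steps applied to one hand-made record. The record `rec` and its values are hypothetical.

```r
library(tibble)
library(tidyr)
library(dplyr)

# One hypothetical record: a named list of single values
rec <- list(ons = "X1", name = "Place A", turnout = 100)

# Named list -> tibble with a character column `name` and a list column `value`
long <- enframe(rec)

# Flatten the list column; because the values mix text and numbers,
# unlist() coerces everything to character
long <- mutate(long, value = unlist(value))

# One row, one column per field
wide <- spread(long, name, value)

wide
```

Note that `turnout` comes out as text here. If numeric columns are needed afterwards, `readr::type_convert()` is one option for re-guessing column types.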

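Next, unpack the nested list in field 13 of each item, which indexes the election results, and tag each row with the item’s ons code and name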

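t2 <- j1 %>% 
  map(
    \(i){
      
      # Pull out the nested list stored in field 13 (name looked up from the first item)
      k <- i %>% 
        extract2(
          j1[[1]] %>% 
            names %>% 
            extract(13)
        )
      
      k %>% 
        map(
          \(m)
          
          # Turn one nested record into a one-row tibble, skipping NULL fields
          m %>%
            discard(is.null) %>% 
            enframe %>% 
            mutate(
              value = value %>% unlist
            ) %>% 
            spread(
              name, value
            )
          
        ) %>% 
        list_rbind %>% 
        # Tag every row with the identifiers of the parent item
        mutate(
          ons = i$ons,
          name = i$name
          )
    }, 
    .progress = TRUE
  ) %>% 
  list_rbind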
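
A quick sanity check after flattening nested records is to confirm that no rows were lost: the flattened table should have as many rows as there are nested entries in total. The sketch below shows the idea on a toy list. The field name `candidates` and all values are hypothetical, and it uses `as_tibble()` per record as a shorter stand-in for the enframe / spread pattern above.

```r
library(purrr)
library(dplyr)
library(tibble)

# Toy nested list (hypothetical field names and values)
toy <- list(
  list(ons = "X1", name = "Place A",
       candidates = list(list(party = "P1", votes = 60),
                         list(party = "P2", votes = 40))),
  list(ons = "X2", name = "Place B",
       candidates = list(list(party = "P1", votes = 30)))
)

# Flatten: one row per nested record, tagged with the parent's ons code
flat <- toy %>%
  map(\(i) i$candidates %>%
        map(\(m) as_tibble(m)) %>%
        list_rbind() %>%
        mutate(ons = i$ons)) %>%
  list_rbind()

# Total number of nested records across all items
expected_rows <- toy %>%
  map_int(\(i) length(i$candidates)) %>%
  sum()

# Rows per parent item, for spotting items that lost records
rows_per_item <- count(flat, ons)

nrow(flat) == expected_rows
rows_per_item
```

On the real data, the analogous check would compare `nrow(t2)` with the summed lengths of field 13 across `j1`.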