My Google Location Data

Like it or not, Google tracks a lot of information about you. Giving in to Big Brother’s watchful eye, I’ve allowed Google to track and store my location data for the past couple of years. One nice feature, though, is that Google lets you download this information if you request it via Google Takeout (https://www.google.com/settings/takeout). For my own curiosity, I decided to download mine and plot it on a map. (Note: I borrowed some of this code from another person, but it’s been a while and I can’t figure out where I found it. If it is yours, let me know and I will make sure to credit you.)

First, you will have to download your data and read it into R. I just downloaded mine from Google Takeout directly into my Google Drive directory.

require(jsonlite)
require(plyr)

raw <- fromJSON('~/Google Drive/Takeout/Takeout/Location History/LocationHistory.json')
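
If you want to see what fromJSON hands back before loading a large file, here is a minimal sketch on a toy JSON string. The field names follow the columns listed in this post, but the values are made up and the real Takeout file may contain more fields.

```r
library(jsonlite)

# Toy JSON in the same general shape as the Takeout file (made-up values).
toy_json <- '{"locations":[
  {"timestampMs":"1419724800000","latitudeE7":437040000,"longitudeE7":-722850000,"accuracy":20},
  {"timestampMs":"1419724860000","latitudeE7":437041000,"longitudeE7":-722851000,"accuracy":35}
]}'

toy <- fromJSON(toy_json)

# fromJSON simplifies an array of records into a data frame,
# so toy$locations has one row per recorded location.
str(toy$locations)
```

Note that timestampMs comes through as text because it is quoted in the JSON, which is why it has to be converted with as.numeric later on.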

Next, extract the relevant information from the JSON file.

# Get the 'locations' part of the list
locs <- raw$locations

# These are all the columns that it contains...
names(locs)

## [1] "timestampMs"      "latitudeE7"       "longitudeE7"     
## [4] "accuracy"         "activitys"        "heading"         
## [7] "velocity"         "altitude"         "verticalAccuracy"

# Get columns into useful formats.
ldf <- data.frame(t=rep(0,nrow(locs)))

# Time is in POSIX * 1000 (milliseconds) format, convert it to useful scale...
ldf$t <- as.numeric(locs$timestampMs)/1000
class(ldf$t) <- 'POSIXct'

# Convert longitude and latitude to usable numbers (degrees)...
ldf$lat <- as.numeric(locs$latitudeE7/1E7)
ldf$lon <- as.numeric(locs$longitudeE7/1E7)

# Accuracy doesn't need changing.
ldf$accuracy <- locs$accuracy

# Activity guesses (i.e., walking vs. driving) are in a list. We can unpack these lists to get the most likely activity for each location (takes a while, depending on the size of your dataset).

# get the most likely activity type and confidence for each time point.
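act <- laply(locs$activitys, function(f) {
  # No activity record for this location: return a row of NAs.
  if (is.null(f[[1]])) return(data.frame(activity = NA, confidence = NA, stringsAsFactors = FALSE))
  # Otherwise take the first guess (type and confidence) from the first activity record.
  data.frame(activity = f[[2]][[1]][[1]][1], confidence = f[[2]][[1]][[2]][1], stringsAsFactors = FALSE)
}, .progress = "none")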

# combine activity data with the main dataset
ldf$activity <- as.character(act[,1])
ldf$confidence <- as.numeric(act[,2])

# Velocity, altitude and heading need no alteration:
ldf$velocity <- locs$velocity
ldf$altitude <- locs$altitude
ldf$heading <- locs$heading
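
The nested indexing in the activity step is easier to follow on a toy record. This is only a sketch: the nesting of toy_activitys is my assumption, inferred from the indexing used above, and the helper name top_activity is hypothetical. It takes the first guess in each record, as the code above does.

```r
# One location with activity guesses and one without (made-up values).
toy_activitys <- list(
  list(
    timestampMs = "1419724800000",
    activities = list(data.frame(
      type = c("still", "inVehicle"),
      confidence = c(80, 20),
      stringsAsFactors = FALSE
    ))
  ),
  NULL
)

# Hypothetical helper: return the first activity guess, or NAs if none exists.
top_activity <- function(f) {
  if (is.null(f)) {
    return(data.frame(activity = NA_character_, confidence = NA_real_,
                      stringsAsFactors = FALSE))
  }
  guesses <- f[[2]][[1]]  # first set of guesses for this location
  data.frame(activity = guesses[[1]][1], confidence = guesses[[2]][1],
             stringsAsFactors = FALSE)
}

# Base R equivalent of the split-apply-combine step: one row per location.
toy_act <- do.call(rbind, lapply(toy_activitys, top_activity))
toy_act
```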

Now I have my location data frame, called ‘ldf’, with everything needed to make the plot.
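
Before plotting, it can be worth checking the unit conversions on a single made-up value. The sketch below uses as.POSIXct with an explicit origin and time zone, which is an alternative to setting the class by hand; the variable names are just for illustration.

```r
# Timestamps are milliseconds since the Unix epoch, stored as text.
ms <- "1419724800000"
when <- as.POSIXct(as.numeric(ms) / 1000, origin = "1970-01-01", tz = "UTC")
format(when, "%Y-%m-%d %H:%M:%S")  # "2014-12-28 00:00:00"

# Coordinates are degrees multiplied by 1e7, stored as integers.
lat <- 437040000 / 1e7   # 43.704
lon <- -722850000 / 1e7  # -72.285
c(lon, lat)
```

If the converted values land in the wrong year or the wrong hemisphere, the scaling factor is the first thing to check.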

First, let’s create two maps that we can plot the location data over. You will need the latitude and longitude of the place you want to see. You will also have to set an appropriate zoom level, the source of the map images, and the type of map you want to use.

The first place I want to see is the town I currently live in, Hanover, New Hampshire. The second is my hometown of Albuquerque, New Mexico, which I’ve visited a few times in the last couple of years.

require(ggplot2)
require(ggmap)

hanover <- get_map(c(-72.285,43.704),15,source='google',maptype="satellite")
ABQ <- get_map(c(-106.59,35.110833),12,source='google',maptype="satellite")

Now we can overlay the location data on the maps.

ggmap(hanover) + geom_point(data=ldf,aes(lon,lat),color="Red", alpha = .01) + 
  ggtitle("Hanover, NH") + xlab(" ") + ylab(" ")

[Plot: location data overlaid on a satellite map of Hanover, NH]

# alpha = .01 to enhance visualization of overplotted data.

ggmap(ABQ) + geom_point(data=ldf,aes(lon,lat),color="Red", size=2,alpha = .05) + 
  ggtitle("Albuquerque, NM") + xlab(" ") + ylab(" ")
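
Both plots pass the full ldf to geom_point, so most rows fall outside whichever map is being drawn. One optional refinement is to subset to a box around the map center first. This is a minimal sketch on toy data: the helper name in_box and the half-width of 0.05 degrees are hypothetical choices, not something taken from the maps above.

```r
# Toy points: two near Hanover, one near Albuquerque (made-up values).
toy_ldf <- data.frame(
  lon = c(-72.285, -72.290, -106.590),
  lat = c(43.704, 43.700, 35.111)
)

# Hypothetical helper: keep rows within half_width degrees of center,
# where center is c(lon, lat). which() drops rows with missing coordinates.
in_box <- function(df, center, half_width) {
  keep <- which(abs(df$lon - center[1]) <= half_width &
                abs(df$lat - center[2]) <= half_width)
  df[keep, , drop = FALSE]
}

near_hanover <- in_box(toy_ldf, c(-72.285, 43.704), 0.05)
nrow(near_hanover)  # 2
```

The filtered data frame can then be passed to geom_point in place of ldf.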

Because I’ve spent more time in Hanover and it is a smaller place, there is a much higher density of information in that plot. Still, if you are familiar with the area and with me, you can see I’ve spent a lot of time at home, work, the library, the grocery store, and Salt Hill Pub (unsurprisingly, the gym is conspicuously missing).

Back home in Albuquerque, there is much less data and everything is much more scattered. I tend to hop around a lot, visiting people, eating at all of my favorite restaurants, and dropping by my favorite places to have a beer. Google knows it, and now you do too.

Happy data hunting.

Rob Chavez

December 28, 2014