Here’s a quick way to plot AlienVault’s geolocated reputation data on a map.
This bit gets the data:
library(htmltools)
library(leaflet)
library(tidyr)
library(dplyr)
library(scales)
url <- "http://reputation.alienvault.com/reputation.data"
fil <- "reputation.data"
if (!file.exists(fil)) download.file(url, fil)
This bit reads it in and cleans it up a bit:
geo <- read.table(fil, sep="#",
                  col.names=c("ip", "risk", "rep", "type", "cc", "region", "coords", "n"),
                  stringsAsFactors=FALSE, comment.char="")
geo <- separate(geo, coords, into=c("lat", "lon"), sep=",")
You can see what it looks like:
head(geo)
##                ip risk rep           type cc  region            lat
## 1  203.121.165.16    6   5            C&C TH                   15.0
## 2     46.4.123.15    4   2 Malicious Host DE                   51.0
## 3   61.67.129.145    6   5            C&C TW  Taipei  25.0391998291
## 4 222.124.202.178    9   5            C&C ID Jakarta -6.17439985275
## 5  62.209.195.186    6   5            C&C CZ Karvina  49.8568000793
## 6 210.253.108.243    6   4            C&C JP          35.6899986267
##             lon n
## 1         100.0 2
## 2           9.0 3
## 3 121.525001526 2
## 4 106.829399109 2
## 5 18.5468997955 2
## 6 139.690002441 2
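To see what the read-and-split step does to individual records, here is a minimal sketch on two sample lines. The raw lines are reconstructed from the printed rows above, so their exact on-disk form is an assumption, and `sample_lines` and `sample_geo` are names made up for this illustration. Note that `separate()` without `convert=TRUE` leaves `lat` and `lon` as character strings, which is why they print as `15.0` rather than `15`.

```r
library(tidyr)

# Two raw records reconstructed from the head(geo) output (assumed format:
# "#"-separated fields, with "lat,lon" packed into a single field)
sample_lines <- c(
  "203.121.165.16#6#5#C&C#TH##15.0,100.0#2",
  "61.67.129.145#6#5#C&C#TW#Taipei#25.0391998291,121.525001526#2"
)

sample_geo <- read.table(text=sample_lines, sep="#",
                         col.names=c("ip", "risk", "rep", "type", "cc", "region", "coords", "n"),
                         stringsAsFactors=FALSE, comment.char="")

# Split the packed "lat,lon" field into two columns (both still character)
sample_geo <- separate(sample_geo, coords, into=c("lat", "lon"), sep=",")

str(sample_geo[, c("region", "lat", "lon")])

# Convert explicitly when numeric coordinates are needed
as.numeric(sample_geo$lon)
```

Setting `comment.char=""` matters here because `#` is the field separator; with the default, `read.table()` would treat everything after the first `#` as a comment.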
This bit aggregates the geolocated data by coordinate pair (the number of IPs that land on exactly the same point shows how imprecise IP geolocation is):
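bulk <- count(geo, lon, lat)
# separate() left lon/lat as character; leaflet needs numeric coordinates
bulk <- mutate(bulk, lon=as.numeric(lon), lat=as.numeric(lat))
bulk$meters <- 1000 * sqrt(bulk$n) # addCircles() takes its radius in meters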
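The aggregation is easier to follow on a tiny example. The sketch below uses made-up coordinates (`toy` and `toy_bulk` are hypothetical names, not part of the real data) to show that `count()` returns one row per distinct point, and that taking the square root makes each circle's area, rather than its radius, grow in proportion to the count.

```r
library(dplyr)

# Hypothetical toy coordinates: three IPs share one point, one IP sits alone
toy <- data.frame(lon=c(100, 100, 100, 9), lat=c(15, 15, 15, 51))

# One row per distinct (lon, lat) pair, with the row count in column n
toy_bulk <- count(toy, lon, lat)

# Radius grows with sqrt(n), so circle area (pi * r^2) grows linearly with n
toy_bulk$meters <- 1000 * sqrt(toy_bulk$n)

toy_bulk
```

Without the square root, a point with 100 IPs would get a circle 100 times wider than a single-IP point, and so 10,000 times the area, which would swamp the map.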
This bit makes the map. The circles are clickable and show the number of IPs that were geolocated to that point:
leaflet(bulk, width="100%") %>%
  addTiles() %>%
  addCircles(~lon, ~lat, radius=~meters,
             popup=~sprintf("%s IP(s) geolocate here",
                            htmlEscape(comma(n)))) %>%
  setView(-71.0382679, 42.3489054, zoom=3)
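The popup formula is evaluated once against the whole data frame, so it produces one string per circle. A small sketch of just that piece, using made-up counts (`counts` and `labels` are hypothetical names), shows what ends up in each popup:

```r
library(htmltools)
library(scales)

# Hypothetical per-point counts, standing in for bulk$n
counts <- c(1, 1234, 2500000)

# comma() adds thousands separators; htmlEscape() guards the text placed in
# the popup's HTML; sprintf() is vectorized, giving one label per count
labels <- sprintf("%s IP(s) geolocate here", htmlEscape(comma(counts)))

labels
```

Escaping is not strictly needed for plain numbers, but it is a cheap habit to keep whenever popup text is built from data.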