This vignette focusses on extracting airline flight data from FlightAware.com using their FlightXML2 API and the httr package.
We will step through the data collection, cleansing and mapping by:
* Selecting a commercial flight
* Extracting the historical flight information and track in the form of latitude and longitude data
* Wrangling the extracted data to make it user friendly
* Plotting the data using ggmap and Google Maps
For more details on using the FlightAware API see http://flightaware.com.
The FlightXML2 explorer is an easy way to understand the table structure. Sometimes you will have to query one table to extract data from another table, as you will see in this vignette.
We first have to locate a unique flight ID and then extract the historical flight path.
To use FlightXML2 effectively you will need an understanding of airline and airport codes. These are the International Civil Aviation Organization (ICAO) codes, which can be looked up at:
http://airportsbase.org/ICAO.php
A quick example from the ICAO database: Sydney Kingsford Smith Airport = YSSY, and any Qantas flight is prefixed by QFA, e.g. QFA64.
This should assist you in the future if you would like to extract data for airports or other airlines.
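As a small illustration of how these codes combine, the sketch below builds a flight ident from an ICAO airline prefix and a flight number. The lookup vectors and the `make_ident` helper are hypothetical names introduced here, and only the codes that appear in this vignette (QFA, YSSY, YPPH) are included.

```r
# Hypothetical lookup tables holding only the codes used in this vignette
airline_icao <- c(Qantas = "QFA")
airport_icao <- c(Sydney = "YSSY", Perth = "YPPH")

# Hypothetical helper: join an airline's ICAO prefix and a flight number
make_ident <- function(airline, flight_number) {
  prefix <- airline_icao[[airline]]  # errors if the airline is not in the table
  paste0(prefix, flight_number)
}

ident <- make_ident("Qantas", 575)
ident
```

For other airlines or airports you would extend the two vectors with codes looked up in the ICAO database.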
We will use the httr GET function to access the data. If you look at Tip 1 you will notice that the FlightAware XML API tables require certain Inputs to be present to extract the data or the query returns an error.
The image in Tip 1 shows the format for Inputs and expected Return values from a query to the AircraftType table.
library(dplyr)
library(httr)
library(jsonlite)
library(tidyverse)
library(ggmap)
library(data.table)
username <- "enterusernamehere"
my_API_key <- "enterAPIKeyhere"
Due to the number of flights around the world each day, FlightAware assigns a unique flight identifier to each flight. So while a Qantas flight with the same flight number, e.g. QF575, departs from Sydney to Perth every day, FlightAware logs each of these flights under a unique flight ID.
For this example I have selected a flight that has already been completed, so that its data can be plotted.
To extract the data from FlightXML2 we use the httr GET function with the URL for the API, http://flightxml.flightaware.com/json/FlightXML2, as the starting point. Note that the API is parameter based, so after the URL you include the name of the table followed by a "?" and the input parameter.
Example: to get the flight ID for QFA575 Sydney -> Perth we query the table "InFlightInfo" with "?ident=QFA575".
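The URL pattern described above can be sketched as a small helper, which is useful if you query several tables. This is only an illustration in base R: `flightxml_url` and `base_url` are hypothetical names introduced here, and the sketch assumes every input is passed as a simple name=value pair.

```r
# Base URL taken from the text above
base_url <- "http://flightxml.flightaware.com/json/FlightXML2"

# Hypothetical helper: append the table name, then "?" and the name=value inputs
flightxml_url <- function(table, params) {
  values <- vapply(
    params,
    function(p) utils::URLencode(as.character(p), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(base_url, "/", table, "?", query)
}

info_url <- flightxml_url("InFlightInfo", list(ident = "QFA575"))
info_url
```

The resulting string can be passed to GET in place of a hand-written URL.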
request <- GET("http://flightxml.flightaware.com/json/FlightXML2/InFlightInfo?ident=QFA575",
authenticate(user=username, password=my_API_key, type = "basic"))
stop_for_status(request)
content(request)
## $InFlightInfoResult
## $InFlightInfoResult$faFlightID
## [1] "QFA575-1553808300-schedule-0001"
##
## $InFlightInfoResult$ident
## [1] "QFA575"
##
## $InFlightInfoResult$prefix
## [1] "H"
##
## $InFlightInfoResult$type
## [1] "A332"
##
## $InFlightInfoResult$suffix
## [1] ""
##
## $InFlightInfoResult$origin
## [1] "YSSY"
##
## $InFlightInfoResult$destination
## [1] "YPPH"
##
## $InFlightInfoResult$timeout
## [1] "timed_out"
##
## $InFlightInfoResult$timestamp
## [1] 1553998639
##
## $InFlightInfoResult$departureTime
## [1] 1553982833
##
## $InFlightInfoResult$firstPositionTime
## [1] 1553982833
##
## $InFlightInfoResult$arrivalTime
## [1] 1553998672
##
## $InFlightInfoResult$longitude
## [1] 0
##
## $InFlightInfoResult$latitude
## [1] 0
##
## $InFlightInfoResult$lowLongitude
## [1] 115.9402
##
## $InFlightInfoResult$lowLatitude
## [1] -35.18582
##
## $InFlightInfoResult$highLongitude
## [1] 151.1704
##
## $InFlightInfoResult$highLatitude
## [1] -31.98092
##
## $InFlightInfoResult$groundspeed
## [1] 0
##
## $InFlightInfoResult$altitude
## [1] 0
##
## $InFlightInfoResult$heading
## [1] 0
##
## $InFlightInfoResult$altitudeStatus
## [1] ""
##
## $InFlightInfoResult$updateType
## [1] ""
##
## $InFlightInfoResult$altitudeChange
## [1] ""
##
## $InFlightInfoResult$waypoints
## [1] "-32.181 116.9 -32.145 116.62 -32.094 116.24 -32.069 116.1 -32.038 115.94 -31.945 115.96 -31.94 115.97"
If we inspect the content of the request you will notice that the faFlightID is QFA575-1553808300-schedule-0001.
The query also returns a range of other data such as origin and destination airports, heading, departure and arrival times. For now we just need the faFlightID.
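Rather than copying the ID by hand, it can be read straight out of the parsed list. The sketch below uses a mock list containing a subset of the fields from the output above, so it runs without an API key. It assumes the time fields are Unix epoch seconds, and the variable names are illustrative only.

```r
# Mock of content(request): a subset of the fields printed above
info <- list(InFlightInfoResult = list(
  faFlightID    = "QFA575-1553808300-schedule-0001",
  departureTime = 1553982833,
  arrivalTime   = 1553998672
))

result <- info$InFlightInfoResult
flight_id <- result$faFlightID

# Assumption: the time fields are Unix epoch seconds
departed <- as.POSIXct(result$departureTime, origin = "1970-01-01", tz = "UTC")
arrived  <- as.POSIXct(result$arrivalTime, origin = "1970-01-01", tz = "UTC")

# Flight time in minutes, derived from the two timestamps
flight_minutes <- as.numeric(difftime(arrived, departed, units = "mins"))
```

With a live response you would replace the mock with `info <- content(request)`.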
Using the faFlightID from the InFlightInfo table, we now use the httr GET function to pull the historical track data from the GetHistoricalTrack table.
To inspect the data returned by the query we use the httr content function. In this case I have stored the returned values as a list.
request <- GET("http://flightxml.flightaware.com/json/FlightXML2/GetHistoricalTrack?faFlightID=QFA575-1553808300-schedule-0001",
authenticate(user=username, password=my_API_key, type = "basic"))
stop_for_status(request)
flightdata <- content(request)
In order to plot the flight with ggmap we have to extract the waypoints (latitude and longitude) from the list. There are probably easier ways to do this, but in this case I have simply converted the list into a data frame.
The data ends up merged into one column of the data frame, so I have then used the separate function from tidyr (part of the tidyverse) to split it into columns.
The latitude and longitude columns still contain some text, which needs to be removed before the data type is changed from character to numeric.
Finally the latitude and longitude data is captured in its own data frame, and we can inspect it to ensure that only the waypoints remain.
flightdata2 <- as.data.frame(rbindlist(flightdata, fill=FALSE))
names(flightdata2)[1]<-"Data"
flightdata3 <- flightdata2 %>%
separate(Data, into = c("timestamp", "latitude", "longitude","groundspeed","altitude","altitudeStatus","updateType","altitudeChange"), sep = ",")
Latitude <- as.numeric(gsub("latitude =", "", flightdata3$latitude))
Longitude <-as.numeric(gsub("longitude =", "", flightdata3$longitude))
flightdata4 <- as.data.frame(cbind(Latitude, Longitude))
head(flightdata4)
## Latitude Longitude
## 1 -33.92597 151.1704
## 2 -33.91525 151.1646
## 3 -33.90784 151.1535
## 4 -33.89994 151.1385
## 5 -33.89227 151.1237
## 6 -33.88305 151.1068
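As one possible alternative to the string splitting above, the coordinates can be pulled directly from the nested list with vapply. This is a sketch only: it assumes the parsed response has the shape `GetHistoricalTrackResult$data`, with one list per waypoint carrying `latitude` and `longitude` fields, and it uses a mock built from the first three rows shown above instead of a live response.

```r
# Mock of content(request); the nesting is an assumption about the response shape
track <- list(GetHistoricalTrackResult = list(data = list(
  list(latitude = -33.92597, longitude = 151.1704),
  list(latitude = -33.91525, longitude = 151.1646),
  list(latitude = -33.90784, longitude = 151.1535)
)))

points <- track$GetHistoricalTrackResult$data

# Extract one numeric value per waypoint for each coordinate
waypoints <- data.frame(
  Latitude  = vapply(points, function(p) as.numeric(p$latitude), numeric(1)),
  Longitude = vapply(points, function(p) as.numeric(p$longitude), numeric(1))
)

head(waypoints)
```

Because vapply checks that each element yields exactly one number, a waypoint with a missing coordinate raises an error instead of silently producing a misaligned column.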
To check the waypoints and plot the data, we first register our Google key and pull a map of the Australian continent with ggmap.
GoogleKey <- "InsertGoogleKeyhere"
register_google(key = GoogleKey)
ggmap(get_map("Australia",zoom=4, scale=1))
ggmap(get_map("Australia", zoom = 4, scale = 1)) +
  geom_point(aes(x = Longitude, y = Latitude), data = flightdata4, shape = 21, color = "red", size = 1) +
  ggtitle("QF575 Sydney -> Perth Flight Path")
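If you would rather centre the map on the flight than on the whole continent, one option is to use the bounding box that InFlightInfo returned earlier. The sketch below computes the centre of that box in base R. The variable names are illustrative, and the values are copied from the lowLongitude, highLongitude, lowLatitude and highLatitude fields in the output above.

```r
# Bounding box values copied from the InFlightInfo output
low_lon  <- 115.9402
high_lon <- 151.1704
low_lat  <- -35.18582
high_lat <- -31.98092

# Centre of the box as a longitude/latitude pair, in that order
track_centre <- c(
  lon = mean(c(low_lon, high_lon)),
  lat = mean(c(low_lat, high_lat))
)
track_centre

# The pair could then be used as the map location, for example:
# get_map(location = track_centre, zoom = 4)
```

The zoom level is left at 4 here as a starting point and would likely need adjusting for shorter flights.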