In this assignment, we will take our flights relational database and migrate it to a Neo4j NoSQL database.

First, we will load in our flights and airports tables that are used in a relational database.

flights <- read.csv(file='https://raw.githubusercontent.com/omarp120/DATA607Week12/master/flights.csv')
airports <- read.csv(file='https://raw.githubusercontent.com/omarp120/DATA607Week12/master/airports.csv')
airports <- airports[1:4,] #trim the excess rows from our table

Next, we initiate our Neo4j graph database.

library(RNeo4j)

graph = startGraph("http://localhost:7474/db/data/",
                   username = 'neo4j',
                   password = 'neo4j1')
clear(graph, input = FALSE)

We then create our flights nodes. Attributes: flight number, airline, origin and destination.

q <- 'CREATE (a:flights {flight: {flight}, airline: {airline}, origin: {origin}, destination: {destination}})'

t = newTransaction(graph)

for(i in 1:length(flights$flight)) {
    flight = flights[i,]$flight
    airline = flights[i,]$airline
    origin = flights[i,]$depart
    destination = flights[i,]$arrive 
    appendCypher(t, q, flight = flight, airline = airline, origin = origin, destination = destination)
}
commit(t)

We also create nodes for our airports. Attributes: airport iata code, city and state.

q1 <- 'CREATE (b:airports {iata: {iata}, city: {city}, state: {state}})'

t = newTransaction(graph)

for(i in 1:length(airports$label)) {
     iata = airports[i,]$label 
     city = airports[i,]$city
     state = airports[i,]$state
     appendCypher(t, q1, iata = iata, city = city, state = state)
}
commit(t)

Next, we create departure relationships between the flights and airport nodes.

q2 = '
MATCH (a:flights {flight:{flight}}), (b:airports {iata:{depart}}) 
CREATE (b) -[r:departs {takeoff:{takeoff}}] -> (a)
'

t = newTransaction(graph)
for(i in 1:length(flights$flight)) {
  depart = flights[i,]$depart
  flight = flights[i,]$flight
  takeoff = flights[i,]$takeoff
  appendCypher(t, q2, depart = depart, takeoff = takeoff, flight = flight)
}
commit(t)

We also create arrival relationships between the flights and airport nodes.

q3 = '
MATCH (a:flights {flight:{flight}}), (b:airports {iata:{arrive}}) 
CREATE (a) -[r:arrivesAt {landing:{landing}}] -> (b)
'

t = newTransaction(graph)
for(i in 1:length(flights$flight)) {
  arrive = flights[i,]$arrive
  flight = flights[i,]$flight
  landing = flights[i,]$landing
  appendCypher(t, q3, arrive = arrive, landing = landing, flight = flight)
}
commit(t)

The resulting graph network looks as follows:

There are advantages and disadvantages to relational databases versus NoSQL databases. The biggest advantage for relational databases is that the technology is much more advanced and lends well to analyses, but queries may require many joins that slow down processing. NoSQL graph databases have faster and more direct queries, but since this technology is more recent, there are fewer tools for analyses.