The goal of this vignette is to learn how to use an API and visualise the results, using NSW fuel price data. The “jsonlite” package can be used to import JSON-formatted data into R. FuelCheck gives access to real-time fuel price information in NSW. This workbook shows how to import real-time data from the API in JSON format and convert it into a data frame, so the information can be analysed by location, price and brand across the different petrol stations.

Loading Data

The FuelCheck API returns real-time fuel prices in JSON format. JSON data can be converted to a data frame using the “jsonlite” package. The API call below requests the current unleaded 91 (U91) prices from all available stations across NSW. There are 21 different brands operating in NSW; small independent petrol stations are grouped under the brand “Independent”. The column names returned by FuelCheck are converted to lower case using the “janitor” package.

library(tidyverse) # dplyr verbs, the %>% pipe and ggplot2
library(janitor)   # clean_names()
library(jsonlite)
## 
## Attaching package: 'jsonlite'
## The following object is masked from 'package:purrr':
## 
##     flatten
# URL for the API call that collects the fuel data
url <- paste0("https://api.onegov.nsw.gov.au/FuelCheckApp/v1/fuel/prices/",
              "bylocation?bottomLeftLatitude=-41.574361&bottomLeftLongitude=107.929688&",
              "topRightLatitude=-11.264612&topRightLongitude=155.039063&",
              "fueltype=U91&brands=SelectAll")

fueldata <- fromJSON(url) %>%
  clean_names() %>%
  select(name, lat, long, address, brand, price) %>%
  mutate(brand = as.factor(brand))

str(fueldata)
## 'data.frame':    1828 obs. of  6 variables:
##  $ name   : chr  "Speedway Fairfield" "APEX PETROLEUM VILLAWOOD" "SPEEDWAY SACKVILLE STREET FAIRFIELD" "Metro Fuel Fairfield" ...
##  $ lat    : num  -33.9 -33.9 -33.9 -33.9 -33.9 ...
##  $ long   : num  151 151 151 151 151 ...
##  $ address: chr  "251 The Horsley Drive, Fairfield NSW 2165" "896A Woodville Rd, Villawood NSW 2163" "115 Sackville Street, FAIRFIELD NSW 2165" "130 Hamilton  Road, Fairfield NSW 2165" ...
##  $ brand  : Factor w/ 21 levels "7-Eleven","BP",..: 19 8 19 13 13 8 2 13 19 19 ...
##  $ price  : num  87.9 87.9 90.9 90.9 90.9 91.9 91.9 91.9 92.5 92.5 ...
head(fueldata)
##                                  name       lat     long
## 1                  Speedway Fairfield -33.87136 150.9628
## 2            APEX PETROLEUM VILLAWOOD -33.88389 150.9760
## 3 SPEEDWAY SACKVILLE STREET FAIRFIELD -33.87272 150.9452
## 4                Metro Fuel Fairfield -33.87264 150.9459
## 5                     Metro Fairfield -33.87290 150.9499
## 6                      Gas to connect -33.84721 150.9623
##                                       address       brand price
## 1   251 The Horsley Drive, Fairfield NSW 2165    Speedway  87.9
## 2       896A Woodville Rd, Villawood NSW 2163 Independent  87.9
## 3    115 Sackville Street, FAIRFIELD NSW 2165    Speedway  90.9
## 4      130 Hamilton  Road, Fairfield NSW 2165  Metro Fuel  90.9
## 5          82 Hamilton Rd, FAIRFIELD NSW 2165  Metro Fuel  90.9
## 6 54c Fairfield Road, GUILDFORD WEST NSW 2161 Independent  91.9
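
To see what fromJSON() does with a response of this kind, here is a minimal sketch that parses a small inline JSON string instead of calling the live API. The field names and the two stations are made up for illustration and are not the actual FuelCheck schema; tolower() stands in for clean_names() in this simple case.

```r
library(jsonlite)

# Hypothetical payload: an array of station objects with mixed-case field names
json_txt <- '[
  {"Name": "Station A", "Brand": "Speedway",    "Price": 87.9},
  {"Name": "Station B", "Brand": "Independent", "Price": 91.9}
]'

# fromJSON() simplifies an array of flat objects into a data frame
sample_df <- fromJSON(json_txt)

# Lower-case the column names (clean_names() does this and more)
names(sample_df) <- tolower(names(sample_df))

# Treat brand as a categorical variable, as in the main example
sample_df$brand <- as.factor(sample_df$brand)

str(sample_df)
```

Each JSON object becomes one row and each field becomes one column, which is why the real response can be piped straight into select() and mutate().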

Data structure

The data returned by the API includes the name, location and brand of each petrol station, together with its unleaded 91 price, for 1828 petrol stations in NSW.

We will first compare the 5 cheapest and the 5 most expensive petrol stations.

cheappetrol <- fueldata %>%
  arrange(price) %>%
  select(brand, price, address) 
cheappetrol %>%
  head(5)
##         brand price                                   address
## 1    Speedway  87.9 251 The Horsley Drive, Fairfield NSW 2165
## 2 Independent  87.9     896A Woodville Rd, Villawood NSW 2163
## 3    Speedway  90.9  115 Sackville Street, FAIRFIELD NSW 2165
## 4  Metro Fuel  90.9    130 Hamilton  Road, Fairfield NSW 2165
## 5  Metro Fuel  90.9        82 Hamilton Rd, FAIRFIELD NSW 2165
# cheappetrol is sorted by ascending price, so its last five rows are the
# five most expensive stations
cheappetrol %>%
  tail(5)
##        brand price                                            address
## 1824  Caltex 167.9                   43 Urben st, Urbenville NSW 2475
## 1825 Liberty 168.0                 2 Myers Street, WILCANNIA NSW 2880
## 1826   Shell 170.0                      31 Obley St, CUMNOCK NSW 2867
## 1827  Caltex 170.0                       1466 Kyogle Rd, Uki NSW 2484
## 1828   Mobil 191.9 Cnr Keraro Rd & Johnston St, WHITE CLIFFS NSW 2836
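
As a quick check on the spread, the ratio between the highest and lowest prices in the two listings above can be computed directly. This sketch hard-codes the two values from the output; on the full data, range(fueldata$price) would supply them.

```r
# Lowest and highest U91 prices taken from the listings above (cents per litre)
lowest  <- 87.9
highest <- 191.9

# How many times more expensive the dearest station is than the cheapest
price_ratio <- highest / lowest
round(price_ratio, 2)  # 2.18
```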

The most expensive unleaded 91 price is more than double the cheapest U91 price available in NSW. The cheapest petrol stations are mostly located in south-west Sydney, whereas the most expensive ones are in regional NSW. Now, let’s have a look at the distribution of unleaded 91 fuel prices.

fueldata %>%
  ggplot(aes(x = price)) +
  geom_histogram(aes(y=..density..), colour="black", fill="white") +
  geom_vline(aes(xintercept=mean(price)),color="blue", linetype="dashed", size=1) +
  geom_density(alpha=.2, fill="#FF6666") +
  labs(title = "Distribution of U91 Petrol Prices in NSW") +
  xlab("Price") + ylab("Density")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
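
The dashed vertical line in the histogram marks the mean price. Here is a minimal sketch of how the mean and median can be computed, using only the six prices printed by head(fueldata) earlier as a stand-in for the full column, so the results differ from the state-wide figures. With the real data, the same calls would be applied to fueldata$price.

```r
# The six prices shown in the head(fueldata) output (cents per litre)
sample_prices <- c(87.9, 87.9, 90.9, 90.9, 90.9, 91.9)

mean_price   <- mean(sample_prices)    # arithmetic mean
median_price <- median(sample_prices)  # middle value of the sorted prices

round(c(mean = mean_price, median = median_price), 2)
```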

The mean and median fuel prices are 124.85 and 125.9 cents per litre respectively. You can see that prices vary widely between fuel stations. However, are there any noticeable price differences among brands? The code below compares the average U91 price of the 8 largest fuel brands.

fueldata1 <- fueldata %>%
  group_by(brand) %>%
  summarise(no_rows = n(), price = mean(price)) %>% # stations per brand and mean price
  filter(brand != "Independent") %>% # remove independent small petrol stations
  arrange(desc(no_rows))
# the 8 brands with the most stations, ordered from cheapest to most expensive
fueldata2 <- fueldata1 %>%
  head(n = 8) %>%
  arrange(price)


fueldata2 %>%
  ggplot(aes(x = reorder(brand, price), y = price)) + # order the bars by average price
  geom_bar(stat = "identity", fill = "blue", width = 0.5) +
  labs(title = "Average U91 price of the 8 largest petrol brands in NSW") +
  coord_cartesian(ylim = c(80, 150)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Brand") + ylab("Price")

Metro Fuel offers the cheapest unleaded 91 among the 8 largest petrol brands. Shell was the most expensive at 129.41 cents per litre. However, the differences in average price among these brands were relatively small.

cheapfuel <- fueldata1 %>%
  arrange(price) %>%
  select(brand, price) 
cheapfuel %>%
  head(5)
## # A tibble: 5 x 2
##   brand           price
##   <fct>           <dbl>
## 1 Speedway         104.
## 2 Prime Petroleum  113.
## 3 Enhance          114.
## 4 Budget           116.
## 5 Matilda          117.

The 5 cheapest brands are Speedway, Prime Petroleum, Enhance, Budget and Matilda. The cheapest brand, Speedway, is about 20.45 cents per litre cheaper than the average U91 price.

expensivefuel <- fueldata1 %>%
  arrange(-price) %>%
  select(brand, price) 
  
expensivefuel %>%
  head(5)
## # A tibble: 5 x 2
##   brand            price
##   <fct>            <dbl>
## 1 South West        143.
## 2 Inland Petroleum  134.
## 3 Lowes             132.
## 4 Liberty           130.
## 5 Shell             129.

The 5 most expensive brands are South West, Inland Petroleum, Lowes, Liberty and Shell. The most expensive brand, South West, is about 18.45 cents per litre more expensive than the average U91 price.

Geographic Analysis on Fuel Price

Petrol in the northern part of Sydney is generally thought to be more expensive than in the south or west of the city. Using the “ggmap” package, prices can be visualised on a map as a heatmap.

library(ggmap)
# bounding box covering the Sydney metropolitan area
nsw_bb <- c(left = 150.8,
            bottom = -34,
            right = 151.3,
            top = -33.7)
nsw_stamen <- get_stamenmap(bbox = nsw_bb, zoom = 12, maptype = "terrain")
ggmap(nsw_stamen)  +
  stat_summary_2d(data = fueldata, aes(x = long, y = lat, z = price), 
                  alpha = 0.5, bins = 40) +
  scale_fill_gradient(name = "Price", low = "green", high = "red") +
  xlab("Longitude") + ylab("Latitude")
## Warning: Removed 1329 rows containing non-finite values (stat_summary2d).
## Warning: Removed 1 rows containing missing values (geom_tile).
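
The first warning most likely arises because the API returns stations from all of NSW while the map only covers the Sydney bounding box, so points outside it are dropped. One way to make this explicit is to filter the data to the bounding box before plotting. Below is a minimal sketch in base R, using a small made-up data frame (the station names, coordinates and prices are hypothetical, not FuelCheck data).

```r
# Same bounding box as used for the map
nsw_bb <- c(left = 150.8, bottom = -34, right = 151.3, top = -33.7)

# Hypothetical stations: two inside the box, one well outside it
stations <- data.frame(
  name  = c("Inside A", "Inside B", "Outside C"),
  lat   = c(-33.87, -33.75, -31.56),
  long  = c(150.96, 151.20, 143.38),
  price = c(90.9, 129.9, 168.0)
)

# Keep only the rows whose coordinates fall within the box
in_box <- stations$long >= nsw_bb["left"]   & stations$long <= nsw_bb["right"] &
          stations$lat  >= nsw_bb["bottom"] & stations$lat  <= nsw_bb["top"]
sydney_stations <- stations[in_box, ]

nrow(stations) - nrow(sydney_stations)  # number of rows dropped
```

Applying the same filter to fueldata and passing the result to stat_summary_2d() should then avoid the "non-finite values" warning, assuming out-of-range points are indeed its cause.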

Looking at the map, fuel prices do appear to be higher in northern Sydney than in other parts of the city. It may be a good idea to fill up when you are in south or west Sydney!

Summary

In this workbook, I have shown how to analyse a real-time dataset: retrieving it through an API, cleaning it, and plotting it with a histogram, a bar plot and a heatmap.