The goal of this vignette is to learn how to use an API and visualise the data it returns, using NSW fuel prices as the example. The “jsonlite” package can be used to import JSON-formatted data into R. FuelCheck gives access to real-time fuel price information in NSW. This workbook shows how to import real-time data from the API in JSON format and convert it into a data frame, so the information can be analysed by the location, price and brand of the different petrol stations.
The FuelCheck API returns real-time fuel prices in JSON format, which the jsonlite package converts to a data frame. The API call below requests the price of unleaded 91 (U91) fuel at all available stations across NSW. There are 21 different brands operating in NSW; small independent petrol stations are grouped under the brand “Independent”. The column names returned by FuelCheck are converted to lower case using the “janitor” package.
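The request is a single URL whose query string describes a bounding box (bottom-left and top-right corners), a fuel type and a brand filter. As a rough sketch, the same URL can be assembled from a named vector of parameters, which makes each value easier to change. The names `base_url`, `params` and `fuel_url` are illustrative only; the endpoint and parameter values are copied from the call used in this vignette.

```r
# Endpoint and query parameters, copied from the URL used in this vignette
base_url <- "https://api.onegov.nsw.gov.au/FuelCheckApp/v1/fuel/prices/bylocation"
params <- c(
  bottomLeftLatitude  = "-41.574361",
  bottomLeftLongitude = "107.929688",
  topRightLatitude    = "-11.264612",
  topRightLongitude   = "155.039063",
  fueltype            = "U91",
  brands              = "SelectAll"
)

# Join each name=value pair with "&" and append the result after "?"
query <- paste0(names(params), "=", params, collapse = "&")
fuel_url <- paste0(base_url, "?", query)
fuel_url
```

None of these values contain characters that need URL encoding, so plain string concatenation is enough here.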
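library(tidyverse) # dplyr verbs, ggplot2 and the %>% pipe
library(janitor) # clean_names()
library(ggmap) # base map for the heatmap
library(jsonlite) # fromJSON()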
##
## Attaching package: 'jsonlite'
## The following object is masked from 'package:purrr':
##
## flatten
# URL for the API call that collects the fuel data
url <- paste0(
  "https://api.onegov.nsw.gov.au/FuelCheckApp/v1/fuel/prices/",
  "bylocation?bottomLeftLatitude=-41.574361&bottomLeftLongitude=107.929688&",
  "topRightLatitude=-11.264612&topRightLongitude=155.039063&",
  "fueltype=U91&brands=SelectAll"
)

# Download the JSON, tidy the column names and keep the columns we need
fueldata <- fromJSON(url) %>%
  clean_names() %>%
  select(name, lat, long, address, brand, price) %>%
  mutate(brand = as.factor(brand))
str(fueldata)
## 'data.frame': 1828 obs. of 6 variables:
## $ name : chr "Speedway Fairfield" "APEX PETROLEUM VILLAWOOD" "SPEEDWAY SACKVILLE STREET FAIRFIELD" "Metro Fuel Fairfield" ...
## $ lat : num -33.9 -33.9 -33.9 -33.9 -33.9 ...
## $ long : num 151 151 151 151 151 ...
## $ address: chr "251 The Horsley Drive, Fairfield NSW 2165" "896A Woodville Rd, Villawood NSW 2163" "115 Sackville Street, FAIRFIELD NSW 2165" "130 Hamilton Road, Fairfield NSW 2165" ...
## $ brand : Factor w/ 21 levels "7-Eleven","BP",..: 19 8 19 13 13 8 2 13 19 19 ...
## $ price : num 87.9 87.9 90.9 90.9 90.9 91.9 91.9 91.9 92.5 92.5 ...
head(fueldata)
## name lat long
## 1 Speedway Fairfield -33.87136 150.9628
## 2 APEX PETROLEUM VILLAWOOD -33.88389 150.9760
## 3 SPEEDWAY SACKVILLE STREET FAIRFIELD -33.87272 150.9452
## 4 Metro Fuel Fairfield -33.87264 150.9459
## 5 Metro Fairfield -33.87290 150.9499
## 6 Gas to connect -33.84721 150.9623
## address brand price
## 1 251 The Horsley Drive, Fairfield NSW 2165 Speedway 87.9
## 2 896A Woodville Rd, Villawood NSW 2163 Independent 87.9
## 3 115 Sackville Street, FAIRFIELD NSW 2165 Speedway 90.9
## 4 130 Hamilton Road, Fairfield NSW 2165 Metro Fuel 90.9
## 5 82 Hamilton Rd, FAIRFIELD NSW 2165 Metro Fuel 90.9
## 6 54c Fairfield Road, GUILDFORD WEST NSW 2161 Independent 91.9
The API response includes the name, location (latitude, longitude and address) and brand of each petrol station, along with its unleaded 91 price, for 1828 petrol stations in NSW.
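To see how a JSON response becomes a data frame, here is a minimal offline sketch that parses a hand-written JSON string instead of calling the API. The capitalised field names (`Name`, `Lat` and so on) are an assumption made for illustration, not the documented FuelCheck schema; the values are taken from the first two rows printed above. `tolower()` stands in for `clean_names()`, which does more than lower-casing.

```r
library(jsonlite)

# Hand-written JSON standing in for an API response.
# The field names are assumed for illustration only.
json_text <- '[
  {"Name": "Speedway Fairfield", "Lat": -33.87136, "Long": 150.9628,
   "Brand": "Speedway", "Price": 87.9},
  {"Name": "APEX PETROLEUM VILLAWOOD", "Lat": -33.88389, "Long": 150.9760,
   "Brand": "Independent", "Price": 87.9}
]'

# fromJSON() simplifies an array of flat objects into a data frame
stations <- fromJSON(json_text)

# Lower-case the column names (base R stand-in for janitor::clean_names())
names(stations) <- tolower(names(stations))

# Store brand as a factor, as the vignette does
stations$brand <- as.factor(stations$brand)

str(stations)
```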
We will first compare the 5 cheapest petrol stations with the 5 most expensive.
# Sort stations from cheapest to most expensive
cheappetrol <- fueldata %>%
  arrange(price) %>%
  select(brand, price, address)

cheappetrol %>%
  head(5)
## brand price address
## 1 Speedway 87.9 251 The Horsley Drive, Fairfield NSW 2165
## 2 Independent 87.9 896A Woodville Rd, Villawood NSW 2163
## 3 Speedway 90.9 115 Sackville Street, FAIRFIELD NSW 2165
## 4 Metro Fuel 90.9 130 Hamilton Road, Fairfield NSW 2165
## 5 Metro Fuel 90.9 82 Hamilton Rd, FAIRFIELD NSW 2165
# cheappetrol is sorted by ascending price, so its last 5 rows
# are the 5 most expensive stations
cheappetrol %>%
  tail(5)
## brand price address
## 1824 Caltex 167.9 43 Urben st, Urbenville NSW 2475
## 1825 Liberty 168.0 2 Myers Street, WILCANNIA NSW 2880
## 1826 Shell 170.0 31 Obley St, CUMNOCK NSW 2867
## 1827 Caltex 170.0 1466 Kyogle Rd, Uki NSW 2484
## 1828 Mobil 191.9 Cnr Keraro Rd & Johnston St, WHITE CLIFFS NSW 2836
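As a rough check on the two tables above, the gap between the extremes can be computed directly. This sketch uses only the ten prices printed in those tables (not the full dataset), deliberately shuffled so that the sort step does something; the variable names are illustrative.

```r
# The ten prices shown in the two tables above (cents per litre), shuffled
prices <- c(170.0, 87.9, 191.9, 90.9, 168.0, 87.9, 90.9, 170.0, 167.9, 90.9)

# Sort ascending, then take the first and last values
sorted   <- sort(prices)
cheapest <- head(sorted, 1)
dearest  <- tail(sorted, 1)

# Ratio and absolute gap between the most and least expensive price
price_ratio <- dearest / cheapest
price_gap   <- dearest - cheapest

round(price_ratio, 2)
price_gap
```

On these figures the ratio is a little under 2.2, a gap of about 104 cents per litre.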
The most expensive unleaded 91 price is more than double the cheapest U91 price available in NSW. The cheapest petrol stations are mostly located in south-west Sydney, whereas the most expensive ones are in regional NSW. Now, let’s have a look at the distribution of unleaded 91 fuel prices.
fueldata %>%
  ggplot(aes(x = price)) +
  # histogram scaled to density so it shares a y-axis with the density curve
  geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
  # dashed vertical line at the mean price
  geom_vline(aes(xintercept = mean(price)), color = "blue", linetype = "dashed", size = 1) +
  geom_density(alpha = .2, fill = "#FF6666") +
  labs(title = "Distribution of U91 Petrol Prices in NSW") +
  xlab("Price") + ylab("Density")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
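The dashed line in the plot marks the mean price. As a small illustration of how the mean and median are obtained, the sketch below uses only the first ten prices shown in the `str()` output earlier, so its results describe that sample and not the full dataset. On the real data the same calls would be applied to `fueldata$price`.

```r
# First ten prices from the str() output (cents per litre)
sample_prices <- c(87.9, 87.9, 90.9, 90.9, 90.9, 91.9, 91.9, 91.9, 92.5, 92.5)

# Mean: sum of the values divided by how many there are
sample_mean <- mean(sample_prices)

# Median: middle of the sorted values (average of the 5th and 6th here)
sample_median <- median(sample_prices)

sample_mean
sample_median
```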
The mean and median fuel prices are 124.85 and 125.9 cents per litre respectively. You can see that prices vary widely between fuel stations. However, are there any noticeable price differences among brands? The code below compares the average U91 price of the 8 largest fuel brands.
# Number of stations and average price per brand, largest brands first
fueldata1 <- fueldata %>%
  group_by(brand) %>%
  summarise(no_rows = n(), price = mean(price)) %>%
  filter(brand != "Independent") %>% # remove independent small petrol stations
  arrange(-no_rows)

# Keep the 8 brands with the most stations, ordered by average price
fueldata2 <- fueldata1 %>%
  head(n = 8) %>%
  arrange(price)

fueldata2 %>%
  ggplot(aes(x = brand, y = price)) +
  geom_bar(stat = "identity", fill = "blue", width = 0.5) +
  labs(title = "Average U91 price of the 8 largest petrol brands in NSW") +
  coord_cartesian(ylim = c(80, 150)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Brand") + ylab("Price")
Metro Fuel offers the cheapest unleaded 91 petrol among the 8 largest petrol brands. Shell was the most expensive at 129.41 cents per litre. However, the differences in average price between these brands were relatively small.
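The per-brand averaging behind this comparison can be sketched on the six rows printed by `head(fueldata)` earlier. This is a base R stand-in for the `group_by()` and `summarise()` steps, and with only six stations the averages are purely illustrative, not the brand averages reported in this vignette.

```r
# Brand and price for the six stations shown by head(fueldata)
six <- data.frame(
  brand = c("Speedway", "Independent", "Speedway",
            "Metro Fuel", "Metro Fuel", "Independent"),
  price = c(87.9, 87.9, 90.9, 90.9, 90.9, 91.9)
)

# Average price per brand
brand_avg <- aggregate(price ~ brand, data = six, FUN = mean)

# Drop the catch-all "Independent" brand, then sort by average price
brand_avg <- brand_avg[brand_avg$brand != "Independent", ]
brand_avg <- brand_avg[order(brand_avg$price), ]

brand_avg
```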
# Brands ordered from lowest to highest average price
cheapfuel <- fueldata1 %>%
  arrange(price) %>%
  select(brand, price)

cheapfuel %>%
  head(5)
## # A tibble: 5 x 2
## brand price
## <fct> <dbl>
## 1 Speedway 104.
## 2 Prime Petroleum 113.
## 3 Enhance 114.
## 4 Budget 116.
## 5 Matilda 117.
The cheapest 5 brands are Speedway, Prime Petroleum, Enhance, Budget and Matilda. The cheapest brand, Speedway, is about 20.45 cents per litre cheaper than the average U91 price.
# Brands ordered from highest to lowest average price
expensivefuel <- fueldata1 %>%
  arrange(-price) %>%
  select(brand, price)

expensivefuel %>%
  head(5)
## # A tibble: 5 x 2
## brand price
## <fct> <dbl>
## 1 South West 143.
## 2 Inland Petroleum 134.
## 3 Lowes 132.
## 4 Liberty 130.
## 5 Shell 129.
The most expensive 5 brands are South West, Inland Petroleum, Lowes, Liberty and Shell. The most expensive brand, South West, is about 18.45 cents per litre more expensive than the average U91 price.
Petrol prices in the northern part of Sydney are generally thought to be higher than those in the south or west of Sydney. Using the “ggmap” package, the prices can be visualised as a heatmap.
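# Bounding box covering the Sydney area
nsw_bb <- c(left = 150.8,
            bottom = -34,
            right = 151.3,
            top = -33.7)
# Depending on your ggmap version, a different tile function
# (for example get_stadiamap()) may be needed here
nsw_stamen <- get_stamenmap(bbox = nsw_bb, zoom = 12, maptype = "terrain")

# Average price in each grid cell, drawn over the base map
ggmap(nsw_stamen) +
  stat_summary_2d(data = fueldata, aes(x = long, y = lat, z = price),
                  alpha = 0.5, bins = 40) +
  scale_fill_gradient(name = "Price", low = "green", high = "red") +
  xlab("Longitude") + ylab("Latitude")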
## Warning: Removed 1329 rows containing non-finite values (stat_summary2d).
## Warning: Removed 1 rows containing missing values (geom_tile).
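The first warning is most likely expected: the map covers only the Sydney bounding box, so stations elsewhere in NSW fall outside the plotted range and are dropped. If so, 1828 - 1329 = 499 stations remain inside the box. A minimal sketch of that filter is shown below, using three stations from the `head()` output plus one made-up point well outside Sydney (its name and coordinates are hypothetical).

```r
# Bounding box used for the map
nsw_bb <- c(left = 150.8, bottom = -34, right = 151.3, top = -33.7)

# Three real rows from head(fueldata) and one hypothetical regional point
stations <- data.frame(
  name = c("Speedway Fairfield", "APEX PETROLEUM VILLAWOOD",
           "Gas to connect", "Hypothetical regional station"),
  lat  = c(-33.87136, -33.88389, -33.84721, -31.5),
  long = c(150.9628, 150.9760, 150.9623, 143.4)
)

# TRUE when a station lies within the box on both axes
inside <- stations$long >= nsw_bb[["left"]] &
  stations$long <= nsw_bb[["right"]] &
  stations$lat >= nsw_bb[["bottom"]] &
  stations$lat <= nsw_bb[["top"]]

sum(inside)  # stations that would be drawn
sum(!inside) # stations that would be dropped
```

Filtering `fueldata` this way before plotting would avoid the warning, at the cost of one extra step.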
Looking at the map, fuel prices do appear to be higher in northern Sydney than in other areas of Sydney. It may be a good idea to fill up when you are in south or west Sydney!
In this workbook, I have shown how to analyse a real-time dataset: retrieving it through an API, cleaning it, and plotting it with a histogram, a bar plot and a heatmap.