Nighttime satellite images are some of the most stunning photographs. Just have a look at the image below. What you see is a cloud-free composite of numerous individual satellite images taken in 2013. Together they not only produce a beautiful image of our planet, they also provide a great source of data. In this tutorial you will learn how you can use the open-source software QGIS to turn satellite images into data that you can then analyze using R. In particular, we will analyze how closely artificial lighting is related to economic activity as measured by gross domestic product.
Today there are a number of places where you can get free satellite images. The data for this post come from NOAA and can be downloaded here. NOAA provides yearly cloud-free nighttime composite images of planet Earth. Because of their high resolution, the files are quite large, so bear that in mind when opening them on your computer. The images come as .tif (TIFF) files, a raster format that, unlike the more commonly used .jpeg format, typically stores pixel values without lossy compression.
Before we start with any formal sort of analysis, let’s have a more detailed look at the image above. We can clearly see the coastlines of many parts of the world. Large parts of North America, pretty much all of Europe and the islands of Japan are brightly lit. Other areas, like Greenland, the Amazon rain forest or the Sahara desert, are nearly pitch black.
Let’s zoom in on four parts of the photograph that are particularly interesting (see below). We will start with an image of the continental United States. Unsurprisingly, the United States, as a highly developed nation, emits a great deal of artificial light at night. We can easily identify all major population centers and the roads and highways connecting them. But we also notice how the country is somewhat split in half. The image clearly reflects the fact that most of the people living in the United States live in the country’s eastern half. An even more bizarre pattern emerges in Egypt (second image). Here, almost all sources of artificial light are tightly clustered along the banks of the Nile, the country’s primary source of water. In Argentina (third image), modern population centers follow old railway lines, creating a web-like pattern with cities neatly aligned and spaced equally far apart. Lastly, Korea (fourth image) tells the story of a country that just couldn’t be divided more deeply. The democratic south is brightly illuminated, while the socialist north, despite being home to millions of people, is so dark that one almost mistakes the southern half of the Korean peninsula for an island.
In order to study the relationship between nighttime lighting and economic activity, we will have to clip the original image along national borders. Unfortunately, we cannot use a simple rectangular crop. Have a look at the following two images of Germany. The left image is the result of a simple rectangular crop around Germany. The problem: Germany’s borders are anything but a perfect rectangle. The cropped image also includes parts of Germany’s neighboring countries like the Netherlands, Belgium, and France. Consequently, if we were to measure the amount of light in this picture, we would not only measure light emitted from Germany but also light emitted from other countries. Now have a look at the right-hand image. Here, we really only see light emitted from within Germany. Everything else is completely black.
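To make the difference concrete, here is a minimal sketch in base R. It uses a tiny made-up 5 x 5 "image" and a hypothetical plus-shaped "country" (neither comes from the actual satellite data) to compare how much light a rectangular crop and a border mask would each attribute to the country.

```r
# Toy 5 x 5 "image": every pixel has brightness 1, so sums simply count pixels.
img <- matrix(1, nrow = 5, ncol = 5)

# Hypothetical country mask: TRUE where a pixel lies inside the border.
# Here the "country" is a plus-shaped region in the centre of the image.
inside <- matrix(FALSE, nrow = 5, ncol = 5)
inside[3, 2:4] <- TRUE
inside[2:4, 3] <- TRUE

# Rectangular crop: the bounding box around the country.
rows <- range(which(rowSums(inside) > 0))
cols <- range(which(colSums(inside) > 0))
rect_crop <- img[rows[1]:rows[2], cols[1]:cols[2]]

# Mask clip: pixels outside the border are set to zero (black).
mask_clip <- img * inside

rect_light <- sum(rect_crop)  # also counts the corners of the bounding box
mask_light <- sum(mask_clip)  # counts only pixels inside the border
```

In this toy case the rectangular crop attributes light from four "foreign" corner pixels to the country, which is exactly the kind of overcounting the mask avoids.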
You can produce clipped images that follow a country’s borders using the free and open-source software QGIS, which you can download here. The software runs on Linux, Windows and Mac computers alike. When you open QGIS, it looks like the image below. You start the clipping routine by dragging the original .tif image into QGIS. Next, we need to tell QGIS which region to clip, i.e., we must tell QGIS where each country’s borders are. We can do so using so-called “shapefiles” or “GeoPackages”. You can download GeoPackages of national (as well as regional and even municipal) borders from GADM. Once you have downloaded a country’s GeoPackage, you can drag and drop the corresponding .gpkg file into QGIS. The GeoPackage includes several layers of administrative borders. Simply select the layer with only a single feature to import national borders only. You will then see a mask appear at the location of the country on the satellite image of the world. Now click on “Raster” at the top of the screen and select “Extraction”, after which you click on “Clip Raster by Mask Layer”. In the pop-up window select the only available “Input layer” and hit “OK”. Wait until the extraction is complete and then close the pop-up window. Next, in the “Layers” tab at the bottom left of your screen, right-click the newly created clipped layer (not the country mask itself, which is a vector layer and cannot be saved as an image) and select “Export” and then “Save As …”; the exact menu labels may differ slightly between QGIS versions. Select a location and file name of your choosing and click on “OK”. You will now have successfully extracted the lights data of one particular country and saved them in a new .tif file.
Once we have clipped all the satellite images that we need, we are ready to import these images into R. We do so using the “readTIFF()” function of the tiff package.
library(tiff)
img <- readTIFF("filename.tif")
class(img)
## [1] "matrix"
img[1:10,1:10]
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
##  [1,]    0    0    0    0    0    9    0    0    0     3
##  [2,]    1    2    0    6    0    9    4   10    7    10
##  [3,]    7    0    3    2    2    0    0    4    5     0
##  [4,]    4    6    9    4    0    0    0    0    0     3
##  [5,]    0    4    1    0    0    0    0    0    0     0
##  [6,]    9    0    0    6    0    7    2    3    6     0
##  [7,]    0    0    2    0    6    2    0    2    0     0
##  [8,]    0   10    6    6    0    0    0    0    0     0
##  [9,]    0    0    0    0    0    0    0    0    7     9
## [10,]    0    0    8    6    2    2    3    0    0     3
The result is a matrix of numerical values that tell us how bright each pixel in the image is. Small values correspond to very dark shades, with zero indicating that a pixel is pitch black. Conversely, large values indicate that a pixel is very bright. Thus, we now have a numerical “light map” of a country telling us which of its regions emit large amounts of light into the nighttime sky and which regions are presumably largely unpopulated.
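To go from such a light map to a single number per country, the matrix has to be summarized somehow. The sketch below shows a few simple summaries on a small made-up matrix. The choice of summaries is an assumption for illustration, and the name `light_map` is hypothetical; with real data you would use the matrix returned by readTIFF() instead.

```r
# Toy light map standing in for the matrix returned by readTIFF().
light_map <- matrix(c(0, 0,  3,
                      1, 9,  0,
                      0, 4, 10),
                    nrow = 3, byrow = TRUE)

# Total brightness: the sum over all pixels.
total_light <- sum(light_map)

# Share of pixels that emit any light at all (value above zero).
lit_share <- mean(light_map > 0)

# Row and column index of the brightest pixel(s).
brightest <- which(light_map == max(light_map), arr.ind = TRUE)
```

A sum rewards both large lit areas and intense lighting, whereas the lit share ignores intensity entirely, so which summary is appropriate depends on the question being asked.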
Several papers have already explored how satellite images can be used to measure economic activity and how we can even use them to debunk fraudulent GDP figures. For example, check out these two recent working papers by Hu & Yao and Martinez. To demonstrate how easy it is to run similar analyses ourselves, I downloaded GDP and population data from the World Bank and merged them with the light data from the clipped nighttime satellite images of almost 200 countries. My final dataset looks like this:
df <- readRDS("data.RDS")
head(df)
##   code              country          gdp      pop     gdpCap       light
## 1  AFG          Afghanistan  20561069558 32269589   637.1655   681.47059
## 2  AGO               Angola 136709862831 26015780  5254.8823  1081.07059
## 3  ALB              Albania  12776217195  2895092  4413.0609   639.37255
## 4  AND              Andorra   3193704343    80774 39538.7667    68.36471
## 5  ARE United Arab Emirates 390107556161  9197910 42412.6303  4943.32941
## 6  ARG            Argentina 552025140252 42202935 13080.2547 15969.23922
##                       region
## 1                 South Asia
## 2         Sub-Saharan Africa
## 3      Europe & Central Asia
## 4      Europe & Central Asia
## 5 Middle East & North Africa
## 6  Latin America & Caribbean
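The merge step behind a table like this can be sketched with base R’s merge(). The two small data frames below are made-up stand-ins (the country codes and values are hypothetical, not World Bank figures), and joining on a three-letter country code is an assumption based on the `code` column shown above.

```r
# Hypothetical inputs; the values are illustrative stand-ins, not real data.
econ <- data.frame(code = c("AAA", "BBB", "CCC"),
                   gdp  = c(100, 200, 300),
                   pop  = c(10, 20, 50),
                   stringsAsFactors = FALSE)
lights <- data.frame(code  = c("AAA", "CCC", "DDD"),
                     light = c(5, 12, 7),
                     stringsAsFactors = FALSE)

# Inner join on the country code: only countries found in both tables remain.
df_toy <- merge(econ, lights, by = "code")

# GDP per capita, computed as GDP divided by population.
df_toy$gdpCap <- df_toy$gdp / df_toy$pop
```

Note that an inner join silently drops countries that are missing from either source, so it is worth comparing the row counts before and after merging.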
Now, let’s have a look at the relationship between gross domestic product and artificial lighting:
library(ggplot2)
ggplot(df, aes(x = gdp, y = light)) +
  geom_point(aes(col = region,
                 size = pop/1e6),
             alpha = .8) +
  scale_x_continuous(trans = "log10") +
  scale_y_continuous(trans = "log10") +
  labs(x = "GDP (in USD)",
       y = "Light emission (in px points)",
       size = "Population (in m)",
       color = "Region")
As we can see, with both axes on a logarithmic scale, the points fall roughly along a straight line with a slope close to one. In other words, a one percent higher GDP is associated with roughly one percent more light emission. Bear in mind, however, that not all countries report their GDP equally accurately. In fact, Martinez (2019), using nighttime satellite images, was able to show that authoritarian governments might be fudging their national statistics to inflate their GDP figures. Thus, processing satellite data is not only a fun exercise but can also be a very powerful tool.
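One way to put a number on the slope of such a log-log relationship is a simple linear regression on the logged variables. The sketch below runs on synthetic data generated with a true slope of one. The data frame `toy` and all its values are made up for illustration, and this is not necessarily how the papers mentioned above estimate the relationship.

```r
set.seed(1)

# Synthetic data: log10(light) depends on log10(gdp) with a true slope of 1.
n <- 200
log_gdp   <- runif(n, min = 8, max = 13)
log_light <- -7 + 1 * log_gdp + rnorm(n, sd = 0.3)
toy <- data.frame(gdp = 10^log_gdp, light = 10^log_light)

# Log-log regression: the coefficient on log10(gdp) is the elasticity,
# i.e. the percentage change in light per one percent change in GDP.
fit   <- lm(log10(light) ~ log10(gdp), data = toy)
slope <- unname(coef(fit)[2])
```

With the real dataset, the same call with `data = df` would give the corresponding estimate, and `confint(fit)` would show how precisely the slope is pinned down.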