Welcome to the first script of Module 1!
At the example of some agricultural statistics of Armenia, you will here learn how to:
Please read the following code instructions carefully, try to understand the code that follows each instruction, execute it and see what happens. Do not hesitate to change the code or write your own code, and execute it as well!
We start off with defining our working directory. This should be the folder to which you downloaded the material of module 1. Please change the following path accordingly, and ensure that the folder names are separated by forward slashes, not backward slashes:
dir <- "C:/Users/Max/Desktop/IAMO_neu/eLearning/Module_1_Data_visualization/"
Let’s import the file “all_stats.csv”. We can specify its location by pasting the name to the previously defined object “dir” using the function paste0(). We then import the data with the function read.csv():
dir2 <- paste0(dir, "M01_Part01_Agricultural-Statistics/data/all_stats.csv")
dir2
## [1] "C:/Users/Max/Desktop/IAMO_neu/eLearning/Module_1_Data_visualization/M01_Part01_Agricultural-Statistics/data/all_stats.csv"
data_raw <- read.csv(dir2)
Let’s have a look at the data. The function head() shows us the first 6 lines of “data_raw”:
head(data_raw)
| region | crop | indicator | year | value |
|---|---|---|---|---|
| 00_ARMENIA | Winter Wheat | Sown area | 2020 | 57964 |
| Yerevan c. | Winter Wheat | Sown area | 2020 | NA |
| Aragatsotn | Winter Wheat | Sown area | 2020 | 4954 |
| Ararat | Winter Wheat | Sown area | 2020 | 1752 |
| Armavir | Winter Wheat | Sown area | 2020 | 3193 |
| Gegharkunik | Winter Wheat | Sown area | 2020 | 10470 |
We can obtain a list of all unique elements of a specific column in the following way:
unique(data_raw$region)
## [1] "00_ARMENIA" "Yerevan c." "Aragatsotn" "Ararat" "Armavir"
## [6] "Gegharkunik" "Lori" "Kotayk" "Shirak" "Syunik"
## [11] "Vayots dzor" "Tavush"
unique(data_raw$crop)
## [1] "Winter Wheat" "Winter Barley" "Spring Wheat" "Spring Barley"
## [5] "Potato" "Cucumber" "Tomato" "Maize for silage"
## [9] "Pomaceous Fruits" "Stone Fruits" "Berries"
unique(data_raw$indicator)
## [1] "Sown area" "Harvested area" "Yield" "Production"
unique(data_raw$year)
## [1] 2020 2019 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006
## [16] 2005
To create subsets of “data_raw”, we can use the function filter() from the package dplyr. To use this function, we first need to load the package with the function library(). If you have never installed dplyr before, you have to execute the installation line first. Once you have installed the package, you will not need the installation line again, but you always have to load the package when you had closed R Studio before.
# install.packages("dplyr")
library(dplyr)
Now let’s create a sub-sample of “data_raw” that only contains data on spring wheat:
data_sample1 <- filter(data_raw, crop == "Spring Wheat")
head(data_sample1)
| region | crop | indicator | year | value |
|---|---|---|---|---|
| 00_ARMENIA | Spring Wheat | Sown area | 2020 | 1429 |
| Yerevan c. | Spring Wheat | Sown area | 2020 | NA |
| Aragatsotn | Spring Wheat | Sown area | 2020 | 134 |
| Ararat | Spring Wheat | Sown area | 2020 | 26 |
| Armavir | Spring Wheat | Sown area | 2020 | 68 |
| Gegharkunik | Spring Wheat | Sown area | 2020 | 460 |
Other ways to sub-sample the data are:
data_sample2 <- filter(data_raw, crop %in% c("Spring Wheat", "Winter Wheat"))
data_sample3 <- filter(data_raw, crop %in% c("Spring Wheat", "Winter Wheat") == FALSE)
data_sample4 <- filter(data_raw, year > 2019)
data_sample5 <- filter(data_raw, crop != "Spring Wheat", year == 2020, region == "Ararat", indicator == "Sown area")
For the next steps, let us work with a sub-sample that excludes the country-level statistics (“00_ARMENIA”) and the statistics for Yerevan, because we cannot or do not want to display these statistics in the map later:
data <- filter(data_raw, region %in% c("00_ARMENIA", "Yerevan c.") == FALSE)
We also have to re-name “Vayots dzor” to “Vayotz Dzor” for a later step where we join the agricultural statistics with a shapefile:
data$region[data$region=="Vayots dzor"] <- "Vayotz Dzor"
Let us also recalculate the units of the measurements and add a column with explanations:
## convert yields to tonnes per hectare
data$value[data$indicator=="Yield"] <- data$value[data$indicator=="Yield"]/10
## convert production to kilotonnes
data$value[data$indicator=="Production"] <- data$value[data$indicator=="Production"]/1000
## convert area to *1000 ha
data$value[data$indicator=="Sown area"] <- data$value[data$indicator=="Sown area"]/1000
data$value[data$indicator=="Harvested area"] <- data$value[data$indicator=="Harvested area"]/1000
## add units to indicators
data$indicator_long[data$indicator=="Sown area"] <- "Sown area (*1000 ha)"
data$indicator_long[data$indicator=="Harvested area"] <- "Harvested area *1000 (ha)"
data$indicator_long[data$indicator=="Production"] <- "Production (*1000 tonnes)"
data$indicator_long[data$indicator=="Yield"] <- "Yield (tonnes/ha)"
Have a look at the final data subsample:
head(data)
| region | crop | indicator | year | value | indicator_long |
|---|---|---|---|---|---|
| Aragatsotn | Winter Wheat | Sown area | 2020 | 4.954 | Sown area (*1000 ha) |
| Ararat | Winter Wheat | Sown area | 2020 | 1.752 | Sown area (*1000 ha) |
| Armavir | Winter Wheat | Sown area | 2020 | 3.193 | Sown area (*1000 ha) |
| Gegharkunik | Winter Wheat | Sown area | 2020 | 10.470 | Sown area (*1000 ha) |
| Lori | Winter Wheat | Sown area | 2020 | 7.836 | Sown area (*1000 ha) |
| Kotayk | Winter Wheat | Sown area | 2020 | 3.910 | Sown area (*1000 ha) |
The data in “all_stats.csv” is already in the right format for creating plots based on it. However, our data is sometimes in a different format. Have a look at the file “pomaceous_fruits.csv” in the folder “yield_by_crop”. As you can see, there is one column for each year:
data_pom_raw <- read.csv(paste0(dir, "M01_Part01_Agricultural-Statistics/data/yield_by_crop/pomaceous_fruits.csv"), sep=";")
head(data_pom_raw)
| region | region_type | crop | indicator | year_2020 | year_2019 | year_2018 | year_2017 | year_2016 | year_2015 | year_2014 | year_2013 | year_2012 | year_2011 | year_2010 | year_2009 | year_2008 | year_2007 | year_2006 | year_2005 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Aragatsotn | marzes | pomaceous_fruits | crop_capacity | 113.0 | 118.7 | 181.8 | 140.6 | 86.4 | 228.6 | 261.4 | 193.7 | 213.1 | 116.9 | 65.8 | 263.4 | 313.2 | 197.6 | 191.4 | 206.3 |
| Ararat | marzes | pomaceous_fruits | crop_capacity | 186.6 | 157.6 | 152.5 | 185.0 | 107.1 | 143.9 | 200.2 | 159.9 | 168.9 | 197.1 | 187.7 | 214.9 | 218.4 | 172.3 | 194.3 | 179.9 |
| Armavir | marzes | pomaceous_fruits | crop_capacity | 120.6 | 108.4 | 107.8 | 120.9 | 77.5 | 105.4 | 184.5 | 135.2 | 138.1 | 124.9 | 45.2 | 189.2 | 191.7 | 171.8 | 224.3 | 127.9 |
| Gegharkunik | marzes | pomaceous_fruits | crop_capacity | 96.3 | 100.3 | 90.1 | 101.9 | 84.6 | 128.4 | 205.0 | 200.6 | 195.3 | 183.1 | 169.2 | 169.2 | 179.1 | 176.0 | 133.1 | 121.2 |
| Lori | marzes | pomaceous_fruits | crop_capacity | 37.2 | 33.5 | 36.0 | 45.1 | 6.6 | 11.7 | 8.9 | 27.5 | 29.8 | 16.7 | 2.8 | 34.3 | 35.1 | 69.3 | 33.8 | 24.5 |
| Kotayk | marzes | pomaceous_fruits | crop_capacity | 53.5 | 40.3 | 68.4 | 110.0 | 29.2 | 58.8 | 48.6 | 53.7 | 49.1 | 44.0 | 24.6 | 76.1 | 81.4 | 70.5 | 43.3 | 49.2 |
We can easily re-arrange this table with the function melt() from the package reshape2. The argument “id.vars” specifies all columns that should remain in the new dataframe “data_pom”; the rest is joined into two new columns: the column “variable” contains the column names of the previous object “data_pom_raw”, the column “value” contains the yield (= crop capacity) measurements.
# install.packages("reshape2")
library(reshape2)
data_pom <- melt(data_pom_raw, id.vars = c("region", "region_type", "crop", "indicator"))
head(data_pom)
| region | region_type | crop | indicator | variable | value |
|---|---|---|---|---|---|
| Aragatsotn | marzes | pomaceous_fruits | crop_capacity | year_2020 | 113.0 |
| Ararat | marzes | pomaceous_fruits | crop_capacity | year_2020 | 186.6 |
| Armavir | marzes | pomaceous_fruits | crop_capacity | year_2020 | 120.6 |
| Gegharkunik | marzes | pomaceous_fruits | crop_capacity | year_2020 | 96.3 |
| Lori | marzes | pomaceous_fruits | crop_capacity | year_2020 | 37.2 |
| Kotayk | marzes | pomaceous_fruits | crop_capacity | year_2020 | 53.5 |
For clarification, we re-name the last two columns:
names(data_pom)[5:6] <- c("year", "yield")
head(data_pom)
| region | region_type | crop | indicator | year | yield |
|---|---|---|---|---|---|
| Aragatsotn | marzes | pomaceous_fruits | crop_capacity | year_2020 | 113.0 |
| Ararat | marzes | pomaceous_fruits | crop_capacity | year_2020 | 186.6 |
| Armavir | marzes | pomaceous_fruits | crop_capacity | year_2020 | 120.6 |
| Gegharkunik | marzes | pomaceous_fruits | crop_capacity | year_2020 | 96.3 |
| Lori | marzes | pomaceous_fruits | crop_capacity | year_2020 | 37.2 |
| Kotayk | marzes | pomaceous_fruits | crop_capacity | year_2020 | 53.5 |
Let’s delete some columns:
data_pom <- data_pom[,-c(2:4)]
head(data_pom)
| region | year | yield |
|---|---|---|
| Aragatsotn | year_2020 | 113.0 |
| Ararat | year_2020 | 186.6 |
| Armavir | year_2020 | 120.6 |
| Gegharkunik | year_2020 | 96.3 |
| Lori | year_2020 | 37.2 |
| Kotayk | year_2020 | 53.5 |
Finally, we have to convert the entries in the column “year” to numbers. We do that by subsetting the text entries, and then converting them to numbers:
data_pom$year <- as.numeric(substr(data_pom$year, 6, 9))
head(data_pom)
| region | year | yield |
|---|---|---|
| Aragatsotn | 2020 | 113.0 |
| Ararat | 2020 | 186.6 |
| Armavir | 2020 | 120.6 |
| Gegharkunik | 2020 | 96.3 |
| Lori | 2020 | 37.2 |
| Kotayk | 2020 | 53.5 |
There are of course many different ways in which we could visualize the agricultural statistics contained in the object “data”. Here, we focus on creating bar plots to compare the production amounts between marzes and over time.
To create diagrams, we will almost always use the function ggplot() from the package ggplot2. Please install this package and then load it:
# install.packages("ggplot2")
library(ggplot2)
To visualize the production amounts of winter wheat in 2020 among all marzes, we create a new sub-sample, accordingly:
prod_wwheat <- filter(data, crop == "Winter Wheat", year == 2020, indicator == "Production")
prod_wwheat
| region | crop | indicator | year | value | indicator_long |
|---|---|---|---|---|---|
| Aragatsotn | Winter Wheat | Production | 2020 | 118.201 | Production (*1000 tonnes) |
| Ararat | Winter Wheat | Production | 2020 | 68.485 | Production (*1000 tonnes) |
| Armavir | Winter Wheat | Production | 2020 | 135.460 | Production (*1000 tonnes) |
| Gegharkunik | Winter Wheat | Production | 2020 | 187.817 | Production (*1000 tonnes) |
| Lori | Winter Wheat | Production | 2020 | 221.201 | Production (*1000 tonnes) |
| Kotayk | Winter Wheat | Production | 2020 | 73.759 | Production (*1000 tonnes) |
| Shirak | Winter Wheat | Production | 2020 | 316.660 | Production (*1000 tonnes) |
| Syunik | Winter Wheat | Production | 2020 | 140.509 | Production (*1000 tonnes) |
| Vayotz Dzor | Winter Wheat | Production | 2020 | 3.499 | Production (*1000 tonnes) |
| Tavush | Winter Wheat | Production | 2020 | 25.869 | Production (*1000 tonnes) |
We create a first plot by specifying the data object, and by defining in the function aes() what should be on the x and y axes. We then add with the sign + a new line of code that contains the function geom_col() to draw the bars:
ggplot(data=prod_wwheat, aes(x=region, y=value)) +
geom_col()
In the function aes(), we can specify additional arguments. For example, we can give each bar a different color:
ggplot(data=prod_wwheat, aes(x=region, y=value, fill=region)) +
geom_col()
By default, R uses a rainbow color scale to color the bars. We can change that by supplying an own vector of colors by using words or HEX codes. By the way, the order in which you specify the layers (almost) does not matter. Just make sure that ggplot() is always the first line.
ggplot(data=prod_wwheat, aes(x=region, y=value, fill=region)) +
geom_col() +
scale_fill_manual(values=c("red", "#E69F00", "#56B4E9", "green", "orange", "yellow", "blue", "steelblue", "grey", "black"))
Let’s include a title, change the y axis name, and delete the x axis name:
ggplot(data=prod_wwheat, aes(x=region, y=value, fill=region)) +
geom_col() +
ggtitle("Production of winter wheat in 2020, in kilotonnes") +
ylab("production (t*1000)") +
xlab("")
You can also save plots as objects. This saves you lines of code if you want to add more layers to the plot later.
plot1 <- ggplot(data=prod_wwheat, aes(x=region, y=value, fill=region)) +
geom_col() +
ggtitle("Production of winter wheat in 2020, in kilotonnes") +
xlab("") +
ylab("production (t*1000)")
Add an additional layer to “plot1” that results in rotated x axis labels:
plot1 + theme(axis.text.x = element_text(angle = 45, hjust=1))
There are many more function for changing the appearance of a plot. You will get to know them step by step!
You can easily export plots with the function png(), followed by the function dev.off(). To export the plot, let’s increase the text size in the function theme():
png(paste0(dir, "plot1.png"), height = 600, width = 800)
plot1 + theme(axis.text.x = element_text(angle = 45, hjust=1),
text = element_text(size = 20))
dev.off()
## png
## 2
Assume we do not just want to plot the 2020 data of one crop, but of all crops, at once. To do so, we first need to draw the respective sub-sample from our dataset:
prod_allcrops <- filter(data, year == 2020, indicator == "Production")
We can now use facetting to easily create a multiplot that contains a separate plot for each crop, by specifying “crop” in the function facet_wrap(). The argument “scales=”free"" allows different y axes for all plots. You can ignore the warning message “Removed 6 rows containing missing values (position_stack).”:
ggplot(data=prod_allcrops, aes(x=region, y=value, fill=region)) +
geom_col() +
ggtitle("Production in 2020, in kilotonnes") +
xlab("") +
ylab("production (t*1000)") +
theme(axis.text.x = element_text(angle = 45, hjust=1)) +
facet_wrap(~crop, scales="free")
## Warning: Removed 6 rows containing missing values (position_stack).
We can also create grouped bar plots, in which we visualize the evolution of the production amount over time. For this, We choose the data from 2016 to 2020 from Aragatsotn, Ararat and Armavir, and create four facets - one for berries, potato, cucumber and tomato, respectively:
prod_4crops <- filter(data, year >= 2016, crop %in% c("Berries", "Potato", "Cucumber", "Tomato"), region %in% c("Aragatsotn", "Ararat", "Armavir"), indicator == "Production")
ggplot(data=prod_4crops, aes(x=year, y=value, fill=region)) +
geom_bar(position="dodge", stat="identity") +
ggtitle("Production by crop and marz, in kilotonnes") +
xlab("") +
ylab("production (t*1000)") +
theme(axis.text.x = element_text(angle = 45, hjust=1)) +
facet_wrap(~crop, scales="free")
For each crop, we want to create maps of the long-year mean yield, production and sown area. To do so, we first need to calculate these means for each marz and each crop. To calculate means across groups that are defined by parameter combinations, we can use the functions group_by() and summarize() from the package dplyr. You can ignore the error message “summarise() has grouped output by ‘region’, ‘crop’. You can override using the .groups argument.” The resulting object “data_means_raw” is a tibble object, that we convert to a dataframe:
data_means_raw <- data %>% group_by(region, crop, indicator_long) %>% summarize(mean=mean(na.omit(value)))
## `summarise()` has grouped output by 'region', 'crop'. You can override using the `.groups` argument.
data_means <- as.data.frame(data_means_raw)
head(data_means)
| region | crop | indicator_long | mean |
|---|---|---|---|
| Aragatsotn | Berries | Harvested area *1000 (ha) | 0.8195625 |
| Aragatsotn | Berries | Production (*1000 tonnes) | 33.9123125 |
| Aragatsotn | Berries | Sown area (*1000 ha) | 0.8205333 |
| Aragatsotn | Berries | Yield (tonnes/ha) | 5.1143750 |
| Aragatsotn | Cucumber | Harvested area *1000 (ha) | 0.1407500 |
| Aragatsotn | Cucumber | Production (*1000 tonnes) | 35.2226875 |
Let’s create a sub-sample of “data_means” that only contains tomato yields:
yield_tomato <- filter(data_means, crop == "Tomato", indicator_long == "Yield (tonnes/ha)")
yield_tomato
| region | crop | indicator_long | mean |
|---|---|---|---|
| Aragatsotn | Tomato | Yield (tonnes/ha) | 28.85250 |
| Ararat | Tomato | Yield (tonnes/ha) | 46.07250 |
| Armavir | Tomato | Yield (tonnes/ha) | 44.72937 |
| Gegharkunik | Tomato | Yield (tonnes/ha) | 12.02250 |
| Kotayk | Tomato | Yield (tonnes/ha) | 11.73875 |
| Lori | Tomato | Yield (tonnes/ha) | 5.79875 |
| Shirak | Tomato | Yield (tonnes/ha) | 19.76500 |
| Syunik | Tomato | Yield (tonnes/ha) | 14.40563 |
| Tavush | Tomato | Yield (tonnes/ha) | 8.92125 |
| Vayotz Dzor | Tomato | Yield (tonnes/ha) | 18.01312 |
We can import a shapefile with the function readOGR() from the package rgdal:
# install.packages("rgdal")
library(rgdal)
Again, we have to specify the file location in the function readOGR():
marzes <- readOGR(paste0(dir, "/shapefiles/ARM_marzes_short_simplified_500.shp"), verbose=F)
We can easily plot the polygons of the shapefile “marzes”:
plot(marzes)
The object “marzes” is of the type “SpatialPolygonsDataFrame”, and therefore has a particular data structure. We can access certain specific information such as the attribute table or the coordinate system by calling the slots “data” and “proj4string”, indicated by the sign “@”:
class(marzes)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
summary(marzes)
## Object of class SpatialPolygonsDataFrame
## Coordinates:
## min max
## x 369676.6 641124.7
## y 4299853.9 4572195.5
## Is projected: TRUE
## proj4string :
## [+proj=utm +zone=38 +datum=WGS84 +units=m +no_defs]
## Data attributes:
## div_short
## Length:11
## Class :character
## Mode :character
marzes@data
| div_short | |
|---|---|
| 0 | Syunik |
| 1 | Vayotz Dzor |
| 2 | Ararat |
| 3 | Armavir |
| 4 | Gegharkunik |
| 5 | Yerevan |
| 6 | Aragatsotn |
| 7 | Kotayk |
| 8 | Shirak |
| 9 | Tavush |
| 10 | Lori |
marzes@proj4string
## CRS arguments:
## +proj=utm +zone=38 +datum=WGS84 +units=m +no_defs
If you want to know the coordinates of all vertices of one polygon, you can call them by executing:
head(marzes@polygons[[1]]@Polygons[[1]]@coords)
## [,1] [,2]
## [1,] 562430.1 4383917
## [2,] 559782.3 4383971
## [3,] 559473.9 4385158
## [4,] 561113.8 4388292
## [5,] 561405.4 4392555
## [6,] 562535.2 4395234
As you have seen, the attribute table of “marzes” only contains the names of the marzes. To visualize yields in a map, we first have to populate the attribute table with this information. So, let’s join “marzes” and “yield_tomato”. We can do this using the function geo_join() from the package tigris, specifying the column names in “marzes” and “yield_tomato” on which the join should be based on. The resulting object “marzes_yield_tomato” is again a SpatialPolygonsDataFrame.
# install.packages("tigris")
library(tigris)
marzes_yield_tomato <- geo_join(marzes, yield_tomato, "div_short", "region", how = 'left')
marzes_yield_tomato@data
| div_short | region | crop | indicator_long | mean | |
|---|---|---|---|---|---|
| 0 | Syunik | Syunik | Tomato | Yield (tonnes/ha) | 14.40563 |
| 1 | Vayotz Dzor | Vayotz Dzor | Tomato | Yield (tonnes/ha) | 18.01312 |
| 2 | Ararat | Ararat | Tomato | Yield (tonnes/ha) | 46.07250 |
| 3 | Armavir | Armavir | Tomato | Yield (tonnes/ha) | 44.72937 |
| 4 | Gegharkunik | Gegharkunik | Tomato | Yield (tonnes/ha) | 12.02250 |
| 5 | Yerevan | NA | NA | NA | NA |
| 6 | Aragatsotn | Aragatsotn | Tomato | Yield (tonnes/ha) | 28.85250 |
| 7 | Kotayk | Kotayk | Tomato | Yield (tonnes/ha) | 11.73875 |
| 8 | Shirak | Shirak | Tomato | Yield (tonnes/ha) | 19.76500 |
| 9 | Tavush | Tavush | Tomato | Yield (tonnes/ha) | 8.92125 |
| 10 | Lori | Lori | Tomato | Yield (tonnes/ha) | 5.79875 |
We can now create a map of Armenia with each marz colored according to the long-year tomato yield, using the function spplot() from the package sp. We specify the name of the column on which the coloring should be based on:
# install.packages("sp")
library(sp)
spplot(marzes_yield_tomato, "mean")
Let’s customize the map. We use the color scale “rocket” from the package viridis, supply a title, and change the label size in the legend:
# install.packages("viridis")
library(viridis)
## Loading required package: viridisLite
spplot(marzes_yield_tomato, "mean",
col.regions = rev(rocket(21)[6:21]),
main=list(cex=1, label="Yield of tomatoes, mean 2005-2020"),
colorkey=list(labels=list(cex=0.8)))
You can find more information on nice color scales here:
https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html
https://mran.microsoft.com/snapshot/2017-06-05/web/packages/pals/vignettes/pals_examples.html
So far, we have only created one map for the yield of tomatoes. We can use a for-loop to create several yield maps at a time, and then combine them to one figure. However first, we have to re-arrange our data to be able to join the yields of all crops with our shapefile at once. We first select all yield data from “data_means”, and then use the function dcast() from reshape2, which does the opposite of melt(), that is create several columns out of one. We then join the resulting table “data_means_yield_bycrop” with “marzes”. The resulting object “marzes_yields” contains one column for each crop.
data_means_yield <- filter(data_means, indicator_long == "Yield (tonnes/ha)")
data_means_yield_bycrop <- dcast(data_means_yield, region~crop)
## Using mean as value column: use value.var to override.
data_means_yield_bycrop
| region | Berries | Cucumber | Maize for silage | Pomaceous Fruits | Potato | Spring Barley | Spring Wheat | Stone Fruits | Tomato | Winter Barley | Winter Wheat |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Aragatsotn | 5.114375 | 22.610000 | 5.41125 | 18.074375 | 22.25250 | 2.238750 | 2.285625 | 6.330000 | 28.85250 | 2.381875 | 2.426250 |
| Ararat | 13.430000 | 39.010625 | 24.91692 | 17.664375 | 30.18688 | 2.898125 | 3.256000 | 11.126250 | 46.07250 | 3.433750 | 3.877500 |
| Armavir | 9.918125 | 31.996875 | 25.04000 | 13.583750 | 32.74813 | 2.838125 | 3.417778 | 11.237500 | 44.72937 | 2.968125 | 3.535000 |
| Gegharkunik | 5.507500 | 10.106875 | 11.67000 | 14.583750 | 19.63000 | 2.240000 | 2.300625 | 3.648125 | 12.02250 | 2.800000 | 2.478750 |
| Kotayk | 3.662500 | 10.239375 | 21.58583 | 5.629375 | 19.96688 | 1.793750 | 1.732500 | 3.329375 | 11.73875 | 1.496154 | 1.957500 |
| Lori | 1.967500 | 5.768125 | 18.38062 | 2.830000 | 14.06250 | 2.123750 | 2.188750 | 1.975625 | 5.79875 | 1.682500 | 2.560625 |
| Shirak | 5.461875 | 19.366875 | 19.49250 | 11.175625 | 23.30937 | 2.203750 | 2.420625 | 8.424375 | 19.76500 | 2.033333 | 2.536250 |
| Syunik | 14.062500 | 13.213750 | 23.30000 | 4.919375 | 16.85625 | 1.908125 | 1.763125 | 2.808750 | 14.40563 | 2.058125 | 2.126250 |
| Tavush | 6.765000 | 9.823125 | 17.56545 | 2.883125 | 10.30125 | 1.815000 | 1.695000 | 5.715625 | 8.92125 | 1.981250 | 2.053125 |
| Vayotz Dzor | 1.081875 | 16.463750 | 4.49500 | 3.809375 | 16.78000 | 1.720625 | 1.653125 | 2.279375 | 18.01312 | 1.502500 | 2.050625 |
marzes_yields <- geo_join(marzes, data_means_yield_bycrop, "div_short", "region", how = 'left')
## Warning: We recommend using the dplyr::*_join() family of functions instead.
marzes_yields@data
| div_short | region | Berries | Cucumber | Maize.for.silage | Pomaceous.Fruits | Potato | Spring.Barley | Spring.Wheat | Stone.Fruits | Tomato | Winter.Barley | Winter.Wheat | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Syunik | Syunik | 14.062500 | 13.213750 | 23.30000 | 4.919375 | 16.85625 | 1.908125 | 1.763125 | 2.808750 | 14.40563 | 2.058125 | 2.126250 |
| 1 | Vayotz Dzor | Vayotz Dzor | 1.081875 | 16.463750 | 4.49500 | 3.809375 | 16.78000 | 1.720625 | 1.653125 | 2.279375 | 18.01312 | 1.502500 | 2.050625 |
| 2 | Ararat | Ararat | 13.430000 | 39.010625 | 24.91692 | 17.664375 | 30.18688 | 2.898125 | 3.256000 | 11.126250 | 46.07250 | 3.433750 | 3.877500 |
| 3 | Armavir | Armavir | 9.918125 | 31.996875 | 25.04000 | 13.583750 | 32.74813 | 2.838125 | 3.417778 | 11.237500 | 44.72937 | 2.968125 | 3.535000 |
| 4 | Gegharkunik | Gegharkunik | 5.507500 | 10.106875 | 11.67000 | 14.583750 | 19.63000 | 2.240000 | 2.300625 | 3.648125 | 12.02250 | 2.800000 | 2.478750 |
| 5 | Yerevan | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| 6 | Aragatsotn | Aragatsotn | 5.114375 | 22.610000 | 5.41125 | 18.074375 | 22.25250 | 2.238750 | 2.285625 | 6.330000 | 28.85250 | 2.381875 | 2.426250 |
| 7 | Kotayk | Kotayk | 3.662500 | 10.239375 | 21.58583 | 5.629375 | 19.96688 | 1.793750 | 1.732500 | 3.329375 | 11.73875 | 1.496154 | 1.957500 |
| 8 | Shirak | Shirak | 5.461875 | 19.366875 | 19.49250 | 11.175625 | 23.30937 | 2.203750 | 2.420625 | 8.424375 | 19.76500 | 2.033333 | 2.536250 |
| 9 | Tavush | Tavush | 6.765000 | 9.823125 | 17.56545 | 2.883125 | 10.30125 | 1.815000 | 1.695000 | 5.715625 | 8.92125 | 1.981250 | 2.053125 |
| 10 | Lori | Lori | 1.967500 | 5.768125 | 18.38062 | 2.830000 | 14.06250 | 2.123750 | 2.188750 | 1.975625 | 5.79875 | 1.682500 | 2.560625 |
We now iterate over each column of “marzes” that contains yield data, and each time create a plot that we save in a plot list. In each step, we take the column name i+2 and save it as object “crop”. We use “crop” to tell the function spplot() on which column to base the coloring on, and also use it as title of the respective map (argument “label” in “main=list()”). We then use the function grid.arrange() from the package gridExtra to bind the first six plots contained in “plot_list” into one figure. We give that figure an overall title in the function textGrob() that comes with the package grid.
library(gridExtra)
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
##
## combine
library(grid)
plot_list <- list()
for (i in 1:10)
{ crop <- names(marzes_yields@data)[i+2]
plot_list[[i]] <- spplot(marzes_yields, crop, col.regions = rev(rocket(21)[6:21]),
main=list(cex=1, label=crop, colorkey=list(labels=list(cex=0.8))))
}
plot_list[[1]]
grid.arrange(plot_list[[1]], plot_list[[2]], plot_list[[3]], plot_list[[4]], plot_list[[5]], plot_list[[6]], ncol=2,
top = textGrob("Mean yield from 2005 to 2020", vjust = 1, gp = gpar(fontface = "bold", cex = 1.5)))
Let’s export the plot with maps of the first 6 crops as a PNG:
png(paste0(dir, "plot2.png"), height=1000, width=600)
grid.arrange(plot_list[[1]], plot_list[[2]],
plot_list[[3]], plot_list[[4]],
plot_list[[5]], plot_list[[6]], ncol=2,
top = textGrob("Mean yield from 2005 to 2020",
vjust = 1,
gp = gpar(fontface = "bold", cex = 1.5)))
dev.off()
## png
## 2
Congratulations, you made it to the end of the first script of module 1! You can now solve the exercises 1-5 (in “M01_Exercises_Assignment.Rmd”), or go to the next script about visualizing phenological observation data (“M01_Script02_Phenological-Observations.Rmd”).