United Nations Food & Agriculture Organization: Week 1

Chidinma Emenike

The data on area, land use, and population are from the UN FAO records for 2009.

library(ggplot2)  # Provides ggplot(), geom_point(), and the log scales
data(FAOsimple)  # Read in the data: boilerplate
ggplot(data = FAOsimple, aes(x = Country.area, y = Total.Population...Both.sexes)) + 
    geom_point() + scale_x_log10() + scale_y_log10()

[Plot: total population against country area, both axes logarithmic]

According to the plot, countries with a larger area tend to have a larger population.

ggplot(data = FAOsimple, aes(x = Arable.land, y = Total.economically.active.population.in.Agr)) + 
    geom_point() + scale_x_log10() + scale_y_log10()

[Plot: economically active population in agriculture against arable land, both axes logarithmic]

This command plots the amount of arable land in each country against the number of economically active people working in agriculture, both on logarithmic axes. These are totals, not fractions of the country's land or population. There appears to be a positive relationship between the two.
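If fractions rather than totals are wanted, they would have to be computed first. The following is a minimal sketch, not part of the original analysis. It uses a toy data frame with made-up values; only the column names are copied from FAOsimple. The names `toy`, `arable_frac`, and `agr_frac` are hypothetical, and the sketch assumes that each numerator and denominator share the same units.

```r
# Toy data with made-up values; column names copied from FAOsimple
toy <- data.frame(
  Land.area = c(1000, 5000, 20000),
  Arable.land = c(100, 1500, 4000),
  Total.Population...Both.sexes = c(2000, 9000, 50000),
  Total.economically.active.population.in.Agr = c(400, 900, 10000)
)

# Fraction of land that is arable, and fraction of people active in agriculture
toy <- transform(toy,
  arable_frac = Arable.land / Land.area,
  agr_frac = Total.economically.active.population.in.Agr /
    Total.Population...Both.sexes
)

toy[, c("arable_frac", "agr_frac")]
```

The new columns could then be passed to aes() in place of the totals.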

New Variables

Population density can be calculated in two ways: divide the total number of people in a country by either its agricultural area or its land area.

First Method:

FAOsimple = transform(FAOsimple, popdens = Total.Population...Both.sexes/Agricultural.area)

Second Method:

FAOsimple = transform(FAOsimple, popdense = Total.Population...Both.sexes/Land.area)
names(FAOsimple)
##  [1] "Country"                                                     
##  [2] "Year"                                                        
##  [3] "Agricultural.area"                                           
##  [4] "Agricultural.area.certified.organic"                         
##  [5] "Agricultural.area.in.conversion.to.organic"                  
##  [6] "Agricultural.area.irrigated"                                 
##  [7] "Agricultural.area.organic..total"                            
##  [8] "Arable.land"                                                 
##  [9] "Arable.land.and.Permanent.crops"                             
## [10] "Arable.land.area.certified.organic"                          
## [11] "Arable.land.area.in.conversion.to.organic"                   
## [12] "Arable.land.organic..total"                                  
## [13] "Country.area"                                                
## [14] "Fallow.land"                                                 
## [15] "Forest.area"                                                 
## [16] "Inland.water"                                                
## [17] "Land.area"                                                   
## [18] "Other.land"                                                  
## [19] "Perm..crops.irrigated"                                       
## [20] "Perm..crops.non.irrigated"                                   
## [21] "Perm..meadows...pastures...Cultivated"                       
## [22] "Perm..meadows...pastures...Nat..grown"                       
## [23] "Perm..meadows...pastures.Cult....irrig"                      
## [24] "Perm..meadows...pastures.Cult..non.irrig"                    
## [25] "Permanent.crops"                                             
## [26] "Permanent.crops.area.certified.organic"                      
## [27] "Permanent.crops.area.in.conversion.to.organic"               
## [28] "Permanent.crops.organic..total"                              
## [29] "Permanent.meadows.and.pastures"                              
## [30] "Permanent.meadows.and.pastures.area.certified.organic"       
## [31] "Permanent.meadows.and.pastures.area.in.conversion.to.organic"
## [32] "Permanent.meadows.and.pastures.organic..total"               
## [33] "Temp..crops.irrigated"                                       
## [34] "Temp..crops.non.irrigated"                                   
## [35] "Temp..meadows...pastures.irrigated"                          
## [36] "Temp..meadows...pastures.non.irrig."                         
## [37] "Temporary.crops"                                             
## [38] "Temporary.meadows.and.pastures"                              
## [39] "Total.area.equipped.for.irrigation"                          
## [40] "Agricultural.population"                                     
## [41] "Female.economically.active.population"                       
## [42] "Female.economically.active.population.in.Agr"                
## [43] "Male.economically.active.population"                         
## [44] "Male.economically.active.population.in.Agr"                  
## [45] "Non.agricultural.population"                                 
## [46] "Rural.population"                                            
## [47] "Total.economically.active.population"                        
## [48] "Total.economically.active.population.in.Agr"                 
## [49] "Total.Population...Both.sexes"                               
## [50] "Total.Population...Female"                                   
## [51] "Total.Population...Male"                                     
## [52] "Urban.population"                                            
## [53] "popdens"                                                     
## [54] "popdense"

The new variables, popdens and popdense, appear at the end of the list of names.
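The two densities can differ a lot for the same country. As a small illustration, here is a sketch on a toy data frame with made-up values (the countries "A" and "B" and all numbers are hypothetical; the column names follow FAOsimple):

```r
# Two made-up countries with equal population and land area
# but different agricultural area
toy <- data.frame(
  Country = c("A", "B"),
  Total.Population...Both.sexes = c(1000, 1000),
  Agricultural.area = c(100, 400),
  Land.area = c(500, 500)
)

toy <- transform(toy,
  popdens = Total.Population...Both.sexes / Agricultural.area,  # first method
  popdense = Total.Population...Both.sexes / Land.area          # second method
)

toy[, c("Country", "popdens", "popdense")]
```

In this toy example the second method gives both countries the same density, while the first method separates them.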

To check the reported total population against the figures broken down by sex, make a new variable that adds together the female and male populations.

FAOsimple = transform(FAOsimple, poptotal = Total.Population...Female + Total.Population...Male)
ggplot(data = FAOsimple, aes(x = Total.Population...Both.sexes, y = poptotal)) + 
    geom_point() + scale_x_log10() + scale_y_log10()

[Plot: computed total (female + male) against reported total population, both axes logarithmic]

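There is a strong positive linear relationship between the reported total population and the sum of the female and male populations, which suggests that the population columns are consistent with one another. This does not by itself show that the underlying figures are accurate.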
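Small discrepancies are hard to see on a log-log scatterplot, so a numerical check may be a useful complement. The following is a sketch on a toy data frame with made-up values, not a result from FAOsimple. The names `toy`, `diff`, `max_abs_diff`, and `mismatch` are hypothetical, and the tolerance of 1 is an arbitrary assumption.

```r
# Made-up values; the third row has a missing total, the fourth a mismatch
toy <- data.frame(
  Total.Population...Both.sexes = c(1000, 2500, NA, 400),
  Total.Population...Female = c(510, 1240, 300, 200),
  Total.Population...Male = c(490, 1260, 310, 205)
)

toy <- transform(toy,
  poptotal = Total.Population...Female + Total.Population...Male)

# Difference between the computed sum and the reported total
toy$diff <- toy$poptotal - toy$Total.Population...Both.sexes

# Largest absolute discrepancy, ignoring rows with missing values
max_abs_diff <- max(abs(toy$diff), na.rm = TRUE)

# Row numbers where the discrepancy exceeds the (arbitrary) tolerance
mismatch <- which(abs(toy$diff) > 1)
```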

As the amount of farmable land increases, one would expect that forest area decreases in order to make room for it. Check the hypothesis with a plot.

ggplot(data = FAOsimple, aes(x = Arable.land, y = Forest.area)) + geom_point() + 
    scale_x_log10() + scale_y_log10()
## Warning: Removed 7 rows containing missing values (geom_point).

[Plot: forest area against arable land, both axes logarithmic]

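There is a positive association between these variables: countries with more arable land tend to have more forest area, which is the opposite of the hypothesis. One possible explanation is that both quantities are totals, so both tend to be larger in larger countries.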
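The strength of an association on logarithmic axes can be summarized with a correlation of the log-transformed values. This is a minimal sketch on made-up numbers, not a result from FAOsimple; `toy` and `r_log` are hypothetical names. The argument use = "complete.obs" drops rows with missing values, much as geom_point() does in the warning above.

```r
# Made-up values; the last row has a missing arable-land figure
toy <- data.frame(
  Arable.land = c(10, 100, 1000, 10000, NA),
  Forest.area = c(20, 150, 3000, 8000, 500)
)

# Pearson correlation of the log10 values, using complete rows only
r_log <- cor(log10(toy$Arable.land), log10(toy$Forest.area),
             use = "complete.obs")
r_log
```

Note that log10(0) is -Inf, so any zero entries in real data would need to be handled before taking this correlation.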

The previous example used logarithmic axes. Most countries fall in a similar range, but a few have much larger values. On linear axes, fitting those outliers onto the plot squeezes the rest of the points together near the origin, and it is hard to see what's going on.

A plot of the same data on the original (linear) axes:

ggplot(data = FAOsimple, aes(x = Arable.land, y = Forest.area)) + geom_point()
## Warning: Removed 7 rows containing missing values (geom_point).

[Plot: forest area against arable land, linear axes]

From this plot, it is not obvious that there is a relationship between the variables, but when logarithmic axes are used, it becomes clear that the variables are positively associated.
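The effect of the logarithmic axes can be seen on a small made-up vector. This sketch (all values and the names `x`, `lin_pos`, and `log_pos` are hypothetical) computes where each value would sit along an axis, from 0 at the left edge to 1 at the right edge, on a linear scale and on a log10 scale.

```r
# Made-up values: five moderate values and one large outlier
x <- c(2, 5, 8, 12, 20, 5000)

# Relative position along a linear axis (0 = minimum, 1 = maximum)
lin_pos <- (x - min(x)) / (max(x) - min(x))

# Relative position along a log10 axis
lx <- log10(x)
log_pos <- (lx - min(lx)) / (max(lx) - min(lx))

round(lin_pos, 3)
round(log_pos, 3)
```

On the linear scale the five moderate values are crowded into less than 1% of the axis, while on the log scale they are spread over more than a quarter of it.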