R assignment 2

library(dplyr)
  1. Read the world-small.csv dataset (available at ../data/world-small.csv) into R. Get to know the structure of this dataset using functions like dim(), head(), and summary().
world <- read.csv('world-small.csv')
dim(world)
head(world)
tail(world)
summary(world)
  2. Find the mean and median GDP per capita and Polity IV score, by region (that is, for each region in the dataset). Also find the number of countries by region.
world %>%
  group_by(region) %>%
  summarize(mean(gdppcap08))
## # A tibble: 8 x 2
##   region       `mean(gdppcap08)`
##   <chr>                    <dbl>
## 1 Africa                   3613.
## 2 Asia-Pacific            11552.
## 3 C&E Europe              12571.
## 4 Middle East             21450.
## 5 N. America              32552.
## 6 S. America               7862.
## 7 Scandinavia             41889.
## 8 W. Europe               35040.
world %>%
  group_by(region) %>%
  summarize(mean(polityIV))
## # A tibble: 8 x 2
##   region       `mean(polityIV)`
##   <chr>                   <dbl>
## 1 Africa                  11.4 
## 2 Asia-Pacific            13.6 
## 3 C&E Europe              14.2 
## 4 Middle East              5.69
## 5 N. America              19.3 
## 6 S. America              16.6 
## 7 Scandinavia             20   
## 8 W. Europe               19.9
world %>%
  group_by(region) %>%
  summarize(n_distinct(country))
## # A tibble: 8 x 2
##   region       `n_distinct(country)`
##   <chr>                        <int>
## 1 Africa                          42
## 2 Asia-Pacific                    24
## 3 C&E Europe                      25
## 4 Middle East                     16
## 5 N. America                       3
## 6 S. America                      19
## 7 Scandinavia                      4
## 8 W. Europe                       12
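The three summaries above can also be computed in a single summarize() call, which makes it easy to include the medians the question asks for. What follows is a minimal sketch, not the assignment's official solution: `toy` is a small made-up data frame standing in for `world` (only the column names region, country, gdppcap08 and polityIV come from the dataset), and the output column names are chosen here for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for `world`; all values are made up.
toy <- data.frame(
  country   = c("A", "B", "C", "D", "E"),
  region    = c("North", "North", "North", "South", "South"),
  gdppcap08 = c(1000, 2000, 6000, 500, 1500),
  polityIV  = c(10, 16, 19, 4, 18)
)

# One row per region, with every requested statistic as a named column.
by_region <- toy %>%
  group_by(region) %>%
  summarize(
    mean_gdp      = mean(gdppcap08),
    median_gdp    = median(gdppcap08),
    mean_polity   = mean(polityIV),
    median_polity = median(polityIV),
    n_countries   = n_distinct(country)
  )

by_region
```

Naming the summary columns avoids backtick-quoted headers such as `mean(gdppcap08)` and lets later steps refer to them directly. If the real data contained missing values, mean() and median() would need na.rm = TRUE.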
  3. Find the mean and median GDP per capita, by region and whether a country is a “democracy” or not. For the purpose of this exercise, a country is a “democracy” if it has a Polity IV score of 15 or higher.
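# Mean GDP per capita by region, democracies only:
# the filter drops every country with polityIV below 15.
world %>%
  filter(polityIV >= 15) %>%
  group_by(region) %>%
  summarize(mean(gdppcap08))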
## # A tibble: 8 x 2
##   region       `mean(gdppcap08)`
##   <chr>                    <dbl>
## 1 Africa                   3220.
## 2 Asia-Pacific            13433.
## 3 C&E Europe              14919.
## 4 Middle East             20734 
## 5 N. America              32552.
## 6 S. America               8159.
## 7 Scandinavia             41889.
## 8 W. Europe               35040.
# Median GDP per capita by region, again for democracies only.
world %>%
  filter(polityIV >= 15) %>%
  group_by(region) %>%
  summarize(median(gdppcap08))
## # A tibble: 8 x 2
##   region       `median(gdppcap08)`
##   <chr>                      <dbl>
## 1 Africa                     1404 
## 2 Asia-Pacific               4268.
## 3 C&E Europe                16620.
## 4 Middle East               20734 
## 5 N. America                36444 
## 6 S. America                 8009 
## 7 Scandinavia               36995 
## 8 W. Europe                 34969
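The filter() calls above report democracies only. To get figures for both democracies and non-democracies within each region, one option is to build a logical indicator and group by it together with region. This is a hedged sketch under the same assumptions as before: `toy` is made-up data standing in for `world`, and the `democracy` column and the summary column names are introduced here for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for `world`; all values are made up.
toy <- data.frame(
  region    = c("North", "North", "North", "South", "South"),
  gdppcap08 = c(1000, 2000, 6000, 500, 1500),
  polityIV  = c(10, 16, 19, 4, 18)
)

by_region_democracy <- toy %>%
  # TRUE when the Polity IV score meets the exercise's threshold of 15.
  mutate(democracy = polityIV >= 15) %>%
  # One group per (region, democracy) combination present in the data.
  group_by(region, democracy) %>%
  summarize(
    mean_gdp   = mean(gdppcap08),
    median_gdp = median(gdppcap08)
  ) %>%
  ungroup()

by_region_democracy
```

Grouping by two variables yields one row per combination that actually occurs, so a region with no non-democracies contributes only a single row.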