Course: Data-Driven Management and Policy

Prof. José Manuel Magallanes, PhD


Session 4 LAB: Working with Data Frames

Lab Instructions

Part 1:

You saw this code in the session:

**For any number N, the for loop in the session version (which iterates over 1:N) runs its body N times. The “if” test is evaluated once per iteration; the “return” statement sits outside the loop and runs only once.

**Since the factors always include the number itself, I put the number into the vector first and let the for loop run only from 1 up to N-1. This reduces the number of loop iterations by one.

# one input, several outputs in a simple data structure:
factors=function(number){
    # the vector that collects output starts with 'number' itself,
    # since every number is a factor of itself
    vectorOfAnswers=c(number)

    # for every value in the sequence 1, 2, ..., 'number'-1
    # (seq_len gives an empty sequence when 'number' is 1)
    for (i in seq_len(number-1)){

        # if the remainder of 'number'/'i' equals zero...
        if ((number %% i) == 0){

            # ...add 'i' to the vector of factors!
            vectorOfAnswers=c(vectorOfAnswers,i)
        }
    }
    return (vectorOfAnswers) # returning the vector
}

factors(20)
## [1] 20  1  2  4  5 10

This function accepted an integer number and returned its factors. Please answer these questions:

  1. For any number N, how many times is the for section executed to compute the factors of N?

  2. Make a change in the code to reduce the number of times the for section is executed.
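
One possible answer to question 2 goes further than skipping N itself: factors come in pairs (i and N/i), so it is enough to test candidates up to the square root of N. The sketch below is only an illustration under that idea, assuming a positive integer input; the name factorsFast and the iteration counter are hypothetical additions, not part of the session code.

```r
# Sketch: collect factors in pairs, looping only up to sqrt(number).
# 'factorsFast' is a hypothetical name, not the function from the session.
factorsFast = function(number){
    vectorOfAnswers = c()
    iterations = 0   # counts how many times the loop body runs

    # floor(sqrt(number)) is the largest candidate worth testing
    for (i in seq_len(floor(sqrt(number)))){
        iterations = iterations + 1

        if ((number %% i) == 0){
            # 'i' is a factor, and so is its partner 'number/i'
            vectorOfAnswers = c(vectorOfAnswers, i, number/i)
        }
    }

    # unique() drops the repeated root of a perfect square; sort() orders the result
    list(factors = sort(unique(vectorOfAnswers)), iterations = iterations)
}

result = factorsFast(20)
result$factors      # 1 2 4 5 10 20
result$iterations   # 4
```

For N = 20 the loop body runs 4 times instead of 19 or 20, at the cost of a final sort() and unique() call.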

Part 2:

Open the file demo_hdi, a csv file in the folder you downloaded. Then:

  • Get the max value of human development index for each type of democracy.

You should get something like this:

**The code below finds the max value of hdi for each demType in the demo_hdi file, selects the rows whose hdi matches one of those values, and puts the result into long format.

HumanDevelopment = read.csv("demo_hdi.csv")
library(reshape2)

MaxValue = tapply(HumanDevelopment$hdi,HumanDevelopment$demType, max)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# keep the rows whose hdi equals any of the maxima stored in MaxValue
topvalues = HumanDevelopment %>% filter(hdi %in% MaxValue)

HumanDevelopment_L1 = melt(topvalues)
## Using country, demType as id variables
HumanDevelopment_L1
##                 country          demType variable value
## 1             Hong Kong Flawed democracy demScore 6.150
## 2            Montenegro    Hybrid regime demScore 5.740
## 3                Norway   Full democracy demScore 9.870
## 4                Sweden   Full democracy demScore 9.390
## 5  United Arab Emirates    Authoritarian demScore 2.760
## 6             Hong Kong Flawed democracy      hdi 0.933
## 7            Montenegro    Hybrid regime      hdi 0.814
## 8                Norway   Full democracy      hdi 0.953
## 9                Sweden   Full democracy      hdi 0.933
## 10 United Arab Emirates    Authoritarian      hdi 0.863
  • Use those values to select the rows whose hdi matches one of those.

You should get something like this:

  • Turn that wide format into a long format.

You should get something like this:
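
The three steps of Part 2 can also be tried without the course file. The sketch below uses a small made-up data frame (the object toy and all its values are invented for illustration and are not the demo_hdi data) and base R only: `[` with %in% stands in for dplyr's filter, and the long table is built by hand instead of calling melt.

```r
# Made-up data with the same column names as demo_hdi (values are invented)
toy = data.frame(
    country  = c("A", "B", "C", "D", "E"),
    demType  = c("Full democracy", "Full democracy", "Hybrid regime",
                 "Hybrid regime", "Authoritarian"),
    demScore = c(9.1, 8.7, 5.2, 4.9, 2.5),
    hdi      = c(0.95, 0.85, 0.80, 0.70, 0.85),
    stringsAsFactors = FALSE
)

# Step 1: maximum hdi within each demType (one named entry per type)
maxByType = tapply(toy$hdi, toy$demType, max)

# Step 2: keep the rows whose hdi equals any of those maxima
topRows = toy[toy$hdi %in% maxByType, ]

# Step 3: wide to long, stacking demScore and hdi into variable/value pairs
long = data.frame(
    country  = rep(topRows$country, times = 2),
    demType  = rep(topRows$demType, times = 2),
    variable = rep(c("demScore", "hdi"), each = nrow(topRows)),
    value    = c(topRows$demScore, topRows$hdi),
    stringsAsFactors = FALSE
)
long
```

Country B is kept even though it is not the top of its own type, because its hdi happens to equal the maximum of another type. Matching on values alone behaves this way, which is also why Sweden appears in the output above.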

Final project reminder:

  1. Keep collecting data (and start if you haven’t).

  2. Be ready to start your experiment soon.