Project II - Poverty and Inequality

“Our lives begin to end the day we become silent about things that matter.”

— Martin Luther King Jr.

Poverty and Inequality are important concepts in economics and social sciences that are closely related but distinct. These two concepts are of great importance in development economics because they are among the topics that development economists research on, as well as their causes and the consequences of various policies in both developed and developing countries. Both indicate deprivation in access to resources, opportunities, and outcomes among certain individuals or groups.

Poverty

Poverty is a state or condition in which an individual lacks the sufficient financial resources and essentials for a basic standard of living.

Causes of Poverty include market failures like:

Labor Market Frictions
Credit Constraints
Human Capital Development

Inequality

Inequality refers to the unequal distribution of income, wealth, opportunities, and access to resources across individuals or groups within a society.

Causes of Inequality include:

Unequal Access to Resources
Differences in Education and Skills
Wealth Accumulation

This project aims to analyze the socioeconomic landscape of Iran by examining both poverty and inequality. It includes estimating poverty metrics such as the poverty line, poverty gap, and multidimensional poverty index (MPI), as well as inequality indicators like the Gini coefficient and the share of the top 1%.

Necessary Packages

First, we install the necessary packages for this project.

Then, we load these packages.

knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
library(dplyr)
library(data.table)
library(knitr)
library(ggplot2)
library(tidyverse)
library(readxl)
library(tidyr)
library(shiny)
library(leaflet)
library(mdbr)
library(Hmisc)
library(leaflet)
#library(tigris)
library(sf)
library(readxl)
library(tidyverse)
library(shiny)
library(labelled)
library(RSocrata)
library(shinythemes)
library(RColorBrewer)
library(rsconnect)
library(haven)
#library(mapview)
library(forcats)
library(writexl)
library(psych)
library(foreign)
library(kableExtra)
library(htmltools)
library(ineq)
library(glue)
library(scales)

Import the Data

Here, we load data by specifying our file path. The data set used for this project is the Households Expenditure and Income Survey (HEIS) for the year 1402, which contains information on household expenditures on durable and non-durable goods, sources of income, residence, and other factors in different provinces, collected in different tables.

The HEIS data set displays urban and rural households in different tables. Each table has its own aim to indicate some dimensions of households.

HEIS 1402 includes the following tables:

U1402Data & R1402Data: survey details section
U1402P1 & R1402P1: Social characteristics of household members
U1402P2 & R1402P2: Location details
U1402P3S & R1402P3S (S = 01, \(\cdots\) , 12 , S \(\neq\) 4) : Household expenditure on non-durable goods
U1402P3S04 & R1402P3S04: Housing costs section
U1402P3S13 & R1402P3S13: Household expenditure on durable goods
U1402P3S14 & R1402P3S14: Household investment in last 12 months
U1402P4S1 & R1402P4S1: Income of employed household members from public sector
U1402P4S2 & R1402P4S2: Income of employed household members from private sector
U1402P4S3 & R1402P4S3: Miscellaneous household income
U1402P4S4 & R1402P4S4: Household income from government Cash Transfers (Subsidy)

In this part, we extract all the tables from the HEIS data set and add a new column called “Urban_Rural”.

# Removing the empty tables from HEIS 1402
HEIS[["R1402P3S10"]] <- NULL
HEIS[["U1402P3S10"]] <- NULL

origin_names <- names(HEIS)
# Iterating over data frames in HEIS and modifying them 
HEIS <- lapply(seq_along(HEIS), function(idx) {
    name <- origin_names[idx]
    #Extracting the first character of the name
    first_char <- substr(name, 1, 1)
    #Checking if the first character of the name is either "R" (Rural) or "U" (Urban).
    if (first_char == "R" || first_char == "U") {
      # Adding a new column Urban_Rural and filling with "R" and "U"
        HEIS[[idx]]$Urban_Rural <- first_char}
    return(HEIS[[idx]])})
# Reassigning original names
names(HEIS) <- origin_names
# Assigning each data frame back to the global environment
invisible(lapply(names(HEIS), function(x) assign(x,HEIS[[x]],envir=.GlobalEnv)))

In this part, we combine the rural and urban tables with the same names.

# Function to merge R and U tables and rename them to 'RU'
merge_tables <- function() {
  # Loop through each name and check for the pattern
  for (name in origin_names) {
    # If a table starts with 'R', find the corresponding 'U' table
    if (substr(name, 1, 1) == "R") {
      u_name <- paste0("U", substr(name, 2, nchar(name)))
      # Merge the R and U tables if the U table exists
      if (u_name %in% origin_names) {
        # Merge the R and U tables
        merged_data <- rbind(HEIS[[name]], HEIS[[u_name]])
        # Create the new name with 'RU' prefix
        new_name <- paste0("RU", substr(name, 2, nchar(name)))
        # Assign the merged data to the new name
        assign(new_name, merged_data, envir = .GlobalEnv)
        # Remove the original R and U tables
        rm(list = c(name, u_name), envir = .GlobalEnv)}}}}
# Call the function
merge_tables()
# Remove the unnecarry data and function
rm(HEIS)

Now, the data tables are available to start the cleaning process.

Data Cleaning Process:

As reported in the project text, we first create a column that represents the weight of household members and add it to our table.

Head of Household (HH): \(\text{Weight} = 1\)
Non-head above 18 years old: \(\text{Weight} = 0.8\)
Non-head under 18 years old: \(\text{Weight} = 0.5\)

# Creating the Weight column in our data set
RU1402P1 <- RU1402P1 %>%
  mutate(Weight = case_when(
    DYCOL03 == 1 ~ 1,                         # If DYCOL03 == 1, assign 1 to weight
    DYCOL03 != 1 & DYCOL05 >= 18 ~ 0.8,       # If DYCOL03 != 1 and DYCOL05 >= 18, assign 0.8
    DYCOL03 != 1 & DYCOL05 < 18 ~ 0.5,        # If DYCOL03 != 1 and DYCOL05 < 18, assign 0.5
    TRUE ~ NA_real_))

In this section, we label the variables in the tables to enhance file readability and also create a province variable and add it to the tables.

# Part 0
# Assign each province to their codes
Province <- c(Markazi = "00", Ardabil = "24", Bushehr = "18", `Chaharmahal and Bakhtiari` = "14",
  `East Azerbaijan` = "03", Fars = "07", Gilan = "01", Golestan = "27",
  Hamadan = "13", Hormozgan = "22", Ilam = "16", Isfahan = "10",
  Kerman = "08", Kermanshah = "05", Khuzestan = "06", `Kohgiluyeh and Boyer-Ahmad` = "17",
  Kurdistan = "12", Lorestan = "15", Alborz = "30", Mazandaran = "02",
  `North Khorasan` = "28", Qazvin = "26", Qom = "25", `Razavi Khorasan` = "09",
  Semnan = "20", `Sistan and Baluchestan` = "11", `South Khorasan` = "29", Tehran = "23",
  `West Azerbaijan` = "04", Yazd = "21", Zanjan = "19")
  
# This data frame shows specifications of the survey
RU1402Data <- RU1402Data %>% 
  rename(khanevartype = NoeKhn)  %>% 
  # A new column "province" is added to the data frame which contains the mapped names of province
  mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))

##############################
# Part 1
# Relation to head
relation <- c(head="1", spouse="2", child="3", childinlaw="4", grandchild="5", parent="6", sibling="7", relative="8", nonrelative="9")
# Gender dummy variable
gender <- c(Male="1", Female="2")
# Literacy dummy variable 
literacy <- c(literate="1", illiterate="2")
# Dummy variable for answering Questions
yesno <- c(Yes="1", No="2")
# Degree of Education dummy variable
education <- c(Elemantary="1", Secondary="2", HighSchool="3", Diploma="4", College="5", Bachelor="6", Master="7", PhD="8", Other="9")
# Activity Status
occupation <- c(employed="1", unemployed="2", IncomeWOJob="3", Student="4", Housewife="5", Other="6")
# Marital Status
marital <- c(Married ="1", Widowed="2", Divorced="3", Single="4")

# This data frame shows social characteristics of family members (Renaming Columns)
RU1402P1 <- RU1402P1 %>% 
  rename(
    member = DYCOL01,
    relation = DYCOL03,
    gender = DYCOL04,
    age = DYCOL05,
    literacy = DYCOL06,
    is_studying = DYCOL07,
    education_deg = DYCOL08,
    occupational_stat = DYCOL09,
    marital_stat = DYCOL10) %>%  
  mutate(across(where(is.character), as.integer),
         across(c(relation,gender,literacy,is_studying,education_deg,occupational_stat,marital_stat), as.factor),
         relation = fct_recode(relation, !!!relation), 
         gender = fct_recode(gender, !!!gender),
         literacy = fct_recode(literacy, !!!literacy), 
         is_studying = fct_recode(is_studying, !!!yesno),
         education_deg = fct_recode(education_deg, !!!education), 
         occupational_stat = fct_recode(occupational_stat, !!!occupation),
         marital_stat = fct_recode(marital_stat, !!!marital))

##############################
# Part 2
# Type of occupation of the residence
tenure <- c(OwnedEstateLand="1", OwnedEstate="2", Rent="3", Mortgage="4", Service="5", Free="6", Other="7")
# Major materials
material <- c(MetalBlock="1", BrickWood="2", Cement="3", Brick="4", Wood="5", WoodKesht="6", KeshtGel="7", Other="8")
# Type of used fuel in the cooking
fuel <- c(Oil="1", Gasoline="2", LiquidGas="3", NaturalGas="4", Electricity="5", Wood="6", AnimalOil="7", Coke="8", Other="9", None="10" )
# Type of used fuel in heating
fuel1 <- c(Oil="11", Gasoline="12", LiquidGas="13", NaturalGas="14", Electricity="15", Wood="16", AnimalOil="17", Coke="18", Other="19", None="20" )
# Type of used fuel in providing Hot Water
fuel2 <- c(Oil="21", Gasoline="22", LiquidGas="23", NaturalGas="24", Electricity="25", Wood="26", AnimalOil="27", Coke="28", Other="29", None="30" )

# This data frame shows Residence details (Renaming Columns)
RU1402P2 <- RU1402P2 %>% 
  rename(
    tenure = DYCOL01,
    room = DYCOL03,
    space = DYCOL04,
    construction = DYCOL05,
    material = DYCOL06,
    vehicle = DYCOL07,
    motorcycle = DYCOL08,
    bicycle = DYCOL09,
    radio = DYCOL10,
    radiotape = DYCOL11,
    TVbw = DYCOL12,
    TV = DYCOL13,
    VHS_VCD_DVD = DYCOL14,
    computer = DYCOL15,
    cellphone = DYCOL16,
    freezer = DYCOL17,
    refridgerator = DYCOL18,
    fridge = DYCOL19,
    stove = DYCOL20,
    vacuum = DYCOL21,
    washingmachine = DYCOL22,
    sewingmachine = DYCOL23,
    fan = DYCOL24,
    evapcoolingportable = DYCOL25,
    splitportable = DYCOL26,
    dishwasher = DYCOL27,
    microwave = DYCOL28,
    none = DYCOL29,
    pipewater = DYCOL30,
    electricity = DYCOL31,
    pipegas = DYCOL32,
    telephone = DYCOL33,
    internet  = DYCOL34,
    bathroom = DYCOL35,
    kitchen = DYCOL36,
    evapcooling = DYCOL37,
    centralcooling = DYCOL38,
    centralheating = DYCOL39,
    package = DYCOL40,
    split = DYCOL41,
    wastewater = DYCOL42,
    cookingfuel = DYCOL43,
    heatingfuel = DYCOL44,
    waterheatingfuel = DYCOL45) %>% 
  mutate(across(where(is.character), as.integer),
         across(c(tenure,material,cookingfuel,heatingfuel,waterheatingfuel), as.factor),
         tenure = fct_recode(tenure, !!!tenure), 
         material = fct_recode(material, !!!material),
         cookingfuel = fct_recode(cookingfuel, !!!fuel), 
         heatingfuel = fct_recode(heatingfuel, !!!fuel1),
         waterheatingfuel = fct_recode(waterheatingfuel, !!!fuel2),
         across(vehicle:wastewater, ~!is.na(.x)))

##############################
# Part 3, Table 1
# This data frame shows food and tobacco expenditures (Renaming Columns)
RU1402P3S01 <- RU1402P3S01 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    gram = DYCOL03,
    kilogram = DYCOL04,
    price = DYCOL05,
    value = DYCOL06 ) %>% 
  mutate(
    across(c(price,value,kilogram),  ~ as.numeric(as.character(.x)) ),
    table = 1L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))                                                                                                                                                            
# Part 3, Table 2
# This data frame shows drink expenditures (Renaming Columns)
RU1402P3S02 <- RU1402P3S02 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    gram = DYCOL03,
    kilogram = DYCOL04,
    price = DYCOL05,
    value = DYCOL06 ) %>% 
  mutate(
    table = 2L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))  
  


# Part 3, Table 3
RU1402P3S03 <- RU1402P3S03 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03 ) %>% 
  mutate(
    table = 3L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))  
# Part 3, Table 4
# This data frame shows the housing costs (Renaming columns)
RU1402P3S04 <- RU1402P3S04 %>% 
  rename(
    goods_code = DYCOL01,
    mortgage = DYCOL02,
    provision_type = DYCOL03,
    value = DYCOL04 ) %>% 
  mutate(table = 4L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free"))) 
# Part 3, Table 5
RU1402P3S05 <- RU1402P3S05  %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03 ) %>% 
  mutate(table = 5L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 6
RU1402P3S06 <- RU1402P3S06  %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(table = 6L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 7
RU1402P3S07 <- RU1402P3S07 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 7L)  %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))

# Part 3, Table 8
RU1402P3S08 <- RU1402P3S08 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 8L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 9
RU1402P3S09 <- RU1402P3S09 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 9L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 11
RU1402P3S11 <- RU1402P3S11 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 11L)  %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 12
RU1402P3S12 <- RU1402P3S12 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 12L)  %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))

Here, we append the “Weight” column from RU1402P1 based on their “Address” to datasets which start with “RU1402P3S”

# Merging Process
RU1402P1_getweight <- select(RU1402P1, Address, Weight)
# Generate potential data set names
datasetP3_names <- paste0("RU1402P3S", sprintf("%02d", 1:12))
# Filter names of datasets that actually exist in this environment
P3_datasets <- Filter(function(name) exists(name, envir = .GlobalEnv), datasetP3_names)
# Apply the transformation and merge with RU1402Data_selected to each data set
P3weight_datasets <- lapply(mget(P3_datasets, envir = .GlobalEnv), function(df) {
  df %>% 
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province)) %>%
    left_join(RU1402P1_getweight, by = "Address")})
invisible(lapply(names(P3weight_datasets), function(x) assign(x,P3weight_datasets[[x]],envir=.GlobalEnv)))
# Removing the unnecessary data sets
#rm(RU1402P1_getweight)
rm(P3weight_datasets)

Poverty Analysis

1. Estimation of the Absolute Poverty Line

Based on an article from FAO about “Impacts of Policies on Poverty (Absolute Poverty Lines)”, there are some methods to calculate the absolute poverty line.

Food energy intake (FEI): Enough food to meet energy requirements
Cost of basic needs (CBN): A consumption bundle with food and nonfood
Consumption insufficiency method (CI): All necessary goods and services to satisfy the basic needs
Budget standard method (BS): All necessary goods and services to satisfy the basic needs \(+\) basic minimum for social lives

All of the above methods define a Set of goods that would ensure a standard of living and convert this set of goods into Monetary Values. The final aim is to define a poverty line (Threshold) below which an individual is considered poor.

The food energy intake (FEI) methodology defines the minimum food intake needed by a given individual to lead a decent life. By this definition, those people who can’t afford the cost of the FEI are poor. In addition, by definition, FEI is an absolute concept of poverty that is entirely food-based. This measure is a good indicator of poverty in those countries where a large part of the population spends a significant fraction of their budget on food (Specially for less developed economies).

In this part, we only consider the first method which is FEI. Here, we calculate the absolute poverty line based on the food bundle proposed by the Ministry of Health, providing 2,100 Kilo Calories per day and the required protein contents.

The following table is a food bundle designed by Iran’s Ministry of Health, which is by the dietary pattern of an adult and provides 2100 kilo calories daily.

سبد غذایی تامین کننده ۲۱۰۰ کیلو کالری در روز (منطبق با الگوی غذایی بالغین)

ماده غذایی	مقدار سبد غذایی ماهانه
نان	۸ کیلوگرم
برنج	۳ کیلوگرم
ماکارونی (رشته فرنگی)	۷۰۰ گرم (یک بسته)
عدس (دیگر حبوبات)	۶۰۰ گرم
سیب زمینی (نخود سبز)	۱/۵ کیلوگرم
شیر	۷ کیسه ۱ لیتری
پنیر	۴۵۰ گرم
ماست	۳ کیلوگرم
گوشت قرمز	۱/۲ کیلوگرم
گوشت سفید (ماهی منجمد)	۱/۵ کیلوگرم (یک عدد مرغ)
تخم مرغ	۱۰ عدد
روغن مایع	۹۰۰ سی سی
شکر (قند یا عسل یا مربا)	۱ کیلوگرم
میوه	۶۰ واحد
سبزی های برگ سبز	۶۰ واحد
دیگر سبزی ها	۶۰ واحد

First, We extract food items from data and label them.

# For the materials that has replacement we add | instead &
RU1402P3S01 <- RU1402P3S01 %>% 
  mutate(label = case_when(
    goods_code >= 11141 & goods_code <= 11156 ~ "bread",
    goods_code >= 11111 & goods_code <= 11118 ~ "rice",
    goods_code == 11164 ~ "spaghetti", #Considering (noodle)
    goods_code == 11731 | goods_code == 11725 ~ "potato", # Considering (green pea)
    goods_code >= 11411 & goods_code <= 11414 ~ "milk",
    goods_code >= 11424 & goods_code <= 11426 ~ "yoghurt",
    (goods_code == 11241) & (goods_code >= 11211 & goods_code <= 11224) ~ "red meat",
    goods_code >= 11441 & goods_code <= 11443 ~ "egg",
    goods_code >= 11428 & goods_code <= 11431 ~ "cheese",
    goods_code >= 11611 & goods_code <= 11643 ~ "fruit",
    (goods_code >= 11711 & goods_code <= 11715) | (goods_code >= 11721 & goods_code <= 11753) & goods_code != 11731 ~ "vegetables", # Considering both green and non green vegetables
    goods_code == 11533 ~ "oil",
    goods_code == 11812 | goods_code == 11811 | goods_code == 11821 | goods_code == 11823 ~ "sugar", # Considering (Jam, cube sugar, honey)
    goods_code == 11768 | (goods_code >= 11761 & goods_code <= 11769) ~ "lentil", #considering other legumes)
    (goods_code == 11312 & goods_code == 11314) | (goods_code >= 11231 & goods_code <= 11239) ~ "white meat", # considering fish
    TRUE ~ NA_character_))

In this part, we modify the “Kilogram” column and generate a data frame to indicate the above food bundle table. Here we need to have some assumptions about some foods. Based on the search on internet, we create our assumption (there may not be the exact weight of each food but it’s a good estimation):

One medium egg: 50 grams
One unit fruit: 80 grams
One unit vegetable: 75 grams
One liter oil: 900 grams

P3S01_ModifiedKg <- RU1402P3S01 %>%
  mutate(gram = ifelse(is.na(gram), 0, gram/1000)) %>% 
  mutate(kilogram = ifelse(is.na(kilogram), 0, kilogram)) %>% 
  mutate(kilogram = kilogram + gram) %>% 
  select(-gram)
# Generating the food bundle based on the above table
food_bundle <- data.frame(
  label = c("bread", "rice", "spaghetti", "potato", "lentil", "milk", "yoghurt",
            "red meat", "white meat", "egg", "cheese", "fruit", "vegetables", "oil", "sugar"),
  needs_kg = c(8, 3, 0.7, 1.5, 0.6, 7, 3, 1.2, 1.5, 0.5, 0.45, 4.8, 9, 0.8, 1))

Here, We use two methods to calculate the weighted mean price.\[\text{Weighted Mean Price}_1 = \frac{\sum_{i=1}^{n} (\text{weight}_i \times \text{value}_i)}{\sum_{i=1}^{n} (\text{weight}_i \times \text{kilogram}_i)}\] \[\text{Weighted Mean Price}_2 = \frac{\sum_{i=1}^{n} (\text{weight}_i \times \text{price}_i)}{\sum_{i=1}^{n} (\text{weight}_i)}\]

The key differences between these two methods are:

The first method (\(\text{Weighted Mean Price}_1\)) is useful when dealing with heterogeneous items with varying prices and quantities.
The second method (\(\text{Weighted Mean Price}_2\)) is useful when comparing prices for the same product in different markets.

We estimate the cost of this bundle for urban and rural regions in each province to determine the absolute poverty line.

The following table summarizes the results from the Absolute Food Poverty Line.

Absolute Poverty Line (Individual Level - Rial)
province	Urban_Rural	First_Method	Second_Method
Markazi	Rural	10,892,461	12,159,927
Markazi	Urban	11,753,554	13,336,748
Gilan	Rural	12,321,780	14,695,402
Gilan	Urban	13,071,919	15,646,669
Mazandaran	Rural	13,241,551	15,454,503
Mazandaran	Urban	13,719,724	15,408,803
East Azerbaijan	Rural	11,192,237	12,163,803
East Azerbaijan	Urban	11,736,456	12,947,565
West Azerbaijan	Rural	10,385,923	11,234,672
West Azerbaijan	Urban	10,782,061	11,785,844
Kermanshah	Rural	10,258,364	11,075,460
Kermanshah	Urban	10,973,332	11,828,619
Khuzestan	Rural	11,202,407	11,730,458
Khuzestan	Urban	12,103,919	13,429,049
Fars	Rural	11,499,322	12,293,533
Fars	Urban	11,973,511	13,069,107
Kerman	Rural	10,797,652	11,694,707
Kerman	Urban	11,217,183	12,528,833
Razavi Khorasan	Rural	10,262,250	11,176,523
Razavi Khorasan	Urban	10,978,742	12,250,785
Isfahan	Rural	10,612,120	11,612,185
Isfahan	Urban	11,066,719	12,484,515
Sistan and Baluchestan	Rural	11,959,658	13,429,793
Sistan and Baluchestan	Urban	11,691,361	13,343,773
Kurdistan	Rural	10,331,593	11,321,938
Kurdistan	Urban	11,250,252	12,929,946
Hamadan	Rural	9,748,355	10,846,703
Hamadan	Urban	10,674,406	11,757,492
Chaharmahal and Bakhtiari	Rural	11,467,329	11,631,019
Chaharmahal and Bakhtiari	Urban	11,360,201	13,056,104
Lorestan	Rural	9,576,627	10,383,641
Lorestan	Urban	10,340,902	11,164,087
Ilam	Rural	9,969,086	10,622,510
Ilam	Urban	10,736,474	11,847,126
Kohgiluyeh and Boyer-Ahmad	Rural	10,894,116	11,799,621
Kohgiluyeh and Boyer-Ahmad	Urban	10,708,175	12,992,214
Bushehr	Rural	12,596,052	13,708,934
Bushehr	Urban	12,905,540	14,032,785
Zanjan	Rural	11,262,237	13,072,803
Zanjan	Urban	11,744,954	13,941,275
Semnan	Rural	10,236,025	10,835,096
Semnan	Urban	10,709,173	11,364,694
Yazd	Rural	11,778,132	13,192,722
Yazd	Urban	11,632,590	13,215,317
Hormozgan	Rural	14,434,812	16,844,817
Hormozgan	Urban	14,207,530	15,780,112
Tehran	Rural	11,693,750	13,107,647
Tehran	Urban	12,815,263	14,361,055
Ardabil	Rural	11,446,626	12,279,087
Ardabil	Urban	12,168,514	12,689,191
Qom	Rural	10,854,294	11,874,798
Qom	Urban	10,489,917	11,711,749
Qazvin	Rural	11,572,768	12,689,321
Qazvin	Urban	11,682,550	13,202,049
Golestan	Rural	9,591,023	10,381,877
Golestan	Urban	10,305,236	11,065,175
North Khorasan	Rural	9,665,898	10,599,748
North Khorasan	Urban	10,495,577	11,715,921
South Khorasan	Rural	10,849,373	11,314,446
South Khorasan	Urban	10,852,866	12,410,013
Alborz	Rural	12,037,217	12,855,809
Alborz	Urban	12,937,873	13,628,687

In the first method, the cost of food bundle for one person in a month ranges from 9,576,627 Rial in rural areas of ‘Lorestan’ province to 14,434,812 Rial in rural areas of ‘Hormozgan’ province. Provinces with highest poverty line are ‘Hormozgan’ and ‘Mazandaran’ in both areas.

In the second method, the cost of food bundle for one person in a month ranges from 10,381,877 Rial in rural areas of ‘Golestan’ province to 16,844,817 Rial in rural areas of ‘Hormozgan’ province.

Provinces with highest absolute poverty line are ‘Hormozgan’ and ‘Mazandaran’. Provinces with lowest absolute poverty line are ‘Lorestan’ and ‘North Khorasan’.

2. Estimation of the Relative Poverty Line

Based on an article from FAO about “Impacts of Policies on Poverty (Relative Poverty Lines)”, there are some methods to calculate the relative poverty line.

Income levels (IL): percentage of mean/median income below
Income positions (IP): income quantile below

This approach considers the welfare position of each household about the welfare position of other individuals or households. These methods rely on a threshold that is relative to either income or expenditure.

It is usually taken income as a reference variable. However, in applied works, expenditure is sometimes taken as a more correct welfare indicator than income, as transitory shocks may drive observed income far from its permanent level. Expenditures are thought to better reflect this level of permanent income.

According to the low accuracy of our data in reporting income, we concentrate on expenditure rather than income as a monetary indicator.

Traditional Approach (IL):

We focus on 50 % of the median per capita expenditure as the poverty threshold.

We assess total spending on non-durable goods for individuals in all provinces, including both urban and rural areas.

# Combining the tables which starts with P3S
combined_p3 <- bind_rows(mget(P3_datasets, envir = .GlobalEnv))

# Calculating the total expenditures for individuals in each province, including both urban and rural areas.
total_expenditure <- combined_p3 %>%
  dplyr::group_by(Address, province, Urban_Rural, Weight) %>%
  dplyr::summarize(total_value = sum(value, na.rm = TRUE), .groups = 'drop')

# Calculate 50% of the median in each province, including both urban and rural areas
relative_line <- total_expenditure %>%
  dplyr::group_by(province, Urban_Rural) %>%
  dplyr::summarize(Half_Median = 0.5 * median(total_value, na.rm = TRUE), .groups = 'drop') %>%
  mutate(Urban_Rural = recode(Urban_Rural, "R" = "Rural", "U" = "Urban")) %>%
  arrange(province, Urban_Rural, -Half_Median) 

kable(
  relative_line %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Relative Poverty Line - Traditional Approach (Individual Level - Rial)" = ncol(relative_line))) %>%
  row_spec(which(relative_line$Urban_Rural == "Rural"), background = "violet") %>%
  row_spec(which(relative_line$Urban_Rural == "Urban"), background = "yellow")

Relative Poverty Line - Traditional Approach (Individual Level - Rial)
province	Urban_Rural	Half_Median
Markazi	Rural	47,050,000
Markazi	Urban	61,550,000
Gilan	Rural	34,165,000
Gilan	Urban	54,135,250
Mazandaran	Rural	59,686,619
Mazandaran	Urban	71,847,500
East Azerbaijan	Rural	32,717,500
East Azerbaijan	Urban	46,120,500
West Azerbaijan	Rural	29,887,000
West Azerbaijan	Urban	40,482,500
Kermanshah	Rural	39,786,000
Kermanshah	Urban	52,030,000
Khuzestan	Rural	43,755,000
Khuzestan	Urban	64,177,500
Fars	Rural	52,862,500
Fars	Urban	81,802,500
Kerman	Rural	29,305,000
Kerman	Urban	37,080,000
Razavi Khorasan	Rural	35,338,250
Razavi Khorasan	Urban	58,067,500
Isfahan	Rural	46,695,500
Isfahan	Urban	64,286,500
Sistan and Baluchestan	Rural	26,307,500
Sistan and Baluchestan	Urban	44,267,500
Kurdistan	Rural	40,077,500
Kurdistan	Urban	55,362,500
Hamadan	Rural	37,876,250
Hamadan	Urban	54,967,400
Chaharmahal and Bakhtiari	Rural	54,890,000
Chaharmahal and Bakhtiari	Urban	66,162,500
Lorestan	Rural	37,079,500
Lorestan	Urban	50,695,050
Ilam	Rural	30,913,500
Ilam	Urban	42,693,750
Kohgiluyeh and Boyer-Ahmad	Rural	36,967,500
Kohgiluyeh and Boyer-Ahmad	Urban	50,917,000
Bushehr	Rural	70,438,200
Bushehr	Urban	78,498,250
Zanjan	Rural	43,818,000
Zanjan	Urban	68,323,000
Semnan	Rural	27,542,500
Semnan	Urban	36,870,000
Yazd	Rural	49,700,000
Yazd	Urban	58,330,000
Hormozgan	Rural	43,610,000
Hormozgan	Urban	60,189,000
Tehran	Rural	60,374,300
Tehran	Urban	101,536,580
Ardabil	Rural	40,580,000
Ardabil	Urban	51,282,500
Qom	Rural	41,446,775
Qom	Urban	54,981,500
Qazvin	Rural	40,275,250
Qazvin	Urban	57,117,000
Golestan	Rural	35,715,000
Golestan	Urban	48,305,000
North Khorasan	Rural	32,448,000
North Khorasan	Urban	49,644,500
South Khorasan	Rural	26,759,500
South Khorasan	Urban	50,287,500
Alborz	Rural	62,097,362
Alborz	Urban	77,540,000

3 Provinces with highest relative poverty line are ‘Tehran’ and ‘Fars’ and ‘Bushehr’ in urban areas. 3 Provinces with lowest relative poverty line are ‘Sistan and Baluchestan’ and ‘South Khorasan’ and ‘Semnan’ in rural areas.

Hybrid Approach:

It is important to analyze the food bundle of very poor individuals, as they are often stuck in a poverty trap. This analysis will enable us to better target assistance and improve their access to necessary food items.

In this part, we first rank households by per capita non-durable expenditure.
Second, we identify the bottom 20% of households in both urban and rural zones.
Third, we analyze the food bundle typically consumed by these households.

In this step, we need to match the bottom 20% of households to the RU1402P3S01 table to find their food bundle by “label” column.

In this step, we need to get the goods code that they have NA label but they are consumed typically by these households. We set our threshold at 4000. This means that we only consider the goods with NA labels which consumed by more than 4000 of the bottom 20% of households.

NA_label <- filtered_RU1402P3S01 %>% 
  filter(is.na(label)) %>% 
  group_by(goods_code) %>% 
  dplyr::summarize(count = n(), .groups = 'drop') %>%
  filter(count>4000) %>%
  arrange(-count)

Here, we set the labels for these goods and consider them in their food bundle.

NA_label <- NA_label %>% 
  mutate(label = case_when(
    goods_code == 11921 ~ "tomato paste",
    goods_code == 12211 ~ "soda",
    goods_code == 11911 ~ "salt",
    goods_code == 12112 ~ "non-irani tea",
    goods_code == 11665 ~ "chips & pofak",
    goods_code == 11531 ~ "margarine",
    goods_code == 11171 ~ "biscuit",
    goods_code == 11914 ~ "turmeric",
    goods_code == 11423 ~ "ice cream",
    goods_code == 11161 ~ "flour",
    goods_code == 11172 ~ "cake",
    TRUE ~ NA_character_
  )) %>% 
    filter(!is.na(label))
# Indicate the table 
kable(
  NA_label %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  column_spec(1, background = "olivedrab1") %>%
  column_spec(2, background = "slateblue1") %>%
  column_spec(3, background = "deepskyblue")

goods_code	count	label
11,921	16,566	tomato paste
12,211	12,901	soda
11,911	9,958	salt
12,112	9,375	non-irani tea
11,665	8,083	chips & pofak
11,531	6,796	margarine
11,171	6,516	biscuit
11,914	4,801	turmeric
11,423	4,338	ice cream
11,161	4,219	flour
11,172	4,184	cake

In addition to the food bundle defined by the Iran’s Ministry of Health, we found goods that are typically consumed by the bottom 20% of households. These new goods include:

رب گوجه فرنگی
انواع نوشابه
انواع نمک
انواع چای خارجی
پفک و چیپس
انواع روغن نباتی جامد و مارگارین
انواع بیسکوییت و ویفر
زردچوبه
انواع بستنی شیری و میوه ای
آرد گندم
انواع کیک ساده

In this step, we convert the “gram” column to “kilogram”. Then we assign a label to this new code of goods consumed by the bottom 20% of poor households.

The following code generates the new food bundle.

# Get the unique values of label column
uniq <- unique(filtered_RU1402P3S01$label)

n_rows <- ceiling(length(uniq) / 2)
uniq_table <- data.frame(
  Column1 = uniq[1:n_rows],
  Column2 = uniq[(n_rows + 1):length(uniq)])

kable(uniq_table, format = "html", col.names = c("Column 1", "Column 2"), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
    full_width = FALSE,
    position = "center") %>%
  column_spec(1, background = "wheat3", border_right = TRUE) %>%
  column_spec(2, background = "wheat3")

Column 1	Column 2
bread	sugar
ice cream	non-irani tea
yoghurt	NA
cheese	oil
egg	soda
vegetables	biscuit
potato	rice
lentil	margarine
tomato paste	salt
spaghetti	turmeric
white meat	cake
milk	chips & pofak
fruit	flour

A notable point about the table above is that the new food bundle for the bottom 20% of households does not include Red Meat. This indicates that households in the lowest 20% do not incorporate Red Meat into their monthly food bundle.

Fourth, we determine the amounts that provide 2,100 calories based on this new bundle.

In this step, we calculate the average food bundle of poor people below the 20% of bottom.

The following code creates a data set to calculate the average consumption of each household in every province, distinguishing between urban and rural areas by their labels.

Based on the search on this Website, we find the calories per kilogram for each item included in the new food bundle for the bottom 20% of households.

kable(
  average_consumption %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Average Daily Calories (Individual Level - 20% bottom)" = ncol(average_consumption))) %>%
  row_spec(which(average_consumption$Urban_Rural == "Rural"), background = "slategray1") %>%
  row_spec(which(average_consumption$Urban_Rural == "Urban"), background = "peachpuff1")

Average Daily Calories (Individual Level - 20% bottom)
province	Urban_Rural	average_personal_calories
Markazi	Rural	1,334
Markazi	Urban	1,029
Gilan	Rural	1,218
Gilan	Urban	1,117
Mazandaran	Rural	1,871
Mazandaran	Urban	1,364
East Azerbaijan	Rural	1,692
East Azerbaijan	Urban	1,337
West Azerbaijan	Rural	1,190
West Azerbaijan	Urban	1,076
Kermanshah	Rural	2,719
Kermanshah	Urban	2,034
Khuzestan	Rural	2,357
Khuzestan	Urban	1,708
Fars	Rural	1,943
Fars	Urban	1,581
Kerman	Rural	2,354
Kerman	Urban	1,965
Razavi Khorasan	Rural	1,743
Razavi Khorasan	Urban	1,644
Isfahan	Rural	1,572
Isfahan	Urban	1,349
Sistan and Baluchestan	Rural	1,964
Sistan and Baluchestan	Urban	2,045
Kurdistan	Rural	1,637
Kurdistan	Urban	1,487
Hamadan	Rural	1,332
Hamadan	Urban	1,110
Chaharmahal and Bakhtiari	Rural	2,703
Chaharmahal and Bakhtiari	Urban	1,919
Lorestan	Rural	2,465
Lorestan	Urban	1,826
Ilam	Rural	2,569
Ilam	Urban	2,510
Kohgiluyeh and Boyer-Ahmad	Rural	1,941
Kohgiluyeh and Boyer-Ahmad	Urban	1,235
Bushehr	Rural	2,577
Bushehr	Urban	2,146
Zanjan	Rural	1,585
Zanjan	Urban	1,198
Semnan	Rural	2,242
Semnan	Urban	2,286
Yazd	Rural	1,444
Yazd	Urban	1,330
Hormozgan	Rural	1,170
Hormozgan	Urban	1,110
Tehran	Rural	1,297
Tehran	Urban	1,335
Ardabil	Rural	1,731
Ardabil	Urban	1,372
Qom	Rural	1,456
Qom	Urban	1,194
Qazvin	Rural	1,752
Qazvin	Urban	1,307
Golestan	Rural	1,513
Golestan	Urban	1,375
North Khorasan	Rural	2,278
North Khorasan	Urban	1,874
South Khorasan	Rural	2,132
South Khorasan	Urban	1,859
Alborz	Rural	1,021
Alborz	Urban	881

On average, people living in ‘Karaj’ Province receive the lowest daily calorie intake among the bottom 20% of households nationwide, with 1,021 calories in urban areas and 881 calories in rural areas. On average, people living in rural areas of ‘Kermanshah’ Province and ‘Chaharmahal Bakhtiari’ Province receive the highest daily calorie intake among the bottom 20% of households nationwide, with 2,719 calories in ‘Kermanshah’ Province and 2,703 calories in ‘Chaharmahal Bakhtiari’ Province.

Fifth, we use the cost of this new bundle as a refined poverty line

# Generate the price vector of this new bundle
filtered_RU1402P3S01 <- left_join(filtered_RU1402P3S01,address_count, by = "Address")
price_vector <- filtered_RU1402P3S01 %>%
  filter(!is.na(label) & !is.na(value) & !is.na(kilogram)) %>%
  group_by(label, province, Urban_Rural) %>%
  dplyr::summarize(weighted_mean_price = sum(value * Weight, na.rm = TRUE) / sum(kilogram * Weight, na.rm = TRUE), .groups = 'drop')

filtered_RU1402P3S01 <- filtered_RU1402P3S01 %>%
  left_join(price_vector, by = c('label', 'province','Urban_Rural')) %>%
  mutate(kilogram = kilogram / member)

hybrid_poverty_line <- filtered_RU1402P3S01 %>%
  group_by(label, province, Urban_Rural) %>%
  mutate(total_product_cost = weighted_mean_price * kilogram) %>%
  group_by(Address, province, Urban_Rural) %>%
  summarise(total_bundle_price = sum(total_product_cost, na.rm = TRUE),.groups = 'drop') %>% 
  group_by(province, Urban_Rural) %>% 
  mutate(Urban_Rural = recode(Urban_Rural, "R" = "Rural", "U" = "Urban")) %>%
  summarise(cost_new_bundle = mean(total_bundle_price, na.rm = TRUE),.groups = 'drop')

kable(
  hybrid_poverty_line %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Relative Poverty Line - Hybrid Approach (Individual Level - Rial)" = ncol(hybrid_poverty_line))) %>%
  row_spec(which(hybrid_poverty_line$Urban_Rural == "Rural"), background = "cadetblue1") %>%
  row_spec(which(hybrid_poverty_line$Urban_Rural == "Urban"), background = "goldenrod2")

Relative Poverty Line - Hybrid Approach (Individual Level - Rial)
province	Urban_Rural	cost_new_bundle
Markazi	Rural	12,078,831
Markazi	Urban	13,231,259
Gilan	Rural	7,349,822
Gilan	Urban	9,134,857
Mazandaran	Rural	12,448,715
Mazandaran	Urban	15,157,034
East Azerbaijan	Rural	11,181,421
East Azerbaijan	Urban	14,011,940
West Azerbaijan	Rural	13,887,902
West Azerbaijan	Urban	14,114,726
Kermanshah	Rural	17,262,331
Kermanshah	Urban	20,576,228
Khuzestan	Rural	17,043,927
Khuzestan	Urban	20,538,817
Fars	Rural	10,886,670
Fars	Urban	16,188,458
Kerman	Rural	17,223,598
Kerman	Urban	19,287,059
Razavi Khorasan	Rural	13,303,883
Razavi Khorasan	Urban	15,515,614
Isfahan	Rural	10,667,042
Isfahan	Urban	13,561,812
Sistan and Baluchestan	Rural	13,373,659
Sistan and Baluchestan	Urban	15,962,129
Kurdistan	Rural	16,256,677
Kurdistan	Urban	17,687,751
Hamadan	Rural	10,316,798
Hamadan	Urban	9,957,974
Chaharmahal and Bakhtiari	Rural	14,525,660
Chaharmahal and Bakhtiari	Urban	19,639,382
Lorestan	Rural	16,868,750
Lorestan	Urban	17,367,830
Ilam	Rural	17,752,596
Ilam	Urban	20,174,603
Kohgiluyeh and Boyer-Ahmad	Rural	13,535,465
Kohgiluyeh and Boyer-Ahmad	Urban	14,852,076
Bushehr	Rural	17,913,329
Bushehr	Urban	20,064,721
Zanjan	Rural	12,849,297
Zanjan	Urban	13,614,465
Semnan	Rural	13,111,742
Semnan	Urban	16,250,432
Yazd	Rural	9,684,500
Yazd	Urban	12,981,312
Hormozgan	Rural	13,037,213
Hormozgan	Urban	16,132,922
Tehran	Rural	12,832,553
Tehran	Urban	15,794,607
Ardabil	Rural	14,216,254
Ardabil	Urban	15,813,921
Qom	Rural	11,437,388
Qom	Urban	12,957,472
Qazvin	Rural	12,717,597
Qazvin	Urban	15,932,593
Golestan	Rural	11,234,501
Golestan	Urban	14,426,796
North Khorasan	Rural	11,811,541
North Khorasan	Urban	13,448,426
South Khorasan	Rural	13,351,752
South Khorasan	Urban	16,099,192
Alborz	Rural	8,386,304
Alborz	Urban	10,573,553

3 Provinces with highest relative poverty line are ‘Khuzestan’ and ‘Chaharmahal and Bakhtiari’ and ‘Bushehr’ in urban areas. 3 Provinces with lowest relative poverty line are ‘Gilan’ and ‘Alborz’ in rural areas and ‘Gilan’ in urban areas.

comparison_approach <- relative_line %>% 
  left_join(hybrid_poverty_line, by = c("province", "Urban_Rural")) %>% 
  mutate(Urban_Rural = recode(Urban_Rural, "R" = "Rural", "U" = "Urban")) %>% 
  rename(Tradition_Approach = Half_Median, Hybrid_Approach = cost_new_bundle)

kable(
  comparison_approach %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Comparison of 2 approaches in Relative poverty line (Individual Level - Rial)" = ncol(comparison_approach))) %>%
  row_spec(which(comparison_approach$Urban_Rural == "Rural"), background = "chartreuse2") %>%
  row_spec(which(comparison_approach$Urban_Rural == "Urban"), background = "slateblue1")

Comparison of 2 approaches in Relative poverty line (Individual Level - Rial)
province	Urban_Rural	Tradition_Approach	Hybrid_Approach
Markazi	Rural	47,050,000	12,078,831
Markazi	Urban	61,550,000	13,231,259
Gilan	Rural	34,165,000	7,349,822
Gilan	Urban	54,135,250	9,134,857
Mazandaran	Rural	59,686,619	12,448,715
Mazandaran	Urban	71,847,500	15,157,034
East Azerbaijan	Rural	32,717,500	11,181,421
East Azerbaijan	Urban	46,120,500	14,011,940
West Azerbaijan	Rural	29,887,000	13,887,902
West Azerbaijan	Urban	40,482,500	14,114,726
Kermanshah	Rural	39,786,000	17,262,331
Kermanshah	Urban	52,030,000	20,576,228
Khuzestan	Rural	43,755,000	17,043,927
Khuzestan	Urban	64,177,500	20,538,817
Fars	Rural	52,862,500	10,886,670
Fars	Urban	81,802,500	16,188,458
Kerman	Rural	29,305,000	17,223,598
Kerman	Urban	37,080,000	19,287,059
Razavi Khorasan	Rural	35,338,250	13,303,883
Razavi Khorasan	Urban	58,067,500	15,515,614
Isfahan	Rural	46,695,500	10,667,042
Isfahan	Urban	64,286,500	13,561,812
Sistan and Baluchestan	Rural	26,307,500	13,373,659
Sistan and Baluchestan	Urban	44,267,500	15,962,129
Kurdistan	Rural	40,077,500	16,256,677
Kurdistan	Urban	55,362,500	17,687,751
Hamadan	Rural	37,876,250	10,316,798
Hamadan	Urban	54,967,400	9,957,974
Chaharmahal and Bakhtiari	Rural	54,890,000	14,525,660
Chaharmahal and Bakhtiari	Urban	66,162,500	19,639,382
Lorestan	Rural	37,079,500	16,868,750
Lorestan	Urban	50,695,050	17,367,830
Ilam	Rural	30,913,500	17,752,596
Ilam	Urban	42,693,750	20,174,603
Kohgiluyeh and Boyer-Ahmad	Rural	36,967,500	13,535,465
Kohgiluyeh and Boyer-Ahmad	Urban	50,917,000	14,852,076
Bushehr	Rural	70,438,200	17,913,329
Bushehr	Urban	78,498,250	20,064,721
Zanjan	Rural	43,818,000	12,849,297
Zanjan	Urban	68,323,000	13,614,465
Semnan	Rural	27,542,500	13,111,742
Semnan	Urban	36,870,000	16,250,432
Yazd	Rural	49,700,000	9,684,500
Yazd	Urban	58,330,000	12,981,312
Hormozgan	Rural	43,610,000	13,037,213
Hormozgan	Urban	60,189,000	16,132,922
Tehran	Rural	60,374,300	12,832,553
Tehran	Urban	101,536,580	15,794,607
Ardabil	Rural	40,580,000	14,216,254
Ardabil	Urban	51,282,500	15,813,921
Qom	Rural	41,446,775	11,437,388
Qom	Urban	54,981,500	12,957,472
Qazvin	Rural	40,275,250	12,717,597
Qazvin	Urban	57,117,000	15,932,593
Golestan	Rural	35,715,000	11,234,501
Golestan	Urban	48,305,000	14,426,796
North Khorasan	Rural	32,448,000	11,811,541
North Khorasan	Urban	49,644,500	13,448,426
South Khorasan	Rural	26,759,500	13,351,752
South Khorasan	Urban	50,287,500	16,099,192
Alborz	Rural	62,097,362	8,386,304
Alborz	Urban	77,540,000	10,573,553

As we anticipated, the relative poverty line determined by the hybrid approach is lower than that calculated using the traditional approach. This difference arises because the hybrid approach is based on the food bundle consumed by the bottom 20% of households, rather than the 50% median of all households. When households are limited to specific food items, the cost of obtaining 2,100 calories is significantly higher compared to when they have the option to choose from a more diverse food bundle. As we noted in previous part, households in the bottom 20% exclude red meat from their monthly food bundle.

3. Calculation of the Poverty Gap

The poverty gap index measures the intensity of poverty by calculating the average difference between the income of poor people and the income poverty line, as a percentage of the income poverty line. It it tells us how much poorer the poor are relative to the income poverty line.

A higher PGI means that the poor are farther below the poverty line.
A lower PGI means that the poor are closer to the poverty line.

\[ \text{PGI}=\frac{1}{N}\sum_{j=1}^{q}\left(\frac{z-y_j}{z}\right) \]

Where,

\(N\) is total population
\(q\) is the number of poor
\(z\) is the income poverty line
\(y_j\) is the income for person \(j\) living below the poverty line

In this part we estimate poverty gap index for each province.

First, we need to calculate income poverty line.

The income poverty line can be calculated by the following formula:

\[ \text{Income Poverty Line} = \frac{\text{Absolute Poverty Line}}{\text{Share Food Expenditure to Total Expenditure}} = \frac{\text{Absolute Poverty Line}}{\frac{\text{Food expenditure}}{\text{Total Expenditure}}} \]

Income poverty line (2 Methods) (Individual Level - Rial)
province	Urban_Rural	income_poverty_line1	income_poverty_line2
Markazi	Rural	25,150,564	28,077,129
Markazi	Urban	27,138,818	30,794,395
Gilan	Rural	38,052,828	45,383,187
Gilan	Urban	40,369,451	48,320,944
Mazandaran	Rural	34,797,712	40,613,168
Mazandaran	Urban	36,054,311	40,493,072
East Azerbaijan	Rural	25,544,023	27,761,429
East Azerbaijan	Urban	26,786,095	29,550,206
West Azerbaijan	Rural	25,500,063	27,583,958
West Azerbaijan	Urban	26,472,683	28,937,225
Kermanshah	Rural	20,678,475	22,325,552
Kermanshah	Urban	22,119,685	23,843,744
Khuzestan	Rural	20,916,150	21,902,081
Khuzestan	Urban	22,599,374	25,073,539
Fars	Rural	30,448,510	32,551,464
Fars	Urban	31,704,092	34,605,070
Kerman	Rural	23,048,598	24,963,445
Kerman	Urban	23,944,125	26,743,965
Razavi Khorasan	Rural	22,052,404	24,017,072
Razavi Khorasan	Urban	23,592,064	26,325,539
Isfahan	Rural	29,608,483	32,398,728
Isfahan	Urban	30,876,842	34,832,584
Sistan and Baluchestan	Rural	23,393,531	26,269,170
Sistan and Baluchestan	Urban	22,868,734	26,100,911
Kurdistan	Rural	22,089,260	24,206,647
Kurdistan	Urban	24,053,380	27,644,617
Hamadan	Rural	27,747,101	30,873,372
Hamadan	Urban	30,382,956	33,465,784
Chaharmahal and Bakhtiari	Rural	22,654,133	22,977,508
Chaharmahal and Bakhtiari	Urban	22,442,497	25,792,817
Lorestan	Rural	19,392,652	21,026,854
Lorestan	Urban	20,940,307	22,607,255
Ilam	Rural	18,174,783	19,366,049
Ilam	Urban	19,573,818	21,598,664
Kohgiluyeh and Boyer-Ahmad	Rural	24,225,532	26,239,127
Kohgiluyeh and Boyer-Ahmad	Urban	23,812,051	28,891,128
Bushehr	Rural	27,823,714	30,281,985
Bushehr	Urban	28,507,349	30,997,346
Zanjan	Rural	28,053,554	32,563,565
Zanjan	Urban	29,255,973	34,726,876
Semnan	Rural	22,075,047	23,367,008
Semnan	Urban	23,095,441	24,509,141
Yazd	Rural	31,217,223	34,966,507
Yazd	Urban	30,831,473	35,026,395
Hormozgan	Rural	31,700,116	36,992,699
Hormozgan	Urban	31,200,985	34,654,511
Tehran	Rural	42,498,904	47,637,469
Tehran	Urban	46,574,847	52,192,759
Ardabil	Rural	26,836,899	28,788,623
Ardabil	Urban	28,529,383	29,750,122
Qom	Rural	28,525,991	31,207,959
Qom	Urban	27,568,377	30,779,452
Qazvin	Rural	29,465,016	32,307,831
Qazvin	Urban	29,744,527	33,613,270
Golestan	Rural	23,889,704	25,859,595
Golestan	Urban	25,668,695	27,561,580
North Khorasan	Rural	23,967,249	26,282,793
North Khorasan	Urban	26,024,493	29,050,418
South Khorasan	Rural	20,119,177	20,981,614
South Khorasan	Urban	20,125,655	23,013,243
Alborz	Rural	36,785,771	39,287,390
Alborz	Urban	39,538,178	41,649,306

In the first column, income poverty line 1, which comes from calculating the absolute poverty line, the highest income poverty line is related to the rural areas of ‘Ilam’ and ‘Lorestan’ provinces and the urban areas of ‘Ilam’, respectively. The lowest income poverty line is related to the urban and rural areas of ‘Tehran’ province and the urban areas of ‘Gilan’, respectively. In the second column, income poverty line 2, which comes from calculating the absolute poverty line, the highest income poverty line is related to the rural areas of ‘Ilam’, ‘South Khorasan’, and ‘Lorestan’ provinces, respectively. The lowest income poverty line is related to the urban areas of ‘Tehran’ and ‘Gilan’ provinces and the rural areas of ‘Tehran’ province, respectively.

The difference in the income poverty line in the first column is between 18,174,783 and 46,574,847 rials per person, and the difference in the income poverty line in the second column is between 19,366,049 and 52,192,759 rials per person.

Second, we need to calculate the sources of household income from the data sets.

In this step, we clean tables that are associated with individuals’ sources of income.

# Part 4, Table 1 (For public sector)
RU1402P4S01 <- RU1402P4S01 %>%
  rename(
    member = DYCOL01,
    is_employed = DYCOL02,
    ISCO_code = DYCOL03,
    ISIC_code = DYCOL04,
    sector_status = DYCOL05,
    daily_hours = DYCOL06,
    weekly_days = DYCOL07,
    income_month = DYCOL08,
    income_year = DYCOL09,
    wage_month = DYCOL10,
    wage_year = DYCOL11,
    benefit_month = DYCOL12,
    benefit_year = DYCOL13,
    netincome_month = DYCOL14,
    netincome_year = DYCOL15) %>%
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))
# Part 4, Table 2 (For private sector)
RU1402P4S02 <- RU1402P4S02 %>%
  rename(
    member = DYCOL01,
    is_employed = DYCOL02,
    ISCO_code = DYCOL03,
    ISIC_code = DYCOL04,
    worker_status = DYCOL05,
    agriculture = DYCOL06,
    daily_hours = DYCOL07,
    weekly_days = DYCOL08,
    employment_cost = DYCOL09,
    agriculture_cost = DYCOL10,
    tool_cost = DYCOL11,
    fee_cost = DYCOL12,
    tax_cost = DYCOL13,
    sale = DYCOL14,
    netincome = DYCOL15)%>%
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))
# Part 4, Table 3 (For other incomes)
RU1402P4S03 <- RU1402P4S03 %>%
  rename(
    member = DYCOL01,
    income_pension = DYCOL03,
    income_rent = DYCOL04,
    income_interest = DYCOL05,
    income_granteduc = DYCOL06,
    income_sale = DYCOL07,
    income_famtransfer = DYCOL08)%>%
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))
# Part 4, Table 4 (Column 9 of table 3 (cash transfer or subsidy))
RU1402P4S04 <- RU1402P4S04 %>%
  rename(
    member = Dycol01,
    subsidy_member = Dycol03,
    subsidy_month = Dycol04,
    total_subsidy = Dycol05)%>%
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))

Third, we calculate monthly individuals’ sources of income from the tables.

Fourth, we identify individuals earning below the poverty line.

In this step, we only keep these individuals and exclude those above the poverty line.

In addition, we calculate the ratio of individuals who are below the income poverty line. (Following Formula)

\[\frac{1}{N}\sum_{j=1}^{q}\left(\frac{z-y_j}{z}\right)\]

combined_monthlyinc <- combined_monthlyinc %>%
  mutate(
    Urban_Rural = case_when(
      Urban_Rural == "U" ~ "Urban",
      Urban_Rural == "R" ~ "Rural",
      TRUE ~ Urban_Rural))
# Join income_poverty_line & weight & member count to combined_income
combined_monthlyinc <- combined_monthlyinc %>%
  left_join(income_poverty_line, by = c("province","Urban_Rural")) %>%
  left_join(RU1402P1_getweight, by = "Address") %>%
  left_join(address_count, by = "Address")

# Filter to keep rows where monthly net income is lower than income poverty line 1
poverty_gap1 <- combined_monthlyinc %>%
  # Filter the monthly net income of households below the income poverty line 1
  filter(total_netincome_month < income_poverty_line1) %>%
  # Calculate per capita monthly income (individual)
  mutate(personal_netincome_month = total_netincome_month / member) %>%
  # Calculate the ratio of net personal income to the income poverty line 1
  mutate(ratio = (income_poverty_line1 - personal_netincome_month) / income_poverty_line1) %>%
  group_by(province, Urban_Rural) %>%
  dplyr::summarize(
    sum_ratio = sum(Weight * ratio, na.rm = TRUE),  # Weighted sum of shortfall ratios
    PGI_1 = 100 * round(sum_ratio / nrow(combined_monthlyinc), 7),   # Divide by total rows (N)
    .groups = 'drop')

# Filter to keep rows where monthly net income is lower than income poverty line 2
poverty_gap2 <- combined_monthlyinc %>%
  # Filter the monthly net income of households below the income poverty line 2
  filter(total_netincome_month < income_poverty_line2) %>%
  # Calculate per capita monthly income (individual)
  mutate(personal_netincome_month = total_netincome_month / member) %>%
  # Calculate the ratio of net personal income to the income poverty line 2
  mutate(ratio = (income_poverty_line2 - personal_netincome_month) / income_poverty_line2) %>%
  group_by(province, Urban_Rural) %>%
  dplyr::summarize(
    sum_ratio = sum(Weight * ratio, na.rm = TRUE),  # Weighted sum of shortfall ratios
    PGI_2 = 100 * round(sum_ratio / nrow(combined_monthlyinc), 7),   # Divide by total rows (N)
    .groups = 'drop')

poverty_gap_index <- poverty_gap1 %>%
  left_join(poverty_gap2, by = c("province","Urban_Rural")) %>%
  select(province, Urban_Rural, PGI_1, PGI_2)

kable(
  poverty_gap_index %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 4))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Poverty Gap Index (2 Methods)" = ncol(poverty_gap_index))) %>%
  row_spec(which(poverty_gap_index$Urban_Rural == "Rural"), background = "lightsteelblue1") %>%
  row_spec(which(poverty_gap_index$Urban_Rural == "Urban"), background = "mistyrose1")

Poverty Gap Index (2 Methods)
province	Urban_Rural	PGI_1	PGI_2
Markazi	Rural	0.0632	0.0807
Markazi	Urban	0.0104	0.0162
Gilan	Rural	0.0748	0.0866
Gilan	Urban	0.0785	0.0955
Mazandaran	Rural	0.0260	0.0319
Mazandaran	Urban	0.0076	0.0103
East Azerbaijan	Rural	0.0491	0.0589
East Azerbaijan	Urban	0.0148	0.0175
West Azerbaijan	Rural	0.0533	0.0606
West Azerbaijan	Urban	0.0249	0.0303
Kermanshah	Rural	0.0149	0.0179
Kermanshah	Urban	0.0157	0.0174
Khuzestan	Rural	0.0372	0.0399
Khuzestan	Urban	0.0095	0.0100
Fars	Rural	0.0842	0.0964
Fars	Urban	0.0322	0.0393
Kerman	Rural	0.0131	0.0189
Kerman	Urban	0.0052	0.0073
Razavi Khorasan	Rural	0.0745	0.0905
Razavi Khorasan	Urban	0.0209	0.0274
Isfahan	Rural	0.0262	0.0320
Isfahan	Urban	0.0081	0.0096
Sistan and Baluchestan	Rural	0.1730	0.2097
Sistan and Baluchestan	Urban	0.1108	0.1387
Kurdistan	Rural	0.0183	0.0230
Kurdistan	Urban	0.0135	0.0154
Hamadan	Rural	0.1174	0.1280
Hamadan	Urban	0.0667	0.0741
Chaharmahal and Bakhtiari	Rural	0.0056	0.0057
Chaharmahal and Bakhtiari	Urban	0.0010	0.0019
Lorestan	Rural	0.0253	0.0325
Lorestan	Urban	0.0205	0.0238
Ilam	Rural	0.0086	0.0133
Ilam	Urban	0.0065	0.0101
Kohgiluyeh and Boyer-Ahmad	Rural	0.0229	0.0323
Kohgiluyeh and Boyer-Ahmad	Urban	0.0059	0.0157
Bushehr	Rural	0.0221	0.0273
Bushehr	Urban	0.0114	0.0151
Zanjan	Rural	0.0253	0.0376
Zanjan	Urban	0.0144	0.0230
Semnan	Rural	0.0179	0.0214
Semnan	Urban	0.0049	0.0050
Yazd	Rural	0.0841	0.0875
Yazd	Urban	0.0139	0.0160
Hormozgan	Rural	0.1351	0.1700
Hormozgan	Urban	0.0982	0.1086
Tehran	Rural	0.0095	0.0157
Tehran	Urban	0.0392	0.0577
Ardabil	Rural	0.0307	0.0330
Ardabil	Urban	0.0285	0.0291
Qom	Rural	0.0152	0.0180
Qom	Urban	0.0154	0.0213
Qazvin	Rural	0.0287	0.0379
Qazvin	Urban	0.0066	0.0076
Golestan	Rural	0.0707	0.0778
Golestan	Urban	0.0551	0.0598
North Khorasan	Rural	0.0220	0.0319
North Khorasan	Urban	0.0185	0.0222
South Khorasan	Rural	0.0604	0.0734
South Khorasan	Urban	0.0191	0.0336
Alborz	Rural	0.0172	0.0178
Alborz	Urban	0.0129	0.0138

In the first column, PGI 1, which comes from calculating the income poverty line 1, the highest PGI is related to the rural areas of ‘Sistan and Baluchestan’ and ‘Hormozgan’ and ‘Hamadan’ provinces, respectively. The lowest PGI is related to the urban areas of ‘Chaharmahal and Bakhtiari’ and ‘Semnan’ and ‘Kerman’ provinces, respectively. In the second column, PGI 2, which comes from calculating the income poverty line 2, the highest PGI is related to the rural and urban areas of ‘Sistan and Baluchestan’ and rural areas of ‘Hormozgan’ provinces. The lowest PGI is related to the rural and urban areas of ‘Chaharmahal and Bakhtiari’ and urban areas of ‘Semnan’ provinces.

4. Multidimensional Poverty Index (MPI)

In the previous sections, we focused on estimating the poverty line based on food bundles. In this section, we expand our analysis to include other important aspects of life, such as ‘education’, ‘health’, and ‘nutrition’. Based on an article from World Bank about “Multidimensional Poverty Measure”, we follow the following construction. We will calculate the Multidimensional Poverty Index (MPI), which measures the percentage of households in a country that are deprived in three areas: ‘monetary poverty’, ‘education’, and ‘access to basic infrastructure services’. This approach provides a more comprehensive understanding of poverty by considering various dimensions of well-being beyond just financial resources.

In this part, we consider the 4 aspects of poverty including;

Monetary Dimension
Education Dimension
Living Standards Dimension

Then, we will use following equation to estimate multidimensional poverty index:

\[ MPI = \sum_{i=1}^{n} W_i D_i \]

where \(W_{i}\) is the weight of indicator \(D_{i}\).

In this section, we define the first dimension: (Income)

In this step, we identify the individuals who have a monthly total net income below the income poverty line. To aim this process, we create the following dummy:

If a household is below the income poverty line, \(\text{dummy income poverty line} = 1\)
If a household is not below the income poverty line, \(\text{dummy income poverty line} = 0\)

combined_monthlyinc_dum1 <- combined_monthlyinc %>%
  mutate(dummy_below_poverty = ifelse((total_netincome_month / member) < income_poverty_line1, 1, 0))
# poverty status information
RU1402P1_dum <- RU1402P1 %>%
  left_join(combined_monthlyinc_dum1 %>% select(Address, dummy_below_poverty), by = "Address") %>%
  # Handle Missing Values
  mutate(income_poverty_dummy = ifelse(is.na(dummy_below_poverty), 0, dummy_below_poverty)) %>%
  select(-dummy_below_poverty)

In this section, we define the second dimension: (Consumption)

In this step, we first identify the individuals who have a monthly total consumption below the consumption poverty line. To aim this process, we create the following dummy:

If a household is below the consumption poverty line, \(\text{dummy consumption poverty line} = 1\)
If a household is not below the consumption poverty line, \(\text{dummy consumption poverty line} = 0\)

consumption <- filtered_RU1402P3S01 %>%
  dplyr::group_by(Address, province, Urban_Rural, label) %>%
  dplyr::summarize(mean_consumption = round(weighted.mean(kilogram, Weight, na.rm = TRUE), 3), .groups = 'drop') 

consumption <- consumption %>%
  mutate(cal_per_kg = case_when(
    label == "bread" ~ 2660,
    label == "cheese" ~ 3500,
    label == "egg" ~ 1470,
    label == "fruit" ~ 500,
    label == "oil" ~ 8840,
    label == "potato" ~ 1040,
    label == "rice" ~ 1350,
    label == "soda" ~ 380,
    label == "sugar" ~ 3870,
    label == "tomato paste" ~ 820,
    label == "vegetables" ~ 200,
    label == "yoghurt" ~ 990,
    label == "white meat" ~ 2200,
    label == "milk" ~ 500,
    label == "biscuit" ~ 3250,
    label == "salt" ~ 0,
    label == "spaghetti" ~ 1570,
    label == "zardchobe" ~ 3540,
    label == "ice cream" ~ 2010,
    label == "lentil" ~ 1650,
    label == "margarine" ~ 5260,
    label == "non-irani tea" ~ 10,
    label == "spices" ~ 3140,
    label == "cake" ~ 3800,
    label == "chips & pofak" ~ 5200,
    label == "flour" ~ 3640,
    TRUE ~ NA_real_))

consumption_dummy <- consumption %>%
  # Calculate total calories consumed per label in a month
  mutate(total_calories = mean_consumption * cal_per_kg, na.rm = TRUE) %>%
  # Group by Address, Province, and Urban_Rural areas
  group_by(Address, province) %>%
  # Calculate the total calories of households in a month in each province
  dplyr::summarize(Total_calories = sum(total_calories, na.rm = TRUE), .groups = 'drop') %>%
  # Calculate daily calories in each household
  mutate(daily_calories = Total_calories / 30) %>%
  left_join(address_count, by = "Address") %>%
  # Calculate the daily personal calorie consumption
  mutate(daily_personal_calories = daily_calories / member) %>% 
  left_join(average_consumption, by = "province") %>% 
  mutate(below_calorie_dummy = ifelse(daily_personal_calories < average_personal_calories, 1, 0))

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(consumption_dummy %>% select(Address, below_calorie_dummy), by = "Address") %>%
  mutate(consumption_poverty_dummy = ifelse(is.na(below_calorie_dummy), 0, below_calorie_dummy)) %>% 
  select(-below_calorie_dummy)

In this section, we define the third dimension: (education)

First, we define a dummy variable for the education of household’s head.

Second, we define a dummy variable for the attendance of children at school.

Here, the thresholds that we consider for not being deprived of education is that the household’s head is literate and the children attend school.

If a Household’s Head is Literate, \(\text{Head's Education} = 0\)
If a Household’s Child is between 7 and 18 and is going to school, \(\text{Child's Education} = 0\)

# For Head
RU1402P1_dum <- RU1402P1_dum %>%
  group_by(Address) %>%
  mutate(head_education_dummy = case_when(
    any(relation == "head" & literacy == "illiterate", na.rm = TRUE) ~ 1,
    any(relation == "head" & literacy != "illiterate", na.rm = TRUE) ~ 0,
    TRUE ~ NA_real_))
# For Children
RU1402P1_dum <- RU1402P1_dum %>%
  group_by(Address) %>%
  mutate(child_education_dummy = ifelse(any(relation == "child" & age >= 7 & age <= 18 & is_studying == "No", na.rm = TRUE), 1, 0)) %>%
  ungroup()

In this section, we define the fourth dimension: (Living Standards)

First, we define a dummy variable for the Per Capita Living Area.

Second, we define a dummy variable for the Being a Homeowner.

Third, we define a dummy variable for the Access to Necessary Home Items.

Forth, we define a dummy variable for the Access to Internet and Cell Phone.

Here, the thresholds that we consider for the Living Standards are:

According to Tehran Municipality’s regulations and guidelines, the Per Capita Living Area per person is 17.5 square meters.
If a Household owns its House, \(\text{Home Owenship} = 0\).
If a Household has access to more than 3 of 5 necessary Home Items, \(\text{Home Accessibility} = 0\).

(necessary home items include TV, washing machine, refrigerator, freezer and stove)
If a Household has access to Internet and Cell Phone, \(\text{Technology Accessibility} = 0\).

# Dummy for Home Ownership
RU1402P2 <- RU1402P2 %>%
  mutate(ownership_dummy = ifelse(tenure %in% c("OwnedEstateLand", "OwnedEstate", "Free"), 0, 1))

RU1402P2 <- RU1402P2 %>%
  mutate(Address = as.numeric(Address))
#Merging it
RU1402P1_dum <- RU1402P1_dum %>%
  left_join(RU1402P2 %>% select(Address, ownership_dummy), by = "Address") %>%
  mutate(tenure = ifelse(is.na(ownership_dummy), 1, ownership_dummy)) %>%
  select(-ownership_dummy)

RU1402P2 <- merge(RU1402P2, address_count, by = "Address", all.x = TRUE)

# Calculate space_per_capita
RU1402P2 <- RU1402P2 %>%
  mutate(space_per_capita = space / member)

# Create a dummy variable for space_per_capita < 17.5
address_space_issue <- RU1402P2 %>%
  mutate(space_issue = ifelse(space_per_capita < 17.5, 1, 0)) %>%
  select(Address, space_issue)

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(address_space_issue, by = "Address") %>%
  mutate(space_issue = ifelse(is.na(space_issue), 0, space_issue))

#internet
RU1402P2 <- RU1402P2 %>%
  mutate(no_internet_dummy = ifelse(internet == "FALSE", 1, 0))

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(RU1402P2 %>% select(Address, no_internet_dummy), by = "Address") %>%
  mutate(no_internet_dummy = ifelse(is.na(no_internet_dummy), 0, no_internet_dummy))

#cellphone
RU1402P2 <- RU1402P2 %>%
  mutate(no_mobile_dummy = ifelse(cellphone == "FALSE", 1, 0))

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(RU1402P2 %>% select(Address, no_mobile_dummy), by = "Address") %>%
  mutate(no_mobile_dummy = ifelse(is.na(no_mobile_dummy), 0, no_mobile_dummy))

RU1402P2 <- RU1402P2 %>%
  mutate(lack_appliances = ifelse((TV + washingmachine + refridgerator + freezer + stove) < 3, 1, 0))

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(RU1402P2 %>% select(Address, lack_appliances), by = "Address") %>%
  mutate(lack_appliances = ifelse(is.na(lack_appliances), 0, lack_appliances))

In this section, we calculate the multidimensional poverty index by considering an equally weighted average of the variables mentioned above for each province. A higher index value indicates a worse poverty situation in that province.

RU1402P1_dum <- RU1402P1_dum %>%
  rename(
    Below_income_dummy = income_poverty_dummy,
    Below_consumption_dummy = consumption_poverty_dummy ,
    Head_educ_dummy = head_education_dummy,
    Child_educ_dummy = child_education_dummy,
    Ownership_dummy = tenure,
    Living_space_dummy = space_issue,
    Internet_dummy = no_internet_dummy,
    Cellphone_dummy = no_mobile_dummy,
    Appliance_dummy = lack_appliances)

MDP_index <- RU1402P1_dum %>%
  select(Address, member,
    Below_income_dummy, Below_consumption_dummy,
    Head_educ_dummy, Child_educ_dummy,
    Ownership_dummy, Living_space_dummy,
    Internet_dummy, Cellphone_dummy, Appliance_dummy)

MDP_index <- MDP_index %>%
  mutate(
    province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province),

    # Monetary Dimension (Total Weight = 1/3)
    Monetary_Score = (Below_consumption_dummy * 1/6 + Below_income_dummy * 1/6),
    
    # Education Dimension (Total Weight = 1/3)
    Education_Score = (Head_educ_dummy * 1/6 + Child_educ_dummy * 1/6),

    # Living Standards Dimension (Total Weight = 1/3)
    Living_Standards_Score = (Ownership_dummy * 1/15 + Living_space_dummy * 1/15 +
                              Internet_dummy * 1/15 + Cellphone_dummy * 1/15 +
                              Appliance_dummy * 1/15),
    # Total MPI Score
    MPI_Score = Monetary_Score + Education_Score + Living_Standards_Score
  ) %>%
  select(Address, member, province, Monetary_Score, Education_Score, Living_Standards_Score, MPI_Score)

MDP_index <- MDP_index %>%
  left_join(RU1402P1_getweight, by = "Address")
# Calculate the weighted average of MPI scores for each province
MDP_index_province <- MDP_index %>%
  group_by(province) %>%
  dplyr::summarize(weighted_average_MPI = sum(MPI_Score * Weight, na.rm = TRUE) / sum(Weight, na.rm = TRUE)) %>% 
  arrange(weighted_average_MPI)

kable(
  MDP_index_province %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 3))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Multi Dimentional Poverty Index (Household)" = ncol(MDP_index_province))) %>%
  column_spec(1, background = "mediumslateblue") %>%
  column_spec(2, background = "mediumspringgreen")

Multi Dimentional Poverty Index (Household)
province	weighted_average_MPI
Chaharmahal and Bakhtiari	0.131
Mazandaran	0.136
Yazd	0.149
Isfahan	0.151
Bushehr	0.178
Fars	0.185
Zanjan	0.194
Markazi	0.196
Ardabil	0.197
Qazvin	0.199
East Azerbaijan	0.212
Kohgiluyeh and Boyer-Ahmad	0.215
Gilan	0.218
South Khorasan	0.223
Khuzestan	0.224
Ilam	0.227
Kurdistan	0.231
North Khorasan	0.232
Kermanshah	0.233
Hamadan	0.241
Tehran	0.242
Razavi Khorasan	0.250
Alborz	0.251
Lorestan	0.255
Qom	0.256
Semnan	0.258
Golestan	0.261
Hormozgan	0.278
Kerman	0.278
West Azerbaijan	0.306
Sistan and Baluchestan	0.431

3 Provinces with highest MPI Score are ‘Sistan and Baluchestan’ (0.43) and ‘West Azerbaijan’ (0.30) and ‘Kerman’ (0.27).

3 Provinces with lowest MPI Score are ‘Chaharmahal and Bakhtiari’ (0.13) and ‘Mazandaran’ (0.13) and ‘Yazd’ (0.14).

Inequality Analysis

1. Estimation of Inequality Metrics

Gini Coefficient

Import the Data

welfare_dta <- read.csv('/Users/mahan/Desktop/TEIAS/3th-Semester/DevEcon1/Project2/Data/PaygahRefah/nemone_2_darsadi_1402.csv')

In this Step, We calculate the Revenue for each ID.

Since the “Daramad” column is NA for many observations, we combine “CardPerMonth_1402” column with “Daramad” column to create an alternative proxy for income.

Revenue <- welfare_dta %>%
  select(id, SabteAhval_provincename, CardPerMonth_1402, Daramad) %>%
  filter(!(is.na(CardPerMonth_1402) & is.na(Daramad))) %>%
  mutate(Total_Revenue = (CardPerMonth_1402 * 12) + Daramad)

We also consider assets as an indication of wealth distribution, since it is common to estimate the Gini Coefficient for both wealth and income.

Asset <- welfare_dta %>%
  select(id, SabteAhval_provincename, CarsPrice, Bourse_NetPortfoValue) %>%
  filter(!(is.na(CarsPrice) & is.na(Bourse_NetPortfoValue))) %>%
  mutate(Total_Asset = CarsPrice + Bourse_NetPortfoValue)

In this part, we estimate the Gini Coefficient for both wealth and income from “Iranian welfare database” and plot the Lorenz Curve.

# Compute Lorenz curves
lorenz_revenue <- Lc(Revenue$Total_Revenue)
lorenz_asset <- Lc(Asset$Total_Asset)
lorenz_HEIS <- Lc(combined_monthlyinc$total_netincome_month/combined_monthlyinc$member)

# Compute Gini coefficients
gini_revenue <- round(Gini(Revenue$Total_Revenue), 5) 
gini_asset <- round(Gini(Asset$Total_Asset), 5)
gini_HEIS <- round(Gini(combined_monthlyinc$total_netincome_month/combined_monthlyinc$member), 5)

# Plot Lorenz curves
plot(lorenz_revenue,
     main = "Lorenz Curves for Revenue and Asset Distribution",
     xlab = "Cumulative Share of Population",
     ylab = "Cumulative Share of Revenue & Asset",
     col = "blue",
     lwd = 2)

# Add Lorenz curve for Asset
lines(lorenz_asset, col = "red", lwd = 2)

lines(lorenz_HEIS, col = "darkgreen", lwd = 3)

# Add the line of perfect equality
abline(0, 1, col = "black", lty = 2)

# Add legend
legend("topleft",
       legend = c("Revenue - Welfare DataBase", "Asset - Welfare DataBase", "Revenue - HEIS", "Line of Equality"),
       col = c("blue", "red","darkgreen", "black"),
       lty = c(1, 1, 1, 1),
       lwd = c(2, 2, 2, 2),
       bg = "white")
# Add Gini coefficient text directly on the plot
text(0.8, 0.25, paste("Gini (Revenue - WelfareDataBase):", gini_revenue), col = "blue", cex = 0.7)
text(0.8, 0.15, paste("Gini (Asset - WelfareDataBase):", gini_asset), col = "red", cex = 0.7)
text(0.5, 0.40, paste("Gini (Revenue - HEIS):", gini_HEIS), col = "darkgreen", cex = 0.7)

The primary reason for the high value of the Gini coefficient is that we used a proxy variable, which may not accurately reflect the distribution of income and wealth in society. Additionally, the estimation of this coefficient was based on only 2% of data from the Iranian Welfare Database, which may not adequately represent the entire population and could be biased.

Top 1% Share

# sorted_Revenue <- Revenue %>%
#   arrange(-Total_Revenue)
# 
# top1perc_Revenue <- sorted_Revenue %>%
#   filter(row_number() <= ceiling(0.01 * n()))
# 
# top1_perc_Revenue <- sum(top1perc_Revenue$Total_Revenue,na.rm=TRUE)/sum(sorted_Revenue$Total_Revenue, na.rm = TRUE)
# # message(glue("The top 1% of revenue accounts for {percent(top1perc_Revenue, accuracy = 0.01)} of the total revenue."))
# # top_1perc_Revenue

# sorted_Asset <- Asset %>% 
#   arrange(-Total_Asset)
#  
# top_1perc_Asset <- sorted_Asset %>% 
#    filter(row_number() <= ceiling(0.01 * n()))
#   #dplyr:: summarize(Top_1 = sum(Total_Asset)/ sum(sorted_Asset$Total_Asset),na.rm = TRUE)
# 
# top1perc_Asset <- sum(top_1perc_Asset$Total_Total_Asset,na.rm=TRUE)/sum(sorted_Asset$Total_Asset, na.rm = TRUE)

library(shiny)
library(leaflet)
library(sf)
library(dplyr)
library(htmltools)
library(htmlwidgets)

# Load the shapefile and rename the province column
iran_map <- st_read("/Users/mahan/Desktop/TEIAS/3th-Semester/DevEcon1/Project2/Mapping/iranmap/irn_admbnda_adm1_unhcr_20190514.shp")

## Reading layer `irn_admbnda_adm1_unhcr_20190514' from data source 
##   `/Users/mahan/Desktop/TEIAS/3th-Semester/DevEcon1/Project2/Mapping/iranmap/irn_admbnda_adm1_unhcr_20190514.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 31 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 44.03189 ymin: 25.05932 xmax: 63.33327 ymax: 39.78165
## Geodetic CRS:  WGS 84

iran_map <- rename(iran_map, province = ADM1_EN)

# Function to generate the Leaflet map
generateLeafletMap <- function(data, metric, metric_label) {
  # Set labels for map
  labels <- sprintf("<strong>%s</strong><br/>%s: %s", 
                    data$province, 
                    metric_label, 
                    format(data[[metric]], big.mark = ",", scientific = FALSE))
  
  # Create the color palette
  pal <- colorNumeric(palette = "PuBu", domain = data[[metric]])
  
  # Generate the map
  leaflet(data) %>%
    addProviderTiles(providers$CartoDB.Positron) %>%
    addPolygons(
      fillColor = ~pal(data[[metric]]),
      fillOpacity = 1,
      weight = 0.3,
      smoothFactor = 0.3,
      label = lapply(labels, htmltools::HTML),
      labelOptions = labelOptions(
        style = list("font-family" = "serif", "font-weight" = "bold", "padding" = "3px 8px",
                     "box-shadow" = "3px 3px rgba(0,0,0,0.25)",
                     "font-size" = "13px", "border-color" = "rgba(0,0,0,0.5)"),
        direction = "auto"
      ),
      highlightOptions = highlightOptions(color = "darkred", weight = 3, bringToFront = TRUE)
    ) %>%
    addLegend(
      pal = pal,
      values = data[[metric]],
      position = "topright",
      title = metric_label
    )
}

# UI
ui <- fluidPage(
  titlePanel("Atlas Poverty of Iran"),
  sidebarLayout(
    sidebarPanel(
      selectInput("selectedProvince", "Choose a Province", 
                  choices = c("Iran" = "All", unique(iran_map$province))),
      selectInput("selectedDataset", "Choose a Poverty Measure:", 
                  choices = c(
                    "Absolute Poverty Line" = "TC_province1",
                    "Relative Poverty Line (Traditional)" = "relative_line",
                    "Relative Poverty Line (Hybrid)" = "hybrid_poverty_line",
                    "Income Poverty Line" = "income_poverty_line",
                    "Poverty Gap Index" = "poverty_gap_index",
                    "Multidimensional Poverty Index" = "MDP_index_province"
                  )),
      selectInput("selectedMetric", "Choose an Approach:", choices = NULL),
      selectInput("urbanRural", "Choose an Area Type:", 
                  choices = c("All", "Urban", "Rural"), 
                  selected = "All"),
      actionButton("resetMap", "Reset View"),
      width = 3
    ),
    mainPanel(
      leafletOutput("mapDisplay", height = "800px"),
      width = 9
    )
  )
)

# Server
server <- function(input, output, session) {
  # Update the 'Choose an Approach' dropdown dynamically based on selected dataset
  observe({
    selected_dataset <- input$selectedDataset
    
    metric_choices <- switch(selected_dataset,
      "TC_province1" = c("Method 1" = "First_Method", "Method 2" = "Second_Method"),
      "relative_line" = c("50 % of Median" = "Half_Median"),
      "hybrid_poverty_line" = c("Cost of bottom 20%" = "cost_new_bundle"),
      "income_poverty_line" = c("Income Poverty Line 1" = "income_poverty_line1", "Income Poverty Line 2" = "income_poverty_line2"),
      "poverty_gap_index" = c("PGI Method 1" = "PGI_1", "PGI Method 2" = "PGI_2"),
      "MDP_index_province" = c("Equally Weighted MPI" = "weighted_average_MPI")
    )
    
    updateSelectInput(session, "selectedMetric", choices = metric_choices)
    
    # Update the 'Choose an Area Type' dropdown based on selected dataset
    if (selected_dataset == "MDP_index_province") {
      updateSelectInput(session, "urbanRural", choices = c("All"), selected = "All")
    } else {
      updateSelectInput(session, "urbanRural", choices = c("All", "Urban", "Rural"), selected = "All")
    }
  })
  
  # Render the Leaflet map
  output$mapDisplay <- renderLeaflet({
    req(input$selectedMetric)
    
    selected_dataset <- input$selectedDataset
    metric <- input$selectedMetric
    metric_label <- names(which(sapply(switch(selected_dataset,
      "TC_province1" = c("First Method" = "First_Method", "Second Method" = "Second_Method"),
      "relative_line" = c("Half of Median Income" = "Half_Median"),
      "hybrid_poverty_line" = c("Cost of New Bundle" = "cost_new_bundle"),
      "income_poverty_line" = c("Income Poverty Line 1" = "income_poverty_line1", "Income Poverty Line 2" = "income_poverty_line2"),
      "poverty_gap_index" = c("PGI Method 1" = "PGI_1", "PGI Method 2" = "PGI_2"),
      "MDP_index_province" = c("Weighted Average MPI" = "weighted_average_MPI")
    ), identical, metric)))
    
    # Filter dataset
    data <- switch(selected_dataset,
      "TC_province1" = TC_province1,
      "relative_line" = relative_line,
      "hybrid_poverty_line" = hybrid_poverty_line,
      "income_poverty_line" = income_poverty_line,
      "poverty_gap_index" = poverty_gap_index,
      "MDP_index_province" = MDP_index_province
    )
    
    # Merge with shapefile
    filtered_data <- merge(iran_map, data, by = "province")
    
    # Apply filters for area type and province
    if (input$urbanRural != "All" && selected_dataset != "MDP_index_province") {
      filtered_data <- filtered_data[filtered_data$Urban_Rural == input$urbanRural, ]
    }
    if (input$selectedProvince != "All") {
      filtered_data <- filtered_data[filtered_data$province == input$selectedProvince, ]
    }
    
    # Generate the map
    generateLeafletMap(filtered_data, metric, metric_label)
  })
  
  # Reset map view
  observeEvent(input$resetMap, {
    updateSelectInput(session, "selectedProvince", selected = "All")
    updateSelectInput(session, "urbanRural", selected = "All")
    leafletProxy("mapDisplay", session) %>%
      setView(lng = 110, lat = 32, zoom = 4)
  })
}

# Run the app
shinyApp(ui = ui, server = server)

Shiny applications not supported in static R Markdown documents

# Save the HTML widget
#saveWidget(app, "/Users/mahan/Desktop/shiny_app_output.html")

Project II - Poverty and Inequality

Mahan Rezaei

2025-01-02

Poverty

Inequality

Necessary Packages

Import the Data

Data Cleaning Process:

Poverty Analysis

1. Estimation of the Absolute Poverty Line

سبد غذایی تامین کننده ۲۱۰۰ کیلو کالری در روز (منطبق با الگوی غذایی بالغین)

2. Estimation of the Relative Poverty Line

3. Calculation of the Poverty Gap

4. Multidimensional Poverty Index (MPI)

Inequality Analysis

1. Estimation of Inequality Metrics

Import the Data