“Our lives begin to end the day we become silent about things that matter.”

— Martin Luther King Jr.

Poverty and Inequality are important concepts in economics and social sciences that are closely related but distinct. These two concepts are of great importance in development economics because they are among the topics that development economists research on, as well as their causes and the consequences of various policies in both developed and developing countries. Both indicate deprivation in access to resources, opportunities, and outcomes among certain individuals or groups.

Poverty

Poverty is a state or condition in which an individual lacks the sufficient financial resources and essentials for a basic standard of living.

Causes of Poverty include market failures like:

  • Labor Market Frictions

  • Credit Constraints

  • Human Capital Development

Inequality

Inequality refers to the unequal distribution of income, wealth, opportunities, and access to resources across individuals or groups within a society.

Causes of Inequality include:

  • Unequal Access to Resources

  • Differences in Education and Skills

  • Wealth Accumulation


This project aims to analyze the socioeconomic landscape of Iran by examining both poverty and inequality. It includes estimating poverty metrics such as the poverty line, poverty gap, and multidimensional poverty index (MPI), as well as inequality indicators like the Gini coefficient and the share of the top 1%.

Necessary Packages

First, we install the necessary packages for this project.

Then, we load these packages.

knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
library(dplyr)
library(data.table)
library(knitr)
library(ggplot2)
library(tidyverse)
library(readxl)
library(tidyr)
library(shiny)
library(leaflet)
library(mdbr)
library(Hmisc)
library(leaflet)
#library(tigris)
library(sf)
library(readxl)
library(tidyverse)
library(shiny)
library(labelled)
library(RSocrata)
library(shinythemes)
library(RColorBrewer)
library(rsconnect)
library(haven)
#library(mapview)
library(forcats)
library(writexl)
library(psych)
library(foreign)
library(kableExtra)
library(htmltools)
library(ineq)
library(glue)
library(scales)

Import the Data

Here, we load data by specifying our file path. The data set used for this project is the Households Expenditure and Income Survey (HEIS) for the year 1402, which contains information on household expenditures on durable and non-durable goods, sources of income, residence, and other factors in different provinces, collected in different tables.

The HEIS data set displays urban and rural households in different tables. Each table has its own aim to indicate some dimensions of households.

HEIS 1402 includes the following tables:

  • U1402Data & R1402Data: survey details section

  • U1402P1 & R1402P1: Social characteristics of household members

  • U1402P2 & R1402P2: Location details

  • U1402P3S & R1402P3S (S = 01, \(\cdots\) , 12 , S \(\neq\) 4) : Household expenditure on non-durable goods

  • U1402P3S04 & R1402P3S04: Housing costs section

  • U1402P3S13 & R1402P3S13: Household expenditure on durable goods

  • U1402P3S14 & R1402P3S14: Household investment in last 12 months

  • U1402P4S1 & R1402P4S1: Income of employed household members from public sector

  • U1402P4S2 & R1402P4S2: Income of employed household members from private sector

  • U1402P4S3 & R1402P4S3: Miscellaneous household income

  • U1402P4S4 & R1402P4S4: Household income from government Cash Transfers (Subsidy)

In this part, we extract all the tables from the HEIS data set and add a new column called “Urban_Rural”.

# Removing the empty tables from HEIS 1402
HEIS[["R1402P3S10"]] <- NULL
HEIS[["U1402P3S10"]] <- NULL

origin_names <- names(HEIS)
# Iterating over data frames in HEIS and modifying them 
HEIS <- lapply(seq_along(HEIS), function(idx) {
    name <- origin_names[idx]
    #Extracting the first character of the name
    first_char <- substr(name, 1, 1)
    #Checking if the first character of the name is either "R" (Rural) or "U" (Urban).
    if (first_char == "R" || first_char == "U") {
      # Adding a new column Urban_Rural and filling with "R" and "U"
        HEIS[[idx]]$Urban_Rural <- first_char}
    return(HEIS[[idx]])})
# Reassigning original names
names(HEIS) <- origin_names
# Assigning each data frame back to the global environment
invisible(lapply(names(HEIS), function(x) assign(x,HEIS[[x]],envir=.GlobalEnv)))

In this part, we combine the rural and urban tables with the same names.

# Function to merge R and U tables and rename them to 'RU'
merge_tables <- function() {
  # Loop through each name and check for the pattern
  for (name in origin_names) {
    # If a table starts with 'R', find the corresponding 'U' table
    if (substr(name, 1, 1) == "R") {
      u_name <- paste0("U", substr(name, 2, nchar(name)))
      # Merge the R and U tables if the U table exists
      if (u_name %in% origin_names) {
        # Merge the R and U tables
        merged_data <- rbind(HEIS[[name]], HEIS[[u_name]])
        # Create the new name with 'RU' prefix
        new_name <- paste0("RU", substr(name, 2, nchar(name)))
        # Assign the merged data to the new name
        assign(new_name, merged_data, envir = .GlobalEnv)
        # Remove the original R and U tables
        rm(list = c(name, u_name), envir = .GlobalEnv)}}}}
# Call the function
merge_tables()
# Remove the unnecarry data and function
rm(HEIS)

Now, the data tables are available to start the cleaning process.

Data Cleaning Process:

As reported in the project text, we first create a column that represents the weight of household members and add it to our table.

  • Head of Household (HH): \(\text{Weight} = 1\)

  • Non-head above 18 years old: \(\text{Weight} = 0.8\)

  • Non-head under 18 years old: \(\text{Weight} = 0.5\)

# Creating the Weight column in our data set
RU1402P1 <- RU1402P1 %>%
  mutate(Weight = case_when(
    DYCOL03 == 1 ~ 1,                         # If DYCOL03 == 1, assign 1 to weight
    DYCOL03 != 1 & DYCOL05 >= 18 ~ 0.8,       # If DYCOL03 != 1 and DYCOL05 >= 18, assign 0.8
    DYCOL03 != 1 & DYCOL05 < 18 ~ 0.5,        # If DYCOL03 != 1 and DYCOL05 < 18, assign 0.5
    TRUE ~ NA_real_))

In this section, we label the variables in the tables to enhance file readability and also create a province variable and add it to the tables.

# Part 0
# Assign each province to their codes
Province <- c(Markazi = "00", Ardabil = "24", Bushehr = "18", `Chaharmahal and Bakhtiari` = "14",
  `East Azerbaijan` = "03", Fars = "07", Gilan = "01", Golestan = "27",
  Hamadan = "13", Hormozgan = "22", Ilam = "16", Isfahan = "10",
  Kerman = "08", Kermanshah = "05", Khuzestan = "06", `Kohgiluyeh and Boyer-Ahmad` = "17",
  Kurdistan = "12", Lorestan = "15", Alborz = "30", Mazandaran = "02",
  `North Khorasan` = "28", Qazvin = "26", Qom = "25", `Razavi Khorasan` = "09",
  Semnan = "20", `Sistan and Baluchestan` = "11", `South Khorasan` = "29", Tehran = "23",
  `West Azerbaijan` = "04", Yazd = "21", Zanjan = "19")
  
# This data frame shows specifications of the survey
RU1402Data <- RU1402Data %>% 
  rename(khanevartype = NoeKhn)  %>% 
  # A new column "province" is added to the data frame which contains the mapped names of province
  mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))

##############################
# Part 1
# Relation to head
relation <- c(head="1", spouse="2", child="3", childinlaw="4", grandchild="5", parent="6", sibling="7", relative="8", nonrelative="9")
# Gender dummy variable
gender <- c(Male="1", Female="2")
# Literacy dummy variable 
literacy <- c(literate="1", illiterate="2")
# Dummy variable for answering Questions
yesno <- c(Yes="1", No="2")
# Degree of Education dummy variable
education <- c(Elemantary="1", Secondary="2", HighSchool="3", Diploma="4", College="5", Bachelor="6", Master="7", PhD="8", Other="9")
# Activity Status
occupation <- c(employed="1", unemployed="2", IncomeWOJob="3", Student="4", Housewife="5", Other="6")
# Marital Status
marital <- c(Married ="1", Widowed="2", Divorced="3", Single="4")

# This data frame shows social characteristics of family members (Renaming Columns)
RU1402P1 <- RU1402P1 %>% 
  rename(
    member = DYCOL01,
    relation = DYCOL03,
    gender = DYCOL04,
    age = DYCOL05,
    literacy = DYCOL06,
    is_studying = DYCOL07,
    education_deg = DYCOL08,
    occupational_stat = DYCOL09,
    marital_stat = DYCOL10) %>%  
  mutate(across(where(is.character), as.integer),
         across(c(relation,gender,literacy,is_studying,education_deg,occupational_stat,marital_stat), as.factor),
         relation = fct_recode(relation, !!!relation), 
         gender = fct_recode(gender, !!!gender),
         literacy = fct_recode(literacy, !!!literacy), 
         is_studying = fct_recode(is_studying, !!!yesno),
         education_deg = fct_recode(education_deg, !!!education), 
         occupational_stat = fct_recode(occupational_stat, !!!occupation),
         marital_stat = fct_recode(marital_stat, !!!marital))
##############################
# Part 2
# Type of occupation of the residence
tenure <- c(OwnedEstateLand="1", OwnedEstate="2", Rent="3", Mortgage="4", Service="5", Free="6", Other="7")
# Major materials
material <- c(MetalBlock="1", BrickWood="2", Cement="3", Brick="4", Wood="5", WoodKesht="6", KeshtGel="7", Other="8")
# Type of used fuel in the cooking
fuel <- c(Oil="1", Gasoline="2", LiquidGas="3", NaturalGas="4", Electricity="5", Wood="6", AnimalOil="7", Coke="8", Other="9", None="10" )
# Type of used fuel in heating
fuel1 <- c(Oil="11", Gasoline="12", LiquidGas="13", NaturalGas="14", Electricity="15", Wood="16", AnimalOil="17", Coke="18", Other="19", None="20" )
# Type of used fuel in providing Hot Water
fuel2 <- c(Oil="21", Gasoline="22", LiquidGas="23", NaturalGas="24", Electricity="25", Wood="26", AnimalOil="27", Coke="28", Other="29", None="30" )

# This data frame shows Residence details (Renaming Columns)
RU1402P2 <- RU1402P2 %>% 
  rename(
    tenure = DYCOL01,
    room = DYCOL03,
    space = DYCOL04,
    construction = DYCOL05,
    material = DYCOL06,
    vehicle = DYCOL07,
    motorcycle = DYCOL08,
    bicycle = DYCOL09,
    radio = DYCOL10,
    radiotape = DYCOL11,
    TVbw = DYCOL12,
    TV = DYCOL13,
    VHS_VCD_DVD = DYCOL14,
    computer = DYCOL15,
    cellphone = DYCOL16,
    freezer = DYCOL17,
    refridgerator = DYCOL18,
    fridge = DYCOL19,
    stove = DYCOL20,
    vacuum = DYCOL21,
    washingmachine = DYCOL22,
    sewingmachine = DYCOL23,
    fan = DYCOL24,
    evapcoolingportable = DYCOL25,
    splitportable = DYCOL26,
    dishwasher = DYCOL27,
    microwave = DYCOL28,
    none = DYCOL29,
    pipewater = DYCOL30,
    electricity = DYCOL31,
    pipegas = DYCOL32,
    telephone = DYCOL33,
    internet  = DYCOL34,
    bathroom = DYCOL35,
    kitchen = DYCOL36,
    evapcooling = DYCOL37,
    centralcooling = DYCOL38,
    centralheating = DYCOL39,
    package = DYCOL40,
    split = DYCOL41,
    wastewater = DYCOL42,
    cookingfuel = DYCOL43,
    heatingfuel = DYCOL44,
    waterheatingfuel = DYCOL45) %>% 
  mutate(across(where(is.character), as.integer),
         across(c(tenure,material,cookingfuel,heatingfuel,waterheatingfuel), as.factor),
         tenure = fct_recode(tenure, !!!tenure), 
         material = fct_recode(material, !!!material),
         cookingfuel = fct_recode(cookingfuel, !!!fuel), 
         heatingfuel = fct_recode(heatingfuel, !!!fuel1),
         waterheatingfuel = fct_recode(waterheatingfuel, !!!fuel2),
         across(vehicle:wastewater, ~!is.na(.x)))
##############################
# Part 3, Table 1
# This data frame shows food and tobacco expenditures (Renaming Columns)
RU1402P3S01 <- RU1402P3S01 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    gram = DYCOL03,
    kilogram = DYCOL04,
    price = DYCOL05,
    value = DYCOL06 ) %>% 
  mutate(
    across(c(price,value,kilogram),  ~ as.numeric(as.character(.x)) ),
    table = 1L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))                                                                                                                                                            
# Part 3, Table 2
# This data frame shows drink expenditures (Renaming Columns)
RU1402P3S02 <- RU1402P3S02 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    gram = DYCOL03,
    kilogram = DYCOL04,
    price = DYCOL05,
    value = DYCOL06 ) %>% 
  mutate(
    table = 2L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))  
  


# Part 3, Table 3
RU1402P3S03 <- RU1402P3S03 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03 ) %>% 
  mutate(
    table = 3L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))  
# Part 3, Table 4
# This data frame shows the housing costs (Renaming columns)
RU1402P3S04 <- RU1402P3S04 %>% 
  rename(
    goods_code = DYCOL01,
    mortgage = DYCOL02,
    provision_type = DYCOL03,
    value = DYCOL04 ) %>% 
  mutate(table = 4L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free"))) 
# Part 3, Table 5
RU1402P3S05 <- RU1402P3S05  %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03 ) %>% 
  mutate(table = 5L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 6
RU1402P3S06 <- RU1402P3S06  %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(table = 6L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 7
RU1402P3S07 <- RU1402P3S07 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 7L)  %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))

# Part 3, Table 8
RU1402P3S08 <- RU1402P3S08 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 8L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 9
RU1402P3S09 <- RU1402P3S09 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 9L) %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 11
RU1402P3S11 <- RU1402P3S11 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 11L)  %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))
# Part 3, Table 12
RU1402P3S12 <- RU1402P3S12 %>% 
  rename(
    goods_code = DYCOL01,
    provision_type = DYCOL02,
    value = DYCOL03) %>% 
  mutate(
    table = 12L)  %>% 
  mutate(provision_type=factor(provision_type, levels = c(1,2,3,4,5,6,7,8),
                                labels = c("purchased",
                                           "homemade",
                                            "publicservice",
                                            "cooperativeservice",
                                            "privateservice",
                                            "agriculture",
                                            "nonagriculture",
                                            "free")))

Here, we append the “Weight” column from RU1402P1 based on their “Address” to datasets which start with “RU1402P3S”

# Merging Process
RU1402P1_getweight <- select(RU1402P1, Address, Weight)
# Generate potential data set names
datasetP3_names <- paste0("RU1402P3S", sprintf("%02d", 1:12))
# Filter names of datasets that actually exist in this environment
P3_datasets <- Filter(function(name) exists(name, envir = .GlobalEnv), datasetP3_names)
# Apply the transformation and merge with RU1402Data_selected to each data set
P3weight_datasets <- lapply(mget(P3_datasets, envir = .GlobalEnv), function(df) {
  df %>% 
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province)) %>%
    left_join(RU1402P1_getweight, by = "Address")})
invisible(lapply(names(P3weight_datasets), function(x) assign(x,P3weight_datasets[[x]],envir=.GlobalEnv)))
# Removing the unnecessary data sets
#rm(RU1402P1_getweight)
rm(P3weight_datasets)

Poverty Analysis

1. Estimation of the Absolute Poverty Line

Based on an article from FAO about “Impacts of Policies on Poverty (Absolute Poverty Lines)”, there are some methods to calculate the absolute poverty line.

  • Food energy intake (FEI): Enough food to meet energy requirements

  • Cost of basic needs (CBN): A consumption bundle with food and nonfood

  • Consumption insufficiency method (CI): All necessary goods and services to satisfy the basic needs

  • Budget standard method (BS): All necessary goods and services to satisfy the basic needs \(+\) basic minimum for social lives

All of the above methods define a Set of goods that would ensure a standard of living and convert this set of goods into Monetary Values. The final aim is to define a poverty line (Threshold) below which an individual is considered poor.

The food energy intake (FEI) methodology defines the minimum food intake needed by a given individual to lead a decent life. By this definition, those people who can’t afford the cost of the FEI are poor. In addition, by definition, FEI is an absolute concept of poverty that is entirely food-based. This measure is a good indicator of poverty in those countries where a large part of the population spends a significant fraction of their budget on food (Specially for less developed economies).

In this part, we only consider the first method which is FEI. Here, we calculate the absolute poverty line based on the food bundle proposed by the Ministry of Health, providing 2,100 Kilo Calories per day and the required protein contents.

The following table is a food bundle designed by Iran’s Ministry of Health, which is by the dietary pattern of an adult and provides 2100 kilo calories daily.

سبد غذایی تامین کننده ۲۱۰۰ کیلو کالری در روز (منطبق با الگوی غذایی بالغین)

ماده غذایی مقدار سبد غذایی ماهانه
نان ۸ کیلوگرم
برنج ۳ کیلوگرم
ماکارونی (رشته فرنگی) ۷۰۰ گرم (یک بسته)
عدس (دیگر حبوبات) ۶۰۰ گرم
سیب زمینی (نخود سبز) ۱/۵ کیلوگرم
شیر ۷ کیسه ۱ لیتری
پنیر ۴۵۰ گرم
ماست ۳ کیلوگرم
گوشت قرمز ۱/۲ کیلوگرم
گوشت سفید (ماهی منجمد) ۱/۵ کیلوگرم (یک عدد مرغ)
تخم مرغ ۱۰ عدد
روغن مایع ۹۰۰ سی سی
شکر (قند یا عسل یا مربا) ۱ کیلوگرم
میوه ۶۰ واحد
سبزی های برگ سبز ۶۰ واحد
دیگر سبزی ها ۶۰ واحد

First, We extract food items from data and label them.

# For the materials that has replacement we add | instead &
RU1402P3S01 <- RU1402P3S01 %>% 
  mutate(label = case_when(
    goods_code >= 11141 & goods_code <= 11156 ~ "bread",
    goods_code >= 11111 & goods_code <= 11118 ~ "rice",
    goods_code == 11164 ~ "spaghetti", #Considering (noodle)
    goods_code == 11731 | goods_code == 11725 ~ "potato", # Considering (green pea)
    goods_code >= 11411 & goods_code <= 11414 ~ "milk",
    goods_code >= 11424 & goods_code <= 11426 ~ "yoghurt",
    (goods_code == 11241) & (goods_code >= 11211 & goods_code <= 11224) ~ "red meat",
    goods_code >= 11441 & goods_code <= 11443 ~ "egg",
    goods_code >= 11428 & goods_code <= 11431 ~ "cheese",
    goods_code >= 11611 & goods_code <= 11643 ~ "fruit",
    (goods_code >= 11711 & goods_code <= 11715) | (goods_code >= 11721 & goods_code <= 11753) & goods_code != 11731 ~ "vegetables", # Considering both green and non green vegetables
    goods_code == 11533 ~ "oil",
    goods_code == 11812 | goods_code == 11811 | goods_code == 11821 | goods_code == 11823 ~ "sugar", # Considering (Jam, cube sugar, honey)
    goods_code == 11768 | (goods_code >= 11761 & goods_code <= 11769) ~ "lentil", #considering other legumes)
    (goods_code == 11312 & goods_code == 11314) | (goods_code >= 11231 & goods_code <= 11239) ~ "white meat", # considering fish
    TRUE ~ NA_character_))

In this part, we modify the “Kilogram” column and generate a data frame to indicate the above food bundle table. Here we need to have some assumptions about some foods. Based on the search on internet, we create our assumption (there may not be the exact weight of each food but it’s a good estimation):

  • One medium egg: 50 grams

  • One unit fruit: 80 grams

  • One unit vegetable: 75 grams

  • One liter oil: 900 grams

P3S01_ModifiedKg <- RU1402P3S01 %>%
  mutate(gram = ifelse(is.na(gram), 0, gram/1000)) %>% 
  mutate(kilogram = ifelse(is.na(kilogram), 0, kilogram)) %>% 
  mutate(kilogram = kilogram + gram) %>% 
  select(-gram)
# Generating the food bundle based on the above table
food_bundle <- data.frame(
  label = c("bread", "rice", "spaghetti", "potato", "lentil", "milk", "yoghurt",
            "red meat", "white meat", "egg", "cheese", "fruit", "vegetables", "oil", "sugar"),
  needs_kg = c(8, 3, 0.7, 1.5, 0.6, 7, 3, 1.2, 1.5, 0.5, 0.45, 4.8, 9, 0.8, 1))

Here, We use two methods to calculate the weighted mean price.\[\text{Weighted Mean Price}_1 = \frac{\sum_{i=1}^{n} (\text{weight}_i \times \text{value}_i)}{\sum_{i=1}^{n} (\text{weight}_i \times \text{kilogram}_i)}\] \[\text{Weighted Mean Price}_2 = \frac{\sum_{i=1}^{n} (\text{weight}_i \times \text{price}_i)}{\sum_{i=1}^{n} (\text{weight}_i)}\]

The key differences between these two methods are:

  1. The first method (\(\text{Weighted Mean Price}_1\)) is useful when dealing with heterogeneous items with varying prices and quantities.

  2. The second method (\(\text{Weighted Mean Price}_2\)) is useful when comparing prices for the same product in different markets.

We estimate the cost of this bundle for urban and rural regions in each province to determine the absolute poverty line.

The following table summarizes the results from the Absolute Food Poverty Line.

Absolute Poverty Line (Individual Level - Rial)
province Urban_Rural First_Method Second_Method
Markazi Rural 10,892,461 12,159,927
Markazi Urban 11,753,554 13,336,748
Gilan Rural 12,321,780 14,695,402
Gilan Urban 13,071,919 15,646,669
Mazandaran Rural 13,241,551 15,454,503
Mazandaran Urban 13,719,724 15,408,803
East Azerbaijan Rural 11,192,237 12,163,803
East Azerbaijan Urban 11,736,456 12,947,565
West Azerbaijan Rural 10,385,923 11,234,672
West Azerbaijan Urban 10,782,061 11,785,844
Kermanshah Rural 10,258,364 11,075,460
Kermanshah Urban 10,973,332 11,828,619
Khuzestan Rural 11,202,407 11,730,458
Khuzestan Urban 12,103,919 13,429,049
Fars Rural 11,499,322 12,293,533
Fars Urban 11,973,511 13,069,107
Kerman Rural 10,797,652 11,694,707
Kerman Urban 11,217,183 12,528,833
Razavi Khorasan Rural 10,262,250 11,176,523
Razavi Khorasan Urban 10,978,742 12,250,785
Isfahan Rural 10,612,120 11,612,185
Isfahan Urban 11,066,719 12,484,515
Sistan and Baluchestan Rural 11,959,658 13,429,793
Sistan and Baluchestan Urban 11,691,361 13,343,773
Kurdistan Rural 10,331,593 11,321,938
Kurdistan Urban 11,250,252 12,929,946
Hamadan Rural 9,748,355 10,846,703
Hamadan Urban 10,674,406 11,757,492
Chaharmahal and Bakhtiari Rural 11,467,329 11,631,019
Chaharmahal and Bakhtiari Urban 11,360,201 13,056,104
Lorestan Rural 9,576,627 10,383,641
Lorestan Urban 10,340,902 11,164,087
Ilam Rural 9,969,086 10,622,510
Ilam Urban 10,736,474 11,847,126
Kohgiluyeh and Boyer-Ahmad Rural 10,894,116 11,799,621
Kohgiluyeh and Boyer-Ahmad Urban 10,708,175 12,992,214
Bushehr Rural 12,596,052 13,708,934
Bushehr Urban 12,905,540 14,032,785
Zanjan Rural 11,262,237 13,072,803
Zanjan Urban 11,744,954 13,941,275
Semnan Rural 10,236,025 10,835,096
Semnan Urban 10,709,173 11,364,694
Yazd Rural 11,778,132 13,192,722
Yazd Urban 11,632,590 13,215,317
Hormozgan Rural 14,434,812 16,844,817
Hormozgan Urban 14,207,530 15,780,112
Tehran Rural 11,693,750 13,107,647
Tehran Urban 12,815,263 14,361,055
Ardabil Rural 11,446,626 12,279,087
Ardabil Urban 12,168,514 12,689,191
Qom Rural 10,854,294 11,874,798
Qom Urban 10,489,917 11,711,749
Qazvin Rural 11,572,768 12,689,321
Qazvin Urban 11,682,550 13,202,049
Golestan Rural 9,591,023 10,381,877
Golestan Urban 10,305,236 11,065,175
North Khorasan Rural 9,665,898 10,599,748
North Khorasan Urban 10,495,577 11,715,921
South Khorasan Rural 10,849,373 11,314,446
South Khorasan Urban 10,852,866 12,410,013
Alborz Rural 12,037,217 12,855,809
Alborz Urban 12,937,873 13,628,687

In the first method, the cost of food bundle for one person in a month ranges from 9,576,627 Rial in rural areas of ‘Lorestan’ province to 14,434,812 Rial in rural areas of ‘Hormozgan’ province. Provinces with highest poverty line are ‘Hormozgan’ and ‘Mazandaran’ in both areas.

In the second method, the cost of food bundle for one person in a month ranges from 10,381,877 Rial in rural areas of ‘Golestan’ province to 16,844,817 Rial in rural areas of ‘Hormozgan’ province.

Provinces with highest absolute poverty line are ‘Hormozgan’ and ‘Mazandaran’. Provinces with lowest absolute poverty line are ‘Lorestan’ and ‘North Khorasan’.

2. Estimation of the Relative Poverty Line

Based on an article from FAO about “Impacts of Policies on Poverty (Relative Poverty Lines)”, there are some methods to calculate the relative poverty line.

  • Income levels (IL): percentage of mean/median income below

  • Income positions (IP): income quantile below

This approach considers the welfare position of each household about the welfare position of other individuals or households. These methods rely on a threshold that is relative to either income or expenditure.

It is usually taken income as a reference variable. However, in applied works, expenditure is sometimes taken as a more correct welfare indicator than income, as transitory shocks may drive observed income far from its permanent level. Expenditures are thought to better reflect this level of permanent income.

According to the low accuracy of our data in reporting income, we concentrate on expenditure rather than income as a monetary indicator.

  1. Traditional Approach (IL):

We focus on 50 % of the median per capita expenditure as the poverty threshold.

We assess total spending on non-durable goods for individuals in all provinces, including both urban and rural areas.

# Combining the tables which starts with P3S
combined_p3 <- bind_rows(mget(P3_datasets, envir = .GlobalEnv))

# Calculating the total expenditures for individuals in each province, including both urban and rural areas.
total_expenditure <- combined_p3 %>%
  dplyr::group_by(Address, province, Urban_Rural, Weight) %>%
  dplyr::summarize(total_value = sum(value, na.rm = TRUE), .groups = 'drop')

# Calculate 50% of the median in each province, including both urban and rural areas
relative_line <- total_expenditure %>%
  dplyr::group_by(province, Urban_Rural) %>%
  dplyr::summarize(Half_Median = 0.5 * median(total_value, na.rm = TRUE), .groups = 'drop') %>%
  mutate(Urban_Rural = recode(Urban_Rural, "R" = "Rural", "U" = "Urban")) %>%
  arrange(province, Urban_Rural, -Half_Median) 

kable(
  relative_line %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Relative Poverty Line - Traditional Approach (Individual Level - Rial)" = ncol(relative_line))) %>%
  row_spec(which(relative_line$Urban_Rural == "Rural"), background = "violet") %>%
  row_spec(which(relative_line$Urban_Rural == "Urban"), background = "yellow")
Relative Poverty Line - Traditional Approach (Individual Level - Rial)
province Urban_Rural Half_Median
Markazi Rural 47,050,000
Markazi Urban 61,550,000
Gilan Rural 34,165,000
Gilan Urban 54,135,250
Mazandaran Rural 59,686,619
Mazandaran Urban 71,847,500
East Azerbaijan Rural 32,717,500
East Azerbaijan Urban 46,120,500
West Azerbaijan Rural 29,887,000
West Azerbaijan Urban 40,482,500
Kermanshah Rural 39,786,000
Kermanshah Urban 52,030,000
Khuzestan Rural 43,755,000
Khuzestan Urban 64,177,500
Fars Rural 52,862,500
Fars Urban 81,802,500
Kerman Rural 29,305,000
Kerman Urban 37,080,000
Razavi Khorasan Rural 35,338,250
Razavi Khorasan Urban 58,067,500
Isfahan Rural 46,695,500
Isfahan Urban 64,286,500
Sistan and Baluchestan Rural 26,307,500
Sistan and Baluchestan Urban 44,267,500
Kurdistan Rural 40,077,500
Kurdistan Urban 55,362,500
Hamadan Rural 37,876,250
Hamadan Urban 54,967,400
Chaharmahal and Bakhtiari Rural 54,890,000
Chaharmahal and Bakhtiari Urban 66,162,500
Lorestan Rural 37,079,500
Lorestan Urban 50,695,050
Ilam Rural 30,913,500
Ilam Urban 42,693,750
Kohgiluyeh and Boyer-Ahmad Rural 36,967,500
Kohgiluyeh and Boyer-Ahmad Urban 50,917,000
Bushehr Rural 70,438,200
Bushehr Urban 78,498,250
Zanjan Rural 43,818,000
Zanjan Urban 68,323,000
Semnan Rural 27,542,500
Semnan Urban 36,870,000
Yazd Rural 49,700,000
Yazd Urban 58,330,000
Hormozgan Rural 43,610,000
Hormozgan Urban 60,189,000
Tehran Rural 60,374,300
Tehran Urban 101,536,580
Ardabil Rural 40,580,000
Ardabil Urban 51,282,500
Qom Rural 41,446,775
Qom Urban 54,981,500
Qazvin Rural 40,275,250
Qazvin Urban 57,117,000
Golestan Rural 35,715,000
Golestan Urban 48,305,000
North Khorasan Rural 32,448,000
North Khorasan Urban 49,644,500
South Khorasan Rural 26,759,500
South Khorasan Urban 50,287,500
Alborz Rural 62,097,362
Alborz Urban 77,540,000

3 Provinces with highest relative poverty line are ‘Tehran’ and ‘Fars’ and ‘Bushehr’ in urban areas. 3 Provinces with lowest relative poverty line are ‘Sistan and Baluchestan’ and ‘South Khorasan’ and ‘Semnan’ in rural areas.

  1. Hybrid Approach:

It is important to analyze the food bundle of very poor individuals, as they are often stuck in a poverty trap. This analysis will enable us to better target assistance and improve their access to necessary food items.

  • In this part, we first rank households by per capita non-durable expenditure.

  • Second, we identify the bottom 20% of households in both urban and rural zones.

  • Third, we analyze the food bundle typically consumed by these households.

In this step, we need to match the bottom 20% of households to the RU1402P3S01 table to find their food bundle by “label” column.

In this step, we need to get the goods code that they have NA label but they are consumed typically by these households. We set our threshold at 4000. This means that we only consider the goods with NA labels which consumed by more than 4000 of the bottom 20% of households.

NA_label <- filtered_RU1402P3S01 %>% 
  filter(is.na(label)) %>% 
  group_by(goods_code) %>% 
  dplyr::summarize(count = n(), .groups = 'drop') %>%
  filter(count>4000) %>%
  arrange(-count)

Here, we set the labels for these goods and consider them in their food bundle.

NA_label <- NA_label %>% 
  mutate(label = case_when(
    goods_code == 11921 ~ "tomato paste",
    goods_code == 12211 ~ "soda",
    goods_code == 11911 ~ "salt",
    goods_code == 12112 ~ "non-irani tea",
    goods_code == 11665 ~ "chips & pofak",
    goods_code == 11531 ~ "margarine",
    goods_code == 11171 ~ "biscuit",
    goods_code == 11914 ~ "turmeric",
    goods_code == 11423 ~ "ice cream",
    goods_code == 11161 ~ "flour",
    goods_code == 11172 ~ "cake",
    TRUE ~ NA_character_
  )) %>% 
    filter(!is.na(label))
# Indicate the table 
kable(
  NA_label %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  column_spec(1, background = "olivedrab1") %>%
  column_spec(2, background = "slateblue1") %>%
  column_spec(3, background = "deepskyblue")
goods_code count label
11,921 16,566 tomato paste
12,211 12,901 soda
11,911 9,958 salt
12,112 9,375 non-irani tea
11,665 8,083 chips & pofak
11,531 6,796 margarine
11,171 6,516 biscuit
11,914 4,801 turmeric
11,423 4,338 ice cream
11,161 4,219 flour
11,172 4,184 cake

In addition to the food bundle defined by the Iran’s Ministry of Health, we found goods that are typically consumed by the bottom 20% of households. These new goods include:

  1. رب گوجه فرنگی

  2. انواع نوشابه

  3. انواع نمک

  4. انواع چای خارجی

  5. پفک و چیپس

  6. انواع روغن نباتی جامد و مارگارین

  7. انواع بیسکوییت و ویفر

  8. زردچوبه

  9. انواع بستنی شیری و میوه ای

  10. آرد گندم

  11. انواع کیک ساده

In this step, we convert the “gram” column to “kilogram”. Then we assign a label to this new code of goods consumed by the bottom 20% of poor households.

The following code generates the new food bundle.

# Get the unique values of label column
uniq <- unique(filtered_RU1402P3S01$label)

n_rows <- ceiling(length(uniq) / 2)
uniq_table <- data.frame(
  Column1 = uniq[1:n_rows],
  Column2 = uniq[(n_rows + 1):length(uniq)])

kable(uniq_table, format = "html", col.names = c("Column 1", "Column 2"), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
    full_width = FALSE,
    position = "center") %>%
  column_spec(1, background = "wheat3", border_right = TRUE) %>%
  column_spec(2, background = "wheat3") 
Column 1 Column 2
bread sugar
ice cream non-irani tea
yoghurt NA
cheese oil
egg soda
vegetables biscuit
potato rice
lentil margarine
tomato paste salt
spaghetti turmeric
white meat cake
milk chips & pofak
fruit flour

A notable point about the table above is that the new food bundle for the bottom 20% of households does not include Red Meat. This indicates that households in the lowest 20% do not incorporate Red Meat into their monthly food bundle.

  • Fourth, we determine the amounts that provide 2,100 calories based on this new bundle.

In this step, we calculate the average food bundle of poor people below the 20% of bottom.

The following code creates a data set to calculate the average consumption of each household in every province, distinguishing between urban and rural areas by their labels.

Based on the search on this Website, we find the calories per kilogram for each item included in the new food bundle for the bottom 20% of households.

kable(
  average_consumption %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Average Daily Calories (Individual Level - 20% bottom)" = ncol(average_consumption))) %>%
  row_spec(which(average_consumption$Urban_Rural == "Rural"), background = "slategray1") %>%
  row_spec(which(average_consumption$Urban_Rural == "Urban"), background = "peachpuff1")
Average Daily Calories (Individual Level - 20% bottom)
province Urban_Rural average_personal_calories
Markazi Rural 1,334
Markazi Urban 1,029
Gilan Rural 1,218
Gilan Urban 1,117
Mazandaran Rural 1,871
Mazandaran Urban 1,364
East Azerbaijan Rural 1,692
East Azerbaijan Urban 1,337
West Azerbaijan Rural 1,190
West Azerbaijan Urban 1,076
Kermanshah Rural 2,719
Kermanshah Urban 2,034
Khuzestan Rural 2,357
Khuzestan Urban 1,708
Fars Rural 1,943
Fars Urban 1,581
Kerman Rural 2,354
Kerman Urban 1,965
Razavi Khorasan Rural 1,743
Razavi Khorasan Urban 1,644
Isfahan Rural 1,572
Isfahan Urban 1,349
Sistan and Baluchestan Rural 1,964
Sistan and Baluchestan Urban 2,045
Kurdistan Rural 1,637
Kurdistan Urban 1,487
Hamadan Rural 1,332
Hamadan Urban 1,110
Chaharmahal and Bakhtiari Rural 2,703
Chaharmahal and Bakhtiari Urban 1,919
Lorestan Rural 2,465
Lorestan Urban 1,826
Ilam Rural 2,569
Ilam Urban 2,510
Kohgiluyeh and Boyer-Ahmad Rural 1,941
Kohgiluyeh and Boyer-Ahmad Urban 1,235
Bushehr Rural 2,577
Bushehr Urban 2,146
Zanjan Rural 1,585
Zanjan Urban 1,198
Semnan Rural 2,242
Semnan Urban 2,286
Yazd Rural 1,444
Yazd Urban 1,330
Hormozgan Rural 1,170
Hormozgan Urban 1,110
Tehran Rural 1,297
Tehran Urban 1,335
Ardabil Rural 1,731
Ardabil Urban 1,372
Qom Rural 1,456
Qom Urban 1,194
Qazvin Rural 1,752
Qazvin Urban 1,307
Golestan Rural 1,513
Golestan Urban 1,375
North Khorasan Rural 2,278
North Khorasan Urban 1,874
South Khorasan Rural 2,132
South Khorasan Urban 1,859
Alborz Rural 1,021
Alborz Urban 881

On average, people living in ‘Karaj’ Province receive the lowest daily calorie intake among the bottom 20% of households nationwide, with 1,021 calories in urban areas and 881 calories in rural areas. On average, people living in rural areas of ‘Kermanshah’ Province and ‘Chaharmahal Bakhtiari’ Province receive the highest daily calorie intake among the bottom 20% of households nationwide, with 2,719 calories in ‘Kermanshah’ Province and 2,703 calories in ‘Chaharmahal Bakhtiari’ Province.

  • Fifth, we use the cost of this new bundle as a refined poverty line
# Generate the price vector of this new bundle
filtered_RU1402P3S01 <- left_join(filtered_RU1402P3S01,address_count, by = "Address")
price_vector <- filtered_RU1402P3S01 %>%
  filter(!is.na(label) & !is.na(value) & !is.na(kilogram)) %>%
  group_by(label, province, Urban_Rural) %>%
  dplyr::summarize(weighted_mean_price = sum(value * Weight, na.rm = TRUE) / sum(kilogram * Weight, na.rm = TRUE), .groups = 'drop')

filtered_RU1402P3S01 <- filtered_RU1402P3S01 %>%
  left_join(price_vector, by = c('label', 'province','Urban_Rural')) %>%
  mutate(kilogram = kilogram / member)

hybrid_poverty_line <- filtered_RU1402P3S01 %>%
  group_by(label, province, Urban_Rural) %>%
  mutate(total_product_cost = weighted_mean_price * kilogram) %>%
  group_by(Address, province, Urban_Rural) %>%
  summarise(total_bundle_price = sum(total_product_cost, na.rm = TRUE),.groups = 'drop') %>% 
  group_by(province, Urban_Rural) %>% 
  mutate(Urban_Rural = recode(Urban_Rural, "R" = "Rural", "U" = "Urban")) %>%
  summarise(cost_new_bundle = mean(total_bundle_price, na.rm = TRUE),.groups = 'drop')

kable(
  hybrid_poverty_line %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Relative Poverty Line - Hybrid Approach (Individual Level - Rial)" = ncol(hybrid_poverty_line))) %>%
  row_spec(which(hybrid_poverty_line$Urban_Rural == "Rural"), background = "cadetblue1") %>%
  row_spec(which(hybrid_poverty_line$Urban_Rural == "Urban"), background = "goldenrod2")
Relative Poverty Line - Hybrid Approach (Individual Level - Rial)
province Urban_Rural cost_new_bundle
Markazi Rural 12,078,831
Markazi Urban 13,231,259
Gilan Rural 7,349,822
Gilan Urban 9,134,857
Mazandaran Rural 12,448,715
Mazandaran Urban 15,157,034
East Azerbaijan Rural 11,181,421
East Azerbaijan Urban 14,011,940
West Azerbaijan Rural 13,887,902
West Azerbaijan Urban 14,114,726
Kermanshah Rural 17,262,331
Kermanshah Urban 20,576,228
Khuzestan Rural 17,043,927
Khuzestan Urban 20,538,817
Fars Rural 10,886,670
Fars Urban 16,188,458
Kerman Rural 17,223,598
Kerman Urban 19,287,059
Razavi Khorasan Rural 13,303,883
Razavi Khorasan Urban 15,515,614
Isfahan Rural 10,667,042
Isfahan Urban 13,561,812
Sistan and Baluchestan Rural 13,373,659
Sistan and Baluchestan Urban 15,962,129
Kurdistan Rural 16,256,677
Kurdistan Urban 17,687,751
Hamadan Rural 10,316,798
Hamadan Urban 9,957,974
Chaharmahal and Bakhtiari Rural 14,525,660
Chaharmahal and Bakhtiari Urban 19,639,382
Lorestan Rural 16,868,750
Lorestan Urban 17,367,830
Ilam Rural 17,752,596
Ilam Urban 20,174,603
Kohgiluyeh and Boyer-Ahmad Rural 13,535,465
Kohgiluyeh and Boyer-Ahmad Urban 14,852,076
Bushehr Rural 17,913,329
Bushehr Urban 20,064,721
Zanjan Rural 12,849,297
Zanjan Urban 13,614,465
Semnan Rural 13,111,742
Semnan Urban 16,250,432
Yazd Rural 9,684,500
Yazd Urban 12,981,312
Hormozgan Rural 13,037,213
Hormozgan Urban 16,132,922
Tehran Rural 12,832,553
Tehran Urban 15,794,607
Ardabil Rural 14,216,254
Ardabil Urban 15,813,921
Qom Rural 11,437,388
Qom Urban 12,957,472
Qazvin Rural 12,717,597
Qazvin Urban 15,932,593
Golestan Rural 11,234,501
Golestan Urban 14,426,796
North Khorasan Rural 11,811,541
North Khorasan Urban 13,448,426
South Khorasan Rural 13,351,752
South Khorasan Urban 16,099,192
Alborz Rural 8,386,304
Alborz Urban 10,573,553

3 Provinces with highest relative poverty line are ‘Khuzestan’ and ‘Chaharmahal and Bakhtiari’ and ‘Bushehr’ in urban areas. 3 Provinces with lowest relative poverty line are ‘Gilan’ and ‘Alborz’ in rural areas and ‘Gilan’ in urban areas.

comparison_approach <- relative_line %>% 
  left_join(hybrid_poverty_line, by = c("province", "Urban_Rural")) %>% 
  mutate(Urban_Rural = recode(Urban_Rural, "R" = "Rural", "U" = "Urban")) %>% 
  rename(Tradition_Approach = Half_Median, Hybrid_Approach = cost_new_bundle)

kable(
  comparison_approach %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 0))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Comparison of 2 approaches in Relative poverty line (Individual Level - Rial)" = ncol(comparison_approach))) %>%
  row_spec(which(comparison_approach$Urban_Rural == "Rural"), background = "chartreuse2") %>%
  row_spec(which(comparison_approach$Urban_Rural == "Urban"), background = "slateblue1")
Comparison of 2 approaches in Relative poverty line (Individual Level - Rial)
province Urban_Rural Tradition_Approach Hybrid_Approach
Markazi Rural 47,050,000 12,078,831
Markazi Urban 61,550,000 13,231,259
Gilan Rural 34,165,000 7,349,822
Gilan Urban 54,135,250 9,134,857
Mazandaran Rural 59,686,619 12,448,715
Mazandaran Urban 71,847,500 15,157,034
East Azerbaijan Rural 32,717,500 11,181,421
East Azerbaijan Urban 46,120,500 14,011,940
West Azerbaijan Rural 29,887,000 13,887,902
West Azerbaijan Urban 40,482,500 14,114,726
Kermanshah Rural 39,786,000 17,262,331
Kermanshah Urban 52,030,000 20,576,228
Khuzestan Rural 43,755,000 17,043,927
Khuzestan Urban 64,177,500 20,538,817
Fars Rural 52,862,500 10,886,670
Fars Urban 81,802,500 16,188,458
Kerman Rural 29,305,000 17,223,598
Kerman Urban 37,080,000 19,287,059
Razavi Khorasan Rural 35,338,250 13,303,883
Razavi Khorasan Urban 58,067,500 15,515,614
Isfahan Rural 46,695,500 10,667,042
Isfahan Urban 64,286,500 13,561,812
Sistan and Baluchestan Rural 26,307,500 13,373,659
Sistan and Baluchestan Urban 44,267,500 15,962,129
Kurdistan Rural 40,077,500 16,256,677
Kurdistan Urban 55,362,500 17,687,751
Hamadan Rural 37,876,250 10,316,798
Hamadan Urban 54,967,400 9,957,974
Chaharmahal and Bakhtiari Rural 54,890,000 14,525,660
Chaharmahal and Bakhtiari Urban 66,162,500 19,639,382
Lorestan Rural 37,079,500 16,868,750
Lorestan Urban 50,695,050 17,367,830
Ilam Rural 30,913,500 17,752,596
Ilam Urban 42,693,750 20,174,603
Kohgiluyeh and Boyer-Ahmad Rural 36,967,500 13,535,465
Kohgiluyeh and Boyer-Ahmad Urban 50,917,000 14,852,076
Bushehr Rural 70,438,200 17,913,329
Bushehr Urban 78,498,250 20,064,721
Zanjan Rural 43,818,000 12,849,297
Zanjan Urban 68,323,000 13,614,465
Semnan Rural 27,542,500 13,111,742
Semnan Urban 36,870,000 16,250,432
Yazd Rural 49,700,000 9,684,500
Yazd Urban 58,330,000 12,981,312
Hormozgan Rural 43,610,000 13,037,213
Hormozgan Urban 60,189,000 16,132,922
Tehran Rural 60,374,300 12,832,553
Tehran Urban 101,536,580 15,794,607
Ardabil Rural 40,580,000 14,216,254
Ardabil Urban 51,282,500 15,813,921
Qom Rural 41,446,775 11,437,388
Qom Urban 54,981,500 12,957,472
Qazvin Rural 40,275,250 12,717,597
Qazvin Urban 57,117,000 15,932,593
Golestan Rural 35,715,000 11,234,501
Golestan Urban 48,305,000 14,426,796
North Khorasan Rural 32,448,000 11,811,541
North Khorasan Urban 49,644,500 13,448,426
South Khorasan Rural 26,759,500 13,351,752
South Khorasan Urban 50,287,500 16,099,192
Alborz Rural 62,097,362 8,386,304
Alborz Urban 77,540,000 10,573,553

As we anticipated, the relative poverty line determined by the hybrid approach is lower than that calculated using the traditional approach. This difference arises because the hybrid approach is based on the food bundle consumed by the bottom 20% of households, rather than the 50% median of all households. When households are limited to specific food items, the cost of obtaining 2,100 calories is significantly higher compared to when they have the option to choose from a more diverse food bundle. As we noted in previous part, households in the bottom 20% exclude red meat from their monthly food bundle.

3. Calculation of the Poverty Gap

The poverty gap index measures the intensity of poverty by calculating the average difference between the income of poor people and the income poverty line, as a percentage of the income poverty line. It it tells us how much poorer the poor are relative to the income poverty line.

  • A higher PGI means that the poor are farther below the poverty line.

  • A lower PGI means that the poor are closer to the poverty line.

\[ \text{PGI}=\frac{1}{N}\sum_{j=1}^{q}\left(\frac{z-y_j}{z}\right) \]

Where,

  • \(N\) is total population

  • \(q\) is the number of poor

  • \(z\) is the income poverty line

  • \(y_j\) is the income for person \(j\) living below the poverty line

In this part we estimate poverty gap index for each province.

  • First, we need to calculate income poverty line.

The income poverty line can be calculated by the following formula:

\[ \text{Income Poverty Line} = \frac{\text{Absolute Poverty Line}}{\text{Share Food Expenditure to Total Expenditure}} = \frac{\text{Absolute Poverty Line}}{\frac{\text{Food expenditure}}{\text{Total Expenditure}}} \]

Income poverty line (2 Methods) (Individual Level - Rial)
province Urban_Rural income_poverty_line1 income_poverty_line2
Markazi Rural 25,150,564 28,077,129
Markazi Urban 27,138,818 30,794,395
Gilan Rural 38,052,828 45,383,187
Gilan Urban 40,369,451 48,320,944
Mazandaran Rural 34,797,712 40,613,168
Mazandaran Urban 36,054,311 40,493,072
East Azerbaijan Rural 25,544,023 27,761,429
East Azerbaijan Urban 26,786,095 29,550,206
West Azerbaijan Rural 25,500,063 27,583,958
West Azerbaijan Urban 26,472,683 28,937,225
Kermanshah Rural 20,678,475 22,325,552
Kermanshah Urban 22,119,685 23,843,744
Khuzestan Rural 20,916,150 21,902,081
Khuzestan Urban 22,599,374 25,073,539
Fars Rural 30,448,510 32,551,464
Fars Urban 31,704,092 34,605,070
Kerman Rural 23,048,598 24,963,445
Kerman Urban 23,944,125 26,743,965
Razavi Khorasan Rural 22,052,404 24,017,072
Razavi Khorasan Urban 23,592,064 26,325,539
Isfahan Rural 29,608,483 32,398,728
Isfahan Urban 30,876,842 34,832,584
Sistan and Baluchestan Rural 23,393,531 26,269,170
Sistan and Baluchestan Urban 22,868,734 26,100,911
Kurdistan Rural 22,089,260 24,206,647
Kurdistan Urban 24,053,380 27,644,617
Hamadan Rural 27,747,101 30,873,372
Hamadan Urban 30,382,956 33,465,784
Chaharmahal and Bakhtiari Rural 22,654,133 22,977,508
Chaharmahal and Bakhtiari Urban 22,442,497 25,792,817
Lorestan Rural 19,392,652 21,026,854
Lorestan Urban 20,940,307 22,607,255
Ilam Rural 18,174,783 19,366,049
Ilam Urban 19,573,818 21,598,664
Kohgiluyeh and Boyer-Ahmad Rural 24,225,532 26,239,127
Kohgiluyeh and Boyer-Ahmad Urban 23,812,051 28,891,128
Bushehr Rural 27,823,714 30,281,985
Bushehr Urban 28,507,349 30,997,346
Zanjan Rural 28,053,554 32,563,565
Zanjan Urban 29,255,973 34,726,876
Semnan Rural 22,075,047 23,367,008
Semnan Urban 23,095,441 24,509,141
Yazd Rural 31,217,223 34,966,507
Yazd Urban 30,831,473 35,026,395
Hormozgan Rural 31,700,116 36,992,699
Hormozgan Urban 31,200,985 34,654,511
Tehran Rural 42,498,904 47,637,469
Tehran Urban 46,574,847 52,192,759
Ardabil Rural 26,836,899 28,788,623
Ardabil Urban 28,529,383 29,750,122
Qom Rural 28,525,991 31,207,959
Qom Urban 27,568,377 30,779,452
Qazvin Rural 29,465,016 32,307,831
Qazvin Urban 29,744,527 33,613,270
Golestan Rural 23,889,704 25,859,595
Golestan Urban 25,668,695 27,561,580
North Khorasan Rural 23,967,249 26,282,793
North Khorasan Urban 26,024,493 29,050,418
South Khorasan Rural 20,119,177 20,981,614
South Khorasan Urban 20,125,655 23,013,243
Alborz Rural 36,785,771 39,287,390
Alborz Urban 39,538,178 41,649,306

In the first column, income poverty line 1, which comes from calculating the absolute poverty line, the highest income poverty line is related to the rural areas of ‘Ilam’ and ‘Lorestan’ provinces and the urban areas of ‘Ilam’, respectively. The lowest income poverty line is related to the urban and rural areas of ‘Tehran’ province and the urban areas of ‘Gilan’, respectively. In the second column, income poverty line 2, which comes from calculating the absolute poverty line, the highest income poverty line is related to the rural areas of ‘Ilam’, ‘South Khorasan’, and ‘Lorestan’ provinces, respectively. The lowest income poverty line is related to the urban areas of ‘Tehran’ and ‘Gilan’ provinces and the rural areas of ‘Tehran’ province, respectively.

The difference in the income poverty line in the first column is between 18,174,783 and 46,574,847 rials per person, and the difference in the income poverty line in the second column is between 19,366,049 and 52,192,759 rials per person.

  • Second, we need to calculate the sources of household income from the data sets.

In this step, we clean tables that are associated with individuals’ sources of income.

# Part 4, Table 1 (For public sector)
RU1402P4S01 <- RU1402P4S01 %>%
  rename(
    member = DYCOL01,
    is_employed = DYCOL02,
    ISCO_code = DYCOL03,
    ISIC_code = DYCOL04,
    sector_status = DYCOL05,
    daily_hours = DYCOL06,
    weekly_days = DYCOL07,
    income_month = DYCOL08,
    income_year = DYCOL09,
    wage_month = DYCOL10,
    wage_year = DYCOL11,
    benefit_month = DYCOL12,
    benefit_year = DYCOL13,
    netincome_month = DYCOL14,
    netincome_year = DYCOL15) %>%
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))
# Part 4, Table 2 (For private sector)
RU1402P4S02 <- RU1402P4S02 %>%
  rename(
    member = DYCOL01,
    is_employed = DYCOL02,
    ISCO_code = DYCOL03,
    ISIC_code = DYCOL04,
    worker_status = DYCOL05,
    agriculture = DYCOL06,
    daily_hours = DYCOL07,
    weekly_days = DYCOL08,
    employment_cost = DYCOL09,
    agriculture_cost = DYCOL10,
    tool_cost = DYCOL11,
    fee_cost = DYCOL12,
    tax_cost = DYCOL13,
    sale = DYCOL14,
    netincome = DYCOL15)%>%
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))
# Part 4, Table 3 (For other incomes)
RU1402P4S03 <- RU1402P4S03 %>%
  rename(
    member = DYCOL01,
    income_pension = DYCOL03,
    income_rent = DYCOL04,
    income_interest = DYCOL05,
    income_granteduc = DYCOL06,
    income_sale = DYCOL07,
    income_famtransfer = DYCOL08)%>%
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))
# Part 4, Table 4 (Column 9 of table 3 (cash transfer or subsidy))
RU1402P4S04 <- RU1402P4S04 %>%
  rename(
    member = Dycol01,
    subsidy_member = Dycol03,
    subsidy_month = Dycol04,
    total_subsidy = Dycol05)%>%
    mutate(province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province))

Third, we calculate monthly individuals’ sources of income from the tables.

  • Fourth, we identify individuals earning below the poverty line.

In this step, we only keep these individuals and exclude those above the poverty line.

In addition, we calculate the ratio of individuals who are below the income poverty line. (Following Formula)

\[\frac{1}{N}\sum_{j=1}^{q}\left(\frac{z-y_j}{z}\right)\]

combined_monthlyinc <- combined_monthlyinc %>%
  mutate(
    Urban_Rural = case_when(
      Urban_Rural == "U" ~ "Urban",
      Urban_Rural == "R" ~ "Rural",
      TRUE ~ Urban_Rural))
# Join income_poverty_line & weight & member count to combined_income
combined_monthlyinc <- combined_monthlyinc %>%
  left_join(income_poverty_line, by = c("province","Urban_Rural")) %>%
  left_join(RU1402P1_getweight, by = "Address") %>%
  left_join(address_count, by = "Address")

# Filter to keep rows where monthly net income is lower than income poverty line 1
poverty_gap1 <- combined_monthlyinc %>%
  # Filter the monthly net income of households below the income poverty line 1
  filter(total_netincome_month < income_poverty_line1) %>%
  # Calculate per capita monthly income (individual)
  mutate(personal_netincome_month = total_netincome_month / member) %>%
  # Calculate the ratio of net personal income to the income poverty line 1
  mutate(ratio = (income_poverty_line1 - personal_netincome_month) / income_poverty_line1) %>%
  group_by(province, Urban_Rural) %>%
  dplyr::summarize(
    sum_ratio = sum(Weight * ratio, na.rm = TRUE),  # Weighted sum of shortfall ratios
    PGI_1 = 100 * round(sum_ratio / nrow(combined_monthlyinc), 7),   # Divide by total rows (N)
    .groups = 'drop')

# Filter to keep rows where monthly net income is lower than income poverty line 2
poverty_gap2 <- combined_monthlyinc %>%
  # Filter the monthly net income of households below the income poverty line 2
  filter(total_netincome_month < income_poverty_line2) %>%
  # Calculate per capita monthly income (individual)
  mutate(personal_netincome_month = total_netincome_month / member) %>%
  # Calculate the ratio of net personal income to the income poverty line 2
  mutate(ratio = (income_poverty_line2 - personal_netincome_month) / income_poverty_line2) %>%
  group_by(province, Urban_Rural) %>%
  dplyr::summarize(
    sum_ratio = sum(Weight * ratio, na.rm = TRUE),  # Weighted sum of shortfall ratios
    PGI_2 = 100 * round(sum_ratio / nrow(combined_monthlyinc), 7),   # Divide by total rows (N)
    .groups = 'drop')
poverty_gap_index <- poverty_gap1 %>%
  left_join(poverty_gap2, by = c("province","Urban_Rural")) %>%
  select(province, Urban_Rural, PGI_1, PGI_2)

kable(
  poverty_gap_index %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 4))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Poverty Gap Index (2 Methods)" = ncol(poverty_gap_index))) %>%
  row_spec(which(poverty_gap_index$Urban_Rural == "Rural"), background = "lightsteelblue1") %>%
  row_spec(which(poverty_gap_index$Urban_Rural == "Urban"), background = "mistyrose1")
Poverty Gap Index (2 Methods)
province Urban_Rural PGI_1 PGI_2
Markazi Rural 0.0632 0.0807
Markazi Urban 0.0104 0.0162
Gilan Rural 0.0748 0.0866
Gilan Urban 0.0785 0.0955
Mazandaran Rural 0.0260 0.0319
Mazandaran Urban 0.0076 0.0103
East Azerbaijan Rural 0.0491 0.0589
East Azerbaijan Urban 0.0148 0.0175
West Azerbaijan Rural 0.0533 0.0606
West Azerbaijan Urban 0.0249 0.0303
Kermanshah Rural 0.0149 0.0179
Kermanshah Urban 0.0157 0.0174
Khuzestan Rural 0.0372 0.0399
Khuzestan Urban 0.0095 0.0100
Fars Rural 0.0842 0.0964
Fars Urban 0.0322 0.0393
Kerman Rural 0.0131 0.0189
Kerman Urban 0.0052 0.0073
Razavi Khorasan Rural 0.0745 0.0905
Razavi Khorasan Urban 0.0209 0.0274
Isfahan Rural 0.0262 0.0320
Isfahan Urban 0.0081 0.0096
Sistan and Baluchestan Rural 0.1730 0.2097
Sistan and Baluchestan Urban 0.1108 0.1387
Kurdistan Rural 0.0183 0.0230
Kurdistan Urban 0.0135 0.0154
Hamadan Rural 0.1174 0.1280
Hamadan Urban 0.0667 0.0741
Chaharmahal and Bakhtiari Rural 0.0056 0.0057
Chaharmahal and Bakhtiari Urban 0.0010 0.0019
Lorestan Rural 0.0253 0.0325
Lorestan Urban 0.0205 0.0238
Ilam Rural 0.0086 0.0133
Ilam Urban 0.0065 0.0101
Kohgiluyeh and Boyer-Ahmad Rural 0.0229 0.0323
Kohgiluyeh and Boyer-Ahmad Urban 0.0059 0.0157
Bushehr Rural 0.0221 0.0273
Bushehr Urban 0.0114 0.0151
Zanjan Rural 0.0253 0.0376
Zanjan Urban 0.0144 0.0230
Semnan Rural 0.0179 0.0214
Semnan Urban 0.0049 0.0050
Yazd Rural 0.0841 0.0875
Yazd Urban 0.0139 0.0160
Hormozgan Rural 0.1351 0.1700
Hormozgan Urban 0.0982 0.1086
Tehran Rural 0.0095 0.0157
Tehran Urban 0.0392 0.0577
Ardabil Rural 0.0307 0.0330
Ardabil Urban 0.0285 0.0291
Qom Rural 0.0152 0.0180
Qom Urban 0.0154 0.0213
Qazvin Rural 0.0287 0.0379
Qazvin Urban 0.0066 0.0076
Golestan Rural 0.0707 0.0778
Golestan Urban 0.0551 0.0598
North Khorasan Rural 0.0220 0.0319
North Khorasan Urban 0.0185 0.0222
South Khorasan Rural 0.0604 0.0734
South Khorasan Urban 0.0191 0.0336
Alborz Rural 0.0172 0.0178
Alborz Urban 0.0129 0.0138

In the first column, PGI 1, which comes from calculating the income poverty line 1, the highest PGI is related to the rural areas of ‘Sistan and Baluchestan’ and ‘Hormozgan’ and ‘Hamadan’ provinces, respectively. The lowest PGI is related to the urban areas of ‘Chaharmahal and Bakhtiari’ and ‘Semnan’ and ‘Kerman’ provinces, respectively. In the second column, PGI 2, which comes from calculating the income poverty line 2, the highest PGI is related to the rural and urban areas of ‘Sistan and Baluchestan’ and rural areas of ‘Hormozgan’ provinces. The lowest PGI is related to the rural and urban areas of ‘Chaharmahal and Bakhtiari’ and urban areas of ‘Semnan’ provinces.

4. Multidimensional Poverty Index (MPI)

In the previous sections, we focused on estimating the poverty line based on food bundles. In this section, we expand our analysis to include other important aspects of life, such as ‘education’, ‘health’, and ‘nutrition’. Based on an article from World Bank about “Multidimensional Poverty Measure”, we follow the following construction. We will calculate the Multidimensional Poverty Index (MPI), which measures the percentage of households in a country that are deprived in three areas: ‘monetary poverty’, ‘education’, and ‘access to basic infrastructure services’. This approach provides a more comprehensive understanding of poverty by considering various dimensions of well-being beyond just financial resources.

In this part, we consider the 4 aspects of poverty including;

  1. Monetary Dimension

  2. Education Dimension

  3. Living Standards Dimension

Then, we will use following equation to estimate multidimensional poverty index:

\[ MPI = \sum_{i=1}^{n} W_i D_i \]

where \(W_{i}\) is the weight of indicator \(D_{i}\).

In this section, we define the first dimension: (Income)

In this step, we identify the individuals who have a monthly total net income below the income poverty line. To aim this process, we create the following dummy:

  • If a household is below the income poverty line, \(\text{dummy income poverty line} = 1\)

  • If a household is not below the income poverty line, \(\text{dummy income poverty line} = 0\)

combined_monthlyinc_dum1 <- combined_monthlyinc %>%
  mutate(dummy_below_poverty = ifelse((total_netincome_month / member) < income_poverty_line1, 1, 0))
# poverty status information
RU1402P1_dum <- RU1402P1 %>%
  left_join(combined_monthlyinc_dum1 %>% select(Address, dummy_below_poverty), by = "Address") %>%
  # Handle Missing Values
  mutate(income_poverty_dummy = ifelse(is.na(dummy_below_poverty), 0, dummy_below_poverty)) %>%
  select(-dummy_below_poverty)

In this section, we define the second dimension: (Consumption)

In this step, we first identify the individuals who have a monthly total consumption below the consumption poverty line. To aim this process, we create the following dummy:

  • If a household is below the consumption poverty line, \(\text{dummy consumption poverty line} = 1\)

  • If a household is not below the consumption poverty line, \(\text{dummy consumption poverty line} = 0\)

consumption <- filtered_RU1402P3S01 %>%
  dplyr::group_by(Address, province, Urban_Rural, label) %>%
  dplyr::summarize(mean_consumption = round(weighted.mean(kilogram, Weight, na.rm = TRUE), 3), .groups = 'drop') 

consumption <- consumption %>%
  mutate(cal_per_kg = case_when(
    label == "bread" ~ 2660,
    label == "cheese" ~ 3500,
    label == "egg" ~ 1470,
    label == "fruit" ~ 500,
    label == "oil" ~ 8840,
    label == "potato" ~ 1040,
    label == "rice" ~ 1350,
    label == "soda" ~ 380,
    label == "sugar" ~ 3870,
    label == "tomato paste" ~ 820,
    label == "vegetables" ~ 200,
    label == "yoghurt" ~ 990,
    label == "white meat" ~ 2200,
    label == "milk" ~ 500,
    label == "biscuit" ~ 3250,
    label == "salt" ~ 0,
    label == "spaghetti" ~ 1570,
    label == "zardchobe" ~ 3540,
    label == "ice cream" ~ 2010,
    label == "lentil" ~ 1650,
    label == "margarine" ~ 5260,
    label == "non-irani tea" ~ 10,
    label == "spices" ~ 3140,
    label == "cake" ~ 3800,
    label == "chips & pofak" ~ 5200,
    label == "flour" ~ 3640,
    TRUE ~ NA_real_))

consumption_dummy <- consumption %>%
  # Calculate total calories consumed per label in a month
  mutate(total_calories = mean_consumption * cal_per_kg, na.rm = TRUE) %>%
  # Group by Address, Province, and Urban_Rural areas
  group_by(Address, province) %>%
  # Calculate the total calories of households in a month in each province
  dplyr::summarize(Total_calories = sum(total_calories, na.rm = TRUE), .groups = 'drop') %>%
  # Calculate daily calories in each household
  mutate(daily_calories = Total_calories / 30) %>%
  left_join(address_count, by = "Address") %>%
  # Calculate the daily personal calorie consumption
  mutate(daily_personal_calories = daily_calories / member) %>% 
  left_join(average_consumption, by = "province") %>% 
  mutate(below_calorie_dummy = ifelse(daily_personal_calories < average_personal_calories, 1, 0))

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(consumption_dummy %>% select(Address, below_calorie_dummy), by = "Address") %>%
  mutate(consumption_poverty_dummy = ifelse(is.na(below_calorie_dummy), 0, below_calorie_dummy)) %>% 
  select(-below_calorie_dummy)

In this section, we define the third dimension: (education)

First, we define a dummy variable for the education of household’s head.

Second, we define a dummy variable for the attendance of children at school.

Here, the thresholds that we consider for not being deprived of education is that the household’s head is literate and the children attend school.

  • If a Household’s Head is Literate, \(\text{Head's Education} = 0\)

  • If a Household’s Child is between 7 and 18 and is going to school, \(\text{Child's Education} = 0\)

# For Head
RU1402P1_dum <- RU1402P1_dum %>%
  group_by(Address) %>%
  mutate(head_education_dummy = case_when(
    any(relation == "head" & literacy == "illiterate", na.rm = TRUE) ~ 1,
    any(relation == "head" & literacy != "illiterate", na.rm = TRUE) ~ 0,
    TRUE ~ NA_real_))
# For Children
RU1402P1_dum <- RU1402P1_dum %>%
  group_by(Address) %>%
  mutate(child_education_dummy = ifelse(any(relation == "child" & age >= 7 & age <= 18 & is_studying == "No", na.rm = TRUE), 1, 0)) %>%
  ungroup()

In this section, we define the fourth dimension: (Living Standards)

First, we define a dummy variable for the Per Capita Living Area.

Second, we define a dummy variable for the Being a Homeowner.

Third, we define a dummy variable for the Access to Necessary Home Items.

Forth, we define a dummy variable for the Access to Internet and Cell Phone.

Here, the thresholds that we consider for the Living Standards are:

  • According to Tehran Municipality’s regulations and guidelines, the Per Capita Living Area per person is 17.5 square meters.

  • If a Household owns its House, \(\text{Home Owenship} = 0\).

  • If a Household has access to more than 3 of 5 necessary Home Items, \(\text{Home Accessibility} = 0\).

    (necessary home items include TV, washing machine, refrigerator, freezer and stove)

  • If a Household has access to Internet and Cell Phone, \(\text{Technology Accessibility} = 0\).

# Dummy for Home Ownership
RU1402P2 <- RU1402P2 %>%
  mutate(ownership_dummy = ifelse(tenure %in% c("OwnedEstateLand", "OwnedEstate", "Free"), 0, 1))

RU1402P2 <- RU1402P2 %>%
  mutate(Address = as.numeric(Address))
#Merging it
RU1402P1_dum <- RU1402P1_dum %>%
  left_join(RU1402P2 %>% select(Address, ownership_dummy), by = "Address") %>%
  mutate(tenure = ifelse(is.na(ownership_dummy), 1, ownership_dummy)) %>%
  select(-ownership_dummy)

RU1402P2 <- merge(RU1402P2, address_count, by = "Address", all.x = TRUE)

# Calculate space_per_capita
RU1402P2 <- RU1402P2 %>%
  mutate(space_per_capita = space / member)

# Create a dummy variable for space_per_capita < 17.5
address_space_issue <- RU1402P2 %>%
  mutate(space_issue = ifelse(space_per_capita < 17.5, 1, 0)) %>%
  select(Address, space_issue)

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(address_space_issue, by = "Address") %>%
  mutate(space_issue = ifelse(is.na(space_issue), 0, space_issue))

#internet
RU1402P2 <- RU1402P2 %>%
  mutate(no_internet_dummy = ifelse(internet == "FALSE", 1, 0))

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(RU1402P2 %>% select(Address, no_internet_dummy), by = "Address") %>%
  mutate(no_internet_dummy = ifelse(is.na(no_internet_dummy), 0, no_internet_dummy))

#cellphone
RU1402P2 <- RU1402P2 %>%
  mutate(no_mobile_dummy = ifelse(cellphone == "FALSE", 1, 0))

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(RU1402P2 %>% select(Address, no_mobile_dummy), by = "Address") %>%
  mutate(no_mobile_dummy = ifelse(is.na(no_mobile_dummy), 0, no_mobile_dummy))

RU1402P2 <- RU1402P2 %>%
  mutate(lack_appliances = ifelse((TV + washingmachine + refridgerator + freezer + stove) < 3, 1, 0))

RU1402P1_dum <- RU1402P1_dum %>%
  left_join(RU1402P2 %>% select(Address, lack_appliances), by = "Address") %>%
  mutate(lack_appliances = ifelse(is.na(lack_appliances), 0, lack_appliances))

In this section, we calculate the multidimensional poverty index by considering an equally weighted average of the variables mentioned above for each province. A higher index value indicates a worse poverty situation in that province.

RU1402P1_dum <- RU1402P1_dum %>%
  rename(
    Below_income_dummy = income_poverty_dummy,
    Below_consumption_dummy = consumption_poverty_dummy ,
    Head_educ_dummy = head_education_dummy,
    Child_educ_dummy = child_education_dummy,
    Ownership_dummy = tenure,
    Living_space_dummy = space_issue,
    Internet_dummy = no_internet_dummy,
    Cellphone_dummy = no_mobile_dummy,
    Appliance_dummy = lack_appliances)

MDP_index <- RU1402P1_dum %>%
  select(Address, member,
    Below_income_dummy, Below_consumption_dummy,
    Head_educ_dummy, Child_educ_dummy,
    Ownership_dummy, Living_space_dummy,
    Internet_dummy, Cellphone_dummy, Appliance_dummy)

MDP_index <- MDP_index %>%
  mutate(
    province = fct_recode(as.factor(substr(Address, 2, 3)), !!!Province),

    # Monetary Dimension (Total Weight = 1/3)
    Monetary_Score = (Below_consumption_dummy * 1/6 + Below_income_dummy * 1/6),
    
    # Education Dimension (Total Weight = 1/3)
    Education_Score = (Head_educ_dummy * 1/6 + Child_educ_dummy * 1/6),

    # Living Standards Dimension (Total Weight = 1/3)
    Living_Standards_Score = (Ownership_dummy * 1/15 + Living_space_dummy * 1/15 +
                              Internet_dummy * 1/15 + Cellphone_dummy * 1/15 +
                              Appliance_dummy * 1/15),
    # Total MPI Score
    MPI_Score = Monetary_Score + Education_Score + Living_Standards_Score
  ) %>%
  select(Address, member, province, Monetary_Score, Education_Score, Living_Standards_Score, MPI_Score)

MDP_index <- MDP_index %>%
  left_join(RU1402P1_getweight, by = "Address")
# Calculate the weighted average of MPI scores for each province
MDP_index_province <- MDP_index %>%
  group_by(province) %>%
  dplyr::summarize(weighted_average_MPI = sum(MPI_Score * Weight, na.rm = TRUE) / sum(Weight, na.rm = TRUE)) %>% 
  arrange(weighted_average_MPI)

kable(
  MDP_index_province %>%
    mutate(across(where(is.numeric), ~ formatC(., format = "f", big.mark = ",", digits = 3))), align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), position = "center", full_width = FALSE) %>%
  add_header_above(c("Multi Dimentional Poverty Index (Household)" = ncol(MDP_index_province))) %>%
  column_spec(1, background = "mediumslateblue") %>%
  column_spec(2, background = "mediumspringgreen")
Multi Dimentional Poverty Index (Household)
province weighted_average_MPI
Chaharmahal and Bakhtiari 0.131
Mazandaran 0.136
Yazd 0.149
Isfahan 0.151
Bushehr 0.178
Fars 0.185
Zanjan 0.194
Markazi 0.196
Ardabil 0.197
Qazvin 0.199
East Azerbaijan 0.212
Kohgiluyeh and Boyer-Ahmad 0.215
Gilan 0.218
South Khorasan 0.223
Khuzestan 0.224
Ilam 0.227
Kurdistan 0.231
North Khorasan 0.232
Kermanshah 0.233
Hamadan 0.241
Tehran 0.242
Razavi Khorasan 0.250
Alborz 0.251
Lorestan 0.255
Qom 0.256
Semnan 0.258
Golestan 0.261
Hormozgan 0.278
Kerman 0.278
West Azerbaijan 0.306
Sistan and Baluchestan 0.431

3 Provinces with highest MPI Score are ‘Sistan and Baluchestan’ (0.43) and ‘West Azerbaijan’ (0.30) and ‘Kerman’ (0.27).

3 Provinces with lowest MPI Score are ‘Chaharmahal and Bakhtiari’ (0.13) and ‘Mazandaran’ (0.13) and ‘Yazd’ (0.14).

Inequality Analysis

1. Estimation of Inequality Metrics

  • Gini Coefficient

Import the Data

welfare_dta <- read.csv('/Users/mahan/Desktop/TEIAS/3th-Semester/DevEcon1/Project2/Data/PaygahRefah/nemone_2_darsadi_1402.csv')

In this Step, We calculate the Revenue for each ID.

Since the “Daramad” column is NA for many observations, we combine “CardPerMonth_1402” column with “Daramad” column to create an alternative proxy for income.

Revenue <- welfare_dta %>%
  select(id, SabteAhval_provincename, CardPerMonth_1402, Daramad) %>%
  filter(!(is.na(CardPerMonth_1402) & is.na(Daramad))) %>%
  mutate(Total_Revenue = (CardPerMonth_1402 * 12) + Daramad)

We also consider assets as an indication of wealth distribution, since it is common to estimate the Gini Coefficient for both wealth and income.

Asset <- welfare_dta %>%
  select(id, SabteAhval_provincename, CarsPrice, Bourse_NetPortfoValue) %>%
  filter(!(is.na(CarsPrice) & is.na(Bourse_NetPortfoValue))) %>%
  mutate(Total_Asset = CarsPrice + Bourse_NetPortfoValue)

In this part, we estimate the Gini Coefficient for both wealth and income from “Iranian welfare database” and plot the Lorenz Curve.

# Compute Lorenz curves
lorenz_revenue <- Lc(Revenue$Total_Revenue)
lorenz_asset <- Lc(Asset$Total_Asset)
lorenz_HEIS <- Lc(combined_monthlyinc$total_netincome_month/combined_monthlyinc$member)

# Compute Gini coefficients
gini_revenue <- round(Gini(Revenue$Total_Revenue), 5) 
gini_asset <- round(Gini(Asset$Total_Asset), 5)
gini_HEIS <- round(Gini(combined_monthlyinc$total_netincome_month/combined_monthlyinc$member), 5)

# Plot Lorenz curves
plot(lorenz_revenue,
     main = "Lorenz Curves for Revenue and Asset Distribution",
     xlab = "Cumulative Share of Population",
     ylab = "Cumulative Share of Revenue & Asset",
     col = "blue",
     lwd = 2)

# Add Lorenz curve for Asset
lines(lorenz_asset, col = "red", lwd = 2)

lines(lorenz_HEIS, col = "darkgreen", lwd = 3)

# Add the line of perfect equality
abline(0, 1, col = "black", lty = 2)

# Add legend
legend("topleft",
       legend = c("Revenue - Welfare DataBase", "Asset - Welfare DataBase", "Revenue - HEIS", "Line of Equality"),
       col = c("blue", "red","darkgreen", "black"),
       lty = c(1, 1, 1, 1),
       lwd = c(2, 2, 2, 2),
       bg = "white")
# Add Gini coefficient text directly on the plot
text(0.8, 0.25, paste("Gini (Revenue - WelfareDataBase):", gini_revenue), col = "blue", cex = 0.7)
text(0.8, 0.15, paste("Gini (Asset - WelfareDataBase):", gini_asset), col = "red", cex = 0.7)
text(0.5, 0.40, paste("Gini (Revenue - HEIS):", gini_HEIS), col = "darkgreen", cex = 0.7)

The primary reason for the high value of the Gini coefficient is that we used a proxy variable, which may not accurately reflect the distribution of income and wealth in society. Additionally, the estimation of this coefficient was based on only 2% of data from the Iranian Welfare Database, which may not adequately represent the entire population and could be biased.

  • Top 1% Share
# sorted_Revenue <- Revenue %>%
#   arrange(-Total_Revenue)
# 
# top1perc_Revenue <- sorted_Revenue %>%
#   filter(row_number() <= ceiling(0.01 * n()))
# 
# top1_perc_Revenue <- sum(top1perc_Revenue$Total_Revenue,na.rm=TRUE)/sum(sorted_Revenue$Total_Revenue, na.rm = TRUE)
# # message(glue("The top 1% of revenue accounts for {percent(top1perc_Revenue, accuracy = 0.01)} of the total revenue."))
# # top_1perc_Revenue
# sorted_Asset <- Asset %>% 
#   arrange(-Total_Asset)
#  
# top_1perc_Asset <- sorted_Asset %>% 
#    filter(row_number() <= ceiling(0.01 * n()))
#   #dplyr:: summarize(Top_1 = sum(Total_Asset)/ sum(sorted_Asset$Total_Asset),na.rm = TRUE)
# 
# top1perc_Asset <- sum(top_1perc_Asset$Total_Total_Asset,na.rm=TRUE)/sum(sorted_Asset$Total_Asset, na.rm = TRUE)
library(shiny)
library(leaflet)
library(sf)
library(dplyr)
library(htmltools)
library(htmlwidgets)

# Load the shapefile and rename the province column
iran_map <- st_read("/Users/mahan/Desktop/TEIAS/3th-Semester/DevEcon1/Project2/Mapping/iranmap/irn_admbnda_adm1_unhcr_20190514.shp")
## Reading layer `irn_admbnda_adm1_unhcr_20190514' from data source 
##   `/Users/mahan/Desktop/TEIAS/3th-Semester/DevEcon1/Project2/Mapping/iranmap/irn_admbnda_adm1_unhcr_20190514.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 31 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 44.03189 ymin: 25.05932 xmax: 63.33327 ymax: 39.78165
## Geodetic CRS:  WGS 84
iran_map <- rename(iran_map, province = ADM1_EN)

# Function to generate the Leaflet map
generateLeafletMap <- function(data, metric, metric_label) {
  # Set labels for map
  labels <- sprintf("<strong>%s</strong><br/>%s: %s", 
                    data$province, 
                    metric_label, 
                    format(data[[metric]], big.mark = ",", scientific = FALSE))
  
  # Create the color palette
  pal <- colorNumeric(palette = "PuBu", domain = data[[metric]])
  
  # Generate the map
  leaflet(data) %>%
    addProviderTiles(providers$CartoDB.Positron) %>%
    addPolygons(
      fillColor = ~pal(data[[metric]]),
      fillOpacity = 1,
      weight = 0.3,
      smoothFactor = 0.3,
      label = lapply(labels, htmltools::HTML),
      labelOptions = labelOptions(
        style = list("font-family" = "serif", "font-weight" = "bold", "padding" = "3px 8px",
                     "box-shadow" = "3px 3px rgba(0,0,0,0.25)",
                     "font-size" = "13px", "border-color" = "rgba(0,0,0,0.5)"),
        direction = "auto"
      ),
      highlightOptions = highlightOptions(color = "darkred", weight = 3, bringToFront = TRUE)
    ) %>%
    addLegend(
      pal = pal,
      values = data[[metric]],
      position = "topright",
      title = metric_label
    )
}

# UI
ui <- fluidPage(
  titlePanel("Atlas Poverty of Iran"),
  sidebarLayout(
    sidebarPanel(
      selectInput("selectedProvince", "Choose a Province", 
                  choices = c("Iran" = "All", unique(iran_map$province))),
      selectInput("selectedDataset", "Choose a Poverty Measure:", 
                  choices = c(
                    "Absolute Poverty Line" = "TC_province1",
                    "Relative Poverty Line (Traditional)" = "relative_line",
                    "Relative Poverty Line (Hybrid)" = "hybrid_poverty_line",
                    "Income Poverty Line" = "income_poverty_line",
                    "Poverty Gap Index" = "poverty_gap_index",
                    "Multidimensional Poverty Index" = "MDP_index_province"
                  )),
      selectInput("selectedMetric", "Choose an Approach:", choices = NULL),
      selectInput("urbanRural", "Choose an Area Type:", 
                  choices = c("All", "Urban", "Rural"), 
                  selected = "All"),
      actionButton("resetMap", "Reset View"),
      width = 3
    ),
    mainPanel(
      leafletOutput("mapDisplay", height = "800px"),
      width = 9
    )
  )
)

# Server
server <- function(input, output, session) {
  # Update the 'Choose an Approach' dropdown dynamically based on selected dataset
  observe({
    selected_dataset <- input$selectedDataset
    
    metric_choices <- switch(selected_dataset,
      "TC_province1" = c("Method 1" = "First_Method", "Method 2" = "Second_Method"),
      "relative_line" = c("50 % of Median" = "Half_Median"),
      "hybrid_poverty_line" = c("Cost of bottom 20%" = "cost_new_bundle"),
      "income_poverty_line" = c("Income Poverty Line 1" = "income_poverty_line1", "Income Poverty Line 2" = "income_poverty_line2"),
      "poverty_gap_index" = c("PGI Method 1" = "PGI_1", "PGI Method 2" = "PGI_2"),
      "MDP_index_province" = c("Equally Weighted MPI" = "weighted_average_MPI")
    )
    
    updateSelectInput(session, "selectedMetric", choices = metric_choices)
    
    # Update the 'Choose an Area Type' dropdown based on selected dataset
    if (selected_dataset == "MDP_index_province") {
      updateSelectInput(session, "urbanRural", choices = c("All"), selected = "All")
    } else {
      updateSelectInput(session, "urbanRural", choices = c("All", "Urban", "Rural"), selected = "All")
    }
  })
  
  # Render the Leaflet map
  output$mapDisplay <- renderLeaflet({
    req(input$selectedMetric)
    
    selected_dataset <- input$selectedDataset
    metric <- input$selectedMetric
    metric_label <- names(which(sapply(switch(selected_dataset,
      "TC_province1" = c("First Method" = "First_Method", "Second Method" = "Second_Method"),
      "relative_line" = c("Half of Median Income" = "Half_Median"),
      "hybrid_poverty_line" = c("Cost of New Bundle" = "cost_new_bundle"),
      "income_poverty_line" = c("Income Poverty Line 1" = "income_poverty_line1", "Income Poverty Line 2" = "income_poverty_line2"),
      "poverty_gap_index" = c("PGI Method 1" = "PGI_1", "PGI Method 2" = "PGI_2"),
      "MDP_index_province" = c("Weighted Average MPI" = "weighted_average_MPI")
    ), identical, metric)))
    
    # Filter dataset
    data <- switch(selected_dataset,
      "TC_province1" = TC_province1,
      "relative_line" = relative_line,
      "hybrid_poverty_line" = hybrid_poverty_line,
      "income_poverty_line" = income_poverty_line,
      "poverty_gap_index" = poverty_gap_index,
      "MDP_index_province" = MDP_index_province
    )
    
    # Merge with shapefile
    filtered_data <- merge(iran_map, data, by = "province")
    
    # Apply filters for area type and province
    if (input$urbanRural != "All" && selected_dataset != "MDP_index_province") {
      filtered_data <- filtered_data[filtered_data$Urban_Rural == input$urbanRural, ]
    }
    if (input$selectedProvince != "All") {
      filtered_data <- filtered_data[filtered_data$province == input$selectedProvince, ]
    }
    
    # Generate the map
    generateLeafletMap(filtered_data, metric, metric_label)
  })
  
  # Reset map view
  observeEvent(input$resetMap, {
    updateSelectInput(session, "selectedProvince", selected = "All")
    updateSelectInput(session, "urbanRural", selected = "All")
    leafletProxy("mapDisplay", session) %>%
      setView(lng = 110, lat = 32, zoom = 4)
  })
}

# Run the app
shinyApp(ui = ui, server = server)
Shiny applications not supported in static R Markdown documents
# Save the HTML widget
#saveWidget(app, "/Users/mahan/Desktop/shiny_app_output.html")