Data Presentation

Anna Mergoni

DemoBel

Demobel collects demographic information on the population residing in Belgium from 2001 to 2018. The data cover 71,251 individuals.

### DEMOBEL data
##################

library(haven)  # read_sas()
library(dplyr)  # left_join()

demobel.1 <- read_sas("C:/Users/u0125884/OneDrive - KU Leuven/Documents/HIVA/Reinvest/Data/Demo/c2020_119_dataset2_part1_c.sas7bdat")
demobel.2 <- read_sas("C:/Users/u0125884/OneDrive - KU Leuven/Documents/HIVA/Reinvest/Data/Demo/c2020_119_dataset2_part2_c.sas7bdat")
demobel.3 <- read_sas("C:/Users/u0125884/OneDrive - KU Leuven/Documents/HIVA/Reinvest/Data/Demo/c2020_119_dataset2_part3_c.sas7bdat")

# Check that every ID in part 2 also appears in part 1 (they all match)
table(demobel.2$ID_DEMO_C %in% demobel.1$ID_DEMO_C)

# Merge parts 2 and 3 on person ID and stock year
demobel <- left_join(demobel.2, demobel.3, by=c("ID_DEMO_C", "STOCK_YR"="YR_STOCK"))

Evolution of nationality composition

To do so, we grouped the country codes into macro-areas.
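One way to do such a grouping is a lookup table joined onto the data. The sketch below is only illustrative: the column names `NATIONALITY` and `macro_area`, the country codes, and the area labels are all assumptions, not the coding actually used in Demobel.

```r
library(dplyr)

# Hypothetical lookup table: country code -> macro-area (labels are assumptions)
area_lookup <- tibble(
  NATIONALITY = c("BE", "FR", "NL", "MA", "CD"),
  macro_area  = c("Belgium", "EU", "EU", "North Africa", "Sub-Saharan Africa")
)

# Toy data standing in for demobel; "XX" is a code missing from the lookup
toy <- tibble(
  ID_DEMO_C   = 1:5,
  NATIONALITY = c("BE", "FR", "MA", "CD", "XX")
)

# Attach the macro-area and send unmatched codes to a residual category
toy_area <- toy %>%
  left_join(area_lookup, by = "NATIONALITY") %>%
  mutate(macro_area = coalesce(macro_area, "Other / unknown"))

count(toy_area, macro_area)
```

Keeping the mapping in a separate table makes it easy to switch between finer and coarser groupings without touching the data.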

Evolution of nationality composition

Here we considered larger macro-areas.

Evolution of nationality composition

Here we check whether individuals change nationality over time (a preliminary check on the quality of our data).

| | Excluding people with Belgian nationality | Including people with Belgian nationality |
|---|---|---|
| Changed nationality | 26 | 3,695 |
| Did not change nationality | 9,925 | 67,419 |
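A check of this kind can be sketched as follows. This is not the code behind the table: `NATIONALITY` is a hypothetical column name, the data are made up, and "excluding people with Belgian nationality" is read here as excluding anyone who ever held it, which is an assumption.

```r
library(dplyr)

# Toy person-year data; NATIONALITY is a hypothetical column name
toy <- tibble(
  ID_DEMO_C   = c(1, 1, 1, 2, 2, 3, 3),
  STOCK_YR    = c(2001, 2002, 2003, 2001, 2002, 2001, 2002),
  NATIONALITY = c("MA", "MA", "BE", "FR", "FR", "BE", "BE")
)

# One row per person: more than one distinct nationality means a change
changes <- toy %>%
  group_by(ID_DEMO_C) %>%
  summarise(
    changed      = n_distinct(NATIONALITY, na.rm = TRUE) > 1,
    ever_belgian = any(NATIONALITY == "BE", na.rm = TRUE),
    .groups = "drop"
  )

# Including people with Belgian nationality
table(changes$changed)

# Excluding people who ever held Belgian nationality
table(changes$changed[!changes$ever_belgian])
```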

Evolution of gender composition

Similarly, we examine the gender composition and, as a preliminary check, whether the recorded gender changes.

| | Count over the years | Change in sex |
|---|---|---|
| Female | 584,939 | 0 |
| Male | 560,312 | 0 |
| NA | 137,267 | |

Evolution of Household Type

Census Data

#Census

census_2001_par <- read_sas("C:/Users/u0125884/OneDrive - KU Leuven/Documents/HIVA/Reinvest/Data/Census/tu_reinvest_c2001par_2020_119_c.sas7bdat")
census_2011_par <- read_sas("C:/Users/u0125884/OneDrive - KU Leuven/Documents/HIVA/Reinvest/Data/Census/tu_reinvest_c2011par_2020_119_c.sas7bdat")
census_2001_resp <- read_sas("C:/Users/u0125884/OneDrive - KU Leuven/Documents/HIVA/Reinvest/Data/Census/tu_reinvest_c2001resp_2020_119_c.sas7bdat")
census_2011_resp <- read_sas("C:/Users/u0125884/OneDrive - KU Leuven/Documents/HIVA/Reinvest/Data/Census/tu_reinvest_c2011resp_2020_119_c.sas7bdat")

#Merging Census
census_2001 <- left_join(census_2001_par, census_2001_resp, by = c("ID_DEMO_C"))
census_2001$year <- 2001
census_2011 <- left_join(census_2011_par, census_2011_resp, by = c("ID_DEMO_C"))
census_2011$year <- 2011

#merging Census with Demobel
demobel.census2001 <- left_join(demobel, census_2001, by = c("ID_DEMO_C", "STOCK_YR"="year"))
demobel.census <- left_join(demobel.census2001, census_2011, by = c("ID_DEMO_C", "STOCK_YR"="year"))
dim(demobel.census)
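A side effect of joining the two census waves one after the other is worth keeping in mind: any non-key column present in both waves comes back suffixed (`.x` for 2001, `.y` for 2011). The toy sketch below shows this, together with one possible alternative: stacking the waves and joining once. `EDU` is a hypothetical variable and the data are invented.

```r
library(dplyr)

# Toy stand-ins; EDU is a hypothetical census variable
demobel_toy <- tibble(
  ID_DEMO_C = c(1, 1, 2, 2),
  STOCK_YR  = c(2001, 2011, 2001, 2011)
)
census_2001_toy <- tibble(ID_DEMO_C = c(1, 2), EDU = c("low", "mid"),  year = 2001)
census_2011_toy <- tibble(ID_DEMO_C = c(1, 2), EDU = c("mid", "high"), year = 2011)

# Two successive joins: shared columns are kept apart with suffixes
two_joins <- demobel_toy %>%
  left_join(census_2001_toy, by = c("ID_DEMO_C", "STOCK_YR" = "year")) %>%
  left_join(census_2011_toy, by = c("ID_DEMO_C", "STOCK_YR" = "year"))
names(two_joins)  # includes EDU.x and EDU.y

# Alternative: stack the waves first, then join once
census_long <- bind_rows(census_2001_toy, census_2011_toy)
one_join <- demobel_toy %>%
  left_join(census_long, by = c("ID_DEMO_C", "STOCK_YR" = "year"))
```

Which layout is preferable depends on whether the 2001 and 2011 variables are meant to be compared side by side or treated as one series.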

Employment Status in census

The census provides information on the father’s, the mother’s, and the respondent’s employment status in 2001 and 2011. However, the definitions of the variable do not match perfectly between the two waves.

| Census 2001 | Census 2011 |
|---|---|
| Q1 | CAS = Situation de l’emploi (employment status) |
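Harmonizing two codings usually comes down to mapping each wave onto a common set of categories. The sketch below assumes the 2001 status sits in `Q1` and the 2011 status in `CAS`. Every code, label, and target category in it is a hypothetical placeholder, not taken from the census codebooks.

```r
library(dplyr)

# Placeholder mappings: wave-specific code -> common category (all assumed)
map_2001 <- c("1" = "employed", "2" = "unemployed", "3" = "inactive")
map_2011 <- c("EMP" = "employed", "UNE" = "unemployed",
              "PEN" = "inactive", "STU" = "inactive")

# Toy data: each row carries the variable of its own wave only
toy <- tibble(
  year = c(2001, 2001, 2011, 2011, 2011),
  Q1   = c("1", "3", NA, NA, NA),
  CAS  = c(NA, NA, "EMP", "STU", "UNE")
)

# Look up the common category using the mapping of the matching wave
toy <- toy %>%
  mutate(emp_status = if_else(year == 2001,
                              unname(map_2001[Q1]),
                              unname(map_2011[CAS])))

count(toy, year, emp_status)
```

Codes that have no counterpart in the other wave end up as `NA` here, which makes the non-matching definitions visible rather than hiding them.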

Father’s Employment Status

[Figure: panels for 2001 and 2011]

Mother’s Employment Status

[Figure: panels for 2001 and 2011]

Respondent’s Employment status

[Figure: panels for 2001 and 2011]

Education Level

The census also reports the education level of the respondent, the respondent’s mother, and the respondent’s father. Here too, the information is not harmonized between 2001 and 2011.

Census 2001 Census 2011

Father’s Education Level

[Figure: panels for 2001 and 2011]

Mother’s Education Level

[Figure: panels for 2001 and 2011]

Respondent’s Education Level

[Figure: panels for 2001 and 2011]

Sector of Employment

This information is available only for the year 2001

[Figure: panels for mother and father]

Property Regime

This information is also not completely harmonized between 2001 and 2011

[Figure: panels for 2001 and 2011]

IPCAL

IPCAL data are obtained from two sources:

(1) eu_self_ipcal, which contains only MS_TNJPI_INDEP and FL_IOE_A; note that eu_self_ipcal is identical to tf_reinvest_ipcal

(2) tf_silc_pit_rounded, which contains more detailed information about income.

IMPORTANT REMARK: tf_silc_pit_rounded can be merged with census and demobel, whereas eu_self_ipcal and tf_reinvest_ipcal cannot. We therefore lose these variables, which in any case were more than 90% NA.

Distribution of MS_TNJPI_INDEP

Data from eu_self_ipcal

[Figure: panels for average MS_TNJPI_INDEP and percentage of NA]
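The two quantities shown on this slide, the average of MS_TNJPI_INDEP and its share of missing values, could be computed per year along these lines. The `year` column name and all values are assumptions made for illustration.

```r
library(dplyr)

# Toy stand-in for eu_self_ipcal; `year` is an assumed column name
ipcal_toy <- tibble(
  year           = c(2005, 2005, 2005, 2006, 2006),
  MS_TNJPI_INDEP = c(1000, NA, NA, 3000, NA)
)

ipcal_summary <- ipcal_toy %>%
  group_by(year) %>%
  summarise(
    avg_indep = mean(MS_TNJPI_INDEP, na.rm = TRUE),   # average over non-missing values
    pct_na    = 100 * mean(is.na(MS_TNJPI_INDEP)),    # share of missing values, in %
    .groups = "drop"
  )

ipcal_summary
```

With a very high share of NA, the average is based on few observations, so the two columns are best read together.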

tf_silc_pit_rounded

From this dataset we consider the following variables

| Variable | Rounded by | Maximum | Description |
|---|---|---|---|
| MS_NET_JOINTLY_WAGE = Net wage | 1000 | 83000 | Net wages of the declaration |
| MS_NET_JOINTLY_UNEMPL = Net unemployment | 500 | 14500 | Net unemployment benefits of the declaration |
| MS_NET_JOINTLY_SICK = Net sick | 500 | 18500 | Net sickness and disability benefits of the declaration |
| MS_NET_JOINTLY_PENSION = Net pension | 1000 | 44000 | Net pensions of the declaration |

tf_silc_pit_rounded

… and also these ones

| Variable | Rounded by | Maximum | Description |
|---|---|---|---|
| MS_TOT_NET_PROF_INC = Prof income | 1000 | 83000 | Total net professional income of the declaration |
| MS_TOT_NET_TAXABLE_INC = Taxable income | 1000 | 83000 | Net taxable income of the declaration |
| MS_TOT_TAXES_DECL | ?? | ?? | ?? |

Income Composition

Data coming from tf_silc_pit_rounded

[Figure: panels for averages and sums]

Average Income Composition

Data coming from tf_silc_pit_rounded

[Figure: panels for net sources and total net sources]
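An average income composition can be obtained by averaging each net source and expressing it as a share of the total. The sketch uses the four MS_NET_JOINTLY_* variables of tf_silc_pit_rounded on invented values. Treating the composition as a share of the sum of the averages is an assumption, not necessarily the definition behind the slide.

```r
# Toy declarations with made-up amounts
pit_toy <- data.frame(
  MS_NET_JOINTLY_WAGE    = c(20000, 0, 30000),
  MS_NET_JOINTLY_UNEMPL  = c(0, 5000, 0),
  MS_NET_JOINTLY_SICK    = c(0, 0, 1000),
  MS_NET_JOINTLY_PENSION = c(0, 4000, 0)
)

# Average of each source across declarations
avg_sources <- colMeans(pit_toy, na.rm = TRUE)

# Share of each source in the sum of the averages
composition <- avg_sources / sum(avg_sources)

round(100 * composition, 1)
```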

PRIMA (LEEFLOON)

The variable MS_BEDRAG_OCMW is rounded to multiples of 500 and capped at 14000. It represents the annual living wage (leefloon).
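To make the effect of rounding and capping concrete, here is a minimal sketch. `round_and_cap()` is a hypothetical helper, and rounding to the nearest multiple (rather than up or down) is an assumption, since the data provider's exact rule is not stated here.

```r
# Hypothetical helper: round to the nearest multiple of `step`, then cap at `cap`
round_and_cap <- function(x, step, cap) {
  pmin(round(x / step) * step, cap)
}

raw <- c(120, 730, 13990, 20000)  # made-up raw amounts

# Settings reported for MS_BEDRAG_OCMW: step 500, cap 14000
rounded <- round_and_cap(raw, step = 500, cap = 14000)
rounded
```

Under this rule, any raw amount at or above the cap is recorded as 14000, so the top of the distribution is censored and averages understate the true mean.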

[Figure: panels for average amount (bedrag) and percentage of NA]

EUSILC

The EU-SILC data consist of four components:

(1) Household register data = dfile

(2) Personal register data = rfile

(3) Household data = hfile

(4) Personal data = pfile
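The four files are typically combined at two levels: households (dfile with hfile) and persons (rfile with pfile), with persons then linked to their household. The sketch assumes the usual EU-SILC key names (`DB010`/`DB030`, `HB010`/`HB030`, `RB010`/`RB030`/`RX030`, `PB010`/`PB030`). These names are an assumption to be checked against the delivered files, and all values are invented.

```r
library(dplyr)

# Toy files; key names follow the usual EU-SILC convention (assumed here)
dfile <- tibble(DB010 = 2010, DB030 = c(1, 2), DB040 = c("BE1", "BE2"))
hfile <- tibble(HB010 = 2010, HB030 = c(1, 2), HY020 = c(30000, 45000))
rfile <- tibble(RB010 = 2010, RB030 = c(101, 102, 201), RX030 = c(1, 1, 2))
pfile <- tibble(PB010 = 2010, PB030 = c(101, 201), PB190 = c(2, 1))

# Household level: register data + household questionnaire
households <- dfile %>%
  left_join(hfile, by = c("DB010" = "HB010", "DB030" = "HB030"))

# Person level: register data + personal questionnaire
persons <- rfile %>%
  left_join(pfile, by = c("RB010" = "PB010", "RB030" = "PB030"))

# Attach household information to every person
eusilc <- persons %>%
  left_join(households, by = c("RB010" = "DB010", "RX030" = "DB030"))

eusilc
```

Starting from rfile keeps every registered household member, including those without a personal questionnaire (person 102 in the toy data), whose pfile variables are then `NA`.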

General health (PH010)

Marital status (PB190)

ISCED (PE040)

Economic Status (PL030 or PL031)

Needs more investigation: are there differences between PL030 and PL031? Does the variable change in 2013?
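A first step for this investigation could be to tabulate, per survey year, how often each variable is filled in and whether they are ever filled in together. The sketch assumes the year is stored in `PB010`. The availability pattern in the toy data is invented and says nothing about the real files.

```r
library(dplyr)

# Toy pfile extract; the pattern of missing values is made up
pfile_toy <- tibble(
  PB010 = c(2008, 2008, 2012, 2012, 2014, 2014),
  PL030 = c(1, 3, NA, NA, NA, NA),
  PL031 = c(NA, NA, 1, 5, 2, NA)
)

availability <- pfile_toy %>%
  group_by(PB010) %>%
  summarise(
    n            = n(),
    pl030_filled = sum(!is.na(PL030)),
    pl031_filled = sum(!is.na(PL031)),
    both_filled  = sum(!is.na(PL030) & !is.na(PL031)),
    .groups = "drop"
  )

availability
```

If some years have both variables filled, a cross-tabulation such as `table(PL030, PL031)` on those rows would show how the categories map onto each other.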

Type of contract (PL140)

Income Distribution (cash)

What should I do with the people receiving 0?

Income Distribution (overall)

What should I do with the people receiving 0?
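The question is left open here. One possible first step, sketched on made-up numbers, is to report the share of zeros separately and to describe the distribution both with and without them, so that the choice does not silently drive the results.

```r
# Made-up incomes; zeros stand for people receiving nothing from this source
income <- c(0, 0, 12000, 18000, 25000, 0, 40000)

# Share of people receiving 0
share_zero <- mean(income == 0)
share_zero

# Distribution over everyone, zeros included
summary(income)

# Distribution among recipients only
recipients <- income[income > 0]
summary(recipients)
```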

Pension

Support income