Births
This is an independent study conducted by Nikolaos Kouvoutsakis. Its main subject is to monitor and explore the number of births in Greece from 2001 to 2016. We try to identify the key factors that influence the phenomenon and then predict the future outcome. All data used is publicly available and was downloaded from the official website of the Hellenic Statistical Authority. The study is divided into 5 key segments (Parts), listed below, and was written entirely in the R statistical programming language.
Main Parts of the study
No. of Births in Greece (Development through the years)
We notice a decrease in the number of births when comparing two periods: before (2001-2010) and during (2011-2016) the economic crisis. The trend also changes between the two periods, from upward to downward.
The next step is to examine whether the results are driven by specific Greek regions or whether the drop in births is observed across the whole country.
We will run a t-test to examine whether the drop in births between these two periods is statistically significant.
##
## Welch Two Sample t-test
##
## data: Value by Period
## t = 4.4236, df = 10.836, p-value = 0.00106
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 6706.309 20037.691
## sample estimates:
## mean in group 2001-2010 mean in group 2011-2016
## 109494 96122
There is a statistically significant difference in the number of births in Greece between the periods before and during the economic crisis (Welch t-test, p = 0.001): the annual mean falls from 109,494 to 96,122. Births are dropping, and this trend is observed across all of the country’s regions.
We will now examine how each age group of women contributes to the total number of births each year.
We notice (Figure 5a) that 96%-97% of total annual births come from women aged 20-44. The remaining 3%-4% come from ages outside this group, and that number of births is steady across the years. We can therefore conclude that practically all of the fluctuation in the number of births comes from the 20-44 age group.
The graph above (Figure 5b) also shows some interesting differences between age groups. The number of births to women aged 35 to 44 increases steadily both before and during the economic crisis. In contrast, the number of births to women aged 20 to 34 decreases during the economic crisis. Most interesting are:
The number of births to women aged 30-34 was increasing before the economic crisis but is decreasing during it.
The number of births to women aged 25-29 is decreasing rapidly during the economic crisis.
Below we see an overall picture of births per age group, both before and during the economic crisis.
The previous charts show that the negative contributors to the number of births during the economic crisis are entirely women aged 20 to 34. Women aged 35 to 44 are positive contributors both before and during the economic crisis, but they cannot offset the negative contribution from ages 20 to 34. This is the main reason that the total number of births in Greece is decreasing.
Before we proceed with this Part of the study, we must introduce some Definitions.
Total Fertility Rate (TFR) - The mean number of children that would be born to a woman during her lifetime if she were to spend her childbearing years conforming to the age-specific fertility rates measured in a given year.
Age-Specific Fertility Rate (ASFR) - The number of live births per 1000 women in a specific age group, for a specified geographic area and a specific period, usually a calendar year.
The total number of births can then be calculated from the ASFR and the corresponding women population using the formula below.
Births = ASFR × Corresponding Women Population / 1000
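As a quick check of this formula, the sketch below wraps it in a small R function. The function name `births_from_asfr` is ours and is not part of the original analysis. The inputs are the 2001-2010 means for women 20-44 that appear in the t-test outputs of this study, so the result only approximates the average annual births of that group.

```r
# Hypothetical helper (the name is ours): births implied by an ASFR
# (births per 1000 women) and the corresponding women population.
births_from_asfr <- function(asfr, women_population) {
  asfr * women_population / 1000
}

# 2001-2010 means for women 20-44, as reported in this study's t-test outputs.
asfr_mean <- 53.31565
population_mean <- 1984328

# Approximate average annual births to women 20-44 before the crisis.
births_from_asfr(asfr_mean, population_mean)
```

The result is close to 96%-97% of the 2001-2010 mean of total births (109,494), which is the share attributed to this age group.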
We will explore these two factors, ASFR and the corresponding women population, and also their variation by year and by period. They are investigated only for women aged 20 to 44, for the two reasons established earlier: this group accounts for 96%-97% of annual births, and births outside it are steady across the years.
We will explore both ASFR and TFR in the graphs below and also run a t-test on each, comparing the two periods.
##
## Welch Two Sample t-test
##
## data: Births.to.Population by Period
## t = 0.52945, df = 12.924, p-value = 0.6055
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.05463788 0.09008459
## sample estimates:
## mean in group 2001-2010 mean in group 2011-2016
## 1.322755 1.305032
##
## Welch Two Sample t-test
##
## data: Births.to.Population by Period
## t = 1.444, df = 13.138, p-value = 0.1722
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.8859188 4.4688734
## sample estimates:
## mean in group 2001-2010 mean in group 2011-2016
## 53.31565 51.52417
The Total Fertility Rate for women aged 20 to 44 is almost the same before and during the economic crisis (1.32 vs 1.31, p = 0.61), so it is not the reason that the number of births decreases during the crisis. The same holds for the ASFR of women 20-44 (53.3 vs 51.5, p = 0.17).
We will explore the ASFR of each age group in the graphs below and also run a t-test per age group, comparing the two periods under investigation.
##
## Welch Two Sample t-test
##
## data: Births.to.Population by Period
## t = 7.5084, df = 5.5851, p-value = 0.0004033
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 8.759992 17.461601
## sample estimates:
## mean in group 2001-2010 mean in group 2011-2016
## 43.65388 30.54309
##
## Welch Two Sample t-test
##
## data: Births.to.Population by Period
## t = 4.9175, df = 5.3113, p-value = 0.003737
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 4.892918 15.228316
## sample estimates:
## mean in group 2001-2010 mean in group 2011-2016
## 83.89394 73.83333
##
## Welch Two Sample t-test
##
## data: Births.to.Population by Period
## t = -2.0189, df = 12.745, p-value = 0.06505
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -14.2853631 0.4981226
## sample estimates:
## mean in group 2001-2010 mean in group 2011-2016
## 88.00530 94.89892
##
## Welch Two Sample t-test
##
## data: Births.to.Population by Period
## t = -3.6259, df = 13.609, p-value = 0.002871
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -15.659355 -3.999442
## sample estimates:
## mean in group 2001-2010 mean in group 2011-2016
## 41.01857 50.84797
##
## Welch Two Sample t-test
##
## data: Births.to.Population by Period
## t = -3.9915, df = 13.936, p-value = 0.001349
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4.464672 -1.342777
## sample estimates:
## mean in group 2001-2010 mean in group 2011-2016
## 7.979343 10.883068
We notice important differences when examining ASFR per age group. For women aged 20 to 29, ASFR is significantly lower during the economic crisis than before. For women aged 30 to 34, the difference between the two periods is not statistically significant at the 5% level (p = 0.065). For women aged 35 to 44, ASFR is significantly higher during the economic crisis than before. When combined in a single group (women 20-44), these opposing movements result in a similar average TFR and average ASFR for both periods.
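The way the age-group rates combine into a TFR can be sketched in a few lines of R. The period means are copied from the five t-test outputs above. The age-group labels are an assumption on our part (we take the outputs to be ordered 20-24 through 40-44), and the function name is ours. Because each group spans five years of age, the TFR over ages 20-44 is taken as five times the sum of the group ASFRs, divided by 1000.

```r
# Period means of ASFR (births per 1000 women) from the t-test outputs above.
# Assumption: the outputs are ordered 20-24, 25-29, 30-34, 35-39, 40-44.
asfr_before <- c("20-24" = 43.65388, "25-29" = 83.89394, "30-34" = 88.00530,
                 "35-39" = 41.01857, "40-44" = 7.979343)
asfr_during <- c("20-24" = 30.54309, "25-29" = 73.83333, "30-34" = 94.89892,
                 "35-39" = 50.84797, "40-44" = 10.883068)

# Each group covers 5 single years of age, so TFR = 5 * sum(ASFR) / 1000.
tfr_from_asfr <- function(asfr, width = 5) width * sum(asfr) / 1000

tfr_before <- tfr_from_asfr(asfr_before)
tfr_during <- tfr_from_asfr(asfr_during)
round(c(before = tfr_before, during = tfr_during), 4)

# Change per age group: negative for the younger groups, positive for the older.
round(asfr_during - asfr_before, 2)
```

Under these assumptions the two TFR values agree with the period means in the TFR t-test output, which illustrates how the decline at ages 20-29 is largely cancelled by the increase at ages 30-44.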
We have created three major age groups of women in order to investigate the women population further:
As an early sign, we see that the population of women of childbearing age (20-44) was quite stable up until 2010 but decreases sharply during the economic crisis (Figure 11).
##
## Welch Two Sample t-test
##
## data: Population by Period
## t = 5.4159, df = 5.3626, p-value = 0.002339
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 96137.19 263350.35
## sample estimates:
## mean in group 2001-2010 mean in group 2011-2016
## 1984328 1804584
There is a statistically significant difference in the women (20-44) population between the periods before (2001-2010) and during (2011-2016) the economic crisis. On average, this population is lower by about 180,000 persons, roughly 9%, during the economic crisis than in the period before. The forthcoming graphs explore which age groups drive this result.
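The test above can be reproduced from figures published in this study. The sketch below is a minimal version: the yearly values are copied from the women (20-44) population table given later in the text, while the data frame and variable names are ours. It relies on the fact that `t.test()` performs Welch's unequal-variance test by default.

```r
# Women (20-44) population per year, copied from the table in this study.
pop <- data.frame(
  Year = 2001:2016,
  Population = c(1996908, 2002956, 2003652, 2000017, 1993781, 1987726,
                 1978176, 1973175, 1962058, 1944827, 1913758, 1869336,
                 1822914, 1781181, 1738102, 1702212)
)
pop$Period <- ifelse(pop$Year <= 2010, "2001-2010", "2011-2016")

# Welch two-sample t-test (var.equal = FALSE is the default).
welch <- t.test(Population ~ Period, data = pop)
welch$estimate  # group means
welch$p.value

# Average difference between the two periods, in persons and as a share.
mean_drop <- unname(welch$estimate[1] - welch$estimate[2])
mean_drop
mean_drop / unname(welch$estimate[1])
```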
We see that for women aged 35-44 the population is quite stable before and during the economic crisis. In contrast, for women aged 20-34 the population decreases rapidly during the economic crisis.
In Figure 15 we explore the variation of the factors (Population, ASFR) and their result (Births) per age group in one combined graph.
Births Contributors
What is causing the decrease in births during the economic crisis compared with the period before?
It is important to note again that all variables used in our model refer to women aged 20-44, for the two reasons mentioned earlier.
Below we explore the pairwise relationships between births, women population and ASFR.
We fit a multiple linear regression model of births on the two basic factors analyzed in the previous chapters (ASFR, women population). The coefficients of the fitted model are shown below.
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | -1.005688e+05 | 840.0847671 | -119.7127 | 0 |
| Population | 5.135380e-02 | 0.0003526 | 145.6411 | 0 |
| ASFR | 1.958376e+03 | 12.2852689 | 159.4084 | 0 |
All of the diagnostics above indicate no significant issue with the fitted model, so we consider it suitable for our case.
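To make the coefficient table concrete, the sketch below evaluates the fitted equation for a given population and ASFR. In R such a model would typically be fitted with `lm(Births ~ Population + ASFR, data = ...)`. The yearly input series are not reproduced in this text, so the sketch only reuses the reported coefficients. The helper name `predict_births` is ours.

```r
# Coefficients as reported in the regression table above.
coefs <- c(intercept = -1.005688e+05,
           Population = 5.135380e-02,
           ASFR = 1.958376e+03)

# Hypothetical helper (the name is ours): fitted births for given inputs.
predict_births <- function(population, asfr) {
  unname(coefs["intercept"] +
           coefs["Population"] * population +
           coefs["ASFR"] * asfr)
}

# Evaluated at the 2001-2010 means for women 20-44 reported in the t-tests.
predict_births(population = 1984328, asfr = 53.31565)

# Effect of 10,000 more women, or of one more birth per 1000 women.
predict_births(1994328, 53.31565) - predict_births(1984328, 53.31565)
predict_births(1984328, 54.31565) - predict_births(1984328, 53.31565)
```

At the 2001-2010 means the fitted value is close to what the formula Births = ASFR × Population / 1000 gives for the same inputs.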
In an unadjusted relationship between two variables Y and X1, we model Y as a linear function of X1 alone. We adjust the relationship between Y and X1 by removing the effect that other variables (X2, X3, …) have on each of them.
It is important to note that the adjusted relationship between births and each main factor is completely linear.
In this chapter we identify the main factors that influence the women population and predict the population up until 2020.
In general, next year’s population is estimated with Formulation.1 below:
Next Year’s Population = Previous Year’s Population + Next Year’s Births - Next Year’s Deaths + Next Year’s Immigration Balance
We need some further definitions that are better suited to examining the population of each age group of women.
AGS.IB - Age Group Specific Immigration Balance = Age Group Incoming Immigration in a given Year - Age Group Outgoing Immigration in a given Year
AGS.TPB - Age Group Specific Transferred Population Balance = Population Entering The Age Group in a given Year - Population Leaving The Age group in a given Year
We introduce AGS.TPB (Age Group Specific Transferred Population Balance) to measure the effect of aging. When we examine the population of a given age group, for example 20-24, by next year the 24-year-olds will have left the group and the 19-year-olds will have entered it. To estimate the effect of aging on that group, we add next year’s AGS.TPB to the previous year’s group population.
To examine how the population of a given age group develops through the years, we use Formulation.2a below:
Next year’s Age Group Population = Previous Year’s Age Group Population + Next Year’s AGS.TPB + Next Year’s AGS.IB - Next Year’s Deaths
We can isolate the impact of aging within a given age group by instead using Formulation.2b below:
Next year’s Age Group Population = Previous Year’s Age Group Population + Next Year’s AGS.TPB
Births influence the result depending on the age group examined and the reference year used. In our research births do not influence the outcome, which is why they have been excluded from the formulations.
For example, 2011 births will influence the population of women aged 20-44 only after at least 20 years, starting from 2031. If we use 2011 as a reference year, we can estimate the women population for the upcoming years with Formulation.2a alone, without taking births from 2011 onward into account, up until 2031.
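Formulation.2a and Formulation.2b can be written as one small R function, sketched below. The function and argument names are ours. Apart from the starting population (the 2016 figure for women 20-24 from the population table in this study), every input is an illustrative assumption and not a figure from the study.

```r
# Formulation.2a as a function; with ib = 0 and deaths = 0 it reduces
# to Formulation.2b (aging only). All names are ours.
next_group_population <- function(previous, tpb, ib = 0, deaths = 0) {
  previous + tpb + ib - deaths
}

previous <- 269192  # women 20-24 in 2016, from the table in this study

# Illustrative inputs only (NOT figures from the study).
entering <- 52000   # assumed: 19-year-olds who enter the group
leaving  <- 56000   # assumed: 24-year-olds who leave the group
tpb      <- entering - leaving  # AGS.TPB

aging_only <- next_group_population(previous, tpb)                          # 2b
full       <- next_group_population(previous, tpb, ib = -500, deaths = 60)  # 2a
c(aging_only = aging_only, full = full)
```

Keeping the immigration balance and deaths as optional arguments makes it easy to compare the aging-only path with the full estimate for the same starting population.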
We will examine all parameters of Formulation.2a.
We will explore the women (20-44) population through the years. The following graph shows the actual women population up until 2011 and its development from 2012 to 2020 when only the effect of aging is applied.
We notice that aging has, and will continue to have in the upcoming years, a large effect on the women (20-44) population. The population is decreasing rapidly because the number of persons entering the 20-44 age group is significantly smaller than the number of persons leaving it. If we split the 20-44 age group into smaller segments, this effect is visible across all subgroups, with the only exception of women aged 40-44, whose numbers remain steady.
We use %Deaths to population figures instead of the exact number of deaths for each year, in order to eliminate the impact of population size on the number of deaths: the larger the population in a given year, the more deaths are likely to be recorded for that year. The next graphs explore deaths per year for the age groups of interest.
%Deaths to population figures per age group and per year (results rounded to 2 decimal places)
| Year | 20-24 | 25-29 | 30-34 | 35-39 | 40-44 |
|---|---|---|---|---|---|
| 2001 | 0.03 | 0.03 | 0.04 | 0.06 | 0.09 |
| 2002 | 0.03 | 0.03 | 0.04 | 0.06 | 0.10 |
| 2003 | 0.02 | 0.03 | 0.04 | 0.06 | 0.09 |
| 2004 | 0.03 | 0.03 | 0.04 | 0.06 | 0.09 |
| 2005 | 0.03 | 0.04 | 0.04 | 0.06 | 0.10 |
| 2006 | 0.02 | 0.03 | 0.03 | 0.05 | 0.08 |
| 2007 | 0.03 | 0.03 | 0.04 | 0.06 | 0.10 |
| 2008 | 0.02 | 0.03 | 0.04 | 0.05 | 0.08 |
| 2009 | 0.03 | 0.03 | 0.04 | 0.06 | 0.08 |
| 2010 | 0.03 | 0.03 | 0.03 | 0.05 | 0.09 |
| 2011 | 0.03 | 0.03 | 0.04 | 0.06 | 0.08 |
| 2012 | 0.02 | 0.03 | 0.04 | 0.05 | 0.08 |
| 2013 | 0.02 | 0.02 | 0.04 | 0.05 | 0.08 |
| 2014 | 0.02 | 0.03 | 0.03 | 0.05 | 0.08 |
| 2015 | 0.02 | 0.02 | 0.04 | 0.05 | 0.08 |
| 2016 | 0.02 | 0.02 | 0.04 | 0.05 | 0.09 |
We notice that %Deaths to population figures for each Age Group are almost steady through the years, having very small variation. We can assume that we will not have any significant changes in the figures and it is quite safe to use the averages as an estimation for the upcoming years.
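The averaging step can be sketched in R from the rounded table above. The object names are ours, and because the inputs are rounded to two decimals the averages are only approximate. The last step applies the averages to the 2016 age-group populations from the population table in this study, treating the figures as percentages. This is our reading of "%Deaths to population".

```r
# %Deaths to population, 2001-2016, one vector per age group (table above).
death_pct <- cbind(
  "20-24" = c(0.03, 0.03, 0.02, 0.03, 0.03, 0.02, 0.03, 0.02,
              0.03, 0.03, 0.03, 0.02, 0.02, 0.02, 0.02, 0.02),
  "25-29" = c(0.03, 0.03, 0.03, 0.03, 0.04, 0.03, 0.03, 0.03,
              0.03, 0.03, 0.03, 0.03, 0.02, 0.03, 0.02, 0.02),
  "30-34" = c(0.04, 0.04, 0.04, 0.04, 0.04, 0.03, 0.04, 0.04,
              0.04, 0.03, 0.04, 0.04, 0.04, 0.03, 0.04, 0.04),
  "35-39" = c(0.06, 0.06, 0.06, 0.06, 0.06, 0.05, 0.06, 0.05,
              0.06, 0.05, 0.06, 0.05, 0.05, 0.05, 0.05, 0.05),
  "40-44" = c(0.09, 0.10, 0.09, 0.09, 0.10, 0.08, 0.10, 0.08,
              0.08, 0.09, 0.08, 0.08, 0.08, 0.08, 0.08, 0.09)
)
rownames(death_pct) <- 2001:2016

# Average rate per age group, used as the estimate for future years.
avg_pct <- colMeans(death_pct)
round(avg_pct, 3)

# Expected deaths in the following year for the 2016 populations.
pop_2016 <- c("20-24" = 269192, "25-29" = 292302, "30-34" = 333768,
              "35-39" = 398726, "40-44" = 408224)
round(avg_pct / 100 * pop_2016)
```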
Of all the factors that influence the women population, immigration is the most difficult to predict. We will explore how this factor develops through the years and make an assumption about the figures for the next years.
We notice that the immigration balance for each age group (here immigrants out minus immigrants in, the opposite sign of AGS.IB as defined above), after reaching its highest values in the years 2012 to 2014, tends toward zero in the following years. This is driven by the slow decrease in the number of persons leaving the country and, mostly, by the increase in the number of persons entering the country. It is important to note that the increase in the number of immigrants entering the country is almost entirely driven by immigrants coming from more and least developed countries.
We assume that the expected future growth of the Greek economy, combined with the expected decrease in the unemployment rate, will reduce the number of persons leaving the country. We also expect the number of immigrants entering the country to remain high. Based on these two assumptions, we assume that the immigration balance will remain at 2016 levels for the upcoming years, with a slightly downward trend.
In this chapter we estimate how the women population will develop in the upcoming years. The graphs below present actual figures from 2001 to 2016 and forecast figures for the period 2017 to 2020.
Assumptions used in the model:
| Year | 20-24 | 25-29 | 30-34 | 35-39 | 40-44 |
|---|---|---|---|---|---|
| 2001 | 391296 | 405092 | 426037 | 392418 | 382065 |
| 2002 | 389458 | 403685 | 421814 | 401699 | 386300 |
| 2003 | 383169 | 405048 | 416394 | 411421 | 387620 |
| 2004 | 372466 | 405019 | 414166 | 420338 | 388028 |
| 2005 | 358884 | 406420 | 411994 | 426818 | 389665 |
| 2006 | 345698 | 405172 | 412408 | 429466 | 394982 |
| 2007 | 331589 | 403075 | 412037 | 426376 | 405099 |
| 2008 | 324153 | 396474 | 414501 | 422166 | 415881 |
| 2009 | 318161 | 384094 | 414373 | 420258 | 425172 |
| 2010 | 313750 | 367501 | 414295 | 417712 | 431569 |
| 2011 | 307573 | 348927 | 409352 | 415608 | 432298 |
| 2012 | 302011 | 327808 | 401684 | 411616 | 426217 |
| 2013 | 291019 | 313894 | 389033 | 410119 | 418849 |
| 2014 | 282664 | 304486 | 372106 | 406955 | 414970 |
| 2015 | 274232 | 297884 | 351318 | 404069 | 410599 |
| 2016 | 269192 | 292302 | 333768 | 398726 | 408224 |
| 2017 | 264255 | 287416 | 316852 | 393714 | 405471 |
| 2018 | 262667 | 276850 | 306849 | 383481 | 404985 |
| 2019 | 260202 | 269267 | 299963 | 367813 | 402257 |
| 2020 | 259165 | 261559 | 295803 | 348704 | 400449 |
| Year | Population |
|---|---|
| 2001 | 1996908 |
| 2002 | 2002956 |
| 2003 | 2003652 |
| 2004 | 2000017 |
| 2005 | 1993781 |
| 2006 | 1987726 |
| 2007 | 1978176 |
| 2008 | 1973175 |
| 2009 | 1962058 |
| 2010 | 1944827 |
| 2011 | 1913758 |
| 2012 | 1869336 |
| 2013 | 1822914 |
| 2014 | 1781181 |
| 2015 | 1738102 |
| 2016 | 1702212 |
| 2017 | 1667708 |
| 2018 | 1634832 |
| 2019 | 1599502 |
| 2020 | 1565680 |
In this chapter we will explore factors that are influencing ASFR of Women 20-44. We will also determine the best regression model to describe ASFR in Greece.
In the next graphs we explore the number of total marriages in Greece per year and the number of marriages with brides of age 20-44.
The share of marriages in which the bride is aged 20-44, relative to the total number of marriages, is steadily above 90% for every year examined from 2001 to 2016.
The graph above indicates that approximately 92% of annual births in Greece come from married couples in which the wife is aged 20-44. There is also a clear indication that this percentage tends to decrease in the most recent years, while remaining very close to 90%.
The main conclusion is that in Greece almost all births come from married couples and very few from unmarried couples. That is why we choose to examine factors regarding households, focusing on households composed of a couple with no child or a couple with one child up to 16 years old. Below we introduce the factors we examine:
Below we examine the pairwise relationships between the variables.
Table with variable correlations.
| | ASFR | MAR | LONE | LONEINC | ONEKID | ONEKIDINC |
|---|---|---|---|---|---|---|
| ASFR | 1.0000000 | 0.7175201 | -0.5592530 | 0.9561516 | -0.0410042 | 0.8752013 |
| MAR | 0.7175201 | 1.0000000 | -0.5184784 | 0.7584690 | 0.2932424 | 0.6946554 |
| LONE | -0.5592530 | -0.5184784 | 1.0000000 | -0.3755550 | -0.4973094 | -0.3506315 |
| LONEINC | 0.9561516 | 0.7584690 | -0.3755550 | 1.0000000 | -0.1541345 | 0.9199109 |
| ONEKID | -0.0410042 | 0.2932424 | -0.4973094 | -0.1541345 | 1.0000000 | -0.2610956 |
| ONEKIDINC | 0.8752013 | 0.6946554 | -0.3506315 | 0.9199109 | -0.2610956 | 1.0000000 |
Best set of variables for each model size.
## MAR LONE LONEINC ONEKID ONEKIDINC
## 1 ( 1 ) " " " " "*" " " " "
## 2 ( 1 ) " " "*" "*" " " " "
## 3 ( 1 ) "*" "*" "*" " " " "
## 4 ( 1 ) "*" "*" "*" "*" " "
## 5 ( 1 ) "*" "*" "*" "*" "*"
Best model selection using Adjusted R2 and CP
## [1] 3
## [1] 2
Adjusted R2 indicates that the best model is the one with three variables, while CP points to the model with two variables. We choose to proceed with the two-variable model (LONE, LONEINC).
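The selection output above resembles what `regsubsets()` from the `leaps` package prints. As a dependency-free illustration of the same idea, the base-R sketch below swaps in a brute-force enumeration of every subset of predictors and scores each with adjusted R2 and Mallows' Cp. The data are simulated placeholders and are not the study's data; only the variable names are borrowed from the text.

```r
# Simulated placeholder data (NOT the study's data): 16 yearly observations.
set.seed(1)
n <- 16
d <- data.frame(MAR = rnorm(n), LONE = rnorm(n), LONEINC = rnorm(n),
                ONEKID = rnorm(n), ONEKIDINC = rnorm(n))
d$ASFR <- 50 - 3 * d$LONE + 5 * d$LONEINC + rnorm(n)

predictors <- setdiff(names(d), "ASFR")
# Residual variance of the full model, needed for Mallows' Cp.
full_sigma2 <- summary(lm(ASFR ~ ., data = d))$sigma^2

# Fit every subset of size k and record adjusted R2 and Cp.
fits <- do.call(rbind, lapply(seq_along(predictors), function(k) {
  do.call(rbind, combn(predictors, k, simplify = FALSE, FUN = function(vars) {
    fit <- lm(reformulate(vars, response = "ASFR"), data = d)
    rss <- sum(resid(fit)^2)
    data.frame(size = k, vars = paste(vars, collapse = " + "),
               adjr2 = summary(fit)$adj.r.squared,
               cp = rss / full_sigma2 - n + 2 * (k + 1))
  }))
}))

# Best subset of each size, then the size preferred by each criterion.
best <- do.call(rbind, lapply(split(fits, fits$size),
                              function(s) s[which.max(s$adjr2), ]))
best
best$size[which.max(best$adjr2)]  # size preferred by adjusted R2
best$size[which.min(best$cp)]     # size preferred by Mallows' Cp
```

As in the study, the two criteria need not agree. Cp penalizes each extra parameter more heavily, so it tends to point to the smaller model.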
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 33.597425 | 2.9690159 | 11.316014 | 0.0000285 |
| LONE | -27.831565 | 10.4065941 | -2.674416 | 0.0368091 |
| LONEINC | 0.013052 | 0.0013093 | 9.968945 | 0.0000590 |
The main subject of this research is to monitor and explore the number of births in Greece from 2001 to 2016. We try to identify the basic factors that influence the results and then develop a forecast model in order to estimate the number of births up until 2020.
There is a statistically significant difference in the number of births in Greece between the periods before (2001-2010) and during (2011-2016) the economic crisis. The number of births is decreasing, and this trend is observed across all of the country’s regions. The trend also changes between the two periods, from upward to downward.
Additional:
Births regression model
We can easily build a good regression model to predict and describe births using only two factors: the ASFR and the population of women 20-44.
The population of women 20-44 is influenced by the number of deaths, immigration and the effect of aging.
Below we present the final forecast of the women 20-44 population up until 2020.
ASFR Women 20-44.
Almost all births in Greece come from married couples. That is why we focused our research on households composed of a couple with no child or a couple with one child up to 16 years old.
We were able to build a good regression model to predict and describe the ASFR of women 20-44 using only two factors: LONE and LONEINC.
Births Forecast
We predict that the number of births in Greece will remain at low levels in the forthcoming years. This will be driven by the continuous decrease in the population of women aged 20 to 44, who are the major contributors to births. The decrease can be balanced or slightly reversed only if the ASFR of women 20-44 shows a remarkable improvement, which is not the most likely scenario. We have run three different cases for the ASFR, and present below the resulting births for the forthcoming years.
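The mechanics of such a scenario forecast can be sketched with the formula Births = ASFR × Population / 1000. The population figures below are the 2017-2020 forecasts from the women (20-44) population table in this study. The three ASFR values are illustrative assumptions of ours and are not the three cases used in the study; the "flat" value is simply the 2011-2016 mean reported in the ASFR t-test. The result covers births to women 20-44 only.

```r
# Forecast women (20-44) population, from the table in this study.
forecast_pop <- c("2017" = 1667708, "2018" = 1634832,
                  "2019" = 1599502, "2020" = 1565680)

# Assumed ASFR scenarios (births per 1000 women 20-44). Illustrative only,
# NOT the study's three cases; "flat" is the 2011-2016 mean ASFR.
scenarios <- c(low = 49, flat = 51.52417, high = 54)

# Births = ASFR x population / 1000: one row per scenario, one column per year.
births <- outer(scenarios, forecast_pop) / 1000
round(births)
```

With a constant ASFR the forecast births fall every year simply because the population falls, which is the mechanism described in the paragraph above.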