require(knitr)
## Loading required package: knitr
opts_knit$set(root.dir = normalizePath('../'))

knitr::opts_chunk$set(fig.width=12, fig.height=8, fig.path='Figs/',
                      echo=FALSE, warning=FALSE, message=FALSE)
# Libraries
library(dplyr)
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
library(ggplot2)
library(lubridate)
library(scales)

We will use the data generated in the previous report to plot some statistics for Bangalore.

## Source: local data frame [894 x 3]
## Groups: Age [?]
## 
##      Age      Dates  Count
##    (int)     (time)  (int)
## 1     18 2012-07-01  19318
## 2     18 2012-12-01   2010
## 3     18 2013-01-01  32637
## 4     18 2013-04-01  58098
## 5     18 2013-05-01  58276
## 6     18 2013-10-01 161829
## 7     18 2014-01-01 191720
## 8     18 2014-03-01 176060
## 9     18 2014-10-01 179034
## 10    18 2015-01-01 171810
## ..   ...        ...    ...

Overall population record

In 2011, the Census gave the following data for Bangalore

The rest of the analysis assumes that the inaccuracy is uniformly spread, is akin to white noise, and will cancel itself out.
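As a rough illustration of this assumption (not part of the original analysis), the sketch below adds zero-mean noise to made-up counts and checks how far the total moves. The counts, the 5% noise level, and the seed are all hypothetical values chosen for the example.

```r
set.seed(42)                      # arbitrary seed, for reproducibility only
true_counts <- rep(1000, 1000)    # hypothetical counts, not real voter data

# Zero-mean noise with a standard deviation of 5% of each count (assumed)
noise <- rnorm(length(true_counts), mean = 0, sd = 0.05 * true_counts)
noisy_counts <- true_counts + noise

# Relative error of the aggregate after adding the noise
relative_error <- abs(sum(noisy_counts) - sum(true_counts)) / sum(true_counts)
```

Individual entries can be off by several percent, yet the total barely changes because the errors largely cancel. This only holds if the errors are unbiased; a systematic over- or under-count would not cancel.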

Year-wise change in demographic pattern

Age Group analysis

The voters can be classified in the following categories

Age Range   Category
0-18        Minor
18-22       Student
22-27       Single Worker
27-30       Newly Married
30-40       Married
40-50       Mid-Level
50-60       Senior-Level
60+         Retired
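The categories above could be assigned with base R's `cut()`. This is a minimal sketch rather than the code used in the report. Because the listed ranges share endpoints, it assumes each range is closed on the left and open on the right, so age 18 counts as Student and age 60 as Retired. The name `categorise_age` and the sample ages are hypothetical.

```r
# Boundaries taken from the table; Inf closes the open-ended "60+" group
age_breaks <- c(0, 18, 22, 27, 30, 40, 50, 60, Inf)
age_labels <- c("Minor", "Student", "Single Worker", "Newly Married",
                "Married", "Mid-Level", "Senior-Level", "Retired")

# Assumption: intervals are [lower, upper), hence right = FALSE
categorise_age <- function(age) {
  cut(age, breaks = age_breaks, labels = age_labels, right = FALSE)
}

# Illustrative ages only, not taken from the voter list
sample_ages <- c(17, 18, 21, 27, 35, 45, 59, 60, 83)
sample_categories <- categorise_age(sample_ages)
```

Using a factor keeps the categories in age order, which is convenient when the result is later used as a plotting axis or grouping variable.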

Interpolating missing data.

Children aged 0-18 are not part of this database. There are three ways we can interpolate this data:

  1. Use the census data and scale for changes in numbers between census and voter list.
  2. Assume a uniform spread that gives a 60:40 split between voters and minors.
  3. Interpolate the data using average family size and fertility numbers.
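The second option can be sketched as follows. This is a hedged illustration, not the report's own code: the function name `estimate_minors_uniform`, the choice of ages 0 to 17 for minors, and the input total are assumptions made for the example.

```r
# If voters are 60% of the population, minors are voters * 40 / 60.
# That total is then spread evenly over the minor ages.
estimate_minors_uniform <- function(total_voters, voter_share = 0.6,
                                    minor_ages = 0:17) {
  total_minors <- total_voters * (1 - voter_share) / voter_share
  per_age <- total_minors / length(minor_ages)
  data.frame(Age = minor_ages,
             Count = rep(per_age, length(minor_ages)))
}

# Hypothetical voter total, used only to show the call
minors_estimate <- estimate_minors_uniform(600000)
```

A flat spread is the simplest choice, but it cannot reproduce any real variation in birth numbers from year to year.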

To check how far off our reasoning is, let us compare the census age data for Bangalore Urban District. While the numbers in the census data are different, this data also shows a sharp discontinuity around the age of 15-16 years. Another interesting pattern common to both the census data and the 2012 voters list is the regular spike in age. In the census data this spike occurs every 10 years and may indicate approximation by the enumerator…

* Average of 2 children per family.
* Mother's age at first childbirth: 26
* Mother's age at second childbirth: 29
* Mother's age at third childbirth: 32

(Note: The average numbers for India are not available online; the estimates for western developed countries are 29-30 for first childbirth, so slightly lower numbers are used here.)
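One possible reading of the family-based approach is sketched below. It is a hypothetical illustration, not the method used in the report. It assumes that half of the voters at each age are women, that every woman had a first child at 26 and a second at 29, and that third births are ignored to match the average of 2 children per family. The function name and the toy data are made up.

```r
# Children of age a are attributed to mothers who are now a + 26
# (first child) or a + 29 (second child) years old.
estimate_minors_family <- function(voters, birth_ages = c(26, 29),
                                   minor_ages = 0:17, female_share = 0.5) {
  counts <- sapply(minor_ages, function(a) {
    mother_ages <- a + birth_ages   # current ages of the mothers
    sum(voters$Count[voters$Age %in% mother_ages]) * female_share
  })
  data.frame(Age = minor_ages, Count = counts)
}

# Toy voter table with a flat 1000 voters per age (not real data)
toy_voters <- data.frame(Age = 18:60, Count = 1000)
family_estimate <- estimate_minors_family(toy_voters)
```

Unlike a uniform spread, this ties the number of children at each age to the size of the adult cohorts, so bumps in the voter age distribution carry over into the estimate for minors.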

Sanity check: are minors 40% of the population?
Total population in database = 8.8459967 × 10^7
Minors = 2.8257047 × 10^7
Minors percentage = 31.9433162%, which is far less than the expected 40%.
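The sanity check is simple arithmetic and can be reproduced from the two totals quoted above. The variable names are hypothetical, and the inputs are the rounded figures from the text, so the result matches the reported percentage only to a few decimal places.

```r
total_population <- 8.8459967e7   # total population in database, as quoted
minors <- 2.8257047e7             # interpolated minors, as quoted

# Share of minors in the combined population, in percent
minors_pct <- 100 * minors / total_population

# Gap to the expected 40%, in percentage points
shortfall_pct <- 40 - minors_pct
```

The share comes out at roughly 32%, about 8 percentage points below the expected 40%.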

Replotting the population summary graphics, we see a lack of smooth transition between the minors and the voting population. This may be due to two possible reasons.