Project Title: “Analysis of Factors Affecting Female Employment Rate”

NAME: Akshay Ratnawat

EMAIL: akshayratnawat@gmail.com

COLLEGE / COMPANY: Jawaharlal Nehru University (SSS)

table <- read.csv("gapminder.csv")
library(psych)
describe(table)[c(2,3,7,9,10,12,13,16),]
##                    vars   n    mean       sd  median trimmed     mad
## incomeperperson       2 190 8740.97 14262.81 2553.50 5610.69 3285.71
## alcconsumption        3 187    6.69     4.90    5.92    6.30    5.53
## femaleemployrate      7 178   47.55    14.63   47.55   47.57   12.68
## internetuserate       9 192   35.63    27.78   31.81   33.70   32.76
## lifeexpectancy       10 191   69.75     9.71   73.13   70.75    8.31
## polityscore          12 161    3.69     6.31    6.00    4.36    5.93
## relectricperperson   13 136 1173.18  1681.44  597.14  821.25  778.23
## urbanrate            16 203   56.77    23.84   57.94   56.73   28.70
##                       min       max     range  skew kurtosis      se
## incomeperperson    103.78 105147.44 105043.66  3.20    14.07 1034.73
## alcconsumption       0.03     23.01     22.98  0.61    -0.25    0.36
## femaleemployrate    11.30     83.30     72.00  0.02     0.07    1.10
## internetuserate      0.21     95.64     95.43  0.45    -1.08    2.00
## lifeexpectancy      47.79     83.39     35.60 -0.81    -0.47    0.70
## polityscore        -10.00     10.00     20.00 -0.71    -0.97    0.50
## relectricperperson   0.00  11154.76  11154.76  3.11    12.02  144.18
## urbanrate           10.40    100.00     89.60 -0.02    -1.02    1.67

From the table above, we make the following observations:

  1. Income per person deviates widely across countries: the standard deviation (14262.81) exceeds the mean (8740.97), and the gap between the minimum (103.78) and the maximum (105147.44) is very large.

  2. The female employment rate deviates much less across countries (standard deviation 14.63 around a mean of 47.55), and its level is low, which is a cause for concern. After Christine Lagarde's comment that making use of female labour in employment across the world can increase world GDP by 2%, countries should be seen working in this direction.

  3. Internet use rate and life expectancy are used here as proxies for development. The internet use rate is highly dispersed across countries, as internet access is a recent development in many of them. Life expectancy is far less dispersed. Life expectancy can be read as a first round of development in which countries are fast catching up, while internet use is a second round in which we can expect convergence in the near future.

  4. Polity score is a crucial measure of a country's political structure. It is calculated as the democracy score minus the autocracy score, so it measures how free and democratic a country is. We therefore treat it as a crucial factor affecting the female employment rate.

  5. Residential electricity consumption per person also deviates widely. We treat it as a crucial factor affecting the female employment rate as well, because residential consumption can serve as a proxy for street and public-place lighting, the two being correlated within a country.

  6. Urbanisation has a large impact on culture, women's freedom, and so on, which in turn affect the female employment rate in a country. We therefore also treat it as a crucial factor in determining the female employment rate.
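
The dispersion comparisons in the observations above can be made scale-free with the coefficient of variation (standard deviation divided by mean). The following is a minimal sketch, not part of the original analysis: it re-enters the `mean` and `sd` values printed by `describe()` above, and the name `stats` is an illustrative assumption. Polity score is left out because it takes negative values, for which this ratio is not meaningful.

```r
# Means and standard deviations copied from the describe() output above.
stats <- data.frame(
  variable = c("incomeperperson", "femaleemployrate", "internetuserate",
               "lifeexpectancy", "relectricperperson", "urbanrate"),
  mean = c(8740.97, 47.55, 35.63, 69.75, 1173.18, 56.77),
  sd   = c(14262.81, 14.63, 27.78, 9.71, 1681.44, 23.84)
)

# Coefficient of variation: sd relative to the mean (unitless).
stats$cv <- stats$sd / stats$mean

# Most dispersed variables first.
stats <- stats[order(stats$cv, decreasing = TRUE), ]
print(stats, digits = 3, row.names = FALSE)
```

On these figures, income per person and residential electricity have a standard deviation larger than their mean, while life expectancy is the least dispersed of the six.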

boxplot(table$incomeperperson,
        main = "Income Per Person",
        horizontal = TRUE,
        col = "blue",
        xlab = "2010 Gross Domestic Product per capita in constant 2000 US$.")

As the graph shows, income per person has many outliers.
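
To put a number on "many outliers", the values that a boxplot draws as individual points can be listed with `boxplot.stats()` from base R, which by default applies the same 1.5 × IQR whisker rule as `boxplot()`. This is a minimal sketch; the helper name `count_outliers` and the toy vector are illustrative assumptions, not part of the original analysis.

```r
# Count the points that fall beyond the boxplot whiskers
# (more than 1.5 * IQR outside the hinges); missing values are ignored.
count_outliers <- function(x) {
  length(boxplot.stats(x)$out)
}

# Toy data (hypothetical): one extreme value among small ones.
toy <- c(1, 2, 3, 4, 5, 100, NA)
count_outliers(toy)

# With the data loaded as above, the same call would be:
# count_outliers(table$incomeperperson)
```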

boxplot(table$femaleemployrate,
        main = "Female Employment Rate",
        horizontal = TRUE,
        col = "linen",
        xlab = "Female Employees (15+) % of Female Population")

boxplot(table$internetuserate,
        main = "Internet Use Rate",
        horizontal = TRUE,
        col = "blueviolet",
        xlab = "Internet Users per 100 persons")

boxplot(table$polityscore,
        main = "Polity Score",
        horizontal = TRUE,
        col = "blueviolet",
        xlab = "2009 Democracy score (Polity)")

library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
## 
##     %+%, alpha
theme_set(theme_bw())
ggplot(table, aes(x=`country`, y=polityscore, label=country)) + 
  geom_point(stat='identity', fill="black", size=4)  +
  geom_segment(aes(y = 0, 
                   x = `country`, 
                   yend = polityscore, 
                   xend = `country`), 
               color = "black") +
  geom_text(color="white", size=1) +
  labs(title="Diverging Polity Score") + 
  ylim(-10, 10) +
  coord_flip()
## Warning: Removed 52 rows containing missing values (geom_point).
## Warning: Removed 52 rows containing missing values (geom_segment).
## Warning: Removed 52 rows containing missing values (geom_text).
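
As noted in the observations, the polity score is the democracy score minus the autocracy score. A small worked sketch follows, assuming each component runs from 0 to 10 (an assumption that is consistent with the observed range of -10 to 10); the function name `polity_score` is hypothetical.

```r
# Polity score as democracy minus autocracy (each assumed to lie in 0..10).
polity_score <- function(democracy, autocracy) {
  stopifnot(all(democracy >= 0 & democracy <= 10),
            all(autocracy >= 0 & autocracy <= 10))
  democracy - autocracy
}

# A full democracy, a mixed regime, and a full autocracy.
polity_score(democracy = c(10, 5, 0), autocracy = c(0, 5, 10))
```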

boxplot(table$relectricperperson,
        main = "Residential Electricity Per Person",
        horizontal = TRUE,
        col = "mediumseagreen",
        xlab = "2008 residential electricity consumption, per person (kWh)")

boxplot(table$urbanrate,
        main = "Urbanisation Rate",
        horizontal = TRUE,
        col = "seagreen",
        xlab = "2008 urban population (% of total)")

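# Regress female employment rate on income per person
reg <- lm(femaleemployrate ~ incomeperperson, data = table)
# Predictor on the x-axis, response on the y-axis; raw values are plotted
# so that the points line up with the fitted line
plot(table$incomeperperson, table$femaleemployrate,
     xlab = "Income Per Person",
     ylab = "Female Employment Rate",
     main = "Relation Between Female Employment Rate and Income Per Person",
     col = "red")
abline(reg, col = "black")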

# Regress female employment rate on internet usage rate
reg1 <- lm(femaleemployrate ~ internetuserate, data = table)
plot(table$internetuserate, table$femaleemployrate,
     xlab = "Internet Usage Rate",
     ylab = "Female Employment Rate",
     main = "Relation Between Female Employment Rate and Internet Usage Rate",
     col = "blue")
abline(reg1, col = "red")

# Regress female employment rate on life expectancy
reg2 <- lm(femaleemployrate ~ lifeexpectancy, data = table)
plot(table$lifeexpectancy, table$femaleemployrate,
     xlab = "Life Expectancy",
     ylab = "Female Employment Rate",
     main = "Relation Between Female Employment Rate and Life Expectancy",
     col = "black")
abline(reg2, col = "red")

# Regress female employment rate on polity score
reg3 <- lm(femaleemployrate ~ polityscore, data = table)
plot(table$polityscore, table$femaleemployrate,
     xlab = "Polity Score of Democracy",
     ylab = "Female Employment Rate",
     main = "Relation Between Female Employment Rate and Polity Score",
     col = "green")
abline(reg3, col = "black")

# Regress female employment rate on residential electricity per person
reg4 <- lm(femaleemployrate ~ relectricperperson, data = table)
# Raw values are plotted so that the points line up with the fitted line
plot(table$relectricperperson, table$femaleemployrate,
     xlab = "Residential Electricity Per Person",
     ylab = "Female Employment Rate",
     main = "Relation Between Female Employment Rate and Residential Electricity Per Person",
     col = "blue")
abline(reg4, col = "red")

# Regress female employment rate on urbanisation rate
reg5 <- lm(femaleemployrate ~ urbanrate, data = table)
plot(table$urbanrate, table$femaleemployrate,
     xlab = "Urbanisation Rate",
     ylab = "Female Employment Rate",
     main = "Relation Between Female Employment Rate and Urbanisation Rate",
     col = "blue")
abline(reg5, col = "red")
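
The fitted lines above are drawn, but their coefficients are not reported. Below is a minimal sketch of how the slope and R-squared could be read from an `lm` fit. It uses a small hypothetical data frame (`toy`) so that the block runs on its own; with the real data, the same calls would apply to `reg` through `reg5`.

```r
# Hypothetical toy data standing in for one predictor and the response.
toy <- data.frame(x = c(1, 2, 3, 4, 5),
                  y = c(2, 4, 5, 4, 5))

fit <- lm(y ~ x, data = toy)

# Intercept and slope of the fitted line.
coef(fit)

# Share of the variance in y explained by x.
r_squared <- summary(fit)$r.squared
r_squared
```

A slope near zero or a small R-squared would suggest that the predictor on its own explains little of the variation in the female employment rate.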

library(car)
## 
## Attaching package: 'car'
## The following object is masked from 'package:psych':
## 
##     logit
scatterplotMatrix(~ femaleemployrate + internetuserate + lifeexpectancy + polityscore + relectricperperson + urbanrate,
                  data = table,
                  main = "Scatterplot Matrix of All the Crucial Factors")

library(corrgram)
vars2 <- c("femaleemployrate", "internetuserate", "lifeexpectancy", "polityscore", "relectricperperson", "urbanrate")
corrgram(table[,vars2], order=TRUE,
         main="Factors Affecting Female Employment Rate",
         lower.panel=panel.shade, upper.panel=panel.pie,
         diag.panel=panel.minmax, text.panel=panel.txt)
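
The corrgram shows the correlations graphically; the underlying numbers can be obtained with base R's `cor()`. Because the Gapminder columns have missing values (the `n` column in the summary table differs by variable), `use = "pairwise.complete.obs"` is one reasonable choice. This is a sketch on a hypothetical toy data frame, not output from the original analysis:

```r
# Hypothetical toy data with a missing value, standing in for table[, vars2].
toy <- data.frame(a = c(1, 2, 3, 4, NA),
                  b = c(2, 4, 6, 8, 10),
                  z = c(5, 3, 4, 1, 2))

# Pairwise deletion: each correlation uses the rows where both columns are present.
corr <- cor(toy, use = "pairwise.complete.obs")
round(corr, 2)

# With the data loaded as above, the same call would be:
# cor(table[, vars2], use = "pairwise.complete.obs")
```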