Homework 5 Visualize anything with ggplot2

In this assignment we use a data set from the CANSIM tables: Table 202-0101, Distribution of earnings, by sex, in 2011 constant dollars. The table contains 2100 series, with data for the years 1976-2011 (not all combinations necessarily have data for all years), and was last released on 2013-06-27.

This table contains data described by the following dimensions (Not all combinations are available):

We use only part of this data set: 240 observations of 7 variables.

Loading the Data and Initializations

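# stringsAsFactors = TRUE keeps the text columns as factors, as in the output below (required on R >= 4.0)
ErnDat <- read.csv("EarningDistribution.csv", stringsAsFactors = TRUE)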
str(ErnDat)
## 'data.frame':    240 obs. of  7 variables:
##  $ Year        : int  1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 ...
##  $ Province    : Factor w/ 5 levels "Atlantic provinces",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ SEX         : Factor w/ 2 levels "Females","Males": 2 2 2 2 2 2 2 2 2 2 ...
##  $ Income      : int  39600 40600 39800 39400 39500 37900 37900 38200 37000 37700 ...
##  $ EarnersCount: int  677 681 676 658 647 652 648 636 642 637 ...
##  $ FYFTEarning : int  49100 49900 49800 50400 51000 48000 48500 47900 46300 47500 ...
##  $ FYFTCount   : int  375 374 370 351 336 330 328 327 330 333 ...

We selected the most recent 24 years of data (1988-2011) and restricted ourselves to the following parameters:

library(ggplot2)

Income Distribution on Sex

In this part we want to depict the empirical distribution of income for both sexes separately and compare these distributions.

ggplot(ErnDat, aes(x = Income, color = SEX)) + geom_density() + facet_wrap(~Province) + 
    xlab("average total income (dollars)")


As we see, in all the provinces, men generally earn higher incomes compared to women.
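The visual comparison can be backed up with a numeric summary. The following is a minimal sketch, not part of the original analysis: `toy` is a hypothetical stand-in for `ErnDat` with the same column names and made-up values, and the median is just one reasonable choice of summary statistic.

```r
# Toy stand-in for ErnDat: same column names, made-up values (NOT from the CANSIM table).
toy <- data.frame(
  Province = rep(c("Atlantic provinces", "Ontario"), each = 4),
  SEX      = rep(c("Males", "Females"), times = 4),
  Income   = c(40000, 26000, 41000, 27000, 45000, 28000, 46000, 29000)
)

# Median income for every Province x SEX combination (rows: provinces, columns: sexes).
med <- tapply(toy$Income, list(toy$Province, toy$SEX), median)

# Difference between the male and female medians within each province.
gap <- med[, "Males"] - med[, "Females"]

print(med)
print(gap)
```

Replacing `toy` with `ErnDat` should give one gap per province; a positive gap everywhere would agree with what the density plot suggests.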

Earnings of Full-Year Full-Time workers in different Provinces

To compare the average earnings of full-year full-time (FYFT) workers across provinces, we can use the following simple plot.

# geom_jitter() already draws the points; adding geom_point() as well would plot each one twice
ggplot(ErnDat, aes(reorder(Province, FYFTEarning), FYFTEarning)) + 
    geom_jitter(position = position_jitter(width = 0.1)) + facet_wrap(~SEX) + 
    xlab("Province") + ylab("Earnings of a Full-Year Full-Time worker (dollars)")


We see that FYFT workers in Ontario have the highest average earnings of all the provinces, regardless of sex. We also see that the highest earnings of female FYFT workers, across all provinces, are still below the lowest earnings of male FYFT workers.
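Both observations can be checked directly against the data. The sketch below is illustrative only: `toy` is a hypothetical data frame with the same column names as `ErnDat` and made-up values, so the printed results say nothing about the real table.

```r
# Toy stand-in for ErnDat: same column names, made-up values (NOT from the CANSIM table).
toy <- data.frame(
  Province    = rep(c("Atlantic provinces", "Ontario"), each = 4),
  SEX         = rep(c("Males", "Females"), times = 4),
  FYFTEarning = c(49000, 36000, 50000, 37000, 60000, 44000, 61000, 45000)
)

# Mean FYFT earnings per province (rows) and sex (columns).
mean_earn <- tapply(toy$FYFTEarning, list(toy$Province, toy$SEX), mean)

# Province with the highest mean, separately for each sex.
top_province <- rownames(mean_earn)[apply(mean_earn, 2, which.max)]
names(top_province) <- colnames(mean_earn)

# Does the highest female value stay below the lowest male value?
max_female <- max(toy$FYFTEarning[toy$SEX == "Females"])
min_male   <- min(toy$FYFTEarning[toy$SEX == "Males"])
no_overlap <- max_female < min_male

print(top_province)
print(no_overlap)
```

Run on `ErnDat`, `top_province` would name the leading province for each sex, and `no_overlap` would be `TRUE` only if the female and male earnings ranges do not overlap at all.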

Full-Year Full-Time workers over time

# The x axis shows Year, so it is labelled accordingly
ggplot(ErnDat, aes(Year, FYFTCount, color = SEX)) + geom_point() + geom_line() + 
    facet_wrap(~Province) + ylab("Number of Full-Year Full-Time workers") + 
    xlab("Year")


In every province there is a gap between the number of male and female full-year full-time workers, although its size varies. In the Atlantic provinces, for example, this gap has been shrinking over time.
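To make the gap explicit rather than reading it off two lines, the male and female counts can be put side by side and subtracted. This is a minimal sketch under assumptions: `toy` is a hypothetical stand-in for `ErnDat` with made-up counts, and the `.M` / `.F` suffixes are names chosen here for illustration.

```r
# Toy stand-in for ErnDat: same column names, made-up values (NOT from the CANSIM table).
toy <- data.frame(
  Year      = rep(c(1988, 2011), each = 2, times = 2),
  Province  = rep(c("Atlantic provinces", "Ontario"), each = 4),
  SEX       = rep(c("Males", "Females"), times = 4),
  FYFTCount = c(375, 200, 340, 320, 2000, 1200, 2300, 1800)
)

# One table per sex, keyed by Year and Province.
males   <- toy[toy$SEX == "Males",   c("Year", "Province", "FYFTCount")]
females <- toy[toy$SEX == "Females", c("Year", "Province", "FYFTCount")]

# Join the two tables and compute the male-minus-female gap for each year and province.
gap <- merge(males, females, by = c("Year", "Province"), suffixes = c(".M", ".F"))
gap$Gap <- gap$FYFTCount.M - gap$FYFTCount.F
gap <- gap[order(gap$Province, gap$Year), ]

print(gap)
```

With the real data, `gap` could be plotted against `Year` and faceted by `Province`, in the same way as the plot above, to show where the gap is closing.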

Full time Average Earning vs. Average Income

We want to show the relationship between the average earnings of full-year full-time workers and the average total income. The two sexes are shown in different colors.

ggplot(ErnDat, aes(x = Income, y = FYFTEarning, col = SEX)) + geom_point() + 
    geom_smooth(method = "lm") + xlab("Average total income (dollars)") + 
    ylab("Average earnings of full-year full-time workers (dollars)")


We see that, for both sexes, there is a close positive relationship between the average total income and the average earnings of full-year full-time workers, across provinces and years.
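The strength of this relationship can be quantified with the same kind of per-group linear fit that `geom_smooth(method = "lm")` draws for each color. The sketch below is an assumption-laden illustration: `toy` is a hypothetical data frame with made-up values, and the slope and Pearson correlation are just two possible summaries.

```r
# Toy stand-in for ErnDat: same column names, made-up values (NOT from the CANSIM table).
toy <- data.frame(
  SEX         = rep(c("Males", "Females"), each = 4),
  Income      = c(38000, 40000, 42000, 44000, 24000, 26000, 28000, 30000),
  FYFTEarning = c(48000, 50500, 52000, 55000, 34000, 36500, 38000, 41000)
)

# Fit FYFTEarning ~ Income separately for each sex and keep the slope and the correlation.
fits <- lapply(split(toy, toy$SEX), function(d) {
  fit <- lm(FYFTEarning ~ Income, data = d)
  c(slope = unname(coef(fit)["Income"]),
    r     = cor(d$Income, d$FYFTEarning))
})

# One row per sex.
print(do.call(rbind, fits))
```

On `ErnDat`, a correlation close to 1 for both sexes would support the reading of the scatter plot, while the slopes indicate how much FYFT earnings change per dollar of average total income.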