Title: “Workers” Author: “Anupam Singh” Date: “April 5, 2017” Output html_document
Workers: an analysis of ward-wise data on different types of workforce, using data from the Surat Municipal Corporation that is part of Census 2011. The metadata does not define the main worker categories in terms of the organised or unorganised sector. However, the variables suggest that the data comes from the unorganised sector. In the data frame (or sheet), rows are ward names and columns represent types of workers stratified by gender, nature of work, etc.
library(readr)
Workers <- read_csv("F:/Analytics in R/Data/Surat Data/Workers.csv")
## Parsed with column specification:
## cols(
## `Census Ward No.` = col_character(),
## `Ward Name` = col_character(),
## `Area in Sq.km` = col_double(),
## `Total Main Workers` = col_integer(),
## `Male Main Workers` = col_integer(),
## `Female Main Workers` = col_integer(),
## `Total Main Caltivator` = col_integer(),
## `Male Main Caltivator` = col_integer(),
## `Female Main Caltivator` = col_integer(),
## `Total Main Agriculture Labour` = col_integer(),
## `Male Main Agriculture Labour` = col_integer(),
## `Female Main Agriculture Labour` = col_integer(),
## `Total Main Household Industrial Labour` = col_integer(),
## `Male Main Household Industrial Labour` = col_integer(),
## `Female Main Household Industrial Labour` = col_integer(),
## `Total Main Other Labour` = col_integer(),
## `Male Main Other Labour` = col_integer(),
## `Female Main Other Labour` = col_integer()
## )
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
dim(Workers)
## [1] 89 18
The output from R above shows the structure of the data, for example whether a column contains integer, character, or double (values with decimal places) values. I have loaded the dplyr library because it makes a lot of the data wrangling needed for further analysis easier. Secondly, the function dim(), called with the data frame name inside the brackets, tells us the number of rows and columns in the data frame (or worksheet, in Excel language). We can see that there are 89 rows and 18 columns in the data set.
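As a minimal sketch of what dim() reports, the snippet below builds a small stand-in data frame. The name `sample_wards` and the shortened column names are assumptions made for illustration. The three rows reuse figures for the first three wards of this data set.

```r
# Hypothetical stand-in for the Workers data: three wards, three columns.
sample_wards <- data.frame(
  ward_no = c("1", "2", "3"),
  total_main_workers = c(19039L, 28016L, 20623L),
  female_main_workers = c(3150L, 4265L, 4294L),
  stringsAsFactors = FALSE
)

shape <- dim(sample_wards)  # rows first, then columns
print(shape)
print(nrow(sample_wards))   # number of rows only
print(ncol(sample_wards))   # number of columns only
```

nrow() and ncol() are convenient when only one of the two numbers is needed.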
2. Next we will use the function glimpse() to see the structure of the data (it is available in the dplyr package and is a good way to look at data structure).
glimpse(Workers)
## Observations: 89
## Variables: 18
## $ Census Ward No. <chr> "1", "2", "3", "4", "5...
## $ Ward Name <chr> "Nanpura", "Sagrampura...
## $ Area in Sq.km <dbl> 1.280, 1.310, 0.840, 0...
## $ Total Main Workers <int> 19039, 28016, 20623, 1...
## $ Male Main Workers <int> 15889, 23751, 16329, 1...
## $ Female Main Workers <int> 3150, 4265, 4294, 2969...
## $ Total Main Caltivator <int> 14, 51, 52, 27, 6, 8, ...
## $ Male Main Caltivator <int> 13, 42, 43, 19, 6, 8, ...
## $ Female Main Caltivator <int> 1, 9, 9, 8, 0, 0, 11, ...
## $ Total Main Agriculture Labour <int> 14, 41, 26, 18, 7, 10,...
## $ Male Main Agriculture Labour <int> 10, 35, 20, 15, 5, 7, ...
## $ Female Main Agriculture Labour <int> 4, 6, 6, 3, 2, 3, 7, 0...
## $ Total Main Household Industrial Labour <int> 586, 1039, 1792, 1705,...
## $ Male Main Household Industrial Labour <int> 360, 639, 1047, 980, 8...
## $ Female Main Household Industrial Labour <int> 226, 400, 745, 725, 37...
## $ Total Main Other Labour <int> 18425, 26885, 18753, 1...
## $ Male Main Other Labour <int> 15506, 23035, 15219, 1...
## $ Female Main Other Labour <int> 2919, 3850, 3534, 2233...
As the glimpse() output shows, it gives the dimensions of the data along with the type of each variable, e.g. int, dbl, or chr. It is a good function to use to get a feel for the data.
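If dplyr is not available, base R gives a similar overview. The sketch below is not part of the original analysis. It uses a hypothetical two-ward data frame (`sample_wards`, with shortened column names) to show how sapply() with class() and str() report column types.

```r
# Hypothetical two-ward sample with one column of each type seen above.
sample_wards <- data.frame(
  ward_name = c("Nanpura", "Sagrampura"),
  area_sq_km = c(1.28, 1.31),
  total_main_workers = c(19039L, 28016L),
  stringsAsFactors = FALSE
)

# One class name per column, returned as a named character vector.
col_types <- sapply(sample_wards, class)
print(col_types)

# str() prints the dimensions, types, and first values, much like glimpse().
str(sample_wards)
```

Note that base R reports double columns as "numeric", where glimpse() shows dbl.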
3. Now we would like to see the count of total workers in the data frame.
Totalworkers=(colSums(Workers[4]))
Totalworkers
## Total Main Workers
## 1729821
The data shows that there are 1729821 workers in total across all the wards in the data. One point to note here is that Total Main Workers is the sum of all the categories. The actual figure may be higher, given that the total population of Surat is approaching 70 to 80 lakh (7 to 8 million). If we look at the national figure, i.e. the proportion of the unorganised workforce in the country to the total, it is
4. Next we want to see the total count of male and female workers.
# Male workers
colSums(Workers[5])
## Male Main Workers
## 1566329
# Female Workers
colSums(Workers[6])
## Female Main Workers
## 163492
5. It would be good to see gender participation in the workforce, so let us look at the proportion of women.
tm=colSums(Workers[4])# Total Work Force
f=colSums(Workers[6])# female work force count
# Proportion female
propfemale=f/tm*100
propfemale
## Female Main Workers
## 9.451383
As the output above shows, female participation in the workforce in Surat is just 9.45%.
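The same calculation can also be done ward by ward. The sketch below is an illustration only. It uses a hypothetical data frame `wards` holding the figures of the first three wards, with shortened column names that are assumptions. It also shows why the overall share should be computed as a ratio of sums rather than as an average of the per-ward percentages.

```r
# Hypothetical sample: totals and female counts for the first three wards.
wards <- data.frame(
  ward_no = c("1", "2", "3"),
  total = c(19039, 28016, 20623),
  female = c(3150, 4265, 4294)
)

# Female share of main workers within each ward, in percent.
wards$female_pct <- wards$female / wards$total * 100
print(wards)

# Overall share for the sample: ratio of sums, so larger wards weigh more.
overall_pct <- sum(wards$female) / sum(wards$total) * 100
print(overall_pct)

# Averaging the per-ward percentages gives each ward equal weight instead.
unweighted_pct <- mean(wards$female_pct)
print(unweighted_pct)
```

The two summary figures differ whenever wards are of different sizes. The ratio of sums matches the approach used for the city-wide figure.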
Let's visualize this with a bar chart.
countworker=c(1566329,163492)
Gender=c("Male","Female")
library(lattice)
plot(barchart(countworker~Gender, origin = 0,xlab="Gender of worker",ylab="count of worker",main="Distribution of worker by Gender in surat-Census-2011"))
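The counts in the chart can also be derived from the data instead of being typed in by hand. The sketch below is one possible way to do that, not the original method. It assumes a hypothetical three-ward data frame `wards` with shortened column names, and it stores Gender as a factor so that lattice treats it as the categorical axis.

```r
library(lattice)

# Hypothetical sample: male and female main workers for the first three wards.
wards <- data.frame(
  male = c(15889, 23751, 16329),
  female = c(3150, 4265, 4294)
)

# Sum each column, then reshape into one row per gender for plotting.
counts <- colSums(wards[c("male", "female")])
gender_counts <- data.frame(
  Gender = factor(c("Male", "Female"), levels = c("Male", "Female")),
  Workers = as.numeric(counts)
)

# Build the chart object; print() draws it on the active graphics device.
p <- barchart(Workers ~ Gender, data = gender_counts, origin = 0,
              xlab = "Gender of worker", ylab = "Count of workers")
print(p)
```

Deriving the counts with colSums() keeps the chart in step with the data if the input file changes.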
6. Distribution of total workers, category-wise
colSums(Workers[5:18])
## Male Main Workers
## 1566329
## Female Main Workers
## 163492
## Total Main Caltivator
## 7126
## Male Main Caltivator
## 6371
## Female Main Caltivator
## 755
## Total Main Agriculture Labour
## 13260
## Male Main Agriculture Labour
## 8845
## Female Main Agriculture Labour
## 4415
## Total Main Household Industrial Labour
## 26159
## Male Main Household Industrial Labour
## 15852
## Female Main Household Industrial Labour
## 10307
## Total Main Other Labour
## 1683276
## Male Main Other Labour
## 1535261
## Female Main Other Labour
## 148015
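The column sums above make it possible to check the earlier statement that Total Main Workers is the sum of all the categories. The sketch below copies the printed totals into a named vector (the name `totals` and the shortened labels are assumptions for illustration) and verifies both the gender split and the category split.

```r
# City-wide totals copied from the colSums() output in this document.
totals <- c(
  total_main_workers = 1729821,
  male = 1566329,
  female = 163492,
  cultivators = 7126,
  agricultural_labour = 13260,
  household_industrial_labour = 26159,
  other_labour = 1683276
)

categories <- c("cultivators", "agricultural_labour",
                "household_industrial_labour", "other_labour")

# Both ways of splitting the workforce should add up to the same total.
by_gender <- totals[["male"]] + totals[["female"]]
by_category <- sum(totals[categories])
stopifnot(by_gender == totals[["total_main_workers"]])
stopifnot(by_category == totals[["total_main_workers"]])

# Share of each category in the total, in percent.
category_pct <- round(totals[categories] / totals[["total_main_workers"]] * 100, 2)
print(category_pct)
```

Both checks pass for these figures, so the category columns are an exact breakdown of Total Main Workers.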
In the next section we will look at some other outputs, such as which wards have the largest female and male workforces, and other distributions. Thank you.