Data Visualization

Author

Nidhi Agarwal

What We Will Do

In this program, we will:

  1. Load the required libraries and dataset
  2. Explore the structure of the dataset
  3. Clean and preprocess the data
  4. Perform categorical and trend analysis
  5. Visualize the findings using ‘ggplot2’
  6. Combine multiple plots into a dashboard

Step 1: Load Required Libraries and Dataset

  • ‘tidyverse’ is used for data manipulation and visualization
  • ‘ggplot2’ is used for creating graphs
  • ‘patchwork’ is used to combine multiple plots
  • ‘janitor’ is used to clean column names

Employment Statistics Dashboard

This report presents a visual analysis of employment-related data over multiple years.
The dashboard highlights trends in:

  • Total employment provided to scheduled castes
  • Female employment
  • Number of persons registered
  • Employment exchange infrastructure

The goal is to identify patterns, growth trends, and disparities in employment metrics.

library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.5.3
Warning: package 'ggplot2' was built under R version 4.5.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(patchwork)
Warning: package 'patchwork' was built under R version 4.5.3
library(ggplot2)

Load the dataset and clean the column names

data <- read.csv("C:/Users/Nidhi/OneDrive/Desktop/datavisual/es.csv")
library(janitor)
Warning: package 'janitor' was built under R version 4.5.3

Attaching package: 'janitor'
The following objects are masked from 'package:stats':

    chisq.test, fisher.test
data <- clean_names(data)
colnames(data)
[1] "year"                                                                   
[2] "no_of_employment_exchage"                                               
[3] "no_of_persons_registered_in_000"                                        
[4] "persons_on_live_registers_at_the_end_of_the_year_all_persons_in_000"    
[5] "persons_on_live_registers_at_the_end_of_the_year_educated_person_in_000"
[6] "persons_given_employment_total_in_000"                                  
[7] "persons_given_employment_females_in_number"                             
[8] "persons_given_employment_scheduled_castes_in_number"                    
[9] "persons_given_employment_scheduled_tribes_in_number"                    
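
The cleaned names are descriptive but long, which makes the plotting code hard to read. As an optional step, they could be mapped to shorter aliases. The sketch below uses base R only; the short names and the helper rename_columns() are hypothetical and not part of the original analysis, and the rest of this report keeps the long names.

```r
# Hypothetical short aliases; the long names are taken from the colnames() output above
short_names <- c(
  no_of_employment_exchage = "exchanges",
  no_of_persons_registered_in_000 = "registered_000",
  persons_given_employment_total_in_000 = "employed_total_000",
  persons_given_employment_females_in_number = "employed_females",
  persons_given_employment_scheduled_castes_in_number = "employed_sc"
)

# Rename only the columns that appear in the lookup; leave all others untouched
rename_columns <- function(df, lookup) {
  hits <- names(df) %in% names(lookup)
  names(df)[hits] <- unname(lookup[names(df)[hits]])
  df
}

# Small demonstration frame built from the first row of head(data)
demo <- data.frame(
  year = "2009",
  no_of_employment_exchage = 57L,
  no_of_persons_registered_in_000 = 421L
)

names(rename_columns(demo, short_names))
```

If these aliases were adopted, every later aes() call would need to refer to the short names instead.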

Step 2: Explore the Dataset

Before analysis, we examine the dataset by checking:

  • Number of rows and columns
  • Column names
  • Data types
  • Summary statistics
  • First few rows

head(data)
      year no_of_employment_exchage no_of_persons_registered_in_000
1     2009                       57                             421
2     2010                       57                             477
3     2011                       57                             537
4     2012                       48                             540
5 2013 (A)                       48                             424
6                                NA                              NA
  persons_on_live_registers_at_the_end_of_the_year_all_persons_in_000
1                                                                1940
2                                                                1954
3                                                                2002
4                                                                2069
5                                                                2066
6                                                                  NA
  persons_on_live_registers_at_the_end_of_the_year_educated_person_in_000
1                                                                    1555
2                                                                    1521
3                                                                    1405
4                                                                    1677
5                                                                    1751
6                                                                      NA
  persons_given_employment_total_in_000
1                                     5
2                                     9
3                                     7
4                                    12
5                                     5
6                                    NA
  persons_given_employment_females_in_number
1                                        277
2                                        287
3                                        350
4                                        643
5                                        154
6                                         NA
  persons_given_employment_scheduled_castes_in_number
1                                                 425
2                                                1115
3                                                1487
4                                                1056
5                                                 242
6                                                  NA
  persons_given_employment_scheduled_tribes_in_number
1                                                 656
2                                                 988
3                                                1632
4                                                1178
5                                                 176
6                                                  NA
str(data)
'data.frame':   8 obs. of  9 variables:
 $ year                                                                   : chr  "2009" "2010" "2011" "2012" ...
 $ no_of_employment_exchage                                               : int  57 57 57 48 48 NA NA NA
 $ no_of_persons_registered_in_000                                        : int  421 477 537 540 424 NA NA NA
 $ persons_on_live_registers_at_the_end_of_the_year_all_persons_in_000    : int  1940 1954 2002 2069 2066 NA NA NA
 $ persons_on_live_registers_at_the_end_of_the_year_educated_person_in_000: int  1555 1521 1405 1677 1751 NA NA NA
 $ persons_given_employment_total_in_000                                  : int  5 9 7 12 5 NA NA NA
 $ persons_given_employment_females_in_number                             : int  277 287 350 643 154 NA NA NA
 $ persons_given_employment_scheduled_castes_in_number                    : int  425 1115 1487 1056 242 NA NA NA
 $ persons_given_employment_scheduled_tribes_in_number                    : int  656 988 1632 1178 176 NA NA NA
summary(data)
     year           no_of_employment_exchage no_of_persons_registered_in_000
 Length:8           Min.   :48.0             Min.   :421.0                  
 Class :character   1st Qu.:48.0             1st Qu.:424.0                  
 Mode  :character   Median :57.0             Median :477.0                  
                    Mean   :53.4             Mean   :479.8                  
                    3rd Qu.:57.0             3rd Qu.:537.0                  
                    Max.   :57.0             Max.   :540.0                  
                    NA's   :3                NA's   :3                      
 persons_on_live_registers_at_the_end_of_the_year_all_persons_in_000
 Min.   :1940                                                       
 1st Qu.:1954                                                       
 Median :2002                                                       
 Mean   :2006                                                       
 3rd Qu.:2066                                                       
 Max.   :2069                                                       
 NA's   :3                                                          
 persons_on_live_registers_at_the_end_of_the_year_educated_person_in_000
 Min.   :1405                                                           
 1st Qu.:1521                                                           
 Median :1555                                                           
 Mean   :1582                                                           
 3rd Qu.:1677                                                           
 Max.   :1751                                                           
 NA's   :3                                                              
 persons_given_employment_total_in_000
 Min.   : 5.0                         
 1st Qu.: 5.0                         
 Median : 7.0                         
 Mean   : 7.6                         
 3rd Qu.: 9.0                         
 Max.   :12.0                         
 NA's   :3                            
 persons_given_employment_females_in_number
 Min.   :154.0                             
 1st Qu.:277.0                             
 Median :287.0                             
 Mean   :342.2                             
 3rd Qu.:350.0                             
 Max.   :643.0                             
 NA's   :3                                 
 persons_given_employment_scheduled_castes_in_number
 Min.   : 242                                       
 1st Qu.: 425                                       
 Median :1056                                       
 Mean   : 865                                       
 3rd Qu.:1115                                       
 Max.   :1487                                       
 NA's   :3                                          
 persons_given_employment_scheduled_tribes_in_number
 Min.   : 176                                       
 1st Qu.: 656                                       
 Median : 988                                       
 Mean   : 926                                       
 3rd Qu.:1178                                       
 Max.   :1632                                       
 NA's   :3                                          

Step 3: Data Cleaning

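We remove rows that contain missing values. In this dataset these are the three trailing rows in which every numeric column is NA, which leaves the five yearly records from 2009 to 2013 (A).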

# Remove NA values
data <- na.omit(data)
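
To see what na.omit() does here, the sketch below rebuilds the first six rows printed by head(data), keeping only three columns for brevity. It assumes the blank year in row 6 is an empty string rather than NA, which is what the printed output suggests. The targeted filter at the end is an alternative shown for comparison, not the method used in this report.

```r
# First six rows as printed by head(data); only three columns kept for brevity
es <- data.frame(
  year = c("2009", "2010", "2011", "2012", "2013 (A)", ""),
  no_of_employment_exchage = c(57L, 57L, 57L, 48L, 48L, NA),
  no_of_persons_registered_in_000 = c(421L, 477L, 537L, 540L, 424L, NA)
)

# Count missing values per column before dropping anything
na_counts <- colSums(is.na(es))

# na.omit() drops every row that has at least one NA in any column
es_clean <- na.omit(es)

# Alternative: drop only rows in which every numeric column is NA
numeric_cols <- sapply(es, is.numeric)
all_missing <- rowSums(!is.na(es[, numeric_cols, drop = FALSE])) == 0
es_targeted <- es[!all_missing, ]

na_counts
nrow(es_clean)
nrow(es_targeted)
```

Both approaches keep the same five rows for this data. The targeted filter would differ only if a year had some, but not all, values missing.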

Step 4: Data Analysis

We analyze trends in:

  • Employment for scheduled castes
  • Female employment
  • Number of persons registered
  • Employment exchanges

These variables help understand employment patterns over time.
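
One simple way to quantify these trends is the change from one year to the next. The following is a minimal sketch in base R using values copied from the head(data) output; the short column names and the helper pct_change() are assumptions made for illustration.

```r
# Values copied from the head(data) output; short column names are illustrative
es <- data.frame(
  year = c("2009", "2010", "2011", "2012", "2013 (A)"),
  registered_000 = c(421, 477, 537, 540, 424),
  employed_sc = c(425, 1115, 1487, 1056, 242)
)

# Absolute change from the previous year (NA for the first year)
es$registered_change <- c(NA, diff(es$registered_000))
es$employed_sc_change <- c(NA, diff(es$employed_sc))

# Percentage change relative to the previous year's value
pct_change <- function(x) c(NA, diff(x) / head(x, -1) * 100)
es$registered_pct <- round(pct_change(es$registered_000), 1)
es$employed_sc_pct <- round(pct_change(es$employed_sc), 1)

es
```

Positive values mark years of growth and negative values mark declines, which makes turning points easier to spot than in the raw counts.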

Step 5: Data Visualization using ggplot2

Plot 1: Employment Provided to Scheduled Castes

plot1 <- ggplot(data, aes(x = year, y = persons_given_employment_scheduled_castes_in_number)) +
  geom_line(color = "blue") +
  geom_point() +
  theme_minimal() +
  labs(title = "Employment Provided to Scheduled Castes",
       x = "Year",
       y = "Persons employed (scheduled castes)")

Interpretation: Employment Provided to Scheduled Castes

The line graph shows the trend of employment provided to scheduled castes over time.

  • Placements rose from 425 in 2009 to a peak of 1,487 in 2011, then fell to 1,056 in 2012 and 242 in 2013 (A).
  • The dip after 2011 may reflect economic or policy-related fluctuations; the data alone cannot identify the cause.
  • Overall, the series rises and then falls, so it does not show sustained progress in inclusive employment.

Plot 2: Female Employment

plot2 <- ggplot(data, aes(x = year, y = persons_given_employment_females_in_number)) +
  geom_bar(stat = "identity", fill = "pink") +
  theme_minimal() +
  labs(title = "Female Employment",
       x="Year",
       y="Female employment"
       )

Interpretation: Female Employment

  • The bar chart shows the number of women given employment each year.
  • The count rose from 277 in 2009 to 643 in 2012, then dropped to 154 in 2013 (A).
  • Variations across years may indicate policy impact or social factors affecting women’s employment.
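
To put the female figures in context, they can be expressed as a share of all placements. The sketch below assumes the total column is in thousands and the female column is in persons, as the column names indicate. Because the totals are rounded to the nearest thousand, the resulting shares are approximate.

```r
# Values copied from the head(data) output; short column names are illustrative
es <- data.frame(
  year = c("2009", "2010", "2011", "2012", "2013 (A)"),
  employed_total_000 = c(5, 9, 7, 12, 5),
  employed_females = c(277, 287, 350, 643, 154)
)

# Convert the total from thousands to persons before dividing
es$female_share_pct <- round(
  es$employed_females / (es$employed_total_000 * 1000) * 100,
  1
)

es[, c("year", "female_share_pct")]
```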

Plot 3: Persons Registered

plot3 <- ggplot(data, aes(x = year, y = no_of_persons_registered_in_000)) +
  geom_line(color = "green") +
  theme_minimal() +
  labs(title = "Persons Registered",
       x="Year",
       y="Persons registered")

Interpretation: Registrations

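  • This plot reflects the number of job seekers registering each year, in thousands.
  • Registrations grew from 421 thousand in 2009 to 540 thousand in 2012, then fell to 424 thousand in 2013 (A).
  • If employment doesn’t keep pace with registrations, it may indicate unemployment pressure.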
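
The comparison between placements and registrations can be made explicit with a simple ratio. This is a rough sketch: both columns are reported in thousands, so rounding limits the precision, and the name placement_pct is introduced here for illustration only.

```r
# Values copied from the head(data) output; both measures are in thousands
es <- data.frame(
  year = c("2009", "2010", "2011", "2012", "2013 (A)"),
  registered_000 = c(421, 477, 537, 540, 424),
  employed_total_000 = c(5, 9, 7, 12, 5)
)

# Persons given employment per 100 persons registered in the same year
es$placement_pct <- round(es$employed_total_000 / es$registered_000 * 100, 2)

es[, c("year", "placement_pct")]
```

A ratio that stays low while registrations grow would support the reading of unemployment pressure.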

Plot 4: Employment Exchanges

plot4 <- ggplot(data, aes(x = year, y = no_of_employment_exchage)) +
  geom_bar(stat = "identity", fill = "orange") +
  theme_minimal() +
  labs(title = "Employment Exchanges",
       x = "Year",
       y = "Employment exchanges")

Interpretation: Employment Exchanges

  • This shows the infrastructure supporting employment services.
  • The number of exchanges held at 57 from 2009 to 2011 and dropped to 48 in 2012 and 2013 (A).
  • A stable and then declining count may suggest limited infrastructure development rather than expansion.
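
Plots 2 and 4 both use geom_bar(stat = "identity"). ggplot2 also provides geom_col(), which draws bars from the y values directly and is usually the shorter way to write the same chart. The sketch below redraws Plot 4 that way on a small frame copied from head(data); the names es, exchanges and plot4_col are illustrative.

```r
library(ggplot2)

# Values copied from the head(data) output; the short column name is illustrative
es <- data.frame(
  year = c("2009", "2010", "2011", "2012", "2013 (A)"),
  exchanges = c(57, 57, 57, 48, 48)
)

# geom_col() uses the y aesthetic as the bar height, so no stat argument is needed
plot4_col <- ggplot(es, aes(x = year, y = exchanges)) +
  geom_col(fill = "orange") +
  theme_minimal() +
  labs(title = "Employment Exchanges",
       x = "Year",
       y = "Employment exchanges")

plot4_col
```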

Get column names

colnames(data)
[1] "year"                                                                   
[2] "no_of_employment_exchage"                                               
[3] "no_of_persons_registered_in_000"                                        
[4] "persons_on_live_registers_at_the_end_of_the_year_all_persons_in_000"    
[5] "persons_on_live_registers_at_the_end_of_the_year_educated_person_in_000"
[6] "persons_given_employment_total_in_000"                                  
[7] "persons_given_employment_females_in_number"                             
[8] "persons_given_employment_scheduled_castes_in_number"                    
[9] "persons_given_employment_scheduled_tribes_in_number"                    

Step 6: Combine the Plots into a Dashboard

library(patchwork)

dashboard <- (plot1 | plot2) / (plot3 | plot4)

dashboard
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
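
The geom_line() message above appears because year is stored as text ("2013 (A)" prevents it from being read as a number). With a discrete x axis, ggplot2 treats each year as its own group, so there is no pair of points to connect and the lines are not drawn. Two possible fixes are sketched below on values copied from head(data); the names es, registered_000 and year_num are assumptions for illustration.

```r
library(ggplot2)

# Values copied from the head(data) output; the short column name is illustrative
es <- data.frame(
  year = c("2009", "2010", "2011", "2012", "2013 (A)"),
  registered_000 = c(421, 477, 537, 540, 424)
)

# Option 1: keep year as text and place every row in a single group
p_grouped <- ggplot(es, aes(x = year, y = registered_000, group = 1)) +
  geom_line(color = "green") +
  geom_point() +
  theme_minimal()

# Option 2: take the leading four digits as an integer year
es$year_num <- as.integer(substr(es$year, 1, 4))
p_numeric <- ggplot(es, aes(x = year_num, y = registered_000)) +
  geom_line(color = "green") +
  geom_point() +
  theme_minimal()

p_grouped
```

Option 1 keeps the "(A)" label on the axis. Option 2 gives a true numeric axis but drops that annotation, so it should be mentioned in a caption or note if it matters.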

Step 7: Add a Bold, Centred Dashboard Title

dashboard + 
  plot_annotation(
    title = "Employment Statistics Dashboard",
    theme = theme(
      plot.title = element_text(
        hjust = 0.5,
        size = 18,
        face = "bold"
      )
    )
  )
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?

Overall Insights

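  • Employment levels fluctuate rather than rise steadily: total placements ranged between 5 and 12 thousand per year and peaked in 2012.
  • Female placements peaked at 643 in 2012 and remain a small fraction of total placements.
  • Registrations rose through 2012, indicating increasing job demand, before falling in 2013 (A).
  • The number of employment exchanges fell from 57 to 48, so the data does not show infrastructure growth.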

Conclusion

The data shows uneven progress in employment generation and highlights the need to:

  • Improve gender inclusion further
  • Match job creation with rising demand
  • Strengthen employment infrastructure

This dashboard provides a clear visual summary for policymakers and analysts.