LA Assignment

Prem & Kiran T L

Introduction

  • Analyze India Census dataset using R
  • Aggregate district data to state level
  • Visualize population distribution
  • Compare male vs female population
  • Identify the most populous states

Load Libraries
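
The code for this slide is not shown. A minimal sketch, assuming the two packages named in the Conclusion (dplyr and ggplot2) are the ones loaded here:

```r
# Packages named in the Conclusion slide. If either is missing, install it
# once with install.packages(c("dplyr", "ggplot2")).
library(dplyr)    # grouping and summarising
library(ggplot2)  # charts
```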

Load Dataset
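
The loading code is not shown, and the slides do not give the file name. The sketch below assumes the data is a CSV read with `read.csv()`. To stay runnable it first writes a four-row stand-in file (the first four districts from the `str()` output, with only six of the 25 columns); `sample_csv` and `census` are illustrative names. The `str()` output on this slide comes from the full 640-row file.

```r
# Stand-in for the real file: write a four-row sample to a temporary CSV.
# Values are copied from the str() output; the real file name is unknown.
sample_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "District_code,State_name,District_name,Population,Male,Female",
  "1,JAMMU AND KASHMIR,Kupwara,870354,474190,396164",
  "2,JAMMU AND KASHMIR,Badgam,753745,398041,355704",
  "3,JAMMU AND KASHMIR,Leh(Ladakh),133487,78971,54516",
  "4,JAMMU AND KASHMIR,Kargil,140802,77785,63017"
), sample_csv)

# Read the CSV and inspect its structure.
census <- read.csv(sample_csv, stringsAsFactors = FALSE)
str(census)

# Sanity check: Male + Female should equal Population in every row.
all(census$Male + census$Female == census$Population)
```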

'data.frame':   640 obs. of  25 variables:
 $ District_code       : int  1 2 3 4 5 6 7 8 9 10 ...
 $ State_name          : chr  "JAMMU AND KASHMIR" "JAMMU AND KASHMIR" "JAMMU AND KASHMIR" "JAMMU AND KASHMIR" ...
 $ District_name       : chr  "Kupwara" "Badgam" "Leh(Ladakh)" "Kargil" ...
 $ Population          : int  870354 753745 133487 140802 476835 642415 616435 1008039 392232 1236829 ...
 $ Male                : int  474190 398041 78971 77785 251899 345351 326109 534733 207680 651124 ...
 $ Female              : int  396164 355704 54516 63017 224936 297064 290326 473306 184552 585705 ...
 $ Literate            : int  439654 335649 93770 86236 261724 364109 389204 545149 185979 748584 ...
 $ Workers             : int  229064 214866 75079 51873 161393 290912 200431 304200 149317 407188 ...
 $ Male_Workers        : int  190899 162578 53265 39839 117677 184752 161548 249581 101380 333151 ...
 $ Female_Workers      : int  38165 52288 21814 12034 43716 106160 38883 54619 47937 74037 ...
 $ Cultivator_Workers  : int  34680 55299 20869 8266 54264 136527 69533 57495 28232 12228 ...
 $ Agricultural_Workers: int  56759 36630 1645 3763 31583 24016 21566 62246 32882 10408 ...
 $ Household_Workers   : int  7946 29102 1020 1222 3930 4656 3952 15084 20484 20095 ...
 $ Hindus              : int  37128 10110 22882 10341 32604 221880 540063 30621 8439 42540 ...
 $ Muslims             : int  823286 736054 19057 108239 431279 402879 64234 959185 382006 1177342 ...
 $ Christians          : int  1700 1489 658 604 958 983 1828 1497 572 2746 ...
 $ Sikhs               : int  5600 5559 1092 1171 11188 15513 9551 14770 555 12187 ...
 $ Buddhists           : int  66 47 88635 20126 83 189 24 140 44 285 ...
 $ Jains               : int  39 6 103 28 10 26 16 29 17 74 ...
 $ Secondary_Education : int  74948 66459 16265 16938 46062 65921 91522 107837 35630 176409 ...
 $ Higher_Education    : int  39709 41367 8923 9826 29517 35804 47694 57932 18644 132727 ...
 $ Graduate_Education  : int  21751 27950 6197 3077 13962 18576 24330 48285 12721 121856 ...
 $ Age_Group_0_29      : int  600759 503223 70703 87532 304979 404903 357864 636524 252378 693238 ...
 $ Age_Group_30_49     : int  178435 160933 41515 35561 109818 153165 160123 239659 90465 351561 ...
 $ Age_Group_50        : int  89679 88978 21019 17488 61334 83319 97684 130513 48802 190330 ...
  • Dataset contains district-level data: 640 districts, 25 variables
  • Includes total, male, and female population
  • Also covers literacy, workers, religion, education, and age groups

Data Aggregation

  • Sum district rows into one row per state
  • Use dplyr for grouping
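
The aggregation step could look like the sketch below, assuming `group_by()` and `summarise()` are the dplyr verbs used. The column names match the `str()` output. `districts` and `state_data` are illustrative names, and the "STATE B" rows are invented so that the example has more than one group.

```r
library(dplyr)

# Toy district rows. The JAMMU AND KASHMIR figures come from the str()
# output; the "STATE B" rows are made up for illustration.
districts <- data.frame(
  State_name    = c("JAMMU AND KASHMIR", "JAMMU AND KASHMIR", "STATE B", "STATE B"),
  District_name = c("Kupwara", "Badgam", "District X", "District Y"),
  Population    = c(870354, 753745, 500000, 300000),
  Male          = c(474190, 398041, 260000, 155000),
  Female        = c(396164, 355704, 240000, 145000)
)

# One row per state: sum each count column within the state.
state_data <- districts %>%
  group_by(State_name) %>%
  summarise(
    Population = sum(Population),
    Male       = sum(Male),
    Female     = sum(Female),
    .groups    = "drop"
  )

print(state_data)
```

On the full dataset the same call would collapse the 640 district rows into one row per distinct `State_name`.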

Population by State

  • Shows state-wise population distribution
  • Helps identify highly populated states
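
The chart itself is not included in this text. One way to draw it, assuming a horizontal bar chart in ggplot2, is sketched below. The state names and totals are placeholders, not census figures, and `p_states` is an illustrative name.

```r
library(ggplot2)

# Hypothetical state totals (made-up numbers, not census figures).
state_data <- data.frame(
  State_name = c("STATE A", "STATE B", "STATE C"),
  Population = c(1200000, 800000, 450000)
)

# Bars ordered by population; coord_flip() keeps long state names readable.
p_states <- ggplot(state_data,
                   aes(x = reorder(State_name, Population), y = Population)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(title = "Population by State", x = "State", y = "Population")

print(p_states)
```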

Male vs Female Population

  • Compares gender distribution
  • Visualized using pie chart
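
A pie chart in ggplot2 is a stacked bar drawn in polar coordinates. The sketch below assumes that approach (base R `pie()` would also work). It uses the Male and Female totals of only the first four districts in the `str()` output, so the percentages are not the national figures. `gender` and `p_gender` are illustrative names.

```r
library(ggplot2)

# Male and Female totals of the first four districts in the str() output.
# The real chart would use sums over all 640 districts.
gender <- data.frame(
  Gender = c("Male", "Female"),
  Count  = c(474190 + 398041 + 78971 + 77785,
             396164 + 355704 + 54516 + 63017)
)
gender$Share <- round(100 * gender$Count / sum(gender$Count), 1)

# A single stacked bar bent into a circle gives the pie.
p_gender <- ggplot(gender, aes(x = "", y = Count, fill = Gender)) +
  geom_col(width = 1) +
  coord_polar(theta = "y") +
  geom_text(aes(label = paste0(Share, "%")),
            position = position_stack(vjust = 0.5)) +
  labs(title = "Male vs Female Population") +
  theme_void()

print(p_gender)
```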

Top 10 States by Population

  • Highlights the 10 most populous states
  • States ranked by total population
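
Selecting the top 10 can be sketched as a sort followed by taking the first ten rows. The twelve states and their totals below are invented, and `top10` is an illustrative name. The slides do not say which dplyr verbs were used.

```r
library(dplyr)

# Twelve hypothetical states with made-up totals.
state_data <- data.frame(
  State_name = paste("STATE", LETTERS[1:12]),
  Population = c(50, 120, 75, 200, 30, 90, 160, 45, 110, 60, 180, 25) * 1e5
)

# Sort from largest to smallest and keep the first ten rows.
top10 <- state_data %>%
  arrange(desc(Population)) %>%
  head(10)

print(top10)
```

The resulting `top10` can be passed to a bar chart in the same way as the full state table.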

Key Insights

  • Population varies significantly across states
  • A few states account for a large share of the total population
  • Gender distribution is relatively balanced
  • Data aggregation simplifies analysis
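
Two of these points can be checked numerically from the state table: each state's share of the total, and the number of females per 1000 males. The sketch below uses invented figures (in thousands). `Share_pct` and `Females_per_1000_males` are illustrative column names, not columns of the dataset.

```r
library(dplyr)

# Hypothetical state totals in thousands (made-up numbers).
state_data <- data.frame(
  State_name = c("STATE A", "STATE B", "STATE C"),
  Male       = c(520, 310, 170),
  Female     = c(480, 290, 230)
)

insights <- state_data %>%
  mutate(
    Population             = Male + Female,
    Share_pct              = round(100 * Population / sum(Population), 1),
    Females_per_1000_males = round(1000 * Female / Male)
  ) %>%
  arrange(desc(Share_pct))

print(insights)
```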

Conclusion

  • Used dplyr for data processing
  • Used ggplot2 for visualization
  • Extracted meaningful population insights
  • Demonstrated multiple chart types