Interactive Scatter Plot Analysis

Introduction

  • Census datasets contain large volumes of demographic information
  • Understanding relationships between variables is important
  • Visualization helps in better data analysis
  • Scatter plots show how two numeric variables relate to each other

Objective

  • To analyze the relationship between two continuous variables
  • To visualize the data using a scatter plot
  • To identify trends and patterns
  • To use interactive visualization

Data Source

  • Dataset: Census 2011 (India)
  • Source: data.gov.in
  • Contains population details of regions
  • Variables used:
    • Total Population
    • Male Population

Data Loading

  • Dataset stored as an Excel file
  • Loaded into R using read_excel()
  • Converted into a data frame
  • Required columns selected for analysis
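The loading steps above can be sketched roughly as follows. The census file name is not given in the slides, so this sketch uses the example workbook bundled with readxl as a stand-in, and the two selected columns are placeholders for the census columns.

```r
library(readxl)

# Assumption: the real census workbook path is unknown; readxl's bundled
# example workbook stands in so the sketch runs as written.
census_path <- readxl_example("datasets.xlsx")  # replace with the census file path

raw <- read_excel(census_path, sheet = 1)  # read_excel() returns a tibble
census_df <- as.data.frame(raw)            # convert to a plain data frame

# Keep only the columns needed for the analysis
# (the first two columns here are stand-ins for the census variables).
wanted <- names(census_df)[1:2]
selected <- census_df[, wanted]

str(selected)
```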

Data Cleaning

  • Column names cleaned using clean_names()
  • Selected required variables
  • Removed missing values
  • Renamed columns for clarity
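A minimal sketch of these cleaning steps, assuming janitor and dplyr are installed. The raw column headers and the toy rows below are hypothetical and only illustrate the shape of the data; the actual headers in the Census 2011 file may differ.

```r
library(dplyr)
library(janitor)

# Toy rows with hypothetical raw headers; values are illustrative only.
raw <- data.frame(
  "Area Name"        = c("Region A", "Region B", "Region C"),
  "Total Population" = c(1000, NA, 2500),
  "Male Population"  = c(520, 300, 1270),
  check.names = FALSE
)

clean <- raw %>%
  clean_names() %>%                                  # "Total Population" -> total_population
  select(area_name, total_population, male_population) %>%
  filter(!is.na(total_population),
         !is.na(male_population)) %>%                # drop rows with missing values
  rename(region = area_name,                         # shorter names for plotting
         total  = total_population,
         male   = male_population)

print(clean)
```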

Data Transformation

  • X-axis mapped to Total Population
  • Y-axis mapped to Male Population
  • Data prepared for visualization
  • Ensured numeric consistency
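One way to ensure numeric consistency is sketched below in base R. It assumes the population columns may arrive as text, possibly with thousands separators; the helper `to_number()` and the toy values are hypothetical.

```r
# Toy data with populations stored as text (illustrative values only).
census <- data.frame(
  region = c("Region A", "Region B"),
  total  = c("1,000", "2500"),
  male   = c("520", "1,270"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip thousands separators, then convert to numeric.
to_number <- function(x) as.numeric(gsub(",", "", as.character(x)))

plot_df <- transform(
  census,
  total = to_number(total),  # will map to the x-axis
  male  = to_number(male)    # will map to the y-axis
)

str(plot_df)
```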

Feature Engineering

  • Bubble size represents population
  • Color represents variation
  • Hover text shows region name
  • Together, these encodings improve interpretability
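The encodings above could be prepared as extra columns before plotting. The slides do not say which quantity drives the color, so this sketch assumes the male share of the population; that choice, the column names, and the toy values are all assumptions. The hover text also includes the totals, which is optional.

```r
# Toy data (illustrative values only).
plot_df <- data.frame(
  region = c("Region A", "Region B", "Region C"),
  total  = c(1000, 2500, 4000),
  male   = c(520, 1270, 2100)
)

# Assumption: color encodes the male share of the total population.
plot_df$male_share <- plot_df$male / plot_df$total

# Hover text: region name, with totals added as optional detail.
# "<br>" is the line break Plotly understands in hover labels.
plot_df$hover_text <- paste0(
  plot_df$region,
  "<br>Total: ", formatC(plot_df$total, format = "d", big.mark = ","),
  "<br>Male: ",  formatC(plot_df$male,  format = "d", big.mark = ",")
)

print(plot_df[, c("region", "male_share")])
```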

Visualization

The scatter plot uses Plotly to display the relationship between total population and male population. It provides an interactive view with zooming, hovering, and panning features.
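A minimal sketch of such a plot with the plotly R package follows. The data frame is a toy stand-in for the cleaned census data, and the color mapping (male share) is an assumption rather than something the slides specify.

```r
library(plotly)

# Toy stand-in for the cleaned census data (illustrative values only).
plot_df <- data.frame(
  region = c("Region A", "Region B", "Region C"),
  total  = c(1000, 2500, 4000),
  male   = c(520, 1270, 2100)
)

fig <- plot_ly(
  data = plot_df,
  x = ~total,               # x-axis: total population
  y = ~male,                # y-axis: male population
  type = "scatter",
  mode = "markers",
  size = ~total,            # bubble size follows total population
  sizes = c(10, 40),        # smallest and largest marker size
  color = ~male / total,    # assumption: color shows male share
  text = ~region,           # region name shown on hover
  hoverinfo = "text+x+y"
) %>%
  layout(
    title = "Male vs. Total Population (illustrative data)",
    xaxis = list(title = "Total Population"),
    yaxis = list(title = "Male Population")
  )

fig  # renders the interactive widget in RStudio or a browser
```

Zooming, panning, and hover labels come from the Plotly widget itself, so no extra code is needed for them.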


Create interactive scatter plot