Problem statement:

Develop a script in R to create a choropleth map showing literacy rates across Indian states using spatial visualization.

What will we do?

Load required libraries

Load and inspect dataset

Perform exploratory data analysis

Load India map (spatial data)

Merge dataset with map

Create choropleth map

Step 1: Load Required Libraries

In this step, we load the required libraries.

ggplot2 is used for creating visualizations.

dplyr helps in data manipulation.

maps supplies the map outline data used in Step 4.

```{r}
library(ggplot2)
library(dplyr)
library(maps)
```
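If any of these packages might be missing on your machine, a small check before the library() calls can save a confusing error later. The following is a minimal sketch; the names `required`, `is_installed` and `missing_pkgs` are illustrative and not part of the original script.

```r
# Packages the script relies on.
required <- c("ggplot2", "dplyr", "maps")

# requireNamespace() returns FALSE when a package cannot be loaded,
# and it does not attach anything to the search path.
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!is_installed]

if (length(missing_pkgs) > 0) {
  # Build a ready-to-copy install.packages() call for whatever is missing.
  message("Install first: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
} else {
  message("All required packages are available.")
}
```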

Step 2: Load Dataset

Here, we build a small dataset of literacy rates for six states with data.frame(), holding the 2001 and 2011 figures, and print it to inspect the values. State names are written in lowercase so that they are easier to match against map region names later.

```{r}
data <- data.frame(
  State = c("andhra pradesh", "karnataka", "tamil nadu", "kerala", "maharashtra", "bihar"),
  Literacy_2001 = c(60, 67, 73, 90, 77, 47),
  Literacy_2011 = c(67, 75, 80, 94, 83, 63)
)

data
```
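In practice the literacy figures would usually come from a file rather than being typed in. As a sketch of that workflow, the example below writes the same six rows to a temporary CSV (a stand-in for a real file, whose name and location this tutorial does not give), reads them back with read.csv(), and previews them with head() and str(). The names `literacy`, `csv_path` and `literacy_csv` are illustrative.

```r
# Same six rows as in Step 2.
literacy <- data.frame(
  State = c("andhra pradesh", "karnataka", "tamil nadu", "kerala", "maharashtra", "bihar"),
  Literacy_2001 = c(60, 67, 73, 90, 77, 47),
  Literacy_2011 = c(67, 75, 80, 94, 83, 63)
)

# Write to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
write.csv(literacy, csv_path, row.names = FALSE)

# Read it back; stringsAsFactors = FALSE keeps State as plain text.
literacy_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

head(literacy_csv, 3)  # first three rows
str(literacy_csv)      # column names and types
```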

Step 3: Exploratory Data Analysis

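We summarize the dataset with summary() and count missing values per column with colSums(is.na(data)). We then add a Growth column holding the change in literacy rate from 2001 to 2011, in percentage points.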

```{r}
summary(data)
colSums(is.na(data))

# Add growth column
data$Growth <- data$Literacy_2011 - data$Literacy_2001
data
```
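To see which states changed the most, the Growth column can be sorted. This sketch rebuilds the Step 2 data so that it runs on its own; `ranked` is an illustrative name.

```r
data <- data.frame(
  State = c("andhra pradesh", "karnataka", "tamil nadu", "kerala", "maharashtra", "bihar"),
  Literacy_2001 = c(60, 67, 73, 90, 77, 47),
  Literacy_2011 = c(67, 75, 80, 94, 83, 63)
)
data$Growth <- data$Literacy_2011 - data$Literacy_2001

# Sort from largest to smallest gain and keep the two columns of interest.
ranked <- data[order(-data$Growth), c("State", "Growth")]
ranked
```

With these figures, bihar shows the largest gain (16 percentage points) and kerala the smallest (4).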

Step 4: Load India Map

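We load map outlines with map_data("world"), a ggplot2 function that reads the world database from the maps package, and keep only the rows for India. This database contains country outlines only: its region column holds country names, so it has no boundaries for individual Indian states.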

```{r}
# Load map data
india_map <- map_data("world")

# Convert to lowercase for matching
india_map$region <- tolower(india_map$region)

# Filter India only
india_map <- india_map[india_map$region == "india", ]

head(india_map)
```
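Before merging, it helps to check which names the map data actually offers as join keys. A short sketch using only the packages from Step 1; `world_map` and `india_outline` are illustrative names.

```r
library(ggplot2)  # provides map_data()
library(maps)     # supplies the "world" database

world_map <- map_data("world")
india_outline <- world_map[tolower(world_map$region) == "india", ]

names(india_outline)                 # long, lat, group, order, region, subregion
unique(india_outline$region)         # the only candidate join key at this level
length(unique(india_outline$group))  # number of separate polygons
```

Because region holds country names, state names such as "kerala" never appear in it.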

Step 5: Merge Data with Map

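In this step, we combine the literacy dataset with the map data using left_join(), matching the map's region column against the dataset's State column. This step is crucial because it links data values with geographic regions: a state's literacy values carry over only if its name appears in region, and map rows without a match get NA. With the country-level map from Step 4, region contains only "india", so none of the six states match and the literacy columns are NA throughout. left_join() also keeps the map rows in their original order, which geom_polygon() needs in order to draw the outlines correctly.

```{r}
# Convert state names to lowercase
data$State <- tolower(data$State)

# Merge map and data, keeping every map row in its drawing order
merged_data <- left_join(india_map, data, by = c("region" = "State"))

head(merged_data)
```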
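A join on names silently produces NA for any key that fails to match, so comparing the two sets of keys first is worthwhile. A minimal base R sketch; `state_keys` repeats the State column, `map_regions` stands in for unique(india_map$region), and `unmatched` is an illustrative name.

```r
# Keys from the literacy dataset.
state_keys <- c("andhra pradesh", "karnataka", "tamil nadu",
                "kerala", "maharashtra", "bihar")

# Keys offered by the filtered world map (country level only).
map_regions <- "india"

# States that would receive no polygon in the join.
unmatched <- setdiff(state_keys, map_regions)
unmatched
```

A non-empty result means the map needs regions at the same level as the data before the values can be displayed.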

Step 6: Create Choropleth Map

Map for Literacy Rate (2011)

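Here, we create a choropleth map in which each region is filled according to its 2011 literacy rate. Darker shades indicate higher literacy, while lighter shades indicate lower literacy. Regions with no literacy value are drawn in grey (na.value = "grey50"); with the country-level outline from Step 4, which has no state polygons to match, that applies to the whole of India.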

```{r}
ggplot(merged_data, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = Literacy_2011), color = "black") +
  scale_fill_gradient(low = "lightblue", high = "darkblue", na.value = "grey50") +
  theme_minimal() +
  labs(
    title = "Literacy Rate in India (2011)",
    fill = "Literacy %"
  )
```
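To color individual states, the map data needs one set of polygons per state. One possible route, sketched here and not part of the original script, uses the rnaturalearth package (together with its rnaturalearthhires data package) and sf. Both packages are assumptions, as is the expectation that the `name` column of the returned boundaries spells the six states the same way as the dataset once lowercased. The helper `attach_literacy()` is hypothetical.

```r
literacy <- data.frame(
  State = c("andhra pradesh", "karnataka", "tamil nadu", "kerala", "maharashtra", "bihar"),
  Literacy_2011 = c(67, 75, 80, 94, 83, 63)
)

# Hypothetical helper: copy the literacy column onto the boundaries by
# matching lowercased names; states without a match get NA.
attach_literacy <- function(states, literacy) {
  idx <- match(tolower(states$name), literacy$State)
  states$Literacy_2011 <- literacy$Literacy_2011[idx]
  states
}

# Only attempt the map when every assumed package is installed.
pkgs <- c("ggplot2", "sf", "rnaturalearth", "rnaturalearthhires")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  library(ggplot2)

  # Assumption: ne_states() returns one row per state with a `name` column.
  india_states <- rnaturalearth::ne_states(country = "india", returnclass = "sf")
  india_states <- attach_literacy(india_states, literacy)

  p <- ggplot(india_states) +
    geom_sf(aes(fill = Literacy_2011), color = "black") +
    scale_fill_gradient(low = "lightblue", high = "darkblue", na.value = "grey50") +
    theme_minimal() +
    labs(title = "Literacy Rate in India (2011)", fill = "Literacy %")
  print(p)
} else {
  message("Install ggplot2, sf, rnaturalearth and rnaturalearthhires to run this sketch.")
}
```

States missing from the six-row dataset stay grey, so the map still shows which regions have no data.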
