| Column | Missing values |
|---|---|
| Sl..No. | 0 |
| State.UT | 0 |
| District | 0 |
| Grade | 0 |
| District.score.2021.22...Overall | 0 |
| District.score.2021.22...Category...1.Outcome..290. | 0 |
| District.score.2021.22...Category...2..ECT..90. | 0 |
| District.score.2021.22...Category...3..IF.SE..51. | 0 |
| District.score.2021.22...Category...4.SS.CP..35. | 0 |
| District.score.2021.22...Category...5..DL..50. | 0 |
| District.score.2021.22...Category...6..GP..84. | 0 |
| | Sl..No. | State.UT | District | Grade | District.score.2021.22...Overall | District.score.2021.22...Category...1.Outcome..290. | District.score.2021.22...Category...2..ECT..90. | District.score.2021.22...Category...3..IF.SE..51. | District.score.2021.22...Category...4.SS.CP..35. | District.score.2021.22...Category...5..DL..50. | District.score.2021.22...Category...6..GP..84. | Region |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Andaman and Nicobar Islands | Middle and North Andamans | Uttam | 381 | 139 | 85 | 39 | 32 | 15 | 69 | Other |
| 2 | 2 | Andaman and Nicobar Islands | Andamans | Uttam | 375 | 134 | 86 | 38 | 30 | 22 | 63 | Other |
| 3 | 3 | Andaman and Nicobar Islands | Nicobars | Uttam | 367 | 134 | 85 | 40 | 33 | 16 | 59 | Other |
| 4 | 4 | Andhra Pradesh | Visakhapatnam | Uttam | 397 | 153 | 84 | 41 | 35 | 18 | 66 | Southern |
| 5 | 5 | Andhra Pradesh | Guntur | Uttam | 393 | 146 | 86 | 36 | 35 | 16 | 74 | Southern |
| 6 | 6 | Andhra Pradesh | Vizianagaram | Uttam | 390 | 148 | 83 | 42 | 35 | 14 | 68 | Southern |
| Name | district_data |
|---|---|
| Number of rows | 748 |
| Number of columns | 12 |
| Column type frequency: | |
| character | 4 |
| numeric | 8 |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| State.UT | 0 | 1 | 3 | 40 | 0 | 36 | 0 |
| District | 0 | 1 | 3 | 29 | 0 | 748 | 0 |
| Grade | 0 | 1 | 5 | 11 | 0 | 6 | 0 |
| Region | 0 | 1 | 5 | 8 | 0 | 3 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| Sl..No. | 0 | 1 | 374.50 | 216.07 | 1 | 187.75 | 374.5 | 561.25 | 748 | ▇▇▇▇▇ |
| District.score.2021.22...Overall | 0 | 1 | 346.10 | 51.21 | 140 | 312.75 | 348.0 | 383.00 | 468 | ▁▂▆▇▂ |
| District.score.2021.22...Category...1.Outcome..290. | 0 | 1 | 132.45 | 23.64 | 70 | 115.75 | 131.0 | 148.00 | 212 | ▁▇▇▃▁ |
| District.score.2021.22...Category...2..ECT..90. | 0 | 1 | 76.77 | 10.23 | 11 | 74.00 | 80.0 | 83.00 | 88 | ▁▁▁▂▇ |
| District.score.2021.22...Category...3..IF.SE..51. | 0 | 1 | 37.02 | 6.34 | 11 | 33.00 | 38.0 | 42.00 | 49 | ▁▁▃▇▅ |
| District.score.2021.22...Category...4.SS.CP..35. | 0 | 1 | 27.71 | 7.41 | 0 | 25.00 | 30.0 | 34.00 | 35 | ▁▁▂▃▇ |
| District.score.2021.22...Category...5..DL..50. | 0 | 1 | 13.10 | 8.21 | 1 | 6.00 | 12.0 | 18.00 | 34 | ▇▇▅▃▂ |
| District.score.2021.22...Category...6..GP..84. | 0 | 1 | 59.09 | 12.38 | 14 | 53.00 | 59.0 | 66.00 | 84 | ▁▂▆▇▃ |
'data.frame': 748 obs. of 12 variables:
$ Sl..No. : int 1 2 3 4 5 6 7 8 9 10 ...
$ State.UT : chr "Andaman and Nicobar Islands" "Andaman and Nicobar Islands" "Andaman and Nicobar Islands" "Andhra Pradesh" ...
$ District : chr "Middle and North Andamans" "Andamans" "Nicobars" "Visakhapatnam" ...
$ Grade : chr "Uttam" "Uttam" "Uttam" "Uttam" ...
$ District.score.2021.22...Overall : int 381 375 367 397 393 390 385 383 381 379 ...
$ District.score.2021.22...Category...1.Outcome..290.: int 139 134 134 153 146 148 139 149 134 139 ...
$ District.score.2021.22...Category...2..ECT..90. : int 85 86 85 84 86 83 83 85 84 82 ...
$ District.score.2021.22...Category...3..IF.SE..51. : int 39 38 40 41 36 42 42 32 42 36 ...
$ District.score.2021.22...Category...4.SS.CP..35. : int 32 30 33 35 35 35 35 35 35 35 ...
$ District.score.2021.22...Category...5..DL..50. : int 15 22 16 18 16 14 16 12 14 14 ...
$ District.score.2021.22...Category...6..GP..84. : int 69 63 59 66 74 68 69 69 72 73 ...
$ Region : chr "Other" "Other" "Other" "Southern" ...
| | District.score.2021.22...Overall | District.score.2021.22...Category...1.Outcome..290. | District.score.2021.22...Category...2..ECT..90. |
|---|---|---|---|
| District.score.2021.22...Overall | 1.0000000 | 0.8315051 | 0.7896997 |
| District.score.2021.22...Category...1.Outcome..290. | 0.8315051 | 1.0000000 | 0.5073756 |
| District.score.2021.22...Category...2..ECT..90. | 0.7896997 | 0.5073756 | 1.0000000 |
---
title: "District_PGI Performance Dashboard"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: scroll
    theme: paper
    social: menu
    source_code: embed
---
```{r setup, include=FALSE}
# Load necessary libraries
library(tidyverse)      # attaches dplyr, tidyr, ggplot2 and the other core packages
library(skimr)          # skim() summary statistics
library(flexdashboard)  # dashboard layout
library(DT)             # interactive tables
# Set default figure width and height
knitr::opts_chunk$set(fig.width = 8, fig.height = 6)
```
# Missing Values
### Checking for Missing Values
```{r}
# Load Dataset from CSV file
district_data <- read.csv("District_PGI_Table_1.csv")
```
```{r}
# Check for missing values
colSums(is.na(district_data))
```
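The per-column counts from `colSums(is.na(...))` can also be reported as a share of rows, which is easier to compare when columns differ in how much is missing. A minimal base R sketch, assuming a small made-up data frame (`toy` and its values are illustrative and not taken from `District_PGI_Table_1.csv`):

```r
# Toy data frame standing in for district_data (values are made up)
toy <- data.frame(
  District = c("A", "B", "C", "D"),
  Overall  = c(381, NA, 367, 397),
  ECT      = c(85, 86, NA, NA)
)

# Count and percentage of missing values per column
missing_summary <- data.frame(
  column      = names(toy),
  n_missing   = unname(colSums(is.na(toy))),
  pct_missing = unname(round(100 * colMeans(is.na(toy)), 1))
)
print(missing_summary)
```

Applied to `district_data`, every count would be 0, which matches the `n_missing` column reported by `skim()`.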
### Create Region column based on States
```{r}
# Create Region column based on States
district_data <- district_data %>%
  mutate(Region = case_when(
    State.UT %in% c("Andhra Pradesh", "Karnataka", "Kerala", "Tamil Nadu", "Telangana") ~ "Southern",
    State.UT %in% c("Punjab", "Haryana", "Uttar Pradesh", "Bihar") ~ "Northern",
    TRUE ~ "Other"
  ))
# Print a sample to verify
print(head(district_data))
```
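`%in%` matches state names exactly, so any spelling variant in `State.UT` (for example a trailing space) would fall through to "Other" without warning. A hedged sketch of a check for that, in base R. The `states` vector is made up, with one deliberately misspelled entry. Whether the real file contains such variants is not known from the output shown here.

```r
# Same state groups as in the Region chunk
southern <- c("Andhra Pradesh", "Karnataka", "Kerala", "Tamil Nadu", "Telangana")
northern <- c("Punjab", "Haryana", "Uttar Pradesh", "Bihar")

# Made-up State.UT values; note the trailing space in "Kerala "
states <- c("Andhra Pradesh", "Karnataka", "Kerala ", "Tamil Nadu", "Telangana",
            "Punjab", "Haryana", "Uttar Pradesh", "Bihar", "Goa")

# Names the mapping expects but the data does not contain exactly as spelled
unmatched <- setdiff(c(southern, northern), unique(states))
print(unmatched)

# Trimming whitespace before matching recovers the misspelled entry
clean <- trimws(states)
region <- ifelse(clean %in% southern, "Southern",
                 ifelse(clean %in% northern, "Northern", "Other"))
print(table(region))
```

An empty `unmatched` result on the real data would indicate that every listed state was found under the spelling used in the mapping.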
# Dataset Description {.tabset}
### Summary
```{r}
# Summary Statistics for the dataset
skim(district_data)
```
# Structure of the Data
```{r}
# Structure of the dataset
str(district_data)
```
## Univariate Analysis {.tabset .tabset-fade}
### Histogram of Overall Scores
```{r}
ggplot(district_data, aes(x = District.score.2021.22...Overall)) +
  geom_histogram(binwidth = 5, fill = "skyblue", color = "black") +
  labs(title = "Histogram of District Overall Scores (2021-22)",
       x = "Overall District Score", y = "Count")
```
### Boxplot of Overall Scores
```{r}
ggplot(district_data, aes(y = District.score.2021.22...Overall)) +
  geom_boxplot(fill = "lightblue", color = "black") +
  labs(title = "Boxplot of District Overall Scores", y = "Overall Score")
```
## Bivariate Analysis {.tabset .tabset-fade}
### Scatter Plot: ECT Score vs Overall Score
```{r}
ggplot(district_data, aes(x = District.score.2021.22...Category...2..ECT..90.,
                          y = District.score.2021.22...Overall)) +
  geom_point(color = "blue") +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  labs(title = "Scatter Plot: ECT Score vs Overall Score",
       x = "ECT Score", y = "Overall Score")
```
### Boxplot of Overall Scores by Region
```{r}
ggplot(district_data, aes(x = Region, y = District.score.2021.22...Overall, fill = Region)) +
  geom_boxplot() +
  labs(title = "Boxplot of Overall Scores by Region", x = "Region", y = "Overall Score")
```
## Multivariate Analysis {.tabset .tabset-fade}
### Correlation Matrix of Scores
```{r}
# Correlation Matrix for Selected Columns
cor_matrix <- cor(
  district_data[, c("District.score.2021.22...Overall",
                    "District.score.2021.22...Category...1.Outcome..290.",
                    "District.score.2021.22...Category...2..ECT..90.")],
  use = "complete.obs"
)
cor_matrix
```
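The long column names make the printed correlation matrix wrap across several blocks. One way to keep it on a single screen is to select the score columns by pattern and give them short labels before calling `cor()`. A minimal base R sketch, assuming the six rows printed by `head(district_data)` as sample data; the short labels `Overall`, `Outcome` and `ECT` are hypothetical names chosen for illustration:

```r
# First six rows of three score columns, copied from the head() output
toy <- data.frame(
  State.UT = c(rep("Andaman and Nicobar Islands", 3), rep("Andhra Pradesh", 3)),
  District.score.2021.22...Overall = c(381, 375, 367, 397, 393, 390),
  District.score.2021.22...Category...1.Outcome..290. = c(139, 134, 134, 153, 146, 148),
  District.score.2021.22...Category...2..ECT..90. = c(85, 86, 85, 84, 86, 83)
)

# Select every column whose name starts with "District.score"
score_cols <- grep("^District\\.score", names(toy), value = TRUE)
scores <- toy[, score_cols]

# Hypothetical short labels, in the same order as score_cols
names(scores) <- c("Overall", "Outcome", "ECT")

# Rounded correlation matrix with compact row and column names
cor_short <- round(cor(scores, use = "complete.obs"), 2)
print(cor_short)
```

With only six rows the coefficients themselves are not meaningful; the point is the compact layout, which carries over unchanged to the full 748-row data frame.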
### Heatmap: State-wise PGI Performance
```{r}
ggplot(district_data, aes(x = State.UT,
                          y = District.score.2021.22...Category...1.Outcome..290.,
                          fill = District.score.2021.22...Overall)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "red") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Heatmap of State-wise PGI Scores", x = "State/UT", y = "Outcome Score")
```
### Line Plot
```{r}
# Reuse district_data from the earlier chunks; re-reading the CSV here would drop the Region column
# Reshape the selected score columns to long format for the multivariate line plot
district_data_long <- district_data %>%
  pivot_longer(cols = c("District.score.2021.22...Overall",
                        "District.score.2021.22...Category...1.Outcome..290.",
                        "District.score.2021.22...Category...2..ECT..90."),
               names_to = "Score_Type", values_to = "Score_Value")
# Line plot comparing district scores across different categories by State/UT
ggplot(district_data_long, aes(x = State.UT, y = Score_Value, color = Score_Type, group = Score_Type)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_point(size = 2) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Line Plot: District Scores by State/UT and Score Category",
       x = "State/UT", y = "Score Value", color = "Score Category")
```
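Each State/UT contains several districts, so plotting district rows directly against `State.UT` places several points per state on each line. One option is to average the scores within each state first. This is only a sketch under the assumption that a per-state mean is the quantity of interest, which the dashboard itself does not state. It uses base R and the six rows printed by `head(district_data)`, with the hypothetical short column names `Overall` and `ECT`:

```r
# First six rows from the head() output, with shortened (hypothetical) column names
toy <- data.frame(
  State.UT = c(rep("Andaman and Nicobar Islands", 3), rep("Andhra Pradesh", 3)),
  Overall  = c(381, 375, 367, 397, 393, 390),
  ECT      = c(85, 86, 85, 84, 86, 83)
)

# Mean of each score column within each State/UT: one row per state
state_means <- aggregate(cbind(Overall, ECT) ~ State.UT, data = toy, FUN = mean)
print(state_means)
```

The resulting data frame has one value per state and score, so reshaping it to long format and plotting it yields a single point per state on each line.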