Missing values

checking for missing values

                                            Sl..No. 
                                                  0 
                                           State.UT 
                                                  0 
                                           District 
                                                  0 
                                              Grade 
                                                  0 
                   District.score.2021.22...Overall 
                                                  0 
District.score.2021.22...Category...1.Outcome..290. 
                                                  0 
    District.score.2021.22...Category...2..ECT..90. 
                                                  0 
  District.score.2021.22...Category...3..IF.SE..51. 
                                                  0 
   District.score.2021.22...Category...4.SS.CP..35. 
                                                  0 
     District.score.2021.22...Category...5..DL..50. 
                                                  0 
     District.score.2021.22...Category...6..GP..84. 
                                                  0 

Create Region column based on States

  Sl..No.                    State.UT                  District Grade
1       1 Andaman and Nicobar Islands Middle and North Andamans Uttam
2       2 Andaman and Nicobar Islands                  Andamans Uttam
3       3 Andaman and Nicobar Islands                  Nicobars Uttam
4       4              Andhra Pradesh             Visakhapatnam Uttam
5       5              Andhra Pradesh                    Guntur Uttam
6       6              Andhra Pradesh              Vizianagaram Uttam
  District.score.2021.22...Overall
1                              381
2                              375
3                              367
4                              397
5                              393
6                              390
  District.score.2021.22...Category...1.Outcome..290.
1                                                 139
2                                                 134
3                                                 134
4                                                 153
5                                                 146
6                                                 148
  District.score.2021.22...Category...2..ECT..90.
1                                              85
2                                              86
3                                              85
4                                              84
5                                              86
6                                              83
  District.score.2021.22...Category...3..IF.SE..51.
1                                                39
2                                                38
3                                                40
4                                                41
5                                                36
6                                                42
  District.score.2021.22...Category...4.SS.CP..35.
1                                               32
2                                               30
3                                               33
4                                               35
5                                               35
6                                               35
  District.score.2021.22...Category...5..DL..50.
1                                             15
2                                             22
3                                             16
4                                             18
5                                             16
6                                             14
  District.score.2021.22...Category...6..GP..84.   Region
1                                             69    Other
2                                             63    Other
3                                             59    Other
4                                             66 Southern
5                                             74 Southern
6                                             68 Southern

dataset description

summary

Data summary
Name district_data
Number of rows 748
Number of columns 12
_______________________
Column type frequency:
character 4
numeric 8
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
State.UT 0 1 3 40 0 36 0
District 0 1 3 29 0 748 0
Grade 0 1 5 11 0 6 0
Region 0 1 5 8 0 3 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
Sl..No. 0 1 374.50 216.07 1 187.75 374.5 561.25 748 ▇▇▇▇▇
District.score.2021.22...Overall 0 1 346.10 51.21 140 312.75 348.0 383.00 468 ▁▂▆▇▂
District.score.2021.22...Category...1.Outcome..290. 0 1 132.45 23.64 70 115.75 131.0 148.00 212 ▁▇▇▃▁
District.score.2021.22...Category...2..ECT..90. 0 1 76.77 10.23 11 74.00 80.0 83.00 88 ▁▁▁▂▇
District.score.2021.22...Category...3..IF.SE..51. 0 1 37.02 6.34 11 33.00 38.0 42.00 49 ▁▁▃▇▅
District.score.2021.22...Category...4.SS.CP..35. 0 1 27.71 7.41 0 25.00 30.0 34.00 35 ▁▁▂▃▇
District.score.2021.22...Category...5..DL..50. 0 1 13.10 8.21 1 6.00 12.0 18.00 34 ▇▇▅▃▂
District.score.2021.22...Category...6..GP..84. 0 1 59.09 12.38 14 53.00 59.0 66.00 84 ▁▂▆▇▃

structure of the data

'data.frame':   748 obs. of  12 variables:
 $ Sl..No.                                            : int  1 2 3 4 5 6 7 8 9 10 ...
 $ State.UT                                           : chr  "Andaman and Nicobar Islands" "Andaman and Nicobar Islands" "Andaman and Nicobar Islands" "Andhra Pradesh" ...
 $ District                                           : chr  "Middle and North Andamans" "Andamans" "Nicobars" "Visakhapatnam" ...
 $ Grade                                              : chr  "Uttam" "Uttam" "Uttam" "Uttam" ...
 $ District.score.2021.22...Overall                   : int  381 375 367 397 393 390 385 383 381 379 ...
 $ District.score.2021.22...Category...1.Outcome..290.: int  139 134 134 153 146 148 139 149 134 139 ...
 $ District.score.2021.22...Category...2..ECT..90.    : int  85 86 85 84 86 83 83 85 84 82 ...
 $ District.score.2021.22...Category...3..IF.SE..51.  : int  39 38 40 41 36 42 42 32 42 36 ...
 $ District.score.2021.22...Category...4.SS.CP..35.   : int  32 30 33 35 35 35 35 35 35 35 ...
 $ District.score.2021.22...Category...5..DL..50.     : int  15 22 16 18 16 14 16 12 14 14 ...
 $ District.score.2021.22...Category...6..GP..84.     : int  69 63 59 66 74 68 69 69 72 73 ...
 $ Region                                             : chr  "Other" "Other" "Other" "Southern" ...

Univariate Analysis

Histogram of Overall Scores

Boxplot of Overall Scores

Bivariate Analysis

Scatter Plot: ECT Score vs Overall Score

Boxplot of Overall Scores by Region

Multivariate Analysis

Correlation Matrix of Scores

                                                    District.score.2021.22...Overall
District.score.2021.22...Overall                                           1.0000000
District.score.2021.22...Category...1.Outcome..290.                        0.8315051
District.score.2021.22...Category...2..ECT..90.                            0.7896997
                                                    District.score.2021.22...Category...1.Outcome..290.
District.score.2021.22...Overall                                                              0.8315051
District.score.2021.22...Category...1.Outcome..290.                                           1.0000000
District.score.2021.22...Category...2..ECT..90.                                               0.5073756
                                                    District.score.2021.22...Category...2..ECT..90.
District.score.2021.22...Overall                                                          0.7896997
District.score.2021.22...Category...1.Outcome..290.                                       0.5073756
District.score.2021.22...Category...2..ECT..90.                                           1.0000000

Heatmap: State-wise PGI Performance

Line Plot

---
title: "District_PGI Performance Dashboard"
output: 
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: scroll
    theme: paper
    social: menu
    source_code: embed
---

```{r setup, include=FALSE}

# Load necessary libraries
library(tidyverse)      # attaches dplyr, tidyr, ggplot2 and related packages
library(skimr)          # skim() summary tables
library(flexdashboard)
library(DT)
# Set default figure width and height
knitr::opts_chunk$set(fig.width = 8, fig.height = 6)

```

# Missing values

### checking for missing values
```{r}
# Load Dataset from CSV file
district_data <- read.csv("District_PGI_Table_1.csv")
```
```{r}

# Check for missing values
colSums(is.na(district_data))
```
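
`is.na()` does not flag blank strings, which `read.csv()` keeps as `""` in character columns. As a rough sketch, a helper along the following lines could report both kinds of gap per column. The name `count_missing` is hypothetical and not part of the original dashboard.

```r
# Hypothetical helper (not in the original dashboard): count NA cells and
# blank/whitespace-only strings for every column of a data frame.
count_missing <- function(df) {
  data.frame(
    column  = names(df),
    n_na    = vapply(df, function(x) sum(is.na(x)), integer(1)),
    n_empty = vapply(df, function(x) {
      # Only character columns can hold empty strings
      if (is.character(x)) sum(!is.na(x) & trimws(x) == "") else 0L
    }, integer(1)),
    row.names = NULL
  )
}

# Usage in the dashboard (assumes district_data has been loaded as above):
# count_missing(district_data)
```

Reporting the two counts separately keeps true `NA` values apart from blank text, which usually needs different handling.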

### Create Region column based on States
```{r}
# Create Region column based on States
district_data <- district_data %>%
  mutate(Region = case_when(
    State.UT %in% c("Andhra Pradesh", "Karnataka", "Kerala", "Tamil Nadu", "Telangana") ~ "Southern",
    State.UT %in% c("Punjab", "Haryana", "Uttar Pradesh", "Bihar") ~ "Northern",
    TRUE ~ "Other"
  ))

# Print a sample to verify
print(head(district_data))

```
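
If more regions are added later, a named lookup vector may be easier to maintain than a growing `case_when()`. The sketch below reuses the state lists from the chunk above; `region_lookup` and `assign_region` are hypothetical names introduced only for illustration.

```r
# Hypothetical lookup table built from the same state lists used above.
region_lookup <- c(
  "Andhra Pradesh" = "Southern", "Karnataka" = "Southern",
  "Kerala" = "Southern", "Tamil Nadu" = "Southern", "Telangana" = "Southern",
  "Punjab" = "Northern", "Haryana" = "Northern",
  "Uttar Pradesh" = "Northern", "Bihar" = "Northern"
)

# Map a character vector of state names to regions; states that are not in
# the lookup table fall back to "Other", mirroring the TRUE ~ "Other" branch.
assign_region <- function(state) {
  region <- unname(region_lookup[state])
  region[is.na(region)] <- "Other"
  region
}

# Usage in the dashboard (assumes district_data has been loaded as above):
# district_data$Region <- assign_region(district_data$State.UT)
# table(district_data$Region)   # quick check of how many districts land in each region
```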

# dataset description {.tabset}

### summary
```{r}
# Summary Statistics for the dataset
skim(district_data)

```

### structure of the data
```{r}
# Structure of the dataset
str(district_data)

```




## Univariate Analysis {.tabset .tabset-fade}

### Histogram of Overall Scores
```{r}

ggplot(district_data, aes(x = District.score.2021.22...Overall)) +
  geom_histogram(binwidth = 5, fill = "skyblue", color = "black") +
  labs(title = "Histogram of District Overall Scores (2021-22)",
       x = "Overall District Score", y = "Count")

```

### Boxplot of Overall Scores
```{r}
ggplot(district_data, aes(y = District.score.2021.22...Overall)) +
  geom_boxplot(fill = "lightblue", color = "black") +
  labs(title = "Boxplot of District Overall Scores", y = "Overall Score")

```

## Bivariate Analysis {.tabset .tabset-fade}

### Scatter Plot: ECT Score vs Overall Score
```{r}
ggplot(district_data, aes(x = District.score.2021.22...Category...2..ECT..90., y = District.score.2021.22...Overall)) +
  geom_point(color = "blue") +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  labs(title = "Scatter Plot: ECT Score vs Overall Score",
       x = "ECT Score", y = "Overall Score")

```

### Boxplot of Overall Scores by Region
```{r}
ggplot(district_data, aes(x = Region, y = District.score.2021.22...Overall, fill = Region)) +
  geom_boxplot() +
  labs(title = "Boxplot of Overall Scores by Region", x = "Region", y = "Overall Score")

```

## Multivariate Analysis {.tabset .tabset-fade}

### Correlation Matrix of Scores

```{r}
# Correlation Matrix for Selected Columns
cor_matrix <- cor(district_data[, c("District.score.2021.22...Overall", 
                           "District.score.2021.22...Category...1.Outcome..290.", 
                           "District.score.2021.22...Category...2..ECT..90.")], use = "complete.obs")
cor_matrix

```
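
The matrix above covers three of the seven score columns. A possible extension, sketched here under the assumption that every score column name starts with `District.score`, selects them all by prefix and shortens the labels for readability. `score_correlations` is a hypothetical helper, not part of the original dashboard.

```r
# Hypothetical helper: correlation matrix over every column whose name
# starts with "District.score" (assumed naming pattern, as in this dataset).
score_correlations <- function(df) {
  score_cols <- grep("^District\\.score", names(df), value = TRUE)
  cor_matrix <- cor(df[, score_cols, drop = FALSE], use = "complete.obs")

  # Drop the shared "District.score.2021.22..." prefix from the labels
  short_names <- sub("^District\\.score\\.2021\\.22\\.+", "", score_cols)
  dimnames(cor_matrix) <- list(short_names, short_names)

  round(cor_matrix, 2)
}

# Usage in the dashboard (assumes district_data has been loaded as above):
# score_correlations(district_data)
```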


### Heatmap: State-wise PGI Performance
```{r}
ggplot(district_data, aes(x = State.UT, y = District.score.2021.22...Category...1.Outcome..290., fill = District.score.2021.22...Overall)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "red") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Heatmap of State-wise PGI Scores", x = "State/UT", y = "Outcome Score")

```
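
Because the plot above places one tile per district at its raw Outcome score, tiles within a state can overlap. One alternative, sketched below, aggregates to a State/UT by category grid first. It assumes that the trailing number in each category column name (for example `..290.`) is that category's maximum score, so that categories can be compared on a 0 to 1 scale. `state_category_share` and `plot_state_heatmap` are hypothetical names.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical helper: mean share of the maximum score per state and category.
# ASSUMPTION: the trailing number in each category column name (e.g. "..290.")
# is the maximum attainable score for that category.
state_category_share <- function(df) {
  df %>%
    select(State.UT, contains("Category")) %>%
    pivot_longer(-State.UT, names_to = "Category", values_to = "Score") %>%
    mutate(
      Max_Score = as.numeric(sub(".*\\.(\\d+)\\.$", "\\1", Category)),
      Category  = sub(".*Category\\.+", "", Category)
    ) %>%
    group_by(State.UT, Category) %>%
    summarise(Mean_Share = mean(Score / Max_Score, na.rm = TRUE),
              .groups = "drop")
}

# Draw one tile per State/UT and category, filled by the mean share.
plot_state_heatmap <- function(df) {
  ggplot(state_category_share(df),
         aes(x = Category, y = State.UT, fill = Mean_Share)) +
    geom_tile(color = "white") +
    scale_fill_gradient(low = "white", high = "red", limits = c(0, 1)) +
    labs(title = "Mean Share of Maximum Score by State/UT and Category",
         x = "Category", y = "State/UT", fill = "Mean share")
}

# Usage in the dashboard (assumes district_data has been loaded as above):
# plot_state_heatmap(district_data)
```

Scaling by the assumed maximum keeps the 290-point Outcome category from dominating the colour scale.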

### Line Plot
```{r}
# Reuse district_data loaded above; re-reading the CSV here would drop the Region column

# Reshape the data for multivariate line plot
district_data_long <- district_data %>%
  pivot_longer(cols = c("District.score.2021.22...Overall", 
                        "District.score.2021.22...Category...1.Outcome..290.", 
                        "District.score.2021.22...Category...2..ECT..90."),
               names_to = "Score_Type", values_to = "Score_Value")

# Line plot comparing district scores across different categories by State/UT
ggplot(district_data_long, aes(x = State.UT, y = Score_Value, color = Score_Type, group = Score_Type)) +
  geom_line(linewidth = 1) +
  geom_point(size = 2) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Line Plot: District Scores by State/UT and Score Category",
       x = "State/UT", y = "Score Value", color = "Score Category")

```