
| | Year | Defence.Public.Sector.Undertakings..in.Rs.Cr. | New.Defence.Public.Sector.Undertakings..in.Rs.Cr. | Other.Public.Sector.Undertakings.Joint.Ventures..in.Rs.Cr. | Defence.Private.Companies..in.Rs.Cr. | Total.Production..in.Rs.Cr. |
|---|---|---|---|---|---|---|
| 1 | 2016-17 | 40427 | 14825 | 4698 | 14104 | 74054 |
| 2 | 2017-18 | 43464 | 14829 | 5180 | 15347 | 78820 |
| 3 | 2018-19 | 45387 | 12816 | 5567 | 17350 | 81120 |
| 4 | 2019-20 | 47655 | 9227 | 6295 | 15894 | 79071 |
| 5 | 2020-21 | 46711 | 14635 | 6029 | 17268 | 84643 |
| 6 | 2021-22 | 55790 | 11913 | 7222 | 19920 | 94845 |


| Year | Defence.Public.Sector.Undertakings..in.Rs.Cr. | New.Defence.Public.Sector.Undertakings..in.Rs.Cr. | Other.Public.Sector.Undertakings.Joint.Ventures..in.Rs.Cr. | Defence.Private.Companies..in.Rs.Cr. | Total.Production..in.Rs.Cr. |
|---|---|---|---|---|---|
| 2016-17:1 | Min. :40427 | Min. : 9227 | Min. :4698 | Min. :14104 | Min. : 74054 |
| 2017-18:1 | 1st Qu.:44906 | 1st Qu.:12590 | 1st Qu.:5470 | 1st Qu.:15757 | 1st Qu.: 79008 |
| 2018-19:1 | Median :47183 | Median :14730 | Median :6162 | Median :17309 | Median : 82882 |
| 2019-20:1 | Mean :52106 | Mean :14363 | Mean :6113 | Mean :18434 | Mean : 91016 |
| 2020-21:1 | 3rd Qu.:57709 | 3rd Qu.:15371 | 3rd Qu.:6865 | 3rd Qu.:20211 | 3rd Qu.: 98305 |
| 2021-22:1 | Max. :73945 | Max. :19662 | Max. :7222 | Max. :26506 | Max. :126887 |
| (Other):2 | | | | | |


| Column | Missing values |
|---|---|
| Year | 0 |
| Defence.Public.Sector.Undertakings..in.Rs.Cr. | 0 |
| New.Defence.Public.Sector.Undertakings..in.Rs.Cr. | 0 |
| Other.Public.Sector.Undertakings.Joint.Ventures..in.Rs.Cr. | 0 |
| Defence.Private.Companies..in.Rs.Cr. | 0 |
| Total.Production..in.Rs.Cr. | 0 |

---
title: "EDA for Gov data set"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: scroll
    theme: journal
    social: menu
    source_code: embed
---
```{r setup, include=FALSE}
library(flexdashboard)
library(ggplot2)
library(dplyr)
library(reshape2)
library(corrplot)
```
## Load Data {.tabset}
### View the Data Set
```{r}
# Load the dataset
df <- read.csv("Defence_Production_22082024.csv")
# Data cleaning: strip thousands separators and convert every non-Year column to numeric
num_cols <- setdiff(names(df), "Year")
df[num_cols] <- lapply(df[num_cols], function(x) as.numeric(gsub(",", "", x)))
# Convert Year to a factor for better plotting
df$Year <- as.factor(df$Year)
# Display the first few rows of the dataset
head(df)
```
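
A quick consistency check can confirm that the cleaning step kept the numbers intact: the four sector columns should add up to the reported total. The sketch below is an illustration only. It rebuilds the six rows shown by `head(df)` under short hypothetical column names (`dpsu`, `new_dpsu`, `other`, `private`, `total`) so that it runs without the CSV file.

```{r}
# Assumption: these six rows are copied from the head(df) output above;
# the short column names are hypothetical stand-ins for the real ones.
check <- data.frame(
  Year     = c("2016-17", "2017-18", "2018-19", "2019-20", "2020-21", "2021-22"),
  dpsu     = c(40427, 43464, 45387, 47655, 46711, 55790),
  new_dpsu = c(14825, 14829, 12816, 9227, 14635, 11913),
  other    = c(4698, 5180, 5567, 6295, 6029, 7222),
  private  = c(14104, 15347, 17350, 15894, 17268, 19920),
  total    = c(74054, 78820, 81120, 79071, 84643, 94845)
)

# Sum the sector columns row by row and compare with the reported total
sector_cols <- c("dpsu", "new_dpsu", "other", "private")
check$sector_sum <- rowSums(check[sector_cols])
check$diff <- check$total - check$sector_sum

# A diff of 0 in every row means the total is fully explained by the sectors
check[c("Year", "total", "sector_sum", "diff")]
```

To run the same check on the full dataset, apply `rowSums()` to the four sector columns of the cleaned `df` and compare the result with `df$Total.Production..in.Rs.Cr.`.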
### Summary
```{r}
# Summary of the dataset
summary(df)
# Check for any missing values
colSums(is.na(df))
```
## Plots {.tabset}
### Histogram of Total Production
```{r}
# 1. Histogram of Total Production
ggplot(df, aes(x = Total.Production..in.Rs.Cr.)) +
  geom_histogram(binwidth = 5000, fill = "blue", color = "black") +
  labs(title = "Histogram of Total Production", x = "Total Production (Rs Cr)", y = "Frequency")
```
### Boxplot of Total Production by Year
```{r}
# 2. Boxplot of Total Production by Year
# Note: each year appears once in the data (see the summary), so every box collapses to a single line
ggplot(df, aes(x = Year, y = Total.Production..in.Rs.Cr.)) +
  geom_boxplot(fill = "green") +
  labs(title = "Boxplot of Total Production by Year", x = "Year", y = "Total Production (Rs Cr)")
```
### Scatter Plot of Defence Public Sector Undertakings vs Total Production
```{r}
# 3. Scatter plot: Total Production vs Defence Public Sector Undertakings
ggplot(df, aes(x = Defence.Public.Sector.Undertakings..in.Rs.Cr., y = Total.Production..in.Rs.Cr.)) +
  geom_point(color = "red") +
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "Scatter Plot of Defence Public Sector Undertakings vs Total Production",
       x = "Defence Public Sector Undertakings (Rs Cr)",
       y = "Total Production (Rs Cr)")
```
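
The straight line drawn by `geom_smooth(method = "lm")` can be backed by numbers. The sketch below is an illustration, not part of the original analysis: it computes the Pearson correlation and the fitted intercept and slope for the six rows shown by `head(df)`. The vectors `dpsu` and `total` are hypothetical names, and the full dataset has two further rows that are not included here.

```{r}
# Assumption: values copied from the six rows of the head(df) output
dpsu  <- c(40427, 43464, 45387, 47655, 46711, 55790)
total <- c(74054, 78820, 81120, 79071, 84643, 94845)

# Pearson correlation (the default method of cor())
r <- cor(dpsu, total)

# Same simple linear model that geom_smooth(method = "lm") fits
fit <- lm(total ~ dpsu)

r          # strength of the linear association
coef(fit)  # intercept and slope of the fitted line
```

Because Defence Public Sector Undertakings output is itself one of the components of Total Production, part of any positive correlation here is built in by construction.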
### Bar Chart: Production of Different Sectors across the Years
```{r}
# 4. Bar Chart: Production of Different Sectors across the Years
df_melted <- melt(df, id.vars = 'Year', measure.vars = c(
  'Defence.Public.Sector.Undertakings..in.Rs.Cr.',
  'New.Defence.Public.Sector.Undertakings..in.Rs.Cr.',
  'Other.Public.Sector.Undertakings.Joint.Ventures..in.Rs.Cr.',
  'Defence.Private.Companies..in.Rs.Cr.'
))
ggplot(df_melted, aes(x = Year, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Sector-Wise Production over the Years", x = "Year", y = "Production (Rs Cr)", fill = "Sector") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```
### Line Plot: Trend in Total Production over the Years
```{r}
# 5. Line Plot: Trend in Total Production over the Years
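ggplot(df, aes(x = Year, y = Total.Production..in.Rs.Cr.)) +
  # Year is a factor, so group = 1 is needed to connect the points;
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_line(aes(group = 1), color = "blue", linewidth = 1.2) +
  geom_point(color = "red", size = 3) +
  labs(title = "Trend of Total Production over the Years", x = "Year", y = "Total Production (Rs Cr)")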
```
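
The trend line can be complemented with year-over-year growth rates. The sketch below is an illustration under stated assumptions: it uses only the six totals shown by `head(df)`, and the names `years`, `total`, and `growth` are hypothetical.

```{r}
# Assumption: totals copied from the six rows of the head(df) output
years <- c("2016-17", "2017-18", "2018-19", "2019-20", "2020-21", "2021-22")
total <- c(74054, 78820, 81120, 79071, 84643, 94845)

# Growth of each year relative to the previous one, in percent
growth <- data.frame(
  Year    = years[-1],
  yoy_pct = round(100 * diff(total) / head(total, -1), 1)
)
growth
```

In these six rows, 2019-20 is the only year in which total production fell relative to the previous year.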