Dataset Description

About the Dataset

'data.frame':   549 obs. of  9 variables:
 $ Sl..No.          : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Depot            : chr  "SHN" "TRY" "CBE" "CB" ...
 $ Route.No.        : chr  "192UD" "127UD" "479UD" "131UD" ...
 $ From             : chr  "ALANKULAM" "ANNAVASAL" "ARANI" "ARANTHANGI" ...
 $ To               : chr  "CHENNAI" "CHENNAI" "COIMBATORE" "CHENNAI" ...
 $ Route.Length     : int  657 384 394 417 403 511 431 494 481 361 ...
 $ Type             : chr  "ULTRA" "ULTRA" "ULTRA" "ULTRA" ...
 $ No.of.Service    : int  1 1 1 1 1 2 1 1 1 5 ...
 $ Departure.Timings: chr  "17.45" "21.3" "21" "20.3" ...

Summary of the dataset

    Sl..No.       Depot            Route.No.             From          
 Min.   :  1   Length:549         Length:549         Length:549        
 1st Qu.:138   Class :character   Class :character   Class :character  
 Median :275   Mode  :character   Mode  :character   Mode  :character  
 Mean   :275                                                           
 3rd Qu.:412                                                           
 Max.   :549                                                           
      To             Route.Length       Type           No.of.Service   
 Length:549         Min.   : 89.0   Length:549         Min.   : 1.000  
 Class :character   1st Qu.:385.0   Class :character   1st Qu.: 1.000  
 Mode  :character   Median :467.0   Mode  :character   Median : 1.000  
                    Mean   :487.7                      Mean   : 2.614  
                    3rd Qu.:582.0                      3rd Qu.: 2.000  
                    Max.   :770.0                      Max.   :39.000  
 Departure.Timings 
 Length:549        
 Class :character  
 Mode  :character  
                   
                   
                   

Head of Dataset

  Sl..No. Depot Route.No.         From         To Route.Length  Type
1       1   SHN     192UD    ALANKULAM    CHENNAI          657 ULTRA
2       2   TRY     127UD    ANNAVASAL    CHENNAI          384 ULTRA
3       3   CBE     479UD        ARANI COIMBATORE          394 ULTRA
4       4    CB     131UD   ARANTHANGI    CHENNAI          417 ULTRA
5       5  KKDI     126AU     ARIMALAM    CHENNAI          403 ULTRA
6       6   TCN     170AU ARUPPUKOTTAI    CHENNAI          511 ULTRA
  No.of.Service Departure.Timings
1             1             17.45
2             1              21.3
3             1                21
4             1              20.3
5             1             19.15
6             2       19.00,20.00

---
title: "EDA for SETC Dataset"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: scroll
    theme: flatly
    social: menu
    source_code: embed
    navbar:
      - { title: "Dataset Description", href: "#dataset-description" }
      - { title: "Univariate Analysis", href: "#univariate-analysis" }
      - { title: "Bivariate Analysis", href: "#bivariate-analysis" }
      - { title: "Multivariate Analysis", href: "#multivariate-analysis" }
---
```{r setup, include=FALSE}
library(flexdashboard)
library(ggplot2)
library(dplyr)

# Load your dataset
bus <- read.csv("SETC.csv")
```

## Dataset Description {#dataset-description .tabset .active}

### About the Dataset

```{r}
str(bus)
```

### Summary of the dataset

```{r}
summary(bus)
```

### Head of Dataset

```{r}
head(bus)
```
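
The head of the dataset suggests that `Departure.Timings` holds one comma-separated time per service (row 6 has `No.of.Service` 2 and `"19.00,20.00"`). The chunk below is a minimal sketch of a consistency check for that pattern, run on the six rows shown above rather than on the full file. Whether every row of `SETC.csv` follows this convention is an assumption. The names `services`, `timings`, `n_times`, and `mismatch` are illustrative and not part of the original dashboard.

```{r}
# Values copied from the six rows printed by head(bus)
services <- c(1, 1, 1, 1, 1, 2)
timings  <- c("17.45", "21.3", "21", "20.3", "19.15", "19.00,20.00")

# Count the comma-separated departure times in each cell
n_times <- lengths(strsplit(timings, ",", fixed = TRUE))

# Positions where the number of listed times differs from No.of.Service
mismatch <- which(n_times != services)
length(mismatch)
```

To run the same check on the full data, replace the two vectors with `bus$No.of.Service` and `bus$Departure.Timings`.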

## Univariate Analysis {#univariate-analysis .tabset}

### Histogram for Distribution of Route Length

```{r}
# Histogram: Distribution of Route Length
ggplot(bus, aes(x = Route.Length)) +
  geom_histogram(binwidth = 50, fill = "skyblue", color = "black", alpha = 0.7) +
  labs(title = "Histogram of Route Length",
       x = "Route Length (km)", y = "Frequency") +
  theme_minimal()
```

### Histogram for Distribution of Number of Services

```{r}
# Histogram: Distribution of Number of Services
ggplot(bus, aes(x = No.of.Service)) +
  geom_histogram(binwidth = 1, fill = "purple", color = "black", alpha = 0.7) +
  labs(title = "Histogram of Number of Services",
       x = "Number of Services", y = "Frequency") +
  theme_minimal()
```

## Bivariate Analysis {#bivariate-analysis .tabset}

### Box Plot for Distribution of Route Length by Type of Service

```{r}
# Box Plot: Distribution of Route Length by Type of Service
ggplot(bus, aes(x = Type, y = Route.Length)) +
  geom_boxplot(fill = "lightgreen", color = "darkgreen") +
  labs(title = "Box Plot of Route Length by Type of Service",
       x = "Type of Service", y = "Route Length (km)") +
  theme_minimal()
```

### Box Plot for Number of Services by Type of Service

```{r}
# Box Plot: Number of Services by Type of Service
ggplot(bus, aes(x = Type, y = No.of.Service)) +
  geom_boxplot(fill = "orange", color = "darkorange") +
  labs(title = "Box Plot of Number of Services by Type of Service",
       x = "Type of Service", y = "Number of Services") +
  theme_minimal()
```

## Multivariate Analysis {#multivariate-analysis .tabset}

### Scatter Plot for Route Length vs No. of Services

```{r}
# Scatter Plot: Route Length vs No. of Services
ggplot(bus, aes(x = Route.Length, y = No.of.Service)) +
  geom_point(color = "blue", alpha = 0.6) +
  geom_smooth(method = "lm", formula = y ~ x, color = "red", linetype = "dashed") +  # Linear trend line with confidence band
  labs(title = "Scatter Plot of Route Length vs No. of Services",
       x = "Route Length (km)", y = "Number of Services") +
  theme_minimal()
```

### Scatter Plot for Departure Time vs Route Length

```{r}
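# Scatter Plot: Departure Time vs Route Length
# Departure.Timings is text ("17.45", "21.3", "19.00,20.00"), so plot the first
# listed time as decimal hours. Assumption: HH.MM format, where "21.3" is 21:30.
dep <- suppressWarnings(as.numeric(sub(",.*$", "", bus$Departure.Timings)))
bus$Departure.Hour <- floor(dep) + round((dep - floor(dep)) * 100) / 60
ggplot(bus, aes(x = Departure.Hour, y = Route.Length)) +
  geom_point(color = "green", alpha = 0.6, na.rm = TRUE) +
  labs(title = "Scatter Plot of Departure Time vs Route Length",
       x = "First Departure (hour of day)", y = "Route Length (km)") +
  theme_minimal()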
```
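
Some routes list several departures in a single cell, for example `"19.00,20.00"` in the head of the dataset. The chunk below is a minimal sketch of expanding such cells into one row per departure, so that every service can be plotted against route length. It uses only the six rows printed by `head(bus)`. It assumes the times are written as HH.MM and that a single digit after the dot is a dropped trailing zero (so `"21.3"` means 21:30). The names `sample_bus`, `to_decimal_hour`, and `long_bus` are hypothetical and not part of the original dashboard.

```{r}
# Sample rows copied from the head of the dataset
sample_bus <- data.frame(
  Route.No. = c("192UD", "127UD", "479UD", "131UD", "126AU", "170AU"),
  Route.Length = c(657, 384, 394, 417, 403, 511),
  Departure.Timings = c("17.45", "21.3", "21", "20.3", "19.15", "19.00,20.00"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "HH.MM" text -> decimal hours.
# Assumption: "21.3" is 21:30 (trailing zero dropped), not 21:03.
to_decimal_hour <- function(x) {
  v <- suppressWarnings(as.numeric(trimws(x)))  # unparseable text becomes NA
  floor(v) + round((v - floor(v)) * 100) / 60
}

# One row per departure: split on commas, then repeat each route as needed
times <- strsplit(sample_bus$Departure.Timings, ",", fixed = TRUE)
long_bus <- data.frame(
  Route.No. = rep(sample_bus$Route.No., lengths(times)),
  Route.Length = rep(sample_bus$Route.Length, lengths(times)),
  Departure.Hour = to_decimal_hour(unlist(times))
)
long_bus
```

With this layout, route 170AU contributes two rows (19:00 and 20:00), and `Departure.Hour` can be mapped to a continuous axis.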

### Scatter Plot for Route Length vs Departure Time (with color by Type)

```{r}
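# Scatter Plot: Route Length vs Departure Time (with color by Type)
# Departure.Timings is text ("17.45", "21.3", "19.00,20.00"), so plot the first
# listed time as decimal hours. Assumption: HH.MM format, where "21.3" is 21:30.
dep <- suppressWarnings(as.numeric(sub(",.*$", "", bus$Departure.Timings)))
bus$Departure.Hour <- floor(dep) + round((dep - floor(dep)) * 100) / 60
ggplot(bus, aes(x = Departure.Hour, y = Route.Length, color = Type)) +
  geom_point(alpha = 0.7, na.rm = TRUE) +
  labs(title = "Scatter Plot of Departure Time vs Route Length by Type",
       x = "First Departure (hour of day)", y = "Route Length (km)") +
  theme_minimal()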
```