PROGRAM 1

Author

Balaji

Develop an R program to quickly explore a given dataset, including categorical analysis with the dplyr group_by() function, and visualize the findings with ggplot2.

Step 1: Load necessary libraries

library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.1.3
-- Attaching packages --------------------------------------- tidyverse 1.3.2 --
v ggplot2 3.4.0     v purrr   1.0.1
v tibble  3.1.6     v dplyr   1.1.0
v tidyr   1.3.0     v stringr 1.5.0
v readr   2.1.1     v forcats 0.5.1
Warning: package 'ggplot2' was built under R version 4.1.3
Warning: package 'tidyr' was built under R version 4.1.3
Warning: package 'purrr' was built under R version 4.1.3
Warning: package 'dplyr' was built under R version 4.1.3
Warning: package 'stringr' was built under R version 4.1.3
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
library(dplyr)  # already attached by library(tidyverse) above

Step 2: Load the dataset

# Load dataset
data <- mtcars

# Convert 'cyl' to a factor for categorical analysis
data$cyl <- as.factor(data$cyl)
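Before grouping, a quick look at the loaded data can confirm its shape and the distribution of the categorical variable. The following is a minimal sketch, not part of the original program; the names cyl_counts and na_counts are introduced here for illustration only.

```r
library(dplyr)

# Same setup as in Step 2
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Structure: dimensions, column types, and the first few values
glimpse(data)

# Summary statistics for numeric columns, level counts for factors
summary(data)

# Number of cars in each cylinder category (illustrative name)
cyl_counts <- data %>% count(cyl)
print(cyl_counts)

# Missing values per column (illustrative name)
na_counts <- colSums(is.na(data))
print(na_counts)
```

Checking the group sizes first is useful because an average computed over very few rows is less reliable than one computed over many.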

Step 3: Group by categorical variables

# Summarize average mpg by cylinder category
summary_data <- data %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg), .groups = 'drop')

# Display summary
print(summary_data)
# A tibble: 3 x 2
  cyl   avg_mpg
  <fct>   <dbl>
1 4        26.7
2 6        19.7
3 8        15.1
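The same group_by() and summarise() pattern can return several statistics per group in one pass. The sketch below is an optional extension rather than part of the original program; the name detailed_summary and the extra columns (n, median_mpg, sd_mpg) are assumptions chosen for illustration.

```r
library(dplyr)

# Same setup as in Step 2
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Several summary statistics per cylinder category
detailed_summary <- data %>%
  group_by(cyl) %>%
  summarise(
    n          = n(),          # number of cars in the group
    avg_mpg    = mean(mpg),    # same average as in summary_data
    median_mpg = median(mpg),  # less sensitive to outliers than the mean
    sd_mpg     = sd(mpg),      # spread of mpg within the group
    .groups    = "drop"
  )

print(detailed_summary)
```

Reporting the group size and spread next to the mean helps a reader judge how much weight each average can bear.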

Step 4: Visualizing the findings

# Create a bar plot using ggplot2
ggplot(summary_data, aes(x = cyl, y = avg_mpg, fill = cyl)) +
  geom_bar(stat = "identity") +
  labs(title = "Average MPG by Cylinder Count",
       x = "Number of Cylinders",
       y = "Average MPG") +
  theme_minimal()
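A bar plot of averages hides the variation inside each group. As a complementary view, the sketch below draws a box plot from the unsummarized data. It is an illustrative addition, not part of the original program, and the name mpg_boxplot is hypothetical.

```r
library(ggplot2)

# Same setup as in Step 2
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Box plot of mpg for each cylinder category, built from the raw rows
mpg_boxplot <- ggplot(data, aes(x = cyl, y = mpg, fill = cyl)) +
  geom_boxplot() +
  labs(title = "MPG Distribution by Cylinder Count",
       x = "Number of Cylinders",
       y = "MPG") +
  theme_minimal()

print(mpg_boxplot)
```

In the bar plot above, geom_col() could stand in for geom_bar(stat = "identity"), since both draw bar heights directly from the y values.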