PROGRAM 1
Develop an R program to quickly explore a given dataset, including categorical analysis using the group_by command, and visualize the findings using ggplot2 features.
Step 1: Load necessary libraries
library(tidyverse)
library(dplyr)  # already attached by tidyverse; loading it again is harmless
Step 2: Load the dataset
# Load dataset
data <- mtcars
# Convert 'cyl' to a factor for categorical analysis
data$cyl <- as.factor(data$cyl)
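Before grouping, it can help to take a quick look at the shape and contents of the data. The following is a minimal sketch of such a first pass, assuming the same mtcars data as above; the name cyl_counts is introduced here purely for illustration.

```r
library(dplyr)

# Same setup as in Step 2
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Number of rows and columns
dim(data)

# Column types and the first few values of each column
glimpse(data)

# Basic descriptive statistics for every column
summary(data)

# Number of missing values per column
colSums(is.na(data))

# How many cars fall into each cylinder category
cyl_counts <- data %>% count(cyl)
print(cyl_counts)
```

The frequency table is a useful check before computing group averages, since a group with very few rows gives a less reliable mean.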
Step 3: Group by categorical variables
# Summarize average mpg by cylinder category
summary_data <- data %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg), .groups = 'drop')
# Display summary
print(summary_data)
# A tibble: 3 x 2
  cyl   avg_mpg
  <fct>   <dbl>
1 4        26.7
2 6        19.7
3 8        15.1
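The same group_by pattern can return several statistics per category in one call. The sketch below is one possible extension, assuming the same mtcars setup; detailed_summary and its column names (n, median_mpg, sd_mpg) are illustrative choices, not part of the original program.

```r
library(dplyr)

# Same setup as in Step 2
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Several summary statistics per cylinder category
detailed_summary <- data %>%
  group_by(cyl) %>%
  summarise(
    n          = n(),          # number of cars in the group
    avg_mpg    = mean(mpg),    # mean miles per gallon
    median_mpg = median(mpg),  # median miles per gallon
    sd_mpg     = sd(mpg),      # standard deviation of mpg
    .groups    = "drop"
  )

print(detailed_summary)
```

Reporting the group size and spread alongside the mean makes it easier to judge how far the averages can be trusted.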
Step 4: Visualizing the findings
# Create a bar plot using ggplot2
ggplot(summary_data, aes(x = cyl, y = avg_mpg, fill = cyl)) +
  geom_bar(stat = "identity") +
  labs(title = "Average MPG by Cylinder Count",
       x = "Number of Cylinders",
       y = "Average MPG") +
  theme_minimal()
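A bar plot of averages hides how much mpg varies within each category. As a complementary view, the sketch below draws a box plot from the raw data rather than from summary_data; it assumes the same mtcars setup, and the object name p is only illustrative.

```r
library(ggplot2)

# Same setup as in Step 2
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Box plot of mpg for each cylinder category, built from the raw rows
p <- ggplot(data, aes(x = cyl, y = mpg, fill = cyl)) +
  geom_boxplot() +
  labs(title = "MPG Distribution by Cylinder Count",
       x = "Number of Cylinders",
       y = "MPG") +
  theme_minimal()

print(p)
```

Storing the plot in a variable before printing makes it possible to reuse or save it later, for example with ggsave().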