Program 1

Author

1NT23IS027-SECTION A -ANKITHA

1. Develop an R program to quickly explore a given dataset, including categorical analysis using the group_by() function, and visualize the findings using ggplot2 features.

Step 1: Load the required libraries

library(ggplot2)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ lubridate 1.9.4     ✔ tibble    3.2.1
✔ purrr     1.0.4     ✔ tidyr     1.3.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Step 2: Load the dataset

We will use the built-in mtcars dataset.

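# Work on a copy so the built-in mtcars dataset stays unchanged
temp <- mtcars
# Optional checks while exploring (uncomment to run):
# class(temp)       # type of the whole object
# class(temp$cyl)   # type of the cyl column before conversion
# mtcars[3]         # third column (disp) as a data frame
# Treat the cylinder count as a categorical variable
temp$cyl <- as.factor(temp$cyl)
str(temp)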
'data.frame':   32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
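
Beyond str(), a few one-line checks can speed up a first look at the data. The sketch below rebuilds the same temp copy so it runs on its own; the choice of count(), summary() and the missing-value check is illustrative and not part of the original exercise.

```r
library(dplyr)

# Rebuild the working copy so this block is self-contained
temp <- mtcars
temp$cyl <- as.factor(temp$cyl)

# How many cars fall into each cylinder category?
cyl_counts <- temp %>% count(cyl)
print(cyl_counts)

# Minimum, quartiles, mean and maximum of fuel economy
print(summary(temp$mpg))

# Total number of missing values across all columns
n_missing <- sum(is.na(temp))
print(n_missing)
```

Checking group sizes early is useful because an average computed over very few rows is less reliable than one computed over many.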

Step 3: Group by a categorical variable

We compute the average mpg (miles per gallon) for each cylinder count.

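library(dplyr)  # already attached by tidyverse; listed here for clarity
# Average mpg within each cylinder group
summary_data <- temp %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg), .groups = "drop")
# Display the summary
print(summary_data)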
# A tibble: 3 × 2
  cyl   avg_mpg
  <fct>   <dbl>
1 4        26.7
2 6        19.7
3 8        15.1
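
The same group_by() and summarise() pattern can return several statistics at once. The following is a minimal sketch, assuming the temp data frame from Step 2; the column names n_cars, median_mpg and sd_mpg are chosen here for illustration only.

```r
library(dplyr)

# Rebuild the working copy so this block is self-contained
temp <- mtcars
temp$cyl <- as.factor(temp$cyl)

# One row per cylinder group, with several summary columns
detailed_summary <- temp %>%
  group_by(cyl) %>%
  summarise(
    n_cars     = n(),          # group size
    avg_mpg    = mean(mpg),    # same average as in Step 3
    median_mpg = median(mpg),  # less sensitive to outliers
    sd_mpg     = sd(mpg),      # spread within the group
    .groups    = "drop"
  )
print(detailed_summary)
```

Reporting the group size and spread next to the mean helps judge how much weight each average deserves.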

Step 4: Visualize the findings

# Create a bar plot using ggplot2
# geom_col() draws bars whose height is the y value (same as geom_bar(stat = "identity"))
ggplot(summary_data, aes(x = cyl, y = avg_mpg, fill = cyl)) +
  geom_col() +
  labs(title = "Average MPG by cylinder count",
       x = "Number of cylinders",
       y = "Average MPG") +
  theme_minimal()
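
A bar chart of averages hides the spread inside each group. As a possible complement, the sketch below draws a box plot from the raw rows rather than from summary_data; it assumes the same temp data frame and is an illustrative addition, not part of the original exercise.

```r
library(ggplot2)

# Rebuild the working copy so this block is self-contained
temp <- mtcars
temp$cyl <- as.factor(temp$cyl)

# Box plot of mpg for each cylinder group, built from the raw data
p <- ggplot(temp, aes(x = cyl, y = mpg, fill = cyl)) +
  geom_boxplot() +
  labs(title = "MPG distribution by cylinder count",
       x = "Number of cylinders",
       y = "MPG") +
  theme_minimal()
print(p)
```

Each box shows the median and quartiles of mpg for one cylinder group, so overlapping or widely spread groups are visible at a glance.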