Program 1
Develop an R program to quickly explore a given dataset, including categorical analysis using the group_by command, and visualize the findings using ggplot2 features.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
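The plot that follows expects a table named summary_data with columns cyl and avg_mpg, but the step that builds it does not appear in this extract. A minimal sketch of that step, assuming the dataset is the built-in mtcars and that cyl is treated as a factor (both are assumptions, not confirmed by the source):

```r
library(dplyr)

# Assumption: the dataset being explored is the built-in mtcars.
# Quick look at the structure before summarising.
str(mtcars)

# Categorical analysis: one row per cylinder count with the mean mpg.
# cyl is converted to a factor so it can act as a discrete fill colour later.
summary_data <- mtcars %>%
  mutate(cyl = as.factor(cyl)) %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg), n_cars = n(), .groups = "drop")

print(summary_data)
```

The n_cars column is an extra added here for illustration; the bar plot only needs cyl and avg_mpg.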
library(ggplot2)
# Create a bar plot using ggplot2
ggplot(summary_data, aes(x = cyl, y = avg_mpg, fill = cyl)) +
  geom_bar(stat = "identity") +
  labs(title = "Average MPG by Cylinder Count",
       x = "Number of Cylinders",
       y = "Average MPG") +
  theme_minimal()
Program 2
Write an R program to create a scatter plot, incorporating categorical analysis through color-coded data points representing different groups, using ggplot2.
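No code for this program appears in this extract. One possible sketch, assuming the built-in mtcars dataset with weight on the x axis, mpg on the y axis, and cylinder count as the grouping variable (all of these choices are assumptions made for illustration):

```r
library(ggplot2)

# Assumption: mtcars stands in for the unspecified dataset.
# factor(cyl) makes the cylinder count a discrete group, so each
# group gets its own colour in the legend.
scatter_plot <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  labs(title = "MPG vs. Weight by Cylinder Count",
       x = "Weight (1000 lbs)",
       y = "Miles per Gallon",
       color = "Cylinders") +
  theme_minimal()

scatter_plot
```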
Program 3
Implement an R function to generate a line graph depicting the trend of a time-series dataset, with separate lines for each group, utilizing ggplot2's group aesthetic.
library(ggplot2)
library(dplyr)
library(tidyr)
# Convert time-series data to a data frame
class(AirPassengers)
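The conversion step itself is missing from this extract. A minimal sketch, assuming the data frame is called data and has the columns Date, Passengers, and Year that the later function call refers to (the exact construction is an assumption):

```r
# AirPassengers is a monthly ts object starting in January 1949.
# Assumption: one row per month, with the year stored as a factor
# so that each year can be drawn as its own line.
dates <- seq(as.Date("1949-01-01"), by = "month",
             length.out = length(AirPassengers))

data <- data.frame(
  Date = dates,
  Passengers = as.numeric(AirPassengers),
  Year = factor(format(dates, "%Y"))
)

head(data)
```

Deriving Year from the Date column rather than from time(AirPassengers) avoids relying on floating-point year fractions.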
plot_time_series <- function(data, x_col, y_col, group_col,
                             title = "Air Passengers Trends") {
  # Column names arrive as strings, so they are looked up with the .data
  # pronoun (aes_string() was deprecated in ggplot2 3.0.0)
  ggplot(data, aes(x = .data[[x_col]], y = .data[[y_col]],
                   color = .data[[group_col]], group = .data[[group_col]])) +
    geom_line(linewidth = 1.2) +  # linewidth replaces size for lines (ggplot2 3.4.0)
    geom_point(size = 2) +
    labs(title = title, x = "Year", y = "Number of Passengers", color = "Year") +
    theme_minimal() +
    theme(legend.position = "top")
}
# Call the function
plot_time_series(data, "Date", "Passengers", "Year", "Trend of Airline Passengers Over Time")
Program 4
Develop a script in R to produce a bar graph displaying the frequency distribution of categorical data in a given dataset, grouped by a specific variable, using ggplot2.
# Assumption: data is the built-in mtcars dataset (the loading step is missing from this extract)
data <- mtcars
# Convert numerical columns to categorical (factor)
data$cyl <- as.factor(data$cyl)
data$gear <- as.factor(data$gear)
# Create a bar graph
ggplot(data, aes(x = cyl, fill = gear)) +
  geom_bar(position = "dodge") +  # grouped bar chart
  labs(title = "Frequency of Cylinders Grouped by Gear Type",
       x = "Number of Cylinders",
       y = "Count",
       fill = "Gears") +  # legend title
  theme_minimal()
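The bar heights can be cross-checked against a plain frequency table. A small sketch, assuming the dataset is the built-in mtcars (the source does not show where the data comes from):

```r
library(dplyr)

# Assumption: mtcars is the dataset behind the bar graph.
# count() groups by both variables and tallies the rows in each combination;
# combinations with no cars (for example 8 cylinders with 4 gears) are omitted.
freq_table <- mtcars %>%
  mutate(cyl = as.factor(cyl), gear = as.factor(gear)) %>%
  count(cyl, gear, name = "count")

print(freq_table)
```

Each row of freq_table should correspond to one bar in the dodged chart.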
Program 5
Implement an R program to create a histogram illustrating the distribution of a continuous variable, with overlays of density curves for each group, using ggplot2.
library(ggplot2)
# Use the built-in 'iris' dataset
# 'Petal.Length' is a continuous variable
# 'Species' is a categorical variable
str(iris)  # shows the structure of the dataset
Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
ℹ Please use `after_stat(density)` instead.
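The chunk that creates the histogram object p is missing from this extract; the warning above indicates it mapped y to ..density... A minimal sketch of that step, written with after_stat(density) as the warning recommends (the bin count, transparency, and position settings are assumptions):

```r
library(ggplot2)

# Histogram scaled to density so that density curves can be overlaid on it.
# Assumptions: 30 bins, semi-transparent bars, overlapping (not stacked) groups.
p <- ggplot(iris, aes(x = Petal.Length, fill = Species)) +
  geom_histogram(aes(y = after_stat(density)),
                 bins = 30, alpha = 0.5, position = "identity")
p
```

Using position = "identity" keeps each species' bars on a shared baseline, which makes them comparable with per-group density curves.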
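# Overlay one density curve per species (linewidth replaces size for lines)
p <- p + geom_density(aes(color = Species), linewidth = 1.2)
p
# Add the title and axis labels
p <- p +
  labs(title = "Distribution of petal length with group-wise density curves",
       x = "Petal length",
       y = "Density",
       fill = "Species") +
  theme_minimal()
p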
Program 6
Write an R script to construct a box plot showcasing the distribution of a continuous variable, grouped by a categorical variable, using ggplot2's fill aesthetic.
# Load the ggplot2 library
library(ggplot2)
str(iris)
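The step that creates the box plot object p is not present in this extract. A minimal sketch, assuming Petal.Width is plotted against Species (as the labels in the next step suggest) and that fill is mapped to Species:

```r
library(ggplot2)

# One box per species; fill colours each box by its group.
p <- ggplot(iris, aes(x = Species, y = Petal.Width, fill = Species)) +
  geom_boxplot()
p
```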
p <- p +
  labs(title = "Box plot of petal width by species",
       x = "Species",
       y = "Petal Width") +
  theme_minimal()
p
Program 7
Develop a function in R to plot a function curve based on a mathematical equation provided as input, with different curve styles for each group, using ggplot2.
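No code for this program appears in this extract. One way it could be approached is sketched here, assuming the equations are passed as a named list of R functions; the function name plot_function_curves and its arguments are hypothetical and chosen only for illustration:

```r
library(ggplot2)

# Hypothetical helper: evaluates each supplied function over x_range and
# draws one curve per function, distinguished by colour and line type.
plot_function_curves <- function(funs, x_range = c(-5, 5), n = 200) {
  x <- seq(x_range[1], x_range[2], length.out = n)

  # Build one long data frame: a block of n rows for each named function.
  curve_data <- do.call(rbind, lapply(names(funs), function(nm) {
    data.frame(x = x, y = funs[[nm]](x), group = nm)
  }))
  curve_data$group <- factor(curve_data$group, levels = names(funs))

  ggplot(curve_data, aes(x = x, y = y, color = group, linetype = group)) +
    geom_line(linewidth = 1) +
    labs(title = "Function Curves", x = "x", y = "f(x)",
         color = "Function", linetype = "Function") +
    theme_minimal()
}

# Example call with three illustrative equations.
curve_plot <- plot_function_curves(list(
  "sin(x)"   = sin,
  "cos(x)"   = cos,
  "x^2 / 10" = function(x) x^2 / 10
))
curve_plot
```

Mapping both color and linetype to the same variable gives each curve a distinct style while keeping a single merged legend.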