Read in the iris dataset as a dataframe. Using ggplot2,
First, we load the iris dataset as a data frame:
data("iris")
Next, we load the required packages:
library(ggplot2)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
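The messages above report that some function names now resolve to a different package than before. As a minimal sketch of how to stay unambiguous, a masked function can be called with an explicit namespace prefix. The example below uses only `stats::filter()`, which is one of the functions named in the messages; the object name `ma` is ours and purely illustrative.

```r
# Call the masked base function explicitly through its namespace.
# stats::filter() applies a linear filter; with three equal weights and the
# default sides = 2 it computes a centred 3-point moving average.
ma <- stats::filter(1:5, rep(1/3, 3))
print(ma)

# The same pattern applies to the other masked names, for example
# ggplot2::last_plot() or graphics::layout().
```

Prefixing the call this way works regardless of the order in which the packages were attached.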
Now we create the charts.
a <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species))
a1 <- a + geom_point(size = 1) +
  geom_smooth(method = "lm", col = "firebrick", se = FALSE) + # regression line in each panel
  facet_wrap(~Species, nrow = 2) + # one panel per species
  labs(title = "Sepal length vs sepal width",
       subtitle = "From iris dataset",
       y = "Sepal Width", x = "Sepal Length") # chart labels
# a1
ggplotly(a1)
## `geom_smooth()` using formula 'y ~ x'
## Warning: `group_by_()` is deprecated as of dplyr 0.7.0.
## Please use `group_by()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
b <- ggplot(iris, aes(x = Sepal.Length))
b1 <- b + geom_histogram(fill = "steelblue", bins = 10) + # 10 bins
  geom_freqpoly(col = "black", bins = 10) + # same binning as the histogram
  labs(title = "Histogram and Frequency Polygon of Sepal Length",
       subtitle = "From iris dataset",
       y = "Count", x = "Sepal Length") + # chart labels
  scale_x_continuous(breaks = seq(4.2, 8.2, 0.4)) +
  scale_y_continuous(breaks = seq(0, 30, 5)) +
  theme_classic()
# b1
ggplotly(b1)
# Named c0 rather than c so the object does not share a name with base::c()
c0 <- ggplot(data = iris, aes(Species, Petal.Length, col = Species))
c1 <- c0 + geom_boxplot() +
  labs(title = "Box Plot of Petal Length by Species",
       subtitle = "From iris dataset",
       y = "Petal Length", x = "Species") # chart labels
ggplotly(c1)
par(mfrow = c(1, 2)) # two base-graphics plots side by side
hist(iris$Petal.Length, xlab = "Petal Length",
     main = "Histogram of Petal Length", col = "steelblue")
hist(iris$Petal.Width, xlab = "Petal Width",
     main = "Histogram of Petal Width", col = "steelblue")
Comments on the distributions:
Petal Length and Petal Width have very similar distributions. Both are bimodal, with one concentration of observations at the lower end of the range and another at the higher end. In both cases, the lower cluster appears leptokurtic and positively skewed, while the upper cluster appears to have lower kurtosis and moderate positive skewness.
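The comments above are based on a visual reading of the histograms. As a rough check, the sketch below computes moment-based skewness and excess kurtosis for each cluster using base R only. It assumes that the lower cluster corresponds to setosa and the upper cluster to the other two species. The helper names `skewness`, `excess_kurtosis` and `shape_by_cluster` are ours and do not come from any package.

```r
# Moment-based shape statistics (population moments, no bias correction).
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
excess_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3 # 0 for a normal distribution
}

data("iris")

# Assumption: setosa forms the lower cluster, the other species the upper one.
cluster <- ifelse(iris$Species == "setosa", "lower", "upper")

# Returns a matrix with one column per cluster for the variable x.
shape_by_cluster <- function(x) {
  sapply(split(x, cluster), function(v) {
    c(n = length(v), skewness = skewness(v),
      excess_kurtosis = excess_kurtosis(v))
  })
}

length_shape <- shape_by_cluster(iris$Petal.Length)
width_shape <- shape_by_cluster(iris$Petal.Width)
print(round(length_shape, 3))
print(round(width_shape, 3))
```

Positive skewness values indicate a longer right tail, and a positive excess kurtosis indicates a leptokurtic shape, so the printed values can be compared directly with the description above.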
End of Assignment