## install.packages(c("packrat", "rsconnect"))

---
title: "Advanced Analysis of Iris Dataset"
author: "Y. Mohammed Iqbal"
date: "2024-07-11"
output: html_document
---

Introduction

This document demonstrates an advanced analysis combining text, code, and plots using the iris dataset.

Data Summary

First, we summarize the data:

# Load necessary libraries
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# Load the iris dataset
data(iris)

# Summary of the iris dataset
summary(iris)
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
##        Species  
##  setosa    :50  
##  versicolor:50  
##  virginica :50  
##                 
##                 
## 
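
The dplyr package is loaded above, but the report only calls base R's summary(). As a minimal sketch of how dplyr could break the same summary down by species, the block below groups iris by Species and computes the mean of two measurements. The object name species_means and the column names mean_sepal_length, mean_sepal_width, and n are illustrative choices, not part of the original report.

```r
library(dplyr)

# Group the built-in iris data by species, then compute per-group statistics.
# The output column names here are illustrative assumptions.
species_means <- iris %>%
  group_by(Species) %>%
  summarise(
    mean_sepal_length = mean(Sepal.Length),
    mean_sepal_width  = mean(Sepal.Width),
    n                 = n(),
    .groups = "drop"
  )

print(species_means)
```

The result has one row per species, which makes differences between species easier to see than in the pooled summary above.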
# Visualization
# Load the ggplot2 library
library(ggplot2)

# Scatter plot of Sepal.Length vs. Sepal.Width
ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  labs(title = "Scatter Plot of Sepal Length vs. Sepal Width",
       x = "Sepal Length",
       y = "Sepal Width")

# Box plot
# Boxplot of Sepal.Length by Species
ggplot(data = iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_boxplot() +
  labs(title = "Boxplot of Sepal Length by Species",
       x = "Species",
       y = "Sepal Length")
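
A boxplot draws the minimum, quartiles, and maximum of each group (with outliers shown separately), so the same comparison can be read off numerically. The following is a rough sketch using base R's quantile() with its default settings; the object name sepal_summary is an assumption introduced for illustration.

```r
# Split Sepal.Length by species and compute the quantiles a boxplot is built from.
sepal_summary <- t(sapply(
  split(iris$Sepal.Length, iris$Species),
  quantile,
  probs = c(0, 0.25, 0.5, 0.75, 1)
))

# Rows are species; columns are "0%", "25%", "50%", "75%", "100%".
print(sepal_summary)
```

The "50%" column holds the median for each species, which corresponds to the thick line inside each box.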

Explanation of the Code

  • YAML Header: The document metadata (title, author, date, output format) is specified in the YAML header.
  • Introduction: A brief statement of what the document covers.
  • Data Summary:
    • The dplyr package is loaded to make its data manipulation functions available; the startup messages list the stats and base functions it masks.
    • The iris dataset is loaded, and a summary of the dataset is provided.
  • Visualization:
    • The ggplot2 package is loaded to create plots.
    • A scatter plot of Sepal.Length vs. Sepal.Width is created, colored by Species.
  • Additional Analysis:
    • A boxplot of Sepal.Length by Species is created to compare the sepal length across different species.
  • Conclusion: A brief conclusion summarizing the document.

You can create this R Markdown file in RStudio, knit it to your desired output format (HTML, PDF, or Word), and share the resulting report, which is regenerated from the code each time you knit.
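
Knitting can also be done from the R console instead of the Knit button. The sketch below assumes the rmarkdown package and pandoc are installed (RStudio bundles pandoc); the file name iris_analysis.Rmd and the variables rmd_path and out_file are hypothetical. It writes a tiny stand-in document to a temporary directory so that it is self-contained.

```r
# Hypothetical file name; replace with the path to your own .Rmd file.
rmd_path <- file.path(tempdir(), "iris_analysis.Rmd")

# Build the chunk delimiter programmatically to keep this example readable.
fence <- strrep("`", 3)

# Write a minimal R Markdown document: YAML header plus one code chunk.
writeLines(c(
  "---",
  'title: "Advanced Analysis of Iris Dataset"',
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(iris)",
  fence
), rmd_path)

# Render only if rmarkdown and pandoc are available on this machine.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_path, output_format = "html_document", quiet = TRUE)
  print(out_file)  # path to the generated HTML file
}
```

Passing "word_document" or "pdf_document" as output_format should produce the other formats mentioned above; PDF output additionally requires a LaTeX installation.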