Objective: Introduce the “iris” dataset and explain the reason for its selection.
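I selected the “iris” dataset for analysis because it ships with base R (in the datasets package), so no download or installation is needed, and because its small size and mix of continuous and categorical variables make it well suited to demonstrating data analysis techniques.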
# Load the iris dataset
data(iris)
Objective: Describe the variables present in the “iris” dataset.
# Display the column names (one per variable)
names(iris)
## [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
Sepal.Length: Represents the length of the sepal (in centimeters). This is a continuous variable.
Sepal.Width: Represents the width of the sepal (in centimeters). This is a continuous variable.
Petal.Length: Represents the length of the petal (in centimeters). This is a continuous variable.
Petal.Width: Represents the width of the petal (in centimeters). This is a continuous variable.
Species: Represents the species of iris flower. This is a categorical variable with three levels: setosa, versicolor, and virginica.
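The variable types described above can be checked directly in R. The following is a minimal sketch using base R functions only; the object names `var_classes` and `species_levels` are introduced here for illustration and are not part of the original analysis.

```r
# Load the dataset (it ships with base R in the datasets package)
data(iris)

# Storage class of each column: the four measurements are expected to be
# numeric, and Species is expected to be a factor
var_classes <- sapply(iris, class)
print(var_classes)

# Levels of the categorical variable
species_levels <- levels(iris$Species)
print(species_levels)
```

If the descriptions above hold, the first call should report four numeric columns and one factor, and the second should list the three species names.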
Objective: Discuss the number of observations in the “iris” dataset.
# Check the number of rows/observations in the dataset
nrow(iris)
## [1] 150
Each row in the “iris” dataset represents an individual iris flower, so the dataset contains 150 observations in total.
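As a complement to the row count, it can be useful to see how the observations are spread across the species. The sketch below uses base R's `dim()` and `table()`; the names `iris_dim` and `species_counts` are illustrative assumptions, not part of the original report.

```r
# Load the dataset
data(iris)

# Number of rows and columns in one call
iris_dim <- dim(iris)
print(iris_dim)

# Number of observations per species
species_counts <- table(iris$Species)
print(species_counts)
```

In the standard copy of the dataset, the counts should be balanced, with 50 flowers for each of the three species.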
Objective: Analyze if there are any missing values in the dataset.
# Check for missing values
any(is.na(iris))
## [1] FALSE
There are no missing values in the “iris” dataset.
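The check above returns a single TRUE/FALSE for the whole data frame. For datasets where the answer is TRUE, a per-column count is usually more informative. A minimal sketch of that variant, assuming base R only (the name `na_per_column` is hypothetical):

```r
# Load the dataset
data(iris)

# Count missing values in each column separately;
# is.na() returns a logical matrix, and colSums() sums the TRUEs per column
na_per_column <- colSums(is.na(iris))
print(na_per_column)
```

For “iris”, every count should be zero, which agrees with the result of `any(is.na(iris))`.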
Objective: Summarize the key points of the report.
The “iris” dataset is small but informative: it contains 150 observations of five variables, four continuous measurements (in centimeters) and one categorical species label. It contains no missing values, making it ideal for demonstration purposes.