Objective: To equip students with the skills to perform more sophisticated statistical tests within their Exploratory Data Analysis (EDA) workflow using the ggstatsplot package, create interactive visualizations with plotly, and publish the resulting report on RPubs, using the Breast Cancer Wisconsin dataset.
Prerequisites: Students have a basic understanding of:
Exploratory Data Analysis (EDA) concepts.
Base R for data manipulation and visualization.
Initial exposure to ggplot2 for creating static plots.
Basic statistical test concepts.
Familiarity with R Markdown.
Students will perform an EDA on the Breast Cancer Wisconsin dataset, running statistical tests with ggstatsplot and creating interactive visualizations with plotly. The entire analysis will be documented in an R Markdown document and published on RPubs.
Steps:
library(ggplot2)
library(plotly)
# Load the iris dataset
data(iris)
# Create a ggplot2 scatter plot with tooltip information
p_iris <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species,
                           text = paste("Sepal Length: ", Sepal.Length, "<br>",
                                        "Sepal Width: ", Sepal.Width, "<br>",
                                        "Petal Length: ", Petal.Length, "<br>",
                                        "Petal Width: ", Petal.Width))) +
  geom_point() +
  labs(title = "Iris Flower Sepal Dimensions",
       x = "Sepal Length (cm)",
       y = "Sepal Width (cm)",
       color = "Species") +
  theme_minimal()
# Convert the ggplot2 object to an interactive plotly object, showing only the custom tooltip text
fig_iris <- ggplotly(p_iris, tooltip = "text") %>%
  config(displayModeBar = FALSE)  # hide the modebar
# Display the interactive plot
fig_iris
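The example above uses iris, while the assignment calls for the Breast Cancer Wisconsin dataset. What follows is a minimal sketch of a ggstatsplot comparison on that data, not a prescribed solution. It assumes the copy of the dataset shipped as BreastCancer in the mlbench package (the brief does not say which source to use) and picks clump thickness (Cl.thickness) against diagnosis (Class) purely for illustration. Because the scores are ordinal (1 to 10), the sketch requests a nonparametric test.

```r
library(ggstatsplot)
library(mlbench)  # assumption: source of the Breast Cancer Wisconsin data

# Load the dataset (699 observations; Class is benign or malignant)
data(BreastCancer, package = "mlbench")

# The cytology scores are stored as factors, so convert the one we need to numeric
bc <- data.frame(
  Class = BreastCancer$Class,
  Cl.thickness = as.numeric(as.character(BreastCancer$Cl.thickness))
)

# Compare clump thickness between diagnoses; the test result is
# printed in the plot subtitle
p_bc <- ggbetweenstats(
  data = bc,
  x = Class,
  y = Cl.thickness,
  type = "nonparametric",
  xlab = "Diagnosis",
  ylab = "Clump thickness (score 1-10)",
  title = "Clump thickness by diagnosis"
)

# Display the plot
p_bc
```

Other variables, or a different test type, can be swapped in the same way; the choice should be justified in the report.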
Assessment Criteria:
Use of ggstatsplot: Correct application of ggstatsplot for statistical tests on the Breast Cancer Wisconsin dataset.
Interpretation of Results: Accurate interpretation of statistical test results.
Creation of Interactive Plot: Successful creation of an interactive plotly plot from a ggplot2 object as part of the EDA, with tooltips and no modebar.
Reproducibility: Well-structured and reproducible R Markdown document.
Clarity and Communication: Clear explanations, appropriate visualizations, and logical flow.
RPubs Publication: Successful publication of the report on RPubs.
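For the "Interpretation of Results" criterion, it can help to rerun a test in base R and read its components directly before writing the interpretation. The sketch below is one way to do that, under the same assumptions as any work with this data: the BreastCancer copy from the mlbench package is used, and clump thickness versus diagnosis is an illustrative choice rather than a required comparison.

```r
library(mlbench)  # assumption: source of the Breast Cancer Wisconsin data

data(BreastCancer, package = "mlbench")

# Convert the factor-coded score to numeric and keep the diagnosis labels
thickness <- as.numeric(as.character(BreastCancer$Cl.thickness))
diagnosis <- BreastCancer$Class

# Descriptive summary: median clump thickness per diagnosis
group_medians <- tapply(thickness, diagnosis, median)
print(group_medians)

# Wilcoxon rank-sum (Mann-Whitney) test for a difference between the groups
wt <- wilcox.test(thickness ~ diagnosis)
print(wt)

# The p-value can be pulled out for reporting in the text
wt$p.value
```

An interpretation should state the direction of the difference (from the medians), the test used, and whether the p-value supports a difference at the chosen significance level.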