Scenario: You are a data analyst for the Star Wars Archives. Your task is to analyze the Star Wars dataset to understand physical traits of characters and present findings clearly to non-technical stakeholders.
Deliverables:
Setup:
- Ensure you can access the starwars dataset (available via dplyr).
- Load any packages you need (e.g., for data wrangling and for skewness/kurtosis).
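A minimal setup sketch, assuming dplyr is already installed. The check for a skewness/kurtosis package uses moments only as an example; the assignment does not name a specific package, so any package you prefer would do.

```r
# Load dplyr, which provides the wrangling verbs and the starwars dataset.
library(dplyr)

# Confirm the dataset is reachable and note its dimensions.
sw_dims <- dim(starwars)
sw_dims

# Check whether a skewness/kurtosis package is available.
# "moments" is an assumed example, not a requirement of the assignment.
has_moments <- requireNamespace("moments", quietly = TRUE)
has_moments
```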
Objective: Build a quick “data dictionary” of the dataset.
Tasks:
# TODO: Write code to complete the tasks above.
# Add brief inline comments to explain your findings.
Short Write-up (2–3 sentences): Summarize what this dataset contains and any immediate data quality concerns you notice.
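One possible shape for a quick data dictionary is a small table with one row per column of starwars. The sketch below is an illustration only; the object name data_dictionary and the choice of fields (type and missing count) are assumptions, not part of the assignment.

```r
library(dplyr)

# Compact overview of column names, types, and the first few values.
glimpse(starwars)

# One row per variable: its name, storage type, and number of missing values.
# data_dictionary is a hypothetical name chosen for this sketch.
data_dictionary <- tibble(
  variable  = names(starwars),
  type      = vapply(starwars, function(col) class(col)[1],
                     character(1), USE.NAMES = FALSE),
  n_missing = vapply(starwars, function(col) sum(is.na(col)),
                     integer(1), USE.NAMES = FALSE)
)

# Sort so the columns with the most missing values appear first.
data_dictionary %>% arrange(desc(n_missing))
```

Sorting by n_missing puts likely data quality concerns at the top, which can feed directly into the write-up.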
Objective: Create a clean analysis-ready subset for height and mass.
Tasks:
# TODO: Implement factor inspection and missing-value checks.
# TODO: Create your cleaned dataset (e.g., clean_data).
Short Write-up (1–2 sentences): Explain why you chose to remove the rows you did and how this affects analysis reliability.
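The cleaning step could be sketched as follows. It assumes that "clean" means dropping rows where either height or mass is missing, and it uses gender purely as an example column for the factor inspection; other choices are equally valid.

```r
library(dplyr)

# Categorical columns are stored as character, so convert one to a factor
# to inspect its levels (gender is an arbitrary example here).
gender_factor <- factor(starwars$gender)
levels(gender_factor)
table(gender_factor, useNA = "ifany")

# Count missing values in the two variables of interest.
missing_counts <- starwars %>%
  summarise(
    height_missing = sum(is.na(height)),
    mass_missing   = sum(is.na(mass))
  )
missing_counts

# Keep only rows where both height and mass are present.
clean_data <- starwars %>%
  filter(!is.na(height), !is.na(mass))

# Compare row counts to see how many characters were dropped.
c(original = nrow(starwars), cleaned = nrow(clean_data))
```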
Objective: Report central tendency and spread for height and mass.
Tasks (using your cleaned dataset):
# TODO: Calculate mean, median, and range for mass and height.
# TODO: Store results in clearly named objects or a small summary table.
Short Write-up (2–3 sentences): Compare mean vs. median for each variable and comment on potential outliers or skew based on these numbers alone.
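As one way to organize these statistics, the sketch below collects them in a single one-row summary table. The name summary_table and the column naming pattern are assumptions for illustration; clean_data is rebuilt here only so the example runs on its own.

```r
library(dplyr)

# Rebuild the cleaned subset so this example is self-contained.
clean_data <- starwars %>%
  filter(!is.na(height), !is.na(mass))

# Mean, median, and range (min and max) for height and mass in one table.
# Columns are named like height_mean, mass_median, and so on.
summary_table <- clean_data %>%
  summarise(
    across(
      c(height, mass),
      list(mean = mean, median = median, min = min, max = max),
      .names = "{.col}_{.fn}"
    )
  )
summary_table

# A mean well above the median hints at right skew or high outliers.
summary_table$mass_mean - summary_table$mass_median
```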
Objective: Quantify distribution shape for mass.
Tasks:
# TODO: Compute skewness and kurtosis for mass.
# TODO: Add a short comment about what the values imply.
Short Write-up (2–3 sentences): Explain what the skewness and kurtosis imply about how representative the mean is and whether outliers are likely.
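The sketch below computes both measures from their moment-based definitions in base R, so it runs without an extra package. This is an assumed approach, not the one the assignment prescribes; packaged functions may use slightly different definitions (some report excess kurtosis, i.e. kurtosis minus 3), so check the documentation of whichever package you load.

```r
library(dplyr)

# Rebuild the cleaned subset so this example is self-contained.
clean_data <- starwars %>%
  filter(!is.na(height), !is.na(mass))

mass <- clean_data$mass
n <- length(mass)
centered <- mass - mean(mass)

# Central moments using the population (divide-by-n) convention.
m2 <- sum(centered^2) / n
m3 <- sum(centered^3) / n
m4 <- sum(centered^4) / n

# Skewness: > 0 means a longer right tail, < 0 a longer left tail.
mass_skewness <- m3 / m2^(3 / 2)

# Kurtosis (not excess): a normal distribution has a value of 3,
# so values well above 3 indicate heavy tails and likely outliers.
mass_kurtosis <- m4 / m2^2

c(skewness = mass_skewness, kurtosis = mass_kurtosis)
```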
Objective: Create plots that reveal distribution, outliers, and relationships.
Tasks:
# TODO: Produce a histogram of mass.
# TODO: Produce a boxplot of mass.
# TODO: Produce a scatterplot of height vs. mass.
# Add concise captions/comments beneath each plot.
Short Write-up (3–4 sentences): Which plot was most helpful to identify outliers? What patterns (if any) do you observe between height and mass?
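The three plots could be drafted as below. Base R graphics are used here as an assumption to keep the example dependency-free; ggplot2 would work equally well. The captions are placeholders to replace with your own observations, and the last two lines show one way to name the points a boxplot flags.

```r
library(dplyr)

# Rebuild the cleaned subset so this example is self-contained.
clean_data <- starwars %>%
  filter(!is.na(height), !is.na(mass))

# Histogram of mass.
hist(clean_data$mass, breaks = 30,
     main = "Distribution of mass", xlab = "Mass (kg)")
# Caption: note where most characters fall and whether any bars sit far away.

# Boxplot of mass.
boxplot(clean_data$mass, horizontal = TRUE,
        main = "Boxplot of mass", xlab = "Mass (kg)")
# Caption: points beyond the whiskers are candidate outliers.

# Scatterplot of height vs. mass.
plot(clean_data$height, clean_data$mass,
     main = "Height vs. mass", xlab = "Height (cm)", ylab = "Mass (kg)")
# Caption: describe the overall trend and any points far from it.

# Values the boxplot rule (1.5 x IQR beyond the hinges) flags as outliers.
mass_outliers <- boxplot.stats(clean_data$mass)$out
mass_outliers

# Name of the heaviest character, useful when explaining an extreme point.
heaviest <- clean_data$name[which.max(clean_data$mass)]
heaviest
```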
Objective: Communicate findings clearly to non-technical stakeholders.
Include the following:
# TODO: Assemble your final summary objects/plots here so they're visible together.