In this lab, you will practice creating APA-compliant graphs using ggplot2. Please complete each exercise by filling in the code chunks and interpreting the resulting graphs. Once you have completed the exercises, knit this document to HTML and publish it to RPubs. Make sure your YAML header includes a title, your name, and the date.
Run the chunk below to create the apa_theme object, which is added to each plot with + apa_theme.
# Load ggplot2
library(ggplot2)
apa_theme <- theme_classic() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    legend.position = "bottom",
    text = element_text(family = "serif", size = 12),
    axis.title = element_text(size = 12),
    plot.title = element_text(size = 14, hjust = 0.5)
  )

Dataset: Use the built-in iris dataset to compare the mean sepal length across different species.
Tasks:
1. Create a bar graph that shows the average sepal length for each
species in the iris dataset.
2. Add error bars representing the standard error of the mean.
3. Apply APA formatting, including a descriptive title, axis labels, and
removing unnecessary grid lines.
4. Interpretation: Describe which species has the
longest average sepal length and which has the shortest. Discuss the
reliability of these comparisons based on the error bars.
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
library(dplyr)
# Calculate the mean and standard error of sepal length for each species
iris_summary <- iris %>%
  group_by(Species) %>%
  summarise(mean_sepal_length = mean(Sepal.Length),
            se_sepal_length = sd(Sepal.Length) / sqrt(n()))
# Create the bar graph with error bars (mean +/- 1 SE)
ggplot(iris_summary, aes(x = Species, y = mean_sepal_length)) +
  geom_col(fill = "skyblue", color = "black") +
  geom_errorbar(aes(ymin = mean_sepal_length - se_sepal_length,
                    ymax = mean_sepal_length + se_sepal_length),
                width = 0.2) +
  labs(title = "Mean Sepal Length by Species",
       x = "Species", y = "Mean Sepal Length (cm)") +
  apa_theme

Interpretation:
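The species virginica has the longest average sepal length and setosa has
the shortest, with versicolor in between. The standard-error bars are small
relative to the differences between the group means and do not overlap,
which suggests the means are estimated precisely and the ordering of the
species is unlikely to be an artifact of sampling error.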
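The visual judgment about the error bars can also be checked numerically. The sketch below is not part of the required lab code; it uses base R only, and the names se_bars and bars_separated are hypothetical. It computes the ends of each mean ± 1 SE bar and tests whether adjacent bars are separated. Note that non-overlapping standard-error bars are an informal guide, not a formal significance test.

```r
# Base R sketch: mean and standard error of Sepal.Length per species
means <- tapply(iris$Sepal.Length, iris$Species, mean)
ses   <- tapply(iris$Sepal.Length, iris$Species,
                function(x) sd(x) / sqrt(length(x)))

# Lower and upper ends of the mean +/- 1 SE error bars
se_bars <- data.frame(
  species = names(means),
  mean    = as.numeric(means),
  lower   = as.numeric(means - ses),
  upper   = as.numeric(means + ses)
)
se_bars <- se_bars[order(se_bars$mean), ]  # sort from shortest to longest mean

# TRUE when each bar ends below the start of the next-longest species' bar
bars_separated <- all(head(se_bars$upper, -1) < tail(se_bars$lower, -1))

print(se_bars)
print(bars_separated)
```

If bars_separated is TRUE, the error bars in the plot do not overlap, which supports treating the ordering of the species means as stable.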
Dataset: Use the built-in mtcars
dataset to create a scatter plot of hp (horsepower) versus
qsec (quarter-mile time).
Tasks:
1. Create a scatter plot showing the relationship between horsepower and
quarter-mile time.
2. Add a line that summarizes the overall trend in the data (a line that
shows the general direction the data points are heading).
3. Apply APA formatting to ensure the plot is clear and
professional.
4. Interpretation: Explain whether cars with higher
horsepower tend to have faster or slower quarter-mile times based on the
direction of the trend line.
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
# Create the APA-compliant scatter plot of horsepower vs. quarter-mile time
ggplot(mtcars, aes(x = hp, y = qsec)) +
  geom_point() +
  # Linear trend line without the confidence band
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Horsepower vs. Quarter-Mile Time",
       x = "Horsepower", y = "Quarter-Mile Time (seconds)") +
  apa_theme
## `geom_smooth()` using formula = 'y ~ x'
Interpretation:
The trend line slopes downward, indicating a negative relationship between
horsepower and quarter-mile time: cars with higher horsepower tend to have
shorter times, that is, they complete the quarter mile more quickly.
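The direction of the trend line can be confirmed without reading it off the plot. The following is a minimal sketch, assuming the straight-line fit that geom_smooth(method = "lm") draws is the model of interest; the names fit, hp_slope, and hp_qsec_r are hypothetical and not part of the lab template.

```r
# Fit the same straight line that geom_smooth(method = "lm") draws
fit <- lm(qsec ~ hp, data = mtcars)

# Slope: change in quarter-mile time (seconds) per additional horsepower
hp_slope <- unname(coef(fit)["hp"])

# Pearson correlation between horsepower and quarter-mile time
hp_qsec_r <- cor(mtcars$hp, mtcars$qsec)

print(hp_slope)
print(hp_qsec_r)
```

A negative slope and a negative correlation both indicate that quarter-mile time falls as horsepower rises.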
Dataset: Use the built-in airquality
dataset to create a line graph showing the trend of average temperature
(Temp) over the months (Month).
Tasks:
1. Calculate the average temperature for each month in the
airquality dataset.
2. Create a line graph that shows the trend of average temperature over
the months.
3. Apply APA formatting, ensuring the title is centered, axis labels are
clear, and the graph is free of unnecessary grid lines.
4. Interpretation: Describe how temperature changes
over the months. Identify which month has the highest average
temperature and discuss any patterns you observe.
##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3
## 4    18     313 11.5   62     5   4
## 5    NA      NA 14.3   56     5   5
## 6    28      NA 14.9   66     5   6
# Create the APA-compliant line graph
# Calculate average temperature for each month
airquality_summary <- airquality %>%
  group_by(Month) %>%
  summarise(avg_temp = mean(Temp, na.rm = TRUE))
# Create the line graph of average temperature over months
ggplot(airquality_summary, aes(x = Month, y = avg_temp)) +
  geom_line(color = "blue") +
  # Label the numeric months 5 to 9 with their abbreviations (May to Sep)
  scale_x_continuous(breaks = 5:9, labels = month.abb[5:9]) +
  labs(title = "Average Temperature Over the Months",
       x = "Month", y = "Average Temperature (°F)") +
  apa_theme

Interpretation:
The line graph shows a clear upward trend in average temperature from May through July. July and August are nearly tied at about 84°F, with August marginally the highest, and the average then declines in September.
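Because neighboring months can be close on the graph, it helps to read the peak month from the numbers rather than from the line alone. The sketch below uses base R only and is an optional check, not part of the required lab code; monthly_means and warmest_month are hypothetical names.

```r
# Average temperature for each month (months are coded 5 = May ... 9 = September)
monthly_means <- tapply(airquality$Temp, airquality$Month, mean, na.rm = TRUE)

# Month number with the highest average temperature
warmest_month <- as.integer(names(monthly_means)[which.max(monthly_means)])

print(round(monthly_means, 2))
print(month.name[warmest_month])
```

Comparing the printed means side by side shows how large the gap between the two warmest months actually is.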
Submission Instructions:
Knit your document to HTML and check that all content displays correctly before submitting. Then publish it to RPubs and submit the RPubs URL to Canvas Assignments.