These plot types help us understand distributions and group comparisons, rather than relationships over time or between variables.
A histogram shows the distribution of a single numeric variable.
It helps answer questions like where most values fall, how spread out they are, and whether the distribution is skewed.
We will look at miles per gallon (mpg) from mtcars.
hist(mtcars$mpg)
Common options:
hist(
  mtcars$mpg,
  col = "lightblue",
  main = "Distribution of Miles per Gallon",
  xlab = "Miles per Gallon",
  breaks = 10,        # histogram-specific: suggested number of bins
  probability = TRUE  # histogram-specific: plot density instead of counts
)
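To make the effect of probability = TRUE concrete, here is a minimal sketch that stores the histogram object, checks that the bar areas sum to 1, and overlays a smoothed density curve. The names h and bar_area are illustrative choices, not part of the original lesson.

```r
# Sketch: check what probability = TRUE does, then overlay a density curve.
h <- hist(
  mtcars$mpg,
  breaks = 10,
  probability = TRUE,
  col = "lightblue",
  main = "Distribution of Miles per Gallon",
  xlab = "Miles per Gallon"
)

# With probability = TRUE, bar height is a density, so height * width sums to 1.
bar_area <- sum(h$density * diff(h$breaks))
bar_area

# lines() is a low-level function: it draws on top of the existing histogram.
lines(density(mtcars$mpg), lwd = 2)
```

The density curve is only meaningful on the density scale, which is why it is paired with probability = TRUE rather than with counts.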
Histogram Practice
Create a histogram of horsepower (hp).
Customize the color, title, x-axis label, and number of breaks:
hist(
  mtcars$hp,
  col = "yellow", breaks = 10, probability = FALSE,
  main = "Distribution of Horsepower", xlab = "Horsepower"
)
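When breaks is a single number, hist() treats it as a suggestion and rounds to "pretty" breakpoints, so the number of bins drawn can differ from the number requested. The sketch below compares two requests without drawing anything; h5, h20, and bins_used are illustrative names.

```r
# Sketch: compare the requested number of bins with what hist() actually uses.
h5  <- hist(mtcars$hp, breaks = 5,  plot = FALSE)
h20 <- hist(mtcars$hp, breaks = 20, plot = FALSE)

# Number of bins actually used (one fewer than the number of breakpoints).
bins_used <- c(requested_5 = length(h5$counts), requested_20 = length(h20$counts))
bins_used

# Whatever the binning, every car is counted exactly once.
sum(h5$counts) == nrow(mtcars)
sum(h20$counts) == nrow(mtcars)
```

Trying a few values of breaks is a quick way to see whether a feature of the histogram reflects the data or only the binning.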
What is a Boxplot?
A boxplot summarizes data using:
1. Median
2. Quartiles
3. Range
4. Outliers
Boxplots are useful for comparing distributions across groups and for spotting outliers.
boxplot(mtcars$mpg)
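The numbers behind the box can be inspected directly. This sketch uses boxplot.stats() from base R, which returns the five values the box and whiskers are drawn from along with any outliers; bs is an illustrative name.

```r
# Sketch: the numbers a boxplot of mpg is drawn from.
bs <- boxplot.stats(mtcars$mpg)

# Lower whisker, lower hinge, median, upper hinge, upper whisker.
bs$stats

# Points beyond the whiskers, drawn individually as outliers (may be empty).
bs$out

# The middle value of stats is the median.
isTRUE(all.equal(bs$stats[3], median(mtcars$mpg)))
```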
boxplot(
mtcars$mpg,
main = "Boxplot of Miles per Gallon",
ylab = "Miles per Gallon",
col = "lightgreen"
)
boxplot(
mpg ~ cyl,
data = mtcars,
xlab = "Number of Cylinders",
ylab = "Miles per Gallon",
main = "MPG by Cylinder Count",
col = "orange"
)
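The thick line inside each box is the group median. As a rough check on what the plot shows, the medians can be computed with the same grouping; mpg_median_by_cyl is an illustrative name.

```r
# Sketch: the median line inside each box, computed directly.
mpg_median_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, median)
mpg_median_by_cyl

# The same formula interface as boxplot() works with aggregate().
aggregate(mpg ~ cyl, data = mtcars, FUN = median)
```

Reading the formula mpg ~ cyl as "mpg grouped by cyl" applies to both boxplot() and aggregate().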
boxplot(
  mpg ~ cyl,
  data = mtcars,
  xlab = "Miles per Gallon",          # with horizontal = TRUE, the x-axis shows the values
  ylab = "Number of Cylinders",       # and the y-axis shows the groups
  main = "MPG by Cylinder Count",
  col = c("orange", "pink", "blue"),  # universal
  outcol = "red",                     # boxplot-specific: outlier color
  horizontal = TRUE                   # boxplot-specific: check your axis labels!
)
# Low-level functions add to the existing plot
legend("topright", legend = c("4 cylinders", "6 cylinders", "8 cylinders"), fill = c("orange", "pink", "blue"))
grid()
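Because horizontal = TRUE moves the numeric values to the x-axis, the axis labels have to swap along with the boxes. One way to keep them in sync is to derive both labels from a single orientation switch. This is only a sketch; horiz, value_lab, group_lab, x_lab, and y_lab are hypothetical names introduced for illustration.

```r
# Sketch: choose axis labels from the orientation so they cannot get out of sync.
horiz <- TRUE  # hypothetical switch; set to FALSE for vertical boxes

value_lab <- "Miles per Gallon"
group_lab <- "Number of Cylinders"

# Horizontal boxes put the numeric values on the x-axis and the groups on the y-axis.
x_lab <- if (horiz) value_lab else group_lab
y_lab <- if (horiz) group_lab else value_lab

boxplot(
  mpg ~ cyl,
  data = mtcars,
  horizontal = horiz,
  xlab = x_lab,
  ylab = y_lab,
  main = "MPG by Cylinder Count"
)
```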
Create a boxplot comparing horsepower (hp) across cylinder groups (cyl).
Customize:
- Title
- Axis labels
- Color
boxplot(
  hp ~ cyl,
  data = mtcars,
  main = "Horsepower by Cylinder Count",
  xlab = "Horsepower",           # horizontal = TRUE puts the values on the x-axis
  ylab = "Number of Cylinders",
  col = c("lightblue", "darkgreen", "brown"),
  horizontal = TRUE
)
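boxplot() also returns the statistics it plots, which helps when you want to name the outliers rather than only see them. The sketch below asks for the summary without drawing (plot = FALSE); bp is an illustrative name.

```r
# Sketch: get the values behind the boxes without drawing anything.
bp <- boxplot(hp ~ cyl, data = mtcars, plot = FALSE)

bp$names  # group labels, in plotting order
bp$n      # number of cars in each group
bp$out    # values flagged as outliers (may be empty)
bp$group  # index of the group each outlier belongs to
```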
What is a Barplot?
Barplots are used for categorical data or summarized counts.
They show one bar per category, with the bar height giving the count or summary value.
# First, count how many cars have each cylinder number.
cyl_counts <- table(mtcars$cyl)
cyl_counts
##
##  4  6  8
## 11  7 14
barplot(cyl_counts)
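Counts are not the only thing a barplot can show. As a small sketch, the same table can be converted to proportions with prop.table() before plotting; cyl_props is an illustrative name.

```r
# Sketch: turn the counts into proportions before plotting.
cyl_counts <- table(mtcars$cyl)
cyl_props <- prop.table(cyl_counts)  # each count divided by the total
cyl_props

barplot(
  cyl_props,
  main = "Share of Cars by Cylinder Count",
  xlab = "Cylinders",
  ylab = "Proportion of Cars"
)
```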
barplot(
cyl_counts,
col = "purple",
main = "Number of Cars by Cylinder Count",
xlab = "Cylinders",
ylab = "Number of Cars"
)
barplot(
  cyl_counts,
  col = "orange",
  main = "Number of Cars by Cylinder Count",
  xlab = "Cylinders",
  ylab = "Number of Cars",
  border = "blue",
  lwd = 2,
  cex.main = 2,   # universal: title size
  cex.lab = 1.5,  # universal: axis-label size
  las = 2,        # universal: tick labels perpendicular to the axis
  space = 0.5     # barplot-specific: gap between bars
)
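barplot() returns the x-positions of the bar midpoints, which is useful for placing labels with the low-level function text(). A minimal sketch, with bar_mids and counts as illustrative names:

```r
# Sketch: barplot() returns the bar midpoints, which text() can use for labels.
cyl_counts <- table(mtcars$cyl)
counts <- as.vector(cyl_counts)  # plain numbers for positioning and labels

bar_mids <- barplot(
  cyl_counts,
  col = "orange",
  main = "Number of Cars by Cylinder Count",
  xlab = "Cylinders",
  ylab = "Number of Cars",
  ylim = c(0, max(counts) * 1.2)  # leave headroom for the labels
)

# pos = 3 places each label just above its (x, y) position.
text(x = bar_mids, y = counts, labels = counts, pos = 3)
```

Extending ylim is a design choice: without the extra headroom, the label on the tallest bar can be clipped at the top of the plot.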
Barplot Practice
Create a barplot showing how many cars fall into each gear category (gear).
Steps:
1. Use table()
2. Use barplot()
3. Add labels and color
gear_count <- table(mtcars$gear)
barplot(
  gear_count,
  col = "gold",
  border = "lightgrey",
  xlab = "Gears",
  ylab = "Number of Cars",
  main = "Number of Cars by Gear Count",
  space = 0.5,
  lwd = 3,        # line width of the axis line
  las = 1,        # orientation of the tick labels (1 = always horizontal)
  cex.main = 2,   # scales the size of the title
  cex.lab = 1.5   # scales the size of the x and y labels
)
Summary
You now know how to create histograms, boxplots, and barplots.
Together with scatterplots and line plots, these give you a powerful toolkit for visualizing data in R.
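As a recap, the three plot types can be drawn side by side with par(mfrow = ...). This is a sketch rather than part of the original lesson; old_par is an illustrative name used to restore the previous layout afterwards.

```r
# Sketch: draw all three plot types side by side.
old_par <- par(mfrow = c(1, 3))  # 1 row, 3 columns; keep the old settings

hist(mtcars$mpg, main = "Histogram", xlab = "Miles per Gallon", col = "lightblue")
boxplot(mpg ~ cyl, data = mtcars, main = "Boxplot",
        xlab = "Cylinders", ylab = "Miles per Gallon", col = "lightgreen")
barplot(table(mtcars$cyl), main = "Barplot",
        xlab = "Cylinders", ylab = "Number of Cars", col = "orange")

par(old_par)  # restore the previous layout
```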
In this assignment, you will create and customize three types of plots using base R: histograms, boxplots, and barplots.
You will use the built-in mtcars dataset unless otherwise specified.
All plots must include:
A histogram displays the distribution of a single numeric variable.
Create a histogram of one numeric variable from mtcars (for example: mpg, hp, or wt).
Your histogram must include:
- breaks
hist(
  mtcars$wt,
  col = "darkred",
  main = "Distribution of Weight",
  xlab = "Weight (1000 lbs)",
  breaks = 10,
  probability = FALSE
)
Questions:
A boxplot summarizes a distribution using its median, quartiles, range, and outliers.
Create a boxplot comparing a numeric variable across groups.
Example: Compare mpg by number of cylinders (cyl).
Your boxplot must include:
- Formula notation (y ~ x)
boxplot(
  hp ~ cyl,
  data = mtcars,
  horizontal = TRUE,
  xlab = "Horsepower",
  ylab = "Cylinders",
  main = "Horsepower by Cylinder",
  col = c("pink", "tan", "white")
)
Questions
A barplot displays counts or summarized categorical data.
Use table() to count frequencies of a categorical variable (gear or cyl).
mtcars
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
carb_count <- table(mtcars$carb)
barplot(
  carb_count,
  col = "orange",
  main = "Number of Cars by Carburetor Count",
  xlab = "Carburetors",
  ylab = "Number of Cars"
)
Questions