library(tidyverse)
library(leaflet)
earthquake <- read_csv("https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv")
head(earthquake)
## # A tibble: 6 x 22
## time latitude longitude depth mag magType nst gap
## <dttm> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 2020-05-15 01:03:32 34.2 -117. 14.2 1.37 ml 30 83
## 2 2020-05-15 00:55:19 35.8 -84.8 22.3 2.11 md 14 63
## 3 2020-05-15 00:54:40 60.6 -153. 237. 0.6 ml NA NA
## 4 2020-05-15 00:35:43 38.8 -123. 2.18 0.89 md 16 50
## 5 2020-05-15 00:22:06 39.6 -120. 6.5 1.1 ml 13 76.6
## 6 2020-05-15 00:15:34 60.5 -152. 92.1 2 ml NA NA
## # … with 14 more variables: dmin <dbl>, rms <dbl>, net <chr>, id <chr>,
## # updated <dttm>, place <chr>, type <chr>, horizontalError <dbl>,
## # depthError <dbl>, magError <dbl>, magNst <dbl>, status <chr>,
## # locationSource <chr>, magSource <chr>
The first thing that stands out in the map below is that most earthquakes occur around coastal regions, with some occurring in the ocean as well. A high concentration appears in North America, specifically the western United States and Alaska.
ggplot(data = earthquake) +
  borders("world", xlim = c(-180, 180), ylim = c(-90, 90)) +
  geom_point(mapping = aes(x = longitude,
                           y = latitude,
                           color = mag))
Even though the United States has a high concentration of earthquakes, the majority seem to have a magnitude of 2.5 or lower. The strongest earthquakes seem to be occurring in the oceans and throughout the Middle East and Asia.
ggplot(data = earthquake) +
  borders("world", xlim = c(-180, 180), ylim = c(-90, 90)) +
  # alpha is a constant, so it is set outside aes() rather than mapped
  geom_point(mapping = aes(x = longitude,
                           y = latitude,
                           color = mag,
                           size = mag),
             alpha = .1)
## Warning: Removed 2 rows containing missing values (geom_point).
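The reading that most events sit at magnitude 2.5 or lower can be checked numerically instead of by eye. The sketch below is an addition to the original analysis: `share_at_or_below` is a hypothetical helper, and it assumes the `earthquake` data frame loaded earlier is still in the session. No result is shown because the feed changes over time.

```r
# Hypothetical helper (not part of the original analysis): the share of
# non-missing values that are at or below a threshold.
share_at_or_below <- function(x, threshold = 2.5) {
  mean(x <= threshold, na.rm = TRUE)
}

# Apply it to the magnitude column if the USGS feed has been loaded.
if (exists("earthquake")) {
  print(share_at_or_below(earthquake$mag))
}
```

A value close to 1 would support the claim; the two rows dropped in the warning above are excluded here as well through `na.rm = TRUE`.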
head(mpg)
## # A tibble: 6 x 11
## manufacturer model displ year cyl trans drv cty hwy fl class
## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi a4 1.8 1999 4 auto(… f 18 29 p comp…
## 2 audi a4 1.8 1999 4 manua… f 21 29 p comp…
## 3 audi a4 2 2008 4 manua… f 20 31 p comp…
## 4 audi a4 2 2008 4 auto(… f 21 30 p comp…
## 5 audi a4 2.8 1999 6 auto(… f 16 26 p comp…
## 6 audi a4 2.8 1999 6 manua… f 18 26 p comp…
We can see that cars with fewer cylinders tend to have higher city mileage than those with more cylinders. It is also clear that compact and subcompact vehicles mostly do well on city mileage because they are mostly built with 4 cylinders. On the other hand, bigger vehicles, such as pickups and SUVs, have lower city mileage because they are more likely to be built with more cylinders.
ggplot(mpg, aes(cty, fill = class)) +
  geom_histogram() +
  facet_grid(cyl ~ .) +
  labs(title = "City Mileage by Number of Cylinders")
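The pattern read off the histogram can also be checked with a grouped summary. The sketch below is an addition, not part of the original analysis; the names `cty_by_cyl`, `mean_cty`, and `n_cars` are illustrative.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

# Mean city mileage and number of cars for each cylinder count
cty_by_cyl <- mpg %>%
  group_by(cyl) %>%
  summarise(mean_cty = mean(cty), n_cars = n())

print(cty_by_cyl)
```

If the histogram is being read correctly, `mean_cty` should fall as `cyl` rises.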
Here, the scatterplot shows that front-wheel-drive cars generally have higher mileage, both highway and city, than rear-wheel-drive or four-wheel-drive cars.
ggplot(mpg, aes(cty, hwy, color = drv)) +
  geom_point() +
  labs(title = "City and Highway Mileage by Drivetrain") +
  xlab("City Mileage") +
  ylab("Highway Mileage")
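To back the visual impression from the scatterplot with numbers, the mean mileage per drivetrain can be tabulated. This is a minimal sketch added for illustration; `mileage_by_drv`, `mean_cty`, and `mean_hwy` are names introduced here, not taken from the original analysis.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

# Mean city and highway mileage for each drivetrain (f, r, 4)
mileage_by_drv <- mpg %>%
  group_by(drv) %>%
  summarise(mean_cty = mean(cty), mean_hwy = mean(hwy))

print(mileage_by_drv)
```

Means hide the overlap between groups that the scatterplot shows, so the table complements the plot and does not replace it.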
The most common manufacturer in this dataset is Dodge, followed closely by Toyota. The least common are Lincoln, Land Rover, and Mercury.
mft <- mpg %>%
  group_by(manufacturer) %>%
  count()
ggplot(mft, aes(reorder(manufacturer, n), weight = n)) +
  geom_bar(fill = "forestgreen") +  # constant fill belongs outside aes()
  coord_flip()
Most cars in the dataset require either premium or regular gasoline, with very few requiring diesel or ethanol.
ggplot(mpg, aes(factor(fl), fill = factor(year))) +
  geom_bar() +
  xlab("Fuel Type") +
  labs(title = "Fuel Type by Year")
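The dominance of regular and premium gasoline can be quantified by counting cars per fuel code. The sketch below is an addition to the original analysis; `fuel_share` and `share` are illustrative names, and it assumes the codes `r` and `p` in `fl` stand for regular and premium, as the reading above implies.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

# Number of cars and share of the dataset for each fuel code
fuel_share <- mpg %>%
  count(fl, sort = TRUE) %>%
  mutate(share = n / sum(n))

print(fuel_share)

# Combined share of the two codes assumed to mean regular and premium
sum(fuel_share$share[fuel_share$fl %in% c("r", "p")])
```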