List the top 10 customers who have rented the most films, including their first and last names, email addresses, and the total number of films they have rented.
SELECT c.first_name,
c.last_name,
c.email,
COUNT(r.customer_id) AS total_films_rented
FROM customer c
JOIN rental r ON c.customer_id = r.customer_id
GROUP BY c.customer_id
ORDER BY total_films_rented DESC
LIMIT 10;
| first_name | last_name | total_films_rented | |
|---|---|---|---|
| ELEANOR | HUNT | ELEANOR.HUNT@sakilacustomer.org | 46 |
| KARL | SEAL | KARL.SEAL@sakilacustomer.org | 45 |
| MARCIA | DEAN | MARCIA.DEAN@sakilacustomer.org | 42 |
| CLARA | SHAW | CLARA.SHAW@sakilacustomer.org | 42 |
| TAMMY | SANDERS | TAMMY.SANDERS@sakilacustomer.org | 41 |
| WESLEY | BULL | WESLEY.BULL@sakilacustomer.org | 40 |
| SUE | PETERS | SUE.PETERS@sakilacustomer.org | 40 |
| TIM | CARY | TIM.CARY@sakilacustomer.org | 39 |
| MARION | SNYDER | MARION.SNYDER@sakilacustomer.org | 39 |
| RHONDA | KENNEDY | RHONDA.KENNEDY@sakilacustomer.org | 39 |
Find the titles and descriptions of films that have never been rented, sorted alphabetically by title.
SELECT f.title, f.description
FROM film f
LEFT JOIN inventory i ON f.film_id = i.film_id
LEFT JOIN rental r ON i.inventory_id = r.inventory_id
WHERE r.rental_id IS NULL
ORDER BY f.title;
| title | description |
|---|---|
| ACADEMY DINOSAUR | A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies |
| ALICE FANTASIA | A Emotional Drama of a A Shark And a Database Administrator who must Vanquish a Pioneer in Soviet Georgia |
| APOLLO TEEN | A Action-Packed Reflection of a Crocodile And a Explorer who must Find a Sumo Wrestler in An Abandoned Mine Shaft |
| ARGONAUTS TOWN | A Emotional Epistle of a Forensic Psychologist And a Butler who must Challenge a Waitress in An Abandoned Mine Shaft |
| ARK RIDGEMONT | A Beautiful Yarn of a Pioneer And a Monkey who must Pursue a Explorer in The Sahara Desert |
| ARSENIC INDEPENDENCE | A Fanciful Documentary of a Mad Cow And a Womanizer who must Find a Dentist in Berlin |
| BOONDOCK BALLROOM | A Fateful Panorama of a Crocodile And a Boy who must Defeat a Monkey in The Gulf of Mexico |
| BUTCH PANTHER | A Lacklusture Yarn of a Feminist And a Database Administrator who must Face a Hunter in New Orleans |
| CATCH AMISTAD | A Boring Reflection of a Lumberjack And a Feminist who must Discover a Woman in Nigeria |
| CHINATOWN GLADIATOR | A Brilliant Panorama of a Technical Writer And a Lumberjack who must Escape a Butler in Ancient India |
Retrieve a list of film categories and the average length of films in each category, ordered by the average length in descending order. Include the category name and the average length.
SELECT c.name AS category_name,
AVG(f.length) AS average_length_of_film
FROM film_category fc
JOIN film f ON fc.film_id = f.film_id
JOIN category c ON fc.category_id = c.category_id
GROUP BY c.name
ORDER BY average_length_of_film DESC;
| category_name | average_length_of_film |
|---|---|
| Sports | 128.2027 |
| Games | 127.8361 |
| Foreign | 121.6986 |
| Drama | 120.8387 |
| Comedy | 115.8276 |
| Family | 114.7826 |
| Music | 113.6471 |
| Travel | 113.3158 |
| Horror | 112.4821 |
| Classics | 111.6667 |
Identify the 5 actors who have appeared in the most films, along with the number of films they’ve acted in. Include their first and last names.
SELECT a.first_name, a.last_name, COUNT(*) AS film_count
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
GROUP BY a.actor_id
ORDER BY film_count DESC
LIMIT 5;
| first_name | last_name | film_count |
|---|---|---|
| GINA | DEGENERES | 42 |
| WALTER | TORN | 41 |
| MARY | KEITEL | 40 |
| MATTHEW | CARREY | 39 |
| SANDRA | KILMER | 37 |
There are no rentals of a film with Johnny Depp,
Find the top 10 earning film titles along with their total revenue (rental fees), sorted by revenue in descending order. Include the film title and total revenue.
SELECT f.title AS film_title, SUM(p.amount) AS total_revenue
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON i.inventory_id = r.inventory_id
JOIN payment p ON r.rental_id = p.rental_id
GROUP BY f.film_id
ORDER BY total_revenue DESC
LIMIT 10;
| film_title | total_revenue |
|---|---|
| TELEGRAPH VOYAGE | 231.73 |
| WIFE TURN | 223.69 |
| ZORRO ARK | 214.69 |
| GOODFELLAS SALUTE | 209.69 |
| SATURDAY LAMBS | 204.72 |
| TITANS JERK | 201.71 |
| TORQUE BOUND | 198.72 |
| HARRY IDAHO | 195.70 |
| INNOCENT USUAL | 191.74 |
| HUSTLER PARTY | 190.78 |
Use the diamonds dataset to create a 2D plotly scatterplot (you can choose the variables). The plot should have a title and subtitle, axis labels, and hover information that is useful.
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
plot1 <- plot_ly(data = diamonds, x = ~carat, y = ~price, color = ~color,
type = 'scatter', mode = 'markers') %>%
layout(title = "Diamonds Scatterplot <br><sup> Comparing Carats and Price </sup>",
xaxis = list(title = "Carat"),
yaxis = list(title = "Price"),
hovermode = "closest",
showlegend = TRUE,
legend = list(title = "Color"))
plot1
Compare and contrast ggplot2 and plotly. What is an example of a time when you would definitely want to use each one over the other?
library(ggplot2)
plot2 <- ggplot(diamonds, aes(x = carat, y = price, color = color)) +
geom_point() +
labs(title = "Diamonds Scatterplot: Carat vs. Price",
x = "Carat",
y = "Price",
color = "Color") +
theme_minimal()
plot2
COMPARE: ggplot and plotly are similar in that they are popular tools to create a wide range of visualizations, like the scatter plots above. I like using both ggplot and plotly because I can customize each of my plots fairly easily.
CONTRAST: Despite their similarities, ggplot and plotly differ significantly in how one writes each’s respective code. I mostly would use ggplot for making pretty, static visualizations that are simple to understand and code. For me, the aesthetic aspect and “layering” ideology is very intuitive and I can create a graphic fairly quickly because of such. I would probably choose to use ggplot when needing to present data to a layman, or if I am on a time crunch, because of its aesthetics and ease of use.
In contrast, plotly is better for creating interactive visualizations. I would use it if I wanted to have features in my graphic such as hovering, zooming, and filtering. Also, we used plotly when creating an animation; I don’t believe ggplot can do that. The dynamic properties of plotly make it ideal for showing change over time, so I would choose to use it over ggplot when making an interactive webapp for users to explore.
Using the gapminder dataset, make a Plotly plot which compares two variables over time. Explain the meaning of this comparison in context and describe what narrative is advanced by this comparison.
library(gapminder)
plot3 <- plot_ly(data = gapminder, x = ~gdpPercap, y = ~lifeExp,
text = ~country, frame = ~year, mode = 'markers') %>%
layout(title = "Life Expectancy and GDP per Capita from 1952 to 2007",
xaxis = list(title = "GDP per Capita"),
yaxis = list(title = "Life Expectancy"),
showlegend = FALSE) %>%
animation_slider(currentvalue = list(prefix = "Year: "))
## No trace type specified:
## Based on info supplied, a 'scatter' trace seems appropriate.
## Read more about this trace type -> https://plotly.com/r/reference/#scatter
plot3
The context of this comparison is that GDP per capita and Life Expectancy are, in theory, highly correlated because higher income means one can afford more access to healthcare and other goods that improve their health and thus life expectancy. The narrative of the animation of that life expectancy and GDP per capita have both increased over time in tandem, and their logistic relationship becomes more pronounced towards the 2000’s; this means that there is some limit (as of now) to amount that higher income can improve a country’s life expectancy.