SQL Questions

library(DBI)
library(dbplyr)
con <- dbConnect(drv = RSQLite::SQLite(),
                 dbname = ":memory:")

customer <- read.csv("C:/Users/miles/Downloads/customer.csv")
rental <- read.csv("C:/Users/miles/Downloads/rental.csv")
film <- read.csv("C:/Users/miles/Downloads/film.csv")
inventory <- read.csv("C:/Users/miles/Downloads/inventory.csv")
category <- read.csv("C:/Users/miles/Downloads/category.csv")
film_category <- read.csv("C:/Users/miles/Downloads/film_category.csv")
actor <- read.csv("C:/Users/miles/Downloads/actor.csv")
film_actor <- read.csv("C:/Users/miles/Downloads/film_actor.csv")
payment <- read.csv("C:/Users/miles/Downloads/payment.csv")

dbWriteTable(conn = con, name = "customer", value = customer)
dbWriteTable(conn = con, name = "rental", value = rental)
dbWriteTable(conn = con, name = "film", value = film)
dbWriteTable(conn = con, name = "inventory", value = inventory)
dbWriteTable(conn = con, name = "category", value = category)
dbWriteTable(conn = con, name = "film_category", value = film_category)
dbWriteTable(conn = con, name = "actor", value = actor)
dbWriteTable(conn = con, name = "film_actor", value = film_actor)
dbWriteTable(conn = con, name = "payment", value = payment)
knitr::opts_chunk$set(connection = "con")

Problem 1

SELECT c.first_name, c.last_name, c.email, COUNT(r.rental_id) AS total_rentals
FROM customer c
JOIN rental r ON c.customer_id = r.customer_id
GROUP BY c.customer_id
ORDER BY total_rentals DESC
LIMIT 10
Displaying records 1 - 10
first_name last_name email total_rentals
ELEANOR HUNT 46
KARL SEAL 45
MARCIA DEAN 42
CLARA SHAW 42
TAMMY SANDERS 41
WESLEY BULL 40
SUE PETERS 40
TIM CARY 39
MARION SNYDER 39
RHONDA KENNEDY 39

Problem 2

SELECT f.title, f.description
FROM film f
LEFT JOIN inventory i ON f.film_id = i.film_id
LEFT JOIN rental r ON i.inventory_id = r.inventory_id
WHERE r.rental_id IS NULL
ORDER BY f.title
Displaying records 1 - 10
title description
ACADEMY DINOSAUR A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies
ALICE FANTASIA A Emotional Drama of a A Shark And a Database Administrator who must Vanquish a Pioneer in Soviet Georgia
APOLLO TEEN A Action-Packed Reflection of a Crocodile And a Explorer who must Find a Sumo Wrestler in An Abandoned Mine Shaft
ARGONAUTS TOWN A Emotional Epistle of a Forensic Psychologist And a Butler who must Challenge a Waitress in An Abandoned Mine Shaft
ARK RIDGEMONT A Beautiful Yarn of a Pioneer And a Monkey who must Pursue a Explorer in The Sahara Desert
ARSENIC INDEPENDENCE A Fanciful Documentary of a Mad Cow And a Womanizer who must Find a Dentist in Berlin
BOONDOCK BALLROOM A Fateful Panorama of a Crocodile And a Boy who must Defeat a Monkey in The Gulf of Mexico
BUTCH PANTHER A Lacklusture Yarn of a Feminist And a Database Administrator who must Face a Hunter in New Orleans
CATCH AMISTAD A Boring Reflection of a Lumberjack And a Feminist who must Discover a Woman in Nigeria
CHINATOWN GLADIATOR A Brilliant Panorama of a Technical Writer And a Lumberjack who must Escape a Butler in Ancient India

Problem 3

SELECT c.name AS category, AVG(f.length) AS avg_length
FROM film f
JOIN film_category fc ON f.film_id = fc.film_id
JOIN category c ON fc.category_id = c.category_id
GROUP BY c.name
ORDER BY avg_length DESC
Displaying records 1 - 10
category avg_length
Sports 128.2027
Games 127.8361
Foreign 121.6986
Drama 120.8387
Comedy 115.8276
Family 114.7826
Music 113.6471
Travel 113.3158
Horror 112.4821
Classics 111.6667

Problem 4

SELECT a.first_name, a.last_name, COUNT(fa.film_id) AS film_count
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
GROUP BY a.actor_id
ORDER BY film_count DESC
LIMIT 5
5 records
first_name last_name film_count
GINA DEGENERES 42
WALTER TORN 41
MARY KEITEL 40
MATTHEW CARREY 39
SANDRA KILMER 37

Problem 5

SELECT DISTINCT c.first_name, c.last_name
FROM customer c
JOIN rental r ON c.customer_id = r.customer_id
JOIN inventory i ON r.inventory_id = i.inventory_id
JOIN film f ON i.film_id = f.film_id
JOIN film_actor fa ON f.film_id = fa.film_id
JOIN actor a ON fa.actor_id = a.actor_id
WHERE a.first_name = 'Johnny' AND a.last_name = 'Depp'
0 records
first_name last_name

Problem 6

SELECT f.title, SUM(p.amount) AS total_revenue
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON i.inventory_id = r.inventory_id
JOIN payment p ON r.rental_id = p.rental_id
GROUP BY f.title
ORDER BY total_revenue DESC
LIMIT 10
Displaying records 1 - 10
title total_revenue
TELEGRAPH VOYAGE 231.73
WIFE TURN 223.69
ZORRO ARK 214.69
GOODFELLAS SALUTE 209.69
SATURDAY LAMBS 204.72
TITANS JERK 201.71
TORQUE BOUND 198.72
HARRY IDAHO 195.70
INNOCENT USUAL 191.74
HUSTLER PARTY 190.78

Plotly Questions

Problem 1

## Warning: package 'plotly' was built under R version 4.4.3
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout

Problem 2

Both plotly and ggplot2 can be good at creating data visualizations, but there are some instances where one may prefer one over another. Ggplot2 is great for creating static graphics that do not let viewers interact with the visualizations. On the other hand, viewers are able to interact with a plotly graphic. In the diamonds scatter plot, individuals can hover over a specific point to see that diamond’s specific color, clarity and position (carat, price). In short, ggplot may be better suited for academic/written reports/deliverables while plotly is idea for websites or apps.

Problem 3

## Warning: package 'gapminder' was built under R version 4.4.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:dbplyr':
## 
##     ident, sql
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors

This plotly highlights how GDP per capita ($usd) has changed over time for different countries. The countries highlighted are China, Germany, India, and the United States. Some of these countries, such as India and U.S., have some of the largest economies in the World. The graphic also highlights how GDP per capita is much higher in Germany and the U.S. compared to India and China. This may be due to India and China having extremely large populations. The graphic also highlights how GDP per capita for Germany and U.S. has increased substantially in recent years while India and China has been more stagnant. In short, the comparison illustrates a shift in global economic development and emphasizes the rise of emerging markets in recent years around the World.