SQL: Stanford’s Exercises - Movie Rating - PART I

You’ve started a new movie-rating website, and you’ve been collecting data on reviewers’ ratings of various movies.

-- Database of movie-rating website
/* Delete the tables if they already exist */
drop table if exists Movie;
drop table if exists Reviewer;
drop table if exists Rating;

/* Create the schema for our tables */
create table Movie(mID int, title text, year int, director text);
create table Reviewer(rID int, name text);
create table Rating(rID int, mID int, stars int, ratingDate date);

/* Populate the tables with our data */
insert into Movie values(101, 'Gone with the Wind', 1939, 'Victor Fleming');
insert into Movie values(102, 'Star Wars', 1977, 'George Lucas');
insert into Movie values(103, 'The Sound of Music', 1965, 'Robert Wise');
insert into Movie values(104, 'E.T.', 1982, 'Steven Spielberg');
insert into Movie values(105, 'Titanic', 1997, 'James Cameron');
insert into Movie values(106, 'Snow White', 1937, null);
insert into Movie values(107, 'Avatar', 2009, 'James Cameron');
insert into Movie values(108, 'Raiders of the Lost Ark', 1981, 'Steven Spielberg');

insert into Reviewer values(201, 'Sarah Martinez');
insert into Reviewer values(202, 'Daniel Lewis');
insert into Reviewer values(203, 'Brittany Harris');
insert into Reviewer values(204, 'Mike Anderson');
insert into Reviewer values(205, 'Chris Jackson');
insert into Reviewer values(206, 'Elizabeth Thomas');
insert into Reviewer values(207, 'James Cameron');
insert into Reviewer values(208, 'Ashley White');

insert into Rating values(201, 101, 2, '2011-01-22');
insert into Rating values(201, 101, 4, '2011-01-27');
insert into Rating values(202, 106, 4, null);
insert into Rating values(203, 103, 2, '2011-01-20');
insert into Rating values(203, 108, 4, '2011-01-12');
insert into Rating values(203, 108, 2, '2011-01-30');
insert into Rating values(204, 101, 3, '2011-01-09');
insert into Rating values(205, 103, 3, '2011-01-27');
insert into Rating values(205, 104, 2, '2011-01-22');
insert into Rating values(205, 108, 4, null);
insert into Rating values(206, 107, 3, '2011-01-15');
insert into Rating values(206, 106, 5, '2011-01-19');
insert into Rating values(207, 107, 5, '2011-01-20');
insert into Rating values(208, 104, 3, '2011-01-02');

Problem 1

Problem: Find the titles of all movies directed by Steven Spielberg.

-- Find the titles of all movies directed by Steven Spielberg. 
SELECT title
FROM Movie
WHERE director = 'Steven Spielberg';
##                     title
## 1                    E.T.
## 2 Raiders of the Lost Ark

Problem 2

Problem: Find all years that have a movie that received a rating of 4 or 5, and sort them in increasing order.

-- Find all years that have a movie that received a rating of 4 or 5, and sort them in increasing order. 
SELECT DISTINCT year
FROM Movie
JOIN Rating ON Rating.mID = Movie.mID
WHERE stars IN (4,5)
ORDER BY year ASC;
##   year
## 1 1937
## 2 1939
## 3 1981
## 4 2009

Problem 3

Problem: Find the titles of all movies that have no ratings.

-- Find the titles of all movies that have no ratings. 
SELECT title
FROM Movie
LEFT JOIN Rating
ON Movie.mID = Rating.mID
WHERE stars IS NULL;
##       title
## 1 Star Wars
## 2   Titanic

Problem 4

Problem: Some reviewers didn’t provide a date with their rating. Find the names of all reviewers who have ratings with a NULL value for the date.

-- Find the names of all reviewers who have ratings with a NULL value for the date. 
SELECT name
FROM Reviewer
LEFT JOIN Rating
ON Reviewer.rID = Rating.rID
WHERE ratingDate IS NULL;
##            name
## 1  Daniel Lewis
## 2 Chris Jackson

Problem 5

Problem: Write a query to return the ratings data in a more readable format: reviewer name, movie title, stars, and ratingDate. Also, sort the data, first by reviewer name, then by movie title, and lastly by number of stars.

-- Write a query to return the ratings data in a more readable format: reviewer name, movie title, stars, and ratingDate. Also, sort the data, first by reviewer name, then by movie title, and lastly by number of stars. 
SELECT name, title, stars, ratingDate
FROM Rating
INNER JOIN Movie
ON Rating.mID = Movie.mID
INNER JOIN Reviewer
ON Rating.rID = Reviewer.rID
ORDER BY name, title, stars ASC;
##                name                   title stars ratingdate
## 1      Ashley White                    E.T.     3 2011-01-02
## 2   Brittany Harris Raiders of the Lost Ark     2 2011-01-30
## 3   Brittany Harris Raiders of the Lost Ark     4 2011-01-12
## 4   Brittany Harris      The Sound of Music     2 2011-01-20
## 5     Chris Jackson                    E.T.     2 2011-01-22
## 6     Chris Jackson Raiders of the Lost Ark     4       <NA>
## 7     Chris Jackson      The Sound of Music     3 2011-01-27
## 8      Daniel Lewis              Snow White     4       <NA>
## 9  Elizabeth Thomas                  Avatar     3 2011-01-15
## 10 Elizabeth Thomas              Snow White     5 2011-01-19
## 11    James Cameron                  Avatar     5 2011-01-20
## 12    Mike Anderson      Gone with the Wind     3 2011-01-09
## 13   Sarah Martinez      Gone with the Wind     2 2011-01-22
## 14   Sarah Martinez      Gone with the Wind     4 2011-01-27

Problem 6

Problem: For all cases where the same reviewer rated the same movie twice and gave it a higher rating the second time, return the reviewer’s name and the title of the movie.

-- For all cases where the same reviewer rated the same movie twice and gave it a higher rating the second time, return the reviewer's name and the title of the movie. 
SELECT name, title
FROM Movie
INNER JOIN Rating R1 USING(mId)
INNER JOIN Rating R2 USING(rId, mId)
INNER JOIN Reviewer USING(rId)
WHERE R1.ratingDate < R2.ratingDate AND R1.stars < R2.stars;
##             name              title
## 1 Sarah Martinez Gone with the Wind

Problem 7

Problem: For each movie that has at least one rating, find the highest number of stars that movie received. Return the movie title and number of stars. Sort by movie title.

-- Find the titles of all movies directed by Steven Spielberg. 
SELECT title, max(stars)
FROM Movie
JOIN Rating ON Rating.mID = Movie.mID
GROUP BY Rating.mID, title
ORDER BY title ASC;
##                     title max
## 1                  Avatar   5
## 2                    E.T.   3
## 3      Gone with the Wind   4
## 4 Raiders of the Lost Ark   4
## 5              Snow White   5
## 6      The Sound of Music   3

Problem 8

Problem: For each movie, return the title and the ‘rating spread’, that is, the difference between highest and lowest ratings given to that movie. Sort by rating spread from highest to lowest, then by movie title.

-- For each movie, return the title and the 'rating spread', that is, the difference between highest and lowest ratings given to that movie. Sort by rating spread from highest to lowest, then by movie title. 
SELECT title, max(stars)-min(stars) AS rating_spread
FROM Movie
JOIN Rating ON Rating.mID = Movie.mID
GROUP BY Rating.mID, title
ORDER BY rating_spread DESC, title ASC;
##                     title rating_spread
## 1                  Avatar             2
## 2      Gone with the Wind             2
## 3 Raiders of the Lost Ark             2
## 4                    E.T.             1
## 5              Snow White             1
## 6      The Sound of Music             1

Problem 9

Problem: Find the difference between the average rating of movies released before 1980 and the average rating of movies released after 1980. (Make sure to calculate the average rating for each movie, then the average of those averages for movies before 1980 and movies after. Don’t just calculate the overall average rating before and after 1980.)

-- Find the difference between the average rating of movies released before 1980 and the average rating of movies released after 1980. (Make sure to calculate the average rating for each movie, then the average of those averages for movies before 1980 and movies after. Don't just calculate the overall average rating before and after 1980.) 
SELECT AVG(average_b1980) - AVG(average_a1980) 
FROM (
SELECT AVG(stars) AS average_b1980
FROM Rating INNER JOIN Movie ON Movie.mID = Rating.mID WHERE year < 1980
GROUP BY Rating.mID) AS average_bd1980
), 
(
SELECT AVG(stars) AS average_a1980 FROM Rating INNER JOIN Movie ON Movie.mID = Rating.mID WHERE year > 1980
GROUP BY Rating.mID) AS average_bs1980
);
##     ?column?
## 1 0.05555556