I’ll be loading data from a Postgres database that I created called Movies. I created my source .CSV file in MSExcel and created the Ratings table in Postgres using pgAdmin 4. I brought the data from the .CSV file into my Ratings table using pgAdmin’s import feature.
Here, I’ll connect to my PostgreSQL database and load the information from the Rating’s table into an R dataframe.
library(RPostgreSQL)
## Loading required package: DBI
library(TTR)
dbname <- "Movies"
dbuser <- "postgres"
dbpass <- "postgres"
dbhost <- "localhost"
dbport <- 5432
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, host=dbhost, port=dbport, dbname=dbname,user=dbuser, password=dbpass)
query <- dbSendQuery(con, query <- "SELECT * FROM public.ratings")
ratings <- fetch(query, n=-1)
ratings
## title
## 1 Spider-Man: Into the Spider-Verse
## 2 If Beale Street Could Talk
## 3 A Dog's Way Home
## 4 Escape Room
## 5 A Star is Born
## 6 Roma
## 7 Spider-Man: Into the Spider-Verse
## 8 If Beale Street Could Talk
## 9 A Dog's Way Home
## 10 Escape Room
## 11 A Star is Born
## 12 Roma
## 13 Spider-Man: Into the Spider-Verse
## 14 If Beale Street Could Talk
## 15 A Dog's Way Home
## 16 Escape Room
## 17 A Star is Born
## 18 Roma
## 19 Spider-Man: Into the Spider-Verse
## 20 If Beale Street Could Talk
## 21 A Dog's Way Home
## 22 Escape Room
## 23 A Star is Born
## 24 Roma
## 25 Spider-Man: Into the Spider-Verse
## 26 If Beale Street Could Talk
## 27 A Dog's Way Home
## 28 Escape Room
## 29 A Star is Born
## 30 Roma
## person rating
## 1 1 2
## 2 1 4
## 3 1 5
## 4 1 1
## 5 1 3
## 6 1 2
## 7 2 5
## 8 2 1
## 9 2 5
## 10 2 3
## 11 2 3
## 12 2 2
## 13 3 3
## 14 3 5
## 15 3 5
## 16 3 2
## 17 3 4
## 18 3 3
## 19 4 1
## 20 4 5
## 21 4 5
## 22 4 2
## 23 4 4
## 24 4 4
## 25 5 4
## 26 5 2
## 27 5 3
## 28 5 4
## 29 5 4
## 30 5 3
Next, I’ll query a table with the average rating for each movie.
query2 <- query2 <- dbSendQuery(con, query <- "SELECT title, AVG(CAST(rating AS FLOAT)) AS avg_score
FROM public.ratings
GROUP BY title;")
avgScores <- fetch(query2, n=-1)
avgScores
## title
## 1 A Star is Born
## 2 Roma
## 3 If Beale Street Could Talk
## 4 Escape Room
## 5 Spider-Man: Into the Spider-Verse
## 6 A Dog's Way Home
## avg_score
## 1 3.6
## 2 2.8
## 3 3.4
## 4 2.4
## 5 3.0
## 6 4.6
Finally, I’ll plot the average scores for each movies. ‘A Star is Born’ has the highest average rating while ‘If Beale Street Could Talk’ has the lowest for those who participated in my survey.
library(ggplot2)
p <- ggplot(data=avgScores, aes(x=title, y=avg_score)) + geom_bar(stat="identity")
p