This is a notebook that documents my first Individual Coding Exercise. This is the first time ever that I experiment with databases, so I expect some bumps, but I’m OK with it.
I start with this blog post comparing relational and non-relational databases. By reading it, I have the impression that the learning curve will be less steep with relational DB, so I choose to work with this type of databases.
Next, I gave this introductory DataCamp course a try. It really feels like another programming language, and the first few steps are quite intuitive.
SELECT, COUNT, and DISTINCT.WHERE.The concept of “wildcards” is quite interesting, plus the name is intriguing ;-)ORDER BY coming after FROMStatement of completion of course on Introduction to SQL.
In general, undertaking this first part of the assignment was easy, although I feel I made a mistake in terms of my learning: because I had a chunk of time available and was eager to understand the basics of SQL, I took the whole DataCamp course all at once, instead of practicing a consistently across a few days. I tend to prefer the latter approach, because I feel I consolidate knowledge better that way. Let’s wait and see.
Following the instructions in this tutorial, it was easy to create the database, although it seems that every step is a little risky, because the interface and set up feel a little “black-box-y.”
# Loading the libraries needed to manage the DB in R
library(DBI)
library(RMySQL)
# Storing credentials in variables
host <- "database-2.cluster-chhrqrpvglpz.us-east-1.rds.amazonaws.com"
port <- 3306
dbname <- "hudk4051"
user <- "renatorusso"
# Connecting to the database
my_db <- dbConnect(MySQL(), user = user, password = password, dbname = dbname, host = host, port = port)
# The first time that I "call" the database, the following 3 lines return empty values, obviously
my_db
## <MySQLConnection:0,0>
summary(my_db)
## <MySQLConnection:0,0>
## User: renatorusso
## Host: database-2.cluster-chhrqrpvglpz.us-east-1.rds.amazonaws.com
## Dbname: hudk4051
## Connection type: database-2.cluster-chhrqrpvglpz.us-east-1.rds.amazonaws.com via TCP/IP
##
## Results:
dbListTables(my_db)
## [1] "ACBQ_data"
#Loading data--for this exercise, I'm using data from Jolley et al (2021)[1] on beliefs in conspiracy theories. They made the results
# available through the [Open Science Framework](https://osf.io/7f3qc/)
ACBQ_data <- readxl::read_xlsx("ACBQ/Data_ACBQ.xlsx")
# Writing data to the DB using the DBI package. The line below is commented because I can't write the same file twice to the database using that
# function
# dbWriteTable(my_db,"ACBQ_data", ACBQ_data)
# Checking if the data has been written to the DB.
dbListTables(my_db)
## [1] "ACBQ_data"
# Read table in the
head(dbReadTable(my_db, "ACBQ_data"))
## ID ACBQ1 ACBQ2 ACBQ3 QCBQ4 ACBQ5 ACBQ6 ACBQ7 QCBQ8 ACBQ9 B1 B2 B3 B4 B5 B6 B7
## 1 1 5 5 6 2 2 2 1 4 5 5 5 5 2 1 5 2
## 2 2 6 5 6 5 2 2 1 4 4 4 5 1 3 3 2 2
## 3 3 4 1 1 1 2 1 1 1 1 3 1 1 2 3 1 1
## 4 4 7 7 5 5 3 2 1 6 6 3 6 5 2 4 5 7
## 5 5 5 5 6 2 2 2 2 4 4 4 4 2 2 5 3 7
## 6 6 7 6 5 6 6 6 4 5 5 6 6 4 5 4 5 5
## B8 B9 B10 B11 B12 B13 B14 B15 RW1 RW2 RW3 RW4 RW5 RW6 RW7 Age Gender
## 1 6 3 6 3 4 5 5 6 1 2 1 5 1 3 2 18 1
## 2 1 1 5 3 2 1 2 6 1 2 2 1 1 4 4 18 1
## 3 1 1 2 4 1 1 2 3 1 1 1 1 1 1 1 18 1
## 4 2 4 6 3 6 1 4 6 1 4 2 6 3 1 4 18 2
## 5 2 5 6 5 7 2 2 5 5 6 6 4 2 2 2 18 2
## 6 3 2 5 6 4 4 5 6 2 5 6 6 5 4 4 18 2
## EnglishLang BornUK ABCQ GeneralCT RealworldCT
## 1 1 1 3.555556 4.200000 2.142857
## 2 1 1 3.888889 2.733333 2.142857
## 3 1 1 1.444444 1.800000 1.000000
## 4 1 1 4.666667 4.266667 3.000000
## 5 1 1 3.555556 4.066667 3.857143
## 6 1 1 5.555556 4.666667 4.571429
# Selecting columns
dbGetQuery(my_db, "SELECT ACBQ1 FROM ACBQ_data WHERE gender = '1' LIMIT 10")
## ACBQ1
## 1 5
## 2 6
## 3 4
## 4 7
## 5 5
## 6 5
## 7 2
## 8 7
## 9 7
## 10 5
# Combining aggregate functions with WHERE
dbGetQuery(my_db, "SELECT ABCQ FROM ACBQ_data WHERE RealworldCT > 3 LIMIT 10")
## ABCQ
## 1 3.555556
## 2 5.555556
## 3 5.888889
## 4 5.444444
## 5 3.333333
## 6 5.888889
## 7 4.333333
## 8 5.222222
## 9 3.333333
## 10 3.888889
# [1] Jolley, D., Douglas, K., M, Skipper,Y,Thomas, E., &Cookson, D.(2021). Measuring adolescents’ beliefs in conspiracy theories: Development and validation of the Adolescent Conspiracy Beliefs Questionnaire (ACBQ). British Journal of Developmental Psychology.