This is a notebook that documents my first Individual Coding Exercise. This is the first time ever that I experiment with databases, so I expect some bumps, but I’m OK with it.

Part 1

I start with this blog post comparing relational and non-relational databases. By reading it, I have the impression that the learning curve will be less steep with relational DB, so I choose to work with this type of databases.

Part 2: “Intro to SQL” DataCamp course

Next, I gave this introductory DataCamp course a try. It really feels like another programming language, and the first few steps are quite intuitive.

Statement of completion of course on Introduction to SQL.

In general, undertaking this first part of the assignment was easy, although I feel I made a mistake in terms of my learning: because I had a chunk of time available and was eager to understand the basics of SQL, I took the whole DataCamp course all at once, instead of practicing a consistently across a few days. I tend to prefer the latter approach, because I feel I consolidate knowledge better that way. Let’s wait and see.

Part 3: creating an AWS Database

Following the instructions in this tutorial, it was easy to create the database, although it seems that every step is a little risky, because the interface and set up feel a little “black-box-y.”

# Loading the libraries needed to manage the DB in R
library(DBI)
library(RMySQL)

# Storing credentials in variables
host <- "database-2.cluster-chhrqrpvglpz.us-east-1.rds.amazonaws.com"
port <- 3306
dbname <- "hudk4051"
user <- "renatorusso"
# Connecting to the database
my_db <- dbConnect(MySQL(), user = user, password = password, dbname = dbname, host = host, port = port)

# The first time that I "call" the database, the following 3 lines return empty values, obviously
my_db
## <MySQLConnection:0,0>
summary(my_db)
## <MySQLConnection:0,0>
##   User:   renatorusso 
##   Host:   database-2.cluster-chhrqrpvglpz.us-east-1.rds.amazonaws.com 
##   Dbname: hudk4051 
##   Connection type: database-2.cluster-chhrqrpvglpz.us-east-1.rds.amazonaws.com via TCP/IP 
## 
## Results:
dbListTables(my_db)
## [1] "ACBQ_data"
#Loading data--for this exercise, I'm using data from Jolley et al (2021)[1] on beliefs in conspiracy theories. They made the results
# available through the [Open Science Framework](https://osf.io/7f3qc/)
ACBQ_data <- readxl::read_xlsx("ACBQ/Data_ACBQ.xlsx")

# Writing data to the DB using the DBI package. The line below is commented because I can't write the same file twice to the database using that
# function
# dbWriteTable(my_db,"ACBQ_data", ACBQ_data)

# Checking if the data has been written to the DB.
dbListTables(my_db)
## [1] "ACBQ_data"
# Read table in the 
head(dbReadTable(my_db, "ACBQ_data"))
##   ID ACBQ1 ACBQ2 ACBQ3 QCBQ4 ACBQ5 ACBQ6 ACBQ7 QCBQ8 ACBQ9 B1 B2 B3 B4 B5 B6 B7
## 1  1     5     5     6     2     2     2     1     4     5  5  5  5  2  1  5  2
## 2  2     6     5     6     5     2     2     1     4     4  4  5  1  3  3  2  2
## 3  3     4     1     1     1     2     1     1     1     1  3  1  1  2  3  1  1
## 4  4     7     7     5     5     3     2     1     6     6  3  6  5  2  4  5  7
## 5  5     5     5     6     2     2     2     2     4     4  4  4  2  2  5  3  7
## 6  6     7     6     5     6     6     6     4     5     5  6  6  4  5  4  5  5
##   B8 B9 B10 B11 B12 B13 B14 B15 RW1 RW2 RW3 RW4 RW5 RW6 RW7 Age Gender
## 1  6  3   6   3   4   5   5   6   1   2   1   5   1   3   2  18      1
## 2  1  1   5   3   2   1   2   6   1   2   2   1   1   4   4  18      1
## 3  1  1   2   4   1   1   2   3   1   1   1   1   1   1   1  18      1
## 4  2  4   6   3   6   1   4   6   1   4   2   6   3   1   4  18      2
## 5  2  5   6   5   7   2   2   5   5   6   6   4   2   2   2  18      2
## 6  3  2   5   6   4   4   5   6   2   5   6   6   5   4   4  18      2
##   EnglishLang BornUK     ABCQ GeneralCT RealworldCT
## 1           1      1 3.555556  4.200000    2.142857
## 2           1      1 3.888889  2.733333    2.142857
## 3           1      1 1.444444  1.800000    1.000000
## 4           1      1 4.666667  4.266667    3.000000
## 5           1      1 3.555556  4.066667    3.857143
## 6           1      1 5.555556  4.666667    4.571429
# Selecting columns
dbGetQuery(my_db, "SELECT ACBQ1 FROM ACBQ_data WHERE gender = '1' LIMIT 10")
##    ACBQ1
## 1      5
## 2      6
## 3      4
## 4      7
## 5      5
## 6      5
## 7      2
## 8      7
## 9      7
## 10     5
# Combining aggregate functions with WHERE
dbGetQuery(my_db, "SELECT ABCQ FROM ACBQ_data WHERE RealworldCT > 3 LIMIT 10")
##        ABCQ
## 1  3.555556
## 2  5.555556
## 3  5.888889
## 4  5.444444
## 5  3.333333
## 6  5.888889
## 7  4.333333
## 8  5.222222
## 9  3.333333
## 10 3.888889
# [1] Jolley, D., Douglas, K., M, Skipper,Y,Thomas, E., &Cookson, D.(2021). Measuring adolescents’ beliefs in conspiracy theories: Development and validation of the Adolescent Conspiracy Beliefs Questionnaire (ACBQ). British Journal of Developmental Psychology.