SQL Queries Demo: date formats and modifications

The document is created in Quarto accessed from RStudio.

1. Loading libraries

library(tidyverse) 
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(DBI) 
library(RPostgres) 
library(dbplyr)

Attaching package: 'dbplyr'

The following objects are masked from 'package:dplyr':

    ident, sql

2. Establishing a database connection

con <- DBI::dbConnect(   
  RPostgres::Postgres(),   
  dbname = 'postgres',   
  host = 'localhost',   
  port = 5432,   
  user = 'postgres',   
  password = 'abcd'  )

3. Creating tables

The clause UNIQUE is used with columns that function as logical keys (strings to be indexed).

The expression ON DELETE CASCADE indicates that deleting a row in a table including the referenced primary key (a parent row) will remove corresponding rows in the table that is connected to it with a foreign key (child rows).


CREATE TABLE account (
id SERIAL,
email VARCHAR(128) UNIQUE,
created_at DATE NOT NULL DEFAULT NOW(),
updated_at DATE NOT NULL DEFAULT NOW (),
PRIMARY KEY(id)
);

CREATE TABLE post ( 
id SERIAL, 
title VARCHAR(128) UNIQUE NOT NULL, 
content VARCHAR(1024), -- will extenf with ALTER,
account_id INTEGER REFERENCES account(id) ON DELETE CASCADE,
created_at TIMESTAMP NOT NULL DEFAULT NOW(), 
updated_at TIMESTAMP NOT NULL DEFAULT NOW (), 
PRIMARY KEY(id) 
);

CREATE TABLE comment ( 
id SERIAL, 
content TEXT NOT NULL, 
account_id INTEGER REFERENCES account(id) ON DELETE CASCADE,
post_id INTEGER REFERENCES post(id) ON DELETE CASCADE,
created_at TIMESTAMP NOT NULL DEFAULT NOW(), 
updated_at TIMESTAMP NOT NULL DEFAULT NOW (), 
PRIMARY KEY(id)
);

CREATE TABLE fav ( 
id SERIAL, 
contenxx TEXT NOT NULL, --typo, will remove later with ALTER 
post_id INTEGER REFERENCES post(id) ON DELETE CASCADE,
account_id INTEGER REFERENCES account(id) ON DELETE CASCADE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), 
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW (), 
UNIQUE(post_id, account_id),
PRIMARY KEY(id)
);

4. Altering table schema

Deleting a column that was added by mistake in “fav” table


ALTER TABLE fav DROP COLUMN contenxx;

Adding a new column to “fav” table


ALTER TABLE fav ADD COLUMN rating INTEGER;

Changing variable type from VARCHAR to TEXT in “post” table


ALTER TABLE post ALTER COLUMN content TYPE TEXT;

5. Inserting data into tables


INSERT INTO account(email) VALUES 
('ed@umich.edu'), ('sue@umich.edu'), ('sally@umich.edu');

INSERT INTO post (title, content, account_id) VALUES
( 'Dictionaries', 'Are fun', 3),  -- sally@umich.edu
( 'BeautifulSoup', 'Has a complex API', 1), -- ed@mich.edu
( 'Many to Many', 'Is elegant', (SELECT id FROM account WHERE email='sue@umich.edu' ));

INSERT INTO comment (content, post_id, account_id) VALUES
( 'I agree', 1, 1), -- dict / ed
( 'Especially for counting', 1, 2), -- dict / sue
( 'And I don''t understand why', 2, 2), -- dict / sue
( 'Someone should make "EasySoup" or something like that', 
    (SELECT id FROM post WHERE title='BeautifulSoup'),
    (SELECT id FROM account WHERE email='ed@umich.edu' )),
( 'Good idea - I might just do that', 
    (SELECT id FROM post WHERE title='BeautifulSoup'),
    (SELECT id FROM account WHERE email='sally@umich.edu' ))
;

6. Modifying timestamps

Now() function gives the current time:


SELECT NOW()
1 records
now
2024-10-03 13:22:06

Specifying a timezone


SELECT NOW() AT TIME ZONE 'UTC';
1 records
timezone
2024-10-03 13:22:06

Casting to different types


SELECT NOW(), 
CAST(NOW() AS TIME);
1 records
now now
2024-10-03 13:22:06 13:22:06.986994

To obtain the same result:


SELECT NOW(), 
NOW()::time;
1 records
now now
2024-10-03 13:22:06 13:22:06.996632

Date interval arithmetic


SELECT NOW() - INTERVAL '1 week';
1 records
?column?
2024-09-26 13:22:07

Getting the interval in a date format:


SELECT (NOW() - INTERVAL '11 days')::date;
1 records
date
2024-09-22

Discarding some of the accuracy of the timestamp


SELECT NOW(), DATE_TRUNC('day', NOW());
1 records
now date_trunc
2024-10-03 13:22:07 2024-10-03

Using DATE_TRUNC() with a SELECT statement


SELECT id, content, created_at FROM comment
WHERE created_at >= DATE_TRUNC('day', NOW())
AND created_at < DATE_TRUNC('day', NOW() + INTERVAL '1 day');
5 records
id content created_at
1 I agree 2024-10-03 13:22:06
2 Especially for counting 2024-10-03 13:22:06
3 And I don’t understand why 2024-10-03 13:22:06
4 Someone should make “EasySoup” or something like that 2024-10-03 13:22:06
5 Good idea - I might just do that 2024-10-03 13:22:06

The following query gives the same result as the previous one with DATE_TRUNC() but is slower as it requires a full table scan.


SELECT id, content, created_at FROM comment
WHERE created_at::date >=NOW()::date
AND created_at::date < (NOW()+ INTERVAL '1day');
5 records
id content created_at
1 I agree 2024-10-03 13:22:06
2 Especially for counting 2024-10-03 13:22:06
3 And I don’t understand why 2024-10-03 13:22:06
4 Someone should make “EasySoup” or something like that 2024-10-03 13:22:06
5 Good idea - I might just do that 2024-10-03 13:22:06

Using DATE_TRUNC is preferable performance-wise - we avoid a full table scan.