IC/CP-W04 BigQuery SQL Lecture 1

Author

Matthew Prado

1 SELECT / FROM / LIMIT

What it does: Picks the columns event_date and event_name from the GA4 events tables and returns a small sample (20 rows) to preview the data.

SELECT event_date, event_name FROMbigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*` LIMIT 20;

2 WHERE filters rows

What it does: Filters the table to only purchase events and shows which pseudo-user produced each purchase event (up to 50 rows).

SELECT event_date, event_name, user_pseudo_id FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_* WHERE event_name = 'purchase' LIMIT 50;

3 ORDER BY sorts results

What it does: Shows purchase events sorted newest-to-oldest using event_timestamp.

SELECT event_date, event_timestamp, event_name FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_* WHERE event_name = 'purchase' ORDER BY event_timestamp DESC LIMIT 20;

SELECT TIMESTAMP_MICROS(event_timestamp) AS event_time, event_name FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_* ORDER BY event_timestamp DESC LIMIT 10;

4 COUNT, SUM, AVG, MIN, MAX

What they do: summarize many rows into a few values.

4.1 Count total events in a date range

SELECT COUNT(*) AS event_count FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_* WHERE _TABLE_SUFFIX BETWEEN '20201201' AND '20201231';

4.2 Count purchases in a date range

SELECT COUNT(*) AS purchase_events FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_* WHERE event_name = 'purchase' AND _TABLE_SUFFIX BETWEEN '20201201' AND '20201231';

5 GROUP BY controls the grain

What it does: defines what one row in your result represents (e.g., per day, per event_name, per user).

5.1 Events by day

SELECT event_date, COUNT(*) AS events FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_* WHERE _TABLE_SUFFIX BETWEEN '20201201' AND '20201231' GROUP BY event_date ORDER BY event_date;

5.2 Top event types

SELECT event_name, COUNT(*) AS events FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_* WHERE _TABLE_SUFFIX BETWEEN '20201201' AND '20201231' GROUP BY event_name ORDER BY events DESC LIMIT 15;

6 HAVING filters after grouping

What it does: filters aggregated results (after GROUP BY).

SELECT event_name, COUNT(*) AS events FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_* WHERE _TABLE_SUFFIX BETWEEN '20201201' AND '20201231' GROUP BY event_name HAVING COUNT(*) >= 50000 ORDER BY events DESC;

7 Wildcard tables + _TABLE_SUFFIX

What it does: GA4 exports are daily tables, queires many days, and limits which days you scan (faster/cheaper)

SELECT event_date, COUNT(*) AS events FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_* WHERE _TABLE_SUFFIX BETWEEN '20201201' AND '20201207' GROUP BY event_date ORDER BY event_dat;