1. Basic Aggregation & Grouping
Scenario:
- You have a
salestable containing daily sales records. The table has the following structure:
CREATE TABLE sales (
sale_id INT,
sale_date DATE,
amount DECIMAL(10,2),
region VARCHAR(50)
);
INSERT INTO sales (sale_id, sale_date, amount, region) VALUES
(1, '2025-01-15', 100.00, 'North'),
(2, '2025-02-10', 150.00, 'South'),
(3, '2025-03-05', 200.00, 'East'),
(4, '2025-04-25', 250.00, 'North'),
(5, '2025-05-30', 300.00, 'South'),
(6, '2025-06-15', 350.00, 'East'),
(7, '2025-07-01', 400.00, 'North');Task:
- Assuming today is July 2025, write a SQL query that computes the total sales per region for the last quarter (April 1, 2025 to June 30, 2025).
Results:
- Please click the link to see the results: link
2. Joining Data & Aggregation
Scenario:
- There are two tables: orders and customers.
CREATE TABLE orders (
order_id INT,
customer_id INT,
order_date DATE,
order_total DECIMAL(10,2)
);
CREATE TABLE customers (
customer_id INT,
customer_name VARCHAR(100)
);
INSERT INTO customers (customer_id, customer_name) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Carol'),
(4, 'Dave');
INSERT INTO orders (order_id, customer_id, order_date, order_total) VALUES
(101, 1, '2025-03-10', 250.00),
(102, 2, '2025-04-15', 300.00),
(103, 1, '2025-05-20', 450.00),
(104, 3, '2025-06-05', 500.00),
(105, 4, '2025-07-02', 150.00),
(106, 2, '2025-07-03', 200.00);Task:
- Write a query that retrieves the top 5 customers by total order amount for the current year (2025). The candidate should join the two tables, group by customer, and sort the results by the total amount.
Results:
- Please click the link to see the results: link
3. Window Functions
Scenario:
- You have a transactions table that logs every transaction by a customer.
CREATE TABLE transactions (
transaction_id INT,
customer_id INT,
transaction_date DATE,
amount DECIMAL(10,2)
);
INSERT INTO transactions (transaction_id, customer_id, transaction_date, amount) VALUES
(1, 1, '2025-01-01', 100.00),
(2, 1, '2025-01-15', 150.00),
(3, 2, '2025-02-10', 200.00),
(4, 1, '2025-02-20', 50.00),
(5, 2, '2025-03-05', 300.00);Task:
- Write a SQL query that calculates the running total of transactions for each customer, ordered by transaction_date.
Results:
- Please click the link to see the results: link
4. Subqueries & Filtering
Scenario:
- Using the orders table from Scenario 2, you need to identify customers whose total orders exceed the average order total across all orders.
Task:
- Write a query that:
- Calculates the average order total across all records.
- Returns the list of customer_ids (or customer names if joined with the customers table) that have a total order amount above this average.
Results:
- Please click the link to see the results: link
5. Date and Time Functions
Scenario:
- You have the same sales table as in Scenario 1.
Task:
- Write a query that:
- Extracts the month and year from the sale_date.
- Groups the sales by month and year.
- Calculates the total sales for each month.
Results:
- Please click the link to see the results: link
6. Pivoting Data
Scenario:
- Imagine a sales table that, in addition to the sales date and amount, includes a product_category.
CREATE TABLE sales (
sale_date DATE,
product_category VARCHAR(50),
sale_amount DECIMAL(10,2)
);
INSERT INTO sales (sale_date, product_category, sale_amount) VALUES
('2025-01-10', 'Electronics', 500.00),
('2025-01-15', 'Clothing', 200.00),
('2025-02-20', 'Electronics', 700.00),
('2025-02-25', 'Clothing', 300.00),
('2025-02-28', 'Furniture', 400.00);Task:
- Write a query to pivot the data so that each product category becomes a column, showing the total monthly sales for each month.
Results:
- Please click the link to see the results: link
7. Comparative Analysis
Scenario:
- Using the sales table (from Scenario 1 or a similar table), you need to compare this month’s sales with the previous month’s sales.
Task:
- Write a query that:
- Calculates the total sales for the current month (e.g., July 2025) and the previous month (June 2025).
- Computes the percentage growth or decline from June to July.
Results:
- Please click the link to see the results: link
8. Handling Nulls and Data Quality
Scenario:
- You have an orders table where some order_total values might be NULL due to data entry issues.
CREATE TABLE orders (
order_id INT,
customer_id INT,
order_date DATE,
order_total DECIMAL(10,2)
);
INSERT INTO orders (order_id, customer_id, order_date, order_total) VALUES
(201, 1, '2025-04-10', 300.00),
(202, 2, '2025-04-15', NULL),
(203, 3, '2025-04-20', 450.00),
(204, 1, '2025-04-25', NULL);Task:
- Write a query that calculates the average order total while handling NULL values. For instance, use the COALESCE() function to treat NULL as 0 (or exclude them as needed) during the calculation.
Results:
- Please click the link to see the results: link
9. Using CTEs for Complex Queries
Scenario:
- Assume you have a daily_sales table with daily revenue.
CREATE TABLE daily_sales (
sale_date DATE,
revenue DECIMAL(10,2)
);
INSERT INTO daily_sales (sale_date, revenue) VALUES
('2025-06-01', 1000.00),
('2025-06-02', 1200.00),
('2025-06-03', 1100.00),
('2025-06-07', 1300.00),
('2025-06-08', 1400.00);Task:
- Using a Common Table Expression (CTE), write a query that:
- Aggregates daily sales into weekly totals.
- Calculates the week-over-week percentage change in revenue.
Results:
- Please click the link to see the results: link
10. Performance Optimization
Scenario:
- Imagine you have an orders table containing millions of rows. Frequent queries filter on order_date and customer_id, and you’ve noticed that some queries run slowly.
Task:
Indexing:
- Explain which columns you would index and why.
- (For example, indexing order_date can speed up date-range queries, and indexing customer_id may help with join and filtering operations.)
Query Analysis:
- Describe how you would use an EXPLAIN plan (or similar tool) to analyze and optimize a query.
- (No need to write a full query; the candidate should discuss optimization strategies such as avoiding full table scans, using covering indexes, etc.)
Answer
If queries filter by order_date and customer_id, I’d index both:
- order_date: Speeds up date-range queries.
- customer_id: Helps with filtering and joins.
- Composite Index: (customer_id, order_date) - Optimizes queries using both.
- To check efficiency, I’d run EXPLAIN to see if indexes are used or if the query is doing a full table scan. If performance is poor, I’d adjust indexes or rewrite queries (e.g., avoid functions on indexed columns).
Indexes improve read speed but slow down inserts, so I’d only index what’s necessary.