๐Ÿ—„๏ธ
Database Language

SQL

10 questions across 2 experience levels

Answer

SQL JOINs combine rows from two or more tables based on a related column. INNER JOIN: returns only rows where there is a match in BOTH tables. LEFT JOIN (LEFT OUTER): returns all rows from the left table, and matched rows from the right (NULLs where no match). RIGHT JOIN: opposite of LEFT JOIN. FULL OUTER JOIN: returns all rows from both tables, NULLs where no match. CROSS JOIN: every row of left multiplied by every row of right (Cartesian product).

Code Example
-- Sample tables: Customers (Id, Name) and Orders (Id, CustomerId, Amount)

-- INNER JOIN โ€” only customers who have orders
SELECT c.Name, o.Amount
FROM Customers c
INNER JOIN Orders o ON c.Id = o.CustomerId;

-- LEFT JOIN โ€” ALL customers, NULLs for those with no orders
SELECT c.Name, o.Amount
FROM Customers c
LEFT JOIN Orders o ON c.Id = o.CustomerId;

-- Find customers who have NO orders (anti-join pattern)
SELECT c.Name
FROM Customers c
LEFT JOIN Orders o ON c.Id = o.CustomerId
WHERE o.CustomerId IS NULL;

-- FULL OUTER JOIN โ€” all customers AND all orders (with NULLs)
SELECT c.Name, o.Amount
FROM Customers c
FULL OUTER JOIN Orders o ON c.Id = o.CustomerId;
๐Ÿ’ก Simple Analogy

Imagine two lists: 'Employees' and 'Departments'. INNER JOIN: only show employees who HAVE a department assigned โ€” no department, no row. LEFT JOIN: show ALL employees, and their department if they have one (NULL if they don't). Think LEFT = 'keep everything on the left side, match the right where possible'.

Answer

WHERE filters rows BEFORE grouping โ€” it applies to individual rows and cannot use aggregate functions (SUM, COUNT, AVG). HAVING filters groups AFTER the GROUP BY clause โ€” it can use aggregate functions. Rule of thumb: WHERE filters individual rows; HAVING filters the grouped results. You can use both in the same query.

Code Example
-- WHERE: filter rows before grouping
-- Only look at orders from 2024
SELECT CustomerId, COUNT(*) as OrderCount, SUM(Amount) as Total
FROM Orders
WHERE OrderDate >= '2024-01-01'          -- rows filtered here
GROUP BY CustomerId
HAVING COUNT(*) >= 5                     -- groups filtered here
ORDER BY Total DESC;

-- โŒ Wrong โ€” can't use aggregate in WHERE:
-- WHERE COUNT(*) > 5   -- Error!

-- โœ… Correct โ€” use HAVING for aggregates:
-- HAVING COUNT(*) > 5

-- Combined example:
SELECT Department, AVG(Salary) as AvgSalary
FROM Employees
WHERE IsActive = 1                       -- filter: only active
GROUP BY Department
HAVING AVG(Salary) > 60000              -- filter: only high-paying depts
ORDER BY AvgSalary DESC;
๐Ÿ’ก Simple Analogy

WHERE is the bouncer at the door โ€” it decides which INDIVIDUAL PEOPLE get into the party before the night begins. HAVING is the manager at closing time โ€” they look at the groups that formed and decide which GROUP gets to stay based on how many people are in it (an aggregate). WHERE can't count group members; HAVING can.

Answer

An index is a data structure (B-tree by default) that allows the database engine to find rows without scanning the entire table. Without an index, a query does a 'full table scan' (reads every row). With an index on the searched column, it's like looking up a book in a library index โ€” jump directly to the data. Indexes speed up reads but slightly slow down writes (INSERT/UPDATE/DELETE must also update the index). Always index: foreign keys, columns used in WHERE/JOIN/ORDER BY.

Code Example
-- Create a simple index on a frequently queried column
CREATE INDEX IX_Orders_CustomerId ON Orders(CustomerId);

-- Composite index โ€” good for queries filtering on both columns
CREATE INDEX IX_Orders_Status_Date ON Orders(Status, OrderDate DESC);

-- Unique index โ€” enforces uniqueness AND provides index benefit
CREATE UNIQUE INDEX UX_Users_Email ON Users(Email);

-- Check if a query uses an index (SQL Server)
SET STATISTICS IO ON;
SELECT * FROM Orders WHERE CustomerId = 42;
-- Look for 'Index Seek' (good) vs 'Table Scan' (no index used)

-- An index helps this:
SELECT * FROM Orders WHERE CustomerId = 42;
-- An index does NOT help this (leading wildcard):
SELECT * FROM Users WHERE Email LIKE '%@gmail.com'; -- full scan
๐Ÿ’ก Simple Analogy

Imagine a phone book (table) with 1 million entries. Without an index: you read every single entry to find 'John Smith' โ€” that's a table scan. WITH an index on LastName: you flip to 'S', jump to 'Smith', done in seconds. The index is a sorted shortcut list that points to the real data. Cost: the index takes up space and needs updating on every insert.

Answer

Aggregate functions perform calculations on a set of rows and return a single value per group. Common functions: COUNT(*) counts rows, SUM(col) totals a numeric column, AVG(col) calculates the average, MIN/MAX returns the smallest/largest value. GROUP BY divides result rows into groups based on one or more columns โ€” the aggregate is applied to each group separately. Every column in SELECT must be either in GROUP BY or wrapped in an aggregate function.

Code Example
-- Basic aggregates
SELECT 
    COUNT(*) AS TotalOrders,
    SUM(Amount) AS Revenue,
    AVG(Amount) AS AvgOrderValue,
    MAX(Amount) AS LargestOrder,
    MIN(Amount) AS SmallestOrder
FROM Orders
WHERE OrderDate >= '2024-01-01';

-- GROUP BY โ€” aggregates per group
SELECT 
    YEAR(OrderDate) AS Year,
    MONTH(OrderDate) AS Month,
    Region,
    COUNT(*) AS OrderCount,
    SUM(Amount) AS MonthlyRevenue
FROM Orders
GROUP BY YEAR(OrderDate), MONTH(OrderDate), Region
ORDER BY Year, Month, MonthlyRevenue DESC;

-- COUNT(*) vs COUNT(column): COUNT(*) counts all rows;
-- COUNT(column) ignores NULLs in that column
SELECT COUNT(*) AS AllRows, COUNT(Email) AS WithEmail
FROM Users;
๐Ÿ’ก Simple Analogy

GROUP BY is like sorting your receipts into separate envelopes by month. Once sorted, you can count the receipts per envelope (COUNT), add up the totals (SUM), or find the biggest purchase (MAX). Without GROUP BY, aggregate functions give you ONE number for the whole table. With GROUP BY, you get one number per group โ€” per month, per region, per product.

Answer

NULL in SQL represents the absence of a value (unknown). NULL is NOT equal to anything โ€” not even NULL. SQL uses three-valued logic (TRUE, FALSE, UNKNOWN). Any comparison with NULL (=, <>, <, >) returns UNKNOWN, not TRUE or FALSE. Use IS NULL or IS NOT NULL to check for NULLs. Functions like COALESCE, ISNULL, NULLIF help handle NULLs safely.

Code Example
-- NULL comparisons (all return UNKNOWN, not TRUE/FALSE)
SELECT * FROM Users WHERE Phone = NULL;    -- โŒ returns 0 rows
SELECT * FROM Users WHERE Phone != NULL;   -- โŒ returns 0 rows

-- โœ… Correct way to check for NULL
SELECT * FROM Users WHERE Phone IS NULL;
SELECT * FROM Users WHERE Phone IS NOT NULL;

-- COALESCE โ€” returns first non-NULL value
SELECT Name,
    COALESCE(Phone, MobilePhone, 'No phone') AS ContactNumber
FROM Users;

-- ISNULL (SQL Server) โ€” replaces NULL with a default
SELECT Name, ISNULL(Discount, 0) AS Discount FROM Orders;

-- NULLIF โ€” returns NULL if two values are equal (avoid divide by zero)
SELECT Revenue / NULLIF(OrderCount, 0) AS AvgRevenue FROM Summary;

-- NULL in aggregates: SUM/AVG/MAX/MIN ignore NULLs; COUNT(*) counts them
SELECT COUNT(*), COUNT(Phone), SUM(Amount) FROM Users; -- test it!
๐Ÿ’ก Simple Analogy

NULL means 'I don't know'. Is NULL = NULL? SQL says: 'I don't know if two unknowns are the same.' So WHERE Phone = NULL NEVER matches โ€” use WHERE Phone IS NULL. COALESCE is your 'use this if that is missing' tool: COALESCE(BillingPhone, HomePhone, MobilePhone) returns the first non-NULL. Always NULLIF before dividing: 5 / 0 crashes; 5 / NULLIF(0,0) returns NULL safely.