Skip to main content

Grouping Data in SQL

When you work with a lot of data, you often want to combine rows that have the same values in certain columns and calculate something for each group.
In SQL, this is done with the GROUP BY clause.

Why Use GROUP BY?​

Imagine you have a sales table:

idproductcategoryquantityprice
1AppleFruit102.5
2OrangeFruit53.0
3CarrotVegetable71.5
4AppleFruit82.5

If you want total quantity sold for each category, you can group the rows by category.

GROUP BY Syntax​

SELECT column1, column2, ..., aggregate_function(column)
FROM table_name
GROUP BY column1, column2, ...;
  • aggregate_function: Functions that calculate a value for a group, such as:

    • COUNT() β†’ Counts rows
    • SUM() β†’ Adds values
    • AVG() β†’ Calculates average
    • MIN() β†’ Finds smallest value
    • MAX() β†’ Finds largest value

Example: GROUP BY with SUM​

SELECT category, SUM(quantity) AS total_quantity
FROM sales
GROUP BY category;

Result:

categorytotal_quantity
Fruit23
Vegetable7

How it works:

  1. SQL looks at the category column.
  2. Rows with the same category are grouped together.
  3. The SUM(quantity) is calculated for each group.

GROUP BY with Multiple Columns​

You can group by more than one column.

SELECT category, product, SUM(quantity) AS total_quantity
FROM sales
GROUP BY category, product;

Now each unique (category, product) pair is its own group.

Filtering Groups with HAVING​

WHERE filters rows before grouping. HAVING filters groups after grouping.

Example: Show only categories where total quantity > 10.

SELECT category, SUM(quantity) AS total_quantity
FROM sales
GROUP BY category
HAVING SUM(quantity) > 10;

Result:

categorytotal_quantity
Fruit23

Difference Between WHERE and HAVING​

ClauseFilters OnWorks With Aggregates?
WHEREIndividual rows❌ (no aggregates)
HAVINGGrouped resultsβœ… (with aggregates)

Common Aggregate Functions​

FunctionDescriptionExample
COUNT(*)Counts all rowsCOUNT(*)
SUM(column)Adds all valuesSUM(quantity)
AVG(column)Average of valuesAVG(price)
MIN(column)Minimum valueMIN(price)
MAX(column)Maximum valueMAX(price)