Grouping Data in SQL
When you work with a lot of data, you often want to combine rows that have the same values in certain columns and calculate something for each group.
In SQL, this is done with the GROUP BY clause.
Why Use GROUP BY?β
Imagine you have a sales table:
| id | product | category | quantity | price |
|---|---|---|---|---|
| 1 | Apple | Fruit | 10 | 2.5 |
| 2 | Orange | Fruit | 5 | 3.0 |
| 3 | Carrot | Vegetable | 7 | 1.5 |
| 4 | Apple | Fruit | 8 | 2.5 |
If you want total quantity sold for each category, you can group the rows by category.
GROUP BY Syntaxβ
SELECT column1, column2, ..., aggregate_function(column)
FROM table_name
GROUP BY column1, column2, ...;
-
aggregate_function: Functions that calculate a value for a group, such as:COUNT()β Counts rowsSUM()β Adds valuesAVG()β Calculates averageMIN()β Finds smallest valueMAX()β Finds largest value
Example: GROUP BY with SUMβ
SELECT category, SUM(quantity) AS total_quantity
FROM sales
GROUP BY category;
Result:
| category | total_quantity |
|---|---|
| Fruit | 23 |
| Vegetable | 7 |
How it works:
- SQL looks at the
categorycolumn. - Rows with the same category are grouped together.
- The
SUM(quantity)is calculated for each group.
GROUP BY with Multiple Columnsβ
You can group by more than one column.
SELECT category, product, SUM(quantity) AS total_quantity
FROM sales
GROUP BY category, product;
Now each unique (category, product) pair is its own group.
Filtering Groups with HAVINGβ
WHERE filters rows before grouping.
HAVING filters groups after grouping.
Example: Show only categories where total quantity > 10.
SELECT category, SUM(quantity) AS total_quantity
FROM sales
GROUP BY category
HAVING SUM(quantity) > 10;
Result:
| category | total_quantity |
|---|---|
| Fruit | 23 |
Difference Between WHERE and HAVINGβ
| Clause | Filters On | Works With Aggregates? |
|---|---|---|
| WHERE | Individual rows | β (no aggregates) |
| HAVING | Grouped results | β (with aggregates) |
Common Aggregate Functionsβ
| Function | Description | Example |
|---|---|---|
| COUNT(*) | Counts all rows | COUNT(*) |
| SUM(column) | Adds all values | SUM(quantity) |
| AVG(column) | Average of values | AVG(price) |
| MIN(column) | Minimum value | MIN(price) |
| MAX(column) | Maximum value | MAX(price) |