Difference Between HAVING and GROUP BY in SQL
📙 Welcome to HAVING vs GROUP BY!
Hey there, SQL beginner! If you’ve ever wondered how to group data and filter those groups in SQL, you’ve likely come across GROUP BY and HAVING. These clauses are powerful tools for summarizing and filtering data, but they serve different purposes and are often confused. Using a simple students table (with columns id, name, age, marks, and city), we’ll break down their differences, show how they work together, provide a handy comparison table, and include clear examples to make you a pro. Let’s dive in!
📘 What Are GROUP BY and HAVING?
- GROUP BY: Organizes rows into groups based on one or more columns and is typically used with aggregate functions (e.g.,
COUNT,AVG,SUM) to summarize data within each group. - HAVING: Filters the grouped results based on conditions involving aggregate functions, acting like a
WHEREclause but for groups rather than individual rows.
Think of GROUP BY as sorting your data into buckets (e.g., grouping students by city), and HAVING as deciding which buckets to keep (e.g., only cities with an average mark above 80). They’re often used together in SQL queries, but they have distinct roles and rules.
Pro Tip: Always write
GROUP BYbeforeHAVINGin a query, as SQL processesGROUP BYfirst to create groups, then appliesHAVINGto filter them!
📘 Detailed Differences Between GROUP BY and HAVING
To understand when and how to use GROUP BY and HAVING, let’s explore their differences in detail, followed by a comparison table summarizing the key points.
1. Purpose
- GROUP BY:
- Groups rows with identical values in specified columns into summary rows.
- Used to aggregate data (e.g., calculate averages, counts) within each group.
- Example: Group students by
cityto find the average marks per city.
- HAVING:
- Filters the groups created by
GROUP BYbased on conditions involving aggregate functions. - Acts like a gatekeeper, keeping only the groups that meet the condition.
- Example: Keep only cities where the average marks are above 80.
- Filters the groups created by
2. What They Operate On
- GROUP BY:
- Operates on individual rows to organize them into groups.
- Works with raw column values (e.g.,
city,age) to define groups. - Must be used with aggregate functions (e.g.,
AVG,COUNT) in theSELECTclause for meaningful results.
- HAVING:
- Operates on the grouped results after
GROUP BYis applied. - Works with aggregate functions (e.g.,
AVG(marks),COUNT(id)) to filter groups. - Cannot reference non-aggregated columns unless they’re in the
GROUP BYclause.
- Operates on the grouped results after
3. Position in Query
- GROUP BY:
- Appears after the
FROMandWHEREclauses in a SQL query. - Precedes
HAVINGin both syntax and execution order. - Syntax order:
SELECT→FROM→WHERE→GROUP BY→HAVING→ORDER BY→LIMIT.
- Appears after the
- HAVING:
- Appears immediately after
GROUP BYin a query. - Applied after groups are formed, filtering the aggregated results.
- Cannot be used without
GROUP BYin standard SQL, as it relies on grouped data.
- Appears immediately after
4. Conditions They Support
- GROUP BY:
- Doesn’t support conditions directly; it defines how rows are grouped.
- Example:
GROUP BY citygroups all rows by unique city values.
- HAVING:
- Supports conditions using aggregate functions (e.g.,
AVG(marks) > 80). - Can also include non-aggregated columns if they’re part of the
GROUP BYclause (e.g.,city = 'Mumbai'). - Example:
HAVING AVG(marks) > 80keeps groups with high average marks.
- Supports conditions using aggregate functions (e.g.,
5. Comparison with WHERE
- GROUP BY:
- Works with
WHEREto filter individual rows before grouping. - Example: Use
WHERE age > 18to filter students before grouping by city.
- Works with
- HAVING:
- Acts like
WHEREbut for groups, applied afterGROUP BY. - Cannot use with non-aggregated data unless grouped, unlike
WHERE. - Example: Use
HAVING COUNT(id) > 2to keep groups with more than two students.
- Acts like
6. Execution Order
- GROUP BY:
- Executed after
FROMandWHERE, grouping rows based on specified columns. - Part of the query execution pipeline:
FROM→WHERE→GROUP BY→HAVING→SELECT→ORDER BY→LIMIT.
- Executed after
- HAVING:
- Executed after
GROUP BY, filtering the grouped results. - Only processes the aggregated data produced by
GROUP BY.
- Executed after
7. Use Cases
- GROUP BY:
- Summarizing data (e.g., average marks per city).
- Creating reports with aggregated metrics (e.g., total students per age group).
- Preparing data for further filtering with
HAVING.
- HAVING:
- Filtering groups based on aggregates (e.g., cities with high average marks).
- Refining reports to show only relevant groups (e.g., groups with more than one student).
- Combining with
GROUP BYfor advanced analysis.
8. As of 2025
- Modern DBMS (e.g., SQL Server 2025, PostgreSQL 17) optimize
GROUP BYwith parallel processing for large datasets. HAVINGbenefits from improved query planners, allowing complex aggregate conditions with better performance.- Some DBMS (e.g., PostgreSQL) support advanced grouping extensions like
GROUPING SETSthat work withHAVINGfor multi-level summaries.
Comparison Table
Here’s a concise table summarizing the key differences between GROUP BY and HAVING:
| Aspect | GROUP BY | HAVING |
|---|---|---|
| Purpose | Groups rows with identical values in specified columns for summarization. | Filters groups based on conditions involving aggregate functions. |
| Operates On | Individual rows, organizing them into groups based on column values. | Grouped results after GROUP BY, using aggregate functions. |
| Query Position | After FROM and WHERE, before HAVING. | After GROUP BY, before ORDER BY. |
| Conditions | Defines groups (e.g., GROUP BY city); no direct conditions. | Uses aggregate conditions (e.g., HAVING AVG(marks) > 80). |
| Relation to WHERE | Works with WHERE to filter rows before grouping. | Acts like WHERE for groups, applied after grouping. |
| Execution Order | After WHERE, before HAVING in the query pipeline. | After GROUP BY, before SELECT in the query pipeline. |
| Typical Use Cases | Summarize data (e.g., average marks by city). | Filter groups (e.g., cities with average marks > 80). |
| Dependencies | Can be used without HAVING. | Requires GROUP BY in standard SQL. |
📘 Examples to Illustrate Differences
Let’s use the students table to show how GROUP BY and HAVING work together and differ. Assume the table has the following data:
| id | name | age | marks | city |
|---|---|---|---|---|
| 1 | Alice | 20 | 85 | Mumbai |
| 2 | Bob | 22 | 92 | Mumbai |
| 3 | Carol | 19 | 75 | Delhi |
| 4 | Dave | 20 | 88 | Mumbai |
Examples:
- GROUP BY Alone
- Output
- GROUP BY with HAVING
- Output
- WHERE and HAVING
- Output
SELECT city, AVG(marks) AS avg_marks
FROM students
GROUP BY city;
| city | avg_marks |
|---|---|
| Mumbai | 88.33 |
| Delhi | 75.0 |
SELECT city, AVG(marks) AS avg_marks
FROM students
GROUP BY city
HAVING AVG(marks) > 80;
| city | avg_marks |
|---|---|
| Mumbai | 88.33 |
SELECT city, COUNT(id) AS student_count
FROM students
WHERE age > 19
GROUP BY city
HAVING COUNT(id) >= 2;
| city | student_count |
|---|---|
| Mumbai | 2 |
Explanation of Examples:
- GROUP BY Alone: Groups students by
cityand calculates the average marks for each city. All cities appear in the result. - GROUP BY with HAVING: Adds a
HAVINGclause to filter groups, keeping only cities where the average marks exceed 80 (only Mumbai qualifies). - WHERE and HAVING: Uses
WHEREto filter individual rows (age > 19) before grouping, thenGROUP BYto group by city, andHAVINGto keep only groups with at least two students.
📘 Key Rules and Best Practices
- GROUP BY:
- Always list all non-aggregated columns in the
SELECTclause in theGROUP BYclause (e.g.,SELECT city, AVG(marks)requiresGROUP BY city). - Use with aggregate functions like
COUNT,SUM,AVG,MAX,MIN. - Can group by multiple columns (e.g.,
GROUP BY city, age).
- Always list all non-aggregated columns in the
- HAVING:
- Only use aggregate functions or columns listed in
GROUP BYin the condition. - Place after
GROUP BYin the query. - Use for group-level filtering, not row-level (use
WHEREfor that).
- Only use aggregate functions or columns listed in
- Combining Them:
- Use
WHEREto filter rows before grouping,GROUP BYto create groups, andHAVINGto filter those groups. - Example: Filter students by age (
WHERE), group by city (GROUP BY), then keep groups with high averages (HAVING).
- Use
What NOT to Do:
- GROUP BY:
- Don’t include non-aggregated columns in
SELECTwithout adding them toGROUP BY—it causes errors in most DBMS (e.g., MySQL strict mode, PostgreSQL). - Don’t use
GROUP BYwithout an aggregate function unless you want unique combinations (rare). - Don’t group by unnecessary columns—it increases query complexity and slows performance.
- HAVING:
- Don’t use
HAVINGfor row-level filtering—useWHEREinstead to filter before grouping for better performance. - Don’t use column aliases in
HAVING(e.g.,HAVING avg_marks > 80)—use the aggregate function directly (e.g.,HAVING AVG(marks) > 80). - Don’t place
HAVINGbeforeGROUP BY—it’s a syntax error.
- Don’t use
- General:
- Don’t skip testing with small datasets;
GROUP BYandHAVINGcan produce unexpected results with complex queries. - Don’t assume
HAVINGworks withoutGROUP BY—it’s invalid in standard SQL.
- Don’t skip testing with small datasets;
✅ What You’ve Learned
You’re now a pro at understanding the differences between GROUP BY and HAVING! You’ve mastered:
- GROUP BY: Groups rows by columns for summarization, used with aggregates like
AVGorCOUNT. - HAVING: Filters groups based on aggregate conditions, applied after
GROUP BY. - Key Differences: Purpose, what they operate on, query position, conditions, and more, as summarized in the comparison table.
- Best Practices: Use
WHEREfor row filtering,GROUP BYfor grouping, andHAVINGfor group filtering in the correct order.
Practice these with the students table to create powerful summaries and reports. Follow the “What NOT to Do” tips to write efficient, error-free queries!