Postgres array_agg() function
Aggregate values into an array
The Postges array_agg()
function collects values from multiple rows into a single array.
It's particularly useful for denormalizing data, creating comma-separated lists, or preparing data for JSON output. For example, you can use it to list all products in a category from a products catalog table or all orders for a customer from an orders table.
Function signature
The array_agg()
function has this simple form:
expression
: The value to be aggregated into an array. This can be a column or expression of any data type.
expression
: The value to be aggregated into an array.ORDER BY
: Specifies the order in which the values should be aggregated.sort_expression
: The expression to sort by.ASC | DESC
: Specifies ascending or descending order (default is ASC).NULLS { FIRST | LAST }
: Specifies whether nulls should be first or last in the ordering (default depends on ASC or DESC).
Example usage
Consider an orders
table with columns order_id
, product_id
, and quantity
. You can use array_agg()
to list all the product IDs for each order.
This query groups the orders by order_id
and aggregates the product_id
values into an array for each order.
Advanced examples
Ordered array aggregation
You can specify an order for the elements in the resulting array:
This query aggregates the listed skills for each employee into an alphabetically ordered array.
Combining with other aggregate functions
array_agg()
can be used in combination with other aggregate functions:
This query aggregates products into an array with their total sales, for each category.
Using array_agg() with DISTINCT
You can use DISTINCT
with array_agg()
to remove duplicates from the output array:
This query creates an array of the browsers used by each user, without duplicates and in alphabetical order.
Additional considerations
Performance implications
While array_agg()
is powerful, it can be memory-intensive for large datasets. The function needs to hold all the aggregated values in memory before creating the final array. For very large result sets, consider using pagination or limiting the number of rows before aggregating.
NULL handling
By default, array_agg()
includes NULL values in the resulting array. If you want to exclude NULL values, you can use it in combination with FILTER
:
Alternative functions
string_agg()
: Concatenates string values into a single string, separated by a delimiter.json_agg()
: Aggregates values into a JSON array.