Orm 4: Aggregations

During the 2 last years I was training Nextras Orm few times and immediately realized that my take on collection extensions (aka custom collection functions) is not easy to explain, understand and use.

Collection functions should brought a way to implement advanced filtering & ordering, such as an aggregation filtering (“select all authors who have written more than 2 books”). Such filtering was missing and I was commonly asked if there is somewhere (even in an external repository) a package, which would implement basic count/sum/min/max aggregation filtering through the collection functions. But, quite expected, nobody done it and published it.

In the end, three weeks back I needed such filtering myself and I was lazy to switch to the mapper layer. I’ve tried to implement such aggregate function with the current collection function interface, but I failed to do it like I wanted and quickly realized it’s not so easy to write a generic version of aggregate function (both for filtering or ordering).

New collection functions

To easy the pain Orm 4 introduces new collection functions interfaces. (And removes the old ones, sorry.) The overall API is reduced and simplified; now if you want to create a collection function, just implement the specific interface for specific storage – namely Nextras/Orm/Collection/Functions/IQueryBuilderFunction for Dbal’s support and Nextras/Orm/Collection/Functions/IArrayFunction for non-persisted (array) collection support.

Newly, collection functions just return an expression (Dbal) or expression’s value (Array). This allows to combine them with other operators easily. The expression is enough for ordering. Filtering will require an additional comparison operator.

The second enhancement is just for Dbal – previously you could return just an Dbal expression for WHERE clause, but aggregation filtering requires putting these filtering conditions into HAVING clause. So we introduced a DbalExpressionResult object that wraps the information (WHERE/HAVING clause) and also provides new simplified interface to work with passed expression, which may be just a simple property name, property expression (needing an auto-join), or even other collection function’s result.

Aggregation functions

The described refactoring allowed a simple implementation of aggregate functions. We bring those aggregation functions into Orm distribution directly. Let me introduce:

  • CountAggregateFunction
  • SumAggregateFunction
  • AvgAggregateFunction
  • MinAggregateFunction
  • MaxAggregateFunction

All those functions are implemented both for Dbal and Array collections and are registered in repository as commonly provided functions.

The rules for using collection functions stayed the same. First, pass the function name and then its arguments – all the aggregation functions take only one argument – an expression that should be aggregated. Let’s see an example:

use Nextras\Orm\Collection\Functions\CountAggregateFunction;

$authorsCollection->orderBy(
    [CountAggregateFunction::class, 'books->id']
);

In the example we sort the collection of authors by the count of their books, e.g. authors with the least books will be at the beginning. The example allows the same “property expression” you may use for filtering. This is new for orderBy() method. Also, you can reverse the ordering:

use Nextras\Orm\Collection\Functions\CountAggregateFunction;
use Nextras\Orm\Collection\ICollection;

$authorsCollection->orderBy(
    [CountAggregateFunction::class, 'books->id'],
    ICollection::DESC
);

We can see the expression syntax is very light and simple. Let’s filter the collection by authors who have written more than 2 books. Using CountAggregationFunction itself won’t be enough. We need to compare its result with the wanted number, 2 this time. To do that use built-in CompareFunction. This function takes a property expression on the left, a comparison operator, and a value to compare.

use Nextras\Orm\Collection\Functions\CompareFunction;
use Nextras\Orm\Collection\Functions\CountAggregateFunction;

$authorsCollection->findBy(
    [
        CompareFunction::class,
        [CountAggregateFunction::class, 'books->id'],
        CompareFunction::OPERATOR_GREATER,
        2,
    ]
);

As you can see, you can nest these function calls together. This approach is very powerful and flexible, though, sometimes quite verbose. To ease this issue you may create own wrappers (not included in Orm!).

class Aggregate {
    public static function count(string $expression): array {
        return [CountAggregateFunction::class, $expression];
    }
}
class Compare {
    public static function gt(string $expression, $value): array {
        return [
            CompareFunction::class,
            $expression,
            CompareFunction::OPERATOR_GREATER,
            $value
        ];
    }
}

// filters authors who have more than 2 books 
// and sorts them by the number of their books descending
$authorsCollection
    ->findBy(Compare::gt(Aggregate::count('books->id'), 2))
    ->orderBy(Aggregate::count('books->id'), ICollection::DESC);

The time will show if such functions and helpers are a good approach, for Orm 4.0 you have to create them by yourself.


This is a fresh feature and I’d like to ask you to test it and give us feedback. Only with your support we can make Orm the best Orm ever. Test it by requiring "nextras/orm": "4.0.x-dev". Comment or open an issue on GitHub. Thank you.