Orm 4: Aggregations

Over the last two years I have run Nextras Orm trainings a few times and quickly realized that my take on collection extensions (aka custom collection functions) is not easy to explain, understand, or use.

Collection functions were meant to bring a way to implement advanced filtering & ordering, such as aggregation filtering (“select all authors who have written more than 2 books”). Such filtering was missing, and I was commonly asked whether there is a package somewhere (even in an external repository) that implements basic count/sum/min/max aggregation filtering through the collection functions. But, as might be expected, nobody has written and published one.

In the end, three weeks ago I needed such filtering myself and was too lazy to switch to the mapper layer. I tried to implement such an aggregate function with the current collection function interface, but I failed to do it the way I wanted and quickly realized it’s not so easy to write a generic version of an aggregate function (for both filtering and ordering).

New collection functions

To ease the pain, Orm 4 introduces new collection function interfaces. (And removes the old ones, sorry.) The overall API is reduced and simplified; now, if you want to create a collection function, just implement the interface for the specific storage – namely Nextras\Orm\Collection\Functions\IQueryBuilderFunction for Dbal support and Nextras\Orm\Collection\Functions\IArrayFunction for non-persisted (array) collection support.

Collection functions now just return an expression (Dbal) or the expression’s value (Array). This makes it easy to combine them with other operators. The expression alone is enough for ordering. Filtering requires an additional comparison operator.

The second enhancement applies only to Dbal – previously you could return just a Dbal expression for the WHERE clause, but aggregation filtering requires putting the filtering conditions into the HAVING clause. So we introduced a DbalExpressionResult object that wraps this information (WHERE/HAVING clause) and also provides a new, simplified interface for working with the passed expression, which may be a simple property name, a property expression (needing an auto-join), or even another collection function’s result.

Aggregation functions

The refactoring described above allowed a simple implementation of aggregate functions. We ship these aggregation functions directly in the Orm distribution. Let me introduce:

  • CountAggregateFunction
  • SumAggregateFunction
  • AvgAggregateFunction
  • MinAggregateFunction
  • MaxAggregateFunction

All these functions are implemented for both Dbal and Array collections and are registered in the repository as commonly provided functions.

The rules for using collection functions stayed the same. First, pass the function name and then its arguments – all the aggregation functions take only one argument – an expression that should be aggregated. Let’s see an example:

use Nextras\Orm\Collection\Functions\CountAggregateFunction;

$authorsCollection->orderBy(
    [CountAggregateFunction::class, 'books->id']
);

In the example we sort the collection of authors by the count of their books, i.e. authors with the fewest books come first. The example accepts the same “property expression” you may use for filtering. This is new for the orderBy() method. You can also reverse the ordering:

use Nextras\Orm\Collection\Functions\CountAggregateFunction;
use Nextras\Orm\Collection\ICollection;

$authorsCollection->orderBy(
    [CountAggregateFunction::class, 'books->id'],
    ICollection::DESC
);

As we can see, the expression syntax is light and simple. Let’s filter the collection down to authors who have written more than 2 books. Using CountAggregateFunction by itself won’t be enough. We need to compare its result with the wanted number, 2 this time. To do that, use the built-in CompareFunction. This function takes a property expression on the left, a comparison operator, and a value to compare against.

use Nextras\Orm\Collection\Functions\CompareFunction;
use Nextras\Orm\Collection\Functions\CountAggregateFunction;

$authorsCollection->findBy(
    [
        CompareFunction::class,
        [CountAggregateFunction::class, 'books->id'],
        CompareFunction::OPERATOR_GREATER,
        2,
    ]
);

As you can see, you can nest these function calls. This approach is very powerful and flexible, though sometimes quite verbose. To ease this, you may create your own wrappers (not included in Orm!).

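use Nextras\Orm\Collection\Functions\CompareFunction;
use Nextras\Orm\Collection\Functions\CountAggregateFunction;
use Nextras\Orm\Collection\ICollection;

class Aggregate {
    // builds a "count" aggregation over the given property expression
    public static function count(string $expression): array {
        return [CountAggregateFunction::class, $expression];
    }
}
class Compare {
    /**
     * Builds a "greater than" comparison.
     * @param string|array $expression property expression or a nested function call
     * @param mixed $value value to compare against
     */
    public static function gt($expression, $value): array {
        return [
            CompareFunction::class,
            $expression,
            CompareFunction::OPERATOR_GREATER,
            $value
        ];
    }
}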

// filters authors who have more than 2 books 
// and sorts them by the number of their books descending
$authorsCollection
    ->findBy(Compare::gt(Aggregate::count('books->id'), 2))
    ->orderBy(Aggregate::count('books->id'), ICollection::DESC);

Time will tell whether such functions and helpers are a good approach; for Orm 4.0 you have to create them yourself.
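
To make the semantics of the examples above concrete, here is a minimal plain-PHP sketch of what the count aggregation computes over an in-memory list. It does not use Orm at all: the $authors data and the functions countBooks(), filterMoreBooksThan() and sortByBookCountDesc() are hypothetical names made up for this illustration, and the real Array collection implementation may work differently.

```php
<?php
// Hypothetical sample data: each author carries a list of book ids.
$authors = [
    ['name' => 'Alice', 'books' => [1, 2, 3]],
    ['name' => 'Bob', 'books' => [4]],
    ['name' => 'Carol', 'books' => [5, 6, 7, 8]],
];

// The "aggregate expression": the number of books of one author.
function countBooks(array $author): int
{
    return count($author['books']);
}

// Filtering compares the aggregated value with a threshold.
function filterMoreBooksThan(array $authors, int $threshold): array
{
    return array_values(array_filter(
        $authors,
        function (array $author) use ($threshold): bool {
            return countBooks($author) > $threshold;
        }
    ));
}

// Ordering uses the aggregated value directly as the sort key.
function sortByBookCountDesc(array $authors): array
{
    usort($authors, function (array $a, array $b): int {
        return countBooks($b) <=> countBooks($a);
    });
    return $authors;
}

$result = sortByBookCountDesc(filterMoreBooksThan($authors, 2));
$names = array_column($result, 'name');
echo implode(', ', $names), "\n"; // prints "Carol, Alice"
```

The sketch mirrors the split described earlier: the aggregate alone is enough as a sort key, while filtering needs an extra comparison on top of it.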


This is a fresh feature and I’d like to ask you to test it and give us feedback. Only with your support can we make Orm the best Orm ever. Test it by requiring "nextras/orm": "4.0.x-dev". Comment or open an issue on GitHub. Thank you.

Orm 3.1 – Property containers

Today we are releasing Nextras Orm 3.1 – a release without many features, but with plenty of small fixes and enhancements. Let’s examine the major enhancement: property containers. Also, see the full release notes.

Property containers

Until now, Orm property containers were limited to fairly simple functionality, such as converting an encapsulated object to JSON. (You may have read an article about implementing such a container.) Some advanced transformations, for example to class-backed enums, were possible, but they worked only at the entity interface. The conversion could fail you a minute later: ICollection required non-converted values for Dbal queries and converted values for in-memory queries. In 3.1 this is fixed. Property containers may define the reverse conversion function, which ICollection uses for proper comparison and query building. Let’s see an example:

We define an enum with marc-mabe/php-enum:

class GeometryType extends MabeEnum\Enum 
{
    const PLACE = 1;
    const CITY = 2;
    const COUNTRY = 3;
    const CONTINENT = 4;
}

Then we define a generic, reusable property container for enums. To do this, we inherit from the abstract helper class ImmutableValuePropertyContainer, which already implements much of the IPropertyContainer interface.

The definition of EnumContainer is reusable, i.e. you may use it multiple times in different properties with different enum classes. Of course, you may also write single-purpose property containers.

use MabeEnum\Enum;
use Nextras\Orm\Entity\ImmutableValuePropertyContainer;
use Nextras\Orm\Entity\Reflection\PropertyMetadata;

class EnumContainer extends ImmutableValuePropertyContainer
{
    /** @var string */
    private $enumClass;

    public function __construct(PropertyMetadata $propertyMetadata)
    {
        parent::__construct($propertyMetadata);
        // check the property has one valid type
        assert(count($propertyMetadata->types) === 1);
        $this->enumClass = key($propertyMetadata->types);
        // check the enum class exists
        assert(class_exists($this->enumClass));
    }

    public function convertToRawValue($value)
    {
        assert($value instanceof Enum);
        return $value->getValue();
    }

    public function convertFromRawValue($value)
    {
        $enumClass = $this->enumClass;
        return $enumClass::byValue($value);
    }
}

By default, ImmutableValuePropertyContainer allows null values. If you want null to be part of the enum, you have to override setRawValue and getRawValue, respectively.

    public function setRawValue($value)
    {
        $this->value = $this->convertFromRawValue($value);
    }

    public function getRawValue()
    {
        return $this->convertToRawValue($this->value);
    }

The usage then is pretty simple:

/**
 * @property int|null          $id   {primary}
 * @property GeometryType|null $type {container EnumContainer}
 */
final class Geometry extends Entity
{
}

$geometry = new Geometry();
$geometry->type = GeometryType::PLACE();

Filtering requires using proper enum instances:

$geometriesRepository->findBy([
    'type' => GeometryType::CONTINENT()
]);

Property containers may be used for other data encapsulation use cases. This enum example nicely adds type safety to your code without much effort.
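
The two-way conversion at the heart of the container can be sketched in plain PHP without Orm or marc-mabe/php-enum. Everything below is hypothetical and only for illustration: the Color class is a hand-written stand-in for an enum, and toRaw() / fromRaw() play the roles of convertToRawValue() / convertFromRawValue().

```php
<?php
// Hypothetical stand-in for an enum class; it is NOT marc-mabe/php-enum.
final class Color
{
    const RED = 1;
    const GREEN = 2;

    /** @var int */
    private $value;

    private function __construct(int $value)
    {
        $this->value = $value;
    }

    // Build an instance from its raw value, rejecting unknown values.
    public static function byValue(int $value): self
    {
        if (!in_array($value, [self::RED, self::GREEN], true)) {
            throw new InvalidArgumentException("Unknown value $value");
        }
        return new self($value);
    }

    public function getValue(): int
    {
        return $this->value;
    }
}

// Entity-side value -> storage-side (raw) value.
function toRaw(Color $color): int
{
    return $color->getValue();
}

// Storage-side (raw) value -> entity-side value.
function fromRaw(int $raw): Color
{
    return Color::byValue($raw);
}

$raw = toRaw(Color::byValue(Color::GREEN));
$restored = fromRaw($raw);
var_dump($raw === 2, $restored->getValue() === Color::GREEN);
```

Having both directions available is what lets a query be built from the raw value while in-memory comparison works with the converted one.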

JSON in Nextras Orm 3.0

Storing an array/std-like object structure as JSON in a single db column is a pretty common use case. However, the correct approach in Nextras Orm may not be obvious.


🔝 Use property container 😉

First, create your JSON container by implementing the abstract methods of the predefined helper class ImmutableValuePropertyContainer.

use Nette\Utils\Json;
use Nextras\Orm\Entity\ImmutableValuePropertyContainer;

class JsonContainer extends ImmutableValuePropertyContainer
{
    protected function serialize($value)
    {
        return Json::encode($value); // or simple json_encode()
    }

    protected function deserialize($value)
    {
        return Json::decode($value); // or simple json_decode()
    }
}

Then, simply define your entity property with the container:

/** 
 * ...
 * @property Nette\Utils\ArrayHash $data {container JsonContainer}
 */
class Users extends Nextras\Orm\Entity\Entity {}

Please note that the ImmutableValuePropertyContainer API may change slightly in future minor versions.
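
For reference, the round trip that serialize() and deserialize() perform can be tried in plain PHP. This sketch uses only the built-in json_encode() and json_decode(); the $data object and its fields are made up for illustration.

```php
<?php
// Entity-side value: a plain object with nested data (hypothetical fields).
$data = new stdClass();
$data->theme = 'dark';
$data->tags = ['orm', 'json'];

// What would be written to the single database column.
$stored = json_encode($data);

// What the entity would see again after loading.
// json_decode() returns stdClass objects by default; pass true for arrays.
$loaded = json_decode($stored);

echo $stored, "\n"; // prints {"theme":"dark","tags":["orm","json"]}
var_dump($loaded->theme === 'dark', $loaded->tags === ['orm', 'json']);
```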


🆗 The second possible way is to write a custom mapping in StorageReflection – add converter callbacks for your specific property.


🙈 The last possibility is to transform the value in setters, getters and entity event callbacks. Please do not do this.

Nextras Orm 3.0

It’s here. It took more time than I had expected. Nextras Orm 3.0 comes with a few major features, a huge internal refactoring and many small fixes.

Check the release notes on GitHub and the upgrade guide.

Collection Custom Functions & OR

Collection custom functions are a powerful enhancement of the repository layer. Now you may do more advanced filtering & sorting of ICollection. We also added support for disjunction – OR filtering. As you might have guessed, internally it is also implemented as a custom function.

Custom functions keep a high level of abstraction where they are applied, while internally they allow implementing low-level behavior specific to the array and SQL storages.

// filter collection by entities with name Jan or age 10
$collection->findBy([
     ICollection::OR,
     ['name' => 'Jan'], 
     ['age' => 10], 
]);

Custom functions are quite powerful because they can be nested!

// filter collection by entities where name starts with Jan or age is 10
// note that LikeFunction is not included in Orm
$collection->findBy([
     ICollection::OR,
     [LikeFunction::class, 'name', 'Jan'], 
     ['age' => 10], 
]);
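
To illustrate what the nested OR filter means for an array-backed collection, here is a small plain-PHP sketch. It is independent of Orm: the $people rows and the closures $nameStartsWithJan, $ageIsTen and $or are hypothetical and only model the idea that each condition is a predicate and a disjunction combines predicates.

```php
<?php
// Hypothetical in-memory rows standing in for entities.
$people = [
    ['name' => 'Jan', 'age' => 30],
    ['name' => 'Jana', 'age' => 25],
    ['name' => 'Petr', 'age' => 10],
    ['name' => 'Eva', 'age' => 40],
];

// Each condition is a predicate over one row.
$nameStartsWithJan = function (array $row): bool {
    return strncmp($row['name'], 'Jan', 3) === 0;
};
$ageIsTen = function (array $row): bool {
    return $row['age'] === 10;
};

// Disjunction: a row passes when at least one predicate accepts it.
$or = function (callable ...$predicates): callable {
    return function (array $row) use ($predicates): bool {
        foreach ($predicates as $predicate) {
            if ($predicate($row)) {
                return true;
            }
        }
        return false;
    };
};

$matched = array_values(array_filter($people, $or($nameStartsWithJan, $ageIsTen)));
echo implode(', ', array_column($matched, 'name')), "\n"; // prints "Jan, Jana, Petr"
```

Because $or itself returns a predicate, it can be passed to another $or call, which is the same nesting property the custom functions have.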

Model

We have also enhanced the general model. Now it comes with two important functions:

  • clear() – flushes all the caches & references to entities;
  • refreshAll() – refreshes all data directly from the storage;

DateTimeImmutable

We have dropped support for \DateTime – sorry for that. On the other hand, we now have much better support for the \DateTimeImmutable type.
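
One practical consequence of working with immutable date values, shown in plain PHP and independent of Orm: modifying a value never changes the original instance. The date used here is arbitrary.

```php
<?php
$created = new DateTimeImmutable('2017-01-31 12:00:00', new DateTimeZone('UTC'));

// modify() returns a new instance; $created itself is left untouched.
$nextDay = $created->modify('+1 day');

echo $created->format('Y-m-d'), "\n"; // prints "2017-01-31"
echo $nextDay->format('Y-m-d'), "\n"; // prints "2017-02-01"
```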

PHP 7.0 scalar types & PHP 7.2

This release also adds scalar types wherever possible. We currently require PHP 7.0, therefore we do not use nullable types.

Also we have fixed support for PHP 7.2.

MSSQL support

We do not expect many of you to use Microsoft’s SQL Server, but it is quite a nice feature and hopefully someone will be happy about it. Some edge cases are not fully supported (e.g. there is an internal workaround). Please test it and report bugs 🙂


Thanks

I’d like to thank all the contributors and users. Your support, questions and Orm usage drive me in further development. In particular, I want to thank Jan Tvrdik and David Matejka for their help with the development and for their valuable feedback and reviews. Thank you, guys.


Trainings

We have partnered with GeekyEdu and as a result there is the first Nextras Orm training ever in Prague. 🙂

The progress of Orm 3.0

Hi there! It’s been a while since I published any info about the upcoming release of Nextras Orm 3.0. So, what’s the current schedule?

Nextras Orm 3.0 will be released in 2017!

So, a few notes on the current release plan:

  1. I apologize for the delay. The plans were quite extreme and I was not able to fulfill them. Still, many great new features are coming, specifically the OR operator and custom functions.
  2. Some features will be postponed. The most difficult part of these features is the design: I would have had time to code them, but I did not have enough time to design the API, to think it through, and to validate the ideas. Also, this is a great opportunity for you to contribute: take a look and think about how the storage reflection factory should look and how it should behave.
  3. Currently, only some minor features are missing. Personally, I have deployed current master branch to production of my quite big project. So I have already validated the stability and I expect quite a short RC phase.

The major features of Orm 3.0 will be:

  • OR operator: add a disjunction operator on collection: $collection->findBy([ICollection::OR, [...], [...]]);
  • custom functions: create own filtering function to filter data over database and array collection;
  • IModel::refreshAll(): you may refresh the whole entity cache;
  • IModel::clear(): safely clear entity cache;
  • enforced DateTimeImmutable and full support for it;
  • fixed relationship caching, login, traversing;
  • PHP 7.0 type hints;
  • MS SQL Server support;

Orm will be released also with Dbal 3.0. Dbal will bring these features:

  • IConnection interface;
  • enhanced Tracy panel;
  • %json modifier;
  • nested transactions;
  • transaction isolation level;
  • MS SQL Server support;

Orm hackathon & plans for 3.0

Nextras Orm is getting more serious every day. It has already passed 2000 installs per month! That is an exciting number!

Last weekend Jan Tvrdík, David Matějka and I met to elaborate on Orm’s internal design and to hack on some needed new features. We discussed many topics, including a refactoring of the mapper & repository architecture and support for new RDBMSes (such as MS SQL Server, support for which is on its way into Nextras Dbal 3.0).

Finally, we agreed on the following hacking topics:

  • Refresh: implementing refresh is not an easy task, mainly because it’s pretty complicated to think the “refresh” consequences through. Therefore we decided to start with the easiest possible implementation: a Model::refreshAll() function that will refresh all loaded entities in your identity map. David Matejka was hacking on this and it is available in a pull-request.
  • ICollection operations: another broad topic, which should bring a lot of variability to your collections. This feature should allow you to define special filtering implementations in your mapper layer that are nevertheless easily reusable in your collections (repository layer). Jan Tvrdik’s work on this also includes a complex refactoring of collection internals.
  • Embeddables: a feature available in Doctrine, however, with a few important limitations, such as no nullable embeddables. This was my hacking topic; the partial result is in this pull-request and is waiting for Jan Tvrdik’s topic.
  • Mapper & Repository dependency: David Matějka continued with another complex refactoring for mapper, storage reflection, repository and identity map classes. The awesome result is already merged in the master branch.

The next day (on Saturday) Mikuláš Dítě joined us and hacked on the “migrations plan” pull-request for safe running and merging of migrations in the Nextras Migrations package.

Hackathon just started

The hackathon took place in the manGoweb offices. Thank you for the opportunity!


We didn’t finish all our work, but the meeting & hackathon proved that discussing and hacking in person can bring new ideas and information about Orm’s usage that I didn’t know about.

All mentioned features are scheduled for 3.0 release. You may also take a look at 3.0 milestone on GitHub. I’d like to release the first 3.0 Release Candidate at the end of June. So the feature freeze should be in the middle of June.

You may help us by testing the current master branch and reporting issues, or by enhancing the documentation, which is open-sourced directly in Orm’s repository.