Performance optimizations with Doctrine

In Symfony (or plain PHP) projects Doctrine allows you to focus on domain concepts and implement application use cases quickly, letting the library generate both write and read SQL queries.

Lazy loading

This approach involves a large number of queries, especially for complex domains, where entities have many associations. The problem is magnified for use cases that operate on a collection of objects, such as a list page.

Processing time can increase due to the large number of queries, but despite this, for many applications this mechanism works reasonably well. In some cases, however, it might be appropriate to perform optimizations, because the processing time become unacceptable or simply to improve the user experience.

Scenario

Optimized query

In this implementation, where we omitted any filtering and paging for simplicity, we added a join to the *-to-one “author” association, and asked Doctrine to hydrate these objects as well, adding the alias of the BookAuthor entity in the from clause.

In this way we reduced the number of queries to only one, while with lazy loading we would have carried out n + 1 queries, where n is the number of books returned by the main query.

Adding also the join towards the *-to-many “identifiers” association would have meant, due to the nature of the relational db and the SQL language, duplication of the data of books and their authors, with a consequent increase in the transfer time from the DB to the application, so it is generally not a good solution.

If it is known that the association has a very low average multiplicity, it is however a hypothesis to be taken into consideration, but unfortunately it is not possible to set a limit value of the average multiplicity below which this solution is more efficient, because it depends on several factors.

In our example, however, we have chosen not to add the join towards the *-to-many association, but in the method presented above we call this other one instead:

The use of “partials” allows us to minimize the amount of duplicate data transferred from the database, while the inclusion of both books and identifiers in the “select” clause ensures that the latter are hydrated and linked to the instances of the books already in memory. It is not even necessary for this method to return results.

Filtered associations

In this case we have selected only the ISBN identifiers, eliminating the problem inherent the *-to-many associations highlighted above, since in our domain for each book there is only one ISBN.

The condition that filters the *-to-many association must be included in the join clause. In fact, if we had added it in the where clause, we would have excluded from the query results the books that do not have an ISBN identifier.

It is important to point out that Book instances in memory will be different from those stored in DB, in general a situation to avoid, so it is recommended to adopt this solution only when performance is a critical element of the application. Furthermore, it seems appropriate in these cases to highlight this choice and its consequences with comments on the code.

Partial objects deprecated

Conclusions