Understanding the n+1 query problem
[update] And thanks to Gus and Al Davidson, some sample SQL for avoiding the problem - check out the comments!
Lets say you want to display a list of categories in an e-commerce store, but you’d also like to display how many products are in each. A naive solution might look something like:
Select CategoryTitle,CategoryID
From tbl_Category
</cfquery>
<cfoutput query="GetCategoryList">
<cfquery name="GetProductCount" datasource="mydatasource">
SELECT Count(ProductID) as ProductCount
FROM tbl_Product
WHERE ProductCategoryID = #CategoryID#
</cfquery>
<li>#CategoryTitle# (#GetProductCount.ProductCount#)<li>
</cfoutput>
So, how many queries are we going to be running? Well, even if there are no categories we will be running 1 query (to get the category list). If there is 1 category we’d run 1 category list query and 1 product count query. If 2 categories we’d run 1 category list query and 2 product count queries (one per category). In general terms we'll always run 1 category query and n-product count queries for an n-category situation, hence this is an “n+1 query” solution (for example, 50 categories, 51 queries).
The problem with n+1 query situations is that typically for simple web applications, database queries are one of the biggest bottlenecks in terms of performance, and running a lot of them sequentially is usually a bad thing. Lets say an average query took 50ms and we had 49 categories. That would suggest 2500ms (2.5 seconds) just to process the queries for a single page!
While you don’t want to get involved in premature optimization, be aware whenever you see an n+1 query situation as it may well be something you'll have to replace with a more sophisticated query if performance of that part of the application becomes a problem.


The way I would get this data is with one query, which can be done regardless of how many products or categories there are:
<cfquery name="GetCategoryListAndCountOfProducts" datasource="mydatasource">
SELECT cat.categoryTitle,cat.CategoryID,count(prod.pordictID) as productCount
FROM tbl_Category cat
INNER JOIN tbl_Product prod ON prod.productCategoryID = cat.categoryID
GROUP BY cat.categoryTitle,cat.CategoryID
</cfquery>
This is the t-sql syntax, but similar queries are possible with pretty much any DB.
Gus
SELECT cat.categoryTitle,cat.CategoryID,
( SELECT count(prod.pordictID)
FROM tbl_Product prod WHERE prod.productCategoryID = cat.CategoryID
) AS productCount
FROM tbl_Category cat
sometimes this can give you better performance, depending on indexes, rowcounts, what side of bed the query optimizer got out of this morning.... all kinds of things!
Thanks for posting the SQL. I was going to do that, but it was late and I had other things to deliver. FYI, I use the subquery in the select method. As I mentioned, Im not much of a SQL guru, but an inner join has the downside that a category won't display if it has no products which is not usually what you want. Also, if you have multiple aggregates (see my previous post) it is much easier to build up the SQL by putting subqeuries in the select as you effectively just define the SQL for each field in the SELECT statement in your DAO, making it very easy to build up the SQL auomatically for the subquery approach.
Best Wishes,
Peter
You can use outer joins if some joins return zero rows
Cheers
Richard
Good point. A left outer join would work perfectly for a single aggregate.
In the general case, I'm sure you could use joins if you had n-aggregates, but I tend to find it easier to use the subqueries in the select statement. Something I only learnt about a few months ago, but it seems to work pretty well.
Thanks for the heads up! Luckily I use MSSQL and with Windows Servers and the new SQL 2005 Express I don't have a good reason to move to MySQL, but that is really good to know.
Guess I'll keep me DAO generator MSSQL specific for now so I don't have to figure out an algorithm for generating the n-joins instead of using select subqueries which make the programming a lot easier. They perform just fine in MSSQL!
Hope you're doing well?!