A SImple SQL Question . . .
In theory a user can only have one job, but because I have a versioning system, I need to only return jobs where job.LiveVersion = 1 (so if a user has three versions of a job it should only return the single LiveVersion=1 of them and if a user is joined to a job which has records but where none of them are LiveVersion=1, the user should be returned, but without a job).
Naive SQL is SELECT user.FirstName, job.Salary FROM user INNER JOIN job ON user.JobID = job.JobID WHERE user.LastName LIKE 'b%' AND job.LiveVersion = 1.
The problem is that this then excludes any users who for whatever reason are joined to a job without a live version :-<
I'm trying to solve the generalized problem as part of an ORM, but if I could solve this I think I could generalize. Is it a subquery that is required?
Any help much appreciated! I'm running MSSQL 2000/2005 if dialect makes a difference.



SELECT
user.FirstName,
job.Salary
FROM
user
INNER JOIN
job
ON
(
user.JobID = job.JobID
AND
job.LiveVersion = 1
)
WHERE
user.LastName LIKE 'b%' AND
This way, the JOIN itself needs to bee both the user-job join AND the live version. If either of those breaks (false) then only the left table (user).
Thanks Ben!
I will add, though, that when solving this same situation generally (ie, for DataMgr), I use subqueries. It allows me to define something like LiveVersion = 1 as a filter for that table and then use the subquery to get the value from the other table.
(note that DataMgr can't specifically do this, but I am speaking more generally about how you could write an ORM/DAL to behave)
That being said, if you will need several fields from another table, it would probably be worth the extra effort to create your ORM to use the outer join.
@Ben, Great to know - nice to have an expert handy :->
SELECT user.FirstName, job.Salary
FROM user
LEFT JOIN job ON (user.JobID = job.JobID AND job.LiveVersion = 1)
WHERE user.LastName LIKE 'b%'
The subquery version looks like this:
SELECT user.FirstName,
(SELECT TOP 1 Salary FROM job WHERE job.JobID= user.JobID AND job.LiveVersion=1) as Salary
FROM user
WHERE user.LastName LIKE 'b%'
While this is fine, in my experience the outer join version will optimize better on most RDBMS platforms.
The only situation where I'd use the subquery would be in cases where there might be more than one job per user, but we actually only want to pull one record per user. In your case, you've stated there is only one Job per user (hopefully this is guaranteed by a unique constraint on JobID and LiveVersion in your Job table) so the left outer join will work fine.
Thanks for the subquery syntax too. I remember there are cases where you HAVE to use the subquery - I certainly use it when I may have one or more aggregate has-many joins.
As an aside, does anyone have any favorite concatenation code for returning "ProductIDList" where Category has-many products? The max, min, count and the like is easy but I seem to remember you need to do something a little funkier (at least in SQL server) to get a comma delimited concatenation of the values of a field as part of an aggregate subquery in the select . . .
DECLARE @id_list VARCHAR;
SET @id_list = '';
SELECT
@id_list = (@id_list + tbl.id)
FROM
[your_table] tbl
Something like that. I am sure there are other ways to do it though.
@id_list = (@id_list + tbl.id)
is supposed to have some sort of comma in it:
@id_list = (@id_list + ',' + tbl.id)
But unfortunately, your suggestion didn't work for me, at least in MS SQL Server.
"smaller intermediary tables before the WHERE clause is every executed.... of course, the SQL server's optimization engine probably does a lot of that for you, but it's good practice to have (and in cases like this is really the only way to go)."
My understanding is that in general, DBMSs tend to do the select operation (which is basically evaluating the where clause) before the joins, because joins are the most expensive operation, and we'd want to do that on smaller tables. Of course, if the join would so greatly reduce the number of rows to perform the selection (where clause) then it may do that first.
But you are right, it does do a lot of that for you, calculating estimated costs of performing which operations in what order, and selecting the best one from there, based on the table sizes and everything. Much more than I would care to think about most of the time when writing a query. =)
Peter,
I still don't see why changing INNER to OUTER in your original query shouldn't work. I may be missing something, but I would think both queries produce equivalent trees...
I'll post working code here when I get a chance if ben doesn't beart me to it - I KNOW I've seen and used concatenation code something like what Ben provided in the past.
@Sam, My mistake. My original SQL should have said LEFT OUTER JOIN which is actually what I'm doing as I don't want to exclude the unemployed!
DECLARE @temp TABLE(
id INT
);
DECLARE @id_list VARCHAR( 1000 );
INSERT INTO @temp (
id
)(
SELECT 1 AS id UNION ALL
SELECT 21 UNION ALL
SELECT 32 UNION ALL
SELECT 43
);
SET @id_list = '';
SELECT
@id_list = (@id_list + LTRIM( RTRIM( STR( id ) ) ) + ',')
FROM @temp
SELECT
@id_list AS id_list
Wow. Any idea how that'd fit into a subquery in the select? Could you put all of that in a SELECT FirstName, LastName, (all that code) AS ConcatenatedJobIDs . . . as part of a list query?
Please don't take this as definitive, as I have rarely done this... but I do not think that you can MIX concatenation AND standard data-retrieval selects in the same Select statement. So, if your eventual query was going to return more than one records (ie. N users), I am not sure you could then, for each row, have a sub-query that concatenates a list of IDs (as this would mix the two forms... not even sure how the variable would work).
But again, do not take that as the truth. If you are going to return just ONE row (ie. 1 user), no reason you couldn't create the concat list before the main query (as i demonstated) and then just select the concat list as an additional query column.
Got it. Let me do some digging because I have a feeling I found SQL that allowed me to do this for a list once - but it has been a long time :-> If I find something, I'll post it.