Michael J. Swart

September 12, 2012

When I Use Nested Joins

Filed under: SQL Scripts,SQLServerPedia Syndication,Technical Articles — Tags: — Michael J. Swart @ 12:00 pm

I want to explain how I avoid RIGHT OUTER joins in favor of LEFT OUTER joins and how I avoid OUTER joins in favor of INNER joins.

There's no place bitchin-er than Kitchener

Inner joins have no direction, but outer joins do so that we have three kinds of joins:

  • INNER JOIN (JOIN for short)
  • LEFT OUTER JOIN (LEFT JOIN for short)
  • RIGHT OUTER JOIN (RIGHT JOIN for short)

I leave out FULL OUTER JOINS for now because I never use them. And in fact RIGHT OUTER JOINS can always be written as LEFT OUTER JOINS, so in practice I only use just the two kinds of joins:

  • INNER JOIN
  • LEFT OUTER JOIN

Now as a rule of thumb, inner joins are more efficient than outer joins so it would be better to write queries that avoid outer joins. Let’s get to an example. My goal here is to write a query that gives me a list of employees and their director (if any) based on this schema.

CREATE TABLE STAFF
(
    Id NVARCHAR(20) NOT NULL PRIMARY KEY,
    Name NVARCHAR(400) NOT NULL,
    Department NVARCHAR(20),
    Role NVARCHAR(20) NOT NULL
)
 
CREATE TABLE BOSSES
(
    EmployeeId NVARCHAR(20)
        REFERENCES STAFF(Id),
    BossId nvarchar(20)
        REFERENCES STAFF(Id),
    PRIMARY KEY (EmployeeId, BossId)
)

BOSSES is a table that contains not just direct reports, but all direct and indirect reports (making it handy for this query).

Using a RIGHT JOIN

The answer is fairly straightforward. I join the BOSSES table with the STAFF table to give me all the directors and their reports:
Two inner joins

select S_Employee.Name, S_Boss.Name as Director
from STAFF S_Boss
join BOSSES B
    on B.BossId = S_Boss.Id
join STAFF S_Employee
	on B.EmployeeId = S_Employee.Id
where S_Boss.Role = 'Director'

But wait, this isn’t a complete list of employees. What about those in the company that don’t report to any director (not even indirectly). This is where the right outer join comes in:
An inner join and a right join

select S_Employee.Name, S_Boss.Name as Director
from STAFF S_Boss
join BOSSES B
    on B.BossId = S_Boss.Id
RIGHT join STAFF S_Employee
    on B.EmployeeId = S_Employee.Id
    and S_Boss.Role = 'Director'

Notice that the S_Boss.Role filter can’t belong to the where clause any more. If it did, we’d lose director-less employees again and we’d be back where we started.

Using LEFT JOINS Only

That works… but for the sake of style, let’s only use left joins. I prefer using only left joins in this case because logically, the results are meant to be the set of employees. So I like to start with that list of employees and then join in the bosses if necessary. That’s why I start with the set of employees as the first table. The other joined tables aren’t the stars of this query; they’re only there to help provide that director attribute. So I have this:
Two left joins

select S_Employee.Name, S_Boss.Name as Director
from STAFF S_Employee
left join BOSSES B
    on B.EmployeeId = S_Employee.Id
left join STAFF S_Boss
    on B.BossId = S_Boss.Id
    and S_Boss.Role = 'Director'

But you notice that I’m now using two left joins… Really I only want the one outer join that I was using in the first example. Turns out I can do that:

Using Nested Joins

Well that just looks like this:
One left join

select S_Employee.Name, S_Boss.Name as Director
from STAFF S_Employee
left join BOSSES B
    join STAFF S_Boss
        on B.BossId = S_Boss.Id
        and S_Boss.Role = 'Director'
    on B.EmployeeId= S_Employee.Id

This is logically equivalent to the right join but it uses left joins instead and only uses one outer join. Great!

A lot about this example is only about style. Maybe you prefer the right join example and think that I’m breaking some grammar rule here. It feels like I’m splitting an infinitive or something. Let me know what your preference is.

Powered by WordPress