Combining 2 Tables with a JOIN in SQL Server 2012

This post is from our Introduction to Transact-SQL online video training library.


Video transcript: Combining 2 Tables with a JOIN in SQL Server 2012

Instructor: Jeff Jones.

What You Will Learn:

  • Understand the JOIN operation and requirements for a JOIN
  • Define Inner JOIN
  • Define Outer JOIN and explore variations
  • Explore Table Aliases

The one thing that I get when I talk to people that are learning the SQL language is, they always have the question of, what the heck is a JOIN?

We’re going to go into the ability of combining two tables with a JOIN. The way we’re going to do this is, we’re first going to talk a little bit about the purpose of the JOIN. Why is it that we need to do the JOIN operation, anyhow? What are some of the requirements of the JOIN? Then we’re going to talk about two different types of JOIN operations that are supported in the SQL language. One of them is called the Inner JOIN, the other is the Outer JOIN. We’ll see there are variations on the Outer JOIN as well.

[0:41] Lastly, we’re going to talk about something called a Table Alias, which is basically a little technique to help make writing your queries a little faster, and not having to type so much information as you’re referencing different columns and different tables within your query.

What does the JOIN do?

001-purpose-of-the-join-in-sql

First of all, let’s talk a little bit about the purpose of the JOIN. The purpose of the JOIN is to combine data from two tables. Basically, we want to take two tables that have information inside of them and they have something in common.

Let’s say, for example, we have a CustomerTable and we have an OrdersTable. What we’d like to do is, we’d like to see the customers that have orders, and we’d like to combine those orders, associate them with the appropriate customer, and return that as our result set.

[1:33] We may want to do that with departments and employees. We have a Department table and we have an Employee table. What we’d like to do is take all the departments, JOIN them together or combine them with the employees, and return back a list of all the employees by department. Anytime you need to get data from multiple tables and bring that data together, then you need to use a JOIN operation.

One of requirements of the JOIN operation is that there must be something in common between those two tables. Usually that is some sort of column that is in common between the two tables. It could be some subset of columns. You have a lot of flexibility as to how you define what is in common between the two tables.

One of the most important paths that you’ll find is, when you look at a database, if it has relationships already predefined, that already gives you a set of JOIN paths that you can take advantage of.

[2:23] What’s also nice about them is that those JOIN paths are being validated by the database system. Whenever you’re making changes to the database, as you’re adding rows into one table, it’s validating it against the other table. It’s automatically making sure that the data is in sync.

The one thing about doing JOINs is, you always want to make sure that you’re getting a meaningful result. That means when you do the JOIN, you have written it in such a way that you’re using the proper columns to do the JOIN operation with.

[2:54] The JOIN is very flexible. You can do anything you want. It will allow you to JOIN data using invalid columns, and you’ll get an invalid result set. Therefore, it’s very important that you write the JOIN to make sure you understand which columns are going to be used to combine the data together.

Let’s take a look at this diagram to illustrate this.

002-purpose-of-the-join-in-sql

We have a couple of tables here. We have a ProductCategory table and we have a product table. For example, if I needed to display all my products and I wanted to look at them by ProductCategory, I would need to JOIN these two tables together to get that data put together in the right way.

003-purpose-of-the-join-in-sql

[3:36] What’s going to end up happening, it’s going to each row from, let’s say, the ProductCategory table. It’s going to find the matching row in the products table, and it’s going to return from that a single row. It’s putting those two tables together into a single row.

What’ll happen is, as we look at this particular problem with writing queries, we may need data from a lot of different tables. I may need data not only from the ProductCategory and products table, but I’d also like to see all the orders that were taken that used those particular products.

[4:13] What I need to do is find the path. I need to find things that are in common to get over to the SalesOrderHeader table. What we’d also see is, we have a relationship from the product to the sales order detail. I’ll need to do another JOIN operation because there is no direct relationship from Product or ProductCategory to the SalesOrderHeader. I’d have to go through the SalesOrderDetail table.

004-purpose-of-the-join-in-sql

I will need to find something that’s in common. I have the product ID, and I have the product ID here.

007-purpose-of-the-join-in-sql

Now I’m able to JOIN those two tables together. I can now say these two tables are going to be Joined to get to the sales order detail. Even though there’s no data in the sales order detail that I’m actually interested in, I still have to go through there to get to my SalesOrderHeader.

[5:02] Now I’m able to go from the sales order detail back up to the SalesOrderHeader. I would need to do another JOIN operation using the sales order ID in those two tables as the JOIN condition.

008-purpose-of-the-join-in-sql

Now I’m able to navigate my way through data base. That’s why database diagrams can be very helpful. They allow you do figure out the path. They’re like a road map. You’re looking for the “roads” that exist between these different tables so you can pick the right path to get there, to get that meaningful result.

The database diagram acts as a map. It really does come in handy as you are mapping out your navigation through the database to pull all the columns together that you need to generate the particular query that you’re interested in.

Types of JOIN Operations – Inner Join and Outer JOIN:

009-purpose-of-the-join-in-sql

[6:01] There are two different types of JOIN operations. There’s the Inner JOIN, and there’s the Outer JOIN. Let’s take a look at what these two different JOIN operations do and what the differences are.

Inner Join:

We have a couple of reports here. One report is a Monthly Customer Sales Report. In other words, we’d like to see a list of all of our customers that we took orders from in that particular month. The way we would do that is, we would be able to use an Inner JOIN operation.

[6:37] The Inner JOIN operation is going to be doing a matching operation. If you look at these two little circles here, this is the representation of something you probably ran into in junior high math. It’s when you started learning about set theory. They had these things called Venn diagrams.

The Venn diagram was a circle that represented a set of information. Another circle represented another set. You would show how those sets potentially intersect. That’s what an Inner JOIN does. It takes two sets. Each set is represented by a table.

This would be the customer table. This over here would be the order table. It’s looking for the intersection.

010-purpose-of-the-join-in-sql

The intersection is where those two tables match each other; in other words, where the JOIN criteria, the common columns, have the same exact value. That brings these two tables together.

[7:35] Whenever we do the Inner JOIN, it’s going to look for those orders that were related to a particular customer and make sure that order is assigned and displayed with the appropriate customer. As long as a customer had at least one order that month, they’re going to show up in this report.

Outer JOIN:

We have another requirement for another report that says we want to see all of our customer that had sales in that particular month, whether that customer had a sale or not. They way do that is, kick in and use our Outer JOIN operation.

The Outer JOIN operation says we’re going to display all of the customers, whether they had an order or not within that month. Where there is an order, we will match them. Where there isn’t an order, we’ll still display it. We’ll place empty values in the columns that come from the table that we’re Joining.

[8:35] If we go back to our Venn diagram again, we’ll see that we have our customer table here.

011-outer-join-purpose-of-the-join-in-sql

We have our sales order table here. When we look at the intersection, we’re actually saying that we’re going to return all of the customers as well as the intersecting orders.

The Outer JOIN operation preserves one or the other table. You’ll see that actually it will preserve one or the other or both as we talk about the different types of Outer JOINs. It’s going to preserve the data from one table. For example, in this case, the table is the customer table it’s going to preserve.

[9:34] Let’s take a look at the syntax of the inner JOIN operation.

Inner Join Query:

SELECT customer.customerid
, customer.firstname
, customer.lastname
, SalesOrderHeader.SalesOrderID
, SalesOrderHeader.OrderDate

FROM SalesLT.Customer

INNER JOIN SalesLT.SalesOrderHeader
ON customer.customerid = SalesOrderHeader.customerid;

  • Each join brings together two tables/views based on common column value(s)
  • The inner join query only returns a row when there is a matching value
  • Columns from both tables are available for the column list, grouping, filteringand sorting

Whenever we are writing our query, remember we’re here in our FROM clause and we’re going to be identifying the tables that are going to be used to pull the data out. This is our input phase of our query.

We’ve seen that you can specify a single table and have it return the data from it, but now we want to bring data back from multiple tables. The way we bring data back from multiple tables is by initiating the JOIN operation.

[10:07] Now we see we have this new set of keywords here called the Inner JOIN. The inner JOIN tells SQL Server that we need to take the first table that was identified and we need to JOIN it with the second table that’s identified just to the right of the word “JOIN.”

We also need to specify in the ON clause what are the columns that we’re going to do the JOIN. In other words, what are the two columns that are in common between these tables that we can use to combine this data together. Going from the customer to the order, we see that the CustomerID is the table that’s in common between those particular tables.

012-outer-join-purpose-of-the-join-in-sql

Because the CustomerID is the same name, we have to be a little bit more specific about which CustomerID column we’re talking about. We have to further qualify it with the name of the table that is going to be where that particular column is coming from.

[11:18] The CustomerID here is coming from the customer table, and the CustomerID here is coming from the SalesOrderHeader table.

013-outer-join-purpose-of-the-join-in-sql

Now, you’ll also notice in the column list above that we have specified the table in front of every single column.

014-outer-join-purpose-of-the-join-in-sql

It is only required that we specify the column or the table in front of each column name whenever there is two columns of the same name.

If you don’t you’ll get a message back saying that you have an ambiguous column name, and remember that term ambiguous column name. That’s pretty much exactly the error message that comes back from SQL Server if you do not do the proper prequalification of those columns that have the same name.

[12:07] The other columns, it’s always a good practice to do. So you don’t have to prequalify the other columns that are not duplicates within that list of column names, but it’s always a good practice so that you know which columns are coming from which table. It can come in handy during debugging where you may want to disable a JOIN operation. You’ll have to get rid of those columns in the column list that were in the table that you tried to disable the JOIN from. It’s a good practice just to put the table name in front of each column name whenever you reference it in a query that’s doing JOINs.

[12:51] We’re going to go in and we’ll just write a query. Remember, whenever you’re creating a query you can use the asterisk (*) only during development phase. You don’t worry about your column list yet. SO we’ll just say just “give me all the columns right now” by using SELECT *, and then we’ll focus in on our FROM clause.

015-select-table-purpose-of-the-join-in-sql

Then we’re going to go into SalesLT.Customer. Then a good practice putting all of these on separate lines. We would then specify we want to do an inner JOIN with SalesLT.SalesOrderHeader.

016-select-table-purpose-of-the-join-in-sql

[13:42] Then, of course, the next thing we have to specify is that ON clause, and this is where we identify what are the columns that we’re going to be Joining ON. So we say  ON and here we have to be explicit. We want to go to the Customer table, and pull out the CustomerID, and it’s going to be equal to the SalesOrderHeader table CustomerID. Put our semicolon on the end here, and when we execute this particular query.

017-select-table-purpose-of-the-join-in-sql

We get rows back that start out with customer columns. We have the CustomerID, their name, titles, company.

018-select-table-purpose-of-the-join-in-sql

If we keep sliding to the right eventually we’ll see that we have SalesOrderHeader information, the SalesOrderID, the RevisionNumber, the OrdeDate, DueDate, and so on and so forth. All of the information that is stored away in the SalesOrderHeader.

019-select-table-purpose-of-the-join-in-sql

Now, at this point, once you combine these together you say, “Gee, I’m getting all the right data that I want.

Then, of course you could go back in, and define your individual columns that you want to have displayed back to you. You may want the CustomerID, so we would definitely want to say Customer.CustomerID, maybe we’d also like the Customer.LastName so we’ll include the last name. we like the SalesOrderHeader, sales dot sales order ID. We’ll take that, and we’d also like the order date and we’ll take the SalesOrderHeader.order date. You can see that Intellisense can help you type all this stuff in quicker. Intellisense really does make a difference.

020-select-table-purpose-of-the-join-in-sql

 

021-select-table-purpose-of-the-join-in-sql

Now, when I execute this we’ll see that I’m getting my data back.

022-select-table-purpose-of-the-join-in-sql

And I’m getting, for each customer I have their orders, and if the same customer had multiple orders, then of course there would be multiple rows with the same customer name and customer information.

[16:08] If I had a customer that had three orders within the month that I was looking at or all the orders in this case, then I would end up seeing the customer information replicated three times, one for each order header, and that’s the way a relational result set comes back. It will repeat the data from the left side for each row on the right side.

That’s how we do an Inner JOIN operation. It looks pretty simple, and it really is. A lot of people get confused by it, but it actually is pretty straight forward.

Syntax of the inner JOIN operation:

SELECT customer.customerid
, customer.firstname
, customer.lastname
, SalesOrderHeader.SalesOrderID
, SalesOrderHeader.OrderDate
FROM SalesLT.Customer
LEFT OUTER JOIN SalesLT.SalesOrderHeader
ON customer.customerid = SalesOrderHeader.customerid;

  • Each join brings together two tables/views based on common column value(s).
  • The Outer JOIN query only returns a row when there is a matching value.
  • Rows from left, right or both tables (based on join operator) are reserved when there is no match.
  • Columns from the non-preserved table are set to NULL.

Let’s talk about the Outer JOIN, and this is where the mind starts to get a little fuzzy again when we start talking about Outer JOINs.

[17:06] The Outer JOIN is the same syntactic construct as the Inner JOIN except all we have to do is change the key words. Everything else is the same except we say left Outer JOIN, and we’ll see that there are three different types of Outer JOINs. There is a left Outer JOIN, a right Outer JOIN, and a full Outer JOIN. We’ll talk more about those in a moment.

023-outer-join-select-table-purpose-of-the-join-in-sql

Let’s focus on the left Outer JOIN. This is the one that you’ll probably do the most in writing queries. What we mean by left is that we’re looking at the table that was defined to the left of the JOIN keyword. Now, because of the way I’ve put the statement on multiple lines the table that’s to the left is the one that’s right in front of it, which is this table right here.

024-outer-join-select-table-purpose-of-the-join-in-sql

[17:48] That is the left table, and the right table is this table over here. It’s the one to the right, or following the JOIN keyword. Just keep that in mind. The left table is the first table identified in front of the JOIN keyword, and the right table is the table following the JOIN keyword.

Once we identify that what it’s basically saying is we want to preserve all the rows in the customer table. whether there is an order associated to that customer or not.

That’s really what the Outer JOIN does. It does everything that the Inner JOIN does. It just also preserves the data from which ever table you identify in the Outer JOIN operation.

Let’s take a look at the Outer JOIN, and I’m going to change the word “inner” to the word “left outer”. That’s all I really need to do.

025-outer-join-select-table-purpose-of-the-join-in-sql

To:

026-outer-join-select-table-purpose-of-the-join-in-sql

[18:58] When I execute this particular query, we’ll see that we have a number of customers over here that do not have an order.

027-outer-join-select-table-purpose-of-the-join-in-sql

If we scroll down, we’ll see eventually we will come across somebody that does, somewhere down in here. Actually looks like most of the customers in this database don’t have any orders. There’s one, there’s a customer that has an order. Beck has an order.

028-outer-join-select-table-purpose-of-the-join-in-sql

Wherever there was an order, it Joined them together just like an Inner JOIN. Where there was not an order, it preserved the customer information in a column from the customer table, and it put a “NULL” state into all the SalesOrderHeader columns.

One thing to be aware of, these keywords here have some that are considered what are called “noise words.” The word, “outer” is considered a noise word. In other words, you can get rid of the word, “outer” and just say “left JOIN,” and it knows there’s no such thing as a left inner JOIN.

I didn’t mention it before, but I’ll mention it now, the word, “inner” is also a noise word, not required. If I just say, “JOIN,” it will perform an inner JOIN operation. If I say, “left JOIN” it performs a left Outer JOIN operation.

029-outer-join-select-table-purpose-of-the-join-in-sql

The word, “inner” and “outer” are optional words.

[20:56] You may run into this as you’re looking at other people’s queries, and if you saw, “Where’s the word, “outer,” where’s the word, “outer”? It’s OK, they’re not necessary, but probably good from a writing a complete statement.

Let’s talk a little bit more about the Outer JOIN types.

030-outer-join-select-table-purpose-of-the-join-in-sql

I mentioned it a moment ago, let’s just highlight this again. We have what are called the left Outer JOIN, we saw that just a moment ago. We also have the right Outer JOIN, and the right Outer JOIN is basically saying, “Preserve the table that is defined to the right of the JOIN keyword.”

[21:36] All of those rows are going to be returned whether there is a match with the left table or not. There’s something called the Full Outer JOIN. This basically says that we’re going to preserve both tables. All the data from the left table is going to be preserved. All the data from the right table is going to be preserved.

Where there’s matching rows, they’re going to be Joined together just like an Inner JOIN. Where there are no rows in the right table, they will get NULLs. Where there are no rows in the left table, it will get NULLs.

[22:05] It has this way of allowing you to match data together, find out where there are differences, really useful for deltas, and I’m going to show you an example later. Let’s take a look at the right Outer JOIN for a moment.

If I change the left to right, and I execute this.

031-right-outer-join-select-table-purpose-of-the-join-in-sql

It basically comes back with the same data as if I did an inner JOIN.

032-right-outer-join-select-table-purpose-of-the-join-in-sql

The reason that that’s the case is because all of the orders have a customer associated with them.

[22:40] You don’t have to worry about it. Of course the reason is because there was a relationship defined that does not allow a SalesOrderHeader row to be inserted without a valid CustomerID. In this case, it just looks just like an inner…a right Outer JOIN looks just like an inner JOIN, effectively.

It’s still performing the Right Outer JOIN operation. Of course, all I’d have to do is I could flip these two tables around, put the customer table second, and the SalesOrderHeader table first, and it would come back with the same result set as a Left Outer JOIN, it just has the tables flipped around.

The ON clause does not have any impact on the left or right Outer JOIN. In other words, I could say, “SalesOrderHeader.customerID = customer.customerID,” or the way I have it now makes no difference at all. All of the preservation processing is determined by the keyword (RIGHT).

[23:39] Let’s take a look at the full Outer JOIN for a moment. You can see from the example the full Outer JOIN has NULL values showing up on both the left table here and the right table.

033-full-outer-join-sql

I’ve created a couple of tables called, NewProductModel and OldProductModel.

033-full-outer-join-example-sql

They basically are the same table except there are some rows that are matched, there are some rows that are not a match.

We’d like to figure out where this is a match, where there isn’t a match, where there’s rows in one table that don’t exist in the table on the right, and where there are rows in the table on the right that don’t exist on the table on the left.

[24:28] We’ll say, “SELECT * FROM dbo. and we’ll just start with the old table will be on the left. We’ll say, “FULL Outer JOIN,” and of course again the word, “OUTER” is optional. We’ll say, “DBO.NewProductModel.” We’re going to JOIN this ON.

034-full-outer-join-sql

If we look at the actual column in the table we’ll see they both have the same columns, ProductModel ID and name.

035-full-outer-join-sql

We could JOIN it on either the ProductModel ID or the name. Usually you’re going to JOIN on some sort of a key value, a numeric value is a little more efficient to JOIN on than alphabetic characters.

[25:13] We’ll just pick the ProductModel ID as our JOIN condition. We’ll say, “We’re going to take the OldProductModel.ProductModelID = NewProductModel.ProductModelID.”

036-full-outer-join-sql

When we execute this, we’ll see that we get back rows that match, so we see that product model ID one exists in both tables.

037-full-outer-join-sql

We see that ProductModelID two only exists in the NewProduct table, does not exist in the OldProduct table.

We see ProductModelID 3 and 4 exist in both tables.ProductModelID 7 exists in the Old table, but does not exist in the New table.

Now very easily, you’re able to go in and kind of figure out what the differences are between two exact tables.

[26:34] A Full Outer JOIN is very handy for that kind of operation, and you may run into others based on requirements that pop up, but here’s a good way of doing kind of a delta, and determining the differences between tables, or what’s in common between them, and so on and so forth. Full Outer JOIN has some handy uses, and that’s one use case right there.

There are 3 different types of Joins.

  • Left Outer
  • Right Outer
  • Full Outer

The one you’ll probably use the most is the Left Outer JOIN, but based on where your JOINs…or where your tables are defined within your query, it really depends on when you’re Joining more than one table, or if you’re Joining more than two tables.

If you’re Joining 3, 4, 5 or 6 tables, depending on how you write them, you may need to do a Right Outer JOIN. A Full Outer JOIN you’re going to have a specific purpose for.

Table Aliases:

038-full-outer-join-sql

[27:31] We also have this thing called Table Aliases, and Table Aliases are really short-hand, a way for you to cut down the amount of typing that you need to do as you’re identifying these columns, especially those that could potentially be ambiguous.

If you look at the example above, we’ll see that we have the same query first of all at the top without Table Aliases. Then down below using a table alias.

Up here, we had to reference each individual table name every time we wanted to further qualify that column.

[28:06] Down here, we see that we have the ability by specifying, “AS,” and then identifying some sort of other name we can create an alias. From this point forward within this query, the customer table is going to be called, “C,” and the SalesOrderHeader table is going to be called, “soh.”

You can make these aliases as small or as big as you want. Usually people make them smaller so they can reduce the amount of typing. You want them to be fairly descriptive, so that at least within the context of that query it’s easy to identify which particular table it’s associated to.

[28:58] Now for the entire query, you must now reference the customer table and the SalesOrderHeader table with the alias. Here even in the column list where we effectively haven’t defined this until the, “from” clause it still applies.

If you went in and you typed, “customer.customerID,” after defining customer with an alias, it will actually come back and say, “I don’t know what you’re talking about.” The alias is more than an alias. You have completely renamed that table in the context of that query. It will not even recognize its original name.

[29:38] Let’s go to our Full Outer JOIN and add aliases. I’ve got my OldProduct table, we’ll call it, “OP” for Old Product, and I have one over here called, “NP.” I told you that you would put the “AS” clause in front of it.

039-alias-full-outer-join-sql

AS is a noise word. (In other words, you don’t need it.) If you like it, and it makes things more readable, you can do it. It’ll work without it.

[30:03] I tell you this because you’ll probably run into other queries that basically don’t have the word, “as” and it still works.

Down below, you’ll notice what happened was all of a sudden, the OldProduct ModelProductModel ID column doesn’t recognize it.

040-alias-full-outer-join-sql

In fact that little error message says, “It’s not bound.” In other words, I don’t know what this column is.

[30:50] Same with this one. It doesn’t even know what that column is. The minute I put that alias on that table name, it doesn’t recognize it, the old table name. Now I need to go in and say, op.ProductModelID, and np.ProductModel ID.

041-alias-full-outer-join-sql

Of course if I wanted I can go up here and reference these columns. I have the np.name and I have the op.name.

042-alias-full-outer-join-sql

Again, the query looks at the processor of SQL, looks at the entire statement. It’s not like it’s waiting and says, “You didn’t define this here so you can’t use it up here.”

It looks at the entire statement at a time, because it’s a specification language, not a procedural language. I can execute that and I can see my results.

So the aliases are strictly for internal query structure purposes. It doesn’t reflect on the outside. You don’t even see that those columns have aliases in our result set. It’s strictly for internal query processing itself.

[32:04] So, that is the JOIN, and if there’s one thing you can get out of this particular video series is hopefully that you understand the JOIN. And you understand the inner JOIN, the Outer JOIN, and the Table Aliases. That is the most important thing that you’ll need to understand to really take your SQL knowledge to the next level.

Posted in SQL Server | Posted in , , , , , , , , | Leave a comment

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">