Which type of join returns the matching rows from the tables that are being joined?

SQL joins allow our relational database management systems to be, well, relational.

Joins allow us to re-construct our separated database tables back into the relationships that power our applications.

In this article, we'll look at each of the different join types in SQL and how to use them.

Here's what we'll cover:

  • What is a join?
  • Setting up your database
  • CROSS JOIN
  • Setting up our example data (directors and movies)
  • FULL OUTER JOIN
  • INNER JOIN
  • LEFT JOIN / RIGHT JOIN
  • Filtering using LEFT JOIN
  • Multiple joins
  • Joins with extra conditions
  • The reality about writing queries with joins

(Spoiler alert: we'll cover five different types—but you really only need to know two of them!)

What is a join?

A join is an operation that combines two rows together into one row.

These rows are usually from two different tables—but they don't have to be.

Before we look at how to write the join itself, let's look at what the result of a join would look like.

Let's take for example a system that stores information about users and their addresses.

The rows from the table that stores user information might look like this:

 id |     name     |        email        | age
----+--------------+---------------------+-----
  1 | John Smith   |  |  25
  2 | Jane Doe     |    |  28
  3 | Xavier Wills |      |  3
...
(7 rows)

And the rows from the table that stores address information might look like this:

 id |      street       |     city      | state | user_id
----+-------------------+---------------+-------+---------
  1 | 1234 Main Street  | Oklahoma City | OK    |       1
  2 | 4444 Broadway Ave | Oklahoma City | OK    |       2
  3 | 5678 Party Ln     | Tulsa         | OK    |       3
(3 rows)

We could write separate queries to retrieve both the user information and the address information—but ideally we could write one query and receive all of the users and their addresses in the same result set.

This is exactly what a join lets us do!

We'll look at how to write these joins soon, but if we joined our user information to our address information we could get a result like this:

 id |     name     |        email        | age | id |      street       |     city      | state | user_id
----+--------------+---------------------+-----+----+-------------------+---------------+-------+---------
  1 | John Smith   |  |  25 |  1 | 1234 Main Street  | Oklahoma City | OK    |       1
  2 | Jane Doe     |    |  28 |  2 | 4444 Broadway Ave | Oklahoma City | OK    |       2
  3 | Xavier Wills |      |  35 |  3 | 5678 Party Ln     | Tulsa         | OK    |       3
(3 rows)

Here we see all of our users and their addresses in one nice result set.

Besides producing a combined result set, another important use of joins is to pull extra information into our query that we can filter against.

For example, if we wanted to send some physical mail to all users who live in Oklahoma City, we could use this joined-together result set and filter based on the city column.

Now that we know the purpose of a joins—let's start writing some!

Setting up your database

Before we can write our queries we need to setup our database.

For these examples we'll be using PostgreSQL, but the queries and concepts shown here will easily translate to any other modern database system (like MySQL, SQL Server, etc.).

To work with our PostgreSQL database, we can use psql—the interactive PostgreSQL command line program. If you have another database client that you enjoy working with that's fine too.

To begin, let's create our database. With PostgreSQL already installed, we can run the command createdb <database-name> at our terminal to create a new database. I called mine fcc:

$ createdb fcc

Next let's start the interactive console by using the command psql and connect to the database we just made using \c <database-name>:

$ psql
psql (11.5)
Type "help" for help.

john=# \c fcc
You are now connected to database "fcc" as user "john".
fcc=#
Note: I've cleaned up the psql output in these examples to make it easier to read, so don't worry if the output shown here isn't exactly what you've seen in your terminal.

I encourage you to follow along with these examples and run these queries for yourself. You will learn and remember far more by working through these examples rather than just reading them.

Now onto the joins!

The simplest kind of join we can do is a CROSS JOIN or "Cartesian product."

This join takes each row from one table and joins it with each row of the other table.

If we had two lists—one containing 1, 2, 3 and the other containing A, B, C—the Cartesian product of those two lists would be this:

1A, 1B, 1C
2A, 2B, 2C
3A, 3B, 3C

Each value from the first list is paired with each value of the second list.

Let's write this same example as a SQL query.

First let's create two very simple tables and insert some data into them:

CREATE TABLE letters(
  letter TEXT
);

INSERT INTO letters(letter) VALUES ('A'), ('B'), ('C');

CREATE TABLE numbers(
  number TEXT
);

INSERT INTO numbers(number) VALUES (1), (2), (3);

Our two tables, letters and numbers, just have one column: a simple text field.

Now let's join them together with a CROSS JOIN:

SELECT *
FROM letters
CROSS JOIN numbers;
 letter | number
--------+--------
 A      | 1
 A      | 2
 A      | 3
 B      | 1
 B      | 2
 B      | 3
 C      | 1
 C      | 2
 C      | 3
(9 rows)

This is the simplest type of join we can do—but even in this simple example we can see the join at work: the two separate rows (one from letters and one from numbers) have been joined together to form one row.

While this type of join is often discussed as a mere academic example, it does have at least one good use case: covering date ranges.

CROSS JOIN with date ranges

One good use case of a CROSS JOIN is to take each row from a table and apply it to every day within a date range.

Say for example you were building an application that tracked daily tasks—things like brushing your teeth, eating breakfast, or showering.

If you wanted to generate a record for every task and for each day of the past week, you could use a CROSS JOIN against a date range.

To make this date range, we can use the generate_series function:

SELECT generate_series(
  (CURRENT_DATE - INTERVAL '5 day'),
  CURRENT_DATE,
  INTERVAL '1 day'
)::DATE AS day;

The generate_series function takes three parameters.

The first parameter is the starting value. In this example we use CURRENT_DATE - INTERVAL '5 day'. This returns the current date minus five days—or "five days ago."

The second parameter is the current date (CURRENT_DATE).

The third parameter is the "step interval"—or how much we want to increment the value each time. Since these are daily tasks we'll use the interval of one day (INTERVAL '1 day').

Putting it all together, this generates a series of dates starting five days ago, ending today, and going one day at a time.

Finally we remove the time portion by casting the output of these values to a date using ::DATE, and we alias this column using AS day to make the output a little nicer.

The output of this query is the past five days plus today:

    day
------------
 2020-08-19
 2020-08-20
 2020-08-21
 2020-08-22
 2020-08-23
 2020-08-24
(6 rows)

Going back to our tasks-per-day example, let's create a simple table to hold the tasks we want to complete and insert a few tasks:

CREATE TABLE tasks(
  name TEXT
);

INSERT INTO tasks(name) VALUES
('Brush teeth'),
('Eat breakfast'),
('Shower'),
('Get dressed');

Our tasks table just has one column, name, and we inserted four tasks into this table.

Now let's CROSS JOIN our tasks with the query to generate the dates:

SELECT
  tasks.name,
  dates.day
FROM tasks
CROSS JOIN
(
  SELECT generate_series(
    (CURRENT_DATE - INTERVAL '5 day'),
    CURRENT_DATE,
    INTERVAL '1 day'
  )::DATE	AS day
) AS dates

(Since our date generation query is not an actual table we just write it as a subquery.)

From this query we return the task name and the day, and the result set looks like this:

     name      |    day
---------------+------------
 Brush teeth   | 2020-08-19
 Brush teeth   | 2020-08-20
 Brush teeth   | 2020-08-21
 Brush teeth   | 2020-08-22
 Brush teeth   | 2020-08-23
 Brush teeth   | 2020-08-24
 Eat breakfast | 2020-08-19
 Eat breakfast | 2020-08-20
 Eat breakfast | 2020-08-21
 Eat breakfast | 2020-08-22
 ...
 (24 rows)

Like we expected, we get a row for each task for every day in our date range.

The CROSS JOIN is the simplest join we can do, but to look at the next few types we'll need a more-realistic table setup.

Creating directors and movies

To illustrate the following join types, we'll use the example of movies and movie directors.

In this situation, a movie has one director, but a movie isn't required to have a director—imagine a new movie being announced but the choice for director hasn't yet been confirmed.

Our directors table will store the name of each director, and the movies table will store the name of the movie as well as a reference to the director of the movie (if it has one).

Let's create those two tables and insert some data into them:

CREATE TABLE directors(
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL
);

INSERT INTO directors(name) VALUES
('John Smith'),
('Jane Doe'),
('Xavier Wills')
('Bev Scott'),
('Bree Jensen');

CREATE TABLE movies(
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  director_id INTEGER REFERENCES directors 
);

INSERT INTO movies(name, director_id) VALUES
('Movie 1', 1),
('Movie 2', 1),
('Movie 3', 2),
('Movie 4', NULL),
('Movie 5', NULL);

We have five directors, five movies, and three of those movies have directors assigned to them. Director ID 1 has two movies, and director ID 2 has one.

FULL OUTER JOIN

Now that we have some data to work with let's look at the FULL OUTER JOIN.

A FULL OUTER JOIN has some similarities to a CROSS JOIN, but it has a couple key differences.

The first difference is that a FULL OUTER JOIN requires a join condition.

A join condition specifies how the rows between the two tables are related to each other and on what criteria they should be joined together.

In our example, our movies table has a reference to the director via the director_id column, and this column matches the id column of the directors table. These are the two columns that we will use as our join condition.

Here's how we write this join between our two tables:

SELECT *
FROM movies
FULL OUTER JOIN directors
  ON directors.id = movies.director_id;

Notice the join condition we specified that matches the movie to its director: ON movies.director_id = directors.id.

Our result set looks like an odd Cartesian product of sorts:

  id  |  name   | director_id |  id  |     name
------+---------+-------------+------+--------------
    1 | Movie 1 |           1 |    1 | John Smith
    2 | Movie 2 |           1 |    1 | John Smith
    3 | Movie 3 |           2 |    2 | Jane Doe
    4 | Movie 4 |        NULL | NULL | NULL
    5 | Movie 5 |        NULL | NULL | NULL
 NULL | NULL    |        NULL |    5 | Bree Jensen
 NULL | NULL    |        NULL |    4 | Bev Scott
 NULL | NULL    |        NULL |    3 | Xavier Wills
(8 rows)

The first rows we see are ones where the movie had a director, and our join condition evaluated to true.

However, after those rows we see each of the remaining rows from each table—but with NULL values where the other table didn't have a match.

Note: if you're unfamiliar with NULL values, see my explanation here in this SQL operator tutorial.

We also see another difference between the CROSS JOIN and FULL OUTER JOIN here. A FULL OUTER JOIN returns one distinct row from each table—unlike the CROSS JOIN which has multiple.

INNER JOIN

The next join type, INNER JOIN, is one of the most commonly used join types.

An inner join only returns rows where the join condition is true.

In our example, an inner join between our movies and directors tables would only return records where the movie has been assigned a director.

The syntax is basically the same as before:

SELECT *
FROM movies
INNER JOIN directors
  ON directors.id = movies.director_id;

Our result shows the three movies that have a director:

 id |  name   | director_id | id |    name
----+---------+-------------+----+------------
  1 | Movie 1 |           1 |  1 | John Smith
  2 | Movie 2 |           1 |  1 | John Smith
  3 | Movie 3 |           2 |  2 | Jane Doe
(3 rows)

Since an inner join only includes rows that match the join condition, the order of the two tables in the join don't matter.

If we reverse the order of the tables in the query we get same result:

SELECT *
FROM directors
INNER JOIN movies
  ON movies.director_id = directors.id;
 id |    name    | id |  name   | director_id
----+------------+----+---------+-------------
  1 | John Smith |  1 | Movie 1 |           1
  1 | John Smith |  2 | Movie 2 |           1
  2 | Jane Doe   |  3 | Movie 3 |           2
(3 rows)

Since we listed the directors table first in this query and we selected all columns (SELECT *), we see the directors column data first and then the columns from movies—but the resulting data is the same.

This is a useful property of inner joins, but it's not true for all join types—like our next type.

LEFT JOIN / RIGHT JOIN

These next two join types use a modifier (LEFT or RIGHT) that affects which table's data is included in the result set.

Note: the LEFT JOIN and RIGHT JOIN can also be referred to as LEFT OUTER JOIN and RIGHT OUTER JOIN.

These joins are used in queries where we want to return all of a particular table's data and, if it exists, the associated table's data as well.

If the associated data doesn't exist, we still get back all of the "primary" table's data.

It's a query for information about a particular thing and bonus information if that bonus information exists.

This will be simple to understand with an example. Let's find all movies and their directors, but we don't care if they have a director or not—it's a bonus:

SELECT *
FROM movies
LEFT JOIN directors
  ON directors.id = movies.director_id;

The query follows our same pattern as before—we've just specified the join as a LEFT JOIN.

In this example, the movies table is the "left" table.

If we write the query on one line it makes this a little easier to see:

... FROM movies LEFT JOIN directors ...

A left join returns all records from the "left" table.

A left join returns any rows from the "right" table that match the join condition.

Rows from the "right" table that don't match the join condition are returned as NULL.

 id |  name   | director_id |  id  |    name
----+---------+-------------+------+------------
  1 | Movie 1 |           1 |    1 | John Smith
  2 | Movie 2 |           1 |    1 | John Smith
  3 | Movie 3 |           2 |    2 | Jane Doe
  4 | Movie 4 |        NULL | NULL | NULL
  5 | Movie 5 |        NULL | NULL | NULL
(5 rows)

Looking at that result set, we can see why this type of join is useful for "all of this and, if it exists, some of that" type queries.

RIGHT JOIN

The RIGHT JOIN works exactly like the LEFT JOIN—except the rules about the two tables are reversed.

In a right join, all of the rows from the "right" table are returned. The "left" table is conditionally returned based on the join condition.

Let's use the same query as above but substitute LEFT JOIN for RIGHT JOIN:

SELECT *
FROM movies
RIGHT JOIN directors
  ON directors.id = movies.director_id;
  id  |  name   | director_id | id |     name
------+---------+-------------+----+--------------
    1 | Movie 1 |           1 |  1 | John Smith
    2 | Movie 2 |           1 |  1 | John Smith
    3 | Movie 3 |           2 |  2 | Jane Doe
 NULL | NULL    |        NULL |  5 | Bree Jensen
 NULL | NULL    |        NULL |  4 | Bev Scott
 NULL | NULL    |        NULL |  3 | Xavier Wills
(6 rows)

Our result set now returns every directors row and, if it exists, the movies data.

All we've done is switch which table we're considering the "primary" one—the table we want to see all of the data from regardless of if its associated data exists.

LEFT JOIN / RIGHT JOIN in production applications

In a production application, I only ever use LEFT JOIN and I never use RIGHT JOIN.

I do this because, in my opinion, a LEFT JOIN makes the query easier to read and understand.

When I'm writing queries I like to think of starting with a "base" result set, say all movies, and then bring in (or subtract out) groups of things from that base.

Because I like to start with a base, the LEFT JOIN fits this line of thinking. I want all of the rows from my base table (the "left" table), and I conditionally want the rows from the "right" table.

In practice, I don't think I've ever even seen a RIGHT JOIN in a production application. There's nothing wrong with a RIGHT JOIN—I just think it makes the query more difficult to understand.

Re-writing RIGHT JOIN

If we wanted to flip our scenario above and instead return all directors and conditionally their movies, we can easily re-write the RIGHT JOIN into a LEFT JOIN.

All we need to do is flip the order of the tables in the query, and change RIGHT to LEFT:

SELECT *
FROM directors
LEFT JOIN movies
  ON movies.director_id = directors.id;
Note: I like to put the table that is being joined on (the "right" table—in the example above movies) first in the join condition (ON movies.director_id = ...)—but that's just my personal preference.

Filtering using LEFT JOIN

There's two use cases for using a LEFT JOIN (or RIGHT JOIN).

The first use case we've already covered: to return all of the rows from one table and conditionally from another.

The second use case is to return rows from the first table where the data from the second table isn't present.

The scenario would look like this: find directors who don't belong to a movie.

To do this we'll start with a LEFT JOIN and our directors table will be the primary or "left" table:

SELECT *
FROM directors
LEFT JOIN movies
  ON movies.director_id = directors.id;

For a director that doesn't belong to a movie, the columns from the movies table are NULL:

 id |     name     |  id  |  name   | director_id
----+--------------+------+---------+-------------
  1 | John Smith   |    1 | Movie 1 |           1
  1 | John Smith   |    2 | Movie 2 |           1
  2 | Jane Doe     |    3 | Movie 3 |           2
  5 | Bree Jensen  | NULL | NULL    |        NULL
  4 | Bev Scott    | NULL | NULL    |        NULL
  3 | Xavier Wills | NULL | NULL    |        NULL
(6 rows)

In our example, director ID 3, 4, and 5 don't belong to a movie.

To filter our result set just to these rows, we can add a WHERE clause to only return rows where the movie data is NULL:

SELECT *
FROM directors
LEFT JOIN movies
  ON movies.director_id = directors.id
WHERE movies.id IS NULL;
 id |     name     |  id  | name | director_id
----+--------------+------+------+-------------
  5 | Bree Jensen  | NULL | NULL |        NULL
  4 | Bev Scott    | NULL | NULL |        NULL
  3 | Xavier Wills | NULL | NULL |        NULL
(3 rows)

And there are our three movie-less directors!

It's common to use the id column of the table to filter against (WHERE movies.id IS NULL), but all columns from the movies table are NULL—so any of them would work.

(Since we know that all the columns from the movies table will be NULL, in the query above we could just write SELECT directors.* instead of SELECT * to just return all of the director's information.)

Using LEFT JOIN to find matches

In our previous query we found directors that didn't belong to movies.

Using our same structure, we could find directors that do belong to movies by changing our WHERE condition to look for rows where the movie data is not NULL:

SELECT *
FROM directors
LEFT JOIN movies
  ON movies.director_id = directors.id
WHERE movies.id IS NOT NULL;
 id |    name    | id |  name   | director_id
----+------------+----+---------+-------------
  1 | John Smith |  1 | Movie 1 |           1
  1 | John Smith |  2 | Movie 2 |           1
  2 | Jane Doe   |  3 | Movie 3 |           2
(3 rows)

This may seem handy, but we've actually just re-implemented INNER JOIN!

Multiple joins

We've seen how to join two tables together, but what about multiple joins in a row?

It's actually quite simple, but to illustrate this we need a third table: tickets.

This table will represent tickets sold for a movie:

CREATE TABLE tickets(
  id SERIAL PRIMARY KEY,
  movie_id INTEGER REFERENCES movies NOT NULL
);

INSERT INTO tickets(movie_id) VALUES (1), (1), (3);

The tickets table just has an id and a reference to the movie: movie_id.

We've also inserted two tickets sold for movie ID 1, and one ticket sold for movie ID 3.

Now, let's join directors to movies—and then movies to tickets!

SELECT *
FROM directors
INNER JOIN movies
  ON movies.director_id = directors.id
INNER JOIN tickets
  ON tickets.movie_id = movies.id;

Since these are inner joins, the order in which we write the joins doesn't matter. We could have started with tickets, then joined on movies, and then joined on directors.

It again comes down to what you're trying to query and what makes the query the most understandable.

In our result set, we'll notice that we've further narrowed down the rows that are returned:

 id |    name    | id |  name   | director_id | id | movie_id
----+------------+----+---------+-------------+----+----------
  1 | John Smith |  1 | Movie 1 |           1 |  1 |        1
  1 | John Smith |  1 | Movie 1 |           1 |  2 |        1
  2 | Jane Doe   |  3 | Movie 3 |           2 |  3 |        3
(3 rows)

This makes sense because we've added another INNER JOIN. In effect this adds another "AND" condition to our query.

Our query essentially says: "return all directors that belong to movies that also have ticket sales."

If instead we wanted to find directors that belong to movies that may not have ticket sales yet, we could substitute our last INNER JOIN for a LEFT JOIN:

SELECT *
FROM directors
JOIN movies
  ON movies.director_id = directors.id
LEFT JOIN tickets
  ON tickets.movie_id = movies.id;

We can see that Movie 2 is now back in the result set:

 id |    name    | id |  name   | director_id |  id  | movie_id
----+------------+----+---------+-------------+------+----------
  1 | John Smith |  1 | Movie 1 |           1 |    1 |        1
  1 | John Smith |  1 | Movie 1 |           1 |    2 |        1
  2 | Jane Doe   |  3 | Movie 3 |           2 |    3 |        3
  1 | John Smith |  2 | Movie 2 |           1 | NULL |     NULL
(4 rows)

This movie didn't have any ticket sales, so it was previously excluded from the result set due to the INNER JOIN.

I'll leave this an Exercise For The Reader™, but how would you find directors that belong to movies that don't have any ticket sales?

Join execution order

In the end, we don't really care in what order the joins are executed.

One of the key differences between SQL and other modern programming languages is that SQL is a declarative language.

This means that we specify the outcome we want, but we don't specify the execution details—those details are left up to the database query planner. We specify the joins we want and the conditions on them and the query planner handles the rest.

But, in reality, the database is not joining three tables together at the same time. Instead, it will likely join the first two tables together into one intermediary result, and then join that intermediary result set to the third table.

(Note: This is a somewhat simplified explanation.)

So, as we're working with multiple joins in queries we can just think of them as a series of joins between two tables—although one of those tables can get quite large.

The last topic we'll cover is a join with extra conditions.

Similar to a WHERE clause, we can add as many conditions as we want to our join conditions.

For example, if we wanted to find movies with directors that are not named "John Smith", we could add that extra condition to our join with an AND:

SELECT *
FROM movies
INNER JOIN directors
  ON directors.id = movies.director_id
  AND directors.name <> 'John Smith';

We can use any operators we would put in a WHERE clause in this join condition.

We also get the same result from this query if we put the condition in a WHERE clause instead:

SELECT *
FROM movies
INNER JOIN directors
  ON directors.id = movies.director_id
WHERE directors.name <> 'John Smith';

There are some subtle differences happening under the hood here, but for the purpose of this article the result set is the same.

(If you're unfamiliar with all of the ways you can filter a SQL query, check out the previously mentioned article here.)

The reality about writing queries with joins

In reality, I find myself only using joins in three different ways:

INNER JOIN

The first use case is records where the relationship between two tables does exist. This is fulfilled by the INNER JOIN.

These are situations like finding "movies that have directors" or "users with posts".

LEFT JOIN

The second use case is records from one table—and if the relationship exists—records from a second table. This is fulfilled by the LEFT JOIN.

These are situations like "movies with directors if they have one" or "users with posts if they have some."

LEFT JOIN exclusion

The third most common use case is our second use case for a LEFT JOIN: finding records in one table thatdon'thave a relationship in the second table.

These are situations like "movies without directors" or "users without posts."

Two very useful join types

I don't think I've ever used a FULL OUTER JOIN or a RIGHT JOIN in a production application. The use case just doesn't come up often enough or the query can be written in a clearer way (in the case of RIGHT JOIN).

I have occasionally used a CROSS JOIN for things like spreading records across a date range (like we looked at the beginning), but that scenario also doesn't come up too often.

So, good news! There's really only two types of joins you need to understand for 99.9% of the use cases you'll come across: INNER JOIN and LEFT JOIN!

If you liked this post, you can follow me on twitter where I talk about database things and all other topics related to development.

Thanks for reading!

John

P.S. an extra tip for reading to the end: most database systems will let you just write JOIN in the place of INNER JOIN—it'll save you a little extra typing. :)

Learn to code for free. freeCodeCamp's open source curriculum has helped more than 40,000 people get jobs as developers. Get started

Which join returns the matching rows from the table?

A left join returns any rows from the "right" table that match the join condition.

Which type of join returns rows from two table only if there is a match between columns in both table?

Outer joins are joins that return matched values and unmatched values from either or both tables. There are a few types of outer joins: LEFT JOIN returns only unmatched rows from the left table, as well as matched rows in both tables.

What type of join returns all rows from the first table and any matching rows from the second table?

A cross join, also known as a Cartesian Product join, returns a result table where each row from the first table is combined with each row from the second table.

Which type of join is used to returns all records when there is a match in either left or right table?

The FULL OUTER JOIN keyword returns all records when there is a match in left (table1) or right (table2) table records.