Skip to main content

Introduction to SQL Joins

In this tutorial, you'll learn about the mechanics of joins in SQL and its different types.
Mar 2019  · 9 min read

During your data analysis in SQL, you will often need to look at multiple tables and join them to get the desired results. Joining tables for analyzing data is a required skill for a data scientist as well. In this tutorial, you will learn about different types of joins that you can perform in PostgreSQL. You will first study the preliminary concepts of joins in SQL, then learn about different types of joins and what are their syntaxes in PostgreSQL. You will finish off this tutorial by executing various types of join queries using two tables. To be able to follow along, you should be familiar with the basics of SQL and how to write simple queries in PostgreSQL. If you are looking for resources learn these, the following ones can be useful -

Let's begin.

Setting up the database environment in PostgreSQL

Before you study the basics of SQL joins it would be a good idea to create some tables for your analyses. You will need at least two tables having at least one column in common. Although you can execute self-join queries on the same table, let's ignore that fact for the time being.

You can perform all the SQL operations on a tool called pgAdmin which generally comes with the installation of PostgreSQL. Let's open pgAdmin and create a database (give it a name of your choice) to create the tables if you don't have an existing database to proceed with. You can then create the following two tables with the specifications as given -

  • student_name (id, name)
  • student_stream (id, stream)

Note that both the tables have id column in common. You can use the following CREATE statements to create the tables -

  • For student_name -
    CREATE TABLE student_name
    (
    id smallint,
    name "char"[]
    );
    
  • For student_stream -
    CREATE TABLE student_stream
    (
    id smallint,
    stream "char"[]
    );
    

Let's now insert some records into the tables.

  • For student_name -
INSERT INTO student_name(id, name) VALUES (1, 'Sayak');

INSERT INTO student_name(id, name) VALUES (2, 'Alex');

INSERT INTO student_name(id, name) VALUES (3, 'Sameer');

INSERT INTO student_name(id, name) VALUES (4, 'Rick');
  • For student_stream -
INSERT INTO student_stream(id, stream) VALUES (1, 'CS');

INSERT INTO student_stream(id, stream) VALUES (1, 'IT');

INSERT INTO student_stream(id, stream) VALUES (2, 'ECE');

INSERT INTO student_stream(id, stream) VALUES (9, 'ECE');

The table entries should be like the following in case you executed the above queries without changing anything -

student_name and student_stream tables

You now have two simple tables ready. Let's now proceed towards studying the basics of SQL joins.

SQL Joins - Basics

SQL Joins allow you to collate two or more (sometimes just one table) tables using common identifiers. Take the example of the above two tables that you created - both the tables have id column in common. You may question the need of joining in SQL. Let's discuss this in brief.

Due to normalization constraints, you may not have all the required information present in a single table. Normalization is not only desired but also is necessary in order to maintain consistency, to reduce redundancy and to prevent several insertions and update anomalies. Consider the above two table again. Suppose you wish to find out the streams in which Sayak has enrolled. To get the answer to this question, you will need to join the two tables (sub-queries can also be used, but let's not consider that now) and then proceed accordingly.

For two tables to be joined there has to be something in common between the two. Does this mean both the tables have to have at least one column with the same name? Or what exactly is this condition?

The tables you wish to join may not have a column with the same name, but logically they should be the same, i.e., their data-types should be the same. You cannot join two tables having a column with the same name but different data-types. Let's now study the different types of SQL joins.

Different types of SQL Joins

In this section, you will study several types of SQL Joins -

  • INNER JOIN
    • Self-join
  • OUTER JOIN
    • LEFT JOIN
    • RIGHT JOIN
    • FULL JOIN
  • CROSS JOIN
  • Semi-join and Anti-join

PostgreSQL provides separate keywords for the types of joins mentioned in all capital letters. Let's study each of them one by one. For this, you will first study the joins visually, and then you will execute the respective join queries in PostgreSQL.

INNER JOIN

Take a look at the following diagram in order to understand INNER JOIN visually -

Inner join diagram

In the above example, the column under consideration is the id column. INNER JOIN will ignore rest of the columns for which the values are not common in both the tables. Let's now execute a query in PostgreSQL which would perform INNER JOIN between the two tables - student_name and student_stream.

Upon executing the following query, you get the result as presented in the above figure -

SELECT s1.id, s1.name, s2.stream
FROM student_name AS s1
INNER JOIN student_stream AS s2
ON s1.id = s2.id;

s1 and s2 are the aliases for the two tables. You used the ON keyword along with INNER JOIN. The query can be executed using the USING keyword as well -

SELECT s1.id, s1.name, s2.stream
FROM student_name AS s1
INNER JOIN student_stream AS s2
USING (id);

Self Join allows you to perform joining the same table. But where this might be a good option? Suppose you have a table consisting of the following columns -

  • Country
  • Continent

Now you want to enlist two countries in a way that their continents are the same. The following figure should give you a sense of the expected results -

self join result (table form)

OUTER JOIN

OUTER JOIN can further be divided into three types -

LEFT JOIN or Left Outer JOIN: The following diagram gives you a fair idea of LEFT JOIN in SQL -

Diagram of left join in SQL

Note that unlike INNER JOIN, LEFT JOIN fetches you the records from the left (w.r.t to the order you specify in your query) table for which there was not any matching entry in the right table. This, in turn, tells you that Sameer and Rick have not enrolled in any streams. The respective query for this will be -

SELECT s1.id, s1.name, s2.stream
FROM student_name AS s1
LEFT JOIN student_stream AS s2
ON s1.id = s2.id;

RIGHT JOIN or Right Outer JOIN : RIGHT JOIN is exactly the opposite of LEFT JOIN -

diagram of right join in SQL

RIGHT JOIN can help you in finding the stream(s) for which no student has enrolled. The query for this would be -

SELECT s1.id, s1.name, s2.stream
FROM student_name AS s1
RIGHT JOIN student_stream AS s2
ON s1.id = s2.id;

FULL JOIN or Full Outer JOIN: FULL JOIN lets you combine both LEFT JOIN and RIGHT JOIN into a single compilation -

Diagram of full join in SQL

The respective query would be -

SELECT s1.id, s1.name, s2.stream
FROM student_name AS s1
FULL JOIN student_stream AS s2
ON s1.id = s2.id;

Note that you can execute all the above OUTER JOIN queries using the USING keyword. Let's now study CROSS JOIN.

CROSS JOIN

CROSS JOIN is essentially the cartesian product between two elements expressed using SQL. Suppose, you need to have all the combinations possible in between two tables or even in a single table. You will need CROSS JOIN for doing this. The following figure presents this concept visually -

Diagram of cross join

You already have two tables to test this out. In order to have all possible combinations between the id columns of the student_name and student_stream tables, you can execute the following query -

SELECT s1.id, s2.id
FROM student_name AS s1
CROSS JOIN student_stream AS s2;

And you get the following result -

All possible combinations table

Let's now see the other two types of join for which PostgreSQL does not provide any direct keywords.

Semi-join and Anti-join

Let's consider the tables that you created earlier in the tutorial:

Original table

Semi Join queries are generally executed in the form of subqueries where rows are picked up from the first table with respect to a condition (or a set of conditions) that is matched in the second table. Let's assume the left table is the first table, and the right table is the second table.

Anti Join queries are the exact opposite. In Anti Join, rows are picked up from the first table with respect to a condition (or a set of conditions) that is not matched in the second table. Here is a diagram for you to understand this visually -

Semi join and anti join diagrams

The query for realizing the Semi Join would be -

select id, name
from student_name
where id IN
(select id from student_stream where stream
IN ('CS', 'IT', 'ECE'));

Similarly, the query that realizes Anti Join, in this case, would be -

select id, name
from student_name
where id NOT IN
(select id from student_stream where stream
IN ('CS', 'IT', 'ECE'));

Note the use of NOT the above query.

Taking your SQL Join wizardry to the next level

That is all for this tutorial. You studied about a bunch different types SQL Joins in this tutorial and also made yourself familiar to their PostgreSQL syntaxes. If you want to practice some challenging SQL Join exercises, you should take DataCamp's Joining Data in SQL course. In fact, materials from the course were used as references to make this tutorial.

Let me know your questions in the Comments section.

Intermediate SQL Queries

Beginner
4 hours
1,485,093
Master the basics of querying tables in relational databases such as MySQL, SQL Server, and PostgreSQL.
See DetailsRight Arrow
Start Course

SQL for Joining Data

Beginner
5 hours
353,272
Join two or three tables together into one, combine tables using set theory, and work with subqueries in PostgreSQL.

Data Manipulation in SQL

Beginner
4 hours
155,624
Master the complex SQL queries necessary to answer a wide variety of data science questions and prepare robust data sets for analysis in PostgreSQL.
See all coursesRight Arrow
Related

How to Become a Data Analyst in 2023: 5 Steps to Start Your Career

Learn how to become a data analyst and discover everything you need to know about launching your career, including the skills you need and how to learn them.
Elena Kosourova 's photo

Elena Kosourova

18 min

What is SQL Used For? 7 Top SQL Uses

Discover the uses of SQL in industries and specific jobs. Plus, learn why the SQL language is so versatile and in demand.
Natassha Selvaraj's photo

Natassha Selvaraj

11 min

Sports Analytics: How Different Sports Use Data Analytics

Discover how sports analytics works and how different sports use data to provide meaningful insights. Plus, discover what it takes to become a sports data analyst.
Kurtis Pykes 's photo

Kurtis Pykes

13 min

SQL Window Functions Cheat Sheet

With this SQL Window Functions cheat sheet, you'll have a handy reference guide to the various types of window functions in SQL.
DataCamp Team's photo

DataCamp Team

10 min

COALESCE SQL Function

COALESCE() is one of the handiest functions in SQL. Read this tutorial to learn how to master it.
Travis Tang 's photo

Travis Tang

How to Write a Bash Script: A Simple Bash Scripting Tutorial

Discover the basics of bash scripting and learn how to write a bash script.
Kurtis Pykes 's photo

Kurtis Pykes

5 min

See MoreSee More