Skip to content
Joining Data in SQL - Course Notes by Chinmay Garg

Joining Data in SQL

For PostgreSQL, but applies to other dialects as well.

Joining data is an essential skill which enables us to draw information from separate tables together into a single, meaningful set of results.

1. INNER JOINs

  1. Key - a single column/ field or group of columns that uniquely identifies records in a table.
  2. JOINs - With SQL joins, you can join on a key field, or any other field.
  3. INNER JOIN - Looks for records in both tables with the same values in the key field.
  1. Syntax - After FROM, list the left table, followed by the INNER JOIN keyword and the right table. Specify the field/ column to match the tables on, using the ON keyword with ALIASING. When joining on 2 identical column/ field names, we can employ the USING.
  2. Note - When you SELECT fields, a field can be ambiguous. Ex. 2 tables, apples and oranges, containing a column called color. You need to use the syntax apples.color or oranges.color in the SELECT statement to point SQL to the correct table without which you'll get the error - column reference "color" is ambiguous.
  3. Tip - many SQL users prefer to write the SELECT statement after writing the JOIN code, in case the SELECT statement requires using table aliases.
Spinner
DataFrameavailable as
df
variable
-- Query 1.1 Fetching the inflation rate for ONLY the countries where inflation rate is recorded
SELECT c.code AS country_code, name, year, inflation_rate
FROM world.countries AS c
INNER JOIN world.economies AS e
ON e.code = c.code
Spinner
DataFrameavailable as
df1
variable
-- Query 1.2. Query 1.1 with USING
SELECT c.code AS country_code, name, year, inflation_rate
FROM world.countries AS c
INNER JOIN world.economies AS e
USING(code)

Relationships

  1. 1-to-many/ many-to-1 Relationships: Most common type of relationship - a single entity can be associated with several entities. Think about a music library. One artist can produce many songs over their career. This is a one-to-many relationship. The same applies for authors and their books, directors and movie titles, and so on.
  1. 1-to-1 Relationships: Less common. A commonly held premise of forensic science is that no two fingerprints are identical, and therefore that a particular fingerprint can only be generated by one person. This is an example of a one-to-one relationship: one fingerprint for one finger.
  1. Many-to-many Relationships: Ex. languages and countries. Many languages can be spoken in many countries. Belgium has 3 official languages: French, German, and Dutch. Conversely, languages can be official in many countries: Dutch is an official language of both the Netherlands and Belgium, but not Germany.

Many-to-many Scenario

  1. countries table has a many-to-many relationship with the languages table.
  2. ie. many languages can be spoken in a country, and a language can be spoken in many countries.

Select the incorrect answer from the following options

  1. There are at least 3 languages spoken in Armenia
  2. Alsatian is spoken in more than 1 country
  3. Bhojpuri is spoken in 2 countries
Spinner
DataFrameavailable as
df2
variable
SELECT l.name AS language, c.name AS country
FROM world.countries AS c
INNER JOIN world.languages AS l
USING(code)
WHERE c.name = 'Armenia' OR l.name IN ('Alsatian', 'Bhojpuri')
ORDER BY country;
  1. Alsatian is only spoken in France.
  2. When we read SQL results,the most important column is expected to be on the far left, and it's helpful if results are ordered by relevance to the question at hand.
  3. By default, results are ordered by the column from the left table, but you can change this using ORDER BY

Multiple Joins

  1. Multiple joins can be combined and run in a single query

Scenario for multiple joins

To find the relationship between fertility and unemployment rates, join tables to return the country name, year, fertility rate, and unemployment rate in a single result from the countries, populations and economies tables.

Spinner
DataFrameavailable as
df3
variable
-- Query 1.11:
SELECT name, e.year, fertility_rate, e.unemployment_rate
FROM world.countries AS c
INNER JOIN world.populations AS p
ON c.code = p.country_code
INNER JOIN world.economies AS e
USING(code);