Skip to content

Data Manipulation in SQL

Here you can access every table used in the course. To access each table, you will need to specify the soccer schema in your queries (e.g., soccer.match for the match table, and soccer.league for the league table).


Note: When using sample integrations such as those that contain course data, you have read-only access. You can run queries, but cannot make any changes such as adding, deleting, or modifying the data (e.g., creating tables, views, etc.).

Take Notes

Add notes about the concepts you've learned and SQL cells with queries you want to keep.

Add your notes here

Spinner
DataFrameas
df
variable
-- Add your own queries here
SELECT *
FROM soccer.match
LIMIT 5

Explore Datasets

Use the match, league, and country tables to explore the data and practice your skills!

  • Use the match, league, and country tables to return the number of matches played in Great Britain versus elsewhere in the world.
    • "England", "Scotland", and "Wales" should be categorized as "Great Britain"
    • All other leagues will need to be categorized as "World".
  • Use the match and country tables to return the countries in which the average number of goals (home and away goals) scored are greater than the average number of goals of all matches.
  • In a soccer league, points are assigned to teams based on the result of a game. Here, let's assume that 3 points are awarded for a win, 1 for a tie, and 0 for a defeat. Use the match table to calculate the running total of points earned by the team "Chelsea" (team id 8455) in the season "2014/2015".
    • The final output should have the match date, the points earned by Chelsea, and the running total.
Spinner
DataFrameas
df1
variable
-- Identify the home team as Bayern Munich, Schalke 04, or neither
SELECT 
	CASE WHEN hometeam_id = 10189 THEN 'FC Schalke 04'
        WHEN hometeam_id = 9823 THEN 'FC Bayern Munich'
         ELSE 'Other' END AS home_team,
	COUNT(id) AS total_matches
FROM matches_germany
-- Group by the CASE statement alias
GROUP BY home_team;
Spinner
DataFrameas
df2
variable
SELECT 
	-- Select the date of the match
	date,
	-- Identify home wins, losses, or ties
	CASE WHEN home_goal > away_goal THEN 'Home win!'
        WHEN home_goal < away_goal THEN 'Home loss :(' 
        ELSE 'Tie' END As outcome
FROM matches_spain;
Spinner
DataFrameas
df3
variable
SELECT 
	m.date,
	--Select the team long name column and call it 'opponent'
	t.team_long_name AS opponent, 
	-- Complete the CASE statement with an alias
	CASE WHEN m.home_goal > away_goal THEN 'Home win!'
        WHEN m.home_goal < away_goal THEN 'Home loss :('
        ELSE 'Tie' END As outcome
FROM matches_spain AS m
-- Left join teams_spain onto matches_spain
LEFT JOIN teams_spain AS t
ON m.awayteam_id = t.team_api_id;
Spinner
DataFrameas
df9
variable
SELECT 
	m.date,
	t.team_long_name AS opponent,
    -- Complete the CASE statement with an alias
	CASE WHEN m.home_goal > away_goal THEN 'Barcelona win!'
        WHEN m.home_goal < away_goal THEN 'Barcelona loss :(' 
        ELSE 'Tie' END As outcome 
FROM matches_spain AS m
LEFT JOIN teams_spain AS t 
ON m.awayteam_id = t.team_api_id
-- Filter for Barcelona as the home team
WHERE m.hometeam_id = 8634; 
Spinner
DataFrameas
df8
variable
-- Select matches where Barcelona was the away team
SELECT  
	m.date,
	t.team_long_name AS opponent,
	CASE WHEN m.home_goal < m.away_goal THEN 'Barcelona win!'
        WHEN m.home_goal > m.away_goal THEN 'Barcelona loss :(' 
        ELSE 'Tie' END As outcome
FROM matches_spain AS m
-- Join teams_spain to matches_spain
LEFT JOIN teams_spain AS t 
ON m.hometeam_id = t.team_api_id
WHERE m.awayteam_id = 8634;
Spinner
DataFrameas
df7
variable
SELECT 
	date,
	-- Identify the home team as Barcelona or Real Madrid
	CASE WHEN hometeam_id = 8634 THEN 'FC Barcelona' 
        ELSE 'Real Madrid CF' END as home,
    -- Identify the away team as Barcelona or Real Madrid
	CASE WHEN awayteam_id  = 8634 THEN 'FC Barcelona' 
        ELSE 'Real Madrid CF' END as away
FROM matches_spain
WHERE (awayteam_id = 8634 OR hometeam_id = 8634)
      AND (awayteam_id = 8633 OR hometeam_id = 8633);
Spinner
DataFrameas
df6
variable
SELECT 
	date,
	CASE WHEN hometeam_id = 8634 THEN 'FC Barcelona' 
         ELSE 'Real Madrid CF' END as home,
	CASE WHEN awayteam_id = 8634 THEN 'FC Barcelona' 
         ELSE 'Real Madrid CF' END As away,
	-- Identify all possible match outcomes
	CASE WHEN home_goal > away_goal AND hometeam_id = 8634 THEN 'Barcelona win!'
        WHEN home_goal > away_goal AND hometeam_id = 8633 THEN 'Real Madrid win!'
        WHEN home_goal < away_goal AND awayteam_id = 8634 THEN 'Barcelona win!'
        WHEN home_goal < away_goal AND awayteam_id = 8633 THEN 'Real Madrid win!'
        ELSE 'Tie!' END AS outcome
FROM matches_spain
WHERE (awayteam_id = 8634 OR hometeam_id = 8634)
      AND (awayteam_id = 8633 OR hometeam_id = 8633);
Spinner
DataFrameas
df4
variable
-- Select team_long_name and team_api_id from team
SELECT
	team_long_name,
	team_api_id
FROM teams_italy
-- Filter for team long name
WHERE team_long_name = 'Bologna';
Spinner
DataFrameas
df5
variable
-- Select the season and date columns
SELECT 
	season,
	date,
    -- Identify when Bologna won a match
	CASE WHEN hometeam_id = 9857 
        AND home_goal > away_goal 
        THEN 'Bologna Win'
		WHEN awayteam_id = 9857 
        AND away_goal > home_goal 
        THEN 'Bologna Win' 
		END AS outcome
FROM matches_italy;
Spinner
DataFrameas
df10
variable
-- Select the season, date, home_goal, and away_goal columns
SELECT 
	season,
    date,
	home_goal,
	away_goal
FROM matches_italy
WHERE 
-- Exclude games not won by Bologna
	CASE WHEN hometeam_id = 9857 AND home_goal > away_goal THEN 'Bologna Win'
		WHEN awayteam_id = 9857 AND away_goal > home_goal THEN 'Bologna Win' 
		END IS NOT NULL;