Skip to content

Data Manipulation in SQL

Here you can access every table used in the course. To access each table, you will need to specify the soccer schema in your queries (e.g., soccer.match for the match table, and soccer.league for the league table).


Note: When using sample integrations such as those that contain course data, you have read-only access. You can run queries, but cannot make any changes such as adding, deleting, or modifying the data (e.g., creating tables, views, etc.).

Take Notes

Add notes about the concepts you've learned and SQL cells with queries you want to keep.

-- Case_Statement CASE WHEN x = 1 THEN 'a' WHEN x = 2 THEN 'b' ELSE 'c' END AS new_column

Spinner
DataFrameas
df
variable
-- Add your own queries here
SELECT
    id,
	hometeam_id,
	awayteam_id,
    home_goal,
    away_goal, 
    CASE WHEN hometeam_id = 8455 AND home_goal > away_goal THEN 'Chelsea Home Win !!'
         When awayteam_id = 8455 AND home_goal < away_goal THEN 'Chelsea Away win !!'
         ELse 'tie or loss' END AS result
FROM soccer.match
WHERE CASE WHEN hometeam_id = 8455 AND home_goal > away_goal THEN 'Chelsea Home Win !!'
         When awayteam_id = 8455 AND home_goal < away_goal THEN 'Chelsea Away win !!'
         ELse 'tie or loss' END IS NOT NULL
AND season = '2011/2012'
LIMIT 10;

Explore Datasets

Use the match, league, and country tables to explore the data and practice your skills!

  • Use the match, league, and country tables to return the number of matches played in Great Britain versus elsewhere in the world.
    • "England", "Scotland", and "Wales" should be categorized as "Great Britain"
    • All other leagues will need to be categorized as "World".
  • Use the match and country tables to return the countries in which the average number of goals (home and away goals) scored are greater than the average number of goals of all matches.
  • In a soccer league, points are assigned to teams based on the result of a game. Here, let's assume that 3 points are awarded for a win, 1 for a tie, and 0 for a defeat. Use the match table to calculate the running total of points earned by the team "Chelsea" (team id 8455) in the season "2014/2015".
    • The final output should have the match date, the points earned by Chelsea, and the running total.

-- CASE statements are great for Categorizing data Filtering data Aggregating data

-- COUNTing CASES How many home and away goals did Liverpool score in each season?

Spinner
DataFrameas
df1
variable
SELECT
	season,
	COUNT(CASE WHEN hometeam_id = 8650 AND home_goal > away_goal
		  THEN id END) AS home_wins,
	COUNT(CASE WHEN awayteam_id = 8650 AND away_goal > home_goal
		  THEN id END) AS away_wins
FROM soccer.match
GROUP BY season;

-- Percentages with CASE and AVG

Spinner
DataFrameas
df2
variable
SELECT
	season,
	ROUND(AVG(CASE WHEN hometeam_id = 8455 AND home_goal > away_goal THEN 1
				   WHEN hometeam_id = 8455 AND home_goal < away_goal THEN 0
				   END),2) AS pct_homewins,
	ROUND(AVG(CASE WHEN awayteam_id = 8455 AND away_goal > home_goal THEN 1
			       WHEN awayteam_id = 8455 AND away_goal < home_goal THEN 0
				   END),2) AS pct_awaywins
FROM soccer.match
GROUP BY season;
Spinner
DataFrameas
df3
variable