Skip to content

Data Manipulation in SQL

Here you can access every table used in the course. To access each table, you will need to specify the soccer schema in your queries (e.g., soccer.match for the match table, and soccer.league for the league table).

Take Notes

Add notes about the concepts you've learned and SQL cells with queries you want to keep.

Add your notes here

Explore Datasets

Use the match, league, and country tables to explore the data and practice your skills!

  • Use the match, league, and country tables to return the number of matches played in Great Britain versus elsewhere in the world.
    • "England", "Scotland", and "Wales" should be categorized as "Great Britain"
    • All other leagues will need to be categorized as "World".
  • Use the match and country tables to return the countries in which the average number of goals (home and away goals) scored are greater than the average number of goals of all matches.
  • In a soccer league, points are assigned to teams based on the result of a game. Here, let's assume that 3 points are awarded for a win, 1 for a tie, and 0 for a defeat. Use the match table to calculate the running total of points earned by the team "Chelsea" (team id 8455) in the season "2014/2015".
    • The final output should have the match date, the points earned by Chelsea, and the running total.
Hidden code df
Spinner
Queryas
team
variable
[17]
--Preview of Teams table

SELECT * FROM soccer.team
LIMIT 5

1 hidden cell
Hidden code df12
Spinner
DataFrameas
df2
variable
--What are the EPL team names for 2012/13 using joins

SELECT DISTINCT team_long_name as teams,
				team_short_name as abbr, 
				t.team_api_id

FROM soccer.team as t
	 JOIN soccer.match as m
		ON t.team_api_id = m.hometeam_id
					
WHERE m.season = '2012/2013' AND m.country_id = 1729

--ORDER BY teams ASC
;
Hidden output
Spinner
DataFrameas
df6
variable
--What are the EPL team names for 2012/13 using subquerying

SELECT 	team_long_name as teams, 
		team_short_name as abbr, 
		team_api_id

FROM soccer.team
	 
WHERE team_api_id IN
		(SELECT hometeam_id
	 	 FROM soccer.match
	  	 WHERE season = '2012/2013' AND country_id = 1729
		)
--ORDER BY teams ASC
;
Hidden output
Spinner
Queryas
match
variable
--Preview of Matches Table

SELECT * FROM soccer.match

WHERE season = '2012/2013' AND country_id = 1729

--GROUP BY season
--ORDER BY season ASC
--LIMIT 10
Spinner
DataFrameas
df10
variable
--How many games in total were played

SELECT count(*) 

FROM soccer.match

WHERE season = '2012/2013' AND country_id = 1729
Spinner
DataFrameas
df3
variable
--What is the match records for each team in the EPL for 2012/13

SELECT t.team_long_name as teams, SUM(m.home_goal) as HG, SUM(m.away_goal) as AG
	
FROM soccer.team as t
	 JOIN soccer.match as m
		ON t.team_api_id = m.hometeam_id
					
WHERE m.season = '2012/2013' AND m.country_id = 1729

GROUP BY t.team_long_name

ORDER BY teams ASC
Spinner
DataFrameas
df7
variable
--what are the matches with more than 5 goals? who won?
----total the hg and ag for each match and use that as a filter

SELECT 	t1.team_long_name AS home_team,
		m.home_goal,
		t2.team_long_name AS away_team,
		m.away_goal,
		
		CASE 
			WHEN m.home_goal > m.away_goal THEN 'Home Win'
			WHEN m.away_goal > m.home_goal THEN 'Away Win'
			ELSE 'Draw'
		END AS RESULT

FROM soccer.match as m
	JOIN soccer.team as t1
		ON m.hometeam_id = t1.team_api_id
		
		JOIN soccer.team as t2
			ON m.awayteam_id = t2.team_api_id
			
			
--WHERE season = '2012/2013' AND country_id = 1729
--	and (m.home_goal + M.away_goal) > 5 
	
WHERE m.id IN
	(
		SELECT 
			CASE WHEN (m.home_goal + m.away_goal) > 5 THEN m.id END
		FROM soccer.match as m
		WHERE season = '2012/2013' AND country_id = 1729
		GROUP BY m.id
	)
	



Spinner
DataFrameas
df8
variable
--what is the league avg for hg?

SELECT AVG(m.home_goal)
	FROM soccer.match as m
		WHERE season = '2012/2013' AND country_id = 1729