Skip to content

Did you know that the average return from investing in stocks is 10% per year (not accounting for inflation)? But who wants to be average?!

You have been asked to support an investment firm by analyzing trends in high-growth companies. They are interested in understanding which industries are producing the highest valuations and the rate at which new high-value companies are emerging. Providing them with this information gives them a competitive insight as to industry trends and how they should structure their portfolio looking forward.

You have been given access to their unicorns database, which contains the following tables:

dates

ColumnDescription
company_idA unique ID for the company.
date_joinedThe date that the company became a unicorn.
year_foundedThe year that the company was founded.

funding

ColumnDescription
company_idA unique ID for the company.
valuationCompany value in US dollars.
fundingThe amount of funding raised in US dollars.
select_investorsA list of key investors in the company.

industries

ColumnDescription
company_idA unique ID for the company.
industryThe industry that the company operates in.

companies

ColumnDescription
company_idA unique ID for the company.
companyThe name of the company.
cityThe city where the company is headquartered.
countryThe country where the company is headquartered.
continentThe continent where the company is headquartered.

The output

Your query should return a table in the following format:

industryyearnum_unicornsaverage_valuation_billions
industry12021------
industry22020------
industry32019------
industry12021------
industry22020------
industry32019------
industry12021------
industry22020------
industry32019------

Where industry1, industry2, and industry3 are the three top-performing industries.

Step 1. Creation and test of firts CTE to find the top performing industries of 2019, 2020 and 2021 based on the volume of unicorn companies.

Spinner
DataFrameas
df2
variable
SELECT
	COUNT(d.company_id) AS num_unicorns,
	industry
FROM dates AS d
LEFT JOIN industries AS i
ON d.company_id = i.company_id
WHERE EXTRACT(YEAR FROM date_joined) IN (2019, 2020, 2021)
GROUP BY i.industry 
ORDER BY num_unicorns DESC
LIMIT 3;

Step 2. Creation and testing of a CTE that returns the the number of unicorn companies per industry and the average valuation by year.

Spinner
DataFrameas
df1
variable
SELECT
	COUNT(d.company_id) AS num_unicorns,
	i.industry AS industry,
	EXTRACT(YEAR FROM d.date_joined) AS year,
	AVG(f.valuation) AS average_valuation
FROM dates AS d
LEFT JOIN industries AS i
ON d.company_id = i.company_id
LEFT JOIN funding AS f
ON d.company_id = f.company_id
GROUP BY industry, year;

Step 3. Join both CTEs to rank the industry by number of unicorn companies and the average valuation per year, from 2021, 2020 and 2019.

Spinner
DataFrameas
df
variable
WITH top_industries AS (
	SELECT
		COUNT(d.company_id) AS num_unicorns,
		industry
	FROM dates AS d
	LEFT JOIN industries AS i
	ON d.company_id = i.company_id
	WHERE EXTRACT(YEAR FROM date_joined) IN (2019, 2020, 2021)
	GROUP BY i.industry 
	ORDER BY num_unicorns DESC
	LIMIT 3),
rank AS (
	SELECT
		COUNT(d.company_id) AS num_unicorns,
		i.industry AS industry,
		EXTRACT(YEAR FROM d.date_joined) AS year,
		AVG(f.valuation) AS average_valuation
	FROM dates AS d
	LEFT JOIN industries AS i
	ON d.company_id = i.company_id
	LEFT JOIN funding AS f
	ON d.company_id = f.company_id
	GROUP BY industry, year)
SELECT 
	r.industry AS industry,
	r.year AS year,
	r.num_unicorns AS num_unicorns,
	ROUND(AVG(average_valuation)/1000000000, 2) AS average_valuation_billions
FROM rank AS r
JOIN top_industries AS t
ON r.industry = t.industry
WHERE r.year IN (2019, 2020, 2021) AND r.industry IN (SELECT industry FROM top_industries)
GROUP BY r.industry, r.year, r.num_unicorns
ORDER BY r.year DESC, r.num_unicorns DESC;