Exploratory Data Analysis in SQL for Absolute Beginners

    We'll be working with data from the paper "Climate change adaptation innovation in the water sector in Africa".

    The study examined how technology has responded to the water vulnerability created by climate change in Africa.

    Adaptation technology is measured using water-related patent data. The water stress index accounts for factors such as the projected change in annual runoff, the projected change in annual groundwater recharge, the freshwater withdrawal rate, the water dependency ratio, dam capacity, and access to reliable drinking water. A higher index indicates higher vulnerability.

    The other variables describe each country's size (GDP), institutional effectiveness, research and development activity, and knowledge base.

    The fields included in this dataset are:

    • year (data has been pooled for the following years: 1990, 2000, 2005, and 2010 to 2016)
    • adaptation technologies
    • openness to trade (trade as percentage of gross domestic product)
    • time required to register property (calendar days)
    • gross domestic product per capita
    • employers (total)
    • gross enrolment ratio
    • water stress index

    Note that we have shortened the field names in our dataset for easier coding!

    To consult the solution, head over to the file browser and select notebook-solution.ipynb.

    Query the table

    1. Query the full table
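As a rough sketch of what querying the full table could look like, the snippet below runs `SELECT *` against a small in-memory SQLite table from Python. The table name `water_data`, the country names, and every value in the rows are invented for illustration; the notebook does not show its real table name or data, and in the notebook's SQL cell only the query itself would be needed.

```python
import sqlite3

# Hypothetical stand-in for the notebook's table: the name `water_data`
# and all rows below are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE water_data ("
    "country TEXT, year INTEGER, gdp_per_capita REAL, water_stress_index REAL)"
)
conn.executemany(
    "INSERT INTO water_data VALUES (?, ?, ?, ?)",
    [
        ("Eastland", 2010, 2600.0, 0.62),
        ("Northland", 2010, 1000.0, 0.48),
        ("Southland", 2016, 2300.0, 0.57),
    ],
)

# SELECT * returns every column of every row in the table.
full_table = conn.execute("SELECT * FROM water_data").fetchall()
for row in full_table:
    print(row)
```

The asterisk is shorthand for "all columns", which makes it handy for a first look at a table before choosing specific fields.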
    2. Query the country and water_stress_index fields, ordered by water_stress_index in descending order
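One possible way to write this query is sketched below, using `ORDER BY ... DESC` so the most water-stressed rows come first. The table name `water_data` and the sample rows are assumptions made up for this example; only the field names `country` and `water_stress_index` come from the exercise.

```python
import sqlite3

# Hypothetical table name and invented rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE water_data (country TEXT, water_stress_index REAL)")
conn.executemany(
    "INSERT INTO water_data VALUES (?, ?)",
    [("Northland", 0.48), ("Eastland", 0.62), ("Southland", 0.57)],
)

# Select two fields and sort from the highest index to the lowest.
query = """
    SELECT country, water_stress_index
    FROM water_data
    ORDER BY water_stress_index DESC
"""
most_stressed_first = conn.execute(query).fetchall()
for country, index in most_stressed_first:
    print(country, index)
```

Without the `DESC` keyword, `ORDER BY` sorts in ascending order by default.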
    3. Query the country, year, and gdp_per_capita fields to list each country's GDP per capita; order by gdp_per_capita in ascending order and show only the first 10 rows
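A minimal sketch of this query combines `ORDER BY ... ASC` with `LIMIT 10`. It assumes a hypothetical table called `water_data` filled with twelve invented rows, and it assumes a dialect that supports `LIMIT` (SQLite here; PostgreSQL and MySQL also do). The notebook's actual database is not shown, so the syntax there may differ.

```python
import sqlite3

# Hypothetical table name; the twelve rows are generated, not real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE water_data (country TEXT, year INTEGER, gdp_per_capita REAL)"
)
# Insert from highest to lowest GDP so the sort visibly changes the order.
rows = [(f"Country{i:02d}", 2010, 500.0 + 100.0 * i) for i in reversed(range(12))]
conn.executemany("INSERT INTO water_data VALUES (?, ?, ?)", rows)

# Sort ascending, then keep only the first 10 rows of the sorted result.
query = """
    SELECT country, year, gdp_per_capita
    FROM water_data
    ORDER BY gdp_per_capita ASC
    LIMIT 10
"""
lowest_gdp = conn.execute(query).fetchall()
for row in lowest_gdp:
    print(row)
```

Because the sort is ascending, the 10 rows returned are the ones with the lowest GDP per capita, not the highest.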

    Filter the data

    1. Filter the data to see the country and year where the water_stress_index was between 0.5 and 0.6
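The range filter could be sketched with `BETWEEN`, as below. The table name `water_data` and all rows are invented for this example. Note that `BETWEEN` includes both endpoints, so a value of exactly 0.5 or 0.6 would match; if the exercise intends an exclusive range, `>` and `<` would be needed instead.

```python
import sqlite3

# Hypothetical table name and invented rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE water_data (country TEXT, year INTEGER, water_stress_index REAL)"
)
conn.executemany(
    "INSERT INTO water_data VALUES (?, ?, ?)",
    [
        ("Eastland", 2010, 0.62),
        ("Southland", 2016, 0.57),
        ("Northland", 2010, 0.48),
        ("Westland", 2005, 0.50),
    ],
)

# WHERE keeps only rows whose index lies in the inclusive range [0.5, 0.6].
query = """
    SELECT country, year
    FROM water_data
    WHERE water_stress_index BETWEEN 0.5 AND 0.6
"""
in_range = conn.execute(query).fetchall()
for country, year in in_range:
    print(country, year)
```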
    2. This time, filter the data to see the countries whose names start with the letter E or S and that have a water_stress_index above 0.5
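One way to combine the two conditions is sketched here with `LIKE` patterns and a numeric comparison. The table name `water_data` and the rows are assumptions invented for the example. The parentheses matter: `AND` binds more tightly than `OR`, so without them the index condition would apply only to the countries starting with S.

```python
import sqlite3

# Hypothetical table name and invented rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE water_data (country TEXT, water_stress_index REAL)")
conn.executemany(
    "INSERT INTO water_data VALUES (?, ?)",
    [
        ("Eastland", 0.62),
        ("Southland", 0.57),
        ("Sandland", 0.41),   # starts with S but is below the threshold
        ("Northland", 0.70),  # above the threshold but wrong first letter
    ],
)

# 'E%' matches any text beginning with E; % stands for zero or more characters.
query = """
    SELECT country, water_stress_index
    FROM water_data
    WHERE (country LIKE 'E%' OR country LIKE 'S%')
      AND water_stress_index > 0.5
"""
matches = conn.execute(query).fetchall()
for country, index in matches:
    print(country, index)
```

In SQLite, `LIKE` ignores case for ASCII letters by default, while some other databases treat it as case-sensitive, so results on the notebook's own database may differ for lowercase names.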