Skip to main content

How to Practice SQL Using any Dataset with Workspace

Learn how DataCamp Workspace optimizes the experience of working with Jupyter notebooks and SQL. Discover how to effortlessly write SQL queries, connect to databases, analyze CSV files, and leverage the power of AI assistance
May 2023  · 9 min read

DataCamp Workspace builds on Jupyter notebooks, optimizing the experience to make you more productive.

One common pain point data analysts and data scientists have is that it can be tricky to get a Jupyter notebook hooked up to a database to start working. The problem can be even more acute for beginners: how can you learn to write SQL if you can't even connect to a database?

Workspace makes it effortless to write SQL - whether you are practicing your skills or working. This tutorial shows you how to get writing SQL in Workspace in seconds.

This tutorial assumes that you've used SQL before. If you haven't, read our SQL Tutorial for Beginners first, then try the examples from that tutorial in Workspace using the techniques shown here!

Getting Started

To use Workspace, you'll need to sign up to DataCamp. (A free account is fine to get started.)

Once you have your account, use the top navigation bar to go to Workspace.

DataCamp Workspace menu

This tutorial will use the built-in NBA Shooting Data dataset. If you prefer to use your own dataset, skip to the section below.

Use the left-hand navigation to go to the Datasets page.

DataCamp Workspace menu

Type "nba" into the search box and click on the dataset.

NBA dataset search

You'll see a preview of the dataset. Read through it, then click "Use Dataset". 

NBA Datase overview

Uploading your own dataset

You can use SQL to analyze any table-like data in Workspace. Currently, CSV files are best supported, so for learning purposes, if you supply your own dataset, it is recommended to use a CSV file.

From the Overview page, click "Empty" to create an empty workspace.

Create a new workspace

Give your workspace a name, then click "Create Workspace."

Naming your workspace

In the left-hand toolbar, open the Files tool.

Files tool

In the top-left of the tool, drag a data file into the file pane to upload it. (You can also click "browse files" or click the ⊕ button then Upload).

Using SQL With Data In CSV Files

The NBA shooting data is in the nba_players_shooting.csv file. In the left-hand toolbar, open the Files tool to see the file.

Files tool NBA dataset

The notebook in the workspace contains some Python code to import the dataset. However, in this case, we want to use SQL.

At the end of the notebook, click 'Add SQL.'


In the "Select source" dropdown, select "DataFrames and CSVs."

Select data source

To write a SQL query against a CSV file, only one change to standard SQL is required. In the FROM clause, instead of naming the table, write the path to the CSV file in single quotes.

Let's calculate the proportion of shots made for each shooter, ordered from best to worst. In the SQL cell, type the following code.

	SUM((SCORE = 'MADE')::DOUBLE) / COUNT(*) AS prop_score_made
	FROM 'nba_players_shooting.csv'
	GROUP BY shooter
	ORDER BY prop_score_made DESC

Shots made query output

Note that the dialect of SQL used when you are working with a CSV file is DuckDB. Read more about this in DuckDB makes SQL a first-class citizen on DataCamp Workspace.

Continue working in Python, R, or no-code charts

In the last example, we kept the default settings for the query return format. Look closely at the top of the SQL cell, and you'll see that it shows the return value as a DataFrame named df. That means you can continue to work on the results in Python (or in R, for R workspaces) or draw no-code plots.

Add chart

Click 'Add Chart' to add a no-code chart.

Add chart

Set the following options.

  • Type: Bar
  • X-axis: prop_score_made
  • Y-axis: SHOOTER

Chart customization

Using SQL With Data In Data Frames

The SQL query that was run against the CSV file created a data frame named df. You can also write SQL queries against these pandas or R data frames. The only difference from standard SQL is that in the FROM clause, you provide the data frame name instead of the table name.

Here, we'll reuse the previous results in df and split the shooter column into separate first and last names. Add a SQL cell, select "DataFrames and CSVs" as the source, and use the following code.

str_split(shooter, ' ')[1] AS first_name,
str_split(shooter, ' ')[2] AS last_name,

Using Sample Databases

Workspace also provides several sample databases that you can use to practice your SQL skills.

Add a SQL cell, and in the source dropdown, scroll down to the "Sample Integrations" section. Here, we'll use "Bicycle Sales."

Sample integrations

A default query will appear. For the tutorial, let's replace it with a simpler query.

MIN(list_price) AS cheapest_list_price
FROM products
GROUP BY product_name

When you use a connection to a database (rather than writing SQL against a CSV file or dataframe), you get the option to return the results as a query. In the dropdown, select "Query." You can also name the resulting value. Here, change "query" to "best_price."

Rename query

Running the query gives the following output.

Query output

You can now use this result to write further SQL against. Essentially, you've written a common table expression (covered in the SQL Commands for Data Scientists tutorial) without having to use the common table expression syntax.

Add another SQL cell, and write this further query.

	FROM best_price
	WHERE cheapest_list_price > 10000

Running this query returns the following results.

Query output 2

Connecting to Other Databases

Workspace also allows you to connect to many types of database, including PostgreSQL, MySQL, Redshift, bigQuery, Athena, SQL Server, MariaDB, and Oracle Database. Full details of how to connect are provided in the Workspace Docs.

In the left-hand toolbar, click on "Integrations," then click the ⊕ button to create a new integration.

Choose the type of database that you are connecting to.

Connecting to external database

You'll be asked to provide connection details (these vary by the type of database).

Database connection console

Using the AI Assistant

While writing SQL can be fun and useful, getting an AI to write it for you can help you learn and make you more productive. You can read about the AI Assistant in From Data to Insights: Get There Faster with the DataCamp Workspace AI Assistant.

Add a SQL cell. In the right-hand context menu, click "Generate."

Generate function

A textbox appears at the top of the SQL cell. In this, write a task for the AI. Choose your own task, or try the following.

Return the product names and list prices for all mountain bikes made by Trek that are in stock.

After typing your query, press Enter to get the AI to generate SQL code.

AI suggested query

If you think that the code is reasonable, click "Accept & run" (or click "Accept," then "Run" later on) to see the results of the query.

Query output

Get the AI to Explain Your Errors

You can also use the AI built-in to Workspace to explain why errors occur in your SQL. Suppose you got confused while writing the query for the expensive bikes and used a Fortran-style .GT. instead of `>`.

    FROM best_price
    WHERE cheapest_list_price .GT. 10000

This results in an error. In the bottom-left of the SQL cell, click "Fix Error."

Error in query

The AI provides the correct code, followed by an explanation of what you did wrong.

Suggested query


You've seen how to avoid setting up a database by writing SQL queries against CSV files and data frames. You used sample databases to practice writing code against. And you saw how to use AI to generate SQL with natural language and fix your errors.

These features make it much easier to get started writing SQL and to be more productive while learning and working.

Take it to the Next Level

Get started with SQL in Workspace by logging in or signing up for a DataCamp account. You can also master some of the SQL fundamentals in our skill track, which will give you all the skills you need to interact with and query your data.

Photo of Richie Cotton
Richie Cotton

Richie helps organizations get from a vague sense of "hey we ought to get better at using data" to having realistic plans to become successful data-driven organizations. He's been a data scientist since before it was called data science, and has written several books and created many DataCamp courses on the subject.

Expand Your Knowledge

Joining Data in SQL

4 hr
Level up your SQL knowledge and learn to join tables together, apply relational set theory, and work with subqueries.
See DetailsRight Arrow
Start Course
See MoreRight Arrow

The 80 Top SQL Interview Questions and Answers for Beginners & Intermediate Practitioners

This article provides a comprehensive overview of 80 essential SQL questions and answers for job hunters, hiring managers, and recruiters, covering both general topics and technical questions.
Elena Kosourova's photo

Elena Kosourova

33 min

Building Your Data Science Portfolio With DataCamp Workspace (Part 2): Add an Analytics Workspace

Get tips and tricks for creating an analytics-style project to add to your data science portfolio using DataCamp Workspace.
Justin Saddlemyer's photo

Justin Saddlemyer

10 min

From Beginner to Pro: Dive into the Most Popular Workspace Publications on DataCamp

Explore the most popular and influential workspace publications created by our vibrant DataCamp community.
Luigi D'Introno's photo

Luigi D'Introno

Building Your Data Science Portfolio with DataCamp Workspace (Part 3): Add Machine Learning Workspace

Learn how to leverage DataCamp Workspace to produce a machine-learning project to add to your data science portfolio. We cover how to get started, how to structure your work, and common mistakes to avoid.
Justin Saddlemyer's photo

Justin Saddlemyer

8 min

Seamlessly Teach Data in the Classroom with DataCamp Workspace

This article will show you how to use DataCamp Workspace to teach data in the classroom with Python or R in less than 5 seconds.
Filip Schouwenaars's photo

Filip Schouwenaars

8 min

How to Use SQL in pandas Using pandasql Queries

Unleash the power of SQL within pandas and learn when and how to use SQL queries in pandas using the pandasql library for seamless integration.
Elena Kosourova's photo

Elena Kosourova

8 min

See MoreSee More