Skip to main content

DuckDB makes SQL a first-class citizen on DataLab

In this blog post, we list out all recent improvements that make it seamless and efficient to query data with SQL, all without leaving the tool.
Apr 2023

A duck codes in DuckDBIn today's data-driven world, extracting valuable insights from raw information is essential for businesses and researchers alike. This is where DataLab steps in: DataCamp's fully managed, collaborative data notebook that simplifies data analysis and makes sharing with colleagues, partners, or clients a breeze, without worrying about configuration and setup.

Since SQL is still the fundamental building block in many data projects, we've been working hard to further enhance the SQL capabilities in DataLab.

Query CSV Files and DataFrames

Many DataCamp users are at the beginning of their data science journey and don’t have access to a dedicated database or data warehouse. Instead, they are looking to play around with a CSV file they found online or uploaded from their local computer. We also came across this limitation ourselves when hosting DataLab-powered live trainings: we had to set up a database every time we wanted to do a SQL training, which was cumbersome.

Those times are history now! With the SQL cell inside a DataLab notebook, you can directly query CSV files. This effectively treats your CSV file as a database that you can query like any other database. Try it out here


Users are loving this enhancement, as the CSV support lets them skip database setup entirely. Others love the workflows that the DataFrame SQL support gives, allowing them to switch between SQL and Python, drawing from the best of both languages.

How it works

This new capability is powered by DuckDB. Dubbed “SQLite for analytics”, this embedded analytical database engine is able to query CSV files with excellent performance. When users select “DataFrames and CSVs”, DataLab does not send the query to a data warehouse as usual, but transforms the query into Python code that calls the DuckDB Python API in the user’s session and return the resulting DataFrame.

In addition to querying CSV files, DuckDB also allows us to query in-memory DataFrames that were created higher up in the notebook. This creates some pretty powerful workflows, for example:

  • Query a CSV file with a SQL cell. The query result is available as a pandas DataFrame
  • Use some pandas functions to clean up and restructure the code in a way that would be hard in SQL. Store the result as a new data frame.
  • Query the resulting data frame with SQL to calculate some aggregates, which is straightforward with SQL’s GROUP BY functionality.

Other improvements

Autocomplete for Table and Column Names

We didn’t stop with CSV and DataFrame support. Writing SQL queries can be cumbersome, especially when working with large databases containing lots of tables and columns. To make this easier, we have introduced autocompletion for table and column names in DataLab. As you type your query, our intelligent system will automatically suggest relevant table and column names based on your input, helping you avoid typos and save time. 

Table and column names are automatically suggested when you write a SQL query.

This new feature not only increases productivity but also minimizes errors. Whether you're a seasoned SQL veteran or just starting out, autocompletion will help you write better queries, faster.

Improved Schema Browser

To facilitate navigating and exploring your connected databases, we have overhauled the schema browser in DataLab. Now, you can quickly see all tables in a connected database, search through them, and even preview their contents with just a few clicks. 

Browse schemas, tables and columns with easy shortcuts to kickstart queries.

Our enhanced schema browser is designed to help you work more efficiently with your data, allowing you to spend less time searching for the right table or column and more time uncovering valuable insights.

Parametrize your SQL queries

When working with SQL, there are times you want to dynamically update your SQL queries based on results of previous calculations or other data in your notebook. DataLab now supports this through SQL parametrization, which allows you to insert variables into your SQL queries. 

Insert Python or R variables into your SQL query using interpolation

More Database and Data Warehouse Integrations

You should be able to seamlessly access your data, no matter where it's stored. DataCamp already allows you to query all common databases and data warehouses with a SQL cell, including PostgreSQL, MySQL, Redshift, Oracle, SQL Server, BigQuery, and Athena. We’ll be adding even more integrations to our platform, such as Snowflake and ClickHouse, so we cover all data stacks and use cases.

Making analytics work fast and enjoyable

At DataCamp, we're committed to making your data analysis experience seamless and enjoyable. With these SQL enhancements we believe that DataLab is now even better equipped to help you unlock the full potential of your data. Whether you're a data scientist, a business analyst, or just someone who loves working with data, we're confident that DataLab will help you streamline your workflows and elevate your analyses. Give these new features a try and experience the difference for yourself!

In the meantime, we'll continue to push the boundaries of data analysis and unlock new possibilities for insights discovery, with notebook scheduling, dark mode, native dbt support, advanced dashboarding, and much more! Is there something we should improve? Something you’re especially excited about? Let us know by sending an email to! Thanks!



PostgreSQL vs. MySQL: Choosing the Right Database for Your Project

Explore the key differences and similarities between PostgreSQL and MySQL to find the best database solution for your project's needs.
Jake Roach's photo

Jake Roach

8 min


SQL Developer Salaries: Expectations in 2024

In this article, we're going to learn what an SQL developer does, what factors impact SQL developer salaries, how the average SQL developer salary compares to the average salaries of other developer profiles, and how you can increase your salary as an SQL developer.
Elena Kosourova's photo

Elena Kosourova

7 min

Introducing datalab


Introducing DataLab

DataCamp is launching DataLab, an AI-enabled data notebook to make it easier and faster than ever before to go from data to insight. Read on to learn more about what makes DataLab unique and our path towards it.
Filip Schouwenaars's photo

Filip Schouwenaars

3 min


50 Years of SQL with Don Chamberlin, Computer Scientist and Co-Inventor of SQL

Richie and Don explore the early development of SQL, the commercialization and adoption of SQL, how it became standardized, how it evolved and spread via open source, the future of SQL through NoSQL and SQL++ and much more.
Richie Cotton's photo

Richie Cotton

36 min


A Beginner's Guide to Data Analysis with SQL

In this session, DataCamp's VP of Media Adel Nehme & co-host of the DataFramed podcast, shows you how to get started with SQL.
Adel Nehme's photo

Adel Nehme


Full Stack Data Engineering with Python

In this session, you'll see a full data workflow using some LIGO gravitational wave data (no physics knowledge required). You'll see how to work with HDF5 files, clean and analyze time series data, and visualize the results.
Blenda Guedes's photo

Blenda Guedes

See MoreSee More