Skip to main content
HomeBlogSQL

DuckDB makes SQL a first-class citizen on DataCamp Workspace

In this blog post, we list out all recent improvements that make it seamless and efficient to query data with SQL, all without leaving the tool.
Apr 2023

A duck codes in DuckDBIn today's data-driven world, extracting valuable insights from raw information is essential for businesses and researchers alike. This is where DataCamp Workspace steps in: a fully managed, collaborative data notebook that simplifies data analysis and makes sharing with colleagues, partners, or clients a breeze, without worrying about configuration and setup.

Since SQL is still the fundamental building block in many data projects, we've been working hard to further enhance the SQL capabilities in DataCamp Workspace.

Query CSV Files and DataFrames

Many DataCamp users are at the beginning of their data science journey and don’t have access to a dedicated database or data warehouse. Instead, they are looking to play around with a CSV file they found online or uploaded from their local computer. We also came across this limitation ourselves when hosting Workspace-powered live trainings: we had to set up a database every time we wanted to do a SQL training, which was cumbersome.

Those times are history now! With the SQL cell inside a DataCamp Workspace notebook, you can directly query CSV files. This effectively treats your CSV file as a database that you can query like any other database. Try it out here

image2.png

Users are loving this enhancement, as the CSV support lets them skip database setup entirely. Others love the workflows that the DataFrame SQL support gives, allowing them to switch between SQL and Python, drawing from the best of both languages.

How it works

This new capability is powered by DuckDB. Dubbed “SQLite for analytics”, this embedded analytical database engine is able to query CSV files with excellent performance. When users select “DataFrames and CSVs”, Workspace does not send the query to a data warehouse as usual, but transforms the query into Python code that calls the DuckDB Python API in the user’s session and return the resulting DataFrame.

In addition to querying CSV files, DuckDB also allows us to query in-memory DataFrames that were created higher up in the notebook. This creates some pretty powerful workflows, for example:

  • Query a CSV file with a SQL cell. The query result is available as a pandas DataFrame
  • Use some pandas functions to clean up and restructure the code in a way that would be hard in SQL. Store the result as a new data frame.
  • Query the resulting data frame with SQL to calculate some aggregates, which is straightforward with SQL’s GROUP BY functionality.

Other improvements

Autocomplete for Table and Column Names

We didn’t stop with CSV and DataFrame support. Writing SQL queries can be cumbersome, especially when working with large databases containing lots of tables and columns. To make this easier, we have introduced autocompletion for table and column names in DataCamp Workspace. As you type your query, our intelligent system will automatically suggest relevant table and column names based on your input, helping you avoid typos and save time. 

Table and column names are automatically suggested when you write a SQL query.

This new feature not only increases productivity but also minimizes errors. Whether you're a seasoned SQL veteran or just starting out, autocompletion will help you write better queries, faster.

Improved Schema Browser

To facilitate navigating and exploring your connected databases, we have overhauled the schema browser in DataCamp Workspace. Now, you can quickly see all tables in a connected database, search through them, and even preview their contents with just a few clicks. 

Browse schemas, tables and columns with easy shortcuts to kickstart queries.

Our enhanced schema browser is designed to help you work more efficiently with your data, allowing you to spend less time searching for the right table or column and more time uncovering valuable insights.

Parametrize your SQL queries

When working with SQL, there are times you want to dynamically update your SQL queries based on results of previous calculations or other data in your notebook. Workspace now supports this through SQL parametrization, which allows you to insert variables into your SQL queries. 

Insert Python or R variables into your SQL query using interpolation

More Database and Data Warehouse Integrations

You should be able to seamlessly access your data, no matter where it's stored. DataCamp already allows you to query all common databases and data warehouses with a SQL cell, including PostgreSQL, MySQL, Redshift, Oracle, SQL Server, BigQuery, and Athena. We’ll be adding even more integrations to our platform, such as Snowflake and ClickHouse, so we cover all data stacks and use cases.

Making analytics work fast and enjoyable

At DataCamp, we're committed to making your data analysis experience seamless and enjoyable. With these SQL enhancements we believe that DataCamp Workspace is now even better equipped to help you unlock the full potential of your data. Whether you're a data scientist, a business analyst, or just someone who loves working with data, we're confident that DataCamp Workspace will help you streamline your workflows and elevate your analyses. Give these new features a try and experience the difference for yourself!

In the meantime, we'll continue to push the boundaries of data analysis and unlock new possibilities for insights discovery, with notebook scheduling, dark mode, native dbt support, advanced dashboarding, and much more! Is there something we should improve? Something you’re especially excited about? Let us know by sending an email to [email protected]! Thanks!

Topics
Related

Top 5 SQL Server Certifications: A Complete Guide

Unlock SQL Server certification success with our guide on paths, preparation with DataCamp, and the top certifications to enhance your career.
Matt Crabtree's photo

Matt Crabtree

8 min

PostgreSQL vs. MySQL: Choosing the Right Database for Your Project

Explore the key differences and similarities between PostgreSQL and MySQL to find the best database solution for your project's needs.
Jake Roach's photo

Jake Roach

8 min

SQL Developer Salaries: Expectations in 2024

In this article, we're going to learn what an SQL developer does, what factors impact SQL developer salaries, how the average SQL developer salary compares to the average salaries of other developer profiles, and how you can increase your salary as an SQL developer.
Elena Kosourova's photo

Elena Kosourova

7 min

Mastering SQL NOT EQUAL Operator: A Beginner's Guide

Unlock the power of SQL NOT EQUAL with our expert guide. Learn to refine data queries with practical examples and optimization tips for better analysis.
Abid Ali Awan's photo

Abid Ali Awan

5 min

SQL NOT IN Operator: A Comprehensive Guide for Beginners

Master SQL's NOT IN operator with this beginner's guide. Learn to filter data effectively, avoid common pitfalls, and explore efficient alternatives
Abid Ali Awan's photo

Abid Ali Awan

5 min

SQL CONTAINS: A Comprehensive Tutorial

Unlock the power of SQL CONTAINS for advanced text searches. Dive into logical operators, proximity searches, and wildcard uses for precise data analysis.
Abid Ali Awan's photo

Abid Ali Awan

5 min

See MoreSee More