ℹ️ Introduction to data science notebooks

You can skip this section if you are already familiar with data science notebooks.

Data science notebooks

A data science notebook is a document containing text cells (what you're reading now) and code cells. What makes a notebook unique is that it is interactive: you can change or add code cells, then run a cell by selecting it and clicking the Run button on the right (or Run All to run every cell), or by pressing Ctrl + Enter.

The result will be displayed directly in the notebook.

# Modify any of the numbers and rerun the cell
100 * 1.75 * 21

Data science notebooks & data analysis

Notebooks are great for interactive data analysis. You can add a Text, Python, or SQL cell by clicking the Add Text, Add Code, or Add SQL button that appears when you move the mouse pointer near the bottom of any cell.

Here at DataCamp, we call our interactive notebook Workspace. You can find out more about Workspace here.

We will use a SQL cell to load the Lego database containing a wealth of information on Lego sets, themes, colors, and much, much more.

We will use SELECT with the * wildcard to return every column of the sets table, a WHERE clause to keep only the sets released in 2015, and LIMIT to show at most the first 100 rows.

-- All columns for sets released in 2015, capped at 100 rows
SELECT *
FROM sets
WHERE year = 2015
LIMIT 100;
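To try the same SELECT, WHERE, and LIMIT pattern outside a SQL cell, the sketch below builds a tiny stand-in for the sets table with Python's built-in sqlite3 module. The rows are made up for illustration, the column list is an assumption, and the helper name sets_for_year is hypothetical; the real Lego database is not bundled here.

```python
import sqlite3

# In-memory database with a made-up miniature `sets` table (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sets (set_num TEXT, name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO sets VALUES (?, ?, ?)",
    [
        ("A-1", "Toy Castle", 2014),
        ("B-1", "Toy Rocket", 2015),
        ("C-1", "Toy Train", 2015),
        ("D-1", "Toy Harbor", 2015),
    ],
)

def sets_for_year(year, limit):
    """Return up to `limit` rows from `sets` released in `year` (hypothetical helper)."""
    query = "SELECT * FROM sets WHERE year = ? ORDER BY set_num LIMIT ?"
    return conn.execute(query, (year, limit)).fetchall()

rows_2015 = sets_for_year(2015, 2)
print(rows_2015)
```

Without an ORDER BY, which rows LIMIT keeps is not guaranteed, so the sketch sorts by set_num before truncating.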
-- Every distinct release year in the sets table
SELECT DISTINCT year FROM sets;
-- All rows of the themes table
SELECT * FROM themes;
-- Every distinct set number
SELECT DISTINCT set_num FROM sets;
-- Total number of rows in sets
SELECT COUNT(*) FROM sets;
-- Number of distinct set numbers in sets (equal to the row count if set_num is unique)
SELECT COUNT(DISTINCT set_num) FROM sets;
-- Number of distinct set numbers that appear in inventories
SELECT COUNT(DISTINCT set_num) FROM inventories;
-- Rows per set number; a count above 1 would mean a repeated set_num
SELECT set_num, COUNT(*) FROM sets GROUP BY set_num;
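The counting queries above compare a row count with a distinct count to check whether set_num identifies each row uniquely. A minimal sketch of that check follows, using Python's sqlite3 module and a made-up table that deliberately contains one repeated set_num (the real sets table may contain none). The HAVING filter is an addition that is not in the notebook's queries; it narrows the grouped result to the repeated keys only.

```python
import sqlite3

# Made-up rows; "B-1" is repeated on purpose to show what a duplicate looks like.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sets (set_num TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO sets VALUES (?, ?)",
    [("A-1", "Toy Castle"), ("B-1", "Toy Rocket"), ("B-1", "Toy Rocket (reissue)")],
)

total_rows = conn.execute("SELECT COUNT(*) FROM sets").fetchone()[0]
distinct_ids = conn.execute("SELECT COUNT(DISTINCT set_num) FROM sets").fetchone()[0]

# HAVING keeps only the groups with more than one row, i.e. the repeated keys.
duplicates = conn.execute(
    "SELECT set_num, COUNT(*) AS n_rows FROM sets "
    "GROUP BY set_num HAVING COUNT(*) > 1"
).fetchall()

print(total_rows, distinct_ids, duplicates)
```

If the two counts match, the duplicates list is empty and set_num can safely serve as a key for joins.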
-- Sets whose name contains "woman", ignoring case
SELECT *
FROM sets
WHERE lower(name) LIKE '%woman%' /* OR lower(name) LIKE '%women%' */;
-- Parts whose name mentions "female", "woman", or "lady", ignoring case
SELECT *
FROM parts
WHERE lower(name) LIKE '%female%' OR lower(name) LIKE '%woman%' OR lower(name) LIKE '%lady%';
-- Sets whose name contains "catwoman", ignoring case (first 10 rows)
SELECT *
FROM sets
WHERE lower(name) LIKE '%catwoman%'
LIMIT 10;
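The name searches above lowercase the column before applying LIKE, so the match ignores case, and the pattern '%woman%' also matches longer words such as "Catwoman". The sketch below illustrates both effects with Python's sqlite3 module on made-up names; the helper names_matching is hypothetical.

```python
import sqlite3

# Made-up set names chosen to show which ones each pattern catches.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sets (set_num TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO sets VALUES (?, ?)",
    [
        ("A-1", "Catwoman Chase"),
        ("B-1", "WOMAN Pilot"),
        ("C-1", "Women of Science"),
        ("D-1", "Toy Train"),
    ],
)

def names_matching(*patterns):
    """Return set names whose lowercased form matches any of the LIKE patterns."""
    where = " OR ".join("lower(name) LIKE ?" for _ in patterns)
    query = f"SELECT name FROM sets WHERE {where} ORDER BY set_num"
    return [row[0] for row in conn.execute(query, patterns)]

woman_only = names_matching("%woman%")
woman_or_women = names_matching("%woman%", "%women%")
print(woman_only)
print(woman_or_women)
```

"Women of Science" only appears once the second pattern is added, which is why a plural variant may be worth including. SQLite's LIKE already ignores ASCII case by default, so lower() is redundant there; it matters on engines where LIKE is case-sensitive.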

Get all the part names from the "Catwoman Catcycle City Chase" set. First, we run a query to get all the part numbers...
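One possible way to go from a set to its part names in two steps is sketched below with Python's sqlite3 module. Everything beyond the tables queried earlier is an assumption about the schema (an id column on inventories, an inventory_parts table, and a part_num column), and every row is made up, so treat this as an illustration rather than the notebook's actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: a set has inventories, an inventory lists part numbers,
# and the parts table maps each part number to a name. All rows are made up.
conn.executescript("""
CREATE TABLE sets (set_num TEXT, name TEXT);
CREATE TABLE inventories (id INTEGER, set_num TEXT);
CREATE TABLE inventory_parts (inventory_id INTEGER, part_num TEXT);
CREATE TABLE parts (part_num TEXT, name TEXT);
INSERT INTO sets VALUES ('X-1', 'Toy Cat Chase'), ('Y-1', 'Toy Train');
INSERT INTO inventories VALUES (1, 'X-1'), (2, 'Y-1');
INSERT INTO inventory_parts VALUES (1, 'p1'), (1, 'p2'), (2, 'p3');
INSERT INTO parts VALUES ('p1', 'Wheel'), ('p2', 'Cat Mask'), ('p3', 'Track');
""")

# Step 1: part numbers that belong to the chosen set.
part_nums = [row[0] for row in conn.execute("""
    SELECT DISTINCT ip.part_num
    FROM sets AS s
    JOIN inventories AS i ON i.set_num = s.set_num
    JOIN inventory_parts AS ip ON ip.inventory_id = i.id
    WHERE lower(s.name) LIKE ?
    ORDER BY ip.part_num
""", ("%cat chase%",))]

# Step 2: look up the names for those part numbers.
placeholders = ", ".join("?" for _ in part_nums)
part_names = [row[0] for row in conn.execute(
    f"SELECT name FROM parts WHERE part_num IN ({placeholders}) ORDER BY part_num",
    part_nums,
)]
print(part_nums, part_names)
```

The two steps could also be merged into a single query by joining parts directly; keeping them apart mirrors the "first get the part numbers" approach described above.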