SQL is essential for data scientists and data engineers to retrieve insights from databases. A database consists of multiple tables, and there are different flavors of SQL ranging from PostgreSQL to Snowflake.
Data Definition Language (DDL) commands in SQL are used to define and manage the structure of database objects. In this article I will explain DDL commands in SQL with examples using a Movies Database. Let’s get started!
What Are DDL Commands in SQL?
Data Definition Language (DDL) commands allow me to define and manage a schema in SQL. In a nutshell, a schema in SQL is a blueprint that defines how data is organized in a database and how the relationships among different tables within the database are managed.
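DDL consists of several commands, each with its own purpose. I will discuss them in the following order:
DDL Command | Description | Syntax
CREATE | Create a table and its columns together with their datatype. | CREATE TABLE table_name (column_1 datatype, ..., column_n datatype);
ALTER | Modify column names and add or delete a column. | ALTER TABLE table_name ADD column_name datatype;
RENAME | Change the name of the table. | RENAME TABLE old_table_name TO new_table_name;
COMMENT | Add an explanation to the SQL code for other team members to review. | -- single-line comment, or /* multi-line comment */
TRUNCATE | Remove data from a table without deleting the table. | TRUNCATE TABLE table_name;
DROP | Delete the table with its data. | DROP TABLE table_name;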
DDL Commands in SQL with Examples
Now that we have a basic understanding of DDL commands and their purposes, let's explore some practical examples using the Movies database.
CREATE command in SQL
When I need to create a new table, I use the CREATE TABLE command as seen below:
CREATE TABLE table_name (
column_1 datatype,
column_2 datatype,
...,
column_n datatype
);
In the syntax above, each column is declared with a name and a data type. SQL supports various data types; the most common are STRING (text), INT (whole numbers), FLOAT (decimal numbers), and DATE (date). The exact type names depend on the SQL flavor: many databases spell the text type VARCHAR or TEXT rather than STRING.
For example, to create the table actor, I use CREATE TABLE as follows:
CREATE TABLE actor (
actor_id String(32767),
first_name String(32767),
last_name String(32767),
last_update String(32767)
);
Where:
- actor is the table name.
- actor_id is a column name whose content is expected to be in text format (String) and not exceed 32767 characters.
- first_name is a column name whose content is expected to be in text format (String) and not exceed 32767 characters.
- last_name is a column name whose content is expected to be in text format (String) and not exceed 32767 characters.
- last_update is a column name whose content is expected to be in text format (String) and not exceed 32767 characters.
Once I execute the query, the new table will be visible along with the other tables, so I can query from it anytime.
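To try the CREATE TABLE statement without a database server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. It rests on two assumptions that are not part of the article: SQLite as the engine, and TEXT standing in for String(32767).

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient sandbox.
conn = sqlite3.connect(":memory:")

# Assumption: TEXT replaces the article's String(32767) type.
conn.execute(
    """
    CREATE TABLE actor (
        actor_id    TEXT,
        first_name  TEXT,
        last_name   TEXT,
        last_update TEXT
    )
    """
)

# PRAGMA table_info returns one row per column; index 1 holds the column name.
actor_columns = [row[1] for row in conn.execute("PRAGMA table_info(actor)")]
print(actor_columns)
```

Listing the columns right after creation is a quick way to confirm that the table exists with the structure I intended.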
ALTER command in SQL
Using ALTER in SQL, I can update a table without having to create another table and delete the old one. Organizations may have thousands of tables, and knowing how to keep them up to date without risking data loss is essential.
There are several ways to use ALTER, including:
- Adding new columns in a table
- Renaming columns in a table
- Modifying columns in a table
- Dropping columns in a table
ADD command in ALTER
To add a new column using ALTER, we combine the ALTER command with the ADD command, as seen below:
ALTER TABLE table_name ADD column_name datatype;
Here, we add a new column nationality with the data type STRING to the actor table.
ALTER TABLE actor ADD nationality String(32767);
All ALTER commands in SQL follow a similar structure, which makes the syntax easy to learn and remember.
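A quick way to see what ADD does to existing rows is to run it on a small table. The sketch below again assumes an in-memory SQLite database with TEXT columns, and the sample actor row is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actor_id TEXT, first_name TEXT, last_name TEXT)")

# Hypothetical sample row, not taken from the Movies database.
conn.execute("INSERT INTO actor VALUES ('1', 'Ada', 'Lovelace')")

# Add the new column; rows that already exist get NULL in it.
conn.execute("ALTER TABLE actor ADD COLUMN nationality TEXT")

columns_after_add = [row[1] for row in conn.execute("PRAGMA table_info(actor)")]
existing_row = conn.execute("SELECT first_name, nationality FROM actor").fetchone()
print(columns_after_add)
print(existing_row)
```

The existing row is untouched apart from the new, empty nationality value, which is why adding a column is generally a low-risk change.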
RENAME command in ALTER
Whenever I need to change a column name, I use the RENAME command in ALTER as follows:
ALTER TABLE table_name RENAME COLUMN old_column_name TO new_column_name;
For example, let’s change the column name last_name to family_name so that there is no confusion about whether it refers to a middle name or a surname. In that case, I run the following query:
ALTER TABLE actor RENAME COLUMN last_name TO family_name;
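The rename can be checked in the same kind of sandbox. This sketch assumes SQLite 3.25 or newer, which is when RENAME COLUMN became available there, and it uses a made-up sample row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actor_id TEXT, first_name TEXT, last_name TEXT)")

# Hypothetical sample row for illustration only.
conn.execute("INSERT INTO actor VALUES ('1', 'Ada', 'Lovelace')")

# Rename the column; the stored values stay where they are.
conn.execute("ALTER TABLE actor RENAME COLUMN last_name TO family_name")

renamed_columns = [row[1] for row in conn.execute("PRAGMA table_info(actor)")]
family_names = [row[0] for row in conn.execute("SELECT family_name FROM actor")]
print(renamed_columns)
print(family_names)
```

Only the column header changes; the data is still there under the new name.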
MODIFY command in ALTER
Datatypes define column properties and influence the quality of the data. For example, last_update was set to a String (text), but the data clearly shows that it represents date and time.
Looking at the different datatypes and referring back to the data in the column, the appropriate data type is TIMESTAMP.
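Updating a column’s datatype appropriately facilitates the aggregation and filtering of the data by date using other SQL commands. The keyword for this change depends on the SQL flavor: MySQL and Oracle use MODIFY, while SQL Server uses ALTER COLUMN, the form shown here (PostgreSQL uses ALTER COLUMN as well, with the keyword TYPE before the datatype):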
ALTER TABLE table_name ALTER COLUMN column_name datatype;
Applying this to last_update, I use the following query:
ALTER TABLE actor ALTER COLUMN last_update TIMESTAMP;
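Because the statement for changing a datatype differs between SQL flavors, it can help to see the variants side by side. The sketch below only builds the statement text and does not connect to any database; the helper name change_type_sql and the dialect keys are hypothetical and chosen for illustration.

```python
def change_type_sql(dialect: str, table: str, column: str, new_type: str) -> str:
    """Return the statement that changes a column's datatype in a given dialect."""
    templates = {
        "postgresql": "ALTER TABLE {t} ALTER COLUMN {c} TYPE {ty};",
        "sqlserver": "ALTER TABLE {t} ALTER COLUMN {c} {ty};",
        "mysql": "ALTER TABLE {t} MODIFY COLUMN {c} {ty};",
        "oracle": "ALTER TABLE {t} MODIFY {c} {ty};",
    }
    # A KeyError here means the dialect is not covered by this sketch.
    return templates[dialect].format(t=table, c=column, ty=new_type)


for dialect in ("postgresql", "sqlserver", "mysql", "oracle"):
    print(dialect, "->", change_type_sql(dialect, "actor", "last_update", "TIMESTAMP"))
```

In PostgreSQL, converting a text column to TIMESTAMP usually also needs a USING clause (for example USING last_update::timestamp), because the text is not cast automatically. Type names differ between flavors as well, so it is worth checking the documentation of the database in use.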
DROP command in ALTER
The last option that ALTER offers is to drop a column. There are numerous reasons why I would need to DROP a column, including but not limited to:
- Information that is no longer relevant,
- Duplicated information, where the same column is found in another table,
- Data quality issues,
- The need to optimize storage space.
Dropping a column in SQL is easy, but use it cautiously because it is not always possible to reverse a drop column action.
ALTER TABLE table_name DROP COLUMN column_name;
The command for dropping a column is similar to the one for renaming a column, but the two are not the same. Renaming a column keeps the data within the column but changes the header. On the other hand, the DROP command removes both the data and the header, as if the column never existed.
After realizing that the column last_update provides little information other than whether the database was refreshed, I decided to drop it using the following query:
ALTER TABLE actor DROP COLUMN last_update;
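The sketch below drops last_update in an in-memory SQLite database, which is an assumption of the example. SQLite only supports DROP COLUMN from version 3.35, so older versions fall back to rebuilding the table without the column, a different technique that gives the same end result. The sample row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actor (actor_id TEXT, first_name TEXT, last_name TEXT, last_update TEXT)"
)
# Hypothetical sample row for illustration only.
conn.execute("INSERT INTO actor VALUES ('1', 'Ada', 'Lovelace', '2024-01-01 00:00:00')")

if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE actor DROP COLUMN last_update")
else:
    # Fallback for older SQLite: copy the kept columns into a new table.
    conn.executescript(
        """
        CREATE TABLE actor_new AS SELECT actor_id, first_name, last_name FROM actor;
        DROP TABLE actor;
        ALTER TABLE actor_new RENAME TO actor;
        """
    )

remaining_columns = [row[1] for row in conn.execute("PRAGMA table_info(actor)")]
remaining_row = conn.execute("SELECT * FROM actor").fetchone()
print(remaining_columns)
print(remaining_row)
```

Both the header and the values of last_update are gone afterwards, while the other columns keep their data.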
RENAME command in SQL
The RENAME command changes a table's name and should not be confused with ALTER’s RENAME COLUMN capability. Some flavors, such as PostgreSQL and SQLite, rename tables with ALTER TABLE old_table_name RENAME TO new_table_name instead. In MySQL and MariaDB, the query is simply:
RENAME TABLE old_table_name TO new_table_name;
The table actor contains only names and does not include other information about the actors. Therefore, it makes sense to clarify this by changing the table name:
RENAME TABLE actor TO actor_names;
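Here is a minimal sketch of the same rename in an in-memory SQLite database, which is an assumption of the example. SQLite does not accept RENAME TABLE, so the sketch uses the ALTER TABLE ... RENAME TO form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actor_id TEXT, first_name TEXT, last_name TEXT)")

# SQLite spelling of the table rename.
conn.execute("ALTER TABLE actor RENAME TO actor_names")

# sqlite_master is SQLite's catalog of schema objects.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
print(table_names)
```

After the rename, queries must refer to actor_names, since the old name no longer exists in the catalog.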
COMMENT in SQL
Commenting in SQL is crucial for clarity and context, making your queries easier to understand and maintain. It helps explain complex logic, document changes, and justify certain choices. In SQL, you can add comments in two ways:
- Single-line comments
- Multi-line comments
Let’s see them in action.
Single-line commenting in SQL
Commenting on a single line is usually done alongside the code, as below. Thus, I can read a colleague’s comment while reading the code.
In the actor table, actor_id and last_update are not intuitive and require clarification using a comment.
CREATE TABLE actor (
actor_id String(32767), -- unique identifier
first_name String(32767),
last_name String(32767),
last_update String(32767) -- datetime of last update
);
Commenting differs from the other commands in this article because it does not use a COMMENT keyword. Two consecutive hyphens (--) mark the beginning of a comment, and anything written after them on the same line is not treated as code. Leaving a space after the hyphens is a good habit, since some databases, MySQL among them, require it.
Multi-line commenting in SQL
When the explanation is longer, use multi-line comments. This is preferred for explaining a function or providing detailed context. To add a multi-line comment in SQL, start with /* and end with */. The SQL engine ignores anything written between /* and */, so it is not executed as part of the query.
/* Create a table `actor` with the actors’ names. Store first and last name, and keep track of any updates */
CREATE TABLE actor (
actor_id String(32767), -- unique internal identifier
first_name String(32767),
last_name String(32767),
last_update String(32767) -- datetime of last update
);
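To confirm that comments really are ignored, the commented statement can be run as a script. This sketch assumes an in-memory SQLite database and uses TEXT in place of String(32767).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

script = """
/* Create a table actor with the actors' names.
   Store first and last name, and keep track of any updates. */
CREATE TABLE actor (
    actor_id    TEXT, -- unique internal identifier
    first_name  TEXT,
    last_name   TEXT,
    last_update TEXT  -- datetime of last update
);
"""

# executescript runs every statement in the string; the comments are skipped.
conn.executescript(script)

commented_columns = [row[1] for row in conn.execute("PRAGMA table_info(actor)")]
print(commented_columns)
```

The resulting table has exactly the four columns, so neither comment style affected the statement.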
TRUNCATE command in SQL
Truncating a table is not the same as dropping it. Truncating removes all the data but keeps the table's structure intact, while dropping removes the entire table, including its structure. Take care to choose the right command for the purpose.
The equivalent of TRUNCATE in Excel is selecting all the data under the column names and pressing delete. In SQL, I write:
TRUNCATE TABLE table_name;
If the actor table needs truncating, I execute:
TRUNCATE TABLE actor;
The result would be a table with only column names and no values.
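The effect is easy to verify in a sandbox. One caveat: the sketch below assumes SQLite, which has no TRUNCATE statement, so it swaps in DELETE FROM without a WHERE clause, the closest SQLite equivalent. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actor_id TEXT, first_name TEXT, last_name TEXT)")

# Hypothetical sample rows for illustration only.
conn.executemany(
    "INSERT INTO actor VALUES (?, ?, ?)",
    [("1", "Ada", "Lovelace"), ("2", "Alan", "Turing")],
)

# Stand-in for TRUNCATE TABLE actor: remove every row, keep the table.
conn.execute("DELETE FROM actor")

row_count = conn.execute("SELECT COUNT(*) FROM actor").fetchone()[0]
columns_after_clear = [row[1] for row in conn.execute("PRAGMA table_info(actor)")]
print(row_count)
print(columns_after_clear)
```

The row count drops to zero while the column definitions stay in place, which is the behavior described for TRUNCATE.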
DROP command in SQL
Unlike TRUNCATE, the DDL command DROP deletes the table and all its values altogether. One must be careful when executing such commands, as this action may not always be reversible. The DROP command has a similar structure to the TRUNCATE query:
DROP TABLE table_name;
Deleting the table actor is then as simple as:
DROP TABLE actor;
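A short sketch, again assuming an in-memory SQLite database, shows that nothing is left to query once a table is dropped. It also uses DROP TABLE IF EXISTS, a common variant that does not raise an error when the table is already gone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actor_id TEXT, first_name TEXT, last_name TEXT)")

conn.execute("DROP TABLE actor")

# A second plain DROP would fail; IF EXISTS makes the statement safe to repeat.
conn.execute("DROP TABLE IF EXISTS actor")

tables_after_drop = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

try:
    conn.execute("SELECT * FROM actor")
    query_failed = False
except sqlite3.OperationalError:
    # SQLite reports that the table no longer exists.
    query_failed = True

print(tables_after_drop)
print(query_failed)
```

Both the data and the structure are gone, so any later query against actor fails.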
While different SQL flavors may require slight changes in syntax, the above DDL commands are largely the same no matter what version of SQL you are using.
DDL Commands vs Other Types of Commands
In the previous section, I explained how DDL commands define the structure of tables and update their contents or properties as needed. Other types of SQL commands allow for selecting, aggregating, updating, and even joining tables to get a different view of the data, and each set of commands has a specific purpose.
Command | Definition | Purpose
DQL | Data Query Language | Retrieve data from the database based on given criteria.
DML | Data Manipulation Language | Insert, update, and delete the data stored in tables, whereas DDL commands such as ALTER change the table structure.
DCL | Data Control Language | Control permissions on access and authority.
TCL | Transaction Control Language | Commit or roll back changes made with DML commands.
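To make the categories concrete, the sketch below runs one statement of each kind that SQLite supports. It assumes an in-memory SQLite database and Python's default transaction handling. DCL is left out because SQLite has no permission commands, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure.
conn.execute("CREATE TABLE actor (actor_id TEXT, first_name TEXT)")

# DML: add a row, then make it permanent with TCL (COMMIT).
conn.execute("INSERT INTO actor VALUES ('1', 'Ada')")
conn.commit()

# DML again, this time undone with TCL (ROLLBACK).
conn.execute("INSERT INTO actor VALUES ('2', 'Alan')")
conn.rollback()

# DQL: read back what is stored.
names = [row[0] for row in conn.execute("SELECT first_name FROM actor")]
print(names)
```

Only the committed row survives, which shows how TCL decides whether DML changes are kept.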
Build your SQL Skills with DataCamp
Good data quality is achieved when data is well-defined and structured. DDL commands in SQL help data professionals build and maintain a structured database. Other SQL commands, like DML commands, allow you to view, aggregate, insert, filter, and merge data. Learn more about SQL and build your skills by exploring the following resources:
- Learn the different types of commands to work with SQL
- Keep a SQL cheat sheet handy to familiarize yourself with best practices and frequently used commands.
- Prepare for a SQL interview with our Top SQL Interview Guide